“ZLinq”, a Zero-Allocation LINQ Library for .NET

neuecc.medium.com

269 points by cempaka 5 months ago

zamalek 5 months ago

In theory .NET 10 should make this obsolete; the headline features[1] are basically all about this. In practice, well, it's heuristics, so I'm adding this to a particularly performance-sensitive project right now :)

Edit: what's also nice is that C# recognizes Linq as a contract. So long as this has the correct method names and signatures (it does), the Linq syntax will light up automatically. You can also use this trick for your own home-grown things (add Select, Join, Where, etc. overloads) if the Linq syntax is something you like.

[1]: https://learn.microsoft.com/en-us/dotnet/core/whats-new/dotn...
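
To illustrate the "Linq as a contract" point: a minimal sketch, assuming a made-up `Numbers` wrapper (not part of ZLinq or the BCL). It implements no LINQ interface at all; the query syntax compiles only because methods named `Where` and `Select` with suitable signatures exist.

```csharp
using System;

// The compiler rewrites this query into
// new Numbers(...).Where(n => n % 2 == 0).Select(n => n * 10)
var evens = from n in new Numbers(1, 2, 3, 4, 5, 6)
            where n % 2 == 0
            select n * 10;

Console.WriteLine(string.Join(",", evens.Items));

// Hypothetical illustration type: no IEnumerable<T>, just matching method names.
readonly struct Numbers
{
    public int[] Items { get; }

    public Numbers(params int[] items) => Items = items;

    // Picked up by the `where` clause.
    public Numbers Where(Func<int, bool> predicate) =>
        new(Array.FindAll(Items, x => predicate(x)));

    // Picked up by the `select` clause.
    public Numbers Select(Func<int, int> selector) =>
        new(Array.ConvertAll(Items, x => selector(x)));
}
```

This eager, array-copying version is only meant to show the name-based binding; a real library would return lazy struct wrappers instead.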

Jordanpomeroy 5 months ago

Could you elaborate? I don’t see anything about improving the performance of enumerators. ZLinq appears to remove the penalty of allocating enumerators on the heap to be garbage collected. The link you sent mentions improvements, but I don’t see how they lead to LINQ avoiding heap allocations.
- kevingadd 5 months ago
  
  I believe they're referring to the stack allocation improvements, which would ideally allow all the LINQ temporary objects to live on the stack. I'm not sure whether it does in practice though.
  - andyayers 5 months ago
    
    Unfortunately, those improvements don't work for Linq.
    Some notes on why this is so here: https://github.com/dotnet/runtime/blob/main/docs/design/core...
    
    zamalek 5 months ago
    
    Aw, I had no idea that it didn't work out. If they work that out I'd put good money on a colossal perf boost across the board.
- giancarlostoro 5 months ago
  
  Not just that, but ZLinq also seems to work across all C# environments, including versions embedded in game engines like Godot and Unity, as well as .NET Standard and .NET 8 and 9.
- moomin 5 months ago
  
  Select doesn’t have to return IEnumerable. A struct that exposes the same methods will work. So allocate-free foreach is very possible.
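
  A minimal sketch of that idea, assuming two made-up types (`ArrayWhere` and its nested `Enumerator`). `foreach` binds to the public `GetEnumerator`/`MoveNext`/`Current` pattern, so neither type needs to implement `IEnumerable<T>`, and the enumerator can stay a struct on the stack.

  ```csharp
  using System;

  int sum = 0;
  // No interface involved: foreach finds GetEnumerator() by pattern.
  foreach (var x in new ArrayWhere(new[] { 1, 2, 3, 4, 5 }, static n => n % 2 == 1))
      sum += x;

  Console.WriteLine(sum);

  // Hypothetical illustration type: a struct "Where" over an int array.
  readonly struct ArrayWhere
  {
      private readonly int[] _source;
      private readonly Func<int, bool> _predicate;

      public ArrayWhere(int[] source, Func<int, bool> predicate)
      {
          _source = source;
          _predicate = predicate;
      }

      // Returns a struct, so no enumerator object is allocated on the heap.
      public Enumerator GetEnumerator() => new Enumerator(_source, _predicate);

      public struct Enumerator
      {
          private readonly int[] _source;
          private readonly Func<int, bool> _predicate;
          private int _index;

          public Enumerator(int[] source, Func<int, bool> predicate)
          {
              _source = source;
              _predicate = predicate;
              _index = -1;
          }

          public int Current => _source[_index];

          // Advances to the next element that satisfies the predicate.
          public bool MoveNext()
          {
              while (++_index < _source.Length)
                  if (_predicate(_source[_index])) return true;
              return false;
          }
      }
  }
  ```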
  - debugnik 5 months ago
    
    But that's what ZLinq does, not what the upcoming changes to .NET do. What's your point?

jasonthorsness 5 months ago

This is great. I've worked on production .NET services and we often had to avoid LINQ in hot paths because of the allocations. Reimplementing functions with for-loops and other structures was time-consuming and error-prone compared to LINQ method chaining. Chaining LINQ methods is extremely powerful; like filter, map, and reduce in JS but with a bunch of other operators and optimizations. I wish more languages had something like it.

garganzol 5 months ago

I know that some companies try to avoid LINQ as a rule. However, avoiding LINQ gives negligible gains most of the time.
Of course, if it's really a hot path like matrix multiplication then it makes total sense, but avoiding LINQ has an unpleasant side effect: loss of code soundness and quality.
- hoppp 5 months ago
  
  I can understand avoiding it, it can get messy when used with large amounts of data fetched from an sql database, if the lazy dev uses Linq instead of implementing proper queries in SQL or using stored functions.
meisel 5 months ago

What are the advantages of this over using higher order functions? In Ruby I can do list.map { }.select { } …. That feels more natural (doesn’t require special language support), has a very rich set of functions (group_by, chunk_while, etc.), and is something the user can extend with their own methods (if they don’t mind monkeypatching)
- int_19h 5 months ago
  LINQ is higher-order functions - Ruby `map` is `Enumerable.Select`, Ruby `select` is `Enumerable.Where` etc.
  The special syntax is really just syntactic sugar on top of all this that makes things a little bit more readable for complex queries because e.g. you don't have to repeat computed variables every time after binding them once in the chain. Consider:
      from x in xs
      where x.IsFoo
      let y = Frob(x)
      where y.IsBar
      let z = Frob(y)
      where z.IsBaz
      orderby x, y descending, z
      select z;
  If you were to rewrite this with explicit method calls and lambdas, it becomes something like:
      xs.Where(x => x.IsFoo)
        .Select(x => (x: x, y: Frob(x)))
        .Where(xy => xy.y.IsBar)
        .Select(xy => (x: xy.x, y: xy.y, z: Frob(xy.y)))
        .Where(xyz => xyz.z.IsBaz)
        .OrderBy(xyz => xyz.x)
        .ThenByDescending(xyz => xyz.y)
        .ThenBy(xyz => xyz.z)
        .Select(xyz => xyz.z)
  Note how it needs to weave `x` and `y` through all the Select/Where calls so that they can be used for ordering in the end here, whereas with syntactic sugar the scope of `let` extends to the remainder of the expression (although under the hood it still does roughly the same thing as the handwritten code).
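
  A smaller runnable version of the same comparison, as a sketch: the `Frob` here is a stand-in local function over ints (an assumption for illustration, not the `Frob` from the query above), and both forms produce the same sequence.

  ```csharp
  using System;
  using System.Linq;

  // Stand-in for Frob; purely illustrative.
  int Frob(int v) => v + 1;

  var xs = new[] { 4, 1, 2, 6 };

  // Query syntax: `y` stays in scope for the rest of the query.
  var sugar =
      from x in xs
      where x % 2 == 0
      let y = Frob(x)
      where y > 3
      orderby y descending, x
      select x + y;

  // Method syntax: x and y must be carried along in a tuple by hand.
  var manual = xs
      .Where(x => x % 2 == 0)
      .Select(x => (x, y: Frob(x)))
      .Where(t => t.y > 3)
      .OrderByDescending(t => t.y)
      .ThenBy(t => t.x)
      .Select(t => t.x + t.y);

  Console.WriteLine(string.Join(",", sugar));
  Console.WriteLine(string.Join(",", manual));
  ```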
  - sumibi 5 months ago
    
    I tend to use the query syntax a lot for this exact reason.
    It would be even better if it supported exposing pattern matching variables and null safety annotations from where clauses to the following operations but I guess it's hard to translate it to methods.
    Something like this:
    from x in xs where x is { Y: { } y } select y.z
    Another feature I'd like to see is standalone `where` without needing to add `select` after it like in VB.net.
    
    int_19h 5 months ago
    
    Pattern matching shouldn't be hard to translate, really. It would not be a 1:1 mapping to the existing IEnumerable methods, but it's a straightforward translation to a single Select that returns a Nullable<ValueTuple> (to indicate match success/failure and pass the data in the former case) followed by a Where that would remove failures. Better yet, they could always add a new extension method to IEnumerable for this, and then translate to that.
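
    Written out by hand, that translation might look like the sketch below. The pattern and sample data are made up for illustration; the point is only the shape: one Select that yields a nullable tuple, a Where that drops the failures, then the projection.

    ```csharp
    using System;
    using System.Linq;

    object[] xs = { "alpha", 42, "", "beta" };

    // Imagined source (not valid C# today):
    //   from x in xs where x is string { Length: > 0 } s select s.ToUpperInvariant()
    var result = xs
        // Match: wrap the bound variable on success, null on failure.
        .Select(x => (x is string { Length: > 0 } s)
            ? new ValueTuple<string>(s)
            : (ValueTuple<string>?)null)
        // Remove the failed matches.
        .Where(m => m.HasValue)
        // Unwrap and run the original select clause.
        .Select(m => m.Value.Item1.ToUpperInvariant());

    Console.WriteLine(string.Join(",", result));
    ```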
    One feature I'd like to see is integration with foreach so that you don't have to repeat the variable and come up with a different name to work around shadowing rules. I.e. instead of:
    foreach (var x in from x0 ...)
    it would be nice to be able to write simply:
    foreach (from x in ...)
    and have it "just work", including more complicated cases with multiple nested from-clauses, let etc (effectively extending the scope of all of those into the body of foreach).
- whizzter 5 months ago
  
  As mentioned, groupby etc. all operate on functions; map/reduce/filter/etc. are just named differently (select/aggregate/where/etc.).
  What makes people love Linq is that it handles 2 different cases (identical syntax but different backing objects).
  1: The in-memory variant does things lazily, select/aggregate/where produce enumeration objects so after the chain you add ToArray, ToList, ToDictionary,etc and the final object is built lazily with most of the chain executed on as few objects as possible (thus if you have an effective Where at the start, the rest of the pipeline will do very little work and very few allocations).
  2: The compiler also helps the libraries by providing syntax trees, thus database Linq providers just translate the Linq to SQL and send it off to the server, letting the server do the heavy lifting _with indexes_ so that we can query tables of arbitrary sizes with regular C# Linq syntax very quickly, without most of the data ever going over the network.
- tinco 5 months ago
  
  Linq came out as part of a set of features that addressed the comforts of languages like Ruby. I don't know if they considered Ruby to be a threat at the time but they put a bunch of marketing power behind the release of linq. The way I understand it, as someone who jumped from C# to Ruby just around the time Linq came out is that it's a DSL for composing higher order functions as well as a library of higher order functions.
  I always liked how the C# team took inspiration from other language ecosystems. Usually they do it with a lot more taste than the C++ committee. I suppose the declarative linq syntax gives the compiler some freedom of optimization, but I feel Ruby's do syntax makes higher order functions shine in a way that's only surpassed by functional languages like Haskell or Lisp.
- rafaelmn 5 months ago
  
  LINQ methods are higher order functions ? LINQ syntax is just sugar and probably a design mistake (dead weight feature that I've only seen abused to implement monadic composition by insane people).
  And Ruby doesn't even enter this conversation if we're talking about these kinds of optimizations - it's an order of magnitude away from what you're aiming for if you're unrolling sequence operations in C#.
  - alkonaut 5 months ago
    
    > LINQ syntax is just sugar and probably a design mistake
    I find that the term "LINQ" these days tends to mean the extensions on IEnumerable/IQueryable, and not the special query syntax. Whatever the term meant when it was launched is now forgotten. Almost no one uses the special query syntax, but everyone uses the enumerable/queryable extension methods like Select() etc, and calls it "Linq".
    
    jodrellblank 5 months ago
    
    > "Whatever the term meant when it was launched is now forgotten"
    Language INtegrated Query. The SQL query isn't written inside a string, opaque and uncheckable, it's part of the C# language which means the tooling can autosuggest and sanity check database table names and field names against the live database connection, it means the compiler is aware of the SQL data types without manually building a separate ORM/layer.
    That people who don't use C# think it's just a Microsoft way to write a lambda filter on an in-memory list is sad.
    
    jermaustin1 5 months ago
    
    > Almost no one uses the special query syntax
    Source? I'm in multiple current, active development projects with companies, they are all using the LINQ query syntax.
    Not to mention all of the legacy code out there that is under active maintenance.
    To say "almost no one" feels very much antithetical to my (albeit anecdotal) experience. I can't imagine I'm the only C# consultant that has multiple clients that use LINQ queries extensively throughout their applications.
    
    alkonaut 5 months ago
    
    Trying to crudely estimate the popularity, I tried doing a regex search for C# statements ending with select ___; vs calls to Select(/Where(. It was (only) a factor of 10, which was way less of a difference than I thought, to be honest. I have seen and used one in the wild just a handful of times during 22 years of C#. But it might also vary with industry. It's likely a lot more common in fields where there are databases than where there are not.
    [1] https://github.com/search?q=%2F%28%3F-i%29%5Cs%2Bselect%5Cs%2B%5Cw%2B%3B%2F+language%3AC%23&type=code&ref=advsearch [2] https://github.com/search?q=%2F%28%3F-i%29%5Cs%2B%5C.%28Select%7CWhere%29%5C%28%2F+language%3AC%23&type=code&ref=advsearch
    
    jermaustin1 5 months ago
    
    I'm actually surprised reading some of those LINQ queries, first few pages, none of them (IMO) should be using query syntax. Hell, I hate query syntax to begin with, and use the extensions any opportunity I'm allowed.
    But I would also say that the companies using query syntax are also vastly underrepresented on GitHub public repositories.
    > It's likely a lot more common in fields where there are databases than where there are not.
    I think beyond that, it's probably more common with developers that came from doing SQL queries but "needed" type safety.
    I've worked with developers that didn't even know the extension methods existed. They went from SqlCommand to LINQ to EF or LINQ to SQL.
    The only time I think query syntax is better than the extension methods is when dealing with table joins.
    
    david_allison 5 months ago
    
    > The only time I think query syntax is better than the extension methods is when dealing with table joins.
    Fairly niche, but query syntax is a great approximation of Haskell's do notation:
    https://github.com/louthy/language-ext/wiki/Thinking-Functio...
    EDIT: updated URL
    
    taco_emoji 5 months ago
    
    > but everyone uses the enumerable/queryable extension methods like Select() etc, and calls it "Linq"
    Most likely because those extension methods are all under the System.Linq namespace. Really they should've gone under System.Query or something like that.
  - contextfree 5 months ago
    
    It's me, I'm insane people :) (or was in 2010?)
    IIRC it was to implement a constraint solver, which I couched in monadic terms somehow, don't remember the details. Not sure if I'd do it the same way again, but I did get it to work.
    
    rafaelmn 5 months ago
    
    If it's some small isolated part or personal project I guess it doesn't matter - but I've seen a mature codebase that was started that way - and it was among the worst codebases I've seen in 20 years (of similar scale at least).
    Few people even knew how to use it or what monads were, it was a huge issue when onboarding people. When the initial masochist that inflicted this on the codebase left, and stopped enforcing the madness, half of the codebase dropped it, half kept it, new people kept onboarding and squinting through it. This created huge pieces of shit glue code that was isolating the monadic crap everyone was too afraid to touch. Worst part was that even if you knew monads and were comfortable with them in other languages they just didn't fit - and it made writing the code super awkward.
    Not to mention debugging that shit was a nightmare with the Result + Exceptions - worst of both worlds.
    It's basically writing your own DSL by repurposing LINQ syntax - DSLs are almost always a bad idea, abusing language constructs to hack it in makes it even worse.

bigmattystyles 5 months ago

This is cool - excited to try it - I would note that I've been a dotnet grunt for almost 15 years now. I am good at it, I know how to use the language, I know the ecosystem - this level of familiarity with the language is just not within my grasp. I can understand the code (mostly) reading it, but I never would have been able to conjure up, let alone implement this. Props to the author.

martijn_himself 5 months ago

Could anyone more across the detail of this chime in on what this means for the 'average' .NET developer?

I rely heavily on LINQ calls in a .NET (Core) Web App, should I replace these with Zlinq calls?

Or is this only helpful in case you want to do LINQ operations on let's say 1000's of objects that would get garbage collected at the end of a game loop?

alkonaut 5 months ago

It means that if you find a performance problem, and you could hand-code that performance problem away (e.g. use some for-loops and preallocated buffers etc instead of wrestling enumerables, adding to lists) then you may find it useful, because you can keep the code cleaner.
This is to the Queryable/Enumerable extensions what ValueTask is to Task, or ref struct to struct etc. If you are the type of developer that sees great benefit switching from Task to ValueTask then you will probably find this useful too.
ComputerGuru 5 months ago

Presumably you absolutely shouldn't use this on any EF Core LINQ expressions or you'll end up materializing the entire table!
ozim 5 months ago

In the article, the author writes that he no longer maintains linq.js, though someone forked it.
I guess this library will at some point end up unmaintained too, once the author gets bored with it.
So I would not use it in any of my production code of a web app unless I get some problem I need to fix with this library specifically. Replacing all just because “it is faster” doesn’t seem good enough.
- thatwasunusual 5 months ago
  
  As the saying goes: if you don't know if you need something or not, you probably don't need it. :)
  I have been using .NET (and LINQ) for many years on a daily basis, and I've yet to run into performance problems that can't be fixed by either rewriting the LINQ statement or doing some other quick workaround.
  But will I try out ZLinq? Sure, but I won't create anything that depends on it.
  - Jordanpomeroy 5 months ago
    
    I think many people don’t need to worry about the performance of reference type allocations vs value type.
    I don’t mean to assume you do or do not need to worry about that consideration. But 99% of the code I’ve written does not need to be concerned about it.
- gwbas1c 5 months ago
  
  It really depends on how simple / complex ZLinq is. Sometimes simple libraries are "done" and don't need constant updates.
  - ozim 5 months ago
    
    Guy just published - definitely not finished. In 2-3 years after couple thousand people use it and bugs shake out it might be done.
    
    Atotalnoob 5 months ago
    
    It passes all of the tests for .NET's implementation of LINQ….
    Seems pretty bug free for a first version.

KallDrexx 5 months ago

This is neat, but how does this get away with being zero allocation? It appears to use `Func<T,U>` for its predicates, and since `Func<T>` is a reference type this will allocate. The generated IL definitely seems like it's forming a reference type, from what I can tell.

ziml77 5 months ago

The JIT can optimize this. I know for sure if there's no captures in the lambda it won't allocate. It's likely also smart enough to recognize when a function parameter doesn't have its lifetime extended past the caller's, which is a case where it would also be possible to not allocate.
- theolivenbaum 5 months ago
  
  To add on that, you can define your lambdas as static to make sure you're not capturing anything by mistake.
  Something like dates.Where(static x => x.Date > DateTime.Now)
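
  A small sketch of the difference, with made-up sample data. The first lambda reads a local, so a closure is needed; the `static` one cannot capture anything, and the compiler rejects it if you try.

  ```csharp
  using System;
  using System.Linq;

  var numbers = new[] { 1, 2, 3, 4, 5 };
  int threshold = 3;

  // Captures the local `threshold`, so a closure object is created for it.
  var captured = numbers.Where(x => x > threshold).ToArray();

  // `static` forbids captures: writing `x > threshold` here would be a
  // compile-time error. Constants and static members are still allowed.
  const int Limit = 3;
  var noCapture = numbers.Where(static x => x > Limit).ToArray();

  Console.WriteLine(string.Join(",", captured));
  Console.WriteLine(string.Join(",", noCapture));
  ```

  Note that `static` is only a guard against accidental captures; on its own it says nothing about what the rest of the LINQ chain allocates.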

HexDecOctBin 5 months ago

What features does C# have that make LINQ possible in it and not in other languages?

whizzter 5 months ago

It's two different, syntactically identical APIs under one umbrella.
1: IEnumerable<T>, which works lazily in-memory (as does the author's improved version), can be done in any language with first-class functions; see the author's linq.js or Java's streams library (it's not entirely the same as a chain of map/reduce/filter since it's lazy, but that's mostly not a drawback since it improves performance by removing temporary storage objects).
2: IQueryable<T> is the really magical part though: by specifying certain wrapper types, the compiler is informed that the library expects a bound syntax tree (AST) of the expression you write; the library can then translate the syntax tree to SQL queries and send the query to the server to be processed.
Thus huge tables can be efficiently queried by just writing regular C#, never touching SQL. In most ORMs this is annoying or there are impedance mismatches, but with EF you can write code and be fairly certain that it'll produce a good SQL query, since the entire Linq syntax is fairly close to SQL.
tehlike 5 months ago

It's part of the compiler - the AST. Linq has two forms - one is the ordinary linq query syntax
from x select x.name
And other is just lambda with anonymous types and so on.
For the lambda syntax, you can just do this: https://www.npmjs.com/package/linq
Of course, if you want to run this against a query provider, you do need compiler support to instead give you an expression tree, and provider to process it and convert them to a language (often sql) that database can understand.
There seems to be some transpilers, or things like that - but i don't know what the state of the art is on this: https://github.com/sinclairzx81/linqbox
fixprix 5 months ago

C# can turn lambdas into expression trees at runtime allowing libraries like EF to transform code like `db.Products.Where(p => p.Price < 100).Select(p => p.Name);` right to SQL by iterating the structure of that code. JavaScript ORMs would be revolutionized if they had this ability.
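
A minimal sketch of what "iterating the structure of that code" means, using only `System.Linq.Expressions`. The filter over `string.Length` and the SQL-looking output are made up for illustration; this is not how EF itself is implemented.

```csharp
using System;
using System.Linq.Expressions;

// Assigning a lambda to Expression<...> makes the compiler emit a tree, not a delegate.
Expression<Func<string, bool>> filter = s => s.Length < 100;

// Walk the tree: the body is a binary comparison of a member and a constant.
var comparison = (BinaryExpression)filter.Body;
var member = (MemberExpression)comparison.Left;
var constant = (ConstantExpression)comparison.Right;

string op = comparison.NodeType == ExpressionType.LessThan ? "<" : "?";
string sqlish = $"{member.Member.Name} {op} {constant.Value}";
Console.WriteLine(sqlish);

// The same tree can still be compiled into an ordinary delegate and run in memory.
Func<string, bool> compiled = filter.Compile();
Console.WriteLine(compiled("short"));
```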
- whstl 5 months ago
  Good answer. To elaborate on it and provide examples.
  In languages that don't have expression inspection capabilities you have to replace the `(p) => p.Price < 100` part with something that is possible for the language to inspect.
  Normally it's strings or something using a builder pattern.
  For example, in TypeORM:
  queryBuilder.where("product.price < :price", { price: 100 })
  And in Mongoose:
  Product.find({ price: { $lt: 100 } });
  The LINQ-ish version would be:
  Product.find((p) => p.price < 100);
  --
  Similarly, for Ruby on Rails:
  Product.where("price < ?", 100)
  Ruby's Sequel overloads operators to have a more natural syntax:
  DB[:products].where { price < 100 }
  But the "lambda" syntax would be:
  Product.where { |p| p.price < 100 }
- pjmlp 5 months ago
  
  Sadly expression trees fell out of favor in modern .NET, and it remains to be seen how much they will ever be improved.
  https://github.com/dotnet/csharplang/discussions/158
  - contextfree 5 months ago
    
    The last comment thread by agocke is interesting. I've thought before that it's unfortunate that LINQ and expression trees were implemented before the move to Roslyn, because if they'd been implemented afterwards they could maybe have just directly used the same object model that the compiler itself uses, which could make it more sustainable to keep them in sync with language changes.
  - shellac 5 months ago
    
    Java has two somewhat related projects in this space, and it does add a substantial cost to language changes (assuming you commit to keep expression trees up to date). https://openjdk.org/projects/babylon/ is the most interesting to me, as a linq+++ potentially.
    
    pjmlp 5 months ago
    
    Yes, being on both ecosystems most of the time, means I get to have lots of nice toys both ways, with them counter influencing each other all these years.
- zigzag312 5 months ago
  
  An interesting thing about expression trees is that with JIT they can be compiled at runtime, but with AOT they are interpreted at runtime.
- vosper 5 months ago
  
  > JavaScript ORMs would be revolutionized if they had this ability.
  Is this possible in JavaScript?
  - paavohtl 5 months ago
    
    Not easily. There's no built-in way to access the abstract syntax tree (or equivalent) of a function at run time. The best thing you can do is to obtain the source code of a function using `.toString()` and then use a separate JS parser to process it, but that's not a very realistic option.
  - Arnavion 5 months ago
    
    There is a limited form of such "expression rewriting" using tagged template strings introduced in ES2015. But it wouldn't be particularly useful for the ORM case.
hansvm 5 months ago

C# is definitely not the only possible language, but some things stand out:
1. You can extend other people's interfaces. If you care about method chaining, _something_ like that is required (alternative solutions include syntactic support for method chaining as a generic function-call syntax).
2. The language has support for "code as data." The mechanism is expression trees, but it's really powerful to be able to use method chaining to define a computation and then dispatch it to different backends, complete with optimization steps.
3. The language has a sub-language as a form of syntactic sugar, allowing certain blessed constructs to be written as basically inline SQL with full editor support.
- CharlieDigital 5 months ago
  Expression trees are highly underrated.
  Compare C# ORMs to JS/TS for example. In C#, it is possible to use expression trees to build queries. In TS, the only options are as strings or using structural representation of the trees.
  Compare this:
  var loadedAda = await db.Runners .Include(r => r.RaceResults.Where( finish => finish.Position <= 10 && finish.Time <= TimeSpan.FromHours(2) && finish.Race.Name.Contains("New") ) ) .FirstAsync(r => r.Email == "ada@example.org");
  To the equivalent in Prisma (structural representation of the tree):
  const loadedAda2 = await tx.runner.findFirst({ where: { email: 'ada@example.org' }, include: { races: { where: { AND: [ { position: { lte: 10 } }, { time: { lte: 120 } }, { race: { name: { contains: 'New' } } } ] } } } })
  Yikes! Look how dicey that gets with even a basic query!
osigurdson 5 months ago

I get that Go maintainers want to keep things simple, but this stuff is pretty useful.
- wvenable 5 months ago
  
  A simple language can make code written in it complex. A complex language can make code simpler. It's not a perfect rule or anything but it's been my experience that attempts at making simpler programming languages just put more demands on the programmer. The lack of expressive power has to be paid for somewhere.
gwbas1c 5 months ago

(Note: A lot of answers discuss LINQ to SQL, which ZLinq does not appear to optimize.)
Iterators: LINQ works on any type that supports iterators. In most languages, this is any type that you can write a for (foreach) loop on and perform an operation on each item in a collection / array / list. (In C#, the collection must implement the IEnumerable<T> interface.)
Lambda functions: LINQ then relies heavily on Lambda functions, which are used as filters or to transform / narrow down data. Most languages also have something similar to these.
Generics: C# allows for "list of foo objects" instead of "list of objects that I have to typecast to foo." Although not explicitly required to implement something LINQ-like in other languages, the compiler verifying type helps with autocomplete and in-IDE suggestions; and helps avoid silly typing bugs.
Generic inference: C# can infer the return type from a lambda function, and infer the argument type in a lambda function. This means you don't need to decorate LINQ syntax with type information; except in some rare corner cases.
This is why, for example, there are LINQ-like libraries in Javascript and Rust. Java supports something that is LINQ-like, although in my limited Java experience, I didn't use it enough to really "get the hang" of it.
---
Note that LINQ has a very serious pitfall: It's easy to accidentally build a filter, and then have a lot of overhead re-running an expensive operation to re-load the source collection. The simplest way to avoid this is to call .ToArray() or .ToList() at the end of the chain to ensure that you store the result in a collection once.
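
A small sketch of that pitfall, with a made-up `Load` function standing in for the expensive operation and a counter to show how often it runs:

```csharp
using System;
using System.Linq;

int loads = 0;
// Stand-in for an expensive operation; counts how often it is called.
int Load(int id) { loads++; return id * 2; }

// Deferred: nothing runs yet, and every enumeration re-runs Load for each item.
var lazy = new[] { 1, 2, 3 }.Select(x => Load(x));
int firstSum = lazy.Sum();
int secondSum = lazy.Sum();
int loadsWhenLazy = loads;

// Materialized once: Load runs three times, later passes reuse the array.
loads = 0;
var cached = new[] { 1, 2, 3 }.Select(x => Load(x)).ToArray();
int thirdSum = cached.Sum();
int fourthSum = cached.Sum();
int loadsWhenCached = loads;

Console.WriteLine($"lazy: {loadsWhenLazy} loads, cached: {loadsWhenCached} loads");
```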
- jayd16 5 months ago
  
  Unchecked Exceptions make a big difference as well. Otherwise, some kind of exception forwarding would need to be handled.
  Extension methods allow LINQ to be implemented as library over all the existing collection types instead of needing child types or refactoring the core collections library.
  - gwbas1c 5 months ago
    
    (Perhaps that is why I never got the hang of Java's LINQ equivalent?)
    Java has RuntimeException, which is unchecked.
    Granted, I can understand why some developers don't understand why some exceptions need to be checked versus not checked. Rust got this right with its Result type and panic.
Merad 5 months ago

Basic LINQ on in-memory collections isn't really that different from what you have in other languages. Where things get special is the LINQ used by Entity Framework. It operates on expressions, which allow code to be compiled into the application and manipulated at runtime. For example, the lambda expression that you pass to Where() will be examined by an EF query provider that translates it into the where clause for a SQL query.
sherburt3 5 months ago

I feel like pretty much every language with generics has a LINQ, like functools/itertools in Python, lodash for javascript. It’s just a different expression of the same ideas.
- jeswin 5 months ago
  
  Nope, very different. Depending on whether the expression is on an Enumerable or a Queryable, the compiler generates an anonymous function or an AST. That is, you can get "code as data" as in say Lisp; and allows expressions to be converted to say SQL based on the backend.
  - pjmlp 5 months ago
    
    You can do exactly the same with Smalltalk metaclasses and reflection.
    However I do concede most developers will only see them for the first time in .NET languages.

stuaxo 5 months ago

I don't use .NET, but always thought LINQ a really interesting part of it.

incoming1211 5 months ago

Is there a reason these sort of improvements cannot be contributed back into .NET itself?

nikeee 5 months ago

ZLinq relies on its own enumerable type called ValueEnumerable, which is a struct. While it would probably work when using this as a drop-in replacement and re-compiling, things will be more complicated in larger applications. There might be some code that depends on the exact signature of the Linq methods. This might not even be detectable in cases involving reflection and could break stuff silently.
Adding another enumerable type would be a very large change that could effectively double the API surface of the entire ecosystem. This could take some time. Some places still don't even support Span<T>. Also there were some design decisions related to Linq where the number of overloads were a consideration.
Adding this API to .NET could probably be done with that extension method that converts to ValueEnumerable. But without support for that enumerable, this would pretty much be a walled garden where you have to convert back and forth between different enumerable types. Not that great if you'd ask me, but possible I guess.
lmz 5 months ago

I can easily imagine the kind of person that goes out and builds something like this would have little patience with the bureaucracy of getting it integrated into .NET.
- CharlieDigital 5 months ago
  
  I'd say it's less about bureaucracy and more about what the .NET team has to consider when they make sweeping changes.
  Backwards compatibility, security, edge cases, downstream effects on other libraries that are reliant on LINQ, etc.
  One guy with an optional library can break things. If the .NET team breaks things in LINQ, it's going to be a bad, bad time for a lot of people.
  I think Evan You's approach with Vue is really interesting. Effectively, they have set up a build pipeline that includes testing major downstream projects as well for compatibility. This means that when the Vue team build something like "Vapor Mode" for 3.6, they've already run it against a large body of community projects to check for breaking changes and edge cases. You can see some of the work they do in this video: https://www.youtube.com/watch?v=zvjOT7NHl4Q
  - akdev1l 5 months ago
    
    I think this approach predates Vue.
    I know of two examples:
    1. Fedora in collaboration with GCC maintainers keep GCC on the bleeding edge so it can be used to compile the whole Fedora corpus. This validates the compiler against a set of packages which are known to work with the previous GCC
    2. I think the rust team also builds all crates on crates.io when working on `rustc`. It seems they created a tool to achieve that: https://github.com/rust-lang/crater
    I would assume the .NET guys have something similar already but maybe there’s not enough open code to do that
    
    zamalek 5 months ago
    
    Rust also has the advantage of having no ABI. Binary interface is a whole lot more difficult to maintain than code interface.
    C# has multiple technologies built to deal with ABI (though it probably all goes unused these days with folder-based deployments, you really need the GAC for it to work).
    
    jasonjayr 5 months ago
    
    IIRC perl tested new releases by running all the unit tests in the CPAN library, waaaaay back when.
    
    clscott 5 months ago
    
    They still do and investigate each failure. If the end result is that the library is “wrong” tickets and patches get sent to the library maintainers.
  - mrmedix 5 months ago
    
    You have to add an extra function call at the start of the Linq method chain in order to make it zero-allocation. So I don't think it would break backwards compatibility. But adding it does create an additional maintenance burden.
- qingcharles 5 months ago
  
  From some experience, the MS guys are actually really eager to get more outside help and many will help guide you through the process if you have something to offer.
  Every release has a fairly decent amount of fixes and additions from outside contributors, and while I can see a lot of to/fro on the PRs to get them through, it's probably not quite as bad as you'd expect.
kevingadd 5 months ago

From looking at the blog post I suspect the explosion of generic instances could be a serious problem for code size and startup time, but that's probably solvable somehow. The performance certainly seems impressive.
The way LINQ currently works by default makes aggressive use of interfaces like IEnumerable to hide the actual types being iterated over. This has performance consequences (which is part of why ZLinq can beat it) but it has advantages - for example, the same implementation of Where<T>(seq) can be used for various T's instead of having to JIT or AOT-compile a unique body for every distinct class you iterate over.
From looking at ZLinq it seems like it would potentially have an explosion of unique generic struct types as your queries get more complex, since for it to work you potentially end up with types vaguely resembling Query3<Query2<Query1<T>>>. But it might not actually be that bad in practice.
jayd16 5 months ago

Using reference types is more idiomatic in C#. To some degree they are less bug prone as well (they can be passed around without issue). Most of the core library uses them instead of starting with value types and boxing.
The Task library has successfully added ValueTask but it took some doing. LINQ on the other hand can be replaced with unrolled loops or libraries more easily so the pressure just hasn't been there.
I could see something happening in the future but it would take a lot of work.
- chris_pie 5 months ago
  
  To your point, ValueTask is less safe than Task. For example, it's important not to await it more than once.
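  
  A minimal sketch of that rule, assuming a hypothetical ComputeAsync helper (it is not from any library): await a ValueTask exactly once, and convert it with AsTask() first if the result has to be observed more than once.
  
  ```csharp
  using System;
  using System.Threading.Tasks;
  
  // Hypothetical helper, for illustration only.
  static async ValueTask<int> ComputeAsync()
  {
      await Task.Yield();
      return 42;
  }
  
  // OK: this ValueTask is awaited exactly once.
  int once = await ComputeAsync();
  
  // If several consumers need the result, convert to a Task first;
  // a Task can safely be awaited any number of times.
  Task<int> shared = ComputeAsync().AsTask();
  int a = await shared;
  int b = await shared;
  
  Console.WriteLine($"{once} {a} {b}");
  ```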
theolivenbaum 5 months ago

There are some minor breaking changes: for example, the order of iteration is not always the same as in the official LINQ implementation, and Sum might give different values due to checked vs unchecked summing. Probably not an issue for most people, but a subtle breaking change nevertheless.
- theolivenbaum 5 months ago
  
  See here for more info: https://github.com/Cysharp/ZLinq?tab=readme-ov-file#differen...
bob1029 5 months ago

I don't see why not: https://github.com/dotnet/runtime/pulls
There's an official process for API change requests: https://github.com/dotnet/runtime/blob/main/docs/project/api...

srean 5 months ago

I don't know if anyone remembers expression templates of C++ nowadays. Once upon a time I had written a library of expression-template primitives to chain streams of computations and maps so that the sub-streams do not need to be fully realized in memory - essentially the old idea of Unix pipelines.

Writing it was a lot of fun. Debugging compiler errors was a lot less fun because the template language of C++ has no static typing, so errors would be triggered very deep in the expression tree. The expression tree got processed and inlined at compile time so there was little to no overhead at runtime.

I was very impressed with GCC's inlining and vectorization. Especially the messages that explained why it could not vectorize.

ImHereToVote 5 months ago

Why is LINQ allocating memory in the first place?

debugnik 5 months ago

LINQ methods build up a chain of IEnumerable/IQueryable objects, which then build up chains of IEnumerator objects each time you iterate.
These types are all .NET interfaces, which are reference types, so they're allocated on the heap. .NET's escape analysis can sometimes move reference types to the stack, but this feature is currently very limited and didn't even exist until .NET 9.
ZLinq uses generic structs to prevent these allocations at the expense of some really verbose intermediate types.
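
A rough sketch of the generic-struct idea, not ZLinq's actual API: the names IIntSource, ArraySource, WhereOp and SelectOp are made up for illustration, and the pipeline is fixed to int to keep it short. Each stage is a struct that holds the previous stage by value, so building and running the query needs no per-stage heap objects, but the variable's static type spells out the whole chain.

```csharp
using System;

int[] data = { 1, 2, 3, 4, 5, 6 };

// The whole pipeline is one nested struct value:
// SelectOp<WhereOp<ArraySource>>. This is the "verbose intermediate type".
var query = new SelectOp<WhereOp<ArraySource>>(
    new WhereOp<ArraySource>(new ArraySource(data), n => n % 2 == 0),
    n => n * 10);

int sum = 0;
while (query.TryGetNext(out int value))
    sum += value;

Console.WriteLine(sum); // prints 120

// Hypothetical pull-style contract, assumed for this sketch.
interface IIntSource
{
    bool TryGetNext(out int value);
}

struct ArraySource : IIntSource
{
    private readonly int[] _items;
    private int _index;

    public ArraySource(int[] items) { _items = items; _index = 0; }

    public bool TryGetNext(out int value)
    {
        if (_index < _items.Length) { value = _items[_index++]; return true; }
        value = default;
        return false;
    }
}

struct WhereOp<TSource> : IIntSource where TSource : struct, IIntSource
{
    private TSource _source; // stored by value, not behind an interface reference
    private readonly Func<int, bool> _predicate;

    public WhereOp(TSource source, Func<int, bool> predicate)
    {
        _source = source;
        _predicate = predicate;
    }

    public bool TryGetNext(out int value)
    {
        // Skip items until one passes the predicate.
        while (_source.TryGetNext(out value))
            if (_predicate(value)) return true;
        return false;
    }
}

struct SelectOp<TSource> : IIntSource where TSource : struct, IIntSource
{
    private TSource _source;
    private readonly Func<int, int> _selector;

    public SelectOp(TSource source, Func<int, int> selector)
    {
        _source = source;
        _selector = selector;
    }

    public bool TryGetNext(out int value)
    {
        if (_source.TryGetNext(out int item)) { value = _selector(item); return true; }
        value = default;
        return false;
    }
}
```

The two delegates are still objects, but because the lambdas capture nothing, current C# compilers create and cache them once instead of allocating on each call.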
- ImHereToVote 5 months ago
  
  Thanks.
oaiey 5 months ago

Internal iterators, expression trees, etc. Many LINQ variants (depending on the data source) also do not execute the chain step by step, but first build an expression tree and then translate it into native query syntax (e.g. SQL).

torginus 5 months ago

Is it just me, or does this have a case of false advertising?

One of the biggest sources of allocation is lambda captures, like when you write something like this:

  var myPerson = people.First(x=>x.Name==myPersonName);

In this case, a phantom closure object is allocated to capture the myPersonName variable, then a delegate is allocated for it and passed to the First() method, making the number of allocations per call a minimum of 2. I don't see ZLinq doing anything about this.
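
To make the capture cost concrete, here is a small sketch. FirstWithState is a hypothetical helper written for this example, not part of LINQ or ZLinq; it shows the common workaround of passing the state explicitly so the lambda can be marked static (C# 9 and later) and capture nothing.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var people = new List<(string Name, int Age)> { ("Ada", 36), ("Linus", 28) };
string myPersonName = "Linus";

// Capturing lambda: the compiler hoists myPersonName into a hidden closure
// object and allocates a delegate that points at it.
var viaCapture = people.First(x => x.Name == myPersonName);

// Hypothetical helper: the state travels as an argument instead of a capture.
static T FirstWithState<T, TState>(List<T> source, TState state, Func<T, TState, bool> predicate)
{
    foreach (var item in source) // List<T>.Enumerator is a struct
        if (predicate(item, state)) return item;
    throw new InvalidOperationException("Sequence contains no matching element");
}

// 'static' makes any accidental capture a compile error, so no closure
// object is needed for this call.
var viaState = FirstWithState(people, myPersonName, static (x, name) => x.Name == name);

Console.WriteLine(viaCapture == viaState); // prints True
```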

garganzol 5 months ago

Interesting approach, but it should be the .NET runtime (or JIT) that optimizes small memory allocations by preferring the stack over the heap when possible.

.NET 10 takes a step in that direction [1].

[1] https://learn.microsoft.com/en-us/dotnet/core/whats-new/dotn...