In this article, I'm going to show you that:
I'm a big fan of Rust, and think it's a huge leap forward in language design. Rust has shown us a lot of techniques that work really well.
This isn't a hit piece, this is an honest criticism of Rust. 0 Still, those that identify heavily with it might want to stop reading here. Keep in mind that I'm comparing with Rust only because, as mentioned before, Rust is the gold standard for safety and speed.
Some assumptions up-front:
Rust is what ignited my passion for programming language development in the first place! So it's rather ironic that I'm writing an article about what might come after Rust.
See this article for this perspective.
Contrary to popular belief, the borrow checker incurs run-time overhead.
This isn't to say that Rust is slow. Despite the below costs, Rust is still one of the fastest languages in existence. It's not perfect though, and that means languages of the future might do better.
To illustrate the borrow checker's overhead, let's say we have a Vec<Spaceship>, and a dogfight function that:
The borrow checker rejects our call dogfight(&mut spaceships[shipAIndex], &mut spaceships[shipBIndex]). 2
The three main workarounds: 3
Each of these three workarounds introduces some run-time overhead. Respectively:
Memory safety is never free. Memory safe languages alway incur run-time overhead. One just has to zoom out enough to see it. 5
There are other safe and fast memory models with different tradeoffs (covered further below) and more are being discovered every year! It's an exciting time for language design.
It rejects it because it can't guarantee that shipAIndex != shipBIndex.
There are other tricks we can do to work around the borrow checker here, but none that are generally applicable in any situation. These three solutions can always be used.
We could also sunder our Vec<Spaceship> into a parallel Vec
One can contrive very small and simple programs that use no Vecs, but such programs are incredibly rare.
In general, Rust forces us to use a lot of Vecs, which always have costs.
Not only does indexing into them require bounds checks, but expanding them is very costly...
...and not as amortized as one would think. 7
If one is using generational indices, looking up an element will incur a branch when checking that the generations match.
The CPU can avoid cache misses for linear copies, but only for large Vecs (~50 cache lines)
For example, if a Vec is used for a temporary calculation in a function, the Vec's resizing cost isn't amortized across all the function calls.
Rust forces us to use a lot of Vecs to store our objects. However, Vecs have a tradeoff: they use a lot of memory. Your Vec with three elements might actually be 128 elements wide, if at some point in the past it had 65 elements in it. Rust programs use more and more memory as time goes on. 8
A very common technique in idiomatic Rust is to use generational indices to refer to objects. These too have memory overhead: a "current generation" integer for a slot in the array, and a "target generation" integer in every generational_index that is serving as a non-owning reference.
If a language could offer more flexibility on where objects reside and how they're represented, then they might avoid this memory overhead.
Rust reaches a miraculous level of safety language compared to, say, C++. Its ability to be very safe and low-level is truly an achievement.
As said in the beginning, this article is assuming that we're not using any unsafe in our code. However, we can never be certain if a dependency is using unsafe wisely, and it's rarely feasible to audit all dependencies' usage of unsafe.
Contrary to popular belief, not all memory unsafety happens inside unsafe blocks:
If a language can achieve Rust's performance without dropping into unsafe, and make a few other adjustments, 11 then this problem evaporates. I believe such a thing is possible.
There's another paradigm that gives you speed if you sacrifice memory: garbage collection! GC becomes faster and faster the more memory you give it.
The experienced Rustacean knows that any program is technically expressible in idiomatic Rust, but not everyone that contributes to the ecosystem is so experienced.
We shouldn't pretend that we can simply grep for unsafe in our dependencies and immediately see problems. It's not that simple; it's very easy for a bug in safe Rust to violate the assumptions in an unsafe block.
It's important to note that offering FFI (a foreign function interface) is just as bad as unsafe blocks, so any language that wishes to improve upon this will need to address FFI's potential unsafety as well.
In idiomatic Rust (without Rc), the borrow checker is very limiting, and rejects a staggering amount of safe patterns.
For example, Rust won't let you use the observer pattern.
There's only one option here: use globally-accessible data. 12
A great example of this is in the NAME HERE library. Because Rust can't handle observers, it requires all your data be in a SOMETHING struct, accessible to all observers.
An experienced architect will back up, and question why we have to use a framework for such a simple and safe pattern.
There are countless other examples of patterns that the borrow checker rejects.
If a language could offer static analysis on par with the borrow checker or integrate the borrow checker well with other strategies, this problem could be solved.
That development time is valuable, because it can be spent on optimizing the code that matters. If a hypothetical language was half as time-consuming as Rust, then that extra time could be spent optimizing the hot path, 13 which could result in a more efficient program in the end than the Rust one.
This can either be an actual global, or some other data that's widely accessible to most of the program. Both are harmful.
It's often said that a program spends 95% of its time in 10% of the code. The "hot path" refers to that 10%.
Rust isn't perfect, and one day, we'll find a language that will give us better safety, or more speed, or both. There are already a lot of fascinating techniques on the horizon which show a lot of promise.
Lobster is a new language that achieves some pretty amazing performance. At compile-time, its algorithm looks at all aliases for a given object, infers which one should be the "owning" reference, and make all other references into borrow references. For the rare case where this doesnt work, it falls back on reference counting. The algorithm eliminates 95% of all reference count adjustments.
It's like an "automatic borrow checker" in a way; it produces code that's similar to a Rust program that uses the occasional Rc, but much easier to use and with more potential optimizations since Rc is built into the language itself.
Additionally, it isolates memory between threads (similar to Rust and Vale) which drastically reduces reference counting's cost.
This is exciting because it shows that reference counting can be incredibly fast. There's a lot of different directions we can go with something like this, so keep an eye out for Lobster!
Cone is a language exploring what happens when we build the borrow checker on top of any kind of "region" that the user wants: lexical, arena, pool, reference counting, garbage collection, or the user can easily code their own region as well.
I'm particularly excited about this because a user can create their own region to suit a particular use case. Perhaps these could soon be possible:
I suspect that one day, we'll see a flood of interesting new memory management techniques, specialized for all sorts of use cases. After that ecosystem of regions forms, we'll start to learn when to use one over the other, how to seamlessly use them together, and achieve some kind of memory management enlightenment.
Vale is exploring how RAII interacts with different kinds of memory regions. Because all mutable objects are owned by a single owner, the language can "secede" an object (and all those it indirectly owns) into an isolated region, which we can then keep separate or merge with other regions.
With this, Vale makes it so that a function call (and any functions it indirectly calls) can use a specific region for its own allocations, while still being able to access the caller's region.
It keeps these separate by tracking which region every reference is pointing into.
You can read more about this at Zero-Cost References with Regions.
This is particularly exciting because:
Recently, the Vale team invented a new memory management paradigm, an alternative to borrow checking or reference counting or garbage collection, called generational memory. In generational memory, every allocation has an "actual generation" number at the top, 14 which represents "I am the nth inhabitant to occupy this location in RAM." Every (non-owning, non-borrow) reference is made of a pointer and a "target generation" number, so before each dereference, it checks that the allocation's actual generation still matches the reference's target generation, effectively checking if the target object is still alive.
Vale's next goal is to build on generational memory to make hybrid-generational memory, which blends in scope tethering 16 to enable automatic borrow checking in a way similar to an experienced Rust programmer, but without the difficulties of the borrow checker. You can get a sneak peak at the design at Hybrid-Generational Memory.
Address Sanitizer is a tool that numerous companies use to gain more confidence in their programs' memory safety. It instruments programs during development and testing to detect invalid memory accesses. 17 18 Then, after rigorous testing with this, a team will turn of Address Sanitizer in production, to maintain maximum performance. For certain use cases (such as games and apps) this is a perfect balance between safety and speed.
Constraint references go one step further than address sanitizer; instead of waiting until you dereference a dangling pointer, a constraint reference can detect when a pointer becomes dangling. 19 With this, a program can detect risky pointers before they're even dereferenced, similar to how SQL's foreign key constraints work (hence the name).
This was first implemented by Adam Dingle and David Bacon in an experimental language named Gel. We experimented with this in Vale and were wonderfully surprised at how easy they were to learn and use, and how few constraint halts we encountered (only two in a 1,500 line program, both which actually found legitimate mishandling). We decided to keep it in the language as Assist mode, and an additional Unsafe mode to turn off all protections if one deemed their testing with Assist mode was sufficient. You can read more about this at The Next Steps for Single Ownership and RAII.
This is particularly exciting because, given a good test suite, it's conservative enough to give high confidence in safety, 20 with zero memory or speed overhead, which will be the best choice for a vast number of programs.
The above languages are pushing the state of the art in regions, static analysis, and new memory management paradigms.
I predict that the 2020s will be a decade of exploration in the best ways to blend static analysis with a combination of different memory management paradigms in the same program, in a way that Rust programmers manually do right now.
If there's something that writing this article has taught me, it's that we need more programming languages. We should to explore these realms more, because there's so much interesting potential! If I had to list a few interesting ideas:
There are a lot of other ideas and experiments and new languages using techniques which I couldn't fit here. In a future article, I could cover Nim's ORC, Zig's ReleaseSafe, Basil's stack-based model, and much much more. Comment and let me know what you'd like to hear about!
This requires that our malloc uses size classes (like jemalloc and mimalloc) in a heap specifically for these "generationed allocations" so we can be certain nothing will overwrite them.
This is comparing naive generational memory to naive reference counting. In other words, there are no optimizations on top of either.
This means that locals can share ownership of an object by marking its "tethered" bit, preventing it from disappearing and allowing more aggressive static analysis.
This means that the compiler inserts assembly instructions before every dereference to make sure we're accessing only valid memory.
Asan is basically like generational memory, but 20-40x slower.
If the user intends for a reference to dangle, then they can use a weak reference instead of a constraint reference.
Keep in mind, Rust programs (especially those with dependencies) still have to worry about memory unsafety too, because of unsafe.