Memory safety is an incredibly useful aspect for a programming language. It protects us from maddening bugs, devious vulnerabilities, and cantankerous shenanigans.
Getting memory safety with predictable performance [0] is quite a challenge! Most languages need to sacrifice memory safety for other features, such as FFI or unsafe blocks.
We've discovered a better way to offer memory safety.
Vale starts with a foundation of generational references. In this scheme, every object has a "current generation" integer, which we increment when the object is freed, and every pointer carries a "remembered generation" number copied from the object it points to. When we dereference a pointer, we assert that those two numbers still match. See Generational References for a more in-depth explanation!
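As a rough illustration of the check described above, here is a minimal sketch in Rust rather than Vale. The names `Heap`, `Slot`, and `GenRef` are hypothetical and are not Vale's implementation. The sketch also returns `None` on a generation mismatch instead of asserting, purely so the behavior is easy to observe.

```rust
/// One slot in a toy heap: a value plus its "current generation".
struct Slot<T> {
    generation: u64,
    value: Option<T>,
}

/// A toy heap that hands out generational references (hypothetical, not Vale's design).
pub struct Heap<T> {
    slots: Vec<Slot<T>>,
}

/// A generational reference: where the object lives, plus the
/// "remembered generation" copied from the object at creation time.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct GenRef {
    index: usize,
    remembered_generation: u64,
}

impl<T> Heap<T> {
    pub fn new() -> Self {
        Heap { slots: Vec::new() }
    }

    /// Store a value, reusing a freed slot (and its bumped generation) if one exists.
    pub fn alloc(&mut self, value: T) -> GenRef {
        if let Some(index) = self.slots.iter().position(|s| s.value.is_none()) {
            self.slots[index].value = Some(value);
            return GenRef {
                index,
                remembered_generation: self.slots[index].generation,
            };
        }
        self.slots.push(Slot { generation: 0, value: Some(value) });
        GenRef { index: self.slots.len() - 1, remembered_generation: 0 }
    }

    /// Free the object and increment its current generation,
    /// which invalidates every reference that remembered the old one.
    pub fn free(&mut self, r: GenRef) -> Option<T> {
        let slot = self.slots.get_mut(r.index)?;
        if slot.generation != r.remembered_generation {
            return None;
        }
        slot.generation += 1;
        slot.value.take()
    }

    /// Dereference: succeeds only if the two generation numbers still match.
    pub fn get(&self, r: GenRef) -> Option<&T> {
        let slot = self.slots.get(r.index)?;
        if slot.generation != r.remembered_generation {
            return None;
        }
        slot.value.as_ref()
    }
}
```

In this sketch, a reference taken before `free` keeps failing its check even after the slot is reused for a new object, because the slot's generation has moved on.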
Most other languages use garbage collection, reference counting, or borrow checking. We chose generational references because they're fast [1] and more flexible than borrow checking, which lets us use common safe patterns such as observers, higher RAII, dependency injection, delegates, back-references, and graphs.
We also discovered that they enable Vale to have complete memory safety, something no native language has been able to achieve.
This is because generational references:
Generational references are pretty stellar for memory safety. However, there's one remaining problem: what about FFI?
A Foreign Function Interface (FFI) lets a language call into another language's code. For example, Objective-C might make a pointer to some data and pass it to C code that has a bug and corrupts that data.
This is called "leaky safety", and its bugs are very difficult to track down, because their symptoms manifest so far from their cause.
This can also happen when a language has unsafe escape hatches. If some unsafe code corrupts some memory, it can cause undefined behavior in safe code. For example, see this Rust snippet where an unsafe block corrupts some memory that's later used by the safe code.
In all these cases, we know that the unsafe language was involved somewhere in the chain of events, but since the bugs actually happen later on, in supposedly safe code, there's no easy way to identify which unsafe code was the original culprit.
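To make the "symptom far from cause" problem concrete, here is a small Rust sketch. It is not the snippet the text refers to, and `Divisors` and `buggy_unsafe_update` are hypothetical names. It deliberately stays within defined behavior (the unsafe write lands on valid memory) so that it remains runnable. Real memory corruption would be less polite.

```rust
/// A type whose safe API relies on an invariant: no stored divisor is zero.
pub struct Divisors {
    values: Vec<u32>,
}

impl Divisors {
    /// The constructor enforces the invariant for all safe callers.
    pub fn new(values: Vec<u32>) -> Option<Self> {
        if values.contains(&0) {
            return None;
        }
        Some(Divisors { values })
    }

    /// Safe code that trusts the invariant. `checked_div` is used here only
    /// so that the broken invariant shows up as `None` instead of a panic.
    pub fn divide_all(&self, n: u32) -> Vec<Option<u32>> {
        self.values.iter().map(|&d| n.checked_div(d)).collect()
    }

    /// Stand-in for a buggy unsafe block or foreign function: it writes
    /// through a raw pointer and silently breaks the invariant.
    pub fn buggy_unsafe_update(&mut self) {
        if self.values.is_empty() {
            return;
        }
        let first: *mut u32 = self.values.as_mut_ptr();
        // The write itself is to valid memory; the bug is the value written.
        unsafe {
            *first = 0;
        }
    }
}
```

Nothing goes wrong at the point of the unsafe write. The failure only appears later, inside `divide_all`, which contains no unsafe code at all.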
To solve this, Vale has Fearless FFI, which decouples and isolates unsafe C data from safe Vale data.
See Fearless FFI for more on this!
This protects us from any bugs in C that might otherwise accidentally corrupt our Vale data. [3]
Generational references are very fast, but if we want that extra sliver of performance, Vale plans to add the Check Override Operator.
Most generation checks will be skipped by the region borrow checker and Hybrid-Generational Memory. For example, we recently implemented a cellular automata algorithm, and by our measurements, regions and Hybrid-Generational Memory would eliminate every single generation check in the entire algorithm.
Still, for the occasional generation check that those two might not eliminate, we have the Check Override Operator, which will skip the generation check for a generational reference.
It might seem that skipping checks compromises memory safety, and it would, except for one key detail: the Check Override Operator is ignored by default for any dependencies. A user must explicitly opt in before a dependency's checks are skipped.
Most people will use a library with its checks on, so everyone will find out very quickly if there's any unsafe behavior in practice. Anyone who wants that extra sliver of performance can then opt in to skipping the checks in release mode, with a little more confidence that unsafety will be detected by other users of the library, or in development or testing.
And if a user isn't comfortable with that for their situation, they simply stick with the defaults, which ignore the check override operator.
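The default described above can be modeled as a simple policy. The sketch below is in Rust, not Vale, and every name in it (`BuildConfig`, `trust`, `check_runs`) is an assumption made for illustration. The text does not specify how Vale will expose this setting.

```rust
use std::collections::HashSet;

/// Hypothetical build configuration: which dependencies the user has
/// explicitly allowed to skip generation checks.
pub struct BuildConfig {
    trusted_dependencies: HashSet<String>,
}

impl BuildConfig {
    /// By default no dependency is trusted, so every override is ignored.
    pub fn new() -> Self {
        BuildConfig { trusted_dependencies: HashSet::new() }
    }

    /// Explicitly opt in to honoring overrides inside one dependency.
    pub fn trust(&mut self, dependency: &str) {
        self.trusted_dependencies.insert(dependency.to_string());
    }

    /// Returns true when the generation check still runs.
    /// A check is skipped only if the code asked for an override
    /// AND the user opted in for that dependency.
    pub fn check_runs(&self, dependency: &str, override_requested: bool) -> bool {
        !(override_requested && self.trusted_dependencies.contains(dependency))
    }
}
```

Under this model, a library author writing an override changes nothing for users who keep the defaults. The check is skipped only after the user calls something like `trust` for that specific dependency.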
We think this is a perfect tradeoff: it allows skipping checks when performance is critical, while not compromising the safety of the language or ecosystem.
We're also thinking of adding a way to skip all generation checks for a given block of code. [4]
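We talked about three mechanisms: generational references, Fearless FFI, and the Check Override Operator, which is ignored by default for dependencies.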
With these measures in place, Vale will be the first completely memory safe native language!
[0] Tracing garbage collection (like in Java) is a great solution for memory safety, but there are cases where more predictable performance is highly desirable, such as in embedded devices, games, or certain kinds of servers.
[1] They're not just fast, but they could get even faster when we introduce regions and Hybrid-Generational Memory.
[2] For example, when Python code sends a Python object into C, if the C code doesn't correctly call Py_INCREF, it will corrupt Python's memory and cause some mysterious behavior later on in the Python code.
[3] We could also protect against malicious code with sandboxing, via WebAssembly or subprocesses. This is a planned feature; see Fearless FFI for more on this!
[4] We might even call it an unsafe block, if that doesn't cause confusion with other languages.