Rust Flexible Memory

Rust lets you move between high-level algorithms and data handling and low-level byte manipulation and optimization. I’ve found it incredibly helpful to be able to shift between the two. You can start out with heap-allocated Vec and HashMap and slowly migrate to more specialized, byte-optimized structures.

This flexibility is what made it practical to implement my scripting language in Rust.

Since Crafting Interpreters was released I’ve worked on several iterations of my scripting language, Shimlang. It was written as a tree-walking interpreter in Python, then Zig, then Rust. I originally moved from Zig to Rust because I found it much easier to handle memory via RAII. It also felt oddly bloated to have the allocator as a property on many objects.

I got fairly far with that Rust implementation, even going as far as making a little game that I compiled as WASM for the web. However, it relied on a custom allocator that was fairly tedious to pass around, and required writing many custom collections to work with. For some reason I felt the need to support collections with fallible allocations. That was probably influenced by Zig. Garbage collection was implemented as reference counts with cycle detection.

Development on that was sidetracked when I had the desire to implement a game asset manager. Coming back to the project a couple of years later, I felt like the implementation didn’t fit the new vision I had for the language. I was inspired by this Tomorrow Corporation Tech Demo. In that demo they showed a time-travelling debugger, live code editing, graphical debugging, and way more. I wanted my language to be a scripting language for games, and I knew that the pointer soup I had built for the initial Rust implementation would be very hard to translate into something that could support those sorts of features. I thought Rust would make it hard to do the low-level memory manipulation I needed to make it work, so I went back to Zig with a fresh start.

With some more experience I was able to be more productive with Zig than I had been originally. I still had problems though. The build system (at the time) was still unstable and changing (seemingly) every major release. While some people may like the fact that the build system is also in Zig, I think that implementing a build script in a language that has you manually managing memory is a bad idea. Further, the docs for Zig are not robust enough to make using it enjoyable, and the language is changing often enough that LLMs are not much help.

The thing that finally made me stop was the lack of good sum types and the inflexible error handling. Defining the tag separately from the union is unergonomic, and needing to pass error information through some object or global doesn’t feel great to use.

On top of that, it’s really really annoying to constantly get an error message from the compiler that a variable is unused. When rapidly iterating on code this is highly unnecessary and is a huge break from every other language.

So, once again, I implemented Shimlang in Rust. This is a completely separate implementation from any of the others. The biggest departure from the other implementations is a dedicated block of memory for the interpreter to use. Any value that can’t fit into 64 bits lives in this memory. Further, since I always know how much space a value in this block needs, freeing memory takes the size to free as an argument (in contrast to C’s free). This saves the allocator from needing to store metadata about each allocation for subsequent freeing.
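To make the "free takes a size" idea concrete, here is a minimal sketch. It is not Shimlang's actual allocator: the names Block, alloc, free, and free_lists are made up for illustration, and it assumes the block is addressed by word offsets rather than raw pointers. The point is only that, because the caller hands the length back, no header has to be stored next to each allocation.

```rust
/// A fixed block of 64-bit words owned by the interpreter (hypothetical layout).
pub struct Block {
    words: Vec<u64>,
    /// Index of the first word that has never been handed out.
    bump: usize,
    /// free_lists[n] holds offsets of freed regions that are exactly n words long.
    free_lists: Vec<Vec<usize>>,
}

impl Block {
    pub fn new(capacity_words: usize) -> Self {
        Block { words: vec![0; capacity_words], bump: 0, free_lists: Vec::new() }
    }

    /// Reserve `len` words and return the offset of the first one,
    /// or None if the block is exhausted.
    pub fn alloc(&mut self, len: usize) -> Option<usize> {
        // Prefer a previously freed region of exactly this size.
        if let Some(offset) = self.free_lists.get_mut(len).and_then(|list| list.pop()) {
            return Some(offset);
        }
        let end = self.bump.checked_add(len)?;
        if end > self.words.len() {
            return None;
        }
        let offset = self.bump;
        self.bump = end;
        Some(offset)
    }

    /// Return a region to the block. The caller supplies `len`, so the
    /// allocator never has to look up how big the allocation was.
    pub fn free(&mut self, offset: usize, len: usize) {
        if self.free_lists.len() <= len {
            self.free_lists.resize_with(len + 1, Vec::new);
        }
        self.free_lists[len].push(offset);
    }

    /// Borrow the words of an allocation for reading or writing.
    pub fn slice_mut(&mut self, offset: usize, len: usize) -> &mut [u64] {
        &mut self.words[offset..offset + len]
    }
}

/// Small usage example: allocate, free with an explicit size, and reuse.
pub fn demo() -> (usize, usize, usize) {
    let mut block = Block::new(16);
    let a = block.alloc(3).unwrap();
    let b = block.alloc(2).unwrap();
    block.slice_mut(a, 3).copy_from_slice(&[1, 2, 3]);
    block.free(a, 3); // the size comes from the caller, not from a header
    let c = block.alloc(3).unwrap(); // lands on the region `a` used
    (a, b, c)
}
```

The trade-off is that passing the wrong length to free silently corrupts the free lists, which is the sort of bug a size header in a C-style allocator would avoid.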

To take this approach, I needed to free myself from the idea of writing idiomatic Rust code. It requires using unsafe everywhere. It’s still much safer than C/Zig, but it’s a big departure from typical Rust programming. Now I’m willing to write “hacky” code and I’m more willing to write things myself. It’s much more satisfying than learning yet-another-library.

The one other thing that enabled this is LLMs. Bluntly, using unsafe in the pre-LLM days was frustrating. It was hard to know whether you’d unexpectedly cause a drop to be called, how to get a proper reference from a pointer, how to get a pointer in the first place, or how to write through one. Things that are very, very easy in C became headaches in Rust. This changed post-LLMs. Now I can ask how to turn a u64 representing a pointer into a &SomeValue, and learn that when initializing the memory behind a pointer I should use .write. It’s much easier to learn now that there are chat bots that can patiently explain how to do these things. I’m sure I’m making mistakes, but I’m getting things done in a way I wasn’t able to before. Beyond unsafe, I’m writing better Rust code since I can always ask how to do something better.
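For readers who haven't hit these questions yet, here is a small sketch of the two operations mentioned: initializing fresh memory with .write and turning a u64 address back into a &SomeValue. The fields on SomeValue and the helper names store, as_ref, and release are invented for the example, and it uses the global allocator rather than an interpreter-owned block to stay short.

```rust
use std::alloc::{alloc, dealloc, Layout};

/// Hypothetical value type; the String field gives it a real destructor.
#[derive(Debug, PartialEq)]
pub struct SomeValue {
    pub name: String,
    pub count: u32,
}

/// Allocate space for one SomeValue, initialize it, and return its address.
pub fn store(value: SomeValue) -> u64 {
    let layout = Layout::new::<SomeValue>();
    // SAFETY: SomeValue is not zero-sized, so the layout has a non-zero size.
    let ptr = unsafe { alloc(layout) } as *mut SomeValue;
    assert!(!ptr.is_null(), "allocation failed");
    // SAFETY: ptr is non-null, aligned, and valid for writes. `write` moves
    // `value` in without reading the old bytes. A plain `*ptr = value` would
    // first drop whatever garbage is there as if it were a SomeValue.
    unsafe { ptr.write(value) };
    ptr as u64
}

/// Turn an address back into a shared reference.
///
/// SAFETY: `addr` must come from `store` and must not have been released,
/// and the returned reference must not outlive the allocation.
pub unsafe fn as_ref<'a>(addr: u64) -> &'a SomeValue {
    unsafe { &*(addr as usize as *const SomeValue) }
}

/// Move the value out and give the memory back.
///
/// SAFETY: `addr` must come from `store` and must be released only once.
pub unsafe fn release(addr: u64) -> SomeValue {
    let ptr = addr as usize as *mut SomeValue;
    unsafe {
        let value = ptr.read(); // moves out; the slot is now logically uninitialized
        dealloc(ptr as *mut u8, Layout::new::<SomeValue>());
        value
    }
}

/// Small usage example: round-trip a value through a u64 address.
pub fn demo() -> (String, u32) {
    let addr = store(SomeValue { name: "shim".to_string(), count: 7 });
    let seen = unsafe { as_ref(addr) }.count;
    let value = unsafe { release(addr) };
    (value.name, seen)
}
```

Note that dealloc is also handed the Layout, so Rust's own allocator interface already expects the caller to know the size when freeing.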

Working on this, I found that it was easy to use my memory block alongside Vec/Box/HashMap. The struct data for boxes, hashmaps, and vecs lived in my block of memory and pointed out to memory on the heap. Rather than needing to manually manage all memory from the get-go, I found that I could combine manual and automatic memory management. Rather than being delayed by not having my own hashmap, I was able to use the std HashMap and continue making good progress.
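Here is roughly what that mix can look like, as a sketch under assumptions rather than the real Shimlang code. The Block type, its 64-word size, and the init_vec, vec_mut, and drop_vec helpers are all hypothetical. The Vec header (pointer, length, capacity) sits inside the block, while the elements stay on the ordinary heap and are managed by Vec itself.

```rust
use std::mem::{align_of, size_of, MaybeUninit};
use std::ptr;

/// Stand-in for the interpreter's memory block: 64 words, aligned like u64.
pub struct Block {
    words: [MaybeUninit<u64>; 64],
}

impl Block {
    pub fn new() -> Self {
        Block { words: [MaybeUninit::uninit(); 64] }
    }

    /// Raw pointer to a Vec header stored at word index `word`.
    fn slot(&mut self, word: usize) -> *mut Vec<u64> {
        let words_needed = (size_of::<Vec<u64>>() + size_of::<u64>() - 1) / size_of::<u64>();
        assert!(word + words_needed <= self.words.len(), "slot out of bounds");
        assert!(align_of::<Vec<u64>>() <= align_of::<u64>(), "slot under-aligned");
        self.words[word..].as_mut_ptr() as *mut Vec<u64>
    }

    /// Place an empty Vec header in the block at `word`.
    pub fn init_vec(&mut self, word: usize) {
        let slot = self.slot(word);
        // SAFETY: `slot` checked bounds and alignment; `write` does not
        // try to drop the uninitialized bytes already in the slot.
        unsafe { slot.write(Vec::new()) };
    }

    /// SAFETY: `word` must hold a Vec set up by `init_vec` and not yet dropped.
    pub unsafe fn vec_mut(&mut self, word: usize) -> &mut Vec<u64> {
        unsafe { &mut *self.slot(word) }
    }

    /// Run the Vec's destructor, releasing its heap buffer.
    ///
    /// SAFETY: same requirement as `vec_mut`; the slot must not be used
    /// as a Vec afterwards unless `init_vec` is called again.
    pub unsafe fn drop_vec(&mut self, word: usize) {
        unsafe { ptr::drop_in_place(self.slot(word)) };
    }
}

/// Small usage example: the header lives in the block, the elements on the heap.
pub fn demo() -> Vec<u64> {
    let mut block = Block::new();
    block.init_vec(4);
    unsafe {
        let v = block.vec_mut(4);
        v.push(10);
        v.push(20);
        v.extend([30, 40]); // growth and reallocation are handled by Vec
        let copy = v.clone();
        block.drop_vec(4); // manual step: the block will not drop it for us
        copy
    }
}
```

The manual part is remembering to call drop_vec before the slot is reused. Everything about growing and freeing the element buffer stays automatic.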

I expect this feeling isn’t unique to Rust, and folks using C++ or D could relate to this. However, I don’t think most folks using Rust can really relate. The community shuns uses of unsafe. That probably makes sense for networked and security-minded code, but not everything. For my little language unsafe makes it so that I can even consider the sorts of tooling and debugging that I desperately want for my language.