High-level description
This chapter contains the design docs for the persistable collection types implemented within magpie, to be used when defining data that is meant to be stored in objects operated on from within nanotransactions.
The problem
To model most interesting problems, we need non-scalar types such as sets, dynamic arrays, and so on. As things currently stand, we cannot leverage Rust's built-in collection types, because of how they lay out their memory.
Take the built-in Vec type as an example. A Rust vector consists of a small header holding the underlying array's length and current capacity, along with a pointer to the buffer that holds the array's contents; this pointer refers to the virtual address space of the process that instantiated the vector. What this means is that, if we were using such a vector as part of a persistable data structure in our Magpie program, the "transparent" abstraction of manipulating data in an mmap-ed file would break: the contents of the vector would reside outside of the file, since the standard library allocates the buffer on the process' heap through the global allocator.
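To make this concrete, the small standalone sketch below (not part of magpie) prints the address of a Vec's header and the address of the heap buffer it points to. Persisting only the header's bytes into a file would save a pointer that is meaningless outside of this process.

```rust
// Minimal illustration of why a built-in Vec cannot be persisted by simply
// writing its bytes to a file: the Vec value itself is only a
// (pointer, capacity, length) header, and the pointer refers to a heap
// buffer inside this process' address space.
fn main() {
    let v: Vec<u64> = (0..4).collect();

    // Address of the header (the Vec value itself, here on the stack).
    let header_addr = &v as *const Vec<u64> as usize;
    // Address of the heap buffer the header points to.
    let buffer_addr = v.as_ptr() as usize;

    println!("header lives at {:#x}", header_addr);
    println!("buffer lives at {:#x}", buffer_addr);

    // Persisting the header alone would store a dangling, process-local
    // pointer; the buffer's contents never make it into the file.
    assert_ne!(header_addr, buffer_addr);
}
```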
The solution
In an ideal world, all standard library collection types would allow us to set the allocator for only the instances we are interested in (i.e. those that are meant to end up transparently persisted) to a custom allocator that "understands" how to lay memory out for the task at hand. Even though the allocator_api RFC is a step in this direction, it is unclear when (and if) it will move beyond an unstable, nightly-only facility, and when the standard library collection types will actually leverage it to support parameterization through custom allocators.
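For reference, this is roughly what that ideal looks like today on nightly Rust with the unstable allocator_api feature. `FileBackedAlloc` is a hypothetical placeholder that simply delegates to the global allocator; a real implementation would hand out memory from inside the mapped file.

```rust
// Nightly-only sketch: picking the allocator per Vec instance via the
// unstable allocator_api feature.
#![feature(allocator_api)]

use std::alloc::{AllocError, Allocator, Global, Layout};
use std::ptr::NonNull;

// Hypothetical allocator: delegates to the global allocator here, but a real
// implementation would carve memory out of the mmap-ed file.
struct FileBackedAlloc;

unsafe impl Allocator for FileBackedAlloc {
    fn allocate(&self, layout: Layout) -> Result<NonNull<[u8]>, AllocError> {
        Global.allocate(layout)
    }

    unsafe fn deallocate(&self, ptr: NonNull<u8>, layout: Layout) {
        unsafe { Global.deallocate(ptr, layout) }
    }
}

fn main() {
    // Only this instance uses the custom allocator; other Vecs are untouched.
    let mut v: Vec<u64, FileBackedAlloc> = Vec::new_in(FileBackedAlloc);
    v.push(42);
    assert_eq!(v[0], 42);
}
```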
Because of this, we are opting to implement (some) collection types ourselves. We implement custom allocators, and leverage them to build collection types that expose a subset of the standard library APIs, so that they can be dropped into programs that would otherwise use the standard library collections and behave as users expect, without many surprises. At the same time, these types play nicely with the rest of the infrastructure we have set up around transparent operations over persistent data and data movement.
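As a rough sketch of the shape such a type might take (the name `PVec` and the offset-based layout are illustrative assumptions, not the actual magpie design), the key idea is to store position information that remains valid inside the mapped file rather than raw process-local pointers, while mirroring a familiar slice of the Vec API:

```rust
use std::marker::PhantomData;

// Hypothetical sketch -- `PVec` and its offset-based layout are illustrative
// assumptions, not the actual magpie types. The buffer's position is recorded
// as an offset into the mapped file instead of a raw pointer into this
// process' address space, so the whole value can live inside the file.
pub struct PVec<T> {
    // Where the element buffer starts, relative to the base of the mapping.
    buffer_offset: u64,
    len: u64,
    capacity: u64,
    _marker: PhantomData<T>,
}

impl<T> PVec<T> {
    // A small slice of the std Vec API, so call sites read much like Vec.
    pub fn len(&self) -> usize {
        self.len as usize
    }

    pub fn is_empty(&self) -> bool {
        self.len == 0
    }

    pub fn capacity(&self) -> usize {
        self.capacity as usize
    }

    // push / pop / get / iter would resolve `buffer_offset` against the
    // mapping base and grow the buffer through the file-backed allocator
    // rather than the global allocator.
}
```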