
Nanotransactions - v0

Declaring a nanotransaction

The prototype implementation of "nanotransactions" in magpie currently relies on a procedural attribute macro (nandoize) that users apply to functions that are meant to run as nanotransactions.

nandoize

The purpose of the nandoize macro is to wrap (almost) arbitrary user functions in a "transactional context". The implementation documentation is located here.

As an example, say we have the following function set_key() operating over arguments of type KeyValuePair:

```rust
#[derive(PersistableDerive)]
struct KeyValuePair {
    key: u64,
    value: u128,
}

#[nandoize]
fn set_key(x: &mut KeyValuePair, k: u64) {
    x.key = k;
}
```

The macro will result in the following meta-function being generated:

```
1  | impl NandoManager {
2  |    fn set_key_nando(object_tracker: &object_lib::ObjectTracker, iptr0: &object_lib::IPtr, k: u64) {
3  |        let mut obj0 = match object_tracker.get(iptr0.object_id) {
4  |            Some(o) => o.lock(),
5  |            None => panic!("Object {} not found", iptr0.object_id),
6  |        };
7  |        obj0.advise();
8  |        let mut v = unsafe { obj0.read_into::<KeyValuePair>(&iptr0).unwrap().as_mut() }.unwrap();
9  |        /* pre-image logging */
10 |        let res = set_key(v, k);
11 |        /* post-image logging and flushing/fsync'ing here */
12 |        res
13 |    }
14 | }
```

There are three parts to these generated functions:

  1. The first part is the "preamble", where:
    • invariant pointers are resolved to concrete in-memory pointers (lines 3-8 in the sample above), potentially also resolving to the appropriate version. Arguments that the callee expects to receive by value (scalars or more complex structures) are simply passed through to the wrapped user function.
    • pre-transaction state is captured for all (potentially) mutable fields.
  2. The second part is the call to the original user function (line 10).
  3. The last part is the logging and syncing phase, during which:
    • post-transaction state of the touched data is captured
    • flushes to disk are issued

With this setup, we still get to write "normal" Rust functions over "normal" Rust data, but we can then invoke them through the runtime, which also manages the objects.

Not all functions may be annotated with nandoize. The current restrictions are:

  • The annotated function may not be async (this is subject to change, see epics)
  • Any argument passed by reference must implement the Persistable trait.

If the macro is invoked on an async function, it panics with an appropriate error message, which surfaces as a compile-time error. If a non-Persistable argument appears in the function's signature, the macro expansion will succeed, but the final user program will not typecheck.

Persistable Trait

The Persistable trait should be implemented for any type that the user wants to store in an object. Its default implementation provides two methods to transparently convert between stored, mmap-ed data and usable data structure instances:

```rust
use std::{mem, slice};

pub trait Persistable {
    fn as_bytes(&self) -> &[u8]
    where
        Self: Sized,
    {
        unsafe { slice::from_raw_parts((self as *const Self) as *const u8, mem::size_of::<Self>()) }
    }

    fn from_bytes(src: *mut [u8]) -> *mut Self
    where
        Self: Sized,
    {
        src as *mut Self
    }
}
```

This is trivially derivable for "simple" types (hence the inclusion of a simple derive macro, PersistableDerive), but it really falls over for owned types (I won't even mention collections; they have been covered elsewhere) -- this is on the short-term list of things to address.

Execution

Currently, the only way to invoke a nanotransaction "from the outside" (that is, other than feeding a workload file to the runtime or invoking Rust functions that trigger nanotransactions, as tests do) is to make an RPC to a magpie instance, where the endpoint is (currently, unfortunately) handwritten by the user. For the set_key() method, the user would also have to implement the following function (some boilerplate has been removed for illustration):

```rust
async fn set_key(&self, request: Request<RequestBody>) -> Result<Response<()>, Status> {
    let object_iptr = IPtr::from(request.get_ref());
    let new_key: u64 = request.get_ref().key.into();

    self.wait_for_object(&object_iptr).await;

    {
        // clone the async Arc from the server instance
        let object_tracker = Arc::clone(&self.object_tracker);
        tokio::task::spawn_blocking(move || {
            let rt_handle = tokio::runtime::Handle::current();
            let object_tracker = rt_handle.block_on(object_tracker.read());
            NandoManagerBase::set_key_nando(
                &object_tracker,
                &object_iptr,
                new_key,
            );
        })
        .await
        .unwrap();
    }

    Ok(Response::new(()))
}
```

What works and what doesn't

Pretty much all of magpie is reliant on tokio, including nanotransaction scheduling and execution, and I want to move away from that, at the very least for some performance predictability and hopefully some better debugging facilities.

I think the approach of generating transactionally-boxed code from macro-annotated user functions is a good one, but the current implementation is very hacky. Additionally, as it stands, I have no clear plan for extracting user application code out of the magpie binary, providing a clean library, and offering some kind of integration with a magpie daemon or something along those lines.

The following page goes into more detail for the next target design for the prototype.