Ownership Management - v0
System Breakdown
There are two distinct components to the ownership tracking subsystem:
- A "server" component that maintains the mapping from object ID to current owner, together with information that encodes the "lease" of ownership (the period during which ownership is not permitted to change).
- Client-side libraries that magpie instances (the clients) use to contact the server.
No ownership state is maintained within magpie itself, at least in the current iteration. It might be beneficial to cache ownership results (given that we can retrieve the lease time from the server) to reduce the number of round trips, but the necessity for this remains to be decided.
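If caching does turn out to be worthwhile, one possible shape is a client-side map whose entries are trusted only until the lease reported by the server runs out. The sketch below is an illustration under assumptions, not part of the v0 design: `OwnerCache`, `CachedOwner`, the integer ID aliases, and the millisecond logical clock passed in as `now_ms` are all hypothetical.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins; the real ID types are not specified in this document.
type ObjectId = u64;
type HostId = u32;

/// One cached answer from the ownership server.
struct CachedOwner {
    owner: HostId,
    /// Logical timestamp (ms) at which the lease reported by the server ends.
    lease_expires_at_ms: u64,
}

/// Hypothetical client-side cache; entries are trusted only while the lease holds.
#[derive(Default)]
struct OwnerCache {
    entries: HashMap<ObjectId, CachedOwner>,
}

impl OwnerCache {
    /// Record an owner together with the lease expiry retrieved from the server.
    fn insert(&mut self, object_id: ObjectId, owner: HostId, lease_expires_at_ms: u64) {
        self.entries
            .insert(object_id, CachedOwner { owner, lease_expires_at_ms });
    }

    /// Return the cached owner if its lease is still running at `now_ms`.
    /// A stale or missing entry yields `None`, so the caller asks the server again.
    fn get(&mut self, object_id: ObjectId, now_ms: u64) -> Option<HostId> {
        let live = self
            .entries
            .get(&object_id)
            .filter(|entry| now_ms < entry.lease_expires_at_ms)
            .map(|entry| entry.owner);
        if live.is_none() {
            // Drop expired entries eagerly; removing a missing key is a no-op.
            self.entries.remove(&object_id);
        }
        live
    }
}
```

Because ownership cannot change while a lease is running, a hit in such a cache would be as authoritative as a round trip, which is what makes the lease time a natural cache expiry.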
How the rest of the system uses the APIs exposed by the ownership subsystem, for things like scheduling, is not covered by this document. That should be described in the respective components' design documents.
Prototype Implementation
For the purposes of the v0 prototype, we will host ownership information in Redis, which will be the server component described above, mapping object IDs to host IDs.
We will use two logical databases (indexes 0 and 1) to maintain ownership information. Database 0
is the one that ownership-tracker reads from and writes to. The key/value pairs present in it at
any given point in time represent all the current strong ownership relations. Database 1 is the
database that location-manager communicates with. It will contain a mapping from objects to
hosts, but these pairs represent object locations. When an entry in database 0 expires, the system
will fall back to querying the location database to determine weak ownership relations.
The above scheme is less a matter of "proper design" and more a side effect of choosing Redis, wanting to keep interactions with it to a minimum, and not wanting to store complex serialized data structures as values. Under this scheme, magpie has to do the bare minimum to maintain ownership relations and to enforce multi-key atomic ownership updates, as described in the protocol section below.
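The lookup order described above (strong ownership from database 0 first, then the location database as a fallback) can be sketched with two in-memory maps standing in for the two Redis logical databases. This is a model under assumptions rather than the prototype's code: `Store`, `Ownership`, `resolve_owner`, and the integer ID aliases are hypothetical, and lease expiry is modeled with an explicit timestamp compared against a caller-supplied `now_ms` instead of Redis key expiry.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins; the real ID types are not specified in this document.
type ObjectId = u64;
type HostId = u32;

/// Which database produced the answer.
#[derive(Debug, PartialEq)]
enum Ownership {
    /// Live entry in database 0.
    Strong(HostId),
    /// No live entry in database 0; derived from the object's location.
    Weak(HostId),
}

/// In-memory stand-in for the two logical databases.
#[derive(Default)]
struct Store {
    /// Database 0: object -> (owner, lease expiry in ms).
    ownership: HashMap<ObjectId, (HostId, u64)>,
    /// Database 1: object -> host currently holding the data.
    location: HashMap<ObjectId, HostId>,
}

impl Store {
    /// Prefer a live strong-ownership entry, else fall back to the location.
    fn resolve_owner(&self, object_id: ObjectId, now_ms: u64) -> Option<Ownership> {
        if let Some(&(owner, expires_at_ms)) = self.ownership.get(&object_id) {
            if now_ms < expires_at_ms {
                return Some(Ownership::Strong(owner));
            }
        }
        self.location.get(&object_id).copied().map(Ownership::Weak)
    }
}
```

Keeping both values as plain host IDs is what lets each database stay a flat key/value mapping, with no serialized structure to decode on either path.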
Protocols
Location Manager
There are two cases when the location manager of a magpie instance will need to contact the location database:
- Whenever a new object is created. This in essence publishes the object to the entire cluster.
- Whenever an object is moved. The location manager of the receiving node updates the value associated with the existing object as soon as the data transfer is complete. The node initiating the move should atomically update ownership of the object before it triggers the data movement, as described in the ownership tracker's protocol section below.
The API exposed by the location manager for this piece of functionality can be boiled down to the following two functions.
```rust
fn get_location(object_id: ObjectId) -> Result<HostId>
fn set_location(object_id: ObjectId, host_id: HostId) -> Result<()>
```
Both operations are expected to succeed in the absence of connectivity issues to the Redis instance. Updating the location of a given object is unconditional.
NOTE: We will probably not need to update a set of object locations atomically, but if we end up in that position, the above signatures will be updated to operate over collections of objects.
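To make the contract of the two functions concrete, here is a minimal in-memory sketch. It is an illustration only: `InMemoryLocations`, `LocationError`, the `Result` alias, and the integer ID aliases are hypothetical, and a `HashMap` stands in for the Redis location database. How a lookup of an object that was never published should be reported is not specified in this document; the sketch assumes an error.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins; the real ID types are not specified in this document.
type ObjectId = u64;
type HostId = u32;

#[derive(Debug, PartialEq)]
enum LocationError {
    /// Assumption: looking up an object that was never published is an error.
    UnknownObject(ObjectId),
}

type Result<T> = std::result::Result<T, LocationError>;

/// In-memory stand-in for the location database (database 1).
#[derive(Default)]
struct InMemoryLocations {
    locations: HashMap<ObjectId, HostId>,
}

impl InMemoryLocations {
    /// Look up the host that currently holds the object's data.
    fn get_location(&self, object_id: ObjectId) -> Result<HostId> {
        self.locations
            .get(&object_id)
            .copied()
            .ok_or(LocationError::UnknownObject(object_id))
    }

    /// Unconditionally record the location. The same call covers both publishing
    /// a new object and overwriting the entry after a completed move.
    fn set_location(&mut self, object_id: ObjectId, host_id: HostId) -> Result<()> {
        self.locations.insert(object_id, host_id);
        Ok(())
    }
}
```

Since `set_location` never inspects the previous value, creating an object and moving one look identical from the database's point of view.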
Ownership Tracker
The ownership tracker exposes the following operations, all of which should be considered atomic over the set of keys passed in as arguments.
```rust
fn get_owners(object_ids: Vec<ObjectId>) -> Result<Vec<(ObjectId, HostId)>>
fn set_owner(object_ids: Vec<ObjectId>, host_id: HostId) -> Result<()>
fn renew_ownership(object_ids: Vec<ObjectId>) -> Result<()>
```
A call to set_owner() fails as a whole if at least one of the provided objects is already
strongly owned by a host, even if that host is the host_id passed in. To renew an owner's lease,
use renew_ownership() instead.
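The all-or-nothing behaviour can be sketched as a check-then-apply pass over the whole key set. This is a model under assumptions, not the prototype's Redis implementation: `InMemoryOwnership`, `OwnershipError`, the fixed `lease_ms`, the caller-supplied `now_ms` clock, and the integer ID aliases are hypothetical, and the functions take slices and a timestamp rather than the exact signatures listed above. The sketch also assumes that renewing an object without a live lease is an error, which this document does not specify.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins; the real ID types are not specified in this document.
type ObjectId = u64;
type HostId = u32;

#[derive(Debug, PartialEq)]
enum OwnershipError {
    /// At least one requested object still has a live lease.
    AlreadyOwned(ObjectId),
    /// Assumption: renewal of an object without a live lease is rejected.
    NotOwned(ObjectId),
}

type Result<T> = std::result::Result<T, OwnershipError>;

/// In-memory stand-in for database 0: object -> (owner, lease expiry in ms).
struct InMemoryOwnership {
    lease_ms: u64,
    leases: HashMap<ObjectId, (HostId, u64)>,
}

impl InMemoryOwnership {
    fn new(lease_ms: u64) -> Self {
        Self { lease_ms, leases: HashMap::new() }
    }

    /// True while the object is strongly owned at `now_ms`.
    fn is_live(&self, id: ObjectId, now_ms: u64) -> bool {
        self.leases.get(&id).map_or(false, |&(_, expiry)| now_ms < expiry)
    }

    fn set_owner(&mut self, object_ids: &[ObjectId], host_id: HostId, now_ms: u64) -> Result<()> {
        // Validate every key first so that a failure leaves no partial update.
        // The current owner is deliberately not exempt from this check.
        if let Some(&id) = object_ids.iter().find(|&&id| self.is_live(id, now_ms)) {
            return Err(OwnershipError::AlreadyOwned(id));
        }
        let expiry = now_ms + self.lease_ms;
        for &id in object_ids {
            self.leases.insert(id, (host_id, expiry));
        }
        Ok(())
    }

    fn renew_ownership(&mut self, object_ids: &[ObjectId], now_ms: u64) -> Result<()> {
        // Same two-phase shape: reject the whole batch before touching any lease.
        if let Some(&id) = object_ids.iter().find(|&&id| !self.is_live(id, now_ms)) {
            return Err(OwnershipError::NotOwned(id));
        }
        let expiry = now_ms + self.lease_ms;
        for id in object_ids {
            if let Some(lease) = self.leases.get_mut(id) {
                lease.1 = expiry;
            }
        }
        Ok(())
    }
}
```

In this model, a second `set_owner` call by the same host over an overlapping key set is rejected while the lease is live and leaves the existing leases untouched. `renew_ownership` is the only way to extend them.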