Spur, memfd and a restart trick?

Holger Freyther

A small thought experiment. One issue with running a system for a long time is how to apply small and larger upgrades. In theory a long-running image can be updated in place, but in practice this is problematic[1]. For C-based systems we played with the idea of using shared memory to "park" state so that we can read it back later, but shm has some cleanup and management issues.

IIRC, in a system like Minix an upgrade is done by externalizing the state, starting a new server, parsing/upgrading the state and then downgrading it again; if the result is the same as the original state, the new server is allowed to take over.
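To make the round-trip idea concrete, here is a minimal Python sketch. It is not how Minix implements it; the state layout (a "peer" string split into "host" and "port") and the function names are made up for illustration.

```python
def upgrade(old):
    """v1 -> v2: hypothetical schema change that splits 'peer' into host/port."""
    host, port = old["peer"].rsplit(":", 1)
    return {"version": 2, "host": host, "port": int(port)}


def downgrade(new):
    """v2 -> v1: inverse of upgrade(), used only for the safety check."""
    return {"version": 1, "peer": f"{new['host']}:{new['port']}"}


def may_take_over(old_state):
    """Allow the new server to take over only if upgrade/downgrade is lossless."""
    try:
        return downgrade(upgrade(old_state)) == old_state
    except (KeyError, ValueError):
        # State the new version cannot parse at all: keep the old server running.
        return False
```

If the check fails, the old server simply keeps running with its state untouched, which is the attractive part of this scheme.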

Yesterday I wondered how Spur memory segments and memfd could play together to implement something like the above. Imagine a REST server with some open connections: how can we restart it without dropping them?

Maybe something like this could work:

* Use memfd_create to allocate memory and use setenv to "remember" the fd
* Serialize the server data with Fuel/STON
* Serialize/remember the open file descriptors (sockets)
* exec the VM/new image
* Materialize the state
* Re-create the SocketPlugin/FilePlugin resources
* Close the memory

One issue with Fuel is that the representation of objects must be similar between the old and new versions of the image (but that could be solved). What would be neater, and a lot more difficult, is something like this:

* Use memfd_create and treat this as a memory segment of Spur (old space?)
* Move/copy the "state" (whatever that is; the transitive closure might be too large) into that segment
* On re-exec move these objects into the main image again

Again, the issue is what happens if the sizes of instances have changed... but maybe we can do the object migration from within the image, where we have both the old and the new class around?
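As a rough illustration of migrating with both class versions around, here is a Python sketch that copies slots by name and fills new ones from defaults. SessionV1, SessionV2 and migrate are invented names; a real Spur-level migration would operate on object formats and would look quite different.

```python
class SessionV1:
    __slots__ = ("peer", "bytes_seen")


class SessionV2:
    # The new version grew by one slot, so the instance size changed.
    __slots__ = ("peer", "bytes_seen", "last_activity")


def migrate(old, new_class, defaults):
    """Build a new_class instance, copying slots by name from old."""
    new = new_class.__new__(new_class)
    for name in new_class.__slots__:
        if hasattr(old, name):
            setattr(new, name, getattr(old, name))
        else:
            setattr(new, name, defaults[name])  # slot unknown to the old class
    return new
```

Slots that only exist in the old class are dropped silently here; whether that is acceptable depends on the state being migrated.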

What do you think?


[1] When does an update take place (so that state can be updated)? If you have put too much code into a block closure that is currently executing, that code will not be updated. In Erlang, for example, it is well defined when a process picks up the update.