Copy-on-write for a multithreaded VM

Ben Coman
I am curious what issues are holding us back from a multithreaded
VM becoming mainstream, since I had a passing thought about a strategy
for a partially multithreaded VM.

I see multi-threading has been prototyped before with RoarVM [1] & HydraVM
[2], and the CogBlog project page [3] says "While multi-threading
seems like an obvious and important direction to take the system in,
making the VM multi-threaded per-se **does not provide any benefit
before the Smalltalk image is made thread-safe** and that is
probably more work than providing a multi-threaded VM.  Hence a
potentially more profitable approach is to concentrate on federating
multiple VMs running multiple images, communicating through the
threaded FFI."

So I agree it would be a big job to make the whole Image thread-safe,
but I wonder if a useful subset would be threads that are unable to
disturb the system state? This might cover a lot of needs for
parallelism, for example:
* Complex force layouts with Roassal
* Web server worker threads
* Screen rendering by #fullDraw: & #drawOn:

A copy-on-write facility might provide this, such that any required
system state changes require manual coding in the parent-thread to
process a result object returned from the child-thread when it exits.
This might be implemented by combining Spur's new features for:
* immutability
* lazy become forwarding
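
As a rough illustration of the "child returns a result object, parent applies it" discipline, here is a minimal Python sketch. It is not VM or image code: `layout_worker` and `run_layout` are hypothetical names, and an eager `copy.deepcopy` merely stands in for the isolation that copy-on-write would provide lazily.

```python
import copy
from concurrent.futures import ThreadPoolExecutor


def layout_worker(nodes):
    """Child thread: compute on a private copy and return a result object."""
    local = copy.deepcopy(nodes)      # stand-in for copy-on-write isolation
    for i, node in enumerate(local):
        node["x"] = i * 10            # toy "layout" computation
    return local


def run_layout(nodes):
    """Parent thread: start the child, then apply its result explicitly."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        result = pool.submit(layout_worker, nodes).result()
    # Shared state changes only here, in code written for the parent.
    for node, placed in zip(nodes, result):
        node["x"] = placed["x"]
    return nodes
```

The point of the shape is that the worker never touches the shared `nodes`; every change to shared state happens in the parent after the child has exited.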

Implementation might require an extra object header bit WriteCopied
and a new class CopyOnWriteThread known by the VM.

Whenever a child-COWT (CopyOnWriteThread) reads an object, if WriteCopied
is not set, set both it and the Immutable bit.  Any thread writing to that
object will trigger the existing immutability handler, which now
additionally, when WriteCopied is set:
  1. Creates a new WriteCopiedIndirection object holding
         a. the old object
         b. the newly written object
         c. the thread that performed the write.
  2. Sets the old object's forward-pointer to that.

Later the forward-following code observes the WriteCopied bit and, by
matching the current thread against the one stored in 1c, unrolls the
correct object.  The child-COWT leaves the WriteCopied bit set so as
to avoid immutability being set again.  Actually that precludes using
real immutability in a COWT, so my description above is not ideal -
but hopefully it is enough to show intent.
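
The steps above can be modelled in a few lines of Python. This is only a toy sketch under stated assumptions, not Spur code: `Obj`, `follow`, `cow_read` and `write` are hypothetical names, threads are identified by plain strings, and only one writer per object is modelled because a single forward-pointer is all the description provides.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class Obj:
    """Toy heap object: named slots plus the header state from the proposal."""
    slots: Dict[str, Any]
    immutable: bool = False       # existing Immutable bit
    write_copied: bool = False    # proposed WriteCopied bit
    forward: Optional["WriteCopiedIndirection"] = field(
        default=None, repr=False, compare=False)  # forward-pointer


@dataclass
class WriteCopiedIndirection:
    old: Obj      # 1a. the old object
    new: Obj      # 1b. the newly written object
    writer: str   # 1c. the thread that performed the write


def follow(obj: Obj, thread: str) -> Obj:
    """Forward-following: the writer sees its copy, everyone else the old object."""
    ind = obj.forward
    if obj.write_copied and ind is not None and ind.writer == thread:
        return ind.new
    return obj


def cow_read(obj: Obj, slot: str, thread: str) -> Any:
    """Read performed by a child-COWT: mark the object on first read."""
    if not obj.write_copied:
        obj.write_copied = True
        obj.immutable = True
    return follow(obj, thread).slots[slot]


def write(obj: Obj, slot: str, value: Any, thread: str) -> None:
    """Write by any thread, with the extended immutability handler."""
    target = follow(obj, thread)
    if not target.immutable:
        target.slots[slot] = value            # ordinary write
        return
    if not target.write_copied:
        raise TypeError("attempt to modify a genuinely immutable object")
    if target.forward is not None:
        raise NotImplementedError("sketch models one writer per object")
    copy = Obj(dict(target.slots))            # step 1: build the written copy
    copy.slots[slot] = value
    target.forward = WriteCopiedIndirection(  # step 2: set the forward-pointer
        old=target, new=copy, writer=thread)


if __name__ == "__main__":
    shared = Obj({"size": 3})
    cow_read(shared, "size", "child")     # child marks the object
    write(shared, "size", 4, "parent")    # parent's write is redirected
    print(cow_read(shared, "size", "child"), follow(shared, "parent").slots["size"])
```

In this model the child keeps seeing the value it first read while the parent sees its own write. It also makes the caveat above concrete: a write to an object that is immutable for real (Immutable set, WriteCopied clear) has to be told apart from a copy-on-write trigger.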

Could this avoid needing to make the whole image thread-safe, and thus
facilitate running Smalltalk worker threads across multiple CPUs?


Actually, this might even be useful without running across multiple
CPUs, since several intermittent Red-Screen-Of-Deaths were caused by
using multiple green-threads to improve UI interactivity.  (For
example, forking the update of Monticello lists to allow typing into search
boxes.  Here #drawOn: had "aList size" returning different values
halfway through the algorithm.  The point where execution forked to provide
this useful feature was the best place for code readability and intuitive
understanding, but it was a long path to trace through execution to
discover the coupling between there and the rendering code.  Certainly
the fix is less readable than the original.)
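
The failure mode is easy to reproduce in miniature. The Python sketch below is a toy model, not Morphic code: `draw_live`, `draw_snapshot` and the `other_thread_runs` callback (standing in for a point where another green-thread gets scheduled) are hypothetical. Taking a snapshot is one conventional workaround, not necessarily the fix that was applied; a COWT would get the same isolation without the drawing code having to ask for it.

```python
from typing import Callable, List


def draw_live(items: List[str], other_thread_runs: Callable[[], None]) -> List[str]:
    """Draw directly from the shared list; the size is read once up front."""
    count = len(items)
    drawn = []
    for i in range(count):
        other_thread_runs()        # another green-thread may mutate items here
        drawn.append(items[i])     # raises IndexError if the list has shrunk
    return drawn


def draw_snapshot(items: List[str], other_thread_runs: Callable[[], None]) -> List[str]:
    """Draw from a private copy taken before any other thread can run."""
    frozen = list(items)
    drawn = []
    for i in range(len(frozen)):
        other_thread_runs()        # mutations no longer affect this pass
        drawn.append(frozen[i])
    return drawn
```

With a callback that removes an element from the shared list on each call, `draw_live` fails partway through while `draw_snapshot` completes with the rows as they were when drawing began.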

[1] https://github.com/smarr/RoarVM
[2] http://squeakvm.org/~sig/hydravm/devnotes.html
[3] http://www.mirandabanda.org/cogblog/cog-projects/

cheers -ben

Re: [Vm-dev] Copy-on-write for a multithreaded VM

Eliot Miranda-2
Hi Ben,

On Thu, Jul 16, 2015 at 8:29 PM, Ben Coman <[hidden email]> wrote:

> I am curious what issues are holding us back from a multithreaded
> VM becoming mainstream, since I had a passing thought about a strategy
> for a partially multithreaded VM.

The issue is the image.  The class library, the tools, etc. are essentially single-threaded.  Making the VM multi-threaded is only a relatively small part of the job; rewriting the rest of the system to be thread-safe is a big job.  Hence my focus on simply providing a threaded VM that supports sharing the VM between threads so that any and all external calls are non-blocking.  This is a really useful first stage.  Maybe later economic activity around Pharo and/or Squeak will motivate and support work on a multi-threaded system, but that's on a further horizon than the kind of interconnectivity a smaller first step provides.  Further, Smalltalk's strong encapsulation makes it so good a basis for a distributed programming framework built from lots of interconnected images that this is also an attractive route: it yields value with far less effort than a fully multi-threaded system, and of course is more generally useful.

 

> I see multi-threading has been prototyped before with RoarVM [1] & HydraVM
> [2], and the CogBlog project page [3] says "While multi-threading
> seems like an obvious and important direction to take the system in,
> making the VM multi-threaded per-se **does not provide any benefit
> before the Smalltalk image is made thread-safe** and that is
> probably more work than providing a multi-threaded VM.  Hence a
> potentially more profitable approach is to concentrate on federating
> multiple VMs running multiple images, communicating through the
> threaded FFI."
>
> So I agree it would be a big job to make the whole Image thread-safe,
> but I wonder if a useful subset would be threads that are unable to
> disturb the system state? This might cover a lot of needs for
> parallelism, for example:
> * Complex force layouts with Roassal
> * Web server worker threads
> * Screen rendering by #fullDraw: & #drawOn:
>
> A copy-on-write facility might provide this, such that any required
> system state changes require manual coding in the parent-thread to
> process a result object returned from the child-thread when it exits.
> This might be implemented by combining Spur's new features for:
> * immutability
> * lazy become forwarding
>
> Implementation might require an extra object header bit WriteCopied
> and a new class CopyOnWriteThread known by the VM.
>
> Whenever a child-COWT (CopyOnWriteThread) reads an object, if WriteCopied
> is not set, set both it and the Immutable bit.  Any thread writing to that
> object will trigger the existing immutability handler, which now
> additionally, when WriteCopied is set:
>   1. Creates a new WriteCopiedIndirection object holding
>          a. the old object
>          b. the newly written object
>          c. the thread that performed the write.
>   2. Sets the old object's forward-pointer to that.
>
> Later the forward-following code observes the WriteCopied bit and, by
> matching the current thread against the one stored in 1c, unrolls the
> correct object.  The child-COWT leaves the WriteCopied bit set so as
> to avoid immutability being set again.  Actually that precludes using
> real immutability in a COWT, so my description above is not ideal -
> but hopefully it is enough to show intent.

Build it, they will come!
 

--
_,,,^..^,,,_
best, Eliot