[squeak-dev] Multi-core VMs (was: Suspending process fix)

Michael van der Gulik-2

On 4/29/09, Igor Stasenko <[hidden email]> wrote:

> As for moving to multi-cores.. yes, as Gulik suggests, its like adding
> a new dimension:
> - local scheduler for each core
> - single global scheduler for freezing everything
>
> This, of course, if we could afford running same object memory over
> multiple cores. Handling interpreter/object memory state(s) with
> multiple cores is not trivial thing.

Implementing it isn't hard. It's fixing all the bugs we'll find that's
hard. There'll be bugs in the image and in the VM, and it'll be a good
30 years before we've found them all.

*Most* parts of the VM will continue working fine. The parts that will
break... er... some of the parts that will break are:

* garbage collection.
* allocating memory for new objects.
* primitives and devices.
* pointer swapping *might* need to be atomic (become:, becomeForward:).
* Semaphore signalling.
* (more things???)

Most other things should work fine if we fire up a second interp() on
another pthread which shares the same object memory.

Writing to object slots (aka instance variables) should continue to
work fine provided that the write itself occurs in a single atomic
machine code instruction and that the object that the new reference
points to is already allocated and initialized.
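
A minimal C sketch of that condition, assuming C11 atomics and pthreads. The names `Obj`, `holder`, `fresh` and `publish_demo` are hypothetical and are not taken from the Squeak VM. On weakly ordered hardware a single atomic store is not quite enough on its own: the "already initialized" part has to be ordered before the slot write, which the release/acquire pair expresses here.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical object layout, not the real Squeak object header. */
typedef struct Obj {
    int value;                   /* stands in for the object's contents */
    _Atomic(struct Obj *) slot;  /* one "instance variable" */
} Obj;

static Obj holder;  /* object already visible to both threads */
static Obj fresh;   /* the newly created object */

static void *writer(void *arg) {
    (void)arg;
    fresh.value = 42;  /* finish initialising the new object first */
    /* Single atomic pointer store; release orders the init before it. */
    atomic_store_explicit(&holder.slot, &fresh, memory_order_release);
    return NULL;
}

static void *reader(void *arg) {
    int *seen = arg;
    Obj *p;
    /* Spin until the slot is filled; acquire pairs with the release. */
    while ((p = atomic_load_explicit(&holder.slot, memory_order_acquire)) == NULL)
        ;
    assert(p == &fresh);
    *seen = p->value;  /* guaranteed to observe the initialised value */
    return NULL;
}

/* Runs one writer and one reader; returns what the reader saw. */
int publish_demo(void) {
    pthread_t w, r;
    int seen = 0;
    atomic_store(&holder.slot, NULL);
    pthread_create(&r, NULL, reader, &seen);
    pthread_create(&w, NULL, writer, NULL);
    pthread_join(w, NULL);
    pthread_join(r, NULL);
    return seen;
}
```

Calling `publish_demo()` returns 42: the reader can never see the slot filled in before the object behind it is initialised.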

Creating new objects could be improved by having each scheduler have
its own eden space. But then the garbage collection becomes more
complex. The other alternative is to have a global image lock and
whoever whines about it can implement a better solution.
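
A rough sketch of the per-scheduler eden idea, assuming one pthread per scheduler and a simple bump allocator. `Eden`, `eden_alloc` and `eden_demo` are hypothetical names, and alignment, object headers and the scavenge itself are left out. The global-lock alternative would be a single shared `next_free` pointer guarded by one mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define EDEN_BYTES 32768
#define NUM_SCHEDULERS 4

/* One private allocation area per scheduler thread. */
typedef struct {
    unsigned char space[EDEN_BYTES];
    size_t next_free;  /* bump pointer, touched only by the owning thread */
} Eden;

static Eden edens[NUM_SCHEDULERS];

/* Bump allocation with no lock: only the owner ever calls this. */
static void *eden_alloc(Eden *eden, size_t nbytes) {
    assert(eden != NULL);
    if (nbytes > EDEN_BYTES - eden->next_free)
        return NULL;  /* a real VM would start a scavenge here */
    void *result = &eden->space[eden->next_free];
    eden->next_free += nbytes;
    return result;
}

static void *scheduler(void *arg) {
    Eden *mine = arg;
    for (int i = 0; i < 1000; i++)
        eden_alloc(mine, 16);  /* 1000 objects of 16 bytes each */
    return NULL;
}

/* Runs all schedulers and returns the total number of bytes allocated. */
size_t eden_demo(void) {
    pthread_t t[NUM_SCHEDULERS];
    size_t total = 0;
    for (int i = 0; i < NUM_SCHEDULERS; i++) {
        edens[i].next_free = 0;
        pthread_create(&t[i], NULL, scheduler, &edens[i]);
    }
    for (int i = 0; i < NUM_SCHEDULERS; i++) {
        pthread_join(t[i], NULL);
        total += edens[i].next_free;
    }
    return total;
}
```

`eden_demo()` returns 64000 (4 schedulers x 1000 objects x 16 bytes). The allocation path needs no synchronisation, but a collector now has to find and scan every eden, which is where the extra GC complexity comes from.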

I think that's it. The only important thing shared between different
OS threads (pthreads?) would be the object memory, and all you really
do with the object memory is write or read from object slots, create
new objects and run garbage collection.

Also, it shouldn't always be necessary to freeze the entire image. If
every scheduler has its own eden space and its own list of processes
that nothing else is allowed to modify, then there's no need to freeze
the whole VM for scheduler work. The only time the whole VM will need
to freeze is for garbage collection, and even that can be avoided by
being very intelligent with a new GC design.

Gulik.

--
http://gulik.pbwiki.com/

Joshua Gargus-2

Re: [squeak-dev] Multi-core VMs (was: Suspending process fix)

Michael van der Gulik wrote:

> On 4/29/09, Igor Stasenko [hidden email] wrote:
>
>> As for moving to multi-cores.. yes, as Gulik suggests, its like adding
>> a new dimension:
>> - local scheduler for each core
>> - single global scheduler for freezing everything
>>
>> This, of course, if we could afford running same object memory over
>> multiple cores. Handling interpreter/object memory state(s) with
>> multiple cores is not trivial thing.
>
> Implementing it isn't hard. It's fixing all the bugs we'll find that's
> hard. There'll be bugs in the image and in the VM, and it'll be a good
> 30 years before we've found them all.

LOL, did I miss a smiley? Doesn't sound trivial to me.

With great effort, I will avoid getting into another long thread about how a Hydra-like model is more suitable to the memory architectures of future multi-core processors (Nehalem is already splitting up the L2 between cores instead of making it uniformly accessible to all cores, and this trend will continue).

(BTW, there is an ongoing discussion on GDAlgorithms ([hidden email]) about task-parallel multithreading architectures to take full advantage of multi-cores. The subject is "General purpose task parallel threading approach". C++-centric, but still interesting; game programmers do know how to eke performance out of their hardware).

Cheers,
Josh

Steve Wart

Re: [squeak-dev] Multi-core VMs (was: Suspending process fix)

On Wed, Apr 29, 2009 at 5:13 PM, Joshua Gargus <[hidden email]> wrote:

> Michael van der Gulik wrote:
>> On 4/29/09, Igor Stasenko [hidden email] wrote:
>>> As for moving to multi-cores.. yes, as Gulik suggests, its like adding
>>> a new dimension:
>>> - local scheduler for each core
>>> - single global scheduler for freezing everything
>>>
>>> This, of course, if we could afford running same object memory over
>>> multiple cores. Handling interpreter/object memory state(s) with
>>> multiple cores is not trivial thing.
>>
>> Implementing it isn't hard. It's fixing all the bugs we'll find that's
>> hard. There'll be bugs in the image and in the VM, and it'll be a good
>> 30 years before we've found them all.
>
> LOL, did I miss a smiley? Doesn't sound trivial to me.
>
> With great effort, I will avoid getting into another long thread about how a Hydra-like model is more suitable to the memory architectures of future multi-core processors (Nehalem is already splitting up the L2 between cores instead of making it uniformly accessible to all cores, and this trend will continue).

I love this thread :)

I don't think there's a smiley missing. Hydra is wonderful because it provides concurrency in an elegant way, but the price of that elegance is that it completely ignores the IPC issues.

Threading is easy. Sharing is hard. Hopefully I'm not speaking out of school, but it seems that the semaphore discussions are independent of Hydra. It's not an alternative, but maybe a precondition.

Cheers,
Steve

Igor Stasenko

Re: [squeak-dev] Multi-core VMs (was: Suspending process fix)

In reply to this post by Joshua Gargus-2

2009/4/30 Joshua Gargus <[hidden email]>:

> Michael van der Gulik wrote:
>
> On 4/29/09, Igor Stasenko <[hidden email]> wrote:
>
>
>
> As for moving to multi-cores.. yes, as Gulik suggests, its like adding
> a new dimension:
> - local scheduler for each core
> - single global scheduler for freezing everything
>
> This, of course, if we could afford running same object memory over
> multiple cores. Handling interpreter/object memory state(s) with
> multiple cores is not trivial thing.
>
>
> Implementing it isn't hard. It's fixing all the bugs we'll find that's
> hard. There'll be bugs in the image and in the VM, and it'll be a good
> 30 years before we've found them all.
>
>
> LOL, did I miss a smiley? Doesn't sound trivial to me.
>
> With great effort, I will avoid getting into another long thread about how a
> Hydra-like model is more suitable to the memory architectures of future
> multi-core processors (Nehalem is already splitting up the L2 between cores
> instead of making it uniformly accessible to all cores, and this trend will
> continue).
>

I wouldn't say that implementing it would be hard.
If you have a clear vision of what it should do and how, then it's only
a question of how fast you type.
The 'hard' part is the tough decisions you have to make along the road, like:
- what to do with the lookup cache? If each thread holds its own cache,
then when you install a new method, how do you make sure that you flush
all of them atomically? And if you keep a single global cache, how do
you keep concurrent access to it from becoming the bottleneck of the
whole system?
I think that implementing a multicore VM will be waves of triumphs and
failures: when you think it's done, something will stab you in the
back. :)
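
To make the lookup cache question concrete, here is one possible answer, sketched in C11. It is an illustration only and was not proposed in this thread: each thread keeps its own cache, and a global epoch counter is bumped whenever a method is installed, so stale entries are refreshed lazily instead of being flushed everywhere at once. `Method`, `lookup`, `install_method` and the flat `method_table` are hypothetical stand-ins for the real method dictionaries.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define CACHE_SIZE 64

typedef int (*Method)(void);  /* hypothetical stand-in for a compiled method */

typedef struct {
    unsigned epoch;     /* global epoch at the time the entry was filled */
    unsigned selector;
    Method method;
} CacheEntry;

static atomic_uint global_epoch = 1;                /* bumped on every install */
static _Atomic(Method) method_table[CACHE_SIZE];    /* slow path: "method dictionary" */
static _Thread_local CacheEntry cache[CACHE_SIZE];  /* one cache per thread */

/* Publish the new method first, then invalidate every thread's cache
   with a single atomic increment. Epoch wrap-around is ignored here. */
void install_method(unsigned selector, Method m) {
    assert(m != NULL);
    atomic_store_explicit(&method_table[selector % CACHE_SIZE], m,
                          memory_order_relaxed);
    atomic_fetch_add_explicit(&global_epoch, 1, memory_order_release);
}

/* An entry is valid only if it was filled during the current epoch. */
Method lookup(unsigned selector) {
    unsigned now = atomic_load_explicit(&global_epoch, memory_order_acquire);
    CacheEntry *e = &cache[selector % CACHE_SIZE];
    if (e->epoch != now || e->selector != selector) {
        e->method = atomic_load_explicit(&method_table[selector % CACHE_SIZE],
                                         memory_order_relaxed);
        e->selector = selector;
        e->epoch = now;
    }
    return e->method;
}

static int old_impl(void) { return 1; }
static int new_impl(void) { return 2; }

/* Looks a method up, replaces it, looks it up again. */
int cache_demo(void) {
    install_method(7, old_impl);
    int before = lookup(7)();  /* fills this thread's cache */
    install_method(7, new_impl);
    int after = lookup(7)();   /* epoch changed, so the entry is refreshed */
    return before * 10 + after;
}
```

`cache_demo()` returns 12. The trade-off is that every install invalidates every cached entry in every thread, and a thread already executing the old method keeps running it, so this only moves the hard decisions around rather than removing them.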

> (BTW, there is an ongoing discussion on GDAlgorithms
> ([hidden email]) about task-parallel multithreading
> architectures to take full advantage of multi-cores. The subject is
> "General purpose task parallel threading approach". C++-centric, but still
> interesting; game programmers do know how to eke performance out of their
> hardware).
>
> Cheers,
> Josh
>
>
> *Most* parts of the VM will continue working fine. The parts that will
> break... er... some of the parts that will break are:
>
> * garbage collection.
> * allocating memory for new objects.
> * primitives and devices.
> * pointer swapping *might* need to be atomic (become:, becomeForward:).
> * Semaphore signalling.
> * (more things???)
>

There's always (more things???). It adds a new dimension to every
single algorithm that was invented for running in a non-concurrent
environment.

--
Best regards,
Igor Stasenko AKA sig.

Nicolas Cellier

Re: [squeak-dev] Multi-core VMs (was: Suspending process fix)

In reply to this post by Michael van der Gulik-2

2009/4/30 Michael van der Gulik <[hidden email]>:

> On 4/29/09, Igor Stasenko <[hidden email]> wrote:
>
>> As for moving to multi-cores.. yes, as Gulik suggests, its like adding
>> a new dimension:
>> - local scheduler for each core
>> - single global scheduler for freezing everything
>>
>> This, of course, if we could afford running same object memory over
>> multiple cores. Handling interpreter/object memory state(s) with
>> multiple cores is not trivial thing.
>
> Implementing it isn't hard. It's fixing all the bugs we'll find that's
> hard. There'll be bugs in the image and in the VM, and it'll be a good
> 30 years before we've found them all.
>
> *Most* parts of the VM will continue working fine. The parts that will
> break... er... some of the parts that will break are:
>
> * garbage collection.
> * allocating memory for new objects.
> * primitives and devices.
> * pointer swapping *might* need to be atomic (become:, becomeForward:).
> * Semaphore signalling.
> * (more things???)
>

I love the *might*

What happens for example if you change a class definition... Say add
or remove an instance slot.
What would happen if the becomeForward on the array of instances and
subinstances and method dictionaries were not atomic?

Don't forget that the Smalltalk environment relies on such in vivo
surgical operations.
You had better stop running while you or another thread is
replacing your own leg.

Of course, there is not a single solution. We could invent a world
where there is a compile phase and an execution phase...

Nicolas

Nicolas Cellier

Re: [squeak-dev] Multi-core VMs (was: Suspending process fix)

Anyway, no need for native threads to show how broken it can be...
It's not only a matter of operation atomicity.
Classes and methods must stay in sync, or bad things will happen.
Just try this with the current green thread model:

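"Define a class with three instance variables."
Object subclass: #AAA instanceVariableNames: 'a b c'.
"Compile a method that waits two seconds, then stores into c."
(Smalltalk at: #AAA) compile: 'test1
	(Delay forSeconds: 2) wait.
	c := 0' classified: 'test'.
"Fork at a higher priority so the method starts at once and blocks in the Delay."
[(Smalltalk at: #AAA) new test1] forkAt: Processor userInterruptPriority.
"Reshape the class while that process is still inside test1."
Object subclass: #AAA instanceVariableNames: 'c'.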

Sooner or later, your VM should crash because ClassBuilder did not
take care to analyze the Context stacks of concurrent Processes... Or
because the VM did not take care to lock the receiver of some Context...

Fortunately, a base image uses very few processes, apart from some
hard-to-modify/debug low-level operations that the average user had
better not touch (event fetching, finalization, ...).

Nicolas

Philippe Marschall

Re: [squeak-dev] Multi-core VMs (was: Suspending process fix)

2009/4/30 Nicolas Cellier <[hidden email]>:
> Anyway, no need for native threads to show how broken it can be...

There is a pretty easy way to hard crash (segfault/access violation)
most Smalltalk VMs (Squeak, VW, probably not GemStone):
- create a class with an instance variable
- make a block that accesses this instance variable
- remove the instance variable
- evaluate the block
-> BOOM!

Cheers
Philippe

Michael van der Gulik-2

Re: [squeak-dev] Multi-core VMs (was: Suspending process fix)

In reply to this post by Nicolas Cellier

(regarding hypothetical implementation of a multi-core capable Squeak VM)

On 4/30/09, Nicolas Cellier <[hidden email]> wrote:
> 2009/4/30 Michael van der Gulik <[hidden email]>:

>> *Most* parts of the VM will continue working fine. The parts that will
>> break... er... some of the parts that will break are:
>>
>> * garbage collection.
>> * allocating memory for new objects.
>> * primitives and devices.
>> * pointer swapping *might* need to be atomic (become:, becomeForward:).
>> * Semaphore signalling.
>> * (more things???)
>>
>
> I love the *might*
>
> What happens for example if you change a class definition... Say add
> or remove an instance slot.
> What would happen if the becomeForward on the array of instances and
> subinstances and method dictionaries were not atomic?

Umm... brain cogs turning... umm...

To retain the behaviour that Squeak has now, the safest, least buggy
and easiest way is to do the above operation with a lock on the entire
VM. Currently this is done in ClassBuilder>>update:to: by using
BlockContext>>valueUnpreemptively. Whole VM locks suck, but nobody has
the right to complain unless they've written a better solution.
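
At the VM level, a whole-VM lock could look roughly like the following C sketch, assuming pthreads. It is not how `valueUnpreemptively` is implemented; `vm_lock`, `class_slots` and `instance_slots` are hypothetical names for two pieces of state that must change together, like a class format and the layout of its instances.

```c
#include <assert.h>
#include <pthread.h>

/* One lock for the whole "VM"; hypothetical, not a real Squeak VM structure. */
static pthread_mutex_t vm_lock = PTHREAD_MUTEX_INITIALIZER;

/* Two values that must never be observed out of sync. */
static int class_slots = 3;
static int instance_slots = 3;

/* Stands in for an interpreter thread: counts inconsistent observations. */
static void *interpreter(void *arg) {
    long *mismatches = arg;
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&vm_lock);
        if (class_slots != instance_slots)
            (*mismatches)++;
        pthread_mutex_unlock(&vm_lock);
    }
    return NULL;
}

/* Stands in for ClassBuilder: reshapes the class and its instances together. */
static void *class_builder(void *arg) {
    (void)arg;
    for (int i = 0; i < 1000; i++) {
        pthread_mutex_lock(&vm_lock);
        class_slots = 1 + i % 3;
        instance_slots = class_slots;
        assert(class_slots == instance_slots);
        pthread_mutex_unlock(&vm_lock);
    }
    return NULL;
}

/* Runs two interpreters against one class builder; returns total mismatches. */
long freeze_demo(void) {
    pthread_t interp[2], builder;
    long mismatches[2] = {0, 0};
    for (int i = 0; i < 2; i++)
        pthread_create(&interp[i], NULL, interpreter, &mismatches[i]);
    pthread_create(&builder, NULL, class_builder, NULL);
    for (int i = 0; i < 2; i++)
        pthread_join(interp[i], NULL);
    pthread_join(builder, NULL);
    return mismatches[0] + mismatches[1];
}
```

`freeze_demo()` returns 0 because no interpreter can look at the two values while they are being changed. The cost is the one complained about above: every thread serialises on the same lock, even when it has nothing to do with the class being rebuilt.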

I'm not even going to bother trying to work out how this could be done
concurrently. There are so many possible bugs. See the loads of
comments in ClassBuilder>>update:to:. In fact, as Philippe Marschall
points out in this thread, there are even existing bugs related to
this - it's even commented as such in
ClassDescription>>updateInstancesFrom:.

Another option is to make classes and methods immutable, so that they
need to be copied when modifications are wanted. This is the approach
I've taken in my Namespaces implementation, but unfortunately you lose
a lot of the malleability of code that makes Smalltalk nice.

> Don't forget Smalltalk environment rely on such in vivo chirurgical
> operations.
> You should better stop running while you or another thread is
> replacing your own leg.

A heart surgeon went to a mechanic to get his motorbike serviced. The
mechanic started some small talk with the heart surgeon: "You know,
our jobs are quite similar. We both take a body, pull it to pieces,
fix or replace any broken parts and put it all back together again. So
how come you get paid so much more than I do?". The surgeon retorted:
"Try doing it with the motor running!".

Gulik.

--
http://gulik.pbwiki.com/