Multy-core CPUs

Igor Stasenko

Re: Multy-core CPUs

On 24/10/2007, Sebastian Sastre <[hidden email]> wrote:

> >
> > This has a perspective only if you have unlimited memory
> > resources and zero-cost memory allocation.
> > Let's look at this more precisely. I will write only in ST (I
> > don't know Erlang), and assuming that I understood your
> > concept well, take the following ST code:
> >
> > SomeClass>>setVars
> >     self setVar1: value1.
> >     self setVar2: value2.
> >     ...
> >     ^ self
> >
> > Here, at each message send, instead of writing to the receiver's
> > memory, we do copy-on-write cloning.
>
> But I don't understand why you want to do that if you can just make the
> process of the receiver update its own memory when it receives that message
> that 1 was already assigned and 2 no copy is needed at all.
>

Because a receiver is an object which is passed as a parameter to the
method. We can't modify it, that's why: we can't know where it was
passed from, and can't know whether it is used in another parallel
process, so the best we can do is cloning. And given the above, you
are forced to do the same for _any_ method, so you'll go on cloning
and cloning even when sending self setVar: ..
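
To make the cost being argued about concrete, here is a minimal sketch in Python rather than Smalltalk. It assumes the "never mutate the receiver" reading of the proposal; `SomeObject`, `set_var` and `clone_count` are illustrative names, not anything from Squeak or Erlang.

```python
import copy

class SomeObject:
    """Stand-in for the receiver in SomeClass>>setVars."""
    def __init__(self):
        self.var1 = None
        self.var2 = None

clone_count = 0  # how many copies the "no mutation" rule forced

def set_var(receiver, name, value):
    """Copy-on-write setter: leaves `receiver` alone, returns a modified clone."""
    global clone_count
    clone = copy.copy(receiver)
    setattr(clone, name, value)
    clone_count += 1
    return clone

def set_vars(receiver):
    # Each send yields self', self'', ... which then replaces the receiver.
    receiver = set_var(receiver, "var1", "value1")
    receiver = set_var(receiver, "var2", "value2")
    return receiver

original = SomeObject()
result = set_vars(original)
```

Two setter sends produce two clones, and the first clone is garbage as soon as the second exists; that per-send allocation is the overhead the argument is about.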

> Cheers,
>
> Sebastian
>
>
>
> > So, self setVar1: value1 will return us a modified copy,
> > self'. To keep things semantically correct, we then
> > substitute self in 'self setVar2: value2' with the copy
> > just received, and so on.
> > At the end, by returning self, we substitute it with self'''''''.
> > So each time we modify an object we get a modified copy
> > instead of modifying the original.
> > Now think about the costs: memory allocation and orders of
> > magnitude more garbage generated.
> > Now, even if we assume that each process has its own private
> > memory region, it still has to be located somewhere in
> > physical memory. And as you may know, physical memory is
> > shared among all cores, so your 'topmost' memory manager has
> > no choice but to use the locking you dislike so much to deal
> > with concurrent requests for resources.
> > And as you may see from this example, this model very quickly
> > reaches the point where the memory manager becomes the great
> > bottleneck, because of orders of magnitude higher memory consumption.
> >
> > And now consider the alternative: even with a dumb
> > lock-write-unlock we waste far fewer cycles,
> > because in your 'non-locking'
> > model the main load is just producing tons of garbage by
> > cloning objects over and over.
> >
> > --
> > Best regards,
> > Igor Stasenko AKA sig.
> >
>
>
>

--
Best regards,
Igor Stasenko AKA sig.

Jason Johnson-5

Re: Multy-core CPUs

In reply to this post by Igor Stasenko

On 10/24/07, Igor Stasenko <[hidden email]> wrote:
>
> This has a perspective only if you have unlimited memory resources
> and zero-cost memory allocation.

I don't understand.

> Let's look at this more precisely. I will write only in ST (I don't know
> Erlang), and assuming that I understood your concept well, take the
> following ST code:
>
> SomeClass>>setVars
>     self setVar1: value1.
>     self setVar2: value2.
>     ...
>     ^ self
>
> Here, at each message send, instead of writing to the receiver's memory,
> we do copy-on-write cloning.

No, this is exactly what we *do not* do. As I have mentioned several
times, I want message passing to be explicit. I had hoped the Erlang
code would be clear; since it wasn't, here is the same thing again in
proposed Smalltalk code:

SomeProcess>>run "arbitrary name, doesn't have to be run or anything like that"
    self breakupStructureWith: 10000.
    self buildNewStructureFrom: 10000.
    ^ structure

SomeProcess>>breakupStructureWith: aProcessCount
    | rest |
    rest := structure.
    1 to: aProcessCount do: [ :i |
        rest := self splitAndSend: rest ].

SomeProcess>>buildNewStructureFrom: aProcessCount
    1 to: aProcessCount do: [ :i | | data |
        data := self process receive.
        self addDataToStructure: data ].

Ok. Not optimal code in either case, but this Smalltalk code is the
equivalent of what the Erlang code above did. Note that in ST I'm not
passing the structure around or doing recursion because in ST I can
modify variables. Also note that in the Erlang example and this
example the actual send was not shown. The functions (split_and_send
and splitAndSend respectively) were not shown because I didn't want to
write a bunch of code breaking up some imaginary structure.

These are all just normal message sends. The only interprocess stuff
is the unshown send (I had planned to use the binary message #!) and
the receive method.
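
For readers who want something executable, the split/send/receive pattern above can be approximated in Python. This is only a sketch under assumptions: threads with `queue.Queue` mailboxes stand in for the proposed processes, the "structure" is a plain list whose length divides evenly, and each hypothetical `worker` just sums its chunk.

```python
import queue
import threading

def worker(inbox, reply_to):
    """Hypothetical worker process: receive one chunk, send back its sum."""
    chunk = inbox.get()
    reply_to.put(sum(chunk))

def run(structure, process_count):
    main_inbox = queue.Queue()  # mailbox of the coordinating process
    threads = []
    # breakupStructureWith: split off one chunk per worker and send it.
    size = len(structure) // process_count
    for i in range(process_count):
        chunk = structure[i * size:(i + 1) * size]  # slicing copies the data
        inbox = queue.Queue()
        t = threading.Thread(target=worker, args=(inbox, main_inbox))
        t.start()
        inbox.put(chunk)  # the explicit send
        threads.append(t)
    # buildNewStructureFrom: one receive per worker; local state is mutated freely.
    new_structure = []
    for _ in range(process_count):
        new_structure.append(main_inbox.get())
    for t in threads:
        t.join()
    return sorted(new_structure)  # arrival order is not deterministic

result = run(list(range(100)), 4)
```

Inside `run`, `new_structure` is mutated in the ordinary way; only the `put`/`get` calls cross a process boundary.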

> So each time we modify an object we get a modified copy instead of
> modifying the original.

Why? Inside a given process I don't see a reason to disallow regular
mutability.

Sebastian Sastre-2

RE: Multy-core CPUs

In reply to this post by Igor Stasenko

> > > we do copy-on-write cloning.
> >
> > But I don't understand why you want to do that if you can just make
> > the process of the receiver update its own memory when it receives
> > that message that 1 was already assigned and 2 no copy is needed
> > at all.
> >
> Because a receiver is an object which is passed as a parameter to the
> method. We can't modify it, that's why: we can't know where it was
> passed from, and can't know whether it is used in another parallel
> process, so the best we can do is cloning. And given the above, you
> are forced to do the same for _any_ method, so you'll go on cloning
> and cloning even when sending self setVar: ..
>

Well, I see your point. But let me clarify that I am talking about
something that does not suffer from that problem at all.

If we can set that problem aside for a minute, I can try to show you what
I see. So here I go:

I don't know if you saw the reference I cited to Alan Kay's OOPSLA 97
presentation. It is about an hour and a half or less in duration, and it's
reachable on YouTube.

At some point there, Alan mentions that someone once said every host must
be able to have a valid IP on the internet, and he states that every object
should be able to have a valid IP. At first that statement is shocking
because it is too radical. And we don't have the resources for that yet.
The point is that no matter how radical it is, it is still a great idea.

I saw it as just scaling the paradigm of message passing between objects
from image level to internet level (which is of course massive).

We have no technology to make use of something like that, and maybe it is
not important to try to do it today nor in the next five years. But that
unhappy fact, forced by today's lack of resources, does not make the idea
any less good.

So we can decide to prepare ourselves for the moment when hardware and
industry bring that reality closer. Could be in the next 10? 20? 30 years?
Nobody knows, but we all know that this industry moves very fast and we
are inventing the future.

Now I'm trying to think <metaphor>with the same software Alan used to
state that phrase</metaphor> but in another domain: the domain of
processes. And with something that is closer to us today: multicore
technology.

So I'm stating here that in a Smalltalk image of the future *every object
should have a process*. Every instance. All of them.

We could also interpret the phenomenon of seeing the Erlang VM
successfully managing 100k processes messaging each other as a mere proof
of concept, encouraging us that hardware is starting to become a *less
bad* creature than in the past. That way we can start to think that
hardware is improving to the point at which we can take more seriously
making it manage a number of processes of the same order of magnitude as
the number of instances in a Smalltalk image. This makes it feasible to
map processes 1:1 to instances.

That said, I return to the problem you stated about the need to copy,
copy, copy: this premise changes things and you don't need to copy
anymore, because in a VM like that, no matter who wants to modify an
instVar of an object or when, the VM guarantees that the write will be
made by the process that corresponds to that instance.

This idea redefines what we understand today as anObject by *coupling*
it to aProcess. In this hypothetical Smalltalk, anObject can't live
without a process. They are indissociable.

So.. in this hypothetical Smalltalk:

- we can suppose that every object lives in a process
- we can suppose that nobody but the very owner of an instVar can
  write that instVar
- we can suppose that no other process but the one of that instance
  will write that piece of RAM
- we can suppose that everything is an object
- we can suppose that all the instance-processes can be freely
  balanced across cores

Besides:
- we have no need to pollute the syntax or Smalltalk's rules
- we are not introducing singularities into the paradigm
- we are consuming more resources, but in compensation we gain
  unprecedented scalability
- we are keeping the heuristics, completeness and simplicity that
  define Smalltalk
- we are making a step forward in anthropomorphism that will
  keep Smalltalk concepts familiar to people
- last but not least: we take advantage of multicore CPUs transparently

Take a minute to think about its consequences. This supposition, or in
more explicit words, this assumption of anObject being indissociable
from aProcess, objects being 1:1 with processes, makes all the
difference and dramatically simplifies everything. And we do know that
simplification improves scalability.
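
A toy version of "every object has a process" can be sketched in Python. Everything here is an assumption made for illustration: `ActiveObject`, `tell` and `Counter` are invented names, and one OS thread per instance is far heavier than the lightweight processes this idea would actually need.

```python
import queue
import threading

class ActiveObject:
    """Toy 'object = process': one thread and one mailbox per instance."""
    def __init__(self):
        self._mailbox = queue.Queue()
        self._thread = threading.Thread(target=self._loop)
        self._thread.start()

    def tell(self, selector, *args):
        # Asynchronous send: enqueue the message and return immediately.
        self._mailbox.put((selector, args))

    def stop(self):
        self._mailbox.put(None)  # sentinel: finish after pending messages
        self._thread.join()

    def _loop(self):
        while True:
            message = self._mailbox.get()
            if message is None:
                break
            selector, args = message
            getattr(self, selector)(*args)

class Counter(ActiveObject):
    def __init__(self):
        self.count = 0
        self.writers = set()  # idents of the threads that wrote `count`
        super().__init__()

    def increment(self):
        # Only ever runs on this instance's own thread.
        self.count += 1
        self.writers.add(threading.get_ident())

def send_many(target, times):
    for _ in range(times):
        target.tell("increment")

counter = Counter()
senders = [threading.Thread(target=send_many, args=(counter, 100)) for _ in range(4)]
for s in senders:
    s.start()
for s in senders:
    s.join()
counter.stop()
```

Four senders run concurrently, yet `count` is only written by the counter's own thread, so no increment is lost and user code takes no lock (the mailbox itself is synchronized internally).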

I'm, of course, being extremely speculative in exploring this idea
together with you all. But thinking is cheap :). In fact I don't even buy
it myself yet. The problem is that I'm honestly unable to refute the
convenience of this path myself :) so it becomes stronger.

I hope you and others understand why I'm starting to think that this is a
powerful idea.

all the best,

Sebastian Sastre
PS: Sorry for the length. I tried to express this in my previous post. I'm
trying to be didactic and illustrative.


Igor Stasenko

Re: Multy-core CPUs

In reply to this post by Jason Johnson-5

On 24/10/2007, Jason Johnson <[hidden email]> wrote:

> ...
> Why? Inside a given process I don't see a reason to disallow regular
> mutability.
>

Aha, now I get it. So your approach is to establish a fence between
different processes so they can't share objects. Or maybe it is more
correct to say that any callee process can have read-only access to
any objects which belong to the caller process?

It's unclear how you would determine which process an object belongs
to. At a minimum this would require an additional slot per object
(OK, this is easily doable).

Also, it's unclear how you would persist state (or the results of a
computation), since all you can do now is send a message to a process,
which will return an object in answer. Now, since the returned object
will most probably belong to the callee process, you must copy it to
the caller process. But in reality you have to copy a whole subgraph of
objects (since you can return a collection of newly created objects,
which in their turn can be collections, etc., all belonging to the
callee process). Then, after you are done merging the graph, you can
simply wipe all memory which was allocated by the callee process. This
part is easy.

Now, the most interesting part: mutating objects in the caller process.
Suppose my starting process calls two different processes,
and they come to the point where they want to update the state of
some object(s) in the caller process (to be clear: process A contains
object a, it calls processes B and C in parallel, and now B and C
want to change the state of object a).
This could be done by detecting that the active process tries to
write to an object which does not belong to the current process, so you
could transform this attempt into an implicit message sent to the
caller process, like:
callerProcess setInstVarOf: object index: x value: y
or
callerProcess setIndexVarOf: object index: x value: y
(the number of cases is not very interesting here)

Now, if that _is_ allowed, we have a race condition when two or
more processes try to update the state of the same object(s) in the
parent process. And there is no way other than !! locking !! semantics
to solve this. Or maybe I'm wrong here? ;)

If this is not allowed, then you must go one of two ways:
- do copy-on-write at any attempt to update a 'foreign' object. This
raises a new problem: how to merge the copied and modified state of the
object with the previous one? How to propagate these changes to other
processes? (If you don't propagate them, you are actually breaking the
semantics.)

- generate an exception on a write attempt. Basta. This option makes
your implementation diverge from any current implementation of
Smalltalk, and it's impossible to adapt old code to a new VM with such
limitations.
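
The last option (an exception on a foreign write, with reads still allowed) can be sketched in Python as follows. This is an illustration only: `OwnedObject` and `ForeignWriteError` are made-up names, a thread stands in for a process, and the extra `_owner` field plays the role of the additional per-object slot mentioned above.

```python
import threading

class ForeignWriteError(Exception):
    """Raised when a process writes to an object it does not own."""

class OwnedObject:
    def __init__(self, **slots):
        # The creating thread becomes the owner (the extra per-object slot).
        object.__setattr__(self, "_owner", threading.get_ident())
        for name, value in slots.items():
            object.__setattr__(self, name, value)

    def __setattr__(self, name, value):
        if threading.get_ident() != self._owner:
            raise ForeignWriteError(f"{name} written from a foreign process")
        object.__setattr__(self, name, value)

a = OwnedObject(state=0)
a.state = 1  # the owner may write

errors = []
seen = []

def foreign_writer():
    seen.append(a.state)  # reading is allowed
    try:
        a.state = 99      # writing is not
    except ForeignWriteError as exc:
        errors.append(exc)

t = threading.Thread(target=foreign_writer)
t.start()
t.join()
```

Existing code that freely mutates objects reachable from several processes would hit this exception, which is the compatibility concern raised above.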

--
Best regards,
Igor Stasenko AKA sig.

Jason Johnson-5

Re: Re[4]: Multy-core CPUs

In reply to this post by Herbert König

Ah! I misread your email. :)

On 10/24/07, Herbert König <[hidden email]> wrote:

> Hello Jason,
>
>
> JJ> Ok, what can I do to make myself clearer and easier to understand?
>
> JJ> On 10/24/07, Herbert König <[hidden email]> wrote:
> >> Hello Peter,
> >>
>
> nothing on your side, I wish Peter would use proper quoting in his
> html mails. In my mailer (the bat) I can't distinguish his argument
> from his quote of your argument. He also sends text mails with proper
> quoting.
>
>
>
> Cheers,
>
> Herbert mailto:[hidden email]
>
>
>

Igor Stasenko

Re: Multy-core CPUs

In reply to this post by Sebastian Sastre-2

Sebastian, you can envision that any unique object in the VM is a unique
process. No changes to the VM are required. Your concept has zero worth
for me, because the VM already guarantees that each object has its own
encapsulated state, and you can change an object's state only by sending
messages to it.
So we might already say that all objects are alive and can be
represented as processes which are triggered by sending message(s) to
them.

On 24/10/2007, Sebastian Sastre <[hidden email]> wrote:

> ...

--
Best regards,
Igor Stasenko AKA sig.

Jason Johnson-5

Re: Multy-core CPUs

In reply to this post by Igor Stasenko

On 10/24/07, Igor Stasenko <[hidden email]> wrote:
>
> Aha, now i get it.

Good, I should have just posted some theoretical Smalltalk code to
begin with, this thread would probably be half as big. :)

> So your approach is to establish a fence between
> different processes so they can't share objects. Or maybe it is more
> correct to say that any callee process can have read-only access to
> any objects which belong to the caller process?

No, the plan was that since in Smalltalk objects are mutable, I will
have to pay an extra cost for internal message sends and have the VM
do a deep copy for the sent objects.

Another alternative would be to introduce an immutable flag on
references, then the "receiver" gets a reference to the object but
flagged immutable. This way might be better, but requires more
changes.

> It's unclear how you would determine which process an object belongs
> to. At a minimum this would require an additional slot per object
> (OK, this is easily doable).

Not needed. The boundaries are: object creation, object send and
object receive. All controlled from the VM or in the library, so I
just have to guarantee that a mutable object can never sneak out of
its process, i.e. a process can never get a mutable reference to an
object owned by a different process.

> Also, it's unclear how you would persist state (or the results of a
> computation), since all you can do now is send a message to a process,
> which will return an object in answer.

No, all message sends are async, fire-and-forget. You don't get a
future or anything back. You can easily build sync messages on top of
this if you want, but the base system isn't planned to directly
support it.

> Now, since the returned object will most probably
> belong to the callee process, you must copy it to the caller process.
> But in reality you have to copy a whole subgraph of objects (since
> you can return a collection of newly created objects, which in their
> turn can be collections, etc., all belonging to the callee
> process). Then, after you are done merging the graph, you can simply
> wipe all memory which was allocated by the callee process. This part
> is easy.

Ah, if you're talking about the receive call, yes that will get an
object returned. My first cut will just be (as mentioned above) a
deep copy. Performance will likely drive me to adding immutable
references, or objects or something. Or perhaps what you're
suggesting here.

> Now, the most interesting part: mutating objects in the caller process.
> Suppose my starting process calls two different processes,
> and they come to the point where they want to update the state of
> some object(s) in the caller process (to be clear: process A contains
> object a, it calls processes B and C in parallel, and now B and C
> want to change the state of object a).

Can't happen. There is no shared state. Ever. The only thing that's
shared is the Process' mail box, but that is an internal VM detail,
not visible to the processes.

(remaining comments snipped since I think they assume shared state
which does not exist in my plan. If you have some reason you think I
can't avoid it let me know because I don't see it so far).

Jason Johnson-5

Re: Multy-core CPUs

On 10/24/07, Jason Johnson <[hidden email]> wrote:
>
> No, the plan was that since in Smalltalk objects are mutable, I will
> have to pay an extra cost for internal message sends and have the VM
> do a deep copy for the sent objects.

Ack, terminology overload. :) What I meant here is, obviously if I
send a message between two literal images there is no choice but to do
a deep copy. Erlang gains some benefit from sending interprocess
messages where the sender and receiver are in the same literal image
via reference, but I can't because Smalltalk can mutate variables. So
this means I have to do the deep copy in *every* case. Unless I make
some changes.
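
As a rough illustration of "deep copy on every send", here is a Python sketch. The `Mailbox` class is hypothetical and only models the copying rule, not the proposed VM support.

```python
import copy
import queue

class Mailbox:
    """Hypothetical process mailbox that deep-copies every sent object."""
    def __init__(self):
        self._queue = queue.Queue()

    def send(self, obj):
        # Fire-and-forget: the receiver gets its own private copy.
        self._queue.put(copy.deepcopy(obj))

    def receive(self):
        return self._queue.get()

box = Mailbox()
message = {"items": [1, 2, 3]}
box.send(message)
message["items"].append(4)  # the sender keeps mutating its own object
received = box.receive()
```

The receiver sees the state at send time and shares nothing with the sender, at the price of copying the whole reachable graph for each message.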

dpharris

RE: Multy-core CPUs

In reply to this post by Sebastian Sastre-2

Quoting Sebastian Sastre <[hidden email]>:

> ...
> Take a minute to think about its consequences. This supposition, or in
> more explicit words, this assumption of anObject being indissociable
> from aProcess, objects being 1:1 with processes, makes all the
> difference and dramatically simplifies everything. And we do know that
> simplification improves scalability.
>
> I'm, of course, being extremely speculative in exploring this idea
> together with you all. But thinking is cheap :). In fact I don't even buy
> it myself yet. The problem is that I'm honestly unable to refute the
> convenience of this path myself :) so it becomes stronger.
>
> I hope you and others understand why I'm starting to think that this is a
> powerful idea.
>
> all the best,
>
> Sebastian Sastre
> PS: Sorry for the length. I tried to express this in my previous post. I'm
> trying to be didactic and illustrative.

I was always impressed in Self by an idea of the implementers. They chose a
structure that is implicitly inefficient in terms of implementation, namely that
every object has its own named slots.

However, the smart idea was that, behind the scenes they used implementation
'tricks' to mitigate the efficiency hit, while maintaining the model at the top
level. For example, they defined "maps to transparently group objects cloned
from the same prototype, providing data type information and eliminating the
apparent space overhead for prototype-based systems."
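
The idea in that quote can be illustrated with a small Python sketch. It is a simplification for illustration, not how the Self VM is implemented; `Map` and `Obj` are invented names. Objects cloned from the same prototype share one layout object and store only their own values.

```python
class Map:
    """Shared layout: slot name -> index, one per prototype family."""
    def __init__(self, slot_names):
        self.index = {name: i for i, name in enumerate(slot_names)}

class Obj:
    __slots__ = ("map", "values")

    def __init__(self, map_, values):
        self.map = map_              # shared with every clone
        self.values = list(values)   # private per object

    def get(self, name):
        return self.values[self.map.index[name]]

    def set(self, name, value):
        self.values[self.map.index[name]] = value

    def clone(self):
        # Cloning copies the values but reuses the same map.
        return Obj(self.map, self.values)

proto = Obj(Map(["x", "y"]), [0, 0])
p1 = proto.clone()
p1.set("x", 3)
p2 = proto.clone()
```

At the language level every object still appears to carry its own named slots; the sharing is invisible to the programmer.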

Similarly, we could accept the object-process model, and then explore ways to
make this efficient behind the scenes. It seems to me that on a uniprocessor,
the 'behind the scenes' would look much like Smalltalk today. But as multi-core
and distributed computing become more prevalent we would gain the benefit.
Certainly we could explore the realities and consequences of this model.

David

Igor Stasenko

Re: Multy-core CPUs

In reply to this post by Jason Johnson-5

On 24/10/2007, Jason Johnson <[hidden email]> wrote:

> ...
> Can't happen. There is no shared state. Ever. The only thing that's
> shared is the Process' mail box, but that is an internal VM detail,
> not visible to the processes.
>
> (remaining comments snipped since I think they assume shared state
> which does not exist in my plan. If you have some reason you think I
> can't avoid it let me know because I don't see it so far).
>

No sharing, you say? Oh.. don't get me started on this. How about the
common procedure of creating a new class? Creating or modifying a
method in some class?
Note that in the current implementation these changes propagate
globally, due to having a single global namespace (SystemDictionary).
How are you planning to deal with that without breaking the uniform
model of Smalltalk (everything is an object, etc.)?

--
Best regards,
Igor Stasenko AKA sig.

Igor Stasenko

Re: Multy-core CPUs

In reply to this post by Jason Johnson-5

On 24/10/2007, Jason Johnson <[hidden email]> wrote:

Sorry for the spurious replies.. but.. this statement means that your
processes are not as cheap as Erlang ones. By passing a single object
to a new process you could trigger the cloning of a substantial part of
the image in this case (read: megabytes of data). Or, even if not
cloning, then marking objects as read-only, or creating 'hollow'
references to objects, which also has its own costs: extra space and
access time.
Even to spawn a process which does no more than add 1+1, I need
to copy/mark the SmallInteger class and all its references, until I
have marked everything reachable from it..
Honestly, I can't see how this concept can be considered cheap and scalable.

--
Best regards,
Igor Stasenko AKA sig.

pwl

Re: Multy-core CPUs, ERLANG

In reply to this post by Jason Johnson-5

Jason Johnson wrote:

> PWL wrote:
>
> > The erlangification (erlangization, or erlangisation) of Smalltalk may
> > be a radical enough transformation that it's no longer Smalltalk. If
> > that's the way of Squeak that's fine however it seems that a fork is
> > likely the result (and yes, the pun of forking was intended).
>
> No, what I envision will act just like Smalltalk does today with the
> singular exception that you won't have need of #critical:, Semaphore or
> any of that stuff anymore.  And since most people don't use it, it
> doesn't look that painful to me.
>
> All the interprocess communication is just going to be accomplished as
> message sends like everything else.

Hi,

Ok, then how does this magic actually work? Rather than bringing Erlang or other systems into your description please just describe how you see it working in Smalltalk. In detail please.

Peter

pwl

Re: Multy-core CPUs, ERLANG

In reply to this post by Jason Johnson-5

Jason Johnson wrote:

> On 10/23/07, Peter William Lount [hidden email] wrote:
>
> > Of course one could also implement a copy-on-write-bit for objects in
> > the "read-only-shared-top-level-object-space-of-the-image". In order to
> > accomplish any work a process must be forked! Also, this way any process
> > that forks off will need to copy all of the objects it modifies into
> > its own private object-space until the process commits its changes
> > into the top level object-space or until it aborts.
>
> Once again I have no idea what you're talking about.  I guess you're
> not responding to me with this, since the system I'm talking about
> would not commit any changes back to a top level process.
>
> > Concurrency isn't like automatic garbage collection - which is actually
> > quite broad and complex a field - at all.
>
> *sigh*.  Ok, if you're going to respond to things I say, please read
> what I write.  Speed reading obviously isn't working.  I said message
> passing is *ANALOGOUS*.
>
> analogous
>
> adjective
> 1. 	similar or equivalent in some respects though otherwise
> dissimilar; "brains and computers are often considered analogous";
> "salmon roe is marketed as analogous to caviar"
>
> Manual memory management is hard to do and does not scale or compose
> well as explained in the email I originally linked to.
>
> Shared state fine grained locking is hard to do and does not scale or
> compose well as explained in the email I originally linked to.

Hi,

Yes I read what you said. I simply don't think they are analogous.

Certainly the parallels that you see between them are not clear from your analogy since this reader didn't get it.

Many things don't scale well, or don't compose well in computer science. It doesn't mean that they are all analogous.

Now I've not yet had a chance to read the PDF pointed to by Jon Hylands but it seems to me that they are more dissimilar than similar.

Peter

pwl

Re: Multy-core CPUs

In reply to this post by Herbert König

Herbert König wrote:

> Hello Jason,
>
> JJ> Ok, what can I do to make myself clearer and easier to understand?
>
> JJ> On 10/24/07, Herbert König [hidden email] wrote:
>
> Hello Peter,
>
> nothing on your side, I wish Peter would use proper quoting in his
> html mails. In my mailer (the bat) I can't distinguish his argument
> from his quote of your argument. He also sends text mails with proper
> quotings.
>
> Cheers,
>
> Herbert                            [hidden email]

Hi,

I'm using Thunderbird for emails. I use nested quoting and everything looks good to me. Obviously your email program isn't working the same way. Which program do you use? What is "the bat"?

Peter

pwl

Re: Multy-core CPUs

In reply to this post by Jason Johnson-5

Jason Johnson wrote:

> On 10/24/07, Jason Johnson [hidden email] wrote:
>
> > No, the plan was that since in Smalltalk objects are mutable, I will
> > have to pay an extra cost for internal message sends and have the VM
> > do a deep copy for the sent objects.
>
> Ack, terminology overload. :)  What I meant here is, obviously if I
> sent a message between two literal images there is no choice but to do
> a deep copy.  Erlang gains some benefit from sending interprocess
> messages where the sender and receiver are in the same literal image
> via reference, but I can't because Smalltalk can mutate variables.  So
> this means I have to do the deep copy in *every* case.  Unless I make
> some changes.

Hi,

No, you'd not have to deep copy every time you send the messages. You can send references and when accessing them in the remote image (or image B if you prefer) you can ask the local image (or image A if you prefer) to send the missing data. Now this assumes that the objects in image A didn't change in the meantime. Yikes. Problems are getting worse. You can't avoid them. There is no silver bullet with this attempt at simplifying concurrency. It's a harsh reality.

Cheers,

Peter

Sebastian Sastre-2

RE: Multy-core CPUs

In reply to this post by Igor Stasenko

> -----Original Message-----
> From: [hidden email]
> [mailto:[hidden email]] On
> behalf of Igor Stasenko
> Sent: Wednesday, 24 October 2007 17:32
> To: The general-purpose Squeak developers list
> Subject: Re: Multy-core CPUs
>
> Sebastian, you can envision that any unique object in VM is a
> unique process. No changes required to VM. Your concept
> having zero worth for me, because VM already supports that
> each objects have own encapsulated state, and you can change
> object's state only by sending messages to it.
> So, we already might say, that all objects are living and can
> be represented as a processes which triggered by sending
> message(s) to them.
>

Well, it seems to me that you are very near to getting the idea. You have
described what we have now. You seem to be missing the process part. I'm
saying that an object has a double nature. It is an object as we know it
today, but it lives not statically (like in a photograph) but in a process.
And I'm stating that this relation exists by nature. It's the result of
coupling the concept of object with the concept of process. Now we can
cleverly clamp the process part of it and somehow invent support that maps
this process part of the object's nature onto some process running on some
core of this incoming hardware.

I think it was with Spoon (I can't recall well now) that someone in this
group made a visual memory map of a Smalltalk image with its instances. Just
to get an idea, try to visualize an instance. It is supported somewhere in
RAM, right? Metaphorically it has one "foot" planted firmly in RAM. Currently
any VM process can tell it to modify itself, and it writes in its piece
of RAM. OK, now the idea I'm exploring and trying to communicate here is
that objects should (metaphorically) have "two feet": one in RAM and the
other in a process running on some core. The VM *will guarantee*
that every instance has its piece of RAM as usual, and a process that is
guaranteed to be the only one writing in that piece of RAM: the process
that *is* the process part of the double nature of the instance.

As someone already rightly stated: no shared memory, no concurrency
problems.

Now, what I think is a good question to answer regarding this
hypothetical Smalltalk is: how do we translate this conceptual model to a
bit-based model to make it fit our current (incoming) hardware?

I bet that the VM will have to be modified.

As you rightly said, the current VM guarantees that the state of an object
can only be changed by sending it some message. What it does not
guarantee right now is which process of that VM will be the one that
sends the instructions to do the write. So right now it could be any VM
process. And <metaphor> that's how we have purchased the concurrency
problem </metaphor> we are trying to solve now.

But once again: what I'm talking about here is a VM that guarantees which
process writes that piece of RAM: the only one that is assigned to that piece
of RAM belonging to that instance.

Why did we "purchase" that concurrency problem? Because of a trade-off
prioritizing pragmatism with the resources available at that time. We
polluted the conceptual domain with implementation matters (the need to make
N processes share the same space of RAM for reading and writing) to be able
to get what we get today with the hardware we have today. But maybe ideas
like this, using better hardware, can depollute it.

See: mathematics is nothing but a tool to model. All models are rudimentary
simplifications of some system. Making a limited model of reality with
boolean algebra may be theoretically possible but is extremely inhuman.
Programming in assembler is a palliative for walking that path with less
"brain damage".

Smalltalk is also a tool to model. The difference resides in its nature: it
is heuristic by design. It's intellectually ergonomic. It shows respect for
the way in which we humans form thoughts and concepts, mature old ones by
refining and explore new ones by prototyping, and relegates to a secondary
place how machines need (for these couple of decades?) things to be done so
that they behave as we need.

So Smalltalk pays the price of being less efficient (than C, etc.) to bring
you the freedom of mapping what you see in reality directly to a computer. It
frees you from having to map it to mathematics and then reify it later even
more polluted (one order of magnitude of distortion in the modeling).

That way you can model while keeping the machinery madness to a minimum, so
you gain the chance to make a less polluted virtual model by mapping the model
your brain quickly makes 1:1 onto the virtual model computers need. It breaks
the trend of modeling things in boolean, mathematical, or even relational
terms. It gives you a tool that can take your concept from timidly fragile and
embryonic, through maturing, to rock solid and ready for production: that tool
is the virtual object, an instance.

Current hardware is based on mathematics. Boolean. So we had no other option
in that trade-off than to take the boolean path to be able to use hardware
to make a Smalltalk, opening computers to a bigger space of solutions that is
closer to humans.

Maybe hardware is reaching a point at which, excuse my French, it sucks
less. And we can take a step back from that old, absolutely
understandable trade-off and regain the conceptual refinement we always
needed in this system. Something that maybe was seen back then, or maybe
is being seen now because it's time to reach a new degree of subtlety of
thought in this clever artifice we know as Smalltalk.

Returning to planet Earth now. In the idea I'm exploring here, the VM of
this hypothetical Smalltalk will also guarantee that the process
that sends instructions to write in memory is the one that belongs to the
instance that belongs to that process [1]. That way you never get
inconsistent states or concurrent writes there, because you never shared
anything [2].

That VM should create the processes of the instances in such a way that it
does not matter on which core each one is running (so it can be balanced),
nor which part of RAM it has been assigned. Once assigned, that RAM will be
for it and only it. It will be written by that process and only by that
process.
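
The single-writer idea described above can be approximated inside today's
image, without any VM change. What follows is only a minimal workspace
sketch under stated assumptions, not the proposed VM mechanism: the names
mailbox, state, owner and done are made up for illustration, and the
guarantee that only the owner process touches state holds by convention
here, not by VM enforcement. SharedQueue and Semaphore are the standard
Squeak classes.

```smalltalk
| mailbox state owner done |
mailbox := SharedQueue new.        "hypothetical mailbox for one object"
done := Semaphore new.
state := OrderedCollection new.    "written only by the owner process below"

"The owner is the only process that ever writes to state.
 It drains the mailbox until it receives the #stop marker."
owner := [
    | msg |
    [ (msg := mailbox next) == #stop ]
        whileFalse: [ state add: msg ].
    done signal ] fork.

"Any other process changes state only by enqueueing a request."
#(1 2 3) do: [ :each | mailbox nextPut: each ].
mailbox nextPut: #stop.

"Wait for the owner to finish; state should now hold 1, 2 and 3,
 all added by the owner process."
done wait.
state
```

Because every request goes through the queue, writes are serialized in
arrival order and no lock around state is needed in the calling code. A
real implementation would have to enforce this in the VM, as the post
argues.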

I care about getting this message across, so I ask you kindly: do you see
value now?

cheers,

Sebastian

[1] does the process belong to the instance, or the instance to the process?
A Moebius thing here? It's reasonable, because I have somehow fused the
concepts.
[2] the brain analogy makes sense because you don't share your brain's
memory. You "serialize" your thoughts to written text or spoken words, but
you never ever share RAM which, at the hardware level, is your synapses.

Jason Johnson-5

Re: Multy-core CPUs

In reply to this post by pwl

On 10/25/07, Peter William Lount <[hidden email]> wrote:
>
> Hi,
>
> No, you'd not have to deep copy every time you send the messages. You can
> send references and when accessing them in the remote image (or image B if
> you prefer) you can ask the local image (or image A if you prefer) to send
> the missing data.

Or I can just not do that, do the deep copy and not have the problems
mentioned in the rest of your mail. Again you are talking about
something that *I'm not* and then explaining why *your approach* is
hard to do.
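
For readers who want to see the deep-copy-on-send idea concretely, here is
a minimal workspace sketch. It is an illustration under assumptions, not
the planned VM implementation: a SharedQueue stands in for the process
mailbox, Squeak's veryDeepCopy stands in for the VM-level copy, and the
sender and receiver run in the same process to keep the example short.

```smalltalk
| mailbox original received |
mailbox := SharedQueue new.            "stand-in for a process mailbox"
original := OrderedCollection new.
original add: 'hello' copy.

"Sender side: enqueue a deep copy, never the original graph."
mailbox nextPut: original veryDeepCopy.

"Receiver side: mutate the received graph freely."
received := mailbox next.
received first at: 1 put: $j.
received add: 'extra'.

"The sender's graph is untouched by the receiver's changes."
original size.      "still 1"
original first      "still 'hello'"
```

Note that veryDeepCopy is used rather than deepCopy, because the copy has
to cover the whole reachable graph for the no-sharing property to hold.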

> Now this assumes that the objects in image A didn't change
> in the meantime. Yikes. Problems are getting worse. You can't avoid them.
> There is no silver bullet with this attempt at simplifying concurrency. It's
> a harsh reality.
>
> Cheers,
>
> Peter

The insight that Bell labs had with Unix over the mainframe makers was
that *we don't need a silver bullet*. We need to get 90% and provide
some way that the small % of people that need more can use.

Jason Johnson-5

Re: Multy-core CPUs

In reply to this post by Igor Stasenko

On 10/24/07, Igor Stasenko <[hidden email]> wrote:
>
> No sharing, you say? Oh, don't get me started on this. What about the
> common procedure of creating a new class? Or creating or modifying a
> method in some class?

Ok, when I say "no sharing" I mean of mutable data, which is where all
the problems you and everyone else have mentioned come from.

I have mentioned several times in this thread that classes and code
*are a concern*. But I think a solvable one. Doing things to classes
(e.g. creating, renaming, adding methods, removing methods, changing
methods) are a special case now, and they will be a special case in my
proposed system as well. Note that Erlang has this same issue and
they solved it in their case 10 years ago. I think this is a solvable
problem.

> Note that these changes propagate globally in the current implementation
> because there is a single global namespace (SystemDictionary).
> How do you plan to deal with that without breaking the uniform model of
> Smalltalk (everything is an object, etc.)?

As mentioned in probably no fewer than 10 other emails: I think the
"ObsoleteClass" mechanism will be workable for this.

Jason Johnson-5

Re: Multy-core CPUs

In reply to this post by Igor Stasenko

On 10/24/07, Igor Stasenko <[hidden email]> wrote:
>
> Sorry for the spurious replies, but this statement means that your
> processes are not as cheap as Erlang ones.

Why not? At this point in time (afaik) they do the exact same thing
in all but one case (communication between 2 "green" processes in the
same image).

> By passing a single object
> to a new process you could trigger the cloning of a substantial part of
> the image (read: megabytes of data).

Yes, you can. But this isn't the common case.

> Or, even if not cloning, then
> marking objects as read-only, or creating 'hollow' references to
> objects, which also has its own costs: extra space and access time.

Well, I have no plans of doing "futures". Others are well down that
path, so no need for me to duplicate their research. As far as
immutable references, all class have a header with various flags, I
would only need one more for mutability. Such a change does scare me
a bit because the VM would have to be changed so that all instVar sets do
an extra check, which would impact non-concurrent code as well, but
it could turn out to be a good trade-off vs. doing a deep copy every
time.
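
The cost of that extra check can be pictured at the image level. The
sketch below is purely illustrative and assumes a made-up class,
FreezableBox, whose frozen instance variable plays the role of the
proposed header flag; in the actual proposal the flag would live in the
object header and the VM would test it on every instVar store. The
methods are shown in the usual Class>>selector notation and would be
entered through the browser.

```smalltalk
Object subclass: #FreezableBox
    instanceVariableNames: 'value frozen'
    classVariableNames: ''
    poolDictionaries: ''
    category: 'Sketch-Immutability'.

FreezableBox>>freeze
    "Mark the receiver as immutable (stand-in for setting the header flag)."
    frozen := true

FreezableBox>>value
    ^ value

FreezableBox>>value: anObject
    "This test is the extra work every store would pay, frozen or not."
    frozen == true
        ifTrue: [ ^ self error: 'attempt to modify an immutable object' ].
    value := anObject

"Workspace usage:"
| box |
box := FreezableBox new.
box value: 3.
box freeze.
box value: 4.    "signals an Error; box value remains 3"
```

A frozen object could then be handed to another process by reference,
which is the case where the deep copy would be avoided.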

> Even to spawn a process which does no more than add 1+1, I need
> to copy/mark the SmallInteger class and all its references, until I have
> marked everything reachable from it.

Bad example. :) SmallIntegers aren't mutable and aren't heap objects (they
are immediate values).

But I understand what you mean, the required traversals do sound
expensive, but this has to be done *every time* when you send between
two images anyway. Lets not optimize prematurely. :)

> Honestly, I can't see how this concept can be considered cheap and scalable.

A system with 9 9's of reliability comes to mind. :)

http://www.cincomsmalltalk.com/userblogs/ralph/blogView?entry=3364027251

But really, as soon as you talk between two systems, no other approach
is better. A "futures" concept can lazily load the data, saving time
when parts of a structure aren't used, but what are the numbers on
this? How often do you send a message with a % of unused data high
enough to offset the complexity cost of the "futures" mechanism?

Yes, interprocess communication between two same-image processes would
be a disadvantage, but I'm pretty sure Erlang started this way and
optimized from there. That's my plan as well.

After all, we don't know how expensive this would actually be in
practice anyway. It's easy to come up with theoretical examples that
cripple the system, but will anyone actually do this? And if they do,
it will break and they can code around it.

Back before Unix, people were trying to figure out ways to ensure
resource deadlocks cannot happen. This theoretical problem
effectively crippled them from releasing a system. Unix simply
ignored it. They gave you tools to see you had a deadlock, and a way
to kill deadlocked processes. Seemed to work out for them. :)

Jason Johnson-5

Re: Multy-core CPUs

On 10/25/07, Jason Johnson <[hidden email]> wrote:
>
> As far as
> immutable references, all class have a header with various flags

Ugh, my proof reading is failing me. Here, of course I meant
"objects" not "class(es)"
