
Multy-core CPUs

Jason Johnson-5

Re: What about "Erlanging" the smalltalk interstitial space? (was RE: Multy-core CPUs)

On 10/23/07, Jason Johnson <[hidden email]> wrote:

>
> > Sebastian Sastre
> > PS: I've tried to imagine if this saves us from having to make code thread
> > safe or not. I was unable to refute this by myself, so I also ask kindly
> > that the most experienced and critical minds collaborate in this regard.
>
> Us as people who use Smalltalk? Yes I believe it does. It doesn't
> make it impossible to make a design that has deadlocks, but imo the
> big win is that these concerns move to design time instead of
> implementation time.

Ah, and forgot to mention: If you mean "us" as in people who write
the VM, then no. The VM will get more complex to deal with this stuff
and have to take steps to ensure message operations are atomic.

Jason Johnson-5

Re: Multy-core CPUs

In reply to this post by Igor Stasenko

On 10/22/07, Igor Stasenko <[hidden email]> wrote:
>
> If there's nothing else that can replace this model,
> then you have no choice but to use the above.

The *VM* will have to, but no one using the Smalltalk system would.

> Again, a question raised: how to ensure that messages are passed in
> correct order and make sure that messages are delivered?

Message delivery is a guarantee of the system, but order is absolutely
not guaranteed.

> Now let's look inside: to make it work properly, you need to
> implement a message queue. And a queue means that you must make the
> 'enqueue' and 'dequeue' operations synchronized.
> And that's exactly what I mean: even if you hide the concurrency
> problems from the eyes of the developer, this does not mean that the problems
> are gone: now you have to deal with them on your own.

.... I don't get your objection. Again, the VM abstracts *lots* of
tough details away from us so that we *never* have to think about it.
Yes, in the current OS/Hardware options of course *the VM* will have
to do some synchronization on message "mailboxes", but so what? It
does memory management for us now and saves us a great burden.

> If you know another way to make a proper message passing scheme
> without using a synchronized object (such as a queue), I am all ears.

There are things out there, but who cares? This is a low level detail
the VM can hide from us just fine. Ask any Erlang programmer how
often they worry about synchronization issues with their message
mailboxes.
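A minimal sketch of that division of labour, in Python rather than Smalltalk or Erlang. The `Actor` class and its `send`/`stop` methods are hypothetical names invented for this illustration; the standard-library `queue.Queue` plays the role of the VM-provided mailbox, so the synchronization lives inside the queue and the code that uses the actor never touches a lock. As noted above, delivery is guaranteed here but arrival order across senders is not.

```python
import queue
import threading

_STOP = object()  # private sentinel that tells an actor to shut down


class Actor:
    """A tiny actor: one thread, one mailbox, no locks in user code."""

    def __init__(self, handler):
        self._mailbox = queue.Queue()  # the only synchronized structure
        self._handler = handler
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def send(self, message):
        """Asynchronous send: enqueue the message and return immediately."""
        self._mailbox.put(message)

    def stop(self):
        """Ask the actor to finish after handling everything sent so far."""
        self._mailbox.put(_STOP)
        self._thread.join()

    def _run(self):
        while True:
            message = self._mailbox.get()
            if message is _STOP:
                return
            self._handler(message)  # runs only on the actor's own thread


def run_mailbox_demo(senders=100):
    """Have many threads send to one actor; return what the actor saw."""
    seen = []
    actor = Actor(seen.append)
    threads = [
        threading.Thread(target=lambda n=n: actor.send(n)) for n in range(senders)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    actor.stop()
    return seen
```

Every message arrives exactly once, although `run_mailbox_demo()` may return the numbers in any order.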

Jason Johnson-5

Re: Multy-core CPUs

In reply to this post by Igor Stasenko

On 10/22/07, Igor Stasenko <[hidden email]> wrote:
>
> Yes, a simple example when i need to have correct order:
> Collection>>do:

Well, first of all, do you want #do: to be parallel? I would personally
prefer to have a #parallelDo: for that. Second of all, the point of
having the multiple processes is that we can run these things in
parallel in different threads and different processes. How do you
suggest controlling execution order in that scenario?

If you need work done in parallel and then need the results sorted
back to original order you have to do it in multiple steps, e.g.

1) collect the "things" into a collection of associations
that have the "thing" as key and its position as the value,
2) run the #parallelDo: on this new collection,
3) take the results and collect them back into the correct order using the value

or you could do something like this: http://bc.tech.coop/blog/070520.html
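The three steps above can be sketched in Python as follows. This is only an illustration under assumptions: `parallel_collect` is a made-up name, and a thread pool from the standard library stands in for whatever a #parallelDo: would dispatch to.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def parallel_collect(items, work, max_workers=4):
    """Apply `work` to every item in parallel, returning results in input order."""
    # Step 1: pair every item with its original position.
    tagged = list(enumerate(items))
    results = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Step 2: run the work in parallel; completion order is arbitrary.
        futures = {pool.submit(work, item): position for position, item in tagged}
        for future in as_completed(futures):
            results.append((futures[future], future.result()))
    # Step 3: restore the original order using the saved position.
    results.sort(key=lambda pair: pair[0])
    return [value for _, value in results]
```

For example, `parallel_collect([3, 1, 2], lambda x: x * x)` gives the squares in the order of the input, no matter which worker finishes first.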

> So, we need at least 2 messages to reflect a different behaviour:
> #do:
> and
> #orderedDo:
>
> and that's only the simplest case...

How do you envision this working? It's no better using
shared-state/fine-grained locking unless you are modifying the
collection in place, which #do: and co. are not doing now.

Jason Johnson-5

Re: Multy-core CPUs

In reply to this post by Igor Stasenko

On 10/22/07, Igor Stasenko <[hidden email]> wrote:
>
> Well, maybe it's not a proper example; I just wanted to show that we
> will need changes to the codebase (not only the VM) to better support
> parallelism.

Well, personally I'm not trying to add transparent parallelism (Erlang
doesn't try this either). I want inter-process communication to be
completely explicit, just easy. I hadn't planned on adding anything
to the core libraries. A #parallelDo: wouldn't know where to send the
work anyway if you don't tell it.

Jason Johnson-5

Re: Multy-core CPUs

In reply to this post by Igor Stasenko

On 10/22/07, Igor Stasenko <[hidden email]> wrote:
>
> An 'erlangenization' of sends means that we need to deal differently with
> contexts. I think the best way to do this is to rethink a context to make
> it look closer to what a process is in Erlang.
> Yes, we must pay the price of making all contexts real objects for
> each message send, so we might expect a real slow-down of single
> thread execution.
> Then the only way we could regain this loss is to use highly
> parallelisable algorithms.

Aha! Ok, the confusion is indeed coming from us talking about two
different things.

My suggestion: Add true (explicit!) concurrency to Squeak by way of
async "Actor" style messages (like what Erlang has).

What you seem to think I'm suggesting: Making Squeak message send
transparently inter-process.

But this is exactly what I *don't* want. In my experience, trying to
abstract these different concepts into one thing just makes code that's
impossible to reason about. I want my inter-process communication
doable and easy, but as explicit as I can make it.

Sebastian Sastre-2

RE: What about "Erlanging" the smalltalk interstitial space? (was RE: Multy-core CPUs)

In reply to this post by Jason Johnson-5

> -----Original Message-----
> From: [hidden email] [mailto:[hidden email]]
> On behalf of Jason Johnson
> Sent: Tuesday, October 23, 2007 13:58
> To: The general-purpose Squeak developers list
> Subject: Re: What about "Erlanging" the smalltalk interstitial
> space? (was RE: Multy-core CPUs)
>
> On 10/22/07, Sebastian Sastre <[hidden email]> wrote:
> >
> > 1.4 I'm interpreting the normal thousands of processes Erlang
> > manages as analogous (so similar in function) to one of these tides, and
> > asking why we cannot make tides of Smalltalk message sends in an image?
>
> We can. In fact, I think we could do it right now with the
> existing VM and some packages that have been written already.
>
> > 3. If I understood well, Erlang's main strength is not that it
> > has functions to do things but that it has a really great message
> > passing technique that was designed to take advantage of parallelism
> > in a simple, efficient way while making processes very cheap
> > (cheaper than the OS ones).
>
> Erlang processes are much cheaper than OS processes *and* OS threads.
> But then, so are Smalltalk's. The only difference is that
> Erlang processes are encapsulated entities that have no
> shared memory [1], while Smalltalk's do.
>
> > The interstitial space of virtual objects, AKA message sends,
> > can be "Erlanged" by making each message send be in a "process"
> > (the cheap ones) of its own, like Erlang messages between "processes"?
>
> Well, keep in mind, Erlang is a functional language and code
> written in it uses functions. Message sends are for
> communicating between processes.
>
> > Same in other words:
> >
> > What consequences will affect us if we make Squeak use a
> > VM that passes messages using processes a la Erlang (that are simple,
> > cheap, efficient and scalable in a world of "cores")?
>
> If you mean every Smalltalk message send is what an Erlang
> message send is, then the results would be devastating. As I
> mentioned above, Erlang does its work with functions. In
> Smalltalk, the equivalent method of doing work is what
> Smalltalk calls "messages". In Erlang there is a concept of
> sending messages between processes, and I would do the same
> for Smalltalk.
>

Erlang does not have objects, so I don't think one paradigm can be mapped 1:1
onto the other in either direction. Comparing them literally will be noisy at
a minimum. That's why I'm using, as the Borg do :), the word "assimilate", meant
to be parsed as "to take from its conceptual essence its virtues while
discarding its vices".

> > Can this allow us to assimilate into Smalltalk's object
> > paradigm Erlang's process paradigm? That is: will this allow us to
> > gain those parallelizing benefits while sparing us from changing
> > the programming paradigm?
>
> We can, and it would [2]. But I think we should, at least at
> first, make inter-process message sends very obviously
> different from inter-object message sends. It would be
> possible, for example, that objects of type "Process" have a
> different way of handling messages so

But that will introduce a singularity in the paradigm. I'm afraid that
accepting that is too much. Can you find a way to achieve the goal of your
proposal without devastating the "all is an object" premise?

> that:
>
> (Processes at: 'bank account') addUSD: 5000
>
> Is actually an inter-process send, but I would still want to use the !
> syntax, or something equivalent so it's completely obvious
> that we are doing something different.
>
And accepting singularities like that is how a language gets its syntax
polluted, and developers have to compensate for that incompleteness by having to
remember (and model) in their brains N more rules. The worst part, of course, is
not the syntax but the damage to the paradigm. That amounts to accepting the
policy of unloading work from the machines to load it onto humans. As I see
things, humans are not here for that and machines are not here for that. Dear
Jason, I'm in the "opposite corner of the ring" on that policy.

> Inter-process message sends have their own lookup complexity
> that I think should be separate from the inter-object message
> sends we have now. For example, in Erlang if you send a
> message to a process that happens to be in the same image, a
> simple reference copy happens (no danger since variables are
> immutable). The other two cases would be:
> a different OS/native thread in the same image, and a totally
> different image (same computer or on the network).
>
> Now if what you're talking about is basically promoting every
> object to its own process (using the terms I have described
> so far), then I haven't really given this much thought. This
> would be a totally different paradigm and area of research
> (maybe like CORBA?). Though I'm sure someone somewhere has
> done research on it (or is currently).
> :)
>

Mmmm, no. I mean that every message send should have a process a la Erlang. Of
course this will only optimize, across the other cores, the message sends that
are parallelizable (discerning which ones are is a question that deserves
thought). Maybe it is just a modest improvement to take advantage of
multicore, but it never has any intention of disrupting the paradigm.

> > Sebastian Sastre
> > PS: I've tried to imagine if this saves us from having to make code
> > thread safe or not. I was unable to refute this by myself, so I also
> > ask kindly that the most experienced and critical minds collaborate
> > in this regard.
>
> Us as people who use Smalltalk? Yes I believe it does. It
> doesn't make it impossible to make a design that has
> deadlocks, but imo the big win is that these concerns move to
> design time instead of implementation time.
>
> [1] This is enforced by the language. As far as I know, the
> processes actually share the same heap, etc.. They just
> don't know it and can't take advantage of it :)
>
> [2] Well, I believe so anyway, and aim to find out. The
> issue is that Erlang has it easy: you *can't* share data at a
> language level in Erlang. In Smalltalk this gets a little
> tricky, mainly due to one of Smalltalk's greatest strengths:
> Classes are live objects (with state and so on). I believe
> this can be overcome, while preserving the Smalltalk
> semantics, but I can't prove it yet. :)
>

Well... To be honest, I interpret belief as being the user of a system of
thought, which is of course different from a fact or a model that has enough
proofs of concept to deserve investment (time, effort, $$, energy, etc.).
But I think I see your point. I also think that there is no solution without
tradeoffs, and I'm not willing to disrupt the paradigm. To win me over
(and probably others), show a more complete model that works first ;)

All the best,

Sebastian Sastre

Jason Johnson-5

Re: Multy-core CPUs

In reply to this post by Igor Stasenko

On 10/22/07, Igor Stasenko <[hidden email]> wrote:
>
> but this burdens our parallel processes with scheduling.
> I would like, instead, to be able to run a number of parallel branches
> for the same process (to schedule a process instead of each of these
> branches).

Scheduling doesn't have to be a problem if done e.g. event driven [1].
This is one of the optimizations I planned to have as advantage over
the Erlang implementation.

[1] In an event driven scheduler you look at what the process
did, and demote or promote it in priority based on this.
You end up just touching two processes per switch, but processes that
quickly give up the CPU (e.g. a process that just sends messages to
have work done) get the CPU any time they want it.
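One possible reading of that footnote, sketched in Python as a small multilevel feedback queue. Everything here is an assumption made for illustration: the `Scheduler` class, the three priority levels, and the `"yielded"` / `"preempted"` event names are invented, and generators stand in for processes. It is not the design the post had in mind, only a way to see promotion and demotion at work.

```python
from collections import deque

LEVELS = 3  # level 0 is the highest priority


class Scheduler:
    """Toy scheduler that re-prioritises a task based on its last event."""

    def __init__(self):
        self.ready = [deque() for _ in range(LEVELS)]

    def spawn(self, name, task, level=1):
        self.ready[level].append((name, task, level))

    def run(self):
        """Run until every task finishes; return the (name, level) trace."""
        trace = []
        while any(self.ready):
            # Always pick from the highest-priority non-empty queue.
            current = next(q for q in self.ready if q)
            name, task, level = current.popleft()
            trace.append((name, level))
            try:
                event = next(task)  # the task runs until it reports an event
            except StopIteration:
                continue  # task finished, drop it
            if event == "yielded":  # gave up the CPU early: promote
                level = max(0, level - 1)
            elif event == "preempted":  # used its whole time slice: demote
                level = min(LEVELS - 1, level + 1)
            self.ready[level].append((name, task, level))
        return trace


def messenger():
    """A process that only sends messages and gives the CPU straight back."""
    for _ in range(3):
        yield "yielded"


def cruncher():
    """A process that always uses up its whole time slice."""
    for _ in range(3):
        yield "preempted"
```

Spawning one `messenger()` and one `cruncher()` at the same level shows the first climbing to level 0 and staying there while the second sinks to the lowest level.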

Jason Johnson-5

Re: Multy-core CPUs

In reply to this post by pwl

On 10/23/07, Peter William Lount <[hidden email]> wrote:
>
> Even the magical Erlang way of concurrency won't solve real world issues
> such as multiple processes contending for limited hardware resources.
> These need synchronization. No one answered Igor's point on this.

But they do deal with it: points of contention like this get their
own process. When you open a file in Erlang a process is started to
manage it. All reads and writes go through this process so you can
have as many processes doing these read/writes as you want.
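A rough Python analogue of the "one process owns the file" idea. The names `file_owner`, `write_line` and `run_file_demo` are hypothetical, a thread stands in for an Erlang process, and an in-memory `io.StringIO` stands in for the real file. The point is only that the writers never lock anything: they send requests, and the single owner serializes them.

```python
import io
import queue
import threading


def file_owner(stream, requests):
    """Sole owner of `stream`: serve requests one at a time until told to stop."""
    while True:
        request = requests.get()
        if request is None:  # shutdown signal
            return
        operation, payload, reply = request
        if operation == "write":
            stream.write(payload)
            reply.put(len(payload))
        elif operation == "read":
            reply.put(stream.getvalue())


def write_line(requests, text):
    """Client helper: send a write request and wait for the acknowledgement."""
    reply = queue.Queue(maxsize=1)
    requests.put(("write", text, reply))
    return reply.get()


def run_file_demo(writers=8):
    """Let several writers share one 'file' through its owner."""
    stream = io.StringIO()  # stands in for a real file handle
    requests = queue.Queue()
    owner = threading.Thread(target=file_owner, args=(stream, requests))
    owner.start()
    clients = [
        threading.Thread(target=write_line, args=(requests, f"line {n}\n"))
        for n in range(writers)
    ]
    for client in clients:
        client.start()
    for client in clients:
        client.join()
    reply = queue.Queue(maxsize=1)
    requests.put(("read", None, reply))
    contents = reply.get()
    requests.put(None)
    owner.join()
    return contents
```

Each line comes out intact because only the owner ever writes; the order of the lines depends on which request reached the owner first.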

Sebastian Sastre-2

RE: Multy-core CPUs, ERLANG

In reply to this post by pwl

I don't have access to it but maybe someone has this paper?:
http://portal.acm.org/citation.cfm?id=38844&coll=portal&dl=ACM

I wonder if there is some light there

cheers,

Sebastian Sastre

> -----Original Message-----
> From: [hidden email] [mailto:[hidden email]]
> On behalf of Peter William Lount
> Sent: Tuesday, October 23, 2007 12:09
> To: The general-purpose Squeak developers list
> Subject: Re: Multy-core CPUs, ERLANG
>
> Hi,
>
> Continued.
>
> Of course one could also implement a copy-on-write-bit for
> objects in the
> "read-only-shared-top-level-object-space-of-the-image". In
> order to accomplish any work a process must be forked! Also,
> this way any process that forks off will need to copy all of
> the objects it modifies into its own private object-space
> until the process commits its changes into the top level
> object-space or until it aborts. That's assuming a Software
> Transactional Memory scheme is added to Smalltalk.
>
> Actually this idea is quite appealing if done right.
>
> Of course there are a host of other awesomely complex
> problems implied by the above that a simple concurrency model
> will NOT solve.
>
> Concurrency isn't like automatic garbage collection - which
> is actually quite broad and complex a field - at all. The
> sets of problems with concurrent systems are way more
> complex. This is especially the case when you bring
> distribution beyond a single compute node into the fold and
> especially when other issues such as distributed garbage
> collection are required. Welcome to the complex world of
> tomorrow today.
>
> What do the actual vendor's staff who write the virtual
> machines think about this?
>
> All the best,
>
> Peter William Lount
>

Jason Johnson-5

Re: Multy-core CPUs

In reply to this post by pwl

On 10/23/07, Peter William Lount <[hidden email]> wrote:

>
> The principle is that anytime you have more than one thread or process
> working on the same memory space, or object space, you WILL have concurrency
> issues (unless your code is just running very simple concurrency). The point
> is that in order to implement your
> utopia-vision-of-simple-problem-free-concurrency
> (utopia-concurrencia for lack of a better name) in Smalltalk you MUST
> isolate the objects to ONLY ONE thread of possible alteration of their state
> otherwise you end up with the possibility of many classes of concurrency
> problems.

Yes, this is mostly true. The insight with Erlang is that they don't
actually have to be in a different memory space, it just has to be
impossible at the language level for one process to get a reference to
an object of another process *and modify it*.

> Shared memory problems exist even within one protected memory
> space and not just between them. To isolate the objects involved in a
> process you can have a separate object space which contains the objects that
> will be operated on. This is the Erlang way, isn't it?

Kind of. The Erlang approach works so well for them because variables
can't be changed. Once you create a variable it is frozen in that
form. Other processes *can* look at it because no change can happen
from either thread.

Obviously more care will have to be taken in Smalltalk as the objects
can always be changed.

> The thing about
> Erlang, unless I'm mistaken (and if I am mistaken I'd expect to be
> corrected), is that the objects in a process are only visible to that
> process until the results are returned. The objects that pass in and out of
> an Erlang process are only primitive data types and not complex objects.

The last sentence is incorrect. The message can be any complexity,
including sending functions, file handles, whatever.

> However for Smalltalk you'd need to pass in complex object graphs of
> arbitrary size and connectedness to be general purpose. This then results in
> a version problem.

Only if what you pass can be modified by either side.

> For example, <snip>
> Now
> you've got a problem that the magical erlang message passing won't solve.

Problem: your example is using shared data and updating of variables.
In the message passing paradigm *there is no shared data*. Period.
None. In Erlang specifically there isn't updating of variables even
within a process. So this would be done in Erlang something like
this:

    some_process(DataStructure) ->
        break_up_structure(DataStructure, 10000),
        get_new_structure({}, 10000).  % return result of get_new_structure

    break_up_structure(_, 0) -> done;  % base case, no processes left
    break_up_structure(DataStructure, Processes) ->  % otherwise
        RestOfDataStructure = split_and_send(DataStructure),  % cut off a piece and send
        break_up_structure(RestOfDataStructure, Processes - 1).  % tail call with new values

    get_new_structure(DataStructure, 0) -> DataStructure;  % base case, return what we built
    get_new_structure(DataStructure, Processes) ->
        Data = receive Result -> Result end,  % wait for the next reply from any worker
        NewDataStructure = add_data_to_structure(Data, DataStructure),
        get_new_structure(NewDataStructure, Processes - 1).

The fact that variables are immutable is dealt with in the normal
functional programming way of using tail recursion and passing any
variables that need "updating" as arguments.

In case the above code isn't clear: The process breaks up the parts of
the data structure and farms them out to the different processes, then
waits for responses and incrementally assembles them into the new data
structure.
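The same split, farm out, and reassemble flow as a Python sketch, for readers who don't read Erlang. Threads and `queue.Queue` objects stand in for Erlang processes and mailboxes, and doubling each value is a placeholder for the real work; the function names mirror the ones in the post, but the bodies are assumptions.

```python
import queue
import threading


def worker(inbox, outbox):
    """Receive one piece, process it, and send the result back."""
    piece = inbox.get()
    outbox.put([value * 2 for value in piece])  # stand-in for real work


def break_up_structure(data, processes, outbox):
    """Cut `data` into `processes` pieces and send each to its own worker."""
    size = -(-len(data) // processes)  # ceiling division
    for index in range(processes):
        inbox = queue.Queue()
        threading.Thread(target=worker, args=(inbox, outbox), daemon=True).start()
        inbox.put(data[index * size:(index + 1) * size])


def get_new_structure(processes, outbox):
    """Wait for one reply per worker and fold each into the new structure."""
    new_structure = []
    for _ in range(processes):
        new_structure.extend(outbox.get())  # replies arrive in any order
    return new_structure


def some_process(data, processes=4):
    """Farm the data out, then assemble the replies into a new structure."""
    outbox = queue.Queue()
    break_up_structure(data, processes, outbox)
    return get_new_structure(processes, outbox)
```

No worker ever sees another worker's piece, so nothing needs a lock; the only shared points are the queues themselves.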

Now, the issue here is obviously: This only makes sense when the
processing of the data that was carved out is more expensive than the
carving out and reattaching. If the structure is very large that may
well not be the case.

In that case I'm not sure how I would handle it, but I look at it like
any other performance issue: I would try algorithm changes before I
looked at going to a lower level.

> Now someone mentioned Software Transactional Memory (STM) so briefly that
> it would be easy to miss. Is that your solution?

No, if someone else wants to look at this it's ok. I'm a bit
concerned about the book keeping.

> If so you still have other
> concurrency issues, object versioning issues, plus more to deal with. No
> solution is a panacea for all problems unless you are an advocate of silver
> bullet solutions.

There is no such thing, but just as a generational garbage collector
is "good enough" in all but the most special cases, I believe message
passing will be "good enough" as well.

> The problem of editing a large graph of objects with many parallel threads
> is the generalized case of a nasty and complex set of concurrency and
> transactional issues. There are many ways to solve this. If you reply to
> this example I would hope that you do so fully explaining how you'd handle
> the concurrency and - importantly - the object consistency issues.

Transactional and concurrency issues arise because you are sharing
something. If you give one entity alone access to that something and
all access must go through him these issues go away. They are traded
for new issues, but issues that are much easier to reason about.

> Yes, I understand that early tests indicate that Erlang can handle
> approximately 100,000 or so processes at a time without hiccups while Java
> can handle about 8,000 or so before blowing up.

Nowhere near 8,000. At least not on any box I've ever seen (or do
you have a reference?). The problem is Java's just too fat, on a
32-bit operating system you run out of memory well before 8k processes
or threads.

> I don't know what the
> various Smalltalks can handle, but I doubt it's as high as Erlang and is
> more likely less than even Java - just a guess though. Maybe someone has
> worked it out.

Actually Smalltalk is not so far from Erlang right now (theoretically.
The question mark is the scheduling). Erlang is optimized for this
so the size of each process might be half the size of a Smalltalk one
(but I'm not sure of even this), but it's *certainly* much higher than
any native process or thread solution can hope to achieve.

> That's only because the current crop of operating systems were designed and
> envisioned when a few hundred processes and threads was considered a lot.
> Also because native operating system processes take a lot of resources.

It's because of the resources and how the OS deals with them. Keep in
mind that a thread can call "detach" and become a running process, so
some care has to be taken that space will be available. Of course
Linux deals with this by not having real threads at all, just
processes that have the same memory map as other processes.

> Yes, and how would the no sharing be implemented in Smalltalk?

This is what my investigations will reveal. As I alluded to in a
previous mail, any immutable data is not a concurrency issue. It
doesn't matter who can see it so long as it can't be updated. Mutable
data (e.g. objects) are also no issue provided you can guarantee no
process can get access to it besides the process that created it.
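Both halves of that argument can be illustrated in Python. This is a sketch under assumptions: `Point`, `send_copy` and `demo_isolation` are invented names, a frozen dataclass models immutable data, and a deep copy on send is just one way to guarantee that the receiver never holds a reference into the sender's mutable state.

```python
import copy
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class Point:
    """Immutable value: safe for any number of processes to read."""

    x: int
    y: int


def send_copy(mailbox, message):
    """Deliver a private deep copy so the receiver cannot alias sender state."""
    mailbox.append(copy.deepcopy(message))


def demo_isolation():
    """Return (whether the frozen value was mutated, the sender's balance)."""
    shared = Point(1, 2)
    try:
        shared.x = 99  # rejected: frozen instances cannot change
        mutated = True
    except FrozenInstanceError:
        mutated = False

    sender_state = {"balance": 100}
    mailbox = []
    send_copy(mailbox, sender_state)
    mailbox[0]["balance"] = 0  # the receiver mutates only its own copy
    return mutated, sender_state["balance"]
```

The sender's dictionary is untouched after the receiver's edit, which is the property the paragraph above asks the language or VM to enforce.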

So that leaves globals, especially classes. Until I get into this I'm
not 100% sure how I'll deal with it, but I can't imagine that it's not
solvable.

> How would you solve the concurrency one million node editing problem above
> without locking in your utopian threading implementation?

As described above.

> What would you do to Smalltalk to make it do this. So far you and the
> others have been very short on specifics and have just argued that something
> magical can be done to make concurrency happen without locks.

With current hardware/OSes, there will be locks, but in the VM where
they belong. The only structure in Erlang that must be atomic is the
message "mailbox", it's the only place that can be accessed at
the same time by multiple processes.

> A few papers
> and web sites have been linked to but no one has written down what they are
> proposing or what they mean past it can be done.

Well, I'm a Smalltalker. I form a vague idea and then go try to do
it. I'll let you know what the specification is when I've implemented
it. :) But I have researched into this as far as what exists today
and I haven't seen anything I feel is a show stopper, nor anything
that will require a change in Smalltalk semantics. It's very possible
(even likely) that there's something I've overlooked, but I'll need to
get into it to find that out.

> I'll grant you that you can see that it can be done. Please illuminate what
> it is that you see can be done in detail and how you might do it. Thanks.

Is it clearer now? I feel that I have detailed it out twice now (the
relevant details anyway).

> However, you'll still end up with concurrency control issues and you've
> got an object version explosion problem occurring as well. How will you
> control concurrency problems with your simplified system? Is there a
> succinct description of the way that Erlang does it? Would that apply to
> Smalltalk?

Can you give an example of one of these issues, so I can explain how I
would deal with it? Please note, there is *no data sharing, period*
in this paradigm. At least at the language level.

> Ok, so there would be 10,000 separate process-object-spaces with the one
> million nodes being edited and new nodes being created in each of these
> 10,000 separate spaces. How do you expect to "merge" the results and solve
> the edits that will inevitably cause "logical data inconsistency"
> collisions?

By having just one process that owns the data (or lots of processes
that own their own piece of it) that all processes must talk to if
they wish to make changes.
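A sketch of the "lots of processes that each own their own piece" variant, again in Python with threads standing in for processes. The `Owner` class, the routing rule (`key % shards`), and the counter-style edits are all assumptions chosen to keep the example small; the edits here are simple increments, not a general graph edit.

```python
import queue
import threading


class Owner(threading.Thread):
    """Owns one shard of the data; every edit to that shard goes through it."""

    def __init__(self):
        super().__init__()
        self.inbox = queue.Queue()
        self.data = {}  # touched only by this thread while it is running

    def run(self):
        while True:
            message = self.inbox.get()
            if message is None:  # shutdown signal
                return
            key, delta = message
            self.data[key] = self.data.get(key, 0) + delta


def run_sharded_demo(shards=4, clients=8, edits_per_client=250):
    """Many clients edit ten keys; each key is owned by exactly one shard."""
    owners = [Owner() for _ in range(shards)]
    for owner in owners:
        owner.start()

    def client():
        for n in range(edits_per_client):
            key = n % 10
            owners[key % shards].inbox.put((key, 1))  # route the edit by key

    threads = [threading.Thread(target=client) for _ in range(clients)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    for owner in owners:
        owner.inbox.put(None)
        owner.join()

    merged = {}
    for owner in owners:
        merged.update(owner.data)  # shards are disjoint, so nothing collides
    return merged
```

With the default arguments every key ends up at exactly 200, because colliding edits to the same key are applied one after another by that key's owner.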

> Your simplified concurrency system also dramatically alters the Smalltalk
> paradigm.
>
> The current paradigm is fine-grained locked/shared state.
>
> So?

So obviously this part of the current paradigm will be altered, and I
say it needs to be. Even if we find that certain parallel tasks need
the old shared state method, this shouldn't be provided anywhere most
people will find it. The problem is that most people who know how to
do concurrency code only know this shared state model, so if you
present multiple options they will all use this, the familiar.

> Why? Please provide more than anecdotal or belief driven comments for this
> point of view. What are the reasons? What is it that you'd be moving
> towards?

Because of the reasons I've laid out several times in this thread: 1)
it does not scale, 2) it can not be composed, 3) it's incredibly
difficult to reason about, 4) it's a low level detail, 5) it ensures
encapsulation violation and on and on.

There are plenty of papers out there on this subject, if you are
looking for me to go through them all and condense it for you in a
summary more than I've already done then I'm afraid that's not going
to happen. It's a pretty well known fact that shared-state fine
grained locking *can not scale*.

> It's a huge mistake on their part in my humble view.
>
> While it may be easy from the point of view of adapting their image it's a
> huge mistake. I've had many people comment that that's one of the reasons
> that Java is better than Smalltalk

If someone thinks that mess that is Java is better than Smalltalk, I
already question what useful information they can bring to the table.
Java has *some things* better than Smalltalk sure, but such a
statement is an "information smell" or a "taste smell" to say the
least.

> - it already works with multiple cpu
> cores. Yes they have to solve the concurrency problems, but those are NO
> WORSE than the concurrency problems that already exist within Smalltalk when
> running with a single native process and multiple (green threads aka)
> Smalltalk Processes. No different. Do you actually get that?

For someone who is so violently against personal attacks, you sure
hang over the fence, eh? Just because you're not understanding where
I'm coming from doesn't mean these concepts are just beyond me and
only you get it.

> If you don't
> then you fail to appreciate that the approach that Cincom is taking isn't
> going to solve the concurrency problems since - unless they correct me on
> this - it seems that their direction is to simply have N-instances of their
> image (in the same memory space or in separate operating system processes)
> where N would frequently be the same as the number of cores on the computer
> (or server) in question (although the instances could be more or less as
> needed).

Which *does* solve it! And conveniently walks right past all the
terrible issues that shared-state concurrency programming has. Once
again, while Java people are trying to debug issues the Smalltalk guys
will already be adding features to the next release.

> Each individual image would still have the problems of
> multi-threading within it IF AND ONLY IF there are multiple threads forked.

Right, so don't do that. :)

> This is of course a far cry from the radical concurrency system that is
> being proposed by the erlangization concurrency proponents.

Actually not so much. Erlang spanned actual CPUs by running more
images, just like Smalltalk. So only the processes inside the image
are different, but even this can be done today with discipline. I
would like to remove this need for discipline by making it
*impossible* to affect other processes, but so long as you make sure
you don't update anything that other processes can see you could do
this kind of message passing today.

> There isn't any need for new syntax with the "!" character. Now sure you're
> using it with a binary message selector "!" but why obfuscate it. I'd
> recommend using a keyword selector for better clarity. Thanks.

This is just what Erlang uses. I want inter-process sends to stand out clearly.

> Not so. You'd have to transmit - in my example above - one million objects
> to the various images and have them compute and return their results which
> would then have to be combined in a manner that leaves the graph of objects
> in a consistent state with one and a half million objects and 70% more
> interconnections between them. It is this parallel updating of many parts of
> the same data graph that will require the concurrency controls.

No. You are describing shared data which doesn't exist. No shared
data = no locking needed.

> Nothing but you've got to address the concurrency problem that I've
> mentioned above.

It wasn't a problem with message passing style, but for shared-state
concurrency programming.

> Are you talking about forking a new operating system process with a copy of
> the image?

I'm talking about: [ "some code" ] fork

> These are object database problems and attempting to split the processing
> into multiple threads to avoid the "locking" issues does not solve the
> problem. It just pushes it further away. While it might work for some
> applications like telephone switching systems it can't generalize to ALL
> types of problems which could benefit from concurrency solutions.

No it can't, and I don't believe I ever said it did. But garbage
collection can't either and we do fine with that as our only option in
Squeak. If we need more we step outside the normal bounds, as it
should be.

> All Object Databases have a couple of rooted objects. Maybe many more than
> a couple.

Object databases are a whole other can of worms. I don't know how I
would deal with it, but I would start by looking at what Mnesia
(basically an object db for Erlang) does.

> Yes, a variant of the Software Transactional Memory. However, you still
> have the problems mentioned above.

No, Software transactional memory means we update several variables
inside an "atomic" block and if the system notices something changed
while these changes were being made it rolls the block back to what it
was before.
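For reference, a toy version of that STM behaviour in Python. This is a simplified optimistic scheme written for illustration, not how any particular STM is implemented: `TVar`, `atomically` and `transfer` are invented names, writes are buffered rather than undone, and a single lock guards only the short validate-and-commit step.

```python
import threading


class TVar:
    """A transactional variable with a version counter."""

    def __init__(self, value):
        self.value = value
        self.version = 0


_commit_lock = threading.Lock()  # held only while validating and committing


def atomically(transaction):
    """Run `transaction(read, write)`; rerun it if anything it read has changed."""
    while True:
        reads, writes = {}, {}

        def read(tvar):
            if tvar in writes:
                return writes[tvar]
            reads.setdefault(tvar, tvar.version)  # remember the version we saw
            return tvar.value

        def write(tvar, value):
            writes[tvar] = value  # buffered until commit

        result = transaction(read, write)
        with _commit_lock:
            if all(tvar.version == seen for tvar, seen in reads.items()):
                for tvar, value in writes.items():
                    tvar.value = value
                    tvar.version += 1
                return result
        # Validation failed: drop the buffered writes and run the block again.


def transfer(source, target, amount):
    """Move `amount` between two TVars as one atomic block."""

    def body(read, write):
        write(source, read(source) - amount)
        write(target, read(target) + amount)

    atomically(body)


def run_stm_demo(transfers=200):
    """Run many concurrent transfers and return the final balances."""
    a, b = TVar(1000), TVar(0)
    threads = [
        threading.Thread(target=transfer, args=(a, b, 1)) for _ in range(transfers)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return a.value, b.value
```

The version numbers are the book keeping: every read is recorded and every commit has to validate those records before it can publish its writes.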

I'm talking about a VM optimization to deal with metaclasses.

> Having two spaces, old and new space, won't solve the problems mentioned
> above when you have N processes (threads) running on M-objects in parallel
> and need to combine the results of the parallel computations.

Old space and new space is purely for dealing with live code updates.
Nothing more. I'm not trying to solve any object versioning issues,
because I haven't seen any real evidence they will exist.

> Many problems have this "split processes off with their chunk of data" and
> "recombine" the results. Many of these problems are simplified - if possible
> - so that the results can't collide with the issues presented above.
> However, we are not talking about those special cases - such as parallel ray
> tracing algorithms. We are talking about the completely generic cases that
> occur in general purpose and every day use of code in Smalltalk applications

The only things I can think of that wouldn't work in this model are
problems where splitting up and rebuilding a dataset is more expensive
than the actual processing. But I think that can usually be solved by
design changes.
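
The "split off a chunk, process it, recombine" shape can be sketched as
follows. This is an illustration under assumptions, not a design: it is
Python rather than Smalltalk, threads and queues stand in for cheap
processes and mailboxes, worker and scatter_gather are made-up names, and
squaring numbers stands in for real work. Python threads will not show a
multi-core speedup here; only the structure is the point.

```python
import queue
import threading

def worker(inbox, outbox):
    """Own a chunk outright: receive it, process it, send the result back."""
    index, chunk = inbox.get()
    outbox.put((index, [x * x for x in chunk]))  # stand-in for real work

def scatter_gather(data, n_workers):
    """Split data into chunks, farm them out, reassemble the replies."""
    outbox = queue.Queue()
    size = max(1, -(-len(data) // n_workers))    # ceiling division
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    threads = []
    for index, chunk in enumerate(chunks):
        inbox = queue.Queue()
        t = threading.Thread(target=worker, args=(inbox, outbox))
        t.start()
        inbox.put((index, chunk))   # each slice is a separate list, not shared
        threads.append(t)
    results = [None] * len(chunks)
    for _ in chunks:                # replies may arrive in any order
        index, part = outbox.get()
        results[index] = part
    for t in threads:
        t.join()
    return [x for part in results for x in part]

squares = scatter_gather(list(range(10)), 3)
```

The slicing and the final flattening are exactly the "splitting up and
rebuilding" cost: if the per-chunk work is cheaper than those two steps,
the model loses.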

> - such as the massive Smalltalk business database front end applications
> which are typical at many corporations today and which utilize many threads
> to accomplish their parallel tasks in order to speed up the user experience.
> A real world consequence of this is increased productivity of thousands of
> users day in and day out at these corporations.

I'm not sure what you're saying here. Apparently these Smalltalk
applications aren't doing real multithreading now right (since it's
only an option on a few ST implementations)? So how is offering a
simple way to achieve concurrency going to make this worse?

> Maybe your applications aren't as complex as these but I don't see the
> benefits of an Erlang ONLY approach. I do see the benefit of STM and Erlang
> approaches in some cases but why intentionally limit the tool box to just a
> few cases? It makes no sense to ignore the harsh reality of concurrency
> issues by picking a limited set of solutions.

For the reasons mentioned above. Choice isn't the holy grail you seem
to think it is. If it was we would all be on the C++ list talking.
Funny that we ended up in a language that 1) doesn't allow you to
allocate your own memory, 2) forces you to use single inheritance, 3)
forces you to use an image instead of files, etc.

I'm comfortable with the simplest thing being what works 90-99% of the
time and having to work much harder if I need something more.

Sebastian Sastre-2

RE: Multy-core CPUs

> > Yes, and how would the no sharing be implemented in Smalltalk?
>
> This is what my investigations will reveal. As I alluded to
> in a previous mail, any immutable data is not a concurrency
> issue. It doesn't matter who can see it so long as it can't
> be updated. Mutable data (e.g. objects) are also no issue
> provided you can guarantee no process can get access to it
> besides the process that created it.
>
> So that leaves globals, especially classes. Until I get into
> this I'm not 100% sure how I'll deal with it, but I can't imagine
> that it's not solvable.
>

Jason, I'll describe what I understood of your proposed solution, so please
correct me if I haven't got it right.

<description>
The idea you are exploring is a Smalltalk in which one can send
messages to processes, that is, to object processes: instances of
something that is not a process as we know it today, neither in the OS
nor in the current Smalltalk VMs, but more like the Erlang ones (extremely
cheap). Note: for practical purposes I will disambiguate this new concept by
referring to them as if they were instances of ObjectInProcess.

So let's now imagine this Smalltalk which, instead of the Object class we
know today, has an ObjectProcess base class, with all its subclasses
forming the kind of hierarchy we have today.

In this Smalltalk, the VM guarantees that any instance can look at anything
about any creature of this virtual universe, but nobody can modify it. So
there is strict respect for encapsulation. The modification of instVars can
only be made by the objectProcess instance X itself, and the VM makes it
impossible for them to be modified in any other way.

So anObjectProcess will listen to messages, but to the rest of the creatures
of this image its instVars are read-only. If some other anyObjectProcess
wants to tell it something that demands a modification of some instVar, the
VM again guarantees that the modification will be made only by the object
itself. So creatures in this virtual universe can pressure or kindly ask any
other objectProcess creature to make a change, but the reality is that the
universe strictly guarantees that the change can be made only by that
creature itself.
</description>

<analogicCuriousObservation>
This is starting to sound strangely familiar to me: anybody can try to
convince me about how things are, etc. but as owner of my "hardware" I'm the
only creature that can change my synapses.
</analogicCuriousObservation>

So in a previous example of yours, any instance of any object (that can
reach it) can do:

(Processes at: 'bank account') addUSD: 5000

and the correct instance, in its own process, modifies the instVar.
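
To make that concrete, here is a rough sketch of the idea in Python rather
than Smalltalk. Everything in it is an assumption for illustration: one
thread per object stands in for an extremely cheap process, send and the
add_usd selector are made-up names, and nothing here stops another thread
from touching _usd directly, which is precisely the guarantee the VM would
have to provide.

```python
import queue
import threading

class ObjectInProcess:
    """An object whose state is only ever touched by its own thread."""
    def __init__(self):
        self._mailbox = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def send(self, selector, *args):
        """Queue a message; return a queue that will hold the single reply."""
        reply = queue.Queue(maxsize=1)
        self._mailbox.put((selector, args, reply))
        return reply

    def _run(self):
        # Messages are handled one at a time, so handlers never overlap.
        while True:
            selector, args, reply = self._mailbox.get()
            reply.put(getattr(self, selector)(*args))

class BankAccount(ObjectInProcess):
    def __init__(self):
        self._usd = 0          # set before the thread can see any message
        super().__init__()

    def add_usd(self, amount):
        self._usd += amount    # runs only on the account's own thread
        return self._usd

processes = {'bank account': BankAccount()}
balance = processes['bank account'].send('add_usd', 5000).get()
```

Any number of senders can call send concurrently; the only synchronized
structure is the mailbox, and the instVar itself needs no lock.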

OK, here is what I think:

What I think is that messages between processes should not be made special.
That smells bad. They should be normal and homogeneous. What should be
refactored is the Object concept, refining its definition (in this new
Smalltalk version) to an object that lives in a process. We don't need to
change the word, but to refine the definition, and of course the conceptual
interpretation, of what a Smalltalk object is!

We always talk about Smalltalk being a space of live objects with
a kind of anthropomorphic philosophy. Sorry, this is strong, but maybe
this hardware shift is telling us that the time has come to ask ourselves
about the need for a LiveObject class instead of Object, to keep having the
most complete and minimalist paradigm in the industry. I'm not proposing the
name formally; it is just to illustrate the value of time affecting
instances. Of course it is not alive; the name is also meant to emphasize
the evolutionary timeline of any instance: the things it will experience
once it starts to exist in the image. That evolution in time is what we call
"Process" (not today in this industry, but maybe soon).

IMHO the Object class, as we know it today, does not take into account
what Alan claimed at OOPSLA 97 about keeping in mind the process part as
being of such importance. The very existence of an instance depends
(inherently) on the experience it will undergo over time. And today we have
no holder for that.

Besides, the very nature of objects (now I mean in real life) is that they
are not frozen in time. Even a piece of ice has strong molecular activity
that we can interpret as __molecules in processes__ inside.

Refining that concept to allow Object to become more like this LiveObject,
ProcessObject or ObjectInProcess (whatever, I don't care about names now, but
I prefer to keep calling it Object) will mean that we care about
creating the possibility of seriously modeling this reality of the
experience that objects undergo over time.

If we use Smalltalk because its nature is to be heuristic, that is, familiar
with respect to reality, then maybe it's time to explore no longer giving
the process concept such a low priority from the design point of view, and
to give equal importance, again at the design level, to the experience of
objects in time.

What we have today is that the Smalltalk VMs make objects be supported by
other objects running in VM threads, but this idea is about making objects
exist __in__ a process. A process of their own. All of them.

So in this imaginary Smalltalk a process and an object are inseparable;
they don't exist as separate entities.

In a hypothetical simulation of the dynamics of one of these brain tides,
every MessageOfNeuron object (maybe hundreds of millions) runs chemically on
a process that goes ahead on its own through someAxon and someDendrites of
someNeurons.

So this simulation will run more efficiently in this Smalltalk, modeled
as objects and modeled closer to reality, and every MessageOfNeuron
instance will run balanced across N cores.

Wow! I'll stop here for now, but this is as radical as it is interesting.

I'd appreciate criticism from everyone about this. Any kind of it, I don't
mind; this idea seems too important to me.

Now a technical question:

As said before, a Squeak image easily has 800k instances. Erlang was claimed
to handle about 100k processes without problems (I don't know on what
hardware). Of course a proof of concept should start small, but I'm
worried that we would still need to run about 800k VM _instance processes_
to achieve the goal. I wonder how that will not be a problem?

Jason, please keep us informed about any progress,

All the best,

Sebastian

Igor Stasenko

Re: Multy-core CPUs

In reply to this post by Jason Johnson-5

On 23/10/2007, Jason Johnson <[hidden email]> wrote:

> Problem: your example is using shared data and updating of variables.
> In the message passing paradigm *there is no shared data*. Period.
> None. In Erlang specifically there isn't updating of variables even
> within a process. So this would be done in Erlang something like
> this:
>
> some_process(DataStructure) ->
>     break_up_structure(DataStructure, 10000),
>     get_new_structure({}, 10000).  % return result of get_new_structure
>
> % base case, no processes left
> break_up_structure(_, 0) -> done;
> % otherwise
> break_up_structure(DataStructure, Processes) ->
>     % cut off a piece and send
>     RestOfDataStructure = split_and_send(DataStructure),
>     % tail call with new values
>     break_up_structure(RestOfDataStructure, Processes - 1).
>
> % base case, return what we built
> get_new_structure(DataStructure, 0) -> DataStructure;
> get_new_structure(DataStructure, Processes) ->
>     % wait for the next piece to come back from any process
>     Data = receive Piece -> Piece end,
>     NewDataStructure = add_data_to_structure(Data, DataStructure),
>     get_new_structure(NewDataStructure, Processes - 1).
> % pseudo code for brevity: the helper functions are not shown
>
> The fact that variables are immutable is dealt with in the normal
> functional programming way of using tail recursion and passing any
> variables that need "updating" as arguments.
>
> In case the above code isn't clear: The process breaks up the parts of
> the data structure and farms them out to the different processes, then
> waits for responses and incrementally assembles them into the new data
> structure.
>
> Now, the issue here is obviously: This only makes sense when the
> processing of the data that was carved out is more expensive than the
> carving out and reattaching. If the structure is very large that may
> well not be the case.
>
> In that case I'm not sure how I would handle it, but I look at it like
> any other performance issue: I would try algorithm changes before I
> looked at going to a lower level.
>
> > Now someone mentioned Software Transactional Memory (STM) so briefly that
> > it would be easy to miss. Is that your solution?
>
> No, if someone else wants to look at this it's ok. I'm a bit
> concerned about the book keeping.
>
> > If so you still have other
> > concurrency issues, object versioning issues, plus more to deal with. No
> > solution is a panacea for all problems unless you are an advocate of silver
> > bullet solutions.
>
> There is no such thing, but just as a generational garbage collector
> is "good enough" in all but the most special cases, I believe message
> passing will be "good enough" as well.
>

This has prospects only if you have unlimited memory resources
and zero-cost memory allocation.
Let's look at this more precisely. I will write only in ST (I don't know
Erlang), and assuming that I understood your concept well, take the
following ST code:

SomeClass>>setVars
    self setVar1: value1.
    self setVar2: value2.
    ...
    ^ self

Here, at each message send, instead of writing to the receiver's memory, we
do copy-on-write cloning.
So self setVar1: value1 will return a modified copy, self'.
To keep things semantically correct, we then substitute self in 'self
setVar2: value2' with the copy just received, and so on.
At the end, by returning self, we substitute it with self''''''' .
So each time we modify an object we get a modified copy instead of
modifying the original.
Now think about the costs: memory allocation and orders of magnitude more
garbage generated.
Now, even if we assume that each process has its own private memory
region, it still has to be located somewhere in physical memory. And
as you may know, physical memory is shared among all cores, so your
'topmost' memory manager has no choice but to use the locking you dislike
so much to deal with concurrent requests for resources.
And as you can see from this example, with this model the memory manager
will very quickly become the great bottleneck,
because of the orders of magnitude higher memory consumption.

And now consider the alternative: even with a dumb lock-write-unlock
we waste far fewer cycles. Because in your 'non-locking'
model your main load is just producing tons of garbage by cloning
objects over and over.
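
The two alternatives being compared can be put side by side in a small
sketch. It is written in Python rather than Smalltalk, the class and
function names are invented for illustration, and it only counts how many
throwaway copies each style creates; it says nothing about allocator
contention or actual cycle counts.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Immutable:
    var1: int = 0
    var2: int = 0
    var3: int = 0

def set_vars_by_copy(obj):
    """Each 'setter' returns a fresh object; the previous one is garbage."""
    copies = 0
    for name, value in (("var1", 1), ("var2", 2), ("var3", 3)):
        obj = replace(obj, **{name: value})   # self' , self'' , self''' ...
        copies += 1
    return obj, copies

class Mutable:
    def __init__(self):
        self.var1 = self.var2 = self.var3 = 0

def set_vars_in_place(obj):
    """The owner writes its own slots; no new receiver object is created."""
    obj.var1, obj.var2, obj.var3 = 1, 2, 3
    return obj, 0

copied, n_copies = set_vars_by_copy(Immutable())
same, n_in_place = set_vars_in_place(Mutable())
```

With three setters the copying style creates three intermediate objects,
one per send, while the in-place style creates none. Whether in-place
updates then need a lock depends on whether anything other than the owner
can write to the object.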

--
Best regards,
Igor Stasenko AKA sig.

Herbert König

Re[2]: Multy-core CPUs

In reply to this post by pwl

Hello Peter,

PWL> Jason Johnson wrote:

PWL> Ok, so if you really are talking about a "strict" Erlang style model
PWL> with ONE Smalltalk process per "image" space (whether or not they are in
PWL> one protected memory space or many protected memory spaces) where
PWL> objects are not shared with any other threads except by copying them
PWL> over the "serialization wire" or by "reference" then I get what you are
PWL> talking about.

PWL> That is a strange way of putting it.

these posts of yours are very hard to read, as it's not easy to tell
what you are saying and what you are quoting.

Would be nice if you could change that.

I wouldn't say that if the topic weren't interesting as well as
complicated.

Cheers,

Herbert mailto:[hidden email]

Jason Johnson-5

Re: Multy-core CPUs, ERLANG

In reply to this post by pwl

On 10/23/07, Peter William Lount <[hidden email]> wrote:
>
> That's interesting. Thus Erlang DOES IN FACT HAVE SHARED MEMORY between
> processes: for code and for data. I'd like to learn more about that.
> Could anyone provide more details?

*sigh*. It does *not* have shared *mutable* memory, and that is the
key. If I have to contextualize everything I say every time I say it
my mails are going to get even longer, and I'm writing books as it is.
But I stated several times that sharing is no problem *when it's read
only*.

> One proposal was a "copy-on-write" object space model where objects that
> are about to be written to in a Smalltalk process would be copied to
> that process's private object space - in effect that process's view of
> the "image".
>
> To implement a copy-on-write technique would require operating system
> support for the typical modern mainstream operating system. To implement
> copy-on-write requires a synchronization primitive to be used by the
> operating system - if I'm not mistaken - at least for a few instructions
> while the page tables are updated - a critical section.

What are you on about? This has been done in Smalltalk before with a
system that had certain objects in ROM. It's pretty simple and
requires no OS help and no locking.

The VM handles requests made by processes. So if a process makes a
request that modifies something in the read-only space, the VM simply
copies the data to the process's "area" and makes the update. Simple.
It obviously requires no OS help and *no locking* because *by
definition the read-only space cannot be changed*.
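
Roughly, and only as a sketch in Python rather than anything VM-level
(ProcessSpace, SHARED and the "point" object are made up for the example):

```python
from types import MappingProxyType

# The read-only space: proxies reject any attempt to assign into them.
SHARED = MappingProxyType({
    "point": MappingProxyType({"x": 1, "y": 2}),
})

class ProcessSpace:
    """One process's view: private copies shadow the shared space."""
    def __init__(self, shared):
        self._shared = shared
        self._private = {}

    def read(self, key):
        if key in self._private:
            return self._private[key]
        return self._shared[key]

    def write(self, key, field, value):
        # First write: copy the object into this process's own area.
        # The shared original is never touched, so no lock is taken.
        if key not in self._private:
            self._private[key] = dict(self._shared[key])
        self._private[key][field] = value

p1, p2 = ProcessSpace(SHARED), ProcessSpace(SHARED)
p1.write("point", "x", 99)
```

After the write, p1 sees x as 99 while p2 and the shared space still see 1.
The copy is made once per object per process, on the first write only.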

The only issue not handled by this is *code* changes, so this needs a
separate mechanism as it does in Erlang (which also can change the
code at runtime): the old and new code areas. Smalltalk does this
now with "ObsoleteObject", so it shouldn't be a show stopper.

> One of the crucial aspects that Alan Kay (and others) have promoted over
> and over again is the ability of a language to be expressed in itself.
> This has a certain beauty to it as well as a mathematical aesthetic that
> has important ramifications that go way beyond those characteristics. To
> have a "mobius" system that can rewrite itself while retaining
> functioning versions across a continuous evolutionary path one requires
> a system that can be expressed in itself. Alan Kay points to a page in
> the Lisp Manual where Lisp is implemented in itself. Since Smalltalk is
> supposed to be a general purpose programming language it is crucial that
> it have this aspect of being able to implement itself with itself. So
> far Squeak comes close to this - at least with respect to the virtual
> machine which is written in the slang subset of Smalltalk.

Ok, so what's the problem? The system I'm proposing would also be
written in Slang. I'm certainly not going to do it in C any more than
I have to.

> Unfortunately
> Squeak relies upon manually written C files for binding with the various
> operating systems. Co-existence with C based technology has its price
> and it's high in that it blocks access to the entire system from within
> the system; by being blocked one is prevented from online interactive
> exploration and experimentation that we are used to at the Smalltalk
> source code level. At least this is being addressed in the amazing work
> of Ian Piumarta (http://piumarta.com/pepsi/pepsi.html) and the
> incredible work of LLVM (http://llvm.org). In fact I highly recommend
> that Squeak move from its current obsolete C compilers to make use of
> either of these two projects as the bottom of the VM. Apple is funding
> LLVM and Ian's work seems to be part of the work of Alan Kay's
> Viewpoints Research Institute (http://www.vpri.org).

Now we move into our areas of common ground. :) I too look forward to
the day that we can walk away from the ultimate premature optimization
that is C.

> The "non-destructive" assignment aspect of Erlang is typical of
> non-write-in-place functional and object database systems. It's a key
> aspect of the ZokuScript Object Database Management System and
> Technologies. However it's not a panacea that the silver bullet utopians
> think it is. As with any other solution matrix it has its benefits,
> payoffs, minuses and costs. These need to be balanced for every
> application. As Wolfgang points out there are issues with it such as the
> "cycle" problem that need to be overcome via implementation exceptions.

Of course. In CS everything is a trade off. I never proposed message
passing as a silver bullet, but rather the analogical equivalent of a
GC in the concurrency world.

> The other issue is how fine do you cut the objects? At what point do you
> say enough is enough? That is at what point does a process say oh, I
> don't really have control of changes to the object in question... as
> that object is private to another object space. Thus control needs to be
> passed to a process in the other object space likely on another compute
> node. For example corporate security constraints may require that
> certain data remain on the server while only permitting some data to be
> shared with a laptop node running remotely.

If the message passing is explicit, and the system isn't trying to do
anything fancy for me, this is no issue. As far as "oh I don't
control the object in question", this is encapsulation. Some people
even consider encapsulation a good thing. ;)

> It's important to consider the wider issues involved in distributed
> systems that are to be deployed in the real world. For Smalltalk to
> evolve we must get really serious about these issues ahead of the curve
> that others are pursuing now.

Exactly. And wasting precious resources to get where Java was 10
years ago when everyone else has realized that model can't scale is
exactly what we need to avoid.

> The erlangification (erlangization, or erlangisation) of Smalltalk may
> be a radical enough transformation that it's no longer Smalltalk. If
> that's the way of Squeak that's fine however it seems that a fork is
> likely the result (and yes, the pun of forking was intended).

No, what I envision will act just like Smalltalk does today with the
singular exception that you won't need #critical:, Semaphore or
any of that stuff anymore. And since most people don't use it, it
doesn't look that painful to me.

All the interprocess communication is just going to be accomplished as
message sends like everything else.

Jason Johnson-5

Re: Multy-core CPUs, ERLANG

In reply to this post by pwl

On 10/23/07, Peter William Lount <[hidden email]> wrote:
>
> Of course one could also implement a copy-on-write-bit for objects in
> the "read-only-shared-top-level-object-space-of-the-image". In order to
> accomplish any work a process must be forked! Also, this way any process
> that forks off will need to copy all of the objects it modifies into
> its own private object-space until the process commits its changes
> into the top level object-space or until it aborts.

Once again I have no idea what you're talking about. I guess you're
not responding to me with this, since the system I'm talking about
would not commit any changes back to a top level process.

> Concurrency isn't like automatic garbage collection - which is actually
> quite broad and complex a field - at all.

*sigh*. Ok, if you're going to respond to things I say, please read
what I write. Speed reading obviously isn't working. I said message
passing is *ANALOGOUS*.

analogous

adjective
1. similar or equivalent in some respects though otherwise
dissimilar; "brains and computers are often considered analogous";
"salmon roe is marketed as analogous to caviar"

Manual memory management is hard to do and does not scale or compose
well as explained in the email I originally linked to.

Shared state fine grained locking is hard to do and does not scale or
compose well as explained in the email I originally linked to.

Jason Johnson-5

Re: Re[2]: Multy-core CPUs

In reply to this post by Herbert König

Ok, what can I do to make myself clearer and easier to understand?

On 10/24/07, Herbert König <[hidden email]> wrote:

> Hello Peter,
>
> PWL> Jason Johnson wrote:
>
> PWL> Ok, so if you really are talking about a "strict" Erlang style model
> PWL> with ONE Smalltalk process per "image" space (whether or not they are in
> PWL> one protected memory space or many protected memory spaces) where
> PWL> objects are not shared with any other threads except by copying them
> PWL> over the "serialization wire" or by "reference" then I get what you are
> PWL> talking about.
>
> PWL> That is a strange way of putting it.
>
> these posts of yours are very hard to read, as it's not easy to tell
> what you are saying and what you are quoting.
>
>
> Would be nice if you could change that.
>
> I wouldn't say that if the topic weren't interesting as well as
> complicated.
>
>
> Cheers,
>
> Herbert mailto:[hidden email]
>
>
>

Herbert König

Re[4]: Multy-core CPUs

Hello Jason,

JJ> Ok, what can I do to make myself clearer and easier to understand?

JJ> On 10/24/07, Herbert König <[hidden email]> wrote:
>> Hello Peter,
>>

Nothing on your side; I wish Peter would use proper quoting in his
HTML mails. In my mailer (The Bat!) I can't distinguish his argument
from his quote of your argument. He also sends text mails with proper
quoting.

Cheers,

Herbert mailto:[hidden email]

Frank Shearar

Re: Multy-core CPUs, ERLANG

In reply to this post by Jason Johnson-5

"Jason Johnson" <[hidden email]> wrote:

> On 10/23/07, Peter William Lount <[hidden email]> wrote:
<snip>
> > Concurrency isn't like automatic garbage collection - which is actually
> > quite broad and complex a field - at all.
>
> *sigh*. Ok, if you're going to respond to things I say, please read
> what I write. Speed reading obviously isn't working. I said message
> passing is *ANALOGOUS*.

Interestingly enough, there was a paper at OOPSLA titled "The transactional
memory/garbage collection analogy":

http://portal.acm.org/citation.cfm?id=1297080&jmp=abstract&coll=portal&dl=ACM

(The URL requires an ACM subscription to read.)

The abstract reads:

This essay presents remarkable similarities between transactional memory
and garbage collection. The connections are fascinating in their own right,
and they let us better understand one technology by thinking about the
corresponding issues for the other.

frank

Jon Hylands

Re: Multy-core CPUs, ERLANG

On Wed, 24 Oct 2007 14:16:38 +0200, "Frank Shearar"
<[hidden email]> wrote:

> Interestingly enough, there was a paper at OOPSLA titled "The transactional
> memory/garbage collection analogy":

Here's the paper:

http://www.cs.washington.edu/homes/djg/papers/analogy_oopsla07.pdf

(Thanks to Google Scholar)

Later,
Jon

--------------------------------------------------------------
Jon Hylands [hidden email] http://www.huv.com/jon

Project: Micro Raptor (Small Biped Velociraptor Robot)
http://www.huv.com/blog

Sebastian Sastre-2

RE: Multy-core CPUs

In reply to this post by Igor Stasenko

>
> This has prospects only if you have unlimited memory
> resources and zero-cost memory allocation.
> Let's look at this more precisely. I will write only in ST (I
> don't know Erlang), and assuming that I understood your
> concept well, take the following ST code:
>
> SomeClass>>setVars
>     self setVar1: value1.
>     self setVar2: value2.
>     ...
>     ^ self
>
> Here, at each message send, instead of writing to the receiver's
> memory, we do copy-on-write cloning.

But I don't understand why you would want to do that when you can simply
have the receiver's process update its own memory when it receives the
message: (1) that memory was already assigned and (2) no copy is needed at
all.

Cheers,

Sebastian

> So self setVar1: value1 will return a modified copy,
> self'. To keep things semantically correct, we then
> substitute self in 'self setVar2: value2' with the copy
> just received, and so on.
> At the end, by returning self, we substitute it with self''''''' .
> So each time we modify an object we get a modified copy
> instead of modifying the original.
> Now think about the costs: memory allocation and orders of
> magnitude more garbage generated.
> Now, even if we assume that each process has its own private
> memory region, it still has to be located somewhere in
> physical memory. And as you may know, physical memory is
> shared among all cores, so your 'topmost' memory manager has
> no choice but to use the locking you dislike so much to deal
> with concurrent requests for resources.
> And as you can see from this example, with this model the
> memory manager will very quickly become the great bottleneck,
> because of the orders of magnitude higher memory consumption.
>
> And now consider the alternative: even with a dumb
> lock-write-unlock we waste far fewer cycles.
> Because in your 'non-locking'
> model your main load is just producing tons of garbage by
> cloning objects over and over.
>
> --
> Best regards,
> Igor Stasenko AKA sig.
>
