Smalltalk › Squeak › Squeak - Dev

Multy-core CPUs

_‹ Previous Topic Next Topic _›

Classic

List

Threaded

194 messages Options

1234567 ... 10

Klaus D. Witzel

Re: Multy-core CPUs

On Mon, 22 Oct 2007 14:30:28 +0200, Igor Stasenko wrote:

> On 22/10/2007, Sebastian Sastre wrote:
>> > > That's the point: you *don't* have synchronized access to
>> > any object.
>> > > All you have is messages. Think of it as an OO view of processes.
>> > > You can't see what's inside, you can only ask the process
>> > to do things
>> > > on your behalf.
>> > >
>> >
>> > Again, a question raised: how to ensure that messages are
>> > passed in correct order and make sure that messages are delivered?
>> > Now lets look inside: to make it working properly, you need
>> > to implement a message queue. And queue means that you must
>> > make an 'enqueue' and 'dequeue' operations synchronized.
>> > And that's exactly what i mean: even if you hide the
>> > concurrency problems from the eyes of developer, this is not
>> > means that problems are gone: now you have to deal with them by own.
>> > If you know another way(s) how to make proper message passing
>> > scheme without using synchronized object (such as queue), i
>> > am all ears.
>> >
>> > --
>> > Best regards,
>> > Igor Stasenko AKA sig.
>> >
>> 1. For the correct order: I understand that Erlang is open so, to some
>> point, nothing stop us from looking that how-tos on how Erlang's VM
>> makes
>> the message passing in correct order right? Seems to me that somehow
>> they
>> solved that question and probably we can study how assimilate that
>> virtue.
>
> While reading this topic, i googled, just to look what solutions are
> found in this area for non-locking queues.
> There is no wonder (still) - they all based on atomic CaS (Compare and
> Store) processor instructions. Its of course interesting how Erlang
> manages message passing, but i doubt that it based on something much
> different.
>
>> 2. For ensuring messages sends: "send and pray"
>>
>> That way when a smalltalk's erlangized message send is in a process that
>> terminates it should end with some cause for the act of finish. Maybe
>> this
>> will allow for instance to implement DNU: the VM don't find a proper
>> method
>> in the object to receive that message so it terminates the process
>> stating
>> that as cause.
>>
>
> An 'erlangenization' of sends mean that we need deal differently with
> contexts. I think best way for this, is to rethink a context to make
> it look closer to what is a process in Erlang.
> Yes, we must pay the price of making all contexts be real objects for
> each message send, so we might expect a real slow-down of single
> thread execution.

Not only slow-down :( For an example, have a look at the implementor of
#debug:title:full: in class Process, where thisContext is assigned to a
variable.

When #ifTrue:ifFalse: is really sent, ([thisContext] class) is
BlockContext *and* its sender is nil so the test for #hasContext: in the
next statement fails.

But Squeak's compiler [usually] doesn't emit code for sending
#ifTrue:ifFalse: so ([thisContext] class) is MethodContext and
#hasContext: doesn't fail (in this example).

> Then the only way how we could regain this loss is to use highly
> parallelisable algorithms.

... which can be employed regardless of 'erlangenization' :)

/Klaus

>> Cheers,
>>
>> Sebastian
>>
>

Igor Stasenko

Re: Multy-core CPUs

On 22/10/2007, Klaus D. Witzel <[hidden email]> wrote:

> Not only slow-down :( For an example, have a look at the implementor of
> #debug:title:full: in class Process, where thisContext is assigned to a
> variable.
>
> When #ifTrue:ifFalse: is really sent, ([thisContext] class) is
> BlockContext *and* its sender is nil so the test for #hasContext: in the
> next statement fails.
>
I think this is because of optimization.
For BlockContext a sender should be a context of method
#ifTrue:ifFalse: (which sends #value to block). But compiler never
creates such context due to optimization. In this case, since compiler
'cuts' the #ifTrue:ifFalse: out, then a correct context, i think,
should be a sender of #ifTrue:ifFalse?? but not nil.

> But Squeak's compiler [usually] doesn't emit code for sending
> #ifTrue:ifFalse: so ([thisContext] class) is MethodContext and
> #hasContext: doesn't fail (in this example).
>
> > Then the only way how we could regain this loss is to use highly
> > parallelisable algorithms.
>
> ... which can be employed regardless of 'erlangenization' :)
>

A trivial code comes in mind:

(1 to: 1000) do: [:i | [ aBlock value:i ] fork ]

but this leads to burden our parallel processes with scheduling.
I would like, instead, to be able to run a number of parallel branches
for same process (to schedule a process instead each of these
branches).

(1 to: 1000) doInParallel: [:i | aBlock value ]

I really don't like adding another abstraction like Thread, in
addition to Process. Maybe we should stick with a Process and have a
subclass of it, like ProcessNoScheduling.
I'm just thinking, in what ways we can avoid excessive scheduling/preempting?
Or maybe, by following road of 'erlangisation' we should make a
Process more lightweight, so spawning thousands of them will not cause
a speed degradation.

--
Best regards,
Igor Stasenko AKA sig.

Sebastian Sastre-2

RE: Multy-core CPUs

> >
>
> A trivial code comes in mind:
>
> (1 to: 1000) do: [:i | [ aBlock value:i ] fork ]
>
> but this leads to burden our parallel processes with scheduling.
> I would like, instead, to be able to run a number of parallel
> branches for same process (to schedule a process instead each
> of these branches).
>
> (1 to: 1000) doInParallel: [:i | aBlock value ]
>
> I really don't like adding another abstraction like Thread,
> in addition to Process. Maybe we should stick with a Process
> and have a subclass of it, like ProcessNoScheduling.
> I'm just thinking, in what ways we can avoid excessive
> scheduling/preempting?

Seems to me that OS process and native threads is not the best choice (too
costly to create, etc).

> Or maybe, by following road of 'erlangisation' we should make
> a Process more lightweight, so spawning thousands of them
> will not cause a speed degradation.
>
Making process in Erlang is as cheap as creating objects in Smalltalk so
this how-to's are exactly the ones I think we can take a look in Erlang's VM
design/internals and/or talk with people familiar to Erlang VM. Would be
cool to meet them and have a smalltalk :) to see what goes from it

Cheers,

Sebastian

>
> --
> Best regards,
> Igor Stasenko AKA sig.
>

Bryce Kampjes

Re: Multy-core CPUs

In reply to this post by Rob Withers

Robert Withers writes:
> My thinking is that getting the messaging working is the first step,
> followed by looking at synchronization problems, and then looking at
> what things like Exupery may offer to speed things up.
>
> The example I gave of MacroTransforms is telling. Currently an
> #ifTrue: message is macro transformed into bytecodes that do the
> #ifTrue: inline. I have had to back that out so the #ifTrue: can be
> intercepted if the receiver is non-local. At runtime, it would be
> nice to see that if the receiver is in fact local, then some form of
> inlining could be used, otherwise intercept. Since this is runtime
> selected bytecodes, I thought of Exupery.

One option would be to just disable ifTrue: inlining using Klaus's
code and wait for Exupery to solve the speed problem introduced. Full
message inlining should be able to optimise the message sends out
of ifTrue:. This optimisation is planned for Exupery 2.0.

I'm still working to get to 1.0 so waiting doesn't make sense if
you need the speed now or in the next year or two.

Optimising ifTrue: implemented with message sends is a simple use of
the dynamic inlining work pioneered by Urs Holzle in Self.

Bryce

pwl

Re: Multy-core CPUs

"In February, 2007 NVIDIA, the worldwide leader in programmable graphics processor technologies, launched CUDA, a C-Compiler and developer's kit that gives software developers access to the parallel processing power of the GPU through the standard language of C."

"Until recently, graphic cards' GPUs couldn't be used for applications such as password recovery. Older graphics chips could only perform floating-point calculations, and most cryptography algorithms require fixed-point mathematics. Today's chips can process fixed-point calculations. And with as much as 1.5 Gb of onboard video memory and up to 128 processing units, these powerful GPU chips are much more effective than CPUs in performing many of these calculations."

"Since high-end PC mother boards can work with four separate video cards, the future is bright for even faster ... applications."

Some applications have experienced a 25x speed up using a $150 graphics card's GPU.

http://www.net-security.org/secworld.php?id=5567

"NVIDIA® CUDA™ technology is a fundamentally new computing architecture that enables the GPU to solve complex computational problems in consumer, business, and technical applications. CUDA (Compute Unified Device Architecture) technology gives computationally intensive applications access to the tremendous processing power of NVIDIA graphics processing units (GPUs) through a revolutionary new programming interface. Providing orders of magnitude more performance and simplifying software development by using the standard C language, CUDA technology enables developers to create innovative solutions for data-intensive problems. For advanced research and language development, CUDA includes a low level assembly language layer and driver interface."
http://developer.nvidia.com/object/cuda.html

Hi,

How can Squeak leverage this? Certainly in the area of graphics. Which other areas?

Squeak for a GPU anyone?

What can be accomplished with 128 x 4 GPU processing units per cheap PC node?

All the best,

Peter

Jecel Assumpcao Jr

Re: Multy-core CPUs

Peter William Lount wrote:
> How can Squeak leverage this? Certainly in the area of graphics. Which other areas?
>
> Squeak for a GPU anyone?
>
> What can be accomplished with 128 x 4 GPU processing units per cheap PC node?

One option is to wait for the Wheel of Hardware Reincarnation to crank
through a couple of more steps giving us a huge number of processors,
all alike. Then we are back to the subject of this thread :-)

http://www.cap-lore.com/Hardware/Wheel.html

-- Jecel

pwl

Re: Multy-core CPUs

In reply to this post by pwl

Jecel Assumpcao Jr wrote:
> One option is to wait for the Wheel of Hardware Reincarnation to crank
> through a couple of more steps giving us a huge number of processors,
> all alike. Then we are back to the subject of this thread :-)
>
> http://www.cap-lore.com/Hardware/Wheel.html

Hi Jecel,

Sweet.

Fortunately the cycle is swinging back around with the Tile-64 (and
Tile-N) core processors that are just now being released. Also Intel has
a similar 80-core chip that they've showed off but isn't slated for
production (quite yet). The general purpose highly connected chips using
on chip networks to communicate are likely the way of the future. Intel
will eventually produce X86-64 variants (and hopefully Itanium's) that
have N-cores where N is 64 or larger - maybe sooner than we think.

The next steps of N-core and N-threading design for Squeak and Smalltalk
are crucial.

Even the magical Erlang way of concurrency won't solve real world issues
such as multiple processes contending for limited hardware resources.
These need synchronization. No one answered Igor's point on this.

It would still be nice if someone who is supporting the Erlangisation
(or is that Erlangization) of Smalltalk's processes to write up a
complete description of what they are actually proposing. It makes
debating it easier. Thanks.

All the best,

Peter

Rob Withers

Re: Multy-core CPUs

In reply to this post by Bryce Kampjes

----- Original Message -----
From: <[hidden email]>
To: "The general-purpose Squeak developers list"
<[hidden email]>
Sent: Monday, October 22, 2007 1:30 PM
Subject: Re: Multy-core CPUs

> Robert Withers writes:
> > My thinking is that getting the messaging working is the first step,
> > followed by looking at synchronization problems, and then looking at
> > what things like Exupery may offer to speed things up.
> >
> > The example I gave of MacroTransforms is telling. Currently an
> > #ifTrue: message is macro transformed into bytecodes that do the
> > #ifTrue: inline. I have had to back that out so the #ifTrue: can be
> > intercepted if the receiver is non-local. At runtime, it would be
> > nice to see that if the receiver is in fact local, then some form of
> > inlining could be used, otherwise intercept. Since this is runtime
> > selected bytecodes, I thought of Exupery.
>
> One option would be to just disable ifTrue: inlining using Klaus's
> code and wait for Exupery to solve the speed problem introduced. Full
> message inlining should be able to optimise the message sends out
> of ifTrue:. This optimisation is planned for Exupery 2.0.

That would be awesome. I look forward to it.

cheers,
Rob

>
> I'm still working to get to 1.0 so waiting doesn't make sense if
> you need the speed now or in the next year or two.
>
> Optimising ifTrue: implemented with message sends is a simple use of
> the dynamic inlining work pioneered by Urs Holzle in Self.
>
> Bryce
>
>

Klaus D. Witzel

Scalability [was: Multy-core CPUs]

In reply to this post by Igor Stasenko

On Mon, 22 Oct 2007 19:10:49 +0200, Igor Stasenko wrote:

> On 22/10/2007, Klaus D. Witzel wrote:
>
>> Not only slow-down :( For an example, have a look at the implementor of
>> #debug:title:full: in class Process, where thisContext is assigned to a
>> variable.
>>
>> When #ifTrue:ifFalse: is really sent, ([thisContext] class) is
>> BlockContext *and* its sender is nil so the test for #hasContext: in the
>> next statement fails.
>>
> I think this is because of optimization.
> For BlockContext a sender should be a context of method
> #ifTrue:ifFalse: (which sends #value to block). But compiler never
> creates such context due to optimization. In this case, since compiler
> 'cuts' the #ifTrue:ifFalse: out, then a correct context, i think,
> should be a sender of #ifTrue:ifFalse?? but not nil.
>
>> But Squeak's compiler [usually] doesn't emit code for sending
>> #ifTrue:ifFalse: so ([thisContext] class) is MethodContext and
>> #hasContext: doesn't fail (in this example).
>>
>> > Then the only way how we could regain this loss is to use highly
>> > parallelisable algorithms.
>>
>> ... which can be employed regardless of 'erlangenization' :)
>>
>
> A trivial code comes in mind:
>
> (1 to: 1000) do: [:i | [ aBlock value:i ] fork ]
>
> but this leads to burden our parallel processes with scheduling.
> I would like, instead, to be able to run a number of parallel branches
> for same process (to schedule a process instead each of these
> branches).
>
> (1 to: 1000) doInParallel: [:i | aBlock value ]
>
> I really don't like adding another abstraction like Thread, in
> addition to Process. Maybe we should stick with a Process and have a
> subclass of it, like ProcessNoScheduling.

I think that the present multi-core CPU thread would benefit from having a
look at what other people achieved in this area, I mean people who bet
their whole carreer on optimizing resource allocation and resource
scheduling, like for example this one

Scalability of Microkernel-Based Systems
- http://l4ka.org/publications/2005/uhlig_phd-thesis_scalability.pdf
(just skip the few pages in German, the paper is in English)

And one shouldn't care about that L4 folks are mainly concerned with OS
components since that are objects like any other :) They [L4 folks] have
minimalistic number of concepts and tough, very tough requirement
definitions which have to be matched with reality ;-)

But, the way I understand the present multi-core CPU thread, Squeak people
aim to save the multi-core processing world by reinventing it :-D

> I'm just thinking, in what ways we can avoid excessive
> scheduling/preempting?

I think that you can find answers to this [with benchmarks and,
comparisions also at the conceptual level] in the abovementioned paper :)

> Or maybe, by following road of 'erlangisation' we should make a
> Process more lightweight, so spawning thousands of them will not cause
> a speed degradation.

But it will. There are hidden constants associated with our present
understanding of massive parallelism (you mentioned the cost of resource
allocation and resource scheduling, add to that that messages can get
lost, non-local updates for keeping the system viable, etc).

And you have to find problems which can be solved with massive parallel
threads/processes ;-)

/Klaus

>
>

Michael van der Gulik-2

Re: Multy-core CPUs

In reply to this post by gruntfuttuck

On 10/17/07, gruntfuttuck <[hidden email]> wrote:

How is squeak going to handle multy-core CPUs, if at all? If we see cores of
100 plus in the future and squeak stay as it is, I would imagine other
languages such as erlang, will look more attractive.

The answer seems pretty obvious: modify the VM to support them. I'll skim over the details which I'm sure everybody already knows.

My question is: what are we going to do with multi-core CPUs? The code in the image is almost all single threaded. Morphic freezes up when I run something in the workspace (!!). Smalltalkers just don't seem to understand multi-threaded code, even though the basic capabilities been available to them since day one.

I use Futures now and then; I implemented them myself:

f := Future doing: [ some long computation ].
... insert more code here which runs in parallel with the long computation.
f printResult. "Will block until the long computation has returned the result into f and then print the result. "

I imagine that a parallel collection package would be possible to make:

c := ParOrderedCollection new. "or ParSet, ParBag..."
c addAll: lots of stuff.
c do: [ :each | each doSomething ]. "Will fork a Process for each element in c. "
c map: [ :each | each transform] andGather: [ :each :sum | sum combineWith: each]. "Google's map and gather algorithm"

Object>>changed: can be modified to be parallel; this makes the dependents/updating framework parallel. I did this and the image seemed to work fine.

There's heaps of parallel stuff you can do in Squeak. One day I'd like to have a crack at making the VM use pthreads more, but that will be the day after people actually start writing parallel code.

Gulik.

--
http://people.squeakfoundation.org/person/mikevdg
http://gulik.pbwiki.com/

pwl

Re: Multy-core CPUs

In reply to this post by Jason Johnson-5

Jason Johnson wrote:

On 10/21/07, Peter William Lount [hidden email] wrote:

tim Rowledge wrote:

Ok, so if you really are talking about a "strict" Erlang style model
with ONE Smalltalk process per "image" space (whether or not they are in
one protected memory space or many protected memory spaces) where
objects are not shared with any other threads except by copying them
over the "serialization wire" or by "reference" then I get what you are
talking about.


That is a strange way of putting it.

Why? That is what Erlang achieves via it's total encapsulation of state that is only transferred by message passing to and back from a process. To achieve the same thing in Smalltalk you'd need to isolate the component objects running in an "image" object space with the process otherwise you'd be breaking the encapsulation that provides the protection against a large number of class es of concurrency problems.

The principle is that anytime you have more than one thread or process working on the same memory space, or object space, you WILL have concurrency issues (unless your code is just running very simple concurrency). The point is that in order to implement your utopia-vision-of-simple-problem-free-concurrency (utopia-concurrencia for lack of a better name) in Smalltalk you MUST isolate the objects to ONLY ONE thread of possible alteration of their state otherwise you end up with the possibility of many classes of concurrency problems. Shared memory problems exist even within one protected memory space and not just between them. To isolate the objects involved in a process you can have a separate object space which contains the objects that will be operated on. This is the Erlang way, isn't it? The thing about Erlang, unless I'm mistaken (and if I am mistaken I'd expect to be corrected), is that the objects in a process are only visible to that process until the results are returned. The objects that pass in and out of an Erlang process are only primitive data types and not complex objects. However for Smalltalk you'd need to pass in complex object graphs of arbitrary size and connectedness to be general purpose. This then results in a version problem.

For example, lets say that you have a graph of one million objects that is highly connected and you want to perform not just a simple read operation on it but a massive number of edits which would result in the graph growing by 50% and the number of connections growing by 70%. For speed you decide to implement the algorithms so that they can run in parallel upon this moderately large graph of objects. Lets say that you have enough compute and memory resources to split this into 10,000 processes. Now you have the problem of sharing the one million objects with the 10,000 processes. That's a lot of data to move around just to get things started assuming that you packaged up the whole mess into a serial blob and spit it at the various processes. A lot of redundant data. Ok, maybe it's better to do this in small chunks, after all incrementalism is a powerful technique. For this approach you send each of the 10,000 processes a starting node plus a "search pattern" and the type of edits it will perform upon the graph along with the actual edits as they flow in from another source. So now you have 10,000 processes each vying to traverse the one million node graph scanning for patterns and applying edits as they find what they are looking for. Some of these processes will then update the "shared graph". Oh. What happens when two processes both update the same node in this graph but in different ways? Let's say one edit in one process adds a connection while the other edit in the other node modifies an instance variable on that node? Let's say that these two edits occur at the same time and are mutually exclusive - that is both edits would break the object's own internal consistency rules. So now you have two edits that either must both fail, or one must succeed while the other fails or the other must succeed - both can't succeed. Now you've got a problem that the magical erlang message passing won't solve.

If it does what is the erlang solution to this million node parallel editing problem?

Now someone mentioned Software Transactional Memory (STM) so briefly that it would be easy to miss. Is that your solution? If so you still have other concurrency issues, object versioning issues, plus more to deal with. No solution is a panacea for all problems unless you are an advocate of silver bullet solutions.

The problem of editing a large graph of objects with many parallel threads is the generalized case of a nasty and complex set of concurrency and transactional issues. There are many ways to solve this. If you reply to this example I would hope that you do so fully explaining how you'd handle the concurrency and - importantly - the object consistency issues.

The fact is, Erlang has many processes per image.

Yes, I understand that early tests indicate that Erlang can handle approximately 100,000 or so processes at a time without hickups while Java can handle about 8,000 or so before blowing up. I don't know what the various Smalltalks can handle, but I doubt it's as high as Erlang and is more likely less than even Java - just a guess though. Maybe someone has worked it out.

Many more then you could ever get as real
processes or native threads (as a test I made a little program that
spawned 64 *thousand* threads and passed messages between them on my
laptop).

That's only because the current crop of operating systems were designed and envisioned when a few hundred processes and threads was considered a lot. Also because native operating system processes take a lot of resources.

But with their model, process creation is extremely cheap.  And since
there is no sharing as far as the language is concerned, there is no
need for locking to slow everything down.

Yes, and how would the no sharing be implemented in Smalltalk?

How would you solve the concurrency one million node editing problem above without locking in your utopian threading implementation?

Smalltalk can do this too. I think it needs a little work still, but
I'm optimistic about what can be done here.

What would you do to Smalltalk to make it do this. So far you and the others have been very short on specifics and have just argued that something magical can be done to make concurrency happen without locks. A few papers and web sites have been linked to but no one has written down what they are proposing or what they mean past it can be done.

I'll grant you that you can see that it can be done. Please illuminate what it is that you see can be done in detail and how you might do it. Thanks.

However, you'll still end up with concurrency control issues and you've
got an object version explosion problem occurring as well. How will you
control concurrency problems with your simplified system? Is there a
succinct description of the way that Erlang does it? Would that apply to
Smalltalk?


Much like how Smalltalk does it, as it turns out.  That is, you don't
have a version problem so much as you have "old" and "new".  So when
ready you send the "upgrade" message to  the system and all new calls
to the main functions of a process will be the new version.  All
currently running code will access the old code until it's completion,
and all new code runs in the new space.

Ok, so there would be 10,000 separate process-object-spaces with the one million nodes being edited and new nodes being created in each of these 10,000 separate spaces. How do you expect to "merge" the results and solve the edits that will inevitably cause "logical data inconsistency" collisions?

You simplified concurrency system also dramatically alters the Smalltalk
paradigm.


The current paradigm is fine-grained locked/shared state.

So?

 In my opinion and the opinion of many (probably most in fact, outside of the
Java community) people who are more expert is this area then you or I,
we *have* to move away from this paradigm.

Why? Please provide more than anticidal or belief driven comments for this point of view. What are the reasons? What is it that you'd be moving towards?

Is this the approach that Cincom is using in their Visual Works system?
They seem to not be embracing the notion of native threads.


Thank God. :)

It's a huge mistake on their part in my humble view.

While it may be easy from the point of view of adapting their image it's a huge mistake. I've had many people comment that that's one of the reasons that Java is better than Smalltalk - it already works with multiple cpu cores. Yes they have to solve the concurrency problems, but those are NO WORSE than the concurrency problems that already exist within Smalltalk when running with a single native process and multiple (green threads aka) Smalltalk Processes. No different. Do you actually get that? If you don't then you fail to appreciate that the approach that Cincom is taking isn't going to solve the concurrency problems since - unless they correct me on this - it seems that their direction is to simply have N-instances of their image (in the same memory space or in separate operating system processes) where N would frequently be the same as the number of cores on the computer (or server) in question (although the instances could be more or less as needed). Each individual image would still have the problems of multi-threading within it IF AND ONLY IF there are multiple threads forked. Then you have all the same concurrency problems that happen with multiple threads on objects in one memory space. Sure this is a simpler approach for them as they don't have to completely toss their current virtual machine design - they can hack it by simply using one image space per native processor or per native operating system process. Then all they need is a cheap and dirty distributed object transport system to move object graphs (complete or partial) around between the various images. This will work for them and ALL Smalltalk systems including Squeak. In fact this can work now essentially with unmodified Smalltalk systems - all that's reallly needed is the distributed objects framework and there are a few of those kicking around.

This is of course a far cry from the radical concurrency system that is being proposed by the erlangization concurrency proponents.

However it's
also unlikely that they are embracing the notion of only ONE Smalltalk
process per image either.


If I understand you correctly, then I would suggest not to use the
word "image" as this is confusing.  Another way to put it would be
"each process has it's own view of the world".  And honestly, what is
the problem you see with this?

Ok. How will you implement that?

Right now, if you run two separate images with only one thread or
process, then you have two processes that each have their own set of
objects in their own space interacting with each other.

Yes, exactly. This is the illusion that Erlang provides. This can also be achieved now with ANY Smalltalk version just by starting multiple images - one for each core if you want to map them that way as may be "natural" to want to do.

Now we add a way for one image to send a message *between* images.

Yes. That can be done now.

Perhaps the VM can detect when we are trying to do this, but instead
of complicating the default Smalltalk message sending subsystem, lets
make it explicit with some special binary message:

Processes at: 'value computer' ! computeValue.

There isn't any need for new syntax with the "!" character. Now sure you're using it with a binary message selector "!" but why obfuscate it. I'd recommend using a keyword selector for better clarity. Thanks.

Now we have the ability to send messages locally within a process, and
a way of freely sending between processes.  No locking and the
problems associated with locking.

Not so. You'd have to transmit - in my example above - one million objects to the various images and have them compute and return their resutls which would then have to be combined in a manner that leaves the graph of objects in a consistent state with one and a half million objects and 70% more interconnections between them. It is this parallel updating of many parts of the same data graph that will require the concurrency controls.

So, now what is stopping us from moving this separate process *inside
the same image*?

Nothing but you've got to address the concurrency problem that I've mentioned above.

If you fork a process and he starts making objects,
no other processes have references to those objects.  No shared state
issue there.  This part could work right now today with no changes to
the VM.

Are you talking about forking a new operating system process with a copy of the image? The "copied" objects or the objects that were in the "image" to begin with are "duplicates" (or N-plicates really) which is a real headache if they get modified in multiple images and need to be "recombined" into one real persistent state.

These are object database problems and attempting to split the processing into multiple threads to avoid the "locking" issues does not solve the problem. It just pushes it further away. While it might work for some applications like telephone switching systems it can't generalize to ALL types of problems which could benefit from concurrency solutions. That's wishful thinking and a pipe dream otherwise known as a silver bullet.

The only issue I can think of are globals,

All Object Databases have a couple of rooted objects. Maybe many more than a couple.

the most obvious being
class side variables.  Note that even classes themselves are not an
issue because without class side variables, they are effect free
(well, obviously basicNew would have to be looked at).

I'm not sure what you mean.

But I think this issue is solvable.  The VM could take a "copy on
write" approach on classes/globals.  That is, a class should be side
effect free (to itself, i.e. it's the same after every call), so let
all processes share the memory space where meta-class objects live.
But as soon as any process tries to modify the class in some way
(literally, it would be the class modifying itself), he gets his own
copy.  Processes must not see changes made by other processes, so a
modification to a global class is a "local only" change.

Yes, a variant of the Software Transactional Memory. However, you still have the problems mentioned above.

Of course the only big thing left would be; what happens when we add a
new class.  But Erlang has had success with the old/new space
approach, and what Smalltalk has now is very similar.

Having two spaces, old and new space, won't solve the problems mentioned above when you have N processes (threads) running on M-objects in parallel and need to combine the results of the parallel computations.

Many problems have this "split processes off with their chunk of data" and "recombine" the results. Many of these problems are simplified - if possible - so that the results can't collide with the issues presented above. However, we are not talking about those special cases - such as parallel ray tracing algorithms. We are talking about the completely generic cases that occur in general purpose and every day use of code in Smalltalk applications - such as the massive Smalltalk business database front end applications which are typical at many corporations today and which utilize many threads to accomplish their parallel tasks in order to speed up the user experience. A real world consequence of this is increased productivity of thousands of users day in and day out at these corporations.

Maybe your applications aren't a complex as these but I don't see the benefits of an Erlang ONLY approach. I do see the benefit of STM and Erlang approaches in some cases but why intentionally limit the tool box to just a few cases? It makes no sense to ignore the harsh reality of concurrency issues by picking a limited set of solutions.

All the best,

Peter William Lount
[hidden email]

Wolfgang Eder

Re: Multy-core CPUs, ERLANG

Peter William Lount wrote:

> Jason Johnson wrote:
>> On 10/21/07, Peter William Lount <[hidden email]> wrote:
>>
>>> tim Rowledge wrote:
>>>
>>> Ok, so if you really are talking about a "strict" Erlang style model
>>> with ONE Smalltalk process per "image" space (whether or not they are in
>>> one protected memory space or many protected memory spaces) where
>>> objects are not shared with any other threads except by copying them
>>> over the "serialization wire" or by "reference" then I get what you are
>>> talking about.
>>>
>>
>> That is a strange way of putting it.
>
> Why? That is what Erlang achieves via it's total encapsulation of state
> that is only transferred by message passing to and back from a process.
> To achieve the same thing in Smalltalk you'd need to isolate the
> component objects running in an "image" object space with the process
> otherwise you'd be breaking the encapsulation that provides the
> protection against a large number of class es of concurrency problems.

[more stuff snipped]

Hello all,
I think that Erlang does have mechanisms to share
stuff between processes. First, the code is shared.
When I update a module, all processes using the
code of the module will (eventually) switch to the
new version.
And then there is the Mnesia database and its parts
that can be used to share data between processes.

And, slightly off topic probably:
One thing that strikes me as remarkable about the
Erlang system is that, since there is non-destructive
assignment, you cannot have cycles in your object
graphs. I think this simplifies the GC tremendously.
But I can think of no way of doing something similar
with Smalltalk objects, unfortunately.

Cheers,
Wolfgang

pwl

Re: Multy-core CPUs, ERLANG

Wolfgang Eder wrote:

> [more stuff snipped]
>
> Hello all,
> I think that Erlang does have mechanisms to share
> stuff between processes. First, the code is shared.
> When I update a module, all processes using the
> code of the module will (eventually) switch to the
> new version.
> And then there is the Mnesia database and its parts
> that can be used to share data between processes.
>
> And, slightly off topic probably:
> One thing that strikes me as remarkable about the
> Erlang system is that, since there is non-destructive
> assignment, you cannot have cycles in your object
> graphs. I think this simplifies the GC tremendously.
> But I can think of no way of doing something similar
> with Smalltalk objects, unfortunately.
>
> Cheers,
> Wolfgang

Hi,

That's interesting. Thus Erlang DOES IN FACT HAVE SHARED MEMORY between
processes: for code and for data. I'd like to learn more about that.
Could anyone provide more details?

One proposal was a "copy-on-write" object space model where objects that
are about to be written to in a Smalltalk process would be copied to
that processes private object space - in effect that processes view of
the "image".

To implement a copy-on-write technique would require operating system
support for the typical modern mainstream operating system. To implement
copy-on-write requires a synchronization primitive to be used by the
operating system - if I'm not mistaken - at least for a few instructions
while the page tables are updated - a critical section.

To implement copy-on-write requires a language to have an ability to go
beyond the Erlang style of concurrency capabilities.

One of the crucial aspects that Alan Kay (and others) have promoted over
and over again is the ability of a language to be expressed in itself.
This has a certain beauty to it as well as a mathematical aesthetic that
has important ramifications that go way beyond those characteristics. To
have a "mobius" system that can rewrite itself while retaining
functioning versions across a continuous evolutionary path one requires
a system that can be expressed in itself. Alan Kay points to a page in
the Lisp Manual where Lisp is implemented in itself. Since Smalltalk is
supposed to be a general purpose programming language it is crucial that
it have this aspect of being able to implement itself with itself. So
far Squeak comes close to this - at least with respect to the virtual
machine which is written in the slang subset of Smalltalk. Unfortunately
Squeak relies upon manually written C files for binding with the various
operating systems. Co-existence with C based technology has it's price
and it's high in that it blocks access to the entire system from within
the system; by being blocked one is prevented from online interactive
exploration and experimentation that we are used to at the Smalltalk
source code level. At least this is being addressed in the amazing work
of Ian Piumarta (http://piumarta.com/pepsi/pepsi.html) and the
incredible work of LLVM (http://llvm.org). In fact I highly recommend
that Squeak move from it's current obsolete C compilers to make use of
either of these two projects as the bottom of the VM. Apple is funding
LLVM and Ian's work seems to be part of the work of Alan Kay's
Viewpoints Research Institute (http://www.vpri.org).

The "non-destructive" assignment aspect of Erlang is typical of
non-write-in-place functional and object database systems. It's a key
aspect of the ZokuScript Object Database Management System and
Technologies. However it's not a panacea that the silver bullet utopians
think it is. As with any other solution matrix it has it's benefits,
payoffs, minuses and costs. These need to be balanced for every
application. As Wolfgang points out there are issues with it such as the
"cycle" problem that need to be overcome via implementation exceptions.

The other issue is how fine to you cut the objects? At what point do you
say enough is enough? That is at what point does a process say oh, I
don't really have control of changes to the object in question... as
that object is private to another object space. Thus control needs to be
passed to a process in the other object space likely on another compute
node. For example corporate security constraints may require that
certain data remain on the server while only permitting some data to be
shared with a laptop node running remotely.

It's important to consider the wider issues involved in distributed
systems that are to be deployed in the real world. For Smalltalk to
evolve we must get really serious about these issues ahead of the curve
that others are pursuing now.

It's shocking that systems like Flash MX's Javascript compatible
language has a few features that are more advanced than Smalltalk. It's
shocking that Flash is so popular even though the language also has
serious flaws - for example in it's handling of exceptions.

One of the tremendous strengths of Smalltalk is shared with Unix
systems. If you visit the Smalltalk versions page at Smalltalk.org
(http://Smalltalk.org/versions) you'll a great many versions of
Smalltalk. In fact the page isn't complete as there are older historical
versions of Smalltalk that are missing as well as a slate (no pun
intended) of Smalltalk and Smalltalk like languages that are missing
from the roster listed there. Smalltalk shares this proliferation aspect
with Unix. Count the Unix variants and it's in the hundreds if not
approaching thousands of distributions that have been or that are
available now. Linux alone has hundreds of variants.

Compare this variety with Java and Microsoft. They are stagnant with
just one thread of evolution. Smalltalk and Unix are undergoing a much
wider range of co-evolutionary development much of which is parallel and
much of which is divergent. Both aspects are important.

Divergence is important for strong vendors so that they can distinguish
their products and meet the needs of their set of vertical markets.

Parallel co-evolution, cross pollination and open sharing of code via
libraries and the ANSI Standard for Smalltalk (new version in the works
- please contribute) is important for the language as a unified entity.

Parallelism is one of the low level aspects that needs to be shared
openly between the vendors for such features to become "standard"
features. Otherwise parallelism across the vendors products will become
or remain hodgepodge (as it is now).

The same goes for the Graphical User Interface but that's an entirely
different conversation.

The basic point is that for a language to be expressible in itself it
means that ALL the computer science techniques used to implement the
language must be expressible in the language. It goes beyond this self
referential definition since the language must also be able to express
ANY computer science technique that is needed for the full range of
systems that will be implemented in it. To do less is to create a
language that is less than capable.

With the advances in static compiler just in time technologies (LLVM,
Code-Pepsi, etc...) that can co-exist with the C universe it's possible
for Smalltalk to become a full fledged systems language again as it once
was. To limit the language and prevent this from happening will create a
version of Smalltalk that simply only addresses the needs of a small
segment of the market.

Concurrency control issues are a very important aspect of any general
purpose programming language. To limit the solution space to a tiny
corner of solutions would be a mistake by design.

Certainly making concurrency easier and fool proof is a laudable goal.
However the cost might be too high a price if it's not done well or if
it alters the language beyond it's current shape.

One of the reasons that I'm implementing a new language, ZokuScript, is
that it does change the paradigm beyond that of Smalltalk. Keeping
connected with Smalltalk is done via ZokuTalk. However the execution
engine (not a virtual machine) will translate ZokuTalk (i.e. Smalltalk)
into ZokuScript and then compile it to native code. ZokuTalk and
Smalltalk are subsets of ZokuScript which is a fusion of many ideas and
concepts from other languages and - above all else - application
requirements.

The erlangification (erlangization, or erlangisation) of Smalltalk may
be a radical enough transformation that it's no longer Smalltalk. If
that's the way of Squeak that's fine however it seems that a fork is
likely the result (and yes, the pun of forking was intended).

Since the driver is the requirements and not just technology awe what
are the requirements for concurrency in Squeak and in Smalltalk (since
Squeak is diverging from Smalltalk more and more)?

Inventing the future is fun and hard work. Which future are you inventing?

All the best,

Peter William Lount

Igor Stasenko

Re: Multy-core CPUs

In reply to this post by pwl

On 23/10/2007, Peter William Lount <[hidden email]> wrote:

[ your message was here ]
> Peter William Lount
> [hidden email]
>
A BIG +1 to your point. You expressed most things which i had in mind
(i'm not native English speaker, so sometimes its hard to say what i
have in mind).

Absolutely, there is no magical cure for concurrency. And hoping that
we can deal with it by using __insert tool__ is an illusion.

And what is more frustrating (unfortunately), that a concurrency can
be solved only when we come to problem from both sides: VM and
language. By changing VM and not touching a bit in ST codebase we will
have a crappy solution. By changing ST codebase, but using old
one-threaded VM we have also crappy solution - while its very useful
for generic distributed computing its too ineffective for
computationally heavy problems (for example - raytracing). While you
can do things in parallel, but in the end you must gather results into
same memory space. Its good when your problem domain can be simply
splitted onto smaller parts, which can be computed in parallel and
overhead of objects serialization is too low comparing to time of
computing partial results. But as Peter said in general case we may
expect that overhead will be too high for some tasks, and some tasks
can't be parallelised at all making single OS process working, while
others simply hanging in memory eating space and consuming CPU
resources by computing nothing.

Also, by spawning parallel OS processes we hand over many aspects of
our parallel processing control to OS and losing many elegant and
simple solutions. For example, why i have to lose simple 'send and
receive' and be it replaced by 'send and pray' paradigm?

And don't take me wrong: i'm a big fan of spoon and Croquet islands,
still these solutions hardly can be considered as generic in
perspective of Multi-Core.
Lets make it clear: a Multi-Core is _NOT_ distributed computing. Yet
they have many in common, but to use them most effectively we need
different approaches.

--
Best regards,
Igor Stasenko AKA sig.

pwl

Re: Multy-core CPUs, ERLANG

In reply to this post by pwl

Hi,

Continued.

Of course one could also implement a copy-on-write-bit for objects in
the "read-only-shared-top-level-object-space-of-the-image". In order to
accomplish any work a process must be forked! Also, this way any process
that forks off will need to copy all of the objects it modifies into
it's own private object-space until the process commits it's changes
into the top level object-space or until it aborts. That's assuming a
Software Transactional Memory scheme is added to Smalltalk.

Actually this idea is quite appealing if done right.

Of course there are a host of other awesomely complex problems implied
by the above that a simple concurrency model will NOT solve.

Concurrency isn't like automatic garbage collection - which is actually
quite broad and complex a field - at all. The sets of problems with
concurrent systems are way more complex. This is especially the case
when you bring distribution beyond a single compute node into the fold
and especially when other issues such as distributed garbage collection
are required. Welcome to the complex world of tomorrow today.

What do the actual vendor's staff who write the virtual machines think
about this?

All the best,

Peter William Lount

Igor Stasenko

Re: Scalability [was: Multy-core CPUs]

In reply to this post by Klaus D. Witzel

On 23/10/2007, Klaus D. Witzel <[hidden email]> wrote:

> On Mon, 22 Oct 2007 19:10:49 +0200, Igor Stasenko wrote:
>
> > On 22/10/2007, Klaus D. Witzel wrote:
> >
> >> Not only slow-down :( For an example, have a look at the implementor of
> >> #debug:title:full: in class Process, where thisContext is assigned to a
> >> variable.
> >>
> >> When #ifTrue:ifFalse: is really sent, ([thisContext] class) is
> >> BlockContext *and* its sender is nil so the test for #hasContext: in the
> >> next statement fails.
> >>
> > I think this is because of optimization.
> > For BlockContext a sender should be a context of method
> > #ifTrue:ifFalse: (which sends #value to block). But compiler never
> > creates such context due to optimization. In this case, since compiler
> > 'cuts' the #ifTrue:ifFalse: out, then a correct context, i think,
> > should be a sender of #ifTrue:ifFalse?? but not nil.
> >
> >> But Squeak's compiler [usually] doesn't emit code for sending
> >> #ifTrue:ifFalse: so ([thisContext] class) is MethodContext and
> >> #hasContext: doesn't fail (in this example).
> >>
> >> > Then the only way how we could regain this loss is to use highly
> >> > parallelisable algorithms.
> >>
> >> ... which can be employed regardless of 'erlangenization' :)
> >>
> >
> > A trivial code comes in mind:
> >
> > (1 to: 1000) do: [:i | [ aBlock value:i ] fork ]
> >
> > but this leads to burden our parallel processes with scheduling.
> > I would like, instead, to be able to run a number of parallel branches
> > for same process (to schedule a process instead each of these
> > branches).
> >
> > (1 to: 1000) doInParallel: [:i | aBlock value ]
> >
> > I really don't like adding another abstraction like Thread, in
> > addition to Process. Maybe we should stick with a Process and have a
> > subclass of it, like ProcessNoScheduling.
>
> I think that the present multi-core CPU thread would benefit from having a
> look at what other people achieved in this area, I mean people who bet
> their whole carreer on optimizing resource allocation and resource
> scheduling, like for example this one
>
> Scalability of Microkernel-Based Systems
> - http://l4ka.org/publications/2005/uhlig_phd-thesis_scalability.pdf
> (just skip the few pages in German, the paper is in English)
>
> And one shouldn't care about that L4 folks are mainly concerned with OS
> components since that are objects like any other :) They [L4 folks] have
> minimalistic number of concepts and tough, very tough requirement
> definitions which have to be matched with reality ;-)
>
> But, the way I understand the present multi-core CPU thread, Squeak people
> aim to save the multi-core processing world by reinventing it :-D
>

Of course not. I have read somewhere a description of system which
uses such 'micro-kernel' architecture in mind. A system divided in
parts which communicate by establishing a 'contracts' - some kind of
agreement between system parts on protocols and security.

A Multi-Core CPU's working in parallel by having own 'private' memory
(cache) and shared memory. Why then VM can't do the same? All we need
to do, is to put this in use.

> > I'm just thinking, in what ways we can avoid excessive
> > scheduling/preempting?
>
> I think that you can find answers to this [with benchmarks and,
> comparisions also at the conceptual level] in the abovementioned paper :)
>
> > Or maybe, by following road of 'erlangisation' we should make a
> > Process more lightweight, so spawning thousands of them will not cause
> > a speed degradation.
>
> But it will. There are hidden constants associated with our present
> understanding of massive parallelism (you mentioned the cost of resource
> allocation and resource scheduling, add to that that messages can get
> lost, non-local updates for keeping the system viable, etc).
>
> And you have to find problems which can be solved with massive parallel
> threads/processes ;-)
>
> /Klaus
>
> >
> >
>
>
>
>

--
Best regards,
Igor Stasenko AKA sig.

Klaus D. Witzel

Re: Scalability [was: Multy-core CPUs]

On Tue, 23 Oct 2007 17:38:02 +0200, Igor Stasenko wrote:

> On 23/10/2007, Klaus D. Witzel wrote:
[...]
>> But, the way I understand the present multi-core CPU thread, Squeak
>> people
>> aim to save the multi-core processing world by reinventing it :-D
>>
> Of course not.

:)

> I have read somewhere a description of system which
> uses such 'micro-kernel' architecture in mind. A system divided in
> parts which communicate by establishing a 'contracts' - some kind of
> agreement between system parts on protocols and security.

Go ahead. Don't stop here. What's it about?

> A Multi-Core CPU's working in parallel by having own 'private' memory
> (cache) and shared memory. Why then VM can't do the same? All we need
> to do, is to put this in use.

I know you are familiar with the many aspects of the Squeak VM. Where
would you start with a parallelized VM, perhaps here

- http://en.wikipedia.org/wiki/Automatic_parallelization

which needs complex program analysis on the compiler side, or something
completely different?

/Klaus

Sebastian Sastre-2

RE: Multy-core CPUs, ERLANG

In reply to this post by pwl

Hi there,

I think that Peter posts are very pragmatic and edutative and are
making this discussion just richer.

Erlangization (and friends) of smalltalk message sends is, I'm
afraid, literally not possible. I'm afraid that Erlang simplifies reality
too much to archieve it's goal by abusing of the fact that it's paradigm is
based on data instead of arbitrary object graphs plus a use of processes and
messages ala object paradigm. That way it archieves management of about 100k
processes (have in mind that for example right now the squeak image I'm
developing on has.. about 754794 subinstances of protoobject).

But in this discussion we are exploring, and maybe defining bases
of, what could be a better cocktail of technologic techniques that brings to
smalltalk such an efficient use of the multi-core incoming hardware. This
should be a pacific point for us all.

Now.. Peter, you said in a previous post that implementing a
smalltalk that does not share is not possible. But then you said, if I
understood you right, that if we found a solution to "fine cuting objects"
and a little of transactional memory we can open the door to an appealing
solution space that solves, for instance, your million object model graph.

I'm curious about what problems are left ouside with a concurrency
solution space like that.

In fact even in such a new field I think that an enumeration of the
solution spaces with it's requisites (promoted changes), pros and cons is
necessary to help us all to order ideas (to keep it as candidates or
discarded).

The community should judge about what space will be the one that
solves the most valuable solution space (brings the most promisory solutions
to the most frequent problems of this community). I also think we need to
give more time for community tides to cook a good solution/s in this world
of increasing complexity.

cheers,

Sebastian Sastre
PS: remember that complexity was allways out there. The problem is just that
we are trying to make something useful with less rudimentary models of it
using atumatons as adobe.

> -----Mensaje original-----
> De: [hidden email]
> [mailto:[hidden email]] En
> nombre de Peter William Lount
> Enviado el: Martes, 23 de Octubre de 2007 11:52
> Para: The general-purpose Squeak developers list
> Asunto: Re: Multy-core CPUs, ERLANG
>
> Wolfgang Eder wrote:
> > [more stuff snipped]
> >
> > Hello all,
> > I think that Erlang does have mechanisms to share stuff between
> > processes. First, the code is shared.
> > When I update a module, all processes using the code of the module
> > will (eventually) switch to the new version.
> > And then there is the Mnesia database and its parts that
> can be used
> > to share data between processes.
> >
> > And, slightly off topic probably:
> > One thing that strikes me as remarkable about the Erlang system is
> > that, since there is non-destructive assignment, you cannot have
> > cycles in your object graphs. I think this simplifies the GC
> > tremendously.
> > But I can think of no way of doing something similar with Smalltalk
> > objects, unfortunately.
> >
> > Cheers,
> > Wolfgang
>
> Hi,
>
> That's interesting. Thus Erlang DOES IN FACT HAVE SHARED
> MEMORY between
> processes: for code and for data. I'd like to learn more about that.
> Could anyone provide more details?
>
> One proposal was a "copy-on-write" object space model where
> objects that are about to be written to in a Smalltalk
> process would be copied to that processes private object
> space - in effect that processes view of the "image".
>
> To implement a copy-on-write technique would require
> operating system support for the typical modern mainstream
> operating system. To implement copy-on-write requires a
> synchronization primitive to be used by the operating system
> - if I'm not mistaken - at least for a few instructions while
> the page tables are updated - a critical section.
>
> To implement copy-on-write requires a language to have an
> ability to go beyond the Erlang style of concurrency capabilities.
>
> One of the crucial aspects that Alan Kay (and others) have
> promoted over and over again is the ability of a language to
> be expressed in itself.
> This has a certain beauty to it as well as a mathematical
> aesthetic that has important ramifications that go way beyond
> those characteristics. To have a "mobius" system that can
> rewrite itself while retaining functioning versions across a
> continuous evolutionary path one requires a system that can
> be expressed in itself. Alan Kay points to a page in the Lisp
> Manual where Lisp is implemented in itself. Since Smalltalk
> is supposed to be a general purpose programming language it
> is crucial that it have this aspect of being able to
> implement itself with itself. So far Squeak comes close to
> this - at least with respect to the virtual machine which is
> written in the slang subset of Smalltalk. Unfortunately
> Squeak relies upon manually written C files for binding with
> the various operating systems. Co-existence with C based
> technology has it's price and it's high in that it blocks
> access to the entire system from within the system; by being
> blocked one is prevented from online interactive exploration
> and experimentation that we are used to at the Smalltalk
> source code level. At least this is being addressed in the
> amazing work of Ian Piumarta
> (http://piumarta.com/pepsi/pepsi.html) and the incredible
> work of LLVM (http://llvm.org). In fact I highly recommend
> that Squeak move from it's current obsolete C compilers to
> make use of either of these two projects as the bottom of the
> VM. Apple is funding LLVM and Ian's work seems to be part of
> the work of Alan Kay's Viewpoints Research Institute
> (http://www.vpri.org).
>
> The "non-destructive" assignment aspect of Erlang is typical
> of non-write-in-place functional and object database systems.
> It's a key aspect of the ZokuScript Object Database
> Management System and Technologies. However it's not a
> panacea that the silver bullet utopians think it is. As with
> any other solution matrix it has it's benefits, payoffs,
> minuses and costs. These need to be balanced for every
> application. As Wolfgang points out there are issues with it
> such as the "cycle" problem that need to be overcome via
> implementation exceptions.
>
> The other issue is how fine to you cut the objects? At what
> point do you say enough is enough? That is at what point does
> a process say oh, I don't really have control of changes to
> the object in question... as that object is private to
> another object space. Thus control needs to be passed to a
> process in the other object space likely on another compute
> node. For example corporate security constraints may require
> that certain data remain on the server while only permitting
> some data to be shared with a laptop node running remotely.
>
> It's important to consider the wider issues involved in
> distributed systems that are to be deployed in the real
> world. For Smalltalk to evolve we must get really serious
> about these issues ahead of the curve that others are pursuing now.
>
> It's shocking that systems like Flash MX's Javascript
> compatible language has a few features that are more advanced
> than Smalltalk. It's shocking that Flash is so popular even
> though the language also has serious flaws - for example in
> it's handling of exceptions.
>
> One of the tremendous strengths of Smalltalk is shared with
> Unix systems. If you visit the Smalltalk versions page at
> Smalltalk.org
> (http://Smalltalk.org/versions) you'll a great many versions
> of Smalltalk. In fact the page isn't complete as there are
> older historical versions of Smalltalk that are missing as
> well as a slate (no pun
> intended) of Smalltalk and Smalltalk like languages that are
> missing from the roster listed there. Smalltalk shares this
> proliferation aspect with Unix. Count the Unix variants and
> it's in the hundreds if not approaching thousands of
> distributions that have been or that are available now. Linux
> alone has hundreds of variants.
>
> Compare this variety with Java and Microsoft. They are
> stagnant with just one thread of evolution. Smalltalk and
> Unix are undergoing a much wider range of co-evolutionary
> development much of which is parallel and much of which is
> divergent. Both aspects are important.
>
> Divergence is important for strong vendors so that they can
> distinguish their products and meet the needs of their set of
> vertical markets.
>
> Parallel co-evolution, cross pollination and open sharing of
> code via libraries and the ANSI Standard for Smalltalk (new
> version in the works
> - please contribute) is important for the language as a
> unified entity.
>
> Parallelism is one of the low level aspects that needs to be
> shared openly between the vendors for such features to become
> "standard"
> features. Otherwise parallelism across the vendors products
> will become or remain hodgepodge (as it is now).
>
> The same goes for the Graphical User Interface but that's an
> entirely different conversation.
>
> The basic point is that for a language to be expressible in
> itself it means that ALL the computer science techniques used
> to implement the language must be expressible in the
> language. It goes beyond this self referential definition
> since the language must also be able to express ANY computer
> science technique that is needed for the full range of
> systems that will be implemented in it. To do less is to
> create a language that is less than capable.
>
> With the advances in static compiler just in time
> technologies (LLVM, Code-Pepsi, etc...) that can co-exist
> with the C universe it's possible for Smalltalk to become a
> full fledged systems language again as it once was. To limit
> the language and prevent this from happening will create a
> version of Smalltalk that simply only addresses the needs of
> a small segment of the market.
>
> Concurrency control issues are a very important aspect of any
> general purpose programming language. To limit the solution
> space to a tiny corner of solutions would be a mistake by design.
>
> Certainly making concurrency easier and fool proof is a
> laudable goal.
> However the cost might be too high a price if it's not done
> well or if it alters the language beyond it's current shape.
>
> One of the reasons that I'm implementing a new language,
> ZokuScript, is that it does change the paradigm beyond that
> of Smalltalk. Keeping connected with Smalltalk is done via
> ZokuTalk. However the execution engine (not a virtual
> machine) will translate ZokuTalk (i.e. Smalltalk) into
> ZokuScript and then compile it to native code. ZokuTalk and
> Smalltalk are subsets of ZokuScript which is a fusion of many
> ideas and concepts from other languages and - above all else
> - application requirements.
>
> The erlangification (erlangization, or erlangisation) of
> Smalltalk may be a radical enough transformation that it's no
> longer Smalltalk. If that's the way of Squeak that's fine
> however it seems that a fork is likely the result (and yes,
> the pun of forking was intended).
>
> Since the driver is the requirements and not just technology
> awe what are the requirements for concurrency in Squeak and
> in Smalltalk (since Squeak is diverging from Smalltalk more and more)?
>
> Inventing the future is fun and hard work. Which future are
> you inventing?
>
> All the best,
>
> Peter William Lount
>

Sebastian Sastre-2

RE: Scalability [was: Multy-core CPUs]

In reply to this post by Igor Stasenko

> > But, the way I understand the present multi-core CPU thread, Squeak
> > people aim to save the multi-core processing world by
> reinventing it
> > :-D
> >
> Of course not. I have read somewhere a description of system
> which uses such 'micro-kernel' architecture in mind. A system
> divided in parts which communicate by establishing a
> 'contracts' - some kind of agreement between system parts on
> protocols and security.
>
> A Multi-Core CPU's working in parallel by having own 'private' memory
> (cache) and shared memory. Why then VM can't do the same? All
> we need to do, is to put this in use.
>

I think because we don't have a machinery to comunicate VM's without
disrupting the user experience or making the N-cores thing to make a real
difference. But if a machinery for that is found acceptable enough for most
problems domains then it will become appealing (as the N spoons idea).

Cheers,

Sebastian

Jason Johnson-5

Re: What about "Erlanging" the smalltalk interstitial space? (was RE: Multy-core CPUs)

In reply to this post by Sebastian Sastre-2

On 10/22/07, Sebastian Sastre <[hidden email]> wrote:
>
> 1.4 I'm interpreting the normal thousands of processes Erlang
> manages as analogue (so similar in function) to one of this tides and point
> why we cannot make tides of Smalltalk message sends in an image?

We can. In fact, I think we could do it right now with the existing
VM and some packages that have been written already.

> 3. If I understood well, Erlang's main strenght is not that it has
> functions to do things but that it has a message passing technique really
> great that was designed to take advantage of parallelism in a simple way
> efficiently and making processes very cheap (ceaper than the OS ones).

Erlang processes are much cheaper then OS processes *and* OS threads.
But then, so are Smalltalk's. The only difference is that Erlang
processes are encapsulated entities that have no shared memory [1],
while Smalltalk's do.

> The intersticial space of virtual objects, AKA messages sends, can
> be "Erlanged" by making each message send to be in a "process" (the cheap
> ones) of it's own like Erlang messages between "process"?

Well, keep in mind, Erlang is a functional language and code written
in uses functions. Message sends are for communicating between
processes.

> Same in other words:
>
> What consequences will affect us if we make a Squeak to use a VM
> that to pass messages use processes ala Erlang (that are simple, cheap,
> efficient and scalable in a world of "cores")?

If you mean every Smalltalk message send is what an Erlang message
send is, then the results would be devastating. As I mentioned above,
Erlang does it's work with functions. In Smalltalk, the equivalent
method of doing work is what Smalltalk calls "messages". In Erlang
there is a concept of sending messages between processes, and I would
do the same for Smalltalk.

> Can this allow us to assimilate in the Smalltalk's objects paradigm
> the Erlagn's process paradigm? This is: will this allow us to gain that
> parallelizing benefits preventing us to change the programing paradigm?

We can, and it would [2]. But I think we should, at least at first,
make inter-process message sends very obviously different from
inter-object message sends. It would be possible, for example, that
objects of type "Process" have a different way of handling messages so
that:

(Processes at: 'bank account') addUSD: 5000

Is actually an inter-process send, but I would still want to use the !
syntax, or something equivalent so it's completely obvious that we are
doing something different.

Inter-process message sends have their own lookup complexity that I
think should be separate from the inter-object message sends we have
now. For example, in Erlang if you send a message to a process that
happens to be in the same image, a simple reference copy happens (no
danger since variables are immutable). The other two cases would be:
a different OS/native thread in the same image, and a totally
different image (same computer or on the network).

Now if what you're talking about is basically promoting every object
to it's own process (using the terms I have described so far), then I
haven't really given this much though. This would be a totally
different paradigm and area of research (maybe like CORBA?). Though
I'm sure someone somewhere has done research on it (or is currently).
:)

> Sebastian Sastre
> PD: I've tried to imagine if this saves us from having to make code trhead
> safe or not. I was unable to refutate this by myself so I also ask kindly
> that the most experienced and critic minds collaborate on this regard.

Us as people who use Smalltalk? Yes I believe it does. It doesn't
make it impossible to make a design that has deadlocks, but imo the
big win is that these concerns move to design time instead of
implementation time.

[1] This is enforced by the language. As far as I know, the processes
actually share the same heap, etc.. They just don't know it and can't
take advantage of it :)

[2] Well, I believe so anyway, and aim to find out. The issue is that
Erlang has it easy: you *can't* share data at a language level in
Erlang. In Smalltalk this gets a little tricky, mainly due to one of
Smalltalk's greatest strengths: Classes are live objects (with state
and so on). I believe this can be overcome, while preserving the
Smalltalk semantics, but I can't prove it yet. :)

1234567 ... 10