Travis Griggs wrote:
> On the how to do it for Smalltalk. The right way IMO is to support the
> existence of multiple instantiations of the ProcessScheduler object. You
> can create as many of these for as many CPUs as you have. Or even more.
> And yet you still get the lightweight local threads. Native threads in
> the context of a singular CPU are the same thing as they are in
> Smalltalk: a handy illusion.

What I dislike about this solution, as well as about running multiple images, is that we have to /manage/ threads. IMO we should look at mechanisms where explicit thread management can be hidden and/or automated. We have learned that lesson for memory allocation; why oh why don't we have the knee-jerk reaction to demand the same for CPU allocation?

Jecel Assumpcao has explored such a mechanism in the past for Self: at /every/ message send the VM decides if it could create a new thread to execute that send in parallel. This alters the Smalltalk semantics slightly but leads to a lot of parallelizing opportunities that can be handled without programmer effort.

This is his original paper:
http://www.lsi.usp.br/~jecel/jpaper1.ps.gz
with the enticing quote:
"So, even with top level messages defined as sequential, one in every four sends potentially adds to the parallelism of a program"

and here is more recent work as done on TinySelf1:
http://www.lsi.usp.br/~jecel/tiny.html#rel1

Less is more: the cheapest way for a program to use multi threading is to *not* need to specify it....

R
-

> On the why it's hard. At least two Smalltalks have been multithreaded.
> Smalltalk MT is. And in a weird sort of way, so is ST/X. Mapping to
> native threads isn't *that* hard. But like the guys who did
> Hyperthreading found it, the really tricky part is the memory management
> in said situation. Multi processing in a single coherent memory model is
> tough. Tough to do efficiently. And this is a particular problem for
> Smalltalk, which wants to create lots and lots and lots of transient
> memory as we create the object illusion. IOW, the tricky part IMO of a
> multithreaded VW is the Garbage Collector. It is particularly good at
> what it does (no other Smalltalk does better). ST/X interestingly
> associates ST threads with native threads, but does the schedule control
> itself (i.e. does not allow the OS to run them in apparent concurrence).
> This means that the access to the GC API can remain effectively serial.
>
> --
> Travis Griggs
> Objologist
> "There are a thousand hacking at the branches of evil to one who is
> striking at the root" - Henry David Thoreau

|
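As a rough illustration of what such a scheme would do implicitly at a send site - written out by hand here with a plain fork and Semaphore, not Jecel's actual mechanism - the VM would evaluate an argument send in parallel and synchronize only where its result is needed:

    | result done |
    done := Semaphore new.
    "The 'argument send' runs in its own lightweight process..."
    [result := (1 to: 1000000) inject: 0 into: [:sum :each | sum + each].
     done signal] fork.
    "...while the sender carries on with independent work..."
    Transcript show: 'doing unrelated work meanwhile'; cr.
    "...and blocks only at the point where the value is actually used."
    done wait.
    Transcript show: result printString; cr.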
Hello Reinout,
I would tend to agree with this analysis. As I see it, there is no reason why the same principles of hiding or automating thread management could not be applied to shield programmers from having to deal with explicit mechanisms. I liked your phrasing: "a lot of parallelizing opportunities that can be handled without programmer effort." Well stated. I think we should keep this discussion moving.

Respectfully,
David

----- Original Message -----
From: "Reinout Heeck" <[hidden email]>
To: "VW NC" <[hidden email]>
Sent: Wednesday, February 21, 2007 8:12 AM
Subject: Re: VisualWorks and support for native multi-threading

|
In reply to this post by Reinout Heeck-2
On Feb 21, 2007, at 5:12, Reinout Heeck wrote:
24% is the number the paper cites. And that static analysis of a Self program is followed by this statement: "Many practical factors will drastically reduce this number".

So the basic game it's playing is noting that in this case:

    someObject someMessage: anObject doesAnotherMessage

doesAnotherMessage could potentially be executed to some degree of parallelism, while we go ahead and get started with the someMessage: invocation. As soon as we get to the point in someMessage: where we need the result of doesAnotherMessage, we have to wait.

In a rosy world, where the 24% really was realizable, the overhead of adding the "lists of messages" and async return could add up to 24% before it was a wash. And then it would be only a wash, not worth the added complexity. We'd have to have it down in more like the 5% overhead realm for it to make any appreciable improvement for the system. Furthermore, as the number 24% is "drastically reduced", the amount of overhead you can spend on this goes down too. So if we're really only going to get say 10% realistically, we need more like 2% overhead.

Then there's the fact that there's Smalltalk code out there that would have issues with this:

    self handleToken: self nextToken andOther: self nextToken.

There is no way for the Merlin approach to know that any side effects caused by nextToken should be done in order, left to right.

The paper is very cool though. :)

--
Travis Griggs
Objologist
"I think that we should be men first, and subjects afterward." - Henry David Thoreau

|
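To make the ordering hazard concrete, here is a workspace-style sketch (the stream-backed tokenizer is invented for illustration): each nextToken-style send has a side effect, and only sequential left-to-right evaluation keeps the outcome deterministic.

    | input first second |
    input := ReadStream on: #(#if #condition #then).
    "Smalltalk evaluates argument expressions left to right,
     so #if is consumed before #condition:"
    first := input next.
    second := input next.
    Transcript show: (Array with: first with: second) printString; cr.
    "Prints (#if #condition); a VM that transparently parallelized the
     two sends could just as easily deliver (#condition #if)."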
In reply to this post by Reinout Heeck-2
While I don't disagree with you Reinout, and I think the goal is noble, I don't think it's in Cincom's best interest to put research dollars into this problem.

If we look at the state of play wrt threaded VMs, the most successful one so far took years to make. This suggests that the approach isn't right. Jecel's research tries to find ways to do things simpler - but as Travis points out this may have issues in Smalltalk (although I think he's wrong with his example).

What I expect should happen is someone goes off and makes the simplest Smalltalk implementation they can make (an interpreter) and then proves the best way to make it natively multi-threaded, does the various benchmarks, stability proofs, etc. In other words, shows the rest of the Smalltalk world how it -can- be done.

Some time down the track, after that sort of initiative is successful, you'd see commercial vendors taking an interest. However, I don't think you'll see Cincom wanting to fund that investment until they can see a clear path that will definitely succeed. It's smart to be risk averse when talking about such wildly sweeping changes to the VM.

Cheers,
Michael

|
Ditto. One should also keep in mind that Cincom is a company with existing customers that all want X, Y and Z to get done as soon as possible. With Q resources that are enough to cover that + some reasonable amount of profit, one just can't go off and say "oh, that sounds like a neat idea, you guys should do that". It's a much larger project than some people seem to think, with benefits unknown at this point. Let them finish Pollock first :)

Cheers!

-Boris

--
+1.604.689.0322
DeepCove Labs Ltd.
4th floor 595 Howe Street
Vancouver, Canada V6C 2T5
http://tinyurl.com/r7uw4
[hidden email]

> -----Original Message-----
> From: Michael Lucas-Smith [mailto:michael.lucas-[hidden email]]
> Sent: Wednesday, February 21, 2007 11:44 AM
> To: Reinout Heeck
> Cc: VW NC
> Subject: Re[2]: VisualWorks and support for native multi-threading

|
In reply to this post by Michael Lucas-Smith
Michael Lucas-Smith wrote:
> While I don't disagree with you Reinout and I think the goal is
> noble.. I don't think it's in Cincom's best interest to put research
> dollars in to this problem.

Yes, please. Let them finish the urgent basics first. After 30 years of "academic" Smalltalking, there's still a lot to be done in order to make VW keep up with current demand.

> [...]
>
> What I expect should happen is someone goes off and makes the
> simplest smalltalk implementation they can make (an interpreter) and
> then prove the best way to make it natively multi-threaded, do the
> various benchmarks, stability proofs, etc. In other words, show the
> rest of the Smalltalk world how it -can- be done.

Great idea. Take Squeak. It's open source, close to ST-80 and the perfect lab rat.

Andre
(sorry, couldn't resist)

|
In reply to this post by Michael Lucas-Smith
It seems like the COLA systems that Ian Piumarta is working on are the perfect place to experiment with new paradigms for multi-processor (or new paradigms in general). You truly get the simplest Smalltalk implementation, implemented in itself.

For a long time Alan Kay has been talking about messages. Erlang has shown that you can scale with lots and lots of lightweight threads. It seems like an object system with asynchronous messaging built on the COLA object model is the future. I just wish I was smart enough to build it.

Mike

> What I expect should happen is someone goes off and makes the
> simplest smalltalk implementation they can make (an interpreter) and
> then prove the best way to make it natively multi-threaded, do the
> various benchmarks, stability proofs, etc. In other words, show the
> rest of the Smalltalk world how it -can- be done.

|
In reply to this post by Travis Griggs-3
Consider the obvious mega trend that has recently begun... Would anyone want to buy a computer today that isn't at least dual core? In ten years the market may expect hundreds or thousands of cores for the typical PC. That creates a market demand for software that can take advantage of the parallel processing power. 24% overhead is nothing if it means work can be automatically distributed over 100 cores that would otherwise be idle. Similarly, 64-bit has 100% space overhead compared to 32-bit, but the overhead is increasingly worth the cost in order to increase capacity.

The people that become wealthy over the next couple decades will be the ones that take advantage of this market opportunity at the right time. Now is the time to explore this technology--not after Sun developers figure it out. At this time, a focus on reducing overhead is lower priority than becoming established with mind-share. The overhead matters after there is competition in the market; by then one can develop the financial and intellectual strength to whip out lower-overhead solutions just-in-time. The difference between a similar solution ten years old and one announced next year is that next year people will notice and care.

Java won when demand shifted to the internet. Consider how well suited Smalltalk syntax is for this next demand shift. We could get it done with a fraction of the complexity that other languages might have to deal with. I don't look to Cincom to make the investment; their acquisition history shows they value predictable revenue streams over speculative investments--not that there is anything wrong with that. This technology is for the next generation despite it being just a VM upgrade. A Squeak-based solution with fancy packaging is most likely to capitalize on the trend first.
Paul Baumann
|
> Would anyone want to buy a computer today that isn't at least dual
> core?

Yeah. If you offer me a 4ghz cpu over dual 2ghz, I'd take it in a heartbeat.

> The people that become wealthy over the next couple decades will
> be the ones that take advantage of this market opportunity at the
> right time.

That's a critical statement. First to market never wins.

> Now is the time to explore this technology--not after
> Sun developers figure it out.

I disagree. Let them figure it out so that we don't have to.

> At this time, a focus on reducing
> overhead is lower priority than becoming established with
> mind-share.

Mindshare in the current crowd is marginal. A strong commercial offering once multiple cores are the de facto standard on the desktop or server space would be the time to strike with a Sun-esque marketing blitz.

> by then one can develop the financial and intellectual
> strength to whip out lower-overhead solutions just-in-time.

I don't think that follows. Mindshare ~~ $$$.

> The
> difference between a similar solution ten years old and one
> announced next year is that next year people will notice and care.

Nah, they won't. It still doesn't matter enough. Dual or quad cores are basically being used to separate heavy cpu processing applications... not to actually parallelize single tasks - except in video processing, rendering and other such straight-forward number crunching tasks. (Example: the dual core macs couldn't play high def if their video playback codec didn't use both cores.)

> Java won when demand shifted to the internet.

They did this by not being first to market. They didn't have to figure out how to do internet programming, they didn't have to figure out how to do OO, they didn't even have to figure out how to make a VM. They just combined the elements together when there was enough CPU commonly available on desktops to deal with a poorly optimized VM based language. Smalltalk was too early to market - the machines couldn't keep up with it. Java was smart timing and not having to figure out anything tricky themselves.

> Consider how well
> suited Smalltalk syntax is for this next demand shift. We could get
> it done with a fraction of the complexity that other languages
> might have to deal with.

I don't think that's entirely obvious. Smalltalk syntax, with its ordering semantics, may actually be a bad fit for parallelized processing. Functional programming parallelizes really well. Anyway, we won't know how Smalltalk semantics fare until someone tries it.

|
In reply to this post by Paul Baumann
On Feb 21, 2007, at 16:30, Paul Baumann wrote:
But... I don't understand what you're saying. Or I'm confused that you didn't really read what I wrote.

24%. That means that at absolute best your other cpu could be working ~25% of the time. 75% of its cycles are wasted. If you throw a quad core at it, you get to have each of the extra 3 cpus run at 8% of their ability. A pretty high cost to pay for all these cores, just to have them rarely used.

But the real killer, Paul, is that little statement "drastically reduced." I was generous. I gave it a 1/2 cut. In reality, when I see a phrase like that, I read "an order of magnitude" less. So you're excited about rewriting the VM, dealing with the deferred side-effect edge cases that I gave an example of, all so your new quad core can swap off 0.6% of the execution time to your other three processors transparently?

I didn't say that I thought doing something about multithreading was a bad idea. What I questioned was the notion that there should be some sort of transparent magic that just opens up all of these multi cores to us. Or rather, I questioned the particular paper cited by Reinout. Sure, I'd love a magic wand that did that. So you're preaching to the choir when it comes to wishing there were easier ways to take advantage of the growing number of multi cores. But you're preaching to the skeptic if you think that the Merlin thing is some low hanging fruit that we should turn-n-burn on.

--
Travis Griggs
Objologist
"I think that we should be men first, and subjects afterward." - Henry David Thoreau

|
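For the record, the utilization figures above fall straight out of the paper's number (taking the 24% as an assumed best case), as a workspace snippet:

    | parallelShare extraCores |
    parallelShare := 0.24. "the paper's static upper bound"
    extraCores := 3. "a quad core's three extra CPUs"
    Transcript show: (parallelShare / extraCores) printString; cr. "0.08, i.e. 8% per extra core"
    "Reduce the 24% 'drastically', say by an order of magnitude,
     and the offloaded share per extra core drops below 1%."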
In reply to this post by Michael Lucas-Smith
Michael:
In response to your words shown in quotes...

You should read my reply if you thought I suggested that the first to a market wins. Research into multi-threaded application design is not new--there are probably thousands of university papers on the topic. I spoke of timing for emerging market opportunities just as you spoke of "smart timing".

True, there are some people (like gamers) that would prefer a faster single core to get the ultimate performance for a single application. I don't see that Intel has achieved a marketable 4ghz single cpu; that would be a valuable chip. The investment is in multi-core chips and most people want to purchase them today even though they don't immediately benefit.

"Let them figure it out so that we don't have to"--come on, it isn't that difficult. It is queues and consumers with some coordination and branching conditions. Doing it efficiently is very important, but I gave my opinion why that may not be the top priority at this time. This kind of research isn't new. The application of it could be new. Use by a broad market would be the goal. Emerging demand would be the reason.

"Mindshare in the current crowd is marginal"... "It still doesn't matter enough." Yeah, that is why I spoke in terms of an emerging opportunity and mentioned a one-year time frame to develop product that could be used to gain that mindshare.

"A strong commercial offering once multiple cores are the de facto standard on the desktop or server space would be the time to strike with a Sun-esque marketing blitz"... Sun had resources that were built slowly over time to aim toward development and marketing. Waiting until demand reaches a climax leaves you vulnerable to deeper-pocket competition and alternate solutions. I prefer the approach of building resources over time and using that capital to define the market itself--before the capacity of the market is understood by others. If done well, the market should appear defined and saturated at all times to discourage competition.

"They [Sun] just combined the elements" It is unfortunate that Smalltalkers restrict themselves to inventing new things again and again. :)

"Mindshare ~~ $$$" Mindshare and money are resources that can be transformed. I never said mindshare and "$$$" were identical or even equal. There is however a Law of Attraction to resource excesses of any form. Mindshare can attract money, money can attract mindshare. Resources can also attract opportunists that would like to steal your resources. With preparation, all can be transformed to your benefit.

Travis:

The point I tried to make was that even a seemingly high overhead can be tolerated in initial releases so long as it provides some benefit to even a limited initial market (of perhaps 16 core CPUs). I agree that dual or quad processors won't benefit from that amount of overhead. It is well beyond quad that I'm thinking--quad may be the low end by next year. I felt the need to respond why it could be a good time to start development in this area. Your interest in this area is obvious.

Paul Baumann

-----Original Message-----
From: Michael Lucas-Smith [mailto:[hidden email]]
Sent: Wednesday, February 21, 2007 9:30 PM
To: Paul Baumann
Cc: Travis Griggs; VW NC
Subject: Re[2]: VisualWorks and support for native multi-threading
Importance: High

|
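The "queues and consumers with some coordination" shape Paul mentions looks roughly like this in plain Smalltalk-80, using SharedQueue (a toy sketch only; doing it efficiently is, as discussed above, the hard part):

    | work results |
    work := SharedQueue new.
    results := SharedQueue new.
    "Three consumers, each pulling jobs until told to stop:"
    3 timesRepeat:
        [[| job |
          [(job := work next) ~~ #stop] whileTrue:
              [results nextPut: job * job]] fork].
    "A producer feeds the queue, then shuts the consumers down:"
    1 to: 10 do: [:i | work nextPut: i].
    3 timesRepeat: [work nextPut: #stop].
    10 timesRepeat: [Transcript show: results next printString; cr].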
In reply to this post by Travis Griggs-3
[Grr, my reply last night went to Travis only. Sorry Paul...]
Travis is quite right. The point to remember about multicore processors is that the chip still only has one interface to main memory and each core shares that interface (and the cache and (typically) the TLBs) - remember many are socket-compatible with single-core processors. If one has a memory-bound application, throwing a multicore at it can do nothing to increase performance. Large symbolic applications, a mainstay of server applications in OO languages, are such beasts.

Whether this point is easy to enunciate against the marketing noise is, however, an entirely different question.
|
Perhaps that is what GemFire addresses through the socket layer. I'd assumed things like that had already been worked out efficiently at the hardware level. I guess I had read the hardware marketing noise about shared memory cache and such, so I didn't catch those details. Thanks for the information. Does it seem like those issues might be worked out any time soon? If not, it is a replication issue that has a whole set of issues and costs that are fortunately very familiar.

This article was recently brought to my attention. I fear it blows the possibility of a casual entrance. Efficiency would be critical to get noticed.

I'm thinking that performance could be achieved by being selective about what work gets done in separate processors. Doing every operation through a queue was interesting for research but probably not practical for real applications. That selection can be optimized.

Sorry, I'm getting out of my realm of experience. I hope somebody figures it out. It seems like interesting work.

Paul Baumann
From: Eliot Miranda [mailto:[hidden email]]
Sent: Thursday, February 22, 2007 7:20 PM
To: VW NC
Subject: Re: VisualWorks and support for native multi-threading
|
In reply to this post by Eliot Miranda-2
On Feb 22, 2007, at 16:19, Eliot Miranda wrote:

> [Grr, my reply last night went to Travis only. Sorry Paul...]

The language of note that would seem to come closest to the notion of "don't worry about your threading" would be Erlang (I'm sure there are more noteworthy research languages, but I'm aware of none with any degree of popularity). The Erlang guys were recently excited about the ability to exploit multi cores. You can google for the results. One of the interesting sites I found was a SIP stack which had no coding changes made to it. When run on a quad core, it ran 1.8 times faster than on the single core.

--
Travis Griggs
Objologist
One man's blue plane is another man's pink plane.

|
Travis Griggs wrote:
> The language of note that would seem to come closest to the notion of
> "don't worry about your threading" would be Erlang [...]. The Erlang guys
> were recently excited about the ability to exploit multi cores. [...] One of
> the interesting sites I found was a SIP stack which had no coding
> changes made to it. When run on a quad core, it ran 1.8 times faster
> than on the single core.

Erlang is definitely worth trying, but it is certainly NOT one of those "don't worry about your threading" languages, quite the contrary. But because it uses a completely different idiom, you don't have to mess with semaphores and all that (ugly low-level) stuff. In fact these are not present in the language at all. In Erlang, each process has its mailbox for all incoming messages and it can decide when and what messages to read from it. It can also send asynchronous messages to other processes. And that's pretty much it. The rest is pure functional programming of the internals of the process that operates on its internal state. But you can still encounter race conditions, only in a slightly different form:

Process A sends a message to B and C. When B receives the message, it queries C for something that C should compute as a result of receiving the message from A. But there is no guarantee in which order C will receive the message from A and the other (query) from B.

The language guarantees only that when process A sends messages to process B, B will always receive them in the order they were sent.

But working, debugging and testing multiple processes is really MUCH easier in Erlang than in anything else I've ever used. It (almost) works like this: what you would model as objects in Smalltalk, you model as processes in Erlang.

The smart decision the Erlang authors made was that there will be no memory sharing, and that processes and the scheduler will be part of the language. This enabled them to create a high-level layer which makes the work with processes location transparent - once you have a pid of a process, you can send it messages no matter where it really is (on which node). This transparency is really great, because once you have a program, you can run it on more nodes (computers) or start one node on a multicore CPU.

Ladislav Lenart

|
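A rough Smalltalk analogue of the mailbox model described above, with a SharedQueue standing in for the Erlang mailbox (a sketch only; real Erlang adds selective receive, enforced no-sharing, and location transparency on top of this):

    | mailbox |
    mailbox := SharedQueue new.
    "The 'process': private state, driven purely by its mailbox."
    [| state msg |
     state := 0.
     [(msg := mailbox next) ~~ #stop] whileTrue:
         [state := state + msg.
          Transcript show: 'state is now ', state printString; cr]] fork.
    "Asynchronous sends from other 'processes':"
    mailbox nextPut: 1.
    mailbox nextPut: 41.
    mailbox nextPut: #stop.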
I feel that using something like the Actalk framework and a transparent distribution framework, e.g. Opentalk-STST, you can get pretty close to the Erlang model. The one thing you'd need to be careful about is making sure that things get passed by value (even in local calls) instead of by reference. Although I have doubts about that being a universal rule. What happens in Erlang when you pass some kind of system object like the scheduler? I feel that maybe the programmer should fine-tune what gets passed by value and what by reference.
|
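A tiny illustration of the pass-by-value caveat: copying an argument before handing it to another process keeps the sender's state safe from mutation, at the price of identity (a plain #copy here; deeply nested structures would need a recursive copy):

    | original sent |
    original := OrderedCollection withAll: #(1 2).
    sent := original copy. "pass by value: the receiver gets its own object"
    sent add: 3.
    Transcript show: original printString; cr.
    "Still prints only (1 2): identity is broken, but so is the data race."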
> The one thing you'd need to be careful about is making sure that things
> get passed by value (even in local calls) instead of by reference.

Martin et al,

It's not my intention to hijack this thread, but I've often wondered in what STST scenarios it makes more sense to pass by reference than by value. I have always done it by value. TIA.

John Treble

> -----Original Message-----
> From: Martin Kobetic [mailto:[hidden email]]
> Sent: February 23, 2007 9:44 AM
> To: Ladislav Lenart
> Cc: [hidden email]
> Subject: Re: VisualWorks and support for native multi-threading

|
John Treble wrote:
>> The one thing you'd need to be careful about is making sure that things
>> get passed by value (even in local calls) instead of by reference.
>
> Martin et al,
>
> It's not my intention to hijack this thread, but I've often wondered in what
> STST scenarios it makes more sense to pass by reference than by value. I have
> always done it by value. TIA.
>
> John Treble

Passing by-reference is more natural in Smalltalk. Passing by-value breaks identity; it makes a copy. And it can be slower because of the marshaling costs, but that's often amortized in the cost of the remote call. On the other hand, it may speed things up by avoiding subsequent remote calls, so if identity is not important in a given scenario then it may pay off to make a local copy to interact with, rather than doing it all remotely. Then again, pass by reference can speed things up when you send the same argument back and forth in a multiple-dispatch sort of way. So as usual, it depends :-).

There are also other possible modes, e.g. when you have the same sort of object on either side, you may want to pass them as "the same object on the other side" using some sort of ID. We do something like this for classes, for example.

Anyway, my main point was that I'm not convinced that marrying yourself to one and only one pass-mode scheme is the best for all the problems out there.

Martin

|
In reply to this post by Travis Griggs-3
Reading this thread, it seems to be missing the point. The issue with multithreading as implemented in Smalltalk is not performance, though more is always better, but the semantics. Semaphores and shared memory are too difficult to work with. What should be explored is how to add better, and by that I mean easier to work with, semantics for multithreading into Smalltalk. If Smalltalk provided a conceptually manageable way to manage even 100 threads we would be way ahead. I don't even think there are that many options:

I like option 3. This approach is a bit like the notion of using multiple images communicating via Opentalk, but they all happen to reside in the same memory space, sharing objects where possible.

From: Travis Griggs [mailto:[hidden email]]
|
In reply to this post by kobetic
Martin Kobetic wrote:
> What happens in Erlang when you pass some kind of system object like the
> scheduler? I feel that maybe the programmer should fine-tune what gets
> passed by value and what by reference.

Well, nothing, because there is no such thing as a scheduler available to the application programmer. But if it were there, it would be a process and you could send it messages as well as send its pid (process ID) to a process on a different node.

There are generally two types of data in Erlang:
* data which the application works with - these are always copied (some sort of copy-on-write) and are passed by value,
* processes, which are always passed by their reference (pid).

And because a pid is network (location) transparent, you can send it messages from wherever you like (once you got it :-). However, Erlang has no built-in ("don't worry") support for migrating active processes among network nodes (but it has library support for this).

Ladislav Lenart

|