Hi Andres,

Andres Valloud wrote:

<snip>

> In fact, yesterday I ran an experiment which was quite funny. I have
> this demo that takes a lot of cpu / graphics to run. If I run it on a
> dual core machine when idle, it performs badly. So I start VW, assign
> it to run on say core 0, and kick off this:
>
> [true] whileTrue: [nil]
>
> Now core 0 is super busy. If I run the demo under these conditions,
> it gets scheduled to run mostly on core 1 --- and thus it runs much
> better even though the CPU is under 2x the overall load!

I think it fits well with the scenario others were relating to. It
would also make sense in the context of Paul's results if there were
something about VW's VM that made it behave differently under the
given conditions; e.g. very little built-in parallelism would explain
it to me.

> Even so... maybe I am wrong, but at first sight it seems to me that a
> multi threaded VM may run into similar issues due to hardware design
> limitations that cause very expensive cache thrashing when two cores
> look at roughly the same thing.

Yes, I'm happy to accept that. For the time being at least - I will
watch the hardware development space closely :-).

> Unless of course the memory spaces the Smalltalk processes are
> running on are separated, and then you have essentially two images
> running on the same VM - but then why not just two simpler / more
> efficient VMs?

I think a very efficient marshaling protocol would do the trick for
me. I just assumed that it must be easier at the VM level than at the
application level, where you have to mess with the likes of BOSS etc.
The thing can be quite slow at times...
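For concreteness, the kind of round trip I mean looks roughly like
this - a from-memory sketch against the classic BinaryObjectStorage
API, so the exact selectors may differ between releases:

    | ws boss bytes rs copy |
    "Writing side: serialize an object graph to bytes."
    ws := WriteStream on: ByteArray new.
    boss := BinaryObjectStorage onNew: ws.
    boss nextPut: #(1 2.5 'three').
    boss close.
    bytes := ws contents.

    "Reading side: reconstitute the graph, e.g. in another image."
    rs := ReadStream on: bytes.
    boss := BinaryObjectStorage onOldNoScan: rs.
    copy := boss next.
    boss close.

Every message between images pays for a full copy of the object graph
on both ends, which I assume is where the time goes once the graphs
get big.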
> Besides, if one decides to crash for whatever reason (e.g. an image
> runs out of memory, or some flaky C library finally overwrites some
> memory it shouldn't have), you only lose one image instead of a whole
> blade, etc...

I would not be concerned with application resiliency at such a low
level. This kind of problem makes much more sense at a higher level,
where you can use application-specific semantics to your advantage.

Regards,
Jaroslaw.


Hi Runar,
Sorry for the late answer...

Runar Jordahl wrote:

> As pointed out by others, making the VisualWorks Virtual Machine
> multithreaded is a large, complex project. In addition, making your
> own application work with shared memory parallel processing is very
> hard. Probably a lot harder than one can imagine. The problem is that
> thread execution order is highly non-deterministic. What might work
> on one box can fail as new hardware optimises the execution of the
> threads. What runs ok when running a test case the first time fails
> the next time it is run, and so on. How threads are scheduled can
> decide whether a problem appears or not. It is not that fixing these
> problems cannot be done. It is just hard to mentally see the problems
> before they surface. And once they do appear, reproducing them can be
> next to impossible. Read this blog post for more details on these
> issues:
> http://www.cincomsmalltalk.com/userblogs/runarj/blogView?showComments=true&entry=3347714260

Unfortunately I did not find the arguments convincing. As much as I
agree with the general reasoning that threads lead to non-deterministic
programs, I do not think it applies to multi-threaded processing as a
field. I think Mr Lee should incorporate the non-preemptiveness of
some operations into the translation function for a multi-threaded
program, instead of just talking about programmers relying on "means
of purging indeterminism" in their software. Unfortunately all the
math necessary to have a proper discussion has evaporated from my
brain over the years, so I will leave the functional transformations
in peace. The material did contain some interesting references though.

> A multiple-image solution using a divide and conquer algorithm can
> easily be designed to run deterministically. Given the same input
> data, the execution will either fail or pass, regardless of how the
> subtasks are scheduled. Testing, debugging, and understanding such an
> image-based solution will be a lot easier than a (native) thread
> implementation.

It comes as a bit of a surprise to me that one would describe a green
thread application as deterministic - I mean an application that does
not use means of synchronization like semaphores. One classic example
is the standard implementation of the ST Transcript - it has not been
thread safe for a really long time; see the sketch below. From my
experience there is very little difference, apart from the given
problems with implementing multi-threaded VM support, between green
threads and OS-level threads. An image does not live in isolation - it
is constantly interrupted by OS events, or by your application
interacting with other components in the environment, like the file
system, which can alter the execution of the program. Maybe if we were
talking about some very academic, simplified case the statement would
have some merit, but in my day-to-day experience it is not so.
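Something along these lines shows the Transcript point - a from-memory
sketch; the explicit yields only force the process switches that could
otherwise happen wherever the scheduler allows them:

    "Two unsynchronized processes sharing the Transcript; the labels
     and numbers interleave differently from run to run."
    [1 to: 5 do: [:i |
        Transcript show: 'A'.
        Processor yield.
        Transcript show: i printString; cr]] fork.
    [1 to: 5 do: [:i |
        Transcript show: 'B'.
        Processor yield.
        Transcript show: i printString; cr]] fork.

    "The usual fix is explicit mutual exclusion around each line:"
    | mutex |
    mutex := Semaphore forMutualExclusion.
    [1 to: 5 do: [:i |
        mutex critical: [Transcript show: 'A' , i printString; cr]]] fork.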
> Having said that, I know there are scenarios where multithreading
> makes sense. We develop a system using Monte Carlo simulations, and
> each simulation is fairly easy to make totally independent. If there
> were resources, having the option to run on multiple cores would be
> nice. But for most problems, it is the wrong thing to do.
>
> The disadvantage of multiple images, as you pointed out, is that
> marshalling data is expensive. But I think that for most problems
> this will not be a showstopper. If you are able to divide the
> problem, the amount of data to be marshalled should decrease.
> Communication directly between two images running on the same box
> (through RPC) can be fast.

My experience was that this is not sufficiently supported by our
existing tools and that the maintenance overhead was high.

> My take on this is that we can live without shared memory threads,
> and it would be better to focus on how to use multiple images. We
> need both to control the start-up and termination of the images, and
> robust frameworks for enabling the communication between them. Object
> serialization (like BOSS), persistency toolkits, and communicational
> layers (OpenTalk) are important.
>
> I have briefly looked at scaling large problems across multiple
> cores, and it is not that hard. More info is found here:
> http://www.cincomsmalltalk.com/userblogs/runarj/blogView?showComments=true&entry=3348279474

Our mechanism does work, but it took an effort from a number of people
over a number of years to get it where it is now, and we still trip
over occasional errors. I dare say this is not exactly something I
would describe as "not that hard". BTW, our mechanism has many common
characteristics with the Google MapReduce engine you and Mr. Lee refer
to; the basic shape is sketched below.
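For illustration, the divide-and-conquer shape reduces to something
like this toy sketch - four independent slices summing 1..4000, using
plain forked processes here, though the same shape maps onto worker
images (the name slice is made up for the example):

    | slice results done |
    "Each slice is independent: it reads only its own sub-range and
     writes only its own slot, so the final result does not depend on
     how the subtasks happen to be scheduled."
    slice := [:i | ((i - 1) * 1000 + 1 to: i * 1000)
                    inject: 0 into: [:sum :n | sum + n]].
    results := Array new: 4.
    done := Semaphore new.
    1 to: 4 do: [:i |
        [results at: i put: (slice value: i).
         done signal] fork].
    4 timesRepeat: [done wait].
    Transcript show: (results inject: 0 into: [:a :b | a + b]) printString; cr.

With images instead of processes, the results at:put: step becomes a
marshalled reply, which is exactly where the BOSS cost from my earlier
message shows up.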
I think the main disagreement lies in how we perceive the difficulty
of the job and, in consequence, how much support we believe should be
offered by the IDE. The multi-threading vs. multi-image question is
really a secondary issue - I was hoping that memory sharing would play
a big part, but not being an expert in the field of OS multi-threading
I am happy to accept that it may not justify the cost of the other
factors.

The main question I wanted an answer to was where VW is going to be in
the world of parallel processing. Is it going to offer some clever
support, or is it going to leave it to the users of the product to
come up with the tools they need in their applications? To judge that,
one needs to find out what the appetite on the market is and what the
official view of the vendor is, and I wanted to thank you all for the
input.

I will read up on the material David Long and Charles Monteiro were
referring to - it did sound interesting - and I will also try to make
use of the information Eliot has provided. I will continue to process
it at my usual snail speed and drop you an occasional, hopefully not
too boring, email if anything interesting pops up.

Regards,
Jaroslaw.