Here is what Dave Ungar says he did on the project (among other
things): http://www.linkedin.com/in/davidungar

His thesis is online at
ftp://sunsite.berkeley.edu/pub/techreps/CSD-86-287.html

It also got turned into a book:
http://www.amazon.com/Evaluation-Performance-Smalltalk-Distinguished-Dissertation/dp/026221010X

Here is some criticism of his results:
http://www.cs.cmu.edu/afs/cs/project/ai-repository/ai/lang/lisp/txt/garbage/post.txt

I like his thesis very much, but there is certainly room to argue with
the conclusions, if for nothing else than that the world has changed
in the last 20 years.

-Ralph
Hi Ralph --
I don't think either Ungar's architecture or the criticism of it
below address the most important issues of making VHL object-oriented
computer hardware (both then and now).

Cheers,

Alan
On 5/24/07, Alan Kay <[hidden email]> wrote:
> Hi Ralph --
>
> I don't think either Ungar's architecture or the criticism of it
> below address the most important issues of making VHL object-oriented
> computer hardware (both then and now).

I would be interested in knowing what issues you think are the most
important.

The thesis is Ungar's, but the architecture is not. I think the
architecture was a group project by Patterson's students.

The point of the thesis is that a lot of proposed hardware features
don't help performance very much, and in fact there are software
solutions that are more effective. This is a good point. However, it
doesn't mean that no hardware features can help. Moreover, it assumes
that hardware is expensive and software is cheap, and in fact the
opposite is true. People have been working on a JIT compiler for
Squeak for some time, and we aren't using one yet. It is easy to say
"just put it in the compiler", but it might be too complex to ever get
the compiler working.

What do you think are the main issues?

-Ralph Johnson
In reply to this post by timrowledge
Tim Rowledge wrote on Tue, 22 May 2007 22:16:13 -0700
> On 22-May-07, at 8:26 PM, Matthew Fulmer wrote:
> > Jecel is the local expert in processors. You should check out
> > his site:
> > http://www.merlintec.com:8080/hardware

Thanks for the "local expert" comment :-) I am catching up with the
past week's email and came across this interesting discussion. There
is a particularly relevant link nearly hidden at the bottom of the
page you indicated which Brad might find interesting - the list of
Smalltalk computers.

> I still think a very simple RISC architecture with a substantial
> above-the-bus chunk of memory that can be used for 'microcode' or
> data store,

Your wish is my command - check out my RISC42 core design. Its 16-bit
instruction set was heavily influenced by my effort to come up with a
nice microcode for a Squeak processor.

> no traditional (and expensive) cache,

Caches have their uses as well, especially if they are tweaked to deal
with objects and PICs. I see no reason not to have both caches and
visible local memory, so all my recent designs have had this style.

> transputer-like
> communication channels to other cores

That might be a bit simplistic, and even the last Transputer from
Inmos (T9000) replaced that with hardware routing. Perhaps something
more like the old J-Machine from MIT (a Smalltalk computer with 1024
processors)?

> and probably no special
> floating point hardware would be nice.

How about bitblt in hardware and stuff like that?

> If you can get to a state
> where dozens/hundreds of cores can be sensibly used then one or two
> can spend their time as floating point units and if needed many more
> can join in. Likewise for video stream processing.

Indeed, which is the spirit of Chuck Moore's current design (24 Forth
processors): http://www.intellasys.net/

And this is also why I decided to register the "Plurion" trademark
(multiple RISC42 cores). For my thesis, however, I will be trying to
take advantage of the fact that the hardware is implemented using an
FPGA to replace one or two processors with hardware implementations of
key objects if the adaptive compilation system (Self/Strongtalk style)
decides these are extreme "hot spots" in the current execution.

The project Matthew mentioned is the one named "RNA" on the above
page. That one doesn't have a memory/processor separation at all.

> The really hard part is getting people to actually think about multi-
> processing solutions to problems. The software world is far too
> comfortable with single-thread thinking and the cosy fantasy version
> of Moore's Law.

That died with the Pentium 4. Nearly half of the recent talks at
http://ee380.stanford.edu/ start out with some graph that proves just
that. The software people are slowly starting to figure this out.

But back to this thread's subject: the XO is not a particularly good
indication of how much a dynabook could cost. It is a great effort to
reduce cost but has far too much PC legacy overhead. Certainly some
laptop that costs around $500 retail could approach $175 when bought
directly off the assembly line in huge quantities and with absolutely
no taxes. Not that this would make it as nice a computer for children
as the XO, of course! But I feel we should aim to have a Dynabook for
$30 under these conditions (some $80 retail) by the end of this
decade.

-- Jecel
In reply to this post by Ralph Johnson
I hesitate to go into the issues for lack of time, etc. I used to say
that "HW is just SW crystallized early". The first computers didn't have index registers but did direct address modification. Some of the real disasters in personal computers were done because the hackers didn't want to use address spaces or the CPUs couldn't support them, etc. Has no one noticed that graphics accelerators actually work? Etc. So HW choices make a huge difference. The real question is: what is a really good computation model for the VHLL, and what of it could be made into special HW? In my way of looking at objects and messages, they were supposed to be SW versions of how the ARPAnet and Internet were going to work: i.e. massive numbers, totally encapsulated, loosely coupled by pure messaging, and simultaneously active. Well designed hardware could make a huge difference on each of these compared to SW on a stupid CPU. One of the several reasons we don't have a really decent OOPL today is the lack of suitable HW to run it. The middle ground that was so successful at PARC which made up for the many things we didn't understand was the wonderful microcode architecture of the Alto and subsequent machines we designed and built. Today, FPGAs can fill some of this role. Butler Lampson has estimated that today's architectures are about 1000 times less efficient pound for pound than the Alto. That's 10 doublings which equals 15 years of Moore's Law -- meaning that what can be computed today by all the straining, etc., could have been possible 15 years ago, and what we could do today will perhaps not be possible for another 15 years. This means that Moore's Law on a bad design doesn't make up for a good design of custom HW. Just to pick one item -- suppose that instead of faking message passing with subroutine calls, the HW could enable real message passing. Or better yet, what if we could run a publish and subscribe with parameters matcher (like a modern version of LINDA) at the same speed as today's HW assisted subroutine calls? And what if these were between truly encapsulated objects that were the lowest level entity on the computer? Cheers, Alan At 10:02 AM 5/24/2007, Ralph Johnson wrote: >On 5/24/07, Alan Kay <[hidden email]> wrote: >>Hi Ralph -- >> >>I don't think either Ungar's architecture or the criticism of it >>below address the most important issues of making VHL object-oriented >>computer hardware (both then and now). > >I would be interested in knowing what issues you think are the most important. > >The thesis is Ungar's, but the architecture is not. I think the >architecture was a group project by Patterson's students. > >The point of the thesis is that a lot of proposed hardware features >don't help performance very much, and in fact there are software >solutions that are more effective. This is a good point. However, it >doesn't mean that no hardware features can help. Moreover, it assumes >that hardware is expensive and software is cheap, and in fact the >opposite is true. People have been working on a JIT compiler for >Squeak for some time, and we aren't using one yet. It is easy to say >"just put it in the compiler", but it might be too complex to ever get >the compiler working. > >What do you think are the main issues? > >-Ralph Johnson |
In reply to this post by Ralph Johnson
Ralph Johnson writes:
> The point of the thesis is that a lot of proposed hardware features
> don't help performance very much, and in fact there are software
> solutions that are more effective. This is a good point. However, it
> doesn't mean that no hardware features can help. Moreover, it assumes
> that hardware is expensive and software is cheap, and in fact the
> opposite is true. People have been working on a JIT compiler for
> Squeak for some time, and we aren't using one yet. It is easy to say
> "just put it in the compiler", but it might be too complex to ever get
> the compiler working.

But we've been making continuous progress towards a JIT for the last
few years. I doubt that finishing Exupery would cost anything like
what a modern high performance CPU costs to design.

Exupery has just started compiling in the background; a few bugs need
to be fixed before that's a safe way to run it. It is progressing.
Writing in Smalltalk is a good way to keep development costs down.

Bryce
In reply to this post by Brad Fuller-3
Brad,
> The Viewpoints "Steps Toward The Reinvention of Programming" (maybe
> I'll call the authors "The Gang of 5") does not go into detail on
> what their "metal" consists of. I'd like to find out more of what
> they are thinking. Maybe they'll start with a powerbook and use the
> XO in parallel. Maybe they have some ideas of building a new
> platform in parallel with the software development.

I should mention that there are more people working on the project
(in addition to the authors, who are committing to the project in
different ways). The Gang of "5" wouldn't really match what is going
on^^;

> General XO specs:

There was a change to the spec a while ago. The latest one is
available at
http://wiki.laptop.org/go/Hardware_specification#Beta_Test_3_Systems_.28BTest-3.2C_or_B3.29
but the following is the excerpt:

> CPU
> AMD Geode 400 MHz x86

AMD Geode LX 433 (should allow 500, I think) MHz.

> Memory
> 128 Mb 133 MHz DRAM

256 MB.

> Storage
> 512 Mb SLC NAND Flash memory

1 GB.

> Video
> 693 x 520 pixel color display capable of switching to a 1200 by 900
> sunlight-readable monochrome mode

Did you divide the number by three to get the color resolution? It
doesn't quite work in that way^^; Take a look at the great OLPC screen
simulator by Bert (http://freudenbergs.de/bert/etoys/OLPC-XO-Display.pr)

> Mouse
> a touchpad pointing device

You might add that the touch pad has two "layers".

> Interfaces
> 4 USB ports

3 are available to the outside.

-- Yoshiki
In reply to this post by Ralph Johnson
On Thursday 24 May 2007 10:32 pm, Ralph Johnson wrote:
> The point of the thesis is that a lot of proposed hardware features
> don't help performance very much, and in fact there are software
> solutions that are more effective. This is a good point. However, it
> doesn't mean that no hardware features can help. Moreover, it assumes
> that hardware is expensive and software is cheap, and in fact the
> opposite is true.

Mmm.. Software and hardware represent two ends of a continuum, so
both propositions are oversimplifications.

The fixed costs in hardware are so high now that it is economical
only if manufactured in large numbers. Therefore it is important to
get 'it right the first time'. This depends on the accuracy and
precision of the design and testing tools. The dependency trace will
lead you back to building software that 'works the first time'. Doing
that is not cheap. Dijkstra's Turing award lecture goes into why
producing correct software in the large is so difficult.

Until we solve these issues and learn to produce
microcode/nanocode/picocode within acceptable limits of accuracy, the
system costs will be tough to control.

Regards .. Subbu
On 24-May-07, at 8:49 PM, subbukk wrote:
> The fixed costs in hardware are so high now that it is economical
> only if manufactured in large numbers. Therefore it is important to
> get 'it right the first time'.

Not if you do it right and build highly programmable hardware. The
only thing you have to fear then is someone discovering the secret of
the Halt and Catch Fire instruction[1]

tim

[1] I worked with a chap at IBM who used a pair of 29116 bitslice
CPUs in a project. It was entirely possible to cause a tri-state bus
contention that really would execute the HCF instruction. Amusing but
expensive. Possibly a useful anti-theft mechanism?

--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
Useful Latin Phrases:- Fac me cocleario vomere! = Gag me with a spoon!
On Friday 25 May 2007 11:02 am, tim Rowledge wrote:
> On 24-May-07, at 8:49 PM, subbukk wrote:
> > The fixed costs in hardware are so high now that it is economical
> > only if manufactured in large numbers. Therefore it is important to
> > get 'it right the first time'.
>
> Not if you do it right and build highly programmable hardware. The
> only thing you have to fear then is someone discovering the secret of
> the Halt and Catch Fire instruction[1]

million) and the fab costs (a few billion). The cost of the first CPU
is quite steep. ASICs and FPGAs are cheaper because it is easier to
get them right in the first place. The big 'impedance mismatch'
between mass CPUs of today and what a Dynabook needs means that the
cost of the first Dynabook CPU will be quite steep. It would be easier
to create a Dynabook virtual machine (like qemu) in software and
experiment with it for a few years before taping out silicon.

Regards .. Subbu
In reply to this post by Alan Kay
On Friday 25 May 2007 12:39 am, Alan Kay wrote:
> Just to pick one item -- suppose that instead of faking message
> passing with subroutine calls, the HW could enable real message
> passing.

Not really hardware, but QNX is a network-ready kernel that uses
message passing exclusively. It proves that message-passing systems
can be compact and efficient even on crufty :-) PCs. It is so compact
that an entire dial-up internet appliance with GUI, vector graphics
and network-ready browser was packed into a 1.44MB bootable floppy[1]!

[1] ftp://ftp.qnx.com/usr/free/qnx4/os/demos/misc/qnxdemo.tgz

Regards .. Subbu
In reply to this post by K. K. Subramaniam
subbukk <[hidden email]> writes:
> On the contrary, it is much simpler to write and reason about programs if
> multiprocessing capability is given. Dijkstra's do-od structure was
> inherently multi. But building machines to 'execute' such programs was hard,
> so system designers invented languages that forced programmers to code for
> efficiency rather than simplicity. This trend was beautifully captured by
> Gerald Weinberg in his story of Levine the Genius Tailor:
>
> http://www.zafar.se/bkz/Articles/GeniusLanguageDesigner

That's a cool story. It sounds just like language design!

I am not so sure, however, that we have found even great ways to
*think* about parallel programs. Dijkstra's discussion of do-od is
really cool. However, here are two challenges it does not address:

1. Even when you are thinking about formal proof, like Dijkstra was,
parallelism in large OO programs means you have to reason about
aliasing, i.e. you have to do data flow analysis. This is hard,
probably too hard if the language is unconstrained.

2. Most programs are not formally proven. Instead, their correctness
relies on careful reasoning, on good processes, and on testing.
Parallelism undermines the testing part.

The jury is out. One promising avenue, though, is the message passing
used in E and in Actors-based systems. You can try this style in
Squeak by using SynchronizedQueues. Instead of having locks to protect
shared data structures, you eliminate sharing and instead assign each
structure to a single thread. Any other thread that wants to access
the structure must do it by sending a message to the appropriate
thread. In Squeak, you can implement the message sending using
SynchronizedQueues.

Lex
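[For anyone who wants to try that style, here is a minimal workspace
sketch using Squeak's stock SharedQueue class as the mailbox. The
#deposit protocol, the account names and the Association-as-message
convention are all made up for the example.]

  | mailbox balances |
  mailbox := SharedQueue new.
  balances := Dictionary new.

  "The owning process: the only code that ever reads or writes balances."
  [[ | msg account amount |
     msg := mailbox next.                 "blocks until a message arrives"
     msg key = #deposit ifTrue: [
         account := msg value at: 1.
         amount := msg value at: 2.
         balances at: account put:
             (balances at: account ifAbsent: [0]) + amount]] repeat] fork.

  "Any other process never touches the dictionary; it only enqueues messages."
  mailbox nextPut: #deposit -> #('alice' 10).
  mailbox nextPut: #deposit -> #('alice' 5).

No lock is needed around balances because only the one forked process
ever touches it; all coordination goes through the queue.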
On Sat, 26 May 2007 13:02:05 -0700, Lex Spoon <[hidden email]> wrote:
> I am not so sure, however, that we have found even great ways to
> *think* about parallel programs.

Actually, I think programming in parallel could be easier than
programming serially, given sufficient help from the tool.

When I was fooling around with language designs, one thing that
occurred to me was that, when we program serially, we imply that we
=care= what order things are executed in, when very often we don't
really. Order has an implied significance which can actually be
deceptive. Worse, it can have a significance that is hidden.

For example, we could have some initialize code:

aVar := someClass new.
anotherVar := someOtherClass new.

Does anotherVar need aVar to be initialized before it can be
initialized? Good programming practice sez "it shouldn't", but play
along. The point is, there's no way to say "I don't care what order
things occur in, as long as THIS is done before THAT begins."

You could program in two dimensions, almost like music, where
everything on a particular line had to be done sequentially, while
things that were vertically arranged (rather than being chords, as in
music) could be done in any order. Synchronization points could be
set up vertically (almost like measure bars).

------serial execution----->
| ]
| ]  parallel
] |  synch point
| ]
| ]

I think it would actually clarify things.
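[Part of this can already be spelled out by hand in stock Squeak -- a
small workspace sketch, with a Semaphore playing the synch point and
OrderedCollection/Dictionary merely standing in for someClass and
someOtherClass above.]

  | done aVar anotherVar |
  done := Semaphore new.

  "These two branches may run in either order, or interleaved."
  [aVar := OrderedCollection new. done signal] fork.
  [anotherVar := Dictionary new. done signal] fork.

  "Synch point: continue only after both branches have signalled."
  done wait; wait.
  Transcript show: 'both initialized'; cr.

The point of the two-dimensional notation, of course, is that the tool
would record this structure for you instead of making you wire the
semaphores by hand.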
Blake wrote:
> When I was fooling around with language designs, one thing that
> occurred to me was that, when we program serially, we imply that we
> =care= what order things are executed in, when very often we don't
> really. Order has an implied significance which can actually be
> deceptive. Worse, it can have a significance that is hidden.
>
> For example, we could have some initialize code:
>
> aVar := someClass new.
> anotherVar := someOtherClass new.
>
> Does anotherVar need aVar to be initialized before it can be
> initialized? Good programming practice sez "it shouldn't", but play
> along. The point is, there's no way to say "I don't care what order
> things occur in, as long as THIS is done before THAT begins."

Occam allows you to do that with its SEQ and PAR constructs.
In reply to this post by Lex Spoon-3
On Sunday 27 May 2007 1:32 am, Lex Spoon wrote:
> 1. Even when you are thinking about formal proof, like Dijkstra was,
> parallelism in large OO programs means you have to reason about
> aliasing, i.e. you have to do data flow analysis. This is hard,
> probably too hard if the language is unconstrained.

What is hard about programming is coming up with the right
invariants. What happens between the starting invariant and ending
invariant is determined by the semantics. E.g.

  x, y := y, x

is easier to reason about than bringing in a temp to swap x and y.

> 2. Most programs are not formally proven.

True. Proving large programs automatically is theoretically
infeasible for most languages in use today. But many algorithms do go
thru formal proofs before adoption - e.g. floating point math,
semaphores, regular expressions are all based on proven algorithms.

> Instead, their correctness
> relies on careful reasoning, on good processes, and on testing.
> Parallelism undermines the testing part.

Testing invariants is not affected by serial/parallel execution, but
debugging or tracing parallel programs is not really for the
faint-hearted :-).

BTW, this thread is veering off-topic. We should get back to the cost
of the Dynabook before Brad Fuller starts reaching out for a 2x4 :-).

Regards .. Subbu
subbukk <[hidden email]> writes:
> On Sunday 27 May 2007 1:32 am, Lex Spoon wrote:
> > 1. Even when you are thinking about formal proof, like Dijkstra was,
> > parallelism in large OO programs means you have to reason about
> > aliasing, i.e. you have to do data flow analysis. This is hard,
> > probably too hard if the language is unconstrained.
>
> What is hard about programming is coming up with the right
> invariants. What happens between the starting invariant and ending
> invariant is determined by the semantics. E.g.
>   x, y := y, x
> is easier to reason about than bringing in a temp to swap x and y.

An interesting example. This one is about parallelism, but not
non-determinism. A generalization would be "clocked" systems, which
indeed are really cool.

> > 2. Most programs are not formally proven.
>
> True. Proving large programs automatically is theoretically
> infeasible for most languages in use today. But many algorithms do go
> thru formal proofs before adoption - e.g. floating point math,
> semaphores, regular expressions are all based on proven algorithms.

Yes, algorithms are more amenable to proof than implementation. If
you implement a proven algorithm, and something goes wrong, then it's
a normal old bug. If you do like Java and implement an unproven type
system, then you can get really nasty surprises. The last I heard, it
is still not even known if Java's generics *can* be fully implemented
to spec. That is not a nice place to be as a programmer!

The language question you raise is also interesting. Sometimes
implementation languages are pointlessly difficult to prove things
about. Other times, though, there is a real tradeoff between
implementation speed and provability. A good example is inheritance
plus method override. These are really great for adding functionality
to existing programs, but are awkward for proof tools because the
meaning of a method call depends on which overridden versions of the
method you have loaded.

It would be really neat to have a subset of Squeak that was designed
to be amenable to proof, and then to teach one of the existing proof
systems about this subset. If you include blocks, but reject
inheritance, then you could come up with something close to the
lambda-calculus-like languages that the existing proof tools are so
good at. You would not like programming this way, compared to using
full Squeak, but for core things like Semaphore and SharedQueue it
would seem useful.

Lex
On Tuesday 29 May 2007 12:40 pm, Lex Spoon wrote:
> Yes, algorithms are more amenable to proof than implementation. If
> you implement a proven algorithm, and something goes wrong, then it's
> a normal old bug.

Proofs and Implementation represent two ends of a large spectrum.
Algorithms fall somewhere in between (but closer to Proofs). I think
Alan Kay pointed out a few mails before that "HW is just SW
crystallized early". A mathematician would work near the Proof end,
while an engineer would have to tackle the Implementation side. For
engineers, any system that takes us "close enough" is good enough.
But algorithm to implementation is a big leap of faith today :-(.
Where the translation covers just a couple of orders of magnitude
(e.g. math, graphics etc), we can achieve an accuracy that is
sufficient to commit to HW. But where modularity is introduced at
compile time but vanishes at runtime, we are forced to analyze code
that runs into millions of machine instructions. Bugs will be the
norm rather than the exception :-).

Projects like Exupery are good because they help us short-circuit
many steps between algorithms and machine instructions. The first
Smalltalk machine description didn't come with invariants and the
bugs were discovered 'at execution time' (cf. Bits of History). It
would be interesting to see if Dynabook could be described in
Smalltalk along with all invariants.

Regards .. Subbu
Hi Subbu --
See what you think of the "preposterous proposal" (as one reviewer
termed it).

http://www.vpri.org/pdf/NSF_prop_RN-2006-002.pdf

One of the ideas here is that you have to work the whole spectrum,
but try to find qualitative boundaries that can be maintained and
will help all.

Cheers,

Alan
On Tuesday 29 May 2007 8:58 pm, Alan Kay wrote:
> Hi Subbu --
>
> See what you think of the "preposterous proposal" (as one reviewer
> termed it).
>
> http://www.vpri.org/pdf/NSF_prop_RN-2006-002.pdf

Some of the ideas presented (e.g. Island, pseudo-time) would have
been considered 'bold' or 'risky' in the nineties, but not so today.
Kairos works so well in the real world (e.g. children, artists,
farmers, etc), so there is no reason computing should stick to
chronos time.

Building a 20KLOC monolith would definitely be a challenge. The Unix
kernel was only around 6K lines of C when it started. TinyCC (circa
2004), a bootloader that compiles the Linux kernel on the fly and
runs it, is about 7500 lines of C. The issue here is not one of space
or speed constraints but one of verifying the correctness of such a
monolith within engineering tolerances.

Regards .. Subbu
Subbukk-
Thanks for sharing this! I had been ignorant of the Kairos / Khronos
distinction. Quite relevant for Croquet!

-H

subbukk wrote:
> ...Kairos works so well in the real world (e.g. children, artists,
> farmers, etc), so there is no reason computing should stick to
> chronos time.
> ...