Hi All,
On Mon, Sep 15, 2014 at 6:01 AM, Thierry Goubier <[hidden email]> wrote:
I find this whole discussion depressing. It seems people would rather put their energy into chasing quick fixes or other technologies instead of contributing to the work that is being done in the existing VM. People discuss using LLVM as if the code generation capabilities inside Cog were somehow poor or have no chance of competing. Spur is around twice as fast as the current memory manager and has much better support for the FFI. Clément and I, now with help from Ronie, are making excellent progress towards an adaptive optimizer/speculative inliner that will give us similar performance to V8 (the Google JavaScript VM, led by Lars Bak, who implemented the HotSpot VM (Smalltalk and Java)) et al. We are trying to get person-power for a high-quality FFI and have a prototype for a non-blocking VM. When we succeed, C won't be any better and so it won't be an interesting target. One will be able to program entirely in Smalltalk and get excellent performance. But we need effort. Collaboration. Personally, I feel so discouraged when people talk about using LLVM or libffi or whatever instead of having the courage and energy to make our system world-class. I have confidence in our abilities to compete with the best and am saddened that people in the community don't value the technology we already have and can't show faith in our abilities to improve it further. Show some confidence, express support, and above all get involved.
in hope, Eliot
Hear hear! -C

[1] http://tinyurl.com/m66fx8y (original message)

--
Craig Latta
netjam.org
+31 6 2757 7177 (SMS ok)
+ 1 415 287 3547 (no SMS)
Hello,

I am segmenting this mail into several sections.

---------------------------------------------------------------
- On Lowcode and Cog

In the last weeks I have been working with the Cog VM, implementing the Lowcode instructions in Cog:
- Boxing and unboxing of primitive types.
- Unchecked comparisons.
- The atomic compare-and-swap operation.
- Object pin/unpin (requires Spur).
- VM releasing and grabbing for threaded FFI.

Currently I have implemented the following backends:
- A C interpreter plugin.
- An LLVM-based backend.

Now I am working on getting this working using the Cog code generator. So far I am already generating code for int32/pointer/float32/float64, and I am starting to generate C function calls and object boxing/unboxing.

During this work I learned a lot about Cog. In particular, I learned that Cog is missing a better Slang generator, one that allows forcing better inlining, and that it needs more code reviews. There is a lot of code duplication in Cog that can be attributed to limitations of Slang. In my opinion, if we could use Slang for more than just building the VM, we would end up with a better code generator.

In addition, we need more people working on Cog: people who perform code reviews and write documentation. After these weeks, I learned that working on Cogit is not that hard. Our biggest problem is the lack of documentation about Cog. Our second problem could be the lack of documentation about Slang.

---------------------------------------------------------------
- Smalltalk -> LLVM ?

As for having a Smalltalk -> LLVM code generator: the truth is that we would not gain anything. LLVM is a C compiler, designed to optimize things such as loops with a lot of arithmetic; it is designed to optimize large sections of code. In Smalltalk, our code is composed mostly of message sends. LLVM cannot optimize a message send. To optimize a message send, you first have to determine which method is going to respond to the message. Then you have to inline that method.
And then you can start performing the actual optimizations, such as constant folding, common subexpression elimination, dead-branch elimination, loop unrolling, and so on. Because we don't have information in the language itself (e.g. static types à la C/C++/Java/C#) that tells us which method will actually be invoked by a message send, we have the following alternatives to determine it:
- Don't optimize anything.

In other words, our best bet is the work of Clément on Sista. The only problem with this bet is real-time applications. Real-time applications require an upper-bound guarantee on their response time. In some cases, the lack of this guarantee is just an annoyance, as happens in video games. In some mission-critical applications, the results can be very bad if this time constraint is not met. An example of a mission-critical system could be the flight controls of an airplane, or the cooling system of a nuclear reactor. For these applications, it is not possible to rely on an adaptive optimizer that is only triggered sometimes. In these applications you have to either:
- Extend the language to hand-optimize some performance-critical sections of code.
- Use another language to optimize these critical sections.
- Use another language for the whole project.

And of course, you have to perform a lot of profiling.

Greetings,
Ronie

2014-09-15 16:38 GMT-03:00 Craig Latta <[hidden email]>:
Hoi Ronie-- Nice summary. Thanks!

-C

--
Craig Latta
netjam.org
+31 6 2757 7177 (SMS ok)
+ 1 415 287 3547 (no SMS)
In reply to this post by Ronie Salgado
Hi Ronie,
On Mon, Sep 15, 2014 at 2:37 PM, Ronie Salgado <[hidden email]> wrote:
> remember to send me code for integration. I'm eagerly waiting to use your code!
Yes, and that's difficult because it's a moving target and I have been lazy, not writing tests, instead using the Cog VM as "the test". I am so happy to have your involvement. You and Clément bring such strength and competence.
Ah! But! Sista has an advantage that other adaptive optimizers don't have. Because it optimizes from bytecode to bytecode, it can be used during a training phase and then switched off.

The additional option is to "train" the optimizer by running the application before deploying and capturing the optimised methods. Discuss this with Clément and he'll explain how straightforward it should be. This still leaves the latency in the Cogit when it compiles from bytecode to machine code. But a) I've yet to see anybody raise JIT latency as an issue in Cog, and b) it would be easy to extend the VM to cause the Cogit to precompile specified methods. We could easily provide a "lock-down" facility that would prevent Cog from discarding specific machine-code methods.

Early and often :-). Because we can have complete control over the optimizer, and because Sista is bytecode-to-bytecode and can hence store its results in the image in the form of optimized methods, I believe that Sista is well-positioned for real-time use, since it can be used before deployment. In fact we should emphasise this in the papers we write on Sista.
best, Eliot
In reply to this post by Eliot Miranda-2
Hi Eliot and all!

Since I work with Ron at 3DICC and Cog is vital to us, I wanted to chime in here.

On 09/15/2014 06:23 PM, Eliot Miranda wrote:
> I find this whole discussion depressing. It seems people would rather
> put their energy in chasing quick fixes or other technologies instead of
> contributing to the work that is being done in the existing VM. People
> discuss using LLVM as if the code generation capabilities inside Cog
> were somehow poor or have no chance of competing. Spur is around twice
> as fast as the current memory manager, has much better support for the
> FFI. Clément and I, now with help from Ronie, are making excellent
> progress towards an adaptive optimizer/speculative inliner that will
> give us similar performance to V8 (the Google JavaScript VM, lead by
> Lars Bak, who implemented the HotSpot VM (Smalltalk and Java)) et al.

One thing you need to understand, Eliot, is that most of us don't have the mind power or time to be able to contribute at that level. But still, a lot of us are tickled by ideas on the low level - and thus ideas like reusing LLVM, reusing some other base VM, cross compilation etc. pop up. Don't put too much into it - I am always toying with similar ideas in my head for "fun"; it doesn't mean we don't also see/know that *real* VM work like Cog is the main road.

> We are trying to get person-power for a high-quality FFI and have a
> prototype for a non-blocking VM. When we succeed C won't be any better
> and so it won't be an interesting target. One will be able to program
> entirely in Smalltalk and get excellent performance. But we need
> effort. Collaboration.

Let me just mention LuaJIT2 - besides very good performance, among other things it sports a *very* good FFI. In fact, Lua in general has several FFIs and tons of C++ binding tools too - so IMHO anyone doing work in that area should take a sneak peek at LuaJIT2. And this has been a truly "sore" area in Smalltalk since forever.
If we had something as solid as the stuff in the Lua community, then Cog and Smalltalk could go places where they haven't been before, I suspect. If we look at the codebase we have at 3DICC, a very large part consists of complicated plugin code to external libraries and accompanying complicated Smalltalk glue. Also, if we compare the Lua community with the Squeak/Pharo community, it is quite obvious that the lack of really good FFI solutions leads us to "reinvent" stuff over and over, often poorly, while the Lua people simply wrap high-quality external libraries and that's it. Done. Of course this also stems from the very different backgrounds and motives behind the two languages and their respective domains, but still.

> Personally I feel so discouraged when people talk about using LLVM or
> libffi or whatever instead of having the courage and energy to make our
> system world-class.

Don't feel discouraged - it's just that 99% of the community can't help you. :) Instead we should feel blessed that we have 1 Eliot, 1 Clément, 1 Igor and 1 Ronie. Do we have more?

> I have the confidence in our abilities to compete
> with the best and am saddened that people in the community don't value
> the technology we already have and can't show faith in our abilities to
> improve it further. Show some confidence and express support and above
> all get involved.

Let me then make sure you know that 3DICC values *all* work on Cog *tremendously*. As soon as you have something stable on the Linux side, we would start trying it. Just let me know; on Linux (server) we run your upstream Cog "as is". In fact, I should probably update what we use at the moment :) Every bit of performance makes a big impact for us - but to be honest, what we would value even more than performance would be ... robustness. I mean, *really* robust. As in a freaking ROCK.
An example deployment: more than 3000 users running the client on private laptops (all Windows variants and hardware you can imagine, plus some Macs), and the server side running on a SLEW of FAT EC2 servers. We are talking about a whole BUNCH of Cogs running 24x7 on a bunch of servers.

We experience VM blow-ups on the client side, both Win32 and OSX. OSX may be due to our current VM being built by clang, but I am not sure. Our Win32 VM is old; we need to rebuild it ASAP. It is hard to know if these are Cog related or, more likely, 3DICC plugin related, but still.

But the client side is still not the "painful" part - we also experience Linux server-side Cogs going berserk (100% CPU, no response), or just locking up, or suddenly failing to resolve localhost :) etc. I suspect the networking code in probably all these cases. Here we do NOT have special 3DICC plugins, so no, here we blame Cog or, more likely, the Socket plugin. Often? No, but "sometimes" is often enough to be a big problem. In fact, a whole new networking layer would make sense to me.

Also... we need to be able to use more RAM. We are now deploying to cloud servers more and more - and using instances with 16Gb RAM or more is normal. But our Cogs can't utilize it. I am not up to speed on what Spur gives us or if we in fact need to go 64-bit for that.

regards, Göran
In reply to this post by Eliot Miranda-2
2014-09-16 1:46 GMT+02:00 Eliot Miranda <[hidden email]>:
Lack of documentation? About Cog, there is this documentation. About Spur: the summary and the object format. And many useful class and method comments that taught me a lot. When I try to work with Pharo frameworks, even recent ones, it is very rare that I see as much documentation as exists for Cog. Some frameworks are documented in the Pharo books and a few others, such as Zinc, have good documentation, but in general there is little documentation and even fewer people writing documentation. The website about Cog has existed for over 6 years now. I think Cog is far from the worst-documented part of Pharo.
It's also difficult because the first tests are the hardest to write.
Eliot's solution makes sense. To write a paper about that, I need benchmarks showing results on real-time applications, so there's quite some work to do first.
What would be valuable is a reading list / path to VM enlightenment:
- The Blue Book is useful
- Then a tour of the Object Engine by Tim
- Then plugin articles + Slang
- The bytecode set
- Primitives...
- Context-to-stack mapping
- Blocks
- Non-local returns
- Display/Sensor/event loop/timer implementation (like in the porting document)

and only then would one move on to more advanced topics.

I saw that Clément had a set of VM-related books on his desk at INRIA; maybe posting the list would be great!

All the best,
Phil

On Tue, Sep 16, 2014 at 11:48 AM, Clément Bera <[hidden email]> wrote:
In reply to this post by Ronie Salgado
On Tue, Sep 16, 2014 at 1:48 PM, Thierry Goubier <[hidden email]> wrote:
> > > > 2014-09-16 13:14 GMT+02:00 Ben Coman <[hidden email]>:
>>
>> Don't worry/don't bother with those: you will never use Smalltalk or a VM :) It will never be certified by authorities, and the industry will never accept it.
>>
>> You are probably right for those two examples, but there are other not-so-regulated domains where real-time is useful - e.g. industrial automation and robotics.
>
> Real-time is useful there, yes. But Smalltalk and Cog will never get there. Except as a DSL / code generator tool (which means an MDE approach, more or less).
>
> (And code generation is where Pharo to C or LLVM-IR gets us interested)
>
> Dynamic optimisations, lack of static typing: they will laugh you out of any of those fields.
>
> Even if their developers use Python behind their back.

I know the creator of OpenCOMRTOS; in fact, he lives close to my place. They happen to have a "VM": "The ultra small target independent Virtual Machine".

Applications:
- Remote diagnostics.
- Fail-safe and fault-tolerant control.
- Processor-independent programming.

Thanks to the use of OpenComRTOS, SafeVM tasks can operate system-wide across all nodes in the network. The user can also put several SafeVM tasks on the same node. The natively running OpenComRTOS itself acts as a virtual machine for the SVM tasks, isolating them from the underlying hardware details while providing full access. A Safe Virtual Machine for C.

So, it looks like we aren't in such a black-and-white situation.

Phil

> Thierry
In reply to this post by philippeback
2014-09-16 14:55 GMT+02:00 [hidden email] <[hidden email]>:
The book that best explains how (and why) to implement a high-performance VM for Smalltalk is Urs Hölzle's PhD thesis. Other relevant books in my office focus on specific topics, such as Advanced Compiler Design and Implementation by Steven Muchnick for optimizing compilers, or The Garbage Collection Handbook by Richard Jones, Antony Hosking and Eliot Moss.
In reply to this post by Göran Krampe
On Tue, Sep 16, 2014 at 12:56 AM, Göran Krampe <[hidden email]> wrote:
Time is the issue. I'm no brighter than anyone here, but I have my passion. And one can learn. Doug McPherson just contributed the ThreadedARMPlugin, having never read the ABI before he started the project (because he never needed to).

> But still, a lot of us are tickled by ideas on the low level - and thus ideas like reusing LLVM, reusing some other base VM, cross compilation etc - pop up.

Well, I hear you, and I think that the FFI is extremely important. That's why I implemented proper callbacks for Squeak, why Spur supports pinning, and why I did the MT prototype; it is also one of the main areas the Pharo team is working on.

> Of course still also stems from the very different background and motives behind the two languages and their respective domains, but still.

> I have the confidence in our abilities to compete

Without error reports, in fact without an ability to debug in place (run the assert VM, for example, using the -blockonerror switch to freeze it when an assert fails), there's not a lot I can do. We use a CI server to run regressions at Cadence, and my boss makes sure I fix VM bugs promptly when the CI system shows them. We deploy on Linux, and so reliability thereon is important to us. So perhaps we can discuss how to debug your server issues.

> We experience VM blow ups on the client side, both Win32 and OSX. OSX may be due to our current VM being built by clang, but I am not sure. Our Win32 VM is old, we need to rebuild it ASAP. Hard to know if these are Cog related or more likely 3DICC plugin related, but still.

There are ways of finding out.

> But the client side is still not the "painful" part - we also experience Linux server side Cogs going berserk (100% CPU, no response) or just locking up or suddenly failing to resolve localhost :) etc. I suspect the networking code in probably all these cases. Here we do NOT have special 3DICC plugins so no, here we blame Cog or more likely, Socket plugin. Often? No, but "sometimes" is often enough to be a big problem. In fact, a whole new networking layer would make sense to me.

So we should talk.

> Also... we need to be able to use more RAM. We are now deploying to cloud servers more and more - and using instances with 16Gb RAM or more is normal. But our Cogs can't utilize it. I am not up to speed what Spur gives us or if we in fact need to go 64 bit for that.

Yes. Spur 32-bit will allow you to use a little more memory than 32-bit Cog, but by tens of percent, not large factors. You'll need to go to 64-bit Spur to be able to access more than 2, or perhaps 3, GB at the outside.

--
best, Eliot
Hi Göran,

> Also, if we compare the Lua community with the Squeak/Pharo community,
> it is quite obvious that the lack of really good FFI solutions leads
> us to "reinvent" stuff over and over, often poorly, while the Lua
> people simply wrap high quality external libraries and that's it. Done.

With Pharo, ***every*** single day we improve the system. We asked Clément to work with Eliot more than a year ago. If people understood that we created a consortium so that we can put more forces on the VM parts, including the FFI, then it would have an impact. Now, comparing Smalltalk with Lua, which was designed from the start to interact with C, is not really fair, but we will get there.

We are now attracting smart guys to the VM because the spirit of the VM guys CHANGED. I remember, not so long ago, Mariano being told to do his homework. And Mariano, as well as all the smart guys on our team, was shocked. How could we expect smart guys to join and help? Now this period is over, and this is good. We are already seeing the difference: Clément, Ronie, and others will follow. I hope that we will be able to edit a book based on Clément's blog posts and other information, but this is taking time. RMoD also invested in the build infrastructure, and in the fact that everybody can compile a VM, to attract people. We proposed to help with the server infrastructure to push commit validation, and we will see what can be done.

> Every bit of performance makes a big impact for us - but to be honest,
> what we would value even more than performance would be ...
> robustness. I mean, *really* robust. As in a freaking ROCK.

This is why I would like to push more regression testing. Göran, do you have a regression system for your deployment? I wanted to check the work that Jan Vrany proposed to us more than a year ago.

> Here we do NOT have special 3DICC plugins so no, here we blame Cog or
> more likely, Socket plugin. Often? No, but "sometimes" is often enough
> to be a big problem.
> In fact, a whole new networking layer would make
> sense to me.

To me that is normal: jumping over the dirt catches up with you after a while; this is a law of nature. Now the point is how we can reverse the tendency, as we have started to do. Do you have money to put on the table for that? Or else, do you pray enough to see it happening magically :) Noury and Luc were so fed up with this code that they started to rewrite it and test it, but they got exhausted after a while, because testing a network layer is exhausting work. Now these are typical points that we want to discuss within the Pharo consortium. Esteban will work on the 64-bit port; this is on his official (Inria) roadmap. But again, we will play it with people that want to play it. 1000/2000 Euros to be in the consortium is not even a trip to the US or Germany.

Stef
In reply to this post by Clément Béra
On 09/16/2014 06:34 AM, Clément Bera wrote:
> The book that explains the best how to implement a high performance VM
> for Smalltalk and why is Urs Holzle phd
> <http://www.cs.ucsb.edu/~urs/oocsb/self/papers/urs-thesis.html>.

Agreed. This is good (almost required) reading for anyone who wants to understand how to implement dynamic languages in a way that is not slow, and to understand why the performance of dynamic languages does not need to be much slower than that of statically-typed languages. After reading this thesis, it's also good to remember that it describes work done over 20 years ago, that hardware has changed a great deal in the interim, and to think hard about what improvements might be made today over the techniques that Urs and the Self team came up with back then.

Regards,

-Martin
In reply to this post by Eliot Miranda-2
On 15/09/2014 18:23, Eliot Miranda wrote:
> I find this whole discussion depressing. It seems people would rather
> put their energy in chasing quick fixes or other technologies instead of
> contributing to the work that is being done in the existing VM. People
> discuss using LLVM as if the code generation capabilities inside Cog
> were somehow poor or have no chance of competing. Spur is around twice
> as fast as the current memory manager, has much better support for the
> FFI. Clément and I, now with help from Ronie, are making excellent
> progress towards an adaptive optimizer/speculative inliner that will
> give us similar performance to V8 (the Google JavaScript VM, lead by
> Lars Bak, who implemented the HotSpot VM (Smalltalk and Java)) et al.
> We are trying to get person-power for a high-quality FFI and have a
> prototype for a non-blocking VM. When we succeed C won't be any better
> and so it won't be an interesting target. One will be able to program
> entirely in Smalltalk and get excellent performance. But we need
> effort. Collaboration.

Hi Eliot,

Not everybody has the necessary skills to help and contribute to your work; my assembly skills are really rusty and outdated now (... a little frustration here :( ...), but IMHO your work is invaluable to the Pharo and Smalltalk community. Just to mention it, I noticed a 30 to 50% gain in a small benchmark I wrote for fun recently (a very dumb chess pawn move generator) with the last Spur VM. I was shocked :)

64 bits, 2x performance, and a non-blocking (or multi-threaded?) VM are giant steps forward that make it possible for Pharo Smalltalk to compete with mainstream technologies.

Regards,

Alain