Hi to all, I am a Smalltak VM enthusiast. I found this list an interesting
reading. But I was wandering a few things. If you could rewrite the Squeak's object memory from scratch, in order to simplify and accelerate exupery, what would you do? -The first thing that come to my mind is SmallInts, Squeaks uses 1 tagged integers, but what would make your life easier? -What about floats?, would you tag them instead of boxing them? -Arrays,#at: and #at:put:, How would you implement arrays, in order to speed up #at: and #at:put: -bools? -The garbage collector, what solution is best suited for exupery? -sends and blocks?, what about them? Anything else? Thank you very much for your efforts of making exupery. _______________________________________________ Exupery mailing list [hidden email] http://lists.squeakfoundation.org/cgi-bin/mailman/listinfo/exupery |
Guillermo Adrián Molina writes:
> Hi to all, I am a Smalltak VM enthusiast. I found this list an interesting > reading. But I was wandering a few things. If you could rewrite the > Squeak's object memory from scratch, in order to simplify and accelerate > exupery, what would you do? My list would be: * Integer tags as 0. * Use a card marking memory barrier * Get rid of short classes to simplify type checking. * Store the size of variable objects in a word at a known position not in one of 2 locations. This is to simplify range checking. * A much faster GC which may be needed once Exupery starts providing a good general speed improvement. But Exupery works with Squeak's current object memory, any changes to the object memory would involve making similar changes to Exupery. I'm not sure how much value changing integer tagging would bring. "a := a + 1" is now compiled to the same instruction sequence as a 0 tag would use. There is a single instruction overhead for tagging with 1 not 0 in most cases. The trick is to realise that if you don't detag one of your arguments you will not need to retag the result and that constants can be tag manipulated at compile time. I'm happy with boxed floats. Personally I think tagged floats are likely to be slower than a good optimizing boxed float solution. I'd rather have floats stored in float arrays then optimism away the boxing and deboxing. The problem with tagged floats is you need to detag them and that's much more expensive than tag manipulation on integers which is annoying as modern hardware has about the same floating point bandwidth as integer bandwidth. Booleans as proper objects is fine. There is one true object and one false object so there's no need to create an object which is the expensive part. And again, it's easy to remove intermediate boolean values from the code. I wrote an article for my Smalltalk Solutions presentation that covers what Exupery currently does and analyses the code it produces which should be worthwhile reading if you want to figure out out expensive the operations are. http://goran.krampe.se/exuperyDesign.pdf May I ask what you're trying to do? Practically, it's hard to evaluate how important any object memory changes will be to Exupery because often it's possible to optimism away the overhead in many cases. Until Exupery has a high quality optimizer it's impossible to do experiments to see what overhead is left. Bryce _______________________________________________ Exupery mailing list [hidden email] http://lists.squeakfoundation.org/cgi-bin/mailman/listinfo/exupery |
> Guillermo Adrián Molina writes:
Thanks a lot for the info.
> > Hi to all, I am a Smalltak VM enthusiast. I found this list an > interesting > > reading. But I was wandering a few things. If you could rewrite the > > Squeak's object memory from scratch, in order to simplify and > accelerate > > exupery, what would you do? > > My list would be: > * Integer tags as 0. > * Use a card marking memory barrier > * Get rid of short classes to simplify type checking. > * Store the size of variable objects in a word at a known position > not in one of 2 locations. This is to simplify range checking. > * A much faster GC which may be needed once Exupery starts providing > a good general speed improvement. > > But Exupery works with Squeak's current object memory, any changes > to the object memory would involve making similar changes to Exupery. > > I'm not sure how much value changing integer tagging would bring. > "a := a + 1" is now compiled to the same instruction sequence as a > 0 tag would use. There is a single instruction overhead for tagging > with 1 not 0 in most cases. The trick is to realise that if you don't > detag one of your arguments you will not need to retag the result and > that constants can be tag manipulated at compile time. > > I'm happy with boxed floats. Personally I think tagged floats are > likely to be slower than a good optimizing boxed float solution. I'd > rather have floats stored in float arrays then optimism away the > boxing and deboxing. The problem with tagged floats is you need to > detag them and that's much more expensive than tag manipulation on > integers which is annoying as modern hardware has about the same > floating point bandwidth as integer bandwidth. > > Booleans as proper objects is fine. There is one true object and one > false object so there's no need to create an object which is the > expensive part. And again, it's easy to remove intermediate boolean > values from the code. > > I wrote an article for my Smalltalk Solutions presentation that > covers what Exupery currently does and analyses the code it produces > which should be worthwhile reading if you want to figure out out > expensive the operations are. > > http://goran.krampe.se/exuperyDesign.pdf > > May I ask what you're trying to do? Practically, it's hard to evaluate > how important any object memory changes will be to Exupery because > often it's possible to optimism away the overhead in many cases. Until > Exupery has a high quality optimizer it's impossible to do experiments > to see what overhead is left. > > Bryce > _______________________________________________ > Exupery mailing list > [hidden email] > http://lists.squeakfoundation.org/cgi-bin/mailman/listinfo/exupery > I asked all that because I have read many papers, mails an all that. In many of them, I saw that exupery had to be more complex in order to comply with the Squeak's implementation. I am designing a smalltalk VM, so I have the freedom to choose. In my design I generate assembler directly. Every CompiledMethod has x86 code in it. I use SmaCC and Refactoring Browser, with the Squeak's new ClosureCompiler. With that (and some Class tweaking) I generate assembler (.s text, no code yet). Then I compile that with my VM. So, i don't need an interpreter. Just a few functions for method finding and primitives. Exupery (I hope) will replace my ParseNode to assembler implementation. It will give me translation to machine code, optimization, PIC's, and all the wonders that it includes. I started my design with 1 tagged SmallInts, basically because much of my ST knowledge comes from Squeak, and because I had no idea, how a 0 tagged smallint would be implemented. In my design, as I was focusing on simplicity of the code, an object header has: object: _______________________________________________ Exupery mailing list [hidden email] http://lists.squeakfoundation.org/cgi-bin/mailman/listinfo/exupery |
>> Guillermo Adrián Molina writes:
(Sorry, something happened to my web mail)
>> > Hi to all, I am a Smalltak VM enthusiast. I found this list an >> interesting >> > reading. But I was wandering a few things. If you could rewrite the >> > Squeak's object memory from scratch, in order to simplify and >> accelerate >> > exupery, what would you do? >> >> My list would be: >> * Integer tags as 0. >> * Use a card marking memory barrier >> * Get rid of short classes to simplify type checking. >> * Store the size of variable objects in a word at a known position >> not in one of 2 locations. This is to simplify range checking. >> * A much faster GC which may be needed once Exupery starts providing >> a good general speed improvement. >> >> But Exupery works with Squeak's current object memory, any changes >> to the object memory would involve making similar changes to Exupery. >> >> I'm not sure how much value changing integer tagging would bring. >> "a := a + 1" is now compiled to the same instruction sequence as a >> 0 tag would use. There is a single instruction overhead for tagging >> with 1 not 0 in most cases. The trick is to realise that if you don't >> detag one of your arguments you will not need to retag the result and >> that constants can be tag manipulated at compile time. >> >> I'm happy with boxed floats. Personally I think tagged floats are >> likely to be slower than a good optimizing boxed float solution. I'd >> rather have floats stored in float arrays then optimism away the >> boxing and deboxing. The problem with tagged floats is you need to >> detag them and that's much more expensive than tag manipulation on >> integers which is annoying as modern hardware has about the same >> floating point bandwidth as integer bandwidth. >> >> Booleans as proper objects is fine. There is one true object and one >> false object so there's no need to create an object which is the >> expensive part. And again, it's easy to remove intermediate boolean >> values from the code. >> >> I wrote an article for my Smalltalk Solutions presentation that >> covers what Exupery currently does and analyses the code it produces >> which should be worthwhile reading if you want to figure out out >> expensive the operations are. >> >> http://goran.krampe.se/exuperyDesign.pdf >> >> May I ask what you're trying to do? Practically, it's hard to evaluate >> how important any object memory changes will be to Exupery because >> often it's possible to optimism away the overhead in many cases. Until >> Exupery has a high quality optimizer it's impossible to do experiments >> to see what overhead is left. >> >> Bryce >> _______________________________________________ >> Exupery mailing list >> [hidden email] >> http://lists.squeakfoundation.org/cgi-bin/mailman/listinfo/exupery >> > Thanks a lot for the info. > I asked all that because I have read many papers, mails an all that. In > many of them, I saw that exupery had to be more complex in order to comply > with the Squeak's implementation. > I am designing a smalltalk VM, so I have the freedom to choose. In my > design I generate assembler directly. Every CompiledMethod has x86 code in > it. I use SmaCC and Refactoring Browser, with the Squeak's new > ClosureCompiler. With that (and some Class tweaking) I generate assembler > (.s text, no code yet). Then I compile that with my VM. So, i don't need > an interpreter. Just a few functions for method finding and primitives. > Exupery (I hope) will replace my ParseNode to assembler implementation. It > will give me translation to machine code, optimization, PIC's, and all the > wonders that it includes. > I started my design with 1 tagged SmallInts, basically because much of my > ST knowledge comes from Squeak, and because I had no idea, how a 0 tagged > smallint would be implemented. > In my design, as I was focusing on simplicity of the code, an object > header has: > object: > object: .int classreference /* no short classes */ .int objectsizeinbytes .int info (hash, and everything else) _object: /* the real object starts here */ I haven't implemented the GC yet. So, as you can see (from your mail) >> * Integer tags as 0. Working on it, rewriting everithing to achieve that >> * Use a card marking memory barrier When the time comes >> * Get rid of short classes to simplify type checking. Already done >> * Store the size of variable objects in a word at a known position >> not in one of 2 locations. This is to simplify range checking. Already done >> * A much faster GC which may be needed once Exupery starts providing >> a good general speed improvement. When the time comes I am happy to discover that I am not so far away from the ideal solution. Right now, I don't do any optimization. well, I do inline the same methods as squeak, and to the extend that squeak does. For example, in a #to:do: message, I use no closures, but I still do calls to +, <=, and all that. My numbers are: bench #1 [ 1 to: 100000000 do: [ :i | i + 1. ] ] timeToRun Squeak 34.233 s My VM: 2.250 s bench #2 oc := OrderedCollection new. [ 1 to: 10000000 do: [ :i | oc add: i. ] ] timeToRun Squeak 17.418 s My VM: 4.600 s Later on, when I start using exupery, I will need help from the list, to adapt it. Cheers, Guille _______________________________________________ Exupery mailing list [hidden email] http://lists.squeakfoundation.org/cgi-bin/mailman/listinfo/exupery |
Free forum by Nabble | Edit this page |