New implementation of squeak

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

New implementation of squeak

Guillermo Adrián Molina
Hi to all, I am a Smalltak VM enthusiast. I found this list an interesting
reading. But I was wandering a few things. If you could rewrite the
Squeak's object memory from scratch, in order to simplify and accelerate
exupery, what would you do?
-The first thing that come to my mind is SmallInts, Squeaks uses 1
tagged integers, but what would make your life easier?
-What about floats?, would you tag them instead of boxing them?
-Arrays,#at: and #at:put:, How would you implement arrays, in order to
speed up #at: and #at:put:
-bools?
-The garbage collector, what solution is best suited for exupery?
-sends and blocks?, what about them?

Anything else?

Thank you very much for your efforts of making exupery.




_______________________________________________
Exupery mailing list
[hidden email]
http://lists.squeakfoundation.org/cgi-bin/mailman/listinfo/exupery
Reply | Threaded
Open this post in threaded view
|

New implementation of squeak

Bryce Kampjes
Guillermo Adrián Molina writes:
 > Hi to all, I am a Smalltak VM enthusiast. I found this list an interesting
 > reading. But I was wandering a few things. If you could rewrite the
 > Squeak's object memory from scratch, in order to simplify and accelerate
 > exupery, what would you do?

My list would be:
 * Integer tags as 0.
 * Use a card marking memory barrier
 * Get rid of short classes to simplify type checking.
 * Store the size of variable objects in a word at a known position
   not in one of 2 locations. This is to simplify range checking.
 * A much faster GC which may be needed once Exupery starts providing
   a good general speed improvement.

But Exupery works with Squeak's current object memory, any changes
to the object memory would involve making similar changes to Exupery.

I'm not sure how much value changing integer tagging would bring.
"a := a + 1" is now compiled to the same instruction sequence as a
0 tag would use. There is a single instruction overhead for tagging
with 1 not 0 in most cases. The trick is to realise that if you don't
detag one of your arguments you will not need to retag the result and
that constants can be tag manipulated at compile time.

I'm happy with boxed floats. Personally I think tagged floats are
likely to be slower than a good optimizing boxed float solution. I'd
rather have floats stored in float arrays then optimism away the
boxing and deboxing. The problem with tagged floats is you need to
detag them and that's much more expensive than tag manipulation on
integers which is annoying as modern hardware has about the same
floating point bandwidth as integer bandwidth.

Booleans as proper objects is fine. There is one true object and one
false object so there's no need to create an object which is the
expensive part. And again, it's easy to remove intermediate boolean
values from the code.

I wrote an article for my Smalltalk Solutions presentation that
covers what Exupery currently does and analyses the code it produces
which should be worthwhile reading if you want to figure out out
expensive the operations are.

  http://goran.krampe.se/exuperyDesign.pdf

May I ask what you're trying to do? Practically, it's hard to evaluate
how important any object memory changes will be to Exupery because
often it's possible to optimism away the overhead in many cases. Until
Exupery has a high quality optimizer it's impossible to do experiments
to see what overhead is left.

Bryce
_______________________________________________
Exupery mailing list
[hidden email]
http://lists.squeakfoundation.org/cgi-bin/mailman/listinfo/exupery
Reply | Threaded
Open this post in threaded view
|

Re: New implementation of squeak

Guillermo Adrián Molina
> Guillermo Adrián Molina writes:
>  > Hi to all, I am a Smalltak VM enthusiast. I found this list an
> interesting
>  > reading. But I was wandering a few things. If you could rewrite the
>  > Squeak's object memory from scratch, in order to simplify and
> accelerate
>  > exupery, what would you do?
>
> My list would be:
>  * Integer tags as 0.
>  * Use a card marking memory barrier
>  * Get rid of short classes to simplify type checking.
>  * Store the size of variable objects in a word at a known position
>    not in one of 2 locations. This is to simplify range checking.
>  * A much faster GC which may be needed once Exupery starts providing
>    a good general speed improvement.
>
> But Exupery works with Squeak's current object memory, any changes
> to the object memory would involve making similar changes to Exupery.
>
> I'm not sure how much value changing integer tagging would bring.
> "a := a + 1" is now compiled to the same instruction sequence as a
> 0 tag would use. There is a single instruction overhead for tagging
> with 1 not 0 in most cases. The trick is to realise that if you don't
> detag one of your arguments you will not need to retag the result and
> that constants can be tag manipulated at compile time.
>
> I'm happy with boxed floats. Personally I think tagged floats are
> likely to be slower than a good optimizing boxed float solution. I'd
> rather have floats stored in float arrays then optimism away the
> boxing and deboxing. The problem with tagged floats is you need to
> detag them and that's much more expensive than tag manipulation on
> integers which is annoying as modern hardware has about the same
> floating point bandwidth as integer bandwidth.
>
> Booleans as proper objects is fine. There is one true object and one
> false object so there's no need to create an object which is the
> expensive part. And again, it's easy to remove intermediate boolean
> values from the code.
>
> I wrote an article for my Smalltalk Solutions presentation that
> covers what Exupery currently does and analyses the code it produces
> which should be worthwhile reading if you want to figure out out
> expensive the operations are.
>
>   http://goran.krampe.se/exuperyDesign.pdf
>
> May I ask what you're trying to do? Practically, it's hard to evaluate
> how important any object memory changes will be to Exupery because
> often it's possible to optimism away the overhead in many cases. Until
> Exupery has a high quality optimizer it's impossible to do experiments
> to see what overhead is left.
>
> Bryce
> _______________________________________________
> Exupery mailing list
> [hidden email]
> http://lists.squeakfoundation.org/cgi-bin/mailman/listinfo/exupery
>
Thanks a lot for the info.
I asked all that because I have read many papers, mails an all that. In
many of them, I saw that exupery had to be more complex in order to comply
with the Squeak's implementation.
I am designing a smalltalk VM, so I have the freedom to choose. In my
design I generate assembler directly. Every CompiledMethod has x86 code in
it. I use SmaCC and Refactoring Browser, with the Squeak's new
ClosureCompiler. With that (and some Class tweaking) I generate assembler
(.s text, no code yet). Then I compile that with my VM. So, i don't need
an interpreter. Just a few functions for method finding and primitives.
Exupery (I hope) will replace my ParseNode to assembler implementation. It
will give me translation to machine code, optimization, PIC's, and all the
wonders that it includes.
I started my design with 1 tagged SmallInts, basically because much of my
ST knowledge comes from Squeak, and because I had no idea, how a 0 tagged
smallint would be implemented.
In my design, as I was focusing on simplicity of the code, an object
header has:
object:


_______________________________________________
Exupery mailing list
[hidden email]
http://lists.squeakfoundation.org/cgi-bin/mailman/listinfo/exupery
Reply | Threaded
Open this post in threaded view
|

Re: New implementation of squeak

Guillermo Adrián Molina
>> Guillermo Adrián Molina writes:
>>  > Hi to all, I am a Smalltak VM enthusiast. I found this list an
>> interesting
>>  > reading. But I was wandering a few things. If you could rewrite the
>>  > Squeak's object memory from scratch, in order to simplify and
>> accelerate
>>  > exupery, what would you do?
>>
>> My list would be:
>>  * Integer tags as 0.
>>  * Use a card marking memory barrier
>>  * Get rid of short classes to simplify type checking.
>>  * Store the size of variable objects in a word at a known position
>>    not in one of 2 locations. This is to simplify range checking.
>>  * A much faster GC which may be needed once Exupery starts providing
>>    a good general speed improvement.
>>
>> But Exupery works with Squeak's current object memory, any changes
>> to the object memory would involve making similar changes to Exupery.
>>
>> I'm not sure how much value changing integer tagging would bring.
>> "a := a + 1" is now compiled to the same instruction sequence as a
>> 0 tag would use. There is a single instruction overhead for tagging
>> with 1 not 0 in most cases. The trick is to realise that if you don't
>> detag one of your arguments you will not need to retag the result and
>> that constants can be tag manipulated at compile time.
>>
>> I'm happy with boxed floats. Personally I think tagged floats are
>> likely to be slower than a good optimizing boxed float solution. I'd
>> rather have floats stored in float arrays then optimism away the
>> boxing and deboxing. The problem with tagged floats is you need to
>> detag them and that's much more expensive than tag manipulation on
>> integers which is annoying as modern hardware has about the same
>> floating point bandwidth as integer bandwidth.
>>
>> Booleans as proper objects is fine. There is one true object and one
>> false object so there's no need to create an object which is the
>> expensive part. And again, it's easy to remove intermediate boolean
>> values from the code.
>>
>> I wrote an article for my Smalltalk Solutions presentation that
>> covers what Exupery currently does and analyses the code it produces
>> which should be worthwhile reading if you want to figure out out
>> expensive the operations are.
>>
>>   http://goran.krampe.se/exuperyDesign.pdf
>>
>> May I ask what you're trying to do? Practically, it's hard to evaluate
>> how important any object memory changes will be to Exupery because
>> often it's possible to optimism away the overhead in many cases. Until
>> Exupery has a high quality optimizer it's impossible to do experiments
>> to see what overhead is left.
>>
>> Bryce
>> _______________________________________________
>> Exupery mailing list
>> [hidden email]
>> http://lists.squeakfoundation.org/cgi-bin/mailman/listinfo/exupery
>>
> Thanks a lot for the info.
> I asked all that because I have read many papers, mails an all that. In
> many of them, I saw that exupery had to be more complex in order to comply
> with the Squeak's implementation.
> I am designing a smalltalk VM, so I have the freedom to choose. In my
> design I generate assembler directly. Every CompiledMethod has x86 code in
> it. I use SmaCC and Refactoring Browser, with the Squeak's new
> ClosureCompiler. With that (and some Class tweaking) I generate assembler
> (.s text, no code yet). Then I compile that with my VM. So, i don't need
> an interpreter. Just a few functions for method finding and primitives.
> Exupery (I hope) will replace my ParseNode to assembler implementation. It
> will give me translation to machine code, optimization, PIC's, and all the
> wonders that it includes.
> I started my design with 1 tagged SmallInts, basically because much of my
> ST knowledge comes from Squeak, and because I had no idea, how a 0 tagged
> smallint would be implemented.
> In my design, as I was focusing on simplicity of the code, an object
> header has:
> object:
>
(Sorry, something happened to my web mail)
object:
.int classreference /* no short classes */
.int objectsizeinbytes
.int info (hash, and everything else)
_object: /* the real object starts here */

I haven't implemented the GC yet.
So, as you can see (from your mail)
>>  * Integer tags as 0.
Working on it, rewriting everithing to achieve that
>>  * Use a card marking memory barrier
When the time comes
>>  * Get rid of short classes to simplify type checking.
Already done
>>  * Store the size of variable objects in a word at a known position
>>    not in one of 2 locations. This is to simplify range checking.
Already done
>>  * A much faster GC which may be needed once Exupery starts providing
>>    a good general speed improvement.
When the time comes

I am happy to discover that I am not so far away from the ideal solution.
Right now, I don't do any optimization. well, I do inline the same methods
as squeak, and to the extend that squeak does. For example, in a #to:do:
message, I use no closures, but I still do calls to +, <=, and all that.
My numbers are:

bench #1
[ 1 to: 100000000 do: [ :i |
        i + 1.
] ] timeToRun

Squeak 34.233 s
My VM: 2.250 s

bench #2
oc := OrderedCollection new.
[ 1 to: 10000000 do: [ :i |
        oc add: i.
] ] timeToRun

Squeak 17.418 s
My VM: 4.600 s

Later on, when I start using exupery, I will need help from the list, to
adapt it.

Cheers, Guille

_______________________________________________
Exupery mailing list
[hidden email]
http://lists.squeakfoundation.org/cgi-bin/mailman/listinfo/exupery