[squeak-dev] Rearranging variables in interpreter struct (lil speedup)

Igor Stasenko

I was just wondering: what if I place the variables in the interpreter
struct in a specific order, rather than in the order the code
generator produces?

Since the speed impact is expected to be very small (if any), I used
the following code to measure the difference (timeToRun answers
milliseconds):

[ 5 timesRepeat: [ 1 tinyBenchmarks] ] timeToRun

This is HydraVM with no attempt to arrange the interpreter's ivars:

28921
28863
28863

This build shows a small slowdown (I placed all the large arrays at
the tail, and #stackTop and #successFlag at the head):

28938
28940

In this build I placed the methodCache ivar first:

28279
28264
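
For readers who want to picture the three builds, here is a minimal C sketch of what reordering does to field offsets. The field names stackTop, successFlag and methodCache come from the post; the element types, the array size, and the 64-byte cache line are assumptions for illustration only, and the real interpreter struct has many more members.

```c
#include <stddef.h>

/* Assumed values for illustration; the real VM's sizes differ. */
#define CACHE_LINE_BYTES 64
#define METHOD_CACHE_ENTRIES 1024

/* Layout A: hot scalars at the head, the big array at the tail. */
struct InterpA {
    long stackTop;
    long successFlag;
    long methodCache[METHOD_CACHE_ENTRIES];
};

/* Layout B: methodCache comes first. */
struct InterpB {
    long methodCache[METHOD_CACHE_ENTRIES];
    long stackTop;
    long successFlag;
};

/* Cache line index of a byte offset, assuming the struct itself
   starts on a cache line boundary. */
static size_t cache_line_of(size_t offset) {
    return offset / CACHE_LINE_BYTES;
}

size_t a_method_cache_offset(void) {
    return offsetof(struct InterpA, methodCache);
}

size_t b_method_cache_offset(void) {
    return offsetof(struct InterpB, methodCache);
}

size_t b_stack_top_line(void) {
    return cache_line_of(offsetof(struct InterpB, stackTop));
}
```

In layout B the array starts at offset 0, so its address is the struct's base address, while the scalars end up far from the head. Whether that explains the timings is a guess; the sketch only shows how the offsets move.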

So the difference is small but noticeable: placing methodCache first
gives a speedup of roughly 2%.
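
The percentage can be checked from the timings above with a small helper. This is only a sketch of the arithmetic; the function names are made up for illustration.

```c
#include <stddef.h>

/* Mean of n timings in milliseconds. */
double mean_ms(const long *timings, size_t n) {
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        sum += (double)timings[i];
    }
    return sum / (double)n;
}

/* Percentage by which `candidate` is faster than `baseline`
   (negative means slower). */
double percent_faster(double baseline, double candidate) {
    return 100.0 * (baseline - candidate) / baseline;
}
```

With the numbers posted here, the baseline mean is about 28882 ms and the methodCache-first mean is 28271.5 ms, which works out to about 2.1% faster. The "slowdown" build averages 28939 ms, about 0.2% slower, which is roughly the same size as the spread between the baseline runs.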

I'm not really sure whether this is worth experimenting with at all,
and I lack the knowledge of CPU/compiler details needed to predict
whether these changes will have the same effect on other CPUs (mine is
an AMD Athlon series).

If you have any ideas about these changes, or about any other
optimizations that might improve speed, please feel free to share
details/guidelines.

--
Best regards,
Igor Stasenko AKA sig.

johnmci

Re: [squeak-dev] Rearranging variables in interpreter struct (lil speedup)

I wrote the algorithm that is there now, which sorts the variables by
usage, since I couldn't come up with a nicer metric; as you suspected,
cache line placement can be important. Eight years back (or so) it
wasn't important; now it is. We noticed a 2% gain in a VM build from
adding a single instance variable, which changed the placement of
other variables on a cache line.
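
A minimal C sketch of how one added variable can shift its neighbours across a cache line boundary. All field names here are hypothetical, and a 64-byte line and a line-aligned struct are assumed.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_BYTES 64  /* assumed line size */

struct Before {
    int64_t other[7];  /* hypothetical filler, offsets 0..55 */
    int64_t hotA;      /* offset 56: last slot of line 0 */
    int64_t hotB;      /* offset 64: first slot of line 1 */
};

struct After {
    int64_t added;     /* the single new variable */
    int64_t other[7];
    int64_t hotA;      /* offset 64 */
    int64_t hotB;      /* offset 72 */
};

/* True if two byte offsets fall on the same cache line. */
static int same_line(size_t a, size_t b) {
    return a / CACHE_LINE_BYTES == b / CACHE_LINE_BYTES;
}

int hot_pair_shares_line_before(void) {
    return same_line(offsetof(struct Before, hotA),
                     offsetof(struct Before, hotB));
}

int hot_pair_shares_line_after(void) {
    return same_line(offsetof(struct After, hotA),
                     offsetof(struct After, hotB));
}
```

Here the two hot fields straddle a line boundary before the change and share a line after it. The same shift could just as easily split a pair that used to sit together.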

Hand sorting couldn't hurt; I just didn't want to do that at the time.

Isn't everything Intel now anyway? Alas...

The other magic lurking there, by the way: if a global is shared
between different routines, and after the merging of methods and
inlining is done the variable(s) end up being used by only a single
routine, then the variable is moved out of the foo structure and made
a local. There is a method that lets you deny that behavior for a
variable/routine, if I remember correctly.

The *most* important place where this was effective was the GC logic.
The GC logic is split across various Smalltalk methods, which all get
inlined together. So on the PowerPC the globals sharing state between
those 4-5 routines would all become local register variables, which
made quite a difference in performance.
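
A rough C sketch of the difference between state kept in the global struct and the same state demoted to a local. The struct fields and function names are hypothetical, not taken from the generated VM source.

```c
#include <stddef.h>

/* Hypothetical stand-in for the global interpreter struct. */
struct Foo {
    long liveCount;
    long freeBytes;
};
struct Foo foo;

/* State kept in the global struct: `marks` could point into `foo`,
   so the compiler may have to load and store liveCount on every
   iteration. */
long count_live_global(const long *marks, size_t n) {
    foo.liveCount = 0;
    for (size_t i = 0; i < n; i++) {
        if (marks[i]) {
            foo.liveCount++;
        }
    }
    return foo.liveCount;
}

/* Same logic with the state as a local: nothing else can alias it,
   so it can stay in a register for the whole loop. */
long count_live_local(const long *marks, size_t n) {
    long liveCount = 0;
    for (size_t i = 0; i < n; i++) {
        if (marks[i]) {
            liveCount++;
        }
    }
    return liveCount;
}
```

Both versions compute the same result; only where the counter lives differs.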

Andreas made a change a few years back to scope local variables to the
block they are used in, rather than defining the locals at the top
level of the routine; that helps register allocation.
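
In C terms the scoping change looks roughly like this. It is a hand-written sketch with made-up function names, not actual code generator output.

```c
/* Locals declared at the top of the routine: every temporary is
   visible (and nominally live) across the whole function body. */
long sum_doubled_top_level(const long *v, long n) {
    long i, tmp, total;
    total = 0;
    for (i = 0; i < n; i++) {
        tmp = v[i] * 2;
        total += tmp;
    }
    return total;
}

/* Locals scoped to the block that uses them: `tmp` exists only
   inside the loop body, which makes its short lifetime explicit. */
long sum_doubled_block_scoped(const long *v, long n) {
    long total = 0;
    for (long i = 0; i < n; i++) {
        long tmp = v[i] * 2;
        total += tmp;
    }
    return total;
}
```

The two functions behave identically; how much the narrower scope helps depends on the compiler.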

Lastly, I have a change set, well, somewhere, that would make Slang
emit, say, void foobar() {} if the method foobar didn't actually
return anything.
This was never accepted because we found certain methods in the
VMMachine defs that supposedly return values but in fact don't return
values.
However, one can never tell what the compiler will do given int
foobar() {} versus void foobar() {}.
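
The two signatures side by side, as a hypothetical sketch. The dummy return value in the int version is an assumption made so the example is well defined; it is not a claim about what Slang emits.

```c
static long callCount;

/* Declared to return int although no caller needs the value;
   a dummy 0 is returned here (an assumption for this sketch). */
int foobar_int(void) {
    callCount++;
    return 0;
}

/* Declared void: the signature itself says there is no result. */
void foobar_void(void) {
    callCount++;
}

/* Accessor so the side effect can be observed. */
long foobar_calls(void) {
    return callCount;
}
```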

In reply to Igor Stasenko's message of Feb 28, 2008, 8:34 PM.

--
===========================================================================
John M. McIntosh <[hidden email]>
Corporate Smalltalk Consulting Ltd. http://www.smalltalkconsulting.com
===========================================================================