Smalltalk › Squeak › Squeak - Dev

[squeak-dev] MethodDictionary implementation

_‹ Previous Topic Next Topic _›

Classic

List

Threaded

4 messages Options

Ralph Boland

[squeak-dev] MethodDictionary implementation

I have been looking at MethodDictionary.
It seems to have very little to do with Methods.
It seems to me that a better design would have been
to have a class VarDictionary which does not use
Associations (similar to MethodDictionary).
MethodDictionary would inherit from VarDictionary
and add the couple of additional methods it needs.
VarDictionary would then be available for others to use.

Better still would be to have MethodDictionary not
be a variable class at all but simply use an additional
variable "values". Again MethodDictionary should
really inherit from a class that provides this service.
I would use this service in my own code.

Ralph Boland

Igor Stasenko

Re: [squeak-dev] MethodDictionary implementation

2009/7/20 Ralph Boland <[hidden email]>:

> I have been looking at MethodDictionary.
> It seems to have very little to do with Methods.
> It seems to me that a better design would have been
> to have a class VarDictionary which does not use
> Associations (similar to MethodDictionary).
> MethodDictionary would inherit from VarDictionary
> and add the couple of additional methods it needs.
> VarDictionary would then be available for others to use.
>
> Better still would be to have MethodDictionary not
> be a variable class at all but simply use an additional
> variable "values". Again MethodDictionary should
> really inherit from a class that provides this service.
> I would use this service in my own code.
>

This will increase a method lookup times & lengthen the path for
accessing the methods by VM.
Don't forget, that MethodDictionary having such special format only
because to satisfy VM.

> Ralph Boland
>
>

--
Best regards,
Igor Stasenko AKA sig.

Eliot Miranda-2

Re: [squeak-dev] MethodDictionary implementation

In reply to this post by Ralph Boland

Hi Ralph,

On Sun, Jul 19, 2009 at 10:17 PM, Ralph Boland <[hidden email]> wrote:

I have been looking at MethodDictionary.
It seems to have very little to do with Methods.
It seems to me that a better design would have been
to have a class VarDictionary which does not use
Associations (similar to MethodDictionary).
MethodDictionary would inherit from VarDictionary
and add the couple of additional methods it needs.
VarDictionary would then be available for others to use.

Better still would be to have MethodDictionary not
be a variable class at all but simply use an additional
variable "values". Again MethodDictionary should
really inherit from a class that provides this service.
I would use this service in my own code.

Ralph Boland

Actually I think the best representation for MethodDictionary is as a flat pair-wise sequence of selector and method, sorted by selector identityHash.

Advantages:

One saves significant space. There is one less object per method dictionary (no values array) and no empty space in the sequence. We shrank the image by about 8% when we did this for VisualWorks.

Small method dictionaries (say size <= 16 entries) are linearly scanned without needing to fetch the identityHash of the selector. This is faster for small dictionaries because the cost of three memory references in different places (selector's hash, dictionary's selector sequence, dictionary's method sequence) is higher on modern machines than the one scan of the dictionary's contents sequence.

nil looks ot the VM like an empty MethodDictionary so classes with no methods (there are quite a few) don't need a dictionary object at all.

Disadvantages:

Large dictionaries are looked up using binary search, which can be slower, because one has to fetch the hash of the selector being probed for to comp[are it with the message selector's hash. But this isn't the common case.

Dictionaries must be grown whenever a selector/method par is added or shrunk whenever one is removed, but this is rare (only happens when compiling a new method or removing an old) and can be e.g. batched if creating a new dictionary (e.g. if loading code form some binary stream).

Something we didn't yet do in VisualWorks is to make the identityHash of a Symbol dependent on its contents. If this is done then within images with the same hash width dictionaries will always have the same order because selectors will have the same hashes, which means loading binary code can be faster since the order of the dictionary on disc is the same as in memory.

I plan to implement the above when doing the new image format work later this year.

2¢

best

Eliot

Andres Valloud-4

Re: [squeak-dev] MethodDictionary implementation

As measured directly on HPS, you can do linear search in method
dictionaries up to 64 entries in size. Only after 64 entries does
binary search pay off. This is consistent with the numbers derived by
the Quake folks. The issue with binary search is that each left/right
branch behaves at random for random selector lookups, and is thus
unpredictable. Also, linear search works great with cache line reads,
whereas random reads in the dictionary's key array do not (basically,
only one key is used per cache line read mod the overlap, etc).

You can get another ~2x or so speedup in lookup speed by making method
dictionaries an actual hashed collection with well-chosen prime sizes.
Then, you use % in the C method lookup. As far as I could see, it does
not seem to matter much that the VM uses % because the advantage is that
using % with a prime table size (as opposed to & and a power of two
table size) ensures good distribution properties. Consequently, the
vast majority of selectors are found using just one probe attempt. IME,
the first probe should be outside the scanning loops because some
compilers produce better code that way.

IIRC, the cost I measured for a biggish VW image was about 400kb. This
would be a bit lower than the 8% reported by Eliot (but back then I am
sure images were considerably smaller).

Andres.

PS: I am not exactly sure re: 2x, but I am certain it was very
significant with:

| worstCase |
worstCase := Object withAllSubclasses reversed.
worstCase do: [:each | each yourself]

The run time was way lower with hashed method dictionaries. If it was
not 2x faster, it was 60% faster... that kind of number. The
implementation I did used the first slot of the method dictionary to tag
which kind of method dictionary it is. If it's the small integer zero,
then it's a hashed method dictionary. Otherwise, it's the sorted
collection kind. Then you can mix/match method dictionaries all you
want on the image side. Converting all the method dictionaries from
"compact" to "sparse" using become: took about 0.4s. I used a load
factor target of about 0.75, IIRC...

Eliot Miranda wrote:

> Hi Ralph,
>
> On Sun, Jul 19, 2009 at 10:17 PM, Ralph Boland <[hidden email]<mailto:[hidden email]>> wrote:
> I have been looking at MethodDictionary.
> It seems to have very little to do with Methods.
> It seems to me that a better design would have been
> to have a class VarDictionary which does not use
> Associations (similar to MethodDictionary).
> MethodDictionary would inherit from VarDictionary
> and add the couple of additional methods it needs.
> VarDictionary would then be available for others to use.
>
> Better still would be to have MethodDictionary not
> be a variable class at all but simply use an additional
> variable "values". Again MethodDictionary should
> really inherit from a class that provides this service.
> I would use this service in my own code.
>
> Ralph Boland
>
> Actually I think the best representation for MethodDictionary is as a flat pair-wise sequence of selector and method, sorted by selector identityHash.
>
> Advantages:
>
> One saves significant space. There is one less object per method dictionary (no values array) and no empty space in the sequence. We shrank the image by about 8% when we did this for VisualWorks.
>
> Small method dictionaries (say size <= 16 entries) are linearly scanned without needing to fetch the identityHash of the selector. This is faster for small dictionaries because the cost of three memory references in different places (selector's hash, dictionary's selector sequence, dictionary's method sequence) is higher on modern machines than the one scan of the dictionary's contents sequence.
>
> nil looks ot the VM like an empty MethodDictionary so classes with no methods (there are quite a few) don't need a dictionary object at all.
>
> Disadvantages:
> Large dictionaries are looked up using binary search, which can be slower, because one has to fetch the hash of the selector being probed for to comp[are it with the message selector's hash. But this isn't the common case.
>
> Dictionaries must be grown whenever a selector/method par is added or shrunk whenever one is removed, but this is rare (only happens when compiling a new method or removing an old) and can be e.g. batched if creating a new dictionary (e.g. if loading code form some binary stream).
>
>
> Something we didn't yet do in VisualWorks is to make the identityHash of a Symbol dependent on its contents. If this is done then within images with the same hash width dictionaries will always have the same order because selectors will have the same hashes, which means loading binary code can be faster since the order of the dictionary on disc is the same as in memory.
>
> I plan to implement the above when doing the new image format work later this year.
>
> 2¢
>
> best
> Eliot
>
>