Primitive to set an identityHash

Primitive to set an identityHash

Mariano Martinez Peck
 
Hi guys. Because of some work I am doing with proxies, I would like to be able to set a specific identityHash on a proxy instance. I can add this primitive to my VM, but I was wondering whether it could also be of general interest.
Cheers

--
Mariano
http://marianopeck.wordpress.com

Re: Primitive to set an identityHash

Eliot Miranda-2
 


On Tue, Jan 17, 2012 at 8:43 AM, Mariano Martinez Peck <[hidden email]> wrote:
 
Hi guys. Because of some work I am doing with proxies, I would like to be able to set a specific identityHash on a proxy instance. I can add this primitive to my VM, but I was wondering whether it could also be of general interest.

It already exists.  See primitiveSetIdentityHash in InterpreterPrimitives.  Primitive # 161.
 

--
best,
Eliot

Re: Primitive to set an identityHash

Mariano Martinez Peck
 


On Tue, Jan 17, 2012 at 8:04 PM, Eliot Miranda <[hidden email]> wrote:
 


It already exists.  See primitiveSetIdentityHash in InterpreterPrimitives.  Primitive # 161.

Thank you so much, Eliot. You even saved me the time of coding it ;) I should have checked first... I always forget about InterpreterPrimitives, hahaha.
So... Pharoers... do you want the image side of the primitive and some tests? I can provide them if desired (in my opinion, it should be included).

--
Mariano
http://marianopeck.wordpress.com

Re: Primitive to set an identityHash

Eliot Miranda-2
 


On Tue, Jan 17, 2012 at 2:02 PM, Mariano Martinez Peck <[hidden email]> wrote:
 


So... Pharoers... do you want the image side of the primitive and some tests? I can provide them if desired (in my opinion, it should be included).

I'd think carefully before including it :) It's for extremely hairy hacking. That said, see the attached file for a plausible use.


--
best,
Eliot


Attachment: Deterministic Symbol Hash.st (1K)

Re: Primitive to set an identityHash

stephane ducasse-2
In reply to this post by Mariano Martinez Peck

After the discussions we had, and with a really big comment, I would add it.

Stef

Re: Primitive to set an identityHash

Henrik Sperre Johansen

I really don't see what good could come of it being available in general…

Cheers,
Henry


Re: Primitive to set an identityHash

Mariano Martinez Peck
 


On Wed, Jan 18, 2012 at 11:25 AM, Henrik Johansen <[hidden email]> wrote:

I really don't see what good could come of it being available in general…


I think it is a nice feature to have. If it exists only in the VM, nobody will see it unless they load all the VM code and read through it.
For example, for a certain hacky scenario, I wanted to create proxies that have exactly the same identityHash as the objects they proxy. That primitive lets me do that. I was lucky that I asked on the mailing list; otherwise I would have missed it.
And it is not as if we have no dangerous methods in the image, because we do. So... I would include it, since I think it could be useful for someone doing hacky stuff, but, as Stef says, I would add a really, really clear comment.

 
--
Mariano
http://marianopeck.wordpress.com

Re: Primitive to set an identityHash

Eliot Miranda-2
In reply to this post by Henrik Sperre Johansen
 


On Wed, Jan 18, 2012 at 2:25 AM, Henrik Johansen <[hidden email]> wrote:

I really don't see what good could come of it being available in general…

I can think of one good use, which my file tried to illustrate. If Symbol instances' identity hashes were derived from their string hashes, they would hash the same in all images. One can take advantage of this in, e.g., method dictionary layout and hence binary class loading. This happens in two steps.

With modern machines, where linear search through a dictionary is fast, and with a JIT with inline caching (or even an interpreter with a large method lookup cache), where method dictionaries are not looked at much, one can save a significant amount of space by making method dictionaries flat pair-wise arrays of selector, method. Most method dictionaries are small, and linear search is faster than fetching the Symbol's identity hash and doing a hash probe. By ordering method dictionaries by selector identityHash, very large method dictionaries such as Object's can be indexed using binary search. We saved about 8% of the image size in VisualWorks by moving to this representation (one saves by eliminating the nils in the selector vector and the value vector, and by eliminating the value vector/method array, hence saving its header space; you still need the space for the method). [The savings in Squeak look to be much less; I just found that the same overhead in the 4.3 trunk image is only ~1.7%.]

Now, if in addition selector identityHashes are deterministic, derived from their string hash, then one does not need to rehash/reorder a method dictionary when loading it from a binary stream (e.g. Fuel), which is again a win.

Now, I'm not suggesting we do either of these things now, but making Symbol identity hashes deterministic, derived from their string hash, can enable significant optimisation further down the road.
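
As a rough illustration of the flat, ordered layout described above, here is a minimal workspace sketch. It is not taken from the attached change set and is not how any VM stores method dictionaries: `keyOf`, `pairs`, `flat`, and `lookup` are hypothetical names, strings stand in for compiled methods, and the selector's string hash stands in for a deterministic identityHash.

```smalltalk
"Hypothetical sketch: a flat selector/method array ordered by a deterministic key."
| keyOf pairs flat lookup |

"Assumption: the selector's string hash plays the role of a deterministic identityHash."
keyOf := [ :selector | selector asString hash ].

"Selector -> method pairs; plain strings stand in for CompiledMethods."
pairs := { #size -> 'size method'. #isEmpty -> 'isEmpty method'. #printOn: -> 'printOn: method' }.

"Flatten into selector, method, selector, method, ... ordered by key."
flat := Array new: pairs size * 2.
(pairs asSortedCollection: [ :a :b | (keyOf value: a key) <= (keyOf value: b key) ])
    withIndexDo: [ :pair :i |
        flat at: 2 * i - 1 put: pair key.
        flat at: 2 * i put: pair value ].

lookup := [ :selector | | key low high index found |
    key := keyOf value: selector.
    "Binary search for the first pair whose key is >= the wanted key."
    low := 1.
    high := flat size // 2 + 1.
    [ low < high ] whileTrue: [ | mid |
        mid := (low + high) // 2.
        (keyOf value: (flat at: 2 * mid - 1)) < key
            ifTrue: [ low := mid + 1 ]
            ifFalse: [ high := mid ] ].
    "Walk over pairs with an equal key, in case two selectors collide."
    index := low.
    found := nil.
    [ found isNil
        and: [ index <= (flat size // 2)
        and: [ (keyOf value: (flat at: 2 * index - 1)) = key ] ] ] whileTrue: [
            (flat at: 2 * index - 1) == selector
                ifTrue: [ found := flat at: 2 * index ].
            index := index + 1 ].
    found ].

lookup value: #isEmpty
```

Evaluating the script should answer the stand-in string stored for #isEmpty, and looking up a selector that is not in the array should answer nil. Because the ordering key depends only on the selector's characters, an array built this way in one image would already be in the right order when loaded into another image that computes string hashes the same way, which is the point about skipping the rehash on binary load.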


--
best,
Eliot

Re: Primitive to set an identityHash

Mariano Martinez Peck
 


On Wed, Jan 18, 2012 at 8:02 PM, Eliot Miranda <[hidden email]> wrote:
 


Now, if in addition selector identityHashes are deterministic, derived from their string hash, then one does not need to rehash/reorder a method dictionary when loading it from a binary stream (e.g. Fuel), which is again a win.

Indeed, in Fuel we would save the rehash of MethodDictionaries. 
 

--
Mariano
http://marianopeck.wordpress.com

Re: Primitive to set an identityHash

Mariano Martinez Peck
 


On Wed, Jan 18, 2012 at 8:02 PM, Eliot Miranda <[hidden email]> wrote:
 


I can think of one good use, which my file tried to illustrate.  

I have another one, but I am not sure I can make it clear. In my case, I have original objects on which I do #becomeForward: to proxies. The original objects are then swapped out to disk and garbage collected. Later, I materialize them from disk and do proxies #becomeForward: materialized objects. I have no idea whether the original objects were stored in hashed collections, nor in which ones. So, to avoid rehashing all instances of all hashed collections, I use the becomeForward: that copies the identityHash. The problem is that when I do original objects becomeForward: proxies, it changes the identityHash of the proxies, and what happens if the proxies were also stored in hashed collections? They would need a rehash...

So... with this primitive I can directly set the same identityHash on the proxy when I create it, since I know which object it will proxy :)
 
--
Mariano
http://marianopeck.wordpress.com

Re: Primitive to set an identityHash

Henrik Sperre Johansen
In reply to this post by Eliot Miranda-2
 

On Jan 18, 2012, at 8:02:50 PM, Eliot Miranda wrote:



Now, I'm not suggesting we do either of these things now, but making Symbol identity hashes deterministic, derived from their string hash, can enable significant optimisation further down the road.

And I wasn't suggesting it is a bad thing to use in all cases, but rather protesting having a method in the image screaming "use me if you need to change identityHash for whatever reason!" There's just no good general comment to put there on when it might be a good idea, and "Don't use this unless you know what you're doing!" never seems to stop anyone (well, speaking on my own behalf at least…)

I'd rather see usage defined on a case-by-case basis, where you can more explicitly comment why using it in this particular case is a good idea, like you wrote above, and what Mariano mentioned for proxies.

So rather than: 

Object >> setIdentityHashTo: aNumber
    <primitive: 161>

you have:

Symbol >> initialize
    self deriveIdentityHashFrom: self hash

Symbol >> deriveIdentityHashFrom: aNumber
    "This should ONLY be called as part of object initialization!"
    "Symbols benefit from not using the default identityHash by *insert Eliot's explanation here*"

and similar for Mariano's Proxy class.

Cheers,
Henry



Re: Primitive to set an identityHash

Eliot Miranda-2
 


Good idea! 

--
best,
Eliot

Re: Primitive to set an identityHash

Mariano Martinez Peck
 


On Wed, Jan 18, 2012 at 10:02 PM, Eliot Miranda <[hidden email]> wrote:
 


Good idea! 


Wait... I am slower than Eliot ;)
So... deriveIdentityHashFrom: should contain the primitive call, right? Otherwise I am lost.

Symbol >> deriveIdentityHashFrom: aNumber
    "This should ONLY be called as part of object initialization!"
    "Symbols benefit from not using the default identityHash by *insert Eliot's explanation here*"
    <primitive: 161>

?

 
--
Mariano
http://marianopeck.wordpress.com

Re: Primitive to set an identityHash

Henrik Sperre Johansen
 

On Jan 18, 2012, at 11:05:11 PM, Mariano Martinez Peck wrote:



On Wed, Jan 18, 2012 at 10:02 PM, Eliot Miranda <[hidden email]> wrote:
 


On Wed, Jan 18, 2012 at 12:54 PM, Henrik Johansen <[hidden email]> wrote:
 

On Jan 18, 2012, at 8:02 50PM, Eliot Miranda wrote:



On Wed, Jan 18, 2012 at 2:25 AM, Henrik Johansen <[hidden email]> wrote:

I really don't see what good could come of it being available in general…

I can think of one good use, which my file tried to illustrate.  If Symbol instances identity hashes were derived from their string hash then they would be hashed the same in all images.  One can take advantage of this in e.g. method dictionary layout and hence binary class loading.  This happens in two steps.

With modern machines, where linear search through a dictionary is fast, and with a JIT with inline cacheing (and even an interpreter with a large method lookup cache), where method dictionaries are not looked at much, one can save a significant amount of space by making method dictionaries flat pair-wise arrays of selector, method.  Most method dictionaries are small and linear search is faster than fetching the Symbol's identity hash and doing a hash probe.  By ordering method dictionaries  by selector identityHash, very large method dictionaries such as Object's are indexed using binary search.  We saved about 8% of the image size in VisualWorks by moving to this representation (one saves on eliminating the nils in the selector vector and the value vector, and in eliminating the value vector/method array, hence saving its header space; you still need the space for the method).  [The savings in Squeak look to be much less; I just found that he same overhead in the 4.3 trunk image is only ~ 1.7%].

Now, if in addition selector identityHashes are deterministic, derived from their string hash, then one does not need to rehash/reorder a method dictionary when loading it from a binary stream (e.g. Fuel), which is again a win.

Now, I'm not suggesting we do either of these things now, but making Symbol identity hashes deterministic, derived from their string hash, can enable significant optimisation further down the road.

And I wasn't suggesting it is a bad thing to use in all cases, but rather protesting having a method in the image screaming "use me if you need to change identityHash for whatever reason!" There's just no good general comment to put there on when it might be a good idea, and "Don't use this unless you know what you're doing!" never seems to stop anyone (well, speaking on my own behalf at least…)

I'd rather see usage defined on a case-by-case basis, where you can more explicitly comment why using it in this particular case is a good idea, like you wrote above, and what Mariano mentioned for proxies.

So rather than: 

Object >> setIdentityHashTo: aNumber
<primitive: 161>

you have:

Symbol >> initialize
self deriveIdentityHashFrom: self hash

Symbol >> deriveIdentityHashFrom: aNumber
"This should ONLY be called as part of object initialization!"
"Symbols benefit from not using the default identityHash by *insert Eliot's explanation here*"

and similar for Mariano's Proxy class.

Cheers,
Henry

Good idea! 


Wait... I am slower than Eliot ;)
So... deriveIdentityHashFrom: should include the primitive call, right? Otherwise I am lost.

Symbol >> deriveIdentityHashFrom: aNumber
"This should ONLY be called as part of object initialization!"
"Symbols benefit from not using the default identityHash by *insert Eliot's explanation here*"
        <primitive: 161>

Yes, exactly like it is in his change set example, which I'd missed :/

Cheers,
Henry

Re: Primitive to set an identityHash

stephane ducasse-2
In reply to this post by Henrik Sperre Johansen

+1
this is the idea :)

On Jan 18, 2012, at 9:54 PM, Henrik Johansen wrote:

>
> On Jan 18, 2012, at 8:02 50PM, Eliot Miranda wrote:
>
>>
>>
>> On Wed, Jan 18, 2012 at 2:25 AM, Henrik Johansen <[hidden email]> wrote:
>>
>> I really don't see what good could come of it being available in general…
>>
>> I can think of one good use, which my file tried to illustrate.  If Symbol instances' identity hashes were derived from their string hash then they would be hashed the same in all images.  One can take advantage of this in e.g. method dictionary layout and hence binary class loading.  This happens in two steps.
>>
>> With modern machines, where linear search through a dictionary is fast, and with a JIT with inline caching (and even an interpreter with a large method lookup cache), where method dictionaries are not looked at much, one can save a significant amount of space by making method dictionaries flat pair-wise arrays of selector, method.  Most method dictionaries are small and linear search is faster than fetching the Symbol's identity hash and doing a hash probe.  By ordering method dictionaries by selector identityHash, very large method dictionaries such as Object's can be indexed using binary search.  We saved about 8% of the image size in VisualWorks by moving to this representation (one saves on eliminating the nils in the selector vector and the value vector, and in eliminating the value vector/method array, hence saving its header space; you still need the space for the method).  [The savings in Squeak look to be much less; I just found that the same overhead in the 4.3 trunk image is only ~ 1.7%].
>>
>> Now, if in addition selector identityHashes are deterministic, derived from their string hash, then one does not need to rehash/reorder a method dictionary when loading it from a binary stream (e.g. Fuel), which is again a win.
>>
>> Now, I'm not suggesting we do either of these things now, but making Symbol identity hashes deterministic, derived from their string hash, can enable significant optimisation further down the road.
>
> And I wasn't suggesting it is a bad thing to use in all cases, but rather protesting having a method in the image screaming "use me if you need to change identityHash for whatever reason!" There's just no good general comment to put there on when it might be a good idea, and "Don't use this unless you know what you're doing!" never seems to stop anyone (well, speaking on my own behalf at least…)
>
> I'd rather see usage defined on a case-by-case basis, where you can more explicitly comment why using it in this particular case is a good idea, like you wrote above, and what Mariano mentioned for proxies.
>
> So rather than:
>
> Object >> setIdentityHashTo: aNumber
> <primitive: 161>
>
>
> you have:
>
> Symbol >> initialize
> self deriveIdentityHashFrom: self hash
>
> Symbol >> deriveIdentityHashFrom: aNumber
> "This should ONLY be called as part of object initialization!"
> "Symbols benefit from not using the default identityHash by *insert Eliot's explanation here*"
>
> and similar for Mariano's Proxy class.
>
> Cheers,
> Henry
>
>
>