Do we have the new primitive?? [WAS] Re: IdentitySet but using #hash rather than #identityHash ?


Do we have the new primitive?? [WAS] Re: IdentitySet but using #hash rather than #identityHash ?

Mariano Martinez Peck

All I can say is that I am impressed by the numbers; it is really much
faster.
I still don't understand why I sent this email with a subject saying
IdentitySet, because what I really need is a fast/large IdentityDictionary
:(  Anyway, there's a place where we can use this LargeIdentitySet in Fuel,
I think.

So Levente, you said it is not possible to adapt this for dictionaries? Can
we contact Eliot to provide such a primitive?

As promised, I uploaded my LargeIdentityDictionary implementation to http://leves.web.elte.hu/squeak/LargeIdentityDictionary.st .
The numbers will be a bit worse compared to LargeIdentitySet, because of the lack of the primitive, but it's still 2-3x faster than other solutions (IdentityDictionary, PluggableIdentityDictionary, subclassing, etc). I'm about to propose this primitive with other improvements on the vm-dev list.


Hi Eliot/Levente. What is the status of this? Do we already have the new primitive? If so, how can we adapt LargeIdentitySet to use it?

Thanks!

Levente

thanks

On Fri, Dec 16, 2011 at 3:28 PM, Levente Uzonyi <[hidden email]> wrote:

On Fri, 16 Dec 2011, Henrik Sperre Johansen wrote:

 On 16.12.2011 03:26, Levente Uzonyi wrote:


How about my numbers? :)

"Preallocate objects, so we won't count gc time."
n := 1000000.
objects := Array new: n streamContents: [ :stream |
  n timesRepeat: [ stream nextPut: Object new ] ].

set := IdentitySet new: n.
Smalltalk garbageCollect.
[1 to: n do: [ :i | set add: (objects at: i) ] ] timeToRun. "4949"

set := LargeIdentitySet new.
Smalltalk garbageCollect.
[1 to: n do: [ :i | set add: (objects at: i) ] ] timeToRun. "331"

set := (PluggableSet new: n)
  hashBlock: [ :object |
    object identityHash * 4096 + object class identityHash * 64 ];
    "Change this to #basicIdentityHash in Pharo"
  equalBlock: [ :a :b | a == b ];
  yourself.
Smalltalk garbageCollect.
[1 to: n do: [ :i | set add: (objects at: i) ] ] timeToRun. "5511"


I also have a LargeIdentityDictionary, which is relatively fast, but not
as fast as LargeIdentitySet, because (for some unknown reason) we don't
have a primitive that could support it. If we had a primitive like
primitive 132 which would return the index of the element if found or 0 if
not, then we could have a really fast LargeIdentityDictionary.


Levente
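The lookup described above can be sketched outside the VM. Here is a small simulation (in Python for testability; the function name and shape are hypothetical, not the actual primitive): it scans the pointer fields for an identical element and answers its 1-based index, or 0 when absent.

```python
def identity_index_of(fields, target):
    """Hypothetical simulation of the proposed primitive: scan the
    pointer fields of the receiver for an object identical to
    `target`, answering its 1-based index, or 0 if it is absent."""
    for index, slot in enumerate(fields, 1):
        if slot is target:  # identity (==) comparison, never #=
            return index
    return 0
```

Given the index rather than a mere boolean, a LargeIdentityDictionary bucket could fetch the associated value at the same index of a parallel array, instead of re-scanning the bucket in Smalltalk code.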

Hehe yes, if writing a version fully exploiting the limited range, that's
probably the approach I would go for as well.
(Assuming it's the version at http://leves.web.elte.hu/squeak/LargeIdentitySet.st )

Mariano commented in the version at http://www.squeaksource.com/FuelExperiments that it's
slow for them, which I guess is due to not adapting the #identityHash calls to
#basicIdentityHash for Pharo:
((0 to: 4095) collect: [:each | each << 22 \\ 4096 ]) asSet size -> 1
So it basically uses 1 bucket instead of 4096... Whoops. :)
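The one-liner can be checked with plain integer arithmetic (sketched in Python): a left shift by 22 bits multiplies by 2^22, which is itself a multiple of 4096, so every shifted hash is congruent to 0 modulo the 4096 buckets.

```python
# Simulating the Pharo behaviour described above: every possible
# 12-bit basic hash, shifted left by 22 bits, taken modulo the
# bucket count. 2**22 is a multiple of 4096, so the result is always 0.
buckets = {(each << 22) % 4096 for each in range(4096)}
assert buckets == {0}  # everything collapses into a single bucket
```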

Uploaded a new version to the MC repository which is adapted for Pharo;
on the same machine my numbers were taken from, it does the same test as I
used above in 871 ms (181 with preallocation).


Cool. One more thing: in Squeak the method using primitive 132 directly
was renamed to #instVarsInclude:, so now #pointsTo: works as expected. If
this was also added to Pharo, then the #pointsTo: sends should be changed
to #instVarsInclude:, otherwise an Array can be reported as included even if
it wasn't added.
I'll upload my LargeIdentityDictionary implementation to the same place
this evening, since it's still 2-3x faster than other solutions and
there seems to be demand for it.


Levente


Cheers,
Henry






--
Mariano
http://marianopeck.wordpress.com







Re: Do we have the new primitive?? [WAS] Re: IdentitySet but using #hash rather than #identityHash ?

Levente Uzonyi-2
On Sat, 25 Feb 2012, Mariano Martinez Peck wrote:

>
>             All I can say is that I am impressed by the numbers it is really much
>             faster.
>             I still don't understand why I send this email with a subject say
>             IdentitySet because what I really need is a fast/large IdentityDictionary
>             :(  Anyway, there's a place where we can use this LargeIdentitySet in Fuel
>             I think).
>
>             So Levente, you say this is not possible to adapt this for dictionary?  can
>             we contact Eliot to provide such a primitive?
>
>
> As promised, I uploaded my LargeIdentityDictionary implementation to http://leves.web.elte.hu/squeak/LargeIdentityDictionary.st .
> The numbers will be a bit worse compared to LargeIdentitySet, because of the lack of the primitive, but it's still 2-3x faster than other solutions
> (IdentityDictionary, PluggableIdentityDictionary, subclassing, etc). I'm about to propose this primitive with other improvements on the vm-dev
> list.
My proposals are still on the way. :)

>
>
> Hi Eliot/Levente. What is the status of this? Do we have already the new primitive? If true, how can we adapt LargeIdentitySet to use such new primitive?

AFAIK the new primitive is not implemented yet. Adding the primitive to
the interpreter VM is very easy, but it seems to be a lot more complicated
(to me) to add it to Cog, because the receiver can be a MethodContext
which needs special handling.
I'll rewrite both LargeIdentitySet and LargeIdentityDictionary when the
primitive is ready.


Levente


Re: Do we have the new primitive?? [WAS] Re: IdentitySet but using #hash rather than #identityHash ?

Mariano Martinez Peck


On Mon, Feb 27, 2012 at 3:16 PM, Levente Uzonyi <[hidden email]> wrote:
On Sat, 25 Feb 2012, Mariano Martinez Peck wrote:



Hi Eliot/Levente. What is the status of this? Do we have already the new primitive? If true, how can we adapt LargeIdentitySet to use such new primitive?

AFAIK the new primitive is not implemented yet. Adding the primitive to the interpreter VM is very easy, but it seems to be a lot more complicated (to me) to add it to Cog, because the receiver can be a MethodContext which needs special handling.
I'll rewrite both LargeIdentitySet and LargeIdentityDictionary when the primitive is ready.

Thanks Levente. So we should wait Eliot.
I will ping again in a couple of weeks/months  ;)

 



--
Mariano
http://marianopeck.wordpress.com


Re: Do we have the new primitive?? [WAS] Re: IdentitySet but using #hash rather than #identityHash ?

Mariano Martinez Peck


Hi Eliot/Levente. What is the status of this? Do we have already the new primitive? If true, how can we adapt LargeIdentitySet to use such new primitive?

AFAIK the new primitive is not implemented yet. Adding the primitive to the interpreter VM is very easy, but it seems to be a lot more complicated (to me) to add it to Cog, because the receiver can be a MethodContext which needs special handling.
I'll rewrite both LargeIdentitySet and LargeIdentityDictionary when the primitive is ready.

Thanks Levente. So we should wait Eliot.
I will ping again in a couple of weeks/months  ;)


Ping. So I did it :)
Eliot, if you tell us what is needed, maybe I can push Esteban or Igor (or me?) to do it ;)

Thanks!

 
 


--
Mariano
http://marianopeck.wordpress.com


Re: [Vm-dev] Re: Do we have the new primitive?? [WAS] Re: IdentitySet but using #hash rather than #identityHash ?

Eliot Miranda-2


On Thu, Apr 26, 2012 at 8:19 AM, Mariano Martinez Peck <[hidden email]> wrote:
 


Hi Eliot/Levente. What is the status of this? Do we have already the new primitive? If true, how can we adapt LargeIdentitySet to use such new primitive?

AFAIK the new primitive is not implemented yet. Adding the primitive to the interpreter VM is very easy, but it seems to be a lot more complicated (to me) to add it to Cog, because the receiver can be a MethodContext which needs special handling.
I'll rewrite both LargeIdentitySet and LargeIdentityDictionary when the primitive is ready.

Thanks Levente. So we should wait Eliot.
I will ping again in a couple of weeks/months  ;)


Ping. So I did it :)
Eliot if you tell us what it is needed maybe I can push Esteban or Igor (or me?) to do it ;)

Some form of accurate spec, e.g. a simulation in Smalltalk, along with a specification of the types. I'm not happy about the receiver being a MethodContext, because that means the primitive has to check argument types. A primitive can assume the type of the receiver, because the primitive can be put on a specific class in the hierarchy.

Hence aBehavior adoptInstance: anObject is much better than anObject changeClassTo: aClass, because the former can know that the receiver is a valid behavior and simply check that the argument is a suitable instance of the receiver, whereas the latter has to check both that aClass is a valid behavior (walking its hierarchy? checking it has a valid methodDictionary, etc, etc) /and/ that the receiver is a suitable instance of the argument. So if possible, specify it as one or two primitives on LargeIdentitySet & LargeIdentityDictionary.
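A sketch of what such a spec could look like, written as a runnable simulation (Python here for brevity; the real spec would presumably be a Smalltalk method on LargeIdentityDictionary, and all names and checks below are assumptions): the receiver's class is fixed by where the primitive is installed, so only the arguments need validating.

```python
def prim_identity_index_of(receiver, target, start):
    """Hypothetical spec simulation for a primitive installed on a
    known class: the receiver's bucket Array and the start index are
    validated, the target may be any object.  Answers the 1-based
    index of the first field identical to `target` at or after
    `start`, or 0 when there is none."""
    if not isinstance(receiver, list):  # stands in for an Array check
        raise TypeError('primitive failed: receiver must be an Array')
    if not isinstance(start, int) or not (1 <= start <= len(receiver) + 1):
        raise TypeError('primitive failed: start index out of range')
    for index in range(start - 1, len(receiver)):
        if receiver[index] is target:  # identity comparison only
            return index + 1           # 1-based, Smalltalk style
    return 0
```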
 






--
best,
Eliot


Re: Do we have the new primitive?? [WAS] Re: [Pharo-project] IdentitySet but using #hash rather than #identityHash ?

Mariano Martinez Peck

I am reviving a very old thread.
Levente, as you know, we are using a variation of your LargeIdentitySet and LargeIdentityDictionary for Fuel.
Considering Spur's new #identityHash, have you had the chance to try these large collection classes, or to think about whether they are still worthwhile?
What seems clear is that with Spur we could simply replace:

ProtoObject >> largeIdentityHash
	<primitive: 75>

with:

ProtoObject >> largeIdentityHash
	^ self basicIdentityHash

Right?

But I still wonder about the collection classes themselves.

Thoughts?
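A back-of-the-envelope comparison (assuming a 12-bit identity hash pre-Spur and a 22-bit one in Spur; the exact field widths are my assumption) suggests why the special classes may matter much less on Spur:

```python
n = 1_000_000        # object count from the benchmarks earlier in the thread
pre_spur = 2 ** 12   # assumed pre-Spur identity-hash range (4096 values)
spur = 2 ** 22       # assumed Spur identity-hash range
# Average number of objects sharing one identity-hash value:
assert n // pre_spur == 244  # long collision chains pre-Spur
assert n / spur < 0.25       # mostly collision-free on Spur
```

So with Spur's wider hash, a plain identity collection over #basicIdentityHash may already scale reasonably, which is exactly the question being asked here.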

 
 


--
Mariano
http://marianopeck.wordpress.com



