FasterSets

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
14 messages Options
Reply | Threaded
Open this post in threaded view
|

FasterSets

Stéphane Ducasse
I would really like to assess FasterSets for Pharo1.1

Begin forwarded message:

> From: Ralph Boland <[hidden email]>
> Date: June 25, 2009 8:30:13 PM CEDT
> To: Stéphane Ducasse <[hidden email]>
> Subject: Re: about fasterTests
>
> 2009/6/25 Stéphane Ducasse <[hidden email]>:
>> we are interest to get better libraries in pharo.
>> If you want to join and help making sure that your code in  
>> integrated in
>> pharo1.1
>> please say it and may be join the pharo mailing-list.
>>
>> Stef
>>
>
> Yes, I would like my code 'FasterSets' added to Pharo.
> I joined the pharo mailing list.
> Today I will mail you my MIT license release.
> Probably will take a couple of weeks (From Calgary, Canada) to get  
> there.
> Once I am added as a contributer I will release my code to Pharo for
> verification.
> In the meantime if any Squeakers try out my code and report bugs I  
> will fix them
> before releasing to Pharo.
>
> Some comments:
>
> 1)  Since FasterSets modifies low level methods in class Set and its
> subclasses it
>     may be incompatible with packages that create subclasses of Set  
> and then
>     override low level methods.  If someone could send me a list of
> package add-ons
>     to Pharo that do this then I can download them and make any  
> modifications
>     necessary for them to run with  FasterSets and make these changes
> part of my
>     release.  Note that it is no more difficult to subclass from Set
> with FasterSets
>     than without;  it is just different and only different if you do
> low level things.
>
> 2)  FasterSets uses a method  'noCompareOrGrowAdd:'  which adds
>     an element to a set without doing compares or growing the set.
>     Needless to say it is a private method.
>     I also have a public method 'nastyAddNew:'  which, like
>     'noCompareOrGrowAdd:'  adds an element to a set without checking  
> if the
>     element is already there but does do a grow if needed.
>     This is useful (i.e. faster) in situations where you KNOW the  
> object
>     you are adding to a set is not there already.
>     However if the object IS already there then you now have two
>     of them in your set;  i.e.  you now have a subtle bug.
>     I use 'nastyAddNew:' in my code but never made it a part of  
> FasterSets
>     because it is questionable whether or not such a method should be
>     available generally.   However, if I can decide I like the  
> 'nastyAddNew:'
>     method and am willing to use it despite its risks so can others  
> so there
>     is an argument for it being added.
>     While my expectation is that you don't want  'nastyAddNew:'  in  
> Pharo
>     I thought you should at least be aware of it.
>
> Regards,
>
> Ralph Boland


_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: FasterSets

Mariano Martinez Peck

On Thu, Jun 25, 2009 at 5:55 PM, Stéphane Ducasse <[hidden email]> wrote:
I would really like to assess FasterSets for Pharo1.1

are all of these things stored somewhere apart from this mailing list ?

I saw http://code.google.com/p/pharo/wiki/Milestones

but it is quite old

Mariano

 

Begin forwarded message:

> From: Ralph Boland <[hidden email]>
> Date: June 25, 2009 8:30:13 PM CEDT
> To: Stéphane Ducasse <[hidden email]>
> Subject: Re: about fasterTests
>
> 2009/6/25 Stéphane Ducasse <[hidden email]>:
>> we are interest to get better libraries in pharo.
>> If you want to join and help making sure that your code in
>> integrated in
>> pharo1.1
>> please say it and may be join the pharo mailing-list.
>>
>> Stef
>>
>
> Yes, I would like my code 'FasterSets' added to Pharo.
> I joined the pharo mailing list.
> Today I will mail you my MIT license release.
> Probably will take a couple of weeks (From Calgary, Canada) to get
> there.
> Once I am added as a contributer I will release my code to Pharo for
> verification.
> In the meantime if any Squeakers try out my code and report bugs I
> will fix them
> before releasing to Pharo.
>
> Some comments:
>
> 1)  Since FasterSets modifies low level methods in class Set and its
> subclasses it
>     may be incompatible with packages that create subclasses of Set
> and then
>     override low level methods.  If someone could send me a list of
> package add-ons
>     to Pharo that do this then I can download them and make any
> modifications
>     necessary for them to run with  FasterSets and make these changes
> part of my
>     release.  Note that it is no more difficult to subclass from Set
> with FasterSets
>     than without;  it is just different and only different if you do
> low level things.
>
> 2)  FasterSets uses a method  'noCompareOrGrowAdd:'  which adds
>     an element to a set without doing compares or growing the set.
>     Needless to say it is a private method.
>     I also have a public method 'nastyAddNew:'  which, like
>     'noCompareOrGrowAdd:'  adds an element to a set without checking
> if the
>     element is already there but does do a grow if needed.
>     This is useful (i.e. faster) in situations where you KNOW the
> object
>     you are adding to a set is not there already.
>     However if the object IS already there then you now have two
>     of them in your set;  i.e.  you now have a subtle bug.
>     I use 'nastyAddNew:' in my code but never made it a part of
> FasterSets
>     because it is questionable whether or not such a method should be
>     available generally.   However, if I can decide I like the
> 'nastyAddNew:'
>     method and am willing to use it despite its risks so can others
> so there
>     is an argument for it being added.
>     While my expectation is that you don't want  'nastyAddNew:'  in
> Pharo
>     I thought you should at least be aware of it.
>
> Regards,
>
> Ralph Boland


_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project


_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: FasterSets

Damien Cassou
2009/6/25 Mariano Martinez Peck <[hidden email]>:
> are all of these things stored somewhere apart from this mailing list ?

http://code.google.com/p/pharo/issues/list?can=1&q=label%3AMilestone-1.1

--
Damien Cassou
http://damiencassou.seasidehosting.st

"Lambdas are relegated to relative obscurity until Java makes them
popular by not having them." James Iry

_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: FasterSets

Stéphane Ducasse
In reply to this post by Mariano Martinez Peck
We should clean it.

Stef

On Jun 25, 2009, at 10:03 PM, Mariano Martinez Peck wrote:

>
> On Thu, Jun 25, 2009 at 5:55 PM, Stéphane Ducasse <[hidden email]
> > wrote:
> I would really like to assess FasterSets for Pharo1.1
>
> are all of these things stored somewhere apart from this mailing  
> list ?
>
> I saw http://code.google.com/p/pharo/wiki/Milestones
>
> but it is quite old
>
> Mariano
>
>
>
> Begin forwarded message:
>
> > From: Ralph Boland <[hidden email]>
> > Date: June 25, 2009 8:30:13 PM CEDT
> > To: Stéphane Ducasse <[hidden email]>
> > Subject: Re: about fasterTests
> >
> > 2009/6/25 Stéphane Ducasse <[hidden email]>:
> >> we are interest to get better libraries in pharo.
> >> If you want to join and help making sure that your code in
> >> integrated in
> >> pharo1.1
> >> please say it and may be join the pharo mailing-list.
> >>
> >> Stef
> >>
> >
> > Yes, I would like my code 'FasterSets' added to Pharo.
> > I joined the pharo mailing list.
> > Today I will mail you my MIT license release.
> > Probably will take a couple of weeks (From Calgary, Canada) to get
> > there.
> > Once I am added as a contributer I will release my code to Pharo for
> > verification.
> > In the meantime if any Squeakers try out my code and report bugs I
> > will fix them
> > before releasing to Pharo.
> >
> > Some comments:
> >
> > 1)  Since FasterSets modifies low level methods in class Set and its
> > subclasses it
> >     may be incompatible with packages that create subclasses of Set
> > and then
> >     override low level methods.  If someone could send me a list of
> > package add-ons
> >     to Pharo that do this then I can download them and make any
> > modifications
> >     necessary for them to run with  FasterSets and make these  
> changes
> > part of my
> >     release.  Note that it is no more difficult to subclass from Set
> > with FasterSets
> >     than without;  it is just different and only different if you do
> > low level things.
> >
> > 2)  FasterSets uses a method  'noCompareOrGrowAdd:'  which adds
> >     an element to a set without doing compares or growing the set.
> >     Needless to say it is a private method.
> >     I also have a public method 'nastyAddNew:'  which, like
> >     'noCompareOrGrowAdd:'  adds an element to a set without checking
> > if the
> >     element is already there but does do a grow if needed.
> >     This is useful (i.e. faster) in situations where you KNOW the
> > object
> >     you are adding to a set is not there already.
> >     However if the object IS already there then you now have two
> >     of them in your set;  i.e.  you now have a subtle bug.
> >     I use 'nastyAddNew:' in my code but never made it a part of
> > FasterSets
> >     because it is questionable whether or not such a method should  
> be
> >     available generally.   However, if I can decide I like the
> > 'nastyAddNew:'
> >     method and am willing to use it despite its risks so can others
> > so there
> >     is an argument for it being added.
> >     While my expectation is that you don't want  'nastyAddNew:'  in
> > Pharo
> >     I thought you should at least be aware of it.
> >
> > Regards,
> >
> > Ralph Boland
>
>
> _______________________________________________
> Pharo-project mailing list
> [hidden email]
> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>
> _______________________________________________
> Pharo-project mailing list
> [hidden email]
> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project


_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: FasterSets

Henrik Sperre Johansen
In reply to this post by Stéphane Ducasse
I took some time to check this yesterday, my comments on the code  
itself are basically the same as what Lucas posted in the issue tracker.
Works as advertised, but should probably clean out the "gunk" which is  
unused after it's integrated.

Some feedback on possible improvements:
- What it does, is basically speed up the adds to the new array in a  
grow that might happen Why not take this all to the logical  
conclusion, and instead optimize an addAll: method to be used in grow,  
that assumes elements are not in set (and themselves are not  
duplicates)? F.ex. it could also add to tally just once, instead of  
one increase per element as the noCompareAdd: does. (an add method  
with no tally add, no = object check, and no growCheck).

- The usual addAll: method can be sped up in a similar way, by  
performing one grow operation before adding elements if necessary  
(pessimistically assuming all elements are unique), then use an add  
method which does not check for grow need at all. (an add method with  
tally add and = object check, but no growCheck)
  If desirable, you could also shrink the capacity after adds are  
done, to match the real amount of new elements.

Adding (1 to: 1200000) to a new Set with such an addAll: method took  
1/2 to  1/3rd the time of the regular addAll: method, even with the  
lower grow cost of FasterSets. (Lower gains if Set is presized, and if  
hash-calculation is bigger part of cost)

Note: If you want to stay ANSI-compatible, one tricky point in such an  
optimization is adding all objects in collection up to a nil element,  
then raising an error. (ie: must produce equivalent results to add:  
performed of each of collections elements in sequence )

Cheers,
Henry


On Jun 25, 2009, at 8:55 03PM, Stéphane Ducasse wrote:

> I would really like to assess FasterSets for Pharo1.1
>
> Begin forwarded message:
>
>> From: Ralph Boland <[hidden email]>
>> Date: June 25, 2009 8:30:13 PM CEDT
>> To: Stéphane Ducasse <[hidden email]>
>> Subject: Re: about fasterTests
>>
>> 2009/6/25 Stéphane Ducasse <[hidden email]>:
>>> we are interest to get better libraries in pharo.
>>> If you want to join and help making sure that your code in
>>> integrated in
>>> pharo1.1
>>> please say it and may be join the pharo mailing-list.
>>>
>>> Stef
>>>
>>
>> Yes, I would like my code 'FasterSets' added to Pharo.
>> I joined the pharo mailing list.
>> Today I will mail you my MIT license release.
>> Probably will take a couple of weeks (From Calgary, Canada) to get
>> there.
>> Once I am added as a contributer I will release my code to Pharo for
>> verification.
>> In the meantime if any Squeakers try out my code and report bugs I
>> will fix them
>> before releasing to Pharo.
>>
>> Some comments:
>>
>> 1)  Since FasterSets modifies low level methods in class Set and its
>> subclasses it
>>    may be incompatible with packages that create subclasses of Set
>> and then
>>    override low level methods.  If someone could send me a list of
>> package add-ons
>>    to Pharo that do this then I can download them and make any
>> modifications
>>    necessary for them to run with  FasterSets and make these changes
>> part of my
>>    release.  Note that it is no more difficult to subclass from Set
>> with FasterSets
>>    than without;  it is just different and only different if you do
>> low level things.
>>
>> 2)  FasterSets uses a method  'noCompareOrGrowAdd:'  which adds
>>    an element to a set without doing compares or growing the set.
>>    Needless to say it is a private method.
>>    I also have a public method 'nastyAddNew:'  which, like
>>    'noCompareOrGrowAdd:'  adds an element to a set without checking
>> if the
>>    element is already there but does do a grow if needed.
>>    This is useful (i.e. faster) in situations where you KNOW the
>> object
>>    you are adding to a set is not there already.
>>    However if the object IS already there then you now have two
>>    of them in your set;  i.e.  you now have a subtle bug.
>>    I use 'nastyAddNew:' in my code but never made it a part of
>> FasterSets
>>    because it is questionable whether or not such a method should be
>>    available generally.   However, if I can decide I like the
>> 'nastyAddNew:'
>>    method and am willing to use it despite its risks so can others
>> so there
>>    is an argument for it being added.
>>    While my expectation is that you don't want  'nastyAddNew:'  in
>> Pharo
>>    I thought you should at least be aware of it.
>>
>> Regards,
>>
>> Ralph Boland
>
>
> _______________________________________________
> Pharo-project mailing list
> [hidden email]
> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>


_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: FasterSets

Stéphane Ducasse
Thanks this is really good.
We will probably integrate these changes really fast

On Jul 1, 2009, at 1:03 PM, Henrik Johansen wrote:

> I took some time to check this yesterday, my comments on the code
> itself are basically the same as what Lucas posted in the issue  
> tracker.
> Works as advertised, but should probably clean out the "gunk" which is
> unused after it's integrated.
>
> Some feedback on possible improvements:
> - What it does, is basically speed up the adds to the new array in a
> grow that might happen Why not take this all to the logical
> conclusion, and instead optimize an addAll: method to be used in grow,
> that assumes elements are not in set (and themselves are not
> duplicates)? F.ex. it could also add to tally just once, instead of
> one increase per element as the noCompareAdd: does. (an add method
> with no tally add, no = object check, and no growCheck).
>
> - The usual addAll: method can be sped up in a similar way, by
> performing one grow operation before adding elements if necessary
> (pessimistically assuming all elements are unique), then use an add
> method which does not check for grow need at all. (an add method with
> tally add and = object check, but no growCheck)
>  If desirable, you could also shrink the capacity after adds are
> done, to match the real amount of new elements.
>
> Adding (1 to: 1200000) to a new Set with such an addAll: method took
> 1/2 to  1/3rd the time of the regular addAll: method, even with the
> lower grow cost of FasterSets. (Lower gains if Set is presized, and if
> hash-calculation is bigger part of cost)
>
> Note: If you want to stay ANSI-compatible, one tricky point in such an
> optimization is adding all objects in collection up to a nil element,
> then raising an error. (ie: must produce equivalent results to add:
> performed of each of collections elements in sequence )
>
> Cheers,
> Henry
>
>
> On Jun 25, 2009, at 8:55 03PM, Stéphane Ducasse wrote:
>
>> I would really like to assess FasterSets for Pharo1.1
>>
>> Begin forwarded message:
>>
>>> From: Ralph Boland <[hidden email]>
>>> Date: June 25, 2009 8:30:13 PM CEDT
>>> To: Stéphane Ducasse <[hidden email]>
>>> Subject: Re: about fasterTests
>>>
>>> 2009/6/25 Stéphane Ducasse <[hidden email]>:
>>>> we are interest to get better libraries in pharo.
>>>> If you want to join and help making sure that your code in
>>>> integrated in
>>>> pharo1.1
>>>> please say it and may be join the pharo mailing-list.
>>>>
>>>> Stef
>>>>
>>>
>>> Yes, I would like my code 'FasterSets' added to Pharo.
>>> I joined the pharo mailing list.
>>> Today I will mail you my MIT license release.
>>> Probably will take a couple of weeks (From Calgary, Canada) to get
>>> there.
>>> Once I am added as a contributer I will release my code to Pharo for
>>> verification.
>>> In the meantime if any Squeakers try out my code and report bugs I
>>> will fix them
>>> before releasing to Pharo.
>>>
>>> Some comments:
>>>
>>> 1)  Since FasterSets modifies low level methods in class Set and its
>>> subclasses it
>>>   may be incompatible with packages that create subclasses of Set
>>> and then
>>>   override low level methods.  If someone could send me a list of
>>> package add-ons
>>>   to Pharo that do this then I can download them and make any
>>> modifications
>>>   necessary for them to run with  FasterSets and make these changes
>>> part of my
>>>   release.  Note that it is no more difficult to subclass from Set
>>> with FasterSets
>>>   than without;  it is just different and only different if you do
>>> low level things.
>>>
>>> 2)  FasterSets uses a method  'noCompareOrGrowAdd:'  which adds
>>>   an element to a set without doing compares or growing the set.
>>>   Needless to say it is a private method.
>>>   I also have a public method 'nastyAddNew:'  which, like
>>>   'noCompareOrGrowAdd:'  adds an element to a set without checking
>>> if the
>>>   element is already there but does do a grow if needed.
>>>   This is useful (i.e. faster) in situations where you KNOW the
>>> object
>>>   you are adding to a set is not there already.
>>>   However if the object IS already there then you now have two
>>>   of them in your set;  i.e.  you now have a subtle bug.
>>>   I use 'nastyAddNew:' in my code but never made it a part of
>>> FasterSets
>>>   because it is questionable whether or not such a method should be
>>>   available generally.   However, if I can decide I like the
>>> 'nastyAddNew:'
>>>   method and am willing to use it despite its risks so can others
>>> so there
>>>   is an argument for it being added.
>>>   While my expectation is that you don't want  'nastyAddNew:'  in
>>> Pharo
>>>   I thought you should at least be aware of it.
>>>
>>> Regards,
>>>
>>> Ralph Boland
>>
>>
>> _______________________________________________
>> Pharo-project mailing list
>> [hidden email]
>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>
>
>
> _______________________________________________
> Pharo-project mailing list
> [hidden email]
> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project


_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: FasterSets

Andres Valloud-4
Keep in mind that growing on addAll: before the duplicates are filtered
out may require significantly more memory than necessary.  For example,

Set new addAll: "all instances of string"

I am sure there is an example than illustrates the issue better than the
above one.  While

Set new addAll: "consecutive integers"

may run faster, I think doing

(Set new: desiredCapacity) addAll: "consecutive integers"

may be a better compromise.

Andres.


Stéphane Ducasse wrote:

> Thanks this is really good.
> We will probably integrate these changes really fast
>
> On Jul 1, 2009, at 1:03 PM, Henrik Johansen wrote:
>
>  
>> I took some time to check this yesterday, my comments on the code
>> itself are basically the same as what Lucas posted in the issue
>> tracker.
>> Works as advertised, but should probably clean out the "gunk" which is
>> unused after it's integrated.
>>
>> Some feedback on possible improvements:
>> - What it does, is basically speed up the adds to the new array in a
>> grow that might happen Why not take this all to the logical
>> conclusion, and instead optimize an addAll: method to be used in grow,
>> that assumes elements are not in set (and themselves are not
>> duplicates)? F.ex. it could also add to tally just once, instead of
>> one increase per element as the noCompareAdd: does. (an add method
>> with no tally add, no = object check, and no growCheck).
>>
>> - The usual addAll: method can be sped up in a similar way, by
>> performing one grow operation before adding elements if necessary
>> (pessimistically assuming all elements are unique), then use an add
>> method which does not check for grow need at all. (an add method with
>> tally add and = object check, but no growCheck)
>>  If desirable, you could also shrink the capacity after adds are
>> done, to match the real amount of new elements.
>>
>> Adding (1 to: 1200000) to a new Set with such an addAll: method took
>> 1/2 to  1/3rd the time of the regular addAll: method, even with the
>> lower grow cost of FasterSets. (Lower gains if Set is presized, and if
>> hash-calculation is bigger part of cost)
>>
>> Note: If you want to stay ANSI-compatible, one tricky point in such an
>> optimization is adding all objects in collection up to a nil element,
>> then raising an error. (ie: must produce equivalent results to add:
>> performed of each of collections elements in sequence )
>>
>> Cheers,
>> Henry
>>
>>
>> On Jun 25, 2009, at 8:55 03PM, Stéphane Ducasse wrote:
>>
>>    
>>> I would really like to assess FasterSets for Pharo1.1
>>>
>>> Begin forwarded message:
>>>
>>>      
>>>> From: Ralph Boland <[hidden email]>
>>>> Date: June 25, 2009 8:30:13 PM CEDT
>>>> To: Stéphane Ducasse <[hidden email]>
>>>> Subject: Re: about fasterTests
>>>>
>>>> 2009/6/25 Stéphane Ducasse <[hidden email]>:
>>>>        
>>>>> we are interest to get better libraries in pharo.
>>>>> If you want to join and help making sure that your code in
>>>>> integrated in
>>>>> pharo1.1
>>>>> please say it and may be join the pharo mailing-list.
>>>>>
>>>>> Stef
>>>>>
>>>>>          
>>>> Yes, I would like my code 'FasterSets' added to Pharo.
>>>> I joined the pharo mailing list.
>>>> Today I will mail you my MIT license release.
>>>> Probably will take a couple of weeks (From Calgary, Canada) to get
>>>> there.
>>>> Once I am added as a contributer I will release my code to Pharo for
>>>> verification.
>>>> In the meantime if any Squeakers try out my code and report bugs I
>>>> will fix them
>>>> before releasing to Pharo.
>>>>
>>>> Some comments:
>>>>
>>>> 1)  Since FasterSets modifies low level methods in class Set and its
>>>> subclasses it
>>>>   may be incompatible with packages that create subclasses of Set
>>>> and then
>>>>   override low level methods.  If someone could send me a list of
>>>> package add-ons
>>>>   to Pharo that do this then I can download them and make any
>>>> modifications
>>>>   necessary for them to run with  FasterSets and make these changes
>>>> part of my
>>>>   release.  Note that it is no more difficult to subclass from Set
>>>> with FasterSets
>>>>   than without;  it is just different and only different if you do
>>>> low level things.
>>>>
>>>> 2)  FasterSets uses a method  'noCompareOrGrowAdd:'  which adds
>>>>   an element to a set without doing compares or growing the set.
>>>>   Needless to say it is a private method.
>>>>   I also have a public method 'nastyAddNew:'  which, like
>>>>   'noCompareOrGrowAdd:'  adds an element to a set without checking
>>>> if the
>>>>   element is already there but does do a grow if needed.
>>>>   This is useful (i.e. faster) in situations where you KNOW the
>>>> object
>>>>   you are adding to a set is not there already.
>>>>   However if the object IS already there then you now have two
>>>>   of them in your set;  i.e.  you now have a subtle bug.
>>>>   I use 'nastyAddNew:' in my code but never made it a part of
>>>> FasterSets
>>>>   because it is questionable whether or not such a method should be
>>>>   available generally.   However, if I can decide I like the
>>>> 'nastyAddNew:'
>>>>   method and am willing to use it despite its risks so can others
>>>> so there
>>>>   is an argument for it being added.
>>>>   While my expectation is that you don't want  'nastyAddNew:'  in
>>>> Pharo
>>>>   I thought you should at least be aware of it.
>>>>
>>>> Regards,
>>>>
>>>> Ralph Boland
>>>>        
>>> _______________________________________________
>>> Pharo-project mailing list
>>> [hidden email]
>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>>
>>>      
>> _______________________________________________
>> Pharo-project mailing list
>> [hidden email]
>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>    
>
>
> _______________________________________________
> Pharo-project mailing list
> [hidden email]
> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
> .
>
>  

_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: FasterSets

Henrik Sperre Johansen
Having spent a fair deal of time (in VisualWorks) worrying about my  
intial Set sizes, writing functions to precompute sane sizes for  
arbitrary inputs and adding changeCapacityTo: calls before addAll:  
operations due to performance issues on a case-by-case basis (Squeak/
Pharo doesn't even HAVE a changeCapacityTo: ), I have to strongly  
disagree.

I'd rather spend my time thinking about my problem domain, and know  
the Set implementation handles the nitty-gritty details in a  
sufficiently smart way.

Whether that means growing once to accomedate the worst case, then  
shrinking afterwards, or doing a grow check every X added items, in  
the unlikely event that what you're adding is a 3 million element  
Array with 10 unique objects, I don't really care.

Cheers,
Henry

On Jul 2, 2009, at 7:38 10AM, Andres Valloud wrote:

>
> (Set new: desiredCapacity) addAll: "consecutive integers"
>
> may be a better compromise.
>
> Andres.
>
>
> Stéphane Ducasse wrote:
>> Thanks this is really good.
>> We will probably integrate these changes really fast
>>
>> On Jul 1, 2009, at 1:03 PM, Henrik Johansen wrote:
>>
>>
>>> I took some time to check this yesterday, my comments on the code
>>> itself are basically the same as what Lucas posted in the issue
>>> tracker.
>>> Works as advertised, but should probably clean out the "gunk"  
>>> which is
>>> unused after it's integrated.
>>>
>>> Some feedback on possible improvements:
>>> - What it does, is basically speed up the adds to the new array in a
>>> grow that might happen Why not take this all to the logical
>>> conclusion, and instead optimize an addAll: method to be used in  
>>> grow,
>>> that assumes elements are not in set (and themselves are not
>>> duplicates)? F.ex. it could also add to tally just once, instead of
>>> one increase per element as the noCompareAdd: does. (an add method
>>> with no tally add, no = object check, and no growCheck).
>>>
>>> - The usual addAll: method can be sped up in a similar way, by
>>> performing one grow operation before adding elements if necessary
>>> (pessimistically assuming all elements are unique), then use an add
>>> method which does not check for grow need at all. (an add method  
>>> with
>>> tally add and = object check, but no growCheck)
>>> If desirable, you could also shrink the capacity after adds are
>>> done, to match the real amount of new elements.
>>>
>>> Adding (1 to: 1200000) to a new Set with such an addAll: method took
>>> 1/2 to  1/3rd the time of the regular addAll: method, even with the
>>> lower grow cost of FasterSets. (Lower gains if Set is presized,  
>>> and if
>>> hash-calculation is bigger part of cost)
>>>
>>> Note: If you want to stay ANSI-compatible, one tricky point in  
>>> such an
>>> optimization is adding all objects in collection up to a nil  
>>> element,
>>> then raising an error. (ie: must produce equivalent results to add:
>>> performed of each of collections elements in sequence )
>>>
>>> Cheers,
>>> Henry
>>>
>>>
>>> On Jun 25, 2009, at 8:55 03PM, Stéphane Ducasse wrote:
>>>
>>>
>>>> I would really like to assess FasterSets for Pharo1.1
>>>>
>>>> Begin forwarded message:
>>>>
>>>>
>>>>> From: Ralph Boland <[hidden email]>
>>>>> Date: June 25, 2009 8:30:13 PM CEDT
>>>>> To: Stéphane Ducasse <[hidden email]>
>>>>> Subject: Re: about fasterTests
>>>>>
>>>>> 2009/6/25 Stéphane Ducasse <[hidden email]>:
>>>>>
>>>>>> we are interest to get better libraries in pharo.
>>>>>> If you want to join and help making sure that your code in
>>>>>> integrated in
>>>>>> pharo1.1
>>>>>> please say it and may be join the pharo mailing-list.
>>>>>>
>>>>>> Stef
>>>>>>
>>>>>>
>>>>> Yes, I would like my code 'FasterSets' added to Pharo.
>>>>> I joined the pharo mailing list.
>>>>> Today I will mail you my MIT license release.
>>>>> Probably will take a couple of weeks (From Calgary, Canada) to get
>>>>> there.
>>>>> Once I am added as a contributer I will release my code to Pharo  
>>>>> for
>>>>> verification.
>>>>> In the meantime if any Squeakers try out my code and report bugs I
>>>>> will fix them
>>>>> before releasing to Pharo.
>>>>>
>>>>> Some comments:
>>>>>
>>>>> 1)  Since FasterSets modifies low level methods in class Set and  
>>>>> its
>>>>> subclasses it
>>>>>  may be incompatible with packages that create subclasses of Set
>>>>> and then
>>>>>  override low level methods.  If someone could send me a list of
>>>>> package add-ons
>>>>>  to Pharo that do this then I can download them and make any
>>>>> modifications
>>>>>  necessary for them to run with  FasterSets and make these changes
>>>>> part of my
>>>>>  release.  Note that it is no more difficult to subclass from Set
>>>>> with FasterSets
>>>>>  than without;  it is just different and only different if you do
>>>>> low level things.
>>>>>
>>>>> 2)  FasterSets uses a method  'noCompareOrGrowAdd:'  which adds
>>>>>  an element to a set without doing compares or growing the set.
>>>>>  Needless to say it is a private method.
>>>>>  I also have a public method 'nastyAddNew:'  which, like
>>>>>  'noCompareOrGrowAdd:'  adds an element to a set without checking
>>>>> if the
>>>>>  element is already there but does do a grow if needed.
>>>>>  This is useful (i.e. faster) in situations where you KNOW the
>>>>> object
>>>>>  you are adding to a set is not there already.
>>>>>  However if the object IS already there then you now have two
>>>>>  of them in your set;  i.e.  you now have a subtle bug.
>>>>>  I use 'nastyAddNew:' in my code but never made it a part of
>>>>> FasterSets
>>>>>  because it is questionable whether or not such a method should be
>>>>>  available generally.   However, if I can decide I like the
>>>>> 'nastyAddNew:'
>>>>>  method and am willing to use it despite its risks so can others
>>>>> so there
>>>>>  is an argument for it being added.
>>>>>  While my expectation is that you don't want  'nastyAddNew:'  in
>>>>> Pharo
>>>>>  I thought you should at least be aware of it.
>>>>>
>>>>> Regards,
>>>>>
>>>>> Ralph Boland
>>>>>
>>>> _______________________________________________
>>>> Pharo-project mailing list
>>>> [hidden email]
>>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>>>
>>>>
>>> _______________________________________________
>>> Pharo-project mailing list
>>> [hidden email]
>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>>
>>
>>
>> _______________________________________________
>> Pharo-project mailing list
>> [hidden email]
>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>> .
>>
>>
>
> _______________________________________________
> Pharo-project mailing list
> [hidden email]
> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>


_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: FasterSets

Andres Valloud-4
Hmmm, I don't know though, usually the usage pattern with hashed
collections has been to presize explicitly and then add to them... what
kind of use cases are you dealing with?

Henrik Johansen wrote:

> Having spent a fair deal of time (in VisualWorks) worrying about my
> intial Set sizes, writing functions to precompute sane sizes for
> arbitrary inputs and adding changeCapacityTo: calls before addAll:
> operations due to performance issues on a case-by-case basis (Squeak/
> Pharo doesn't even HAVE a changeCapacityTo: ), I have to strongly
> disagree.
>
> I'd rather spend my time thinking about my problem domain, and know
> the Set implementation handles the nitty-gritty details in a
> sufficiently smart way.
>
> Whether that means growing once to accomedate the worst case, then
> shrinking afterwards, or doing a grow check every X added items, in
> the unlikely event that what you're adding is a 3 million element
> Array with 10 unique objects, I don't really care.
>
> Cheers,
> Henry
>
> On Jul 2, 2009, at 7:38 10AM, Andres Valloud wrote:
>
>  
>> (Set new: desiredCapacity) addAll: "consecutive integers"
>>
>> may be a better compromise.
>>
>> Andres.
>>
>>
>> Stéphane Ducasse wrote:
>>    
>>> Thanks this is really good.
>>> We will probably integrate these changes really fast
>>>
>>> On Jul 1, 2009, at 1:03 PM, Henrik Johansen wrote:
>>>
>>>
>>>      
>>>> I took some time to check this yesterday, my comments on the code
>>>> itself are basically the same as what Lucas posted in the issue
>>>> tracker.
>>>> Works as advertised, but should probably clean out the "gunk"
>>>> which is
>>>> unused after it's integrated.
>>>>
>>>> Some feedback on possible improvements:
>>>> - What it does, is basically speed up the adds to the new array in a
>>>> grow that might happen Why not take this all to the logical
>>>> conclusion, and instead optimize an addAll: method to be used in
>>>> grow,
>>>> that assumes elements are not in set (and themselves are not
>>>> duplicates)? F.ex. it could also add to tally just once, instead of
>>>> one increase per element as the noCompareAdd: does. (an add method
>>>> with no tally add, no = object check, and no growCheck).
>>>>
>>>> - The usual addAll: method can be sped up in a similar way, by
>>>> performing one grow operation before adding elements if necessary
>>>> (pessimistically assuming all elements are unique), then use an add
>>>> method which does not check for grow need at all. (an add method
>>>> with
>>>> tally add and = object check, but no growCheck)
>>>> If desirable, you could also shrink the capacity after adds are
>>>> done, to match the real amount of new elements.
>>>>
>>>> Adding (1 to: 1200000) to a new Set with such an addAll: method took
>>>> 1/2 to  1/3rd the time of the regular addAll: method, even with the
>>>> lower grow cost of FasterSets. (Lower gains if Set is presized,
>>>> and if
>>>> hash-calculation is bigger part of cost)
>>>>
>>>> Note: If you want to stay ANSI-compatible, one tricky point in
>>>> such an
>>>> optimization is adding all objects in collection up to a nil
>>>> element,
>>>> then raising an error. (ie: must produce equivalent results to add:
>>>> performed of each of collections elements in sequence )
>>>>
>>>> Cheers,
>>>> Henry
>>>>
>>>>
>>>> On Jun 25, 2009, at 8:55 03PM, Stéphane Ducasse wrote:
>>>>
>>>>
>>>>        
>>>>> I would really like to assess FasterSets for Pharo1.1
>>>>>
>>>>> Begin forwarded message:
>>>>>
>>>>>
>>>>>          
>>>>>> From: Ralph Boland <[hidden email]>
>>>>>> Date: June 25, 2009 8:30:13 PM CEDT
>>>>>> To: Stéphane Ducasse <[hidden email]>
>>>>>> Subject: Re: about fasterTests
>>>>>>
>>>>>> 2009/6/25 Stéphane Ducasse <[hidden email]>:
>>>>>>
>>>>>>            
>>>>>>> we are interest to get better libraries in pharo.
>>>>>>> If you want to join and help making sure that your code in
>>>>>>> integrated in
>>>>>>> pharo1.1
>>>>>>> please say it and may be join the pharo mailing-list.
>>>>>>>
>>>>>>> Stef
>>>>>>>
>>>>>>>
>>>>>>>              
>>>>>> Yes, I would like my code 'FasterSets' added to Pharo.
>>>>>> I joined the pharo mailing list.
>>>>>> Today I will mail you my MIT license release.
>>>>>> Probably will take a couple of weeks (From Calgary, Canada) to get
>>>>>> there.
>>>>>> Once I am added as a contributer I will release my code to Pharo
>>>>>> for
>>>>>> verification.
>>>>>> In the meantime if any Squeakers try out my code and report bugs I
>>>>>> will fix them
>>>>>> before releasing to Pharo.
>>>>>>
>>>>>> Some comments:
>>>>>>
>>>>>> 1)  Since FasterSets modifies low level methods in class Set and
>>>>>> its
>>>>>> subclasses it
>>>>>>  may be incompatible with packages that create subclasses of Set
>>>>>> and then
>>>>>>  override low level methods.  If someone could send me a list of
>>>>>> package add-ons
>>>>>>  to Pharo that do this then I can download them and make any
>>>>>> modifications
>>>>>>  necessary for them to run with  FasterSets and make these changes
>>>>>> part of my
>>>>>>  release.  Note that it is no more difficult to subclass from Set
>>>>>> with FasterSets
>>>>>>  than without;  it is just different and only different if you do
>>>>>> low level things.
>>>>>>
>>>>>> 2)  FasterSets uses a method  'noCompareOrGrowAdd:'  which adds
>>>>>>  an element to a set without doing compares or growing the set.
>>>>>>  Needless to say it is a private method.
>>>>>>  I also have a public method 'nastyAddNew:'  which, like
>>>>>>  'noCompareOrGrowAdd:'  adds an element to a set without checking
>>>>>> if the
>>>>>>  element is already there but does do a grow if needed.
>>>>>>  This is useful (i.e. faster) in situations where you KNOW the
>>>>>> object
>>>>>>  you are adding to a set is not there already.
>>>>>>  However if the object IS already there then you now have two
>>>>>>  of them in your set;  i.e.  you now have a subtle bug.
>>>>>>  I use 'nastyAddNew:' in my code but never made it a part of
>>>>>> FasterSets
>>>>>>  because it is questionable whether or not such a method should be
>>>>>>  available generally.   However, if I can decide I like the
>>>>>> 'nastyAddNew:'
>>>>>>  method and am willing to use it despite its risks so can others
>>>>>> so there
>>>>>>  is an argument for it being added.
>>>>>>  While my expectation is that you don't want  'nastyAddNew:'  in
>>>>>> Pharo
>>>>>>  I thought you should at least be aware of it.
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Ralph Boland
>>>>>>
>>>>>>            
>>>>> _______________________________________________
>>>>> Pharo-project mailing list
>>>>> [hidden email]
>>>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>>>>
>>>>>
>>>>>          
>>>> _______________________________________________
>>>> Pharo-project mailing list
>>>> [hidden email]
>>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>>>
>>>>        
>>> _______________________________________________
>>> Pharo-project mailing list
>>> [hidden email]
>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>> .
>>>
>>>
>>>      
>> _______________________________________________
>> Pharo-project mailing list
>> [hidden email]
>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>
>>    
>
> .
>
>  

_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: FasterSets

Andres Valloud-4
Also, what happens in these cases?

| set |
set := Set new.
bag := Bag new.
bad add: 4 withOccurrences: 1000000000 "or equivalent code".
set addAll: bag

I am guessing the set will fail to grow and addAll: will fail, even
though it should work.  Maybe it won't happen with bags, or perhaps the
example needs some massaging.  Is the failure to grow handled
gracefully?  Note that I have not looked at the code because I have not
had much chance to do that yet.

Andres Valloud wrote:

> Hmmm, I don't know though, usually the usage pattern with hashed
> collections has been to presize explicitly and then add to them... what
> kind of use cases are you dealing with?
>
> Henrik Johansen wrote:
>  
>> Having spent a fair deal of time (in VisualWorks) worrying about my
>> intial Set sizes, writing functions to precompute sane sizes for
>> arbitrary inputs and adding changeCapacityTo: calls before addAll:
>> operations due to performance issues on a case-by-case basis (Squeak/
>> Pharo doesn't even HAVE a changeCapacityTo: ), I have to strongly
>> disagree.
>>
>> I'd rather spend my time thinking about my problem domain, and know
>> the Set implementation handles the nitty-gritty details in a
>> sufficiently smart way.
>>
>> Whether that means growing once to accomedate the worst case, then
>> shrinking afterwards, or doing a grow check every X added items, in
>> the unlikely event that what you're adding is a 3 million element
>> Array with 10 unique objects, I don't really care.
>>
>> Cheers,
>> Henry
>>
>> On Jul 2, 2009, at 7:38 10AM, Andres Valloud wrote:
>>
>>
>>    
>>> (Set new: desiredCapacity) addAll: "consecutive integers"
>>>
>>> may be a better compromise.
>>>
>>> Andres.
>>>
>>>
>>> Stéphane Ducasse wrote:
>>>
>>>      
>>>> Thanks this is really good.
>>>> We will probably integrate these changes really fast
>>>>
>>>> On Jul 1, 2009, at 1:03 PM, Henrik Johansen wrote:
>>>>
>>>>
>>>>
>>>>        
>>>>> I took some time to check this yesterday, my comments on the code
>>>>> itself are basically the same as what Lucas posted in the issue
>>>>> tracker.
>>>>> Works as advertised, but should probably clean out the "gunk"
>>>>> which is
>>>>> unused after it's integrated.
>>>>>
>>>>> Some feedback on possible improvements:
>>>>> - What it does, is basically speed up the adds to the new array in a
>>>>> grow that might happen Why not take this all to the logical
>>>>> conclusion, and instead optimize an addAll: method to be used in
>>>>> grow,
>>>>> that assumes elements are not in set (and themselves are not
>>>>> duplicates)? F.ex. it could also add to tally just once, instead of
>>>>> one increase per element as the noCompareAdd: does. (an add method
>>>>> with no tally add, no = object check, and no growCheck).
>>>>>
>>>>> - The usual addAll: method can be sped up in a similar way, by
>>>>> performing one grow operation before adding elements if necessary
>>>>> (pessimistically assuming all elements are unique), then use an add
>>>>> method which does not check for grow need at all. (an add method
>>>>> with
>>>>> tally add and = object check, but no growCheck)
>>>>> If desirable, you could also shrink the capacity after adds are
>>>>> done, to match the real amount of new elements.
>>>>>
>>>>> Adding (1 to: 1200000) to a new Set with such an addAll: method took
>>>>> 1/2 to  1/3rd the time of the regular addAll: method, even with the
>>>>> lower grow cost of FasterSets. (Lower gains if Set is presized,
>>>>> and if
>>>>> hash-calculation is bigger part of cost)
>>>>>
>>>>> Note: If you want to stay ANSI-compatible, one tricky point in
>>>>> such an
>>>>> optimization is adding all objects in collection up to a nil
>>>>> element,
>>>>> then raising an error. (ie: must produce equivalent results to add:
>>>>> performed of each of collections elements in sequence )
>>>>>
>>>>> Cheers,
>>>>> Henry
>>>>>
>>>>>
>>>>> On Jun 25, 2009, at 8:55 03PM, Stéphane Ducasse wrote:
>>>>>
>>>>>
>>>>>
>>>>>          
>>>>>> I would really like to assess FasterSets for Pharo1.1
>>>>>>
>>>>>> Begin forwarded message:
>>>>>>
>>>>>>
>>>>>>
>>>>>>            
>>>>>>> From: Ralph Boland <[hidden email]>
>>>>>>> Date: June 25, 2009 8:30:13 PM CEDT
>>>>>>> To: Stéphane Ducasse <[hidden email]>
>>>>>>> Subject: Re: about fasterTests
>>>>>>>
>>>>>>> 2009/6/25 Stéphane Ducasse <[hidden email]>:
>>>>>>>
>>>>>>>
>>>>>>>              
>>>>>>>> we are interest to get better libraries in pharo.
>>>>>>>> If you want to join and help making sure that your code in
>>>>>>>> integrated in
>>>>>>>> pharo1.1
>>>>>>>> please say it and may be join the pharo mailing-list.
>>>>>>>>
>>>>>>>> Stef
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>                
>>>>>>> Yes, I would like my code 'FasterSets' added to Pharo.
>>>>>>> I joined the pharo mailing list.
>>>>>>> Today I will mail you my MIT license release.
>>>>>>> Probably will take a couple of weeks (From Calgary, Canada) to get
>>>>>>> there.
>>>>>>> Once I am added as a contributer I will release my code to Pharo
>>>>>>> for
>>>>>>> verification.
>>>>>>> In the meantime if any Squeakers try out my code and report bugs I
>>>>>>> will fix them
>>>>>>> before releasing to Pharo.
>>>>>>>
>>>>>>> Some comments:
>>>>>>>
>>>>>>> 1)  Since FasterSets modifies low level methods in class Set and
>>>>>>> its
>>>>>>> subclasses it
>>>>>>>  may be incompatible with packages that create subclasses of Set
>>>>>>> and then
>>>>>>>  override low level methods.  If someone could send me a list of
>>>>>>> package add-ons
>>>>>>>  to Pharo that do this then I can download them and make any
>>>>>>> modifications
>>>>>>>  necessary for them to run with  FasterSets and make these changes
>>>>>>> part of my
>>>>>>>  release.  Note that it is no more difficult to subclass from Set
>>>>>>> with FasterSets
>>>>>>>  than without;  it is just different and only different if you do
>>>>>>> low level things.
>>>>>>>
>>>>>>> 2)  FasterSets uses a method  'noCompareOrGrowAdd:'  which adds
>>>>>>>  an element to a set without doing compares or growing the set.
>>>>>>>  Needless to say it is a private method.
>>>>>>>  I also have a public method 'nastyAddNew:'  which, like
>>>>>>>  'noCompareOrGrowAdd:'  adds an element to a set without checking
>>>>>>> if the
>>>>>>>  element is already there but does do a grow if needed.
>>>>>>>  This is useful (i.e. faster) in situations where you KNOW the
>>>>>>> object
>>>>>>>  you are adding to a set is not there already.
>>>>>>>  However if the object IS already there then you now have two
>>>>>>>  of them in your set;  i.e.  you now have a subtle bug.
>>>>>>>  I use 'nastyAddNew:' in my code but never made it a part of
>>>>>>> FasterSets
>>>>>>>  because it is questionable whether or not such a method should be
>>>>>>>  available generally.   However, if I can decide I like the
>>>>>>> 'nastyAddNew:'
>>>>>>>  method and am willing to use it despite its risks so can others
>>>>>>> so there
>>>>>>>  is an argument for it being added.
>>>>>>>  While my expectation is that you don't want  'nastyAddNew:'  in
>>>>>>> Pharo
>>>>>>>  I thought you should at least be aware of it.
>>>>>>>
>>>>>>> Regards,
>>>>>>>
>>>>>>> Ralph Boland
>>>>>>>
>>>>>>>
>>>>>>>              
>>>>>> _______________________________________________
>>>>>> Pharo-project mailing list
>>>>>> [hidden email]
>>>>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>>>>>
>>>>>>
>>>>>>
>>>>>>            
>>>>> _______________________________________________
>>>>> Pharo-project mailing list
>>>>> [hidden email]
>>>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>>>>
>>>>>
>>>>>          
>>>> _______________________________________________
>>>> Pharo-project mailing list
>>>> [hidden email]
>>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>>> .
>>>>
>>>>
>>>>
>>>>        
>>> _______________________________________________
>>> Pharo-project mailing list
>>> [hidden email]
>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>>
>>>
>>>      
>> .
>>
>>
>>    
>
> _______________________________________________
> Pharo-project mailing list
> [hidden email]
> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
> .
>
>  

_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: FasterSets

Henrik Sperre Johansen
In reply to this post by Andres Valloud-4
Anything using a dynamic lookup cache, but also doing bulk updates to  
the cache?
Searchlight in VisualWorks and Monticello in Pharo are two examples.

Cheers,
Henry

On Jul 2, 2009, at 12:13 21PM, Andres Valloud wrote:

> Hmmm, I don't know though, usually the usage pattern with hashed
> collections has been to presize explicitly and then add to them...  
> what
> kind of use cases are you dealing with?
>
> Henrik Johansen wrote:
>> Having spent a fair deal of time (in VisualWorks) worrying about my
>> intial Set sizes, writing functions to precompute sane sizes for
>> arbitrary inputs and adding changeCapacityTo: calls before addAll:
>> operations due to performance issues on a case-by-case basis (Squeak/
>> Pharo doesn't even HAVE a changeCapacityTo: ), I have to strongly
>> disagree.
>>
>> I'd rather spend my time thinking about my problem domain, and know
>> the Set implementation handles the nitty-gritty details in a
>> sufficiently smart way.
>>
>> Whether that means growing once to accomedate the worst case, then
>> shrinking afterwards, or doing a grow check every X added items, in
>> the unlikely event that what you're adding is a 3 million element
>> Array with 10 unique objects, I don't really care.
>>
>> Cheers,
>> Henry
>>
>> On Jul 2, 2009, at 7:38 10AM, Andres Valloud wrote:
>>
>>
>>> (Set new: desiredCapacity) addAll: "consecutive integers"
>>>
>>> may be a better compromise.
>>>
>>> Andres.
>>>
>>>
>>> Stéphane Ducasse wrote:
>>>
>>>> Thanks this is really good.
>>>> We will probably integrate these changes really fast
>>>>
>>>> On Jul 1, 2009, at 1:03 PM, Henrik Johansen wrote:
>>>>
>>>>
>>>>
>>>>> I took some time to check this yesterday, my comments on the code
>>>>> itself are basically the same as what Lucas posted in the issue
>>>>> tracker.
>>>>> Works as advertised, but should probably clean out the "gunk"
>>>>> which is
>>>>> unused after it's integrated.
>>>>>
>>>>> Some feedback on possible improvements:
>>>>> - What it does, is basically speed up the adds to the new array  
>>>>> in a
>>>>> grow that might happen Why not take this all to the logical
>>>>> conclusion, and instead optimize an addAll: method to be used in
>>>>> grow,
>>>>> that assumes elements are not in set (and themselves are not
>>>>> duplicates)? F.ex. it could also add to tally just once, instead  
>>>>> of
>>>>> one increase per element as the noCompareAdd: does. (an add method
>>>>> with no tally add, no = object check, and no growCheck).
>>>>>
>>>>> - The usual addAll: method can be sped up in a similar way, by
>>>>> performing one grow operation before adding elements if necessary
>>>>> (pessimistically assuming all elements are unique), then use an  
>>>>> add
>>>>> method which does not check for grow need at all. (an add method
>>>>> with
>>>>> tally add and = object check, but no growCheck)
>>>>> If desirable, you could also shrink the capacity after adds are
>>>>> done, to match the real amount of new elements.
>>>>>
>>>>> Adding (1 to: 1200000) to a new Set with such an addAll: method  
>>>>> took
>>>>> 1/2 to  1/3rd the time of the regular addAll: method, even with  
>>>>> the
>>>>> lower grow cost of FasterSets. (Lower gains if Set is presized,
>>>>> and if
>>>>> hash-calculation is bigger part of cost)
>>>>>
>>>>> Note: If you want to stay ANSI-compatible, one tricky point in
>>>>> such an
>>>>> optimization is adding all objects in collection up to a nil
>>>>> element,
>>>>> then raising an error. (ie: must produce equivalent results to  
>>>>> add:
>>>>> performed of each of collections elements in sequence )
>>>>>
>>>>> Cheers,
>>>>> Henry
>>>>>
>>>>>
>>>>> On Jun 25, 2009, at 8:55 03PM, Stéphane Ducasse wrote:
>>>>>
>>>>>
>>>>>
>>>>>> I would really like to assess FasterSets for Pharo1.1
>>>>>>
>>>>>> Begin forwarded message:
>>>>>>
>>>>>>
>>>>>>
>>>>>>> From: Ralph Boland <[hidden email]>
>>>>>>> Date: June 25, 2009 8:30:13 PM CEDT
>>>>>>> To: Stéphane Ducasse <[hidden email]>
>>>>>>> Subject: Re: about fasterTests
>>>>>>>
>>>>>>> 2009/6/25 Stéphane Ducasse <[hidden email]>:
>>>>>>>
>>>>>>>
>>>>>>>> we are interest to get better libraries in pharo.
>>>>>>>> If you want to join and help making sure that your code in
>>>>>>>> integrated in
>>>>>>>> pharo1.1
>>>>>>>> please say it and may be join the pharo mailing-list.
>>>>>>>>
>>>>>>>> Stef
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>> Yes, I would like my code 'FasterSets' added to Pharo.
>>>>>>> I joined the pharo mailing list.
>>>>>>> Today I will mail you my MIT license release.
>>>>>>> Probably will take a couple of weeks (From Calgary, Canada) to  
>>>>>>> get
>>>>>>> there.
>>>>>>> Once I am added as a contributer I will release my code to Pharo
>>>>>>> for
>>>>>>> verification.
>>>>>>> In the meantime if any Squeakers try out my code and report  
>>>>>>> bugs I
>>>>>>> will fix them
>>>>>>> before releasing to Pharo.
>>>>>>>
>>>>>>> Some comments:
>>>>>>>
>>>>>>> 1)  Since FasterSets modifies low level methods in class Set and
>>>>>>> its
>>>>>>> subclasses it
>>>>>>> may be incompatible with packages that create subclasses of Set
>>>>>>> and then
>>>>>>> override low level methods.  If someone could send me a list of
>>>>>>> package add-ons
>>>>>>> to Pharo that do this then I can download them and make any
>>>>>>> modifications
>>>>>>> necessary for them to run with  FasterSets and make these  
>>>>>>> changes
>>>>>>> part of my
>>>>>>> release.  Note that it is no more difficult to subclass from Set
>>>>>>> with FasterSets
>>>>>>> than without;  it is just different and only different if you do
>>>>>>> low level things.
>>>>>>>
>>>>>>> 2)  FasterSets uses a method  'noCompareOrGrowAdd:'  which adds
>>>>>>> an element to a set without doing compares or growing the set.
>>>>>>> Needless to say it is a private method.
>>>>>>> I also have a public method 'nastyAddNew:'  which, like
>>>>>>> 'noCompareOrGrowAdd:'  adds an element to a set without checking
>>>>>>> if the
>>>>>>> element is already there but does do a grow if needed.
>>>>>>> This is useful (i.e. faster) in situations where you KNOW the
>>>>>>> object
>>>>>>> you are adding to a set is not there already.
>>>>>>> However if the object IS already there then you now have two
>>>>>>> of them in your set;  i.e.  you now have a subtle bug.
>>>>>>> I use 'nastyAddNew:' in my code but never made it a part of
>>>>>>> FasterSets
>>>>>>> because it is questionable whether or not such a method should  
>>>>>>> be
>>>>>>> available generally.   However, if I can decide I like the
>>>>>>> 'nastyAddNew:'
>>>>>>> method and am willing to use it despite its risks so can others
>>>>>>> so there
>>>>>>> is an argument for it being added.
>>>>>>> While my expectation is that you don't want  'nastyAddNew:'  in
>>>>>>> Pharo
>>>>>>> I thought you should at least be aware of it.
>>>>>>>
>>>>>>> Regards,
>>>>>>>
>>>>>>> Ralph Boland
>>>>>>>
>>>>>>>
>>>>>> _______________________________________________
>>>>>> Pharo-project mailing list
>>>>>> [hidden email]
>>>>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>>>>>
>>>>>>
>>>>>>
>>>>> _______________________________________________
>>>>> Pharo-project mailing list
>>>>> [hidden email]
>>>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo- 
>>>>> project
>>>>>
>>>>>
>>>> _______________________________________________
>>>> Pharo-project mailing list
>>>> [hidden email]
>>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>>> .
>>>>
>>>>
>>>>
>>> _______________________________________________
>>> Pharo-project mailing list
>>> [hidden email]
>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>>
>>>
>>
>> .
>>
>>
>
> _______________________________________________
> Pharo-project mailing list
> [hidden email]
> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>


_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: FasterSets

Henrik Sperre Johansen
In reply to this post by Andres Valloud-4
It would.
I'm not sure I consider the current behaviour of spending 12 minutes  
adding the same element so much better, though.

Cheers,
Henry

On Jul 2, 2009, at 12:17 14PM, Andres Valloud wrote:

> Also, what happens in these cases?
>
> | set |
> set := Set new.
> bag := Bag new.
> bad add: 4 withOccurrences: 1000000000 "or equivalent code".
> set addAll: bag
>
> I am guessing the set will fail to grow and addAll: will fail, even
> though it should work.  Maybe it won't happen with bags, or perhaps  
> the
> example needs some massaging.  Is the failure to grow handled
> gracefully?  Note that I have not looked at the code because I have  
> not
> had much chance to do that yet.
>
> Andres Valloud wrote:
>> Hmmm, I don't know though, usually the usage pattern with hashed
>> collections has been to presize explicitly and then add to them...  
>> what
>> kind of use cases are you dealing with?
>>
>> Henrik Johansen wrote:
>>
>>> Having spent a fair deal of time (in VisualWorks) worrying about my
>>> intial Set sizes, writing functions to precompute sane sizes for
>>> arbitrary inputs and adding changeCapacityTo: calls before addAll:
>>> operations due to performance issues on a case-by-case basis  
>>> (Squeak/
>>> Pharo doesn't even HAVE a changeCapacityTo: ), I have to strongly
>>> disagree.
>>>
>>> I'd rather spend my time thinking about my problem domain, and know
>>> the Set implementation handles the nitty-gritty details in a
>>> sufficiently smart way.
>>>
>>> Whether that means growing once to accomedate the worst case, then
>>> shrinking afterwards, or doing a grow check every X added items, in
>>> the unlikely event that what you're adding is a 3 million element
>>> Array with 10 unique objects, I don't really care.
>>>
>>> Cheers,
>>> Henry
>>>
>>> On Jul 2, 2009, at 7:38 10AM, Andres Valloud wrote:
>>>
>>>
>>>
>>>> (Set new: desiredCapacity) addAll: "consecutive integers"
>>>>
>>>> may be a better compromise.
>>>>
>>>> Andres.
>>>>
>>>>
>>>> Stéphane Ducasse wrote:
>>>>
>>>>
>>>>> Thanks this is really good.
>>>>> We will probably integrate these changes really fast
>>>>>
>>>>> On Jul 1, 2009, at 1:03 PM, Henrik Johansen wrote:
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>> I took some time to check this yesterday, my comments on the code
>>>>>> itself are basically the same as what Lucas posted in the issue
>>>>>> tracker.
>>>>>> Works as advertised, but should probably clean out the "gunk"
>>>>>> which is
>>>>>> unused after it's integrated.
>>>>>>
>>>>>> Some feedback on possible improvements:
>>>>>> - What it does, is basically speed up the adds to the new array  
>>>>>> in a
>>>>>> grow that might happen Why not take this all to the logical
>>>>>> conclusion, and instead optimize an addAll: method to be used in
>>>>>> grow,
>>>>>> that assumes elements are not in set (and themselves are not
>>>>>> duplicates)? F.ex. it could also add to tally just once,  
>>>>>> instead of
>>>>>> one increase per element as the noCompareAdd: does. (an add  
>>>>>> method
>>>>>> with no tally add, no = object check, and no growCheck).
>>>>>>
>>>>>> - The usual addAll: method can be sped up in a similar way, by
>>>>>> performing one grow operation before adding elements if necessary
>>>>>> (pessimistically assuming all elements are unique), then use an  
>>>>>> add
>>>>>> method which does not check for grow need at all. (an add method
>>>>>> with
>>>>>> tally add and = object check, but no growCheck)
>>>>>> If desirable, you could also shrink the capacity after adds are
>>>>>> done, to match the real amount of new elements.
>>>>>>
>>>>>> Adding (1 to: 1200000) to a new Set with such an addAll: method  
>>>>>> took
>>>>>> 1/2 to  1/3rd the time of the regular addAll: method, even with  
>>>>>> the
>>>>>> lower grow cost of FasterSets. (Lower gains if Set is presized,
>>>>>> and if
>>>>>> hash-calculation is bigger part of cost)
>>>>>>
>>>>>> Note: If you want to stay ANSI-compatible, one tricky point in
>>>>>> such an
>>>>>> optimization is adding all objects in collection up to a nil
>>>>>> element,
>>>>>> then raising an error. (ie: must produce equivalent results to  
>>>>>> add:
>>>>>> performed of each of collections elements in sequence )
>>>>>>
>>>>>> Cheers,
>>>>>> Henry
>>>>>>
>>>>>>
>>>>>> On Jun 25, 2009, at 8:55 03PM, Stéphane Ducasse wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>> I would really like to assess FasterSets for Pharo1.1
>>>>>>>
>>>>>>> Begin forwarded message:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> From: Ralph Boland <[hidden email]>
>>>>>>>> Date: June 25, 2009 8:30:13 PM CEDT
>>>>>>>> To: Stéphane Ducasse <[hidden email]>
>>>>>>>> Subject: Re: about fasterTests
>>>>>>>>
>>>>>>>> 2009/6/25 Stéphane Ducasse <[hidden email]>:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> we are interest to get better libraries in pharo.
>>>>>>>>> If you want to join and help making sure that your code in
>>>>>>>>> integrated in
>>>>>>>>> pharo1.1
>>>>>>>>> please say it and may be join the pharo mailing-list.
>>>>>>>>>
>>>>>>>>> Stef
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>> Yes, I would like my code 'FasterSets' added to Pharo.
>>>>>>>> I joined the pharo mailing list.
>>>>>>>> Today I will mail you my MIT license release.
>>>>>>>> Probably will take a couple of weeks (From Calgary, Canada)  
>>>>>>>> to get
>>>>>>>> there.
>>>>>>>> Once I am added as a contributer I will release my code to  
>>>>>>>> Pharo
>>>>>>>> for
>>>>>>>> verification.
>>>>>>>> In the meantime if any Squeakers try out my code and report  
>>>>>>>> bugs I
>>>>>>>> will fix them
>>>>>>>> before releasing to Pharo.
>>>>>>>>
>>>>>>>> Some comments:
>>>>>>>>
>>>>>>>> 1)  Since FasterSets modifies low level methods in class Set  
>>>>>>>> and
>>>>>>>> its
>>>>>>>> subclasses it
>>>>>>>> may be incompatible with packages that create subclasses of Set
>>>>>>>> and then
>>>>>>>> override low level methods.  If someone could send me a list of
>>>>>>>> package add-ons
>>>>>>>> to Pharo that do this then I can download them and make any
>>>>>>>> modifications
>>>>>>>> necessary for them to run with  FasterSets and make these  
>>>>>>>> changes
>>>>>>>> part of my
>>>>>>>> release.  Note that it is no more difficult to subclass from  
>>>>>>>> Set
>>>>>>>> with FasterSets
>>>>>>>> than without;  it is just different and only different if you  
>>>>>>>> do
>>>>>>>> low level things.
>>>>>>>>
>>>>>>>> 2)  FasterSets uses a method  'noCompareOrGrowAdd:'  which adds
>>>>>>>> an element to a set without doing compares or growing the set.
>>>>>>>> Needless to say it is a private method.
>>>>>>>> I also have a public method 'nastyAddNew:'  which, like
>>>>>>>> 'noCompareOrGrowAdd:'  adds an element to a set without  
>>>>>>>> checking
>>>>>>>> if the
>>>>>>>> element is already there but does do a grow if needed.
>>>>>>>> This is useful (i.e. faster) in situations where you KNOW the
>>>>>>>> object
>>>>>>>> you are adding to a set is not there already.
>>>>>>>> However if the object IS already there then you now have two
>>>>>>>> of them in your set;  i.e.  you now have a subtle bug.
>>>>>>>> I use 'nastyAddNew:' in my code but never made it a part of
>>>>>>>> FasterSets
>>>>>>>> because it is questionable whether or not such a method  
>>>>>>>> should be
>>>>>>>> available generally.   However, if I can decide I like the
>>>>>>>> 'nastyAddNew:'
>>>>>>>> method and am willing to use it despite its risks so can others
>>>>>>>> so there
>>>>>>>> is an argument for it being added.
>>>>>>>> While my expectation is that you don't want  'nastyAddNew:'  in
>>>>>>>> Pharo
>>>>>>>> I thought you should at least be aware of it.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>>
>>>>>>>> Ralph Boland
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Pharo-project mailing list
>>>>>>> [hidden email]
>>>>>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>> _______________________________________________
>>>>>> Pharo-project mailing list
>>>>>> [hidden email]
>>>>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>>>>>
>>>>>>
>>>>>>
>>>>> _______________________________________________
>>>>> Pharo-project mailing list
>>>>> [hidden email]
>>>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo- 
>>>>> project
>>>>> .
>>>>>
>>>>>
>>>>>
>>>>>
>>>> _______________________________________________
>>>> Pharo-project mailing list
>>>> [hidden email]
>>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>>>
>>>>
>>>>
>>> .
>>>
>>>
>>>
>>
>> _______________________________________________
>> Pharo-project mailing list
>> [hidden email]
>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>> .
>>
>>
>
> _______________________________________________
> Pharo-project mailing list
> [hidden email]
> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>


_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: FasterSets

Andres Valloud-4
However, handling the growth failure wouldn't crash the program...

Henrik Johansen wrote:

> It would.
> I'm not sure I consider the current behaviour of spending 12 minutes
> adding the same element so much better, though.
>
> Cheers,
> Henry
>
> On Jul 2, 2009, at 12:17 14PM, Andres Valloud wrote:
>
>  
>> Also, what happens in these cases?
>>
>> | set |
>> set := Set new.
>> bag := Bag new.
>> bad add: 4 withOccurrences: 1000000000 "or equivalent code".
>> set addAll: bag
>>
>> I am guessing the set will fail to grow and addAll: will fail, even
>> though it should work.  Maybe it won't happen with bags, or perhaps
>> the
>> example needs some massaging.  Is the failure to grow handled
>> gracefully?  Note that I have not looked at the code because I have
>> not
>> had much chance to do that yet.
>>
>> Andres Valloud wrote:
>>    
>>> Hmmm, I don't know though, usually the usage pattern with hashed
>>> collections has been to presize explicitly and then add to them...
>>> what
>>> kind of use cases are you dealing with?
>>>
>>> Henrik Johansen wrote:
>>>
>>>      
>>>> Having spent a fair deal of time (in VisualWorks) worrying about my
>>>> intial Set sizes, writing functions to precompute sane sizes for
>>>> arbitrary inputs and adding changeCapacityTo: calls before addAll:
>>>> operations due to performance issues on a case-by-case basis
>>>> (Squeak/
>>>> Pharo doesn't even HAVE a changeCapacityTo: ), I have to strongly
>>>> disagree.
>>>>
>>>> I'd rather spend my time thinking about my problem domain, and know
>>>> the Set implementation handles the nitty-gritty details in a
>>>> sufficiently smart way.
>>>>
>>>> Whether that means growing once to accomedate the worst case, then
>>>> shrinking afterwards, or doing a grow check every X added items, in
>>>> the unlikely event that what you're adding is a 3 million element
>>>> Array with 10 unique objects, I don't really care.
>>>>
>>>> Cheers,
>>>> Henry
>>>>
>>>> On Jul 2, 2009, at 7:38 10AM, Andres Valloud wrote:
>>>>
>>>>
>>>>
>>>>        
>>>>> (Set new: desiredCapacity) addAll: "consecutive integers"
>>>>>
>>>>> may be a better compromise.
>>>>>
>>>>> Andres.
>>>>>
>>>>>
>>>>> Stéphane Ducasse wrote:
>>>>>
>>>>>
>>>>>          
>>>>>> Thanks this is really good.
>>>>>> We will probably integrate these changes really fast
>>>>>>
>>>>>> On Jul 1, 2009, at 1:03 PM, Henrik Johansen wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>            
>>>>>>> I took some time to check this yesterday, my comments on the code
>>>>>>> itself are basically the same as what Lucas posted in the issue
>>>>>>> tracker.
>>>>>>> Works as advertised, but should probably clean out the "gunk"
>>>>>>> which is
>>>>>>> unused after it's integrated.
>>>>>>>
>>>>>>> Some feedback on possible improvements:
>>>>>>> - What it does, is basically speed up the adds to the new array
>>>>>>> in a
>>>>>>> grow that might happen Why not take this all to the logical
>>>>>>> conclusion, and instead optimize an addAll: method to be used in
>>>>>>> grow,
>>>>>>> that assumes elements are not in set (and themselves are not
>>>>>>> duplicates)? F.ex. it could also add to tally just once,
>>>>>>> instead of
>>>>>>> one increase per element as the noCompareAdd: does. (an add
>>>>>>> method
>>>>>>> with no tally add, no = object check, and no growCheck).
>>>>>>>
>>>>>>> - The usual addAll: method can be sped up in a similar way, by
>>>>>>> performing one grow operation before adding elements if necessary
>>>>>>> (pessimistically assuming all elements are unique), then use an
>>>>>>> add
>>>>>>> method which does not check for grow need at all. (an add method
>>>>>>> with
>>>>>>> tally add and = object check, but no growCheck)
>>>>>>> If desirable, you could also shrink the capacity after adds are
>>>>>>> done, to match the real amount of new elements.
>>>>>>>
>>>>>>> Adding (1 to: 1200000) to a new Set with such an addAll: method
>>>>>>> took
>>>>>>> 1/2 to  1/3rd the time of the regular addAll: method, even with
>>>>>>> the
>>>>>>> lower grow cost of FasterSets. (Lower gains if Set is presized,
>>>>>>> and if
>>>>>>> hash-calculation is bigger part of cost)
>>>>>>>
>>>>>>> Note: If you want to stay ANSI-compatible, one tricky point in
>>>>>>> such an
>>>>>>> optimization is adding all objects in collection up to a nil
>>>>>>> element,
>>>>>>> then raising an error. (ie: must produce equivalent results to
>>>>>>> add:
>>>>>>> performed of each of collections elements in sequence )
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Henry
>>>>>>>
>>>>>>>
>>>>>>> On Jun 25, 2009, at 8:55 03PM, Stéphane Ducasse wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>              
>>>>>>>> I would really like to assess FasterSets for Pharo1.1
>>>>>>>>
>>>>>>>> Begin forwarded message:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>                
>>>>>>>>> From: Ralph Boland <[hidden email]>
>>>>>>>>> Date: June 25, 2009 8:30:13 PM CEDT
>>>>>>>>> To: Stéphane Ducasse <[hidden email]>
>>>>>>>>> Subject: Re: about fasterTests
>>>>>>>>>
>>>>>>>>> 2009/6/25 Stéphane Ducasse <[hidden email]>:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>                  
>>>>>>>>>> we are interest to get better libraries in pharo.
>>>>>>>>>> If you want to join and help making sure that your code in
>>>>>>>>>> integrated in
>>>>>>>>>> pharo1.1
>>>>>>>>>> please say it and may be join the pharo mailing-list.
>>>>>>>>>>
>>>>>>>>>> Stef
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>                    
>>>>>>>>> Yes, I would like my code 'FasterSets' added to Pharo.
>>>>>>>>> I joined the pharo mailing list.
>>>>>>>>> Today I will mail you my MIT license release.
>>>>>>>>> Probably will take a couple of weeks (From Calgary, Canada)
>>>>>>>>> to get
>>>>>>>>> there.
>>>>>>>>> Once I am added as a contributer I will release my code to
>>>>>>>>> Pharo
>>>>>>>>> for
>>>>>>>>> verification.
>>>>>>>>> In the meantime if any Squeakers try out my code and report
>>>>>>>>> bugs I
>>>>>>>>> will fix them
>>>>>>>>> before releasing to Pharo.
>>>>>>>>>
>>>>>>>>> Some comments:
>>>>>>>>>
>>>>>>>>> 1)  Since FasterSets modifies low level methods in class Set
>>>>>>>>> and
>>>>>>>>> its
>>>>>>>>> subclasses it
>>>>>>>>> may be incompatible with packages that create subclasses of Set
>>>>>>>>> and then
>>>>>>>>> override low level methods.  If someone could send me a list of
>>>>>>>>> package add-ons
>>>>>>>>> to Pharo that do this then I can download them and make any
>>>>>>>>> modifications
>>>>>>>>> necessary for them to run with  FasterSets and make these
>>>>>>>>> changes
>>>>>>>>> part of my
>>>>>>>>> release.  Note that it is no more difficult to subclass from
>>>>>>>>> Set
>>>>>>>>> with FasterSets
>>>>>>>>> than without;  it is just different and only different if you
>>>>>>>>> do
>>>>>>>>> low level things.
>>>>>>>>>
>>>>>>>>> 2)  FasterSets uses a method  'noCompareOrGrowAdd:'  which adds
>>>>>>>>> an element to a set without doing compares or growing the set.
>>>>>>>>> Needless to say it is a private method.
>>>>>>>>> I also have a public method 'nastyAddNew:'  which, like
>>>>>>>>> 'noCompareOrGrowAdd:'  adds an element to a set without
>>>>>>>>> checking
>>>>>>>>> if the
>>>>>>>>> element is already there but does do a grow if needed.
>>>>>>>>> This is useful (i.e. faster) in situations where you KNOW the
>>>>>>>>> object
>>>>>>>>> you are adding to a set is not there already.
>>>>>>>>> However if the object IS already there then you now have two
>>>>>>>>> of them in your set;  i.e.  you now have a subtle bug.
>>>>>>>>> I use 'nastyAddNew:' in my code but never made it a part of
>>>>>>>>> FasterSets
>>>>>>>>> because it is questionable whether or not such a method
>>>>>>>>> should be
>>>>>>>>> available generally.   However, if I can decide I like the
>>>>>>>>> 'nastyAddNew:'
>>>>>>>>> method and am willing to use it despite its risks so can others
>>>>>>>>> so there
>>>>>>>>> is an argument for it being added.
>>>>>>>>> While my expectation is that you don't want  'nastyAddNew:'  in
>>>>>>>>> Pharo
>>>>>>>>> I thought you should at least be aware of it.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>>
>>>>>>>>> Ralph Boland
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>                  
>>>>>>>> _______________________________________________
>>>>>>>> Pharo-project mailing list
>>>>>>>> [hidden email]
>>>>>>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>                
>>>>>>> _______________________________________________
>>>>>>> Pharo-project mailing list
>>>>>>> [hidden email]
>>>>>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>              
>>>>>> _______________________________________________
>>>>>> Pharo-project mailing list
>>>>>> [hidden email]
>>>>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-
>>>>>> project
>>>>>> .
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>            
>>>>> _______________________________________________
>>>>> Pharo-project mailing list
>>>>> [hidden email]
>>>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>>>>
>>>>>
>>>>>
>>>>>          
>>>> .
>>>>
>>>>
>>>>
>>>>        
>>> _______________________________________________
>>> Pharo-project mailing list
>>> [hidden email]
>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>> .
>>>
>>>
>>>      
>> _______________________________________________
>> Pharo-project mailing list
>> [hidden email]
>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>
>>    
>
> .
>
>  

_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: FasterSets

Henrik Sperre Johansen
I submit for your consideration the use case where you want to  
initialize a Set with elements from multiple sources.
How do you do that today?
Well, you could write
set := (Set withAll: source1) addAll: source2; addAll: source3... etc,  
which will surely cause quite some grows. (and also crashes if source1  
has more elements than implementation limits)

A few other options:

set := Set new: (sources inject: 0 into: [:sub :next | sub + next  
size]).
sources do: [:each | set addAll: each ]

or:
tmpWithAllToAdd := sources inject: OrderedCollection new into: [:sub :  
next | sub addAll: next; yourself].
set := Set withAll: tmpAllToAdd

which doesn't perform excessive grows, but still would crash if total  
size of sources is bigger than implementation limits.

So how are these better than modifying addAll: so you can write
set := Set new.
sources do:  [:each | set addAll: each]
and not worry too much how much your set spends on growing/checking if  
it needs to grow?

You could always min: with current implementation limit of Set size if  
you think it's likely that's a case you'll encounter...

And my point was really, I'd rather have an error telling me "Hey,  
you're trying to do something EXTREME" (like, trying to add (in VW)  
more than 2^28 (2^56 on 64bit)
elements to a set) than have my program stop for a large amount of  
time, then finishing "correctly"...

Cheers,
Henry

On Jul 2, 2009, at 4:16 50PM, Andres Valloud wrote:

> However, handling the growth failure wouldn't crash the program...
>
> Henrik Johansen wrote:
>> It would.
>> I'm not sure I consider the current behaviour of spending 12 minutes
>> adding the same element so much better, though.
>>
>> Cheers,
>> Henry
>>
>> On Jul 2, 2009, at 12:17 14PM, Andres Valloud wrote:
>>
>>
>>> Also, what happens in these cases?
>>>
>>> | set |
>>> set := Set new.
>>> bag := Bag new.
>>> bad add: 4 withOccurrences: 1000000000 "or equivalent code".
>>> set addAll: bag
>>>
>>> I am guessing the set will fail to grow and addAll: will fail, even
>>> though it should work.  Maybe it won't happen with bags, or perhaps
>>> the
>>> example needs some massaging.  Is the failure to grow handled
>>> gracefully?  Note that I have not looked at the code because I have
>>> not
>>> had much chance to do that yet.
>>>
>>> Andres Valloud wrote:
>>>
>>>> Hmmm, I don't know though, usually the usage pattern with hashed
>>>> collections has been to presize explicitly and then add to them...
>>>> what
>>>> kind of use cases are you dealing with?
>>>>
>>>> Henrik Johansen wrote:
>>>>
>>>>
>>>>> Having spent a fair deal of time (in VisualWorks) worrying about  
>>>>> my
>>>>> intial Set sizes, writing functions to precompute sane sizes for
>>>>> arbitrary inputs and adding changeCapacityTo: calls before addAll:
>>>>> operations due to performance issues on a case-by-case basis
>>>>> (Squeak/
>>>>> Pharo doesn't even HAVE a changeCapacityTo: ), I have to strongly
>>>>> disagree.
>>>>>
>>>>> I'd rather spend my time thinking about my problem domain, and  
>>>>> know
>>>>> the Set implementation handles the nitty-gritty details in a
>>>>> sufficiently smart way.
>>>>>
>>>>> Whether that means growing once to accomedate the worst case, then
>>>>> shrinking afterwards, or doing a grow check every X added items,  
>>>>> in
>>>>> the unlikely event that what you're adding is a 3 million element
>>>>> Array with 10 unique objects, I don't really care.
>>>>>
>>>>> Cheers,
>>>>> Henry
>>>>>
>>>>> On Jul 2, 2009, at 7:38 10AM, Andres Valloud wrote:
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>> (Set new: desiredCapacity) addAll: "consecutive integers"
>>>>>>
>>>>>> may be a better compromise.
>>>>>>
>>>>>> Andres.
>>>>>>
>>>>>>
>>>>>> Stéphane Ducasse wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>>> Thanks this is really good.
>>>>>>> We will probably integrate these changes really fast
>>>>>>>
>>>>>>> On Jul 1, 2009, at 1:03 PM, Henrik Johansen wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> I took some time to check this yesterday, my comments on the  
>>>>>>>> code
>>>>>>>> itself are basically the same as what Lucas posted in the issue
>>>>>>>> tracker.
>>>>>>>> Works as advertised, but should probably clean out the "gunk"
>>>>>>>> which is
>>>>>>>> unused after it's integrated.
>>>>>>>>
>>>>>>>> Some feedback on possible improvements:
>>>>>>>> - What it does, is basically speed up the adds to the new array
>>>>>>>> in a
>>>>>>>> grow that might happen Why not take this all to the logical
>>>>>>>> conclusion, and instead optimize an addAll: method to be used  
>>>>>>>> in
>>>>>>>> grow,
>>>>>>>> that assumes elements are not in set (and themselves are not
>>>>>>>> duplicates)? F.ex. it could also add to tally just once,
>>>>>>>> instead of
>>>>>>>> one increase per element as the noCompareAdd: does. (an add
>>>>>>>> method
>>>>>>>> with no tally add, no = object check, and no growCheck).
>>>>>>>>
>>>>>>>> - The usual addAll: method can be sped up in a similar way, by
>>>>>>>> performing one grow operation before adding elements if  
>>>>>>>> necessary
>>>>>>>> (pessimistically assuming all elements are unique), then use an
>>>>>>>> add
>>>>>>>> method which does not check for grow need at all. (an add  
>>>>>>>> method
>>>>>>>> with
>>>>>>>> tally add and = object check, but no growCheck)
>>>>>>>> If desirable, you could also shrink the capacity after adds are
>>>>>>>> done, to match the real amount of new elements.
>>>>>>>>
>>>>>>>> Adding (1 to: 1200000) to a new Set with such an addAll: method
>>>>>>>> took
>>>>>>>> 1/2 to  1/3rd the time of the regular addAll: method, even with
>>>>>>>> the
>>>>>>>> lower grow cost of FasterSets. (Lower gains if Set is presized,
>>>>>>>> and if
>>>>>>>> hash-calculation is bigger part of cost)
>>>>>>>>
>>>>>>>> Note: If you want to stay ANSI-compatible, one tricky point in
>>>>>>>> such an
>>>>>>>> optimization is adding all objects in collection up to a nil
>>>>>>>> element,
>>>>>>>> then raising an error. (ie: must produce equivalent results to
>>>>>>>> add:
>>>>>>>> performed of each of collections elements in sequence )
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Henry
>>>>>>>>
>>>>>>>>
>>>>>>>> On Jun 25, 2009, at 8:55 03PM, Stéphane Ducasse wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> I would really like to assess FasterSets for Pharo1.1
>>>>>>>>>
>>>>>>>>> Begin forwarded message:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> From: Ralph Boland <[hidden email]>
>>>>>>>>>> Date: June 25, 2009 8:30:13 PM CEDT
>>>>>>>>>> To: Stéphane Ducasse <[hidden email]>
>>>>>>>>>> Subject: Re: about fasterTests
>>>>>>>>>>
>>>>>>>>>> 2009/6/25 Stéphane Ducasse <[hidden email]>:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> we are interest to get better libraries in pharo.
>>>>>>>>>>> If you want to join and help making sure that your code in
>>>>>>>>>>> integrated in
>>>>>>>>>>> pharo1.1
>>>>>>>>>>> please say it and may be join the pharo mailing-list.
>>>>>>>>>>>
>>>>>>>>>>> Stef
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>> Yes, I would like my code 'FasterSets' added to Pharo.
>>>>>>>>>> I joined the pharo mailing list.
>>>>>>>>>> Today I will mail you my MIT license release.
>>>>>>>>>> Probably will take a couple of weeks (From Calgary, Canada)
>>>>>>>>>> to get
>>>>>>>>>> there.
>>>>>>>>>> Once I am added as a contributer I will release my code to
>>>>>>>>>> Pharo
>>>>>>>>>> for
>>>>>>>>>> verification.
>>>>>>>>>> In the meantime if any Squeakers try out my code and report
>>>>>>>>>> bugs I
>>>>>>>>>> will fix them
>>>>>>>>>> before releasing to Pharo.
>>>>>>>>>>
>>>>>>>>>> Some comments:
>>>>>>>>>>
>>>>>>>>>> 1)  Since FasterSets modifies low level methods in class Set
>>>>>>>>>> and
>>>>>>>>>> its
>>>>>>>>>> subclasses it
>>>>>>>>>> may be incompatible with packages that create subclasses of  
>>>>>>>>>> Set
>>>>>>>>>> and then
>>>>>>>>>> override low level methods.  If someone could send me a  
>>>>>>>>>> list of
>>>>>>>>>> package add-ons
>>>>>>>>>> to Pharo that do this then I can download them and make any
>>>>>>>>>> modifications
>>>>>>>>>> necessary for them to run with  FasterSets and make these
>>>>>>>>>> changes
>>>>>>>>>> part of my
>>>>>>>>>> release.  Note that it is no more difficult to subclass from
>>>>>>>>>> Set
>>>>>>>>>> with FasterSets
>>>>>>>>>> than without;  it is just different and only different if you
>>>>>>>>>> do
>>>>>>>>>> low level things.
>>>>>>>>>>
>>>>>>>>>> 2)  FasterSets uses a method  'noCompareOrGrowAdd:'  which  
>>>>>>>>>> adds
>>>>>>>>>> an element to a set without doing compares or growing the  
>>>>>>>>>> set.
>>>>>>>>>> Needless to say it is a private method.
>>>>>>>>>> I also have a public method 'nastyAddNew:'  which, like
>>>>>>>>>> 'noCompareOrGrowAdd:'  adds an element to a set without
>>>>>>>>>> checking
>>>>>>>>>> if the
>>>>>>>>>> element is already there but does do a grow if needed.
>>>>>>>>>> This is useful (i.e. faster) in situations where you KNOW the
>>>>>>>>>> object
>>>>>>>>>> you are adding to a set is not there already.
>>>>>>>>>> However if the object IS already there then you now have two
>>>>>>>>>> of them in your set;  i.e.  you now have a subtle bug.
>>>>>>>>>> I use 'nastyAddNew:' in my code but never made it a part of
>>>>>>>>>> FasterSets
>>>>>>>>>> because it is questionable whether or not such a method
>>>>>>>>>> should be
>>>>>>>>>> available generally.   However, if I can decide I like the
>>>>>>>>>> 'nastyAddNew:'
>>>>>>>>>> method and am willing to use it despite its risks so can  
>>>>>>>>>> others
>>>>>>>>>> so there
>>>>>>>>>> is an argument for it being added.
>>>>>>>>>> While my expectation is that you don't want  
>>>>>>>>>> 'nastyAddNew:'  in
>>>>>>>>>> Pharo
>>>>>>>>>> I thought you should at least be aware of it.
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>>
>>>>>>>>>> Ralph Boland
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> Pharo-project mailing list
>>>>>>>>> [hidden email]
>>>>>>>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Pharo-project mailing list
>>>>>>>> [hidden email]
>>>>>>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Pharo-project mailing list
>>>>>>> [hidden email]
>>>>>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-
>>>>>>> project
>>>>>>> .
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>> _______________________________________________
>>>>>> Pharo-project mailing list
>>>>>> [hidden email]
>>>>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>> .
>>>>>
>>>>>
>>>>>
>>>>>
>>>> _______________________________________________
>>>> Pharo-project mailing list
>>>> [hidden email]
>>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>>> .
>>>>
>>>>
>>>>
>>> _______________________________________________
>>> Pharo-project mailing list
>>> [hidden email]
>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>>
>>>
>>
>> .
>>
>>
>
> _______________________________________________
> Pharo-project mailing list
> [hidden email]
> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>


_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project