I would really like to assess FasterSets for Pharo1.1
Begin forwarded message: > From: Ralph Boland <[hidden email]> > Date: June 25, 2009 8:30:13 PM CEDT > To: Stéphane Ducasse <[hidden email]> > Subject: Re: about fasterTests > > 2009/6/25 Stéphane Ducasse <[hidden email]>: >> we are interest to get better libraries in pharo. >> If you want to join and help making sure that your code in >> integrated in >> pharo1.1 >> please say it and may be join the pharo mailing-list. >> >> Stef >> > > Yes, I would like my code 'FasterSets' added to Pharo. > I joined the pharo mailing list. > Today I will mail you my MIT license release. > Probably will take a couple of weeks (From Calgary, Canada) to get > there. > Once I am added as a contributer I will release my code to Pharo for > verification. > In the meantime if any Squeakers try out my code and report bugs I > will fix them > before releasing to Pharo. > > Some comments: > > 1) Since FasterSets modifies low level methods in class Set and its > subclasses it > may be incompatible with packages that create subclasses of Set > and then > override low level methods. If someone could send me a list of > package add-ons > to Pharo that do this then I can download them and make any > modifications > necessary for them to run with FasterSets and make these changes > part of my > release. Note that it is no more difficult to subclass from Set > with FasterSets > than without; it is just different and only different if you do > low level things. > > 2) FasterSets uses a method 'noCompareOrGrowAdd:' which adds > an element to a set without doing compares or growing the set. > Needless to say it is a private method. > I also have a public method 'nastyAddNew:' which, like > 'noCompareOrGrowAdd:' adds an element to a set without checking > if the > element is already there but does do a grow if needed. > This is useful (i.e. faster) in situations where you KNOW the > object > you are adding to a set is not there already. > However if the object IS already there then you now have two > of them in your set; i.e. you now have a subtle bug. > I use 'nastyAddNew:' in my code but never made it a part of > FasterSets > because it is questionable whether or not such a method should be > available generally. However, if I can decide I like the > 'nastyAddNew:' > method and am willing to use it despite its risks so can others > so there > is an argument for it being added. > While my expectation is that you don't want 'nastyAddNew:' in > Pharo > I thought you should at least be aware of it. > > Regards, > > Ralph Boland _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
On Thu, Jun 25, 2009 at 5:55 PM, Stéphane Ducasse <[hidden email]> wrote: I would really like to assess FasterSets for Pharo1.1 are all of these things stored somewhere apart from this mailing list ? I saw http://code.google.com/p/pharo/wiki/Milestones but it is quite old Mariano
_______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
2009/6/25 Mariano Martinez Peck <[hidden email]>:
> are all of these things stored somewhere apart from this mailing list ? http://code.google.com/p/pharo/issues/list?can=1&q=label%3AMilestone-1.1 -- Damien Cassou http://damiencassou.seasidehosting.st "Lambdas are relegated to relative obscurity until Java makes them popular by not having them." James Iry _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
In reply to this post by Mariano Martinez Peck
We should clean it.
Stef On Jun 25, 2009, at 10:03 PM, Mariano Martinez Peck wrote: > > On Thu, Jun 25, 2009 at 5:55 PM, Stéphane Ducasse <[hidden email] > > wrote: > I would really like to assess FasterSets for Pharo1.1 > > are all of these things stored somewhere apart from this mailing > list ? > > I saw http://code.google.com/p/pharo/wiki/Milestones > > but it is quite old > > Mariano > > > > Begin forwarded message: > > > From: Ralph Boland <[hidden email]> > > Date: June 25, 2009 8:30:13 PM CEDT > > To: Stéphane Ducasse <[hidden email]> > > Subject: Re: about fasterTests > > > > 2009/6/25 Stéphane Ducasse <[hidden email]>: > >> we are interest to get better libraries in pharo. > >> If you want to join and help making sure that your code in > >> integrated in > >> pharo1.1 > >> please say it and may be join the pharo mailing-list. > >> > >> Stef > >> > > > > Yes, I would like my code 'FasterSets' added to Pharo. > > I joined the pharo mailing list. > > Today I will mail you my MIT license release. > > Probably will take a couple of weeks (From Calgary, Canada) to get > > there. > > Once I am added as a contributer I will release my code to Pharo for > > verification. > > In the meantime if any Squeakers try out my code and report bugs I > > will fix them > > before releasing to Pharo. > > > > Some comments: > > > > 1) Since FasterSets modifies low level methods in class Set and its > > subclasses it > > may be incompatible with packages that create subclasses of Set > > and then > > override low level methods. If someone could send me a list of > > package add-ons > > to Pharo that do this then I can download them and make any > > modifications > > necessary for them to run with FasterSets and make these > changes > > part of my > > release. Note that it is no more difficult to subclass from Set > > with FasterSets > > than without; it is just different and only different if you do > > low level things. > > > > 2) FasterSets uses a method 'noCompareOrGrowAdd:' which adds > > an element to a set without doing compares or growing the set. > > Needless to say it is a private method. > > I also have a public method 'nastyAddNew:' which, like > > 'noCompareOrGrowAdd:' adds an element to a set without checking > > if the > > element is already there but does do a grow if needed. > > This is useful (i.e. faster) in situations where you KNOW the > > object > > you are adding to a set is not there already. > > However if the object IS already there then you now have two > > of them in your set; i.e. you now have a subtle bug. > > I use 'nastyAddNew:' in my code but never made it a part of > > FasterSets > > because it is questionable whether or not such a method should > be > > available generally. However, if I can decide I like the > > 'nastyAddNew:' > > method and am willing to use it despite its risks so can others > > so there > > is an argument for it being added. > > While my expectation is that you don't want 'nastyAddNew:' in > > Pharo > > I thought you should at least be aware of it. > > > > Regards, > > > > Ralph Boland > > > _______________________________________________ > Pharo-project mailing list > [hidden email] > http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project > > _______________________________________________ > Pharo-project mailing list > [hidden email] > http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
In reply to this post by Stéphane Ducasse
I took some time to check this yesterday, my comments on the code
itself are basically the same as what Lucas posted in the issue tracker. Works as advertised, but should probably clean out the "gunk" which is unused after it's integrated. Some feedback on possible improvements: - What it does, is basically speed up the adds to the new array in a grow that might happen Why not take this all to the logical conclusion, and instead optimize an addAll: method to be used in grow, that assumes elements are not in set (and themselves are not duplicates)? F.ex. it could also add to tally just once, instead of one increase per element as the noCompareAdd: does. (an add method with no tally add, no = object check, and no growCheck). - The usual addAll: method can be sped up in a similar way, by performing one grow operation before adding elements if necessary (pessimistically assuming all elements are unique), then use an add method which does not check for grow need at all. (an add method with tally add and = object check, but no growCheck) If desirable, you could also shrink the capacity after adds are done, to match the real amount of new elements. Adding (1 to: 1200000) to a new Set with such an addAll: method took 1/2 to 1/3rd the time of the regular addAll: method, even with the lower grow cost of FasterSets. (Lower gains if Set is presized, and if hash-calculation is bigger part of cost) Note: If you want to stay ANSI-compatible, one tricky point in such an optimization is adding all objects in collection up to a nil element, then raising an error. (ie: must produce equivalent results to add: performed of each of collections elements in sequence ) Cheers, Henry On Jun 25, 2009, at 8:55 03PM, Stéphane Ducasse wrote: > I would really like to assess FasterSets for Pharo1.1 > > Begin forwarded message: > >> From: Ralph Boland <[hidden email]> >> Date: June 25, 2009 8:30:13 PM CEDT >> To: Stéphane Ducasse <[hidden email]> >> Subject: Re: about fasterTests >> >> 2009/6/25 Stéphane Ducasse <[hidden email]>: >>> we are interest to get better libraries in pharo. >>> If you want to join and help making sure that your code in >>> integrated in >>> pharo1.1 >>> please say it and may be join the pharo mailing-list. >>> >>> Stef >>> >> >> Yes, I would like my code 'FasterSets' added to Pharo. >> I joined the pharo mailing list. >> Today I will mail you my MIT license release. >> Probably will take a couple of weeks (From Calgary, Canada) to get >> there. >> Once I am added as a contributer I will release my code to Pharo for >> verification. >> In the meantime if any Squeakers try out my code and report bugs I >> will fix them >> before releasing to Pharo. >> >> Some comments: >> >> 1) Since FasterSets modifies low level methods in class Set and its >> subclasses it >> may be incompatible with packages that create subclasses of Set >> and then >> override low level methods. If someone could send me a list of >> package add-ons >> to Pharo that do this then I can download them and make any >> modifications >> necessary for them to run with FasterSets and make these changes >> part of my >> release. Note that it is no more difficult to subclass from Set >> with FasterSets >> than without; it is just different and only different if you do >> low level things. >> >> 2) FasterSets uses a method 'noCompareOrGrowAdd:' which adds >> an element to a set without doing compares or growing the set. >> Needless to say it is a private method. >> I also have a public method 'nastyAddNew:' which, like >> 'noCompareOrGrowAdd:' adds an element to a set without checking >> if the >> element is already there but does do a grow if needed. >> This is useful (i.e. faster) in situations where you KNOW the >> object >> you are adding to a set is not there already. >> However if the object IS already there then you now have two >> of them in your set; i.e. you now have a subtle bug. >> I use 'nastyAddNew:' in my code but never made it a part of >> FasterSets >> because it is questionable whether or not such a method should be >> available generally. However, if I can decide I like the >> 'nastyAddNew:' >> method and am willing to use it despite its risks so can others >> so there >> is an argument for it being added. >> While my expectation is that you don't want 'nastyAddNew:' in >> Pharo >> I thought you should at least be aware of it. >> >> Regards, >> >> Ralph Boland > > > _______________________________________________ > Pharo-project mailing list > [hidden email] > http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project > _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
Thanks this is really good.
We will probably integrate these changes really fast On Jul 1, 2009, at 1:03 PM, Henrik Johansen wrote: > I took some time to check this yesterday, my comments on the code > itself are basically the same as what Lucas posted in the issue > tracker. > Works as advertised, but should probably clean out the "gunk" which is > unused after it's integrated. > > Some feedback on possible improvements: > - What it does, is basically speed up the adds to the new array in a > grow that might happen Why not take this all to the logical > conclusion, and instead optimize an addAll: method to be used in grow, > that assumes elements are not in set (and themselves are not > duplicates)? F.ex. it could also add to tally just once, instead of > one increase per element as the noCompareAdd: does. (an add method > with no tally add, no = object check, and no growCheck). > > - The usual addAll: method can be sped up in a similar way, by > performing one grow operation before adding elements if necessary > (pessimistically assuming all elements are unique), then use an add > method which does not check for grow need at all. (an add method with > tally add and = object check, but no growCheck) > If desirable, you could also shrink the capacity after adds are > done, to match the real amount of new elements. > > Adding (1 to: 1200000) to a new Set with such an addAll: method took > 1/2 to 1/3rd the time of the regular addAll: method, even with the > lower grow cost of FasterSets. (Lower gains if Set is presized, and if > hash-calculation is bigger part of cost) > > Note: If you want to stay ANSI-compatible, one tricky point in such an > optimization is adding all objects in collection up to a nil element, > then raising an error. (ie: must produce equivalent results to add: > performed of each of collections elements in sequence ) > > Cheers, > Henry > > > On Jun 25, 2009, at 8:55 03PM, Stéphane Ducasse wrote: > >> I would really like to assess FasterSets for Pharo1.1 >> >> Begin forwarded message: >> >>> From: Ralph Boland <[hidden email]> >>> Date: June 25, 2009 8:30:13 PM CEDT >>> To: Stéphane Ducasse <[hidden email]> >>> Subject: Re: about fasterTests >>> >>> 2009/6/25 Stéphane Ducasse <[hidden email]>: >>>> we are interest to get better libraries in pharo. >>>> If you want to join and help making sure that your code in >>>> integrated in >>>> pharo1.1 >>>> please say it and may be join the pharo mailing-list. >>>> >>>> Stef >>>> >>> >>> Yes, I would like my code 'FasterSets' added to Pharo. >>> I joined the pharo mailing list. >>> Today I will mail you my MIT license release. >>> Probably will take a couple of weeks (From Calgary, Canada) to get >>> there. >>> Once I am added as a contributer I will release my code to Pharo for >>> verification. >>> In the meantime if any Squeakers try out my code and report bugs I >>> will fix them >>> before releasing to Pharo. >>> >>> Some comments: >>> >>> 1) Since FasterSets modifies low level methods in class Set and its >>> subclasses it >>> may be incompatible with packages that create subclasses of Set >>> and then >>> override low level methods. If someone could send me a list of >>> package add-ons >>> to Pharo that do this then I can download them and make any >>> modifications >>> necessary for them to run with FasterSets and make these changes >>> part of my >>> release. Note that it is no more difficult to subclass from Set >>> with FasterSets >>> than without; it is just different and only different if you do >>> low level things. >>> >>> 2) FasterSets uses a method 'noCompareOrGrowAdd:' which adds >>> an element to a set without doing compares or growing the set. >>> Needless to say it is a private method. >>> I also have a public method 'nastyAddNew:' which, like >>> 'noCompareOrGrowAdd:' adds an element to a set without checking >>> if the >>> element is already there but does do a grow if needed. >>> This is useful (i.e. faster) in situations where you KNOW the >>> object >>> you are adding to a set is not there already. >>> However if the object IS already there then you now have two >>> of them in your set; i.e. you now have a subtle bug. >>> I use 'nastyAddNew:' in my code but never made it a part of >>> FasterSets >>> because it is questionable whether or not such a method should be >>> available generally. However, if I can decide I like the >>> 'nastyAddNew:' >>> method and am willing to use it despite its risks so can others >>> so there >>> is an argument for it being added. >>> While my expectation is that you don't want 'nastyAddNew:' in >>> Pharo >>> I thought you should at least be aware of it. >>> >>> Regards, >>> >>> Ralph Boland >> >> >> _______________________________________________ >> Pharo-project mailing list >> [hidden email] >> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >> > > > _______________________________________________ > Pharo-project mailing list > [hidden email] > http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
Keep in mind that growing on addAll: before the duplicates are filtered
out may require significantly more memory than necessary. For example, Set new addAll: "all instances of string" I am sure there is an example than illustrates the issue better than the above one. While Set new addAll: "consecutive integers" may run faster, I think doing (Set new: desiredCapacity) addAll: "consecutive integers" may be a better compromise. Andres. Stéphane Ducasse wrote: > Thanks this is really good. > We will probably integrate these changes really fast > > On Jul 1, 2009, at 1:03 PM, Henrik Johansen wrote: > > >> I took some time to check this yesterday, my comments on the code >> itself are basically the same as what Lucas posted in the issue >> tracker. >> Works as advertised, but should probably clean out the "gunk" which is >> unused after it's integrated. >> >> Some feedback on possible improvements: >> - What it does, is basically speed up the adds to the new array in a >> grow that might happen Why not take this all to the logical >> conclusion, and instead optimize an addAll: method to be used in grow, >> that assumes elements are not in set (and themselves are not >> duplicates)? F.ex. it could also add to tally just once, instead of >> one increase per element as the noCompareAdd: does. (an add method >> with no tally add, no = object check, and no growCheck). >> >> - The usual addAll: method can be sped up in a similar way, by >> performing one grow operation before adding elements if necessary >> (pessimistically assuming all elements are unique), then use an add >> method which does not check for grow need at all. (an add method with >> tally add and = object check, but no growCheck) >> If desirable, you could also shrink the capacity after adds are >> done, to match the real amount of new elements. >> >> Adding (1 to: 1200000) to a new Set with such an addAll: method took >> 1/2 to 1/3rd the time of the regular addAll: method, even with the >> lower grow cost of FasterSets. (Lower gains if Set is presized, and if >> hash-calculation is bigger part of cost) >> >> Note: If you want to stay ANSI-compatible, one tricky point in such an >> optimization is adding all objects in collection up to a nil element, >> then raising an error. (ie: must produce equivalent results to add: >> performed of each of collections elements in sequence ) >> >> Cheers, >> Henry >> >> >> On Jun 25, 2009, at 8:55 03PM, Stéphane Ducasse wrote: >> >> >>> I would really like to assess FasterSets for Pharo1.1 >>> >>> Begin forwarded message: >>> >>> >>>> From: Ralph Boland <[hidden email]> >>>> Date: June 25, 2009 8:30:13 PM CEDT >>>> To: Stéphane Ducasse <[hidden email]> >>>> Subject: Re: about fasterTests >>>> >>>> 2009/6/25 Stéphane Ducasse <[hidden email]>: >>>> >>>>> we are interest to get better libraries in pharo. >>>>> If you want to join and help making sure that your code in >>>>> integrated in >>>>> pharo1.1 >>>>> please say it and may be join the pharo mailing-list. >>>>> >>>>> Stef >>>>> >>>>> >>>> Yes, I would like my code 'FasterSets' added to Pharo. >>>> I joined the pharo mailing list. >>>> Today I will mail you my MIT license release. >>>> Probably will take a couple of weeks (From Calgary, Canada) to get >>>> there. >>>> Once I am added as a contributer I will release my code to Pharo for >>>> verification. >>>> In the meantime if any Squeakers try out my code and report bugs I >>>> will fix them >>>> before releasing to Pharo. >>>> >>>> Some comments: >>>> >>>> 1) Since FasterSets modifies low level methods in class Set and its >>>> subclasses it >>>> may be incompatible with packages that create subclasses of Set >>>> and then >>>> override low level methods. If someone could send me a list of >>>> package add-ons >>>> to Pharo that do this then I can download them and make any >>>> modifications >>>> necessary for them to run with FasterSets and make these changes >>>> part of my >>>> release. Note that it is no more difficult to subclass from Set >>>> with FasterSets >>>> than without; it is just different and only different if you do >>>> low level things. >>>> >>>> 2) FasterSets uses a method 'noCompareOrGrowAdd:' which adds >>>> an element to a set without doing compares or growing the set. >>>> Needless to say it is a private method. >>>> I also have a public method 'nastyAddNew:' which, like >>>> 'noCompareOrGrowAdd:' adds an element to a set without checking >>>> if the >>>> element is already there but does do a grow if needed. >>>> This is useful (i.e. faster) in situations where you KNOW the >>>> object >>>> you are adding to a set is not there already. >>>> However if the object IS already there then you now have two >>>> of them in your set; i.e. you now have a subtle bug. >>>> I use 'nastyAddNew:' in my code but never made it a part of >>>> FasterSets >>>> because it is questionable whether or not such a method should be >>>> available generally. However, if I can decide I like the >>>> 'nastyAddNew:' >>>> method and am willing to use it despite its risks so can others >>>> so there >>>> is an argument for it being added. >>>> While my expectation is that you don't want 'nastyAddNew:' in >>>> Pharo >>>> I thought you should at least be aware of it. >>>> >>>> Regards, >>>> >>>> Ralph Boland >>>> >>> _______________________________________________ >>> Pharo-project mailing list >>> [hidden email] >>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >>> >>> >> _______________________________________________ >> Pharo-project mailing list >> [hidden email] >> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >> > > > _______________________________________________ > Pharo-project mailing list > [hidden email] > http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project > . > > _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
Having spent a fair deal of time (in VisualWorks) worrying about my
intial Set sizes, writing functions to precompute sane sizes for arbitrary inputs and adding changeCapacityTo: calls before addAll: operations due to performance issues on a case-by-case basis (Squeak/ Pharo doesn't even HAVE a changeCapacityTo: ), I have to strongly disagree. I'd rather spend my time thinking about my problem domain, and know the Set implementation handles the nitty-gritty details in a sufficiently smart way. Whether that means growing once to accomedate the worst case, then shrinking afterwards, or doing a grow check every X added items, in the unlikely event that what you're adding is a 3 million element Array with 10 unique objects, I don't really care. Cheers, Henry On Jul 2, 2009, at 7:38 10AM, Andres Valloud wrote: > > (Set new: desiredCapacity) addAll: "consecutive integers" > > may be a better compromise. > > Andres. > > > Stéphane Ducasse wrote: >> Thanks this is really good. >> We will probably integrate these changes really fast >> >> On Jul 1, 2009, at 1:03 PM, Henrik Johansen wrote: >> >> >>> I took some time to check this yesterday, my comments on the code >>> itself are basically the same as what Lucas posted in the issue >>> tracker. >>> Works as advertised, but should probably clean out the "gunk" >>> which is >>> unused after it's integrated. >>> >>> Some feedback on possible improvements: >>> - What it does, is basically speed up the adds to the new array in a >>> grow that might happen Why not take this all to the logical >>> conclusion, and instead optimize an addAll: method to be used in >>> grow, >>> that assumes elements are not in set (and themselves are not >>> duplicates)? F.ex. it could also add to tally just once, instead of >>> one increase per element as the noCompareAdd: does. (an add method >>> with no tally add, no = object check, and no growCheck). >>> >>> - The usual addAll: method can be sped up in a similar way, by >>> performing one grow operation before adding elements if necessary >>> (pessimistically assuming all elements are unique), then use an add >>> method which does not check for grow need at all. (an add method >>> with >>> tally add and = object check, but no growCheck) >>> If desirable, you could also shrink the capacity after adds are >>> done, to match the real amount of new elements. >>> >>> Adding (1 to: 1200000) to a new Set with such an addAll: method took >>> 1/2 to 1/3rd the time of the regular addAll: method, even with the >>> lower grow cost of FasterSets. (Lower gains if Set is presized, >>> and if >>> hash-calculation is bigger part of cost) >>> >>> Note: If you want to stay ANSI-compatible, one tricky point in >>> such an >>> optimization is adding all objects in collection up to a nil >>> element, >>> then raising an error. (ie: must produce equivalent results to add: >>> performed of each of collections elements in sequence ) >>> >>> Cheers, >>> Henry >>> >>> >>> On Jun 25, 2009, at 8:55 03PM, Stéphane Ducasse wrote: >>> >>> >>>> I would really like to assess FasterSets for Pharo1.1 >>>> >>>> Begin forwarded message: >>>> >>>> >>>>> From: Ralph Boland <[hidden email]> >>>>> Date: June 25, 2009 8:30:13 PM CEDT >>>>> To: Stéphane Ducasse <[hidden email]> >>>>> Subject: Re: about fasterTests >>>>> >>>>> 2009/6/25 Stéphane Ducasse <[hidden email]>: >>>>> >>>>>> we are interest to get better libraries in pharo. >>>>>> If you want to join and help making sure that your code in >>>>>> integrated in >>>>>> pharo1.1 >>>>>> please say it and may be join the pharo mailing-list. >>>>>> >>>>>> Stef >>>>>> >>>>>> >>>>> Yes, I would like my code 'FasterSets' added to Pharo. >>>>> I joined the pharo mailing list. >>>>> Today I will mail you my MIT license release. >>>>> Probably will take a couple of weeks (From Calgary, Canada) to get >>>>> there. >>>>> Once I am added as a contributer I will release my code to Pharo >>>>> for >>>>> verification. >>>>> In the meantime if any Squeakers try out my code and report bugs I >>>>> will fix them >>>>> before releasing to Pharo. >>>>> >>>>> Some comments: >>>>> >>>>> 1) Since FasterSets modifies low level methods in class Set and >>>>> its >>>>> subclasses it >>>>> may be incompatible with packages that create subclasses of Set >>>>> and then >>>>> override low level methods. If someone could send me a list of >>>>> package add-ons >>>>> to Pharo that do this then I can download them and make any >>>>> modifications >>>>> necessary for them to run with FasterSets and make these changes >>>>> part of my >>>>> release. Note that it is no more difficult to subclass from Set >>>>> with FasterSets >>>>> than without; it is just different and only different if you do >>>>> low level things. >>>>> >>>>> 2) FasterSets uses a method 'noCompareOrGrowAdd:' which adds >>>>> an element to a set without doing compares or growing the set. >>>>> Needless to say it is a private method. >>>>> I also have a public method 'nastyAddNew:' which, like >>>>> 'noCompareOrGrowAdd:' adds an element to a set without checking >>>>> if the >>>>> element is already there but does do a grow if needed. >>>>> This is useful (i.e. faster) in situations where you KNOW the >>>>> object >>>>> you are adding to a set is not there already. >>>>> However if the object IS already there then you now have two >>>>> of them in your set; i.e. you now have a subtle bug. >>>>> I use 'nastyAddNew:' in my code but never made it a part of >>>>> FasterSets >>>>> because it is questionable whether or not such a method should be >>>>> available generally. However, if I can decide I like the >>>>> 'nastyAddNew:' >>>>> method and am willing to use it despite its risks so can others >>>>> so there >>>>> is an argument for it being added. >>>>> While my expectation is that you don't want 'nastyAddNew:' in >>>>> Pharo >>>>> I thought you should at least be aware of it. >>>>> >>>>> Regards, >>>>> >>>>> Ralph Boland >>>>> >>>> _______________________________________________ >>>> Pharo-project mailing list >>>> [hidden email] >>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >>>> >>>> >>> _______________________________________________ >>> Pharo-project mailing list >>> [hidden email] >>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >>> >> >> >> _______________________________________________ >> Pharo-project mailing list >> [hidden email] >> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >> . >> >> > > _______________________________________________ > Pharo-project mailing list > [hidden email] > http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project > _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
Hmmm, I don't know though, usually the usage pattern with hashed
collections has been to presize explicitly and then add to them... what kind of use cases are you dealing with? Henrik Johansen wrote: > Having spent a fair deal of time (in VisualWorks) worrying about my > intial Set sizes, writing functions to precompute sane sizes for > arbitrary inputs and adding changeCapacityTo: calls before addAll: > operations due to performance issues on a case-by-case basis (Squeak/ > Pharo doesn't even HAVE a changeCapacityTo: ), I have to strongly > disagree. > > I'd rather spend my time thinking about my problem domain, and know > the Set implementation handles the nitty-gritty details in a > sufficiently smart way. > > Whether that means growing once to accomedate the worst case, then > shrinking afterwards, or doing a grow check every X added items, in > the unlikely event that what you're adding is a 3 million element > Array with 10 unique objects, I don't really care. > > Cheers, > Henry > > On Jul 2, 2009, at 7:38 10AM, Andres Valloud wrote: > > >> (Set new: desiredCapacity) addAll: "consecutive integers" >> >> may be a better compromise. >> >> Andres. >> >> >> Stéphane Ducasse wrote: >> >>> Thanks this is really good. >>> We will probably integrate these changes really fast >>> >>> On Jul 1, 2009, at 1:03 PM, Henrik Johansen wrote: >>> >>> >>> >>>> I took some time to check this yesterday, my comments on the code >>>> itself are basically the same as what Lucas posted in the issue >>>> tracker. >>>> Works as advertised, but should probably clean out the "gunk" >>>> which is >>>> unused after it's integrated. >>>> >>>> Some feedback on possible improvements: >>>> - What it does, is basically speed up the adds to the new array in a >>>> grow that might happen Why not take this all to the logical >>>> conclusion, and instead optimize an addAll: method to be used in >>>> grow, >>>> that assumes elements are not in set (and themselves are not >>>> duplicates)? F.ex. it could also add to tally just once, instead of >>>> one increase per element as the noCompareAdd: does. (an add method >>>> with no tally add, no = object check, and no growCheck). >>>> >>>> - The usual addAll: method can be sped up in a similar way, by >>>> performing one grow operation before adding elements if necessary >>>> (pessimistically assuming all elements are unique), then use an add >>>> method which does not check for grow need at all. (an add method >>>> with >>>> tally add and = object check, but no growCheck) >>>> If desirable, you could also shrink the capacity after adds are >>>> done, to match the real amount of new elements. >>>> >>>> Adding (1 to: 1200000) to a new Set with such an addAll: method took >>>> 1/2 to 1/3rd the time of the regular addAll: method, even with the >>>> lower grow cost of FasterSets. (Lower gains if Set is presized, >>>> and if >>>> hash-calculation is bigger part of cost) >>>> >>>> Note: If you want to stay ANSI-compatible, one tricky point in >>>> such an >>>> optimization is adding all objects in collection up to a nil >>>> element, >>>> then raising an error. (ie: must produce equivalent results to add: >>>> performed of each of collections elements in sequence ) >>>> >>>> Cheers, >>>> Henry >>>> >>>> >>>> On Jun 25, 2009, at 8:55 03PM, Stéphane Ducasse wrote: >>>> >>>> >>>> >>>>> I would really like to assess FasterSets for Pharo1.1 >>>>> >>>>> Begin forwarded message: >>>>> >>>>> >>>>> >>>>>> From: Ralph Boland <[hidden email]> >>>>>> Date: June 25, 2009 8:30:13 PM CEDT >>>>>> To: Stéphane Ducasse <[hidden email]> >>>>>> Subject: Re: about fasterTests >>>>>> >>>>>> 2009/6/25 Stéphane Ducasse <[hidden email]>: >>>>>> >>>>>> >>>>>>> we are interest to get better libraries in pharo. >>>>>>> If you want to join and help making sure that your code in >>>>>>> integrated in >>>>>>> pharo1.1 >>>>>>> please say it and may be join the pharo mailing-list. >>>>>>> >>>>>>> Stef >>>>>>> >>>>>>> >>>>>>> >>>>>> Yes, I would like my code 'FasterSets' added to Pharo. >>>>>> I joined the pharo mailing list. >>>>>> Today I will mail you my MIT license release. >>>>>> Probably will take a couple of weeks (From Calgary, Canada) to get >>>>>> there. >>>>>> Once I am added as a contributer I will release my code to Pharo >>>>>> for >>>>>> verification. >>>>>> In the meantime if any Squeakers try out my code and report bugs I >>>>>> will fix them >>>>>> before releasing to Pharo. >>>>>> >>>>>> Some comments: >>>>>> >>>>>> 1) Since FasterSets modifies low level methods in class Set and >>>>>> its >>>>>> subclasses it >>>>>> may be incompatible with packages that create subclasses of Set >>>>>> and then >>>>>> override low level methods. If someone could send me a list of >>>>>> package add-ons >>>>>> to Pharo that do this then I can download them and make any >>>>>> modifications >>>>>> necessary for them to run with FasterSets and make these changes >>>>>> part of my >>>>>> release. Note that it is no more difficult to subclass from Set >>>>>> with FasterSets >>>>>> than without; it is just different and only different if you do >>>>>> low level things. >>>>>> >>>>>> 2) FasterSets uses a method 'noCompareOrGrowAdd:' which adds >>>>>> an element to a set without doing compares or growing the set. >>>>>> Needless to say it is a private method. >>>>>> I also have a public method 'nastyAddNew:' which, like >>>>>> 'noCompareOrGrowAdd:' adds an element to a set without checking >>>>>> if the >>>>>> element is already there but does do a grow if needed. >>>>>> This is useful (i.e. faster) in situations where you KNOW the >>>>>> object >>>>>> you are adding to a set is not there already. >>>>>> However if the object IS already there then you now have two >>>>>> of them in your set; i.e. you now have a subtle bug. >>>>>> I use 'nastyAddNew:' in my code but never made it a part of >>>>>> FasterSets >>>>>> because it is questionable whether or not such a method should be >>>>>> available generally. However, if I can decide I like the >>>>>> 'nastyAddNew:' >>>>>> method and am willing to use it despite its risks so can others >>>>>> so there >>>>>> is an argument for it being added. >>>>>> While my expectation is that you don't want 'nastyAddNew:' in >>>>>> Pharo >>>>>> I thought you should at least be aware of it. >>>>>> >>>>>> Regards, >>>>>> >>>>>> Ralph Boland >>>>>> >>>>>> >>>>> _______________________________________________ >>>>> Pharo-project mailing list >>>>> [hidden email] >>>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >>>>> >>>>> >>>>> >>>> _______________________________________________ >>>> Pharo-project mailing list >>>> [hidden email] >>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >>>> >>>> >>> _______________________________________________ >>> Pharo-project mailing list >>> [hidden email] >>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >>> . >>> >>> >>> >> _______________________________________________ >> Pharo-project mailing list >> [hidden email] >> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >> >> > > . > > _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
Also, what happens in these cases?
| set | set := Set new. bag := Bag new. bad add: 4 withOccurrences: 1000000000 "or equivalent code". set addAll: bag I am guessing the set will fail to grow and addAll: will fail, even though it should work. Maybe it won't happen with bags, or perhaps the example needs some massaging. Is the failure to grow handled gracefully? Note that I have not looked at the code because I have not had much chance to do that yet. Andres Valloud wrote: > Hmmm, I don't know though, usually the usage pattern with hashed > collections has been to presize explicitly and then add to them... what > kind of use cases are you dealing with? > > Henrik Johansen wrote: > >> Having spent a fair deal of time (in VisualWorks) worrying about my >> intial Set sizes, writing functions to precompute sane sizes for >> arbitrary inputs and adding changeCapacityTo: calls before addAll: >> operations due to performance issues on a case-by-case basis (Squeak/ >> Pharo doesn't even HAVE a changeCapacityTo: ), I have to strongly >> disagree. >> >> I'd rather spend my time thinking about my problem domain, and know >> the Set implementation handles the nitty-gritty details in a >> sufficiently smart way. >> >> Whether that means growing once to accomedate the worst case, then >> shrinking afterwards, or doing a grow check every X added items, in >> the unlikely event that what you're adding is a 3 million element >> Array with 10 unique objects, I don't really care. >> >> Cheers, >> Henry >> >> On Jul 2, 2009, at 7:38 10AM, Andres Valloud wrote: >> >> >> >>> (Set new: desiredCapacity) addAll: "consecutive integers" >>> >>> may be a better compromise. >>> >>> Andres. >>> >>> >>> Stéphane Ducasse wrote: >>> >>> >>>> Thanks this is really good. >>>> We will probably integrate these changes really fast >>>> >>>> On Jul 1, 2009, at 1:03 PM, Henrik Johansen wrote: >>>> >>>> >>>> >>>> >>>>> I took some time to check this yesterday, my comments on the code >>>>> itself are basically the same as what Lucas posted in the issue >>>>> tracker. >>>>> Works as advertised, but should probably clean out the "gunk" >>>>> which is >>>>> unused after it's integrated. >>>>> >>>>> Some feedback on possible improvements: >>>>> - What it does, is basically speed up the adds to the new array in a >>>>> grow that might happen Why not take this all to the logical >>>>> conclusion, and instead optimize an addAll: method to be used in >>>>> grow, >>>>> that assumes elements are not in set (and themselves are not >>>>> duplicates)? F.ex. it could also add to tally just once, instead of >>>>> one increase per element as the noCompareAdd: does. (an add method >>>>> with no tally add, no = object check, and no growCheck). >>>>> >>>>> - The usual addAll: method can be sped up in a similar way, by >>>>> performing one grow operation before adding elements if necessary >>>>> (pessimistically assuming all elements are unique), then use an add >>>>> method which does not check for grow need at all. (an add method >>>>> with >>>>> tally add and = object check, but no growCheck) >>>>> If desirable, you could also shrink the capacity after adds are >>>>> done, to match the real amount of new elements. >>>>> >>>>> Adding (1 to: 1200000) to a new Set with such an addAll: method took >>>>> 1/2 to 1/3rd the time of the regular addAll: method, even with the >>>>> lower grow cost of FasterSets. (Lower gains if Set is presized, >>>>> and if >>>>> hash-calculation is bigger part of cost) >>>>> >>>>> Note: If you want to stay ANSI-compatible, one tricky point in >>>>> such an >>>>> optimization is adding all objects in collection up to a nil >>>>> element, >>>>> then raising an error. (ie: must produce equivalent results to add: >>>>> performed of each of collections elements in sequence ) >>>>> >>>>> Cheers, >>>>> Henry >>>>> >>>>> >>>>> On Jun 25, 2009, at 8:55 03PM, Stéphane Ducasse wrote: >>>>> >>>>> >>>>> >>>>> >>>>>> I would really like to assess FasterSets for Pharo1.1 >>>>>> >>>>>> Begin forwarded message: >>>>>> >>>>>> >>>>>> >>>>>> >>>>>>> From: Ralph Boland <[hidden email]> >>>>>>> Date: June 25, 2009 8:30:13 PM CEDT >>>>>>> To: Stéphane Ducasse <[hidden email]> >>>>>>> Subject: Re: about fasterTests >>>>>>> >>>>>>> 2009/6/25 Stéphane Ducasse <[hidden email]>: >>>>>>> >>>>>>> >>>>>>> >>>>>>>> we are interest to get better libraries in pharo. >>>>>>>> If you want to join and help making sure that your code in >>>>>>>> integrated in >>>>>>>> pharo1.1 >>>>>>>> please say it and may be join the pharo mailing-list. >>>>>>>> >>>>>>>> Stef >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>> Yes, I would like my code 'FasterSets' added to Pharo. >>>>>>> I joined the pharo mailing list. >>>>>>> Today I will mail you my MIT license release. >>>>>>> Probably will take a couple of weeks (From Calgary, Canada) to get >>>>>>> there. >>>>>>> Once I am added as a contributer I will release my code to Pharo >>>>>>> for >>>>>>> verification. >>>>>>> In the meantime if any Squeakers try out my code and report bugs I >>>>>>> will fix them >>>>>>> before releasing to Pharo. >>>>>>> >>>>>>> Some comments: >>>>>>> >>>>>>> 1) Since FasterSets modifies low level methods in class Set and >>>>>>> its >>>>>>> subclasses it >>>>>>> may be incompatible with packages that create subclasses of Set >>>>>>> and then >>>>>>> override low level methods. If someone could send me a list of >>>>>>> package add-ons >>>>>>> to Pharo that do this then I can download them and make any >>>>>>> modifications >>>>>>> necessary for them to run with FasterSets and make these changes >>>>>>> part of my >>>>>>> release. Note that it is no more difficult to subclass from Set >>>>>>> with FasterSets >>>>>>> than without; it is just different and only different if you do >>>>>>> low level things. >>>>>>> >>>>>>> 2) FasterSets uses a method 'noCompareOrGrowAdd:' which adds >>>>>>> an element to a set without doing compares or growing the set. >>>>>>> Needless to say it is a private method. >>>>>>> I also have a public method 'nastyAddNew:' which, like >>>>>>> 'noCompareOrGrowAdd:' adds an element to a set without checking >>>>>>> if the >>>>>>> element is already there but does do a grow if needed. >>>>>>> This is useful (i.e. faster) in situations where you KNOW the >>>>>>> object >>>>>>> you are adding to a set is not there already. >>>>>>> However if the object IS already there then you now have two >>>>>>> of them in your set; i.e. you now have a subtle bug. >>>>>>> I use 'nastyAddNew:' in my code but never made it a part of >>>>>>> FasterSets >>>>>>> because it is questionable whether or not such a method should be >>>>>>> available generally. However, if I can decide I like the >>>>>>> 'nastyAddNew:' >>>>>>> method and am willing to use it despite its risks so can others >>>>>>> so there >>>>>>> is an argument for it being added. >>>>>>> While my expectation is that you don't want 'nastyAddNew:' in >>>>>>> Pharo >>>>>>> I thought you should at least be aware of it. >>>>>>> >>>>>>> Regards, >>>>>>> >>>>>>> Ralph Boland >>>>>>> >>>>>>> >>>>>>> >>>>>> _______________________________________________ >>>>>> Pharo-project mailing list >>>>>> [hidden email] >>>>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >>>>>> >>>>>> >>>>>> >>>>>> >>>>> _______________________________________________ >>>>> Pharo-project mailing list >>>>> [hidden email] >>>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >>>>> >>>>> >>>>> >>>> _______________________________________________ >>>> Pharo-project mailing list >>>> [hidden email] >>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >>>> . >>>> >>>> >>>> >>>> >>> _______________________________________________ >>> Pharo-project mailing list >>> [hidden email] >>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >>> >>> >>> >> . >> >> >> > > _______________________________________________ > Pharo-project mailing list > [hidden email] > http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project > . > > _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
In reply to this post by Andres Valloud-4
Anything using a dynamic lookup cache, but also doing bulk updates to
the cache? Searchlight in VisualWorks and Monticello in Pharo are two examples. Cheers, Henry On Jul 2, 2009, at 12:13 21PM, Andres Valloud wrote: > Hmmm, I don't know though, usually the usage pattern with hashed > collections has been to presize explicitly and then add to them... > what > kind of use cases are you dealing with? > > Henrik Johansen wrote: >> Having spent a fair deal of time (in VisualWorks) worrying about my >> intial Set sizes, writing functions to precompute sane sizes for >> arbitrary inputs and adding changeCapacityTo: calls before addAll: >> operations due to performance issues on a case-by-case basis (Squeak/ >> Pharo doesn't even HAVE a changeCapacityTo: ), I have to strongly >> disagree. >> >> I'd rather spend my time thinking about my problem domain, and know >> the Set implementation handles the nitty-gritty details in a >> sufficiently smart way. >> >> Whether that means growing once to accomedate the worst case, then >> shrinking afterwards, or doing a grow check every X added items, in >> the unlikely event that what you're adding is a 3 million element >> Array with 10 unique objects, I don't really care. >> >> Cheers, >> Henry >> >> On Jul 2, 2009, at 7:38 10AM, Andres Valloud wrote: >> >> >>> (Set new: desiredCapacity) addAll: "consecutive integers" >>> >>> may be a better compromise. >>> >>> Andres. >>> >>> >>> Stéphane Ducasse wrote: >>> >>>> Thanks this is really good. >>>> We will probably integrate these changes really fast >>>> >>>> On Jul 1, 2009, at 1:03 PM, Henrik Johansen wrote: >>>> >>>> >>>> >>>>> I took some time to check this yesterday, my comments on the code >>>>> itself are basically the same as what Lucas posted in the issue >>>>> tracker. >>>>> Works as advertised, but should probably clean out the "gunk" >>>>> which is >>>>> unused after it's integrated. >>>>> >>>>> Some feedback on possible improvements: >>>>> - What it does, is basically speed up the adds to the new array >>>>> in a >>>>> grow that might happen Why not take this all to the logical >>>>> conclusion, and instead optimize an addAll: method to be used in >>>>> grow, >>>>> that assumes elements are not in set (and themselves are not >>>>> duplicates)? F.ex. it could also add to tally just once, instead >>>>> of >>>>> one increase per element as the noCompareAdd: does. (an add method >>>>> with no tally add, no = object check, and no growCheck). >>>>> >>>>> - The usual addAll: method can be sped up in a similar way, by >>>>> performing one grow operation before adding elements if necessary >>>>> (pessimistically assuming all elements are unique), then use an >>>>> add >>>>> method which does not check for grow need at all. (an add method >>>>> with >>>>> tally add and = object check, but no growCheck) >>>>> If desirable, you could also shrink the capacity after adds are >>>>> done, to match the real amount of new elements. >>>>> >>>>> Adding (1 to: 1200000) to a new Set with such an addAll: method >>>>> took >>>>> 1/2 to 1/3rd the time of the regular addAll: method, even with >>>>> the >>>>> lower grow cost of FasterSets. (Lower gains if Set is presized, >>>>> and if >>>>> hash-calculation is bigger part of cost) >>>>> >>>>> Note: If you want to stay ANSI-compatible, one tricky point in >>>>> such an >>>>> optimization is adding all objects in collection up to a nil >>>>> element, >>>>> then raising an error. (ie: must produce equivalent results to >>>>> add: >>>>> performed of each of collections elements in sequence ) >>>>> >>>>> Cheers, >>>>> Henry >>>>> >>>>> >>>>> On Jun 25, 2009, at 8:55 03PM, Stéphane Ducasse wrote: >>>>> >>>>> >>>>> >>>>>> I would really like to assess FasterSets for Pharo1.1 >>>>>> >>>>>> Begin forwarded message: >>>>>> >>>>>> >>>>>> >>>>>>> From: Ralph Boland <[hidden email]> >>>>>>> Date: June 25, 2009 8:30:13 PM CEDT >>>>>>> To: Stéphane Ducasse <[hidden email]> >>>>>>> Subject: Re: about fasterTests >>>>>>> >>>>>>> 2009/6/25 Stéphane Ducasse <[hidden email]>: >>>>>>> >>>>>>> >>>>>>>> we are interest to get better libraries in pharo. >>>>>>>> If you want to join and help making sure that your code in >>>>>>>> integrated in >>>>>>>> pharo1.1 >>>>>>>> please say it and may be join the pharo mailing-list. >>>>>>>> >>>>>>>> Stef >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>> Yes, I would like my code 'FasterSets' added to Pharo. >>>>>>> I joined the pharo mailing list. >>>>>>> Today I will mail you my MIT license release. >>>>>>> Probably will take a couple of weeks (From Calgary, Canada) to >>>>>>> get >>>>>>> there. >>>>>>> Once I am added as a contributer I will release my code to Pharo >>>>>>> for >>>>>>> verification. >>>>>>> In the meantime if any Squeakers try out my code and report >>>>>>> bugs I >>>>>>> will fix them >>>>>>> before releasing to Pharo. >>>>>>> >>>>>>> Some comments: >>>>>>> >>>>>>> 1) Since FasterSets modifies low level methods in class Set and >>>>>>> its >>>>>>> subclasses it >>>>>>> may be incompatible with packages that create subclasses of Set >>>>>>> and then >>>>>>> override low level methods. If someone could send me a list of >>>>>>> package add-ons >>>>>>> to Pharo that do this then I can download them and make any >>>>>>> modifications >>>>>>> necessary for them to run with FasterSets and make these >>>>>>> changes >>>>>>> part of my >>>>>>> release. Note that it is no more difficult to subclass from Set >>>>>>> with FasterSets >>>>>>> than without; it is just different and only different if you do >>>>>>> low level things. >>>>>>> >>>>>>> 2) FasterSets uses a method 'noCompareOrGrowAdd:' which adds >>>>>>> an element to a set without doing compares or growing the set. >>>>>>> Needless to say it is a private method. >>>>>>> I also have a public method 'nastyAddNew:' which, like >>>>>>> 'noCompareOrGrowAdd:' adds an element to a set without checking >>>>>>> if the >>>>>>> element is already there but does do a grow if needed. >>>>>>> This is useful (i.e. faster) in situations where you KNOW the >>>>>>> object >>>>>>> you are adding to a set is not there already. >>>>>>> However if the object IS already there then you now have two >>>>>>> of them in your set; i.e. you now have a subtle bug. >>>>>>> I use 'nastyAddNew:' in my code but never made it a part of >>>>>>> FasterSets >>>>>>> because it is questionable whether or not such a method should >>>>>>> be >>>>>>> available generally. However, if I can decide I like the >>>>>>> 'nastyAddNew:' >>>>>>> method and am willing to use it despite its risks so can others >>>>>>> so there >>>>>>> is an argument for it being added. >>>>>>> While my expectation is that you don't want 'nastyAddNew:' in >>>>>>> Pharo >>>>>>> I thought you should at least be aware of it. >>>>>>> >>>>>>> Regards, >>>>>>> >>>>>>> Ralph Boland >>>>>>> >>>>>>> >>>>>> _______________________________________________ >>>>>> Pharo-project mailing list >>>>>> [hidden email] >>>>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >>>>>> >>>>>> >>>>>> >>>>> _______________________________________________ >>>>> Pharo-project mailing list >>>>> [hidden email] >>>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo- >>>>> project >>>>> >>>>> >>>> _______________________________________________ >>>> Pharo-project mailing list >>>> [hidden email] >>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >>>> . >>>> >>>> >>>> >>> _______________________________________________ >>> Pharo-project mailing list >>> [hidden email] >>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >>> >>> >> >> . >> >> > > _______________________________________________ > Pharo-project mailing list > [hidden email] > http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project > _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
In reply to this post by Andres Valloud-4
It would.
I'm not sure I consider the current behaviour of spending 12 minutes adding the same element so much better, though. Cheers, Henry On Jul 2, 2009, at 12:17 14PM, Andres Valloud wrote: > Also, what happens in these cases? > > | set | > set := Set new. > bag := Bag new. > bad add: 4 withOccurrences: 1000000000 "or equivalent code". > set addAll: bag > > I am guessing the set will fail to grow and addAll: will fail, even > though it should work. Maybe it won't happen with bags, or perhaps > the > example needs some massaging. Is the failure to grow handled > gracefully? Note that I have not looked at the code because I have > not > had much chance to do that yet. > > Andres Valloud wrote: >> Hmmm, I don't know though, usually the usage pattern with hashed >> collections has been to presize explicitly and then add to them... >> what >> kind of use cases are you dealing with? >> >> Henrik Johansen wrote: >> >>> Having spent a fair deal of time (in VisualWorks) worrying about my >>> intial Set sizes, writing functions to precompute sane sizes for >>> arbitrary inputs and adding changeCapacityTo: calls before addAll: >>> operations due to performance issues on a case-by-case basis >>> (Squeak/ >>> Pharo doesn't even HAVE a changeCapacityTo: ), I have to strongly >>> disagree. >>> >>> I'd rather spend my time thinking about my problem domain, and know >>> the Set implementation handles the nitty-gritty details in a >>> sufficiently smart way. >>> >>> Whether that means growing once to accomedate the worst case, then >>> shrinking afterwards, or doing a grow check every X added items, in >>> the unlikely event that what you're adding is a 3 million element >>> Array with 10 unique objects, I don't really care. >>> >>> Cheers, >>> Henry >>> >>> On Jul 2, 2009, at 7:38 10AM, Andres Valloud wrote: >>> >>> >>> >>>> (Set new: desiredCapacity) addAll: "consecutive integers" >>>> >>>> may be a better compromise. >>>> >>>> Andres. >>>> >>>> >>>> Stéphane Ducasse wrote: >>>> >>>> >>>>> Thanks this is really good. >>>>> We will probably integrate these changes really fast >>>>> >>>>> On Jul 1, 2009, at 1:03 PM, Henrik Johansen wrote: >>>>> >>>>> >>>>> >>>>> >>>>>> I took some time to check this yesterday, my comments on the code >>>>>> itself are basically the same as what Lucas posted in the issue >>>>>> tracker. >>>>>> Works as advertised, but should probably clean out the "gunk" >>>>>> which is >>>>>> unused after it's integrated. >>>>>> >>>>>> Some feedback on possible improvements: >>>>>> - What it does, is basically speed up the adds to the new array >>>>>> in a >>>>>> grow that might happen Why not take this all to the logical >>>>>> conclusion, and instead optimize an addAll: method to be used in >>>>>> grow, >>>>>> that assumes elements are not in set (and themselves are not >>>>>> duplicates)? F.ex. it could also add to tally just once, >>>>>> instead of >>>>>> one increase per element as the noCompareAdd: does. (an add >>>>>> method >>>>>> with no tally add, no = object check, and no growCheck). >>>>>> >>>>>> - The usual addAll: method can be sped up in a similar way, by >>>>>> performing one grow operation before adding elements if necessary >>>>>> (pessimistically assuming all elements are unique), then use an >>>>>> add >>>>>> method which does not check for grow need at all. (an add method >>>>>> with >>>>>> tally add and = object check, but no growCheck) >>>>>> If desirable, you could also shrink the capacity after adds are >>>>>> done, to match the real amount of new elements. >>>>>> >>>>>> Adding (1 to: 1200000) to a new Set with such an addAll: method >>>>>> took >>>>>> 1/2 to 1/3rd the time of the regular addAll: method, even with >>>>>> the >>>>>> lower grow cost of FasterSets. (Lower gains if Set is presized, >>>>>> and if >>>>>> hash-calculation is bigger part of cost) >>>>>> >>>>>> Note: If you want to stay ANSI-compatible, one tricky point in >>>>>> such an >>>>>> optimization is adding all objects in collection up to a nil >>>>>> element, >>>>>> then raising an error. (ie: must produce equivalent results to >>>>>> add: >>>>>> performed of each of collections elements in sequence ) >>>>>> >>>>>> Cheers, >>>>>> Henry >>>>>> >>>>>> >>>>>> On Jun 25, 2009, at 8:55 03PM, Stéphane Ducasse wrote: >>>>>> >>>>>> >>>>>> >>>>>> >>>>>>> I would really like to assess FasterSets for Pharo1.1 >>>>>>> >>>>>>> Begin forwarded message: >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>>> From: Ralph Boland <[hidden email]> >>>>>>>> Date: June 25, 2009 8:30:13 PM CEDT >>>>>>>> To: Stéphane Ducasse <[hidden email]> >>>>>>>> Subject: Re: about fasterTests >>>>>>>> >>>>>>>> 2009/6/25 Stéphane Ducasse <[hidden email]>: >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>>> we are interest to get better libraries in pharo. >>>>>>>>> If you want to join and help making sure that your code in >>>>>>>>> integrated in >>>>>>>>> pharo1.1 >>>>>>>>> please say it and may be join the pharo mailing-list. >>>>>>>>> >>>>>>>>> Stef >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>> Yes, I would like my code 'FasterSets' added to Pharo. >>>>>>>> I joined the pharo mailing list. >>>>>>>> Today I will mail you my MIT license release. >>>>>>>> Probably will take a couple of weeks (From Calgary, Canada) >>>>>>>> to get >>>>>>>> there. >>>>>>>> Once I am added as a contributer I will release my code to >>>>>>>> Pharo >>>>>>>> for >>>>>>>> verification. >>>>>>>> In the meantime if any Squeakers try out my code and report >>>>>>>> bugs I >>>>>>>> will fix them >>>>>>>> before releasing to Pharo. >>>>>>>> >>>>>>>> Some comments: >>>>>>>> >>>>>>>> 1) Since FasterSets modifies low level methods in class Set >>>>>>>> and >>>>>>>> its >>>>>>>> subclasses it >>>>>>>> may be incompatible with packages that create subclasses of Set >>>>>>>> and then >>>>>>>> override low level methods. If someone could send me a list of >>>>>>>> package add-ons >>>>>>>> to Pharo that do this then I can download them and make any >>>>>>>> modifications >>>>>>>> necessary for them to run with FasterSets and make these >>>>>>>> changes >>>>>>>> part of my >>>>>>>> release. Note that it is no more difficult to subclass from >>>>>>>> Set >>>>>>>> with FasterSets >>>>>>>> than without; it is just different and only different if you >>>>>>>> do >>>>>>>> low level things. >>>>>>>> >>>>>>>> 2) FasterSets uses a method 'noCompareOrGrowAdd:' which adds >>>>>>>> an element to a set without doing compares or growing the set. >>>>>>>> Needless to say it is a private method. >>>>>>>> I also have a public method 'nastyAddNew:' which, like >>>>>>>> 'noCompareOrGrowAdd:' adds an element to a set without >>>>>>>> checking >>>>>>>> if the >>>>>>>> element is already there but does do a grow if needed. >>>>>>>> This is useful (i.e. faster) in situations where you KNOW the >>>>>>>> object >>>>>>>> you are adding to a set is not there already. >>>>>>>> However if the object IS already there then you now have two >>>>>>>> of them in your set; i.e. you now have a subtle bug. >>>>>>>> I use 'nastyAddNew:' in my code but never made it a part of >>>>>>>> FasterSets >>>>>>>> because it is questionable whether or not such a method >>>>>>>> should be >>>>>>>> available generally. However, if I can decide I like the >>>>>>>> 'nastyAddNew:' >>>>>>>> method and am willing to use it despite its risks so can others >>>>>>>> so there >>>>>>>> is an argument for it being added. >>>>>>>> While my expectation is that you don't want 'nastyAddNew:' in >>>>>>>> Pharo >>>>>>>> I thought you should at least be aware of it. >>>>>>>> >>>>>>>> Regards, >>>>>>>> >>>>>>>> Ralph Boland >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>> _______________________________________________ >>>>>>> Pharo-project mailing list >>>>>>> [hidden email] >>>>>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>> _______________________________________________ >>>>>> Pharo-project mailing list >>>>>> [hidden email] >>>>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >>>>>> >>>>>> >>>>>> >>>>> _______________________________________________ >>>>> Pharo-project mailing list >>>>> [hidden email] >>>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo- >>>>> project >>>>> . >>>>> >>>>> >>>>> >>>>> >>>> _______________________________________________ >>>> Pharo-project mailing list >>>> [hidden email] >>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >>>> >>>> >>>> >>> . >>> >>> >>> >> >> _______________________________________________ >> Pharo-project mailing list >> [hidden email] >> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >> . >> >> > > _______________________________________________ > Pharo-project mailing list > [hidden email] > http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project > _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
However, handling the growth failure wouldn't crash the program...
Henrik Johansen wrote: > It would. > I'm not sure I consider the current behaviour of spending 12 minutes > adding the same element so much better, though. > > Cheers, > Henry > > On Jul 2, 2009, at 12:17 14PM, Andres Valloud wrote: > > >> Also, what happens in these cases? >> >> | set | >> set := Set new. >> bag := Bag new. >> bad add: 4 withOccurrences: 1000000000 "or equivalent code". >> set addAll: bag >> >> I am guessing the set will fail to grow and addAll: will fail, even >> though it should work. Maybe it won't happen with bags, or perhaps >> the >> example needs some massaging. Is the failure to grow handled >> gracefully? Note that I have not looked at the code because I have >> not >> had much chance to do that yet. >> >> Andres Valloud wrote: >> >>> Hmmm, I don't know though, usually the usage pattern with hashed >>> collections has been to presize explicitly and then add to them... >>> what >>> kind of use cases are you dealing with? >>> >>> Henrik Johansen wrote: >>> >>> >>>> Having spent a fair deal of time (in VisualWorks) worrying about my >>>> intial Set sizes, writing functions to precompute sane sizes for >>>> arbitrary inputs and adding changeCapacityTo: calls before addAll: >>>> operations due to performance issues on a case-by-case basis >>>> (Squeak/ >>>> Pharo doesn't even HAVE a changeCapacityTo: ), I have to strongly >>>> disagree. >>>> >>>> I'd rather spend my time thinking about my problem domain, and know >>>> the Set implementation handles the nitty-gritty details in a >>>> sufficiently smart way. >>>> >>>> Whether that means growing once to accomedate the worst case, then >>>> shrinking afterwards, or doing a grow check every X added items, in >>>> the unlikely event that what you're adding is a 3 million element >>>> Array with 10 unique objects, I don't really care. >>>> >>>> Cheers, >>>> Henry >>>> >>>> On Jul 2, 2009, at 7:38 10AM, Andres Valloud wrote: >>>> >>>> >>>> >>>> >>>>> (Set new: desiredCapacity) addAll: "consecutive integers" >>>>> >>>>> may be a better compromise. >>>>> >>>>> Andres. >>>>> >>>>> >>>>> Stéphane Ducasse wrote: >>>>> >>>>> >>>>> >>>>>> Thanks this is really good. >>>>>> We will probably integrate these changes really fast >>>>>> >>>>>> On Jul 1, 2009, at 1:03 PM, Henrik Johansen wrote: >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>>> I took some time to check this yesterday, my comments on the code >>>>>>> itself are basically the same as what Lucas posted in the issue >>>>>>> tracker. >>>>>>> Works as advertised, but should probably clean out the "gunk" >>>>>>> which is >>>>>>> unused after it's integrated. >>>>>>> >>>>>>> Some feedback on possible improvements: >>>>>>> - What it does, is basically speed up the adds to the new array >>>>>>> in a >>>>>>> grow that might happen Why not take this all to the logical >>>>>>> conclusion, and instead optimize an addAll: method to be used in >>>>>>> grow, >>>>>>> that assumes elements are not in set (and themselves are not >>>>>>> duplicates)? F.ex. it could also add to tally just once, >>>>>>> instead of >>>>>>> one increase per element as the noCompareAdd: does. (an add >>>>>>> method >>>>>>> with no tally add, no = object check, and no growCheck). >>>>>>> >>>>>>> - The usual addAll: method can be sped up in a similar way, by >>>>>>> performing one grow operation before adding elements if necessary >>>>>>> (pessimistically assuming all elements are unique), then use an >>>>>>> add >>>>>>> method which does not check for grow need at all. (an add method >>>>>>> with >>>>>>> tally add and = object check, but no growCheck) >>>>>>> If desirable, you could also shrink the capacity after adds are >>>>>>> done, to match the real amount of new elements. >>>>>>> >>>>>>> Adding (1 to: 1200000) to a new Set with such an addAll: method >>>>>>> took >>>>>>> 1/2 to 1/3rd the time of the regular addAll: method, even with >>>>>>> the >>>>>>> lower grow cost of FasterSets. (Lower gains if Set is presized, >>>>>>> and if >>>>>>> hash-calculation is bigger part of cost) >>>>>>> >>>>>>> Note: If you want to stay ANSI-compatible, one tricky point in >>>>>>> such an >>>>>>> optimization is adding all objects in collection up to a nil >>>>>>> element, >>>>>>> then raising an error. (ie: must produce equivalent results to >>>>>>> add: >>>>>>> performed of each of collections elements in sequence ) >>>>>>> >>>>>>> Cheers, >>>>>>> Henry >>>>>>> >>>>>>> >>>>>>> On Jun 25, 2009, at 8:55 03PM, Stéphane Ducasse wrote: >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>>> I would really like to assess FasterSets for Pharo1.1 >>>>>>>> >>>>>>>> Begin forwarded message: >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>>> From: Ralph Boland <[hidden email]> >>>>>>>>> Date: June 25, 2009 8:30:13 PM CEDT >>>>>>>>> To: Stéphane Ducasse <[hidden email]> >>>>>>>>> Subject: Re: about fasterTests >>>>>>>>> >>>>>>>>> 2009/6/25 Stéphane Ducasse <[hidden email]>: >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>>> we are interest to get better libraries in pharo. >>>>>>>>>> If you want to join and help making sure that your code in >>>>>>>>>> integrated in >>>>>>>>>> pharo1.1 >>>>>>>>>> please say it and may be join the pharo mailing-list. >>>>>>>>>> >>>>>>>>>> Stef >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>> Yes, I would like my code 'FasterSets' added to Pharo. >>>>>>>>> I joined the pharo mailing list. >>>>>>>>> Today I will mail you my MIT license release. >>>>>>>>> Probably will take a couple of weeks (From Calgary, Canada) >>>>>>>>> to get >>>>>>>>> there. >>>>>>>>> Once I am added as a contributer I will release my code to >>>>>>>>> Pharo >>>>>>>>> for >>>>>>>>> verification. >>>>>>>>> In the meantime if any Squeakers try out my code and report >>>>>>>>> bugs I >>>>>>>>> will fix them >>>>>>>>> before releasing to Pharo. >>>>>>>>> >>>>>>>>> Some comments: >>>>>>>>> >>>>>>>>> 1) Since FasterSets modifies low level methods in class Set >>>>>>>>> and >>>>>>>>> its >>>>>>>>> subclasses it >>>>>>>>> may be incompatible with packages that create subclasses of Set >>>>>>>>> and then >>>>>>>>> override low level methods. If someone could send me a list of >>>>>>>>> package add-ons >>>>>>>>> to Pharo that do this then I can download them and make any >>>>>>>>> modifications >>>>>>>>> necessary for them to run with FasterSets and make these >>>>>>>>> changes >>>>>>>>> part of my >>>>>>>>> release. Note that it is no more difficult to subclass from >>>>>>>>> Set >>>>>>>>> with FasterSets >>>>>>>>> than without; it is just different and only different if you >>>>>>>>> do >>>>>>>>> low level things. >>>>>>>>> >>>>>>>>> 2) FasterSets uses a method 'noCompareOrGrowAdd:' which adds >>>>>>>>> an element to a set without doing compares or growing the set. >>>>>>>>> Needless to say it is a private method. >>>>>>>>> I also have a public method 'nastyAddNew:' which, like >>>>>>>>> 'noCompareOrGrowAdd:' adds an element to a set without >>>>>>>>> checking >>>>>>>>> if the >>>>>>>>> element is already there but does do a grow if needed. >>>>>>>>> This is useful (i.e. faster) in situations where you KNOW the >>>>>>>>> object >>>>>>>>> you are adding to a set is not there already. >>>>>>>>> However if the object IS already there then you now have two >>>>>>>>> of them in your set; i.e. you now have a subtle bug. >>>>>>>>> I use 'nastyAddNew:' in my code but never made it a part of >>>>>>>>> FasterSets >>>>>>>>> because it is questionable whether or not such a method >>>>>>>>> should be >>>>>>>>> available generally. However, if I can decide I like the >>>>>>>>> 'nastyAddNew:' >>>>>>>>> method and am willing to use it despite its risks so can others >>>>>>>>> so there >>>>>>>>> is an argument for it being added. >>>>>>>>> While my expectation is that you don't want 'nastyAddNew:' in >>>>>>>>> Pharo >>>>>>>>> I thought you should at least be aware of it. >>>>>>>>> >>>>>>>>> Regards, >>>>>>>>> >>>>>>>>> Ralph Boland >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>> _______________________________________________ >>>>>>>> Pharo-project mailing list >>>>>>>> [hidden email] >>>>>>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>> _______________________________________________ >>>>>>> Pharo-project mailing list >>>>>>> [hidden email] >>>>>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>> _______________________________________________ >>>>>> Pharo-project mailing list >>>>>> [hidden email] >>>>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo- >>>>>> project >>>>>> . >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>> _______________________________________________ >>>>> Pharo-project mailing list >>>>> [hidden email] >>>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >>>>> >>>>> >>>>> >>>>> >>>> . >>>> >>>> >>>> >>>> >>> _______________________________________________ >>> Pharo-project mailing list >>> [hidden email] >>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >>> . >>> >>> >>> >> _______________________________________________ >> Pharo-project mailing list >> [hidden email] >> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >> >> > > . > > _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
I submit for your consideration the use case where you want to
initialize a Set with elements from multiple sources. How do you do that today? Well, you could write set := (Set withAll: source1) addAll: source2; addAll: source3... etc, which will surely cause quite some grows. (and also crashes if source1 has more elements than implementation limits) A few other options: set := Set new: (sources inject: 0 into: [:sub :next | sub + next size]). sources do: [:each | set addAll: each ] or: tmpWithAllToAdd := sources inject: OrderedCollection new into: [:sub : next | sub addAll: next; yourself]. set := Set withAll: tmpAllToAdd which doesn't perform excessive grows, but still would crash if total size of sources is bigger than implementation limits. So how are these better than modifying addAll: so you can write set := Set new. sources do: [:each | set addAll: each] and not worry too much how much your set spends on growing/checking if it needs to grow? You could always min: with current implementation limit of Set size if you think it's likely that's a case you'll encounter... And my point was really, I'd rather have an error telling me "Hey, you're trying to do something EXTREME" (like, trying to add (in VW) more than 2^28 (2^56 on 64bit) elements to a set) than have my program stop for a large amount of time, then finishing "correctly"... Cheers, Henry On Jul 2, 2009, at 4:16 50PM, Andres Valloud wrote: > However, handling the growth failure wouldn't crash the program... > > Henrik Johansen wrote: >> It would. >> I'm not sure I consider the current behaviour of spending 12 minutes >> adding the same element so much better, though. >> >> Cheers, >> Henry >> >> On Jul 2, 2009, at 12:17 14PM, Andres Valloud wrote: >> >> >>> Also, what happens in these cases? >>> >>> | set | >>> set := Set new. >>> bag := Bag new. >>> bad add: 4 withOccurrences: 1000000000 "or equivalent code". >>> set addAll: bag >>> >>> I am guessing the set will fail to grow and addAll: will fail, even >>> though it should work. Maybe it won't happen with bags, or perhaps >>> the >>> example needs some massaging. Is the failure to grow handled >>> gracefully? Note that I have not looked at the code because I have >>> not >>> had much chance to do that yet. >>> >>> Andres Valloud wrote: >>> >>>> Hmmm, I don't know though, usually the usage pattern with hashed >>>> collections has been to presize explicitly and then add to them... >>>> what >>>> kind of use cases are you dealing with? >>>> >>>> Henrik Johansen wrote: >>>> >>>> >>>>> Having spent a fair deal of time (in VisualWorks) worrying about >>>>> my >>>>> intial Set sizes, writing functions to precompute sane sizes for >>>>> arbitrary inputs and adding changeCapacityTo: calls before addAll: >>>>> operations due to performance issues on a case-by-case basis >>>>> (Squeak/ >>>>> Pharo doesn't even HAVE a changeCapacityTo: ), I have to strongly >>>>> disagree. >>>>> >>>>> I'd rather spend my time thinking about my problem domain, and >>>>> know >>>>> the Set implementation handles the nitty-gritty details in a >>>>> sufficiently smart way. >>>>> >>>>> Whether that means growing once to accomedate the worst case, then >>>>> shrinking afterwards, or doing a grow check every X added items, >>>>> in >>>>> the unlikely event that what you're adding is a 3 million element >>>>> Array with 10 unique objects, I don't really care. >>>>> >>>>> Cheers, >>>>> Henry >>>>> >>>>> On Jul 2, 2009, at 7:38 10AM, Andres Valloud wrote: >>>>> >>>>> >>>>> >>>>> >>>>>> (Set new: desiredCapacity) addAll: "consecutive integers" >>>>>> >>>>>> may be a better compromise. >>>>>> >>>>>> Andres. >>>>>> >>>>>> >>>>>> Stéphane Ducasse wrote: >>>>>> >>>>>> >>>>>> >>>>>>> Thanks this is really good. >>>>>>> We will probably integrate these changes really fast >>>>>>> >>>>>>> On Jul 1, 2009, at 1:03 PM, Henrik Johansen wrote: >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>>> I took some time to check this yesterday, my comments on the >>>>>>>> code >>>>>>>> itself are basically the same as what Lucas posted in the issue >>>>>>>> tracker. >>>>>>>> Works as advertised, but should probably clean out the "gunk" >>>>>>>> which is >>>>>>>> unused after it's integrated. >>>>>>>> >>>>>>>> Some feedback on possible improvements: >>>>>>>> - What it does, is basically speed up the adds to the new array >>>>>>>> in a >>>>>>>> grow that might happen Why not take this all to the logical >>>>>>>> conclusion, and instead optimize an addAll: method to be used >>>>>>>> in >>>>>>>> grow, >>>>>>>> that assumes elements are not in set (and themselves are not >>>>>>>> duplicates)? F.ex. it could also add to tally just once, >>>>>>>> instead of >>>>>>>> one increase per element as the noCompareAdd: does. (an add >>>>>>>> method >>>>>>>> with no tally add, no = object check, and no growCheck). >>>>>>>> >>>>>>>> - The usual addAll: method can be sped up in a similar way, by >>>>>>>> performing one grow operation before adding elements if >>>>>>>> necessary >>>>>>>> (pessimistically assuming all elements are unique), then use an >>>>>>>> add >>>>>>>> method which does not check for grow need at all. (an add >>>>>>>> method >>>>>>>> with >>>>>>>> tally add and = object check, but no growCheck) >>>>>>>> If desirable, you could also shrink the capacity after adds are >>>>>>>> done, to match the real amount of new elements. >>>>>>>> >>>>>>>> Adding (1 to: 1200000) to a new Set with such an addAll: method >>>>>>>> took >>>>>>>> 1/2 to 1/3rd the time of the regular addAll: method, even with >>>>>>>> the >>>>>>>> lower grow cost of FasterSets. (Lower gains if Set is presized, >>>>>>>> and if >>>>>>>> hash-calculation is bigger part of cost) >>>>>>>> >>>>>>>> Note: If you want to stay ANSI-compatible, one tricky point in >>>>>>>> such an >>>>>>>> optimization is adding all objects in collection up to a nil >>>>>>>> element, >>>>>>>> then raising an error. (ie: must produce equivalent results to >>>>>>>> add: >>>>>>>> performed of each of collections elements in sequence ) >>>>>>>> >>>>>>>> Cheers, >>>>>>>> Henry >>>>>>>> >>>>>>>> >>>>>>>> On Jun 25, 2009, at 8:55 03PM, Stéphane Ducasse wrote: >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>>> I would really like to assess FasterSets for Pharo1.1 >>>>>>>>> >>>>>>>>> Begin forwarded message: >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>>> From: Ralph Boland <[hidden email]> >>>>>>>>>> Date: June 25, 2009 8:30:13 PM CEDT >>>>>>>>>> To: Stéphane Ducasse <[hidden email]> >>>>>>>>>> Subject: Re: about fasterTests >>>>>>>>>> >>>>>>>>>> 2009/6/25 Stéphane Ducasse <[hidden email]>: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> we are interest to get better libraries in pharo. >>>>>>>>>>> If you want to join and help making sure that your code in >>>>>>>>>>> integrated in >>>>>>>>>>> pharo1.1 >>>>>>>>>>> please say it and may be join the pharo mailing-list. >>>>>>>>>>> >>>>>>>>>>> Stef >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> Yes, I would like my code 'FasterSets' added to Pharo. >>>>>>>>>> I joined the pharo mailing list. >>>>>>>>>> Today I will mail you my MIT license release. >>>>>>>>>> Probably will take a couple of weeks (From Calgary, Canada) >>>>>>>>>> to get >>>>>>>>>> there. >>>>>>>>>> Once I am added as a contributer I will release my code to >>>>>>>>>> Pharo >>>>>>>>>> for >>>>>>>>>> verification. >>>>>>>>>> In the meantime if any Squeakers try out my code and report >>>>>>>>>> bugs I >>>>>>>>>> will fix them >>>>>>>>>> before releasing to Pharo. >>>>>>>>>> >>>>>>>>>> Some comments: >>>>>>>>>> >>>>>>>>>> 1) Since FasterSets modifies low level methods in class Set >>>>>>>>>> and >>>>>>>>>> its >>>>>>>>>> subclasses it >>>>>>>>>> may be incompatible with packages that create subclasses of >>>>>>>>>> Set >>>>>>>>>> and then >>>>>>>>>> override low level methods. If someone could send me a >>>>>>>>>> list of >>>>>>>>>> package add-ons >>>>>>>>>> to Pharo that do this then I can download them and make any >>>>>>>>>> modifications >>>>>>>>>> necessary for them to run with FasterSets and make these >>>>>>>>>> changes >>>>>>>>>> part of my >>>>>>>>>> release. Note that it is no more difficult to subclass from >>>>>>>>>> Set >>>>>>>>>> with FasterSets >>>>>>>>>> than without; it is just different and only different if you >>>>>>>>>> do >>>>>>>>>> low level things. >>>>>>>>>> >>>>>>>>>> 2) FasterSets uses a method 'noCompareOrGrowAdd:' which >>>>>>>>>> adds >>>>>>>>>> an element to a set without doing compares or growing the >>>>>>>>>> set. >>>>>>>>>> Needless to say it is a private method. >>>>>>>>>> I also have a public method 'nastyAddNew:' which, like >>>>>>>>>> 'noCompareOrGrowAdd:' adds an element to a set without >>>>>>>>>> checking >>>>>>>>>> if the >>>>>>>>>> element is already there but does do a grow if needed. >>>>>>>>>> This is useful (i.e. faster) in situations where you KNOW the >>>>>>>>>> object >>>>>>>>>> you are adding to a set is not there already. >>>>>>>>>> However if the object IS already there then you now have two >>>>>>>>>> of them in your set; i.e. you now have a subtle bug. >>>>>>>>>> I use 'nastyAddNew:' in my code but never made it a part of >>>>>>>>>> FasterSets >>>>>>>>>> because it is questionable whether or not such a method >>>>>>>>>> should be >>>>>>>>>> available generally. However, if I can decide I like the >>>>>>>>>> 'nastyAddNew:' >>>>>>>>>> method and am willing to use it despite its risks so can >>>>>>>>>> others >>>>>>>>>> so there >>>>>>>>>> is an argument for it being added. >>>>>>>>>> While my expectation is that you don't want >>>>>>>>>> 'nastyAddNew:' in >>>>>>>>>> Pharo >>>>>>>>>> I thought you should at least be aware of it. >>>>>>>>>> >>>>>>>>>> Regards, >>>>>>>>>> >>>>>>>>>> Ralph Boland >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>> _______________________________________________ >>>>>>>>> Pharo-project mailing list >>>>>>>>> [hidden email] >>>>>>>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>> _______________________________________________ >>>>>>>> Pharo-project mailing list >>>>>>>> [hidden email] >>>>>>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>> _______________________________________________ >>>>>>> Pharo-project mailing list >>>>>>> [hidden email] >>>>>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo- >>>>>>> project >>>>>>> . >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>> _______________________________________________ >>>>>> Pharo-project mailing list >>>>>> [hidden email] >>>>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >>>>>> >>>>>> >>>>>> >>>>>> >>>>> . >>>>> >>>>> >>>>> >>>>> >>>> _______________________________________________ >>>> Pharo-project mailing list >>>> [hidden email] >>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >>>> . >>>> >>>> >>>> >>> _______________________________________________ >>> Pharo-project mailing list >>> [hidden email] >>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project >>> >>> >> >> . >> >> > > _______________________________________________ > Pharo-project mailing list > [hidden email] > http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project > _______________________________________________ Pharo-project mailing list [hidden email] http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project |
Free forum by Nabble | Edit this page |