When we don't find method, we reinvent the wheel. Ok, so this is what
happened with my quest on rounding float. So here is another example. I needed a method subsets ... to have all subsets of a Set... #(1 2 ) asSet subsets " -> a Set(a Set(1 2) a Set(1) a Set(2) a Set()) " Is there already something ? I couldn't find so here is what I did ? probably hackish (because of the binary mask) so feel free to comment ;) Set>>subsets | subsetsSize subsets workArray | workArray := self asArray. subsetsSize := 2 raisedTo: self size. subsets := OrderedCollection new. 1 to: subsetsSize do: [:ea | subsets add: ((workArray masquedBy: (ea printStringBase: 2)) asSet)]. "masque par une conversion binaire" ^subsets asSet "could be an array of sets" ArrayedCollection>>masquedBy: aBitString | result entry bitString | entry := self reverse. bitString := aBitString reverse. result := OrderedCollection new. 1 to: (self size) do: [:ea | ((bitString at: ea ifAbsent: []) = $1) ifTrue: [result add: (entry at: ea)]]. ^result reverse Cédrick _______________________________________________ Beginners mailing list [hidden email] http://lists.squeakfoundation.org/mailman/listinfo/beginners |
Googling "smalltalk powerset" turns this up: http://www.foldr.org/~michaelw/log/2005/09/
On Wed, Jul 23, 2008 at 11:51 AM, cdrick <[hidden email]> wrote: When we don't find method, we reinvent the wheel. Ok, so this is what _______________________________________________ Beginners mailing list [hidden email] http://lists.squeakfoundation.org/mailman/listinfo/beginners |
> Googling "smalltalk powerset" turns this up:
> http://www.foldr.org/~michaelw/log/2005/09/ > Thanks Marcin, interesting read :) This time, I had the wrong keyword :( _______________________________________________ Beginners mailing list [hidden email] http://lists.squeakfoundation.org/mailman/listinfo/beginners |
In reply to this post by cedreek
Couldn't find anything obvious but remembered #permutationsDo: which may
help. Not sure if these does what you want, but maybe it can be adapted? Set>>subsets | subsets | subsets := Set with: self. self asArray permutationsDo: [:e | subsets add: e asSet]. self do: [:e | subsets addAll: (self copyWithout: e) subsets]. ^ subsets #(1 2) asSet subsets. "a Set(a Set(1 2) a Set() a Set(2) a Set(1))" #(1 2 3) asSet subsets. "a Set(a Set(1 2) a Set() a Set(2) a Set(1 2 3) a Set(2 3) a Set(3) a Set(1 3) a Set(1))" Z. cdrick wrote: > When we don't find method, we reinvent the wheel. Ok, so this is what > happened with my quest on rounding float. So here is another example. > I needed a method subsets ... to have all subsets of a Set... > > #(1 2 ) asSet subsets " -> a Set(a Set(1 2) a Set(1) a Set(2) a Set()) " > > Is there already something ? I couldn't find so here is what I did ? > probably hackish (because of the binary mask) so feel free to comment > ;) > > Set>>subsets > | subsetsSize subsets workArray | > workArray := self asArray. > subsetsSize := 2 raisedTo: self size. > subsets := OrderedCollection new. > 1 to: subsetsSize do: [:ea | > subsets add: ((workArray masquedBy: (ea printStringBase: 2)) > asSet)]. "masque par une conversion binaire" > ^subsets asSet "could be an array of sets" > > ArrayedCollection>>masquedBy: aBitString > | result entry bitString | > entry := self reverse. > bitString := aBitString reverse. > result := OrderedCollection new. > 1 to: (self size) do: [:ea | > ((bitString at: ea ifAbsent: []) = $1) ifTrue: [result add: (entry > at: ea)]]. > ^result reverse > > Cédrick > > > ------------------------------------------------------------------------ > > _______________________________________________ > Beginners mailing list > [hidden email] > http://lists.squeakfoundation.org/mailman/listinfo/beginners _______________________________________________ Beginners mailing list [hidden email] http://lists.squeakfoundation.org/mailman/listinfo/beginners |
>
> Set>>subsets > | subsets | > subsets := Set with: self. > self asArray permutationsDo: [:e | subsets add: e asSet]. > self do: [:e | subsets addAll: (self copyWithout: e) subsets]. > ^ subsets nice too and frustrating how you got it quick :) I tried a recursive method too first but found the byte ressemblance so... My only consolation is that the recursive solution (subsets2) is slower and hardly work for Set with more than 10 elements. set := #(1 2 3 4 5) asSet. [ set subsets ] timeToRun ." 1" [ set subsets2 ] timeToRun. " 8" set := #(1 2 3 4 5 6 7) asSet. [ set subsets ] timeToRun ." 5" [ set subsets2 ] timeToRun. " 233" set := #(1 2 3 4 5 6 7 8 ) asSet. [ set subsets ] timeToRun . " 11" [ set subsets2 ] timeToRun. " 1683" set := (1 to: 10) asSet. [ set subsets ] timeToRun . " 46" set := (1 to: 15) asSet. [ set subsets ] timeToRun ." 2484" set := (1 to: 20) asSet. [ set subsets ] timeToRun . "559953" "but here the result has (2 raisedTo: 20) 1 048 576 Sets :)" set := (1 to: 50) asSet. [ set subsets ] timeToRun . "I got a "space is low wow" :) I have to go, That was fun :) See you Cédrick _______________________________________________ Beginners mailing list [hidden email] http://lists.squeakfoundation.org/mailman/listinfo/beginners |
You have one messsage #combinations:atATimeDo: to enumerate partitions,
but in SequenceableCollection, not in Set: | collec subset | collec := 1 to: 4. subset := OrderedCollection new: (2 raisedTo: collec size). subset add: #(). 1 to: collec size do: [:subSize | collec combinations: subSize atATimeDo: [:subArray | subset add: subArray copy]]. subset Mind the copy, because the iteration recycle same subArray for economy. cdrick a écrit : >> Set>>subsets >> | subsets | >> subsets := Set with: self. >> self asArray permutationsDo: [:e | subsets add: e asSet]. >> self do: [:e | subsets addAll: (self copyWithout: e) subsets]. >> ^ subsets > > nice too and frustrating how you got it quick :) > > I tried a recursive method too first but found the byte ressemblance so... > My only consolation is that the recursive solution (subsets2) is > slower and hardly work for Set with more than 10 elements. > > set := #(1 2 3 4 5) asSet. > [ set subsets ] timeToRun ." 1" > [ set subsets2 ] timeToRun. " 8" > > set := #(1 2 3 4 5 6 7) asSet. > [ set subsets ] timeToRun ." 5" > [ set subsets2 ] timeToRun. " 233" > > set := #(1 2 3 4 5 6 7 8 ) asSet. > [ set subsets ] timeToRun . " 11" > [ set subsets2 ] timeToRun. " 1683" > > set := (1 to: 10) asSet. > [ set subsets ] timeToRun . " 46" > > set := (1 to: 15) asSet. > [ set subsets ] timeToRun ." 2484" > > set := (1 to: 20) asSet. > [ set subsets ] timeToRun . "559953" "but here the result has (2 > raisedTo: 20) 1 048 576 Sets :)" > > set := (1 to: 50) asSet. > [ set subsets ] timeToRun . "I got a "space is low wow" :) > > I have to go, > That was fun :) > > See you > > Cédrick > > > ------------------------------------------------------------------------ > > _______________________________________________ > Beginners mailing list > [hidden email] > http://lists.squeakfoundation.org/mailman/listinfo/beginners _______________________________________________ Beginners mailing list [hidden email] http://lists.squeakfoundation.org/mailman/listinfo/beginners |
In reply to this post by cedreek
Actually, we don't need #permutationsDo: as we're only looking for
combinations. With a few optimisations: combinations ^ self combinationsInto: (Set new: (2 raisedTo: self size)) combinationsInto: aSet (aSet includes: self) ifTrue: [^ self]. "optimisation" aSet add: self. self do: [:each | (self copyWithout: each) combinationsInto: aSet]. ^ aSet [(1 to: 8) asSet combinations] timeToRun "215" [(1 to: 9) asSet combinations] timeToRun "918" [(1 to: 10) asSet combinations] timeToRun "3989" [(1 to: 11) asSet combinations] timeToRun "16349" [(1 to: 12) asSet combinations] timeToRun "68780" So a little better, but I was expecting much more. What's worrying is this: (1 to: 10) asSet combinations size "1024" ((1 to: 10) asSet combinations collect: [:e | e hash]) asSet size "16" So, 1024 distinct Sets have only 16 distinct hashes between them? That seems pretty bad. It would probably be possible to get a little bit more out of the routine with a better (more appropriate) hash function. Even so, I don't think it will ever be as fast as your method but will happily be proved wrong! :) Z. cdrick wrote: >> Set>>subsets >> | subsets | >> subsets := Set with: self. >> self asArray permutationsDo: [:e | subsets add: e asSet]. >> self do: [:e | subsets addAll: (self copyWithout: e) subsets]. >> ^ subsets > > nice too and frustrating how you got it quick :) > > I tried a recursive method too first but found the byte ressemblance so... > My only consolation is that the recursive solution (subsets2) is > slower and hardly work for Set with more than 10 elements. > > set := #(1 2 3 4 5) asSet. > [ set subsets ] timeToRun ." 1" > [ set subsets2 ] timeToRun. " 8" > > set := #(1 2 3 4 5 6 7) asSet. > [ set subsets ] timeToRun ." 5" > [ set subsets2 ] timeToRun. " 233" > > set := #(1 2 3 4 5 6 7 8 ) asSet. > [ set subsets ] timeToRun . " 11" > [ set subsets2 ] timeToRun. " 1683" > > set := (1 to: 10) asSet. > [ set subsets ] timeToRun . " 46" > > set := (1 to: 15) asSet. > [ set subsets ] timeToRun ." 2484" > > set := (1 to: 20) asSet. > [ set subsets ] timeToRun . "559953" "but here the result has (2 > raisedTo: 20) 1 048 576 Sets :)" > > set := (1 to: 50) asSet. > [ set subsets ] timeToRun . "I got a "space is low wow" :) > > I have to go, > That was fun :) > > See you > > Cédrick > > > ------------------------------------------------------------------------ > > _______________________________________________ > Beginners mailing list > [hidden email] > http://lists.squeakfoundation.org/mailman/listinfo/beginners _______________________________________________ Beginners mailing list [hidden email] http://lists.squeakfoundation.org/mailman/listinfo/beginners |
In reply to this post by Nicolas Cellier-3
Nicolas, your initials fits you ;)
nice and twice quicker as the bit-sift one. > You have one messsage #combinations:atATimeDo: to enumerate partitions, but > in SequenceableCollection, not in Set: > > | collec subset | > collec := 1 to: 4. > subset := OrderedCollection new: (2 raisedTo: collec size). > subset add: #(). > 1 to: collec size do: [:subSize | > collec combinations: subSize atATimeDo: [:subArray | subset add: subArray > copy]]. > subset > > Mind the copy, because the iteration recycle same subArray for economy. > Cédrick _______________________________________________ Beginners mailing list [hidden email] http://lists.squeakfoundation.org/mailman/listinfo/beginners |
In reply to this post by Zulq Alam-2
On Thu, 24 Jul 2008 15:13:22 +0200, Zulq Alam wrote:
> Actually, we don't need #permutationsDo: as we're only looking for > combinations. Permutations? (n factorial)? v.s. (2 raisedTo: n)? > With a few optimisations: > > combinations > ^ self combinationsInto: (Set new: (2 raisedTo: self size)) > > combinationsInto: aSet > (aSet includes: self) ifTrue: [^ self]. "optimisation" > aSet add: self. > self do: > [:each | > (self copyWithout: each) combinationsInto: aSet]. > ^ aSet Yeah, that's a good one :) > [(1 to: 8) asSet combinations] timeToRun "215" > [(1 to: 9) asSet combinations] timeToRun "918" > [(1 to: 10) asSet combinations] timeToRun "3989" > [(1 to: 11) asSet combinations] timeToRun "16349" > [(1 to: 12) asSet combinations] timeToRun "68780" > > So a little better, but I was expecting much more. I think that the amount of work for (n + 1) is about (timeToRun of: n) * 3; should be observable better with larger n's ;) Squeak looks to be already fast in this case, your routine is almost optimal :) Implementing it in Set so that Set's internal can be of benefit won't bring these figures down much -- what remains is (aSet includes: another), times #copyWithout: loop, and that seems to be unavoidable. How about naming Collection>>#asPowerset ^ self asSet powersetInto: (Set new: (2 raisedTo: self size)) with Set>>#powersetInto: and putting that into the next release. /Klaus > What's worrying is this: > > (1 to: 10) asSet combinations size "1024" > ((1 to: 10) asSet combinations collect: [:e | e hash]) asSet size "16" > > So, 1024 distinct Sets have only 16 distinct hashes between them? That > seems pretty bad. It would probably be possible to get a little bit more > out of the routine with a better (more appropriate) hash function. Even > so, I don't think it will ever be as fast as your method but will > happily be proved wrong! :) > > Z. > > cdrick wrote: >>> Set>>subsets >>> | subsets | >>> subsets := Set with: self. >>> self asArray permutationsDo: [:e | subsets add: e asSet]. >>> self do: [:e | subsets addAll: (self copyWithout: e) subsets]. >>> ^ subsets >> nice too and frustrating how you got it quick :) >> I tried a recursive method too first but found the byte ressemblance >> so... >> My only consolation is that the recursive solution (subsets2) is >> slower and hardly work for Set with more than 10 elements. >> set := #(1 2 3 4 5) asSet. >> [ set subsets ] timeToRun ." 1" >> [ set subsets2 ] timeToRun. " 8" >> set := #(1 2 3 4 5 6 7) asSet. >> [ set subsets ] timeToRun ." 5" >> [ set subsets2 ] timeToRun. " 233" >> set := #(1 2 3 4 5 6 7 8 ) asSet. >> [ set subsets ] timeToRun . " 11" >> [ set subsets2 ] timeToRun. " 1683" >> set := (1 to: 10) asSet. >> [ set subsets ] timeToRun . " 46" >> set := (1 to: 15) asSet. >> [ set subsets ] timeToRun ." 2484" >> set := (1 to: 20) asSet. >> [ set subsets ] timeToRun . "559953" "but here the result has (2 >> raisedTo: 20) 1 048 576 Sets :)" >> set := (1 to: 50) asSet. >> [ set subsets ] timeToRun . "I got a "space is low wow" :) >> I have to go, >> That was fun :) >> See you >> Cédrick >> >> ------------------------------------------------------------------------ >> _______________________________________________ >> Beginners mailing list >> [hidden email] >> http://lists.squeakfoundation.org/mailman/listinfo/beginners _______________________________________________ Beginners mailing list [hidden email] http://lists.squeakfoundation.org/mailman/listinfo/beginners |
or withNicolas suggestion:
Set>>asPowerset | subset | subset := (OrderedCollection new: (2 raisedTo: self size)) add: Set new; yourself. 1 to: self size do: [:subSize | "don't copy anymore as there is the Set conversion" self asArray combinations: subSize atATimeDo: [:subArray | subset add: subArray asSet]]. ^ subset asSet I'd still prefer #combinationsSize:do: instead of #combinations:atATimeDo: > > How about naming > > Collection>>#asPowerset > ^ self asSet powersetInto: (Set new: (2 raisedTo: self size)) > > with Set>>#powersetInto: and putting that into the next release. > > /Klaus > Beginners mailing list [hidden email] http://lists.squeakfoundation.org/mailman/listinfo/beginners |
On Thu, 24 Jul 2008 18:27:48 +0200, Cedrick wrote:
> or withNicolas suggestion: > > Set>>asPowerset > | subset | > subset := (OrderedCollection new: (2 raisedTo: self size)) > add: Set new; > yourself. > 1 to: self size do: [:subSize | "don't copy anymore as there is the > Set conversion" > self asArray combinations: subSize atATimeDo: [:subArray | subset > add: subArray asSet]]. > ^ subset asSet Ah, you francophones always want to see Blaise Pascal's triangle at work ;) Didn't check #combinations:* implementation, does it avoid #includes: ? > I'd still prefer #combinationsSize:do: instead of > #combinations:atATimeDo: +1 but it looks alt-w friendly already (like #detectSum: looks ;) >> >> How about naming >> >> Collection>>#asPowerset >> ^ self asSet powersetInto: (Set new: (2 raisedTo: self size)) >> >> with Set>>#powersetInto: and putting that into the next release. >> >> /Klaus >> _______________________________________________ Beginners mailing list [hidden email] http://lists.squeakfoundation.org/mailman/listinfo/beginners |
>
> Didn't check #combinations:* implementation, does it avoid #includes: ? I think so. It calls #combinationsAt:in: after: do: wich is recursive. > >> I'd still prefer #combinationsSize:do: instead of >> #combinations:atATimeDo: > > +1 but it looks alt-w friendly already (like #detectSum: looks ;) > Funny to see it was probably the original name. See the comment of #combinationsAt:in: after: (last line) "Choose k of N items and put in aCollection. jj-1 already chosen. Indexes of items are in numerical order, to avoid the same combo being used twice. In this slot, we are allowed to use items in self indexed by nn+1 to self size. nn is the index used for position jj-1." "(1 to: 6) combinationsSize: 3 do: [:each | Transcript cr; show: each printString]" Cédrick >>> >>> How about naming >>> >>> Collection>>#asPowerset >>> ^ self asSet powersetInto: (Set new: (2 raisedTo: self size)) >>> >>> with Set>>#powersetInto: and putting that into the next release. >>> >>> /Klaus >>> > > > _______________________________________________ > Beginners mailing list > [hidden email] > http://lists.squeakfoundation.org/mailman/listinfo/beginners > _______________________________________________ Beginners mailing list [hidden email] http://lists.squeakfoundation.org/mailman/listinfo/beginners |
In reply to this post by Klaus D. Witzel
Klaus D. Witzel wrote:
> Squeak looks to be already fast in this case, your routine is almost > optimal :) Implementing it in Set so that Set's internal can be of > benefit won't bring these figures down much -- what remains is (aSet > includes: another), times #copyWithout: loop, and that seems to be > unavoidable. It's significantly faster with a more appropriate hashing function: {2->0 . 3->0 . 4->1 . 5->1 . 6->3 . 7->9 . 8->22 . 9->59 . 10->138 . 11->305 . 12->686 . 13->1640 . 14->4366 . 15->10550 . 16->28104 . 17->93373 . 18->303425} combinations | combinations | combinations := (PluggableSet new: (2 raisedTo: self size)) hashBlock: [:aSet | aSet inject: aSet size into: [:hash :each | hash * each hash + each]]; yourself. self combinationsInto: combinations. ^ combinations combinationsInto: aSet (aSet includes: self) ifTrue: [^ self]. aSet add: self. self do: [:each | (self copyWithout: each) combinationsInto: aSet]. I think the problem is that most of the sets contain similar data so the hashes (calculated with bitXor) tend towards to a small set of values. This means there are a lot of collisions and each #add: and #includes: is very expensive. > > How about naming > > Collection>>#asPowerset > ^ self asSet powersetInto: (Set new: (2 raisedTo: self size)) > > with Set>>#powersetInto: and putting that into the next release. > Even now, I'm still not clear on how one takes a change like this and drives it into a squeak release - Tests, Mantis, then what? Z. _______________________________________________ Beginners mailing list [hidden email] http://lists.squeakfoundation.org/mailman/listinfo/beginners |
On Thu, 24 Jul 2008 20:09:53 +0200, Zulq Alam wrote:
> Klaus D. Witzel wrote: > >> Squeak looks to be already fast in this case, your routine is almost >> optimal :) Implementing it in Set so that Set's internal can be of >> benefit won't bring these figures down much -- what remains is (aSet >> includes: another), times #copyWithout: loop, and that seems to be >> unavoidable. > > It's significantly faster with a more appropriate hashing function: > > {2->0 . 3->0 . 4->1 . 5->1 . 6->3 . 7->9 . 8->22 . 9->59 . 10->138 . > 11->305 . 12->686 . 13->1640 . 14->4366 . 15->10550 . 16->28104 . > 17->93373 . 18->303425} This is very good for small n :) > combinations > | combinations | > combinations := (PluggableSet new: (2 raisedTo: self size)) > hashBlock: > [:aSet | > aSet > inject: aSet size > into: [:hash :each | hash * each hash + each]]; > yourself. > self combinationsInto: combinations. > ^ combinations > > combinationsInto: aSet > (aSet includes: self) ifTrue: [^ self]. > aSet add: self. > self do: > [:each | > (self copyWithout: each) combinationsInto: aSet]. > > I think the problem is that most of the sets contain similar data so the > hashes (calculated with bitXor) tend towards to a small set of values. > This means there are a lot of collisions and each #add: and #includes: > is very expensive. > >> How about naming >> Collection>>#asPowerset >> ^ self asSet powersetInto: (Set new: (2 raisedTo: self size)) >> with Set>>#powersetInto: and putting that into the next release. >> > > Even now, I'm still not clear on how one takes a change like this and > drives it into a squeak release - Tests, Mantis, then what? Put it on Mantis as enhancement into the collection category. The rest follows automatically ;) I suggest to put something like "fast powerset method for collections" into the summary line. > Z. _______________________________________________ Beginners mailing list [hidden email] http://lists.squeakfoundation.org/mailman/listinfo/beginners |
Klaus D. Witzel <klaus.witzel <at> cobss.com> writes:
> > On Thu, 24 Jul 2008 20:09:53 +0200, Zulq Alam wrote: > > This is very good for small n :) > > > combinations > > | combinations | > > combinations := (PluggableSet new: (2 raisedTo: self size)) > > hashBlock: > > [:aSet | > > aSet > > inject: aSet size > > into: [:hash :each | hash * each hash + each]]; > > yourself. > > self combinationsInto: combinations. > > ^ combinations > > > > combinationsInto: aSet > > (aSet includes: self) ifTrue: [^ self]. > > aSet add: self. > > self do: > > [:each | > > (self copyWithout: each) combinationsInto: aSet]. > > > > Even now, I'm still not clear on how one takes a change like this and > > drives it into a squeak release - Tests, Mantis, then what? > > Put it on Mantis as enhancement into the collection category. The rest > follows automatically ;) I suggest to put something like "fast powerset > method for collections" into the summary line. > > > Z. > Zulq, the algorithm you are proposing is very simple but has major problems: 1) it is not efficient for large size n: it will do (n factorial) loops when only (2 raisedTo: n) are necessary (2 raisedTo: 10) -> 1024 (10 factorial) -> 3628800 2) each loop will result in a costly hash lookup. Your hashBlock involve LargeInteger of size (B raisedTo: p) where - B is the size of an elementary hash (presumably 4 bytes), - p the size of the partition (ranging from 0 to n) The cost of a single hash of size p involves p multiplications of growing size. Each multiplication of k digits has a cost k^2 (this is the naive algorithm currently programmed in LargeInteger plugin). Thus a single hash costs (B^2+(2*B)^2+...+(p*B)^2) Combine this with the (n factorial) loops and you will understand the problem. 3) the algorithm must store all partitions even if we don't want to collect but just to iterate on partitions. No offense, but you'd better not bother opening a mantis issue for this time. Cheers Nicolas _______________________________________________ Beginners mailing list [hidden email] http://lists.squeakfoundation.org/mailman/listinfo/beginners |
Yes.. Nico's solution is lot faster...
set := #(1 2 3 4 5) asSet. [ set subsets "mine"] timeToRun ." 1" [ set subsets2 "Zulk first" ] timeToRun. " 8" [ set subsets4 "Nicolas"] timeToRun. " 1" [ set subsets5 "Zulk"] timeToRun. " 2" set := (1 to: 10) asSet. [ set subsets ] timeToRun . " 46" [ set subsets4 ] timeToRun. " 18" [ set subsets5 ] timeToRun." 114" set := (1 to: 15) asSet. [ set subsets ] timeToRun ." 2484" [ set subsets4 ] timeToRun. " 1598" [ set subsets5 ] timeToRun." 12493" Nicolas won ;) So maybe we can add this one and rename the combinations:atATime: ? Cédrick 2008/7/25 Nicolas Cellier <[hidden email]>: > Klaus D. Witzel <klaus.witzel <at> cobss.com> writes: > >> >> On Thu, 24 Jul 2008 20:09:53 +0200, Zulq Alam wrote: >> >> This is very good for small n :) >> >> > combinations >> > | combinations | >> > combinations := (PluggableSet new: (2 raisedTo: self size)) >> > hashBlock: >> > [:aSet | >> > aSet >> > inject: aSet size >> > into: [:hash :each | hash * each hash + each]]; >> > yourself. >> > self combinationsInto: combinations. >> > ^ combinations >> > >> > combinationsInto: aSet >> > (aSet includes: self) ifTrue: [^ self]. >> > aSet add: self. >> > self do: >> > [:each | >> > (self copyWithout: each) combinationsInto: aSet]. >> > >> > Even now, I'm still not clear on how one takes a change like this and >> > drives it into a squeak release - Tests, Mantis, then what? >> >> Put it on Mantis as enhancement into the collection category. The rest >> follows automatically ;) I suggest to put something like "fast powerset >> method for collections" into the summary line. >> >> > Z. >> > > Zulq, > the algorithm you are proposing is very simple but has major problems: > > 1) it is not efficient for large size n: it will do (n factorial) loops when > only (2 raisedTo: n) are necessary > (2 raisedTo: 10) -> 1024 > (10 factorial) -> 3628800 > > 2) each loop will result in a costly hash lookup. > Your hashBlock involve LargeInteger of size (B raisedTo: p) where > - B is the size of an elementary hash (presumably 4 bytes), > - p the size of the partition (ranging from 0 to n) > The cost of a single hash of size p involves p multiplications of growing size. > Each multiplication of k digits has a cost k^2 > (this is the naive algorithm currently programmed in LargeInteger plugin). > Thus a single hash costs (B^2+(2*B)^2+...+(p*B)^2) > Combine this with the (n factorial) loops and you will understand the problem. > > 3) the algorithm must store all partitions even if we don't want to collect but > just to iterate on partitions. > > No offense, but you'd better not bother opening a mantis issue for this time. > > Cheers > > Nicolas > > > > > > > _______________________________________________ > Beginners mailing list > [hidden email] > http://lists.squeakfoundation.org/mailman/listinfo/beginners > _______________________________________________ Beginners mailing list [hidden email] http://lists.squeakfoundation.org/mailman/listinfo/beginners |
In reply to this post by Nicolas Cellier-3
Nicolas Cellier wrote:
> Zulq, > the algorithm you are proposing is very simple but has major problems: > > 1) it is not efficient for large size n: it will do (n factorial) loops when > only (2 raisedTo: n) are necessary It's better than N! because it will not loop over a set already processed. For instance, for a set of 5 elements it will try 81 sets but only process 32 of these. Not 120 in either case (5 factorial). > > 2) each loop will result in a costly hash lookup. > Your hashBlock involve LargeInteger of size (B raisedTo: p) where It doesn't need to. That was just a very rough attempt at producing a hash that didn't evaluate to only 16 values. It should be possible to create one that produces SmallIntegers but with a higher cardinality. > 3) the algorithm must store all partitions even if we don't want to collect but > just to iterate on partitions. Yes. > > No offense, but you'd better not bother opening a mantis issue for this time. > Agreed. I was just curious about why the naive algorithm was so slow and then as a seperate question how one gets such changes in. Z. _______________________________________________ Beginners mailing list [hidden email] http://lists.squeakfoundation.org/mailman/listinfo/beginners |
Zulq Alam <me <at> zulq.net> writes:
> > Nicolas Cellier wrote: > > > Zulq, > > the algorithm you are proposing is very simple but has major problems: > > > > 1) it is not efficient for large size n: it will do (n factorial) loops when > > only (2 raisedTo: n) are necessary > > It's better than N! because it will not loop over a set already > processed. For instance, for a set of 5 elements it will try 81 sets but > only process 32 of these. Not 120 in either case (5 factorial). > Apologies for this case of blindness! I forgot the includes: test was cutting branches. > > > > 2) each loop will result in a costly hash lookup. > > Your hashBlock involve LargeInteger of size (B raisedTo: p) where > > It doesn't need to. That was just a very rough attempt at producing a > hash that didn't evaluate to only 16 values. It should be possible to > create one that produces SmallIntegers but with a higher cardinality. > Yes, modular arithmetic would be a way to go. Beware, "small" LargeIntegers intermediate might still spoil the game... Crafting a good hash is an art (see work from Andres Valloud). > > 3) the algorithm must store all partitions even if we don't want to collect but > > just to iterate on partitions. > > Yes. > > > > > No offense, but you'd better not bother opening a mantis issue for this time. > > > > Agreed. I was just curious about why the naive algorithm was so slow and > then as a seperate question how one gets such changes in. > > Z. > Right, you did a good job exercizing your (and our) curiousity. This is very enlightening. Beginners should also learn using MessageTally spyOn: [], to analyze were the real costs go. Cheers _______________________________________________ Beginners mailing list [hidden email] http://lists.squeakfoundation.org/mailman/listinfo/beginners |
In reply to this post by cedreek
cdrick <cdrick65 <at> gmail.com> writes:
> > Yes.. Nico's solution is lot faster... > > set := #(1 2 3 4 5) asSet. > [ set subsets "mine"] timeToRun ." 1" > [ set subsets2 "Zulk first" ] timeToRun. " 8" > [ set subsets4 "Nicolas"] timeToRun. " 1" > [ set subsets5 "Zulk"] timeToRun. " 2" > > set := (1 to: 10) asSet. > [ set subsets ] timeToRun . " 46" > [ set subsets4 ] timeToRun. " 18" > [ set subsets5 ] timeToRun." 114" > > set := (1 to: 15) asSet. > [ set subsets ] timeToRun ." 2484" > [ set subsets4 ] timeToRun. " 1598" > [ set subsets5 ] timeToRun." 12493" > > Nicolas won ;) So maybe we can add this one and rename the > combinations:atATime: ? > > Cédrick > He, press ALT-v to get versions of #combinations:atATimeDo: and thanks tk! I like bit sift solution too for it's simplicity. The problem is that it will iterate p times for creating each subset of size p. #combinations:atATimeDo: does not. It is building subsets in parallel. The required copy does an iteration though, but in a primitive! Hence the difference... Nicolas _______________________________________________ Beginners mailing list [hidden email] http://lists.squeakfoundation.org/mailman/listinfo/beginners |
:)
me too even if we could argue this is not self explaining... That's what Andres Valoud said here (thanks Marcin): http://blogten.blogspot.com/2005/09/very-nice-methods.html
uhm, ok... the diferrence is an order or 2 more or less...
Thanks all, this was fun and interesting ;)... Cédrick _______________________________________________ Beginners mailing list [hidden email] http://lists.squeakfoundation.org/mailman/listinfo/beginners |
Free forum by Nabble | Edit this page |