2009/11/12 Tobias Pape <[hidden email]>:
> Hello —, > Am 2009-11-12 um 06:35 schrieb Andreas Raab: > >> Igor Stasenko wrote: >>> The main purpose to be able to handle nils in sets as an elements is to make Set >>> be able to hold _any_ potentially existing objects in image. >> >> Absolutely not. The main purpose of having nil be included in sets is to remove a particular restriction (namely that Sets cannot contain nil), not to "be able to hold _any_ potentially existing objects in an image". What is being proposed is trading one restriction (Sets cannot contain nil) against another one (Sets cannot contain their internal private marker) which is deemed advantageous as it allows us to convert many collections that contain nil into sets without blowing up. *That* is the main purpose. >> >> The goal of "be able to hold _any_ potentially existing objects in an image" is neither achievable nor worthwhile. Try this for starters: >> >> (s := Set new) add: s; add: s. > > As far as I know, another approach is to have > Distinct Entry Object inside the set. cf. the GemStone > approach. A set, there, contains an ordered collection of > nils an Entry elements. Thus, the actual element is stored _inside_ > the entry element, not in the underlying (storage) collection > of the array itself. As far as I know, this is similar to its Dictionary > approach. > Probably, this will introduce a slightly broader spectrum of > optimization possibilities. > > So long, > -Tobias > I haven't seen how it implemented in GemStone (so i think that implementing it will be license clean (tm) ;) As i see it, the trick is to use a message dispatch to object which going to be an element, to tell it how it would like to represent itself in Set. And this is where story begins. add: newObject "Include newObject as one of the receiver's elements, but only if not already present. Answer newObject." | entry index | entry := newObject asSetElement. index := self scanFor: entry. (array at: index) ifNil: [self atNewIndex: index put: entry]. ^ newObject a SetElement class then is a one-inst-var object, holding a reference to proxied object, and needs to properly implement #hash #= , required by Set. Also, at each place where we need to feed a real object , a Set should send an #unwrap (or give more distinct selector name) to return enclosed object. a subclasses, CollectionSetElement and NilSetElement implemented slightly differently to take in account that we are still using nil as filler object, and collection included in set should prevent going too deep in #hash and guard against potential recursion. As an alternative , we migth use already existing LookupKey class, and subclass from it, instead of having separate SetElement. All objects then obliged to implement #asSetElement and #unwrap - by providing the default methods in Object class, which by default doing nothing, return self. While Collection and UndefinedObject classes overriding asSetElement in order to return wrappers instead of real objects. The main overhead is cost of extra messages sent at each #add: and at each point where we need to deliver a real set element (#unwrap). The space overhead can be measured roughly as: count := 0. Set allSubInstancesDo: [:each | each do: [:value | value isCollection ifTrue: [ count := count +1 ]]]. count in my image its = 97197 but another test reveals that this number is much smaller, because we don't have to be worry about ByteArrays, Strings and Symbols (we could override #asSetElement back to return self, instead of constructing wrapper): count := 0. Set allSubInstancesDo: [:each | each do: [:value | (value isCollection and: [value class isBytes not and: [value isString not] ]) ifTrue: [ count := count +1 ]]]. count - 5021 So, what you think? Is it worth with trying to implemet it, and see the real numbers, or its already clear that its wrong way to go? -- Best regards, Igor Stasenko AKA sig. |
Hello —,
Am 2009-11-12 um 12:11 schrieb Igor Stasenko: > a SetElement class then is a one-inst-var object, holding a reference > to proxied object, and > needs to properly implement #hash #= , required by Set. > Also, at each place where we need to feed a real object , a Set should > send an #unwrap (or give more distinct selector name) > to return enclosed object. probaly, the folowing would also be possible: for all, except the “special” object, #asSetElement ^ self the other two shall return an actual wrapper. this shall retrun the actual entry upon called #value: #value ^ wrappedEntry Then, to obtain an entry , it would be called asking for its #value. Problem then: Blocks :) so long, -Tobias |
In reply to this post by Igor Stasenko
Guys, i took a time to implement this idea (not precisely as
described, but close to). Please review at: http://bugs.squeak.org/view.php?id=7413 -- Best regards, Igor Stasenko AKA sig. |
In reply to this post by Tobias Pape-2
2009/11/12 Tobias Pape <[hidden email]>:
> Hello —, > Am 2009-11-12 um 12:11 schrieb Igor Stasenko: > >> a SetElement class then is a one-inst-var object, holding a reference >> to proxied object, and >> needs to properly implement #hash #= , required by Set. >> Also, at each place where we need to feed a real object , a Set should >> send an #unwrap (or give more distinct selector name) >> to return enclosed object. > > probaly, the folowing would also be possible: > > for all, except the “special” object, > > #asSetElement > > ^ self > > the other two shall return an actual wrapper. > > this shall retrun the actual entry upon called #value: > > #value > > ^ wrappedEntry > > Then, to obtain an entry , it would be called > asking for its #value. > > Problem then: Blocks :) > not only with block, but with anything else, which implements #value differently. This is why i took completely different message selectors for that (#asSetElement and #enclosedSetElement) > so long, > -Tobias > > > -- Best regards, Igor Stasenko AKA sig. |
Free forum by Nabble | Edit this page |