Smalltalk › Squeak › Squeak - Dev

[squeak-dev] Letting Set contain nils?

_‹ Previous Topic Next Topic _›

Classic

List

Threaded

38 messages Options

Igor Stasenko

[squeak-dev] Letting Set contain nils?

Sets which not allowing contain nil as element is a point of inconvenience.

There are two ways how get around that:
- initialize an array which contains set elements with unique object ~~ nil.
and fix methods which testing for an empty slots to compare against
this object, not nil.
This can cause a slowdown during rehashing, because VM initially
creating arrays filled with nils, while we need to fill them with
another object.
There is also a problem with preserving it 'uniqueness' by not giving
this object outside a set.

- add an instVar 'containsNil'
then when set receiving 'add: nil' , it simply sets this flag to true.
modify #collect: , #do: , #remove: to be aware of flag value.

I find the second way is more appropriate. While it costs additional
memory per Set/IdentitySet instance, it costs almost nothing in speed.

What do you think about supporting Sets to contain nils in general,
and about methods how to achieve that?

--
Best regards,
Igor Stasenko AKA sig.

ccrraaiigg

[squeak-dev] Re: Letting Set contain nils?

I'll ask the inevitable question... given that ability, what do
you really want to do? :)

-C

Randal L. Schwartz

Re: [squeak-dev] Letting Set contain nils?

In reply to this post by Igor Stasenko

>>>>> "Igor" == Igor Stasenko <[hidden email]> writes:

Igor> What do you think about supporting Sets to contain nils in general,
Igor> and about methods how to achieve that?

I ran across this problem as well.

--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<[hidden email]> <URL:http://www.stonehenge.com/merlyn/>
Smalltalk/Perl/Unix consulting, Technical writing, Comedy, etc. etc.
See http://methodsandmessages.vox.com/ for Smalltalk and Seaside discussion

timrowledge

Re: [squeak-dev] Letting Set contain nils?

In reply to this post by Igor Stasenko

Sets are defined as not holding nils. Changing that would mess up
quite a bit of code I suspect.

Sets are sparsely populated arrays (around 40% IIRC) that use the nils
to mark the unused slots. How do you imagine that we would mark unused
slots (which are an important part of the whole sparse array thing) if
nils were allowed? And how would you expect #do:/collect: etc to work
- you would have to make all your blocks test for nil. Oh and how
would code discriminate between nils you put in the array and nils
that are put there by the system for the sparseness management?

If you need Sets and want a distinguished value, make one. Just don't
mess up the whole system in the process, eh?

tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
Strange OpCodes: RDR: Rotate Disk Right

Igor Stasenko

Re: [squeak-dev] Re: Letting Set contain nils?

In reply to this post by ccrraaiigg

2008/8/9 Craig Latta <[hidden email]>:
>
> I'll ask the inevitable question... given that ability, what do you
> really want to do? :)
>

Some practicle usage: pick an arbitrary array and tell the number of
_unique_ objects it contains, including nils, of course.

--
Best regards,
Igor Stasenko AKA sig.

timrowledge

Re: [squeak-dev] Re: Letting Set contain nils?

On 8-Aug-08, at 5:20 PM, Igor Stasenko wrote:

> 2008/8/9 Craig Latta <[hidden email]>:
>>
>> I'll ask the inevitable question... given that ability, what do
>> you
>> really want to do? :)
>>
>
> Some practicle usage: pick an arbitrary array and tell the number of
> _unique_ objects it contains, including nils, of course.

For Sets that is trivial without any need whatsoever to alter
anything. It is theSet size + 1 since all Sets will have at least one
nil. Unless of course some evil person messes up the growing protocol
for Sets.

tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
Variables won't; constants aren't.

Igor Stasenko

Re: [squeak-dev] Letting Set contain nils?

In reply to this post by timrowledge

2008/8/9 tim Rowledge <[hidden email]>:
> Sets are defined as not holding nils.

That's the problem. A nil is just an object. What other reasons
besides implementaion limitations restricts from putting nils in set?
Sets are commonly used as a sets of _arbitrary_ objects. But hey - nil
is an object too. So what the point in such discrimination?

> Changing that would mess up quite a bit of code I suspect.

I suspect not.
Currently Sets will throw an error if you try to include nil as
element. This means that in current system there are no sets
containing nils and no code which intentionally put nils to sets,
because this leads to error.
>From the above i can conclude that removing limitation of inclusion
nils to sets will not break anything in current system.

>
> Sets are sparsely populated arrays (around 40% IIRC) that use the nils to
> mark the unused slots. How do you imagine that we would mark unused slots
> (which are an important part of the whole sparse array thing) if nils were
> allowed?

I thought i outlined possible solutions good enough :)

> And how would you expect #do:/collect: etc to work - you would
> have to make all your blocks test for nil. Oh and how would code
> discriminate between nils you put in the array and nils that are put there
> by the system for the sparseness management?
>

Let me illustrate you:

Collection subclass: #Set
instanceVariableNames: 'tally array containsNil'

Set>>initialize
containsNil := false

Set>>add: newObject
newObject ifNil: [ containsNil := true. ^ nil ]. " instead of
self error: blabla ... "
... blabla...

Set>>do: aBlock
containsNil ifTrue: [ aBlock value: nil ].
... the rest of method is same as previous ..

Set>>collect: aBlock
| newSet |
newSet := Set new: self size.
array do: [:each | each ifNotNil: [newSet add: (aBlock value: each)]].
containsNil ifTrue: [ newSet add: nil ].
^ newSet

so, its actually a question of adding 'containsNil ifTrue: [] '
everywhere it needs to be.

> If you need Sets and want a distinguished value, make one. Just don't mess
> up the whole system in the process, eh?
>
> tim
> --
> tim Rowledge; [hidden email]; http://www.rowledge.org/tim
> Strange OpCodes: RDR: Rotate Disk Right
>

--
Best regards,
Igor Stasenko AKA sig.

Eliot Miranda-2

Re: [squeak-dev] Letting Set contain nils?

In reply to this post by Igor Stasenko

On Fri, Aug 8, 2008 at 3:53 PM, Igor Stasenko <[hidden email]> wrote:

Sets which not allowing contain nil as element is a point of inconvenience.

There are two ways how get around that:
- initialize an array which contains set elements with unique object ~~ nil.
and fix methods which testing for an empty slots to compare against
this object, not nil.
This can cause a slowdown during rehashing, because VM initially
creating arrays filled with nils, while we need to fill them with
another object.
There is also a problem with preserving it 'uniqueness' by not giving
this object outside a set.

- add an instVar 'containsNil'
then when set receiving 'add: nil' , it simply sets this flag to true.
modify #collect: , #do: , #remove: to be aware of flag value.

I find the second way is more appropriate. While it costs additional
memory per Set/IdentitySet instance, it costs almost nothing in speed.

What do you think about supporting Sets to contain nils in general,
and about methods how to achieve that?

Here's a third approach (a variation on your 2nd approach). Have an instVar 'includesSelf' and fill the array with the Set itself. So add a new primitive new:fillWith: (primitiveNewWithArgAndFillValue?) and use this to create the empty array filled with the Set itself. Check for adding the set itself, and itself being the null entry. The advantage over the flag for nil approach is that you kill two birds with one stone.

1. You need a unique value anyway, and the Set can nicely be its own unique value

2. recursive collections are a problem to print and with the explicit flag this becomes much easier to deal with.

--
Best regards,
Igor Stasenko AKA sig.

Eliot Miranda-2

Re: [squeak-dev] Re: Letting Set contain nils?

In reply to this post by timrowledge

On Fri, Aug 8, 2008 at 5:37 PM, tim Rowledge <[hidden email]> wrote:

On 8-Aug-08, at 5:20 PM, Igor Stasenko wrote:

2008/8/9 Craig Latta <[hidden email]>:

I'll ask the inevitable question... given that ability, what do you
really want to do? :)

Some practicle usage: pick an arbitrary array and tell the number of
_unique_ objects it contains, including nils, of course.

For Sets that is trivial without any need whatsoever to alter anything. It is theSet size + 1 since all Sets will have at least one nil. Unless of course some evil person messes up the growing protocol for Sets.

But it doesn't deal with the case where the array includes nil. So one has to come up with something clumsy such as

(theArray select: [:obj| obj notNil]) asSet size + ((theArray includes: nil) ifTrue: [1] ifFalse: [0])

which creates an extra collection for the select, requires an extra pass to find nil (n the case where it doesn't contain it) and is plain ugly

[| forNil | forNil := 0. (theArray inject: Set new into: [:set :item| item == nil ifTrue: [hasNil := 1] ifFalse: [set add: item]. set]) + forNil]

which is faster but even more verbose and confusing.

Compare this to

theArray asSet size

I'm with Igor. This would indeed be an improvement.

Just because "its always been this way" doesn't mean its right.

tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
Variables won't; constants aren't.

Igor Stasenko

Re: [squeak-dev] Letting Set contain nils?

In reply to this post by Eliot Miranda-2

2008/8/9 Eliot Miranda <[hidden email]>:

>
>
> On Fri, Aug 8, 2008 at 3:53 PM, Igor Stasenko <[hidden email]> wrote:
>>
>> Sets which not allowing contain nil as element is a point of
>> inconvenience.
>>
>> There are two ways how get around that:
>> - initialize an array which contains set elements with unique object ~~
>> nil.
>> and fix methods which testing for an empty slots to compare against
>> this object, not nil.
>> This can cause a slowdown during rehashing, because VM initially
>> creating arrays filled with nils, while we need to fill them with
>> another object.
>> There is also a problem with preserving it 'uniqueness' by not giving
>> this object outside a set.
>>
>> - add an instVar 'containsNil'
>> then when set receiving 'add: nil' , it simply sets this flag to true.
>> modify #collect: , #do: , #remove: to be aware of flag value.
>>
>> I find the second way is more appropriate. While it costs additional
>> memory per Set/IdentitySet instance, it costs almost nothing in speed.
>>
>> What do you think about supporting Sets to contain nils in general,
>> and about methods how to achieve that?
>
> Here's a third approach (a variation on your 2nd approach). Have an instVar
> 'includesSelf' and fill the array with the Set itself. So add a new
> primitive new:fillWith: (primitiveNewWithArgAndFillValue?) and use this to
> create the empty array filled with the Set itself. Check for adding the set
> itself, and itself being the null entry. The advantage over the flag for
> nil approach is that you kill two birds with one stone.
> 1. You need a unique value anyway, and the Set can nicely be its own unique
> value
> 2. recursive collections are a problem to print and with the explicit flag
> this becomes much easier to deal with.

In math domain, any set includes itself , not as element of course,
but as subset :)
And i don't see how this is better comparing to having 'containsNil' ivar?
You still have to test this flag in each method which deals with
elements , so be it containsFoo or containsBar - no real difference.

>>
>> --
>> Best regards,
>> Igor Stasenko AKA sig.
>>

--
Best regards,
Igor Stasenko AKA sig.

Eliot Miranda-2

Re: [squeak-dev] Letting Set contain nils?

On Fri, Aug 8, 2008 at 5:46 PM, Igor Stasenko <[hidden email]> wrote:

2008/8/9 Eliot Miranda <[hidden email]>:

>
>
> On Fri, Aug 8, 2008 at 3:53 PM, Igor Stasenko <[hidden email]> wrote:
>>
>> Sets which not allowing contain nil as element is a point of
>> inconvenience.
>>
>> There are two ways how get around that:
>> - initialize an array which contains set elements with unique object ~~
>> nil.
>> and fix methods which testing for an empty slots to compare against
>> this object, not nil.
>> This can cause a slowdown during rehashing, because VM initially
>> creating arrays filled with nils, while we need to fill them with
>> another object.
>> There is also a problem with preserving it 'uniqueness' by not giving
>> this object outside a set.
>>
>> - add an instVar 'containsNil'
>> then when set receiving 'add: nil' , it simply sets this flag to true.
>> modify #collect: , #do: , #remove: to be aware of flag value.
>>
>> I find the second way is more appropriate. While it costs additional
>> memory per Set/IdentitySet instance, it costs almost nothing in speed.
>>
>> What do you think about supporting Sets to contain nils in general,
>> and about methods how to achieve that?
>
> Here's a third approach (a variation on your 2nd approach). Have an instVar
> 'includesSelf' and fill the array with the Set itself. So add a new
> primitive new:fillWith: (primitiveNewWithArgAndFillValue?) and use this to
> create the empty array filled with the Set itself. Check for adding the set
> itself, and itself being the null entry. The advantage over the flag for
> nil approach is that you kill two birds with one stone.
> 1. You need a unique value anyway, and the Set can nicely be its own unique
> value
> 2. recursive collections are a problem to print and with the explicit flag
> this becomes much easier to deal with.

In math domain, any set includes itself , not as element of course,
but as subset :)
And i don't see how this is better comparing to having 'containsNil' ivar?
You still have to test this flag in each method which deals with
elements , so be it containsFoo or containsBar - no real difference.

One difference is the use of self for the unique object instead of another.

Another difference is that recursive sets no longer fail to print, inspect, etc.

I think those are real differences.

>>
>> --
>> Best regards,
>> Igor Stasenko AKA sig.
>>

--

Best regards,
Igor Stasenko AKA sig.

Igor Stasenko

Re: [squeak-dev] Letting Set contain nils?

2008/8/9 Eliot Miranda <[hidden email]>:

>
>
> On Fri, Aug 8, 2008 at 5:46 PM, Igor Stasenko <[hidden email]> wrote:
>>
>> 2008/8/9 Eliot Miranda <[hidden email]>:
>> >
>> >
>> > On Fri, Aug 8, 2008 at 3:53 PM, Igor Stasenko <[hidden email]>
>> > wrote:
>> >>
>> >> Sets which not allowing contain nil as element is a point of
>> >> inconvenience.
>> >>
>> >> There are two ways how get around that:
>> >> - initialize an array which contains set elements with unique object ~~
>> >> nil.
>> >> and fix methods which testing for an empty slots to compare against
>> >> this object, not nil.
>> >> This can cause a slowdown during rehashing, because VM initially
>> >> creating arrays filled with nils, while we need to fill them with
>> >> another object.
>> >> There is also a problem with preserving it 'uniqueness' by not giving
>> >> this object outside a set.
>> >>
>> >> - add an instVar 'containsNil'
>> >> then when set receiving 'add: nil' , it simply sets this flag to true.
>> >> modify #collect: , #do: , #remove: to be aware of flag value.
>> >>
>> >> I find the second way is more appropriate. While it costs additional
>> >> memory per Set/IdentitySet instance, it costs almost nothing in speed.
>> >>
>> >> What do you think about supporting Sets to contain nils in general,
>> >> and about methods how to achieve that?
>> >
>> > Here's a third approach (a variation on your 2nd approach). Have an
>> > instVar
>> > 'includesSelf' and fill the array with the Set itself. So add a new
>> > primitive new:fillWith: (primitiveNewWithArgAndFillValue?) and use this
>> > to
>> > create the empty array filled with the Set itself. Check for adding the
>> > set
>> > itself, and itself being the null entry. The advantage over the flag
>> > for
>> > nil approach is that you kill two birds with one stone.
>> > 1. You need a unique value anyway, and the Set can nicely be its own
>> > unique
>> > value
>> > 2. recursive collections are a problem to print and with the explicit
>> > flag
>> > this becomes much easier to deal with.
>>
>> In math domain, any set includes itself , not as element of course,
>> but as subset :)
>> And i don't see how this is better comparing to having 'containsNil' ivar?
>> You still have to test this flag in each method which deals with
>> elements , so be it containsFoo or containsBar - no real difference.
>
> One difference is the use of self for the unique object instead of another.
> Another difference is that recursive sets no longer fail to print, inspect,
> etc.
> I think those are real differences.
>

Valid point.

As a bytecode proofy, can you tell how much different a bytecode will be for:

obj == nil
versus
obj == self
versus
obj isNil

what is faster/slower?

--
Best regards,
Igor Stasenko AKA sig.

Igor Stasenko

Re: [squeak-dev] Re: Letting Set contain nils?

In reply to this post by Eliot Miranda-2

2008/8/9 Eliot Miranda <[hidden email]>:

>
>
> On Fri, Aug 8, 2008 at 5:37 PM, tim Rowledge <[hidden email]> wrote:
>>
>> On 8-Aug-08, at 5:20 PM, Igor Stasenko wrote:
>>
>>> 2008/8/9 Craig Latta <[hidden email]>:
>>>>
>>>> I'll ask the inevitable question... given that ability, what do you
>>>> really want to do? :)
>>>>
>>>
>>> Some practicle usage: pick an arbitrary array and tell the number of
>>> _unique_ objects it contains, including nils, of course.
>>
>>
>> For Sets that is trivial without any need whatsoever to alter anything. It
>> is theSet size + 1 since all Sets will have at least one nil. Unless of
>> course some evil person messes up the growing protocol for Sets.
>
> But it doesn't deal with the case where the array includes nil. So one has
> to come up with something clumsy such as
> (theArray select: [:obj| obj notNil]) asSet size + ((theArray includes:
> nil) ifTrue: [1] ifFalse: [0])
> which creates an extra collection for the select, requires an extra pass to
> find nil (n the case where it doesn't contain it) and is plain ugly
> or
> [| forNil | forNil := 0. (theArray inject: Set new into: [:set :item|
> item == nil ifTrue: [hasNil := 1] ifFalse: [set add: item]. set]) + forNil]
> which is faster but even more verbose and confusing.

My, maybe more verbose sample, but most efficient one, i guess :) :

daSet := Set new.
daObject := Object new.
daArray do: [:item | item ifNil: [ daSet add: daObject ] ifNotNil: [
daSet add: item] ].
^ daSet size

A use case example from the parrallel domain - databases:

SELECT DISTINCT field FROM someTable WHERE blabla
- returns a set of disctinct field values (including NULL values).

But while we can deal more or less painless with calculating a number
of distinct values,
operating with distinct values causing a lot more problem:
suggest you want to pass a set of distinct values somewhere else:
Now since you want to keep notion that nil present or not as distinct
value - you have to use additional argument, or convert a set to
OrderedCollection and add nil.
Nedless to say, its very inconvenient.

> Compare this to
>
> theArray asSet size
> I'm with Igor. This would indeed be an improvement.
> Just because "its always been this way" doesn't mean its right.
>>
>> tim
>> --
>> tim Rowledge; [hidden email]; http://www.rowledge.org/tim
>> Variables won't; constants aren't.
>>

--
Best regards,
Igor Stasenko AKA sig.

Eliot Miranda-2

Re: [squeak-dev] Letting Set contain nils?

In reply to this post by Igor Stasenko

On Fri, Aug 8, 2008 at 6:09 PM, Igor Stasenko <[hidden email]> wrote:

2008/8/9 Eliot Miranda <[hidden email]>:
>
>
> On Fri, Aug 8, 2008 at 5:46 PM, Igor Stasenko <[hidden email]> wrote:
>>
>> 2008/8/9 Eliot Miranda <[hidden email]>:
>> >
>> >
>> > On Fri, Aug 8, 2008 at 3:53 PM, Igor Stasenko <[hidden email]>
>> > wrote:
>> >>
>> >> Sets which not allowing contain nil as element is a point of
>> >> inconvenience.
>> >>
>> >> There are two ways how get around that:
>> >> - initialize an array which contains set elements with unique object ~~
>> >> nil.
>> >> and fix methods which testing for an empty slots to compare against
>> >> this object, not nil.
>> >> This can cause a slowdown during rehashing, because VM initially
>> >> creating arrays filled with nils, while we need to fill them with
>> >> another object.
>> >> There is also a problem with preserving it 'uniqueness' by not giving
>> >> this object outside a set.
>> >>
>> >> - add an instVar 'containsNil'
>> >> then when set receiving 'add: nil' , it simply sets this flag to true.
>> >> modify #collect: , #do: , #remove: to be aware of flag value.
>> >>
>> >> I find the second way is more appropriate. While it costs additional
>> >> memory per Set/IdentitySet instance, it costs almost nothing in speed.
>> >>
>> >> What do you think about supporting Sets to contain nils in general,
>> >> and about methods how to achieve that?
>> >
>> > Here's a third approach (a variation on your 2nd approach). Have an
>> > instVar
>> > 'includesSelf' and fill the array with the Set itself. So add a new
>> > primitive new:fillWith: (primitiveNewWithArgAndFillValue?) and use this
>> > to
>> > create the empty array filled with the Set itself. Check for adding the
>> > set
>> > itself, and itself being the null entry. The advantage over the flag
>> > for
>> > nil approach is that you kill two birds with one stone.
>> > 1. You need a unique value anyway, and the Set can nicely be its own
>> > unique
>> > value
>> > 2. recursive collections are a problem to print and with the explicit
>> > flag
>> > this becomes much easier to deal with.
>>
>> In math domain, any set includes itself , not as element of course,
>> but as subset :)
>> And i don't see how this is better comparing to having 'containsNil' ivar?
>> You still have to test this flag in each method which deals with
>> elements , so be it containsFoo or containsBar - no real difference.
>
> One difference is the use of self for the unique object instead of another.
> Another difference is that recursive sets no longer fail to print, inspect,
> etc.
> I think those are real differences.
>

Valid point.

As a bytecode proofy, can you tell how much different a bytecode will be for:

what's a proofy? ;)

obj == nil
versus
obj == self
versus
obj isNil

Depends. In an Interpreter obj == self likely to be slightly faster than obj == nil since nil must be fetched either form the specialObjectsArray (if bytecode set has pushNl, as it does) or from the method's literal frame. In a JIT they're liely the same because self is a read through the frame pointer and nil is a constant embedded in the instruction stream. These are likely to be of similar cost.

obj isNil will either be the same cost as == nil or slower depending on whether the bytecode compiler inlines it.

But the difference between obj == self & obj == nil/obj isNil will be in the noise.

You should make the decision on convenience.

what is faster/slower?

anArray asSet size is faster than any of the alternatives because it is easier to thunk about ;) Faster thought is much more valuable than faster processing, c.f. Smalltalk programs vs C++ programs ;)

--

Best regards,
Igor Stasenko AKA sig.

Igor Stasenko

Re: [squeak-dev] Letting Set contain nils?

2008/8/9 Eliot Miranda <[hidden email]>:

>
>
> On Fri, Aug 8, 2008 at 6:09 PM, Igor Stasenko <[hidden email]> wrote:
>>
>> 2008/8/9 Eliot Miranda <[hidden email]>:
>> >
>> >
>> > On Fri, Aug 8, 2008 at 5:46 PM, Igor Stasenko <[hidden email]>
>> > wrote:
>> >>
>> >> 2008/8/9 Eliot Miranda <[hidden email]>:
>> >> >
>> >> >
>> >> > On Fri, Aug 8, 2008 at 3:53 PM, Igor Stasenko <[hidden email]>
>> >> > wrote:
>> >> >>
>> >> >> Sets which not allowing contain nil as element is a point of
>> >> >> inconvenience.
>> >> >>
>> >> >> There are two ways how get around that:
>> >> >> - initialize an array which contains set elements with unique object
>> >> >> ~~
>> >> >> nil.
>> >> >> and fix methods which testing for an empty slots to compare against
>> >> >> this object, not nil.
>> >> >> This can cause a slowdown during rehashing, because VM initially
>> >> >> creating arrays filled with nils, while we need to fill them with
>> >> >> another object.
>> >> >> There is also a problem with preserving it 'uniqueness' by not
>> >> >> giving
>> >> >> this object outside a set.
>> >> >>
>> >> >> - add an instVar 'containsNil'
>> >> >> then when set receiving 'add: nil' , it simply sets this flag to
>> >> >> true.
>> >> >> modify #collect: , #do: , #remove: to be aware of flag value.
>> >> >>
>> >> >> I find the second way is more appropriate. While it costs additional
>> >> >> memory per Set/IdentitySet instance, it costs almost nothing in
>> >> >> speed.
>> >> >>
>> >> >> What do you think about supporting Sets to contain nils in general,
>> >> >> and about methods how to achieve that?
>> >> >
>> >> > Here's a third approach (a variation on your 2nd approach). Have an
>> >> > instVar
>> >> > 'includesSelf' and fill the array with the Set itself. So add a new
>> >> > primitive new:fillWith: (primitiveNewWithArgAndFillValue?) and use
>> >> > this
>> >> > to
>> >> > create the empty array filled with the Set itself. Check for adding
>> >> > the
>> >> > set
>> >> > itself, and itself being the null entry. The advantage over the flag
>> >> > for
>> >> > nil approach is that you kill two birds with one stone.
>> >> > 1. You need a unique value anyway, and the Set can nicely be its own
>> >> > unique
>> >> > value
>> >> > 2. recursive collections are a problem to print and with the explicit
>> >> > flag
>> >> > this becomes much easier to deal with.
>> >>
>> >> In math domain, any set includes itself , not as element of course,
>> >> but as subset :)
>> >> And i don't see how this is better comparing to having 'containsNil'
>> >> ivar?
>> >> You still have to test this flag in each method which deals with
>> >> elements , so be it containsFoo or containsBar - no real difference.
>> >
>> > One difference is the use of self for the unique object instead of
>> > another.
>> > Another difference is that recursive sets no longer fail to print,
>> > inspect,
>> > etc.
>> > I think those are real differences.
>> >
>>
>> Valid point.
>>
>> As a bytecode proofy, can you tell how much different a bytecode will be
>> for:
>
> what's a proofy? ;)
>

an expert :)

>>
>> obj == nil
>> versus
>> obj == self
>> versus
>> obj isNil
>
> Depends. In an Interpreter obj == self likely to be slightly faster than
> obj == nil since nil must be fetched either form the specialObjectsArray (if
> bytecode set has pushNl, as it does) or from the method's literal frame. In
> a JIT they're liely the same because self is a read through the frame
> pointer and nil is a constant embedded in the instruction stream. These are
> likely to be of similar cost.
> obj isNil will either be the same cost as == nil or slower depending on
> whether the bytecode compiler inlines it.
> But the difference between obj == self & obj == nil/obj isNil will be in the
> noise.
> You should make the decision on convenience.
>>
>> what is faster/slower?
>
> anArray asSet size is faster than any of the alternatives because it is
> easier to thunk about ;) Faster thought is much more valuable than faster
> processing, c.f. Smalltalk programs vs C++ programs ;)
>

yes, i raised this topic exactly from this reason.
What i'm not sure that is this change is so badly needed.
Maybe its only me who get stuck with a problem how to deal with sets
where nils are meaningful and useful value.

--
Best regards,
Igor Stasenko AKA sig.

Eliot Miranda-2

Re: [squeak-dev] Letting Set contain nils?

On Fri, Aug 8, 2008 at 7:07 PM, Igor Stasenko <[hidden email]> wrote:

2008/8/9 Eliot Miranda <[hidden email]>:
>
>
> On Fri, Aug 8, 2008 at 6:09 PM, Igor Stasenko <[hidden email]> wrote:
>>
>> 2008/8/9 Eliot Miranda <[hidden email]>:
>> >
>> >
>> > On Fri, Aug 8, 2008 at 5:46 PM, Igor Stasenko <[hidden email]>
>> > wrote:
>> >>
>> >> 2008/8/9 Eliot Miranda <[hidden email]>:
>> >> >
>> >> >
>> >> > On Fri, Aug 8, 2008 at 3:53 PM, Igor Stasenko <[hidden email]>
>> >> > wrote:
>> >> >>
>> >> >> Sets which not allowing contain nil as element is a point of
>> >> >> inconvenience.
>> >> >>
>> >> >> There are two ways how get around that:
>> >> >> - initialize an array which contains set elements with unique object
>> >> >> ~~
>> >> >> nil.
>> >> >> and fix methods which testing for an empty slots to compare against
>> >> >> this object, not nil.
>> >> >> This can cause a slowdown during rehashing, because VM initially
>> >> >> creating arrays filled with nils, while we need to fill them with
>> >> >> another object.
>> >> >> There is also a problem with preserving it 'uniqueness' by not
>> >> >> giving
>> >> >> this object outside a set.
>> >> >>
>> >> >> - add an instVar 'containsNil'
>> >> >> then when set receiving 'add: nil' , it simply sets this flag to
>> >> >> true.
>> >> >> modify #collect: , #do: , #remove: to be aware of flag value.
>> >> >>
>> >> >> I find the second way is more appropriate. While it costs additional
>> >> >> memory per Set/IdentitySet instance, it costs almost nothing in
>> >> >> speed.
>> >> >>
>> >> >> What do you think about supporting Sets to contain nils in general,
>> >> >> and about methods how to achieve that?
>> >> >
>> >> > Here's a third approach (a variation on your 2nd approach). Have an
>> >> > instVar
>> >> > 'includesSelf' and fill the array with the Set itself. So add a new
>> >> > primitive new:fillWith: (primitiveNewWithArgAndFillValue?) and use
>> >> > this
>> >> > to
>> >> > create the empty array filled with the Set itself. Check for adding
>> >> > the
>> >> > set
>> >> > itself, and itself being the null entry. The advantage over the flag
>> >> > for
>> >> > nil approach is that you kill two birds with one stone.
>> >> > 1. You need a unique value anyway, and the Set can nicely be its own
>> >> > unique
>> >> > value
>> >> > 2. recursive collections are a problem to print and with the explicit
>> >> > flag
>> >> > this becomes much easier to deal with.
>> >>
>> >> In math domain, any set includes itself , not as element of course,
>> >> but as subset :)
>> >> And i don't see how this is better comparing to having 'containsNil'
>> >> ivar?
>> >> You still have to test this flag in each method which deals with
>> >> elements , so be it containsFoo or containsBar - no real difference.
>> >
>> > One difference is the use of self for the unique object instead of
>> > another.
>> > Another difference is that recursive sets no longer fail to print,
>> > inspect,
>> > etc.
>> > I think those are real differences.
>> >
>>
>> Valid point.
>>
>> As a bytecode proofy, can you tell how much different a bytecode will be
>> for:
>
> what's a proofy? ;)
>

an expert :)

>>
>> obj == nil
>> versus
>> obj == self
>> versus
>> obj isNil
>
> Depends. In an Interpreter obj == self likely to be slightly faster than
> obj == nil since nil must be fetched either form the specialObjectsArray (if
> bytecode set has pushNl, as it does) or from the method's literal frame. In
> a JIT they're liely the same because self is a read through the frame
> pointer and nil is a constant embedded in the instruction stream. These are
> likely to be of similar cost.
> obj isNil will either be the same cost as == nil or slower depending on
> whether the bytecode compiler inlines it.
> But the difference between obj == self & obj == nil/obj isNil will be in the
> noise.
> You should make the decision on convenience.
>>
>> what is faster/slower?
>
> anArray asSet size is faster than any of the alternatives because it is
> easier to thunk about ;) Faster thought is much more valuable than faster
> processing, c.f. Smalltalk programs vs C++ programs ;)
>

yes, i raised this topic exactly from this reason.
What i'm not sure that is this change is so badly needed.
Maybe its only me who get stuck with a problem how to deal with sets
where nils are meaningful and useful value.

I've certainly run into it before and I know colleagues have in the past. What's hard to tell is how much code out there is working around the limitation.

--

Best regards,
Igor Stasenko AKA sig.

Giovanni Corriga

Re: [squeak-dev] Letting Set contain nils?

In reply to this post by Igor Stasenko

Igor Stasenko ha scritto:

> What do you think about supporting Sets to contain nils in general,
> and about methods how to achieve that?
>

Messing in such a way with a standard collection class is definetively
wrong. What you could do is create another class that implements all
Set's protocols. In such a class I'd use the instance itself as a
placeholder to put in the empty slots.

Giovanni

Giovanni Corriga

Re: [squeak-dev] Letting Set contain nils?

In reply to this post by Igor Stasenko

Igor Stasenko ha scritto:
>
> yes, i raised this topic exactly from this reason.
> What i'm not sure that is this change is so badly needed.
> Maybe its only me who get stuck with a problem how to deal with sets
> where nils are meaningful and useful value.
>

Could it be that you're possibly missing an abstraction here? why not
use a different object, specialized for your needs, to represent your
null value?

Giovanni

Igor Stasenko

Re: [squeak-dev] Letting Set contain nils?

2008/8/9 Giovanni Corriga <[hidden email]>:

> Igor Stasenko ha scritto:
>>
>> yes, i raised this topic exactly from this reason.
>> What i'm not sure that is this change is so badly needed.
>> Maybe its only me who get stuck with a problem how to deal with sets
>> where nils are meaningful and useful value.
>>
>
> Could it be that you're possibly missing an abstraction here? why not use a
> different object, specialized for your needs, to represent your null value?
>

Because other part(s) of the system may not know about nulls as
representation of nils, and in own turn may want to abstract a set
containing nils - so it will use own null2 abstraction.. In result
your end up operating with sets containing nulls and null2-s :)

> Giovanni
>

--
Best regards,
Igor Stasenko AKA sig.

Klaus D. Witzel

[squeak-dev] Re: Letting Set contain nils?

On Sat, 09 Aug 2008 14:41:42 +0200, Igor Stasenko wrote:

> 2008/8/9 Giovanni Corriga :
>> Igor Stasenko ha scritto:
>>>
>>> yes, i raised this topic exactly from this reason.
>>> What i'm not sure that is this change is so badly needed.
>>> Maybe its only me who get stuck with a problem how to deal with sets
>>> where nils are meaningful and useful value.
>>>
>>
>> Could it be that you're possibly missing an abstraction here? why not
>> use a
>> different object, specialized for your needs, to represent your null
>> value?
>>
>
> Because other part(s) of the system may not know about nulls as
> representation of nils, and in own turn may want to abstract a set
> containing nils - so it will use own null2 abstraction.. In result
> your end up operating with sets containing nulls and null2-s :)

If it where a Smalltalk system the two of you are talking about, the Set
in question follows encapsulation principle, and nobody outside of Set
would know how it memorizes unoccupied slots.

The implementor of Set is free to make own choice (all other things kept
equal, like for example the contract with the VM on MethodDictionary while
the latter inherits from Set).

/Klaus