About Dictionary >> #at:ifAbsentPut:


Mariano Martinez Peck
Hi guys. If I tell you the selector is Dictionary >> #at:ifAbsentPut:, what would you expect the second parameter to be? The value.
So one would write:
Dictionary new at: #foo ifAbsentPut: 4

But if you look at Dictionary >>

at: key ifAbsentPut: aBlock
    "Return the value at the given key.
    If key is not included in the receiver store the result
    of evaluating aBlock as new value."

    ^ self at: key ifAbsent: [self at: key put: aBlock value]

So it expects a block. OK, we are in Smalltalk, so implementing #value is enough.

Well, the previous example works, but only because we have an ugly Object >> #value that returns self.
If I pass instances of ProtoObject subclasses (proxies), it does not work anymore.

So my question is: since we write Dictionary at: #foo put: 4, why does #at:ifAbsentPut: expect a block and not the value directly?
In which cases do I need a block instead of the value object itself?

thanks

--
Mariano
http://marianopeck.wordpress.com
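
For readers following along, here is a minimal workspace sketch of the behaviour described in this post. It assumes a Pharo or Squeak image where Object >> #value answers the receiver; the keys #foo and #bar are arbitrary.

```smalltalk
| dict |
dict := Dictionary new.

"A plain object as argument: this works only because Object >> #value answers self."
dict at: #foo ifAbsentPut: 4.

"A block as argument: it is evaluated because #bar is absent."
dict at: #bar ifAbsentPut: [ 2 + 2 ].

(dict at: #foo) = (dict at: #bar).   "true, both entries hold 4"
```

A proxy that inherits from ProtoObject does not get Object >> #value, so passing it directly would fail unless the proxy handles #value itself; wrapping it in a block avoids the problem.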


Re: About Dictionary >> #at:ifAbsentPut:

Sven Van Caekenberghe

On 04 Oct 2011, at 12:52, Mariano Martinez Peck wrote:

> Hi guys. If I tell you the selector is Dictionary >> #at:ifAbsentPut:  what would you expect the second parameter to be?  the value.
> So, one would do:
> Dictionary new at: #foo ifAbsentPut: 4
>
> But if you see Dictionary >>
>
> at: key ifAbsentPut: aBlock
>     "Return the value at the given key.
>     If key is not included in the receiver store the result
>     of evaluating aBlock as new value."
>
>     ^ self at: key ifAbsent: [self at: key put: aBlock value]
>
> so it expects a Block. Ok, we are in Smalltalk, so implementing #value is enough.
>
> Well..the previous example works, but only because we have an ugly Object >> value  that returns self.
> If I put instances of subclasses from ProtoObjects (proxies), that do not work anymore.
>
> So...my question is we do Dictionary at: #foo put: 4, why #at:ifAbsentPut:  expects a block and not directly the value?
> in which case I need a block instead of the value object directly ?

You are right, it feels a bit silly.
However, have a look at the senders (there are many).
Often the block holds an expensive operation when using the dictionary as a cache:

        cache at: key ifAbsent: [ backEnd get: key ]

Then it feels quite elegant.

Sven
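
A small sketch of the cache idiom, written with #at:ifAbsentPut: so that the fetched value is actually stored. The fetch block and the calls counter are hypothetical stand-ins for an expensive back-end call; they are not part of the original post.

```smalltalk
| cache calls fetch |
cache := Dictionary new.
calls := 0.

"fetch is a hypothetical stand-in for an expensive back-end request."
fetch := [ :key | calls := calls + 1. key size ].

cache at: 'pharo' ifAbsentPut: [ fetch value: 'pharo' ].   "miss: the block runs and 5 is stored"
cache at: 'pharo' ifAbsentPut: [ fetch value: 'pharo' ].   "hit: the block is not evaluated"

calls.   "1"
```

The second send answers the stored value without touching the back end, which is the point of deferring the computation in a block.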



Re: About Dictionary >> #at:ifAbsentPut:

Mariano Martinez Peck


On Tue, Oct 4, 2011 at 1:08 PM, Sven Van Caekenberghe <[hidden email]> wrote:

You are right, it feels a bit silly.
However, have a look at the senders (there are many).
Often the block holds an expensive operation when using the dictionary as a cache:

       cache at: key ifAbsent: [ backEnd get: key ]


And what is the difference compared to:

cache at: key ifAbsent: ( backEnd get: key )    (note the parentheses instead of [])

[500 timesRepeat: [
	Dictionary new at: #foo ifAbsentPut: [ CompiledMethod instanceCount ] ]] timeToRun.   "7254"

[500 timesRepeat: [
	Dictionary new at: #foo ifAbsentPut: (CompiledMethod instanceCount) ]] timeToRun.   "7294"

Both give me more or less the same result...

Thanks

 
--
Mariano
http://marianopeck.wordpress.com


Re: About Dictionary >> #at:ifAbsentPut:

NorbertHartl

On 04.10.2011 at 13:33, Mariano Martinez Peck wrote:



and what is the difference to do:

cache at: key ifAbsent: ( backEnd get: key )    (notice the parenthesis instead of [])

[500 timesRepeat:[
Dictionary new at: #foo ifAbsentPut: [ CompiledMethod instanceCount ] ]] timeToRun 7254

[500 timesRepeat:[
Dictionary new at: #foo ifAbsentPut:  (CompiledMethod instanceCount)  ]] timeToRun 7294 

gives me more or less the same...

Try:

| dict |
dict := Dictionary new.
[500 timesRepeat: [
	dict at: #foo ifAbsentPut: [ CompiledMethod instanceCount ] ]] timeToRun.   "21"

| dict |
dict := Dictionary new.
[500 timesRepeat: [
	dict at: #foo ifAbsentPut: (CompiledMethod instanceCount) ]] timeToRun.   "10501"

Norbert
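
The gap between the two timings comes from when the argument is evaluated: a parenthesised expression is computed before every send, while a block is only evaluated when the key is missing. The sketch below makes that visible with two counters instead of a timer; the names eager and lazy are illustrative only.

```smalltalk
| dict eager lazy |
dict := Dictionary new.
eager := 0.
lazy := 0.

"Parenthesised argument: evaluated before every send, even once #foo is present."
500 timesRepeat: [
	dict at: #foo ifAbsentPut: (eager := eager + 1) ].

"Block argument: evaluated only while #bar is absent, that is, once."
500 timesRepeat: [
	dict at: #bar ifAbsentPut: [ lazy := lazy + 1 ] ].

eager.   "500"
lazy.    "1"
```

With a fresh Dictionary on every iteration, as in the earlier measurement, the key is always absent, so both forms do the same work.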


Re: About Dictionary >> #at:ifAbsentPut:

Schwab,Wilhelm K
In reply to this post by Sven Van Caekenberghe
Execution time is not the only consideration.  Often one needs state from other objects at the time of creation of the new element, so the creation really needs to be deferred by default.  Having "everything" understand #value is probably harmless, and certainly better than doing away with the ability to use a block to create the new element.

Bill




Re: About Dictionary >> #at:ifAbsentPut:

Henrik Sperre Johansen
In reply to this post by Mariano Martinez Peck
On 04.10.2011 12:52, Mariano Martinez Peck wrote:
> Hi guys. If I tell you the selector is Dictionary >> #at:ifAbsentPut:  what would you expect the second parameter to be?  the value.
> So, one would do:
> Dictionary new at: #foo ifAbsentPut: 4
>
> But if you see Dictionary >>
>
> at: key ifAbsentPut: aBlock
>     "Return the value at the given key.
>     If key is not included in the receiver store the result
>     of evaluating aBlock as new value."
>
>     ^ self at: key ifAbsent: [self at: key put: aBlock value]
>
> so it expects a Block. Ok, we are in Smalltalk, so implementing #value is enough.
>
> Well..the previous example works, but only because we have an ugly Object >> value  that returns self.
> If I put instances of subclasses from ProtoObjects (proxies), that do not work anymore.
>
> So...my question is we do Dictionary at: #foo put: 4, why #at:ifAbsentPut:  expects a block and not directly the value?

Sven's reasons are correct, I think: it's for efficiency and for elegant lazy initialization of a cache.
The same reasons will never hold true when using #at:put:.

> in which case I need a block instead of the value object directly ?

In addition to when you want delayed computation: when you have value objects that redefine, or do not define, #value ;)

Cheers,
Henry
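
One concrete case of a value object that redefines #value is Association, whose #value answers the value half of the pair. A minimal sketch, assuming a Pharo or Squeak image; the keys #direct and #wrapped are arbitrary:

```smalltalk
| dict |
dict := Dictionary new.

"Association >> #value answers 1, so 1 is stored instead of the association."
dict at: #direct ifAbsentPut: (#a -> 1).

"Wrapping the association in a block stores the association itself."
dict at: #wrapped ifAbsentPut: [ #a -> 1 ].

dict at: #direct.    "1"
dict at: #wrapped.   "#a->1"
```

Passing a block is therefore the safe choice whenever the object to store may answer something other than itself to #value.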

Re: About Dictionary >> #at:ifAbsentPut:

Michael Roberts-2
I find the thread a bit confused, in the sense that

Dictionary *new* at: x ifAbsentPut: y

does not make sense (it is academic), because the new dictionary will
never have x as a key. So why profile it and use that as reasoning?

Whereas

myDict at: x ifAbsentPut: y

is more interesting.

In my experience (I am biased by the use of the message in the systems
that already use it heavily), it always feels that the use of
ifAbsentXX is intentionally deferring some code because it expresses
exceptional cases in the application logic. Especially if the block
creates an object, you don't want to be doing that every time and
discarding the result regardless of whether the key is present. It does
not have to be expensive to create; it is just not good style near any
kind of loop.

I know some people who actually don't like the ifAbsentPut: []
variant and prefer to be explicit, in the form

(myDict includesKey: x) ifFalse: [
   myDict at: x put: y]

where y is either a simple value or some expression.

Here the test on the key has the same effect as expecting a block in
the inline ifAbsentPut: []. It makes both forms basically the same,
with one being more compact if you prefer that. If ifAbsentPut: did
not expect a block, I think it would be a bit odd, because you
would have to unroll the code in order to achieve the deferred case,
which I feel is more common.

cheers,
Mike


Re: About Dictionary >> #at:ifAbsentPut:

Nicolas Cellier
2011/10/5 Michael Roberts <[hidden email]>:

> I find the thread a bit confused in the sense
>
> Dictionary *new* at: x ifAbsentPut: y
>
> does not make sense (is academic), because the new dictionary will
> never have x as a key. So why profile it and use that as reasoning?...
>
> whereas
>
> myDict at:x ifAbsentPut: y
>
> is more interesting.
>

I guess in some cases, such as serializing an object
graph with as many nodes as branches (little or no sharing), the
case of absence dominates.
Knowing that Mariano is working on Fuel, I guess that could have
biased his judgement.

> In my experience i am biased by the use of the message in the systems
> that already use it heavily...however it always feels that the use of
> ifAbsentXX is intentionally deferring some code because it expresses
> exceptional cases in the application logic. Especially if the block
> makes an object you don't want to be doing that every time and
> discarding the result irrespective of the presence of the key. It does
> not have to be expensive to make, just not good style near any kind of
> loop.
>
> I know some people that actually don't like the ifAbsentPut: []
> variant. and prefer to be explicit, of the form
>
> (myDict includesKey: x) ifFalse: [
>   myDict at: x put: y]
>

Both forms are valid. But note that at:ifAbsentPut: will return the
existing value, or the newly put value.
So the exact equivalent is:

((myDict includesKey: x)
    ifTrue: [myDict at: x]
    ifFalse: [myDict at: x put: y])
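
A quick workspace check of that equivalence, as a sketch only; the key #answer and the values 42 and 0 are arbitrary choices:

```smalltalk
| myDict x viaTest viaMessage |
myDict := Dictionary new.
x := #answer.

"Explicit form: #at:put: answers the object that was put."
viaTest := (myDict includesKey: x)
	ifTrue: [ myDict at: x ]
	ifFalse: [ myDict at: x put: 42 ].

"Compact form: the key is now present, so the block is ignored."
viaMessage := myDict at: x ifAbsentPut: [ 0 ].

viaTest.      "42"
viaMessage.   "42"
```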

However, maybe Mariano is looking for tight inner-loop optimization
(we can only guess, because he didn't say; he should have).
In this case, using ifFalse: is good: because it is inlined by the
compiler, it avoids a BlockClosure creation.
But the above code costs two lookups in either branch, so it will be
pretty bad too; depending on hash evaluation cost and collision rate,
it is, in a majority of cases, worse than the block creation time.
I would suggest that Mariano implement, in the Dictionary class of his
choice, this single-lookup, block-free code:

at: key ifAbsentPutValue: aValue
        "Answer the value associated with the key.
        if key isn't found, associate it first with aValue."

        | index |
        ^((array at: (index := self scanFor: key))
                ifNil: [ array at: index put: (key -> aValue). aValue ]
                ifNotNil: [:value | value]

But please, please, Mariano, don't change #at:ifAbsentPut:.
As others said, conditional execution is useful both for speed and
because, in the object world, state can change.
        aBagOfKeys do: [:key | myDict at: key ifAbsentPut: (myStream next)].
        aBagOfKeys do: [:key | myDict at: key ifAbsentPut: [myStream next]].
These won't behave the same if aBagOfKeys has duplicates, will they?
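
To see the difference concretely, here is a sketch that uses an Array with a duplicate key in place of aBagOfKeys (an Array keeps the iteration order predictable) and two separate read streams; all names and values are illustrative.

```smalltalk
| keys eagerDict lazyDict eagerStream lazyStream |
keys := #(#a #a #b).
eagerStream := #(1 2 3) readStream.
lazyStream := #(1 2 3) readStream.
eagerDict := Dictionary new.
lazyDict := Dictionary new.

"Eager: #next is sent for every key, including the duplicate #a."
keys do: [ :key | eagerDict at: key ifAbsentPut: (eagerStream next) ].

"Lazy: #next is sent only when the key is absent."
keys do: [ :key | lazyDict at: key ifAbsentPut: [ lazyStream next ] ].

eagerDict at: #b.   "3, because the duplicate #a consumed 2 and discarded it"
lazyDict at: #b.    "2, because nothing was consumed for the duplicate"
```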

Last thing, #at:ifAbsentPut: is absolutely decoupled from
Object>>#value, it does rely on Object>>#value
So don't throw away #at:ifAbsentPut: because you don't like Object>>#value.
If you don't like it, don't use it; use a BlockClosure instead.

Anyway, thanks to Mariano for these questions; they were biased, but
not silly. No question is :)

Nicolas


Re: About Dictionary >> #at:ifAbsentPut:

Nicolas Cellier
2011/10/5 Nicolas Cellier <[hidden email]>:

> Last thing, #at:ifAbsentPut: is absolutely decoupled from
> Object>>#value, it does rely on Object>>#value

It does not rely on it, of course. Hmm, time to get some sleep.


Re: About Dictionary >> #at:ifAbsentPut:

Nicolas Cellier
In reply to this post by Nicolas Cellier
2011/10/5 Nicolas Cellier <[hidden email]>:

> at: key ifAbsentPutValue: aValue
>        "Answer the value associated with the key.
>        if key isn't found, associate it first with aValue."
>
>        | index |
>        ^((array at: (index := self scanFor: key))
>                ifNil: [ array at: index put: (key -> aValue). aValue ]
>                ifNotNil: [:value | value]
>

And please correct me: the above code is plain wrong :)
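
The post does not say what exactly is wrong. One reading: the quoted method is missing a closing parenthesis, its ifNotNil: branch answers the stored association rather than its value, and writing into array directly skips the size bookkeeping. A possible corrected sketch follows, assuming a Squeak/Pharo-style Dictionary that keeps associations in an array instance variable and inherits #scanFor: and #atNewIndex:put: from HashedCollection. These internals vary between versions, so treat it as an illustration, not as the author's fix.

```smalltalk
at: key ifAbsentPutValue: aValue
	"Answer the value associated with key.
	If key is absent, associate it with aValue first.
	Sketch only: relies on the private helpers named in the lead-in."

	| index |
	index := self scanFor: key.
	^ (array at: index)
		ifNil: [
			"atNewIndex:put: is assumed to store the element, update tally and grow if needed."
			self atNewIndex: index put: key -> aValue.
			aValue ]
		ifNotNil: [ :assoc | assoc value ]
```

The key is scanned for only once, and no block has to be created by the caller.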


Re: About Dictionary >> #at:ifAbsentPut:

Henrik Sperre Johansen
In reply to this post by Nicolas Cellier
On 05.10.2011 01:59, Nicolas Cellier wrote:

> Both codes are valid. But note that at:ifAbsentPut: will return the
> value, or the newly put value.
> So the exact equivalent is ((myDict includesKey: x)
>      ifTrue: [myDict at: x]
>      ifFalse: [myDict at: x put: y])
>
> However, maybe Mariano is searching for tight inner loop optimization
> (we can only guess, because he didn't tell, he should have).
> In this case, using ifFalse: is good: because it is inlined by
> Compiler, it avoids a BlockClosure creation.
> But above code will cost two lookup in both branch, so it will be
> pretty bad too, depending on hash evaluation cost and collision rate,
> in a majority of cases, worse than block creation time.
> I would suggest to Mariano to implement in the Dictionary class of his
> choice the single lookup, block free creation code:
>
> at: key ifAbsentPutValue: aValue
> "Answer the value associated with the key.
> if key isn't found, associate it first with aValue."
>
> | index |
> ^((array at: (index := self scanFor: key))
> ifNil: [ array at: index put: (key ->  aValue). aValue ]
> ifNotNil: [:value | value]
>
> But, please, please, Mariano, don't change #at:ifAbsentPut:
> As other said, conditional execution is useful both for speed and
> because in Object world, the state can change.
> aBagOfKeys do: [:key | myDict at: key ifAbsentPut: (myStream next)].
> aBagOfKeys do: [:key | myDict at: key ifAbsentPut: [myStream next]].
> won't behave the same if aBagOfKeys has duplicates, will they ?
>
> Last thing, #at:ifAbsentPut: is absolutely decoupled from
> Object>>#value, it does not rely on Object>>#value.
> So don't throw away #at:ifAbsentPut: because you don't like Object>>#value.
> If you don't like it, don't use it, use a BlockClosure instead.
>
> Anyway, thanks to Mariano for these questions, they were biased, but
> not silly, no question is :)
>
> Nicolas
>
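To make the duplicate-keys point above concrete, here is a small workspace sketch. The variable names and sample data are made up for illustration, and the eager form only works because Object>>#value answers self, so treat it as an assumption about your image rather than a guarantee:

```smalltalk
| keys eager lazy eagerStream lazyStream |
"Hypothetical sample data: #a appears twice."
keys := #(#a #b #a #c).
eagerStream := ReadStream on: #(1 2 3 4).
lazyStream := ReadStream on: #(1 2 3 4).
eager := Dictionary new.
lazy := Dictionary new.

"Eager: the argument is evaluated on every iteration, so the stream
 advances even when the key is already present."
keys do: [:key | eager at: key ifAbsentPut: eagerStream next].

"Lazy: the block is only evaluated when the key is absent."
keys do: [:key | lazy at: key ifAbsentPut: [lazyStream next]].

eager at: #c.   "4, because the element 3 was consumed and discarded"
lazy at: #c.    "3"
```

Both dictionaries agree on #a and #b; they differ on #c only because the second #a consumed a stream element in the eager version.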
You could make at:ifAbsentPut: do single lookup, at the cost of using
cull: (with the scanned for index) in at:ifAbsent: .
Sort of restricts what you can pass to at:ifAbsent: though, (not that I
found any not using a block in a quick scan of senders, nor did my image
crash)
and the runtime really didn't improve that much.

| dict |
dict := Dictionary new: 100000.
[1 to: 100000 do: [:i | dict at: i ifAbsentPut: i]] timeToRun

1 scan (new):   34 34 48 38 82 34 58 35 33
2 scans (old):  42 76 57 56 40 40 57 120 48

These are numbers for integer keys in a presized dictionary though, so the
collision rate might not be representative...
Let me check that with points instead, where the hash computation/lookup
cost is higher...

| dict pts |
dict := Dictionary new: 100000.
pts := (1 to: 100000) collect: [:i | i@i].
[1 to: pts size do: [:ix | dict at: (pts at: ix) ifAbsentPut: ix]] timeToRun

1 scan (new):   96 89 89 89 91 143 92 89 96
2 scans (old):  132 127 144 126 139 126 133 129 127 134

Still not worth it to restrict at:ifAbsent: imho.
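
For anyone wanting to repeat the measurement, here is a rough sketch of how several runs could be collected in one go. It assumes Time class>>#millisecondsToRun: is available in your image, passes a block instead of the bare value, and the run count of 9 is arbitrary; it is not the exact script used for the numbers above:

```smalltalk
| timings |
timings := (1 to: 9) collect: [:run |
    | dict |
    "Fresh presized dictionary per run, so every key starts out absent."
    dict := Dictionary new: 100000.
    Time millisecondsToRun: [
        1 to: 100000 do: [:i | dict at: i ifAbsentPut: [i]]]].
timings
```

Evaluating this once before and once after loading a changed #at:ifAbsentPut: gives two arrays of timings to compare.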

Cheers,
Henry


Re: About Dictionary >> #at:ifAbsentPut:

Henrik Sperre Johansen
On 05.10.2011 11:19, Henrik Sperre Johansen wrote:

> On 05.10.2011 01:59, Nicolas Cellier wrote:
>>
>> However, maybe Mariano is searching for tight inner loop optimization
>> (we can only guess, because he didn't tell, he should have).
>> In this case, using ifFalse: is good: because it is inlined by
>> Compiler, it avoids a BlockClosure creation.
>> But the above code will cost two lookups in both branches, so it will be
>> pretty bad too, depending on hash evaluation cost and collision rate,
>> in a majority of cases, worse than block creation time.
>>
> You could make at:ifAbsentPut: do single lookup, at the cost of using
> cull: (with the scanned for index) in at:ifAbsent: .
> Sort of restricts what you can pass to at:ifAbsent: though, (not that
> I found any not using a block in a quick scan of senders, nor did my
> image crash)
> and the runtime really didn't improve that much.
Here's the .cs for those interested, btw.

Cheers,
Henry

Attachment: atifabsentput.1.cs (1K)

Re: About Dictionary >> #at:ifAbsentPut:

Philippe Marschall-2
In reply to this post by Mariano Martinez Peck
On 10/04/2011 12:52 PM, Mariano Martinez Peck wrote:
> Hi guys. If I tell you the selector is Dictionary >> #at:ifAbsentPut:  what
> would you expect the second parameter to be?

A block of course, like #at:ifAbsent:

Cheers
Philippe



Re: About Dictionary >> #at:ifAbsentPut:

Camillo Bruni
In reply to this post by Henrik Sperre Johansen
Partly off topic:

use http://www.squeaksource.com/SMark.html for your benchmarks!
it's as easy as writing unit tests :)

best cami


On 2011-10-05, at 11:27, Henrik Sperre Johansen wrote:

> On 05.10.2011 11:19, Henrik Sperre Johansen wrote:
>> On 05.10.2011 01:59, Nicolas Cellier wrote:
>>>
>>> However, maybe Mariano is searching for tight inner loop optimization
>>> (we can only guess, because he didn't tell, he should have).
>>> In this case, using ifFalse: is good: because it is inlined by
>>> Compiler, it avoids a BlockClosure creation.
>>> But the above code will cost two lookups in both branches, so it will be
>>> pretty bad too, depending on hash evaluation cost and collision rate,
>>> in a majority of cases, worse than block creation time.
>>>
>> You could make at:ifAbsentPut: do single lookup, at the cost of using cull: (with the scanned for index) in at:ifAbsent: .
>> Sort of restricts what you can pass to at:ifAbsent: though, (not that I found any not using a block in a quick scan of senders, nor did my image crash)
>> and the runtime really didn't improve that much.
> Here's the .cs for those interested, btw.
>
> Cheers,
> Henry
> <atifabsentput.1.cs>