Smalltalk › Squeak › Squeak - Dev

Closure Compiler in 3.9 ?

_‹ Previous Topic Next Topic _›

Classic

List

Threaded

56 messages Options

123

Lukas Renggli

Re: Equality of Recursive Structures [Was: Closure Compiler in 3.9 ?]

> The queastion remains: why does the mere existence of such recursive
> structures throw Squeak into a tizzy?
> No one answered this. What does the doIt mechanism do before it executes my
> code?

This is because the compiler internally stores the bindings of the
Workspace in a dictionary with the binding-association as a key.
Unfortunately the association are calculating the #hash using the key
and the value, what causes an infinite recursion on your dictionary
before you code is even compiled.

The implementation of #hash in Association should not use the value to
calculate the key, this is clearly a bug.

Cheers,
Lukas

--
Lukas Renggli
http://www.lukas-renggli.ch

Lukas Renggli

Re: Equality of Recursive Structures [Was: Closure Compiler in 3.9 ?]

In reply to this post by Tony Garnock-Jones-2

> > But then, what to deal with such cases:
> >
> > aClass>>add: a with: b
> > ^ self add: a with: b+1
>
> This is a call in tail position, and it's a crying shame that the Squeak
> VM doesn't presently allow the associated degenerate stack frames to be
> elided.

There is a reason why this is not done in Smalltalk. Have a look at
the examples below:

aClass>>add: a with: b
b = 0 ifTrue: [ SomeGlobal := thisContext ].
^ self add: a with: b+1

aClass>>add: a with: b
self become: #zork.
^ self add: a with: b+1

aClass>>add: a with: b
self class removeSelector: #add:with:.
^ self add: a with: b+1

There is an infinite number of other examples.

Cheers,
Lukas

--
Lukas Renggli
http://www.lukas-renggli.ch

timrowledge

Re: Equality of Recursive Structures [Was: Closure Compiler in 3.9 ?]

In reply to this post by Lukas Renggli

On 11-May-07, at 11:40 AM, Lukas Renggli wrote:

>
> The implementation of #hash in Association should not use the value to
> calculate the key, this is clearly a bug.

Cue another long discussion about hashing.

Just what do you believe the hash value should be based on?

tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
Strange OpCodes: SDLI: Shift Disk Left Immediate

Klaus D. Witzel

Re: Equality of Recursive Structures [Was: Closure Compiler in 3.9 ?]

In reply to this post by Lukas Renggli

On Fri, 11 May 2007 20:40:57 +0200, Lukas Renggli wrote:

>> The queastion remains: why does the mere existence of such recursive
>> structures throw Squeak into a tizzy?
>> No one answered this. What does the doIt mechanism do before it
>> executes my
>> code?
>
> This is because the compiler internally stores the bindings of the
> Workspace in a dictionary with the binding-association as a key.
> Unfortunately the association are calculating the #hash using the key
> and the value, what causes an infinite recursion on your dictionary
> before you code is even compiled.

*after* the code was compiled *and* *executed* the first time ;-)

> The implementation of #hash in Association should not use the value to
> calculate the key, this is clearly a bug.

Yes, this slows down compiler's binding of variables, for example (Unicode
generalCategory size) => 917632.

/Klaus

> Cheers,
> Lukas
>

Lukas Renggli

Re: Equality of Recursive Structures [Was: Closure Compiler in 3.9 ?]

In reply to this post by Lukas Renggli

> There is a reason why this is not done in Smalltalk. Have a look at
> the examples below:
>
> aClass>>add: a with: b
> b = 0 ifTrue: [ SomeGlobal := thisContext ].
> ^ self add: a with: b+1
>
> aClass>>add: a with: b
> self become: #zork.
> ^ self add: a with: b+1
>
> aClass>>add: a with: b
> self class removeSelector: #add:with:.
> ^ self add: a with: b+1
>
> There is an infinite number of other examples.

One more example, that is even more striking and that doesn't use
nasty reflection tricks. Imagine that #add:with: was called from a
subclass implementation using super. Recycling the stack would clearly
change the behavior of the code:

A>>add: a with: b
^ self add: a with: b+1

B>>add: a with: b
^ super add: a + 1 with: b

Cheers,
Lukas

--
Lukas Renggli
http://www.lukas-renggli.ch

Lukas Renggli

Re: Equality of Recursive Structures [Was: Closure Compiler in 3.9 ?]

In reply to this post by timrowledge

> Cue another long discussion about hashing.
>
> Just what do you believe the hash value should be based on?

What about the key only?

I quickly had a look at some other Smalltalk implementations (VW,
Dolphin, Ambrai) they all use the key for hashing only.

Lukas

--
Lukas Renggli
http://www.lukas-renggli.ch

johnmci

Re: Equality of Recursive Structures [Was: Closure Compiler in 3.9 ?]

Well my squeak 2.5 image of Aug 6th 1999 has

LookupKey>>hash
^ key hash

So when and why did it change?

On May 11, 2007, at 12:10 PM, Lukas Renggli wrote:

>> Cue another long discussion about hashing.
>>
>> Just what do you believe the hash value should be based on?
>
> What about the key only?
>
> I quickly had a look at some other Smalltalk implementations (VW,
> Dolphin, Ambrai) they all use the key for hashing only.
>
> Lukas
>
> --
> Lukas Renggli
> http://www.lukas-renggli.ch
>

--
========================================================================
===
John M. McIntosh <[hidden email]>
Corporate Smalltalk Consulting Ltd. http://www.smalltalkconsulting.com
========================================================================
===

johnmci

Re: Equality of Recursive Structures [Was: Closure Compiler in 3.9 ?]

In reply to this post by Lukas Renggli

"Change Set: Associations-equal-reposted
Date: 13 January 2004
Author: Leandro Caniglia

md: V2: added Associations>>hash

Change Set: Associations-equal
Date: 20 November 2001
Author: Leandro Caniglia

Defines the method Association >> #= in such a way that the values,
other than the keys, get compared too. Previously this was inherited
from LookupKey which only compares the keys."

On May 11, 2007, at 12:10 PM, Lukas Renggli wrote:

>> Cue another long discussion about hashing.

--
========================================================================
===
John M. McIntosh <[hidden email]>
Corporate Smalltalk Consulting Ltd. http://www.smalltalkconsulting.com
========================================================================
===

Nicolas Cellier-3

Re: Equality of Recursive Structures [Was: Closure Compiler in 3.9 ?]

In reply to this post by Lukas Renggli

Lukas Renggli a écrit :

As a LookupKey, Association should behave like Lukas said.
Also, look at this funny bug introduced by Association>>#= :

| a1 a2 |
a1 := #a -> 1.
a2 := #a -> 2.
{a1 >= a2. a2 >= a1. a1 = a2}
"true, true, false"

I do not know how mathematicians do qualify this kind of relation...

But among more than 30 references to Association, and 100 sendes of #->
how to tell whether Association>>#= ever need to test the value ?
(this is of course why Associations>>#hash is defined )

Quiet sure there are some uses that do not consider Association like a
LookupKey but just as a C++ std::pair.

Nicolas

Jecel Assumpcao Jr

tail recursion (was: Equality of Recursive Structures)

In reply to this post by Tony Garnock-Jones-2

Tony Garnock-Jones wrote:
> sig wrote:
> > But then, what to deal with such cases:
> >
> > aClass>>add: a with: b
> > ^ self add: a with: b+1
>
> This is a call in tail position, and it's a crying shame that the Squeak
> VM doesn't presently allow the associated degenerate stack frames to be
> elided.

Lukas Renggli gave a series of examples where replacing the "tail send"
with a "goto" would give use the wrong semantics. That, however, is a
separate issue from avoiding the creation of stack frames.

Dan Ingalls said that VM implementors can cheat as long as they don't
get caught and if you ever get into the debugger you will certainly
notice the lack of some stack frames and will curse whoever had the
brilhant idea of eliminating them to get some extra speed. The Self guys
were very fanatical about not getting caught this way no matter what the
costs in terms of performance.

For tinySelf 1, however, the cost in performance would have been
extremely high since having tail send optimization made programs far
more parallel than they would be otherwise. Unfortunately this single
feature was responsible for practically all the hard to find bugs in the
implementation and so when I had to move on to another project they
became the reason why tinySelf 1 remained unfinished. Given this
experience I think it would have been better for me to first have made
the system fully functional without this optimization and then add it in
a second version. But I would have added it.

- -Jecel

J J-6

Re: Equality of Recursive Structures [Was: Closure Compiler in 3.9?]

In reply to this post by Lukas Renggli

But then how do you differentiate an Association that appears in multiple
different places with several Associations that simply have the same keys?

>From: "Lukas Renggli" <[hidden email]>
>Reply-To: The general-purpose Squeak developers
>list<[hidden email]>
>To: "The general-purpose Squeak developers
>list"<[hidden email]>
>Subject: Re: Equality of Recursive Structures [Was: Closure Compiler in
>3.9?]
>Date: Fri, 11 May 2007 21:10:46 +0200
>
>>Cue another long discussion about hashing.
>>
>>Just what do you believe the hash value should be based on?
>
>What about the key only?
>
>I quickly had a look at some other Smalltalk implementations (VW,
>Dolphin, Ambrai) they all use the key for hashing only.
>
>Lukas
>
>--
>Lukas Renggli
>http://www.lukas-renggli.ch
>

_________________________________________________________________
Catch suspicious messages before you open themwith Windows Live Hotmail.
http://imagine-windowslive.com/hotmail/?locale=en-us&ocid=TXT_TAGHM_migration_HM_mini_protection_0507

Igor Stasenko

Re: Equality of Recursive Structures [Was: Closure Compiler in 3.9 ?]

In reply to this post by Nicolas Cellier-3

On 11/05/07, nicolas cellier <[hidden email]> wrote:

> Lukas Renggli a écrit :
> >> Cue another long discussion about hashing.
> >>
> >> Just what do you believe the hash value should be based on?
> >
> > What about the key only?
> >
> > I quickly had a look at some other Smalltalk implementations (VW,
> > Dolphin, Ambrai) they all use the key for hashing only.
> >
> > Lukas
> >
>
> As a LookupKey, Association should behave like Lukas said.
> Also, look at this funny bug introduced by Association>>#= :
>
> | a1 a2 |
> a1 := #a -> 1.
> a2 := #a -> 2.
> {a1 >= a2. a2 >= a1. a1 = a2}
> "true, true, false"
>
> I do not know how mathematicians do qualify this kind of relation...
>
>
> But among more than 30 references to Association, and 100 sendes of #->
> how to tell whether Association>>#= ever need to test the value ?
> (this is of course why Associations>>#hash is defined )
>
> Quiet sure there are some uses that do not consider Association like a
> LookupKey but just as a C++ std::pair.
>
> Nicolas
>

Why then not add class Pair, and then Association as subclass.
We can leave a selector #->, to create Association and use #=> (or
other) to create Pair
The Pair #= will compare both left and right values , while
Assotiation only on first.
Same with #hash.

Igor Stasenko

Re: Equality of Recursive Structures [Was: Closure Compiler in 3.9?]

In reply to this post by J J-6

On 12/05/07, J J <[hidden email]> wrote:
> But then how do you differentiate an Association that appears in multiple
> different places with several Associations that simply have the same keys?
>
If you need this, you can always rely on identity.
a = b ifTrue: [ a == b ifTrue: [] ]
or
a value = b value

A value in Association must not take a part in comparison operations,
its just a placeholder for storing something under given key. If you
using a comparison in association based on both key and value, then
you losing a sence in having such class at all. Because then it can be
replaced with any other, like Point or Array [2].

>
> >From: "Lukas Renggli" <[hidden email]>
> >Reply-To: The general-purpose Squeak developers
> >list<[hidden email]>
> >To: "The general-purpose Squeak developers
> >list"<[hidden email]>
> >Subject: Re: Equality of Recursive Structures [Was: Closure Compiler in
> >3.9?]
> >Date: Fri, 11 May 2007 21:10:46 +0200
> >
> >>Cue another long discussion about hashing.
> >>
> >>Just what do you believe the hash value should be based on?
> >
> >What about the key only?
> >
> >I quickly had a look at some other Smalltalk implementations (VW,
> >Dolphin, Ambrai) they all use the key for hashing only.
> >
> >Lukas
> >
> >--
> >Lukas Renggli
> >http://www.lukas-renggli.ch
> >
>
> _________________________________________________________________
> Catch suspicious messages before you open them—with Windows Live Hotmail.
> http://imagine-windowslive.com/hotmail/?locale=en-us&ocid=TXT_TAGHM_migration_HM_mini_protection_0507
>
>
>

Igor Stasenko

Re: Equality of Recursive Structures [Was: Closure Compiler in 3.9?]

Also, just take a look what -> means in mathematics.
a -> b
is an implication and reads 'if a then b'!

So, if you using in comparison #= both a and b, its reads as 'a and b'
and of course its not an implication anymore.

Bergel, Alexandre

Re: Equality of Recursive Structures [Was: Closure Compiler in 3.9 ?]

In reply to this post by stephane ducasse

Hi Andrew,

Thanks for this instructive email.

Alexandre

On 10 May 2007, at 13:50, stephane ducasse wrote:

> thanks for the email. It was cool.
> I learned something today.
>
> Stef
>
> On 10 mai 07, at 08:46, Andrew P. Black wrote:
>
>> This started as a story about what I realized in the shower this
>> morning, and ended up as a story about what I don't know about
>> Squeak (again :-( )
>>
>> On 8 May 2007, at 17:12, Andreas Raab wrote:
>>
>>> A while ago Association>>= was changed to compare the values in
>>> addition to its keys which has bitten me many, many times since
>>> it was introduced. Most likely it is the culprit here, too.
>>
>> No, Andreas, this was a different problem but an interesting one.
>> This is the third time in my life that i has bitten me. It's
>> worth writing a pattern about: comparing recursive structures.
>>
>> Suppose that you have two potentially recursive data structures,
>> and you need to compare them — perhaps for equality, or perhaps
>> for dominance of one over the other. (I met this problem first
>> when figuring out the type conformance algorithm for Emerald in
>> about 1984.)
>>
>> To be concrete, suppose that the structures are dictionaries,
>> which contain simple data as keys (like symbols), and more complex
>> data as values — including, perhaps, other dictionaries. The
>> obvious thing is to walk both structures in parallel, checking
>> that they have the same keys, and that the values for each
>> occurrence of a key are also the same. It's possible that those
>> values are also dictionaries, in which case you execute the same =
>> code that you were in the middle of running ...
>>
>> This is not a problem, unless: the dictionary stored in the
>> dictionary is the very same dictionary that you started with. In
>> which case, the naive recursive search will go on for ever.
>>
>> I first met this in Squeak when I executed Smalltalk = Smalltalk.
>> Of course, Smalltalk, as the dictionary of all globals, includes
>> Smalltalk as a value, and the equality computation never
>> terminated. I filed a bug fix for that one, and it's in the image
>> now. My fix was to simply compare the two objects for identity
>> first, and only then do the recursive equality check. This, or
>> course, answers true ot Smalltalk = Smalltalk quite quickly.
>>
>> The situation is the same with my recursive block example.
>>
>> b := [(i:= i-1) > 0 ifTrue: [j := j+1 . b value.] ifFalse:[j]] .
>>
>> It's compiled as a method, which has (presumably, I haven't
>> looked) a literal that points to itself ... and the equality
>> comparision goes on forever. Why Squeak was doing an equality
>> comparision when all I asked for was to "DoIt" is another question.
>>
>> I'm deducing all this from Mathieu's fix. The identity test
>> isn't a complete fix, however. If you have two isomorphic
>> recursive structures that don't share any elements, the test will
>> still go on for ever.
>>
>> Here is a small example: two isomorphic one-element dictionaries.
>> Type this in a workspace (BUT DON"T TRY THIS IN AN IMAGE THAT YOU
>> WANT TO KEEP USING!)
>>
>> a := Dictionary new.
>> a at: #foo put: a.
>> b := Dictionary new.
>> b at: #foo put: b.
>>
>> I expected that if I then did a printIt on a==b, I would get an
>> immediate "false", but if I did a printit on a=b, the system would
>> go off into never-land.
>>
>> But actually, it's more interesting than that. The system goes
>> off into never-land almost regardless of what you do! Debug it a
>> == b will do it. So will a simple printIt of b or inspectIt of
>> b. A doIt of "self halt. a == b" goes into a loop /before/
>> executing the self halt. If you interrupt the computation, you
>> can look at b in the debugger, and even launch an inspector from
>> there: printStringLimtedTo: does its job, as it's supposed to.
>> The system is trying to evaluate hash on the cyclic dictionary;
>> hash is also broken for the same reason. Asking for a fullStack
>> in the debugger to see _why_ it's trying to compute hash puts the
>> system into another recursive loop!
>>
>> This explains why I saw the same behavior with my recursive block,
>> but not the root cause. Anyone have an explanation?
>>
>> By the way, the real solution to the recursive equality comparison
>> is to maintain a cache.
>> A pair of elements goes into the cache when you start to comapre
>> them for equality. If, in the process of doing the comparision,
>> you find that you are asking the same equaility question again,
>> then the second time around, you can assume that the answer is
>> true. This is OK, evne though you haven't finished doing the
>> comparison yet. If it turns out that there is indeed a
>> difference, the "outer" equality will eventually find it, and will
>> answer false. The same trick can be employed for computing the
>> hash. Assume that the hash of the structure that you are
>> computing the has over is some constant, say 0. Start computing
>> its hash, but if you find in the process that you need to compute
>> the has of the same structure again, then just assume that it is
>> 0. The "real" hash might be thought of as the fixed-point of this
>> process, but since hashes are just approximations anyway, once
>> around the loop is enough.
>>
>> The queation remains: why does the mere existence of such
>> recursive structures throw Squeak into a tizzy?
>>
>> Andrew
>>
>>
>>
>>
>
>
>

--
_,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:
Alexandre Bergel http://www.software-artist.eu
^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;.

Tony Garnock-Jones-2

Tail-calls (was Re: Equality of Recursive Structures [Was: Closure Compiler in 3.9 ?])

In reply to this post by Lukas Renggli

Hi Lukas,

The problem with the examples you've given is that none of them justify
not eliding dead stack frames by themselves.

The usual argument for retaining dead stack frames is to make debugging
easier. The usual arguments against are the safe-for-space property and
to allow the expression of advanced control structures that are
otherwise impossible to express directly.

Lukas Renggli wrote:
> aClass>>add: a with: b
> b = 0 ifTrue: [ SomeGlobal := thisContext ].
> ^ self add: a with: b+1

This is the example that comes closest to demonstrating why retaining
dead frames might be interesting: examination of thisContext can tell
you how many times you've tail-recursed.

> aClass>>add: a with: b
> self become: #zork.
> ^ self add: a with: b+1
>
> aClass>>add: a with: b
> self class removeSelector: #add:with:.
> ^ self add: a with: b+1

Neither of these are affected at all by removing the no-op frames,
unless I'm missing something.

Regards,
Tony

Tony Garnock-Jones-2

Re: Equality of Recursive Structures [Was: Closure Compiler in 3.9 ?]

In reply to this post by Lukas Renggli

Hi,

> One more example, that is even more striking and that doesn't use
> nasty reflection tricks. Imagine that #add:with: was called from a
> subclass implementation using super. Recycling the stack would clearly
> change the behavior of the code:

How? The stack frame, at the point the super send has just been made,
has exactly one role remaining: to return to its caller. Nothing can
happen between the stack frame becoming active again and its immediate
return to its calling frame.

Regards,
Tony

Tony Garnock-Jones-2

Re: tail recursion

In reply to this post by Tony Garnock-Jones-2

Hi Jecel,

To me it's not a question of optimisation, but a question of not being
able to express useful patterns of control flow directly. Instead, I am
forced to encode them, often using otherwise-avoidable mutable state.

Imagine a BASIC without GOSUB. You have to use arrays to represent a
stack if you need to implement a push-down automaton. It's a very
similar situation here: state machines have a natural implementation
using tail calls that is not available unless dead frames are elided.
The alternative is to use an instance variable or similar to hold the
current state, and to perform explicit trampolining.

Regards,
Tony

Jecel Assumpcao Jr wrote:

> Tony Garnock-Jones wrote:
>> sig wrote:
>>> But then, what to deal with such cases:
>>>
>>> aClass>>add: a with: b
>>> ^ self add: a with: b+1
>> This is a call in tail position, and it's a crying shame that the Squeak
>> VM doesn't presently allow the associated degenerate stack frames to be
>> elided.
>
> Lukas Renggli gave a series of examples where replacing the "tail send"
> with a "goto" would give use the wrong semantics. That, however, is a
> separate issue from avoiding the creation of stack frames.
>
> Dan Ingalls said that VM implementors can cheat as long as they don't
> get caught and if you ever get into the debugger you will certainly
> notice the lack of some stack frames and will curse whoever had the
> brilhant idea of eliminating them to get some extra speed. The Self guys
> were very fanatical about not getting caught this way no matter what the
> costs in terms of performance.
>
> For tinySelf 1, however, the cost in performance would have been
> extremely high since having tail send optimization made programs far
> more parallel than they would be otherwise. Unfortunately this single
> feature was responsible for practically all the hard to find bugs in the
> implementation and so when I had to move on to another project they
> became the reason why tinySelf 1 remained unfinished. Given this
> experience I think it would have been better for me to first have made
> the system fully functional without this optimization and then add it in
> a second version. But I would have added it.
>
> - -Jecel
>
>

Lukas Renggli

Re: Tail-calls (was Re: Equality of Recursive Structures [Was: Closure Compiler in 3.9 ?])

In reply to this post by Tony Garnock-Jones-2

> > aClass>>add: a with: b
> > self become: #zork.
> > ^ self add: a with: b+1
> >
> > aClass>>add: a with: b
> > self class removeSelector: #add:with:.
> > ^ self add: a with: b+1
>
> Neither of these are affected at all by removing the no-op frames,
> unless I'm missing something.

In both cases the receiver changes and so might the implementation of
#add:with:. Have a look at the other example I gave not using any of
the dirty tricks above, but just plain inheritance. The same problem
there ...

Cheers,
Lukas

--
Lukas Renggli
http://www.lukas-renggli.ch

Colin Putney

Re: Equality of Recursive Structures [Was: Closure Compiler in 3.9 ?]

In reply to this post by Tony Garnock-Jones-2

On May 12, 2007, at 7:38 AM, Tony Garnock-Jones wrote:

>> One more example, that is even more striking and that doesn't use
>> nasty reflection tricks. Imagine that #add:with: was called from a
>> subclass implementation using super. Recycling the stack would
>> clearly
>> change the behavior of the code:
>
> How? The stack frame, at the point the super send has just been made,
> has exactly one role remaining: to return to its caller. Nothing can
> happen between the stack frame becoming active again and its immediate
> return to its calling frame.

The basic problem that Lukas was pointing out is that tailcall
elimination bypasses message dispatch on the recursive message, which
can potentially violate the semantics of Smalltalk. In Lukas' example
above, the "tail call" actually resolves to a different method, so
it's not a tail call at all. Whether a method is tail-recursive
depends on how it was called in the first place, and what code
executes between when it's invoked and when it recurses.

Now, you might say that none of that makes it impossible to do tail-
call elimination, you just have to work hard to avoid getting caught.
The VM could perform the dispatch and only reuse the stack frame if
the message resolves to the same method. Ok, this is true. But tail-
call elimination is Smalltalk isn't the clear win that it is in, say,
Scheme.

Consider: It has to be done by the VM during execution, with a
(small) performance cost in the case where there is no tail
recursion. The tail call is no longer a simple goto, because the VM
still has to do the message dispatch. The VM also has to trap
references to thisContext to avoid getting caught that way. And
finally, Smalltalk does a lot less tail recursion than functional
languages, if only because our collections are based on Array rather
than Association.

Colin

123