> The queastion remains: why does the mere existence of such recursive
> structures throw Squeak into a tizzy? > No one answered this. What does the doIt mechanism do before it executes my > code? This is because the compiler internally stores the bindings of the Workspace in a dictionary with the binding-association as a key. Unfortunately the association are calculating the #hash using the key and the value, what causes an infinite recursion on your dictionary before you code is even compiled. The implementation of #hash in Association should not use the value to calculate the key, this is clearly a bug. Cheers, Lukas -- Lukas Renggli http://www.lukas-renggli.ch |
In reply to this post by Tony Garnock-Jones-2
> > But then, what to deal with such cases:
> > > > aClass>>add: a with: b > > ^ self add: a with: b+1 > > This is a call in tail position, and it's a crying shame that the Squeak > VM doesn't presently allow the associated degenerate stack frames to be > elided. There is a reason why this is not done in Smalltalk. Have a look at the examples below: aClass>>add: a with: b b = 0 ifTrue: [ SomeGlobal := thisContext ]. ^ self add: a with: b+1 aClass>>add: a with: b self become: #zork. ^ self add: a with: b+1 aClass>>add: a with: b self class removeSelector: #add:with:. ^ self add: a with: b+1 There is an infinite number of other examples. Cheers, Lukas -- Lukas Renggli http://www.lukas-renggli.ch |
In reply to this post by Lukas Renggli
On 11-May-07, at 11:40 AM, Lukas Renggli wrote: > > The implementation of #hash in Association should not use the value to > calculate the key, this is clearly a bug. Cue another long discussion about hashing. Just what do you believe the hash value should be based on? tim -- tim Rowledge; [hidden email]; http://www.rowledge.org/tim Strange OpCodes: SDLI: Shift Disk Left Immediate |
In reply to this post by Lukas Renggli
On Fri, 11 May 2007 20:40:57 +0200, Lukas Renggli wrote:
>> The queastion remains: why does the mere existence of such recursive >> structures throw Squeak into a tizzy? >> No one answered this. What does the doIt mechanism do before it >> executes my >> code? > > This is because the compiler internally stores the bindings of the > Workspace in a dictionary with the binding-association as a key. > Unfortunately the association are calculating the #hash using the key > and the value, what causes an infinite recursion on your dictionary > before you code is even compiled. *after* the code was compiled *and* *executed* the first time ;-) > The implementation of #hash in Association should not use the value to > calculate the key, this is clearly a bug. Yes, this slows down compiler's binding of variables, for example (Unicode generalCategory size) => 917632. /Klaus > Cheers, > Lukas > |
In reply to this post by Lukas Renggli
> There is a reason why this is not done in Smalltalk. Have a look at
> the examples below: > > aClass>>add: a with: b > b = 0 ifTrue: [ SomeGlobal := thisContext ]. > ^ self add: a with: b+1 > > aClass>>add: a with: b > self become: #zork. > ^ self add: a with: b+1 > > aClass>>add: a with: b > self class removeSelector: #add:with:. > ^ self add: a with: b+1 > > There is an infinite number of other examples. One more example, that is even more striking and that doesn't use nasty reflection tricks. Imagine that #add:with: was called from a subclass implementation using super. Recycling the stack would clearly change the behavior of the code: A>>add: a with: b ^ self add: a with: b+1 B>>add: a with: b ^ super add: a + 1 with: b Cheers, Lukas -- Lukas Renggli http://www.lukas-renggli.ch |
In reply to this post by timrowledge
> Cue another long discussion about hashing.
> > Just what do you believe the hash value should be based on? What about the key only? I quickly had a look at some other Smalltalk implementations (VW, Dolphin, Ambrai) they all use the key for hashing only. Lukas -- Lukas Renggli http://www.lukas-renggli.ch |
Well my squeak 2.5 image of Aug 6th 1999 has
LookupKey>>hash ^ key hash So when and why did it change? On May 11, 2007, at 12:10 PM, Lukas Renggli wrote: >> Cue another long discussion about hashing. >> >> Just what do you believe the hash value should be based on? > > What about the key only? > > I quickly had a look at some other Smalltalk implementations (VW, > Dolphin, Ambrai) they all use the key for hashing only. > > Lukas > > -- > Lukas Renggli > http://www.lukas-renggli.ch > -- ======================================================================== === John M. McIntosh <[hidden email]> Corporate Smalltalk Consulting Ltd. http://www.smalltalkconsulting.com ======================================================================== === |
In reply to this post by Lukas Renggli
"Change Set: Associations-equal-reposted
Date: 13 January 2004 Author: Leandro Caniglia md: V2: added Associations>>hash Change Set: Associations-equal Date: 20 November 2001 Author: Leandro Caniglia Defines the method Association >> #= in such a way that the values, other than the keys, get compared too. Previously this was inherited from LookupKey which only compares the keys." On May 11, 2007, at 12:10 PM, Lukas Renggli wrote: >> Cue another long discussion about hashing. -- ======================================================================== === John M. McIntosh <[hidden email]> Corporate Smalltalk Consulting Ltd. http://www.smalltalkconsulting.com ======================================================================== === |
In reply to this post by Lukas Renggli
Lukas Renggli a écrit :
>> Cue another long discussion about hashing. >> >> Just what do you believe the hash value should be based on? > > What about the key only? > > I quickly had a look at some other Smalltalk implementations (VW, > Dolphin, Ambrai) they all use the key for hashing only. > > Lukas > As a LookupKey, Association should behave like Lukas said. Also, look at this funny bug introduced by Association>>#= : | a1 a2 | a1 := #a -> 1. a2 := #a -> 2. {a1 >= a2. a2 >= a1. a1 = a2} "true, true, false" I do not know how mathematicians do qualify this kind of relation... But among more than 30 references to Association, and 100 sendes of #-> how to tell whether Association>>#= ever need to test the value ? (this is of course why Associations>>#hash is defined ) Quiet sure there are some uses that do not consider Association like a LookupKey but just as a C++ std::pair. Nicolas |
In reply to this post by Tony Garnock-Jones-2
Tony Garnock-Jones wrote:
> sig wrote: > > But then, what to deal with such cases: > > > > aClass>>add: a with: b > > ^ self add: a with: b+1 > > This is a call in tail position, and it's a crying shame that the Squeak > VM doesn't presently allow the associated degenerate stack frames to be > elided. Lukas Renggli gave a series of examples where replacing the "tail send" with a "goto" would give use the wrong semantics. That, however, is a separate issue from avoiding the creation of stack frames. Dan Ingalls said that VM implementors can cheat as long as they don't get caught and if you ever get into the debugger you will certainly notice the lack of some stack frames and will curse whoever had the brilhant idea of eliminating them to get some extra speed. The Self guys were very fanatical about not getting caught this way no matter what the costs in terms of performance. For tinySelf 1, however, the cost in performance would have been extremely high since having tail send optimization made programs far more parallel than they would be otherwise. Unfortunately this single feature was responsible for practically all the hard to find bugs in the implementation and so when I had to move on to another project they became the reason why tinySelf 1 remained unfinished. Given this experience I think it would have been better for me to first have made the system fully functional without this optimization and then add it in a second version. But I would have added it. - -Jecel |
In reply to this post by Lukas Renggli
But then how do you differentiate an Association that appears in multiple
different places with several Associations that simply have the same keys? >From: "Lukas Renggli" <[hidden email]> >Reply-To: The general-purpose Squeak developers >list<[hidden email]> >To: "The general-purpose Squeak developers >list"<[hidden email]> >Subject: Re: Equality of Recursive Structures [Was: Closure Compiler in >3.9?] >Date: Fri, 11 May 2007 21:10:46 +0200 > >>Cue another long discussion about hashing. >> >>Just what do you believe the hash value should be based on? > >What about the key only? > >I quickly had a look at some other Smalltalk implementations (VW, >Dolphin, Ambrai) they all use the key for hashing only. > >Lukas > >-- >Lukas Renggli >http://www.lukas-renggli.ch > _________________________________________________________________ Catch suspicious messages before you open themwith Windows Live Hotmail. http://imagine-windowslive.com/hotmail/?locale=en-us&ocid=TXT_TAGHM_migration_HM_mini_protection_0507 |
In reply to this post by Nicolas Cellier-3
On 11/05/07, nicolas cellier <[hidden email]> wrote:
> Lukas Renggli a écrit : > >> Cue another long discussion about hashing. > >> > >> Just what do you believe the hash value should be based on? > > > > What about the key only? > > > > I quickly had a look at some other Smalltalk implementations (VW, > > Dolphin, Ambrai) they all use the key for hashing only. > > > > Lukas > > > > As a LookupKey, Association should behave like Lukas said. > Also, look at this funny bug introduced by Association>>#= : > > | a1 a2 | > a1 := #a -> 1. > a2 := #a -> 2. > {a1 >= a2. a2 >= a1. a1 = a2} > "true, true, false" > > I do not know how mathematicians do qualify this kind of relation... > > > But among more than 30 references to Association, and 100 sendes of #-> > how to tell whether Association>>#= ever need to test the value ? > (this is of course why Associations>>#hash is defined ) > > Quiet sure there are some uses that do not consider Association like a > LookupKey but just as a C++ std::pair. > > Nicolas > We can leave a selector #->, to create Association and use #=> (or other) to create Pair The Pair #= will compare both left and right values , while Assotiation only on first. Same with #hash. |
In reply to this post by J J-6
On 12/05/07, J J <[hidden email]> wrote:
> But then how do you differentiate an Association that appears in multiple > different places with several Associations that simply have the same keys? > If you need this, you can always rely on identity. a = b ifTrue: [ a == b ifTrue: [] ] or a value = b value A value in Association must not take a part in comparison operations, its just a placeholder for storing something under given key. If you using a comparison in association based on both key and value, then you losing a sence in having such class at all. Because then it can be replaced with any other, like Point or Array [2]. > > >From: "Lukas Renggli" <[hidden email]> > >Reply-To: The general-purpose Squeak developers > >list<[hidden email]> > >To: "The general-purpose Squeak developers > >list"<[hidden email]> > >Subject: Re: Equality of Recursive Structures [Was: Closure Compiler in > >3.9?] > >Date: Fri, 11 May 2007 21:10:46 +0200 > > > >>Cue another long discussion about hashing. > >> > >>Just what do you believe the hash value should be based on? > > > >What about the key only? > > > >I quickly had a look at some other Smalltalk implementations (VW, > >Dolphin, Ambrai) they all use the key for hashing only. > > > >Lukas > > > >-- > >Lukas Renggli > >http://www.lukas-renggli.ch > > > > _________________________________________________________________ > Catch suspicious messages before you open them—with Windows Live Hotmail. > http://imagine-windowslive.com/hotmail/?locale=en-us&ocid=TXT_TAGHM_migration_HM_mini_protection_0507 > > > |
Also, just take a look what -> means in mathematics.
a -> b is an implication and reads 'if a then b'! So, if you using in comparison #= both a and b, its reads as 'a and b' and of course its not an implication anymore. |
In reply to this post by stephane ducasse
Hi Andrew,
Thanks for this instructive email. Alexandre On 10 May 2007, at 13:50, stephane ducasse wrote: > thanks for the email. It was cool. > I learned something today. > > Stef > > On 10 mai 07, at 08:46, Andrew P. Black wrote: > >> This started as a story about what I realized in the shower this >> morning, and ended up as a story about what I don't know about >> Squeak (again :-( ) >> >> On 8 May 2007, at 17:12, Andreas Raab wrote: >> >>> A while ago Association>>= was changed to compare the values in >>> addition to its keys which has bitten me many, many times since >>> it was introduced. Most likely it is the culprit here, too. >> >> No, Andreas, this was a different problem but an interesting one. >> This is the third time in my life that i has bitten me. It's >> worth writing a pattern about: comparing recursive structures. >> >> Suppose that you have two potentially recursive data structures, >> and you need to compare them — perhaps for equality, or perhaps >> for dominance of one over the other. (I met this problem first >> when figuring out the type conformance algorithm for Emerald in >> about 1984.) >> >> To be concrete, suppose that the structures are dictionaries, >> which contain simple data as keys (like symbols), and more complex >> data as values — including, perhaps, other dictionaries. The >> obvious thing is to walk both structures in parallel, checking >> that they have the same keys, and that the values for each >> occurrence of a key are also the same. It's possible that those >> values are also dictionaries, in which case you execute the same = >> code that you were in the middle of running ... >> >> This is not a problem, unless: the dictionary stored in the >> dictionary is the very same dictionary that you started with. In >> which case, the naive recursive search will go on for ever. >> >> I first met this in Squeak when I executed Smalltalk = Smalltalk. >> Of course, Smalltalk, as the dictionary of all globals, includes >> Smalltalk as a value, and the equality computation never >> terminated. I filed a bug fix for that one, and it's in the image >> now. My fix was to simply compare the two objects for identity >> first, and only then do the recursive equality check. This, or >> course, answers true ot Smalltalk = Smalltalk quite quickly. >> >> The situation is the same with my recursive block example. >> >> b := [(i:= i-1) > 0 ifTrue: [j := j+1 . b value.] ifFalse:[j]] . >> >> It's compiled as a method, which has (presumably, I haven't >> looked) a literal that points to itself ... and the equality >> comparision goes on forever. Why Squeak was doing an equality >> comparision when all I asked for was to "DoIt" is another question. >> >> I'm deducing all this from Mathieu's fix. The identity test >> isn't a complete fix, however. If you have two isomorphic >> recursive structures that don't share any elements, the test will >> still go on for ever. >> >> Here is a small example: two isomorphic one-element dictionaries. >> Type this in a workspace (BUT DON"T TRY THIS IN AN IMAGE THAT YOU >> WANT TO KEEP USING!) >> >> a := Dictionary new. >> a at: #foo put: a. >> b := Dictionary new. >> b at: #foo put: b. >> >> I expected that if I then did a printIt on a==b, I would get an >> immediate "false", but if I did a printit on a=b, the system would >> go off into never-land. >> >> But actually, it's more interesting than that. The system goes >> off into never-land almost regardless of what you do! Debug it a >> == b will do it. So will a simple printIt of b or inspectIt of >> b. A doIt of "self halt. a == b" goes into a loop /before/ >> executing the self halt. If you interrupt the computation, you >> can look at b in the debugger, and even launch an inspector from >> there: printStringLimtedTo: does its job, as it's supposed to. >> The system is trying to evaluate hash on the cyclic dictionary; >> hash is also broken for the same reason. Asking for a fullStack >> in the debugger to see _why_ it's trying to compute hash puts the >> system into another recursive loop! >> >> This explains why I saw the same behavior with my recursive block, >> but not the root cause. Anyone have an explanation? >> >> By the way, the real solution to the recursive equality comparison >> is to maintain a cache. >> A pair of elements goes into the cache when you start to comapre >> them for equality. If, in the process of doing the comparision, >> you find that you are asking the same equaility question again, >> then the second time around, you can assume that the answer is >> true. This is OK, evne though you haven't finished doing the >> comparison yet. If it turns out that there is indeed a >> difference, the "outer" equality will eventually find it, and will >> answer false. The same trick can be employed for computing the >> hash. Assume that the hash of the structure that you are >> computing the has over is some constant, say 0. Start computing >> its hash, but if you find in the process that you need to compute >> the has of the same structure again, then just assume that it is >> 0. The "real" hash might be thought of as the fixed-point of this >> process, but since hashes are just approximations anyway, once >> around the loop is enough. >> >> The queation remains: why does the mere existence of such >> recursive structures throw Squeak into a tizzy? >> >> Andrew >> >> >> >> > > > -- _,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;: Alexandre Bergel http://www.software-artist.eu ^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;. |
In reply to this post by Lukas Renggli
Hi Lukas,
The problem with the examples you've given is that none of them justify not eliding dead stack frames by themselves. The usual argument for retaining dead stack frames is to make debugging easier. The usual arguments against are the safe-for-space property and to allow the expression of advanced control structures that are otherwise impossible to express directly. Lukas Renggli wrote: > aClass>>add: a with: b > b = 0 ifTrue: [ SomeGlobal := thisContext ]. > ^ self add: a with: b+1 This is the example that comes closest to demonstrating why retaining dead frames might be interesting: examination of thisContext can tell you how many times you've tail-recursed. > aClass>>add: a with: b > self become: #zork. > ^ self add: a with: b+1 > > aClass>>add: a with: b > self class removeSelector: #add:with:. > ^ self add: a with: b+1 Neither of these are affected at all by removing the no-op frames, unless I'm missing something. Regards, Tony |
In reply to this post by Lukas Renggli
Hi,
> One more example, that is even more striking and that doesn't use > nasty reflection tricks. Imagine that #add:with: was called from a > subclass implementation using super. Recycling the stack would clearly > change the behavior of the code: How? The stack frame, at the point the super send has just been made, has exactly one role remaining: to return to its caller. Nothing can happen between the stack frame becoming active again and its immediate return to its calling frame. Regards, Tony |
In reply to this post by Tony Garnock-Jones-2
Hi Jecel,
To me it's not a question of optimisation, but a question of not being able to express useful patterns of control flow directly. Instead, I am forced to encode them, often using otherwise-avoidable mutable state. Imagine a BASIC without GOSUB. You have to use arrays to represent a stack if you need to implement a push-down automaton. It's a very similar situation here: state machines have a natural implementation using tail calls that is not available unless dead frames are elided. The alternative is to use an instance variable or similar to hold the current state, and to perform explicit trampolining. Regards, Tony Jecel Assumpcao Jr wrote: > Tony Garnock-Jones wrote: >> sig wrote: >>> But then, what to deal with such cases: >>> >>> aClass>>add: a with: b >>> ^ self add: a with: b+1 >> This is a call in tail position, and it's a crying shame that the Squeak >> VM doesn't presently allow the associated degenerate stack frames to be >> elided. > > Lukas Renggli gave a series of examples where replacing the "tail send" > with a "goto" would give use the wrong semantics. That, however, is a > separate issue from avoiding the creation of stack frames. > > Dan Ingalls said that VM implementors can cheat as long as they don't > get caught and if you ever get into the debugger you will certainly > notice the lack of some stack frames and will curse whoever had the > brilhant idea of eliminating them to get some extra speed. The Self guys > were very fanatical about not getting caught this way no matter what the > costs in terms of performance. > > For tinySelf 1, however, the cost in performance would have been > extremely high since having tail send optimization made programs far > more parallel than they would be otherwise. Unfortunately this single > feature was responsible for practically all the hard to find bugs in the > implementation and so when I had to move on to another project they > became the reason why tinySelf 1 remained unfinished. Given this > experience I think it would have been better for me to first have made > the system fully functional without this optimization and then add it in > a second version. But I would have added it. > > - -Jecel > > |
In reply to this post by Tony Garnock-Jones-2
> > aClass>>add: a with: b
> > self become: #zork. > > ^ self add: a with: b+1 > > > > aClass>>add: a with: b > > self class removeSelector: #add:with:. > > ^ self add: a with: b+1 > > Neither of these are affected at all by removing the no-op frames, > unless I'm missing something. In both cases the receiver changes and so might the implementation of #add:with:. Have a look at the other example I gave not using any of the dirty tricks above, but just plain inheritance. The same problem there ... Cheers, Lukas -- Lukas Renggli http://www.lukas-renggli.ch |
In reply to this post by Tony Garnock-Jones-2
On May 12, 2007, at 7:38 AM, Tony Garnock-Jones wrote: >> One more example, that is even more striking and that doesn't use >> nasty reflection tricks. Imagine that #add:with: was called from a >> subclass implementation using super. Recycling the stack would >> clearly >> change the behavior of the code: > > How? The stack frame, at the point the super send has just been made, > has exactly one role remaining: to return to its caller. Nothing can > happen between the stack frame becoming active again and its immediate > return to its calling frame. The basic problem that Lukas was pointing out is that tailcall elimination bypasses message dispatch on the recursive message, which can potentially violate the semantics of Smalltalk. In Lukas' example above, the "tail call" actually resolves to a different method, so it's not a tail call at all. Whether a method is tail-recursive depends on how it was called in the first place, and what code executes between when it's invoked and when it recurses. Now, you might say that none of that makes it impossible to do tail- call elimination, you just have to work hard to avoid getting caught. The VM could perform the dispatch and only reuse the stack frame if the message resolves to the same method. Ok, this is true. But tail- call elimination is Smalltalk isn't the clear win that it is in, say, Scheme. Consider: It has to be done by the VM during execution, with a (small) performance cost in the case where there is no tail recursion. The tail call is no longer a simple goto, because the VM still has to do the message dispatch. The VM also has to trap references to thisContext to avoid getting caught that way. And finally, Smalltalk does a lot less tail recursion than functional languages, if only because our collections are based on Array rather than Association. Colin |
Free forum by Nabble | Edit this page |