Smalltalk › Squeak › Squeak - Dev

Closure Compiler in 3.9 ?

_‹ Previous Topic Next Topic _›

Classic

List

Threaded

56 messages Options

123

Lex Spoon-3

Re: tail recursion

Tony Garnock-Jones <[hidden email]> writes:
> To me it's not a question of optimisation, but a question of not being
> able to express useful patterns of control flow directly. Instead, I am
> forced to encode them, often using otherwise-avoidable mutable state.

In Squeak, you should feel free to write recursive routines anyway.
Since the stack is heap-based, you will not run out of stack unless
you are already running out of memory with your manipulations, anyway.
It is not like in Java or C, where you have the ugly combination of
(a) fixed-size stacks and (b) no tail-call elimination.

You will see many recursive methods if you browse through the
Behavior classes.

Now, curiously, if your recursive method has an error, then the
debugger is not good at codensing long sequences of frames. This is
curious because of Jecel's comment about not getting caught. It might
actually be more helpful to the user, in most cases, to dump some
frames with tail-call elimination and thus end up displaying a more
concise debugging view!

-Lex

> > Dan Ingalls said that VM implementors can cheat as long as they don't
> > get caught and if you ever get into the debugger you will certainly
> > notice the lack of some stack frames and will curse whoever had the
> > brilhant idea of eliminating them to get some extra speed. The Self guys
> > were very fanatical about not getting caught this way no matter what the
> > costs in terms of performance.
> >
> > For tinySelf 1, however, the cost in performance would have been
> > extremely high since having tail send optimization made programs far
> > more parallel than they would be otherwise. Unfortunately this single
> > feature was responsible for practically all the hard to find bugs in the
> > implementation and so when I had to move on to another project they
> > became the reason why tinySelf 1 remained unfinished. Given this
> > experience I think it would have been better for me to first have made
> > the system fully functional without this optimization and then add it in
> > a second version. But I would have added it.
> >
> > - -Jecel
> >
> >

Lex Spoon-3

Re: Equality of Recursive Structures [Was: Closure Compiler in 3.9 ?]

In reply to this post by Colin Putney

Colin Putney <[hidden email]> writes:
> Now, you might say that none of that makes it impossible to do tail-
> call elimination, you just have to work hard to avoid getting caught.
> The VM could perform the dispatch and only reuse the stack frame if
> the message resolves to the same method. Ok, this is true. But tail-
> call elimination is Smalltalk isn't the clear win that it is in, say,
> Scheme.

A tail call does not have to be to the same method to be optimized!

In fact there is not even a special opcode needed. Whenever you
do [^ x message] then the send of "message" is a tail call. Since
message-passing is about all you can do in Smalltalk, most return
statements are in fact tail calls.

What the optimization can do is create the new stack frame as normal,
but set its return pointer to skip over the current frame. Then, the
calling frame can be returned to the free list before jumping to the
new method. (Assuming, that is, that no closure has captured the
frame....)

What cannot be done in Lukas' interesting examples is to reuse the
same stack frame object, thus compiling the method as a loop.
However, given a linear stack (unlike Squeak's heap-based stack), you
could still implement the call as a jump....

Lex

Philippe Marschall

Re: tail recursion

In reply to this post by Lex Spoon-3

13 May 2007 11:52:13 +0200, Lex Spoon <[hidden email]>:

> Tony Garnock-Jones <[hidden email]> writes:
> > To me it's not a question of optimisation, but a question of not being
> > able to express useful patterns of control flow directly. Instead, I am
> > forced to encode them, often using otherwise-avoidable mutable state.
>
> In Squeak, you should feel free to write recursive routines anyway.
> Since the stack is heap-based, you will not run out of stack unless
> you are already running out of memory with your manipulations, anyway.
> It is not like in Java or C, where you have the ugly combination of
> (a) fixed-size stacks and (b) no tail-call elimination.

I thought GCC and VC++ do tail-call elimination.

Cheers
Philippe

> You will see many recursive methods if you browse through the
> Behavior classes.
>
> Now, curiously, if your recursive method has an error, then the
> debugger is not good at codensing long sequences of frames. This is
> curious because of Jecel's comment about not getting caught. It might
> actually be more helpful to the user, in most cases, to dump some
> frames with tail-call elimination and thus end up displaying a more
> concise debugging view!
>
> -Lex
>
>
> > > Dan Ingalls said that VM implementors can cheat as long as they don't
> > > get caught and if you ever get into the debugger you will certainly
> > > notice the lack of some stack frames and will curse whoever had the
> > > brilhant idea of eliminating them to get some extra speed. The Self guys
> > > were very fanatical about not getting caught this way no matter what the
> > > costs in terms of performance.
> > >
> > > For tinySelf 1, however, the cost in performance would have been
> > > extremely high since having tail send optimization made programs far
> > > more parallel than they would be otherwise. Unfortunately this single
> > > feature was responsible for practically all the hard to find bugs in the
> > > implementation and so when I had to move on to another project they
> > > became the reason why tinySelf 1 remained unfinished. Given this
> > > experience I think it would have been better for me to first have made
> > > the system fully functional without this optimization and then add it in
> > > a second version. But I would have added it.
> > >
> > > - -Jecel
> > >
> > >
>
>
>

David T. Lewis

Re: Equality of Recursive Structures [Was: Closure Compiler in 3.9 ?]

In reply to this post by Lukas Renggli

On Fri, May 11, 2007 at 08:40:57PM +0200, Lukas Renggli wrote:
> The implementation of #hash in Association should not use the value to
> calculate the key, this is clearly a bug.

Just as a point of procedure, if this is a bug could you please enter
in on Mantis?

Thanks,

Dave

Tony Garnock-Jones-2

Re: Equality of Recursive Structures [Was: Closure Compiler in 3.9 ?]

In reply to this post by Colin Putney

Colin Putney wrote:
> Whether a method is tail-recursive depends on
> how it was called in the first place, and what code executes between
> when it's invoked and when it recurses.

I think we're seeing different aspects of the issue: I'm wanting to
concentrate very narrowly on the stack frames that are left at the point
a method does the following:

TailCaller >> doSomething
"... arbitrary stuff ..."
^ something someMessage.

Imagine execution has just entered SomeClass>>#someMessage. The stack
looks like

UltimateCaller >> entryPoint
TailCaller >> doSomething "hovering over a return-to-sender bytecode"
SomeClass >> someMessage

Since all the middle frame will *ever* do when control is returned to it
is in turn return the accumulator to its caller, it can *always* safely
be elided without affecting the object-level meaning of a program at all.

The point where the difference *is* detectable is when reflective access
such as debugging or accesses to thisContext appear. (And even here,
there are options for improving the reflective presentation of the call
stack without harming the important object-level safety properties
tail-call frame elision gives you - for instance, see the literature on
"continuation marks")

The improved safe-for-space behaviour enables the construction of a new
class of programs that are not directly representable without elision of
redundant stack frames.

> And finally, Smalltalk does a lot less
> tail recursion than functional languages, if only because our
> collections are based on Array rather than Association.

Another reason is perhaps that using tail-recursion in a natural way
gets you into trouble very quickly, so people quickly learn not to do it ;-)

Regards,
Tony
--
[][][] Tony Garnock-Jones | Mob: +44 (0)7905 974 211
[][] LShift Ltd | Tel: +44 (0)20 7729 7060
[] [] http://www.lshift.net/ | Email: [hidden email]

Tony Garnock-Jones-2

Re: Equality of Recursive Structures [Was: Closure Compiler in 3.9 ?]

In reply to this post by Lukas Renggli

Lukas Renggli wrote:
> [...] Recycling the stack would clearly
> change the behavior of the code:

Ah, I see where we're differing (thanks Lex for pointing it out) - I'm
not interested here in recycling frames, or other meta-representation
issues: I'm interested in not having them present in the thisContext
chain if they don't contribute observationally to my object-level
program. There are all sorts of ways of doing this efficiently, and I
guess my original post was a complaint that Squeak's VM doesn't offer
even an inefficient way of doing it :-)

(Actually that gets me thinking about awful hacks modifying the stack
frame reflectively without help from the VM at all... eww :-) )

Regards,
Tony

Tony Garnock-Jones-2

Re: tail recursion

In reply to this post by Lex Spoon-3

Lex Spoon wrote:
> In Squeak, you should feel free to write recursive routines anyway.

The kinds of routines I have in mind are e.g. state machines and
continuation-passing-style control structures.

MyStateMachine >> startState
self someCondition
ifTrue: [^ self nextState]
ifFalse: [^ self otherState].

MyStateMachine >> nextState
self otherCondition whileFalse: [^ self nextState].
^ self startState.

...

These *will* run out of stack given sufficiently large input, unless the
continuation is aggressively eta-reduced.

> Now, curiously, if your recursive method has an error, then the
> debugger is not good at codensing long sequences of frames. This is
> curious because of Jecel's comment about not getting caught. It might
> actually be more helpful to the user, in most cases, to dump some
> frames with tail-call elimination and thus end up displaying a more
> concise debugging view!

That's an interesting observation. The "continuation mark" ideas used to
support debuggers etc in the Scheme world seem to indicate that you can
have the best of both worlds: retention of enough meta-level context
(continuation marks) to be useful in a debugger, while not retaining so
much that it affects the object-level (as happens in languages that
don't eta-reduce their continuations).

Regards,
Tony

timrowledge

Re: Equality of Recursive Structures [Was: Closure Compiler in 3.9 ?]

In reply to this post by Tony Garnock-Jones-2

On 14-May-07, at 8:25 AM, Tony Garnock-Jones wrote:

>
> (Actually that gets me thinking about awful hacks modifying the stack
> frame reflectively without help from the VM at all... eww :-) )
Contexts are just objects; with a few (albeit painful and not always
obvious) limits you can do almost anything with them without any VM
help. Manipulating the sender ivar, messing with the stack contents,
whatever. Some things are even useful!

tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
Never trust a computer you can't lift.

Tony Garnock-Jones-2

Re: tail recursion

In reply to this post by Philippe Marschall

Philippe Marschall wrote:
> I thought GCC and VC++ do tail-call elimination.

In certain, very very constrained circumstances, they do, yes. It is
difficult to rely on it as a general safety property while programming,
though. There's been a lot of discussion of exactly how far GCC goes in
tail-call elimination in the context of the implementation of the
Chicken Scheme compiler. The summary is: not far enough to be a useful
target for a compiler like Chicken, because of the way C's automatic (==
stack) memory management works.

Regards,
Tony
--
[][][] Tony Garnock-Jones | Mob: +44 (0)7905 974 211
[][] LShift Ltd | Tel: +44 (0)20 7729 7060
[] [] http://www.lshift.net/ | Email: [hidden email]

Tony Garnock-Jones-2

Re: Equality of Recursive Structures [Was: Closure Compiler in 3.9 ?]

In reply to this post by timrowledge

tim Rowledge wrote:
> Contexts are just objects; with a few (albeit painful and not always
> obvious) limits you can do almost anything with them without any VM
> help. Manipulating the sender ivar, messing with the stack contents,
> whatever. Some things are even useful!

It's true. For instance, the way Seaside paints a continuation-flavoured
veneer atop the stack...

Regards,
Tony

Colin Putney

Re: Equality of Recursive Structures [Was: Closure Compiler in 3.9 ?]

In reply to this post by Tony Garnock-Jones-2

On May 14, 2007, at 8:19 AM, Tony Garnock-Jones wrote:

> UltimateCaller >> entryPoint
> TailCaller >> doSomething "hovering over a return-to-sender
> bytecode"
> SomeClass >> someMessage
>
> Since all the middle frame will *ever* do when control is returned
> to it
> is in turn return the accumulator to its caller, it can *always*
> safely
> be elided without affecting the object-level meaning of a program
> at all.
>

Ok, sure. So what are you proposing?

Lex suggested that the stack could be constructed such that the
invocation of #someMessage constructs an activation context whose
sender slot points directly to the context for #entryPoint. We'd then
have a tree rather than a stack, and barring a reference escaping via
thisContext, the context for #doSomething is now garbage. Returning
from #someMessage to #entryPoint will now take one bytecode instead
of two.

Is this what you mean by frames being elided?

> The improved safe-for-space behaviour enables the construction of a
> new
> class of programs that are not directly representable without
> elision of
> redundant stack frames.
>

This is where I get confused. Above you're talking about how frames
can be elided because (barring reflection) that elision can have no
effect on the semantics of the program. Now you're suggesting that
elision of frames *does* have an effect on semantics, because it
allows a new class of programs that wouldn't otherwise be possible to
express.

Are you talking about something other than tail-call-elimination?

Colin

Tony Garnock-Jones-2

Re: Equality of Recursive Structures [Was: Closure Compiler in 3.9 ?]

Hi Colin,

Colin Putney wrote:
> Lex suggested that the stack could be constructed such that the
> invocation of #someMessage constructs an activation context whose sender
> slot points directly to the context for #entryPoint. We'd then have a
> tree rather than a stack, and barring a reference escaping via
> thisContext, the context for #doSomething is now garbage. Returning from
> #someMessage to #entryPoint will now take one bytecode instead of two.

That'd be ideal. (I still see it as a stack - just not a dense
representation!) I've looked at the VM before to see if it would be
difficult to introduce tail-call frame elision at that level, and it
was. I'm sorry to say I haven't got a patch to submit :-(

I'd like the compiler to be able to choose which send variant to use:
there'd be a new bytecode "tailsend" that combined both a "send" and an
implicit following "return". That way, existing code continues to have
the same behaviour it's always had, but new code can be built using
proper tail calls.

This gives the first half of the story: reclaiming garbage that is
otherwise unreclaimable. The other half involves building a
continuation-mark-like system to beef up the reflection side of things
again.

> This is where I get confused. Above you're talking about how frames can
> be elided because (barring reflection) that elision can have no effect
> on the semantics of the program. Now you're suggesting that elision of
> frames *does* have an effect on semantics, because it allows a new class
> of programs that wouldn't otherwise be possible to express.

I've chosen my words poorly. I meant "... not possible to express
without triggering a bug in the GC."

As Lex pointed out, I'm free to use recursion in any way I like in
Squeak already. The lack of tail-call frame elision can be seen then as
a bug in Squeak's *garbage collector*: I can write valid code in such a
way that garbage frames are generated but never reclaimed.

This is one of those hairy cases where the metalevel environment
impinges on the objectlevel environment by signalling out-of-memory
conditions to the interpreted program.

In a "perfect computer" (infinite memory), tail-call frame elision makes
no difference at all to the observable properties of the program. Once
we simulate a perfect computer in an imperfect (finite) one, we rely on
the GC to reclaim unused space, and so if we allocate and retain memory
excessively, either stack frames via recursion or regular heap objects,
the problems of representing the ideal computer in a finite environment
start to encroach upon our semantics...

Another choice (a poor one, IMO) would be to not let the objectlevel
program know it was running in emulation. Then, on an out-of-memory
condition, execution would be paused and the runtime would have to reach
*outward*, rather than inward, to receive instructions on how to
proceed. From the objectlevel program's point of view, it's still
running on a perfect computer... this touches upon the distinction
between Turing machines and more holistic approaches to what a computer
might be. If you forbade a machine from finding out it was running in
emulation, you'd probably want to be very careful in how you structured
your primitive operations, too...

> Are you talking about something other than tail-call-elimination?

I've been trying to avoid the phrase "tail-call optimisation" in
particular, since it has a specific meaning in the compiler community
that's only loosely related to the sense I'm interested in.

Will Clinger's paper http://citeseer.ist.psu.edu/clinger98proper.html
has a rigorous definition of various kinds of stack-related space leaks
and explores a lot of the detail.

Regards,
Tony

Bryce Kampjes

Re: Equality of Recursive Structures [Was: Closure Compiler in 3.9 ?]

Tony Garnock-Jones writes:
> Hi Colin,
>
> Colin Putney wrote:
> > Lex suggested that the stack could be constructed such that the
> > invocation of #someMessage constructs an activation context whose sender
> > slot points directly to the context for #entryPoint. We'd then have a
> > tree rather than a stack, and barring a reference escaping via
> > thisContext, the context for #doSomething is now garbage. Returning from
> > #someMessage to #entryPoint will now take one bytecode instead of two.
>
> That'd be ideal. (I still see it as a stack - just not a dense
> representation!) I've looked at the VM before to see if it would be
> difficult to introduce tail-call frame elision at that level, and it
> was. I'm sorry to say I haven't got a patch to submit :-(

commonSend and internalActivateNewMethod would be worth looking
at if you're trying to create a tail send bytecode.

Be careful about context recycling, you don't want to accidentally
allow some other context to be recycled because the tail call has
removed a context from the stack.

Bryce

Bergel, Alexandre

Re: tail recursion (was: Equality of Recursive Structures)

In reply to this post by Tony Garnock-Jones-2

> Dan Ingalls said that VM implementors can cheat as long as they don't
> get caught and if you ever get into the debugger you will certainly
> notice the lack of some stack frames and will curse whoever had the
> brilhant idea of eliminating them to get some extra speed.

I would consider myself as having experience with Scheme. Whereas not
creating a stack frame for recursive call reduce memory consumption,
it makes application debugging a much harder task... As far as I have
seen, not creating stack frame causes more problems than it solves...

Alexandre

--
_,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:
Alexandre Bergel http://www.software-artist.eu
^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;.

Lex Spoon-3

Re: tail recursion

In reply to this post by Lex Spoon-3

Lex Spoon <[hidden email]> writes:
> Tony Garnock-Jones <[hidden email]> writes:
> > To me it's not a question of optimisation, but a question of not being
> > able to express useful patterns of control flow directly. Instead, I am
> > forced to encode them, often using otherwise-avoidable mutable state.
>
> In Squeak, you should feel free to write recursive routines anyway.

On second thought, let me ammend this. You should mostly feel free to
write tail-recursive functions. However, you must consider that you
will use up lots of heap.

You will not blow stacks, though. And if you are recursing through a
data structure, then the heap usage will be no larger than the size of
the data structure.

On the other hand, don't write an infinite loop in tail-recursive
style in Squeak! For example, this idiom works in Scheme but not
in Squeak:

eventLoop
self processEvent.
^self eventLoop

-Lex

Lex Spoon-3

Re: tail recursion

In reply to this post by Tony Garnock-Jones-2

Tony Garnock-Jones <[hidden email]> writes:

> Lex Spoon wrote:
> > In Squeak, you should feel free to write recursive routines anyway.
>
> The kinds of routines I have in mind are e.g. state machines and
> continuation-passing-style control structures.
>
> MyStateMachine >> startState
> self someCondition
> ifTrue: [^ self nextState]
> ifFalse: [^ self otherState].
>
> MyStateMachine >> nextState
> self otherCondition whileFalse: [^ self nextState].
> ^ self startState.
>
> ...
>
> These *will* run out of stack given sufficiently large input, unless the
> continuation is aggressively eta-reduced.

True, I spoke too broadly. This won't work in Squeak, the way
it is written.

> > Now, curiously, if your recursive method has an error, then the
> > debugger is not good at codensing long sequences of frames. This is
> > curious because of Jecel's comment about not getting caught. It might
> > actually be more helpful to the user, in most cases, to dump some
> > frames with tail-call elimination and thus end up displaying a more
> > concise debugging view!
>
> That's an interesting observation. The "continuation mark" ideas used to
> support debuggers etc in the Scheme world seem to indicate that you can
> have the best of both worlds: retention of enough meta-level context
> (continuation marks) to be useful in a debugger, while not retaining so
> much that it affects the object-level (as happens in languages that
> don't eta-reduce their continuations).

That sounds heavenly. It would be great if Squeak worked like that.

-Lex

123