What does it take to implement tail-call optimization?

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

What does it take to implement tail-call optimization?

Stefan Marr-4
Hello Pharo:

What would it take to implement tail-call optimization?

My goal is to have an Erlang-like framework, which allows to port existing Erlang code as straight forward as possible.
But this requires to have tail-call optimization, since Erlang provides no loops.

However, I am not going for implementing the whole language and all its semantics, but only the basics I am interested in.

So, a possible solution would also be a pragma/hint to the compiler to apply the optimization.

Has anyone done something like this before, or has some hints where to start digging?

Thanks
Stefan


--
Stefan Marr
Software Languages Lab
Vrije Universiteit Brussel
Pleinlaan 2 / B-1050 Brussels / Belgium
http://soft.vub.ac.be/~smarr
Phone: +32 2 629 2974
Fax:   +32 2 629 3525


_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: What does it take to implement tail-call optimization?

Stéphane Ducasse
Hi  Stefan

I always wanted to understand what a tail recursive execution for smalltalk (I know for the other)
would bring to a smalltalk system. We got a lot of discussion during a dagsthul seminar with schemers and others.

I would really like to know what would be gain and the cost (implementation wise to do that).

Stef

On Jul 4, 2010, at 10:26 AM, Stefan Marr wrote:

> Hello Pharo:
>
> What would it take to implement tail-call optimization?
>
> My goal is to have an Erlang-like framework, which allows to port existing Erlang code as straight forward as possible.
> But this requires to have tail-call optimization, since Erlang provides no loops.
>
> However, I am not going for implementing the whole language and all its semantics, but only the basics I am interested in.
>
> So, a possible solution would also be a pragma/hint to the compiler to apply the optimization.
>
> Has anyone done something like this before, or has some hints where to start digging?
>
> Thanks
> Stefan
>
>
> --
> Stefan Marr
> Software Languages Lab
> Vrije Universiteit Brussel
> Pleinlaan 2 / B-1050 Brussels / Belgium
> http://soft.vub.ac.be/~smarr
> Phone: +32 2 629 2974
> Fax:   +32 2 629 3525
>
>
> _______________________________________________
> Pharo-project mailing list
> [hidden email]
> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project


_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: What does it take to implement tail-call optimization?

Lukas Renggli
In reply to this post by Stefan Marr-4
> What would it take to implement tail-call optimization?

Implementing tail recursion is easily doable and could be
transparently done at the compiler level. The basic idea (at the
language level) is described here:

    http://stackoverflow.com/questions/2500483?tab=oldest#tab-top

However, tail call optimization is not a good match for Smalltalk
code. You arbitrarily loose stack frames that you might want for
debugging (and other kinds of reflection). Also the code in the above
article kills the performance of modern VMs, because it manipulates
the runtime stack (if you do the jump using bytecodes that wouldn't be
a problem though). Furthermore you need to take a lot of care to keep
your code correct, because these kind of optimization can easily break
with polymorphism and reflective code.

> I would really like to know what would be gain and the cost
> (implementation wise to do that).

I did some analysis in a Squeak 3.9 image a long time ago and I came
to the following conclusion: "In my Squeak image there are 160464 send
locations of which 8884 are possible tail call locations (5.5%). I
guess this does not really make it worth the trouble."

Lukas

--
Lukas Renggli
www.lukas-renggli.ch

_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: What does it take to implement tail-call optimization?

Stéphane Ducasse

On Jul 4, 2010, at 11:15 AM, Lukas Renggli wrote:

>> What would it take to implement tail-call optimization?
>
> Implementing tail recursion is easily doable and could be
> transparently done at the compiler level. The basic idea (at the
> language level) is described here:
>
>    http://stackoverflow.com/questions/2500483?tab=oldest#tab-top
>
> However, tail call optimization is not a good match for Smalltalk
> code. You arbitrarily loose stack frames that you might want for
> debugging (and other kinds of reflection). Also the code in the above
> article kills the performance of modern VMs, because it manipulates
> the runtime stack (if you do the jump using bytecodes that wouldn't be
> a problem though). Furthermore you need to take a lot of care to keep
> your code correct, because these kind of optimization can easily break
> with polymorphism and reflective code.

Excellent this is now what was the conclusion of the discussion :)

>
>> I would really like to know what would be gain and the cost
>> (implementation wise to do that).
>
> I did some analysis in a Squeak 3.9 image a long time ago and I came
> to the following conclusion: "In my Squeak image there are 160464 send
> locations of which 8884 are possible tail call locations (5.5%). I
> guess this does not really make it worth the trouble."
>
> Lukas
>
> --
> Lukas Renggli
> www.lukas-renggli.ch
>
> _______________________________________________
> Pharo-project mailing list
> [hidden email]
> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project


_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: What does it take to implement tail-call optimization?

Eliot Miranda-2
In reply to this post by Stefan Marr-4


On Sun, Jul 4, 2010 at 1:26 AM, Stefan Marr <[hidden email]> wrote:
Hello Pharo:

What would it take to implement tail-call optimization?

My goal is to have an Erlang-like framework, which allows to port existing Erlang code as straight forward as possible.
But this requires to have tail-call optimization, since Erlang provides no loops.

But the system will run for a long time before it runs out of stack.  Don't get hung up on this.  Continue with the rest of the problem.

However, I am not going for implementing the whole language and all its semantics, but only the basics I am interested in.

So, a possible solution would also be a pragma/hint to the compiler to apply the optimization.

When you come to deal with tail calls you may find that the obvious case, namely spotting recursive sends to self, is adequate, in which case the compiler implementation is relatively trivial; one maps a tail-recursive send to self to a set of assignments to the arguments (*) followed by any necessary pops to level the stack, followed by a jump to the start.

(* Assignment to arguments legal in the bytecode; the compiler prevents explicit assignment to arguments but args and temps are the same  at the bytecode level).

Has anyone done something like this before, or has some hints where to start digging?

Various people, myself included, have done context-based hacks for fun, but they're merely parlour tricks; they have abysmal performance (certainly for loops, which is where we came in), no safety and don't integrate with the debugger.  You'll want to look at Scheme and Java implementations.

There's an excellent discussion in Gilad Bracha blog which has some good starting points.  If Newspeak gets ported to Cog (or rather vice verse, if Cog is extended to run Newspeak) and Gilad doesn't dismount from the TCO hobby horse then I expect TCO will be implemented in Cog in a more complete manner.  But as Gilad says, "one can cheat for a very long time though.".

best
Eliot


Thanks
Stefan


--
Stefan Marr
Software Languages Lab
Vrije Universiteit Brussel
Pleinlaan 2 / B-1050 Brussels / Belgium
http://soft.vub.ac.be/~smarr
Phone: +32 2 629 2974
Fax:   +32 2 629 3525


_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project


_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: What does it take to implement tail-call optimization?

Eliot Miranda-2
In reply to this post by Stefan Marr-4
[oops; forgot the link I meant to post, so here's a 2nd attempt

On Sun, Jul 4, 2010 at 1:26 AM, Stefan Marr <[hidden email]> wrote:
Hello Pharo:

What would it take to implement tail-call optimization?

My goal is to have an Erlang-like framework, which allows to port existing Erlang code as straight forward as possible.
But this requires to have tail-call optimization, since Erlang provides no loops.


But the system will run for a long time before it runs out of stack.  Don't get hung up on this.  Continue with the rest of the problem.
 
However, I am not going for implementing the whole language and all its semantics, but only the basics I am interested in.

So, a possible solution would also be a pragma/hint to the compiler to apply the optimization.



When you come to deal with tail calls you may find that the obvious case, namely spotting recursive sends to self, is adequate, in which case the compiler implementation is relatively trivial; one maps a tail-recursive send to self to a set of assignments to the arguments (*) followed by any necessary pops to level the stack, followed by a jump to the start.

(* Assignment to arguments legal in the bytecode; the compiler prevents explicit assignment to arguments but args and temps are the same  at the bytecode level).
 
Has anyone done something like this before, or has some hints where to start digging?

 

Various people, myself included, have done context-based hacks for fun, but they're merely parlour tricks; they have abysmal performance (certainly for loops, which is where we came in), no safety and don't integrate with the debugger.  You'll want to look at Scheme and Java implementations.

There's an excellent discussion in Gilad Bracha's blog which has some good starting points.  If Newspeak gets ported to Cog (or rather vice verse, if Cog is extended to run Newspeak) and Gilad doesn't dismount from the TCO hobby horse then I expect TCO will be implemented in Cog in a more complete manner.  But as Gilad says, "one can cheat for a very long time though.".

best
Eliot
Thanks
Stefan


--
Stefan Marr
Software Languages Lab
Vrije Universiteit Brussel
Pleinlaan 2 / B-1050 Brussels / Belgium
http://soft.vub.ac.be/~smarr
Phone: +32 2 629 2974
Fax:   +32 2 629 3525


_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project


_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project