Compiling with Continuations and LLVM


Compiling with Continuations and LLVM

Ben Coman
 
Could a Smalltalk VM be defined in "Continuation Passing Style", or is that still more the realm of functional languages?

I vaguely remember that one of the arguments against using LLVM was that its IR is in Static Single Assignment (SSA) form, which isn't well suited to continuations and non-local returns.  I bumped into the following article...

"Compiling with Continuations and LLVM"

...and wondered whether it addresses any of those concerns, and wanted to refresh myself on the other impediments to making use of the code-optimisation parts of LLVM.  A few extracts...

While LLVM provides a relatively easy path to high-quality native code, its design is based on a traditional runtime model which is not well suited to alternative compilation strategies used in high-level language compilers, such as the use of heap-allocated continuation closures. This paper describes a new LLVM-based backend that supports heap-allocated continuation closures, which enables constant-time callcc and very-lightweight multithreading.  
...
The LLVM backend for the MLton SML compiler uses trampolining to avoid the issues with TCO [23]. As far as we know, no one has successfully implemented heap-allocated first-class continuations with LLVM. In the remainder of the paper, we describe a new calling convention that we have added to LLVM, how we handle the GC interface, and how we support capturing continuations to support preemption  
...
In preparation for generating native code that uses bump allocation in the heap, we insert heap-limit tests into the CFG representation. [...]  The limit tests introduce continuation captures that are not present in the original program, which can be used to implement preemptive scheduling
 ...
Most functional languages use tail recursion to express loops and, thus, require that tail calls do not grow the stack. In limited circumstances, LLVM can meet this requirement, but even when LLVM is able to apply TCO to a call, there is extra stack-management overhead on function entry and exit. This extra overhead is bad enough for implementations that rely on traditional stacks, but it is prohibitively expensive for implementations that use CPS and heap-allocated continuations.  [...] Our solution to this problem is to add a new calling convention, which we call Jump-With-Arguments (JWA), to LLVM. This calling convention has the property that it uses all of the available hardware registers and that no registers are preserved by either the caller or callee.  [...] we mark every function with the naked attribute, which tells the code generator to completely skip the emission of a function prologue and epilogue

[This sounds a bit like how our VM works...]
We use the technique developed for supporting asynchronous signals in SML/NJ [26], which limits preemption to safe points where all live values are known.   We store the heap limit pointer in memory, which means that we can set it to zero when we need to force a heap-limit check to fail.
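[If I understand that correctly, a rough C sketch of the shape of it (my own guess, not code from the paper; all names are made up) would be something like:]

    #include <stdio.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical mutator state.  The heap limit lives in memory precisely
       so that something outside the mutator can overwrite it. */
    typedef struct {
        uint8_t *alloc_ptr;    /* next free byte in the nursery                 */
        uint8_t *heap_limit;   /* end of nursery, or NULL to force a safe point */
        uint8_t *nursery_base;
        size_t   nursery_size;
    } VMState;

    /* Slow path: a real runtime would collect and/or switch green threads here.
       This stub just pretends a collection happened and re-arms the limit. */
    static void runtime_gc_or_preempt(VMState *vm, size_t need) {
        (void)need;
        vm->alloc_ptr  = vm->nursery_base;
        vm->heap_limit = vm->nursery_base + vm->nursery_size;
    }

    /* Bump allocation with an inlined heap-limit test.  The same branch serves
       two purposes: a genuine "nursery full" check and a preemption safe point,
       because at this program point all live values are known. */
    static void *vm_alloc(VMState *vm, size_t nbytes) {
        if (vm->alloc_ptr + nbytes > vm->heap_limit)
            runtime_gc_or_preempt(vm, nbytes);
        void *obj = vm->alloc_ptr;
        vm->alloc_ptr += nbytes;
        return obj;
    }

    /* Called asynchronously (e.g. from a timer signal handler): zeroing the
       in-memory limit makes the very next allocation take the slow path. */
    static void request_preemption(VMState *vm) {
        vm->heap_limit = NULL;
    }

    int main(void) {
        static uint8_t nursery[1024];
        VMState vm = { nursery, nursery + sizeof nursery, nursery, sizeof nursery };
        void *a = vm_alloc(&vm, 16);
        request_preemption(&vm);      /* next allocation will hit the slow path */
        void *b = vm_alloc(&vm, 16);
        printf("a=%p b=%p\n", a, b);
        return 0;
    }

[i.e. the scheduler never interrupts the mutator at an arbitrary instruction; it just clobbers the limit and waits for the next allocation check to fail.]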

Cheers, Ben



Re: Compiling with Continuations and LLVM

Ronie Salgado
 
I remember that some years ago, during a pair programming session, Clément modeled non-local returns and closures in terms of continuation passing in his SSA-based Sista optimizer. So, in my opinion, the argument against SSA form for functional languages is pretty weak, and from what I have experimented with in Sysmel, SSA form really facilitates writing a compiler front-end. However, the arguments against using LLVM itself are much stronger. LLVM is really designed to be a C and C++ compiler, and it also has several design issues and cases of apparent incidental complexity that I guess they have not solved because of backward compatibility, and because they have no need to solve them. A lot of this complexity (particularly around the implementation of exception handling) also comes from the limitations of the program loader and DLL machinery implemented by the operating system itself. The center of the issue is having a single global symbol table whose keys are just strings, when what is needed is more complex objects as keys, and more complex values than just addresses and relocations.
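Just to illustrate the encoding (this is a toy C sketch of the general idea, not Clément's actual Sista code): a block with a non-local return is simply given the enclosing method's return continuation as an extra argument, and the non-local return becomes an ordinary call to that continuation.

    #include <stdio.h>

    /* A continuation is just "where to go next, with what value" (hypothetical
       flat encoding; a real CPS backend would heap-allocate closures). */
    typedef void (*Cont)(int value, void *env);

    /* CPS rendering of a Smalltalk-like method
           firstGreaterThan: n in: array
               array do: [:x | x > n ifTrue: [^x]].
               ^ -1
       The block sees two continuations: its normal one (ordinary control flow
       here) and the method's return continuation `k`, used for the non-local
       return. */
    static void first_greater_than_cps(const int *array, int len, int n,
                                       Cont k, void *k_env) {
        for (int i = 0; i < len; i++) {
            if (array[i] > n) {
                k(array[i], k_env);   /* non-local return: jump to the method's caller */
                return;
            }
            /* "normal" block return: just continue the loop */
        }
        k(-1, k_env);                 /* ordinary method return, same continuation */
    }

    static void print_result(int value, void *env) {
        (void)env;
        printf("result: %d\n", value);
    }

    int main(void) {
        int data[] = { 1, 3, 7, 2 };
        first_greater_than_cps(data, 4, 4, print_result, NULL);  /* prints 7 */
        return 0;
    }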

With my recent experiments I am becoming ever more convinced that virtual machines are not actually needed: they are an emulation hack to work around the limitations of C-based compilers and operating systems, and to avoid having to write a proper compiler from a high-level language such as Smalltalk or Python directly to the CPU that is executing it. From this perspective, the only actual advantage that bytecode provides is much higher instruction density (and lower memory usage), at the expense of making many optimizations that could be done ahead of time (especially when static type information is available, or when accurate type inference can remove dynamic lookups and so facilitate inlining) impossible, because of the artificial low-startup-time constraint that comes with a JIT.

The funny part is that these operating system restrictions can also be worked around in userspace by using mmap() (in fact, on Linux dlopen/dlsym is not even a system call; it is just a glorified wrapper on top of mmap. On Windows, VirtualAlloc and its family can be used instead), but this introduces the problem that the result is not compatible with existing machine-code debuggers such as gdb. The security and sandboxing argument that is given against distributing machine code is really invalid in today's world. It has been shown that userspace sandboxing facilities have never worked in practice (Java applets on the web are now forbidden; web browsers are a favorite attack vector for compromising "sandboxed" systems). The sandboxing that actually works is the kind implemented by the distinction between userspace and kernel space: restricting system calls, permissions and namespaces (excluding CPU bugs, but if the CPU is buggy your security is already completely compromised). These are some of the facilities used by chroot and Docker.
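To make the mmap() point concrete, here is a minimal toy (Linux/x86-64 only, purely illustrative): map an anonymous page, copy machine code into it, flip it to executable, and call it. This is also exactly the kind of generated code that gdb knows nothing about.

    /* Linux/x86-64 only.  Build with: cc -o jit jit.c */
    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>

    int main(void) {
        /* x86-64 machine code for:  mov eax, 42 ; ret */
        static const unsigned char code[] = { 0xb8, 0x2a, 0x00, 0x00, 0x00, 0xc3 };

        /* Ask the kernel for an anonymous, writable page... */
        void *mem = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (mem == MAP_FAILED) { perror("mmap"); return 1; }

        memcpy(mem, code, sizeof code);

        /* ...then make it executable (W^X: never writable and executable at once). */
        if (mprotect(mem, 4096, PROT_READ | PROT_EXEC) != 0) {
            perror("mprotect"); return 1;
        }

        int (*fn)(void) = (int (*)(void))mem;
        printf("generated code returned %d\n", fn());   /* prints 42 */

        munmap(mem, 4096);
        return 0;
    }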



Re: Compiling with Continuations and LLVM

Boris Shingarov
In reply to this post by Ben Coman
 
Ben,

> Could Smalltalk VM be defined as a "Continuation Passing Style"
>
FWIW, back in the days when I worked at OTI, our VM *was* a CPS VM.

> or is that still more the realm of more functional languages?
>
How is Smalltalk any less functional than, say, Lisp?  OK, sure, from a
syntax perspective we have syntactic sugar for what CLOS expresses in
first-class language constructs.  Yes, sometimes this turns out to be
rather annoying.  For example, I find it infuriating that I need to
modify the parser to be able to have dynamically scoped variables (and
then, when I do, it crashes the tools because they try to autocomplete,
etc.).  If you are following "Pharo-ArchC", you may have seen what I mean.

> IR was Single Static Assignment that wasn't well suited for continuations
>
I think the world is much more beautiful than how those guys paint it. 
SSA and CPS are exactly the same thing (modulo isomorphism). The best
explanation of this, that I know of, was in Simon Peyton Jones' talk
"Compiling Without Continuations" at the ARM Summit in 2017.

-- Boris