SafeFFI concept

Ben Coman

SafeFFI concept

This idea is not fully formed. I've been nibbling away at composing this post for a month and thought I'd just send it out rather than let it drift on further. It's an idea that keeps resurfacing but I've not been in a position to follow it up, so I'm just sharing the rough outline.

One of the great features of programming at the Image level is protection from memory access violations. We get to continue from errors after debugging them. However, all bets are off when we use FFI. The bane of FFI is memory violations in the C callout. They are harder than usual to diagnose since we lose our usual debugging environment. It's hard to recover from a memory violation since the C callout has full access to the VM's heap, and thus everything is suspect.

So the idea is for FFI callouts to execute in a separate child process. That child process has no access to the VM's memory, so a memory violation in the C callout could not crash the VM.
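To make the isolation claim concrete, here is a minimal sketch using plain fork() on a POSIX system. The names run_isolated, crashing_callout and ok_callout are made up for illustration and are not part of any VM or FFI plugin. With fork() the child works on a copy-on-write copy of the parent's memory rather than having no view of it at all, but the effect asked for is the same: a fault kills only the child, and the parent learns which signal did it.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

typedef void (*callout_fn)(void);   /* hypothetical callout signature */

/* A deliberately broken "library function": writes through a null pointer. */
static void crashing_callout(void)
{
    volatile int *p = NULL;
    *p = 42;
}

/* A well-behaved callout, for comparison. */
static void ok_callout(void)
{
}

/* Run fn in a forked child.
   Returns 0 if fn returned normally, the number of the signal that
   killed the child otherwise, or -1 if fork/waitpid failed. */
static int run_isolated(callout_fn fn)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: make sure a fault terminates this process only. */
        signal(SIGSEGV, SIG_DFL);
        fn();
        _exit(0);
    }

    /* Parent: survives whatever happened in the child. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (WIFSIGNALED(status))
        return WTERMSIG(status);
    return 0;
}
```

Calling run_isolated(crashing_callout) should report a fault signal (SIGSEGV on Linux) while the caller keeps running. Forking per call is of course far more expensive than a direct callout, which is the performance question raised next.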

Obviously there will be some performance penalty, but the question is to what degree. There are two reasons to use an external library via FFI.

1. Speed

2. Functionality

Where it's more about functionality than speed (e.g. git, libusb, libsodium, pdfium), application developers newly programming against an unfamiliar C library may be willing to trade speed for safety. Perhaps it's used part-time, like the Assert VM during development, while production uses the standard higher-performance FFI.

The idea of executing FFI callouts in a child process arose while reading about Linux's clone() function, which lets the parent process allocate the memory for the child process's stack.

https://stackoverflow.com/questions/1083172/how-to-mmap-the-stack-for-the-clone-system-call-on-linux

https://nullprogram.com/blog/2015/05/15/
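For readers who have not used it, a rough Linux-only sketch of that clone() pattern follows. The parent mmap()s a region and hands its top to clone() as the child's stack. The function names and the 64 KiB stack size are arbitrary choices for the example. Because CLONE_VM is not passed, the child gets its own copy-on-write address space, which is the separation SafeFFI would want.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>

#define CHILD_STACK_SIZE (64 * 1024)   /* arbitrary size for the sketch */

/* Entry point of the child; its return value becomes the exit status. */
static int child_main(void *arg)
{
    return *(int *)arg + 1;
}

/* Start a child on a stack allocated by the parent and return the
   child's exit status, or -1 on failure. */
static int run_on_parent_allocated_stack(int value)
{
    void *stack = mmap(NULL, CHILD_STACK_SIZE, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS | MAP_STACK, -1, 0);
    if (stack == MAP_FAILED)
        return -1;

    /* Stacks grow downwards on the usual Linux targets, so clone()
       is given the highest address of the region. */
    char *stack_top = (char *)stack + CHILD_STACK_SIZE;

    /* SIGCHLD only: no CLONE_VM, so memory is copied, not shared. */
    pid_t pid = clone(child_main, stack_top, SIGCHLD, &value);

    int result = -1;
    if (pid > 0) {
        int status;
        if (waitpid(pid, &status, 0) == pid && WIFEXITED(status))
            result = WEXITSTATUS(status);
    }
    munmap(stack, CHILD_STACK_SIZE);
    return result;
}
```

Note that with a private mapping the parent cannot change the child's stack after clone() returns. For the parent to rearrange frames in a running child, as proposed below, the stack region would presumably have to be a shared mapping.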

The child-process might be a simple event loop waiting on a semaphore.

My understanding of the FFI callout mechanism is that a stack frame is constructed in the form expected by the function being invoked. With SafeFFI, while the "fficallout" semaphore is being waited on, the child stack is static, so maybe the VM parent process could arrange the stack in the child process such that sem_wait() returns not to line 005 but instead executes the required FFI callout function. The "fficallout" semaphore is signalled from the Image once the stack frame has been constructed.

001 main()
002 {  expose_child_function_addresses_to_parent_process();
003    while(true)
004    {  sem_wait(&fficallout);   // Smalltalk image reconstructs stack frame here
005       printf("Dummy statement. Never gets here");
006    }
007 }
008
009 demo_redirect()
010 {  printf("SafeFFI demo success");
011 }

So how feasible would something like that be?

cheers -ben

P.S. For initial simplicity of the presentation I've avoided discussing return values and callbacks.

Eliot Miranda-2

Re: SafeFFI concept

Hi Ben,

I think it's a fun idea (my Spur memory debugging scheme uses the clone idea too) but for the FFI it isn't useful. IMO so much state is associated with a specific process that only a fraction of library and system calls would work, and debugging those that didn't would be very strange. Just take a system call that opens a file for example. On return the file handle would be present only in the child. Any use of the file descriptor from the parent would fail. There are simpler alternatives:

a) modify the already installed low-level exception handlers in the VM to fail an FFI call, reporting exception location and code, when a low-level exception occurs during an FFI call.

b) allow write-protecting the Smalltalk heap during an FFI call

I like a). b) doesn't play nicely with the threaded FFI
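The file-descriptor objection earlier in this reply is easy to reproduce. In the sketch below (POSIX, with invented names), a forked child stands in for a callout that opens a file and sends the descriptor number back through a pipe. The parent then checks whether that number refers to anything in its own descriptor table.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if the descriptor number opened by the child is not an open
   descriptor in the parent, 0 if it happens to be valid there,
   and -1 if the experiment itself could not be set up. */
static int child_fd_is_invalid_in_parent(void)
{
    int pipefd[2];
    if (pipe(pipefd) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: the "callout" opens a file and reports the descriptor. */
        int fd = open("/dev/null", O_RDONLY);
        if (write(pipefd[1], &fd, sizeof fd) != (ssize_t)sizeof fd)
            _exit(1);
        _exit(0);
    }

    /* Parent: receive the number; no new descriptors are opened here. */
    int child_fd = -1;
    ssize_t n = read(pipefd[0], &child_fd, sizeof child_fd);
    waitpid(pid, NULL, 0);
    close(pipefd[0]);
    close(pipefd[1]);
    if (n != (ssize_t)sizeof child_fd || child_fd < 0)
        return -1;

    /* F_GETFD fails with EBADF when the number is not open in this process. */
    return (fcntl(child_fd, F_GETFD) == -1 && errno == EBADF) ? 1 : 0;
}
```

The number comes back intact, but it names an entry in the child's descriptor table only. Descriptors can be passed between processes over Unix domain sockets, but that needs explicit plumbing per resource type, which fits the point that only a fraction of calls would work unmodified.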

tblanchard

Re: SafeFFI concept

In reply to this post by Ben Coman

Problem with that is when you want to do something like integrate with Cocoa on a Mac or iOS. The thing you want to talk to is in your process already.

Eliot Miranda-2

Re: SafeFFI concept

On Sat, Mar 31, 2018 at 11:15 AM, Todd Blanchard <[hidden email]> wrote:

Problem with that is when you want to do something like integrate with Cocoa on a Mac or iOS. The thing you want to talk to is in your process already.

_,,,^..^,,,_

best, Eliot

Ben Coman

Re: SafeFFI concept

In reply to this post by Eliot Miranda-2

On 1 April 2018 at 02:15, Todd Blanchard <[hidden email]> wrote:

Problem with that is when you want to do something like integrate with Cocoa on a Mac or iOS. The thing you want to talk to is in your process already.

On 1 April 2018 at 02:10, Eliot Miranda <[hidden email]> wrote:

Hi Ben,

I think it's a fun idea (my Spur memory debugging scheme uses the clone idea too) but for the FFI it isn't useful. IMO so much state is associated with a specific process that only a fraction of library and system calls would work, and debugging those that didn't would be very strange. Just take a system call that opens a file for example. On return the file handle would be present only in the child. Any use of the file descriptor from the parent would fail. There are simpler alternatives:

a) modify the already installed low-level exception handlers in the VM to fail an FFI call, reporting exception location and code, when a low-level exception occurs during an FFI call.

b) allow write-protecting the Smalltalk heap during an FFI call

I like a). b) doesn't play nicely with the threaded FFI

Thanks for your consideration.

Helps me put the idea aside.

cheers -ben
