Loading FFI is broken


Eliot Miranda-2

Re: Loading FFI is broken

Hi All,

this is an important discussion that is taking a religious tone that we should strive to avoid. There are good arguments for plugins, namely security and encapsulation. There are good arguments for an FFI, namely extensibility and platform compatibility.

Plugins provide security because they allow the system to control any and all access to the underlying platform, permitting access only through plugins. With an FFI the underlying platform is exposed and one needs other mechanisms, for example Newspeak mirrors, to prevent untrusted code from accessing the platform with potentially disastrous effects (self shell: '/bin/rm -rf /*').

Plugins encapsulate all sorts of details behind a potentially simple primitive interface. This can avoid confusing the newcomer (but at the same time frustrate them by hiding details), provide portability, make it easier to determine the extent of work in moving to a new OS platform, and so on.

An FFI allows immediate extensibility. External functionality can be invoked immediately. With plugins a primitive interface must be designed and then implemented. With the FFI the API is already defined; it must "merely" be accessed. This immediacy can itself provide simplicity, especially where callbacks and threads are involved. Plugins can hide a lot of complexity (e.g. the SocketPlugin encapsulates platform threads that are waiting on blocking calls so that Squeak itself is provided with an interrupt-driven interface, necessitated by the Squeak platform's lack of native thread support).

An FFI allows all underlying functionality to be accessed. The plugin approach necessitates defining a lowest common denominator approach to functionality, especially irksome in some applications where setting the right flag, e.g. on a socket stream, can have a significant performance impact.

So there are good arguments either way. In a system oriented towards safe play plugins make excellent sense. In a platform oriented towards industrial development an FFI is a must-have, and a weak one will really hurt acceptance.

IMO Squeak needs to have both. It needs plugins to provide its hallmarks such as eToys. But to be a more general platform it needs an FFI. Managing this split personality will take work but I don't see any fundamental issues. Having a well-factored base into which packages can be loaded to create different personalities is key, and good work is being done here. There may be a half-way house where the FFI is strictly encapsulated, but this is hypothetical. I know how to solve threads, pinning, etc, but I don't know off the top of my head how to encapsulate the FFI, so I can't propose it as a solution.

A number of straw men have been raised against the FFI in this discussion. OK, that's unfair. A number of important questions have been asked of the FFI in this discussion.

Levente asks "Show me how you can replace the SocketPlugin with FFI, and I'll consider it. ;)".

The issue here is threads. The SocketPlugin encapsulates blocking calls, spawning hidden OS threads to make these calls and then signal semaphores when they complete. To solve this one needs both native thread support in the VM (and I have a prototype that needs Spur's facilities to make practicable) and pinning (the ability to stop certain objects moving). Spur provides pinning.
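The pattern described above (a hidden OS thread makes the blocking call, then signals a semaphore) can be sketched roughly as follows. This is an illustration in Python, not the SocketPlugin's code; `AsyncCall` and `slow_read` are hypothetical names invented for the example.

```python
import threading
import time


class AsyncCall:
    """Run a blocking function on a hidden worker thread and signal a
    semaphore when it finishes (hypothetical helper, not Squeak's API)."""

    def __init__(self, blocking_fn, *args):
        self.done = threading.Semaphore(0)  # signalled on completion
        self.result = None
        self.error = None
        self._thread = threading.Thread(
            target=self._run, args=(blocking_fn, args), daemon=True)
        self._thread.start()

    def _run(self, blocking_fn, args):
        try:
            self.result = blocking_fn(*args)
        except Exception as exc:
            self.error = exc
        finally:
            self.done.release()  # wake whoever is waiting

    def wait(self, timeout=None):
        """Block the caller until the worker signals, then return the result."""
        if not self.done.acquire(timeout=timeout):
            raise TimeoutError("blocking call still in progress")
        if self.error is not None:
            raise self.error
        return self.result


def slow_read():
    # Stand-in for a blocking system call such as recv().
    time.sleep(0.05)
    return b"payload"


call = AsyncCall(slow_read)
# The caller is free to do other work here while the worker blocks.
data = call.wait(timeout=5)
```

The caller only ever sees the semaphore, which is the sense in which the blocking call is hidden behind an interrupt-style interface.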

David says "I remember when somebody on the Pharo list suggested reimplementing the OSProcessPlugin in FFI. I told them it was a really great idea, and they should give it a try. That settled the matter quite quickly ;-)". Again they failed because of the lack of necessary underlying functionality from the VM. With threads, pinning and a way of expressing the array-of-pointers-to-strings idiom (a simple extension to marshalling and/or pinning, e.g. providing an address-of-first-field primitive), an FFI can do all the OSProcessPlugin can do, and significantly more simply.
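For readers unfamiliar with the array-of-pointers-to-strings idiom (the `char *argv[]` shape that process-launching calls expect), here is a small sketch using Python's `ctypes` as a stand-in FFI. `make_argv` is a hypothetical helper, not part of any Squeak or Pharo FFI.

```python
import ctypes


def make_argv(args):
    """Marshal a list of str into a NULL-terminated char*[] (argv-style)."""
    encoded = [a.encode("utf-8") for a in args]
    array_type = ctypes.c_char_p * (len(encoded) + 1)
    # The trailing None becomes the NULL pointer that terminates the array.
    return array_type(*encoded, None)


argv = make_argv(["ls", "-l", "/tmp"])
```

CPython does not move objects in memory, so the pinning that the message asks of Spur has no counterpart in this sketch; in a VM with a moving collector the string bodies would have to be pinned or copied before their addresses were handed to C.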

David also says "it is a complete mystery to me why people are willing to work so hard to avoid writing a VM plugin. VM plugins are reliable, portable, and debuggable. They work across a range of processors. They work on 64-bit platforms. So why would someone prefer to switch to a calling interface that basically only works on 32-bit Intel processors and that may require low level knowledge of calling conventions, word alignment, and platform-specific data types?"

This is a non-sequitur. The sentences beginning "So why would someone..." don't follow from the first sentences. Writing the plugin requires even more knowledge than writing the FFI interface because one needs to know the VM facilities for mating Squeak objects to plugins. Writing plugins /and/ writing interfaces above FFIs are hard. But in my experience a powerful FFI provides a faster and easier development experience. Both can be difficult to port, but plugins have the advantage that only the innards have to be ported while facing the C code face. My experience in that regard leaves me with a preference for FFIs. The lack of a 64-bit FFI is a bad weakness of the Squeak platform, something Spur again makes easy to rectify.

Bert asks "Suppose we add a new VM platform, like a VM running on JavaScript in the browser. Do you really want to re-implement all the C libraries utilized via FFI? Or rather a handful of primitives in your language of choice?". First it is not clear that one *can* implement these primitives taking either approach. If the platform, e.g. JavaScript in a browser, takes the Squeak plugin approach of preventing access to the platform except through a restricted set of facilities, then certain functionality will simply be off-limits, whether one has an FFI or not. Second, reimplementing all the C libraries isn't obligatory. If the platform provides an FFI one simply mates to its FFI and accesses the underlying libraries. If it doesn't then that functionality is off-limits, but that doesn't mean the rest of the system doesn't work. It also means that Squeak running in that context is no less useful than any other platform, because the underlying platform (just as Squeak does with plugins)

--
best,

Eliot

David T. Lewis

Re: Loading FFI is broken

On Tue, Nov 19, 2013 at 10:35:40AM -0800, Eliot Miranda wrote:
> Hi All,
>
> this is an important discussion that is taking a religious tone that we
> should strive to avoid.

+1 and thanks for the excellent overview

>
> IMO Squeak needs to have both.

+100

>
> David also says "it is a complete mystery to me why people are willing to work
> so hard to avoid writing a VM plugin...

I remain mystified, but I'm sorry if I took the discussion in an unhelpful direction.

Dave

Bert Freudenberg

FFI considered harmless (was: Loading FFI is broken)

In reply to this post by Eliot Miranda-2

On 19.11.2013, at 12:35, Eliot Miranda <[hidden email]> wrote:

> Hi All,
>
> this is an important discussion that is taking a religious tone that we should strive to avoid.

Let's call it philosophical, "religious" is just flame bait.

> There are good arguments for plugins, namely security and encapsulation. There are good arguments for an FFI, namely extensibility and platform compatibility.

Agreed.

> [... nice explanation snipped ...]

> A number of straw men have been raised against the FFI in this discussion.

No-one has been arguing against FFI in general. We agree an FFI is useful, and a more powerful FFI is better. We just (appear to) disagree on how widely it should be used.

> Bert asks "Suppose we add a new VM platform, like a VM running on JavaScript in the browser. Do you really want to re-implement all the C libraries utilized via FFI? Or rather a handful of primitives in your language of choice?". First it is not clear that one *can* implement these primitives taking either approach. If the platform, e.g. JavaScript in a browser, takes the Squeak plugin approach of preventing access to the platform except through a restricted set of facilities, then certain functionality will simply be off-limits, whether one has an FFI or not. Second, reimplementing all the C libraries isn't obligatory. If the platform provides an FFI one simply mates to its FFI and accesses the underlying libraries. If it doesn't then that functionality is off-limits, but that doesn't mean the rest of the system doesn't work.

That's where we disagree. If basic functions in the system depend on FFI, and FFI is not available, then the system *does not work* at all. E.g., there are efforts in other Squeak forks to replace fundamental parts of the system (which currently rely on VM primitives) with FFI calls. That's what I am wary of.

One of the fundamental services of a virtual machine is providing a safe and complete environment for the system to function in. Plugins enrich that environment. But FFI pokes holes into that safe environment, reaching out of the virtual world into the "real" world.

Indeed sometimes that is exactly what you need, namely to better interact with the specific host system you are running on. I'm simply saying that we need to clearly separate this from the base system, which should be as independent of the actual host platform as possible.

- Bert -

Eliot Miranda-2

Re: FFI considered harmless (was: Loading FFI is broken)

On Tue, Nov 19, 2013 at 12:00 PM, Bert Freudenberg <[hidden email]> wrote:

> On 19.11.2013, at 12:35, Eliot Miranda <[hidden email]> wrote:
>
>> Hi All,
>>
>> this is an important discussion that is taking a religious tone that we should strive to avoid.
>
> Let's call it philosophical, "religious" is just flame bait.
>
>> There are good arguments for plugins, namely security and encapsulation. There are good arguments for an FFI, namely extensibility and platform compatibility.
>
> Agreed.
>
>> [... nice explanation snipped ...]
>
>> A number of straw men have been raised against the FFI in this discussion.
>
> No-one has been arguing against FFI in general. We agree an FFI is useful, and a more powerful FFI is better. We just (appear to) disagree on how widely it should be used.
>
>> Bert asks "Suppose we add a new VM platform, like a VM running on JavaScript in the browser. Do you really want to re-implement all the C libraries utilized via FFI? Or rather a handful of primitives in your language of choice?". First it is not clear that one *can* implement these primitives taking either approach. If the platform, e.g. JavaScript in a browser, takes the Squeak plugin approach of preventing access to the platform except through a restricted set of facilities, then certain functionality will simply be off-limits, whether one has an FFI or not. Second, reimplementing all the C libraries isn't obligatory. If the platform provides an FFI one simply mates to its FFI and accesses the underlying libraries. If it doesn't then that functionality is off-limits, but that doesn't mean the rest of the system doesn't work.
>
> That's where we disagree. If basic functions in the system depend on FFI, and FFI is not available, then the system *does not work* at all. E.g., there are efforts in other Squeak forks to replace fundamental parts of the system (which currently rely on VM primitives) with FFI calls. That's what I am wary of.

I see your concern but it doesn't worry me. I don't see why the system can't be constructed so that it discovers what services are available. It already does that in a number of circumstances. For example, the menu bar includes a system report if class SystemReporter is loaded. So I can imagine that the socket layer would look for an FFI-based implementation and use it if available, falling back on the plugin interface if absent.
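A rough sketch of the discovery-with-fallback idea, in Python for brevity. All class names here (`FFISockets`, `PluginSockets`, `MissingSockets`) are hypothetical stand-ins, and the availability check is only a placeholder for whatever test a real image would perform.

```python
class PluginSockets:
    """Stand-in for the plugin-backed implementation (assumed always present)."""
    name = "plugin"

    @staticmethod
    def available():
        return True


class FFISockets:
    """Stand-in for an optional FFI-backed implementation."""
    name = "ffi"

    @staticmethod
    def available():
        try:
            import ctypes  # noqa: F401  (the FFI itself may be absent)
        except ImportError:
            return False
        return True


class MissingSockets:
    """Stand-in for a backend whose services are not loaded."""
    name = "missing"

    @staticmethod
    def available():
        return False


def pick_backend(candidates):
    """Return the first candidate whose services are actually present."""
    for backend in candidates:
        if backend.available():
            return backend
    raise RuntimeError("no socket backend available")


# Preferred implementation first, portable fallback last.
backend = pick_backend([FFISockets, PluginSockets])
fallback = pick_backend([MissingSockets, PluginSockets])
```

The ordering of the candidate list is the whole policy: the system prefers the richer interface when it is there and still works when it is not.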

In these days of build and test slaves this kind of layering is straight-forward to manage.

> One of the fundamental services of a virtual machine is providing a safe and complete environment for the system to function in. Plugins enrich that environment. But FFI pokes holes into that safe environment, reaching out of the virtual world into the "real" world.
>
> Indeed sometimes that is exactly what you need, namely to better interact with the specific host system you are running on. I'm simply saying that we need to clearly separate this from the base system, which should be as independent of the actual host platform as possible.

Again I see this as a straw man. Yes, the system should be able to provide something portable and safe, and feature-rich. But it should also be able to provide access to the broader environment if so desired. Further, if a superior interface is available via FFI the system should use it over and above the plugin interface. The JIT does this, but you'd never notice. If SSE instructions are available they get used, etc. The system already adapts to the underlying host (file directory separators, etc). Whether the system is independent of the host or heavily dependent on it is a matter of perspective. One perspective is to say that it provides portable abstractions of host facilities. Whether one goes through an FFI or a plugin to provide these abstractions makes little difference. I would agree that we keep the FFI separate from /a/ base system, but not from /all/ base systems. I want support for symbolic links and I don't want to depend on a plugin that can't, because the facilities are too different across platforms, provide a portable abstraction of symbolic links across Unix, Windows and Mac. I want to be able to launch an arbitrary external program and not be limited to a small set of supported programs known by a plugin, etc, etc. These are all valid things to have in a base system, but not valid things to have in e.g. a web plugin.

> - Bert -

--
best,

Eliot

Chris Muller-3

Re: FFI considered harmless (was: Loading FFI is broken)

> if a
> superior interface is available via FFI the system should use it over and
> above the plugin interface.

For absolute privacy, though, The End To End Argument convinced me it
would be better to use image-level Cryptography than an external
module via FFI. Too opaque.

OT: Just thinking about this made me wonder whether password-encrypted
images would be nice to have. The VM could only launch them when the
proper key (or file) is supplied... A corresponding primitive to save
the image would encrypt it with a supplied key. Secure images?

On Tue, Nov 19, 2013 at 6:12 PM, Eliot Miranda <[hidden email]> wrote:

> [... full quote of the previous message snipped ...]

timrowledge

Re: FFI considered harmless (was: Loading FFI is broken)

On 19-11-2013, at 6:44 PM, Chris Muller <[hidden email]> wrote:
>
> OT: Just thinking about this made me wonder whether password-encrypted
> images would be nice to have. The VM can only launch them when the
> proper key (or file) is supplied... A corresponding primitive to save
> the image encrypts with a supplied key. Secure images?

Should be easy enough by using encrypted zipping to load the image; the zip code handles all the tricky cryppy stuff. Of course, you have to trust the zip code...

tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
Strange OpCodes: PWB: Put to Waste Basket

Andres Valloud-4

Re: Loading FFI is broken

In reply to this post by Eliot Miranda-2

There are other points of view worth considering. Let's require that
the resulting system works correctly, and backtrack from there to
determine how to achieve that goal.

Sometimes, such as with Single Unix Specification / POSIX sockets, it is
*impossible* to use an FFI correctly because the standard is such that
using an FFI cannot be guaranteed to produce correct results. Another
way of saying the same thing is that you can use an FFI, as long as you
don't care about the presence of undefined behavior in the general case.

(Note that "undefined behavior" is specification-language shorthand for
"execute arbitrary instructions", basically. Usually this results in a
segfault, but data corruption and security holes are possible too.)

> Show me how you can replace the SocketPlugin with FFI, and
> I'll consider it. ;)

Specifically, SUS / POSIX sockets rely on partially specified structs
whose size, fields, and field order can change from Unix to Unix.
Moreover, the functions you'd call using those structs as arguments can
be defined as macros. Even trivial things like malloc() can be macros.
It's impossible to use those kinds of APIs in a sane manner from an
FFI. Theoretically it's conceivable, but at the cost of breaking C's
encapsulation mechanism, thus making the whole application non portable
across SUS / POSIX compliant implementations. If one wanted to go that
route, keep in mind the resulting never ending maintenance homework is
extremely time consuming, and the application's behavior cannot ever be
proven correct. In real life, the FFI approach to these APIs means
applications are not rationally supportable due to undefined behavior.
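To make the struct-layout hazard concrete, here is a small sketch using Python's `ctypes`. The two layouts are hypothetical, invented for illustration rather than copied from any particular system's headers; the point is only that a field named the same way can sit at a different offset and have a different width.

```python
import ctypes


class AddrLayoutA(ctypes.Structure):
    # Hypothetical layout: a 16-bit family field comes first.
    _fields_ = [("family", ctypes.c_uint16),
                ("port", ctypes.c_uint16),
                ("addr", ctypes.c_uint32)]


class AddrLayoutB(ctypes.Structure):
    # Hypothetical layout: a leading length byte and an 8-bit family field.
    _fields_ = [("length", ctypes.c_uint8),
                ("family", ctypes.c_uint8),
                ("port", ctypes.c_uint16),
                ("addr", ctypes.c_uint32)]


def layout(struct):
    """Map each field name to its (offset, size) in bytes."""
    return {name: (getattr(struct, name).offset, getattr(struct, name).size)
            for name, _ctype in struct._fields_}


layout_a = layout(AddrLayoutA)
layout_b = layout(AddrLayoutB)
```

Both structs have the same total size here, so a binding that hard-codes one layout and runs against the other would not fail loudly; it would simply read and write `family` in the wrong place.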

Speaking of symlinks, the function-like-things symlink() and stat() can
also be macros as per SUS / POSIX. So, even if there was a function
called "symlink" you could find via dlsym() or an equivalent, it's
*unsafe* to assume you can use an FFI to call that something called
"symlink" and produce the same effect as writing "symlink" in a C source
file that is given to a C compiler.

This problem has already been satisfactorily addressed in the form of a
C compiler and a properly configured compilation environment producing
primitives (or things equivalent to primitives), such that you write
something like

make fooPrimitivesOrBarPlugin

and in O(1 second) you have something that could possibly work
correctly. Note that I mean "correctly" as in

"if it doesn't work, then it's conceivable you can file a well
documented bug report with the maintainer after a modest amount of effort",

as opposed to

"send the author a circumstantial account to the effect that after
looking at random .h files with a random (perhaps human) .h file parser,
using binaries compiled with random optimization switches on a random
machine, and violating the relevant specification that describes the
rational use of the feature in question, the resulting application fails
due to an unspecified cause --- help!".

For some reason, code maintainers tend to pay attention to the former
and ignore the latter.

In short, an issue with these types of FFIs is that all too often they
merely *appear* to work. The only rational usage model for some (most?)
of the APIs mentioned in this thread involves a C compiler, which in
practice means a C primitive or a C plugin.

The above points, argued strictly on technical grounds, are not intended
to "cause a confrontation" or to "negate benefits of FFIs and plugins".
I just strongly care that applications Work(TM). That goal sometimes
implies dealing with SUS / POSIX (or, gasp, MSDN) and a C compiler.
Maybe it's not necessarily the most enjoyable activity, but at least
then the C stuff will be used as intended. The alternative is non stop
stochastic crashes preventing everyone's progress.

... my 2 cents...

On 11/19/13 10:35 , Eliot Miranda wrote:

> [... full quote of the original message snipped ...]

Eliot Miranda-2

Re: Loading FFI is broken

On Tue, Nov 19, 2013 at 7:00 PM, Andres Valloud <[hidden email]> wrote:

> There are other points of view worth considering. Let's require that the resulting system works correctly, and backtrack from there to determine how to achieve that goal.
>
> Sometimes, such as with Single Unix Specification / POSIX sockets, it is *impossible* to use an FFI correctly because the standard is such that using an FFI cannot be guaranteed to produce correct results. Another way of saying the same thing is that you can use an FFI, as long as you don't care about the presence of undefined behavior in the general case.
>
> (note that "undefined behavior" is specification language short hand for "execute arbitrary instructions", basically. Usually this results in a segfault, but data corruption and security holes are possible too)
>
>> Show me how you can replace the SocketPlugin with FFI, and
>> I'll consider it. ;)
>
> Specifically, SUS / POSIX sockets rely on partially specified structs that can change size, field, and field order from Unix to Unix. Moreover, the functions you'd call using those structs as arguments can be defined as macros. Even trivial things like malloc() can be macros. It's impossible to use those kinds of APIs in a sane manner from an FFI.

That's not so. I came up with a scheme and implemented a prototype for VW. All one need do is generate a wrapper and compile it on the platform. One can autogenerate and autocompile the wrapper. The wrapper can either be something that outputs metadata interpreted by the image or something that actually wraps the platform functions. If it can be called from C then, with a little ingenuity, it can be called through an FFI. An FFI is not just a marshaller.

I would argue that in fact the best way to deal with differing UNIX implementations is this approach. For example, ioctl defines, socket constant defines, struct layouts, etc, etc all differ markedly between UNIX implementations, and hence one easy way to extract exact information is to generate, compile and either run or load a program that reveals the implementation details.
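The generate, compile and run approach can be sketched as follows. This is an illustration in Python, not the VW prototype mentioned above; `probe_source` and `run_probe` are hypothetical helpers, and the sketch assumes a C compiler reachable as `cc`.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def probe_source(headers, struct, fields, constants):
    """Return C source that prints layout facts as 'key value' lines."""
    lines = ["#include <stdio.h>", "#include <stddef.h>"]
    lines += [f"#include <{header}>" for header in headers]
    lines.append("int main(void) {")
    lines.append(f'  printf("sizeof {struct} %zu\\n", sizeof(struct {struct}));')
    for field in fields:
        lines.append(f'  printf("offsetof {struct}.{field} %zu\\n", '
                     f'offsetof(struct {struct}, {field}));')
    for const in constants:
        lines.append(f'  printf("const {const} %ld\\n", (long){const});')
    lines += ["  return 0;", "}"]
    return "\n".join(lines) + "\n"


def run_probe(source, compiler="cc"):
    """Compile and run the probe; return {key: int}, or None if no compiler."""
    if shutil.which(compiler) is None:
        return None
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "probe.c"
        exe = Path(tmp) / "probe"
        src.write_text(source)
        subprocess.run([compiler, str(src), "-o", str(exe)], check=True)
        out = subprocess.run([str(exe)], check=True,
                             capture_output=True, text=True).stdout
    facts = {}
    for line in out.splitlines():
        key, value = line.rsplit(" ", 1)
        facts[key] = int(value)
    return facts


source = probe_source(["sys/socket.h", "netinet/in.h"], "sockaddr_in",
                      ["sin_family", "sin_port"], ["AF_INET"])
```

Calling `run_probe(source)` on a given Unix would then yield that platform's own sizes, offsets and constant values, as computed by its C compiler from its headers, which the image could use to build its struct definitions instead of hard-coding them.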

Theoretically it's conceivable, but at the cost of breaking C's encapsulation mechanism, thus making the whole application non portable across SUS / POSIX compliant implementations. If one wanted to go that route, keep in mind the resulting never ending maintenance homework is extremely time consuming, and the application's behavior cannot ever be proven correct. In real life, the FFI approach to these APIs means applications are not rationally supportable due to undefined behavior.

Speaking of symlinks, the function-like-things symlink() and stat() can also be macros as per SUS / POSIX. So, even if there was a function called "symlink" you could find via dlsym() or an equivalent, it's *unsafe* to assume you can use an FFI to call that something called "symlink" and produce the same effect as writing "symlink" in a C source file that is given to a C compiler.
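To make the point concrete, here is a minimal sketch of the kind of small C shim that sidesteps the problem. The names wrapped_symlink and wrapped_file_size are hypothetical. The shim is compiled by the platform's C compiler, so it does not matter whether symlink() or stat() are functions or macros, and the exported signatures use only fixed-width types, so the caller never sees struct stat.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical shim: creates a symbolic link.
   Returns 0 on success, -1 on error (errno is set by symlink). */
int32_t wrapped_symlink(const char *target, const char *linkpath)
{
    return (int32_t)symlink(target, linkpath);
}

/* Hypothetical shim: returns the size in bytes of the file at path,
   or -1 on error. struct stat stays entirely on the C side. */
int64_t wrapped_file_size(const char *path)
{
    struct stat st;
    if (stat(path, &st) != 0)
        return -1;
    return (int64_t)st.st_size;
}
```

Whether such a shim is reached as a primitive, a plugin, or through an FFI is a separate question; in each case the C compiler is what resolves the platform details.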

This problem has already been satisfactorily addressed in the form of a C compiler and a properly configured compilation environment producing primitives (or things equivalent to primitives), such that you write something like

make fooPrimitivesOrBarPlugin

and in O(1 second) you have something that could possibly work correctly. Note that I mean "correctly" as in

"if it doesn't work, then it's conceivable you can file a well documented bug report with the maintainer after a modest amount of effort",

as opposed to

"send the author a circumstantial account to the effect that after looking at random .h files with a random (perhaps human) .h file parser, using binaries compiled with random optimization switches on a random machine, and violating the relevant specification that describes the rational use of the feature in question, the resulting application fails due to an unspecified cause --- help!".

For some reason, code maintainers tend to pay attention to the former and ignore the latter.

In short, an issue with these types of FFIs is that all too often they merely *appear* to work. The only rational usage model for some (most?) of the APIs mentioned in this thread involves a C compiler, which in practice means a C primitive or a C plugin.

The above points, argued strictly on technical grounds, are not intended to "cause a confrontation" or to "negate benefits of FFIs and plugins". I just strongly care that applications Work(TM). That goal sometimes implies dealing with SUS / POSIX (or, gasp, MSDN) and a C compiler. Maybe it's not necessarily the most enjoyable activity, but at least then the C stuff will be used as intended. The alternative is non stop stochastic crashes preventing everyone's progress.

... my 2 cents...

On 11/19/13 10:35 , Eliot Miranda wrote:

Hi All,

this is an important discussion that is taking a religious tone
that we should strive to avoid. There are good arguments for plugins,
namely security and encapsulation. There are good arguments for an FFI,
namely extensibility and platform compatibility.

Plugins provide security because they allow the system to control any
and all access to the underlying platform, permitting access only
through plugins. With an FFI the underlying platform is exposed and one
needs other mechanisms, for example Newspeak mirrors, to prevent
untrusted code from accessing the platform with potentially disastrous
effects (self shell: '/bin/rm -rf /*').

Plugins encapsulate all sorts of details behind a potentially simple
primitive interface. This can avoid confusing the newcomer (but at the
same time frustrate them by hiding details), provide portability, can
make it easier to determine the extent of work in moving to a new OS
platform, and so on.

An FFI allows immediate extensibility. External functionality can be
invoked immediately. With plugins a primitive interface must be
designed and then implemented. With the FFI the API is already defined;
it must "merely" be accessed. This immediacy can itself provide
simplicity, especially where callbacks and threads are involved.
Plugins can hide a lot of complexity (e.g. the SocketPlugin
encapsulates platform threads that are waiting on blocking calls so that
Squeak itself is provided with an interrupt-driven interface,
necessitated by the Squeak platform's lack of native thread support).

An FFI allows all underlying functionality to be accessed. The plugin
approach necessitates defining a lowest common denominator approach to
functionality, especially irksome in some applications where setting the
right flag, e.g. on a socket stream, can have a significant performance
impact.

So there are good arguments either way. In a system oriented towards
safe play plugins make excellent sense. In a platform oriented towards
industrial development an FFI is a must-have, and a weak one will really
hurt acceptance.

IMO Squeak needs to have both. It needs plugins to provide its
hallmarks such as eToys. But to be a more general platform it needs an
FFI. Managing this split personality will take work but I don't see any
fundamental issues. Having a well-factored base into which packages can
be loaded to create different personalities is key, and good work is
being done here. There may be a half-way house where the FFI is
strictly encapsulated, but this is hypothetical. I know how to solve
threads, pinning, etc, but I don't know off the top of my head how to
encapsulate the FFI, so I can't propose it as a solution.

A number of straw men have been raised against the FFI in this
discussion. OK, that's unfair. A number of important questions have
been asked of the FFI in this discussion.

Levente asks "Show me how you can replace the SocketPlugin with FFI, and
I'll consider it. ;)".
The issue here is threads. The SocketPlugin encapsulates blocking
calls, spawning hidden OS threads to make these calls and then signal
semaphores when they complete. To solve this one needs both native
thread support in the VM (and I have a prototype that needs Spur's
facilities to make practicable) and pinning (the ability to stop certain
objects moving). Spur provides pinning.

David says "I remember when somebody on the Pharo list suggested
reimplementing the
OSProcessPlugin in FFI. I told them it was a really great idea, and they
should give it a try. That settled the matter quite quickly ;-)". Again
they failed because of the lack of necessary underlying functionality
from the VM. With threads, pinning and a way of expressing the array of
pointers to strings idiom (a simple extension to marshalling, and/or
pinning, e.g. provide an address of first field primitive) an FFI can do
all the OSProcessPlugin can do and significantly simpler.

David also says "it is a complete mystery to me why people are willing
to work so hard to avoid writing a VM plugin. VM plugins are reliable,
portable, and debuggable. They work across a range of processors. They
work on 64-bit platforms. So why would someone prefer to switch to a
calling interface that basically only works on 32-bit Intel processors
and that may require low level knowledge of calling conventions, word
alignment, and platform-specific data types?"

This is a non-sequitur. The sentences beginning "So why would
someone..." don't follow from the first sentences. Writing the plugin
requires even more knowledge than writing the FFI interface because one
needs to know the VM facilities for mating Squeak objects to plugins.
Writing plugins /and/ writing interfaces above FFIs are hard. But in
my experience a powerful FFI provides a faster and easier development
experience. Both can be difficult to port, but plugins have the
advantage that only the innards have to be ported while facing the C
code face. My experience in that regard leaves me with a preference for
FFIs. The lack of a 64-bit FFI is a bad weakness of the Squeak
platform, something Spur again makes easy to rectify.

Bert asks "Suppose we add a new VM platform, like a VM running on
JavaScript in the browser. Do you really want to re-implement all the C
libraries utilized via FFI? Or rather a handful of primitives in your
language of choice?". First it is not clear that one *can* implement
these primitives taking either approach. If the platform, e.g.
JavaScript in a browser, takes the Squeak plugin approach of preventing
access to the platform except through a restricted set of facilities,
then certain functionality will simply be off-limits, whether one has an
FFI or not. Second, reimplementing all the C libraries isn't
obligatory. If the platform provides an FFI one simply mates to its FFI
and accesses the underlying libraries. If it doesn't then that
functionality is off-limits, but that doesn't mean the rest of the
system doesn't work. It also means that Squeak running in that context
is no less useful than any other platform, because the underlying
platform (just as Squeak does with plugins)

--
best,
Eliot


Andres Valloud-4

Re: Loading FFI is broken

On 11/19/13 19:13 , Eliot Miranda wrote:
>
> That's not so. I came up with a scheme and implemented a prototype for
> VW. All one need do is generate a wrapper and compile it on the
> platform. One can autogenerate and autocompile the wrapper. The
> wrapper can either be something that outputs metadata interpreted by the
> image or something that actually wraps the platform functions. If it
> can be called from C then, with a little ingenuity, it can be called
> through an FFI. An FFI is not just a marshaller.

An FFI is not a C compiler either :).

Andres Valloud-4

Re: Loading FFI is broken

In reply to this post by Eliot Miranda-2

On 11/19/13 19:13 , Eliot Miranda wrote:
> I would argue that in fact the best way to deal with differing UNIX
> implementations is this approach. For example, ioctl defines, socket
> constant defines, struct layouts, etc, etc all differ markedly between
> UNIX implementations, and hence one easy way to extract exact
> information is to generate, compile and either run or load a program
> that reveals the implementation details.

It's not clear to me how an arbitrary interpretation mechanism would
reveal what function to call to invoke e.g. malloc(), assuming that's
all there is to it. The mechanism would have to deal with arbitrary
macro code expansion, arbitrary code without source code provided by the
compiler itself, and on some platforms such as OS X the behavior of
malloc() could depend on the value of environment variables at the time
the binary is loaded (as opposed to the time when the interpretation
mechanism looks at said variables). Similarly, it's unclear what that
interpretation mechanism would do in the general presence of things like

#if defined(FOO)
...
#endif

To me, avoiding writing a few lines of C does not justify the effort of
correctly rewriting and maintaining (parts of) a C compiler in Smalltalk.

Generally, I agree that one could carefully and consciously write
comparatively small primitives and/or plugins. Then, one could compile
those with a C compiler in a compilation environment compatible with
that of the VM. And then, one could call those primitives and/or
plugins from the image with the expectation (within reason) that they
should work.

Igor Stasenko

Re: FFI considered harmless (was: Loading FFI is broken)

In reply to this post by Eliot Miranda-2

On 20 November 2013 01:12, Eliot Miranda <[hidden email]> wrote:

On Tue, Nov 19, 2013 at 12:00 PM, Bert Freudenberg <[hidden email]> wrote:

On 19.11.2013, at 12:35, Eliot Miranda <[hidden email]> wrote:

> Hi All,
>
> this is an important discussion that is taking a religious tone that we should strive to avoid.

Let's call it philosophical, "religious" is just flame bait.

> There are good arguments for plugins, namely security and encapsulation. There are good arguments for an FFI, namely extensibility and platform compatibility.

Agreed.

> [... nice explanation snipped ...]

> A number of straw men have been raised against the FFI in this discussion.

No-one has been arguing against FFI in general. We agree an FFI is useful, and a more powerful FFI is better. We just (appear to) disagree on how widely it should be used.

> Bert asks "Suppose we add a new VM platform, like a VM running on JavaScript in the browser. Do you really want to re-implement all the C libraries utilized via FFI? Or rather a handful of primitives in your language of choice?". First it is not clear that one *can* implement these primitives taking either approach. If the platform, e.g. JavaScript in a browser, takes the Squeak plugin approach of preventing access to the platform except through a restricted set of facilities, then certain functionality will simply be off-limits, whether one has an FFI or not. Second, reimplementing all the C libraries isn't obligatory. If the platform provides an FFI one simply mates to its FFI and accesses the underlying libraries. If it doesn't then that functionality is off-limits, but that doesn't mean the rest of the system doesn't work.

That's where we disagree. If basic functions in the system depend on FFI, and FFI is not available, then the system *does not work* at all. E.g., there are efforts in other Squeak forks to replace fundamental parts of the system (which currently rely on VM primitives) with FFI calls. That's what I am wary of.

I see your concern but it doesn't worry me. I don't see why the system can't be constructed so that it discovers what services are available. It already does that in a number of circumstances. For example, the menu bar includes a system report if class SystemReporter is loaded. So I can imagine that the socket layer would look for an FFI-based implementation and use it if available, falling back on the plugin interface if absent.

In these days of build and test slaves this kind of layering is straight-forward to manage.

One of the fundamental services of a virtual machine is providing a safe and complete environment for the system to function in. Plugins enrich that environment. But FFI pokes holes into that safe environment, reaching out of the virtual world into the "real" world.

Indeed sometimes that is exactly what you need, namely to better interact with the specific host system you are running on. I'm simply saying that we need to clearly separate this from the base system, which should be as independent of the actual host platform as possible.

Again I see this as a straw-man. Yes, the system should be able to provide something portable and safe, and feature-rich. But it should also be able to provide access to the broader environment if so desired. Further, if a superior interface is available via FFI the system should use it over and above the plugin interface. The JIT does this, but you'd never notice. If SSE instructions are available they get used, etc. The system already adapts to the underlying host (file directory separators, etc). Whether the system is independent of the host or heavily dependent on it is a matter of perspective. One perspective is to say that it provides portable abstractions of host facilities. Whether one goes through an FFI or a plugin to provide these abstractions makes little difference. I would agree that we keep the FFI separate from /a/ base system, but not from /all/ base systems. I want support for symbolic links and I don't want to depend on a plugin that can't, because the facilities are too different across platforms, provide a portable abstraction of symbolic links across Unix, Windows and Mac. I want to be able to launch an arbitrary external program and not be limited to a small set of supported programs known by a plugin, etc, etc. These are all valid things to have in a base system, but not valid things to have in e.g. a web plugin.

Amen.

For an application-level developer the main focus is how to get missing functionality with minimum effort. The amount of effort increases (up to infinity) once you start adding artificial walls and layers.

For system-level design, I agree, we should be accurate and keep things separate (but this is just a modularity concern, not security!). But at the application level it is usually the complete opposite: people develop and deploy their app on system X, packed with libraries A, B, C..., and they don't care if it won't run on system Y or Z, since that is beyond their deployment target.

VM-level security is IMO the wrong approach. The VM runs on a system which provides (or not) a certain level of security. If you start artificially putting up barriers, you are only making it harder for the application-level developer to get what he wants.

The level of security should be in the hands of developers, the users of the VM, not in the hands of the VM.
So the VM's role is like a screwdriver: a tool you use to (screw ;) do something, not a police officer who watches for your crimes.

- Bert -

--
best,
Eliot

--
Best regards,
Igor Stasenko.

Igor Stasenko

Re: Loading FFI is broken

In reply to this post by Eliot Miranda-2

On 15 November 2013 02:40, Eliot Miranda <[hidden email]> wrote:

Hi David,

forgive me for chipping in somewhat tangentially. But the vision thang can be important sometimes.

On Thu, Nov 14, 2013 at 3:59 PM, David T. Lewis <[hidden email]> wrote:

A few things to be aware of:

- Both Eliot Miranda and Igor Stasenko have done considerable work on
FFI interfaces. I cannot explain the details, but be aware that there
may be more than one kind of FFI that people will want to use, and it
seems quite likely that the classic FFI may in future be replaced by
a different implementation.

+1. At least Spur will change the facilities available as does NativeBoost. Ideally we'll develop some kind of synthesis. Spur provides pinning. NativeBoost provides the ability to generate marshalling code from within the image which means not having to rely on the VM for marshalling, which should hopefully mean correctness on platforms with really complex marshalling rules such as x86-64/IA64.

Further, Igor's discovery that any C signature can be compiled as a Smalltalk literal array changes how one should parse C declarations. It's a magnificently simple hack.

struct { int foo; float bar; } (*f)(double bletch);
=> #(#struct #'{' #int #foo #';' #float #bar #';' #'}' #(#* #f) #(#double #bletch) #';')
=> #(struct { int foo; float bar; } (*f)(double bletch);)

Well, actually my discovery was that if I put

#( ... )

around any piece of code, I can be almost certain it won't give me any syntax errors, except when there are unmatched [] () '' thingies, which actually NEVER occur in programming-language code, because of its syntax.

So my discovery was that I can use array literal syntax as a pseudo-comment, if I like, or as a metadata holder, as I did in NB.

But actually, today's NB function signature parser accepts both array literals and strings as input, because, as it turned out, there's not much extra code/complexity needed to parse strings as well.

But using an array literal for a function signature has certain benefits for developers over strings, because if I have a thing in the form:

#( myType foo())

I can look for senders of #myType

and quickly find where it is used/defined.

While when it is in the form

'myType foo()'

that won't be so trivial.

- None of the available FFI interfaces will work for a VM compiled in
64-bit mode, nor can they be used to interface to existing 64-bit libraries.
I am sure that the problems will be addressed in one or more of the
FFI implementations, but this may take some time (to give an idea of
the time scale, see http://bugs.squeak.org/view.php?id=7237).

That will be addressed when a 64-bit Spur is implemented.

- Like any other reloadable package in Squeak, the classic Squeak FFI
can be maintained as an external package. It requires a little more
work to keep it healthy, because someone has to remember to test it
once in a while. But there is no reason that this can not or should not
be done.

+1.

Dave

On Thu, Nov 14, 2013 at 03:38:49PM -0800, Chris Cunningham wrote:
> FFI used to BE part of the base image, and was specifically taken out on
> purpose. The purpose, as I remember it, was because it acted like a
> security hole for squeak as a squeaklet on the web (when you browse a page
> on a website that invokes Squeak, and then being able to, via FFI, invoke
> anything you want on the remote machine). So, for any image still working
> in that mode (such as the etoys derivative), it is important to leave out.
>
> But, with the shiny new CI servers, couldn't we have an artifact built with
> FFI loaded?
>
> Or, even neater, include FFI in the trunk, but have a switch/pragma in the
> image that says 'Don't ever load FFI' for those that don't want it, but
> want everything else loaded? So, for people that LIKE FFI, it is just part
> of Trunk; for those that have to avoid it, it isn't loaded; for the rest,
> they get whatever was setup when they got their image. (I'd be in the
> first group most of the time)
>
> -cbc
>
>
> On Thu, Nov 14, 2013 at 2:30 PM, Frank Shearar <[hidden email]> wrote:
>
> > On 14 November 2013 16:42, Eliot Miranda <[hidden email]> wrote:
> > > Hi Frank,
> > >
> > >
> > > On Thu, Nov 14, 2013 at 1:59 AM, Frank Shearar <[hidden email]>
> > > wrote:
> > >>
> > >> That's the source of the problem: FFI-Kernel uses FFIConstants, which
> > >> is declared/defined in FFI-Pools.
> > >>
> > >> Insert standard Frank rant of something called "Kernel" depending on
> > >> other stuff. (Possibly insert standard Chris rant of a package
> > >> containing a single class? :) )
> > >
> > >
> > > The problem here is that the VM depends on FFIConstants also, and the VM
> > > shouldn't depend on the rest of FFI. So FFIConstants does need to be on
> > its
> > > own, but could perhaps be called something different.
> >
> > You know, I would be quite happy if we folded FFI into the base image.
> > Seriously. For once I'd argue for _including_ something. I would
> > gladly trade Morphic for FFI.
> >
> > OK, I'll chalk this up to That's How It Is, and move on. If you
> > install from SqueakMap (as you should, and as I should have) you get
> > the right load order.
> >
> > frank
> >
> > >> frank
> > >>
> > >> On 12 November 2013 23:20, Chris Muller <[hidden email]> wrote:
> > >> > I didn't see you loading FFI-Pools in advance. The issue you
> > >> > encountered
> > >> > may not have had anything to do with -eem.24.
> > >> >
> > >> > FYI -- FFI also has a "head" release on SqueakMap which documents how
> > to
> > >> > load it.
> > >> >
> > >> >
> > >> > On Tue, Nov 12, 2013 at 4:50 PM, Frank Shearar <
> > [hidden email]>
> > >> > wrote:
> > >> >>
> > >> >> That pulls in the latest packages, right?
> > >> >>
> > >> >> So that does work. But why does the reported version fail to load?
> > >> >> That's a problem still.
> > >> >>
> > >> >> (While this lets me get on with what I wanted to do - add
> > >> >> #interleaving: to Xtreams - this is still a problem. But thanks for
> > >> >> the workaround, Chris!)
> > >> >>
> > >> >> frank
> > >> >>
> > >> >> On 12 November 2013 20:55, Chris Muller <[hidden email]> wrote:
> > >> >> > Installer new merge: #ffi.
> > >> >> >
> > >> >> > Worked for me..
> > >> >> >
> > >> >> > On Tue, Nov 12, 2013 at 2:45 PM, Frank Shearar
> > >> >> > <[hidden email]>
> > >> >> > wrote:
> > >> >> >> On 26 February 2013 18:36, Eliot Miranda <[hidden email]
> > >
> > >> >> >> wrote:
> > >> >> >>>
> > >> >> >>>
> > >> >> >>> On Tue, Feb 26, 2013 at 1:51 AM, Frank Shearar
> > >> >> >>> <[hidden email]>
> > >> >> >>> wrote:
> > >> >> >>>>
> > >> >> >>>> (Installer monticello mc: (MCHttpRepository new location:
> > >> >> >>>> 'http://source.squeak.org/FFI'))
> > >> >> >>>> install: 'FFI-Kernel-eem.24.mcz'. "There's a -tbn.25, but
> > >> >> >>>> that's
> > >> >> >>>> not important to this post"
> > >> >> >>>>
> > >> >> >>>> fails with an MNU: ExternalFunction class >>
> > >> >> >>>> callingConventionFor:.
> > >> >> >>>>
> > >> >> >>>> As far as I can see what's happening is this:
> > >> >> >>>> * during the loading of the mcz ExternalFunction is defined,
> > >> >> >>>> * a method is parsed (#XOpenDisplay, which has a pragma <cdecl:
> > >> >> >>>> X11Display* ''XOpenDisplay'' (char*) module:''X11''>)
> > >> >> >>>> * Parser >> externalFunctionDeclaration checks whether
> > >> >> >>>> ExternalFunction is defined.
> > >> >> >>>> * It is, so tries to evaluate `descriptorClass
> > callingConvention:
> > >> >> >>>> here` and boom, because ExternalFunction class >>
> > >> >> >>>> callingConventionFor: _has not been loaded yet_.
> > >> >> >>>
> > >> >> >>>
> > >> >> >>> I thought we'd modified Monticello to load new methods first.
> > Did
> > >> >> >>> this not
> > >> >> >>> get added to trunk?
> > >> >> >>
> > >> >> >> Apparently not. It still happens with an up-to-date Squeak 4.5.
> > >> >> >>
> > >> >> >> frank
> > >> >> >>
> > >> >> >>>> I've seen this kind of issue with Helvetia code: sometimes you
> > >> >> >>>> simply
> > >> >> >>>> have to load class-side methods first.
> > >> >> >>>>
> > >> >> >>>> Thoughts? Lukas worked around the issue with his Helvetia code
> > by
> > >> >> >>>> directly patching Monticello, and ripping out its "try to do
> > >> >> >>>> atomic
> > >> >> >>>> loading" mechanism.
> > >> >> >>>>
> > >> >> >>>> frank
> > >> >> >>>>
> > >> >> >>>
> > >> >> >>>
> > >> >> >>>
> > >> >> >>> --
> > >> >> >>> best,
> > >> >> >>> Eliot
> > >> >> >>>
> > >> >> >>>
> > >> >> >>>
> > >> >> >>
> > >> >> >
> > >> >
> > >> >
> > >>
> > >
> > >
> > >
> > > --
> > > best,
> > > Eliot
> >
> >

>

--
best,
Eliot

--
Best regards,
Igor Stasenko.

Igor Stasenko

Re: Loading FFI is broken

In reply to this post by Andres Valloud-4

On 20 November 2013 04:57, Andres Valloud <[hidden email]> wrote:

On 11/19/13 19:13 , Eliot Miranda wrote:

I would argue that in fact the best way to deal with differing UNIX
implementations is this approach. For example, ioctl defines, socket
constant defines, struct layouts, etc, etc all differ markedly between
UNIX implementations, and hence one easy way to extract exact
information is to generate, compile and either run or load a program
that reveals the implementation details.

It's not clear to me how an arbitrary interpretation mechanism would reveal what function to call to invoke e.g. malloc(), assuming that's all there is to it. The mechanism would have to deal with arbitrary macro code expansion, arbitrary code without source code provided by the compiler itself, and on some platforms such as OS X the behavior of malloc() could depend on the value of environment variables at the time the binary is loaded (as opposed to the time when the interpretation mechanism looks at said variables). Similarly, it's unclear what that interpretation mechanism would do in the general presence of things like

#if defined(FOO)
...
#endif

To me, avoiding writing a few lines of C does not justify the effort of correctly rewriting and maintaining (parts of) a C compiler in Smalltalk.

Generally, I agree that one could carefully and consciously write comparatively small primitives and/or plugins. Then, one could compile those with a C compiler in a compilation environment compatible with that of the VM. And then, one could call those primitives and/or plugins from the image with the expectation (within reason) that they should work.

But you forgot to mention that while writing a plugin, or trying to dynamically link with some external library, an author must deal with tons of those #ifdefs as well.

IMO, macros are the worst thing invented for C.
Having 10+ years of Pascal experience, I cannot really understand why it was left behind by industry: 10 to 100 times faster compilation, no crazy modularity problems, the same (or even better) code performance, and, most importantly, what you see in the code is what you get.

--
Best regards,
Igor Stasenko.

Igor Stasenko

Re: Loading FFI is broken

On 20 November 2013 15:07, Igor Stasenko <[hidden email]> wrote:

On 20 November 2013 04:57, Andres Valloud <[hidden email]> wrote:

On 11/19/13 19:13 , Eliot Miranda wrote:

I would argue that in fact the best way to deal with differing UNIX
implementations is this approach. For example, ioctl defines, socket
constant defines, struct layouts, etc, etc all differ markedly between
UNIX implementations, and hence one easy way to extract exact
information is to generate, compile and either run or load a program
that reveals the implementation details.

It's not clear to me how an arbitrary interpretation mechanism would reveal what function to call to invoke e.g. malloc(), assuming that's all there is to it. The mechanism would have to deal with arbitrary macro code expansion, arbitrary code without source code provided by the compiler itself, and on some platforms such as OS X the behavior of malloc() could depend on the value of environment variables at the time the binary is loaded (as opposed to the time when the interpretation mechanism looks at said variables). Similarly, it's unclear what that interpretation mechanism would do in the general presence of things like

#if defined(FOO)
...
#endif

To me, avoiding writing a few lines of C does not justify the effort of correctly rewriting and maintaining (parts of) a C compiler in Smalltalk.

Generally, I agree that one could carefully and consciously write comparatively small primitives and/or plugins. Then, one could compile those with a C compiler in a compilation environment compatible with that of the VM. And then, one could call those primitives and/or plugins from the image with the expectation (within reason) that they should work.

But you forgot to mention that while writing a plugin, or trying to dynamically link with some external library, an author must deal with tons of those #ifdefs as well.

Heck, even worse (and the most offensive scenario):

- you compile a plugin and ship it, built with your own set of defines

- a stupid (or maybe too clever) user uses a slightly different environment setup, built with its own set of defines

- crash, boom as a result

... so the following (quote):

<<< . VM plugins are reliable,
portable, and debuggable. They work across a range of processors. They
work on 64-bit platforms. >>>

.. is just a fairy tale. Nothing is reliable, portable, and debuggable when it comes to C.

IMO, macros are the worst thing invented for C.
Having 10+ years of Pascal experience, I cannot really understand why it was left behind by industry: 10 to 100 times faster compilation, no crazy modularity problems, the same (or even better) code performance, and, most importantly, what you see in the code is what you get.

--
Best regards,
Igor Stasenko.


Eliot Miranda-2

Re: Loading FFI is broken

In reply to this post by Andres Valloud-4

On Tue, Nov 19, 2013 at 7:57 PM, Andres Valloud <[hidden email]> wrote:

On 11/19/13 19:13 , Eliot Miranda wrote:

I would argue that in fact the best way to deal with differing UNIX
implementations is this approach. For example, ioctl defines, socket
constant defines, struct layouts, etc, etc all differ markedly between
UNIX implementations, and hence one easy way to extract exact
information is to generate, compile and either run or load a program
that reveals the implementation details.

It's not clear to me how an arbitrary interpretation mechanism

I didn't introduce an interpretation mechanism, I introduced a compilation mechanism.

would reveal what function to call to invoke e.g. malloc(), assuming that's all there is to it. The mechanism would have to deal with arbitrary macro code expansion, arbitrary code without source code provided by the compiler itself, and on some platforms such as OS X the behavior of malloc() could depend on the value of environment variables at the time the binary is loaded (as opposed to the time when the interpretation mechanism looks at said variables). Similarly, it's unclear what that interpretation mechanism would do in the general presence of things like

#if defined(FOO)
...
#endif

To me, avoiding writing a few lines of C does not justify the effort of correctly rewriting and maintaining (parts of) a C compiler in Smalltalk.

No matter what, the following defines a function that takes an integer and returns the result of malloc:

#include <stdlib.h>

void *malloc_wrapper(int n) { return malloc(n); }

This wrapper can be auto-generated and compiled into a shared object or dll and used to wrap whatever crap the underlying platform chooses to use in implementing malloc.

This can also be used to wrap macros and print the values of simple defines.
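As a rough sketch of the macro and define cases (the wrapper names below are hypothetical and not taken from any existing plugin or FFI package; only S_ISDIR and SOCK_STREAM are real platform names), a generated file might look something like this on a POSIX system:

```c
#include <stdio.h>
#include <sys/socket.h>
#include <sys/stat.h>

/* S_ISDIR is a macro, so there is no symbol for an FFI to look up.
   Wrapping it gives the shared object a real function to export.
   (s_isdir_wrapper is a hypothetical name.) */
int s_isdir_wrapper(unsigned int mode)
{
    return S_ISDIR((mode_t)mode) ? 1 : 0;
}

/* Format "NAME=value" for a simple integer define. The # operator
   stringizes the name as written; the cast expands it to its value. */
#define FORMAT_DEFINE(buf, size, name) \
    snprintf((buf), (size), #name "=%ld", (long)(name))

/* Hypothetical helper: reports whatever value this platform's
   headers give SOCK_STREAM, without hard-coding it in the image. */
int describe_sock_stream(char *buf, size_t size)
{
    return FORMAT_DEFINE(buf, size, SOCK_STREAM);
}
```

Compiled on the target machine, the resulting object carries that platform's own expansion of the macro and its own value for the constant, so the image only needs to call the wrappers or parse the NAME=value text.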

A similar approach can be used to derive the layouts of things like a struct stat.

#define printfield(s,f) printf("&" #s "." #f "=%ld\n", (long)offsetof(s, f))

So one can use the C compiler to extract layout information easily that abstracts away from implementation detail and means the Smalltalk system *does not* have to implement a C compiler, merely invoke one.
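To make the layout idea concrete, here is a minimal sketch, assuming a POSIX system with <sys/stat.h>. The function name describe_stat_layout and the choice of fields are illustrative assumptions, not part of any existing tool:

```c
#include <stddef.h>
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper: writes the size of struct stat and the byte
   offsets of two of its fields into buf as NAME=value lines.
   The numbers come from the C compiler on the machine where this is
   built, so they follow that platform's headers and ABI. */
int describe_stat_layout(char *buf, size_t size)
{
    return snprintf(buf, size,
                    "sizeof(struct stat)=%ld\n"
                    "struct stat.st_mode=%ld\n"
                    "struct stat.st_size=%ld\n",
                    (long)sizeof(struct stat),
                    (long)offsetof(struct stat, st_mode),
                    (long)offsetof(struct stat, st_size));
}
```

The image would run or load this once per platform and read the text back to build its struct accessors; the concrete numbers differ between platforms, which is the whole reason for asking the compiler rather than hard-coding them.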

Note that this approach is a fall-back for perverse platforms. Most platforms are not remotely this difficult to use. Most allow us to directly call functions, know the layouts of structures and so on.

The above can be autogenerated.

Generally, I agree that one could carefully and consciously write comparatively small primitives and/or plugins. Then, one could compile those with a C compiler in a compilation environment compatible with that of the VM. And then, one could call those primitives and/or plugins from the image with the expectation (within reason) that they should work.

--
best,

Eliot

David T. Lewis

Re: Loading FFI is broken

In reply to this post by Igor Stasenko

> On 20 November 2013 15:07, Igor Stasenko <[hidden email]> wrote:
>
> ... so the following (quote):
>
> <<< . VM plugins are reliable,
> portable, and debuggable. They work across a range of processors. They
> work on 64-bit platforms. >>>
>
> .. is just a fairy tale. Nothing is reliable, portable, and debuggable
> when
> it comes about C.
>

That quote comes from me, and I stand by what I said. It is based on
personal experience, not theory.

The plugins that I have done are all written in Smalltalk, not in C.

Dave

Andres Valloud-4

Re: Loading FFI is broken

In reply to this post by Igor Stasenko

> Heck, even worse (and most offensive scenario):
> - you compiled plugin & ship it with own set of defines
> - a stupid (or maybe too clever) user uses a slightly different
> environment setup built with own set of defines
> - crash boom as result
>
> ... so the following (quote):
>
> <<< . VM plugins are reliable,
> portable, and debuggable. They work across a range of processors. They
> work on 64-bit platforms. >>>
>
> .. is just a fairy tale. Nothing is reliable, portable, and debuggable
> when it comes about C.

You're using OS X or Linux, right? I don't think it's that bad. My
point though is that if you are going to rely on C, then you ought to
play by C's rules. Forcing a Smalltalk point of view on C doesn't work
in the long run.

Andres Valloud-4

Re: Loading FFI is broken

In reply to this post by Eliot Miranda-2

On 11/20/13 8:51 , Eliot Miranda wrote:
> No matter what, the following defines a function that takes an integer
> and returns the result of malloc:
>
> #include <stdlib.h>
> void *malloc_wrapper(int n) { return malloc(n); }
>
> This wrapper can be auto-generated and compiled into a shared object or
> dll and used to wrap whatever crap the underlying platform chooses to
> use in implementing malloc.

This is the same that I said before: just write a few lines of C, and
use a C compiler to make prims or VM plugins (in the shape of a .dll or
.so file).

> Generally, I agree that one could carefully and consciously write
> comparatively small primitives and/or plugins. Then, one could
> compile those with a C compiler in a compilation environment
> compatible with that of the VM. And then, one could call those
> primitives and/or plugins from the image with the expectation
> (within reason) that they should work.

Andres.

Eliot Miranda-2

Re: Loading FFI is broken

On Wed, Nov 20, 2013 at 10:16 AM, Andres Valloud <[hidden email]> wrote:

On 11/20/13 8:51 , Eliot Miranda wrote:

No matter what, the following defines a function that takes an integer
and returns the result of malloc:

#include <stdlib.h>
void *malloc_wrapper(int n) { return malloc(n); }

This wrapper can be auto-generated and compiled into a shared object or
dll and used to wrap whatever crap the underlying platform chooses to
use in implementing malloc.

This is the same that I said before: just write a few lines of C, and use a C compiler to make prims or VM plugins (in the shape of a .dll or .so file).

But you apparently miss the point that it can also be used to help a general FFI function, and that FFI is much more useful than plugins, as discussed in this and related threads.

Generally, I agree that one could carefully and consciously write
comparatively small primitives and/or plugins. Then, one could
compile those with a C compiler in a compilation environment
compatible with that of the VM. And then, one could call those
primitives and/or plugins from the image with the expectation
(within reason) that they should work.

Andres.

Eliot

Andres Valloud-4

Re: Loading FFI is broken

> This is the same that I said before: just write a few lines of C,
> and use a C compiler to make prims or VM plugins (in the shape of a
> .dll or .so file).
>
>
> But you apparently miss the point that it can also be used to help a
> general FFI function, and that FFI is much more useful than plugins, as
> discussed in this and related threads.

As I wrote before, I never meant to detract from the advantages of
either mechanism. What I pointed out are some of the inherent
restrictions in their usage because of how C works (like it or not, it's
99% certain C runs the OS in which we want to successfully run VMs).