Alignment of FFI ExternalStructure fields

Nicolas Cellier
The current algorithm in #defineFields: and compileFields: produces compact structure layouts, with every field aligned on 1 byte.

This does not correspond to the default behavior of any platform; C compilers only produce such layouts under special pragmas or compiler options.

The current workaround is to force a length in the 3rd column of the spec.
https://stackoverflow.com/questions/49782651/how-one-aligns-structure-fields-in-squeak-ffi/49782754#49782754
But it is mind-bending:
- the forced length depends on both the preceding and the following fields
- it becomes very hard to support both 32-bit and 64-bit platforms

#(
    (x 'char' 2)
    (y 'short')
 )
#(
    (x 'char' 4) "depends on the field following"
    (y 'long')
 )
#(
    (w 'char' 1)
    (x 'char' 3) "but also on the field preceding"
    (y 'long')
 )
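
To see where the forced lengths 2, 4, 1 and 3 in the specs above come from, here is a minimal sketch in Python, whose ctypes module follows the platform C layout by default. It assumes that the FFI 'short' and 'long' are 16-bit and 32-bit integers; the class names and the forced_length helper are made up for illustration.

```python
import ctypes

# Structures mirroring the three specs above. The FFI 'short' and 'long'
# are assumed to be 16-bit and 32-bit integers.
class CharShort(ctypes.Structure):
    _fields_ = [("x", ctypes.c_char), ("y", ctypes.c_int16)]

class CharLong(ctypes.Structure):
    _fields_ = [("x", ctypes.c_char), ("y", ctypes.c_int32)]

class CharCharLong(ctypes.Structure):
    _fields_ = [("w", ctypes.c_char), ("x", ctypes.c_char), ("y", ctypes.c_int32)]

def forced_length(struct_type, field_name):
    """Bytes from the start of a field to the start of the next field,
    or to the end of the structure for the last field (hypothetical helper)."""
    names = [field[0] for field in struct_type._fields_]
    index = names.index(field_name)
    start = getattr(struct_type, field_name).offset
    if index + 1 < len(names):
        end = getattr(struct_type, names[index + 1]).offset
    else:
        end = ctypes.sizeof(struct_type)
    return end - start

for struct_type in (CharShort, CharLong, CharCharLong):
    lengths = [(field[0], forced_length(struct_type, field[0]))
               for field in struct_type._fields_]
    print(struct_type.__name__, lengths)
```

The length to force for a char changes with its neighbours, which is exactly what makes the workaround fragile.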

I'd like to make our default behavior match that of the current platform:
- atomic fields of size 1, 2, 4 or 8 bytes have alignment = size
- pointers also have alignment = pointer size (4 or 8)
- nested structures have alignment = the maximum alignment of their fields
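
The three rules above can be sketched as a small layout routine. This is only an illustration in Python, not the FFI code: the function names are made up, and the trailing padding step is the usual C convention for arrays of structures rather than something stated in the rules.

```python
def align_up(offset, alignment):
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment

def layout(fields):
    """fields: list of (name, size, alignment) tuples.
    Returns (offsets, total_size, struct_alignment)."""
    offsets = {}
    offset = 0
    struct_alignment = 1
    for name, size, alignment in fields:
        offset = align_up(offset, alignment)
        offsets[name] = offset
        offset += size
        # A structure is as strictly aligned as its most demanding field.
        struct_alignment = max(struct_alignment, alignment)
    # Trailing padding keeps arrays of the structure aligned (C convention).
    return offsets, align_up(offset, struct_alignment), struct_alignment

def atomic(name, size):
    """Atomic field or pointer: alignment equals size."""
    return (name, size, size)

inner_offsets, inner_size, inner_align = layout([atomic("c", 1), atomic("d", 8)])
outer_offsets, outer_size, outer_align = layout([
    atomic("flag", 1),
    ("inner", inner_size, inner_align),  # nested structure
    atomic("ptr", 8),                    # pointer on a 64-bit platform
])
print(inner_offsets, inner_size, inner_align)
print(outer_offsets, outer_size, outer_align)
```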

Then we can support exotic alignment with a 4th column in spec:

#( (x 'char' -1 1) (y 'short' -1 1) ) "compact"

In the 3rd column, -1 or nil would mean
"use the default size rather than force a user-defined one"

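A rough sketch of how such a 4-column spec could be interpreted, written in Python for illustration only. The DEFAULT_SIZES table and compile_spec are hypothetical, 'short' and 'long' are assumed to be 2 and 4 bytes, and the choice to derive the default alignment from the type rather than from a forced size is my assumption.

```python
# Hypothetical default sizes in bytes ('short' = 2 and 'long' = 4 are assumptions).
DEFAULT_SIZES = {"char": 1, "short": 2, "long": 4, "longlong": 8}

def compile_spec(spec):
    """spec: tuples of (name, type[, size[, alignment]]).
    A size of -1 or None means "use the default size".
    A missing alignment means "alignment = default size of the type".
    Returns (offsets, total_size)."""
    offsets, offset, max_alignment = {}, 0, 1
    for name, type_name, *rest in spec:
        size = rest[0] if len(rest) > 0 else None
        if size is None or size == -1:
            size = DEFAULT_SIZES[type_name]
        alignment = rest[1] if len(rest) > 1 else DEFAULT_SIZES[type_name]
        offset = -(-offset // alignment) * alignment  # round up
        offsets[name] = offset
        offset += size
        max_alignment = max(max_alignment, alignment)
    total_size = -(-offset // max_alignment) * max_alignment
    return offsets, total_size

compact = compile_spec([("x", "char", -1, 1), ("y", "short", -1, 1)])
default = compile_spec([("x", "char"), ("y", "short")])
print(compact)
print(default)
```

With the alignment forced to 1 the short lands at offset 1 and the structure is 3 bytes; with the defaults it lands at offset 2 and the structure is 4 bytes.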
Other suggestions?


Re: Alignment of FFI ExternalStructure fields

Eliot Miranda-2
Hi Nicolas,

On Fri, Apr 13, 2018 at 10:53 AM, Nicolas Cellier <[hidden email]> wrote:
> I'd like to make our default behavior match that of the current platform
> - atomic fields of size 1,2,4,8 bytes have alignment = size
> - pointers also have alignment = pointer size (4 or 8).
> - nested structures have alignment = max of alignment of their fields
>
> Then we can support exotic alignment with a 4th column in spec:
>
> #( (x 'char' -1 1) (y 'short' -1 1) ) "compact"
>
> The 3rd column -1 or nil would mean
> "use the default size rather than force a user defined one"

Agreed.  This looks sane.  As far as I am aware, this fits all our currently supported platforms, right?
 
> Other suggestions?

This isn't another suggestion.  It is more an extension of what you propose.

We could somehow try to maintain 64-bit and 32-bit versions of each type.  Imagine adding an instance variable to ExternalType, say compiledSpecs, which is a dictionary from layout name (#default32, #default64, #dos32, etc.) to the compiledSpec for that layout.  Then provide a realign method that, on start-up, selects the correct compiledSpec for the current platform and, if it differs from the type's currentSpec inst var, collects the instances of the referent class and realigns their contents.
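
To make the idea concrete, here is a small sketch in Python. Everything in it is hypothetical: the COMPILED_SPECS table stands in for the compiledSpecs dictionary, the structure is a char 'tag' followed by a size_t 'length', and the realign function assumes little-endian unsigned fields whose values fit in the new size.

```python
import struct

# Hypothetical per-layout specs: field name -> (offset, size in bytes).
COMPILED_SPECS = {
    "default32": {"size": 8, "fields": {"tag": (0, 1), "length": (4, 4)}},
    "default64": {"size": 16, "fields": {"tag": (0, 1), "length": (8, 8)}},
}

def current_layout_name():
    """Select a layout name from the pointer size of the running process."""
    return "default64" if struct.calcsize("P") == 8 else "default32"

def realign(data, old_spec, new_spec):
    """Copy every field from the old layout to the new one.
    Assumes little-endian unsigned fields whose values fit in the new size."""
    out = bytearray(new_spec["size"])
    for name, (new_offset, new_size) in new_spec["fields"].items():
        old_offset, old_size = old_spec["fields"][name]
        value = int.from_bytes(data[old_offset:old_offset + old_size], "little")
        out[new_offset:new_offset + new_size] = value.to_bytes(new_size, "little")
    return bytes(out)

saved = bytes([7, 0, 0, 0, 4, 3, 2, 1])  # instance saved under default32
moved = realign(saved, COMPILED_SPECS["default32"], COMPILED_SPECS["default64"])
print(current_layout_name(), moved.hex())
```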

This needs some thought because the accessors are in the referentClass.  If the ExternalType also stored the accessor methods for each field, it could swap these in and out as required on start up as it did the realignment.  We could use a CompiledMethodTrailer that encoded source in the method, which would avoid any issues with source being compacted, etc.  The methods could be lazily generated.

So the referent class would be auto-generated from a typedef above.  I would also consider writing a simple compiler from typedefs to specs (redefined to support the alignment you want).  While Bert's suggestion of using the tool that outputs XML is a good one, it is yet another external dependency.  I favor running the C compiler with the -P argument (which runs the preprocessor and outputs the preprocessor output; on GCC and Clang the option that does this is -E) and then writing a C parser that only parses type and function declarations, skipping over C bodies.  The C syntax is quite simple.  It's the preprocessor, pragma extensions, etc. that complicate things.  The fatal flaw in DLLCC was to try to parse arbitrary input files, hence having to handle preprocessing, but C's preprocessor knows no structure, and hence it's enormously difficult to parse pre-preprocessed C.  But preprocessed C is much more manageable.  We could provide tooling to auto-generate interfaces derived from preprocessed header files.  But we would not attempt to support macros, leaving it up to the programmer to convert any macros into methods by hand.

As far as importing the values of #defines, the approach I've outlined before which Monty is having a look at, where we generate a program that when run, prints easily parseable definitions, e.g. as STON, is a good one.  Someone working on the FFI can be expected to have a C compiler available, which means we can preprocess and compile programs.  So limiting ourselves to what we can do simply with these two is a good idea.  A full blown C parser is, as DLLCC demonstrates, a practical impossibility.
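
A minimal sketch of that approach, in Python for illustration. It only generates the C source of the probe program and parses its output; compiling and running the probe is left to the caller. The function names are made up, the macros are assumed to be integer-valued, and plain "NAME value" lines stand in for STON.

```python
def define_probe_source(header, names):
    """Return C source that prints one 'NAME value' line per integer macro."""
    lines = [
        "#include <stdio.h>",
        f"#include <{header}>",
        "",
        "int main(void)",
        "{",
    ]
    for name in names:
        # The cast to long assumes the macro expands to an integer constant.
        lines.append(f'    printf("{name} %ld\\n", (long){name});')
    lines += ["    return 0;", "}", ""]
    return "\n".join(lines)

def parse_probe_output(text):
    """Turn the 'NAME value' lines printed by the probe into a dictionary."""
    pairs = (line.split() for line in text.splitlines() if line.strip())
    return {name: int(value) for name, value in pairs}

source = define_probe_source("errno.h", ["ENOENT", "EINTR"])
print(source)
```

The values then come from the platform's own compiler and headers, so nothing about macro expansion has to be reimplemented on the Smalltalk side.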

_,,,^..^,,,_
best, Eliot


Re: Alignment of FFI ExternalStructure fields

Nicolas Cellier


2018-04-13 21:58 GMT+02:00 Eliot Miranda <[hidden email]>:

> We could somehow try to maintain 64-bit and 32-bit versions of each type.  Imagine adding an instance variable to ExternalType, say compiledSpecs, which is a dictionary from layout name (#default32, #default64, #dos32, etc.) to the compiledSpec for that layout.  Then provide a realign method that, on start-up, selects the correct compiledSpec for the current platform and, if it differs from the type's currentSpec inst var, collects the instances of the referent class and realigns their contents.


With http://source.squeak.org/FFI/FFI-Kernel-nice.49.diff I am now able to detect a platform change and recompile the compiledSpec at startup.
I expect platform changes to be rare, so carrying the specs for several platforms seems an unnecessary optimization to me.

> This needs some thought because the accessors are in the referentClass.  If the ExternalType also stored the accessor methods for each field, it could swap these in and out as required on start up as it did the realignment.  We could use a CompiledMethodTrailer that encoded source in the method, which would avoid any issues with source being compacted, etc.  The methods could be lazily generated.


The choice I made was to regenerate an accessor if and only if it was already auto-generated.
Auto-generated accessors are marked with a <generated> annotation, which I find simple enough.

I must also see what I can do to regenerate them when installing the code in the image (more changes to come).
Sure, some <generated> methods will be marked as modified in the MC browser, but that is the least of my problems for now.

Alternatively, Leandro suggested keeping the offsets in some pool dictionary and just changing their values on a platform switch.
I did not follow this route because it required deeper changes, but it makes sense.
It also means that we would have to support machine-dependent accessors that we do not have yet
(native 'unsigned long' and 'size_t' for example, not our machine-independent unsignedLongAt:, which is more of a uint32At:).
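
To illustrate why native 'unsigned long' and 'size_t' need machine-dependent accessors, here is a short Python sketch that asks ctypes for the sizes on the running platform. The native_sizes and data_model helpers are made up for illustration; the data model names are the conventional ones.

```python
import ctypes

def native_sizes():
    """Sizes in bytes of a fixed-width type and of some native C types."""
    return {
        "uint32": ctypes.sizeof(ctypes.c_uint32),
        "unsigned long": ctypes.sizeof(ctypes.c_ulong),
        "size_t": ctypes.sizeof(ctypes.c_size_t),
        "pointer": ctypes.sizeof(ctypes.c_void_p),
    }

def data_model(sizes):
    """Name the conventional data model from the long and pointer sizes."""
    if sizes["pointer"] == 4:
        return "ILP32"
    return "LP64" if sizes["unsigned long"] == 8 else "LLP64"

sizes = native_sizes()
print(sizes, data_model(sizes))
```

A uint32 accessor is always 4 bytes wide, whereas 'unsigned long' is 4 bytes under ILP32 and LLP64 but 8 bytes under LP64, so a single fixed-width accessor cannot serve both.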

> So the referent class would be auto-generated from a typedef above.  I would also consider writing a simple compiler from typedefs to specs (redefined to support the alignment you want).  While Bert's suggestion of using the tool that outputs XML is a good one, it is yet another external dependency.  I favor running the C compiler with the -P argument (which runs the preprocessor and outputs the preprocessor output; on GCC and Clang the option that does this is -E) and then writing a C parser that only parses type and function declarations, skipping over C bodies.  The C syntax is quite simple.  It's the preprocessor, pragma extensions, etc. that complicate things.  The fatal flaw in DLLCC was to try to parse arbitrary input files, hence having to handle preprocessing, but C's preprocessor knows no structure, and hence it's enormously difficult to parse pre-preprocessed C.  But preprocessed C is much more manageable.  We could provide tooling to auto-generate interfaces derived from preprocessed header files.  But we would not attempt to support macros, leaving it up to the programmer to convert any macros into methods by hand.


Yes that sounds interesting. Let the right tool do the right job.
That could also mean that we have to make use of offsetof in order to let the C compiler deal with its own specific alignment pragmas, and avoid re-inventing the wheel.
So it might be a multi-stage or interactive build between Smalltalk and C compiler.
We also have to pass the proper compiler options -DThisOrThat to the pre-processor etc... (as produced from some configure madness) but that can be decoupled.
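
A sketch of what the first stage of such a build could look like, in Python for illustration. It generates a C probe that prints sizeof and offsetof values and builds a compiler command line carrying the -D options. The function names, the 'point.h' header and the 'cc' driver name are assumptions; running the command is left out.

```python
def offset_probe_source(header, struct_name, fields):
    """Return C source printing the size of a struct and each field offset."""
    lines = [
        "#include <stddef.h>",
        "#include <stdio.h>",
        f'#include "{header}"',
        "",
        "int main(void)",
        "{",
        f'    printf("sizeof %lu\\n", (unsigned long)sizeof(struct {struct_name}));',
    ]
    for field in fields:
        lines.append(
            f'    printf("{field} %lu\\n", '
            f"(unsigned long)offsetof(struct {struct_name}, {field}));"
        )
    lines += ["    return 0;", "}", ""]
    return "\n".join(lines)

def compile_command(source_path, output_path, defines=()):
    """Command line for the probe; 'cc' as the compiler driver is an assumption."""
    return ["cc", *[f"-D{name}" for name in defines], "-o", output_path, source_path]

probe = offset_probe_source("point.h", "point", ["x", "y"])
command = compile_command("probe.c", "probe", ["ThisOrThat"])
print(probe)
print(command)
```

Since the compiler itself reports the offsets, any packing pragma in the header is honored without the Smalltalk side knowing about it.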

Sometimes we don't have to handle a type directly, but can just consider it opaque.
A handle is returned by an init function and then passed as a parameter to every other function.
We never need to interfere with its contents: they are implementation details.
FILE * is a good example of this: we don't want to import its machine-specific definition.

If all the functions that we need to import use a FILE *, then I do not need to import FILE.
The chance that there is an #include <stdio.h> somewhere is high!
So I foresee the need to select the minimum definition that could possibly work, probably something user-driven through a GUI, as in DLLCC.
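
For comparison, this is how an opaque FILE * looks in Python's ctypes: a structure class with no fields declared, used only through pointers. It is just an illustration of the idea, and the declare_stdio helper is hypothetical; it expects an already loaded C library object.

```python
import ctypes

class FILE(ctypes.Structure):
    """Opaque type: no _fields_ are declared, so its contents are never read."""

FILE_p = ctypes.POINTER(FILE)

def declare_stdio(libc):
    """Attach prototypes to an already loaded C library (hypothetical helper).
    Only the pointer type is needed, never the layout of FILE itself."""
    libc.fopen.argtypes = [ctypes.c_char_p, ctypes.c_char_p]
    libc.fopen.restype = FILE_p
    libc.fclose.argtypes = [FILE_p]
    libc.fclose.restype = ctypes.c_int
    return libc

print(ctypes.sizeof(FILE_p))
```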

Also, there is a difference between having a C compiler on one platform and having to support the differences between C compilers and headers on every supported platform.
I hope that we can abstract the small differences at an upper level with a top-down approach (like with LLP64/ILP64 considerations), rather than addressing the problem from the bottom up with a C preprocessor, because bottom-up means kicking many bottoms in this case ;)

> As far as importing the values of #defines, the approach I've outlined before which Monty is having a look at, where we generate a program that when run, prints easily parseable definitions, e.g. as STON, is a good one.  Someone working on the FFI can be expected to have a C compiler available, which means we can preprocess and compile programs.  So limiting ourselves to what we can do simply with these two is a good idea.  A full blown C parser is, as DLLCC demonstrates, a practical impossibility.


Agreed, I've had to patch the DLLCC parser several times in the past (longlong support, etc.).
And now there are those C++ libraries with just an extern "C" {...} interface, a whole world of additional complexity...
I generally use the DLLCC parser on manually patched headers...
 