FFI (Plugin) | Question about multi-dimensional arrays (e.g., char**, int**, void*****...)


marcel.taeumel
Hi all!

For some days now, I have been thinking about "pointer-to-pointer" types. I suppose that the more general topic is the interpretation of multi-dimensional arrays.

On the one hand, this is an in-image issue to correctly read ExternalData and to generate appropriate accessors for struct fields.

On the other hand, this may affect FFI call's arg coercing and return-type packaging.

So, after playing around with an implementation of pointer-to-pointer types via ExternalType's #referencedType (i.e. making it a chain of three: ... -> void -> void* -> void** -> void ...), I figured that we might need to store the array dimensions in the compiledSpec.

Here are some unused bits for that:

FFIFlagPointerArrayMask := 16rF00000. "up to 16-dimensional arrays"
FFIFlagPointerArrayShift := 20.
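
For readers more at home in C, here is a minimal sketch of how a 4-bit field defined by that mask and shift could be read and written. The mask and shift values are taken from the post; the helper names are hypothetical and say nothing about how the plugin actually lays out the rest of the spec word.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the post above. */
#define FFI_FLAG_POINTER_ARRAY_MASK  0xF00000u
#define FFI_FLAG_POINTER_ARRAY_SHIFT 20

/* Hypothetical helper: extract the 4-bit field from a spec word. */
static unsigned pointer_array_field(uint32_t spec_word)
{
    return (spec_word & FFI_FLAG_POINTER_ARRAY_MASK) >> FFI_FLAG_POINTER_ARRAY_SHIFT;
}

/* Hypothetical helper: store a value in the field, leaving all other bits alone. */
static uint32_t with_pointer_array_field(uint32_t spec_word, unsigned value)
{
    assert(value <= 15u); /* four bits can hold 0..15 */
    return (spec_word & ~FFI_FLAG_POINTER_ARRAY_MASK)
         | ((uint32_t)value << FFI_FLAG_POINTER_ARRAY_SHIFT);
}
```

Four bits store the values 0 to 15, so whether that means "up to 15" or "up to 16" levels depends on whether zero is given a meaning of its own.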

Here are some questions for you on this topic:

- Should we reserve all 4 bits? 16-dimensional arrays sound like overkill ... 2 bits could be enough, with the lower one rarely being set for char** ...
- In the image, would you store all possible versions through referencedType in a linked cycle? Or would you create them lazily, on the fly, as requested from struct fields and FFI call specs via ExternalType class >> #typeNamed: ?
- In the FFI plugin, do you see value in coercing an array of, for example, IntegerArrays for an FFI call? (int**)
- In the FFI plugin, do you see value in automatically packaging returned objects if those dimensions would be zero-terminated?

Best,
Marcel


Re: FFI (Plugin) | Question about multi-dimensional arrays (e.g., char**, int**, void*****...)

Jakob Reschke
Hi Marcel,

Do you think that the dimensions are always known? I understood that you also need to know the cardinality per dimension of the involved arrays. If you want to remember only the number of dimensions, some of my remarks may not apply. Note that an int** is not a two-dimensional array int[x][y], so it might be misleading to speak of dimensions. From practice I know we would need to support at least three pointer indirections, not only two, at least if you don't want to require intermediate type aliases for pointers in some cases. See below.

Look at the CredRead function and the CREDENTIAL struct, for example:

The CredRead function takes essentially a CREDENTIAL** as an output parameter, so it will give you (dereferencing the passed argument) one CREDENTIAL*. It is like passing a PCREDENTIAL[1] variable, but you could well pass it a PCREDENTIAL[6] if you like, since both decay to PCREDENTIAL* when passed as arguments. Anyway, this is not a two-dimensional array of CREDENTIALs.

CredEnumerate takes a CREDENTIAL*** to give you back an array of pointers to CREDENTIAL structs. How many there are, you will only learn by dereferencing the count argument. The actual CREDENTIAL structures /could/ be laid out as a CREDENTIAL[count] array, but they don't need to be (implementation detail). So there is a one-dimensional array of pointers, there could be, but needn't be, a one-dimensional array of CREDENTIALs, but certainly there are no two- or three-dimensional arrays here.
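
The out-parameter pattern described here can be sketched in portable C. This is only an illustration under assumptions: `Item`, `enumerate_items`, `free_items` and `sum_item_ids` are hypothetical stand-ins, not the Win32 API, and the structs are deliberately allocated one by one to show that only the pointer array needs to be contiguous.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a struct such as CREDENTIAL. */
typedef struct { int id; } Item;

/* Hypothetical enumerate-style function: reports the number of results via
 * *count and hands back an array of pointers via *items.
 * Returns 0 on success, -1 on allocation failure. */
static int enumerate_items(size_t *count, Item ***items)
{
    assert(count != NULL && items != NULL);
    size_t n = 3;
    Item **array = malloc(n * sizeof *array);
    if (array == NULL) return -1;
    for (size_t i = 0; i < n; i++) {
        array[i] = malloc(sizeof **array); /* each struct is allocated separately */
        if (array[i] == NULL) {
            while (i > 0) free(array[--i]);
            free(array);
            return -1;
        }
        array[i]->id = (int)(i + 1) * 10;
    }
    *count = n;
    *items = array;
    return 0;
}

/* Releases everything enumerate_items handed out. */
static void free_items(size_t count, Item **items)
{
    for (size_t i = 0; i < count; i++) free(items[i]);
    free(items);
}

/* Caller side: the length is only known after the call has filled in count. */
static int sum_item_ids(void)
{
    size_t count = 0;
    Item **items = NULL;
    int sum = 0;
    if (enumerate_items(&count, &items) != 0) return -1;
    for (size_t i = 0; i < count; i++) sum += items[i]->id;
    free_items(count, items);
    return sum;
}
```

The three stars in the signature come from "address of a variable that holds an array of pointers", not from any three-dimensional layout.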

Kind regards,
Jakob


Re: FFI (Plugin) | Question about multi-dimensional arrays (e.g., char**, int**, void*****...)

marcel.taeumel
Hi Jakob,

> Do you think that the dimensions are always known?

Yes, how else would you be able to write an FFI interface in the first place? If an interface says "int**" and documents "can be int***" from time to time, then I hope it also gives a hint on how to find that out. Is that even possible with C compilers? From within Squeak, it's not an issue to add that extra dimension dynamically (i.e. just switch the external type for interpretation from, e.g., int** to int***) when trying to read the data. :-)

> Note that an int** is not a two-dimensional array int[x][y], so it might be misleading to speak of dimensions.

I don't think it makes a difference from the Squeak FFI perspective. Pointer arithmetic for such access is currently implemented in ExternalData >> #at: and #at:put:. I don't think we should use more new terminology than necessary.

> If you want to remember only the number of dimensions, some of my remarks may not apply.

Of course, only the number of dimensions. The length/size has to be provided somewhere else. Maybe another field in my external struct. :-) Or maybe the data is zero-terminated if the library's documentation claims so. Then I have to count manually and store the result in ExternalData >> #size. After that, I can enumerate the data.
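
The "count manually" step for a zero-terminated array of pointers looks roughly like this on the C side. It is a sketch only: `count_null_terminated` is a hypothetical helper and it assumes the library really does guarantee a NULL terminator.

```c
#include <assert.h>
#include <stddef.h>

/* Counts the entries that precede the NULL terminator.
 * The walk is linear, so the cost grows with the length of the array. */
static size_t count_null_terminated(char *const *entries)
{
    assert(entries != NULL);
    size_t n = 0;
    while (entries[n] != NULL) n++;
    return n;
}
```

The result would be the value to store as the size before enumerating the data; without a terminator or an explicit count there is nothing the walk could stop at.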

For more fun

Thank you for the examples!

Best,
Marcel

Re: FFI (Plugin) | Question about multi-dimensional arrays (e.g., char**, int**, void*****...)

Jakob Reschke
On Mon, 15 June 2020 at 10:05, Marcel Taeumel <[hidden email]> wrote:
>> Do you think that the dimensions are always known?
>
> Yes, how else would you be able to write an FFI interface in the first place? If an interface says "int**" and documents "can be int***" from time to time, then I hope it also gives a hint on how to find that out. Is that even possible with C compilers?

Having established that the sizes are not important for the "dimensions", ok. No, the number of effective stars cannot vary except through casting (i.e. pretending wrong things). That question was about knowing the sizes.

>> Note that an int** is not a two-dimensional array int[x][y], so it might be misleading to speak of dimensions.
>
> I don't think it makes a difference from the Squeak FFI perspective. Pointer arithmetic for such access is currently implemented in ExternalData >> #at: and #at:put:. I don't think we should use more new terminology than necessary.

But we should also not use wrong or misleading terminology. Is dimension really the word for "level of pointer nesting/number of pointer indirections"?

As you can see in the CredEnum example, multi-dimensional arrays and multiple nested pointers are two different things and I doubt that the FFI can really treat them the same. For example, an int a[2][3] is just syntactic sugar for int b[2 * 3], which you can access with a[1][2] to get the b[1*3 + 2] element. But for an int**p array of indirections, p[1][2] means: dereference the second pointer in my array and get the int at byte offset 2*sizeof(int) from that. In C with int a[2][3], a[1] gives you a pointer to the start of the second slice (like &a[0][1*3]), which makes it look somewhat similar to an array of pointers, but that is not what it is. There is no array of pointers to the slices at &a. It is the start of the first slice. So I suppose the FFI has to access it differently.

Consequently you cannot correctly pass an int a[2][3] as an int**, which I learned just yesterday. Never ever say "arrays are just pointers in C" again. :-)
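
The difference between the two access paths can be made concrete with a small C sketch. The function names are hypothetical and the 2x3 shape is taken from the example above; the copy into `b` is only there to show that the array occupies one contiguous block of six ints.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Reads element [row][col] from a true two-dimensional array with 3 columns. */
static int read_from_array(int (*a)[3], size_t row, size_t col)
{
    return a[row][col];   /* address arithmetic only, then one load */
}

/* Reads element [row][col] through an array of row pointers. */
static int read_from_pointers(int **p, size_t row, size_t col)
{
    return p[row][col];   /* load the pointer p[row], then load the int */
}

static int layouts_agree(void)
{
    int a[2][3] = { {1, 2, 3}, {4, 5, 6} };
    int b[2 * 3];
    int *rows[2] = { a[0], a[1] };   /* an explicit pointer table must be built */

    assert(sizeof a == sizeof b);    /* no hidden pointers are stored inside a */
    memcpy(b, a, sizeof a);

    return read_from_array(a, 1, 2) == 6
        && b[1 * 3 + 2] == 6
        && read_from_pointers(rows, 1, 2) == 6;
}
```

Passing `a` itself to `read_from_pointers` is what compilers typically diagnose as an incompatible pointer type, because there is no pointer table at that address to dereference.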

=> Don't treat arrays of pointers or nested pointers as multi-dimensional arrays, and therefore please reconsider using the word dimension here, unless it is well-understood and established to also describe the number of indirections.
 

>> If you want to remember only the number of dimensions, some of my remarks may not apply.
>
> Of course, only the number of dimensions. The length/size has to be provided somewhere else. Maybe another field in my external struct. :-) Or maybe the data is zero-terminated if the library's documentation claims so. Then I have to count manually and store the result in ExternalData >> #size. After that, I can enumerate the data.

Since there is a lot of "flexibility" here, I suppose the FFI can only help with some of the more common patterns that you mentioned. :-) But of course, the FFI must not presuppose that any of the patterns is used unless that is explicitly declared in some way.

Note that eager counting can be costly. Consider a 1 GB null-terminated char[]. ;-)


Re: FFI (Plugin) | Question about multi-dimensional arrays (e.g., char**, int**, void*****...)

marcel.taeumel
Hi Jakob.

> Consequently you cannot correctly pass an int a[2][3] as an int**, which I learned just yesterday. Never ever say "arrays are just pointers in C" again. :-)

Got it! :-) A two-dimensional array (i.e. int[][]) occupies a single contiguous block of memory, while an array of pointers (i.e. int**) has an extra level of indirection and is likely to point to multiple (contiguous) memory blocks, one for each "entry" or "array" (here: int* or int[]). Hmm... only for int* vs int[] it does not matter. Hmm...

> multi-dimensional arrays and multiple nested pointers are two different things and I doubt that the FFI can really treat them the same.

In an FFI call, for example, an IntegerArray can be coerced to int*, which includes n-dimensional arrays because that memory is contiguous anyway. If a function expects int** as one argument ... you may try to give a pointer address via ByteArray? I am not sure how to express "&ptr" in Squeak FFI having an ExternalAddress at hand.  ExternalAddress class >> #allocate: does already malloc(). Not sure how to create pointer "in" the heap that is not yet defined, i.e. "int *ptr;"
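
On the C side, one way to get something that plays the role of "&ptr" without a local variable is to allocate a pointer-sized cell on the heap and pass the address of that cell. The sketch below only illustrates this idea in C; `make_numbers` is a hypothetical callee, and whether Squeak FFI should offer an equivalent is exactly the open question here.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical callee: hands back a buffer through an int** out-parameter. */
static void make_numbers(int **out)
{
    static int numbers[3] = { 7, 8, 9 };
    *out = numbers;
}

/* Caller that has no "int *ptr;" variable to take the address of. */
static int first_number_via_heap_cell(void)
{
    int **cell = malloc(sizeof *cell);   /* room for exactly one int* */
    if (cell == NULL) return -1;
    *cell = NULL;                        /* the still-undefined "int *ptr;" */
    make_numbers(cell);                  /* cell plays the role of &ptr */
    assert(*cell != NULL);
    int first = (*cell)[0];              /* follow the pointer the callee stored */
    free(cell);                          /* frees the cell, not the callee's buffer */
    return first;
}
```

The allocation size is that of a pointer, not of the data, which is the detail that distinguishes this from allocating an int buffer.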

Anyway, a struct field with "int **" is more fun to think about for now. :-) Because it does not involve an FFI call and argument coercing. And comparing that to "int[][]".

What about encoding not only the ... "levels of indirection" for a pointer but also whether it is an array (= contiguous memory) such as int[][] or has pointers to follow in between such as int**?



Best,
Marcel


Re: FFI (Plugin) | Question about multi-dimensional arrays (e.g., char**, int**, void*****...)

marcel.taeumel
One more thing:

int *ptr[NUM] ... is an array of pointers of size NUM, each pointing to an int, which is like "int **ptr"

int (*ptr)[NUM] ... is a single pointer to an array of int of size NUM, which is like "int *ptr" and I think often automatically created if you pass an array into a function that expects a pointer instead of an array
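
The two declarators can be compared directly in C. This is a small sketch; `NUM`, `values`, `table` and `whole` are illustrative names and nothing in it is specific to Squeak FFI.

```c
#include <assert.h>
#include <stddef.h>

#define NUM 4

static int declarators_differ(void)
{
    int values[NUM] = { 10, 20, 30, 40 };

    int *table[NUM] = { &values[0], &values[1], &values[2], &values[3] };
    int (*whole)[NUM] = &values;

    /* table is an array: it stores NUM separate pointers. */
    assert(sizeof table == NUM * sizeof(int *));
    /* whole is one pointer: what it points to is the entire array. */
    assert(sizeof *whole == NUM * sizeof(int));

    /* Element access needs a different order of dereferencing and indexing. */
    return *table[2] == 30 && (*whole)[2] == 30;
}
```

With `table`, every element costs one extra pointer load; with `whole`, there is a single pointer and the elements sit next to each other behind it.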

;o)

Best,
Marcel

Re: FFI (Plugin) | Question about multi-dimensional arrays (e.g., char**, int**, void*****...)

marcel.taeumel
Since multi-dimensional arrays can be coerced via RawBitsArray to simple pointers, I am no longer sure that any additional information should be encoded in ExternalType's compiledSpec, especially since new state combinations in compiledSpec would yield many new instances of ExternalType, which would have to be managed in the image somehow :-/

So, how would you implement manual interpretation of pointer arrays in Squeak FFI? With a type alias. Then you have your custom subclass of ExternalStructure (or ExternalTypeAlias) in which you can then add methods to read the ExternalData you are aliasing. This would also work for mapping n-dimensional C arrays to a collection or matrix in Squeak.

How could Squeak FFI still help for such cases?

- Accept type names such as "int **" or "int[][]" in FFI-call specs and struct-field specs
- Generate "convenient" accessors for struct fields and a similar mechanism for ExternalData >> #at:(put:)

I am not sure that we can totally avoid having more instances of ExternalType to encode (pointer) indirections and maybe whether it is an array or not to be careful when following those indirections ...

I think that the plugin side cannot do anything to help here. Especially because of the unknown size for each of such indirections/dimensions.

Best,
Marcel

Re: FFI (Plugin) | Question about multi-dimensional arrays (e.g., char**, int**, void*****...)

Jakob Reschke

On Mon, 15 June 2020 at 13:22, Marcel Taeumel <[hidden email]> wrote:

> - Accept type names such as "int **" or "int[][]" in FFI-call specs and struct-field specs

One further note: int[][] is not valid C in parameter types. Only the first [] can be without length, and is equivalent to a pointer. So char*argv[] is the same as char**argv. Valid parameter type examples: int a[][3], int b[][2][3]. These are like int(*a)[3] and int(*b)[2][3] if I am not mistaken.
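
The equivalence of the two parameter spellings can be checked with a short C sketch. The function names and the `row_summer` typedef are hypothetical; the point is only that both functions have the same type after the compiler adjusts the array parameter.

```c
#include <assert.h>
#include <stddef.h>

/* Declared with array syntax; the parameter is adjusted to int (*)[3]. */
static int sum_array_syntax(int a[][3], size_t rows)
{
    assert(a != NULL);
    int sum = 0;
    for (size_t r = 0; r < rows; r++)
        for (size_t c = 0; c < 3; c++)
            sum += a[r][c];
    return sum;
}

/* Declared with pointer-to-array syntax. */
static int sum_pointer_syntax(int (*a)[3], size_t rows)
{
    assert(a != NULL);
    int sum = 0;
    for (size_t r = 0; r < rows; r++)
        for (size_t c = 0; c < 3; c++)
            sum += a[r][c];
    return sum;
}

/* Both functions fit the same function-pointer type. */
typedef int (*row_summer)(int (*)[3], size_t);

static int sums_match(void)
{
    int m[2][3] = { {1, 2, 3}, {4, 5, 6} };
    row_summer f = sum_array_syntax;
    row_summer g = sum_pointer_syntax;
    return f(m, 2) == 21 && g(m, 2) == 21;
}
```

Only the outermost length is dropped; the inner length 3 is part of the pointer type and is what makes the row arithmetic possible.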


Re: FFI (Plugin) | Question about multi-dimensional arrays (e.g., char**, int**, void*****...)

Tobias Pape

> On 15.06.2020, at 14:24, Jakob Reschke <[hidden email]> wrote:
>
>
> On Mon, 15 June 2020 at 13:22, Marcel Taeumel <[hidden email]> wrote:
>
> - Accept type names such as "int **" or "int[][]" in FFI-call specs and struct-field spec
>
> One further note: int[][] is not valid C in parameter types. Only the first [] can be without length, and is equivalent to a pointer. So char*argv[] is the same as char**argv. Valid parameter type examples: int a[][3], int b[][2][3]. These are like int(*a)[3] and int(*b)[2][3] if I am not mistaken.
>

Hello, C99:

/*      Find all paths that may contain dynamic libraries.
 *      Returns their count. libs may be NULL to get allocation size
 */
static size_t _sqo_lib_paths(size_t const n, char (*libs[n]))
{
 /*...*/
}

;)

-t

Re: FFI (Plugin) | Question about multi-dimensional arrays (e.g., char**, int**, void*****...)

marcel.taeumel
> Hello, C99:

So, "char (*libs[n])" would be equivalent to "char *libs[n]", which is an array of n pointers, each pointing to a character ... which is a null-terminated string, I suppose? Like "char **argv" or "char *argv[]" ... but with n

Best,
Marcel





Re: FFI (Plugin) | Question about multi-dimensional arrays (e.g., char**, int**, void*****...)

Tobias Pape

> On 15.06.2020, at 14:52, Marcel Taeumel <[hidden email]> wrote:
>
> > Hello, C99:
>
> So, "char (*libs[n])" would be equivalent to "char *libs[n]", which is an array of n pointers, each pointing to a character ... which is a null-terminated string, I suppose? Like "char **argv" or "char *argv[]" ... but with n

It is an array of n char-pointers (in fact, C-Strings), and n refers to an earlier variable in the parameter list...
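
A minimal sketch of how such a function could be used, assuming a made-up stand-in named list_paths and a fixed table known_paths (neither exists in the VM sources; the real _sqo_lib_paths body is elided above). It shows n being declared before libs so that the array declarator can refer to it, plus the "pass NULL to get the count" convention from the comment.

```c
#include <stddef.h>

/* Hypothetical stand-in for _sqo_lib_paths: reports a fixed set of paths. */
static const char *const known_paths[] = { "/usr/lib", "/usr/local/lib" };

/* n comes first so the array declarator of libs can refer to it.
 * As a parameter, the array of n char-pointers is adjusted to char **. */
static size_t list_paths(size_t const n, char (*libs[n]))
{
    size_t const count = sizeof known_paths / sizeof known_paths[0];
    if (libs != NULL) {
        /* Fill at most n slots, never more than we have paths. */
        for (size_t i = 0; i < n && i < count; i++)
            libs[i] = (char *)known_paths[i];
    }
    /* Always return the total, so a NULL call yields the allocation size. */
    return count;
}
```

A caller would first call list_paths(0, NULL) to learn the count, allocate that many char-pointers, and then call again with the real buffer.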

BTW: "char *libs[n]" doesn't work and won't compile ;)
-t




Re: FFI (Plugin) | Question about multi-dimensional arrays (e.g., char**, int**, void*****...)

marcel.taeumel
> BTW: "char *libs[n]" doesn't work and won't compile ;)

Hmm... what could the FFI Call spec look like for:

static size_t _sqo_lib_paths(size_t const n, char (*libs[n]))

Maybe like this:

cdeclSqoLibPaths: n with: libs
    <cdecl: size_t '_sqo_lib_paths' (size_t char**)>

Well, calling "self cdeclSqoLibPaths: 0 with: nil" would return the required size n for libs. Then you could construct a byte array with "n * ExternalType size_t byteSize" bytes so that the library can store all the pointers...

n := self cdeclSqoLibPaths: 0 with: nil.
libs := ByteArray new: n * ExternalType size_t byteSize.
self cdeclSqoLibPaths: n with: libs.

Should work. Alternatively, you can use external memory, not Squeak's object memory:

n := self cdeclSqoLibPaths: 0 with: nil.
libs := ExternalAddress allocate: n * ExternalType size_t byteSize.
self cdeclSqoLibPaths: n with: libs.

In both cases, you would need to put it into an object to start reading the data. For example, an ExternalData ... or a fitting ExternalTypeAlias:

data := ExternalData fromHandle: libs type: ExternalType void asPointerType.

Now it gets tricky. At the moment, we cannot tell ExternalData about the "char**" type, so "data size: n; do: [:each | ... ]" will not work. We can, however, do it manually:

strings := (1 to: n) collect: [:index |
   | entry |
   "Read the index-th pointer; byte offsets into libs are 1-based."
   entry := libs pointerAt: (index-1 * ExternalAddress wordSize)+1.
   entry := ExternalData fromHandle: entry type: ExternalType string "char*".
   entry fromCString].

libs class == ExternalAddress ifTrue: [libs free].

But then we have to watch out for the pointer sizes on our own.
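
For comparison, this is roughly what the manual loop does, expressed in C. It is only a sketch: pointer_at and total_length are hypothetical helpers, not part of any FFI plugin API. The byte buffer plays the role of the ByteArray, and the stride is sizeof(char *), which is the pointer size we would otherwise have to track ourselves (offsets are 0-based here, unlike the 1-based Smalltalk version).

```c
#include <stddef.h>
#include <string.h>

/* Reads the index-th (0-based) char* stored in a raw byte buffer.
 * The stride is sizeof(char *), typically 4 or 8 bytes depending on the platform. */
static char *pointer_at(const unsigned char *bytes, size_t index)
{
    char *p;
    /* memcpy avoids alignment assumptions about the byte buffer. */
    memcpy(&p, bytes + index * sizeof p, sizeof p);
    return p;
}

/* Sums the lengths of all n C strings referenced from the buffer. */
static size_t total_length(const unsigned char *bytes, size_t n)
{
    size_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += strlen(pointer_at(bytes, i));
    return total;
}
```

If ExternalData knew about "char**", this per-element offset arithmetic is what it would have to do internally for "do:".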

Maybe we can help here to improve the workflow and avoid redundant code.

Best,
Marcel
