Plugin code generation issues

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

Plugin code generation issues

Levente Uzonyi
 
Hi All,

I wrote a new plugin the other day and found two issues with code
generation.
The first one, which is less important is related to the inline pragma.
I expected that methods marked with <inline: true> will either not
be generated at all when the code generator inlines all occurrences of
them or they'll be generated as inline C functions. But they are
generated as regular static functions even though they are never used.

The other one, which is more significant is related to type casts.
I've got the following method:

rotateRight64: value by: amount

  <inline: true>
  <var: #value type: #'unsigned long long'>
  <returnTypeC: #'unsigned long long'>

  ^(value >> amount) bitOr: (value << (64 - amount))

The code using the above method has the variable declaration

  <var: #w declareC: 'unsigned long long w[80]'>

The method is used with individual elements of w:

  self rotateRight64: (w at: i - 2) by: 61

The generated function gets inlined as expected, but the elements of the
array are casted to usqInt before right shifts:

  ((((usqInt) (w[i2 - 2])) >> 61) | ((w[i2 - 2]) << (3)))

This is a problem on 32-bit platforms, because usqInt is unsigned int
there, so the upper 32-bits will be lost by the cast.

When the method is used with non-array variables, then the generator works
as expected:

  self rotateRight64: a by: 39

is generated as:

  ((a >> 39) | (a << (25)))

where a is declared as

  <var: #a type: #'unsigned long long'>

I tried to fix the issue by declaring w as

  <var: #w type: #'unsigned long long*'>

But is didn't make the usqInt casts go away.
Is it possible that the code generator doesn't know that the type of w[i]
is unsigned long long, when w is unsigned long long*?


Levente

P.S.: The code is available here
http://squeaksource.com/Cryptography/CryptographyPlugins-ul.18.mcz .
Generate SHA2Plugin to see the problem.
Reply | Threaded
Open this post in threaded view
|

Re: Plugin code generation issues

Eliot Miranda-2
 
Hi Levente,


> On Mar 9, 2020, at 8:54 AM, Levente Uzonyi <[hidden email]> wrote:
>
> Hi All,
>
> I wrote a new plugin the other day and found two issues with code generation.
> The first one, which is less important is related to the inline pragma.
> I expected that methods marked with <inline: true> will either not be generated at all when the code generator inlines all occurrences of them or they'll be generated as inline C functions. But they are generated as regular static functions even though they are never used.

Slang is not always able to inline.  For a example, inlining into conditionals is problematic. For this reason Slang treats <inline: true> as a recommendation.  If you want to insist, eliminate the static functions, and deal with any failures to inline by suitable rewrites then use <inline: #always>.

Slang also obeys <inline: #never> which is useful in other ways (for example reducing code duplication).

> The other one, which is more significant is related to type casts.
> I've got the following method:
>
> rotateRight64: value by: amount
>
>    <inline: true>
>    <var: #value type: #'unsigned long long'>
>    <returnTypeC: #'unsigned long long'>
>
>    ^(value >> amount) bitOr: (value << (64 - amount))
>
> The code using the above method has the variable declaration
>
>    <var: #w declareC: 'unsigned long long w[80]'>
>
> The method is used with individual elements of w:
>
>    self rotateRight64: (w at: i - 2) by: 61
>
> The generated function gets inlined as expected, but the elements of the array are casted to usqInt before right shifts:
>
>    ((((usqInt) (w[i2 - 2])) >> 61) | ((w[i2 - 2]) << (3)))
>
> This is a problem on 32-bit platforms, because usqInt is unsigned int there, so the upper 32-bits will be lost by the cast.
>
> When the method is used with non-array variables, then the generator works as expected:
>
>    self rotateRight64: a by: 39
>
> is generated as:
>
>    ((a >> 39) | (a << (25)))
>
> where a is declared as
>
>    <var: #a type: #'unsigned long long'>
>
> I tried to fix the issue by declaring w as
>
>    <var: #w type: #'unsigned long long*'>
>
> But is didn't make the usqInt casts go away.
> Is it possible that the code generator doesn't know that the type of w[i] is unsigned long long, when w is unsigned long long*?

I’ll need to take a look at this in detail.  For now, try using the type #usqLong and see if it makes a difference.

> Levente
>
> P.S.: The code is available here http://squeaksource.com/Cryptography/CryptographyPlugins-ul.18.mcz .
> Generate SHA2Plugin to see the problem.
Reply | Threaded
Open this post in threaded view
|

Re: Plugin code generation issues

Levente Uzonyi
 
Hi Eliot,

On Mon, 9 Mar 2020, Eliot Miranda wrote:

>
> Hi Levente,
>
>
>> On Mar 9, 2020, at 8:54 AM, Levente Uzonyi <[hidden email]> wrote:
>>
>> Hi All,
>>
>> I wrote a new plugin the other day and found two issues with code generation.
>> The first one, which is less important is related to the inline pragma.
>> I expected that methods marked with <inline: true> will either not be generated at all when the code generator inlines all occurrences of them or they'll be generated as inline C functions. But they are generated as regular static functions even though they are never used.
>
> Slang is not always able to inline.  For a example, inlining into conditionals is problematic. For this reason Slang treats <inline: true> as a recommendation.  If you want to insist, eliminate the static functions, and deal with any failures to inline by suitable rewrites then use <inline: #always>.
>
> Slang also obeys <inline: #never> which is useful in other ways (for example reducing code duplication).
Thanks. <inline: #always> does just what I need.

>
>> The other one, which is more significant is related to type casts.
>> I've got the following method:
>>
>> rotateRight64: value by: amount
>>
>>    <inline: true>
>>    <var: #value type: #'unsigned long long'>
>>    <returnTypeC: #'unsigned long long'>
>>
>>    ^(value >> amount) bitOr: (value << (64 - amount))
>>
>> The code using the above method has the variable declaration
>>
>>    <var: #w declareC: 'unsigned long long w[80]'>
>>
>> The method is used with individual elements of w:
>>
>>    self rotateRight64: (w at: i - 2) by: 61
>>
>> The generated function gets inlined as expected, but the elements of the array are casted to usqInt before right shifts:
>>
>>    ((((usqInt) (w[i2 - 2])) >> 61) | ((w[i2 - 2]) << (3)))
>>
>> This is a problem on 32-bit platforms, because usqInt is unsigned int there, so the upper 32-bits will be lost by the cast.
>>
>> When the method is used with non-array variables, then the generator works as expected:
>>
>>    self rotateRight64: a by: 39
>>
>> is generated as:
>>
>>    ((a >> 39) | (a << (25)))
>>
>> where a is declared as
>>
>>    <var: #a type: #'unsigned long long'>
>>
>> I tried to fix the issue by declaring w as
>>
>>    <var: #w type: #'unsigned long long*'>
>>
>> But is didn't make the usqInt casts go away.
>> Is it possible that the code generator doesn't know that the type of w[i] is unsigned long long, when w is unsigned long long*?
>
> I’ll need to take a look at this in detail.  For now, try using the type #usqLong and see if it makes a difference.
Unfortunately, the code behaves the same way with #usqLong.

I forgot to mention a third issue. There's no easy way to check the type
of 64-bit and 16-bit arrays. There's #isWords: for 32-bit arrays, and
#isBytes: for 8-bit arrays.
#isWordsOrBytes: is a misnomer now, because it returns true for 8, 16, 32
and 64-bit arrays.

I used the following workaround:

  ((interpreterProxy isIndexable: hashOop)
  and: [ (interpreterProxy isWordsOrBytes: hashOop)
  and: [ (interpreterProxy slotSizeOf: hashOop) = 8 ] ])
  ifFalse: [ ^interpreterProxy primitiveFailFor: PrimErrBadArgument ].

The first check is probably unnecessary.

SpurMemoryManager understands #isLong64s:, but InterpreterProxy doesn't
have that method.
Also, some implementations in InterpreterProxy may be incorrect. e.g.
#isWordsOrBytes:.


Levenete

>
>> Levente
>>
>> P.S.: The code is available here http://squeaksource.com/Cryptography/CryptographyPlugins-ul.18.mcz .
>> Generate SHA2Plugin to see the problem.
Reply | Threaded
Open this post in threaded view
|

Re: Plugin code generation issues

Nicolas Cellier
 
Hi Levente,
IMO the undue sqInt cast indeed qualifies as a BUG, we were just lucky enough to not encounter it.
(said another way, since we have not much tests, we would have corrected it only if encountered it).

This happens in #generateShiftRight:on:indent: because the type analysis is insufficient.
You can see that we care to verify if operand is already 64 bits with
    (self is64BitIntegralVariable: msgNode receiver typeInto: [:t| type := t])
Unfortunately, as the name tells, this method restrict the analysis to the case where
    node isVariable

Note how the analysis of operand type is more careful in generateShiftLeft:on:indent:
We changed it several times in 2016 with Eliot, and I finally brought current complexification circa July 2016
(some details improving handling of literal constants have changed since).

It uses more general C expression type #typeFor:in: rather than just C variable type.
Eliot wrote this more generic type analyzer and I completed it in 2016, this is posterior the #generateShiftRight:on:indent: time stamp, hence the limitations of #generateShiftRight:on:indent. As said above, we did not require a better one until now...

So it seems like we have enough material to correct the generation and make it right.
We should really pick the Unit Test initiative from Pharo Team.
A pity, a totally unfair attitude, and a total waste that it's not retro contributed...



Le lun. 9 mars 2020 à 18:27, Levente Uzonyi <[hidden email]> a écrit :
 Hi Eliot,

On Mon, 9 Mar 2020, Eliot Miranda wrote:

>
> Hi Levente,
>
>
>> On Mar 9, 2020, at 8:54 AM, Levente Uzonyi <[hidden email]> wrote:
>>
>> Hi All,
>>
>> I wrote a new plugin the other day and found two issues with code generation.
>> The first one, which is less important is related to the inline pragma.
>> I expected that methods marked with <inline: true> will either not be generated at all when the code generator inlines all occurrences of them or they'll be generated as inline C functions. But they are generated as regular static functions even though they are never used.
>
> Slang is not always able to inline.  For a example, inlining into conditionals is problematic. For this reason Slang treats <inline: true> as a recommendation.  If you want to insist, eliminate the static functions, and deal with any failures to inline by suitable rewrites then use <inline: #always>.
>
> Slang also obeys <inline: #never> which is useful in other ways (for example reducing code duplication).

Thanks. <inline: #always> does just what I need.

>
>> The other one, which is more significant is related to type casts.
>> I've got the following method:
>>
>> rotateRight64: value by: amount
>>
>>    <inline: true>
>>    <var: #value type: #'unsigned long long'>
>>    <returnTypeC: #'unsigned long long'>
>>
>>    ^(value >> amount) bitOr: (value << (64 - amount))
>>
>> The code using the above method has the variable declaration
>>
>>    <var: #w declareC: 'unsigned long long w[80]'>
>>
>> The method is used with individual elements of w:
>>
>>    self rotateRight64: (w at: i - 2) by: 61
>>
>> The generated function gets inlined as expected, but the elements of the array are casted to usqInt before right shifts:
>>
>>    ((((usqInt) (w[i2 - 2])) >> 61) | ((w[i2 - 2]) << (3)))
>>
>> This is a problem on 32-bit platforms, because usqInt is unsigned int there, so the upper 32-bits will be lost by the cast.
>>
>> When the method is used with non-array variables, then the generator works as expected:
>>
>>    self rotateRight64: a by: 39
>>
>> is generated as:
>>
>>    ((a >> 39) | (a << (25)))
>>
>> where a is declared as
>>
>>    <var: #a type: #'unsigned long long'>
>>
>> I tried to fix the issue by declaring w as
>>
>>    <var: #w type: #'unsigned long long*'>
>>
>> But is didn't make the usqInt casts go away.
>> Is it possible that the code generator doesn't know that the type of w[i] is unsigned long long, when w is unsigned long long*?
>
> I’ll need to take a look at this in detail.  For now, try using the type #usqLong and see if it makes a difference.

Unfortunately, the code behaves the same way with #usqLong.

I forgot to mention a third issue. There's no easy way to check the type
of 64-bit and 16-bit arrays. There's #isWords: for 32-bit arrays, and
#isBytes: for 8-bit arrays.
#isWordsOrBytes: is a misnomer now, because it returns true for 8, 16, 32
and 64-bit arrays.

I used the following workaround:

        ((interpreterProxy isIndexable: hashOop)
                and: [ (interpreterProxy isWordsOrBytes: hashOop)
                and: [ (interpreterProxy slotSizeOf: hashOop) = 8 ] ])
                ifFalse: [ ^interpreterProxy primitiveFailFor: PrimErrBadArgument ].

The first check is probably unnecessary.

SpurMemoryManager understands #isLong64s:, but InterpreterProxy doesn't
have that method.
Also, some implementations in InterpreterProxy may be incorrect. e.g.
#isWordsOrBytes:.


Levenete

>
>> Levente
>>
>> P.S.: The code is available here http://squeaksource.com/Cryptography/CryptographyPlugins-ul.18.mcz .
>> Generate SHA2Plugin to see the problem.
Reply | Threaded
Open this post in threaded view
|

Re: Plugin code generation issues

pierre misse
 

Hi,

I did not read this thread in detail, but might it be related to the tests that i shared ?
(by mail sadly, was not able to find the time to commit them).

Pierre

On 09/03/2020 22:21, Nicolas Cellier wrote:
 

Hi Levente,
IMO the undue sqInt cast indeed qualifies as a BUG, we were just lucky enough to not encounter it.
(said another way, since we have not much tests, we would have corrected it only if encountered it).

This happens in #generateShiftRight:on:indent: because the type analysis is insufficient.
You can see that we care to verify if operand is already 64 bits with
    (self is64BitIntegralVariable: msgNode receiver typeInto: [:t| type := t])
Unfortunately, as the name tells, this method restrict the analysis to the case where
    node isVariable

Note how the analysis of operand type is more careful in generateShiftLeft:on:indent:
We changed it several times in 2016 with Eliot, and I finally brought current complexification circa July 2016
(some details improving handling of literal constants have changed since).

It uses more general C expression type #typeFor:in: rather than just C variable type.
Eliot wrote this more generic type analyzer and I completed it in 2016, this is posterior the #generateShiftRight:on:indent: time stamp, hence the limitations of #generateShiftRight:on:indent. As said above, we did not require a better one until now...

So it seems like we have enough material to correct the generation and make it right.
We should really pick the Unit Test initiative from Pharo Team.
A pity, a totally unfair attitude, and a total waste that it's not retro contributed...



Le lun. 9 mars 2020 à 18:27, Levente Uzonyi <[hidden email]> a écrit :
 Hi Eliot,

On Mon, 9 Mar 2020, Eliot Miranda wrote:

>
> Hi Levente,
>
>
>> On Mar 9, 2020, at 8:54 AM, Levente Uzonyi <[hidden email]> wrote:
>>
>> Hi All,
>>
>> I wrote a new plugin the other day and found two issues with code generation.
>> The first one, which is less important is related to the inline pragma.
>> I expected that methods marked with <inline: true> will either not be generated at all when the code generator inlines all occurrences of them or they'll be generated as inline C functions. But they are generated as regular static functions even though they are never used.
>
> Slang is not always able to inline.  For a example, inlining into conditionals is problematic. For this reason Slang treats <inline: true> as a recommendation.  If you want to insist, eliminate the static functions, and deal with any failures to inline by suitable rewrites then use <inline: #always>.
>
> Slang also obeys <inline: #never> which is useful in other ways (for example reducing code duplication).

Thanks. <inline: #always> does just what I need.

>
>> The other one, which is more significant is related to type casts.
>> I've got the following method:
>>
>> rotateRight64: value by: amount
>>
>>    <inline: true>
>>    <var: #value type: #'unsigned long long'>
>>    <returnTypeC: #'unsigned long long'>
>>
>>    ^(value >> amount) bitOr: (value << (64 - amount))
>>
>> The code using the above method has the variable declaration
>>
>>    <var: #w declareC: 'unsigned long long w[80]'>
>>
>> The method is used with individual elements of w:
>>
>>    self rotateRight64: (w at: i - 2) by: 61
>>
>> The generated function gets inlined as expected, but the elements of the array are casted to usqInt before right shifts:
>>
>>    ((((usqInt) (w[i2 - 2])) >> 61) | ((w[i2 - 2]) << (3)))
>>
>> This is a problem on 32-bit platforms, because usqInt is unsigned int there, so the upper 32-bits will be lost by the cast.
>>
>> When the method is used with non-array variables, then the generator works as expected:
>>
>>    self rotateRight64: a by: 39
>>
>> is generated as:
>>
>>    ((a >> 39) | (a << (25)))
>>
>> where a is declared as
>>
>>    <var: #a type: #'unsigned long long'>
>>
>> I tried to fix the issue by declaring w as
>>
>>    <var: #w type: #'unsigned long long*'>
>>
>> But is didn't make the usqInt casts go away.
>> Is it possible that the code generator doesn't know that the type of w[i] is unsigned long long, when w is unsigned long long*?
>
> I’ll need to take a look at this in detail.  For now, try using the type #usqLong and see if it makes a difference.

Unfortunately, the code behaves the same way with #usqLong.

I forgot to mention a third issue. There's no easy way to check the type
of 64-bit and 16-bit arrays. There's #isWords: for 32-bit arrays, and
#isBytes: for 8-bit arrays.
#isWordsOrBytes: is a misnomer now, because it returns true for 8, 16, 32
and 64-bit arrays.

I used the following workaround:

        ((interpreterProxy isIndexable: hashOop)
                and: [ (interpreterProxy isWordsOrBytes: hashOop)
                and: [ (interpreterProxy slotSizeOf: hashOop) = 8 ] ])
                ifFalse: [ ^interpreterProxy primitiveFailFor: PrimErrBadArgument ].

The first check is probably unnecessary.

SpurMemoryManager understands #isLong64s:, but InterpreterProxy doesn't
have that method.
Also, some implementations in InterpreterProxy may be incorrect. e.g.
#isWordsOrBytes:.


Levenete

>
>> Levente
>>
>> P.S.: The code is available here http://squeaksource.com/Cryptography/CryptographyPlugins-ul.18.mcz .
>> Generate SHA2Plugin to see the problem.
Reply | Threaded
Open this post in threaded view
|

Re: Plugin code generation issues

Eliot Miranda-2
 
Hi Pierre,


On Mar 9, 2020, at 11:30 PM, hogoww <[hidden email]> wrote:



Hi,

I did not read this thread in detail, but might it be related to the tests that i shared ?
(by mail sadly, was not able to find the time to commit them).


I have them; thank you.  As soon as time allows I’ll try and integrate them. They are much appreciated.

Pierre

On 09/03/2020 22:21, Nicolas Cellier wrote:
 

Hi Levente,
IMO the undue sqInt cast indeed qualifies as a BUG, we were just lucky enough to not encounter it.
(said another way, since we have not much tests, we would have corrected it only if encountered it).

This happens in #generateShiftRight:on:indent: because the type analysis is insufficient.
You can see that we care to verify if operand is already 64 bits with
    (self is64BitIntegralVariable: msgNode receiver typeInto: [:t| type := t])
Unfortunately, as the name tells, this method restrict the analysis to the case where
    node isVariable

Note how the analysis of operand type is more careful in generateShiftLeft:on:indent:
We changed it several times in 2016 with Eliot, and I finally brought current complexification circa July 2016
(some details improving handling of literal constants have changed since).

It uses more general C expression type #typeFor:in: rather than just C variable type.
Eliot wrote this more generic type analyzer and I completed it in 2016, this is posterior the #generateShiftRight:on:indent: time stamp, hence the limitations of #generateShiftRight:on:indent. As said above, we did not require a better one until now...

So it seems like we have enough material to correct the generation and make it right.
We should really pick the Unit Test initiative from Pharo Team.
A pity, a totally unfair attitude, and a total waste that it's not retro contributed...



Le lun. 9 mars 2020 à 18:27, Levente Uzonyi <[hidden email]> a écrit :
 Hi Eliot,

On Mon, 9 Mar 2020, Eliot Miranda wrote:

>
> Hi Levente,
>
>
>> On Mar 9, 2020, at 8:54 AM, Levente Uzonyi <[hidden email]> wrote:
>>
>> Hi All,
>>
>> I wrote a new plugin the other day and found two issues with code generation.
>> The first one, which is less important is related to the inline pragma.
>> I expected that methods marked with <inline: true> will either not be generated at all when the code generator inlines all occurrences of them or they'll be generated as inline C functions. But they are generated as regular static functions even though they are never used.
>
> Slang is not always able to inline.  For a example, inlining into conditionals is problematic. For this reason Slang treats <inline: true> as a recommendation.  If you want to insist, eliminate the static functions, and deal with any failures to inline by suitable rewrites then use <inline: #always>.
>
> Slang also obeys <inline: #never> which is useful in other ways (for example reducing code duplication).

Thanks. <inline: #always> does just what I need.

>
>> The other one, which is more significant is related to type casts.
>> I've got the following method:
>>
>> rotateRight64: value by: amount
>>
>>    <inline: true>
>>    <var: #value type: #'unsigned long long'>
>>    <returnTypeC: #'unsigned long long'>
>>
>>    ^(value >> amount) bitOr: (value << (64 - amount))
>>
>> The code using the above method has the variable declaration
>>
>>    <var: #w declareC: 'unsigned long long w[80]'>
>>
>> The method is used with individual elements of w:
>>
>>    self rotateRight64: (w at: i - 2) by: 61
>>
>> The generated function gets inlined as expected, but the elements of the array are casted to usqInt before right shifts:
>>
>>    ((((usqInt) (w[i2 - 2])) >> 61) | ((w[i2 - 2]) << (3)))
>>
>> This is a problem on 32-bit platforms, because usqInt is unsigned int there, so the upper 32-bits will be lost by the cast.
>>
>> When the method is used with non-array variables, then the generator works as expected:
>>
>>    self rotateRight64: a by: 39
>>
>> is generated as:
>>
>>    ((a >> 39) | (a << (25)))
>>
>> where a is declared as
>>
>>    <var: #a type: #'unsigned long long'>
>>
>> I tried to fix the issue by declaring w as
>>
>>    <var: #w type: #'unsigned long long*'>
>>
>> But is didn't make the usqInt casts go away.
>> Is it possible that the code generator doesn't know that the type of w[i] is unsigned long long, when w is unsigned long long*?
>
> I’ll need to take a look at this in detail.  For now, try using the type #usqLong and see if it makes a difference.

Unfortunately, the code behaves the same way with #usqLong.

I forgot to mention a third issue. There's no easy way to check the type
of 64-bit and 16-bit arrays. There's #isWords: for 32-bit arrays, and
#isBytes: for 8-bit arrays.
#isWordsOrBytes: is a misnomer now, because it returns true for 8, 16, 32
and 64-bit arrays.

I used the following workaround:

        ((interpreterProxy isIndexable: hashOop)
                and: [ (interpreterProxy isWordsOrBytes: hashOop)
                and: [ (interpreterProxy slotSizeOf: hashOop) = 8 ] ])
                ifFalse: [ ^interpreterProxy primitiveFailFor: PrimErrBadArgument ].

The first check is probably unnecessary.

SpurMemoryManager understands #isLong64s:, but InterpreterProxy doesn't
have that method.
Also, some implementations in InterpreterProxy may be incorrect. e.g.
#isWordsOrBytes:.


Levenete

>
>> Levente
>>
>> P.S.: The code is available here http://squeaksource.com/Cryptography/CryptographyPlugins-ul.18.mcz .
>> Generate SHA2Plugin to see the problem.