Primitive 40 (asFloat) fails for me

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
14 messages Options
Reply | Threaded
Open this post in threaded view
|

Primitive 40 (asFloat) fails for me

Christian Kellermann
 
Hi,

I am currently using the latest VM with commit
f9ae4a1479122b448fcfdc80bb2caa09b53aa474

On a Squeak image:

-----
/home/ckeen/Sync/src/Muffin/Squeak5.2-18225-64bit.dependents.image
Squeak5.2
latest update: #18231
Current Change Set: Unnamed1
Image format 68021 (64 bit)

Virtual Machine
---------------
/home/ckeen/src/playground/opensmalltalk-vm/products/sqcogspur64linuxht/lib/squeak/5.0-201902181742/squeak
Croquet Closure Cog[Spur] VM [CoInterpreterPrimitives VMMaker.oscog-eem.2509]
Unix built on Feb 19 2019 08:58:47 Compiler: 8.2.1 20181127
platform sources revision VM: 201902181742 ckeen@blackbeard:src/playground/opensmalltalk-vm Date: Mon Feb 18 09:42:23 2019 CommitHash: f9ae4a147 Plugins: 201902181742 ckeen@blackbeard:src/playground/opensmalltalk-vm
CoInterpreter VMMaker.oscog-eem.2509 uuid: 91e81f64-95de-4914-a960-8f842be3a194 Feb 19 2019
StackToRegisterMappingCogit VMMaker.oscog-eem.2509 uuid: 91e81f64-95de-4914-a960-8f842be3a194 Feb 19 2019


-100 asFloat returns 100 for me. Can anyone confirm this?

The Float tests from the Kernel-Number category also fail.
Is my image out of sync with the VM?

Kind regards,

Christian

--
May you be peaceful, may you live in safety, may you be free from
suffering, and may you live with ease.

signature.asc (817 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Primitive 40 (asFloat) fails for me

Tobias Pape
 
Hi,

> On 19.02.2019, at 12:25, Christian Kellermann <[hidden email]> wrote:
>
> Hi,
>
> I am currently using the latest VM with commit
> f9ae4a1479122b448fcfdc80bb2caa09b53aa474
>
> On a Squeak image:
>
> -----
> /home/ckeen/Sync/src/Muffin/Squeak5.2-18225-64bit.dependents.image
> Squeak5.2
> latest update: #18231
> Current Change Set: Unnamed1
> Image format 68021 (64 bit)
>
> Virtual Machine
> ---------------
> /home/ckeen/src/playground/opensmalltalk-vm/products/sqcogspur64linuxht/lib/squeak/5.0-201902181742/squeak
> Croquet Closure Cog[Spur] VM [CoInterpreterPrimitives VMMaker.oscog-eem.2509]
> Unix built on Feb 19 2019 08:58:47 Compiler: 8.2.1 20181127
> platform sources revision VM: 201902181742 ckeen@blackbeard:src/playground/opensmalltalk-vm Date: Mon Feb 18 09:42:23 2019 CommitHash: f9ae4a147 Plugins: 201902181742 ckeen@blackbeard:src/playground/opensmalltalk-vm
> CoInterpreter VMMaker.oscog-eem.2509 uuid: 91e81f64-95de-4914-a960-8f842be3a194 Feb 19 2019
> StackToRegisterMappingCogit VMMaker.oscog-eem.2509 uuid: 91e81f64-95de-4914-a960-8f842be3a194 Feb 19 2019
What Platform [os/cpu] is this  on?

Best regards
        -Tobias

>
>
> -100 asFloat returns 100 for me. Can anyone confirm this?
>
> The Float tests from the Kernel-Number category also fail.
> Is my image out of sync with the VM?
>
> Kind regards,
>
> Christian
>
> --
> May you be peaceful, may you live in safety, may you be free from
> suffering, and may you live with ease.


signature.asc (1K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Primitive 40 (asFloat) fails for me

Christian Kellermann
 
* Tobias Pape <[hidden email]> [190219 17:02]:
>  

> Hi,
>
> > On 19.02.2019, at 12:25, Christian Kellermann <[hidden email]> wrote:
> >
> > Hi,
> >
> > I am currently using the latest VM with commit
> > f9ae4a1479122b448fcfdc80bb2caa09b53aa474
> >
> > On a Squeak image:
> >
> > -----
> > /home/ckeen/Sync/src/Muffin/Squeak5.2-18225-64bit.dependents.image
> > Squeak5.2
> > latest update: #18231
> > Current Change Set: Unnamed1
> > Image format 68021 (64 bit)
> >
> > Virtual Machine
> > ---------------
> > /home/ckeen/src/playground/opensmalltalk-vm/products/sqcogspur64linuxht/lib/squeak/5.0-201902181742/squeak
> > Croquet Closure Cog[Spur] VM [CoInterpreterPrimitives VMMaker.oscog-eem.2509]
> > Unix built on Feb 19 2019 08:58:47 Compiler: 8.2.1 20181127
> > platform sources revision VM: 201902181742 ckeen@blackbeard:src/playground/opensmalltalk-vm Date: Mon Feb 18 09:42:23 2019 CommitHash: f9ae4a147 Plugins: 201902181742 ckeen@blackbeard:src/playground/opensmalltalk-vm
> > CoInterpreter VMMaker.oscog-eem.2509 uuid: 91e81f64-95de-4914-a960-8f842be3a194 Feb 19 2019
> > StackToRegisterMappingCogit VMMaker.oscog-eem.2509 uuid: 91e81f64-95de-4914-a960-8f842be3a194 Feb 19 2019
>
> What Platform [os/cpu] is this  on?

Linux/amd64 sorry for omitting this...


--
May you be peaceful, may you live in safety, may you be free from
suffering, and may you live with ease.

signature.asc (817 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Primitive 40 (asFloat) fails for me

Levente Uzonyi
In reply to this post by Christian Kellermann
 
I've seen this bug before[1] when building the VM with gcc 7/8.
Probably undefined behavior somewhere, but I didn't have time to find the
cause.

Levente

[1] http://forum.world.st/ftlo-option-link-time-optimizer-tp5092657p5092693.html

On Tue, 19 Feb 2019, Christian Kellermann wrote:

Reply | Threaded
Open this post in threaded view
|

Re: Primitive 40 (asFloat) fails for me

Eliot Miranda-2
In reply to this post by Christian Kellermann
 
Hi Christian,

On Tue, Feb 19, 2019 at 8:05 AM Christian Kellermann <[hidden email]> wrote:
 * Tobias Pape <[hidden email]> [190219 17:02]:


> Hi,
>
> > On 19.02.2019, at 12:25, Christian Kellermann <[hidden email]> wrote:
> >
> > Hi,
> >
> > I am currently using the latest VM with commit
> > f9ae4a1479122b448fcfdc80bb2caa09b53aa474
> >
> > On a Squeak image:
> >
> > -----
> > /home/ckeen/Sync/src/Muffin/Squeak5.2-18225-64bit.dependents.image
> > Squeak5.2
> > latest update: #18231
> > Current Change Set: Unnamed1
> > Image format 68021 (64 bit)
> >
> > Virtual Machine
> > ---------------
> > /home/ckeen/src/playground/opensmalltalk-vm/products/sqcogspur64linuxht/lib/squeak/5.0-201902181742/squeak
> > Croquet Closure Cog[Spur] VM [CoInterpreterPrimitives VMMaker.oscog-eem.2509]
> > Unix built on Feb 19 2019 08:58:47 Compiler: 8.2.1 20181127
> > platform sources revision VM: 201902181742 ckeen@blackbeard:src/playground/opensmalltalk-vm Date: Mon Feb 18 09:42:23 2019 CommitHash: f9ae4a147 Plugins: 201902181742 ckeen@blackbeard:src/playground/opensmalltalk-vm
> > CoInterpreter VMMaker.oscog-eem.2509 uuid: 91e81f64-95de-4914-a960-8f842be3a194 Feb 19 2019
> > StackToRegisterMappingCogit VMMaker.oscog-eem.2509 uuid: 91e81f64-95de-4914-a960-8f842be3a194 Feb 19 2019
>
> What Platform [os/cpu] is this  on?


Linux/amd64 sorry for omitting this...

and you report that -100 asFloat answers 100.0.

Can you please try the following experiments?  

First what is the result of

    | floats |
    floats := Array new: 30.
    1 to: floats size do: [:i| floats at: i put: -100 asFloat].
    floats

?

If the result has the first several floats as 100.0 and the last few as -100.0 then...

Edit spur64src/vm/gcc3x-cointerp.c so that

static void
primitiveAsFloat(void)
{   DECL_MAYBE_SQ_GLOBAL_STRUCT
    sqInt rcvr;
    char *sp;

    rcvr = longAt(GIV(stackPointer));
    assert((((rcvr) & 7) == 1));
    /* begin pop:thenPushFloat: */
    longAtput((sp = GIV(stackPointer) + ((1 - 1) * BytesPerWord)), floatObjectOf(((double) ((rcvr >> 3)) )));
    GIV(stackPointer) = sp;
}

reads

static void
primitiveAsFloat(void)
{   DECL_MAYBE_SQ_GLOBAL_STRUCT
    sqInt rcvr;
    char *sp;

    rcvr = longAt(GIV(stackPointer));
    assert((((rcvr) & 7) == 1));
    /* begin pop:thenPushFloat: */
    longAtput((sp = GIV(stackPointer) + ((1 - 1) * BytesPerWord)), floatObjectOf(((double) ((sqInt)(rcvr >> 3)) )));
    GIV(stackPointer) = sp;
}

and test -100 asFloat.  Does this fix it?

And does this rewrite fix it?

static void
primitiveAsFloat(void)
{   DECL_MAYBE_SQ_GLOBAL_STRUCT
    sqInt rcvr;
    char *sp;

    rcvr = longAt(GIV(stackPointer));
    assert((((rcvr) & 7) == 1));
    /* begin pop:thenPushFloat: */
    longAtput((sp = GIV(stackPointer) + ((1 - 1) * BytesPerWord)), floatObjectOf(((double) (((sqInt)rcvr >> 3)) )));
    GIV(stackPointer) = sp;
}

(I would expect not)

_,,,^..^,,,_
best, Eliot
Reply | Threaded
Open this post in threaded view
|

Re: Primitive 40 (asFloat) fails for me

Christian Kellermann
 
Hi Eliot!

* Eliot Miranda <[hidden email]> [190219 18:18]:
> and you report that -100 asFloat answers 100.0.
> Can you please try the following experiments?

I will do this on the work machine tomorrow (in 12 hours roughly).
I have just tested this on OpenBSD/amd64 with its old gcc 4.2.1 and
here it works as expected. I think on the other machine I am using
arch linux' gcc7, so there might be a too aggressive optimizer at
work. I will report back.

Thanks!

Christian

--
May you be peaceful, may you live in safety, may you be free from
suffering, and may you live with ease.

signature.asc (817 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Primitive 40 (asFloat) fails for me

Christian Kellermann
In reply to this post by Eliot Miranda-2
 
* Eliot Miranda <[hidden email]> [190219 18:18]:
>  

> First what is the result of
>
>     | floats |
>     floats := Array new: 30.
>     1 to: floats size do: [:i| floats at: i put: -100 asFloat].
>     floats
>
> ?

 #(100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0)

All 100's...

> If the result has the first several floats as 100.0 and the last few as
> -100.0 then...

omitted.

> static void
> primitiveAsFloat(void)
> {   DECL_MAYBE_SQ_GLOBAL_STRUCT
>     sqInt rcvr;
>     char *sp;
>
>     rcvr = longAt(GIV(stackPointer));
>     assert((((rcvr) & 7) == 1));
>     /* begin pop:thenPushFloat: */
>     longAtput((sp = GIV(stackPointer) + ((1 - 1) * BytesPerWord)),
> floatObjectOf(((double) (((sqInt)rcvr >> 3)) )));
>     GIV(stackPointer) = sp;
> }
>
> (I would expect not)
You are right:

diff --git a/spur64src/vm/gcc3x-cointerp.c b/spur64src/vm/gcc3x-cointerp.c
index c80be332c..0256a6585 100644
--- a/spur64src/vm/gcc3x-cointerp.c
+++ b/spur64src/vm/gcc3x-cointerp.c
@@ -25996,7 +25996,7 @@ primitiveAsFloat(void)
        rcvr = longAt(GIV(stackPointer));
        assert((((rcvr) & 7) == 1));
        /* begin pop:thenPushFloat: */
-       longAtput((sp = GIV(stackPointer) + ((1 - 1) * BytesPerWord)), floatObjectOf(((double) ((rcvr >> 3)) )));
+       longAtput((sp = GIV(stackPointer) + ((1 - 1) * BytesPerWord)), floatObjectOf(((double) (((sqInt)rcvr >> 3)) )));
        GIV(stackPointer) = sp;
 }

-100 asFloat → 100

This has been built with
$ gcc --version
gcc (GCC) 8.2.1 20181127

When I disable optimisations with -O0, I get the correct result.
So I guess, let's see why the compiler does what it does...

Kind regards,

Christian

--
May you be peaceful, may you live in safety, may you be free from
suffering, and may you live with ease.

signature.asc (817 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Primitive 40 (asFloat) fails for me

Levente Uzonyi
In reply to this post by Eliot Miranda-2
 
I had a look at the generated code and I think the generated code for
primitiveAsFloat has no issues. The machine code is simple and looks
correct with gcc 8, thought the pxor operation is unnecessary (and not
generated by gcc 4) as cvtsi2sd will zero fill the higher 64 bits of xmm0.

GCC 4.8:

0000000000432ff0 <primitiveAsFloat>:
{   DECL_MAYBE_SQ_GLOBAL_STRUCT
   432ff0:       53                      push   %rbx
         rcvr = longAt(GIV(stackPointer));
   432ff1:       48 8b 1d 20 24 36 00    mov    0x362420(%rip),%rbx        # 795418 <stackPointer>
         longAtput((sp = GIV(stackPointer) + ((1 - 1) * BytesPerWord)), floatObjectOf(((double) ((rcvr >> 3)) )));
   432ff8:       48 8b 03                mov    (%rbx),%rax
   432ffb:       48 c1 f8 03             sar    $0x3,%rax
   432fff:       f2 48 0f 2a c0          cvtsi2sd %rax,%xmm0
   433004:       e8 37 ee ff ff          callq  431e40 <floatObjectOf>
   433009:       48 89 03                mov    %rax,(%rbx)
         GIV(stackPointer) = sp;
   43300c:       48 89 1d 05 24 36 00    mov    %rbx,0x362405(%rip)        # 795418 <stackPointer>
}
   433013:       5b                      pop    %rbx
   433014:       c3                      retq
   433015:       90                      nop
   433016:       66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
   43301d:       00 00 00

GCC 8.2:

000000000004a1f0 <primitiveAsFloat>:
{   DECL_MAYBE_SQ_GLOBAL_STRUCT
    4a1f0:       53                      push   %rbx
         rcvr = longAt(GIV(stackPointer));
    4a1f1:       48 8b 1d 80 32 36 00    mov    0x363280(%rip),%rbx        # 3ad478 <stackPointer>
         longAtput((sp = GIV(stackPointer) + ((1 - 1) * BytesPerWord)), floatObjectOf(((double) ((rcvr >> 3)) )));
    4a1f8:       66 0f ef c0             pxor   %xmm0,%xmm0
    4a1fc:       48 8b 03                mov    (%rbx),%rax
    4a1ff:       48 c1 f8 03             sar    $0x3,%rax
    4a203:       f2 48 0f 2a c0          cvtsi2sd %rax,%xmm0
    4a208:       e8 b3 eb ff ff          callq  48dc0 <floatObjectOf>
    4a20d:       48 89 03                mov    %rax,(%rbx)
         GIV(stackPointer) = sp;
    4a210:       48 89 1d 61 32 36 00    mov    %rbx,0x363261(%rip)        # 3ad478 <stackPointer>
}
    4a217:       5b                      pop    %rbx
    4a218:       c3                      retq
    4a219:       0f 1f 80 00 00 00 00    nopl   0x0(%rax)


Actually the returned float object is -100.0 (-100 asFloat < 0.0 returns
true), but the code used for printing doesn't work properly, because
#basicAt: returns 0 (when the argument is 1 or 2), and the method used to
print the number uses #signBit to print the negative sign, and #signBit
relies on #basicAt: returning a non-zero value when the number is
negative.

So, we're back to the original issue I found: #basicAt: (primitive 38)
doesn't work with newer gcc versions.

Levente


Reply | Threaded
Open this post in threaded view
|

Re: Primitive 40 (asFloat) fails for me

Tobias Pape
 
Hi Levente.
               

> On 22.02.2019, at 00:43, Levente Uzonyi <[hidden email]> wrote:
>
> I had a look at the generated code and I think the generated code for primitiveAsFloat has no issues. The machine code is simple and looks correct with gcc 8, thought the pxor operation is unnecessary (and not generated by gcc 4) as cvtsi2sd will zero fill the higher 64 bits of xmm0.
>
> GCC 4.8:
>
> 0000000000432ff0 <primitiveAsFloat>:
> {   DECL_MAYBE_SQ_GLOBAL_STRUCT
>  432ff0:       53                      push   %rbx
>        rcvr = longAt(GIV(stackPointer));
>  432ff1:       48 8b 1d 20 24 36 00    mov    0x362420(%rip),%rbx        # 795418 <stackPointer>
>        longAtput((sp = GIV(stackPointer) + ((1 - 1) * BytesPerWord)), floatObjectOf(((double) ((rcvr >> 3)) )));
>  432ff8:       48 8b 03                mov    (%rbx),%rax
>  432ffb:       48 c1 f8 03             sar    $0x3,%rax
>  432fff:       f2 48 0f 2a c0          cvtsi2sd %rax,%xmm0
>  433004:       e8 37 ee ff ff          callq  431e40 <floatObjectOf>
>  433009:       48 89 03                mov    %rax,(%rbx)
>        GIV(stackPointer) = sp;
>  43300c:       48 89 1d 05 24 36 00    mov    %rbx,0x362405(%rip)        # 795418 <stackPointer>
> }
>  433013:       5b                      pop    %rbx
>  433014:       c3                      retq
>  433015:       90                      nop
>  433016:       66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
>  43301d:       00 00 00
>
> GCC 8.2:
>
> 000000000004a1f0 <primitiveAsFloat>:
> {   DECL_MAYBE_SQ_GLOBAL_STRUCT
>   4a1f0:       53                      push   %rbx
>        rcvr = longAt(GIV(stackPointer));
>   4a1f1:       48 8b 1d 80 32 36 00    mov    0x363280(%rip),%rbx        # 3ad478 <stackPointer>
>        longAtput((sp = GIV(stackPointer) + ((1 - 1) * BytesPerWord)), floatObjectOf(((double) ((rcvr >> 3)) )));
>   4a1f8:       66 0f ef c0             pxor   %xmm0,%xmm0
>   4a1fc:       48 8b 03                mov    (%rbx),%rax
>   4a1ff:       48 c1 f8 03             sar    $0x3,%rax
>   4a203:       f2 48 0f 2a c0          cvtsi2sd %rax,%xmm0
>   4a208:       e8 b3 eb ff ff          callq  48dc0 <floatObjectOf>
>   4a20d:       48 89 03                mov    %rax,(%rbx)
>        GIV(stackPointer) = sp;
>   4a210:       48 89 1d 61 32 36 00    mov    %rbx,0x363261(%rip)        # 3ad478 <stackPointer>
> }
>   4a217:       5b                      pop    %rbx
>   4a218:       c3                      retq
>   4a219:       0f 1f 80 00 00 00 00    nopl   0x0(%rax)
>
>
> Actually the returned float object is -100.0 (-100 asFloat < 0.0 returns true), but the code used for printing doesn't work properly, because #basicAt: returns 0 (when the argument is 1 or 2), and the method used to print the number uses #signBit to print the negative sign, and #signBit relies on #basicAt: returning a non-zero value when the number is negative.
>
> So, we're back to the original issue I found: #basicAt: (primitive 38) doesn't work with newer gcc versions.
>

Wow. Great analysis! Thank you :)

Best regards
        -Tobias

Reply | Threaded
Open this post in threaded view
|

Re: Primitive 40 (asFloat) fails for me

Levente Uzonyi
 
Thanks. I spent some time in gdb land and there seems to be a miscompiled
conditional jump. The question is: is it a compiler bug or a bug in the
Smalltalk -> C translation. With gcc 4.8, the correct jump instruction is
generated.

Here is the disassembled code with my comments following the path when
-100.0 basicAt: 1 is evaluated in the image:

Dump of assembler code for function primitiveFloatAt:
    0x0000555555589ec0 <+0>: mov    0x3775b1(%rip),%rdx        # 0x555555901478 <stackPointer>
%rax will hold the index (1 or 2) as a SmallInteger (so the actual value will be 0x9 or 0x11 for 1 and 2 respectively).
    0x0000555555589ec7 <+7>: mov    (%rdx),%rax
%rcx is the receiver object's address
    0x0000555555589eca <+10>: mov    0x8(%rdx),%rcx
Is the index SmallInteger 1?
    0x0000555555589ece <+14>: cmp    $0x9,%rax
If yes, jump to 0x555555589ef8
    0x0000555555589ed2 <+18>: je     0x555555589ef8 <primitiveFloatAt+56>
    0x0000555555589ed4 <+20>: cmp    $0x11,%rax
    0x0000555555589ed8 <+24>: je     0x555555589f30 <primitiveFloatAt+112>
    0x0000555555589eda <+26>: and    $0x7,%eax
    0x0000555555589edd <+29>: cmp    $0x1,%rax
    0x0000555555589ee1 <+33>: sete   %al
    0x0000555555589ee4 <+36>: movzbl %al,%eax
    0x0000555555589ee7 <+39>: add    $0x3,%rax
    0x0000555555589eeb <+43>: mov    %rax,0x37757e(%rip)        # 0x555555901470 <primFailCode>
    0x0000555555589ef2 <+50>: retq
    0x0000555555589ef3 <+51>: nopl   0x0(%rax,%rax,1)
Set the value of  %eax to 0 for seemingly no reason yet. The returned SmallInteger will be computed in %rax, so this is important.
    0x0000555555589ef8 <+56>: xor    %eax,%eax
Check if the receiver has the SmallFloat tag set (0x4)
    0x0000555555589efa <+58>: test   $0x4,%cl
If yes, jump to  0x555555589f03. This is the miscompiled conditional jump. It should go to 0x0000555555589f37, but it will just skip the next instruction leaving %eax set to 0.
=> 0x0000555555589efd <+61>: jne    0x555555589f03 <primitiveFloatAt+67>
Copy the 2nd 32-bit field of the receiver into %rax with sign extension?
    0x0000555555589eff <+63>: movslq 0xc(%rcx),%rax
Shift by 3 bits to make room for the tag.
    0x0000555555589f03 <+67>: shl    $0x3,%rax
Adjust the stack pointer.
    0x0000555555589f07 <+71>: add    $0x8,%rdx
Prepare a 32-bit mask shifted left by 3 bits for the tag.
    0x0000555555589f0b <+75>: movabs $0x7fffffff8,%rcx
Mask the result.
    0x0000555555589f15 <+85>: and    %rcx,%rax
Add the SmallInteger tag.
    0x0000555555589f18 <+88>: or     $0x1,%rax
Store the result on the stack.
    0x0000555555589f1c <+92>: mov    %rax,(%rdx)
    0x0000555555589f1f <+95>: mov    %rdx,0x377552(%rip)        # 0x555555901478 <stackPointer>
    0x0000555555589f26 <+102>: retq
    0x0000555555589f27 <+103>: nopw   0x0(%rax,%rax,1)
    0x0000555555589f30 <+112>: xor    %eax,%eax
    0x0000555555589f32 <+114>: test   $0x4,%cl
    0x0000555555589f35 <+117>: jne    0x555555589f03 <primitiveFloatAt+67>
Copy the 1st 32-bit field of the receiver into %rax with sign extension?
    0x0000555555589f37 <+119>: movslq 0x8(%rcx),%rax
    0x0000555555589f3b <+123>: jmp    0x555555589f03 <primitiveFloatAt+67>
End of assembler dump.

Levente

On Fri, 22 Feb 2019, Tobias Pape wrote:

>
> Hi Levente.
>
>> On 22.02.2019, at 00:43, Levente Uzonyi <[hidden email]> wrote:
>>
>> I had a look at the generated code and I think the generated code for primitiveAsFloat has no issues. The machine code is simple and looks correct with gcc 8, thought the pxor operation is unnecessary (and not generated by gcc 4) as cvtsi2sd will zero fill the higher 64 bits of xmm0.
>>
>> GCC 4.8:
>>
>> 0000000000432ff0 <primitiveAsFloat>:
>> {   DECL_MAYBE_SQ_GLOBAL_STRUCT
>>  432ff0:       53                      push   %rbx
>>        rcvr = longAt(GIV(stackPointer));
>>  432ff1:       48 8b 1d 20 24 36 00    mov    0x362420(%rip),%rbx        # 795418 <stackPointer>
>>        longAtput((sp = GIV(stackPointer) + ((1 - 1) * BytesPerWord)), floatObjectOf(((double) ((rcvr >> 3)) )));
>>  432ff8:       48 8b 03                mov    (%rbx),%rax
>>  432ffb:       48 c1 f8 03             sar    $0x3,%rax
>>  432fff:       f2 48 0f 2a c0          cvtsi2sd %rax,%xmm0
>>  433004:       e8 37 ee ff ff          callq  431e40 <floatObjectOf>
>>  433009:       48 89 03                mov    %rax,(%rbx)
>>        GIV(stackPointer) = sp;
>>  43300c:       48 89 1d 05 24 36 00    mov    %rbx,0x362405(%rip)        # 795418 <stackPointer>
>> }
>>  433013:       5b                      pop    %rbx
>>  433014:       c3                      retq
>>  433015:       90                      nop
>>  433016:       66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
>>  43301d:       00 00 00
>>
>> GCC 8.2:
>>
>> 000000000004a1f0 <primitiveAsFloat>:
>> {   DECL_MAYBE_SQ_GLOBAL_STRUCT
>>   4a1f0:       53                      push   %rbx
>>        rcvr = longAt(GIV(stackPointer));
>>   4a1f1:       48 8b 1d 80 32 36 00    mov    0x363280(%rip),%rbx        # 3ad478 <stackPointer>
>>        longAtput((sp = GIV(stackPointer) + ((1 - 1) * BytesPerWord)), floatObjectOf(((double) ((rcvr >> 3)) )));
>>   4a1f8:       66 0f ef c0             pxor   %xmm0,%xmm0
>>   4a1fc:       48 8b 03                mov    (%rbx),%rax
>>   4a1ff:       48 c1 f8 03             sar    $0x3,%rax
>>   4a203:       f2 48 0f 2a c0          cvtsi2sd %rax,%xmm0
>>   4a208:       e8 b3 eb ff ff          callq  48dc0 <floatObjectOf>
>>   4a20d:       48 89 03                mov    %rax,(%rbx)
>>        GIV(stackPointer) = sp;
>>   4a210:       48 89 1d 61 32 36 00    mov    %rbx,0x363261(%rip)        # 3ad478 <stackPointer>
>> }
>>   4a217:       5b                      pop    %rbx
>>   4a218:       c3                      retq
>>   4a219:       0f 1f 80 00 00 00 00    nopl   0x0(%rax)
>>
>>
>> Actually the returned float object is -100.0 (-100 asFloat < 0.0 returns true), but the code used for printing doesn't work properly, because #basicAt: returns 0 (when the argument is 1 or 2), and the method used to print the number uses #signBit to print the negative sign, and #signBit relies on #basicAt: returning a non-zero value when the number is negative.
>>
>> So, we're back to the original issue I found: #basicAt: (primitive 38) doesn't work with newer gcc versions.
>>
>
> Wow. Great analysis! Thank you :)
>
> Best regards
> -Tobias
Reply | Threaded
Open this post in threaded view
|

Re: Primitive 40 (asFloat) fails for me

Eliot Miranda-2
 
Hi Levente,

On Fri, Feb 22, 2019 at 1:58 AM Levente Uzonyi <[hidden email]> wrote:
 
Thanks. I spent some time in gdb land and there seems to be a miscompiled
conditional jump. The question is: is it a compiler bug or a bug in the
Smalltalk -> C translation. With gcc 4.8, the correct jump instruction is
generated.

Here is the disassembled code with my comments following the path when
-100.0 basicAt: 1 is evaluated in the image:

Dump of assembler code for function primitiveFloatAt:
    0x0000555555589ec0 <+0>:    mov    0x3775b1(%rip),%rdx        # 0x555555901478 <stackPointer>
%rax will hold the index (1 or 2) as a SmallInteger (so the actual value will be 0x9 or 0x11 for 1 and 2 respectively).
    0x0000555555589ec7 <+7>:    mov    (%rdx),%rax
%rcx is the receiver object's address
    0x0000555555589eca <+10>:   mov    0x8(%rdx),%rcx
Is the index SmallInteger 1?
    0x0000555555589ece <+14>:   cmp    $0x9,%rax
If yes, jump to 0x555555589ef8
    0x0000555555589ed2 <+18>:   je     0x555555589ef8 <primitiveFloatAt+56>
    0x0000555555589ed4 <+20>:   cmp    $0x11,%rax
    0x0000555555589ed8 <+24>:   je     0x555555589f30 <primitiveFloatAt+112>
    0x0000555555589eda <+26>:   and    $0x7,%eax
    0x0000555555589edd <+29>:   cmp    $0x1,%rax
    0x0000555555589ee1 <+33>:   sete   %al
    0x0000555555589ee4 <+36>:   movzbl %al,%eax
    0x0000555555589ee7 <+39>:   add    $0x3,%rax
    0x0000555555589eeb <+43>:   mov    %rax,0x37757e(%rip)        # 0x555555901470 <primFailCode>
    0x0000555555589ef2 <+50>:   retq
    0x0000555555589ef3 <+51>:   nopl   0x0(%rax,%rax,1)
Set the value of  %eax to 0 for seemingly no reason yet. The returned SmallInteger will be computed in %rax, so this is important.
    0x0000555555589ef8 <+56>:   xor    %eax,%eax
Check if the receiver has the SmallFloat tag set (0x4)
    0x0000555555589efa <+58>:   test   $0x4,%cl
If yes, jump to  0x555555589f03. This is the miscompiled conditional jump. It should go to 0x0000555555589f37, but it will just skip the next instruction leaving %eax set to 0.
=> 0x0000555555589efd <+61>:    jne    0x555555589f03 <primitiveFloatAt+67>
Copy the 2nd 32-bit field of the receiver into %rax with sign extension?
    0x0000555555589eff <+63>:   movslq 0xc(%rcx),%rax
Shift by 3 bits to make room for the tag.
    0x0000555555589f03 <+67>:   shl    $0x3,%rax
Adjust the stack pointer.
    0x0000555555589f07 <+71>:   add    $0x8,%rdx
Prepare a 32-bit mask shifted left by 3 bits for the tag.
    0x0000555555589f0b <+75>:   movabs $0x7fffffff8,%rcx
Mask the result.
    0x0000555555589f15 <+85>:   and    %rcx,%rax
Add the SmallInteger tag.
    0x0000555555589f18 <+88>:   or     $0x1,%rax
Store the result on the stack.
    0x0000555555589f1c <+92>:   mov    %rax,(%rdx)
    0x0000555555589f1f <+95>:   mov    %rdx,0x377552(%rip)        # 0x555555901478 <stackPointer>
    0x0000555555589f26 <+102>:  retq
    0x0000555555589f27 <+103>:  nopw   0x0(%rax,%rax,1)
    0x0000555555589f30 <+112>:  xor    %eax,%eax
    0x0000555555589f32 <+114>:  test   $0x4,%cl
    0x0000555555589f35 <+117>:  jne    0x555555589f03 <primitiveFloatAt+67>
Copy the 1st 32-bit field of the receiver into %rax with sign extension?
    0x0000555555589f37 <+119>:  movslq 0x8(%rcx),%rax
    0x0000555555589f3b <+123>:  jmp    0x555555589f03 <primitiveFloatAt+67>
End of assembler dump.

I'm pretty sure the C code is correct, but Nicolas is our C lawyer.  However we need to look at both primitiveFloatAt and fetchLong32:ofFloatObject:. Here's the source:

    /* Spur64BitMemoryManager>>#fetchLong32:ofFloatObject: */
static sqInt NoDbgRegParms
fetchLong32ofFloatObject(sqInt fieldIndex, sqInt oop)
{
    usqLong bits;
    usqLong rot;

    if (!(oop & (smallFloatTag()))) {
        return long32At((oop + BaseHeaderSize) + (((sqInt)((usqInt)(fieldIndex) << 2))));
    }
    /* begin smallFloatBitsOf: */
    assert(isImmediateFloat(oop));
    rot = ((usqInt) (((usqInt)oop))) >> (numTagBits());
    if (rot > 1) {

        /* a.k.a. ~= +/-0.0 */
        rot += ((sqInt)((usqInt)((smallFloatExponentOffset())) << ((smallFloatMantissaBits()) + 1)));
    }
    /* begin rotateRight: */
    rot = (rot << 0x3F) + (((usqInt) (((usqInt)rot))) >> 1);
    bits = rot;
    return (((int *) ((&bits))))[fieldIndex];
}

Note that [+/-]100.0 is an immediate float.  Nicolas, is this kind of pointer punning still OK?  Notice that at no time do we pun the bits to a double.  They are always integral bits.  What we're doing is manipulating the bit pattern for -100.0, an immediate SmallFloat64. (Spur64BitMemoryManager new smallFloatObjectOf: -100.0) hex '16r859000000000000C'  where 16r4 is the tag pattern for an immediate float and 16r8 is the sign bit in the least significant position.



Levente

On Fri, 22 Feb 2019, Tobias Pape wrote:

>
> Hi Levente.
>
>> On 22.02.2019, at 00:43, Levente Uzonyi <[hidden email]> wrote:
>>
>> I had a look at the generated code and I think the generated code for primitiveAsFloat has no issues. The machine code is simple and looks correct with gcc 8, thought the pxor operation is unnecessary (and not generated by gcc 4) as cvtsi2sd will zero fill the higher 64 bits of xmm0.
>>
>> GCC 4.8:
>>
>> 0000000000432ff0 <primitiveAsFloat>:
>> {   DECL_MAYBE_SQ_GLOBAL_STRUCT
>>  432ff0:       53                      push   %rbx
>>        rcvr = longAt(GIV(stackPointer));
>>  432ff1:       48 8b 1d 20 24 36 00    mov    0x362420(%rip),%rbx        # 795418 <stackPointer>
>>        longAtput((sp = GIV(stackPointer) + ((1 - 1) * BytesPerWord)), floatObjectOf(((double) ((rcvr >> 3)) )));
>>  432ff8:       48 8b 03                mov    (%rbx),%rax
>>  432ffb:       48 c1 f8 03             sar    $0x3,%rax
>>  432fff:       f2 48 0f 2a c0          cvtsi2sd %rax,%xmm0
>>  433004:       e8 37 ee ff ff          callq  431e40 <floatObjectOf>
>>  433009:       48 89 03                mov    %rax,(%rbx)
>>        GIV(stackPointer) = sp;
>>  43300c:       48 89 1d 05 24 36 00    mov    %rbx,0x362405(%rip)        # 795418 <stackPointer>
>> }
>>  433013:       5b                      pop    %rbx
>>  433014:       c3                      retq
>>  433015:       90                      nop
>>  433016:       66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
>>  43301d:       00 00 00
>>
>> GCC 8.2:
>>
>> 000000000004a1f0 <primitiveAsFloat>:
>> {   DECL_MAYBE_SQ_GLOBAL_STRUCT
>>   4a1f0:       53                      push   %rbx
>>        rcvr = longAt(GIV(stackPointer));
>>   4a1f1:       48 8b 1d 80 32 36 00    mov    0x363280(%rip),%rbx        # 3ad478 <stackPointer>
>>        longAtput((sp = GIV(stackPointer) + ((1 - 1) * BytesPerWord)), floatObjectOf(((double) ((rcvr >> 3)) )));
>>   4a1f8:       66 0f ef c0             pxor   %xmm0,%xmm0
>>   4a1fc:       48 8b 03                mov    (%rbx),%rax
>>   4a1ff:       48 c1 f8 03             sar    $0x3,%rax
>>   4a203:       f2 48 0f 2a c0          cvtsi2sd %rax,%xmm0
>>   4a208:       e8 b3 eb ff ff          callq  48dc0 <floatObjectOf>
>>   4a20d:       48 89 03                mov    %rax,(%rbx)
>>        GIV(stackPointer) = sp;
>>   4a210:       48 89 1d 61 32 36 00    mov    %rbx,0x363261(%rip)        # 3ad478 <stackPointer>
>> }
>>   4a217:       5b                      pop    %rbx
>>   4a218:       c3                      retq
>>   4a219:       0f 1f 80 00 00 00 00    nopl   0x0(%rax)
>>
>>
>> Actually the returned float object is -100.0 (-100 asFloat < 0.0 returns true), but the code used for printing doesn't work properly, because #basicAt: returns 0 (when the argument is 1 or 2), and the method used to print the number uses #signBit to print the negative sign, and #signBit relies on #basicAt: returning a non-zero value when the number is negative.
>>
>> So, we're back to the original issue I found: #basicAt: (primitive 38) doesn't work with newer gcc versions.
>>
>
> Wow. Great analysis! Thank you :)
>
> Best regards
>       -Tobias


--
_,,,^..^,,,_
best, Eliot
Reply | Threaded
Open this post in threaded view
|

Re: Primitive 40 (asFloat) fails for me

Nicolas Cellier
 

Hi Eliot,
see below...

Le ven. 22 févr. 2019 à 21:03, Eliot Miranda <[hidden email]> a écrit :
 
Hi Levente,

On Fri, Feb 22, 2019 at 1:58 AM Levente Uzonyi <[hidden email]> wrote:
 
Thanks. I spent some time in gdb land and there seems to be a miscompiled
conditional jump. The question is: is it a compiler bug or a bug in the
Smalltalk -> C translation. With gcc 4.8, the correct jump instruction is
generated.

Here is the disassembled code with my comments following the path when
-100.0 basicAt: 1 is evaluated in the image:

Dump of assembler code for function primitiveFloatAt:
    0x0000555555589ec0 <+0>:    mov    0x3775b1(%rip),%rdx        # 0x555555901478 <stackPointer>
%rax will hold the index (1 or 2) as a SmallInteger (so the actual value will be 0x9 or 0x11 for 1 and 2 respectively).
    0x0000555555589ec7 <+7>:    mov    (%rdx),%rax
%rcx is the receiver object's address
    0x0000555555589eca <+10>:   mov    0x8(%rdx),%rcx
Is the index SmallInteger 1?
    0x0000555555589ece <+14>:   cmp    $0x9,%rax
If yes, jump to 0x555555589ef8
    0x0000555555589ed2 <+18>:   je     0x555555589ef8 <primitiveFloatAt+56>
    0x0000555555589ed4 <+20>:   cmp    $0x11,%rax
    0x0000555555589ed8 <+24>:   je     0x555555589f30 <primitiveFloatAt+112>
    0x0000555555589eda <+26>:   and    $0x7,%eax
    0x0000555555589edd <+29>:   cmp    $0x1,%rax
    0x0000555555589ee1 <+33>:   sete   %al
    0x0000555555589ee4 <+36>:   movzbl %al,%eax
    0x0000555555589ee7 <+39>:   add    $0x3,%rax
    0x0000555555589eeb <+43>:   mov    %rax,0x37757e(%rip)        # 0x555555901470 <primFailCode>
    0x0000555555589ef2 <+50>:   retq
    0x0000555555589ef3 <+51>:   nopl   0x0(%rax,%rax,1)
Set the value of  %eax to 0 for seemingly no reason yet. The returned SmallInteger will be computed in %rax, so this is important.
    0x0000555555589ef8 <+56>:   xor    %eax,%eax
Check if the receiver has the SmallFloat tag set (0x4)
    0x0000555555589efa <+58>:   test   $0x4,%cl
If yes, jump to  0x555555589f03. This is the miscompiled conditional jump. It should go to 0x0000555555589f37, but it will just skip the next instruction leaving %eax set to 0.
=> 0x0000555555589efd <+61>:    jne    0x555555589f03 <primitiveFloatAt+67>
Copy the 2nd 32-bit field of the receiver into %rax with sign extension?
    0x0000555555589eff <+63>:   movslq 0xc(%rcx),%rax
Shift by 3 bits to make room for the tag.
    0x0000555555589f03 <+67>:   shl    $0x3,%rax
Adjust the stack pointer.
    0x0000555555589f07 <+71>:   add    $0x8,%rdx
Prepare a 32-bit mask shifted left by 3 bits for the tag.
    0x0000555555589f0b <+75>:   movabs $0x7fffffff8,%rcx
Mask the result.
    0x0000555555589f15 <+85>:   and    %rcx,%rax
Add the SmallInteger tag.
    0x0000555555589f18 <+88>:   or     $0x1,%rax
Store the result on the stack.
    0x0000555555589f1c <+92>:   mov    %rax,(%rdx)
    0x0000555555589f1f <+95>:   mov    %rdx,0x377552(%rip)        # 0x555555901478 <stackPointer>
    0x0000555555589f26 <+102>:  retq
    0x0000555555589f27 <+103>:  nopw   0x0(%rax,%rax,1)
    0x0000555555589f30 <+112>:  xor    %eax,%eax
    0x0000555555589f32 <+114>:  test   $0x4,%cl
    0x0000555555589f35 <+117>:  jne    0x555555589f03 <primitiveFloatAt+67>
Copy the 1st 32-bit field of the receiver into %rax with sign extension?
    0x0000555555589f37 <+119>:  movslq 0x8(%rcx),%rax
    0x0000555555589f3b <+123>:  jmp    0x555555589f03 <primitiveFloatAt+67>
End of assembler dump.

I'm pretty sure the C code is correct, but Nicolas is our C lawyer.  However we need to look at both primitiveFloatAt and fetchLong32:ofFloatObject:. Here's the source:

    /* Spur64BitMemoryManager>>#fetchLong32:ofFloatObject: */
static sqInt NoDbgRegParms
fetchLong32ofFloatObject(sqInt fieldIndex, sqInt oop)
{
    usqLong bits;
    usqLong rot;

    if (!(oop & (smallFloatTag()))) {
        return long32At((oop + BaseHeaderSize) + (((sqInt)((usqInt)(fieldIndex) << 2))));
    }
    /* begin smallFloatBitsOf: */
    assert(isImmediateFloat(oop));
    rot = ((usqInt) (((usqInt)oop))) >> (numTagBits());
    if (rot > 1) {

        /* a.k.a. ~= +/-0.0 */
        rot += ((sqInt)((usqInt)((smallFloatExponentOffset())) << ((smallFloatMantissaBits()) + 1)));
    }
    /* begin rotateRight: */
    rot = (rot << 0x3F) + (((usqInt) (((usqInt)rot))) >> 1);
    bits = rot;
    return (((int *) ((&bits))))[fieldIndex];
}

Note that [+/-]100.0 is an immediate float.  Nicolas, is this kind of pointer punning still OK?  Notice that at no time do we pun the bits to a double.  They are always integral bits.  What we're doing is manipulating the bit pattern for -100.0, an immediate SmallFloat64. (Spur64BitMemoryManager new smallFloatObjectOf: -100.0) hex '16r859000000000000C'  where 16r4 is the tag pattern for an immediate float and 16r8 is the sign bit in the least significant position.

In theory, pointer type aliasing is Undefined Behavior.
Though, I don't well see what kind of optimization a compiler could perform here...
Try one of the 2 solid choices for type punning:
- memcpy
- union (at least in C it's legal, I'm not even sure for C++ and don't have time to search now)
or you may try the more fragile -fno-strict-aliasing (or something like that)

Note that a clever enough compiler should not invoke memcpy at all in simple cases like this.
(If it is clever enough to misscompile this simple case of pointer aliasing, I don't see why it wouldn't for memcpy).

--
_,,,^..^,,,_
best, Eliot
Reply | Threaded
Open this post in threaded view
|

Re: Primitive 40 (asFloat) fails for me

Eliot Miranda-2
 


On Sun, Feb 24, 2019 at 12:16 PM Nicolas Cellier <[hidden email]> wrote:
 

Hi Eliot,
see below...

Le ven. 22 févr. 2019 à 21:03, Eliot Miranda <[hidden email]> a écrit :
 
Hi Levente,

On Fri, Feb 22, 2019 at 1:58 AM Levente Uzonyi <[hidden email]> wrote:
 
Thanks. I spent some time in gdb land and there seems to be a miscompiled
conditional jump. The question is: is it a compiler bug or a bug in the
Smalltalk -> C translation. With gcc 4.8, the correct jump instruction is
generated.

Here is the disassembled code with my comments following the path when
-100.0 basicAt: 1 is evaluated in the image:

Dump of assembler code for function primitiveFloatAt:
    0x0000555555589ec0 <+0>:    mov    0x3775b1(%rip),%rdx        # 0x555555901478 <stackPointer>
%rax will hold the index (1 or 2) as a SmallInteger (so the actual value will be 0x9 or 0x11 for 1 and 2 respectively).
    0x0000555555589ec7 <+7>:    mov    (%rdx),%rax
%rcx is the receiver object's address
    0x0000555555589eca <+10>:   mov    0x8(%rdx),%rcx
Is the index SmallInteger 1?
    0x0000555555589ece <+14>:   cmp    $0x9,%rax
If yes, jump to 0x555555589ef8
    0x0000555555589ed2 <+18>:   je     0x555555589ef8 <primitiveFloatAt+56>
    0x0000555555589ed4 <+20>:   cmp    $0x11,%rax
    0x0000555555589ed8 <+24>:   je     0x555555589f30 <primitiveFloatAt+112>
    0x0000555555589eda <+26>:   and    $0x7,%eax
    0x0000555555589edd <+29>:   cmp    $0x1,%rax
    0x0000555555589ee1 <+33>:   sete   %al
    0x0000555555589ee4 <+36>:   movzbl %al,%eax
    0x0000555555589ee7 <+39>:   add    $0x3,%rax
    0x0000555555589eeb <+43>:   mov    %rax,0x37757e(%rip)        # 0x555555901470 <primFailCode>
    0x0000555555589ef2 <+50>:   retq
    0x0000555555589ef3 <+51>:   nopl   0x0(%rax,%rax,1)
Set the value of  %eax to 0 for seemingly no reason yet. The returned SmallInteger will be computed in %rax, so this is important.
    0x0000555555589ef8 <+56>:   xor    %eax,%eax
Check if the receiver has the SmallFloat tag set (0x4)
    0x0000555555589efa <+58>:   test   $0x4,%cl
If yes, jump to  0x555555589f03. This is the miscompiled conditional jump. It should go to 0x0000555555589f37, but it will just skip the next instruction leaving %eax set to 0.
=> 0x0000555555589efd <+61>:    jne    0x555555589f03 <primitiveFloatAt+67>
Copy the 2nd 32-bit field of the receiver into %rax with sign extension?
    0x0000555555589eff <+63>:   movslq 0xc(%rcx),%rax
Shift by 3 bits to make room for the tag.
    0x0000555555589f03 <+67>:   shl    $0x3,%rax
Adjust the stack pointer.
    0x0000555555589f07 <+71>:   add    $0x8,%rdx
Prepare a 32-bit mask shifted left by 3 bits for the tag.
    0x0000555555589f0b <+75>:   movabs $0x7fffffff8,%rcx
Mask the result.
    0x0000555555589f15 <+85>:   and    %rcx,%rax
Add the SmallInteger tag.
    0x0000555555589f18 <+88>:   or     $0x1,%rax
Store the result on the stack.
    0x0000555555589f1c <+92>:   mov    %rax,(%rdx)
    0x0000555555589f1f <+95>:   mov    %rdx,0x377552(%rip)        # 0x555555901478 <stackPointer>
    0x0000555555589f26 <+102>:  retq
    0x0000555555589f27 <+103>:  nopw   0x0(%rax,%rax,1)
    0x0000555555589f30 <+112>:  xor    %eax,%eax
    0x0000555555589f32 <+114>:  test   $0x4,%cl
    0x0000555555589f35 <+117>:  jne    0x555555589f03 <primitiveFloatAt+67>
Copy the 1st 32-bit field of the receiver into %rax with sign extension?
    0x0000555555589f37 <+119>:  movslq 0x8(%rcx),%rax
    0x0000555555589f3b <+123>:  jmp    0x555555589f03 <primitiveFloatAt+67>
End of assembler dump.

I'm pretty sure the C code is correct, but Nicolas is our C lawyer.  However we need to look at both primitiveFloatAt and fetchLong32:ofFloatObject:. Here's the source:

    /* Spur64BitMemoryManager>>#fetchLong32:ofFloatObject: */
static sqInt NoDbgRegParms
fetchLong32ofFloatObject(sqInt fieldIndex, sqInt oop)
{
    usqLong bits;
    usqLong rot;

    if (!(oop & (smallFloatTag()))) {
        return long32At((oop + BaseHeaderSize) + (((sqInt)((usqInt)(fieldIndex) << 2))));
    }
    /* begin smallFloatBitsOf: */
    assert(isImmediateFloat(oop));
    rot = ((usqInt) (((usqInt)oop))) >> (numTagBits());
    if (rot > 1) {

        /* a.k.a. ~= +/-0.0 */
        rot += ((sqInt)((usqInt)((smallFloatExponentOffset())) << ((smallFloatMantissaBits()) + 1)));
    }
    /* begin rotateRight: */
    rot = (rot << 0x3F) + (((usqInt) (((usqInt)rot))) >> 1);
    bits = rot;
    return (((int *) ((&bits))))[fieldIndex];
}

Note that [+/-]100.0 is an immediate float.  Nicolas, is this kind of pointer punning still OK?  Notice that at no time do we pun the bits to a double.  They are always integral bits.  What we're doing is manipulating the bit pattern for -100.0, an immediate SmallFloat64. (Spur64BitMemoryManager new smallFloatObjectOf: -100.0) hex '16r859000000000000C'  where 16r4 is the tag pattern for an immediate float and 16r8 is the sign bit in the least significant position.

In theory, pointer type aliasing is Undefined Behavior.
Though, I don't well see what kind of optimization a compiler could perform here...
Try one of the 2 solid choices for type punning:
- memcpy
- union (at least in C it's legal, I'm not even sure for C++ and don't have time to search now)
or you may try the more fragile -fno-strict-aliasing (or something like that)

I'll rewrite to use a union.
 

Note that a clever enough compiler should not invoke memcpy at all in simple cases like this.
(If it is clever enough to misscompile this simple case of pointer aliasing, I don't see why it wouldn't for memcpy).

--
_,,,^..^,,,_
best, Eliot


--
_,,,^..^,,,_
best, Eliot
Reply | Threaded
Open this post in threaded view
|

Re: Primitive 40 (asFloat) fails for me

Eliot Miranda-2
 


On Sun, Feb 24, 2019 at 12:52 PM Eliot Miranda <[hidden email]> wrote:


On Sun, Feb 24, 2019 at 12:16 PM Nicolas Cellier <[hidden email]> wrote:
 

Hi Eliot,
see below...

Le ven. 22 févr. 2019 à 21:03, Eliot Miranda <[hidden email]> a écrit :
 
Hi Levente,

On Fri, Feb 22, 2019 at 1:58 AM Levente Uzonyi <[hidden email]> wrote:
 
Thanks. I spent some time in gdb land and there seems to be a miscompiled
conditional jump. The question is: is it a compiler bug or a bug in the
Smalltalk -> C translation. With gcc 4.8, the correct jump instruction is
generated.

Here is the disassembled code with my comments following the path when
-100.0 basicAt: 1 is evaluated in the image:

Dump of assembler code for function primitiveFloatAt:
    0x0000555555589ec0 <+0>:    mov    0x3775b1(%rip),%rdx        # 0x555555901478 <stackPointer>
%rax will hold the index (1 or 2) as a SmallInteger (so the actual value will be 0x9 or 0x11 for 1 and 2 respectively).
    0x0000555555589ec7 <+7>:    mov    (%rdx),%rax
%rcx is the receiver object's address
    0x0000555555589eca <+10>:   mov    0x8(%rdx),%rcx
Is the index SmallInteger 1?
    0x0000555555589ece <+14>:   cmp    $0x9,%rax
If yes, jump to 0x555555589ef8
    0x0000555555589ed2 <+18>:   je     0x555555589ef8 <primitiveFloatAt+56>
    0x0000555555589ed4 <+20>:   cmp    $0x11,%rax
    0x0000555555589ed8 <+24>:   je     0x555555589f30 <primitiveFloatAt+112>
    0x0000555555589eda <+26>:   and    $0x7,%eax
    0x0000555555589edd <+29>:   cmp    $0x1,%rax
    0x0000555555589ee1 <+33>:   sete   %al
    0x0000555555589ee4 <+36>:   movzbl %al,%eax
    0x0000555555589ee7 <+39>:   add    $0x3,%rax
    0x0000555555589eeb <+43>:   mov    %rax,0x37757e(%rip)        # 0x555555901470 <primFailCode>
    0x0000555555589ef2 <+50>:   retq
    0x0000555555589ef3 <+51>:   nopl   0x0(%rax,%rax,1)
Set the value of  %eax to 0 for seemingly no reason yet. The returned SmallInteger will be computed in %rax, so this is important.
    0x0000555555589ef8 <+56>:   xor    %eax,%eax
Check if the receiver has the SmallFloat tag set (0x4)
    0x0000555555589efa <+58>:   test   $0x4,%cl
If yes, jump to  0x555555589f03. This is the miscompiled conditional jump. It should go to 0x0000555555589f37, but it will just skip the next instruction leaving %eax set to 0.
=> 0x0000555555589efd <+61>:    jne    0x555555589f03 <primitiveFloatAt+67>
Copy the 2nd 32-bit field of the receiver into %rax with sign extension?
    0x0000555555589eff <+63>:   movslq 0xc(%rcx),%rax
Shift by 3 bits to make room for the tag.
    0x0000555555589f03 <+67>:   shl    $0x3,%rax
Adjust the stack pointer.
    0x0000555555589f07 <+71>:   add    $0x8,%rdx
Prepare a 32-bit mask shifted left by 3 bits for the tag.
    0x0000555555589f0b <+75>:   movabs $0x7fffffff8,%rcx
Mask the result.
    0x0000555555589f15 <+85>:   and    %rcx,%rax
Add the SmallInteger tag.
    0x0000555555589f18 <+88>:   or     $0x1,%rax
Store the result on the stack.
    0x0000555555589f1c <+92>:   mov    %rax,(%rdx)
    0x0000555555589f1f <+95>:   mov    %rdx,0x377552(%rip)        # 0x555555901478 <stackPointer>
    0x0000555555589f26 <+102>:  retq
    0x0000555555589f27 <+103>:  nopw   0x0(%rax,%rax,1)
    0x0000555555589f30 <+112>:  xor    %eax,%eax
    0x0000555555589f32 <+114>:  test   $0x4,%cl
    0x0000555555589f35 <+117>:  jne    0x555555589f03 <primitiveFloatAt+67>
Copy the 1st 32-bit field of the receiver into %rax with sign extension?
    0x0000555555589f37 <+119>:  movslq 0x8(%rcx),%rax
    0x0000555555589f3b <+123>:  jmp    0x555555589f03 <primitiveFloatAt+67>
End of assembler dump.

I'm pretty sure the C code is correct, but Nicolas is our C lawyer.  However we need to look at both primitiveFloatAt and fetchLong32:ofFloatObject:. Here's the source:

    /* Spur64BitMemoryManager>>#fetchLong32:ofFloatObject: */
static sqInt NoDbgRegParms
fetchLong32ofFloatObject(sqInt fieldIndex, sqInt oop)
{
    usqLong bits;
    usqLong rot;

    if (!(oop & (smallFloatTag()))) {
        return long32At((oop + BaseHeaderSize) + (((sqInt)((usqInt)(fieldIndex) << 2))));
    }
    /* begin smallFloatBitsOf: */
    assert(isImmediateFloat(oop));
    rot = ((usqInt) (((usqInt)oop))) >> (numTagBits());
    if (rot > 1) {

        /* a.k.a. ~= +/-0.0 */
        rot += ((sqInt)((usqInt)((smallFloatExponentOffset())) << ((smallFloatMantissaBits()) + 1)));
    }
    /* begin rotateRight: */
    rot = (rot << 0x3F) + (((usqInt) (((usqInt)rot))) >> 1);
    bits = rot;
    return (((int *) ((&bits))))[fieldIndex];
}

Note that [+/-]100.0 is an immediate float.  Nicolas, is this kind of pointer punning still OK?  Notice that at no time do we pun the bits to a double.  They are always integral bits.  What we're doing is manipulating the bit pattern for -100.0, an immediate SmallFloat64. (Spur64BitMemoryManager new smallFloatObjectOf: -100.0) hex '16r859000000000000C'  where 16r4 is the tag pattern for an immediate float and 16r8 is the sign bit in the least significant position.

In theory, pointer type aliasing is Undefined Behavior.
Though, I don't well see what kind of optimization a compiler could perform here...
Try one of the 2 solid choices for type punning:
- memcpy
- union (at least in C it's legal, I'm not even sure for C++ and don't have time to search now)
or you may try the more fragile -fno-strict-aliasing (or something like that)

I'll rewrite to use a union.

Ugh, not yet.  That would be so much uglier and m ore complicated.  Let's pin down where the bug is first.
 
 

Note that a clever enough compiler should not invoke memcpy at all in simple cases like this.
(If it is clever enough to misscompile this simple case of pointer aliasing, I don't see why it wouldn't for memcpy).

+1.

_,,,^..^,,,_
best, Eliot