> Have you tried switching off compiler optimisations? Ah! I had forgotten that one. I thought that was changed in mvm. Image works fine noe (but more slowly!) on aarch64 Alpine Linux (musl+busybox). I am just out the door fir a couple of days but will try fdlib on return. Thanks again to all! -KenD
-KenD
|
Hi Ken, this is a problem in an optimization of GCC 8.3 (I am not sure what other versions are impacted). Basically it is generating bad a function, removing code that assumes that is dead code. Even though this is a bug in GCC, because there is not actual reason to remove the code. It is produced because the code is not written in a nice way. Basically, if someone wants to fix it in the VM code is modifying a single method: Spur64BitMemoryManager >> fetchLong32: fieldIndex ofFloatObject: oop "index by word size, and return a pointer as long as the word size" | bits | (self isImmediateFloat: oop) ifFalse: [^self fetchLong32: fieldIndex ofObject: oop]. bits := self smallFloatBitsOf: oop. ^ fieldIndex = 0 ifTrue: [bits bitAnd: 16rFFFFFFFF] ifFalse: [bits >> 32] The old method had a handwritten piece of C code to replace the last if: (self cCoerceSimple: (self addressOf: bits) to: #'int *') at: fieldIndex I assume it was done to handle a different kind of endianness, but the single sender of this message is already doing the handling and having different behavior depending on the endianness. With this fix the build can still be executed with -O2 optimization in GCC 8.3 Cheers, On Sat, Dec 7, 2019 at 4:20 PM <[hidden email]> wrote: > > > > Have you tried switching off compiler optimisations? > > Ah! I had forgotten that one. I thought that was changed in mvm. > > Image works fine noe (but more slowly!) on aarch64 Alpine Linux > (musl+busybox). > > I am just out the door fir a couple of days but will try fdlib on > return. > > Thanks again to all! > -KenD -- Pablo Tesone. [hidden email] |
> On Dec 11, 2019, at 8:45 AM, "[hidden email]" <[hidden email]> wrote: > > > Hi Ken, > this is a problem in an optimization of GCC 8.3 (I am not sure what > other versions are impacted). > Basically it is generating bad a function, removing code that assumes > that is dead code. > Even though this is a bug in GCC, because there is not actual reason > to remove the code. > It is produced because the code is not written in a nice way. > > Basically, if someone wants to fix it in the VM code is modifying a > single method: > > Spur64BitMemoryManager >> fetchLong32: fieldIndex ofFloatObject: oop > "index by word size, and return a pointer as long as the word size" > > | bits | > (self isImmediateFloat: oop) ifFalse: > [^self fetchLong32: fieldIndex ofObject: oop]. > bits := self smallFloatBitsOf: oop. > ^ fieldIndex = 0 > ifTrue: [bits bitAnd: 16rFFFFFFFF] > ifFalse: [bits >> 32] Thank you, Pablo. I’ll integrate this quickly. Can you possibly give me initials and time stamp? It’s not necessary but helps that the code is marked as being your fix. > > The old method had a handwritten piece of C code to replace the last if: > > (self cCoerceSimple: (self addressOf: bits) to: #'int *') at: fieldIndex > > I assume it was done to handle a different kind of endianness, but the > single sender of this message is already doing the handling and having > different behavior depending on the endianness. > > With this fix the build can still be executed with -O2 optimization in GCC 8.3 > > Cheers, > >> On Sat, Dec 7, 2019 at 4:20 PM <[hidden email]> wrote: >> >> >>> Have you tried switching off compiler optimisations? >> >> Ah! I had forgotten that one. I thought that was changed in mvm. >> >> Image works fine noe (but more slowly!) on aarch64 Alpine Linux >> (musl+busybox). >> >> I am just out the door fir a couple of days but will try fdlib on >> return. >> >> Thanks again to all! >> -KenD > > > > -- > Pablo Tesone. > [hidden email] |
Hi Eliot, you can use PT as initials (I don't if there is a clash or ptesone) and the timestamp is just today. Thanks On Wed, Dec 11, 2019 at 6:02 PM Eliot Miranda <[hidden email]> wrote: > > > > > > On Dec 11, 2019, at 8:45 AM, "[hidden email]" <[hidden email]> wrote: > > > > > > Hi Ken, > > this is a problem in an optimization of GCC 8.3 (I am not sure what > > other versions are impacted). > > Basically it is generating bad a function, removing code that assumes > > that is dead code. > > Even though this is a bug in GCC, because there is not actual reason > > to remove the code. > > It is produced because the code is not written in a nice way. > > > > Basically, if someone wants to fix it in the VM code is modifying a > > single method: > > > > Spur64BitMemoryManager >> fetchLong32: fieldIndex ofFloatObject: oop > > "index by word size, and return a pointer as long as the word size" > > > > | bits | > > (self isImmediateFloat: oop) ifFalse: > > [^self fetchLong32: fieldIndex ofObject: oop]. > > bits := self smallFloatBitsOf: oop. > > ^ fieldIndex = 0 > > ifTrue: [bits bitAnd: 16rFFFFFFFF] > > ifFalse: [bits >> 32] > > Thank you, Pablo. I’ll integrate this quickly. Can you possibly give me initials and time stamp? It’s not necessary but helps that the code is marked as being your fix. > > > > > The old method had a handwritten piece of C code to replace the last if: > > > > (self cCoerceSimple: (self addressOf: bits) to: #'int *') at: fieldIndex > > > > I assume it was done to handle a different kind of endianness, but the > > single sender of this message is already doing the handling and having > > different behavior depending on the endianness. > > > > With this fix the build can still be executed with -O2 optimization in GCC 8.3 > > > > Cheers, > > > >> On Sat, Dec 7, 2019 at 4:20 PM <[hidden email]> wrote: > >> > >> > >>> Have you tried switching off compiler optimisations? > >> > >> Ah! I had forgotten that one. I thought that was changed in mvm. > >> > >> Image works fine noe (but more slowly!) on aarch64 Alpine Linux > >> (musl+busybox). > >> > >> I am just out the door fir a couple of days but will try fdlib on > >> return. > >> > >> Thanks again to all! > >> -KenD > > > > > > > > -- > > Pablo Tesone. > > [hidden email] -- Pablo Tesone. [hidden email] |
> On Dec 11, 2019, at 9:09 AM, "[hidden email]" <[hidden email]> wrote: > > Hi Eliot, you can use PT as initials (I don't if there is a clash or > ptesone) and the timestamp is just today. Thanks. And I’ll include your email on the GCC SSA. Thanks for tracking this down. This has been biting us for some time. There was a similar issue with accessing the two 32-bit halves of the 64-bit object header in Spur that meant I had to use a rewrite. I’ll try and dig it out. Perhaps we could collaborate on performing the same analysis and verifying that there is an issue with this header access code in gcc. > > Thanks > >> On Wed, Dec 11, 2019 at 6:02 PM Eliot Miranda <[hidden email]> wrote: >> >> >> >> >>>> On Dec 11, 2019, at 8:45 AM, "[hidden email]" <[hidden email]> wrote: >>> >>> >>> Hi Ken, >>> this is a problem in an optimization of GCC 8.3 (I am not sure what >>> other versions are impacted). >>> Basically it is generating bad a function, removing code that assumes >>> that is dead code. >>> Even though this is a bug in GCC, because there is not actual reason >>> to remove the code. >>> It is produced because the code is not written in a nice way. >>> >>> Basically, if someone wants to fix it in the VM code is modifying a >>> single method: >>> >>> Spur64BitMemoryManager >> fetchLong32: fieldIndex ofFloatObject: oop >>> "index by word size, and return a pointer as long as the word size" >>> >>> | bits | >>> (self isImmediateFloat: oop) ifFalse: >>> [^self fetchLong32: fieldIndex ofObject: oop]. >>> bits := self smallFloatBitsOf: oop. >>> ^ fieldIndex = 0 >>> ifTrue: [bits bitAnd: 16rFFFFFFFF] >>> ifFalse: [bits >> 32] >> >> Thank you, Pablo. I’ll integrate this quickly. Can you possibly give me initials and time stamp? It’s not necessary but helps that the code is marked as being your fix. >> >>> >>> The old method had a handwritten piece of C code to replace the last if: >>> >>> (self cCoerceSimple: (self addressOf: bits) to: #'int *') at: fieldIndex >>> >>> I assume it was done to handle a different kind of endianness, but the >>> single sender of this message is already doing the handling and having >>> different behavior depending on the endianness. >>> >>> With this fix the build can still be executed with -O2 optimization in GCC 8.3 >>> >>> Cheers, >>> >>>> On Sat, Dec 7, 2019 at 4:20 PM <[hidden email]> wrote: >>>> >>>> >>>>> Have you tried switching off compiler optimisations? >>>> >>>> Ah! I had forgotten that one. I thought that was changed in mvm. >>>> >>>> Image works fine noe (but more slowly!) on aarch64 Alpine Linux >>>> (musl+busybox). >>>> >>>> I am just out the door fir a couple of days but will try fdlib on >>>> return. >>>> >>>> Thanks again to all! >>>> -KenD >>> >>> >>> >>> -- >>> Pablo Tesone. >>> [hidden email] > > > > -- > Pablo Tesone. > [hidden email] |
Hi all, Eliot, Pablo, we should review all the pointer aliasing that we are still depending upon, because it can strike anytime soon... The recommended way is memcpy, so let's use that. The alternative is type puning via union, but it's not legal on C++. If wanting to support exotic compiler (MSVC) which is mainly focused on C++, not modern C, it's safer to use memcpy. Le mer. 11 déc. 2019 à 20:00, Eliot Miranda <[hidden email]> a écrit :
|
Hi Nicolas, On Wed, Dec 11, 2019 at 11:58 AM Nicolas Cellier <[hidden email]> wrote:
Agreed. I just went through all the uses of casts to int * and verified that all the other uses are legitimate (they're typically cases of firstIndexableField:, so of something in memory, not something that may be in a register). That's what this issue is really about. In the olden days for a C compiler to put something in a register one had to use the register type qualifier, and taking the address of an automatic variable (C's name for a local variable) would raise an error. Then (quite rightly) gcc started ignoring the register qualifier and (at my request) added the asm("register name") form for specifying global register variables. Taking the address of an automatic variable then marked it as a variable that couldn't be assigned to a variable. So tat was fine. One could perfectly legally play the usual memory punning tricks without having to resort to writing a union. When 64-bit computing went mainstream the need was for C compilers to generate good code for int variables and so lots of previously valid code was redefined to be undefined behavior. See Chris Lattner's excellent blog post http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html. (While this is a really good post I don't endorse the position taken by C compilers; IMO this is an example of the tail wagging the dog; compilers could still do what they did in the '90's and stack allocate something whose address is taken). Anyway, we don't control this world so we have to adapt.
_,,,^..^,,,_ best, Eliot |
In reply to this post by Nicolas Cellier
Hi Nicolas, hi all, Can you be more specific Nicolas on how to use memcpy to circumvent strict aliasing? What should the convention for the line: ^ (self cCoerceSimple: (self addressOf: bits) to: #'int *') at: fieldIndex ? This: | res | self memcpy: (self addressOf: res) _: bits + (fieldIndex *4) _: 4. ^ res ? I've already seen union used in the VM to circumvent aliasing (Cf in sqMemoryAccess.h): /* this is to allow strict aliasing assumption in the optimizer */ typedef union { double d; int i[sizeof(double) / sizeof(int)]; } _swapper; Not sure if we should do something about that right now. On Wed, Dec 11, 2019 at 8:58 PM Nicolas Cellier <[hidden email]> wrote:
|
Hi Clement, I need more context to answer your question. We need to declare an automatic (local) variable for holding the content. Here I understand that bits has to be re-interpreted as an array of int: we need to know the length of this array. If we can't determine the length at compile time, this is going to be tricky... I have no VMMaker image handy, so I won't try to type the exact declaration stntax from memory, but I can give you an equivalent for swapper in C if the purpose is to swap: int i[sizeof(double) / sizeof(int)]; double d; memcpy(i,&d,sizeof(d)); return i[1-index]; Usage of union for type puning is legal in C, at least up to C99. It is not legal in C++: the compiler may assume that assigning swapper.d won't have any effect on swapper.i, et vice et versa. We can use one field or the other but they must be considered unrelated. hence it can eliminate some code or consider it invariant in loops and keep swapper.d and swapper.i in two different registers and never reload them from memory... My only concern is if we want to use compiler with unclear standards (MSVC?) Le mar. 17 déc. 2019 à 20:05, Clément Béra <[hidden email]> a écrit :
|
>
My only concern is if we want to use compiler with unclear standards (MSVC?)
Maybe this helps...? On Wed, 18 Dec 2019 at 17:44, Nicolas Cellier <[hidden email]> wrote:
|
In reply to this post by Nicolas Cellier
Thanks Nicolas for your reply. That means that basically we make a copy of the whole array to read a single field. I was initially confused because copying the whole array looks expensive, but I guess all C compilers are able to remove the unused copy (everything is unused besides the field read). Best, On Wed, Dec 18, 2019, 10:43 Nicolas Cellier <[hidden email]> wrote:
|
Hi Clément, Hi Nicolas, On Fri, Dec 20, 2019 at 4:47 AM Clément Béra <[hidden email]> wrote:
For example: =========8<======== dpint.c =========8<======== typedef union { long l; double d; } Punner; unsigned long uv(double d) { return d; } signed long iv(double d) { return d; } unsigned long ivp(double d) { Punner punner; punner.d = d; return punner.l; } double dv(long l) { return l; } double du(unsigned long l) { return l; } double dvp(long l) { Punner punner; punner.l = l; return punner.d; } =========8<======== dpint.c =========8<======== aarch64-none-elf-cc -O4 -S dpint.c (a cross compiler version of gcc 7.2.0 for ARMv8 on MacOS) =========8<======== dpint.s =========8<======== uv: fcvtzu x0, d0 ret iv: fcvtzs x0, d0 ret ivp: fmov x0, d0 ret dv: scvtf d0, x0 ret du: ucvtf d0, x0 ret dvp: fmov d0, x0 ret =========8<======== dpint.s =========8<======== This is a quick and easy way of finding out the assembler mnemonics for float <-> integer conversion/moves. Far easier (and sometimes faster) than searching through a 7,900 page manual (!?!?!?!?)
_,,,^..^,,,_ best, Eliot |
Free forum by Nabble | Edit this page |