Hello,

I was debugging a strange crash when calling sqrt via a Lowcode instruction in the interpreter. I traced it to currentBytecode, which is stored in a register (EBX) and ended up holding a very large value. When debugging the generated assembly code with GDB, I noticed that GCC was generating position independent code and using EBX for doing a call without spilling/unspilling its value. |
Correction: this is not because of GCC, but because of Ubuntu 16.10. The same happens with GCC 5.

2017-02-21 0:35 GMT-03:00 Ronie Salgado <[hidden email]>:
|
On Tue, Feb 21, 2017 at 1:23 PM, Ronie Salgado <[hidden email]> wrote:
>
> Correction: this is not because of GCC, but because of Ubuntu 16.10. The same happens with GCC 5
>
> 2017-02-21 0:35 GMT-03:00 Ronie Salgado <[hidden email]>:
>>
>> Hello,
>>
>> I was debugging a strange crash when calling sqrt via a Lowcode instruction in the interpreter, which I tracked to currentBytecode stored in register(EBX), having a very large value. When debugging the generated assembly code with GDB, I noticed that GCC was generating position independent code and using EBX for doing a call without spilling/unspilling its value.
>>
>> By googling, it seems that position independent executable generation was turned on GCC 6 by default ( https://www.open-mesh.org/issues/304 ). To disable PIE, we have to compile the sources with -fno-pie and link with the -no-pie options.

Would that only be applicable to 32-bit?

To familiarise myself with these concepts, I found this a good explanation of Position Independent Code...
* http://eli.thegreenplace.net/2011/11/03/position-independent-code-pic-in-shared-libraries
which says

"... will explain only how PIC works on x86, picking this older architecture specifically because (unlike x64) it wasn't designed with PIC in mind, so implementing PIC on it is a bit trickier ... Some non-Intel architectures like SPARC64 force PIC-only code for shared libraries, and many others (for example, ARM) include IP-relative addressing modes to make PIC more efficient. Both are true for the successor of x86, the x64 architecture."

and the sister article on Load Time Relocation...
* http://eli.thegreenplace.net/2011/08/25/load-time-relocation-of-shared-libraries/
says

"... some modern systems (such as x86-64) no longer support load-time relocation."

and at the bottom here describes why building without PIC on 64-bit requires -mcmodel=large.
* http://eli.thegreenplace.net/2011/11/11/position-independent-code-pic-in-shared-libraries-on-x64

cheers -ben |
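To check which of these modes a given build of the VM sources actually ends up in, something like the following might help. It is only a minimal sketch: it relies on the __PIC__ and __PIE__ macros that GCC and Clang predefine when -fpic/-fPIC or -fpie/-fPIE are in effect, and the helper names (pic_level, pie_level, codegen_mode, describe_codegen) are made up for this illustration, not taken from the VM.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* PIC level reported by the compiler: 0 = none, 1 = -fpic/-fpie, 2 = -fPIC/-fPIE. */
static int pic_level(void)
{
#ifdef __PIC__
    return __PIC__;
#else
    return 0;
#endif
}

/* PIE level reported by the compiler: 0 = not building a PIE. */
static int pie_level(void)
{
#ifdef __PIE__
    return __PIE__;
#else
    return 0;
#endif
}

/* Hypothetical helper: short name for the code generation mode of this file. */
static const char *codegen_mode(void)
{
    if (pie_level() > 0)
        return "PIE";
    if (pic_level() > 0)
        return "PIC";
    return "non-PIC";
}

/* Writes a one-line summary into buf and returns the snprintf result. */
static int describe_codegen(char *buf, size_t size)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "%s (pic=%d, pie=%d)",
                    codegen_mode(), pic_level(), pie_level());
}
```

Printing the result of describe_codegen from a small main(), once built with -fno-pie and once with the distribution defaults, should show whether the toolchain turns PIE on by itself.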
> On 21 Feb 2017, at 10:35, Ronie Salgado <[hidden email]> wrote:
>
> Hello,
>
> I was debugging a strange crash when calling sqrt via a Lowcode instruction in the interpreter, which I tracked to currentBytecode stored in register(EBX), having a very large value. When debugging the generated assembly code with GDB, I noticed that GCC was generating position independent code and using EBX for doing a call without spilling/unspilling its value.

Can you elaborate on the misbehavior? According to the ABI [1], EBX is a local register. Can GCC know that EBX has been used for something else?

holger

[1] http://www.sco.com/developers/devspecs/abi386-4.pdf |
Hello,

> Would that only be applicable to 32-bit?

Perhaps. I have not tested 64-bit Linux without forcing -fno-pie and -no-pie in my build.

The default option on Ubuntu 16.10 is not -fPIC but -fpie, which allows Address Space Layout Randomization (a technique to mitigate security exploits based on buffer overflows and return-oriented programming) of the code of the executables themselves. Executables, unlike shared libraries, are usually not compiled as position independent code, even on x86_64. On x86_64, the difference between PIC and non-PIC is not only the use of RIP-relative addressing, but also the use of the GOT and the PLT for doing calls. The relocation of the position dependent code of executables happens at link time, long before load time.

> On 21 Feb 2017, at 10:35, Ronie Salgado <[hidden email]> wrote:

spursrc/vm/gcc3x-cointerp.c:

    sqInt
    interpret(void)
    { DECL_MAYBE_SQ_GLOBAL_STRUCT
        register sqInt currentBytecode CB_REG;
    ...

platforms/unix/vm/sqGnu.h:

    #elif defined(__i386__)
    # define IP_REG __asm__("%esi")
    # define SP_REG __asm__("%edi")
    # if (__GNUC__ > 2) || ((__GNUC__ == 2) && (__GNUC_MINOR__ >= 95))
    # define CB_REG __asm__("%ebx")
    # else
    # define CB_REG /* avoid undue register pressure */
    # endif

Lines 25313-25315 of gcc3x-cointerp.c:

    25313 result14 = sqrt(value17);
    25314 /* begin internalPushFloat32: */
    25315 nativeSP = (nativeStackPointerIn(localFP)) - BytesPerOop;

GDB session:

    (gdb) list interpret
    2604 /* If stacklimit is zero then the stack pages have not been initialized. */
    2605
    2606 /* StackInterpreter>>#interpret */
    2607 sqInt
    2608 interpret(void)
    2609 { DECL_MAYBE_SQ_GLOBAL_STRUCT
    2610 register sqInt currentBytecode CB_REG;
    2611 sqInt extA;
    2612 sqInt extB;
    2613 sqInt lkupClassTag;
    (gdb) break 25313
    Breakpoint 1 at 0x55c77: file /home/ronie/projects/osvm-lowcode-clean/spurlowcodesrc/vm/gcc3x-cointerp.c, line 25313.
    (gdb) break 25314
    Breakpoint 2 at 0x55c9b: file /home/ronie/projects/osvm-lowcode-clean/spurlowcodesrc/vm/gcc3x-cointerp.c, line 25314.
    (gdb) break 25315
    Note: breakpoint 2 also set at pc 0x55c9b.
    Breakpoint 3 at 0x55c9b: file /home/ronie/projects/osvm-lowcode-clean/spurlowcodesrc/vm/gcc3x-cointerp.c, line 25315.
    (gdb) run
    Starting program: /home/ronie/projects/osvm-lowcode-clean/products/debug/phcoglowcodelinuxht/lib/pharo/5.0-201702210706-LowcodeFixup/pharo
    [Thread debugging using libthread_db enabled]
    Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
    [New Thread 0xf728db40 (LWP 4716)]

    Thread 1 "pharo" hit Breakpoint 1, interpret () at /home/ronie/projects/osvm-lowcode-clean/spurlowcodesrc/vm/gcc3x-cointerp.c:25313
    25313 result14 = sqrt(value17);
    (gdb) print currentBytecode
    $1 = 504
    (gdb) info registers
    eax            0xf778f009   -143069175
    ecx            0x5678ec88   1450765448
    edx            0xf778f009   -143069175
    ebx            0x1f8        504
    esp            0xfffc2540   0xfffc2540
    ebp            0xfffc6558   0xfffc6558
    esi            0x594e702d   1498312749
    edi            0xfffcb390   -216176
    eip            0x565aac77   0x565aac77 <interpret+214836>
    eflags         0x282        [ SF IF ]
    cs             0x23         35
    ss             0x2b         43
    ds             0x2b         43
    es             0x2b         43
    fs             0x0          0
    gs             0x63         99
    (gdb) continue
    Continuing.

    Thread 1 "pharo" hit Breakpoint 2, interpret () at /home/ronie/projects/osvm-lowcode-clean/spurlowcodesrc/vm/gcc3x-cointerp.c:25315
    25315 nativeSP = (nativeStackPointerIn(localFP)) - BytesPerOop;
    (gdb) info registers
    eax            0xf778f009   -143069175
    ecx            0x5678ec88   1450765448
    edx            0x127f       4735
    ebx            0x5678ec88   1450765448
    esp            0xfffc2540   0xfffc2540
    ebp            0xfffc6558   0xfffc6558
    esi            0x594e702d   1498312749
    edi            0xfffcb390   -216176
    eip            0x565aac9b   0x565aac9b <interpret+214872>
    eflags         0x282        [ SF IF ]
    cs             0x23         35
    ss             0x2b         43
    ds             0x2b         43
    es             0x2b         43
    fs             0x0          0
    gs             0x63         99
    (gdb) print currentBytecode
    $2 = 1450765448

GDB layout asm on the line 25313 shows the generated code:

    B+>│0x565aac77 <interpret+214836> flds -0x1d6c(%ebp)
       │0x565aac7d <interpret+214842> sub $0x8,%esp
       │0x565aac80 <interpret+214845> lea -0x8(%esp),%esp
       │0x565aac84 <interpret+214849> fstpl (%esp)
       │0x565aac87 <interpret+214852> mov -0x4008(%ebp),%ebx
       │0x565aac8d <interpret+214858> call 0x56570840

Of special importance is the instruction: mov -0x4008(%ebp),%ebx. This is the PLT entry for sqrt, and this is where ebx with the currentBytecode is destroyed.

2017-02-21 22:35 GMT-03:00 Holger Freyther <[hidden email]>:
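One possible way to sidestep the clobbering, besides building with -fno-pie and -no-pie, would be to stop pinning the bytecode to %ebx whenever the file is compiled as position independent code. The following is only a sketch under that assumption and not the VM's actual fix: the extra __PIC__ test, the CB_REG_PINNED macro and run_toy_interpreter are hypothetical, and strtol merely stands in for sqrt as an external library call so the example does not need -lm.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical guard, modelled on the CB_REG block in sqGnu.h: pin the
 * bytecode to %ebx only on i386 builds that are NOT position independent,
 * because PIC/PIE code on i386 needs %ebx for calls through the PLT. */
#if defined(__i386__) && defined(__GNUC__) && !defined(__PIC__)
# define CB_REG __asm__("%ebx")
# define CB_REG_PINNED 1
#else
# define CB_REG /* leave register allocation to the compiler */
# define CB_REG_PINNED 0
#endif

/* Toy dispatch loop: opcode 1 makes an external libc call, every opcode
 * is then added to the accumulator. */
static long run_toy_interpreter(const int *ops, size_t num_ops)
{
    register long currentBytecode CB_REG;
    long acc = 0;
    size_t i;

    assert(ops != NULL || num_ops == 0);
    for (i = 0; i < num_ops; i++) {
        currentBytecode = ops[i];
        if (currentBytecode == 1)
            acc += strtol("42", NULL, 10); /* external call, via the PLT when dynamically linked */
        acc += currentBytecode;            /* the bytecode must still be intact here */
    }
    return acc;
}
```

The GCC manual documents local register variables as supported only for naming the operands of extended asm, so code that depends on the value surviving in that register across a call is relying on behaviour GCC does not promise.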
|
> On 22 Feb 2017, at 14:58, Ronie Salgado <[hidden email]> wrote:

Dear Ronie,

> GDB layout asm on the line 25313 shows the generated code.
>
> B+>│0x565aac77 <interpret+214836> flds -0x1d6c(%ebp)
>    │0x565aac7d <interpret+214842> sub $0x8,%esp
>    │0x565aac80 <interpret+214845> lea -0x8(%esp),%esp
>    │0x565aac84 <interpret+214849> fstpl (%esp)
>    │0x565aac87 <interpret+214852> mov -0x4008(%ebp),%ebx
>    │0x565aac8d <interpret+214858> call 0x56570840
>
> Of special importance, is the instruction: mov -0x4008(%ebp),%ebx . this is the PLT entry for sqrt, and this is where ebx with the currentBytecode is destroyed.

I tried to reproduce it, but I think I don't generate enough register pressure:

    #include <stdint.h>
    #include <sys/types.h>
    #include <math.h>

    int interpret(int *ops, const size_t num_ops)
    {
        /* pin the current opcode to %ebx, as the VM does with CB_REG */
        register int op __asm__("%ebx");
        size_t off = 0;

        while (off < num_ops) {
            op = ops[off];
            switch (op) {
            case 1:
            case 2:
                /* library call that goes through the PLT in PIC/PIE builds */
                sqrt(op + num_ops);
                break;
            default:
                break;
            }
            off += 1;
        }
        return 0;
    }

Can you think of a way to get closer to the interpreter? Is it using computed goto? If there is a reproducer, I am happy to open a bug with the GCC project and try to bring it to a resolution.

holger |
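In case it helps with getting closer to the interpreter's shape, here is what a computed-goto version of such a test could look like. This is only a sketch of the dispatch style using the GNU C "labels as values" extension; whether the real interpreter dispatches this way is not confirmed in this thread, and it is not a verified reproducer. The opcode names, run_threaded and the NEXT macro are invented for the example, and strtod stands in for sqrt as the external call so that no -lm is needed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical opcodes for this sketch. */
enum { OP_NOP = 0, OP_CALL = 1, OP_HALT = 2, OP_COUNT };

static double run_threaded(const int *ops, size_t num_ops)
{
    /* GNU C labels as values: one label address per opcode. */
    static void *const dispatch[OP_COUNT] = { &&do_nop, &&do_call, &&do_halt };
    double acc = 0.0;
    size_t pc = 0;
    int op;

    assert(ops != NULL || num_ops == 0);

/* Fetch the next opcode and jump straight to its handler; running off the
 * end of the program or hitting an unknown opcode halts. */
#define NEXT() \
    do { \
        if (pc >= num_ops) goto do_halt; \
        op = ops[pc++]; \
        if (op < 0 || op >= OP_COUNT) goto do_halt; \
        goto *dispatch[op]; \
    } while (0)

    NEXT();
do_nop:
    NEXT();
do_call:
    acc += strtod("1.5", NULL); /* external libc call, stand-in for sqrt */
    NEXT();
do_halt:
    return acc;
#undef NEXT
}
```

To move this towards the failing case, one would presumably pin op to %ebx as in the test above, call sqrt instead of strtod, and build for i386 with -fpie, then compare the code generated around the call.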