Hi All,
anyone know the x86/IA32 really well? If so, read on. Otherwise save yourself the yawn.
I just tried to save an instruction in Cog's generated bitShift: primitive. It seems to me that SARL (shift arithmetic right long) should set the sign flag based on the result; in fact it says as much in the manual. I quote from IA-32 Intel® Architecture Software Developer's Manual Volume 2B: Instruction Set Reference, N-Z, p. 4-192:
Flags Affected
The CF flag contains the value of the last bit shifted out of the destination operand; it is undefined for SHL and SHR instructions where the count is greater than or equal to the size (in bits) of the destination operand. The OF flag is affected only for 1-bit shifts (see "Description" above); otherwise, it is undefined. The SF, ZF, and PF flags are set according to the result. If the count is 0, the flags are not affected. For a non-zero count, the AF flag is undefined.
(my emphasis added). But neither the Bochs simulator nor my Intel Core Duo set the flags when doing sarl $1, %eax when %eax contains -1. Have I misread, or is the manual wrong?
TIA
Eliot
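To make the quoted "Flags Affected" passage concrete, here is a minimal C sketch that models a 32-bit SAR and the flags the manual says it sets. The names SarResult and sar32 are made up for illustration, and the model only covers counts from 1 to 31. Clearing OF follows the manual's description of 1-bit SAR; for larger counts OF is undefined and is simply modelled as 0 here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical record of a SAR result plus the flags discussed above. */
typedef struct {
    uint32_t result; /* destination operand after the shift */
    int cf;          /* last bit shifted out */
    int sf;          /* sign bit of the result */
    int zf;          /* 1 if the result is zero */
    int pf;          /* 1 if the low byte of the result has even parity */
    int of;          /* cleared for 1-bit SAR; modelled as 0 otherwise */
} SarResult;

/* Model of a 32-bit arithmetic right shift, count in 1..31.
   A count of 0 leaves the flags unaffected and is not modelled. */
SarResult sar32(uint32_t value, unsigned count) {
    SarResult r;
    uint32_t sign_fill;
    uint32_t low;

    assert(count >= 1 && count <= 31);

    /* Replicate the sign bit into the vacated high bits. */
    sign_fill = (value & 0x80000000u) ? ~(0xFFFFFFFFu >> count) : 0u;
    r.result = (value >> count) | sign_fill;

    r.cf = (int)((value >> (count - 1)) & 1u);
    r.sf = (int)(r.result >> 31);
    r.zf = (r.result == 0u);

    /* Fold the low byte down to a single parity bit. */
    low = r.result & 0xFFu;
    low ^= low >> 4;
    low ^= low >> 2;
    low ^= low >> 1;
    r.pf = !(low & 1u);

    r.of = 0;
    return r;
}
```

Under this model, sar32(0xFFFFFFFFu, 1) (that is, sarl $1 on -1) leaves the value at -1 with SF and CF both set and ZF clear, which is what the manual's wording predicts.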
Hi Eliot,
Eliot Miranda wrote:
> Hi All,
> anyone know the x86/IA32 really well? If so, read on. Otherwise save
> yourself the yawn.
> [...]
> (my emphasis added). But neither the Bochs simulator nor my Intel Core Duo
> set the flags when doing sarl $1, %eax when %eax contains -1. Have I
> misread, or is the manual wrong?
>
I cannot confirm this. Using this simple C program:

#include <stdio.h>

int calc(int i) {
    return i >> 1;
}

int main() {
    printf("%i\n", calc(-1));
}

my GCC 4.3.2 generates a sarl %eax instruction, as the assembler output shows. Debugging it with Kdbg shows a change of the flags after the instruction. In fact, CF and SF are set as (more or less) expected. I also have an Intel Core 2 Duo.

Regards,
Martin
In reply to this post by Eliot Miranda-2
Eliot Miranda <eliot.miranda <at> gmail.com> writes:
> Hi All,
>
> anyone know the x86/IA32 really well? If so, read on. Otherwise save yourself the yawn.
>
> I just tried to save an instruction in Cog's generated bitShift: primitive. It seems to me that SARL (shift arithmetic right long) should set the sign flag based on the result, in fact it says as much in the manual; I quote from IA-32 Intel® Architecture Software Developer's Manual Volume 2B: Instruction Set Reference, N-Z p 4-192
>
Hi Eliot,
I guess you are addressing the SmallInteger case; otherwise I would take "optimize" to mean using some 64- or 128-bit MMX arithmetic (like PSRLQ).
If relevant, check my trivial optimizations for large ints at http://bugs.squeak.org/view.php?id=7109

Nicolas
In reply to this post by Martin Beck-3
Hi Martin, can you send me the assembly? Or show me the opcodes? When I try this it doesn't work. So I must be doing something differently.
And hi! Robert Hirschfeld mentioned you when he and I met last week.
On Wed, Jan 21, 2009 at 11:06 AM, Martin Beck <[hidden email]> wrote:
> Hi Eliot,
In reply to this post by Nicolas Cellier-3
On Wed, Jan 21, 2009 at 12:38 PM, Nicolas Cellier <[hidden email]> wrote:
Hi Nicolas, yes, I'm doing SmallInteger, and I'm also trying to keep the JIT very simple initially, so no MMX registers or instructions in the stage-one JIT until I do floating point. I'll take a look at these. Thanks.
In reply to this post by Eliot Miranda-2
Eliot Miranda wrote:
> Hi All,
>
> anyone know the x86/IA32 really well? If so, read on. Otherwise
> save yourself the yawn.
>
> I just tried to save an instruction in Cog's generated bitShift:
> primitive. It seems to me that SARL (shift arithmetic right long)
> should set the sign flag based on the result, in fact it says as much in
> the manual; I quote from IA-32 Intel® Architecture Software
> Developer's Manual Volume 2B: Instruction Set Reference, N-Z p 4-192
>
> Flags Affected
>
> The CF flag contains the value of the last bit shifted out of the
> destination operand; it is undefined for SHL and SHR instructions
> where the count is greater than or equal to the size (in bits) of the
> destination operand. The OF flag is affected only for 1-bit shifts
> (see "Description" above); otherwise, it is undefined. The SF, ZF, and
> PF flags are set according to the result. If the count is 0, the flags
> are not affected. For a non-zero count, the AF flag is undefined.
>
> (my emphasis added). But neither the Bochs simulator nor my Intel Core
> Duo set the flags when doing sarl $1, %eax when %eax contains -1. Have
> I misread, or is the manual wrong?

Interesting. FWIW, the AMD64 arch manual (vol 3, p.220) also says that an SAR will affect the SF flag.

Regards,
-Martin
In reply to this post by Eliot Miranda-2
Apologies; my bad. I'd used the wrong branch. Jump if greater (if 0 > v) is not the same as jump if negative (if v is negative). I live and learn. Sorry for the noise.
On Wed, Jan 21, 2009 at 10:21 AM, Eliot Miranda <[hidden email]> wrote:
> Hi All,
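The difference between the two branches comes down to which flags each one tests. The following is a small C sketch of the documented x86 conditions: JS is taken when SF is set, while JG is taken only when ZF is clear and SF equals OF. The Flags struct and function names are hypothetical, and the flag values for the -1 case assume a 1-bit SAR, which clears OF.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of the three flags these branches look at. */
typedef struct {
    bool zf;
    bool sf;
    bool of;
} Flags;

/* JS: jump if sign, taken when SF = 1. */
bool js_taken(Flags f) { return f.sf; }

/* JG (JNLE): jump if greater, taken when ZF = 0 and SF = OF. */
bool jg_taken(Flags f) { return !f.zf && f.sf == f.of; }

/* JL (JNGE): jump if less, taken when SF != OF. */
bool jl_taken(Flags f) { return f.sf != f.of; }

/* Flags after `sarl $1, %eax` with %eax = -1:
   the result is -1, so ZF = 0 and SF = 1; a 1-bit SAR clears OF. */
Flags flags_after_sar1_of_minus_one(void) {
    Flags f = { false, true, false };
    return f;
}

/* Self-check of the case from the thread. */
void check_minus_one_case(void) {
    Flags f = flags_after_sar1_of_minus_one();
    assert(js_taken(f));   /* the sign flag is set, so JS jumps */
    assert(!jg_taken(f));  /* SF != OF, so JG falls through */
    assert(jl_taken(f));   /* JL agrees with JS here because OF = 0 */
}
```

So a JG placed after the shift falls through even though the sign flag was set correctly, which can look as if SAR had not touched the flags at all.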
In reply to this post by Eliot Miranda-2
Eliot - I know you've already moved past this problem, but in the future, gcc -S foo.c will create foo.s with the assembly generated by gcc.

David

On Jan 21, 2009, at 2:08 PM, Eliot Miranda wrote:
> Hi Martin, can you send me the assembly? Or show me the opcodes? When I try this it doesn't work. So I must be doing something differently.
On Thu, Jan 22, 2009 at 12:53 PM, David Farber <[hidden email]> wrote:
Um, I know :) Trouble is, gcc also optimizes, so it may not always generate the code you expect. For example,

int issignedshift(int v) { return (v >> 1) < 0 ? 1 : 0; }

will, with -O4, generate

    movl 4(%esp), %eax
    sarl $31,%eax
    ret

because it works out this is the quickest way to generate a 1 if v is negative and doesn't generate a compare at all.
BTW, I've been abusing gcc's -S output for a long time. Back in the 80s I used to generate direct-threaded-code VMs using gcc, where I would edit the -S output with sed to produce the opcodes for the threaded-code machine, stripped of the prolog and epilog gcc would produce. I've also produced JIT-compiled BitBlt by similar means with a number of different compilers. -S has been my friend for many years.
Cheers! Eliot
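As a rough illustration of why a compiler can drop the compare entirely: the sign of v >> 1 is the same as the sign of v, so the whole function reduces to reading the top bit of v. The sketch below assumes that >> on a negative int is an arithmetic shift (as GCC documents) and uses a hypothetical helper, sign_bit, for the branch-free form. It is not a claim about the exact instructions any particular GCC version emits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The function from the post, with explicit types.
   Assumes >> on a negative value is an arithmetic shift. */
int issignedshift(int32_t v) { return (v >> 1) < 0 ? 1 : 0; }

/* Hypothetical branch-free equivalent: a logical shift moves the
   sign bit down to bit 0, giving exactly 0 or 1. */
int sign_bit(int32_t v) { return (int)((uint32_t)v >> 31); }

/* Check that both forms agree on a few boundary values. */
void check_equivalence(void) {
    const int32_t samples[] = { INT32_MIN, -2, -1, 0, 1, 2, INT32_MAX };
    size_t i;
    for (i = 0; i < sizeof samples / sizeof samples[0]; i++) {
        assert(issignedshift(samples[i]) == sign_bit(samples[i]));
    }
}
```

Neither form needs a compare or a conditional branch, so with optimization on, the -S output may contain no flag test at all.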
2009/1/22 Eliot Miranda <[hidden email]>:
> On Thu, Jan 22, 2009 at 12:53 PM, David Farber <[hidden email]> wrote:
>>
>> Eliot - I know you've already moved past this problem, but in the future,
>> gcc -S foo.c will create foo.s with the assembly generated by gcc.
>
> Um, I know :) Trouble is gcc also optimizes so it may not always generate
> the code you expect. For example,
> issignedshift(v) { return (v >> 1) < 0 ? 1 : 0; }
> will, with -O4, generate
> movl 4(%esp), %eax
> sarl $31,%eax
> ret
> because it works out this is the quickest way to generate a 1 if v is
> negative and doesn't generate a compare at all.
> BTW, I've been abusing gcc's -S output for a long time. Back in the 80's I
> used to generate direct-threaded-code VMs using gcc where I would edit the
> -S output with sed to produce the opcodes for the threaded code machine
> stripped of the prolog and epilog gcc would produce. I've also produced
> JIT-compiled BitBlt by similar means with a number of different compilers.
> -S has been my friend for many years.

I'm dreaming of having

listing := Object compile: 'yourself ^self' options: '-S'

:)

> Cheers!
> Eliot

--
Best regards,
Igor Stasenko AKA sig.
In reply to this post by Eliot Miranda-2
On Jan 22, 2009, at 1:59 PM, Eliot Miranda wrote:
Ok, ok. It's just that when I looked at the assembler output for Martin's example, it looked like it covered the case you were fighting. (I didn't step through it with a debugger.)

        .text
.globl _calc
_calc:
        pushl   %ebp
        movl    %esp, %ebp
        subl    $8, %esp
        movl    8(%ebp), %eax
        sarl    %eax
        leave
        ret

Then you said "can you send me the assembly? Or show me the opcodes?" instead of something like "What gcc version/flags are you using?"

A thousand apologies for having impugned your knowledge of gcc. I will now go away before you taunt me a second time. :)

David
In reply to this post by Eliot Miranda-2
On 22-Jan-09, at 12:59 PM, Eliot Miranda wrote:
> BTW, I've been abusing gcc's -S output for a long time.

-fverbose-asm is also helpful... well, unless you'd rather map registers to local variables in your head because you know it should work that way.

--
===========================================================================
John M. McIntosh <[hidden email]>
Corporate Smalltalk Consulting Ltd. http://www.smalltalkconsulting.com
===========================================================================