[squeak-dev] x86 sarl curiosity...

[squeak-dev] x86 sarl curiosity...

Eliot Miranda-2
Hi All,

    anyone know the x86/IA32 really well?  If so, read on.  Otherwise save yourself the yawn.

I just tried to save an instruction in Cog's generated bitShift: primitive.  It seems to me that SARL (shift arithmetic right long) should set the sign flag based on the result; in fact, it says as much in the manual.  I quote from the IA-32 Intel® Architecture Software Developer's Manual, Volume 2B: Instruction Set Reference, N-Z, p. 4-192:

Flags Affected 

The CF flag contains the value of the last bit shifted out of the destination operand; it is undefined for SHL and SHR instructions where the count is greater than or equal to the size (in bits) of the destination operand. The OF flag is affected only for 1-bit shifts (see "Description" above); otherwise, it is undefined. The SF, ZF, and PF flags are set according to the result. If the count is 0, the flags are not affected. For a non-zero count, the AF flag is undefined.


(my emphasis added).  But neither the Bochs simulator nor my Intel Core Duo set the flags when doing sarl $1, %eax when %eax contains -1.  Have I misread, or is the manual wrong?


TIA

Eliot



Re: [squeak-dev] x86 sarl curiosity...

Martin Beck-3
Hi Eliot,

Eliot Miranda wrote:
> Hi All,
>     anyone know the x86/IA32 really well?  If so, read on.  Otherwise save
> yourself the yawn.
>
[...]
> (my emphasis added).  But neither the Bochs simulator nor my Intel Core Duo
> set the flags when doing sarl $1, %eax when %eax contains -1.  Have I
> misread, or is the manual wrong?
>
I cannot confirm this. With this simple C program:

#include <stdio.h>

/* Shift right by one; GCC emits an arithmetic shift (sarl) for signed int. */
int calc(int i) {
  return i >> 1;
}

int main(void) {
  printf("%i\n", calc(-1));
  return 0;
}

my GCC 4.3.2 generates a sarl %eax instruction, as the assembler output
shows. Debugging it with Kdbg shows the flags changing after the
instruction. In fact, CF and SF are set as (more or less) expected. I
also have an Intel Core 2 Duo.

Regards,
Martin

[squeak-dev] Re: x86 sarl curiosity...

Nicolas Cellier-3
In reply to this post by Eliot Miranda-2
Eliot Miranda <eliot.miranda <at> gmail.com> writes:

> Hi All,
>
>     anyone know the x86/IA32 really well?  If so, read on.  Otherwise save
> yourself the yawn.
>
> I just tried to save an instruction in Cog's generated bitShift: primitive.
> It seems to me that SARL (shift arithmetic right long) should set the sign flag
> based on the result, in fact it says as much in the manual; I quote from IA-32
> Intel® Architecture Software Developer's Manual Volume 2B:  Instruction Set
> Reference, N-Z p 4-192
>

Hi Eliot,
I guess you are addressing the SmallInteger case; otherwise I would take
optimizing to mean using some MMX 64- or 128-bit arithmetic (like PSRLLQ).
If relevant, check my trivial optimizations for large ints at
http://bugs.squeak.org/view.php?id=7109

Nicolas


Re: [squeak-dev] x86 sarl curiosity...

Eliot Miranda-2
In reply to this post by Martin Beck-3
Hi Martin,  can you send me the assembly?  Or show me the opcodes? When I try this it doesn't work.  So I must be doing something differently.

and Hi!  Robert Hirschfeld mentioned you when he and I met last week.

On Wed, Jan 21, 2009 at 11:06 AM, Martin Beck <[hidden email]> wrote:
[...]

Re: [squeak-dev] Re: x86 sarl curiosity...

Eliot Miranda-2
In reply to this post by Nicolas Cellier-3


On Wed, Jan 21, 2009 at 12:38 PM, Nicolas Cellier <[hidden email]> wrote:
> [...]
> Hi Eliot,
> I guess you are addressing the SmallInteger case; otherwise I would take
> optimizing to mean using some MMX 64- or 128-bit arithmetic (like PSRLLQ).
> If relevant, check my trivial optimizations for large ints at
> http://bugs.squeak.org/view.php?id=7109

Hi Nicolas,

    yes, I'm doing SmallInteger, and I'm also trying to keep the JIT very simple initially, so there will be no MMX registers or instructions in the stage-one JIT until I do floating-point.  I'll take a look at these.  Thanks.


[squeak-dev] Re: [vwnc] x86 sarl curiosity...

Martin McClure-2
In reply to this post by Eliot Miranda-2
Eliot Miranda wrote:

> Hi All,
>
>     anyone know the x86/IA32 really well?  If so, read on.  Otherwise
> save yourself the yawn.
>
> I just tried to save an instruction in Cog's generated bitShift:
> primitive.  It seems to me that SARL (shift arithmetic right long)
> should set the sign flag based on the result, in fact it says as much in
> the manual; I quote from IA-32 Intel® Architecture Software
> Developer's Manual Volume 2B:  Instruction Set Reference, N-Z p 4-192
>
> Flags Affected
>
> The CF flag contains the value of the last bit shifted out of the
> destination operand; it is undefined for SHL and SHR instructions where
> the count is greater than or equal to the size (in bits) of the
> destination operand. The OF flag is affected only for 1-bit shifts (see
> "Description" above); otherwise, it is undefined. The SF, ZF, and PF
> flags are set according to the result. If the count is 0, the flags are
> not affected. For a non-zero count, the AF flag is undefined.
>
>
> (my emphasis added).  But neither the Bochs simulator nor my Intel Core
> Duo set the flags when doing sarl $1, %eax when %eax contains -1.  Have
> I misread, or is the manual wrong?

Interesting. FWIW, the AMD64 arch manual (vol 3, p.220) also says that
an SAR will affect the SF flag.

Regards,

-Martin

[squeak-dev] Re: x86 sarl curiosity...

Eliot Miranda-2
In reply to this post by Eliot Miranda-2
Apologies; my bad.  I'd used the wrong branch: jump greater (if 0 > v) is not the same as jump negative (if v is negative).  I live and learn.  Sorry for the noise.
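
For anyone following along, here is a minimal C sketch of why the choice of branch matters. It is not Cog's code: it only models the flags left behind by a 1-bit SAR on a 32-bit operand and then evaluates the documented conditions for JS (SF = 1), JL (SF != OF) and JG (ZF = 0 and SF = OF). The names sar1_flags, sar1, js_taken, jl_taken and jg_taken are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Flags of interest after a 1-bit SAR on a 32-bit operand (hypothetical model). */
typedef struct {
    uint32_t result;
    bool cf, sf, zf, of;
} sar1_flags;

/* Model "sarl $1, reg" on unsigned storage so the C itself is portable. */
sar1_flags sar1(uint32_t x) {
    sar1_flags f;
    f.result = (x >> 1) | (x & 0x80000000u); /* replicate the sign bit */
    f.cf = (x & 1u) != 0;                    /* last bit shifted out */
    f.sf = (f.result & 0x80000000u) != 0;    /* sign bit of the result */
    f.zf = (f.result == 0);                  /* result is zero */
    f.of = false;                            /* OF is cleared for a 1-bit SAR */
    return f;
}

/* Conditions tested by the conditional jumps. */
bool js_taken(sar1_flags f) { return f.sf; }                  /* JS: SF = 1 */
bool jl_taken(sar1_flags f) { return f.sf != f.of; }          /* JL: SF != OF */
bool jg_taken(sar1_flags f) { return !f.zf && f.sf == f.of; } /* JG: ZF = 0 and SF = OF */
```

With %eax = -1 (0xFFFFFFFF) the model gives SF = 1, CF = 1, ZF = 0 and OF = 0, so JS and JL would be taken while JG would not. That matches the flags being set correctly while a "greater" branch still falls through.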

On Wed, Jan 21, 2009 at 10:21 AM, Eliot Miranda <[hidden email]> wrote:
Hi All,

    anyone know the x86/IA32 really well?  If so, read on.  Otherwise save yourself the yawn.

[...]


(my emphasis added).  But neither the Bochs simulator nor my Intel Core Duo set the flags when doing sarl $1, %eax when %eax contains -1.  Have I misread, or is the manual wrong?


TIA

Eliot




Re: [squeak-dev] x86 sarl curiosity...

David Farber
In reply to this post by Eliot Miranda-2
Eliot - I know you've already moved past this problem, but in the future, gcc -S foo.c will create foo.s with the assembly generated by gcc.

David

On Jan 21, 2009, at 2:08 PM, Eliot Miranda wrote:

Hi Martin,  can you send me the assembly?  Or show me the opcodes? When I try this it doesn't work.  So I must be doing something differently.

and Hi!  Robert Hirschfeld mentioned you when he and I met last week.


Re: [squeak-dev] x86 sarl curiosity...

Eliot Miranda-2


On Thu, Jan 22, 2009 at 12:53 PM, David Farber <[hidden email]> wrote:
Eliot - I know you've already moved past this problem, but in the future, gcc -S foo.c will create foo.s with the assembly generated by gcc.

Um, I know :)  Trouble is gcc also optimizes so it may not always generate the code you expect.  For example,

int issignedshift(int v) { return (v >> 1) < 0 ? 1 : 0; }

will, with -O4, generate

       movl 4(%esp), %eax
       sarl $31,%eax
       ret

because it works out this is the quickest way to generate a 1 if v is negative and doesn't generate a compare at all.
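
The idea of replacing the compare with one 31-bit shift can be checked with a small C sketch. It assumes 32-bit values and models both shifts on unsigned storage; sar32 and shr32 are illustrative names, not anything gcc emits. An arithmetic shift by 31 copies the sign bit across the whole word, so it yields all ones (-1) for a negative input and 0 otherwise, while a logical shift by 31 yields exactly 1 or 0.

```c
#include <assert.h>
#include <stdint.h>

/* Arithmetic shift right by n (0..31): the vacated high bits take the sign bit. */
uint32_t sar32(uint32_t x, unsigned n) {
    assert(n < 32);
    uint32_t fill = (x & 0x80000000u) ? ~(UINT32_MAX >> n) : 0u;
    return (x >> n) | fill;
}

/* Logical shift right by n (0..31): the vacated high bits are zero. */
uint32_t shr32(uint32_t x, unsigned n) {
    assert(n < 32);
    return x >> n;
}

/* Sign test expressed without a compare: 1 if the sign bit is set, else 0. */
int is_negative(uint32_t x) {
    return (int)shr32(x, 31);
}
```

For example, with the bit pattern of -5 as input, sar32(x, 31) is 0xFFFFFFFF and shr32(x, 31) is 1; for any non-negative input both are 0.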

BTW, I've been abusing gcc's -S output for a long time.  Back in the 80's I used to generate direct-threaded-code VMs using gcc where I would edit the -S output with sed to produce the opcodes for the threaded code machine stripped of the prolog and epilog gcc would produce.  I've also produced JIT-compiled BitBlt by similar means with a number of different compilers.  -S has been my friend for many years.

Cheers!
Eliot



Re: [squeak-dev] x86 sarl curiosity...

Igor Stasenko
2009/1/22 Eliot Miranda <[hidden email]>:

>
>
> On Thu, Jan 22, 2009 at 12:53 PM, David Farber <[hidden email]> wrote:
>>
>> Eliot - I know you've already moved past this problem, but in the future,
>> gcc -S foo.c will create foo.s with the assembly generated by gcc.
>
> Um, I know :)  Trouble is gcc also optimizes so it may not always generate
> the code you expect.  For example,
> issignedshift(v) { return (v >> 1) < 0 ? 1 : 0; }
> will, with -O4, generate
>        movl 4(%esp), %eax
>        sarl $31,%eax
>        ret
> because it works out this is the quickest way to generate a 1 if v is
> negative and doesn't generate a compare at all.
> BTW, I've been abusing gcc's -S output for a long time.  Back in the 80's I
> used to generate direct-threaded-code VMs using gcc where I would edit the
> -S output with sed to produce the opcodes for the threaded code machine
> stripped of the prolog and epilog gcc would produce.  I've also produced
> JIT-compiled BitBlt by similar means with a number of different compilers.
>  -S has been my friend for many years.

I'm dreaming of having

listing :=  Object  compile: 'yourself ^self' options: '-S'

:)


--
Best regards,
Igor Stasenko AKA sig.

Re: [squeak-dev] x86 sarl curiosity...

David Farber
In reply to this post by Eliot Miranda-2
On Jan 22, 2009, at 1:59 PM, Eliot Miranda wrote:

[...]

Ok, ok.  It's just that when I looked at the assembler output for Martin's example, it looked like it covered the case you were fighting.  (I didn't step through it with a debugger.)

        .text
.globl _calc
_calc:
        pushl   %ebp
        movl    %esp, %ebp
        subl    $8, %esp
        movl    8(%ebp), %eax
        sarl    %eax
        leave
        ret

Then you said "can you send me the assembly?  Or show me the opcodes?" instead of something like "What gcc version/flags are you using."

A thousand apologies for having impugned your knowledge of gcc.

I will now go away before you taunt me a second time.

:)

David



Re: [squeak-dev] x86 sarl curiosity...

johnmci
In reply to this post by Eliot Miranda-2

On 22-Jan-09, at 12:59 PM, Eliot Miranda wrote:

> BTW, I've been abusing gcc's -S output for a long time.

-fverbose-asm

is also helpful...
well, unless you'd rather map registers to local variables in your head
because you know it should work that way.


--
===========================================================================
John M. McIntosh <[hidden email]>
Corporate Smalltalk Consulting Ltd.  http://www.smalltalkconsulting.com
===========================================================================