[squeak-dev] x86 sarl curiosity...

Eliot Miranda-2

[squeak-dev] x86 sarl curiosity...

Hi All,

anyone know the x86/IA32 really well? If so, read on. Otherwise save yourself the yawn.

I just tried to save an instruction in Cog's generated bitShift: primitive. It seems to me that SARL (shift arithmetic right long) should set the sign flag based on the result; in fact, the manual says as much. I quote from the IA-32 Intel® Architecture Software Developer's Manual, Volume 2B: Instruction Set Reference, N-Z, p. 4-192:

Flags Affected

The CF flag contains the value of the last bit shifted out of the destination operand; it is undefined for SHL and SHR instructions where the count is greater than or equal to the size (in bits) of the destination operand. The OF flag is affected only for 1-bit shifts (see "Description" above); otherwise, it is undefined. The SF, ZF, and PF flags are set according to the result. If the count is 0, the flags are not affected. For a non-zero count, the AF flag is undefined.

(my emphasis added). But neither the Bochs simulator nor my Intel Core Duo set the flags when doing sarl $1, %eax when %eax contains -1. Have I misread, or is the manual wrong?
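For reference, here is a minimal sketch in portable C of what the quoted "Flags Affected" text predicts for this case. It is only a model of the documented behaviour, not a measurement of real hardware; the names sar_outcome and model_sarl are hypothetical, and the model assumes a 32-bit operand and a count between 1 and 31.

```c
#include <stdint.h>

/* Hypothetical record of the result and the flags that the manual says
   are defined after SAR with a non-zero count. */
typedef struct {
    uint32_t result; /* destination operand after the shift */
    int cf;          /* last bit shifted out */
    int sf;          /* sign bit of the result */
    int zf;          /* 1 if the result is zero */
} sar_outcome;

/* Models a 32-bit arithmetic right shift for 1 <= count <= 31.
   Counts outside that range are not handled by this sketch. */
sar_outcome model_sarl(uint32_t value, unsigned count)
{
    sar_outcome out;
    /* Replicate the sign bit into the vacated high bits. */
    uint32_t sign_fill = (value & UINT32_C(0x80000000))
        ? (uint32_t)(UINT32_C(0xFFFFFFFF) << (32 - count))
        : 0u;

    out.result = (value >> count) | sign_fill;
    out.cf = (int)((value >> (count - 1)) & 1u);
    out.sf = (int)(out.result >> 31);
    out.zf = (out.result == 0);
    return out;
}
```

Calling model_sarl(0xFFFFFFFFu, 1), which corresponds to sarl $1, %eax with %eax holding -1, gives a result of 0xFFFFFFFF with CF = 1, SF = 1 and ZF = 0 in this model.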

TIA

Eliot

Martin Beck-3

Re: [squeak-dev] x86 sarl curiosity...

Hi Eliot,

Eliot Miranda wrote:
> Hi All,
> anyone know the x86/IA32 really well? If so, read on. Otherwise save
> yourself the yawn.
>
[...]
> (my emphasis added). But neither the Bochs simulator nor my Intel Core Duo
> set the flags when doing sarl $1, %eax when %eax contains -1. Have I
> misread, or is the manual wrong?
>
I cannot confirm this. Using this simple C program:

#include <stdio.h>

int calc(int i) {
    return i >> 1;
}

int main(void) {
    printf("%i\n", calc(-1));
    return 0;
}

my GCC 4.3.2 generates a sarl %eax instruction, as the assembler output
shows. Debugging it with Kdbg shows a change of the flags after the
instruction. In fact, CF and SF are set as (more or less) expected. I
also have an Intel Core 2 Duo.

Regards,
Martin

Nicolas Cellier-3

[squeak-dev] Re: x86 sarl curiosity...

In reply to this post by Eliot Miranda-2

Eliot Miranda <eliot.miranda <at> gmail.com> writes:

> Hi All,
>
> anyone know the x86/IA32 really well? If so, read on. Otherwise save
> yourself the yawn.
>
> I just tried to save an instruction in Cog's generated bitShift: primitive.
> It seems to me that SARL (shift arithmetic right long) should set the sign flag
> based on the result, in fact it says as much in the manual; I quote from IA-32
> Intel® Architecture Software Developer's Manual Volume 2B: Instruction Set
> Reference, N-Z p 4-192
>

Hi Eliot,
I guess you are addressing the SmallInteger case; otherwise I would take
"optimize" to mean using some 64- or 128-bit MMX arithmetic (like PSRLQ).
If relevant, check my trivial optimizations for large ints at
http://bugs.squeak.org/view.php?id=7109

Nicolas

Eliot Miranda-2

Re: [squeak-dev] x86 sarl curiosity...

In reply to this post by Martin Beck-3

Hi Martin, can you send me the assembly? Or show me the opcodes? When I try this it doesn't work. So I must be doing something differently.

And hi! Robert Hirschfeld mentioned you when he and I met last week.

Eliot Miranda-2

Re: [squeak-dev] Re: x86 sarl curiosity...

In reply to this post by Nicolas Cellier-3

On Wed, Jan 21, 2009 at 12:38 PM, Nicolas Cellier <[hidden email]> wrote:

> Hi Eliot,
> I guess you are addressing the SmallInteger case; otherwise I would take
> "optimize" to mean using some 64- or 128-bit MMX arithmetic (like PSRLQ).
> If relevant, check my trivial optimizations for large ints at
> http://bugs.squeak.org/view.php?id=7109
>
> Nicolas

Hi Nicolas,

yes, I'm doing SmallInteger, and I'm also trying to keep the JIT very simple initially, so there will be no MMX registers or instructions in the stage-one JIT until I do floating-point. I'll take a look at these. Thanks.

Martin McClure-2

[squeak-dev] Re: [vwnc] x86 sarl curiosity...

In reply to this post by Eliot Miranda-2

Eliot Miranda wrote:

> Hi All,
>
> anyone know the x86/IA32 really well? If so, read on. Otherwise
> save yourself the yawn.
>
> I just tried to save an instruction in Cog's generated bitShift:
> primitive. It seems to me that SARL (shift arithmetic right long)
> should set the sign flag based on the result, in fact it says as much in
> the manual; I quote from IA-32 Intel® Architecture Software
> Developer's Manual Volume 2B: Instruction Set Reference, N-Z p 4-192
>
> Flags Affected
>
> The CF flag contains the value of the last bit shifted out of the
> destination operand; it is undefined for SHL and SHR instructions where
> the count is greater than or equal to the size (in bits) of the
> destination operand. The OF flag is affected only for 1-bit shifts (see
> "Description" above); otherwise, it is undefined. The SF, ZF, and PF
> flags are set according to the result. If the count is 0, the flags are
> not affected. For a non-zero count, the AF flag is undefined.
>
> (my emphasis added). But neither the Bochs simulator nor my Intel Core
> Duo set the flags when doing sarl $1, %eax when %eax contains -1. Have
> I misread, or is the manual wrong?

Interesting. FWIW, the AMD64 arch manual (vol 3, p.220) also says that
an SAR will affect the SF flag.

Regards,

-Martin

Eliot Miranda-2

[squeak-dev] Re: x86 sarl curiosity...

In reply to this post by Eliot Miranda-2

Apologies; my bad. I'd used the wrong branch. Jump greater (if 0 > v) is not the same as jump negative (if v is negative). I live and learn. Sorry for the noise.
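A small sketch of why the two branches behave differently, written as plain C predicates over the flags. It assumes the flags left by sarl $1 on -1 are SF = 1, ZF = 0 and OF = 0 (Intel documents OF as cleared for a 1-bit SAR); the type and function names are hypothetical. JG is taken when ZF = 0 and SF = OF, which is meaningful after a compare, while JS looks at SF alone.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the flags the two branches consult. */
typedef struct {
    bool sf; /* sign flag */
    bool zf; /* zero flag */
    bool of; /* overflow flag */
} branch_flags;

/* JG (jump if greater, signed): taken when ZF = 0 and SF = OF. */
bool jg_taken(branch_flags f)
{
    return !f.zf && (f.sf == f.of);
}

/* JS (jump if sign): taken when SF = 1. */
bool js_taken(branch_flags f)
{
    return f.sf;
}

/* Flags assumed to follow sarl $1, %eax with %eax == -1. */
branch_flags flags_after_sarl1_of_minus_one(void)
{
    branch_flags f = { true, false, false };
    return f;
}
```

With those assumed flags, js_taken returns true and jg_taken returns false, so a branch that tests only the sign sees the negative result while the signed "greater" condition does not fire.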

David Farber

Re: [squeak-dev] x86 sarl curiosity...

In reply to this post by Eliot Miranda-2

Eliot - I know you've already moved past this problem, but in the future, gcc -S foo.c will create foo.s with the assembly generated by gcc.

David

Eliot Miranda-2

Re: [squeak-dev] x86 sarl curiosity...

On Thu, Jan 22, 2009 at 12:53 PM, David Farber <[hidden email]> wrote:

Eliot - I know you've already moved past this problem, but in the future, gcc -S foo.c will create foo.s with the assembly generated by gcc.

Um, I know :) Trouble is, gcc also optimizes, so it may not always generate the code you expect. For example,

int issignedshift(int v) { return (v >> 1) < 0 ? 1 : 0; }

will, with -O4, generate

   movl 4(%esp), %eax
   sarl $31,%eax
   ret

because it works out that this is the quickest way to generate a 1 if v is negative, and it doesn't generate a compare at all.
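The strength reduction described above can be sketched in C as follows. This is an illustration of the idea, not gcc's actual transformation; sign_bit_logical and sign_mask_arithmetic are hypothetical names. Right-shifting a negative int is implementation-defined in C, and the sketch assumes the arithmetic shift that gcc uses on x86. Note that an arithmetic shift by 31 yields 0 or -1, whereas a logical shift by 31 (shrl) yields 0 or 1, so the exact instruction a given gcc build emits may differ from the listing above.

```c
#include <stdint.h>

/* The function from the message, with explicit types. */
int issignedshift(int v)
{
    return (v >> 1) < 0 ? 1 : 0;
}

/* Branch-free equivalent: move the sign bit to bit 0 with a logical
   shift, giving exactly 0 or 1. */
int sign_bit_logical(int32_t v)
{
    return (int)((uint32_t)v >> 31);
}

/* What an arithmetic shift by 31 would produce instead: 0 or -1.
   Written with a comparison so the sketch stays portable. */
int sign_mask_arithmetic(int32_t v)
{
    return v < 0 ? -1 : 0;
}
```

For any v, issignedshift(v) and sign_bit_logical(v) agree under the stated assumption, and negating sign_mask_arithmetic(v) gives the same value.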

BTW, I've been abusing gcc's -S output for a long time. Back in the 80's I used to generate direct-threaded-code VMs using gcc where I would edit the -S output with sed to produce the opcodes for the threaded code machine stripped of the prolog and epilog gcc would produce. I've also produced JIT-compiled BitBlt by similar means with a number of different compilers. -S has been my friend for many years.

Cheers!

Eliot

Igor Stasenko

Re: [squeak-dev] x86 sarl curiosity...

2009/1/22 Eliot Miranda <[hidden email]>:

>
>
> On Thu, Jan 22, 2009 at 12:53 PM, David Farber <[hidden email]> wrote:
>>
>> Eliot - I know you've already moved past this problem, but in the future,
>> gcc -S foo.c will create foo.s with the assembly generated by gcc.
>
> Um, I know :) Trouble is gcc also optimizes so it may not always generate
> the code you expect. For example,
> issignedshift(v) { return (v >> 1) < 0 ? 1 : 0; }
> will, with -O4, generate
> movl 4(%esp), %eax
> sarl $31,%eax
> ret
> because it works out this is the quickest way to generate a 1 if v is
> negative and doesn't generate a compare at all.
> BTW, I've been abusing gcc's -S output for a long time. Back in the 80's I
> used to generate direct-threaded-code VMs using gcc where I would edit the
> -S output with sed to produce the opcodes for the threaded code machine
> stripped of the prolog and epilog gcc would produce. I've also produced
> JIT-compiled BitBlt by similar means with a number of different compilers.
> -S has been my friend for many years.

I'm dreaming of having

listing := Object compile: 'yourself ^self' options: '-S'

:)

--
Best regards,
Igor Stasenko AKA sig.

David Farber

Re: [squeak-dev] x86 sarl curiosity...

In reply to this post by Eliot Miranda-2

On Jan 22, 2009, at 1:59 PM, Eliot Miranda wrote:

> On Thu, Jan 22, 2009 at 12:53 PM, David Farber <[hidden email]> wrote:
>
>> Eliot - I know you've already moved past this problem, but in the future,
>> gcc -S foo.c will create foo.s with the assembly generated by gcc.
>
> Um, I know :) Trouble is gcc also optimizes so it may not always generate
> the code you expect. For example,
>
> issignedshift(v) { return (v >> 1) < 0 ? 1 : 0; }
>
> will, with -O4, generate
>
>    movl 4(%esp), %eax
>    sarl $31,%eax
>    ret
>
> because it works out this is the quickest way to generate a 1 if v is
> negative and doesn't generate a compare at all.
>
> BTW, I've been abusing gcc's -S output for a long time. Back in the 80's I
> used to generate direct-threaded-code VMs using gcc where I would edit the
> -S output with sed to produce the opcodes for the threaded code machine
> stripped of the prolog and epilog gcc would produce. I've also produced
> JIT-compiled BitBlt by similar means with a number of different compilers.
> -S has been my friend for many years.
>
> Cheers!
> Eliot

Ok, ok. It's just that when I looked at the assembler output for Martin's example, it looked like it covered the case you were fighting. (I didn't step through it with a debugger.)

   .text
.globl _calc
_calc:
   pushl %ebp
   movl %esp, %ebp
   subl $8, %esp
   movl 8(%ebp), %eax
   sarl %eax
   leave
   ret

Then you said "can you send me the assembly? Or show me the opcodes?" instead of something like "What gcc version/flags are you using?"

A thousand apologies for having impugned your knowledge of gcc.

I will now go away before you taunt me a second time.

David

johnmci

Re: [squeak-dev] x86 sarl curiosity...

In reply to this post by Eliot Miranda-2

On 22-Jan-09, at 12:59 PM, Eliot Miranda wrote:

> BTW, I've been abusing gcc's -S output for a long time.

-fverbose-asm

is also helpful...
well, unless you'd rather map registers to local variables in your head
because you know it should work that way.

--
===========================================================================
John M. McIntosh <[hidden email]>
Corporate Smalltalk Consulting Ltd. http://www.smalltalkconsulting.com
===========================================================================