GCC 4.1

Philippe Marschall
Hi

I'm seeing significant improvements when compiling a VM with GCC 4.1
over the stock Unix VMs. They are almost too good to be true.

stock 3.9-9 vm, GCC 4.0.3:
56437389 bytecodes/sec; 2609047 sends/sec

compiled with GCC 4.1.1:
113676731 bytecodes/sec; 4011617 sends/sec

That is about twice as many bytecodes/sec and about 50% more
sends/sec. Can that be?
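
As a quick check of the arithmetic, here is a small C sketch that computes both ratios from the figures above. The function names are illustrative only and are not part of any VM or benchmark code.

```c
#include <assert.h>
#include <stdio.h>

/* Ratio of a benchmark rate after a change to the rate before it. */
double speedup(double before, double after)
{
    assert(before > 0.0);
    return after / before;
}

/* Print the ratios for the two sets of figures quoted in this post. */
void print_speedups(void)
{
    /* stock 3.9-9 VM (GCC 4.0.3) vs. VM compiled with GCC 4.1.1 */
    printf("bytecodes/sec: %.2fx\n", speedup(56437389.0, 113676731.0));
    printf("sends/sec:     %.2fx\n", speedup(2609047.0, 4011617.0));
}
```

The first ratio comes out just over 2 and the second a little over 1.5, which matches "about twice" and "about 50% more".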

Cheers
Philippe

Re: GCC 4.1

Michael Rueger-6
Philippe Marschall wrote:
> Hi
>
> I'm seeing significant improvements when compiling a VM with GCC 4.1
> over the stock Unix VMs. They are almost too good to be true.

Just as a baseline comparison, could you post the numbers for
pre-compiled (download from squeak.org) 3.7 or 3.8 VM on your system?
And, what kind of system are you running on?

Michael

Re: Re: GCC 4.1

Philippe Marschall
3.8a-1:
125244618 bytecodes/sec; 4317757 sends/sec
3.7-7:
124392614 bytecodes/sec; 4339259 sends/sec

They are somewhat below GCC 4.1 numbers so it seems that just the
3.9-9 VM happens to be slow.

CPU is a 1.2 GHz Tualatin

Philippe

2007/1/15, Michael Rueger <[hidden email]>:

>
> Philippe Marschall wrote:
> > Hi
> >
> > I'm seeing significant improvements when compiling a VM with GCC 4.1
> > over the stock Unix VMs. They are almost too good to be true.
>
> Just as a baseline comparison, could you post the numbers for
> pre-compiled (download from squeak.org) 3.7 or 3.8 VM on your system?
> And, what kind of system are you running on?
>
> Michael
>

Re: Re: GCC 4.1

Göran Krampe
Hi!

"Philippe Marschall" <[hidden email]> wrote:
> 3.8a-1:
> 125244618 bytecodes/sec; 4317757 sends/sec
> 3.7-7:
> 124392614 bytecodes/sec; 4339259 sends/sec
>
> They are somewhat below GCC 4.1 numbers so it seems that just the

You mean the other way around right? They are somewhat higher than the
first number you posted.

> 3.9-9 VM happens to be slow.

My guess is that the combo you compiled (3.9-9, 4.0.3) that turned slow
somehow failed to get gnuified. Just a wild guess.

regards, Göran

Re: Re: GCC 4.1

Philippe Marschall
2007/1/15, [hidden email] <[hidden email]>:

>
> Hi!
>
> "Philippe Marschall" <[hidden email]> wrote:
> > 3.8a-1:
> > 125244618 bytecodes/sec; 4317757 sends/sec
> > 3.7-7:
> > 124392614 bytecodes/sec; 4339259 sends/sec
> >
> > They are somewhat below GCC 4.1 numbers so it seems that just the
>
> You mean the other way around right? They are somewhat higher than the
> first number you posted.

Yeah.

> > 3.9-9 VM happens to be slow.
>
> My guess is that the combo you compiled (3.9-9, 4.0.3) that turned slow
> somehow failed to get gnuified. Just a wild guess.

Wasn't me. It's the stock from squeakvm.org

Philippe

Re: Re: GCC 4.1

Göran Krampe
Hi Ian and all!

"Philippe Marschall" <[hidden email]> wrote:
> 2007/1/15, [hidden email] <[hidden email]>:
> > My guess is that the combo you compiled (3.9-9, 4.0.3) that turned slow
> > somehow failed to get gnuified. Just a wild guess.
>
> Wasn't me. It's the stock from squeakvm.org

Ah... Ian, could you perhaps check the performance of the binary 3.9-9
VM on squeakvm.org? As per this posting:

        http://lists.squeakfoundation.org/pipermail/vm-dev/2007-January/000977.html

I presume this is the intel binary we are talking about. Did something
go wrong with gnuify or so?

I don't have a box handy to look into this right now.

regards, Göran

Re: Re: GCC 4.1

johnmci
I think that, because of my work fiddling with GCC parameters and
whether or not to use registers for MacIntel, Ian incorporated those
changes into the Unix VM build last December, which affects any
comparison of current and previous VMs, although I have not checked
whether that applies outside the Darwin build logic.

For bytecodes/sec and sends/sec the difference is quite large. However,
which GCC version you use, plus the options, plus the hardware, can
make quite a difference. The clue to the better rates is whether the
assembler for the first couple of bytecodes in the interpret() loop
looks like the listing below. Poorer optimization can produce 12 or
more instructions, versus the more optimal 9.

.globl _interpret
_interpret:

after the jump tables
.long   LXXXXX

you should see something like:

L10161:
        addl $1, %esi
        movzbl (%esi), %ebx
        addl $4, %edi
        movl _foo, %eax
        movl 84(%eax), %eax
        movl 4(%eax), %eax
        movl %eax, (%edi)
        movl 512(%esp,%ebx,4), %eax
L10421:
        jmp *%eax
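
For readers wondering what kind of C produces a sequence ending in `jmp *%eax`: each bytecode handler fetches the next bytecode itself and jumps through a table of handler addresses, instead of returning to a central switch. As far as I understand it, this is the dispatch style the gnuify step is meant to enable. Below is a minimal sketch using GCC's labels-as-values extension. The opcodes, the function name, and the stack size are hypothetical and chosen for illustration; this is not the VM's actual interpret() source.

```c
#include <assert.h>

/* Hypothetical three-instruction bytecode set, for illustration only. */
enum { OP_PUSH1, OP_ADD, OP_HALT };

/* Run bytecodes with threaded dispatch; returns the top of stack at OP_HALT. */
int run_threaded(const unsigned char *code)
{
    /* GCC extension: &&label yields the address of a label. */
    static void *const dispatch[] = { &&op_push1, &&op_add, &&op_halt };
    int stack[64];
    int *sp = stack;                 /* points at the next free slot */
    const unsigned char *ip = code;  /* points at the next bytecode */

/* Fetch the next bytecode and jump straight to its handler. */
#define NEXT() do { assert(*ip <= OP_HALT); goto *dispatch[*ip++]; } while (0)

    NEXT();

op_push1:
    assert(sp < stack + 64);
    *sp++ = 1;
    NEXT();

op_add:
    assert(sp - stack >= 2);
    sp[-2] += sp[-1];
    sp--;
    NEXT();

op_halt:
    return sp > stack ? sp[-1] : 0;

#undef NEXT
}
```

Called on the sequence OP_PUSH1, OP_PUSH1, OP_ADD, OP_HALT, this returns 2. How many instructions each handler compiles to still depends on the GCC version, options, and register allocation, which is the point of comparing the generated assembler.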



On Jan 15, 2007, at 2:37 AM, [hidden email] wrote:

>
> Hi Ian and all!
>
> "Philippe Marschall" <[hidden email]> wrote:
>> 2007/1/15, [hidden email] <[hidden email]>:
>>> My guess is that the combo you compiled (3.9-9, 4.0.3) that  
>>> turned slow
>>> somehow failed to get gnuified. Just a wild guess.
>>
>> Wasn't me. It's the stock from squeakvm.org
>
> Ah... Ian, could you perhaps check the performance of the binary 3.9-9
> VM on squeakvm.org? As per this posting:
>
> http://lists.squeakfoundation.org/pipermail/vm-dev/2007-January/000977.html
>
> I presume this is the intel binary we are talking about. Did something
> go wrong with gnuify or so?
>
> I don't have a box handy to look into this right now.
>
> regards, Göran

--
===========================================================================
John M. McIntosh <[hidden email]>
Corporate Smalltalk Consulting Ltd.  http://www.smalltalkconsulting.com
===========================================================================