StackVM with latest sources tinyBenchmarks


StackVM with latest sources tinyBenchmarks

EstebanLM

Hi,

I just compiled a Cog VM and a Stack VM with the latest sources.
While the Cog VM works well, the Stack VM has a serious performance problem: tinyBenchmarks gives me 5m bytecodes/s, while it should be around 500m... I also see the CPU load double.

Have you tried to compile a Stack VM lately? Any idea where to start looking for bugs?

thanks,
Esteban

Re: StackVM with latest sources tinyBenchmarks

Igor Stasenko
 
On 15 February 2013 14:49, Esteban Lorenzano <[hidden email]> wrote:
>
> Hi,
>
> I just compiled a cog vm and a stack vm with latest sources.
> While the cog works well, the stack vm is having a serious performance drawback, tinyBenchmarks is giving me 5m bytecodes/s, while it should be around 500m... also I see the cpu charge increases to double.
>
> Do you tried to compile a stack vm lately? any idea where to start look for bugs?
>

Just tried on my machine... the results are discouraging:

1 tinyBenchmarks '4648460 bytecodes/sec; 337199 sends/sec'

> thanks,
> Esteban



--
Best regards,
Igor Stasenko.

Re: StackVM with latest sources tinyBenchmarks

Nicolas Cellier
 
This is not confirmed in the regular svn cog branch:

1 tinyBenchmarks
 '380 669 144 bytecodes/sec; 10 473 620 sends/sec' Interpreter VM
 '371 014 492 bytecodes/sec; 18 512 525 sends/sec' Stack VM
 '656 410 256 bytecodes/sec; 67 802 547  sends/sec' Cog VM

Nicolas


Re: StackVM with latest sources tinyBenchmarks

Eliot Miranda-2
 
Hi All,

    kudos to Nicolas for posting some useful numbers in that they
provide some context, in this case the other VMs running on the same
machine.  But wrist slaps to all of you for not specifying:

1. which OS
2. what hardware
3. what C compiler was used to compile the VM

Further kudos for indicating what kind of load the machine is under
(one has to run benchmarks on a relatively unstressed machine, even if
multicore), and, *really usefully*, what a previous version's
benchmark score is on the same machine.

Nicolas' results, Cog ~= 6.5x Interpreter, Stack ~= 1.75x Interpreter
are exactly what one should expect for nfib (the sends/sec part of
tinyBenchmarks) with the current Cog architecture.
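
As a quick check of the ratios quoted above, here is a minimal sketch that divides the sends/sec figures Nicolas posted. The helper name `ratio` is made up for illustration and is not part of any VM tooling.

```sh
# ratio A B: print A/B to two decimals (hypothetical helper)
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

# sends/sec figures from Nicolas' post (Interpreter, Stack, Cog)
interp=10473620
stack=18512525
cog=67802547

echo "Cog / Interpreter:   $(ratio "$cog" "$interp")"
echo "Stack / Interpreter: $(ratio "$stack" "$interp")"
```

This gives about 6.47 and 1.77, in line with the ~6.5x and ~1.75x mentioned.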




--
best,
Eliot

Re: StackVM with latest sources tinyBenchmarks

Nicolas Cellier
 
2013/2/19 Eliot Miranda <[hidden email]>:

>
> Hi All,
>
>     kudos to Nicolas for posting some useful numbers in that they
> provide some context, in this case the other VMs running on the same
> machine.  But wrist slaps to all of you for not specifying:
>
> 1. which OS
> 2. what hardware
> 3. what C compiler was used to compile the VM
>

Sure, the above numbers were not really meaningful, apart from the ratio...
OS: Mac OS X (reported as 'Mac OS 1068 intel', i.e. 10.6.8)
Hardware: Mac mini, 2.26 GHz Intel Core 2 Duo
Compiler: gcc 4.2.1 (Apple Inc. build 5666) (dot 3)

As for the load, I don't know how to provide a synthetic measurement,
but it's low...
The most annoying piece is Time Machine and its disk access; I
sometimes forget to suspend it, but it was off during the
tinyBenchmarks run.

Nicolas


Re: StackVM with latest sources tinyBenchmarks

Eliot Miranda-2
 
On Tue, Feb 19, 2013 at 2:02 PM, Nicolas Cellier
<[hidden email]> wrote:

>
> 2013/2/19 Eliot Miranda <[hidden email]>:
>>
>> Hi All,
>>
>>     kudos to Nicolas for posting some useful numbers in that they
>> provide some context, in this case the other VMs running on the same
>> machine.  But wrist slaps to all of you for not specifying:
>>
>> 1. which OS
>> 2. what hardware
>> 3. what C compiler was used to compile the VM
>>
>
> Sure, above numbers were not really meaningful, apart the ratio...
> Mac OS X (Mac OS 1068 intel)
> MacMini 2.26 GHz Intel Core 2 Duo
> Compiler: 4.2.1 (Apple Inc. build 5666) (dot 3)
>
> As for the load, I don't know how to provide a synthetic measurement,
> but it's low...

uptime is fine on Mac & Linux.  Don't know about Windows.
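
A minimal sketch of how the load figure could be pulled out of `uptime` so it can be pasted next to a benchmark result. The helper name `load1` is hypothetical; the two patterns it handles are the "load average:" (Linux) and "load averages:" (macOS) forms.

```sh
# load1: read `uptime` output on stdin and print the 1-minute load average.
# Handles "load average: 0.10, 0.20, 0.30" (Linux) and
# "load averages: 0.10 0.20 0.30" (macOS).
load1() {
    awk -F'load averages?: *' 'NF > 1 { split($2, a, /[ ,]+/); print a[1] }'
}

uptime | load1
```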

> The most annoying piece is Time machine and its disk access, I
> sometimes forget to suspend it, but it was off during the
> tinyBenchmark.

One simple approach is to run the benchmark three times and to discard
the best and the worst results.
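
Discarding the best and the worst of three runs amounts to keeping the median. A small sketch, assuming the three figures have been copied by hand from three `1 tinyBenchmarks` runs (the helper name `median3` is made up):

```sh
# median3 A B C: print the middle of three numeric results,
# i.e. what is left after discarding the best and the worst
median3() {
    printf '%s\n' "$1" "$2" "$3" | sort -n | sed -n '2p'
}

# usage: median3 <run1> <run2> <run3>
```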




--
best,
Eliot

Re: StackVM with latest sources tinyBenchmarks

Camillo Bruni-3

>> The most annoying piece is Time machine and its disk access, I
>> sometimes forget to suspend it, but it was off during the
>> tinyBenchmark.
>
> One simple approach is to run the benchmark three times and to discard
> the best and the worst results.

That is as good as taking the first one... if you want decent results,
measure >30 times and do the only scientifically correct thing: avg + std deviation.

Too much work? use http://www.squeaksource.com/SMark.html
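
For those who only want the two numbers, the avg + std deviation suggested above can be sketched with awk. This is an illustration, not what SMark does; the helper name `mean_sd` is hypothetical, and it uses the sample standard deviation (n-1) computed with Welford's running update.

```sh
# mean_sd: read one measurement per line on stdin and print
# "<mean> <sample standard deviation>"
mean_sd() {
    awk '{
             n++
             d = $1 - mean
             mean += d / n
             m2 += d * ($1 - mean)     # running sum of squared deviations
         }
         END {
             if (n < 2) { print "need at least 2 samples"; exit 1 }
             printf "%.1f %.1f\n", mean, sqrt(m2 / (n - 1))
         }'
}

# usage (runs.txt is a hypothetical file with one bytecodes/sec value per line):
#   mean_sd < runs.txt
```

If the deviation stays large as more runs are added, that points at a systematic disturbance rather than noise.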

Re: StackVM with latest sources tinyBenchmarks

Eliot Miranda-2
 
On Tue, Feb 19, 2013 at 2:16 PM, Camillo Bruni <[hidden email]> wrote:

>
>>> The most annoying piece is Time machine and its disk access, I
>>> sometimes forget to suspend it, but it was off during the
>>> tinyBenchmark.
>>
>> One simple approach is to run the benchmark three times and to discard
>> the best and the worst results.
>
> that is as good as taking the first one... if you want decent results
> measure >30 times and do the only scientific correct thing: avg + std deviation?

If the benchmark takes very little time to run and you're trying to
avoid background effects then your approach won't necessarily work
either.

>
> Too much work? use http://www.squeaksource.com/SMark.html



--
best,
Eliot

Re: StackVM with latest sources tinyBenchmarks

Camillo Bruni-3


On 2013-02-20, at 01:25, Eliot Miranda <[hidden email]> wrote:

>
> On Tue, Feb 19, 2013 at 2:16 PM, Camillo Bruni <[hidden email]> wrote:
>>
>>>> The most annoying piece is Time machine and its disk access, I
>>>> sometimes forget to suspend it, but it was off during the
>>>> tinyBenchmark.
>>>
>>> One simple approach is to run the benchmark three times and to discard
>>> the best and the worst results.
>>
>> that is as good as taking the first one... if you want decent results
>> measure >30 times and do the only scientific correct thing: avg + std deviation?
>
> If the benchmark takes very little time to run and you're trying to
> avoid background effects then your approach won't necessarily work
> either.

true, but the deviation will most probably give you exactly that feedback.
if you increase the runs but the quality of the result doesn't improve
you know that you're dealing with some systematic error source.

This approach is simply more scientific and less home-brewed.

Re: StackVM with latest sources tinyBenchmarks

EstebanLM
In reply to this post by Eliot Miranda-2
 
ok, I forgot that

On Feb 19, 2013, at 7:50 PM, Eliot Miranda <[hidden email]> wrote:

> Hi All,
>
>    kudos to Nicolas for posting some useful numbers in that they
> provide some context, in this case the other VMs running on the same
> machine.  But wrist slaps to all of you for not specifying:
>
> 1. which OS

osx 10.8

> 2. what hardware

i7 8gb

> 3. what C compiler was used to compile the VM

4.6.3

> Further kudos for indicating what kind of load the machine is under
> (one has to run benchmarks on a relatively unstressed machine, even if
> multicore), and, *really usefully*, what a previous version's
> benchmark score is on the same machine.

The load doesn't matter.
It is a comparative analysis: stackvm pre-merge with latest: 500 msends, after merge: 5 msends... since I did not swap my mac for an ipad, the sends cannot be right. Also... 10% passive cpu usage is wrong, no matter the machine load.

> Nicolas' results, Cog ~= 6.5x Interpreter, Stack ~= 1.75x Interpreter
> are exactly what one should expect for nfib (the sends/sec part of
> tinyBenchmarks) with the current Cog architecture.

yep... so where is the stack vm made with latest sources? (it is not in http://www.mirandabanda.org/files/Cog/VM/VM.r2678/)



Re: StackVM with latest sources tinyBenchmarks

Eliot Miranda-2
In reply to this post by Camillo Bruni-3
 
On Tue, Feb 19, 2013 at 11:10 PM, Camillo Bruni <[hidden email]> wrote:

>
>
> On 2013-02-20, at 01:25, Eliot Miranda <[hidden email]> wrote:
>
>>
>> On Tue, Feb 19, 2013 at 2:16 PM, Camillo Bruni <[hidden email]> wrote:
>>>
>>>>> The most annoying piece is Time machine and its disk access, I
>>>>> sometimes forget to suspend it, but it was off during the
>>>>> tinyBenchmark.
>>>>
>>>> One simple approach is to run the benchmark three times and to discard
>>>> the best and the worst results.
>>>
>>> that is as good as taking the first one... if you want decent results
>>> measure >30 times and do the only scientific correct thing: avg + std deviation?
>>
>> If the benchmark takes very little time to run and you're trying to
>> avoid background effects then your approach won't necessarily work
>> either.
>
> true, but the deviation will most probably give you exactly that feedback.
> if you increase the runs but the quality of the result doesn't improve
> you know that you're dealing with some systematic error source.
>
> This approach is simply more scientific and less home-brewed.

Of course, no argument here.  But what's being discussed is using
tinyBenchmarks as a quick smoke test.  A proper CI system can be set
up for reliable results, but IMO for a quick smoke test doing
three runs manually is fine.  IME, what tends to happen is that the
first run is slow (caches heating up etc) and the next two runs are
extremely close.
--
best,
Eliot

Re: StackVM with latest sources tinyBenchmarks

Igor Stasenko
 
On 20 February 2013 18:29, Eliot Miranda <[hidden email]> wrote:

>
> On Tue, Feb 19, 2013 at 11:10 PM, Camillo Bruni <[hidden email]> wrote:
>>
>>
>> On 2013-02-20, at 01:25, Eliot Miranda <[hidden email]> wrote:
>>
>>>
>>> On Tue, Feb 19, 2013 at 2:16 PM, Camillo Bruni <[hidden email]> wrote:
>>>>
>>>>>> The most annoying piece is Time machine and its disk access, I
>>>>>> sometimes forget to suspend it, but it was off during the
>>>>>> tinyBenchmark.
>>>>>
>>>>> One simple approach is to run the benchmark three times and to discard
>>>>> the best and the worst results.
>>>>
>>>> that is as good as taking the first one... if you want decent results
>>>> measure >30 times and do the only scientific correct thing: avg + std deviation?
>>>
>>> If the benchmark takes very little time to run and you're trying to
>>> avoid background effects then your approach won't necessarily work
>>> either.
>>
>> true, but the deviation will most probably give you exactly that feedback.
>> if you increase the runs but the quality of the result doesn't improve
>> you know that you're dealing with some systematic error source.
>>
>> This approach is simply more scientific and less home-brewed.
>
> Of course, no argument here.  But what's being discussed is using
> tinyBenchmarks as a quick smoke test.  A proper CI system can be set
> it up for reliable results, but for IMO for a quick smoke test doing
> three runs manually is fine.  IME, what tends to happen is that the
> first run is slow (caches heating up etc) and the second two runs are
> extremely close.

But not when you have a speed degradation of an order (or orders) of magnitude.
That is too significant to be considered measurement error or deviation.
There must be something wrong with the VM (cache always failing?).

> --
> best,
> Eliot



--
Best regards,
Igor Stasenko.

Re: StackVM with latest sources tinyBenchmarks

Guillermo Polito
 
Ok, following up on this. What I can add to the discussion:

On Linux, the latest VMs yield the following results (I added a space every three digits just to improve readability)

"Pharo Cog"
1 tinyBenchmarks 
'887 348 353 bytecodes/sec; 141 150 557 sends/sec'

"Pharo Stack"
1 tinyBenchmarks  
'445 217 391 bytecodes/sec; 24 395 999 sends/sec'

While on Mac

"Pharo Cog"
1 tinyBenchmarks 
'895 104 895 bytecodes/sec; 138 102 772 sends/sec'

"Pharo Stack"
1 tinyBenchmarks
'3 319 502 bytecodes/sec; 217 939 sends/sec'


So, I'd say it's a problem in the CMake configuration or just in compilation on Mac :). Though I didn't test on Windows.

Another thing I noticed is that when compiling my VM on Mac, since I updated Xcode, I was no longer using GNU gcc but the LLVM one. I tried to go back to GNU gcc but couldn't make it work so far, heh.
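
To put a number on the gap between the two Stack VM results above, here is a small sketch that strips the digit-group spaces and divides. The helper names `ungroup` and `slowdown` are made up for illustration.

```sh
# ungroup: remove the spaces used above as thousands separators
ungroup() {
    printf '%s\n' "$1" | tr -d ' '
}

# slowdown FAST SLOW: print how many times slower SLOW is than FAST
slowdown() {
    awk -v fast="$(ungroup "$1")" -v slow="$(ungroup "$2")" \
        'BEGIN { printf "%.0f\n", fast / slow }'
}

# Stack VM bytecodes/sec, Linux vs Mac, figures from the results above
slowdown '445 217 391' '3 319 502'
```

That is a factor of roughly 134 on bytecodes/sec, i.e. about two orders of magnitude.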




Re: StackVM with latest sources tinyBenchmarks

Guillermo Polito
 
So, after digging a bit I've got some results and conclusions:

- The CMake configurations were using gcc to compile, which on Mac is llvm-gcc
- Xcode uses the clang compiler, not gcc (and some different compiler flags too)
- I've experimented with changing the configuration to use the clang compiler

setGlobalOptions: maker
   "Override the default compiler (gcc) so that CMake builds with clang."

   super setGlobalOptions: maker.
   maker set: 'CMAKE_C_COMPILER' to: 'clang'.
   maker set: 'CMAKE_CXX_COMPILER' to: 'clang'.


And to make it work as with gcc I also added the following (in some plugins such as mp3plugin there are functions declared with a return type whose return statements specify no value).

compilerFlagsRelease
   "Disable clang's return-type diagnostic, which otherwise rejects some plugin sources."

   ^super compilerFlagsRelease, #( '-Wno-return-type' )


And it compiled with the following results in the tinyBenchmarks:

'510723192 bytecodes/sec; -142407 sends/sec'

Which is, in the bytecode part, pretty close to what we expect, and in the sends part, looks buggy :).
But the overall performance when using the image is far better.


Cheers,
Guille




The Mac VM (pharo)

Göran Krampe
 
Hey!

(nice to meet at ESUG btw)

On 08/15/2013 08:06 PM, Guillermo Polito wrote:
> So, after digging a bit I've got some results and conclusions:
>
> - The CMake configurations were using gcc to compile, which in mac is
> llvm-gcc
> - Xcode uses clang compiler, not gcc (and some different compiling flags
> also)
> - I've played changing the configuration to use clang compiler

Mmmmm... this is all slightly confusing, so many combos of compilers
here (I am a Mac n00b), this is what I have (or more?):

gcc --version
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr
--with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 5.0 (clang-500.2.76) (based on LLVM 3.3svn)
Target: x86_64-apple-darwin12.5.0
Thread model: posix

gcc-4.2 --version
i686-apple-darwin11-gcc-4.2.1 (GCC) 4.2.1 (Apple Inc. build 5666) (dot 3)

llvm-gcc --version
i686-apple-darwin11-llvm-gcc-4.2 (GCC) 4.2.1 (Based on Apple Inc. build
5658) (LLVM build 2336.11.00)

llvm-gcc-4.2 --version
i686-apple-darwin11-llvm-gcc-4.2 (GCC) 4.2.1 (Based on Apple Inc. build
5658) (LLVM build 2336.11.00)

clang --version
Apple LLVM version 5.0 (clang-500.2.75) (based on LLVM 3.3svn)
Target: x86_64-apple-darwin12.5.0
Thread model: posix

(I removed some copyright notices from the above)

> setGlobalOptions: maker
>
>     super setGlobalOptions: maker.
>     maker set: 'CMAKE_C_COMPILER' to: 'clang'.
>     maker set: 'CMAKE_CXX_COMPILER' to: 'clang'.

AFAICT "clang" is the same as "gcc", no? See my printouts above. The
only difference seems to be the added prefix/include-dir config.

> And to make it work as in gcc I added the following also (in some
> plugins such as mp3plugin there are functions with return type and
> return statements with no values specified).
>
> compilerFlagsRelease
>
>     ^super compilerFlagsRelease, #( '-Wno-return-type' )

Aaaah!! Perfect. I just went through this yesterday and also failed at
the mpeg3plugin.

> And it compiled with the following results in the tinyBenchmarks:
>
> '510723192 bytecodes/sec; -142407 sends/sec'
 >
> Which is, in the bytecode part, pretty much close to what we expect, and
> in the sends, looks buggy :).
> But the overall performance using the image is far better

Ok, I will try to get the build to use gcc-4.2 (the non LLVM gcc) and
compare it to the clang (=gcc) VM.

regards, Göran

PS. I am on Mountain Lion and have Xcode 5 installed + CLI tools + brew
apple-gcc4.2.

Re: The Mac VM (pharo)

Eliot Miranda-2
 
Hi Göran,


On Thu, Sep 26, 2013 at 12:21 AM, Göran Krampe <[hidden email]> wrote:

> AFAICT "clang" is the same as "gcc", no? See my printouts above. The only difference seems to be the added prefix/include-dir config.

No; very different.  I'm not an expert but I think essentially clang is Apple's C compiler that uses the LLVM backend, and gcc is good old gcc.  Right now Cog doesn't run if compiled with clang.  Only gcc will do.  No time to debug this right now, and annoyingly clang compiles all static functions with a non-standard calling convention, which means one can't call these functions in gdb; hence lots of debugging functions aren't available without either a) turning off the optimization or b) changing the VM source so they're not static.  I prefer a). If anyone knows of a flag to do this *please* let me know asap.

 




--
best,
Eliot

Re: The Mac VM (pharo)

Tobias Pape
 
Hi Eliot

On 26.09.2013, at 09:26, Eliot Miranda <[hidden email]> wrote:

> Hi Göran,
>
>
> On Thu, Sep 26, 2013 at 12:21 AM, Göran Krampe <[hidden email]> wrote:
>
>>
>> Hey!
>>
>> (nice to meet at ESUG btw)
>>
[…]
>>
>> gcc --version
>> Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr
>> --with-gxx-include-dir=/usr/include/c++/4.2.1
>> Apple LLVM version 5.0 (clang-500.2.76) (based on LLVM 3.3svn)
>> Target: x86_64-apple-darwin12.5.0
>> Thread model: posix
>>
[…]

>> clang --version
>> Apple LLVM version 5.0 (clang-500.2.75) (based on LLVM 3.3svn)
>> Target: x86_64-apple-darwin12.5.0
>> Thread model: posix
>>
>> (I removed some copyright notices from the above)
>>
>> setGlobalOptions: maker
>>>
>>>   super setGlobalOptions: maker.
>>>   maker set: 'CMAKE_C_COMPILER' to: 'clang'.
>>>   maker set: 'CMAKE_CXX_COMPILER' to: 'clang'.
>>>
>>
>> AFAICT "clang" is the same as "gcc", no? See my printouts above. The only
>> difference seem to be the added prefix/include-dir config.
>>
>
> No; very different.  I'm not an expert but I think essentially clang is
> Apple's C compiler that uses the LLVM backend, and gcc is good old gcc.
Not on a default Mac Xcode installation.
See my OSX 10.8 + Xcode 4.x installation:

$ ls -al $(which gcc)
lrwxr-xr-x  1 root  wheel  12 24 Apr 16:42 /usr/bin/gcc -> llvm-gcc-4.2

But with Xcode 5, no gcc (be it a pure gcc-4.2 or a llvm backed gcc)
ships. gcc is linked to clang.

It is a problem of default naming. From Xcode 5 on, if you don't change
a thing, "gcc" will get you "clang".


> Right now Cog doesn't run if compiled with clang.  Only gcc will do.  No
> time to debug this right now, and annoyingly clang compiles all static
> functions with a non-standard calling convention which means one can't call
> these functions in gdb, hence lots of debugging functions aren't available
> without either a) turning off the optimization or b) changing the VM source
> so they're not static.

you might want to try lldb, that ships with Xcode and is based on the
llvm/clang tool chain. I am not implying it is better than gcc but
maybe it can help in your situation?

> I prefer a). If anyone knows of a flag to do this
> *please* let me know asap.


Best
        -Tobias




Re: The Mac VM (pharo)

Eliot Miranda-2
 
Hi Tobias,


On Thu, Sep 26, 2013 at 12:56 AM, Tobias Pape <[hidden email]> wrote:
 
> Hi Eliot
>
> On 26.09.2013 at 09:26, Eliot Miranda <[hidden email]> wrote:
>
>> Hi Göran,
>>
>>
>> On Thu, Sep 26, 2013 at 12:21 AM, Göran Krampe <[hidden email]> wrote:
>>
>>>
>>> Hey!
>>>
>>> (nice to meet at ESUG btw)
>>>
> […]
>>>
>>> gcc --version
>>> Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr
>>> --with-gxx-include-dir=/usr/include/c++/4.2.1
>>> Apple LLVM version 5.0 (clang-500.2.76) (based on LLVM 3.3svn)
>>> Target: x86_64-apple-darwin12.5.0
>>> Thread model: posix
>>>
> […]
>>> clang --version
>>> Apple LLVM version 5.0 (clang-500.2.75) (based on LLVM 3.3svn)
>>> Target: x86_64-apple-darwin12.5.0
>>> Thread model: posix
>>>
>>> (I removed some copyright notices from the above)
>>>
>>> setGlobalOptions: maker
>>>>
>>>>   super setGlobalOptions: maker.
>>>>   maker set: 'CMAKE_C_COMPILER' to: 'clang'.
>>>>   maker set: 'CMAKE_CXX_COMPILER' to: 'clang'.
>>>>
>>>
>>> AFAICT "clang" is the same as "gcc", no? See my printouts above. The only
>>> difference seem to be the added prefix/include-dir config.
>>>
>>
>> No; very different.  I'm not an expert but I think essentially clang is
>> Apple's C compiler that uses the LLVM backend, and gcc is good old gcc.
>
> Not on a default Mac Xcode installation.
> See my OSX 10.8 + Xcode 4.x installation:
>
> $ ls -al $(which gcc)
> lrwxr-xr-x  1 root  wheel  12 24 Apr 16:42 /usr/bin/gcc -> llvm-gcc-4.2
>
> But with Xcode 5, no gcc (be it a pure gcc-4.2 or a llvm backed gcc)
> ships. gcc is linked to clang.
>
> It is a problem of default naming. From Xcode 5 on, if you don't change
> a thing, "gcc" will get you "clang".

I think you miss my point, which is that the clang compiler is very different from gcc (it uses LLVM for its code generator).  That Apple calls clang gcc is neither here nor there.  If you get a real gcc it will compile a functional VM.  If you get a clang-based compiler it won't.  Do you agree?
 
>> Right now Cog doesn't run if compiled with clang.  Only gcc will do.  No
>> time to debug this right now, and annoyingly clang compiles all static
>> functions with a non-standard calling convention which means one can't call
>> these functions in gdb, hence lots of debugging functions aren't available
>> without either a) turning off the optimization or b) changing the VM source
>> so they're not static.
>
> you might want to try lldb, that ships with Xcode and is based on the
> llvm/clang tool chain. I am not implying it is better than gcc but
> maybe it can help in your situation?

Thanks, that sounds promising!
 

>> I prefer a). If anyone knows of a flag to do this
>> *please* let me know asap.
>
>
> Best
>         -Tobias






--
best,
Eliot

Re: The Mac VM (pharo)

Tobias Pape
 
Hi Eliot

On 26.09.2013 at 10:06, Eliot Miranda <[hidden email]> wrote:

> Hi Tobias,
>
>>>> […]
>>>
>>> No; very different.  I'm not an expert but I think essentially clang is
>>> Apple's C compiler that uses the LLVM backend, and gcc is good old gcc.
>>
>> Not on a default Mac Xcode installation.
>> See my OSX 10.8 + Xcode 4.x installation:
>>
>> $ ls -al $(which gcc)
>> lrwxr-xr-x  1 root  wheel  12 24 Apr 16:42 /usr/bin/gcc -> llvm-gcc-4.2
>>
>> But with Xcode 5, no gcc (be it a pure gcc-4.2 or a llvm backed gcc)
>> ships. gcc is linked to clang.
>>
>> It is a problem of default naming. From Xcode 5 on, if you don't change
>> a thing, "gcc" will get you "clang".
>>
>
> I think you miss my point, which is that the clang compiler is very
> different (it uses LLVM for its code generator) than gcc.

I got that point, but I was under the impression that Göran wanted to
make a different one.

> That apple calls
> clang gcc is neither here-nor-there.  If you get a real gcc it will compile
> a functional VM.  If you get a clang-based compiler it won't.  Do you agree?

Yes, I did not want to argue that point :).

But what about the two-headed hydra, llvm-gcc (gcc frontend with llvm code-gen)?
Since Xcode 4, Apple by default does _not_ ship a "normal" gcc but only an
llvm-based one, and with Xcode 5, even that is gone.
  My point was not about code-gen but compiler-availability ;)
However, yours seems more important ATM.

>
>>> Right now Cog doesn't run if compiled with clang.  Only gcc will do.  No
>>> time to debug this right now, and annoyingly clang compiles all static
>>> functions with a non-standard calling convention which means one can't call
>>> these functions in gdb, hence lots of debugging functions aren't available
>>> without either a) turning off the optimization or b) changing the VM source
>>> so they're not static.
>>
>> you might want to try lldb, that ships with Xcode and is based on the
>> llvm/clang tool chain. I am not implying it is better than gcc but
>> maybe it can help in your situation?
>>
>
> Thanks, that sounds promising!

Keep in mind that lldb is to gdb what clang is to gcc, not only on the
technical level but also on the “philosophical” one, as in
“emulate the interface of gcc/gdb, but not quite…”
  Be prepared for surprises, good ones and bad ones.

Best
        -Tobias


Re: The Mac VM (pharo)

Göran Krampe
 
Hey!

My mail filters were bogged up so I missed this discussion, sorry. Let
me clarify some things:

First I wrote that "clang" is essentially the same as "gcc", but what I
*MEANT* by that is that, given the output from those two commands on
Mountain Lion with Xcode 5, they BOTH invoke the same LLVM-based compiler
(the banners both say clang).

On 09/26/2013 10:35 AM, Tobias Pape wrote:
> On 26.09.2013 at 10:06, Eliot Miranda <[hidden email]> wrote:

>>>> No; very different.  I'm not an expert but I think essentially clang is
>>>> Apple's C compiler that uses the LLVM backend, and gcc is good old gcc.
>>>
>>> Not on a default Mac Xcode installation.
>>> See my OSX 10.8 + Xcode 4.x installation:
>>>
>>> $ ls -al $(which gcc)
>>> lrwxr-xr-x  1 root  wheel  12 24 Apr 16:42 /usr/bin/gcc -> llvm-gcc-4.2
>>>
>>> But with Xcode 5, no gcc (be it a pure gcc-4.2 or a llvm backed gcc)
>>> ships. gcc is linked to clang.
>>>
>>> It is a problem of default naming. From Xcode 5 on, if you don't change
>>> a thing, "gcc" will get you "clang".

And Tobias explained exactly what I meant - "gcc" and "clang" resolve to
the SAME compiler under Mountain Lion.
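
A quick way to check what a given command name really runs is to look at its `--version` banner. The sketch below is only an illustration: `classify_banner` and `identify_compiler` are hypothetical helper names, and the banner patterns are assumptions based on the printouts quoted in this thread, so they may need adjusting for other toolchains.

```shell
#!/bin/sh
# Hypothetical helper: guess the compiler family from a --version banner.
# The patterns are assumptions based on the banners quoted in this thread.
classify_banner() {
    case "$1" in
        *clang*)     echo "clang" ;;     # e.g. "Apple LLVM version 5.0 (clang-500.2.76)"
        *llvm-gcc*)  echo "llvm-gcc" ;;  # gcc front end with LLVM code generation
        *GCC*|*gcc*) echo "gcc" ;;       # plain GNU gcc
        *)           echo "unknown" ;;
    esac
}

# Hypothetical helper: report what a command name actually invokes.
identify_compiler() {
    if ! command -v "$1" >/dev/null 2>&1; then
        echo "not installed"
        return 1
    fi
    classify_banner "$("$1" --version 2>&1)"
}

# Check the two names discussed in this thread.
for cc in gcc clang; do
    printf '%s -> %s\n' "$cc" "$(identify_compiler "$cc")"
done
```

If both lines report the same family, the two names resolve to the same kind of compiler on that machine, whatever the command is called.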

>> I think you miss my point, which is that the clang compiler is very
>> different (it uses LLVM for its code generator) than gcc.
>
> I got that point, but I was under the impression, Göran wanted to
> make a different one.

Yes, thanks! :)

>> That apple calls
>> clang gcc is neither here-nor-there.  If you get a real gcc it will compile
>> a functional VM.  If you get a clang-based compiler it won't.  Do you agree?
>
> Yes, I did not want to argue that point :).

No, I don't agree! :) Because current PharoVM *DOES* compile and run
using clang! Which is quite cool btw.

> But what is with the two-headed hydra, llvm-gcc (gcc frontend with llvm code-gen)?
> Since Xcode 4, apple by default does _not_ ship a "normal" gcc but only a
> llvm-based one, and with Xcode 5, even that is gone.
>    My point was not about code-gen but compiler-availability ;)
> However, yours seem more important ATM.
>
>>
>>>> Right now Cog doesn't run if compiled with clang.  Only gcc will do.  No

Nope, it compiles and runs :). Performance seems to be the same as the
Pharo VM that the Pharo guys build with GCC (not sure, but I think they
use 4.2).

I am now working on a build using GCC 4.9 - not through it yet, but almost.


For some interesting silly benchmarks:

Stock binary pharo-vm (presume built with GCC 4.2???):
        133 million sends, 800 mill bytecodes.

Built with clang from Xcode 5:
        121 million sends, 790 mill bytecodes.

Binary 2776 from Eliot:
        114 mill sends, but 980 mill bytecodes.

3 years old OpenQwaq Cog VM compiled with Intel compiler:
        138 mill sends and 1000 mill bytecodes.


Hehe. So... will be interesting to see how GCC 4.9 fares in all this -
but this is nice - we can compile using clang!
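
To put the figures above side by side, one can express each result as a percentage of the stock pharo-vm numbers. This is a minimal sketch, assuming the rounded values quoted above (in millions per second); `relative_percent` and the short VM labels are hypothetical names chosen for this example.

```shell
#!/bin/sh
# Hypothetical helper: print value as a whole-number percentage of base.
relative_percent() {
    awk -v v="$1" -v base="$2" 'BEGIN { printf "%.0f", 100 * v / base }' </dev/null
}

# Columns: label, sends (millions/sec), bytecodes (millions/sec).
# Baseline is the stock pharo-vm binary: 133 sends, 800 bytecodes.
while read -r label sends bytecodes; do
    printf '%-14s sends %s%%  bytecodes %s%%\n' "$label" \
        "$(relative_percent "$sends" 133)" \
        "$(relative_percent "$bytecodes" 800)"
done <<'EOF'
stock          133  800
clang-xcode5   121  790
eliot-2776     114  980
openqwaq-icc   138 1000
EOF
```

Since these are single runs of what are called silly benchmarks above, small differences should not be over-read.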

regards, Göran