Re: [Vm-dev] VM Benchmarks (Was: Re: [squeak-dev] [4.2] - VM <-> image release coordination)

Mariano Martinez Peck

Re: [Vm-dev] VM Benchmarks (Was: Re: [squeak-dev] [4.2] - VM <-> image release coordination)

On Tue, Jan 4, 2011 at 11:35 PM, Stefan Marr <[hidden email]> wrote:

Hi Igor:

On 04 Jan 2011, at 22:40, Igor Stasenko wrote:

> Okay, how about creating a separate
> VMBenchmarks repository
> and putting VMBenchmarks package there?

Sure, sounds good. There are also the Systems benchmarks at http://www.squeaksource.com/PharoBenchmarks.

Yes! A few months ago I committed this:

"
Several fixes to Benchmark. Now you can run for example:

Benchmark testStandardToFile: (FileStream forceNewFileNamed: 'mariano').
"

So, at least, I was able to run all benchmarks :)

However, it still needs cleaning, improvement, testing, and so on. But it is a good starting point, I think.

Cheers

Mariano

One question would be what to include in such a benchmark suite.
Another question is how to design the benchmark harness.

I need something that is scriptable from the command line (and if you are going to automate it, you probably do, too).
My harness registers itself in the startup list and then looks for the command line arguments to choose a benchmark class.
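
The command-line dispatch described here could be sketched roughly as follows. This is a minimal illustration in Python rather than Smalltalk, and it is not Stefan's harness: the benchmark classes, the `BENCHMARKS` registry, and the `--iterations` option are all assumptions made for the example, and the step of registering in the image's startup list is left out.

```python
import argparse
import time


class IntLoopBench:
    """Hypothetical micro-benchmark: sums integers in a loop."""

    def run(self):
        total = 0
        for i in range(100_000):
            total += i
        return total


class StringConcatBench:
    """Hypothetical micro-benchmark: builds a string from pieces."""

    def run(self):
        return "".join(str(i) for i in range(10_000))


# Registry mapping command-line names to benchmark classes (illustrative names).
BENCHMARKS = {cls.__name__: cls for cls in (IntLoopBench, StringConcatBench)}


def run_from_command_line(argv):
    """Pick a benchmark class by name and return its run times in seconds."""
    parser = argparse.ArgumentParser(description="Run one benchmark class.")
    parser.add_argument("benchmark", choices=sorted(BENCHMARKS))
    parser.add_argument("--iterations", type=int, default=5)
    args = parser.parse_args(argv)

    bench = BENCHMARKS[args.benchmark]()
    timings = []
    for _ in range(args.iterations):
        start = time.perf_counter()
        bench.run()
        timings.append(time.perf_counter() - start)
    return timings
```

A script would call `run_from_command_line(sys.argv[1:])` and print the returned timings, so a batch job can select the benchmark by name without opening an interactive session.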

Best regards
Stefan

--
Stefan Marr
Software Languages Lab
Vrije Universiteit Brussel
Pleinlaan 2 / B-1050 Brussels / Belgium
http://soft.vub.ac.be/~smarr
Phone: +32 2 629 2974
Fax: +32 2 629 3525

Stefan Marr-4

Re: [Vm-dev] VM Benchmarks (Was: Re: [squeak-dev] [4.2] - VM <-> image release coordination)

Hi:

On 04 Jan 2011, at 23:40, Mariano Martinez Peck wrote:

> On Tue, Jan 4, 2011 at 11:35 PM, Stefan Marr <[hidden email]> wrote:
> Hi Igor:
>
> On 04 Jan 2011, at 22:40, Igor Stasenko wrote:
>
> > Okay, how about creating a separate
> > VMBenchmarks repository
> > and putting VMBenchmarks package there?
>
> Sure, sounds good. There are also the Systems benchmarks at http://www.squeaksource.com/PharoBenchmarks.
>
>
> So, at least, I was able to run all benchmarks :)
>
> However, it still needs cleaning, improvement, testing, blah. But it is a good start point I think.

Well, I am not too sure about the general value of those benchmarks.
There are many microbenchmarks which do not tell you a lot. All those test* things.
And well, their value for testing is also questionable. They can only help you identify where it goes *boom* and crashes the VM, but they do not actually assert anything.

I am also not sure what value Slopstone and Smopstone (names off the top of my head, so they might be slightly different) have nowadays.

The compiler benchmark is a reasonable application benchmark.
Would be good to have a few others in that collection, too.

Best regards
Stefan

--
Stefan Marr
Software Languages Lab
Vrije Universiteit Brussel
Pleinlaan 2 / B-1050 Brussels / Belgium
http://soft.vub.ac.be/~smarr
Phone: +32 2 629 2974
Fax: +32 2 629 3525

Mariano Martinez Peck

Re: [Vm-dev] VM Benchmarks (Was: Re: [squeak-dev] [4.2] - VM <-> image release coordination)

Does anyone have news regarding this topic? We would VERY much welcome people helping with benchmarks.
Please feel free to improve http://www.squeaksource.com/PharoBenchmarks

abergel

Re: [Vm-dev] VM Benchmarks (Was: Re: [squeak-dev] [4.2] - VM <-> image release coordination)

I am not sure what you mean by VM benchmarks, but almost all the tools I contribute to come with some benchmarks (Spy, Mondrian; I wrote some benchmarks for Glamour as well). Naturally, those are macro benchmarks, which is probably what matters the most.

Cheers,

Alexandre

Camillo Bruni

Re: [Vm-dev] VM Benchmarks (Was: Re: [squeak-dev] [4.2] - VM <-> image release coordination)

Maybe I am being ignorant writing like this. I would like to see a nice set of benchmarks popping up in Pharo.

But did you have a look at the benchmark implementation we use in Pinocchio?
It's in the PBenchmark package of the Pinocchio project:

MCHttpRepository
    location: 'http://www.squeaksource.com/p'
    user: ''
    password: ''

It's a fairly straightforward implementation based on UnitTests...

I see several issues in your benchmark implementation which I think are solved much more cleanly with my approach:

- currently there is one single class with tons of benchmarks in it
- no statistically valid output (the average alone doesn't mean anything! See http://dx.doi.org/10.1145/1297105.1297033 for the basic scientific background)
- you interleave model (benchmarks and results) and view (transcript output) which is really evil, there is no way you can ever use this on the command line!
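
The last two points can be illustrated with a small sketch. It is written in Python as a language-neutral illustration and is not the PBenchmark code: `BenchmarkResult`, `measure`, and `format_result` are hypothetical names, and the interval uses a normal approximation, which is only an assumption that becomes reasonable at roughly 30 or more samples. The measuring code returns plain data with a confidence interval next to the mean, and the text output lives in a separate function.

```python
import statistics
import time
from dataclasses import dataclass


@dataclass
class BenchmarkResult:
    """Plain data holder (the 'model'); it knows nothing about printing."""
    name: str
    samples: list

    def mean(self):
        return statistics.mean(self.samples)

    def confidence_interval(self, confidence=0.95):
        """Normal-approximation interval for the mean (assumes enough samples)."""
        z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
        half_width = z * statistics.stdev(self.samples) / len(self.samples) ** 0.5
        return self.mean() - half_width, self.mean() + half_width


def measure(name, workload, repetitions=30):
    """Time the workload repeatedly and return the raw samples, with no output."""
    samples = []
    for _ in range(repetitions):
        start = time.perf_counter()
        workload()
        samples.append(time.perf_counter() - start)
    return BenchmarkResult(name, samples)


def format_result(result):
    """Separate 'view': turn a result into one line of text."""
    low, high = result.confidence_interval()
    return f"{result.name}: mean={result.mean():.6f}s 95% CI=[{low:.6f}, {high:.6f}]"
```

Because `measure` never prints, the same result object can feed a transcript, a file, or a command-line report.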

So what I suggest is that you have a look at my implementation, and then we see how we can improve the current situation.

m(^_-)m
camillo

Stefan Marr-4

Re: VM Benchmarks

In reply to this post by abergel

Hi:

I currently have the following benchmarks at hand, partially taken from PharoBenchmarks and partially from the language shootout (http://shootout.alioth.debian.org/, https://alioth.debian.org/snapshots.php?group_id=30402).

CompilerBench.st
FloatLoopBench.st
IntLoopBench.st
RoarParallelTinyBench.st
StartupBench.st

NasParallelBenchmarksIS.st

BinaryTrees.st
Chameleon.st
FannkuchRedux.st
Fasta.st
NBody.st
PermGeneratorRedux.st
ChameneosRedux.st

RoarSlopstoneBenchmark.st
RoarSmopstoneBenchmark.st

This also includes a small benchmark harness that allows parallel execution, i.e., benchmarking for weak scalability. It also includes the necessary details to be run in batch mode from the command line, which is very important for performance regression testing of our RoarVM. An in-image-only solution does not provide enough flexibility for us.

Would be interesting to discuss these things today or tomorrow at the sprint.

Best regards
Stefan

--
Stefan Marr
Software Languages Lab
Vrije Universiteit Brussel
Pleinlaan 2 / B-1050 Brussels / Belgium
http://soft.vub.ac.be/~smarr
Phone: +32 2 629 2974
Fax: +32 2 629 3525

Mariano Martinez Peck

Re: [Vm-dev] VM Benchmarks (Was: Re: [squeak-dev] [4.2] - VM <-> image release coordination)

In reply to this post by Camillo Bruni

There is no class comment, no wiki, no class side examples....
I just tried
PBenchmarkSuite new runAll
but I didn't get anything.

cheers

mariano

Camillo Bruni

Re: [Vm-dev] VM Benchmarks (Was: Re: [squeak-dev] [4.2] - VM <-> image release coordination)

On 2011-03-11, at 15:35, Mariano Martinez Peck wrote:

> There is no class comment, no wiki, no class side examples....
> I just tried
> PBenchmarkSuite new runAll
> but I didn't get anything.
>
> cheers
>
> mariano

Sorry, indeed no documentation yet :). But as I said, it works like the UnitTest suite, hence there is nothing to run in PBenchmarkSuite itself.

Try one of the actual test-classes like

PBString run

and you will get a PBenchmarkRun back, as you would with a TestCase.
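
The TestCase-like shape described here can be sketched as follows. This is an illustration in Python under stated assumptions, not the PBenchmark implementation: `Benchmark`, `BenchmarkRun`, `StringBenchmark`, and the `bench` method-name prefix are all hypothetical. A class-side `run` collects the benchmark methods of a concrete class, times each one, and hands back a result object.

```python
import time


class BenchmarkRun:
    """Result object, similar in spirit to a test result: method name -> seconds."""

    def __init__(self):
        self.timings = {}


class Benchmark:
    """Base class: subclasses define methods whose names start with 'bench'."""

    @classmethod
    def run(cls):
        result = BenchmarkRun()
        instance = cls()
        for name in sorted(dir(cls)):
            if name.startswith("bench"):
                start = time.perf_counter()
                getattr(instance, name)()
                result.timings[name] = time.perf_counter() - start
        return result


class StringBenchmark(Benchmark):
    """Hypothetical concrete benchmark class, playing the role of PBString."""

    def bench_concatenation(self):
        "".join(str(i) for i in range(10_000))

    def bench_reverse(self):
        ("abc" * 10_000)[::-1]
```

In this sketch `StringBenchmark.run()` returns a `BenchmarkRun` with one timing per method, while `Benchmark.run()` on the base class returns an empty result because it defines no benchmark methods of its own.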

let me know if that helps :).

camillo

abergel

Re: [Vm-dev] VM Benchmarks (Was: Re: [squeak-dev] [4.2] - VM <-> image release coordination)

In reply to this post by Camillo Bruni

> It's a fairly straightforward implementation based on UnitTests...
>

Yeah. Actually, a very good benchmark is running unit tests.
People from docomo do that.
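
Timing a unit-test run can be sketched like this. It is a minimal Python illustration, not a description of any particular team's setup: `ExampleTests` is a hypothetical stand-in for a project's real tests, and `time_test_suite` is a name invented for the example. A side benefit is that the tests still assert their results, so a run that is fast but wrong is reported as failed.

```python
import io
import time
import unittest


class ExampleTests(unittest.TestCase):
    """Hypothetical test case standing in for a project's real unit tests."""

    def test_sorting(self):
        self.assertEqual(sorted([3, 1, 2]), [1, 2, 3])

    def test_string_join(self):
        self.assertEqual("-".join(["a", "b"]), "a-b")


def time_test_suite(test_case_class, repetitions=5):
    """Run all tests of a class several times; return (wall times, all_passed)."""
    loader = unittest.defaultTestLoader
    times = []
    all_passed = True
    for _ in range(repetitions):
        suite = loader.loadTestsFromTestCase(test_case_class)
        # Send the runner's report to an in-memory buffer to keep output separate.
        runner = unittest.TextTestRunner(stream=io.StringIO(), verbosity=0)
        start = time.perf_counter()
        outcome = runner.run(suite)
        times.append(time.perf_counter() - start)
        all_passed = all_passed and outcome.wasSuccessful()
    return times, all_passed
```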

Alexandre
