Hi:
For the others: I am currently adapting the benchmarking infrastructure used for Pinocchio to be a bit more general, so that I can integrate our RoarVM benchmarking tools.
The goal is to have a framework that allows all kinds of benchmarking, written like unit tests.
One other idea Henrik was interested in is to be able to easily compare the benchmark results of different versions of method implementations, to see whether optimizations were successful.

Camillo, even though I don't have a working version yet, I was trying to commit my refactoring (perhaps for review).
However, the PinocchioVM project seems to be 'global readonly'.

Would it be better to make it a stand-alone project?
When we go with that step, as already mentioned, I would like to rename it.

PBenchmark is a name that might prevent adoption. Not that I would like to go into politics here, but perhaps we can consider a new name.

Since the general idea is to write benchmarks like unit tests, how about 'SMark'/'SBench' instead of 'SUnit'?

Best regards
Stefan

PS:

One thing we might want to use as a source of inspiration in the future is:
http://code.google.com/p/caliper/

That's a microbenchmark framework for Java, also using the unit-testing metaphor.
Its @Param annotation is also neat (http://code.google.com/p/caliper/source/browse/trunk/examples/src/main/java/examples/ArraySortBenchmark.java); it allows encoding the input sizes for which the benchmarks should be executed.
That is something I am not too interested in, but in case the framework finds adoption it might be something to keep in mind.

--
Stefan Marr
Software Languages Lab
Vrije Universiteit Brussel
Pleinlaan 2 / B-1050 Brussels / Belgium
http://soft.vub.ac.be/~smarr
Phone: +32 2 629 2974
Fax: +32 2 629 3525
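To make the "benchmarks written like unit tests" idea and the Caliper-style input-size parameter concrete, here is a minimal hypothetical sketch in Smalltalk. All class and selector names below are made up for illustration; they are not taken from PBenchmark, Caliper, or any existing framework.

    BenchmarkSuite subclass: #CollectionBenchmarks
        instanceVariableNames: 'size'
        classVariableNames: ''
        poolDictionaries: ''
        category: 'Benchmarks-Examples'

    CollectionBenchmarks class >> inputSizes
        "Mimics Caliper's @Param: the input sizes for which each benchmark is run."
        ^ #(100 1000 10000)

    CollectionBenchmarks >> size: anInteger
        size := anInteger

    CollectionBenchmarks >> benchOrderedCollectionAdd
        "A hypothetical runner would time this method once per input size."
        | coll |
        coll := OrderedCollection new.
        1 to: size do: [ :i | coll add: i ]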
Let's factor it out as a separate project.
I would go for SBench :).

camillo
On Sun, Mar 13, 2011 at 11:30 PM, Stefan Marr <[hidden email]> wrote:
> Hi:

Hi Stef. I would love this. Not only to see how successful an optimization was, but also how much overhead your solution has ;)
Yes, please. Did you consider the repository PharoBenchmarks (even if the package can be called SBench)? Because it would be good to collect them somewhere, and that seems a good enough place.

BTW, how do you compare the benchmarks from PBenchmark to the ones that are in PharoBenchmarks?

Finally, I would also love to have such benchmarks run in each Hudson build to see whether each commit to Pharo makes it slower/faster. I know Cyrille was working on that. I cc'ed him, since it would be nice to join forces.

> When we go with that step, as already mentioned, I would like to rename it.

SBench sounds good to me.

Best regards
On 03/13/2011 11:30 PM, Stefan Marr wrote:
> Since the general idea is to write benchmarks like unit tests, how about 'SMark'/'SBench' instead of 'SUnit'?

And he immediately calls it after himself! S<tefan>mar<r>k ;)
Hi:
On 14 Mar 2011, at 09:36, Mariano Martinez Peck wrote:
> Hi Stef. I would love this. Not only to see how successful an optimization was, but also how much overhead your solution has ;)

Just in case you are talking to me, it is Stefan, not Stef...

> Yes, please. Did you consider the repository PharoBenchmarks (even if the package can be called SBench)? Because it would be good to collect them somewhere, and that seems a good enough place.

No, PharoBenchmarks is not 'good enough'. I am not going to restrict my work to a particular Smalltalk more than necessary. And to make that apparent I will insist on dropping the 'P' from the name.
The benchmark framework should not be restricted to Pharo in any way. Benchmarks perhaps, but not the execution framework.

> BTW, how do you compare the benchmarks from PBenchmark to the ones that are in PharoBenchmarks?

PharoBenchmarks contains mostly microbenchmarks of all different kinds. It should be relatively easy to adapt them to be used with the new framework.
PBenchmark contains a small set of microbenchmarks, too, a bit better organized though. The Whetstone benchmarks are already included in my RoarVM benchmarks and will definitely be ported.

> Finally, I would also love to have such benchmarks run in each Hudson build to see whether each commit to Pharo makes it slower/faster.

We already do that for the RoarVM, and the whole point is to enable you to do it too, within a common framework. What I want is to be able to compare the results of the various VMs (and/or images) with each other.

Best regards
Stefan
Hi Toon:
On 14 Mar 2011, at 09:52, Toon Verwaest wrote:
>> Since the general idea is to write benchmarks like unit tests, how about 'SMark'/'SBench' instead of 'SUnit'?
> And he immediately calls it after himself! S<tefan>mar<r>k ;)

Rather farfetched, no? :-P

The name SMark was inspired by names like 3DMark, FutureMark, CoreMark, SYSmark, WebMark, MobileMark.
On 14.03.2011 10:19, Stefan Marr wrote:
> Hi Toon:
>
> On 14 Mar 2011, at 09:52, Toon Verwaest wrote:
>>> Since the general idea is to write benchmarks like unit tests, how about 'SMark'/'SBench' instead of 'SUnit'?
>> And he immediately calls it after himself! S<tefan>mar<r>k ;)
> Rather farfetched, no? :-P
>
> The name SMark was inspired by names like 3DMark, FutureMark, CoreMark, SYSmark, WebMark, MobileMark.

I was actually about to suggest we call it SMarrk earlier today, but that would probably be too obvious :D

Anyway, regarding PharoBenchmarks/PBenchmark/etc., +100 to Stefan's reply.
It is important to keep the framework in a separate repository (whose name does not imply it is specific to any dialect).
Mariano: Don't mix/confuse the framework with the uses of it.

Cheers,
Henry
+1 to SMark
--
Best regards,
Igor Stasenko AKA sig.
> Since the general idea is to write benchmarks like unit tests, how about 'SMark'/'SBench' instead of 'SUnit'?
I like the idea. Actually, this is what I have done with HealthReportProducer. A benchmark is created by subclassing it and defining #metric* methods.

Cheers,
Alexandre

--
_,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:
Alexandre Bergel  http://www.bergel.eu
^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;.
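A minimal sketch of that convention, assuming the subclass-and-#metric*-methods pattern described above; the class and metric below are hypothetical, not taken from the actual HealthReportProducer code:

    HealthReportProducer subclass: #MyProjectHealth
        instanceVariableNames: ''
        classVariableNames: ''
        poolDictionaries: ''
        category: 'MyProject-Reports'

    MyProjectHealth >> metricNumberOfClasses
        "Each #metric* method answers one measured value for the report."
        ^ (PackageInfo named: 'MyProject') classes size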
Hi Alexandre:
On 14 Mar 2011, at 18:10, Alexandre Bergel wrote:
>> Since the general idea is to write benchmarks like unit tests, how about 'SMark'/'SBench' instead of 'SUnit'?
>
> I like the idea. Actually, this is what I have done with HealthReportProducer. A benchmark is created by subclassing it and defining #metric* methods.

Ehm, could you elaborate? Do you want me to have a look at something and steal some code/ideas? If so, I would need a more elaborate hint than just a name.
I googled for it, and there seems to be some reference in a Metacello project, and interestingly Google has already indexed this email...

Thanks
Stefan
You can check the class MondrianHealth2, which belongs to the package Mondrian-Tests of http://www.squeaksource.com/Mondrian.html.
The class contains many methods whose names begin with 'metric'. Some students worked on a kind of Hudson that exploits this information. You can have a look at it:
http://kimen.dcc.uchile.cl:8008/VerMon01/faces/project_details.xhtml?id=1

It would be cool to have a standard for this, as we have for SUnit. I think that what I did goes in the right direction, but the code is not as clean as it should be. It would be cool to have report generation, which could produce a .pdf, send emails, ...

Cheers,
Alexandre
Hi Alexandre:
On 14 Mar 2011, at 20:31, Alexandre Bergel wrote:
> The class contains many methods whose names begin with 'metric'. Some students worked on a kind of Hudson that exploits this information. You can have a look at it:
> http://kimen.dcc.uchile.cl:8008/VerMon01/faces/project_details.xhtml?id=1

Ok, that is about code metrics. Slightly different from the performance aspect I am interested in, but the solution seems to follow a similar principle.

> It would be cool to have a standard for this, as we have for SUnit. I think that what I did goes in the right direction, but the code is not as clean as it should be. It would be cool to have report generation, which could produce a .pdf, send emails, ...

Ehm, do you suggest having a common meta-framework for SUnit, your metric analysis/reporting, and benchmarking? Not sure whether that's worth the effort.
For now I will concentrate on getting a decent job done to have a common benchmarking framework. Where it goes from there, we shall see.

Best regards
Stefan
>> The class contains many methods whose names begin with 'metric'. Some students worked on a kind of Hudson that exploits this information. You can have a look at it:
>> http://kimen.dcc.uchile.cl:8008/VerMon01/faces/project_details.xhtml?id=1
>
> Ok, that is about code metrics. Slightly different from the performance aspect I am interested in, but the solution seems to follow a similar principle.

It evaluates the execution time of any piece of code. It simply uses [ ... ] timeToRun. No big deal. But this is enough in many cases.

> Ehm, do you suggest having a common meta-framework for SUnit, your metric analysis/reporting, and benchmarking? Not sure whether that's worth the effort.

No. I said that we should take inspiration from the SUnit framework to develop a benchmark framework. This is what Lukas, Jorge, Oscar and I agree upon. I can send you things to read if you wish.

> For now I will concentrate on getting a decent job done to have a common benchmarking framework. Where it goes from there, we shall see.

Yes!

Alexandre
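Since timeToRun carries the whole measurement here, a minimal worked example; the measured block is arbitrary, and in the Squeak/Pharo images of that era timeToRun answers an integer number of milliseconds:

    | milliseconds |
    milliseconds := [ (1 to: 100000) inject: 0 into: [ :sum :each | sum + each ] ] timeToRun.
    Transcript show: 'took ', milliseconds printString, ' ms'; cr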
On Mon, Mar 14, 2011 at 3:06 PM, Henrik Sperre Johansen <[hidden email]> wrote:
> Mariano: Don't mix/confuse the framework with the uses of it.

Thanks Henry, that was exactly what I misunderstood :)

Cheers,
Hi:
Small update, I just created
http://www.squeaksource.com/SMark.html

SMark is the attempt to build a common benchmarking framework for Smalltalk.
It is inspired by the metaphor used for unit testing in SUnit, and thus a benchmark is implemented by adding #benchMyBenchmark to a subclass of SBenchmarkSuite.

The code is originally based on PBenchmark, the benchmark framework used for the PinocchioVM, and RoarBenchmark, a framework used for performance regression testing of the RoarVM. Other sources of inspiration are for instance the Caliper microbenchmarking framework for Java (http://code.google.com/p/caliper/).

The name choice of SMark is deliberately confusable with smark (as defined by the Urban Dictionary: Noun. A person who is being scammed but is in on the prank. Someone who knows they are being tricked. Its origin is from the term "mark" and is the shortened form of the phrase "smart mark".), freely following the old wisdom: "Lies, Damn Lies, and --Statistics-- Benchmarks".

Current status is still pretty early; I am spec'ing out the behavior and data structures I imagine, with tests. And I haven't actually renamed anything to the new names yet.

The next update here will be once there is something actually usable.
However, the project is world-writeable, so feel free to have a look.

Best regards
Stefan
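For illustration, a minimal sketch of what a benchmark following that convention could look like. The #bench* selectors and the SBenchmarkSuite superclass follow the description above, but the setUp hook and everything else are assumptions, not necessarily the actual SMark API:

    SBenchmarkSuite subclass: #SortingBenchmarks
        instanceVariableNames: 'data'
        classVariableNames: ''
        poolDictionaries: ''
        category: 'SMark-Examples'

    SortingBenchmarks >> setUp
        "Assumed setup hook, analogous to SUnit's #setUp; not timed."
        data := (1 to: 10000) asArray shuffled

    SortingBenchmarks >> benchSortedCollection
        "The body of each #bench* method is what gets timed."
        data asSortedCollection

    SortingBenchmarks >> benchArraySort
        data copy sort: [ :a :b | a <= b ]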
Cool idea!
Thanks.

Cheers

On Tue, 15-03-2011 at 23:37 +0100, Stefan Marr wrote:
> Small update, I just created
> http://www.squeaksource.com/SMark.html

--
Miguel Cobá
http://twitter.com/MiguelCobaMtz
http://miguel.leugim.com.mx