About how to benchmark the lookup speed...


About how to benchmark the lookup speed...

Steven Costiou-2

Hi,

I'm having trouble doing benchmarks in Pharo, or at least I don't understand the results (maybe I'm missing something). I want to compare the lookup speed of anonymous subclasses with the lookup speed of standard Pharo classes.

I have a method m and a method m2 that both execute the following code:     ^ 100 printString. I implemented these two methods in an ordinary class, then in an anonymous subclass. I thought that measuring the execution speed of the same code in these two cases would give me different times if the lookup speed was not the same.
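(A minimal sketch of my setup, assuming Pharo's newAnonymousSubclass and compile:; MyBase stands in for the real class name:)

| anonClass |
anonClass := MyBase newAnonymousSubclass.    "Behavior>>newAnonymousSubclass"
anonClass compile: 'm ^ 100 printString'.
anonClass compile: 'm2 ^ 100 printString'.
"MyBase new m and anonClass new m now run the same code,
but the lookup starts in different classes."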

So I did the following:

| o results repetitions |
o := theClass new.
results := OrderedCollection new.
repetitions := 100.
repetitions timesRepeat: [
    results add: [ 1000000 timesRepeat: [ o m ] ] timeToRun ].

I do the same for m2, and then I compute the average of all the values collected in results.

What I don't understand is that, for example, averaging over 100 measurements, m is 1% slower and m2 is 2% faster in the plain Pharo class case than with the anonymous subclass. But over 1,000 measurements, m is 11% faster and m2 is 3% slower. The results keep varying as I change the number of measurements, but not in proportion to it (the relationship does not seem to be linear).

 

I don't have enough knowledge about how to benchmark code, or about what makes a difference in Pharo. For now, the only explanations I have are that the differences are too small to be significant, so the results vary, or that I have done something wrong in how I measure.

 

How would you measure the lookup speed of a method?

 

Steven.


Re: About how to benchmark the lookup speed...

Peter Kenny

Steven

 

The question you need to consider is how much the results vary from one repetition to another. If they are very variable, then the average will still be variable. In the case of your 100 repetitions, for instance, you can work out a confidence interval for the mean quite easily. Find the mean and standard deviation of the 100 repetitions; the standard error of the mean is the sample standard deviation divided by the square root of the number of values averaged (i.e. 10 in this case); the approximate 95% confidence interval is the mean +/- twice the standard error of the mean. (This makes a lot of simplifying assumptions, but should be sufficient for your purposes.)
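(In Pharo, a rough sketch of that computation, where results is the collection from Steven's snippet; this assumes the entries are plain numbers. If timeToRun answers Duration objects in your image, convert them with asMilliSeconds first:)

| mean variance stdErr |
mean := results average.    "Collection>>average"
variance := (results inject: 0 into: [ :sum :t | sum + (t - mean) squared ])
    / (results size - 1).
stdErr := variance sqrt / results size sqrt.
"approximate 95% confidence interval"
Transcript show: 'mean = ', mean printString, ' +/- ', (2 * stdErr) printString; cr.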

 

If, as I suspect, the width of the confidence interval is quite large in relation to the mean, this means that you cannot consistently measure the speed of operations like this. You could try greatly increasing the number of repetitions, but the square root law is working against you: 10,000 repetitions would give you an interval about 10% of the width you get from 100 repetitions, and 1,000,000 repetitions would reduce it to 1%.

 

Hope this is helpful

 

Peter Kenny

 


Re: About how to benchmark the lookup speed...

Steven Costiou-2

Thanks Peter :)

I will do these computations tomorrow at the lab and see what the confidence interval looks like.

Steven.


Re: About how to benchmark the lookup speed...

wernerk
On 06/14/2017 09:12 PM, Steven Costiou wrote:
repetitions timesRepeat:[ results add: [1000000 timesRepeat:[o m]] timeToRun].

Hi Steven,

The influence of the garbage collector is often somewhat difficult to predict in cases like these. I would change the code to:

repetitions timesRepeat: [
    Smalltalk garbageCollect.
    results add: [ 1000000 timesRepeat: [ o m ] ] timeToRun ].

And if I wanted to see the influence of the garbage collector, I would increase repetitions and decrease the 1000000 in such a way that their product remains constant, as sketched below. Of course, one would not want to eliminate that influence completely, since in the real world it always takes its toll.
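For instance (a sketch of that idea; the particular split below is arbitrary, and theClass and m are the names from Steven's snippet):

| o results outer inner |
o := theClass new.
outer := 1000.
inner := 100000000 / outer.    "keep outer * inner constant"
results := OrderedCollection new.
outer timesRepeat: [
    Smalltalk garbageCollect.
    results add: [ inner timesRepeat: [ o m ] ] timeToRun ].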
werner


Re: About how to benchmark the lookup speed...

Steven Costiou-2

Hi,

I added the garbage collect and the numbers are better! There are still fluctuations, but no more random-looking variations in the measured values.

I computed the standard error, which for all the measurements I did is below 0.5% of the mean (so the confidence interval is good).

 

Thank you both :)

 

Steven.

 
