Moving/rolling average implementations ?

cedreek
Hi,

I wanted to compute a moving/rolling average on raw data [1].
I haven’t found code for that (maybe this is done in PolyMath, though).

So I ended up writing this (I think this is a simple moving average, SMA):

SequenceableCollection>>movingAverage: anOrder
    "Answer the moving (rolling) average of the receiver over a window of anOrder elements"

    | retval size x y |
    anOrder <= 0 ifTrue: [ Error signal: 'the order must be positive' ].
    size := self size - anOrder.
    size negative ifTrue: [ Error signal: 'the collection size is too small' ].
    retval := self species ofSize: size + 1.

    x := 1.
    y := anOrder.
    [ y <= self size ] whileTrue: [
        "Average the window x..y, then slide it one element to the right"
        retval at: x put: (self copyFrom: x to: y) average.
        x := x + 1.
        y := y + 1 ].
    ^ retval

It is not perfect but seems to work quite well (it would probably be better to remove copyFrom: and use some kind of buffer instead).

Is there any interest in that? If there is existing code, I’d be interested too, especially in other implementations (weighted, exponential).


(#(118 113 105 105 103 99 98 101 100 107) movingAverage: 3) collect: [:v | v asScaledDecimal: 1 ] .

 "an Array(112.0s1 107.7s1 104.3s1 102.3s1 100.0s1 99.3s1 99.7s1 102.7s1)"

Cheers,
Cédrick 





Re: Moving/rolling average implementations ?

Christian Haider

Hi Cédrick,

for smallCharts, I had to implement the same thing, but for stock market time series, which are a bit bigger than your example (like creating a 38-day moving average over 5 years).

Always copying the values to average is far too slow in this case.

My solution is to add up the first n numbers (for an n-point moving average) and remember the sum. The new average point is of course the sum divided by n.

For the next point, you subtract the first number from the sum and add the next ((n + 1)th) number to the sum.
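
The running-sum idea described above can be sketched as a playground block, so it can be tried without installing a method. This is only an illustration: the names movingAverageBySum, data and n are made up here and are not taken from smallCharts.

```smalltalk
| movingAverageBySum |
movingAverageBySum := [ :data :n |
	| sum result |
	"The sum of the first n elements gives the first window."
	sum := 0.
	1 to: n do: [ :i | sum := sum + (data at: i) ].
	result := OrderedCollection with: sum / n.
	"Slide the window: drop the oldest element, add the newest."
	n + 1 to: data size do: [ :i |
		sum := sum - (data at: i - n) + (data at: i).
		result add: sum / n ].
	result asArray ].

movingAverageBySum value: #(118 113 105 105 103 99 98 101 100 107) value: 3.
```

With integer input the sums and quotients stay exact (Integers and Fractions), so no rounding is involved in this particular call.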

 

Happy hacking,
Christian

 

 

From: Pharo-users <[hidden email]> On behalf of Cédrick Béler
Sent: Wednesday, 8 April 2020 10:07
To: Any question about pharo is welcome <[hidden email]>
Subject: [Pharo-users] Moving/rolling average implementations ?

 


Re: Moving/rolling average implementations ?

Richard O'Keefe
In reply to this post by cedreek
I note that "self species ofSize: n" is not generally a good idea.
Consider ByteArray, ShortIntegerArray, WordArray.
Computing rolling means of these is perfectly sensible,
but the results will not fit into an array of the same species.
I'd stick with Array.
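
A small playground sketch of the species problem, assuming that storing a Fraction into a ByteArray signals an Error (the variable names are illustrative only):

```smalltalk
| bytes outcome |
bytes := #[1 2 4].
"species answers ByteArray here, so the result collection can only
 hold integers in the range 0..255, not a mean such as 3/2."
outcome := [ (bytes species ofSize: 2) at: 1 put: 3 / 2. #stored ]
	on: Error
	do: [ :error | #rejected ].
outcome
```

I would expect this to answer #rejected, whereas the same store into an Array of size 2 succeeds.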

The suggestion about subtracting an old element and adding a new
one is great for integers, but for floating-point numbers risks
accumulating errors.

The

On Wed, 8 Apr 2020 at 20:07, Cédrick Béler <[hidden email]> wrote:

Re: Moving/rolling average implementations ?

Christian Haider

I don’t see how rounding errors could accumulate if you keep the sum and not the average.

The rounding errors should be neutral, because each element is added once and subtracted once.

If + and - are symmetrical in this respect, rounding inaccuracies should balance out.

Cheers,
Christian

 

From: Pharo-users <[hidden email]> On behalf of Richard O'Keefe
Sent: Thursday, 9 April 2020 05:26
To: Any question about pharo is welcome <[hidden email]>
Subject: Re: [Pharo-users] Moving/rolling average implementations ?

 


Re: Moving/rolling average implementations ?

cedreek
Hi Richard and Christian,

thanks for the comments/suggestions.

For "self species ofSize:", I took inspiration from code in the image, such as #overlappingPairsCollect:.
As for float rounding errors, that is interesting to consider indeed. I’ll try a buffer variant.

Some related questions for my OO culture ;-) :

#average is defined in Collection… That makes me wonder, because (especially in the Collection hierarchy) some methods are not applicable to some subclasses. I think this is a common problem with inheritance?

What’s the way to deal with that? Traits?
I found that #shouldNotImplement is used 106 times in one Pharo 8 image.

Is it a good/favored pattern to prevent usage of such methods?

Cheers,
Cédrick

PS: would such a method be a candidate for integration in the base image? If yes, I could try a PR.

Re: Moving/rolling average implementations ?

Richard O'Keefe
In reply to this post by Christian Haider
Let's take the inheritance issue first.
Yes, Collection has some subclasses where #average makes no sense,
such as String.  If any subclass of Collection should have #average, it is
Array, as #(1 2 3 4) average makes perfect sense.
BUT #($a $b $c $d) average makes exactly as much sense as 'abcd' average.
So we have
 - subclasses where a method never makes sense (String)
 - subclasses where a method always makes sense (ByteArray)
 - subclasses where a method may or may not make sense (Array).
Traits will not help at all with the third category.
Historic Smalltalk practice has been to define methods like
  String>>average self shouldNotImplement.
to "cancel" inappropriate methods to deal with the "never" subclasses,
but that still does not help with the "maybe" ones.

In my own  code I try very hard to ensure that a method is available in
a class if and only if the class can have instances where the method
makes sense.  That is, I try to make sure that #respondsTo: is "honest".
It is not always practical.

This is not a problem that is peculiar to Smalltalk.  Java and C# and C++
and you-name-it also have the "maybe" problem, where a method seems
to be available for an object but thanks to its state it is not.

As long as sending a message to an inappropriate object (whether due to
its class or its state) results in *some* exception, do we really have a problem?
'abcd' average
'abcd' asArray average
result in the same error.
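
This claim can be checked in a playground with a sketch like the following; errorClassFor is a made-up helper that answers the class of the exception a block signals, or nil if none is signalled.

```smalltalk
| errorClassFor |
errorClassFor := [ :aBlock |
	[ aBlock value. nil ] on: Error do: [ :error | error class ] ].

"Both elements should be the same exception class."
{ errorClassFor value: [ 'abcd' average ].
  errorClassFor value: [ 'abcd' asArray average ] }
```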

Now for the roundoff error issue.

Oh dear, we really need to do a much better job educating programmers
about floating point, we really do.  In real number arithmetic,
 (a + b) - a = b
is always true.  In floating-point arithmetic it is not.
This is easiest to see in cases like
 (1.0e20 + 1.0) - 1.0e20
where Pharo answers 0.0 instead of 1.0.
Less extreme cases still give you roundoff error.
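
A few playground expressions that illustrate the point; the second and third lines are additional examples chosen here, assuming Pharo Floats are IEEE 754 doubles.

```smalltalk
{ (1.0e20 + 1.0) - 1.0e20.
	"the 1.0 is absorbed by the large value, so this is 0.0, not 1.0"
  ((0.1 + 0.2) - 0.1) = 0.2.
	"a less extreme case: the comparison answers false"
  (((1/10) + (2/10)) - (1/10)) = (2/10)
	"the same computation with exact Fractions answers true" }
```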

Much has been written about how to stably update mean, variance, &c.
The simplest thing is to use Welford's algorithm for the weighted mean,
using weight +1 to add the new element and -1 to remove the old.
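
Written out, and assuming the usual weighted form of the incremental mean, the two steps collapse into a single update of the window mean. The symbols are chosen here for illustration: \mu is the current mean, W the total weight, n the window width.

```latex
% Weighted incremental mean: add x with weight w
W' = W + w, \qquad \mu' = \mu + \frac{w}{W'}\,(x - \mu)

% Add the new element (w = +1, W = n):
\mu_1 = \mu + \frac{x_{\mathrm{new}} - \mu}{n + 1}

% Remove the old element (w = -1, W = n + 1):
\mu_2 = \mu_1 - \frac{x_{\mathrm{old}} - \mu_1}{n}
      = \mu + \frac{x_{\mathrm{new}} - x_{\mathrm{old}}}{n}
```

The last equality follows from (n + 1)\mu_1 = n\mu + x_{\mathrm{new}}.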

On Thu, 9 Apr 2020 at 19:33, Christian Haider <[hidden email]> wrote:


Re: Moving/rolling average implementations ?

cedreek
Superb, thank you Richard, very interesting answer :)

As for the rounding issue, since I got bitten once, I use ScaledDecimal instead of floats for data most of the time anyway.

Cheers,

Cédrick

On 10 Apr 2020, at 05:04, Richard O'Keefe <[hidden email]> wrote:


Re: Moving/rolling average implementations ?

Richard O'Keefe
In reply to this post by cedreek
I have coded and benchmarked 8 different running mean algorithms.
In the presence of inexact numbers, the one below is not as accurate as
redoing the sums, but it's pretty close, and it's fast.
If "width" is not an integer or is out of range, an error
will be reported by #new: or #at:[put:].  It's based on Welford's
stable update.

Of course this approach does NOT work for trimmed or Winsorised
means or for medians or any kind of robust estimate of location.

SequenceableCollection
  methods for: 'summarising'
    runningMeans: width
      |a m d|
      a := Array new: self size - width + 1.
      m := 0.
      1 to: width do: [:i |
        m := (self at: i) + m].
      m := m / width.  
      d := 1.
      a at: d put: m.
      width + 1 to: self size do: [:i |
        m := ((self at: i) - (self at: d)) / width + m.
        d := d + 1.
        a at: d put: m].
      ^a
     
 




Re: Moving/rolling average implementations ?

cedreek
Beautiful ^^

I would vote for inclusion in the base image ?
With your explanation as comments.

I’ll play with it.

Thanks
Cedrick

> On 12 Apr 2020, at 12:19, Richard O'Keefe <[hidden email]> wrote:

Re: Moving/rolling average implementations ?

Sven Van Caekenberghe-2


> On 12 Apr 2020, at 13:53, Cédrick Béler <[hidden email]> wrote:
>
> Beautiful ^^

I also like it.

But why the single letter variable names ? Why not:

SequenceableCollection>>#runningMeans: width
  | means sum index |
  means := Array new: self size - width + 1.
  sum := 0.
  1 to: width do: [ :each |
    sum := sum + (self at: each) ].
  index := 1.
  means at: index put: sum / width.
  width + 1 to: self size do: [ :each |
    sum := sum - (self at: index) + (self at: each).
    index := index + 1.
    means at: index put: sum / width ].
  ^ means

A good comment, a correct initial bounds check and unit tests are also needed.
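
One possible shape for such an initial bounds check, sketched as a playground block. The name checkWidth and the message text are hypothetical, not an existing Pharo API.

```smalltalk
| checkWidth |
"Hypothetical guard: width must be an integer in 1..size."
checkWidth := [ :width :size |
	(width isInteger and: [ width between: 1 and: size ])
		ifFalse: [ Error signal: 'width must be an integer between 1 and ', size printString ] ].

"A valid width passes silently; an invalid one signals an Error."
checkWidth value: 3 value: 10.
[ checkWidth value: 1.3 value: 6 ] on: Error do: [ :error | error messageText ]
```

In a method, the same test would go before the result array is allocated.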


Re: Moving/rolling average implementations ?

cedreek
Fully agree on proper names.

=== I don’t know if this is because of the free time that confinement brings, but I keep coming up with questions, so I’ll share a few that I hope make sense (for the sake of discussion).

For instance, I'm wondering whether testing conditions up front in order to raise proper exceptions is good practice (for instance, when the width object does not make sense, like a float). By the way, #doesNotMakeSense would be a cool name for the "maybe" cases Richard was talking about.

Then I asked myself what the drawbacks are (especially for performance) of adding extra information to source code (a bit like longer variable names).

The compiled code and the sources file are separate, which helps to separate concerns. At least we don’t mind at all having longer names (variable names, …).

What I can't help wondering about is pragmas. I roughly see how they work. But is it possible to distinguish between runtime and source-only pragmas? (I'm not sure I’m being clear here, but it seems to me that some matter only for documentation purposes and are not needed at runtime.)

Also, I’ve never really liked method categories. I don’t really see how they are implemented, but they don’t feel nice to me.
Could they be just pragmas?





Happy Easter Sunday, stay safe everyone (sad day for the Game of Life),

Cédrick



> On 12 Apr 2020, at 14:22, Sven Van Caekenberghe <[hidden email]> wrote:

Re: Moving/rolling average implementations ?

Richard O'Keefe
In reply to this post by Sven Van Caekenberghe-2
I did mention that it was one of eight implementations that I tried.
In fact you broke it with your change.
It is important ***NOT*** to keep a running sum but a running MEAN.
The line
    m := ((self at: i) - (self at: d)) / width + m
was written that way for a reason.
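
To compare the two update styles on one's own data, a playground harness along these lines could be used. It is only a sketch: the test data, the width and all variable names are made up, and it assumes Float>>asTrueFraction answers the exact rational value of a Float. It reports the largest deviation of each variant from an exact recomputation and makes no claim about which one wins.

```smalltalk
| width data exactMeans firstSum sum sumBased mean meanBased worstError |
width := 38.
"Made-up test data: a large offset plus a small wiggle."
data := (1 to: 2000) collect: [ :i | 1.0e9 + (i * 0.37) sin ].

"Reference: redo every window with exact fractions."
exactMeans := (1 to: data size - width + 1) collect: [ :start |
	((start to: start + width - 1)
		inject: 0
		into: [ :acc :i | acc + (data at: i) asTrueFraction ]) / width ].

firstSum := (1 to: width) inject: 0.0 into: [ :acc :i | acc + (data at: i) ].

"Variant 1: keep a running SUM and divide each time."
sum := firstSum.
sumBased := OrderedCollection with: sum / width.
width + 1 to: data size do: [ :i |
	sum := sum - (data at: i - width) + (data at: i).
	sumBased add: sum / width ].

"Variant 2: keep a running MEAN, updated as in the line quoted above."
mean := firstSum / width.
meanBased := OrderedCollection with: mean.
width + 1 to: data size do: [ :i |
	mean := ((data at: i) - (data at: i - width)) / width + mean.
	meanBased add: mean ].

"Largest absolute deviation from the exact reference."
worstError := [ :means |
	((1 to: means size)
		inject: 0
		into: [ :acc :k |
			acc max: ((means at: k) asTrueFraction - (exactMeans at: k)) abs ]) asFloat ].

{ worstError value: sumBased. worstError value: meanBased }
```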

I did think of renaming the variables for general use, but decided to display
the actual code that I tested.  If you want polished code, here it is.

    runningMeans: width
      "This returns an array of running means as if the receiver
       were broken into overlapping segments 'width' long and the
       mean of each calculated.  This uses an adaptation of
       Welford's algorithm for stably updating the mean; it is
       important to maintain a current MEAN not a current SUM.
       This has been tested against 7 other algorithms.  It was
       the most accurate of the faster ones.  The result is an
       Array no matter what kind of sequence the receiver is.
       If the receiver is a tree (like a virtual concatenation)
       or a singly or doubly linked list you should convert the
       receiver to an Array first.  Note that there is no
       explicit check that width is an Integer or is in range;
       none is needed because those checks happen anyway."
      |result mean resultIndex|
      result := Array new: self size - width + 1.
      mean := 0.
      1 to: width do: [:i | mean := (self at: i) + mean].
      mean := mean / width.
      resultIndex := 1.
      result at: resultIndex put: mean.
      width + 1 to: self size do: [:i |
        mean := ((self at: i) - (self at: resultIndex)) / width + mean.
        resultIndex := resultIndex + 1.
        result at: resultIndex put: mean].
      ^result



On Mon, 13 Apr 2020 at 00:23, Sven Van Caekenberghe <[hidden email]> wrote:


Re: Moving/rolling average implementations ?

cedreek
What about this test (in OrderedCollectionTest - it suggests a Trait too) ?

testRunningMeans

        | result col |
        col := #(1 1 2 2 3 3) asOrderedCollection.
        result := col runningMeans: 2.

        self assert: (result = {1. (3/2). 2. (5/2). 3}).

        self assert: (result class = Array).

        self assert: result size <= (col size). "Running means with 1 has little interest ?"

        self should: [ col runningMeans: 7 ] raise: SubscriptOutOfBounds.
        self should: [ col runningMeans: -2 ] raise: SubscriptOutOfBounds.
        self
                should: [ col runningMeans: 1.3 ]
                raise: Error
                withExceptionDo: [ :anException |
                        self assert: anException messageText equals: 'primitive #basicNew: in Array class failed' ]


Cheers,
Cédrick

> On 13 Apr 2020, at 04:47, Richard O'Keefe <[hidden email]> wrote:
>
> I did mention that it was one of eight implementations that I tried.
> In fact you broke it with your change.
> It is important ***NOT*** to keep a running sum but a running MEAN.
> The line
>    m := ((self at: i) - (self at: d)) / width + m
> was written that way for a reason.
>
> I did think of renaming the variables for general use, but decided to display
> the actual code that I tested.  If you want polished code, here it is.
>
>    runningMeans: width
>      "This returns an array of running means as if the receiver
>       were broken into overlapping segments 'width' long and the
>       mean of each calculated.  This uses an adaptation of
>       Welford's algorithm for stably updating the mean; it is
>       important to maintain a current MEAN not a current SUM.
>       This has been tested against 7 other algorithms.  It was
>       the most accurate of the faster ones.  The result is an
>       Array no matter what kind of sequence the receiver is.
>       If the receiver is a tree (like a virtual concatenation)
>       or a singly or doubly linked list you should convert the
>       receiver to an Array first.  Note that there is no
>       explicit check that width is an Integer or is in range;
>       none is needed because those checks happen anyway."
>      |result mean resultIndex|
>      result := Array new: self size - width + 1.
>      mean := 0.
>      1 to: width do: [:i | mean := (self at: i) + mean].
>      mean := mean / width.
>      resultIndex := 1.
>      result at: resultIndex put: mean.
>      width + 1 to: self size do: [:i |
>        mean := ((self at: i) - (self at: resultIndex)) / width + mean.
>        resultIndex := resultIndex + 1.
>        result at: resultIndex put: mean].
>      ^result
>
>
>
> On Mon, 13 Apr 2020 at 00:23, Sven Van Caekenberghe <[hidden email]> wrote:
>>
>>
>>
>>> On 12 Apr 2020, at 13:53, Cédrick Béler <[hidden email]> wrote:
>>>
>>> Beautiful ^^
>>
>> I also like it.
>>
>> But why the single letter variable names ? Why not:
>>
>> SequenceableCollection>>#runningMeans: width
>>  | means sum index |
>>  means := Array new: self size - width + 1.
>>  sum := 0.
>>  1 to: width do: [ :each |
>>    sum := sum + (self at: each) ].
>>  index := 1.
>>  means at: index put: sum / width.
>>  width + 1 to: self size do: [ :each |
>>    sum := sum - (self at: index) + (self at: each).
>>    index := index + 1.
>>    means at: index put: sum / width ].
>>  ^ means
>>
>> A good comment, a correct initial bounds check and unit tests are also needed.
>>
>>> I would vote for inclusion in the base image ?
>>> With your explanation as comments.
>>>
>>> I’ll play with it.
>>>
>>> Thanks
>>> Cedrick
>>>
>>>> On 12 Apr 2020, at 12:19, Richard O'Keefe <[hidden email]> wrote:
>>>>
>>>>
>>>> I have coded and benchmarked 8 different running mean algorithms.
>>>> In the presence of inexact numbers it is not as accurate as
>>>> redoing the sums, but it's pretty close, and it's fast.
>>>> If "width" is not an integer or is out of range, an error
>>>> will be reported by #new: or #at:[put:].  It's based on Welford's
>>>> stable update.
>>>>
>>>> Of course this approach does NOT work for trimmed or Winsorised
>>>> means or for medians or any kind of robust estimate of location.
>>>>
>>>> SequenceableCollection
>>>> methods for: 'summarising'
>>>>   runningMeans: width
>>>>     |a m d|
>>>>     a := Array new: self size - width + 1.
>>>>     m := 0.
>>>>     1 to: width do: [:i |
>>>>       m := (self at: i) + m].
>>>>     m := m / width.
>>>>     d := 1.
>>>>     a at: d put: m.
>>>>     width + 1 to: self size do: [:i |
>>>>       m := ((self at: i) - (self at: d)) / width + m.
>>>>       d := d + 1.
>>>>       a at: d put: m].
>>>>     ^a
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>
>>
>



Re: Moving/rolling average implementations ?

Richard O'Keefe
In reply to this post by cedreek
Concerning method categories:
VisualAge Smalltalk manages without them.  Of all the Smalltalk
systems available to me, it's the only one where I don't enjoy using
the browser.  It carves up the world of methods another way, which I
find of little or no help.  That in no way detracts from it being a
solid system fit for its intended uses.

GNU Smalltalk uses <category: 'whatever'> pragmas to put methods into
categories.
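
For readers who have not seen that form, here is a minimal sketch in GNU Smalltalk's bracketed file syntax. The method body is the polished runningMeans: from earlier in the thread; filing it in through an `extend` block and the wording of the comment are assumptions made for illustration.

```smalltalk
SequenceableCollection extend [
    runningMeans: width [
        "Answer an Array holding the mean of each overlapping
         window of 'width' consecutive elements."

        "The pragma below is what places the method in its category."
        <category: 'summarising'>
        | result mean resultIndex |
        result := Array new: self size - width + 1.
        mean := 0.
        1 to: width do: [:i | mean := (self at: i) + mean].
        mean := mean / width.
        resultIndex := 1.
        result at: resultIndex put: mean.
        width + 1 to: self size do: [:i |
            "Update the running mean, not a running sum."
            mean := ((self at: i) - (self at: resultIndex)) / width + mean.
            resultIndex := resultIndex + 1.
            result at: resultIndex put: mean].
        ^result
    ]
]
```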

Used consistently and well, method categories are not only a
navigation aid, helping you to find methods you do not know the names
of, but a helpful documentation aid.  For example, #runningMeans: is
in the 'summarising' category.  In my Smalltalk, that means
 - it inspects every element of the collection
 - it does not change the collection
 - the result is somehow a summary or distillation of the elements
Of course, to get the most from method categories, the method
categories need to be documented, and you need to keep them as up to
date as any other part of the source code.

I can say that the discipline of trying to come up with meaningful
categories and use them consistently has improved the quality of my
code.

On Mon, 13 Apr 2020 at 00:38, Cédrick Béler <[hidden email]> wrote:

>
> Fully agree on proper names.
>
> === I don’t know if this is because of the free time that confinement gives me, but I keep coming up with questions, so I am sharing a few that I hope make sense (for the sake of discussion).
>
> For instance, I'm wondering whether testing conditions up front so as to raise proper exceptions is good practice (for instance if the width object does not make sense, like a float). #doesNotMakeSense, btw, would be a cool name for the « maybe » cases Richard was talking about.
>
> Then I asked myself what the drawbacks (especially on performance) are of adding extra information to source code (a bit like longer variable names)?
>
> There is the raw code and the source code file, which helps separate concerns. At least we don’t mind having longer literals (variable names, …).
>
> What I cannot figure out is pragmas. I roughly see how they work. But is it possible to distinguish between runtime and source-only pragmas (not sure I’m clear here, but it seems to me that some matter for documentation purposes and are not needed at runtime)?
>
> Also, I’ve never really liked method categories. I don’t really see how they are implemented, but they don’t feel nice to me.
> Could they be only pragmas ?
>
>
>
>
>
> Happy Easter Sunday, stay safe everyone (sad day for the Game of Life),
>
> Cédrick
>
>
>


Re: Moving/rolling average implementations ?

cedreek
In reply to this post by cedreek
And maybe another one to show deviation problem ?

testRunningMeansDeviation

        | result1 result2 col1 col2 |
       
        col1 := #(0.3s1 1s1 0.1s1 0.3s1 1s1 0.1s1 0.3s1 1s1 0.1s1).
        result1 := col1 runningMeans: 2.
       
        col2 := #(0.3 1 0.1 0.3 1 0.1 0.3 1 0.1).
        result2 := col2 runningMeans: 2.
       
        self assert: result1 first equals: result1 fourth.
       
        "presence of a rounding error"
        self deny: result2 first equals: result2 fourth.
        self assert: result2 first closeTo: result2 fourth.
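
To get a feel for how large that rounding error becomes, here is a playground sketch that compares the incremental result with a from-scratch recomputation of every window. It assumes runningMeans: (the running-mean version quoted below) is installed on SequenceableCollection; the sample data and the names data, fast, exact and maxError are made up for illustration.

```smalltalk
| data width fast exact maxError |

"Inexact Float inputs: 0.0, 0.1, ... 0.6 repeating."
data := (1 to: 1000) collect: [:i | (i \\ 7) / 10.0].
width := 5.

"Incremental running means, as discussed in the thread."
fast := data runningMeans: width.

"Reference: redo the sum for each window from scratch."
exact := (1 to: data size - width + 1) collect: [:start |
    ((start to: start + width - 1)
        inject: 0
        into: [:sum :i | sum + (data at: i)]) / width].

"Largest absolute difference between the two results."
maxError := (1 to: fast size)
    inject: 0
    into: [:worst :i | worst max: ((fast at: i) - (exact at: i)) abs].
maxError
```

Running the same comparison with exact inputs (Fractions or ScaledDecimals) should give a difference of exactly zero, which is what the first half of the test above relies on.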



> On 13 Apr 2020, at 10:48, Cédrick Béler <[hidden email]> wrote:
>
> What about this test (in OrderedCollectionTest - it suggests a Trait too) ?
>
> testRunningMeans
>
> | result col |
> col := #(1 1 2 2 3 3) asOrderedCollection.
> result := col runningMeans: 2.
>
> self assert: (result = {1. (3/2). 2. (5/2). 3}).
>
> self assert: (result class = Array).
>
> self assert: result size <= (col size). "Running means with 1 has little interest ?"
>
> self should: [ col runningMeans: 7 ] raise: SubscriptOutOfBounds.
> self should: [ col runningMeans: -2 ] raise: SubscriptOutOfBounds.
> self should: [ col runningMeans: 1.3 ] raise: Error withExceptionDo: [ :anException | self assert: anException messageText equals: 'primitive #basicNew: in Array class failed' ]
>
>
> Cheers,
> Cédrick
>



Discussions on method categories (Was: Moving/rolling average implementations ?)

cedreek
In reply to this post by Richard O'Keefe

> Used consistently and well, method categories are not only a
> navigation aid, helping you to find methods you do not know the names
> of, but a helpful documentation aid.  For example, #runningMeans: is
> in the 'summarising' category.  In my Smalltalk, that means
> - it inspects every element of the collection
> - it does not change the collection
> - the result is somehow a summary or distillation of the elements

Great. If only categories were more first class, we could store such information.

I agree that a proper name is important, but it is kind of hell to do by yourself since there is mostly no help.

From your definition, I see that a category could be associated with tests, which might eventually allow methods to be categorized automatically.

Anyway, such an artefact is a matter of discipline (conventions) before it is a matter of tooling. I just get headaches each time I have to write my categories ^^ especially with the UI.

Shouldn’t categories be more like tags? And first-class objects (more source oriented, even if some could be helpful at runtime, like private methods)?

Cheers,
Cédrick





Re: Moving/rolling average implementations ?

Guillermo Polito
In reply to this post by Richard O'Keefe


> On 13 Apr 2020, at 11:00, Richard O'Keefe <[hidden email]> wrote:
>
> Concerning method categories:
> VisualAge Smalltalk manages without them.  Of all the Smalltalk
> systems available to me, it's the only one where I don't enjoy using
> the browser.  It carves up the world of methods another way, which I
> find of little or no help.  That in no way detracts from it being a
> solid system fit for its intended uses.
>
> GNU Smalltalk uses <category: 'whatever'> pragmas to put methods into
> categories.
>
> Used consistently and well, method categories are not only a
> navigation aid, helping you to find methods you do not know the names
> of, but a helpful documentation aid.  For example, #runningMeans: is
> in the 'summarising' category.  In my Smalltalk, that means
> - it inspects every element of the collection
> - it does not change the collection
> - the result is somehow a summary or distillation of the elements
> Of course, to get the most from method categories, the method
> categories need to be documented, and you need to keep them as up to
> date as any other part of the source code.

It would be interesting if protocols had comments :)




Re: Discussions on method categories (Was: Moving/rolling average implementations ?)

Guillermo Polito
In reply to this post by cedreek


> On 13 Apr 2020, at 11:17, Cédrick Béler <[hidden email]> wrote:
>
>
>> Used consistently and well, method categories are not only a
>> navigation aid, helping you to find methods you do not know the names
>> of, but a helpful documentation aid.  For example, #runningMeans: is
>> in the 'summarising' category.  In my Smalltalk, that means
>> - it inspects every element of the collection
>> - it does not change the collection
>> - the result is somehow a summary or distillation of the elements
>
> Great. If only categories were more first class, we could store such information.

Haha just said the same in the other thread ^^.
Sorry for the noise

Re: Discussions on method categories (Was: Moving/rolling average implementations ?)

cedreek
(this is not really noise, or at least it is useful noise, i.e. information like a +1 :) )

At least, I think we need a « standard » list of protocols, at least for the most used ones. Some conventions too.
For instance, all the initialize* (+ class side)… I can cope with that but this is clearly a brain stopper for me :grinning:.

Were there any attempts to define such a standard list / best practices for protocols?

I guess this is a huge task but an important one. Regarding implementations, it might be hard to change too much for compatibility reasons (nb: exercise for later => get all the protocol names of an image)

@++
Cédrick


