While checking out Mocketry, I came upon SequenceableCollection>>=
and SequenceableCollection>>isSameSequenceAs. The latter seems to be part of the former. Is there a technical reason for this code duplication? Thanks, s. |
3 reasons. isSameSequenceAs: was written after #= (by me IIRC). Performance of #= can be important and the change would slow down comparing empty collections. If it ain't broke don't fix it. Hence my call was no to change #= to use isSameSequenceAs:.
On Feb 19, 2008 11:46 PM, Stefan Schmiedl <[hidden email]> wrote: While checking out Mocketry, I came upon SequenceableCollection>>= |
Hi Eliot,
thanks for replying. I have done some more messing with the code, please bear with me, I'm more testing my understanding of VisualWorks than challenging your decisions. On Wed, 20 Feb 2008 11:11:54 -0800 "Eliot Miranda" <[hidden email]> wrote: > 3 reasons. isSameSequenceAs: was written after #= (by me IIRC). I guessed so. It looks like it is an optimization for when you are not concerned comparing baskets of apples to bags of oranges. Empty containers are always sad :-) > Performance of #= can be important and the change would slow down comparing > empty collections. (Time millisecondsToRun: [ |x c1 c2| x := true. c1 := OrderedCollection new. c2 := c1 copy. 10000000 timesRepeat: [x := x & (c1 ### c2)]. x out ]) out gives the following results on successive runs when subsituting ### with = 742, 712, 724 isSameSequenceAs: 407, 416, 408 myEquals 777, 789, 787 where myEquals is the "obvious" refactoring: myEquals: otherCollection self species == otherCollection species ifFalse: [^false]. ^self isSameSequenceAs: otherCollection 10 million method calls make about a 20th of a second difference on my 2GHz amd64. So I guess that comparing the species is way more expensive than another method call. Which justifies the existence of the separate method, but not the lack of refactoring. I'm just a grasshopper, what do the masters think? Is the code above suitable for measuring what I claim? I tried to make it so that it's not trivially possible to optimize away what I'm interested in. Next, I changed c2 to be an empty array: (Time millisecondsToRun: [ |x c1 c2| x := true. c1 := OrderedCollection new. c2 := Array new. 10000000 timesRepeat: [x := x & (c1 ### c2)]. x out ]) out = 597, 582, 612 isSameSequenceAs: 371, 412, 390 myEquals 581, 567, 576 isSameSequence is in the same ballbark as with two OrderedCollections, both = and myEquals are significantly faster. As it should be, since they leave early returning false. > If it ain't broke don't fix it. Hence my call was no to > change #= to use isSameSequenceAs:. yeah... that one gets me regularly in trouble :-) Regards, s. |
Hi Stefan,
On Wed, Feb 20, 2008 at 11:58 AM, Stefan Schmiedl <[hidden email]> wrote: Hi Eliot,
Looks like a (742 + 712 + 724) - (777 + 789 + 787) / (742 + 712 + 724) asFloat, or 8% slowdown. Quite significant. Is the code above suitable for measuring what I claim? I tried to ?? your figures show isSameSequenceAs: is faster, around 400ms as opposed to ~600ms. P.S. try something like the following for more stable measurements: BlockClosure methods for benchmarking benchmarkMicrosecondsToRunRepetitions: n | start end | Smalltalk garbageCollect. "side-effect is to flush the compiled code cache" start := Time microsecondClock. "invoke once to compile the code" self value. "run once to compile the code" start := Time microsecodClock. 1 to: n do: [:i| self value]. "compiler generates inline code for this loop." end := Time microsecondClock. ^(end - start) / n > If it ain't broke don't fix it. Hence my call was no to :) The three reasons together still do it for me. Regards, Best Eliot |
On Wed, 20 Feb 2008 18:31:27 -0800
"Eliot Miranda" <[hidden email]> wrote: > > > > = 597, 582, 612 > > isSameSequenceAs: 371, 412, 390 > > myEquals 581, 567, 576 > > > > isSameSequence is in the same ballbark as with two OrderedCollections, > > both = and myEquals are significantly faster. As it should be, since > > they leave early returning false. > > > ?? your figures show isSameSequenceAs: is faster, around 400ms as opposed > to ~600ms. "both = and myEquals are significantly faster" with OC and Array than with OC and OC. Sorry for the confusion. > P.S. try something like the following for more stable measurements: > Thanks, that one goes straight into my toolbox. > :) The three reasons together still do it for me. As I said, I do this for the learning experience. E.g. I learned that there is a method isSameSequenceAs: already doing size comparisons, which directly translates into shorter code in my packages. Thanks, s. |
Free forum by Nabble | Edit this page |