"John Brant" <[hidden email]> wrote in message
news:JIg69.18636$me6.3009@sccrnsc01... > ... > You might increase the cut over point for the insertion sort. Currently, it > is set at 5. From the tests that I ran on the original sort algorithm, I > couldn't really notice any time difference between 5-13. The larger the cut > over point, the fewer times you need to execute the new while loop. For > example, with the cut over point at 5, sorting (1 to: 10000) will require > executing the new while loop 2047 times. But, if you increase the cut over > point to 9, you will cut the executions down to 1023. If you increase the > cut over to be 10-12, you may be able to cut the overhead to 0.5-0.75%. True, but the insertion sort's poorer algorithmic performance may squander the gains if we move the cut-over too far. For example if I do the following sort: comparisons := 0. Symbol allInstances asSortedCollection: [:a :b | comparisons := comparisons + 1. a <= b]. Then I get the following values for the number of comparisons required to do the sort of the 20199 Symbols in my image: Less Or Equal Cut Comparisons 5 311521 7 310750 9 311654 11 313319 Less Than Cut Comparisons 5 309084 7 308188 9 308994 11 311110 So for that case, the optimal cut-over value is 7 (BTW: Would anyone like to speculate on why the default sort block in Smalltalk uses #<= instead of #<?). Repeating the tests for 10000 random Floats the optimal cut-over point also seemed to be 7, at least in terms of number of comparisons. The cut-over of 5 was chosen based on the performance of the previous algorithm, so it should probably be revised to allow for the modification. Regards Blair |
"Blair McGlashan" <[hidden email]> wrote in message
news:ajdvi8$1a8vn3$[hidden email]... > "John Brant" <[hidden email]> wrote in message > news:JIg69.18636$me6.3009@sccrnsc01... > > ... > > You might increase the cut over point for the insertion sort. Currently, > it > > is set at 5. From the tests that I ran on the original sort algorithm, I > > couldn't really notice any time difference between 5-13. The larger the > cut > > over point, the fewer times you need to execute the new while loop. For > > example, with the cut over point at 5, sorting (1 to: 10000) will > > executing the new while loop 2047 times. But, if you increase the cut over > > point to 9, you will cut the executions down to 1023. If you increase the > > cut over to be 10-12, you may be able to cut the overhead to 0.5-0.75%. > > True, but the insertion sort's poorer algorithmic performance may squander > the gains if we move the cut-over too far. For example if I do the following > sort: This depends on a number of things and is not solely dependent on the number of comparisons. For example, your quicksort implementation will require moving the median element to the starting position. Once the partitioning is done, it is then moved to the "middle". Of course, this movement might not be necessary. The number that I've seen in the past was 5-15 with most likely the best crossover point being around 10. The results I posted yesterday were from sorting 10000 random floats. There really wasn't much time difference between 5-13. > (BTW: Would anyone like to > speculate on why the default sort block in Smalltalk uses #<= instead of > #<?). It doesn't make much difference. However, once you know which one is preferred, you should modify your algorithm to make the prefered case the fastest case. For example, your insertion sort currently prefers #< as the operator, but it is easy to change to prefer #<= insertsortFrom: start to: stop start + 1 to: stop do: [:j | | a t i | a := self basicAt: j. i := j. [i <= start or: [sortBlock value: (t := self basicAt: i - 1) value: a]] whileFalse: [self basicAt: i put: t. i := i - 1]. self basicAt: i put: a] If you use #< as the default, then your quicksort implementation can result in unnecessary swaps. For example, if you use #< and sort a collection with the same elements, you end up swaping every element for each recursive call. John Brant |
"John Brant" <[hidden email]> wrote in message
news:FHy69.17377$983.20967@rwcrnsc53... > "Blair McGlashan" <[hidden email]> wrote in message > news:ajdvi8$1a8vn3$[hidden email]... > > > > True, but the insertion sort's poorer algorithmic performance may squander > > the gains if we move the cut-over too far. For example if I do the > following > > sort: > > This depends on a number of things and is not solely dependent on the number > of comparisons. For example, your quicksort implementation will require > moving the median element to the starting position. Once the partitioning is > done, it is then moved to the "middle". Of course, this movement might not > be necessary. Yes, I am aware of that, but we were speaking about the number of comparisons. I observed that the modification to the algorithm increases the number of comparisons by a very small percentage. You suggested that even this small increase could be reduced by increasing the cut-over point at which the implementation switches to insertion sort in order to reduce the number of times around the quicksort loop. I was pointing out in my last reply that the reduction in comparisons gained could be more than offset by the additional comparisons that would then be performed in the larger insertion sorts. On the question of overall performance impact it seems that, in Dolphin at least, moving the elements around is usually cheaper than invoking the sort block, and therefore the number of invocations of the sort block is one of the most significant factors. It is also one of the most consistently measurable. > > The number that I've seen in the past was 5-15 with most likely the best > crossover point being around 10. Most algorithm books do seem to suggest values in that range - Sedgewick says that the "algorithm works about the same for M in the range from about 5 to about 25", but he points at that the best value "depends upon the implementation." The original source of the algorithm in Dolphin was in C and used a cut of 7, which I reduced to five when testing the Smalltalk implementation. Most books have implementations in C (or similar), and are using the built-in relational operators against value types. I would submit that this makes the relative cost of the comparison in those implementations much less significant than it is in Smalltalk's SortedCollection, and so the choice of cut may be different. Of course this is pure speculation, but it seems to hold in practice. >...The results I posted yesterday were from > sorting 10000 random floats. There really wasn't much time difference > between 5-13. I'd guess you aren't likely to see much difference because the comparison of Floats is relatively quick so the number of invocations of the sort block is relatively less important than in some cases. My own timings agree with yours in that when timing sorts of 10000 Random Floats with different cut overs, the measured differences are small enough such that they might be accounted for by the variations in timings between runs. Much the same is true of sorting 10000 randomly chosen strings, although there the timings did worsen significantly above 10 in my tests. If, however, you try with some user defined sort block (or a user defined #<=) that is expensive, perhaps because it compares calculated rather than cached values (e.g. sorting by an objects displayString or printString, commonly done in MVP UIs), then the choice of cross over become significantly more important. Since the cross-over doesn't matter much for basic types with fast comparisons, I think it make most sense to choose the cross-over to minimize the number of comparisons on average. > > > (BTW: Would anyone like to > > speculate on why the default sort block in Smalltalk uses #<= instead of > > #<?). > > It doesn't make much difference. However, once you know which one is > preferred, you should modify your algorithm to make the prefered case the > fastest case. For example, your insertion sort currently prefers #< as the > operator, but it is easy to change to prefer #<= > ... Thanks for spotting that. With your revised implementation of #insertsortFrom:to: the optimum number of comparisons still seems to occur with a cut-over of around 7, but the increase in the number of comparisons as the cut-over is increased is less marked than before. Since it seems that the cut-over point is about right at 7, and it doesn't seem to be that critical, I'm going to stick with 7 unless anyone wants to do a more complete analysis. Back to the BlockClosures... Regards Blair |
"Blair McGlashan" <[hidden email]> wrote:
> Since it seems that the cut-over point is about right at 7, and it doesn't > seem to be that critical, I'm going to stick with 7 unless anyone wants to > do a more complete analysis. Sedgewick has a fairly recent paper you might be interested in: http://www.cs.princeton.edu/~rs/talks/QuicksortIsOptimal.pdf John Keenan |
Free forum by Nabble | Edit this page |