SortedCollection>>includes:

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
23 messages Options
12
Reply | Threaded
Open this post in threaded view
|

Re: SortedCollection>>includes:

Blair McGlashan
John

You wrote in message
news:eAQT6.27769$[hidden email]...
>
> It isn't a hard bug to fix. I just looked at my Dolphin image, and there
are
> only 9 classes that define #<= but do not define #<. I think making people
> fix their code would not be that hard....

I agree, but that is assuming they know it to be an issue. On reflection the
backwards compatibility issue is not so much with classes that implement #<=
rather than #< to be sorted (a DNU will quickly show up that one), but where
the behaviour is changed as a result of the slightly different sort order.
Is that likely to be a big problem? Probably not, and I suppose we could
excuse it as a change for the sake of ANSI compatibility :-)

Regards

Blair


Reply | Threaded
Open this post in threaded view
|

Re: SortedCollection>>includes:

Chris Uppal-3
Blair,

> I agree, but that is assuming they know it to be an issue. On reflection
the
> backwards compatibility issue is not so much with classes that implement
#<=
> rather than #< to be sorted (a DNU will quickly show up that one), but
where
> the behaviour is changed as a result of the slightly different sort order.
> Is that likely to be a big problem? Probably not, and I suppose we could
> excuse it as a change for the sake of ANSI compatibility :-)

You could also sweeten the pill by mentioning improved efficiency.  If its
known that a "<" b implies a ~= b, then a number of algorithmic improvements
become possible.

You could then safely use a binary chop to do inserts and #includes.

You could remove the bounds checks in the inner loop of quicksort, since the
"<" test has sentinels.

You could implement the three-way partitioning scheme described in the
later/bigger versions of Sedgewick's books.  (It partitions into elements <
pivot, elements > pivot, and the rest, and looks as if it might be a
worthwhile improvement.)

The downside is that the <= relationship violates the preconditions of these
quicker algorithms, so Very Odd Things might happen to the unwary.  (I saw
this in C++ a few years ago -- a junior was trying to use one of the
built-in sorted collections using a < condition which was a little buggy and
could sometimes allow both a < b and b < a.  He was quite suprised when the
program crashed, and it took some time to explain why it didn't just put the
elements into a slighly wrong order).

> Blair

    -- chris


Reply | Threaded
Open this post in threaded view
|

Re: SortedCollection>>includes:

John Brant
"Chris Uppal" <[hidden email]> wrote in message
news:[hidden email]...
> You could also sweeten the pill by mentioning improved efficiency.  If its
> known that a "<" b implies a ~= b, then a number of algorithmic
improvements
> become possible.

Why is this? If you know it is #<=, then you can get #<. For example,
    "a <= b"  == "(b < a) not"
If you know that it includes = then you can substitute the proper tests. You
can even test for = elements if you run the sort block with the same
element.

> You could then safely use a binary chop to do inserts and #includes.

The code I posted a couple weeks ago would use the sort block for an
"includes test" without making any assumptions about whether = was included.
However, the problem with using the sort block for #includes: is that
#includes: can take any argument, so you can ask a sorted collection that
contains only strings, if it includes a number. This will result in an error
instead of false.

> You could implement the three-way partitioning scheme described in the
> later/bigger versions of Sedgewick's books.  (It partitions into elements
<
> pivot, elements > pivot, and the rest, and looks as if it might be a
> worthwhile improvement.)

If they add a line such as
    [i < stop and: [sortBlock value: (self basicAt: i) value: median]]
whileTrue: [i := i + 1].
before the recursive calls, then the worse case (sorting the same elements)
becomes the best case and runs in linear time (using #<=). This line is
skipping all elements at the start of the right partition that are equal to
the median element.


John Brant


12