Surprise of the day

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
24 messages Options
12
Reply | Threaded
Open this post in threaded view
|

Re: Surprise of the day - sorting for multiple criteria

Dave Stevenson-3
Probably the most frequent use of sorting in our app is before presenting a group of objects just read from the db to the user in a dataset. My primary goal for #sortBy: was a concise api to make our client code more readable and easier to write, so I chose to leverage the #sorted: algorithm rather than implement my own inefficient sort. Hence, I can't speak to speed comparisons because I didn't perform any.
 
I've not looked too closely at the caching sort code, but one trivial point:
 
Paul wrote:
> Next you add a method to objects that you wish to sort for a criteria. The method should take one argument, which is an index for the sort field to answer. You can even use the same sort criteria method for otherwise dissimilar objects so long as the responses are comparable for a given field index. 
If you add similar behavior (the sort method) to the types of objects you wish to compare, then they are every bit as homogenous or heterogenous as the collection in the 'sortBy: #name' example. In fact we use #sortBy: on collections of different object types, though we must ensure they all understand the sort criteria.
 
Dave Stevenson
[hidden email]



From: Paul Baumann <[hidden email]>
To: Steven Kelly <[hidden email]>; Dave Stevenson <[hidden email]>
Cc: vwnc NC <[hidden email]>
Sent: Wed, August 11, 2010 7:02:32 AM
Subject: RE: [vwnc] Surprise of the day - sorting for multiple criteria

Here is another way to sort for multiple fields and multiple criteria. The approach Dave showed is more suited for on-demand dynamic sort criteria across a homogeneous collection of objects; this one is more suited for static sort criteria where sequence rules are maintained on the object to be sorted and the collection can contain heterogeneous objects.
 
Add this method:
 
Symbol>>should: objectA comeBefore: objectB
 "The receiver is expected to be a two argument selector that is used to determine
  if one object should be come before another. objectA and objectB will each be
  send the receiver as a message. The second argument will be an sort field index
  that starts at 1 and is incremented until order is determined."
 
 | i a b |
 i := 1.
 [
  (a := objectA perform: self with: i) isNil ifTrue: [^true].
  (b := objectB perform: self with: i) isNil ifTrue: [^false].
  a < b ifTrue: [^true].
  b < a ifTrue: [^false].
  (i := i + 1) < 1000
 ] whileTrue: [].
 self error: 'An unlikely number of field value matches were returned before returning nil'.
 
Next you add a method to objects that you wish to sort for a criteria. The method should take one argument, which is an index for the sort field to answer. You can even use the same sort criteria method for otherwise dissimilar objects so long as the responses are comparable for a given field index. In this example, #sortableFieldAt: is the sort criteria and when performed will answer the sort field value for a given index:
 
GbcCode>>sortableFieldAt: pos
 ^nil
 
GbcClassReference>>sortableFieldAt: pos
 pos = 1 ifTrue: [^50]. "Order: namespaces, variables, classes, methods"
 pos = 2 ifTrue: [^self hierarchyName].
 pos = 3 ifTrue: [^self lastSourceMagnitude].
 ^nil
 
GbcMethod>>sortableFieldAt: pos
 pos = 1 ifTrue: [^100]. "Order: namespaces, variables, classes, methods"
 pos = 2 ifTrue: [^self className].
 pos = 3 ifTrue: [^self isOverrideInImage ifTrue: [75] ifFalse: [50]].
 pos = 4 ifTrue: [^self isMeta ifTrue: [50] ifFalse: [51]].
 pos = 5 ifTrue: [^self selector].
 pos = 6 ifTrue: [^self lastSourceMagnitude].
 ^nil
 
This is how you'd use it:
 
self asSortedCollection: [:a :b | #sortableFieldAt: should: a comeBefore: b ].
 
If you wanted to use #asSortedCollection then you'd define #<= with a default sort order like this:
 
<= other
    ^#sortableFieldAt: should: self comeBefore: other
 
It fetches a minimal number of field values (but may fetch the same field value more than once). Dissimilar objects can be grouped using a generality from the first field (as shown). One method defines the sort criteria. The sort block specifies the sort criteria to be used but does not define it.
 
This is how the same code can be adapted to caching field values into keys of associations as Steven suggested:
 
    | cachedFields sortedAssociations |
    cachedFields := self collect: [:ea | ea -> ((1 to: 10) collect: [:i | ea sortableFieldAt: i])].
    sortedAssociations := cachedFields asSortedCollection: [:a :b | #at: should: a value comeBefore: b value ].
    ^sortedAssociations collect: [:ea | ea key ]
 
The caching example creates lots of temporary arrays and associations and enumerates more than once; however, it still managed to be 15% faster at sorting 9,467 GbcCode objects in my test. It is faster because it never queries the object more than once for each field value.
 
The caching example caches an arbitrary 10 fields--to accommodate the maximum number of sortable fields in my code. A faster and better approach would likely be to use on-demand query and fill of the cache field values so that field cache collections are grown as necessary and field values are not queried if never used.
 
It would be interesting to compare with the #sortBy: field getter code that Dave showed. I'm not sure there would be much of a performance difference though. Dave's approach is easier to use for application-specified sorts of a collection of homogeneous objects.
 
Paul Baumann 
 


From: [hidden email] [mailto:[hidden email]] On Behalf Of Steven Kelly
Sent: Monday, August 09, 2010 7:59 PM
To: Dave Stevenson
Cc: vwnc NC
Subject: Re: [vwnc] Surprise of the day

Neat! I’ve been meaning to implement sorting by multiple criteria for a while. We have something similar (unary sort criterion, block or selector), with our addition being the caching of the result of the sort criterion for each element. This can produce a useful speedup, e.g. if the sort criteria creates a string. Since we only have a single criterion we simply precalculate the sort value for each element, making an association of element->sortValue, sort the associations with a fixed sortBlock using assoc value, and then collect the keys.

 

Steve

 

From: [hidden email] [mailto:[hidden email]] On Behalf Of Dave Stevenson
Sent: 10. elokuuta 2010 1:07
To: [hidden email]
Subject: Re: [vwnc] Surprise of the day

 

Some time ago I wrote some extension methods that allow me to pass a block, a selector, or a collection of blocks/selectors for sorting (see attached). It uses #sorted:, so it has the same bug for SortedCollections, but I think folds might find it useful. Here are some examples:

 

 #(-14 3 1 -3 7 12) sortBy: #abs
 #(-14 3 1 -3 7 12) sortBy: #(abs yourself)
 #(-14 3 1 -3 7 12) sortBy: [:x | x \\ 4]
 #(-14 3 1 -3 7 12) sortBy: (List with: [:x | x \\ 4] with: #yourself)

 

The default sort order is ascending, but a variant allows a descending sort:

 

 aCollection sortBy: #name ascending: false

 

I don't like that #sortBy: seems to imply a mutation of the receiver while actually providing a new collection. Perhaps I should have named it #sortedBy: or something. Handling the arguments:
 ...
 value1 := unarySelectorOrBlock isSymbol
  ifTrue: [element1 perform: unarySelectorOrBlock]
  ifFalse: [unarySelectorOrBlock value: element1].
 ...

 

could be streamlined a bit if Squeak Extensions are loaded (which defines Symbol>>value:):
 ...
 value1 := unarySelectorOrBlock value: element1.
 ...

 

but the attached version works in vanilla VW 7.6. In our code, only a few percent of senders of #sortBy: pass a block (likely because #sorted: and #asSortedCollection: already take a block). About a quarter of our senders pass a collection of selectors, and the rest pass a single selector. I've not yet had a desire to pass a mixed collection containing blocks and selectors, but the examples in the method comments validate that case.

 

Paul indicated that #asSortedCollection: is faster than #sorted:, so if that is important to you I confess I did not optimize #sortBy: for speed.

 

If Cincom felt so inclined, they could probably modify #sorted: to take a block, a selector, or a collection thereof, making that single api more flexible, and eliminating my desire for a separate #sortBy: api. Or, the attached can be freely used/modified by anyone.


 

Dave Stevenson
[hidden email]

 

 


From: Paul Baumann <[hidden email]>
To: Julian Fitzell <[hidden email]>; "Joerg Beekmann, DeepCove Labs (YVR)" <[hidden email]>; "[hidden email]" <[hidden email]>
Sent: Mon, August 9, 2010 12:32:58 PM
Subject: Re: [vwnc] Surprise of the day


>> I think it would read "Answer a sequenceable collection containing the contents of the receiver sorted according to aBlock. The receiver is not modified."

Yes, that is a better method comment.

#sorted: is broken, uncommented (at least as of VW 7.5), typically slower, and non-portable. You think #sorted: can be better than #asSortedCollection: and should be fixed instead of removed, I respect that. I see why you appreciate it. I'll stick with #asSortedCollection: until #sorted: is better.

Paul Baumann


-----Original Message-----
From: Julian Fitzell [mailto:[hidden email]]
Sent: Monday, August 09, 2010 6:03 AM
To: Paul Baumann; 'Joerg Beekmann, DeepCove Labs (YVR)'; [hidden email]
Subject: Re: [vwnc] Surprise of the day
Importance: High

On 10-08-06 4:58 PM, "Paul Baumann" <[hidden email]> wrote:
> If the purpose of #sorted: were documented, I think it would read
> "Answer a collection similar to the receiver with contents sorted according to aBlock".
> The "similar to the receiver" part is the only part that would
> distinguish it from #asSortedCollection:.

I disagree. I think it would read "Answer a sequenceable collection containing the contents of the receiver sorted according to aBlock. The receiver is not modified."

A SortedCollection is specifically implemented to allow the contents to change and to remain sorted when elements are added. If you don't need that functionality, there's no point demanding a SortedCollection. #sorted:
allows each collection to make the decision of how to most efficiently create and return the result, whether that is returning a copy of self, a copy with the sort block changed, an Array created in some fashion or other, or some other method. It's a protocol that allows you ask for something specific that you need without specifying the implementation, which is exactly what OO is supposed to be about.

Julian



This message may contain confidential information and is intended for specific recipients unless explicitly noted otherwise. If you have reason to believe you are not an intended recipient of this message, please delete it and notify the sender. This message may not represent the opinion of IntercontinentalExchange, Inc. (ICE), its subsidiaries or affiliates, and does not constitute a contract or guarantee. Unencrypted electronic mail is not secure and the recipient of this message is expected to provide safeguards from viruses and pursue alternate means of communication where privacy or a binding message is desired.


_______________________________________________
vwnc mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/vwnc



This message may contain confidential information and is intended for specific recipients unless explicitly noted otherwise. If you have reason to believe you are not an intended recipient of this message, please delete it and notify the sender. This message may not represent the opinion of IntercontinentalExchange, Inc. (ICE), its subsidiaries or affiliates, and does not constitute a contract or guarantee. Unencrypted electronic mail is not secure and the recipient of this message is expected to provide safeguards from viruses and pursue alternate means of communication where privacy or a binding message is desired.

_______________________________________________
vwnc mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/vwnc
Reply | Threaded
Open this post in threaded view
|

Re: Surprise of the day - sorting for multiple criteria

Paul Baumann
Sure, it is possible (thorough accessors named something generic like #sortableColumnThree), but one approach is more suited for sorting heterogenous objects than the other. Both approaches have strong points.
 
Paul Baumann 
 


From: [hidden email] [mailto:[hidden email]] On Behalf Of Dave Stevenson
Sent: Wednesday, August 11, 2010 11:41 AM
To: vwnc NC
Subject: Re: [vwnc] Surprise of the day - sorting for multiple criteria
Importance: Low

Probably the most frequent use of sorting in our app is before presenting a group of objects just read from the db to the user in a dataset. My primary goal for #sortBy: was a concise api to make our client code more readable and easier to write, so I chose to leverage the #sorted: algorithm rather than implement my own inefficient sort. Hence, I can't speak to speed comparisons because I didn't perform any.
 
I've not looked too closely at the caching sort code, but one trivial point:
 
Paul wrote:
> Next you add a method to objects that you wish to sort for a criteria. The method should take one argument, which is an index for the sort field to answer. You can even use the same sort criteria method for otherwise dissimilar objects so long as the responses are comparable for a given field index. 
If you add similar behavior (the sort method) to the types of objects you wish to compare, then they are every bit as homogenous or heterogenous as the collection in the 'sortBy: #name' example. In fact we use #sortBy: on collections of different object types, though we must ensure they all understand the sort criteria.
 
Dave Stevenson
[hidden email]



From: Paul Baumann <[hidden email]>
To: Steven Kelly <[hidden email]>; Dave Stevenson <[hidden email]>
Cc: vwnc NC <[hidden email]>
Sent: Wed, August 11, 2010 7:02:32 AM
Subject: RE: [vwnc] Surprise of the day - sorting for multiple criteria

Here is another way to sort for multiple fields and multiple criteria. The approach Dave showed is more suited for on-demand dynamic sort criteria across a homogeneous collection of objects; this one is more suited for static sort criteria where sequence rules are maintained on the object to be sorted and the collection can contain heterogeneous objects.
 
Add this method:
 
Symbol>>should: objectA comeBefore: objectB
 "The receiver is expected to be a two argument selector that is used to determine
  if one object should be come before another. objectA and objectB will each be
  send the receiver as a message. The second argument will be an sort field index
  that starts at 1 and is incremented until order is determined."
 
 | i a b |
 i := 1.
 [
  (a := objectA perform: self with: i) isNil ifTrue: [^true].
  (b := objectB perform: self with: i) isNil ifTrue: [^false].
  a < b ifTrue: [^true].
  b < a ifTrue: [^false].
  (i := i + 1) < 1000
 ] whileTrue: [].
 self error: 'An unlikely number of field value matches were returned before returning nil'.
 
Next you add a method to objects that you wish to sort for a criteria. The method should take one argument, which is an index for the sort field to answer. You can even use the same sort criteria method for otherwise dissimilar objects so long as the responses are comparable for a given field index. In this example, #sortableFieldAt: is the sort criteria and when performed will answer the sort field value for a given index:
 
GbcCode>>sortableFieldAt: pos
 ^nil
 
GbcClassReference>>sortableFieldAt: pos
 pos = 1 ifTrue: [^50]. "Order: namespaces, variables, classes, methods"
 pos = 2 ifTrue: [^self hierarchyName].
 pos = 3 ifTrue: [^self lastSourceMagnitude].
 ^nil
 
GbcMethod>>sortableFieldAt: pos
 pos = 1 ifTrue: [^100]. "Order: namespaces, variables, classes, methods"
 pos = 2 ifTrue: [^self className].
 pos = 3 ifTrue: [^self isOverrideInImage ifTrue: [75] ifFalse: [50]].
 pos = 4 ifTrue: [^self isMeta ifTrue: [50] ifFalse: [51]].
 pos = 5 ifTrue: [^self selector].
 pos = 6 ifTrue: [^self lastSourceMagnitude].
 ^nil
 
This is how you'd use it:
 
self asSortedCollection: [:a :b | #sortableFieldAt: should: a comeBefore: b ].
 
If you wanted to use #asSortedCollection then you'd define #<= with a default sort order like this:
 
<= other
    ^#sortableFieldAt: should: self comeBefore: other
 
It fetches a minimal number of field values (but may fetch the same field value more than once). Dissimilar objects can be grouped using a generality from the first field (as shown). One method defines the sort criteria. The sort block specifies the sort criteria to be used but does not define it.
 
This is how the same code can be adapted to caching field values into keys of associations as Steven suggested:
 
    | cachedFields sortedAssociations |
    cachedFields := self collect: [:ea | ea -> ((1 to: 10) collect: [:i | ea sortableFieldAt: i])].
    sortedAssociations := cachedFields asSortedCollection: [:a :b | #at: should: a value comeBefore: b value ].
    ^sortedAssociations collect: [:ea | ea key ]
 
The caching example creates lots of temporary arrays and associations and enumerates more than once; however, it still managed to be 15% faster at sorting 9,467 GbcCode objects in my test. It is faster because it never queries the object more than once for each field value.
 
The caching example caches an arbitrary 10 fields--to accommodate the maximum number of sortable fields in my code. A faster and better approach would likely be to use on-demand query and fill of the cache field values so that field cache collections are grown as necessary and field values are not queried if never used.
 
It would be interesting to compare with the #sortBy: field getter code that Dave showed. I'm not sure there would be much of a performance difference though. Dave's approach is easier to use for application-specified sorts of a collection of homogeneous objects.
 
Paul Baumann 
 


From: [hidden email] [mailto:[hidden email]] On Behalf Of Steven Kelly
Sent: Monday, August 09, 2010 7:59 PM
To: Dave Stevenson
Cc: vwnc NC
Subject: Re: [vwnc] Surprise of the day

Neat! I’ve been meaning to implement sorting by multiple criteria for a while. We have something similar (unary sort criterion, block or selector), with our addition being the caching of the result of the sort criterion for each element. This can produce a useful speedup, e.g. if the sort criteria creates a string. Since we only have a single criterion we simply precalculate the sort value for each element, making an association of element->sortValue, sort the associations with a fixed sortBlock using assoc value, and then collect the keys.

 

Steve

 

From: [hidden email] [mailto:[hidden email]] On Behalf Of Dave Stevenson
Sent: 10. elokuuta 2010 1:07
To: [hidden email]
Subject: Re: [vwnc] Surprise of the day

 

Some time ago I wrote some extension methods that allow me to pass a block, a selector, or a collection of blocks/selectors for sorting (see attached). It uses #sorted:, so it has the same bug for SortedCollections, but I think folds might find it useful. Here are some examples:

 

 #(-14 3 1 -3 7 12) sortBy: #abs
 #(-14 3 1 -3 7 12) sortBy: #(abs yourself)
 #(-14 3 1 -3 7 12) sortBy: [:x | x \\ 4]
 #(-14 3 1 -3 7 12) sortBy: (List with: [:x | x \\ 4] with: #yourself)

 

The default sort order is ascending, but a variant allows a descending sort:

 

 aCollection sortBy: #name ascending: false

 

I don't like that #sortBy: seems to imply a mutation of the receiver while actually providing a new collection. Perhaps I should have named it #sortedBy: or something. Handling the arguments:
 ...
 value1 := unarySelectorOrBlock isSymbol
  ifTrue: [element1 perform: unarySelectorOrBlock]
  ifFalse: [unarySelectorOrBlock value: element1].
 ...

 

could be streamlined a bit if Squeak Extensions are loaded (which defines Symbol>>value:):
 ...
 value1 := unarySelectorOrBlock value: element1.
 ...

 

but the attached version works in vanilla VW 7.6. In our code, only a few percent of senders of #sortBy: pass a block (likely because #sorted: and #asSortedCollection: already take a block). About a quarter of our senders pass a collection of selectors, and the rest pass a single selector. I've not yet had a desire to pass a mixed collection containing blocks and selectors, but the examples in the method comments validate that case.

 

Paul indicated that #asSortedCollection: is faster than #sorted:, so if that is important to you I confess I did not optimize #sortBy: for speed.

 

If Cincom felt so inclined, they could probably modify #sorted: to take a block, a selector, or a collection thereof, making that single api more flexible, and eliminating my desire for a separate #sortBy: api. Or, the attached can be freely used/modified by anyone.


 

Dave Stevenson
[hidden email]

 

 


From: Paul Baumann <[hidden email]>
To: Julian Fitzell <[hidden email]>; "Joerg Beekmann, DeepCove Labs (YVR)" <[hidden email]>; "[hidden email]" <[hidden email]>
Sent: Mon, August 9, 2010 12:32:58 PM
Subject: Re: [vwnc] Surprise of the day


>> I think it would read "Answer a sequenceable collection containing the contents of the receiver sorted according to aBlock. The receiver is not modified."

Yes, that is a better method comment.

#sorted: is broken, uncommented (at least as of VW 7.5), typically slower, and non-portable. You think #sorted: can be better than #asSortedCollection: and should be fixed instead of removed, I respect that. I see why you appreciate it. I'll stick with #asSortedCollection: until #sorted: is better.

Paul Baumann


-----Original Message-----
From: Julian Fitzell [mailto:[hidden email]]
Sent: Monday, August 09, 2010 6:03 AM
To: Paul Baumann; 'Joerg Beekmann, DeepCove Labs (YVR)'; [hidden email]
Subject: Re: [vwnc] Surprise of the day
Importance: High

On 10-08-06 4:58 PM, "Paul Baumann" <[hidden email]> wrote:
> If the purpose of #sorted: were documented, I think it would read
> "Answer a collection similar to the receiver with contents sorted according to aBlock".
> The "similar to the receiver" part is the only part that would
> distinguish it from #asSortedCollection:.

I disagree. I think it would read "Answer a sequenceable collection containing the contents of the receiver sorted according to aBlock. The receiver is not modified."

A SortedCollection is specifically implemented to allow the contents to change and to remain sorted when elements are added. If you don't need that functionality, there's no point demanding a SortedCollection. #sorted:
allows each collection to make the decision of how to most efficiently create and return the result, whether that is returning a copy of self, a copy with the sort block changed, an Array created in some fashion or other, or some other method. It's a protocol that allows you ask for something specific that you need without specifying the implementation, which is exactly what OO is supposed to be about.

Julian



This message may contain confidential information and is intended for specific recipients unless explicitly noted otherwise. If you have reason to believe you are not an intended recipient of this message, please delete it and notify the sender. This message may not represent the opinion of IntercontinentalExchange, Inc. (ICE), its subsidiaries or affiliates, and does not constitute a contract or guarantee. Unencrypted electronic mail is not secure and the recipient of this message is expected to provide safeguards from viruses and pursue alternate means of communication where privacy or a binding message is desired.


_______________________________________________
vwnc mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/vwnc



This message may contain confidential information and is intended for specific recipients unless explicitly noted otherwise. If you have reason to believe you are not an intended recipient of this message, please delete it and notify the sender. This message may not represent the opinion of IntercontinentalExchange, Inc. (ICE), its subsidiaries or affiliates, and does not constitute a contract or guarantee. Unencrypted electronic mail is not secure and the recipient of this message is expected to provide safeguards from viruses and pursue alternate means of communication where privacy or a binding message is desired.


This message may contain confidential information and is intended for specific recipients unless explicitly noted otherwise. If you have reason to believe you are not an intended recipient of this message, please delete it and notify the sender. This message may not represent the opinion of IntercontinentalExchange, Inc. (ICE), its subsidiaries or affiliates, and does not constitute a contract or guarantee. Unencrypted electronic mail is not secure and the recipient of this message is expected to provide safeguards from viruses and pursue alternate means of communication where privacy or a binding message is desired.

_______________________________________________
vwnc mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/vwnc
Reply | Threaded
Open this post in threaded view
|

Re: Surprise of the day

Dave Stevenson-3
In reply to this post by Dave Stevenson-3
Here's a refactored version that implements #sortBy: in SortedCollection as well, answering a copy with a new sortBlock.
 
Dave Stevenson
[hidden email]



From: Dave Stevenson <[hidden email]>
To: "[hidden email]" <[hidden email]>
Sent: Mon, August 9, 2010 3:07:08 PM
Subject: Re: [vwnc] Surprise of the day

Some time ago I wrote some extension methods that allow me to pass a block, a selector, or a collection of blocks/selectors for sorting (see attached). It uses #sorted:, so it has the same bug for SortedCollections, but I think folds might find it useful. Here are some examples:

 

 #(-14 3 1 -3 7 12) sortBy: #abs
 #(-14 3 1 -3 7 12) sortBy: #(abs yourself)
 #(-14 3 1 -3 7 12) sortBy: [:x | x \\ 4]
 #(-14 3 1 -3 7 12) sortBy: (List with: [:x | x \\ 4] with: #yourself)

 

The default sort order is ascending, but a variant allows a descending sort:

 

 aCollection sortBy: #name ascending: false

 

I don't like that #sortBy: seems to imply a mutation of the receiver while actually providing a new collection. Perhaps I should have named it #sortedBy: or something. Handling the arguments:
 ...
 value1 := unarySelectorOrBlock isSymbol
  ifTrue: [element1 perform: unarySelectorOrBlock]
  ifFalse: [unarySelectorOrBlock value: element1].
 ...

 

could be streamlined a bit if Squeak Extensions are loaded (which defines Symbol>>value:):
 ...
 value1 := unarySelectorOrBlock value: element1.
 ...

 

but the attached version works in vanilla VW 7.6. In our code, only a few percent of senders of #sortBy: pass a block (likely because #sorted: and #asSortedCollection: already take a block). About a quarter of our senders pass a collection of selectors, and the rest pass a single selector. I've not yet had a desire to pass a mixed collection containing blocks and selectors, but the examples in the method comments validate that case.

 

Paul indicated that #asSortedCollection: is faster than #sorted:, so if that is important to you I confess I did not optimize #sortBy: for speed.

 

If Cincom felt so inclined, they could probably modify #sorted: to take a block, a selector, or a collection thereof, making that single api more flexible, and eliminating my desire for a separate #sortBy: api. Or, the attached can be freely used/modified by anyone.


 
Dave Stevenson
[hidden email]



From: Paul Baumann <[hidden email]>
To: Julian Fitzell <[hidden email]>; "Joerg Beekmann, DeepCove Labs (YVR)" <[hidden email]>; "[hidden email]" <[hidden email]>
Sent: Mon, August 9, 2010 12:32:58 PM
Subject: Re: [vwnc] Surprise of the day


>> I think it would read "Answer a sequenceable collection containing the contents of the receiver sorted according to aBlock. The receiver is not modified."

Yes, that is a better method comment.

#sorted: is broken, uncommented (at least as of VW 7.5), typically slower, and non-portable. You think #sorted: can be better than #asSortedCollection: and should be fixed instead of removed, I respect that. I see why you appreciate it. I'll stick with #asSortedCollection: until #sorted: is better.

Paul Baumann


-----Original Message-----
From: Julian Fitzell [mailto:[hidden email]]
Sent: Monday, August 09, 2010 6:03 AM
To: Paul Baumann; 'Joerg Beekmann, DeepCove Labs (YVR)'; [hidden email]
Subject: Re: [vwnc] Surprise of the day
Importance: High

On 10-08-06 4:58 PM, "Paul Baumann" <[hidden email]> wrote:
> If the purpose of #sorted: were documented, I think it would read
> "Answer a collection similar to the receiver with contents sorted according to aBlock".
> The "similar to the receiver" part is the only part that would
> distinguish it from #asSortedCollection:.

I disagree. I think it would read "Answer a sequenceable collection containing the contents of the receiver sorted according to aBlock. The receiver is not modified."

A SortedCollection is specifically implemented to allow the contents to change and to remain sorted when elements are added. If you don't need that functionality, there's no point demanding a SortedCollection. #sorted:
allows each collection to make the decision of how to most efficiently create and return the result, whether that is returning a copy of self, a copy with the sort block changed, an Array created in some fashion or other, or some other method. It's a protocol that allows you ask for something specific that you need without specifying the implementation, which is exactly what OO is supposed to be about.

Julian



This message may contain confidential information and is intended for specific recipients unless explicitly noted otherwise. If you have reason to believe you are not an intended recipient of this message, please delete it and notify the sender. This message may not represent the opinion of IntercontinentalExchange, Inc. (ICE), its subsidiaries or affiliates, and does not constitute a contract or guarantee. Unencrypted electronic mail is not secure and the recipient of this message is expected to provide safeguards from viruses and pursue alternate means of communication where privacy or a binding message is desired.


_______________________________________________
vwnc mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/vwnc

_______________________________________________
vwnc mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/vwnc

SortBy.st (8K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Surprise of the day

Dave Stevenson-3
In reply to this post by Joerg Beekmann, DeepCove Labs (YVR)
Naturally the previous attachment contained typos...
 
Dave Stevenson
[hidden email]



From: Dave Stevenson <[hidden email]>
To: "[hidden email]" <[hidden email]>
Sent: Wed, August 11, 2010 3:27:11 PM
Subject: Re: [vwnc] Surprise of the day

Here's a refactored version that implements #sortBy: in SortedCollection as well, answering a copy with a new sortBlock.
 
Dave Stevenson
[hidden email]



From: Dave Stevenson <[hidden email]>
To: "[hidden email]" <[hidden email]>
Sent: Mon, August 9, 2010 3:07:08 PM
Subject: Re: [vwnc] Surprise of the day

Some time ago I wrote some extension methods that allow me to pass a block, a selector, or a collection of blocks/selectors for sorting (see attached). It uses #sorted:, so it has the same bug for SortedCollections, but I think folds might find it useful. Here are some examples:

 

 #(-14 3 1 -3 7 12) sortBy: #abs
 #(-14 3 1 -3 7 12) sortBy: #(abs yourself)
 #(-14 3 1 -3 7 12) sortBy: [:x | x \\ 4]
 #(-14 3 1 -3 7 12) sortBy: (List with: [:x | x \\ 4] with: #yourself)

 

The default sort order is ascending, but a variant allows a descending sort:

 

 aCollection sortBy: #name ascending: false

 

I don't like that #sortBy: seems to imply a mutation of the receiver while actually providing a new collection. Perhaps I should have named it #sortedBy: or something. Handling the arguments:
 ...
 value1 := unarySelectorOrBlock isSymbol
  ifTrue: [element1 perform: unarySelectorOrBlock]
  ifFalse: [unarySelectorOrBlock value: element1].
 ...

 

could be streamlined a bit if Squeak Extensions are loaded (which defines Symbol>>value:):
 ...
 value1 := unarySelectorOrBlock value: element1.
 ...

 

but the attached version works in vanilla VW 7.6. In our code, only a few percent of senders of #sortBy: pass a block (likely because #sorted: and #asSortedCollection: already take a block). About a quarter of our senders pass a collection of selectors, and the rest pass a single selector. I've not yet had a desire to pass a mixed collection containing blocks and selectors, but the examples in the method comments validate that case.

 

Paul indicated that #asSortedCollection: is faster than #sorted:, so if that is important to you I confess I did not optimize #sortBy: for speed.

 

If Cincom felt so inclined, they could probably modify #sorted: to take a block, a selector, or a collection thereof, making that single api more flexible, and eliminating my desire for a separate #sortBy: api. Or, the attached can be freely used/modified by anyone.


 
Dave Stevenson
[hidden email]



From: Paul Baumann <[hidden email]>
To: Julian Fitzell <[hidden email]>; "Joerg Beekmann, DeepCove Labs (YVR)" <[hidden email]>; "[hidden email]" <[hidden email]>
Sent: Mon, August 9, 2010 12:32:58 PM
Subject: Re: [vwnc] Surprise of the day


>> I think it would read "Answer a sequenceable collection containing the contents of the receiver sorted according to aBlock. The receiver is not modified."

Yes, that is a better method comment.

#sorted: is broken, uncommented (at least as of VW 7.5), typically slower, and non-portable. You think #sorted: can be better than #asSortedCollection: and should be fixed instead of removed, I respect that. I see why you appreciate it. I'll stick with #asSortedCollection: until #sorted: is better.

Paul Baumann


-----Original Message-----
From: Julian Fitzell [mailto:[hidden email]]
Sent: Monday, August 09, 2010 6:03 AM
To: Paul Baumann; 'Joerg Beekmann, DeepCove Labs (YVR)'; [hidden email]
Subject: Re: [vwnc] Surprise of the day
Importance: High

On 10-08-06 4:58 PM, "Paul Baumann" <[hidden email]> wrote:
> If the purpose of #sorted: were documented, I think it would read
> "Answer a collection similar to the receiver with contents sorted according to aBlock".
> The "similar to the receiver" part is the only part that would
> distinguish it from #asSortedCollection:.

I disagree. I think it would read "Answer a sequenceable collection containing the contents of the receiver sorted according to aBlock. The receiver is not modified."

A SortedCollection is specifically implemented to allow the contents to change and to remain sorted when elements are added. If you don't need that functionality, there's no point demanding a SortedCollection. #sorted:
allows each collection to make the decision of how to most efficiently create and return the result, whether that is returning a copy of self, a copy with the sort block changed, an Array created in some fashion or other, or some other method. It's a protocol that allows you ask for something specific that you need without specifying the implementation, which is exactly what OO is supposed to be about.

Julian



This message may contain confidential information and is intended for specific recipients unless explicitly noted otherwise. If you have reason to believe you are not an intended recipient of this message, please delete it and notify the sender. This message may not represent the opinion of IntercontinentalExchange, Inc. (ICE), its subsidiaries or affiliates, and does not constitute a contract or guarantee. Unencrypted electronic mail is not secure and the recipient of this message is expected to provide safeguards from viruses and pursue alternate means of communication where privacy or a binding message is desired.


_______________________________________________
vwnc mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/vwnc

_______________________________________________
vwnc mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/vwnc

SortBy.st (8K) Download Attachment
12