The Trunk: Collections-ul.695.mcz

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
1 message Options
Reply | Threaded
Open this post in threaded view
|

The Trunk: Collections-ul.695.mcz

commits-2
Levente Uzonyi uploaded a new version of Collections to project The Trunk:
http://source.squeak.org/trunk/Collections-ul.695.mcz

==================== Summary ====================

Name: Collections-ul.695
Author: ul
Time: 2 June 2016, 5:48:18.403513 pm
UUID: be1e0401-02a2-49e4-827f-fe0fee861114
Ancestors: Collections-topa.694

Heap revamp #1:

class side:
- Heaps can be created on an existing array with #on: #on:size: and #on:size:sortBlock:
- #withAll: and #withAll:sortBlock: avoid creating an extra copy if the argument is already an Array

instance side:
- imported and improved #= from Pharo
- #growTo: won't raise an error when newSize is less than array size (e.g. during compaction)
- #sortBlock: restores the heap invariant on its own
- improved #select:, #collect, #at:, #removeAt:, #add:
- added #sort and #compact; compatible with other collections
- #capacity returns the real capacity
- inlined #sorts:before: and #indexUpdateBlock: for performance
- improved performance and legibility of the heap operations #upHeap: #downHeap: and #downHeapSingle:
- added #isValidHeap
- added some comments

=============== Diff against Collections-topa.694 ===============

Item was added:
+ ----- Method: Heap class>>on: (in category 'instance creation') -----
+ on: anArray
+ "Create a new heap using anArray as the internal array"
+
+ ^self on: anArray size: anArray size sortBlock: nil!

Item was added:
+ ----- Method: Heap class>>on:size: (in category 'instance creation') -----
+ on: anArray size: size
+ "Create a new heap using the first size elements of anArray as the internal array"
+
+ ^self
+ on: anArray
+ size: size
+ sortBlock: nil!

Item was added:
+ ----- Method: Heap class>>on:size:sortBlock: (in category 'instance creation') -----
+ on: anArray size: size sortBlock: aBlockOrNil
+ "Create a new heap using the first size elements of anArray as the internal array and sorted by aBlockOrNil"
+
+ anArray isArray ifFalse: [ self error: 'Array expected.' ].
+ anArray size < size ifTrue: [ self error: 'size must not be larger than anArray size' ].
+ ^super new
+ setCollection: anArray tally: size;
+ sortBlock: aBlockOrNil;
+ yourself!

Item was changed:
  ----- Method: Heap class>>withAll: (in category 'instance creation') -----
  withAll: aCollection
  "Create a new heap with all the elements from aCollection"
+
+ ^self withAll: aCollection sortBlock: nil!
- ^(self basicNew)
- setCollection: aCollection asArray copy tally: aCollection size;
- reSort;
- yourself!

Item was changed:
  ----- Method: Heap class>>withAll:sortBlock: (in category 'instance creation') -----
  withAll: aCollection sortBlock: sortBlock
+ "Create a new heap with all the elements from aCollection, sorted by sortBlock"
+
+ | array |
+ array := aCollection asArray.
+ array == aCollection ifTrue: [ array := array copy ].
+ ^self on: array size: array size sortBlock: sortBlock!
- "Create a new heap with all the elements from aCollection"
- ^(self basicNew)
- setCollection: aCollection asArray copy tally: aCollection size;
- sortBlock: sortBlock;
- yourself!

Item was changed:
  ----- Method: Heap>>= (in category 'comparing') -----
  = anObject
+ "Heap are considered to be equal when they have the same sortBlock and the same elements. This method is expensive due to the sorted copies of the arrays. Try not to use it."
 
+ self == anObject ifTrue: [ ^true ].
+ anObject isHeap ifFalse: [ ^false ].
+ anObject size = tally ifFalse: [ ^false ].
+ anObject sortBlock = sortBlock ifFalse: [ ^false ].
+ ^((array first: tally) sort: sortBlock) = ((anObject array first: tally) sort: sortBlock)!
- ^ self == anObject
- ifTrue: [true]
- ifFalse: [anObject isHeap
- ifTrue: [sortBlock = anObject sortBlock and: [super = anObject]]
- ifFalse: [super = anObject]]!

Item was changed:
  ----- Method: Heap>>add: (in category 'adding') -----
  add: anObject
  "Include newObject as one of the receiver's elements. Answer newObject."
+
  tally = array size ifTrue:[self grow].
  array at: (tally := tally + 1) put: anObject.
+ indexUpdateBlock ifNotNil: [ indexUpdateBlock value: anObject value: tally ].
- self updateObjectIndex: tally.
  self upHeap: tally.
  ^anObject!

Item was changed:
  ----- Method: Heap>>at: (in category 'accessing') -----
  at: index
  "Return the element at the given position within the receiver"
+
+ index > tally ifTrue: [ ^self errorSubscriptBounds: index ].
- (index < 1 or:[index > tally]) ifTrue:[^self errorSubscriptBounds: index].
  ^array at: index!

Item was added:
+ ----- Method: Heap>>capacity (in category 'accessing') -----
+ capacity
+ "Answer the current capacity of the receiver."
+
+ ^array size!

Item was changed:
  ----- Method: Heap>>collect: (in category 'enumerating') -----
  collect: aBlock
+
+ ^(array first: tally) replace: aBlock!
- ^self collect: aBlock as: Array!

Item was added:
+ ----- Method: Heap>>compact (in category 'growing') -----
+ compact
+ "Remove any empty slots in the receiver."
+
+ self growTo: self size.!

Item was changed:
  ----- Method: Heap>>downHeap: (in category 'private-heap') -----
  downHeap: anIndex
  "Check the heap downwards for correctness starting at anIndex.
  Everything above (i.e. left of) anIndex is ok."
+
+ | childIndex childValue index value |
+ index := anIndex.
- | value k n j |
- anIndex = 0 ifTrue:[^self].
- n := tally bitShift: -1.
- k := anIndex.
  value := array at: anIndex.
+ [ (childIndex := 2 * index) >= tally or: [
+ "Select the child with the larger value. We know there are two children."
+ childValue := array at: childIndex.
+ (sortBlock
+ ifNil: [ (array at: childIndex + 1) <= childValue ]
+ ifNotNil: [ sortBlock value: (array at: childIndex + 1) value: childValue ])
+ ifTrue: [
+ childValue := array at: (childIndex := childIndex + 1) ].
+ "Check if the value at index is at the right position."
+ sortBlock
+ ifNil: [ value <= childValue ]
+ ifNotNil: [ sortBlock value: value value: childValue ] ] ]
+ whileFalse: [
+ "Move value downwards the tree."
+ array at: index put: childValue.
+ indexUpdateBlock ifNotNil: [ indexUpdateBlock value: childValue value: index ].
+ "Contine from childIndex"
+ index := childIndex ].
+ childIndex = tally ifTrue: [ "Special case: there's only one child."
+ "Check if the value at index is at the right position."
+ childValue := array at: childIndex.
+ (sortBlock
+ ifNil: [ value <= childValue ]
+ ifNotNil: [ sortBlock value: value value: childValue ])
+ ifFalse: [
+ "Move value downwards the tree."
+ array at: index put: childValue.
+ indexUpdateBlock ifNotNil: [ indexUpdateBlock value: childValue value: index ].
+ "Contine from childIndex"
+ index := childIndex ] ].
+ array at: index put: value.
+ indexUpdateBlock ifNotNil: [ indexUpdateBlock value: value value: index ]!
- [k <= n] whileTrue:[
- j := k + k.
- "use max(j,j+1)"
- (j < tally and:[self sorts: (array at: j+1) before: (array at: j)])
- ifTrue:[ j := j + 1].
- "check if position k is ok"
- (self sorts: value before: (array at: j))
- ifTrue:[ "yes -> break loop"
- n := k - 1]
- ifFalse:[ "no -> make room at j by moving j-th element to k-th position"
- array at: k put: (array at: j).
- self updateObjectIndex: k.
- "and try again with j"
- k := j]].
- array at: k put: value.
- self updateObjectIndex: k.!

Item was changed:
  ----- Method: Heap>>downHeapSingle: (in category 'private-heap') -----
  downHeapSingle: anIndex
  "This version is optimized for the case when only one element in the receiver can be at a wrong position. It avoids one comparison at each node when travelling down the heap and checks the heap upwards after the element is at a bottom position. Since the probability for being at the bottom of the heap is much larger than for being somewhere in the middle this version should be faster."
+
+ | childIndex index value |
+ index := anIndex.
- | value k n j |
- anIndex = 0 ifTrue:[^self].
- n := tally bitShift: -1.
- k := anIndex.
  value := array at: anIndex.
+ [ (childIndex := 2 * index) < tally ] whileTrue:[
+ "Select the child with the larger value. We know there are two children."
+ (sortBlock
+ ifNil: [ (array at: childIndex + 1) <= (array at: childIndex) ]
+ ifNotNil: [ sortBlock value: (array at: childIndex + 1) value: (array at: childIndex) ])
+ ifTrue: [ childIndex := childIndex + 1 ].
+ array at: index put: (array at: childIndex).
+ indexUpdateBlock ifNotNil: [ indexUpdateBlock value: (array at: index) value: index ].
+ "and repeat at the next level"
+ index := childIndex ].
+ childIndex = tally ifTrue: [ "Child with no sibling"
+ array at: index put: (array at: childIndex).
+ indexUpdateBlock ifNotNil: [ indexUpdateBlock value: (array at: index) value: index ].
+ index := childIndex ].
+ array at: index put: value.
+ indexUpdateBlock ifNotNil: [ indexUpdateBlock value: value value: index ].
+ self upHeap: index!
- [k <= n] whileTrue:[
- j := k + k.
- "use max(j,j+1)"
- (j < tally and:[self sorts: (array at: j+1) before: (array at: j)])
- ifTrue:[ j := j + 1].
- array at: k put: (array at: j).
- self updateObjectIndex: k.
- "and try again with j"
- k := j].
- array at: k put: value.
- self updateObjectIndex: k.
- self upHeap: k!

Item was changed:
  ----- Method: Heap>>growTo: (in category 'growing') -----
  growTo: newSize
  "Grow to the requested size."
+
  | newArray |
  newArray := Array new: (newSize max: tally).
+ newArray replaceFrom: 1 to: tally with: array startingAt: 1.
- newArray replaceFrom: 1 to: array size with: array startingAt: 1.
  array := newArray!

Item was changed:
  ----- Method: Heap>>indexUpdateBlock: (in category 'accessing') -----
  indexUpdateBlock: aBlockOrNil
+ "aBlockOrNil is either nil or a two argument block. The first argument is the object whose index has changed in the heap, the second is the new index. The block will be evaluated whenever an element is moved in the heap's internal array. If you don't plan to remove elements by index (see #removeAt:), then you should not set this block."
 
  indexUpdateBlock := aBlockOrNil.
 
  !

Item was added:
+ ----- Method: Heap>>isValidHeap (in category 'testing') -----
+ isValidHeap
+
+ "Check the size first."
+ (tally between: 0 and: array size) ifFalse: [ ^false ].
+ "Check the sort order between parent and child nodes."
+ 1 to: (tally bitShift: -1) do: [ :index |
+ | childIndex |
+ childIndex := 2 * index.
+ sortBlock
+ ifNil: [ (array at: index) <= (array at: childIndex) ifFalse: [ ^false ] ]
+ ifNotNil: [ (sortBlock value: (array at: index) value: (array at: childIndex)) ifFalse: [ ^false ] ].
+ childIndex < tally ifTrue: [
+ childIndex := childIndex + 1.
+ sortBlock
+ ifNil: [ (array at: index) <= (array at: childIndex) ifFalse: [ ^false ] ]
+ ifNotNil: [ (sortBlock value: (array at: index) value: (array at: childIndex)) ifFalse: [ ^false ] ] ] ].
+ "Check for elements left in array after tally."
+ tally + 1 to: array size do: [ :index |
+ (array at: index) ifNotNil: [ ^false ] ].
+ ^true!

Item was changed:
  ----- Method: Heap>>privateRemoveAt: (in category 'private') -----
  privateRemoveAt: index
+ "Remove the element at the given index and make sure the sorting order is okay. The value of index must not be larger than tally."
+
- "Remove the element at the given index and make sure the sorting order is okay"
  | removed |
  removed := array at: index.
+ index = tally ifTrue: [
+ array at: index put: nil.
+ tally := tally - 1.
+ ^removed ].
+ array
+ at: index put: (array at: tally);
+ at: tally put: nil.
- array at: index put: (array at: tally).
- array at: tally put: nil.
  tally := tally - 1.
+ 2 * index <= tally "The node at index has at least one child."
+ ifTrue: [ self downHeapSingle: index ]
+ ifFalse: [ self upHeap: index ].
- index > tally ifFalse:[
- "Use #downHeapSingle: since only one element has been removed"
- self downHeapSingle: index].
  ^removed!

Item was changed:
  ----- Method: Heap>>privateReverseSort (in category 'private') -----
  privateReverseSort
  "Arrange to have the array sorted in reverse order.
  WARNING: this method breaks the heap invariants. It's up to the sender to restore them afterwards."
+
  | oldTally |
+ self deprecated: 'Use #sort if you want to sort.'.
  oldTally := tally.
  [tally > 1] whileTrue:
  [array swap: 1 with: tally.
  tally := tally - 1.
  self downHeapSingle: 1].
  tally := oldTally!

Item was changed:
  ----- Method: Heap>>removeAt: (in category 'removing') -----
  removeAt: index
+ "Remove the element at the given index and make sure the sorting order is okay."
+
+ index > tally ifTrue: [ self errorSubscriptBounds: index ].
- "Remove the element at given position"
- (index < 1 or:[index > tally]) ifTrue:[^self errorSubscriptBounds: index].
  ^self privateRemoveAt: index!

Item was changed:
  ----- Method: Heap>>select: (in category 'enumerating') -----
  select: aBlock
  "Evaluate aBlock with each of my elements as the argument. Collect into
  a new collection like the receiver, only those elements for which aBlock
  evaluates to true."
 
  | newCollection |
  newCollection := self copyEmpty.
+ 1 to: tally do: [ :index |
+ | element |
+ (aBlock value: (element := array at: index)) ifTrue: [
+ newCollection add: element ] ].
- self do:
- [:each |
- (aBlock value: each)
- ifTrue: [newCollection add: each]].
  ^ newCollection!

Item was added:
+ ----- Method: Heap>>sort (in category 'sorting') -----
+ sort
+ "Fully sort the heap. This method preserves the heap invariants and can thus be sent safely"
+
+ | start end element originalIndexUpdateBlock |
+ end := tally.
+ "Temporarly remove indexUpdateBlock to speed up sorting."
+ originalIndexUpdateBlock := indexUpdateBlock.
+ indexUpdateBlock := nil.
+ [ tally > 1 ] whileTrue: [
+ element := array at: tally.
+ array
+ at: tally put: (array at: 1);
+ at: 1 put: element.
+ tally := tally - 1.
+ self downHeapSingle: 1 ].
+ tally := end.
+ start := 1.
+ originalIndexUpdateBlock ifNil: [
+ "The was no indexUpdateBlock; just reverse the elements"
+ [ start < end ] whileTrue: [
+ element := array at: start.
+ array
+ at: start put: (array at: end);
+ at: end put: element.
+ start := start + 1.
+ end := end - 1 ].
+ ^self ].
+ "Restore indexUpdateBlock, reverse the elements and update the indices."
+ indexUpdateBlock := originalIndexUpdateBlock.
+ start := 1.
+ [ start < end ] whileTrue: [
+ | endValue |
+ element := array at: start.
+ endValue := array at: end.
+ array
+ at: start put: endValue;
+ at: end put: element.
+ indexUpdateBlock
+ value: endValue value: start;
+ value: element value: end.
+ start := start + 1.
+ end := end - 1 ].
+ start = end ifTrue: [ indexUpdateBlock value: (array at: start) value: start ]!

Item was changed:
  ----- Method: Heap>>sortBlock: (in category 'accessing') -----
  sortBlock: aBlock
+
+ | oldIndexUpdateBlock |
  sortBlock := aBlock.
+ "Restore the heap invariant."
+ tally <= 1 ifTrue: [ ^self ].
+ oldIndexUpdateBlock := indexUpdateBlock.
+ indexUpdateBlock := nil.
+ (tally bitShift: -1) to: 1 by: -1 do: [ :index | self downHeap: index ].
+ indexUpdateBlock := oldIndexUpdateBlock ifNil: [ ^self ].
+ 1 to: tally do: [ :index |
+ indexUpdateBlock value: (array at: index) value: index ]
+
+ !
- self reSort.!

Item was changed:
  ----- Method: Heap>>upHeap: (in category 'private-heap') -----
  upHeap: anIndex
  "Check the heap upwards for correctness starting at anIndex.
  Everything below anIndex is ok."
+
+ | index parentValue parentIndex value |
+ anIndex = 1 ifTrue: [ ^self ].
+ value := array at: (index := anIndex).
+ [ index > 1 and: [
+ parentValue := array at: (parentIndex := index bitShift: -1).
+ sortBlock
+ ifNil: [ value <= parentValue ]
+ ifNotNil: [ sortBlock value: value value: parentValue ] ] ]
+ whileTrue: [
+ array at: index put: parentValue.
+ indexUpdateBlock ifNotNil: [ indexUpdateBlock value: parentValue value: index ].
+ index := parentIndex ].
+ array at: index put: value.
+ indexUpdateBlock ifNotNil: [ indexUpdateBlock value: value value: index ]!
- | value k kDiv2 tmp |
- anIndex = 0 ifTrue:[^self].
- k := anIndex.
- value := array at: anIndex.
- [ (k > 1) and:[self sorts: value before: (tmp := array at: (kDiv2 := k bitShift: -1))] ]
- whileTrue:[
- array at: k put: tmp.
- self updateObjectIndex: k.
- k := kDiv2].
- array at: k put: value.
- self updateObjectIndex: k.!