The Trunk: Collections-nice.246.mcz

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
1 message Options
Reply | Threaded
Open this post in threaded view
|

The Trunk: Collections-nice.246.mcz

commits-2
Nicolas Cellier uploaded a new version of Collections to project The Trunk:
http://source.squeak.org/trunk/Collections-nice.246.mcz

==================== Summary ====================

Name: Collections-nice.246
Author: nice
Time: 9 December 2009, 9:03:30 am
UUID: 93e084cc-d836-1146-a61d-5ce2a8972cba
Ancestors: Collections-nice.245

After an idea from Henrik Johansen, speed-up WideCharacterSet enumeration
- 1) do not use a WordArray but a ByteArray to avoid large integers
- 2) inline enumeration of bits in a byte with optimistic subtraction of highest bit (work best if bits pattern is sparse)

=============== Diff against Collections-nice.245 ===============

Item was changed:
  ----- Method: WideCharacterSet>>setBitmap:at: (in category 'private') -----
  setBitmap: aMap at: shortInteger
  "set a single bit in aMap.
  shortInteger should be between: 0 and: 16rFFFF"
 
  | collecIndex bitIndex |
+ collecIndex := shortInteger bitShift: -3.
+ bitIndex := shortInteger bitAnd: 7.
- collecIndex := shortInteger bitShift: -5.
- bitIndex := shortInteger bitAnd: 16r1F.
  ^aMap at: collecIndex + 1 put: ((aMap at: collecIndex + 1) bitOr: (1 bitShift: bitIndex))!

Item was added:
+ ----- Method: WideCharacterSet>>convertInternalRepresentationToByteArray (in category 'private') -----
+ convertInternalRepresentationToByteArray
+ | oldMap |
+ (map anySatisfy: [:e | e class == WordArray]) ifTrue: [
+ oldMap := map.
+ map := Dictionary new.
+ oldMap keysAndValuesDo: [:i :words |
+ self wordBitmap: words do: [:code | self add: (Character value: (i bitShift: 16) + code)]]].!

Item was changed:
  ----- Method: WideCharacterSet>>add: (in category 'collection ops') -----
  add: aCharacter
  | val high low lowmap |
  val := aCharacter asciiValue.
  high := val bitShift: -16.
  low := val bitAnd: 16rFFFF.
+ lowmap := map at: high ifAbsentPut: ["create a chunk of 65536=8192*8 bits"
+ ByteArray new: 8192].
- lowmap := map at: high ifAbsentPut: [WordArray new: 2048].
  self setBitmap: lowmap at: low.
  ^ aCharacter!

Item was changed:
  ----- Method: WideCharacterSet>>byteArrayMap (in category 'comparing') -----
  byteArrayMap
  "return a ByteArray mapping each ascii value to a 1 if that ascii value is in the set, and a 0 if it isn't.
  Intended for use by primitives only. (and comparison)
  This version will answer a subset with only byte characters"
 
  | aMap lowmap |
  aMap := ByteArray new: 256.
  lowmap := map at: 0 ifAbsent: [^aMap].
+ lowmap := lowmap copyFrom: 1 to: 32. "Keep first 8*32=256 bits..."
- lowmap := lowmap copyFrom: 1 to: 8. "Keep first 8*32=256 bits..."
  self bitmap: lowmap do: [:code | aMap at: code + 1 put: 1].
  ^aMap!

Item was changed:
  ----- Method: WideCharacterSet>>bitmap:at: (in category 'private') -----
  bitmap: aMap at: shortInteger
  "access a single bit in aMap.
  shortInteger should be between: 0 and: 16rFFFF"
 
  | collecIndex bitIndex |
+ collecIndex := shortInteger bitShift: -3.
+ bitIndex := shortInteger bitAnd: 7.
- collecIndex := shortInteger bitShift: -5.
- bitIndex := shortInteger bitAnd: 16r1F.
  ^(aMap at: collecIndex + 1) bitAnd: (1 bitShift: bitIndex)!

Item was added:
+ ----- Method: WideCharacterSet class>>initialize (in category 'class initialization') -----
+ initialize
+ "Old representation use WordArray.
+ Accessing a WordArray can create slow LargeInteger and is inefficient.
+ This is a temporary hack to mutate internal representation to faster ByteArray"
+ self allInstancesDo: [:e | e convertInternalRepresentationToByteArray]!

Item was changed:
  ----- Method: WideCharacterSet>>bitmap:do: (in category 'private') -----
  bitmap: aMap do: aBlock
+ "Execute a block with each value (0 based) corresponding to set bits.
+ Implementation: this version works best for sparse maps.
+ The powers of two and highBit tables are inlined for speed"
- "Execute a block with each value (0 based) corresponding to set bits"
 
+ | byte bitOffset byteOffset powersOf2 highBits |
+ powersOf2 := #[1 2 4 8 16 32 64 128].
+ highBits := #[0 1 1 2 2 2 2 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7].
+ 1 to: aMap size do: [:i |
+ (byte := aMap at: i) = 0 ifFalse: [
+ byteOffset := (i - 1) bitShift: 3.
+ [aBlock value: (byteOffset + (bitOffset := highBits at: byte)).
+ (byte := byte - (powersOf2 at: 1 + bitOffset)) = 0] whileFalse]]!
- 0 to: 31 do: [:shift |
- | mask |
- mask := 1 bitShift: shift.
- 1 to: aMap size do: [:i |
- ((aMap at: i) bitAnd: mask) isZero ifFalse: [aBlock value: ((i - 1 bitShift: 5) bitOr: shift)]]]!

Item was changed:
  ----- Method: WideCharacterSet>>clearBitmap:at: (in category 'private') -----
  clearBitmap: aMap at: shortInteger
  "clear a single bit in aMap.
  shortInteger should be between: 0 and: 16rFFFF"
 
  | collecIndex bitIndex |
+ collecIndex := shortInteger bitShift: -3.
+ bitIndex := shortInteger bitAnd: 7.
- collecIndex := shortInteger bitShift: -5.
- bitIndex := shortInteger bitAnd: 16r1F.
  ^aMap at: collecIndex + 1 put: ((aMap at: collecIndex + 1) bitClear: (1 bitShift: bitIndex))!

Item was added:
+ ----- Method: WideCharacterSet>>wordBitmap:do: (in category 'private') -----
+ wordBitmap: aMap do: aBlock
+ "Execute a block with each value (0 based) corresponding to set bits"
+
+ 0 to: 31 do: [:shift |
+ | mask |
+ mask := 1 bitShift: shift.
+ 1 to: aMap size do: [:i |
+ ((aMap at: i) bitAnd: mask) isZero ifFalse: [aBlock value: ((i - 1 bitShift: 5) bitOr: shift)]]]!