Nicolas Cellier uploaded a new version of Collections to project The Trunk:
http://source.squeak.org/trunk/Collections-nice.250.mcz

==================== Summary ====================

Name: Collections-nice.250
Author: nice
Time: 10 December 2009, 7:25:35 am
UUID: b5e0979e-1391-fd45-82ef-23cc83d21600
Ancestors: Collections-nice.249

Implement subclassResponsibility remove:ifAbsent: in CharacterSet family.
Cache the byteArrayMap of a WideCharacterSet.
Update class comment.

=============== Diff against Collections-nice.249 ===============

Item was changed:
  ----- Method: WideCharacterSet>>removeAll (in category 'collection ops') -----
  removeAll
+ 	map removeAll.
+ 	byteArrayMap := ByteArray new: 256!
- 	map removeAll!

Item was added:
+ ----- Method: CharacterSetComplement>>remove:ifAbsent: (in category 'collection ops') -----
+ remove: aCharacter ifAbsent: aBlock
+ 	(self includes: aCharacter) ifFalse: [^aBlock value].
+ 	^self remove: aCharacter!

Item was changed:
  ----- Method: WideCharacterSet>>add: (in category 'collection ops') -----
  add: aCharacter
  	| val high low lowmap |
  	val := aCharacter asciiValue.
+ 	val < 256 ifTrue: [self byteArrayMap at: val + 1 put: 1].
  	high := val bitShift: -16.
  	low := val bitAnd: 16rFFFF.
  	lowmap := map at: high ifAbsentPut: ["create a chunk of 65536=8192*8 bits"
  		ByteArray new: 8192].
  	self setBitmap: lowmap at: low.
  	^ aCharacter!

Item was changed:
  ----- Method: WideCharacterSet>>byteArrayMap (in category 'comparing') -----
  byteArrayMap
  	"return a ByteArray mapping each ascii value to a 1 if that ascii value is in the set, and a 0 if it isn't.
  	Intended for use by primitives only. (and comparison)
  	This version will answer a subset with only byte characters"

+ 	| lowmap |
+ 	byteArrayMap ifNil: [
+ 		byteArrayMap := ByteArray new: 256.
+ 		lowmap := map at: 0 ifAbsent: [^byteArrayMap].
+ 		lowmap := lowmap copyFrom: 1 to: 32. "Keep first 8*32=256 bits..."
+ 		self bitmap: lowmap do: [:code | byteArrayMap at: code + 1 put: 1]].
+ 	^byteArrayMap!
- 	| aMap lowmap |
- 	aMap := ByteArray new: 256.
- 	lowmap := map at: 0 ifAbsent: [^aMap].
- 	lowmap := lowmap copyFrom: 1 to: 32. "Keep first 8*32=256 bits..."
- 	self bitmap: lowmap do: [:code | aMap at: code + 1 put: 1].
- 	^aMap!

Item was changed:
  ----- Method: WideCharacterSet>>initialize (in category 'initialize-release') -----
  initialize
+ 	map := Dictionary new.
+ 	byteArrayMap := ByteArray new: 256!
- 	map := Dictionary new.!

Item was changed:
  ----- Method: WideCharacterSet>>remove: (in category 'collection ops') -----
  remove: aCharacter
  	| val high low lowmap |
  	val := aCharacter asciiValue.
+ 	val < 256 ifTrue: [self byteArrayMap at: val + 1 put: 0].
  	high := val bitShift: -16.
  	low := val bitAnd: 16rFFFF.
  	lowmap := map at: high ifAbsent: [^ aCharacter].
  	self clearBitmap: lowmap at: low.
+ 	(lowmap allSatisfy: [:e | e = 0])
- 	lowmap max = 0
  		ifTrue: [map removeKey: high].
  	^ aCharacter!

Item was added:
+ ----- Method: CharacterSet>>remove:ifAbsent: (in category 'collection ops') -----
+ remove: aCharacter ifAbsent: aBlock
+ 	(self includes: aCharacter) ifFalse: [^aBlock value].
+ 	^self remove: aCharacter!

Item was changed:
  ----- Method: WideCharacterSet>>do: (in category 'collection ops') -----
  do: aBlock
  	map
+ 		keysAndValuesDo: [:index :lowmap |
+ 			| high16Bits |
+ 			high16Bits := index bitShift: 16.
+ 			self
- 		keysAndValuesDo: [:high :lowmap | self
  				bitmap: lowmap
+ 				do: [:low16Bits | aBlock value: (Character value: high16Bits + low16Bits)]]!
- 				do: [:low | aBlock
- 						value: (Character value: ((high bitShift: 16) bitOr: low))]]!

Item was added:
+ ----- Method: WideCharacterSet>>remove:ifAbsent: (in category 'collection ops') -----
+ remove: aCharacter ifAbsent: aBlock
+ 	(self includes: aCharacter) ifFalse: [^aBlock value].
+ 	^self remove: aCharacter!

Item was changed:
  Collection subclass: #WideCharacterSet
+ 	instanceVariableNames: 'map byteArrayMap'
- 	instanceVariableNames: 'map'
  	classVariableNames: ''
  	poolDictionaries: ''
  	category: 'Collections-Support'!

+ !WideCharacterSet commentStamp: 'nice 12/10/2009 19:17' prior: 0!
- !WideCharacterSet commentStamp: 'nice 5/9/2006 23:33' prior: 0!
  WideCharacterSet is used to store a Set of WideCharacter with fast access and inclusion test.

  Implementation should be efficient in memory if sets are sufficently sparse.

  Wide Characters are at most 32bits.
  We split them into 16 highBits and 16 lowBits.

  map is a dictionary key: 16 highBits value: map of 16 lowBits.

+ Maps of lowBits are stored as arrays of bits in a ByteArray.
- Maps of lowBits are stored as arrays of bits in a WordArray.
  If a bit is set to 1, this indicate that corresponding character is present.
+ 8192 bytes are necessary in each lowmap.
+ Empty lowmap are removed from the map Dictionary.
+ 
+ A byteArrayMap is maintained in parallel with map for fast handling of ByteString.
+ (byteArrayMap at: i+1) = 0 means that character of asciiValue i is absent, = 1 means present.!
- Only 2048 entries are necessary in each lowmap.
- And only lowmap corresponding to a present high value are stored.!
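
For readers following the class comment, here is a minimal workspace sketch of the high/low split and of the cached byteArrayMap. It is only an illustration: the code point 16r1F600 and the temporary names are assumptions made for the example, and the expected values in the comments are read off the methods in this diff, not taken from a test run. The bit layout inside a lowmap (setBitmap:at:) is not shown in the diff, so it is left out.

```smalltalk
"Workspace sketch. The code point and the temporary names are illustrative;
 only the selectors come from the diff."
| val high low set |

"Split a code point the way add: and remove: do."
val := 16r1F600.
high := val bitShift: -16.	"16 high bits: the key into map"
low := val bitAnd: 16rFFFF.	"16 low bits: the bit position inside the lowmap"

"do: rebuilds the code point from the two halves."
(high bitShift: 16) + low = val.	"true"

"One lowmap covers 65536 bit positions, hence 8192 bytes."
8192 * 8 = 65536.	"true"

"byteArrayMap mirrors membership for code points below 256."
set := WideCharacterSet new.
set add: $a.
set byteArrayMap at: $a asciiValue + 1.	"expected 1 after add:"
set remove: $z ifAbsent: [#absent].	"expected #absent, since $z was never added"
```

Evaluating each expression with "print it" shows the intermediate values. The point of the cache is that add: and remove: keep the 256-entry byteArrayMap up to date, so byteArrayMap no longer rebuilds it on every call.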