The Trunk: Collections-nice.775.mcz

commits-2
Nicolas Cellier uploaded a new version of Collections to project The Trunk:
http://source.squeak.org/trunk/Collections-nice.775.mcz

==================== Summary ====================

Name: Collections-nice.775
Author: nice
Time: 1 December 2017, 1:25:19.882602 am
UUID: 5ef7a80c-6213-4e8e-8f0c-b45e110ce15e
Ancestors: Collections-nice.774

Rename CharacterSet -> ByteCharacterSet

This is step 1:
- create a parallel ByteCharacterSet
- then mutate CharacterSet instances -> ByteCharacterSet in postscript

=============== Diff against Collections-nice.774 ===============

Item was added:
+ ----- Method: AbstractCharacterSet>>species (in category 'private') -----
+ species
+       ^CharacterSet!

Item was added:
+ Collection subclass: #ByteCharacterSet
+       instanceVariableNames: 'byteArrayMap tally'
+       classVariableNames: 'CrLf NonSeparators Separators'
+       poolDictionaries: ''
+       category: 'Collections-Support'!
+
+ !ByteCharacterSet commentStamp: '<historical>' prior: 0!
+ A set of characters.  Lookups for inclusion are very fast.!

Item was added:
+ ----- Method: ByteCharacterSet class>>allCharacters (in category 'instance creation') -----
+ allCharacters
+       "return a set containing all characters"
+
+       | set |
+       set := self empty.
+       0 to: 255 do: [ :ascii | set add: (Character value: ascii) ].
+       ^set!

Item was added:
+ ----- Method: ByteCharacterSet class>>cleanUp: (in category 'initialize-release') -----
+ cleanUp: aggressive
+
+       CrLf := NonSeparators := Separators := nil!

Item was added:
+ ----- Method: ByteCharacterSet class>>crlf (in category 'accessing') -----
+ crlf
+
+       ^CrLf ifNil: [ CrLf := self with: Character cr with: Character lf ]!

Item was added:
+ ----- Method: ByteCharacterSet class>>empty (in category 'instance creation') -----
+ empty
+       "return an empty set of characters"
+       ^self new!

Item was added:
+ ----- Method: ByteCharacterSet class>>newFrom: (in category 'instance creation') -----
+ newFrom: aCollection
+       | newCollection |
+       newCollection := self new.
+       newCollection addAll: aCollection.
+       ^newCollection!

Item was added:
+ ----- Method: ByteCharacterSet class>>nonSeparators (in category 'accessing') -----
+ nonSeparators
+       "return a set containing everything but the whitespace characters"
+
+       ^NonSeparators ifNil: [
+               NonSeparators := self separators complement ]!

Item was added:
+ ----- Method: ByteCharacterSet class>>separators (in category 'accessing') -----
+ separators
+       "return a set containing just the whitespace characters"
+
+       ^Separators ifNil: [ Separators := self newFrom: Character separators ]!

Item was added:
+ ----- Method: ByteCharacterSet class>>withAll: (in category 'instance creation') -----
+ withAll: aCollection
+       "Create a new ByteCharacterSet containing all the characters from aCollection."
+
+       ^self newFrom: aCollection!

Item was added:
+ ----- Method: ByteCharacterSet>>= (in category 'comparing') -----
+ = anObject
+
+       self species == anObject species ifFalse: [ ^false ].
+       anObject size = tally ifFalse: [ ^false ].
+       ^self byteArrayMap = anObject byteArrayMap!

Item was added:
+ ----- Method: ByteCharacterSet>>add: (in category 'adding') -----
+ add: aCharacter
+       "I automatically become a WideByteCharacterSet if you add a wide character to myself"
+
+       | index |
+       (index := aCharacter asInteger + 1) <= 256 ifFalse: [
+               | wide |
+               wide := WideCharacterSet new.
+               wide addAll: self.
+               wide add: aCharacter.
+               self becomeForward: wide.
+               ^aCharacter ].
+       (byteArrayMap at: index) = 1 ifFalse: [
+               byteArrayMap at: index put: 1.
+               tally := tally + 1 ].
+       ^aCharacter!

Item was added:
+ ----- Method: ByteCharacterSet>>asString (in category 'conversion') -----
+ asString
+       "Convert the receiver into a String"
+
+       ^String new: self size streamContents:[:s|
+               self do:[:ch| s nextPut: ch].
+       ].!

Item was added:
+ ----- Method: ByteCharacterSet>>byteArrayMap (in category 'private') -----
+ byteArrayMap
+       "return a ByteArray mapping each ascii value to a 1 if that ascii value is in the set, and a 0 if it isn't.  Intended for use by primitives only"
+       ^byteArrayMap!

Item was added:
+ ----- Method: ByteCharacterSet>>byteComplement (in category 'conversion') -----
+ byteComplement
+       "return a character set containing precisely the single byte characters the receiver does not"
+
+       | set |
+       set := ByteCharacterSet allCharacters.
+       self do: [ :c | set remove: c ].
+       ^set!

Item was added:
+ ----- Method: ByteCharacterSet>>complement (in category 'conversion') -----
+ complement
+       "return a character set containing precisely the characters the receiver does not"
+
+       ^ByteCharacterSetComplement of: self copy!

Item was added:
+ ----- Method: ByteCharacterSet>>do: (in category 'enumerating') -----
+ do: aBlock
+       "evaluate aBlock with each character in the set"
+
+       | index |
+       tally >= 128 ifTrue: [ "dense"
+               index := 0.
+               [ (index := index + 1) <= 256 ] whileTrue: [
+                       (byteArrayMap at: index) = 1 ifTrue: [
+                               aBlock value: (Character value: index - 1) ] ].
+               ^self ].
+       "sparse"
+       index := 0.
+       [ (index := byteArrayMap indexOf: 1 startingAt: index + 1) = 0 ] whileFalse: [
+               aBlock value: (Character value: index - 1) ].
+       !

Item was added:
+ ----- Method: ByteCharacterSet>>findFirstInByteString:startingAt: (in category 'zap me later') -----
+ findFirstInByteString: aByteString startingAt: startIndex
+       "Double dispatching: since we know this is a ByteString, we can use a superfast primitive using a ByteArray map with 0 slots for byte characters not included and 1 for byte characters included in the receiver."
+       ^ByteString
+               findFirstInString: aByteString
+               inSet: self byteArrayMap
+               startingAt: startIndex!

Item was added:
+ ----- Method: ByteCharacterSet>>hasWideCharacters (in category 'testing') -----
+ hasWideCharacters
+       ^false!

Item was added:
+ ----- Method: ByteCharacterSet>>hash (in category 'comparing') -----
+ hash
+       ^self byteArrayMap hash!

Item was added:
+ ----- Method: ByteCharacterSet>>includes: (in category 'testing') -----
+ includes: anObject
+
+       | index |
+       anObject isCharacter ifFalse: [ ^false ].
+       (index := anObject asInteger + 1) > 256 ifTrue: [ ^false ].
+       ^(byteArrayMap at: index) > 0!

Item was added:
+ ----- Method: ByteCharacterSet>>initialize (in category 'private') -----
+ initialize
+
+       byteArrayMap := ByteArray new: 256.
+       tally := 0!

Item was added:
+ ----- Method: ByteCharacterSet>>isEmpty (in category 'testing') -----
+ isEmpty
+       ^tally = 0!

Item was added:
+ ----- Method: ByteCharacterSet>>occurrencesOf: (in category 'zap me later') -----
+ occurrencesOf: anObject
+       "Answer how many of the receiver's elements are equal to anObject. Optimized version."
+
+       (self includes: anObject) ifTrue: [ ^1 ].
+       ^0!

Item was added:
+ ----- Method: ByteCharacterSet>>postCopy (in category 'copying') -----
+ postCopy
+       super postCopy.
+       byteArrayMap := byteArrayMap copy!

Item was added:
+ ----- Method: ByteCharacterSet>>remove: (in category 'removing') -----
+ remove: aCharacter
+
+       ^self remove: aCharacter ifAbsent: aCharacter!

Item was added:
+ ----- Method: ByteCharacterSet>>remove:ifAbsent: (in category 'removing') -----
+ remove: aCharacter ifAbsent: aBlock
+
+       | index |
+       (index := aCharacter asciiValue + 1) <= 256 ifFalse: [ ^aBlock value ].
+       (byteArrayMap at: index) = 0 ifTrue: [ ^aBlock value ].
+       byteArrayMap at: index put: 0.
+       tally := tally - 1.
+       ^aCharacter!

Item was added:
+ ----- Method: ByteCharacterSet>>removeAll (in category 'removing') -----
+ removeAll
+
+       byteArrayMap atAllPut: 0.
+       tally := 0!

Item was added:
+ ----- Method: ByteCharacterSet>>size (in category 'accessing') -----
+ size
+
+       ^tally!

Item was added:
+ ----- Method: ByteCharacterSet>>species (in category 'zap me later') -----
+ species
+       ^CharacterSet!

Item was added:
+ ----- Method: ByteCharacterSet>>union: (in category 'enumerating') -----
+ union: aCollection
+       (self species = aCollection species or: [aCollection isString or: [aCollection allSatisfy: [:e | e isCharacter]]]) ifFalse: [^super union: aCollection].
+       (self species = aCollection species and: [self class ~= aCollection class]) ifTrue: [^aCollection union: self].
+       ^self copy addAll: aCollection; yourself!

Item was added:
+ ----- Method: ByteCharacterSet>>wideCharacterMap (in category 'private') -----
+ wideCharacterMap
+       "used for comparing with WideByteCharacterSet"
+
+       | wide |
+       wide := WideByteCharacterSet new.
+       wide addAll: self.
+       ^wide wideCharacterMap!

Item was changed:
  ----- Method: CharacterSet class>>crlf (in category 'accessing') -----
  crlf

+       ^CrLf ifNil: [ CrLf := ByteCharacterSet with: Character cr with: Character lf ]!
-       ^CrLf ifNil: [ CrLf := self with: Character cr with: Character lf ]!

Item was changed:
  ----- Method: CharacterSet class>>empty (in category 'instance creation') -----
  empty
        "return an empty set of characters"
+       ^ByteCharacterSet new!
-       ^self new!

Item was changed:
  ----- Method: CharacterSet class>>newFrom: (in category 'instance creation') -----
  newFrom: aCollection
        | newCollection |
+       newCollection := ByteCharacterSet new.
-       newCollection := self new.
        newCollection addAll: aCollection.
        ^newCollection!

Item was removed:
- ----- Method: CharacterSetComplement>>findFirstInByteString:startingAt: (in category 'enumerating') -----
- findFirstInByteString: aByteString startingAt: startIndex
-       "Double dispatching: since we know this is a ByteString, we can use a superfast primitive using a ByteArray map with 0 slots for byte characters not included and 1 for byte characters included in the receiver."
-       ^ByteString
-               findFirstInString: aByteString
-               inSet: self byteArrayMap
-               startingAt: startIndex!

Item was removed:
- ----- Method: WideCharacterSet>>findFirstInByteString:startingAt: (in category 'enumerating') -----
- findFirstInByteString: aByteString startingAt: startIndex
-       "Double dispatching: since we know this is a ByteString, we can use a superfast primitive using a ByteArray map with 0 slots for byte characters not included and 1 for byte characters included in the receiver."
-
-       ^ByteString
-               findFirstInString: aByteString
-               inSet: byteArrayMap
-               startingAt: startIndex!

Item was removed:
- ----- Method: WideCharacterSet>>species (in category 'comparing') -----
- species
-       ^self hasWideCharacters
-               ifTrue: [WideCharacterSet]
-               ifFalse: [CharacterSet]!

Item was changed:
+ (PackageInfo named: 'Collections') postscript: 'CharacterSet allInstancesDo: [:e | ByteCharacterSet adoptInstance: e ]'!
- (PackageInfo named: 'Collections') postscript: 'CharacterSet allInstancesDo: #size'!
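
For readers following the diff, here is a minimal workspace sketch of how the new class is expected to behave once this version is loaded. It uses only selectors defined in the diff above (newFrom:, includes:, size, asString). The variable name set is illustrative, and the results in the comments are what the methods above imply, not output copied from an image.

```smalltalk
| set |
"Build a set from a byte string; add: sets one slot of the 256-entry byteArrayMap per character."
set := ByteCharacterSet newFrom: 'cab'.
set includes: $a.	"true"
set includes: $z.	"false, the slot for $z is still 0"
set includes: 3.	"false, the argument is not a Character"
set size.	"3, answered from tally"
set asString.	"'abc', do: walks the map in ascending code order"
```

The sketch avoids complement and wideCharacterMap, since the classes they refer to are not defined in this version.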


Re: The Trunk: Collections-nice.775.mcz

Nicolas Cellier
Sorry for introducing a few Undeclared at this stage...
They should disappear at next update.
