The Inbox: Collections-nice.530.mcz

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
28 messages Options
12
Reply | Threaded
Open this post in threaded view
|

The Inbox: Collections-nice.530.mcz

commits-2
A new version of Collections was added to project The Inbox:
http://source.squeak.org/inbox/Collections-nice.530.mcz

==================== Summary ====================

Name: Collections-nice.530
Author: nice
Time: 4 September 2013, 2:21:28.852 am
UUID: d732aa23-1c77-4c1e-abc0-d8d4506b7f9f
Ancestors: Collections-ul.529

Fix this:
  self assert: (allSymbols select: [:s | s = '+']) size = 1.

1) It is un-necessary to enumerate the 1-char symbols, they already are in SymbolTable (the first time they were created via #findIntern: or by virtue of last #rehash) !

2) We can use a WeekArray rather than an Array because there is no point in keeping a reference to unused 1-char symbols.

Maybe we should get rid of OneCharacterSymbols class var, but I didn't dare...

=============== Diff against Collections-ul.529 ===============

Item was changed:
  ----- Method: Symbol class>>allSymbols (in category 'access') -----
  allSymbols
  "Answer all interned symbols"
  ^Array streamContents:[:s|
  s nextPutAll: NewSymbols.
- s nextPutAll: OneCharacterSymbols.
  s nextPutAll: SymbolTable.
  ].
  !

Item was changed:
  ----- Method: Symbol class>>initialize (in category 'class initialization') -----
  initialize
 
  "Symbol initialize"
 
  Symbol rehash.
+ OneCharacterSymbols := WeakArray new: 256.
+ (0 to: 255) do: [ :byte | byte asCharacter asSymbol].
- OneCharacterSymbols := nil.
- OneCharacterSymbols := (1 to: 256) collect: [ :i | (i - 1) asCharacter asSymbol].
  Smalltalk addToShutDownList: self.
  !

Item was changed:
  ----- Method: Symbol class>>internCharacter: (in category 'instance creation') -----
  internCharacter: aCharacter
  aCharacter asciiValue > 256 ifTrue:[^self intern: aCharacter asString].
+ ^(OneCharacterSymbols at: aCharacter asciiValue + 1) ifNil: [OneCharacterSymbols at: aCharacter asciiValue + 1 put: (self intern: aCharacter asString)]
- OneCharacterSymbols ifNil: [^self intern: aCharacter asString].
- ^OneCharacterSymbols at: aCharacter asciiValue + 1
  !

Item was changed:
  ----- Method: Symbol class>>selectorsContaining: (in category 'access') -----
  selectorsContaining: aString
  "Answer a list of selectors that contain aString within them. Case-insensitive.  Does return symbols that begin with a capital letter."
 
  | size selectorList ascii |
 
  selectorList := OrderedCollection new.
  (size := aString size) = 0 ifTrue: [^selectorList].
 
  aString size = 1 ifTrue:
  [
  ascii := aString first asciiValue.
+ ascii < 128 ifTrue: [(OneCharacterSymbols at: ascii+1) ifNotNil: [:s | selectorList add: s]]
- ascii < 128 ifTrue: [selectorList add: (OneCharacterSymbols at: ascii+1)]
  ].
 
  (aString first isLetter or: [aString first isDigit]) ifFalse:
  [
  aString size = 2 ifTrue:
  [Symbol hasInterned: aString ifTrue:
  [:s | selectorList add: s]].
  ^selectorList
  ].
 
  selectorList := selectorList copyFrom: 2 to: selectorList size.
 
  self allSymbolTablesDo: [:each |
  each size >= size ifTrue:
  [(each findSubstring: aString in: each startingAt: 1
  matchTable: CaseInsensitiveOrder) > 0
  ifTrue: [selectorList add: each]]].
 
  ^selectorList reject: [:each | "reject non-selectors, but keep ones that begin with an uppercase"
  each numArgs < 0 and: [each asString withFirstCharacterDownshifted numArgs < 0]].
 
  "Symbol selectorsContaining: 'scon'"!


Reply | Threaded
Open this post in threaded view
|

Re: The Inbox: Collections-nice.530.mcz

Hannes Hirzel
What is the aim of the OneCharacterSymbols class?

On Wed, 4 Sep 2013 00:21:42.547 0000, [hidden email]
<[hidden email]> wrote:

> A new version of Collections was added to project The Inbox:
> http://source.squeak.org/inbox/Collections-nice.530.mcz
>
> ==================== Summary ====================
>
> Name: Collections-nice.530
> Author: nice
> Time: 4 September 2013, 2:21:28.852 am
> UUID: d732aa23-1c77-4c1e-abc0-d8d4506b7f9f
> Ancestors: Collections-ul.529
>
> Fix this:
>   self assert: (allSymbols select: [:s | s = '+']) size = 1.
>
> 1) It is un-necessary to enumerate the 1-char symbols, they already are in
> SymbolTable (the first time they were created via #findIntern: or by virtue
> of last #rehash) !
>
> 2) We can use a WeekArray rather than an Array because there is no point in
> keeping a reference to unused 1-char symbols.
>
> Maybe we should get rid of OneCharacterSymbols class var, but I didn't
> dare...
>
> =============== Diff against Collections-ul.529 ===============
>
> Item was changed:
>   ----- Method: Symbol class>>allSymbols (in category 'access') -----
>   allSymbols
>   "Answer all interned symbols"
>   ^Array streamContents:[:s|
>   s nextPutAll: NewSymbols.
> - s nextPutAll: OneCharacterSymbols.
>   s nextPutAll: SymbolTable.
>   ].
>   !
>
> Item was changed:
>   ----- Method: Symbol class>>initialize (in category 'class
> initialization') -----
>   initialize
>
>   "Symbol initialize"
>
>   Symbol rehash.
> + OneCharacterSymbols := WeakArray new: 256.
> + (0 to: 255) do: [ :byte | byte asCharacter asSymbol].
> - OneCharacterSymbols := nil.
> - OneCharacterSymbols := (1 to: 256) collect: [ :i | (i - 1) asCharacter
> asSymbol].
>   Smalltalk addToShutDownList: self.
>   !
>
> Item was changed:
>   ----- Method: Symbol class>>internCharacter: (in category 'instance
> creation') -----
>   internCharacter: aCharacter
>   aCharacter asciiValue > 256 ifTrue:[^self intern: aCharacter asString].
> + ^(OneCharacterSymbols at: aCharacter asciiValue + 1) ifNil:
> [OneCharacterSymbols at: aCharacter asciiValue + 1 put: (self intern:
> aCharacter asString)]
> - OneCharacterSymbols ifNil: [^self intern: aCharacter asString].
> - ^OneCharacterSymbols at: aCharacter asciiValue + 1
>   !
>
> Item was changed:
>   ----- Method: Symbol class>>selectorsContaining: (in category 'access')
> -----
>   selectorsContaining: aString
>   "Answer a list of selectors that contain aString within them.
> Case-insensitive.  Does return symbols that begin with a capital letter."
>
>   | size selectorList ascii |
>
>   selectorList := OrderedCollection new.
>   (size := aString size) = 0 ifTrue: [^selectorList].
>
>   aString size = 1 ifTrue:
>   [
>   ascii := aString first asciiValue.
> + ascii < 128 ifTrue: [(OneCharacterSymbols at: ascii+1) ifNotNil: [:s |
> selectorList add: s]]
> - ascii < 128 ifTrue: [selectorList add: (OneCharacterSymbols at:
> ascii+1)]
>   ].
>
>   (aString first isLetter or: [aString first isDigit]) ifFalse:
>   [
>   aString size = 2 ifTrue:
>   [Symbol hasInterned: aString ifTrue:
>   [:s | selectorList add: s]].
>   ^selectorList
>   ].
>
>   selectorList := selectorList copyFrom: 2 to: selectorList size.
>
>   self allSymbolTablesDo: [:each |
>   each size >= size ifTrue:
>   [(each findSubstring: aString in: each startingAt: 1
>   matchTable: CaseInsensitiveOrder) > 0
>   ifTrue: [selectorList add: each]]].
>
>   ^selectorList reject: [:each | "reject non-selectors, but keep ones that
> begin with an uppercase"
>   each numArgs < 0 and: [each asString withFirstCharacterDownshifted
> numArgs < 0]].
>
>   "Symbol selectorsContaining: 'scon'"!
>
>
>

Reply | Threaded
Open this post in threaded view
|

Re: The Inbox: Collections-nice.530.mcz

Nicolas Cellier
No idea...
Presumably fast up some lookup?


2013/9/4 H. Hirzel <[hidden email]>
What is the aim of the OneCharacterSymbols class?

On Wed, 4 Sep 2013 00:21:42.547 0000, [hidden email]
<[hidden email]> wrote:
> A new version of Collections was added to project The Inbox:
> http://source.squeak.org/inbox/Collections-nice.530.mcz
>
> ==================== Summary ====================
>
> Name: Collections-nice.530
> Author: nice
> Time: 4 September 2013, 2:21:28.852 am
> UUID: d732aa23-1c77-4c1e-abc0-d8d4506b7f9f
> Ancestors: Collections-ul.529
>
> Fix this:
>   self assert: (allSymbols select: [:s | s = '+']) size = 1.
>
> 1) It is un-necessary to enumerate the 1-char symbols, they already are in
> SymbolTable (the first time they were created via #findIntern: or by virtue
> of last #rehash) !
>
> 2) We can use a WeekArray rather than an Array because there is no point in
> keeping a reference to unused 1-char symbols.
>
> Maybe we should get rid of OneCharacterSymbols class var, but I didn't
> dare...
>
> =============== Diff against Collections-ul.529 ===============
>
> Item was changed:
>   ----- Method: Symbol class>>allSymbols (in category 'access') -----
>   allSymbols
>       "Answer all interned symbols"
>       ^Array streamContents:[:s|
>               s nextPutAll: NewSymbols.
> -             s nextPutAll: OneCharacterSymbols.
>               s nextPutAll: SymbolTable.
>       ].
>   !
>
> Item was changed:
>   ----- Method: Symbol class>>initialize (in category 'class
> initialization') -----
>   initialize
>
>       "Symbol initialize"
>
>       Symbol rehash.
> +     OneCharacterSymbols := WeakArray new: 256.
> +     (0 to: 255) do: [ :byte | byte asCharacter asSymbol].
> -     OneCharacterSymbols := nil.
> -     OneCharacterSymbols := (1 to: 256) collect: [ :i | (i - 1) asCharacter
> asSymbol].
>       Smalltalk addToShutDownList: self.
>   !
>
> Item was changed:
>   ----- Method: Symbol class>>internCharacter: (in category 'instance
> creation') -----
>   internCharacter: aCharacter
>       aCharacter asciiValue > 256 ifTrue:[^self intern: aCharacter asString].
> +     ^(OneCharacterSymbols at: aCharacter asciiValue + 1) ifNil:
> [OneCharacterSymbols at: aCharacter asciiValue + 1 put: (self intern:
> aCharacter asString)]
> -     OneCharacterSymbols ifNil: [^self intern: aCharacter asString].
> -     ^OneCharacterSymbols at: aCharacter asciiValue + 1
>   !
>
> Item was changed:
>   ----- Method: Symbol class>>selectorsContaining: (in category 'access')
> -----
>   selectorsContaining: aString
>       "Answer a list of selectors that contain aString within them.
> Case-insensitive.  Does return symbols that begin with a capital letter."
>
>       | size selectorList ascii |
>
>       selectorList := OrderedCollection new.
>       (size := aString size) = 0 ifTrue: [^selectorList].
>
>       aString size = 1 ifTrue:
>               [
>                       ascii := aString first asciiValue.
> +                     ascii < 128 ifTrue: [(OneCharacterSymbols at: ascii+1) ifNotNil: [:s |
> selectorList add: s]]
> -                     ascii < 128 ifTrue: [selectorList add: (OneCharacterSymbols at:
> ascii+1)]
>               ].
>
>       (aString first isLetter or: [aString first isDigit]) ifFalse:
>               [
>                       aString size = 2 ifTrue:
>                               [Symbol hasInterned: aString ifTrue:
>                                       [:s | selectorList add: s]].
>                       ^selectorList
>               ].
>
>       selectorList := selectorList copyFrom: 2 to: selectorList size.
>
>       self allSymbolTablesDo: [:each |
>               each size >= size ifTrue:
>                       [(each findSubstring: aString in: each startingAt: 1
>                               matchTable: CaseInsensitiveOrder) > 0
>                                               ifTrue: [selectorList add: each]]].
>
>       ^selectorList reject: [:each | "reject non-selectors, but keep ones that
> begin with an uppercase"
>               each numArgs < 0 and: [each asString withFirstCharacterDownshifted
> numArgs < 0]].
>
>   "Symbol selectorsContaining: 'scon'"!
>
>
>




Reply | Threaded
Open this post in threaded view
|

Re: The Inbox: Collections-nice.530.mcz

Levente Uzonyi-2
In reply to this post by commits-2
On Wed, 4 Sep 2013, [hidden email] wrote:

> A new version of Collections was added to project The Inbox:
> http://source.squeak.org/inbox/Collections-nice.530.mcz
>
> ==================== Summary ====================
>
> Name: Collections-nice.530
> Author: nice
> Time: 4 September 2013, 2:21:28.852 am
> UUID: d732aa23-1c77-4c1e-abc0-d8d4506b7f9f
> Ancestors: Collections-ul.529
>
> Fix this:
>  self assert: (allSymbols select: [:s | s = '+']) size = 1.
>
> 1) It is un-necessary to enumerate the 1-char symbols, they already are in SymbolTable (the first time they were created via #findIntern: or by virtue of last #rehash) !
>
> 2) We can use a WeekArray rather than an Array because there is no point in keeping a reference to unused 1-char symbols.
>
> Maybe we should get rid of OneCharacterSymbols class var, but I didn't dare...

I say we should nuke it. The only user is Symbol class >>
#selectorsContaining:, but that method is pretty much broken:
- it uses OneCharacterSymbols only for infix selectors
- it can't find infix selectors longer than 2 characters
- it tries to optimize some stuff, but wastes cycles on other stuff


Levente

>
> =============== Diff against Collections-ul.529 ===============
>
> Item was changed:
>  ----- Method: Symbol class>>allSymbols (in category 'access') -----
>  allSymbols
>   "Answer all interned symbols"
>   ^Array streamContents:[:s|
>   s nextPutAll: NewSymbols.
> - s nextPutAll: OneCharacterSymbols.
>   s nextPutAll: SymbolTable.
>   ].
>  !
>
> Item was changed:
>  ----- Method: Symbol class>>initialize (in category 'class initialization') -----
>  initialize
>
>   "Symbol initialize"
>
>   Symbol rehash.
> + OneCharacterSymbols := WeakArray new: 256.
> + (0 to: 255) do: [ :byte | byte asCharacter asSymbol].
> - OneCharacterSymbols := nil.
> - OneCharacterSymbols := (1 to: 256) collect: [ :i | (i - 1) asCharacter asSymbol].
>   Smalltalk addToShutDownList: self.
>  !
>
> Item was changed:
>  ----- Method: Symbol class>>internCharacter: (in category 'instance creation') -----
>  internCharacter: aCharacter
>   aCharacter asciiValue > 256 ifTrue:[^self intern: aCharacter asString].
> + ^(OneCharacterSymbols at: aCharacter asciiValue + 1) ifNil: [OneCharacterSymbols at: aCharacter asciiValue + 1 put: (self intern: aCharacter asString)]
> - OneCharacterSymbols ifNil: [^self intern: aCharacter asString].
> - ^OneCharacterSymbols at: aCharacter asciiValue + 1
>  !
>
> Item was changed:
>  ----- Method: Symbol class>>selectorsContaining: (in category 'access') -----
>  selectorsContaining: aString
>   "Answer a list of selectors that contain aString within them. Case-insensitive.  Does return symbols that begin with a capital letter."
>
>   | size selectorList ascii |
>
>   selectorList := OrderedCollection new.
>   (size := aString size) = 0 ifTrue: [^selectorList].
>
>   aString size = 1 ifTrue:
>   [
>   ascii := aString first asciiValue.
> + ascii < 128 ifTrue: [(OneCharacterSymbols at: ascii+1) ifNotNil: [:s | selectorList add: s]]
> - ascii < 128 ifTrue: [selectorList add: (OneCharacterSymbols at: ascii+1)]
>   ].
>
>   (aString first isLetter or: [aString first isDigit]) ifFalse:
>   [
>   aString size = 2 ifTrue:
>   [Symbol hasInterned: aString ifTrue:
>   [:s | selectorList add: s]].
>   ^selectorList
>   ].
>
>   selectorList := selectorList copyFrom: 2 to: selectorList size.
>
>   self allSymbolTablesDo: [:each |
>   each size >= size ifTrue:
>   [(each findSubstring: aString in: each startingAt: 1
>   matchTable: CaseInsensitiveOrder) > 0
>   ifTrue: [selectorList add: each]]].
>
>   ^selectorList reject: [:each | "reject non-selectors, but keep ones that begin with an uppercase"
>   each numArgs < 0 and: [each asString withFirstCharacterDownshifted numArgs < 0]].
>
>  "Symbol selectorsContaining: 'scon'"!
>
>
>

Reply | Threaded
Open this post in threaded view
|

Re: The Inbox: Collections-nice.530.mcz

Nicolas Cellier
Well, as you were the last modifier, I prefer to have your advice :)

Nicolas


2013/9/4 Levente Uzonyi <[hidden email]>
On Wed, 4 Sep 2013, [hidden email] wrote:

A new version of Collections was added to project The Inbox:
http://source.squeak.org/inbox/Collections-nice.530.mcz

==================== Summary ====================

Name: Collections-nice.530
Author: nice
Time: 4 September 2013, 2:21:28.852 am
UUID: d732aa23-1c77-4c1e-abc0-d8d4506b7f9f
Ancestors: Collections-ul.529

Fix this:
 self assert: (allSymbols select: [:s | s = '+']) size = 1.

1) It is un-necessary to enumerate the 1-char symbols, they already are in SymbolTable (the first time they were created via #findIntern: or by virtue of last #rehash) !

2) We can use a WeekArray rather than an Array because there is no point in keeping a reference to unused 1-char symbols.

Maybe we should get rid of OneCharacterSymbols class var, but I didn't dare...

I say we should nuke it. The only user is Symbol class >> #selectorsContaining:, but that method is pretty much broken:
- it uses OneCharacterSymbols only for infix selectors
- it can't find infix selectors longer than 2 characters
- it tries to optimize some stuff, but wastes cycles on other stuff


Levente



=============== Diff against Collections-ul.529 ===============

Item was changed:
 ----- Method: Symbol class>>allSymbols (in category 'access') -----
 allSymbols
        "Answer all interned symbols"
        ^Array streamContents:[:s|
                s nextPutAll: NewSymbols.
-               s nextPutAll: OneCharacterSymbols.
                s nextPutAll: SymbolTable.
        ].
 !

Item was changed:
 ----- Method: Symbol class>>initialize (in category 'class initialization') -----
 initialize

        "Symbol initialize"

        Symbol rehash.
+       OneCharacterSymbols := WeakArray new: 256.
+       (0 to: 255) do: [ :byte | byte asCharacter asSymbol].
-       OneCharacterSymbols := nil.
-       OneCharacterSymbols := (1 to: 256) collect: [ :i | (i - 1) asCharacter asSymbol].
        Smalltalk addToShutDownList: self.
 !

Item was changed:
 ----- Method: Symbol class>>internCharacter: (in category 'instance creation') -----
 internCharacter: aCharacter
        aCharacter asciiValue > 256 ifTrue:[^self intern: aCharacter asString].
+       ^(OneCharacterSymbols at: aCharacter asciiValue + 1) ifNil: [OneCharacterSymbols at: aCharacter asciiValue + 1 put: (self intern: aCharacter asString)]
-       OneCharacterSymbols ifNil: [^self intern: aCharacter asString].
-       ^OneCharacterSymbols at: aCharacter asciiValue + 1
 !

Item was changed:
 ----- Method: Symbol class>>selectorsContaining: (in category 'access') -----
 selectorsContaining: aString
        "Answer a list of selectors that contain aString within them. Case-insensitive.  Does return symbols that begin with a capital letter."

        | size selectorList ascii |

        selectorList := OrderedCollection new.
        (size := aString size) = 0 ifTrue: [^selectorList].

        aString size = 1 ifTrue:
                [
                        ascii := aString first asciiValue.
+                       ascii < 128 ifTrue: [(OneCharacterSymbols at: ascii+1) ifNotNil: [:s | selectorList add: s]]
-                       ascii < 128 ifTrue: [selectorList add: (OneCharacterSymbols at: ascii+1)]
                ].

        (aString first isLetter or: [aString first isDigit]) ifFalse:
                [
                        aString size = 2 ifTrue:
                                [Symbol hasInterned: aString ifTrue:
                                        [:s | selectorList add: s]].
                        ^selectorList
                ].

        selectorList := selectorList copyFrom: 2 to: selectorList size.

        self allSymbolTablesDo: [:each |
                each size >= size ifTrue:
                        [(each findSubstring: aString in: each startingAt: 1
                                matchTable: CaseInsensitiveOrder) > 0
                                                ifTrue: [selectorList add: each]]].

        ^selectorList reject: [:each | "reject non-selectors, but keep ones that begin with an uppercase"
                each numArgs < 0 and: [each asString withFirstCharacterDownshifted numArgs < 0]].

 "Symbol selectorsContaining: 'scon'"!







Reply | Threaded
Open this post in threaded view
|

Re: The Inbox: Collections-nice.530.mcz

Levente Uzonyi-2
On Wed, 4 Sep 2013, Nicolas Cellier wrote:

> Well, as you were the last modifier, I prefer to have your advice :)

It was just a cosmetic #== -> #= change using a rewrite tool. I didn't
spend any time to understand the method back then.


Levente

>
> Nicolas
>
>
> 2013/9/4 Levente Uzonyi <[hidden email]>
>       On Wed, 4 Sep 2013, [hidden email] wrote:
>
>             A new version of Collections was added to project The Inbox:
>             http://source.squeak.org/inbox/Collections-nice.530.mcz
>
>             ==================== Summary ====================
>
>             Name: Collections-nice.530
>             Author: nice
>             Time: 4 September 2013, 2:21:28.852 am
>             UUID: d732aa23-1c77-4c1e-abc0-d8d4506b7f9f
>             Ancestors: Collections-ul.529
>
>             Fix this:
>              self assert: (allSymbols select: [:s | s = '+']) size = 1.
>
>             1) It is un-necessary to enumerate the 1-char symbols, they already are in SymbolTable (the first time they were created via #findIntern: or by virtue of last #rehash) !
>
>             2) We can use a WeekArray rather than an Array because there is no point in keeping a reference to unused 1-char symbols.
>
>             Maybe we should get rid of OneCharacterSymbols class var, but I didn't dare...
>
>
> I say we should nuke it. The only user is Symbol class >> #selectorsContaining:, but that method is pretty much broken:
> - it uses OneCharacterSymbols only for infix selectors
> - it can't find infix selectors longer than 2 characters
> - it tries to optimize some stuff, but wastes cycles on other stuff
>
>
> Levente
>
>
>       =============== Diff against Collections-ul.529 ===============
>
>       Item was changed:
>        ----- Method: Symbol class>>allSymbols (in category 'access') -----
>        allSymbols
>               "Answer all interned symbols"
>               ^Array streamContents:[:s|
>                       s nextPutAll: NewSymbols.
>       -               s nextPutAll: OneCharacterSymbols.
>                       s nextPutAll: SymbolTable.
>               ].
>        !
>
>       Item was changed:
>        ----- Method: Symbol class>>initialize (in category 'class initialization') -----
>        initialize
>
>               "Symbol initialize"
>
>               Symbol rehash.
>       +       OneCharacterSymbols := WeakArray new: 256.
>       +       (0 to: 255) do: [ :byte | byte asCharacter asSymbol].
>       -       OneCharacterSymbols := nil.
>       -       OneCharacterSymbols := (1 to: 256) collect: [ :i | (i - 1) asCharacter asSymbol].
>               Smalltalk addToShutDownList: self.
>        !
>
>       Item was changed:
>        ----- Method: Symbol class>>internCharacter: (in category 'instance creation') -----
>        internCharacter: aCharacter
>               aCharacter asciiValue > 256 ifTrue:[^self intern: aCharacter asString].
>       +       ^(OneCharacterSymbols at: aCharacter asciiValue + 1) ifNil: [OneCharacterSymbols at: aCharacter asciiValue + 1 put: (self intern: aCharacter asString)]
>       -       OneCharacterSymbols ifNil: [^self intern: aCharacter asString].
>       -       ^OneCharacterSymbols at: aCharacter asciiValue + 1
>        !
>
>       Item was changed:
>        ----- Method: Symbol class>>selectorsContaining: (in category 'access') -----
>        selectorsContaining: aString
>               "Answer a list of selectors that contain aString within them. Case-insensitive.  Does return symbols that begin with a capital letter."
>
>               | size selectorList ascii |
>
>               selectorList := OrderedCollection new.
>               (size := aString size) = 0 ifTrue: [^selectorList].
>
>               aString size = 1 ifTrue:
>                       [
>                               ascii := aString first asciiValue.
>       +                       ascii < 128 ifTrue: [(OneCharacterSymbols at: ascii+1) ifNotNil: [:s | selectorList add: s]]
>       -                       ascii < 128 ifTrue: [selectorList add: (OneCharacterSymbols at: ascii+1)]
>                       ].
>
>               (aString first isLetter or: [aString first isDigit]) ifFalse:
>                       [
>                               aString size = 2 ifTrue:
>                                       [Symbol hasInterned: aString ifTrue:
>                                               [:s | selectorList add: s]].
>                               ^selectorList
>                       ].
>
>               selectorList := selectorList copyFrom: 2 to: selectorList size.
>
>               self allSymbolTablesDo: [:each |
>                       each size >= size ifTrue:
>                               [(each findSubstring: aString in: each startingAt: 1
>                                       matchTable: CaseInsensitiveOrder) > 0
>                                                       ifTrue: [selectorList add: each]]].
>
>               ^selectorList reject: [:each | "reject non-selectors, but keep ones that begin with an uppercase"
>                       each numArgs < 0 and: [each asString withFirstCharacterDownshifted numArgs < 0]].
>
>        "Symbol selectorsContaining: 'scon'"!
>
>
>
>
>
>
>

Reply | Threaded
Open this post in threaded view
|

Removing OneCharacterSymbols (was: Re: [squeak-dev] The Inbox: Collections-nice.530.mcz)

Levente Uzonyi-2
In reply to this post by Nicolas Cellier
There's another user of OneCharacterSymbols: Character >> #asSymbol. I
don't know how often it's used (my guess is that it's rarely used), or how
much slower it would be, if we were removing OneCharacterSymbols. The
implementation would change from

  ^Symbol internCharacter: self

to

  ^Symbol intern: self asString

Which means that an extra String will be created, and the lookup is done
in the WeakSets, instead of an Array/WeakArray.


Levente

On Wed, 4 Sep 2013, Nicolas Cellier wrote:

> Well, as you were the last modifier, I prefer to have your advice :)
>
> Nicolas
>
>
> 2013/9/4 Levente Uzonyi <[hidden email]>
>       On Wed, 4 Sep 2013, [hidden email] wrote:
>
>             A new version of Collections was added to project The Inbox:
>             http://source.squeak.org/inbox/Collections-nice.530.mcz
>
>             ==================== Summary ====================
>
>             Name: Collections-nice.530
>             Author: nice
>             Time: 4 September 2013, 2:21:28.852 am
>             UUID: d732aa23-1c77-4c1e-abc0-d8d4506b7f9f
>             Ancestors: Collections-ul.529
>
>             Fix this:
>              self assert: (allSymbols select: [:s | s = '+']) size = 1.
>
>             1) It is un-necessary to enumerate the 1-char symbols, they already are in SymbolTable (the first time they were created via #findIntern: or by virtue of last #rehash) !
>
>             2) We can use a WeekArray rather than an Array because there is no point in keeping a reference to unused 1-char symbols.
>
>             Maybe we should get rid of OneCharacterSymbols class var, but I didn't dare...
>
>
> I say we should nuke it. The only user is Symbol class >> #selectorsContaining:, but that method is pretty much broken:
> - it uses OneCharacterSymbols only for infix selectors
> - it can't find infix selectors longer than 2 characters
> - it tries to optimize some stuff, but wastes cycles on other stuff
>
>
> Levente
>
>
>       =============== Diff against Collections-ul.529 ===============
>
>       Item was changed:
>        ----- Method: Symbol class>>allSymbols (in category 'access') -----
>        allSymbols
>               "Answer all interned symbols"
>               ^Array streamContents:[:s|
>                       s nextPutAll: NewSymbols.
>       -               s nextPutAll: OneCharacterSymbols.
>                       s nextPutAll: SymbolTable.
>               ].
>        !
>
>       Item was changed:
>        ----- Method: Symbol class>>initialize (in category 'class initialization') -----
>        initialize
>
>               "Symbol initialize"
>
>               Symbol rehash.
>       +       OneCharacterSymbols := WeakArray new: 256.
>       +       (0 to: 255) do: [ :byte | byte asCharacter asSymbol].
>       -       OneCharacterSymbols := nil.
>       -       OneCharacterSymbols := (1 to: 256) collect: [ :i | (i - 1) asCharacter asSymbol].
>               Smalltalk addToShutDownList: self.
>        !
>
>       Item was changed:
>        ----- Method: Symbol class>>internCharacter: (in category 'instance creation') -----
>        internCharacter: aCharacter
>               aCharacter asciiValue > 256 ifTrue:[^self intern: aCharacter asString].
>       +       ^(OneCharacterSymbols at: aCharacter asciiValue + 1) ifNil: [OneCharacterSymbols at: aCharacter asciiValue + 1 put: (self intern: aCharacter asString)]
>       -       OneCharacterSymbols ifNil: [^self intern: aCharacter asString].
>       -       ^OneCharacterSymbols at: aCharacter asciiValue + 1
>        !
>
>       Item was changed:
>        ----- Method: Symbol class>>selectorsContaining: (in category 'access') -----
>        selectorsContaining: aString
>               "Answer a list of selectors that contain aString within them. Case-insensitive.  Does return symbols that begin with a capital letter."
>
>               | size selectorList ascii |
>
>               selectorList := OrderedCollection new.
>               (size := aString size) = 0 ifTrue: [^selectorList].
>
>               aString size = 1 ifTrue:
>                       [
>                               ascii := aString first asciiValue.
>       +                       ascii < 128 ifTrue: [(OneCharacterSymbols at: ascii+1) ifNotNil: [:s | selectorList add: s]]
>       -                       ascii < 128 ifTrue: [selectorList add: (OneCharacterSymbols at: ascii+1)]
>                       ].
>
>               (aString first isLetter or: [aString first isDigit]) ifFalse:
>                       [
>                               aString size = 2 ifTrue:
>                                       [Symbol hasInterned: aString ifTrue:
>                                               [:s | selectorList add: s]].
>                               ^selectorList
>                       ].
>
>               selectorList := selectorList copyFrom: 2 to: selectorList size.
>
>               self allSymbolTablesDo: [:each |
>                       each size >= size ifTrue:
>                               [(each findSubstring: aString in: each startingAt: 1
>                                       matchTable: CaseInsensitiveOrder) > 0
>                                                       ifTrue: [selectorList add: each]]].
>
>               ^selectorList reject: [:each | "reject non-selectors, but keep ones that begin with an uppercase"
>                       each numArgs < 0 and: [each asString withFirstCharacterDownshifted numArgs < 0]].
>
>        "Symbol selectorsContaining: 'scon'"!
>
>
>
>
>
>
>

cbc
Reply | Threaded
Open this post in threaded view
|

Re: The Inbox: Collections-nice.530.mcz

cbc
In reply to this post by Levente Uzonyi-2
Wouldn't it be nice if there was a non-annoying way to mark methods that were changed by rewrite tools (or other 'global' changing tools) as being changed by those tools?  And to keep track of the last person that deeply thought about the specific method?

I think a series of notes like "Changed on ##/##/## by someone using rewrite tool; previous changer was someoneelse on ##/##/##" would help, but would also drive me nuts.  Too much text to wade through with too little info.

Maybe the version stamp?  Especially if it was an object (lastChanger, lastModified, lastNonToolChanger/date, originalAuthor/date)?  I'd find it interesting.  Maybe useful, but if no one else cares, not that useful.

-Chris


On Wed, Sep 4, 2013 at 10:06 AM, Levente Uzonyi <[hidden email]> wrote:
On Wed, 4 Sep 2013, Nicolas Cellier wrote:

Well, as you were the last modifier, I prefer to have your advice :)

It was just a cosmetic #== -> #= change using a rewrite tool. I didn't spend any time to understand the method back then.


Levente



Nicolas


2013/9/4 Levente Uzonyi <[hidden email]>
      On Wed, 4 Sep 2013, [hidden email] wrote:

            A new version of Collections was added to project The Inbox:
            http://source.squeak.org/inbox/Collections-nice.530.mcz

            ==================== Summary ====================

            Name: Collections-nice.530
            Author: nice
            Time: 4 September 2013, 2:21:28.852 am
            UUID: d732aa23-1c77-4c1e-abc0-d8d4506b7f9f
            Ancestors: Collections-ul.529

            Fix this:
             self assert: (allSymbols select: [:s | s = '+']) size = 1.

            1) It is un-necessary to enumerate the 1-char symbols, they already are in SymbolTable (the first time they were created via #findIntern: or by virtue of last #rehash) !

            2) We can use a WeekArray rather than an Array because there is no point in keeping a reference to unused 1-char symbols.

            Maybe we should get rid of OneCharacterSymbols class var, but I didn't dare...


I say we should nuke it. The only user is Symbol class >> #selectorsContaining:, but that method is pretty much broken:
- it uses OneCharacterSymbols only for infix selectors
- it can't find infix selectors longer than 2 characters
- it tries to optimize some stuff, but wastes cycles on other stuff


Levente


      =============== Diff against Collections-ul.529 ===============

      Item was changed:
       ----- Method: Symbol class>>allSymbols (in category 'access') -----
       allSymbols
              "Answer all interned symbols"
              ^Array streamContents:[:s|
                      s nextPutAll: NewSymbols.
      -               s nextPutAll: OneCharacterSymbols.
                      s nextPutAll: SymbolTable.
              ].
       !

      Item was changed:
       ----- Method: Symbol class>>initialize (in category 'class initialization') -----
       initialize

              "Symbol initialize"

              Symbol rehash.
      +       OneCharacterSymbols := WeakArray new: 256.
      +       (0 to: 255) do: [ :byte | byte asCharacter asSymbol].
      -       OneCharacterSymbols := nil.
      -       OneCharacterSymbols := (1 to: 256) collect: [ :i | (i - 1) asCharacter asSymbol].
              Smalltalk addToShutDownList: self.
       !

      Item was changed:
       ----- Method: Symbol class>>internCharacter: (in category 'instance creation') -----
       internCharacter: aCharacter
              aCharacter asciiValue > 256 ifTrue:[^self intern: aCharacter asString].
      +       ^(OneCharacterSymbols at: aCharacter asciiValue + 1) ifNil: [OneCharacterSymbols at: aCharacter asciiValue + 1 put: (self intern: aCharacter asString)]
      -       OneCharacterSymbols ifNil: [^self intern: aCharacter asString].
      -       ^OneCharacterSymbols at: aCharacter asciiValue + 1
       !

      Item was changed:
       ----- Method: Symbol class>>selectorsContaining: (in category 'access') -----
       selectorsContaining: aString
              "Answer a list of selectors that contain aString within them. Case-insensitive.  Does return symbols that begin with a capital letter."

              | size selectorList ascii |

              selectorList := OrderedCollection new.
              (size := aString size) = 0 ifTrue: [^selectorList].

              aString size = 1 ifTrue:
                      [
                              ascii := aString first asciiValue.
      +                       ascii < 128 ifTrue: [(OneCharacterSymbols at: ascii+1) ifNotNil: [:s | selectorList add: s]]
      -                       ascii < 128 ifTrue: [selectorList add: (OneCharacterSymbols at: ascii+1)]
                      ].

              (aString first isLetter or: [aString first isDigit]) ifFalse:
                      [
                              aString size = 2 ifTrue:
                                      [Symbol hasInterned: aString ifTrue:
                                              [:s | selectorList add: s]].
                              ^selectorList
                      ].

              selectorList := selectorList copyFrom: 2 to: selectorList size.

              self allSymbolTablesDo: [:each |
                      each size >= size ifTrue:
                              [(each findSubstring: aString in: each startingAt: 1
                                      matchTable: CaseInsensitiveOrder) > 0
                                                      ifTrue: [selectorList add: each]]].

              ^selectorList reject: [:each | "reject non-selectors, but keep ones that begin with an uppercase"
                      each numArgs < 0 and: [each asString withFirstCharacterDownshifted numArgs < 0]].

       "Symbol selectorsContaining: 'scon'"!












Reply | Threaded
Open this post in threaded view
|

Re: The Inbox: Collections-nice.530.mcz

timrowledge

On 04-09-2013, at 11:02 AM, Chris Cunningham <[hidden email]> wrote:

> Wouldn't it be nice if there was a non-annoying way to mark methods that were changed by rewrite tools (or other 'global' changing tools) as being changed by those tools?  And to keep track of the last person that deeply thought about the specific method?

Some metadata would be nice, certainly. It might become more practical when Chris M has the 'total reality vortex' magma complete code database and ice cream store up and running online.


tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
Law of Logical Argument: Anything is possible if you don't know what you are talking about.




Reply | Threaded
Open this post in threaded view
|

Re: Removing OneCharacterSymbols (was: Re: [squeak-dev] The Inbox: Collections-nice.530.mcz)

Nicolas Cellier
In reply to this post by Levente Uzonyi-2
A rapid review of senders indicates that Character asSymbol seems rather rare in a trunk image.

I found
#initTextConstants
#inOutdent:delta: (Character tab asSymbol asText which is rather crooked way of avoiding an allocation - String tab would allocate a new Object).



2013/9/4 Levente Uzonyi <[hidden email]>
There's another user of OneCharacterSymbols: Character >> #asSymbol. I don't know how often it's used (my guess is that it's rarely used), or how much slower it would be, if we were removing OneCharacterSymbols. The implementation would change from

        ^Symbol internCharacter: self

to

        ^Symbol intern: self asString

Which means that an extra String will be created, and the lookup is done in the WeakSets, instead of an Array/WeakArray.


Levente

On Wed, 4 Sep 2013, Nicolas Cellier wrote:

Well, as you were the last modifier, I prefer to have your advice :)

Nicolas


2013/9/4 Levente Uzonyi <[hidden email]>
      On Wed, 4 Sep 2013, [hidden email] wrote:

            A new version of Collections was added to project The Inbox:
            http://source.squeak.org/inbox/Collections-nice.530.mcz

            ==================== Summary ====================

            Name: Collections-nice.530
            Author: nice
            Time: 4 September 2013, 2:21:28.852 am
            UUID: d732aa23-1c77-4c1e-abc0-d8d4506b7f9f
            Ancestors: Collections-ul.529

            Fix this:
             self assert: (allSymbols select: [:s | s = '+']) size = 1.

            1) It is un-necessary to enumerate the 1-char symbols, they already are in SymbolTable (the first time they were created via #findIntern: or by virtue of last #rehash) !

            2) We can use a WeekArray rather than an Array because there is no point in keeping a reference to unused 1-char symbols.

            Maybe we should get rid of OneCharacterSymbols class var, but I didn't dare...


I say we should nuke it. The only user is Symbol class >> #selectorsContaining:, but that method is pretty much broken:
- it uses OneCharacterSymbols only for infix selectors
- it can't find infix selectors longer than 2 characters
- it tries to optimize some stuff, but wastes cycles on other stuff


Levente


      =============== Diff against Collections-ul.529 ===============

      Item was changed:
       ----- Method: Symbol class>>allSymbols (in category 'access') -----
       allSymbols
              "Answer all interned symbols"
              ^Array streamContents:[:s|
                      s nextPutAll: NewSymbols.
      -               s nextPutAll: OneCharacterSymbols.
                      s nextPutAll: SymbolTable.
              ].
       !

      Item was changed:
       ----- Method: Symbol class>>initialize (in category 'class initialization') -----
       initialize

              "Symbol initialize"

              Symbol rehash.
      +       OneCharacterSymbols := WeakArray new: 256.
      +       (0 to: 255) do: [ :byte | byte asCharacter asSymbol].
      -       OneCharacterSymbols := nil.
      -       OneCharacterSymbols := (1 to: 256) collect: [ :i | (i - 1) asCharacter asSymbol].
              Smalltalk addToShutDownList: self.
       !

      Item was changed:
       ----- Method: Symbol class>>internCharacter: (in category 'instance creation') -----
       internCharacter: aCharacter
              aCharacter asciiValue > 256 ifTrue:[^self intern: aCharacter asString].
      +       ^(OneCharacterSymbols at: aCharacter asciiValue + 1) ifNil: [OneCharacterSymbols at: aCharacter asciiValue + 1 put: (self intern: aCharacter asString)]
      -       OneCharacterSymbols ifNil: [^self intern: aCharacter asString].
      -       ^OneCharacterSymbols at: aCharacter asciiValue + 1
       !

      Item was changed:
       ----- Method: Symbol class>>selectorsContaining: (in category 'access') -----
       selectorsContaining: aString
              "Answer a list of selectors that contain aString within them. Case-insensitive.  Does return symbols that begin with a capital letter."

              | size selectorList ascii |

              selectorList := OrderedCollection new.
              (size := aString size) = 0 ifTrue: [^selectorList].

              aString size = 1 ifTrue:
                      [
                              ascii := aString first asciiValue.
      +                       ascii < 128 ifTrue: [(OneCharacterSymbols at: ascii+1) ifNotNil: [:s | selectorList add: s]]
      -                       ascii < 128 ifTrue: [selectorList add: (OneCharacterSymbols at: ascii+1)]
                      ].

              (aString first isLetter or: [aString first isDigit]) ifFalse:
                      [
                              aString size = 2 ifTrue:
                                      [Symbol hasInterned: aString ifTrue:
                                              [:s | selectorList add: s]].
                              ^selectorList
                      ].

              selectorList := selectorList copyFrom: 2 to: selectorList size.

              self allSymbolTablesDo: [:each |
                      each size >= size ifTrue:
                              [(each findSubstring: aString in: each startingAt: 1
                                      matchTable: CaseInsensitiveOrder) > 0
                                                      ifTrue: [selectorList add: each]]].

              ^selectorList reject: [:each | "reject non-selectors, but keep ones that begin with an uppercase"
                      each numArgs < 0 and: [each asString withFirstCharacterDownshifted numArgs < 0]].

       "Symbol selectorsContaining: 'scon'"!












Reply | Threaded
Open this post in threaded view
|

Re: Removing OneCharacterSymbols (was: Re: [squeak-dev] The Inbox: Collections-nice.530.mcz)

Bert Freudenberg

On 2013-09-04, at 21:41, Nicolas Cellier <[hidden email]> wrote:

A rapid review of senders indicates that Character asSymbol seems rather rare in a trunk image.

I found
#initTextConstants
#inOutdent:delta: (Character tab asSymbol asText which is rather crooked way of avoiding an allocation - String tab would allocate a new Object).

Scanner>>scanToken appears to be the most often used sender, and may well be the original reason for caching the 1-char symbols.

It might be a good idea to time compilation of complex methods with and without OneCharacterSymbols. (my guess is it wouldn't make much of a difference these days)

- Bert -




2013/9/4 Levente Uzonyi <[hidden email]>
There's another user of OneCharacterSymbols: Character >> #asSymbol. I don't know how often it's used (my guess is that it's rarely used), or how much slower it would be, if we were removing OneCharacterSymbols. The implementation would change from

        ^Symbol internCharacter: self

to

        ^Symbol intern: self asString

Which means that an extra String will be created, and the lookup is done in the WeakSets, instead of an Array/WeakArray.


Levente

On Wed, 4 Sep 2013, Nicolas Cellier wrote:

Well, as you were the last modifier, I prefer to have your advice :)

Nicolas


2013/9/4 Levente Uzonyi <[hidden email]>
      On Wed, 4 Sep 2013, [hidden email] wrote:

            A new version of Collections was added to project The Inbox:
            http://source.squeak.org/inbox/Collections-nice.530.mcz

            ==================== Summary ====================

            Name: Collections-nice.530
            Author: nice
            Time: 4 September 2013, 2:21:28.852 am
            UUID: d732aa23-1c77-4c1e-abc0-d8d4506b7f9f
            Ancestors: Collections-ul.529

            Fix this:
             self assert: (allSymbols select: [:s | s = '+']) size = 1.

            1) It is un-necessary to enumerate the 1-char symbols, they already are in SymbolTable (the first time they were created via #findIntern: or by virtue of last #rehash) !

            2) We can use a WeekArray rather than an Array because there is no point in keeping a reference to unused 1-char symbols.

            Maybe we should get rid of OneCharacterSymbols class var, but I didn't dare...


I say we should nuke it. The only user is Symbol class >> #selectorsContaining:, but that method is pretty much broken:
- it uses OneCharacterSymbols only for infix selectors
- it can't find infix selectors longer than 2 characters
- it tries to optimize some stuff, but wastes cycles on other stuff


Levente


      =============== Diff against Collections-ul.529 ===============

      Item was changed:
       ----- Method: Symbol class>>allSymbols (in category 'access') -----
       allSymbols
              "Answer all interned symbols"
              ^Array streamContents:[:s|
                      s nextPutAll: NewSymbols.
      -               s nextPutAll: OneCharacterSymbols.
                      s nextPutAll: SymbolTable.
              ].
       !

      Item was changed:
       ----- Method: Symbol class>>initialize (in category 'class initialization') -----
       initialize

              "Symbol initialize"

              Symbol rehash.
      +       OneCharacterSymbols := WeakArray new: 256.
      +       (0 to: 255) do: [ :byte | byte asCharacter asSymbol].
      -       OneCharacterSymbols := nil.
      -       OneCharacterSymbols := (1 to: 256) collect: [ :i | (i - 1) asCharacter asSymbol].
              Smalltalk addToShutDownList: self.
       !

      Item was changed:
       ----- Method: Symbol class>>internCharacter: (in category 'instance creation') -----
       internCharacter: aCharacter
              aCharacter asciiValue > 256 ifTrue:[^self intern: aCharacter asString].
      +       ^(OneCharacterSymbols at: aCharacter asciiValue + 1) ifNil: [OneCharacterSymbols at: aCharacter asciiValue + 1 put: (self intern: aCharacter asString)]
      -       OneCharacterSymbols ifNil: [^self intern: aCharacter asString].
      -       ^OneCharacterSymbols at: aCharacter asciiValue + 1
       !

      Item was changed:
       ----- Method: Symbol class>>selectorsContaining: (in category 'access') -----
       selectorsContaining: aString
              "Answer a list of selectors that contain aString within them. Case-insensitive.  Does return symbols that begin with a capital letter."

              | size selectorList ascii |

              selectorList := OrderedCollection new.
              (size := aString size) = 0 ifTrue: [^selectorList].

              aString size = 1 ifTrue:
                      [
                              ascii := aString first asciiValue.
      +                       ascii < 128 ifTrue: [(OneCharacterSymbols at: ascii+1) ifNotNil: [:s | selectorList add: s]]
      -                       ascii < 128 ifTrue: [selectorList add: (OneCharacterSymbols at: ascii+1)]
                      ].

              (aString first isLetter or: [aString first isDigit]) ifFalse:
                      [
                              aString size = 2 ifTrue:
                                      [Symbol hasInterned: aString ifTrue:
                                              [:s | selectorList add: s]].
                              ^selectorList
                      ].

              selectorList := selectorList copyFrom: 2 to: selectorList size.

              self allSymbolTablesDo: [:each |
                      each size >= size ifTrue:
                              [(each findSubstring: aString in: each startingAt: 1
                                      matchTable: CaseInsensitiveOrder) > 0
                                                      ifTrue: [selectorList add: each]]].

              ^selectorList reject: [:each | "reject non-selectors, but keep ones that begin with an uppercase"
                      each numArgs < 0 and: [each asString withFirstCharacterDownshifted numArgs < 0]].

       "Symbol selectorsContaining: 'scon'"!















Reply | Threaded
Open this post in threaded view
|

Re: Removing OneCharacterSymbols (was: Re: [squeak-dev] The Inbox: Collections-nice.530.mcz)

Nicolas Cellier
Thanks Bert, I just published a Compiler version that does not...


2013/9/5 Bert Freudenberg <[hidden email]>

On 2013-09-04, at 21:41, Nicolas Cellier <[hidden email]> wrote:

A rapid review of senders indicates that Character asSymbol seems rather rare in a trunk image.

I found
#initTextConstants
#inOutdent:delta: (Character tab asSymbol asText which is rather crooked way of avoiding an allocation - String tab would allocate a new Object).

Scanner>>scanToken appears to be the most often used sender, and may well be the original reason for caching the 1-char symbols.

It might be a good idea to time compilation of complex methods with and without OneCharacterSymbols. (my guess is it wouldn't make much of a difference these days)

- Bert -




2013/9/4 Levente Uzonyi <[hidden email]>
There's another user of OneCharacterSymbols: Character >> #asSymbol. I don't know how often it's used (my guess is that it's rarely used), or how much slower it would be, if we were removing OneCharacterSymbols. The implementation would change from

        ^Symbol internCharacter: self

to

        ^Symbol intern: self asString

Which means that an extra String will be created, and the lookup is done in the WeakSets, instead of an Array/WeakArray.


Levente

On Wed, 4 Sep 2013, Nicolas Cellier wrote:

Well, as you were the last modifier, I prefer to have your advice :)

Nicolas


2013/9/4 Levente Uzonyi <[hidden email]>
      On Wed, 4 Sep 2013, [hidden email] wrote:

            A new version of Collections was added to project The Inbox:
            http://source.squeak.org/inbox/Collections-nice.530.mcz

            ==================== Summary ====================

            Name: Collections-nice.530
            Author: nice
            Time: 4 September 2013, 2:21:28.852 am
            UUID: d732aa23-1c77-4c1e-abc0-d8d4506b7f9f
            Ancestors: Collections-ul.529

            Fix this:
             self assert: (allSymbols select: [:s | s = '+']) size = 1.

            1) It is un-necessary to enumerate the 1-char symbols, they already are in SymbolTable (the first time they were created via #findIntern: or by virtue of last #rehash) !

            2) We can use a WeekArray rather than an Array because there is no point in keeping a reference to unused 1-char symbols.

            Maybe we should get rid of OneCharacterSymbols class var, but I didn't dare...


I say we should nuke it. The only user is Symbol class >> #selectorsContaining:, but that method is pretty much broken:
- it uses OneCharacterSymbols only for infix selectors
- it can't find infix selectors longer than 2 characters
- it tries to optimize some stuff, but wastes cycles on other stuff


Levente


      =============== Diff against Collections-ul.529 ===============

      Item was changed:
       ----- Method: Symbol class>>allSymbols (in category 'access') -----
       allSymbols
              "Answer all interned symbols"
              ^Array streamContents:[:s|
                      s nextPutAll: NewSymbols.
      -               s nextPutAll: OneCharacterSymbols.
                      s nextPutAll: SymbolTable.
              ].
       !

      Item was changed:
       ----- Method: Symbol class>>initialize (in category 'class initialization') -----
       initialize

              "Symbol initialize"

              Symbol rehash.
      +       OneCharacterSymbols := WeakArray new: 256.
      +       (0 to: 255) do: [ :byte | byte asCharacter asSymbol].
      -       OneCharacterSymbols := nil.
      -       OneCharacterSymbols := (1 to: 256) collect: [ :i | (i - 1) asCharacter asSymbol].
              Smalltalk addToShutDownList: self.
       !

      Item was changed:
       ----- Method: Symbol class>>internCharacter: (in category 'instance creation') -----
       internCharacter: aCharacter
              aCharacter asciiValue > 256 ifTrue:[^self intern: aCharacter asString].
      +       ^(OneCharacterSymbols at: aCharacter asciiValue + 1) ifNil: [OneCharacterSymbols at: aCharacter asciiValue + 1 put: (self intern: aCharacter asString)]
      -       OneCharacterSymbols ifNil: [^self intern: aCharacter asString].
      -       ^OneCharacterSymbols at: aCharacter asciiValue + 1
       !

      Item was changed:
       ----- Method: Symbol class>>selectorsContaining: (in category 'access') -----
       selectorsContaining: aString
              "Answer a list of selectors that contain aString within them. Case-insensitive.  Does return symbols that begin with a capital letter."

              | size selectorList ascii |

              selectorList := OrderedCollection new.
              (size := aString size) = 0 ifTrue: [^selectorList].

              aString size = 1 ifTrue:
                      [
                              ascii := aString first asciiValue.
      +                       ascii < 128 ifTrue: [(OneCharacterSymbols at: ascii+1) ifNotNil: [:s | selectorList add: s]]
      -                       ascii < 128 ifTrue: [selectorList add: (OneCharacterSymbols at: ascii+1)]
                      ].

              (aString first isLetter or: [aString first isDigit]) ifFalse:
                      [
                              aString size = 2 ifTrue:
                                      [Symbol hasInterned: aString ifTrue:
                                              [:s | selectorList add: s]].
                              ^selectorList
                      ].

              selectorList := selectorList copyFrom: 2 to: selectorList size.

              self allSymbolTablesDo: [:each |
                      each size >= size ifTrue:
                              [(each findSubstring: aString in: each startingAt: 1
                                      matchTable: CaseInsensitiveOrder) > 0
                                                      ifTrue: [selectorList add: each]]].

              ^selectorList reject: [:each | "reject non-selectors, but keep ones that begin with an uppercase"
                      each numArgs < 0 and: [each asString withFirstCharacterDownshifted numArgs < 0]].

       "Symbol selectorsContaining: 'scon'"!



















Reply | Threaded
Open this post in threaded view
|

Re: Removing OneCharacterSymbols (was: Re: [squeak-dev] The Inbox: Collections-nice.530.mcz)

Bert Freudenberg
Another sender used to be #xColon. Not sure if we need to worry about that one.

CharAsSymbolSenders sortedCounts
==> {6474->#scanToken . 141->#xColon}

(this was from before your compiler changes)

- Bert -

On 2013-09-06, at 01:58, Nicolas Cellier <[hidden email]> wrote:

Thanks Bert, I just published a Compiler version that does not...


2013/9/5 Bert Freudenberg <[hidden email]>

On 2013-09-04, at 21:41, Nicolas Cellier <[hidden email]> wrote:

A rapid review of senders indicates that Character asSymbol seems rather rare in a trunk image.

I found
#initTextConstants
#inOutdent:delta: (Character tab asSymbol asText which is rather crooked way of avoiding an allocation - String tab would allocate a new Object).

Scanner>>scanToken appears to be the most often used sender, and may well be the original reason for caching the 1-char symbols.

It might be a good idea to time compilation of complex methods with and without OneCharacterSymbols. (my guess is it wouldn't make much of a difference these days)

- Bert -




2013/9/4 Levente Uzonyi <[hidden email]>
There's another user of OneCharacterSymbols: Character >> #asSymbol. I don't know how often it's used (my guess is that it's rarely used), or how much slower it would be, if we were removing OneCharacterSymbols. The implementation would change from

        ^Symbol internCharacter: self

to

        ^Symbol intern: self asString

Which means that an extra String will be created, and the lookup is done in the WeakSets, instead of an Array/WeakArray.


Levente

On Wed, 4 Sep 2013, Nicolas Cellier wrote:

Well, as you were the last modifier, I prefer to have your advice :)

Nicolas


2013/9/4 Levente Uzonyi <[hidden email]>
      On Wed, 4 Sep 2013, [hidden email] wrote:

            A new version of Collections was added to project The Inbox:
            http://source.squeak.org/inbox/Collections-nice.530.mcz

            ==================== Summary ====================

            Name: Collections-nice.530
            Author: nice
            Time: 4 September 2013, 2:21:28.852 am
            UUID: d732aa23-1c77-4c1e-abc0-d8d4506b7f9f
            Ancestors: Collections-ul.529

            Fix this:
             self assert: (allSymbols select: [:s | s = '+']) size = 1.

            1) It is un-necessary to enumerate the 1-char symbols, they already are in SymbolTable (the first time they were created via #findIntern: or by virtue of last #rehash) !

            2) We can use a WeekArray rather than an Array because there is no point in keeping a reference to unused 1-char symbols.

            Maybe we should get rid of OneCharacterSymbols class var, but I didn't dare...


I say we should nuke it. The only user is Symbol class >> #selectorsContaining:, but that method is pretty much broken:
- it uses OneCharacterSymbols only for infix selectors
- it can't find infix selectors longer than 2 characters
- it tries to optimize some stuff, but wastes cycles on other stuff


Levente


      =============== Diff against Collections-ul.529 ===============

      Item was changed:
       ----- Method: Symbol class>>allSymbols (in category 'access') -----
       allSymbols
              "Answer all interned symbols"
              ^Array streamContents:[:s|
                      s nextPutAll: NewSymbols.
      -               s nextPutAll: OneCharacterSymbols.
                      s nextPutAll: SymbolTable.
              ].
       !

      Item was changed:
       ----- Method: Symbol class>>initialize (in category 'class initialization') -----
       initialize

              "Symbol initialize"

              Symbol rehash.
      +       OneCharacterSymbols := WeakArray new: 256.
      +       (0 to: 255) do: [ :byte | byte asCharacter asSymbol].
      -       OneCharacterSymbols := nil.
      -       OneCharacterSymbols := (1 to: 256) collect: [ :i | (i - 1) asCharacter asSymbol].
              Smalltalk addToShutDownList: self.
       !

      Item was changed:
       ----- Method: Symbol class>>internCharacter: (in category 'instance creation') -----
       internCharacter: aCharacter
              aCharacter asciiValue > 256 ifTrue:[^self intern: aCharacter asString].
      +       ^(OneCharacterSymbols at: aCharacter asciiValue + 1) ifNil: [OneCharacterSymbols at: aCharacter asciiValue + 1 put: (self intern: aCharacter asString)]
      -       OneCharacterSymbols ifNil: [^self intern: aCharacter asString].
      -       ^OneCharacterSymbols at: aCharacter asciiValue + 1
       !

      Item was changed:
       ----- Method: Symbol class>>selectorsContaining: (in category 'access') -----
       selectorsContaining: aString
              "Answer a list of selectors that contain aString within them. Case-insensitive.  Does return symbols that begin with a capital letter."

              | size selectorList ascii |

              selectorList := OrderedCollection new.
              (size := aString size) = 0 ifTrue: [^selectorList].

              aString size = 1 ifTrue:
                      [
                              ascii := aString first asciiValue.
      +                       ascii < 128 ifTrue: [(OneCharacterSymbols at: ascii+1) ifNotNil: [:s | selectorList add: s]]
      -                       ascii < 128 ifTrue: [selectorList add: (OneCharacterSymbols at: ascii+1)]
                      ].

              (aString first isLetter or: [aString first isDigit]) ifFalse:
                      [
                              aString size = 2 ifTrue:
                                      [Symbol hasInterned: aString ifTrue:
                                              [:s | selectorList add: s]].
                              ^selectorList
                      ].

              selectorList := selectorList copyFrom: 2 to: selectorList size.

              self allSymbolTablesDo: [:each |
                      each size >= size ifTrue:
                              [(each findSubstring: aString in: each startingAt: 1
                                      matchTable: CaseInsensitiveOrder) > 0
                                                      ifTrue: [selectorList add: each]]].

              ^selectorList reject: [:each | "reject non-selectors, but keep ones that begin with an uppercase"
                      each numArgs < 0 and: [each asString withFirstCharacterDownshifted numArgs < 0]].

       "Symbol selectorsContaining: 'scon'"!




















Reply | Threaded
Open this post in threaded view
|

Re: Removing OneCharacterSymbols (was: Re: [squeak-dev] The Inbox: Collections-nice.530.mcz)

Nicolas Cellier
Ah right, thanks Bert.
I think that the send of xColon is rare anyway (assignment cases := apart, in which case we do not send asSymbol)

#':' is not referred directly in the Compiler stuff:
Let's see: (SystemNavigation default browseAllCallsOn: #':') -> Oops yet another bug
See fix attached

Though I think asSymbol is necessary given implementation of scanLitWord, I'll look deeper later




2013/9/11 Bert Freudenberg <[hidden email]>
Another sender used to be #xColon. Not sure if we need to worry about that one.

CharAsSymbolSenders sortedCounts
==> {6474->#scanToken . 141->#xColon}

(this was from before your compiler changes)

- Bert -

On 2013-09-06, at 01:58, Nicolas Cellier <[hidden email]> wrote:

Thanks Bert, I just published a Compiler version that does not...


2013/9/5 Bert Freudenberg <[hidden email]>

On 2013-09-04, at 21:41, Nicolas Cellier <[hidden email]> wrote:

A rapid review of senders indicates that Character asSymbol seems rather rare in a trunk image.

I found
#initTextConstants
#inOutdent:delta: (Character tab asSymbol asText which is rather crooked way of avoiding an allocation - String tab would allocate a new Object).

Scanner>>scanToken appears to be the most often used sender, and may well be the original reason for caching the 1-char symbols.

It might be a good idea to time compilation of complex methods with and without OneCharacterSymbols. (my guess is it wouldn't make much of a difference these days)

- Bert -




2013/9/4 Levente Uzonyi <[hidden email]>
There's another user of OneCharacterSymbols: Character >> #asSymbol. I don't know how often it's used (my guess is that it's rarely used), or how much slower it would be, if we were removing OneCharacterSymbols. The implementation would change from

        ^Symbol internCharacter: self

to

        ^Symbol intern: self asString

Which means that an extra String will be created, and the lookup is done in the WeakSets, instead of an Array/WeakArray.


Levente

On Wed, 4 Sep 2013, Nicolas Cellier wrote:

Well, as you were the last modifier, I prefer to have your advice :)

Nicolas


2013/9/4 Levente Uzonyi <[hidden email]>
      On Wed, 4 Sep 2013, [hidden email] wrote:

            A new version of Collections was added to project The Inbox:
            http://source.squeak.org/inbox/Collections-nice.530.mcz

            ==================== Summary ====================

            Name: Collections-nice.530
            Author: nice
            Time: 4 September 2013, 2:21:28.852 am
            UUID: d732aa23-1c77-4c1e-abc0-d8d4506b7f9f
            Ancestors: Collections-ul.529

            Fix this:
             self assert: (allSymbols select: [:s | s = '+']) size = 1.

            1) It is un-necessary to enumerate the 1-char symbols, they already are in SymbolTable (the first time they were created via #findIntern: or by virtue of last #rehash) !

            2) We can use a WeekArray rather than an Array because there is no point in keeping a reference to unused 1-char symbols.

            Maybe we should get rid of OneCharacterSymbols class var, but I didn't dare...


I say we should nuke it. The only user is Symbol class >> #selectorsContaining:, but that method is pretty much broken:
- it uses OneCharacterSymbols only for infix selectors
- it can't find infix selectors longer than 2 characters
- it tries to optimize some stuff, but wastes cycles on other stuff


Levente


      =============== Diff against Collections-ul.529 ===============

      Item was changed:
       ----- Method: Symbol class>>allSymbols (in category 'access') -----
       allSymbols
              "Answer all interned symbols"
              ^Array streamContents:[:s|
                      s nextPutAll: NewSymbols.
      -               s nextPutAll: OneCharacterSymbols.
                      s nextPutAll: SymbolTable.
              ].
       !

      Item was changed:
       ----- Method: Symbol class>>initialize (in category 'class initialization') -----
       initialize

              "Symbol initialize"

              Symbol rehash.
      +       OneCharacterSymbols := WeakArray new: 256.
      +       (0 to: 255) do: [ :byte | byte asCharacter asSymbol].
      -       OneCharacterSymbols := nil.
      -       OneCharacterSymbols := (1 to: 256) collect: [ :i | (i - 1) asCharacter asSymbol].
              Smalltalk addToShutDownList: self.
       !

      Item was changed:
       ----- Method: Symbol class>>internCharacter: (in category 'instance creation') -----
       internCharacter: aCharacter
              aCharacter asciiValue > 256 ifTrue:[^self intern: aCharacter asString].
      +       ^(OneCharacterSymbols at: aCharacter asciiValue + 1) ifNil: [OneCharacterSymbols at: aCharacter asciiValue + 1 put: (self intern: aCharacter asString)]
      -       OneCharacterSymbols ifNil: [^self intern: aCharacter asString].
      -       ^OneCharacterSymbols at: aCharacter asciiValue + 1
       !

      Item was changed:
       ----- Method: Symbol class>>selectorsContaining: (in category 'access') -----
       selectorsContaining: aString
              "Answer a list of selectors that contain aString within them. Case-insensitive.  Does return symbols that begin with a capital letter."

              | size selectorList ascii |

              selectorList := OrderedCollection new.
              (size := aString size) = 0 ifTrue: [^selectorList].

              aString size = 1 ifTrue:
                      [
                              ascii := aString first asciiValue.
      +                       ascii < 128 ifTrue: [(OneCharacterSymbols at: ascii+1) ifNotNil: [:s | selectorList add: s]]
      -                       ascii < 128 ifTrue: [selectorList add: (OneCharacterSymbols at: ascii+1)]
                      ].

              (aString first isLetter or: [aString first isDigit]) ifFalse:
                      [
                              aString size = 2 ifTrue:
                                      [Symbol hasInterned: aString ifTrue:
                                              [:s | selectorList add: s]].
                              ^selectorList
                      ].

              selectorList := selectorList copyFrom: 2 to: selectorList size.

              self allSymbolTablesDo: [:each |
                      each size >= size ifTrue:
                              [(each findSubstring: aString in: each startingAt: 1
                                      matchTable: CaseInsensitiveOrder) > 0
                                                      ifTrue: [selectorList add: each]]].

              ^selectorList reject: [:each | "reject non-selectors, but keep ones that begin with an uppercase"
                      each numArgs < 0 and: [each asString withFirstCharacterDownshifted numArgs < 0]].

       "Symbol selectorsContaining: 'scon'"!

























SystemNavigation-headingAndAutoselectForLiteraldo.st (1K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Removing OneCharacterSymbols (was: Re: [squeak-dev] The Inbox: Collections-nice.530.mcz)

Nicolas Cellier
After verification, xColon is not so rare, it is invoked for every block argument...
But we can safely replace (token := self step asSymbol) with (token := String with: self step), no problem.

We can even change it to (token := self step) if we are willing to remove last scories of alternate selectors beginning with $:
Indeed, #: and #:x are obsolete squeakism which would not compile in other dialects (#':' and #':x' is the universally correct syntax).

Removing this feature consists in
- removing #colon reference in xLitQuote scanLitWord scanLitVec scanAllTokenPosition
- removing asSymbol send in xColon.

But that would imply that methods with #: or #:x literals would not compile anymore
And that #( :x ) would result in #( #':' #x ) instead of #( #':x' ).
After my last hacking this would be a safe change in trunk (at least Compiler recompileAll works) and I'm all for it,
... but who knows what could happen to exotic third-party code.
Having code load request resulting in a SyntaxError popping up is just one of the most annoying thing...
And having interpretation of a literal changing without any sort of indication is not much better!

#: would still be supported with (Scanner prefAllowUnicharSymbol: true) but not #:x
Maybe we could make it a preference too for smoother migration?
Or some sort of disambiguating request if interactive and conservative behavior if not, like I did for 1@-2?

Thoughts?

2013/9/11 Nicolas Cellier <[hidden email]>
Ah right, thanks Bert.
I think that the send of xColon is rare anyway (assignment cases := apart, in which case we do not send asSymbol)

#':' is not referred directly in the Compiler stuff:
Let's see: (SystemNavigation default browseAllCallsOn: #':') -> Oops yet another bug
See fix attached

Though I think asSymbol is necessary given implementation of scanLitWord, I'll look deeper later




2013/9/11 Bert Freudenberg <[hidden email]>
Another sender used to be #xColon. Not sure if we need to worry about that one.

CharAsSymbolSenders sortedCounts
==> {6474->#scanToken . 141->#xColon}

(this was from before your compiler changes)

- Bert -

On 2013-09-06, at 01:58, Nicolas Cellier <[hidden email]> wrote:

Thanks Bert, I just published a Compiler version that does not...


2013/9/5 Bert Freudenberg <[hidden email]>

On 2013-09-04, at 21:41, Nicolas Cellier <[hidden email]> wrote:

A rapid review of senders indicates that Character asSymbol seems rather rare in a trunk image.

I found
#initTextConstants
#inOutdent:delta: (Character tab asSymbol asText which is rather crooked way of avoiding an allocation - String tab would allocate a new Object).

Scanner>>scanToken appears to be the most often used sender, and may well be the original reason for caching the 1-char symbols.

It might be a good idea to time compilation of complex methods with and without OneCharacterSymbols. (my guess is it wouldn't make much of a difference these days)

- Bert -




2013/9/4 Levente Uzonyi <[hidden email]>
There's another user of OneCharacterSymbols: Character >> #asSymbol. I don't know how often it's used (my guess is that it's rarely used), or how much slower it would be, if we were removing OneCharacterSymbols. The implementation would change from

        ^Symbol internCharacter: self

to

        ^Symbol intern: self asString

Which means that an extra String will be created, and the lookup is done in the WeakSets, instead of an Array/WeakArray.


Levente

On Wed, 4 Sep 2013, Nicolas Cellier wrote:

Well, as you were the last modifier, I prefer to have your advice :)

Nicolas


2013/9/4 Levente Uzonyi <[hidden email]>
      On Wed, 4 Sep 2013, [hidden email] wrote:

            A new version of Collections was added to project The Inbox:
            http://source.squeak.org/inbox/Collections-nice.530.mcz

            ==================== Summary ====================

            Name: Collections-nice.530
            Author: nice
            Time: 4 September 2013, 2:21:28.852 am
            UUID: d732aa23-1c77-4c1e-abc0-d8d4506b7f9f
            Ancestors: Collections-ul.529

            Fix this:
             self assert: (allSymbols select: [:s | s = '+']) size = 1.

            1) It is un-necessary to enumerate the 1-char symbols, they already are in SymbolTable (the first time they were created via #findIntern: or by virtue of last #rehash) !

            2) We can use a WeekArray rather than an Array because there is no point in keeping a reference to unused 1-char symbols.

            Maybe we should get rid of OneCharacterSymbols class var, but I didn't dare...


I say we should nuke it. The only user is Symbol class >> #selectorsContaining:, but that method is pretty much broken:
- it uses OneCharacterSymbols only for infix selectors
- it can't find infix selectors longer than 2 characters
- it tries to optimize some stuff, but wastes cycles on other stuff


Levente


      =============== Diff against Collections-ul.529 ===============

      Item was changed:
       ----- Method: Symbol class>>allSymbols (in category 'access') -----
       allSymbols
              "Answer all interned symbols"
              ^Array streamContents:[:s|
                      s nextPutAll: NewSymbols.
      -               s nextPutAll: OneCharacterSymbols.
                      s nextPutAll: SymbolTable.
              ].
       !

      Item was changed:
       ----- Method: Symbol class>>initialize (in category 'class initialization') -----
       initialize

              "Symbol initialize"

              Symbol rehash.
      +       OneCharacterSymbols := WeakArray new: 256.
      +       (0 to: 255) do: [ :byte | byte asCharacter asSymbol].
      -       OneCharacterSymbols := nil.
      -       OneCharacterSymbols := (1 to: 256) collect: [ :i | (i - 1) asCharacter asSymbol].
              Smalltalk addToShutDownList: self.
       !

      Item was changed:
       ----- Method: Symbol class>>internCharacter: (in category 'instance creation') -----
       internCharacter: aCharacter
              aCharacter asciiValue > 256 ifTrue:[^self intern: aCharacter asString].
      +       ^(OneCharacterSymbols at: aCharacter asciiValue + 1) ifNil: [OneCharacterSymbols at: aCharacter asciiValue + 1 put: (self intern: aCharacter asString)]
      -       OneCharacterSymbols ifNil: [^self intern: aCharacter asString].
      -       ^OneCharacterSymbols at: aCharacter asciiValue + 1
       !

      Item was changed:
       ----- Method: Symbol class>>selectorsContaining: (in category 'access') -----
       selectorsContaining: aString
              "Answer a list of selectors that contain aString within them. Case-insensitive.  Does return symbols that begin with a capital letter."

              | size selectorList ascii |

              selectorList := OrderedCollection new.
              (size := aString size) = 0 ifTrue: [^selectorList].

              aString size = 1 ifTrue:
                      [
                              ascii := aString first asciiValue.
      +                       ascii < 128 ifTrue: [(OneCharacterSymbols at: ascii+1) ifNotNil: [:s | selectorList add: s]]
      -                       ascii < 128 ifTrue: [selectorList add: (OneCharacterSymbols at: ascii+1)]
                      ].

              (aString first isLetter or: [aString first isDigit]) ifFalse:
                      [
                              aString size = 2 ifTrue:
                                      [Symbol hasInterned: aString ifTrue:
                                              [:s | selectorList add: s]].
                              ^selectorList
                      ].

              selectorList := selectorList copyFrom: 2 to: selectorList size.

              self allSymbolTablesDo: [:each |
                      each size >= size ifTrue:
                              [(each findSubstring: aString in: each startingAt: 1
                                      matchTable: CaseInsensitiveOrder) > 0
                                                      ifTrue: [selectorList add: each]]].

              ^selectorList reject: [:each | "reject non-selectors, but keep ones that begin with an uppercase"
                      each numArgs < 0 and: [each asString withFirstCharacterDownshifted numArgs < 0]].

       "Symbol selectorsContaining: 'scon'"!

























Reply | Threaded
Open this post in threaded view
|

Re: Removing OneCharacterSymbols (was: Re: [squeak-dev] The Inbox: Collections-nice.530.mcz)

Bert Freudenberg
On 2013-09-11, at 23:14, Nicolas Cellier <[hidden email]> wrote:

After verification, xColon is not so rare, it is invoked for every block argument...
But we can safely replace (token := self step asSymbol) with (token := String with: self step), no problem.

We can even change it to (token := self step) if we are willing to remove last scories of alternate selectors beginning with $:
Indeed, #: and #:x are obsolete squeakism which would not compile in other dialects (#':' and #':x' is the universally correct syntax).

Well, all Squeak-derived dialects still do support it. Pharo changed the printString of #: to #':', while Squeak and Cuis still print #:. And they all accept #:, #:x, and #:x:::.

Removing this feature consists in
- removing #colon reference in xLitQuote scanLitWord scanLitVec scanAllTokenPosition
- removing asSymbol send in xColon.

+ changing printing for Symbol

But that would imply that methods with #: or #:x literals would not compile anymore
And that #( :x ) would result in #( #':' #x ) instead of #( #':x' ).
After my last hacking this would be a safe change in trunk (at least Compiler recompileAll works) and I'm all for it,
... but who knows what could happen to exotic third-party code.
Having code load request resulting in a SyntaxError popping up is just one of the most annoying thing...
And having interpretation of a literal changing without any sort of indication is not much better!

#: would still be supported with (Scanner prefAllowUnicharSymbol: true) but not #:x

What else would the pref allow? #: is the only thing I've seen "in the wild" (occurring in Etoys, OMeta, and DrGeo)

Maybe we could make it a preference too for smoother migration?
Or some sort of disambiguating request if interactive and conservative behavior if not, like I did for 1@-2?

That would seem like the user-friendliest option, better than a preference IMHO. 

- Bert -

Thoughts?

2013/9/11 Nicolas Cellier <[hidden email]>
Ah right, thanks Bert.
I think that the send of xColon is rare anyway (assignment cases := apart, in which case we do not send asSymbol)

#':' is not referred directly in the Compiler stuff:
Let's see: (SystemNavigation default browseAllCallsOn: #':') -> Oops yet another bug
See fix attached

Though I think asSymbol is necessary given implementation of scanLitWord, I'll look deeper later




2013/9/11 Bert Freudenberg <[hidden email]>
Another sender used to be #xColon. Not sure if we need to worry about that one.

CharAsSymbolSenders sortedCounts
==> {6474->#scanToken . 141->#xColon}

(this was from before your compiler changes)

- Bert -

On 2013-09-06, at 01:58, Nicolas Cellier <[hidden email]> wrote:

Thanks Bert, I just published a Compiler version that does not...


2013/9/5 Bert Freudenberg <[hidden email]>

On 2013-09-04, at 21:41, Nicolas Cellier <[hidden email]> wrote:

A rapid review of senders indicates that Character asSymbol seems rather rare in a trunk image.

I found
#initTextConstants
#inOutdent:delta: (Character tab asSymbol asText which is rather crooked way of avoiding an allocation - String tab would allocate a new Object).

Scanner>>scanToken appears to be the most often used sender, and may well be the original reason for caching the 1-char symbols.

It might be a good idea to time compilation of complex methods with and without OneCharacterSymbols. (my guess is it wouldn't make much of a difference these days)

- Bert -




2013/9/4 Levente Uzonyi <[hidden email]>
There's another user of OneCharacterSymbols: Character >> #asSymbol. I don't know how often it's used (my guess is that it's rarely used), or how much slower it would be, if we were removing OneCharacterSymbols. The implementation would change from

        ^Symbol internCharacter: self

to

        ^Symbol intern: self asString

Which means that an extra String will be created, and the lookup is done in the WeakSets, instead of an Array/WeakArray.


Levente

On Wed, 4 Sep 2013, Nicolas Cellier wrote:

Well, as you were the last modifier, I prefer to have your advice :)

Nicolas


2013/9/4 Levente Uzonyi <[hidden email]>
      On Wed, 4 Sep 2013, [hidden email] wrote:

            A new version of Collections was added to project The Inbox:
            http://source.squeak.org/inbox/Collections-nice.530.mcz

            ==================== Summary ====================

            Name: Collections-nice.530
            Author: nice
            Time: 4 September 2013, 2:21:28.852 am
            UUID: d732aa23-1c77-4c1e-abc0-d8d4506b7f9f
            Ancestors: Collections-ul.529

            Fix this:
             self assert: (allSymbols select: [:s | s = '+']) size = 1.

            1) It is un-necessary to enumerate the 1-char symbols, they already are in SymbolTable (the first time they were created via #findIntern: or by virtue of last #rehash) !

            2) We can use a WeekArray rather than an Array because there is no point in keeping a reference to unused 1-char symbols.

            Maybe we should get rid of OneCharacterSymbols class var, but I didn't dare...


I say we should nuke it. The only user is Symbol class >> #selectorsContaining:, but that method is pretty much broken:
- it uses OneCharacterSymbols only for infix selectors
- it can't find infix selectors longer than 2 characters
- it tries to optimize some stuff, but wastes cycles on other stuff


Levente


      =============== Diff against Collections-ul.529 ===============

      Item was changed:
       ----- Method: Symbol class>>allSymbols (in category 'access') -----
       allSymbols
              "Answer all interned symbols"
              ^Array streamContents:[:s|
                      s nextPutAll: NewSymbols.
      -               s nextPutAll: OneCharacterSymbols.
                      s nextPutAll: SymbolTable.
              ].
       !

      Item was changed:
       ----- Method: Symbol class>>initialize (in category 'class initialization') -----
       initialize

              "Symbol initialize"

              Symbol rehash.
      +       OneCharacterSymbols := WeakArray new: 256.
      +       (0 to: 255) do: [ :byte | byte asCharacter asSymbol].
      -       OneCharacterSymbols := nil.
      -       OneCharacterSymbols := (1 to: 256) collect: [ :i | (i - 1) asCharacter asSymbol].
              Smalltalk addToShutDownList: self.
       !

      Item was changed:
       ----- Method: Symbol class>>internCharacter: (in category 'instance creation') -----
       internCharacter: aCharacter
              aCharacter asciiValue > 256 ifTrue:[^self intern: aCharacter asString].
      +       ^(OneCharacterSymbols at: aCharacter asciiValue + 1) ifNil: [OneCharacterSymbols at: aCharacter asciiValue + 1 put: (self intern: aCharacter asString)]
      -       OneCharacterSymbols ifNil: [^self intern: aCharacter asString].
      -       ^OneCharacterSymbols at: aCharacter asciiValue + 1
       !

      Item was changed:
       ----- Method: Symbol class>>selectorsContaining: (in category 'access') -----
       selectorsContaining: aString
              "Answer a list of selectors that contain aString within them. Case-insensitive.  Does return symbols that begin with a capital letter."

              | size selectorList ascii |

              selectorList := OrderedCollection new.
              (size := aString size) = 0 ifTrue: [^selectorList].

              aString size = 1 ifTrue:
                      [
                              ascii := aString first asciiValue.
      +                       ascii < 128 ifTrue: [(OneCharacterSymbols at: ascii+1) ifNotNil: [:s | selectorList add: s]]
      -                       ascii < 128 ifTrue: [selectorList add: (OneCharacterSymbols at: ascii+1)]
                      ].

              (aString first isLetter or: [aString first isDigit]) ifFalse:
                      [
                              aString size = 2 ifTrue:
                                      [Symbol hasInterned: aString ifTrue:
                                              [:s | selectorList add: s]].
                              ^selectorList
                      ].

              selectorList := selectorList copyFrom: 2 to: selectorList size.

              self allSymbolTablesDo: [:each |
                      each size >= size ifTrue:
                              [(each findSubstring: aString in: each startingAt: 1
                                      matchTable: CaseInsensitiveOrder) > 0
                                                      ifTrue: [selectorList add: each]]].

              ^selectorList reject: [:each | "reject non-selectors, but keep ones that begin with an uppercase"
                      each numArgs < 0 and: [each asString withFirstCharacterDownshifted numArgs < 0]].

       "Symbol selectorsContaining: 'scon'"!



























Reply | Threaded
Open this post in threaded view
|

Re: Removing OneCharacterSymbols (was: Re: [squeak-dev] The Inbox: Collections-nice.530.mcz)

Levente Uzonyi-2
On Thu, 12 Sep 2013, Bert Freudenberg wrote:

> On 2013-09-11, at 23:14, Nicolas Cellier <[hidden email]> wrote:
>
>       After verification, xColon is not so rare, it is invoked for every block argument...
>       But we can safely replace (token := self step asSymbol) with (token := String with: self step), no problem.
>
> We can even change it to (token := self step) if we are willing to remove last scories of alternate selectors beginning with $:
> Indeed, #: and #:x are obsolete squeakism which would not compile in other dialects (#':' and #':x' is the universally correct syntax).
>
>
> Well, all Squeak-derived dialects still do support it. Pharo changed the printString of #: to #':', while Squeak and Cuis still print #:. And they all accept #:, #:x, and #:x:::.

You mean you can still use #: or #:x as a selector in Squeak, Cuis or
Pharo? I'm pretty sure that was removed long ago from Squeak, and I doubt
the other dialects support them.


Levente

Reply | Threaded
Open this post in threaded view
|

Re: Removing OneCharacterSymbols (was: Re: [squeak-dev] The Inbox: Collections-nice.530.mcz)

Bert Freudenberg

On 2013-09-12, at 23:03, Levente Uzonyi <[hidden email]> wrote:

> On Thu, 12 Sep 2013, Bert Freudenberg wrote:
>
>> On 2013-09-11, at 23:14, Nicolas Cellier <[hidden email]> wrote:
>>
>>>      After verification, xColon is not so rare, it is invoked for every block argument...
>>>      But we can safely replace (token := self step asSymbol) with (token := String with: self step), no problem.
>>> We can even change it to (token := self step) if we are willing to remove last scories of alternate selectors beginning with $:
>>> Indeed, #: and #:x are obsolete squeakism which would not compile in other dialects (#':' and #':x' is the universally correct syntax).
>> Well, all Squeak-derived dialects still do support it. Pharo changed the printString of #: to #':', while Squeak and Cuis still print #:. And they all accept #:, #:x, and #:x:::.
>
> You mean you can still use #: or #:x as a selector in Squeak, Cuis or Pharo? I'm pretty sure that was removed long ago from Squeak, and I doubt the other dialects support them.


No, I mean they recognize the symbol syntax.

- Bert -



Reply | Threaded
Open this post in threaded view
|

Re: Removing OneCharacterSymbols (was: Re: [squeak-dev] The Inbox: Collections-nice.530.mcz)

Nicolas Cellier
Well, Squeak still accept #: but it already prints #':' because we refactored Scanner class>>isLiteralSymbol: some years ago
We also changed String>>numArgs so that #':if:then:else:' numArgs = -1
It would be logical to finish the clean-up.
Only the possibility to load historical code should retain us if it really matters
(but it's hard to tell without analyzing large code bases).

That's one thing we should always remember: adding a small extension to the syntax is very easy to do.
Removing it is much much harder (for social reasons).

Pharo 3.0 does not accept #: #:x anymore since OpalCompiler switch.


2013/9/12 Bert Freudenberg <[hidden email]>

On 2013-09-12, at 23:03, Levente Uzonyi <[hidden email]> wrote:

> On Thu, 12 Sep 2013, Bert Freudenberg wrote:
>
>> On 2013-09-11, at 23:14, Nicolas Cellier <[hidden email]> wrote:
>>
>>>      After verification, xColon is not so rare, it is invoked for every block argument...
>>>      But we can safely replace (token := self step asSymbol) with (token := String with: self step), no problem.
>>> We can even change it to (token := self step) if we are willing to remove last scories of alternate selectors beginning with $:
>>> Indeed, #: and #:x are obsolete squeakism which would not compile in other dialects (#':' and #':x' is the universally correct syntax).
>> Well, all Squeak-derived dialects still do support it. Pharo changed the printString of #: to #':', while Squeak and Cuis still print #:. And they all accept #:, #:x, and #:x:::.
>
> You mean you can still use #: or #:x as a selector in Squeak, Cuis or Pharo? I'm pretty sure that was removed long ago from Squeak, and I doubt the other dialects support them.


No, I mean they recognize the symbol syntax.

- Bert -






Reply | Threaded
Open this post in threaded view
|

Re: Removing OneCharacterSymbols (was: Re: [squeak-dev] The Inbox: Collections-nice.530.mcz)

Bert Freudenberg
On 2013-09-13, at 08:40, Nicolas Cellier <[hidden email]> wrote:

Well, Squeak still accept #: but it already prints #':' because we refactored Scanner class>>isLiteralSymbol: some years ago

Ah, right.

We also changed String>>numArgs so that #':if:then:else:' numArgs = -1
It would be logical to finish the clean-up.
Only the possibility to load historical code should retain us if it really matters
(but it's hard to tell without analyzing large code bases).

That's one thing we should always remember: adding a small extension to the syntax is very easy to do.
Removing it is much much harder (for social reasons).

Pharo 3.0 does not accept #: #:x anymore since OpalCompiler switch.

... but it's not released yet.

Anyway, it looks like we can pretty safely remove it. In particular if the syntax error notifier lets us easily fix an occurrence when trying to load older code - does it?

- Bert -

2013/9/12 Bert Freudenberg <[hidden email]>

On 2013-09-12, at 23:03, Levente Uzonyi <[hidden email]> wrote:

> On Thu, 12 Sep 2013, Bert Freudenberg wrote:
>
>> On 2013-09-11, at 23:14, Nicolas Cellier <[hidden email]> wrote:
>>
>>>      After verification, xColon is not so rare, it is invoked for every block argument...
>>>      But we can safely replace (token := self step asSymbol) with (token := String with: self step), no problem.
>>> We can even change it to (token := self step) if we are willing to remove last scories of alternate selectors beginning with $:
>>> Indeed, #: and #:x are obsolete squeakism which would not compile in other dialects (#':' and #':x' is the universally correct syntax).
>> Well, all Squeak-derived dialects still do support it. Pharo changed the printString of #: to #':', while Squeak and Cuis still print #:. And they all accept #:, #:x, and #:x:::.
>
> You mean you can still use #: or #:x as a selector in Squeak, Cuis or Pharo? I'm pretty sure that was removed long ago from Squeak, and I doubt the other dialects support them.


No, I mean they recognize the symbol syntax.

- Bert -








12