The Trunk: Regex-Core-ul.52.mcz

commits-2
Levente Uzonyi uploaded a new version of Regex-Core to project The Trunk:
http://source.squeak.org/trunk/Regex-Core-ul.52.mcz

==================== Summary ====================

Name: Regex-Core-ul.52
Author: ul
Time: 20 May 2016, 1:02:48.025069 am
UUID: 25d06172-cbd2-4454-a5e7-0dcfc27525b4
Ancestors: Regex-Core-pre.51

RxmLink changes:
- implemented #copyChain, #copyUsing: and #postCopyUsing: to create a copy of the matcher chain without creating duplicates of links being referenced from more than one place
- implemented missing #postCopy methods
- removed unused variables from subclasses

- use #copyChain instead of #veryDeepCopy in RxMatcher >> #makeQuantified:min:max: to avoid unnecessary duplication of non-link objects
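
A small workspace sketch of the sharing behaviour described above, assuming Regex-Core-ul.52 or later is loaded. The #next: and #alternative: setters are assumed to be available on RxmBranch, and the instance variables are read back with #instVarNamed: only to avoid relying on getters; none of this is part of the commit itself.

```smalltalk
"Workspace sketch, not part of the commit. One link is referenced from two places
 (next and alternative) of the same branch, then the chain is copied."
| shared branch copiedBranch viaNext viaAlternative |
shared := RxmLink new.
branch := RxmBranch new.
branch next: shared.           "assumed setter"
branch alternative: shared.    "assumed setter"

copiedBranch := branch copyChain.
viaNext := copiedBranch instVarNamed: 'next'.
viaAlternative := copiedBranch instVarNamed: 'alternative'.

"Both references in the copy should lead to one and the same copied link..."
Transcript show: (viaNext == viaAlternative) printString; cr.
"...and that link should be a new object rather than the original."
Transcript show: (viaNext ~~ shared) printString; cr.
```

The IdentityDictionary passed around by #copyUsing: is what lets the second reference find the copy made for the first one instead of creating another.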

=============== Diff against Regex-Core-pre.51 ===============

Item was changed:
  ----- Method: RxMatcher>>makeQuantified:min:max: (in category 'private') -----
  makeQuantified: anRxmLink min: min max: max
        "Perform recursive poor-man's transformation of the {<min>,<max>} quantifiers."
        | aMatcher |
 
        "<atom>{,<max>}       ==>  (<atom>{1,<max>})?"
        min = 0 ifTrue: [
                ^ self makeOptional: (self makeQuantified: anRxmLink min: 1 max: max) ].
 
        "<atom>{<min>,}       ==>  <atom>{<min>-1, <min>-1}<atom>+"
        max ifNil: [
+               ^ (self makeQuantified: anRxmLink min: 1 max: min-1) pointTailTo: (self makePlus: anRxmLink copyChain) ].
-               ^ (self makeQuantified: anRxmLink min: 1 max: min-1) pointTailTo: (self makePlus: anRxmLink veryDeepCopy) ].
 
        "<atom>{<max>,<max>}  ==>  <atom><atom> ... <atom>"
        min = max
                ifTrue: [
+                       aMatcher := anRxmLink copyChain.
+                       (min-1) timesRepeat: [ aMatcher pointTailTo: anRxmLink copyChain ].
-                       aMatcher := anRxmLink veryDeepCopy.
-                       (min-1) timesRepeat: [ aMatcher pointTailTo: anRxmLink veryDeepCopy ].
                        ^ aMatcher ].
 
        "<atom>{<min>,<max>}  ==>  <atom>{<min>,<min>}(<atom>{1,<max>-1})?"
+       aMatcher := self makeOptional: anRxmLink copyChain.
-       aMatcher := self makeOptional: anRxmLink veryDeepCopy.
        (max - min - 1) timesRepeat: [
+               aMatcher := self makeOptional: (anRxmLink copyChain pointTailTo: aMatcher) ].
-               aMatcher := self makeOptional: (anRxmLink veryDeepCopy pointTailTo: aMatcher) ].
        ^ (self makeQuantified: anRxmLink min: min max: min) pointTailTo: aMatcher!

Item was added:
+ ----- Method: RxmBranch>>postCopyUsing: (in category 'copying') -----
+ postCopyUsing: anIdentityDictionary
+
+       super postCopyUsing: anIdentityDictionary.
+       alternative ifNotNil: [
+               alternative := alternative copyUsing: anIdentityDictionary ]!

Item was added:
+ ----- Method: RxmLink>>copyChain (in category 'copying') -----
+ copyChain
+       "Create a full copy of all the links in this chain, including branches, while letting them share and reuse non-link objects as much as possible."
+
+       ^self copyUsing: IdentityDictionary new!

Item was added:
+ ----- Method: RxmLink>>copyUsing: (in category 'copying') -----
+ copyUsing: anIdentityDictionary
+       "Copy the receiver if it's not present in the argument dictionary, or just return the previously made copy. The rest of the object graph will be copied by #postCopyUsing:."
+
+       ^anIdentityDictionary
+               at: self
+               ifAbsent: [
+                       "It may be tempting to use #at:ifAbsentPut: instead, but the argument block must not modify the receiver, so that wouldn't work."
+                       anIdentityDictionary
+                               at: self
+                               put: (self shallowCopy
+                                       postCopyUsing: anIdentityDictionary;
+                                       yourself) ]!

Item was added:
+ ----- Method: RxmLink>>postCopyUsing: (in category 'copying') -----
+ postCopyUsing: anIdentityDictionary
+       "Copy the rest of the chain the same way as it's done in #copyUsing:."
+
+       next ifNotNil: [
+               next := next copyUsing: anIdentityDictionary ]!

Item was changed:
  RxmLink subclass: #RxmLookahead
+       instanceVariableNames: 'lookahead'
-       instanceVariableNames: 'lookahead positive'
        classVariableNames: ''
        poolDictionaries: ''
        category: 'Regex-Core'!
 
  !RxmLookahead commentStamp: '<historical>' prior: 0!
  Instance holds onto a lookead which matches but does not consume anything.
 
  Instance variables:
        predicate               <RxmLink>!

Item was removed:
- ----- Method: RxmLookahead>>initialize (in category 'initialization') -----
- initialize
-       super initialize.
-       positive := true.!

Item was added:
+ ----- Method: RxmLookahead>>postCopy (in category 'copying') -----
+ postCopy
+
+       super postCopy.
+       lookahead := lookahead copy!

Item was added:
+ ----- Method: RxmLookahead>>postCopyUsing: (in category 'copying') -----
+ postCopyUsing: anIdentityDictionary
+
+       super postCopyUsing: anIdentityDictionary.
+       lookahead := lookahead copyUsing: anIdentityDictionary!

Item was changed:
  RxmLink subclass: #RxmSubstring
+       instanceVariableNames: 'sampleStream ignoreCase'
-       instanceVariableNames: 'sampleStream caseSensitive ignoreCase'
        classVariableNames: ''
        poolDictionaries: ''
        category: 'Regex-Core'!
 
  !RxmSubstring commentStamp: 'Tbn 11/12/2010 23:14' prior: 0!
  -- Regular Expression Matcher v 1.1 (C) 1996, 1999 Vassili Bykov
  --
  Instance holds onto a string and matches exactly this string, and exactly once.
 
  Instance variables:
        string  <String>!

Item was added:
+ ----- Method: RxmSubstring>>postCopy (in category 'copying') -----
+ postCopy
+
+       super postCopy.
+       sampleStream := sampleStream copy!

Item was added:
+ ----- Method: RxmSubstring>>postCopyUsing: (in category 'copying') -----
+ postCopyUsing: anIdentityDictionary
+
+       super postCopyUsing: anIdentityDictionary.
+       sampleStream := sampleStream copy!



Re: The Trunk: Regex-Core-ul.52.mcz

Nicolas Cellier
+1

I presume copyChain is a bit more expensive than copy (I didn’t measure it though). More expensive but always correct.


If the inexpensive copy is used extensively, and my suggestion is a performance killer, then let’s keep the current solution: cleverly use the expensive copy exactly where we now we are going to need it. Otherwise, I suggest renaming copyChain to copy, which would give us something much more robust.
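
A minimal sketch of what that renaming could amount to, assuming no sender in the package depends on the cheaper, non-chain #copy of a link (an assumption that would need checking). The override below is hypothetical and is not part of Regex-Core-ul.52:

```smalltalk
RxmLink >> copy
	"Hypothetical override, not in the commit: make the plain #copy chain-aware
	 by delegating to #copyChain, so callers cannot pick the unsafe variant by accident."

	^self copyChain
```

With such an override, #copy on a link would no longer go through #shallowCopy followed by #postCopy, so the #postCopy methods added in this commit would only matter for code that calls them directly.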



Re: The Trunk: Regex-Core-ul.52.mcz

Nicolas Cellier


where we know... Gah