The Inbox: Regex-Core-ul.57.mcz

commits-2
Levente Uzonyi uploaded a new version of Regex-Core to project The Inbox:
http://source.squeak.org/inbox/Regex-Core-ul.57.mcz

==================== Summary ====================

Name: Regex-Core-ul.57
Author: ul
Time: 11 April 2020, 9:42:43.405809 pm
UUID: 3cda3ad5-49f2-41f1-ab9d-2f6fc242ae9d
Ancestors: Regex-Core-ct.56

- avoid generating RxmBranch nodes when there's no real branch. e.g. 'abc' asRegex

=============== Diff against Regex-Core-ct.56 ===============

Item was changed:
  ----- Method: RxMatcher>>hookBranchOf:onto: (in category 'private') -----
  hookBranchOf: regexNode onto: endMarker
         "Private - Recurse down the chain of regexes starting at
         regexNode, compiling their branches and hooking their tails
         to the endMarker node."
  
+        ^regexNode regex
+                ifNil: [ "Avoid creating a branch without an alternative."
+                        ^(regexNode branch dispatchTo: self)
+                                pointTailTo: endMarker;
+                                yourself ]
+                ifNotNil: [ :regex |
+                        | rest |
+                        rest := self hookBranchOf: regex onto: endMarker.
+                        ^RxmBranch new
+                                next: ((regexNode branch dispatchTo: self)
-        | rest |
-        rest := regexNode regex ifNotNil: [ :regex |
-                self hookBranchOf: regex onto: endMarker ].
-        ^RxmBranch new
-                next: ((regexNode branch dispatchTo: self)
                                         pointTailTo: endMarker;
                                         yourself);
+                                alternative: rest;
+                                yourself ]
+ !
-                alternative: rest;
-                yourself!
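
To make the summary line concrete, here is a small workspace sketch. It assumes, as the ifNil: branch in the diff suggests, that regexNode regex is nil when the pattern contains no | alternation. The variable names are only for illustration. Matching results should be the same before and after the change; only the compiled node chain differs.

```smalltalk
| plain alternation |
plain := 'abc' asRegex.           "no | in the pattern: nothing to branch on"
alternation := 'abc|def' asRegex. "a real branch: an RxmBranch is still needed"
plain matches: 'abc'.             "expected: true"
alternation matches: 'def'.       "expected: true"
alternation matches: 'abd'.       "expected: false"
```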


Re: The Inbox: Regex-Core-ul.57.mcz

Christoph Thiede

Great idea! Are there any visible performance improvements? In any case, this change makes it easier to explore and debug the matching process!


FYI, you left three return carets in this method. :-)


Best,
Christoph


Carpe Squeak!
Re: The Inbox: Regex-Core-ul.57.mcz

Levente Uzonyi
Hi Christoph,

On Sat, 11 Apr 2020, Thiede, Christoph wrote:

>
> Great idea! Are there any visible performance improvements? In any case, this change makes it easier to explore and debug the matching process!

There's a ~5% speedup with the simple test case I was benchmarking.
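
The test case itself isn't given in this thread. As a rough sketch of how such a comparison could be run in a workspace (the pattern and subject string here are assumptions, not the actual benchmark), evaluate each bench line with "print it" before and after loading the new version:

```smalltalk
| matcher |
matcher := 'abc' asRegex.

"Compile time: this is where hookBranchOf:onto: runs."
[ 'abc' asRegex ] bench.

"Match time: one RxmBranch node fewer to walk through per match."
[ matcher matches: 'abc' ] bench.
```

bench reports runs per second, so numbers are only comparable when taken on the same machine and image.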

>
>
> FYI, you left three return carets in this method. :-)

Right. I wanted to move the return instruction outside but forgot to
remove the inner ones.
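
For reference, a sketch of how the method might look with only the outer return kept. This is an illustration of the cleanup discussed above, not the version uploaded as Regex-Core-ul.57:

```smalltalk
hookBranchOf: regexNode onto: endMarker
	"Private - Recurse down the chain of regexes starting at
	regexNode, compiling their branches and hooking their tails
	to the endMarker node."

	^regexNode regex
		ifNil: [ "Avoid creating a branch without an alternative."
			(regexNode branch dispatchTo: self)
				pointTailTo: endMarker;
				yourself ]
		ifNotNil: [ :regex |
			| rest |
			rest := self hookBranchOf: regex onto: endMarker.
			RxmBranch new
				next: ((regexNode branch dispatchTo: self)
					pointTailTo: endMarker;
					yourself);
				alternative: rest;
				yourself ]
```

Since ifNil:ifNotNil: answers the value of whichever block it evaluates, the single outer caret returns the same object as the inner carets did.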


Levente

Re: The Inbox: Regex-Core-ul.57.mcz

marcel.taeumel
+1 :-)

Best,
Marcel
