[bug] regex doesn't support i (ignorecase) flag


[bug] regex doesn't support i (ignorecase) flag

S11001001
Issue status update for
http://smalltalk.gnu.org/node/85
Post a follow up:
http://smalltalk.gnu.org/project/comments/add/85

 Project:      GNU Smalltalk
 Version:      <none>
 Component:    VM
 Category:     bug reports
 Priority:     normal
 Assigned to:  Unassigned
 Reported by:  S11001001
 Updated by:   S11001001
 Status:       patch
 Attachment:   http://smalltalk.gnu.org/files/issues/latin1-re-ignorecase.patch (2.58 KB)

Example:


st> ('a' =~ '(?i:A)') inspect!
An instance of Kernel.FailedMatchRegexResults

I found that this is because pre_set_casetable in lib-src/regex.c is
never called.  This is fixed in
*[hidden email]--2007-nocandy/smalltalk--backstage--2.2--patch-62*,
"support (?i:...) in regexps".


st> ('a' =~ '(?i:A)') inspect!
An instance of Kernel.MatchingRegexResults
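
To make the failure mode concrete, here is a minimal sketch of how a byte-wise matcher can silently ignore the case-insensitive flag when its fold table was never installed. This is not the code in lib-src/regex.c; `case_table`, `set_case_table`, `build_ascii_table` and `chars_equal` are hypothetical names chosen for illustration, and the assumption that the matcher skips folding when no table is set is mine.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the table a matcher consults when folding case.
   It starts out unset, as it would if the setup routine is never called. */
static const unsigned char *case_table = NULL;

/* Hypothetical setup routine: install a 256-entry fold table. */
void set_case_table(const unsigned char *table)
{
    case_table = table;
}

/* Build an ASCII-only fold table: A-Z map to a-z, everything else maps
   to itself. */
void build_ascii_table(unsigned char table[256])
{
    assert(table != NULL);
    for (int c = 0; c < 256; c++)
        table[c] = (unsigned char) c;
    for (int c = 'A'; c <= 'Z'; c++)
        table[c] = (unsigned char) (c + ('a' - 'A'));
}

/* Compare two bytes, folding case only if asked to AND a table is installed.
   With no table, ignore_case has no effect (the assumed failure mode). */
int chars_equal(unsigned char a, unsigned char b, int ignore_case)
{
    if (ignore_case && case_table != NULL) {
        a = case_table[a];
        b = case_table[b];
    }
    return a == b;
}
```

Under these assumptions, `chars_equal('a', 'A', 1)` is false until `set_case_table` has been called with a populated table, which mirrors the before/after transcripts above.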


There are multiple solution paths, because case folding is
charset-dependent.  The patch implements the third option below,
Latin-1:


    *  Always import I18N and use the locale database to determine the
       charset of Strings.  I'm not sure what the exact semantics of this
       would be.
    *  Assume ASCII.  regex.c already effectively assumes that strings
       are somewhat ASCII-compatible, and this wouldn't bias in favor of a
       particular ASCII superset.
    *  Assume Latin-1.  This has the benefit of offering a clear
       behavior path to future support for matching full Unicode strings, so
       it's what the patch uses.
    *  Assume Latin-9.  Technically this supersedes Latin-1, so is more
       up-to-date, but is not a codepoint-wise subset of Unicode.
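
For comparison, the ASCII and Latin-1 options can be sketched as two 256-entry downcase tables. This is only an illustration of the difference between the options, not the contents of either attached patch; the function names `build_ascii_downcase` and `build_latin1_downcase` are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* ASCII option: fold only A-Z to a-z.  Bytes 0x80-0xFF map to themselves,
   so no particular ASCII superset is favoured. */
void build_ascii_downcase(unsigned char table[256])
{
    assert(table != NULL);
    for (int c = 0; c < 256; c++)
        table[c] = (unsigned char) c;
    for (int c = 'A'; c <= 'Z'; c++)
        table[c] = (unsigned char) (c + ('a' - 'A'));
}

/* Latin-1 option: the ASCII folds plus the Latin-1 uppercase letters
   0xC0-0xDE, which sit 0x20 below their lowercase forms 0xE0-0xFE.
   0xD7 is the multiplication sign, not a letter, so it is left alone. */
void build_latin1_downcase(unsigned char table[256])
{
    build_ascii_downcase(table);
    for (int c = 0xC0; c <= 0xDE; c++)
        if (c != 0xD7)
            table[c] = (unsigned char) (c + 0x20);
}
```

The two tables agree on every byte below 0x80. They differ only on the high half: for example, byte 0xC9 stays 0xC9 in the ASCII table but becomes 0xE9 in the Latin-1 table.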



_______________________________________________
help-smalltalk mailing list
[hidden email]
http://lists.gnu.org/mailman/listinfo/help-smalltalk

[bug] regex doesn't support i (ignorecase) flag

Paolo Bonzini
Issue status update for
http://smalltalk.gnu.org/project/issue/85
Post a follow up:
http://smalltalk.gnu.org/project/comments/add/85

 Project:      GNU Smalltalk
 Version:      <none>
 Component:    VM
 Category:     bug reports
 Priority:     normal
 Assigned to:  Unassigned
 Reported by:  S11001001
 Updated by:   bonzinip
 Status:       patch

>    *  Assume ASCII.  regex.c already effectively assumes that strings
> are somewhat ASCII-compatible, and this wouldn't bias in favor of a
> particular ASCII superset.

I believe this is the best.

>    *  Assume Latin-1.  This has the benefit of offering a clear
> behavior path to future support for matching full Unicode strings, so
> it's what the patch uses.

Not really, because UTF-8 is not a superset of Latin-1.  All of eastern
Europe, plus Greece, plus most of Africa/Asia/Australia do not use
Latin-1.
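
The UTF-8 point can be illustrated with a small sketch: applying a Latin-1 fold byte by byte to UTF-8 data rewrites lead bytes of multi-byte sequences, while an ASCII-only fold leaves them untouched. The helper names below are hypothetical and are not taken from regex.c or from either patch.

```c
#include <assert.h>
#include <stddef.h>

/* ASCII-only fold of a single byte. */
unsigned char ascii_fold_byte(unsigned char c)
{
    if (c >= 'A' && c <= 'Z')
        return (unsigned char) (c + 0x20);
    return c;
}

/* Latin-1 fold of a single byte: ASCII letters plus 0xC0-0xDE (except the
   multiplication sign 0xD7). */
unsigned char latin1_fold_byte(unsigned char c)
{
    if (c >= 'A' && c <= 'Z')
        return (unsigned char) (c + 0x20);
    if (c >= 0xC0 && c <= 0xDE && c != 0xD7)
        return (unsigned char) (c + 0x20);
    return c;
}

/* Apply a per-byte fold in place, with no knowledge of the encoding. */
void fold_bytes(unsigned char *buf, size_t len,
                unsigned char (*fold)(unsigned char))
{
    assert(buf != NULL && fold != NULL);
    for (size_t i = 0; i < len; i++)
        buf[i] = fold(buf[i]);
}
```

For instance, U+00C9 (capital E with acute) is encoded in UTF-8 as the bytes C3 89, and its lowercase form U+00E9 as C3 A9. Running `fold_bytes` with `latin1_fold_byte` over C3 89 gives E3 89, which is neither of those, whereas `ascii_fold_byte` leaves C3 89 unchanged.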

If you don't mind some conflicts, I can adapt the patch you attached.
Otherwise, you can do the change yourself and I'll cherrypick both
changesets.





Re: [bug] regex doesn't support i (ignorecase) flag

S11001001
In reply to this post by S11001001
Issue status update for
http://smalltalk.gnu.org/project/issue/85
Post a follow up:
http://smalltalk.gnu.org/project/comments/add/85

 Project:      GNU Smalltalk
 Version:      <none>
 Component:    VM
 Category:     bug reports
 Priority:     normal
 Assigned to:  Unassigned
 Reported by:  S11001001
 Updated by:   S11001001
 Status:       patch
 Attachment:   http://smalltalk.gnu.org/files/issues/ascii-re-ignorecase.patch (1.39 KB)

*smalltalk--backstage--2.2--patch-63*, in combination with the previous
patch, changes the downcase table to ASCII.





Re: [bug] regex doesn't support i (ignorecase) flag

Paolo Bonzini
In reply to this post by S11001001
Issue status update for
http://smalltalk.gnu.org/project/issue/85
Post a follow up:
http://smalltalk.gnu.org/project/comments/add/85

 Project:      GNU Smalltalk
 Version:      <none>
 Component:    VM
 Category:     bug reports
 Priority:     normal
 Assigned to:  Unassigned
 Reported by:  S11001001
 Updated by:   bonzinip
-Status:       patch
+Status:       committed

Thanks, will apply soon.





Re: [bug] regex doesn't support i (ignorecase) flag

Paolo Bonzini
In reply to this post by S11001001
Issue status update for
http://smalltalk.gnu.org/project/issue/85
Post a follow up:
http://smalltalk.gnu.org/project/comments/add/85

 Project:      GNU Smalltalk
 Version:      <none>
 Component:    VM
 Category:     bug reports
 Priority:     normal
 Assigned to:  Unassigned
 Reported by:  S11001001
 Updated by:   bonzinip
-Status:       committed
+Status:       fixed

Applied.



