Hello,
In our project we've removed the primitive calls from the Character and String comparison methods because they returned different results from the corresponding locale collation methods.
For example, $ü (u-umlaut) sorts between $a and $z when using the collation method, but before $a when using the primitive VMprCharacterLessThan, which is incorrect according to German collation rules.
When switching from 8.6 to 8.6.3, we found that the new source compression algorithm uses isSmalltalkLetter for character categorization. That is essentially fine, but the method uses #between:and: on characters, which is incorrect for the intended purpose with or without our patch.
Instead of comparing characters (which is locale-sensitive), the method should compare code points:
isSmalltalkLetter
"Answer true if the receiver is a valid Smalltalk letter as described in the ANSI Smalltalk Standard; otherwise answer false.
letter ::= uppercaseAlphabetic | lowercaseAlphabetic | nonCaseLetter
uppercaseAlphabetic ::= ’A’ | ’B’ | ’C’ | ’D’ | ’E’ | ’F’ | ’G’ | ’H’ | ’I’ | ’J’ | ’K’ | ’L’ | ’M’ | ’N’ | ’O’ | ’P’ | ’Q’ | ’R’ | ’S’ | ’T’ | ’U’ | ’V’ | ’W’ | ’X’ | ’Y’ | ’Z’
lowercaseAlphabetic ::= ’a’ | ’b’ | ’c’ | ’d’ | ’e’ | ’f’ | ’g’ | ’h’ | ’i’ | ’j’ | ’k’ | ’l’ | ’m’ | ’n’ | ’o’ | ’p’ | ’q’ | ’r’ | ’s’ | ’t’ | ’u’ | ’v’ | ’w’ | ’x’ | ’y’ | ’z’
nonCaseLetter ::= ’_’
It would be easier to simply send #isLetter, but we cannot do this because some country codes have characters that say they are letters but are not valid Smalltalk syntactic letters. We also need to allow for the nonCaseLetter"
| cp |
cp := self codePoint.
^(cp between: 97 "$a codePoint" and: 122 "$z codePoint") or: [(cp between: 65 "$A codePoint" and: 90 "$Z codePoint") or: [cp = 95 "$_ codePoint"]]
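As a quick sanity check of the range logic above, here is a small workspace sketch that works on plain integers only, so no Character comparison (and therefore no locale collation) is involved. The block name isSmalltalkLetterCodePoint is made up for this illustration and is not part of the image, and the value 252 for $ü assumes an ISO 8859-1 or Unicode code point.

```smalltalk
"Workspace sketch. isSmalltalkLetterCodePoint is a hypothetical name
 used only for this illustration."
| isSmalltalkLetterCodePoint accepted |
isSmalltalkLetterCodePoint := [:cp |
	(cp between: 97 and: 122)              "a..z"
		or: [(cp between: 65 and: 90)      "A..Z"
			or: [cp = 95]]].               "underscore"

"Integer comparison cannot be affected by locale collation."
accepted := (0 to: 255) select: [:cp | isSmalltalkLetterCodePoint value: cp].
accepted size.                             "53 = 26 + 26 + 1"
isSmalltalkLetterCodePoint value: 252.     "false; 252 is assumed to be the code point of $ü"
isSmalltalkLetterCodePoint value: 95.      "true"
```

With a character-based #between:and: that goes through locale collation, $ü could fall inside the $a..$z range, which is exactly the case the integer version rules out.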
In addition, something should be done about the different behavior of the Character comparison method #< and the corresponding locale collation. It is unclear to me which behavior is correct, but the locale collation is more useful to us.
Cheers,
Hans-Martin