Issue 321 in glassdb: Collation sequence fails for embedded accented characters (Unicode support)


glassdb
Status: Accepted
Owner: [hidden email]
Labels: Type-Defect Priority-Medium GLASS-Server Version-3.0.x bugid-41964 Version-2.x

New issue 321 by [hidden email]: Collation sequence fails for  
embedded accented characters (Unicode support)
http://code.google.com/p/glassdb/issues/detail?id=321

From the internal GemStone bug report for bug 41964:

While we provide control over case-sensitive collation using extended
character sets (so you can always control the location of ó and Ó relative
to o and O), we don't handle case-insensitive or accent-insensitive
sorting. This is a pretty serious deficiency for languages such as Spanish
that use accents.

Customer's example: these two values sort incorrectly (using the GemStone
default collation table):
#(
'Córdoba'
'Corrientes'
) asSortedCollection  asArray  -> anArray( 'Corrientes', 'Córdoba')
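
To illustrate why the two values come out in that order, here is a minimal sketch outside GemStone, in Python. It assumes the default table behaves like a plain code point comparison for these two strings (ó sorts after every unaccented letter), and it uses a hypothetical helper, fold_accents, that strips accents before comparing. It is not the GemStone implementation or a proposed fix, only a picture of the expected result.

```python
import unicodedata

def fold_accents(text):
    """Hypothetical helper: drop combining marks after NFD decomposition."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

cities = ["C\u00f3rdoba", "Corrientes"]  # 'Córdoba', 'Corrientes'

# Code point order: 'ó' (U+00F3) is greater than 'r', so 'Córdoba' sorts last.
by_code_point = sorted(cities)

# Accent- and case-insensitive order: compare 'cordoba' with 'corrientes'.
by_folded_key = sorted(cities, key=lambda s: fold_accents(s).casefold())

print(by_code_point)
print(by_folded_key)
```

With the folded key, 'Córdoba' sorts ahead of 'Corrientes', which is the order a Spanish speaker would expect.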

In cases like this (for most languages), accented and unaccented variants
of a character should sort as if they were upper/lower case variations of
the same letter. For example, currently:
#(
'cób'
'cob'
'cOb'
'cÓb'
'Cób'
'Cob'
'COb'
'CÓb'
'cóz'
'coz'
'cOz'
'cÓz'
'Cóz'
'Coz'
'COz'
'CÓz'
) asSortedCollection  asArray
with GemStone default collation table ->  
anArray( 'COb', 'Cob', 'cOb', 'cob', 'COz', 'Coz', 'cOz', 'coz', 'CÓb', 'Cób', 'cÓb', 'cób', 'CÓz', 'Cóz', 'cÓz', 'cóz')
with Unicode default collation table ->  
anArray( 'cob', 'cOb', 'Cob', 'COb', 'coz', 'cOz', 'Coz', 'COz', 'cób', 'cÓb', 'Cób', 'CÓb', 'cóz', 'cÓz', 'Cóz', 'CÓz')
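
Both results above keep every accented string after every unaccented one, so 'cób' lands after 'coz'. A minimal sketch of the ordering the report asks for, again in Python rather than GemStone Smalltalk: compare base letters first, and use accents and case only as tie-breakers. The function name collation_key and the tie-break direction (unaccented before accented, upper case before lower case) are assumptions made for illustration, not GemStone or Unicode behavior.

```python
import unicodedata

def collation_key(text):
    """Hypothetical three-level key: base letters, then accents, then case."""
    decomposed = unicodedata.normalize("NFD", text)
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    marks = "".join(ch for ch in decomposed if unicodedata.combining(ch))
    # Level 1: letters with accents and case ignored.
    # Level 2: the accent marks (an empty string sorts first).
    # Level 3: the original case, by code point (assumed tie-break).
    return (base.casefold(), marks, base)

words = [
    "c\u00f3b", "cob", "cOb", "c\u00d3b", "C\u00f3b", "Cob", "COb", "C\u00d3b",
    "c\u00f3z", "coz", "cOz", "c\u00d3z", "C\u00f3z", "Coz", "COz", "C\u00d3z",
]

ordered = sorted(words, key=collation_key)
print(ordered)
```

With this key all eight 'cob' variants, accented or not, come before any 'coz' variant.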