Sorting Unicode strings

Hannes Hirzel
Hello

According to http://www.unicode.org/cldr/charts/27/collation/de.html the German
phonebook sort order is

a A ä Ä ą̈ Ą̈ ǟ Ǟ ạ̈ Ạ̈ ḁ̈ Ḁ̈ b B c C d D e E f F g G h H i I j J k K
l L m M n N o O ö Ö ǫ̈ Ǫ̈ ȫ Ȫ ơ̈ Ơ̈ ợ̈ Ợ̈ ọ̈ Ọ̈ p P q Q r R s S ss ß t
T u U ü Ü ǘ Ǘ ǜ Ǜ ǚ Ǚ ų̈ Ų̈ ǖ Ǖ ư̈ Ư̈ ự̈ Ự̈ ụ̈ Ụ̈ ṳ̈ Ṳ̈ ṷ̈ Ṷ̈ ṵ̈ Ṵ̈ v
V w W x X y Y z Z

I wonder why it looks like this: it contains a lot of characters which
never appear in German text.
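
For reference, the core of the phonebook ordering can be sketched in a few lines. This is only a rough approximation under my own assumptions, not the CLDR algorithm: ä/ö/ü are treated as ae/oe/ue and ß as ss at the primary level, every other accent is ignored, and ties are broken by the raw string instead of by proper secondary and tertiary weights. The name phonebook_key is made up for this example.

```python
import unicodedata

DIAERESIS = "\u0308"  # COMBINING DIAERESIS


def phonebook_key(name):
    """Approximate German phonebook key (illustrative sketch, not CLDR)."""
    decomposed = unicodedata.normalize("NFD", name.lower())
    primary = []
    i = 0
    while i < len(decomposed):
        base = decomposed[i]
        i += 1
        # Collect all combining marks attached to this base character.
        marks = ""
        while i < len(decomposed) and unicodedata.combining(decomposed[i]):
            marks += decomposed[i]
            i += 1
        if base in "aou" and DIAERESIS in marks:
            primary.append(base + "e")   # a-umlaut -> ae, and so on
        elif base == "\u00df":           # sharp s
            primary.append("ss")
        else:
            primary.append(base)         # any other accents are dropped
    # The raw name is a crude tie-breaker, not a real secondary weight.
    return ("".join(primary), name)


names = ["Mutter", "M\u00fcller", "Mueller", "Muller", "Ma\u00df", "Masse"]
for n in sorted(names, key=phonebook_key):
    print(n)
```

With this key "Müller" lands next to "Mueller" and before "Muller", which is the behaviour a phonebook ordering is meant to give.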


For Spanish there are two orderings, 'standard' and 'traditional':

http://www.unicode.org/cldr/charts/27/collation/es.html

standard        a A á Á b B c C d D e E é É f F g G h H i I í Í j J k K l L m
M n N ñ Ñ ņ̃ Ņ̃ ṇ̃ Ṇ̃ ṋ̃ Ṋ̃ ṉ̃ Ṉ̃ o O ó Ó p P q Q r R s S t T u U ú Ú
ü Ü v V w W x X y Y z Z

traditional     a A á Á b B c C ch Ch CH cĥ Cĥ CĤ cȟ Cȟ CȞ cḧ Cḧ CḦ cḣ Cḣ
CḢ cḩ Cḩ CḨ cḥ Cḥ CḤ cḫ Cḫ CḪ cẖ Cẖ d D e E é É f F g G h H i I í Í j
J k K l L ll Ll LL lĺ Lĺ LĹ lľ Lľ LĽ lļ Lļ LĻ lḷ Lḷ LḶ lḹ Lḹ LḸ lḽ Lḽ
LḼ lḻ Lḻ LḺ m M n N ñ Ñ ņ̃ Ņ̃ ṇ̃ Ṇ̃ ṋ̃ Ṋ̃ ṉ̃ Ṉ̃ o O ó Ó p P q Q r R s
S t T u U ú Ú ü Ü v V w W x X y Y z Z
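
The visible difference between the two Spanish orderings is that 'traditional' treats "ch" and "ll" as letters of their own, placed after "c" and "l". A minimal sketch of that idea follows. It is not the CLDR implementation: the alphabet table and the function names are assumptions made for illustration, accents and case are ignored at the primary level, and ties are broken by the raw string.

```python
import unicodedata

ENYE = "\u00f1"  # n with tilde

# Assumed primary alphabet; "ch" and "ll" only matter for the traditional key.
ALPHABET = ["a", "b", "c", "ch", "d", "e", "f", "g", "h", "i", "j", "k",
            "l", "ll", "m", "n", ENYE, "o", "p", "q", "r", "s", "t", "u",
            "v", "w", "x", "y", "z"]
RANK = {letter: i for i, letter in enumerate(ALPHABET)}
UNKNOWN = len(ALPHABET)  # anything outside the table sorts last


def strip_accents(word):
    """Lowercase and drop combining marks, but keep n-tilde intact."""
    out = []
    for ch in unicodedata.normalize("NFC", word.lower()):
        if ch == ENYE:
            out.append(ch)
        else:
            d = unicodedata.normalize("NFD", ch)
            out.append("".join(c for c in d if not unicodedata.combining(c)))
    return "".join(out)


def standard_key(word):
    s = strip_accents(word)
    return ([RANK.get(ch, UNKNOWN) for ch in s], word)


def traditional_key(word):
    s = strip_accents(word)
    ranks = []
    i = 0
    while i < len(s):
        pair = s[i:i + 2]
        if pair in ("ch", "ll"):         # digraph counts as one letter
            ranks.append(RANK[pair])
            i += 2
        else:
            ranks.append(RANK.get(s[i], UNKNOWN))
            i += 1
    return (ranks, word)


words = ["cuna", "chico", "dama", "luz", "llama", "mano", "nube", ENYE + "u", "oro"]
print(sorted(words, key=standard_key))
print(sorted(words, key=traditional_key))
```

In the standard key "chico" sorts before "cuna" and "llama" before "luz"; in the traditional key both pairs swap, because "ch" and "ll" rank after every word starting with plain "c" or "l".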

French is not easily found in the chart index
http://www.unicode.org/cldr/charts/27/collation/index.html
and seems to be defined elsewhere:

http://unicode.org/repos/cldr/tags/release-27/common/collation/fr.xml

Suggestions and hints are welcome.

--Hannes


Re: Sorting Unicode strings

Martin Bähr
Excerpts from H. Hirzel's message of 2015-12-07 21:52:09 +0100:

> According to http://www.unicode.org/cldr/charts/27/collation/de.html the German
> phonebook sort order is
>
> a A ä Ä ą̈ Ą̈ ǟ Ǟ ạ̈ Ạ̈ ḁ̈ Ḁ̈ b B c C d D e E f F g G h H i I j J k K
> l L m M n N o O ö Ö ǫ̈ Ǫ̈ ȫ Ȫ ơ̈ Ơ̈ ợ̈ Ợ̈ ọ̈ Ọ̈ p P q Q r R s S ss ß t
> T u U ü Ü ǘ Ǘ ǜ Ǜ ǚ Ǚ ų̈ Ų̈ ǖ Ǖ ư̈ Ư̈ ự̈ Ự̈ ụ̈ Ụ̈ ṳ̈ Ṳ̈ ṷ̈ Ṷ̈ ṵ̈ Ṵ̈ v
> V w W x X y Y z Z
>
> I wonder why it looks like this. A lot of characters which never
> appear in a German text.

But they might appear in names in a German phonebook (or any other list of names).
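
A quick way to see what the unusual entries are is to decompose them. The sketch below uses only Python's standard unicodedata module. The sample list is my own selection from the chart, and the helper names are made up. My reading, which is an assumption and not something the chart states, is that every such entry is an a, o or u carrying a diaeresis plus further marks, so it is listed next to the plain umlaut it extends.

```python
import unicodedata

DIAERESIS = "\u0308"  # COMBINING DIAERESIS

# A few entries from the chart, written as escapes to avoid encoding trouble.
samples = [
    "\u00e4",        # a with diaeresis
    "\u01df",        # a with diaeresis and macron
    "\u0105\u0308",  # a with ogonek, followed by a combining diaeresis
    "\u022b",        # o with diaeresis and macron
    "\u01d8",        # u with diaeresis and acute
    "\u01da",        # u with diaeresis and caron
]


def decompose(text):
    """Return the Unicode names of the code points in the NFD form."""
    return [unicodedata.name(c) for c in unicodedata.normalize("NFD", text)]


def is_umlaut_plus_marks(text):
    """True if the NFD form is a, o or u followed by marks including a diaeresis."""
    d = unicodedata.normalize("NFD", text)
    return d[0] in "aouAOU" and DIAERESIS in d[1:]


for s in samples:
    print(s, decompose(s), is_umlaut_plus_marks(s))
```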

greetings, martin.

--
eKita                   -   the online platform for your entire academic life
--
chief engineer                                                       eKita.co
pike programmer      pike.lysator.liu.se    caudium.net     societyserver.org
secretary                                                      beijinglug.org
mentor                                                           fossasia.org
foresight developer  foresightlinux.org                            realss.com
unix sysadmin
Martin Bähr          working in china        http://societyserver.org/mbaehr/