Unicode Ranges you might want to copy


Squeak - Dev mailing list
Hi Folks,

You might find the below handy.

FWIW, I have coded a UnicodeRangeBrowser Seaside app that displays all of the ranges below. It currently times out before everything is displayed, so I will be refactoring and developing it further.

I am also coding a utility class to provide information such as the below in a variety of ways.

FWIW, here are my current todo notes for this font information.

unifont

provide link to the Unicode spec.
provide browser fonts list.
provide Squeak fonts list.
provide Pharo fonts list ^self squeak fonts list:
what are Variation Selectors?

show gaps in the ranges.

provide a link(s) to required fonts that will make a range display.
displays on: browser list, squeak, emacs, xterm...
status bar...broken, partial, full.
use cases for fonts
pairs with xyz   example superscripts and subscripts

The goal is to show what works and to find and test fonts that support each range.
I am modelling the Seaside app on https://jrgraphix.net/research/unicode_blocks.php
But for each Unicode range, I intend to display Smalltalk-specific helpers on which fonts are needed and where to get them, for both the browser side and the image side.

--------------snip -------------------

(16r000020 asCharacter to: 16r00007F asCharacter) -> 'Basic Latin'
(16r0000A0 asCharacter to: 16r0000FF asCharacter) -> 'Latin-1 Supplement'
(16r000100 asCharacter to: 16r00017F asCharacter) -> 'Latin Extended-A'
(16r000180 asCharacter to: 16r00024F asCharacter) -> 'Latin Extended-B'
(16r000250 asCharacter to: 16r0002AF asCharacter) -> 'IPA Extensions'
(16r0002B0 asCharacter to: 16r0002FF asCharacter) -> 'Spacing Modifier Letters'
(16r000300 asCharacter to: 16r00036F asCharacter) -> 'Combining Diacritical Marks'
(16r000370 asCharacter to: 16r0003FF asCharacter) -> 'Greek and Coptic'
(16r000400 asCharacter to: 16r0004FF asCharacter) -> 'Cyrillic'
(16r000500 asCharacter to: 16r00052F asCharacter) -> 'Cyrillic Supplement'
(16r000530 asCharacter to: 16r00058F asCharacter) -> 'Armenian'
(16r000590 asCharacter to: 16r0005FF asCharacter) -> 'Hebrew'
(16r000600 asCharacter to: 16r0006FF asCharacter) -> 'Arabic'
(16r000700 asCharacter to: 16r00074F asCharacter) -> 'Syriac'
(16r000780 asCharacter to: 16r0007BF asCharacter) -> 'Thaana'
(16r000900 asCharacter to: 16r00097F asCharacter) -> 'Devanagari'
(16r000980 asCharacter to: 16r0009FF asCharacter) -> 'Bengali'
(16r000A00 asCharacter to: 16r000A7F asCharacter) -> 'Gurmukhi'
(16r000A80 asCharacter to: 16r000AFF asCharacter) -> 'Gujarati'
(16r000B00 asCharacter to: 16r000B7F asCharacter) -> 'Oriya'
(16r000B80 asCharacter to: 16r000BFF asCharacter) -> 'Tamil'
(16r000C00 asCharacter to: 16r000C7F asCharacter) -> 'Telugu'
(16r000C80 asCharacter to: 16r000CFF asCharacter) -> 'Kannada'
(16r000D00 asCharacter to: 16r000D7F asCharacter) -> 'Malayalam'
(16r000D80 asCharacter to: 16r000DFF asCharacter) -> 'Sinhala'
(16r000E00 asCharacter to: 16r000E7F asCharacter) -> 'Thai'
(16r000E80 asCharacter to: 16r000EFF asCharacter) -> 'Lao'
(16r000F00 asCharacter to: 16r000FFF asCharacter) -> 'Tibetan'
(16r001000 asCharacter to: 16r00109F asCharacter) -> 'Myanmar'
(16r0010A0 asCharacter to: 16r0010FF asCharacter) -> 'Georgian'
(16r001100 asCharacter to: 16r0011FF asCharacter) -> 'Hangul Jamo'
(16r001200 asCharacter to: 16r00137F asCharacter) -> 'Ethiopic'
(16r0013A0 asCharacter to: 16r0013FF asCharacter) -> 'Cherokee'
(16r001400 asCharacter to: 16r00167F asCharacter) -> 'Unified Canadian Aboriginal Syllabics'
(16r001680 asCharacter to: 16r00169F asCharacter) -> 'Ogham'
(16r0016A0 asCharacter to: 16r0016FF asCharacter) -> 'Runic'
(16r001700 asCharacter to: 16r00171F asCharacter) -> 'Tagalog'
(16r001720 asCharacter to: 16r00173F asCharacter) -> 'Hanunoo'
(16r001740 asCharacter to: 16r00175F asCharacter) -> 'Buhid'
(16r001760 asCharacter to: 16r00177F asCharacter) -> 'Tagbanwa'
(16r001780 asCharacter to: 16r0017FF asCharacter) -> 'Khmer'
(16r001800 asCharacter to: 16r0018AF asCharacter) -> 'Mongolian'
(16r001900 asCharacter to: 16r00194F asCharacter) -> 'Limbu'
(16r001950 asCharacter to: 16r00197F asCharacter) -> 'Tai Le'
(16r0019E0 asCharacter to: 16r0019FF asCharacter) -> 'Khmer Symbols'
(16r001D00 asCharacter to: 16r001D7F asCharacter) -> 'Phonetic Extensions'
(16r001E00 asCharacter to: 16r001EFF asCharacter) -> 'Latin Extended Additional'
(16r001F00 asCharacter to: 16r001FFF asCharacter) -> 'Greek Extended'
(16r002000 asCharacter to: 16r00206F asCharacter) -> 'General Punctuation'
(16r002070 asCharacter to: 16r00209F asCharacter) -> 'Superscripts and Subscripts'
(16r0020A0 asCharacter to: 16r0020CF asCharacter) -> 'Currency Symbols'
(16r0020D0 asCharacter to: 16r0020FF asCharacter) -> 'Combining Diacritical Marks for Symbols'
(16r002100 asCharacter to: 16r00214F asCharacter) -> 'Letterlike Symbols'
(16r002150 asCharacter to: 16r00218F asCharacter) -> 'Number Forms'
(16r002190 asCharacter to: 16r0021FF asCharacter) -> 'Arrows'
(16r002200 asCharacter to: 16r0022FF asCharacter) -> 'Mathematical Operators'
(16r002300 asCharacter to: 16r0023FF asCharacter) -> 'Miscellaneous Technical'
(16r002400 asCharacter to: 16r00243F asCharacter) -> 'Control Pictures'
(16r002440 asCharacter to: 16r00245F asCharacter) -> 'Optical Character Recognition'
(16r002460 asCharacter to: 16r0024FF asCharacter) -> 'Enclosed Alphanumerics'
(16r002500 asCharacter to: 16r00257F asCharacter) -> 'Box Drawing'
(16r002580 asCharacter to: 16r00259F asCharacter) -> 'Block Elements'
(16r0025A0 asCharacter to: 16r0025FF asCharacter) -> 'Geometric Shapes'
(16r002600 asCharacter to: 16r0026FF asCharacter) -> 'Miscellaneous Symbols'
(16r002700 asCharacter to: 16r0027BF asCharacter) -> 'Dingbats'
(16r0027C0 asCharacter to: 16r0027EF asCharacter) -> 'Miscellaneous Mathematical Symbols-A'
(16r0027F0 asCharacter to: 16r0027FF asCharacter) -> 'Supplemental Arrows-A'
(16r002800 asCharacter to: 16r0028FF asCharacter) -> 'Braille Patterns'
(16r002900 asCharacter to: 16r00297F asCharacter) -> 'Supplemental Arrows-B'
(16r002980 asCharacter to: 16r0029FF asCharacter) -> 'Miscellaneous Mathematical Symbols-B'
(16r002A00 asCharacter to: 16r002AFF asCharacter) -> 'Supplemental Mathematical Operators'
(16r002B00 asCharacter to: 16r002BFF asCharacter) -> 'Miscellaneous Symbols and Arrows'
(16r002E80 asCharacter to: 16r002EFF asCharacter) -> 'CJK Radicals Supplement'
(16r002F00 asCharacter to: 16r002FDF asCharacter) -> 'Kangxi Radicals'
(16r002FF0 asCharacter to: 16r002FFF asCharacter) -> 'Ideographic Description Characters'
(16r003000 asCharacter to: 16r00303F asCharacter) -> 'CJK Symbols and Punctuation'
(16r003040 asCharacter to: 16r00309F asCharacter) -> 'Hiragana'
(16r0030A0 asCharacter to: 16r0030FF asCharacter) -> 'Katakana'
(16r003100 asCharacter to: 16r00312F asCharacter) -> 'Bopomofo'
(16r003130 asCharacter to: 16r00318F asCharacter) -> 'Hangul Compatibility Jamo'
(16r003190 asCharacter to: 16r00319F asCharacter) -> 'Kanbun'
(16r0031A0 asCharacter to: 16r0031BF asCharacter) -> 'Bopomofo Extended'
(16r0031F0 asCharacter to: 16r0031FF asCharacter) -> 'Katakana Phonetic Extensions'
(16r003200 asCharacter to: 16r0032FF asCharacter) -> 'Enclosed CJK Letters and Months'
(16r003300 asCharacter to: 16r0033FF asCharacter) -> 'CJK Compatibility'
(16r003400 asCharacter to: 16r004DBF asCharacter) -> 'CJK Unified Ideographs Extension A'
(16r004DC0 asCharacter to: 16r004DFF asCharacter) -> 'Yijing Hexagram Symbols'
(16r004E00 asCharacter to: 16r009FFF asCharacter) -> 'CJK Unified Ideographs'
(16r00A000 asCharacter to: 16r00A48F asCharacter) -> 'Yi Syllables'
(16r00A490 asCharacter to: 16r00A4CF asCharacter) -> 'Yi Radicals'
(16r00AC00 asCharacter to: 16r00D7AF asCharacter) -> 'Hangul Syllables'
(16r00D800 asCharacter to: 16r00DB7F asCharacter) -> 'High Surrogates'
(16r00DB80 asCharacter to: 16r00DBFF asCharacter) -> 'High Private Use Surrogates'
(16r00DC00 asCharacter to: 16r00DFFF asCharacter) -> 'Low Surrogates'
(16r00E000 asCharacter to: 16r00F8FF asCharacter) -> 'Private Use Area'
(16r00F900 asCharacter to: 16r00FAFF asCharacter) -> 'CJK Compatibility Ideographs'
(16r00FB00 asCharacter to: 16r00FB4F asCharacter) -> 'Alphabetic Presentation Forms'
(16r00FB50 asCharacter to: 16r00FDFF asCharacter) -> 'Arabic Presentation Forms-A'
(16r00FE00 asCharacter to: 16r00FE0F asCharacter) -> 'Variation Selectors'
(16r00FE20 asCharacter to: 16r00FE2F asCharacter) -> 'Combining Half Marks'
(16r00FE30 asCharacter to: 16r00FE4F asCharacter) -> 'CJK Compatibility Forms'
(16r00FE50 asCharacter to: 16r00FE6F asCharacter) -> 'Small Form Variants'
(16r00FE70 asCharacter to: 16r00FEFF asCharacter) -> 'Arabic Presentation Forms-B'
(16r00FF00 asCharacter to: 16r00FFEF asCharacter) -> 'Halfwidth and Fullwidth Forms'
(16r00FFF0 asCharacter to: 16r00FFFF asCharacter) -> 'Specials'
(16r010000 asCharacter to: 16r01007F asCharacter) -> 'Linear B Syllabary'
(16r010080 asCharacter to: 16r0100FF asCharacter) -> 'Linear B Ideograms'
(16r010100 asCharacter to: 16r01013F asCharacter) -> 'Aegean Numbers'
(16r010300 asCharacter to: 16r01032F asCharacter) -> 'Old Italic'
(16r010330 asCharacter to: 16r01034F asCharacter) -> 'Gothic'
(16r010380 asCharacter to: 16r01039F asCharacter) -> 'Ugaritic'
(16r010400 asCharacter to: 16r01044F asCharacter) -> 'Deseret'
(16r010450 asCharacter to: 16r01047F asCharacter) -> 'Shavian'
(16r010480 asCharacter to: 16r0104AF asCharacter) -> 'Osmanya'
(16r010800 asCharacter to: 16r01083F asCharacter) -> 'Cypriot Syllabary'
(16r01D000 asCharacter to: 16r01D0FF asCharacter) -> 'Byzantine Musical Symbols'
(16r01D100 asCharacter to: 16r01D1FF asCharacter) -> 'Musical Symbols'
(16r01D300 asCharacter to: 16r01D35F asCharacter) -> 'Tai Xuan Jing Symbols'
(16r01D400 asCharacter to: 16r01D7FF asCharacter) -> 'Mathematical Alphanumeric Symbols'
(16r020000 asCharacter to: 16r02A6DF asCharacter) -> 'CJK Unified Ideographs Extension B'
(16r02F800 asCharacter to: 16r02FA1F asCharacter) -> 'CJK Compatibility Ideographs Supplement'
(16r0E0000 asCharacter to: 16r0E007F asCharacter) -> 'Tags'
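
The "show gaps in the ranges" todo item can be sketched in a workspace as follows. This is only a minimal sketch, assuming the ranges are held as Intervals of integer code points rather than collections of Characters; the variable names (ranges, gaps, previousStop) are made up for illustration and are not part of the UnicodeRangeBrowser code. It uses four adjacent entries from the list above.

```smalltalk
"Workspace sketch (Squeak). All names here are illustrative assumptions."
| ranges gaps previousStop |
ranges := {
	(16r0700 to: 16r074F) -> 'Syriac'.
	(16r0780 to: 16r07BF) -> 'Thaana'.
	(16r0900 to: 16r097F) -> 'Devanagari'.
	(16r0980 to: 16r09FF) -> 'Bengali' }.

gaps := OrderedCollection new.
previousStop := nil.

"Walk the ranges in code point order and record every hole between neighbours."
(ranges asSortedCollection: [:a :b | a key first <= b key first]) do: [:each |
	(previousStop notNil and: [each key first > (previousStop + 1)])
		ifTrue: [gaps add: (previousStop + 1 to: each key first - 1)].
	previousStop := each key last].

"Print each gap as start-stop in hexadecimal."
gaps do: [:gap |
	Transcript
		show: (gap first printStringBase: 16);
		show: '-';
		show: (gap last printStringBase: 16);
		cr].
```

With these four entries the Transcript should show 750-77F and 7C0-8FF. A gap in this table is not necessarily unassigned in the current Unicode standard; for example, U+0750-077F is the Arabic Supplement block, which is simply missing from the list.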





Re: Unicode Ranges you might want to copy

darth-cheney
Hi Timothy,

I spent several (fruitless) days last year trying to get cuneiform fonts to render properly in Squeak (see here for examples of the fonts). There does seem to be an issue with rendering glyphs above a certain code point from what I recall. I am definitely interested in your work and what you end up finding out.

On Sun, Jan 31, 2021 at 7:20 AM gettimothy via Squeak-dev <[hidden email]> wrote:
--
Eric



Re: Unicode Ranges you might want to copy

Squeak - Dev mailing list
Hi Eric


Thanks for the heads up!

Doing some light browsing on the matter, I found these resources:

https://unicode-table.com/en/#control-character
https://css-tricks.com/almanac/properties/u/unicode-range/
http://www.alanwood.net/unicode/alphabetic_presentation_forms.html



That last one looks very useful, and I will be working through it.

The middle one claims that fonts can be loaded dynamically when needed.
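
The mechanism that page describes is the CSS unicode-range descriptor inside an @font-face rule: the browser only needs to fetch that font when the page uses a character in the given range. A minimal sketch of producing such a descriptor value from one of the ranges in this thread, assuming integer code points; cssRange is a made-up name for illustration:

```smalltalk
"Workspace sketch (Squeak). 'cssRange' is an illustrative name, not existing code."
| cssRange |
cssRange := [:start :stop |
	"CSS accepts one to six hex digits after U+, so no zero padding is needed."
	'U+', (start printStringBase: 16), '-', (stop printStringBase: 16)].

"Greek and Coptic, 16r0370 to 16r03FF in the list."
Transcript
	show: 'unicode-range: ', (cssRange value: 16r0370 value: 16r03FF), ';';
	cr.
```

This should print unicode-range: U+370-3FF; which could then be written into a generated @font-face rule on the Seaside side.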

An interesting thing is that the Alan Wood site displays the $ sign under the currency range, but the jrgraphix website does not!
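
A likely explanation: the dollar sign is U+0024, which sits in Basic Latin, while the Currency Symbols block only covers U+20A0-20CF. A page that lists strictly by block would therefore not show $ there. A minimal lookup sketch, assuming integer Intervals and using the made-up names blocks and nameFor:

```smalltalk
"Workspace sketch (Squeak). 'blocks' and 'nameFor' are illustrative assumptions."
| blocks nameFor |
blocks := {
	(16r0020 to: 16r007F) -> 'Basic Latin'.
	(16r20A0 to: 16r20CF) -> 'Currency Symbols' }.

"Answer the name of the first block whose bounds contain the code point."
nameFor := [:codePoint | | found |
	found := blocks
		detect: [:each | codePoint between: each key first and: each key last]
		ifNone: [nil].
	found isNil
		ifTrue: ['(not in table)']
		ifFalse: [found value]].

"$$ is the dollar sign, U+0024."
Transcript show: (nameFor value: $$ asInteger); cr.
"16r20AC is the euro sign."
Transcript show: (nameFor value: 16r20AC); cr.
```

The first line printed should be Basic Latin and the second Currency Symbols.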


It is also somewhat apparent that "like" ranges are not contiguous.

Thanks again for your help!

Cheers,  


---- On Mon, 01 Feb 2021 08:57:32 -0500 [hidden email] wrote ----


Re: Unicode Ranges you might want to copy

Squeak - Dev mailing list
In reply to this post by darth-cheney
To add to my previous reply, I am curious how the letters are actually drawn on a Morphic display. Is it a BitBlt thing? A primitive? A glob of pixels that Morphic plops into place?

cheers


---- On Mon, 01 Feb 2021 08:57:32 -0500 [hidden email] wrote ----

Hi Timothy,

I spent several (fruitless) days last year trying to get cuneiform fonts to render properly in Squeak (see here for examples of the fonts). There does seem to be an issue with rendering glyphs above a certain code point from what I recall. I am definitely interested in your work and what you end up finding out.





Re: Unicode Ranges you might want to copy

Christoph Thiede
In reply to this post by Squeak - Dev mailing list

Hi Timothy,


Do you also know the Unicode class and, in particular, its class variables and its testing selectors? I am not sure whether this is on the same abstraction level as your work; I just wanted to add this pointer.


Also note that the Unicode class's character classification does not yet support UTF-16 characters. I added this to my to-do list a disgracefully long time ago [1] and have not yet managed to return to the issue (so sorry, Levente!).


Best,

Christoph


[1] http://forum.world.st/Unicode-td5113495.html
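
For readers following along: the 'High Surrogates' and 'Low Surrogates' ranges in the quoted list are the values UTF-16 uses to encode code points above 16rFFFF. The workspace sketch below shows the standard surrogate-pair arithmetic for one cuneiform code point. It is only an illustration with made-up variable names, and it does not use or change the Unicode class.

```smalltalk
"Workspace sketch: split a code point above 16rFFFF into a UTF-16 surrogate pair."
| codePoint offset high low roundTrip |
codePoint := 16r12000.	"first code point of the cuneiform block"
offset := codePoint - 16r10000.	"20-bit offset into the supplementary planes"
high := 16rD800 + (offset bitShift: -10).	"top 10 bits, gives 16rD808"
low := 16rDC00 + (offset bitAnd: 16r3FF).	"bottom 10 bits, gives 16rDC00"
"Recombine the pair to check the arithmetic."
roundTrip := 16r10000 + ((high - 16rD800) bitShift: 10) + (low - 16rDC00).
Transcript
	show: (high printStringBase: 16); show: ' ';
	show: (low printStringBase: 16); show: ' ';
	show: (roundTrip = codePoint) printString; cr.
```

Any classification code that only looks at one 16-bit unit at a time would see 16rD808 and 16rDC00 separately, so it needs a recombination step like the one above before it can look up the real code point.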



From: Squeak-dev <[hidden email]> on behalf of gettimothy via Squeak-dev <[hidden email]>
Sent: Monday, 1 February 2021 15:30:29
To: [hidden email]
Cc: [hidden email]
Subject: Re: [squeak-dev] Unicode Ranges you might want to copy
 
Hi Eric


Thanks for the heads up!

Doing some light browsing on the matter, I found these resources:

https://unicode-table.com/en/#control-character
https://css-tricks.com/almanac/properties/u/unicode-range/
http://www.alanwood.net/unicode/alphabetic_presentation_forms.html



That last one looks very useful and I will be working through it.

The middle one claims that fonts can be loaded dynamically when needed.

An interesting thing is that the Alan Wood site displays the $ sign under the currency range, but the jrgraphix website does not!


It is also somewhat apparent that "like" ranges are not contiguous.

Thanks again for your help!

Cheers,  
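
The point about "like" ranges not being contiguous can be checked against the list itself. The workspace sketch below is a rough illustration, not code from the thread: it copies only the entries whose names contain 'Latin', stores them as plain integers instead of Characters, and collects the gaps between consecutive entries. The three-element array layout is an assumption made for this example.

```smalltalk
"Workspace sketch: find the gaps between the Latin-named blocks in the list."
| latin gaps |
latin := {
	{16r000020. 16r00007F. 'Basic Latin'}.
	{16r0000A0. 16r0000FF. 'Latin-1 Supplement'}.
	{16r000100. 16r00017F. 'Latin Extended-A'}.
	{16r000180. 16r00024F. 'Latin Extended-B'}.
	{16r001E00. 16r001EFF. 'Latin Extended Additional'}}.
gaps := OrderedCollection new.
"Pair each block with its successor and record any code points left uncovered."
latin allButLast with: latin allButFirst do: [:this :next |
	next first > (this second + 1)
		ifTrue: [gaps add: {this second + 1. next first - 1}]].
gaps do: [:gap |
	Transcript
		show: (gap first printStringBase: 16); show: ' to ';
		show: (gap second printStringBase: 16); cr].
```

With these five entries there are two gaps: 16r80 to 16r9F (not covered by the list at all) and 16r250 to 16r1DFF (where the list has IPA Extensions, Greek, Cyrillic and many other blocks before Latin resumes).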





Carpe Squeak!

Re: Unicode Ranges you might want to copy

Eliot Miranda-2
In reply to this post by darth-cheney
Hi Eric,

On Feb 1, 2021, at 5:57 AM, Eric Gade <[hidden email]> wrote:


Hi Timothy,

I spent several (fruitless) days last year trying to get cuneiform fonts to render properly in Squeak (see here for examples of the fonts).
There does seem to be an issue with rendering glyphs above a certain code point from what I recall.

Do you have any code for testing this, a workspace perhaps?  If so, could you post it here?

I am definitely interested in your work and what you end up finding out.

On Sun, Jan 31, 2021 at 7:20 AM gettimothy via Squeak-dev <[hidden email]> wrote:
Hi Folks,

You might find the below handy.

Fwiw, I have coded the UnicodeRangeBrowser  SeasideApp to display all the below (it times out before all display, I will be doing refactor and more development).

I am also coding a utility class to provide information such as below in a variety of ways.

fwiw, here are my current todo notes for this font information.

unifont

provide link to the unicode spec.
provide browser fonts list.
provide squeak fonts list.
provide phare fonts list ^self squeak fonts list:
what are Variant Selectors ?

show gaps in the ranges.

provide a link(s) to required fonts that will make a range display.
displays on: browser list, squeak, emacs, xterm...
status bar...broken, partial, full.
use cases for fonts
pairs with xyz   example superscripts and subscripts

The goal is to show what works and to find/test fonts that will support stuff.
I am modelling the Seaside app on https://jrgraphix.net/research/unicode_blocks.php  
But...for each unicode range, I intend to display smalltalk specific helpers on what/where to get fonts for both the browser side and the image side.

--------------snip -------------------

(16r000020 asCharacter to: 16r00007F asCharacter) -> 'Basic Latin'
(16r0000A0 asCharacter to: 16r0000FF asCharacter) -> 'Latin-1 Supplement'
(16r000100 asCharacter to: 16r00017F asCharacter) -> 'Latin Extended-A'
(16r000180 asCharacter to: 16r00024F asCharacter) -> 'Latin Extended-B'
(16r000250 asCharacter to: 16r0002AF asCharacter) -> 'IPA Extensions'
(16r0002B0 asCharacter to: 16r0002FF asCharacter) -> 'Spacing Modifier Letters'
(16r000300 asCharacter to: 16r00036F asCharacter) -> 'Combining Diacritical Marks'
(16r000370 asCharacter to: 16r0003FF asCharacter) -> 'Greek and Coptic'
(16r000400 asCharacter to: 16r0004FF asCharacter) -> 'Cyrillic'
(16r000500 asCharacter to: 16r00052F asCharacter) -> 'Cyrillic Supplementary'
(16r000530 asCharacter to: 16r00058F asCharacter) -> 'Armenian'
(16r000590 asCharacter to: 16r0005FF asCharacter) -> 'Hebrew'
(16r000600 asCharacter to: 16r0006FF asCharacter) -> 'Arabic'
(16r000700 asCharacter to: 16r00074F asCharacter) -> 'Syriac'
(16r000780 asCharacter to: 16r0007BF asCharacter) -> 'Thaana'
(16r000900 asCharacter to: 16r00097F asCharacter) -> 'Devanagari'
(16r000980 asCharacter to: 16r0009FF asCharacter) -> 'Bengali'
(16r000A00 asCharacter to: 16r000A7F asCharacter) -> 'Gurmukhi'
(16r000A80 asCharacter to: 16r000AFF asCharacter) -> 'Gujarati'
(16r000B00 asCharacter to: 16r000B7F asCharacter) -> 'Oriya'
(16r000B80 asCharacter to: 16r000BFF asCharacter) -> 'Tamil'
(16r000C00 asCharacter to: 16r000C7F asCharacter) -> 'Telugu'
(16r000C80 asCharacter to: 16r000CFF asCharacter) -> 'Kannada'
(16r000D00 asCharacter to: 16r000D7F asCharacter) -> 'Malayalam'
(16r000D80 asCharacter to: 16r000DFF asCharacter) -> 'Sinhala'
(16r000E00 asCharacter to: 16r000E7F asCharacter) -> 'Thai'
(16r000E80 asCharacter to: 16r000EFF asCharacter) -> 'Lao'
(16r000F00 asCharacter to: 16r000FFF asCharacter) -> 'Tibetan'
(16r001000 asCharacter to: 16r00109F asCharacter) -> 'Myanmar'
(16r0010A0 asCharacter to: 16r0010FF asCharacter) -> 'Georgian'
(16r001100 asCharacter to: 16r0011FF asCharacter) -> 'Hangul Jamo'
(16r001200 asCharacter to: 16r00137F asCharacter) -> 'Ethiopic'
(16r0013A0 asCharacter to: 16r0013FF asCharacter) -> 'Cherokee'
(16r001400 asCharacter to: 16r00167F asCharacter) -> 'Unified Canadian Aboriginal Syllabics'
(16r001680 asCharacter to: 16r00169F asCharacter) -> 'Ogham'
(16r0016A0 asCharacter to: 16r0016FF asCharacter) -> 'Runic'
(16r001700 asCharacter to: 16r00171F asCharacter) -> 'Tagalog'
(16r001720 asCharacter to: 16r00173F asCharacter) -> 'Hanunoo'
(16r001740 asCharacter to: 16r00175F asCharacter) -> 'Buhid'
(16r001760 asCharacter to: 16r00177F asCharacter) -> 'Tagbanwa'
(16r001780 asCharacter to: 16r0017FF asCharacter) -> 'Khmer'
(16r001800 asCharacter to: 16r0018AF asCharacter) -> 'Mongolian'
(16r001900 asCharacter to: 16r00194F asCharacter) -> 'Limbu'
(16r001950 asCharacter to: 16r00197F asCharacter) -> 'Tai Le'
(16r0019E0 asCharacter to: 16r0019FF asCharacter) -> 'Khmer Symbols'
(16r001D00 asCharacter to: 16r001D7F asCharacter) -> 'Phonetic Extensions'
(16r001E00 asCharacter to: 16r001EFF asCharacter) -> 'Latin Extended Additional'
(16r001F00 asCharacter to: 16r001FFF asCharacter) -> 'Greek Extended'
(16r002000 asCharacter to: 16r00206F asCharacter) -> 'General Punctuation'
(16r002070 asCharacter to: 16r00209F asCharacter) -> 'Superscripts and Subscripts'
(16r0020A0 asCharacter to: 16r0020CF asCharacter) -> 'Currency Symbols'
(16r0020D0 asCharacter to: 16r0020FF asCharacter) -> 'Combining Diacritical Marks for Symbols'
(16r002100 asCharacter to: 16r00214F asCharacter) -> 'Letterlike Symbols'
(16r002150 asCharacter to: 16r00218F asCharacter) -> 'Number Forms'
(16r002190 asCharacter to: 16r0021FF asCharacter) -> 'Arrows'
(16r002200 asCharacter to: 16r0022FF asCharacter) -> 'Mathematical Operators'
(16r002300 asCharacter to: 16r0023FF asCharacter) -> 'Miscellaneous Technical'
(16r002400 asCharacter to: 16r00243F asCharacter) -> 'Control Pictures'
(16r002440 asCharacter to: 16r00245F asCharacter) -> 'Optical Character Recognition'
(16r002460 asCharacter to: 16r0024FF asCharacter) -> 'Enclosed Alphanumerics'
(16r002500 asCharacter to: 16r00257F asCharacter) -> 'Box Drawing'
(16r002580 asCharacter to: 16r00259F asCharacter) -> 'Block Elements'
(16r0025A0 asCharacter to: 16r0025FF asCharacter) -> 'Geometric Shapes'
(16r002600 asCharacter to: 16r0026FF asCharacter) -> 'Miscellaneous Symbols'
(16r002700 asCharacter to: 16r0027BF asCharacter) -> 'Dingbats'
(16r0027C0 asCharacter to: 16r0027EF asCharacter) -> 'Miscellaneous Mathematical Symbols-A'
(16r0027F0 asCharacter to: 16r0027FF asCharacter) -> 'Supplemental Arrows-A'
(16r002800 asCharacter to: 16r0028FF asCharacter) -> 'Braille Patterns'
(16r002900 asCharacter to: 16r00297F asCharacter) -> 'Supplemental Arrows-B'
(16r002980 asCharacter to: 16r0029FF asCharacter) -> 'Miscellaneous Mathematical Symbols-B'
(16r002A00 asCharacter to: 16r002AFF asCharacter) -> 'Supplemental Mathematical Operators'
(16r002B00 asCharacter to: 16r002BFF asCharacter) -> 'Miscellaneous Symbols and Arrows'
(16r002E80 asCharacter to: 16r002EFF asCharacter) -> 'CJK Radicals Supplement'
(16r002F00 asCharacter to: 16r002FDF asCharacter) -> 'Kangxi Radicals'
(16r002FF0 asCharacter to: 16r002FFF asCharacter) -> 'Ideographic Description Characters'
(16r003000 asCharacter to: 16r00303F asCharacter) -> 'CJK Symbols and Punctuation'
(16r003040 asCharacter to: 16r00309F asCharacter) -> 'Hiragana'
(16r0030A0 asCharacter to: 16r0030FF asCharacter) -> 'Katakana'
(16r003100 asCharacter to: 16r00312F asCharacter) -> 'Bopomofo'
(16r003130 asCharacter to: 16r00318F asCharacter) -> 'Hangul Compatibility Jamo'
(16r003190 asCharacter to: 16r00319F asCharacter) -> 'Kanbun'
(16r0031A0 asCharacter to: 16r0031BF asCharacter) -> 'Bopomofo Extended'
(16r0031F0 asCharacter to: 16r0031FF asCharacter) -> 'Katakana Phonetic Extensions'
(16r003200 asCharacter to: 16r0032FF asCharacter) -> 'Enclosed CJK Letters and Months'
(16r003300 asCharacter to: 16r0033FF asCharacter) -> 'CJK Compatibility'
(16r003400 asCharacter to: 16r004DBF asCharacter) -> 'CJK Unified Ideographs Extension A'
(16r004DC0 asCharacter to: 16r004DFF asCharacter) -> 'Yijing Hexagram Symbols'
(16r004E00 asCharacter to: 16r009FFF asCharacter) -> 'CJK Unified Ideographs'
(16r00A000 asCharacter to: 16r00A48F asCharacter) -> 'Yi Syllables'
(16r00A490 asCharacter to: 16r00A4CF asCharacter) -> 'Yi Radicals'
(16r00AC00 asCharacter to: 16r00D7AF asCharacter) -> 'Hangul Syllables'
(16r00D800 asCharacter to: 16r00DB7F asCharacter) -> 'High Surrogates'
(16r00DB80 asCharacter to: 16r00DBFF asCharacter) -> 'High Private Use Surrogates'
(16r00DC00 asCharacter to: 16r00DFFF asCharacter) -> 'Low Surrogates'
(16r00E000 asCharacter to: 16r00F8FF asCharacter) -> 'Private Use Area'
(16r00F900 asCharacter to: 16r00FAFF asCharacter) -> 'CJK Compatibility Ideographs'
(16r00FB00 asCharacter to: 16r00FB4F asCharacter) -> 'Alphabetic Presentation Forms'
(16r00FB50 asCharacter to: 16r00FDFF asCharacter) -> 'Arabic Presentation Forms-A'
(16r00FE00 asCharacter to: 16r00FE0F asCharacter) -> 'Variation Selectors'
(16r00FE20 asCharacter to: 16r00FE2F asCharacter) -> 'Combining Half Marks'
(16r00FE30 asCharacter to: 16r00FE4F asCharacter) -> 'CJK Compatibility Forms'
(16r00FE50 asCharacter to: 16r00FE6F asCharacter) -> 'Small Form Variants'
(16r00FE70 asCharacter to: 16r00FEFF asCharacter) -> 'Arabic Presentation Forms-B'
(16r00FF00 asCharacter to: 16r00FFEF asCharacter) -> 'Halfwidth and Fullwidth Forms'
(16r00FFF0 asCharacter to: 16r00FFFF asCharacter) -> 'Specials'
(16r010000 asCharacter to: 16r01007F asCharacter) -> 'Linear B Syllabary'
(16r010080 asCharacter to: 16r0100FF asCharacter) -> 'Linear B Ideograms'
(16r010100 asCharacter to: 16r01013F asCharacter) -> 'Aegean Numbers'
(16r010300 asCharacter to: 16r01032F asCharacter) -> 'Old Italic'
(16r010330 asCharacter to: 16r01034F asCharacter) -> 'Gothic'
(16r010380 asCharacter to: 16r01039F asCharacter) -> 'Ugaritic'
(16r010400 asCharacter to: 16r01044F asCharacter) -> 'Deseret'
(16r010450 asCharacter to: 16r01047F asCharacter) -> 'Shavian'
(16r010480 asCharacter to: 16r0104AF asCharacter) -> 'Osmanya'
(16r010800 asCharacter to: 16r01083F asCharacter) -> 'Cypriot Syllabary'
(16r01D000 asCharacter to: 16r01D0FF asCharacter) -> 'Byzantine Musical Symbols'
(16r01D100 asCharacter to: 16r01D1FF asCharacter) -> 'Musical Symbols'
(16r01D300 asCharacter to: 16r01D35F asCharacter) -> 'Tai Xuan Jing Symbols'
(16r01D400 asCharacter to: 16r01D7FF asCharacter) -> 'Mathematical Alphanumeric Symbols'
(16r020000 asCharacter to: 16r02A6DF asCharacter) -> 'CJK Unified Ideographs Extension B'
(16r02F800 asCharacter to: 16r02FA1F asCharacter) -> 'CJK Compatibility Ideographs Supplement'
(16r0E0000 asCharacter to: 16r0E007F asCharacter) -> 'Tags'
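
The listing above is not contiguous: for instance, nothing is listed between 16r00A4CF and 16r00AC00. A minimal workspace sketch for finding such gaps, assuming the ranges are kept as start -> end integer pairs sorted by start. The names `ranges` and `gaps` and the choice of four sample entries are illustrative and not part of the posted code.

```smalltalk
"Sketch only: report the unlisted spans between consecutive ranges.
 The four entries are copied from the listing above; 'ranges' and 'gaps'
 are hypothetical names."
| ranges gaps |
ranges := {
	16r00A000 -> 16r00A48F.	"Yi Syllables"
	16r00A490 -> 16r00A4CF.	"Yi Radicals"
	16r00AC00 -> 16r00D7AF.	"Hangul Syllables"
	16r00D800 -> 16r00DB7F }.	"High Surrogates"
gaps := OrderedCollection new.
ranges allButLast with: ranges allButFirst do: [:current :next |
	"A gap exists when the next range does not start right after the current one ends."
	current value + 1 < next key
		ifTrue: [gaps add: (current value + 1) -> (next key - 1)]].
gaps
```

With these four entries, the spans 16rA4D0 to 16rABFF and 16rD7B0 to 16rD7FF should come back as gaps. A gap here only means the listing has no entry for that span, not that Unicode leaves it unassigned.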

--
Eric

Eliot
_,,,^..^,,,_ (phone)



Re: Unicode Ranges you might want to copy

darth-cheney


On Mon, Feb 1, 2021 at 10:52 AM Eliot Miranda <[hidden email]> wrote:
Hi Eric,
 
Do you have any code for testing this, a workspace perhaps?  If so, could you post it here?
 
Eliot
_,,,^..^,,,_ (phone)


Eliot -- I'm having trouble tracking down any of the images I had when I was playing with this last year. However, part of my issue was documented in this mailing list exchange. Note that in my examples I'm using the Assurbanipal ttf font available here. My recollection is that the cuneiform (and perhaps other ancient script) fonts use Unicode code points that are higher than Squeak's font handling can deal with. For example, all of the cuneiform characters are in the range U+12000 to U+1254F.
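
For reference, U+12000 lies above U+FFFF, so it cannot be stored in a single 16-bit unit; in UTF-16 it is written as a pair taken from the High Surrogates and Low Surrogates ranges in the listing above. A small workspace sketch of that arithmetic follows. The variable names are illustrative, and this only shows the encoding; it says nothing about where Squeak's font handling actually gives up.

```smalltalk
"Sketch only: split a code point above 16rFFFF into a UTF-16 surrogate pair."
| codePoint offset high low |
codePoint := 16r12000.	"first code point of the cuneiform range mentioned above"
offset := codePoint - 16r10000.
high := 16rD800 + (offset bitShift: -10).	"top 10 bits of the offset"
low := 16rDC00 + (offset bitAnd: 16r3FF).	"bottom 10 bits of the offset"
{ high printStringBase: 16. low printStringBase: 16 }
```

For 16r12000 the high surrogate works out to 16rD808 and the low surrogate to 16rDC00.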

--
Eric



Re: Unicode Ranges you might want to copy

Squeak - Dev mailing list
In reply to this post by Christoph Thiede
Hi Christoph


I have looked at and added the class to my project notes.

I noticed the class comment does not offer an easy way to view the characters in a string, the way Levente demonstrated with the key in the association: (16r002FF0 asCharacter to: 16r002FFF asCharacter) -> 'Ideographic Description Characters'

It's a really handy, quick way to see whether the characters display.

When you write "does not yet support UTF-16 characters", I presume you mean stuff like the isUppercaseCode: method:

isUppercaseCode: anInteger
"Answer whether anInteger is the code of an uppercase letter."

^ 8r101 <= anInteger and: [anInteger <= 8r132].
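
The octal literals in that method are 8r101 = 65 ($A) and 8r132 = 90 ($Z), so the test only covers the ASCII capitals. A quick workspace check, using a block that copies the quoted expression rather than calling the real method (the block name is made up for this sketch):

```smalltalk
"Sketch only: 'asciiUppercaseCheck' repeats the expression quoted above."
| asciiUppercaseCheck |
asciiUppercaseCheck := [:anInteger |
	8r101 <= anInteger and: [anInteger <= 8r132]].
{
	asciiUppercaseCheck value: $A asInteger.	"65, inside the range"
	asciiUppercaseCheck value: 16r0410 }	"CYRILLIC CAPITAL LETTER A, outside the range"
```

The first entry answers true and the second false, even though 16r0410 is an uppercase letter.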


Anyway, thanks for pointing out the class to me. It may come in handy.

cheers.


---- On Mon, 01 Feb 2021 09:36:35 -0500 Thiede, Christoph <[hidden email]> wrote ----

Hi Timothy,


Do you also know the Unicode class and, in particular, its class variables and its testing selectors? Not sure whether this is on the same abstraction level as your work; I just wanted to add this pointer.


Also note that the Unicode class's character classification does not yet support UTF-16 characters. I added this to my todo list some disgracefully long time ago [1] and have not yet managed to return to this issue (so sorry, Levente!).


Best,

Christoph


[1] http://forum.world.st/Unicode-td5113495.html





From: Squeak-dev <[hidden email]> on behalf of gettimothy via Squeak-dev <[hidden email]>
Sent: Monday, February 1, 2021 15:30:29
To: [hidden email]
Cc: [hidden email]
Subject: Re: [squeak-dev] Unicode Ranges you might want to copy
 
Hi Eric


Thanks for the heads up!

Doing some light browsing on the matter, I found these resources:

https://unicode-table.com/en/#control-character
 https://css-tricks.com/almanac/properties/u/unicode-range/
 http://www.alanwood.net/unicode/alphabetic_presentation_forms.html



That last one looks very useful and I will be working through it.

The middle one claims that fonts can be loaded dynamically when needed.

An interesting thing is that the Alan Wood site displays the $ sign under the currency range, but the jrgraphix website does not!


It is also somewhat apparent that "like" ranges are not contiguous.

Thanks again for your help!

Cheers,  




---- On Mon, 01 Feb 2021 08:57:32 -0500 [hidden email] wrote ----

Hi Timothy,

I spent several (fruitless) days last year trying to get cuneiform fonts to render properly in Squeak (see here for examples of the fonts). There does seem to be an issue with rendering glyphs above a certain code point from what I recall. I am definitely interested in your work and what you end up finding out.

On Sun, Jan 31, 2021 at 7:20 AM gettimothy via Squeak-dev <[hidden email]> wrote:

--
Eric






Re: Unicode Ranges you might want to copy

Squeak - Dev mailing list
In reply to this post by darth-cheney
Either Levente or Christoph recommended something named Unifont.

I just did a search and found http://unifoundry.com/unifont/, which claims to cover all glyphs in TrueType.

I will be messing around with this next.






Hi Timothy,

I spent several (fruitless) days last year trying to get cuneiform fonts to render properly in Squeak (see here for examples of the fonts). There does seem to be an issue with rendering glyphs above a certain code point from what I recall. I am definitely interested in your work and what you end up finding out.

If you can find a character range, we can try to get the characters to appear in the browser. My Seaside is coded for UTF-8.




With Unifont Russia Alphabet Comes to You! (or...some success with the font work)

Squeak - Dev mailing list
In reply to this post by darth-cheney
Hi folks.

I am looking at the wide string for (16r000400  asCharacter to:  16r0004FF asCharacter)  and I can see Russian from my house!

Very happy about this.
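
A minimal workspace sketch for building that string explicitly, so the result does not depend on what `to:` answers in a given image. The variable name is illustrative, and whether the glyphs actually render depends on the font in use.

```smalltalk
"Sketch only: collect the Cyrillic block into a WideString and inspect it."
| cyrillic |
cyrillic := WideString new: 16r04FF - 16r0400 + 1.
16r0400 to: 16r04FF do: [:codePoint |
	cyrillic at: codePoint - 16r0400 + 1 put: (Character value: codePoint)].
cyrillic inspect
```

Swapping in the bounds of any other range from the listing gives the same quick visual check for that range.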

Here is the process.

On Linux:
      Download a ttf font, for example from http://unifoundry.com/unifont/
      Copy your font files (.ttf and/or .otf) to their respective directories:
      /usr/share/fonts/TTF
      /usr/share/fonts/OTF
      Run the following commands, with the directories where you copied the fonts as arguments:
      mkfontdir /usr/share/fonts/{TTF,OTF}
      mkfontscale /usr/share/fonts/{TTF,OTF}
      fc-cache -f -v  (I do this as root and as my normal user account)
      Restart X.

In Squeak:

Apps -> Font Importer  (import the fonts into the image)
World menu -> Appearance -> System Fonts  (blindly select one of the fonts you installed to see what sticks)

Unfortunately, my mail client (see! now I know what to blame; yesterday I had no clue) does not support Cyrillic.

My local instance of the UnicodeRangeBrowser displays the cyrillic in the browser and my workspace displays it.

very good news!


Ok...for UnicodeRangeBrowser...I need to expand it to keep tabs on what sticks.
This means I have to create forms and such so others can add information if they wish.
Then, for every Unicode Font Range, give instructions/fonts/steps to get things consistent from image to browser.

Then...tests....
It MIGHT make sense to create UnicodeFontRange-specific in-image tests, i.e. loop through the characters and verify the integer = the displayed character. Maybe throw in some should: reason: 'font xyz not installed?' stuff.
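
One possible shape for such a test, as an SUnit sketch. The class name, category, and selector are hypothetical. It checks only that each code point survives the round trip through Character; whether the installed font actually has a glyph for it is a separate question that this sketch does not answer. The class definition can be evaluated in a workspace, and the method would be added through a browser.

```smalltalk
"Sketch only: hypothetical test class for one Unicode range."
TestCase subclass: #UnicodeRangeRoundTripTest
	instanceVariableNames: ''
	classVariableNames: ''
	poolDictionaries: ''
	category: 'UnicodeRangeBrowser-Tests'.

"Method to add to UnicodeRangeRoundTripTest (for example in a 'tests' protocol):"
testCyrillicRoundTrip
	"Every code point in the Cyrillic block should map to a Character and back."
	16r0400 to: 16r04FF do: [:codePoint |
		self
			assert: (Character value: codePoint) asInteger = codePoint
			description: 'Round trip failed for 16r' , (codePoint printStringBase: 16)]
```

The failure description is where a hint such as 'font xyz not installed?' could go once there is a way to ask a font whether it covers a code point.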

anyhoo, it feels good not to be flailing around.

cheers.