TrueTypeFont and Unicode Characters

darth-cheney
Hello,

I am trying to load a particular TrueType font that provides glyphs for the signs in the Cuneiform Unicode block.
You can find some information on this page.

Specifically, I'm attempting to work with the Neo-Assyrian font on that page ('Assurbanipal.ttf' in the zip file). I have also tried the Old Babylonian fonts with similar results.

I have "imported" this font using the FileList tool and it does appear as a TextStyle. However, if I attempt to run:

```
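"Create a TextMorph that uses the imported font at size 36."
m := TextMorph new
	beAllFont: ((TextStyle named: 'Assurbanipal') fontOfSize: 36);
	backgroundColor: Color white;
	yourself.

"U+12038 is CUNEIFORM SIGN ASH, which lies outside the 16-bit range."
signString := (Unicode value: 16r12038) asString.
m contents: signString.

m openInHand.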
```

I get a TextMorph that appears blank. Either I am doing something incorrectly or the system cannot render the glyph at that code point (which should be the sign "AŠ"). Any ideas?
--
Eric



Re: TrueTypeFont and Unicode Characters

marcel.taeumel
Hi Eric!

On 30.04.2020 at 04:22:12, Eric Gade <[hidden email]> wrote:
> I have "imported" this font using the FileList tool and it does appear as a TextStyle

Please use the Font Importer to import such fonts. You can find it in the Tools menu.

Best,
Marcel



Re: TrueTypeFont and Unicode Characters

darth-cheney
Thanks Marcel.

I have used the specified Font Importer tool now, but am still having the same issue.

On Thu, Apr 30, 2020 at 3:43 AM Marcel Taeumel <[hidden email]> wrote:
> Please use the Font Importer to import such fonts. You can find it in the Tools menu.

--
Eric



Attachment: Screen Shot 2020-04-30 at 7.48.38 AM.png (482K)

Re: TrueTypeFont and Unicode Characters

Tobias Pape
Hi

> On 30.04.2020, at 13:49, Eric Gade <[hidden email]> wrote:
>
> Thanks Marcel.
>
> I have used the specified Font Importer tool now, but am still having the same issue.

I'm not sure we are actually loading all glyphs.
Also note that nothing in the font (like substitution tables) is actually supported :/
Sorry.

Best regards
        -Tobias


Re: TrueTypeFont and Unicode Characters

Beckmann, Tom
Hey Eric,

I just tried with a Devanagari font, and this was able to return a form for me:

((TextStyle named: #NotoSansDevanagari) fontOfSize: 24) ttcDescription renderGlyph: 2327 height: 30 fgColor: Color white bgColor: Color black depth: 32

where 2327 is the Unicode code point, written in decimal (16r917 in hexadecimal, i.e. U+0917). Of course, as Tobi said, we do not support shaping (so no ligatures). If you have a very well-defined use case, you might be able to get away with accessing glyphs directly by their index in the font. For this, you could inspect what ttcDescription returns and see whether its glyphs field contains the ligatures you're looking for.
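
As a small aside on the number in that expression, here is a workspace sketch showing that the decimal and radix-prefixed spellings denote the same integer, so either form can be passed where a code point is expected. The variable names are made up for illustration; only integer literals and comparisons are used.

```smalltalk
"Smalltalk integer literals accept a radix prefix, so both spellings denote the same number."
devanagariGa := 16r0917.    "U+0917, the value written as 2327 above"
cuneiformAsh := 16r12038.   "U+12038, the sign from the original post"

devanagariGa = 2327.        "true"
cuneiformAsh = 73784.       "true"
```

Whether a call such as renderGlyph:height:fgColor:bgColor:depth: actually returns a form for 16r12038 depends on what the font reader has loaded, which is exactly the open question in this thread.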

Best,
Tom
________________________________________
From: Squeak-dev <[hidden email]> on behalf of Tobias Pape <[hidden email]>
Sent: Thursday, April 30, 2020 1:58:12 PM
To: The general-purpose Squeak developers list
Subject: Re: [squeak-dev] TrueTypeFont and Unicode Characters

> I'm not sure we are actually loading all glyphs.
> Also note that nothing in the font (like substitution tables) is actually supported :/

Re: TrueTypeFont and Unicode Characters

darth-cheney
On Thu, Apr 30, 2020 at 2:22 PM Beckmann, Tom <[hidden email]> wrote:
> If you have a very well defined use case, you might be able to get away with accessing glyphs by their index in the font directly. For this, you could inspect what ttcDescription returns and see if its glyphs field contains the ligatures you're looking for.

Thanks Tom. I don't know much at all about fonts in any deep way, so this is a good start. I'll let you know if I have any success.

--
Eric


Re: TrueTypeFont and Unicode Characters

darth-cheney
Hello all,

I wanted to revisit this issue as I'm working on it again. Back in April, Tobias mentioned the following:

> I'm not sure we are actually loading all glyphs.
> Also note that nothing in the font (like substitution tables) is actually supported :/
> Sorry.

It appears that the particular ancient fonts that I'm using (https://www.hethport.uni-wuerzburg.de/cuneifont/) -- and other ancient fonts whose code points are now part of the Unicode standard -- make use of substitution in order to access code points beyond 65535 (the largest value that fits in an unsigned 16 bits). I know this to be true for my particular Akkadian fonts because, if I dig into the code, I am able to render the glyphs that TTFontReader has parsed out of the file, but only if I modify the renderGlyph:height:fgColor:bgColor:depth: method to look in the font description's table of glyphs (rather than the code point sparse table, which is 65535 entries long and has no glyphs at all in it).
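
To make the 16-bit limit concrete, here is a minimal workspace sketch using only integer arithmetic. It does not touch any font classes, and the variable names are made up for illustration.

```smalltalk
codePoint := 16r12038.

"Anything above 16rFFFF (65535) does not fit in 16 bits."
codePoint > 16rFFFF.                   "true"

"The bits above the low 16 give the Unicode plane."
plane := codePoint bitShift: -16.      "1"

"What survives if the value is cut down to 16 bits."
low16 := codePoint bitAnd: 16rFFFF.    "16r2038, an unrelated code point"
```

A lookup table indexed only by 16-bit values therefore has no slot for this sign, whatever glyphs the reader may have parsed out of the file.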

Is there some deep reason not to support GSUB (substitution), aside from the complexity of ligatures? Right now, if a TTF file has a GSUB entry, it's not even being parsed out by TTFontReader. I'm really bumping around in the dark here, but I might be able to come up with something that at least permits the use of straight one-to-one substitutions. Any thoughts or ideas?

I'm attaching a few screenshots that can give a better idea of what I've done just to get something to render properly. I created a modified version of renderGlyph: (etc, etc).



Attachments: Workspace.png (14K), TTFontDescription.png (123K), CuneiformRendered.png (8K)

Re: TrueTypeFont and Unicode Characters

Yoshiki Ohshima-3

On Tue, Aug 4, 2020 at 1:49 PM Eric Gade <[hidden email]> wrote:
> I'm attaching a few screenshots that can give a better idea of what I've done just to get something to render properly. I created a modified version of renderGlyph: (etc, etc).

It looks very cool!

> Is there some deep reason not to support GSUB (substitution), aside from the complexity of ligatures?

The major reason is historical: TrueType rendering was added on the assumption that it would follow Smalltalk-80's basic text handling, e.g. the width of a text is the sum of the individual character widths, no character has a non-positive advance width, and so on. One reason for this is that text handling is not just about rendering: when the user points somewhere within the text on screen, the system has to figure out the corresponding index in the text.

The original design was that one could write a dedicated renderer and position-to-index feature for each language, but nobody (including me) spent much time making use of it (except for Japanese). Another attempt at more complete text handling, either natively in Squeak or by calling external libraries, could be useful.
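
The two assumptions described above can be sketched in a few lines of workspace code. This is only an illustration with made-up advance widths and block names, not the code Squeak's text layout actually runs.

```smalltalk
"Hypothetical advance widths, in pixels, for a four-character string."
advances := #(7 9 4 8).

"Text width under the Smalltalk-80 model: the plain sum of the advances."
textWidth := advances inject: 0 into: [:sum :each | sum + each].

"Position-to-index: walk the advances until the running edge passes x."
indexAtX := [:x | | edge index |
	edge := 0.
	index := 0.
	advances doWithIndex: [:each :i |
		(index = 0 and: [x < (edge + each)]) ifTrue: [index := i].
		edge := edge + each].
	index].

textWidth.             "28"
indexAtX value: 0.     "1"
indexAtX value: 17.    "3"
indexAtX value: 40.    "0, past the end of the text"
```

Once ligatures or zero-width marks enter the picture, one glyph no longer corresponds to one character, and a simple walk like this stops being enough.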

--
-- Yoshiki




Re: TrueTypeFont and Unicode Characters

Beckmann, Tom
Hi Eric,

I recently played around with FreeType and ended up with this: https://github.com/tom95/sqfreetypefont
It was developed for Linux and may work on Mac. However, the version of FreeType that you can get for Windows appears not to be compatible.

Note that if the font you're using also makes use of ligatures, this may still not help you, as a library like HarfBuzz is required for the shaping process. The lookup is also optimized for characters in the ASCII range and otherwise uses a Dictionary, so it may be worth changing this if rendering turns out to be slow for large bodies of text outside the ASCII range.
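
For readers wondering what such a two-tier lookup might look like, here is a rough workspace sketch. It is an assumption about the general shape only, not the code in sqfreetypefont; all names are hypothetical, and symbols stand in for real glyph objects.

```smalltalk
"Hypothetical glyph cache: a flat Array for ASCII, a Dictionary for everything else."
asciiGlyphs := Array new: 128.
otherGlyphs := Dictionary new.

store := [:codePoint :glyph |
	codePoint < 128
		ifTrue: [asciiGlyphs at: codePoint + 1 put: glyph]
		ifFalse: [otherGlyphs at: codePoint put: glyph]].

lookup := [:codePoint |
	codePoint < 128
		ifTrue: [asciiGlyphs at: codePoint + 1]
		ifFalse: [otherGlyphs at: codePoint ifAbsent: [nil]]].

store value: 65 value: #glyphForA.
store value: 16r12038 value: #glyphForAsh.

lookup value: 65.          "#glyphForA"
lookup value: 16r12038.    "#glyphForAsh"
lookup value: 16r0917.     "nil, nothing cached yet"
```

The Array access is a direct index, while every non-ASCII character pays for a hash lookup, which is why a text made up entirely of Cuneiform signs would take the slower path.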

If you do try out the FreeTypeFont and run into any problems, I'm happy to assist via the GitHub issues of the project.

Best,
Tom

Re: TrueTypeFont and Unicode Characters

darth-cheney

On Wed, Aug 5, 2020 at 8:11 AM Beckmann, Tom <[hidden email]> wrote:
> I recently played around with FreeType and ended up with this: https://github.com/tom95/sqfreetypefont
> It was developed for Linux and may work on Mac. The version of FreeType that you can get for Windows is, however, not compatible it appears.

This is excellent, thanks! I'm having an issue trying to install according to the instructions, but I'll post that as an issue over on GitHub instead of polluting this space.

On Tue, Aug 4, 2020 at 5:24 PM Yoshiki Ohshima <[hidden email]> wrote:
> The major reason is historical; using TrueType for rendering text was added with the idea that it follows the Smalltalk-80's basic text handling, e.g. the text width is the sum of individual character width, and none has non-positive advance width, etc., etc. One of the reasons is that it is not about just rendering, but if the user points within the text on screen, it'd have to figure out the index in the text.
>
> The original design was that one can write a dedicated renderer and position-to-index feature for each language etc., but nobody (including me) spent much time to utilize it (except for Japanese language). Another run to support more complete text handling, either in Squeak native or perhaps in the manner of calling external libraries could be useful.

Thanks, Yoshiki. That all makes sense to me. I think the basic TTF stuff is actually in pretty good shape (aside from ligatures).

The following is more of an FYI in case anyone else stumbles on this issue. After further investigation, it appears that my diagnosis was incorrect: my particular issue is not about GSUB or any additional fancy text-handling tables. It turns out that the `cmap` table for platform ID 0 (Unicode), which is one of the few that the current Squeak implementation handles, has been updated in the time since the library was written. Specifically, Unicode 2.0 and later allow code points beyond the 16-bit range, which the table stores as 32-bit values (see the Overview here). There are different encoding table sub-formats for these scenarios, as well as other encoding strategies for dealing with code points that fall outside the 16-bit range. My particular font seems to be using Format 4. If you click that link you'll see how complex it is. I can't even say I understand what they are talking about there, and it would be complicated to implement.

OpenType is complicated!
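
For anyone who wants a feel for what a 32-bit lookup involves, here is a minimal sketch of the idea behind one of the sub-formats in the `cmap` specification, format 12, which stores groups of (start code point, end code point, start glyph index). This is not necessarily the subtable my font uses, the group data below is invented, and a real reader would first have to parse the groups out of the file and would normally binary-search them.

```smalltalk
"Hypothetical format 12 groups: each entry is {startCharCode. endCharCode. startGlyphID}."
groups := {
	{16r0020. 16r007E. 3}.
	{16r12000. 16r123FF. 98} }.

glyphIndexFor := [:codePoint | | group |
	group := groups
		detect: [:each | codePoint between: each first and: each second]
		ifNone: [nil].
	group isNil
		ifTrue: [0]   "glyph 0 is the font's missing-glyph entry"
		ifFalse: [group third + (codePoint - group first)]].

glyphIndexFor value: 16r12038.    "154"
glyphIndexFor value: 16r0041.     "36"
glyphIndexFor value: 16rFFFF.     "0"
```

The lookup itself is small; most of the work would be in teaching the reader to recognize and parse the additional subtable formats.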

 

--
Eric