Infinite runarrays?

Georg Heeg

All,

 

Working on fonts and encodings again, we came across the issue that the Unicode character set is defined in terms of positive integers. This implies that there is (theoretically) no predefined upper limit for the highest Unicode character value. In the current Unicode standard, 5.0.0, the highest character defined has the number 16rE01EF = 917999. The highest character reserved is 16r10FFFD = 1114109.

 

With these changes in mind, we thought about an appropriate class to map Unicode numbers to fonts (e.g. in composed fonts). Currently ByteArrays of 65536 elements are used for this purpose. This is inappropriate for two reasons: first, it limits the number of fonts in a composed font to 255, and second, it does not support any Unicode character with a number larger than 65535. Our first idea was to use (modified) RunArrays for this purpose, and the initial step was actually easy: we loaded AT-MetaNumerics and set the last run in the RunArray to be Infinity positive. Thus this RunArray has size +infinity (which is correct), but it also gets +infinity as its hash value (which is not compatible with anything that uses a hash value).
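
For readers who have not used run-based maps, the idea can be sketched as follows. This is only an illustration in Python, not the VisualWorks RunArray API: the class name `RunMap`, the `(run_length, value)` input format and the font indices are all hypothetical, and keys are 0-based code points rather than 1-based Smalltalk indices. Each run stores one value for a whole range of code points, so the table is neither limited to byte-sized font indices nor to 65536 entries.

```python
from bisect import bisect_right
from itertools import accumulate


class RunMap:
    """Hypothetical run-length map from non-negative integer keys to values."""

    def __init__(self, runs):
        # runs: (run_length, value) pairs, listed in ascending key order
        runs = list(runs)
        self.values = [value for _, value in runs]
        # ends[i] is the first key that lies *after* run i
        self.ends = list(accumulate(length for length, _ in runs))

    def __len__(self):
        # Total number of keys covered by all runs
        return self.ends[-1] if self.ends else 0

    def __getitem__(self, key):
        if not 0 <= key < len(self):
            raise IndexError(key)
        # Binary search for the first run whose end lies beyond the key
        return self.values[bisect_right(self.ends, key)]


# Illustrative data only: three runs covering code points 0 .. 16r10FFFF
fonts = RunMap([
    (0x80, 0),                  # 0 .. 16r7F         -> font 0
    (0x10000 - 0x80, 1),        # 16r80 .. 16rFFFF   -> font 1
    (0x110000 - 0x10000, 300),  # 16r10000 and above -> font 300
])
```

With this data, `fonts[0x41]` selects font 0 and `fonts[0xE01EF]` selects font 300, a font index that would not fit into a ByteArray element.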

 

Does anybody have experience in this area?

 

Georg

Re: Infinite runarrays?

Reinout Heeck-2
Georg,

at http://www.unicode.org/Public/5.0.0/ucd/UCD.html it says:
"Code points are expressed as hexadecimal numbers with four to six digits"

Which seems like a pretty definite range for 5.0.0 (and probably for the
next couple of revs). So your RunArray could have 16rffffff or
16r1000000 as the last entry.
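
A minimal sketch of that suggestion, in Python rather than Smalltalk and with hypothetical names (`close_runs`, `tail_value`, `LIMIT`): instead of an infinite final run, the last run is padded out to a fixed bound, so the total size stays an ordinary integer. The bound used here is the 16r1000000 mentioned above; any other finite bound could be passed in.

```python
LIMIT = 0x1000000  # six hexadecimal digits, i.e. keys 0 .. 16rFFFFFF


def close_runs(runs, tail_value, limit=LIMIT):
    """Return the runs plus one final run that pads the total length to limit.

    runs is a list of (run_length, value) pairs; tail_value is the value
    used for every key not covered by the explicit runs.
    """
    used = sum(length for length, _ in runs)
    if used > limit:
        raise ValueError("runs cover more keys than the limit allows")
    closed = list(runs)
    if used < limit:
        # The remainder becomes a single finite run
        closed.append((limit - used, tail_value))
    return closed


# Illustrative data only: one explicit run, everything else falls back to font 0
closed = close_runs([(0x10000, 1)], tail_value=0)
total = sum(length for length, _ in closed)
```

Here `total` equals `LIMIT`, a plain integer, so size and hash computations that expect integers are unaffected.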


Please keep Infinity and cousins out of my base image ;-)


R
-




