Hi Bert and other concerned folk.
In reading Bert's post about fixing fonts to show the invisible characters, I was reminded of tripping over the non-breaking space (nbsp).

See mantis report:

http://bugs.impara.de/view.php?id=2446

I use a Mac, and MacRoman defines nbsp as char 202. This value can be obtained from Character nbsp.

In the default font in 7021 this appears as the British pound sign. There are some Squeak fonts (Atlantis, for example) that will show a blank space for that character.

Now Bert's fix uses char 160, which is used by browsers as nbsp, but the Latin1 standard I was pointed to has 160 in a range of undefined character values.

So there is (at least one) bug in this. What is the bug?

1) Should nbsp be defined as the Latin1 value?
2) Should Squeak fonts have a way of saying what set of characters they represent?
3) Should the available fonts in Squeak be consistent in their choice of encoding?
4) Should Character class be refactored to reflect the ability to choose different encodings?
5) Should Character class be debugged to reflect Latin1 rather than MacRoman encodings? If so, what do you do about MacRoman?

I have enough knowledge to know these questions are significant to the well-being and maintenance of Squeak. I am out of my depth in trying to suggest answers.

It would be good if someone who understands the issue more deeply would formulate a mantis issue around it.

Yours in service, -- Jerome Peace
Hi all,
Ah, the eyes are going and the glasses are not up to the task. Correction: the default font shows 202 as an E with a circumflex (the Latin1 encoding), not as the British pound sign.

-- Jerome Peace
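The mismatch described in the two posts above can be checked outside Squeak. The following is a minimal sketch in Python (not from the original thread); it assumes that Python's standard `mac_roman` and `latin_1` codecs are a fair stand-in for the MacRoman and Latin1 tables being discussed. The helper name `describe` is made up for this illustration.

```python
import unicodedata


def describe(byte_value: int, encoding: str) -> str:
    """Decode one byte under `encoding` and return the Unicode name of the result.

    `describe` is a hypothetical helper for this example only.
    """
    char = bytes([byte_value]).decode(encoding)
    return unicodedata.name(char)


# Byte 202 read as MacRoman versus read as Latin1.
mac_202 = describe(202, "mac_roman")
latin_202 = describe(202, "latin_1")

# Byte 160 read as Latin1, the value used in Bert's fix.
latin_160 = describe(160, "latin_1")

print("202 in MacRoman:", mac_202)
print("202 in Latin1:  ", latin_202)
print("160 in Latin1:  ", latin_160)
```

Under these assumptions, the same byte value 202 names a non-breaking space in one table and an E with circumflex in the other, which is consistent with the correction above.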
In reply to this post by Jerome Peace
On 10.04.2006 at 02:19, Peace Jerome wrote:
> Hi Bert and other concerned folk.
>
> In reading Bert's post about fixing fonts to show the
> invisible characters I was reminded of tripping over
> the non-breaking space (nbsp).
>
> See mantis report:
>
> http://bugs.impara.de/view.php?id=2446
>
> I use a Mac and MacRoman defines nbsp as char 202. And
> this can be gotten from Character nbsp.

That doesn't have anything to do with the host operating system. We switched to Unicode, of which Latin-1 (ISO-8859-1) is the 8-bit subset (nitpicking aside).

> In the default font in 7021 this appears as the
> British pound sign.

It should be Ê (E circumflex).

> There are some Squeak fonts
> (Atlantis for example) that will show a blank space
> for that character.

Only because Atlantis never had a glyph for "E circumflex". That's why it was blank, and that's why it is now replaced with a rectangle by my fix.

> Now Bert's fix uses char 160. Which is used by
> browsers as nbsp but the Latin1 standard I was pointed
> to has 160 in a range of undefined character values.

Codepoints 128-159 do have a meaning but no glyphs in Unicode. 160 is indeed the non-breaking space. It's "reserved" in that there is no actual glyph associated with it; in that respect it's more like a control character. However, for our particular implementation of bitmap fonts it's convenient to just use a blank glyph. See http://www.unicode.org/charts/PDF/U0080.pdf

> So the question is there is (at least one) bug in
> this. What is the bug?
>
> 1) Should nbsp be defined as the Latin1 value?

Yes.

> 2) Should Squeak fonts have a way of saying what set
> of characters they represent?

I guess so.

> 3) Should the available fonts in Squeak be consistent
> in choice of encoding?

In an ideal world, yes. For practical reasons I think we have to deal with whatever we get.

> 4) Should Character class be refactored to reflect the
> ability to choose different encodings?

No. Characters are not encoded; they represent Unicode values. Or at least by default they do. We support some non-Unicode 16-bit encodings for Asian languages, too, IIRC. Yoshiki would know best.

> 5) Should Character class be debugged to reflect
> Latin1 rather than MacRoman encodings?

Yes.

> If so what do you do about MacRoman?

Use the appropriate converter class.

> I have enough knowledge to know these questions are
> significant to the well-being and maintenance of
> Squeak. I am out of my depth in trying to suggest
> answers.
>
> It would be good if someone who understands the issue
> more deeply would formulate a mantis issue around it.

Sure. There's a whole lot still to do in that area.

- Bert -
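To make the "use a converter" advice and the remark about codepoints 128-159 concrete, here is a small Python sketch (not from the original thread, and not Squeak's converter classes). It assumes the standard `mac_roman` and `latin_1` codecs; the function name `macroman_to_latin1` is hypothetical.

```python
import unicodedata


def macroman_to_latin1(data: bytes) -> bytes:
    """Re-encode MacRoman bytes as Latin1 by going through Unicode text.

    Hypothetical helper for illustration. Characters that Latin1 cannot
    represent are replaced with "?" rather than raising an error.
    """
    text = data.decode("mac_roman")
    return text.encode("latin_1", errors="replace")


# "A", MacRoman nbsp (byte 202), "B"
converted = macroman_to_latin1(bytes([0x41, 0xCA, 0x42]))

# General categories of codepoints 128-159 and of codepoint 160.
c1_categories = {unicodedata.category(chr(cp)) for cp in range(128, 160)}
nbsp_category = unicodedata.category(chr(160))

print(converted)
print(c1_categories, nbsp_category)
```

In this sketch the MacRoman nbsp byte ends up as byte 160 after conversion. Python's Unicode database reports 128-159 as control characters and 160 as a space separator, which matches the distinction drawn in the reply.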