I think I usually agree with Lex, but this time I must say I do not. C was
breathtakingly good at interop as well, but at some point we saw that it was holding us back and moved on. I feel it's the same with the ASCII set.

I haven't seen much pain in Smalltalk because of being tied to ASCII (though that doesn't mean it isn't there, just that we are good at working around it). But one place that is showing it in a big way is Haskell [1].

I'm not sure what to do here, but there must be something. Maybe do like laptops already do and have a "function" key that causes normal keys to do an alternative function (normally written in blue), e.g. holding down F2 and the L key could make a lambda symbol. When a user presses the "function" key, we could have a little keyboard pop up to show what the symbols are, since they won't be written on the keyboard.

[1] In Haskell one becomes aware of the limitations of using ASCII right away. The compose symbol (normally a circle) is the period. They decided to use the \ symbol for lambdas (!!!) since that was the closest graphic available.

>From: Bert Freudenberg <[hidden email]>
>Reply-To: The general-purpose Squeak developers list<[hidden email]>
>To: The general-purpose Squeak developers list<[hidden email]>
>Subject: Re: Squeak-dev/Squeak-web image v95-2
>Date: Sat, 7 Apr 2007 18:29:25 +0200
>
>On Apr 7, 2007, at 15:45 , Lex Spoon wrote:
>
>>tim Rowledge <[hidden email]> writes:
>>>Make the parser accept the unicode leftarrow as assign; leave the
>>>':=' for backcompat.
>>>Make fileout to text convert leftarrow to := for plausible human
>>>readability and ascii compat.
>>>Make filein convert := to leftarrow for aesthetic compat.
>>>Make a hotkey to insert leftarrow. Remember, the $= key is just a
>>>hotkey that is already set for you.
>>>Let people choose a suitable assign hotkey. Perhaps cmd-=? Perhaps alt-\?
>>>I don't really care at this stage.
>>>Profit.
>>
>>This approach will work. However, we could instead not do all that,
>>and use pure ASCII. ASCII is simple, sufficient, and breathtakingly
>>good at interop.
>
>The American SCII also breathtakingly sucks at expressiveness, even for
>many Americans. If you find its expressiveness sufficient, good for you.
>In most parts of the world it is not.
>
>Typewriters with their limited character set fortunately were only a small
>interlude in the history of typography. I'm glad in Smalltalk we at least
>use proportional fonts. Arbitrary punctuation like := really disturbs in
>reading IMHO. I actually hope we're moving towards higher typographical
>standards (we at least should be able to do what Fortress does for
>example) and not falling back into the stone age of computing.
>
>- Bert - |
In reply to this post by Bert Freudenberg
On 7-Apr-07, at 9:29 AM, Bert Freudenberg wrote:
[snip]
>
> The American SCII also breathtakingly sucks at expressiveness, even
> for many Americans. If you find its expressiveness sufficient, good
> for you. In most parts of the world it is not.

Hear-hear. How am I supposed to put a diaeresis on a 'y' to indicate it is to be pronounced as a separate vowel?

There's a lot of important lexical(sic) symbols that Lex's wish for 'pure ASCII' would prevent us using. Smalltalk-return and Smalltalk-assign are just two that happen to concern us here and now.

tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
Oxymorons: Sweet sorrow |
In reply to this post by Bert Freudenberg
Hi Bert,
> The freetype importer needs to map those characters to a blank glyph.

freetype? I think you meant truetype?
In which case - yes, but how?

Lots of problems with WideString/Unicode/MultiTTCFont will be exposed if people start using my hacked Shout. Which reinforces the (very valid) observation you made earlier about exercising WideString code paths.

Cheers,
Andy |
In reply to this post by J J-6
J J wrote:
> [1] In Haskell one becomes aware of the limitations of using ASCII right
> away. The compose symbol (normally a circle) is the period. They
> decided to use the \ symbol for lambdas (!!!) since that was the closest
> graphic available.

For what it's worth, I think both of those choices work well for Haskell, and in a modern emacs and with a lucky font choice :), (setq haskell-font-lock-symbols 'unicode) makes them look great! |
In reply to this post by timrowledge
tim Rowledge <[hidden email]> writes:
> On 7-Apr-07, at 9:29 AM, Bert Freudenberg wrote:
> > The American SCII also breathtakingly sucks at expressiveness, even
> > for many Americans. If you find its expressiveness sufficient, good
> > for you. In most parts of the world it is not.
> Hear-hear. How am I supposed to put a diaeresis on a 'y' to indicate
> it is to be pronounced as a separate vowel?
>
> There's a lot of important lexical(sic) symbols that Lex's wish for
> 'pure ASCII' would prevent us using. Smalltalk-return and
> Smalltalk-assign are just two that happen to concern us here and now.

I get the argument for going to Unicode, though that is not without its problems. What bugs me is redefining _ and ^ in non-standard ways. It's awkward and obscure.

Lex |
In reply to this post by Andrew Tween
On Apr 7, 2007, at 21:12 , Andrew Tween wrote:

> Hi Bert,
>> The freetype importer needs to map those characters to a blank glyph.
>
> freetype? I think you meant truetype?

Sorry, yes, that's what I meant.

> In which case - yes, but how?

IIRC truetype requires glyph number 0 to be a zero-width blank glyph. So if you assign that in the character-to-glyph map it will not be rendered. I seem to recall there even was a method for doing this remapping, but I haven't looked for it.

- Bert - |
Hi,
----- Original Message -----
From: "Bert Freudenberg" <[hidden email]>
To: "The general-purpose Squeak developers list" <[hidden email]>
Sent: Saturday, April 07, 2007 10:34 PM
Subject: Re: Unicode Arrows in Shout ( was Re: Squeak-dev/Squeak-webimagev95-2)

> On Apr 7, 2007, at 21:12 , Andrew Tween wrote:
>
> > Hi Bert,
> >> The freetype importer needs to map those characters to a blank glyph.
> >
> > freetype? I think you meant truetype?
>
> Sorry, yes, that's what I meant.
>
> > In which case - yes, but how?
>
> IIRC truetype requires glyph number 0 to be a zero-width blank glyph.
> So if you assign that in the character-to-glyph map it will not be
> rendered.

glyph 1 is the "null" glyph.
Having said that, the TTCFontReader is making its own blank glyph for separators. Which is ok. But...

TTCFontReader>>processCharMap: creates two encodings; the first has 256 entries, the second 65536. In the first encoding, it sets the entry for each separator character to the blank glyph. But, in the second encoding, it doesn't.

Modifying it so that separators are set to a blank glyph in both encodings fixes the problem. Fileout is attached.

Cheers,
Andy

Attachment: TTCFontReader-processCharMap.st (1K) |
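The fix Andy describes can be sketched outside Squeak. The Python below is only an illustration under stated assumptions, not the TTCFontReader code: `BLANK_GLYPH`, `SEPARATORS` and `build_char_maps` are hypothetical names, the glyph lookup is a stand-in, and the set of separator codes is assumed. It builds the two tables mentioned in the post (256 and 65536 entries) and maps every separator to the blank glyph in both of them.

```python
# Hypothetical sketch: two character-to-glyph tables, as described in the post.
BLANK_GLYPH = "blank"
# Assumed separator codes: tab, line feed, form feed, carriage return, space.
SEPARATORS = (9, 10, 12, 13, 32)


def build_char_maps(lookup_glyph):
    """Return (narrow, wide) tables with 256 and 65536 entries.

    lookup_glyph is a stand-in for whatever resolves a character code
    to a glyph in the real font reader.
    """
    tables = []
    for size in (256, 65536):
        table = [lookup_glyph(code) for code in range(size)]
        # The point of the fix: blank out separators in BOTH tables,
        # not only in the 256-entry one.
        for code in SEPARATORS:
            table[code] = BLANK_GLYPH
        tables.append(table)
    return tables[0], tables[1]


narrow, wide = build_char_maps(lambda code: f"glyph-{code}")
```

With the separators handled inside the loop, a space looked up through the wide table gets the same blank glyph as one looked up through the narrow table.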
> Modifying it so that separators are set to a blank glyph in both encodings fixes
> the problem. Fileout is attached.

Thanks, works fine. Not sure whose fault it is, but when browsing the senders of setDemoFonts, looking at Preferences class>>fontConfigurationMenu, I get a ByteArray>>errorSubscriptBounds:, index=8593. The byte array in question is CaseInsensitiveMatchOrder.

The call originates from TextMorphForShoutEditor>>againOnce:, where it says

    where ← paragraph text findString: FindText startingAt:
        self stopIndex caseSensitive:
        ((ChangeText ~~ FindText) or: [Preferences caseSensitiveFinds]).

The actual problem seems to be that CaseInsensitiveMatchOrder is only 256 bytes in size.

Regards,
Martin |
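The failure Martin reports is easy to reproduce in miniature. The sketch below is Python, not Squeak code, and `MATCH_ORDER`, `order_of` and `order_of_safe` are hypothetical names. It assumes the table is indexed by character code plus one (Smalltalk collections are 1-based), which would explain the reported index: the left arrow is U+2190, code 8592, so the lookup asks for entry 8593 of a 256-entry table.

```python
# Hypothetical stand-in for a 256-entry case-folding table,
# one byte per character code (ASCII letters fold to lower case).
MATCH_ORDER = bytes(ord(chr(c).lower()) if c < 128 else c for c in range(256))

LEFT_ARROW = "\u2190"  # code point 8592


def order_of(ch):
    """Naive lookup: only valid for character codes below 256."""
    return MATCH_ORDER[ord(ch)]


def order_of_safe(ch):
    """Fall back to the raw code point for characters outside the table."""
    code = ord(ch)
    return MATCH_ORDER[code] if code < len(MATCH_ORDER) else code
```

Calling `order_of(LEFT_ARROW)` raises an `IndexError`, the Python counterpart of the subscript-bounds error; `order_of_safe` is one possible guard, shown only to make the contrast visible, not a proposal for how Squeak should fix it.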
In reply to this post by Andrew Tween
As another follow-up: Cut-n-paste of source text does not work anymore
on Linux. The arrows are stripped when pasting, both between two Squeak windows and when copying from Squeak to, say, this IceDove window I'm typing the mail into (whereas copying the arrow from gucharmap works fine).

Regards,
Martin |
In reply to this post by "Martin v. Löwis"
On Apr 8, 2007, at 10:49 , Martin v. Löwis wrote:
>> Modifying it so that separators are set to a blank glyph in both
>> encodings fixes
>> the problem. Fileout is attached.
>
> Thanks, works fine. Not sure whose fault it is, but when browsing
> the senders of setDemoFonts, looking at Preferences
> class>>fontConfigurationMenu, I get a
> ByteArray>>errorSubscriptBounds:,
> index=8593. The byte array in question is CaseInsensitiveMatchOrder.
>
> The call originates from TextMorphForShoutEditor>>againOnce:,
> where it says
>
> where ← paragraph text findString: FindText startingAt:
> self stopIndex caseSensitive:
> ((ChangeText ~~ FindText) or: [Preferences caseSensitiveFinds]).
>
> The actual problem seems to be that CaseInsensitiveMatchOrder is
> only 256 bytes in size.

Yep. Sounds like my suggestion to get the wide char paths more exercised is valid ;)

- Bert - |
In reply to this post by "Martin v. Löwis"
Hi,
----- Original Message -----
From: "Martin v. Löwis" <[hidden email]>
To: "The general-purpose Squeak developers list" <[hidden email]>
Sent: Sunday, April 08, 2007 9:53 AM
Subject: Re: Unicode Arrows in Shout ( was Re: Squeak-dev/Squeak-webimagev95-2)

> As another follow-up: Cut-n-paste of source text does not work anymore,
> on Linux. The arrows are stripped when pasting (both between two Squeak
> windows, and when copying from Squeak to, say, this IceDove window
> I'm typing the mail into (whereas copying the arrow from gucharmap
> works fine).

I get the same problem on Windows.
Interestingly, cut-paste of a string containing only up-arrows and left-arrows works ok. But not with a longer string containing both arrows and ASCII chars.

Cheers,
Andy |
In reply to this post by Bert Freudenberg
Bert Freudenberg wrote:
> On Apr 8, 2007, at 10:49 , Martin v. Löwis wrote:
>
>>> Modifying it so that separators are set to a blank glyph in both
>>> encodings fixes
>>> the problem. Fileout is attached.
>>
>> Thanks, works fine. Not sure whose fault it is, but when browsing
>> the senders of setDemoFonts, looking at Preferences
>> class>>fontConfigurationMenu, I get a ByteArray>>errorSubscriptBounds:,
>> index=8593. The byte array in question is CaseInsensitiveMatchOrder.
>>
>> The call originates from TextMorphForShoutEditor>>againOnce:,
>> where it says
>>
>> where ← paragraph text findString: FindText startingAt:
>> self stopIndex caseSensitive:
>> ((ChangeText ~~ FindText) or: [Preferences caseSensitiveFinds]).
>>
>> The actual problem seems to be that CaseInsensitiveMatchOrder is
>> only 256 bytes in size.
>
> Yep. Sounds like my suggestion to get the wide char paths more exercised
> is valid ;)
>
> - Bert -

One way to correct bugs is indeed to release the Unicode arrow change, let bugs express themselves randomly and correct them as they appear. Because this kind of bug is so central, harvesting should come fast.

The longer way is to first write tests for each method (preferably tests with wide characters in a WideString, not just asciiString asWideString).

The former approach will leave uncorrected bugs in place, and we should then be prepared for heavy traffic of complaints and angry mails on the lists from users losing their changes.

But don't forget to use Mantis, you certainly experienced one of these
http://bugs.squeak.org/view.php?id=6367
http://bugs.squeak.org/view.php?id=6366
http://bugs.squeak.org/view.php?id=5331
http://bugs.squeak.org/view.php?id=3574

Some corrections are underway in 3.10. But not all known bugs are corrected yet. And I am quite sure we can find more problems, because a lot of code was optimized for single-byte strings and is wrongly inherited by WideString.

Quite sure some bugs are also encountered and maybe cured in other forks (Squeakland, Sophie, OLPC?). That's a weak point of forks.

We also have to rewrite some plugins or be prepared for some loss of efficiency.

That said, I would encourage going Unicode. And Smalltalk is a framework in which we can do this well and easily. Try to program such a shift in C++ for example... Problems will appear when interfacing with the outside world, but with clever conversions and wrappers we can deal.

Nicolas |
> But don't forget to use Mantis, you certainly experienced one of these
> http://bugs.squeak.org/view.php?id=3574

Indeed, this is the one I found. The other one (cut-n-paste does not work) apparently hasn't been reported.

> And Smalltalk is a framework in which we can do this well and easily.
> Try to program such a shift in C++ for example...

Well, the Win32 API (pure C) has handled Unicode just fine for more than 10 years now. Many applications have been ported during that time, although you still find applications that break when confronted with characters outside the current locale. Many C++ APIs have been moved to Unicode also, e.g. the Mozilla XPCOM infrastructure started out in single-byte mode, but is now Unicode throughout.

This is a case where static typing helps: if you change the APIs (which is essentially what has been done in both cases), you will get a compiler error (so people have to change a lot of code), but once you are through with these changes, there will be fewer surprises at run-time.

Regards,
Martin |
In reply to this post by Simon Michael
Well, they may be workable. But it is 2007, sooner or later one should be
able to write mathematical code in mathematical symbols. How much longer is paper going to be much more flexible than computers for writing things down? :)

>From: Simon Michael <[hidden email]>
>Reply-To: The general-purpose Squeak developers list<[hidden email]>
>To: [hidden email]
>Subject: Re: Squeak-dev/Squeak-web image v95-2
>Date: Sat, 07 Apr 2007 13:05:49 -0700
>
>J J wrote:
>>[1] In Haskell one becomes aware of the limitations of using ASCII right
>>away. The compose symbol (normally a circle) is the period. They decided
>>to use the \ symbol for lambdas (!!!) since that was the closest graphic
>>available.
>
>For what it's worth, I think both of those choices work well for haskell,
>and in a modern emacs and with a lucky font choice :), (setq
>haskell-font-lock-symbols 'unicode) makes them look great! |
In reply to this post by Lex Spoon-3
>From: Lex Spoon <[hidden email]>
>Reply-To: The general-purpose Squeak developers list<[hidden email]>
>To: [hidden email]
>Subject: Re: Squeak-dev/Squeak-web image v95-2
>Date: 07 Apr 2007 16:13:43 -0400
>
>I get the argument for going to Unicode, though that is not without
>its problems. What bugs me is redefining _ and ^ in non-standard
>ways. It's awkward and obscure.

Who wants to do that? I think _ should mean _. If we want an arrow for assignment and there is one in Unicode then we should provide a way to input it and use that. And we could provide a preference to use only ASCII for people who want to, but the _ assignments should go away imho. |
On 9-Apr-07, at 2:51 AM, J J wrote:

>> From: Lex Spoon <[hidden email]>
>> Reply-To: The general-purpose Squeak developers list<squeak-[hidden email]>
>> To: [hidden email]
>> Subject: Re: Squeak-dev/Squeak-web image v95-2
>> Date: 07 Apr 2007 16:13:43 -0400
>>
>> I get the argument for going to Unicode, though that is not without
>> its problems. What bugs me is redefining _ and ^ in non-standard
>> ways. It's awkward and obscure.
>
> Who wants to do that? I think _ should mean _. If we want an
> arrow for assignment and there is one in Unicode then we should
> provide a way to input it and use that. And we could provide a
> preference to use only ASCII for people who want to, but the _
> assignments should go away imho.

Exactly; I certainly didn't intend to suggest "redefining _ and ^ in non-standard ways" and I don't think I saw anyone else do so. We have the option of using proper characters that are outside the narrow scope of mere ASCII to improve our system. I posit that we should so do.

tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
Useful random insult:- Ought to have a warning label on his forehead. |
In reply to this post by J J-6
Fortress is using Unicode, so that it can look more like mathematics. IIRC, they have an ASCII encoding that can be used for input or portability, but the language is designed to be typed with multiple fonts and a wide range of glyphs.
On 9 Apr 2007, at 2:49, J J wrote:
Andrew P. Black Department of Computer Science Portland State University +1 503 725 2411 |
In reply to this post by timrowledge
tim Rowledge <[hidden email]> writes:
> Exactly; I certainly didn't intend to suggest "redefining _ and ^ in
> non-standard
> ways" and I don't think I saw anyone else do so. We have the option
> of using proper characters that are outside the narrow scope of mere
> ASCII to improve our system. I posit that we should so do.

As you know, this is the status quo in Squeak. _ and ^ are left-arrow and up-arrow in many Squeak fonts. Changing these back to the normal characters is IMHO an improvement for Squeak interop, and an improvement in general.

You did propose a collection of rewrites that should happen as code comes into and out of the system. This is part of what I meant by treating characters in non-standard ways.

Whether or not we switch to Unicode, translating on file-in/file-out is problematic. Just consider cut and paste with Squeak workspaces. Sometimes workspaces hold code and sometimes they do not, so you cannot know whether or not to apply the rewrite. It is far nicer if you can treat the code as normal text, and thus copy/paste it as is.

Even if we do switch to Unicode, there is a case for sticking with := and ^ instead of using Unicode arrow characters. := and ^ are easy to type, and they will work fine in every tool that processes text. You don't have to get into rewrites and pretty printers; you can just use the text as it is. That degree of interop is powerful for our users.

Lex |
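Lex's objection to rewriting on file-in and file-out can be made concrete. The Python sketch below is an illustration only: `file_out` and `file_in` are hypothetical helpers that perform the naive textual substitution from tim's list, not anything that exists in Squeak. It shows that the round trip is harmless for real code but changes text that was never an assignment, such as a prose note containing `:=`.

```python
LEFT_ARROW = "\u2190"


def file_out(text):
    """Naive export: replace every left arrow with ':='."""
    return text.replace(LEFT_ARROW, ":=")


def file_in(text):
    """Naive import: replace every ':=' with a left arrow."""
    return text.replace(":=", LEFT_ARROW)


# A line of code survives the round trip unchanged...
code = "x \u2190 3"
# ...but a workspace note that merely mentions ':=' does not.
note = "Pascal also writes assignment as :="
```

Here `file_in(file_out(code))` gives back the code as written, while `file_in(note)` silently alters the prose. That is the workspace case: without knowing whether a piece of text is code, a blanket rewrite cannot be applied safely.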