I struggled to use Japanese on Seaside recently.
The problem is not only about accented characters (Unicode). The underlying cause is that Seaside lacks a fundamental facility: 'charset'. Charset handling is very important, especially in Asia, where many sites in practice use various local charsets rather than Unicode.

Umezawa-san and I made a patch for internationalization of Seaside2/Squeak. It fixes the problems caused by charset encodings (including Unicode) on Seaside2.6b1. It makes WAKom (NOT WAKomEncoded) handle charset, XHTML lang, encoded filenames and encoded URLs.

http://squeaksource.blueplane.jp/Seaside2I18N/
(This project is hosted on a *Japanese* SqueakSource, but you can load it into your image without any changes.)

I tested it by hand on SqueakLand3.8-05 and Squeak3.9b-7048 with UTF8 and several Japanese charsets. It worked well in every encoding I tested.

Character encoding is a basic facility of a web server, so I think the patch should be merged into the main Seaside2 package. (Seaside2I18N is branched from Seaside2.6b1-lr.52.) Please take a look.

Koji

On Mon, 24 Jul 2006 10:48:04 -0700
Avi Bryant <[hidden email]> wrote:

> On Jul 24, 2006, at 7:45 AM, Damien Cassou wrote:
>
> > How is it possible that nobody cares about accented characters
> > within Seaside ?
>
> Speaking for myself: I certainly care about them, but I use Squeak
> 3.7. The UTF-8 support from 3.8 mostly tends to complicate the issue
> (of course, accented characters don't show up correctly in inspectors
> etc, but that's a price I'm willing to pay).
>
> I'm surprised things are now broken in 3.8/3.9, however -
> WAKomEncoded *used* to work, didn't it?
>
> Avi

--
! Koji Yokokawa <[hidden email]>
http://yengawa.com/
^self new!
_______________________________________________
Seaside mailing list
[hidden email]
http://lists.squeakfoundation.org/cgi-bin/mailman/listinfo/seaside
Koji Yokokawa wrote:
> I struggled to use Japanese on Seaside recently.
>
> The problem is not only about accented characters (Unicode). The cause
> of that is a lack of the fundamental facility, 'charset', in Seaside.
> Charset is very important especially in Asia. Many Asian sites use
> various local charsets, not Unicode, in reality.
>
> Umezawa-san and I made a patch for internationalization of
> Seaside2/Squeak. This patch fixes the problems caused by charset
> encodings (including Unicode) on Seaside2.6b1. It makes WAKom (NOT
> WAKomEncoded) handle Charset, XHTML lang, encoded filenames and encoded URL.
>
> http://squeaksource.blueplane.jp/Seaside2I18N/
> (This is a project on the SqueakSource in *Japanese*, however you can
> load it into your image without any changes.)
>
> I tested it on SqueakLand3.8-05 and Squeak3.9b-7048 with UTF8 and
> several Japanese charsets by hand. It works well in any encoding.
>
> The character encoding is a basic facility of a web server so, I think,
> the patch should be merged into the main package of Seaside2.
> (Seaside2I18N is branched from Seaside2.6b1-lr.52.)
> Please, take a look.

Hi,

thank you for this. Why don't you put it on squeaksource.com? People will be able to review it.

Thank you very much

Bye
In reply to this post by Koji Yokokawa
2006/7/27, Koji Yokokawa <[hidden email]>:
> I struggled to use Japanese on Seaside recently.
>
> The problem is not only about accented characters (Unicode). The cause
> of that is a lack of the fundamental facility, 'charset', in Seaside.
> Charset is very important especially in Asia. Many Asian sites use
> various local charsets, not Unicode, in reality.
>
> Umezawa-san and I made a patch for internationalization of
> Seaside2/Squeak. This patch fixes the problems caused by charset
> encodings (including Unicode) on Seaside2.6b1. It makes WAKom (NOT
> WAKomEncoded) handle Charset, XHTML lang, encoded filenames and encoded URL.

I think this should be done by WAKomEncoded instead of WAKom. WAKom is supposed to do no conversion at all and thus effectively deals with byte arrays rather than strings.

As said before, for some people it's perfectly OK to have raw UTF-8 (or whatever encoding) strings in the image. Others even want it that way.

> http://squeaksource.blueplane.jp/Seaside2I18N/
> (This is a project on the SqueakSource in *Japanese*, however you can
> load it into your image without any changes.)

The problem is that this is not at all portable. It will only work on Squeak 3.9 with Kom. No other Squeak, no other Smalltalk, no other HTTP server.

Philippe
Hi,
On Thu, 27 Jul 2006 16:58:34 +0200
"Philippe Marschall" <[hidden email]> wrote:

> I think this should be done by WAKomEncoded instead of WAKom. WAKom is
> supposed to do no conversion at all and thus effectively deals with
> byte arrays rather than strings.
>
> Like said before, for some people it's perfectly ok to have raw utf-8
> (or whatever encoding) strings in the image. Others even want it that
> way.

I don't think so. The encoding depends on the application (the session, to be exact), not on the server. Therefore I added the 'charset' value as a property of an application. As a result, the changes are scattered over the system. (You can check the changed methods with Monticello's 'Merge' button in your Seaside image.)

> The problem is that this is not at all portable. It will only work on
> Squeak 3.9 with Kom. No other Squeak, no other Smalltalk, no other
> http server.

You're right. I have no experience porting Seaside to other environments. Is there someone who could teach me the rules or idioms for making code portable in Seaside?

Koji

--
! Koji Yokokawa <[hidden email]>
http://yengawa.com/
^self new!
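To make Koji's point about a per-application charset concrete, here is a minimal sketch. It is written in Python rather than Smalltalk and is not taken from the Seaside2I18N patch; the `Application` class and the `decode_request` / `encode_response` helpers are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Application:
    """Hypothetical stand-in for a Seaside application with a charset property."""
    name: str
    charset: str = "utf-8"


def decode_request(app: Application, raw: bytes) -> str:
    """Decode incoming bytes using the charset configured for this application."""
    return raw.decode(app.charset)


def encode_response(app: Application, text: str) -> bytes:
    """Encode outgoing text using the charset configured for this application."""
    return text.encode(app.charset)


# Two applications on the same server, each with its own charset.
legacy_app = Application("legacy", charset="shift_jis")
modern_app = Application("modern")  # defaults to UTF-8

text = "日本語"
legacy_bytes = encode_response(legacy_app, text)
modern_bytes = encode_response(modern_app, text)
```

The same text produces different bytes for the two applications, which is why a single server-wide setting would not be enough if both had to be served from one image.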
In reply to this post by Damien Cassou-3
Hi,
On Thu, 27 Jul 2006 16:45:34 +0200
Damien Cassou <[hidden email]> wrote:

> thank you for this. Why don't you put it on squeaksource.com ? People
> will be able to review it.

I started it only for the Japanese community, but I agree with you now. I'll put it on squeaksource.com when I have time.

Koji

--
! Koji Yokokawa <[hidden email]>
http://yengawa.com/
^self new!
In reply to this post by Koji Yokokawa
2006/7/27, Koji Yokokawa <[hidden email]>:
> Hi,
>
> On Thu, 27 Jul 2006 16:58:34 +0200
> "Philippe Marschall" <[hidden email]> wrote:
>
> > I think this should be done by WAKomEncoded instead of WAKom. WAKom is
> > supposed to do no conversion at all and thus effectively deals with
> > byte arrays rather than strings.
> >
> > Like said before, for some people it's perfectly ok to have raw utf-8
> > (or whatever encoding) strings in the image. Others even want it that
> > way.
>
> I don't think so.
> The encoding depends on the application (the session to be exact), not
> on the server. Therefor I added the 'charset' value as a property of an
> application. Then the changes are scattered over the system. (check the
> changed methods by the Monticello's 'Merge' button in your Seaside image.)

I think we are talking about different things. What I meant is the following. Suppose you have an application that uses UTF-8 (or whatever encoding) both externally and in the backend for the database. The application never needs to query the size of strings in number of characters and never directly indexes into the strings. You now have two options: either convert the strings that come into the image (from the database or the web) to WideStrings, only to convert them back to the original encoding when they go out of the image (to the database or the web), or do no conversion at all. Sometimes the latter really is a valid option.

> > The problem is that this is not at all portable. I will only work on
> > Squeak 3.9 with Kom. No other Squeak, not other Smalltalk, no other
> > http server.
>
> You're right.
> I don't have knowledge of porting Seaside to other environment. Is there
> some one teach me rules or idioms to make the code portable in Seaside?

Michel was our expert here, but it looks like Boris has taken over, so they are probably better qualified. Some rules I learned:

1. Don't send #asString, send #displayString instead (exception: WAUrl).
2. Move platform-specific stuff to SeasidePlatformSupport.

Now, in your special case, what about the following contract: we don't do any conversion (character encoding or decoding) in Seaside. We do it in the server adapters. This should make porting easier since they are platform-specific anyway. This way we get rid of all the TextConverters in Seaside (I don't think they are anywhere near portable). In cases where we absolutely have to convert (probably WAUrl), we move it to SeasidePlatformSupport.

Move the Kom-specific stuff to Kom. We probably have to do a Kom for 3.9.

Let's keep it the way that WAKom does not do any de/encoding, and do it instead in WAKomEncoded.

And ask Michel, Boris and Avi what they think about it.

Philippe
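The two options Philippe describes can be sketched as follows. This is an illustration in Python, not Smalltalk, and the function names are hypothetical; nothing here is Seaside or Kom API. It only shows why skipping conversion is safe when the application never counts or indexes characters, and where the two approaches start to differ.

```python
# Illustration only: Python stands in for Smalltalk, and the function
# names below are hypothetical, not part of Seaside or Kom.

def handle_with_conversion(raw: bytes, charset: str = "utf-8") -> bytes:
    """Option 1: decode on the way in, re-encode on the way out."""
    text = raw.decode(charset)  # roughly analogous to building a WideString
    # Application logic would work on characters here.
    return text.encode(charset)


def handle_without_conversion(raw: bytes) -> bytes:
    """Option 2: treat the payload as opaque bytes and pass it through."""
    return raw


payload = "日本語".encode("utf-8")

# When the application only stores and echoes the data, both options
# produce identical bytes.
same_output = handle_with_conversion(payload) == handle_without_conversion(payload)

# They differ as soon as the application counts or indexes characters.
char_count = len(payload.decode("utf-8"))  # number of characters
byte_count = len(payload)                  # number of bytes in UTF-8
```

Under the proposed contract, a function like `handle_with_conversion` would live in the server adapter (the WAKomEncoded side), while the framework itself would only ever see whatever the adapter hands it.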