Changes in method AidaSite>>properString

Jaroslav Havlín
Hello all,

I recently ran into UTF-8 encoding problems in Aida 6 beta on
Smalltalk/X.
Honza Vrany found that the cause may be the method properString:
in class AidaSite, which looks like this:

properString: aString
        "if two byte string, convert it to one byte, cut twobyte characters, make them $? "
        | stream |
        aString class == ByteString ifTrue: [^aString].
        stream := WriteStream on: String new.
        aString do: [:char |
                stream nextPut: (char asInteger < 256 ifTrue: [char] ifFalse: [$?])].
        ^stream contents

Honza changed it so that it encodes the string to UTF-8 when needed:

properString: aString
        "answer 8-bit strings unchanged, encode wider strings to UTF-8"
        aString bitsPerCharacter == 8 ifTrue: [^aString].
        ^aString utf8Encoded

We find that method a little tricky, and we are not sure whether the
change is safe
or whether it will cause problems when the method is called in some
special contexts.

Do you think our change is OK? Is that method still needed now that
UTF-8 is becoming the common encoding in Aida?

Kind regards,
 Jarda Havlin
_______________________________________________
Aida mailing list
[hidden email]
http://lists.aidaweb.si/mailman/listinfo/aida

Re: Changes in method AidaSite>>properString

Janko Mivšek
Hello Jaroslav,

The #properString: method was introduced in Aida on VW, probably because
of the same problem you have, so now is the time to find the cause.

Namely, even though rendering WebElements should always return one-byte
strings only (texts are converted to UTF-8), they sometimes still
return a TwoByteString. I didn't have time to find out where, so I
implemented #properString: instead. But we must get rid of that
method ASAP, because it prevents direct streaming of the response
and therefore slows down response time.

It would be wonderful if you could find the problem yourselves,
probably by inserting some watchdog code to see where an element
first emits a TwoByteString, because that turns the whole
result into a TwoByteString.
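
The watchdog idea could look roughly like the sketch below. It is only an illustration: the selector watch:from: and the place it would be called from are hypothetical, not existing Aida code. It scans the characters instead of testing the string class, so it does not depend on whether the dialect calls the wide class TwoByteString, WideString or Unicode16String.

```smalltalk
watch: aString from: anElement
        "Hypothetical debugging helper, not part of Aida.
         Report the element whose rendered output contains a character
         that does not fit in one byte, then answer the string unchanged."
        | wide |
        wide := aString detect: [:char | char asInteger > 255] ifNone: [nil].
        wide isNil ifFalse: [
                Transcript
                        show: 'Wide character emitted by ';
                        show: anElement class printString;
                        cr].
        ^aString
```

Wrapping the result of each element's rendering in such a call should print the class of the offending element to the Transcript. The scan costs one pass over every string, so it is meant as a temporary debugging aid only.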

PS: VW uses TwoByteString for Unicode text and Squeak has WideString.
What is the equivalent in ST/X?

Best regards
Janko

--
Janko Mivšek
Svetovalec za informatiko
Eranova d.o.o.
Ljubljana, Slovenija
www.eranova.si
tel:  01 514 22 55
faks: 01 514 22 56
gsm: 031 674 565

Re: Changes in method AidaSite>>properString

Janko Mivšek
In reply to this post by Jaroslav Havlín
> properString: aString
>          | stream |
>        aString bitsPerCharacter == 8 ifTrue: [^aString].
>        ^aString utf8Encoded
>
>
> We find that method a little tricky, and we are not sure whether the
> change is safe
> or whether it will cause problems when the method is called in some
> special contexts.
>
> Do you think our change is OK? Is that method still needed now that
> UTF-8 is becoming the common encoding in Aida?

This solution is a good extension of my hack, but let's try to get rid of
that method once and for all because, as I said, it just hides the real
problem, which lies somewhere else, and it prevents direct streaming of
web pages.

Best regards
Janko


--
Janko Mivšek
AIDA/Web
Smalltalk Web Application Server
http://www.aidaweb.si

Re: Changes in method AidaSite>>properString

Jan Vrany-2
In reply to this post by Janko Mivšek
Hi Janko,

On Wed, 2009-04-01 at 20:00 +0200, Janko Mivšek wrote:

> Hello Jaroslav,
>
> The #properString: method was introduced in Aida on VW, probably because
> of the same problem you have, so now is the time to find the cause.
>
> Namely, even though rendering WebElements should always return one-byte
> strings only (texts are converted to UTF-8), they sometimes still
> return a TwoByteString. I didn't have time to find out where, so I
> implemented #properString: instead. But we must get rid of that
> method ASAP, because it prevents direct streaming of the response
> and therefore slows down response time.

Sure, that would be nice. Go on!
>
> It would be wonderful if you could find the problem yourselves,
> probably by inserting some watchdog code to see where an element
> first emits a TwoByteString, because that turns the whole
> result into a TwoByteString.

Well, the reason is pretty clear to me, but hard to explain :-)
First of all, Jaroslav's application basically renders text
containing national characters (Czech ones :-) to the output.

In order to make all the national characters work seamlessly in
Aida on St/X, we are using the CharacterWriteStream class, which is
basically a write stream on a single-byte string. Once a character
that does not fit in a single byte (i.e. a character whose code point
is > 255) is written to a CharacterWriteStream, the underlying
single-byte string is converted to a multi-byte string.
CharacterWriteStream saves us from fiddling with string recoding
every time something is written to the response stream. The programmer
can #nextPutAll: any string, no matter whether it is a single-byte or
multi-byte string, and no matter whether it is UTF8, UTF16 or
even Chinese in GB/Big5 encoding. That's why we are using
CharacterWriteStream in Aida on St/X.
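
As a rough workspace illustration of the promotion described above (a sketch for St/X only, assuming CharacterWriteStream is created with on: like other write streams and that Character value: accepts code points above 255):

```smalltalk
| stream |
"Start on an ordinary single-byte string"
stream := CharacterWriteStream on: String new.
stream nextPutAll: 'abc'.
Transcript show: stream contents bitsPerCharacter printString; cr.

"Write a character that does not fit in one byte: r with caron, U+0159"
stream nextPut: (Character value: 16r159).
Transcript show: stream contents bitsPerCharacter printString; cr
```

If the stream behaves as described, the second line printed should show a wider character size than the first. That wider result is what later reaches #properString:.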

>
> PS: VW uses TwoByteString for Unicode text and Squeak has WideString.
> What is the equivalent in ST/X?

Well, in St/X there is a bunch of string-like classes:

CharacterArray
  String
    Symbol
  TwoByteString
    Unicode16String
    GBEncodedString
    BIG5EncodedString
    JISEncodedString
    KSCEncodedString
  FourByteString
    Unicode32String




Re: Changes in method AidaSite>>properString

Jan Vrany-2
In reply to this post by Janko Mivšek
Okay, so what about the following solution:

Let's define an output encoding (probably somewhere in
AidaSite). Then let the output stream transparently
encode all data to that encoding.

This is pretty easy to implement in St/X and probably
on VW too, since there is something like EncodingWriteStream.
Cheers, Jan



Re: Changes in method AidaSite>>properString

Janko Mivšek
Jan Vrany wrote:
> Okay, so what about the following solution:
>
> Let's define an output encoding (probably somewhere in
> AidaSite). Then let the output stream transparently
> encode all data to that encoding.
>
> This is pretty easy to implement in St/X and probably
> on VW too, since there is something like EncodingWriteStream.

The AIDASite class already has methods for this; see the codepage conversion methods:

        convert: aString fromCodepage: aSymbol
        convert: aString toCodepage: aSymbol
        convertFromWeb: aString on: aSession
        convertToWeb: aString on: aSession

If you implement the first two methods, at least for UTF-8, Aida should
convert properly by itself. Then you just write:

        e addText: anInternalUnicodeString

and properly UTF-8 formatted text will be sent to the browser and converted back on the way in.
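
A minimal sketch of what the first two methods might look like on St/X, handling UTF-8 only. The codepage symbol #utf8 is a placeholder, since the thread does not show which symbols Aida actually passes. utf8Encoded is the St/X selector used earlier in this thread, and utf8Decoded is assumed to be its counterpart.

```smalltalk
convert: aString toCodepage: aSymbol
        "Sketch: encode the internal Unicode string for the browser.
         #utf8 is a placeholder for the symbol Aida really uses."
        aSymbol == #utf8 ifTrue: [^aString utf8Encoded].
        ^aString

convert: aString fromCodepage: aSymbol
        "Sketch: decode text received from the browser back to the
         internal representation. Assumes utf8Decoded exists in St/X."
        aSymbol == #utf8 ifTrue: [^aString utf8Decoded].
        ^aString
```

Any other codepage falls through and is answered unchanged, which keeps the sketch short but would need real conversions before use with non-UTF-8 sites.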

Hope this helps
Janko
