3.9 and encoding in multipart fields

3.9 and encoding in multipart fields

NorbertHartl
Hi,

I am using Squeak 3.9 and Seaside 2.7, with WAKomEncoded39 to handle
character encoding. I can't remember for sure, but I thought everything
worked like a charm. Now I notice that the encoding is sometimes wrong
when using forms. The strange thing is that the encoding is right if the
form isn't multipart; on multipart forms it is broken.

Has anybody experienced the same thing? Any hints?

Thanks,

Norbert

_______________________________________________
Seaside mailing list
[hidden email]
http://lists.squeakfoundation.org/cgi-bin/mailman/listinfo/seaside

Re: 3.9 and encoding in multipart fields

Damien Cassou-3
Hi Norbert,

Lukas told me to always use WAKom. It handles properly what you call
the "encoding stuff" :-) See
http://www.lukas-renggli.ch/blog/studenckifestwal for an example.

The only problems are:
- you won't have a meaningful result if you send #size to your strings.
- you won't be able to read the data through Squeak inspectors.
- you can't use string literals containing accented characters
directly in your Squeak code.
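
To illustrate the first point: with plain WAKom the strings stay as raw UTF-8 bytes, so #size counts bytes rather than characters. A minimal workspace sketch, assuming a Squeak 3.8+ image where String>>convertToEncoding: is available (the counterpart of the #convertFromEncoding: used later in this thread); the variable names are only illustrative.

```smalltalk
| internal wire |
"One character, u-umlaut (code point 252), built without a non-ASCII literal."
internal := String with: (Character value: 252).
"The same text as the UTF-8 bytes a browser would send."
wire := internal convertToEncoding: #utf8.
internal size.   "1 character"
wire size.       "2 bytes, 16rC3 16rBC"
```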

The rest should just work.

--
Damien Cassou

Re: 3.9 and encoding in multipart fields

NorbertHartl
On Thu, 2007-04-26 at 16:48 +0200, Damien Cassou wrote:
> Hi Norbert,
>
> Lukas told me to always use WAKom. It handles properly what you call
> the "encoding stuff" :-) See
> http://www.lukas-renggli.ch/blog/studenckifestwal for an example.
>
> The only problems are:
> - you won't have a meaningful result if you send #size to your strings.
Ok, I see. No problem.
> - you won't be able to read the data through Squeak inspectors.
No problem ... can workaround this.
> - you can't use string literals containing accented characters
That is a problem. I'm German and I need to use labels containing
umlauts. What is the way to generate these characters? Are the strings
you get from WAKom encoded differently from the internal encoding Squeak uses?

> directly in your Squeak code.
>
> The rest should just work.

But it works so far.

Norbert


Re: 3.9 and encoding in multipart fields

Damien Cassou-3
2007/4/26, Norbert Hartl <[hidden email]>:
> On Thu, 2007-04-26 at 16:48 +0200, Damien Cassou wrote:
> > - you can't use string literals containing accented characters
> That is a problem. I'm german and I need to use labels containing
> Umlauts. What is the way to generate this character. Are strings
> you get from WAKom different to the internal encoding squeak uses.

What you get from WAKom is whatever the remote user's web browser
sent. I think it's always UTF-8 because Seaside generates UTF-8, but
I'm not sure.

About string literals, Lukas told me he uses a #squeakToUtf method,
but I don't know where to find it.

--
Damien Cassou

Re: 3.9 and encoding in multipart fields

John Thornborrow
Damien Cassou wrote:

> 2007/4/26, Norbert Hartl <[hidden email]>:
>> On Thu, 2007-04-26 at 16:48 +0200, Damien Cassou wrote:
>> > - you can't use string literals containing accented characters
>> That is a problem. I'm german and I need to use labels containing
>> Umlauts. What is the way to generate this character. Are strings
>> you get from WAKom different to the internal encoding squeak uses.
>
> What you get from WAKom is what you get from the distant
> user/webbrowser. I think it's always utf-8 because Seaside generates
> utf-8 but I'm not sure.
>
> About string literals, Lukas told me he uses  a #squeakToUtf method,
> but I don't know where to find it.
>
var _ 'thisisastring' squeakToIso isoToUtf

is the best I can find.



Re: 3.9 and encoding in multipart fields

NorbertHartl
In reply to this post by Damien Cassou-3
On Thu, 2007-04-26 at 17:18 +0200, Damien Cassou wrote:

> 2007/4/26, Norbert Hartl <[hidden email]>:
> > On Thu, 2007-04-26 at 16:48 +0200, Damien Cassou wrote:
> > > - you can't use string literals containing accented characters
> > That is a problem. I'm german and I need to use labels containing
> > Umlauts. What is the way to generate this character. Are strings
> > you get from WAKom different to the internal encoding squeak uses.
>
> What you get from WAKom is what you get from the distant
> user/webbrowser. I think it's always utf-8 because Seaside generates
> utf-8 but I'm not sure.
>
> About string literals, Lukas told me he uses  a #squeakToUtf method,
> but I don't know where to find it.

Hmmm, it is still not working. I have two forms. One has only text fields
(multipart false) and one has text fields and an input type="file"
(multipart true). I store the values in the database using Glorp.
I have two modifications to convert strings: one to UTF-8 when storing
in the database and one from UTF-8 when reading from the
database.

The multipart=false case stores the string properly in the
database. The multipart=true case does not. Using WAKomEncoded39, the
non-multipart case is displayed correctly. With WAKom, the multipart case
is displayed correctly.

Examining the Kom requests, I can see that with multipart=false the strings
display correctly in the inspector. With multipart=true the
strings look strange (no conversion to the Squeak encoding???).

I don't know much about multipart. Can you tell from a single part of
a multipart request which type it is (binary or text)? I
would like to do a conversion if it is text. Do you have any hint where
in WAKom this conversion should occur?

thanks,

Norbert



Re: 3.9 and encoding in multipart fields

Avi Bryant-2
In reply to this post by John Thornborrow
On 4/26/07, John Thornborrow <[hidden email]> wrote:

> var _ 'thisisastring' squeakToIso isoToUtf
>
> is the best I can find.

Here's what I use on Squeak 3.7:

Integer>>asUTF8
        ^ self <= 16r7F
                ifTrue: [self asCharacter asString]
                ifFalse:
                        [self <= 16r7FF
                                ifTrue:
                                        [String
                                                with: (16rC0 bitOr: (self bitShift: -6)) asCharacter
                                                with: (16r80 bitOr: (self bitAnd: (16r3F))) asCharacter]
                                ifFalse:
                                        [String
                                                with: (16rE0 bitOr: (self bitShift: -12)) asCharacter
                                                with: (16r80 bitOr: ((self bitAnd: 16rFC0) bitShift: -6)) asCharacter
                                                with: (16r80 bitOr: (self bitAnd: 16r3F)) asCharacter]]


Then you can do things like "247 asUTF8" (which is a division symbol,
for example).
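
A quick way to check the result in a workspace, assuming the Integer>>asUTF8 extension above has been filed in (it is not part of a stock image). The byte values in the comments follow from the bit arithmetic in the method; note that it handles code points only up to 16rFFFF, since there is no four-byte branch.

```smalltalk
| division euro |
division := 247 asUTF8.    "U+00F7, division sign: two bytes"
euro := 8364 asUTF8.       "U+20AC, euro sign: three bytes"
division asByteArray.      "bytes 16rC3 16rB7"
euro asByteArray.          "bytes 16rE2 16r82 16rAC"
```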

Avi

Re: 3.9 and encoding in multipart fields

NorbertHartl
In reply to this post by NorbertHartl
Ok, now I tested my theory. I added some code to WAKom to convert
multipart fields to an internal representation.

processMultipartFields: aRequest
        "Store file uploads as WAFile; decode plain text fields from UTF-8."
        aRequest multipartFormFieldsDo:
                [:chunk | | stream |
                chunk explore. "debugging aid: opens an explorer on every chunk"
                stream := WriteStream on: String new.
                chunk saveToStream: stream.
                chunk fileName isEmptyOrNil
                        ifFalse:
                                ["A file name is present, so this part is an upload: keep the raw bytes."
                                aRequest postFields
                                        at: chunk fieldName
                                        put: (WAFile new
                                                fileName: chunk fileName;
                                                contents: stream contents;
                                                contentType: chunk contentType;
                                                yourself)]
                        ifTrue:
                                ["New: no file name, so treat the part as text and decode it."
                                aRequest postFields
                                        at: chunk fieldName
                                        put: (stream contents convertFromEncoding: #utf8)]]

That works for me now. It is slower than any other approach, because
it does all of the following:

- getting UTF-8 from the client (is that always true, or how does WAKom
  decide?)
- converting multipart and non-multipart fields from UTF-8 to Squeak
- converting the string to UTF-8 when storing in the database, with the
  PostgreSQL client encoding set to UTF-8
....
- converting from UTF-8 when reading from the database
- using WAKomEncoded39, which encodes to UTF-8 before sending to the
  client.

That is a lot of encoding, but relatively clean (I think). I can use umlauts
in Squeak code (in the Squeak encoding).
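
The steps listed above amount to decoding once on the way in and encoding once on the way out. A minimal workspace sketch of that round trip, assuming a Squeak 3.8+ image with String>>convertFromEncoding: and String>>convertToEncoding:; the variable names are hypothetical and no real request or database is involved.

```smalltalk
| received internal stored |
"Bytes as they arrive from the browser: UTF-8 for a-umlaut (16rC3 16rA4)."
received := String
        with: (Character value: 16rC3)
        with: (Character value: 16rA4).
"Decode once at the edge, so the image works with real characters."
internal := received convertFromEncoding: #utf8.
"Encode again when handing the string to a UTF-8 database connection."
stored := internal convertToEncoding: #utf8.
internal size.       "1: a single character inside the image"
stored = received.   "true: the same two bytes go back out"
```

With the string decoded, #size, comparisons and inspectors all see one character instead of two bytes.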

Does anybody have any doubt about this approach?

Norbert


Re: 3.9 and encoding in multipart fields

NorbertHartl
In reply to this post by Avi Bryant-2
On Thu, 2007-04-26 at 10:29 -0700, Avi Bryant wrote:

> On 4/26/07, John Thornborrow <[hidden email]> wrote:
>
> > var _ 'thisisastring' squeakToIso isoToUtf
> >
> > is the best I can find.
>
> Here's what I use on Squeak 3.7:

I don't know how encoding is done in 3.7. I think 3.7 and 3.8 are
similar, as the big changes happened in 3.9 (as far as I know). What
is the internal representation of strings after WAKom processing?

Norbert


Re: 3.9 and encoding in multipart fields

Avi Bryant-2
On 4/26/07, Norbert Hartl <[hidden email]> wrote:

> I don't know how encoding is done in 3.7. I think 3.7 and 3.8 are
> similar as the big changes happened in 3.9 (as far as know).

No, the big String changes happened in 3.8.  In 3.7, the internal
representation is always as single byte strings.

Avi

Re: 3.9 and encoding in multipart fields

Philippe Marschall
In reply to this post by Damien Cassou-3
2007/4/26, Damien Cassou <[hidden email]>:
> Hi Norbert,
>
> Lukas told me to always use WAKom. It handles properly what you call
> the "encoding stuff" :-) See

No, it doesn't at all. It does exactly nothing and leaves everything
to you. And it does that only on Squeak <= 3.8. On Squeak 3.9 it's
broken.

Cheers
Philippe

> http://www.lukas-renggli.ch/blog/studenckifestwal for an example.

Re: 3.9 and encoding in multipart fields

Philippe Marschall
In reply to this post by NorbertHartl
2007/4/26, Norbert Hartl <[hidden email]>:

> Ok, now I tested my theory. I added some code to WAKom to convert
> multipart fields to an internal representation.
>
> processMultipartFields: aRequest
>         aRequest multipartFormFieldsDo:
>                 [:chunk |
>                 chunk explore.
>                 chunk fileName isEmptyOrNil ifFalse:
>                         [|stream file|
>                         stream := WriteStream on: String new.
>                         chunk saveToStream: stream.
>                         file := WAFile new
>                                 fileName: chunk fileName;
>                                 contents: stream contents;
>                                 contentType: chunk contentType;
>                                 yourself.
>                         aRequest postFields at: chunk fieldName put: file]
>                 ifTrue: [
>                         |stream|
>                         stream := WriteStream on: String new.
>                         chunk saveToStream: stream.
>
>                         aRequest postFields at: chunk fieldName put:( (stream contents)
> convertFromEncoding: #utf8)]
>                 ].

Do I see correctly that the #ifTrue: part is new?

> That works for me now. It is slower than any other approach while
> doing:
>
> - getting UTF-8 from the client (is that always true or how does WAKom
>   decide?)

WAKom does _nothing_: whatever the client sends you, in whatever
encoding, is what you get.

> - converting multipart and none multipart fields from utf-8 to squeak
> - converting the string to utf-8 when storing in the database of the
>   PostgresQL client encoding utf-8

Your job. Seaside does not know or care about databases.

> ....
> - converting from utf-8 when reading from the database

See above.

> - Using WAKomEncoded39 which encodes to utf-8 before sending to the
>   client.
>
> It is much encoding but relatively clean (I think). I can use Umlauts
> in squeak code (being squeak encoding).

Yes, in Squeak >= 3.8 and WAKomEncoded(39)

Cheers
Philippe


Re: 3.9 and encoding in multipart fields

NorbertHartl
On Thu, 2007-04-26 at 23:17 +0200, Philippe Marschall wrote:

> 2007/4/26, Norbert Hartl <[hidden email]>:
> Do I see correctly that the #ifTrue: part ist new?
>
yes.
> > That works for me now. It is slower than any other approach while
> > doing:
> >
> > - getting UTF-8 from the client (is that always true or how does WAKom
> >   decide?)
>
> WAKom does _nothing_ whatever the clients sends to you in whatever
> encoding will be what you get.
>
Ok, that isn't very good. So I have to figure out the cases where I
don't get utf-8 and cope with that.

> > - converting multipart and none multipart fields from utf-8 to squeak
> > - converting the string to utf-8 when storing in the database of the
> >   PostgresQL client encoding utf-8
>
> Your job. Seaside does not know or care about databases.
>
Seaside doesn't but squeak should :)

> > ....
> > - converting from utf-8 when reading from the database
>
> See above.
>
> > - Using WAKomEncoded39 which encodes to utf-8 before sending to the
> >   client.
> >
> > It is much encoding but relatively clean (I think). I can use Umlauts
> > in squeak code (being squeak encoding).
>
> Yes, in Squeak >= 3.8 and WAKomEncoded(39)

That is what I'm using.

Norbert


Re: 3.9 and encoding in multipart fields

Philippe Marschall
2007/4/26, Norbert Hartl <[hidden email]>:

> On Thu, 2007-04-26 at 23:17 +0200, Philippe Marschall wrote:
> > 2007/4/26, Norbert Hartl <[hidden email]>:
> > Do I see correctly that the #ifTrue: part ist new?
> >
> yes.
> > > That works for me now. It is slower than any other approach while
> > > doing:
> > >
> > > - getting UTF-8 from the client (is that always true or how does WAKom
> > >   decide?)
> >
> > WAKom does _nothing_ whatever the clients sends to you in whatever
> > encoding will be what you get.
> >
> Ok, that isn't very good. So I have to figure out the cases where I
> don't get utf-8 and cope with that.

Then use the WAKomEncoded* series. They will deliver WideStrings to you
and expect WideStrings from you (and use and expect UTF-8 externally;
you could make subclasses that do something else, it's just that
nobody ever needed anything besides UTF-8).

BTW, although some people actually say Seaside is the worst documented
piece of Smalltalk code they know, all this stuff is covered in class
comments. Which just proves that writing comments is pointless because
nobody reads them anyway ;)

Cheers
Philippe


Re: 3.9 and encoding in multipart fields

NorbertHartl
On Thu, 2007-04-26 at 23:54 +0200, Philippe Marschall wrote:

> 2007/4/26, Norbert Hartl <[hidden email]>:
> > On Thu, 2007-04-26 at 23:17 +0200, Philippe Marschall wrote:
> > > 2007/4/26, Norbert Hartl <[hidden email]>:
> > > WAKom does _nothing_ whatever the clients sends to you in whatever
> > > encoding will be what you get.
> > >
> > Ok, that isn't very good. So I have to figure out the cases where I
> > don't get utf-8 and cope with that.
>
> Then use the WAKomEncoded* series, they will deliver you WideStrings
> and expect WideStrings for you (and use and expect utf8 externally,
> you could make subclasses that do something else, it's just that
> nobody ever needed anything besides utf8).
>
I can see WAKomEncoded39 is converting to UTF-8 before sending the
response. WAKomEncoded does some conversion I don't understand. Under
which circumstances is a field a kind of OrderedCollection?
Using WAKom with 3.9 means you can easily lose control over strings.
As long as there is no conversion, every part has to assume it is dealing
with the same character encoding, and I doubt this is always easy to
accomplish. Without conversion there is also no internal
representation of the corresponding character, so string comparison,
regex and the like are rendered unusable. Or do I misunderstand this
completely?

> BTW although some people actually say Seaside is the worst documented
> piece of Smalltalk code they know all this stuff is covered in class
> comments. Which just proves that writing comments is pointless because
> nobody reads them anyway ;)
>

One point is whether there is documentation; another is finding it ;)
Which documentation do you mean?

Norbert




Re: 3.9 and encoding in multipart fields

Philippe Marschall
2007/4/27, Norbert Hartl <[hidden email]>:

> On Thu, 2007-04-26 at 23:54 +0200, Philippe Marschall wrote:
> > 2007/4/26, Norbert Hartl <[hidden email]>:
> > > On Thu, 2007-04-26 at 23:17 +0200, Philippe Marschall wrote:
> > > > 2007/4/26, Norbert Hartl <[hidden email]>:
> > > > > .
> > > > > > Hmmm, it is still not working. I have two forms. One is only text fields
> > > > > > (multipart false) and one is text fields and input type="file"
> > > > > > (multipart true). I store the values in the database using Glorp.
> > > > > > I have two modifications to convert strings. One to utf8 when storing
> > > > > > in the database and one to convert from utf8 when reading from the
> > > > > > database.
> > > > > >
> > > > > > The multipart=false case leads to a proper store of the string in the
> > > > > > > database. The multipart=true case does not. Using WAKomEncoded39 the
> > > > > > > non-multipart case is displayed correctly. With WAKom the multipart
> > > > > > > case is displayed correctly.
> > > > > > >
> > > > > > > Examining the kom requests, I can see that with multipart=false the
> > > > > > > strings are displayed correctly in the inspector. With multipart=true
> > > > > > > the strings look strange (no conversion to the squeak encoding???)
> > > > > > >
> > > > > > > I don't know much about the multipart stuff. Can the single parts of
> > > > > > > a multipart request be recognized by type (binary or text)? I
> > > > > > > would like to do a conversion if it is text. Do you have any hint where
> > > > > > > in WAKom this encoding should occur?
> > > > > >
> > > > > Ok, now I tested my theory. I added some code to WAKom to convert
> > > > > multipart fields to an internal representation.
> > > > >
> > > > > processMultipartFields: aRequest
> > > > >         aRequest multipartFormFieldsDo:
> > > > >                 [:chunk |
> > > > >                 chunk explore.
> > > > >                 chunk fileName isEmptyOrNil ifFalse:
> > > > >                         [|stream file|
> > > > >                         stream := WriteStream on: String new.
> > > > >                         chunk saveToStream: stream.
> > > > >                         file := WAFile new
> > > > >                                 fileName: chunk fileName;
> > > > >                                 contents: stream contents;
> > > > >                                 contentType: chunk contentType;
> > > > >                                 yourself.
> > > > >                         aRequest postFields at: chunk fieldName put: file]
> > > > >                 ifTrue: [
> > > > >                         |stream|
> > > > >                         stream := WriteStream on: String new.
> > > > >                         chunk saveToStream: stream.
> > > > >
> > > > >                         aRequest postFields at: chunk fieldName put:( (stream contents)
> > > > > convertFromEncoding: #utf8)]
> > > > >                 ].
> > > >
> > > > Do I see correctly that the #ifTrue: part is new?
> > > >
> > > yes.
> > > > > That works for me now. It is slower than any other approach while
> > > > > doing:
> > > > >
> > > > > - getting UTF-8 from the client (is that always true or how does WAKom
> > > > >   decide?)
> > > >
> > > > WAKom does _nothing_; whatever the client sends to you, in whatever
> > > > encoding, will be what you get.
> > > >
> > > Ok, that isn't very good. So I have to figure out the cases where I
> > > don't get utf-8 and cope with that.
> >
> > Then use the WAKomEncoded* series, they will deliver you WideStrings
> > and expect WideStrings from you (and use and expect utf8 externally;
> > you could make subclasses that do something else, it's just that
> > nobody ever needed anything besides utf8).
> >
> I can see WAKomEncoded39 is converting to utf-8 before sending the
> response. WAKomEncoded does some conversion I don't understand. Under
> which circumstances is a field a kind of OrderedCollection?

Pff, you ask me stuff (yes, comments would help) ;)
I know this method has my author initials but that just means I was
the last one who touched it and it was probably a formatting or
categorization issue.

> Using WAKom with 3.9 means you can easily lose control over strings.

And you will have internal server errors.

> As long as there is no conversion, every part has to assume it deals
> with the same character encoding. But I doubt this is always easy to
> accomplish. As long as no conversion occurs there is no internal
> representation of the corresponding character. So string comparison,
> regex and the like are rendered unusable. Or do I misunderstand this
> completely?

Yes, also substrings are broken, and #size, and checking for letters
and capitalization and whatever I forgot.
And you won't be able to use literals with non-ascii characters directly.
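
A minimal workspace sketch of why #size goes wrong on unconverted strings. It assumes String>>convertToEncoding: is available as the counterpart of the #convertFromEncoding: used earlier in this thread, and it simulates the undecoded bytes that WAKom passes through instead of taking them from a real request:

```smalltalk
"Sketch only. Assumption: String>>convertToEncoding: exists in the image,
 mirroring the #convertFromEncoding: call shown earlier in the thread."
| text raw decoded |

"One character: e with acute accent (code point 233)."
text := String with: (Character value: 233).

"Simulate what arrives undecoded: one string element per UTF-8 byte."
raw := text convertToEncoding: #utf8.

"Decode back to the internal representation, as WAKomEncoded* would."
decoded := raw convertFromEncoding: #utf8.

Transcript
	show: 'characters in text: ', text size printString; cr;
	show: 'elements in raw:    ', raw size printString; cr;
	show: 'round trip equal:   ', (decoded = text) printString; cr.
```

Because the accented character takes more than one byte in UTF-8, the raw string holds more elements than the text has characters, which is also why substrings, comparison and case checks misbehave until the string is decoded.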

On a side note, who knows the difference between mysql_escape_string
and mysql_real_escape_string? ;)

> > BTW although some people actually say Seaside is the worst documented
> > piece of Smalltalk code they know all this stuff is covered in class
> > comments. Which just proves that writing comments is pointless because
> > nobody reads them anyway ;)
> >
>
> One point is whether there is documentation, another is how to find it ;)
> Which documentation do you mean?

Class comment in WAKomEncoded. No I was just joking. ;)

Cheers
Philippe

> Norbert