Hi folks,
I have an encoding issue when saving an HTML page. The page has some strings with accented letters such as è or é. I save the page by retrieving the stream with:

    html context document stream contents

but every accented character is wrongly encoded. Can you help me?

Thanks
Dave
|
On 17-10-15 10:27, Dave wrote:
> html context document stream

Inspect the stream to see what kind of encoder/codec/converter is used.

Stephan
_______________________________________________
seaside mailing list
[hidden email]
http://lists.squeakfoundation.org/cgi-bin/mailman/listinfo/seaside
|
Hi Stephan
The stream is:

'<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"><html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"><head><title>Title</title><meta http-equiv="Content-Type" content="text/html;charset=utf-8"/><meta http-equiv="Content-Script-Type" content="text/javascript"/><meta name="viewport" content="width=device-width, initial-scale=1.0"/></head><body onload="onLoad()"><br/><h4 style="margin-left:56px"> 17/10/2015</h4><div style="display: block; overflow: visible; font-family: Monaco, Consolas, monospace; font-size: 14px; line-height: 1.5; white-space: pre-wrap; margin-left:56px; margin-right:36px; padding:10px; border-top-style:solid; border-width:1px;">perché</div><br/><br/>'

You can see the last line contains "perché" instead of "perché".

Thanks
Dave
|
On 17/10/15 14:32, Dave wrote:
> Hi Stephan
>
> The stream is:
>
> You can see the last line contains "perché" instead of "perché"

So that's fine then. That's the UTF-8 representation.

Stephan
|
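Editor's note: the point that "perché" is simply the UTF-8 form of "perché" can be checked in a playground. The following is a minimal sketch, assuming a Pharo image where Zinc's `utf8Encoded` and `utf8Decoded` extensions are loaded; it is not taken from the thread.

```smalltalk
"Sketch only: assumes Pharo with Zinc's utf8Encoded / utf8Decoded extensions."
| text bytes misread |
text := 'perché'.
bytes := text utf8Encoded.      "a ByteArray; é becomes the two bytes 195 and 169"
misread := bytes asString.      "each byte read as one character, which displays as perché"
bytes utf8Decoded = text.       "decoding the same bytes as UTF-8 gives back the original text"
```

Inspecting `misread` should show the same garbled form Dave sees, which suggests the stream contents are correctly encoded bytes that are merely being displayed one byte per character.
|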
But if you save the text of my previous email to an html file and open it, it shows "perché" instead of "perché"
:-( Dave
|
On 17/10/15 15:46, Dave wrote:
> But if you save the text of my previous email to an html file and open it, it
> shows "perché" instead of "perché"

That depends on how you save it. You probably want to use a UTF8TextConverter somewhere. It is used in MultiByteBinaryOrTextStream class>>on:encoding:

Stephan
|
You might want to look at [1] for a related discussion about this issue.
Instead of getting the stream contents from your context and saving that, try the following:

    | fullDocument |
    fullDocument := WAHtmlCanvas builder
        fullDocument: true;
        rootBlock: [ :root | root meta contentType: (WAMimeType textHtml charset: 'utf-8') ];
        render: [ :html | html text: 'Ñuñoa' ].
    '/tmp/test.html' asFileReference writeStreamDo: [ :out | out << fullDocument ].

If you really want to create a text file, how are you saving and viewing that text file once you get the contents from a Seaside document?

Johan
|
In reply to this post by Stephan Eggermont-3
Hi Stephan,
I didn't find a hint in MultiByteBinaryOrTextStream, but I found help in an earlier thread of mine (from time to time I have some encoding issue :-) ):
http://forum.world.st/File-upload-encoding-issue-tp4783446p4783615.html

Sven's reply there helped me again. I decoded the document this way:

    (GRCodec forEncoding: 'utf-8') decode: html context document stream contents.

Anyway, if there is a better solution I'll gladly try it.

Dave
|
In reply to this post by Johan Brichau-2
Hi Johan,
With [1] I guess you are linking to this:
http://forum.world.st/Accent-in-generated-pages-tp4832873p4832910.html

I tried replacing

    html text: 'Ñuñoa'

with

    html html: contextFromMyDocument

and saving it as you suggest, but it doesn't work: the text contains perché.

I then tried something different, which Sven suggested some time ago:
http://forum.world.st/File-upload-encoding-issue-tp4783446p4783615.html

i.e.

    (GRCodec forEncoding: 'utf-8') decode: html context document stream contents

and it works, but if you can explain how to make your example work I'll be glad.

Thanks
Dave
|
(GRCodec forEncoding: 'utf8') decode: 'perché'

So, how are you opening the text file? And how are you saving it to disk? Because if you save it as bytes to a file, it will be correctly UTF-8 encoded.

Johan
|
Hi Johan,
You are right, if I save the file as binary everything is fine:

    htmlFilename := 'test.html'.
    stream := FileStream forceNewFileNamed: htmlFilename.
    stream binary.
    [ stream nextPutAll: html context document stream contents ]
        ensure: [ stream close ].

When I tried your snippet it didn't work because it didn't save as binary (or at least I couldn't convert it to binary). Look here:

    fullDocument := WAHtmlCanvas builder
        fullDocument: true;
        rootBlock: [ :root | root meta contentType: (WAMimeType textHtml charset: 'utf-8') ];
        render: [ :r | r html: html context document stream contents ].
    '/tmp/test.html' asFileReference writeStreamDo: [ :out | (out << fullDocument) ].

Dave
|
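Editor's note: one possible reading of the last snippet is that `writeStreamDo:` opens a text stream, so contents that are already UTF-8 encoded get encoded a second time on the way to disk. The sketch below writes the same contents through a binary stream instead. It assumes a newer Pharo where `FileReference` understands `binaryWriteStreamDo:` and `ensureDelete`, and that the document stream contents hold one encoded byte per character; `html` is the canvas from Dave's own code. None of this was confirmed in the thread.

```smalltalk
"Sketch only: binaryWriteStreamDo: and ensureDelete are assumed to be available (newer Pharo)."
| encoded |
encoded := html context document stream contents.   "already UTF-8 encoded, one byte per character"
'/tmp/test.html' asFileReference
	ensureDelete;                                    "start from an empty file rather than overwriting in place"
	binaryWriteStreamDo: [ :out | out nextPutAll: encoded asByteArray ].
```

This should be equivalent in effect to the `FileStream ... binary` version above, just expressed with the file reference API used in Johan's example.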