HTML character set encoding/decoding

Ian Upright-2
I've spent a little time digging around and didn't quickly turn up an
answer, so I thought someone here would know.
If I've got an HTML-encoded string like ' &lt;&gt;&amp; ', what VW
frameworks or goodies will decode this into a regular string?  What
about encoding in the same format? e.g. I'm looking for sample workspace
code to convert back and forth.

Thanks, Ian

Re: HTML character set encoding/decoding

Alan Knight-2
I don't know of anything that would decode it nicely. The XML parser might, at least for some of the characters, because that seems to be how it writes things into the raw source code when they contain those characters. Something that renders HTML (TwoFlower, WithStyle, etc.) might also be a possibility.
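
For reference, the round trip being asked about can be pinned down with a short sketch. This is Python's standard library, not VisualWorks, and is only meant to show the expected behavior of an encoder/decoder pair; it says nothing about which VW framework provides the equivalent.

```python
import html

# The decoded ("regular") string from the question.
raw = ' <>& '

# Encode the markup-significant characters as named entities.
# quote=False leaves ' and " alone, so only &, < and > are rewritten.
encoded = html.escape(raw, quote=False)

# Decode named and numeric character references back to plain text.
decoded = html.unescape(encoded)

# The round trip should give back the original string.
assert decoded == raw
```

Here `encoded` is `' &lt;&gt;&amp; '`. `html.unescape` also handles numeric references such as `&#1234;`, which a decoder for this format would need to do as well.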

The Web Toolkit encodes such things, although it doesn't use the nice human-readable forms: it just generates &#1234;, with the Unicode code point of the character in question, if the character doesn't fit in the encoding specified for the page. That's done in EscapingStreamErrorPolicy in Wave-Server-Base.
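
The behavior described above (emit a numeric reference only when the character doesn't fit the page encoding) can be sketched as follows. This is a Python analogy, not the EscapingStreamErrorPolicy code; the helper name `escape_for_encoding` is made up for illustration, and `xmlcharrefreplace` is Python's built-in error handler for this case.

```python
def escape_for_encoding(text: str, encoding: str) -> str:
    """Replace characters the target encoding cannot represent
    with decimal numeric character references (&#NNNN;)."""
    # The error handler fires only for unencodable characters;
    # everything else passes through unchanged.
    data = text.encode(encoding, errors="xmlcharrefreplace")
    return data.decode(encoding)

# U+00E9 fits in Latin-1 and is kept; code point 1234 does not fit.
sample = "caf\u00e9 " + chr(1234)
result = escape_for_encoding(sample, "latin-1")
```

With Latin-1 as the page encoding, `result` is `café &#1234;`. Note that this step alone does not escape `<`, `>` or `&`; those would still need separate handling.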

At 09:26 PM 9/18/2006, Ian Upright wrote:
>I've spent a little time digging around and didn't quickly turn up an
>answer, so I thought someone here would know.
>If I've got an HTML-encoded string like ' &lt;&gt;&amp; ', what VW
>frameworks or goodies will decode this into a regular string?  What
>about encoding in the same format? e.g. I'm looking for sample workspace
>code to convert back and forth.
>
>Thanks, Ian

--
Alan Knight [|], Cincom Smalltalk Development
http://www.cincom.com/smalltalk

"The Static Typing Philosophy: Make it fast. Make it right. Make it run." - Niall Ross