HTML encoding

Andre Schnoor
Is there a stream encoder available anywhere for HTML that converts
umlauts and non-HTML characters to
&auml;  &#94; and such? I couldn't seem to find one. Nor am I sure
whether StreamEncoder is the right place to look at all.

What is the recommended way for rendering arbitrary (internal) strings
to HTML?

Andre


Re: HTML encoding

Mark Roberts
At 05:57 PM 5/19/2007, Andre Schnoor wrote:
>Is there anywhere a stream encoder available for HTML that converts
>umlauts and non-html characters to
>&auml;  &#94; and such? I couldn't seem to find one. Nor am I sure,
>if StreamEncoder is the right place to look for at all.
>
>What is the recommended way for rendering arbitrary (internal)
>strings to HTML?

There is no code in the VisualWorks base image to do this. Note that
#withHTMLEscapes is incomplete.

AFAIK, there are no plans to provide real support for HTML, so you
might have better luck searching around in the public repository.

M


Re: HTML encoding

Boris Popov, DeepCove Labs (SNN)
In reply to this post by Andre Schnoor

Look at #encode: in Seaside. I don't have the image handy to be more specific, sorry.

Cheers!

-Boris
(Sent from a BlackBerry)


Re: HTML encoding

Alexander Lazarevic'
In reply to this post by Andre Schnoor
Hi Andre,

I don't know what the recommended way is for rendering strings to HTML,
but you could utilize XML.Text and EscapingStreamErrorPolicy (which is
in Wave-Server-Base) for your needs.

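string := 'a < b & ö > ä & µ ° §'.
"First pass: canonical XML printing, which should escape the markup characters."
string := (XML.Text text: string) canonicalPrintString.
"Second pass: escape the remaining special characters as numeric references."
string := EscapingStreamErrorPolicy encode: string.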

This does not give you named entities. For that you could extend
EscapingStreamErrorPolicy or use an extended mapping with
SAXCanonicalWriter and XML.Text and forget about EscapingStreamErrorPolicy.

It doesn't look nice, but it is something to play with...

Cheers,
        Alex


Re: HTML encoding

Holger Kleinsorgen-4
In reply to this post by Andre Schnoor
> Is there anywhere a stream encoder available for HTML that converts
> umlauts and non-html characters to
> &auml;  &#94; and such? I couldn't seem to find one. Nor am I sure, if
> StreamEncoder is the right place to look for at all.

For your convenience, I just published the package HkHTMLWriting to the
public repository.

The stream encoder is a bit fishy, because ideally it would wrap the
actual output encoder (e.g. an HTML encoder on an ISO-8859-1 encoder on a
stream). However, that does not seem to be (easily) possible in VW.

Anyway, hth

Holger

---------------
package comment
---------------

A stream encoder and a SAX writer that provide basic HTML entity
encoding. The stream encoder currently does not support decoding. Only
minimal testing was done.

Examples:

| str |
str := (ByteArray new withEncoding: #HkHTML) writeStream.
str nextPutAll: 'Quelque chose s''était cassé dans mon moteur'.
str encodedContents asByteString inspect.

| str writer |
str := String new writeStream.
writer := Hk.HTMLWriter on: str.
writer characters: 'Quelque chose s''était cassé dans mon moteur'.
str contents inspect.


Re: HTML encoding

Alan Knight-2
In reply to this post by Andre Schnoor
The Web Toolkit does this. I believe all of the required code would be in Wave-Server-Base. See EscapingStreamErrorPolicy. It makes no attempt to use "nice" abbreviations like &auml; - if it can't encode a character on a stream, it just uses the unicode code point as the escape. That's been there for years, but apparently too well hidden for the other responders.
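
For illustration, the numeric-escape behaviour described here can be sketched in a workspace. This is a minimal sketch, not the Wave-Server-Base code: the fixed threshold of 127 is an assumption standing in for "whatever the target stream encoding cannot represent", and the temporaries source and out are hypothetical.

```smalltalk
"Hypothetical workspace sketch: write every character above 127 as a
 numeric character reference (&#N;). Not the EscapingStreamErrorPolicy code."
| source out |
source := 'Grüße'.
out := WriteStream on: String new.
source do: [:ch |
    ch asInteger > 127
        ifTrue: [
            "Assumed rule: anything outside ASCII becomes &#<code point>;"
            out nextPutAll: '&#'.
            out nextPutAll: ch asInteger printString.
            out nextPut: $;]
        ifFalse: [out nextPut: ch]].
out contents
```

Evaluated with "print it", this should answer 'Gr&#252;&#223;e'. No named entities are produced, which matches the behaviour described above.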

--
Alan Knight [|], Cincom Smalltalk Development

"The Static Typing Philosophy: Make it fast. Make it right. Make it run." - Niall Ross

Re: HTML encoding

Alexander Lazarevic'
Alan Knight wrote:
> on a stream, it just uses the unicode code point as the escape. That's
> been there for years, but apparently too well hidden for the other
> responders.

Maybe my posts to vwnc are not coming thru? :-}

Alex


Re: HTML encoding

Alan Knight-2
At 09:09 AM 5/20/2007, Alexander Lazarevic' wrote:
> Alan Knight wrote:
> > on a stream, it just uses the unicode code point as the escape. That's
> > been there for years, but apparently too well hidden for the other
> > responders.
>
> Maybe my posts to vwnc are not coming thru? :-}
>
> Alex

Oops. I missed your response. That's what I get for responding to the list before the coffee's ready.


Re: HTML encoding

Andre Schnoor
In reply to this post by Boris Popov, DeepCove Labs (SNN)
Thank you all for the replies so far.

I helped myself with a simple ad-hoc solution (lookup dictionary and
replacement), implemented as String>>asHTML. Yeah, it's ugly, but it does
the job. I needed to do so because I didn't want <, > and & to be
encoded, just umlauts and some special characters.

Andre
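
The code for String>>asHTML was not posted. A lookup-and-replace approach of this kind might look roughly like the workspace sketch below. The entity table and the temporaries (entities, source, out) are assumptions made for illustration, not Andre's actual implementation.

```smalltalk
"Hypothetical sketch of a dictionary-based encoder. Only the characters
 listed in the table are replaced by named entities."
| entities source out |
entities := Dictionary new.
entities at: (Character value: 228) put: '&auml;'.   "ä"
entities at: (Character value: 246) put: '&ouml;'.   "ö"
entities at: (Character value: 252) put: '&uuml;'.   "ü"
entities at: (Character value: 223) put: '&szlig;'.  "ß"
source := '<b>Grüße</b>'.
out := WriteStream on: String new.
source do: [:ch |
    "Characters without an entry, including <, > and &, pass through unchanged."
    out nextPutAll: (entities at: ch ifAbsent: [String with: ch])].
out contents
```

Evaluated with "print it", this should answer '<b>Gr&uuml;&szlig;e</b>': the markup is left alone and only the umlauts and ß are replaced. The same loop could be wrapped in a String extension method with the receiver in place of source.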