How to escape special characters in string literals?

Thomas Gagné-2
I'm ashamed I don't know this!  How can I express the characters /, cr,
lf as a string literal?  I need to do it inside #().  I remember
something about a /r or \r for a carriage return but can't find the
reference now...

--
Visit <http://tggagne.blogspot.com/>,<http://gagne.homedns.org/> or
      <http://gagne.homedns.org/~tgagne/> for more great reading.

RE: How to escape special characters in string literals?

Steven Kelly
#('
') works for me. You can do the same with lf, e.g. pasting into the string body after this:
ParagraphEditor currentSelection: (String with: Character lf)

If you want character literals, that works too:
#($
)

If you try to read the code for such a method in from XML, you lose the difference between LF and CR: both become CR :-(.

I don't know of a \r trick, unless you mean this:
'\' withCRs
which returns the same as (String with: Character cr), but of course isn't much use if you need a literal.

HTH,
Steve

> -----Original Message-----
> From: Thomas Gagné [mailto:[hidden email]]
> Sent: 31 January 2007 19:51
> To: vwnc
> Subject: How to escape special characters in string literals?
>
> I'm ashamed I don't know this!  How can I express the characters /, cr,
> lf as a string literal?  I need to do it inside #().  I remember
> something about a /r or \r for a carriage return but can't find the
> reference now...
>
> --
> Visit <http://tggagne.blogspot.com/>,<http://gagne.homedns.org/> or
>       <http://gagne.homedns.org/~tgagne/> for more great reading.

Re: How to escape special characters in string literals?

Alan Knight-2
In reply to this post by Thomas Gagné-2
I don't think Smalltalk has much in the way of special escaping characters at all. For forward slash, just put it in the string
  #('/')
Similarly for carriage returns
 #('
').
Looks a bit funny, but works. Smalltalk internally uses cr as the line delimiter, so lf is a bit trickier. But if you do

(String with: Character lf) printString, you'll get something that looks the same as for cr, but has an LF in it instead.
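Because the two printed strings look the same, a quick workspace check of the character codes tells them apart. This is a minimal sketch; the temporary names are made up for the illustration, and the comments show what each expression should answer when evaluated on its own.

```smalltalk
| crString lfString |
crString := String with: Character cr.
lfString := String with: Character lf.
crString first asInteger.    "13, carriage return"
lfString first asInteger.    "10, line feed"
crString = lfString.         "false, although both print alike"
```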

--
Alan Knight, Cincom Smalltalk Development

"The Static Typing Philosophy: Make it fast. Make it right. Make it run." - Niall Ross

Re: How to escape special characters in string literals?

Eliot Miranda-2
In reply to this post by Steven Kelly


On 1/31/07, Steven Kelly <[hidden email]> wrote:
> #('
> ') works for me. You can do the same with lf, e.g. pasting into the string body after this:
> ParagraphEditor currentSelection: (String with: Character lf)

it works until... you file-out on e.g. Windows (which turns it into a CR-LF pair) or on Unix (which turns it into an LF), and file it in somewhere else.  The only safe ways are '\' withCRs and (better) '<n>' expandMacros (see StringParameterSubstitution; there's no default escape for lf).
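A small workspace sketch of the two "safe" forms mentioned here, assuming VisualWorks, where withCRs replaces each backslash with a cr and the <n> macro expands to a cr. The sample strings and temporary names are invented for the illustration. Since there is no default escape for lf, the last line builds it explicitly.

```smalltalk
| viaWithCRs viaMacro withLf |
"Backslash is replaced by Character cr."
viaWithCRs := 'first\second' withCRs.
"<n> expands to a new line, which is Character cr in VisualWorks."
viaMacro := 'first<n>second' expandMacros.
"Should answer true if both forms produce the same cr."
viaWithCRs = viaMacro.
"No macro for lf, so concatenate it explicitly."
withLf := 'first', (String with: Character lf), 'second'.
```

All three are message sends evaluated at run time, so none of them can appear directly inside a #() literal array.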

This is definitely a big Achilles' heel for Smalltalk.  Stuff it needs to do but doesn't:

printing in fixed widths with leading and trailing padding
printing with parameterised widths (printf's %*s)
multiple floating-point formats (with exponent, with specified exponent, without exponent)

I did some code for StringParameterSubstitution that did some of this before I left (to do Isaac Gouy's Computer Language Shootout benchmarks more neatly).  I don't know the status of this.  You're welcome to the change set if you're interested.  Let me know.

Dave Leibs did a much more radical and interesting thing requiring compiler modifications.  I can dig this out too but it'll require some massaging to get to work.

But the bottom line is that standard Smalltalk has no good answer for this (that I know of).

RE: How to escape special characters in string literals?

Steven Kelly
In reply to this post by Thomas Gagné-2

From: Eliot Miranda [mailto:[hidden email]]

> On 1/31/07, Steven Kelly <[hidden email]> wrote:
> > #('
> > ') works for me. You can do the same with lf, e.g. pasting into the string body after this:
> > ParagraphEditor currentSelection: (String with: Character lf)
>
> it works until... you file-out on e.g. Windows (which turns it into a CR-LF pair) or on Unix (which turns it into an LF), and file it in somewhere else.

If you mean good old chunk fileout, that's true but somewhat unfair: if you move a text file to a different platform, you're responsible for line end conversion etc. My experience is that if you did the appropriate conversion this worked OK with chunk format in VW3, although later changes to "source" encoding may have messed it up.

If you mean XML, VW maintains the difference between CR and LF as simple characters when saving. When reading back in, the problem is XML itself, which demands that all line feeds be munged to LF when reading,  thus killing any semantically important CR/LF differences. VW then munges all the LFs to Character cr. I think CR can be smuggled past the normalization as &#13; – CDATA doesn't help, because the normalization of CR or CR+LF to LF takes place before any parsing (assuming VW follows the specs, and I'm reading the specs right).

VW writes out normal line breaks (Character cr) in XML source format as 16r0D. That's maybe not completely illegal, but it's certainly frowned on. It might be better to munge Character cr to 16r0A when writing, to match the post-normalization munging of 16r0A to Character cr when reading. We could then do what seems the only possible trick, ugly though it is, of writing Character lf as &#13;. When read, that would escape normalization, be seen by VW as 16r0D, and VW could munge it to Character lf.
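The write side of that swap could be sketched roughly as follows. This is only an illustration of the idea under the assumptions in the paragraph above, not VW's actual XML source writer; the sample string and temporary names are hypothetical.

```smalltalk
"Hypothetical sketch: cr is written as a raw LF, lf as the character reference &#13;"
| source out |
source := 'one', (String with: Character cr), 'two', (String with: Character lf), 'three'.
out := WriteStream on: String new.
source do: [:ch |
    ch = Character cr
        ifTrue: [out nextPut: Character lf]
        ifFalse: [
            ch = Character lf
                ifTrue: [out nextPutAll: '&#13;']
                ifFalse: [out nextPut: ch]]].
out contents
```

A reader would then apply the inverse mapping after XML line-end normalization: LF becomes Character cr, and the 16r0D produced by the character reference becomes Character lf.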

In other words, since VW's normal line end is cr, and XML's is LF, swap the two around when reading and writing. But that's so ugly I'd have to shoot myself if you did it. Or more rationally and less violently, arm myself with custard pies and go on a justifiable homi-pied spree after ASCII's dumb control characters, XML's dumb "there is one line break and his format is LF", and VW's non-XML XML source format.

> The only safe ways are '\' withCRs and (better) '<n>' expandMacros (see StringParameterSubstitution; there's no default escape for lf).

Thomas said he needed a literal string in a literal array, but I'd agree he should reconsider some non-literal solution.

Steve

Re: How to escape special characters in string literals?

Nicolas Cellier-3
Steven Kelly wrote:

> If you mean XML, VW maintains the difference between CR and LF as simple
> characters when saving. When reading back in, the problem is XML itself,
> which demands that all line feeds be munged to LF when reading,  thus
> killing any semantically important CR/LF differences.

I agree with the clever analysis, except for the word "semantically".
I think you mean important differences of convention.

I would be amazed if anyone still relied on the historical typewriter
semantics: Carriage Return sends the carriage to the beginning of the line,
Line Feed advances the paper one line.

We can reproduce that behaviour on screen just for fun.

Even in the typewriter age, some devices needed a single control character
for both operations, others needed two, or even a longer sequence.

Semantically, CR, CR-LF and LF are equivalent in text files and should
all be interpreted as a historical CR-LF pair.

Does anyone use different semantics?

To import text in Smalltalk, lineEndAuto is almost fine, except it won't
handle mixed conventions correctly (mixed conventions just happen in
text files edited on several platforms, since some text editors are
clever enough to preserve conventions. I have a lot of these myself).

So the universal way to import text into Smalltalk would be to make
our Stream lineEndTransparent and hack our DisplayScanner to handle all
three lineEnd conventions (easy).

Unfortunately, we are so used to cr that replacing each and every upTo:
CR send in our code with an upToEndOfLine, along with other similar uses,
won't be easy...
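As a rough illustration of what such an upToEndOfLine would have to do, here is a workspace sketch that splits a string with mixed conventions into lines. It uses only basic stream messages; the sample text and temporary names are made up, and upToEndOfLine itself remains a hypothetical selector.

```smalltalk
| in lines line ch done |
in := ReadStream on: 'a', (String with: Character cr),
    'b', (String with: Character lf),
    'c', (String with: Character cr with: Character lf), 'd'.
lines := OrderedCollection new.
[in atEnd] whileFalse: [
    line := WriteStream on: String new.
    done := false.
    [done or: [in atEnd]] whileFalse: [
        ch := in next.
        (ch = Character cr or: [ch = Character lf])
            ifTrue: [
                "Treat a cr immediately followed by lf as a single line end."
                (ch = Character cr and: [in atEnd not and: [in peek = Character lf]])
                    ifTrue: [in next].
                done := true]
            ifFalse: [line nextPut: ch]].
    lines add: line contents].
lines    "four lines: a, b, c and d"
```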

On write, it is another problem. You export to a target, so you must use
the convention of that target.
If the CR / LF difference is important, then the difference SHALL be
explicitly visible to other programmers. So the clever tricks SHALL
NOT be used at all, whether XML preserves it or not.

Since String literals are limited (Smalltalk syntax has no escape except
for the quote itself, ''), you had better use an explicit Smalltalk construct.
You can even have it as optimized as a literal, see "When You Come Back"
at http://www.cincomsmalltalk.com/userblogs/travis/blogView
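One possible shape for such an explicit construct, shown as a workspace sketch: build the array with message sends so that every control character is named in the source. The variable name is invented for the illustration, and this does not attempt the literal-like optimization described in the blog post.

```smalltalk
| lineEnds |
"Each line end is spelled out, so cr and lf can never be confused or silently converted."
lineEnds := Array
    with: (String with: Character cr)
    with: (String with: Character lf)
    with: (String with: Character cr with: Character lf).
(lineEnds at: 2) first = Character lf    "true"
```

Unlike a raw line break typed inside #(), this survives file-out and XML round trips, because the source text contains only ordinary printable characters.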

Nicolas

RE: How to escape special characters in string literals?

Dave Stevenson-2
In reply to this post by Thomas Gagné-2
It may not fit your needs, but perhaps you could provide literal byte
arrays instead:

#( #[10] ) collect: [:ea | ea asString]
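Extending that suggestion to all three line ends might look like this. It is a sketch that assumes, as the expression above implies, that asString on a ByteArray answers a String whose characters have the given byte values; the temporary name is invented.

```smalltalk
| strings |
"13 is cr, 10 is lf, and 13 10 is the CR-LF pair."
strings := #( #[13] #[10] #[13 10] ) collect: [:ea | ea asString].
(strings at: 1) first = Character cr.    "expected true under the assumption above"
(strings at: 2) first = Character lf.    "expected true under the assumption above"
(strings at: 3) size                     "expected 2"
```

The byte values are plain digits in the source, so the literal array stays a true literal and is not affected by line-end conversion.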

Dave

