
String replacement


Thomas Worthington-2

String replacement

I've got a string which is the header section of an email. I have a regex
which will split a header field name from its data (i.e., "From:
[hidden email]" becomes "From" and "[hidden email]") but some header lines are
long and have been continued by inserting a newline and one or more
spaces. Before splitting the fields I need to undo these continuations by
deleting these combinations of a newline followed by some whitespace.

This would certainly be trivial in Perl or any of the normal Linux regex
engines but I've spent hours on this today, equipped with the PBE2
chapter, and got nowhere.

How do I do this in Pharo?

Thanks,

Thomas

Igor Stasenko

Re: String replacement

On 27 May 2013 18:42, Thomas Worthington <[hidden email]> wrote:

> How do I do this in Pharo?

"Split the string into lines and strip leading/trailing whitespace from each."
trimBlock := [:string |
    | lines |
    lines := string lines.
    lines collect: #trimmed ].

trimBlock value: 'Header1: fooo
Header2: barrr
Header3: zork'

=>
#('Header1: fooo' 'Header2: barrr' 'Header3: zork')


--
Best regards,
Igor Stasenko.

Sven Van Caekenberghe-2

Re: String replacement

Igor,

I don't think that is what he wants.

Thomas,

You can use the built-in ZnHeaders class from the Zinc HTTP Components library/framework:

ZnHeaders readFrom: (String crlf join: 'Foo:1
Bar: foo-
 bar
Final:true
Foo: another-foo' lines) readStream.

=> a ZnHeaders('Bar'->'foo- bar' 'Final'->'true' 'Foo'->#('1' 'another-foo') )

Note that ZnHeaders>>#readFrom: expects CRLF-delimited lines (like the whole of the internet), while Smalltalk uses CRs, hence the little hack. Your input will probably use CRLF already. Note how ZnHeaders collects identical headers into one entry, while it joins the continued foo- and bar into a single value.

HTH,

Sven

(Replying to Igor Stasenko's message of 27 May 2013, 18:52.)

S Krish

Re: String replacement

In reply to this post by Thomas Worthington-2

* Use the regex support in Pharo too. It should be more than adequate for what you seek.

* Write your own string-parsing code along the lines of what Igor suggests, extending it to handle multi-line header fields.

* Use another library, such as Zinc as Sven suggests. There are similar ones in other packages, though that is a bit contrived and not really what you seek, I presume.
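The first suggestion could be sketched roughly as follows. This is only a sketch, assuming the Regex package that ships with Pharo is loaded (it provides String>>copyWithRegex:matchesReplacedWith:). The sample header text and the variable names are made up for illustration. The control characters are concatenated into the pattern literally, so the sketch does not rely on any particular backslash escapes being supported, and each fold is replaced by a single space rather than deleted outright, which is a choice and not something the original question specifies.

```smalltalk
| raw foldPattern unfolded |

"Hypothetical sample input: a folded Subject header with CR line endings."
raw := 'Subject: a long', String cr, '   subject line', String cr, 'From: someone'.

"One or more CR/LF characters followed by one or more spaces or tabs.
 The characters are inserted literally instead of via escapes."
foldPattern := '[', String cr, String lf, ']+[ ', String tab, ']+'.

"Replace every fold (line break plus leading whitespace) with one space."
unfolded := raw copyWithRegex: foldPattern matchesReplacedWith: ' '.

"Split the unfolded text back into one line per header field."
unfolded lines
```

With this input the continued Subject line should come back joined onto its first line, leaving one line per header field for the existing field-splitting regex to work on. Use an empty replacement string instead of ' ' if the fold should be deleted without leaving a space.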
