petitparser tutorial


Tudor Girba-2
Hi,

I put together a tutorial for PetitParser in The Moose Book:
http://www.themoosebook.org/book/internals/petit-parser/mse

The tutorial is meant to work with a Moose image and it is based on the same scenario as I used at the Deep into Smalltalk School:
http://www.slideshare.net/girba/petitparser-at-the-deep-into-smalltalk-school-2011
http://ci.moosetechnology.org/job/moose-latest-dev/lastSuccessfulBuild/artifact/moose/*zip*/moose.zip


Feedback is always appreciated (including the negative one :)). You can either reply by mail, or leave a comment online.

Cheers,
Doru


--
www.tudorgirba.com

"Presenting is storytelling."



Re: [Moose-dev] petitparser tutorial

Stéphane Ducasse
Thanks a lot, Doru.
I would like to have a chapter for PBE so we can join forces.
I will read it.

Stef




Re: petitparser tutorial

giorgiof
In reply to this post by Tudor Girba-2
Hi, Tudor, 

If I use the search facility on the book, I get the following:

Internal Error

Error: subscript is out of bounds: 8218

ByteArray(Object)>>errorSubscriptBounds:
ByteArray(Object)>>at:
WideString(String)>>findSubstring:in:startingAt:matchTable:
WideString(String)>>findString:startingAt:caseSensitive:
WideString(String)>>includesSubstring:caseSensitive:
[] in [] in PRFullTextSearch>>visitStructure:
[] in Set(Collection)>>anySatisfy:
Set>>do:
Set(Collection)>>anySatisfy:
[] in PRFullTextSearch>>visitStructure:
SortedCollection(OrderedCollection)>>do:
MAPriorityContainer(MAContainer)>>do:
PRFullTextSearch>>visitStructure:
PRFullTextSearch(PRVisitor)>>visitCase:
PRFullTextSearch(PRVisitor)>>visitPublication:
PRFullTextSearch(PRVisitor)>>visitPortion:
BOPortion>>accept:
BOPortion(Object)>>acceptDecorated:
[] in BOPortion(PRDecorated)>>acceptDecorated:
BOPortion(PRDecorated)>>decorationsDo:ownerDo:


Thanks for the document.

giorgio ferraris




Re: [Moose-dev] petitparser tutorial

Lukas Renggli
This is a long-known problem:
http://code.google.com/p/pharo/issues/detail?id=2353
For now it can only be avoided by not using WideString (i.e. by not
using an encoded server adapter).
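
To illustrate why a WideString can trigger this kind of error, here is a minimal sketch in Python. It is not the Pharo code. It assumes that the search consults a 256-entry table indexed by character code plus one (Smalltalk subscripts are 1-based), which would be consistent with the reported subscript 8218 if the text contains U+2019, code point 8217. Both the table size and the offending character are assumptions, and MATCH_TABLE, table_lookup and find_offending are hypothetical names.

```python
# Hypothetical model of a byte-sized match table: one entry per byte value.
MATCH_TABLE = bytes(range(256))  # identity mapping, 256 entries


def table_lookup(ch):
    """Look up a character the way a 1-based, 256-entry table would."""
    index = ord(ch) + 1  # 1-based subscript, mimicking Smalltalk's at:
    if not 1 <= index <= len(MATCH_TABLE):
        raise IndexError("subscript is out of bounds: %d" % index)
    return MATCH_TABLE[index - 1]


def find_offending(text):
    """Return the characters whose code does not fit into the table."""
    return [ch for ch in text if ord(ch) > 255]
```

Under these assumptions, looking up a plain ASCII letter works, while a typographic apostrophe pasted from a word processor falls outside the table.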

Lukas


--
Lukas Renggli
www.lukas-renggli.ch


Re: petitparser tutorial

Tudor Girba-2
In reply to this post by giorgiof
Hi Giorgio,

Thanks for reporting.

Cheers,
Doru



--
www.tudorgirba.com

"What we can governs what we wish."





Re: [Moose-dev] petitparser tutorial

Tudor Girba-2
In reply to this post by Lukas Renggli
Hi Lukas,

I now switched to WAKom instead of WAKomEncoded, but the problem still persists. Any ideas?

Cheers,
Doru


--
www.tudorgirba.com

"What is more important: To be happy, or to make happy?"



Re: [Moose-dev] petitparser tutorial

Dale Henrichs
Doru,

I think the problem is that a WideString has been pasted into the document somewhere. The section with the embedded WideString needs a new non-WideString string pasted in to replace the original WideString.

I don't think that WAKomEncoded will help either. If I'm not mistaken, the WideString is created internally when the UTF-8 is decoded.

Dale



Re: [Moose-dev] petitparser tutorial

Lukas Renggli
In reply to this post by Tudor Girba-2
As Dale writes, changing the encoding of the server now doesn't help.
In fact, it will just mess up your model with strings in different
encodings. To fix it, you need to somehow find this WideString and
replace it with a UTF-8 encoded one; or disable the encoding of the
server and transform all strings into byte arrays held as Strings.
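
The first option can be sketched as follows, in Python rather than Smalltalk and purely as an illustration. The sketch assumes the model is a mapping from page names to texts, that every non-wide text already holds UTF-8 bytes as characters, and that a "wide" text is one containing a character code above 255. The names is_wide, to_utf8_byte_string and fix_pages are hypothetical and do not correspond to anything in Pier or Pharo.

```python
def is_wide(text):
    """True if the text has a character that does not fit into one byte."""
    return any(ord(ch) > 255 for ch in text)


def to_utf8_byte_string(text):
    """Re-express decoded text as a string with one character per UTF-8 byte."""
    return text.encode("utf-8").decode("latin-1")


def fix_pages(pages):
    """Return a copy of the model where only the wide texts are converted."""
    return {
        name: to_utf8_byte_string(text) if is_wide(text) else text
        for name, text in pages.items()
    }
```

For example, fix_pages({"intro": "plain", "mse": "it\u2019s"}) leaves "intro" untouched and turns the apostrophe in "mse" into its three UTF-8 bytes, so no text in the result is wide any more.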

The pain with encodings in Pharo is that whatever you do, you end up
with a system that is broken in some way (see Philippe's presentation
at the last ESUG). If you don't encode, in my experience this causes
fewer problems and is more performant.

Lukas


--
Lukas Renggli
www.lukas-renggli.ch


Re: [Moose-dev] petitparser tutorial

Sven Van Caekenberghe
Lukas,

On 12 Jul 2011, at 15:54, Lukas Renggli wrote:

> The pain with encodings in Pharo is that whatever you do, you end up
> with a system that is broken in some way (see Philippe's presentation
> at the last ESUG). If you don't encode, in my experience this causes
> fewer problems and is more performant.

Of course, not doing encoding/decoding is faster, no question about it.

What do you mean with 'whatever you do, you end up with a broken system' ?
Any concrete examples ?
Do you have a pointer to Philippe's presentation ?

Thx,

Sven



Re: [Moose-dev] petitparser tutorial

Lukas Renggli
In reply to this post by Sven Van Caekenberghe
> What do you mean with 'whatever you do, you end up with a broken system' ?
> Any concrete examples ?

If you do not encode, then you have "unreadable" strings in the image
and operations like #copyFrom:to:, #size, #indexOf:, ... might answer
unexpected or invalid results, because UTF-8 strings are just treated
as byte arrays.

If you do encode, then you have "readable" strings in the image but
you might run into WideString/encoding problems (check the open issues
on the tracker).
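
A small Python sketch of the first case, where UTF-8 data is kept undecoded and treated as bytes. It is only an analogy for the Smalltalk selectors #size, #indexOf: and #copyFrom:to: mentioned above; the sample text and the helper is_valid_utf8 are made up for the illustration.

```python
text = "na\u00efve caf\u00e9"      # "naïve café", decoded: one element per character
raw = text.encode("utf-8")         # not decoded: one element per UTF-8 byte

decoded_size = len(text)           # 10 characters
raw_size = len(raw)                # 12 bytes, each accented letter takes two

decoded_index = text.index("v")    # 3
raw_index = raw.index(b"v")        # 4, shifted by the extra byte of "ï"

decoded_prefix = text[:3]          # "naï"
raw_prefix = raw[:3]               # b"na\xc3", cut in the middle of "ï"


def is_valid_utf8(data):
    """True if the bytes form complete UTF-8 sequences."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False
```

The size and the index differ between the two representations, and the raw prefix is no longer valid UTF-8, which is the kind of unexpected or invalid result described above.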

> Do you have a pointer to Philippe's presentation ?

http://www.slideshare.net/esug/esug-unicode

Lukas

--
Lukas Renggli
www.lukas-renggli.ch


Re: [Moose-dev] petitparser tutorial

Igor Stasenko
On 12 July 2011 18:12, Lukas Renggli <[hidden email]> wrote:

>> What do you mean with 'whatever you do, you end up with a broken system' ?
>> Any concrete examples ?
>
> If you do not encode, then you have "unreadable" strings in the image
> and operations like #copyFrom:to:, #size, #indexOf:, ... might answer
> unexpected or invalid results, because UTF-8 strings are just treated
> as byte arrays.
>
> If you do encode, then you have "readable" strings in the image but
> you might run into widestring/encoding problems (check the open issues
> on the tracker).
>
So, what about adding a String subclass, UTF8String,
so that it can be readable and have the correct encoding? :)

>> Do you have a pointer to Philippe's presentation ?
>
> http://www.slideshare.net/esug/esug-unicode
>
> Lukas
>
> --
> Lukas Renggli
> www.lukas-renggli.ch
>
>



--
Best regards,
Igor Stasenko AKA sig.


Re: [Moose-dev] petitparser tutorial

Sven Van Caekenberghe
In reply to this post by Lukas Renggli

On 12 Jul 2011, at 18:12, Lukas Renggli wrote:

>> What do you mean with 'whatever you do, you end up with a broken system' ?
>> Any concrete examples ?
>
> If you do not encode, then you have "unreadable" strings in the image
> and operations like #copyFrom:to:, #size, #indexOf:, ... might answer
> unexpected or invalid results, because UTF-8 strings are just treated
> as byte arrays.

Not the way to go IMO.

> If you do encode, then you have "readable" strings in the image but
> you might run into widestring/encoding problems (check the open issues
> on the tracker).

I just commented on http://code.google.com/p/pharo/issues/detail?id=2353 which seems to be the most important one.

I can't reproduce this (anymore).

http://code.google.com/p/pharo/issues/detail?id=830 seems specific to the changelog
http://code.google.com/p/pharo/issues/detail?id=2697 is an MC problem

The next 2 I reported myself, they are encoding/decoding problems:

http://code.google.com/p/pharo/issues/detail?id=3360
http://code.google.com/p/pharo/issues/detail?id=4187

>> Do you have a pointer to Philippe's presentation ?
>
> http://www.slideshare.net/esug/esug-unicode

Thx, reading...

Sven



Re: [Moose-dev] petitparser tutorial

Sven Van Caekenberghe
Lukas,

On 12 Jul 2011, at 22:23, Sven Van Caekenberghe wrote:

>>> Do you have a pointer to Philippe's presentation ?
>>
>> http://www.slideshare.net/esug/esug-unicode
>
> Thx, reading...

You probably mean then that proper (locale-aware) collation (sorting) and equality (including normalization) are not implemented on WideString ?
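
For readers unfamiliar with these two terms, here is a short Python illustration using the standard unicodedata module. It says nothing about what WideString does or does not implement; the helper normalized_equal is a hypothetical name.

```python
import unicodedata

composed = "\u00e9"       # "é" as a single code point
decomposed = "e\u0301"    # "e" followed by a combining acute accent

naive_equal = composed == decomposed  # False: the code points differ


def normalized_equal(a, b):
    """Compare two strings after bringing both to the NFC normal form."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# Plain code point order puts "école" after "zebra", because the code of
# "é" (233) is larger than the code of "z" (122). A locale-aware collation
# would be expected to sort it near "e" instead.
code_point_order = sorted(["zebra", "\u00e9cole"])
```

Both strings render the same on screen, yet they only compare equal after normalization.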

Sven



Re: [Moose-dev] petitparser tutorial

Lukas Renggli
I don't remember. Ask Philippe, he is the expert on this kind of question.

Lukas





--
Lukas Renggli
www.lukas-renggli.ch


Re: [Moose-dev] petitparser tutorial

Sven Van Caekenberghe
Any comments, Philippe ?

Anyway, IMHO Unicode and UTF-8 are not broken in Pharo. It is perfectly possible, as you know, to write Seaside web apps or pure web services that deal correctly with Unicode through UTF-8, storing data in databases and/or files, or sending it further along to other services.

On the other hand, I can imagine that not everything that one could wish for is already there, so slowly filling these functionality holes is important as well. But unless we know what they are, we can't begin to make progress.

Sven




Re: [Moose-dev] petitparser tutorial

NorbertHartl

On 13.07.2011, at 11:08, Sven Van Caekenberghe wrote:

> Any comments, Philippe ?
>
> Anyway, IMHO Unicode and UTF-8 are not broken in Pharo. It is perfectly possible, as you know, to write Seaside web apps or pure web services that deal correctly with Unicode through UTF-8, storing data in databases and/or files, or sending it further along to other services.
>
> On the other hand, I can imagine that not everything that one could wish for is already there, so slowly filling these functionality holes is important as well. But unless we know what they are, we can't begin to make progress.

I would like to second this. I can see the temptation not to decode on image entrance and not to encode on image exit. This way strings are treated as being "just in transit", which is true for a lot of strings if you write a web service that pulls data from a database. But this only works if you assume that everything uses the same encoding.
While it has performance benefits, I'm not fond of treating things like this. Encoding just cannot be ignored. To me the only way is to do it right every time. I can't even believe the statement that there is less trouble when not decoding. Averaged over the use cases, the mess you create by, e.g., accidentally accessing magnitude information from those strings should be on par with the en/decoding troubles.
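
What goes wrong when the "same encoding" assumption does not hold can be shown with a few lines of Python. This is an illustration only: the sample word is made up, and the two sides stand for any two components that exchange bytes without agreeing on an encoding.

```python
original = "caf\u00e9"                  # "café"

# Sender writes UTF-8, receiver assumes Latin-1: no error, but wrong text.
utf8_bytes = original.encode("utf-8")   # b"caf\xc3\xa9"
misread = utf8_bytes.decode("latin-1")  # "cafÃ©", one character too many

# Sender writes Latin-1, receiver assumes UTF-8: the decoding fails.
latin1_bytes = original.encode("latin-1")  # b"caf\xe9"
try:
    latin1_bytes.decode("utf-8")
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True

# Both sides agree on UTF-8: the text survives the round trip.
round_trip = utf8_bytes.decode("utf-8")
```

The first mismatch is the more dangerous one, since it silently stores a wrong string instead of raising an error.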

Norbert

