Re: [vwnc] Parsing in Smalltalk

Ben Coman
On .10.2018 at 20:14, Steffen Märcker <[hidden email]> wrote:

> Dear all,
>
> I have two questions regarding parsing frameworks.
>
> 1) Do you have any insights on the performance of SmaCC VS Xtreams  
> Parsing VS PetitParser?
> 2) Has anybody started to port PetitParser 2 from Pharo to VW? Is it  
> worth the effort?
>
> Sorry for cross-posting, I thought this might interest both communities.
>
> Cheers, Steffen

On Fri, 5 Oct 2018 at 04:47, Steffen Märcker <[hidden email]> wrote:
I gave Xtreams-Parsing and PetitParser a shot and would like to share my 
findings.[*]

The task was to parse the modelling language of the probabilistic model 
checker PRISM. I've written a grammar of about 130 definitions in the 
Xtreams DSL, which is close to Bryan Ford's syntax. To avoid doing it all 
again with PetitParser, I wrote a PetitParserGenerator that takes the DSL 
and builds a PetitParser.

The numbers below are just parsing times, no further actions involved. For 
reference I show the times from PRISM (which uses JavaCC), too -- although 
they involve additional verification and normalization steps on the AST.

input  Prism    XP   PP
230kB    14s    9s   2s
544kB   121s   20s   5s
1.1MB   421s   34s   8s
1.4MB  1091s   47s  12s
2.2MB          63s  16s
2.9MB          81s  20s
3.8MB         107s  25s
4.4MB         123s  30s

Please note that these times are not representative at all. It's just a 
single example and I put zero effort into optimization. However, I am quite 
satisfied with the results.

[*] I was already familiar with the DSL of Xtreams-Parsing, which I like 
very much. I did not consider SmaCC, as I find PEGs easier to use.

Best, Steffen

Thanks for your report Steffen. Nice to see such comparisons even when a bit apples & oranges.
Will you be implementing those "additional verification and normalization steps" ?
It seems they have an exponential or power impact on times.
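As a rough check of that impression, the exponent k in t ~ size^k can be estimated from the first and last rows of the table that have a PRISM time. This is only a workspace sketch: it assumes that 1.4MB means 1400kB and that two data points are enough for a ballpark figure.

```smalltalk
"Estimate k in t ~ size^k from the 230kB and 1.4MB rows of the table."
| sizeRatio exponentFor prism xtreams petit |
sizeRatio := 1400 / 230.   "assumption: 1.4MB = 1400kB"
exponentFor := [:tSmall :tLarge | (tLarge / tSmall) ln / sizeRatio ln].
prism := exponentFor value: 14 value: 1091.   "roughly 2.4"
xtreams := exponentFor value: 9 value: 47.    "roughly 0.9"
petit := exponentFor value: 2 value: 12.      "roughly 1.0"
```

On these numbers the two Smalltalk parsers scale about linearly with input size, while the PRISM times (which include the extra steps) grow closer to a power of 2.4.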

cheers -ben 





Jan Kurš
#memoized is one of the most effective optimizations, and one of the hardest to apply. It cannot be done efficiently in an automated way because it depends on the input. The best approach is to identify repeated invocations of the same parser combinator at the same position for a typical input. PP2 has tooling support for this, and I wrote a chapter about #memoized in PP2 [1]. PP2 applies a poor man's version of memoization (based on grammar analysis) automatically when you call #optimize. 
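To illustrate why memoization only pays off for repeated invocations at the same position, here is a generic workspace sketch. It is not PP2 code: slowParse is a hypothetical stand-in for real parsing work, and the cache is a plain Dictionary keyed by input position.

```smalltalk
"Generic memoization by input position; names are illustrative only."
| cache calls slowParse memoParse |
cache := Dictionary new.
calls := 0.
"Stand-in for an expensive parse attempt at a given position."
slowParse := [:position | calls := calls + 1. position * 2].
"Look up the position first and only do the work on a cache miss."
memoParse := [:position |
	cache at: position ifAbsentPut: [slowParse value: position]].
memoParse value: 5.
memoParse value: 5.   "second call at the same position hits the cache"
memoParse value: 6.   "a new position is a miss and pays the cache overhead"
```

If a grammar rarely revisits the same position with the same parser, every call is a miss and the cache only adds bookkeeping, which may be why blanket memoization can make times worse.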

If really needed, send me the parser and an input, and I can check and suggest optimizations.  

There should be no fundamental issue with porting PP2 to VW. As far as I know, there is an automated tool to do so, right? On the other hand, PP is stable and does not change, PP2 is maintained and updated from time to time (mostly adding optimizations), so there might be an overhead of syncing PP2 to VW2.

Cheers,
Jan

[1]: https://kursjan.github.io/petitparser2/pillar-book/build/Chapters/memoization.html

On Fri, Oct 5, 2018, 13:26 Steffen Märcker <[hidden email]> wrote:
Hi Doru!

> I assume that you tried the original PetitParser. PetitParser2 offers 
> the possibility to optimize the parser (kind of a compilation), and this 
> provides a significant speedup:
> https://github.com/kursjan/petitparser2
>
> Would you be interested in trying this out?

Yes, I'd like to give this a shot, too. However, as far as I know, PP2 is 
only available for Pharo and not VW, isn't it?

Speaking of optimizations, I also tried memoizing the PetitParser parser. 
However, the times got worse instead of better. Is there a rule of thumb 
for where to apply #memoized in a sensible way? As far as I understand, 
applying it to the root parser does not memoize subsequent parsers, does 
it?

Kind regards, Steffen

Steffen Märcker
Dear Jan,

I just tried to use PP2 but ran into two issues:

1. PP2 does not load into Pharo 6.1 stable.
2. I use #- to create character classes but was not able to find the  
equivalent in PP2 yet.

> There should be no fundamental issue with porting PP2 to VW. As far as I
> know, there is an automated tool to do so, right?

I am not aware of this tool. Can you give me some hints what exactly to  
look for?

Best, Steffen



Sean P. DeNigris
Administrator
Steffen Märcker wrote
> 1. PP2 does not load into Pharo 6.1 stable.

Can you give more details? IIRC I have PP2 loaded in several 6.1 images.



-----
Cheers,
Sean
--
Sent from: http://forum.world.st/Pharo-Smalltalk-Users-f1310670.html

Steffen Märcker
> Can you give more details? IIRC I have PP2 loaded in several 6.1 images.

I did the following:
1)  Download and start Pharo 6.1 stable via the launcher.
2a) Attempt to install PetitParser2 via the CatalogBrowser:
     "Information
     There was an error while trying to install PetitParser2.
     Installation was cancelled."
2b) Attempt to install PP2 via the scripts from GitHub:
     Metacello new
         baseline: 'PetitParser2';
         repository: 'github://kursjan/petitparser2';
         load.
     Metacello new
         baseline: 'PetitParser2Gui';
         repository: 'github://kursjan/petitparser2';
         load.
     "Could not resolve: [BaselineOfPetitParser2] in [...]"

Interestingly, it works in Pharo 7 dev, but there the GUI-Tools won't load  
because of some issues with their dependencies.

I hope this helps. As I am not familiar with Pharo, I'd appreciate any  
hints.

Best, Steffen

Sean P. DeNigris
Administrator
Steffen Märcker wrote

> I did the following:
> 1)  Download and start Pharo 6.1 stable via the launcher.
> 2b) Attempt to install PP2 via the scripts from GitHub:
>      Metacello new
>          baseline: 'PetitParser2';
>          repository: 'github://kursjan/petitparser2';
>          load.
>      Metacello new
>          baseline: 'PetitParser2Gui';
>          repository: 'github://kursjan/petitparser2';
>          load.

This way worked for me in Pharo #60546 (check in World->System->About). What
exact Pharo version/OS are you on? 32-bit or 64-bit?



-----
Cheers,
Sean
--
Sent from: http://forum.world.st/Pharo-Smalltalk-Users-f1310670.html

Steffen Märcker
I am using MacOS 10.13.6 and the 32bit VM:

Pharo 6.0
Latest update: #60546

... the string in About is wrong, it should be 6.1. I installed it via the  
launcher as "Official Distribution: Pharo 6.1 - 32Bit (stable)". I just  
noticed that the sources file is missing from vms/private/6521/, too.




Steffen Märcker
Reading the code of PetitParser, I wonder why PPRepeatingParser  
initializes 'max' with SmallInteger maxVal instead of some notion of  
infinity, like Float infinity (and PP2RepeatingNode as well). If I  
understand the code correctly, PPParser>>min: fails if the number of  
repetitions exceeds SmallInteger maxVal, doesn't it?

Best, Steffen



Peter Kenny
In reply to this post by Steffen Märcker
Steffen

I do most of my work using Moose Suite 6.1, which is Pharo 6.1 with a lot of
extras, because it comes with the tools I want (PetitParser, PP2 and
XMLParser) already loaded. The image is huge, but if that's not a problem
for you it could be an easy way to get PP2.

Best wishes

Peter Kenny





Jan Kurš
In reply to this post by Steffen Märcker
I run PP2 on Travis [1], and it seems Pharo 6.1 loads all configurations, on both Linux and Mac. Pharo 5 and Pharo 6.0 got broken; why is build configuration so hard :'( I don't know how I can support you. I myself had to give up on some tools because I failed to load them.

There is no specific reason to use SmallInteger maxVal... and nobody ever thought it might be too little. 'PP2 min: X' fails if there are fewer than X repetitions. 'PP2 max: X' parses at most X repetitions.

($a asPParser min: 2 max: 3) parse: 'a'.    -> Failure
($a asPParser min: 2 max: 3) parse: 'aa'.   -> #($a $a)
($a asPParser min: 2 max: 3) parse: 'aaa'.  -> #($a $a $a)
($a asPParser min: 2 max: 3) parse: 'aaaa'. -> #($a $a $a)

Use $- asPParser for characters, e.g.:
$- asPParser parse: '-'

[1]: https://travis-ci.org/kursjan/petitparser2/builds/438358467



Steffen Märcker
Hi, I tried it some more times and things are different now:
- image appeared to lock up (1st)
- no network traffic at all (2nd)
- image unresponsive, loading successful after 2 minutes (3rd)
Call me a fool, but I didn't do anything different. Notably, it succeeded  
each time in 7.0. =)

> There is no specific reason to use SmallInteger maxVal...  and nobody  
> ever thought it might be too little.

Maybe it makes sense to change this? It appears to be just wrong and on  
32bit the limit is well in practical reach. With a little guidance, I'd  
try to do a first PR myself (if the change is considered sensible).

I mentioned #min: because it is implemented in terms of 'min: min max:  
SmallInteger maxVal'.
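To put a number on "well in practical reach": on a 32-bit Pharo image SmallInteger maxVal is 2^30 - 1, so the implicit upper bound is about one billion repetitions. A small workspace sketch of the arithmetic (the variable names are just for illustration):

```smalltalk
"The implicit repetition limit on a 32-bit Pharo image."
| limit32 limitInMiB |
limit32 := (2 raisedTo: 30) - 1.       "1073741823"
"One repetition per byte of input: the limit corresponds to just under 1 GiB."
limitInMiB := limit32 // (1024 * 1024). "1023"
```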

Is there an easy way to create a character class, similar to [1-2x-z]?

Best, Steffen



> Use $- asPParser for characters, e.g:
> $- asPParser parse: '-'
>
> [1]: https://travis-ci.org/kursjan/petitparser2/builds/438358467

Jan Kurš
In reply to this post by Ben Coman
Hi Steffen,

Thanks for the report, the numbers please me :)

Speaking of a tool for porting, I was recently shown this one; I don't have any experience with it:

Speaking of character ranges, these are currently available:
#letter asPParser (recognizes characters matching the #isLetter predicate)
#word asPParser (characters matching the #isAlphaNumeric predicate)
#digit (#isDigit predicate)
#hex ([0-9a-fA-F])
#space
#blank
#any

You can find the definitions in PP2NodeFactory; see the implementation of #hex, which is probably closest to what you want. For your convenience, your project can extend the class to fit your needs.

You can also write: $a asPParser / $b asPParser / $c asPParser / #digit asPParser ... PetitParser2 can recognize this pattern and turn it into a character class during the optimization pass.
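As a sketch of that choice-based approach for a class like [1-2x-z], assuming PetitParser2 is loaded: the string of member characters and the variable names are only illustrative, and #isPetit2Failure is, to my understanding, the usual way to test a PP2 result for failure.

```smalltalk
"Requires PetitParser2. Builds [1-2x-z] as an ordered choice of characters."
| chars charClass |
chars := '12xyz'.   "members of the class, written out explicitly"
charClass := chars allButFirst
	inject: chars first asPParser
	into: [:choice :each | choice / each asPParser].
charClass parse: 'y'.                     "expected to answer $y"
(charClass parse: 'a') isPetit2Failure.   "expected to answer true"
```

Sending #optimize to the resulting parser should then give PP2 the chance to collapse the choice into a single character class, as described above.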

On Sat, Oct 13, 2018 at 5:38 PM Steffen Märcker <[hidden email]> wrote:
Hi,

I gave PetitParser 2 a try and I am pretty impressed by the results, 
please see the updated table below. =) Again, that's pure parsing and 
Array-based AST-building. Moving to PP2 was indeed as easy as sending 
#asPParser and working around character ranges ($a - $z). Is there a 
preferred way to do the latter?

Jan mentioned that there might be an automated tool to port stuff to 
VisualWorks. Do you have a name? And again the old question: what is the 
preferred workflow to exchange code between the two dialects? Till now I 
stick to FileOut30.

input  Prism        Storm  Xtreams.PEG  PP     PP2
size   parse check  check  parse cache  parse  parse optim
230kB   0.1s   10s     6s     9s    3s     2s     4s  0.2s
544kB   0.2s   90s    20s    20s    7s     5s     9s  0.5s
1.1MB   0.4s  392s    46s    34s   13s     8s    15s  1.0s
1.4MB   0.8s 1091s    85s    47s   20s    12s    20s  1.3s
2.2MB                        63s   30s    16s    27s  1.9s
2.9MB                        81s   44s    20s    34s  2.5s
3.8MB                       107s   61s    25s    45s  3.1s
4.4MB                       123s   76s    30s    56s  3.7s

Best, Steffen


On .10.2018 at 05:22, Tudor Girba <[hidden email]> wrote:

> Hi,
>
> Interesting experiment. Thanks for sharing!
>
> I assume that you tried the original PetitParser. PetitParser2 offers 
> the possibility to optimize the parser (kind of a compilation), and this 
> provides a significant speedup:
> https://github.com/kursjan/petitparser2
>
> Would you be interested in trying this out?
>
> Cheers,
> Doru

Steffen Märcker
In reply to this post by Jan Kurš
Hi Jan,

I am trying to port PP2 to VW and managed to get most of the tests green.  
=) Some of the remaining test failures occur in

PP2DebuggingStrategy>>cacheIfNeeded:debugResult:

where a result of nil is to be stored in an IdentityDictionary. But in VW,  
Dictionaries do not accept nil as a key. If this is indeed intended, I  
wonder how best to circumvent the limitation. Would it be feasible to use  
a placeholder object instead of nil (which code would be affected)? Or  
would it be better not to cache nil at all?

I'd be happy to hear your thoughts.

Best, Steffen






jgfoster
We will have the same issue porting PP2 to GemStone and look forward to suggestions.



Steffen Märcker
Hi,

my current shot is to subclass IdentityDictionary with  
NildentityDictionary. The class checks on indexed access whether the key  
is nil. If so, storing/retrieval uses a singleton Object as the key (stored  
in the instance variable NIL := Object new). I needed to override at least:

   >>at:ifAbsent:
   >>associationsAt:ifAbsent:
   >>at:put:
   >>findElementLike:ifAbsent:
   >>initialize
   class>>new
   class>>new:

It works for PP2 (without serious performance impact so far), although the  
class is yet incomplete wrt. Collection/Dictionary protocol.
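The sentinel-key idea can be shown in isolation with a plain IdentityDictionary. This is a minimal workspace sketch, not the subclass itself: nilKey, cache and keyFor are hypothetical names, and the key mapping is done by a block rather than by overriding the accessors.

```smalltalk
"Sentinel-key idea in isolation; the names below are illustrative only."
| nilKey cache keyFor found |
nilKey := Object new.                       "unique stand-in for nil"
cache := IdentityDictionary new.
keyFor := [:key | key ifNil: [nilKey]].     "map nil to the sentinel"
cache at: (keyFor value: nil) put: #cachedForNil.
cache at: (keyFor value: #node) put: #cachedForNode.
found := cache at: (keyFor value: nil) ifAbsent: [#missing].
```

Because the sentinel is a fresh Object compared by identity, it cannot collide with any real key, and no nil ever reaches the dictionary.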

Currently, all but the Morphic-related tests are green. \o/ After some  
cleanup of the bundles, I'll publish an initial version to the Public Store.

@James: If you like, we could discuss our ports later to double-check and  
make the future process easier. How about that?
@Jan: I'd be happy to hear your thoughts on the porting matter.

Best, Steffen



Am .10.2018, 16:06 Uhr, schrieb James Foster <[hidden email]>:

> We will have the same issue porting PP2 to GemStone and look forward to  
> suggestions.
>
>> On Oct 22, 2018, at 4:40 AM, Steffen Märcker <[hidden email]> wrote:
>>
>> Hi Jan,
>>
>> I am trying to port PP2 to VW and managed to get most of the tests  
>> green. =) Some of the remaining test failures occur in
>>
>> PP2DebuggingStrategy>>cacheIfNeeded:debugResult:
>>
>> where a result of nil is to be stored in an IdentityDictionary. But in  
>> VW Dictionaries do not accept nil as a key. If this is indeed intended,  
>> I wonder how to circumvent the limitation best. Would it be feasible to  
>> use a placeholder object instead of nil (which code would be effected)?  
>> Or would it be better to not cache nil at all?
>>
>> I'd be happy to hear your thoughts.
>>
>> Best, Steffen
>>
>>
>>
>>
>>
>> Am .10.2018, 20:58 Uhr, schrieb Jan Kurš <[hidden email]>:
>>
>>> I run PP2 on travis [1], seems Pharo 6.1 loads all configurations,  
>>> both on
>>> linux and mac. Pharo 5, Pharo 6.0 got broken, why is build  
>>> configuration so
>>> hard :'( I don't know, how can I support you. I myself had to gave up  
>>> on
>>> some tools, because I failed to load them.
>>>
>>> There is no specific reason to use SmallInteger maxVal...  and nobody  
>>> ever
>>> thought it might be too little. 'PP2 min: X' fails if there are less
>>> repetitions that X. 'PP2 max: X' parses at most X repetitions.
>>>
>>> ($a asPParser min: 2 max: 3) parse: 'a'. -> Failure
>>> ($a asPParser min: 2 max: 3) parse: 'aa'.  #($a $a)
>>> ($a asPParser min: 2 max: 3) parse: 'aaa'. #($a $a $a)
>>> ($a asPParser min: 2 max: 3) parse: 'aaaa'. #($a $a $a)
>>>
>>>
>>> Use $- asPParser for characters, e.g:
>>> $- asPParser parse: '-'
>>>
>>> [1]: https://travis-ci.org/kursjan/petitparser2/builds/438358467
>>>
>>> On Thu, Oct 11, 2018 at 8:13 PM Steffen Märcker <[hidden email]> wrote:
>>>
>>>> Reading the code of PetitParser, I wonder why PPRepeatingParser
>>>> initializes 'max' with SmallInteger maxVal instead of some notion of
>>>> infinity, like Float infinity (and PP2RepeatingNode as well). If I
>>>> understand the code correctly, PParser>>min: fails if the number of
>>>> repetitions exceeds SmallInteger maxVal, doesn't it?
>>>>
>>>> Best, Steffen
>>>>
>>>>
>>>> On .10.2018, at 17:10, Steffen Märcker <[hidden email]> wrote:
>>>>
>>>> > I am using MacOS 10.13.6 and the 32bit VM:
>>>> >
>>>> > Pharo 6.0
>>>> > Latest update: #60546
>>>> >
>>>> > ... the string in "About" is wrong; it should be 6.1. I installed it
>>>> > via the launcher as "Official Distribution: Pharo 6.1 - 32Bit
>>>> > (stable)". I just noticed that the sources file is missing from
>>>> > vms/private/6521/, too.
>>>> >
>>>> > On .10.2018, at 17:02, Sean P. DeNigris <[hidden email]> wrote:
>>>> >
>>>> >> Steffen Märcker wrote
>>>> >>> I did the following:
>>>> >>> 1)  Download and start Pharo 6.1 stable via the launcher.
>>>> >>> 2b) Attempt to install PP2 via the scripts from GitHub:
>>>> >>>      Metacello new
>>>> >>>          baseline: 'PetitParser2';
>>>> >>>          repository: 'github://kursjan/petitparser2';
>>>> >>>          load.
>>>> >>>      Metacello new
>>>> >>>          baseline: 'PetitParser2Gui';
>>>> >>>          repository: 'github://kursjan/petitparser2';
>>>> >>>          load.
>>>> >>
>>>> >> This way worked for me in Pharo #60546 (check in
>>>> >> World->System->About). What exact Pharo version/OS are you on?
>>>> >> 32 or 64-bit?
>>>> >>
>>>> >>
>>>> >>
>>>> >> -----
>>>> >> Cheers,
>>>> >> Sean
>>>> >> --
>>>> >> Sent from: http://forum.world.st/Pharo-Smalltalk-Users-f1310670.html
>>>> >>
>>>> >
>>>> >
>>>>
>>
>>
>
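
For reference, the placeholder idea from the quoted message could look roughly like the workspace sketch below. It assumes the nil result really ends up as the dictionary key; `nilMarker` and `keyFor` are names made up for illustration and are not part of PP2.

```smalltalk
"Sketch only: nilMarker and keyFor are hypothetical names, not PP2 API."
| cache nilMarker keyFor |
cache := IdentityDictionary new.
nilMarker := Object new.    "unique sentinel that stands in for nil"

"Map nil to the sentinel; every other object is used as its own key."
keyFor := [:result | result ifNil: [nilMarker]].

"Store and look up through keyFor, so nil never reaches the dictionary."
cache at: (keyFor value: nil) put: #cachedForNil.
cache at: (keyFor value: #someResult) put: #cachedForSymbol.

"Lookup for a nil result goes through the same mapping."
cache at: (keyFor value: nil) ifAbsent: [#notCached].
```

In this sketch only the code that reads from or writes to the cache has to know about the marker; callers keep passing nil as before.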

Re: [vwnc] Parsing in Smalltalk

Sean P. DeNigris
Administrator
In reply to this post by Ben Coman
I rediscovered this thread while pondering an (existential?!) problem:

Why do we keep having to write, and rewrite, and rewrite, ad infinitum,
parsers for well known domains like rfc specs? The parser world in many ways
feels like a modern post-"Tower of Babel". I was really excited about the
reverse parsing stuff done in Squeak by Ted Kaehler and Alessandro Warth
[1], which never seemed to get picked up.


Steffen Märcker wrote
> I wrote a PetitParserGenerator that takes the DSL and builds a
> PetitParser.

I don't know how I could've missed this gem! I hope Steffen is still
subscribed. I googled "PetitParserGenerator", but only found these ML posts
:/

IIRC Xtreams can take a BNF and generate a parser. I was thinking about
implementing a BNF parser in PetitParser, but would love to avoid that and
wouldn't mind a two-step BNF -> Xtreams -> PP process. Although there was
this SO reply [2] where Lukas said that one can't necessarily blindly feed a
BNF to a PEG. It also got me thinking about sharing parsers between PP and
PP2, since in many cases it seems that only the internal class names would
differ...
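
To make the BNF-vs-PEG caveat concrete, here is a small PP2 sketch. It assumes the usual PetitParser2 protocol (`/` for ordered choice, `end` for end of input, `isPetit2Failure` to test a result), and the toy grammar is invented for illustration. In BNF, `A ::= "a" | "ab"` accepts 'ab', whereas a PEG tries the alternatives in order and commits to the first one that matches.

```smalltalk
"Sketch only: toy grammar, assuming the standard PetitParser2 operators."
| naive reordered |

"Literal translation of the BNF alternation: 'a' is tried first."
naive := ('a' asPParser / 'ab' asPParser) end.

"Same alternatives, longest one first."
reordered := ('ab' asPParser / 'a' asPParser) end.

"Under PEG semantics the choice commits to 'a', so 'end' then fails at $b."
(naive parse: 'ab') isPetit2Failure.

"Here 'ab' matches as a whole and 'end' succeeds."
(reordered parse: 'ab') isPetit2Failure.
```

So a mechanical BNF -> PEG step would at least need to check the order of alternatives that share a prefix.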

[1] http://www.vpri.org/pdf/m2008001_parseback.pdf
[2] https://stackoverflow.com/a/9443024/424245



-----
Cheers,
Sean
--
Sent from: http://forum.world.st/Pharo-Smalltalk-Users-f1310670.html


Re: [vwnc] Parsing in Smalltalk

Steffen Märcker
Dear Sean,

thanks for bringing this up and for the interesting link [1]. I'll have a  
look. Meanwhile, I am still on both lists. ;-)

> Steffen Märcker wrote
>> I wrote a PetitParserGenerator that takes the DSL and builds a
>> PetitParser.
>
> I don't know how I could've missed this gem! I hope Steffen is still
> subscribed. I googled "PetitParserGenerator", but only found these ML  
> posts
> :/

It was easy to miss, since I never actually posted that code. Here's what  
I've done:

From Xtreams.PEG syntax:
- PetitParserGenerator -> PetitParser code
- PetitParserParser    -> PetitParser instance

I can send you that code. It should be straightforward to adapt it for  
PP2. However, I moved away from Xtreams' own PEG flavor to Bryan Ford's  
original PEG syntax (close but more common) and implemented the following  
for Xtreams, PP and PP2:

From Bryan Ford's original PEG syntax:
1. Xtreams Grammar and Actor that build an Xtreams parser
2. PP(2)PEGParserParser that builds a PP(2) parser instance
3. PP(2)PEGParserCompiler that builds PP(2) parser classes

All three are available for VW in Cincom's public repository:
- Xtreams-Parsing (8.2-3,stm): +PEG parser +some fixes to Xtreams parser
- PetitParser-PEG (1.11,stm)
- PetitParser2-PEG (1.11,stm)

And 2, 3 for Pharo on GitHub, e.g.
- https://github.com/kursjan/petitparser2/tree/master/PetitParser2-PEG

Speaking of which, I just noticed that a translator from Xtreams.PEG to PEG  
is missing to complete the picture here. ;-)

> IIRC Xtreams can take a BNF and generate a parser.

As far as I know, there is no BNF-like parser generator available for  
Xtreams.

> wouldn't mind a two step BNF -> Xtreams -> PP process.

Is it an option for you to convert your BNF to PEG manually?

Best regards,
Steffen


Re: [vwnc] Parsing in Smalltalk

Sean P. DeNigris
Administrator
Steffen Märcker wrote
> I can send you that code.

Awesome. I'll email you. Although, I wonder how relevant this is given your
answers down below. Are there many grammars available in Xtreams syntax to
make this useful? Also, what is the license? Can I add it somewhere on GH
under MIT (with attribution of course)?


Steffen Märcker wrote
> However, I moved away from Xtreams own PEG flavor to Bryan Ford's  
> original PEG syntax (close but more common)

Ah, interesting. I didn't realize that Xtreams used a custom PEG syntax. I
wonder why?!


Steffen Märcker wrote
> From Bryan Ford's original PEG syntax:
> 1. Xtreams Grammar and Actor that build an Xtreams parser
> ...
> All three are available for VW in Cincom's public repository:
> - Xtreams-Parsing (8.2-3,stm): +PEG parser +some fixes to Xtreams parser

I'd certainly like to port that at some point, but I'm currently fairly
mystified about the best practice process. I just reached out to Pavel about
the Ring2 approach on which he spoke at ESUG. Do you have a documented
process or any pointers even?


Steffen Märcker wrote
> And 2, 3 for Pharo on GitHub, e.g.
> - https://github.com/kursjan/petitparser2/tree/master/PetitParser2-PEG

Great. I use PP2 a lot. So, IIUC, I can now feed in a PEG-syntax grammar
string and have a PP2 parser generated for it?


Steffen Märcker wrote
> Speaking of which, I just noticed that a translator from Xtreams.PEG to PEG  
> is missing to complete the picture here. ;-)

Ah, yes that makes sense. Any idea how much effort would be involved?


Steffen Märcker wrote
> As far as I know, there is no BNF-like parser generator available for  
> Xtreams.
> ...
> Is it an option for you to convert your BNF to PEG manually?

I'm not a parsing expert, so that may be what I'm already doing, just with
the wrong terminology. I took the ABNF from rfc5322 [1] and adapted it
slightly [2] to be consumed by Xtreams [3].

[OT?]
As a final aside, I've been wondering if there's any way to generate
"hand-rolled" equivalent parsers from Xtreams, PP, etc. for use cases where
none of the libraries are available. I have in mind Pharo's MailMessage. It
doesn't seem like any full-featured parsing libraries will be integrated any
time soon, so the lowest levels use painful, duplication-riddled hand-rolled
parsers. It would be great to leverage all this great library tech to create
and reason about those...
[/OT]

Thanks for the discussion!

1. https://tools.ietf.org/html/rfc5322
2. https://github.com/seandenigris/Xtreams-Pharo/blob/master/repository/Xtreams-Email.package/PEGParser.extension/class/grammarEmail.st
3. https://github.com/seandenigris/Xtreams-Pharo/blob/master/repository/Xtreams-Email.package/PEGParserEmailTest.class/instance/setUp.st



-----
Cheers,
Sean
--
Sent from: http://forum.world.st/Pharo-Smalltalk-Users-f1310670.html
