Smalltalk › Frameworks & Tools › Moose

PetitParser: refer to previous values

_‹ Previous Topic Next Topic _›

Classic

List

Threaded

4 messages Options

Sean P. DeNigris

PetitParser: refer to previous values

Administrator

Is there a way to do something like:
#digit asParser, 'same digit that I just parsed' asParser ?

I snooped around and saw PPMemoizedParser, but didn't exactly understand the test for it.

Thanks.
Sean

Cheers,
Sean

NorbertHartl

Re: PetitParser: refer to previous values

Am 25.06.2011 um 15:57 schrieb Sean P. DeNigris:

> Is there a way to do something like:
> #digit asParser, 'same digit that I just parsed' asParser ?
>
I'm not sure if there is a possibility in declarative way. But you can work on the products of the individual parsers. You can do it like this:

twoEqualDigits
^ #digit asParser, #digit asParser ==> [:nodes|
(nodes first = nodes second)
ifTrue: [ nodes ]
ifFalse: [ PPFailure message: 'not the same value' at: 0]]

> I snooped around and saw PPMemoizedParser, but didn't exactly understand the
> test for it.

I would like to know exactly, too :)

Norbert

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

Lukas Renggli

Re: PetitParser: refer to previous values

>> Is there a way to do something like:
>> #digit asParser, 'same digit that I just parsed' asParser ?
>>
> I'm not sure if there is a possibility in declarative way. But you can work on the products of the individual parsers. You can do it like this:
>
> twoEqualDigits
> ^ #digit asParser, #digit asParser ==> [:nodes|
> (nodes first = nodes second)
> ifTrue: [ nodes ]
> ifFalse: [ PPFailure message: 'not the same value' at: 0]]

Yes, this is essentially the trick.

Have a look at some of the other references to PPFailure, for example

- in PPXmlParser>>element we ensure that the open and close tag are the same,
- in PPSmalltalkGrammar>>number an external parser is used to validate
and extract the number, and
- in PPObjectTest>>testFibonacci the input must be a fibonacci
sequence (quite complicated code).

Now this seems to be quite a common pattern, so it might be worth to
add a helper method to PetitParser. I imagine something along the
lines of

(#digit asParser , #digit asParser)
condition: [ :nodes | nodes first = nodes second ]
message: 'not the same value'

>> I snooped around and saw PPMemoizedParser, but didn't exactly understand the
>> test for it.
>
> I would like to know exactly, too :)

Did you see the class comment and the method comment in the factory
method of PPMemoizedParser?

PPParser>>memoized
"Answer a new memoized parser, for refraining redundant computations.
This ensures polynomial time O(n^4) for left-recursive grammars and
O(n^3) for non left-recursive grammars in the worst case. Not
necessary for most grammars that are carefully written and in O(n)
anyway."

^ PPMemoizedParser on: self

So, this is to cache repeated parses of the same rule, at the same
position, and in the same input stream. In almost all cases it is not
useful and actually slows things down. In rare cases it is necessary,
but then you should probably rather fix your grammar.

The basic idea behind memoization is explained in "Packrat parsing:
simple, powerful, lazy, linear time" by Bryan Ford
(http://arxiv.org/pdf/cs/0603077). The implementation of PetitParser
is actually based on ideas presented in "Packrat Parsers Can Support
Left Recursion" by Alessandro Warth
(http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.99.7466&rep=rep1&type=pdf).
Note that the PetitParser implementation of PPMemoizedParser has the
same problem as the one of Warth leading to invalid parses on some
PEGs. Details can be found in "Direct Left-Recursive Parsing
Expressing Grammars" by Laurence Tratt
(http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.156.7945&rep=rep1&type=pdf).

Lukas

--
Lukas Renggli
www.lukas-renggli.ch

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

NorbertHartl

Re: PetitParser: refer to previous values

Am 25.06.2011 um 18:01 schrieb Lukas Renggli:

>>> Is there a way to do something like:
>>> #digit asParser, 'same digit that I just parsed' asParser ?
>>>
>> I'm not sure if there is a possibility in declarative way. But you can work on the products of the individual parsers. You can do it like this:
>>
>> twoEqualDigits
>> ^ #digit asParser, #digit asParser ==> [:nodes|
>> (nodes first = nodes second)
>> ifTrue: [ nodes ]
>> ifFalse: [ PPFailure message: 'not the same value' at: 0]]
>
> Yes, this is essentially the trick.
>
> Have a look at some of the other references to PPFailure, for example
>
> - in PPXmlParser>>element we ensure that the open and close tag are the same,
> - in PPSmalltalkGrammar>>number an external parser is used to validate
> and extract the number, and
> - in PPObjectTest>>testFibonacci the input must be a fibonacci
> sequence (quite complicated code).
>
> Now this seems to be quite a common pattern, so it might be worth to
> add a helper method to PetitParser. I imagine something along the
> lines of
>
> (#digit asParser , #digit asParser)
> condition: [ :nodes | nodes first = nodes second ]
> message: 'not the same value'
>
>>> I snooped around and saw PPMemoizedParser, but didn't exactly understand the
>>> test for it.
>>
>> I would like to know exactly, too :)
>
> Did you see the class comment and the method comment in the factory
> method of PPMemoizedParser?
>
> PPParser>>memoized
> "Answer a new memoized parser, for refraining redundant computations.
> This ensures polynomial time O(n^4) for left-recursive grammars and
> O(n^3) for non left-recursive grammars in the worst case. Not
> necessary for most grammars that are carefully written and in O(n)
> anyway."
>
> ^ PPMemoizedParser on: self
>
> So, this is to cache repeated parses of the same rule, at the same
> position, and in the same input stream. In almost all cases it is not
> useful and actually slows things down. In rare cases it is necessary,
> but then you should probably rather fix your grammar.
>
> The basic idea behind memoization is explained in "Packrat parsing:
> simple, powerful, lazy, linear time" by Bryan Ford
> (http://arxiv.org/pdf/cs/0603077). The implementation of PetitParser
> is actually based on ideas presented in "Packrat Parsers Can Support
> Left Recursion" by Alessandro Warth
> (http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.99.7466&rep=rep1&type=pdf).
> Note that the PetitParser implementation of PPMemoizedParser has the
> same problem as the one of Warth leading to invalid parses on some
> PEGs. Details can be found in "Direct Left-Recursive Parsing
> Expressing Grammars" by Laurence Tratt
> (http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.156.7945&rep=rep1&type=pdf).

Thanks for the pointers. I hope I find some time to get into this. I played a little with memoized parser because my grammar used some left recursive rules. But with memoized parsers I had pretty strange effects with white spaces and trimming. So I wanted to know how it works to test this exactly. At the end I reworked the grammar to be not left recursive which gave me an additional speed up as well.

thanks,

Norbert

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev