Hello Pharoers and Moosers,

I did a Pratt parser extension for PetitParser. A Pratt parser (a.k.a. top-down operator precedence parser) handles left recursion and operator precedence. It handles grouping, prefix, postfix, infix (right- or left-associative) and "multifix" operators (e.g. "if ... then ... else ...", "... ? ... : ...", Smalltalk keyword messages).

Normally Pratt parsing needs a tokenization phase, but here tokenization is done on the fly with other PP parsers. Apart from tokenization, no backtracking is needed, so parsing is quite fast (approximately 2 times faster than PPExpressionParser).

Here is an example of a calculator:

    parser := PPPrattParser new.

    "Numbers"
    parser terminal: #digit asParser plus do: [ :token | token inputValue asNumber ].
    parser skip: #space asParser plus.

    "Parentheses"
    parser groupLeft: $( asParser right: $) asParser.

    "Addition, subtraction, multiplication, division: all left infix, * and / have higher precedence than + and -"
    parser leftInfix: $+ asParser precedence: 1 do: [ :left :op :right | left + right ].
    parser leftInfix: $- asParser precedence: 1 do: [ :left :op :right | left - right ].
    parser leftInfix: $* asParser precedence: 2 do: [ :left :op :right | left * right ].
    parser leftInfix: $/ asParser precedence: 2 do: [ :left :op :right | left / right ].

    "Power: right infix with higher precedence than multiplication and division"
    parser rightInfix: $^ asParser precedence: 3 do: [ :left :op :right | left raisedTo: right ].

    "Unary minus: prefix with highest precedence"
    parser prefix: $- asParser precedence: 4 do: [ :op :right | right negated ].

    parser parse: '2*3 + 4^(1/2)*3' ----> 12

To try it:

    Gofer it
        smalltalkhubUser: 'CamilleTeruel' project: 'PetitPratt';
        package: 'PetitPratt';
        load

Note that it is in beta stage so it might still change drastically.
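For readers new to the technique, here is a minimal workspace sketch of how a precedence-driven loop resolves precedence and associativity without backtracking. It is not the PetitPratt implementation: all variable names are made up for illustration, the input is already tokenized, and only terminals and infix operators are covered (no grouping, prefix or postfix).

```smalltalk
"Sketch only: hypothetical names, not PPPrattParser's implementation.
 Input is pre-tokenized and assumed well-formed (number, operator, number, ...)."
| tokens pos infix parseExpr result |
tokens := #(1 $- 2 $- 3 $* 2 $^ 3 $^ 2).
pos := 1.

"operator -> { precedence. associativity. action }"
infix := Dictionary new.
infix at: $+ put: { 1. #left. [ :l :r | l + r ] }.
infix at: $- put: { 1. #left. [ :l :r | l - r ] }.
infix at: $* put: { 2. #left. [ :l :r | l * r ] }.
infix at: $^ put: { 3. #right. [ :l :r | l raisedTo: r ] }.

parseExpr := [ :minPrec | | left entry nextMin right |
	"A terminal starts the expression."
	left := tokens at: pos.
	pos := pos + 1.
	"Keep absorbing operators that bind at least as tightly as minPrec."
	[ pos <= tokens size
		and: [ (infix at: (tokens at: pos)) first >= minPrec ] ]
		whileTrue: [
			entry := infix at: (tokens at: pos).
			pos := pos + 1.
			"Left-associative: the right operand must bind strictly tighter.
			 Right-associative: the same precedence may recurse on the right."
			nextMin := entry second = #left
				ifTrue: [ entry first + 1 ]
				ifFalse: [ entry first ].
			right := parseExpr value: nextMin.
			left := entry third value: left value: right ].
	left ].

result := parseExpr value: 0.
"1 - 2 - 3 * 2 ^ 3 ^ 2 groups as (1 - 2) - (3 * (2 ^ (3 ^ 2))), so result = -1537"
```

Each token is looked at once and the position only moves forward, which is where the "no backtracking" property comes from in this simplified setting.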
@PP Devs: I had trouble with the PPContext furthestFailure that is taken into account instead of the failures I return, so I had to redefine #parseWithContext: to return the failures I want. The results given by furthestFailure were not very meaningful in my case (the same is true for PPExpressionParser btw). But I guess it was introduced because it gives good results in other cases. So would it be possible to change this behavior to let the parser decide if it returns the furthestFailure or the original failure?

Cheers,
Camille

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev
That sounds like a really cool and useful extension.

Regarding the furthest failure, the core of the problem is the distinction between an error and a failure. An error reports a problem in the input, while a failure is information for the choice parser combinator. In general, the furthest failure is a better approximation of an error than the last failure, so we use it.

I am not sure what exactly the problem is in the case of PrattParser. I guess the last failure gives better results for a user? One has to consider a Pratt parser included in a normal parser, e.g. expressions parsed by Pratt in a Java grammar. Depending on where an error occurs, a different strategy for choosing the proper failure is necessary :-/

Regarding tokenization, there is a message token that returns a PPTokenParser, which transforms the parsed input into a PPToken object. Perhaps this might be helpful?

Cheers
Jan

On Wed, Jun 10, 2015, 20:52 Richard Sargent <[hidden email]> wrote:
camille teruel wrote
Thank you Jan!
Yes, with furthest failure I get errors from tokenization instead of my errors. For example, with the calculator grammar I gave in my first mail, when I parse '1+' the furthestFailure is '$- expected at 2' whereas I return 'expression expected at 2' because there's a whole expression missing. Same thing with '(1+2', which returns 'digit expected at 4' instead of '$) expected at 4'.

But furthest failure gives wrong messages in other cases too. Consider this sequence parser:

    keyword := #letter asParser plus , $: asParser.
    keyword parse: 'foo'

This returns 'letter expected at 3', but no matter how many letters I add to the end I'll still get 'letter expected'. I want to get what is really missing: '$: expected at 3'. Maybe returning the "latest furthest failure" instead of the "first furthest failure" could solve the problem here (i.e. replacing > with >= in PPContext>>#noteFailure:)?
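To make the > versus >= difference concrete, here is a toy model of the bookkeeping in plain Smalltalk. It is not the code of PPContext>>#noteFailure:; the variable names are hypothetical, and it assumes the letter failure is recorded before the $: failure, as the messages quoted above suggest.

```smalltalk
"Toy model of furthest-failure bookkeeping; not PPContext's actual code."
| failures pick first latest |

"Failures assumed to be noted, in order, while parsing 'foo' with the
 keyword parser: message -> position."
failures := { 'letter expected' -> 3. '$: expected' -> 3 }.

"Keep one failure, replacing it whenever isFurther answers true."
pick := [ :isFurther | | best |
	best := nil.
	failures do: [ :each |
		(best isNil or: [ isFurther value: each value value: best value ])
			ifTrue: [ best := each ] ].
	best key ].

first := pick value: [ :new :old | new > old ].    "'letter expected'"
latest := pick value: [ :new :old | new >= old ].  "'$: expected'"
```

With a strict comparison a later failure at the same position never replaces the earlier one, so the first message wins; with >= the last failure noted at the furthest position wins.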
Indeed, my hack (redefining #parseWithContext:) works only when the Pratt parser is the top parser, but as soon as I compose it I'm screwed because only parseOn: is sent to the Pratt parser. That's why I wonder if letting the parser decide what to return wouldn't solve the problem: by default the furthest failure, but special parsers can still decide.
The Pratt tokens are special: a token points back to the parser that generated it (its “kind”). PPTokenKind subclasses PPFlatteningParser and generates instances of PPPrattToken. A PPTokenKind stores the precedence, the action to be executed when a token of this kind is met at the start of an expression (for terminals and prefixes) and the action to be executed when a token is met in the middle of an expression (for postfixes and infixes). Cheers, Camille
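For comparison, the stock token parser Jan mentioned can be tried in a workspace as below. This is only a small sketch with PetitParser loaded; it shows the standard PPToken, not the PPPrattToken described above, and the variable names are arbitrary.

```smalltalk
| number flat token |
number := #digit asParser plus.

"flatten answers the consumed input as a plain String."
flat := number flatten parse: '42'.

"token wraps the same input in a PPToken that remembers where it came from."
token := number token parse: '42'.
token inputValue.   "the consumed text, '42'"
token start.        "index of the first consumed character"
token stop.         "index of the last consumed character"
```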
Hi,

Did you try to inspect the PPFailure in Moose? There is a tab with a tree structure that gives a pretty good overview of what is going wrong. As an alternative, one can call:

    myParser enableDebug parse: myInput

On Fri, Jun 12, 2015 at 8:07 AM Francisco Garau <[hidden email]> wrote:
I didn't know about these features, they are cool :) Thanks!
I would be happy to discuss any ideas you have on that subject :)