Hi,
Yet another question on PetitParser :)

All the grammars that I find in PetitParser (e.g., PetitXML, PetitSmalltalk) are defined in a single class called PP...Grammar. However, the Java grammar has many rules, and including all of them in a single class does not seem to be the right approach.

For example, I now have a class called PPJavaLexicon, in which I cover the rules for finding tokens and comments (i.e. the lexical structure [1]). Next, I would continue working on types, values, and variables [2]. So, I would create another class that references PPJavaLexicon and uses the rules defined there to define the new ones. Something like:

    PPJavaTypes>>typeVariable
        ^ppJavaLexicon identifier

Is this a good approach to split a grammar into several classes, or would you suggest something different?

Thank you,
Alberto

[1] http://java.sun.com/docs/books/jls/third_edition/html/lexical.html
[2] http://java.sun.com/docs/books/jls/third_edition/html/typesValues.html

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev
> a single class called PP...Grammar. However, the Java grammar has many rules and
> including all of them in a single class seems not the right approach.

Sure, depending on the size and structure of the grammar you might want to split it into multiple classes.

> For example, now I have a class called PPJavaLexicon, in which I cover the rules
> for finding tokens and comments (i.e. the lexical structure [1]).
> Then, for example,
> I would continue working on types, values, and variables [2]. So, I would
> create another class that references PPJavaLexicon and uses the
> rules defined there to define the new ones. Something like:
>
> PPJavaTypes>>typeVariable
>     ^ppJavaLexicon identifier

Yes, that's a possibility that works well. It may be better to use the accessor #productionAt: to access the cached productions of a different grammar; otherwise you end up with much larger grammars than necessary.

> Is this a good approach to split a grammar in more classes,
> or would you suggest something different?

The problem with splitting up the grammar as you propose is that it is no longer that easy (but still possible) to use subclassing to customize the grammar with different production actions.

Another (and more traditional) approach is to use a separate lexer. You can see that in TextLint (check on squeaksource.com). There we have different lexers for plain text, LaTeX and HTML, and a parser for a 'natural language' of words, sentences, and paragraphs (very simple) that can be composed in different ways. For Java such a split probably doesn't make sense, but it is a good example of PetitParser being flexible enough to serve different requirements.

Also, you might want to look at my work on language embedding, especially <http://scg.unibe.ch/archive/papers/Reng09cLanguageBoxes.pdf>. There we programmatically compose different languages, modeled as PPCompositeParser instances, at specific join-points.
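A minimal sketch of the #productionAt: suggestion, assuming both classes are PPCompositeParser subclasses. The identifier rule is deliberately simplified and is not the full JLS definition; the category name and the start productions are made up for illustration. The `Class >> selector` headers are the usual notation for method definitions, not workspace code.

```smalltalk
"Hypothetical lexical grammar: each instance variable names one cached production."
PPCompositeParser subclass: #PPJavaLexicon
	instanceVariableNames: 'identifier'
	classVariableNames: ''
	poolDictionaries: ''
	category: 'PetitJava-Sketch'.

PPJavaLexicon >> start
	^ identifier end

PPJavaLexicon >> identifier
	"Simplified rule: a letter followed by any number of letters or digits."
	^ (#letter asParser , #word asParser star) flatten

"Hypothetical second grammar that reuses a production of the first one."
PPCompositeParser subclass: #PPJavaTypes
	instanceVariableNames: 'typeVariable'
	classVariableNames: ''
	poolDictionaries: ''
	category: 'PetitJava-Sketch'.

PPJavaTypes >> start
	^ typeVariable end

PPJavaTypes >> typeVariable
	"Ask the other grammar for its cached production by name, rather than
	 sending #identifier, which would build a new parser object on each call."
	^ PPJavaLexicon new productionAt: #identifier
```

With these definitions, something like `PPJavaTypes new parse: 'T'` should be parsed by the identifier production taken from PPJavaLexicon. If several productions need the lexicon, it is probably worth creating the PPJavaLexicon instance once and sharing it, rather than instantiating it in every production method.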
Lukas

--
Lukas Renggli
www.lukas-renggli.ch