I have a narrow question and a broader one.
The narrow question is how to get SmaCC into the prepackaged moose image. I tried to file in SmaCCDev-lr.23.mcz from squeak source (version 24 is for Pharo 1.1, which I don't think the moose image is using). It failed because it needed SmaCCParser and SmaCCScanner classes. http://www.squeaksource.com/SmaccDevelopment.html says the run time is already in the image; perhaps it's been stripped out?

The broader question is whether anyone has any advice on approaching my problem of parsing SAS files. Advice could be pointers to other lexer/scanners or the news that PEG (i.e., PetitParser) works OK for my purposes (I may want to use it for the parsing, but this is about dealing with the macro language).

To give a taste of the detailed issues: a macro variable reference, &V, can occur almost anywhere in a SAS program (but not inside of some comments and some quotes). It is immediately expanded; this may occur in the middle of what looks like a token, e.g.
(1) data run&V;
becomes
(2) data run04 (start=5);

In perverse cases one could even have
(3) da&v
become (2), including the semicolon.

Macro variables obey scoping rules. There are also macro invocations like %mymacro(3, abc) which expand at the closing parenthesis. %INCLUDE brings a whole file into the source. And the macro language itself has conditional and looping constructs.

As an added bonus, SAS macros are not simple preprocessors, since their expansion can depend on information obtained in the main language at runtime (in fact, macros can be written at run time). It's unlikely I'll ever attempt to handle that case, however.

Essentially, there are 2 or 3 different syntaxes operating in the same program (the main syntax, the macro language, and the expansion of macro variables via &). This was the setup for the initial Conway paper on coroutines. I don't currently see any gain from using coroutines.
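To make examples (1) to (3) concrete, here is a minimal Python sketch of macro variable expansion with a local scope layered over a global one. The names expand_refs, global_scope and local_scope are hypothetical, and the sketch assumes that references are case-insensitive and may end in an optional period; it ignores comments, quoting and rescanning of expanded text.

```python
import re
from collections import ChainMap

# Assumption: a reference is '&' plus a name, optionally closed by a period.
_REF = re.compile(r"&([A-Za-z_][A-Za-z0-9_]*)\.?")

def expand_refs(text, scope):
    """Replace each &NAME (or &NAME.) in text with its value from scope.

    Names are looked up in upper case; unknown names are left untouched.
    """
    def substitute(match):
        name = match.group(1).upper()
        return scope.get(name, match.group(0))
    return _REF.sub(substitute, text)

# Example (1): the reference sits in the middle of what looks like a token.
global_scope = {"V": "04 (start=5)"}
print(expand_refs("data run&V;", global_scope))

# Example (3): a local binding shadows the global one and supplies
# the rest of the statement, including the semicolon.
local_scope = ChainMap({"V": "ta run04 (start=5);"}, global_scope)
print(expand_refs("da&v", local_scope))
```

Both calls yield the text of (2). Because the expansion works on raw text, the result still has to be tokenized afterwards; nothing here knows where SAS token boundaries fall.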
_______________________________________________ Moose-dev mailing list [hidden email] https://www.iam.unibe.ch/mailman/listinfo/moose-dev |
> I tried to file in SmaCCDev-lr.23.mcz from squeak source (version 24 is
> for Pharo 1.1, which I don't think the moose image is using). It failed
> because it needed SmaCCParser and SmaCCScanner classes.
> http://www.squeaksource.com/SmaccDevelopment.html says the run time is
> already in the image; perhaps it's been stripped out?

The runtime is not part of the image anymore. You need to load it (SmaCC) before you load the development tools (SmaCCDev). I've updated the comment on SqueakSource.

> The broader question is whether anyone has any advice on approaching my
> problem of parsing SAS files. Advice could be pointers to other
> lexer/scanners or the news that PEG (i.e., PetitParser) works OK for my
> purposes (I may want to use it for the parsing, but this is about
> dealing with the macro language).

PetitParser is more general in what it can parse than SmaCC. So I don't see a reason that wouldn't work.

> To give a taste of the detailed issues: a macro variable reference, &V,
> can occur almost anywhere in a SAS program (but not inside of some
> comments and some quotes). It is immediately expanded; this may occur
> in the middle of what looks like a token, e.g.
> (1) data run&V;
> becomes
> (2) data run04 (start=5);
>
> In perverse cases one could even have
> (3) da&v
> become (2), including the semicolon.
>
> Macro variables obey scoping rules.

Looks to me like you don't want to do that during parsing, but in a separate step after parsing the AST.

> Essentially, there are 2 or 3 different syntaxes operating in the same
> program (the main syntax, the macro language, and the expansion of macro
> variables via &). This was the setup for the initial Conway paper on
> coroutines. I don't currently see any gain from using coroutines.

PetitParser allows you to define and test these 3 syntaxes separately and then combine them later on.

The coroutines sound like an optimization to combine the parsing and macro expansion steps into one single step. I would implement and test them separately in the beginning.
Lukas

--
Lukas Renggli
www.lukas-renggli.ch
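The remark that coroutines would merge expansion and parsing into one step can be illustrated with Python generators, which are a limited form of coroutine. This is only a sketch under stated assumptions: expanded_chars and tokens are hypothetical names, the tokenizer is a toy (words and single-character symbols), and quoting, comments and scoping are ignored. Each stage can be tested alone, and chaining them lets the tokenizer pull characters that the expander produces on demand.

```python
import re

# Assumption: a reference is '&' plus a name, optionally closed by a period.
_REF = re.compile(r"&([A-Za-z_]\w*)\.?")

def expanded_chars(source, variables):
    """Yield the characters of source, replacing known &NAME references lazily."""
    pos = 0
    while pos < len(source):
        match = _REF.match(source, pos)
        if match and match.group(1).upper() in variables:
            # Feed the replacement text into the stream one character at a time.
            yield from variables[match.group(1).upper()]
            pos = match.end()
        else:
            yield source[pos]
            pos += 1

def tokens(chars):
    """Group a character stream into words and single-character symbols."""
    word = ""
    for ch in chars:
        if ch.isalnum() or ch == "_":
            word += ch
            continue
        if word:
            yield word
            word = ""
        if not ch.isspace():
            yield ch
    if word:
        yield word

stream = expanded_chars("data run&V;", {"V": "04 (start=5)"})
print(list(tokens(stream)))
# ['data', 'run04', '(', 'start', '=', '5', ')', ';']
```

The token run04 only exists after expansion, so the tokenizer has to sit downstream of the expander whether the two run as separate passes or interleaved.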
On Tue, 2010-07-13 at 08:24 +0200, Lukas Renggli wrote:
> > I tried to file in SmaCCDev-lr.23.mcz from squeak source (version 24 is
> > for Pharo 1.1, which I don't think the moose image is using). It failed
> > because it needed SmaCCParser and SmaCCScanner classes.
> > http://www.squeaksource.com/SmaccDevelopment.html says the run time is
> > already in the image; perhaps it's been stripped out?
>
> The runtime is not part of the image anymore. You need to load it
> (SmaCC) before you load the development tools (SmaCCDev). I've updated
> the comment on SqueakSource.

Thank you.

>
> > The broader question is whether anyone has any advice on approaching my
> > problem of parsing SAS files. Advice could be pointers to other
> > lexer/scanners or the news that PEG (i.e., PetitParser) works OK for my
> > purposes (I may want to use it for the parsing, but this is about
> > dealing with the macro language).
>
> PetitParser is more general in what it can parse than SmaCC. So I
> don't see a reason that wouldn't work.

I was thinking of using the lexer in SmaCC.

>
> > To give a taste of the detailed issues: a macro variable reference, &V,
> > can occur almost anywhere in a SAS program (but not inside of some
> > comments and some quotes). It is immediately expanded; this may occur
> > in the middle of what looks like a token, e.g.
> > (1) data run&V;
> > becomes
> > (2) data run04 (start=5);
> >
> > In perverse cases one could even have
> > (3) da&v
> > become (2), including the semicolon.
> >
> > Macro variables obey scoping rules.
>
> Looks to me like you don't want to do that during parsing, but in a
> separate step after parsing the AST.

[...] parsed. The text without macro expansion may not even be well-formed.

>
> > Essentially, there are 2 or 3 different syntaxes operating in the same
> > program (the main syntax, the macro language, and the expansion of macro
> > variables via &). This was the setup for the initial Conway paper on
> > coroutines. I don't currently see any gain from using coroutines.
>
> PetitParser allows you to define and test these 3 syntaxes separately
> and then combine them later on.

Could you say a bit more about the nature of the combination? A simple sequential approach does not seem feasible. For example

    %let V = data one;
    %macro foo(input);
    &V &input;
    a= b+c;
    run;
    %mend foo;
    %foo(two)

The macro processor needs to register the definition of V in the %let so it can be used later. During the macro definition (between %macro and %mend) &V should not be expanded. %foo invokes the macro. At this point the macro processor produces the desired text, and the macro variable processor in turn needs to expand &V and &input, the latter bound to the local argument. Only then does the plain SAS parsing happen.

Ross

>
> The coroutines sound like an optimization to combine the parsing and
> macro expansion steps into one single step. I would implement and test
> them separately in the beginning.
>
> Lukas
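The order of events Ross describes for the %let / %macro / %foo example can be sketched in Python as a single macro pass that emits plain text for a later parser. Everything here is an assumption made for illustration: run_macro_pass and the regular expressions are hypothetical, only %let, %macro ... %mend and positional invocations are recognized, and nested macro statements, %INCLUDE, conditionals, loops and quoting are not handled.

```python
import re
from collections import ChainMap

REF = re.compile(r"&([A-Za-z_]\w*)\.?")
LET = re.compile(r"%let\s+(\w+)\s*=\s*(.*?);", re.I | re.S)
MACRO = re.compile(
    r"%macro\s+(\w+)\s*\((.*?)\)\s*;(.*?)%mend(?:\s+\w+)?\s*;", re.I | re.S)
CALL = re.compile(r"%(\w+)\s*\((.*?)\)", re.S)

def expand_refs(text, scope):
    """Replace &NAME references found in scope; leave unknown ones alone."""
    return REF.sub(lambda m: scope.get(m.group(1).upper(), m.group(0)), text)

def run_macro_pass(source):
    """Return the plain text left after macro processing (whitespace squeezed)."""
    variables, macros, out, pos = {}, {}, [], 0
    while pos < len(source):
        if source[pos] != "%":
            # Open code: copy up to the next macro trigger, expanding references.
            nxt = source.find("%", pos)
            nxt = len(source) if nxt == -1 else nxt
            out.append(expand_refs(source[pos:nxt], variables))
            pos = nxt
            continue
        m = LET.match(source, pos)
        if m:
            # %let registers a global variable for later use.
            variables[m.group(1).upper()] = expand_refs(m.group(2).strip(), variables)
            pos = m.end()
            continue
        m = MACRO.match(source, pos)
        if m:
            # Store the body unexpanded: &V must not be resolved at definition time.
            params = [p.strip().upper() for p in m.group(2).split(",") if p.strip()]
            macros[m.group(1).upper()] = (params, m.group(3))
            pos = m.end()
            continue
        m = CALL.match(source, pos)
        if m and m.group(1).upper() in macros:
            # Invocation: bind arguments in a local scope, then expand the body.
            params, body = macros[m.group(1).upper()]
            args = [a.strip() for a in m.group(2).split(",")]
            local = ChainMap(dict(zip(params, args)), variables)
            out.append(expand_refs(body, local))
            pos = m.end()
            continue
        out.append("%")  # Unrecognized trigger: pass it through unchanged.
        pos += 1
    return " ".join("".join(out).split())

example = """%let V = data one;
%macro foo(input);
&V &input;
a= b+c;
run;
%mend foo;
%foo(two)
"""
print(run_macro_pass(example))
# data one two; a= b+c; run;
```

In this sketch the "combination" is nesting rather than sequencing: the macro pass calls the variable expander at the moments it chooses (open code, %let values, invocation), and the plain SAS parser only ever sees the output.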