Hi,
I took some time to start developing a C parser with PetitParser. I used the slides from the Smalltalk school, which were pretty useful, and after a few hours I have a very basic token scanner. The idea of this work is to be able to feed the parser C headers and automagically get the C bindings generated in Smalltalk.
Now I have to actually start doing something with the scanned code. How should I do that? All in the same class? What if I want to generate bindings for different kinds of FFIs? Another question: how should I handle preprocessing? (Right now there isn't any preprocessing at all.)
You can see the code, download it and commit improvements to the repo at www.squeaksource.com/Bindings
Anybody who is interested is welcome to help with the development!
Regards, Javier.
-- Javier Pimás Ciudad de Buenos Aires |
By the way, the grammar is based on
http://www.quut.com/c/ANSI-C-grammar-l-1998.html
and
http://www.quut.com/c/ANSI-C-grammar-y.html
I'd be very grateful if you contributed to cover the full extent of it!
-- Javier Pimás Ciudad de Buenos Aires |
In reply to this post by melkyades
On Fri, Apr 8, 2011 at 12:37 AM, Javier Pimás <[hidden email]> wrote:
Hi Javier,
That's a great project and I'm highly interested in it. I'm not experienced in PetitParser so I won't be a lot of help, but I'll give it a try.
Maybe you already thought of this, but what if your parser built a simple C AST and you then wrote a specific visitor for each kind of FFI?
Cheers,
Richo
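The AST-plus-visitors idea can be sketched as follows. This is Python rather than Smalltalk, and everything in it is an assumption made for illustration: the node classes `Param` and `FunctionDecl`, and the two visitors, which emit made-up output formats rather than the syntax of any real FFI.

```python
from dataclasses import dataclass, field


@dataclass
class Param:
    ctype: str
    name: str


@dataclass
class FunctionDecl:
    """Hypothetical AST node for a C function declaration."""
    return_type: str
    name: str
    params: list = field(default_factory=list)

    def accept(self, visitor):
        # Double dispatch: the node picks the visitor method for its own kind.
        return visitor.visit_function_decl(self)


class SignatureStringVisitor:
    """Hypothetical FFI flavour A: one C-like signature string per function."""

    def visit_function_decl(self, node):
        args = ", ".join(f"{p.ctype} {p.name}" for p in node.params)
        return f"{node.return_type} {node.name}({args})"


class KeywordSelectorVisitor:
    """Hypothetical FFI flavour B: a Smalltalk-like keyword selector."""

    def visit_function_decl(self, node):
        if not node.params:
            return node.name
        first, *rest = node.params
        parts = [f"{node.name}: {first.name}"]
        parts += [f"{p.name}: {p.name}" for p in rest]
        return " ".join(parts)


# The same tree feeds both generators.
decl = FunctionDecl("int", "add", [Param("int", "a"), Param("int", "b")])
flavour_a = decl.accept(SignatureStringVisitor())
flavour_b = decl.accept(KeywordSelectorVisitor())
```

The point of the split is that the parser only has to know how to build nodes; supporting another FFI would mean adding one more visitor class and leaving the grammar alone.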
|
In reply to this post by melkyades
Hi Javier,
On 8 Apr 2011, at 05:37, Javier Pimás wrote:
Nice. I took a quick look. If you want to scale, please write tests (one small example per test). Grammars become complex and if you do not have close to 100% coverage, you will get stuck.
Cheers,
Doru
-- www.tudorgirba.com "Sometimes the best solution is not the best solution." |
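To make "one small example per test" concrete, here is a minimal sketch in Python rather than Smalltalk. The function `parse_declaration` is a hypothetical stand-in for a single grammar production (it is a regex, not a PetitParser rule); the interesting part is the shape of the tests, each of which feeds exactly one tiny input.

```python
import re

# Hypothetical stand-in for one production: "<type> <identifier> ;"
_DECLARATION = re.compile(r"\s*(int|char|long|void)\s+([A-Za-z_]\w*)\s*;\s*")


def parse_declaration(text):
    """Return (type, name) for a simple declaration, or None if it does not match."""
    match = _DECLARATION.fullmatch(text)
    return (match.group(1), match.group(2)) if match else None


# One small example per test, so a failure points at one construct.
def test_int_variable():
    assert parse_declaration("int x;") == ("int", "x")


def test_extra_whitespace_is_ignored():
    assert parse_declaration("  char   c ;") == ("char", "c")


def test_missing_semicolon_is_rejected():
    assert parse_declaration("int x") is None


def test_identifier_cannot_start_with_digit():
    assert parse_declaration("int 9x;") is None
```

With tests this small, a regression in one production shows up as one failing test with an obvious input, instead of a large header that no longer parses for unclear reasons.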
In reply to this post by melkyades
What is tricky about analyzing C is macros and preprocessor directives.
srcML does a satisfactory job. Still, having a solution based on PetitParser is indeed promising.
Cheers,
Alexandre
-- _,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;: Alexandre Bergel http://www.bergel.eu ^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;. |
In reply to this post by Ricardo Moran
On Fri, Apr 8, 2011 at 2:05 AM, Ricardo Moran <[hidden email]> wrote:
Excellent!
I didn't think of it because I've never written a parser before and have no experience with PP. Is it the clean way to do it? How is that implemented? Maybe have a class for each expression in the BNF that represents it, and make PP create the instances?
-- Javier Pimás Ciudad de Buenos Aires |
In reply to this post by Tudor Girba
On Fri, Apr 8, 2011 at 3:41 AM, Tudor Girba <[hidden email]> wrote:
OK, I'll try. One question regarding PP: I started coding by looking at PPArithmeticParser, and noticed that for each sub-grammar there is a method and an instance variable with the same name. Is that a requirement? If I don't add the instance variables, these sub-grammars are not shown in the PPBrowser.
Regards, Javier.
-- Javier Pimás Ciudad de Buenos Aires |
In reply to this post by abergel
On Fri, Apr 8, 2011 at 1:08 PM, Alexandre Bergel <[hidden email]> wrote:
> What is tricky about analyzing C is macros and preprocessor directives.
Yes, and I don't know what to do about it; I'd like to know whether you or someone else has an idea of how to solve it. I'll think about an approach, but if you have an idea to share, that would be better, so I can go directly to a good solution. I'm thinking that to feed the parser I'll have to preprocess the string with another parser, doing a string-to-string conversion driven by a set of defines and include paths so that #includes and #ifdefs work. But I'll probably need a cache, because the generated string will be huge.
What do you think?
Regards, Javier.
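A minimal sketch of that string-to-string pass, in Python rather than Smalltalk and under loud assumptions: headers live in an in-memory dictionary `files` instead of on disk behind include paths, macros are treated as plain on/off flags (no values, no expansion), and only `#define`, `#ifdef`, `#ifndef`, `#else`, `#endif` and `#include` are recognised. The function name `preprocess` and the cache layout are invented for the example.

```python
def preprocess(name, files, defines=(), cache=None):
    """Return (text, defined_names) for header `name`.

    files   : dict mapping header names to their source text (assumption:
              stands in for real include-path lookup)
    defines : names considered defined on entry
    cache   : dict reused across calls, keyed by (name, defines on entry)
    """
    cache = {} if cache is None else cache
    key = (name, frozenset(defines))
    if key in cache:
        return cache[key]

    defined = set(defines)
    out = []
    active = [True]  # stack: are we currently emitting lines?
    for line in files[name].splitlines():
        words = line.split()
        directive = words[0] if words and words[0].startswith("#") else None
        if directive == "#ifdef":
            active.append(active[-1] and words[1] in defined)
        elif directive == "#ifndef":
            active.append(active[-1] and words[1] not in defined)
        elif directive == "#else":
            taken = active.pop()
            active.append(active[-1] and not taken)
        elif directive == "#endif":
            active.pop()
        elif not active[-1]:
            continue  # inside a branch that is switched off
        elif directive == "#define":
            defined.add(words[1])
        elif directive == "#include":
            text, after = preprocess(words[1].strip('"<>'), files, defined, cache)
            if text:
                out.append(text)
            defined |= after
        else:
            out.append(line)

    result = ("\n".join(out), frozenset(defined))
    cache[key] = result
    return result


headers = {
    "a.h": "#ifndef A_H\n#define A_H\nint a(void);\n#endif",
    "main.h": '#include "a.h"\n#include "a.h"\n'
              "#ifdef USE_B\nint b(void);\n#else\nint c(void);\n#endif",
}
shared_cache = {}
plain_text, _ = preprocess("main.h", headers, cache=shared_cache)
with_b_text, _ = preprocess("main.h", headers, {"USE_B"}, cache=shared_cache)
```

Keying the cache on the header name plus the set of names defined on entry matters: the same header can legitimately expand to different text depending on what was defined before it was included, as the include guard in `a.h` shows.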
-- Javier Pimás Ciudad de Buenos Aires |
A good source of ideas: http://www.sdml.info/projects/srcml/
Alexandre
-- _,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;: Alexandre Bergel http://www.bergel.eu ^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;. |
In reply to this post by melkyades
On Fri, Apr 8, 2011 at 1:44 PM, Javier Pimás <[hidden email]> wrote:
Well, I have no experience with parsers, either. I just played with PP a little (I learned a lot by looking at PetitSmalltalk), and I'm currently using the visitor pattern to translate Smalltalk to NXC. You know what they say: "if you have a hammer, then everything looks like a nail", so I'm probably biased towards this pattern :)
I don't know if this is the clean way to do it, but I think it's clean enough for me :) I would implement it just as you said: a class for each expression, and make PP create the instances.
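A small sketch of "a class for each expression, and make the parser create the instances", in Python rather than Smalltalk. The helper `production` is hypothetical: it pairs a regex (standing in for a grammar rule) with an action that builds the node, which is roughly the role a semantic action plays in a parser framework. `Identifier` and `IntLiteral` are made-up node classes.

```python
import re
from dataclasses import dataclass


@dataclass
class Identifier:
    name: str


@dataclass
class IntLiteral:
    value: int


def production(pattern, action):
    """Pair a rule (here just a regex) with an action that builds an AST node."""
    compiled = re.compile(pattern)

    def parse(text):
        match = compiled.fullmatch(text)
        return action(match) if match else None

    return parse


# Each rule knows which node class to instantiate for its match.
identifier = production(r"[A-Za-z_]\w*", lambda m: Identifier(m.group()))
int_literal = production(r"\d+", lambda m: IntLiteral(int(m.group())))


def primary_expression(text):
    """Ordered choice: return the node from the first alternative that matches."""
    for alternative in (identifier, int_literal):
        node = alternative(text)
        if node is not None:
            return node
    return None
```

The grammar rules stay free of binding-generation logic: they only say what was recognised, and everything downstream works on the resulting node objects.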
|
In reply to this post by melkyades
On Fri, Apr 8, 2011 at 1:49 PM, Javier Pimás <[hidden email]> wrote:
Yes, I think it is a requirement. Otherwise, the parsers can't be initialized. Look at: PPCompositeParser>>#initializeStartingAt:
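A loose Python analogy of why the matching names matter; it is not PetitParser code. The sketch assumes that initialization works in two steps: first a placeholder is stored under every declared production name, then each placeholder is filled in by calling the method with the same name. That is what lets productions refer to each other, including recursively. `Delegate`, `Composite` and `Sum` are made-up names.

```python
class Delegate:
    """Placeholder that forwards to a parse function installed later."""

    def __init__(self, name):
        self.name = name
        self.target = None

    def __call__(self, text, pos):
        return self.target(text, pos)


class Composite:
    productions = ()  # declared names; the first one is the start production

    def __init__(self):
        # Step 1: one placeholder per declared name, stored on the instance.
        for name in self.productions:
            setattr(self, name, Delegate(name))
        # Step 2: build each production by calling the same-named method
        # (looked up on the class, since the instance attribute shadows it).
        for name in self.productions:
            getattr(self, name).target = getattr(type(self), name)(self)

    def parse(self, text):
        result = getattr(self, self.productions[0])(text, 0)
        if result is None or result[1] != len(text):
            raise ValueError("parse error")
        return result[0]


class Sum(Composite):
    productions = ("expr", "number")

    def number(self):
        def parse(text, pos):
            end = pos
            while end < len(text) and text[end].isdigit():
                end += 1
            return (int(text[pos:end]), end) if end > pos else None
        return parse

    def expr(self):
        def parse(text, pos):
            left = self.number(text, pos)  # self.number is the placeholder here
            if left is None:
                return None
            value, pos = left
            if pos < len(text) and text[pos] == "+":
                right = self.expr(text, pos + 1)  # recursion via the placeholder
                if right is None:
                    return None
                return value + right[0], right[1]
            return value, pos
        return parse


total = Sum().parse("1+2+39")
```

In this analogy, a production that is not declared never gets a placeholder, so other productions have nothing to point at while the grammar is being wired up.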
|
In reply to this post by melkyades
On Fri, Apr 8, 2011 at 1:57 PM, Javier Pimás <[hidden email]> wrote:
Can't macros and preprocessor directives be represented as just more nodes in the AST?
|
Maybe. Tomorrow I'll see if I have some free time to continue and work out how to handle it.
Regards, Javier.
On Fri, Apr 8, 2011 at 4:34 PM, Ricardo Moran <[hidden email]> wrote:
-- Javier Pimás Ciudad de Buenos Aires |
In reply to this post by Ricardo Moran
> Can't macros and preprocessor directives be represented as just more nodes in the AST?
In some cases yes, in others no. The preprocessor allows you to completely change the syntax of C, which makes it very hard to parse. Consider the following valid C program:

    #define LEFTPARENT (
    #define RIGHTPARENT )
    int main LEFTPARENT void RIGHTPARENT { return 0; }

Cheers,
Alexandre
-- _,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;: Alexandre Bergel http://www.bergel.eu ^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;. |
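To see why this program defeats a grammar that runs on the raw text, here is a small Python sketch that expands object-like macros at the token level. It is an illustration only: the tokenizer is a crude regex, `expand_object_macros` is a made-up name, and there is no support for function-like macros, rescanning of expansions, or any other directive.

```python
import re

# Crude tokenizer: identifiers, integers, or any single non-space character.
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def expand_object_macros(source):
    """Return the token list left after applying simple #define substitutions."""
    macros = {}
    tokens = []
    for line in source.splitlines():
        words = TOKEN.findall(line)
        if words[:2] == ["#", "define"]:
            # Everything after the macro name is its replacement token list.
            macros[words[2]] = words[3:]
        else:
            for tok in words:
                tokens.extend(macros.get(tok, [tok]))
    return tokens


program = (
    "#define LEFTPARENT (\n"
    "#define RIGHTPARENT )\n"
    "int main LEFTPARENT void RIGHTPARENT { return 0; }\n"
)
expanded = " ".join(expand_object_macros(program))
```

Before expansion, `int main LEFTPARENT void RIGHTPARENT` matches no C production at all; only the expanded token stream is a function definition. A parser that wants to keep macros as AST nodes has to cope with macros whose replacement is an arbitrary fragment such as a lone parenthesis.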
On Fri, Apr 8, 2011 at 6:17 PM, Alexandre Bergel <[hidden email]> wrote:
Ok, I see the problem now...
|
In reply to this post by abergel
In that case there is also CIL, written in OCaml and used by Frama-C.
But I do not think it is necessary to read every resource on earth. Javier, contact Nick, because he started (I guess) to implement a parser for C using PP.
Stef
|
Good! Then we can do this in half the time with half the effort each (if it's not already done).
Regards, Javier.
2011/4/9 Stéphane Ducasse <[hidden email]>
> In that case there is also CIL, written in OCaml and used by Frama-C.
-- Javier Pimás Ciudad de Buenos Aires |