Hello all,
I am getting dangerously close to rolling my own C parser. The reality is that all it needs to do is handle some structs (some of which are nested) that I am copy/pasting from a PDF document containing some communications specs.

Looking at it that way, it seems reasonable to write it and forget all about it when the project is finished. However, would you approach it some other way? I thought of IDL->MIDL->analyzer->structs, but external structures are not the desired end point, so it probably would not help all that much.

Suggestions will be appreciated.

Have a good one,

Bill

--
Wilhelm K. Schwab, Ph.D.
[hidden email]
"Bill Schwab" <[hidden email]> wrote
> I am getting dangerously close to rolling my own C parser. The reality is
> that all it needs to do is handle some structs (some of which are nested)
> that I am copy/pasting from a PDF document containing some communications
> specs.
>
> Looking at it that way, it seems reasonable to write it and forget all
> about it when the project is finished. However, would you approach it
> some other way? I thought of IDL->MIDL->analyzer->structs, but external
> structures are not the desired end point, so it probably would not help
> all that much.

As a Smalltalker you already know that Smalltalk objects are the easiest way to represent complex data structures, including nested, extensible ones. In some cases they can be too complex to be of value to anyone but graph purists. So why can't you do it in Smalltalk?

Of course, that may be a trick question: my recent attempt to use MapiPac from http://www.smalltalking.net/Goodies/Dolphin/index.htm showed that it can work, but not reliably, so maybe there is something wrong with the COM functions that also give you access to PDF documents.

Kirk
True Christian wrote:
> "Bill Schwab" <[hidden email]> wrote
>>I am getting dangerously close to rolling my own C parser. The reality is
>>that all it needs to do is handle some structs (some of which are nested)
>>that I am copy/pasting from a PDF document containing some communications
>>specs.

[SNIP]

> Of course that may be a trick question since recently attempted use of
> MapiPac on http://www.smalltalking.net/Goodies/Dolphin/index.htm showed it
> can work but not reliably so maybe there's something wrong with the com
> functions that also gives you access to PDF documents.

As I understand it, Bill wasn't talking about parsing a PDF document.

But, by the way, what's the problem with that goodie?

You can report it to info @ smalltalking.net, and I can contact the person in charge of the maintenance of Dolphin Goodies.

Best Regards.

--
Esteban A. Maringolo
In reply to this post by Schwab,Wilhelm K
Bill
You wrote in message news:crn6le$1216$[hidden email]...
>
> I am getting dangerously close to rolling my own C parser. The reality is
> that all it needs to do is handle some structs (some of which are nested)
> that I am copy/pasting from a PDF document containing some communications
> specs.
>
> Looking at it that way, it seems reasonable to write it and forget all
> about it when the project is finished. However, would you approach it
> some other way?

As you probably know, I would always go the MIDL route. Most C header files can be #included into an IDL file and compiled up without any modification. Quite often it pays to make some small changes to get better results (for example, you might want to add typedefs for the structs), but this is a mechanical process. Usually you'll be browsing the structs in Dolphin in a matter of minutes, and the offsets and sizes will be correct if you have used the correct MIDL structure alignment flags.

>...I thought of IDL->MIDL->analyzer->structs, but external structures are
>not the desired end point, so it probably would not help all that much.

So what do you want to do with the structures? The ExternalStructure definitions are just one interpretation of the type information, but the analyser is more general than that and can pull out whatever information is available about the types. For example, see the IDL reverse engineering code starting from AXTypeLibraryAnalyser>>printIDL.

>
> Suggestions will be appreciated.

Another suggestion is SmaCC (http://www.refactory.com/Software/SmaCC/). It comes with an example C parser.

Regards

Blair
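To illustrate why the alignment setting matters for offsets and sizes, here is a minimal Python sketch using the standard `struct` module. The record layout (a one-byte tag followed by a four-byte unsigned integer) is hypothetical and is not taken from the specs discussed in this thread; it only shows that a byte-aligned layout and a naturally aligned one can disagree about size.

```python
import struct

# Hypothetical record: a 1-byte tag followed by a 4-byte unsigned integer.
PACKED = "<BI"   # '<' = little-endian, standard sizes, no padding (byte aligned)
NATIVE = "@BI"   # '@' = native sizes and native alignment (padding may be added)

packed_size = struct.calcsize(PACKED)   # exactly 1 + 4 bytes
native_size = struct.calcsize(NATIVE)   # platform dependent, never smaller

# Decode a byte-aligned record as it might arrive from a device.
raw = bytes([0x07, 0x01, 0x00, 0x00, 0x00])
tag, value = struct.unpack(PACKED, raw)

print("packed size:", packed_size)
print("native size:", native_size)
print("tag:", tag, "value:", value)
```

If the two sizes differ on your machine, any tool that assumes natural alignment would compute the wrong offset for the second field of a byte-aligned wire format, which is the situation the alignment flags are meant to avoid.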
Blair,
> As you probably know, I would always go the MIDL route. Most C header files
> can be #included into an IDL file and compiled up without any modification.
> Quite often it pays to make some small changes to get better results (for
> example you might want to add typedefs for the structs), but this is a
> mechanical process. Usually you'll be browsing the structs in Dolphin in a
> matter of minutes, and the offsets and sizes will be correct if you have
> used the correct MIDL structure alignment flags.

In this case, the structures are byte aligned. Do you have a preference for forcing that?

>>...I thought of IDL->MIDL->analyzer->structs, but external structures are
>>not the desired end point, so it probably would not help all that much.
>
> So what do you want to do with the structures?

I have a framework that handles raw and formatted I/O with the usual kinds of twists that medical devices have thrown at me over time. Most protocols are more challenging than this one _appears_ to be (but it's a little early to claim victory<g>), so the framework does all kinds of crazy stuff, with some fields that do fixed-size binary I/O as a welcome diversion.

One way or another, I want to use the structure definitions to generate a composite of fields from the framework. It would probably be possible to add a field type that wraps an external structure. But it might be easier to use the structures as a source of type information, in part because of a customized inspector that allows me to look at any part of one of the composite fields.

> The ExternalStructure
> definitions are just an interpretation of the type information, but the
> analyser is more general than that and can pull out whatever information
> that is available about the types, for example see the IDL reverse
> engineering code starting from AXTypeLibraryAnalyser>>printIDL.

That raises another option I considered: get my type information from the same source(s) you use to generate the external structures.

> Another suggestion is SmaCC (http://www.refactory.com/Software/SmaCC/). It
> comes with an example C parser.

Thanks for the reminder (now downloaded for future reference). One unfortunate item: "the C and Java parsers do not produce anything useful". However, I might be able to hack into it, as I would need only structures and fields.

So far anyway, my duct tape and bungee cord parser seems to be working. I very quickly arrived at the real problem of translating the field types and names into code that does something. In fact, it was almost too easy to get that far :( It's nice to have some backup plans.

Have a good one,

Bill

--
Wilhelm K. Schwab, Ph.D.
[hidden email]
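For anyone curious what a "structs only" parser of this kind might look like, here is a rough Python sketch. It is not Bill's parser and not SmaCC's; the struct names in `SPEC` are invented for illustration. It assumes well-formed input of the form `struct Name { ... };` with plain fields, fixed-size arrays, and inline nested structs, and it deliberately ignores typedefs, pointers, unions, bit fields, and the preprocessor.

```python
import re

# Identifiers, integer literals, and the few punctuation marks we care about.
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|[{}\[\];]")

def tokenize(text):
    # Strip /* ... */ and // comments, then split into tokens.
    text = re.sub(r"/\*.*?\*/|//[^\n]*", " ", text, flags=re.S)
    return TOKEN.findall(text)

def parse_structs(text):
    """Return {struct_name: [(type, field_name, array_len_or_None), ...]}.

    For an inline nested struct, 'type' is itself a list of fields.
    """
    tokens = tokenize(text)
    pos = 0
    structs = {}

    def expect(tok):
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] != tok:
            raise SyntaxError(f"expected {tok!r} at token {pos}")
        pos += 1

    def parse_body():
        # Parse '{ field; field; ... }' and return the list of fields.
        nonlocal pos
        expect("{")
        fields = []
        while tokens[pos] != "}":
            if tokens[pos:pos + 2] == ["struct", "{"]:
                pos += 1                      # skip 'struct'
                ftype = parse_body()          # recurse into the inline struct
                name = tokens[pos]
                pos += 1
            else:
                words = []
                while tokens[pos] not in (";", "["):
                    words.append(tokens[pos])
                    pos += 1
                # The last word is the field name, the rest is the type.
                ftype, name = " ".join(words[:-1]), words[-1]
            length = None
            if tokens[pos] == "[":            # fixed-size array, e.g. buf[16]
                length = int(tokens[pos + 1])
                pos += 2
                expect("]")
            expect(";")
            fields.append((ftype, name, length))
        expect("}")
        return fields

    while pos < len(tokens):
        expect("struct")
        name = tokens[pos]
        pos += 1
        structs[name] = parse_body()
        expect(";")
    return structs

# Hypothetical definitions, standing in for text pasted from a spec.
SPEC = """
struct Header {
    unsigned char type;      /* message type */
    unsigned short length;
};
struct Packet {
    struct Header hdr;
    struct { unsigned char lo; unsigned char hi; } crc;
    char payload[16];
};
"""

structs = parse_structs(SPEC)
for struct_name, fields in structs.items():
    print(struct_name, fields)
```

The result is just a dictionary of field lists, so the harder step Bill mentions, turning field types and names into framework fields, would be a separate walk over this data rather than part of the parser.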
In reply to this post by Esteban A. Maringolo-2
Esteban,
> As I understand Bill wasn't talking about parsing a PDF document.

Well, I might be :) Given the ability to enumerate tables and to "back up" from each of them, it might be possible to have a machine pull out the relevant pieces.

Have a good one,

Bill

--
Wilhelm K. Schwab, Ph.D.
[hidden email]
In reply to this post by Esteban A. Maringolo-2
"Esteban A. Maringolo" <[hidden email]> wrote
>> MapiPac on http://www.smalltalking.net/Goodies/Dolphin/index.htm showed
>> it can work but not reliably so maybe there's something wrong with the
>> com
> But, by the way, what's the problem with that goodie?
>
> You can report it to info @ smalltalking.net, I can contact the person in
> charge of the maintenance of Dolphin Goodies.

OK, I loaded MapiPac into my D.5 Pro intending to send a political request to all my state legislators, but I tested it on an email to myself first. The first time it did not work; the second time it worked with no changes; the third time it failed again, still with no changes.

So I decided this wasn't going to serve my needs and wrote some routines to solve my problem using the mailTo: function of HTML. They generate a page of links, one per legislator, each requiring a click to generate the whole email and a click to send it. Had MapiPac worked reliably each time, I would have sent them all with one Evaluate it. I do not know how to tell what went wrong.