Stop me before I parse again

Schwab,Wilhelm K
Hello all,

I am getting dangerously close to rolling my own C parser.  The reality
is that all it needs to do is handle some structs (some of which are
nested) that I am copy/pasting from a PDF document containing some
communications specs.
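
For a sense of scale, a throwaway parser of this kind can be quite small.
The following is a minimal sketch in Python, not the code from the project;
the function name `parse_struct` and the tuple layout are made up for
illustration. It assumes plain `type name;` fields, optional `[N]` arrays and
nested `struct { ... } name;` members, and it ignores comments, pointers,
bitfields and the preprocessor entirely.

```python
import re

# Identifiers, integer literals, and the few punctuation marks we care about.
# Anything else (whitespace, '*', ',') is silently skipped by findall.
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|[{}\[\];]")


def parse_struct(text):
    """Parse one `struct Name { ... };` into (name, fields).

    Each field is a tuple (name, type, count). For a nested struct, `type`
    is itself a list of fields; otherwise it is the C type as a string.
    """
    tokens = TOKEN.findall(text)
    pos = 0

    def take():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def expect(tok):
        got = take()
        if got != tok:
            raise SyntaxError(f"expected {tok!r}, got {got!r}")

    def parse_body():
        expect("{")
        fields = []
        while tokens[pos] != "}":
            fields.append(parse_field())
        expect("}")
        return fields

    def parse_field():
        words, nested = [], None
        while tokens[pos] not in ("[", ";"):
            if tokens[pos] == "{":
                nested = parse_body()   # recurse into the nested struct
            else:
                words.append(take())
        name = words.pop()              # last identifier is the field name
        count = 1
        if tokens[pos] == "[":
            take()
            count = int(take())
            expect("]")
        expect(";")
        ftype = nested if nested is not None else " ".join(words)
        return (name, ftype, count)

    expect("struct")
    name = take()
    return name, parse_body()


if __name__ == "__main__":
    sample = """
    struct Packet {
        unsigned char id;
        struct { unsigned short x; unsigned short y; } pos;
        char label[8];
    };
    """
    print(parse_struct(sample))
```

Malformed input simply raises (`SyntaxError` or `IndexError`), which is
probably acceptable for text pasted by hand from a spec.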

Looking at it that way, it seems reasonable to write it and forget all
about it when the project is finished.  However, would you approach it
some other way?  I thought of IDL->MIDL->analyzer->structs, but external
structures are not the desired end point, so it probably would not help
all that much.

Suggestions will be appreciated.

Have a good one,

Bill


--
Wilhelm K. Schwab, Ph.D.
[hidden email]


Re: Stop me before I parse again

True Christian
"Bill Schwab" <[hidden email]> wrote

> I am getting dangerously close to rolling my own C parser.  The reality is
> that all it needs to do is handle some structs (some of which are nested)
> that I am copy/pasting from a PDF document containing some communications
> specs.
>
> Looking at it that way, it seems reasonable to write it and forget all
> about it when the project is finished.  However, would you approach it
> some other way?  I thought of IDL->MIDL->analyzer->structs, but external
> structures are not the desired end point, so it probably would not help
> all that much.

As a Smalltalker you already know that Smalltalk objects are the easiest way
to represent complex data structures, including nested, extensible ones.  In
some cases they can be too complex to be of value to anyone but graph
purists.  So why can't you do it in Smalltalk?

Of course, that may be a trick question: my recent attempt to use MapiPac
(from http://www.smalltalking.net/Goodies/Dolphin/index.htm) showed that it
can work, but not reliably, so maybe there is something wrong with the COM
functions that also give you access to PDF documents.

Kirk


Re: Stop me before I parse again

Esteban A. Maringolo-2
True Christian wrote:
> "Bill Schwab" <[hidden email]> wrote
>>I am getting dangerously close to rolling my own C parser.  The reality is
>>that all it needs to do is handle some structs (some of which are nested)
>>that I am copy/pasting from a PDF document containing some communications
>>specs.

[SNIP]

> Of course that may be a trick question since recently attempted use of
> MapiPac on http://www.smalltalking.net/Goodies/Dolphin/index.htm showed it
> can work but not reliably so maybe there's something wrong with the com
> functions that also gives you access to PDF documents.


As I understand it, Bill wasn't talking about parsing a PDF document.

But, by the way, what's the problem with that goodie?

You can report it to info @ smalltalking.net, or I can contact the
person in charge of the maintenance of the Dolphin Goodies.

Best Regards.

--
Esteban A. Maringolo


Re: Stop me before I parse again

Blair McGlashan-3
In reply to this post by Schwab,Wilhelm K
Bill

You wrote in message news:crn6le$1216$[hidden email]...
>
> I am getting dangerously close to rolling my own C parser.  The reality is
> that all it needs to do is handle some structs (some of which are nested)
> that I am copy/pasting from a PDF document containing some communications
> specs.
>
> Looking at it that way, it seems reasonable to write it and forget all
> about it when the project is finished.  However, would you approach it
> some other way?

As you probably know, I would always go the MIDL route. Most C header files
can be #included into an IDL file and compiled up without any modification.
Quite often it pays to make some small changes to get better results (for
example you might want to add typedefs for the structs), but this is a
mechanical process. Usually you'll be browsing the structs in Dolphin in a
matter of minutes, and the offsets and sizes will be correct if you have
used the correct MIDL structure alignment flags.

>...I thought of IDL->MIDL->analyzer->structs, but external structures are
>not the desired end point, so it probably would not help all that much.

So what do you want to do with the structures? The ExternalStructure
definitions are just one interpretation of the type information, but the
analyser is more general than that and can pull out whatever information
is available about the types. For example, see the IDL reverse
engineering code starting from AXTypeLibraryAnalyser>>printIDL.

>
> Suggestions will be appreciated.

Another suggestion is SmaCC (http://www.refactory.com/Software/SmaCC/). It
comes with an example C parser.

Regards

Blair


Re: Stop me before I parse again

Schwab,Wilhelm K
Blair,

> As you probably know, I would always go the MIDL route. Most C header files
> can be #included into an IDL file and compiled up without any modification.
> Quite often it pays to make some small changes to get better results (for
> example you might want to add typedefs for the structs), but this is a
> mechanical process. Usually you'll be browsing the structs in Dolphin in a
> matter of minutes, and the offsets and sizes will be correct if you have
> used the correct MIDL structure alignment flags.

In this case, the structures are byte aligned.  Do you have a preference
for forcing that?
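
For reference, byte alignment means every field starts immediately after
the previous one, with no padding. In C this is commonly requested with
`#pragma pack(1)`, and MIDL has a `/Zp` packing switch; the exact spelling
for a given toolchain should be checked against its documentation. The
sketch below (Python, with a hypothetical helper `packed_layout`) only
illustrates what the resulting offsets look like, using the `struct`
module's standard sizes, which are an assumption about the device's types.

```python
import struct


def packed_layout(fields):
    """Return ({name: (offset, size)}, total_size) for a byte-aligned layout.

    `fields` is a list of (name, format_code) pairs, where format_code is a
    `struct` module code such as "B" (1 byte), "H" (2 bytes) or "I" (4 bytes).
    The "<" prefix selects standard sizes with no alignment padding.
    """
    offsets = {}
    position = 0
    for name, code in fields:
        size = struct.calcsize("<" + code)
        offsets[name] = (position, size)
        position += size        # next field starts right after this one
    return offsets, position


if __name__ == "__main__":
    layout, total = packed_layout([("flag", "B"), ("value", "I"), ("crc", "H")])
    print(layout, total)
    # With native alignment the same fields usually occupy more space,
    # because `value` is padded out to a 4-byte boundary.
    print(struct.calcsize("@BIH"))
```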


>>...I thought of IDL->MIDL->analyzer->structs, but external structures are
>>not the desired end point, so it probably would not help all that much.
>
> So what do you want to do with the structures?

I have a framework that handles raw and formatted I/O with the usual
kinds of twists that medical devices have thrown at me over time.  Most
protocols are more challenging than this one _appears_ to be (but it's a
little early to claim victory<g>), so the framework does all kinds of
crazy stuff, with some fields that do fixed-size binary I/O as a welcome
diversion.

One way or another, I want to use the structure definitions to generate
a composite of fields from the framework.  It would probably be possible
to add a field type that wraps an external structure.  But it might be
easier to use the structures as a source of type information, in part
because of a customized inspector that allows me to look at any part of
one of the composite fields.


> The ExternalStructure
> definitions are just an interpretation of the type information, but the
> analyser is more general than that and can pull out whatever information
> that is available about the types, for example see the IDL reverse
> engineering code starting from AXTypeLibraryAnalyser>>printIDL.

That raises another option I considered: get my type information from
the same source(s) you use to generate the external structures.


> Another suggestion is SmaCC (http://www.refactory.com/Software/SmaCC/). It
> comes with an example C parser.

Thanks for the reminder (now downloaded for future reference).  One
unfortunate item: "the C and Java parsers do not produce anything
useful".  However, I might be able to hack into it as I would need only
structures and fields.

So far anyway, my duct tape and bungee cord parser seems to be working.
I very quickly arrived at the real problem of translating the field
types and names into code that does something.  In fact, it was almost
too easy to get that far :(  It's nice to have some backup plans.
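
One way to picture the "translate the field types and names into code
that does something" step is a table from C type names to binary codecs.
The sketch below is in Python rather than Smalltalk and is not the
framework described above: `C_TO_FORMAT`, `flatten` and `decode` are
hypothetical names, the type sizes and the little-endian byte order are
assumptions, and fields are taken to be (name, type, count) tuples in
which a nested struct's type is a list of fields.

```python
import struct

# Assumed sizes: 1-byte char, 2-byte short, 4-byte long (standard "<" sizes).
C_TO_FORMAT = {
    "char": "b",
    "unsigned char": "B",
    "short": "h",
    "unsigned short": "H",
    "long": "l",
    "unsigned long": "L",
    "float": "f",
    "double": "d",
}


def flatten(fields, prefix=""):
    """Yield (dotted_name, format_code) for each scalar, walking nested structs."""
    for name, ftype, count in fields:
        for index in range(count):
            label = prefix + name + (f"[{index}]" if count > 1 else "")
            if isinstance(ftype, list):          # nested struct
                yield from flatten(ftype, label + ".")
            else:
                yield label, C_TO_FORMAT[ftype]


def decode(fields, data, byte_order="<"):
    """Unpack `data` into {dotted_name: value}, assuming a byte-aligned layout."""
    pairs = list(flatten(fields))
    fmt = byte_order + "".join(code for _, code in pairs)
    values = struct.unpack(fmt, data)
    return dict(zip((label for label, _ in pairs), values))


if __name__ == "__main__":
    packet = [
        ("id", "unsigned char", 1),
        ("pos", [("x", "unsigned short", 1), ("y", "unsigned short", 1)], 1),
        ("tag", "unsigned char", 2),
    ]
    print(decode(packet, bytes([7, 0x34, 0x12, 0x78, 0x56, 1, 2])))
```

An unknown type name raises `KeyError`, which is a cheap way to find out
which entries of the spec still need a mapping.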

Have a good one,

Bill

--
Wilhelm K. Schwab, Ph.D.
[hidden email]


Re: Stop me before I parse again

Schwab,Wilhelm K
In reply to this post by Esteban A. Maringolo-2
Esteban,

> As I understand Bill wasn't talking about parsing a PDF document.

Well, I might be :)  Given the ability to enumerate tables and to "back
up" from each of them, it might be possible to have a machine pull out
the relevant pieces.

Have a good one,

Bill


--
Wilhelm K. Schwab, Ph.D.
[hidden email]


Re: Stop me before I parse again

True Christian
In reply to this post by Esteban A. Maringolo-2
"Esteban A. Maringolo" <[hidden email]> wrote
>> MapiPac on http://www.smalltalking.net/Goodies/Dolphin/index.htm showed
>> it can work but not reliably so maybe there's something wrong with the
>> com

> But, by the way, what's the problem with that goodie?
>
> You can report it to info @ smalltalking.net, I can contact the person in
> charge of the maintenance of Dolphin Goodies.

Ok, I loaded MapiPac into my D.5 Pro, intending to send a political
request to all my state legislators, but tried testing it on an email to
myself first.  The first time it did not work; the second time it worked
with no changes; the third time it failed again with no changes.  So I
decided this wasn't going to serve my needs and wrote some routines to solve
my problem using the mailTo: function of HTML to generate a page of links,
one per legislator, each requiring one click to generate the whole email
and another click to send it.  Had MapiPac worked reliably each time, I
would have sent them all with one Evaluate It.  I do not know how to tell
what went wrong.