Lagoon logs

Bill Schwab-2
Hi Blair,

Here's one for the "customers are never satisfied" column :)   For various
reasons, I wanted to parse a Lagoon stripping log and display a method
browser on the methods that were removed.

My first thought was to use Beck's parsing stream pattern, but I noticed the
following:
"<h2>Removing methods that are not required or which must be
stripped</h2>Removing Aspect class>>menu:<BR>
Removing Aspect class>>menuBar:<BR>
Removing AvatarChat class>>uninitialize<BR>
Removing AXComponentWizard class>>initialize<BR>"

The above will likely get butchered by the line wrapping; the important part
is

  "be stripped</h2>Removing Aspect class>>menu:<BR>"

In particular, there is no cr/lf following the heading, which defeats the
over-simplified (but convenient<g>) trick of reading lines and then parsing
their contents assuming that each line carries one datum.
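
One way around the missing line break is to treat <BR> as the record separator instead of cr/lf. The following is only a rough sketch in Python (not Dolphin Smalltalk), and the function name removal_entries is made up for illustration; it assumes every entry of interest starts with "Removing " and ends with <BR>, as in the excerpt above.

```python
import re

def removal_entries(log_text):
    """Return one 'Class>>selector' string per <BR>-terminated entry."""
    # Drop <h2> headings first, so heading text that shares a physical
    # line with the first entry cannot be mistaken for part of it.
    body = re.sub(r"<h2>.*?</h2>", "", log_text,
                  flags=re.DOTALL | re.IGNORECASE)
    entries = []
    # Split on the <BR> tag rather than on line ends.
    for chunk in re.split(r"<BR>", body, flags=re.IGNORECASE):
        chunk = chunk.strip()
        if chunk.startswith("Removing "):
            entries.append(chunk[len("Removing "):])
    return entries
```

Splitting on the tag rather than on line ends means the wrapped heading no longer matters, and each entry still carries one datum.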

Seeking a quick solution, I decided to try the XMLDOM package.  It blew up
on the following:

<h2>Pre-strip Image statistics:</h2>
<TABLE>
<TR><TD><B>Objects</TD><TD>201396</TD></TR>
<TR><TD><B>Classes</B></TD><TD>2376</TD></TR>
<TR><TD><B>Methods</B></TD><TD>38229</TD></TR>
<TR><TD><B>Symbols</B></TD><TD>24086
</TD></TR>
</TABLE>

Note that the "Objects" line is missing a </B>.  Adding that helped, but it
blew up later on the following:

   "Removing Integer>><<<BR>"

Binary selectors such as #<<, #>>, etc. are written to the log unescaped, so
they are going to cause problems for an XML parser.

I think I understand this next item, but I'll mention it in case I'm wrong.
A log included

   'Removing COAUTHIDENTITY>>DomainLength:<BR>'

but #DomainLength: is not defined in my image.  I'm assuming that the method
gets created when the structure definitions are compiled during the strip,
and then is removed later in stripping.  My parsing complained about it, but
an ifAbsent: silenced it.

At this point, it looks like a parsing stream is my best option; it will
miss a few things but should be helpful.  Long term, would it make sense for
the stripping log to be written in XML?

Have a good one,

Bill

--
Wilhelm K. Schwab, Ph.D.
[hidden email]


Re: Lagoon logs

John Brant
"Bill Schwab" <[hidden email]> wrote in message
news:ajg59u$1b2rre$[hidden email]...
>
> Here's one for the "customers are never satisfied" column :)   For various
> reasons, I wanted to parse a Lagoon stripping log and display a method
> browser on the methods that were removed.
>
> My first thought was to use Beck's parsing stream pattern, but I noticed the
> following:
> "<h2>Removing methods that are not required or which must be
> stripped</h2>Removing Aspect class>>menu:<BR>
> Removing Aspect class>>menuBar:<BR>
> Removing AvatarChat class>>uninitialize<BR>
> Removing AXComponentWizard class>>initialize<BR>"
>
> The above will likely get butchered by the line wrapping; the important part
> is
>
>   "be stripped</h2>Removing Aspect class>>menu:<BR>"
>
> In particular, there is no cr/lf following the heading, which defeats the
> over-simplified (but convenient<g>) trick of reading lines and then parsing
> their contents assuming that each line carries one datum.

You can generate the interface to the Microsoft VBScript Regular Expressions
package and use them to parse the file:

     | methods stream contents regex matches submatch class methodName |
     methods := OrderedCollection new.
     stream := FileStream read: 'myfile.htm'.
     [contents := stream contents] ensure: [stream close].
     regex := IRegExp2 new.
     regex pattern: 'Removing[ ](\w+)(|[ ]class)\>\>(([^\<]|\<[^\w])+)'.
     regex global: true.
     matches := (regex execute: contents) queryInterface: IMatchCollection2.
     0 to: matches count - 1
          do:
               [:i |
               submatch := (matches item: i) subMatches queryInterface: ISubMatches.
               class := Smalltalk at: (submatch item: 0) asSymbol.
               (submatch item: 1) notEmpty ifTrue: [class := class class].
               methodName := (submatch item: 2) asSymbol.
               (class compiledMethodAt: methodName ifAbsent: [nil])
                    ifNotNil: [:method | methods add: method]].
     methods

You might want to add some helper methods so the COM stuff doesn't show
through so much. Of course, it would be nice if Dolphin came with the
interface and helper methods already generated :)...
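
To show what the pattern captures without going through COM, here is a rough Python rendering of the same expression. It is only an illustration: Python's re module stands in for the VBScript engine, the function name removed_methods is invented, and the result is a plain tuple rather than a compiled method looked up in the image.

```python
import re

# Class name, optional " class" (metaclass side), then a selector that
# runs until a '<' followed by a word character, i.e. the start of <BR>.
REMOVED = re.compile(r"Removing (\w+)(| class)>>((?:[^<]|<\W)+)")

def removed_methods(log_text):
    """Return (class_name, is_class_side, selector) for each log entry."""
    return [(m.group(1), m.group(2) != "", m.group(3))
            for m in REMOVED.finditer(log_text)]
```

The heading text is skipped because "Removing methods that" is never followed by ">>", and a selector such as << is picked up because each '<' in it is followed by a non-word character.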


John Brant


Re: Lagoon logs

Blair McGlashan
In reply to this post by Bill Schwab-2
"Bill Schwab" <[hidden email]> wrote in message
news:ajg59u$1b2rre$[hidden email]...
> ...
> Here's one for the "customers are never satisfied" column :)   For various
> reasons, I wanted to parse a Lagoon stripping log and display a method
> browser on the methods that were removed.
> ....
> At this point, it looks like a parsing stream is my best option; it will
> miss a few things but should be helpful.  Long term, would it make sense for
> the stripping log to be written in XML?
>

It was planned to generate the stripping log in XML for D5, but there was
just too much other stuff to do (mainly I didn't fancy writing the XSL style
sheet since the XML output would be pretty trivial). It would be relatively
easy to ensure that the existing HTML output is valid as XML, but that isn't
quite the same thing as making it useful as "data", since it would still
need parsing to extract the information.

The other enhancement we planned was an XML "manifest" that detailed exactly
what was in the image - in fact with that one might not need the XML
stripping log, although the log might still occasionally be useful for
debugging purposes. You might find that you could easily implement the XML
manifest in a custom image stripper, and of course determining the set of
methods that were removed is easy given the set of methods that remain. The
main thing to bear in mind, of course, is that you have to limit what is
used by the image stripper in order not to subvert its own purpose (i.e. it
would be a mistake to start using the XML DOM in the stripper itself).
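
As a rough illustration of the "removed = before minus remaining" idea, here is a small Python sketch (not Dolphin code). The function name methods_removed and the "Class>>selector" string form are assumptions made for the example; no manifest format has been defined.

```python
def methods_removed(before, remaining):
    """Methods present before stripping but absent from the stripped image.

    Both arguments are iterables of 'Class>>selector' strings, e.g. one
    listing taken from the development image and one from the manifest.
    """
    return sorted(set(before) - set(remaining))
```

A plain set difference is all that is needed, so the work can be done outside the stripper and adds nothing to the deployed image.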

Other than that, John's regex suggestion sounds like a good one.

Regards

Blair