Hi Blair,
Here's one for the "customers are never satisfied" column :)

For various reasons, I wanted to parse a Lagoon stripping log and display a method browser on the methods that were removed.

My first thought was to use Beck's parsing stream pattern, but I noticed the following:

"<h2>Removing methods that are not required or which must be stripped</h2>Removing Aspect class>>menu:<BR>
Removing Aspect class>>menuBar:<BR>
Removing AvatarChat class>>uninitialize<BR>
Removing AXComponentWizard class>>initialize<BR>"

The above will likely get butchered by the line wrapping; the important part is

"be stripped</h2>Removing Aspect class>>menu:<BR>"

In particular, there is no cr/lf following the heading, which defeats the over-simplified (but convenient<g>) trick of reading lines and then parsing their contents assuming that each line carries one datum.

Seeking a quick solution, I decided to try the XMLDOM package. It blew up on the following:

<h2>Pre-strip Image statistics:</h2>
<TABLE>
<TR><TD><B>Objects</TD><TD>201396</TD></TR>
<TR><TD><B>Classes</B></TD><TD>2376</TD></TR>
<TR><TD><B>Methods</B></TD><TD>38229</TD></TR>
<TR><TD><B>Symbols</B></TD><TD>24086 </TD></TR>
</TABLE>

Note that the "Objects" line is missing a </B>. Adding that helped, but it blew up later on the following:

"Removing Integer>><<<BR>"

#<<, #>>, etc. are going to cause problems.

I think I understand this next item, but I'll mention it in case I'm wrong. A log included

'Removing COAUTHIDENTITY>>DomainLength:<BR>'

but #DomainLength: is not defined in my image. I'm assuming that the method gets created when the structure definitions are compiled during the strip, and then is removed later in stripping. My parsing complained about it, but an ifAbsent: silenced it.

At this point, it looks like a parsing stream is my best option; it will miss a few things but should be helpful. Long term, would it make sense for the stripping log to be written in XML?

Have a good one,

Bill

--
Wilhelm K. Schwab, Ph.D.
[hidden email]
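The problem described above, where a heading shares a line with the first record, can be sidestepped by treating <BR> rather than cr/lf as the record delimiter. What follows is only a minimal sketch of that idea, written in Python rather than Dolphin Smalltalk so that it is self-contained; the function name removal_records is hypothetical, and the sketch assumes every removal record ends in <BR> and contains ">>", as in the excerpts quoted in the post.

```python
import re

def removal_records(log_text):
    """Yield each 'Removing Class>>selector' record from a stripping log.

    Records are delimited by <BR> instead of by line ends, so a heading
    that runs straight into the first record does not hide it.
    """
    for chunk in re.split(r"<BR>", log_text, flags=re.IGNORECASE):
        # Take the LAST 'Removing ' in the chunk: the heading text itself
        # starts with 'Removing methods ...' and must be skipped.
        start = chunk.rfind("Removing ")
        if start == -1:
            continue
        record = chunk[start:].strip()
        # Keep only entries that name a method (Class>>selector).
        if ">>" in record:
            yield record
```

Each yielded string can then be split at the first ">>" into a class part and a selector part.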
"Bill Schwab" <[hidden email]> wrote in message
news:ajg59u$1b2rre$[hidden email]...
>
> Here's one for the "customers are never satisfied" column :) For various
> reasons, I wanted to parse a Lagoon stripping log and display a method
> browser on the methods that were removed.
>
> My first thought was to use Beck's parsing stream pattern, but I noticed the
> following:
> "<h2>Removing methods that are not required or which must be
> stripped</h2>Removing Aspect class>>menu:<BR>
> Removing Aspect class>>menuBar:<BR>
> Removing AvatarChat class>>uninitialize<BR>
> Removing AXComponentWizard class>>initialize<BR>"
>
> The above will likely get butchered by the line wrapping; the important part
> is
>
> "be stripped</h2>Removing Aspect class>>menu:<BR>"
>
> In particular, there is no cr/lf following the heading, which defeats the
> over-simplified (but convenient<g>) trick of reading lines and then parsing
> their contents assuming that each line carries one datum.

You can generate the interface to the Microsoft VBScript Regular Expressions package and use it to parse the file:

    | methods stream contents regex matches submatch class methodName |
    methods := OrderedCollection new.
    "Read the whole log into memory"
    stream := FileStream read: 'myfile.htm'.
    [contents := stream contents] ensure: [stream close].
    regex := IRegExp2 new.
    regex pattern: 'Removing[ ](\w+)(|[ ]class)\>\>(([^\<]|\<[^\w])+)'.
    regex global: true.
    matches := (regex execute: contents) queryInterface: IMatchCollection2.
    0 to: matches count - 1 do: [:i |
        "Submatches: 0 = class name, 1 = ' class' for class-side methods, 2 = selector"
        submatch := (matches item: i) subMatches queryInterface: ISubMatches.
        class := Smalltalk at: (submatch item: 0) asSymbol.
        (submatch item: 1) notEmpty ifTrue: [class := class class].
        methodName := (submatch item: 2) asSymbol.
        "Skip methods that are not present in this image"
        (class compiledMethodAt: methodName ifAbsent: [nil])
            ifNotNil: [:method | methods add: method]].
    methods

You might want to add some helper methods so the COM stuff doesn't show through so much. Of course, it would be nice if Dolphin came with the interface and helper methods already generated :)...

John Brant
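For readers without the VBScript regex interface generated, the pattern from the post above can be tried out on its own. The following is a rough Python rendering, offered as a sketch rather than as the method from the thread: the names REMOVED and parse_removed are hypothetical, and the backslashes before "<" and ">" are dropped on the assumption that they only escape literal characters in the original.

```python
import re

# Same structure as the pattern in the post:
#   group 1 = class name
#   group 2 = ' class' for class-side methods, empty otherwise
#   group 3 = selector, which may contain '<' as long as the '<' is not
#             followed by a word character (that would start a tag like <BR>)
REMOVED = re.compile(r"Removing (\w+)(| class)>>((?:[^<]|<[^\w])+)")

def parse_removed(log_text):
    """Return (class_name, is_class_side, selector) for each removal entry."""
    return [
        (m.group(1), m.group(2) != "", m.group(3))
        for m in REMOVED.finditer(log_text)
    ]
```

Because the pattern is matched against the whole file contents instead of line by line, the missing cr/lf after the heading does not matter, and the heading itself ("Removing methods that ...") is skipped because it contains no ">>" directly after the first word.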
In reply to this post by Bill Schwab-2
"Bill Schwab" <[hidden email]> wrote in message
news:ajg59u$1b2rre$[hidden email]...
> ...
> Here's one for the "customers are never satisfied" column :) For various
> reasons, I wanted to parse a Lagoon stripping log and display a method
> browser on the methods that were removed.
> ....
> At this point, it looks like a parsing stream is my best option; it will
> miss a few things but should be helpful. Long term, would it make sense for
> the stripping log to be written in XML?
>

It was planned to generate the stripping log in XML for D5, but there was just too much other stuff to do (mainly I didn't fancy writing the XSL style sheet, since the XML output would be pretty trivial). It would be relatively easy to ensure that the existing HTML output is valid as XML, but that isn't quite the same thing as making it useful as "data", since it would still need parsing to extract the information.

The other enhancement we planned was an XML "manifest" that detailed exactly what was in the image. In fact, with that one might not need the XML stripping log, although the log might still occasionally be useful for debugging purposes.

You might find that you could easily implement the XML manifest in a custom image stripper, and of course determining the set of methods that were removed is easy given the set of methods that remain. The main thing to bear in mind, of course, is that you have to limit what is used by the image stripper in order not to subvert its own purpose (i.e. it would be a mistake to start using the XML DOM in the stripper itself).

Other than that, John's regex suggestion sounds like a good one.

Regards

Blair
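The remark above that the removed methods follow from the methods that remain amounts to a set difference. A minimal sketch in Python, assuming two hypothetical inputs: the methods present before stripping and the methods listed in a manifest afterwards, each given as (class name, selector) pairs. Neither the manifest format nor the function name removed_methods comes from Dolphin.

```python
def removed_methods(before, remaining):
    """Return the methods present before stripping but absent afterwards.

    Both arguments are iterables of (class_name, selector) pairs; the
    result is sorted so that it is stable for display in a browser.
    """
    return sorted(set(before) - set(remaining))
```

For example, with before = [("Aspect class", "menu:"), ("Integer", "<<"), ("Object", "printString")] and remaining = [("Object", "printString")], the function returns the two stripped entries for Aspect class and Integer.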