Parsing an xml


Parsing an xml

mani kartha
Hi all,

From what I understand, I can parse an XML document in the following ways:

1) Create a DOM using XML.Parser with DOM_SAXDriver and work with that DOM
2) Create and use my own MyDOM_SAXDriver and collect the information in my required format while the parser actually parses the XML
3) XML-to-object binding
4) Create a DOM and query it with XPath

Situation: I need to parse a big XML document that contains a large amount of data, of which only about 25% is useful to me. The data I want to collect is scattered across the document. Say the XML has sections A, B and C. Section A gives a set of independent information, section B gives another set of independent information, and section C describes how to relate and use the information in sections A and B. Together, A, B and C make up only 25% of the total data in the XML.

For this situation, which of the four approaches above is the best way to parse the XML? Or are there other ways? I would also like to know how you decide on the trade-off between development effort and execution performance (if a trade-off is necessary at all).

Thanks in advance,

Mani

_______________________________________________
vwnc mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/vwnc
Re: Parsing an xml

Steffen Märcker
Hi,

I'll try to speak from my experience with projects involving XML.
First, I'd give the DOM parser a try to see whether it handles the
documents sufficiently fast and does not run out of memory. Only if it
doesn't would I go and build a specialized DOM_SAXDriver/SAX parser.
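
To make the SAX option concrete, here is a minimal sketch of a
specialized handler that keeps only the wanted sections and skips the
rest while the parser runs. It is written in Python's standard xml.sax
module, not against the VisualWorks XML.Parser API, and the element
names (A, B, C, item, link) and the SectionCollector class are
assumptions made up for illustration.

```python
import xml.sax


class SectionCollector(xml.sax.ContentHandler):
    """Keep leaf-element text only while inside one of the wanted sections."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = set(wanted)   # hypothetical section element names
        self.current = None         # wanted section we are inside, if any
        self.depth = 0              # nesting depth below that section
        self.buffer = []            # text chunks of the element being read
        self.records = {name: [] for name in wanted}

    def startElement(self, name, attrs):
        if self.current is None:
            if name in self.wanted:
                self.current = name
                self.depth = 0
        else:
            self.depth += 1
            self.buffer = []

    def characters(self, content):
        # Text outside the wanted sections is never stored.
        if self.current is not None and self.depth > 0:
            self.buffer.append(content)

    def endElement(self, name):
        if self.current is None:
            return
        if self.depth == 0:
            self.current = None     # leaving the wanted section
            return
        text = "".join(self.buffer).strip()
        if text:
            self.records[self.current].append((name, text))
        self.buffer = []
        self.depth -= 1


def collect(xml_text, wanted=("A", "B", "C")):
    handler = SectionCollector(wanted)
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.records


SAMPLE = """<root>
  <junk><x>ignored</x></junk>
  <A><item>a1</item><item>a2</item></A>
  <B><item>b1</item></B>
  <C><link>a1-b1</link></C>
</root>"""

if __name__ == "__main__":
    print(collect(SAMPLE))
```

The point of the design is that memory use grows with the 25% you keep
rather than with the whole document; the price is the state handling
(current section, depth) that a DOM would otherwise do for you.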
I think querying the DOM with XPath "by hand" is an easy and convenient
way to get your information, especially for testing purposes. But it
becomes pretty complex and hard to maintain as soon as the document, the
data or the relations in your model become more complex. If possible, I'd
stick to one of the XO (XML-to-object) libraries.
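
As a rough illustration of the "by hand" approach, the sketch below
builds a tree and pulls sections A and B out with path queries, then
uses section C to relate them. It uses Python's xml.etree.ElementTree
(which supports only a subset of XPath) rather than the VisualWorks
XPath classes, and the document layout, the id/a/b attributes and the
related_pairs function are all assumptions for the example.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<root>
  <A><item id="a1">alpha</item><item id="a2">beta</item></A>
  <other><item id="zz">noise</item></other>
  <B><item id="b1">one</item></B>
  <C><link a="a1" b="b1"/><link a="a2" b="b1"/></C>
</root>"""


def related_pairs(xml_text):
    """Return (A value, B value) pairs as described by the links in C."""
    root = ET.fromstring(xml_text)
    # Index sections A and B by their (hypothetical) id attribute.
    a_items = {e.get("id"): e.text for e in root.findall("./A/item")}
    b_items = {e.get("id"): e.text for e in root.findall("./B/item")}
    # Section C says which A entry belongs to which B entry.
    return [
        (a_items[link.get("a")], b_items[link.get("b")])
        for link in root.findall("./C/link")
    ]


if __name__ == "__main__":
    print(related_pairs(SAMPLE))
```

With three sections this stays readable; the maintenance problem shows
up once every new relation needs another hand-written query and another
lookup table.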
I've had good experiences with Cincom's solution, which handles both
directions, i.e., XML to object and vice versa. It is well documented and
supports XML Schema specifications.
If marshalling objects to XML is not required, or the XML is 'convoluted',
I stick to SimpleXO. It's a library I had to build to map pretty bad and
complex XML documents easily to objects. It provides very flexible
mappings that can be specified using a dedicated DSL or pure ST code. The
latter is the recommended approach. SimpleXO and its test suite are
available in the public repository under the MIT license. Please tell me
if you give it a try!
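
For readers unfamiliar with XML-to-object binding, the general idea of a
mapping specified in plain code can be sketched as follows. This is a
Python illustration only; it does not show the API of SimpleXO or of
Cincom's library, and the Part class, PART_MAPPING, bind and load_parts
are hypothetical names invented for the example.

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET


@dataclass
class Part:
    ident: str
    label: str


# Hypothetical mapping: field name -> function extracting it from an element.
PART_MAPPING = {
    "ident": lambda e: e.get("id"),
    "label": lambda e: (e.findtext("name") or "").strip(),
}


def bind(element, cls, mapping):
    """Build one object of cls from an element, driven by the mapping."""
    return cls(**{field: extract(element) for field, extract in mapping.items()})


def load_parts(xml_text):
    root = ET.fromstring(xml_text)
    # iter() finds <part> elements at any depth, so irregular nesting is fine.
    return [bind(e, Part, PART_MAPPING) for e in root.iter("part")]


if __name__ == "__main__":
    doc = '<doc><x><part id="p1"><name> Bolt </name></part></x><part id="p2"/></doc>'
    print(load_parts(doc))
```

The mapping lives in one place, so a change in the XML layout means
editing an extractor rather than hunting through query code.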

Kind regards,
Steffen

On 07.02.2014 at 14:46, mani kartha <[hidden email]> wrote:

> Hi all,
>
> from what i understand i can use the following ways to parse an XML
>
> 1) create & communicate with a DOM using XML.Parser with DOM_SAXDriver
> 2) create & use MyDOM_SAXDriver and get the information in my required
> format while the Parser actually parses the XML
> 3) XML to Object Binding
> 4) create a DOM and use XPath
>
> *situation: *my requirement is to parse a big xml which contains large
> amount of data and only 25 % of that data will be useful for me. The data
> which i intend to collect is fragmented in the xml. Say that the xml has
> section A,B and C. section A gives a set of independent information,
> section B gives another set of independed information and section C gives
> information about how to relate and use the information in section A & B.
>  But the combined information of A,B and C will become only 25% of the
> total data in the xml.
>
> For the above situation what will be the best way to parse the xml from the
> above 4? Or if there are other ways. I would like to know how you decide on
> the trade-off between development effort vs performance of execution (if at
> all a trade-off is necessary)
>
> Thanks in advance,
>
> Mani