Automate XML parsing

SergeStinckwich
Hi all,

I often have to write XML parsers (for BPMN, for example) in order to
populate OO models. This is not really difficult, just a little
cumbersome to write by hand, especially when the XML is complex. Most
of these XML file specifications are based on XML Schema, so I guess
it is possible to partly automate this.

Do we have something in the Moose/Pharo ecosystem to do that?
What is the equivalent in the Eclipse ecosystem?
Thank you.
--
Serge Stinckwich
UCBN & UMI UMMISCO 209 (IRD/UPMC)
Every DSL ends up being Smalltalk
http://www.doesnotunderstand.org/
_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.list.inf.unibe.ch/listinfo/moose-dev

Re: Automate XML parsing

stepharo
Synectique paid Peter to offer a way to read in XMI files.

Now I do not know the status and if this is what you are looking for.


Stef


Re: Automate XML parsing

SergeStinckwich
On Sat, Jun 4, 2016 at 5:29 PM, stepharo <[hidden email]> wrote:
> Synectique paid Peter to offer a way to read in XMI files.
>
> Now I do not know the status and if this is what you are looking for.
>

Unfortunately, this is not what I'm looking for.
The XML files I'm using are based on XML Schema and not on XMI.


Re: Automate XML parsing

Peter Uhnak
Hi,

XMI is a subset of XML with some more well-defined behavior.
I wrote a tool that can process XML files exhibiting certain XMI properties.

The repo is here: https://github.com/peteruhnak/xmi-analyzer

I don't know if Usman had a chance to work with it in practice, but if Synectique has any feedback or feature requests then let me know.

You are right that if you have an XML Schema (or model-compliant XMI), then processing the XML can be fully automated.
Unfortunately, implementing full support for that is quite a lot of work, which is why my xmi-analyzer takes shortcuts.
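
To illustrate why a schema makes the automation possible, here is a minimal sketch in Python (not Pharo, and not how xmi-analyzer works). It reads a tiny, made-up XSD and lists, for each named complexType, the fields a generated class would need. The schema, the type names `Process` and `Task`, and the function `class_outline` are all assumptions invented for this example.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes qualified names as "{namespace-uri}local".
XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema, invented for this example.
SCHEMA = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="Process">
    <xs:sequence>
      <xs:element name="task" type="Task" maxOccurs="unbounded"/>
    </xs:sequence>
    <xs:attribute name="id" type="xs:ID"/>
  </xs:complexType>
  <xs:complexType name="Task">
    <xs:attribute name="id" type="xs:ID"/>
    <xs:attribute name="name" type="xs:string"/>
  </xs:complexType>
</xs:schema>
"""


def class_outline(xsd_text):
    """Map each top-level complexType to (field, type, is_collection) triples."""
    root = ET.fromstring(xsd_text)
    outline = {}
    for ctype in root.findall(XS + "complexType"):
        fields = []
        # Child elements become fields; maxOccurs other than 1 means a collection.
        for el in ctype.iter(XS + "element"):
            many = el.get("maxOccurs", "1") != "1"
            fields.append((el.get("name"), el.get("type"), many))
        # XML attributes become single-valued fields.
        for attr in ctype.iter(XS + "attribute"):
            fields.append((attr.get("name"), attr.get("type"), False))
        outline[ctype.get("name")] = fields
    return outline


if __name__ == "__main__":
    for cls, fields in class_outline(SCHEMA).items():
        print(cls, fields)
```

This handles only named complexTypes with direct elements and attributes. Full XSD adds type extension, groups, anonymous types, imports and more, which is where the large amount of work comes from.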

I suggest you give it a try, or send me an example of your XML file and maybe we can improve it.

XSD support is wanted, but nowhere in sight. Maybe Monty (the current mastermind behind XML-Parser) has an idea of how complex it would be?

Peter


Re: Automate XML parsing

Peter Uhnak
So I did take a look at Serge's XML files, and they really need an XSD processor.
They are too loose to be processed as XMI: XMI is a subset of XML, and generic XML is way too complex for the analyzer. The result I got was way too deep (it generated a lot of classes).
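
A rough sketch of why schema-less inference gets deep, written in Python for illustration only. It is a naive approach and an assumption on my side, not a description of what the analyzer actually does: every distinct tag becomes a class candidate, and its attributes and child tags become fields. The sample document and the function `infer_classes` are made up.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical instance document, invented for this example.
SAMPLE = """
<process id="p1">
  <task id="t1"><label><text>Review</text></label></task>
  <task id="t2" name="Approve"/>
</process>
"""


def infer_classes(xml_text):
    """Return {tag: set of attribute and child names} seen in the document."""
    classes = defaultdict(set)
    for el in ET.fromstring(xml_text).iter():
        classes[el.tag].update(el.attrib)                  # attribute names become fields
        classes[el.tag].update(child.tag for child in el)  # so do child element names
    return dict(classes)


if __name__ == "__main__":
    for tag, fields in sorted(infer_classes(SAMPLE).items()):
        print(tag, sorted(fields))
```

Even pure wrapper elements such as `label` and `text` end up as class candidates here. Without a schema there is nothing to say that `text` is just a string, so the generated structure follows the nesting of the file.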

I could modify it, a little or a lot, to generate better results, but I think that a proper XSD processor would be better here.
I will need to think about how complex that would be, and of course if Monty has anything to say, I'm all ears. :)

Peter


Re: Automate XML parsing

Tudor Girba-2
Hi Serge, Peter, et al.,

I believe you are looking for an X/O mapper. This is something that is sorely missing in Moose, and the main reason we do not have one is that we do not have an XSD parser.

Here is a paper that describes the X/O space:

Revealing the X/O impedance mismatch
Ralf Laemmel and Erik Meijer
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.105.1550&rep=rep1&type=pdf

Here is a presentation that might be relevant:
http://www.slideshare.net/rlaemmel/xml-data-binding

In Java, a prominent X/O mapper is JAXB:
https://jaxb.java.net
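
To make the idea concrete, here is a hand-written sketch of the kind of binding an X/O mapper would generate from a schema. It is in Python for illustration and does not use JAXB or any Moose API. The classes `Process` and `Task`, the function `process_from_xml`, and the document shape are all assumptions.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import List


@dataclass
class Task:
    id: str
    name: str


@dataclass
class Process:
    id: str
    tasks: List[Task] = field(default_factory=list)


def process_from_xml(xml_text: str) -> Process:
    """Populate the object model from a <process> document."""
    root = ET.fromstring(xml_text)
    # Each <task> child maps to one Task object; a missing name defaults to "".
    tasks = [
        Task(id=t.get("id"), name=t.get("name", ""))
        for t in root.findall("task")
    ]
    return Process(id=root.get("id"), tasks=tasks)


if __name__ == "__main__":
    doc = '<process id="p1"><task id="t1" name="Review"/></process>'
    print(process_from_xml(doc))
```

Writing this by hand for two classes is easy. The point of a mapper is to derive both the classes and the population code from the schema when there are hundreds of them.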

Cheers,
Doru




--
www.tudorgirba.com
www.feenk.com

"The coherence of a trip is given by the clearness of the goal."






Re: Automate XML parsing

Ben Coman
In reply to this post by Peter Uhnak

Hi Peter,

Great to hear of this.  A CIM importer was the missing part of my
Master's project a few years ago on the IEC61970 Common Information
Model [1].  The model has about 800 classes and 800 relationships
between them, and is maintained as a UML diagram in Enterprise
Architect (EA), from which the IEC standard is generated.  EA can
export CIM, but at the time I could not find such an importer and had
to hack my own import process from the raw EA file.

[1] https://en.wikipedia.org/wiki/Common_Information_Model_(electricity)

I downloaded the latest CIM.xmi file to try it, and I get an RBParser
syntax error...
     thecustomprofile:Compound
           ^ <RED>Variable or expression expected</RED>
                 -> thecustomprofile:Compound

where the <RED> part is the error message, inserted in red coloured text.
The immediate problem is the colon in the identifier.
I tracked this down to the use of a namespace like this...

<xmi:XMI xmi:version="2.1" xmlns:uml="http://schema.omg.org/spec/UML/2.1"
  xmlns:xmi="http://schema.omg.org/spec/XMI/2.1"
  xmlns:thecustomprofile="http://www.sparxsystems.com/profiles/thecustomprofile/1.0"
  xmlns:EAUML="http://www.sparxsystems.com/profiles/EAUML/1.0">
....
<thecustomprofile:Compound
base_Class="EAID_1BEA10DB_2F71_467b_82A6_6C453C5657A5"/>

How can I work around this?

Now "thecustomprofile" is Enterprise Architect related (i.e. the
editing tool) and *nothing* to do with the domain, so I guess it could
be thrown away.  I notice down the stack that XMLElement has
variables...
   name ==> thecustomprofile:Compound
   localName ==> Compound

so in XACLassStructureGenerator>>processElement:
I changed "self ensureAttributeWithAccessor: anElement name..."
to   "self ensureAttributeWithAccessor: anElement localName..."
and it seems to have got further.  I'm now looking at the next bump in the road.
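
The difference between the two names can be reproduced outside Pharo. A minimal sketch in Python, using the snippet above: the standard ElementTree parser reports the element as `{namespace-uri}local`, and only the local part is usable as an identifier. The helper `split_tag` is a made-up name, and this does not use the XML-Parser API.

```python
import xml.etree.ElementTree as ET

# Trimmed version of the XMI header and element quoted above.
XMI = """<xmi:XMI xmi:version="2.1" xmlns:uml="http://schema.omg.org/spec/UML/2.1"
  xmlns:xmi="http://schema.omg.org/spec/XMI/2.1"
  xmlns:thecustomprofile="http://www.sparxsystems.com/profiles/thecustomprofile/1.0"
  xmlns:EAUML="http://www.sparxsystems.com/profiles/EAUML/1.0">
  <thecustomprofile:Compound
    base_Class="EAID_1BEA10DB_2F71_467b_82A6_6C453C5657A5"/>
</xmi:XMI>"""


def split_tag(tag):
    """Split ElementTree's '{namespace-uri}local' form into (uri, local)."""
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return uri, local
    return None, tag  # element without a namespace


root = ET.fromstring(XMI)
compound = root[0]
uri, local = split_tag(compound.tag)

if __name__ == "__main__":
    print(uri)
    print(local)
    # The prefixed name contains a colon, so it is not a valid identifier.
    print("thecustomprofile:Compound".isidentifier(), local.isidentifier())
```

Dropping the prefix is the quick fix. It can collide if two namespaces define the same local name, so a generator may prefer to fold the prefix or the namespace into the generated name instead.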

cheers -ben

Re: Automate XML parsing

Peter Uhnak
> syntax error...
>      thecustomprofile:Compound
>            ^ <RED>Variable or expression expected</RED>
>                  -> thecustomprofile:Compound
>
> where the <RED> line is error inserted in red coloured text.
> The immediate problem being the colon in the identifier.
> I tracked this down as coming from the use of a namespace like this...
>
> <xmi:XMI xmi:version="2.1" xmlns:uml="http://schema.omg.org/spec/UML/2.1"
>   xmlns:xmi="http://schema.omg.org/spec/XMI/2.1"
>   xmlns:thecustomprofile="http://www.sparxsystems.com/profiles/thecustomprofile/1.0"
>   xmlns:EAUML="http://www.sparxsystems.com/profiles/EAUML/1.0">
> ....
> <thecustomprofile:Compound
> base_Class="EAID_1BEA10DB_2F71_467b_82A6_6C453C5657A5"/>
>
> How can I work around this?

I need better handling of tag names with namespaces in them… I'll add it to my todo queue. :)


>
> Now "thecustomprofile" is Enterprise Architect related (i.e. the
> editing tool) and *nothing* to do with the domain, so I guess it could
> be thrown away.

Enterprise Architect adds this if you use a custom stereotype without a defined profile (UML requires a profile for a stereotype).
But since people blissfully ignore the specs and just shove stereotypes into their models, the tool will autogenerate a profile for you.

> I notice down the stack that XMLElement has
> variables...
>    name ==> thecustomprofile:Compound
>    localName ==> Compound

I think this depends on what namespaces are defined in parent elements (iirc it wasn't reliable), but it's another thing for me to look at.

The tool is the product of about a week of intensive work, so there are certainly holes.
We wanted to make some behavior pluggable, especially when you are mapping a file on an existing model… but that's for future work.

Thanks for trying the tool!

Peter

Re: Automate XML parsing

SergeStinckwich
In reply to this post by Tudor Girba-2
Thank you Doru! Now I have a name for what I need ;-)


Re: Automate XML parsing

Usman Bhatti
In reply to this post by Peter Uhnak
On Sat, Jun 4, 2016 at 7:54 PM, Peter Uhnak <[hidden email]> wrote:
> I don't know if Usman had a chance to work with it in practice, but if Synectique has any feedback or feature requests then let me know.

I didn't have a chance to test it yet; I was busy with other projects. But the deadline for it is now approaching, so I'll have to have a look at it this week. I'll let you know.