The future of Opax in XML-Support

The future of Opax in XML-Support

jaayer
The most recent version of XML-Support comes with a factory-based abstraction layer to customize the node object construction done by the parser. The default factory is just a number of stateless accessors with names like #elementClass and #documentClass that return class objects. This class can be subclassed and those messages overridden, and the subclass can then be instantiated and injected into a DOM parser using #nodeFactory: before parsing.
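
For illustration, subclassing the default factory might look roughly like the sketch below. It is not taken from XML-Support: MyElement, MyNodeFactory and the 'MyApp-XML' category are hypothetical names, and only XMLNodeFactory, XMLElement, XMLDOMParser, #elementClass, #nodeFactory: and #parseDocument come from this post.

```smalltalk
"A minimal sketch, assuming the default factory class is XMLNodeFactory.
 MyElement, MyNodeFactory and the category name are hypothetical."
| factoryClass doc |
XMLElement subclass: #MyElement
	instanceVariableNames: ''
	classVariableNames: ''
	poolDictionaries: ''
	category: 'MyApp-XML'.

factoryClass := XMLNodeFactory subclass: #MyNodeFactory
	instanceVariableNames: ''
	classVariableNames: ''
	poolDictionaries: ''
	category: 'MyApp-XML'.

"Override the stateless accessor so that elements are built as MyElement."
factoryClass
	compile: 'elementClass
	^ MyElement'
	classified: 'accessing'.

"Inject an instance of the subclass into the parser before parsing."
doc := (XMLDOMParser on: '<root><child/></root>')
	nodeFactory: factoryClass new;
	parseDocument.
```

The other accessors, such as #documentClass, could be overridden in the same way.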

However, also included is a node factory subclass named XMLPluggableElementFactory that can map specific elements to specific element classes based on the name and namespace information of the elements in question. Here is an example of its use:

        doc := (XMLDOMParser on: aStringOrStream)
                nodeFactory:
                        (XMLPluggableElementFactory new
                                handleElement: 'user' withClass: MyUserElement;
                                handleElement: 'report' withClass: MyReportElement;
                                handleElement: 'report' namespaceURI: 'urn:specialReport' withClass: MySpecialReportElement);
                parseDocument.

This is undoubtedly similar to Opax, if not in its approach then certainly in its intent. I feel, however, that it is superior to Opax in a number of respects and question whether Opax should remain in XML-Support.

First, the node factory abstraction layer is based on the DOM parser and XMLNode classes, not the SAX parser and the custom OP*Element classes Opax has added. This means that users will get back instances of XMLElement subclasses rather than OPGenericElement subclasses, and those objects will support the entire XMLElement protocol. Second, the pluggable element factory enables more powerful element/class mapping based not only on the names of elements but on their namespace information as well. Third, the element factory and its test are both quite small, in part because they do not need to duplicate the functionality of XMLElement as OPGenericElement does. Fourth, because the element factory is a subclass of XMLNodeFactory, it can be subclassed further to exert additional control over which classes the DOM parser should use for other, non-element nodes. Fifth and perhaps most important of all, the factory approach does not require a specific subclass of XMLDOMParser to be used; it is injected into a DOM parser by the user, so any DOM parser, even ones that already exist, can use it without modification. To use Opax, however, a parser must be a subclass of OPOpaxHandler, which is itself a subclass of SAXHandler. That means DOM parsers can't be used with Opax at all, and existing SAX parsers can't use it unless they are rewritten to be subclasses of OPOpaxHandler.

I think Opax should be taken out and made optionally loadable by the Metacello configuration file. In fact, this should be done regardless of which approach is preferred, as the package is already quite large and stands to get larger as more is added. I plan on making XMLWriter a separate but required package, and the test suites separate but optional packages.

If anyone has any objections, I would like to hear them.


_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

Re: The future of Opax in XML-Support

Tudor Girba
Hi,

Thanks for this nice overview. I am happy that XML handling gets a bit more traction in Smalltalk :). I took a quick look and your XMLPluggableElementFactory sounds quite interesting, and it's great that it supports namespaces.

Regarding Opax, your analysis is not quite right.
- You do not need to subclass the OPOpaxHandler.
- The goal of Opax is not to replace DOM, but to enhance SAX. It's true that at the moment it still creates a tree, but this should be changed to make it optional. The original idea of Opax was to dispatch everything, including the factory decision to the Element, but the implementation remained behind the wishes.
- Opax is tiny: 3 classes + 4 test classes
- OPGenericElement should simply be made a subclass of XMLElement, and we would have the compatibility we would need.
- I do not see the reasons why DOM should be preferred to SAX. The problem with DOM is that it always creates XML elements :). When you have large XML files, you often do not want to load them, but just to process them directly. This is the goal of SAX, but then SAX is procedural. Opax should be used to transform SAX into an object-oriented handling.

Instead of removing it, I would suggest a different approach. Let's make it focus on the SAX parsing:
- We could easily get it to use the XMLNodeFactory
- We could subclass OPGenericElement from XMLElement.

In any case, regarding packaging, I would definitely be in favor of splitting XMLSupport in multiple packages.

Cheers,
Doru


On 15 Nov 2010, at 08:46, jaayer wrote:

> The most recent version of XML-Support comes with a factory-based abstraction layer to customize the node object construction done by the parser. The default factory is just a number of stateless accessors with names like #elementClass and #documentClass that return class objects. This class can be subclassed and those messages overridden, and the subclass can then be instantiated and injected into a DOM parser using #nodeFactory: before parsing.
--
www.tudorgirba.com

"We cannot reach the flow of things unless we let go."

Re: The future of Opax in XML-Support

Stéphane Ducasse
In reply to this post by jaayer

On Nov 15, 2010, at 9:45 AM, Tudor Girba wrote:

> In any case, regarding packaging, I would definitely be in favor of splitting XMLSupport in multiple packages.

It would also be good to have a sexier name than XMLSupport.
I know people do not care about names, but XMLSupport sounds like "OK, here is a garage; yes, you may find the tool in there",
while I would love to have Gardner: a nice and cool toolkit to take care of trees (said in a soft and nice (woman's) voice), if you see what I mean.

Stef

Re: The future of Opax in XML-Support

Tudor Girba
+1 :)

Doru


--
www.tudorgirba.com

"What is more important: To be happy, or to make happy?"

Re: The future of Opax in XML-Support

jaayer
In reply to this post by Tudor Girba

---- On Mon, 15 Nov 2010 00:45:04 -0800 Tudor Girba  wrote ----

>Hi,
>
>Thanks for this nice overview. I am happy that XML handling gets a bit more traction in Smalltalk :). I took a quick look and your XMLPluggableElementFactory sounds quite interesting, and it's great that it supports namespaces.
>
>Regarding Opax, your analysis is not quite right.
>- You do not need to subclass the OPOpaxHandler.

Really? So if I have a pre-existing SAX parser, say SVGSAXParser, there is a way to make it support Opax-like functionality without changing its superclass to be that of OPOpaxHandler?

>- The goal of Opax is not to replace DOM, but to enhance SAX. It's true that at the moment it still creates a tree, but this should be changed to make it optional. The original idea of Opax was to dispatch everything, including the factory decision to the Element, but the implementation remained behind the wishes.

To be perfectly honest with you, I did not before, nor do I now, fully understand what Opax is supposed to do. I understand that at the very least it involves mapping elements in an XML document to different kinds of objects, but how it is ultimately supposed to go about doing this remains unclear and appears to still be in flux.

>- Opax is tiny: 3 classes + 4 test classes

True, but it takes up two top-level class categories and still adds more weight to the package, and by your own admission it stands to only get bigger.

>- OPGenericElement should simply be made a subclass of XMLElement, and we would have the compatibility we would need.

Right, but then it would be a DOM node, and you said you wanted Opax to avoid DOM, or at least the DOM parser.

>- I do not see the reasons why DOM should be preferred to SAX. The problem with DOM is that it always creates XML elements :). When you have large XML files, you often do not want to load them, but just to process them directly. This is the goal of SAX, but then SAX is procedural. Opax should be used to transform SAX into an object-oriented handling.

So the goal is something that only produces objects for certain portions of a document, but ignores the rest? I think this could be better built on top of the DOM parser, perhaps as a partial DOM parser.

>Instead of removing it, I would suggest a different approach. Let's make it focus on the SAX parsing:
>- We could easily get it to use the XMLNodeFactory
>- We could subclass OPGenericElement from XMLElement.

I think an approach that used more metaprogramming and dependency injection rather than inheritance would be better. Maybe something that uses reflection to query injected classes to be used for elements and then fills their instance variables based on the names of those variables and the names of the child elements and attributes that the elements the class has been mapped to contain. In other words, you wouldn't need to subclass OPGenericElement OR XMLElement; just have instance variables in the injected class with names matching, roughly, the attribute and child element names of the elements the class has been mapped to. You could also support explicit conversion instructions. For example, something that could be told to map "timestamp" elements to the DateAndTime class and to convert their content using fromString:.
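
To make the idea more concrete, here is a rough sketch that uses only core reflection (#allInstVarNames and #instVarNamed:put:). Nothing in it exists in XML-Support or Opax: MyReport, the converters dictionary and the values dictionary are all hypothetical, and the values dictionary merely stands in for the attribute and child element content that would be collected from one mapped element.

```smalltalk
"Hypothetical sketch; MyReport is a plain Object subclass, not an
 XMLElement or OPGenericElement subclass."
| reportClass converters values report |
reportClass := Object subclass: #MyReport
	instanceVariableNames: 'title timestamp'
	classVariableNames: ''
	poolDictionaries: ''
	category: 'MyApp-XML'.

"Explicit conversion instructions, keyed by element or attribute name."
converters := Dictionary new.
converters at: 'timestamp' put: [:string | DateAndTime fromString: string].

"Stand-in for the names and string content found in one mapped element."
values := Dictionary new.
values at: 'title' put: 'Quarterly report'.
values at: 'timestamp' put: '2010-11-15T08:46:00+00:00'.
values at: 'unmapped' put: 'ignored'.

"Fill each instance variable whose name matches; skip everything else."
report := reportClass new.
values keysAndValuesDo: [:name :string | | convert |
	((reportClass allInstVarNames collect: [:each | each asString]) includes: name)
		ifTrue: [
			convert := converters at: name ifAbsent: [[:s | s]].
			report instVarNamed: name put: (convert value: string)]].
```

Names without a matching instance variable are skipped here; a real implementation would have to decide whether that should be an error instead.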

>In any case, regarding packaging, I would definitely be in favor of splitting XMLSupport in multiple packages.

Re: The future of Opax in XML-Support

jaayer
In reply to this post by Stéphane Ducasse




---- On Mon, 15 Nov 2010 01:26:50 -0800 Stéphane Ducasse  wrote ----

>
>It would also be good to have a sexier name than XMLSupport.
>I know people do not care about names, but XMLSupport sounds like "OK, here is a garage; yes, you may find the tool in there",
>while I would love to have Gardner: a nice and cool toolkit to take care of trees (said in a soft and nice (woman's) voice), if you see what I mean.

I believe the library used to be called YAXO, which was an acronym for... something. I am not sure why the name was changed to something so generic; perhaps because it sounds more professional, or for SEO performance reasons, or so people would know this was *the* package to download for handling XML in Squeak and Pharo. I don't really know.

>Stef

Re: The future of Opax in XML-Support

Stéphane Ducasse
In reply to this post by Stéphane Ducasse
> I believe the library used to be called YAXO, which was an acronym for... something.

YetAnotherX....
and Yaxo is way better than XMLSupportKitchenSinkAndMoreButThisIsNotClearAndNameSucksSoMayBeCodeTooWhyNot

> I am not sure why the name was changed to something so generic; perhaps because it sounds more professional,

Certainly not.
Just a lack of ideas.

> or for SEO performance reasons, or so people would know this was *the* package to download for handling XML in Squeak and Pharo. I don't really know.

This is why I propose that we think about it, have fun and get a really cool name.

For example (this one does not really work; I am just brainstorming for 3 seconds):
        AvaDoom (because Ava Gardner ~ Gardener ~ Trees of DOM Objects)
        the logo could be a voodoo invocation.


Stef

Re: The future of Opax in XML-Support

jaayer




---- On Mon, 15 Nov 2010 12:37:58 -0800 Stéphane Ducasse  wrote ----

>> I believe the library used to be called YAXO, which was an acronym for... something.
>
>YetAnotherX....
>and Yaxo is way better than XMLSupportKitchenSinkAndMoreButThisIsNotClearAndNameSucksSoMayBeCodeTooWhyNot

We can always change it back to YAXO or change it to something else entirely. I would at least prefer something with "XML" in it so people will still be able to easily find, whether through Googling or searching on SqueakSource, an XML parser for Squeak/Pharo/Gemstone without having to know the cutesy name its developers chose for it ahead of time (see practically any Ruby library for an example of what I'm talking about).

One plus of a new name would be that Google searches for it would return only new(er), relevant content, like blog posts with code examples that still work rather than wiki pages that have been abandoned for years.

Re: The future of Opax in XML-Support

Alexandre Bergel
> We can always change it back to YAXO or change it to something else entirely. I would at least prefer something with "XML" in it so people will still be able to easily find, whether through Googling or searching on SqueakSource, an XML parser for Squeak/Pharo/Gemstone without having to know the cutesy name its developers chose for it ahead of time (see practically any Ruby library for an example of what I'm talking about).


Yep. Having 'XML' in the name is important.

Alexandre

--
_,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:
Alexandre Bergel  http://www.bergel.eu
^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;.

Re: The future of Opax in XML-Support

Tudor Girba
In reply to this post by jaayer
Hi,

On 15 Nov 2010, at 15:29, jaayer wrote:

>
> ---- On Mon, 15 Nov 2010 00:45:04 -0800 Tudor Girba  wrote ----
>
>> Hi,
>>
>> Thanks for this nice overview. I am happy that XML handling gets a bit more traction in Smalltalk :). I took a quick look and your XMLPluggableElementFactory sounds quite interesting, and it's great that it supports namespaces.
>>
>> Regarding Opax, your analysis is not quite right.
>> - You do not need to subclass the OPOpaxHandler.
>
> Really? So if I have a pre-existing SAX parser, say SVGSAXParser, there is a way to make it support Opax-like functionality without changing its superclass to be that of OPOpaxHandler?

:). No, and you are not supposed to. The reason for subclassing the SAXHandler is to accommodate the stream of XML nodes in methods like startElement:... . The OPOpaxHandler overrides these methods, creates the corresponding nodes, and dispatches the handling to them.

>> - The goal of Opax is not to replace DOM, but to enhance SAX. It's true that at the moment it still creates a tree, but this should be changed to make it optional. The original idea of Opax was to dispatch everything, including the factory decision to the Element, but the implementation remained behind the wishes.
>
> To be perfectly honest with you, I did not before, nor do I now, fully understand what Opax is supposed to do. I understand that at the very least it involves mapping elements in an XML document to different kinds of objects, but how it is ultimately supposed to go about doing this remains unclear and appears to still be in flux.

It's the same as with your XMLDOMParser: you do not subclass it, you just parameterize it. In your case, the parameterization is quite nice.

>> - Opax is tiny: 3 classes + 4 test classes
>
> True, but it takes up two top-level class categories and still adds more weight to the package, and by your own admission it stands to only get bigger.
>
>> - OPGenericElement should simply be made a subclass of XMLElement, and we would have the compatibility we would need.
>
> Right, but then it would be a DOM node, and you said you wanted Opax to avoid DOM, or at least the DOM parser.

Yes and no. It would be a DOM node, but this does not mean that we have to store all of them in a tree if we do not need them.

>> - I do not see the reasons why DOM should be preferred to SAX. The problem with DOM is that it always creates XML elements :). When you have large XML files, you often do not want to load them, but just to process them directly. This is the goal of SAX, but then SAX is procedural. Opax should be used to transform SAX into an object-oriented handling.
>
> So the goal is something that only produces objects for certain portions of a document, but ignores the rest? I think this could be better built on top of the DOM parser, perhaps as a partial DOM parser.

With the new Factory, the DOMParser is similar to what the OPOpaxHandler is doing. The difference is that the Factory is doing the mapping, while in Opax the mapping is done on the class side of the element.

>> Instead of removing it, I would suggest a different approach. Let's make it focus on the SAX parsing:
>> - We could easily get it to use the XMLNodeFactory
>> - We could subclass OPGenericElement from XMLElement.
>
> I think an approach that used more metaprogramming and dependency injection rather than inheritance would be better. Maybe something that uses reflection to query injected classes to be used for elements and then fills their instance variables based on the names of those variables and the names of the child elements and attributes that the elements the class has been mapped to contain. In other words, you wouldn't need to subclass OPGenericElement OR XMLElement; just have instance variables in the injected class with names matching, roughly, the attribute and child element names of the elements the class has been mapped to. You could also support explicit conversion instructions. For example, something that could be told to map "timestamp" elements to the DateAndTime class and to convert their content using fromString:.

I do not see what would be gained by this, because I do not see what else I could use these classes for. In any case, I would not go the path of playing with instance variables as long as there are simpler ways.

All in all, I think that the DOM starts to do a good job at creating a tree. I would propose moving Opax towards an on-the-fly analysis of the parsed tree, but without storing it (most of the time you basically only need the current stack, i.e. the path to the root).
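
A rough sketch of the "current stack only" idea, not the Opax implementation: PathTrackingHandler is a hypothetical class, and the callback selectors #startElement:attributes: and #endElement: are assumptions that should be checked against the SAXHandler in the version of XML-Support at hand. It also assumes that SAXHandler answers #on: and #parseDocument the way XMLDOMParser does.

```smalltalk
"Sketch only. PathTrackingHandler is hypothetical and the SAX callback
 selectors used below are assumed, not confirmed by this thread."
| handlerClass |
handlerClass := SAXHandler subclass: #PathTrackingHandler
	instanceVariableNames: 'path'
	classVariableNames: ''
	poolDictionaries: ''
	category: 'MyApp-XML'.

"Push on entry: path then holds the names from the root down to here."
handlerClass
	compile: 'startElement: aName attributes: attributes
	path isNil ifTrue: [path := OrderedCollection new].
	path addLast: aName.
	Transcript show: path printString; cr'
	classified: 'handling'.

"Pop on exit, so nothing outlives the element that is being processed."
handlerClass
	compile: 'endElement: aName
	path removeLast'
	classified: 'handling'.

(handlerClass on: '<a><b><c/></b><d/></a>') parseDocument.
```

At any point during the parse only the open elements are held, so memory use follows the depth of the document rather than its size.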

Cheers,
Doru


--
www.tudorgirba.com

"Sometimes the best solution is not the best solution."

Re: The future of Opax in XML-Support

Stéphane Ducasse
In reply to this post by jaayer
Why?

A name is not an explanation.

Ava: an XML parser

So, I repeat: for me, XMLSupport is a bad name.

Look at Comanche, Apache.

Why is Apache not called HTTPServer?


Stef

>> We can always change it back to YAXO or change it to something else entirely. I would at least prefer something with "XML" in it so people will still be able to easily find, whether through Googling or searching on SqueakSource, an XML parser for Squeak/Pharo/Gemstone without having to know the cutesy name its developers chose for it ahead of time (see practically any Ruby library for an example of what I'm talking about).
>
>
> Yep. Having 'XML' in the name is important.
>
> Alexandre
>
> --
> _,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:
> Alexandre Bergel  http://www.bergel.eu
> ^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;.

Re: The future of Opax in XML-Support

jfabry

A name that is not an explanation makes it hard for the user to know what the thing is or does. I have many problems with the Moose / Pharo / Squeak / ... community in this respect: I am sure there are a lot of cool tools and technologies out there, and I see names flying around, but I have no idea what they are.

Apache gets away with it because it is world famous. Once your tool is world famous the name does not matter anymore.

OK, name your stuff however you want, but in that case the community should also put a glossary online so that people like me can find their way.

On 16 Nov 2010, at 11:16, Stéphane Ducasse wrote:

> Why?
>
> A name is not an explanation.
>
> Ava: an XML parser
>
> So for me, I repeat, XML-Support is a bad name.
>
> Look at Comanche, Apache.
>
> Why is Apache not called HTTPServer?
>
>
> Stef

--
Johan Fabry  
[hidden email] - http://dcc.uchile.cl/~jfabry
PLEIAD Lab - Computer Science Department (DCC) - University of Chile





Re: The future of Opax in XML-Support

Stéphane Ducasse
In reply to this post by Stéphane Ducasse
This is why you can put a little explanation beside the name.

Do you prefer
        Moose or SoftwareAnalysisPlatform
        Glamour or BrowserFlowFramework
        Pier or SeasideApplicationCMS
        Seaside or dynamicWebFramework
        Citizen or the fuckingLibToParseBibUglyEntries....
        Mondrian or theDrawingAPIToOnlyDrawBoxesAndArrows

I prefer the first. So let us be a bit clever: have a nice name and an explanation.

Look at VW: they have VisualWorks Web Server.... I proposed a long time ago that they call it Sioux (because of Blackfoot (smart guys), Apache, ....).


Stef


On Nov 16, 2010, at 3:44 PM, Johan Fabry wrote:

>
> A name that is not an explanation makes it hard for the user to know what the thing is or does. I have many problems with the Moose / Pharo / Squeak / ... community in this respect: I am sure there are a lot of cool tools and technologies out there, and I see names flying around, but I have no idea what they are.
>
> Apache gets away with it because it is world famous. Once your tool is world famous the name does not matter anymore.
>
> OK, name your stuff however you want, but in that case the community should also put a glossary online so that people like me can find their way.
>



Re: The future of Opax in XML-Support

Guillermo Schwarz
That's a great explanation of what everything does.

You should consider putting it in a wiki.

Regards,
Guillermo Schwarz.

On 16-11-2010, at 13:16, Stéphane Ducasse
<[hidden email]> wrote:

> This is why you can put a little explanation beside the name.
>
> Do you prefer
>    Moose or SoftwareAnalysisPlatform
>    Glamour or BrowserFlowFramework
>    Pier or SeasideApplicationCMS
>    Seaside or dynamicWebFramework
>    Citizen or the fuckingLibToParseBibUglyEntries....
>    Mondrian or theDrawingAPIToOnlyDrawBoxesAndArrows
>
> I prefer the first. So let us be a bit clever: have a nice name and
> an explanation.
>
> Look at VW: they have VisualWorks Web Server.... I proposed a long
> time ago that they call it Sioux (because of Blackfoot (smart guys),
> Apache, ....).
>
>
> Stef
>
>


Re: The future of Opax in XML-Support

jfabry
In reply to this post by Stéphane Ducasse

Yes, a glossary is what we need!

On 16 Nov 2010, at 13:16, Stéphane Ducasse wrote:

> This is why you can put a little explanation beside the name.
>
> Do you prefer
> Moose or SoftwareAnalysisPlatform
> Glamour or BrowserFlowFramework
> Pier or SeasideApplicationCMS
> Seaside or dynamicWebFramework
> Citizen or the fuckingLibToParseBibUglyEntries....
> Mondrian or theDrawingAPIToOnlyDrawBoxesAndArrows
>
> I prefer the first. So let us be a bit clever: have a nice name and an explanation.
>
> Look at VW: they have VisualWorks Web Server.... I proposed a long time ago that they call it Sioux (because of Blackfoot (smart guys), Apache, ....).
>
>
> Stef

--
Johan Fabry  
[hidden email] - http://dcc.uchile.cl/~jfabry
PLEIAD Lab - Computer Science Department (DCC) - University of Chile





Re: The future of Opax in XML-Support

Tudor Girba
There is one:
http://www.moosetechnology.org/tools

Cheers,
Doru


On 16 Nov 2010, at 17:36, Johan Fabry wrote:

>
> Yes, a glossary is what we need!
>

--
www.tudorgirba.com

"Be rather willing to give than demanding to get."





Re: The future of Opax in XML-Support

jaayer
In reply to this post by Tudor Girba




---- On Tue, 16 Nov 2010 02:27:27 -0800 Tudor Girba  wrote ----

>Hi,
>
>On 15 Nov 2010, at 15:29, jaayer wrote:
>
>>
>> ---- On Mon, 15 Nov 2010 00:45:04 -0800 Tudor Girba wrote ----
>>
>>> Hi,
>>>
>>> Thanks for this nice overview. I am happy that XML handling gets a bit more traction in Smalltalk :). I took a quick look and your XMLPluggableElementFactory sounds quite interesting, and it's great that it supports namespaces.
>>>
>>> Regarding Opax, your analysis is not quite right.
>>> - You do not need to subclass the OPOpaxHandler.
>>
>> Really? So if I have a pre-existing SAX parser, say SVGSAXParser, there is a way to make it support Opax-like functionality without changing its superclass to be that of OPOpaxHandler?
>
>:). No, and you are not supposed to. The reason for subclassing the SAXHandler is to accommodate the stream of XML nodes in methods like startElement:... . The OPOpaxHandler overrides these methods and creates corresponding nodes and dispatches to them the handling.

Looking back at my earlier posts, I realize now that I wasn't clear. I understand you can use OPOpaxHandler directly without subclassing it, and that you really only need to subclass OPGenericElement and override #xmlTags for the magic to happen. What I was referring to was a situation where you wanted to extend another, already-existing SAX parser to have Opax-like functionality. Take a hypothetical pre-existing SAXDocBookParser as an example: you would have to change it to inherit from OPOpaxHandler rather than from SAXHandler (or whichever SAXHandler subclass it presently inherits from), and you would probably need to override the handlers and send super from them to make it work with Opax properly.

Another potential problem with Opax is that it blindly enumerates *all* subclasses of OPGenericElement, inspecting the #xmlTags collection of each to find one that expresses interest in handling a particular element. Besides the lack of caching and the performance degradation this will cause as more subclasses of OPGenericElement are added, there is the issue of conflict resolution. What if you add a subclass of OPGenericElement named UsernameElement whose #xmlTags collection contains "name," and then I add a NameElement class whose #xmlTags collection also contains "name"? Which class should be used? And what if I don't add my own NameElement class, or any class with "name" in its #xmlTags collection, because I want "name" elements ignored altogether? They'll still get handled anyway, because your UsernameElement class is still there and its #xmlTags collection still contains "name."
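
The conflict can be shown with a small language-neutral sketch. This is Python, not Smalltalk, and it does not reproduce the actual Opax implementation. GenericElement and xml_tags stand in for OPGenericElement and #xmlTags, and ExplicitRegistry is a hypothetical per-parser mapping in the spirit of the pluggable factory. With global enumeration, every loaded subclass that lists a tag becomes a candidate; with an explicit mapping, only what a given parser was told to handle is handled.

```python
class GenericElement:
    xml_tags = ()            # tag names this class volunteers to handle


class UsernameElement(GenericElement):
    xml_tags = ("name",)


class NameElement(GenericElement):
    xml_tags = ("name",)


def all_subclasses(cls):
    """Recursively collect every subclass of cls."""
    found = []
    for sub in cls.__subclasses__():
        found.append(sub)
        found.extend(all_subclasses(sub))
    return found


def candidates_by_enumeration(tag):
    """Global lookup: any loaded subclass listing the tag is a candidate."""
    return [c for c in all_subclasses(GenericElement) if tag in c.xml_tags]


class ExplicitRegistry:
    """Per-parser mapping: only tags registered here are handled."""

    def __init__(self):
        self._classes = {}

    def handle(self, tag, cls):
        self._classes[tag] = cls
        return self           # allow chained registration

    def class_for(self, tag, default=None):
        return self._classes.get(tag, default)
```

Here candidates_by_enumeration("name") yields both classes, so something has to pick a winner, while a fresh ExplicitRegistry returns nothing for "name" until someone registers a class for it.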

>>> - The goal of Opax is not to replace DOM, but to enhance SAX. It's true that at the moment it still creates a tree, but this should be changed to make it optional. The original idea of Opax was to dispatch everything, including the factory decision to the Element, but the implementation remained behind the wishes.
>>
>> To be perfectly honest with you, I did not before nor do I now fully understand what Opax is supposed to do. I understand that at the very least it involves mapping elements in an XML document to different kinds of objects, but how it is ultimately supposed to go about doing this remains unclear and appears to still be in flux.
>
>It's the same as with your XMLDOMParser: you do not subclass it, you just parameterize it. In your case, the parametrization is quite nice.
>
>>> - Opax is tiny: 3 classes + 4 test classes
>>
>> True, but it takes up two top-level class categories and still adds more weight to the package, and by your own admission it stands to only get bigger.
>>
>>> - OPGenericElement should simply be made a subclass of XMLElement, and we would have the compatibility we would need.
>>
>> Right, but then it would be a DOM node, and you said you wanted Opax to avoid DOM, or at least the DOM parser.
>
>Yes and no. It would be a DOM node, but this does not mean that we have to store all of them in a tree if I do not need them.
>
>>> - I do not see the reasons why DOM should be preferred to SAX. The problem with DOM is that it always creates XML elements :). When you have large XML files, you often do not want to load them, but just to process them directly. This is the goal of SAX, but then SAX is procedural. Opax should be used to transform SAX into an object-oriented handling.
>>
>> So the goal is something that only produces objects for certain portions of a document, but ignores the rest? I think this could be better built on top of the DOM parser, perhaps as a partial DOM parser.
>
>With the new Factory, the DOMParser is similar to what the OPOpaxHandler is doing. The difference is that the Factory is doing the mapping, while in Opax the mapping is done on the class side of the element.
>
>>> Instead of removing it, I would suggest a different approach. Let's make it focus on the SAX parsing:
>>> - We could easily get it to use the XMLNodeFactory
>>> - We could subclass OPGenericElement from XMLElement.
>>
>> I think an approach that used more metaprogramming and dependency injection rather than inheritance would be better. Maybe something that uses reflection to query injected classes to be used for elements and then fills their instance variables based on the names of those variables and the names of the child elements and attributes that the elements the class has been mapped to contain. In other words, you wouldn't need to subclass OPGenericElement OR XMLElement; just have instance variables in the injected class with names matching, roughly, the attribute and child element names of the elements the class has been mapped to. You could also support explicit conversion instructions. For example, something that could be told to map "timestamp" elements to the DateAndTime class and to convert their content using fromString:.
>
>I do not see what would be gained with this because I do not see for what else I could use these classes. In any case, I would not go the path of playing with instance variables as long as there are simpler ways.
>
>All in all, I think the DOM is starting to do a good job of creating a tree. I would propose moving Opax towards an on-the-fly analysis of the parsed tree without storing it (most of the time you basically only need the current stack, i.e. the path to the root).
>
>Cheers,
>Doru