
[squeak-dev] XML DOM parser which properly handles namespaces?


cdavidshaffer

[squeak-dev] XML DOM parser which properly handles namespaces?

I'm currently using XML-Parser-mir.14 from XMLSupport on Squeak Source.
It doesn't seem to handle namespaces correctly though. When using

XMLDOMParser parseDocumentFrom: aStream useNamespaces: true

it produces a DOM which expands unqualified tag names with their
namespace but leaves qualified tag names alone. For example:

<feed xmlns="http://www.w3.org/2005/Atom"
xmlns:openSearch="http://a9.com/-/spec/opensearch/1.1">
<author>....</author>
<openSearch:searchResults>...</openSearch:searchResults>
</feed>

the DOM would look like:

http://www.w3.org/2005/Atom:feed <---- this should use a # instead of a : but I can live with that
http://www.w3.org/2005/Atom:author
openSearch:searchResults

Note that openSearch is not expanded. Also, qualified attribute names
are not expanded. I looked at the VisualWorks parser in VW7.6 and it
handles this properly by allowing tags and attributes to have both a
#name and an #expandedName. Unfortunately, I can't get the VWXML parser to
load in Squeak (not to mention the fact that the port looks quite old).
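
For comparison, a namespace-aware parser reports every element under its expanded name, whether or not the source used a prefix. The sketch below uses Python's standard xml.etree.ElementTree purely as an illustration; it is not the Squeak or VisualWorks API, and it writes expanded names as {IRI}local rather than with a : or a #. The element contents ("a" and "b") are placeholders, not data from the feed above.

```python
import xml.etree.ElementTree as ET

# Same shape as the feed above; element contents are placeholders.
doc = (
    '<feed xmlns="http://www.w3.org/2005/Atom" '
    'xmlns:openSearch="http://a9.com/-/spec/opensearch/1.1">'
    '<author>a</author>'
    '<openSearch:searchResults>b</openSearch:searchResults>'
    '</feed>'
)

root = ET.fromstring(doc)

# iter() walks the tree in document order, starting with the root element.
expanded_tags = [element.tag for element in root.iter()]

for tag in expanded_tags:
    print(tag)
```

Both the unprefixed elements and the prefixed openSearch element come back with their namespace IRI attached, which is the behaviour missing from the DOM shown above.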

Anyone else wrestling with this?

David

Michael Rueger-6

Re: [squeak-dev] XML DOM parser which properly handles namespaces?

C. David Shaffer wrote:
> I'm currently using XML-Parser-mir.14 from XMLSupport on Squeak Source.
> It doesn't seem to handle namespaces correctly though. When using

The namespace handling was more "make it parse", not so much a full
implementation.

> it produces a DOM which expands unqualified tag names with their
> namespace but leaves qualified tag names alone. For example:
...

> handles this properly by allowing tags and attributes to have both a
> #name and an #expandedName unfortunately I can't get the VWXML parser to

Would that be what you would like to get (DOM with both name and
expandedName entries)?

I'm not sure I'll find the time right now to implement this, but if
anyone feels like diving into it? :-)

Michael

cdavidshaffer

Re: [squeak-dev] XML DOM parser which properly handles namespaces?

Michael Rueger wrote:
> Would that be what you would like to get (DOM with both name and
> expandedName entries)?
>
> I'm not sure I'll find the time right now to implement this, but if
> anyone feels like diving into it? :-)
I'm not sure what other people expect. Right now attributes are stored
in a string -> string dictionary. Moving to an XMLAttribute would confuse a lot of
code. So, for attributes we're stuck with either expanding the name
(useNamespaces == true) or not (useNamespaces == false). It makes sense
to follow this convention with Nodes as well (i.e., tag names).

I think that I can make the fixes:

expand all tag names if useNamespaces == true
expand all attribute names if useNamespaces == true

I'll keep the convention of using $: to separate the namespace IRI from
the tag/attribute name since someone might depend on that. I looked
through the XML namespaces spec and it never talks about the "expanded
name". All namespaces must be declared and the only valid way to refer
to them is through the abbreviated name. Still, this seems contrary to
the convention used in Schema. For example, if I have some XML schema
and I want to refer to some type in it I would use the form IRI#type.
Anyway...
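
A minimal sketch of the two fixes, written in Python only to make the rule concrete; expand_name, the bindings dictionary, and the sample attributes (href, openSearch:role) are all hypothetical and are not part of XML-Parser. It keeps the : separator and the string -> string attribute storage. One assumption differs from "expand all attribute names": under the XML namespaces recommendation the default namespace does not apply to unprefixed attributes, so the sketch leaves those alone. It also ignores per-element scoping of declarations and simply returns names with unbound prefixes unchanged.

```python
ATOM = "http://www.w3.org/2005/Atom"
OPENSEARCH = "http://a9.com/-/spec/opensearch/1.1"

# Prefix -> namespace IRI; the empty prefix stands for the default namespace.
bindings = {"": ATOM, "openSearch": OPENSEARCH}


def expand_name(qualified_name, bindings, is_attribute=False, separator=":"):
    """Return IRI + separator + local name, or the name unchanged if unbound."""
    prefix, _, local_name = qualified_name.rpartition(":")
    if is_attribute and not prefix:
        # Assumption: unprefixed attributes are in no namespace.
        return local_name
    iri = bindings.get(prefix)
    if iri is None:
        # No default namespace, or an undeclared prefix: leave the name alone.
        return qualified_name
    return iri + separator + local_name


# Tag names: both unprefixed and prefixed names are expanded.
expanded_tags = [
    expand_name(name, bindings)
    for name in ("feed", "author", "openSearch:searchResults")
]

# Attributes stay a plain string -> string mapping; only the keys change.
attributes = {"href": "x", "openSearch:role": "y"}
expanded_attributes = {
    expand_name(name, bindings, is_attribute=True): value
    for name, value in attributes.items()
}
```

Passing separator="#" would give the IRI#type form used when referring to Schema types, without touching the callers that depend on the : convention.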

David