Hi -

I just spent about two hours staring at code because of an oddity in the XML parser's printing of nodes. Here's an example:

  node:= (XMLElement new) name: 'foo';
    addContent: (XMLStringNode string: 'Hello World');
    setAttributes: (Dictionary new);
    yourself.

This prints '<foo>Hello World</foo>' which is fine. However, the following construction, which adds just a single attribute:

  node:= (XMLElement new) name: 'foo';
    addContent: (XMLStringNode string: 'Hello World');
    setAttributes: (Dictionary newFromPairs: {#id. 1});
    yourself.

prints now as '<foo id="1"/>' (i.e., losing its content string). Looking at the code in XMLElement>>printXmlOn: it does something weird if the writer is considered "non-canonical", i.e.,

  "... snip ..."
  (writer canonical not
    and: [self isEmpty and: [self attributes isEmpty not]])
      ifTrue: [writer endEmptyTag: self name]
  "... snap ..."

Two questions about this:

1) What's the meaning of 'canonical' XML? Is this a well-defined (sub-)set of XML? If so, where can I read about it?

2) Is the above a bug or a feature? I'm wondering in particular about XMLElement>>isEmpty which only considers the elements but not eventual contents.

Any help is greatly welcome.

Cheers,
  - Andreas

On 10.08.2010, at 21:21, Andreas Raab wrote:

> Hi -
>
> I just spent about two hours staring at code because of an oddity in the XML parser's printing of nodes. Here's an example:
>
> node:= (XMLElement new) name: 'foo';
>   addContent: (XMLStringNode string: 'Hello World');
>   setAttributes: (Dictionary new);
>   yourself.
>
> This prints '<foo>Hello World</foo>' which is fine. However, the following construction, which adds just a single attribute:
>
> node:= (XMLElement new) name: 'foo';
>   addContent: (XMLStringNode string: 'Hello World');
>   setAttributes: (Dictionary newFromPairs: {#id. 1});
>   yourself.
>
> prints now as '<foo id="1"/>' (i.e., losing its content string). Looking at the code in XMLElement>>printXmlOn: it does something weird if the writer is considered "non-canonical", i.e.,
>
> "... snip ..."
> (writer canonical not
>   and: [self isEmpty and: [self attributes isEmpty not]])
>     ifTrue: [writer endEmptyTag: self name]
> "... snap ..."
>
> Two questions about this: 1) What's the meaning of 'canonical' XML? Is this a well-defined (sub-)set of XML? If so, where can I read about it? 2) Is the above a bug or a feature? I'm wondering in particular about XMLElement>>isEmpty which only considers the elements but not eventual contents.
>
> Any help is greatly welcome.
>
> Cheers,
> - Andreas

Sounds like #isEmpty is buggy, it certainly should look at both contents and elements. And "canonical" may mean that there are no empty "shorthand" tags but always an opening and closing tag.

- Bert -

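The change suggested above could look roughly like the following method on XMLElement. This is only a sketch, not the parser's actual source: it assumes that XMLElement responds to #elements and #contents and that both answer collections, which the thread implies but does not show.

```smalltalk
isEmpty
	"Sketch of a possible replacement for XMLElement>>isEmpty.
	 Assumption: #elements and #contents both answer collections.
	 Answer true only if the receiver has neither child elements
	 nor content nodes (such as string nodes)."
	^ self elements isEmpty and: [self contents isEmpty]
```

With a definition along these lines, an element that holds only a string node would no longer count as empty, so the second example from the original post should no longer take the endEmptyTag: branch in printXmlOn:.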
On 8/10/2010 12:33 PM, Bert Freudenberg wrote:
> Sounds like #isEmpty is buggy, it certainly should look at both contents and elements. And "canonical" may mean that there are no empty "shorthand" tags but always an opening and closing tag.

Good theory, but it seems that in that case the test that says:

  (writer canonical not
    and: [self isEmpty and: [self attributes isEmpty not]])

could be shortened to just

  (writer canonical not
    and: [self isEmpty])

no? I mean why would it matter if the list of attributes is empty or not? The way it's right now, you get:

  node:= (XMLElement new) name: 'foo';
    setAttributes: (Dictionary new);
    yourself.

  => '<foo></foo>'

even when running 'non-canonical' (due to 'self attributes isEmpty not' failing).

Cheers,
  - Andreas

On 10.08.2010, at 22:08, Andreas Raab wrote:

> On 8/10/2010 12:33 PM, Bert Freudenberg wrote:
>> Sounds like #isEmpty is buggy, it certainly should look at both contents and elements. And "canonical" may mean that there are no empty "shorthand" tags but always an opening and closing tag.
>
> Good theory, but it seems that in that case the test that says:
>
> (writer canonical not
>   and: [self isEmpty and: [self attributes isEmpty not]])
>
> could be shortened to just
>
> (writer canonical not
>   and: [self isEmpty])
>
> no? I mean why would it matter if the list of attributes is empty or not? The way it's right now, you get:
>
> node:= (XMLElement new) name: 'foo';
>   setAttributes: (Dictionary new);
>   yourself.
>
> => '<foo></foo>'
>
> even when running 'non-canonical' (due to 'self attributes isEmpty not' failing).
>
> Cheers,
> - Andreas

I can't see the canonicalization having anything to do with attributes being present or not:

http://www.w3.org/TR/xml-c14n#Example-SETags

- Bert -