Hi,

during my work on the XML-Mapping framework SimpleXO, I realized that the
XML querying code could be useful standalone, too. I factored out a
library named SimpleXPath and made it available in the Cincom public
repository under the MIT license. It is similar to the XPath location path
subset (without predicates) but offers some distinct features:

- paths are built as pure Smalltalk expressions
- extended wildcard support
- simple API

Example:
    (RootStep // 'source' /@ 'id')    "XPath: //source/@id"
        contextNode: anXmlNode;
        nodesDo: [:node | Transcript show: node stringValue; cr].

The above code prints the 'id' value of all 'source' elements in the XML
document from which anXmlNode is taken.

I am interested in your opinions. I'd be glad if you gave it a try and
discussed your thoughts here. Below I've attached the package comment
explaining the API, just in case. ;)

Regards and happy coding!
Steffen


Simple XPath is an XML query library based on a subset of the XPath 1.0
language. It provides a handy API to construct paths and a parser for
abbreviated XPath location paths without predicates.

See also: http://www.w3.org/TR/xpath/.

I. NodeSets
-----------------
The result of constructing a path or parsing an XPath location path is a
NodeSet. If applied to an XML node, a NodeSet provides access to the nodes
selected by this set.
1. Call #contextNode: to define the node a NodeSet is applied to.
2. Call
    #nodes to get a set of all matched nodes,
    #nodesDo: with a one-argument block to iterate over all matched nodes and
    #selectNodes: with a one-argument block to select some of the matched nodes.
If you are working with tags that have prefixed names, ensure that you
resolve the associated namespaces before using a NodeSet.
Call #resolveNamespaces: with a dictionary that maps all prefixes to
their namespaces.

II. Path construction API:
------------------------------------
To construct a path programmatically, use the Axis classes and the methods
from the protocol "path construction".
1. Single path steps:
    ChildAxis ? 'name'.    "select all child nodes tagged with 'name'"
    ChildAxis ? ('prefix' + 'name').    "select all child nodes tagged with 'prefix:name'"
    AttributeAxis ? 'id'.    "select all attribute nodes tagged with 'id'"

    SelfAxis ? AnyNodeTest.    "select the context node itself"
    DescendantOrSelfAxis ? CommentTest.    "select all descendant comment nodes"

2. Concatenate steps with #/ :
    (ChildAxis ? 'name') / (ChildAxis ? ('second' + 'name')).
    (ChildAxis ? AnyNodeTest) / (AttributeAxis ? 'id').

    "Often, the axis can be omitted:"
    'name' / ('second' + 'name').    "same as"
    (ChildAxis ? 'name') / (ChildAxis ? ('second' + 'name')).
    AnyNodeTest / (AttributeAxis ? 'id').    "same as"
    (ChildAxis ? AnyNodeTest) / (AttributeAxis ? 'id').

    "Similar to XPath, #/@, #// and #//@ abbreviate attribute and descendant-or-self steps:"
    AnyNodeTest /@ 'id'.    "same as"
    (ChildAxis ? AnyNodeTest) / (AttributeAxis ? 'id').
    'name' // CommentTest.    "same as"
    (ChildAxis ? 'name') / (DescendantOrSelfAxis ? AnyNodeTest) / (ChildAxis ? CommentTest).
    'name' //@ 'id'.    "same as"
    (ChildAxis ? 'name') / (DescendantOrSelfAxis ? AnyNodeTest) / (AttributeAxis ? 'id').

3. Query from the document root with a RootStep:
    RootStep // AnyNodeTest.    "all nodes"
    RootStep //@ 'id'.    "id of each node"

4. Create the union of two NodeSets with #| :
    (RootStep // 'element') | (RootStep // CommentTest).

    "#\@ abbreviates the union with an attribute step:"
    CommentTest \@ 'id'.    "same as"
    (ChildAxis ? CommentTest) | (AttributeAxis ? 'id').

5. The wildcards # and * match single and multiple characters in local tag
names:
    ChildAxis ? 'name_##'.    "selects e.g. <name_01 />"
    AttributeAxis ? '*_id'.    "selects e.g. ... svg_id='0x5' ..."
    "NOTE: XPath allows * only for the whole tag name, e.g. //prefix:* "

III. Parser API:
--------------------
To parse an abbreviated XPath location path, use SimpleXPathParser.
However, predicate expressions are not supported.
Call
    #parseString: with the XPath string to parse that string and obtain a NodeSet and
    #validateString: to check whether the string is free of syntax errors.
If parsing fails, a SyntaxError is raised that gives the error position
and a brief description.
_______________________________________________
vwnc mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/vwnc
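To make the package comment above a bit more concrete, here is a minimal sketch that combines the path construction API with the parser API. It is not taken from the library's own documentation. It assumes that anXmlNode is an already parsed XML node (as in the example in the post), that #parseString: is sent to the SimpleXPathParser class (the comment does not say whether the method is on the class or the instance side), and that SyntaxError is visible under that name in the current namespace.

```smalltalk
"Sketch only: anXmlNode is assumed to be an already parsed XML node.
 Sending #parseString: to the class itself is an assumption."
| built parsed |

"Build the path //source/@id as a plain Smalltalk expression ..."
built := RootStep // 'source' /@ 'id'.

"... or parse the equivalent abbreviated XPath string.
 A SyntaxError is handled by logging it and answering nil."
parsed := [SimpleXPathParser parseString: '//source/@id']
	on: SyntaxError
	do: [:ex |
		Transcript show: 'Parse failed: ', ex description; cr.
		ex return: nil].

"Apply each NodeSet to the same context node and print the matches."
built contextNode: anXmlNode.
built nodesDo: [:node | Transcript show: node stringValue; cr].

parsed isNil ifFalse: [
	parsed contextNode: anXmlNode.
	parsed nodesDo: [:node | Transcript show: node stringValue; cr]].
```

If the two forms are equivalent, as the "XPath: //source/@id" comment in the post suggests, both loops should print the same 'id' values.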
Good idea!
Stef

On Dec 5, 2011, at 2:29 PM, Steffen Märcker wrote:
> I factored out a library named SimpleXPath and made it available in Cincom
> public repository under the MIT license.
Hello again,
I've pushed a new version of SimpleXPath that fixes a serious bug. Previously,
a Path/NodeSet could enumerate the same node twice. This is not allowed by
the spec, since NodeSets are sets. I highly recommend updating to this
version. This affects SimpleXO as well, since duplicate nodes may cause an
incorrect XML-to-object mapping.

Regards,
Steffen
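For anyone who wants to check their installed version, a rough sketch of a duplicate test follows. The post does not say which paths triggered the bug, so the union of two overlapping paths used here is only one plausible case, chosen for illustration. anXmlNode is again assumed to be an already parsed XML node; everything else uses only the messages named in the package comment plus standard collection classes.

```smalltalk
"Sketch only: anXmlNode is assumed to be an already parsed XML node."
| path seen duplicates |

"Two overlapping paths: every 'source' element is matched by both operands."
path := (RootStep // 'source') | (RootStep // AnyNodeTest).
path contextNode: anXmlNode.

"Count how often #nodesDo: hands out a node that was already seen."
seen := IdentitySet new.
duplicates := 0.
path nodesDo: [:node |
	(seen includes: node)
		ifTrue: [duplicates := duplicates + 1]
		ifFalse: [seen add: node]].

Transcript show: 'Nodes enumerated twice: ', duplicates printString; cr.
```

Since NodeSets are sets, a conforming version should report no node more than once.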