Expat XML Parser extension


Robin Redeker-2
Hi!

I just wanted to inform you that I've been writing an Expat
extension for GNU Smalltalk today. I'll add some more documentation,
finish it up and send a patch within the next week.

It's mainly a wrapper around the C API of Expat which emits SAX events,
which can be processed by the existing XML.SAXDriver implementation.

Expat is a stream parser, and the Smalltalk interface does allow parsing
of non-positionable streams such as sockets.
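
To illustrate what stream parsing means here, the following is a minimal
sketch in Python, whose standard library ships its own Expat binding
(xml.parsers.expat). It is not the GNU Smalltalk extension described in this
message; the function name parse_chunks and the sample chunks are made up for
the example. Input is fed in arbitrary pieces, as it might arrive from a
socket, and the parser never needs to seek backwards.

```python
import xml.parsers.expat


def parse_chunks(chunks):
    """Feed byte chunks to Expat as they arrive; return element names in start order."""
    names = []
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = lambda name, attrs: names.append(name)
    for chunk in chunks:
        parser.Parse(chunk, False)  # False: more data may follow
    parser.Parse(b"", True)         # True: signal end of input
    return names


# Hypothetical chunk boundaries that fall in the middle of tags.
incoming = [b"<root><it", b"em id='1'/><ite", b"m id='2'/></ro", b"ot>"]
element_names = parse_chunks(incoming)
```

Because each call only hands over the next piece of data, the source can be
any object that yields bytes in order.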

The good thing about the Expat parser is that it supports newer SAX
events (not yet implemented in XML.st) which are defined by the
latest SAX extensions (such as CDATA start/end and DTD start/end
detection, etc.).
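
As a rough illustration of those extra events, here is a small sketch using
Python's standard Expat binding (xml.parsers.expat), not the Smalltalk
interface from this message. The event labels (startDTD, endDTD, startCDATA,
endCDATA) follow the SAX2 LexicalHandler naming; the function name
collect_events and the sample document are assumptions made for the example.

```python
import xml.parsers.expat


def collect_events(document):
    """Return the DTD and CDATA boundary events Expat reports for a document."""
    events = []
    parser = xml.parsers.expat.ParserCreate()
    # Expat reports the start and end of the DOCTYPE declaration ...
    parser.StartDoctypeDeclHandler = (
        lambda name, sysid, pubid, has_internal_subset:
            events.append(("startDTD", name)))
    parser.EndDoctypeDeclHandler = lambda: events.append(("endDTD",))
    # ... and the start and end of each CDATA section.
    parser.StartCdataSectionHandler = lambda: events.append(("startCDATA",))
    parser.EndCdataSectionHandler = lambda: events.append(("endCDATA",))
    parser.Parse(document, True)
    return events


sample = (b"<!DOCTYPE note [<!ELEMENT note (#PCDATA)>]>"
          b"<note><![CDATA[1 < 2]]></note>")
lexical_events = collect_events(sample)
```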

The bad thing about Expat is that it requires yet another C library.

Also, the current interface returns String objects for all strings. Those
Strings hold multibyte-encoded (UTF-8) Unicode. But writing a subclass
of the existing interface which decodes them to UnicodeString won't be a
big problem if someone needs it.

The Expat parser also parses encoded byte streams (UTF-8, ISO-8859-1,
ASCII or UTF-16) rather than Unicode strings. The current XML.XMLParser
implementation seems unable to parse byte streams, only Unicode
strings (at least I wasn't able to feed it a UTF-8 encoded
document string).
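
The byte-stream behaviour can be sketched with Python's standard Expat
binding (xml.parsers.expat). This is an illustration only and differs from the
Smalltalk interface in one respect: the Python binding hands decoded str
objects to its handlers, whereas the interface described above returns
UTF-8 encoded Strings. The function name parse_text and the sample documents
are assumptions made for the example.

```python
import xml.parsers.expat


def parse_text(document):
    """Parse an encoded byte string and return its concatenated character data."""
    pieces = []
    parser = xml.parsers.expat.ParserCreate()
    parser.buffer_text = True  # deliver adjacent character data in fewer calls
    parser.CharacterDataHandler = pieces.append
    parser.Parse(document, True)
    return "".join(pieces)


text = "Gr\u00fc\u00dfe"  # contains non-ASCII characters

# The same document serialized in three encodings; Expat picks the decoder
# from the XML declaration (and the byte order mark for UTF-16).
latin1_doc = ('<?xml version="1.0" encoding="ISO-8859-1"?><t>%s</t>' % text).encode("iso-8859-1")
utf8_doc = ('<?xml version="1.0" encoding="UTF-8"?><t>%s</t>' % text).encode("utf-8")
utf16_doc = ('<?xml version="1.0" encoding="UTF-16"?><t>%s</t>' % text).encode("utf-16")
```

All three byte strings differ on the wire, yet the parser reports the same
character data for each.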

Open questions with regard to the implementation are:

How and when do I correctly free the allocated Expat C parser struct?
I'm currently storing the XML_Parser C struct as an OOP in the Smalltalk
interface object (similar to the zlib implementation).

But how do I ensure that the free function for it is called when my
interface object is destroyed?

I also put the package in packages/expat/ and used the zlib/ stuff as
a skeleton for the build system. Is that okay?

I've also yet to discover how to create a patchset against the current
tla tree. But I guess I'll just have to read the manual again ;-/ (Git
was a bit easier IMO :)



Robin


_______________________________________________
help-smalltalk mailing list
[hidden email]
http://lists.gnu.org/mailman/listinfo/help-smalltalk

Re: Expat XML Parser extension

Jan Vrany-2
Hi, I already wrote an Expat binding for Smalltalk/X.
It does not support encodings, and it parses internal
streams as well as external ones (files & sockets :-).
Maybe you can reuse some know-how from my (very simple) code.
Good luck :-)

Jan



On Sun, 2007-09-23 at 22:23 +0200, Robin Redeker wrote:

> [...]


Attachment: XMLv2__ExpatXMLReader.st (19K)

Re: Expat XML Parser extension

Robin Redeker-2
Hi!

On Sun, Sep 23, 2007 at 10:32:33PM +0200, Jan Vrany wrote:
> Hi, I already wrote Expat binding for Smalltalk/X.
> It does not support encodings and parses internal
> streams as well as external (files & sockets :-).
> Maybe you can reuse some know-how from my (very simple) code.

Thanks! But my bindings are finished now anyways :)


Robin

