I sent Mariano an update that merges upstream support for the latest XMLParser into the GS fork. I did have to change the representations of chars and strings somewhat. XML (P)CDATA can only contain certain characters:

Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]

and Dale chose a compact but malformed representation for multibyte chars, outputting a '$' followed by each byte (big endian) of the code point as a byte char. But while #x100 (256) is a valid Char (in the #x20-#xD7FF range), #x01 and #x00 aren't, so representing it with '$' followed by #x01 and #x00 is not well-formed.

I also removed erroneous sends of #encodeAsUTF8. This needs to be done at the document-level, not at the level of element content.

To encode output, you do 'myObjectGraph sixxString encodeAsUTF8' (or 'XMLUTF8StreamConverter encode: myObjectGraph sixxString'). Or better, give a #sixxOn:* message a write stream (or GsFile) wrapped with an encoder so it will encode its output as UTF-8. I gave Mariano some examples.

Decoding is done automatically by newer XMLParser versions when parsing. SIXX does no encoding/decoding itself.

And since the recent package reorganization, XMLParser and related projects can be stably upgraded (or downgraded) within the same stone, but there might still be write conflicts, so it's not recommended.

_______________________________________________
Glass mailing list
[hidden email]
http://lists.gemtalksystems.com/mailman/listinfo/glass
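For readers who want to check the Char argument in the message above, here is a minimal sketch in Python (used only as a neutral illustration language; the thread itself concerns Smalltalk). The ranges are copied from the Char production quoted in the message. The names `is_xml_char`, `big_endian_bytes`, and `old_form_is_well_formed` are hypothetical helpers, and the byte-splitting is only a model of the '$' plus big-endian bytes form as the message describes it, not the actual SIXX code.

```python
# Ranges copied from the XML Char production quoted in the message.
XML_CHAR_RANGES = (
    (0x9, 0x9),
    (0xA, 0xA),
    (0xD, 0xD),
    (0x20, 0xD7FF),
    (0xE000, 0xFFFD),
    (0x10000, 0x10FFFF),
)


def is_xml_char(code_point):
    """Return True if code_point matches the Char production."""
    return any(lo <= code_point <= hi for lo, hi in XML_CHAR_RANGES)


def big_endian_bytes(code_point):
    """Split a code point into its big-endian bytes.

    Hypothetical model of the old representation, which the message
    describes as '$' followed by each byte of the code point.
    """
    length = max(1, (code_point.bit_length() + 7) // 8)
    return list(code_point.to_bytes(length, "big"))


def old_form_is_well_formed(code_point):
    """True only if every emitted byte char is itself a valid Char."""
    return all(is_xml_char(b) for b in big_endian_bytes(code_point))


# #x100 is a valid Char, but its bytes #x01 and #x00 are not.
print(is_xml_char(0x100))               # True
print(big_endian_bytes(0x100))          # [1, 0]
print(old_form_is_well_formed(0x100))   # False
```

The last check fails for the reason given in the message: a valid code point can split into bytes that fall outside every allowed range.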
Monty,
Thanks for working on this. Despite my late reply, I do appreciate the work that you are doing.

In reading this email, I am reminded that I owe you responses on several issues ... I've run out of time today and there are some open Rowan issues that I need to start addressing ... So I'll come back to addressing the issues that I owe you tomorrow morning ...

Dale

On 03/18/2018 09:01 PM, monty via Glass wrote:
> I sent Mariano an update that merges upstream support for the latest XMLParser into the GS fork. I did have to change the representations of chars and strings somewhat. XML (P)CDATA can only contain certain characters:
> Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
>
> and Dale chose a compact but malformed representation for multibyte chars, outputting a '$' followed by each byte (big endian) of the code point as a byte char. But while #x100 (256) is a valid Char (in the #x20-#xD7FF range), #x01 and #x00 aren't, so representing it with '$' followed by #x01 and #x00 is not well-formed.
>
> I also removed erroneous sends of #encodeAsUTF8. This needs to be done at the document-level, not at the level of element content.
>
> To encode output, you do 'myObjectGraph sixxString encodeAsUTF8' (or 'XMLUTF8StreamConverter encode: myObjectGraph sixxString'). Or better, give a #sixxOn:* message a write stream (or GsFile) wrapped with an encoder so it will encode its output as UTF-8. I gave Mariano some examples.
>
> Decoding is done automatically by newer XMLParser versions when parsing. SIXX does no encoding/decoding itself.
>
> And since the recent package reorganization, XMLParser and related projects can be stably upgraded (or downgraded) within the same stone, but there might still be write conflicts, so it's not recommended.
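The point in the quoted message about encoding at the document level rather than per element can be illustrated with a small, language-neutral sketch in Python. It rests on an assumption that is not stated in the thread: that content encoded early ends up as one char per byte and is then encoded again along with the rest of the document. The function names and the `<v>` element are hypothetical, and this is not the SIXX or XMLParser code.

```python
def encode_document(text):
    """Encode the finished document once, at the outermost level."""
    return text.encode("utf-8")


def encode_content_then_document(content):
    """Assumed failure mode: encode the element content first, treat the
    resulting bytes as one char each, then encode the whole document."""
    pre_encoded = content.encode("utf-8").decode("latin-1")
    return ("<v>" + pre_encoded + "</v>").encode("utf-8")


content = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE

correct = encode_document("<v>" + content + "</v>")
double = encode_content_then_document(content)

print(correct)  # b'<v>\xc3\xa9</v>'
print(double)   # b'<v>\xc3\x83\xc2\xa9</v>'

# Decoding the doubly encoded bytes does not give back the original char.
print(double.decode("utf-8") == "<v>" + content + "</v>")  # False
```

Under that assumption, encoding once over the whole output (or through an encoding stream) round-trips cleanly, while encoding the content first leaves the parser with two chars where one was written.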