Re: String UTF error from sixx string

dario trussardi
Dale,

I have a problem porting some SIXX data files from Pharo running on a Mac to a GLASS environment running on Ubuntu.

The problem concerns strings containing the $ù and $à characters.

The String method is set up as:

createInstanceOfOrg: aClass withSixxElement: sixxElement
^ (SixxXmlUtil characterDataFrom: sixxElement)



A simple test from a workspace:

Object readSixxFrom: '<sixx.object sixx.id="13" sixx.name="commento"
sixx.type="String">più professionalità</sixx.object>'

works fine; it answers the string 'più professionalità'.




But when I read a file with a string containing a $ù, the sixxElement parameter of createInstanceOfOrg: aClass withSixxElement: sixxElement is set to:


'<sixx.object sixx.id="588" sixx.name="commento" sixx.type="String">La nostra esperienza si tramanda da generazioni, fin dal 1924,&#10;e grazie a continui aggiornamenti siamo sempre più specializzati &#10;nella fornitura di uffici, aziende, scuole ed enti pubblici.&#10;&#10;Alla nostra Clientela assicuriamo un servizio accurato e tempestivo, &#10;svolto con professionalità , attenzione e cortesia.</sixx.object>'


You can note the : più  à


My impression is that when the system reads a file containing a character such as $à (an ISO 8859-1 character), it interprets the file by adding some extra code and passes data like the above to SIXX.

In that case I don't understand what I can do to create strings with the right data.

Thanks,

Dario

Dario,

I'm sorry, you need to use:

encodeAsUTF8 asString

When you encodeAsUTF8 in 3.1.x, you end up with an instance of UTF8 which is a subclass of ByteArray, but the bulk of the SIXX code expects utf8 encoded data to be a String ...
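
As a minimal sketch of that suggestion, assuming GemStone 3.1.x and the stock String class method that still sends decodeFromUTF8, the short workspace test from this thread would look something like the following. The variable name xml is only for illustration. The trace quoted below, where a SmallInteger 60 (the byte value of $<) is being stored into a String, is consistent with the parser being handed bytes rather than a String:

```smalltalk
| xml |
"A literal containing characters outside the 7-bit ASCII range."
xml := '<sixx.object sixx.id="13" sixx.name="commento" sixx.type="String">più professionalità</sixx.object>'.

"In 3.1.x encodeAsUTF8 answers a ByteArray subclass (see the note above);
 asString turns it into a String holding the UTF8-encoded bytes,
 which is the form the SIXX code expects to parse."
Object readSixxFrom: xml encodeAsUTF8 asString
```

Unary messages bind tighter than keyword messages, so the argument is encoded and converted before #readSixxFrom: is sent.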

Dale

----- Original Message -----
| From: "Dario Trussardi" <[hidden email]>
| To: "GemStone Seaside beta discussion" <[hidden email]>
| Sent: Thursday, December 13, 2012 1:05:29 AM
| Subject: Re: [GS/SS Beta] String UTF error from sixx string

| Dale,








| Dario,

| The code that sends decodeFromUTF8 expects the strings to be
| encoded in UTF8. In your example you are passing in an unencoded
| string that contains characters beyond the 7-bit ASCII range (7-bit
| ASCII characters are encoded identically in ASCII and UTF8), so your
| example should work if you encode the string passed into
| #readSixxFrom: in UTF8 using #encodeAsUTF8.




| I followed your suggestion, but when I do:





| Object readSixxFrom: '<sixx.object sixx.id="13" sixx.name="commento"
| sixx.type="String">La nostra esperienza si tramanda da generazioni,
| fin dal 1924,&#10;e grazie a continui aggiornamenti siamo sempre
| più specializzati &#10;nella fornitura di uffici, aziende, scuole
| ed enti pubblici.&#10;&#10;Alla nostra Clientela assicuriamo un
| servizio accurato e tempestivo, &#10;svolto con professionalità,
| attenzione e cortesia.&#10;&#10;Venite a trovarci a Clusone, in Via
| San Vincenzo de Paoli n.9, &#10;oppure contattateci telefonicamente
| al numero 034623833.</sixx.object>' encodeAsUTF8.

| the system raises the error:

| AbstractException >> _signalWith: (envId 0)
| AbstractException >> signal (envId 0)
| SixxPortableUtil class >> signalException: (envId 0)
| [] in SixxXmlUtil class >> parseXml:persistentRoot: (envId 0)
| AbstractException >> _executeHandler: (envId 0)
| AbstractException >> _signalWith: (envId 0)
| AbstractException >> signal (envId 0)
| Object >> _error:args: (envId 0)
| Object >> _errorExpectedClass: (envId 0)


| THE ERROR where self is SmallInteger 60 and aClass is
| AbstractCharacter
| _validateClass: aClass
| "Returns true if self is a kind of aClass. Otherwise, generates an
| error."
| (self isKindOf: aClass) ifFalse: [ self _errorExpectedClass: aClass .
| ^ false ]. ^ true
| Object >> _validateClass: (envId 0)
| String >> at:put: (envId 0)
| WriteStream >> _nextPut: (envId 0)
| WriteStream >> nextPut: (envId 0)
| XMLTokenizer >> nextPCDataDelimitedBy:putOn: (envId 0)
| [] in XMLTokenizer >> nextPCData (envId 0)
| [] in XMLStreamWriter >> writeWith: (envId 0)
| ExecBlock >> ensure: (envId 0)
| XMLStreamWriter >> writeWith: (envId 0)
| XMLTokenizer >> nextPCData (envId 0)
| XMLTokenizer >> nextToken (envId 0)
| SAXHandler >> parseDocument (envId 0)
| XMLDOMParser >> parseDocument (envId 0)
| SAXHandler class >> parseDocumentFrom:useNamespaces:persistentRoot: (envId 0)
| XMLDOMParser class >> parseDocumentFrom:useNamespaces:persistentRoot: (envId 0)
| SAXHandler class >> parseDocumentFrom:persistentRoot: (envId 0)
| SixxYaxoXmlParserAdapter class >> parseXml:persistentRoot: (envId 0)
| [] in SixxXmlUtil class >> parseXml:persistentRoot: (envId 0)
| ExecBlock >> on:do: (envId 0)
| SixxXmlUtil class >> parseXml:persistentRoot: (envId 0)
| Behavior >> readSixxFrom:context:persistentRoot: (envId 0)
| Behavior >> readSixxFrom: (envId 0)
| Executed Code
| String >> evaluateInContext:symbolList: (envId 0)
| JadeServer >> evaluate:inContext: (envId 0)
| JadeServer >> doIt:in: (envId 0)
| GsNMethod class >> _gsReturnToC (envId 0)


| Thanks for any consideration.


| Dario



| Dale

| ----- Original Message -----
| | From: "Dario Trussardi" < [hidden email] >
| | To: "beta discussion Gemstone Seaside" < [hidden email]
| | >
| | Sent: Wednesday, December 12, 2012 7:28:14 AM
| | Subject: [GS/SS Beta] String UTF error from sixx string
| | 
| | Hi,
| | 
| | I am porting some data from Pharo to GLASS with SIXX.
| | 
| | 
| | Now, when I create an object in GLASS with:
| | 
| | Object readSixxFrom: '<sixx.object sixx.id="13"
| | sixx.name="commento"
| | sixx.type="String">La nostra esperienza si tramanda da generazioni,
| | fin dal 1924,&#10;e grazie a continui aggiornamenti siamo sempre
| | più specializzati &#10;nella fornitura di uffici, aziende, scuole
| | ed enti pubblici.&#10;&#10;Alla nostra Clientela assicuriamo un
| | servizio accurato e tempestivo, &#10;svolto con professionalità,
| | attenzione e cortesia.&#10;&#10;Venite a trovarci a Clusone, in Via
| | San Vincenzo de Paoli n.9, &#10;oppure contattateci telefonicamente
| | al numero 034623833.</sixx.object>'
| | 
| | the system raises the error:
| | 
| | AbstractException >> _signalFromPrimitive: (envId 0)
| | String >> decodeFromUTF8 (envId 0)
| | String class >> createInstanceOf:withSixxElement: (envId 0)
| | Behavior >> newInstanceFromSixxElement:context: (envId 0)
| | Behavior >> fromSixxElement:context: (envId 0)
| | Behavior >> readSixxFromSixxElement:context: (envId 0)
| | Behavior >> readSixxFrom:context:persistentRoot: (envId 0)
| | Behavior >> readSixxFrom: (envId 0)
| | Executed Code
| | String >> evaluateInContext:symbolList: (envId 0)
| | JadeServer >> evaluate:inContext: (envId 0)
| | JadeServer >> printIt:in: (envId 0)
| | GsNMethod class >> _gsReturnToC (envId 0)
| | 
| | 
| | 
| | I changed the String class method
| | 
| | createInstanceOf: aClass withSixxElement: sixxElement
| | ^ (SixxXmlUtil characterDataFrom: sixxElement) decodeFromUTF8
| | asString
| | 
| | 
| | by removing the decodeFromUTF8.
| | 
| | With this modification the system works fine, but is it the right
| | solution?
| | 
| | Do I need to change anything in Pharo?
| | 
| | Thanks for any consideration.
| | 
| | Dario
| | 
| | 

Dale Henrichs
Dario,

I imagine that the problem you are hitting has to do with the encoding of the file ... if the file is not encoded as UTF8 when it is written then I think things start going haywire ...

I think the rule of thumb should be: when passing strings between systems whether over the wire or via file, the strings should always be encoded as UTF8 and immediately after loading into a system the UTF8 encoded strings should be decoded to native string representations ...

You can't rely on the systems doing the right thing, so you must explicitly use UTF8 encoding ... I think Pharo allows you to associate an encoding with a FileStream, while you'll need to explicitly encode strings for GemStone ...
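
A rough sketch of that rule on the GemStone side, using only the selectors already mentioned in this thread (encodeAsUTF8, asString, decodeFromUTF8) and assuming 3.1.x behaviour. The variable names are made up for illustration, and the actual file or socket I/O is left out:

```smalltalk
| native wire restored |
native := 'più professionalità'.

"Before the string leaves the system: encode it explicitly as UTF8.
 asString keeps the encoded bytes in a String rather than a ByteArray subclass."
wire := native encodeAsUTF8 asString.

"... wire is what would be written to the file or sent over the network ..."

"Immediately after loading on the receiving side: decode to a native string."
restored := wire decodeFromUTF8 asString.
restored
```

The intermediate value (wire here) holds two bytes for each of $ù and $à, which is why text that is encoded but never decoded shows up with extra characters.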

Dale

----- Original Message -----
| From: "Dario Trussardi" <[hidden email]>
| To: "beta discussion Gemstone Seaside" <[hidden email]>
| Sent: Thursday, January 24, 2013 9:45:31 AM
| Subject: Re: [GS/SS Beta] String UTF error from sixx string