The XML parsing code uses the method below to convert a UTF-8 file into the LeadEncoded representation.
The marked statement (platformString := aStream contents) returns an EsLeadEncodedBytes before the code page transformation is applied,
so the later code page transformation fails and ends with an SGML error prompt.
If aStream contents is replaced by aStream fileStream contents (which returns a String), everything works.
(Using VA 8.0.2, it is sufficient to change the regional setting for non-Unicode programs to Japanese to reproduce the described behavior.)

convertStream: aStream withEncoding: anEncodingString
	"Convert the contents of the passed stream to aTargetCodePage. Answer a new stream
	 containing the converted contents"

	| encoding platformString codePage tempString |

	encoding := (anEncodingString isNil
		ifTrue: [self encodingFromStream: aStream]
		ifFalse: [anEncodingString]).

	"No special encoding type found. Answer the stream as is"
	encoding isNil
		ifTrue: [aStream reset. ^aStream].

	platformString := aStream contents.

	"Check for unicode"
	(self isUnicodeEncoding: encoding)
		ifTrue: [platformString := self convertBuffer: platformString fromUnicodeEncoding: encoding]
		ifFalse: [
			(codePage := self ibmCodePageForEncoding: encoding) notNil
				ifTrue: [platformString := AbtCodePageConverter current
					convert: platformString
					fromCodePage: codePage
					toCodePage: AbtCodePageConverter currentCodePage
					bufferSize: platformString size * 2]].

	"Answer a new SBCS string with any trailing nulls removed. #platformString may be an
	 EsLeadEncodedBytes prior to invoking the #removeNullsFor: method. It will become an SBCS string"
	platformString := self removeNullsFor: platformString.
	^ReadStream on: platformString
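
For reference, the workaround described above amounts to changing a single statement in the method. This is only a sketch of that change, not a vetted fix: it assumes that aStream wraps a file stream and responds to #fileStream, as it does in the failing case, and it has not been checked against streams that are not file based.

```smalltalk
"Sketch of the workaround inside #convertStream:withEncoding:.
 Only the changed statement is shown; the rest of the method stays as it is.
 Assumption: aStream responds to #fileStream."

"Original statement. On a DBCS platform code page (for example Japanese)
 this may answer an EsLeadEncodedBytes before any conversion has happened."
"platformString := aStream contents."

"Workaround. Reading from the underlying file stream answers a String,
 so the code page conversion that follows receives the raw bytes."
platformString := aStream fileStream contents.
```

With this change the later call to AbtCodePageConverter receives the unconverted contents, which is what the conversion from the source code page appears to expect.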