Patrick Rein uploaded a new version of Multilingual to project The Trunk:
http://source.squeak.org/trunk/Multilingual-pre.228.mcz

==================== Summary ====================

Name: Multilingual-pre.228
Author: pre
Time: 8 June 2017, 9:40:08.341697 am
UUID: cb64d235-8b3f-a140-b34f-4695b78dd94e
Ancestors: Multilingual-pre.227

Adds a UTF32 TextConverter. Updates the comments of some of the TextConverter. Updates the encoding names of utf16.

=============== Diff against Multilingual-pre.227 ===============

Item was changed:
  ISO8859TextConverter subclass: #ISO88592TextConverter
      instanceVariableNames: ''
      classVariableNames: ''
      poolDictionaries: ''
      category: 'Multilingual-TextConversion'!
+
+ !ISO88592TextConverter commentStamp: '<historical>' prior: 0!
+ Text converter for ISO 8859-2. An international encoding used in Eastern Europe.!

Item was changed:
  ISO8859TextConverter subclass: #ISO88597TextConverter
      instanceVariableNames: ''
      classVariableNames: ''
      poolDictionaries: ''
      category: 'Multilingual-TextConversion'!
+
+ !ISO88597TextConverter commentStamp: '<historical>' prior: 0!
+ Text converter for ISO 8859-7. An international encoding used for Greek.!

Item was changed:
  ISO88591TextConverter subclass: #Latin1TextConverter
      instanceVariableNames: ''
      classVariableNames: ''
      poolDictionaries: ''
      category: 'Multilingual-TextConversion'!
+
+ !Latin1TextConverter commentStamp: '<historical>' prior: 0!
+ Text converter for ISO 8859-1. An international encoding used in Western Europe.!

Item was changed:
  ISO885915TextConverter subclass: #Latin9TextConverter
      instanceVariableNames: ''
      classVariableNames: ''
      poolDictionaries: ''
      category: 'Multilingual-TextConversion'!
+
+ !Latin9TextConverter commentStamp: 'pre 4/21/2017 11:40' prior: 0!
+ Text converter for ISO 8859-15. An international encoding also used in Western Europe.!

Item was changed:
  ----- Method: UTF16TextConverter class>>encodingNames (in category 'utilities') -----
  encodingNames
+     ^ #('utf-16' 'utf16' 'utf-16-le' 'utf-16-be' 'utf-16be' 'utf-16le') copy.
-     ^ #('utf-16' 'utf16' 'utf-16-le' 'utf-16-be') copy.
  !

Item was added:
+ TextConverter subclass: #UTF32TextConverter
+     instanceVariableNames: 'useLittleEndian useByteOrderMark byteOrderMarkDone'
+     classVariableNames: ''
+     poolDictionaries: ''
+     category: 'Multilingual-TextConversion'!
+
+ !UTF32TextConverter commentStamp: 'pre 6/7/2017 17:55' prior: 0!
+ Text converter for UTF-32. It supports the endianness and byte order mark.!

Item was added:
+ ----- Method: UTF32TextConverter class>>encodingNames (in category 'utilities') -----
+ encodingNames
+
+     ^ #( 'utf32' 'utf32be' 'utf32le' 'utf-32' 'utf-32be' 'utf-32le' 'ucs4' 'ucs4be' 'ucs4le') copy
+ !

Item was added:
+ ----- Method: UTF32TextConverter class>>initializeLatin1MapAndEncodings (in category 'utilities') -----
+ initializeLatin1MapAndEncodings
+     "Initialize the latin1Map and latin1Encodings.
+     These variables ensure that conversions from latin1 ByteString is reasonably fast"
+
+     latin1Map := (ByteArray new: 256) atAllPut: 1.
+     latin1Encodings := (0 to: 255) collect: [:i | (ByteArray newFrom: {0 . 0 . 0 . i}) asString]!

Item was added:
+ ----- Method: UTF32TextConverter>>initialize (in category 'initialize-release') -----
+ initialize
+
+     super initialize.
+     useLittleEndian := useByteOrderMark := byteOrderMarkDone := false!

Item was added:
+ ----- Method: UTF32TextConverter>>next16BitValue:toStream: (in category 'private') -----
+ next16BitValue: value toStream: aStream
+
+     | v1 v2 |
+     v1 := (value bitShift: -8) bitAnd: 16rFF.
+     v2 := value bitAnd: 16rFF.
+     useLittleEndian
+         ifTrue: [
+             aStream
+                 basicNextPut: (Character value: v2);
+                 basicNextPut: (Character value: v1) ]
+         ifFalse: [
+             aStream
+                 basicNextPut: (Character value: v1);
+                 basicNextPut: (Character value: v2) ].
+ !

Item was added:
+ ----- Method: UTF32TextConverter>>next32BitValue:toStream: (in category 'private') -----
+ next32BitValue: value toStream: aStream
+
+     | v1 v2 v3 v4 |
+     v1 := (value bitShift: -24) bitAnd: 16rFF.
+     v2 := (value bitShift: -16) bitAnd: 16rFF.
+     v3 := (value bitShift: -8) bitAnd: 16rFF.
+     v4 := (value bitShift: 0) bitAnd: 16rFF.
+     useLittleEndian
+         ifTrue: [
+             aStream
+                 basicNextPut: (Character value: v4);
+                 basicNextPut: (Character value: v3);
+                 basicNextPut: (Character value: v2);
+                 basicNextPut: (Character value: v1) ]
+         ifFalse: [
+             aStream
+                 basicNextPut: (Character value: v1);
+                 basicNextPut: (Character value: v2);
+                 basicNextPut: (Character value: v3);
+                 basicNextPut: (Character value: v4) ].
+ !

Item was added:
+ ----- Method: UTF32TextConverter>>nextFromStream: (in category 'conversion') -----
+ nextFromStream: aStream
+
+     | character1 character2 readBOM charValue character3 character4 |
+     aStream isBinary ifTrue: [ ^aStream basicNext ].
+     character1 := aStream basicNext ifNil: [ ^nil ].
+     character2 := aStream basicNext ifNil: [ ^nil ].
+     character3 := aStream basicNext ifNil: [ ^nil ].
+     character4 := aStream basicNext ifNil: [ ^nil ].
+
+     readBOM := false.
+     (character1 asciiValue = 16rFF and: [character2 asciiValue = 16rFE]) ifTrue: [
+         self
+             useByteOrderMark: true;
+             useLittleEndian: true.
+         readBOM := true ].
+
+     ((character1 asciiValue = 0 and: [character2 asciiValue = 0])
+         and: [character3 asciiValue = 16rFE and: [character4 asciiValue = 16rFF]]) ifTrue: [
+         self
+             useByteOrderMark: true;
+             useLittleEndian: false.
+         readBOM := true ].
+
+     readBOM ifTrue: [
+         "Re-initialize character variables if they contain BOM"
+         character1 := aStream basicNext ifNil: [ ^nil ].
+         character2 := aStream basicNext ifNil: [ ^nil ].
+         character3 := aStream basicNext ifNil: [ ^nil ].
+         character4 := aStream basicNext ifNil: [ ^nil ]. ].
+
+     useLittleEndian
+         ifTrue: [ charValue := (character4 charCode bitShift: 24) + (character3 charCode bitShift: 16) + (character2 charCode bitShift: 8) + character1 charCode ]
+         ifFalse: [ charValue := (character1 charCode bitShift: 24) + (character2 charCode bitShift: 16) + (character3 charCode bitShift: 8) + character4 charCode ].
+
+     ^ Unicode value: charValue!

Item was added:
+ ----- Method: UTF32TextConverter>>nextPut:toStream: (in category 'conversion') -----
+ nextPut: aCharacter toStream: aStream
+
+     | charCode |
+     aStream isBinary ifTrue: [ ^aCharacter storeBinaryOn: aStream ].
+     (useByteOrderMark and: [ byteOrderMarkDone not ]) ifTrue: [
+         self next32BitValue: 16r0000FEFF toStream: aStream.
+         byteOrderMarkDone := true ].
+     (charCode := aCharacter charCode) < 256
+         ifTrue: [
+             (latin1Encodings at: charCode + 1)
+                 ifNil: [ self next32BitValue: charCode toStream: aStream ]
+                 ifNotNil: [ :encodedString | aStream basicNextPutAll: encodedString ] ]
+         ifFalse: [
+             self next32BitValue: charCode toStream: aStream ].
+     ^aCharacter!

Item was added:
+ ----- Method: UTF32TextConverter>>swapLatin1EncodingByteOrder (in category 'private') -----
+ swapLatin1EncodingByteOrder
+     latin1Encodings := latin1Encodings collect: [:each |
+         each ifNotNil: [each reverse]]!

Item was added:
+ ----- Method: UTF32TextConverter>>useByteOrderMark (in category 'accessing') -----
+ useByteOrderMark
+
+     ^useByteOrderMark
+ !

Item was added:
+ ----- Method: UTF32TextConverter>>useByteOrderMark: (in category 'accessing') -----
+ useByteOrderMark: aBoolean
+
+     useByteOrderMark := aBoolean.
+ !

Item was added:
+ ----- Method: UTF32TextConverter>>useLittleEndian (in category 'accessing') -----
+ useLittleEndian
+
+     ^useLittleEndian
+ !

Item was added:
+ ----- Method: UTF32TextConverter>>useLittleEndian: (in category 'accessing') -----
+ useLittleEndian: aBoolean
+
+     aBoolean = useLittleEndian ifFalse: [ self swapLatin1EncodingByteOrder ].
+     useLittleEndian := aBoolean.
+ !
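
For anyone who wants to try the new converter, here is a minimal workspace sketch. It is not part of the commit. It assumes an image with Multilingual-pre.228 (or later) loaded, uses only the accessors and nextPut:toStream: shown in the diff, and has not been checked against a live image; the variable names are purely illustrative.

```smalltalk
"Workspace sketch, not part of the commit.
 Assumes Multilingual-pre.228 or later is loaded."
| converter out |
converter := UTF32TextConverter new.
converter
	useByteOrderMark: true;
	useLittleEndian: true.

"Encode one character at a time into a non-binary stream."
out := WriteStream on: String new.
'abc' do: [:each | converter nextPut: each toStream: out].

"Inspect the raw bytes. Going by next32BitValue:toStream:, little-endian
 order puts the least significant byte of each 32-bit unit first."
out contents asByteArray
```

Leaving useLittleEndian at its default of false should give big-endian output instead. Per nextPut:toStream:, the byte order mark is written only once, before the first character, and only when useByteOrderMark is true.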