The Trunk: Multilingual-ul.238.mcz

commits-2
Levente Uzonyi uploaded a new version of Multilingual to project The Trunk:
http://source.squeak.org/trunk/Multilingual-ul.238.mcz

==================== Summary ====================

Name: Multilingual-ul.238
Author: ul
Time: 13 June 2018, 5:07:40.804457 pm
UUID: 8f89eae0-1eca-48c6-a3f7-ebe13afa2dbe
Ancestors: Multilingual-nice.237

Changes to various TextConverters:
#nextPut:toStream: can no longer be used to initialize latin1Map and latin1Encodings (because most implementations reference those variables), so
 - moved the current implementation of #initializeLatin1MapAndEncodings to ByteTextConverter, because only those converters understand #encode:.
 - TextConverter >> #initializeLatin1MapAndEncodings has become subclass responsibility
 - implemented #initializeLatin1MapAndEncodings in UTF* text converters
 - removed the class-side #initialize implementations. A new one will be added to TextConverter once all subclasses can reinitialize their tables.
 - added class-side #initializeTables, which is expected to initialize the various class-side tables in the correct order
 - the postscript will reinitialize the class-side tables where possible (all converters except for EUCTextConverter, CNGBTextConverter, EUCJPTextConverter, EUCKRTextConverter, ShiftJISTextConverter, CompoundTextConverter, KOI8RTextConverter)
 - latin1Map and latin1Encodings are no longer initialized lazily
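
As a language-neutral illustration of the two tables (a Python sketch, not the Squeak code), the following builds a 256-entry map and an encodings array for a single-byte encoding, following the logic of ByteTextConverter class>>initializeLatin1MapAndEncodings in the diff below. The function names and the choice of Python's `mac_roman` codec as a stand-in for a converter's #encode: are assumptions made for the example.

```python
def build_latin1_tables(codec="mac_roman"):
    """Precompute, for every Latin-1 code 0..255, whether it needs translation.

    latin1_map[code] is 0 when the byte can be copied through unchanged and
    1 otherwise; latin1_encodings[code] holds the translated bytes, or None
    when the character cannot be encoded at all (the ifError: [] case).
    The codec name is a stand-in for a converter's encode step (assumption).
    """
    latin1_map = bytearray(256)
    latin1_encodings = [None] * 256
    for code in range(256):
        try:
            encoded = chr(code).encode(codec)
        except UnicodeEncodeError:
            encoded = None
        if encoded == bytes([code]):
            latin1_map[code] = 0  # no translation needed
        else:
            latin1_map[code] = 1  # translation needed
            latin1_encodings[code] = encoded
    return latin1_map, latin1_encodings


def encode_latin1(text, latin1_map, latin1_encodings):
    """Encode a Latin-1 string using only the precomputed tables."""
    out = bytearray()
    for ch in text:
        code = ord(ch)
        if latin1_map[code] == 0:
            out.append(code)  # fast path: copy the byte as-is
        else:
            encoded = latin1_encodings[code]
            if encoded is None:
                raise ValueError(f"cannot encode {ch!r}")
            out += encoded
    return bytes(out)
```

Because both tables are filled eagerly, the encoding loop never has to check whether they exist, which is the point of dropping the lazy initialization.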

UTF16TextConverter now takes endianness into account when writing the BOM. The same handling is probably missing from UTF32TextConverter as well.
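
For reference, the byte sequences expected on the stream can be sketched in Python (an illustration only; `utf16_bom` and `encode_utf16` are hypothetical helper names). The BOM is the code point U+FEFF serialized in the same byte order as the text that follows it.

```python
import codecs
import struct

BOM_CODE_POINT = 0xFEFF


def utf16_bom(little_endian):
    """Serialize U+FEFF as one 16-bit unit in the requested byte order."""
    fmt = "<H" if little_endian else ">H"
    return struct.pack(fmt, BOM_CODE_POINT)


def encode_utf16(text, little_endian, use_bom=True):
    """Encode text as UTF-16, optionally prefixed with a matching BOM."""
    codec = "utf-16-le" if little_endian else "utf-16-be"
    prefix = utf16_bom(little_endian) if use_bom else b""
    return prefix + text.encode(codec)


# The results agree with the constants in Python's standard library.
assert utf16_bom(True) == codecs.BOM_UTF16_LE   # bytes FF FE
assert utf16_bom(False) == codecs.BOM_UTF16_BE  # bytes FE FF
```

Whether passing 16rFFFE in the little-endian case produces FF FE on the stream depends on how #next16BitValue:toStream: orders the two bytes, which this diff does not show.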

=============== Diff against Multilingual-nice.237 ===============

Item was removed:
- ----- Method: ByteTextConverter class>>initialize (in category 'class initialization') -----
- initialize
-
-       self == ByteTextConverter
- ifTrue: [self allSubclassesDo: [:c | c initialize]]
- ifFalse: [self
- initializeDecodeTable;
- initializeEncodeTable;
- initializeLatin1MapAndEncodings]
- !

Item was added:
+ ----- Method: ByteTextConverter class>>initializeLatin1MapAndEncodings (in category 'class initialization') -----
+ initializeLatin1MapAndEncodings
+ "Initialize the latin1Map and latin1Encodings.
+ These variables ensure that conversions from latin1 ByteString is reasonably fast.
+ This implementation assumes that encodeTable is initialized."
+
+ latin1Map := ByteArray new: 256.
+ latin1Encodings := Array new: 256.
+ 0 to: 255 do:[:i |
+ | latin1 encoded |
+ latin1 := String with: (Character value: i).
+ [encoded := String with: (Character value: (self new encode: latin1 first charCode))]
+ ifError: [].
+ latin1 = encoded ifTrue:[
+ latin1Map at: i+1 put: 0. "no translation needed"
+ ] ifFalse: [
+ latin1Map at: i+1 put: 1. "translation needed"
+ latin1Encodings at: i+1 put: encoded.
+ ]].!

Item was added:
+ ----- Method: ByteTextConverter class>>initializeTables (in category 'class initialization') -----
+ initializeTables
+
+       self == ByteTextConverter ifTrue: [ ^self ].
+ self
+ initializeDecodeTable;
+ initializeEncodeTable;
+ initializeLatin1MapAndEncodings
+ !

Item was changed:
+ ----- Method: TextConverter class>>initializeLatin1MapAndEncodings (in category 'class initialization') -----
- ----- Method: TextConverter class>>initializeLatin1MapAndEncodings (in category 'accessing') -----
  initializeLatin1MapAndEncodings
  "Initialize the latin1Map and latin1Encodings.
+ These variables ensure that conversions from latin1 ByteString is reasonably fast."
- These variables ensure that conversions from latin1 ByteString is reasonably fast"
 
+ self subclassResponsibility
+
+ !
- latin1Map := ByteArray new: 256.
- latin1Encodings := Array new: 256.
- 0 to: 255 do:[:i |
- | latin1 encoded |
- latin1 := String with: (Character value: i).
- [encoded := String with: (Character value: (self new encode: latin1 first charCode))]
- ifError: [].
- latin1 = encoded ifTrue:[
- latin1Map at: i+1 put: 0. "no translation needed"
- ] ifFalse: [
- latin1Map at: i+1 put: 1. "translation needed"
- latin1Encodings at: i+1 put: encoded.
- ]].!

Item was added:
+ ----- Method: TextConverter class>>initializeTables (in category 'class initialization') -----
+ initializeTables
+
+ self initializeLatin1MapAndEncodings!

Item was changed:
  ----- Method: TextConverter class>>latin1Encodings (in category 'accessing') -----
  latin1Encodings
  "Answer an Array mapping latin1 characters to conversion string"
 
+ ^latin1Encodings!
- ^latin1Encodings ifNil:
- [self initializeLatin1MapAndEncodings.
- latin1Encodings]!

Item was changed:
  ----- Method: TextConverter class>>latin1Map (in category 'accessing') -----
  latin1Map
  "Answer a ByteArray map telling if latin1 characters needs conversion or not"
 
+ ^latin1Map!
- ^latin1Map ifNil:
- [self initializeLatin1MapAndEncodings.
- latin1Map]!

Item was added:
+ ----- Method: UTF16TextConverter class>>initializeLatin1MapAndEncodings (in category 'accessing') -----
+ initializeLatin1MapAndEncodings
+ "Initialize the latin1Map and latin1Encodings.
+ These variables ensure that conversions from latin1 ByteString is reasonably fast."
+
+ latin1Map := ByteArray new: 256 withAll: 1.
+ latin1Encodings := (0 to: 255) collect: [ :i | { 0. i } asByteArray asString ]!

Item was changed:
  ----- Method: UTF16TextConverter>>nextPut:toStream: (in category 'conversion') -----
  nextPut: aCharacter toStream: aStream
 
  | charCode |
  aStream isBinary ifTrue: [ ^aCharacter storeBinaryOn: aStream ].
  (useByteOrderMark and: [ byteOrderMarkDone not ]) ifTrue: [
+ self next16BitValue: (useLittleEndian ifTrue: [ 16rFFFE ] ifFalse: [ 16rFEFF ]) toStream: aStream.
- self next16BitValue: 16rFEFF toStream: aStream.
  byteOrderMarkDone := true ].
  (charCode := aCharacter charCode) < 256
  ifTrue: [
  (latin1Encodings at: charCode + 1)
  ifNil: [ self next16BitValue: charCode toStream: aStream ]
  ifNotNil: [ :encodedString | aStream basicNextPutAll: encodedString ] ]
  ifFalse: [
  charCode <= 16rFFFF
  ifTrue: [ self next16BitValue: charCode toStream: aStream ]
  ifFalse: [
  | low high |
  charCode := charCode - 16r10000.
  low := charCode \\ 16r400 + 16rDC00.
  high := charCode // 16r400 + 16rD800.
  self
  next16BitValue: high toStream: aStream;
  next16BitValue: low toStream: aStream ] ].
  ^aCharacter!
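
The surrogate-pair arithmetic in the last branch of the method above can be checked with a small Python sketch (illustrative only; `utf16_code_units` is a hypothetical name, and the sketch ignores byte order and the BOM).

```python
def utf16_code_units(code_point):
    """Split a Unicode code point into one or two 16-bit UTF-16 units."""
    if code_point <= 0xFFFF:
        return [code_point]
    offset = code_point - 0x10000
    high = offset // 0x400 + 0xD800  # top 10 bits of the offset
    low = offset % 0x400 + 0xDC00    # bottom 10 bits of the offset
    return [high, low]


# U+1F600 lies outside the Basic Multilingual Plane and needs a pair.
assert utf16_code_units(0x1F600) == [0xD83D, 0xDE00]
```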

Item was changed:
  ----- Method: UTF32TextConverter class>>initializeLatin1MapAndEncodings (in category 'utilities') -----
  initializeLatin1MapAndEncodings
  "Initialize the latin1Map and latin1Encodings.
+ These variables ensure that conversions from latin1 ByteString is reasonably fast."
- These variables ensure that conversions from latin1 ByteString is reasonably fast"
 
+ latin1Map := ByteArray new: 256 withAll: 1.
+ latin1Encodings := (0 to: 255) collect: [ :i | { 0. 0. 0. i } asByteArray asString ]
+ !
- latin1Map := (ByteArray new: 256) atAllPut: 1.
- latin1Encodings := (0 to: 255) collect: [:i | (ByteArray newFrom: {0 . 0 . 0 . i}) asString]!

Item was added:
+ ----- Method: UTF8TextConverter class>>initializeLatin1MapAndEncodings (in category 'as yet unclassified') -----
+ initializeLatin1MapAndEncodings
+ "Initialize the latin1Map and latin1Encodings. These variables ensure that conversions from latin1 ByteString is reasonably fast."
+
+ latin1Map := (0 to: 255) collect: [ :each | each bitShift: -7 ] as: ByteArray.
+ latin1Encodings := (0 to: 255) collect: [ :each |
+ each <= 127
+ ifTrue: [ nil ]
+ ifFalse: [
+ { 192 bitOr: (each bitShift: -6). (each bitAnd: 63) bitOr: 128 } asByteArray asString ] ]!
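
The bit arithmetic in the UTF-8 method above can be cross-checked against a reference encoder. The Python sketch below (illustrative only; `utf8_latin1_tables` is a hypothetical name) rebuilds both tables with the same expressions and compares them with Python's built-in UTF-8 codec.

```python
def utf8_latin1_tables():
    """Rebuild the UTF-8 latin1 tables using the arithmetic from the diff."""
    # code >> 7 is 0 for ASCII (copied as-is) and 1 for codes 128..255.
    latin1_map = bytes(code >> 7 for code in range(256))
    latin1_encodings = [
        None
        if code <= 127
        else bytes([0xC0 | (code >> 6), 0x80 | (code & 0x3F)])
        for code in range(256)
    ]
    return latin1_map, latin1_encodings


latin1_map, latin1_encodings = utf8_latin1_tables()
# Every non-ASCII Latin-1 character becomes the same two bytes that a
# standard UTF-8 encoder produces.
for code in range(128, 256):
    assert latin1_encodings[code] == chr(code).encode("utf-8")
```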

Item was changed:
+ (PackageInfo named: 'Multilingual') postscript: '"Initialize the tables in all TextConverters that support it."
+ TextConverter allSubclassesDo: [ :each | [ each initializeTables ] ifError: [] ]'!
- (PackageInfo named: 'Multilingual') postscript: 'Unicode
- initializeCaseFolding;
- initializeCompositionMappings'!