The Trunk: Multilingual-topa.239.mcz

commits-2
Tobias Pape uploaded a new version of Multilingual to project The Trunk:
http://source.squeak.org/trunk/Multilingual-topa.239.mcz

==================== Summary ====================

Name: Multilingual-topa.239
Author: topa
Time: 12 September 2018, 3:25:37.647469 pm
UUID: 9a2616f0-677b-4d2b-9a60-4495a0fae606
Ancestors: Multilingual-ul.238

Fix and unify Unicode data downloading

(The old URL was stable, but the newer one is preferred.)

=============== Diff against Multilingual-ul.238 ===============

Item was changed:
+ ----- Method: Unicode class>>caseFoldingData (in category 'unicode data') -----
- ----- Method: Unicode class>>caseFoldingData (in category 'casing') -----
  caseFoldingData
 
+ ^ self fetch: 'CaseFolding Unicode data' fromUnicodeData: 'CaseFolding.txt'
+ !
- UIManager default informUserDuring: [ :bar |
- | stream |
- bar value: 'Downloading CaseFolding Unicode data'.
- stream := HTTPClient httpGet: 'http://www.unicode.org/Public/UNIDATA/CaseFolding.txt'.
- (stream isKindOf: RWBinaryOrTextStream) ifFalse: [
- ^self error: 'Download failed' ].
- ^stream reset; contents ]!

Item was added:
+ ----- Method: Unicode class>>fetch:fromUnicodeData: (in category 'unicode data') -----
+ fetch: what fromUnicodeData: fileName
+ | unicodeLocation |
+ unicodeLocation := 'https://www.unicode.org/Public/UCD/latest/ucd/'.
+ UIManager default informUser: 'Downloading ', what  during:
+ [ | response|
+ response := WebClient httpGet: unicodeLocation, fileName.
+ ^ response isSuccess
+ ifFalse: [self error: 'Download failed']
+ ifTrue: [response content]].
+
+ !

Item was changed:
+ ----- Method: Unicode class>>generalCategory (in category 'unicode data') -----
- ----- Method: Unicode class>>generalCategory (in category 'class methods') -----
  generalCategory
 
  ^ GeneralCategory.
 
  !

Item was changed:
+ ----- Method: Unicode class>>generalCategoryComment (in category 'unicode data') -----
- ----- Method: Unicode class>>generalCategoryComment (in category 'class methods') -----
  generalCategoryComment
  "
  Lu Letter, Uppercase
  Ll Letter, Lowercase
  Lt Letter, Titlecase
  Lm Letter, Modifier
  Lo Letter, Other
  Mn Mark, Non-Spacing
  Mc Mark, Spacing Combining
  Me Mark, Enclosing
  Nd Number, Decimal
  Nl Number, Letter
  No Number, Other
  Pc Punctuation, Connector
  Pd Punctuation, Dash
  Ps Punctuation, Open
  Pe Punctuation, Close
  Pi Punctuation, Initial quote (may behave like Ps or Pe depending on usage)
  Pf Punctuation, Final quote (may behave like Ps or Pe depending on usage)
  Po Punctuation, Other
  Sm Symbol, Math
  Sc Symbol, Currency
  Sk Symbol, Modifier
  So Symbol, Other
  Zs Separator, Space
  Zl Separator, Line
  Zp Separator, Paragraph
  Cc Other, Control
  Cf Other, Format
  Cs Other, Surrogate
  Co Other, Private Use
  Cn Other, Not Assigned (no characters in the file have this property)
  "!

Item was changed:
+ ----- Method: Unicode class>>parseUnicodeDataFrom: (in category 'unicode data') -----
- ----- Method: Unicode class>>parseUnicodeDataFrom: (in category 'class methods') -----
  parseUnicodeDataFrom: stream
  "
  self halt.
  self parseUnicodeDataFile
  "
 
  | line fieldEnd point fieldStart toNumber generalCategory decimalProperty |
 
  toNumber := [:quad | ('16r', quad) asNumber].
 
  GeneralCategory := SparseLargeTable new: 16rE0080 chunkSize: 1024 arrayClass: Array base: 1 defaultValue:  'Cn'.
  DecimalProperty := SparseLargeTable new: 16rE0080 chunkSize: 32 arrayClass: Array base: 1 defaultValue: -1.
 
  16r3400 to: 16r4DB5 do: [:i | GeneralCategory at: i+1 put: 'Lo'].
  16r4E00 to: 16r9FA5 do: [:i | GeneralCategory at: i+1 put: 'Lo'].
  16rAC00 to: 16rD7FF do: [:i | GeneralCategory at: i+1 put: 'Lo'].
 
  [(line := stream nextLine) size > 0] whileTrue: [
  fieldEnd := line indexOf: $; startingAt: 1.
  point := toNumber value: (line copyFrom: 1 to: fieldEnd - 1).
  point > 16rE007F ifTrue: [
  GeneralCategory zapDefaultOnlyEntries.
  DecimalProperty zapDefaultOnlyEntries.
  ^ self].
  2 to: 3 do: [:i |
  fieldStart := fieldEnd + 1.
  fieldEnd := line indexOf: $; startingAt: fieldStart.
  ].
  generalCategory := line copyFrom: fieldStart to: fieldEnd - 1.
  GeneralCategory at: point+1 put: generalCategory.
  generalCategory = 'Nd' ifTrue: [
  4 to: 7 do: [:i |
  fieldStart := fieldEnd + 1.
  fieldEnd := line indexOf: $; startingAt: fieldStart.
  ].
  decimalProperty :=  line copyFrom: fieldStart to: fieldEnd - 1.
  DecimalProperty at: point+1 put: decimalProperty asNumber.
  ].
  ].
  GeneralCategory zapDefaultOnlyEntries.
  DecimalProperty zapDefaultOnlyEntries.
  !
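
The field walk in parseUnicodeDataFrom: can be tried on a single record. The following is a minimal workspace sketch, not part of the commit. It assumes the sample line matches the UnicodeData.txt entry for U+0030 DIGIT ZERO, and it uses local variables instead of the class-side tables, so nothing in the image is modified.

```smalltalk
"Workspace sketch (not part of the commit).
 Assumption: the sample line is the UnicodeData.txt record for U+0030."
| line fieldStart fieldEnd point generalCategory decimalValue |
line := '0030;DIGIT ZERO;Nd;0;EN;;0;0;0;N;;;;;'.

"Field 1: the code point, written in hexadecimal."
fieldEnd := line indexOf: $; startingAt: 1.
point := ('16r', (line copyFrom: 1 to: fieldEnd - 1)) asNumber.

"Advance to field 3: the general category."
2 to: 3 do: [:i |
	fieldStart := fieldEnd + 1.
	fieldEnd := line indexOf: $; startingAt: fieldStart].
generalCategory := line copyFrom: fieldStart to: fieldEnd - 1.

"Advance to field 7: the decimal digit value, read only for 'Nd' entries."
4 to: 7 do: [:i |
	fieldStart := fieldEnd + 1.
	fieldEnd := line indexOf: $; startingAt: fieldStart].
decimalValue := (line copyFrom: fieldStart to: fieldEnd - 1) asNumber.

Transcript
	show: point printString, ' ', generalCategory, ' ', decimalValue printString;
	cr.
```

For this record the three values should be the code point 48, the category 'Nd', and the decimal value 0. The method itself stores them at index point+1 because both tables are created with base: 1.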

Item was changed:
+ ----- Method: Unicode class>>unicodeData (in category 'unicode data') -----
- ----- Method: Unicode class>>unicodeData (in category 'composing') -----
  unicodeData
 
+ ^ self fetch: 'Unicode Data' fromUnicodeData: 'UnicodeData.txt'
+ !
- UIManager default informUserDuring: [ :bar |
- | stream |
- bar value: 'Downloading Unicode data'.
- stream := HTTPClient httpGet: 'http://www.unicode.org/Public/UNIDATA/UnicodeData.txt'.
- (stream isKindOf: RWBinaryOrTextStream) ifFalse: [
- ^self error: 'Download failed' ].
- ^stream reset; contents ]!
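
With both accessors now delegating to fetch:fromUnicodeData:, a download-and-parse round trip might look like the sketch below. It is a hedged workspace example, not code from the commit. It assumes that the response content is a String that answers readStream, and that the image has network access. Note that parseUnicodeDataFrom: replaces the class-side GeneralCategory and DecimalProperty tables.

```smalltalk
"Workspace sketch (not part of the commit); needs network access.
 Assumption: the downloaded content is a String, so readStream applies."
| data |
data := Unicode unicodeData.	"fetches UnicodeData.txt via fetch:fromUnicodeData:"

"Rebuilds the GeneralCategory and DecimalProperty tables from the download."
Unicode parseUnicodeDataFrom: data readStream.

"Look up a category; the tables are 1-based, hence the + 1."
Unicode generalCategory at: $a asInteger + 1.
```

Evaluating the last line with 'print it' shows the category string stored for LATIN SMALL LETTER A. Unicode caseFoldingData can be fetched the same way, but the diff shows no parser for it, so none is sketched here.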