The Trunk: Multilingual-nice.189.mcz

The Trunk: Multilingual-nice.189.mcz

commits-2
Nicolas Cellier uploaded a new version of Multilingual to project The Trunk:
http://source.squeak.org/trunk/Multilingual-nice.189.mcz

==================== Summary ====================

Name: Multilingual-nice.189
Author: nice
Time: 10 October 2013, 1:33:16.023 am
UUID: 2f31cb35-a0d0-41b4-be3e-f5c0bdb88e72
Ancestors: Multilingual-nice.188

Put the charsetAt: send out of the scanJapaneseCharactersFrom:to:in:rightX: inner loop.

=============== Diff against Multilingual-nice.188 ===============

Item was changed:
  ----- Method: CharacterScanner>>scanJapaneseCharactersFrom:to:in:rightX: (in category '*Multilingual-Display') -----
  scanJapaneseCharactersFrom: startIndex to: stopIndex in: sourceString rightX: rightX
  "this is a scanning method for
  multibyte Japanese characters in a WideString - hence the isBreakable:in:in:
  a font that does not do character-pair kerning "
 
+ | ascii encoding nextDestX startEncoding char charset |
- | ascii encoding nextDestX startEncoding char |
  lastIndex := startIndex.
  lastIndex > stopIndex ifTrue: [^self handleEndOfRunAt: stopIndex].
  startEncoding := (sourceString at: startIndex) leadingChar.
+ charset := EncodedCharSet charsetAt: startEncoding.
  [lastIndex <= stopIndex] whileTrue: [
  char := sourceString at: lastIndex.
  encoding := char leadingChar.
  encoding ~= startEncoding ifTrue: [lastIndex := lastIndex - 1. ^ stopConditions endOfRun].
  ascii := char charCode.
  (encoding = 0 and: [ascii < 256 and:[(stopConditions at: ascii + 1) notNil]])
  ifTrue: [^ stopConditions at: ascii + 1].
+ (self isBreakableAt: lastIndex in: sourceString in: charset) ifTrue: [
- (self isBreakableAt: lastIndex in: sourceString in: (EncodedCharSet charsetAt: encoding)) ifTrue: [
  self registerBreakableIndex.
  ].
  nextDestX := destX + (font widthOf: char).
  nextDestX > rightX ifTrue: [^ stopConditions crossedX].
  destX := nextDestX + kern.
  lastIndex := lastIndex + 1.
  ].
  ^self handleEndOfRunAt: stopIndex!



Re: The Trunk: Multilingual-nice.189.mcz

timrowledge

On 09-10-2013, at 11:33 PM, [hidden email] wrote:

> Nicolas Cellier uploaded a new version of Multilingual to project The Trunk:
> http://source.squeak.org/trunk/Multilingual-nice.189.mcz
>
> ==================== Summary ====================
>
> Name: Multilingual-nice.189
> Author: nice
> Time: 10 October 2013, 1:33:16.023 am
> UUID: 2f31cb35-a0d0-41b4-be3e-f5c0bdb88e72
> Ancestors: Multilingual-nice.188
>
> Put the charsetAt: send out of the scanJapaneseCharactersFrom:to:in:rightX: inner loop.

I *think* that this is probably wrong; what if we have a wide string where a few characters are encoding 12 (Middle-earth Elvish), the next chunk is encoding 'pi' (irrational political verbiage), then some Japanese, and then some other encoding? Checking encoding against startEncoding is how we work out that the encoding has changed and that we need to exit and deal with that by resetting stuff. Isn't it?


tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
Strange OpCodes: DO: Deadstart Operator




Re: The Trunk: Multilingual-nice.189.mcz

Nicolas Cellier
If the encoding changes, then we never reach (self isBreakableAt: lastIndex in: sourceString in: charset) because we ^#endOfRun.
Else, the encoding does not change, so we can proceed with charsetAt: startEncoding.
So, as I see it, the charsetAt: can safely go outside the loop, no?
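
A minimal workspace sketch of that argument, not the actual CharacterScanner code. The encodings array and the lookup block are hypothetical stand-ins for the leadingChar values of a string and for EncodedCharSet charsetAt:. The scan is run once with a lookup per character and once with a single lookup hoisted before the loop, and the charsets each variant would pass on are compared.

```smalltalk
| encodings lookup scan perChar hoistedCharset hoisted |
"Hypothetical leadingChar values: three characters in encoding 5, then one in encoding 0."
encodings := #(5 5 5 0).
"Hypothetical stand-in for EncodedCharSet charsetAt: ; it only tags the encoding."
lookup := [:encoding | 'charset-', encoding printString].
"Walk the run the way the scanning loop does; charsetFor supplies the charset per character."
scan := [:charsetFor | | start index used stopped |
	start := encodings first.
	index := 1.
	used := OrderedCollection new.
	stopped := false.
	[stopped not and: [index <= encodings size]] whileTrue: [
		(encodings at: index) ~= start
			ifTrue: [stopped := true]	"stands in for ^ stopConditions endOfRun"
			ifFalse: [
				"Only reached while the encoding still equals start."
				used add: (charsetFor value: (encodings at: index)).
				index := index + 1]].
	used asArray].
"Variant 1: look the charset up again for every character."
perChar := scan value: [:encoding | lookup value: encoding].
"Variant 2: look it up once, before the loop, from the starting encoding."
hoistedCharset := lookup value: encodings first.
hoisted := scan value: [:encoding | hoistedCharset].
perChar = hoisted
```

Printing the last expression should answer true: the character in encoding 0 ends the run before any charset is requested for it, so both variants only ever use the charset of the starting encoding.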



Re: The Trunk: Multilingual-nice.189.mcz

timrowledge

On 09-10-2013, at 5:02 PM, Nicolas Cellier <[hidden email]> wrote:

> If the encoding changes, then we never reach (self isBreakableAt: lastIndex in: sourceString in: charset) because we ^#endOfRun.
> Else, the encoding does not change, so we can proceed with charsetAt: startEncoding.
> So, as I see it, the charsetAt: can safely go outside the loop, no?

Sorry - I misread your diffs! I looked too quickly and thought you were removing the check for
encoding ~= startEncoding ifTrue: [lastIndex := lastIndex - 1. ^ stopConditions endOfRun].


tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
Useful random insult:- Has an inferiority complex, but not a very good one.