The Trunk: ST80-dtl.141.mcz

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
11 messages Options
Reply | Threaded
Open this post in threaded view
|

The Trunk: ST80-dtl.141.mcz

commits-2
David T. Lewis uploaded a new version of ST80 to project The Trunk:
http://source.squeak.org/trunk/ST80-dtl.141.mcz

==================== Summary ====================

Name: ST80-dtl.141
Author: dtl
Time: 8 February 2013, 8:01:05.105 pm
UUID: f838c540-19cd-4828-bcce-242b8ff4d30b
Ancestors: ST80-nice.140

Fix ParagraphEditor>>zapSelectionWithCompositionWith: failing to reset setMark or setPoint values after converting to Unicode in the case of an input string of length > 1. The problem was apparent only in the case of keyboard entry from MVC when a wait2ms delay causing two or more characters to be buffered prior to handling the zap selection. Problem was best exhibited using a Linux interpreter VM with poor delay time resolution. This update fixes the problem in MVC but has not been tested with Unicode input (author is using US English keyboard and locale).

Problem identified by Bob Arning: http://lists.squeakfoundation.org/pipermail/squeak-dev/2013-February/168666.html

=============== Diff against ST80-nice.140 ===============

Item was changed:
  ----- Method: ParagraphEditor>>zapSelectionWithCompositionWith: (in category 'accessing-selection') -----
  zapSelectionWithCompositionWith: aString
  "Deselect, and replace the selection text by aString.
  Remember the resulting selectionInterval in UndoInterval and otherInterval.
  Do not set up for undo."
 
  | stream newString aText beforeChar |
  wasComposition := false.
  ((aString isEmpty or: [(beforeChar := self charBefore) isNil]) or: [
  aString size = 1 and: [(Unicode isComposition: aString first) not]]) ifTrue: [
  ^ self zapSelectionWith: (Text string: aString emphasis: emphasisHere)].
 
  stream := UnicodeCompositionStream on: (String new: 16).
  stream nextPut: beforeChar.
  stream nextPutAll: aString.
  newString := stream contents.
  aText := Text string: newString emphasis: emphasisHere.
- self markBlock < self pointBlock
- ifTrue: [self setMark: self markBlock stringIndex - 1]
- ifFalse: [self setPoint: self  pointBlock stringIndex - 1].
-
  wasComposition := true.
+ self markBlock < self pointBlock
+ ifTrue: [ self setMark: self markBlock stringIndex - 1.
+ self zapSelectionWith: aText.
+ self setMark: self markBlock stringIndex + 1]
+ ifFalse: [ self setPoint: self pointBlock stringIndex - 1.
+ self zapSelectionWith: aText.
+ self setPoint: self pointBlock stringIndex + 1]
- self zapSelectionWith: aText.
  !


Reply | Threaded
Open this post in threaded view
|

Re: The Trunk: ST80-dtl.141.mcz

Bob Arning-2
I fear that doesn't even work with ordinary USASCII input. I tweaked readKeyboard to simulate additional characters arriving in close succession:

        self hasSelection ifTrue: "save highlighted characters"
            [UndoSelection := self selection].
        ((model isKindOf: Workspace) and: [char = $z]) ifTrue: [typeAhead nextPutAll: '123'].

        self zapSelectionWithCompositionWith: typeAhead contents.

and what I get by typing some z's in a Workspace is:

z123zzzzzz123123123123123123

I think the safest way to guard against multiple characters arriving in quick succession is to simply process each one completely before proceeding to the next. This, after all, is the likeliest case by far for modern computers and ordinary humans typing:

        self deselect.
         sensor keyboardPressed ifTrue: "was a whileTrue:"
            [char := sensor keyboardPeek.
            (self dispatchOnCharacter: char with: typeAhead) ifTrue:
                [self doneTyping.
                self setEmphasisHere.
                ^self selectAndScroll; updateMarker].
            self openTypeIn].

Cheers,
Bob

On 2/9/13 1:01 AM, [hidden email] wrote:
aText := Text string: newString emphasis: emphasisHere.
- 	self markBlock < self pointBlock
- 		ifTrue: [self setMark: self markBlock stringIndex - 1]
- 		ifFalse: [self setPoint: self  pointBlock stringIndex - 1].
- 
  	wasComposition := true. 
+ 	self markBlock < self pointBlock
+ 		ifTrue: [ self setMark: self markBlock stringIndex - 1.
+ 				self zapSelectionWith: aText.
+ 				self setMark: self markBlock stringIndex + 1]
+ 		ifFalse: [ self setPoint: self pointBlock stringIndex - 1.
+ 				self zapSelectionWith: aText.
+ 				self setPoint: self pointBlock stringIndex + 1]
- 	self zapSelectionWith: aText.



Reply | Threaded
Open this post in threaded view
|

Re: wait2ms (The Trunk: ST80-dtl.141.mcz)

David T. Lewis
Hi Bob, thanks for this.

I am hopelessly out of my depth on multilingual processing and multibyte
characters, but I'm sure others on the list will know: Are there cases
(e.g. in Japan or Korea) where keyboards provide multi-byte inputs, either
directly or through X11? If so, the "sensor keyboardPressed whileTrue:"
may be necessary in order to process those key entries. Otherwise, I
suspect you are right that it is easier to just process one character at
a time. I note also that the "whileTrue:" approach dates back to at least
2002, well before introduction of multilingual support in Squeak. I don't
know if that's significant, or if that's just the way it happened to have
been done.

Just by way of explanation, here is how I found and verified the change
to zapSelectionWithCompositionWith: in ST80-dtl.141.mcz:

1) In the original method, I forced the method to always execute the
multi-character path through the code, even if the input string was one
character long (the normal case). So I changed from this:

  ((aString isEmpty or: [(beforeChar := self charBefore) isNil]) or: [
    aString size = 1 and: [(Unicode isComposition: aString first) not]]) ifTrue: [
      ^ self zapSelectionWith: (Text string: aString emphasis: emphasisHere)].

to this:

  ((aString isEmpty or: [(beforeChar := self charBefore) isNil]) or: [
    aString size = 1 and: [(Unicode isComposition: aString first) not]]) ifTrue: [
      "^ self zapSelectionWith: (Text string: aString emphasis: emphasisHere)" ].

Result: All characters typed into an MVC workspace resulted in the cursor
positioned incorrectly to the left of the current character.

2) I changed the method to reposition the cursor correctly, leaving the
above change in place to continue forcing execution of the multi-byte code
path. This fixed the cursor positioning problem in the MVC workspace.

3) I removed the change from step 1 so that single character input would
be handled through the normal path, and verified that the MVC workspace
still worked.

4) I filed the updated method into another image that still has the wait2ms
delays, and I ran that image on a Linux interpreter VM (with "2 ms" delays
that might be more like 15 ms). Result: Cursor positioning now worked correctly
in the MVC workspace, even though keyboard entry was slow.

Conclusion: The patch in ST80-dtl.141.mcz does address a real bug in cursor
positioning that occured only in cases where sluggish key entry resulted
in multi-byte strings (presumably two characters) being handled in the
zapSelectionWithCompositionWith: method.

The problem that you identified with your "z123" test does not seem to be
occurring in practice, but it certainly looks as though it *could* occur.
I tested only by typing into a workspace from my US ascii keyboard, so I
certainly would not rule out the possibility of further problems.

I do not know if it is safe to make the change you suggest to process one
character at a time. It looks to me like it would be good to do that, but
I don't know if it might cause problems for non-US keyboards or locales.

If it was safe to make that change, then the entire bit of code for handling
multi-byte strings in zapSelectionWithCompositionWith: would become irrelevant,
and we could just get rid of it.

Hopefully someone with experience in this area can help?

Dave


On Sat, Feb 09, 2013 at 06:52:43AM -0500, Bob Arning wrote:

> I fear that doesn't even work with ordinary USASCII input. I tweaked
> readKeyboard to simulate additional characters arriving in close succession:
>
>         self hasSelection ifTrue: "save highlighted characters"
>             [UndoSelection := self selection].
> *((model isKindOf: Workspace) and: [char = $z]) ifTrue: [typeAhead
> nextPutAll: '123'].*
>
>         self zapSelectionWithCompositionWith: typeAhead contents.
>
> and what I get by typing some z's in a Workspace is:
>
> z123zzzzzz123123123123123123
>
> I think the safest way to guard against multiple characters arriving in
> quick succession is to simply process each one completely before
> proceeding to the next. This, after all, is the likeliest case by far
> for modern computers and ordinary humans typing:
>
>         self deselect.
> *sensor keyboardPressed ifTrue: "was a whileTrue:"*
>             [char := sensor keyboardPeek.
>             (self dispatchOnCharacter: char with: typeAhead) ifTrue:
>                 [self doneTyping.
>                 self setEmphasisHere.
>                 ^self selectAndScroll; updateMarker].
>             self openTypeIn].
>
> Cheers,
> Bob
>
> On 2/9/13 1:01 AM, [hidden email] wrote:
> >aText := Text string: newString emphasis: emphasisHere.
> >- self markBlock < self pointBlock
> >- ifTrue: [self setMark: self markBlock stringIndex - 1]
> >- ifFalse: [self setPoint: self  pointBlock stringIndex - 1].
> >-
> >   wasComposition := true.
> >+ self markBlock < self pointBlock
> >+ ifTrue: [ self setMark: self markBlock stringIndex - 1.
> >+ self zapSelectionWith: aText.
> >+ self setMark: self markBlock stringIndex + 1]
> >+ ifFalse: [ self setPoint: self pointBlock stringIndex - 1.
> >+ self zapSelectionWith: aText.
> >+ self setPoint: self pointBlock stringIndex +
> >1]
> >- self zapSelectionWith: aText.
>

>


Reply | Threaded
Open this post in threaded view
|

Re: wait2ms (The Trunk: ST80-dtl.141.mcz)

Bob Arning-2

On 2/9/13 1:48 PM, David T. Lewis wrote:
> Hi Bob, thanks for this.
>
> I am hopelessly out of my depth on multilingual processing and multibyte
> characters, but I'm sure others on the list will know: Are there cases
> (e.g. in Japan or Korea) where keyboards provide multi-byte inputs, either
> directly or through X11? If so, the "sensor keyboardPressed whileTrue:"
> may be necessary in order to process those key entries.
Remember that there is another "[sensor keyboardPressed] whileTrue: "
higher up that encompases this, so I don't think this is something to
worry about.
> Otherwise, I
> suspect you are right that it is easier to just process one character at
> a time. I note also that the "whileTrue:" approach dates back to at least
> 2002, well before introduction of multilingual support in Squeak. I don't
> know if that's significant, or if that's just the way it happened to have
> been done.
It was that way in 1998. My guess is that if multiple input characters
were available, it made some sense efficiency-wise to insert them into
the text all at once rather than piecemeal. Back the people typed just
as fast as today, but the processors were much slower.

>
> Just by way of explanation, here is how I found and verified the change
> to zapSelectionWithCompositionWith: in ST80-dtl.141.mcz:
>
> 1) In the original method, I forced the method to always execute the
> multi-character path through the code, even if the input string was one
> character long (the normal case). So I changed from this:
>
>    ((aString isEmpty or: [(beforeChar := self charBefore) isNil]) or: [
>      aString size = 1 and: [(Unicode isComposition: aString first) not]]) ifTrue: [
>        ^ self zapSelectionWith: (Text string: aString emphasis: emphasisHere)].
>
> to this:
>
>    ((aString isEmpty or: [(beforeChar := self charBefore) isNil]) or: [
>      aString size = 1 and: [(Unicode isComposition: aString first) not]]) ifTrue: [
>        "^ self zapSelectionWith: (Text string: aString emphasis: emphasisHere)" ].
>
> Result: All characters typed into an MVC workspace resulted in the cursor
> positioned incorrectly to the left of the current character.
>
> 2) I changed the method to reposition the cursor correctly, leaving the
> above change in place to continue forcing execution of the multi-byte code
> path. This fixed the cursor positioning problem in the MVC workspace.
>
> 3) I removed the change from step 1 so that single character input would
> be handled through the normal path, and verified that the MVC workspace
> still worked.
>
> 4) I filed the updated method into another image that still has the wait2ms
> delays, and I ran that image on a Linux interpreter VM (with "2 ms" delays
> that might be more like 15 ms). Result: Cursor positioning now worked correctly
> in the MVC workspace, even though keyboard entry was slow.
I put your change in an otherwise unchanged 4.4 and mashed keys very
quickly. Everything got entered, but the cursor got farther and farther
behind the end of the text. This is on a Mac where 2ms seems to be 2ms.
I then changed wait2ms to wait 20ms and things got worse - it took long
pauses for characters to appear and when they did, the cursor did not
keep pace with the end of the text. What might be interesting to know
how often (if ever) your tests actually had more that one byte to
process in the zap methods. I think a single character works fine either
way, multiple characters work fine in the original zap.. but multiple
characters in the new zap are the problem.
>
> Conclusion: The patch in ST80-dtl.141.mcz does address a real bug in cursor
> positioning that occured only in cases where sluggish key entry resulted
> in multi-byte strings (presumably two characters) being handled in the
> zapSelectionWithCompositionWith: method.
>
> The problem that you identified with your "z123" test does not seem to be
> occurring in practice,
But it did and that's what got me started here - I was trying to see
sluggish input in MVC, so I mashed keys as fast as I could and they all
made  it to the screen, just not always at the end of the text.

> but it certainly looks as though it *could* occur.
> I tested only by typing into a workspace from my US ascii keyboard, so I
> certainly would not rule out the possibility of further problems.
>
> I do not know if it is safe to make the change you suggest to process one
> character at a time. It looks to me like it would be good to do that, but
> I don't know if it might cause problems for non-US keyboards or locales.
>
> If it was safe to make that change, then the entire bit of code for handling
> multi-byte strings in zapSelectionWithCompositionWith: would become irrelevant,
> and we could just get rid of it.
I don't think so. That code has a purpose - the UnicodeCompositionStream
is used to combined multiple characters (like a letter followed by a
separate accent character) into a single character. It is basically
saying: if there are 1 or more characters in the typeahead and the first
is one of these combining characters (accent) and there is a character
in the text before the insertion point, then back up 1 space so we can
process that character and the accent together. I suspect it works fine
when there is just once character in the typeahead and that character is
an accent. I don't think it works well when there are more than one
character in the input. Hence my suggestion to do it a char at a time.

Cheers,
Bob

>
> Hopefully someone with experience in this area can help?
>
> Dave
>
>
> On Sat, Feb 09, 2013 at 06:52:43AM -0500, Bob Arning wrote:
>> I fear that doesn't even work with ordinary USASCII input. I tweaked
>> readKeyboard to simulate additional characters arriving in close succession:
>>
>>          self hasSelection ifTrue: "save highlighted characters"
>>              [UndoSelection := self selection].
>> *((model isKindOf: Workspace) and: [char = $z]) ifTrue: [typeAhead
>> nextPutAll: '123'].*
>>
>>          self zapSelectionWithCompositionWith: typeAhead contents.
>>
>> and what I get by typing some z's in a Workspace is:
>>
>> z123zzzzzz123123123123123123
>>
>> I think the safest way to guard against multiple characters arriving in
>> quick succession is to simply process each one completely before
>> proceeding to the next. This, after all, is the likeliest case by far
>> for modern computers and ordinary humans typing:
>>
>>          self deselect.
>> *sensor keyboardPressed ifTrue: "was a whileTrue:"*
>>              [char := sensor keyboardPeek.
>>              (self dispatchOnCharacter: char with: typeAhead) ifTrue:
>>                  [self doneTyping.
>>                  self setEmphasisHere.
>>                  ^self selectAndScroll; updateMarker].
>>              self openTypeIn].
>>
>> Cheers,
>> Bob
>>
>> On 2/9/13 1:01 AM, [hidden email] wrote:
>>> aText := Text string: newString emphasis: emphasisHere.
>>> - self markBlock < self pointBlock
>>> - ifTrue: [self setMark: self markBlock stringIndex - 1]
>>> - ifFalse: [self setPoint: self  pointBlock stringIndex - 1].
>>> -
>>>     wasComposition := true.
>>> + self markBlock < self pointBlock
>>> + ifTrue: [ self setMark: self markBlock stringIndex - 1.
>>> + self zapSelectionWith: aText.
>>> + self setMark: self markBlock stringIndex + 1]
>>> + ifFalse: [ self setPoint: self pointBlock stringIndex - 1.
>>> + self zapSelectionWith: aText.
>>> + self setPoint: self pointBlock stringIndex +
>>> 1]
>>> - self zapSelectionWith: aText.
>
>


Reply | Threaded
Open this post in threaded view
|

Re: wait2ms (The Trunk: ST80-dtl.141.mcz)

David T. Lewis
Bob,

I tried reverting zapSelectionWithCompositionWith: to its original
implementation, and making the change that you suggest to readKeyboard.
It works exactly as you describe, including when I type rapidly in an
image with the wait2ms delay.

I will make this update to Squeak trunk. To summarize the changes
we will have at that point:

1) The wait2ms is removed
2) ParagraphEditor>>readKeyboard will always process one character at a time
3) No change to zapSelectionWithCompositionWith:

Thanks very much for your help on this!

Dave

On Sat, Feb 09, 2013 at 02:57:32PM -0500, Bob Arning wrote:

>
> On 2/9/13 1:48 PM, David T. Lewis wrote:
> >Hi Bob, thanks for this.
> >
> >I am hopelessly out of my depth on multilingual processing and multibyte
> >characters, but I'm sure others on the list will know: Are there cases
> >(e.g. in Japan or Korea) where keyboards provide multi-byte inputs, either
> >directly or through X11? If so, the "sensor keyboardPressed whileTrue:"
> >may be necessary in order to process those key entries.
> Remember that there is another "[sensor keyboardPressed] whileTrue: "
> higher up that encompases this, so I don't think this is something to
> worry about.
> >Otherwise, I
> >suspect you are right that it is easier to just process one character at
> >a time. I note also that the "whileTrue:" approach dates back to at least
> >2002, well before introduction of multilingual support in Squeak. I don't
> >know if that's significant, or if that's just the way it happened to have
> >been done.
> It was that way in 1998. My guess is that if multiple input characters
> were available, it made some sense efficiency-wise to insert them into
> the text all at once rather than piecemeal. Back the people typed just
> as fast as today, but the processors were much slower.
> >
> >Just by way of explanation, here is how I found and verified the change
> >to zapSelectionWithCompositionWith: in ST80-dtl.141.mcz:
> >
> >1) In the original method, I forced the method to always execute the
> >multi-character path through the code, even if the input string was one
> >character long (the normal case). So I changed from this:
> >
> >   ((aString isEmpty or: [(beforeChar := self charBefore) isNil]) or: [
> >     aString size = 1 and: [(Unicode isComposition: aString first) not]])
> >     ifTrue: [
> >       ^ self zapSelectionWith: (Text string: aString emphasis:
> >       emphasisHere)].
> >
> >to this:
> >
> >   ((aString isEmpty or: [(beforeChar := self charBefore) isNil]) or: [
> >     aString size = 1 and: [(Unicode isComposition: aString first) not]])
> >     ifTrue: [
> >       "^ self zapSelectionWith: (Text string: aString emphasis:
> >       emphasisHere)" ].
> >
> >Result: All characters typed into an MVC workspace resulted in the cursor
> >positioned incorrectly to the left of the current character.
> >
> >2) I changed the method to reposition the cursor correctly, leaving the
> >above change in place to continue forcing execution of the multi-byte code
> >path. This fixed the cursor positioning problem in the MVC workspace.
> >
> >3) I removed the change from step 1 so that single character input would
> >be handled through the normal path, and verified that the MVC workspace
> >still worked.
> >
> >4) I filed the updated method into another image that still has the wait2ms
> >delays, and I ran that image on a Linux interpreter VM (with "2 ms" delays
> >that might be more like 15 ms). Result: Cursor positioning now worked
> >correctly
> >in the MVC workspace, even though keyboard entry was slow.
> I put your change in an otherwise unchanged 4.4 and mashed keys very
> quickly. Everything got entered, but the cursor got farther and farther
> behind the end of the text. This is on a Mac where 2ms seems to be 2ms.
> I then changed wait2ms to wait 20ms and things got worse - it took long
> pauses for characters to appear and when they did, the cursor did not
> keep pace with the end of the text. What might be interesting to know
> how often (if ever) your tests actually had more that one byte to
> process in the zap methods. I think a single character works fine either
> way, multiple characters work fine in the original zap.. but multiple
> characters in the new zap are the problem.
> >
> >Conclusion: The patch in ST80-dtl.141.mcz does address a real bug in cursor
> >positioning that occured only in cases where sluggish key entry resulted
> >in multi-byte strings (presumably two characters) being handled in the
> >zapSelectionWithCompositionWith: method.
> >
> >The problem that you identified with your "z123" test does not seem to be
> >occurring in practice,
> But it did and that's what got me started here - I was trying to see
> sluggish input in MVC, so I mashed keys as fast as I could and they all
> made  it to the screen, just not always at the end of the text.
> >but it certainly looks as though it *could* occur.
> >I tested only by typing into a workspace from my US ascii keyboard, so I
> >certainly would not rule out the possibility of further problems.
> >
> >I do not know if it is safe to make the change you suggest to process one
> >character at a time. It looks to me like it would be good to do that, but
> >I don't know if it might cause problems for non-US keyboards or locales.
> >
> >If it was safe to make that change, then the entire bit of code for
> >handling
> >multi-byte strings in zapSelectionWithCompositionWith: would become
> >irrelevant,
> >and we could just get rid of it.
> I don't think so. That code has a purpose - the UnicodeCompositionStream
> is used to combined multiple characters (like a letter followed by a
> separate accent character) into a single character. It is basically
> saying: if there are 1 or more characters in the typeahead and the first
> is one of these combining characters (accent) and there is a character
> in the text before the insertion point, then back up 1 space so we can
> process that character and the accent together. I suspect it works fine
> when there is just once character in the typeahead and that character is
> an accent. I don't think it works well when there are more than one
> character in the input. Hence my suggestion to do it a char at a time.
>
> Cheers,
> Bob
> >
> >Hopefully someone with experience in this area can help?
> >
> >Dave
> >
> >
> >On Sat, Feb 09, 2013 at 06:52:43AM -0500, Bob Arning wrote:
> >>I fear that doesn't even work with ordinary USASCII input. I tweaked
> >>readKeyboard to simulate additional characters arriving in close
> >>succession:
> >>
> >>         self hasSelection ifTrue: "save highlighted characters"
> >>             [UndoSelection := self selection].
> >>*((model isKindOf: Workspace) and: [char = $z]) ifTrue: [typeAhead
> >>nextPutAll: '123'].*
> >>
> >>         self zapSelectionWithCompositionWith: typeAhead contents.
> >>
> >>and what I get by typing some z's in a Workspace is:
> >>
> >>z123zzzzzz123123123123123123
> >>
> >>I think the safest way to guard against multiple characters arriving in
> >>quick succession is to simply process each one completely before
> >>proceeding to the next. This, after all, is the likeliest case by far
> >>for modern computers and ordinary humans typing:
> >>
> >>         self deselect.
> >>*sensor keyboardPressed ifTrue: "was a whileTrue:"*
> >>             [char := sensor keyboardPeek.
> >>             (self dispatchOnCharacter: char with: typeAhead) ifTrue:
> >>                 [self doneTyping.
> >>                 self setEmphasisHere.
> >>                 ^self selectAndScroll; updateMarker].
> >>             self openTypeIn].
> >>
> >>Cheers,
> >>Bob
> >>
> >>On 2/9/13 1:01 AM, [hidden email] wrote:
> >>>aText := Text string: newString emphasis: emphasisHere.
> >>>- self markBlock < self pointBlock
> >>>- ifTrue: [self setMark: self markBlock stringIndex - 1]
> >>>- ifFalse: [self setPoint: self  pointBlock stringIndex - 1].
> >>>-
> >>>   wasComposition := true.
> >>>+ self markBlock < self pointBlock
> >>>+ ifTrue: [ self setMark: self markBlock stringIndex - 1.
> >>>+ self zapSelectionWith: aText.
> >>>+ self setMark: self markBlock stringIndex + 1]
> >>>+ ifFalse: [ self setPoint: self pointBlock stringIndex - 1.
> >>>+ self zapSelectionWith: aText.
> >>>+ self setPoint: self pointBlock stringIndex +
> >>>1]
> >>>- self zapSelectionWith: aText.
> >
> >
>

Reply | Threaded
Open this post in threaded view
|

Re: wait2ms (The Trunk: ST80-dtl.141.mcz)

Bob Arning-2
One final oddity - this zapSelectionWithCompositionWith: seems to have appeared in squeak 4.1 at the same time as new editors were introduced on the morphic side. There is no sign of this composition business in the new morphic editors. I wonder if that was a design decision or just a fork that happened and somebody changed one fork and not the other. This leads me to think that ditching zapSelectionWithCompositionWith: would not be missed by anyone.

Cheers,
Bob

On 2/9/13 3:29 PM, David T. Lewis wrote:
Bob,

I tried reverting zapSelectionWithCompositionWith: to its original
implementation, and making the change that you suggest to readKeyboard.
It works exactly as you describe, including when I type rapidly in an
image with the wait2ms delay.

I will make this update to Squeak trunk. To summarize the changes
we will have at that point:

1) The wait2ms is removed
2) ParagraphEditor>>readKeyboard will always process one character at a time
3) No change to zapSelectionWithCompositionWith:

Thanks very much for your help on this!

Dave

On Sat, Feb 09, 2013 at 02:57:32PM -0500, Bob Arning wrote:
On 2/9/13 1:48 PM, David T. Lewis wrote:
Hi Bob, thanks for this.

I am hopelessly out of my depth on multilingual processing and multibyte
characters, but I'm sure others on the list will know: Are there cases
(e.g. in Japan or Korea) where keyboards provide multi-byte inputs, either
directly or through X11? If so, the "sensor keyboardPressed whileTrue:"
may be necessary in order to process those key entries.
Remember that there is another "[sensor keyboardPressed] whileTrue: " 
higher up that encompases this, so I don't think this is something to 
worry about.
Otherwise, I
suspect you are right that it is easier to just process one character at
a time. I note also that the "whileTrue:" approach dates back to at least
2002, well before introduction of multilingual support in Squeak. I don't
know if that's significant, or if that's just the way it happened to have
been done.
It was that way in 1998. My guess is that if multiple input characters 
were available, it made some sense efficiency-wise to insert them into 
the text all at once rather than piecemeal. Back the people typed just 
as fast as today, but the processors were much slower.
Just by way of explanation, here is how I found and verified the change
to zapSelectionWithCompositionWith: in ST80-dtl.141.mcz:

1) In the original method, I forced the method to always execute the
multi-character path through the code, even if the input string was one
character long (the normal case). So I changed from this:

  ((aString isEmpty or: [(beforeChar := self charBefore) isNil]) or: [
    aString size = 1 and: [(Unicode isComposition: aString first) not]]) 
    ifTrue: [
      ^ self zapSelectionWith: (Text string: aString emphasis: 
      emphasisHere)].

to this:

  ((aString isEmpty or: [(beforeChar := self charBefore) isNil]) or: [
    aString size = 1 and: [(Unicode isComposition: aString first) not]]) 
    ifTrue: [
      "^ self zapSelectionWith: (Text string: aString emphasis: 
      emphasisHere)" ].

Result: All characters typed into an MVC workspace resulted in the cursor
positioned incorrectly to the left of the current character.

2) I changed the method to reposition the cursor correctly, leaving the
above change in place to continue forcing execution of the multi-byte code
path. This fixed the cursor positioning problem in the MVC workspace.

3) I removed the change from step 1 so that single character input would
be handled through the normal path, and verified that the MVC workspace
still worked.

4) I filed the updated method into another image that still has the wait2ms
delays, and I ran that image on a Linux interpreter VM (with "2 ms" delays
that might be more like 15 ms). Result: Cursor positioning now worked 
correctly
in the MVC workspace, even though keyboard entry was slow.
I put your change in an otherwise unchanged 4.4 and mashed keys very 
quickly. Everything got entered, but the cursor got farther and farther 
behind the end of the text. This is on a Mac where 2ms seems to be 2ms. 
I then changed wait2ms to wait 20ms and things got worse - it took long 
pauses for characters to appear and when they did, the cursor did not 
keep pace with the end of the text. What might be interesting to know 
how often (if ever) your tests actually had more that one byte to 
process in the zap methods. I think a single character works fine either 
way, multiple characters work fine in the original zap.. but multiple 
characters in the new zap are the problem.
Conclusion: The patch in ST80-dtl.141.mcz does address a real bug in cursor
positioning that occured only in cases where sluggish key entry resulted
in multi-byte strings (presumably two characters) being handled in the
zapSelectionWithCompositionWith: method.

The problem that you identified with your "z123" test does not seem to be
occurring in practice,
But it did and that's what got me started here - I was trying to see 
sluggish input in MVC, so I mashed keys as fast as I could and they all 
made  it to the screen, just not always at the end of the text.
but it certainly looks as though it *could* occur.
I tested only by typing into a workspace from my US ascii keyboard, so I
certainly would not rule out the possibility of further problems.

I do not know if it is safe to make the change you suggest to process one
character at a time. It looks to me like it would be good to do that, but
I don't know if it might cause problems for non-US keyboards or locales.

If it was safe to make that change, then the entire bit of code for 
handling
multi-byte strings in zapSelectionWithCompositionWith: would become 
irrelevant,
and we could just get rid of it.
I don't think so. That code has a purpose - the UnicodeCompositionStream 
is used to combined multiple characters (like a letter followed by a 
separate accent character) into a single character. It is basically 
saying: if there are 1 or more characters in the typeahead and the first 
is one of these combining characters (accent) and there is a character 
in the text before the insertion point, then back up 1 space so we can 
process that character and the accent together. I suspect it works fine 
when there is just once character in the typeahead and that character is 
an accent. I don't think it works well when there are more than one 
character in the input. Hence my suggestion to do it a char at a time.

Cheers,
Bob
Hopefully someone with experience in this area can help?

Dave


On Sat, Feb 09, 2013 at 06:52:43AM -0500, Bob Arning wrote:
I fear that doesn't even work with ordinary USASCII input. I tweaked
readKeyboard to simulate additional characters arriving in close 
succession:

        self hasSelection ifTrue: "save highlighted characters"
            [UndoSelection := self selection].
*((model isKindOf: Workspace) and: [char = $z]) ifTrue: [typeAhead
nextPutAll: '123'].*

        self zapSelectionWithCompositionWith: typeAhead contents.

and what I get by typing some z's in a Workspace is:

z123zzzzzz123123123123123123

I think the safest way to guard against multiple characters arriving in
quick succession is to simply process each one completely before
proceeding to the next. This, after all, is the likeliest case by far
for modern computers and ordinary humans typing:

        self deselect.
*sensor keyboardPressed ifTrue: "was a whileTrue:"*
            [char := sensor keyboardPeek.
            (self dispatchOnCharacter: char with: typeAhead) ifTrue:
                [self doneTyping.
                self setEmphasisHere.
                ^self selectAndScroll; updateMarker].
            self openTypeIn].

Cheers,
Bob

On 2/9/13 1:01 AM, [hidden email] wrote:
aText := Text string: newString emphasis: emphasisHere.
- 	self markBlock < self pointBlock
- 		ifTrue: [self setMark: self markBlock stringIndex - 1]
- 		ifFalse: [self setPoint: self  pointBlock stringIndex - 1].
-
  	wasComposition := true.
+ 	self markBlock < self pointBlock
+ 		ifTrue: [ self setMark: self markBlock stringIndex - 1.
+ 				self zapSelectionWith: aText.
+ 				self setMark: self markBlock stringIndex + 1]
+ 		ifFalse: [ self setPoint: self pointBlock stringIndex - 1.
+ 				self zapSelectionWith: aText.
+ 				self setPoint: self pointBlock stringIndex +
1]
- 	self zapSelectionWith: aText.


      




Reply | Threaded
Open this post in threaded view
|

Re: wait2ms (The Trunk: ST80-dtl.141.mcz)

Yoshiki Ohshima-3
In reply to this post by David T. Lewis
On Sat, Feb 9, 2013 at 12:29 PM, David T. Lewis <[hidden email]> wrote:

> Bob,
>
> I tried reverting zapSelectionWithCompositionWith: to its original
> implementation, and making the change that you suggest to readKeyboard.
> It works exactly as you describe, including when I type rapidly in an
> image with the wait2ms delay.
>
> I will make this update to Squeak trunk. To summarize the changes
> we will have at that point:
>
> 1) The wait2ms is removed
> 2) ParagraphEditor>>readKeyboard will always process one character at a time
> 3) No change to zapSelectionWithCompositionWith:
>
> Thanks very much for your help on this!

>From the author initials of #zapSelectionWithCompositionWith, I
thought that it does not have any bugs (hehe).  So, yes, handling
multiple strokes properly was the way to go.  Thank you for the fix!

The implementation was done primarily for the latin-1 input,
especially for the OLPC XO.  It went with the "dead key" to enter an
accented character.  So, you enter an accented latin character (say,
a-umlaut) by holding the alt-graphic key and hit the umlaut key. And
you can enter normal "a" next (say, 2 seconds later) and the VM and
the image receives 0x61 ("a") and 0x308 (combining diaeresis) at the
"same time".

In an application that favors the decomposed form would keep these two
code points and handles it, but Squeak (I think) favors the
pre-composed form and combine these two code points into one  ut.  The
other parts of Squeak cannot handle the decomposed form of code
sequence well, so we tried to make a composed form when possible in
Squeak.  (I.e., there are multiple possible code point sequence to
represent the same character in Unicode.  Anybody who thinks that
Unicode provides a uniform single representation is misled by them.)

When the image receives a sequance that it cannot compose, it gives up
and gives the sequence down stream (that is the logic in
#zapSelectionWithCompositionWith:  and UnicodeCompositionStream).  In
the end, it'd receive n characters at the same time, and need to spit
out m characters.

Inputting the asian languages is in a way less complicated.  The
front-end input method may generate multiple characters at once, but
they just have to be handles one by one properly.

--
-- Yoshiki

Reply | Threaded
Open this post in threaded view
|

Re: wait2ms (The Trunk: ST80-dtl.141.mcz)

Yoshiki Ohshima-3
In reply to this post by Bob Arning-2
On Sat, Feb 9, 2013 at 5:44 PM, Bob Arning <[hidden email]> wrote:
> One final oddity - this zapSelectionWithCompositionWith: seems to have
> appeared in squeak 4.1 at the same time as new editors were introduced on
> the morphic side. There is no sign of this composition business in the new
> morphic editors. I wonder if that was a design decision or just a fork that
> happened and somebody changed one fork and not the other. This leads me to
> think that ditching zapSelectionWithCompositionWith: would not be missed by
> anyone.

This is probably the right observation.  As I wrote in another email,
#zapSelectionWithCompositionWith: was written for the OLPC Etoys
image, independent from the editor rewrite work going on in the trunk
image.  I even wasn't aware that this is in the trunk, and nobody else
was really paying attention.

That said, it should be the right thing to keep this method in and fix
the editor (which is as simple as David did.  Just assume multiple
characters coming in and have a loop).

--
-- Yoshiki

Reply | Threaded
Open this post in threaded view
|

Re: wait2ms (The Trunk: ST80-dtl.141.mcz)

David T. Lewis
On Sat, Feb 09, 2013 at 05:54:49PM -0800, Yoshiki Ohshima wrote:

> On Sat, Feb 9, 2013 at 5:44 PM, Bob Arning <[hidden email]> wrote:
> > One final oddity - this zapSelectionWithCompositionWith: seems to have
> > appeared in squeak 4.1 at the same time as new editors were introduced on
> > the morphic side. There is no sign of this composition business in the new
> > morphic editors. I wonder if that was a design decision or just a fork that
> > happened and somebody changed one fork and not the other. This leads me to
> > think that ditching zapSelectionWithCompositionWith: would not be missed by
> > anyone.
>
> This is probably the right observation.  As I wrote in another email,
> #zapSelectionWithCompositionWith: was written for the OLPC Etoys
> image, independent from the editor rewrite work going on in the trunk
> image.  I even wasn't aware that this is in the trunk, and nobody else
> was really paying attention.
>
> That said, it should be the right thing to keep this method in and fix
> the editor (which is as simple as David did.  Just assume multiple
> characters coming in and have a loop).
>

Hi Yoshiki,

Thank you for explaining this.

I am still confused on one point. Can you say if the #zapSelectionWithCompositionWith:
method should be changed? Here is the change that I was proposing:

zapSelectionWithCompositionWith: aString
        "Deselect, and replace the selection text by aString.
         Remember the resulting selectionInterval in UndoInterval and otherInterval.
         Do not set up for undo."

        | stream newString aText beforeChar |
        wasComposition := false.
        ((aString isEmpty or: [(beforeChar := self charBefore) isNil]) or: [
                aString size = 1 and: [(Unicode isComposition: aString first) not]]) ifTrue: [
                        ^ self zapSelectionWith: (Text string: aString emphasis: emphasisHere)].

        stream := UnicodeCompositionStream on: (String new: 16).
        stream nextPut: beforeChar.
        stream nextPutAll: aString.
        newString := stream contents.
        aText := Text string: newString emphasis: emphasisHere.
        wasComposition := true.
        self markBlock < self pointBlock
                ifTrue: [ self setMark: self markBlock stringIndex - 1.
                                self zapSelectionWith: aText.
                                self setMark: self markBlock stringIndex + 1]
                ifFalse: [ self setPoint: self pointBlock stringIndex - 1.
                                self zapSelectionWith: aText.
                                self setPoint: self pointBlock stringIndex + 1]


But I do not know if this is correct, or if we should leave the method
as it is now:

        ( ... )
        Text := Text string: newString emphasis: emphasisHere.
        self markBlock < self pointBlock
                ifTrue: [self setMark: self markBlock stringIndex - 1]
                ifFalse: [self setPoint: self  pointBlock stringIndex - 1].

        wasComposition := true.
        self zapSelectionWith: aText.

Thanks for your help.

Dave


Reply | Threaded
Open this post in threaded view
|

Re: wait2ms (The Trunk: ST80-dtl.141.mcz)

Yoshiki Ohshima-3
On Sun, Feb 10, 2013 at 6:52 AM, David T. Lewis <[hidden email]> wrote:

> Hi Yoshiki,
>
> Thank you for explaining this.
>
> I am still confused on one point. Can you say if the #zapSelectionWithCompositionWith:
> method should be changed? Here is the change that I was proposing:
>
> zapSelectionWithCompositionWith: aString
>         "Deselect, and replace the selection text by aString.
>          Remember the resulting selectionInterval in UndoInterval and otherInterval.
>          Do not set up for undo."
>
>         | stream newString aText beforeChar |
>         wasComposition := false.
>         ((aString isEmpty or: [(beforeChar := self charBefore) isNil]) or: [
>                 aString size = 1 and: [(Unicode isComposition: aString first) not]]) ifTrue: [
>                         ^ self zapSelectionWith: (Text string: aString emphasis: emphasisHere)].
>
>         stream := UnicodeCompositionStream on: (String new: 16).
>         stream nextPut: beforeChar.
>         stream nextPutAll: aString.
>         newString := stream contents.
>         aText := Text string: newString emphasis: emphasisHere.
>         wasComposition := true.
>         self markBlock < self pointBlock
>                 ifTrue: [ self setMark: self markBlock stringIndex - 1.
>                                 self zapSelectionWith: aText.
>                                 self setMark: self markBlock stringIndex + 1]
>                 ifFalse: [ self setPoint: self pointBlock stringIndex - 1.
>                                 self zapSelectionWith: aText.
>                                 self setPoint: self pointBlock stringIndex + 1]
>
>
> But I do not know if this is correct, or if we should leave the method
> as it is now:
>
>         ( ... )
>         Text := Text string: newString emphasis: emphasisHere.
>         self markBlock < self pointBlock
>                 ifTrue: [self setMark: self markBlock stringIndex - 1]
>                 ifFalse: [self setPoint: self  pointBlock stringIndex - 1].
>
>         wasComposition := true.
>         self zapSelectionWith: aText.

My memory on this is murky but I think what I wrote was not perfect.
At one point, OLPC XO had two theories on inputting accented
characters and it did gave us a "naked" composition character on its
own.  In that case, that composition character has to be tested with
the "beforeChar" and may have to be combined with it.  So, beforeChar
may be replaced with the new string.  So, the code supported both
ways, if IIRC.

I need to test things to see whether we need to fiddle with the off-by-one...

--
-- Yoshiki

Reply | Threaded
Open this post in threaded view
|

Re: wait2ms (The Trunk: ST80-dtl.141.mcz)

Bert Freudenberg

On 2013-02-12, at 00:27, Yoshiki Ohshima <[hidden email]> wrote:

> On Sun, Feb 10, 2013 at 6:52 AM, David T. Lewis <[hidden email]> wrote:
>> Hi Yoshiki,
>>
>> Thank you for explaining this.
>>
>> I am still confused on one point. Can you say if the #zapSelectionWithCompositionWith:
>> method should be changed? Here is the change that I was proposing:
>>
>> zapSelectionWithCompositionWith: aString
>>        "Deselect, and replace the selection text by aString.
>>         Remember the resulting selectionInterval in UndoInterval and otherInterval.
>>         Do not set up for undo."
>>
>>        | stream newString aText beforeChar |
>>        wasComposition := false.
>>        ((aString isEmpty or: [(beforeChar := self charBefore) isNil]) or: [
>>                aString size = 1 and: [(Unicode isComposition: aString first) not]]) ifTrue: [
>>                        ^ self zapSelectionWith: (Text string: aString emphasis: emphasisHere)].
>>
>>        stream := UnicodeCompositionStream on: (String new: 16).
>>        stream nextPut: beforeChar.
>>        stream nextPutAll: aString.
>>        newString := stream contents.
>>        aText := Text string: newString emphasis: emphasisHere.
>>        wasComposition := true.
>>        self markBlock < self pointBlock
>>                ifTrue: [ self setMark: self markBlock stringIndex - 1.
>>                                self zapSelectionWith: aText.
>>                                self setMark: self markBlock stringIndex + 1]
>>                ifFalse: [ self setPoint: self pointBlock stringIndex - 1.
>>                                self zapSelectionWith: aText.
>>                                self setPoint: self pointBlock stringIndex + 1]
>>
>>
>> But I do not know if this is correct, or if we should leave the method
>> as it is now:
>>
>>        ( ... )
>>        Text := Text string: newString emphasis: emphasisHere.
>>        self markBlock < self pointBlock
>>                ifTrue: [self setMark: self markBlock stringIndex - 1]
>>                ifFalse: [self setPoint: self  pointBlock stringIndex - 1].
>>
>>        wasComposition := true.
>>        self zapSelectionWith: aText.
>
> My memory on this is murky but I think what I wrote was not perfect.
> At one point, OLPC XO had two theories on inputting accented
> characters and it did gave us a "naked" composition character on its
> own.  In that case, that composition character has to be tested with
> the "beforeChar" and may have to be combined with it.  So, beforeChar
> may be replaced with the new string.  So, the code supported both
> ways, if IIRC.
>
> I need to test things to see whether we need to fiddle with the off-by-one...
>
> --
> -- Yoshiki

Ah yes, the XO by default doesn't use "dead keys" for accented characters (where you press the accent first, which does not produce a character, then you press the char and it makes an accented one) but instead "composes" the accent on top of the char (that is, you press the base char first, and then the accent, which gets applied to the previous char).

The way this works (I think) is that the keyboard directly creates Unicode Combining Marks which other software (like GTK) even stores as two unicode code points. E.g., when I get a title string that was entered in a different app we have to run #composeAccents to make the string display correctly in Etoys (Squeak's Unicode rendering does not deal with composite marks yet).

http://www.unicode.org/faq/char_combmark.html

- Bert -

PS: If you want to pull your hairs out, try to read about correct Unicode string comparisons, or sorting ...