FormInspector, or also: Text>>#= and its consequences

Christoph Thiede

Hi all,


Is there any old thread discussing the design of Text>>#=? (It does not consider attributes for equality.) Has this decision ever been questioned?


Naively and without an overview of any existing components that could rely on this implementation, I would like to question it.

Why should 'foo' asText allBold be equal to 'foo' asText addAttribute: TextURL new? With the same logic, we could also say that two dictionaries are equal iff they have the same keys ...


There is even a concrete client in the Trunk suffering from this design decision: Marcel's new FormInspector (and analogously, MorphInspector). It uses a TextFontReference with a FormSetFont to display a screenshot right in the inspector pane. Unfortunately, the pane is never updated automatically because, even if the screenshot changes, the text morph considers the old text equal to the new one. I'd like to fix that without hacking any workaround into the inspectors.
Even though this inspector implementation is a bit unusual, in my opinion it shows that the current Text >> #= implementation might not be a perfect solution.

I'm looking forward to your opinions.

Best,
Christoph




Re: FormInspector, or also: Text>>#= and its consequences

marcel.taeumel
Hi Christoph.

Performance. Change it, bench it, post the results here. :-) Please specify your machine and try it on a slow Raspberry Pi, too.

Best,
Marcel





Re: FormInspector, or also: Text>>#= and its consequences

Eliot Miranda-2
Hi Marcel, Hi Levente, Hi Christoph, Hi All,

On Tue, Sep 15, 2020 at 7:42 AM Marcel Taeumel <[hidden email]> wrote:
> Hi Christoph.
>
> Performance. Change it, bench it, post the results here. :-) Please specify your machine and try it on a slow Raspberry Pi, too.

I think it's a historical hold-over.  Here's the same method in Smalltalk-80 v2:

!Text methodsFor: 'comparing'!
= anotherText
    ^string = anotherText string! !

Changing it to read

= other
    other isText ifTrue: [^string = other string and: [runs = other runs]].
    other isString ifTrue: [^string = other].
    ^false

is not going to affect performance noticeably (runs are typically shorter than the strings, and Array comparison isn't particularly slow). Moreover, it corresponds much more closely to my intuitive understanding of Texts. If I wanted to see whether two texts had the same characters, I would use aText string = bText string. I see Levente's comment, but I think he's just commenting on the anomaly inherited from Smalltalk-80, not saying "it must be this way". Am I right, Levente?


So how bad is the performance? I chose some texts (they happen to be in the help browser, and as such they are representative large texts, which I think is what we're worried about for performance) via

Text allInstances select: [:t| t size > 5000 and: [t runs runs size > (t size / 200)]]

(Why text runs runs size? Because text runs size = text size, whereas text runs runs answers the array holding the length of each emphasis run, so its size is the number of runs.)

Then I benchmarked the comparison via

"Using the existing method, which compares strings only."
| copy |
copy := self first copy.
[self first = copy] bench. "==> '186,000 per second. 5.39 microseconds per run. 0 % GC time.'"

"Estimate the additional cost of comparing runs in a typical text."
| copy |
copy := self first copy.
[self first string = copy string and: [self first runs = copy runs]] bench. "==> '154,000 per second. 6.48 microseconds per run. 0 % GC time.'"

"Estimate the additional cost when there is some difference in emphasis."
| copy |
copy := self first copy.
copy addAttribute: (TextColor color: Color red) from: copy size // 2 to: copy size.
[self first string = copy string and: [self first runs = copy runs]] bench. "==> '187,000 per second. 5.36 microseconds per run. 0 % GC time.'"


What the second one shows is that including testing for runs worsens performance by about 20%.  For me that's acceptable.

What the third one shows is that if emphases do in fact differ the overhead is far less, because in the runs comparison there is a size comparison, and that fails without bothering to compare all the elements.


And of course the additional cost of comparing runs depends on how complex typical runs are.  Here's a histogram:

| texts |
texts := Text allInstances select: [:t| t size > 0].
(10 to: 100 by: 10) collect:
    [:percentage|
    { percentage.
      (texts select: [:t| | ratio |
          ratio := t runs runs size / t size * 100.
          (ratio between: percentage - 10 and: percentage) and: [ratio ~= (percentage - 10)]]) size * 100.0 / texts size roundTo: 0.01 } ]
"==> #(#(10 67.62) #(20 15.48) #(30 6.58) #(40 2.49) #(50 2.49) #(60 0.36) #(70 0.18) #(80 0.0) #(90 0.0) #(100 4.8))"

So most texts have very few emphases (typically one ;-). Only 5.3% of texts have a run count greater than half the size of the text. So in most cases the slowdown from adding the runs comparison to the mix will be less than the 20% overhead above. The worst case is represented by benchmark two above: a large text compared against an identical copy.





--
_,,,^..^,,,_
best, Eliot



Re: FormInspector, or also: Text>>#= and its consequences

Christoph Thiede
In reply to this post by marcel.taeumel

Interesting. I would not have assumed that this was only about performance; it sounded like a more profound design decision to me.


If no one sees a problem in changing this behavior, I can try my luck. :-)


Best,

Christoph






Re: FormInspector, or also: Text>>#= and its consequences

Levente Uzonyi
Hi Christoph,


On Wed, 16 Sep 2020, Thiede, Christoph wrote:

>
> Interesting, I would not have assumed that this would be only about performance, sounded like a more profound design decision to me.

If you have a look at the comment of Text >> #=, you'll find that it's a
design decision (no reasoning though):

= other
  "Am I equal to the other Text or String?
  ***** Warning ***** Two Texts are considered equal if they have
the same characters in them.  They might have completely different
emphasis, fonts, sizes, text actions, or embedded morphs.  If you need to
find out if one is a true copy of the other, you must do (text1 = text2
and: [text1 runs = text2 runs])."


Note, though, that equality with Strings is not symmetric:

'foo' asText = 'foo'. "==> true"
'foo' = 'foo' asText. "==> false"

I don't know what relies on Text-String equality, but probably many things
assume that Texts and Strings are somewhat interchangeable (remember when
you changed SHParserST80 >> #initializeVariablesFromContext, and Shout ran
into errors because it expected source to be a String but got a Text?)

You can't keep equality with Strings if you change #= because you'll lose
transitivity:

'foo' asText allBold = 'foo'. "==> true"
'foo' asText = 'foo'. "==> true"
'foo' asText allBold = 'foo' asText "==> false"


Should you decide to change #=, remember to change #hash as well, and
rehash all hashed collections with text keys.
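
A minimal sketch of the #hash point, not taken from the Trunk: it assumes that #= has been changed to compare both string and runs, that Text-String equality has been dropped, and that the runs object answers a #hash consistent with its own #=. Under those assumptions, equal Texts still answer equal hashes:

```smalltalk
hash
    "Sketch only. Assumes Text >> #= compares string AND runs, and that
     a Text is no longer equal to a plain String. Mixes both parts so
     that texts differing only in attributes can hash differently."
    ^string hash bitXor: runs hash
```

If Text-String equality were kept instead, the hash would have to stay equal to the plain string's hash, so this variant would not be valid. Hashing the runs may also be noticeably slower than hashing the string alone, which would need benchmarking.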


Levente



Re: FormInspector, or also: Text>>#= and its consequences

Christoph Thiede

Hi Levente,


Hm, I think #= should always be commutative and transitive; anything else is confusing at the very least ...


Can't we rather move that "attribute-invariant" comparison to something like Text >> #sameAs:?
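
A minimal sketch of what such a separate comparison could look like on Text. The selector name #hasSameCharactersAs: is hypothetical and chosen only for illustration; it is not an existing Squeak method:

```smalltalk
hasSameCharactersAs: aTextOrString
    "Hypothetical selector, sketch only. Answer whether the receiver and
     the argument contain the same characters, ignoring all attributes
     (emphasis, fonts, text actions, embedded morphs)."
    ^string = aTextOrString asString
```

With something like this in place, 'foo' asText allBold hasSameCharactersAs: 'foo' would keep the old attribute-blind behavior, while #= itself would be free to compare runs as well.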


Best,

Christoph

