The Trunk: Collections-pre.857.mcz

commits-2

Oct 04, 2019; 9:04am

The Trunk: Collections-pre.857.mcz

Patrick Rein uploaded a new version of Collections to project The Trunk:
http://source.squeak.org/trunk/Collections-pre.857.mcz

==================== Summary ====================

Name: Collections-pre.857
Author: pre
Time: 4 October 2019, 11:04:30.363303 am
UUID: 5ef00b65-3884-c445-b276-0cc01f0b10a1
Ancestors: Collections-pre.856

Adds startOfHeader to Character, adds empty abstract implementations of scanFrom:, writeScanOn: to TextAttribute to allow for Texts which include TextAttributes which do not implement serialization to still be serialized, adds a comment to these methods.

=============== Diff against Collections-pre.856 ===============

Item was added:
+ ----- Method: Character class>>startOfHeader (in category 'accessing untypeable characters') -----
+ startOfHeader
+
+ ^ self value: 1 !

Item was added:
+ ----- Method: TextAttribute class>>scanFrom: (in category 'fileIn/Out') -----
+ scanFrom: strm
+ "Read the text attribute properties from the stream. When this method has
+ been called the concrete TextAttribute class has already been selected via
+ scanCharacter. (see TextAttribute class>>#newFrom:).
+ For writing the format see TextAttribute>>#writeScanOn:"!

Item was added:
+ ----- Method: TextAttribute>>writeScanOn: (in category 'fileIn/fileOut') -----
+ writeScanOn: strm
+ "Implement this method for a text attribute to define how it should be written
+ to a serialized form of a text object. The form should correspond to the source
+ file format, i.e. use a scan character to denote its subclass.
+ As TextAttributes are stored in RunArrays, this method is mostly called from RunArray>>#write scan.
+ For reading the written information see TextAttribute class>>#scanFrom:"
+
+ "Do nothing because of abstract class"!

patrick.rein

Oct 04, 2019; 11:01am

Re: The Trunk: Collections-pre.857.mcz

Hi everyone,

In the context of the new text anchor layout infrastructure, one thing is still missing. Currently, start of header is not included in Character class>>#separators, which leads to problems with text-editing Morphs. Since start of header is not printed at all (not even as whitespace), I would rather classify it as a separator and add it to the list in Character class>>#separators and to Character>>#isSeparator. The advantage of this approach is that we would not need any special cases in the text-editing Morphs. The disadvantage is that the list of separators becomes less obvious to understand.

Any thoughts about this?

Bests
Patrick

P.S.: Conceptually, start of header (or heading) is a control character. RFC 20 says: "A communication control character used at the beginning of a sequence of characters which constitute a machine-sensible address or routing information. Such a sequence is referred to as the 'heading.' An STX character has the effect of terminating a heading."
(https://tools.ietf.org/html/rfc20#section-5.2)
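The situation described above can be checked in a workspace. This is only a sketch: it assumes an image that already contains Collections-pre.857 (for Character class>>#startOfHeader) but not the proposed change to the separators, and the "expected" notes merely restate what the message reports.

```smalltalk
"Workspace sketch; evaluate each expression with 'print it'."
| soh |
soh := Character startOfHeader.	"added in Collections-pre.857; same as Character value: 1"
soh asInteger.	"1"
Character separators includes: soh.	"expected false before the proposed change"
soh isSeparator.	"expected false before the proposed change"
```

If the proposal were integrated, the last two expressions should answer true.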

David T. Lewis

Oct 04, 2019; 1:26pm

Re: The Trunk: Collections-pre.857.mcz

In reply to this post by commits-2

+1

This sounds like a good approach to me. If I understand correctly, it
amounts to this:

Character class>>separators
	"Answer a collection of the standard ASCII separator characters."

	^ #(32 "space"
		13 "cr"
		9 "tab"
		10 "line feed"
		12 "form feed"
		1 "text separator")
		collect: [:v | Character value: v] as: String

This seems simple and clear to me.

There are a lot of senders of #separators in the image, so it is
possible that it might have some unintended side effect. But that
seems unlikely.

Dave
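To judge how many senders of #separators a change could affect, the image can be asked directly. A minimal workspace sketch, assuming a stock Squeak image where SystemNavigation is available; the resulting list depends on the image and is not reproduced here.

```smalltalk
"Workspace sketch: survey the senders of #separators before changing it."
| callers |
callers := SystemNavigation default allCallsOn: #separators.
callers size.	"print it: number of sending methods in this image"

"Open a browser on the same senders to review them one by one."
SystemNavigation default browseAllCallsOn: #separators.
```

The same can be repeated for #isSeparator to review the second method touched by the proposal.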

Levente Uzonyi

Oct 05, 2019; 12:27pm

Re: The Trunk: Collections-pre.857.mcz

In reply to this post by commits-2

On Fri, 4 Oct 2019, [hidden email] wrote:

> Hi everyone,
>
> in the context of the new text anchor layouting infrastructure there is still one thing missing. Currently start of header is not included in Character class>>#separators. This leads to problems with text editing Morphs. As start of header is not printed at all (not even as white space) I would rather classify it as a separator and add it to the list in Character class>>#separators and to Character>>#isSeparator. The advantage of this approach is that we would not need any special case in the text editing morphs. The disadvantage is that the list of separators will be less obvious to understand.

Are there too many senders of these methods that need to identify start
of header? Or is it too hard to identify these methods?
If the answer is no to both questions, then I suggest using different
method names. E.g.: #isTextSeparator, #textSeparators.

Levente
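A sketch of what the alternative with separate selectors might look like. Both methods are hypothetical: the names #textSeparators and #isTextSeparator come from the suggestion above, the bodies are an assumption, and neither is part of Collections-pre.857. Each quoted header names the class the method would be compiled in.

```smalltalk
"Hypothetical extension methods; neither selector exists in the Trunk."

"Character class>>textSeparators"
textSeparators
	"Answer the standard separators plus start of header."
	^ self separators copyWith: self startOfHeader

"Character>>isTextSeparator"
isTextSeparator
	"Answer whether the receiver is a separator or start of header."
	^ self isSeparator or: [self = Character startOfHeader]
```

With this variant, #separators and #isSeparator keep their current meaning, and only the text-editing code that has to skip start of header would switch to the new selectors.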

Christoph Thiede

Aug 19, 2021; 6:09pm

Re: The Trunk: Collections-pre.857.mcz

Hi all,

Two years later, I would still love to see Patrick's proposal accepted into the Trunk.

My concrete problem with SOH (start of header) not being in Character separators is that text anchors in Smalltalk source code currently confuse the Shout styler, because SHParserST80 scanWhitespace sends CharacterSet nonSeparators. One might argue that we could introduce a separate CharacterSet notAtAllSeparators/nonUnicodeSeparator etc. (which would also exclude SOH), but I would dislike that proposal because it would force us to maintain multiple different definitions of the term "character" and increase the overall domain complexity. I can't see what would be wrong with treating all character instances according to the Unicode standard (as other frameworks such as .NET seem to do, too).
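The character-set side of this can be inspected in a workspace. A minimal sketch, assuming an image without the proposed change; the "expected" notes follow from the description above and are not captured output.

```smalltalk
"Workspace sketch; evaluate each expression with 'print it'."
| soh |
soh := Character value: 1.	"SOH, start of header"
CharacterSet nonSeparators includes: soh.	"expected true while SOH is not a separator"
CharacterSet separators includes: soh.	"expected false while SOH is not a separator"
```

As long as the first expression answers true, a scanner that skips whitespace by searching for the next non-separator will stop at SOH, which matches the styling problem described for text anchors in source code.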

I have been using Dave's version of Character separators from above for the last few months and have not experienced any unintended side effects of the change.

Could we please integrate Patrick's change, or are there any major objections? It would be great to get this kind of stuff working in Babylonian & Co. :-)

Best,
Christoph

PS: See also: http://forum.world.st/ENH-isSeparator-td5129517.html

Carpe Squeak!