Patrick Rein uploaded a new version of Collections to project The Trunk:
http://source.squeak.org/trunk/Collections-pre.857.mcz

==================== Summary ====================

Name: Collections-pre.857
Author: pre
Time: 4 October 2019, 11:04:30.363303 am
UUID: 5ef00b65-3884-c445-b276-0cc01f0b10a1
Ancestors: Collections-pre.856

Adds startOfHeader to Character, adds empty abstract implementations of scanFrom:, writeScanOn: to TextAttribute to allow for Texts which include TextAttributes which do not implement serialization to still be serialized, adds a comment to these methods.

=============== Diff against Collections-pre.856 ===============

Item was added:
+ ----- Method: Character class>>startOfHeader (in category 'accessing untypeable characters') -----
+ startOfHeader
+
+ 	^ self value: 1 !

Item was added:
+ ----- Method: TextAttribute class>>scanFrom: (in category 'fileIn/Out') -----
+ scanFrom: strm
+ 	"Read the text attribute properties from the stream. When this method has
+ 	been called the concrete TextAttribute class has already been selected via
+ 	scanCharacter. (see TextAttribute class>>#newFrom:).
+ 	For writing the format see TextAttribute>>#writeScanOn:"!

Item was added:
+ ----- Method: TextAttribute>>writeScanOn: (in category 'fileIn/fileOut') -----
+ writeScanOn: strm
+ 	"Implement this method for a text attribute to define how it it should be written
+ 	to a serialized form of a text object. The form should correspond to the source
+ 	file format, i.e. use a scan character to denote its subclass.
+ 	As TextAttributes are stored in RunArrays, this method is mostly called from RunArray>>#write scan.
+ 	For reading the written information see TextAttribute class>>#scanFrom:"
+
+ 	"Do nothing because of abstract class"!
Hi everyone,
in the context of the new text anchor layouting infrastructure there is still one thing missing. Currently start of header is not included in Character class>>#separators. This leads to problems with text editing Morphs. As start of header is not printed at all (not even as white space) I would rather classify it as a separator and add it to the list in Character class>>#separators and to Character>>#isSeparator. The advantage of this approach is that we would not need any special case in the text editing morphs. The disadvantage is that the list of separators will be less obvious to understand.

Any thoughts about this?

Bests
Patrick

P.S.: Conceptually start of header (or heading) is a control character. ietf says: "A communication control character used at the beginning of a sequence of characters which constitute a machine-sensible address or routing information. Such a sequence is referred to as the "heading." An STX character has the effect of terminating a heading."
(https://tools.ietf.org/html/rfc20#section-5.2)
In reply to this post by commits-2
+1
This sounds like a good approach to me. If I understand correctly, it amounts to this:

  Character class>>separators
	"Answer a collection of the standard ASCII separator characters."

	^ #(32 "space"
		13 "cr"
		9 "tab"
		10 "line feed"
		12 "form feed"
		1 "text separator")
			collect: [:v | Character value: v] as: String

This seems simple and clear to me. There are a lot of senders of #separators in the image, so it is possible that it might have some unintended side effect. But that seems unlikely.

Dave
In reply to this post by commits-2
On Fri, 4 Oct 2019, [hidden email] wrote:
> Hi everyone,
>
> in the context of the new text anchor layouting infrastructure there is still one thing missing. Currently start of header is not included in Character class>>#separators. This leads to problems with text editing Morphs. As start of header is not printed at all (not even as white space) I would rather classify it as a separator and add it to the list in Character class>>#separators and to Character>>#isSeparator. The advantage of this approach is that we would not need any special case in the text editing morphs. The disadvantage is that the list of separators will be less obvious to understand.

Are there too many senders of these methods that need to identify start of header? Or is it too hard to identify these methods?
If the answer is no to both questions, then I suggest using different method names, e.g. #isTextSeparator and #textSeparators.

Levente
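
The alternative suggested here could look roughly like the following workspace sketch. It is only an illustration under assumptions: #textSeparators and #isTextSeparator are the names proposed in this post and do not exist in Trunk, so the sketch uses local variables of the same names instead of real methods.

```smalltalk
"Workspace sketch; select everything and evaluate with 'print it'.
 textSeparators and isTextSeparator are local stand-ins for the
 suggested (hypothetical) selectors, not existing methods."
| soh textSeparators isTextSeparator |
soh := Character value: 1. "start of header, the value answered by Character startOfHeader"

"Leave Character separators untouched and build an extended collection next to it."
textSeparators := Character separators copyWith: soh.

"Stand-in for a Character>>isTextSeparator test."
isTextSeparator := [:aCharacter | textSeparators includes: aCharacter].

{isTextSeparator value: soh.
 isTextSeparator value: Character space.
 isTextSeparator value: $a}
```

With this shape, existing senders of #separators and #isSeparator keep their current behaviour, and only code that has to deal with text anchors would switch to the new names.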
Hi all,
two years later, I would still love to see Patrick's proposal accepted into the Trunk.

My concrete problem with SOH (start of header) not being in Character separators is that text anchors in Smalltalk source code currently confuse the Shout styler, because SHParserST80 scanWhitespace sends CharacterSet nonSeparators. Now one might argue that we could introduce a separate CharacterSet notAtAllSeparators/nonUnicodeSeparator etc. (which would also exclude SOH), but I would rather dislike this proposal because it would force us to maintain multiple different definitions of the term "character" and increase the overall domain complexity. I can't see what would be wrong with treating all character instances according to the Unicode standard (as other frameworks such as .NET seem to do, too).

I have been using Dave's version of Character separators from above for the last few months and have not experienced any unintended side effects of the change.

Could we please integrate Patrick's change, or are there any major objections? It would be great to get this kind of stuff working in Babylonian & Co. :-)

Best,
Christoph

PS: See also: http://forum.world.st/ENH-isSeparator-td5129517.html
Carpe Squeak!
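
The Shout problem described in this post can be probed with a one-line workspace check. This is a minimal sketch, assuming that CharacterSet nonSeparators is the set SHParserST80 consults while skipping whitespace, as stated above; the variable name soh is only for illustration.

```smalltalk
"Workspace sketch; evaluate with 'print it'.
 Asks whether start of header is currently treated as a non-separator."
| soh |
soh := Character value: 1. "start of header (SOH)"
CharacterSet nonSeparators includes: soh
```

If the description in this thread matches the image at hand, SOH is reported as a non-separator before the proposed change, which would explain why the styler does not skip it as whitespace.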