On Monday 30 April 2007 7:12 pm, Edgar J. De Cleene wrote:
> On 4/30/07 10:22 AM, "subbukk" <[hidden email]> wrote:
> > Hi,
> >
> > I see some artifacts (like []) displayed in comments and descriptions in
> > Squeak (3.7-7 vm on Linux, 3.8 image and SqueakV39.sources). I suspect
> > the '\r' line terminator in the *.sources file could be causing it.
> >
> > Is this a bug?
> >
> > TIA .. Subbu
>
> I think what could be eliminated if you do Smalltalk removeAllLineFeeds.

[] glyphs instead of it. The method seems to replace CR with CRLF, which makes it worse. Strangely,

    'hello world' displayAt: 100@100.

shows up correctly as two-line text. So why does the [] appear in the browser?

Regards .. Subbu
On Monday 30 April 2007 7:12 pm, Edgar J. De Cleene wrote:
> On 4/30/07 10:22 AM, "subbukk" <[hidden email]> wrote:
> > Hi,
> >
> > I see some artifacts (like []) displayed in comments and descriptions in
> > Squeak (3.7-7 vm on Linux, 3.8 image and SqueakV39.sources). I suspect
> > the '\r' line terminator in the *.sources file could be causing it.
> >
> > Is this a bug?
> >
> > TIA .. Subbu
>
> I think what could be eliminated if you do Smalltalk removeAllLineFeeds.
>
> In Mac , I found this often.

Squeak3.9 handling of CRLF sequences in the sources file is defective. The Squeak3.8 image correctly strips off the LF in CRLF while reading in text from the SqueakV3.sources file. For instance, the editStartPage method in Scamper uses CRLF in the SqueakV3.sources file, but its Text in the code pane has the LF stripped out.

BTW, there is a typo in my original request. It should have read "3.9 image and SqueakV39.sources". Sorry for the typo.

Regards .. Subbu
On May 7, 2007, at 6:14 , subbukk wrote:

> On Monday 30 April 2007 7:12 pm, Edgar J. De Cleene wrote:
>> On 4/30/07 10:22 AM, "subbukk" <[hidden email]> wrote:
>>> Hi,
>>>
>>> I see some artifacts (like []) displayed in comments and descriptions in
>>> Squeak (3.7-7 vm on Linux, 3.8 image and SqueakV39.sources). I suspect
>>> the '\r' line terminator in the *.sources file could be causing it.
>>>
>>> Is this a bug?
>>>
>>> TIA .. Subbu
>>
>> I think what could be eliminated if you do Smalltalk removeAllLineFeeds.
>>
>> In Mac , I found this often.
>
> Squeak3.9 handling of CRLF sequences in sources file is defective.

I do not think it is.

> Squeak3.8 image correctly strips of LF in CRLF while reading in text from
> SqueakV3.sources file. For instance, editStartPage method in Scamper uses
> CRLF in SqueakV3.sources file but its Text in codepane strips out LF.

I do not think this is the case.

It is just that images up to 3.8 did not *display* LF chars embedded in a String because the corresponding glyph in the fonts was blank. This has been fixed, we should not have invisible characters anymore (which are very annoying), and this just exposes the problem of LFs in some method sources. They have been there all along, they just were not visible.

Reading in code must not strip LFs unless specifically told to, because it is perfectly valid to embed a LF in a String in source code if you want to.

- Bert -
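Bert's point that the LFs are really part of the strings, and that removing them should be an explicit step, can be tried in a workspace. The following is only a sketch: `source` is a made-up method source standing in for a chunk with CRLF line ends (it is not text taken from any sources file), and the explicit `copyReplaceAll:with:` call is just one way to normalize, not necessarily what any Squeak tool does.

```smalltalk
| crlf cr source normalized |
cr := String with: Character cr.
crlf := String with: Character cr with: Character lf.

"Hypothetical method source with a CRLF line end, standing in for a chunk read from a file."
source := 'example', crlf, (String with: Character tab), '^ 42'.

"The LF is an ordinary character of the string, whether or not a font draws it."
source includes: Character lf.          "true"

"Normalizing is an explicit action taken by the caller."
normalized := source copyReplaceAll: crlf with: cr.
normalized includes: Character lf.      "false"
source size - normalized size.          "1, the single LF that was dropped"
```

Select each expression and use "print it" to see the results. Doing this blindly to every string would also change an LF that someone embedded in a String literal on purpose, which is the case Bert warns about.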
On Monday 07 May 2007 10:53 pm, Bert Freudenberg wrote:
> > Squeak3.9 handling of CRLF sequences in sources file is defective.
>
> I do not think it is.

SqueakV39.sources contains the following sequence, as seen in a hex editor, for DateAndTime commentStamp:

    I have zero duration\r\n\r\n\r\n

When I browse this class and inspect the Text object in the annotation pane, the same sequence shows up in its string variable. Since annotations are read in as line-oriented text from files, the CRLFs should have been replaced with CRs. Of course, we could take the stance that the sources file is corrupt since it uses mixed line endings, but then what about text read in from a changes file or from a file-in that came through email?

> > Squeak3.8 image correctly strips of LF in CRLF while reading in text from
> > SqueakV3.sources file. For instance, editStartPage method in Scamper uses
> > CRLF in SqueakV3.sources file but its Text in codepane strips out LF.
>
> I do not think this is the case.

SqueakV3.sources contains the sequence:

    editStartPage\r\n\t

But the Text object in the code pane shows the sequence:

    editStartPage\r\t

But I did notice that the CRLF was retained in the comments for B3DRotation. Is it because code strings are parsed while comment strings are not interpreted?

Regards .. Subbu
On May 7, 2007, at 14:47 , subbukk wrote:

> On Monday 07 May 2007 10:53 pm, Bert Freudenberg wrote:
>>> Squeak3.9 handling of CRLF sequences in sources file is defective.
>>
>> I do not think it is.
> SqueakV39.sources contains the following sequence as seen in a hexeditor for
> DateAndTime commentStamp:
> I have zero duration\r\n\r\n\r\n
> When I browse this class and inspect the Text object in the annotation pane,
> the same sequence shows up in its string variable. Since annotations are read
> in as line-oriented text from files

They are not. The sources and changes file is *not* a text file even though it might look like one to the uninitiated. It's a database of data chunks and the image actually stores byte offsets into this file. When you move the file to a different platform you *must not* change the line ending convention.

> , the CRLFs should have been replaced with
> CRs. Of course, we could take a stance that the sources files is corrupt
> since it uses mixed line endings, but then what about text read in from a
> changes file or from a filein that came thru email?

We might be more tolerant when filing in, this is true. But this should be an explicit action because there actually are file-ins that contain binary data which we do *not* want to mess with.

>>> Squeak3.8 image correctly strips of LF in CRLF while reading in text from
>>> SqueakV3.sources file. For instance, editStartPage method in Scamper uses
>>> CRLF in SqueakV3.sources file but its Text in codepane strips out LF.
>>
>> I do not think this is the case.
> SqueakV3.sources contains the sequence:
> editStartPage\r\n\t
> But the Text object in the codepane shows the sequence:
> editStartPage\r\t

Let's see.

    (Scamper>>#editStartPage) fileIndex "2"

which means it is in the changes file, not the sources file.

    (Scamper>>#editStartPage) filePosition "10310306"

which tells you the file offset.

    (Scamper>>#editStartPage) getSourceFromFile asString asByteArray
    "a ByteArray(101 100 105 116 83 116 97 114 116 80 97 103 101 13 9 ...)"

which is the source code as retrieved by the browser. Note there is only a 13 (CR), no LF (10).

    | f |
    [(f := FileStream readOnlyFileNamed: Smalltalk changesName)
        binary;
        position: 10310306;
        next: 40] ensure: [f close]
    "a ByteArray(101 100 105 116 83 116 97 114 116 80 97 103 101 13 9 ...)"

which confirms that this is actually in the file.

- Bert -
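Why a line-ending conversion is harmful to a file addressed by byte offsets can be shown without touching the real sources or changes files. This is a minimal sketch under stated assumptions: the two "chunks" and the `!` separator are made-up sample data held in a String, and `offset` plays the role of the stored `filePosition`; none of it reproduces the actual layout of a Squeak changes file.

```smalltalk
| cr crlf original converted offset readAt |
cr := String with: Character cr.
crlf := String with: Character cr with: Character lf.

"Two hypothetical chunks; the second one is located by a remembered offset."
original := 'first', cr, 'chunk!', 'second', cr, 'chunk!'.

"Zero-based stream position of the second chunk, as an image would remember it."
offset := (original findString: 'second') - 1.

"What a helpful tool does when it rewrites CR as CRLF."
converted := original copyReplaceAll: cr with: crlf.

"Read six characters at the remembered offset, like the position:/next: call above."
readAt := [:data | (ReadStream on: data) position: offset; next: 6].

readAt value: original.     "'second'"
readAt value: converted.    "'!secon', because the added LF shifted everything after it"
```

Each inserted LF moves every later chunk by one more byte, so the further into the file a method lives, the further off its stored position becomes.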
On Tuesday 08 May 2007 12:41 am, Bert Freudenberg wrote:
> On May 7, 2007, at 14:47 , subbukk wrote:
> > SqueakV39.sources contains the following sequence as seen in a hexeditor for
> > DateAndTime commentStamp:
> > I have zero duration\r\n\r\n\r\n
> > When I browse this class and inspect the Text object in the annotation pane,
> > the same sequence shows up in its string variable. Since annotations are read
> > in as line-oriented text from files
>
> They are not. The sources and changes file is *not* a text file even
> though it might look like one to the uninitiated. It's a database of
> data chunks and the image actually stores byte offsets into this
> file. When you move the file to a different platform you *must not*
> change the line ending convention.

By text, I only meant portions of a chunk and not the whole file. This is the reason I used a hex editor for the whole file. But it is good that you pointed out the special nature of these files.

> > , the CRLFs should have been replaced with
> > CRs. Of course, we could take a stance that the sources files is corrupt
> > since it uses mixed line endings, but then what about text read in from a
> > changes file or from a filein that came thru email?
>
> We might be more tolerant when filing in, this is true. But this
> should be an explicit action because there actually are file-ins that
> contain binary data which we do *no* want to mess with.

Yes, I understand this. But there are contexts where we know the byteArray to be a text sequence.

> > SqueakV3.sources contains the sequence:
> > editStartPage\r\n\t
> > But the Text object in the codepane shows the sequence:
> > editStartPage\r\t
> ..
> (Scamper>>#editStartPage) getSourceFromFile asString asByteArray
> "a ByteArray(101 100 105 116 83 116 97 114 116 80 97 103 101 13 9 ...)"
>
> which is the source code as retrieved by the browser, note there only
> is a 13 (CR) no LF (10).

I stand corrected. I poked directly into the sources file and forgot to check the changes file, so this is a wrong example. There are other examples in the 3.8 image, like:

    (String>>#asDateAndTime) getSourceFromFile asString asByteArray
    a ByteArray(97 115 68 97 116 101 65 110 100 84 105 109 101 13 10 13 10 9 34 67 ...

where the CRLF line ending pops up.

I am curious about how these CRLFs got into the chunks in the first place. I don't know Squeak well enough to track this down quickly, so when I saw the artifacts, I seized this opportunity to dig into the internals.

Bert, thank you very much for explaining your reasoning in detail and in Squeak code. It helps me learn the internals faster.

Regards .. Subbu
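The byte dump for `String>>#asDateAndTime` can be checked mechanically instead of by eye. The sketch below assumes nothing beyond the twenty bytes quoted in the post, typed in as a literal array; the counting loop is an illustrative helper, not an existing Squeak method.

```smalltalk
| bytes crlfCount |
"The bytes quoted above for String>>#asDateAndTime, up to the elision."
bytes := #(97 115 68 97 116 101 65 110 100 84 105 109 101 13 10 13 10 9 34 67) asByteArray.

"Count every CR (13) that is immediately followed by an LF (10)."
crlfCount := 0.
1 to: bytes size - 1 do: [:i |
    ((bytes at: i) = 13 and: [(bytes at: i + 1) = 10])
        ifTrue: [crlfCount := crlfCount + 1]].

crlfCount.                                "2"
(bytes copyFrom: 1 to: 13) asString.      "'asDateAndTime'"
```

The two CRLF pairs sit right after the selector, which matches the blank line that shows up as [] glyphs in the browser.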
Do not read the code using emacs or vi. Use the tools in Squeak; the sources and changes files are Squeak's internal format for saving code.

Stef

On 8 May 07, at 08:34, subbukk wrote:

> On Tuesday 08 May 2007 12:41 am, Bert Freudenberg wrote:
>> On May 7, 2007, at 14:47 , subbukk wrote:
>>> SqueakV39.sources contains the following sequence as seen in a hexeditor for
>>> DateAndTime commentStamp:
>>> I have zero duration\r\n\r\n\r\n
>>> When I browse this class and inspect the Text object in the annotation pane,
>>> the same sequence shows up in its string variable. Since annotations are read
>>> in as line-oriented text from files
>>
>> They are not. The sources and changes file is *not* a text file even
>> though it might look like one to the uninitiated. It's a database of
>> data chunks and the image actually stores byte offsets into this
>> file. When you move the file to a different platform you *must not*
>> change the line ending convention.
> By text, I only meant portions of chunk and not the whole file. This is the
> reason I used a hexeditor for the whole file. But it is good that you pointed
> out the special nature of these files.
>
>>> , the CRLFs should have been replaced with
>>> CRs. Of course, we could take a stance that the sources files is corrupt
>>> since it uses mixed line endings, but then what about text read in from a
>>> changes file or from a filein that came thru email?
>>
>> We might be more tolerant when filing in, this is true. But this
>> should be an explicit action because there actually are file-ins that
>> contain binary data which we do *no* want to mess with.
> Yes, I understand this. But there are contexts where we know the byteArray to
> be a text sequence.
>
>>> SqueakV3.sources contains the sequence:
>>> editStartPage\r\n\t
>>> But the Text object in the codepane shows the sequence:
>>> editStartPage\r\t
>> ..
>> (Scamper>>#editStartPage) getSourceFromFile asString asByteArray
>> "a ByteArray(101 100 105 116 83 116 97 114 116 80 97 103 101 13 9 ...)"
>>
>> which is the source code as retrieved by the browser, note there only
>> is a 13 (CR) no LF (10).
> I stand corrected. I poked directly into the Sources file and forgot to check
> the changes file, so this is a wrong example. There are other examples 3.8
> image like:
> (String>>#asDateAndTime) getSourceFromFile asString asByteArray
> a ByteArray(97 115 68 97 116 101 65 110 100 84 105 109 101 13 10 13 10 9 34 67 ...
> where the CRLF line ending pops up.
>
> I am curious about how these CRLFs got into the chunks in the first place? I
> dont know Squeak well enough to track this down quickly, so when I saw the
> artifacts, I seized this opportunity to dig into internals.
>
> Bret, thank you very much for explaining your reasoning in detail and in
> Squeak code. It helps me learn internals faster.
>
> Regards .. Subbu
In reply to this post by K. K. Subramaniam
On May 8, 2007, at 2:34 , subbukk wrote:
> I am curious about how these CRLFs got into the chunks in the first place?

By people and software who think changing and even adding some bytes in a file is a jolly good idea.

Traditionally, only CR was used. All was fine. Then some people insisted on storing fileouts the "right way" with platform-dependent line endings by using CrLfStream (or whatever it was named). Or ftp tools tried to be "helpful" by converting CRs to CRLFs in fileouts, at least on one particular platform that Squeak happens to run on.

Anyway, when people filed these in, the bad characters went unnoticed because LF was shown as a zero-width (hence invisible) character. Only now with the fixed fonts do these show up.

- Bert -
In reply to this post by Bert Freudenberg
On 5/7/07 2:23 PM, "Bert Freudenberg" <[hidden email]> wrote:

> On May 7, 2007, at 6:14 , subbukk wrote:
>
>> On Monday 30 April 2007 7:12 pm, Edgar J. De Cleene wrote:
>>> On 4/30/07 10:22 AM, "subbukk" <[hidden email]> wrote:
>>>> Hi,
>>>>
>>>> I see some artifacts (like []) displayed in comments and descriptions in
>>>> Squeak (3.7-7 vm on Linux, 3.8 image and SqueakV39.sources). I suspect
>>>> the '\r' line terminator in the *.sources file could be causing it.
>>>>
>>>> Is this a bug?
>>>>
>>>> TIA .. Subbu
>>>
>>> I think what could be eliminated if you do Smalltalk removeAllLineFeeds.
>>>
>>> In Mac , I found this often.
>>
>> Squeak3.9 handling of CRLF sequences in sources file is defective.
>
> I do not think it is.
>
>> Squeak3.8 image correctly strips of LF in CRLF while reading in text from
>> SqueakV3.sources file. For instance, editStartPage method in Scamper uses
>> CRLF in SqueakV3.sources file but its Text in codepane strips out LF.
>
> I do not think this is the case.
>
> It is just that images up to 3.8 did not *display* LF chars embedded
> in a String because the corresponding glyph in the fonts was blank.
> This has been fixed, we should not have invisible characters anymore
> (which are very annoying), and this just exposes the problem of LFs
> in some method sources. They have been there all along, they just
> were not visible.
>
> Reading in code must not strip LFs unless specifically told to,
> because it is perfectly valid to embed a LF in a String in source
> code if you want to.
>
> - Bert -

solution what works on Mac. I select the text with artifacts (in my case the preamble of a new .cs) and paste it again into the PluggableTextMorph.

Edgar

Attachment: Clipboard-clipboardText.st (1K)