Hi,
I bumped into CRLF pollution in the SqueakV39.sources file and also saw the bug report http://bugs.squeak.org/view.php?id=6173 by Andrew. IMHO, line endings are platform-specific (not filesystem-specific). A fileout to a FAT32 filesystem on a Linux box should put LF in the file, not CRLF. Readers should be prepared to handle any convention, while writers should use the platform-native convention for text files.

Shouldn't this be handled in a primitive or a System Attribute? Attempting to guess line endings in Squeak code sounds regressive.

Regards .. Subbu
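To make the "tolerant reader, platform-native writer" rule concrete, here is a minimal sketch. It is written in Python rather than Squeak purely for illustration; the function names `read_lines` and `write_lines` are hypothetical and are not part of Squeak or of the proposed fix. The writer takes its default line ending from `os.linesep`, which reflects the operating system and not the filesystem being written to.

```python
import os


def read_lines(data: bytes) -> list[str]:
    """Split raw bytes into lines, accepting CRLF, LF, or bare CR."""
    text = data.decode("ascii")
    # Handle CRLF first so it is not counted as two separate line breaks.
    normalised = text.replace("\r\n", "\n").replace("\r", "\n")
    lines = normalised.split("\n")
    # A trailing terminator leaves one empty string at the end; drop it.
    if lines and lines[-1] == "":
        lines.pop()
    return lines


def write_lines(lines: list[str], eol: str = os.linesep) -> bytes:
    """Join lines using the platform-native line ending by default."""
    return "".join(line + eol for line in lines).encode("ascii")
```

With this split, a file carrying mixed conventions reads back as clean lines, and rewriting it on a Linux box yields LF only, whatever filesystem it lands on.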
Subbu,
I think that you are right when you say "line endings is platform-specific (not filesystem specific). A fileout on a FAT32 filesystem on a Linux box, should put LF in the file not CRLF." My proposed fix would have made line endings system-specific for Mac, but not for Linux. This one is hard to test unless you have all the systems. Please embellish my fix so that it is right for Linux, and add it to the bug report.

Andrew

On 7 May 2007, at 10:52, subbukk wrote:
> Reader should be prepared to handle any convention while writers should be
> using platform-native convention for text files.
Andrew P. Black
Department of Computer Science
Portland State University
+1 503 725 2411
On Tuesday 08 May 2007 12:35 pm, Andrew P. Black wrote:
> Subbu,
>...
> My proposed fix would have made line-endings system specific for Mac,
> but not for Linux. Hard to test this one unless you have all the
> systems....

I am not sure that putting platform-specific logic in the Squeak image is a good idea. The code soon turns ugly as more platforms get added. Such assumptions are best encapsulated in VMs.

The confusion with line endings arises because there are two kinds of text line objects, say SqueakLine and MachLine. SqueakLine uses '\r' as its EOL (end of line) marker, while the MachLine EOL marker is specific to the underlying VM. Since Squeak objects can cross machine boundaries, all lines in Squeak must be SqueakLines. We could use methods like:

    Stream>>isEOL    "check for EOL marker in current position for the current VM"
    Stream>>skipEOL  "skip EOL marker if present. return number of octets skipped"
    Stream>>putEOL   "put EOL marker on a stream"

to deal with MachLines. I wish I knew enough Squeak to put out a patch file :-(.

> > Reader should be prepared to handle any convention while writers
> > should be
> > using platform-native convention for text files.
>
> That is in fact what the code does. For existing files, the CRLF
> code will detect what is there. The CRLF platform-specific detection
> applies only to new files, and only to CRLF text files.

I meant text *lines* and not text *files*. Sorry. Files should be treated as ByteArrays, and ASCII is one of the interpretations of ByteArray sequences. ASCII sequences can come embedded in "binary" files too.

> > Shouldn't this be handled in a primitive or a System Attribute?
> > Attempting to
> > guess line ending in Squeak code sounds regressive.
>
> I don't see why there is need for a primitive, when a couple of lines
> of Smalltalk can do the right thing easily enough...

Code like beginsWith('darwin') soon gets to be unmanageable. The file read/write system primitive should be able to check for EOL or EOF markers in a much more portable way (like writing to /dev/stdout or /dev/console).
This is just my $0.02,
Subbu
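The isEOL / skipEOL / putEOL idea above could be sketched roughly as follows. This is an illustration in Python, not Squeak code and not an existing Squeak or VM API: the class `MachLineStream`, the constant `SQUEAK_EOL`, and the helpers `next_squeak_line` and `put_squeak_line` are all hypothetical names. It assumes the platform marker is supplied once (here defaulting to `os.linesep`, standing in for what a VM primitive would report), so that no platform-name checks appear in the calling code.

```python
import os

SQUEAK_EOL = "\r"  # in-image end-of-line marker, as described in the thread


class MachLineStream:
    """Byte stream whose EOL marker is fixed by the platform, not guessed."""

    def __init__(self, data: bytes = b"",
                 eol: bytes = os.linesep.encode("ascii")):
        self.data = bytearray(data)
        self.pos = 0
        self.eol = eol  # assumption: a VM primitive would supply this value

    def is_eol(self) -> bool:
        """Check for the platform EOL marker at the current position."""
        return self.data[self.pos:self.pos + len(self.eol)] == self.eol

    def skip_eol(self) -> int:
        """Skip the EOL marker if present; return number of octets skipped."""
        if self.is_eol():
            self.pos += len(self.eol)
            return len(self.eol)
        return 0

    def put_eol(self) -> None:
        """Append the platform EOL marker to the end of the stream."""
        self.data += self.eol

    def next_squeak_line(self) -> str:
        """Read one platform line and return it ending in SQUEAK_EOL.

        A final line with no terminator is also returned with SQUEAK_EOL.
        """
        start = self.pos
        while self.pos < len(self.data) and not self.is_eol():
            self.pos += 1
        text = self.data[start:self.pos].decode("ascii")
        self.skip_eol()
        return text + SQUEAK_EOL

    def put_squeak_line(self, line: str) -> None:
        """Write an in-image line, swapping its marker for the platform's."""
        self.data += line.rstrip(SQUEAK_EOL).encode("ascii")
        self.put_eol()
```

The point of the design is that only the constructor knows the platform convention; everything inside the image sees '\r'-terminated lines.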