Text Bug

Kirk W. Fraser
I discovered an old text file that both old and new versions of Dolphin
handle quite differently from the same text saved by a modern text
editor.

I've tried using Notepad to save a similar file as text, MS-DOS text, and
Unicode text, but none of those reproduced the error.  Following is the
Dolphin program, which ignores the commented lines when processing
the old file yet executes them on the new one.  The old file is Grady Ward's
KJV Bible, circa 1993.  Has anybody encountered something like this before?

| in out line chr |
"Clean Bible"
in := (File open: 'c:\BackAI\Kjv\Kjvbible') readWriteStream.
out := (File open: 'c:\BackAI\Kjv\Bible' mode: #create check: false
            share: #exclusive) readWriteStream.
[in atEnd] whileFalse: [
    line := ReadStream on: in nextLine.
    line atEnd                              "stop at first blank line"
        ifTrue: [in setToEnd]
        ifFalse: [
            line upTo: $ .                  "removes book abbreviated name"
            line upTo: $ .                  "removes chapter and verse numbers"
            [line atEnd] whileFalse: [
                chr := line next.
                chr isPunctuation ifFalse: [
                    (chr = $Õ or: [chr = $õ]) ifTrue: [chr := $'].
                    out nextPut: chr asLowercase]].
            out cr]].
in close.
out close.

Thanks,
Kirk Fraser



Re: Text Bug

Ian Bartholomew-13
Kirk,

> I discovered an old text file that both old and new versions of Dolphin
> handle quite differently from the same text saved by a modern
> text editor.
>
> I've tried using Notepad to save a similar file as text, MS-DOS text, and
> Unicode text, but none of those reproduced the error.  Following is the
> Dolphin program, which ignores the commented lines when
> processing the old file yet executes them on the new one.  The old file is
> Grady Ward's KJV Bible, circa 1993.  Has anybody encountered
> something like this before?

I would guess the old file was created by an editor that could set the high
bit in the character code for a space to indicate some sort of word-processing
operation (a common use was to set bit 7 on all spaces that were inserted as
part of a justification operation). The fact that your text also contains
non-standard ASCII characters, and therefore was created using 8-bit
characters, would seem to confirm that.  You wouldn't notice anything
different visually, as a blank is displayed for both space (ASCII 16r20) and
its 8-bit equivalent (16rA0), but it would cause your test (line upTo: $ .)
to fail.
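
To make the suspected failure concrete, here is a minimal workspace sketch. It is an illustration only: the sample line is hypothetical (it only imitates the "book chapter:verse text" layout the script expects), and it assumes the odd spaces really are character 16rA0, which has not been confirmed against the actual file.

```smalltalk
| highSpace text oddText plain odd fixed |
highSpace := Character value: 16rA0.    "assumption: a space with bit 7 set"
text := 'Ge 1:1 In the beginning'.      "hypothetical line, not taken from the real file"
oddText := text copyReplaceAll: ' ' with: (String with: highSpace).

plain := ReadStream on: text.
odd := ReadStream on: oddText.
plain upTo: $ .     "skips 'Ge' and stops just past the first ordinary space"
odd upTo: $ .       "finds no 16r20 at all, so it runs to the end of the line"

"Possible workaround: map the high-bit spaces back to ordinary ones first."
fixed := ReadStream on: (oddText copyReplaceAll: (String with: highSpace) with: ' ').
fixed upTo: $ .

Transcript
    show: 'plain at end: '; print: plain atEnd; cr;
    show: 'odd at end: '; print: odd atEnd; cr;
    show: 'fixed at end: '; print: fixed atEnd; cr.
```

If the guess is right, the same copyReplaceAll:with: call could be applied to each result of in nextLine before wrapping it in a ReadStream, leaving the rest of the cleaning script unchanged.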

An additional problem may be that lines in the original are terminated in
U*IX format, a single lf character, rather than the Windows standard cr-lf.
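
One way to check the line-ending guess is to count how many lf bytes are preceded by a cr. The sketch below does that over a small, hypothetical ByteArray standing in for the file contents; with the real file the bytes would have to come from a stream opened in binary mode.

```smalltalk
| bytes crlf loneLf previous |
"Hypothetical sample bytes: 'a' lf 'b' cr lf 'c' lf"
bytes := #[97 10 98 13 10 99 10].
crlf := 0.
loneLf := 0.
previous := nil.
bytes do: [:each |
    each = 10 ifTrue: [
        previous = 13
            ifTrue: [crlf := crlf + 1]          "Windows-style cr-lf"
            ifFalse: [loneLf := loneLf + 1]].   "bare lf"
    previous := each].
Transcript
    show: 'cr-lf pairs: '; print: crlf; cr;
    show: 'lone lf: '; print: loneLf; cr.
```

A file that reports only lone lf terminators would support the single-lf explanation; a mix of both would suggest the file has been edited by more than one tool.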

Rather than trying to guess which combination of the above (if any, of course
:-) ) the file uses, it might be best to either post (or mail me if you
prefer) a short section, a couple of K should do it, so we can have a look
for ourselves.  Don't extract the sample using a text editor though, as that
could destroy the evidence; use a Dolphin script instead.

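| inFile outFile |
"Copy up to the first 2000 bytes untranslated (text: false gives binary streams)."
inFile := FileStream read: 'your original file path' text: false.
outFile := FileStream write: 'sample.bin' text: false.
2000 timesRepeat: [inFile atEnd ifFalse: [outFile nextPut: inFile next]].
inFile close.
outFile close.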
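
Once a binary sample exists, a quick census of the bytes above 127 would show which 8-bit codes the file actually uses. This is only a sketch: the ByteArray below is made-up data, in which 160 (16rA0) plays the high-bit space and 213 and 245 play the two accented characters the cleaning script tests for, assuming a Latin-1 style code page.

```smalltalk
| bytes high |
"Hypothetical sample bytes, not taken from the real file."
bytes := #[71 101 160 49 213 32 245 160 10].
high := Bag new.
bytes do: [:each | each > 127 ifTrue: [high add: each]].
"List each distinct high-bit byte value with its number of occurrences."
high asSet asSortedCollection do: [:each |
    Transcript
        print: each; show: ' occurs ';
        print: (high occurrencesOf: each); show: ' time(s)'; cr].
```

A large count for 160 alongside few or no occurrences of 32 would point strongly at the high-bit space explanation.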

Regards
    Ian