Assumed bug in Regex11 parcel

Lothar Rostek
I found some strange behaviour when applying the method
     #copyStream:to:translatingMatchesUsing:
of class RxMatcher.

On a Windows machine I have a text file 'test1.txt' with the following
content:

'test text
...further text
end of text
'

When I inspect the following expression in a workspace:

     in := 'test1.txt' asFilename readStream.
     out := WriteStream on: (String new: in size).
     coll := OrderedCollection new.
     ['\.\.\.' asRegex copyStream: in to: out translatingMatchesUsing: [:x |
          coll add: x.
          '…']] ensure: [in close. out close].
     coll

coll then contains '..f' instead of '...'.

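The reason is that the implementation assumes that each 'aStream next'
advances the stream position by exactly 1. For files with
lineEndConvention CRLF (the default on Windows) this is not always true:
reading a line end returns a single character but moves the position
past both CR and LF. A loop that counts calls to 'next' therefore
overshoots by one position for every line end it copies, which would
explain why the match is shifted from '...' to '..f'.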
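
The position jumps described above could be checked directly in a workspace. This is only a rough sketch, assuming the same 'test1.txt' as before and that #position on the file stream reports the underlying file offset; the variable names jumps and before are my own and do not come from the parcel.

```smalltalk
| in jumps before |
in := 'test1.txt' asFilename readStream.
jumps := OrderedCollection new.
[[in atEnd] whileFalse:
    [before := in position.
    in next.
    "Record every read that moved the position by something other than 1."
    in position - before = 1
        ifFalse: [jumps add: before -> in position]]]
    ensure: [in close].
jumps
```

If the diagnosis holds, inspecting jumps should show one entry per line end in the file, each spanning two positions.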

One solution is to replace

     searchStart to: matchStart - 1 do:
          [:ignoredPos | writeStream nextPut: aStream next].

with:

     [aStream position < matchStart]
          whileTrue: [writeStream nextPut: aStream next].

The same piece of code also appears in the method
#copyStream:to:replacingMatchesWith:, which presumably needs the same
change.

Perhaps someone else will find a better solution.


----
Lothar Rostek