Hi all,
I want to stream over a UTF-8 encoded file and have a problem with #upToAll: answering more than it should. Here is a minimal test:

'ße' readStream upToAll: 'e'.

This answers 'ß' as expected. I wrote those two characters to a file:

FileStream forceNewFileNamed: 'test' do: [ :stream | stream nextPutAll: 'ße' ].

I opened it in a text editor to verify the encoding. It is UTF-8, as expected. Now if I do the following...

'test' asFileReference readStreamDo: [ :stream | stream upToAll: 'e' ].

... I get 'ße' instead of just 'ß'. I tried explicitly setting a UTF8TextConverter:

'test' asFileReference readStreamDo: [ :stream | stream converter: UTF8TextConverter new; upToAll: 'e' ].

However, the result is still 'ße'. If I read the entire file first using #contentsOfEntireFile, it works:

('test' asFileReference readStreamDo: [ :stream | stream contentsOfEntireFile ]) readStream upToAll: 'e'

However, that defeats the purpose of streaming. Can someone tell me what I am doing wrong? I am on Pharo 6.1 on a Mac.

Bernhard
Hi,
#upTo: works fine.

'test' asFileReference readStreamDo: [ :stream | stream converter: UTF8TextConverter new; upTo: $e ]. "'ß'"

It looks like PositionableStream>>#upToAll: assumes a one-to-one mapping between stream positions and items, and only takes the difference between the starting position and the position where the pattern is found.

Best regards,
Henrik

--
Sent from: http://forum.world.st/Pharo-Smalltalk-Developers-f1294837.html
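A rough illustration of the position arithmetic Henrik describes, for a Playground. It assumes that the old implementation computes the result length as end position minus start position minus pattern size, and that positions on the file stream count bytes rather than characters. The variable names are made up for this sketch; it is not the actual method source.

```smalltalk
| bytes startPos endMatch patternSize |
bytes := 'ße' utf8Encoded.	"ß encodes to two bytes, e to one"
startPos := 0.	"position before reading anything"
endMatch := bytes size.	"assumed byte position just after the matched 'e'"
patternSize := 'e' size.
endMatch - startPos - patternSize.	"number of items that would then be read"
```

The last expression evaluates to 2 although only one character precedes the pattern, which would explain the 'ße' result. On an in-memory string, positions and characters coincide (2 - 0 - 1 = 1), so the same arithmetic gives 'ß'.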
Hi Henrik,
Thanks for your answer. Sounds like a bug, then. :-/

Cheers,
Bernhard

> On 28.12.2017 at 20:31, Henrik-Nergaard <[hidden email]> wrote:
> [...]
I just checked and the bug in #upToAll: is still there in Pharo 7.
Bernhard

> On 29.12.2017 at 20:26, Bernhard Pieber <[hidden email]> wrote:
> [...]
I created an issue for this:
https://pharo.fogbugz.com/f/cases/20898

Bernhard

> On 30.12.2017 at 12:01, Bernhard Pieber <[hidden email]> wrote:
> [...]
Here is a fix.
------------------------------------
PositionableStream>>#upToAll: aCollection
	"Answer a subcollection from the current access position to the occurrence (if any, but not inclusive) of aCollection. If aCollection is not in the stream, answer the entire rest of the stream."

	| output pattern |

	aCollection ifEmpty: [ ^ collection species empty ].

	output := (collection species new: 100) writeStream.
	pattern := aCollection readStream.

	[ pattern atEnd ] whileFalse: [ | item |
		self atEnd ifTrue: [
			output next: pattern position putAll: aCollection startingAt: 1.
			^ output contents
		].

		item := self next.
		(pattern peekFor: item) ifFalse: [
			output
				next: pattern position putAll: aCollection startingAt: 1;
				nextPut: item.

			pattern reset.
		].
	].

	^ output contents
------------------------------------

Best regards,
Henrik

--
Sent from: http://forum.world.st/Pharo-Smalltalk-Developers-f1294837.html
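To try the idea without patching PositionableStream, the same loop can be sketched as a standalone block in a Playground. This is only an approximation of the fix above: the block name upToAll is hypothetical, the output is assumed to be a String, and the stream is read purely through #next and #atEnd, so no position arithmetic is involved.

```smalltalk
| upToAll |
"Hypothetical standalone variant of the loop, not the patched method itself."
upToAll := [ :stream :aCollection | | output pattern |
	output := WriteStream on: (String new: 100).
	pattern := aCollection readStream.
	[ pattern atEnd or: [ stream atEnd ] ] whileFalse: [ | item |
		item := stream next.
		(pattern peekFor: item) ifFalse: [
			"Mismatch: emit the partially matched prefix and the current item, then start over."
			output
				next: pattern position putAll: aCollection startingAt: 1;
				nextPut: item.
			pattern reset ] ].
	"Stream ended in the middle of a partial match: that prefix belongs to the result."
	pattern atEnd ifFalse: [
		output next: pattern position putAll: aCollection startingAt: 1 ].
	output contents ].

upToAll value: 'ße' readStream value: 'e'.
```

Because the loop never rewinds the stream after a partial match, a pattern whose prefix repeats can be missed: searching for 'aab' in 'aaab' answers the whole 'aaab' rather than 'a'. That limitation may be worth keeping in mind when comparing this approach with other implementations.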
Thanks.
What I understand is that a positionable read stream does not work with variable-width encodings like UTF-8 because there is no one-to-one mapping between positions and elements.

On Sat, Dec 30, 2017 at 12:54 PM, Bernhard Pieber <[hidden email]> wrote:
> [...]
Hi Henrik,
Thanks for the fix. I just saw it today. In the meantime I have created a pull request with another fix. To be honest, I have just taken the working implementation from Squeak:
https://github.com/pharo-project/pharo/pull/632

Alas, the CI check failed for some reason I don't understand. :-/

Happy New Year!
Bernhard

> On 30.12.2017 at 13:32, Henrik-Nergaard <[hidden email]> wrote:
> [...]
Thanks for the submission, we will check.

On Sun, Dec 31, 2017 at 4:26 PM, Bernhard Pieber <[hidden email]> wrote:
> [...]