The Trunk: Multilingual-topa.205.mcz

commits-2
Tobias Pape uploaded a new version of Multilingual to project The Trunk:
http://source.squeak.org/trunk/Multilingual-topa.205.mcz

==================== Summary ====================

Name: Multilingual-topa.205
Author: topa
Time: 14 April 2015, 11:04:27.175 am
UUID: da7bc0f1-5f76-40d0-9bcb-7cf96596ac92
Ancestors: Multilingual-topa.204, Multilingual-cbc.201

Pick up a fix for MultiByteFileStream>>#nextChunk

=============== Diff against Multilingual-topa.204 ===============

Item was changed:
  ----- Method: MultiByteFileStream>>nextChunk (in category 'fileIn/Out') -----
  nextChunk
  "Answer the contents of the receiver, up to the next terminator character. Doubled terminators indicate an embedded terminator character."
 
+     ^(wantsLineEndConversion and: [ lineEndConvention notNil ])
+         ifTrue: [converter nextChunkLineEndConvertingFromStream: self]
+         ifFalse: [converter nextChunkFromStream: self]!
-     ^converter nextChunkFromStream: self!

Item was added:
+ ----- Method: UTF8TextConverter>>nextChunkLineEndConvertingFromStream: (in category 'fileIn/Out') -----
+ nextChunkLineEndConvertingFromStream: input
+ "Answer the contents of input, up to the next terminator character. Doubled terminators indicate an embedded terminator character."
+ "Obey line end conversion."
+
+     self skipSeparatorsFrom: input.
+     ^self
+         parseLangTagFor: (
+             self class decodeByteString: (
+                 String new: 1000 streamContents: [ :stream |
+                     [
+                         stream nextPutAll: (input upTo: $!!).
+                         input basicNext == $!! ]
+                             whileTrue: [
+                                 stream nextPut: $!! ].
+                     input atEnd ifFalse: [ input skip: -1 ] ]))
+         fromStream: input!
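
For readers who do not have the chunk format in their head, here is a rough sketch of the terminator rule from the method comments: a chunk ends at a lone "!", and "!!" stands for one embedded "!". This is a Python illustration, not Squeak code. The function name next_chunk and its index-based interface are made up for the example, and the separator skipping and language-tag parsing from the diff are left out.

```python
def next_chunk(text, start=0, terminator="!"):
    """Return (chunk, next_index) for the chunk beginning at `start`.

    A lone terminator ends the chunk; a doubled terminator is kept
    as a single embedded terminator character.
    """
    parts = []
    i = start
    while i < len(text):
        ch = text[i]
        if ch != terminator:
            parts.append(ch)
            i += 1
        elif i + 1 < len(text) and text[i + 1] == terminator:
            parts.append(terminator)  # '!!' means a literal '!'
            i += 2
        else:
            i += 1                    # consume the lone terminator
            break
    return "".join(parts), i


if __name__ == "__main__":
    source = "Transcript show: 'hi!!'! next"
    chunk, pos = next_chunk(source)
    print(repr(chunk), repr(source[pos:]))
```

The Smalltalk method in the diff reaches the same result with upTo: plus a one-character look-ahead that is undone with skip: -1.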


Re: The Trunk: Multilingual-topa.205.mcz

Levente Uzonyi-2
This doesn't look right, because it'll decode the input twice. Once in
#upTo:, and once in #decodeByteString:. The latter will fail if
its argument is not a valid UTF-8 string after the first decoding.

Levente
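
A small illustration of the double-decoding concern, written in Python rather than Squeak. It is only a model: decode_twice is a hypothetical helper, and re-encoding as Latin-1 stands in for "the already decoded characters are handed to a UTF-8 decoder again as if they were bytes". How Squeak's converter reacts to malformed input is not stated in this thread.

```python
def decode_twice(raw):
    """Decode UTF-8 bytes, then decode the result a second time."""
    once = raw.decode("utf-8")
    # Assumption for this model: every decoded character is below
    # code point 256, so it can be viewed as a single byte again.
    as_bytes_again = once.encode("latin-1")
    return as_bytes_again.decode("utf-8")


if __name__ == "__main__":
    # Pure ASCII survives, which is why the problem is easy to miss.
    print(decode_twice(b"abc"))

    # U+00E9 decodes fine once; the second pass sees a lone 0xE9 byte.
    try:
        decode_twice("\u00e9".encode("utf-8"))
    except UnicodeDecodeError as err:
        print("second decode failed:", err.reason)

    # Worse: some text decodes twice without error, but to other text.
    print(decode_twice("\u00c3\u00a9".encode("utf-8")) == "\u00e9")
```

The last case shows that a second decoding does not always fail loudly: text that happens to look like valid UTF-8 after the first pass is silently changed.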


Re: The Trunk: Multilingual-topa.205.mcz

Tobias Pape

On 14.04.2015, at 13:17, Levente Uzonyi <[hidden email]> wrote:

> This doesn't look right, because it'll decode the input twice. Once in #upTo:, and once in #decodeByteString:. The latter will fail if
> its argument is not a valid UTF-8 string after the first decoding.
>

Darn.
Do you have an idea?
Probably the magic of upTo must be moved
to decodeByteString for the UTF converter?
Best
        -Tobias
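
One way to read this suggestion, sketched in Python and not taken from any Squeak code: find the chunk boundary on the raw bytes, decode exactly once, and only then convert line ends. Scanning bytes is safe because the byte 0x21 ("!") never occurs inside a multi-byte UTF-8 sequence. The name next_chunk_decoded_once is hypothetical, and line ends are normalised to "\n" purely for illustration, whereas Squeak uses CR internally.

```python
def next_chunk_decoded_once(raw, start=0):
    """Return (text, next_index) for the chunk starting at `start`."""
    bang = 0x21  # ASCII '!'
    out = bytearray()
    i = start
    while i < len(raw):
        b = raw[i]
        if b != bang:
            out.append(b)
            i += 1
        elif i + 1 < len(raw) and raw[i + 1] == bang:
            out.append(bang)  # doubled terminator, keep one
            i += 2
        else:
            i += 1            # lone terminator ends the chunk
            break
    text = bytes(out).decode("utf-8")  # the only decoding step
    # Line end conversion happens on decoded text, after decoding.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    return text, i


if __name__ == "__main__":
    data = "caf\u00e9!!\r\nok! rest".encode("utf-8")
    text, pos = next_chunk_decoded_once(data)
    print(repr(text), data[pos:])
```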




Re: The Trunk: Multilingual-topa.205.mcz

Levente Uzonyi-2
On Tue, 14 Apr 2015, Tobias Pape wrote:

>
> On 14.04.2015, at 13:17, Levente Uzonyi <[hidden email]> wrote:
>
>> This doesn't look right, because it'll decode the input twice. Once in #upTo:, and once in #decodeByteString:. The latter will fail if
>> its argument is not a valid UTF-8 string after the first decoding.
>>
>
> Darn.
> Do you have an idea?
> Probably the magic of upTo must be moved
> to decodeByteString for the UTF converter?

I've uploaded Multilingual-ul.206 to the Inbox with a more general fix.
The asymmetry in line end conversions was a bit surprising to me, but the
tests are green.

Levente


Re: The Trunk: Multilingual-topa.205.mcz

Tobias Pape

On 15.04.2015, at 01:20, Levente Uzonyi <[hidden email]> wrote:

> On Tue, 14 Apr 2015, Tobias Pape wrote:
>
>>
>> On 14.04.2015, at 13:17, Levente Uzonyi <[hidden email]> wrote:
>>
>>> This doesn't look right, because it'll decode the input twice. Once in #upTo:, and once in #decodeByteString:. The latter will fail if
>>> its argument is not a valid UTF-8 string after the first decoding.
>>>
>>
>> Darn.
>> Do you have an idea?
>> Probably the magic of upTo must be moved
>> to decodeByteString for the UTF converter?
>
> I've uploaded Multilingual-ul.206 to the Inbox with a more general fix.
> The asymmetry in line end conversions was a bit surprising to me, but the tests are green.

Thank you, Levente!

Best
        -Tobias
