#asOctetString Bug?

patrick.rein
Hi everyone,

while working on parsing network data, I tried to use asOctetString and noticed that it does not yield a ByteString when sent to a WideString. Is this the intended behavior? I would have expected behavior similar to what I documented in the test case below. If it is not intended, I think the problem simply originates from the use of #at:, which is overridden by WideString.

Bests,
Patrick

testAsOctetStringFromWideString

        | rawStringOctet wideStringAsOctet wideString |
        rawStringOctet := #[103 114 252 223 101 "The character 16r1FA02 starts here" 0 1 250 2].
        wideString := 'grüße' , (String value: 16r1FA02).
        wideStringAsOctet := wideString asOctetString asByteArray.
        self assert: rawStringOctet equals: wideStringAsOctet.

Re: #asOctetString Bug?

Levente Uzonyi
Hi Patrick,

The name of the method is misleading. The intention was neither to change the
encoding of the receiver nor to filter out out-of-range bytes, but to
create a ByteString from a WideString when it contains only byte
characters. So, the method will return a string equal to the receiver. The
returned string will be a ByteString if and only if #isOctetString returns
true.
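
The behavior described above can be sketched in a workspace as follows. This is a minimal sketch, assuming a Squeak image in which WideString class>>#from: is available; the variable names are made up for illustration, and the results noted in the comments are what the description predicts, not output captured from a particular image.

```smalltalk
| narrowable tooWide |
"A WideString whose characters all fit into one byte (values 0 to 255)."
narrowable := WideString from: 'grüße'.
narrowable isOctetString.                "predicted: true"
narrowable asOctetString class.          "predicted: ByteString"
narrowable asOctetString = narrowable.   "predicted: true, the contents are equal"

"Appending a character above 255 makes narrowing impossible."
tooWide := narrowable , (String value: 16r1FA02).
tooWide isOctetString.                   "predicted: false"
tooWide asOctetString class.             "predicted: WideString"
tooWide asOctetString = tooWide.         "predicted: true, still an equal string"
```
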
I don't think the conversion in your example would make much sense,
because it's not reversible: nothing in the bytes marks where a multi-byte
character starts, so there's no way to recreate the string from the
ByteArray.
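
If the goal is a byte representation for network data, one reversible option is an explicit text encoding such as UTF-8. The following is a minimal sketch, assuming the image provides String>>#squeakToUtf8 and String>>#utf8ToSqueak; check your image before relying on them. The variable names are hypothetical.

```smalltalk
| wideString bytes roundTrip |
wideString := 'grüße' , (String value: 16r1FA02).

"Encode to UTF-8; every element of the result fits into one byte."
bytes := wideString squeakToUtf8 asByteArray.

"Decode the bytes again to recover the original string."
roundTrip := bytes asString utf8ToSqueak.
roundTrip = wideString.   "should answer true, because UTF-8 decoding inverts the encoding"
```

Unlike the mixed one-byte and four-byte layout in the test case, UTF-8 marks the length of each character in its leading byte, which is what makes the round trip possible.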

Levente

In reply to Patrick's message of Fri, 14 Jun 2019.

Re: #asOctetString Bug?

patrick.rein
Hi Levente,

thanks for clarifying this :) I will add a comment to the method to document the intent and a corresponding test case.

Bests
Patrick

In reply to Levente Uzonyi's message of 17 June 2019, 17:35.