
MongoTalk issue


EstebanLM

MongoTalk issue

Hi,

OK... this is the problem: I have a customer who likes to add stupid Unicode to his strings (like strange open/close colons, not the regular ones).
And of course, the BSON driver does not handle them well... in fact, it persists them fine, but when the customer tries to read them back, it throws an error (Invalid type).
So... I started to investigate and figured out that the problem is that when the answered string is a WideString, the size read and the size expected can differ (depending on the number of 2+ byte characters in the Unicode string).

After some tries, I came up with this incredibly ugly hack that works:

BSON>>nextSizedString
    | size result |
    size := stream nextUInt32.
    result := stream nextString.
    result isWideString ifTrue: [
        stream skip: (size - (result
            collect: [ :each | each asString asByteArray size ]
            as: OrderedCollection) sum) - 2 ].

    ^ result

LittleEndianStream>>skip: aNumber
    stream skip: aNumber

As you can see... it "calculates" the bytes really consumed (which is strangely less than the declared size) and skips to its real position minus 2.

This works, at least in the examples I have at hand...

Now... my questions:

- I'm sure there has to be a better way to calculate the difference, but I didn't find a good one.
- why that "- 2"?????? WTF... what does that mean????
- can someone confirm that the fix works? (I have "production issues", I need something that works as soon as possible)
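
For reference on the size mismatch: the BSON specification defines a string as an int32 length, then the UTF-8 bytes, then a trailing zero byte, and that length counts the UTF-8 bytes plus the terminator, not characters. The workspace sketch below only illustrates this arithmetic. It is not MongoTalk code, the sample text is made up, and it assumes an image where String>>utf8Encoded is available.

```smalltalk
"Sketch only: compare the character count with the byte count BSON declares.
 Assumes String>>utf8Encoded exists in the image (an assumption, not MongoTalk API)."
| text utf8 declaredSize |

"Hypothetical sample: curly open quote (U+201C), 'hi', curly close quote (U+201D).
 The wide characters make the result a WideString."
text := (String with: (Character value: 16r201C)), 'hi', (String with: (Character value: 16r201D)).

utf8 := text utf8Encoded.        "ByteArray holding the UTF-8 encoding"
declaredSize := utf8 size + 1.   "BSON length prefix = UTF-8 bytes + trailing zero byte"

Transcript
    show: 'characters: ', text size printString; cr;
    show: 'UTF-8 bytes: ', utf8 size printString; cr;
    show: 'BSON declared size: ', declaredSize printString; cr.
```

Each curly quote takes 3 bytes in UTF-8, so the 4 characters become 8 bytes and the declared size is 9. If a reader consumes size - 1 bytes, decodes them as UTF-8, and then skips the terminator, no per-character correction should be needed. Whether stream nextString in MongoTalk behaves that way is not confirmed here.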

Thanks,
Esteban

Igor Stasenko

Re: MongoTalk issue

On 1 July 2012 15:10, Esteban Lorenzano <[hidden email]> wrote:

> - why that "- 2"?????? WTF... what does that mean????

Maybe it's the BOM (byte order mark) character?
Which Unicode encoding is used on the server? UTF-8, I guess?
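
One way to test the BOM guess, assuming the raw bytes of the stored string are at hand as a ByteArray (getting them out of the driver is not shown, and the sample bytes are made up): a UTF-8 BOM is the three bytes EF BB BF, i.e. 239 187 191.

```smalltalk
"Sketch only: check whether a ByteArray starts with a UTF-8 byte order mark.
 rawBytes is a hypothetical sample: BOM followed by the ASCII bytes of 'hi'."
| rawBytes hasBom |
rawBytes := #[239 187 191 104 105].
hasBom := rawBytes size >= 3
    and: [ (rawBytes first: 3) = #[239 187 191] ].
Transcript show: 'starts with UTF-8 BOM: ', hasBom printString; cr.
```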


--
Best regards,
Igor Stasenko.

Nicolas Cellier

Re: MongoTalk issue


http://ss3.gemstone.com/ss/MongoSt.html suggests that some WideString
problems were solved in this fork...

Nicolas
