Changed semantics for mongo ObjectID

NorbertHartl
I needed to change the way ObjectIds are read from and written to BSON. BSON generally encodes its contents in little-endian byte order, but ObjectIds are the exception: in the mongo database an ObjectId is encoded big endian. You can see the difference by comparing the id of the same object as printed by the mongo shell and by Pharo. The ids won’t match.

If you only do “normal” things with mongo, you are unlikely to have noticed the effect. An OID is read the wrong way but also written the wrong way, which makes it right again from the database’s perspective. It stops working in a mixed query setting, meaning you read an OID and then query the database with it through a JavaScript expression. That will fail.
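To make the effect concrete, here is a small sketch in Python rather than Pharo. It assumes the old code treated the 12 ObjectId bytes as one little-endian integer. The helper names and the sample id are made up for illustration and are not the driver's actual code.

```python
# Hypothetical illustration, not the Pharo driver's code.
# RAW stands for the 12 ObjectId bytes as they sit in the BSON document.
RAW = bytes.fromhex("536bf0a1e4b0c3d2f1a2b3c4")  # made-up sample id


def read_little_endian(raw: bytes) -> int:
    """Old behaviour (assumed): interpret the bytes as a little-endian integer."""
    return int.from_bytes(raw, "little")


def write_little_endian(value: int) -> bytes:
    """Old behaviour (assumed): write the integer back in little-endian order."""
    return value.to_bytes(12, "little")


def read_big_endian(raw: bytes) -> int:
    """New behaviour: interpret the bytes as a big-endian integer."""
    return int.from_bytes(raw, "big")


def write_big_endian(value: int) -> bytes:
    """New behaviour: write the integer back in big-endian order."""
    return value.to_bytes(12, "big")


def as_hex(value: int) -> str:
    """Render the in-image value as a 24-digit hex string."""
    return format(value, "024x")


old_value = read_little_endian(RAW)
new_value = read_big_endian(RAW)

# Either scheme round-trips, so the bytes stored in the database never change.
assert write_little_endian(old_value) == RAW
assert write_big_endian(new_value) == RAW

# The hex string a shell would show is the stored bytes in order.
shell_hex = RAW.hex()
print(as_hex(new_value) == shell_hex)  # True: matches the stored bytes
print(as_hex(old_value) == shell_hex)  # False: the byte order is reversed
```

The round trip is why plain reads and writes kept working. Only the value seen in the image differs, and that is what breaks once the value is pasted as text into a JavaScript expression.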

I’m writing this because I’m not sure whether the change can break existing software. The change is not included in the stable version yet. If you want to test it, load #bleedingEdge and report any problems that occur. I will use the new behaviour myself and will take the liberty of making it stable once it has worked for some time.

FYI,

Norbert


Re: [Pharo-dev] Changed semantics for mongo ObjectID

Sabine Manaa
Hi Norbert,

what about objects that were written to mongo with the old version and are read with the new version? Can they still be read/found with the new version, or is a migration needed?

Regards
Sabine



On Thu, May 8, 2014 at 11:30 PM, Norbert Hartl <[hidden email]> wrote:
I needed to change the way ObjectIds are read from and written to BSON. [...]



Re: [Pharo-dev] Changed semantics for mongo ObjectID

NorbertHartl

On 12.05.2014 at 10:46, Sabine Knöfel <[hidden email]> wrote:

Hi Norbert,

what about objects that were written to mongo with the old version and are read with the new version? Can they still be read/found with the new version, or is a migration needed?

I’d say it should be safe just to upgrade the code. The existing ids won’t change, because encoding and decoding are always fully reversible: regardless of the encoding scheme, the ids in the database are the same after being read and written back. The only thing that changes is the value held in the OID object in your image. There are two ways this can fail: either you manipulate id values inside your code, or you use manual references to objects where the referenced id is not written as an OID. Everything else should behave the same with the old and the new code.
In the meantime I also changed the generation of the id value to follow the mongo spec. The values should now be comparable to what other drivers (including the mongo shell) generate. This also means you can now ask an ObjectId for its timestamp, which of course gives you nonsense for ids generated prior to the code update.
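As a rough illustration of the timestamp point (again in Python, not the Pharo API): an ObjectId that follows the mongo spec starts with 4 bytes holding big-endian seconds since the Unix epoch. The function name and both sample ids below are made up, and the second id only stands in for an id whose leading bytes were never a timestamp.

```python
from datetime import datetime, timezone


def objectid_timestamp(raw: bytes) -> datetime:
    """Read the leading 4 bytes as big-endian seconds since the Unix epoch."""
    if len(raw) != 12:
        raise ValueError("an ObjectId is exactly 12 bytes")
    seconds = int.from_bytes(raw[:4], "big")
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


# Made-up id whose first 4 bytes really are a creation time.
spec_id = bytes.fromhex("536bf0a1e4b0c3d2f1a2b3c4")
print(objectid_timestamp(spec_id).isoformat())

# Made-up id standing in for one generated before the update (assumption:
# its leading bytes are arbitrary), so the result says nothing about creation.
legacy_id = bytes.fromhex("c4b3a2f1d2c3b0e4a1f06b53")
print(objectid_timestamp(legacy_id).isoformat())
```

The second call still returns a date, just not a meaningful one, so code that relies on the timestamp should only trust ids created after the update.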

Norbert



Re: [Pharo-dev] Changed semantics for mongo ObjectID

Sabine Manaa
Thanks, Norbert!


On Mon, May 12, 2014 at 11:38 AM, Norbert Hartl <[hidden email]> wrote:

I’d say it should be safe just to upgrade the code. [...]