Bug: Metacello caches some version files with 4 bytes per character

Bug: Metacello caches some version files with 4 bytes per character

Pieter Nagel-3
We are unable to load Magritte3 over Seaside3.1 when Metacello is
simultaneously writing to a cacheRepository.

The problem seems to be that when Metacello caches, it also loads from the
mcz it has just written. In the case of JQuery-Core-JohanBrichau.132.mcz,
which it fetched from github://GsDevKit/Seaside31:v3.1.3.1-gs/repository,
it writes a version file into the mcz with 4 bytes per character, which
it then, of course, cannot parse.

In order to reproduce the bug, log in to an extent0.seaside.dbf under
GemStone64 3.1.05, create a directory /tmp/somewhere, and execute the
following:

        Gofer new
                package: 'GsUpgrader-Core';
                url: 'http://ss3.gemtalksystems.com/ss/gsUpgrader';
                load.
        (Smalltalk at: #GsUpgrader)
                upgradeGrease;
                upgradeGLASS1.
        (Smalltalk at: #GsDeployer)
                deploy: [
                        Metacello new
                                baseline: 'Seaside3';
                                repository: 'github://GsDevKit/Seaside31:v3.1.3.1-gs/repository';
                                onLock: [ :ex | ex honor ];
                                load: #('Core' 'Seaside-Development' 'Seaside-Adaptors-FastCGI').
                        Metacello new
                                configuration: 'Magritte3';
                                version: #'release3.1';
                                repository: 'http://www.smalltalkhub.com/mc/Magritte/Magritte3/main';
                                onConflict: [ :ex :existingRegistration :newRegistration |
                                        Transcript
                                                show:
                                                        'Conflict between existing: ' , existingRegistration className , ' ' , existingRegistration printString
                                                                , Character cr asString , '  and new: ' , newRegistration className , ' ' , newRegistration printString
                                                                , Character cr asString , '  resolved in favor of existing'.
                                        ex disallow ];
                                cacheRepository: '/tmp/somewhere';
                                load: 'Magritte-Seaside' ]







--
You received this message because you are subscribed to the Google Groups "Metacello" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.

Re: Bug: Metacello caches some version files with 4 bytes per character

Dale Henrichs-3
Pieter,

Could you share the details of the bug (error and stack)? I have hit an
error while running the code below, but I want to make sure that I am
looking at the same problem that you are looking at.

This is the error that I'm getting:

a LookupError occurred (error 2021), reason:rtErrKeyNotFound, A
reference using the non-existent key #'id' was made into the dictionary
aDictionary( #'e'->#'s', #'m'->#'e', #'l'->#'d', '....

Dale


On 01/28/2015 04:02 AM, Pieter Nagel wrote:

> [reproduction script snipped]


Re: Bug: Metacello caches some version files with 4 bytes per character

Pieter Nagel-3
> a LookupError occurred (error 2021), reason:rtErrKeyNotFound, A
> reference using the non-existent key #'id' was made into the dictionary
> aDictionary( #'e'->#'s', #'m'->#'e', #'l'->#'d', '....

Yup, that's the same error I got. I can drop in at the client tomorrow and
mail you the stack trace from there, but it seems you are looking at the
same one I did.

It seems that whatever code parses the version files gets confused by the
extra zero bytes, and so a dictionary that should have contained mappings
from #'version' to version and whatnot now just contains mappings between
characters that were adjacent in that file, or something like that.
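
To illustrate that guess, here is a minimal Python sketch (not Metacello code). It assumes the version text was stored as 4 big-endian bytes per character and then read back one byte per character by a toy key/value reader. The names `misread_wide_text` and `parse_pairs` are hypothetical, invented for this illustration; the real Monticello parser is more involved.

```python
import re
from typing import Dict, List


def misread_wide_text(text: str) -> str:
    """Store text as 4 bytes per character, then read it back 1 byte per character."""
    return text.encode("utf-32-be").decode("latin-1")


def parse_pairs(raw: str) -> Dict[str, str]:
    """Toy reader: split on whitespace and NUL bytes, then pair adjacent tokens."""
    tokens: List[str] = [t for t in re.split(r"[\x00\s]+", raw) if t]
    return dict(zip(tokens[0::2], tokens[1::2]))


sample = "name 'lr' id '220d'"
good = parse_pairs(sample)                     # keys are whole words: 'name', 'id'
bad = parse_pairs(misread_wide_text(sample))   # keys are single characters

print(good)
print("id" in bad)  # False: the 'id' key no longer exists
```

With the widened input every token is a single character, so a lookup of `id` fails in much the same way as the LookupError quoted above.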

If you go and unzip /tmp/somewhere/JQuery-Core-JohanBrichau.132.mcz as it
was cached and look at the version file with a hex editor (or something
that can show zero bytes), you'll see the problem clearly.
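
If no hex editor is at hand, the same check can be sketched in Python. This assumes the .mcz is an ordinary zip archive whose metadata lives in a member named `version`; `nul_fraction` and `make_sample_mcz` are hypothetical helpers, and the sample archives are built in memory rather than taken from the real cache.

```python
import io
import zipfile


def nul_fraction(mcz_bytes: bytes, member: str = "version") -> float:
    """Return the fraction of NUL (zero) bytes in one member of an .mcz archive."""
    with zipfile.ZipFile(io.BytesIO(mcz_bytes)) as archive:
        data = archive.read(member)
    return data.count(0) / len(data) if data else 0.0


def make_sample_mcz(version_bytes: bytes) -> bytes:
    """Build an in-memory archive holding only a 'version' member (test data)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("version", version_bytes)
    return buffer.getvalue()


text = "(name 'JQuery-Core-lr.112')"
healthy = make_sample_mcz(text.encode("latin-1"))    # 1 byte per character
widened = make_sample_mcz(text.encode("utf-32-be"))  # 4 bytes per character

print(nul_fraction(healthy))  # 0.0
print(nul_fraction(widened))  # 0.75
```

For a cached file, one would pass the file's bytes instead, e.g. the contents of /tmp/somewhere/JQuery-Core-JohanBrichau.132.mcz read in binary mode. A ratio near 0.75 would match the 4-bytes-per-character symptom.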

I see at
https://github.com/GsDevKit/Seaside31/blob/gs_master/repository/JQuery-Core.package/monticello.meta/version
that the version file on GitHub has at least one non-ASCII character
encoded as two bytes in the phrase "from Smalltalks 2010 in
Concepci󮠤el Uruguay", but it doesn't seem to be UTF-8 and I'm not
sure what encoding version files should nominally be in.

My hunch is that something about that character caused a QuadByteString or
the like to be used for the text of the version file, hence the 4 bytes
per character.
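
One way such a hunch could play out, sketched in Python under loudly stated assumptions: that the file holds the accented letter as the single Latin-1 byte 0xF3, and that some decoder applied UTF-8's four-byte bit layout without validating the continuation bytes. Neither assumption is confirmed anywhere in this thread, and `bytes_per_char` and `lenient_utf8_4byte` are hypothetical names.

```python
def bytes_per_char(text: str) -> int:
    """Width a fixed-width string class would need for `text`: 1, 2 or 4 bytes."""
    widest = max(map(ord, text), default=0)
    if widest <= 0xFF:
        return 1
    if widest <= 0xFFFF:
        return 2
    return 4


def lenient_utf8_4byte(raw: bytes) -> int:
    """Apply UTF-8's four-byte bit layout WITHOUT checking continuation bytes.

    Hypothetical decoder behaviour, used only to illustrate the hunch.
    """
    b0, b1, b2, b3 = raw
    return ((b0 & 0x07) << 18) | ((b1 & 0x3F) << 12) | ((b2 & 0x3F) << 6) | (b3 & 0x3F)


phrase = "Concepci\u00f3n del Uruguay"
latin1_bytes = phrase.encode("latin-1")
start = latin1_bytes.index(0xF3)  # position of the accented letter
code_point = lenient_utf8_4byte(latin1_bytes[start:start + 4])

print(bytes_per_char(phrase))            # 1: the phrase itself fits in single bytes
print(hex(code_point))                   # 0xee824: far above U+FFFF
print(bytes_per_char(chr(code_point)))   # 4
```

Under those assumptions, one stray byte swallows the three characters after it and yields a code point that only a four-byte-per-character string can hold, which would force the whole text into that representation.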




Re: Bug: Metacello caches some version files with 4 bytes per character

Dale Henrichs-3
Now that I've dug into this a bit more, it looks like the
monticello.meta data for JQuery-Core-JohanBrichau.132 has some
multi-byte characters embedded in it ... here's the info entry:

name 'JQuery-Core-lr.112' message '- Updated to jQuery 1.4.4
(http://blog.jquery.com/2010/11/11/jquery-1-4-4-release-notes/) from
Smalltalks 2010 in Concepcióź €el Uruguay, Argentina' id
'220d51e5-e407-48df-9e53-d8416a1dd7d3' date '13 November 2010' time
'10:29:11 am' author 'lr'

The filetree implementation is able to handle this, because filetree
stores all files on disk in utf8.

.mcz files don't really work when multi-byte characters are used ...
this sorta works for Pharo to Pharo or Squeak to Squeak, but the files
are written out as WideStrings and GemStone doesn't have the class
WideString and sooner or later things break down ...

In this particular case, the cached package is written by GemStone and
it looks like the instance of MultiByteString was written as bytes
(QuadByteString?), so in this case GemStone did not choke on a
WideString ... it just misinterpreted the version info.

As a very short term patch, one could edit the
Seaside31/repository/JQuery-Core.package/monticello.meta/version file on
GitHub and replace the bad character with an 8-bit ASCII character. That
is the only one I saw, but there could be others ...
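
To look for other offenders before patching, a small scan along these lines might help. It is only a sketch: `non_ascii_bytes` is a hypothetical helper, the threshold of 0x7F (plain 7-bit ASCII) is an assumption, and the sample bytes are constructed here rather than read from the repository.

```python
from typing import List, Tuple


def non_ascii_bytes(data: bytes) -> List[Tuple[int, int]]:
    """Return (offset, byte value) for every byte outside the 7-bit ASCII range."""
    return [(offset, value) for offset, value in enumerate(data) if value > 0x7F]


# Sample line with the accented letter stored as one Latin-1 byte (assumed encoding).
sample = "from Smalltalks 2010 in Concepci\u00f3n del Uruguay".encode("latin-1")
hits = non_ascii_bytes(sample)

for offset, value in hits:
    print(f"offset {offset}: byte 0x{value:02X}")
```

Run against the bytes of a real monticello.meta/version file, an empty result would suggest nothing else needs replacing.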

To be honest, the only thing that I can safely do in this case, is throw
an error that there are multi-byte characters in the source package and
refuse to write out the .mcz file. I might be able to treat the
monticello meta data a bit differently and replace multibyte characters
with a safe junk character, but I think it's better to let a human
decide what needs to be done ...
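
The two options above (refuse to write, or substitute a junk character) can be sketched as follows. This is Python for illustration only, not the Metacello implementation; `encode_metadata` and `MultiByteCharacterError` are hypothetical names, and "multi-byte" is taken to mean any code point above 0xFF.

```python
from typing import Optional


class MultiByteCharacterError(ValueError):
    """Raised when metadata contains characters that do not fit in one byte."""


def encode_metadata(text: str, replacement: Optional[str] = None) -> bytes:
    """Encode metadata as one byte per character.

    With no replacement given, refuse text that contains wide characters.
    Otherwise substitute each wide character with `replacement`, which is
    expected to be a single-byte character such as '?'.
    """
    offending = [(i, ch) for i, ch in enumerate(text) if ord(ch) > 0xFF]
    if offending and replacement is None:
        index, char = offending[0]
        raise MultiByteCharacterError(
            f"character U+{ord(char):04X} at index {index} needs more than one byte"
        )
    cleaned = "".join(replacement if ord(ch) > 0xFF else ch for ch in text)
    return cleaned.encode("latin-1")


print(encode_metadata("Concepci\u00f3n"))                   # single-byte text passes through
print(encode_metadata("a\U000EE824b", replacement="?"))     # b'a?b'
```

Raising by default keeps the decision with a human, as suggested above; the replacement path is opt-in.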

If you have alternate solutions, I'd love to hear them.

Dale

[1] https://github.com/dalehenrich/metacello-work/issues/325

Re: Bug: Metacello caches some version files with 4 bytes per character

Dale Henrichs-3
Our replies are criss-crossed:)

We both have the same interpretation.

I'm amazed that the version history with the (presumably) mangled
characters survived this long (the comment is from 2010)... once it got
copied to fileTree, the characters were converted to utf8 and it is the
fact that GemStone is writing out an mcz file that contains multi-byte
characters that is the source of the problem ...

Dale
On 01/29/2015 11:40 AM, Pieter Nagel wrote:

> [quoted message snipped]