Re: On UUID's and MC file names


Bert Freudenberg
Jerome,

we should take this discussion to squeak-dev. Reply-To set.

On Jun 22, 2007, at 1:06 , Jerome Peace wrote:

> Hi Bert,
>
> Thanks for the interesting response.
>
> ***
>> [V3dot10] Re: RV: Do in a workspace and say if could build
>>
>>
>> Bert Freudenberg bert at freudenbergs.de
>> Thu Jun 21 00:11:25 UTC 2007
>>
>> On Jun 21, 2007, at 1:51 , Jerome Peace wrote:
>>
>>> First a better way to print out a uuid. Since it's
>>> based on time I should be able to take an encoded UUID
>>> and print it out asHumanIntelligableText.
>>
>> http://en.wikipedia.org/wiki/UUID
>>
>>> Secondly it would seem that a time based version
>>> number would be a little less dangerous than a
>>> sequential version. So a package would be named
>>> something like:
>>> PackageName-subPackage-initials.yymmddnn.mcz
>>> where yymmddnn is a number based on time with a
>>> sufficient resolution to solve most problems.
>>> The details may be modified to meet other design
>>> criteria (e.g. spaceCompression).
>>>
>>> The first should be easy to do.
>>
>> Reversing a cryptographic hash function? Have fun.
>
> Hmm. I don't have to reverse a hash function, I just
> have to "know what it means".
> That can be done by extra info saved with the hash as
> part of the name.
> Enough info to provide a human-intelligible clue.
> UUID hashes mean that on such and such a day at such
> and such a time from such and such a place a something
> was saved and given a uuid in such and such a format.
>
> If the purpose of the saving is not to keep secret
> what was saved, you can place both the open text and
> the hash together and if need be keep a dictionary to
> reverse the cryptographic hash.
>
> Partial progress counts. I just want to look at
> something that doesn't mystify me.
> Remember the context is to make something a beginner
> and an amateur can learn.

That information is stored in the VersionInfo entry next to the UUID.  
It's easily accessible. Whereas the UUID might be generated using the  
UUIDPlugin and you have no idea how to reverse that. It's not  
sensible to even attempt that.

>>> I wonder what it would take to train MC to work
>>> with the second.
>>
>> That's trivial. Since MC does not place meaning on the version name
>> you can just pre-populate the version name input field of the version
>> save dialog with whatever suits you.
>
> Huh? Wow.
>
> Does this mean I could rename the file and MC would
> still recognize it for what it is?
> Oh, you said version name. So you mean that the
> packagename portion is still significant but I can
> play around with the version names and MC will pay no
> attention.

No. The package name is stored *inside* the MCZ.

> So a mischief maker could rename things so that
> Package-puck.30.mcz  was the ancestor of
> Package-puck.29.mcz instead of the expected other way
> around?
>
> On the other hand Package-puck.3.mcz duplicated and
> renamed to egakcaP-puck.3.mcz would not be recognized
> by MC as the same?
>
>
>>
>> Actually, maybe having readable version file names is a problem in
>> itself. It gives the illusion that these have any meaning to MC.
>> Other systems like git avoid the problem by just using UUIDs as
>> filenames.
>
> And how would you know when mischief had happened
> then?

MC is not designed to prevent mischief, though the UUIDs prevent  
accidental mistakes. For actual security, one could for example use  
the hash of the entire package contents as identifier, making it  
unforgeable.

- Bert -
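
The suggestion above, using a hash of the entire package contents as the identifier, can be sketched in a few lines. This is Python rather than Squeak and purely illustrative: Monticello does not work this way, and the name `content_id` and the choice of SHA-256 are assumptions of the sketch.

```python
import hashlib


def content_id(package_bytes: bytes) -> str:
    """Return a hex identifier derived only from the package contents."""
    return hashlib.sha256(package_bytes).hexdigest()


original = b"Object subclass: #Foo"
tampered = b"Object subclass: #Bar"

# Identical bytes always yield the same identifier, whatever the file is called.
print(content_id(original) == content_id(bytes(original)))  # True
# Any change to the contents yields a different identifier.
print(content_id(original) == content_id(tampered))         # False
```

Because the identifier is computed from the bytes, renaming a file cannot change it, and changing the contents cannot keep it.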


Re: On UUID's and MC file names

Jerome Peace

On UUID's and MC file names

Hi Bert,

This is getting interesting.

>No. The package name is stored *inside* the MCZ.

>> So a mischief maker could rename things so that
>> Package-puck.30.mcz  was the ancestor of
>> Package-puck.29.mcz instead of the expected other way
>> around?
>>
>> On the other hand Package-puck.3.mcz duplicated and
>> renamed to egakcaP-puck.3.mcz would not be recognized
>> by MC as the same?

So you're saying that the mischievous puck** could
actually have egakcaP-puck.3.mcz
in his repository and it would be understood by MC to
be Package-puck.3.mcz.

So does that mean that when MC opens a repository it
reads ALL the files to see what's there?
What does it use the file names for, if anything?
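
To make the "name stored inside the MCZ" point concrete, here is a rough Python sketch of reading a name out of an archive regardless of what the file is called. It assumes an .mcz is a zip archive with a member named `package` holding text like `(name 'Package')`; that layout, and the helper names `package_name_in` and `make_fake_mcz`, are assumptions of this sketch. It says nothing about how MC actually scans a repository.

```python
import io
import re
import zipfile


def package_name_in(mcz_bytes: bytes) -> str:
    """Read the package name recorded inside an .mcz (zip) archive.

    Assumes a member called 'package' holding text like (name 'Package').
    """
    with zipfile.ZipFile(io.BytesIO(mcz_bytes)) as archive:
        text = archive.read("package").decode("utf-8")
    match = re.search(r"\(name '([^']*)'\)", text)
    if match is None:
        raise ValueError("no package name found in archive")
    return match.group(1)


def make_fake_mcz(name: str) -> bytes:
    """Build a stand-in archive for the demo; not a complete .mcz file."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("package", f"(name '{name}')")
    return buffer.getvalue()


# These bytes could be saved as egakcaP-puck.3.mcz; the stored name is unaffected.
renamed_file_contents = make_fake_mcz("Package")
print(package_name_in(renamed_file_contents))  # Package
```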

Yours in curiosity and service, --Jerome Peace

**Mischief-makers are often superprogrammers in
learning mode. See the apple folklore:
http://www.folklore.org/StoryView.py?project=Macintosh&story=Make_a_Mess,_Clean_it_Up!.txt

Or sometimes they are release team members with the
best of intentions just running into Murphy's Law.




>
>Bert Freudenberg bert at freudenbergs.de
>Fri Jun 22 00:34:10 UTC 2007
>

>MC is not designed to prevent mischief, though the UUIDs prevent
>accidental mistakes. For actual security, one could for example use
>the hash of the entire package contents as identifier, making it
>unforgeable.
>

Ok. That would secure things.

What I was trying to say was:

How would a human, while looking at a file list, get a
hint that things had gone wrong, as with the current
problem? I was more interested in an early and
observable warning than a machine-testable security
assurance.

The context (which I didn't make clear) was the
well-intentioned but off-track repository maintainer and
the need for the repository to be ok-to-use (vs
not-ok-to-use). To see that early is worth a day's,
maybe a week's work.


Cheers, -Jer



 

Re: On UUID's and MC file names

johnmci
In reply to this post by Bert Freudenberg
OK, the UUID logic (for which all the source code exists, I might add)
clearly shows what happens.

It generates a version 4 (random) UUID IF the UUID primitive is not
implemented on the host platform. That uses
Squeak's Random number generator, with hopefully a different
start value each time it's used (cross fingers).
I recall some early MC users on unix were burned when the original
logic restarted with the same start seed after an image crash.
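
The seed problem is easy to reproduce outside Squeak. The following Python sketch (not the Squeak code; `random_uuid` is a name made up here) builds version 4 UUIDs from a seeded generator and shows that two runs starting from the same seed hand out the same "unique" identifier.

```python
import random
import uuid


def random_uuid(rng: random.Random) -> uuid.UUID:
    """Build a version 4 UUID from 16 bytes drawn from the given generator."""
    raw = bytes(rng.getrandbits(8) for _ in range(16))
    # Passing version=4 makes uuid.UUID set the version and variant bits.
    return uuid.UUID(bytes=raw, version=4)


# Two "image starts" that reuse the same seed produce the same UUID.
first_start = random_uuid(random.Random(42))
second_start = random_uuid(random.Random(42))
print(first_start == second_start)  # True: a collision
print(first_start.version)          # 4

# A different seed gives a different UUID.
print(random_uuid(random.Random(43)) == first_start)  # False
```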

Typically the host operating system supplied a version 1 UUID.

I recall what happened was someone decoded the UUIDs found in MS Word
documents which were claimed to be from an anonymous party, but
turned out to be from a political party in Washington DC. That was met
with outrage that you could easily trace back to the MAC address of
the computers involved in the creation of any Microsoft document, so
Microsoft moved to a hashed value, and others like OS-X followed. I
cannot speak for Unix. {well actually it invokes MakeUUID(location);
but I have no idea what that does}

The intent of the MD5 or SHA-1 hash is to hide the original value. I
believe they are one-way, so the result might as well be random bits as
far as finding a better way to print the UUID goes.

However, if you are lucky, some older operating system might still be
using version 1 UUIDs.
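
For the lucky case, a version 1 UUID really can be printed as something readable. Here is a small Python sketch using the standard `uuid` module (`describe` is a name invented for this example, and the sample string is just a sample version 1 UUID). Note that the node field is not guaranteed to be a real MAC address.

```python
import uuid
from datetime import datetime, timedelta

# Version 1 timestamps count 100-nanosecond ticks from this date.
UUID_EPOCH = datetime(1582, 10, 15)


def describe(text: str) -> str:
    """Return a human-readable summary of a UUID string."""
    u = uuid.UUID(text)
    if u.version != 1:
        return f"version {u.version}: no timestamp or node to recover"
    created = UUID_EPOCH + timedelta(microseconds=u.time // 10)
    node = ":".join(f"{(u.node >> shift) & 0xFF:02x}" for shift in range(40, -1, -8))
    return f"version 1: created {created.isoformat()} UTC on node {node}"


print(describe("c232ab00-9414-11ec-b3c8-9f6bdeced846"))
print(describe(str(uuid.uuid4())))
```

For any other version the function gives up, which matches the point above: hashed or random UUIDs carry nothing worth printing.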


Oh lastly I'll note the decoder for the UUID>>asUUID:  SUCKS (well I  
wrote it).

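asUUID: aString
        "Fill the receiver from a hex string, with or without dashes."
        | stream token byte |
        "Drop the dashes and uppercase the hex digits before parsing."
        stream _ ReadStream on: (aString copyReplaceAll: '-' with: '') asUppercase.
        "Each pair of hex digits becomes one byte of the UUID."
        1 to: stream size/2 do: [:i |
                token _ stream next: 2.
                byte _ Integer readFrom: (ReadStream on: token) base: 16.
                self at: i put: byte].
        ^self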


When profiling Sophie document reading, this nasty one leapt to the foreground.

However, I wrote a new one for Sophie, which made the few percent
dedicated to converting a string UUID to a UUID object go away.
Lot less readable tho,

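SophieID>>asUUID: aString
        "Same job as UUID>>asUUID:, but decodes the characters directly.
        Assumes lowercase hex digits; $- (45) is skipped."
        | n i poke l total r |

        n := aString size.
        i := 1.
        poke := 1.
        [ i < n] whileTrue:
                [l := (aString at: i) asInteger.
                l = 45 ifFalse:
                        ["High nibble: $a..$f is 97..102, $0..$9 is 48..57."
                        total := l > 96
                                        ifTrue: [10 + (l-97)]
                                        ifFalse: [l-48].
                        i := i + 1.
                        "Low nibble comes from the next character."
                        r := (aString at: i) asInteger.
                        total := r > 96
                                ifTrue: [total * 16 +  (10 + (r-97))]
                                ifFalse: [total * 16 +  (r-48)].
                        self at: poke put: total.
                        poke := poke + 1.
                        i := i + 1]
                        ifTrue: [i := i + 1]].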
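
For anyone who does not read Smalltalk, the idea behind the faster decoder (walk the string once, skip dashes, combine two hex digits into each byte, no intermediate copies or streams) looks roughly like this in Python. It is only an illustration of the approach, not a port, and `uuid_bytes` is a made-up name.

```python
def uuid_bytes(text: str) -> bytes:
    """Decode a UUID string to bytes in one pass, skipping dashes."""
    out = bytearray()
    high = None  # first hex digit of the pair being assembled
    for ch in text:
        if ch == "-":
            continue
        digit = int(ch, 16)  # accepts 0-9, a-f and A-F
        if high is None:
            high = digit
        else:
            out.append(high * 16 + digit)
            high = None
    if high is not None:
        raise ValueError("odd number of hex digits")
    return bytes(out)


sample = "c232ab00-9414-11ec-b3c8-9f6bdeced846"
# Cross-check against the standard library's own hex decoder.
print(uuid_bytes(sample) == bytes.fromhex(sample.replace("-", "")))  # True
```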



--
========================================================================
===
John M. McIntosh <[hidden email]>
Corporate Smalltalk Consulting Ltd.  http://www.smalltalkconsulting.com
========================================================================
===




Re: On UUID's and MC file names

Jerome Peace
Hi John,

Thanks for your reply.

Sort of more info than I wanted to know.
I just want to look at something and not be mystified
by it. So it seems if I have a sophie id string I
could turn it into a uuid. If I can do that then I could
create the association uuid->string, put it in a
dictionary and then on looking at the same uuid look it
up.

So if I kept a list of all strings I might want to
look up that would work. Sometimes sort of.
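
The uuid->string dictionary idea, sketched in Python for illustration (the `notes` registry and the `remember` and `explain` helpers are hypothetical names, and the note text is made up). It only works for UUIDs that were recorded when they were created, which is the "sometimes, sort of" part.

```python
import uuid

# Hypothetical registry: UUID -> human-readable note recorded at save time.
notes = {}


def remember(identifier: uuid.UUID, note: str) -> None:
    """Record what a UUID stands for."""
    notes[identifier] = note


def explain(text: str) -> str:
    """Look up a UUID string; unrecorded ones stay mysterious."""
    return notes.get(uuid.UUID(text), "unknown UUID")


saved = uuid.uuid4()
remember(saved, "Package-puck.3, saved by puck")
print(explain(str(saved)))                              # Package-puck.3, saved by puck
print(explain("00000000-0000-0000-0000-000000000000"))  # unknown UUID
```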

Or I could just get used to being mystified (Nah.
Where's the fun in that?)

Yours in curiosity and service, --Jerome Peace







       

Re: On UUID's and MC file names

johnmci
Well, to clarify: in Sophie we have a SophieID which is used to identify
objects. This just happens to be a subclass of UUID.
But I decided at the beginning I wanted a separate class SophieID in
case we wanted to migrate away from UUIDs.

*Most* objects in Sophie are subclassed off of SophieObject, which may
contain a SophieID instance.
Certainly if an object is serialized for XML storage it gets a
SophieID. Other objects referring to that object then get,
at serialization time, a SophieID reference value, which later,
when the XML is read, is used to hook the reference
up to the actual object. Fortunately at this time we do not have
cases of forward referencing when the book is read.
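
The hook-up step can be pictured with a small Python sketch. This is not Sophie's code: the record layout and field names are invented for the example, and plain dictionaries stand in for the XML and the objects. With no forward references, a single pass and an id-to-object table are enough.

```python
import uuid

# Serialized form (stand-in for the XML): each record has an id and may
# refer to an earlier record by id. Field names are made up for this sketch.
page_id = str(uuid.uuid4())
records = [
    {"id": page_id, "kind": "page", "ref": None},
    {"id": str(uuid.uuid4()), "kind": "frame", "ref": page_id},
]


def load(records):
    """Rebuild objects, replacing each id reference with the actual object."""
    by_id = {}
    objects = []
    for record in records:
        obj = {"kind": record["kind"], "target": None}
        if record["ref"] is not None:
            # No forward references, so the target has already been read.
            obj["target"] = by_id[record["ref"]]
        by_id[record["id"]] = obj
        objects.append(obj)
    return objects


page, frame = load(records)
print(frame["target"] is page)  # True
```

If forward references did occur, the lookup would fail with a KeyError, and a second pass to resolve pending references would be needed.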

