Fwd: [squeak-dev] sustainable Monticello

Janko Mivšek


-------- Original Message --------
Subject: [squeak-dev] sustainable Monticello
Date: Tue, 8 Mar 2011 15:33:19 -0600
From: Chris Muller <[hidden email]>
Reply-To: [hidden email], The general-purpose Squeak developers
list <[hidden email]>
To: squeak dev <[hidden email]>

This will probably be a long post, but I would like to tell you about
the Monticello upgrades I'm about to move to the trunk.

Monticello has several repository types:

    MCRepository #('creationTemplate' 'storeDiffs')
        MCDictionaryRepository #('description' 'dict')
        MCFileBasedRepository #('cache' 'allFileNames')
            MCDirectoryRepository #('directory')
                MCCacheRepository #('packageCaches' 'seenFiles')
                MCSubDirectoryRepository #()
            MCFtpRepository #('host' 'directory' 'user' 'password' 'connection')
            MCHttpRepository #('location' 'user' 'password' 'readerCache')
            MCSMCacheRepository #('smCache')
        MCGOODSRepository #('hostname' 'port' 'connection')
        MCWriteOnlyRepository #()
            MCSMReleaseRepository #('packageName' 'user' 'password')
            MCSmtpRepository #('email')

but MCFileBasedRepository is the one that has been given all of the
focus; the other repository types have been ignored over the years.
MCHttpRepository, which interfaces with SqueakSource, and
MCDirectoryRepository are pretty much the only types being used.

I know this because the external users of the MCRepository API, such
as the repository-browser tools, MC-Configurations and Installer, are
all using APIs that are specific to MCFileBasedRepository and are not
generally understood by the other repository types or by the abstract
API in MCRepository.

This is worthy of concern because of the access limitations of an
MCFileBasedRepository.  Unlike an MCGOODSRepository, for example, a
file-system-based repository cannot efficiently meet the demands of
being an MCRepository without, at some points, needing to enumerate
ALL version names (files) in its file-system location.
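
To make the scaling concern concrete, here is a rough sketch in Python
(used only as a neutral illustration; none of this is Monticello code).
FlatListingStore and IndexedStore are hypothetical names: the first
models a store whose only primitive is "list every file name", the
second a store keyed by package name.  Version names are assumed to
follow the Package-author.number pattern.

```python
from collections import defaultdict


def package_of(version_name):
    """'Foo-Core-cmm.12' -> 'Foo-Core' (assumed naming pattern)."""
    return version_name.rpartition("-")[0]


class FlatListingStore:
    """Hypothetical model of a directory: it can only list every name."""

    def __init__(self, names):
        self._names = list(names)
        self.names_examined = 0  # counts the work done by queries

    def version_names_for_package(self, package):
        matches = []
        for name in self._names:  # every query walks the full listing
            self.names_examined += 1
            if package_of(name) == package:
                matches.append(name)
        return matches


class IndexedStore:
    """Hypothetical model of a store keyed by package name."""

    def __init__(self, names):
        self._by_package = defaultdict(list)
        for name in names:
            self._by_package[package_of(name)].append(name)
        self.names_examined = 0

    def version_names_for_package(self, package):
        matches = self._by_package.get(package, [])
        self.names_examined += len(matches)  # touches only the answer
        return list(matches)


# 100 packages with 50 versions each: 5000 names in total.
names = [f"Pkg{p}-abc.{n}" for p in range(100) for n in range(1, 51)]
flat, indexed = FlatListingStore(names), IndexedStore(names)
same = (flat.version_names_for_package("Pkg7")
        == indexed.version_names_for_package("Pkg7"))
print(same, flat.names_examined, indexed.names_examined)
```

Both stores answer the same 50 names, but the flat listing examines all
5000 entries for a single query while the keyed store touches only the
50 it returns.  The gap grows linearly with the size of the repository.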

As the number of versions in a repository reaches one million and
beyond, performance will grind to a halt because of the number of
files that must constantly be downloaded into RAM (another area of
unscalability and unsustainability related to file-based
repositories).  A purging of old versions could be done, but a
philosophy of Monticello, from the outset, has been that repositories
are intended to contain "all" of the version history.

I have therefore reworked the MCRepository APIs and external tools to
talk using only an API that is understood by any repository that
implements the methods identified as #subclassResponsibility in
MCRepository.  This minimally required API is now:

  #allPackageNames - answer a list of package names in this repository.
  #basicStoreVersion: - add a Version to this repository.
  #includesVersionNamed: - does a version with this name exist in this
    repository?
  #versionNamed: - answer the first Version object with the given name.
  #versionNamesForPackageNamed: - answer the version names for the
    given package name.
  #versionWithInfo:ifAbsent: - answer the Version object with the
    given unique VersionInfo.
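
As a rough illustration of how small this contract is, the six
operations can be modelled as an abstract interface plus one in-memory
implementation.  This is a Python sketch, not Monticello code: the
names Version, Repository and InMemoryRepository, the info_id field
standing in for a VersionInfo, and the choice to answer None for a
missing name are all assumptions made for the example.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    """Hypothetical stand-in for a Monticello Version."""
    package_name: str
    name: str     # e.g. 'Monticello-cmm.417'
    info_id: str  # stands in for the unique VersionInfo


class Repository(ABC):
    """The six required operations, with Python-style names."""

    @abstractmethod
    def all_package_names(self): ...             # #allPackageNames

    @abstractmethod
    def basic_store_version(self, version): ...  # #basicStoreVersion:

    @abstractmethod
    def includes_version_named(self, name): ...  # #includesVersionNamed:

    @abstractmethod
    def version_named(self, name): ...           # #versionNamed:

    @abstractmethod
    def version_names_for_package_named(self, package_name): ...

    @abstractmethod
    def version_with_info(self, info_id, if_absent): ...


class InMemoryRepository(Repository):
    """Dictionary-backed implementation used only for illustration."""

    def __init__(self):
        self._by_name = {}
        self._by_info = {}

    def all_package_names(self):
        return sorted({v.package_name for v in self._by_name.values()})

    def basic_store_version(self, version):
        # setdefault keeps the first Version stored under a given name
        self._by_name.setdefault(version.name, version)
        self._by_info[version.info_id] = version

    def includes_version_named(self, name):
        return name in self._by_name

    def version_named(self, name):
        return self._by_name.get(name)  # None when absent (an assumption)

    def version_names_for_package_named(self, package_name):
        return [v.name for v in self._by_name.values()
                if v.package_name == package_name]

    def version_with_info(self, info_id, if_absent):
        version = self._by_info.get(info_id)
        return version if version is not None else if_absent()


repo = InMemoryRepository()
repo.basic_store_version(Version("Monticello", "Monticello-cmm.417", "id-1"))
repo.basic_store_version(Version("Installer", "Installer-cmm.5", "id-2"))
print(repo.all_package_names())
print(repo.version_names_for_package_named("Monticello"))
```

A tool written against Repository alone would work unchanged with any
backend that supplies these six methods, which is the point of
restricting the tools to the abstract API.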

In deference to the limitations of file-based repositories, we only
ask for the _names_ of things rather than the whole object, because
the names are all that is needed to satisfy tool requirements, except
in cases where we need a single Version object (like loading).  A
file-based repository cannot access the Version objects quickly, just
the (file)names (incl. author & version number).
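
The reason names are enough can be sketched as follows: the package,
author and version number can be read straight off a file name without
opening the file.  This Python sketch assumes the conventional
Package-author.number.mcz layout; parse_version_file_name and
VersionName are hypothetical names, not part of Monticello.

```python
from typing import NamedTuple


class VersionName(NamedTuple):
    package: str
    author: str
    number: int


def parse_version_file_name(file_name):
    """Split 'Package-author.number[.mcz]' into its three parts."""
    suffix = ".mcz"
    stem = file_name[:-len(suffix)] if file_name.endswith(suffix) else file_name
    # Package names may themselves contain '-', so split on the last one.
    package, _, rest = stem.rpartition("-")
    author, _, number = rest.rpartition(".")
    if not (package and author and number.isdecimal()):
        raise ValueError(f"unrecognised version name: {file_name!r}")
    return VersionName(package, author, int(number))


print(parse_version_file_name("Monticello-cmm.417.mcz"))
```

Everything a browser needs for listing and sorting versions comes from
string handling on the name; only loading requires reading the file.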

During the process of this refactoring, I was able to significantly
improve the coherence of the code.  It was really, really bad in some
areas.

I've also verified the viability of this API by updating
MCMagmaRepository and demonstrating the use of Magma as a
totally sustainable and scalable MC repository.  Employing a
Magma-based Repository also affords some additional benefits, which I
will describe in a separate follow-up mail.

I think SqueakSource will eventually have to change to something more
scalable.  At least now we have a viable alternative, and with
much cleaner MC code in the process.

Please load my latest versions of Monticello,
MonticelloConfigurations, Installer and Tests from the Inbox and let
me know if you experience any issues.  You should not see any
difference in day-to-day operations.

 - Chris



Re: Fwd: [squeak-dev] sustainable Monticello

Stéphane Ducasse
Thanks.

If somebody wants to get the code and prepare it for Pharo, this will help.

Stef



On Mar 8, 2011, at 10:38 PM, Janko Mivšek wrote:

> [...]

Re: Fwd: [squeak-dev] sustainable Monticello

Chris Muller
Thanks, I'm glad Pharo will adopt it too, since having common MC APIs
between Squeak and Pharo will make life easier.

I only posted the latest versions; there are 8 or 10 interim
ancestor versions, which I will copy to trunk in a few days.

 - Chris

On Wed, Mar 9, 2011 at 2:55 AM, Stéphane Ducasse
<[hidden email]> wrote:

> Thanks.
>
> If somebody wants to get the code and prepare it for Pharo, this will help.
>
> Stef

Re: Fwd: [squeak-dev] sustainable Monticello

Stéphane Ducasse
Where can I find the code of your changes?
I looked in the Squeak trunk, but it is not clear to me.
I looked in the Squeak Inbox too, and again I could not find it.

Stef


Re: Fwd: [squeak-dev] sustainable Monticello

Chris Muller
I have not yet copied all of the packages.  Please give me a few more days.

On Sat, Mar 12, 2011 at 4:56 AM, Stéphane Ducasse
<[hidden email]> wrote:

> Where can I find the code of your changes?
> I looked in the Squeak trunk, but it is not clear to me.
> I looked in the Squeak Inbox too, and again I could not find it.
>
> Stef