This will probably be a long post, but I would like to tell you about
the Monticello upgrades I'm about to move to the trunk.

Monticello has several repository types:

    MCRepository #('creationTemplate' 'storeDiffs')
    MCDictionaryRepository #('description' 'dict')
    MCFileBasedRepository #('cache' 'allFileNames')
    MCDirectoryRepository #('directory')
    MCCacheRepository #('packageCaches' 'seenFiles')
    MCSubDirectoryRepository #()
    MCFtpRepository #('host' 'directory' 'user' 'password' 'connection')
    MCHttpRepository #('location' 'user' 'password' 'readerCache')
    MCSMCacheRepository #('smCache')
    MCGOODSRepository #('hostname' 'port' 'connection')
    MCWriteOnlyRepository #()
    MCSMReleaseRepository #('packageName' 'user' 'password')
    MCSmtpRepository #('email')

but MCFileBasedRepository is the one that has been given all of the focus; the other repository types have been ignored over the years. MCHttpRepository, which is the one that interfaces with SqueakSource, and MCDirectoryRepository are pretty much the only types being used. I know this because the external users of the MCRepository API, such as the Repository-browser tools, MC-Configurations and Installer, all use APIs that are specific to MCFileBasedRepository and are not generally understood by the other repository types or by the abstract API in MCRepository.

This is worthy of concern because of the access limitations of an MCFileBasedRepository. Unlike an MCGOODSRepository, for example, a file-system-based repository cannot efficiently meet the demands of being an MCRepository without, at some points, needing to enumerate ALL version names (files) in its file-system location. As the number of versions in a repository reaches 1 million and beyond, performance will grind to a halt due to the number of files that must be constantly downloaded into RAM (another area of unscalability and unsustainability related to file-based repositories). A purging of old versions could be done, but a philosophy of Monticello, from the outset, has been that repositories are intended to contain "all" of version history.
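To make the enumeration cost concrete, here is a rough sketch in Python. It is not Monticello code: the function names, the sample listing and its version numbers are made up for illustration, and it assumes the usual `<package>-<author>.<number>.mcz` file naming. The point is only that answering "which versions exist for package X?" from a plain file listing means looking at every file name in the repository.

```python
import re

# Assumed naming convention: <package>-<author>.<number>.mcz
VERSION_FILE = re.compile(
    r"^(?P<package>.+)-(?P<author>[^-.]+)\.(?P<number>\d+)\.mcz$"
)


def parse_version_file(file_name):
    """Split a file name into (package, author, number); None if it does not match."""
    match = VERSION_FILE.match(file_name)
    if match is None:
        return None
    return match.group("package"), match.group("author"), int(match.group("number"))


def version_names_for_package(all_file_names, package_name):
    """Answer the version names of one package by scanning every file name.

    The work is proportional to the total number of files in the repository,
    not to the number of versions of the requested package.
    """
    names = []
    for file_name in all_file_names:
        parsed = parse_version_file(file_name)
        if parsed is not None and parsed[0] == package_name:
            names.append(file_name[: -len(".mcz")])
    return names


# Hypothetical directory listing; the version numbers are invented.
listing = [
    "Monticello-cmm.417.mcz",
    "Monticello-cmm.418.mcz",
    "MonticelloConfigurations-cmm.87.mcz",
    "Installer-Core-cmm.340.mcz",
    "README.txt",
]
```

With a million files in `listing`, every such query walks all million names, and over HTTP or FTP the listing has to be fetched first.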
I have therefore reworked the MCRepository APIs and external tools to talk using only an API that is understood by any repository that implements the methods identified as #subclassResponsibility in MCRepository. This minimally-required API is now:

    #allPackageNames - answer a list of package names in this repository.
    #basicStoreVersion: - add a Version to this repository.
    #includesVersionNamed: - does a version with this name exist in this repository?
    #versionNamed: - answer the first Version object with the given name.
    #versionNamesForPackageNamed: - answer the version names for the given package name.
    #versionWithInfo:ifAbsent: - answer the Version object with the given unique VersionInfo.

In deference to the limitations of file-based repositories, we only ask for the _names_ of things rather than the whole object, because the names are all that is needed to satisfy tool requirements, except in cases where we need a single Version object (like loading). A file-based repository cannot access the Version objects quickly, just the (file)names (incl. author & version-number).

During the process of this refactoring, I was able to significantly improve the coherence of the code. It was really, really bad in some areas.

I've also verified the viability of this API by updating MCMagmaRepository, and demonstrating using Magma as a totally-sustainable and scalable MC repository. Employing a Magma-based repository also affords some additional benefits, which I will describe in a separate follow-up mail.

I think SqueakSource will eventually have to change to something more scalable. At least now we have a viable alternative, and with much cleaner MC code in the process.

Please load my latest versions of Monticello, MonticelloConfigurations, Installer and Tests from the Inbox and let me know if you experience any issues. You should not see any difference in day-to-day operations.

 - Chris
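For readers who want to see the shape of the six-method contract listed in the post, here is a minimal sketch in Python. It is an illustration only, not the Smalltalk implementation: the class names `Repository`, `DictRepository` and `Version`, the `info_id` field standing in for a VersionInfo, and the choice to return `None` for a missing name are all assumptions made for the example.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    name: str          # e.g. "Monticello-cmm.418"
    package_name: str  # e.g. "Monticello"
    info_id: str       # stand-in for the unique VersionInfo


class Repository(ABC):
    """Abstract contract; each method mirrors one selector from the list above."""

    @abstractmethod
    def all_package_names(self): ...

    @abstractmethod
    def basic_store_version(self, version): ...

    @abstractmethod
    def includes_version_named(self, name): ...

    @abstractmethod
    def version_named(self, name): ...

    @abstractmethod
    def version_names_for_package_named(self, package_name): ...

    @abstractmethod
    def version_with_info(self, info_id, if_absent): ...


class DictRepository(Repository):
    """In-memory repository keyed by info_id (hypothetical example backend)."""

    def __init__(self):
        self._by_id = {}

    def all_package_names(self):
        return sorted({v.package_name for v in self._by_id.values()})

    def basic_store_version(self, version):
        self._by_id[version.info_id] = version

    def includes_version_named(self, name):
        return any(v.name == name for v in self._by_id.values())

    def version_named(self, name):
        # Answers the first match; None when absent is an assumption of this sketch.
        return next((v for v in self._by_id.values() if v.name == name), None)

    def version_names_for_package_named(self, package_name):
        return [v.name for v in self._by_id.values() if v.package_name == package_name]

    def version_with_info(self, info_id, if_absent):
        version = self._by_id.get(info_id)
        return version if version is not None else if_absent()


repo = DictRepository()
repo.basic_store_version(Version("Monticello-cmm.417", "Monticello", "id-1"))
repo.basic_store_version(Version("Monticello-cmm.418", "Monticello", "id-2"))
repo.basic_store_version(Version("Installer-Core-cmm.340", "Installer-Core", "id-3"))
```

A tool written against `Repository` alone works with any backend, and it only pulls a full `Version` when it needs one (for loading, say); everything else is answered from names.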
On Tue, Mar 8, 2011 at 1:33 PM, Chris Muller <[hidden email]> wrote:
> This will probably be a long post, but I would like to tell you about
> the Monticello upgrades I'm about to move to the trunk.

[snip details]

> I think SqueakSource will eventually have to change to something more
> scalable. At least now we have a viable alternative, and with
> much cleaner MC code in the process.
>
> Please load my latest versions of Monticello,
> MonticelloConfigurations, Installer and Tests from the Inbox and let
> me know if you experience any issues. You should not see any
> difference in day-to-day operations.

Hi Chris,

Thanks so much for doing this work. You've fixed a pretty painful design flaw that we've been working around for years.

Originally, the repository protocol was very, very simple. It was based entirely on VersionInfo and we expected that repositories would use the UUIDs they contain to store versions. (You'd basically just need #storeVersion and #versionWithInfo:ifAbsent:). We then found that naming mcz files with the version name instead of the UUID made it very easy to manage repositories using the OS, and so all the cruft involving version names and custom RepositoryInspectors grew up to make that viable.

I've loaded your work from the Inbox and find that it works well - all tests are green and poking around with repository inspectors, loading, saving etc. didn't cause any problems. Nice work!

Colin