The Trunk: Monticello-cmm.585.mcz


Re: The Trunk: Monticello-cmm.585.mcz

Colin Putney-3



On Thu, Jan 30, 2014 at 9:25 AM, Colin Putney <[hidden email]> wrote:
 
We could also change the mcz format to allow references between nodes in the ancestry tree so that there's no duplicate information there. That would save space inside mcz files.

I should have checked on this before posting. :-)

It turns out we already do this, so optimizing the ancestry data in an mcz file would require a pretty radical change.

Colin



Re: The Trunk: Monticello-cmm.585.mcz

Chris Muller-3
In reply to this post by Eliot Miranda-2
>> In a random mcz file from my cache (Kernel-721.cwp):
>>
>>      15     package
>> 1482380     snapshot.bin
>> 1407480     snapshot/source.st
>>  243499     version
>>
>> These are byte counts, so the ancestry data (in "version") takes up about
>> 8% of the total. If we really want to make these files smaller, we'd do
>> better to get rid of the redundancy between snapshot.bin and
>> snapshot/source.st.
>
>
> I thought that the issue was not file size but in-image footprint.  What do
> others think?

MC has 3 aspects that grow without limit, causing increasing degradation
and scaling issues: 1) the in-image ancestry, 2) the allFilenames cache
size (this is the Numero UNO thing our trunk server spends time
doing), and 3) the repetition of code across the .mcz files of each
package, which wastes quite a lot of space.  Morphic is probably the
largest: its 582 versions in trunk consume 857M today.  The total size
of packages in /trunk is currently 3.1G.
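
For anyone who wants to check the arithmetic, here is a rough sketch in Python (not Squeak code, and nothing here touches Monticello itself). It uses only the byte counts quoted above for Kernel-721.cwp and the Morphic figures from this post. The names `members`, `share` and `morphic_avg_m` are made up for illustration.

```python
# Byte counts quoted above for the members of the Kernel-721.cwp mcz file.
members = {
    "package": 15,
    "snapshot.bin": 1482380,
    "snapshot/source.st": 1407480,
    "version": 243499,
}

total = sum(members.values())


def share(name):
    """Return the percentage of the listed bytes taken by one member."""
    return 100.0 * members[name] / total


for name, size in members.items():
    print(f"{name:20s} {size:>8d}  {share(name):5.1f}%")

# Average size per Morphic version in trunk, using the figures from this
# post (857M across 582 versions), in the same "M" units as quoted.
morphic_avg_m = 857 / 582
print(f"Morphic average: {morphic_avg_m:.2f}M per version")
```

The "version" member comes out at a little under 8% of the listed bytes, matching the estimate quoted above, while snapshot.bin and snapshot/source.st each account for roughly 45 to 47%. Morphic averages about 1.47M per version.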
