Re: Submodule object store

From: David Lang <david.lang@digitalinsight.com>
To: Martin Waitz <tali@admingilde.org>
Cc: Junio C Hamano <junkio@cox.net>,
	Josef Weidendorfer <Josef.Weidendorfer@gmx.de>,
	Eric Lesh <eclesh@ucla.edu>, Matthieu Moy <Matthieu.Moy@imag.fr>,
	git@vger.kernel.org
Subject: Re: Submodule object store
Date: Tue, 27 Mar 2007 08:53:37 -0800 (PST)	[thread overview]
Message-ID: <Pine.LNX.4.63.0703270735550.15345@qynat.qvtvafvgr.pbz> (raw)
In-Reply-To: <20070327152507.GO22773@admingilde.org>

On Tue, 27 Mar 2007, Martin Waitz wrote:

> On Mon, Mar 26, 2007 at 03:40:15PM -0800, David Lang wrote:
>> useing the same object store makes this work automaticaly (think of all the
>> copies of COPYING that would end up being the same as a trivial example)
>
> Yes, but I guess not much more than COPYING, INSTALL, some trivial
> Makefiles and empty files will be shared between subprojects.
> Except when you have the same subproject in your tree multiple times,
> of course.

although, if you end up packing multiple projects togeather you may end up 
finding more things that diff well against each other (although it will slow 
down the packing with more objects.

> Yet this sharing is exactly why I started to do it that way, until Linus
> stopped me.

I missed that one.

>>> If someone comes up with a nice way to handle everything in one big
>>> object store I would happily use that! :-)
>>
>> what exactly are the problems with one big object store?
>
> I think we really have to discuss this separation on several layers:
> traversal, pack-files, and object database.
>
> For the traversal the point of separating it into a per-module traversal
> is that only one module has to be loaded into RAM at a time.
> This effects all operations which do a (potentially) recursive traversal:
> push, pull, fsck, prune, repack.
> However a separated traversal will no longer be garanteed to only list
> an object once, so this has to be handled in some way.

an object can already appear more then once in pack files.

> Pack files should have better access patterns if they are per-module.
> Most of the time you are only interested in one individual module and
> locality is important here.
>
> Separating the entire object database is a way to improve unreachability
> analysis, as it now can be done per module.
> The other two separations are easier to implement with a separated
> object database, but that's not too strong an argument.

if modules are really as seperate as you make them out to be then what you want 
isn't multiple modules inside one overall project (top level .git) you want 
multiple projects and a way to link them togeather.

>
> So if we can come up with a nice way to do unreachability analysis we
> can indeed go on with the shared object database and tackle the
> remaining scalability issues as they arise.  Those could then be added
> later without changing the on-disk format.
>
>> ones that I can think of:
>>
>> 1. when you are doing a fsck you need to walk all the trees and find out
>> the list of objects that you know about.
>>
>>   done as a tree of binary values you can hold a LOT in memory before
>>   running into swap.
>
> Could you explain the algorithm you are thinking about in more detail?

as I understand it the need is to efficiantly create a list of all the objects 
that are reachable (so that we can then go through the objects and remove them 
if they aren't on the list).

you need these sorted to make it easy to find if something is in the list, and 
with millions of entries you don't it to be a flat list (inserting new values 
becomes very inefficiant) so the classic answer is to do a tree structure. you 
can either do a tree with the object ID's in all the nodes, or you can do one 
where only the leaf nodes hold the object ID's and the other nodes just hold 
pointers (which would then allow you to spill the leaf nodes to disk more 
efficiantly as they wouldn't need to be accessed when inserting unless the node 
itself needed to be changed. looking them up is being done more or less in alpha 
order for loose objects (and could be made to be so for objects in packs) so any 
file I/O for lookups would be close to sequential

this sort of memory useage wouldn't be acceptable for something that happens 
frequently, but a fsck/prune is relativly infrequent and can be run off-hours.

>>   if it's enough larger then available ram then an option for fsck to use
>>   trees on disk is an option.
>
> This could simplify some things.
> There could be an on-disk index of all known objects, so that the sha1
> sums do not have to loaded into RAM all at once.

you wouldn't want to trust this for a fsck/prune

David Lang