From: Linus Torvalds <torvalds@osdl.org>
To: Git Mailing List <git@vger.kernel.org>
Subject: CAREFUL! No more delta object support!
Date: Mon, 27 Jun 2005 18:14:40 -0700 (PDT)
Message-ID: <Pine.LNX.4.58.0506271755140.19755@ppc970.osdl.org>


Some people may have noticed already (hopefully not the hard way) that the 
current git code doesn't support delta objects lying around in the object 
directory any more.

In other words, if you have delta objects, you need to un-deltify your 
repository _before_ you upgrade your git binaries, or they won't be able 
to read your objects any more.

The reason? The new git understands packed files natively, which ends up 
being a much bigger win in many many ways.

You should be very careful about using packed files (since they are a very 
recent addition), but what you can do to try them out is to do so in a 
separate repository.

Starting to use a packed repository is very simple indeed, and here's what 
you need to do for git, for example:

In your regular "git" directory (once you have updated your git to a 
recent version; in particular you need to have the "csum-file: fix missing 
buf pointer update" commit), do:

	git-rev-list --objects HEAD | git-pack-objects --window=50 --depth=50 out

which will say something like "Packing 3741 objects" and result in two new 
files a few seconds later:

	torvalds@ppc970:~/git> ls -lh out*
	-rw-r--r--  1 torvalds torvalds  89K Jun 27 17:59 out.idx
	-rw-r--r--  1 torvalds torvalds 1.3M Jun 27 17:59 out.pack

Now, don't do anything with those files, but instead go and create a 
directory somewhere else:

	cd ~
	mkdir packed-git-trial
	cd packed-git-trial
	git-init-db

You have now obviously created a totally empty repository. Now, let's 
populate that empty repository with _just_ the pack files:

	mkdir .git/objects/pack
	mv ~/git/out.* .git/objects/pack

and then move over your tags, in particular the HEAD pointer, with 
something like

	cat ~/git/.git/HEAD > .git/HEAD

and voila, you're done. Try "gitk", for example. Or "git log".
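
For anyone who wants to rehearse the whole sequence without touching a
real tree, here is a rough sketch of the same steps. It is not the exact
recipe above: it assumes a current git with the non-dashed command
spellings, builds a tiny stand-in repository in a temporary directory
(the names "src" and "packed-git-trial" under it are made up for
illustration), and uses "git update-ref" rather than "cat" for HEAD,
because a current git typically stores HEAD as a symbolic reference
instead of something that reads back as an object name. A current
"git pack-objects" also names its output out-<hash>.pack and
out-<hash>.idx rather than out.pack and out.idx.

```shell
#!/bin/sh
set -eu

# Throwaway workspace; both repositories are hypothetical stand-ins.
work=$(mktemp -d)
src="$work/src"
trial="$work/packed-git-trial"

# A tiny source repository standing in for ~/git.
git init -q "$src"
cd "$src"
echo "hello" > file.txt
git add file.txt
git -c user.name=Example -c user.email=example@example.invalid \
    -c commit.gpgsign=false commit -q -m "initial commit"
head_sha=$(git rev-parse HEAD)

# Pack everything reachable from HEAD into out-<hash>.pack / out-<hash>.idx.
git rev-list --objects HEAD |
    git pack-objects -q --window=50 --depth=50 out >/dev/null

# Fresh, empty repository that receives only the pack files.
git init -q "$trial"
mkdir -p "$trial/.git/objects/pack"
mv "$src"/out-* "$trial/.git/objects/pack/"

# Point the trial repository's current branch at the packed commit.
cd "$trial"
git update-ref HEAD "$head_sha"

# The history is now read from the pack alone.
git log --oneline
```

If the last command lists the commit, the new repository is reading
everything it needs from the pack directory.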

Now, what's even cooler is how you can just start using this packed tree: 
feel free to do a test-commit or something, and notice how git starts 
populating the empty .git/objects/xx/ subdirectories with new objects. But 
it still relies on the pack-file for the old history.
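
One way to watch that split between new loose objects and old packed
history is sketched below. It assumes a current git, and it swaps in
"git repack -a -d" as a shortcut for getting an all-packed starting
point inside a single throwaway repository, which is not the procedure
described above. The "commit" shell function is a hypothetical helper
that only supplies an identity for the commits.

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# Hypothetical helper: commit with a fixed identity so no global config is needed.
commit() {
    git -c user.name=Example -c user.email=example@example.invalid \
        -c commit.gpgsign=false commit -q -m "$1"
}

echo "old history" > file.txt
git add file.txt
commit "first"

# Move all existing objects into one pack and drop the loose copies.
git repack -a -d -q

# A test commit: its new objects are written as loose files.
echo "new work" >> file.txt
git add file.txt
commit "test commit"

# "count" is the number of loose objects, "in-pack" the number of packed ones.
git count-objects -v
```

Both numbers should come out nonzero: the test commit lives in loose
object files while the first commit is only available from the pack.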

Now, there's still a misfeature there, which is that when you create a new
object, it doesn't check whether that object already exists in the
pack-file, so you'll end up with a few recent objects that you really
don't need (notably tree objects), and we'll fix that eventually. But
notice how you started with a 17MB .git/objects/ directory in your
original tree, and you now have just a 1.3MB pack-file and a 90kB index
file that replaces all that?

There are some other issues too, like the fact that "git-fsck-cache"  
doesn't know about the pack-files yet, so it will complain about missing
objects etc. Also, please note that the pack-file _only_ packs the commits
and the things reachable from them: things like tags (and your references
in your .git/refs directory) need to be copied over separately.
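
Carrying the references over separately could look something like the
sketch below. It assumes a current git and recreates each ref by name
with "git for-each-ref" and "git update-ref" instead of copying files
out of .git/refs, since a current git may keep refs in a packed-refs
file. The "src" and "dst" repositories and the lightweight tag "v0.1"
are made up for illustration; an annotated tag would also need its tag
object, which this pack does not contain.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
src="$work/src"
dst="$work/dst"

# Stand-in source repository with one commit and one lightweight tag.
git init -q "$src"
cd "$src"
echo "v1" > file.txt
git add file.txt
git -c user.name=Example -c user.email=example@example.invalid \
    -c commit.gpgsign=false commit -q -m "first"
git tag v0.1    # lightweight tag: just a ref, no extra object

# Pack what is reachable from HEAD and hand only the pack to the new repository.
git rev-list --objects HEAD | git pack-objects -q out >/dev/null
git init -q "$dst"
mkdir -p "$dst/.git/objects/pack"
mv out-* "$dst/.git/objects/pack/"

# The pack carries no refs, so recreate each one by name in the destination.
git for-each-ref --format='%(refname) %(objectname)' |
while read -r name sha; do
    git -C "$dst" update-ref "$name" "$sha"
done

# List what the destination now knows about.
git -C "$dst" for-each-ref
```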

So this is all very rough, still, but the basics do actually seem to work
(i.e. anything that doesn't look directly at the object files, which is
pretty much all of it except for fsck and the direct-filesystem-access 
things like "rsync" and "git-local-pull").

You might not want to switch over yet: as mentioned, rsync and
git-local-pull are no longer good ways to sync a packed repository, but
the "git-http/ssh-pull" family should hopefully just work.

I've used a packed kernel tree too, so this has gotten _some_ testing even 
on really quite big git trees. 

			Linus
