From: Linus Torvalds <>
To: Junio C Hamano <>
Cc: "David S. Miller" <>,
	Git Mailing List <>,
	Nicolas Pitre <>, Chris Mason <>
Subject: Re: and GIT tree rebuilding
Date: Sun, 26 Jun 2005 12:19:07 -0700 (PDT)
Message-ID: <>
In-Reply-To: <>

On Sun, 26 Jun 2005, Junio C Hamano wrote:
> My preference is to do things in this order:
>  (0) concatenate pack and idx files;

Actually, I was originally planning to do that, but now that I have 
thought about what read_sha1_file() would actually do, I think it's more 
efficient to leave the index as a separate file.

In particular, what you'd normally do is that if you can't look up the
file in the regular object directory, you start going through the pack
files. You can do it by having GIT_ALTERNATE_OBJECT_DIRECTORIES point to a
pack file, but I actually would prefer the notion of just adding a
subdirectory, and having object lookup just automatically open and map all 
index files in that subdirectory.
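
As a rough illustration of that lookup order, here is a minimal Python sketch (git itself is C; this is not the actual read_sha1_file()). The find_object() helper and the pack_indexes dict are hypothetical stand-ins: the dict plays the role of the mapped index files and simply maps an object name to an offset in the corresponding data file.

```python
import os

def loose_object_path(objdir, sha1_hex):
    # Loose objects live under <objdir>/<first two hex digits>/<remaining 38>.
    return os.path.join(objdir, sha1_hex[:2], sha1_hex[2:])

def find_object(objdir, sha1_hex, pack_indexes):
    """Return ("loose", path), ("pack", name, offset), or None.

    pack_indexes is a hypothetical stand-in for the mapped index files:
    {pack name: {sha1 hex: offset into that pack's data file}}.
    """
    # First try the regular object directory.
    path = loose_object_path(objdir, sha1_hex)
    if os.path.exists(path):
        return ("loose", path)
    # Only on a miss do we start going through the pack indexes.
    for name, index in pack_indexes.items():
        if sha1_hex in index:
            return ("pack", name, index[sha1_hex])
    return None
```

Only the small indexes are consulted during the search; a data file would be touched once an index reports a hit.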

And the thing is, you really just want to map the index files, the data
files can be so big that you can't afford to map them (ie a really big
project might have several pack-files a gig each or something like that).

And the most efficient way to map just the index file is to keep it 
separate, because then the "stat()" will just get the information 
directly, and you then just mmap that. 
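
A minimal Python sketch of that stat-then-mmap step, assuming the index is a standalone file. The map_index() name is made up for illustration, and it uses fstat() on the already-open descriptor rather than stat() on the path:

```python
import mmap
import os

def map_index(path):
    """Map a standalone index file read-only; its size comes from fstat()."""
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        # Python's mmap duplicates the descriptor, so the mapping
        # remains usable after we close fd below.
        return mmap.mmap(fd, size, access=mmap.ACCESS_READ)
    finally:
        os.close(fd)
```

No part of the file has to be read or parsed before the mapping is set up, which is the point of keeping the index separate.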

The alternative is to first read the index of the index (to figure out how
big the index is), and then map the rest. But that just seems a lot
messier than just mapping the index file directly.
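
For comparison, the alternative could look roughly like the Python sketch below. The combined layout here is purely an assumption for illustration (a 4-byte big-endian index length, then the index, then the data), not a format anybody has proposed:

```python
import mmap
import os
import struct

# Hypothetical header: 4-byte big-endian length of the index that follows.
HEADER = struct.Struct(">I")

def map_index_from_combined(path):
    """Map only the index portion of a hypothetical combined index+data file."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # Step 1: a small read just to learn how large the index is.
        (index_len,) = HEADER.unpack(os.read(fd, HEADER.size))
        # Step 2: map the header plus the index, leaving the data unmapped.
        return mmap.mmap(fd, HEADER.size + index_len, access=mmap.ACCESS_READ)
    finally:
        os.close(fd)
```

It works, but it needs an extra read and a format-specific header parse before anything can be mapped.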

And when creating these things, we do need to create the data file (which 
can be big enough that it doesn't fit in memory) first, so we have to have 
a separate file for it, we can't just stream it out to stdout.

Now, when _sending_ the pack-files, linearizing them is easy: you just 
send the index first, and the data file immediately afterwards. The index 
tells how big it is, so there's no need to even add any markers: you can 
do something like 'git-send-script' with something simple like

	git-rev-list ... | git-pack-file tmp-pack &&
	cat tmp-pack.idx | ssh other git-receive-script
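
On the receiving side, splitting such a stream back into two files could be sketched as below (Python, for illustration only). It assumes the index begins with a 4-byte big-endian count of the index bytes that follow; that header, and the receive_pack() and read_exact() names, are hypothetical and say nothing about the real index format.

```python
import shutil
import struct

# Hypothetical: the index starts with the length of the rest of the index.
HEADER = struct.Struct(">I")

def read_exact(stream, n):
    """Read exactly n bytes from stream, or raise EOFError."""
    chunks = []
    while n > 0:
        chunk = stream.read(n)
        if not chunk:
            raise EOFError("stream ended early")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)

def receive_pack(stream, idx_out, data_out):
    """Split an index-then-data stream into separate index and data files."""
    header = read_exact(stream, HEADER.size)
    (body_len,) = HEADER.unpack(header)
    idx_out.write(header)
    idx_out.write(read_exact(stream, body_len))
    # Everything after the index is pack data; copy it in chunks so a
    # large data file never has to fit in memory.
    shutil.copyfileobj(stream, data_out)
```

Because the index announces its own size, the receiver needs no marker between the two parts.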

So let's just keep the index/data files separate.

