From: Linus Torvalds <torvalds@osdl.org>
To: Junio C Hamano <junkio@cox.net>
Cc: "David S. Miller" <davem@davemloft.net>,
Git Mailing List <git@vger.kernel.org>,
Nicolas Pitre <nico@cam.org>, Chris Mason <mason@suse.com>
Subject: Re: kernel.org and GIT tree rebuilding
Date: Sun, 26 Jun 2005 12:19:07 -0700 (PDT)
Message-ID: <Pine.LNX.4.58.0506261206170.19755@ppc970.osdl.org>
In-Reply-To: <7vzmtdq7wy.fsf@assigned-by-dhcp.cox.net>
On Sun, 26 Jun 2005, Junio C Hamano wrote:
>
> My preference is to do things in this order:
>
> (0) concatenate pack and idx files;
Actually, I was originally planning to do that, but now that I have
thought about what read_sha1_file() would actually do, I think it's more
efficient to leave the index as a separate file.
In particular, what you'd normally do is that if you can't look up the
file in the regular object directory, you start going through the pack
files. You can do it by having GIT_ALTERNATE_OBJECT_DIRECTORIES point to a
pack file, but I actually would prefer the notion of just adding a
	.git/objects/pack
subdirectory, and having object lookup just automatically open and map all
index files in that subdirectory.
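A minimal sketch of that lookup order, for illustration only: check the regular object directory first, then fall back to index files mapped from the pack subdirectory. The function names and the index layout (a flat run of 20-byte SHA-1 records) are assumptions made for this example, not the format git uses.

```python
import mmap
import os

SHA1_LEN = 20  # raw SHA-1 size in bytes


def map_pack_indexes(objects_dir):
    """Map every *.idx file under <objects_dir>/pack read-only."""
    pack_dir = os.path.join(objects_dir, "pack")
    mapped = {}
    if not os.path.isdir(pack_dir):
        return mapped
    for name in sorted(os.listdir(pack_dir)):
        if not name.endswith(".idx"):
            continue
        path = os.path.join(pack_dir, name)
        size = os.stat(path).st_size
        if size == 0:
            continue  # an empty file cannot be mapped
        with open(path, "rb") as f:
            # The mapping stays valid after the file object is closed.
            mapped[name] = mmap.mmap(f.fileno(), size, access=mmap.ACCESS_READ)
    return mapped


def find_object(objects_dir, hex_sha1, indexes):
    """Return ("loose", path), ("pack", index_name), or None."""
    # 1. Regular object directory: objects/ab/cdef...
    loose = os.path.join(objects_dir, hex_sha1[:2], hex_sha1[2:])
    if os.path.exists(loose):
        return ("loose", loose)
    # 2. Fall back to the mapped pack indexes.
    # Assumed layout: consecutive 20-byte SHA-1 records, scanned linearly.
    raw = bytes.fromhex(hex_sha1)
    for name, m in indexes.items():
        for off in range(0, len(m) - SHA1_LEN + 1, SHA1_LEN):
            if m[off:off + SHA1_LEN] == raw:
                return ("pack", name)
    return None
```

Only the index files are mapped here; the (possibly huge) data files are never touched until an object is actually found in one of them.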
And the thing is, you really just want to map the index files; the data
files can be so big that you can't afford to map them (i.e. a really big
project might have several pack-files of a gig each or something like that).
And the most efficient way to map just the index file is to keep it
separate, because then the "stat()" will just get the information
directly, and you then just mmap that.
The alternative is to first read the index of the index (to figure out how
big the index is), and then map the rest. But that just seems a lot
messier than just mapping the index file directly.
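The two alternatives can be compared in a short sketch. Everything about the combined layout is an assumption made for this example: a 4-byte big-endian header giving the number of index bytes that follow, then the index, then the data. The function names are hypothetical as well.

```python
import mmap
import os
import struct

# Assumed header for the combined layout: index length as a big-endian u32.
HEADER = struct.Struct(">I")


def map_separate_index(idx_path):
    """Separate index file: one stat() yields the full mapping length."""
    size = os.stat(idx_path).st_size
    with open(idx_path, "rb") as f:
        return mmap.mmap(f.fileno(), size, access=mmap.ACCESS_READ)


def map_embedded_index(combined_path):
    """Combined file: read the header first to learn how much to map."""
    with open(combined_path, "rb") as f:
        (index_size,) = HEADER.unpack(f.read(HEADER.size))
        # Map only header + index; the data that follows stays unmapped.
        return mmap.mmap(
            f.fileno(), HEADER.size + index_size, access=mmap.ACCESS_READ
        )
```

The second function needs an extra read before it can size the mapping, and its callers must skip the first `HEADER.size` bytes to reach the index; the first needs neither.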
And when creating these things, we do need to create the data file (which
can be big enough that it doesn't fit in memory) first, so we have to have
a separate file for it; we can't just stream it out to stdout.
Now, when _sending_ the pack-files, linearizing them is easy: you just
send the index first, and the data file immediately afterwards. The index
tells how big it is, so there's no need to even add any markers: you can
do something like 'git-send-script' with something simple like
	git-rev-list ... | git-pack-file tmp-pack &&
	cat tmp-pack.idx tmp-pack.data | ssh other git-receive-script
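On the receiving end, splitting the concatenated stream back into two files could look roughly like this. It is a sketch under a loud assumption: the index is taken to begin with a 4-byte big-endian count of the index bytes that follow it. That header, and the name `split_stream`, are hypothetical; the message only says that the index tells how big it is.

```python
import struct

# Hypothetical index header: number of index bytes following it (u32, big-endian).
HEADER = struct.Struct(">I")


def split_stream(stream, idx_out, data_out, chunk=65536):
    """Copy the leading index to idx_out and everything after it to data_out."""
    head = stream.read(HEADER.size)
    if len(head) != HEADER.size:
        raise ValueError("stream too short for index header")
    (index_size,) = HEADER.unpack(head)
    idx_out.write(head)

    # The index announces its own length, so no marker is needed.
    remaining = index_size
    while remaining > 0:
        buf = stream.read(min(chunk, remaining))
        if not buf:
            raise ValueError("truncated index")
        idx_out.write(buf)
        remaining -= len(buf)

    # Whatever is left is the data file; stream it in bounded chunks so
    # it never has to fit in memory.
    while True:
        buf = stream.read(chunk)
        if not buf:
            break
        data_out.write(buf)
    return index_size
```

A receiver might call it as `split_stream(sys.stdin.buffer, open("tmp-pack.idx", "wb"), open("tmp-pack.data", "wb"))`.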
So let's just keep the index/data files separate.
Linus