From: Linus Torvalds <torvalds@osdl.org>
To: Matt Mackall <mpm@selenic.com>
Cc: Bill Davidsen <davidsen@tmr.com>,
Morten Welinder <mwelinder@gmail.com>,
Sean <seanlkml@sympatico.ca>,
linux-kernel <linux-kernel@vger.kernel.org>,
git@vger.kernel.org
Subject: Re: Mercurial 0.4b vs git patchbomb benchmark
Date: Mon, 2 May 2005 15:49:49 -0700 (PDT)
Message-ID: <Pine.LNX.4.58.0505021540070.3594@ppc970.osdl.org>
In-Reply-To: <20050502223002.GP21897@waste.org>

On Mon, 2 May 2005, Matt Mackall wrote:
> >
> > - you can share the objects freely between different trees, never
> > worrying about one tree corrupting another trees object by mistake.
>
> Not sure if this is terribly useful. It just makes it harder to pull
> the subset you're interested in.
You don't have to share things in a single subdirectory. Symlinks and
hardlinks work fine, as do actual filesystem tricks ;)
> > - you can drop old objects.
>
> You can't drop old objects without dropping all the changesets that
> refer to them or otherwise being prepared to deal with the broken
> links.
Absolutely. This needs support from fsck to allow us to say "commit xxxx
is no longer in the tree, because we pruned it".

Alternatively (and that's the much less intrusive one), you keep all the
commit objects, but drop the tree and blob objects. Again, all you need
for this to work is to feed a list of commits to fsck, and tell it
"we've pruned those from the tree", which tells fsck not to start looking
for the contents of those commits.

So for example, you can trivially have something that automates this: take
each commit that is older than <x> days, add it to the "prune list", and
run fsck, and delete all objects that now show up as being unreachable
(since fsck won't be looking at what those commits reference).

I could write this up in ten minutes. It's really simple.

And it's simple _exactly_ because we don't do deltas.
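The pruning steps above can be sketched roughly as follows. This is a toy model, not git code: the object store is a hypothetical in-memory dict, and the names `select_prune_list`, `reachable` and `prune` as well as the `time`, `tree`, `parents` and `entries` fields are assumptions made up for the illustration. The walk keeps every commit object but does not descend into the tree of a commit on the prune list, so anything only those commits referenced ends up unreachable.

```python
def select_prune_list(store, now, max_age_days):
    """Return the names of commits older than max_age_days (the "<x> days")."""
    cutoff = now - max_age_days * 86400
    return {name for name, obj in store.items()
            if obj["type"] == "commit" and obj["time"] < cutoff}


def reachable(store, heads, pruned):
    """Walk the object graph from the heads, fsck style.

    Commits on the prune list are kept, and their parents are still
    followed, but their trees are not looked at.
    """
    seen = set()
    stack = list(heads)
    while stack:
        name = stack.pop()
        if name in seen or name not in store:
            continue
        seen.add(name)
        obj = store[name]
        if obj["type"] == "commit":
            stack.extend(obj["parents"])
            if name not in pruned:
                stack.append(obj["tree"])
        elif obj["type"] == "tree":
            stack.extend(obj["entries"])
        # blobs reference nothing
    return seen


def prune(store, heads, pruned):
    """Return a copy of the store without the unreachable objects."""
    keep = reachable(store, heads, pruned)
    return {name: obj for name, obj in store.items() if name in keep}
```

Note that a blob shared between an old, pruned commit and a newer one survives, because it is still reachable through the newer commit's tree.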
> > delta models very fundamentally don't support this.
>
> The latter can be done in a pretty straightforward manner in mercurial
> with one pass over the data. But I have a goal to make keeping the
> whole history cheap enough that no one balks at it.
With deltas, you have two choices:

 - change all the sha1 names (i.e. a pruned tree would no longer be
   compatible with a non-pruned one)
 - make the delta part not show up as part of the sha1 name (which means
   that it's unprotected).

Which one would you have?
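The first of the two choices can be illustrated with a small sketch. Everything here is a simplification chosen for the example: `name_of_full` and `name_of_delta` are hypothetical helpers, the "delta" is a stand-in string rather than a real delta format, and real git hashes a short header together with the content. The point is only that a name computed over the complete content does not care which other objects exist, while a name computed over a delta and its base changes as soon as the base is pruned and the object has to be stored differently.

```python
import hashlib


def name_of_full(content: bytes) -> str:
    # Toy whole-object naming: hash of the complete content.
    return hashlib.sha1(content).hexdigest()


def name_of_delta(base_name: str, delta: bytes) -> str:
    # Toy delta naming: the name covers the stored delta and its base.
    return hashlib.sha1(base_name.encode() + b"\0" + delta).hexdigest()


v1 = b"hello\n"
v2 = b"hello world\n"
delta_v1_to_v2 = b"insert ' world' at offset 5"  # stand-in, not a real format

# Whole-object naming: v2 has the same name whether or not v1 still exists.
full_name = name_of_full(v2)

# Delta-covered naming: while v1 exists, v2 is named via its delta against v1.
before_prune = name_of_delta(name_of_full(v1), delta_v1_to_v2)

# After v1 is pruned, v2 has to be stored against an empty base instead,
# so the hashed bytes change and the name changes with them.
after_prune = name_of_delta(name_of_full(b""), v2)

print(before_prune != after_prune)
```

The second choice is not shown: there the name would be computed from the reconstructed content only, so the name by itself says nothing about the stored delta bytes.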
> What is a tree re-linker? Finds duplicate files and hard-links them?
> Ok, that makes some sense. But it's a win on one machine and a lose
> everywhere else.
Where would it be a loss? Especially since with git, it's cheap (you don't
need to compare content to find objects to link - you can just compare
filename listings).
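A re-linker along those lines might look like the sketch below. It assumes two flat object directories on the same filesystem in which a file's name is the hash of its content, so equal names imply equal content; the function name `relink` and the temporary-file suffix are made up for the example, and a real object directory layout may use subdirectories that this sketch does not handle.

```python
import os


def relink(src_dir: str, dst_dir: str) -> int:
    """Hard-link files in dst_dir to the same-named files in src_dir.

    Only the two filename listings are compared; file contents are never
    read. Returns the number of files that were re-linked.
    """
    common = set(os.listdir(src_dir)) & set(os.listdir(dst_dir))
    linked = 0
    for name in sorted(common):
        src = os.path.join(src_dir, name)
        dst = os.path.join(dst_dir, name)
        if not (os.path.isfile(src) and os.path.isfile(dst)):
            continue
        if os.path.samefile(src, dst):
            continue  # already the same inode
        tmp = dst + ".relink-tmp"
        os.link(src, tmp)     # new hard link to the source object
        os.replace(tmp, dst)  # atomically swap it in place of the copy
        linked += 1
    return linked
```

Running it a second time over the same pair of directories re-links nothing, since every shared name already points at the same inode.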
> I've added an "hg verify" command to Mercurial. It doesn't attempt to
> fix anything up yet, but it can catch a couple things that git
> probably can't (like file revisions that aren't owned by any
> changeset), namely because there's more metadata around to look at.
git-fsck-cache catches exactly those kinds of things. And since it checks
pretty much every _single_ assumption in git (which is not a lot, since
git doesn't have a lot of assumptions), I guarantee you that you can't
find any more than it does (the filename ordering is the big missing
piece: I _still_ don't verify that trees are ordered. I've been mentioning
it since the beginning, but I'm lazy).

In other words, your verifier can't verify anything more. It's entirely
possible that more things can go _wrong_, since you have more indexes, so
your verifier will have more to check, but that's not an advantage, that's
a downside.
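The missing tree-ordering check mentioned above could be sketched like this. It assumes a tree is available as a list of entry names in stored order and that "ordered" means strictly increasing under plain bytewise comparison; the function name `tree_is_ordered` is hypothetical, and git's actual comparison rules may differ in details such as how directory names are handled.

```python
from typing import Sequence


def tree_is_ordered(names: Sequence[bytes]) -> bool:
    """True if the entry names are strictly increasing, bytewise.

    Strict comparison also rejects duplicate names.
    """
    return all(a < b for a, b in zip(names, names[1:]))


print(tree_is_ordered([b"Makefile", b"README", b"kernel"]))  # True
print(tree_is_ordered([b"kernel", b"Makefile"]))             # False
```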
Linus