git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jeff King <peff@peff.net>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Langhoff <martin.langhoff@gmail.com>,
	Git Mailing List <git@vger.kernel.org>
Subject: Re: Merging limitations after directory renames -- interesting test repo
Date: Sat, 19 Feb 2011 04:08:12 -0500	[thread overview]
Message-ID: <20110219090812.GA20577@sigill.intra.peff.net> (raw)
In-Reply-To: <AANLkTimKp+Z==QXJg2Bagot+Df4REeANuxwVi7bpPCXr@mail.gmail.com>

On Fri, Feb 18, 2011 at 03:27:36PM -0800, Linus Torvalds wrote:

> >  1. Did you bump up your merge.renamelimit? It's hard to see because it
> >     scrolls off the screen amidst the many conflicts, but the first
> >     message is:
> >
> >       warning: too many files (created: 425 deleted: 1093), skipping
> >       inexact rename detection
> >
> >     which you want to use. Try "git config merge.renamelimit
> >     10000". Which runs pretty snappily on my machine; I wonder if we
> >     should up the default limit.
> 
> Yeah, for the kernel, I have
> 
> 	[diff]
> 		renamelimit=0
> 
> to disable the limit entirely, because the default limit is very low
> indeed. Git is quite good at the rename detection.
> 
> However, the reason for the low default is not because it's not snappy
> enough - it's because it can end up using a lot of memory (and if
> you're low on memory, the swapping will mean that it goes from "quite
> snappy" to "slow as molasses" - but it still will not be CPU limited,
> it's just paging like crazy).

I think it can be both. There is an O(n^2) part to the algorithm. I did
some timings a few years ago that showed an n^2 increase in time as you
bumped the limit:

  http://article.gmane.org/gmane.comp.version-control.git/73519

That's staying within a reasonable memory size. I would not be surprised
if you can get much worse behavior by going into swap, but I didn't
measure peak memory use there.

Those tests led to:

  commit 50705915eae89eae490dff30fa370ed02e4d6e72
  Author: Jeff King <peff@peff.net>
  Date:   Wed Apr 30 13:24:43 2008 -0400

    bump rename limit defaults

    The current rename limit default of 100 was arbitrarily
    chosen. Testing[1] has shown that on modern hardware, a
    limit of 200 adds about a second of computation time, and a
    limit of 500 adds about 5 seconds of computation time.

    This patch bumps the default limit to 200 for viewing diffs,
    and to 500 for performing a merge. The limit for generating
    git-status templates is set independently; we bump it up to
    200 here, as well, to match the diff limit.

But perhaps it's time to revisit the test; it's been 2 years, and my
hardware at the time was probably 2 years out of date. :)

Here are the old and new times for various sizes of rename. Details
about the test are in the message referenced above.

   N   Old CPU Seconds   New CPU Seconds
  10              0.43              0.02
 100              0.44              0.20
 200              1.40              0.55
 400              4.87              1.90
 800             18.08              7.01
1000             27.82             10.83

So maybe bump the diff limit to 400 and the merge limit to 1000,
doubling both? That leaves us at around 2 seconds per-commit for a log,
and 10 seconds tacked onto a merge. We could maybe even go higher with
the merge limit. If it's such a big merge, the conflict resolution is
probably going to take forever anyway, so 30 extra seconds if it makes
rename detection work is probably a good thing.

According to top, git only hit around 17M resident on the 1000-sized
one, so I don't think memory is a problem, at least for average repos
(and yes, I know top is an awful way to measure, but it's quick and it
would need to be orders of magnitude off for it to be a problem).

So I'm in favor of bumping the limits, or possibly even removing the
hard number limit and putting in a "try to do renames for this many
seconds" option. If we're going to have something like 30 second delays
on merge, though, we should perhaps write some eye candy to stderr after
2 seconds or so (like we do with "git checkout").

> So I do think we could try to lift the default a bit, but it might be
> even more important to just make the message much more noticeable and
> avoid scrolling past it. For example, setting a flag, and not printing
> out the message immediately, but instead print it out only if it turns
> into trouble at the end.

Yeah, I also think that would be useful. And if that information filters
up to the merge command, it can even give better advice (like how to
tweak the limit).

-Peff

  parent reply	other threads:[~2011-02-19  9:08 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-02-18 18:58 Merging limitations after directory renames -- interesting test repo Martin Langhoff
2011-02-18 22:21 ` Jeff King
2011-02-18 23:27   ` Linus Torvalds
2011-02-19  0:26     ` Linus Torvalds
2011-02-19  0:52       ` Junio C Hamano
2011-02-19  1:03         ` Linus Torvalds
2011-02-19  1:20           ` Linus Torvalds
2011-02-19  0:55       ` Linus Torvalds
2011-02-19  0:44     ` Junio C Hamano
2011-02-19  9:08     ` Jeff King [this message]
2011-02-19 10:19     ` Jeff King
2011-02-19 10:20       ` [PATCH 1/4] merge: improve inexact rename limit warning Jeff King
2011-02-21 23:33         ` Junio C Hamano
2011-02-22 15:39           ` Jeff King
2011-02-22 18:46             ` Junio C Hamano
2011-02-23  8:02               ` Jeff King
2011-02-19 10:21       ` [PATCH 2/4] bump rename limit defaults (again) Jeff King
2011-02-19 17:54         ` Piotr Krukowiecki
2011-02-20 10:10           ` Jeff King
2011-02-19 20:12         ` Ævar Arnfjörð Bjarmason
2011-02-20 10:12           ` Jeff King
2011-02-19 10:21       ` [PATCH 3/4] commit: stop setting rename limit Jeff King
2011-02-19 10:25       ` [RFC/PATCH 4/4] inexact rename detection eye candy Jeff King
2011-02-19 15:57         ` Linus Torvalds
2011-02-20  9:48           ` Jeff King
2011-02-20  9:51             ` [PATCH 1/3] add inexact rename detection progress infrastructure Jeff King
2011-02-20  9:53             ` [PATCH 2/3] merge: enable progress reporting for rename detection Jeff King
2011-02-20  9:56             ` [PATCH 3/3] pull: propagate --progress to merge Jeff King
2011-02-20 10:37             ` [RFC/PATCH 4/4] inexact rename detection eye candy Jeff King
2011-02-19 16:29         ` Martin Langhoff
2011-02-20 10:04           ` Jeff King
2011-02-20 13:16             ` Martin Langhoff
2011-02-19 15:22       ` Merging limitations after directory renames -- interesting test repo Martin Langhoff
2011-02-19 15:31         ` Martin Langhoff

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20110219090812.GA20577@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=martin.langhoff@gmail.com \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).