From: Elijah Newren <newren@gmail.com>
To: git@vger.kernel.org
Cc: Elijah Newren <newren@gmail.com>
Subject: [RFC PATCH 1/9] diffcore-rename: No point trying to find a match better than exact
Date: Fri, 10 Nov 2017 14:21:48 -0800
Message-ID: <20171110222156.23221-2-newren@gmail.com>
In-Reply-To: <20171110222156.23221-1-newren@gmail.com>

diffcore_rename() already avoided re-checking destination paths for
which an exact rename had been detected.  Source paths, however, were
still re-checked, because we wanted to allow the possibility of
detecting copies.  But if copy detection isn't turned on, this merely
amounts to attempting to find a better-than-exact match, which is an
expensive no-op.  In particular, copy detection is never turned on by
the merge-recursive machinery.

In a large repository (~50k files, about 60% of which were Java) that had
a number of high-level directories involved in renames, this cut the time
necessary for a cherry-pick by about 50% (from around 9 minutes to
4.5 minutes).

Signed-off-by: Elijah Newren <newren@gmail.com>
---
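
As a rough illustration of the reasoning in the commit message (not part
of the patch itself), here is a minimal standalone sketch of how skipping
already-used sources shrinks the inexact-detection matrix.  The names
detect_mode, src_entry and count_comparisons are hypothetical stand-ins
invented for this example; only the rename_used flag and the
"skip unless detecting copies" condition mirror the change below.

```c
#include <stddef.h>

/* Hypothetical stand-ins for git's internal types; not the real definitions. */
enum detect_mode { DETECT_RENAME = 1, DETECT_COPY = 2 };

struct src_entry {
	const char *path;
	int rename_used;	/* nonzero once an exact rename has claimed this source */
};

/*
 * Count the (destination, source) pairs an inexact pass would compare.
 * Sources already consumed by an exact rename are skipped unless copy
 * detection is on, since only a copy can reuse such a source.
 */
static unsigned long count_comparisons(const struct src_entry *src,
				       size_t nr_src, size_t num_create,
				       enum detect_mode mode)
{
	unsigned long pairs = 0;
	size_t i, j;

	for (i = 0; i < num_create; i++) {
		for (j = 0; j < nr_src; j++) {
			if (src[j].rename_used && mode != DETECT_COPY)
				continue;
			pairs++;
		}
	}
	return pairs;
}
```

With four sources, three of them already matched exactly, and two
destinations still unmatched, this counts 2 pairs in rename-only mode
versus 8 with copy detection, which is the kind of reduction the num_src
computation below is meant to capture.
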
 diffcore-rename.c | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/diffcore-rename.c b/diffcore-rename.c
index 6ba6157c61..c0517058b0 100644
--- a/diffcore-rename.c
+++ b/diffcore-rename.c
@@ -377,11 +377,10 @@ static void record_if_better(struct diff_score m[], struct diff_score *o)
  * 1 if we need to disable inexact rename detection;
  * 2 if we would be under the limit if we were given -C instead of -C -C.
  */
-static int too_many_rename_candidates(int num_create,
+static int too_many_rename_candidates(int num_create, int num_src,
 				      struct diff_options *options)
 {
 	int rename_limit = options->rename_limit;
-	int num_src = rename_src_nr;
 	int i;
 
 	options->needed_rename_limit = 0;
@@ -446,7 +445,7 @@ void diffcore_rename(struct diff_options *options)
 	struct diff_queue_struct outq;
 	struct diff_score *mx;
 	int i, j, rename_count, skip_unmodified = 0;
-	int num_create, dst_cnt;
+	int num_create, dst_cnt, num_src;
 	struct progress *progress = NULL;
 
 	if (!minimum_score)
@@ -512,12 +511,14 @@ void diffcore_rename(struct diff_options *options)
 	 * files still remain as options for rename/copies!)
 	 */
 	num_create = (rename_dst_nr - rename_count);
+	num_src = (detect_rename == DIFF_DETECT_COPY ?
+		   rename_src_nr : rename_src_nr - rename_count);
 
 	/* All done? */
 	if (!num_create)
 		goto cleanup;
 
-	switch (too_many_rename_candidates(num_create, options)) {
+	switch (too_many_rename_candidates(num_create, num_src, options)) {
 	case 1:
 		goto cleanup;
 	case 2:
@@ -531,7 +532,7 @@ void diffcore_rename(struct diff_options *options)
 	if (options->show_rename_progress) {
 		progress = start_delayed_progress(
 				_("Performing inexact rename detection"),
-				(uint64_t)rename_dst_nr * (uint64_t)rename_src_nr);
+				(uint64_t)num_create * (uint64_t)num_src);
 	}
 
 	mx = xcalloc(st_mult(NUM_CANDIDATE_PER_DST, num_create), sizeof(*mx));
@@ -550,6 +551,10 @@ void diffcore_rename(struct diff_options *options)
 			struct diff_filespec *one = rename_src[j].p->one;
 			struct diff_score this_src;
 
+			if (one->rename_used &&
+			    detect_rename != DIFF_DETECT_COPY)
+				continue;
+
 			if (skip_unmodified &&
 			    diff_unmodified_pair(rename_src[j].p))
 				continue;
@@ -568,7 +573,7 @@ void diffcore_rename(struct diff_options *options)
 			diff_free_filespec_blob(two);
 		}
 		dst_cnt++;
-		display_progress(progress, (uint64_t)(i+1)*(uint64_t)rename_src_nr);
+		display_progress(progress, (uint64_t)dst_cnt*(uint64_t)num_src);
 	}
 	stop_progress(&progress);
 
-- 
2.15.0.46.g41dca04efb


