All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: Elijah Newren <newren@gmail.com>, Taylor Blau <me@ttaylorr.com>,
	Christian Couder <christian.couder@gmail.com>,
	Elijah Newren <newren@gmail.com>,
	Elijah Newren <newren@gmail.com>
Subject: [PATCH v3 9/9] diffcore-rename: remove unnecessary duplicate entry checks
Date: Tue, 29 Dec 2020 20:05:28 +0000	[thread overview]
Message-ID: <723b0e55e75d264f83407c3fb50f3ccc16d35062.1609272328.git.gitgitgadget@gmail.com> (raw)
In-Reply-To: <pull.929.v3.git.git.1609272328.gitgitgadget@gmail.com>

From: Elijah Newren <newren@gmail.com>

Commit 25d5ea410f ("[PATCH] Redo rename/copy detection logic.",
2005-05-24) added a duplicate entry check on rename_src in order to
avoid segfaults; the code at the time was prone to double free()s and an
easy way to avoid it was just to turn off rename detection for any
duplicate entries.  Note that the form of the check was modified two
commits ago in this series.

Similarly, commit 4d6be03b95 ("diffcore-rename: avoid processing
duplicate destinations", 2015-02-26) added a duplicate entry check
on rename_dst for the exact same reason -- the code was prone to double
free()s, and an easy way to avoid it was just to turn off rename
detection entirely.  Note that the form of the check was modified in the
commit just before this one.

In the original code in both places, the code was dealing with
individual diff_filespecs and trying to match things up, instead of just
keeping the original diff_filepairs around as we do now.  The
intervening change in structure has fixed the accounting problems and
the associated double free()s that used to occur, and thus we already
have a better fix.  As such, we can remove the band-aid checks for
duplicate entries.

Due to the last two patches, the diffcore_rename() setup is no longer a
sizeable chunk of overall runtime.  Thus, in a large rebase of many
commits with lots of renames and several optimizations to inexact rename
detection, this patch only speeds up the overall code by about half a
percent or so and is pretty close to the run-to-run variability making
it hard to get an exact measurement.  However, with some trace2 regions
around the setup code in diffcore_rename() so that I can focus on just
it, I measure that this patch consistently saves almost a third of the
remaining time spent in diffcore_rename() setup.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 diffcore-rename.c          | 23 -----------------------
 t/t4058-diff-duplicates.sh |  2 +-
 2 files changed, 1 insertion(+), 24 deletions(-)

diff --git a/diffcore-rename.c b/diffcore-rename.c
index 2a8fcd52928..90db9ebd6d0 100644
--- a/diffcore-rename.c
+++ b/diffcore-rename.c
@@ -34,18 +34,6 @@ static struct diff_rename_dst *locate_rename_dst(struct diff_filepair *p)
  */
 static int add_rename_dst(struct diff_filepair *p)
 {
-	/*
-	 * If we have multiple entries at the same path in the destination
-	 * tree (an invalid tree, to be sure), turn off rename detection
-	 * entirely.  Once upon a time, diffcore-rename had double free()
-	 * issues due to such duplicate paths, resulting in segfaults.  See
-	 * commit 4d6be03b95 ("diffcore-rename: avoid processing duplicate
-	 * destinations", 2015-02-26).
-	 */
-	if (rename_dst_nr > 0 &&
-	    !strcmp(rename_dst[rename_dst_nr-1].p->two->path, p->two->path))
-		return -1;
-
 	ALLOC_GROW(rename_dst, rename_dst_nr + 1, rename_dst_alloc);
 	rename_dst[rename_dst_nr].p = p;
 	rename_dst[rename_dst_nr].filespec_to_free = NULL;
@@ -63,17 +51,6 @@ static int rename_src_nr, rename_src_alloc;
 
 static void register_rename_src(struct diff_filepair *p)
 {
-	/*
-	 * If we have multiple entries at the same path in the source tree
-	 * (an invalid tree, to be sure), avoid using more more than one
-	 * such entry in rename detection.  Once upon a time, doing so
-	 * caused segfaults; see commit 25d5ea410f ("[PATCH] Redo
-	 * rename/copy detection logic.", 2005-05-24).
-	 */
-	if (rename_src_nr > 0 &&
-	    !strcmp(rename_src[rename_src_nr-1].p->one->path, p->one->path))
-		return;
-
 	if (p->broken_pair) {
 		if (!break_idx) {
 			break_idx = xmalloc(sizeof(*break_idx));
diff --git a/t/t4058-diff-duplicates.sh b/t/t4058-diff-duplicates.sh
index ad3f37d388d..54614b814db 100755
--- a/t/t4058-diff-duplicates.sh
+++ b/t/t4058-diff-duplicates.sh
@@ -91,7 +91,7 @@ test_expect_success 'diff-tree between duplicate trees' '
 test_expect_success 'diff-tree with renames' '
 	# See NOTICE at top of file.
 	git diff-tree -M -r --no-abbrev one two >actual &&
-	test_cmp expect actual
+	test_must_be_empty actual
 '
 
 test_expect_success 'diff-tree FROM duplicate tree' '
-- 
gitgitgadget

      parent reply	other threads:[~2020-12-29 20:06 UTC|newest]

Thread overview: 55+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-12-06  2:54 [PATCH 0/7] diffcore-rename improvements Elijah Newren via GitGitGadget
2020-12-06  2:54 ` [PATCH 1/7] diffcore-rename: avoid usage of global in too_many_rename_candidates() Elijah Newren via GitGitGadget
2020-12-09 22:06   ` Taylor Blau
2020-12-06  2:54 ` [PATCH 2/7] diffcore-rename: remove unnecessary if-clause Elijah Newren via GitGitGadget
2020-12-09 22:10   ` Taylor Blau
2020-12-10  0:32     ` Elijah Newren
2020-12-10  2:03     ` Junio C Hamano
2020-12-10  2:17       ` Elijah Newren
2020-12-10  6:56         ` Junio C Hamano
2020-12-06  2:54 ` [PATCH 3/7] diffcore-rename: rename num_create to num_targets Elijah Newren via GitGitGadget
2020-12-10  2:20   ` Junio C Hamano
2020-12-10  2:25     ` Elijah Newren
2020-12-06  2:54 ` [PATCH 4/7] diffcore-rename: change a few comments to use 'add' instead of 'create' Elijah Newren via GitGitGadget
2020-12-10  2:29   ` Junio C Hamano
2020-12-06  2:54 ` [PATCH 5/7] diffcore-rename: reduce jumpiness in progress counters Elijah Newren via GitGitGadget
2020-12-09 22:24   ` Taylor Blau
2020-12-10  2:36   ` Junio C Hamano
2020-12-06  2:54 ` [PATCH 6/7] diffcore-rename: simplify and accelerate register_rename_src() Elijah Newren via GitGitGadget
2020-12-09 22:40   ` Taylor Blau
2020-12-10  0:25     ` Elijah Newren
2020-12-10  0:41       ` Taylor Blau
2020-12-10  2:51   ` Junio C Hamano
2020-12-06  2:54 ` [PATCH 7/7] Accelerate rename_dst setup Elijah Newren via GitGitGadget
2020-12-09 23:01   ` Taylor Blau
2020-12-10  0:57     ` Elijah Newren
2020-12-10  1:43       ` Junio C Hamano
2020-12-06  3:01 ` [PATCH 0/7] diffcore-rename improvements Elijah Newren
2020-12-11  9:08 ` [PATCH v2 0/9] " Elijah Newren via GitGitGadget
2020-12-11  9:08   ` [PATCH v2 1/9] diffcore-rename: rename num_create to num_destinations Elijah Newren via GitGitGadget
2020-12-11  9:08   ` [PATCH v2 2/9] diffcore-rename: avoid usage of global in too_many_rename_candidates() Elijah Newren via GitGitGadget
2020-12-11  9:08   ` [PATCH v2 3/9] diffcore-rename: simplify limit check Elijah Newren via GitGitGadget
2020-12-11  9:08   ` [PATCH v2 4/9] diffcore-rename: reduce jumpiness in progress counters Elijah Newren via GitGitGadget
2020-12-11  9:08   ` [PATCH v2 5/9] t4058: add more tests and documentation for duplicate tree entry handling Elijah Newren via GitGitGadget
2020-12-11  9:08   ` [PATCH v2 6/9] t4058: explore duplicate tree entry handling in a bit more detail Elijah Newren via GitGitGadget
2021-04-21 12:29     ` Ævar Arnfjörð Bjarmason
2021-04-21 17:38       ` Elijah Newren
2020-12-11  9:08   ` [PATCH v2 7/9] diffcore-rename: simplify and accelerate register_rename_src() Elijah Newren via GitGitGadget
2020-12-11  9:08   ` [PATCH v2 8/9] diffcore-rename: accelerate rename_dst setup Elijah Newren via GitGitGadget
2020-12-11  9:08   ` [PATCH v2 9/9] diffcore-rename: remove unneccessary duplicate entry checks Elijah Newren via GitGitGadget
2020-12-29  8:31     ` Christian Couder
2020-12-29 18:09       ` Elijah Newren
2020-12-29 20:05   ` [PATCH v3 0/9] diffcore-rename improvements Elijah Newren via GitGitGadget
2020-12-29 20:05     ` [PATCH v3 1/9] diffcore-rename: rename num_create to num_destinations Elijah Newren via GitGitGadget
2020-12-29 20:05     ` [PATCH v3 2/9] diffcore-rename: avoid usage of global in too_many_rename_candidates() Elijah Newren via GitGitGadget
2020-12-29 20:05     ` [PATCH v3 3/9] diffcore-rename: simplify limit check Elijah Newren via GitGitGadget
2021-11-09 21:14       ` Başar Uğur
2021-11-10 20:06         ` Elijah Newren
2021-11-11  9:02           ` Başar Uğur
2021-11-11 16:19             ` Elijah Newren
2020-12-29 20:05     ` [PATCH v3 4/9] diffcore-rename: reduce jumpiness in progress counters Elijah Newren via GitGitGadget
2020-12-29 20:05     ` [PATCH v3 5/9] t4058: add more tests and documentation for duplicate tree entry handling Elijah Newren via GitGitGadget
2020-12-29 20:05     ` [PATCH v3 6/9] t4058: explore duplicate tree entry handling in a bit more detail Elijah Newren via GitGitGadget
2020-12-29 20:05     ` [PATCH v3 7/9] diffcore-rename: simplify and accelerate register_rename_src() Elijah Newren via GitGitGadget
2020-12-29 20:05     ` [PATCH v3 8/9] diffcore-rename: accelerate rename_dst setup Elijah Newren via GitGitGadget
2020-12-29 20:05     ` Elijah Newren via GitGitGadget [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=723b0e55e75d264f83407c3fb50f3ccc16d35062.1609272328.git.gitgitgadget@gmail.com \
    --to=gitgitgadget@gmail.com \
    --cc=christian.couder@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=me@ttaylorr.com \
    --cc=newren@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.