All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ben Peart <Ben.Peart@microsoft.com>
To: "git@vger.kernel.org" <git@vger.kernel.org>
Cc: Ben Peart <Ben.Peart@microsoft.com>,
	"newren@gmail.com" <newren@gmail.com>,
	"gitster@pobox.com" <gitster@pobox.com>,
	"pclouds@gmail.com" <pclouds@gmail.com>,
	"vmiklos@frugalware.org" <vmiklos@frugalware.org>,
	Alejandro Pauly <alpauly@microsoft.com>,
	"Johannes.Schindelin@gmx.de" <Johannes.Schindelin@gmx.de>,
	"eckhard.s.maass@googlemail.com" <eckhard.s.maass@googlemail.com>,
	Ben Peart <Ben.Peart@microsoft.com>
Subject: [PATCH v2] add status config and command line options for rename detection
Date: Thu, 10 May 2018 14:16:37 +0000	[thread overview]
Message-ID: <20180510141621.9668-1-benpeart@microsoft.com> (raw)
In-Reply-To: <20180509144213.18032-1-benpeart@microsoft.com>

After performing a merge that has conflicts, git status will by default attempt
to detect renames which causes many objects to be examined.  In a virtualized
repo, those objects do not exist locally so the rename logic triggers them to be
fetched from the server. This results in the status call taking hours to
complete on very large repos.  Even in a small repo (the GVFS repo) turning off
break and rename detection has a significant impact:

git status --no-renames:
31 secs., 105 loose object downloads

git status --no-breaks
7 secs., 17 loose object downloads

git status --no-breaks --no-renames
1 sec., 1 loose object download

Add a new config status.renames setting to enable turning off rename detection
during status.  This setting will default to the value of diff.renames.

Add a new config status.renamelimit setting to to enable bounding the time spent
finding out inexact renames during status.  This setting will default to the
value of diff.renamelimit.

Add status --no-renames command line option that enables overriding the config
setting from the command line. Add --find-renames[=<n>] to enable detecting
renames and optionally setting the similarity index from the command line.

Note: I removed the --no-breaks command line option from the original patch as
it will no longer be needed once the default has been changed [1] to turn it off.

[1] https://public-inbox.org/git/20180430093421.27551-2-eckhard.s.maass@gmail.com/

Original-Patch-by: Alejandro Pauly <alpauly@microsoft.com>
Signed-off-by: Ben Peart <Ben.Peart@microsoft.com>
---

Notes:
    Base Ref: master
    Web-Diff: https://github.com/benpeart/git/commit/823212725b
    Checkout: git fetch https://github.com/benpeart/git status-renames-v2 && git checkout 823212725b
    
    ### Interdiff (v1..v2):
    
    diff --git a/Documentation/config.txt b/Documentation/config.txt
    index b79b83c587..9c8eca05b1 100644
    --- a/Documentation/config.txt
    +++ b/Documentation/config.txt
    @@ -3126,7 +3126,8 @@ status.renameLimit::
     status.renames::
     	Whether and how Git detects renames.  If set to "false",
     	rename detection is disabled. If set to "true", basic rename
    -	detection is enabled.  Defaults to the value of diff.renames.
    +	detection is enabled.  If set to "copies" or "copy", Git will
    +	detect copies, as well.  Defaults to the value of diff.renames.
    
     status.showStash::
     	If set to true, linkgit:git-status[1] will display the number of
    diff --git a/Documentation/git-status.txt b/Documentation/git-status.txt
    index c16e27e63d..c4467ffb98 100644
    --- a/Documentation/git-status.txt
    +++ b/Documentation/git-status.txt
    @@ -135,6 +135,16 @@ ignored, then the directory is not shown, but all contents are shown.
     	Display or do not display detailed ahead/behind counts for the
     	branch relative to its upstream branch.  Defaults to true.
    
    +--renames::
    +--no-renames::
    +	Turn on/off rename detection regardless of user configuration.
    +	See also linkgit:git-diff[1] `--no-renames`.
    +
    +--find-renames[=<n>]::
    +	Turn on rename detection, optionally setting the similarity
    +	threshold.
    +	See also linkgit:git-diff[1] `--find-renames`.
    +
     <pathspec>...::
     	See the 'pathspec' entry in linkgit:gitglossary[7].
    
    diff --git a/builtin/commit.c b/builtin/commit.c
    index a545096ddd..db886277f4 100644
    --- a/builtin/commit.c
    +++ b/builtin/commit.c
    @@ -109,10 +109,6 @@ static int have_option_m;
     static struct strbuf message = STRBUF_INIT;
    
     static enum wt_status_format status_format = STATUS_FORMAT_UNSPECIFIED;
    -static int diff_detect_rename = -1;
    -static int status_detect_rename = -1;
    -static int diff_rename_limit = -1;
    -static int status_rename_limit = -1;
    
     static int opt_parse_porcelain(const struct option *opt, const char *arg, int unset)
     {
    @@ -1274,19 +1270,21 @@ static int git_status_config(const char *k, const char *v, void *cb)
     		return 0;
     	}
     	if (!strcmp(k, "diff.renamelimit")) {
    -		diff_rename_limit = git_config_int(k, v);
    +		if (s->rename_limit == -1)
    +			s->rename_limit = git_config_int(k, v);
     		return 0;
     	}
     	if (!strcmp(k, "status.renamelimit")) {
    -		status_rename_limit = git_config_int(k, v);
    +		s->rename_limit = git_config_int(k, v);
     		return 0;
     	}
     	if (!strcmp(k, "diff.renames")) {
    -		diff_detect_rename = git_config_rename(k, v);
    +		if (s->detect_rename == -1)
    +			s->detect_rename = git_config_rename(k, v);
     		return 0;
     	}
     	if (!strcmp(k, "status.renames")) {
    -		status_detect_rename = git_config_rename(k, v);
    +		s->detect_rename = git_config_rename(k, v);
     		return 0;
     	}
     	return git_diff_ui_config(k, v, NULL);
    @@ -1372,27 +1370,13 @@ int cmd_status(int argc, const char **argv, const char *prefix)
     	s.ignore_submodule_arg = ignore_submodule_arg;
     	s.status_format = status_format;
     	s.verbose = verbose;
    -	s.detect_rename = no_renames >= 0 ? !no_renames :
    -					  status_detect_rename >= 0 ? status_detect_rename :
    -					  diff_detect_rename >= 0 ? diff_detect_rename :
    -					  s.detect_rename;
    +	if (no_renames != -1)
    +		s.detect_rename = !no_renames;
     	if ((intptr_t)rename_score_arg != -1) {
     		s.detect_rename = DIFF_DETECT_RENAME;
     		if (rename_score_arg)
     			s.rename_score = parse_rename_score(&rename_score_arg);
     	}
    -	s.rename_limit = status_rename_limit >= 0 ? status_rename_limit :
    -					 diff_rename_limit >= 0 ? diff_rename_limit :
    -					 s.rename_limit;
    -
    -	/*
    -	 * We do not have logic to handle the detection of copies.  In
    -	 * fact, it may not even make sense to add such logic: would we
    -	 * really want a change to a base file to be propagated through
    -	 * multiple other files by a merge?
    -	 */
    -	if (s.detect_rename > DIFF_DETECT_RENAME)
    -		s.detect_rename = DIFF_DETECT_RENAME;
    
     	wt_status_collect(&s);
    
    ### Patches

 Documentation/config.txt     | 10 ++++
 Documentation/git-status.txt | 10 ++++
 builtin/commit.c             | 41 ++++++++++++++++
 diff.c                       |  2 +-
 diff.h                       |  1 +
 t/t7525-status-rename.sh     | 90 ++++++++++++++++++++++++++++++++++++
 wt-status.c                  | 12 +++++
 wt-status.h                  |  4 +-
 8 files changed, 168 insertions(+), 2 deletions(-)
 create mode 100644 t/t7525-status-rename.sh

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 2659153cb3..9c8eca05b1 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -3119,6 +3119,16 @@ status.displayCommentPrefix::
 	behavior of linkgit:git-status[1] in Git 1.8.4 and previous.
 	Defaults to false.
 
+status.renameLimit::
+	The number of files to consider when performing rename detection;
+	if not specified, defaults to the value of diff.renameLimit.
+
+status.renames::
+	Whether and how Git detects renames.  If set to "false",
+	rename detection is disabled. If set to "true", basic rename
+	detection is enabled.  If set to "copies" or "copy", Git will
+	detect copies, as well.  Defaults to the value of diff.renames.
+
 status.showStash::
 	If set to true, linkgit:git-status[1] will display the number of
 	entries currently stashed away.
diff --git a/Documentation/git-status.txt b/Documentation/git-status.txt
index c16e27e63d..c4467ffb98 100644
--- a/Documentation/git-status.txt
+++ b/Documentation/git-status.txt
@@ -135,6 +135,16 @@ ignored, then the directory is not shown, but all contents are shown.
 	Display or do not display detailed ahead/behind counts for the
 	branch relative to its upstream branch.  Defaults to true.
 
+--renames::
+--no-renames::
+	Turn on/off rename detection regardless of user configuration.
+	See also linkgit:git-diff[1] `--no-renames`.
+
+--find-renames[=<n>]::
+	Turn on rename detection, optionally setting the similarity
+	threshold.
+	See also linkgit:git-diff[1] `--find-renames`.
+
 <pathspec>...::
 	See the 'pathspec' entry in linkgit:gitglossary[7].
 
diff --git a/builtin/commit.c b/builtin/commit.c
index 5240f11225..db886277f4 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -143,6 +143,16 @@ static int opt_parse_m(const struct option *opt, const char *arg, int unset)
 	return 0;
 }
 
+static int opt_parse_rename_score(const struct option *opt, const char *arg, int unset)
+{
+	const char **value = opt->value;
+	if (arg != NULL && *arg == '=')
+		arg = arg + 1;
+
+	*value = arg;
+	return 0;
+}
+
 static void determine_whence(struct wt_status *s)
 {
 	if (file_exists(git_path_merge_head()))
@@ -1259,11 +1269,31 @@ static int git_status_config(const char *k, const char *v, void *cb)
 			return error(_("Invalid untracked files mode '%s'"), v);
 		return 0;
 	}
+	if (!strcmp(k, "diff.renamelimit")) {
+		if (s->rename_limit == -1)
+			s->rename_limit = git_config_int(k, v);
+		return 0;
+	}
+	if (!strcmp(k, "status.renamelimit")) {
+		s->rename_limit = git_config_int(k, v);
+		return 0;
+	}
+	if (!strcmp(k, "diff.renames")) {
+		if (s->detect_rename == -1)
+			s->detect_rename = git_config_rename(k, v);
+		return 0;
+	}
+	if (!strcmp(k, "status.renames")) {
+		s->detect_rename = git_config_rename(k, v);
+		return 0;
+	}
 	return git_diff_ui_config(k, v, NULL);
 }
 
 int cmd_status(int argc, const char **argv, const char *prefix)
 {
+	static int no_renames = -1;
+	static const char *rename_score_arg = (const char *)-1;
 	static struct wt_status s;
 	int fd;
 	struct object_id oid;
@@ -1297,6 +1327,10 @@ int cmd_status(int argc, const char **argv, const char *prefix)
 		  N_("ignore changes to submodules, optional when: all, dirty, untracked. (Default: all)"),
 		  PARSE_OPT_OPTARG, NULL, (intptr_t)"all" },
 		OPT_COLUMN(0, "column", &s.colopts, N_("list untracked files in columns")),
+		OPT_BOOL(0, "no-renames", &no_renames, N_("do not detect renames")),
+		{ OPTION_CALLBACK, 'M', "find-renames", &rename_score_arg,
+		  N_("n"), N_("detect renames, optionally set similarity index"),
+		  PARSE_OPT_OPTARG, opt_parse_rename_score },
 		OPT_END(),
 	};
 
@@ -1336,6 +1370,13 @@ int cmd_status(int argc, const char **argv, const char *prefix)
 	s.ignore_submodule_arg = ignore_submodule_arg;
 	s.status_format = status_format;
 	s.verbose = verbose;
+	if (no_renames != -1)
+		s.detect_rename = !no_renames;
+	if ((intptr_t)rename_score_arg != -1) {
+		s.detect_rename = DIFF_DETECT_RENAME;
+		if (rename_score_arg)
+			s.rename_score = parse_rename_score(&rename_score_arg);
+	}
 
 	wt_status_collect(&s);
 
diff --git a/diff.c b/diff.c
index 1289df4b1f..5dfc24aa6d 100644
--- a/diff.c
+++ b/diff.c
@@ -177,7 +177,7 @@ static int parse_submodule_params(struct diff_options *options, const char *valu
 	return 0;
 }
 
-static int git_config_rename(const char *var, const char *value)
+int git_config_rename(const char *var, const char *value)
 {
 	if (!value)
 		return DIFF_DETECT_RENAME;
diff --git a/diff.h b/diff.h
index d29560f822..dedac472ca 100644
--- a/diff.h
+++ b/diff.h
@@ -324,6 +324,7 @@ extern int git_diff_ui_config(const char *var, const char *value, void *cb);
 extern void diff_setup(struct diff_options *);
 extern int diff_opt_parse(struct diff_options *, const char **, int, const char *);
 extern void diff_setup_done(struct diff_options *);
+extern int git_config_rename(const char *var, const char *value);
 
 #define DIFF_DETECT_RENAME	1
 #define DIFF_DETECT_COPY	2
diff --git a/t/t7525-status-rename.sh b/t/t7525-status-rename.sh
new file mode 100644
index 0000000000..311df8038a
--- /dev/null
+++ b/t/t7525-status-rename.sh
@@ -0,0 +1,90 @@
+#!/bin/sh
+
+test_description='git status rename detection options'
+
+. ./test-lib.sh
+
+test_expect_success 'setup' '
+	echo 1 >original &&
+	git add . &&
+	git commit -m"Adding original file." &&
+	mv original renamed &&
+	echo 2 >> renamed &&
+	git add .
+'
+
+cat >.gitignore <<\EOF
+.gitignore
+expect*
+actual*
+EOF
+
+test_expect_success 'status no-options' '
+	git status >actual &&
+	test_i18ngrep "renamed:" actual
+'
+
+test_expect_success 'status --no-renames' '
+	git status --no-renames >actual &&
+	test_i18ngrep "deleted:" actual &&
+	test_i18ngrep "new file:" actual
+'
+
+test_expect_success 'status.renames inherits from diff.renames false' '
+	git -c diff.renames=false status >actual &&
+	test_i18ngrep "deleted:" actual &&
+	test_i18ngrep "new file:" actual
+'
+
+test_expect_success 'status.renames inherits from diff.renames true' '
+	git -c diff.renames=true status >actual &&
+	test_i18ngrep "renamed:" actual
+'
+
+test_expect_success 'status.renames overrides diff.renames false' '
+	git -c diff.renames=true -c status.renames=false status >actual &&
+	test_i18ngrep "deleted:" actual &&
+	test_i18ngrep "new file:" actual
+'
+
+test_expect_success 'status.renames overrides from diff.renames true' '
+	git -c diff.renames=false -c status.renames=true status >actual &&
+	test_i18ngrep "renamed:" actual
+'
+
+test_expect_success 'status status.renames=false' '
+	git -c status.renames=false status >actual &&
+	test_i18ngrep "deleted:" actual &&
+	test_i18ngrep "new file:" actual
+'
+
+test_expect_success 'status status.renames=true' '
+	git -c status.renames=true status >actual &&
+	test_i18ngrep "renamed:" actual
+'
+
+test_expect_success 'status config overriden' '
+	git -c status.renames=true status --no-renames >actual &&
+	test_i18ngrep "deleted:" actual &&
+	test_i18ngrep "new file:" actual
+'
+
+test_expect_success 'status score=100%' '
+	git status -M=100% >actual &&
+	test_i18ngrep "deleted:" actual &&
+	test_i18ngrep "new file:" actual &&
+
+	git status --find-rename=100% >actual &&
+	test_i18ngrep "deleted:" actual &&
+	test_i18ngrep "new file:" actual
+'
+
+test_expect_success 'status score=01%' '
+	git status -M=01% >actual &&
+	test_i18ngrep "renamed:" actual &&
+
+	git status --find-rename=01% >actual &&
+	test_i18ngrep "renamed:" actual
+'
+
+test_done
diff --git a/wt-status.c b/wt-status.c
index 32f3bcaebd..172f07cbb0 100644
--- a/wt-status.c
+++ b/wt-status.c
@@ -138,6 +138,9 @@ void wt_status_prepare(struct wt_status *s)
 	s->show_stash = 0;
 	s->ahead_behind_flags = AHEAD_BEHIND_UNSPECIFIED;
 	s->display_comment_prefix = 0;
+	s->detect_rename = -1;
+	s->rename_score = -1;
+	s->rename_limit = -1;
 }
 
 static void wt_longstatus_print_unmerged_header(struct wt_status *s)
@@ -592,6 +595,9 @@ static void wt_status_collect_changes_worktree(struct wt_status *s)
 	}
 	rev.diffopt.format_callback = wt_status_collect_changed_cb;
 	rev.diffopt.format_callback_data = s;
+	rev.diffopt.detect_rename = s->detect_rename >= 0 ? s->detect_rename : rev.diffopt.detect_rename;
+	rev.diffopt.rename_limit = s->rename_limit >= 0 ? s->rename_limit : rev.diffopt.rename_limit;
+	rev.diffopt.rename_score = s->rename_score >= 0 ? s->rename_score : rev.diffopt.rename_score;
 	copy_pathspec(&rev.prune_data, &s->pathspec);
 	run_diff_files(&rev, 0);
 }
@@ -625,6 +631,9 @@ static void wt_status_collect_changes_index(struct wt_status *s)
 	rev.diffopt.output_format |= DIFF_FORMAT_CALLBACK;
 	rev.diffopt.format_callback = wt_status_collect_updated_cb;
 	rev.diffopt.format_callback_data = s;
+	rev.diffopt.detect_rename = s->detect_rename >= 0 ? s->detect_rename : rev.diffopt.detect_rename;
+	rev.diffopt.rename_limit = s->rename_limit >= 0 ? s->rename_limit : rev.diffopt.rename_limit;
+	rev.diffopt.rename_score = s->rename_score >= 0 ? s->rename_score : rev.diffopt.rename_score;
 	copy_pathspec(&rev.prune_data, &s->pathspec);
 	run_diff_index(&rev, 1);
 }
@@ -982,6 +991,9 @@ static void wt_longstatus_print_verbose(struct wt_status *s)
 	setup_revisions(0, NULL, &rev, &opt);
 
 	rev.diffopt.output_format |= DIFF_FORMAT_PATCH;
+	rev.diffopt.detect_rename = s->detect_rename >= 0 ? s->detect_rename : rev.diffopt.detect_rename;
+	rev.diffopt.rename_limit = s->rename_limit >= 0 ? s->rename_limit : rev.diffopt.rename_limit;
+	rev.diffopt.rename_score = s->rename_score >= 0 ? s->rename_score : rev.diffopt.rename_score;
 	rev.diffopt.file = s->fp;
 	rev.diffopt.close_file = 0;
 	/*
diff --git a/wt-status.h b/wt-status.h
index 430770b854..1673d146fa 100644
--- a/wt-status.h
+++ b/wt-status.h
@@ -89,7 +89,9 @@ struct wt_status {
 	int show_stash;
 	int hints;
 	enum ahead_behind_flags ahead_behind_flags;
-
+	int detect_rename;
+	int rename_score;
+	int rename_limit;
 	enum wt_status_format status_format;
 	unsigned char sha1_commit[GIT_MAX_RAWSZ]; /* when not Initial */
 

base-commit: a92ae92585d8db14b7871f760f157256cd96742c
-- 
2.17.0.windows.1


  parent reply	other threads:[~2018-05-10 14:16 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-05-09 14:42 [PATCH v1] add status config and command line options for rename detection Ben Peart
2018-05-09 15:59 ` Duy Nguyen
2018-05-09 17:04   ` Ben Peart
2018-05-09 16:56 ` Elijah Newren
2018-05-09 19:54   ` Ben Peart
2018-05-10 14:16 ` Ben Peart [this message]
2018-05-10 16:19   ` [PATCH v2] " Elijah Newren
2018-05-10 19:09     ` Ben Peart
2018-05-10 22:31       ` Elijah Newren
2018-05-11 12:50         ` Ben Peart
2018-05-11  1:57     ` Junio C Hamano
2018-05-11  6:39   ` Junio C Hamano
2018-05-11 12:56 ` [PATCH v3] " Ben Peart
2018-05-11 14:33   ` Elijah Newren
2018-05-12  8:04   ` Eckhard Maaß
2018-05-14 12:57     ` Ben Peart
2018-05-11 15:38 ` [PATCH v4] " Ben Peart

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180510141621.9668-1-benpeart@microsoft.com \
    --to=ben.peart@microsoft.com \
    --cc=Johannes.Schindelin@gmx.de \
    --cc=alpauly@microsoft.com \
    --cc=eckhard.s.maass@googlemail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=newren@gmail.com \
    --cc=pclouds@gmail.com \
    --cc=vmiklos@frugalware.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.