git.vger.kernel.org archive mirror
* [PATCH 0/5] Maintenance: adapt custom refspecs
@ 2021-04-05 13:04 Derrick Stolee via GitGitGadget
  2021-04-05 13:04 ` [PATCH 1/5] maintenance: simplify prefetch logic Derrick Stolee via GitGitGadget
                   ` (5 more replies)
  0 siblings, 6 replies; 72+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2021-04-05 13:04 UTC
  To: git; +Cc: tom.saeger, gitster, sunshine, Derrick Stolee

Tom Saeger rightly pointed out [1] that the prefetch task ignores custom
refspecs. This can lead to downloading more data than requested, and it
doesn't even help the future foreground fetches that use that custom
refspec.

[1]
https://lore.kernel.org/git/20210401184914.qmr7jhjbhp2mt3h6@dhcp-10-154-148-175.vpn.oracle.com/

This series fixes this problem by carefully replacing the start of each
refspec's destination with "refs/prefetch/". If the destination already
starts with "refs/", then that is replaced. Otherwise "refs/prefetch/" is
just prepended.

In order to accomplish this safely, a new refspec_item_format() method is
created and tested.

Patch 1 is just a preparation patch that makes the code simpler (and in
hindsight it should have been written this way from the start).

Patch 2 is a simplification of test_subcommand that removes the need for
escaping glob characters. Thanks, Eric Sunshine, for the tip of why my tests
were failing on FreeBSD.

Patches 3-4 add refspec_item_format().

Patch 5 finally modifies the logic in the prefetch task to translate these
refspecs.

Thanks, -Stolee

Derrick Stolee (5):
  maintenance: simplify prefetch logic
  test-lib: use exact match for test_subcommand
  refspec: output a refspec item
  test-tool: test refspec input/output
  maintenance: allow custom refspecs during prefetch

 Documentation/git-maintenance.txt |  3 +-
 Makefile                          |  1 +
 builtin/gc.c                      | 63 +++++++++++++++++++------------
 refspec.c                         | 25 ++++++++++++
 refspec.h                         |  5 +++
 t/helper/test-refspec.c           | 39 +++++++++++++++++++
 t/helper/test-tool.c              |  1 +
 t/helper/test-tool.h              |  1 +
 t/t5511-refspec.sh                | 41 ++++++++++++++++++++
 t/t7900-maintenance.sh            | 43 ++++++++++++++++++---
 t/test-lib-functions.sh           |  4 +-
 11 files changed, 192 insertions(+), 34 deletions(-)
 create mode 100644 t/helper/test-refspec.c


base-commit: 2e36527f23b7f6ae15e6f21ac3b08bf3fed6ee48
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-924%2Fderrickstolee%2Fmaintenance%2Frefspec-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-924/derrickstolee/maintenance/refspec-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/924
-- 
gitgitgadget


* [PATCH 1/5] maintenance: simplify prefetch logic
  2021-04-05 13:04 [PATCH 0/5] Maintenance: adapt custom refspecs Derrick Stolee via GitGitGadget
@ 2021-04-05 13:04 ` Derrick Stolee via GitGitGadget
  2021-04-05 17:01   ` Tom Saeger
  2021-04-05 13:04 ` [PATCH 2/5] test-lib: use exact match for test_subcommand Derrick Stolee via GitGitGadget
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 72+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2021-04-05 13:04 UTC
  To: git; +Cc: tom.saeger, gitster, sunshine, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

The previous logic filled a string list with the names of each remote,
but instead we could simply run the appropriate 'git fetch' command
directly in the remote iterator. Do this for reduced code size, but also
because it sets up an upcoming change to use the remote's refspec. The
refspec is available from the 'struct remote' that is now passed to
fetch_remote().
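
As a rough illustration of the callback shape described above (not the
code in this patch): the iterator hands each remote to a function along
with an opaque 'cbdata' pointer, and the callback converts that pointer
back to its options struct. The types and the each_remote() iterator
below are simplified stand-ins invented for this sketch; Git's real
for_each_remote() and 'struct remote' carry far more than is shown.

```c
#include <stddef.h>
#include <stdio.h>

/* Simplified stand-ins for Git's types; only the fields used here. */
struct remote {
	const char *name;
};

struct run_opts {
	int quiet;
	int fetched;	/* counter added for this sketch only */
};

typedef int (*each_remote_fn)(struct remote *remote, void *cbdata);

/*
 * Hypothetical iterator: calls 'fn' for each remote and stops at the
 * first nonzero return value, which it passes back to the caller.
 */
int each_remote(struct remote *list, size_t nr, each_remote_fn fn,
		void *cbdata)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		int ret = fn(&list[i], cbdata);
		if (ret)
			return ret;
	}
	return 0;
}

/* Callback: recovers the options from 'cbdata', then "fetches". */
int fetch_one(struct remote *remote, void *cbdata)
{
	struct run_opts *opts = cbdata;

	if (!opts->quiet)
		printf("fetch %s\n", remote->name);
	opts->fetched++;
	return 0;
}
```

One design point the sketch makes visible: an iterator that stops at the
first nonzero return skips the remaining remotes after a failure, whereas
the earlier loop OR-ed the results together. The early stop is an
assumption of this sketch; whether for_each_remote() behaves the same way
should be checked against remote.c.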

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 builtin/gc.c | 33 ++++++++-------------------------
 1 file changed, 8 insertions(+), 25 deletions(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index ef7226d7bca4..fa8128de9ae1 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -873,55 +873,38 @@ static int maintenance_task_commit_graph(struct maintenance_run_opts *opts)
 	return 0;
 }
 
-static int fetch_remote(const char *remote, struct maintenance_run_opts *opts)
+static int fetch_remote(struct remote *remote, void *cbdata)
 {
+	struct maintenance_run_opts *opts = cbdata;
 	struct child_process child = CHILD_PROCESS_INIT;
 
 	child.git_cmd = 1;
-	strvec_pushl(&child.args, "fetch", remote, "--prune", "--no-tags",
+	strvec_pushl(&child.args, "fetch", remote->name, "--prune", "--no-tags",
 		     "--no-write-fetch-head", "--recurse-submodules=no",
 		     "--refmap=", NULL);
 
 	if (opts->quiet)
 		strvec_push(&child.args, "--quiet");
 
-	strvec_pushf(&child.args, "+refs/heads/*:refs/prefetch/%s/*", remote);
+	strvec_pushf(&child.args, "+refs/heads/*:refs/prefetch/%s/*", remote->name);
 
 	return !!run_command(&child);
 }
 
-static int append_remote(struct remote *remote, void *cbdata)
-{
-	struct string_list *remotes = (struct string_list *)cbdata;
-
-	string_list_append(remotes, remote->name);
-	return 0;
-}
-
 static int maintenance_task_prefetch(struct maintenance_run_opts *opts)
 {
-	int result = 0;
-	struct string_list_item *item;
-	struct string_list remotes = STRING_LIST_INIT_DUP;
-
 	git_config_set_multivar_gently("log.excludedecoration",
 					"refs/prefetch/",
 					"refs/prefetch/",
 					CONFIG_FLAGS_FIXED_VALUE |
 					CONFIG_FLAGS_MULTI_REPLACE);
 
-	if (for_each_remote(append_remote, &remotes)) {
-		error(_("failed to fill remotes"));
-		result = 1;
-		goto cleanup;
+	if (for_each_remote(fetch_remote, opts)) {
+		error(_("failed to prefetch remotes"));
+		return 1;
 	}
 
-	for_each_string_list_item(item, &remotes)
-		result |= fetch_remote(item->string, opts);
-
-cleanup:
-	string_list_clear(&remotes, 0);
-	return result;
+	return 0;
 }
 
 static int maintenance_task_gc(struct maintenance_run_opts *opts)
-- 
gitgitgadget



* [PATCH 2/5] test-lib: use exact match for test_subcommand
  2021-04-05 13:04 [PATCH 0/5] Maintenance: adapt custom refspecs Derrick Stolee via GitGitGadget
  2021-04-05 13:04 ` [PATCH 1/5] maintenance: simplify prefetch logic Derrick Stolee via GitGitGadget
@ 2021-04-05 13:04 ` Derrick Stolee via GitGitGadget
  2021-04-05 17:31   ` Eric Sunshine
  2021-04-05 13:04 ` [PATCH 3/5] refspec: output a refspec item Derrick Stolee via GitGitGadget
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 72+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2021-04-05 13:04 UTC
  To: git; +Cc: tom.saeger, gitster, sunshine, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

The use of 'grep' inside test_subcommand uses general patterns, leading
to sometimes needing escape characters to avoid incorrect matches.
Further, some platforms interpret different glob characters differently.

Use 'grep -F' to use an exact match. This requires removing escape
characters from existing callers. Luckily, this is only one test that
expects refspecs as part of the subcommand.
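
To illustrate why a fixed-string match is the safer choice here, the
following sketch compares a POSIX basic regular expression match (what
plain 'grep' does) with a literal substring search (what 'grep -F'
does). The helper names are made up for this example, and any trace line
fed to them is a hypothetical, shortened stand-in for a real trace2
event.

```c
#include <regex.h>
#include <string.h>

/* Return 1 if 'line' matches 'pattern' as a POSIX basic regex. */
int matches_regex(const char *pattern, const char *line)
{
	regex_t re;
	int found;

	if (regcomp(&re, pattern, REG_NOSUB))
		return 0;	/* treat an invalid pattern as "no match" */
	found = !regexec(&re, line, 0, NULL, 0);
	regfree(&re);
	return found;
}

/* Return 1 if 'needle' occurs literally anywhere in 'line'. */
int matches_fixed(const char *needle, const char *line)
{
	return strstr(line, needle) != NULL;
}
```

With an unescaped refspec such as "refs/heads/*:refs/prefetch/remote1/*",
the regex reads "/*" as "zero or more slashes" and so does not match the
literal asterisk in the logged argument, while the fixed-string search
does. That is why the regex form needed the escapes that this patch
removes.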

Reported-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 t/t7900-maintenance.sh  | 4 ++--
 t/test-lib-functions.sh | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index 2412d8c5c006..fc2315edec11 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -142,8 +142,8 @@ test_expect_success 'prefetch multiple remotes' '
 	test_commit -C clone2 two &&
 	GIT_TRACE2_EVENT="$(pwd)/run-prefetch.txt" git maintenance run --task=prefetch 2>/dev/null &&
 	fetchargs="--prune --no-tags --no-write-fetch-head --recurse-submodules=no --refmap= --quiet" &&
-	test_subcommand git fetch remote1 $fetchargs +refs/heads/\\*:refs/prefetch/remote1/\\* <run-prefetch.txt &&
-	test_subcommand git fetch remote2 $fetchargs +refs/heads/\\*:refs/prefetch/remote2/\\* <run-prefetch.txt &&
+	test_subcommand git fetch remote1 $fetchargs +refs/heads/*:refs/prefetch/remote1/* <run-prefetch.txt &&
+	test_subcommand git fetch remote2 $fetchargs +refs/heads/*:refs/prefetch/remote2/* <run-prefetch.txt &&
 	test_path_is_missing .git/refs/remotes &&
 	git log prefetch/remote1/one &&
 	git log prefetch/remote2/two &&
diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index 6348e8d7339c..a5915dec22df 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -1652,9 +1652,9 @@ test_subcommand () {
 
 	if test -n "$negate"
 	then
-		! grep "\[$expr\]"
+		! grep -F "[$expr]"
 	else
-		grep "\[$expr\]"
+		grep -F "[$expr]"
 	fi
 }
 
-- 
gitgitgadget



* [PATCH 3/5] refspec: output a refspec item
  2021-04-05 13:04 [PATCH 0/5] Maintenance: adapt custom refspecs Derrick Stolee via GitGitGadget
  2021-04-05 13:04 ` [PATCH 1/5] maintenance: simplify prefetch logic Derrick Stolee via GitGitGadget
  2021-04-05 13:04 ` [PATCH 2/5] test-lib: use exact match for test_subcommand Derrick Stolee via GitGitGadget
@ 2021-04-05 13:04 ` Derrick Stolee via GitGitGadget
  2021-04-05 16:57   ` Tom Saeger
  2021-04-07  8:46   ` Ævar Arnfjörð Bjarmason
  2021-04-05 13:04 ` [PATCH 4/5] test-tool: test refspec input/output Derrick Stolee via GitGitGadget
                   ` (2 subsequent siblings)
  5 siblings, 2 replies; 72+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2021-04-05 13:04 UTC
  To: git; +Cc: tom.saeger, gitster, sunshine, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

Add a new method, refspec_item_format(), that takes a 'struct
refspec_item' pointer as input and returns a string for how that refspec
item should be written to Git's config or a subcommand, such as 'git
fetch'.

There are several subtleties regarding special-case refspecs that can
occur and are represented in t5511-refspec.sh. These cases will be
explored in new tests in the following change. It requires adding a new
test helper in order to test this format directly, so that is saved for
a separate change to keep this one focused on the logic of the format
method.

A future change will consume this method when translating refspecs in
the 'prefetch' task of the 'git maintenance' builtin.
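
A point worth keeping in mind when calling a formatter of this shape: it
returns a pointer into storage that the next call reuses, so a caller
that needs the string later has to copy it first. The sketch below shows
that behavior with a simplified struct and a fixed-size buffer. The
'struct item' type, item_format(), and the 256-byte limit are all
assumptions of this example and not the code in refspec.c.

```c
#include <stdio.h>
#include <string.h>

/* Simplified stand-in for 'struct refspec_item'. */
struct item {
	int force;
	int negative;
	int matching;
	const char *src;
	const char *dst;
};

/*
 * Format 'it' into a static buffer and return it. The result is only
 * valid until the next call. Output longer than the buffer is
 * truncated in this sketch.
 */
const char *item_format(const struct item *it)
{
	static char buf[256];

	if (it->matching)
		return ":";

	snprintf(buf, sizeof(buf), "%s%s%s%s",
		 it->negative ? "^" : it->force ? "+" : "",
		 it->src ? it->src : "",
		 it->dst ? ":" : "",
		 it->dst ? it->dst : "");
	return buf;
}
```

A caller that passes the result straight to something that copies its
argument (as strvec_push() does) is unaffected. A caller that holds on to
the pointer across two calls would see the first string replaced by the
second.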

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 refspec.c | 25 +++++++++++++++++++++++++
 refspec.h |  5 +++++
 2 files changed, 30 insertions(+)

diff --git a/refspec.c b/refspec.c
index e3d852c0bfec..ca65ba01bfe6 100644
--- a/refspec.c
+++ b/refspec.c
@@ -180,6 +180,31 @@ void refspec_item_clear(struct refspec_item *item)
 	item->exact_sha1 = 0;
 }
 
+const char *refspec_item_format(const struct refspec_item *rsi)
+{
+	static struct strbuf buf = STRBUF_INIT;
+
+	strbuf_reset(&buf);
+
+	if (rsi->matching)
+		return ":";
+
+	if (rsi->negative)
+		strbuf_addch(&buf, '^');
+	else if (rsi->force)
+		strbuf_addch(&buf, '+');
+
+	if (rsi->src)
+		strbuf_addstr(&buf, rsi->src);
+
+	if (rsi->dst) {
+		strbuf_addch(&buf, ':');
+		strbuf_addstr(&buf, rsi->dst);
+	}
+
+	return buf.buf;
+}
+
 void refspec_init(struct refspec *rs, int fetch)
 {
 	memset(rs, 0, sizeof(*rs));
diff --git a/refspec.h b/refspec.h
index 8b79891d3218..92a312f5b4e6 100644
--- a/refspec.h
+++ b/refspec.h
@@ -56,6 +56,11 @@ int refspec_item_init(struct refspec_item *item, const char *refspec,
 void refspec_item_init_or_die(struct refspec_item *item, const char *refspec,
 			      int fetch);
 void refspec_item_clear(struct refspec_item *item);
+/*
+ * Output a given refspec item to a string.
+ */
+const char *refspec_item_format(const struct refspec_item *rsi);
+
 void refspec_init(struct refspec *rs, int fetch);
 void refspec_append(struct refspec *rs, const char *refspec);
 __attribute__((format (printf,2,3)))
-- 
gitgitgadget



* [PATCH 4/5] test-tool: test refspec input/output
  2021-04-05 13:04 [PATCH 0/5] Maintenance: adapt custom refspecs Derrick Stolee via GitGitGadget
                   ` (2 preceding siblings ...)
  2021-04-05 13:04 ` [PATCH 3/5] refspec: output a refspec item Derrick Stolee via GitGitGadget
@ 2021-04-05 13:04 ` Derrick Stolee via GitGitGadget
  2021-04-05 17:52   ` Eric Sunshine
  2021-04-07  8:54   ` Ævar Arnfjörð Bjarmason
  2021-04-05 13:04 ` [PATCH 5/5] maintenance: allow custom refspecs during prefetch Derrick Stolee via GitGitGadget
  2021-04-06 18:47 ` [PATCH v2 0/5] Maintenance: adapt custom refspecs Derrick Stolee via GitGitGadget
  5 siblings, 2 replies; 72+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2021-04-05 13:04 UTC
  To: git; +Cc: tom.saeger, gitster, sunshine, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

Add a new test-helper, 'test-tool refspec', that currently reads stdin
line-by-line and translates the refspecs using the parsing logic of
refspec_item_init() and writes them to output.

Create a test in t5511-refspec.sh that uses this helper to test several
known special cases. This includes all of the special properties of the
'struct refspec_item', including:

 * force: The refspec starts with '+'.
 * pattern: Each side of the refspec has a glob character ('*')
 * matching: The refspec is simply the string ":".
 * exact_sha1: The 'src' string is a 40-character hex string.
 * negative: The refspec starts with '^' and 'dst' is NULL.

While the exact_sha1 property doesn't require special logic in
refspec_item_format, it is still tested here for completeness.

There is also the special-case refspec "@" which translates to "HEAD".

Note that if a refspec does not start with "refs/", then that is not
incorporated as part of the 'struct refspec_item'. This behavior is
confirmed by these tests. These refspecs still work in the wild because
the refs layer interprets them appropriately as branches, prepending
"refs/" or "refs/heads/" as necessary. I spent some time attempting to
insert these prefixes explicitly in parse_refspec(), but there are
several subtleties I was unable to overcome. If such a change were to be
made, then this new test in t5511-refspec.sh will need to be updated
with new output. For example, the input lines ending with "translated"
are designed to demonstrate these subtleties.
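
For readers less familiar with these properties, here is a small
classifier that derives the five flags from a refspec string, following
only the descriptions in the list above. It is a deliberately simplified
sketch: 'struct refspec_flags' and classify_refspec() are hypothetical
names, and the real parse_refspec() handles validation, the fetch/push
distinction, and other cases that are ignored here.

```c
#include <ctype.h>
#include <string.h>

struct refspec_flags {
	int force;
	int pattern;
	int matching;
	int exact_sha1;
	int negative;
};

/* Hypothetical classifier; mirrors the list above, not parse_refspec(). */
struct refspec_flags classify_refspec(const char *spec)
{
	struct refspec_flags f = { 0, 0, 0, 0, 0 };
	const char *colon;
	const char *dst;
	size_t src_len, i;

	/* matching: the refspec is simply ":" */
	if (!strcmp(spec, ":")) {
		f.matching = 1;
		return f;
	}

	/* force and negative are signalled by the first character */
	if (*spec == '+') {
		f.force = 1;
		spec++;
	} else if (*spec == '^') {
		f.negative = 1;
		spec++;
	}

	colon = strrchr(spec, ':');
	dst = colon ? colon + 1 : NULL;
	src_len = colon ? (size_t)(colon - spec) : strlen(spec);

	/*
	 * pattern: '*' on the source side and, when a destination
	 * exists, on the destination side as well.
	 */
	if (memchr(spec, '*', src_len) && (!dst || strchr(dst, '*')))
		f.pattern = 1;

	/* exact_sha1: the source is exactly 40 hexadecimal digits */
	if (src_len == 40) {
		f.exact_sha1 = 1;
		for (i = 0; i < src_len; i++)
			if (!isxdigit((unsigned char)spec[i]))
				f.exact_sha1 = 0;
	}

	return f;
}
```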

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 Makefile                |  1 +
 t/helper/test-refspec.c | 39 +++++++++++++++++++++++++++++++++++++++
 t/helper/test-tool.c    |  1 +
 t/helper/test-tool.h    |  1 +
 t/t5511-refspec.sh      | 41 +++++++++++++++++++++++++++++++++++++++++
 5 files changed, 83 insertions(+)
 create mode 100644 t/helper/test-refspec.c

diff --git a/Makefile b/Makefile
index a6a73c574191..f858c9f25976 100644
--- a/Makefile
+++ b/Makefile
@@ -734,6 +734,7 @@ TEST_BUILTINS_OBJS += test-reach.o
 TEST_BUILTINS_OBJS += test-read-cache.o
 TEST_BUILTINS_OBJS += test-read-graph.o
 TEST_BUILTINS_OBJS += test-read-midx.o
+TEST_BUILTINS_OBJS += test-refspec.o
 TEST_BUILTINS_OBJS += test-ref-store.o
 TEST_BUILTINS_OBJS += test-regex.o
 TEST_BUILTINS_OBJS += test-repository.o
diff --git a/t/helper/test-refspec.c b/t/helper/test-refspec.c
new file mode 100644
index 000000000000..08cf441a0a06
--- /dev/null
+++ b/t/helper/test-refspec.c
@@ -0,0 +1,39 @@
+#include "cache.h"
+#include "parse-options.h"
+#include "refspec.h"
+#include "strbuf.h"
+#include "test-tool.h"
+
+static const char * const refspec_usage[] = {
+	N_("test-tool refspec [--fetch]"),
+	NULL
+};
+
+int cmd__refspec(int argc, const char **argv)
+{
+	struct strbuf line = STRBUF_INIT;
+	int fetch = 0;
+
+	struct option refspec_options [] = {
+		OPT_BOOL(0, "fetch", &fetch,
+			 N_("enable the 'fetch' option for parsing refspecs")),
+		OPT_END()
+	};
+
+	argc = parse_options(argc, argv, NULL, refspec_options,
+			     refspec_usage, 0);
+
+	while (strbuf_getline(&line, stdin) != EOF) {
+		struct refspec_item rsi;
+
+		if (!refspec_item_init(&rsi, line.buf, fetch)) {
+			printf("failed to parse %s\n", line.buf);
+			continue;
+		}
+
+		printf("%s\n", refspec_item_format(&rsi));
+		refspec_item_clear(&rsi);
+	}
+
+	return 0;
+}
diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
index 287aa6002307..f534ad1731a9 100644
--- a/t/helper/test-tool.c
+++ b/t/helper/test-tool.c
@@ -55,6 +55,7 @@ static struct test_cmd cmds[] = {
 	{ "read-cache", cmd__read_cache },
 	{ "read-graph", cmd__read_graph },
 	{ "read-midx", cmd__read_midx },
+	{ "refspec", cmd__refspec },
 	{ "ref-store", cmd__ref_store },
 	{ "regex", cmd__regex },
 	{ "repository", cmd__repository },
diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h
index 9ea4b31011dd..46a0b8850f17 100644
--- a/t/helper/test-tool.h
+++ b/t/helper/test-tool.h
@@ -44,6 +44,7 @@ int cmd__reach(int argc, const char **argv);
 int cmd__read_cache(int argc, const char **argv);
 int cmd__read_graph(int argc, const char **argv);
 int cmd__read_midx(int argc, const char **argv);
+int cmd__refspec(int argc, const char **argv);
 int cmd__ref_store(int argc, const char **argv);
 int cmd__regex(int argc, const char **argv);
 int cmd__repository(int argc, const char **argv);
diff --git a/t/t5511-refspec.sh b/t/t5511-refspec.sh
index be025b90f989..7614b6adf932 100755
--- a/t/t5511-refspec.sh
+++ b/t/t5511-refspec.sh
@@ -93,4 +93,45 @@ test_refspec fetch "refs/heads/${good}"
 bad=$(printf '\011tab')
 test_refspec fetch "refs/heads/${bad}"				invalid
 
+test_expect_success 'test input/output round trip' '
+	cat >input <<-\EOF &&
+		+refs/heads/*:refs/remotes/origin/*
+		refs/heads/*:refs/remotes/origin/*
+		refs/heads/main:refs/remotes/frotz/xyzzy
+		:refs/remotes/frotz/deleteme
+		^refs/heads/secrets
+		refs/heads/secret:refs/heads/translated
+		refs/heads/secret:heads/translated
+		refs/heads/secret:remotes/translated
+		secret:translated
+		refs/heads/*:remotes/xxy/*
+		refs/heads*/for-linus:refs/remotes/mine/*
+		2e36527f23b7f6ae15e6f21ac3b08bf3fed6ee48:refs/heads/fixed
+		HEAD
+		@
+		:
+	EOF
+	cat >expect <<-\EOF &&
+		+refs/heads/*:refs/remotes/origin/*
+		refs/heads/*:refs/remotes/origin/*
+		refs/heads/main:refs/remotes/frotz/xyzzy
+		:refs/remotes/frotz/deleteme
+		^refs/heads/secrets
+		refs/heads/secret:refs/heads/translated
+		refs/heads/secret:heads/translated
+		refs/heads/secret:remotes/translated
+		secret:translated
+		refs/heads/*:remotes/xxy/*
+		refs/heads*/for-linus:refs/remotes/mine/*
+		2e36527f23b7f6ae15e6f21ac3b08bf3fed6ee48:refs/heads/fixed
+		HEAD
+		HEAD
+		:
+	EOF
+	test-tool refspec <input >output &&
+	test_cmp expect output &&
+	test-tool refspec --fetch <input >output &&
+	test_cmp expect output
+'
+
 test_done
-- 
gitgitgadget



* [PATCH 5/5] maintenance: allow custom refspecs during prefetch
  2021-04-05 13:04 [PATCH 0/5] Maintenance: adapt custom refspecs Derrick Stolee via GitGitGadget
                   ` (3 preceding siblings ...)
  2021-04-05 13:04 ` [PATCH 4/5] test-tool: test refspec input/output Derrick Stolee via GitGitGadget
@ 2021-04-05 13:04 ` Derrick Stolee via GitGitGadget
  2021-04-05 17:16   ` Tom Saeger
                     ` (2 more replies)
  2021-04-06 18:47 ` [PATCH v2 0/5] Maintenance: adapt custom refspecs Derrick Stolee via GitGitGadget
  5 siblings, 3 replies; 72+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2021-04-05 13:04 UTC
  To: git; +Cc: tom.saeger, gitster, sunshine, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

The prefetch task previously used the default refspec source plus a
custom refspec destination to avoid colliding with remote refs:

	+refs/heads/*:refs/prefetch/<remote>/*

However, some users customize their refspec to reduce how much data they
download from specific remotes. This can involve restrictive patterns
for fetching or negative patterns to avoid downloading some refs.

Modify fetch_remote() to iterate over the remote's refspec list and
translate that into the appropriate prefetch scenario. Specifically,
re-parse the raw form of the refspec into a new 'struct refspec' and
modify the 'dst' member to replace a leading "refs/" substring with
"refs/prefetch/", or prepend "refs/prefetch/" to 'dst' otherwise.
Negative refspecs do not have a 'dst' so they can be transferred to the
'git fetch' command unmodified.

This prefix change provides the benefit of keeping whatever collisions
may exist in the custom refspecs, if that is a desirable outcome.

This changes the names of the refs that would be fetched by the default
refspec. Instead of "refs/prefetch/<remote>/<branch>" they will now go
to "refs/prefetch/remotes/<remote>/<branch>". While this is a change, it
is not a seriously breaking one: these refs are intended to be hidden
and not used.
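
The translation and the resulting rename can be sketched at the string
level as follows. This is only an illustration: the patch itself works on
a parsed 'struct refspec_item', whereas prefetch_refspec() below is a
hypothetical helper that operates on the raw refspec text. Passing a
refspec with no destination through unchanged is a choice made for this
sketch, not something the patch specifies.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: write the prefetch form of the fetch refspec
 * 'raw' into 'out'. Returns 0 on success, -1 if 'out' is too small.
 */
int prefetch_refspec(const char *raw, char *out, size_t outlen)
{
	const char *colon = strrchr(raw, ':');
	const char *dst;
	int n;

	/* Negative refspecs have no destination: pass them through. */
	if (raw[0] == '^' || !colon) {
		n = snprintf(out, outlen, "%s", raw);
		return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
	}

	/* Replace a leading "refs/" in dst, otherwise just prepend. */
	dst = colon + 1;
	if (!strncmp(dst, "refs/", 5))
		dst += 5;

	n = snprintf(out, outlen, "%.*s:refs/prefetch/%s",
		     (int)(colon - raw), raw, dst);
	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

Feeding it the default refspec "+refs/heads/*:refs/remotes/remote1/*"
gives "+refs/heads/*:refs/prefetch/remotes/remote1/*", which is the
renamed location described above. The custom refspec
"+refs/heads/special/fetched:refs/heads/fetched" becomes
"+refs/heads/special/fetched:refs/prefetch/heads/fetched".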

Update the documentation to be more generic about the destination refs.
Do not mention custom refspecs explicitly, as that does not need to be
highlighted in this documentation. The important part of placing refs in
refs/prefetch remains.

Reported-by: Tom Saeger <tom.saeger@oracle.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 Documentation/git-maintenance.txt |  3 +--
 builtin/gc.c                      | 34 +++++++++++++++++++++++-
 t/t7900-maintenance.sh            | 43 ++++++++++++++++++++++++++-----
 3 files changed, 71 insertions(+), 9 deletions(-)

diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt
index 80ddd33ceba0..95a24264eb10 100644
--- a/Documentation/git-maintenance.txt
+++ b/Documentation/git-maintenance.txt
@@ -94,8 +94,7 @@ prefetch::
 	objects from all registered remotes. For each remote, a `git fetch`
 	command is run. The refmap is custom to avoid updating local or remote
 	branches (those in `refs/heads` or `refs/remotes`). Instead, the
-	remote refs are stored in `refs/prefetch/<remote>/`. Also, tags are
-	not updated.
+	refs are stored in `refs/prefetch/`. Also, tags are not updated.
 +
 This is done to avoid disrupting the remote-tracking branches. The end users
 expect these refs to stay unmoved unless they initiate a fetch.  With prefetch
diff --git a/builtin/gc.c b/builtin/gc.c
index fa8128de9ae1..92cb8b4e0bfa 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -32,6 +32,7 @@
 #include "remote.h"
 #include "object-store.h"
 #include "exec-cmd.h"
+#include "refspec.h"
 
 #define FAILED_RUN "failed to run %s"
 
@@ -877,6 +878,7 @@ static int fetch_remote(struct remote *remote, void *cbdata)
 {
 	struct maintenance_run_opts *opts = cbdata;
 	struct child_process child = CHILD_PROCESS_INIT;
+	int i;
 
 	child.git_cmd = 1;
 	strvec_pushl(&child.args, "fetch", remote->name, "--prune", "--no-tags",
@@ -886,7 +888,37 @@ static int fetch_remote(struct remote *remote, void *cbdata)
 	if (opts->quiet)
 		strvec_push(&child.args, "--quiet");
 
-	strvec_pushf(&child.args, "+refs/heads/*:refs/prefetch/%s/*", remote->name);
+	for (i = 0; i < remote->fetch.nr; i++) {
+		struct refspec_item replace;
+		struct refspec_item *rsi = &remote->fetch.items[i];
+		struct strbuf new_dst = STRBUF_INIT;
+		size_t ignore_len = 0;
+
+		if (rsi->negative) {
+			strvec_push(&child.args, remote->fetch.raw[i]);
+			continue;
+		}
+
+		refspec_item_init(&replace, remote->fetch.raw[i], 1);
+
+		/*
+		 * If a refspec dst starts with "refs/" at the start,
+		 * then we will replace "refs/" with "refs/prefetch/".
+		 * Otherwise, we will prepend the dst string with
+		 * "refs/prefetch/".
+		 */
+		if (!strncmp(replace.dst, "refs/", 5))
+			ignore_len = 5;
+
+		strbuf_addstr(&new_dst, "refs/prefetch/");
+		strbuf_addstr(&new_dst, replace.dst + ignore_len);
+		free(replace.dst);
+		replace.dst = strbuf_detach(&new_dst, NULL);
+
+		strvec_push(&child.args, refspec_item_format(&replace));
+
+		refspec_item_clear(&replace);
+	}
 
 	return !!run_command(&child);
 }
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index fc2315edec11..3366ea188782 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -142,20 +142,51 @@ test_expect_success 'prefetch multiple remotes' '
 	test_commit -C clone2 two &&
 	GIT_TRACE2_EVENT="$(pwd)/run-prefetch.txt" git maintenance run --task=prefetch 2>/dev/null &&
 	fetchargs="--prune --no-tags --no-write-fetch-head --recurse-submodules=no --refmap= --quiet" &&
-	test_subcommand git fetch remote1 $fetchargs +refs/heads/*:refs/prefetch/remote1/* <run-prefetch.txt &&
-	test_subcommand git fetch remote2 $fetchargs +refs/heads/*:refs/prefetch/remote2/* <run-prefetch.txt &&
+	test_subcommand git fetch remote1 $fetchargs +refs/heads/*:refs/prefetch/remotes/remote1/* <run-prefetch.txt &&
+	test_subcommand git fetch remote2 $fetchargs +refs/heads/*:refs/prefetch/remotes/remote2/* <run-prefetch.txt &&
 	test_path_is_missing .git/refs/remotes &&
-	git log prefetch/remote1/one &&
-	git log prefetch/remote2/two &&
+	git log prefetch/remotes/remote1/one &&
+	git log prefetch/remotes/remote2/two &&
 	git fetch --all &&
-	test_cmp_rev refs/remotes/remote1/one refs/prefetch/remote1/one &&
-	test_cmp_rev refs/remotes/remote2/two refs/prefetch/remote2/two &&
+	test_cmp_rev refs/remotes/remote1/one refs/prefetch/remotes/remote1/one &&
+	test_cmp_rev refs/remotes/remote2/two refs/prefetch/remotes/remote2/two &&
 
 	test_cmp_config refs/prefetch/ log.excludedecoration &&
 	git log --oneline --decorate --all >log &&
 	! grep "prefetch" log
 '
 
+test_expect_success 'prefetch custom refspecs' '
+	git -C clone1 branch -f special/fetched HEAD &&
+	git -C clone1 branch -f special/secret/not-fetched HEAD &&
+
+	# create multiple refspecs for remote1
+	git config --add remote.remote1.fetch +refs/heads/special/fetched:refs/heads/fetched &&
+	git config --add remote.remote1.fetch ^refs/heads/special/secret/not-fetched &&
+
+	GIT_TRACE2_EVENT="$(pwd)/prefetch-refspec.txt" git maintenance run --task=prefetch 2>/dev/null &&
+
+	fetchargs="--prune --no-tags --no-write-fetch-head --recurse-submodules=no --refmap= --quiet" &&
+
+	# skips second refspec because it is not a pattern type
+	rs1="+refs/heads/*:refs/prefetch/remotes/remote1/*" &&
+	rs2="+refs/heads/special/fetched:refs/prefetch/heads/fetched" &&
+	rs3="^refs/heads/special/secret/not-fetched" &&
+
+	test_subcommand git fetch remote1 $fetchargs $rs1 $rs2 $rs3 <prefetch-refspec.txt &&
+	test_subcommand git fetch remote2 $fetchargs +refs/heads/*:refs/prefetch/remotes/remote2/* <prefetch-refspec.txt &&
+
+	# first refspec is overridden by second
+	test_must_fail git rev-parse refs/prefetch/special/fetched &&
+	git rev-parse refs/prefetch/heads/fetched &&
+
+	# possible incorrect places for the non-fetched ref
+	test_must_fail git rev-parse refs/prefetch/remotes/remote1/secret/not-fetched &&
+	test_must_fail git rev-parse refs/prefetch/remotes/remote1/not-fetched &&
+	test_must_fail git rev-parse refs/heads/secret/not-fetched &&
+	test_must_fail git rev-parse refs/heads/not-fetched
+'
+
 test_expect_success 'prefetch and existing log.excludeDecoration values' '
 	git config --unset-all log.excludeDecoration &&
 	git config log.excludeDecoration refs/remotes/remote1/ &&
-- 
gitgitgadget


* Re: [PATCH 3/5] refspec: output a refspec item
  2021-04-05 13:04 ` [PATCH 3/5] refspec: output a refspec item Derrick Stolee via GitGitGadget
@ 2021-04-05 16:57   ` Tom Saeger
  2021-04-05 17:40     ` Eric Sunshine
  2021-04-07  8:46   ` Ævar Arnfjörð Bjarmason
  1 sibling, 1 reply; 72+ messages in thread
From: Tom Saeger @ 2021-04-05 16:57 UTC
  To: Derrick Stolee via GitGitGadget
  Cc: git, gitster, sunshine, Derrick Stolee, Derrick Stolee

On Mon, Apr 05, 2021 at 01:04:13PM +0000, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <dstolee@microsoft.com>
> 
> Add a new method, refspec_item_format(), that takes a 'struct
> refspec_item' pointer as input and returns a string for how that refspec
> item should be written to Git's config or a subcommand, such as 'git
> fetch'.
> 
> There are several subtleties regarding special-case refspecs that can
> occur and are represented in t5511-refspec.sh. These cases will be
> explored in new tests in the following change. It requires adding a new
> test helper in order to test this format directly, so that is saved for
> a separate change to keep this one focused on the logic of the format
> method.
> 
> A future change will consume this method when translating refspecs in
> the 'prefetch' task of the 'git maintenance' builtin.
> 
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
>  refspec.c | 25 +++++++++++++++++++++++++
>  refspec.h |  5 +++++
>  2 files changed, 30 insertions(+)
> 
> diff --git a/refspec.c b/refspec.c
> index e3d852c0bfec..ca65ba01bfe6 100644
> --- a/refspec.c
> +++ b/refspec.c
> @@ -180,6 +180,31 @@ void refspec_item_clear(struct refspec_item *item)
>  	item->exact_sha1 = 0;
>  }
>  
> +const char *refspec_item_format(const struct refspec_item *rsi)
> +{
> +	static struct strbuf buf = STRBUF_INIT;
> +
> +	strbuf_reset(&buf);

is this even needed?

> +
> +	if (rsi->matching)
> +		return ":";
> +
> +	if (rsi->negative)
> +		strbuf_addch(&buf, '^');
> +	else if (rsi->force)
> +		strbuf_addch(&buf, '+');
> +
> +	if (rsi->src)
> +		strbuf_addstr(&buf, rsi->src);
> +
> +	if (rsi->dst) {
> +		strbuf_addch(&buf, ':');
> +		strbuf_addstr(&buf, rsi->dst);
> +	}
> +
> +	return buf.buf;

should this be strbuf_detach?

> +}
> +
>  void refspec_init(struct refspec *rs, int fetch)
>  {
>  	memset(rs, 0, sizeof(*rs));
> diff --git a/refspec.h b/refspec.h
> index 8b79891d3218..92a312f5b4e6 100644
> --- a/refspec.h
> +++ b/refspec.h
> @@ -56,6 +56,11 @@ int refspec_item_init(struct refspec_item *item, const char *refspec,
>  void refspec_item_init_or_die(struct refspec_item *item, const char *refspec,
>  			      int fetch);
>  void refspec_item_clear(struct refspec_item *item);
> +/*
> + * Output a given refspec item to a string.
> + */
> +const char *refspec_item_format(const struct refspec_item *rsi);
> +
>  void refspec_init(struct refspec *rs, int fetch);
>  void refspec_append(struct refspec *rs, const char *refspec);
>  __attribute__((format (printf,2,3)))
> -- 
> gitgitgadget
> 


* Re: [PATCH 1/5] maintenance: simplify prefetch logic
  2021-04-05 13:04 ` [PATCH 1/5] maintenance: simplify prefetch logic Derrick Stolee via GitGitGadget
@ 2021-04-05 17:01   ` Tom Saeger
  0 siblings, 0 replies; 72+ messages in thread
From: Tom Saeger @ 2021-04-05 17:01 UTC
  To: Derrick Stolee via GitGitGadget
  Cc: git, gitster, sunshine, Derrick Stolee, Derrick Stolee

On Mon, Apr 05, 2021 at 01:04:11PM +0000, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <dstolee@microsoft.com>
> 
> The previous logic filled a string list with the names of each remote,
> but instead we could simply run the appropriate 'git fetch' data
> directly in the remote iterator. Do this for reduced code size, but also
> becuase it sets up an upcoming change to use the remote's refspec. This

*NIT* because

> data is accessible from the 'struct remote' data that is now accessible
> in fetch_remote().
> 
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
>  builtin/gc.c | 33 ++++++++-------------------------
>  1 file changed, 8 insertions(+), 25 deletions(-)
> 
> diff --git a/builtin/gc.c b/builtin/gc.c
> index ef7226d7bca4..fa8128de9ae1 100644
> --- a/builtin/gc.c
> +++ b/builtin/gc.c
> @@ -873,55 +873,38 @@ static int maintenance_task_commit_graph(struct maintenance_run_opts *opts)
>  	return 0;
>  }
>  
> -static int fetch_remote(const char *remote, struct maintenance_run_opts *opts)
> +static int fetch_remote(struct remote *remote, void *cbdata)
>  {
> +	struct maintenance_run_opts *opts = cbdata;
>  	struct child_process child = CHILD_PROCESS_INIT;
>  
>  	child.git_cmd = 1;
> -	strvec_pushl(&child.args, "fetch", remote, "--prune", "--no-tags",
> +	strvec_pushl(&child.args, "fetch", remote->name, "--prune", "--no-tags",
>  		     "--no-write-fetch-head", "--recurse-submodules=no",
>  		     "--refmap=", NULL);
>  
>  	if (opts->quiet)
>  		strvec_push(&child.args, "--quiet");
>  
> -	strvec_pushf(&child.args, "+refs/heads/*:refs/prefetch/%s/*", remote);
> +	strvec_pushf(&child.args, "+refs/heads/*:refs/prefetch/%s/*", remote->name);
>  
>  	return !!run_command(&child);
>  }
>  
> -static int append_remote(struct remote *remote, void *cbdata)
> -{
> -	struct string_list *remotes = (struct string_list *)cbdata;
> -
> -	string_list_append(remotes, remote->name);
> -	return 0;
> -}
> -
>  static int maintenance_task_prefetch(struct maintenance_run_opts *opts)
>  {
> -	int result = 0;
> -	struct string_list_item *item;
> -	struct string_list remotes = STRING_LIST_INIT_DUP;
> -
>  	git_config_set_multivar_gently("log.excludedecoration",
>  					"refs/prefetch/",
>  					"refs/prefetch/",
>  					CONFIG_FLAGS_FIXED_VALUE |
>  					CONFIG_FLAGS_MULTI_REPLACE);
>  
> -	if (for_each_remote(append_remote, &remotes)) {
> -		error(_("failed to fill remotes"));
> -		result = 1;
> -		goto cleanup;
> +	if (for_each_remote(fetch_remote, opts)) {
> +		error(_("failed to prefetch remotes"));
> +		return 1;
>  	}
>  
> -	for_each_string_list_item(item, &remotes)
> -		result |= fetch_remote(item->string, opts);
> -
> -cleanup:
> -	string_list_clear(&remotes, 0);
> -	return result;
> +	return 0;
>  }
>  
>  static int maintenance_task_gc(struct maintenance_run_opts *opts)
> -- 
> gitgitgadget
> 

* Re: [PATCH 5/5] maintenance: allow custom refspecs during prefetch
  2021-04-05 13:04 ` [PATCH 5/5] maintenance: allow custom refspecs during prefetch Derrick Stolee via GitGitGadget
@ 2021-04-05 17:16   ` Tom Saeger
  2021-04-06 11:15     ` Derrick Stolee
  2021-04-07  8:53   ` Ævar Arnfjörð Bjarmason
  2021-04-07 13:47   ` Ævar Arnfjörð Bjarmason
  2 siblings, 1 reply; 72+ messages in thread
From: Tom Saeger @ 2021-04-05 17:16 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: git, gitster, sunshine, Derrick Stolee, Derrick Stolee

On Mon, Apr 05, 2021 at 01:04:15PM +0000, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <dstolee@microsoft.com>
> 
> The prefetch task previously used the default refspec source plus a
> custom refspec destination to avoid colliding with remote refs:
> 
> 	+refs/heads/*:refs/prefetch/<remote>/*
> 
> However, some users customize their refspec to reduce how much data they
> download from specific remotes. This can involve restrictive patterns
> for fetching or negative patterns to avoid downloading some refs.
> 
> Modify fetch_remote() to iterate over the remote's refspec list and
> translate that into the appropriate prefetch scenario. Specifically,
> re-parse the raw form of the refspec into a new 'struct refspec' and
> modify the 'dst' member to replace a leading "refs/" substring with
> "refs/prefetch/", or prepend "refs/prefetch/" to 'dst' otherwise.
> Negative refspecs do not have a 'dst' so they can be transferred to the
> 'git fetch' command unmodified.
> 
> This prefix change provides the benefit of keeping whatever collisions
> may exist in the custom refspecs, if that is a desirable outcome.
> 
> This changes the names of the refs that would be fetched by the default
> refspec. Instead of "refs/prefetch/<remote>/<branch>" they will now go
> to "refs/prefetch/remotes/<remote>/<branch>". While this is a change, it
> is not a seriously breaking one: these refs are intended to be hidden
> and not used.
> 
> Update the documentation to be more generic about the destination refs.
> Do not mention custom refspecs explicitly, as that does not need to be
> highlighted in this documentation. The important part of placing refs in
> refs/prefetch remains.
> 
> Reported-by: Tom Saeger <tom.saeger@oracle.com>
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
>  Documentation/git-maintenance.txt |  3 +--
>  builtin/gc.c                      | 34 +++++++++++++++++++++++-
>  t/t7900-maintenance.sh            | 43 ++++++++++++++++++++++++++-----
>  3 files changed, 71 insertions(+), 9 deletions(-)
> 
> diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt
> index 80ddd33ceba0..95a24264eb10 100644
> --- a/Documentation/git-maintenance.txt
> +++ b/Documentation/git-maintenance.txt
> @@ -94,8 +94,7 @@ prefetch::
>  	objects from all registered remotes. For each remote, a `git fetch`
>  	command is run. The refmap is custom to avoid updating local or remote
>  	branches (those in `refs/heads` or `refs/remotes`). Instead, the
> -	remote refs are stored in `refs/prefetch/<remote>/`. Also, tags are
> -	not updated.
> +	refs are stored in `refs/prefetch/`. Also, tags are not updated.
>  +
>  This is done to avoid disrupting the remote-tracking branches. The end users
>  expect these refs to stay unmoved unless they initiate a fetch.  With prefetch
> diff --git a/builtin/gc.c b/builtin/gc.c
> index fa8128de9ae1..92cb8b4e0bfa 100644
> --- a/builtin/gc.c
> +++ b/builtin/gc.c
> @@ -32,6 +32,7 @@
>  #include "remote.h"
>  #include "object-store.h"
>  #include "exec-cmd.h"
> +#include "refspec.h"
>  
>  #define FAILED_RUN "failed to run %s"
>  
> @@ -877,6 +878,7 @@ static int fetch_remote(struct remote *remote, void *cbdata)
>  {
>  	struct maintenance_run_opts *opts = cbdata;
>  	struct child_process child = CHILD_PROCESS_INIT;
> +	int i;
>  
>  	child.git_cmd = 1;
>  	strvec_pushl(&child.args, "fetch", remote->name, "--prune", "--no-tags",
> @@ -886,7 +888,37 @@ static int fetch_remote(struct remote *remote, void *cbdata)
>  	if (opts->quiet)
>  		strvec_push(&child.args, "--quiet");
>  
> -	strvec_pushf(&child.args, "+refs/heads/*:refs/prefetch/%s/*", remote->name);
> +	for (i = 0; i < remote->fetch.nr; i++) {
> +		struct refspec_item replace;
> +		struct refspec_item *rsi = &remote->fetch.items[i];
> +		struct strbuf new_dst = STRBUF_INIT;
> +		size_t ignore_len = 0;
> +
> +		if (rsi->negative) {
> +			strvec_push(&child.args, remote->fetch.raw[i]);
> +			continue;
> +		}
> +
> +		refspec_item_init(&replace, remote->fetch.raw[i], 1);
> +
> +		/*
> +		 * If a refspec dst starts with "refs/" at the start,
> +		 * then we will replace "refs/" with "refs/prefetch/".
> +		 * Otherwise, we will prepend the dst string with
> +		 * "refs/prefetch/".
> +		 */
> +		if (!strncmp(replace.dst, "refs/", 5))
> +			ignore_len = 5;
> +
> +		strbuf_addstr(&new_dst, "refs/prefetch/");
> +		strbuf_addstr(&new_dst, replace.dst + ignore_len);
> +		free(replace.dst);
> +		replace.dst = strbuf_detach(&new_dst, NULL);
> +
> +		strvec_push(&child.args, refspec_item_format(&replace));

see comment on 3/5, think refspec_item_format is leaking here.
this code looks fine though.

> +
> +		refspec_item_clear(&replace);
> +	}
>  
>  	return !!run_command(&child);
>  }
> diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
> index fc2315edec11..3366ea188782 100755
> --- a/t/t7900-maintenance.sh
> +++ b/t/t7900-maintenance.sh
> @@ -142,20 +142,51 @@ test_expect_success 'prefetch multiple remotes' '
>  	test_commit -C clone2 two &&
>  	GIT_TRACE2_EVENT="$(pwd)/run-prefetch.txt" git maintenance run --task=prefetch 2>/dev/null &&
>  	fetchargs="--prune --no-tags --no-write-fetch-head --recurse-submodules=no --refmap= --quiet" &&
> -	test_subcommand git fetch remote1 $fetchargs +refs/heads/*:refs/prefetch/remote1/* <run-prefetch.txt &&
> -	test_subcommand git fetch remote2 $fetchargs +refs/heads/*:refs/prefetch/remote2/* <run-prefetch.txt &&
> +	test_subcommand git fetch remote1 $fetchargs +refs/heads/*:refs/prefetch/remotes/remote1/* <run-prefetch.txt &&
> +	test_subcommand git fetch remote2 $fetchargs +refs/heads/*:refs/prefetch/remotes/remote2/* <run-prefetch.txt &&
>  	test_path_is_missing .git/refs/remotes &&
> -	git log prefetch/remote1/one &&
> -	git log prefetch/remote2/two &&
> +	git log prefetch/remotes/remote1/one &&
> +	git log prefetch/remotes/remote2/two &&
>  	git fetch --all &&
> -	test_cmp_rev refs/remotes/remote1/one refs/prefetch/remote1/one &&
> -	test_cmp_rev refs/remotes/remote2/two refs/prefetch/remote2/two &&
> +	test_cmp_rev refs/remotes/remote1/one refs/prefetch/remotes/remote1/one &&
> +	test_cmp_rev refs/remotes/remote2/two refs/prefetch/remotes/remote2/two &&
>  
>  	test_cmp_config refs/prefetch/ log.excludedecoration &&
>  	git log --oneline --decorate --all >log &&
>  	! grep "prefetch" log
>  '
>  
> +test_expect_success 'prefetch custom refspecs' '
> +	git -C clone1 branch -f special/fetched HEAD &&
> +	git -C clone1 branch -f special/secret/not-fetched HEAD &&
> +
> +	# create multiple refspecs for remote1
> +	git config --add remote.remote1.fetch +refs/heads/special/fetched:refs/heads/fetched &&
> +	git config --add remote.remote1.fetch ^refs/heads/special/secret/not-fetched &&
> +
> +	GIT_TRACE2_EVENT="$(pwd)/prefetch-refspec.txt" git maintenance run --task=prefetch 2>/dev/null &&
> +
> +	fetchargs="--prune --no-tags --no-write-fetch-head --recurse-submodules=no --refmap= --quiet" &&
> +
> +	# skips second refspec because it is not a pattern type
> +	rs1="+refs/heads/*:refs/prefetch/remotes/remote1/*" &&
> +	rs2="+refs/heads/special/fetched:refs/prefetch/heads/fetched" &&
> +	rs3="^refs/heads/special/secret/not-fetched" &&
> +
> +	test_subcommand git fetch remote1 $fetchargs $rs1 $rs2 $rs3 <prefetch-refspec.txt &&
> +	test_subcommand git fetch remote2 $fetchargs +refs/heads/*:refs/prefetch/remotes/remote2/* <prefetch-refspec.txt &&
> +
> +	# first refspec is overridden by second
> +	test_must_fail git rev-parse refs/prefetch/special/fetched &&
> +	git rev-parse refs/prefetch/heads/fetched &&
> +
> +	# possible incorrect places for the non-fetched ref
> +	test_must_fail git rev-parse refs/prefetch/remotes/remote1/secret/not-fetched &&
> +	test_must_fail git rev-parse refs/prefetch/remotes/remote1/not-fetched &&
> +	test_must_fail git rev-parse refs/heads/secret/not-fetched &&
> +	test_must_fail git rev-parse refs/heads/not-fetched
> +'
> +
>  test_expect_success 'prefetch and existing log.excludeDecoration values' '
>  	git config --unset-all log.excludeDecoration &&
>  	git config log.excludeDecoration refs/remotes/remote1/ &&
> -- 
> gitgitgadget
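
As a rough illustration of the destination rewrite described in the commit
message above, here is a standalone sketch. It is not the patch's code:
`prefetch_dst` is a hypothetical name, and plain malloc() stands in for
git's strbuf API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return a newly allocated copy of 'dst' moved
 * under "refs/prefetch/". A leading "refs/" is replaced; any other
 * destination is simply prefixed. The caller must free() the result.
 */
static char *prefetch_dst(const char *dst)
{
	static const char prefix[] = "refs/prefetch/";
	size_t skip = 0;
	size_t len;
	char *out;

	assert(dst != NULL);

	/* Skip the leading "refs/" so it is not duplicated. */
	if (!strncmp(dst, "refs/", 5))
		skip = 5;

	len = strlen(prefix) + strlen(dst + skip) + 1;
	out = malloc(len);
	if (!out)
		return NULL;

	strcpy(out, prefix);
	strcat(out, dst + skip);
	return out;
}
```

Under these assumptions, the default destination `refs/remotes/remote1/*`
becomes `refs/prefetch/remotes/remote1/*`, and `refs/heads/fetched` becomes
`refs/prefetch/heads/fetched`, which is what the updated t7900 expectations
check for.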

* Re: [PATCH 2/5] test-lib: use exact match for test_subcommand
  2021-04-05 13:04 ` [PATCH 2/5] test-lib: use exact match for test_subcommand Derrick Stolee via GitGitGadget
@ 2021-04-05 17:31   ` Eric Sunshine
  2021-04-05 17:43     ` Junio C Hamano
  0 siblings, 1 reply; 72+ messages in thread
From: Eric Sunshine @ 2021-04-05 17:31 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: Git List, Tom Saeger, Junio C Hamano, Derrick Stolee, Derrick Stolee

On Mon, Apr 5, 2021 at 9:04 AM Derrick Stolee via GitGitGadget
<gitgitgadget@gmail.com> wrote:
> The use of 'grep' inside test_subcommand uses general patterns, leading
> to sometimes needing escape characters to avoid incorrect matches.
> Further, some platforms interpret different glob characters differently.

These are regular expression metacharacters, not glob characters. A
more general way to say this might be:

    Furthermore, it can be difficult to know which characters need
    escaping since the actual regular expression language implemented
    by various `grep`s differs between platforms; for instance, some
    may employ pure BRE, whereas others a mix of BRE & ERE.

    Sidestep this difficulty by using `grep -F`...

> Use 'grep -F' to use an exact match. This requires removing escape
> characters from existing callers. Luckily, this is only one test that
> expects refspecs as part of the subcommand.
>
> Reported-by: Eric Sunshine <sunshine@sunshineco.com>
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>

The Reported-by: feels a bit unusual in this context. Perhaps
Helped-by: would be more appropriate.

> diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
> @@ -142,8 +142,8 @@ test_expect_success 'prefetch multiple remotes' '
> -       test_subcommand git fetch remote1 $fetchargs +refs/heads/\\*:refs/prefetch/remote1/\\* <run-prefetch.txt &&
> -       test_subcommand git fetch remote2 $fetchargs +refs/heads/\\*:refs/prefetch/remote2/\\* <run-prefetch.txt &&
> +       test_subcommand git fetch remote1 $fetchargs +refs/heads/*:refs/prefetch/remote1/* <run-prefetch.txt &&
> +       test_subcommand git fetch remote2 $fetchargs +refs/heads/*:refs/prefetch/remote2/* <run-prefetch.txt &&

To be really robust and avoid accidental glob expansion (as unlikely
as it is), you should quote any arguments which contain glob
metacharacters such as "*" rather than supplying them bare like this.

* Re: [PATCH 3/5] refspec: output a refspec item
  2021-04-05 16:57   ` Tom Saeger
@ 2021-04-05 17:40     ` Eric Sunshine
  2021-04-05 17:44       ` Junio C Hamano
  0 siblings, 1 reply; 72+ messages in thread
From: Eric Sunshine @ 2021-04-05 17:40 UTC (permalink / raw)
  To: Tom Saeger
  Cc: Derrick Stolee via GitGitGadget, Git List, Junio C Hamano,
	Derrick Stolee, Derrick Stolee

On Mon, Apr 5, 2021 at 12:58 PM Tom Saeger <tom.saeger@oracle.com> wrote:
> On Mon, Apr 05, 2021 at 01:04:13PM +0000, Derrick Stolee via GitGitGadget wrote:
> > +const char *refspec_item_format(const struct refspec_item *rsi)
> > +{
> > +     static struct strbuf buf = STRBUF_INIT;
> > +
> > +     strbuf_reset(&buf);
>
> is this even needed?

This is needed due to the `static` strbuf declaration (which is easy
to overlook).

> > +     if (rsi->matching)
> > +             return ":";
> > +
> > +     if (rsi->negative)
> > +             strbuf_addch(&buf, '^');
> > +     else if (rsi->force)
> > +             strbuf_addch(&buf, '+');
> > +
> > +     if (rsi->src)
> > +             strbuf_addstr(&buf, rsi->src);
> > +
> > +     if (rsi->dst) {
> > +             strbuf_addch(&buf, ':');
> > +             strbuf_addstr(&buf, rsi->dst);
> > +     }
> > +
> > +     return buf.buf;
>
> should this be strbuf_detach?

In normal circumstances, yes, however, with the `static` strbuf, this
is correct.

However, a more significant question, perhaps, is why this is using a
`static` strbuf in the first place? Does this need to be optimized
because it is on a hot path? If not, then the only obvious reason why
`static` was chosen was that sometimes the function returns a string
literal and sometimes a constructed string. However, that's minor, and
it would feel cleaner to avoid the `static` strbuf altogether by using
strbuf_detach() for the constructed case and xstrdup() for the string
literal case, and making it the caller's responsibility to free the
result. (The comment in the header file would need to be updated to
say as much.)
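
To make the suggestion concrete, a caller-owned variant might look roughly
like the sketch below. This is only an illustration under stated
assumptions: `struct rsi_sketch` and `rsi_sketch_format` are hypothetical
names that mirror just the fields used in the quoted function, and
malloc()/snprintf() stand in for xstrdup()/strbuf_detach().

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the 'struct refspec_item' fields used here. */
struct rsi_sketch {
	int force, matching, negative;
	const char *src, *dst;
};

/*
 * Format 'rsi' into a freshly allocated string. Every return path hands
 * back heap memory, so the caller can always free() the result.
 */
static char *rsi_sketch_format(const struct rsi_sketch *rsi)
{
	const char *flag, *src, *sep, *dst;
	size_t len;
	char *out;

	assert(rsi != NULL);

	if (rsi->matching) {
		/* Duplicate the literal so ownership is uniform. */
		out = malloc(2);
		if (out)
			strcpy(out, ":");
		return out;
	}

	flag = rsi->negative ? "^" : rsi->force ? "+" : "";
	src = rsi->src ? rsi->src : "";
	sep = rsi->dst ? ":" : "";
	dst = rsi->dst ? rsi->dst : "";

	len = strlen(flag) + strlen(src) + strlen(sep) + strlen(dst) + 1;
	out = malloc(len);
	if (out)
		snprintf(out, len, "%s%s%s%s", flag, src, sep, dst);
	return out;
}
```

The cost is that each caller has to hold the pointer in a variable and
free() it, but two results can be alive at the same time without
interfering with each other.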

* Re: [PATCH 2/5] test-lib: use exact match for test_subcommand
  2021-04-05 17:31   ` Eric Sunshine
@ 2021-04-05 17:43     ` Junio C Hamano
  0 siblings, 0 replies; 72+ messages in thread
From: Junio C Hamano @ 2021-04-05 17:43 UTC (permalink / raw)
  To: Eric Sunshine
  Cc: Derrick Stolee via GitGitGadget, Git List, Tom Saeger,
	Derrick Stolee, Derrick Stolee

Eric Sunshine <sunshine@sunshineco.com> writes:

> On Mon, Apr 5, 2021 at 9:04 AM Derrick Stolee via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
>> The use of 'grep' inside test_subcommand uses general patterns, leading
>> to sometimes needing escape characters to avoid incorrect matches.
>> Further, some platforms interpret different glob characters differently.
>
> These are regular expression metacharacters, not glob characters. A
> more general way to say this might be:
>
>     Furthermore, it can be difficult to know which characters need
>     escaping since the actual regular expression language implemented
>     by various `grep`s differs between platforms; for instance, some
>     may employ pure BRE, whereas others a mix of BRE & ERE.
>
>     Sidestep this difficulty by using `grep -F`...
>
>> Use 'grep -F' to use an exact match. This requires removing escape
>> characters from existing callers. Luckily, this is only one test that
>> expects refspecs as part of the subcommand.
>>
>> Reported-by: Eric Sunshine <sunshine@sunshineco.com>
>> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
>
> The Reported-by: feels a bit unusual in this context. Perhaps
> Helped-by: would be more appropriate.
>
>> diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
>> @@ -142,8 +142,8 @@ test_expect_success 'prefetch multiple remotes' '
>> -       test_subcommand git fetch remote1 $fetchargs +refs/heads/\\*:refs/prefetch/remote1/\\* <run-prefetch.txt &&
>> -       test_subcommand git fetch remote2 $fetchargs +refs/heads/\\*:refs/prefetch/remote2/\\* <run-prefetch.txt &&
>> +       test_subcommand git fetch remote1 $fetchargs +refs/heads/*:refs/prefetch/remote1/* <run-prefetch.txt &&
>> +       test_subcommand git fetch remote2 $fetchargs +refs/heads/*:refs/prefetch/remote2/* <run-prefetch.txt &&
>
> To be really robust and avoid accidental glob expansion (as unlikely
> as it is), you should quote any arguments which contain glob
> metacharacters such as "*" rather than supplying them bare like this.

Yup, just enclose the whole refspec inside dq-pair, like

	test_subcommand git fetch remote2 $fetchargs \
		"+refs/heads/*:refs/prefetch/remote2/*" <run-prefetch.txt &&

would be the easiest to read.

Thanks.

* Re: [PATCH 3/5] refspec: output a refspec item
  2021-04-05 17:40     ` Eric Sunshine
@ 2021-04-05 17:44       ` Junio C Hamano
  2021-04-06 11:21         ` Derrick Stolee
  0 siblings, 1 reply; 72+ messages in thread
From: Junio C Hamano @ 2021-04-05 17:44 UTC (permalink / raw)
  To: Eric Sunshine
  Cc: Tom Saeger, Derrick Stolee via GitGitGadget, Git List,
	Derrick Stolee, Derrick Stolee

Eric Sunshine <sunshine@sunshineco.com> writes:

> On Mon, Apr 5, 2021 at 12:58 PM Tom Saeger <tom.saeger@oracle.com> wrote:
>> On Mon, Apr 05, 2021 at 01:04:13PM +0000, Derrick Stolee via GitGitGadget wrote:
>> > +const char *refspec_item_format(const struct refspec_item *rsi)
>> > +{
>> > +     static struct strbuf buf = STRBUF_INIT;
>> > +
>> > +     strbuf_reset(&buf);
>>
>> is this even needed?
>
> This is needed due to the `static` strbuf declaration (which is easy
> to overlook).
>
>> > +     if (rsi->matching)
>> > +             return ":";
>> > +
>> > +     if (rsi->negative)
>> > +             strbuf_addch(&buf, '^');
>> > +     else if (rsi->force)
>> > +             strbuf_addch(&buf, '+');
>> > +
>> > +     if (rsi->src)
>> > +             strbuf_addstr(&buf, rsi->src);
>> > +
>> > +     if (rsi->dst) {
>> > +             strbuf_addch(&buf, ':');
>> > +             strbuf_addstr(&buf, rsi->dst);
>> > +     }
>> > +
>> > +     return buf.buf;
>>
>> should this be strbuf_detach?
>
> In normal circumstances, yes, however, with the `static` strbuf, this
> is correct.
>
> However, a more significant question, perhaps, is why this is using a
> `static` strbuf in the first place? Does this need to be optimized
> because it is on a hot path? If not, then the only obvious reason why
> `static` was chosen was that sometimes the function returns a string
> literal and sometimes a constructed string. However, that's minor, and
> it would feel cleaner to avoid the `static` strbuf altogether by using
> strbuf_detach() for the constructed case and xstrdup() for the string
> literal case, and making it the caller's responsibility to free the
> result. (The comment in the header file would need to be updated to
> say as much.)

Very good suggestion.  That would also make this codepath
thread-safe (I do not offhand know how important that is, though).

Thanks.

* Re: [PATCH 4/5] test-tool: test refspec input/output
  2021-04-05 13:04 ` [PATCH 4/5] test-tool: test refspec input/output Derrick Stolee via GitGitGadget
@ 2021-04-05 17:52   ` Eric Sunshine
  2021-04-06 11:13     ` Derrick Stolee
  2021-04-07  8:54   ` Ævar Arnfjörð Bjarmason
  1 sibling, 1 reply; 72+ messages in thread
From: Eric Sunshine @ 2021-04-05 17:52 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: Git List, Tom Saeger, Junio C Hamano, Derrick Stolee, Derrick Stolee

On Mon, Apr 5, 2021 at 9:04 AM Derrick Stolee via GitGitGadget
<gitgitgadget@gmail.com> wrote:
> Add a new test-helper, 'test-tool refspec', that currently reads stdin
> line-by-line and translates the refspecs using the parsing logic of
> refspec_item_init() and writes them to output.
> [...]
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
> diff --git a/t/helper/test-refspec.c b/t/helper/test-refspec.c
> @@ -0,0 +1,39 @@
> +int cmd__refspec(int argc, const char **argv)
> +{
> +       struct strbuf line = STRBUF_INIT;
> +       [...]
> +       return 0;
> +}

Leaking `strbuf line` here. Yes, I realize that the function is
returning and test-tool exiting immediately after this, so not a big
deal, but it's easy to do this correctly by releasing the strbuf, thus
setting a good precedent for people who might use this as a template
for new test-tool functions they add in the future.

> diff --git a/t/t5511-refspec.sh b/t/t5511-refspec.sh
> @@ -93,4 +93,45 @@ test_refspec fetch "refs/heads/${good}"
> +test_expect_success 'test input/output round trip' '
> +       cat >input <<-\EOF &&
> +               +refs/heads/*:refs/remotes/origin/*
> +               refs/heads/*:refs/remotes/origin/*
> +               refs/heads/main:refs/remotes/frotz/xyzzy
> +               :refs/remotes/frotz/deleteme
> +               ^refs/heads/secrets
> +               refs/heads/secret:refs/heads/translated
> +               refs/heads/secret:heads/translated
> +               refs/heads/secret:remotes/translated
> +               secret:translated
> +               refs/heads/*:remotes/xxy/*
> +               refs/heads*/for-linus:refs/remotes/mine/*
> +               2e36527f23b7f6ae15e6f21ac3b08bf3fed6ee48:refs/heads/fixed
> +               HEAD
> +               @
> +               :
> +       EOF

Over-indented heredoc body. It is customary[1] in this codebase for
the body and EOF to have the same indentation as the command which
starts the heredoc.

> +       cat >expect <<-\EOF &&
> +               +refs/heads/*:refs/remotes/origin/*
> +               refs/heads/*:refs/remotes/origin/*
> +               refs/heads/main:refs/remotes/frotz/xyzzy
> +               :refs/remotes/frotz/deleteme
> +               ^refs/heads/secrets
> +               refs/heads/secret:refs/heads/translated
> +               refs/heads/secret:heads/translated
> +               refs/heads/secret:remotes/translated
> +               secret:translated
> +               refs/heads/*:remotes/xxy/*
> +               refs/heads*/for-linus:refs/remotes/mine/*
> +               2e36527f23b7f6ae15e6f21ac3b08bf3fed6ee48:refs/heads/fixed
> +               HEAD
> +               HEAD
> +               :
> +       EOF

Ditto.

> +       test-tool refspec <input >output &&
> +       test_cmp expect output &&
> +       test-tool refspec --fetch <input >output &&
> +       test_cmp expect output
> +'

[1]: https://lore.kernel.org/git/CAPig+cSBVG0AdyqXH2mZp6Ohrcb8_ec1Mm_vGbQM4zWT_7yYxQ@mail.gmail.com/

* Re: [PATCH 4/5] test-tool: test refspec input/output
  2021-04-05 17:52   ` Eric Sunshine
@ 2021-04-06 11:13     ` Derrick Stolee
  0 siblings, 0 replies; 72+ messages in thread
From: Derrick Stolee @ 2021-04-06 11:13 UTC (permalink / raw)
  To: Eric Sunshine, Derrick Stolee via GitGitGadget
  Cc: Git List, Tom Saeger, Junio C Hamano, Derrick Stolee, Derrick Stolee

On 4/5/2021 1:52 PM, Eric Sunshine wrote:
> On Mon, Apr 5, 2021 at 9:04 AM Derrick Stolee via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
>> Add a new test-helper, 'test-tool refspec', that currently reads stdin
>> line-by-line and translates the refspecs using the parsing logic of
>> refspec_item_init() and writes them to output.
>> [...]
>> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
>> ---
>> diff --git a/t/helper/test-refspec.c b/t/helper/test-refspec.c
>> @@ -0,0 +1,39 @@
>> +int cmd__refspec(int argc, const char **argv)
>> +{
>> +       struct strbuf line = STRBUF_INIT;
>> +       [...]
>> +       return 0;
>> +}
> 
> Leaking `strbuf line` here. Yes, I realize that the function is
> returning and test-tool exiting immediately after this, so not a big
> deal, but it's easy to do this correctly by releasing the strbuf, thus
> setting a good precedent for people who might use this as a template
> for new test-tool functions they add in the future.
> 
>> diff --git a/t/t5511-refspec.sh b/t/t5511-refspec.sh
>> @@ -93,4 +93,45 @@ test_refspec fetch "refs/heads/${good}"
>> +test_expect_success 'test input/output round trip' '
>> +       cat >input <<-\EOF &&
>> +               +refs/heads/*:refs/remotes/origin/*
>> +               refs/heads/*:refs/remotes/origin/*
>> +               refs/heads/main:refs/remotes/frotz/xyzzy
>> +               :refs/remotes/frotz/deleteme
>> +               ^refs/heads/secrets
>> +               refs/heads/secret:refs/heads/translated
>> +               refs/heads/secret:heads/translated
>> +               refs/heads/secret:remotes/translated
>> +               secret:translated
>> +               refs/heads/*:remotes/xxy/*
>> +               refs/heads*/for-linus:refs/remotes/mine/*
>> +               2e36527f23b7f6ae15e6f21ac3b08bf3fed6ee48:refs/heads/fixed
>> +               HEAD
>> +               @
>> +               :
>> +       EOF
> 
> Over-indented heredoc body. It is customary[1] in this codebase for
> the body and EOF to have the same indentation as the command which
> starts the heredoc.

Good catches. Thanks!
-Stolee

* Re: [PATCH 5/5] maintenance: allow custom refspecs during prefetch
  2021-04-05 17:16   ` Tom Saeger
@ 2021-04-06 11:15     ` Derrick Stolee
  0 siblings, 0 replies; 72+ messages in thread
From: Derrick Stolee @ 2021-04-06 11:15 UTC (permalink / raw)
  To: Tom Saeger, Derrick Stolee via GitGitGadget
  Cc: git, gitster, sunshine, Derrick Stolee, Derrick Stolee

On 4/5/2021 1:16 PM, Tom Saeger wrote:
> On Mon, Apr 05, 2021 at 01:04:15PM +0000, Derrick Stolee via GitGitGadget wrote:
>> From: Derrick Stolee <dstolee@microsoft.com>
>> +		strvec_push(&child.args, refspec_item_format(&replace));
> 
> see comment on 3/5, think refspec_item_format is leaking here.
> this code looks fine though.

I will respond to the comments on patch 3, but this is the reason
a static strbuf is used: we can print like this without needing
to store the buffer in a variable and free() it here. Seemed like
an easier-to-use API for a non-critical area of code. I'll
continue the discussion over on that patch thread.

Thanks,
-Stolee

* Re: [PATCH 3/5] refspec: output a refspec item
  2021-04-05 17:44       ` Junio C Hamano
@ 2021-04-06 11:21         ` Derrick Stolee
  2021-04-06 15:23           ` Eric Sunshine
  0 siblings, 1 reply; 72+ messages in thread
From: Derrick Stolee @ 2021-04-06 11:21 UTC (permalink / raw)
  To: Junio C Hamano, Eric Sunshine
  Cc: Tom Saeger, Derrick Stolee via GitGitGadget, Git List,
	Derrick Stolee, Derrick Stolee

On 4/5/2021 1:44 PM, Junio C Hamano wrote:
> Eric Sunshine <sunshine@sunshineco.com> writes:
> 
>> On Mon, Apr 5, 2021 at 12:58 PM Tom Saeger <tom.saeger@oracle.com> wrote:
>>> On Mon, Apr 05, 2021 at 01:04:13PM +0000, Derrick Stolee via GitGitGadget wrote:
>>>> +const char *refspec_item_format(const struct refspec_item *rsi)
>>>> +{
>>>> +     static struct strbuf buf = STRBUF_INIT;
>>>> +
>>>> +     strbuf_reset(&buf);
>>>
>>> is this even needed?
>>
>> This is needed due to the `static` strbuf declaration (which is easy
>> to overlook).
>>
>>>> +     if (rsi->matching)
>>>> +             return ":";
>>>> +
>>>> +     if (rsi->negative)
>>>> +             strbuf_addch(&buf, '^');
>>>> +     else if (rsi->force)
>>>> +             strbuf_addch(&buf, '+');
>>>> +
>>>> +     if (rsi->src)
>>>> +             strbuf_addstr(&buf, rsi->src);
>>>> +
>>>> +     if (rsi->dst) {
>>>> +             strbuf_addch(&buf, ':');
>>>> +             strbuf_addstr(&buf, rsi->dst);
>>>> +     }
>>>> +
>>>> +     return buf.buf;
>>>
>>> should this be strbuf_detach?
>>
>> In normal circumstances, yes, however, with the `static` strbuf, this
>> is correct.
>>
>> However, a more significant question, perhaps, is why this is using a
>> `static` strbuf in the first place? Does this need to be optimized
>> because it is on a hot path? If not, then the only obvious reason why
>> `static` was chosen was that sometimes the function returns a string
>> literal and sometimes a constructed string. However, that's minor, and
>> it would feel cleaner to avoid the `static` strbuf altogether by using
>> strbuf_detach() for the constructed case and xstrdup() for the string
>> literal case, and making it the caller's responsibility to free the
>> result. (The comment in the header file would need to be updated to
>> say as much.)

Yes, we could get around the return of ":" very easily.

> Very good suggestion.  That would also make this codepath
> thread-safe (I do not offhand know how important that is, though).

I was not intending to make this re-entrant/thread safe. The intention
was to make it easy to consume the formatted string into output such
as a printf without needing to store a temporary 'char *' and free() it
afterwards. This ensures that the only lost memory over the life of the
process is at most one buffer. At minimum, these are things that could
be part of the message to justify this design.

So, I'm torn. This seems like a case where there is value in having
the return buffer be "owned" by this method, and the expected
consumers will use the buffer before calling it again. I'm not sure
how important it is to do this the other way.

Would it be sufficient to justify this choice in the commit message
and comment about it in the method declaration? Or is it worth adding
this templating around every caller:

	char *buf = refspec_item_format(rsi);
	...
	<use 'buf'>
	...
	free(buf);

I don't need much convincing to do this, but I hadn't properly
described my opinion before. Just a small nudge would convince me to
do it this way.

Thanks,
-Stolee

* Re: [PATCH 3/5] refspec: output a refspec item
  2021-04-06 11:21         ` Derrick Stolee
@ 2021-04-06 15:23           ` Eric Sunshine
  2021-04-06 16:51             ` Derrick Stolee
  0 siblings, 1 reply; 72+ messages in thread
From: Eric Sunshine @ 2021-04-06 15:23 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: Junio C Hamano, Tom Saeger, Derrick Stolee via GitGitGadget,
	Git List, Derrick Stolee, Derrick Stolee

On Tue, Apr 6, 2021 at 7:21 AM Derrick Stolee <stolee@gmail.com> wrote:
> I was not intending to make this re-entrant/thread safe. The intention
> was to make it easy to consume the formatted string into output such
> as a printf without needing to store a temporary 'char *' and free() it
> afterwards. This ensures that the only lost memory over the life of the
> process is at most one buffer. At minimum, these are things that could
> be part of the message to justify this design.

This has the failing that it won't work if someone calls it twice in
the same printf() or calls it again before even consuming the first
returned value, so this fails:

    printf("foo: %s\nbar: %s\n",
        refspec_item_format(...),
        refspec_item_format(...));

as does this:

    const char *a = refspec_item_format(...);
    const char *b = refspec_item_format(...);

Historically this project would "work around" that problem by using
rotating static buffers in the function, but we've mostly been moving
away from that for several reasons (can't predict how many buffers
will be needed, re-entrancy, etc.).
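
To make the aliasing concrete, here is a minimal sketch in plain C. The
names format_static() and results_alias() are hypothetical stand-ins and
not the code from the patch; the only assumption is that the formatter
returns a pointer into a single static buffer.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for a formatter that owns its result:
 * every call overwrites the same static storage.
 */
static const char *format_static(const char *src, const char *dst)
{
	static char buf[256];

	assert(src && dst);
	snprintf(buf, sizeof(buf), "%s:%s", src, dst);
	return buf;
}

/* Returns 1 when two consecutive results alias the same storage. */
static int results_alias(void)
{
	const char *a = format_static("refs/heads/main", "refs/remotes/origin/main");
	const char *b = format_static("refs/heads/next", "refs/remotes/origin/next");

	/* 'a' now reads as the second refspec; the first result is gone. */
	return a == b;
}
```

Since 'a' and 'b' point at the same bytes, any caller that holds on to the
first result silently sees the second one instead.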

> So, I'm torn. This seems like a case where there is value in having
> the return buffer be "owned" by this method, and the expected
> consumers will use the buffer before calling it again. I'm not sure
> how important it is to do this the other way.

If history is any indication, we'd probably end up moving away from
such an API eventually anyhow.

> Would it be sufficient to justify this choice in the commit message
> and comment about it in the method declaration? Or is it worth adding
> this templating around every caller:
>
>         char *buf = refspec_item_format(rsi);
>         ...
>         <use 'buf'>
>         ...
>         free(buf);

An alternative would be to have the caller pass in a strbuf to be
populated by the function. It doesn't reduce the boilerplate needed by
the caller (still need to create and release the strbuf), but may
avoid some memory allocations. But if this isn't a critical path and
won't likely ever be, then passing in strbuf may be overkill.
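
A rough sketch of that caller-supplied-buffer shape, with a fixed-size
array standing in for a strbuf so the example stays self-contained. The
names format_into() and both_results_survive() are made up for
illustration; the real function would take a 'struct strbuf *' instead.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical formatter that writes into caller-owned storage.
 * A fixed-size array stands in for Git's strbuf here.
 * Returns 0 on success, -1 if the result did not fit.
 */
static int format_into(char *out, size_t outlen,
		       const char *src, const char *dst)
{
	int n;

	assert(out && src && dst);
	n = snprintf(out, outlen, "%s:%s", src, dst);
	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}

/* Two results can be live at once because each has its own buffer. */
static int both_results_survive(void)
{
	char a[128], b[128];

	if (format_into(a, sizeof(a), "refs/heads/main", "refs/remotes/origin/main") ||
	    format_into(b, sizeof(b), "refs/heads/next", "refs/remotes/origin/next"))
		return 0;

	return !strcmp(a, "refs/heads/main:refs/remotes/origin/main") &&
	       !strcmp(b, "refs/heads/next:refs/remotes/origin/next");
}
```

The caller still has to declare and clean up its own storage, which is the
boilerplate mentioned above, but a loop can reuse one buffer across
iterations instead of allocating each time.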

> I don't need much convincing to do this, but I hadn't properly
> described my opinion before. Just a small nudge would convince me to
> do it this way.

For the reasons described above and earlier in the thread, avoiding
the static buffer seems the best course of action.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 3/5] refspec: output a refspec item
  2021-04-06 15:23           ` Eric Sunshine
@ 2021-04-06 16:51             ` Derrick Stolee
  0 siblings, 0 replies; 72+ messages in thread
From: Derrick Stolee @ 2021-04-06 16:51 UTC (permalink / raw)
  To: Eric Sunshine
  Cc: Junio C Hamano, Tom Saeger, Derrick Stolee via GitGitGadget,
	Git List, Derrick Stolee, Derrick Stolee

On 4/6/2021 11:23 AM, Eric Sunshine wrote:
> On Tue, Apr 6, 2021 at 7:21 AM Derrick Stolee <stolee@gmail.com> wrote:
>> I was not intending to make this re-entrant/thread safe. The intention
>> was to make it easy to consume the formatted string into output such
>> as a printf without needing to store a temporary 'char *' and free() it
>> afterwards. This ensures that the only lost memory over the life of the
>> process is at most one buffer. At minimum, these are things that could
>> be part of the message to justify this design.
> 
> This has the failing that it won't work if someone calls it twice in
> the same printf() or calls it again before even consuming the first
> returned value, so this fails:
> 
>     printf("foo: %s\nbar: %s\n",
>         refspec_item_format(...),
>         refspec_item_format(...));
> 
> as does this:
> 
>     const char *a = refspec_item_format(...);
>     const char *b = refspec_item_format(...);
> 
> Historically this project would "work around" that problem by using
> rotating static buffers in the function, but we've mostly been moving
> away from that for several reasons (can't predict how many buffers
> will be needed, re-entrancy, etc.).
> 
>> So, I'm torn. This seems like a case where there is value in having
>> the return buffer be "owned" by this method, and the expected
>> consumers will use the buffer before calling it again. I'm not sure
>> how important it is to do this the other way.
> 
> If history is any indication, we'd probably end up moving away from
> such an API eventually anyhow.
> 
>> Would it be sufficient to justify this choice in the commit message
>> and comment about it in the method declaration? Or is it worth adding
>> this templating around every caller:
>>
>>         char *buf = refspec_item_format(rsi);
>>         ...
>>         <use 'buf'>
>>         ...
>>         free(buf);
> 
> An alternative would be to have the caller pass in a strbuf to be
> populated by the function. It doesn't reduce the boilerplate needed by
> the caller (still need to create and release the strbuf), but may
> avoid some memory allocations. But if this isn't a critical path and
> won't likely ever be, then passing in strbuf may be overkill.
> 
>> I don't need much convincing to do this, but I hadn't properly
>> described my opinion before. Just a small nudge would convince me to
>> do it this way.
> 
> For the reasons described above and earlier in the thread, avoiding
> the static buffer seems the best course of action.

OK, convinced. I'll return a string that must be freed in my
next version. Thanks!

-Stolee

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [PATCH v2 0/5] Maintenance: adapt custom refspecs
  2021-04-05 13:04 [PATCH 0/5] Maintenance: adapt custom refspecs Derrick Stolee via GitGitGadget
                   ` (4 preceding siblings ...)
  2021-04-05 13:04 ` [PATCH 5/5] maintenance: allow custom refspecs during prefetch Derrick Stolee via GitGitGadget
@ 2021-04-06 18:47 ` Derrick Stolee via GitGitGadget
  2021-04-06 18:47   ` [PATCH v2 1/5] maintenance: simplify prefetch logic Derrick Stolee via GitGitGadget
                     ` (5 more replies)
  5 siblings, 6 replies; 72+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2021-04-06 18:47 UTC (permalink / raw)
  To: git; +Cc: tom.saeger, gitster, sunshine, Derrick Stolee, Derrick Stolee

Tom Saeger rightly pointed out [1] that the prefetch task ignores custom
refspecs. This can lead to downloading more data than requested, and it
doesn't even help the future foreground fetches that use that custom
refspec.

[1]
https://lore.kernel.org/git/20210401184914.qmr7jhjbhp2mt3h6@dhcp-10-154-148-175.vpn.oracle.com/

This series fixes this problem by carefully replacing the start of each
refspec's destination with "refs/prefetch/". If the destination already
starts with "refs/", then that is replaced. Otherwise "refs/prefetch/" is
just prepended.
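
As a rough illustration of that rule only (this is not the code in
builtin/gc.c, and the helper name prefetch_dst() is hypothetical), the
destination rewrite can be sketched in plain C as:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Sketch of the destination rewrite described above (helper name is
 * hypothetical). Returns a malloc()ed string the caller must free(),
 * or NULL on allocation failure.
 */
static char *prefetch_dst(const char *dst)
{
	static const char refs[] = "refs/";
	static const char prefix[] = "refs/prefetch/";
	size_t skip = 0;
	char *out;

	assert(dst);

	/* A leading "refs/" is replaced rather than duplicated. */
	if (!strncmp(dst, refs, strlen(refs)))
		skip = strlen(refs);

	out = malloc(strlen(prefix) + strlen(dst + skip) + 1);
	if (!out)
		return NULL;

	strcpy(out, prefix);
	strcat(out, dst + skip);
	return out;
}
```

With this sketch, "refs/remotes/origin/*" becomes
"refs/prefetch/remotes/origin/*" and "heads/translated" becomes
"refs/prefetch/heads/translated".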

In order to accomplish this safely, a new refspec_item_format() method is
created and tested.

Patch 1 is just a preparation patch that makes the code simpler (and in
hindsight it should have been written this way from the start).

Patch 2 is a simplification of test_subcommand that removes the need for
escaping glob characters. Thanks, Eric Sunshine, for the tip of why my tests
were failing on FreeBSD.

Patches 3-4 add refspec_item_format().

Patch 5 finally modifies the logic in the prefetch task to translate these
refspecs.


Updates in V2
=============

Thanks for the close eye on this series. I appreciate the recommendations,
and I believe I have responded to all of them:

 * Fixed typos.
 * Made refspec_item_format() re-entrant. Consumers must free the buffer.
 * Cleaned up style (quoting and tabbing).
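
For readers skimming the thread, the ownership change in the second bullet
looks roughly like the sketch below. It uses a trimmed-down, hypothetical
'struct item' and malloc() in place of 'struct refspec_item' and strbuf,
so treat it as an approximation of the pattern and not the actual patch.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, trimmed-down stand-in for Git's 'struct refspec_item'. */
struct item {
	int force, negative, matching;
	const char *src, *dst;
};

/* Returns a malloc()ed string that the caller must free(), or NULL. */
static char *item_format(const struct item *it)
{
	const char *mark, *src, *sep, *dst;
	size_t len;
	char *out;

	assert(it);
	mark = it->negative ? "^" : it->force ? "+" : "";
	src = it->src ? it->src : "";
	sep = it->dst ? ":" : "";
	dst = it->dst ? it->dst : "";

	/* The "matching" refspec is always the bare string ":". */
	if (it->matching) {
		mark = src = dst = "";
		sep = ":";
	}

	len = strlen(mark) + strlen(src) + strlen(sep) + strlen(dst) + 1;
	out = malloc(len);
	if (out)
		snprintf(out, len, "%s%s%s%s", mark, src, sep, dst);
	return out;
}

/* Each result is independent, so two can be live at the same time. */
static int two_live_results(void)
{
	struct item x = { 1, 0, 0, "refs/heads/*", "refs/remotes/origin/*" };
	struct item y = { 0, 1, 0, "refs/heads/secrets", NULL };
	char *a = item_format(&x), *b = item_format(&y);
	int ok = a && b &&
		 !strcmp(a, "+refs/heads/*:refs/remotes/origin/*") &&
		 !strcmp(b, "^refs/heads/secrets");

	free(a);
	free(b);
	return ok;
}
```

The cost is one free() per call site, in exchange for results that never
overwrite each other.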

Thanks, -Stolee

Derrick Stolee (5):
  maintenance: simplify prefetch logic
  test-lib: use exact match for test_subcommand
  refspec: output a refspec item
  test-tool: test refspec input/output
  maintenance: allow custom refspecs during prefetch

 Documentation/git-maintenance.txt |  3 +-
 Makefile                          |  1 +
 builtin/gc.c                      | 66 ++++++++++++++++++++-----------
 refspec.c                         | 23 +++++++++++
 refspec.h                         |  2 +
 t/helper/test-refspec.c           | 44 +++++++++++++++++++++
 t/helper/test-tool.c              |  1 +
 t/helper/test-tool.h              |  1 +
 t/t5511-refspec.sh                | 41 +++++++++++++++++++
 t/t7900-maintenance.sh            | 43 +++++++++++++++++---
 t/test-lib-functions.sh           |  4 +-
 11 files changed, 195 insertions(+), 34 deletions(-)
 create mode 100644 t/helper/test-refspec.c


base-commit: 2e36527f23b7f6ae15e6f21ac3b08bf3fed6ee48
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-924%2Fderrickstolee%2Fmaintenance%2Frefspec-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-924/derrickstolee/maintenance/refspec-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/924

Range-diff vs v1:

 1:  3a94ff80657c ! 1:  5aa0cb06c3f2 maintenance: simplify prefetch logic
     @@ Commit message
          The previous logic filled a string list with the names of each remote,
          but instead we could simply run the appropriate 'git fetch' data
          directly in the remote iterator. Do this for reduced code size, but also
     -    becuase it sets up an upcoming change to use the remote's refspec. This
     +    because it sets up an upcoming change to use the remote's refspec. This
          data is accessible from the 'struct remote' data that is now accessible
          in fetch_remote().
      
 2:  2b74889c2a32 ! 2:  d58a3e042ee8 test-lib: use exact match for test_subcommand
     @@ Commit message
      
          The use of 'grep' inside test_subcommand uses general patterns, leading
          to sometimes needing escape characters to avoid incorrect matches.
     -    Further, some platforms interpret different glob characters differently.
     +    Further, some platforms interpret regular expression metacharacters
     +    differently. Furthermore, it can be difficult to know which characters
     +    need escaping since the actual regular expression language implemented
     +    by various `grep`s differs between platforms; for instance, some may
     +    employ pure BRE, whereas others a mix of BRE & ERE.
      
     -    Use 'grep -F' to use an exact match. This requires removing escape
     -    characters from existing callers. Luckily, this is only one test that
     -    expects refspecs as part of the subcommand.
     +    Sidestep this difficulty by using `grep -F` to use an exact match. This
     +    requires removing escape characters from existing callers. Luckily,
     +    this is only one test that expects refspecs as part of the subcommand.
      
     -    Reported-by: Eric Sunshine <sunshine@sunshineco.com>
     +    Helped-by: Eric Sunshine <sunshine@sunshineco.com>
          Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
      
       ## t/t7900-maintenance.sh ##
     @@ t/t7900-maintenance.sh: test_expect_success 'prefetch multiple remotes' '
       	fetchargs="--prune --no-tags --no-write-fetch-head --recurse-submodules=no --refmap= --quiet" &&
      -	test_subcommand git fetch remote1 $fetchargs +refs/heads/\\*:refs/prefetch/remote1/\\* <run-prefetch.txt &&
      -	test_subcommand git fetch remote2 $fetchargs +refs/heads/\\*:refs/prefetch/remote2/\\* <run-prefetch.txt &&
     -+	test_subcommand git fetch remote1 $fetchargs +refs/heads/*:refs/prefetch/remote1/* <run-prefetch.txt &&
     -+	test_subcommand git fetch remote2 $fetchargs +refs/heads/*:refs/prefetch/remote2/* <run-prefetch.txt &&
     ++	test_subcommand git fetch remote1 $fetchargs "+refs/heads/*:refs/prefetch/remote1/*" <run-prefetch.txt &&
     ++	test_subcommand git fetch remote2 $fetchargs "+refs/heads/*:refs/prefetch/remote2/*" <run-prefetch.txt &&
       	test_path_is_missing .git/refs/remotes &&
       	git log prefetch/remote1/one &&
       	git log prefetch/remote2/two &&
 3:  e10007e1cf8f ! 3:  96388d949b98 refspec: output a refspec item
     @@ refspec.c: void refspec_item_clear(struct refspec_item *item)
       	item->exact_sha1 = 0;
       }
       
     -+const char *refspec_item_format(const struct refspec_item *rsi)
     ++char *refspec_item_format(const struct refspec_item *rsi)
      +{
     -+	static struct strbuf buf = STRBUF_INIT;
     -+
     -+	strbuf_reset(&buf);
     ++	struct strbuf buf = STRBUF_INIT;
      +
      +	if (rsi->matching)
     -+		return ":";
     ++		return xstrdup(":");
      +
      +	if (rsi->negative)
      +		strbuf_addch(&buf, '^');
     @@ refspec.c: void refspec_item_clear(struct refspec_item *item)
      +		strbuf_addstr(&buf, rsi->dst);
      +	}
      +
     -+	return buf.buf;
     ++	return strbuf_detach(&buf, NULL);
      +}
      +
       void refspec_init(struct refspec *rs, int fetch)
     @@ refspec.h: int refspec_item_init(struct refspec_item *item, const char *refspec,
       void refspec_item_init_or_die(struct refspec_item *item, const char *refspec,
       			      int fetch);
       void refspec_item_clear(struct refspec_item *item);
     -+/*
     -+ * Output a given refspec item to a string.
     -+ */
     -+const char *refspec_item_format(const struct refspec_item *rsi);
     ++char *refspec_item_format(const struct refspec_item *rsi);
      +
       void refspec_init(struct refspec *rs, int fetch);
       void refspec_append(struct refspec *rs, const char *refspec);
 4:  c8d1de06f844 ! 4:  bf296282323a test-tool: test refspec input/output
     @@ t/helper/test-refspec.c (new)
      +
      +	while (strbuf_getline(&line, stdin) != EOF) {
      +		struct refspec_item rsi;
     ++		char *buf;
      +
      +		if (!refspec_item_init(&rsi, line.buf, fetch)) {
      +			printf("failed to parse %s\n", line.buf);
      +			continue;
      +		}
      +
     -+		printf("%s\n", refspec_item_format(&rsi));
     ++		buf = refspec_item_format(&rsi);
     ++		printf("%s\n", buf);
     ++		free(buf);
     ++
      +		refspec_item_clear(&rsi);
      +	}
      +
     ++	strbuf_release(&line);
      +	return 0;
      +}
      
     @@ t/t5511-refspec.sh: test_refspec fetch "refs/heads/${good}"
       
      +test_expect_success 'test input/output round trip' '
      +	cat >input <<-\EOF &&
     -+		+refs/heads/*:refs/remotes/origin/*
     -+		refs/heads/*:refs/remotes/origin/*
     -+		refs/heads/main:refs/remotes/frotz/xyzzy
     -+		:refs/remotes/frotz/deleteme
     -+		^refs/heads/secrets
     -+		refs/heads/secret:refs/heads/translated
     -+		refs/heads/secret:heads/translated
     -+		refs/heads/secret:remotes/translated
     -+		secret:translated
     -+		refs/heads/*:remotes/xxy/*
     -+		refs/heads*/for-linus:refs/remotes/mine/*
     -+		2e36527f23b7f6ae15e6f21ac3b08bf3fed6ee48:refs/heads/fixed
     -+		HEAD
     -+		@
     -+		:
     ++	+refs/heads/*:refs/remotes/origin/*
     ++	refs/heads/*:refs/remotes/origin/*
     ++	refs/heads/main:refs/remotes/frotz/xyzzy
     ++	:refs/remotes/frotz/deleteme
     ++	^refs/heads/secrets
     ++	refs/heads/secret:refs/heads/translated
     ++	refs/heads/secret:heads/translated
     ++	refs/heads/secret:remotes/translated
     ++	secret:translated
     ++	refs/heads/*:remotes/xxy/*
     ++	refs/heads*/for-linus:refs/remotes/mine/*
     ++	2e36527f23b7f6ae15e6f21ac3b08bf3fed6ee48:refs/heads/fixed
     ++	HEAD
     ++	@
     ++	:
      +	EOF
      +	cat >expect <<-\EOF &&
     -+		+refs/heads/*:refs/remotes/origin/*
     -+		refs/heads/*:refs/remotes/origin/*
     -+		refs/heads/main:refs/remotes/frotz/xyzzy
     -+		:refs/remotes/frotz/deleteme
     -+		^refs/heads/secrets
     -+		refs/heads/secret:refs/heads/translated
     -+		refs/heads/secret:heads/translated
     -+		refs/heads/secret:remotes/translated
     -+		secret:translated
     -+		refs/heads/*:remotes/xxy/*
     -+		refs/heads*/for-linus:refs/remotes/mine/*
     -+		2e36527f23b7f6ae15e6f21ac3b08bf3fed6ee48:refs/heads/fixed
     -+		HEAD
     -+		HEAD
     -+		:
     ++	+refs/heads/*:refs/remotes/origin/*
     ++	refs/heads/*:refs/remotes/origin/*
     ++	refs/heads/main:refs/remotes/frotz/xyzzy
     ++	:refs/remotes/frotz/deleteme
     ++	^refs/heads/secrets
     ++	refs/heads/secret:refs/heads/translated
     ++	refs/heads/secret:heads/translated
     ++	refs/heads/secret:remotes/translated
     ++	secret:translated
     ++	refs/heads/*:remotes/xxy/*
     ++	refs/heads*/for-linus:refs/remotes/mine/*
     ++	2e36527f23b7f6ae15e6f21ac3b08bf3fed6ee48:refs/heads/fixed
     ++	HEAD
     ++	HEAD
     ++	:
      +	EOF
      +	test-tool refspec <input >output &&
      +	test_cmp expect output &&
 5:  7f6c127dac48 ! 5:  9592224e3d42 maintenance: allow custom refspecs during prefetch
     @@ builtin/gc.c: static int fetch_remote(struct remote *remote, void *cbdata)
      +		struct refspec_item *rsi = &remote->fetch.items[i];
      +		struct strbuf new_dst = STRBUF_INIT;
      +		size_t ignore_len = 0;
     ++		char *replace_string;
      +
      +		if (rsi->negative) {
      +			strvec_push(&child.args, remote->fetch.raw[i]);
     @@ builtin/gc.c: static int fetch_remote(struct remote *remote, void *cbdata)
      +		free(replace.dst);
      +		replace.dst = strbuf_detach(&new_dst, NULL);
      +
     -+		strvec_push(&child.args, refspec_item_format(&replace));
     ++		replace_string = refspec_item_format(&replace);
     ++		strvec_push(&child.args, replace_string);
     ++		free(replace_string);
      +
      +		refspec_item_clear(&replace);
      +	}
     @@ t/t7900-maintenance.sh: test_expect_success 'prefetch multiple remotes' '
       	test_commit -C clone2 two &&
       	GIT_TRACE2_EVENT="$(pwd)/run-prefetch.txt" git maintenance run --task=prefetch 2>/dev/null &&
       	fetchargs="--prune --no-tags --no-write-fetch-head --recurse-submodules=no --refmap= --quiet" &&
     --	test_subcommand git fetch remote1 $fetchargs +refs/heads/*:refs/prefetch/remote1/* <run-prefetch.txt &&
     --	test_subcommand git fetch remote2 $fetchargs +refs/heads/*:refs/prefetch/remote2/* <run-prefetch.txt &&
     -+	test_subcommand git fetch remote1 $fetchargs +refs/heads/*:refs/prefetch/remotes/remote1/* <run-prefetch.txt &&
     -+	test_subcommand git fetch remote2 $fetchargs +refs/heads/*:refs/prefetch/remotes/remote2/* <run-prefetch.txt &&
     +-	test_subcommand git fetch remote1 $fetchargs "+refs/heads/*:refs/prefetch/remote1/*" <run-prefetch.txt &&
     +-	test_subcommand git fetch remote2 $fetchargs "+refs/heads/*:refs/prefetch/remote2/*" <run-prefetch.txt &&
     ++	test_subcommand git fetch remote1 $fetchargs "+refs/heads/*:refs/prefetch/remotes/remote1/*" <run-prefetch.txt &&
     ++	test_subcommand git fetch remote2 $fetchargs "+refs/heads/*:refs/prefetch/remotes/remote2/*" <run-prefetch.txt &&
       	test_path_is_missing .git/refs/remotes &&
      -	git log prefetch/remote1/one &&
      -	git log prefetch/remote2/two &&
     @@ t/t7900-maintenance.sh: test_expect_success 'prefetch multiple remotes' '
      +	git -C clone1 branch -f special/secret/not-fetched HEAD &&
      +
      +	# create multiple refspecs for remote1
     -+	git config --add remote.remote1.fetch +refs/heads/special/fetched:refs/heads/fetched &&
     -+	git config --add remote.remote1.fetch ^refs/heads/special/secret/not-fetched &&
     ++	git config --add remote.remote1.fetch "+refs/heads/special/fetched:refs/heads/fetched" &&
     ++	git config --add remote.remote1.fetch "^refs/heads/special/secret/not-fetched" &&
      +
      +	GIT_TRACE2_EVENT="$(pwd)/prefetch-refspec.txt" git maintenance run --task=prefetch 2>/dev/null &&
      +
     @@ t/t7900-maintenance.sh: test_expect_success 'prefetch multiple remotes' '
      +	rs2="+refs/heads/special/fetched:refs/prefetch/heads/fetched" &&
      +	rs3="^refs/heads/special/secret/not-fetched" &&
      +
     -+	test_subcommand git fetch remote1 $fetchargs $rs1 $rs2 $rs3 <prefetch-refspec.txt &&
     -+	test_subcommand git fetch remote2 $fetchargs +refs/heads/*:refs/prefetch/remotes/remote2/* <prefetch-refspec.txt &&
     ++	test_subcommand git fetch remote1 $fetchargs "$rs1" "$rs2" "$rs3" <prefetch-refspec.txt &&
     ++	test_subcommand git fetch remote2 $fetchargs "+refs/heads/*:refs/prefetch/remotes/remote2/*" <prefetch-refspec.txt &&
      +
      +	# first refspec is overridden by second
      +	test_must_fail git rev-parse refs/prefetch/special/fetched &&

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [PATCH v2 1/5] maintenance: simplify prefetch logic
  2021-04-06 18:47 ` [PATCH v2 0/5] Maintenance: adapt custom refspecs Derrick Stolee via GitGitGadget
@ 2021-04-06 18:47   ` Derrick Stolee via GitGitGadget
  2021-04-07 23:23     ` Emily Shaffer
  2021-04-06 18:47   ` [PATCH v2 2/5] test-lib: use exact match for test_subcommand Derrick Stolee via GitGitGadget
                     ` (4 subsequent siblings)
  5 siblings, 1 reply; 72+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2021-04-06 18:47 UTC (permalink / raw)
  To: git
  Cc: tom.saeger, gitster, sunshine, Derrick Stolee, Derrick Stolee,
	Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

The previous logic filled a string list with the names of each remote,
but instead we could simply run the appropriate 'git fetch' data
directly in the remote iterator. Do this for reduced code size, but also
because it sets up an upcoming change to use the remote's refspec. This
data is accessible from the 'struct remote' data that is now accessible
in fetch_remote().

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 builtin/gc.c | 33 ++++++++-------------------------
 1 file changed, 8 insertions(+), 25 deletions(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index ef7226d7bca4..fa8128de9ae1 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -873,55 +873,38 @@ static int maintenance_task_commit_graph(struct maintenance_run_opts *opts)
 	return 0;
 }
 
-static int fetch_remote(const char *remote, struct maintenance_run_opts *opts)
+static int fetch_remote(struct remote *remote, void *cbdata)
 {
+	struct maintenance_run_opts *opts = cbdata;
 	struct child_process child = CHILD_PROCESS_INIT;
 
 	child.git_cmd = 1;
-	strvec_pushl(&child.args, "fetch", remote, "--prune", "--no-tags",
+	strvec_pushl(&child.args, "fetch", remote->name, "--prune", "--no-tags",
 		     "--no-write-fetch-head", "--recurse-submodules=no",
 		     "--refmap=", NULL);
 
 	if (opts->quiet)
 		strvec_push(&child.args, "--quiet");
 
-	strvec_pushf(&child.args, "+refs/heads/*:refs/prefetch/%s/*", remote);
+	strvec_pushf(&child.args, "+refs/heads/*:refs/prefetch/%s/*", remote->name);
 
 	return !!run_command(&child);
 }
 
-static int append_remote(struct remote *remote, void *cbdata)
-{
-	struct string_list *remotes = (struct string_list *)cbdata;
-
-	string_list_append(remotes, remote->name);
-	return 0;
-}
-
 static int maintenance_task_prefetch(struct maintenance_run_opts *opts)
 {
-	int result = 0;
-	struct string_list_item *item;
-	struct string_list remotes = STRING_LIST_INIT_DUP;
-
 	git_config_set_multivar_gently("log.excludedecoration",
 					"refs/prefetch/",
 					"refs/prefetch/",
 					CONFIG_FLAGS_FIXED_VALUE |
 					CONFIG_FLAGS_MULTI_REPLACE);
 
-	if (for_each_remote(append_remote, &remotes)) {
-		error(_("failed to fill remotes"));
-		result = 1;
-		goto cleanup;
+	if (for_each_remote(fetch_remote, opts)) {
+		error(_("failed to prefetch remotes"));
+		return 1;
 	}
 
-	for_each_string_list_item(item, &remotes)
-		result |= fetch_remote(item->string, opts);
-
-cleanup:
-	string_list_clear(&remotes, 0);
-	return result;
+	return 0;
 }
 
 static int maintenance_task_gc(struct maintenance_run_opts *opts)
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v2 2/5] test-lib: use exact match for test_subcommand
  2021-04-06 18:47 ` [PATCH v2 0/5] Maintenance: adapt custom refspecs Derrick Stolee via GitGitGadget
  2021-04-06 18:47   ` [PATCH v2 1/5] maintenance: simplify prefetch logic Derrick Stolee via GitGitGadget
@ 2021-04-06 18:47   ` Derrick Stolee via GitGitGadget
  2021-04-06 18:47   ` [PATCH v2 3/5] refspec: output a refspec item Derrick Stolee via GitGitGadget
                     ` (3 subsequent siblings)
  5 siblings, 0 replies; 72+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2021-04-06 18:47 UTC (permalink / raw)
  To: git
  Cc: tom.saeger, gitster, sunshine, Derrick Stolee, Derrick Stolee,
	Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

The use of 'grep' inside test_subcommand uses general patterns, leading
to sometimes needing escape characters to avoid incorrect matches.
Further, some platforms interpret regular expression metacharacters
differently. Furthermore, it can be difficult to know which characters
need escaping since the actual regular expression language implemented
by various `grep`s differs between platforms; for instance, some may
employ pure BRE, whereas others a mix of BRE & ERE.

Sidestep this difficulty by using `grep -F` to use an exact match. This
requires removing escape characters from existing callers. Luckily,
this is only one test that expects refspecs as part of the subcommand.

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 t/t7900-maintenance.sh  | 4 ++--
 t/test-lib-functions.sh | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index 2412d8c5c006..37eed6ed3aa3 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -142,8 +142,8 @@ test_expect_success 'prefetch multiple remotes' '
 	test_commit -C clone2 two &&
 	GIT_TRACE2_EVENT="$(pwd)/run-prefetch.txt" git maintenance run --task=prefetch 2>/dev/null &&
 	fetchargs="--prune --no-tags --no-write-fetch-head --recurse-submodules=no --refmap= --quiet" &&
-	test_subcommand git fetch remote1 $fetchargs +refs/heads/\\*:refs/prefetch/remote1/\\* <run-prefetch.txt &&
-	test_subcommand git fetch remote2 $fetchargs +refs/heads/\\*:refs/prefetch/remote2/\\* <run-prefetch.txt &&
+	test_subcommand git fetch remote1 $fetchargs "+refs/heads/*:refs/prefetch/remote1/*" <run-prefetch.txt &&
+	test_subcommand git fetch remote2 $fetchargs "+refs/heads/*:refs/prefetch/remote2/*" <run-prefetch.txt &&
 	test_path_is_missing .git/refs/remotes &&
 	git log prefetch/remote1/one &&
 	git log prefetch/remote2/two &&
diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index 6348e8d7339c..a5915dec22df 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -1652,9 +1652,9 @@ test_subcommand () {
 
 	if test -n "$negate"
 	then
-		! grep "\[$expr\]"
+		! grep -F "[$expr]"
 	else
-		grep "\[$expr\]"
+		grep -F "[$expr]"
 	fi
 }
 
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v2 3/5] refspec: output a refspec item
  2021-04-06 18:47 ` [PATCH v2 0/5] Maintenance: adapt custom refspecs Derrick Stolee via GitGitGadget
  2021-04-06 18:47   ` [PATCH v2 1/5] maintenance: simplify prefetch logic Derrick Stolee via GitGitGadget
  2021-04-06 18:47   ` [PATCH v2 2/5] test-lib: use exact match for test_subcommand Derrick Stolee via GitGitGadget
@ 2021-04-06 18:47   ` Derrick Stolee via GitGitGadget
  2021-04-06 18:47   ` [PATCH v2 4/5] test-tool: test refspec input/output Derrick Stolee via GitGitGadget
                     ` (2 subsequent siblings)
  5 siblings, 0 replies; 72+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2021-04-06 18:47 UTC (permalink / raw)
  To: git
  Cc: tom.saeger, gitster, sunshine, Derrick Stolee, Derrick Stolee,
	Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

Add a new method, refspec_item_format(), that takes a 'struct
refspec_item' pointer as input and returns a string for how that refspec
item should be written to Git's config or a subcommand, such as 'git
fetch'.

There are several subtleties regarding special-case refspecs that can
occur and are represented in t5511-refspec.sh. These cases will be
explored in new tests in the following change. It requires adding a new
test helper in order to test this format directly, so that is saved for
a separate change to keep this one focused on the logic of the format
method.

A future change will consume this method when translating refspecs in
the 'prefetch' task of the 'git maintenance' builtin.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 refspec.c | 23 +++++++++++++++++++++++
 refspec.h |  2 ++
 2 files changed, 25 insertions(+)

diff --git a/refspec.c b/refspec.c
index e3d852c0bfec..e79cde3c58be 100644
--- a/refspec.c
+++ b/refspec.c
@@ -180,6 +180,29 @@ void refspec_item_clear(struct refspec_item *item)
 	item->exact_sha1 = 0;
 }
 
+char *refspec_item_format(const struct refspec_item *rsi)
+{
+	struct strbuf buf = STRBUF_INIT;
+
+	if (rsi->matching)
+		return xstrdup(":");
+
+	if (rsi->negative)
+		strbuf_addch(&buf, '^');
+	else if (rsi->force)
+		strbuf_addch(&buf, '+');
+
+	if (rsi->src)
+		strbuf_addstr(&buf, rsi->src);
+
+	if (rsi->dst) {
+		strbuf_addch(&buf, ':');
+		strbuf_addstr(&buf, rsi->dst);
+	}
+
+	return strbuf_detach(&buf, NULL);
+}
+
 void refspec_init(struct refspec *rs, int fetch)
 {
 	memset(rs, 0, sizeof(*rs));
diff --git a/refspec.h b/refspec.h
index 8b79891d3218..9f2ddc7949a1 100644
--- a/refspec.h
+++ b/refspec.h
@@ -56,6 +56,8 @@ int refspec_item_init(struct refspec_item *item, const char *refspec,
 void refspec_item_init_or_die(struct refspec_item *item, const char *refspec,
 			      int fetch);
 void refspec_item_clear(struct refspec_item *item);
+char *refspec_item_format(const struct refspec_item *rsi);
+
 void refspec_init(struct refspec *rs, int fetch);
 void refspec_append(struct refspec *rs, const char *refspec);
 __attribute__((format (printf,2,3)))
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v2 4/5] test-tool: test refspec input/output
  2021-04-06 18:47 ` [PATCH v2 0/5] Maintenance: adapt custom refspecs Derrick Stolee via GitGitGadget
                     ` (2 preceding siblings ...)
  2021-04-06 18:47   ` [PATCH v2 3/5] refspec: output a refspec item Derrick Stolee via GitGitGadget
@ 2021-04-06 18:47   ` Derrick Stolee via GitGitGadget
  2021-04-07 23:08     ` Josh Steadmon
  2021-04-07 23:26     ` Emily Shaffer
  2021-04-06 18:47   ` [PATCH v2 5/5] maintenance: allow custom refspecs during prefetch Derrick Stolee via GitGitGadget
  2021-04-10  2:03   ` [PATCH v3 0/3] Maintenance: adapt custom refspecs Derrick Stolee via GitGitGadget
  5 siblings, 2 replies; 72+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2021-04-06 18:47 UTC (permalink / raw)
  To: git
  Cc: tom.saeger, gitster, sunshine, Derrick Stolee, Derrick Stolee,
	Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

Add a new test-helper, 'test-tool refspec', that currently reads stdin
line-by-line and translates the refspecs using the parsing logic of
refspec_item_init() and writes them to output.

Create a test in t5511-refspec.sh that uses this helper to test several
known special cases. This includes all of the special properties of the
'struct refspec_item', including:

 * force: The refspec starts with '+'.
 * pattern: Each side of the refspec has a glob character ('*').
 * matching: The refspec is simply the string ":".
 * exact_sha1: The 'src' string is a 40-character hex string.
 * negative: The refspec starts with '^' and 'dst' is NULL.

While the exact_sha1 property doesn't require special logic in
refspec_item_format, it is still tested here for completeness.

There is also the special-case refspec "@" which translates to "HEAD".

Note that if a refspec does not start with "refs/", then that is not
incorporated as part of the 'struct refspec_item'. This behavior is
confirmed by these tests. These refspecs still work in the wild because
the refs layer interprets them appropriately as branches, prepending
"refs/" or "refs/heads/" as necessary. I spent some time attempting to
insert these prefixes explicitly in parse_refspec(), but there are
several subtleties I was unable to overcome. If such a change were to be
made, then this new test in t5511-refspec.sh will need to be updated
with new output. For example, the input lines ending with "translated"
are designed to demonstrate these subtleties.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 Makefile                |  1 +
 t/helper/test-refspec.c | 44 +++++++++++++++++++++++++++++++++++++++++
 t/helper/test-tool.c    |  1 +
 t/helper/test-tool.h    |  1 +
 t/t5511-refspec.sh      | 41 ++++++++++++++++++++++++++++++++++++++
 5 files changed, 88 insertions(+)
 create mode 100644 t/helper/test-refspec.c

diff --git a/Makefile b/Makefile
index a6a73c574191..f858c9f25976 100644
--- a/Makefile
+++ b/Makefile
@@ -734,6 +734,7 @@ TEST_BUILTINS_OBJS += test-reach.o
 TEST_BUILTINS_OBJS += test-read-cache.o
 TEST_BUILTINS_OBJS += test-read-graph.o
 TEST_BUILTINS_OBJS += test-read-midx.o
+TEST_BUILTINS_OBJS += test-refspec.o
 TEST_BUILTINS_OBJS += test-ref-store.o
 TEST_BUILTINS_OBJS += test-regex.o
 TEST_BUILTINS_OBJS += test-repository.o
diff --git a/t/helper/test-refspec.c b/t/helper/test-refspec.c
new file mode 100644
index 000000000000..b06735ded208
--- /dev/null
+++ b/t/helper/test-refspec.c
@@ -0,0 +1,44 @@
+#include "cache.h"
+#include "parse-options.h"
+#include "refspec.h"
+#include "strbuf.h"
+#include "test-tool.h"
+
+static const char * const refspec_usage[] = {
+	N_("test-tool refspec [--fetch]"),
+	NULL
+};
+
+int cmd__refspec(int argc, const char **argv)
+{
+	struct strbuf line = STRBUF_INIT;
+	int fetch = 0;
+
+	struct option refspec_options [] = {
+		OPT_BOOL(0, "fetch", &fetch,
+			 N_("enable the 'fetch' option for parsing refspecs")),
+		OPT_END()
+	};
+
+	argc = parse_options(argc, argv, NULL, refspec_options,
+			     refspec_usage, 0);
+
+	while (strbuf_getline(&line, stdin) != EOF) {
+		struct refspec_item rsi;
+		char *buf;
+
+		if (!refspec_item_init(&rsi, line.buf, fetch)) {
+			printf("failed to parse %s\n", line.buf);
+			continue;
+		}
+
+		buf = refspec_item_format(&rsi);
+		printf("%s\n", buf);
+		free(buf);
+
+		refspec_item_clear(&rsi);
+	}
+
+	strbuf_release(&line);
+	return 0;
+}
diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
index 287aa6002307..f534ad1731a9 100644
--- a/t/helper/test-tool.c
+++ b/t/helper/test-tool.c
@@ -55,6 +55,7 @@ static struct test_cmd cmds[] = {
 	{ "read-cache", cmd__read_cache },
 	{ "read-graph", cmd__read_graph },
 	{ "read-midx", cmd__read_midx },
+	{ "refspec", cmd__refspec },
 	{ "ref-store", cmd__ref_store },
 	{ "regex", cmd__regex },
 	{ "repository", cmd__repository },
diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h
index 9ea4b31011dd..46a0b8850f17 100644
--- a/t/helper/test-tool.h
+++ b/t/helper/test-tool.h
@@ -44,6 +44,7 @@ int cmd__reach(int argc, const char **argv);
 int cmd__read_cache(int argc, const char **argv);
 int cmd__read_graph(int argc, const char **argv);
 int cmd__read_midx(int argc, const char **argv);
+int cmd__refspec(int argc, const char **argv);
 int cmd__ref_store(int argc, const char **argv);
 int cmd__regex(int argc, const char **argv);
 int cmd__repository(int argc, const char **argv);
diff --git a/t/t5511-refspec.sh b/t/t5511-refspec.sh
index be025b90f989..489bec08d570 100755
--- a/t/t5511-refspec.sh
+++ b/t/t5511-refspec.sh
@@ -93,4 +93,45 @@ test_refspec fetch "refs/heads/${good}"
 bad=$(printf '\011tab')
 test_refspec fetch "refs/heads/${bad}"				invalid
 
+test_expect_success 'test input/output round trip' '
+	cat >input <<-\EOF &&
+	+refs/heads/*:refs/remotes/origin/*
+	refs/heads/*:refs/remotes/origin/*
+	refs/heads/main:refs/remotes/frotz/xyzzy
+	:refs/remotes/frotz/deleteme
+	^refs/heads/secrets
+	refs/heads/secret:refs/heads/translated
+	refs/heads/secret:heads/translated
+	refs/heads/secret:remotes/translated
+	secret:translated
+	refs/heads/*:remotes/xxy/*
+	refs/heads*/for-linus:refs/remotes/mine/*
+	2e36527f23b7f6ae15e6f21ac3b08bf3fed6ee48:refs/heads/fixed
+	HEAD
+	@
+	:
+	EOF
+	cat >expect <<-\EOF &&
+	+refs/heads/*:refs/remotes/origin/*
+	refs/heads/*:refs/remotes/origin/*
+	refs/heads/main:refs/remotes/frotz/xyzzy
+	:refs/remotes/frotz/deleteme
+	^refs/heads/secrets
+	refs/heads/secret:refs/heads/translated
+	refs/heads/secret:heads/translated
+	refs/heads/secret:remotes/translated
+	secret:translated
+	refs/heads/*:remotes/xxy/*
+	refs/heads*/for-linus:refs/remotes/mine/*
+	2e36527f23b7f6ae15e6f21ac3b08bf3fed6ee48:refs/heads/fixed
+	HEAD
+	HEAD
+	:
+	EOF
+	test-tool refspec <input >output &&
+	test_cmp expect output &&
+	test-tool refspec --fetch <input >output &&
+	test_cmp expect output
+'
+
 test_done
-- 
gitgitgadget



* [PATCH v2 5/5] maintenance: allow custom refspecs during prefetch
  2021-04-06 18:47 ` [PATCH v2 0/5] Maintenance: adapt custom refspecs Derrick Stolee via GitGitGadget
                     ` (3 preceding siblings ...)
  2021-04-06 18:47   ` [PATCH v2 4/5] test-tool: test refspec input/output Derrick Stolee via GitGitGadget
@ 2021-04-06 18:47   ` Derrick Stolee via GitGitGadget
  2021-04-06 19:36     ` Tom Saeger
                       ` (3 more replies)
  2021-04-10  2:03   ` [PATCH v3 0/3] Maintenance: adapt custom refspecs Derrick Stolee via GitGitGadget
  5 siblings, 4 replies; 72+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2021-04-06 18:47 UTC (permalink / raw)
  To: git
  Cc: tom.saeger, gitster, sunshine, Derrick Stolee, Derrick Stolee,
	Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

The prefetch task previously used the default refspec source plus a
custom refspec destination to avoid colliding with remote refs:

	+refs/heads/*:refs/prefetch/<remote>/*

However, some users customize their refspec to reduce how much data they
download from specific remotes. This can involve restrictive patterns
for fetching or negative patterns to avoid downloading some refs.

Modify fetch_remote() to iterate over the remote's refspec list and
translate that into the appropriate prefetch scenario. Specifically,
re-parse the raw form of the refspec into a new 'struct refspec_item' and
modify the 'dst' member to replace a leading "refs/" substring with
"refs/prefetch/", or prepend "refs/prefetch/" to 'dst' otherwise.
Negative refspecs do not have a 'dst' so they can be transferred to the
'git fetch' command unmodified.
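
The destination rewrite described above can be sketched in standalone C
as follows. This is an illustration under stated assumptions, not the
code in builtin/gc.c: it uses malloc() and snprintf() instead of Git's
strbuf API, and the name sketch_prefetch_dst() is hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return a newly allocated destination under "refs/prefetch/".
 * A leading "refs/" in 'dst' is replaced; any other destination is
 * simply prefixed. Returns NULL if 'dst' is NULL or allocation fails.
 * The caller owns (and must free) the result.
 */
static char *sketch_prefetch_dst(const char *dst)
{
	static const char old_prefix[] = "refs/";
	static const char new_prefix[] = "refs/prefetch/";
	size_t skip = 0, len;
	char *out;

	if (!dst)
		return NULL;

	if (!strncmp(dst, old_prefix, sizeof(old_prefix) - 1))
		skip = sizeof(old_prefix) - 1;

	len = strlen(new_prefix) + strlen(dst + skip) + 1;
	out = malloc(len);
	if (!out)
		return NULL;

	snprintf(out, len, "%s%s", new_prefix, dst + skip);
	return out;
}
```

With this sketch, "refs/remotes/origin/*" maps to
"refs/prefetch/remotes/origin/*" and "heads/translated" maps to
"refs/prefetch/heads/translated".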

This prefix change provides the benefit of keeping whatever collisions
may exist in the custom refspecs, if that is a desirable outcome.

This changes the names of the refs that would be fetched by the default
refspec. Instead of "refs/prefetch/<remote>/<branch>" they will now go
to "refs/prefetch/remotes/<remote>/<branch>". While this is a change, it
is not a seriously breaking one: these refs are intended to be hidden
and not used.

Update the documentation to be more generic about the destination refs.
Do not mention custom refspecs explicitly, as that does not need to be
highlighted in this documentation. The important part of placing refs in
refs/prefetch remains.

Reported-by: Tom Saeger <tom.saeger@oracle.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 Documentation/git-maintenance.txt |  3 +--
 builtin/gc.c                      | 37 +++++++++++++++++++++++++-
 t/t7900-maintenance.sh            | 43 ++++++++++++++++++++++++++-----
 3 files changed, 74 insertions(+), 9 deletions(-)

diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt
index 80ddd33ceba0..95a24264eb10 100644
--- a/Documentation/git-maintenance.txt
+++ b/Documentation/git-maintenance.txt
@@ -94,8 +94,7 @@ prefetch::
 	objects from all registered remotes. For each remote, a `git fetch`
 	command is run. The refmap is custom to avoid updating local or remote
 	branches (those in `refs/heads` or `refs/remotes`). Instead, the
-	remote refs are stored in `refs/prefetch/<remote>/`. Also, tags are
-	not updated.
+	refs are stored in `refs/prefetch/`. Also, tags are not updated.
 +
 This is done to avoid disrupting the remote-tracking branches. The end users
 expect these refs to stay unmoved unless they initiate a fetch.  With prefetch
diff --git a/builtin/gc.c b/builtin/gc.c
index fa8128de9ae1..76f347dd6b11 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -32,6 +32,7 @@
 #include "remote.h"
 #include "object-store.h"
 #include "exec-cmd.h"
+#include "refspec.h"
 
 #define FAILED_RUN "failed to run %s"
 
@@ -877,6 +878,7 @@ static int fetch_remote(struct remote *remote, void *cbdata)
 {
 	struct maintenance_run_opts *opts = cbdata;
 	struct child_process child = CHILD_PROCESS_INIT;
+	int i;
 
 	child.git_cmd = 1;
 	strvec_pushl(&child.args, "fetch", remote->name, "--prune", "--no-tags",
@@ -886,7 +888,40 @@ static int fetch_remote(struct remote *remote, void *cbdata)
 	if (opts->quiet)
 		strvec_push(&child.args, "--quiet");
 
-	strvec_pushf(&child.args, "+refs/heads/*:refs/prefetch/%s/*", remote->name);
+	for (i = 0; i < remote->fetch.nr; i++) {
+		struct refspec_item replace;
+		struct refspec_item *rsi = &remote->fetch.items[i];
+		struct strbuf new_dst = STRBUF_INIT;
+		size_t ignore_len = 0;
+		char *replace_string;
+
+		if (rsi->negative) {
+			strvec_push(&child.args, remote->fetch.raw[i]);
+			continue;
+		}
+
+		refspec_item_init(&replace, remote->fetch.raw[i], 1);
+
+		/*
+		 * If a refspec dst starts with "refs/" at the start,
+		 * then we will replace "refs/" with "refs/prefetch/".
+		 * Otherwise, we will prepend the dst string with
+		 * "refs/prefetch/".
+		 */
+		if (!strncmp(replace.dst, "refs/", 5))
+			ignore_len = 5;
+
+		strbuf_addstr(&new_dst, "refs/prefetch/");
+		strbuf_addstr(&new_dst, replace.dst + ignore_len);
+		free(replace.dst);
+		replace.dst = strbuf_detach(&new_dst, NULL);
+
+		replace_string = refspec_item_format(&replace);
+		strvec_push(&child.args, replace_string);
+		free(replace_string);
+
+		refspec_item_clear(&replace);
+	}
 
 	return !!run_command(&child);
 }
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index 37eed6ed3aa3..03487be3af38 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -142,20 +142,51 @@ test_expect_success 'prefetch multiple remotes' '
 	test_commit -C clone2 two &&
 	GIT_TRACE2_EVENT="$(pwd)/run-prefetch.txt" git maintenance run --task=prefetch 2>/dev/null &&
 	fetchargs="--prune --no-tags --no-write-fetch-head --recurse-submodules=no --refmap= --quiet" &&
-	test_subcommand git fetch remote1 $fetchargs "+refs/heads/*:refs/prefetch/remote1/*" <run-prefetch.txt &&
-	test_subcommand git fetch remote2 $fetchargs "+refs/heads/*:refs/prefetch/remote2/*" <run-prefetch.txt &&
+	test_subcommand git fetch remote1 $fetchargs "+refs/heads/*:refs/prefetch/remotes/remote1/*" <run-prefetch.txt &&
+	test_subcommand git fetch remote2 $fetchargs "+refs/heads/*:refs/prefetch/remotes/remote2/*" <run-prefetch.txt &&
 	test_path_is_missing .git/refs/remotes &&
-	git log prefetch/remote1/one &&
-	git log prefetch/remote2/two &&
+	git log prefetch/remotes/remote1/one &&
+	git log prefetch/remotes/remote2/two &&
 	git fetch --all &&
-	test_cmp_rev refs/remotes/remote1/one refs/prefetch/remote1/one &&
-	test_cmp_rev refs/remotes/remote2/two refs/prefetch/remote2/two &&
+	test_cmp_rev refs/remotes/remote1/one refs/prefetch/remotes/remote1/one &&
+	test_cmp_rev refs/remotes/remote2/two refs/prefetch/remotes/remote2/two &&
 
 	test_cmp_config refs/prefetch/ log.excludedecoration &&
 	git log --oneline --decorate --all >log &&
 	! grep "prefetch" log
 '
 
+test_expect_success 'prefetch custom refspecs' '
+	git -C clone1 branch -f special/fetched HEAD &&
+	git -C clone1 branch -f special/secret/not-fetched HEAD &&
+
+	# create multiple refspecs for remote1
+	git config --add remote.remote1.fetch "+refs/heads/special/fetched:refs/heads/fetched" &&
+	git config --add remote.remote1.fetch "^refs/heads/special/secret/not-fetched" &&
+
+	GIT_TRACE2_EVENT="$(pwd)/prefetch-refspec.txt" git maintenance run --task=prefetch 2>/dev/null &&
+
+	fetchargs="--prune --no-tags --no-write-fetch-head --recurse-submodules=no --refmap= --quiet" &&
+
+	# skips second refspec because it is not a pattern type
+	rs1="+refs/heads/*:refs/prefetch/remotes/remote1/*" &&
+	rs2="+refs/heads/special/fetched:refs/prefetch/heads/fetched" &&
+	rs3="^refs/heads/special/secret/not-fetched" &&
+
+	test_subcommand git fetch remote1 $fetchargs "$rs1" "$rs2" "$rs3" <prefetch-refspec.txt &&
+	test_subcommand git fetch remote2 $fetchargs "+refs/heads/*:refs/prefetch/remotes/remote2/*" <prefetch-refspec.txt &&
+
+	# first refspec is overridden by second
+	test_must_fail git rev-parse refs/prefetch/special/fetched &&
+	git rev-parse refs/prefetch/heads/fetched &&
+
+	# possible incorrect places for the non-fetched ref
+	test_must_fail git rev-parse refs/prefetch/remotes/remote1/secret/not-fetched &&
+	test_must_fail git rev-parse refs/prefetch/remotes/remote1/not-fetched &&
+	test_must_fail git rev-parse refs/heads/secret/not-fetched &&
+	test_must_fail git rev-parse refs/heads/not-fetched
+'
+
 test_expect_success 'prefetch and existing log.excludeDecoration values' '
 	git config --unset-all log.excludeDecoration &&
 	git config log.excludeDecoration refs/remotes/remote1/ &&
-- 
gitgitgadget


* Re: [PATCH v2 5/5] maintenance: allow custom refspecs during prefetch
  2021-04-06 18:47   ` [PATCH v2 5/5] maintenance: allow custom refspecs during prefetch Derrick Stolee via GitGitGadget
@ 2021-04-06 19:36     ` Tom Saeger
  2021-04-06 19:45       ` Derrick Stolee
  2021-04-07 23:09     ` Josh Steadmon
                       ` (2 subsequent siblings)
  3 siblings, 1 reply; 72+ messages in thread
From: Tom Saeger @ 2021-04-06 19:36 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: git, gitster, sunshine, Derrick Stolee, Derrick Stolee, Derrick Stolee

On Tue, Apr 06, 2021 at 06:47:50PM +0000, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <dstolee@microsoft.com>
> 
> The prefetch task previously used the default refspec source plus a
> custom refspec destination to avoid colliding with remote refs:
> 
> 	+refs/heads/*:refs/prefetch/<remote>/*
> 
> However, some users customize their refspec to reduce how much data they
> download from specific remotes. This can involve restrictive patterns
> for fetching or negative patterns to avoid downloading some refs.
> 
> Modify fetch_remote() to iterate over the remote's refspec list and
> translate that into the appropriate prefetch scenario. Specifically,
> re-parse the raw form of the refspec into a new 'struct refspec_item' and
> modify the 'dst' member to replace a leading "refs/" substring with
> "refs/prefetch/", or prepend "refs/prefetch/" to 'dst' otherwise.
> Negative refspecs do not have a 'dst' so they can be transferred to the
> 'git fetch' command unmodified.
> 
> This prefix change provides the benefit of keeping whatever collisions
> may exist in the custom refspecs, if that is a desirable outcome.
> 
> This changes the names of the refs that would be fetched by the default
> refspec. Instead of "refs/prefetch/<remote>/<branch>" they will now go
> to "refs/prefetch/remotes/<remote>/<branch>". While this is a change, it
> is not a seriously breaking one: these refs are intended to be hidden
> and not used.
> 
> Update the documentation to be more generic about the destination refs.
> Do not mention custom refspecs explicitly, as that does not need to be
> highlighted in this documentation. The important part of placing refs in
> refs/prefetch remains.
> 
> Reported-by: Tom Saeger <tom.saeger@oracle.com>
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
>  Documentation/git-maintenance.txt |  3 +--
>  builtin/gc.c                      | 37 +++++++++++++++++++++++++-
>  t/t7900-maintenance.sh            | 43 ++++++++++++++++++++++++++-----
>  3 files changed, 74 insertions(+), 9 deletions(-)
> 
> diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt
> index 80ddd33ceba0..95a24264eb10 100644
> --- a/Documentation/git-maintenance.txt
> +++ b/Documentation/git-maintenance.txt
> @@ -94,8 +94,7 @@ prefetch::
>  	objects from all registered remotes. For each remote, a `git fetch`
>  	command is run. The refmap is custom to avoid updating local or remote
>  	branches (those in `refs/heads` or `refs/remotes`). Instead, the
> -	remote refs are stored in `refs/prefetch/<remote>/`. Also, tags are
> -	not updated.
> +	refs are stored in `refs/prefetch/`. Also, tags are not updated.
>  +
>  This is done to avoid disrupting the remote-tracking branches. The end users
>  expect these refs to stay unmoved unless they initiate a fetch.  With prefetch
> diff --git a/builtin/gc.c b/builtin/gc.c
> index fa8128de9ae1..76f347dd6b11 100644
> --- a/builtin/gc.c
> +++ b/builtin/gc.c
> @@ -32,6 +32,7 @@
>  #include "remote.h"
>  #include "object-store.h"
>  #include "exec-cmd.h"
> +#include "refspec.h"
>  
>  #define FAILED_RUN "failed to run %s"
>  
> @@ -877,6 +878,7 @@ static int fetch_remote(struct remote *remote, void *cbdata)
>  {
>  	struct maintenance_run_opts *opts = cbdata;
>  	struct child_process child = CHILD_PROCESS_INIT;
> +	int i;
>  
>  	child.git_cmd = 1;
>  	strvec_pushl(&child.args, "fetch", remote->name, "--prune", "--no-tags",
> @@ -886,7 +888,40 @@ static int fetch_remote(struct remote *remote, void *cbdata)
>  	if (opts->quiet)
>  		strvec_push(&child.args, "--quiet");
>  
> -	strvec_pushf(&child.args, "+refs/heads/*:refs/prefetch/%s/*", remote->name);
> +	for (i = 0; i < remote->fetch.nr; i++) {
> +		struct refspec_item replace;
> +		struct refspec_item *rsi = &remote->fetch.items[i];
> +		struct strbuf new_dst = STRBUF_INIT;
> +		size_t ignore_len = 0;
> +		char *replace_string;
> +
> +		if (rsi->negative) {
> +			strvec_push(&child.args, remote->fetch.raw[i]);
> +			continue;
> +		}
> +
> +		refspec_item_init(&replace, remote->fetch.raw[i], 1);
> +
> +		/*
> +		 * If a refspec dst starts with "refs/" at the start,
> +		 * then we will replace "refs/" with "refs/prefetch/".
> +		 * Otherwise, we will prepend the dst string with
> +		 * "refs/prefetch/".
> +		 */
> +		if (!strncmp(replace.dst, "refs/", 5))
> +			ignore_len = 5;
> +
> +		strbuf_addstr(&new_dst, "refs/prefetch/");
> +		strbuf_addstr(&new_dst, replace.dst + ignore_len);
> +		free(replace.dst);
> +		replace.dst = strbuf_detach(&new_dst, NULL);
> +
> +		replace_string = refspec_item_format(&replace);
> +		strvec_push(&child.args, replace_string);
> +		free(replace_string);
> +
> +		refspec_item_clear(&replace);
> +	}
>  
>  	return !!run_command(&child);
>  }

Junio brought up the point about configs whose 'fetch' entries have no dst:
https://lore.kernel.org/git/c06a198a-2043-27a2-cab3-3471190754cc@gmail.com/

    [remote "submaintainer1"]
        url = ... repository of submaintainer #1 ...
        fetch = master
        tagopt = --no-tags


This patch fixes the segfault for a config like the one above.
You might have ideas on a cleaner way to do this.
I did add `child_process_clear`.
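
The guard can be illustrated outside of Git with a minimal C sketch. It
assumes a simplified item type (sketch_fetch_item) and hypothetical
helper names; it only models the decision logic, namely which refspecs
contribute an argument and whether the fetch should run at all.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for 'struct refspec_item'. */
struct sketch_fetch_item {
	const char *src;
	const char *dst;	/* NULL for e.g. "fetch = master" */
	int negative;
};

/* Count the refspecs that would add an argument to 'git fetch'. */
static size_t sketch_count_prefetch_args(const struct sketch_fetch_item *items,
					 size_t nr)
{
	size_t i, pushed = 0;

	for (i = 0; i < nr; i++) {
		if (items[i].negative) {
			pushed++;	/* passed through unmodified */
			continue;
		}
		if (!items[i].dst)
			continue;	/* no destination to rewrite: skip */
		pushed++;
	}
	return pushed;
}

/* Skip the remote entirely when no refspec survives the filter. */
static int sketch_should_run_fetch(const struct sketch_fetch_item *items,
				   size_t nr)
{
	return sketch_count_prefetch_args(items, nr) > 0;
}
```

For the "submaintainer1" config above, the single item has a NULL 'dst',
so nothing is counted and the fetch would be skipped.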


--Tom

diff --git a/builtin/gc.c b/builtin/gc.c
index 76f347dd6b11..921266ee30a5 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -879,6 +879,7 @@ static int fetch_remote(struct remote *remote, void *cbdata)
        struct maintenance_run_opts *opts = cbdata;
        struct child_process child = CHILD_PROCESS_INIT;
        int i;
+       int nargs;

        child.git_cmd = 1;
        strvec_pushl(&child.args, "fetch", remote->name, "--prune", "--no-tags",
@@ -888,6 +889,8 @@ static int fetch_remote(struct remote *remote, void *cbdata)
        if (opts->quiet)
                strvec_push(&child.args, "--quiet");

+       nargs = child.args.nr;
+
        for (i = 0; i < remote->fetch.nr; i++) {
                struct refspec_item replace;
                struct refspec_item *rsi = &remote->fetch.items[i];
@@ -900,6 +903,10 @@ static int fetch_remote(struct remote *remote, void *cbdata)
                        continue;
                }

+               if (!rsi->dst) {
+                       continue;
+               }
+
                refspec_item_init(&replace, remote->fetch.raw[i], 1);

                /*
@@ -923,6 +930,12 @@ static int fetch_remote(struct remote *remote, void *cbdata)
                refspec_item_clear(&replace);
        }

+       /* skip remote if no refspecs to fetch */
+       if (child.args.nr - nargs <= 0) {
+               child_process_clear(&child);
+               return 0;
+       }


* Re: [PATCH v2 5/5] maintenance: allow custom refspecs during prefetch
  2021-04-06 19:36     ` Tom Saeger
@ 2021-04-06 19:45       ` Derrick Stolee
  0 siblings, 0 replies; 72+ messages in thread
From: Derrick Stolee @ 2021-04-06 19:45 UTC (permalink / raw)
  To: Tom Saeger, Derrick Stolee via GitGitGadget
  Cc: git, gitster, sunshine, Derrick Stolee, Derrick Stolee

On 4/6/2021 3:36 PM, Tom Saeger wrote:
>
> Junio brought up the point about configs whose 'fetch' entries have no dst:
> https://lore.kernel.org/git/c06a198a-2043-27a2-cab3-3471190754cc@gmail.com/

Thank you for reminding me about this. It was on the other thread,
and I forgot to go back to it as I was preparing this version.

>     [remote "submaintainer1"]
>         url = ... repository of submaintainer #1 ...
>         fetch = master
>         tagopt = --no-tags
> 
> 
> This patch fixes the segfault for a config like the one above.
> You might have ideas on a cleaner way to do this.
> I did add `child_process_clear`.

I will also add a test to ensure this scenario does not regress.
-Stolee


* Re: [PATCH 3/5] refspec: output a refspec item
  2021-04-05 13:04 ` [PATCH 3/5] refspec: output a refspec item Derrick Stolee via GitGitGadget
  2021-04-05 16:57   ` Tom Saeger
@ 2021-04-07  8:46   ` Ævar Arnfjörð Bjarmason
  2021-04-07 20:53     ` Derrick Stolee
  1 sibling, 1 reply; 72+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-04-07  8:46 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: git, tom.saeger, gitster, sunshine, Derrick Stolee, Derrick Stolee


On Mon, Apr 05 2021, Derrick Stolee via GitGitGadget wrote:

> From: Derrick Stolee <dstolee@microsoft.com>
>
> Add a new method, refspec_item_format(), that takes a 'struct
> refspec_item' pointer as input and returns a string for how that refspec
> item should be written to Git's config or a subcommand, such as 'git
> fetch'.
>
> There are several subtleties regarding special-case refspecs that can
> occur and are represented in t5511-refspec.sh. These cases will be
> explored in new tests in the following change. It requires adding a new
> test helper in order to test this format directly, so that is saved for
> a separate change to keep this one focused on the logic of the format
> method.
>
> A future change will consume this method when translating refspecs in
> the 'prefetch' task of the 'git maintenance' builtin.
>
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
>  refspec.c | 25 +++++++++++++++++++++++++
>  refspec.h |  5 +++++
>  2 files changed, 30 insertions(+)
>
> diff --git a/refspec.c b/refspec.c
> index e3d852c0bfec..ca65ba01bfe6 100644
> --- a/refspec.c
> +++ b/refspec.c
> @@ -180,6 +180,31 @@ void refspec_item_clear(struct refspec_item *item)
>  	item->exact_sha1 = 0;
>  }
>  
> +const char *refspec_item_format(const struct refspec_item *rsi)
> +{
> +	static struct strbuf buf = STRBUF_INIT;
> +
> +	strbuf_reset(&buf);
> +
> +	if (rsi->matching)
> +		return ":";
> +
> +	if (rsi->negative)
> +		strbuf_addch(&buf, '^');
> +	else if (rsi->force)
> +		strbuf_addch(&buf, '+');
> +
> +	if (rsi->src)
> +		strbuf_addstr(&buf, rsi->src);
> +
> +	if (rsi->dst) {
> +		strbuf_addch(&buf, ':');
> +		strbuf_addstr(&buf, rsi->dst);
> +	}
> +
> +	return buf.buf;

There's a downthread discussion about the strbuf usage here so that's
covered.
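
For reference, one way to avoid the shared static buffer is to return a
caller-owned string, which is the shape the v2 test helper uses when it
free()s the result. A minimal standalone sketch of that variant follows;
the struct and the name sketch_format() are hypothetical and the code
uses plain malloc() rather than Git's strbuf API.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical subset of the fields used by the formatter. */
struct sketch_item {
	const char *src;
	const char *dst;
	int force, negative, matching;
};

/*
 * Format an item as "[^|+]<src>[:<dst>]", or ":" for a matching
 * refspec. Returns a newly allocated string the caller must free(),
 * or NULL on allocation failure.
 */
static char *sketch_format(const struct sketch_item *it)
{
	const char *prefix = it->negative ? "^" : it->force ? "+" : "";
	const char *src = it->src ? it->src : "";
	const char *sep = it->dst ? ":" : "";
	const char *dst = it->dst ? it->dst : "";
	size_t len;
	char *out;

	if (it->matching) {
		prefix = "";
		src = "";
		sep = ":";
		dst = "";
	}

	len = strlen(prefix) + strlen(src) + strlen(sep) + strlen(dst) + 1;
	out = malloc(len);
	if (!out)
		return NULL;

	snprintf(out, len, "%s%s%s%s", prefix, src, sep, dst);
	return out;
}
```

Because every call returns a fresh allocation, two results can be held
at the same time, which a single static buffer does not allow.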

But I'm still confused about the need for this function and the
following two patches. If we apply this on top of your series:
    
    diff --git a/t/helper/test-refspec.c b/t/helper/test-refspec.c
    index 08cf441a0a0..9e099e43ebf 100644
    --- a/t/helper/test-refspec.c
    +++ b/t/helper/test-refspec.c
    @@ -31,7 +31,7 @@ int cmd__refspec(int argc, const char **argv)
                            continue;
                    }
    
    -               printf("%s\n", refspec_item_format(&rsi));
    +               puts(line.buf);
                    refspec_item_clear(&rsi);
            }

The only failing test is:
    
    + diff -u expect output
    --- expect      2021-04-07 08:12:05.577598038 +0000
    +++ output      2021-04-07 08:12:05.577598038 +0000
    @@ -11,5 +11,5 @@
     refs/heads*/for-linus:refs/remotes/mine/*
     2e36527f23b7f6ae15e6f21ac3b08bf3fed6ee48:refs/heads/fixed
     HEAD
    -HEAD
    +@
     :

Other than that applying this on top makes everything pass:
    
    diff --git a/builtin/gc.c b/builtin/gc.c
    index 92cb8b4e0bf..ea5572d15c5 100644
    --- a/builtin/gc.c
    +++ b/builtin/gc.c
    @@ -889,35 +889,21 @@ static int fetch_remote(struct remote *remote, void *cbdata)
     		strvec_push(&child.args, "--quiet");
     
     	for (i = 0; i < remote->fetch.nr; i++) {
    -		struct refspec_item replace;
     		struct refspec_item *rsi = &remote->fetch.items[i];
    -		struct strbuf new_dst = STRBUF_INIT;
    -		size_t ignore_len = 0;
    +		struct strbuf new_spec = STRBUF_INIT;
    +		char *pos;
     
     		if (rsi->negative) {
     			strvec_push(&child.args, remote->fetch.raw[i]);
     			continue;
     		}
     
    -		refspec_item_init(&replace, remote->fetch.raw[i], 1);
    -
    -		/*
    -		 * If a refspec dst starts with "refs/" at the start,
    -		 * then we will replace "refs/" with "refs/prefetch/".
    -		 * Otherwise, we will prepend the dst string with
    -		 * "refs/prefetch/".
    -		 */
    -		if (!strncmp(replace.dst, "refs/", 5))
    -			ignore_len = 5;
    -
    -		strbuf_addstr(&new_dst, "refs/prefetch/");
    -		strbuf_addstr(&new_dst, replace.dst + ignore_len);
    -		free(replace.dst);
    -		replace.dst = strbuf_detach(&new_dst, NULL);
    -
    -		strvec_push(&child.args, refspec_item_format(&replace));
    -
    -		refspec_item_clear(&replace);
    +		strbuf_addstr(&new_spec, remote->fetch.raw[i]);
    +		if ((pos = strrchr(new_spec.buf, ':')) != NULL)
    +			strbuf_splice(&new_spec, pos - new_spec.buf + 1, sizeof("refs/") - 1,
    +				      "refs/prefetch/", sizeof("refs/prefetch/") - 1);
    +		strvec_push(&child.args, new_spec.buf);
    +		strbuf_release(&new_spec);
     	}
     
     	return !!run_command(&child);

So the purpose of this new API is that we don't want to make the
assumption that strrchr(buf, ':') is a safe way to find the delimiter in
the refspec, or is there some case where we grok "HEAD" but not "@"
that's buggy, but not tested for in this series?
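
To make the string-based alternative concrete, here is a minimal
standalone sketch in plain C (hypothetical name
sketch_splice_prefetch(), no strbuf). Note one difference from the diff
above: the diff removes a fixed sizeof("refs/") - 1 bytes after the last
':', which assumes the destination starts with "refs/", whereas this
sketch checks for that prefix first. Special cases such as the matching
refspec ":" are not handled.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Rewrite the destination of a raw refspec so that it lives under
 * "refs/prefetch/". The destination is assumed to be everything after
 * the last ':'. A refspec without ':' is returned as an unmodified
 * copy. The caller must free() the result; NULL means allocation
 * failure.
 */
static char *sketch_splice_prefetch(const char *raw)
{
	static const char old_prefix[] = "refs/";
	static const char new_prefix[] = "refs/prefetch/";
	const char *colon = strrchr(raw, ':');
	const char *dst;
	size_t head_len, new_len, tail_len;
	char *out;

	if (!colon) {
		/* No destination: nothing to rewrite. */
		head_len = strlen(raw);
		out = malloc(head_len + 1);
		if (out)
			memcpy(out, raw, head_len + 1);
		return out;
	}

	dst = colon + 1;
	if (!strncmp(dst, old_prefix, sizeof(old_prefix) - 1))
		dst += sizeof(old_prefix) - 1;

	head_len = (size_t)(colon + 1 - raw);	/* includes the ':' */
	new_len = sizeof(new_prefix) - 1;
	tail_len = strlen(dst);

	out = malloc(head_len + new_len + tail_len + 1);
	if (!out)
		return NULL;

	memcpy(out, raw, head_len);
	memcpy(out + head_len, new_prefix, new_len);
	memcpy(out + head_len + new_len, dst, tail_len + 1);
	return out;
}
```

With the prefix check in place, "secret:translated" becomes
"secret:refs/prefetch/translated" rather than losing the first five
bytes of its destination.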


* Re: [PATCH 5/5] maintenance: allow custom refspecs during prefetch
  2021-04-05 13:04 ` [PATCH 5/5] maintenance: allow custom refspecs during prefetch Derrick Stolee via GitGitGadget
  2021-04-05 17:16   ` Tom Saeger
@ 2021-04-07  8:53   ` Ævar Arnfjörð Bjarmason
  2021-04-07 10:26     ` Ævar Arnfjörð Bjarmason
  2021-04-07 13:47   ` Ævar Arnfjörð Bjarmason
  2 siblings, 1 reply; 72+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-04-07  8:53 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: git, tom.saeger, gitster, sunshine, Derrick Stolee, Derrick Stolee


On Mon, Apr 05 2021, Derrick Stolee via GitGitGadget wrote:

> [...]
> diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
> index fc2315edec11..3366ea188782 100755
> --- a/t/t7900-maintenance.sh
> +++ b/t/t7900-maintenance.sh
> @@ -142,20 +142,51 @@ test_expect_success 'prefetch multiple remotes' '
>  	test_commit -C clone2 two &&
>  	GIT_TRACE2_EVENT="$(pwd)/run-prefetch.txt" git maintenance run --task=prefetch 2>/dev/null &&
>  	fetchargs="--prune --no-tags --no-write-fetch-head --recurse-submodules=no --refmap= --quiet" &&
> -	test_subcommand git fetch remote1 $fetchargs +refs/heads/*:refs/prefetch/remote1/* <run-prefetch.txt &&
> -	test_subcommand git fetch remote2 $fetchargs +refs/heads/*:refs/prefetch/remote2/* <run-prefetch.txt &&
> +	test_subcommand git fetch remote1 $fetchargs +refs/heads/*:refs/prefetch/remotes/remote1/* <run-prefetch.txt &&
> +	test_subcommand git fetch remote2 $fetchargs +refs/heads/*:refs/prefetch/remotes/remote2/* <run-prefetch.txt &&
>  	test_path_is_missing .git/refs/remotes &&
> -	git log prefetch/remote1/one &&
> -	git log prefetch/remote2/two &&
> +	git log prefetch/remotes/remote1/one &&
> +	git log prefetch/remotes/remote2/two &&
>  	git fetch --all &&
> -	test_cmp_rev refs/remotes/remote1/one refs/prefetch/remote1/one &&
> -	test_cmp_rev refs/remotes/remote2/two refs/prefetch/remote2/two &&
> +	test_cmp_rev refs/remotes/remote1/one refs/prefetch/remotes/remote1/one &&
> +	test_cmp_rev refs/remotes/remote2/two refs/prefetch/remotes/remote2/two &&
>  
>  	test_cmp_config refs/prefetch/ log.excludedecoration &&
>  	git log --oneline --decorate --all >log &&
>  	! grep "prefetch" log
>  '
>  
> +test_expect_success 'prefetch custom refspecs' '
> +	git -C clone1 branch -f special/fetched HEAD &&
> +	git -C clone1 branch -f special/secret/not-fetched HEAD &&
> +
> +	# create multiple refspecs for remote1
> +	git config --add remote.remote1.fetch +refs/heads/special/fetched:refs/heads/fetched &&
> +	git config --add remote.remote1.fetch ^refs/heads/special/secret/not-fetched &&
> +
> +	GIT_TRACE2_EVENT="$(pwd)/prefetch-refspec.txt" git maintenance run --task=prefetch 2>/dev/null &&

I see this is following some established convention in the file, but is
there really not a way to make this pass without directing stderr to
/dev/null? It makes ad-hoc debugging when reviewing harder.


I tried just removing it, but then (in an earlier test case) the
"test_subcommand" fails because it can't find the line we're looking
for, so us piping stderr to /dev/null impacts our trace2 output?



* Re: [PATCH 4/5] test-tool: test refspec input/output
  2021-04-05 13:04 ` [PATCH 4/5] test-tool: test refspec input/output Derrick Stolee via GitGitGadget
  2021-04-05 17:52   ` Eric Sunshine
@ 2021-04-07  8:54   ` Ævar Arnfjörð Bjarmason
  1 sibling, 0 replies; 72+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-04-07  8:54 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: git, tom.saeger, gitster, sunshine, Derrick Stolee, Derrick Stolee


On Mon, Apr 05 2021, Derrick Stolee via GitGitGadget wrote:

> +static const char * const refspec_usage[] = {
> +	N_("test-tool refspec [--fetch]"),
> +	NULL
> +};
> +
> +int cmd__refspec(int argc, const char **argv)
> +{
> +	struct strbuf line = STRBUF_INIT;
> +	int fetch = 0;
> +
> +	struct option refspec_options [] = {
> +		OPT_BOOL(0, "fetch", &fetch,
> +			 N_("enable the 'fetch' option for parsing refspecs")),
> +		OPT_END()
> +	};

I don't think we should waste translator time by marking these (or
anything else in t/helper) with N_().

I see you probably copied this from elsewhere & we should probably fix
the existing ones, but no reason to add new ones...


* Re: [PATCH 5/5] maintenance: allow custom refspecs during prefetch
  2021-04-07  8:53   ` Ævar Arnfjörð Bjarmason
@ 2021-04-07 10:26     ` Ævar Arnfjörð Bjarmason
  2021-04-09 11:48       ` Derrick Stolee
  0 siblings, 1 reply; 72+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-04-07 10:26 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: git, tom.saeger, gitster, sunshine, Derrick Stolee, Derrick Stolee


On Wed, Apr 07 2021, Ævar Arnfjörð Bjarmason wrote:

> On Mon, Apr 05 2021, Derrick Stolee via GitGitGadget wrote:
>
>> [...]
>> diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
>> index fc2315edec11..3366ea188782 100755
>> --- a/t/t7900-maintenance.sh
>> +++ b/t/t7900-maintenance.sh
>> @@ -142,20 +142,51 @@ test_expect_success 'prefetch multiple remotes' '
>>  	test_commit -C clone2 two &&
>>  	GIT_TRACE2_EVENT="$(pwd)/run-prefetch.txt" git maintenance run --task=prefetch 2>/dev/null &&
>>  	fetchargs="--prune --no-tags --no-write-fetch-head --recurse-submodules=no --refmap= --quiet" &&
>> -	test_subcommand git fetch remote1 $fetchargs +refs/heads/*:refs/prefetch/remote1/* <run-prefetch.txt &&
>> -	test_subcommand git fetch remote2 $fetchargs +refs/heads/*:refs/prefetch/remote2/* <run-prefetch.txt &&
>> +	test_subcommand git fetch remote1 $fetchargs +refs/heads/*:refs/prefetch/remotes/remote1/* <run-prefetch.txt &&
>> +	test_subcommand git fetch remote2 $fetchargs +refs/heads/*:refs/prefetch/remotes/remote2/* <run-prefetch.txt &&
>>  	test_path_is_missing .git/refs/remotes &&
>> -	git log prefetch/remote1/one &&
>> -	git log prefetch/remote2/two &&
>> +	git log prefetch/remotes/remote1/one &&
>> +	git log prefetch/remotes/remote2/two &&
>>  	git fetch --all &&
>> -	test_cmp_rev refs/remotes/remote1/one refs/prefetch/remote1/one &&
>> -	test_cmp_rev refs/remotes/remote2/two refs/prefetch/remote2/two &&
>> +	test_cmp_rev refs/remotes/remote1/one refs/prefetch/remotes/remote1/one &&
>> +	test_cmp_rev refs/remotes/remote2/two refs/prefetch/remotes/remote2/two &&
>>  
>>  	test_cmp_config refs/prefetch/ log.excludedecoration &&
>>  	git log --oneline --decorate --all >log &&
>>  	! grep "prefetch" log
>>  '
>>  
>> +test_expect_success 'prefetch custom refspecs' '
>> +	git -C clone1 branch -f special/fetched HEAD &&
>> +	git -C clone1 branch -f special/secret/not-fetched HEAD &&
>> +
>> +	# create multiple refspecs for remote1
>> +	git config --add remote.remote1.fetch +refs/heads/special/fetched:refs/heads/fetched &&
>> +	git config --add remote.remote1.fetch ^refs/heads/special/secret/not-fetched &&
>> +
>> +	GIT_TRACE2_EVENT="$(pwd)/prefetch-refspec.txt" git maintenance run --task=prefetch 2>/dev/null &&
>
> I see this is following some established convention in the file, but is
> there really not a way to make this pass without directing stderr to
> /dev/null? It makes ad-hoc debugging when reviewing harder.

As I later found out this is copy/pasted to get around the fact that
--quiet is dependent on isatty(), so without this the result would be
different under --verbose and non-verbose testing.

So that dates back to 3ddaad0e060 (maintenance: add --quiet option,
2020-09-17), but I see other quiet=isatty(2) in related code. I wish we
could isolate that particular behavior so removing the 2>/dev/null when
debugging the tests doesn't cause you to run into this, maybe an
explicit --quiet or --no-quiet option for all but one test that's
checking that isatty() behavior?
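
A minimal sketch of that dependence, written as a standalone shell
function rather than git's actual C code: `test -t 2` plays the role of
isatty(2), and default_quiet_flag is a hypothetical name used only for
illustration.

```shell
# Hypothetical helper: mimics a "quiet unless stderr is a terminal" default.
default_quiet_flag () {
	if test -t 2
	then
		# stderr is a terminal: be chatty by default
		echo "--no-quiet"
	else
		# stderr is redirected (file, pipe, /dev/null): stay quiet
		echo "--quiet"
	fi
}
```

With stderr on a terminal this prints --no-quiet; with 2>/dev/null it
prints --quiet, which is why dropping the redirection changes which
subcommand line shows up in the trace.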

> I tried just removing it, but then (in an earlier test case) the
> "test_subcommand" fails because it can't find the line we're looking
> for, so us piping stderr to /dev/null impacts our trace2 output?

I hadn't seen test_subcommand before, sorry to be blunt, but "ew!".

So we're ad-hoc grepping trace2 JSON output just to find out whether we
invoked some subcommand. But unlike test_expect_code etc. this one
doesn't run git for you, but instead we have temp *.txt files and the
command disconnected from the run.

And because you're using "grep" and "! grep" to test, you're hiding the
difference between "did not find this line" v.s. "did not find anything
at all".

Because of that the second test using test_subcommand is either buggy or
painfully non-obvious. We check that "run --auto" doesn't contain a
"gc --auto --quiet", but in reality it doesn't contain any subcommands at
all. We didn't run any because it exited with "nothing to pack".
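
One way to make the negative case explicit, as a rough sketch: require
that the trace contains at least one child_start event before asserting
that none of them match. The helper name and its arguments are
hypothetical, and the match is a plain fixed-string search over the
trace2 event lines, not what test_subcommand does today.

```shell
# Hypothetical helper, not part of test-lib-functions.sh.
# usage: trace_ran_children_but_not <needle> <trace-file>
trace_ran_children_but_not () {
	needle=$1
	trace=$2
	# Fail if no subcommand was started at all.
	grep -q '"event":"child_start"' "$trace" || return 1
	# Fail if any started subcommand mentions the needle.
	if grep '"event":"child_start"' "$trace" | grep -q -F -e "$needle"
	then
		return 1
	fi
	return 0
}
```

An empty trace now fails the check instead of silently passing, so
"ran nothing" and "ran something else" are no longer conflated.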

I think converting the whole thing to something like the WIP/RFC patch
below is much better and more readable.

The pattern is basically stolen from test_commit and
check_sub_test_lib_test_err, respectively.

As an aside: The new test_expect_process_tree function would be much
less painful if we had a helper that took the JSON output and emitted
some sensible subset of the information therein. For now I just ad-hoc
grepped the "perf" output. AFAICT the only way to get "depth" from
trace2's JSON is to count slashes in SIDs.
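
Counting slashes could look roughly like this; sid_depth is a
hypothetical helper, and it assumes each level of process nesting adds
exactly one "/"-separated component to the SID.

```shell
# Hypothetical helper: print the nesting depth implied by a trace2 SID,
# i.e. the number of "/" characters it contains.
sid_depth () {
	# Delete every character that is not a slash, then measure what is left.
	slashes=$(printf '%s' "$1" | tr -cd '/')
	echo "${#slashes}"
}
```

A top-level SID with no slash gives depth 0, and each child component
adds one.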

In the case of your tests you're mostly/(only?) interested in a slice of
the "child_start" events. If we had a helper to spew out a pretty-print
version of that (or a subset of it) we could test_cmp against that
directly.

diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index 3366ea18878..d03fb361562 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -30,28 +30,41 @@ test_expect_success 'help text' '
 '
 
 test_expect_success 'run [--auto|--quiet]' '
-	GIT_TRACE2_EVENT="$(pwd)/run-no-auto.txt" \
-		git maintenance run 2>/dev/null &&
-	GIT_TRACE2_EVENT="$(pwd)/run-auto.txt" \
-		git maintenance run --auto 2>/dev/null &&
-	GIT_TRACE2_EVENT="$(pwd)/run-no-quiet.txt" \
-		git maintenance run --no-quiet 2>/dev/null &&
-	test_subcommand git gc --quiet <run-no-auto.txt &&
-	test_subcommand ! git gc --auto --quiet <run-auto.txt &&
-	test_subcommand git gc --no-quiet <run-no-quiet.txt
+	test_expect_process_tree --depth 0 git maintenance run <<-\OUT 3<<-\ERR &&
+	git gc --quiet
+	OUT
+	ERR
+
+	test_expect_process_tree --depth 0 git maintenance run --auto <<-\OUT 3<<-\ERR &&
+	OUT
+	ERR
+
+	test_expect_process_tree --depth 0 git maintenance run --quiet <<-\OUT 3<<-\ERR &&
+	git gc --quiet
+	OUT
+	ERR
+
+	test_expect_process_tree --depth 0 git maintenance run --no-quiet <<-\OUT 3<<-\ERR
+	git gc --no-quiet
+	OUT
+	ERR
 '
 
 test_expect_success 'maintenance.auto config option' '
-	GIT_TRACE2_EVENT="$(pwd)/default" git commit --quiet --allow-empty -m 1 &&
-	test_subcommand git maintenance run --auto --quiet <default &&
-	GIT_TRACE2_EVENT="$(pwd)/true" \
-		git -c maintenance.auto=true \
-		commit --quiet --allow-empty -m 2 &&
-	test_subcommand git maintenance run --auto --quiet  <true &&
-	GIT_TRACE2_EVENT="$(pwd)/false" \
-		git -c maintenance.auto=false \
-		commit --quiet --allow-empty -m 3 &&
-	test_subcommand ! git maintenance run --auto --quiet  <false
+	test_expect_process_tree git commit --quiet --allow-empty -m 1 <<-\OUT 3<<-\ERR &&
+	git maintenance run --auto --quiet
+	OUT
+	ERR
+
+	test_expect_process_tree git commit --quiet --allow-empty -m 2 <<-\OUT 3<<-\ERR &&
+	git maintenance run --auto --quiet
+	OUT
+	ERR
+
+	test_expect_process_tree git commit --quiet --allow-empty -m 3 <<-\OUT 3<<-\ERR
+	git maintenance run --auto --quiet
+	OUT
+	ERR
 '
 
 test_expect_success 'maintenance.<task>.enabled' '
diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index a5915dec22d..cd1187b473c 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -1625,6 +1625,38 @@ test_path_is_hidden () {
 	return 1
 }
 
+test_expect_process_tree () {
+	depth= &&
+	>actual &&
+	cat >expect &&
+	cat <&3 >expect.err &&
+	while test $# != 0
+	do
+		case "$1" in
+		--depth)
+			depth="$2"
+			shift
+			;;
+		*)
+			break
+			;;
+		esac
+		shift
+	done &&
+	log="$(pwd)/proc-tree.txt" &&
+	>"$log" &&
+	GIT_TRACE2_PERF="$log" "$@" 2>actual.err &&
+	grep "child_start" proc-tree.txt >proc-tree-start.txt || : &&
+	if test -n "$depth"
+	then
+		grep " d$depth " proc-tree-start.txt >tmp.txt || : &&
+		mv tmp.txt proc-tree-start.txt
+	fi &&
+	sed -e 's/^.*argv:\[//' -e 's/\]$//' <proc-tree-start.txt >actual &&
+	test_cmp expect actual &&
+	test_cmp expect.err actual.err
+} 7>&2 2>&4
+
 # Check that the given command was invoked as part of the
 # trace2-format trace on stdin.
 #


* Re: [PATCH 5/5] maintenance: allow custom refspecs during prefetch
  2021-04-05 13:04 ` [PATCH 5/5] maintenance: allow custom refspecs during prefetch Derrick Stolee via GitGitGadget
  2021-04-05 17:16   ` Tom Saeger
  2021-04-07  8:53   ` Ævar Arnfjörð Bjarmason
@ 2021-04-07 13:47   ` Ævar Arnfjörð Bjarmason
  2 siblings, 0 replies; 72+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-04-07 13:47 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: git, tom.saeger, gitster, sunshine, Derrick Stolee, Derrick Stolee


On Mon, Apr 05 2021, Derrick Stolee via GitGitGadget wrote:

> This changes the names of the refs that would be fetched by the default
> refspec. Instead of "refs/prefetch/<remote>/<branch>" they will now go
> to "refs/prefetch/remotes/<remote>/<branch>". While this is a change, it
> is not a seriously breaking one: these refs are intended to be hidden
> and not used.

Not "a seriously breaking one" just because we'll assume nobody had a
remote they'd named "remotes" and they'll need to manually clean that
mess up (if needed), or ...?

> [...]
>  	objects from all registered remotes. For each remote, a `git fetch`
>  	command is run. The refmap is custom to avoid updating local or remote
>  	branches (those in `refs/heads` or `refs/remotes`). Instead, the
> -	remote refs are stored in `refs/prefetch/<remote>/`. Also, tags are
> -	not updated.
> +	refs are stored in `refs/prefetch/`. Also, tags are not updated.

So, "tags are not updated", but:

>  
> -	strvec_pushf(&child.args, "+refs/heads/*:refs/prefetch/%s/*", remote->name);
> +	for (i = 0; i < remote->fetch.nr; i++) {
> +		struct refspec_item replace;
> +		struct refspec_item *rsi = &remote->fetch.items[i];
> +		struct strbuf new_dst = STRBUF_INIT;
> +		size_t ignore_len = 0;
> +
> +		if (rsi->negative) {
> +			strvec_push(&child.args, remote->fetch.raw[i]);
> +			continue;
> +		}
> +
> +		refspec_item_init(&replace, remote->fetch.raw[i], 1);
> +
> +		/*
> +		 * If a refspec dst starts with "refs/" at the start,
> +		 * then we will replace "refs/" with "refs/prefetch/".
> +		 * Otherwise, we will prepend the dst string with
> +		 * "refs/prefetch/".
> +		 */
> +		if (!strncmp(replace.dst, "refs/", 5))
> +			ignore_len = 5;
> +
> +		strbuf_addstr(&new_dst, "refs/prefetch/");
> +		strbuf_addstr(&new_dst, replace.dst + ignore_len);
> +		free(replace.dst);
> +		replace.dst = strbuf_detach(&new_dst, NULL);
> +
> +		strvec_push(&child.args, refspec_item_format(&replace));
> +
> +		refspec_item_clear(&replace);
> +	}

Isn't a blanket replacement of refs/heads/* with refs/* going to change
that? I haven't tested this so maybe it still doesn't work, but:

>  	GIT_TRACE2_EVENT="$(pwd)/run-prefetch.txt" git maintenance run --task=prefetch 2>/dev/null &&
>  	fetchargs="--prune --no-tags --no-write-fetch-head --recurse-submodules=no --refmap= --quiet" &&
> -	test_subcommand git fetch remote1 $fetchargs +refs/heads/*:refs/prefetch/remote1/* <run-prefetch.txt &&
> -	test_subcommand git fetch remote2 $fetchargs +refs/heads/*:refs/prefetch/remote2/* <run-prefetch.txt &&
> +	test_subcommand git fetch remote1 $fetchargs +refs/heads/*:refs/prefetch/remotes/remote1/* <run-prefetch.txt &&
> +	test_subcommand git fetch remote2 $fetchargs +refs/heads/*:refs/prefetch/remotes/remote2/* <run-prefetch.txt &&
> [...]
>
> +	fetchargs="--prune --no-tags --no-write-fetch-head --recurse-submodules=no --refmap= --quiet" &&

It seems we should at least have a test for the case of having a refspec
that pulls down tags.

I suspect that we could document this as an absolute before, as
refs/heads/* is the only namespace that'll refuse tags, but now that we
fetch refs/<whatever> that'll no longer be the case.

I'd think that this new behavior (if I'm right) is a feature, but that
we need to update the docs/tests appropriately.
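
To illustrate the concern, here is a small shell sketch of the
destination rewrite as I read it from the quoted C loop; prefetch_dst is
a hypothetical name and this is not the code in builtin/gc.c. The
mapping does not care which namespace under refs/ the destination is in.

```shell
# Hypothetical helper mirroring the quoted logic: replace a leading
# "refs/" with "refs/prefetch/", otherwise prepend "refs/prefetch/".
prefetch_dst () {
	case "$1" in
	refs/*)
		echo "refs/prefetch/${1#refs/}"
		;;
	*)
		echo "refs/prefetch/$1"
		;;
	esac
}
```

Under this sketch a destination of refs/tags/* becomes
refs/prefetch/tags/*, just as refs/remotes/remote1/* becomes
refs/prefetch/remotes/remote1/*.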


* Re: [PATCH 3/5] refspec: output a refspec item
  2021-04-07  8:46   ` Ævar Arnfjörð Bjarmason
@ 2021-04-07 20:53     ` Derrick Stolee
  2021-04-07 22:05       ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 72+ messages in thread
From: Derrick Stolee @ 2021-04-07 20:53 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, Derrick Stolee via GitGitGadget
  Cc: git, tom.saeger, gitster, sunshine, Derrick Stolee, Derrick Stolee

On 4/7/2021 4:46 AM, Ævar Arnfjörð Bjarmason wrote:
> 
> On Mon, Apr 05 2021, Derrick Stolee via GitGitGadget wrote:
>> +	return buf.buf;
> 
> There's a downthread discussion about the strbuf usage here so that's
> covered.

And it's fixed in v2.

> But I'm still confused about the need for this function and the
> following two patches. If we apply this on top of your series:
>     
>     diff --git a/t/helper/test-refspec.c b/t/helper/test-refspec.c
>     index 08cf441a0a0..9e099e43ebf 100644
>     --- a/t/helper/test-refspec.c
>     +++ b/t/helper/test-refspec.c
>     @@ -31,7 +31,7 @@ int cmd__refspec(int argc, const char **argv)
>                             continue;
>                     }
>     
>     -               printf("%s\n", refspec_item_format(&rsi));
>     +               puts(line.buf);
>                     refspec_item_clear(&rsi);
>             }
> 
> The only failing test is:
>     
>     + diff -u expect output
>     --- expect      2021-04-07 08:12:05.577598038 +0000
>     +++ output      2021-04-07 08:12:05.577598038 +0000
>     @@ -11,5 +11,5 @@
>      refs/heads*/for-linus:refs/remotes/mine/*
>      2e36527f23b7f6ae15e6f21ac3b08bf3fed6ee48:refs/heads/fixed
>      HEAD
>     -HEAD
>     +@
>      :

It should be obvious that taking refspecs as input, parsing them,
then reformatting them for output should be almost equivalent to
printing the input line.

The point is to exercise the logic that actually formats the
refspec for output. The test-tool clearly does this.

The logic for converting a 'struct refspec_item' to a string is
non-trivial and worth testing. I don't understand why you are
concerned that the black-box of the test-tool could be done
more easily to "trick" the test script.

> So the purpose of this new API is that we don't want to make the
> assumption that strrchr(buf, ':') is a safe way to find the delimiter in
> the refspec, or is there some case where we grok "HEAD" but not "@"
> that's buggy, but not tested for in this series?

The purpose is to allow us to modify a 'struct refspec_item' and
produce a refspec string instead of munging a refspec string
directly.

Thanks,
-Stolee
 


* Re: [PATCH 3/5] refspec: output a refspec item
  2021-04-07 20:53     ` Derrick Stolee
@ 2021-04-07 22:05       ` Ævar Arnfjörð Bjarmason
  2021-04-07 22:49         ` Junio C Hamano
  0 siblings, 1 reply; 72+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-04-07 22:05 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: Derrick Stolee via GitGitGadget, git, tom.saeger, gitster,
	sunshine, Derrick Stolee, Derrick Stolee


On Wed, Apr 07 2021, Derrick Stolee wrote:

> On 4/7/2021 4:46 AM, Ævar Arnfjörð Bjarmason wrote:
>> 
>> On Mon, Apr 05 2021, Derrick Stolee via GitGitGadget wrote:
>>> +	return buf.buf;
>> 
>> There's a downthread discussion about the strbuf usage here so that's
>> covered.
>
> And it's fixed in v2.
>
>> But I'm still confused about the need for this function and the
>> following two patches. If we apply this on top of your series:
>>     
>>     diff --git a/t/helper/test-refspec.c b/t/helper/test-refspec.c
>>     index 08cf441a0a0..9e099e43ebf 100644
>>     --- a/t/helper/test-refspec.c
>>     +++ b/t/helper/test-refspec.c
>>     @@ -31,7 +31,7 @@ int cmd__refspec(int argc, const char **argv)
>>                             continue;
>>                     }
>>     
>>     -               printf("%s\n", refspec_item_format(&rsi));
>>     +               puts(line.buf);
>>                     refspec_item_clear(&rsi);
>>             }
>> 
>> The only failing test is:
>>     
>>     + diff -u expect output
>>     --- expect      2021-04-07 08:12:05.577598038 +0000
>>     +++ output      2021-04-07 08:12:05.577598038 +0000
>>     @@ -11,5 +11,5 @@
>>      refs/heads*/for-linus:refs/remotes/mine/*
>>      2e36527f23b7f6ae15e6f21ac3b08bf3fed6ee48:refs/heads/fixed
>>      HEAD
>>     -HEAD
>>     +@
>>      :
>
> It should be obvious that taking refspecs as input, parsing them,
> then reformatting them for output should be almost equivalent to
> printing the input line.
>
> The point is to exercise the logic that actually formats the
> refspec for output. The test-tool clearly does this.
>
> The logic for converting a 'struct refspec_item' to a string is
> non-trivial and worth testing. I don't understand why you are
> concerned that the black-box of the test-tool could be done
> more easily to "trick" the test script.

Yes, but why do we need to convert it to a struct refspec_item in the
first place?

Maybe I'm just overly comfortable with string munging but I think the
smaller patch-on-top to use strbuf_splice() is simpler than adding a new
API just for this use-case.
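
For comparison, the string-munging approach can be sketched in a few
lines of shell (this is not the strbuf_splice() patch itself): split at
the last ":", keep any leading "+" with the source, and re-root the
destination under refs/prefetch/. prefetch_refspec is a hypothetical
name; refspecs with no ":" are passed through unchanged here, and
nothing like "@" is normalized to "HEAD".

```shell
# Hypothetical helper: rewrite "[+]<src>:<dst>" so <dst> lands under
# refs/prefetch/. Negative ("^...") and colon-less refspecs pass through.
prefetch_refspec () {
	spec=$1
	case "$spec" in
	^*)
		# Negative refspec: no destination to rewrite.
		printf '%s\n' "$spec"
		return
		;;
	*:*)
		;;
	*)
		# No explicit destination: left alone in this sketch.
		printf '%s\n' "$spec"
		return
		;;
	esac
	src=${spec%:*}	# everything before the last ":" (keeps a leading "+")
	dst=${spec##*:}	# everything after the last ":"
	# Strip a leading "refs/" if present, then prepend "refs/prefetch/".
	printf '%s:refs/prefetch/%s\n' "$src" "${dst#refs/}"
}
```

The trade-off is the one under discussion: this is short, but it relies
on the textual form of the refspec instead of the parsed fields.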

But I'm still wondering if that @ v.s. HEAD case is something this
series actually needs in its end goal (but then has a missing test?), or
if it was just a "let's test the guts of the refspec.c while we're at
it".

>> So the purpose of this new API is that we don't want to make the
>> assumption that strrchr(buf, ':') is a safe way to find the delimiter in
>> the refspec, or is there some case where we grok "HEAD" but not "@"
>> that's buggy, but not tested for in this series?
>
> The purpose is to allow us to modify a 'struct refspec_item' and
> produce a refspec string instead of munging a refspec string
> directly.

But aren't we doing that all over the place, e.g. the grep results for
"refspec_appendf"? Even for things purely constructed on the C API level
we pass a const char* now.

I'm not saying it wouldn't be nice to have the refspec.c API changed to
have a clear delimitation between its const char* handling, and C-level
uses which could construct and pass a "struct refspec_item" instead.

But is it *needed* here in a way that I've missed, or is this just a
partial testing/refactoring of that API while we're at it?

[Guessing ahead here because of our TZ difference]:

It seems to me that if this is such a partial refactoring it's a strange
way to go about it.

We're left with freeing/munging the "struct refspec" src/dst in-place
and re-constructing a string that has "+" etc., but we already had that
data in parse_refspec() just before we'd call
refspec_item_format(). That function could then just spew out a
pre-formatted string we'd squirreled away in "struct refspec_item".

If the lengthy paragraph you have at the end of 4/5 holds true, then
such an internal representation doesn't need to have the "refs/" prefix
stored as a const char* (in cases where it's not just "*" or whatever),
no? We'd then be able to more easily init/copy/munge the refspec for
formatting.


* Re: [PATCH 3/5] refspec: output a refspec item
  2021-04-07 22:05       ` Ævar Arnfjörð Bjarmason
@ 2021-04-07 22:49         ` Junio C Hamano
  2021-04-07 23:01           ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 72+ messages in thread
From: Junio C Hamano @ 2021-04-07 22:49 UTC (permalink / raw)
  To: Derrick Stolee, Ævar Arnfjörð Bjarmason
  Cc: Derrick Stolee via GitGitGadget, git, tom.saeger, sunshine,
	Derrick Stolee, Derrick Stolee

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> On Wed, Apr 07 2021, Derrick Stolee wrote:
>
>> The purpose is to allow us to modify a 'struct refspec_item'
>> and produce a refspec string instead of munging a refspec string
>> directly.

Ouch.  I thought the goal was to take

    [remote "origin"]
	fetch = $src:$dst

let the code that is used in the actual fetching to parse it into
the in-core "refspec_item", and then transform the refspec_item by

 - discarding it if the item does not result in storing in the real
   fetch

 - tweaking $dst side so that it won't touch anywhere outside
   refs/prefetch/ to avoid disturbing end-user's notion of what the
   latest state of the remote ref is.

so that the "parsed" refspec_item is passed to the fetch machinery
without ever having to be converted back to textual form.

Why do we even need to "and produce a refspec string"?


* Re: [PATCH 3/5] refspec: output a refspec item
  2021-04-07 22:49         ` Junio C Hamano
@ 2021-04-07 23:01           ` Ævar Arnfjörð Bjarmason
  2021-04-08  7:33             ` Junio C Hamano
  0 siblings, 1 reply; 72+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-04-07 23:01 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Derrick Stolee, Derrick Stolee via GitGitGadget, git, tom.saeger,
	sunshine, Derrick Stolee, Derrick Stolee


On Thu, Apr 08 2021, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
>> On Wed, Apr 07 2021, Derrick Stolee wrote:
>>
>>> The purpose is to allow us to modify a 'struct refspec_item'
>>> and produce a refspec string instead of munging a refspec string
>>> directly.
>
> Ouch.  I thought the goal was to take
>
>     [remote "origin"]
> 	fetch = $src:$dst
>
> let the code that is used in the actual fetching to parse it into
> the in-core "refspec_item", and then transform the refspec_item by
>
>  - discarding it if the item does not result in storing in the real
>    fetch
>
>  - tweaking $dst side so that it won't touch anywhere outside
>    refs/prefetch/ to avoid disturbing end-user's notion of what the
>    latest state of the remote ref is.
>
> so that the "parsed" refspec_item is passed to the fetch machinery
> without ever having to be converted back to textual form.
>
> Why do we even need to "and produce a refspec string"?

We're shelling out to "git fetch", but we first munge the refspec in
"git gc".

But yes, it seems even more straightforward to do away with passing the
refspec at all to "git fetch", and instead pass some (maybe internal
only, and documented as such) "--refspec-dst-prefix=refs/prefetch/"
option to "git fetch".

I.e. get_ref_map() over there is already doing a version of this loop
over "remote->fetch.nr".

So instead of "git gc" doing the loop, then passing all the refspecs on
the command-line, it could tell "git fetch" to do the same munging when
it does the same iteration.

Doing the munging in builtin/gc.c's fetch_remote() just seems like a
relic from when we didn't care what decision builtin/fetch.c made about
refspecs, we always wanted our custom one.



* Re: [PATCH v2 4/5] test-tool: test refspec input/output
  2021-04-06 18:47   ` [PATCH v2 4/5] test-tool: test refspec input/output Derrick Stolee via GitGitGadget
@ 2021-04-07 23:08     ` Josh Steadmon
  2021-04-07 23:26     ` Emily Shaffer
  1 sibling, 0 replies; 72+ messages in thread
From: Josh Steadmon @ 2021-04-07 23:08 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: git, tom.saeger, gitster, sunshine, Derrick Stolee,
	Derrick Stolee, Derrick Stolee

On 2021.04.06 18:47, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <dstolee@microsoft.com>
> 
> Add a new test-helper, 'test-tool refspec', that currently reads stdin
> line-by-line and translates the refspecs using the parsing logic of
> refspec_item_init() and writes them to output.
> 
> Create a test in t5511-refspec.sh that uses this helper to test several
> known special cases. This includes all of the special properties of the
> 'struct refspec_item', including:
> 
>  * force: The refspec starts with '+'.
>  * pattern: Each side of the refspec has a glob character ('*')
>  * matching: The refspec is simply the string ":".
>  * exact_sha1: The 'src' string is a 40-character hex string.
>  * negative: The refspec starts with '^' and 'dst' is NULL.
> 
> While the exact_sha1 property doesn't require special logic in
> refspec_item_format, it is still tested here for completeness.
> 
> There is also the special-case refspec "@" which translates to "HEAD".
> 
> Note that if a refspec does not start with "refs/", then that is not
> incorporated as part of the 'struct refspec_item'. This behavior is
> confirmed by these tests. These refspecs still work in the wild because
> the refs layer interprets them appropriately as branches, prepending
> "refs/" or "refs/heads/" as necessary. I spent some time attempting to
> insert these prefixes explicitly in parse_refspec(), but there are
> several subtleties I was unable to overcome. If such a change were to be
> made, then this new test in t5511-refspec.sh will need to be updated
> with new output. For example, the input lines ending with "translated"
> are designed to demonstrate these subtleties.
> 
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
>  Makefile                |  1 +
>  t/helper/test-refspec.c | 44 +++++++++++++++++++++++++++++++++++++++++
>  t/helper/test-tool.c    |  1 +
>  t/helper/test-tool.h    |  1 +
>  t/t5511-refspec.sh      | 41 ++++++++++++++++++++++++++++++++++++++
>  5 files changed, 88 insertions(+)
>  create mode 100644 t/helper/test-refspec.c
> 
> diff --git a/Makefile b/Makefile
> index a6a73c574191..f858c9f25976 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -734,6 +734,7 @@ TEST_BUILTINS_OBJS += test-reach.o
>  TEST_BUILTINS_OBJS += test-read-cache.o
>  TEST_BUILTINS_OBJS += test-read-graph.o
>  TEST_BUILTINS_OBJS += test-read-midx.o
> +TEST_BUILTINS_OBJS += test-refspec.o
>  TEST_BUILTINS_OBJS += test-ref-store.o
>  TEST_BUILTINS_OBJS += test-regex.o
>  TEST_BUILTINS_OBJS += test-repository.o
> diff --git a/t/helper/test-refspec.c b/t/helper/test-refspec.c
> new file mode 100644
> index 000000000000..b06735ded208
> --- /dev/null
> +++ b/t/helper/test-refspec.c
> @@ -0,0 +1,44 @@
> +#include "cache.h"
> +#include "parse-options.h"
> +#include "refspec.h"
> +#include "strbuf.h"
> +#include "test-tool.h"
> +
> +static const char * const refspec_usage[] = {
> +	N_("test-tool refspec [--fetch]"),
> +	NULL
> +};
> +
> +int cmd__refspec(int argc, const char **argv)
> +{
> +	struct strbuf line = STRBUF_INIT;
> +	int fetch = 0;
> +
> +	struct option refspec_options [] = {
> +		OPT_BOOL(0, "fetch", &fetch,
> +			 N_("enable the 'fetch' option for parsing refpecs")),

Typo here: s/refpecs/refspecs/


> +		OPT_END()
> +	};
> +
> +	argc = parse_options(argc, argv, NULL, refspec_options,
> +			     refspec_usage, 0);
> +
> +	while (strbuf_getline(&line, stdin) != EOF) {
> +		struct refspec_item rsi;
> +		char *buf;
> +
> +		if (!refspec_item_init(&rsi, line.buf, fetch)) {
> +			printf("failed to parse %s\n", line.buf);
> +			continue;
> +		}
> +
> +		buf = refspec_item_format(&rsi);
> +		printf("%s\n", buf);
> +		free(buf);
> +
> +		refspec_item_clear(&rsi);
> +	}
> +
> +	strbuf_release(&line);
> +	return 0;
> +}
> diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
> index 287aa6002307..f534ad1731a9 100644
> --- a/t/helper/test-tool.c
> +++ b/t/helper/test-tool.c
> @@ -55,6 +55,7 @@ static struct test_cmd cmds[] = {
>  	{ "read-cache", cmd__read_cache },
>  	{ "read-graph", cmd__read_graph },
>  	{ "read-midx", cmd__read_midx },
> +	{ "refspec", cmd__refspec },
>  	{ "ref-store", cmd__ref_store },
>  	{ "regex", cmd__regex },
>  	{ "repository", cmd__repository },
> diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h
> index 9ea4b31011dd..46a0b8850f17 100644
> --- a/t/helper/test-tool.h
> +++ b/t/helper/test-tool.h
> @@ -44,6 +44,7 @@ int cmd__reach(int argc, const char **argv);
>  int cmd__read_cache(int argc, const char **argv);
>  int cmd__read_graph(int argc, const char **argv);
>  int cmd__read_midx(int argc, const char **argv);
> +int cmd__refspec(int argc, const char **argv);
>  int cmd__ref_store(int argc, const char **argv);
>  int cmd__regex(int argc, const char **argv);
>  int cmd__repository(int argc, const char **argv);
> diff --git a/t/t5511-refspec.sh b/t/t5511-refspec.sh
> index be025b90f989..489bec08d570 100755
> --- a/t/t5511-refspec.sh
> +++ b/t/t5511-refspec.sh
> @@ -93,4 +93,45 @@ test_refspec fetch "refs/heads/${good}"
>  bad=$(printf '\011tab')
>  test_refspec fetch "refs/heads/${bad}"				invalid
>  
> +test_expect_success 'test input/output round trip' '
> +	cat >input <<-\EOF &&
> +	+refs/heads/*:refs/remotes/origin/*
> +	refs/heads/*:refs/remotes/origin/*
> +	refs/heads/main:refs/remotes/frotz/xyzzy
> +	:refs/remotes/frotz/deleteme
> +	^refs/heads/secrets
> +	refs/heads/secret:refs/heads/translated
> +	refs/heads/secret:heads/translated
> +	refs/heads/secret:remotes/translated
> +	secret:translated
> +	refs/heads/*:remotes/xxy/*
> +	refs/heads*/for-linus:refs/remotes/mine/*
> +	2e36527f23b7f6ae15e6f21ac3b08bf3fed6ee48:refs/heads/fixed
> +	HEAD
> +	@
> +	:
> +	EOF
> +	cat >expect <<-\EOF &&
> +	+refs/heads/*:refs/remotes/origin/*
> +	refs/heads/*:refs/remotes/origin/*
> +	refs/heads/main:refs/remotes/frotz/xyzzy
> +	:refs/remotes/frotz/deleteme
> +	^refs/heads/secrets
> +	refs/heads/secret:refs/heads/translated
> +	refs/heads/secret:heads/translated
> +	refs/heads/secret:remotes/translated
> +	secret:translated
> +	refs/heads/*:remotes/xxy/*
> +	refs/heads*/for-linus:refs/remotes/mine/*
> +	2e36527f23b7f6ae15e6f21ac3b08bf3fed6ee48:refs/heads/fixed
> +	HEAD
> +	HEAD
> +	:
> +	EOF
> +	test-tool refspec <input >output &&
> +	test_cmp expect output &&
> +	test-tool refspec --fetch <input >output &&
> +	test_cmp expect output
> +'
> +
>  test_done
> -- 
> gitgitgadget
> 


* Re: [PATCH v2 5/5] maintenance: allow custom refspecs during prefetch
  2021-04-06 18:47   ` [PATCH v2 5/5] maintenance: allow custom refspecs during prefetch Derrick Stolee via GitGitGadget
  2021-04-06 19:36     ` Tom Saeger
@ 2021-04-07 23:09     ` Josh Steadmon
  2021-04-07 23:37     ` Emily Shaffer
  2021-04-08  0:23     ` Jonathan Tan
  3 siblings, 0 replies; 72+ messages in thread
From: Josh Steadmon @ 2021-04-07 23:09 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: git, tom.saeger, gitster, sunshine, Derrick Stolee,
	Derrick Stolee, Derrick Stolee

On 2021.04.06 18:47, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <dstolee@microsoft.com>
> 
> The prefetch task previously used the default refspec source plus a
> custom refspec destination to avoid colliding with remote refs:
> 
> 	+refs/heads/*:refs/prefetch/<remote>/*
> 
> However, some users customize their refspec to reduce how much data they
> download from specific remotes. This can involve restrictive patterns
> for fetching or negative patterns to avoid downloading some refs.
> 
> Modify fetch_remote() to iterate over the remote's refspec list and
> translate that into the appropriate prefetch scenario. Specifically,
> re-parse the raw form of the refspec into a new 'struct refspec' and
> modify the 'dst' member to replace a leading "refs/" substring with
> "refs/prefetch/", or prepend "refs/prefetch/" to 'dst' otherwise.
> Negative refspecs do not have a 'dst' so they can be transferred to the
> 'git fetch' command unmodified.
> 
> This prefix change provides the benefit of keeping whatever collisions
> may exist in the custom refspecs, if that is a desirable outcome.
> 
> This changes the names of the refs that would be fetched by the default
> refspec. Instead of "refs/prefetch/<remote>/<branch>" they will now go
> to "refs/prefetch/remotes/<remote>/<branch>". While this is a change, it
> is not a seriously breaking one: these refs are intended to be hidden
> and not used.
> 
> Update the documentation to be more generic about the destination refs.
> Do not mention custom refpecs explicitly, as that does not need to be

Typo here: s/refpecs/refspecs/


> highlighted in this documentation. The important part of placing refs in
> refs/prefetch remains.
> 
> Reported-by: Tom Saeger <tom.saeger@oracle.com>
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
>  Documentation/git-maintenance.txt |  3 +--
>  builtin/gc.c                      | 37 +++++++++++++++++++++++++-
>  t/t7900-maintenance.sh            | 43 ++++++++++++++++++++++++++-----
>  3 files changed, 74 insertions(+), 9 deletions(-)
> 
> diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt
> index 80ddd33ceba0..95a24264eb10 100644
> --- a/Documentation/git-maintenance.txt
> +++ b/Documentation/git-maintenance.txt
> @@ -94,8 +94,7 @@ prefetch::
>  	objects from all registered remotes. For each remote, a `git fetch`
>  	command is run. The refmap is custom to avoid updating local or remote
>  	branches (those in `refs/heads` or `refs/remotes`). Instead, the
> -	remote refs are stored in `refs/prefetch/<remote>/`. Also, tags are
> -	not updated.
> +	refs are stored in `refs/prefetch/`. Also, tags are not updated.
>  +
>  This is done to avoid disrupting the remote-tracking branches. The end users
>  expect these refs to stay unmoved unless they initiate a fetch.  With prefetch
> diff --git a/builtin/gc.c b/builtin/gc.c
> index fa8128de9ae1..76f347dd6b11 100644
> --- a/builtin/gc.c
> +++ b/builtin/gc.c
> @@ -32,6 +32,7 @@
>  #include "remote.h"
>  #include "object-store.h"
>  #include "exec-cmd.h"
> +#include "refspec.h"
>  
>  #define FAILED_RUN "failed to run %s"
>  
> @@ -877,6 +878,7 @@ static int fetch_remote(struct remote *remote, void *cbdata)
>  {
>  	struct maintenance_run_opts *opts = cbdata;
>  	struct child_process child = CHILD_PROCESS_INIT;
> +	int i;
>  
>  	child.git_cmd = 1;
>  	strvec_pushl(&child.args, "fetch", remote->name, "--prune", "--no-tags",
> @@ -886,7 +888,40 @@ static int fetch_remote(struct remote *remote, void *cbdata)
>  	if (opts->quiet)
>  		strvec_push(&child.args, "--quiet");
>  
> -	strvec_pushf(&child.args, "+refs/heads/*:refs/prefetch/%s/*", remote->name);
> +	for (i = 0; i < remote->fetch.nr; i++) {
> +		struct refspec_item replace;
> +		struct refspec_item *rsi = &remote->fetch.items[i];
> +		struct strbuf new_dst = STRBUF_INIT;
> +		size_t ignore_len = 0;
> +		char *replace_string;
> +
> +		if (rsi->negative) {
> +			strvec_push(&child.args, remote->fetch.raw[i]);
> +			continue;
> +		}
> +
> +		refspec_item_init(&replace, remote->fetch.raw[i], 1);
> +
> +		/*
> +		 * If a refspec dst starts with "refs/" at the start,
> +		 * then we will replace "refs/" with "refs/prefetch/".
> +		 * Otherwise, we will prepend the dst string with
> +		 * "refs/prefetch/".
> +		 */
> +		if (!strncmp(replace.dst, "refs/", 5))
> +			ignore_len = 5;
> +
> +		strbuf_addstr(&new_dst, "refs/prefetch/");
> +		strbuf_addstr(&new_dst, replace.dst + ignore_len);
> +		free(replace.dst);
> +		replace.dst = strbuf_detach(&new_dst, NULL);
> +
> +		replace_string = refspec_item_format(&replace);
> +		strvec_push(&child.args, replace_string);
> +		free(replace_string);
> +
> +		refspec_item_clear(&replace);
> +	}
>  
>  	return !!run_command(&child);
>  }
> diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
> index 37eed6ed3aa3..03487be3af38 100755
> --- a/t/t7900-maintenance.sh
> +++ b/t/t7900-maintenance.sh
> @@ -142,20 +142,51 @@ test_expect_success 'prefetch multiple remotes' '
>  	test_commit -C clone2 two &&
>  	GIT_TRACE2_EVENT="$(pwd)/run-prefetch.txt" git maintenance run --task=prefetch 2>/dev/null &&
>  	fetchargs="--prune --no-tags --no-write-fetch-head --recurse-submodules=no --refmap= --quiet" &&
> -	test_subcommand git fetch remote1 $fetchargs "+refs/heads/*:refs/prefetch/remote1/*" <run-prefetch.txt &&
> -	test_subcommand git fetch remote2 $fetchargs "+refs/heads/*:refs/prefetch/remote2/*" <run-prefetch.txt &&
> +	test_subcommand git fetch remote1 $fetchargs "+refs/heads/*:refs/prefetch/remotes/remote1/*" <run-prefetch.txt &&
> +	test_subcommand git fetch remote2 $fetchargs "+refs/heads/*:refs/prefetch/remotes/remote2/*" <run-prefetch.txt &&
>  	test_path_is_missing .git/refs/remotes &&
> -	git log prefetch/remote1/one &&
> -	git log prefetch/remote2/two &&
> +	git log prefetch/remotes/remote1/one &&
> +	git log prefetch/remotes/remote2/two &&
>  	git fetch --all &&
> -	test_cmp_rev refs/remotes/remote1/one refs/prefetch/remote1/one &&
> -	test_cmp_rev refs/remotes/remote2/two refs/prefetch/remote2/two &&
> +	test_cmp_rev refs/remotes/remote1/one refs/prefetch/remotes/remote1/one &&
> +	test_cmp_rev refs/remotes/remote2/two refs/prefetch/remotes/remote2/two &&
>  
>  	test_cmp_config refs/prefetch/ log.excludedecoration &&
>  	git log --oneline --decorate --all >log &&
>  	! grep "prefetch" log
>  '
>  
> +test_expect_success 'prefetch custom refspecs' '
> +	git -C clone1 branch -f special/fetched HEAD &&
> +	git -C clone1 branch -f special/secret/not-fetched HEAD &&
> +
> +	# create multiple refspecs for remote1
> +	git config --add remote.remote1.fetch "+refs/heads/special/fetched:refs/heads/fetched" &&
> +	git config --add remote.remote1.fetch "^refs/heads/special/secret/not-fetched" &&
> +
> +	GIT_TRACE2_EVENT="$(pwd)/prefetch-refspec.txt" git maintenance run --task=prefetch 2>/dev/null &&
> +
> +	fetchargs="--prune --no-tags --no-write-fetch-head --recurse-submodules=no --refmap= --quiet" &&
> +
> +	# skips second refspec because it is not a pattern type
> +	rs1="+refs/heads/*:refs/prefetch/remotes/remote1/*" &&
> +	rs2="+refs/heads/special/fetched:refs/prefetch/heads/fetched" &&
> +	rs3="^refs/heads/special/secret/not-fetched" &&
> +
> +	test_subcommand git fetch remote1 $fetchargs "$rs1" "$rs2" "$rs3" <prefetch-refspec.txt &&
> +	test_subcommand git fetch remote2 $fetchargs "+refs/heads/*:refs/prefetch/remotes/remote2/*" <prefetch-refspec.txt &&
> +
> +	# first refspec is overridden by second
> +	test_must_fail git rev-parse refs/prefetch/special/fetched &&
> +	git rev-parse refs/prefetch/heads/fetched &&
> +
> +	# possible incorrect places for the non-fetched ref
> +	test_must_fail git rev-parse refs/prefetch/remotes/remote1/secret/not-fetched &&
> +	test_must_fail git rev-parse refs/prefetch/remotes/remote1/not-fetched &&
> +	test_must_fail git rev-parse refs/heads/secret/not-fetched &&
> +	test_must_fail git rev-parse refs/heads/not-fetched
> +'
> +
>  test_expect_success 'prefetch and existing log.excludeDecoration values' '
>  	git config --unset-all log.excludeDecoration &&
>  	git config log.excludeDecoration refs/remotes/remote1/ &&
> -- 
> gitgitgadget

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v2 1/5] maintenance: simplify prefetch logic
  2021-04-06 18:47   ` [PATCH v2 1/5] maintenance: simplify prefetch logic Derrick Stolee via GitGitGadget
@ 2021-04-07 23:23     ` Emily Shaffer
  2021-04-09 19:00       ` Derrick Stolee
  0 siblings, 1 reply; 72+ messages in thread
From: Emily Shaffer @ 2021-04-07 23:23 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: git, tom.saeger, gitster, sunshine, Derrick Stolee,
	Derrick Stolee, Derrick Stolee

Overall, the series looks pretty solid to me - which is why I've got a
handful of small nits to relay. :)

On Tue, Apr 06, 2021 at 06:47:46PM +0000, Derrick Stolee via GitGitGadget wrote:
> -static int fetch_remote(const char *remote, struct maintenance_run_opts *opts)
> +static int fetch_remote(struct remote *remote, void *cbdata)
>  {
> +	struct maintenance_run_opts *opts = cbdata;
[snip]
>  	if (opts->quiet)
I worry about the lack of null-checking here.

 - Emily

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v2 4/5] test-tool: test refspec input/output
  2021-04-06 18:47   ` [PATCH v2 4/5] test-tool: test refspec input/output Derrick Stolee via GitGitGadget
  2021-04-07 23:08     ` Josh Steadmon
@ 2021-04-07 23:26     ` Emily Shaffer
  1 sibling, 0 replies; 72+ messages in thread
From: Emily Shaffer @ 2021-04-07 23:26 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: git, tom.saeger, gitster, sunshine, Derrick Stolee,
	Derrick Stolee, Derrick Stolee

On Tue, Apr 06, 2021 at 06:47:49PM +0000, Derrick Stolee via GitGitGadget wrote:
> diff --git a/t/t5511-refspec.sh b/t/t5511-refspec.sh
> index be025b90f989..489bec08d570 100755
> --- a/t/t5511-refspec.sh
> +++ b/t/t5511-refspec.sh
> @@ -93,4 +93,45 @@ test_refspec fetch "refs/heads/${good}"
>  bad=$(printf '\011tab')
>  test_refspec fetch "refs/heads/${bad}"				invalid
>  
> +test_expect_success 'test input/output round trip' '
> +	cat >input <<-\EOF &&
> +	+refs/heads/*:refs/remotes/origin/*
> +	refs/heads/*:refs/remotes/origin/*
> +	refs/heads/main:refs/remotes/frotz/xyzzy
> +	:refs/remotes/frotz/deleteme
> +	^refs/heads/secrets
> +	refs/heads/secret:refs/heads/translated
> +	refs/heads/secret:heads/translated
> +	refs/heads/secret:remotes/translated
> +	secret:translated
> +	refs/heads/*:remotes/xxy/*
> +	refs/heads*/for-linus:refs/remotes/mine/*
> +	2e36527f23b7f6ae15e6f21ac3b08bf3fed6ee48:refs/heads/fixed
> +	HEAD
> +	@
> +	:
> +	EOF
> +	cat >expect <<-\EOF &&
> +	+refs/heads/*:refs/remotes/origin/*
> +	refs/heads/*:refs/remotes/origin/*
> +	refs/heads/main:refs/remotes/frotz/xyzzy
> +	:refs/remotes/frotz/deleteme
> +	^refs/heads/secrets
> +	refs/heads/secret:refs/heads/translated
> +	refs/heads/secret:heads/translated
> +	refs/heads/secret:remotes/translated
> +	secret:translated
> +	refs/heads/*:remotes/xxy/*
> +	refs/heads*/for-linus:refs/remotes/mine/*
> +	2e36527f23b7f6ae15e6f21ac3b08bf3fed6ee48:refs/heads/fixed
> +	HEAD
> +	HEAD
> +	:
> +	EOF

I don't like these input/expect here-docs. They are almost entirely
identical, which means that the reader either A) spends a toilsome few
minutes checking every single line to be sure they are not identical, or
B) reads the first three lines, decides they're the same, and misses the
@->HEAD special case.

Why not instead run the test once for all the lines which should be the
same before and after the parse, and again for all the lines which
should differ, to reduce burden on the reader?

 - Emily

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v2 5/5] maintenance: allow custom refspecs during prefetch
  2021-04-06 18:47   ` [PATCH v2 5/5] maintenance: allow custom refspecs during prefetch Derrick Stolee via GitGitGadget
  2021-04-06 19:36     ` Tom Saeger
  2021-04-07 23:09     ` Josh Steadmon
@ 2021-04-07 23:37     ` Emily Shaffer
  2021-04-08  0:23     ` Jonathan Tan
  3 siblings, 0 replies; 72+ messages in thread
From: Emily Shaffer @ 2021-04-07 23:37 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: git, tom.saeger, gitster, sunshine, Derrick Stolee,
	Derrick Stolee, Derrick Stolee

On Tue, Apr 06, 2021 at 06:47:50PM +0000, Derrick Stolee via GitGitGadget wrote:
> @@ -877,6 +878,7 @@ static int fetch_remote(struct remote *remote, void *cbdata)
[snip]
> +		/*
> +		 * If a refspec dst starts with "refs/" at the start,
> +		 * then we will replace "refs/" with "refs/prefetch/".
> +		 * Otherwise, we will prepend the dst string with
> +		 * "refs/prefetch/".
> +		 */
> +		if (!strncmp(replace.dst, "refs/", 5))
> +			ignore_len = 5;
Using a literal string plus the literal value of the string length,
twice, doesn't sit great with me...

> +
> +		strbuf_addstr(&new_dst, "refs/prefetch/");
> +		strbuf_addstr(&new_dst, replace.dst + ignore_len);
...plus with some ugly array pointer math. :) Why not use
git-compat-util.h:skip_prefix() instead of doing your own math? (In
fact, the doc comment on skip_prefix() talks about using it exactly for
stripping "refs/" off the beginning of a string :) )
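
For illustration, a rough sketch of that rewrite. It is not compiled
against git.git: skip_prefix() is reimplemented locally as a stand-in for
the git-compat-util.h helper, prefetch_dst() is a hypothetical name, and
plain malloc()/snprintf() are assumed where the patch uses strbuf.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Local stand-in modeled on skip_prefix() from git-compat-util.h:
 * if str begins with prefix, point *out past the prefix and return 1;
 * otherwise leave *out untouched and return 0.
 */
static int skip_prefix(const char *str, const char *prefix, const char **out)
{
	size_t len = strlen(prefix);

	if (strncmp(str, prefix, len))
		return 0;
	*out = str + len;
	return 1;
}

/*
 * Hypothetical helper: return a newly allocated copy of dst with a
 * leading "refs/" replaced by "refs/prefetch/", or with
 * "refs/prefetch/" prepended when dst does not start with "refs/".
 * The caller frees the result. Returns NULL on allocation failure.
 */
static char *prefetch_dst(const char *dst)
{
	static const char prefix[] = "refs/prefetch/";
	const char *rest = dst;
	size_t len;
	char *out;

	/* On no match, rest keeps pointing at the start of dst. */
	skip_prefix(dst, "refs/", &rest);

	len = strlen(prefix) + strlen(rest) + 1;
	out = malloc(len);
	if (out)
		snprintf(out, len, "%s%s", prefix, rest);
	return out;
}
```

With this, prefetch_dst("refs/remotes/origin/*") gives
"refs/prefetch/remotes/origin/*" and prefetch_dst("heads/translated")
gives "refs/prefetch/heads/translated", with no magic 5 and no explicit
pointer arithmetic at the call site.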

 - Emily

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v2 5/5] maintenance: allow custom refspecs during prefetch
  2021-04-06 18:47   ` [PATCH v2 5/5] maintenance: allow custom refspecs during prefetch Derrick Stolee via GitGitGadget
                       ` (2 preceding siblings ...)
  2021-04-07 23:37     ` Emily Shaffer
@ 2021-04-08  0:23     ` Jonathan Tan
  3 siblings, 0 replies; 72+ messages in thread
From: Jonathan Tan @ 2021-04-08  0:23 UTC (permalink / raw)
  To: gitgitgadget
  Cc: git, tom.saeger, gitster, sunshine, stolee, derrickstolee,
	dstolee, Jonathan Tan

> +test_expect_success 'prefetch custom refspecs' '
> +	git -C clone1 branch -f special/fetched HEAD &&
> +	git -C clone1 branch -f special/secret/not-fetched HEAD &&
> +
> +	# create multiple refspecs for remote1
> +	git config --add remote.remote1.fetch "+refs/heads/special/fetched:refs/heads/fetched" &&
> +	git config --add remote.remote1.fetch "^refs/heads/special/secret/not-fetched" &&
> +
> +	GIT_TRACE2_EVENT="$(pwd)/prefetch-refspec.txt" git maintenance run --task=prefetch 2>/dev/null &&
> +
> +	fetchargs="--prune --no-tags --no-write-fetch-head --recurse-submodules=no --refmap= --quiet" &&
> +
> +	# skips second refspec because it is not a pattern type

What second refspec is being skipped?

> +	rs1="+refs/heads/*:refs/prefetch/remotes/remote1/*" &&
> +	rs2="+refs/heads/special/fetched:refs/prefetch/heads/fetched" &&
> +	rs3="^refs/heads/special/secret/not-fetched" &&
> +
> +	test_subcommand git fetch remote1 $fetchargs "$rs1" "$rs2" "$rs3" <prefetch-refspec.txt &&
> +	test_subcommand git fetch remote2 $fetchargs "+refs/heads/*:refs/prefetch/remotes/remote2/*" <prefetch-refspec.txt &&

How is this command generated? I don't see any mention of remote2 in
this test. (If it's because this repo was configured in a previous test
and some of the configuration carried over, I think it's best to start
in a new repo or at least have the previous config be cleared.)

> +
> +	# first refspec is overridden by second
> +	test_must_fail git rev-parse refs/prefetch/special/fetched &&
> +	git rev-parse refs/prefetch/heads/fetched &&
> +
> +	# possible incorrect places for the non-fetched ref
> +	test_must_fail git rev-parse refs/prefetch/remotes/remote1/secret/not-fetched &&
> +	test_must_fail git rev-parse refs/prefetch/remotes/remote1/not-fetched &&
> +	test_must_fail git rev-parse refs/heads/secret/not-fetched &&
> +	test_must_fail git rev-parse refs/heads/not-fetched
> +'

Other than this and the other comments that others have brought up, this
series looks good.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 3/5] refspec: output a refspec item
  2021-04-07 23:01           ` Ævar Arnfjörð Bjarmason
@ 2021-04-08  7:33             ` Junio C Hamano
  0 siblings, 0 replies; 72+ messages in thread
From: Junio C Hamano @ 2021-04-08  7:33 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Derrick Stolee, Derrick Stolee via GitGitGadget, git, tom.saeger,
	sunshine, Derrick Stolee, Derrick Stolee

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> But yes, it seems even more straightforward to do away with passing the
> refspec at all to "git fetch", and instead pass some (maybe internal
> only, and documented as such) "--refspec-dst-prefix=refs/prefetch/"
> option to "git fetch".
>
> I.e. get_ref_map() over there is already doing a version of this loop
> over "remote->fetch.nr".
>
> So instead of "git gc" doing the loop, then passing all the refspecs on
> the command-line, it could tell "git fetch" to do the same munging when
> it does the same iteration.

That direction makes quite a lot of sense to me.

> Doing the munging in builtin/gc.c's fetch_remote() just seems like a
> relic from when we didn't care what decision builtin/fetch.c made about
> refspecs, we always wanted our custom one.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 5/5] maintenance: allow custom refspecs during prefetch
  2021-04-07 10:26     ` Ævar Arnfjörð Bjarmason
@ 2021-04-09 11:48       ` Derrick Stolee
  2021-04-09 19:28         ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 72+ messages in thread
From: Derrick Stolee @ 2021-04-09 11:48 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, Derrick Stolee via GitGitGadget
  Cc: git, tom.saeger, gitster, sunshine, Derrick Stolee, Derrick Stolee

On 4/7/2021 6:26 AM, Ævar Arnfjörð Bjarmason wrote:
> 
> On Wed, Apr 07 2021, Ævar Arnfjörð Bjarmason wrote:
> 
>> On Mon, Apr 05 2021, Derrick Stolee via GitGitGadget wrote:
>>> +	GIT_TRACE2_EVENT="$(pwd)/prefetch-refspec.txt" git maintenance run --task=prefetch 2>/dev/null &&
>>
>> I see this is following some established convention in the file, but is
>> there really not a way to make this pass without directing stderr to
>> /dev/null? It makes ad-hoc debugging when reviewing harder.
> 
> As I later found out this is copy/pasted to get around the fact that
> --quiet is dependent on isatty(), so without this the result would be
> different under --verbose and non-verbose testing.

Yes, adding --quiet directly is a better pattern.

> So that dates back to 3ddaad0e060 (maintenance: add --quiet option,
> 2020-09-17), but I see other quiet=isatty(2) in related code. I wish we
> could isolate that particular behavior so removing the 2>/dev/null when
> debugging the tests doesn't cause you to run into this, maybe an
> explicit --quiet or --no-quiet option for all but one test that's
> checking that isatty() behavior?
> 
>> I tried just removing it, but then (in an earlier test case) the
>> "test_subcommand" fails because it can't find the line we're looking
>> for, so us piping stderr to /dev/null impacts our trace2 output?
> 
> I hadn't seen test_subcommand before, sorry to be blunt, but "ew!".

Saying "I'm about to be rude" before being rude doesn't excuse it.
I had to step away and get over this comment before I could examine
the actually constructive feedback below.

A better way to communicate the same information could be "This test
helper seems more complicated than necessary, and has some gaps that
could be filled." I completely agree with this statement.

> So we're ad-hoc grepping trace2 JSON output just to find out whether we
> invoked some subcommand. But unlike test_expect_code etc. this one
> doesn't run git for you, but instead we have temp *.txt files and the
> command disconnected from the run.
> 
> And because you're using "grep" and "! grep" to test, you're hiding the
> difference between "did not find this line" v.s. "did not find anything
> at all".

You're right that there is value in comparing the ordered list of
subcommands run by a given Git command. That will catch buggy tests
that are checking that a subcommand doesn't run.

> Because of that the second test using test_subcommand is either buggy or
> painfully non-obvious. We check that "run --auto" doesn't contain a
> "auto --quiet", but in reality it doesn't contain any subcommands at
> all. We didn't run any because it exited with "nothing to pack".

That exit is from the third command, which does not pass the --auto
option. The "nothing to pack" output is from the 'git gc --no-quiet'
subcommand that is being checked in this third test.

> I think converting the whole thing to something like the WIP/RFC patch
> below is much better and more readable.

This is an interesting approach. I don't see you using the ERR that you
are inputting anywhere, so that seems like unnecessary bloat for the
consumers. Maybe I haven't discovered all of the places where this
would be useful, but it seems better to pipe stderr to a file for later
comparison when needed.

> +test_expect_process_tree () {
> +	depth= &&
> +	>actual &&
> +	cat >expect &&
> +	cat <&3 >expect.err
> +	while test $# != 0
> +	do
> +		case "$1" in
> +		--depth)
> +			depth="$2"
> +			shift
> +			;;
> +		*)
> +			break
> +			;;
> +		esac
> +		shift
> +	done &&
Do you have an example where this is being checked? Or can depth
be left as 1 for now?

> +	log="$(pwd)/proc-tree.txt" &&
> +	>"$log" &&
> +	GIT_TRACE2_PERF="$log" "$@" 2>actual.err &&
> +	grep "child_start" proc-tree.txt >proc-tree-start.txt || : &&
> +	if test -n "$depth"
> +	then
> +		grep " d$depth " proc-tree-start.txt >tmp.txt || : &&
> +		mv tmp.txt proc-tree-start.txt
> +	fi &&
> +	sed -e 's/^.*argv:\[//' -e 's/\]$//' <proc-tree-start.txt >actual &&
> +	test_cmp expect actual &&
> +	test_cmp expect.err actual.err
> +} 7>&2 2>&4

I think similar ideas could apply to test_region. Giving it a try
now.

-Stolee

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v2 1/5] maintenance: simplify prefetch logic
  2021-04-07 23:23     ` Emily Shaffer
@ 2021-04-09 19:00       ` Derrick Stolee
  0 siblings, 0 replies; 72+ messages in thread
From: Derrick Stolee @ 2021-04-09 19:00 UTC (permalink / raw)
  To: Emily Shaffer, Derrick Stolee via GitGitGadget
  Cc: git, tom.saeger, gitster, sunshine, Derrick Stolee, Derrick Stolee

On 4/7/2021 7:23 PM, Emily Shaffer wrote:
> Overall, the series looks pretty solid to me - which is why I've got a
> handful of small nits to relay. :)
> 
> On Tue, Apr 06, 2021 at 06:47:46PM +0000, Derrick Stolee via GitGitGadget wrote:
>> -static int fetch_remote(const char *remote, struct maintenance_run_opts *opts)
>> +static int fetch_remote(struct remote *remote, void *cbdata)
>>  {
>> +	struct maintenance_run_opts *opts = cbdata;
> [snip]
>>  	if (opts->quiet)
> I worry about the lack of null-checking here.

If this was a general-purpose method, I'd agree with you.
But since this is a static method being called exactly once
(and always passing a non-NULL value) then I don't believe
that NULL check is valuable here.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 5/5] maintenance: allow custom refspecs during prefetch
  2021-04-09 11:48       ` Derrick Stolee
@ 2021-04-09 19:28         ` Ævar Arnfjörð Bjarmason
  2021-04-10  0:56           ` Derrick Stolee
  0 siblings, 1 reply; 72+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-04-09 19:28 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: Derrick Stolee via GitGitGadget, git, tom.saeger, gitster,
	sunshine, Derrick Stolee, Derrick Stolee


On Fri, Apr 09 2021, Derrick Stolee wrote:

> On 4/7/2021 6:26 AM, Ævar Arnfjörð Bjarmason wrote:
>> 
>> On Wed, Apr 07 2021, Ævar Arnfjörð Bjarmason wrote:
>> 
>>> On Mon, Apr 05 2021, Derrick Stolee via GitGitGadget wrote:
>>>> +	GIT_TRACE2_EVENT="$(pwd)/prefetch-refspec.txt" git maintenance run --task=prefetch 2>/dev/null &&
>>>
>>> I see this is following some established convention in the file, but is
>>> there really not a way to make this pass without directing stderr to
>>> /dev/null? It makes ad-hoc debugging when reviewing harder.
>> 
>> As I later found out this is copy/pasted to get around the fact that
>> --quiet is dependent on isatty(), so without this the result would be
>> different under --verbose and non-verbose testing.
>
> Yes, adding --quiet directly is a better pattern.
>
>> So that dates back to 3ddaad0e060 (maintenance: add --quiet option,
>> 2020-09-17), but I see other quiet=isatty(2) in related code. I wish we
>> could isolate that particular behavior so removing the 2>/dev/null when
>> debugging the tests doesn't cause you to run into this, maybe an
>> explicit --quiet or --no-quiet option for all but one test that's
>> checking that isatty() behavior?
>> 
>>> I tried just removing it, but then (in an earlier test case) the
>>> "test_subcommand" fails because it can't find the line we're looking
>>> for, so us piping stderr to /dev/null impacts our trace2 output?
>> 
>> I hadn't seen test_subcommand before, sorry to be blunt, but "ew!".
>
> Saying "I'm about to be rude" before being rude doesn't excuse it.
> I had to step away and get over this comment before I could examine
> the actually constructive feedback below.
>
> A better way to communicate the same information could be "This test
> helper seems more complicated than necessary, and has some gaps that
> could be filled." I completely agree with this statement.

I didn't mean to be rude. My understanding of "blunt" is more like
"forthright" or "without mincing words" or something to that effect, but
I'm not a native speaker of English, so I'll take your understanding
over mine and apologize. Sorry.

What I meant to say is: compared with prior art in the test suite that
does similar things, this pattern is needlessly convoluted, and it
separates the command invocation from the test assertion itself. Something
similar-ish to check_sub_test_lib_test would allow you to run the
command, but do any asserting of the proc tree in the helper.

>> So we're ad-hoc grepping trace2 JSON output just to find out whether we
>> invoked some subcommand. But unlike test_expect_code etc. this one
>> doesn't run git for you, but instead we have temp *.txt files and the
>> command disconnected from the run.
>> 
>> And because you're using "grep" and "! grep" to test, you're hiding the
>> difference between "did not find this line" v.s. "did not find anything
>> at all".
>
> You're right that there is value in comparing the ordered list of
> subcommands run by a given Git command. That will catch buggy tests
> that are checking that a subcommand doesn't run.
>
>> Because of that the second test using test_subcommand is either buggy or
>> painfully non-obvious. We check that "run --auto" doesn't contain a
>> "auto --quiet", but in reality it doesn't contain any subcommands at
>> all. We didn't run any because it exited with "nothing to pack".
>
> That exit is from the third command, which does not pass the --auto
> option. The "nothing to pack" output is from the 'git gc --no-quiet'
> subcommand that is being checked in this third test.
>
>> I think converting the whole thing to something like the WIP/RFC patch
>> below is much better and more readable.
>
> This is an interesting approach. I don't see you using the ERR that you
> are inputting anywhere, so that seems like unnecessary bloat for the
> consumers. Maybe I haven't discovered all of the places where this
> would be useful, but it seems better to pipe stderr to a file for later
> comparison when needed.

Yes, it's probably not a good default here. For the test-lib.sh tests
there's check_sub_test_lib_test and check_sub_test_lib_test_err, most of
the tests only test stdout.

>> +test_expect_process_tree () {
>> +	depth= &&
>> +	>actual &&
>> +	cat >expect &&
>> +	cat <&3 >expect.err
>> +	while test $# != 0
>> +	do
>> +		case "$1" in
>> +		--depth)
>> +			depth="$2"
>> +			shift
>> +			;;
>> +		*)
>> +			break
>> +			;;
>> +		esac
>> +		shift
>> +	done &&
> Do you have an example where this is being checked? Or can depth
> be left as 1 for now?

It can probably be hardcoded, but I was hoping someone more familiar
with trace2 would chime in, but I'm fairly sure there's not a way to do
it without parsing the existing output with either some clever
grep/awk-ing of the PERF output, or stateful parsing of the JSON.

I thought that for git maintenance tests perhaps something wanted to
assert that we didn't have maintenance invoking maintenance, or that
something expected to prune refs really invoked the relevant prune
command via "gc".

>> +	log="$(pwd)/proc-tree.txt" &&
>> +	>"$log" &&
>> +	GIT_TRACE2_PERF="$log" "$@" 2>actual.err &&
>> +	grep "child_start" proc-tree.txt >proc-tree-start.txt || : &&
>> +	if test -n "$depth"
>> +	then
>> +		grep " d$depth " proc-tree-start.txt >tmp.txt || : &&
>> +		mv tmp.txt proc-tree-start.txt
>> +	fi &&
>> +	sed -e 's/^.*argv:\[//' -e 's/\]$//' <proc-tree-start.txt >actual &&
>> +	test_cmp expect actual &&
>> +	test_cmp expect.err actual.err
>> +} 7>&2 2>&4
>
> I think similar ideas could apply to test_region. Giving it a try
> now.

Probably, I didn't even notice that one...

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 5/5] maintenance: allow custom refspecs during prefetch
  2021-04-09 19:28         ` Ævar Arnfjörð Bjarmason
@ 2021-04-10  0:56           ` Derrick Stolee
  2021-04-10 11:37             ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 72+ messages in thread
From: Derrick Stolee @ 2021-04-10  0:56 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Derrick Stolee via GitGitGadget, git, tom.saeger, gitster,
	sunshine, Derrick Stolee, Derrick Stolee

On 4/9/2021 3:28 PM, Ævar Arnfjörð Bjarmason wrote:
> 
> On Fri, Apr 09 2021, Derrick Stolee wrote:
> 
>> On 4/7/2021 6:26 AM, Ævar Arnfjörð Bjarmason wrote:
>>> I think converting the whole thing to something like the WIP/RFC patch
>>> below is much better and more readable.
>>
>> This is an interesting approach. I don't see you using the ERR that you
>> are inputting anywhere, so that seems like unnecessary bloat for the
>> consumers. Maybe I haven't discovered all of the places where this
>> would be useful, but it seems better to pipe stderr to a file for later
>> comparison when needed.
> 
> Yes, it's probably not a good default here. For the test-lib.sh tests
> there's check_sub_test_lib_test and check_sub_test_lib_test_err, most of
> the tests only test stdout.
> 
>>> +test_expect_process_tree () {
>>> +	depth= &&
>>> +	>actual &&
>>> +	cat >expect &&
>>> +	cat <&3 >expect.err
>>> +	while test $# != 0
>>> +	do
>>> +		case "$1" in
>>> +		--depth)
>>> +			depth="$2"
>>> +			shift
>>> +			;;
>>> +		*)
>>> +			break
>>> +			;;
>>> +		esac
>>> +		shift
>>> +	done &&
>> Do you have an example where this is being checked? Or can depth
>> be left as 1 for now?
> 
> It can probably be hardcoded, but I was hoping someone more familiar
> with trace2 would chime in, but I'm fairly sure there's not a way to do
> it without parsing the existing output with either some clever
> grep/awk-ing of the PERF output, or stateful parsing of the JSON.
> 
> I thought that for git maintenance tests perhaps something wanted to
> assert that we didn't have maintenance invoking maintenance, or that
> something expected to prune refs really invoked the relevant prune
> command via "gc".
> 
>>> +	log="$(pwd)/proc-tree.txt" &&
>>> +	>"$log" &&
>>> +	GIT_TRACE2_PERF="$log" "$@" 2>actual.err &&
>>> +	grep "child_start" proc-tree.txt >proc-tree-start.txt || : &&
>>> +	if test -n "$depth"
>>> +	then
>>> +		grep " d$depth " proc-tree-start.txt >tmp.txt || : &&
>>> +		mv tmp.txt proc-tree-start.txt
>>> +	fi &&
>>> +	sed -e 's/^.*argv:\[//' -e 's/\]$//' <proc-tree-start.txt >actual &&
>>> +	test_cmp expect actual &&
>>> +	test_cmp expect.err actual.err
>>> +} 7>&2 2>&4
>>
>> I think similar ideas could apply to test_region. Giving it a try
>> now.
> 
> Probably, I didn't even notice that one...

I gave this a few hours today, and I'm giving up. I'm the first to
admit that I don't have the correct scripting skills to do some of
these things.

I've got what I tried below. It certainly looks like it would work.
It solves the problem of "what if the test is flaky?" by ensuring that
all subcommands (at depth 0) match the inputs exactly.

However, the problem comes when trying to make that work for all of
the maintenance tests, specifically the 'incremental-repack' task.
That task dynamically computes a --batch-size=X parameter, and that
is not stable across runs of the script.

This was avoided in the past by only checking for the first of three
subcommands when verifying that the 'incremental-repack' task worked.
That is, except for the EXPENSIVE test that checks that the --batch-size
maxes out at 2g.

The thing that might make these changing parameters work is to allow
each specified line to be a _prefix_ of the actual parameters. Or, let
each line be a pattern that is checked against the corresponding actual
line. I ran into issues with how to handle this line-by-line check that
I was unable to overcome.

The good news is that the idea of adding a '--prefetch' option to
'git fetch' makes the change to t7900-maintenance.sh much easier,
making this change to test_subcommand less of a priority.

I include my attempt here as a patch. Feel free to take whatever
you want of it, or none of it and start over. I do think that it
makes the test script look much nicer.

Thanks,
-Stolee

-- >8 --

From 449d098f2a13860f44b2e6fb96fb2dd5872b511b Mon Sep 17 00:00:00 2001
From: Derrick Stolee <dstolee@microsoft.com>
Date: Fri, 9 Apr 2021 08:42:26 -0400
Subject: [PATCH 1/2] test-lib: add test_subcommands
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The test_subcommand helper in test-lib-functions.sh satisfied a need to
check for certain subcommands in Git processes. This is especially
needed in t7900-maintenance.sh to ensure that the maintenance builtin
properly calls certain subcommands based on command-line options, config
values, and the state of the repository.

However, test_subcommand has some complexities in its use that can be
improved. First, it requires that the caller knows to create a log file
with GIT_TRACE2_EVENT then supply that to test_subcommand. Further, it
only checks that some exact subcommands exist or do not exist. It does
not guarantee that the list of subcommands exactly matches a given list.
Because of this drawback, the tests that check a subcommand does _not_
run are particularly sensitive to slight changes in behavior.

Introduce a new helper, test_subcommands, that resolves these drawbacks:

1. It runs the supplied command and handles the trace log itself.

2. It takes a list of commands over stdin and compares this to the
   complete list of subcommands run from the top-level command.

The helper does not test that the full subcommand tree matches, because
that would cause tests to be too fragile to changes unrelated to the
component being tested. This could easily be extended to allow the full
tree with an option, if desired.

To ensure we only check the first level, use GIT_TRACE2_PERF output and
scan for " d0 " in the rows that include the "child_start" event. The
last column includes a way to scrape the subcommand itself from the
trace. Sometimes arguments are quoted, such as when passing a refspec
with '*' to the subcommand. This makes it difficult to create a matching
string within the single-quoted test definitions, so strip these single
quotes from the arguments before matching the input.

Only modify one test in t7900-maintenance.sh. The rest of the callers to
test_subcommand will be converted to test_subcommands in a later change.

Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 t/t7900-maintenance.sh  | 16 +++++++---------
 t/test-lib-functions.sh | 24 ++++++++++++++++++++++++
 2 files changed, 31 insertions(+), 9 deletions(-)

diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index 2412d8c5c0..e170ab7862 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -30,15 +30,13 @@ test_expect_success 'help text' '
 '
 
 test_expect_success 'run [--auto|--quiet]' '
-	GIT_TRACE2_EVENT="$(pwd)/run-no-auto.txt" \
-		git maintenance run 2>/dev/null &&
-	GIT_TRACE2_EVENT="$(pwd)/run-auto.txt" \
-		git maintenance run --auto 2>/dev/null &&
-	GIT_TRACE2_EVENT="$(pwd)/run-no-quiet.txt" \
-		git maintenance run --no-quiet 2>/dev/null &&
-	test_subcommand git gc --quiet <run-no-auto.txt &&
-	test_subcommand ! git gc --auto --quiet <run-auto.txt &&
-	test_subcommand git gc --no-quiet <run-no-quiet.txt
+	test_subcommands git maintenance run --quiet <<-EOF &&
+	git gc --quiet
+	EOF
+	test_subcommands git maintenance run --auto --quiet </dev/null &&
+	test_subcommands git maintenance run --no-quiet <<-EOF
+	git gc --no-quiet
+	EOF
 '
 
 test_expect_success 'maintenance.auto config option' '
diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index 6348e8d733..53ffeb07f8 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -1658,6 +1658,30 @@ test_subcommand () {
 	fi
 }
 
+# Run a command and ensure it succeeds. Use the
+# GIT_TRACE2_PERF logs to ensure that every subcommand
+# run by this top-level Git process is exactly the
+# set supplied over stdin.
+#
+# Redirects stderr to a file named "err". This can
+# be used by tests, but it also provides consistent
+# use of isatty(2) which can affect subcommand calls.
+test_subcommands () {
+	local log line &&
+
+	cat >expect &&
+
+	log="$(pwd)"/subcommand-trace.txt &&
+
+	GIT_TRACE2_PERF="$log" "$@" 2>err &&
+	{ grep "child_start" "$log" | grep " d0 " >processes || :; } &&
+
+	sed -e 's/^.*argv:\[//' -e 's/\]$//' -e "s/'//g" <processes >actual &&
+	test_cmp expect actual &&
+
+	rm -f "$log" expect actual processes
+}
+
 # Check that the given command was invoked as part of the
 # trace2-format trace on stdin.
 #
-- 
2.31.1.vfs.0.0

--- >8 ---

And here is the follow-up that attempts to change the rest of
the tests in t7900-maintenance.sh. However, this leads to flaky
tests because of the --batch-size changes.
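
For reference, the extraction step from the first patch can be tried in
isolation. The sketch below reuses the grep and sed expressions from
test_subcommands; the trace rows are hand-written, simplified stand-ins
for GIT_TRACE2_PERF output (an assumption, not real log lines), and the
function names sample_trace and first_level_subcommands are
hypothetical.

```shell
# Simplified stand-ins for perf-format trace rows (assumption):
# only the " d0 " column, the event name, and the argv:[...] tail
# matter to the expressions below.
sample_trace () {
	printf '%s\n' \
		"... | d0 | main | child_start | [ch0] argv:[git fetch remote1 '+refs/heads/*:refs/prefetch/remote1/*']" \
		"... | d1 | main | child_start | [ch0] argv:[git index-pack --stdin]" \
		"... | d0 | main | child_exit | [ch0] pid:123 code:0"
}

# Keep only first-level child_start rows, then strip everything up
# to "argv:[", the trailing "]", and any single quotes.
first_level_subcommands () {
	grep "child_start" | grep " d0 " |
	sed -e 's/^.*argv:\[//' -e 's/\]$//' -e "s/'//g"
}

sample_trace | first_level_subcommands
```

Only the first row survives both filters, and its refspec comes out
without the surrounding single quotes. Because the result is compared
exactly, any argument that varies between runs, such as the
--batch-size value, must be known in advance.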

--- >8 ---

From 61f142ba7fcde8ae2a84f18005f384a4544b7741 Mon Sep 17 00:00:00 2001
From: Derrick Stolee <dstolee@microsoft.com>
Date: Fri, 9 Apr 2021 11:13:42 -0400
Subject: [PATCH 2/2] t7900: convert to test_subcommands

Replace the remaining uses of test_subcommand with test_subcommands,
allowing us to delete the old helper.

Most of these replacements are straightforward. However, some are a bit
more subtle, specifically because we now are checking the full ordered
set of subcommands. In some places we were only testing one of multiple
subcommands that would be run by a task. These are expanded to include
the full set.

When working with the prefetch task, the refspecs that are passed to the
subcommand are normally quoted with single quotes in the GIT_TRACE2_PERF
output, but those characters are removed in test_subcommands, allowing
our tests in t7900-maintenance.sh to avoid escaping such quotes.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 t/t7900-maintenance.sh  | 285 +++++++++++++++++++---------------------
 t/test-lib-functions.sh |  33 -----
 2 files changed, 133 insertions(+), 185 deletions(-)

diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index e170ab7862..8861435dcf 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -40,78 +40,65 @@ test_expect_success 'run [--auto|--quiet]' '
 '
 
 test_expect_success 'maintenance.auto config option' '
-	GIT_TRACE2_EVENT="$(pwd)/default" git commit --quiet --allow-empty -m 1 &&
-	test_subcommand git maintenance run --auto --quiet <default &&
-	GIT_TRACE2_EVENT="$(pwd)/true" \
-		git -c maintenance.auto=true \
-		commit --quiet --allow-empty -m 2 &&
-	test_subcommand git maintenance run --auto --quiet  <true &&
-	GIT_TRACE2_EVENT="$(pwd)/false" \
-		git -c maintenance.auto=false \
-		commit --quiet --allow-empty -m 3 &&
-	test_subcommand ! git maintenance run --auto --quiet  <false
+	test_subcommands git commit --allow-empty -m 1 <<-EOF &&
+	git maintenance run --auto --no-quiet
+	EOF
+	test_subcommands git -c maintenance.auto=true \
+		commit --allow-empty -m 2 <<-EOF &&
+	git maintenance run --auto --no-quiet
+	EOF
+	test_subcommands git -c maintenance.auto=false \
+		commit --allow-empty -m 3 </dev/null
 '
 
 test_expect_success 'maintenance.<task>.enabled' '
 	git config maintenance.gc.enabled false &&
 	git config maintenance.commit-graph.enabled true &&
-	GIT_TRACE2_EVENT="$(pwd)/run-config.txt" git maintenance run 2>err &&
-	test_subcommand ! git gc --quiet <run-config.txt &&
-	test_subcommand git commit-graph write --split --reachable --no-progress <run-config.txt
+
+	# This also verifies that "git gc" is not run.
+	test_subcommands git maintenance run --quiet <<-EOF
+	git commit-graph write --split --reachable --no-progress
+	EOF
 '
 
 test_expect_success 'run --task=<task>' '
-	GIT_TRACE2_EVENT="$(pwd)/run-commit-graph.txt" \
-		git maintenance run --task=commit-graph 2>/dev/null &&
-	GIT_TRACE2_EVENT="$(pwd)/run-gc.txt" \
-		git maintenance run --task=gc 2>/dev/null &&
-	GIT_TRACE2_EVENT="$(pwd)/run-commit-graph.txt" \
-		git maintenance run --task=commit-graph 2>/dev/null &&
-	GIT_TRACE2_EVENT="$(pwd)/run-both.txt" \
-		git maintenance run --task=commit-graph --task=gc 2>/dev/null &&
-	test_subcommand ! git gc --quiet <run-commit-graph.txt &&
-	test_subcommand git gc --quiet <run-gc.txt &&
-	test_subcommand git gc --quiet <run-both.txt &&
-	test_subcommand git commit-graph write --split --reachable --no-progress <run-commit-graph.txt &&
-	test_subcommand ! git commit-graph write --split --reachable --no-progress <run-gc.txt &&
-	test_subcommand git commit-graph write --split --reachable --no-progress <run-both.txt
+	test_subcommands git maintenance run --task=commit-graph <<-EOF &&
+	git commit-graph write --split --reachable --no-progress
+	EOF
+	test_subcommands git maintenance run --task=gc <<-EOF &&
+	git gc --quiet
+	EOF
+	test_subcommands git maintenance run --task=commit-graph --task=gc <<-EOF
+	git gc --quiet
+	git commit-graph write --split --reachable --no-progress
+	EOF
 '
 
 test_expect_success 'core.commitGraph=false prevents write process' '
-	GIT_TRACE2_EVENT="$(pwd)/no-commit-graph.txt" \
-		git -c core.commitGraph=false maintenance run \
-		--task=commit-graph 2>/dev/null &&
-	test_subcommand ! git commit-graph write --split --reachable --no-progress \
-		<no-commit-graph.txt
+	test_subcommands git -c core.commitGraph=false maintenance \
+		run --task=commit-graph <<-EOF
+	EOF
 '
 
 test_expect_success 'commit-graph auto condition' '
 	COMMAND="maintenance run --task=commit-graph --auto --quiet" &&
+	cat >did-run.txt <<-EOF &&
+	git commit-graph write --split --reachable --no-progress
+	EOF
 
-	GIT_TRACE2_EVENT="$(pwd)/cg-no.txt" \
-		git -c maintenance.commit-graph.auto=1 $COMMAND &&
-	GIT_TRACE2_EVENT="$(pwd)/cg-negative-means-yes.txt" \
-		git -c maintenance.commit-graph.auto="-1" $COMMAND &&
+	test_subcommands git -c maintenance.commit-graph.auto=1 $COMMAND </dev/null &&
+	test_subcommands git -c maintenance.commit-graph.auto="-1" $COMMAND <did-run.txt &&
 
 	test_commit first &&
 
-	GIT_TRACE2_EVENT="$(pwd)/cg-zero-means-no.txt" \
-		git -c maintenance.commit-graph.auto=0 $COMMAND &&
-	GIT_TRACE2_EVENT="$(pwd)/cg-one-satisfied.txt" \
-		git -c maintenance.commit-graph.auto=1 $COMMAND &&
+	test_subcommands git -c maintenance.commit-graph.auto=0 $COMMAND </dev/null &&
+	test_subcommands git -c maintenance.commit-graph.auto="1" $COMMAND <did-run.txt &&
 
 	git commit --allow-empty -m "second" &&
 	git commit --allow-empty -m "third" &&
 
-	GIT_TRACE2_EVENT="$(pwd)/cg-two-satisfied.txt" \
-		git -c maintenance.commit-graph.auto=2 $COMMAND &&
-
-	COMMIT_GRAPH_WRITE="git commit-graph write --split --reachable --no-progress" &&
-	test_subcommand ! $COMMIT_GRAPH_WRITE <cg-no.txt &&
-	test_subcommand $COMMIT_GRAPH_WRITE <cg-negative-means-yes.txt &&
-	test_subcommand ! $COMMIT_GRAPH_WRITE <cg-zero-means-no.txt &&
-	test_subcommand $COMMIT_GRAPH_WRITE <cg-one-satisfied.txt &&
-	test_subcommand $COMMIT_GRAPH_WRITE <cg-two-satisfied.txt
+	test_subcommands git -c maintenance.commit-graph.auto=3 $COMMAND </dev/null &&
+	test_subcommands git -c maintenance.commit-graph.auto=2 $COMMAND <did-run.txt
 '
 
 test_expect_success 'run --task=bogus' '
@@ -138,10 +125,11 @@ test_expect_success 'prefetch multiple remotes' '
 	git -C clone2 switch -c two &&
 	test_commit -C clone1 one &&
 	test_commit -C clone2 two &&
-	GIT_TRACE2_EVENT="$(pwd)/run-prefetch.txt" git maintenance run --task=prefetch 2>/dev/null &&
 	fetchargs="--prune --no-tags --no-write-fetch-head --recurse-submodules=no --refmap= --quiet" &&
-	test_subcommand git fetch remote1 $fetchargs +refs/heads/\\*:refs/prefetch/remote1/\\* <run-prefetch.txt &&
-	test_subcommand git fetch remote2 $fetchargs +refs/heads/\\*:refs/prefetch/remote2/\\* <run-prefetch.txt &&
+	test_subcommands git maintenance run --task=prefetch <<-EOF &&
+	git fetch remote1 $fetchargs +refs/heads/*:refs/prefetch/remote1/*
+	git fetch remote2 $fetchargs +refs/heads/*:refs/prefetch/remote2/*
+	EOF
 	test_path_is_missing .git/refs/remotes &&
 	git log prefetch/remote1/one &&
 	git log prefetch/remote2/two &&
@@ -215,24 +203,23 @@ test_expect_success 'loose-objects task' '
 
 test_expect_success 'maintenance.loose-objects.auto' '
 	git repack -adk &&
-	GIT_TRACE2_EVENT="$(pwd)/trace-lo1.txt" \
-		git -c maintenance.loose-objects.auto=1 maintenance \
-		run --auto --task=loose-objects 2>/dev/null &&
-	test_subcommand ! git prune-packed --quiet <trace-lo1.txt &&
+	test_subcommands git -c maintenance.loose-objects.auto=1 maintenance \
+		run --auto --task=loose-objects </dev/null &&
 	printf data-A | git hash-object -t blob --stdin -w &&
-	GIT_TRACE2_EVENT="$(pwd)/trace-loA" \
-		git -c maintenance.loose-objects.auto=2 \
-		maintenance run --auto --task=loose-objects 2>/dev/null &&
-	test_subcommand ! git prune-packed --quiet <trace-loA &&
+	test_subcommands git -c maintenance.loose-objects.auto=2 \
+		maintenance run --auto --task=loose-objects </dev/null &&
 	printf data-B | git hash-object -t blob --stdin -w &&
-	GIT_TRACE2_EVENT="$(pwd)/trace-loB" \
-		git -c maintenance.loose-objects.auto=2 \
-		maintenance run --auto --task=loose-objects 2>/dev/null &&
-	test_subcommand git prune-packed --quiet <trace-loB &&
-	GIT_TRACE2_EVENT="$(pwd)/trace-loC" \
-		git -c maintenance.loose-objects.auto=2 \
-		maintenance run --auto --task=loose-objects 2>/dev/null &&
-	test_subcommand git prune-packed --quiet <trace-loC
+
+	test_subcommands git -c maintenance.loose-objects.auto=2 \
+		maintenance run --auto --task=loose-objects <<-EOF &&
+	git prune-packed --quiet
+	git pack-objects --quiet .git/objects/pack/loose
+	EOF
+
+	test_subcommands git -c maintenance.loose-objects.auto=2 \
+		maintenance run --auto --task=loose-objects <<-EOF
+	git prune-packed --quiet
+	EOF
 '
 
 test_expect_success 'incremental-repack task' '
@@ -307,38 +294,40 @@ test_expect_success EXPENSIVE 'incremental-repack 2g limit' '
 	git maintenance run --task=loose-objects &&
 
 	# Now run the incremental-repack task and check the batch-size
-	GIT_TRACE2_EVENT="$(pwd)/run-2g.txt" git maintenance run \
-		--task=incremental-repack 2>/dev/null &&
-	test_subcommand git multi-pack-index repack \
-		 --no-progress --batch-size=2147483647 <run-2g.txt
+	test_subcommands git maintenance run \
+		--task=incremental-repack <<-EOF
+	git multi-pack-index repack --no-progress --batch-size=2147483647
+	EOF
 '
 
 test_expect_success 'maintenance.incremental-repack.auto' '
 	git repack -adk &&
 	git config core.multiPackIndex true &&
 	git multi-pack-index write &&
-	GIT_TRACE2_EVENT="$(pwd)/midx-init.txt" git \
-		-c maintenance.incremental-repack.auto=1 \
-		maintenance run --auto --task=incremental-repack 2>/dev/null &&
-	test_subcommand ! git multi-pack-index write --no-progress <midx-init.txt &&
+	test_subcommands git -c maintenance.incremental-repack.auto=1 \
+		maintenance run --auto --task=incremental-repack </dev/null &&
+
 	test_commit A &&
 	git pack-objects --revs .git/objects/pack/pack <<-\EOF &&
 	HEAD
 	^HEAD~1
 	EOF
-	GIT_TRACE2_EVENT=$(pwd)/trace-A git \
-		-c maintenance.incremental-repack.auto=2 \
-		maintenance run --auto --task=incremental-repack 2>/dev/null &&
-	test_subcommand ! git multi-pack-index write --no-progress <trace-A &&
+
+	test_subcommands git -c maintenance.incremental-repack.auto=2 \
+		maintenance run --auto --task=incremental-repack </dev/null &&
+
 	test_commit B &&
 	git pack-objects --revs .git/objects/pack/pack <<-\EOF &&
 	HEAD
 	^HEAD~1
 	EOF
-	GIT_TRACE2_EVENT=$(pwd)/trace-B git \
-		-c maintenance.incremental-repack.auto=2 \
-		maintenance run --auto --task=incremental-repack 2>/dev/null &&
-	test_subcommand git multi-pack-index write --no-progress <trace-B
+
+	test_subcommands git -c maintenance.incremental-repack.auto=2 \
+		maintenance run --auto --task=incremental-repack <<-EOF
+	git multi-pack-index write --no-progress
+	git multi-pack-index expire --no-progress
+	git multi-pack-index repack --no-progress --batch-size=469
+	EOF
 '
 
 test_expect_success 'pack-refs task' '
@@ -346,11 +335,12 @@ test_expect_success 'pack-refs task' '
 	do
 		git branch -f to-pack/$n HEAD || return 1
 	done &&
-	GIT_TRACE2_EVENT="$(pwd)/pack-refs.txt" \
-		git maintenance run --task=pack-refs &&
+	test_subcommands git maintenance run --task=pack-refs <<-EOF &&
+	git pack-refs --all --prune
+	EOF
+
 	ls .git/refs/heads/ >after &&
-	test_must_be_empty after &&
-	test_subcommand git pack-refs --all --prune <pack-refs.txt
+	test_must_be_empty after
 '
 
 test_expect_success '--auto and --schedule incompatible' '
@@ -371,26 +361,26 @@ test_expect_success '--schedule inheritance weekly -> daily -> hourly' '
 	git config maintenance.incremental-repack.enabled true &&
 	git config maintenance.incremental-repack.schedule weekly &&
 
-	GIT_TRACE2_EVENT="$(pwd)/hourly.txt" \
-		git maintenance run --schedule=hourly 2>/dev/null &&
-	test_subcommand git prune-packed --quiet <hourly.txt &&
-	test_subcommand ! git commit-graph write --split --reachable \
-		--no-progress <hourly.txt &&
-	test_subcommand ! git multi-pack-index write --no-progress <hourly.txt &&
+	cat >hourly <<-EOF &&
+	git prune-packed --quiet
+	EOF
 
-	GIT_TRACE2_EVENT="$(pwd)/daily.txt" \
-		git maintenance run --schedule=daily 2>/dev/null &&
-	test_subcommand git prune-packed --quiet <daily.txt &&
-	test_subcommand git commit-graph write --split --reachable \
-		--no-progress <daily.txt &&
-	test_subcommand ! git multi-pack-index write --no-progress <daily.txt &&
+	cat >daily <<-EOF &&
+	git prune-packed --quiet
+	git commit-graph write --split --reachable --no-progress
+	EOF
 
-	GIT_TRACE2_EVENT="$(pwd)/weekly.txt" \
-		git maintenance run --schedule=weekly 2>/dev/null &&
-	test_subcommand git prune-packed --quiet <weekly.txt &&
-	test_subcommand git commit-graph write --split --reachable \
-		--no-progress <weekly.txt &&
-	test_subcommand git multi-pack-index write --no-progress <weekly.txt
+	cat >weekly <<-EOF &&
+	git prune-packed --quiet
+	git multi-pack-index write --no-progress
+	git multi-pack-index expire --no-progress
+	git multi-pack-index repack --no-progress --batch-size=655
+	git commit-graph write --split --reachable --no-progress
+	EOF
+
+	test_subcommands git maintenance run --schedule=hourly <hourly &&
+	test_subcommands git maintenance run --schedule=daily <daily &&
+	test_subcommands git maintenance run --schedule=weekly <weekly
 '
 
 test_expect_success 'maintenance.strategy inheritance' '
@@ -402,58 +392,49 @@ test_expect_success 'maintenance.strategy inheritance' '
 	test_when_finished git config --unset maintenance.strategy &&
 	git config maintenance.strategy incremental &&
 
-	GIT_TRACE2_EVENT="$(pwd)/incremental-hourly.txt" \
-		git maintenance run --schedule=hourly --quiet &&
-	GIT_TRACE2_EVENT="$(pwd)/incremental-daily.txt" \
-		git maintenance run --schedule=daily --quiet &&
-	GIT_TRACE2_EVENT="$(pwd)/incremental-weekly.txt" \
-		git maintenance run --schedule=weekly --quiet &&
-
-	test_subcommand git commit-graph write --split --reachable \
-		--no-progress <incremental-hourly.txt &&
-	test_subcommand ! git prune-packed --quiet <incremental-hourly.txt &&
-	test_subcommand ! git multi-pack-index write --no-progress \
-		<incremental-hourly.txt &&
-	test_subcommand ! git pack-refs --all --prune \
-		<incremental-hourly.txt &&
-
-	test_subcommand git commit-graph write --split --reachable \
-		--no-progress <incremental-daily.txt &&
-	test_subcommand git prune-packed --quiet <incremental-daily.txt &&
-	test_subcommand git multi-pack-index write --no-progress \
-		<incremental-daily.txt &&
-	test_subcommand ! git pack-refs --all --prune \
-		<incremental-daily.txt &&
-
-	test_subcommand git commit-graph write --split --reachable \
-		--no-progress <incremental-weekly.txt &&
-	test_subcommand git prune-packed --quiet <incremental-weekly.txt &&
-	test_subcommand git multi-pack-index write --no-progress \
-		<incremental-weekly.txt &&
-	test_subcommand git pack-refs --all --prune \
-		<incremental-weekly.txt &&
+	# Modify this default for simplicity
+	git config maintenance.prefetch.enabled false &&
+
+	test_subcommands git maintenance run --schedule=hourly <<-EOF &&
+	git commit-graph write --split --reachable --no-progress
+	EOF
+
+	test_subcommands git maintenance run --schedule=daily <<-EOF &&
+	git prune-packed --quiet
+	git multi-pack-index write --no-progress
+	git multi-pack-index expire --no-progress
+	git multi-pack-index repack --no-progress --batch-size=655
+	git commit-graph write --split --reachable --no-progress
+	EOF
+
+	test_subcommands git maintenance run --schedule=weekly <<-EOF &&
+	git prune-packed --quiet
+	git multi-pack-index write --no-progress
+	git multi-pack-index expire --no-progress
+	git multi-pack-index repack --no-progress --batch-size=655
+	git commit-graph write --split --reachable --no-progress
+	git pack-refs --all --prune
+	EOF
 
 	# Modify defaults
 	git config maintenance.commit-graph.schedule daily &&
 	git config maintenance.loose-objects.schedule hourly &&
 	git config maintenance.incremental-repack.enabled false &&
 
-	GIT_TRACE2_EVENT="$(pwd)/modified-hourly.txt" \
-		git maintenance run --schedule=hourly --quiet &&
-	GIT_TRACE2_EVENT="$(pwd)/modified-daily.txt" \
-		git maintenance run --schedule=daily --quiet &&
-
-	test_subcommand ! git commit-graph write --split --reachable \
-		--no-progress <modified-hourly.txt &&
-	test_subcommand git prune-packed --quiet <modified-hourly.txt &&
-	test_subcommand ! git multi-pack-index write --no-progress \
-		<modified-hourly.txt &&
-
-	test_subcommand git commit-graph write --split --reachable \
-		--no-progress <modified-daily.txt &&
-	test_subcommand git prune-packed --quiet <modified-daily.txt &&
-	test_subcommand ! git multi-pack-index write --no-progress \
-		<modified-daily.txt
+	test_subcommands git maintenance run --schedule=hourly <<-EOF &&
+	git prune-packed --quiet
+	EOF
+
+	test_subcommands git maintenance run --schedule=daily <<-EOF &&
+	git prune-packed --quiet
+	git commit-graph write --split --reachable --no-progress
+	EOF
+
+	test_subcommands git maintenance run --schedule=weekly <<-EOF
+	git prune-packed --quiet
+	git commit-graph write --split --reachable --no-progress
+	git pack-refs --all --prune
+	EOF
 '
 
 test_expect_success 'register and unregister' '
diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index 53ffeb07f8..444dd3c6c7 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -1625,39 +1625,6 @@ test_path_is_hidden () {
 	return 1
 }
 
-# Check that the given command was invoked as part of the
-# trace2-format trace on stdin.
-#
-#	test_subcommand [!] <command> <args>... < <trace>
-#
-# For example, to look for an invocation of "git upload-pack
-# /path/to/repo"
-#
-#	GIT_TRACE2_EVENT=event.log git fetch ... &&
-#	test_subcommand git upload-pack "$PATH" <event.log
-#
-# If the first parameter passed is !, this instead checks that
-# the given command was not called.
-#
-test_subcommand () {
-	local negate=
-	if test "$1" = "!"
-	then
-		negate=t
-		shift
-	fi
-
-	local expr=$(printf '"%s",' "$@")
-	expr="${expr%,}"
-
-	if test -n "$negate"
-	then
-		! grep "\[$expr\]"
-	else
-		grep "\[$expr\]"
-	fi
-}
-
 # Run a command and ensure it succeeds. Use the
 # GIT_TRACE2_PERF logs to ensure that every subcommand
 # run by this top-level Git process is exactly the
-- 
2.31.1.vfs.0.0





* [PATCH v3 0/3] Maintenance: adapt custom refspecs
  2021-04-06 18:47 ` [PATCH v2 0/5] Maintenance: adapt custom refspecs Derrick Stolee via GitGitGadget
                     ` (4 preceding siblings ...)
  2021-04-06 18:47   ` [PATCH v2 5/5] maintenance: allow custom refspecs during prefetch Derrick Stolee via GitGitGadget
@ 2021-04-10  2:03   ` Derrick Stolee via GitGitGadget
  2021-04-10  2:03     ` [PATCH v3 1/3] maintenance: simplify prefetch logic Derrick Stolee via GitGitGadget
                       ` (4 more replies)
  5 siblings, 5 replies; 72+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2021-04-10  2:03 UTC (permalink / raw)
  To: git
  Cc: tom.saeger, gitster, sunshine, Derrick Stolee, Josh Steadmon,
	Emily Shaffer, Derrick Stolee

Tom Saeger rightly pointed out [1] that the prefetch task ignores custom
refspecs. This can lead to downloading more data than requested, and it
doesn't even help the future foreground fetches that use that custom
refspec.

[1]
https://lore.kernel.org/git/20210401184914.qmr7jhjbhp2mt3h6@dhcp-10-154-148-175.vpn.oracle.com/

This series fixes this problem by carefully replacing the start of each
refspec's destination with "refs/prefetch/". If the destination already
starts with "refs/", then that is replaced. Otherwise "refs/prefetch/" is
just prepended.

This happens inside of git fetch when a --prefetch option is given. This
allows us to manipulate a struct refspec_item instead of a full refspec
string. It also simplifies our logic in testing the prefetch task.
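
The destination rewrite described above can be illustrated with a small
shell sketch. This is only an illustration of the rule: the real logic
is written in C inside 'git fetch', and prefetch_dst is a hypothetical
helper name.

```shell
# Hypothetical helper: map a refspec destination into refs/prefetch/.
# A leading "refs/" is replaced by "refs/prefetch/"; any other
# destination has "refs/prefetch/" prepended.
prefetch_dst () {
	case "$1" in
	refs/*)
		printf 'refs/prefetch/%s\n' "${1#refs/}"
		;;
	*)
		printf 'refs/prefetch/%s\n' "$1"
		;;
	esac
}

prefetch_dst 'refs/remotes/origin/*'	# refs/prefetch/remotes/origin/*
prefetch_dst 'fetched'			# refs/prefetch/fetched
```

Negative refspecs have no destination, so a rule like this would not
apply to them.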


Updates in V3
=============

 * The fix is almost completely rewritten as an update to 'git fetch'. See
   the new PATCH 2 for this update.

 * There was some discussion of rewriting test_subcommand, but that can be
   delayed until a proper solution is found to complications around softer
   matches.


Updates in V2
=============

Thanks for the close eye on this series. I appreciate the recommendations,
and I believe I have responded to all of them:

 * Fixed typos.
 * Made refspec_item_format() re-entrant. Consumers must free the buffer.
 * Cleaned up style (quoting and tabbing).

Thanks, -Stolee

Derrick Stolee (3):
  maintenance: simplify prefetch logic
  fetch: add --prefetch option
  maintenance: use 'git fetch --prefetch'

 Documentation/fetch-options.txt   |  5 +++
 Documentation/git-maintenance.txt |  6 ++--
 builtin/fetch.c                   | 56 +++++++++++++++++++++++++++++++
 builtin/gc.c                      | 36 +++++---------------
 t/t5582-fetch-negative-refspec.sh | 30 +++++++++++++++++
 t/t7900-maintenance.sh            | 14 ++++----
 6 files changed, 109 insertions(+), 38 deletions(-)


base-commit: 89b43f80a514aee58b662ad606e6352e03eaeee4
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-924%2Fderrickstolee%2Fmaintenance%2Frefspec-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-924/derrickstolee/maintenance/refspec-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/924

Range-diff vs v2:

 1:  5aa0cb06c3f2 = 1:  4c0e983ba56f maintenance: simplify prefetch logic
 2:  d58a3e042ee8 < -:  ------------ test-lib: use exact match for test_subcommand
 3:  96388d949b98 < -:  ------------ refspec: output a refspec item
 4:  bf296282323a < -:  ------------ test-tool: test refspec input/output
 -:  ------------ > 2:  7f488eea6dbd fetch: add --prefetch option
 5:  9592224e3d42 ! 3:  ed055d772452 maintenance: allow custom refspecs during prefetch
     @@ Metadata
      Author: Derrick Stolee <dstolee@microsoft.com>
      
       ## Commit message ##
     -    maintenance: allow custom refspecs during prefetch
     +    maintenance: use 'git fetch --prefetch'
      
     -    The prefetch task previously used the default refspec source plus a
     -    custom refspec destination to avoid colliding with remote refs:
     +    The 'prefetch' maintenance task previously forced the following refspec
     +    for each remote:
      
                  +refs/heads/*:refs/prefetch/<remote>/*
      
     -    However, some users customize their refspec to reduce how much data they
     -    download from specific remotes. This can involve restrictive patterns
     -    for fetching or negative patterns to avoid downloading some refs.
     +    If a user has specified a more strict refspec for the remote, then this
     +    prefetch task downloads more objects than necessary.
      
     -    Modify fetch_remote() to iterate over the remote's refspec list and
     -    translate that into the appropriate prefetch scenario. Specifically,
     -    re-parse the raw form of the refspec into a new 'struct refspec' and
     -    modify the 'dst' member to replace a leading "refs/" substring with
     -    "refs/prefetch/", or prepend "refs/prefetch/" to 'dst' otherwise.
     -    Negative refspecs do not have a 'dst' so they can be transferred to the
     -    'git fetch' command unmodified.
     -
     -    This prefix change provides the benefit of keeping whatever collisions
     -    may exist in the custom refspecs, if that is a desirable outcome.
     -
     -    This changes the names of the refs that would be fetched by the default
     -    refspec. Instead of "refs/prefetch/<remote>/<branch>" they will now go
     -    to "refs/prefetch/remotes/<remote>/<branch>". While this is a change, it
     -    is not a seriously breaking one: these refs are intended to be hidden
     -    and not used.
     +    The previous change introduced the '--prefetch' option to 'git fetch'
     +    which manipulates the remote's refspec to place all resulting refs into
     +    refs/prefetch/, with further partitioning based on the destinations of
     +    those refspecs.
      
          Update the documentation to be more generic about the destination refs.
     -    Do not mention custom refpecs explicitly, as that does not need to be
     +    Do not mention custom refspecs explicitly, as that does not need to be
          highlighted in this documentation. The important part of placing refs in
     -    refs/prefetch remains.
     +    refs/prefetch/ remains.
      
          Reported-by: Tom Saeger <tom.saeger@oracle.com>
          Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
      
       ## Documentation/git-maintenance.txt ##
     -@@ Documentation/git-maintenance.txt: prefetch::
     +@@ Documentation/git-maintenance.txt: commit-graph::
     + prefetch::
     + 	The `prefetch` task updates the object directory with the latest
       	objects from all registered remotes. For each remote, a `git fetch`
     - 	command is run. The refmap is custom to avoid updating local or remote
     - 	branches (those in `refs/heads` or `refs/remotes`). Instead, the
     +-	command is run. The refmap is custom to avoid updating local or remote
     +-	branches (those in `refs/heads` or `refs/remotes`). Instead, the
      -	remote refs are stored in `refs/prefetch/<remote>/`. Also, tags are
      -	not updated.
     -+	refs are stored in `refs/prefetch/`. Also, tags are not updated.
     ++	command is run. The configured refspec is modified to place all
     ++	requested refs within `refs/prefetch/`. Also, tags are not updated.
       +
       This is done to avoid disrupting the remote-tracking branches. The end users
       expect these refs to stay unmoved unless they initiate a fetch.  With prefetch
      
       ## builtin/gc.c ##
     -@@
     - #include "remote.h"
     - #include "object-store.h"
     - #include "exec-cmd.h"
     -+#include "refspec.h"
     - 
     - #define FAILED_RUN "failed to run %s"
     - 
      @@ builtin/gc.c: static int fetch_remote(struct remote *remote, void *cbdata)
     - {
     - 	struct maintenance_run_opts *opts = cbdata;
       	struct child_process child = CHILD_PROCESS_INIT;
     -+	int i;
       
       	child.git_cmd = 1;
     - 	strvec_pushl(&child.args, "fetch", remote->name, "--prune", "--no-tags",
     -@@ builtin/gc.c: static int fetch_remote(struct remote *remote, void *cbdata)
     +-	strvec_pushl(&child.args, "fetch", remote->name, "--prune", "--no-tags",
     ++	strvec_pushl(&child.args, "fetch", remote->name,
     ++		     "--prefetch", "--prune", "--no-tags",
     + 		     "--no-write-fetch-head", "--recurse-submodules=no",
     +-		     "--refmap=", NULL);
     ++		     NULL);
     + 
       	if (opts->quiet)
       		strvec_push(&child.args, "--quiet");
       
      -	strvec_pushf(&child.args, "+refs/heads/*:refs/prefetch/%s/*", remote->name);
     -+	for (i = 0; i < remote->fetch.nr; i++) {
     -+		struct refspec_item replace;
     -+		struct refspec_item *rsi = &remote->fetch.items[i];
     -+		struct strbuf new_dst = STRBUF_INIT;
     -+		size_t ignore_len = 0;
     -+		char *replace_string;
     -+
     -+		if (rsi->negative) {
     -+			strvec_push(&child.args, remote->fetch.raw[i]);
     -+			continue;
     -+		}
     -+
     -+		refspec_item_init(&replace, remote->fetch.raw[i], 1);
     -+
     -+		/*
     -+		 * If a refspec dst starts with "refs/" at the start,
     -+		 * then we will replace "refs/" with "refs/prefetch/".
     -+		 * Otherwise, we will prepend the dst string with
     -+		 * "refs/prefetch/".
     -+		 */
     -+		if (!strncmp(replace.dst, "refs/", 5))
     -+			ignore_len = 5;
     -+
     -+		strbuf_addstr(&new_dst, "refs/prefetch/");
     -+		strbuf_addstr(&new_dst, replace.dst + ignore_len);
     -+		free(replace.dst);
     -+		replace.dst = strbuf_detach(&new_dst, NULL);
     -+
     -+		replace_string = refspec_item_format(&replace);
     -+		strvec_push(&child.args, replace_string);
     -+		free(replace_string);
     -+
     -+		refspec_item_clear(&replace);
     -+	}
     - 
     +-
       	return !!run_command(&child);
       }
     + 
      
       ## t/t7900-maintenance.sh ##
      @@ t/t7900-maintenance.sh: test_expect_success 'prefetch multiple remotes' '
     + 	test_commit -C clone1 one &&
       	test_commit -C clone2 two &&
       	GIT_TRACE2_EVENT="$(pwd)/run-prefetch.txt" git maintenance run --task=prefetch 2>/dev/null &&
     - 	fetchargs="--prune --no-tags --no-write-fetch-head --recurse-submodules=no --refmap= --quiet" &&
     --	test_subcommand git fetch remote1 $fetchargs "+refs/heads/*:refs/prefetch/remote1/*" <run-prefetch.txt &&
     --	test_subcommand git fetch remote2 $fetchargs "+refs/heads/*:refs/prefetch/remote2/*" <run-prefetch.txt &&
     -+	test_subcommand git fetch remote1 $fetchargs "+refs/heads/*:refs/prefetch/remotes/remote1/*" <run-prefetch.txt &&
     -+	test_subcommand git fetch remote2 $fetchargs "+refs/heads/*:refs/prefetch/remotes/remote2/*" <run-prefetch.txt &&
     +-	fetchargs="--prune --no-tags --no-write-fetch-head --recurse-submodules=no --refmap= --quiet" &&
     +-	test_subcommand git fetch remote1 $fetchargs +refs/heads/\\*:refs/prefetch/remote1/\\* <run-prefetch.txt &&
     +-	test_subcommand git fetch remote2 $fetchargs +refs/heads/\\*:refs/prefetch/remote2/\\* <run-prefetch.txt &&
     ++	fetchargs="--prefetch --prune --no-tags --no-write-fetch-head --recurse-submodules=no --quiet" &&
     ++	test_subcommand git fetch remote1 $fetchargs <run-prefetch.txt &&
     ++	test_subcommand git fetch remote2 $fetchargs <run-prefetch.txt &&
       	test_path_is_missing .git/refs/remotes &&
      -	git log prefetch/remote1/one &&
      -	git log prefetch/remote2/two &&
     @@ t/t7900-maintenance.sh: test_expect_success 'prefetch multiple remotes' '
       
       	test_cmp_config refs/prefetch/ log.excludedecoration &&
       	git log --oneline --decorate --all >log &&
     - 	! grep "prefetch" log
     - '
     - 
     -+test_expect_success 'prefetch custom refspecs' '
     -+	git -C clone1 branch -f special/fetched HEAD &&
     -+	git -C clone1 branch -f special/secret/not-fetched HEAD &&
     -+
     -+	# create multiple refspecs for remote1
     -+	git config --add remote.remote1.fetch "+refs/heads/special/fetched:refs/heads/fetched" &&
     -+	git config --add remote.remote1.fetch "^refs/heads/special/secret/not-fetched" &&
     -+
     -+	GIT_TRACE2_EVENT="$(pwd)/prefetch-refspec.txt" git maintenance run --task=prefetch 2>/dev/null &&
     -+
     -+	fetchargs="--prune --no-tags --no-write-fetch-head --recurse-submodules=no --refmap= --quiet" &&
     -+
     -+	# skips second refspec because it is not a pattern type
     -+	rs1="+refs/heads/*:refs/prefetch/remotes/remote1/*" &&
     -+	rs2="+refs/heads/special/fetched:refs/prefetch/heads/fetched" &&
     -+	rs3="^refs/heads/special/secret/not-fetched" &&
     -+
     -+	test_subcommand git fetch remote1 $fetchargs "$rs1" "$rs2" "$rs3" <prefetch-refspec.txt &&
     -+	test_subcommand git fetch remote2 $fetchargs "+refs/heads/*:refs/prefetch/remotes/remote2/*" <prefetch-refspec.txt &&
     -+
     -+	# first refspec is overridden by second
     -+	test_must_fail git rev-parse refs/prefetch/special/fetched &&
     -+	git rev-parse refs/prefetch/heads/fetched &&
     -+
     -+	# possible incorrect places for the non-fetched ref
     -+	test_must_fail git rev-parse refs/prefetch/remotes/remote1/secret/not-fetched &&
     -+	test_must_fail git rev-parse refs/prefetch/remotes/remote1/not-fetched &&
     -+	test_must_fail git rev-parse refs/heads/secret/not-fetched &&
     -+	test_must_fail git rev-parse refs/heads/not-fetched
     -+'
     -+
     - test_expect_success 'prefetch and existing log.excludeDecoration values' '
     - 	git config --unset-all log.excludeDecoration &&
     - 	git config log.excludeDecoration refs/remotes/remote1/ &&

-- 
gitgitgadget


* [PATCH v3 1/3] maintenance: simplify prefetch logic
  2021-04-10  2:03   ` [PATCH v3 0/3] Maintenance: adapt custom refspecs Derrick Stolee via GitGitGadget
@ 2021-04-10  2:03     ` Derrick Stolee via GitGitGadget
  2021-04-12 20:13       ` Tom Saeger
  2021-04-10  2:03     ` [PATCH v3 2/3] fetch: add --prefetch option Derrick Stolee via GitGitGadget
                       ` (3 subsequent siblings)
  4 siblings, 1 reply; 72+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2021-04-10  2:03 UTC (permalink / raw)
  To: git
  Cc: tom.saeger, gitster, sunshine, Derrick Stolee, Josh Steadmon,
	Emily Shaffer, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

The previous logic filled a string list with the name of each remote,
but we can instead run the appropriate 'git fetch' command directly in
the remote iterator. Do this to reduce code size, and also because it
sets up an upcoming change to use the remote's refspec. That refspec is
available from the 'struct remote' that is now passed to
fetch_remote().

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 builtin/gc.c | 33 ++++++++-------------------------
 1 file changed, 8 insertions(+), 25 deletions(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index ef7226d7bca4..fa8128de9ae1 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -873,55 +873,38 @@ static int maintenance_task_commit_graph(struct maintenance_run_opts *opts)
 	return 0;
 }
 
-static int fetch_remote(const char *remote, struct maintenance_run_opts *opts)
+static int fetch_remote(struct remote *remote, void *cbdata)
 {
+	struct maintenance_run_opts *opts = cbdata;
 	struct child_process child = CHILD_PROCESS_INIT;
 
 	child.git_cmd = 1;
-	strvec_pushl(&child.args, "fetch", remote, "--prune", "--no-tags",
+	strvec_pushl(&child.args, "fetch", remote->name, "--prune", "--no-tags",
 		     "--no-write-fetch-head", "--recurse-submodules=no",
 		     "--refmap=", NULL);
 
 	if (opts->quiet)
 		strvec_push(&child.args, "--quiet");
 
-	strvec_pushf(&child.args, "+refs/heads/*:refs/prefetch/%s/*", remote);
+	strvec_pushf(&child.args, "+refs/heads/*:refs/prefetch/%s/*", remote->name);
 
 	return !!run_command(&child);
 }
 
-static int append_remote(struct remote *remote, void *cbdata)
-{
-	struct string_list *remotes = (struct string_list *)cbdata;
-
-	string_list_append(remotes, remote->name);
-	return 0;
-}
-
 static int maintenance_task_prefetch(struct maintenance_run_opts *opts)
 {
-	int result = 0;
-	struct string_list_item *item;
-	struct string_list remotes = STRING_LIST_INIT_DUP;
-
 	git_config_set_multivar_gently("log.excludedecoration",
 					"refs/prefetch/",
 					"refs/prefetch/",
 					CONFIG_FLAGS_FIXED_VALUE |
 					CONFIG_FLAGS_MULTI_REPLACE);
 
-	if (for_each_remote(append_remote, &remotes)) {
-		error(_("failed to fill remotes"));
-		result = 1;
-		goto cleanup;
+	if (for_each_remote(fetch_remote, opts)) {
+		error(_("failed to prefetch remotes"));
+		return 1;
 	}
 
-	for_each_string_list_item(item, &remotes)
-		result |= fetch_remote(item->string, opts);
-
-cleanup:
-	string_list_clear(&remotes, 0);
-	return result;
+	return 0;
 }
 
 static int maintenance_task_gc(struct maintenance_run_opts *opts)
-- 
gitgitgadget



* [PATCH v3 2/3] fetch: add --prefetch option
  2021-04-10  2:03   ` [PATCH v3 0/3] Maintenance: adapt custom refspecs Derrick Stolee via GitGitGadget
  2021-04-10  2:03     ` [PATCH v3 1/3] maintenance: simplify prefetch logic Derrick Stolee via GitGitGadget
@ 2021-04-10  2:03     ` Derrick Stolee via GitGitGadget
  2021-04-11 21:09       ` Ramsay Jones
  2021-04-10  2:03     ` [PATCH v3 3/3] maintenance: use 'git fetch --prefetch' Derrick Stolee via GitGitGadget
                       ` (2 subsequent siblings)
  4 siblings, 1 reply; 72+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2021-04-10  2:03 UTC (permalink / raw)
  To: git
  Cc: tom.saeger, gitster, sunshine, Derrick Stolee, Josh Steadmon,
	Emily Shaffer, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

The --prefetch option will be used by the 'prefetch' maintenance task
instead of sending refspecs explicitly across the command-line. The
intention is to modify the refspec to place all results in
refs/prefetch/ instead of anywhere else.

Create helper method filter_prefetch_refspec() to modify a given refspec
to fit the rules expected of the prefetch task:

 * Negative refspecs are preserved.
 * Refspecs without a destination are removed.
 * Refspecs whose source starts with "refs/tags/" are removed.
 * Other refspecs are placed within "refs/prefetch/".

Finally, we add the 'force' option to ensure that prefetch refs are
replaced as necessary.
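
As a rough illustration of the last rule, here is a standalone sketch of
the destination rewrite. It is not the code in this patch: it uses plain
libc instead of strbuf, and prefetch_dst() is a hypothetical name chosen
for the example.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of Git): return a newly allocated
 * string holding the prefetch destination for old_dst, or NULL on
 * allocation failure. The caller frees the result.
 */
static char *prefetch_dst(const char *old_dst)
{
	static const char prefix[] = "refs/";
	static const char target[] = "refs/prefetch/";
	const char *sub = old_dst;
	size_t len;
	char *out;

	/* Drop a leading "refs/" so that it is not duplicated. */
	if (!strncmp(old_dst, prefix, strlen(prefix)))
		sub = old_dst + strlen(prefix);

	len = strlen(target) + strlen(sub) + 1;
	out = malloc(len);
	if (!out)
		return NULL;
	snprintf(out, len, "%s%s", target, sub);
	return out;
}
```

Under this sketch, "refs/remotes/origin/*" maps to
"refs/prefetch/remotes/origin/*", and a destination that is not under
"refs/", such as "bogus/*", maps to "refs/prefetch/bogus/*".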

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 Documentation/fetch-options.txt   |  5 +++
 builtin/fetch.c                   | 56 +++++++++++++++++++++++++++++++
 t/t5582-fetch-negative-refspec.sh | 30 +++++++++++++++++
 3 files changed, 91 insertions(+)

diff --git a/Documentation/fetch-options.txt b/Documentation/fetch-options.txt
index 07783deee309..9e7b4e189ce0 100644
--- a/Documentation/fetch-options.txt
+++ b/Documentation/fetch-options.txt
@@ -110,6 +110,11 @@ ifndef::git-pull[]
 	setting `fetch.writeCommitGraph`.
 endif::git-pull[]
 
+--prefetch::
+	Modify the configured refspec to place all refs into the
+	`refs/prefetch/` namespace. See the `prefetch` task in
+	linkgit:git-maintenance[1].
+
 -p::
 --prune::
 	Before fetching, remove any remote-tracking references that no
diff --git a/builtin/fetch.c b/builtin/fetch.c
index 0b90de87c7a2..30856b442b79 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -48,6 +48,7 @@ enum {
 static int fetch_prune_config = -1; /* unspecified */
 static int fetch_show_forced_updates = 1;
 static uint64_t forced_updates_ms = 0;
+static int prefetch = 0;
 static int prune = -1; /* unspecified */
 #define PRUNE_BY_DEFAULT 0 /* do we prune by default? */
 
@@ -158,6 +159,8 @@ static struct option builtin_fetch_options[] = {
 		    N_("do not fetch all tags (--no-tags)"), TAGS_UNSET),
 	OPT_INTEGER('j', "jobs", &max_jobs,
 		    N_("number of submodules fetched in parallel")),
+	OPT_BOOL(0, "prefetch", &prefetch,
+		 N_("modify the refspec to place all refs within refs/prefetch/")),
 	OPT_BOOL('p', "prune", &prune,
 		 N_("prune remote-tracking branches no longer on remote")),
 	OPT_BOOL('P', "prune-tags", &prune_tags,
@@ -436,6 +439,55 @@ static void find_non_local_tags(const struct ref *refs,
 	oidset_clear(&fetch_oids);
 }
 
+static void filter_prefetch_refspec(struct refspec *rs)
+{
+	int i;
+
+	if (!prefetch)
+		return;
+
+	for (i = 0; i < rs->nr; i++) {
+		struct strbuf new_dst = STRBUF_INIT;
+		char *old_dst;
+		const char *sub = NULL;
+
+		if (rs->items[i].negative)
+			continue;
+		if (!rs->items[i].dst ||
+		    (rs->items[i].src &&
+		     !strncmp(rs->items[i].src, "refs/tags/", 10))) {
+			int j;
+
+			free(rs->items[i].src);
+			free(rs->items[i].dst);
+
+			for (j = i + 1; j < rs->nr; j++) {
+				rs->items[j - 1] = rs->items[j];
+				rs->raw[j - 1] = rs->raw[j];
+			}
+			rs->nr--;
+			continue;
+		}
+
+		old_dst = rs->items[i].dst;
+		strbuf_addstr(&new_dst, "refs/prefetch/");
+
+		/*
+		 * If old_dst starts with "refs/", then place
+		 * sub after that prefix. Otherwise, start at
+		 * the beginning of the string.
+		 */
+		if (!skip_prefix(old_dst, "refs/", &sub))
+			sub = old_dst;
+		strbuf_addstr(&new_dst, sub);
+
+		rs->items[i].dst = strbuf_detach(&new_dst, NULL);
+		rs->items[i].force = 1;
+
+		free(old_dst);
+	}
+}
+
 static struct ref *get_ref_map(struct remote *remote,
 			       const struct ref *remote_refs,
 			       struct refspec *rs,
@@ -452,6 +504,10 @@ static struct ref *get_ref_map(struct remote *remote,
 	struct hashmap existing_refs;
 	int existing_refs_populated = 0;
 
+	filter_prefetch_refspec(rs);
+	if (remote)
+		filter_prefetch_refspec(&remote->fetch);
+
 	if (rs->nr) {
 		struct refspec *fetch_refspec;
 
diff --git a/t/t5582-fetch-negative-refspec.sh b/t/t5582-fetch-negative-refspec.sh
index f34509727702..030e6f978c4e 100755
--- a/t/t5582-fetch-negative-refspec.sh
+++ b/t/t5582-fetch-negative-refspec.sh
@@ -240,4 +240,34 @@ test_expect_success "push with matching +: and negative refspec" '
 	git -C two push -v one
 '
 
+test_expect_success '--prefetch correctly modifies refspecs' '
+	git -C one config --unset-all remote.origin.fetch &&
+	git -C one config --add remote.origin.fetch "refs/tags/*:refs/tags/*" &&
+	git -C one config --add remote.origin.fetch ^refs/heads/bogus/ignore &&
+	git -C one config --add remote.origin.fetch "refs/heads/bogus/*:bogus/*" &&
+
+	git tag -a -m never never-fetch-tag HEAD &&
+
+	git branch bogus/fetched HEAD~1 &&
+	git branch bogus/ignore HEAD &&
+
+	git -C one fetch --prefetch --no-tags &&
+	test_must_fail git -C one rev-parse never-fetch-tag &&
+	git -C one rev-parse refs/prefetch/bogus/fetched &&
+	test_must_fail git -C one rev-parse refs/prefetch/bogus/ignore &&
+
+	# correctly handle when refspec set becomes empty
+	# after removing the refs/tags/* refspec.
+	git -C one config --unset-all remote.origin.fetch &&
+	git -C one config --add remote.origin.fetch "refs/tags/*:refs/tags/*" &&
+
+	git -C one fetch --prefetch --no-tags &&
+	test_must_fail git -C one rev-parse never-fetch-tag &&
+
+	# Refspecs for refs that are not fully qualified
+	# are filtered multiple times.
+	git -C one rev-parse refs/prefetch/bogus/fetched &&
+	test_must_fail git -C one rev-parse refs/prefetch/bogus/ignore
+'
+
 test_done
-- 
gitgitgadget



* [PATCH v3 3/3] maintenance: use 'git fetch --prefetch'
  2021-04-10  2:03   ` [PATCH v3 0/3] Maintenance: adapt custom refspecs Derrick Stolee via GitGitGadget
  2021-04-10  2:03     ` [PATCH v3 1/3] maintenance: simplify prefetch logic Derrick Stolee via GitGitGadget
  2021-04-10  2:03     ` [PATCH v3 2/3] fetch: add --prefetch option Derrick Stolee via GitGitGadget
@ 2021-04-10  2:03     ` Derrick Stolee via GitGitGadget
  2021-04-11  1:35     ` [PATCH v3 0/3] Maintenance: adapt custom refspecs Junio C Hamano
  2021-04-16 12:49     ` [PATCH v4 0/4] " Derrick Stolee via GitGitGadget
  4 siblings, 0 replies; 72+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2021-04-10  2:03 UTC (permalink / raw)
  To: git
  Cc: tom.saeger, gitster, sunshine, Derrick Stolee, Josh Steadmon,
	Emily Shaffer, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

The 'prefetch' maintenance task previously forced the following refspec
for each remote:

	+refs/heads/*:refs/prefetch/<remote>/*

If a user has specified a stricter refspec for the remote, then this
prefetch task downloads more objects than necessary.

The previous change introduced the '--prefetch' option to 'git fetch'
which manipulates the remote's refspec to place all resulting refs into
refs/prefetch/, with further partitioning based on the destinations of
those refspecs.

Update the documentation to be more generic about the destination refs.
Do not mention custom refspecs explicitly, as that does not need to be
highlighted in this documentation. The important part of placing refs in
refs/prefetch/ remains.

Reported-by: Tom Saeger <tom.saeger@oracle.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 Documentation/git-maintenance.txt |  6 ++----
 builtin/gc.c                      |  7 +++----
 t/t7900-maintenance.sh            | 14 +++++++-------
 3 files changed, 12 insertions(+), 15 deletions(-)

diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt
index 80ddd33ceba0..1e738ad39832 100644
--- a/Documentation/git-maintenance.txt
+++ b/Documentation/git-maintenance.txt
@@ -92,10 +92,8 @@ commit-graph::
 prefetch::
 	The `prefetch` task updates the object directory with the latest
 	objects from all registered remotes. For each remote, a `git fetch`
-	command is run. The refmap is custom to avoid updating local or remote
-	branches (those in `refs/heads` or `refs/remotes`). Instead, the
-	remote refs are stored in `refs/prefetch/<remote>/`. Also, tags are
-	not updated.
+	command is run. The configured refspec is modified to place all
+	requested refs within `refs/prefetch/`. Also, tags are not updated.
 +
 This is done to avoid disrupting the remote-tracking branches. The end users
 expect these refs to stay unmoved unless they initiate a fetch.  With prefetch
diff --git a/builtin/gc.c b/builtin/gc.c
index fa8128de9ae1..9d35f7da50d8 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -879,15 +879,14 @@ static int fetch_remote(struct remote *remote, void *cbdata)
 	struct child_process child = CHILD_PROCESS_INIT;
 
 	child.git_cmd = 1;
-	strvec_pushl(&child.args, "fetch", remote->name, "--prune", "--no-tags",
+	strvec_pushl(&child.args, "fetch", remote->name,
+		     "--prefetch", "--prune", "--no-tags",
 		     "--no-write-fetch-head", "--recurse-submodules=no",
-		     "--refmap=", NULL);
+		     NULL);
 
 	if (opts->quiet)
 		strvec_push(&child.args, "--quiet");
 
-	strvec_pushf(&child.args, "+refs/heads/*:refs/prefetch/%s/*", remote->name);
-
 	return !!run_command(&child);
 }
 
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index 2412d8c5c006..eadb800c08cc 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -141,15 +141,15 @@ test_expect_success 'prefetch multiple remotes' '
 	test_commit -C clone1 one &&
 	test_commit -C clone2 two &&
 	GIT_TRACE2_EVENT="$(pwd)/run-prefetch.txt" git maintenance run --task=prefetch 2>/dev/null &&
-	fetchargs="--prune --no-tags --no-write-fetch-head --recurse-submodules=no --refmap= --quiet" &&
-	test_subcommand git fetch remote1 $fetchargs +refs/heads/\\*:refs/prefetch/remote1/\\* <run-prefetch.txt &&
-	test_subcommand git fetch remote2 $fetchargs +refs/heads/\\*:refs/prefetch/remote2/\\* <run-prefetch.txt &&
+	fetchargs="--prefetch --prune --no-tags --no-write-fetch-head --recurse-submodules=no --quiet" &&
+	test_subcommand git fetch remote1 $fetchargs <run-prefetch.txt &&
+	test_subcommand git fetch remote2 $fetchargs <run-prefetch.txt &&
 	test_path_is_missing .git/refs/remotes &&
-	git log prefetch/remote1/one &&
-	git log prefetch/remote2/two &&
+	git log prefetch/remotes/remote1/one &&
+	git log prefetch/remotes/remote2/two &&
 	git fetch --all &&
-	test_cmp_rev refs/remotes/remote1/one refs/prefetch/remote1/one &&
-	test_cmp_rev refs/remotes/remote2/two refs/prefetch/remote2/two &&
+	test_cmp_rev refs/remotes/remote1/one refs/prefetch/remotes/remote1/one &&
+	test_cmp_rev refs/remotes/remote2/two refs/prefetch/remotes/remote2/two &&
 
 	test_cmp_config refs/prefetch/ log.excludedecoration &&
 	git log --oneline --decorate --all >log &&
-- 
gitgitgadget


* Re: [PATCH 5/5] maintenance: allow custom refspecs during prefetch
  2021-04-10  0:56           ` Derrick Stolee
@ 2021-04-10 11:37             ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 72+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-04-10 11:37 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: Derrick Stolee via GitGitGadget, git, tom.saeger, gitster,
	sunshine, Derrick Stolee, Derrick Stolee


On Sat, Apr 10 2021, Derrick Stolee wrote:

> On 4/9/2021 3:28 PM, Ævar Arnfjörð Bjarmason wrote:
>> 
>> On Fri, Apr 09 2021, Derrick Stolee wrote:
>> 
>>> On 4/7/2021 6:26 AM, Ævar Arnfjörð Bjarmason wrote:
>>>> I think converting the whole thing to something like the WIP/RFC patch
>>>> below is much better and more readable.
>>>
>>> This is an interesting approach. I don't see you using the ERR that you
>>> are inputting anywhere, so that seems like unnecessary bloat for the
>>> consumers. Maybe I haven't discovered all of the places where this
>>> would be useful, but it seems better to pipe stderr to a file for later
>>> comparison when needed.
>> 
>> Yes, it's probably not a good default here. For the test-lib.sh tests
>> there's check_sub_test_lib_test and check_sub_test_lib_test_err, most of
>> the tests only test stdout.
>> 
>>>> +test_expect_process_tree () {
>>>> +	depth= &&
>>>> +	>actual &&
>>>> +	cat >expect &&
>>>> +	cat <&3 >expect.err
>>>> +	while test $# != 0
>>>> +	do
>>>> +		case "$1" in
>>>> +		--depth)
>>>> +			depth="$2"
>>>> +			shift
>>>> +			;;
>>>> +		*)
>>>> +			break
>>>> +			;;
>>>> +		esac
>>>> +		shift
>>>> +	done &&
>>> Do you have an example where this is being checked? Or can depth
>>> be left as 1 for now?
>> 
>> It can probably be hardcoded, but I was hoping someone more familiar
>> with trace2 would chime in, but I'm fairly sure there's not a way to do
>> it without parsing the existing output with either some clever
>> grep/awk-ing of the PERF output, or stateful parsing of the JSON.
>> 
>> I thought that for git maintenance tests perhaps something wanted to
>> assert that we didn't have maintenance invoking maintenance, or that
>> something expected to prune refs really invoked the relevant prune
>> command via "gc".
>> 
>>>> +	log="$(pwd)/proc-tree.txt" &&
>>>> +	>"$log" &&
>>>> +	GIT_TRACE2_PERF="$log" "$@" 2>actual.err &&
>>>> +	grep "child_start" proc-tree.txt >proc-tree-start.txt || : &&
>>>> +	if test -n "$depth"
>>>> +	then
>>>> +		grep " d$depth " proc-tree-start.txt >tmp.txt || : &&
>>>> +		mv tmp.txt proc-tree-start.txt
>>>> +	fi &&
>>>> +	sed -e 's/^.*argv:\[//' -e 's/\]$//' <proc-tree-start.txt >actual &&
>>>> +	test_cmp expect actual &&
>>>> +	test_cmp expect.err actual.err
>>>> +} 7>&2 2>&4
>>>
>>> I think similar ideas could apply to test_region. Giving it a try
>>> now.
>> 
>> Probably, I didn't even notice that one...
>
> I gave this a few hours today, and I'm giving up. I'm the first to
> admit that I don't have the correct scripting skills to do some of
> these things.
>
> I've got what I tried below. It certainly looks like it would work.
> It solves the problem of "what if the test is flaky?" by ensuring that
> all subcommands (at depth 0) match the inputs exactly.

Looks good!

> However, the problem comes when trying to make that work for all of
> the maintenance tests, specifically the 'incremental-repack' task.
> That task dynamically computes a --batch-size=X parameter, and that
> is not stable across runs of the script.
>
> This was avoided in the past by only checking for the first of three
> subcommands when verifying that the 'incremental-repack' task worked.
> That is, except for the EXPENSIVE test that checks that the --batch-size
> maxes out at 2g.
>
> The thing that might make these changing parameters work is to allow
> the specified lines to be a _prefix_ of the actual parameters. Or, let
> each line be a pattern that is checked against that line. Issues come
> up with how to handle this line-by-line check that I was unable to
> overcome.
>
> The good news is that the idea of adding a '--prefetch' option to
> 'git fetch' makes the change to t7900-maintenance.sh much easier,
> making this change to test_subcommand less of a priority.
>
> I include my attempt here as a patch. Feel free to take whatever
> you want of it, or none of it and start over. I do think that it
> makes the test script look much nicer.

I think a good way to deal with that is to have a lower-level helper
function that doesn't do the test_cmp, and instead just runs the
command, and leaves the stdout/stderr files in-place for another "check"
helper.

On a WIP topic I split up the t0000-basic.sh "test test-lib.sh itself"
code to do pretty much that:
https://github.com/avar/git/blob/avar/support-test-verbose-under-prove-2/t/lib-subtest.sh

It's basically back to the existing model of "run a command and grep it
later", except that we can pass the JSON through some basic parser, and
extract common cases like test_cmp, prefix munging before test_cmp etc.


* Re: [PATCH v3 0/3] Maintenance: adapt custom refspecs
  2021-04-10  2:03   ` [PATCH v3 0/3] Maintenance: adapt custom refspecs Derrick Stolee via GitGitGadget
                       ` (2 preceding siblings ...)
  2021-04-10  2:03     ` [PATCH v3 3/3] maintenance: use 'git fetch --prefetch' Derrick Stolee via GitGitGadget
@ 2021-04-11  1:35     ` Junio C Hamano
  2021-04-12 16:48       ` Tom Saeger
  2021-04-16 12:49     ` [PATCH v4 0/4] " Derrick Stolee via GitGitGadget
  4 siblings, 1 reply; 72+ messages in thread
From: Junio C Hamano @ 2021-04-11  1:35 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: git, tom.saeger, sunshine, Derrick Stolee, Josh Steadmon,
	Emily Shaffer, Derrick Stolee

"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:

>  * The fix is almost completely rewritten as an update to 'git fetch'. See
>    the new PATCH 2 for this update.

I do agree that it gives us the most flexibility there with nice
encapsulation.  Nobody other than "git fetch" needs to know how it
computes which remote refs are fetched given the real refspec, and
the only thing the caller with "--prefetch" is interested in is that
the prefetch operation would not contaminate the remote-tracking
refs.

Great idea.  I wish I were the one who thought of it first ;-)


* Re: [PATCH v3 2/3] fetch: add --prefetch option
  2021-04-10  2:03     ` [PATCH v3 2/3] fetch: add --prefetch option Derrick Stolee via GitGitGadget
@ 2021-04-11 21:09       ` Ramsay Jones
  2021-04-12 20:23         ` Derrick Stolee
  0 siblings, 1 reply; 72+ messages in thread
From: Ramsay Jones @ 2021-04-11 21:09 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget, git
  Cc: tom.saeger, gitster, sunshine, Derrick Stolee, Josh Steadmon,
	Emily Shaffer, Derrick Stolee, Derrick Stolee



On 10/04/2021 03:03, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <dstolee@microsoft.com>
> 
> The --prefetch option will be used by the 'prefetch' maintenance task
> instead of sending refspecs explicitly across the command-line. The
> intention is to modify the refspec to place all results in
> refs/prefetch/ instead of anywhere else.
> 
> Create helper method filter_prefetch_refspec() to modify a given refspec
> to fit the rules expected of the prefetch task:
> 
>  * Negative refspecs are preserved.
>  * Refspecs without a destination are removed.
>  * Refspecs whose source starts with "refs/tags/" are removed.
>  * Other refspecs are placed within "refs/prefetch/".
> 
> Finally, we add the 'force' option to ensure that prefetch refs are
> replaced as necessary.
> 
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
>  Documentation/fetch-options.txt   |  5 +++
>  builtin/fetch.c                   | 56 +++++++++++++++++++++++++++++++
>  t/t5582-fetch-negative-refspec.sh | 30 +++++++++++++++++
>  3 files changed, 91 insertions(+)
> 
> diff --git a/Documentation/fetch-options.txt b/Documentation/fetch-options.txt
> index 07783deee309..9e7b4e189ce0 100644
> --- a/Documentation/fetch-options.txt
> +++ b/Documentation/fetch-options.txt
> @@ -110,6 +110,11 @@ ifndef::git-pull[]
>  	setting `fetch.writeCommitGraph`.
>  endif::git-pull[]
>  
> +--prefetch::
> +	Modify the configured refspec to place all refs into the
> +	`refs/prefetch/` namespace. See the `prefetch` task in
> +	linkgit:git-maintenance[1].
> +
>  -p::
>  --prune::
>  	Before fetching, remove any remote-tracking references that no
> diff --git a/builtin/fetch.c b/builtin/fetch.c
> index 0b90de87c7a2..30856b442b79 100644
> --- a/builtin/fetch.c
> +++ b/builtin/fetch.c
> @@ -48,6 +48,7 @@ enum {
>  static int fetch_prune_config = -1; /* unspecified */
>  static int fetch_show_forced_updates = 1;
>  static uint64_t forced_updates_ms = 0;
> +static int prefetch = 0;
>  static int prune = -1; /* unspecified */
>  #define PRUNE_BY_DEFAULT 0 /* do we prune by default? */
>  
> @@ -158,6 +159,8 @@ static struct option builtin_fetch_options[] = {
>  		    N_("do not fetch all tags (--no-tags)"), TAGS_UNSET),
>  	OPT_INTEGER('j', "jobs", &max_jobs,
>  		    N_("number of submodules fetched in parallel")),
> +	OPT_BOOL(0, "prefetch", &prefetch,
> +		 N_("modify the refspec to place all refs within refs/prefetch/")),
>  	OPT_BOOL('p', "prune", &prune,
>  		 N_("prune remote-tracking branches no longer on remote")),
>  	OPT_BOOL('P', "prune-tags", &prune_tags,
> @@ -436,6 +439,55 @@ static void find_non_local_tags(const struct ref *refs,
>  	oidset_clear(&fetch_oids);
>  }
>  
> +static void filter_prefetch_refspec(struct refspec *rs)
> +{
> +	int i;
> +
> +	if (!prefetch)
> +		return;
> +
> +	for (i = 0; i < rs->nr; i++) {
> +		struct strbuf new_dst = STRBUF_INIT;
> +		char *old_dst;
> +		const char *sub = NULL;
> +
> +		if (rs->items[i].negative)
> +			continue;
> +		if (!rs->items[i].dst ||
> +		    (rs->items[i].src &&
> +		     !strncmp(rs->items[i].src, "refs/tags/", 10))) {
> +			int j;
> +
> +			free(rs->items[i].src);
> +			free(rs->items[i].dst);
> +
> +			for (j = i + 1; j < rs->nr; j++) {
> +				rs->items[j - 1] = rs->items[j];
> +				rs->raw[j - 1] = rs->raw[j];
> +			}
> +			rs->nr--;

Hmm, don't you need to do 'i--;' here?

(Sorry in advance if this is nonsense, I am just skimming the
patches without reading the whole series carefully).

Maybe try a test which has an entry, which requires the 'prefetch'
modification, that immediately follows a 'tag' or 'empty dst' entry.
(I can't quite tell, just reading the email, whether that is covered
by the tests below - so please just ignore me if it already works ;)
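
To make the concern concrete, here is a minimal standalone sketch on a
plain int array. It is not Git code, and drop_negatives() is a
hypothetical stand-in for the refspec loop: entries are removed in place
by shifting the tail down, and the index is stepped back so the entry
that slid into slot i is still examined.

```c
/*
 * Hypothetical stand-in for the refspec loop: remove every negative
 * value from arr in place, shifting later entries down, and return
 * the new length.
 */
static int drop_negatives(int *arr, int nr)
{
	int i;

	for (i = 0; i < nr; i++) {
		if (arr[i] < 0) {
			int j;

			for (j = i + 1; j < nr; j++)
				arr[j - 1] = arr[j];
			nr--;
			/*
			 * Slot i now holds an entry that has not been
			 * examined yet; step back so the loop's i++
			 * lands on it again.
			 */
			i--;
			continue;
		}
	}
	return nr;
}
```

Without the 'i--', an array with two adjacent negative entries would
keep the second one, which is the analogue of a refspec that follows a
removed 'tag' or 'empty dst' entry being skipped.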

ATB,
Ramsay Jones


> +			continue;
> +		}
> +
> +		old_dst = rs->items[i].dst;
> +		strbuf_addstr(&new_dst, "refs/prefetch/");
> +
> +		/*
> +		 * If old_dst starts with "refs/", then place
> +		 * sub after that prefix. Otherwise, start at
> +		 * the beginning of the string.
> +		 */
> +		if (!skip_prefix(old_dst, "refs/", &sub))
> +			sub = old_dst;
> +		strbuf_addstr(&new_dst, sub);
> +
> +		rs->items[i].dst = strbuf_detach(&new_dst, NULL);
> +		rs->items[i].force = 1;
> +
> +		free(old_dst);
> +	}
> +}
> +
>  static struct ref *get_ref_map(struct remote *remote,
>  			       const struct ref *remote_refs,
>  			       struct refspec *rs,
> @@ -452,6 +504,10 @@ static struct ref *get_ref_map(struct remote *remote,
>  	struct hashmap existing_refs;
>  	int existing_refs_populated = 0;
>  
> +	filter_prefetch_refspec(rs);
> +	if (remote)
> +		filter_prefetch_refspec(&remote->fetch);
> +
>  	if (rs->nr) {
>  		struct refspec *fetch_refspec;
>  
> diff --git a/t/t5582-fetch-negative-refspec.sh b/t/t5582-fetch-negative-refspec.sh
> index f34509727702..030e6f978c4e 100755
> --- a/t/t5582-fetch-negative-refspec.sh
> +++ b/t/t5582-fetch-negative-refspec.sh
> @@ -240,4 +240,34 @@ test_expect_success "push with matching +: and negative refspec" '
>  	git -C two push -v one
>  '
>  
> +test_expect_success '--prefetch correctly modifies refspecs' '
> +	git -C one config --unset-all remote.origin.fetch &&
> +	git -C one config --add remote.origin.fetch "refs/tags/*:refs/tags/*" &&
> +	git -C one config --add remote.origin.fetch ^refs/heads/bogus/ignore &&
> +	git -C one config --add remote.origin.fetch "refs/heads/bogus/*:bogus/*" &&
> +
> +	git tag -a -m never never-fetch-tag HEAD &&
> +
> +	git branch bogus/fetched HEAD~1 &&
> +	git branch bogus/ignore HEAD &&
> +
> +	git -C one fetch --prefetch --no-tags &&
> +	test_must_fail git -C one rev-parse never-fetch-tag &&
> +	git -C one rev-parse refs/prefetch/bogus/fetched &&
> +	test_must_fail git -C one rev-parse refs/prefetch/bogus/ignore &&
> +
> +	# correctly handle when refspec set becomes empty
> +	# after removing the refs/tags/* refspec.
> +	git -C one config --unset-all remote.origin.fetch &&
> +	git -C one config --add remote.origin.fetch "refs/tags/*:refs/tags/*" &&
> +
> +	git -C one fetch --prefetch --no-tags &&
> +	test_must_fail git -C one rev-parse never-fetch-tag &&
> +
> +	# Refspecs for refs that are not fully qualified
> +	# are filtered multiple times.
> +	git -C one rev-parse refs/prefetch/bogus/fetched &&
> +	test_must_fail git -C one rev-parse refs/prefetch/bogus/ignore
> +'
> +
>  test_done
> 


* Re: [PATCH v3 0/3] Maintenance: adapt custom refspecs
  2021-04-11  1:35     ` [PATCH v3 0/3] Maintenance: adapt custom refspecs Junio C Hamano
@ 2021-04-12 16:48       ` Tom Saeger
  2021-04-12 17:24         ` Tom Saeger
  0 siblings, 1 reply; 72+ messages in thread
From: Tom Saeger @ 2021-04-12 16:48 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Derrick Stolee via GitGitGadget, git, sunshine, Derrick Stolee,
	Josh Steadmon, Emily Shaffer, Derrick Stolee

On Sat, Apr 10, 2021 at 06:35:40PM -0700, Junio C Hamano wrote:
> "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:
> 
> >  * The fix is almost completely rewritten as an update to 'git fetch'. See
> >    the new PATCH 2 for this update.
> 
> I do agree that it gives us the most flexibility there with nice
> encapsulation.  Nobody other than "git fetch" needs to know how it
> computes which remote refs are fetched given the real pathspec, and
> the only thing the caller with "--prefetch" is interested in is that
> the prefetch operation would not contaminate the remote-tracking
> refs.
> 
> Great idea.  I wish I were the one who thought of it first ;-)

Yes - this simplifies things greatly!

I do have one case that fails prefetch though.
It's a case where all the remote's fetch configs are filtered out.

Example:

	[remote "pr-924"]
	    url = https://github.com/gitgitgadget/git
	    fetch = +refs/tags/pr-924/derrickstolee/maintenance/refspec-v3
	    skipfetchall = true
	    tagopt = --no-tags


In this case, running `git fetch pr-924` will fetch and update
FETCH_HEAD, but running with maintenance prefetch task results in:

fatal: Couldn't find remote ref HEAD
error: failed to prefetch remotes
error: task 'prefetch' failed

I tracked this down a bit, but don't have a suggestion how to fix it.

builtin/fetch.c `get_ref_map` makes two calls to `filter_prefetch_refspec`,
one for 'rs' and another for 'remote->fetch'.

`filter_prefetch_refspec` works and filters out the above fetch config.
This correctly yields condition
`rs->nr == 0` and `remote->fetch.nr == 0`

Later a call is made to `get_remote_ref(remote_refs, "HEAD")` which
fails, leading to `fatal: Couldn't find remote ref HEAD`

Should this be expected, or should this now be special-cased for 'prefetch'
somehow?

Regards,

--Tom

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v3 0/3] Maintenance: adapt custom refspecs
  2021-04-12 16:48       ` Tom Saeger
@ 2021-04-12 17:24         ` Tom Saeger
  2021-04-12 17:41           ` Tom Saeger
  0 siblings, 1 reply; 72+ messages in thread
From: Tom Saeger @ 2021-04-12 17:24 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Derrick Stolee via GitGitGadget, git, sunshine, Derrick Stolee,
	Josh Steadmon, Emily Shaffer, Derrick Stolee

On Mon, Apr 12, 2021 at 11:48:09AM -0500, Tom Saeger wrote:
> On Sat, Apr 10, 2021 at 06:35:40PM -0700, Junio C Hamano wrote:
> > "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:
> > 
> > >  * The fix is almost completely rewritten as an update to 'git fetch'. See
> > >    the new PATCH 2 for this update.
> > 
> > I do agree that it gives us the most flexibility there with nice
> > encapsulation.  Nobody other than "git fetch" needs to know how it
> > computes which remote refs are fetched given the real pathspec, and
> > the only thing the caller with "--prefetch" is interested in is that
> > the prefetch operation would not contaminate the remote-tracking
> > refs.
> > 
> > Great idea.  I wish I were the one who thought of it first ;-)
> 
> Yes - this simplifies things greatly!
> 
> I do have one case that fails prefetch though.
> It's a case where all the remote's fetch configs are filtered out.
> 
> Example:
> 
> 	[remote "pr-924"]
> 	    url = https://github.com/gitgitgadget/git
> 	    fetch = +refs/tags/pr-924/derrickstolee/maintenance/refspec-v3
> 	    skipfetchall = true
> 	    tagopt = --no-tags
> 
> 
> In this case, running `git fetch pr-924` will fetch and update
> FETCH_HEAD, but running with maintenance prefetch task results in:
> 
> fatal: Couldn't find remote ref HEAD
> error: failed to prefetch remotes
> error: task 'prefetch' failed
> 
> I tracked this down a bit, but don't have a suggestion how to fix it.

This ugly hack fixes this failure.  I'll keep staring at it.


diff --git a/builtin/fetch.c b/builtin/fetch.c
index 30856b442b79..6489ce7d8d3b 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -508,6 +508,9 @@ static struct ref *get_ref_map(struct remote *remote,
        if (remote)
                filter_prefetch_refspec(&remote->fetch);

+       if (prefetch && !rs->nr && remote && !remote->fetch.nr)
+               return NULL;
+
        if (rs->nr) {
                struct refspec *fetch_refspec;

--



> 
> builtin/fetch.c `get_ref_map` makes two calls to `filter_prefetch_refspec`,
> one for 'rs' and another for 'remote->fetch'.
> 
> `filter_prefetch_refspec` works and filters out the above fetch config.
> This correctly yields condition
> `rs->nr == 0` and `remote->fetch.nr == 0`
> 
> Later a call is made to `get_remote_ref(remote_refs, "HEAD")` which
> fails, leading to `fatal: Couldn't find remote ref HEAD`
> 
> Should this be expected, or should this now be special-cased for 'prefetch'
> somehow?
> 
> Regards,
> 
> --Tom

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* Re: [PATCH v3 0/3] Maintenance: adapt custom refspecs
  2021-04-12 17:24         ` Tom Saeger
@ 2021-04-12 17:41           ` Tom Saeger
  2021-04-12 20:25             ` Derrick Stolee
  0 siblings, 1 reply; 72+ messages in thread
From: Tom Saeger @ 2021-04-12 17:41 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Derrick Stolee via GitGitGadget, git, sunshine, Derrick Stolee,
	Josh Steadmon, Emily Shaffer, Derrick Stolee

On Mon, Apr 12, 2021 at 12:24:27PM -0500, Tom Saeger wrote:
> On Mon, Apr 12, 2021 at 11:48:09AM -0500, Tom Saeger wrote:
> > On Sat, Apr 10, 2021 at 06:35:40PM -0700, Junio C Hamano wrote:
> > > "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:
> > > 
> > > >  * The fix is almost completely rewritten as an update to 'git fetch'. See
> > > >    the new PATCH 2 for this update.
> > > 
> > > I do agree that it gives us the most flexibility there with nice
> > > encapsulation.  Nobody other than "git fetch" needs to know how it
> > > computes which remote refs are fetched given the real pathspec, and
> > > the only thing the caller with "--prefetch" is interested in is that
> > > the prefetch operation would not contaminate the remote-tracking
> > > refs.
> > > 
> > > Great idea.  I wish I were the one who thought of it first ;-)
> > 
> > Yes - this simplifies things greatly!
> > 
> > I do have one case that fails prefetch though.
> > It's a case where all the remote's fetch configs are filtered out.
> > 
> > Example:
> > 
> > 	[remote "pr-924"]
> > 	    url = https://github.com/gitgitgadget/git
> > 	    fetch = +refs/tags/pr-924/derrickstolee/maintenance/refspec-v3
> > 	    skipfetchall = true
> > 	    tagopt = --no-tags
> > 
> > 
> > In this case, running `git fetch pr-924` will fetch and update
> > FETCH_HEAD, but running with maintenance prefetch task results in:
> > 
> > fatal: Couldn't find remote ref HEAD
> > error: failed to prefetch remotes
> > error: task 'prefetch' failed
> > 
> > I tracked this down a bit, but don't have a suggestion how to fix it.
> 
> This ugly hack fixes this failure.  I'll keep staring at it.
> 
> 
> diff --git a/builtin/fetch.c b/builtin/fetch.c
> index 30856b442b79..6489ce7d8d3b 100644
> --- a/builtin/fetch.c
> +++ b/builtin/fetch.c
> @@ -508,6 +508,9 @@ static struct ref *get_ref_map(struct remote *remote,
>         if (remote)
>                 filter_prefetch_refspec(&remote->fetch);
> 
> +       if (prefetch && !rs->nr && remote && !remote->fetch.nr)
> +               return NULL;
> +
>         if (rs->nr) {
>                 struct refspec *fetch_refspec;
> 
> --
> 

Less ugly fix...

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 30856b442b79..5fbffbd17d7d 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -576,6 +576,8 @@ static struct ref *get_ref_map(struct remote *remote,
                        if (has_merge &&
                            !strcmp(branch->remote_name, remote->name))
                                add_merge_config(&ref_map, remote_refs, branch, &tail);
+               } else if (prefetch) {
+                       ;
                } else {
                        ref_map = get_remote_ref(remote_refs, "HEAD");
                        if (!ref_map)
--

Other ideas?


> 
> 
> > 
> > builtin/fetch.c `get_ref_map` makes two calls to `filter_prefetch_refspec`,
> > one for 'rs' and another for 'remote->fetch'.
> > 
> > `filter_prefetch_refspec` works and filters out the above fetch config.
> > This correctly yields condition
> > `rs->nr == 0` and `remote->fetch.nr == 0`
> > 
> > Later a call is made to `get_remote_ref(remote_refs, "HEAD")` which
> > fails, leading to `fatal: Couldn't find remote ref HEAD`
> > 
> > Should this be expected, or should this now be special-cased for 'prefetch'
> > somehow?
> > 
> > Regards,
> > 
> > --Tom

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* Re: [PATCH v3 1/3] maintenance: simplify prefetch logic
  2021-04-10  2:03     ` [PATCH v3 1/3] maintenance: simplify prefetch logic Derrick Stolee via GitGitGadget
@ 2021-04-12 20:13       ` Tom Saeger
  2021-04-12 20:27         ` Derrick Stolee
  0 siblings, 1 reply; 72+ messages in thread
From: Tom Saeger @ 2021-04-12 20:13 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: git, gitster, sunshine, Derrick Stolee, Josh Steadmon,
	Emily Shaffer, Derrick Stolee, Derrick Stolee

On Sat, Apr 10, 2021 at 02:03:43AM +0000, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <dstolee@microsoft.com>
> 
> The previous logic filled a string list with the names of each remote,
> but instead we could simply run the appropriate 'git fetch' data
> directly in the remote iterator. Do this for reduced code size, but also
> because it sets up an upcoming change to use the remote's refspec. This
> data is accessible from the 'struct remote' data that is now accessible
> in fetch_remote().
> 
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
>  builtin/gc.c | 33 ++++++++-------------------------
>  1 file changed, 8 insertions(+), 25 deletions(-)
> 
> diff --git a/builtin/gc.c b/builtin/gc.c
> index ef7226d7bca4..fa8128de9ae1 100644
> --- a/builtin/gc.c
> +++ b/builtin/gc.c
> @@ -873,55 +873,38 @@ static int maintenance_task_commit_graph(struct maintenance_run_opts *opts)
>  	return 0;
>  }
>  
> -static int fetch_remote(const char *remote, struct maintenance_run_opts *opts)
> +static int fetch_remote(struct remote *remote, void *cbdata)
>  {
> +	struct maintenance_run_opts *opts = cbdata;
>  	struct child_process child = CHILD_PROCESS_INIT;


I think this might be appropriate to add:


       if (remote->skip_default_update)
               return 0;


maintenance prefetch is acting like `git fetch --all`
So it should also skip remotes with configs `skipfetchall = true`
Agree?
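
As a rough illustration of that suggestion, here is a minimal sketch of an
iterator callback that returns early for remotes marked to be skipped. The
names demo_remote, demo_for_each_remote() and demo_prefetch() are
hypothetical stand-ins, not Git's own types or functions; only the
skip_default_update field name is taken from the snippet above.

```c
#include <assert.h>

/* Hypothetical stand-in for 'struct remote'; only two fields are modeled. */
struct demo_remote {
	const char *name;
	int skip_default_update; /* set when skipfetchall = true */
};

typedef int (*demo_remote_fn)(struct demo_remote *remote, void *cbdata);

/*
 * Hypothetical stand-in for a remote iterator: call fn on each remote
 * and stop at the first non-zero return value.
 */
int demo_for_each_remote(struct demo_remote *remotes, int nr,
			 demo_remote_fn fn, void *cbdata)
{
	int i, result = 0;

	assert(fn);
	for (i = 0; i < nr && !result; i++)
		result = fn(&remotes[i], cbdata);
	return result;
}

/* Callback: count the remotes that would be prefetched. */
int demo_prefetch(struct demo_remote *remote, void *cbdata)
{
	int *fetched = cbdata;

	/* Skipping is reported as success so that iteration continues. */
	if (remote->skip_default_update)
		return 0;

	(*fetched)++;
	return 0;
}
```

Returning 0 for a skipped remote matters here: a non-zero value would end
the iteration and be reported as a failed task.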

>  
>  	child.git_cmd = 1;
> -	strvec_pushl(&child.args, "fetch", remote, "--prune", "--no-tags",
> +	strvec_pushl(&child.args, "fetch", remote->name, "--prune", "--no-tags",
>  		     "--no-write-fetch-head", "--recurse-submodules=no",
>  		     "--refmap=", NULL);
>  
>  	if (opts->quiet)
>  		strvec_push(&child.args, "--quiet");
>  
> -	strvec_pushf(&child.args, "+refs/heads/*:refs/prefetch/%s/*", remote);
> +	strvec_pushf(&child.args, "+refs/heads/*:refs/prefetch/%s/*", remote->name);
>  
>  	return !!run_command(&child);
>  }
>  
> -static int append_remote(struct remote *remote, void *cbdata)
> -{
> -	struct string_list *remotes = (struct string_list *)cbdata;
> -
> -	string_list_append(remotes, remote->name);
> -	return 0;
> -}
> -
>  static int maintenance_task_prefetch(struct maintenance_run_opts *opts)
>  {
> -	int result = 0;
> -	struct string_list_item *item;
> -	struct string_list remotes = STRING_LIST_INIT_DUP;
> -
>  	git_config_set_multivar_gently("log.excludedecoration",
>  					"refs/prefetch/",
>  					"refs/prefetch/",
>  					CONFIG_FLAGS_FIXED_VALUE |
>  					CONFIG_FLAGS_MULTI_REPLACE);
>  
> -	if (for_each_remote(append_remote, &remotes)) {
> -		error(_("failed to fill remotes"));
> -		result = 1;
> -		goto cleanup;
> +	if (for_each_remote(fetch_remote, opts)) {
> +		error(_("failed to prefetch remotes"));
> +		return 1;
>  	}
>  
> -	for_each_string_list_item(item, &remotes)
> -		result |= fetch_remote(item->string, opts);
> -
> -cleanup:
> -	string_list_clear(&remotes, 0);
> -	return result;
> +	return 0;
>  }
>  
>  static int maintenance_task_gc(struct maintenance_run_opts *opts)
> -- 
> gitgitgadget
> 

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v3 2/3] fetch: add --prefetch option
  2021-04-11 21:09       ` Ramsay Jones
@ 2021-04-12 20:23         ` Derrick Stolee
  0 siblings, 0 replies; 72+ messages in thread
From: Derrick Stolee @ 2021-04-12 20:23 UTC (permalink / raw)
  To: Ramsay Jones, Derrick Stolee via GitGitGadget, git
  Cc: tom.saeger, gitster, sunshine, Josh Steadmon, Emily Shaffer,
	Derrick Stolee, Derrick Stolee

On 4/11/21 5:09 PM, Ramsay Jones wrote:
> On 10/04/2021 03:03, Derrick Stolee via GitGitGadget wrote:
>> From: Derrick Stolee <dstolee@microsoft.com>
>> +	for (i = 0; i < rs->nr; i++) {
>> +		struct strbuf new_dst = STRBUF_INIT;
>> +		char *old_dst;
>> +		const char *sub = NULL;
>> +
>> +		if (rs->items[i].negative)
>> +			continue;
>> +		if (!rs->items[i].dst ||
>> +		    (rs->items[i].src &&
>> +		     !strncmp(rs->items[i].src, "refs/tags/", 10))) {
>> +			int j;
>> +
>> +			free(rs->items[i].src);
>> +			free(rs->items[i].dst);
>> +
>> +			for (j = i + 1; j < rs->nr; j++) {
>> +				rs->items[j - 1] = rs->items[j];
>> +				rs->raw[j - 1] = rs->raw[j];
>> +			}
>> +			rs->nr--;
> 
> Hmm, don't you need to do 'i--;' here?
> 
> (Sorry in advance if this is nonsense, I am just skimming the
> patches without reading the whole series carefully).
> 
> Maybe try a test which has an entry, which requires the 'prefetch'
> modification, that immediately follows a 'tag' or 'empty dst' entry.
> (I can't quite tell, just reading the email, whether that is covered
> by the tests below - so please just ignore me if it already works ;)

You are absolutely right! Thanks.
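
To make the issue concrete, here is a minimal sketch of the same
remove-and-shift loop, with plain integers standing in for refspec items
and "negative value" standing in for "item to delete". The function name
drop_negatives() is hypothetical and only serves the illustration.

```c
#include <assert.h>

/*
 * Remove every negative value from arr in place and return the new
 * length. Mirrors the shape of the loop quoted above.
 */
int drop_negatives(int *arr, int nr)
{
	int i, j;

	assert(arr);
	for (i = 0; i < nr; i++) {
		if (arr[i] >= 0)
			continue;

		/* Shift the remaining entries down over slot i. */
		for (j = i + 1; j < nr; j++)
			arr[j - 1] = arr[j];
		nr--;

		/*
		 * Slot i now holds the entry that used to be at i + 1.
		 * Step back so that the loop's i++ revisits it.
		 */
		i--;
	}
	return nr;
}
```

Without the "i--", the entry that slides into slot i is never examined, so
two adjacent entries that both need removal (or, in the refspec case, an
entry needing modification right after a removed one) would be missed.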

-Stolee

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v3 0/3] Maintenance: adapt custom refspecs
  2021-04-12 17:41           ` Tom Saeger
@ 2021-04-12 20:25             ` Derrick Stolee
  0 siblings, 0 replies; 72+ messages in thread
From: Derrick Stolee @ 2021-04-12 20:25 UTC (permalink / raw)
  To: Tom Saeger, Junio C Hamano
  Cc: Derrick Stolee via GitGitGadget, git, sunshine, Josh Steadmon,
	Emily Shaffer, Derrick Stolee

On 4/12/21 1:41 PM, Tom Saeger wrote:
> 
> Less ugly fix...
> 
> diff --git a/builtin/fetch.c b/builtin/fetch.c
> index 30856b442b79..5fbffbd17d7d 100644
> --- a/builtin/fetch.c
> +++ b/builtin/fetch.c
> @@ -576,6 +576,8 @@ static struct ref *get_ref_map(struct remote *remote,
>                         if (has_merge &&
>                             !strcmp(branch->remote_name, remote->name))
>                                 add_merge_config(&ref_map, remote_refs, branch, &tail);
> +               } else if (prefetch) {
> +                       ;
>                 } else {

I'll give this a try, but with "else if (!prefetch)" for the
last block, instead.
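
A simplified model of the resulting decision, as a sketch only: the enum,
the function pick_ref_source() and its flattened integer parameters are
hypothetical, and the real get_ref_map() has more branches than are shown
here.

```c
#include <assert.h>

enum ref_source {
	FROM_COMMAND_LINE,  /* refspecs given as arguments (rs->nr != 0) */
	FROM_REMOTE_CONFIG, /* remote.<name>.fetch or branch merge config */
	FROM_REMOTE_HEAD,   /* fallback: look up HEAD on the remote */
	NOTHING_TO_FETCH    /* --prefetch with every refspec filtered out */
};

/*
 * Assumed, simplified view of the branches discussed in this thread.
 * rs_nr and remote_fetch_nr are the refspec counts after filtering.
 */
enum ref_source pick_ref_source(int rs_nr, int remote_fetch_nr,
				int has_merge, int prefetch)
{
	assert(rs_nr >= 0 && remote_fetch_nr >= 0);

	if (rs_nr)
		return FROM_COMMAND_LINE;
	if (remote_fetch_nr || has_merge)
		return FROM_REMOTE_CONFIG;
	if (!prefetch)
		return FROM_REMOTE_HEAD;
	return NOTHING_TO_FETCH;
}
```

In this model, the failing case from the report is (0, 0, 0, 1): without
the "!prefetch" guard it would fall through to the HEAD lookup.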

Thanks for your diligent testing! It's helping a lot.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v3 1/3] maintenance: simplify prefetch logic
  2021-04-12 20:13       ` Tom Saeger
@ 2021-04-12 20:27         ` Derrick Stolee
  0 siblings, 0 replies; 72+ messages in thread
From: Derrick Stolee @ 2021-04-12 20:27 UTC (permalink / raw)
  To: Tom Saeger, Derrick Stolee via GitGitGadget
  Cc: git, gitster, sunshine, Josh Steadmon, Emily Shaffer,
	Derrick Stolee, Derrick Stolee

On 4/12/21 4:13 PM, Tom Saeger wrote:
> On Sat, Apr 10, 2021 at 02:03:43AM +0000, Derrick Stolee via GitGitGadget wrote:
>> -static int fetch_remote(const char *remote, struct maintenance_run_opts *opts)
>> +static int fetch_remote(struct remote *remote, void *cbdata)
>>  {
>> +	struct maintenance_run_opts *opts = cbdata;
>>  	struct child_process child = CHILD_PROCESS_INIT;
> 
> 
> I think this might be appropriate to add:
> 
> 
>        if (remote->skip_default_update)
>                return 0;
> 
> 
> maintenance prefetch is acting like `git fetch --all`
> So it should also skip remotes with configs `skipfetchall = true`
> Agree?

TIL about skipfetchall. I think that's a good idea to introduce.
It'll be a new patch, not added in this one.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [PATCH v4 0/4] Maintenance: adapt custom refspecs
  2021-04-10  2:03   ` [PATCH v3 0/3] Maintenance: adapt custom refspecs Derrick Stolee via GitGitGadget
                       ` (3 preceding siblings ...)
  2021-04-11  1:35     ` [PATCH v3 0/3] Maintenance: adapt custom refspecs Junio C Hamano
@ 2021-04-16 12:49     ` Derrick Stolee via GitGitGadget
  2021-04-16 12:49       ` [PATCH v4 1/4] maintenance: simplify prefetch logic Derrick Stolee via GitGitGadget
                         ` (3 more replies)
  4 siblings, 4 replies; 72+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2021-04-16 12:49 UTC (permalink / raw)
  To: git
  Cc: tom.saeger, gitster, sunshine, Derrick Stolee, Josh Steadmon,
	Emily Shaffer, Ramsay Jones, Derrick Stolee

Tom Saeger rightly pointed out [1] that the prefetch task ignores custom
refspecs. This can lead to downloading more data than requested, and it
doesn't even help the future foreground fetches that use that custom
refspec.

[1]
https://lore.kernel.org/git/20210401184914.qmr7jhjbhp2mt3h6@dhcp-10-154-148-175.vpn.oracle.com/

This series fixes this problem by carefully replacing the start of each
refspec's destination with "refs/prefetch/". If the destination already
starts with "refs/", then that is replaced. Otherwise "refs/prefetch/" is
just prepended.

This happens inside of git fetch when a --prefetch option is given. This
allows us to manipulate a struct refspec_item instead of a full refspec
string. It also simplifies our logic in testing the prefetch task.
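
The destination rewrite described above can be sketched in isolation as
follows. This is a standalone illustration using only the C standard
library; prefetch_dst() is a hypothetical helper name, and the patch itself
builds the string with Git's own strbuf API instead.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return a newly allocated copy of dst moved under "refs/prefetch/".
 * A leading "refs/" is replaced; any other destination is prefixed.
 * The caller must free() the result. Returns NULL on allocation failure.
 */
char *prefetch_dst(const char *dst)
{
	const char *prefix = "refs/prefetch/";
	const char *sub = dst;
	char *out;

	assert(dst);

	/* Skip an existing "refs/" so it is not duplicated. */
	if (!strncmp(dst, "refs/", 5))
		sub = dst + 5;

	out = malloc(strlen(prefix) + strlen(sub) + 1);
	if (!out)
		return NULL;
	strcpy(out, prefix);
	strcat(out, sub);
	return out;
}
```

For example, "refs/remotes/origin/*" becomes "refs/prefetch/remotes/origin/*",
while a destination such as "bogus/*" becomes "refs/prefetch/bogus/*".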


Updates in V4
=============

 * Two bugs were fixed. Thanks, Ramsay and Tom, for pointing out the issues.
   Tests are added that prevent regressions.
 * A new patch is added to respect remote.<name>.skipFetchAll. This is added
   at the end to take advantage of the simpler test design after --prefetch
   is added.


Updates in V3
=============

 * The fix is almost completely rewritten as an update to 'git fetch'. See
   the new PATCH 2 for this update.

 * There was some discussion of rewriting test_subcommand, but that can be
   delayed until a proper solution is found to complications around softer
   matches.


Updates in V2
=============

Thanks for the close eye on this series. I appreciate the recommendations,
and I believe I have responded to all of them:

 * Fixed typos.
 * Made refspec_item_format() re-entrant. Consumers must free the buffer.
 * Cleaned up style (quoting and tabbing).

Thanks, -Stolee

Derrick Stolee (4):
  maintenance: simplify prefetch logic
  fetch: add --prefetch option
  maintenance: use 'git fetch --prefetch'
  maintenance: respect remote.*.skipFetchAll

 Documentation/fetch-options.txt   |  5 +++
 Documentation/git-maintenance.txt |  6 ++--
 builtin/fetch.c                   | 59 ++++++++++++++++++++++++++++++-
 builtin/gc.c                      | 39 +++++++-------------
 t/t5582-fetch-negative-refspec.sh | 43 ++++++++++++++++++++++
 t/t7900-maintenance.sh            | 22 +++++++-----
 6 files changed, 134 insertions(+), 40 deletions(-)


base-commit: 89b43f80a514aee58b662ad606e6352e03eaeee4
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-924%2Fderrickstolee%2Fmaintenance%2Frefspec-v4
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-924/derrickstolee/maintenance/refspec-v4
Pull-Request: https://github.com/gitgitgadget/git/pull/924

Range-diff vs v3:

 1:  4c0e983ba56f = 1:  4c0e983ba56f maintenance: simplify prefetch logic
 2:  7f488eea6dbd ! 2:  73b4e8496746 fetch: add --prefetch option
     @@ Commit message
          Finally, we add the 'force' option to ensure that prefetch refs are
          replaced as necessary.
      
     +    There are some interesting cases that are worth testing.
     +
     +    An earlier version of this change dropped the "i--" from the loop that
     +    deletes a refspec item and shifts the remaining entries down. This
     +    allowed some refspecs to not be modified. The subtle part about the
     +    first --prefetch test is that the "refs/tags/*" refspec appears directly
     +    before the "refs/heads/bogus/*" refspec. Without that "i--", this
     +    ordering would remove the "refs/tags/*" refspec and leave the last one
     +    unmodified, placing the result in "refs/heads/*".
     +
     +    It is possible to have an empty refspec. This is typically the case for
     +    remotes other than the origin, where users want to fetch a specific tag
     +    or branch. To correctly test this case, we need to further remove the
     +    upstream remote for the local branch. Thus, we are testing a refspec
     +    that will be deleted, leaving nothing to fetch.
     +
     +    Helped-by: Tom Saeger <tom.saeger@oracle.com>
     +    Helped-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
          Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
      
       ## Documentation/fetch-options.txt ##
     @@ builtin/fetch.c: static void find_non_local_tags(const struct ref *refs,
      +				rs->raw[j - 1] = rs->raw[j];
      +			}
      +			rs->nr--;
     ++			i--;
      +			continue;
      +		}
      +
     @@ builtin/fetch.c: static struct ref *get_ref_map(struct remote *remote,
       	if (rs->nr) {
       		struct refspec *fetch_refspec;
       
     +@@ builtin/fetch.c: static struct ref *get_ref_map(struct remote *remote,
     + 			if (has_merge &&
     + 			    !strcmp(branch->remote_name, remote->name))
     + 				add_merge_config(&ref_map, remote_refs, branch, &tail);
     +-		} else {
     ++		} else if (!prefetch) {
     + 			ref_map = get_remote_ref(remote_refs, "HEAD");
     + 			if (!ref_map)
     + 				die(_("Couldn't find remote ref HEAD"));
      
       ## t/t5582-fetch-negative-refspec.sh ##
      @@ t/t5582-fetch-negative-refspec.sh: test_expect_success "push with matching +: and negative refspec" '
     @@ t/t5582-fetch-negative-refspec.sh: test_expect_success "push with matching +: an
       
      +test_expect_success '--prefetch correctly modifies refspecs' '
      +	git -C one config --unset-all remote.origin.fetch &&
     -+	git -C one config --add remote.origin.fetch "refs/tags/*:refs/tags/*" &&
      +	git -C one config --add remote.origin.fetch ^refs/heads/bogus/ignore &&
     ++	git -C one config --add remote.origin.fetch "refs/tags/*:refs/tags/*" &&
      +	git -C one config --add remote.origin.fetch "refs/heads/bogus/*:bogus/*" &&
      +
      +	git tag -a -m never never-fetch-tag HEAD &&
     @@ t/t5582-fetch-negative-refspec.sh: test_expect_success "push with matching +: an
      +	git -C one rev-parse refs/prefetch/bogus/fetched &&
      +	test_must_fail git -C one rev-parse refs/prefetch/bogus/ignore
      +'
     ++
     ++test_expect_success '--prefetch succeeds when refspec becomes empty' '
     ++	git checkout bogus/fetched &&
     ++	test_commit extra &&
     ++
     ++	git -C one config --unset-all remote.origin.fetch &&
     ++	git -C one config --unset branch.main.remote &&
     ++	git -C one config remote.origin.fetch "+refs/tags/extra" &&
     ++	git -C one config remote.origin.skipfetchall true &&
     ++	git -C one config remote.origin.tagopt "--no-tags" &&
     ++
     ++	git -C one fetch --prefetch
     ++'
      +
       test_done
 3:  ed055d772452 = 3:  565ed8a18929 maintenance: use 'git fetch --prefetch'
 -:  ------------ > 4:  92652fd9e6e1 maintenance: respect remote.*.skipFetchAll

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [PATCH v4 1/4] maintenance: simplify prefetch logic
  2021-04-16 12:49     ` [PATCH v4 0/4] " Derrick Stolee via GitGitGadget
@ 2021-04-16 12:49       ` Derrick Stolee via GitGitGadget
  2021-04-16 18:02         ` Tom Saeger
  2021-04-16 12:49       ` [PATCH v4 2/4] fetch: add --prefetch option Derrick Stolee via GitGitGadget
                         ` (2 subsequent siblings)
  3 siblings, 1 reply; 72+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2021-04-16 12:49 UTC (permalink / raw)
  To: git
  Cc: tom.saeger, gitster, sunshine, Derrick Stolee, Josh Steadmon,
	Emily Shaffer, Ramsay Jones, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

The previous logic filled a string list with the names of each remote,
but instead we could simply run the appropriate 'git fetch' data
directly in the remote iterator. Do this for reduced code size, but also
because it sets up an upcoming change to use the remote's refspec. This
data is accessible from the 'struct remote' data that is now accessible
in fetch_remote().

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 builtin/gc.c | 33 ++++++++-------------------------
 1 file changed, 8 insertions(+), 25 deletions(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index ef7226d7bca4..fa8128de9ae1 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -873,55 +873,38 @@ static int maintenance_task_commit_graph(struct maintenance_run_opts *opts)
 	return 0;
 }
 
-static int fetch_remote(const char *remote, struct maintenance_run_opts *opts)
+static int fetch_remote(struct remote *remote, void *cbdata)
 {
+	struct maintenance_run_opts *opts = cbdata;
 	struct child_process child = CHILD_PROCESS_INIT;
 
 	child.git_cmd = 1;
-	strvec_pushl(&child.args, "fetch", remote, "--prune", "--no-tags",
+	strvec_pushl(&child.args, "fetch", remote->name, "--prune", "--no-tags",
 		     "--no-write-fetch-head", "--recurse-submodules=no",
 		     "--refmap=", NULL);
 
 	if (opts->quiet)
 		strvec_push(&child.args, "--quiet");
 
-	strvec_pushf(&child.args, "+refs/heads/*:refs/prefetch/%s/*", remote);
+	strvec_pushf(&child.args, "+refs/heads/*:refs/prefetch/%s/*", remote->name);
 
 	return !!run_command(&child);
 }
 
-static int append_remote(struct remote *remote, void *cbdata)
-{
-	struct string_list *remotes = (struct string_list *)cbdata;
-
-	string_list_append(remotes, remote->name);
-	return 0;
-}
-
 static int maintenance_task_prefetch(struct maintenance_run_opts *opts)
 {
-	int result = 0;
-	struct string_list_item *item;
-	struct string_list remotes = STRING_LIST_INIT_DUP;
-
 	git_config_set_multivar_gently("log.excludedecoration",
 					"refs/prefetch/",
 					"refs/prefetch/",
 					CONFIG_FLAGS_FIXED_VALUE |
 					CONFIG_FLAGS_MULTI_REPLACE);
 
-	if (for_each_remote(append_remote, &remotes)) {
-		error(_("failed to fill remotes"));
-		result = 1;
-		goto cleanup;
+	if (for_each_remote(fetch_remote, opts)) {
+		error(_("failed to prefetch remotes"));
+		return 1;
 	}
 
-	for_each_string_list_item(item, &remotes)
-		result |= fetch_remote(item->string, opts);
-
-cleanup:
-	string_list_clear(&remotes, 0);
-	return result;
+	return 0;
 }
 
 static int maintenance_task_gc(struct maintenance_run_opts *opts)
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v4 2/4] fetch: add --prefetch option
  2021-04-16 12:49     ` [PATCH v4 0/4] " Derrick Stolee via GitGitGadget
  2021-04-16 12:49       ` [PATCH v4 1/4] maintenance: simplify prefetch logic Derrick Stolee via GitGitGadget
@ 2021-04-16 12:49       ` Derrick Stolee via GitGitGadget
  2021-04-16 17:52         ` Tom Saeger
  2021-04-16 12:49       ` [PATCH v4 3/4] maintenance: use 'git fetch --prefetch' Derrick Stolee via GitGitGadget
  2021-04-16 12:49       ` [PATCH v4 4/4] maintenance: respect remote.*.skipFetchAll Derrick Stolee via GitGitGadget
  3 siblings, 1 reply; 72+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2021-04-16 12:49 UTC (permalink / raw)
  To: git
  Cc: tom.saeger, gitster, sunshine, Derrick Stolee, Josh Steadmon,
	Emily Shaffer, Ramsay Jones, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

The --prefetch option will be used by the 'prefetch' maintenance task
instead of sending refspecs explicitly across the command-line. The
intention is to modify the refspec to place all results in
refs/prefetch/ instead of anywhere else.

Create helper method filter_prefetch_refspec() to modify a given refspec
to fit the rules expected of the prefetch task:

 * Negative refspecs are preserved.
 * Refspecs without a destination are removed.
 * Refspecs whose source starts with "refs/tags/" are removed.
 * Other refspecs are placed within "refs/prefetch/".

Finally, we add the 'force' option to ensure that prefetch refs are
replaced as necessary.

There are some interesting cases that are worth testing.

An earlier version of this change dropped the "i--" from the loop that
deletes a refspec item and shifts the remaining entries down. This
allowed some refspecs to not be modified. The subtle part about the
first --prefetch test is that the "refs/tags/*" refspec appears directly
before the "refs/heads/bogus/*" refspec. Without that "i--", this
ordering would remove the "refs/tags/*" refspec and leave the last one
unmodified, placing the result in "refs/heads/*".

It is possible to have an empty refspec. This is typically the case for
remotes other than the origin, where users want to fetch a specific tag
or branch. To correctly test this case, we need to further remove the
upstream remote for the local branch. Thus, we are testing a refspec
that will be deleted, leaving nothing to fetch.

Helped-by: Tom Saeger <tom.saeger@oracle.com>
Helped-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 Documentation/fetch-options.txt   |  5 +++
 builtin/fetch.c                   | 59 ++++++++++++++++++++++++++++++-
 t/t5582-fetch-negative-refspec.sh | 43 ++++++++++++++++++++++
 3 files changed, 106 insertions(+), 1 deletion(-)

diff --git a/Documentation/fetch-options.txt b/Documentation/fetch-options.txt
index 07783deee309..9e7b4e189ce0 100644
--- a/Documentation/fetch-options.txt
+++ b/Documentation/fetch-options.txt
@@ -110,6 +110,11 @@ ifndef::git-pull[]
 	setting `fetch.writeCommitGraph`.
 endif::git-pull[]
 
+--prefetch::
+	Modify the configured refspec to place all refs into the
+	`refs/prefetch/` namespace. See the `prefetch` task in
+	linkgit:git-maintenance[1].
+
 -p::
 --prune::
 	Before fetching, remove any remote-tracking references that no
diff --git a/builtin/fetch.c b/builtin/fetch.c
index 0b90de87c7a2..97c4fe6e6d66 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -48,6 +48,7 @@ enum {
 static int fetch_prune_config = -1; /* unspecified */
 static int fetch_show_forced_updates = 1;
 static uint64_t forced_updates_ms = 0;
+static int prefetch = 0;
 static int prune = -1; /* unspecified */
 #define PRUNE_BY_DEFAULT 0 /* do we prune by default? */
 
@@ -158,6 +159,8 @@ static struct option builtin_fetch_options[] = {
 		    N_("do not fetch all tags (--no-tags)"), TAGS_UNSET),
 	OPT_INTEGER('j', "jobs", &max_jobs,
 		    N_("number of submodules fetched in parallel")),
+	OPT_BOOL(0, "prefetch", &prefetch,
+		 N_("modify the refspec to place all refs within refs/prefetch/")),
 	OPT_BOOL('p', "prune", &prune,
 		 N_("prune remote-tracking branches no longer on remote")),
 	OPT_BOOL('P', "prune-tags", &prune_tags,
@@ -436,6 +439,56 @@ static void find_non_local_tags(const struct ref *refs,
 	oidset_clear(&fetch_oids);
 }
 
+static void filter_prefetch_refspec(struct refspec *rs)
+{
+	int i;
+
+	if (!prefetch)
+		return;
+
+	for (i = 0; i < rs->nr; i++) {
+		struct strbuf new_dst = STRBUF_INIT;
+		char *old_dst;
+		const char *sub = NULL;
+
+		if (rs->items[i].negative)
+			continue;
+		if (!rs->items[i].dst ||
+		    (rs->items[i].src &&
+		     !strncmp(rs->items[i].src, "refs/tags/", 10))) {
+			int j;
+
+			free(rs->items[i].src);
+			free(rs->items[i].dst);
+
+			for (j = i + 1; j < rs->nr; j++) {
+				rs->items[j - 1] = rs->items[j];
+				rs->raw[j - 1] = rs->raw[j];
+			}
+			rs->nr--;
+			i--;
+			continue;
+		}
+
+		old_dst = rs->items[i].dst;
+		strbuf_addstr(&new_dst, "refs/prefetch/");
+
+		/*
+		 * If old_dst starts with "refs/", then place
+		 * sub after that prefix. Otherwise, start at
+		 * the beginning of the string.
+		 */
+		if (!skip_prefix(old_dst, "refs/", &sub))
+			sub = old_dst;
+		strbuf_addstr(&new_dst, sub);
+
+		rs->items[i].dst = strbuf_detach(&new_dst, NULL);
+		rs->items[i].force = 1;
+
+		free(old_dst);
+	}
+}
+
 static struct ref *get_ref_map(struct remote *remote,
 			       const struct ref *remote_refs,
 			       struct refspec *rs,
@@ -452,6 +505,10 @@ static struct ref *get_ref_map(struct remote *remote,
 	struct hashmap existing_refs;
 	int existing_refs_populated = 0;
 
+	filter_prefetch_refspec(rs);
+	if (remote)
+		filter_prefetch_refspec(&remote->fetch);
+
 	if (rs->nr) {
 		struct refspec *fetch_refspec;
 
@@ -520,7 +577,7 @@ static struct ref *get_ref_map(struct remote *remote,
 			if (has_merge &&
 			    !strcmp(branch->remote_name, remote->name))
 				add_merge_config(&ref_map, remote_refs, branch, &tail);
-		} else {
+		} else if (!prefetch) {
 			ref_map = get_remote_ref(remote_refs, "HEAD");
 			if (!ref_map)
 				die(_("Couldn't find remote ref HEAD"));
diff --git a/t/t5582-fetch-negative-refspec.sh b/t/t5582-fetch-negative-refspec.sh
index f34509727702..e5d2e79ad382 100755
--- a/t/t5582-fetch-negative-refspec.sh
+++ b/t/t5582-fetch-negative-refspec.sh
@@ -240,4 +240,47 @@ test_expect_success "push with matching +: and negative refspec" '
 	git -C two push -v one
 '
 
+test_expect_success '--prefetch correctly modifies refspecs' '
+	git -C one config --unset-all remote.origin.fetch &&
+	git -C one config --add remote.origin.fetch ^refs/heads/bogus/ignore &&
+	git -C one config --add remote.origin.fetch "refs/tags/*:refs/tags/*" &&
+	git -C one config --add remote.origin.fetch "refs/heads/bogus/*:bogus/*" &&
+
+	git tag -a -m never never-fetch-tag HEAD &&
+
+	git branch bogus/fetched HEAD~1 &&
+	git branch bogus/ignore HEAD &&
+
+	git -C one fetch --prefetch --no-tags &&
+	test_must_fail git -C one rev-parse never-fetch-tag &&
+	git -C one rev-parse refs/prefetch/bogus/fetched &&
+	test_must_fail git -C one rev-parse refs/prefetch/bogus/ignore &&
+
+	# correctly handle when refspec set becomes empty
+	# after removing the refs/tags/* refspec.
+	git -C one config --unset-all remote.origin.fetch &&
+	git -C one config --add remote.origin.fetch "refs/tags/*:refs/tags/*" &&
+
+	git -C one fetch --prefetch --no-tags &&
+	test_must_fail git -C one rev-parse never-fetch-tag &&
+
+	# The refspec for refs that are not fully qualified
+	# are filtered multiple times.
+	git -C one rev-parse refs/prefetch/bogus/fetched &&
+	test_must_fail git -C one rev-parse refs/prefetch/bogus/ignore
+'
+
+test_expect_success '--prefetch succeeds when refspec becomes empty' '
+	git checkout bogus/fetched &&
+	test_commit extra &&
+
+	git -C one config --unset-all remote.origin.fetch &&
+	git -C one config --unset branch.main.remote &&
+	git -C one config remote.origin.fetch "+refs/tags/extra" &&
+	git -C one config remote.origin.skipfetchall true &&
+	git -C one config remote.origin.tagopt "--no-tags" &&
+
+	git -C one fetch --prefetch
+'
+
 test_done
-- 
gitgitgadget



* [PATCH v4 3/4] maintenance: use 'git fetch --prefetch'
  2021-04-16 12:49     ` [PATCH v4 0/4] " Derrick Stolee via GitGitGadget
  2021-04-16 12:49       ` [PATCH v4 1/4] maintenance: simplify prefetch logic Derrick Stolee via GitGitGadget
  2021-04-16 12:49       ` [PATCH v4 2/4] fetch: add --prefetch option Derrick Stolee via GitGitGadget
@ 2021-04-16 12:49       ` Derrick Stolee via GitGitGadget
  2021-04-16 12:49       ` [PATCH v4 4/4] maintenance: respect remote.*.skipFetchAll Derrick Stolee via GitGitGadget
  3 siblings, 0 replies; 72+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2021-04-16 12:49 UTC (permalink / raw)
  To: git
  Cc: tom.saeger, gitster, sunshine, Derrick Stolee, Josh Steadmon,
	Emily Shaffer, Ramsay Jones, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

The 'prefetch' maintenance task previously forced the following refspec
for each remote:

	+refs/heads/*:refs/prefetch/<remote>/*

If a user has specified a stricter refspec for the remote, then this
prefetch task downloads more objects than necessary.

The previous change introduced the '--prefetch' option to 'git fetch'
which manipulates the remote's refspec to place all resulting refs into
refs/prefetch/, with further partitioning based on the destinations of
those refspecs.
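
As a rough illustration of that destination rewrite, here is a minimal
sketch using only the C standard library. prefetch_dst() is a hypothetical
name, not a Git function; the real logic lives in
filter_prefetch_refspec() and uses strbuf and skip_prefix().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: map a refspec destination into refs/prefetch/.
 * A leading "refs/" is replaced; any other destination is prepended to.
 * The caller frees the returned string.
 */
static char *prefetch_dst(const char *old_dst)
{
	static const char prefix[] = "refs/";
	static const char target[] = "refs/prefetch/";
	const char *sub = old_dst;
	char *new_dst;

	assert(old_dst);

	/* Skip "refs/" when present; otherwise keep the whole string. */
	if (!strncmp(old_dst, prefix, strlen(prefix)))
		sub = old_dst + strlen(prefix);

	new_dst = malloc(strlen(target) + strlen(sub) + 1);
	if (!new_dst)
		return NULL;

	strcpy(new_dst, target);
	strcat(new_dst, sub);
	return new_dst;
}
```

With a default destination such as "refs/remotes/<remote>/*", this yields
"refs/prefetch/remotes/<remote>/*", which is why the tests in this patch
now look under prefetch/remotes/remote1/ rather than prefetch/remote1/.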

Update the documentation to be more generic about the destination refs.
Do not mention custom refspecs explicitly, as that does not need to be
highlighted in this documentation. The important part of placing refs in
refs/prefetch/ remains.

Reported-by: Tom Saeger <tom.saeger@oracle.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 Documentation/git-maintenance.txt |  6 ++----
 builtin/gc.c                      |  7 +++----
 t/t7900-maintenance.sh            | 14 +++++++-------
 3 files changed, 12 insertions(+), 15 deletions(-)

diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt
index 80ddd33ceba0..1e738ad39832 100644
--- a/Documentation/git-maintenance.txt
+++ b/Documentation/git-maintenance.txt
@@ -92,10 +92,8 @@ commit-graph::
 prefetch::
 	The `prefetch` task updates the object directory with the latest
 	objects from all registered remotes. For each remote, a `git fetch`
-	command is run. The refmap is custom to avoid updating local or remote
-	branches (those in `refs/heads` or `refs/remotes`). Instead, the
-	remote refs are stored in `refs/prefetch/<remote>/`. Also, tags are
-	not updated.
+	command is run. The configured refspec is modified to place all
+	requested refs within `refs/prefetch/`. Also, tags are not updated.
 +
 This is done to avoid disrupting the remote-tracking branches. The end users
 expect these refs to stay unmoved unless they initiate a fetch.  With prefetch
diff --git a/builtin/gc.c b/builtin/gc.c
index fa8128de9ae1..9d35f7da50d8 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -879,15 +879,14 @@ static int fetch_remote(struct remote *remote, void *cbdata)
 	struct child_process child = CHILD_PROCESS_INIT;
 
 	child.git_cmd = 1;
-	strvec_pushl(&child.args, "fetch", remote->name, "--prune", "--no-tags",
+	strvec_pushl(&child.args, "fetch", remote->name,
+		     "--prefetch", "--prune", "--no-tags",
 		     "--no-write-fetch-head", "--recurse-submodules=no",
-		     "--refmap=", NULL);
+		     NULL);
 
 	if (opts->quiet)
 		strvec_push(&child.args, "--quiet");
 
-	strvec_pushf(&child.args, "+refs/heads/*:refs/prefetch/%s/*", remote->name);
-
 	return !!run_command(&child);
 }
 
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index 2412d8c5c006..eadb800c08cc 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -141,15 +141,15 @@ test_expect_success 'prefetch multiple remotes' '
 	test_commit -C clone1 one &&
 	test_commit -C clone2 two &&
 	GIT_TRACE2_EVENT="$(pwd)/run-prefetch.txt" git maintenance run --task=prefetch 2>/dev/null &&
-	fetchargs="--prune --no-tags --no-write-fetch-head --recurse-submodules=no --refmap= --quiet" &&
-	test_subcommand git fetch remote1 $fetchargs +refs/heads/\\*:refs/prefetch/remote1/\\* <run-prefetch.txt &&
-	test_subcommand git fetch remote2 $fetchargs +refs/heads/\\*:refs/prefetch/remote2/\\* <run-prefetch.txt &&
+	fetchargs="--prefetch --prune --no-tags --no-write-fetch-head --recurse-submodules=no --quiet" &&
+	test_subcommand git fetch remote1 $fetchargs <run-prefetch.txt &&
+	test_subcommand git fetch remote2 $fetchargs <run-prefetch.txt &&
 	test_path_is_missing .git/refs/remotes &&
-	git log prefetch/remote1/one &&
-	git log prefetch/remote2/two &&
+	git log prefetch/remotes/remote1/one &&
+	git log prefetch/remotes/remote2/two &&
 	git fetch --all &&
-	test_cmp_rev refs/remotes/remote1/one refs/prefetch/remote1/one &&
-	test_cmp_rev refs/remotes/remote2/two refs/prefetch/remote2/two &&
+	test_cmp_rev refs/remotes/remote1/one refs/prefetch/remotes/remote1/one &&
+	test_cmp_rev refs/remotes/remote2/two refs/prefetch/remotes/remote2/two &&
 
 	test_cmp_config refs/prefetch/ log.excludedecoration &&
 	git log --oneline --decorate --all >log &&
-- 
gitgitgadget



* [PATCH v4 4/4] maintenance: respect remote.*.skipFetchAll
  2021-04-16 12:49     ` [PATCH v4 0/4] " Derrick Stolee via GitGitGadget
                         ` (2 preceding siblings ...)
  2021-04-16 12:49       ` [PATCH v4 3/4] maintenance: use 'git fetch --prefetch' Derrick Stolee via GitGitGadget
@ 2021-04-16 12:49       ` Derrick Stolee via GitGitGadget
  2021-04-16 13:54         ` Ævar Arnfjörð Bjarmason
  2021-04-16 18:31         ` Tom Saeger
  3 siblings, 2 replies; 72+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2021-04-16 12:49 UTC (permalink / raw)
  To: git
  Cc: tom.saeger, gitster, sunshine, Derrick Stolee, Josh Steadmon,
	Emily Shaffer, Ramsay Jones, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

If a remote has the skipFetchAll setting enabled, then that remote is
not intended for frequent fetching. It makes sense to not fetch that
data during the 'prefetch' maintenance task. Skip that remote in the
iteration without error. The skip_default_update member is initialized
in remote.c:handle_config() as part of initializing the 'struct remote'.
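
To show why returning 0 skips the remote "without error", here is a small
standalone sketch of a callback-driven iteration. Everything in it is a
hypothetical stand-in: 'struct demo_remote', for_each_demo_remote() and
count_prefetched() only loosely mirror 'struct remote', for_each_remote()
and fetch_remote(), and the assumption is that the iterator stops at the
first nonzero return value.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for 'struct remote'. */
struct demo_remote {
	const char *name;
	int skip_default_update;
};

typedef int (*demo_remote_fn)(struct demo_remote *remote, void *cbdata);

/* Call fn for each remote; stop early at the first nonzero result. */
static int for_each_demo_remote(struct demo_remote *remotes, size_t nr,
				demo_remote_fn fn, void *cbdata)
{
	size_t i;
	int result = 0;

	for (i = 0; i < nr && !result; i++)
		result = fn(&remotes[i], cbdata);
	return result;
}

/* Counts the remotes that would be prefetched instead of running a fetch. */
static int count_prefetched(struct demo_remote *remote, void *cbdata)
{
	int *fetched = cbdata;

	assert(remote && fetched);

	/* Returning 0 means "no error", so the iteration simply moves on. */
	if (remote->skip_default_update)
		return 0;

	(*fetched)++;
	return 0;
}
```

With two remotes where only the first has skip_default_update set, the
iteration returns 0 and the counter ends at 1, matching the expectation
in the test that remote1 is skipped while remote2 is still fetched.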

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 builtin/gc.c           | 3 +++
 t/t7900-maintenance.sh | 8 +++++++-
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index 9d35f7da50d8..98a803196b88 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -878,6 +878,9 @@ static int fetch_remote(struct remote *remote, void *cbdata)
 	struct maintenance_run_opts *opts = cbdata;
 	struct child_process child = CHILD_PROCESS_INIT;
 
+	if (remote->skip_default_update)
+		return 0;
+
 	child.git_cmd = 1;
 	strvec_pushl(&child.args, "fetch", remote->name,
 		     "--prefetch", "--prune", "--no-tags",
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index eadb800c08cc..b93ae014ee58 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -153,7 +153,13 @@ test_expect_success 'prefetch multiple remotes' '
 
 	test_cmp_config refs/prefetch/ log.excludedecoration &&
 	git log --oneline --decorate --all >log &&
-	! grep "prefetch" log
+	! grep "prefetch" log &&
+
+	test_when_finished git config --unset remote.remote1.skipFetchAll &&
+	git config remote.remote1.skipFetchAll true &&
+	GIT_TRACE2_EVENT="$(pwd)/skip-remote1.txt" git maintenance run --task=prefetch 2>/dev/null &&
+	test_subcommand ! git fetch remote1 $fetchargs <skip-remote1.txt &&
+	test_subcommand git fetch remote2 $fetchargs <skip-remote1.txt
 '
 
 test_expect_success 'prefetch and existing log.excludeDecoration values' '
-- 
gitgitgadget


* Re: [PATCH v4 4/4] maintenance: respect remote.*.skipFetchAll
  2021-04-16 12:49       ` [PATCH v4 4/4] maintenance: respect remote.*.skipFetchAll Derrick Stolee via GitGitGadget
@ 2021-04-16 13:54         ` Ævar Arnfjörð Bjarmason
  2021-04-16 14:33           ` Tom Saeger
  2021-04-16 18:31         ` Tom Saeger
  1 sibling, 1 reply; 72+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-04-16 13:54 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: git, tom.saeger, gitster, sunshine, Derrick Stolee,
	Josh Steadmon, Emily Shaffer, Ramsay Jones, Derrick Stolee,
	Derrick Stolee


On Fri, Apr 16 2021, Derrick Stolee via GitGitGadget wrote:

> From: Derrick Stolee <dstolee@microsoft.com>
>
> If a remote has the skipFetchAll setting enabled, then that remote is
> not intended for frequent fetching. It makes sense to not fetch that
> data during the 'prefetch' maintenance task. Skip that remote in the
> iteration without error. The skip_default_update member is initialized
> in remote.c:handle_config() as part of initializing the 'struct remote'.
>
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
>  builtin/gc.c           | 3 +++
>  t/t7900-maintenance.sh | 8 +++++++-
>  2 files changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/builtin/gc.c b/builtin/gc.c
> index 9d35f7da50d8..98a803196b88 100644
> --- a/builtin/gc.c
> +++ b/builtin/gc.c
> @@ -878,6 +878,9 @@ static int fetch_remote(struct remote *remote, void *cbdata)
>  	struct maintenance_run_opts *opts = cbdata;
>  	struct child_process child = CHILD_PROCESS_INIT;
>  
> +	if (remote->skip_default_update)
> +		return 0;
> +
>  	child.git_cmd = 1;
>  	strvec_pushl(&child.args, "fetch", remote->name,
>  		     "--prefetch", "--prune", "--no-tags",
> diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
> index eadb800c08cc..b93ae014ee58 100755
> --- a/t/t7900-maintenance.sh
> +++ b/t/t7900-maintenance.sh
> @@ -153,7 +153,13 @@ test_expect_success 'prefetch multiple remotes' '
>  
>  	test_cmp_config refs/prefetch/ log.excludedecoration &&
>  	git log --oneline --decorate --all >log &&
> -	! grep "prefetch" log
> +	! grep "prefetch" log &&
> +
> +	test_when_finished git config --unset remote.remote1.skipFetchAll &&
> +	git config remote.remote1.skipFetchAll true &&
> +	GIT_TRACE2_EVENT="$(pwd)/skip-remote1.txt" git maintenance run --task=prefetch 2>/dev/null &&
> +	test_subcommand ! git fetch remote1 $fetchargs <skip-remote1.txt &&
> +	test_subcommand git fetch remote2 $fetchargs <skip-remote1.txt
>  '
>  
>  test_expect_success 'prefetch and existing log.excludeDecoration values' '

Without having read the code I'd have very much expected a
"remote.*.skipFetchAll" to impact:

    git fetch --all

Or:

    git remote update --all # --all does not exist yet

As e.g. remote.<name>.skipDefaultUpdate would do (i.e. impact "git
remote update" ...).

I suspect naming it like this started as a hack around the lack of
4-level .ini config keys, i.e. so we could do:

    maintenance.remote.<name>.skipFetchAll = true

But I wonder if we couldn't give this a less confusing name still, even
the pseudo-command form of:

    maintenanceRemote.<name>.skipFetchAll = true

Or something...



* Re: [PATCH v4 4/4] maintenance: respect remote.*.skipFetchAll
  2021-04-16 13:54         ` Ævar Arnfjörð Bjarmason
@ 2021-04-16 14:33           ` Tom Saeger
  0 siblings, 0 replies; 72+ messages in thread
From: Tom Saeger @ 2021-04-16 14:33 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Derrick Stolee via GitGitGadget, git, gitster, sunshine,
	Derrick Stolee, Josh Steadmon, Emily Shaffer, Ramsay Jones,
	Derrick Stolee, Derrick Stolee

On Fri, Apr 16, 2021 at 03:54:13PM +0200, Ævar Arnfjörð Bjarmason wrote:
> 
> On Fri, Apr 16 2021, Derrick Stolee via GitGitGadget wrote:
> 
> > From: Derrick Stolee <dstolee@microsoft.com>
> >
> > If a remote has the skipFetchAll setting enabled, then that remote is
> > not intended for frequent fetching. It makes sense to not fetch that
> > data during the 'prefetch' maintenance task. Skip that remote in the
> > iteration without error. The skip_default_update member is initialized
> > in remote.c:handle_config() as part of initializing the 'struct remote'.
> >
> > Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> > ---
> >  builtin/gc.c           | 3 +++
> >  t/t7900-maintenance.sh | 8 +++++++-
> >  2 files changed, 10 insertions(+), 1 deletion(-)
> >
> > diff --git a/builtin/gc.c b/builtin/gc.c
> > index 9d35f7da50d8..98a803196b88 100644
> > --- a/builtin/gc.c
> > +++ b/builtin/gc.c
> > @@ -878,6 +878,9 @@ static int fetch_remote(struct remote *remote, void *cbdata)
> >  	struct maintenance_run_opts *opts = cbdata;
> >  	struct child_process child = CHILD_PROCESS_INIT;
> >  
> > +	if (remote->skip_default_update)
> > +		return 0;
> > +
> >  	child.git_cmd = 1;
> >  	strvec_pushl(&child.args, "fetch", remote->name,
> >  		     "--prefetch", "--prune", "--no-tags",
> > diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
> > index eadb800c08cc..b93ae014ee58 100755
> > --- a/t/t7900-maintenance.sh
> > +++ b/t/t7900-maintenance.sh
> > @@ -153,7 +153,13 @@ test_expect_success 'prefetch multiple remotes' '
> >  
> >  	test_cmp_config refs/prefetch/ log.excludedecoration &&
> >  	git log --oneline --decorate --all >log &&
> > -	! grep "prefetch" log
> > +	! grep "prefetch" log &&
> > +
> > +	test_when_finished git config --unset remote.remote1.skipFetchAll &&
> > +	git config remote.remote1.skipFetchAll true &&
> > +	GIT_TRACE2_EVENT="$(pwd)/skip-remote1.txt" git maintenance run --task=prefetch 2>/dev/null &&
> > +	test_subcommand ! git fetch remote1 $fetchargs <skip-remote1.txt &&
> > +	test_subcommand git fetch remote2 $fetchargs <skip-remote1.txt
> >  '
> >  
> >  test_expect_success 'prefetch and existing log.excludeDecoration values' '
> 
> Without having read the code I'd have very much expected a
> "remote.*.skipFetchAll" to impact:
> 
>     git fetch --all

'skipFetchAll' does indeed impact 'git fetch --all'.

But this patch doesn't add "skipFetchAll". It just honors that config, if
set, during 'git maintenance run --task=prefetch'.

See v3 discussion: https://lore.kernel.org/git/2f4fa2b5-0d8b-b368-ab4d-411740595a4f@gmail.com/

> 
> Or:
> 
>     git remote update --all # --all does not exist yet
> 
> As e.g. remote.<name>.skipDefaultUpdate would do (i.e. impact "git
> remote update" ...).
> 
> I suspect naming it like this started as a hack around the lack of
> 4-level .ini config keys, i.e. so we could do:
> 
>     maintenance.remote.<name>.skipFetchAll = true
> 
> But I wonder if we couldn't give this a less confusing name still, even
> the pseudo-command form of:
> 
>     maintenanceRemote.<name>.skipFetchAll = true
> 
> Or something...
> 

Whether additional maintenance-specific configs are desired might be
something to consider.  I've thought about this for a few of my repos
that have remotes requiring a VPN connection.  Perhaps I want to skip
those during 'prefetch'?  Or maybe instead define a 'remotes.prefetch'
group and only prefetch the remotes listed?

That all said, I wouldn't hold up this patch series for it.

--Tom


* Re: [PATCH v4 2/4] fetch: add --prefetch option
  2021-04-16 12:49       ` [PATCH v4 2/4] fetch: add --prefetch option Derrick Stolee via GitGitGadget
@ 2021-04-16 17:52         ` Tom Saeger
  2021-04-16 18:26           ` Tom Saeger
  0 siblings, 1 reply; 72+ messages in thread
From: Tom Saeger @ 2021-04-16 17:52 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: git, gitster, sunshine, Derrick Stolee, Josh Steadmon,
	Emily Shaffer, Ramsay Jones, Derrick Stolee, Derrick Stolee

On Fri, Apr 16, 2021 at 12:49:57PM +0000, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <dstolee@microsoft.com>
> 
> The --prefetch option will be used by the 'prefetch' maintenance task
> instead of sending refspecs explicitly across the command-line. The
> intention is to modify the refspec to place all results in
> refs/prefetch/ instead of anywhere else.
> 
> Create helper method filter_prefetch_refspec() to modify a given refspec
> to fit the rules expected of the prefetch task:
> 
>  * Negative refspecs are preserved.
>  * Refspecs without a destination are removed.
>  * Refspecs whose source starts with "refs/tags/" are removed.
>  * Other refspecs are placed within "refs/prefetch/".
> 
> Finally, we add the 'force' option to ensure that prefetch refs are
> replaced as necessary.
> 
> There are some interesting cases that are worth testing.
> 
> An earlier version of this change dropped the "i--" from the loop that
> deletes a refspec item and shifts the remaining entries down. This
> allowed some refspecs to not be modified. The subtle part about the
> first --prefetch test is that the "refs/tags/*" refspec appears directly
> before the "refs/heads/bogus/*" refspec. Without that "i--", this
> ordering would remove the "refs/tags/*" refspec and leave the last one
> unmodified, placing the result in "refs/heads/*".
> 
> It is possible to have an empty refspec. This is typically the case for
> remotes other than the origin, where users want to fetch a specific tag
> or branch. To correctly test this case, we need to further remove the
> upstream remote for the local branch. Thus, we are testing a refspec
> that will be deleted, leaving nothing to fetch.
> 
> Helped-by: Tom Saeger <tom.saeger@oracle.com>
> Helped-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
>  Documentation/fetch-options.txt   |  5 +++
>  builtin/fetch.c                   | 59 ++++++++++++++++++++++++++++++-
>  t/t5582-fetch-negative-refspec.sh | 43 ++++++++++++++++++++++
>  3 files changed, 106 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/fetch-options.txt b/Documentation/fetch-options.txt
> index 07783deee309..9e7b4e189ce0 100644
> --- a/Documentation/fetch-options.txt
> +++ b/Documentation/fetch-options.txt
> @@ -110,6 +110,11 @@ ifndef::git-pull[]
>  	setting `fetch.writeCommitGraph`.
>  endif::git-pull[]
>  
> +--prefetch::
> +	Modify the configured refspec to place all refs into the
> +	`refs/prefetch/` namespace. See the `prefetch` task in
> +	linkgit:git-maintenance[1].
> +
>  -p::
>  --prune::
>  	Before fetching, remove any remote-tracking references that no
> diff --git a/builtin/fetch.c b/builtin/fetch.c
> index 0b90de87c7a2..97c4fe6e6d66 100644
> --- a/builtin/fetch.c
> +++ b/builtin/fetch.c
> @@ -48,6 +48,7 @@ enum {
>  static int fetch_prune_config = -1; /* unspecified */
>  static int fetch_show_forced_updates = 1;
>  static uint64_t forced_updates_ms = 0;
> +static int prefetch = 0;
>  static int prune = -1; /* unspecified */
>  #define PRUNE_BY_DEFAULT 0 /* do we prune by default? */
>  
> @@ -158,6 +159,8 @@ static struct option builtin_fetch_options[] = {
>  		    N_("do not fetch all tags (--no-tags)"), TAGS_UNSET),
>  	OPT_INTEGER('j', "jobs", &max_jobs,
>  		    N_("number of submodules fetched in parallel")),
> +	OPT_BOOL(0, "prefetch", &prefetch,
> +		 N_("modify the refspec to place all refs within refs/prefetch/")),
>  	OPT_BOOL('p', "prune", &prune,
>  		 N_("prune remote-tracking branches no longer on remote")),
>  	OPT_BOOL('P', "prune-tags", &prune_tags,
> @@ -436,6 +439,56 @@ static void find_non_local_tags(const struct ref *refs,
>  	oidset_clear(&fetch_oids);
>  }
>  
> +static void filter_prefetch_refspec(struct refspec *rs)
> +{
> +	int i;
> +
> +	if (!prefetch)
> +		return;
> +
> +	for (i = 0; i < rs->nr; i++) {
> +		struct strbuf new_dst = STRBUF_INIT;
> +		char *old_dst;
> +		const char *sub = NULL;
> +
> +		if (rs->items[i].negative)
> +			continue;
> +		if (!rs->items[i].dst ||
> +		    (rs->items[i].src &&
> +		     !strncmp(rs->items[i].src, "refs/tags/", 10))) {
> +			int j;
> +
> +			free(rs->items[i].src);
> +			free(rs->items[i].dst);
> +
> +			for (j = i + 1; j < rs->nr; j++) {
> +				rs->items[j - 1] = rs->items[j];
> +				rs->raw[j - 1] = rs->raw[j];
> +			}
> +			rs->nr--;
> +			i--;
> +			continue;
> +		}
> +
> +		old_dst = rs->items[i].dst;
> +		strbuf_addstr(&new_dst, "refs/prefetch/");
> +
> +		/*
> +		 * If old_dst starts with "refs/", then place
> +		 * sub after that prefix. Otherwise, start at
> +		 * the beginning of the string.
> +		 */
> +		if (!skip_prefix(old_dst, "refs/", &sub))
> +			sub = old_dst;
> +		strbuf_addstr(&new_dst, sub);
> +
> +		rs->items[i].dst = strbuf_detach(&new_dst, NULL);
> +		rs->items[i].force = 1;
> +
> +		free(old_dst);
> +	}
> +}
> +
>  static struct ref *get_ref_map(struct remote *remote,
>  			       const struct ref *remote_refs,
>  			       struct refspec *rs,
> @@ -452,6 +505,10 @@ static struct ref *get_ref_map(struct remote *remote,
>  	struct hashmap existing_refs;
>  	int existing_refs_populated = 0;
>  
> +	filter_prefetch_refspec(rs);
> +	if (remote)
> +		filter_prefetch_refspec(&remote->fetch);
> +
>  	if (rs->nr) {
>  		struct refspec *fetch_refspec;
>  
> @@ -520,7 +577,7 @@ static struct ref *get_ref_map(struct remote *remote,
>  			if (has_merge &&
>  			    !strcmp(branch->remote_name, remote->name))
>  				add_merge_config(&ref_map, remote_refs, branch, &tail);
> -		} else {
> +		} else if (!prefetch) {

That works for me.

>  			ref_map = get_remote_ref(remote_refs, "HEAD");
>  			if (!ref_map)
>  				die(_("Couldn't find remote ref HEAD"));
> diff --git a/t/t5582-fetch-negative-refspec.sh b/t/t5582-fetch-negative-refspec.sh
> index f34509727702..e5d2e79ad382 100755
> --- a/t/t5582-fetch-negative-refspec.sh
> +++ b/t/t5582-fetch-negative-refspec.sh
> @@ -240,4 +240,47 @@ test_expect_success "push with matching +: and negative refspec" '
>  	git -C two push -v one
>  '
>  
> +test_expect_success '--prefetch correctly modifies refspecs' '
> +	git -C one config --unset-all remote.origin.fetch &&
> +	git -C one config --add remote.origin.fetch ^refs/heads/bogus/ignore &&
> +	git -C one config --add remote.origin.fetch "refs/tags/*:refs/tags/*" &&
> +	git -C one config --add remote.origin.fetch "refs/heads/bogus/*:bogus/*" &&
> +
> +	git tag -a -m never never-fetch-tag HEAD &&
> +
> +	git branch bogus/fetched HEAD~1 &&
> +	git branch bogus/ignore HEAD &&
> +
> +	git -C one fetch --prefetch --no-tags &&
> +	test_must_fail git -C one rev-parse never-fetch-tag &&
> +	git -C one rev-parse refs/prefetch/bogus/fetched &&
> +	test_must_fail git -C one rev-parse refs/prefetch/bogus/ignore &&
> +
> +	# correctly handle when refspec set becomes empty
> +	# after removing the refs/tags/* refspec.
> +	git -C one config --unset-all remote.origin.fetch &&
> +	git -C one config --add remote.origin.fetch "refs/tags/*:refs/tags/*" &&
> +
> +	git -C one fetch --prefetch --no-tags &&
> +	test_must_fail git -C one rev-parse never-fetch-tag &&
> +
> +	# The refspec for refs that are not fully qualified
> +	# are filtered multiple times.
> +	git -C one rev-parse refs/prefetch/bogus/fetched &&
> +	test_must_fail git -C one rev-parse refs/prefetch/bogus/ignore
> +'
> +
> +test_expect_success '--prefetch succeeds when refspec becomes empty' '

Technically, this will get skipped based only on "skipfetchall", right?

The remote could have an empty set of refspecs or multiple valid
refspecs after filter_prefetch_refspec(), but the remote gets skipped
altogether.

Perhaps '--prefetch succeeds when remote.skipfetchall is true'?

Anyway, this is looking pretty solid.

Reviewed-by: Tom Saeger <tom.saeger@oracle.com>

> +	git checkout bogus/fetched &&
> +	test_commit extra &&
> +
> +	git -C one config --unset-all remote.origin.fetch &&
> +	git -C one config --unset branch.main.remote &&
> +	git -C one config remote.origin.fetch "+refs/tags/extra" &&
> +	git -C one config remote.origin.skipfetchall true &&
> +	git -C one config remote.origin.tagopt "--no-tags" &&
> +
> +	git -C one fetch --prefetch
> +'
> +
>  test_done
> -- 
> gitgitgadget
> 


* Re: [PATCH v4 1/4] maintenance: simplify prefetch logic
  2021-04-16 12:49       ` [PATCH v4 1/4] maintenance: simplify prefetch logic Derrick Stolee via GitGitGadget
@ 2021-04-16 18:02         ` Tom Saeger
  0 siblings, 0 replies; 72+ messages in thread
From: Tom Saeger @ 2021-04-16 18:02 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: git, gitster, sunshine, Derrick Stolee, Josh Steadmon,
	Emily Shaffer, Ramsay Jones, Derrick Stolee, Derrick Stolee

On Fri, Apr 16, 2021 at 12:49:56PM +0000, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <dstolee@microsoft.com>
> 
> The previous logic filled a string list with the names of each remote,
> but instead we could simply run the appropriate 'git fetch' data
> directly in the remote iterator. Do this for reduced code size, but also
> because it sets up an upcoming change to use the remote's refspec. This
> data is accessible from the 'struct remote' data that is now accessible
> in fetch_remote().
> 
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>

Reviewed-by: Tom Saeger <tom.saeger@oracle.com>

> ---
>  builtin/gc.c | 33 ++++++++-------------------------
>  1 file changed, 8 insertions(+), 25 deletions(-)
> 
> diff --git a/builtin/gc.c b/builtin/gc.c
> index ef7226d7bca4..fa8128de9ae1 100644
> --- a/builtin/gc.c
> +++ b/builtin/gc.c
> @@ -873,55 +873,38 @@ static int maintenance_task_commit_graph(struct maintenance_run_opts *opts)
>  	return 0;
>  }
>  
> -static int fetch_remote(const char *remote, struct maintenance_run_opts *opts)
> +static int fetch_remote(struct remote *remote, void *cbdata)
>  {
> +	struct maintenance_run_opts *opts = cbdata;
>  	struct child_process child = CHILD_PROCESS_INIT;
>  
>  	child.git_cmd = 1;
> -	strvec_pushl(&child.args, "fetch", remote, "--prune", "--no-tags",
> +	strvec_pushl(&child.args, "fetch", remote->name, "--prune", "--no-tags",
>  		     "--no-write-fetch-head", "--recurse-submodules=no",
>  		     "--refmap=", NULL);
>  
>  	if (opts->quiet)
>  		strvec_push(&child.args, "--quiet");
>  
> -	strvec_pushf(&child.args, "+refs/heads/*:refs/prefetch/%s/*", remote);
> +	strvec_pushf(&child.args, "+refs/heads/*:refs/prefetch/%s/*", remote->name);
>  
>  	return !!run_command(&child);
>  }
>  
> -static int append_remote(struct remote *remote, void *cbdata)
> -{
> -	struct string_list *remotes = (struct string_list *)cbdata;
> -
> -	string_list_append(remotes, remote->name);
> -	return 0;
> -}
> -
>  static int maintenance_task_prefetch(struct maintenance_run_opts *opts)
>  {
> -	int result = 0;
> -	struct string_list_item *item;
> -	struct string_list remotes = STRING_LIST_INIT_DUP;
> -
>  	git_config_set_multivar_gently("log.excludedecoration",
>  					"refs/prefetch/",
>  					"refs/prefetch/",
>  					CONFIG_FLAGS_FIXED_VALUE |
>  					CONFIG_FLAGS_MULTI_REPLACE);
>  
> -	if (for_each_remote(append_remote, &remotes)) {
> -		error(_("failed to fill remotes"));
> -		result = 1;
> -		goto cleanup;
> +	if (for_each_remote(fetch_remote, opts)) {
> +		error(_("failed to prefetch remotes"));
> +		return 1;
>  	}
>  
> -	for_each_string_list_item(item, &remotes)
> -		result |= fetch_remote(item->string, opts);
> -
> -cleanup:
> -	string_list_clear(&remotes, 0);
> -	return result;
> +	return 0;
>  }
>  
>  static int maintenance_task_gc(struct maintenance_run_opts *opts)
> -- 
> gitgitgadget
> 


* Re: [PATCH v4 2/4] fetch: add --prefetch option
  2021-04-16 17:52         ` Tom Saeger
@ 2021-04-16 18:26           ` Tom Saeger
  0 siblings, 0 replies; 72+ messages in thread
From: Tom Saeger @ 2021-04-16 18:26 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: git, gitster, sunshine, Derrick Stolee, Josh Steadmon,
	Emily Shaffer, Ramsay Jones, Derrick Stolee, Derrick Stolee

On Fri, Apr 16, 2021 at 12:52:21PM -0500, Tom Saeger wrote:
> On Fri, Apr 16, 2021 at 12:49:57PM +0000, Derrick Stolee via GitGitGadget wrote:
> > From: Derrick Stolee <dstolee@microsoft.com>
> > 
> > The --prefetch option will be used by the 'prefetch' maintenance task
> > instead of sending refspecs explicitly across the command-line. The
> > intention is to modify the refspec to place all results in
> > refs/prefetch/ instead of anywhere else.
> > 
> > Create helper method filter_prefetch_refspec() to modify a given refspec
> > to fit the rules expected of the prefetch task:
> > 
> >  * Negative refspecs are preserved.
> >  * Refspecs without a destination are removed.
> >  * Refspecs whose source starts with "refs/tags/" are removed.
> >  * Other refspecs are placed within "refs/prefetch/".
> > 
> > Finally, we add the 'force' option to ensure that prefetch refs are
> > replaced as necessary.
> > 
> > There are some interesting cases that are worth testing.
> > 
> > An earlier version of this change dropped the "i--" from the loop that
> > deletes a refspec item and shifts the remaining entries down. This
> > allowed some refspecs to not be modified. The subtle part about the
> > first --prefetch test is that the "refs/tags/*" refspec appears directly
> > before the "refs/heads/bogus/*" refspec. Without that "i--", this
> > ordering would remove the "refs/tags/*" refspec and leave the last one
> > unmodified, placing the result in "refs/heads/*".
> > 
> > It is possible to have an empty refspec. This is typically the case for
> > remotes other than the origin, where users want to fetch a specific tag
> > or branch. To correctly test this case, we need to further remove the
> > upstream remote for the local branch. Thus, we are testing a refspec
> > that will be deleted, leaving nothing to fetch.
> > 
> > Helped-by: Tom Saeger <tom.saeger@oracle.com>
> > Helped-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
> > Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> > ---
> >  Documentation/fetch-options.txt   |  5 +++
> >  builtin/fetch.c                   | 59 ++++++++++++++++++++++++++++++-
> >  t/t5582-fetch-negative-refspec.sh | 43 ++++++++++++++++++++++
> >  3 files changed, 106 insertions(+), 1 deletion(-)
> > 
> > diff --git a/Documentation/fetch-options.txt b/Documentation/fetch-options.txt
> > index 07783deee309..9e7b4e189ce0 100644
> > --- a/Documentation/fetch-options.txt
> > +++ b/Documentation/fetch-options.txt
> > @@ -110,6 +110,11 @@ ifndef::git-pull[]
> >  	setting `fetch.writeCommitGraph`.
> >  endif::git-pull[]
> >  
> > +--prefetch::
> > +	Modify the configured refspec to place all refs into the
> > +	`refs/prefetch/` namespace. See the `prefetch` task in
> > +	linkgit:git-maintenance[1].
> > +
> >  -p::
> >  --prune::
> >  	Before fetching, remove any remote-tracking references that no
> > diff --git a/builtin/fetch.c b/builtin/fetch.c
> > index 0b90de87c7a2..97c4fe6e6d66 100644
> > --- a/builtin/fetch.c
> > +++ b/builtin/fetch.c
> > @@ -48,6 +48,7 @@ enum {
> >  static int fetch_prune_config = -1; /* unspecified */
> >  static int fetch_show_forced_updates = 1;
> >  static uint64_t forced_updates_ms = 0;
> > +static int prefetch = 0;
> >  static int prune = -1; /* unspecified */
> >  #define PRUNE_BY_DEFAULT 0 /* do we prune by default? */
> >  
> > @@ -158,6 +159,8 @@ static struct option builtin_fetch_options[] = {
> >  		    N_("do not fetch all tags (--no-tags)"), TAGS_UNSET),
> >  	OPT_INTEGER('j', "jobs", &max_jobs,
> >  		    N_("number of submodules fetched in parallel")),
> > +	OPT_BOOL(0, "prefetch", &prefetch,
> > +		 N_("modify the refspec to place all refs within refs/prefetch/")),
> >  	OPT_BOOL('p', "prune", &prune,
> >  		 N_("prune remote-tracking branches no longer on remote")),
> >  	OPT_BOOL('P', "prune-tags", &prune_tags,
> > @@ -436,6 +439,56 @@ static void find_non_local_tags(const struct ref *refs,
> >  	oidset_clear(&fetch_oids);
> >  }
> >  
> > +static void filter_prefetch_refspec(struct refspec *rs)
> > +{
> > +	int i;
> > +
> > +	if (!prefetch)
> > +		return;
> > +
> > +	for (i = 0; i < rs->nr; i++) {
> > +		struct strbuf new_dst = STRBUF_INIT;
> > +		char *old_dst;
> > +		const char *sub = NULL;
> > +
> > +		if (rs->items[i].negative)
> > +			continue;
> > +		if (!rs->items[i].dst ||
> > +		    (rs->items[i].src &&
> > +		     !strncmp(rs->items[i].src, "refs/tags/", 10))) {
> > +			int j;
> > +
> > +			free(rs->items[i].src);
> > +			free(rs->items[i].dst);
> > +
> > +			for (j = i + 1; j < rs->nr; j++) {
> > +				rs->items[j - 1] = rs->items[j];
> > +				rs->raw[j - 1] = rs->raw[j];
> > +			}
> > +			rs->nr--;
> > +			i--;
> > +			continue;
> > +		}
> > +
> > +		old_dst = rs->items[i].dst;
> > +		strbuf_addstr(&new_dst, "refs/prefetch/");
> > +
> > +		/*
> > +		 * If old_dst starts with "refs/", then place
> > +		 * sub after that prefix. Otherwise, start at
> > +		 * the beginning of the string.
> > +		 */
> > +		if (!skip_prefix(old_dst, "refs/", &sub))
> > +			sub = old_dst;
> > +		strbuf_addstr(&new_dst, sub);
> > +
> > +		rs->items[i].dst = strbuf_detach(&new_dst, NULL);
> > +		rs->items[i].force = 1;
> > +
> > +		free(old_dst);
> > +	}
> > +}
> > +
> >  static struct ref *get_ref_map(struct remote *remote,
> >  			       const struct ref *remote_refs,
> >  			       struct refspec *rs,
> > @@ -452,6 +505,10 @@ static struct ref *get_ref_map(struct remote *remote,
> >  	struct hashmap existing_refs;
> >  	int existing_refs_populated = 0;
> >  
> > +	filter_prefetch_refspec(rs);
> > +	if (remote)
> > +		filter_prefetch_refspec(&remote->fetch);
> > +
> >  	if (rs->nr) {
> >  		struct refspec *fetch_refspec;
> >  
> > @@ -520,7 +577,7 @@ static struct ref *get_ref_map(struct remote *remote,
> >  			if (has_merge &&
> >  			    !strcmp(branch->remote_name, remote->name))
> >  				add_merge_config(&ref_map, remote_refs, branch, &tail);
> > -		} else {
> > +		} else if (!prefetch) {
> 
> That works for me.
> 
> >  			ref_map = get_remote_ref(remote_refs, "HEAD");
> >  			if (!ref_map)
> >  				die(_("Couldn't find remote ref HEAD"));
> > diff --git a/t/t5582-fetch-negative-refspec.sh b/t/t5582-fetch-negative-refspec.sh
> > index f34509727702..e5d2e79ad382 100755
> > --- a/t/t5582-fetch-negative-refspec.sh
> > +++ b/t/t5582-fetch-negative-refspec.sh
> > @@ -240,4 +240,47 @@ test_expect_success "push with matching +: and negative refspec" '
> >  	git -C two push -v one
> >  '
> >  
> > +test_expect_success '--prefetch correctly modifies refspecs' '
> > +	git -C one config --unset-all remote.origin.fetch &&
> > +	git -C one config --add remote.origin.fetch ^refs/heads/bogus/ignore &&
> > +	git -C one config --add remote.origin.fetch "refs/tags/*:refs/tags/*" &&
> > +	git -C one config --add remote.origin.fetch "refs/heads/bogus/*:bogus/*" &&
> > +
> > +	git tag -a -m never never-fetch-tag HEAD &&
> > +
> > +	git branch bogus/fetched HEAD~1 &&
> > +	git branch bogus/ignore HEAD &&
> > +
> > +	git -C one fetch --prefetch --no-tags &&
> > +	test_must_fail git -C one rev-parse never-fetch-tag &&
> > +	git -C one rev-parse refs/prefetch/bogus/fetched &&
> > +	test_must_fail git -C one rev-parse refs/prefetch/bogus/ignore &&
> > +
> > +	# correctly handle when refspec set becomes empty
> > +	# after removing the refs/tags/* refspec.
> > +	git -C one config --unset-all remote.origin.fetch &&
> > +	git -C one config --add remote.origin.fetch "refs/tags/*:refs/tags/*" &&
> > +
> > +	git -C one fetch --prefetch --no-tags &&
> > +	test_must_fail git -C one rev-parse never-fetch-tag &&
> > +
> > +	# The refspec for refs that are not fully qualified
> > +	# are filtered multiple times.
> > +	git -C one rev-parse refs/prefetch/bogus/fetched &&
> > +	test_must_fail git -C one rev-parse refs/prefetch/bogus/ignore
> > +'
> > +
> > +test_expect_success '--prefetch succeeds when refspec becomes empty' '
> 
> technically this will get skipped based only on "skipfetchall" right?
> 
> The remote could have an empty-set of refspecs or multiple
> valid refspecs post filter_prefetch_refspec, but the remote gets skipped altogether.
> 
> perhaps '--prefetch succeeds when remote.skipfetchall is true' '

Forget what I said here. I now see that this test uses 'git fetch --prefetch'
directly and not `git maintenance run --task=prefetch`.

> 
> anyway this is looking pretty solid
> 
> Reviewed-by: Tom Saeger <tom.saeger@oracle.com>
> 
> > +	git checkout bogus/fetched &&
> > +	test_commit extra &&
> > +
> > +	git -C one config --unset-all remote.origin.fetch &&
> > +	git -C one config --unset branch.main.remote &&
> > +	git -C one config remote.origin.fetch "+refs/tags/extra" &&
> > +	git -C one config remote.origin.skipfetchall true &&
> > +	git -C one config remote.origin.tagopt "--no-tags" &&
> > +
> > +	git -C one fetch --prefetch
> > +'
> > +
> >  test_done
> > -- 
> > gitgitgadget
> > 
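
For readers following the thread, the four refspec rules from the quoted
commit message, and the reason the loop must not advance after deleting an
entry (the "i--" in the patch), can be sketched outside of Git as below.
This is a minimal sketch and not the code from builtin/fetch.c: struct item,
dup_str(), prefetch_dst() and filter_items() are hypothetical stand-ins,
the parallel rs->raw array from the patch is left out, and plain libc calls
replace strbuf and skip_prefix().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for Git's struct refspec_item. */
struct item {
	char *src;	/* may be NULL */
	char *dst;	/* NULL when the refspec has no destination */
	int negative;	/* set for "^pattern" refspecs */
	int force;	/* set for "+src:dst" refspecs */
};

/* Heap-allocated copy of a string (illustrative helper). */
char *dup_str(const char *s)
{
	char *copy = malloc(strlen(s) + 1);

	assert(copy);
	strcpy(copy, s);
	return copy;
}

/*
 * Build the new destination: a leading "refs/" is replaced by
 * "refs/prefetch/"; any other destination gets "refs/prefetch/"
 * prepended unchanged.
 */
char *prefetch_dst(const char *old_dst)
{
	static const char prefix[] = "refs/prefetch/";
	const char *sub = old_dst;
	char *out;

	if (!strncmp(old_dst, "refs/", 5))
		sub = old_dst + 5;
	out = malloc(strlen(prefix) + strlen(sub) + 1);
	assert(out);
	strcpy(out, prefix);
	strcat(out, sub);
	return out;
}

/* Apply the rules in place and return the new number of items. */
size_t filter_items(struct item *items, size_t nr)
{
	size_t i = 0;

	while (i < nr) {
		struct item *it = &items[i];
		char *old_dst;

		if (it->negative) {	/* negative refspecs are preserved */
			i++;
			continue;
		}
		if (!it->dst ||
		    (it->src && !strncmp(it->src, "refs/tags/", 10))) {
			free(it->src);
			free(it->dst);
			/* shift the remaining entries down by one */
			memmove(it, it + 1, (nr - i - 1) * sizeof(*it));
			nr--;
			/*
			 * Do not advance i: the entry that followed the
			 * deleted one now sits at index i and still has
			 * to be examined. The patch gets the same effect
			 * with "i--" inside a for loop.
			 */
			continue;
		}
		old_dst = it->dst;
		it->dst = prefetch_dst(old_dst);
		it->force = 1;
		free(old_dst);
		i++;
	}
	return nr;
}
```

Fed the three refspecs configured in the first --prefetch test
("^refs/heads/bogus/ignore", "refs/tags/*:refs/tags/*" and
"refs/heads/bogus/*:bogus/*"), this sketch keeps the negative entry, drops
the tags entry, and rewrites the last destination to "refs/prefetch/bogus/*"
with force set. Advancing the index after the deletion would skip that last
entry, which is the failure the commit message describes.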

* Re: [PATCH v4 4/4] maintenance: respect remote.*.skipFetchAll
  2021-04-16 12:49       ` [PATCH v4 4/4] maintenance: respect remote.*.skipFetchAll Derrick Stolee via GitGitGadget
  2021-04-16 13:54         ` Ævar Arnfjörð Bjarmason
@ 2021-04-16 18:31         ` Tom Saeger
  1 sibling, 0 replies; 72+ messages in thread
From: Tom Saeger @ 2021-04-16 18:31 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: git, gitster, sunshine, Derrick Stolee, Josh Steadmon,
	Emily Shaffer, Ramsay Jones, Derrick Stolee, Derrick Stolee

On Fri, Apr 16, 2021 at 12:49:59PM +0000, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <dstolee@microsoft.com>
> 
> If a remote has the skipFetchAll setting enabled, then that remote is
> not intended for frequent fetching. It makes sense to not fetch that
> data during the 'prefetch' maintenance task. Skip that remote in the
> iteration without error. The skip_default_update member is initialized
> in remote.c:handle_config() as part of initializing the 'struct remote'.
> 
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>

Reviewed-by: Tom Saeger <tom.saeger@oracle.com>
> ---
>  builtin/gc.c           | 3 +++
>  t/t7900-maintenance.sh | 8 +++++++-
>  2 files changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/builtin/gc.c b/builtin/gc.c
> index 9d35f7da50d8..98a803196b88 100644
> --- a/builtin/gc.c
> +++ b/builtin/gc.c
> @@ -878,6 +878,9 @@ static int fetch_remote(struct remote *remote, void *cbdata)
>  	struct maintenance_run_opts *opts = cbdata;
>  	struct child_process child = CHILD_PROCESS_INIT;
>  
> +	if (remote->skip_default_update)
> +		return 0;
> +

Well that is way easier than doing it in 'fetch'.


>  	child.git_cmd = 1;
>  	strvec_pushl(&child.args, "fetch", remote->name,
>  		     "--prefetch", "--prune", "--no-tags",
> diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
> index eadb800c08cc..b93ae014ee58 100755
> --- a/t/t7900-maintenance.sh
> +++ b/t/t7900-maintenance.sh
> @@ -153,7 +153,13 @@ test_expect_success 'prefetch multiple remotes' '
>  
>  	test_cmp_config refs/prefetch/ log.excludedecoration &&
>  	git log --oneline --decorate --all >log &&
> -	! grep "prefetch" log
> +	! grep "prefetch" log &&
> +
> +	test_when_finished git config --unset remote.remote1.skipFetchAll &&
> +	git config remote.remote1.skipFetchAll true &&
> +	GIT_TRACE2_EVENT="$(pwd)/skip-remote1.txt" git maintenance run --task=prefetch 2>/dev/null &&
> +	test_subcommand ! git fetch remote1 $fetchargs <skip-remote1.txt &&
> +	test_subcommand git fetch remote2 $fetchargs <skip-remote1.txt
>  '
>  
>  test_expect_success 'prefetch and existing log.excludeDecoration values' '
> -- 
> gitgitgadget
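
The skip in fetch_remote() relies on the callback convention: returning 0
for a remote means "nothing went wrong, keep iterating". A minimal
standalone sketch of that pattern follows. struct remote_cfg,
for_each_remote_cfg() and prefetch_one() are hypothetical names; the
iterator is assumed to stop at the first non-zero return, and the printf()
stands in for spawning the child process, whose full argument list is not
shown in the quoted hunk.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the fields of struct remote used here. */
struct remote_cfg {
	const char *name;
	int skip_default_update;	/* from remote.<name>.skipFetchAll */
};

typedef int (*each_remote_fn)(struct remote_cfg *remote, void *cbdata);

/* Call fn for each remote; assumed to stop at the first non-zero result. */
int for_each_remote_cfg(struct remote_cfg *remotes, size_t nr,
			each_remote_fn fn, void *cbdata)
{
	size_t i;
	int result = 0;

	for (i = 0; i < nr && !result; i++)
		result = fn(&remotes[i], cbdata);
	return result;
}

/* Callback: skip opted-out remotes without reporting an error. */
int prefetch_one(struct remote_cfg *remote, void *cbdata)
{
	int *fetched = cbdata;

	if (remote->skip_default_update)
		return 0;	/* skipped, but iteration continues */

	/* Stand-in for running the child process. */
	printf("git fetch %s --prefetch --prune --no-tags\n", remote->name);
	(*fetched)++;
	return 0;
}
```

With remote1 marked as skipped and remote2 not, only remote2 is fetched and
the overall result is still 0, which matches what the added test_subcommand
checks in t7900 expect.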

end of thread, other threads:[~2021-04-16 18:31 UTC | newest]

Thread overview: 72+ messages
2021-04-05 13:04 [PATCH 0/5] Maintenance: adapt custom refspecs Derrick Stolee via GitGitGadget
2021-04-05 13:04 ` [PATCH 1/5] maintenance: simplify prefetch logic Derrick Stolee via GitGitGadget
2021-04-05 17:01   ` Tom Saeger
2021-04-05 13:04 ` [PATCH 2/5] test-lib: use exact match for test_subcommand Derrick Stolee via GitGitGadget
2021-04-05 17:31   ` Eric Sunshine
2021-04-05 17:43     ` Junio C Hamano
2021-04-05 13:04 ` [PATCH 3/5] refspec: output a refspec item Derrick Stolee via GitGitGadget
2021-04-05 16:57   ` Tom Saeger
2021-04-05 17:40     ` Eric Sunshine
2021-04-05 17:44       ` Junio C Hamano
2021-04-06 11:21         ` Derrick Stolee
2021-04-06 15:23           ` Eric Sunshine
2021-04-06 16:51             ` Derrick Stolee
2021-04-07  8:46   ` Ævar Arnfjörð Bjarmason
2021-04-07 20:53     ` Derrick Stolee
2021-04-07 22:05       ` Ævar Arnfjörð Bjarmason
2021-04-07 22:49         ` Junio C Hamano
2021-04-07 23:01           ` Ævar Arnfjörð Bjarmason
2021-04-08  7:33             ` Junio C Hamano
2021-04-05 13:04 ` [PATCH 4/5] test-tool: test refspec input/output Derrick Stolee via GitGitGadget
2021-04-05 17:52   ` Eric Sunshine
2021-04-06 11:13     ` Derrick Stolee
2021-04-07  8:54   ` Ævar Arnfjörð Bjarmason
2021-04-05 13:04 ` [PATCH 5/5] maintenance: allow custom refspecs during prefetch Derrick Stolee via GitGitGadget
2021-04-05 17:16   ` Tom Saeger
2021-04-06 11:15     ` Derrick Stolee
2021-04-07  8:53   ` Ævar Arnfjörð Bjarmason
2021-04-07 10:26     ` Ævar Arnfjörð Bjarmason
2021-04-09 11:48       ` Derrick Stolee
2021-04-09 19:28         ` Ævar Arnfjörð Bjarmason
2021-04-10  0:56           ` Derrick Stolee
2021-04-10 11:37             ` Ævar Arnfjörð Bjarmason
2021-04-07 13:47   ` Ævar Arnfjörð Bjarmason
2021-04-06 18:47 ` [PATCH v2 0/5] Maintenance: adapt custom refspecs Derrick Stolee via GitGitGadget
2021-04-06 18:47   ` [PATCH v2 1/5] maintenance: simplify prefetch logic Derrick Stolee via GitGitGadget
2021-04-07 23:23     ` Emily Shaffer
2021-04-09 19:00       ` Derrick Stolee
2021-04-06 18:47   ` [PATCH v2 2/5] test-lib: use exact match for test_subcommand Derrick Stolee via GitGitGadget
2021-04-06 18:47   ` [PATCH v2 3/5] refspec: output a refspec item Derrick Stolee via GitGitGadget
2021-04-06 18:47   ` [PATCH v2 4/5] test-tool: test refspec input/output Derrick Stolee via GitGitGadget
2021-04-07 23:08     ` Josh Steadmon
2021-04-07 23:26     ` Emily Shaffer
2021-04-06 18:47   ` [PATCH v2 5/5] maintenance: allow custom refspecs during prefetch Derrick Stolee via GitGitGadget
2021-04-06 19:36     ` Tom Saeger
2021-04-06 19:45       ` Derrick Stolee
2021-04-07 23:09     ` Josh Steadmon
2021-04-07 23:37     ` Emily Shaffer
2021-04-08  0:23     ` Jonathan Tan
2021-04-10  2:03   ` [PATCH v3 0/3] Maintenance: adapt custom refspecs Derrick Stolee via GitGitGadget
2021-04-10  2:03     ` [PATCH v3 1/3] maintenance: simplify prefetch logic Derrick Stolee via GitGitGadget
2021-04-12 20:13       ` Tom Saeger
2021-04-12 20:27         ` Derrick Stolee
2021-04-10  2:03     ` [PATCH v3 2/3] fetch: add --prefetch option Derrick Stolee via GitGitGadget
2021-04-11 21:09       ` Ramsay Jones
2021-04-12 20:23         ` Derrick Stolee
2021-04-10  2:03     ` [PATCH v3 3/3] maintenance: use 'git fetch --prefetch' Derrick Stolee via GitGitGadget
2021-04-11  1:35     ` [PATCH v3 0/3] Maintenance: adapt custom refspecs Junio C Hamano
2021-04-12 16:48       ` Tom Saeger
2021-04-12 17:24         ` Tom Saeger
2021-04-12 17:41           ` Tom Saeger
2021-04-12 20:25             ` Derrick Stolee
2021-04-16 12:49     ` [PATCH v4 0/4] " Derrick Stolee via GitGitGadget
2021-04-16 12:49       ` [PATCH v4 1/4] maintenance: simplify prefetch logic Derrick Stolee via GitGitGadget
2021-04-16 18:02         ` Tom Saeger
2021-04-16 12:49       ` [PATCH v4 2/4] fetch: add --prefetch option Derrick Stolee via GitGitGadget
2021-04-16 17:52         ` Tom Saeger
2021-04-16 18:26           ` Tom Saeger
2021-04-16 12:49       ` [PATCH v4 3/4] maintenance: use 'git fetch --prefetch' Derrick Stolee via GitGitGadget
2021-04-16 12:49       ` [PATCH v4 4/4] maintenance: respect remote.*.skipFetchAll Derrick Stolee via GitGitGadget
2021-04-16 13:54         ` Ævar Arnfjörð Bjarmason
2021-04-16 14:33           ` Tom Saeger
2021-04-16 18:31         ` Tom Saeger
