From: "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: tom.saeger@oracle.com, gitster@pobox.com,
sunshine@sunshineco.com, Derrick Stolee <stolee@gmail.com>,
Josh Steadmon <steadmon@google.com>,
Emily Shaffer <emilyshaffer@google.com>,
Derrick Stolee <derrickstolee@github.com>
Subject: [PATCH v3 0/3] Maintenance: adapt custom refspecs
Date: Sat, 10 Apr 2021 02:03:42 +0000 [thread overview]
Message-ID: <pull.924.v3.git.1618020225.gitgitgadget@gmail.com> (raw)
In-Reply-To: <pull.924.v2.git.1617734870.gitgitgadget@gmail.com>
Tom Saeger rightly pointed out [1] that the prefetch task ignores custom
refspecs. This can lead to downloading more data than requested, and it
doesn't even help the future foreground fetches that use that custom
refspec.
[1]
https://lore.kernel.org/git/20210401184914.qmr7jhjbhp2mt3h6@dhcp-10-154-148-175.vpn.oracle.com/
This series fixes this problem by carefully replacing the start of each
refspec's destination with "refs/prefetch/". If the destination already
starts with "refs/", then that is replaced. Otherwise "refs/prefetch/" is
just prepended.
This happens inside of git fetch when a --prefetch option is given. This
allows us to maniuplate a struct refspec_item instead of a full refspec
string. It also simplifies our logic in testing the prefetch task.
Update in V3
============
* The fix is almost completely rewritten as an update to 'git fetch'. See
the new PATCH 2 for this update.
* There was some discussion of rewriting test_subcommand, but that can be
delayed until a proper solution is found to complications around softer
matches.
Updates in V2
=============
Thanks for the close eye on this series. I appreciate the recommendations,
which I believe I have responded to them all:
* Fixed typos.
* Made refspec_item_format() re-entrant. Consumers must free the buffer.
* Cleaned up style (quoting and tabbing).
Thanks, -Stolee
Derrick Stolee (3):
maintenance: simplify prefetch logic
fetch: add --prefetch option
maintenance: use 'git fetch --prefetch'
Documentation/fetch-options.txt | 5 +++
Documentation/git-maintenance.txt | 6 ++--
builtin/fetch.c | 56 +++++++++++++++++++++++++++++++
builtin/gc.c | 36 +++++---------------
t/t5582-fetch-negative-refspec.sh | 30 +++++++++++++++++
t/t7900-maintenance.sh | 14 ++++----
6 files changed, 109 insertions(+), 38 deletions(-)
base-commit: 89b43f80a514aee58b662ad606e6352e03eaeee4
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-924%2Fderrickstolee%2Fmaintenance%2Frefspec-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-924/derrickstolee/maintenance/refspec-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/924
Range-diff vs v2:
1: 5aa0cb06c3f2 = 1: 4c0e983ba56f maintenance: simplify prefetch logic
2: d58a3e042ee8 < -: ------------ test-lib: use exact match for test_subcommand
3: 96388d949b98 < -: ------------ refspec: output a refspec item
4: bf296282323a < -: ------------ test-tool: test refspec input/output
-: ------------ > 2: 7f488eea6dbd fetch: add --prefetch option
5: 9592224e3d42 ! 3: ed055d772452 maintenance: allow custom refspecs during prefetch
@@ Metadata
Author: Derrick Stolee <dstolee@microsoft.com>
## Commit message ##
- maintenance: allow custom refspecs during prefetch
+ maintenance: use 'git fetch --prefetch'
- The prefetch task previously used the default refspec source plus a
- custom refspec destination to avoid colliding with remote refs:
+ The 'prefetch' maintenance task previously forced the following refspec
+ for each remote:
+refs/heads/*:refs/prefetch/<remote>/*
- However, some users customize their refspec to reduce how much data they
- download from specific remotes. This can involve restrictive patterns
- for fetching or negative patterns to avoid downloading some refs.
+ If a user has specified a more strict refspec for the remote, then this
+ prefetch task downloads more objects than necessary.
- Modify fetch_remote() to iterate over the remote's refspec list and
- translate that into the appropriate prefetch scenario. Specifically,
- re-parse the raw form of the refspec into a new 'struct refspec' and
- modify the 'dst' member to replace a leading "refs/" substring with
- "refs/prefetch/", or prepend "refs/prefetch/" to 'dst' otherwise.
- Negative refspecs do not have a 'dst' so they can be transferred to the
- 'git fetch' command unmodified.
-
- This prefix change provides the benefit of keeping whatever collisions
- may exist in the custom refspecs, if that is a desirable outcome.
-
- This changes the names of the refs that would be fetched by the default
- refspec. Instead of "refs/prefetch/<remote>/<branch>" they will now go
- to "refs/prefetch/remotes/<remote>/<branch>". While this is a change, it
- is not a seriously breaking one: these refs are intended to be hidden
- and not used.
+ The previous change introduced the '--prefetch' option to 'git fetch'
+ which manipulates the remote's refspec to place all resulting refs into
+ refs/prefetch/, with further partitioning based on the destinations of
+ those refspecs.
Update the documentation to be more generic about the destination refs.
- Do not mention custom refpecs explicitly, as that does not need to be
+ Do not mention custom refspecs explicitly, as that does not need to be
highlighted in this documentation. The important part of placing refs in
- refs/prefetch remains.
+ refs/prefetch/ remains.
Reported-by: Tom Saeger <tom.saeger@oracle.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
## Documentation/git-maintenance.txt ##
-@@ Documentation/git-maintenance.txt: prefetch::
+@@ Documentation/git-maintenance.txt: commit-graph::
+ prefetch::
+ The `prefetch` task updates the object directory with the latest
objects from all registered remotes. For each remote, a `git fetch`
- command is run. The refmap is custom to avoid updating local or remote
- branches (those in `refs/heads` or `refs/remotes`). Instead, the
+- command is run. The refmap is custom to avoid updating local or remote
+- branches (those in `refs/heads` or `refs/remotes`). Instead, the
- remote refs are stored in `refs/prefetch/<remote>/`. Also, tags are
- not updated.
-+ refs are stored in `refs/prefetch/`. Also, tags are not updated.
++ command is run. The configured refspec is modified to place all
++ requested refs within `refs/prefetch/`. Also, tags are not updated.
+
This is done to avoid disrupting the remote-tracking branches. The end users
expect these refs to stay unmoved unless they initiate a fetch. With prefetch
## builtin/gc.c ##
-@@
- #include "remote.h"
- #include "object-store.h"
- #include "exec-cmd.h"
-+#include "refspec.h"
-
- #define FAILED_RUN "failed to run %s"
-
@@ builtin/gc.c: static int fetch_remote(struct remote *remote, void *cbdata)
- {
- struct maintenance_run_opts *opts = cbdata;
struct child_process child = CHILD_PROCESS_INIT;
-+ int i;
child.git_cmd = 1;
- strvec_pushl(&child.args, "fetch", remote->name, "--prune", "--no-tags",
-@@ builtin/gc.c: static int fetch_remote(struct remote *remote, void *cbdata)
+- strvec_pushl(&child.args, "fetch", remote->name, "--prune", "--no-tags",
++ strvec_pushl(&child.args, "fetch", remote->name,
++ "--prefetch", "--prune", "--no-tags",
+ "--no-write-fetch-head", "--recurse-submodules=no",
+- "--refmap=", NULL);
++ NULL);
+
if (opts->quiet)
strvec_push(&child.args, "--quiet");
- strvec_pushf(&child.args, "+refs/heads/*:refs/prefetch/%s/*", remote->name);
-+ for (i = 0; i < remote->fetch.nr; i++) {
-+ struct refspec_item replace;
-+ struct refspec_item *rsi = &remote->fetch.items[i];
-+ struct strbuf new_dst = STRBUF_INIT;
-+ size_t ignore_len = 0;
-+ char *replace_string;
-+
-+ if (rsi->negative) {
-+ strvec_push(&child.args, remote->fetch.raw[i]);
-+ continue;
-+ }
-+
-+ refspec_item_init(&replace, remote->fetch.raw[i], 1);
-+
-+ /*
-+ * If a refspec dst starts with "refs/" at the start,
-+ * then we will replace "refs/" with "refs/prefetch/".
-+ * Otherwise, we will prepend the dst string with
-+ * "refs/prefetch/".
-+ */
-+ if (!strncmp(replace.dst, "refs/", 5))
-+ ignore_len = 5;
-+
-+ strbuf_addstr(&new_dst, "refs/prefetch/");
-+ strbuf_addstr(&new_dst, replace.dst + ignore_len);
-+ free(replace.dst);
-+ replace.dst = strbuf_detach(&new_dst, NULL);
-+
-+ replace_string = refspec_item_format(&replace);
-+ strvec_push(&child.args, replace_string);
-+ free(replace_string);
-+
-+ refspec_item_clear(&replace);
-+ }
-
+-
return !!run_command(&child);
}
+
## t/t7900-maintenance.sh ##
@@ t/t7900-maintenance.sh: test_expect_success 'prefetch multiple remotes' '
+ test_commit -C clone1 one &&
test_commit -C clone2 two &&
GIT_TRACE2_EVENT="$(pwd)/run-prefetch.txt" git maintenance run --task=prefetch 2>/dev/null &&
- fetchargs="--prune --no-tags --no-write-fetch-head --recurse-submodules=no --refmap= --quiet" &&
-- test_subcommand git fetch remote1 $fetchargs "+refs/heads/*:refs/prefetch/remote1/*" <run-prefetch.txt &&
-- test_subcommand git fetch remote2 $fetchargs "+refs/heads/*:refs/prefetch/remote2/*" <run-prefetch.txt &&
-+ test_subcommand git fetch remote1 $fetchargs "+refs/heads/*:refs/prefetch/remotes/remote1/*" <run-prefetch.txt &&
-+ test_subcommand git fetch remote2 $fetchargs "+refs/heads/*:refs/prefetch/remotes/remote2/*" <run-prefetch.txt &&
+- fetchargs="--prune --no-tags --no-write-fetch-head --recurse-submodules=no --refmap= --quiet" &&
+- test_subcommand git fetch remote1 $fetchargs +refs/heads/\\*:refs/prefetch/remote1/\\* <run-prefetch.txt &&
+- test_subcommand git fetch remote2 $fetchargs +refs/heads/\\*:refs/prefetch/remote2/\\* <run-prefetch.txt &&
++ fetchargs="--prefetch --prune --no-tags --no-write-fetch-head --recurse-submodules=no --quiet" &&
++ test_subcommand git fetch remote1 $fetchargs <run-prefetch.txt &&
++ test_subcommand git fetch remote2 $fetchargs <run-prefetch.txt &&
test_path_is_missing .git/refs/remotes &&
- git log prefetch/remote1/one &&
- git log prefetch/remote2/two &&
@@ t/t7900-maintenance.sh: test_expect_success 'prefetch multiple remotes' '
test_cmp_config refs/prefetch/ log.excludedecoration &&
git log --oneline --decorate --all >log &&
- ! grep "prefetch" log
- '
-
-+test_expect_success 'prefetch custom refspecs' '
-+ git -C clone1 branch -f special/fetched HEAD &&
-+ git -C clone1 branch -f special/secret/not-fetched HEAD &&
-+
-+ # create multiple refspecs for remote1
-+ git config --add remote.remote1.fetch "+refs/heads/special/fetched:refs/heads/fetched" &&
-+ git config --add remote.remote1.fetch "^refs/heads/special/secret/not-fetched" &&
-+
-+ GIT_TRACE2_EVENT="$(pwd)/prefetch-refspec.txt" git maintenance run --task=prefetch 2>/dev/null &&
-+
-+ fetchargs="--prune --no-tags --no-write-fetch-head --recurse-submodules=no --refmap= --quiet" &&
-+
-+ # skips second refspec because it is not a pattern type
-+ rs1="+refs/heads/*:refs/prefetch/remotes/remote1/*" &&
-+ rs2="+refs/heads/special/fetched:refs/prefetch/heads/fetched" &&
-+ rs3="^refs/heads/special/secret/not-fetched" &&
-+
-+ test_subcommand git fetch remote1 $fetchargs "$rs1" "$rs2" "$rs3" <prefetch-refspec.txt &&
-+ test_subcommand git fetch remote2 $fetchargs "+refs/heads/*:refs/prefetch/remotes/remote2/*" <prefetch-refspec.txt &&
-+
-+ # first refspec is overridden by second
-+ test_must_fail git rev-parse refs/prefetch/special/fetched &&
-+ git rev-parse refs/prefetch/heads/fetched &&
-+
-+ # possible incorrect places for the non-fetched ref
-+ test_must_fail git rev-parse refs/prefetch/remotes/remote1/secret/not-fetched &&
-+ test_must_fail git rev-parse refs/prefetch/remotes/remote1/not-fetched &&
-+ test_must_fail git rev-parse refs/heads/secret/not-fetched &&
-+ test_must_fail git rev-parse refs/heads/not-fetched
-+'
-+
- test_expect_success 'prefetch and existing log.excludeDecoration values' '
- git config --unset-all log.excludeDecoration &&
- git config log.excludeDecoration refs/remotes/remote1/ &&
--
gitgitgadget
next prev parent reply other threads:[~2021-04-10 2:04 UTC|newest]
Thread overview: 72+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-04-05 13:04 [PATCH 0/5] Maintenance: adapt custom refspecs Derrick Stolee via GitGitGadget
2021-04-05 13:04 ` [PATCH 1/5] maintenance: simplify prefetch logic Derrick Stolee via GitGitGadget
2021-04-05 17:01 ` Tom Saeger
2021-04-05 13:04 ` [PATCH 2/5] test-lib: use exact match for test_subcommand Derrick Stolee via GitGitGadget
2021-04-05 17:31 ` Eric Sunshine
2021-04-05 17:43 ` Junio C Hamano
2021-04-05 13:04 ` [PATCH 3/5] refspec: output a refspec item Derrick Stolee via GitGitGadget
2021-04-05 16:57 ` Tom Saeger
2021-04-05 17:40 ` Eric Sunshine
2021-04-05 17:44 ` Junio C Hamano
2021-04-06 11:21 ` Derrick Stolee
2021-04-06 15:23 ` Eric Sunshine
2021-04-06 16:51 ` Derrick Stolee
2021-04-07 8:46 ` Ævar Arnfjörð Bjarmason
2021-04-07 20:53 ` Derrick Stolee
2021-04-07 22:05 ` Ævar Arnfjörð Bjarmason
2021-04-07 22:49 ` Junio C Hamano
2021-04-07 23:01 ` Ævar Arnfjörð Bjarmason
2021-04-08 7:33 ` Junio C Hamano
2021-04-05 13:04 ` [PATCH 4/5] test-tool: test refspec input/output Derrick Stolee via GitGitGadget
2021-04-05 17:52 ` Eric Sunshine
2021-04-06 11:13 ` Derrick Stolee
2021-04-07 8:54 ` Ævar Arnfjörð Bjarmason
2021-04-05 13:04 ` [PATCH 5/5] maintenance: allow custom refspecs during prefetch Derrick Stolee via GitGitGadget
2021-04-05 17:16 ` Tom Saeger
2021-04-06 11:15 ` Derrick Stolee
2021-04-07 8:53 ` Ævar Arnfjörð Bjarmason
2021-04-07 10:26 ` Ævar Arnfjörð Bjarmason
2021-04-09 11:48 ` Derrick Stolee
2021-04-09 19:28 ` Ævar Arnfjörð Bjarmason
2021-04-10 0:56 ` Derrick Stolee
2021-04-10 11:37 ` Ævar Arnfjörð Bjarmason
2021-04-07 13:47 ` Ævar Arnfjörð Bjarmason
2021-04-06 18:47 ` [PATCH v2 0/5] Maintenance: adapt custom refspecs Derrick Stolee via GitGitGadget
2021-04-06 18:47 ` [PATCH v2 1/5] maintenance: simplify prefetch logic Derrick Stolee via GitGitGadget
2021-04-07 23:23 ` Emily Shaffer
2021-04-09 19:00 ` Derrick Stolee
2021-04-06 18:47 ` [PATCH v2 2/5] test-lib: use exact match for test_subcommand Derrick Stolee via GitGitGadget
2021-04-06 18:47 ` [PATCH v2 3/5] refspec: output a refspec item Derrick Stolee via GitGitGadget
2021-04-06 18:47 ` [PATCH v2 4/5] test-tool: test refspec input/output Derrick Stolee via GitGitGadget
2021-04-07 23:08 ` Josh Steadmon
2021-04-07 23:26 ` Emily Shaffer
2021-04-06 18:47 ` [PATCH v2 5/5] maintenance: allow custom refspecs during prefetch Derrick Stolee via GitGitGadget
2021-04-06 19:36 ` Tom Saeger
2021-04-06 19:45 ` Derrick Stolee
2021-04-07 23:09 ` Josh Steadmon
2021-04-07 23:37 ` Emily Shaffer
2021-04-08 0:23 ` Jonathan Tan
2021-04-10 2:03 ` Derrick Stolee via GitGitGadget [this message]
2021-04-10 2:03 ` [PATCH v3 1/3] maintenance: simplify prefetch logic Derrick Stolee via GitGitGadget
2021-04-12 20:13 ` Tom Saeger
2021-04-12 20:27 ` Derrick Stolee
2021-04-10 2:03 ` [PATCH v3 2/3] fetch: add --prefetch option Derrick Stolee via GitGitGadget
2021-04-11 21:09 ` Ramsay Jones
2021-04-12 20:23 ` Derrick Stolee
2021-04-10 2:03 ` [PATCH v3 3/3] maintenance: use 'git fetch --prefetch' Derrick Stolee via GitGitGadget
2021-04-11 1:35 ` [PATCH v3 0/3] Maintenance: adapt custom refspecs Junio C Hamano
2021-04-12 16:48 ` Tom Saeger
2021-04-12 17:24 ` Tom Saeger
2021-04-12 17:41 ` Tom Saeger
2021-04-12 20:25 ` Derrick Stolee
2021-04-16 12:49 ` [PATCH v4 0/4] " Derrick Stolee via GitGitGadget
2021-04-16 12:49 ` [PATCH v4 1/4] maintenance: simplify prefetch logic Derrick Stolee via GitGitGadget
2021-04-16 18:02 ` Tom Saeger
2021-04-16 12:49 ` [PATCH v4 2/4] fetch: add --prefetch option Derrick Stolee via GitGitGadget
2021-04-16 17:52 ` Tom Saeger
2021-04-16 18:26 ` Tom Saeger
2021-04-16 12:49 ` [PATCH v4 3/4] maintenance: use 'git fetch --prefetch' Derrick Stolee via GitGitGadget
2021-04-16 12:49 ` [PATCH v4 4/4] maintenance: respect remote.*.skipFetchAll Derrick Stolee via GitGitGadget
2021-04-16 13:54 ` Ævar Arnfjörð Bjarmason
2021-04-16 14:33 ` Tom Saeger
2021-04-16 18:31 ` Tom Saeger
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=pull.924.v3.git.1618020225.gitgitgadget@gmail.com \
--to=gitgitgadget@gmail.com \
--cc=derrickstolee@github.com \
--cc=emilyshaffer@google.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=steadmon@google.com \
--cc=stolee@gmail.com \
--cc=sunshine@sunshineco.com \
--cc=tom.saeger@oracle.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).