git.vger.kernel.org archive mirror
From: Tom Saeger <tom.saeger@oracle.com>
To: Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, gitster@pobox.com, sunshine@sunshineco.com,
	Derrick Stolee <stolee@gmail.com>,
	Derrick Stolee <derrickstolee@github.com>,
	Derrick Stolee <dstolee@microsoft.com>
Subject: Re: [PATCH v2 5/5] maintenance: allow custom refspecs during prefetch
Date: Tue, 6 Apr 2021 14:36:37 -0500
Message-ID: <20210406193637.x2xqkkyqbjabgzge@dhcp-10-154-113-61.vpn.oracle.com>
In-Reply-To: <9592224e3d428762c6f9b38a0bcc2cee5c3dff6c.1617734871.git.gitgitgadget@gmail.com>

On Tue, Apr 06, 2021 at 06:47:50PM +0000, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <dstolee@microsoft.com>
> 
> The prefetch task previously used the default refspec source plus a
> custom refspec destination to avoid colliding with remote refs:
> 
> 	+refs/heads/*:refs/prefetch/<remote>/*
> 
> However, some users customize their refspec to reduce how much data they
> download from specific remotes. This can involve restrictive patterns
> for fetching or negative patterns to avoid downloading some refs.
> 
> Modify fetch_remote() to iterate over the remote's refspec list and
> translate that into the appropriate prefetch scenario. Specifically,
> re-parse the raw form of the refspec into a new 'struct refspec' and
> modify the 'dst' member to replace a leading "refs/" substring with
> "refs/prefetch/", or prepend "refs/prefetch/" to 'dst' otherwise.
> Negative refspecs do not have a 'dst' so they can be transferred to the
> 'git fetch' command unmodified.
> 
> This prefix change provides the benefit of keeping whatever collisions
> may exist in the custom refspecs, if that is a desirable outcome.
> 
> This changes the names of the refs that would be fetched by the default
> refspec. Instead of "refs/prefetch/<remote>/<branch>" they will now go
> to "refs/prefetch/remotes/<remote>/<branch>". While this is a change, it
> is not a seriously breaking one: these refs are intended to be hidden
> and not used.
> 
> Update the documentation to be more generic about the destination refs.
> Do not mention custom refspecs explicitly, as that does not need to be
> highlighted in this documentation. The important part of placing refs in
> refs/prefetch remains.
> 
> Reported-by: Tom Saeger <tom.saeger@oracle.com>
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
>  Documentation/git-maintenance.txt |  3 +--
>  builtin/gc.c                      | 37 +++++++++++++++++++++++++-
>  t/t7900-maintenance.sh            | 43 ++++++++++++++++++++++++++-----
>  3 files changed, 74 insertions(+), 9 deletions(-)
> 
> diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt
> index 80ddd33ceba0..95a24264eb10 100644
> --- a/Documentation/git-maintenance.txt
> +++ b/Documentation/git-maintenance.txt
> @@ -94,8 +94,7 @@ prefetch::
>  	objects from all registered remotes. For each remote, a `git fetch`
>  	command is run. The refmap is custom to avoid updating local or remote
>  	branches (those in `refs/heads` or `refs/remotes`). Instead, the
> -	remote refs are stored in `refs/prefetch/<remote>/`. Also, tags are
> -	not updated.
> +	refs are stored in `refs/prefetch/`. Also, tags are not updated.
>  +
>  This is done to avoid disrupting the remote-tracking branches. The end users
>  expect these refs to stay unmoved unless they initiate a fetch.  With prefetch
> diff --git a/builtin/gc.c b/builtin/gc.c
> index fa8128de9ae1..76f347dd6b11 100644
> --- a/builtin/gc.c
> +++ b/builtin/gc.c
> @@ -32,6 +32,7 @@
>  #include "remote.h"
>  #include "object-store.h"
>  #include "exec-cmd.h"
> +#include "refspec.h"
>  
>  #define FAILED_RUN "failed to run %s"
>  
> @@ -877,6 +878,7 @@ static int fetch_remote(struct remote *remote, void *cbdata)
>  {
>  	struct maintenance_run_opts *opts = cbdata;
>  	struct child_process child = CHILD_PROCESS_INIT;
> +	int i;
>  
>  	child.git_cmd = 1;
>  	strvec_pushl(&child.args, "fetch", remote->name, "--prune", "--no-tags",
> @@ -886,7 +888,40 @@ static int fetch_remote(struct remote *remote, void *cbdata)
>  	if (opts->quiet)
>  		strvec_push(&child.args, "--quiet");
>  
> -	strvec_pushf(&child.args, "+refs/heads/*:refs/prefetch/%s/*", remote->name);
> +	for (i = 0; i < remote->fetch.nr; i++) {
> +		struct refspec_item replace;
> +		struct refspec_item *rsi = &remote->fetch.items[i];
> +		struct strbuf new_dst = STRBUF_INIT;
> +		size_t ignore_len = 0;
> +		char *replace_string;
> +
> +		if (rsi->negative) {
> +			strvec_push(&child.args, remote->fetch.raw[i]);
> +			continue;
> +		}
> +
> +		refspec_item_init(&replace, remote->fetch.raw[i], 1);
> +
> +		/*
> +		 * If a refspec dst starts with "refs/" at the start,
> +		 * then we will replace "refs/" with "refs/prefetch/".
> +		 * Otherwise, we will prepend the dst string with
> +		 * "refs/prefetch/".
> +		 */
> +		if (!strncmp(replace.dst, "refs/", 5))
> +			ignore_len = 5;
> +
> +		strbuf_addstr(&new_dst, "refs/prefetch/");
> +		strbuf_addstr(&new_dst, replace.dst + ignore_len);
> +		free(replace.dst);
> +		replace.dst = strbuf_detach(&new_dst, NULL);
> +
> +		replace_string = refspec_item_format(&replace);
> +		strvec_push(&child.args, replace_string);
> +		free(replace_string);
> +
> +		refspec_item_clear(&replace);
> +	}
>  
>  	return !!run_command(&child);
>  }

Junio brought up the point about configs whose 'fetch' entries have no dst:
https://lore.kernel.org/git/c06a198a-2043-27a2-cab3-3471190754cc@gmail.com/

    [remote "submaintainer1"]
        url = ... repository of submaintainer #1 ...
        fetch = master
        tagopt = --no-tags


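This patch fixes the segfault for a config like the one above: such a
refspec has no dst, so replace.dst is NULL when it reaches strncmp().
You might have ideas on a cleaner way to do this.
I also added a `child_process_clear` call for the case where no
refspecs are left to fetch.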
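
To illustrate the dst handling outside of git, here is a minimal
standalone sketch. `prefetch_dst` is a hypothetical helper, not a git
function; it uses plain libc instead of strbuf and only mirrors the
prefix rule from the quoted patch, plus the NULL check that a dst-less
refspec such as "fetch = master" needs.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of git): map a refspec destination
 * into the refs/prefetch/ namespace.
 *
 * Returns a malloc'd string that the caller must free, or NULL when
 * there is no destination to rewrite (or allocation fails).
 */
static char *prefetch_dst(const char *dst)
{
	static const char prefix[] = "refs/prefetch/";
	size_t skip = 0;
	char *out;

	/* A refspec like "master" has no dst; do not dereference it. */
	if (!dst)
		return NULL;

	/* Replace a leading "refs/"; otherwise just prepend the prefix. */
	if (!strncmp(dst, "refs/", 5))
		skip = 5;

	out = malloc(strlen(prefix) + strlen(dst + skip) + 1);
	if (!out)
		return NULL;

	strcpy(out, prefix);
	strcat(out, dst + skip);
	return out;
}
```

With this sketch, "refs/remotes/origin/*" maps to
"refs/prefetch/remotes/origin/*", a dst without the "refs/" prefix gets
"refs/prefetch/" prepended, and a NULL dst is skipped by the caller
instead of being passed to strncmp().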


--Tom

diff --git a/builtin/gc.c b/builtin/gc.c
index 76f347dd6b11..921266ee30a5 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -879,6 +879,7 @@ static int fetch_remote(struct remote *remote, void *cbdata)
        struct maintenance_run_opts *opts = cbdata;
        struct child_process child = CHILD_PROCESS_INIT;
        int i;
+       int nargs;

        child.git_cmd = 1;
        strvec_pushl(&child.args, "fetch", remote->name, "--prune", "--no-tags",
@@ -888,6 +889,8 @@ static int fetch_remote(struct remote *remote, void *cbdata)
        if (opts->quiet)
                strvec_push(&child.args, "--quiet");

+       nargs = child.args.nr;
+
        for (i = 0; i < remote->fetch.nr; i++) {
                struct refspec_item replace;
                struct refspec_item *rsi = &remote->fetch.items[i];
@@ -900,6 +903,10 @@ static int fetch_remote(struct remote *remote, void *cbdata)
                        continue;
                }

+               if (!rsi->dst) {
+                       continue;
+               }
+
                refspec_item_init(&replace, remote->fetch.raw[i], 1);

                /*
@@ -923,6 +930,12 @@ static int fetch_remote(struct remote *remote, void *cbdata)
                refspec_item_clear(&replace);
        }

+       /* skip remote if no refspecs to fetch */
+       if (child.args.nr - nargs <= 0) {
+               child_process_clear(&child);
+               return 0;
+       }
