All of lore.kernel.org
 help / color / mirror / Atom feed
From: Tom Saeger <tom.saeger@oracle.com>
To: Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, gitster@pobox.com, sunshine@sunshineco.com,
	Derrick Stolee <stolee@gmail.com>,
	Derrick Stolee <derrickstolee@github.com>,
	Derrick Stolee <dstolee@microsoft.com>
Subject: Re: [PATCH v2 5/5] maintenance: allow custom refspecs during prefetch
Date: Tue, 6 Apr 2021 14:36:37 -0500	[thread overview]
Message-ID: <20210406193637.x2xqkkyqbjabgzge@dhcp-10-154-113-61.vpn.oracle.com> (raw)
In-Reply-To: <9592224e3d428762c6f9b38a0bcc2cee5c3dff6c.1617734871.git.gitgitgadget@gmail.com>

On Tue, Apr 06, 2021 at 06:47:50PM +0000, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <dstolee@microsoft.com>
> 
> The prefetch task previously used the default refspec source plus a
> custom refspec destination to avoid colliding with remote refs:
> 
> 	+refs/heads/*:refs/prefetch/<remote>/*
> 
> However, some users customize their refspec to reduce how much data they
> download from specific remotes. This can involve restrictive patterns
> for fetching or negative patterns to avoid downloading some refs.
> 
> Modify fetch_remote() to iterate over the remote's refspec list and
> translate that into the appropriate prefetch scenario. Specifically,
> re-parse the raw form of the refspec into a new 'struct refspec' and
> modify the 'dst' member to replace a leading "refs/" substring with
> "refs/prefetch/", or prepend "refs/prefetch/" to 'dst' otherwise.
> Negative refspecs do not have a 'dst' so they can be transferred to the
> 'git fetch' command unmodified.
> 
> This prefix change provides the benefit of keeping whatever collisions
> may exist in the custom refspecs, if that is a desirable outcome.
> 
> This changes the names of the refs that would be fetched by the default
> refspec. Instead of "refs/prefetch/<remote>/<branch>" they will now go
> to "refs/prefetch/remotes/<remote>/<branch>". While this is a change, it
> is not a seriously breaking one: these refs are intended to be hidden
> and not used.
> 
> Update the documentation to be more generic about the destination refs.
> Do not mention custom refpecs explicitly, as that does not need to be
> highlighted in this documentation. The important part of placing refs in
> refs/prefetch remains.
> 
> Reported-by: Tom Saeger <tom.saeger@oracle.com>
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
>  Documentation/git-maintenance.txt |  3 +--
>  builtin/gc.c                      | 37 +++++++++++++++++++++++++-
>  t/t7900-maintenance.sh            | 43 ++++++++++++++++++++++++++-----
>  3 files changed, 74 insertions(+), 9 deletions(-)
> 
> diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt
> index 80ddd33ceba0..95a24264eb10 100644
> --- a/Documentation/git-maintenance.txt
> +++ b/Documentation/git-maintenance.txt
> @@ -94,8 +94,7 @@ prefetch::
>  	objects from all registered remotes. For each remote, a `git fetch`
>  	command is run. The refmap is custom to avoid updating local or remote
>  	branches (those in `refs/heads` or `refs/remotes`). Instead, the
> -	remote refs are stored in `refs/prefetch/<remote>/`. Also, tags are
> -	not updated.
> +	refs are stored in `refs/prefetch/`. Also, tags are not updated.
>  +
>  This is done to avoid disrupting the remote-tracking branches. The end users
>  expect these refs to stay unmoved unless they initiate a fetch.  With prefetch
> diff --git a/builtin/gc.c b/builtin/gc.c
> index fa8128de9ae1..76f347dd6b11 100644
> --- a/builtin/gc.c
> +++ b/builtin/gc.c
> @@ -32,6 +32,7 @@
>  #include "remote.h"
>  #include "object-store.h"
>  #include "exec-cmd.h"
> +#include "refspec.h"
>  
>  #define FAILED_RUN "failed to run %s"
>  
> @@ -877,6 +878,7 @@ static int fetch_remote(struct remote *remote, void *cbdata)
>  {
>  	struct maintenance_run_opts *opts = cbdata;
>  	struct child_process child = CHILD_PROCESS_INIT;
> +	int i;
>  
>  	child.git_cmd = 1;
>  	strvec_pushl(&child.args, "fetch", remote->name, "--prune", "--no-tags",
> @@ -886,7 +888,40 @@ static int fetch_remote(struct remote *remote, void *cbdata)
>  	if (opts->quiet)
>  		strvec_push(&child.args, "--quiet");
>  
> -	strvec_pushf(&child.args, "+refs/heads/*:refs/prefetch/%s/*", remote->name);
> +	for (i = 0; i < remote->fetch.nr; i++) {
> +		struct refspec_item replace;
> +		struct refspec_item *rsi = &remote->fetch.items[i];
> +		struct strbuf new_dst = STRBUF_INIT;
> +		size_t ignore_len = 0;
> +		char *replace_string;
> +
> +		if (rsi->negative) {
> +			strvec_push(&child.args, remote->fetch.raw[i]);
> +			continue;
> +		}
> +
> +		refspec_item_init(&replace, remote->fetch.raw[i], 1);
> +
> +		/*
> +		 * If a refspec dst starts with "refs/" at the start,
> +		 * then we will replace "refs/" with "refs/prefetch/".
> +		 * Otherwise, we will prepend the dst string with
> +		 * "refs/prefetch/".
> +		 */
> +		if (!strncmp(replace.dst, "refs/", 5))
> +			ignore_len = 5;
> +
> +		strbuf_addstr(&new_dst, "refs/prefetch/");
> +		strbuf_addstr(&new_dst, replace.dst + ignore_len);
> +		free(replace.dst);
> +		replace.dst = strbuf_detach(&new_dst, NULL);
> +
> +		replace_string = refspec_item_format(&replace);
> +		strvec_push(&child.args, replace_string);
> +		free(replace_string);
> +
> +		refspec_item_clear(&replace);
> +	}
>  
>  	return !!run_command(&child);
>  }

Junio brought up the point about configs which 'fetch' have no dst
https://lore.kernel.org/git/c06a198a-2043-27a2-cab3-3471190754cc@gmail.com/

    [remote "submaintainer1"]
        url = ... repository of submaintainer #1 ...
        fetch = master
        tagopt = --no-tags


This patch fixes segfault for config like above.
You might have ideas on a cleaner way to do this.
I did add `child_process_clear`.


--Tom

diff --git a/builtin/gc.c b/builtin/gc.c
index 76f347dd6b11..921266ee30a5 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -879,6 +879,7 @@ static int fetch_remote(struct remote *remote, void *cbdata)
        struct maintenance_run_opts *opts = cbdata;
        struct child_process child = CHILD_PROCESS_INIT;
        int i;
+       int nargs;

        child.git_cmd = 1;
        strvec_pushl(&child.args, "fetch", remote->name, "--prune", "--no-tags",
@@ -888,6 +889,8 @@ static int fetch_remote(struct remote *remote, void *cbdata)
        if (opts->quiet)
                strvec_push(&child.args, "--quiet");

+       nargs = child.args.nr;
+
        for (i = 0; i < remote->fetch.nr; i++) {
                struct refspec_item replace;
                struct refspec_item *rsi = &remote->fetch.items[i];
@@ -900,6 +903,10 @@ static int fetch_remote(struct remote *remote, void *cbdata)
                        continue;
                }

+               if (!rsi->dst) {
+                       continue;
+               }
+
                refspec_item_init(&replace, remote->fetch.raw[i], 1);

                /*
@@ -923,6 +930,12 @@ static int fetch_remote(struct remote *remote, void *cbdata)
                refspec_item_clear(&replace);
        }

+       /* skip remote if no refspecs to fetch */
+       if (child.args.nr - nargs <= 0) {
+               child_process_clear(&child);
+               return 0;
+       }

  reply	other threads:[~2021-04-06 19:36 UTC|newest]

Thread overview: 72+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-05 13:04 [PATCH 0/5] Maintenance: adapt custom refspecs Derrick Stolee via GitGitGadget
2021-04-05 13:04 ` [PATCH 1/5] maintenance: simplify prefetch logic Derrick Stolee via GitGitGadget
2021-04-05 17:01   ` Tom Saeger
2021-04-05 13:04 ` [PATCH 2/5] test-lib: use exact match for test_subcommand Derrick Stolee via GitGitGadget
2021-04-05 17:31   ` Eric Sunshine
2021-04-05 17:43     ` Junio C Hamano
2021-04-05 13:04 ` [PATCH 3/5] refspec: output a refspec item Derrick Stolee via GitGitGadget
2021-04-05 16:57   ` Tom Saeger
2021-04-05 17:40     ` Eric Sunshine
2021-04-05 17:44       ` Junio C Hamano
2021-04-06 11:21         ` Derrick Stolee
2021-04-06 15:23           ` Eric Sunshine
2021-04-06 16:51             ` Derrick Stolee
2021-04-07  8:46   ` Ævar Arnfjörð Bjarmason
2021-04-07 20:53     ` Derrick Stolee
2021-04-07 22:05       ` Ævar Arnfjörð Bjarmason
2021-04-07 22:49         ` Junio C Hamano
2021-04-07 23:01           ` Ævar Arnfjörð Bjarmason
2021-04-08  7:33             ` Junio C Hamano
2021-04-05 13:04 ` [PATCH 4/5] test-tool: test refspec input/output Derrick Stolee via GitGitGadget
2021-04-05 17:52   ` Eric Sunshine
2021-04-06 11:13     ` Derrick Stolee
2021-04-07  8:54   ` Ævar Arnfjörð Bjarmason
2021-04-05 13:04 ` [PATCH 5/5] maintenance: allow custom refspecs during prefetch Derrick Stolee via GitGitGadget
2021-04-05 17:16   ` Tom Saeger
2021-04-06 11:15     ` Derrick Stolee
2021-04-07  8:53   ` Ævar Arnfjörð Bjarmason
2021-04-07 10:26     ` Ævar Arnfjörð Bjarmason
2021-04-09 11:48       ` Derrick Stolee
2021-04-09 19:28         ` Ævar Arnfjörð Bjarmason
2021-04-10  0:56           ` Derrick Stolee
2021-04-10 11:37             ` Ævar Arnfjörð Bjarmason
2021-04-07 13:47   ` Ævar Arnfjörð Bjarmason
2021-04-06 18:47 ` [PATCH v2 0/5] Maintenance: adapt custom refspecs Derrick Stolee via GitGitGadget
2021-04-06 18:47   ` [PATCH v2 1/5] maintenance: simplify prefetch logic Derrick Stolee via GitGitGadget
2021-04-07 23:23     ` Emily Shaffer
2021-04-09 19:00       ` Derrick Stolee
2021-04-06 18:47   ` [PATCH v2 2/5] test-lib: use exact match for test_subcommand Derrick Stolee via GitGitGadget
2021-04-06 18:47   ` [PATCH v2 3/5] refspec: output a refspec item Derrick Stolee via GitGitGadget
2021-04-06 18:47   ` [PATCH v2 4/5] test-tool: test refspec input/output Derrick Stolee via GitGitGadget
2021-04-07 23:08     ` Josh Steadmon
2021-04-07 23:26     ` Emily Shaffer
2021-04-06 18:47   ` [PATCH v2 5/5] maintenance: allow custom refspecs during prefetch Derrick Stolee via GitGitGadget
2021-04-06 19:36     ` Tom Saeger [this message]
2021-04-06 19:45       ` Derrick Stolee
2021-04-07 23:09     ` Josh Steadmon
2021-04-07 23:37     ` Emily Shaffer
2021-04-08  0:23     ` Jonathan Tan
2021-04-10  2:03   ` [PATCH v3 0/3] Maintenance: adapt custom refspecs Derrick Stolee via GitGitGadget
2021-04-10  2:03     ` [PATCH v3 1/3] maintenance: simplify prefetch logic Derrick Stolee via GitGitGadget
2021-04-12 20:13       ` Tom Saeger
2021-04-12 20:27         ` Derrick Stolee
2021-04-10  2:03     ` [PATCH v3 2/3] fetch: add --prefetch option Derrick Stolee via GitGitGadget
2021-04-11 21:09       ` Ramsay Jones
2021-04-12 20:23         ` Derrick Stolee
2021-04-10  2:03     ` [PATCH v3 3/3] maintenance: use 'git fetch --prefetch' Derrick Stolee via GitGitGadget
2021-04-11  1:35     ` [PATCH v3 0/3] Maintenance: adapt custom refspecs Junio C Hamano
2021-04-12 16:48       ` Tom Saeger
2021-04-12 17:24         ` Tom Saeger
2021-04-12 17:41           ` Tom Saeger
2021-04-12 20:25             ` Derrick Stolee
2021-04-16 12:49     ` [PATCH v4 0/4] " Derrick Stolee via GitGitGadget
2021-04-16 12:49       ` [PATCH v4 1/4] maintenance: simplify prefetch logic Derrick Stolee via GitGitGadget
2021-04-16 18:02         ` Tom Saeger
2021-04-16 12:49       ` [PATCH v4 2/4] fetch: add --prefetch option Derrick Stolee via GitGitGadget
2021-04-16 17:52         ` Tom Saeger
2021-04-16 18:26           ` Tom Saeger
2021-04-16 12:49       ` [PATCH v4 3/4] maintenance: use 'git fetch --prefetch' Derrick Stolee via GitGitGadget
2021-04-16 12:49       ` [PATCH v4 4/4] maintenance: respect remote.*.skipFetchAll Derrick Stolee via GitGitGadget
2021-04-16 13:54         ` Ævar Arnfjörð Bjarmason
2021-04-16 14:33           ` Tom Saeger
2021-04-16 18:31         ` Tom Saeger

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210406193637.x2xqkkyqbjabgzge@dhcp-10-154-113-61.vpn.oracle.com \
    --to=tom.saeger@oracle.com \
    --cc=derrickstolee@github.com \
    --cc=dstolee@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=gitster@pobox.com \
    --cc=stolee@gmail.com \
    --cc=sunshine@sunshineco.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.