From: Taylor Blau <me@ttaylorr.com>
To: Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, Taylor Blau <me@ttaylorr.com>,
	Derrick Stolee <stolee@gmail.com>, Jeff King <peff@peff.net>,
	Derrick Stolee <derrickstolee@github.com>,
	Derrick Stolee <dstolee@microsoft.com>
Subject: Re: [PATCH v4 2/2] commit-graph: don't write commit-graph when disabled
Date: Fri, 9 Oct 2020 17:17:25 -0400
Message-ID: <20201009211725.GA450854@nand.local>
In-Reply-To: <4439e8ae8fdc9abf28df29d3038a1483d9084cf2.1602276832.git.gitgitgadget@gmail.com>

On Fri, Oct 09, 2020 at 08:53:52PM +0000, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <dstolee@microsoft.com>
>
> The core.commitGraph config setting can be set to 'false' to prevent
> parsing commits from the commit-graph file(s). This causes an issue when
> trying to write with "--split" which needs to distinguish between
> commits that are in the existing commit-graph layers and commits that
> are not. The existing mechanism uses parse_commit() and follows by
> checking if there is a 'graph_pos' that shows the commit was parsed from
> the commit-graph file.
>
> When core.commitGraph=false, we do not parse the commits from the
> commit-graph and 'graph_pos' indicates that no commits are in the
> existing file. The --split logic moves forward creating a new layer on
> top that holds all reachable commits, then possibly merges down into
> those layers, resulting in duplicate commits. The previous change makes
> that merging process more robust to such a situation in case it happens
> in the written commit-graph data.

You're noting something interesting here, which is that I actually think
setting 'core.commitGraph' to false _would_ be OK for non-split writes,
and for '--split=replace' (along with any other split that happens to
write a single layer).

But, I think that actually enforcing that rule (i.e., "if you have
core.commitGraph set to false, you can't run `git commit-graph write`
except in X Y Z certain situations") is overly-complex and confusing to
users. So, I like what you have here a lot.
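
To illustrate why such a rule would be awkward to enforce, here is a
minimal standalone sketch of the predicate it implies. This is not Git's
code: 'enum split_mode', 'write_safe_when_disabled()' and the
'layers_after_write' parameter are all hypothetical names made up for
this example.

```c
/* Hypothetical model of the write modes discussed above. */
enum split_mode {
	SPLIT_NONE,     /* plain, non-split write */
	SPLIT_DEFAULT,  /* --split */
	SPLIT_NO_MERGE, /* --split=no-merge */
	SPLIT_REPLACE   /* --split=replace */
};

/*
 * Returns 1 if, under the rule sketched above, a write with
 * core.commitGraph=false could not leave duplicate commits across layers.
 * 'layers_after_write' is an assumption of this model: the number of
 * layers the chain would hold once the write finished.
 */
int write_safe_when_disabled(enum split_mode mode, int layers_after_write)
{
	if (mode == SPLIT_NONE || mode == SPLIT_REPLACE)
		return 1;
	/* Any other split is only safe if it ends up as a single layer. */
	return layers_after_write == 1;
}
```

Note that the last case depends on the outcome of the write itself, which
is part of what would make the rule hard to explain to users up front.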

> The easy answer here is to avoid writing a commit-graph if reading the
> commit-graph is disabled, since the resulting commit-graph would not be
> read by subsequent Git processes. This is more natural than forcing
> core.commitGraph to be true for the 'write' process.
>
> Reported-by: Thomas Braun <thomas.braun@virtuell-zuhause.de>
> Helped-by: Jeff King <peff@peff.net>
> Helped-by: Taylor Blau <me@ttaylorr.com>
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>

You can add my:

  Signed-off-by: Taylor Blau <me@ttaylorr.com>

to the patch below, too, unless you want to take my suggestion below...

> ---
>  Documentation/git-commit-graph.txt | 4 +++-
>  commit-graph.c                     | 5 +++++
>  t/t5324-split-commit-graph.sh      | 3 ++-
>  3 files changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/git-commit-graph.txt b/Documentation/git-commit-graph.txt
> index de6b6de230..e1f48c95b3 100644
> --- a/Documentation/git-commit-graph.txt
> +++ b/Documentation/git-commit-graph.txt
> @@ -39,7 +39,9 @@ COMMANDS
>  --------
>  'write'::
>
> -Write a commit-graph file based on the commits found in packfiles.
> +Write a commit-graph file based on the commits found in packfiles. If
> +the config option `core.commitGraph` is disabled, then this command will
> +output a warning, then return success without writing a commit-graph file.
>  +
>  With the `--stdin-packs` option, generate the new commit graph by
>  walking objects only in the specified pack-indexes. (Cannot be combined
> diff --git a/commit-graph.c b/commit-graph.c
> index 0280dcb2ce..6f62a07313 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -2160,6 +2160,11 @@ int write_commit_graph(struct object_directory *odb,
>  	int replace = 0;
>  	struct bloom_filter_settings bloom_settings = DEFAULT_BLOOM_FILTER_SETTINGS;
>
> +	prepare_repo_settings(the_repository);
> +	if (!the_repository->settings.core_commit_graph) {
> +		warning(_("attempting to write a commit-graph, but 'core.commitGraph' is disabled"));
> +		return 0;
> +	}

Should this check be folded into 'commit_graph_compatible()'? Maybe in
'prepare_commit_graph()' which itself calls 'commit_graph_compatible()'?
I admit that I find this chain of callers to be confusing.

I suppose one argument for checking it here _before_ calling
'commit_graph_compatible()' is that it allows you to issue a specific
warning before returning from this function, so I'm OK with it.

I also don't have a concrete suggestion of where a better place for this
hunk might be, so I'm fine with what you wrote.
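
The ordering argument can be shown with a small standalone model. This is
a sketch only, not the code in commit-graph.c: 'struct settings_model',
'compatible_model()', 'write_model()' and the result enum are
hypothetical, and the single 'incompatible' flag stands in for whatever
the real compatibility check rejects.

```c
#include <stdio.h>

/* Hypothetical stand-in for the repository settings the writer consults. */
struct settings_model {
	int core_commit_graph; /* models core.commitGraph */
	int incompatible;      /* models anything the generic check rejects */
};

enum write_result {
	WRITE_DONE,
	WRITE_SKIPPED_DISABLED,
	WRITE_SKIPPED_INCOMPATIBLE
};

/* Generic yes/no check: it cannot tell the caller why it refused. */
static int compatible_model(const struct settings_model *s)
{
	return !s->incompatible;
}

enum write_result write_model(const struct settings_model *s)
{
	/* Checked first so that this case gets its own warning. */
	if (!s->core_commit_graph) {
		fprintf(stderr, "warning: attempting to write a commit-graph, "
				"but 'core.commitGraph' is disabled\n");
		return WRITE_SKIPPED_DISABLED;
	}
	/* The generic check stays silent, as in the quoted hunk. */
	if (!compatible_model(s))
		return WRITE_SKIPPED_INCOMPATIBLE;
	return WRITE_DONE;
}
```

Folding the config test into the generic check would collapse the two
"skipped" results into one, and the specific warning would have no
natural place to live.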

>  	if (!commit_graph_compatible(the_repository))
>  		return 0;
>
> diff --git a/t/t5324-split-commit-graph.sh b/t/t5324-split-commit-graph.sh
> index a314ce0368..4d3842b83b 100755
> --- a/t/t5324-split-commit-graph.sh
> +++ b/t/t5324-split-commit-graph.sh
> @@ -442,8 +442,9 @@ test_expect_success '--split=replace with partial Bloom data' '
>
>  test_expect_success 'prevent regression for duplicate commits across layers' '
>  	git init dup &&
> -	git -C dup config core.commitGraph false &&
>  	git -C dup commit --allow-empty -m one &&
> +	git -C dup -c core.commitGraph=false commit-graph write --split=no-merge --reachable 2>err &&
> +	test_i18ngrep "attempting to write a commit-graph" err &&
>  	git -C dup commit-graph write --split=no-merge --reachable &&
>  	git -C dup commit --allow-empty -m two &&
>  	git -C dup commit-graph write --split=no-merge --reachable &&

Hmm. I would have preferred to see a new test here. Unless I'm wrong, I
believe the patched version of this test _doesn't_ have a duplicate
commit across multiple layers:

  - We try to write a layer with 'one', but don't (because
    'core.commitGraph' is set to false).

  - Then we write a layer for 'one' with 'core.commitGraph' unset.

  - Then we write a layer for 'two' (and only 'two'), since we read the
    below layer containing 'one'.

But, I'm not sure of a better way to test this, either. You fixed the
bug that this is trying to exercise, so it's no longer being exercised
here, but then again neither is the new code that is supposed to handle
it.
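
The trace above can be checked with a toy model of a layered chain. This
is a sketch under stated assumptions, not Git's implementation: commits
are plain integers, 'write_layer()' and 'count_duplicates()' are
hypothetical helpers, 'reading_enabled' models core.commitGraph, and
'skip_when_disabled' models whether the new early return is present.

```c
#include <assert.h>

#define MAX_LAYERS 8
#define MAX_COMMITS 16

struct layer { int commits[MAX_COMMITS]; int nr; };
struct chain { struct layer layers[MAX_LAYERS]; int nr; };

static int layer_contains(const struct layer *l, int commit)
{
	for (int i = 0; i < l->nr; i++)
		if (l->commits[i] == commit)
			return 1;
	return 0;
}

static int chain_contains(const struct chain *c, int commit)
{
	for (int i = 0; i < c->nr; i++)
		if (layer_contains(&c->layers[i], commit))
			return 1;
	return 0;
}

/*
 * Models `commit-graph write --split=no-merge --reachable`.
 * Returns the number of commits put into the new layer, or -1 if the
 * write was skipped because reading is disabled.
 */
int write_layer(struct chain *c, const int *reachable, int nr,
		int reading_enabled, int skip_when_disabled)
{
	struct layer *l;

	if (!reading_enabled && skip_when_disabled)
		return -1;
	assert(c->nr < MAX_LAYERS && nr <= MAX_COMMITS);

	l = &c->layers[c->nr];
	l->nr = 0;
	for (int i = 0; i < nr; i++) {
		/* With reading disabled, existing layers are invisible. */
		if (reading_enabled && chain_contains(c, reachable[i]))
			continue;
		l->commits[l->nr++] = reachable[i];
	}
	if (l->nr > 0)
		c->nr++;
	return l->nr;
}

/* Counts commits that also appear in some lower layer. */
int count_duplicates(const struct chain *c)
{
	int dups = 0;

	for (int i = 1; i < c->nr; i++)
		for (int j = 0; j < c->layers[i].nr; j++)
			for (int k = 0; k < i; k++)
				if (layer_contains(&c->layers[k],
						   c->layers[i].commits[j])) {
					dups++;
					break;
				}
	return dups;
}
```

In this model, the old test sequence (reading disabled, no early return,
write after 'one' and again after 'two') leaves 'one' in both layers,
while the patched sequence skips the first write and ends with two
layers and no duplicate.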

I wonder if it is maybe worth having some sample commit-graphs lying
around in a t5324 directory that _would_ demonstrate this problem. OTOH,
maybe that is just me being overly pedantic and worrying about something
that isn't actually a problem.

I trust your judgement, so whatever you feel like is fine with me.

Thanks,
Taylor
