All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jeff King <peff@peff.net>
To: Taylor Blau <me@ttaylorr.com>
Cc: git@vger.kernel.org, dstolee@microsoft.com, gitster@pobox.com,
	szeder.dev@gmail.com
Subject: Re: [PATCH 8/8] commit-graph: drop COMMIT_GRAPH_WRITE_CHECK_OIDS flag
Date: Thu, 7 May 2020 16:40:05 -0400	[thread overview]
Message-ID: <20200507204005.GE29683@coredump.intra.peff.net> (raw)
In-Reply-To: <6c2d130b0cd4b6a8a541b362ae7dd44d4c282e3f.1588641176.git.me@ttaylorr.com>

On Mon, May 04, 2020 at 07:14:03PM -0600, Taylor Blau wrote:

> If callers do wish to retain this behavior, they can easily work around
> this change by doing the following:
> 
>     git for-each-ref --format='%(objectname) %(objecttype) %(*objecttype)' |
>     awk '/commit/ { print $1 }' |
>     git commit-graph write --stdin-commits

I know this came from my earlier email, but I think that recipe actually
shows how to make your input work even if --check-oids were the default.
If you really want the --check-oids behavior, you'd want the opposite:
to complain about those ones. So it's something like:

     git for-each-ref --format='%(objectname) %(objecttype) %(*objecttype)' |
     awk '
       !/commit/ { print "not-a-commit:"$1 }
        /commit/ { print $1 }
     ' |
     git commit-graph write --stdin-commits

> diff --git a/Documentation/git-commit-graph.txt b/Documentation/git-commit-graph.txt
> index 53a650225a..fcac7d12e1 100644
> --- a/Documentation/git-commit-graph.txt
> +++ b/Documentation/git-commit-graph.txt
> @@ -47,8 +47,10 @@ with `--stdin-commits` or `--reachable`.)
>  +
>  With the `--stdin-commits` option, generate the new commit graph by
>  walking commits starting at the commits specified in stdin as a list
> -of OIDs in hex, one OID per line. (Cannot be combined with
> -`--stdin-packs` or `--reachable`.)
> +of OIDs in hex, one OID per line. OIDs that resolve to non-commits
> +(either directly, or by peeling tags) are silently ignored. OIDs that
> +are malformed, or do not exist generate an error. (Cannot be combined
> +with `--stdin-packs` or `--reachable`.)

Yeah, I think these semantics are good.

> diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
> index 9eec68572f..3637d079fb 100644
> --- a/builtin/commit-graph.c
> +++ b/builtin/commit-graph.c
> @@ -153,13 +153,14 @@ static int read_one_commit(struct oidset *commits, struct progress *progress,
>  
>  	display_progress(progress, oidset_size(commits) + 1);
>  
> +	if (oid_object_info(the_repository, &oid, NULL) < 0) {
> +		error(_("object %s does not exist"), hash);
> +		return 1;
> +	}
> +
>  	result = lookup_commit_reference_gently(the_repository, &oid, 1);
>  	if (result)
>  		oidset_insert(commits, &result->object.oid);
> -	else {
> -		error(_("invalid commit object id: %s"), hash);
> -		return 1;
> -	}
>  	return 0;
>  }

We can avoid the object-existence check entirely if
lookup_commit_reference_gently() gives us an answer. And we'd expect
that to be the common path.

Also, using has_object_file() is cheaper than oid_object_info(), since
it doesn't have to resolve the type for deltas.

So perhaps:

  result = lookup_commit_reference_gently(...);
  if (result)
          oidset_insert(...);
  else if (has_object_file(&oid))
          ; /* not a commit; quietly ignore;
  else
          return error(no such object...);

That said, I think this technique misses some cases of corruption.
You're checking that the outer-most object exists, but not any
intermediate peeled objects. I.e., lookup_commit_reference_gently()
might have failed for two reasons:

  - an object it peeled to didn't exist

  - an object it peeled to wasn't a commit

To do it thoroughly, I think you'd have to call deref_tag() yourself and
distinguish NULL there (an error) from a result where obj->type isn't
OBJ_COMMIT (quietly ignore).

>  enum commit_graph_write_flags {
> -	COMMIT_GRAPH_WRITE_APPEND     = (1 << 0),
> -	COMMIT_GRAPH_WRITE_PROGRESS   = (1 << 1),
> -	COMMIT_GRAPH_WRITE_SPLIT      = (1 << 2),
> -	/* Make sure that each OID in the input is a valid commit OID. */
> -	COMMIT_GRAPH_WRITE_CHECK_OIDS = (1 << 3),
> -	COMMIT_GRAPH_WRITE_BLOOM_FILTERS = (1 << 4),
> +	COMMIT_GRAPH_WRITE_APPEND        = (1 << 0),
> +	COMMIT_GRAPH_WRITE_PROGRESS      = (1 << 1),
> +	COMMIT_GRAPH_WRITE_SPLIT         = (1 << 2),
> +	COMMIT_GRAPH_WRITE_BLOOM_FILTERS = (1 << 3)

As much as I love looking at matched-indentation lists, I think this
diff is a good example of why it's not worth doing. It's much easier to
see what's going on if the first three items aren't touched. I'd
actually even leave BLOOM_FILTERS where it is, and accept the hole which
could be refilled later.

Your patch also loses the trailing comma after the final BLOOM_FILTERS
entry.

-Peff

  parent reply	other threads:[~2020-05-07 20:40 UTC|newest]

Thread overview: 59+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-05  1:13 [PATCH 0/8] commit-graph: drop CHECK_OIDS, peel in callers Taylor Blau
2020-05-05  1:13 ` [PATCH 1/8] commit-graph.c: extract 'refs_cb_data' Taylor Blau
2020-05-05  1:13 ` [PATCH 2/8] commit-graph.c: show progress of finding reachable commits Taylor Blau
2020-05-05 11:50   ` Derrick Stolee
2020-05-05 16:13     ` Taylor Blau
2020-05-05  1:13 ` [PATCH 3/8] commit-graph.c: peel refs in 'add_ref_to_set' Taylor Blau
2020-05-07 19:54   ` Jeff King
2020-05-13 19:48     ` Taylor Blau
2020-05-13 20:17       ` Jeff King
2020-05-05  1:13 ` [PATCH 4/8] builtin/commit-graph.c: extract 'read_one_commit()' Taylor Blau
2020-05-07 20:03   ` Jeff King
2020-05-13 20:01     ` Taylor Blau
2020-05-05  1:13 ` [PATCH 5/8] builtin/commit-graph.c: dereference tags in builtin Taylor Blau
2020-05-05 12:01   ` Derrick Stolee
2020-05-05 16:14     ` Taylor Blau
2020-05-07 20:14   ` Jeff King
2020-05-05  1:13 ` [PATCH 6/8] commit-graph.c: simplify 'fill_oids_from_commits' Taylor Blau
2020-05-05 12:05   ` Derrick Stolee
2020-05-07 20:21   ` Jeff King
2020-05-05  1:14 ` [PATCH 7/8] t5318: reorder test below 'graph_read_expect' Taylor Blau
2020-05-05  1:14 ` [PATCH 8/8] commit-graph: drop COMMIT_GRAPH_WRITE_CHECK_OIDS flag Taylor Blau
2020-05-05 12:10   ` Derrick Stolee
2020-05-05 16:16     ` Taylor Blau
2020-05-07 20:40   ` Jeff King [this message]
2020-05-13 21:32     ` Taylor Blau
2020-05-05 11:35 ` [PATCH 0/8] commit-graph: drop CHECK_OIDS, peel in callers Derrick Stolee
2020-05-05 16:11   ` Taylor Blau
2020-05-06  0:07 ` [PATCH v2 " Taylor Blau
2020-05-06  0:07   ` [PATCH v2 1/8] commit-graph.c: extract 'refs_cb_data' Taylor Blau
2020-05-06  0:07   ` [PATCH v2 2/8] commit-graph.c: show progress of finding reachable commits Taylor Blau
2020-05-06  0:07   ` [PATCH v2 3/8] commit-graph.c: peel refs in 'add_ref_to_set' Taylor Blau
2020-05-06  0:07   ` [PATCH v2 4/8] builtin/commit-graph.c: extract 'read_one_commit()' Taylor Blau
2020-05-06  0:07   ` [PATCH v2 5/8] builtin/commit-graph.c: dereference tags in builtin Taylor Blau
2020-05-06  0:07   ` [PATCH v2 6/8] commit-graph.c: simplify 'fill_oids_from_commits' Taylor Blau
2020-05-06  0:07   ` [PATCH v2 7/8] t5318: reorder test below 'graph_read_expect' Taylor Blau
2020-05-06  0:07   ` [PATCH v2 8/8] commit-graph: drop COMMIT_GRAPH_WRITE_CHECK_OIDS flag Taylor Blau
2020-05-07 20:42   ` [PATCH v2 0/8] commit-graph: drop CHECK_OIDS, peel in callers Jeff King
2020-05-13 21:59 ` [PATCH v3 " Taylor Blau
2020-05-13 21:59   ` [PATCH v3 1/8] commit-graph.c: extract 'refs_cb_data' Taylor Blau
2020-05-13 21:59   ` [PATCH v3 2/8] commit-graph.c: show progress of finding reachable commits Taylor Blau
2020-05-13 21:59   ` [PATCH v3 3/8] commit-graph.c: peel refs in 'add_ref_to_set' Taylor Blau
2020-05-13 21:59   ` [PATCH v3 4/8] builtin/commit-graph.c: extract 'read_one_commit()' Taylor Blau
2020-05-14 17:56     ` Jeff King
2020-05-14 18:02       ` Taylor Blau
2020-05-14 18:27         ` Junio C Hamano
2020-05-18 19:27           ` Taylor Blau
2020-05-13 21:59   ` [PATCH v3 5/8] builtin/commit-graph.c: dereference tags in builtin Taylor Blau
2020-05-14 18:01     ` Jeff King
2020-05-14 18:04       ` Taylor Blau
2020-07-10 19:02     ` [PATCH] commit-graph: fix "Collecting commits from input" progress line SZEDER Gábor
2020-07-10 19:17       ` Taylor Blau
2020-07-15 18:33       ` SZEDER Gábor
2020-07-15 18:43         ` Derrick Stolee
2020-07-15 18:58           ` Junio C Hamano
2020-05-13 21:59   ` [PATCH v3 6/8] commit-graph.c: simplify 'fill_oids_from_commits' Taylor Blau
2020-05-13 21:59   ` [PATCH v3 7/8] t5318: reorder test below 'graph_read_expect' Taylor Blau
2020-05-13 21:59   ` [PATCH v3 8/8] commit-graph: drop COMMIT_GRAPH_WRITE_CHECK_OIDS flag Taylor Blau
2020-05-14 18:09     ` Jeff King
2020-05-14 18:12   ` [PATCH v3 0/8] commit-graph: drop CHECK_OIDS, peel in callers Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200507204005.GE29683@coredump.intra.peff.net \
    --to=peff@peff.net \
    --cc=dstolee@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=me@ttaylorr.com \
    --cc=szeder.dev@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.