git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Patrick Steinhardt <ps@pks.im>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org, "Jeff King" <peff@peff.net>,
	"Felipe Contreras" <felipe.contreras@gmail.com>,
	"SZEDER Gábor" <szeder.dev@gmail.com>,
	"Chris Torek" <chris.torek@gmail.com>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"Taylor Blau" <me@ttaylorr.com>
Subject: Re: [PATCH v3 4/4] revision: avoid hitting packfiles when commits are in commit-graph
Date: Tue, 3 Aug 2021 11:16:36 +0200	[thread overview]
Message-ID: <YQkJdDvRtyOPzszU@ncase> (raw)
In-Reply-To: <xmqqwnp3vcow.fsf@gitster.g>

[-- Attachment #1: Type: text/plain, Size: 3353 bytes --]

On Mon, Aug 02, 2021 at 01:01:03PM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
[snip]
> > +int find_object_in_graph(struct repository *repo, struct object *object)
> > +{
> > +	struct commit *commit;
> > +	uint32_t pos;
> > +
> > +	if (object->parsed) {
> > +		if (object->type != OBJ_COMMIT)
> > +			return -1;
> > +		return 0;
> 
> This is puzzling---at least it is not consistent with what the
> function name says ("please say if you find _this_ object in the
> commit-graph file"---if that is not what this function does, it
> needs a comment before the implementation).
> 
> The caller had object and we has already been parsed.  If the
> function were "with help from commit-graph, please tell me if you
> can positively say this is a commit", the above is understandable.
> If we know positively that it is not commit, we say "no, it is not a
> commit" (which may be suboptimal---if the caller falls back to
> another codepath, the object will still not be a commit) and if we
> know it is a commit, we can say "yes, it definitely is a commit" and
> the caller can stop there.
> 
> I guess my only problem with this function is that its name and what
> it does does not align.  If the caller uses it for the real purpose
> of the function I guessed, then the logic itself may be OK.

Fair point. The only caller for now only calls the function if the
object's type is unknown, so it really is "Resolve the commit if it is
one". I'll adjust the function's name.

> >  static int find_commit_in_graph(struct commit *item, struct commit_graph *g, uint32_t *pos)
> >  {
> >  	uint32_t graph_pos = commit_graph_position(item);
> > @@ -871,18 +913,7 @@ static int find_commit_in_graph(struct commit *item, struct commit_graph *g, uin
> >  		*pos = graph_pos;
> >  		return 1;
> >  	} else {
> > -		struct commit_graph *cur_g = g;
> > -		uint32_t lex_index;
> > -
> > -		while (cur_g && !bsearch_graph(cur_g, &(item->object.oid), &lex_index))
> > -			cur_g = cur_g->base_graph;
> > -
> > -		if (cur_g) {
> > -			*pos = lex_index + cur_g->num_commits_in_base;
> > -			return 1;
> > -		}
> > -
> > -		return 0;
> > +		return find_object_id_in_graph(&item->object.oid, g, pos);
> 
> And I think this one is a op-op refactoring that does not change the
> behaviour of find_commit_in_graph()?  It might be easier if done in
> a separate preparatory step, but it is small enough.

Will do.

One thing that occurred to me this morning after waking up is that this
commit changes semantics: if we're able to look up the commit via the
commit-graph, then we'll happily consider it to exist in the repository.
But given that we don't hit the object database at all anymore, it may
be that the commit-graph was out of date while the commit got
unreachable and thus pruned. So it may not even exist anymore in the
repository.

I wonder what our stance on this is. I can definitely understand the
angle that this would be a deal breaker given that we now claim commits
exist which don't anymore. On the other hand, we update commit-graphs
via git-gc(1), which makes this scenario a lot less likely nowadays. Is
there any precedent in our codebase where we treat commits part of the
commit-graph as existing? If not, do we want to make that assumption?

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

  reply	other threads:[~2021-08-03  9:16 UTC|newest]

Thread overview: 64+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-28  5:33 [PATCH v2 0/3] Speed up connectivity checks via bitmaps Patrick Steinhardt
2021-06-28  5:33 ` [PATCH v2 1/3] p5400: add perf tests for git-receive-pack(1) Patrick Steinhardt
2021-06-28  7:49   ` Ævar Arnfjörð Bjarmason
2021-06-29  6:18     ` Patrick Steinhardt
2021-06-29 12:09       ` Ævar Arnfjörð Bjarmason
2021-06-28  5:33 ` [PATCH v2 2/3] receive-pack: skip connectivity checks on delete-only commands Patrick Steinhardt
2021-06-28  8:00   ` Ævar Arnfjörð Bjarmason
2021-06-28  8:06     ` Ævar Arnfjörð Bjarmason
2021-06-29  6:26     ` Patrick Steinhardt
2021-06-30  1:31   ` Jeff King
2021-06-30  1:35     ` Jeff King
2021-06-30 13:52     ` Patrick Steinhardt
2021-06-28  5:33 ` [PATCH v2 3/3] connected: implement connectivity check using bitmaps Patrick Steinhardt
2021-06-28 20:23   ` Taylor Blau
2021-06-29 22:44     ` Taylor Blau
2021-06-30  2:04       ` Jeff King
2021-06-30  3:07         ` Taylor Blau
2021-06-30  5:45           ` Jeff King
2021-07-02 17:44             ` Taylor Blau
2021-07-02 21:21               ` Jeff King
2021-06-30  1:51   ` Jeff King
2021-07-20 14:26     ` Patrick Steinhardt
2021-08-02  9:37 ` [PATCH v3 0/4] Speed up connectivity checks Patrick Steinhardt
2021-08-02  9:38   ` [PATCH v3 1/4] connected: do not sort input revisions Patrick Steinhardt
2021-08-02 12:49     ` Ævar Arnfjörð Bjarmason
2021-08-03  8:50       ` Patrick Steinhardt
2021-08-04 11:01         ` Ævar Arnfjörð Bjarmason
2021-08-02 19:00     ` Junio C Hamano
2021-08-03  8:55       ` Patrick Steinhardt
2021-08-03 21:47         ` Junio C Hamano
2021-08-02  9:38   ` [PATCH v3 2/4] revision: stop retrieving reference twice Patrick Steinhardt
2021-08-02 12:53     ` Ævar Arnfjörð Bjarmason
2021-08-02  9:38   ` [PATCH v3 3/4] revision: avoid loading object headers multiple times Patrick Steinhardt
2021-08-02 12:55     ` Ævar Arnfjörð Bjarmason
2021-08-05 10:12       ` Patrick Steinhardt
2021-08-02 19:40     ` Junio C Hamano
2021-08-03  9:07       ` Patrick Steinhardt
2021-08-06 14:17         ` Patrick Steinhardt
2021-08-02  9:38   ` [PATCH v3 4/4] revision: avoid hitting packfiles when commits are in commit-graph Patrick Steinhardt
2021-08-02 20:01     ` Junio C Hamano
2021-08-03  9:16       ` Patrick Steinhardt [this message]
2021-08-03 21:56         ` Junio C Hamano
2021-08-05 11:01           ` Patrick Steinhardt
2021-08-05 16:16             ` Junio C Hamano
2021-08-04 10:51         ` Ævar Arnfjörð Bjarmason
2021-08-05 11:25   ` [PATCH v4 0/6] Speed up connectivity checks Patrick Steinhardt
2021-08-05 11:25     ` [PATCH v4 1/6] revision: separate walk and unsorted flags Patrick Steinhardt
2021-08-05 18:47       ` Junio C Hamano
2021-08-05 11:25     ` [PATCH v4 2/6] connected: do not sort input revisions Patrick Steinhardt
2021-08-05 18:44       ` Junio C Hamano
2021-08-06  6:00         ` Patrick Steinhardt
2021-08-06 16:50           ` Junio C Hamano
2021-08-05 11:25     ` [PATCH v4 3/6] revision: stop retrieving reference twice Patrick Steinhardt
2021-08-05 11:25     ` [PATCH v4 4/6] revision: avoid loading object headers multiple times Patrick Steinhardt
2021-08-05 11:25     ` [PATCH v4 5/6] commit-graph: split out function to search commit position Patrick Steinhardt
2021-08-05 11:25     ` [PATCH v4 6/6] revision: avoid hitting packfiles when commits are in commit-graph Patrick Steinhardt
2021-08-09  8:00 ` [PATCH v5 0/5] Speed up connectivity checks Patrick Steinhardt
2021-08-09  8:02   ` Patrick Steinhardt
2021-08-09  8:11 ` Patrick Steinhardt
2021-08-09  8:11   ` [PATCH v5 1/5] revision: separate walk and unsorted flags Patrick Steinhardt
2021-08-09  8:11   ` [PATCH v5 2/5] connected: do not sort input revisions Patrick Steinhardt
2021-08-09  8:11   ` [PATCH v5 3/5] revision: stop retrieving reference twice Patrick Steinhardt
2021-08-09  8:11   ` [PATCH v5 4/5] commit-graph: split out function to search commit position Patrick Steinhardt
2021-08-09  8:12   ` [PATCH v5 5/5] revision: avoid hitting packfiles when commits are in commit-graph Patrick Steinhardt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YQkJdDvRtyOPzszU@ncase \
    --to=ps@pks.im \
    --cc=avarab@gmail.com \
    --cc=chris.torek@gmail.com \
    --cc=felipe.contreras@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=me@ttaylorr.com \
    --cc=peff@peff.net \
    --cc=szeder.dev@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).