From: Taylor Blau <me@ttaylorr.com>
To: Jonathan Nieder <jrnieder@gmail.com>
Cc: Taylor Blau <me@ttaylorr.com>,
	git@vger.kernel.org, jonathantanmy@google.com, gitster@pobox.com,
	newren@gmail.com, Jay Conrod <jayconrod@google.com>,
	Derrick Stolee <stolee@gmail.com>
Subject: Re: [PATCH v2 2/2] shallow.c: use '{commit,rollback}_shallow_file'
Date: Wed, 3 Jun 2020 16:14:53 -0600
Message-ID: <20200603221453.GA36237@syl.local>
In-Reply-To: <20200603205151.GC253041@google.com>

On Wed, Jun 03, 2020 at 01:51:51PM -0700, Jonathan Nieder wrote:
> Taylor Blau wrote:
>
> > Ah, this only sort of has to do with the object cache. In
> > 'parse_commit_buffer()' we stop parsing parents in the case that the
> > repository is shallow (this goes back to 7f3140cd23 (git repack: keep
> > commits hidden by a graft, 2009-07-23)).
>
> Ah, good analysis.  (In fact, the behavior is older: it's from
> 5da5c8f4cf4 (Teach parse_commit_buffer about grafting., 2005-07-30).)
> So this is additional "cached" data that needs to be invalidated by
> reset_repository_shallow.
>
> So the question is, what other information falls into that category?
>
> [...]
> > --- a/shallow.c
> > +++ b/shallow.c
> > @@ -90,6 +90,9 @@ static void reset_repository_shallow(struct repository *r)
> >  {
> >  	r->parsed_objects->is_shallow = -1;
> >  	stat_validity_clear(r->parsed_objects->shallow_stat);
> > +
>
> (nit: the above two lines wouldn't be needed if r->parsed_objects is
> being thrown away.)

Right, thanks. I don't think that it matters since you point out a
legitimate issue with dangling references, but serves me right for
working on this so late at night ;-).

> > +	parsed_object_pool_clear(r->parsed_objects);
> > +	r->parsed_objects = parsed_object_pool_new();
> >  }
>
> Shallows don't affect the ref store.  They only affect object walks.
> So r->parsed_objects does seem like the only place that could be
> affected.
>
> That said, with this change I'd worry about use-after-free from any
> existing references to objects in the pool.
>
> Stepping back, what I think I would like to see is to *not* have
> grafts and shallow state affect the in-memory persisted parsed
> objects.  Instead, act as an overlay in accessors that traverse over
> them.
>
> Lacking that, I like the idea of a "dirty bit" that gets written as
> soon as we have started lying in the parsed object pool.  Something
> like this.  What do you think?
>
> diff --git i/commit-graph.c w/commit-graph.c
> index 2ff042fbf4f..84b49ce903b 100644
> --- i/commit-graph.c
> +++ w/commit-graph.c
> @@ -149,7 +149,8 @@ static int commit_graph_compatible(struct repository *r)
>  	}
>
>  	prepare_commit_graft(r);
> -	if (r->parsed_objects && r->parsed_objects->grafts_nr)
> +	if (r->parsed_objects &&
> +	    (r->parsed_objects->grafts_nr || r->parsed_objects->substituted_parent))

This is a little tricky. Why would we set substituted_parent without
also incrementing grafts_nr? That seems like the real bug here: if we
incremented grafts_nr, then we would return a non-zero value from
'commit_graph_compatible' and rightly stop even without this sticky-bit.

I don't quite understand this myself. If it's an oversight, it's a
remarkably long-lived one. Do you have a better sense of this?
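
To make the question concrete, here is a minimal sketch of the check as
proposed in the quoted hunk. It is a toy model, not Git's code: 'struct
toy_pool' and every 'toy_*' function are hypothetical stand-ins for the two
fields under discussion, and 'toy_forget_grafts()' models an assumed reset
that may have no counterpart in Git. The sketch only shows when the sticky
bit could say something that grafts_nr alone does not.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for the two fields of 'struct parsed_object_pool'
 * discussed above. This is not Git's real definition.
 */
struct toy_pool {
	int grafts_nr;          /* number of currently registered grafts */
	int substituted_parent; /* sticky: set once a parse applied a graft */
};

/*
 * Same shape as the proposed condition: report "incompatible" (0) when
 * grafts are registered or when any parsed commit had its parents replaced.
 */
int toy_commit_graph_compatible(const struct toy_pool *pool)
{
	if (pool && (pool->grafts_nr || pool->substituted_parent))
		return 0;
	return 1;
}

/* Models the parse step recording that a graft was applied to a commit. */
void toy_note_graft_used(struct toy_pool *pool)
{
	pool->substituted_parent = 1;
}

/*
 * Assumption: some reset drops the graft list while commits that were
 * already parsed keep their substituted parents.
 */
void toy_forget_grafts(struct toy_pool *pool)
{
	pool->grafts_nr = 0;
}
```

In this model, as long as grafts_nr stays non-zero the extra bit is
redundant, which is the point of the question above. The bit only changes
the answer if grafts_nr can fall back to zero after a graft has already been
applied to a parsed commit.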

>  		return 0;
>  	if (is_repository_shallow(r))
>  		return 0;
> diff --git i/commit.c w/commit.c
> index 87686a7055b..762f09e53ae 100644
> --- i/commit.c
> +++ w/commit.c
> @@ -423,6 +423,8 @@ int parse_commit_buffer(struct repository *r, struct commit *item, const void *b
>  	pptr = &item->parents;
>
>  	graft = lookup_commit_graft(r, &item->object.oid);
> +	if (graft)
> +		r->parsed_objects->substituted_parent = 1;
>  	while (bufptr + parent_entry_len < tail && !memcmp(bufptr, "parent ", 7)) {
>  		struct commit *new_parent;
>
> @@ -447,6 +449,7 @@ int parse_commit_buffer(struct repository *r, struct commit *item, const void *b
>  	if (graft) {
>  		int i;
>  		struct commit *new_parent;
> +

Nit: unnecessary whitespace change, but I doubt it really matters much.

>  		for (i = 0; i < graft->nr_parent; i++) {
>  			new_parent = lookup_commit(r,
>  						   &graft->parent[i]);
> diff --git i/object.h w/object.h
> index b22328b8383..db02fdcd6b2 100644
> --- i/object.h
> +++ w/object.h
> @@ -26,6 +26,7 @@ struct parsed_object_pool {
>  	char *alternate_shallow_file;
>
>  	int commit_graft_prepared;
> +	int substituted_parent;
>
>  	struct buffer_slab *buffer_slab;
>  };
>
> Thanks,
> Jonathan

Thanks,
Taylor
