From: Taylor Blau <me@ttaylorr.com>
To: Jonathan Nieder <jrnieder@gmail.com>
Cc: Taylor Blau <me@ttaylorr.com>,
git@vger.kernel.org, jonathantanmy@google.com, gitster@pobox.com,
newren@gmail.com, Jay Conrod <jayconrod@google.com>,
Derrick Stolee <stolee@gmail.com>
Subject: Re: [PATCH v2 2/2] shallow.c: use '{commit,rollback}_shallow_file'
Date: Wed, 3 Jun 2020 16:14:53 -0600 [thread overview]
Message-ID: <20200603221453.GA36237@syl.local> (raw)
In-Reply-To: <20200603205151.GC253041@google.com>
On Wed, Jun 03, 2020 at 01:51:51PM -0700, Jonathan Nieder wrote:
> Taylor Blau wrote:
>
> > Ah, this only sort of has to do with the object cache. In
> > 'parse_commit_buffer()' we stop parsing parents in the case that the
> > repository is shallow (this goes back to 7f3140cd23 (git repack: keep
> > commits hidden by a graft, 2009-07-23)).
>
> Ah, good analysis. (In fact, the behavior is older: it's from
> 5da5c8f4cf4 (Teach parse_commit_buffer about grafting., 2005-07-30).)
> So this is additional "cached" data that needs to be invalidated by
> reset_repository_shallow.
>
> So the question is, what other information falls into that category?
>
> [...]
> > --- a/shallow.c
> > +++ b/shallow.c
> > @@ -90,6 +90,9 @@ static void reset_repository_shallow(struct repository *r)
> > {
> > r->parsed_objects->is_shallow = -1;
> > stat_validity_clear(r->parsed_objects->shallow_stat);
> > +
>
> (nit: the above two lines wouldn't be needed if r->parsed_objects is
> being thrown away.)
Right, thanks. I don't think that it matters since you point out a
legitimate issue with dangling references, but serves me right for
working on this so late at night ;-).
> > + parsed_object_pool_clear(r->parsed_objects);
> > + r->parsed_objects = parsed_object_pool_new();
> > }
>
> Shallows don't affect the ref store. They only affect object walks.
> So r->parsed_objects does seem like the only place that could be
> affected.
>
> That said, with this change I'd worry about use-after-free from any
> existing references to objects in the pool.
>
> Stepping back, what I think I would like to see is to *not* have
> grafts and shallow state affect the in-memory persisted parsed
> objects. Instead, act as an overlay in accessors that traverse over
> them.
>
> Lacking that, I like the idea of a "dirty bit" that gets written as
> soon as we have started lying in the parsed object pool. Something
> like this. What do you think?
>
> diff --git i/commit-graph.c w/commit-graph.c
> index 2ff042fbf4f..84b49ce903b 100644
> --- i/commit-graph.c
> +++ w/commit-graph.c
> @@ -149,7 +149,8 @@ static int commit_graph_compatible(struct repository *r)
> }
>
> prepare_commit_graft(r);
> - if (r->parsed_objects && r->parsed_objects->grafts_nr)
> + if (r->parsed_objects &&
> + (r->parsed_objects->grafts_nr || r->parsed_objects->substituted_parent))
This is a little tricky. Why would we set substituted_parent without
also incrementing grafts_nr? That seems like the real bug here: if we
incremented grafts_nr, then we would return a non-zero value from
'commit_graph_compatible' and rightly stop even without this sticky-bit.
I don't quite understand this myself. If it's an oversight, it's a
remarkably long-lived one. Do you have a better sense of this?
> return 0;
> if (is_repository_shallow(r))
> return 0;
> diff --git i/commit.c w/commit.c
> index 87686a7055b..762f09e53ae 100644
> --- i/commit.c
> +++ w/commit.c
> @@ -423,6 +423,8 @@ int parse_commit_buffer(struct repository *r, struct commit *item, const void *b
> pptr = &item->parents;
>
> graft = lookup_commit_graft(r, &item->object.oid);
> + if (graft)
> + r->parsed_objects->substituted_parent = 1;
> while (bufptr + parent_entry_len < tail && !memcmp(bufptr, "parent ", 7)) {
> struct commit *new_parent;
>
> @@ -447,6 +449,7 @@ int parse_commit_buffer(struct repository *r, struct commit *item, const void *b
> if (graft) {
> int i;
> struct commit *new_parent;
> +
Nit: unnecessary whitespace change, but I doubt it really matters much.
> for (i = 0; i < graft->nr_parent; i++) {
> new_parent = lookup_commit(r,
> &graft->parent[i]);
> diff --git i/object.h w/object.h
> index b22328b8383..db02fdcd6b2 100644
> --- i/object.h
> +++ w/object.h
> @@ -26,6 +26,7 @@ struct parsed_object_pool {
> char *alternate_shallow_file;
>
> int commit_graft_prepared;
> + int substituted_parent;
>
> struct buffer_slab *buffer_slab;
> };
>
> Thanks,
> Jonathan
Thanks,
Taylor
next prev parent reply other threads:[~2020-06-03 22:14 UTC|newest]
Thread overview: 32+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-04-21 18:09 [PATCH] shallow.c: use 'reset_repository_shallow' when appropriate Taylor Blau
2020-04-21 20:41 ` Junio C Hamano
2020-04-21 20:45 ` Taylor Blau
2020-04-21 20:52 ` Junio C Hamano
2020-04-21 22:21 ` Taylor Blau
2020-04-21 23:06 ` Junio C Hamano
2020-04-22 18:05 ` Jonathan Tan
2020-04-22 18:02 ` Jonathan Tan
2020-04-22 18:15 ` Junio C Hamano
2020-04-23 0:14 ` Taylor Blau
2020-04-23 0:25 ` [PATCH v2 0/2] shallow.c: reset shallow-ness after updating Taylor Blau
2020-04-23 0:25 ` [PATCH v2 1/2] t5537: use test_write_lines, indented heredocs for readability Taylor Blau
2020-04-23 1:14 ` Jonathan Nieder
2020-04-24 17:11 ` Taylor Blau
2020-04-24 17:17 ` Jonathan Nieder
2020-04-24 20:45 ` Junio C Hamano
2020-04-23 0:25 ` [PATCH v2 2/2] shallow.c: use '{commit,rollback}_shallow_file' Taylor Blau
2020-04-23 1:23 ` Jonathan Nieder
2020-04-23 18:09 ` Jonathan Tan
2020-04-23 20:40 ` Junio C Hamano
2020-04-24 17:13 ` Taylor Blau
2020-06-03 3:42 ` Jonathan Nieder
2020-06-03 4:52 ` Taylor Blau
2020-06-03 5:16 ` Taylor Blau
2020-06-03 13:08 ` Derrick Stolee
2020-06-03 19:26 ` Taylor Blau
2020-06-03 21:23 ` Jonathan Nieder
2020-06-03 20:51 ` Jonathan Nieder
2020-06-03 22:14 ` Taylor Blau [this message]
2020-06-03 23:06 ` Jonathan Nieder
2020-06-04 17:45 ` Taylor Blau
2020-04-23 19:05 ` [PATCH] shallow.c: use 'reset_repository_shallow' when appropriate Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200603221453.GA36237@syl.local \
--to=me@ttaylorr.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=jayconrod@google.com \
--cc=jonathantanmy@google.com \
--cc=jrnieder@gmail.com \
--cc=newren@gmail.com \
--cc=stolee@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).