git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/2] revision: use repository from rev_info when parsing commits
@ 2020-06-23 20:56 Michael Forney
  2020-06-23 20:56 ` [PATCH 2/2] submodule: use submodule repository when preparing summary Michael Forney
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Michael Forney @ 2020-06-23 20:56 UTC (permalink / raw)
  To: git, Derrick Stolee

This is needed when repo_init_revisions() is called with a repository
that is not the_repository to ensure appropriate repository is used
in repo_parse_commit_internal(). If the wrong repository is used,
a fatal error is the commit-graph machinery occurs:

  fatal: invalid commit position. commit-graph is likely corrupt

Since revision.c was the only user of the parse_commit_gently
compatibility define, remove it from commit.h.

Signed-off-by: Michael Forney <mforney@mforney.org>
---
 commit.h   |  1 -
 revision.c | 18 +++++++++---------
 2 files changed, 9 insertions(+), 10 deletions(-)

diff --git a/commit.h b/commit.h
index 1b2dea5d85..a2e8ca99a2 100644
--- a/commit.h
+++ b/commit.h
@@ -97,7 +97,6 @@ static inline int parse_commit_no_graph(struct commit *commit)
 
 #ifndef NO_THE_REPOSITORY_COMPATIBILITY_MACROS
 #define parse_commit_internal(item, quiet, use) repo_parse_commit_internal(the_repository, item, quiet, use)
-#define parse_commit_gently(item, quiet) repo_parse_commit_gently(the_repository, item, quiet)
 #define parse_commit(item) repo_parse_commit(the_repository, item)
 #endif
 
diff --git a/revision.c b/revision.c
index ebb4d2a0f2..2b6bf47c81 100644
--- a/revision.c
+++ b/revision.c
@@ -439,7 +439,7 @@ static struct commit *handle_commit(struct rev_info *revs,
 	if (object->type == OBJ_COMMIT) {
 		struct commit *commit = (struct commit *)object;
 
-		if (parse_commit(commit) < 0)
+		if (repo_parse_commit(revs->repo, commit) < 0)
 			die("unable to parse commit %s", name);
 		if (flags & UNINTERESTING) {
 			mark_parents_uninteresting(commit);
@@ -992,7 +992,7 @@ static void try_to_simplify_commit(struct rev_info *revs, struct commit *commit)
 					ts->treesame[0] = 1;
 			}
 		}
-		if (parse_commit(p) < 0)
+		if (repo_parse_commit(revs->repo, p) < 0)
 			die("cannot simplify commit %s (because of %s)",
 			    oid_to_hex(&commit->object.oid),
 			    oid_to_hex(&p->object.oid));
@@ -1037,7 +1037,7 @@ static void try_to_simplify_commit(struct rev_info *revs, struct commit *commit)
 				 * IOW, we pretend this parent is a
 				 * "root" commit.
 				 */
-				if (parse_commit(p) < 0)
+				if (repo_parse_commit(revs->repo, p) < 0)
 					die("cannot simplify commit %s (invalid %s)",
 					    oid_to_hex(&commit->object.oid),
 					    oid_to_hex(&p->object.oid));
@@ -1105,7 +1105,7 @@ static int process_parents(struct rev_info *revs, struct commit *commit,
 			parent = parent->next;
 			if (p)
 				p->object.flags |= UNINTERESTING;
-			if (parse_commit_gently(p, 1) < 0)
+			if (repo_parse_commit_gently(revs->repo, p, 1) < 0)
 				continue;
 			if (p->parents)
 				mark_parents_uninteresting(p);
@@ -1136,7 +1136,7 @@ static int process_parents(struct rev_info *revs, struct commit *commit,
 		struct commit *p = parent->item;
 		int gently = revs->ignore_missing_links ||
 			     revs->exclude_promisor_objects;
-		if (parse_commit_gently(p, gently) < 0) {
+		if (repo_parse_commit_gently(revs->repo, p, gently) < 0) {
 			if (revs->exclude_promisor_objects &&
 			    is_promisor_object(&p->object.oid)) {
 				if (revs->first_parent_only)
@@ -3295,7 +3295,7 @@ static void explore_walk_step(struct rev_info *revs)
 	if (!c)
 		return;
 
-	if (parse_commit_gently(c, 1) < 0)
+	if (repo_parse_commit_gently(revs->repo, c, 1) < 0)
 		return;
 
 	if (revs->sort_order == REV_SORT_BY_AUTHOR_DATE)
@@ -3333,7 +3333,7 @@ static void indegree_walk_step(struct rev_info *revs)
 	if (!c)
 		return;
 
-	if (parse_commit_gently(c, 1) < 0)
+	if (repo_parse_commit_gently(revs->repo, c, 1) < 0)
 		return;
 
 	explore_to_depth(revs, c->generation);
@@ -3414,7 +3414,7 @@ static void init_topo_walk(struct rev_info *revs)
 	for (list = revs->commits; list; list = list->next) {
 		struct commit *c = list->item;
 
-		if (parse_commit_gently(c, 1))
+		if (repo_parse_commit_gently(revs->repo, c, 1))
 			continue;
 
 		test_flag_and_insert(&info->explore_queue, c, TOPO_WALK_EXPLORED);
@@ -3476,7 +3476,7 @@ static void expand_topo_walk(struct rev_info *revs, struct commit *commit)
 		if (parent->object.flags & UNINTERESTING)
 			continue;
 
-		if (parse_commit_gently(parent, 1) < 0)
+		if (repo_parse_commit_gently(revs->repo, parent, 1) < 0)
 			continue;
 
 		if (parent->generation < info->min_generation) {
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 2/2] submodule: use submodule repository when preparing summary
  2020-06-23 20:56 [PATCH 1/2] revision: use repository from rev_info when parsing commits Michael Forney
@ 2020-06-23 20:56 ` Michael Forney
  2020-06-24 14:35   ` Derrick Stolee
  2020-06-23 21:24 ` [PATCH 1/2] revision: use repository from rev_info when parsing commits Eric Sunshine
  2020-06-24 14:29 ` Derrick Stolee
  2 siblings, 1 reply; 10+ messages in thread
From: Michael Forney @ 2020-06-23 20:56 UTC (permalink / raw)
  To: git, Derrick Stolee

In show_submodule_header(), we gather the left and right commits
of the submodule repository, as well as the merge bases. However,
prepare_submodule_summary() initializes the rev_info with the_repository,
so we end up parsing the commit in the wrong repository.

This results in a fatal error in parse_commit_in_graph(), since the
passed item does not belong to the repository's commit graph.

Signed-off-by: Michael Forney <mforney@mforney.org>
---
 submodule.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/submodule.c b/submodule.c
index e2ef5698c8..785ab47629 100644
--- a/submodule.c
+++ b/submodule.c
@@ -438,13 +438,13 @@ void handle_ignore_submodules_arg(struct diff_options *diffopt,
 	 */
 }
 
-static int prepare_submodule_summary(struct rev_info *rev, const char *path,
-		struct commit *left, struct commit *right,
+static int prepare_submodule_summary(struct repository *r, struct rev_info *rev,
+		const char *path, struct commit *left, struct commit *right,
 		struct commit_list *merge_bases)
 {
 	struct commit_list *list;
 
-	repo_init_revisions(the_repository, rev, NULL);
+	repo_init_revisions(r, rev, NULL);
 	setup_revisions(0, NULL, rev, NULL);
 	rev->left_right = 1;
 	rev->first_parent_only = 1;
@@ -632,7 +632,7 @@ void show_submodule_summary(struct diff_options *o, const char *path,
 		goto out;
 
 	/* Treat revision walker failure the same as missing commits */
-	if (prepare_submodule_summary(&rev, path, left, right, merge_bases)) {
+	if (prepare_submodule_summary(sub, &rev, path, left, right, merge_bases)) {
 		diff_emit_submodule_error(o, "(revision walker failed)\n");
 		goto out;
 	}
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] revision: use repository from rev_info when parsing commits
  2020-06-23 20:56 [PATCH 1/2] revision: use repository from rev_info when parsing commits Michael Forney
  2020-06-23 20:56 ` [PATCH 2/2] submodule: use submodule repository when preparing summary Michael Forney
@ 2020-06-23 21:24 ` Eric Sunshine
  2020-06-24 14:29 ` Derrick Stolee
  2 siblings, 0 replies; 10+ messages in thread
From: Eric Sunshine @ 2020-06-23 21:24 UTC (permalink / raw)
  To: Michael Forney; +Cc: Git List, Derrick Stolee

On Tue, Jun 23, 2020 at 5:02 PM Michael Forney <mforney@mforney.org> wrote:
> This is needed when repo_init_revisions() is called with a repository
> that is not the_repository to ensure appropriate repository is used
> in repo_parse_commit_internal(). If the wrong repository is used,
> a fatal error is the commit-graph machinery occurs:

s/is/in/

>   fatal: invalid commit position. commit-graph is likely corrupt
>
> Since revision.c was the only user of the parse_commit_gently
> compatibility define, remove it from commit.h.
>
> Signed-off-by: Michael Forney <mforney@mforney.org>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] revision: use repository from rev_info when parsing commits
  2020-06-23 20:56 [PATCH 1/2] revision: use repository from rev_info when parsing commits Michael Forney
  2020-06-23 20:56 ` [PATCH 2/2] submodule: use submodule repository when preparing summary Michael Forney
  2020-06-23 21:24 ` [PATCH 1/2] revision: use repository from rev_info when parsing commits Eric Sunshine
@ 2020-06-24 14:29 ` Derrick Stolee
  2020-09-03 21:58   ` Junio C Hamano
  2 siblings, 1 reply; 10+ messages in thread
From: Derrick Stolee @ 2020-06-24 14:29 UTC (permalink / raw)
  To: Michael Forney, git

On 6/23/2020 4:56 PM, Michael Forney wrote:
> This is needed when repo_init_revisions() is called with a repository
> that is not the_repository to ensure appropriate repository is used
> in repo_parse_commit_internal(). If the wrong repository is used,
> a fatal error is the commit-graph machinery occurs:
> 
>   fatal: invalid commit position. commit-graph is likely corrupt
> 
> Since revision.c was the only user of the parse_commit_gently
> compatibility define, remove it from commit.h.

Is this demonstrable in a test case, to prevent regressions?

Notably, you are _not_ dropping parse_commit(), and it would be
easy for another call to that shim to slip into revision.c.

> Signed-off-by: Michael Forney <mforney@mforney.org>
> ---
>  commit.h   |  1 -
>  revision.c | 18 +++++++++---------
>  2 files changed, 9 insertions(+), 10 deletions(-)
> 
> diff --git a/commit.h b/commit.h
> -#define parse_commit_gently(item, quiet) repo_parse_commit_gently(the_repository, item, quiet)

I'm happy we can drop this shim!

> diff --git a/revision.c b/revision.c
> index ebb4d2a0f2..2b6bf47c81 100644
> --- a/revision.c
> +++ b/revision.c
> @@ -439,7 +439,7 @@ static struct commit *handle_commit(struct rev_info *revs,
>  	if (object->type == OBJ_COMMIT) {
>  		struct commit *commit = (struct commit *)object;
>  
> -		if (parse_commit(commit) < 0)
> +		if (repo_parse_commit(revs->repo, commit) < 0)

I counted 9 copies of parse_commit[_gently]() in my version
of revision.c, so it looks like you caught them all.

Thanks!
-Stolee


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/2] submodule: use submodule repository when preparing summary
  2020-06-23 20:56 ` [PATCH 2/2] submodule: use submodule repository when preparing summary Michael Forney
@ 2020-06-24 14:35   ` Derrick Stolee
  2020-06-24 16:12     ` Junio C Hamano
  2020-06-30 11:04     ` Michael Forney
  0 siblings, 2 replies; 10+ messages in thread
From: Derrick Stolee @ 2020-06-24 14:35 UTC (permalink / raw)
  To: Michael Forney, git

On 6/23/2020 4:56 PM, Michael Forney wrote:
> In show_submodule_header(), we gather the left and right commits
> of the submodule repository, as well as the merge bases. However,
> prepare_submodule_summary() initializes the rev_info with the_repository,
> so we end up parsing the commit in the wrong repository.
> 
> This results in a fatal error in parse_commit_in_graph(), since the
> passed item does not belong to the repository's commit graph.
> 
> Signed-off-by: Michael Forney <mforney@mforney.org>
> ---
>  submodule.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/submodule.c b/submodule.c
> index e2ef5698c8..785ab47629 100644
> --- a/submodule.c
> +++ b/submodule.c
> @@ -438,13 +438,13 @@ void handle_ignore_submodules_arg(struct diff_options *diffopt,
>  	 */
>  }
>  
> -static int prepare_submodule_summary(struct rev_info *rev, const char *path,
> -		struct commit *left, struct commit *right,
> +static int prepare_submodule_summary(struct repository *r, struct rev_info *rev,
> +		const char *path, struct commit *left, struct commit *right,
>  		struct commit_list *merge_bases)
>  {
>  	struct commit_list *list;
>  
> -	repo_init_revisions(the_repository, rev, NULL);
> +	repo_init_revisions(r, rev, NULL);

This is how we properly initialize the repository in the rev_info.
It's unfortunate that this use of the_repository was pretty clearly
incorrect. This is submodule.c, so every instance of the_repository
should be examined carefully. Taking a brief look right now, the
rest seem to be correct in that they are finding submodules within
the super-repo. The only issue will arise when recursing into
submodules, which is known to be broken in-process and are handled
with subprocesses instead.

>  	setup_revisions(0, NULL, rev, NULL);
>  	rev->left_right = 1;
>  	rev->first_parent_only = 1;
> @@ -632,7 +632,7 @@ void show_submodule_summary(struct diff_options *o, const char *path,
>  		goto out;
>  
>  	/* Treat revision walker failure the same as missing commits */
> -	if (prepare_submodule_summary(&rev, path, left, right, merge_bases)) {
> +	if (prepare_submodule_summary(sub, &rev, path, left, right, merge_bases)) {
>  		diff_emit_submodule_error(o, "(revision walker failed)\n");
>  		goto out;
>  	}

Perhaps the test I requested in patch 1 is only appropriate
here? Or, maybe the test should be test_expect_failure in the
first patch and switched to test_expect_success here?

Thanks,
-Stolee



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/2] submodule: use submodule repository when preparing summary
  2020-06-24 14:35   ` Derrick Stolee
@ 2020-06-24 16:12     ` Junio C Hamano
  2020-06-30 11:04     ` Michael Forney
  1 sibling, 0 replies; 10+ messages in thread
From: Junio C Hamano @ 2020-06-24 16:12 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Michael Forney, git

Derrick Stolee <stolee@gmail.com> writes:

> Perhaps the test I requested in patch 1 is only appropriate
> here? Or, maybe the test should be test_expect_failure in the
> first patch and switched to test_expect_success here?

Either is OK, but it is probably easier to read to have just one
addition in 2/2 to expect succeses.  Temporarily revierting with
"git show ':!t' | git apply -R" before running test when you want
to reallly see how the original crashed is easy and simple enough.

Thanks.



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/2] submodule: use submodule repository when preparing summary
  2020-06-24 14:35   ` Derrick Stolee
  2020-06-24 16:12     ` Junio C Hamano
@ 2020-06-30 11:04     ` Michael Forney
  2020-08-13 21:16       ` Michael Forney
  1 sibling, 1 reply; 10+ messages in thread
From: Michael Forney @ 2020-06-30 11:04 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: git

[-- Attachment #1: Type: text/plain, Size: 1967 bytes --]

On 2020-06-24, Derrick Stolee <stolee@gmail.com> wrote:
> Perhaps the test I requested in patch 1 is only appropriate
> here? Or, maybe the test should be test_expect_failure in the
> first patch and switched to test_expect_success here?

I made a good effort to write a test, but I am still unable to
reliably trigger the offending codepath, which is:

submodule.c:prepare_submodule_summary
revision.c:prepare_revision_walk
revision.c:limit_list
revision.c:process_parents
commit.c:repo_parse_commit_gently
commit.c:repo_parse_commit_internal (needs !item->object.parsed and
use_commit_graph)
commit-graph.c:parse_commit_in_graph
commit-graph.c:parse_commit_in_graph_one
commit-graph.c:fill_commit_in_graph (needs pos >= number of commits in
commit-graph in parent repository)

The trick seems to be ensuring that the parent commit of the first
commit in the range of commits changed in a submodule does not get
parsed during show_submodule_header, is not a loose object in the
repository, and has an index in the commit-graph that is larger than
the size of the commit-graph in the parent repository. This seems to
depend on the order of commits in the commit-graph, which seems to be
random (perhaps based on commit hashes?).

I attached my best attempt at a test to trigger the error. The
probability of the test failing correctly (without the fix applied)
seems to depend on how many commits are present in submodule before
the first commit in the range of changed commits. This can be
controlled by adjusting the `seq 1 X` in the for loop. The lowest
number of commits with which I have been able to reproduce the bug is
3, where it occurs around 1% of the time, and if I set it to 200, I
can reproduce the bug around 99% of the time.

I don't really want to spend more time on this than I already have.
Can the bug fix be applied without a test? If not, hopefully someone
can volunteer to craft a reliable test (assuming that this is even
possible).

-Michael

[-- Attachment #2: t7421-submodule-commit-graph.sh --]
[-- Type: application/x-sh, Size: 721 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/2] submodule: use submodule repository when preparing summary
  2020-06-30 11:04     ` Michael Forney
@ 2020-08-13 21:16       ` Michael Forney
  0 siblings, 0 replies; 10+ messages in thread
From: Michael Forney @ 2020-08-13 21:16 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: git

On 2020-06-30, Michael Forney <mforney@mforney.org> wrote:
> I attached my best attempt at a test to trigger the error. The
> probability of the test failing correctly (without the fix applied)
> seems to depend on how many commits are present in submodule before
> the first commit in the range of changed commits. This can be
> controlled by adjusting the `seq 1 X` in the for loop. The lowest
> number of commits with which I have been able to reproduce the bug is
> 3, where it occurs around 1% of the time, and if I set it to 200, I
> can reproduce the bug around 99% of the time.
>
> Can the bug fix be applied without a test? If not, hopefully someone
> can volunteer to craft a reliable test (assuming that this is even
> possible).

Still looking for any help with this. It seems pretty clear that this
is a bug (I am not the only one who has hit this), and I'd really like
to see the issue fixed. I gave my best shot at a test, but I don't
think it's acceptable to commit a test that gives a false positive
some percentage of the time.

I see that my patch is still blocked as "Needs tests" in the what's
cooking summary, but I really don't know how to proceed from here.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] revision: use repository from rev_info when parsing commits
  2020-06-24 14:29 ` Derrick Stolee
@ 2020-09-03 21:58   ` Junio C Hamano
  2020-09-04 12:19     ` Derrick Stolee
  0 siblings, 1 reply; 10+ messages in thread
From: Junio C Hamano @ 2020-09-03 21:58 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Michael Forney, git

Derrick Stolee <stolee@gmail.com> writes:

> On 6/23/2020 4:56 PM, Michael Forney wrote:
>> This is needed when repo_init_revisions() is called with a repository
>> that is not the_repository to ensure appropriate repository is used
>> in repo_parse_commit_internal(). If the wrong repository is used,
>> a fatal error is the commit-graph machinery occurs:
>> 
>>   fatal: invalid commit position. commit-graph is likely corrupt
>> 
>> Since revision.c was the only user of the parse_commit_gently
>> compatibility define, remove it from commit.h.
>
> Is this demonstrable in a test case, to prevent regressions?

It appears that Michael tried and failed.  Even if we do not
currently have a caller that asks these functions in revision.c to
work on a repository that is not the primary one (i.e. in a
submodule), in which case these patches may not be fixing any bug
that can be triggered in the current code, it is quite obvious that
these functions misbehave once a caller starts asking them to work
on a repository other than the primary one.

So, given that ... 

>
> I counted 9 copies of parse_commit[_gently]() in my version
> of revision.c, so it looks like you caught them all.

... we should be able to proceed with the code as-is, I guess.

Thanks.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] revision: use repository from rev_info when parsing commits
  2020-09-03 21:58   ` Junio C Hamano
@ 2020-09-04 12:19     ` Derrick Stolee
  0 siblings, 0 replies; 10+ messages in thread
From: Derrick Stolee @ 2020-09-04 12:19 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Michael Forney, git

On 9/3/2020 5:58 PM, Junio C Hamano wrote:
> Derrick Stolee <stolee@gmail.com> writes:
> 
>> On 6/23/2020 4:56 PM, Michael Forney wrote:
>>> This is needed when repo_init_revisions() is called with a repository
>>> that is not the_repository to ensure appropriate repository is used
>>> in repo_parse_commit_internal(). If the wrong repository is used,
>>> a fatal error is the commit-graph machinery occurs:
>>>
>>>   fatal: invalid commit position. commit-graph is likely corrupt
>>>
>>> Since revision.c was the only user of the parse_commit_gently
>>> compatibility define, remove it from commit.h.
>>
>> Is this demonstrable in a test case, to prevent regressions?
> 
> It appears that Michael tried and failed.  Even if we do not
> currently have a caller that asks these functions in revision.c to
> work on a repository that is not the primary one (i.e. in a
> submodule), in which case these patches may not be fixing any bug
> that can be triggered in the current code, it is quite obvious that
> these functions misbehave once a caller starts asking them to work
> on a repository other than the primary one.
> 
> So, given that ... 
> 
>>
>> I counted 9 copies of parse_commit[_gently]() in my version
>> of revision.c, so it looks like you caught them all.
> 
> ... we should be able to proceed with the code as-is, I guess
Yes, I think this is an improvement regardless.

Thanks, for the reminder.

-Stolee

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2020-09-04 12:19 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-23 20:56 [PATCH 1/2] revision: use repository from rev_info when parsing commits Michael Forney
2020-06-23 20:56 ` [PATCH 2/2] submodule: use submodule repository when preparing summary Michael Forney
2020-06-24 14:35   ` Derrick Stolee
2020-06-24 16:12     ` Junio C Hamano
2020-06-30 11:04     ` Michael Forney
2020-08-13 21:16       ` Michael Forney
2020-06-23 21:24 ` [PATCH 1/2] revision: use repository from rev_info when parsing commits Eric Sunshine
2020-06-24 14:29 ` Derrick Stolee
2020-09-03 21:58   ` Junio C Hamano
2020-09-04 12:19     ` Derrick Stolee

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).