All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Elijah Newren <newren@gmail.com>
Cc: Elijah Newren via GitGitGadget <gitgitgadget@gmail.com>,
	Git Mailing List <git@vger.kernel.org>,
	ZheNing Hu <adlternative@gmail.com>,
	Eric Sunshine <sunshine@sunshineco.com>,
	Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH v3 3/7] merge: do not abort early if one strategy fails to handle the merge
Date: Tue, 26 Jul 2022 08:54:29 +0200	[thread overview]
Message-ID: <220726.86fsionynh.gmgdl@evledraar.gmail.com> (raw)
In-Reply-To: <CABPp-BFVZy+TtYtaKe1d25Ua3HSMrQpzkUDgfnzQvo8xzpD-3g@mail.gmail.com>


On Mon, Jul 25 2022, Elijah Newren wrote:

> On Mon, Jul 25, 2022 at 4:06 AM Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
>>
> [...]
>>
>> I'm re-rolling ab/leak-check, and came up with the below (at the very
>> end) to "fix" a report in builtin/merge.c, reading your commit message
>> your fix seems obviously better.
>>
>> Mine's early WIP, and I e.g. didn't notice that I forgot to unlock the
>> &lock file, which is correct.
>>
>> I *could* say "that's not my problem", i.e. we didn't unlock it before
>> (we rely on atexit). The truth is I just missed it, but having said that
>> it *is* true that we could do without it, or do it as a separate chaneg.
>>
>> I'm just posting my version below to help move yours forward, i.e. to
>> show that someone else has carefully at least this part.
>
> "has carefully ... at least this part" ?
>
> I think you have a missing verb there.

"reviewed", sorry.

>> But it is worth noting from staring at the two that your version is
>> mixing several different behavior changes into one, which *could* be
>> split up (but whether you think that's worth it I leave to you).
>>
>> Maybe I'm the only one initially confused by it, and that's probably
>> just from being mentally biased towards my own "solution". Those are (at
>> least):
>>
>>  1. Before we didn't explicitly unlock() before exit(), but had atexit()
>>     do it, that could be a one-line first commit. This change is
>>     obviously good.
>
> That'd be fine.  (Though at this point, I'd rather not mess with the
> series more.)

That's completely fine, to be clear(er) what I'm walking through here is
my review process & potential edge cases I discovered, but I think your
patch as-is does it all correctly.

>>  2. A commit like mine could come next, i.e. we bug-for-bug do what we
>>     do do now, but just run the "post-builtin" logic when we return from
>>     cmd_merge().
>>
>>     Doing it as an in-between would be some churn, as we'll need to get
>>     rid of "early_exit" again, but would allow us to incrementally move
>>     forward to...
>
> So, add a step that makes it glaringly obvious that the code is not
> only buggy but totally at odds with itself?
>
> builtin/merge.c was designed to allow pluggable backends and to
> automatically pick the "best" one if more than one is specified.  We
> had a bug in one line of code that defeated the design, by making it
> not bother consulting beyond the first failed backend in some cases.
> That's the bug I'm trying to address.  Your patch would make the
> inconsistency with the design both bigger and more obvious; I don't
> see how it's a useful step to take.
>
> Now, the existing design might be questionable.  In fact, I'm not sure
> I like it.  But I think we should either change the design, or fix
> things in a way that improves towards the existing design.

Yes, I don't think it's useful, sorry about the confusion. I was walking
through what the observable changes from the outside are here, and how
they *might* be split up.

Not as a practical suggestion, but as a way to understand your commit...

>>  3. ...then we'd say "but it actually makes sense not to early abort",
>>      i.e. you want to change this so that we'll run the logic between
>>      try_merge_strategy() exiting with 128 now and the return from
>>      cmd_merge().
>>
>>      This bit is my main sticking point in reviewing your change,
>>      i.e. your "a testcase for this is somewhat difficult" somewhat
>>      addresses this, but (and maybe I'm wrong) it seems to me that
>>
>>      Editing that code the post-image looks like this, with my
>>      commentary & most of the code removed, i.e. just focusing on the
>>      branches we do and don't potentially have tests for:
>>
>>                 /* Before this we fall through from ret == 128 (or ret == 2...) */
>>                 if (automerge_was_ok) { // not tested?
>>                 if (!best_strategy) {
>>                         // we test this...
>>                         if (use_strategies_nr > 1)
>>                                 // And this: _("No merge strategy handled the merge.\n"));
>>                         else
>>                                 // And this: _("Merge with strategy %s failed.\n"),
>>                 } else if (best_strategy == wt_strategy)
>>                         // but not this?
>>                 else
>>                         // Or this, where we e.g. say "Rewinding the tree to pristene..."?
>>
>>                 if (squash) {
>>                         // this?
>>                 } else
>>                         // this? (probably, yes)
>>                         write_merge_state(remoteheads);
>>
>>                 if (merge_was_ok)
>>                         // this? (probably, yes, we just don't grep it?)
>>                 else
>>                         // this? maybe yes because it's covered by the
>>                         // "failed" above too?
>>                         ret = suggest_conflicts();
>>
>>         done:
>>                 if (!automerge_was_ok) {
>>                         // this? ditto the first "not tested?"
>>                 }
>>
>>    I.e. are you confident that we want to continue now in these various
>>    cases, where we have squash, !automerge_was_ok etc. I think it would
>>    be really useful to comment on (perhaps by amending the above
>>    pseudocode) what test cases we're not testing / test already etc.
>
> To be honest, I'm confused by what looks like a make-work project.
> Perhaps if I understood your frame of reference better, maybe they
> wouldn't bother me, but there's a few things here that seem a little
> funny to me.
>
> You've highlighted that you are worried about the case where ret is 2
> (or 128) at the point of all these branches in question.  However,
> three of those branches can almost trivially be deduced to never be
> taken unless ret is 0.  One of the other codepaths, for freeing
> memory, is correct regardless of the value of ret -- the memory is
> conditionally freed earlier and the "if"-check exists only to avoid a
> double free (and checking the recent commit message where those lines
> were added would explain this, though I'm not sure why it'd even need
> explaining separately for e.g. ret == 2 compared to any other value).
> Three of the other code paths involve nothing more than print
> statements.  Now, there are many codepaths you highlighted, and
> perhaps there are some where it's not trivial to determine whether
> they are okay in combination with a different return value.  And it
> may also be easy to miss some of the "almost trivial" cases.  I'd
> understand better if you asked about the tougher ones or only some of
> the easier ones, but it feels like you didn't try to check any of them
> and instead wanted me to just spend time commenting on every single
> code branch?
>
> I hope that doesn't come across harshly.  I'm just struggling to
> understand where the detailed request is coming from.
>
> However, perhaps I can obviate the whole set of requests by just
> pointing out that I don't think any of them are relevant.  The premise
> for the audit request seems to be that you are worrying that the
> change from die() to "return 2" (or 128) in try_merge_strategy() will
> result in the calling code getting into a state it has never
> experienced before which might uncover latent bugs.  We can trivially
> point out that it's not a new state, though: such return values were
> already possible from try_merge_strategy() via the final line of code
> in the function -- namely, the "return try_merge_command(...)" code
> path.  And a return value of 2 from try_merge_command() and
> try_merge_strategy() isn't merely theoretical either -- you can easily
> trigger it multiple ways; the easiest is perhaps by passing `-s
> octopus` when doing a non-octopus merge.  (You can also make the
> testcase more complex by combining that with as many other additional
> merge strategies as you want, e.g. "git merge -s octopus -s resolve -s
> recursive -s mySpecialStrategy $BRANCH".  You can also move the
> octopus to the end if you want to test what happens if it is tried
> last.  Lots of possibilities exist).

Thanks, that all makes sense & addresses any questions I had, which were
just if we were sure that this behavior was expected.

>>  4. Having done all that (or maybe this can't be split up / needs to
>>     come earlier) you say that we'd like to not generically call this
>>     exit state 128, but have it under the "exit(2)" umbrella.
>
> I don't see how reading your set of steps 1-4 logically restores the
> design of builtin/merge.c, though.  An example: what if the user ran
> "git merge -s resolve -s recursive $BRANCH", and `resolve` handled the
> merge but had some conflicts, while `recursive` just failed and
> returned a value of clean < 0?  In such a case, builtin/merge.c is
> supposed to restore the tree to pristine after attempting the
> recursive backend, and then redo using the best_strategy (which would
> be `resolve` in this example case).  Your steps 1-4 never address such
> a thing.  Your steps might incidentally address such a case as a side
> effect of their implementation, but that's not at all clear.  Since
> the whole point is fixing the code to match the existing design, it
> seems really odd to split things into a set of steps that obscures
> that fix.

You know more about the wider context here, I was just poking at the
narrow code change & seeing what the side-effects were of the changes
being made.

>> Again, all just food for thought, and a way to step-by-step go through
>> how I came about reviewing this in detail, I hope it and the below
>> version I came up with before seeing yours helps.
>>
>> P.s.: The last paragraph in my commit message does not point to some
>> hidden edge case in the code behavior here, it's just that clang/gcc are
>> funny about exit() and die() control flow when combined with
>> -fsanitize=address and higher optimization levels.
>>
>> -- >8 --
>> Subject: [PATCH] merge: return, don't use exit()
>>
>> Change some of the builtin/merge.c code added in f241ff0d0a9 (prepare
>> the builtins for a libified merge_recursive(), 2016-07-26) to ferry up
>> an "early return" state, rather than having try_merge_strategy() call
>> exit() itself.
>>
>> This is a follow-up to dda31145d79 (Merge branch
>> 'ab/usage-die-message' into gc/branch-recurse-submodules-fix,
>> 2022-03-31).
>>
>> The only behavior change here is that we'll now properly catch other
>> issues on our way out, see e.g. [1] and the interaction with /dev/full
>> for an example.
>>
>> The immediate reason to do this change is because it's one of the
>> cases where clang and gcc's SANITIZE=leak behavior differs. Under
>> clang we don't detect that "t/t6415-merge-dir-to-symlink.sh" triggers
>> a leak, but gcc spots it.
>>
>> 1. https://lore.kernel.org/git/87im2n3gje.fsf@evledraar.gmail.com/
>>
>> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
>> ---
>>  builtin/merge.c | 27 +++++++++++++++++++++------
>>  1 file changed, 21 insertions(+), 6 deletions(-)
>>
>> diff --git a/builtin/merge.c b/builtin/merge.c
>> index 23170f2d2a6..a8d5d04f622 100644
>> --- a/builtin/merge.c
>> +++ b/builtin/merge.c
>> @@ -709,10 +709,12 @@ static void write_tree_trivial(struct object_id *oid)
>>
>>  static int try_merge_strategy(const char *strategy, struct commit_list *common,
>>                               struct commit_list *remoteheads,
>> -                             struct commit *head)
>> +                             struct commit *head, int *early_exit)
>>  {
>>         const char *head_arg = "HEAD";
>>
>> +       *early_exit = 0;
>> +
>>         if (refresh_and_write_cache(REFRESH_QUIET, SKIP_IF_UNCHANGED, 0) < 0)
>>                 return error(_("Unable to write index."));
>>
>> @@ -754,8 +756,10 @@ static int try_merge_strategy(const char *strategy, struct commit_list *common,
>>                 else
>>                         clean = merge_recursive(&o, head, remoteheads->item,
>>                                                 reversed, &result);
>> -               if (clean < 0)
>> -                       exit(128);
>> +               if (clean < 0) {
>> +                       *early_exit = 1;
>> +                       return 128;
>> +               }
>>                 if (write_locked_index(&the_index, &lock,
>>                                        COMMIT_LOCK | SKIP_IF_UNCHANGED))
>>                         die(_("unable to write %s"), get_index_file());
>> @@ -1665,6 +1669,8 @@ int cmd_merge(int argc, const char **argv, const char *prefix)
>>
>>         for (i = 0; !merge_was_ok && i < use_strategies_nr; i++) {
>>                 int ret, cnt;
>> +               int early_exit;
>> +
>>                 if (i) {
>>                         printf(_("Rewinding the tree to pristine...\n"));
>>                         restore_state(&head_commit->object.oid, &stash);
>> @@ -1680,7 +1686,10 @@ int cmd_merge(int argc, const char **argv, const char *prefix)
>>
>>                 ret = try_merge_strategy(use_strategies[i]->name,
>>                                          common, remoteheads,
>> -                                        head_commit);
>> +                                        head_commit, &early_exit);
>> +               if (early_exit)
>> +                       goto done;
>> +
>>                 /*
>>                  * The backend exits with 1 when conflicts are
>>                  * left to be resolved, with 2 when it does not
>> @@ -1732,12 +1741,18 @@ int cmd_merge(int argc, const char **argv, const char *prefix)
>>         } else if (best_strategy == wt_strategy)
>>                 ; /* We already have its result in the working tree. */
>>         else {
>> +               int new_ret, early_exit;
>> +
>>                 printf(_("Rewinding the tree to pristine...\n"));
>>                 restore_state(&head_commit->object.oid, &stash);
>>                 printf(_("Using the %s strategy to prepare resolving by hand.\n"),
>>                         best_strategy);
>> -               try_merge_strategy(best_strategy, common, remoteheads,
>> -                                  head_commit);
>> +               new_ret = try_merge_strategy(best_strategy, common, remoteheads,
>> +                                            head_commit, &early_exit);
>> +               if (early_exit) {
>> +                       ret = new_ret;
>> +                       goto done;
>> +               }
>
> Incidentally, this is essentially dead code being added in this last
> hunk.  This final `else` block can only be triggered when
> best_strategy has been set, and best_strategy will only be set after a
> merge strategy works (possibly with conflicts, but was at least
> appropriate for the problem).  Anyway, by the point of this `else`
> block, we've run multiple strategies already, at least one worked, and
> the "best" one was not the last one tried.  So, at this point, we
> simply rewind the tree and rerun the known-working merge strategy.  (I
> guess you could say that maybe the merge strategy might not return the
> same result despite being given the same inputs, so theoretically
> early_exit could come back true and this hunk isn't quite dead.  Just
> mostly dead, I guess.)

The idea here was to leave the "is it really dead?" question aside, and
just move the code towards running the post-builtin code we run when we
don't call an early exit().


  reply	other threads:[~2022-07-26  7:03 UTC|newest]

Thread overview: 87+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-19 16:26 [PATCH 0/2] Fix merge restore state Elijah Newren via GitGitGadget
2022-05-19 16:26 ` [PATCH 1/2] merge: remove unused variable Elijah Newren via GitGitGadget
2022-05-19 17:45   ` Junio C Hamano
2022-05-19 16:26 ` [PATCH 2/2] merge: make restore_state() do as its name says Elijah Newren via GitGitGadget
2022-05-19 17:44   ` Junio C Hamano
2022-05-19 18:32     ` Junio C Hamano
2022-06-12  6:58 ` [PATCH 0/2] Fix merge restore state Elijah Newren
2022-06-12  8:54   ` ZheNing Hu
2022-06-19  6:50 ` [PATCH v2 0/6] " Elijah Newren via GitGitGadget
2022-06-19  6:50   ` [PATCH v2 1/6] t6424: make sure a failed merge preserves local changes Junio C Hamano via GitGitGadget
2022-06-19  6:50   ` [PATCH v2 2/6] merge: remove unused variable Elijah Newren via GitGitGadget
2022-07-19 23:14     ` Junio C Hamano
2022-06-19  6:50   ` [PATCH v2 3/6] merge: fix save_state() to work when there are racy-dirty files Elijah Newren via GitGitGadget
2022-07-17 16:28     ` ZheNing Hu
2022-07-19 22:49       ` Junio C Hamano
2022-07-21  1:09         ` Elijah Newren
2022-07-19 22:43     ` Junio C Hamano
2022-06-19  6:50   ` [PATCH v2 4/6] merge: make restore_state() restore staged state too Elijah Newren via GitGitGadget
2022-07-17 16:37     ` ZheNing Hu
2022-07-19 23:14     ` Junio C Hamano
2022-07-19 23:28       ` Junio C Hamano
2022-07-21  1:37         ` Elijah Newren
2022-06-19  6:50   ` [PATCH v2 5/6] merge: ensure we can actually restore pre-merge state Elijah Newren via GitGitGadget
2022-07-17 16:41     ` ZheNing Hu
2022-07-19 22:57     ` Junio C Hamano
2022-07-21  2:03       ` Elijah Newren
2022-06-19  6:50   ` [PATCH v2 6/6] merge: do not exit restore_state() prematurely Elijah Newren via GitGitGadget
2022-07-17 16:44     ` ZheNing Hu
2022-07-19 23:13     ` Junio C Hamano
2022-07-20  0:09       ` Eric Sunshine
2022-07-21  2:03         ` Elijah Newren
2022-07-21  3:27       ` Elijah Newren
2022-07-21  8:16   ` [PATCH v3 0/7] Fix merge restore state Elijah Newren via GitGitGadget
2022-07-21  8:16     ` [PATCH v3 1/7] merge-ort-wrappers: make printed message match the one from recursive Elijah Newren via GitGitGadget
2022-07-21 15:47       ` Junio C Hamano
2022-07-21 19:51         ` Elijah Newren
2022-07-21 20:05           ` Junio C Hamano
2022-07-21 21:14             ` Elijah Newren
2022-07-21  8:16     ` [PATCH v3 2/7] merge-resolve: abort if index does not match HEAD Elijah Newren via GitGitGadget
2022-07-21  8:16     ` [PATCH v3 3/7] merge: do not abort early if one strategy fails to handle the merge Elijah Newren via GitGitGadget
2022-07-21 16:09       ` Junio C Hamano
2022-07-25 10:38       ` Ævar Arnfjörð Bjarmason
2022-07-26  1:31         ` Elijah Newren
2022-07-26  6:54           ` Ævar Arnfjörð Bjarmason [this message]
2022-07-21  8:16     ` [PATCH v3 4/7] merge: fix save_state() to work when there are stat-dirty files Elijah Newren via GitGitGadget
2022-07-21  8:16     ` [PATCH v3 5/7] merge: make restore_state() restore staged state too Elijah Newren via GitGitGadget
2022-07-21 16:16       ` Junio C Hamano
2022-07-21 16:24       ` Junio C Hamano
2022-07-21 19:52         ` Elijah Newren
2022-07-21  8:16     ` [PATCH v3 6/7] merge: ensure we can actually restore pre-merge state Elijah Newren via GitGitGadget
2022-07-21 16:31       ` Junio C Hamano
2023-03-02  7:17         ` Ben Humphreys
2023-03-02 15:35           ` Elijah Newren
2023-03-02 16:19             ` Junio C Hamano
2023-03-04 16:18               ` Rudy Rigot
2023-03-06 22:19             ` Ben Humphreys
2022-07-21  8:16     ` [PATCH v3 7/7] merge: do not exit restore_state() prematurely Elijah Newren via GitGitGadget
2022-07-21 16:34       ` Junio C Hamano
2022-07-22  5:15     ` [PATCH v4 0/7] Fix merge restore state Elijah Newren via GitGitGadget
2022-07-22  5:15       ` [PATCH v4 1/7] merge-ort-wrappers: make printed message match the one from recursive Elijah Newren via GitGitGadget
2022-07-22  5:15       ` [PATCH v4 2/7] merge-resolve: abort if index does not match HEAD Elijah Newren via GitGitGadget
2022-07-22 10:27         ` Ævar Arnfjörð Bjarmason
2022-07-23  0:28           ` Elijah Newren
2022-07-23  5:44             ` Ævar Arnfjörð Bjarmason
2022-07-26  1:58               ` Elijah Newren
2022-07-26  6:35                 ` Ævar Arnfjörð Bjarmason
2022-07-22  5:15       ` [PATCH v4 3/7] merge: do not abort early if one strategy fails to handle the merge Elijah Newren via GitGitGadget
2022-07-22 10:47         ` Ævar Arnfjörð Bjarmason
2022-07-23  0:36           ` Elijah Newren
2022-07-22  5:15       ` [PATCH v4 4/7] merge: fix save_state() to work when there are stat-dirty files Elijah Newren via GitGitGadget
2022-07-22  5:15       ` [PATCH v4 5/7] merge: make restore_state() restore staged state too Elijah Newren via GitGitGadget
2022-07-22 10:53         ` Ævar Arnfjörð Bjarmason
2022-07-23  1:56           ` Elijah Newren
2022-07-22  5:15       ` [PATCH v4 6/7] merge: ensure we can actually restore pre-merge state Elijah Newren via GitGitGadget
2022-07-22  5:15       ` [PATCH v4 7/7] merge: do not exit restore_state() prematurely Elijah Newren via GitGitGadget
2022-07-23  1:53       ` [PATCH v5 0/8] Fix merge restore state Elijah Newren via GitGitGadget
2022-07-23  1:53         ` [PATCH v5 1/8] merge-ort-wrappers: make printed message match the one from recursive Elijah Newren via GitGitGadget
2022-07-23  1:53         ` [PATCH v5 2/8] merge-resolve: abort if index does not match HEAD Elijah Newren via GitGitGadget
2022-07-23  1:53         ` [PATCH v5 3/8] merge: abort if index does not match HEAD for trivial merges Elijah Newren via GitGitGadget
2022-07-23  1:53         ` [PATCH v5 4/8] merge: do not abort early if one strategy fails to handle the merge Elijah Newren via GitGitGadget
2022-07-23  1:53         ` [PATCH v5 5/8] merge: fix save_state() to work when there are stat-dirty files Elijah Newren via GitGitGadget
2022-07-23  1:53         ` [PATCH v5 6/8] merge: make restore_state() restore staged state too Elijah Newren via GitGitGadget
2022-07-23  1:53         ` [PATCH v5 7/8] merge: ensure we can actually restore pre-merge state Elijah Newren via GitGitGadget
2022-07-23  1:53         ` [PATCH v5 8/8] merge: do not exit restore_state() prematurely Elijah Newren via GitGitGadget
2022-07-25 19:03         ` [PATCH v5 0/8] Fix merge restore state Junio C Hamano
2022-07-26  1:59           ` Elijah Newren
2022-07-26  4:03         ` ZheNing Hu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=220726.86fsionynh.gmgdl@evledraar.gmail.com \
    --to=avarab@gmail.com \
    --cc=adlternative@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=gitster@pobox.com \
    --cc=newren@gmail.com \
    --cc=sunshine@sunshineco.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.