All of lore.kernel.org
 help / color / mirror / Atom feed
From: Max Kirillov <max@max630.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: Jeff King <peff@peff.net>, git@vger.kernel.org
Subject: Re: [PATCH v2 1/4] t4059: test 'diff --cc' with a change from only few parents
Date: Sun, 12 Apr 2015 08:43:32 +0300	[thread overview]
Message-ID: <20150412054332.GA28555@wheezy.local> (raw)
In-Reply-To: <xmqqa8ye5q7m.fsf@gitster.dls.corp.google.com>

On Sat, Apr 11, 2015 at 02:07:25PM -0700, Junio C Hamano wrote:
> Max Kirillov <max@max630.net> writes:
> 
>> If `git diff --cc` is used with 2 or more parents, then it shows
>> all hunks which have changed compared to at least 2 parents.
>> Which is reasonable, because those places are likely places for
>> conflicts, and it should be displayed how they were resolved.
> 
> OK.
> 
>> But, preliminary path filtering passes a path only it was changed
>> compared to each parent.
> 
> That is true, but I am a bit confused by the above, especially the
> word "But" that begins the sentence.  Are you talking about this
> comment that describes what the caller wants to do?
> 
>     /* find set of paths that every parent touches */
>     static struct combine_diff_path *find_paths_generic(const unsigned char *sha1,
>             const struct sha1_array *parents, struct diff_options *opt)
> 
> When the result of a merge exactly matches one (or more) of the
> parent of the merge, we do not want to show it in the combined
> format, so intersect_paths() does want to find paths that are
> different from all parents.  Isn't that a good thing?

If it is a good thing, then probably for file which has not
been filtered out it is also a good thing, then diff --cc
should have been hiding hunks which fully preserve one of
the parent.

>> So, if a hunk which has not changed compared to some of parents is
>> shown if there are more changed hunks in the file, but not shown
>> if it is the only change.
>>
>> This looks inconsistent and for some scenarios it is desirable to show
>> such changes.
> 
> Hmm, that may be true.  So help me see if I understand your goal by
> checking if I rephrased you correctly below:
> 
>  - We want to show a combined hunk when the number of parent
>    variants for the hunk is more than 2 (i.e. interesting octopus)
>    or the result does not match any of the parents (i.e. conflict
>    resolution of a pairwise merge).  We want to drop a hunk whose
>    result came from only one set of parent and the other parents had
>    the same original that is different from the result.
> 
>  - The current code filters out a path that matches one of the
>    parents very early.  This is OK for a two-way merge.  If the
>    result matches one of the parent's, then any hunk we might
>    produce by not pre-filtering would have the result that came from
>    one parent (i.e. the one identical to the result) and there is
>    only one other parent, which cannot make it an interesting
>    octopus by definition.
> 
>    But an octopus may merge three variants and pick the result from
>    one of the parents as a whole.  With the pre-filtering, no hunk
>    from such a path is shown, even when the other two variants in
>    "discarded" parents are not identical.

Yes, I meant that.

> 
> The original to refer to are two commits bf1c32bd (combine-diff:
> update --cc "uninteresting hunks" logic., 2006-02-02) and fd4b1d21
> (combine-diff: add safety check to --cc., 2006-02-02).
> 
> Especially, we need to pay close attention to the discussion that
> germinated the current behaviour:
> 
>   http://thread.gmane.org/gmane.comp.version-control.git/15486/focus=15519
> 
> I recall that the "diff --cc" before that change was not discarding
> uninteresting merges sufficiently and the two commits were a
> deliberate attempt to reject what your series wants to show as
> uninteresting hunks.  
> 
> Two suggestions.
> 
>  - This is primarily for 2/4, but can we make it more clear in the
>    code that we do this "include more" change only on Octopus
>    merges?  This change should not make any difference for two-way
>    merges and I'd prefer to avoid extra processing of finding
>    matching hunks and combining, only to discard the result.
> 
>  - Can you run "diff --cc" with and without your patches to the
>    "merge from hell" commit mentioned in the original thread and see
>    if we show more hunks with your patches, and make sure what are
>    shown additionally looked really "interesting"?


The diffs:
older version (without my patch):
https://gist.githubusercontent.com/max630/70080d38e8e9951b58a4/raw/diffcc_old.txt
newer version:
https://gist.githubusercontent.com/max630/70080d38e8e9951b58a4/raw/diffcc_new.txt

The outcome is quite different, yes.

First, I can see that file removal/deletion are passed
without filtering, even if file was only added/deleted
compared to some parents and not changed compared to any, so
it's in some way "2 versions" diff which should not be
shown. This I think is something should be fixed in addition
to this series.

Other than that, all other changes look "interesting",
meaning they contain more than 2 versions. Sometimes it's not
easy to spot because a long hunk is almost inherited from
one parent with all others being the same, and only in
couple of lines there when they differ. But I always find
some difference.

It was also quite slow, 11 seconds compared to some 0.4.
But for 2-way merges speed did not change.



My exact case was that there was a change in one branch
which was overwritten during merge conflict resolution by
fully acepting the other branch - in a 2-parent merge. I
started looking for a way to visualize such cases. They
are not visible in usual diff, because they look same as
accepting change compared to the unchange branch. To
recognize them you should consider the mergebase. I googled
to your suggestion in [1], and prepared a trivial testcase
for such a merge - in contained a base commit, change in
both branches and children which fully repeat one of the
branches. But then I discovered that such merge is not shown
in "diff --cc" because of the path filtering. So here am I.

Actually I think this is the very true rule which I would
like to be used: if there is a parent which is changed
compared to the merge base and child commit differs from
that parent. Because this is where there had been a valuable
change, and it disappeared. Linus in [2] wrote that those
change was fixing the same problem and it's good that they
are not shown, by it my case there were fixing different
problems, they just happened to be in the same place, and
after merge the bug reappeared.

[1] http://thread.gmane.org/gmane.comp.version-control.git/191553/focus=191557
[2] http://thread.gmane.org/gmane.comp.version-control.git/15486/focus=15519

  parent reply	other threads:[~2015-04-12  5:43 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-04-02 20:34 [PATCH 0/4] diff --cc: relax path filtering Max Kirillov
2015-04-02 20:34 ` [PATCH 1/4] Add test for showing discarded changes with diff --cc Max Kirillov
2015-04-02 20:55   ` Junio C Hamano
2015-04-03 16:03     ` Max Kirillov
2015-04-02 20:34 ` [PATCH 2/4] combine-diff.c: refactor: extract insert_path() Max Kirillov
2015-04-02 20:34 ` [PATCH 3/4] diff --cc: relax too strict paths picking Max Kirillov
2015-04-02 20:59   ` Junio C Hamano
2015-04-02 20:34 ` [PATCH 4/4] t4059: rewrite to be adaptive to hunk filtering Max Kirillov
2015-04-02 21:13 ` [PATCH 0/4] diff --cc: relax path filtering Jeff King
2015-04-03 16:29   ` Max Kirillov
2015-04-03 15:58 ` [PATCH v2 " Max Kirillov
2015-04-03 15:58   ` [PATCH v2 1/4] t4059: test 'diff --cc' with a change from only few parents Max Kirillov
2015-04-11 20:04     ` Junio C Hamano
2015-04-11 21:07     ` Junio C Hamano
2015-04-11 21:20       ` Junio C Hamano
2015-04-12  5:43       ` Max Kirillov [this message]
2015-04-12  5:51         ` Junio C Hamano
2015-04-14  4:22           ` Max Kirillov
2015-04-14  4:09         ` [PATCH/RFC] combine-diff.c: make intersect_paths() behave like hunk filtering Max Kirillov
2015-04-03 15:58   ` [PATCH v2 2/4] combine-diff.c: refactor: extract insert_path() Max Kirillov
2015-04-11 20:14     ` Junio C Hamano
2015-04-03 15:58   ` [PATCH v2 3/4] diff --cc: relax too strict paths picking Max Kirillov
2015-04-03 15:58   ` [PATCH v2 4/4] t4059: rewrite to be adaptive to hunk filtering Max Kirillov
2015-04-12  5:48     ` Junio C Hamano
2015-04-14  4:18       ` Max Kirillov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150412054332.GA28555@wheezy.local \
    --to=max@max630.net \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.