linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Linus Torvalds <torvalds@linux-foundation.org>
To: Borislav Petkov <bp@suse.de>
Cc: x86-ml <x86@kernel.org>, lkml <linux-kernel@vger.kernel.org>
Subject: Re: [GIT PULL] RAS updates for 5.18
Date: Fri, 25 Mar 2022 13:01:15 -0700	[thread overview]
Message-ID: <CAHk-=wgXbSa8yq8Dht8at+gxb_idnJ7X5qWZQWRBN4_CUPr=eQ@mail.gmail.com> (raw)
In-Reply-To: <YjtZAvQnshp1pZIh@zn.tnic>

[ I've written this kind of reply multiple times before and I
_thought_ we had something in the docs about this, but I can't find
them, so here goes again ]

On Wed, Mar 23, 2022 at 10:29 AM Borislav Petkov <bp@suse.de> wrote:
>
> But before you do, a git question if I may:
> [.. details removed for brevity..]
> However, when creating the diffstat for the pull request, it would
> add additional files to it from tip/locking/core even if all the
> tip/locking/core changes are already in your master branch:

So what you are describing is a very fundamental thing - your branch
has multiple separate starting points, since you had different
branches that you merged into your tree..

Sometimes having multiple branches doesn't actually cause that,
because the different branches may all have the same base starting
point.

Git calls these things "merge bases", because those starting points is
what you have to take into account when merging, they are the "base"
for actually resolving the differences that come in through multiple
branches.

And git handles that perfectly fine when merging by doing all the
appropriate magic. And "git log" has no problem with it either - you
can list all the commits that are in your head but are *not* in some
arbitrary number of merge bases just fine).

But when you do a "git diff", things are different (and "git
request-pull" basically just does a diff to show what the thing was
about).

A "diff" is fundamentally something you do on two end-points. You have
a beginning, you have an end, and you ask "what changed between these
two end-points".

But that fundamentally means that when you have multiple different
merge bases, and you ask "what changed since the beginning and the
current state", your question is fundamentally ambiguous. There is not
a "the beginning". There are *multiple* beginnings.

So what git will do it to pick _one_ beginning, and just use that.

And that means that yes, the diff will show the changes since that
beginning, but since the end result depends on the _other_ beginning
too, it will show the changes that came from that other beginning as
well.

Sometimes those changes end up being empty, because the "first
beginning" might already have had all of that. So sometimes you might
not even notice that what "git diff" gave you was ambiguous.

So "git request-pull" does both a log (for the shortlog of commits)
and a diff (for the diffstat), and the log should always be correct,
but the diffstat will have this ambiguity problem if you have multiple
merge bases.

> After poking at it a bit more, I found a hint as to what it might be
> complaining about:
>
> $ git diff --stat master...ras_core_for_v5.18_rc1
> warning: master...ras_core_for_v5.18_rc1: multiple merge bases, using 754e0b0e35608ed5206d6a67a791563c631cec07

Yeah, so using that "..." format will warn about how there are
multiple possible merge bases, and point out how it just picked one at
random.

> So it looks like the diffstat for the pull request is created by using
> the -rc4 as the merge base:
>
> $ git diff --stat v5.17-rc4..ras_core_for_v5.18_rc1
> ...
>
> while the correct diffstat should be from the merge-commit onwards:

No. There is no such thing as a *correct* merge base. You had two, and
git picked one of them. In the general case there just isn't a correct
answer.

Now, it turns out that you shouldn't have done a merge at all. I'm not
sure why that commit c0f6799de2a0 ("Merge tip:locking/core into
tip:ras/core") even exists, because it could have been done as a
fast-forward. Did you use "--no-ff" to explicitly not do that?

So *because* you have that pointless merge that could just have been a
fast-forward, you think that "hey, if it had just picked the other
merge base it would all have been fine". But in a normal merge
situation, the two merge bases would both have had some work that
wasn't in the other side, so that's just because you did something
odd.

> So, long story short, what am I doing wrong?

So in the general case, you aren't doing anything wrong: if you merge
multiple real branches, it's just that "git diff" cannot find a single
unique point to use as the base, and you'll get some odd random diff.

But if you are a developer who merges multiple real branches, you
obviously know how to merge things, and one way to sort it out is to
basically do a test-merge just for yourself:

    # WWLS? ("What would Linus See?")
    git branch -b test-merge linus
    git merge my-branch
    git diff -C --stat --summary ORIG_HEAD..
    .. save that away ..

    # go back to your real work, and remove that test-merge
    git checkout <normal-branch>
    git branch -D test-merge

will generate a diffstat of what a merge (which fundamentally knows
how to resolve multiple merge bases) would generate.

(Obviously you can just do the above in a completely separate git tree
too, if you don't like doing those temporary branches that might mess
up your working tree).

The other alternative is to just send me the bogus diffstat - I'm
sadly quite used to it, since a number of people just do "git
request-pull", see that it's odd, don't understand why, and just let
me sort it out.

Now the good news is that people who are afraid of merges and the
above kind of complexity will never actually see this situation. You
can't get multiple merge bases if you don't do any merges yourself.

So this kind of git complexity only happens to people who are supposed
to be able to handle it. You clearly figured out what was going on,
you didn't perhaps just realize the full story.



                  Linus

  reply	other threads:[~2022-03-25 20:05 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-23 17:29 Borislav Petkov
2022-03-25 20:01 ` Linus Torvalds [this message]
2022-03-25 20:40   ` Borislav Petkov
2022-03-25 20:51     ` Linus Torvalds
2022-03-28 14:21     ` Jonathan Corbet
2022-03-28 15:56       ` Borislav Petkov
2022-03-25 21:07 ` pr-tracker-bot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAHk-=wgXbSa8yq8Dht8at+gxb_idnJ7X5qWZQWRBN4_CUPr=eQ@mail.gmail.com' \
    --to=torvalds@linux-foundation.org \
    --cc=bp@suse.de \
    --cc=linux-kernel@vger.kernel.org \
    --cc=x86@kernel.org \
    --subject='Re: [GIT PULL] RAS updates for 5.18' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).