linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jason Gunthorpe <jgg@ziepe.ca>
To: Sean Christopherson <seanjc@google.com>
Cc: "Andrew Morton" <akpm@linux-foundation.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	"David Rientjes" <rientjes@google.com>,
	"Ben Gardon" <bgardon@google.com>,
	"Michal Hocko" <mhocko@suse.com>,
	"Jérôme Glisse" <jglisse@redhat.com>,
	"Andrea Arcangeli" <aarcange@redhat.com>,
	"Johannes Weiner" <hannes@cmpxchg.org>,
	"Dimitri Sivanich" <dimitri.sivanich@hpe.com>
Subject: Re: [PATCH] mm/oom_kill: Ensure MMU notifier range_end() is paired with range_start()
Date: Wed, 10 Mar 2021 20:28:07 -0400	[thread overview]
Message-ID: <20210311002807.GQ444867@ziepe.ca> (raw)
In-Reply-To: <20210310213117.1444147-1-seanjc@google.com>

On Wed, Mar 10, 2021 at 01:31:17PM -0800, Sean Christopherson wrote:
> Invoke the MMU notifier's .invalidate_range_end() callbacks even if one
> of the .invalidate_range_start() callbacks failed.  If there are multiple
> notifiers, the notifier that did not fail may have performed actions in
> its ...start() that it expects to unwind via ...end().  Per the
> mmu_notifier_ops documentation, ...start() and ...end() must be paired.

No this is not OK, if invalidate_start returns EBUSY invalidate_end
should *not* be called.

As you observed:
 
> The only in-kernel usage that is fatally broken is the SGI UV GRU driver,
> which effectively blocks and sleeps fault handlers during ...start(), and
> unblocks/wakes the handlers during ...end().  But, the only users that
> can fail ...start() are the i915 and Nouveau drivers, which are unlikely
> to collide with the SGI driver.

It used to be worse but I've since moved most of the other problematic
users to the itree notifier which doesn't have the problem.

> KVM is the only other user of ...end(), and while KVM also blocks fault
> handlers in ...start(), the fault handlers do not sleep and originate in

KVM will have its mmu_notifier_count become imbalanced:

static int kvm_mmu_notifier_invalidate_range_start(struct mmu_notifier *mn,
                                        const struct mmu_notifier_range *range)
{
        kvm->mmu_notifier_count++;

static void kvm_mmu_notifier_invalidate_range_end(struct mmu_notifier *mn,
                                        const struct mmu_notifier_range *range)
{
        kvm->mmu_notifier_count--;

Which I believe is fatal to kvm? These notifiers certainly do not only
happen at process exit.

So, both of the remaining _end users become corrupted with this patch!

I've tried to fix this before, the only thing that seems like it will
work is to sort the hlist and only call ends that have succeeded their
starts by comparing pointers with <.

This is because the hlist can have items removed concurrently under
SRCU so there is no easy way to compute the subset that succeeded in
calling start.

I had a prior effort to just ban more than 1 hlist notifier with end,
but it turns out kvm on ARM uses two all the time (IIRC)

> Found by inspection.  Verified by adding a second notifier in KVM
> that

AFAIK it is a non-problem in real life because kvm is not mixed with
notifier_start's that fail (and GRU is dead?). Everything else was
fixed by moving to itree.

Jason

  parent reply	other threads:[~2021-03-11  0:29 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-10 21:31 [PATCH] mm/oom_kill: Ensure MMU notifier range_end() is paired with range_start() Sean Christopherson
2021-03-11  0:06 ` Andrew Morton
2021-03-11  0:28 ` Jason Gunthorpe [this message]
2021-03-11  1:20   ` Sean Christopherson
2021-03-11  1:50     ` Jason Gunthorpe
2021-03-11  7:22       ` Sean Christopherson
2021-03-11 16:20   ` Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210311002807.GQ444867@ziepe.ca \
    --to=jgg@ziepe.ca \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=bgardon@google.com \
    --cc=dimitri.sivanich@hpe.com \
    --cc=hannes@cmpxchg.org \
    --cc=jglisse@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.com \
    --cc=rientjes@google.com \
    --cc=seanjc@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).