From: Sean Christopherson <seanjc@google.com>
To: Amaan Cheval <amaan.cheval@gmail.com>
Cc: brak@gameservers.com, kvm@vger.kernel.org
Subject: Re: Deadlock due to EPT_VIOLATION
Date: Tue, 25 Jul 2023 10:30:34 -0700
Message-ID: <ZMAGuic1viMLtV7h@google.com>
In-Reply-To: <CAG+wEg21f6PPEnP2N7oE=48PBSd_2bHOcRsTy_ZuBpa2=dGuiA@mail.gmail.com>

On Mon, Jul 24, 2023, Amaan Cheval wrote:
> > > I've also run a `function_graph` trace on some of the affected hosts, if you
> > > think it might be helpful...
> >
> > It wouldn't hurt to see it.
> >
> 
> Here you go:
> https://transfer.sh/SfXSCHp5xI/ept-function-graph.log

Yeesh.  There is a ridiculous amount of potentially problematic activity.  KSM is
active in that trace, it looks like NUMA balancing might be in play, there might
be hugepage shattering, etc.

> > > Another interesting observation we made was that when we migrate a guest to a
> > > different host, the guest _stays_ locked up and throws EPT violations on the new
> > > host as well
> >
> > Ooh, that's *very* interesting.  That pretty much rules out memslot and mmu_notifier
> > issues.
> 
> Good to know, thanks!

Let me rephrase that statement: it rules out a certain class of memslot and
mmu_notifier bugs, namely bugs where KVM would incorrectly leave an invalidation
refcount (for lack of a better term) elevated.  It doesn't mean memslot changes
and/or mmu_notifier events aren't at fault.

Can you migrate a hung guest to a host that is completely unloaded?  And ideally,
disable KSM and NUMA autobalancing on the target host.  And then get a
function_graph trace on that host, assuming the vCPU remains stuck.  There is *so*
much going on in the above graph that it's impossible to determine if there's a
kernel bug, e.g. it's possible the vCPU is stuck purely because it's being thrashed
to the point where it can't make forward progress.
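
For reference, a rough sketch of what that prep on the target host could look
like.  It assumes KSM is controlled via /sys/kernel/mm/ksm/run, NUMA balancing
via /proc/sys/kernel/numa_balancing, and that tracefs is reachable under
/sys/kernel/debug/tracing (on some systems it is at /sys/kernel/tracing
instead).  SYSROOT and TRACEFS are hypothetical overrides added only so the
helpers can be dry-run against a scratch directory, and the thread ID passed to
trace_vcpu is a placeholder for the stuck vCPU task.

```shell
#!/bin/sh
# Sketch only.  SYSROOT and TRACEFS are hypothetical knobs (not kernel
# interfaces) so the helpers can be pointed at a scratch directory.

# Stop KSM and NUMA autobalancing on the target host.
quiesce_host() {
    root="${SYSROOT:-}"
    # 2 = stop ksmd and unmerge already-shared pages (0 would only stop ksmd).
    echo 2 > "$root/sys/kernel/mm/ksm/run"
    # 0 = disable automatic NUMA balancing.
    echo 0 > "$root/proc/sys/kernel/numa_balancing"
}

# Start a function_graph trace restricted to one task, e.g. the stuck vCPU.
trace_vcpu() {
    tid="$1"
    t="${TRACEFS:-/sys/kernel/debug/tracing}"
    echo 0 > "$t/tracing_on"                    # pause while reconfiguring
    echo function_graph > "$t/current_tracer"
    echo "$tid" > "$t/set_ftrace_pid"           # only trace this task
    echo 1 > "$t/tracing_on"
}
```

Run as root, something like `quiesce_host; trace_vcpu <tid>`, wait a few
seconds, then copy out the `trace` file from the same tracing directory.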

> > To mostly confirm this is likely what's happening, can you enable all of the async
> > #PF tracepoints in KVM?  The exact tracepoints might vary depending on which kernel
> > version you're running, just enable everything with "async" in the name, e.g.
> >
> >   # ls -1 /sys/kernel/debug/tracing/events/kvm | grep async
> >   kvm_async_pf_completed/
> >   kvm_async_pf_not_present/
> >   kvm_async_pf_ready/
> >   kvm_async_pf_repeated_fault/
> >   kvm_try_async_get_page/
> >
> > If kvm_try_async_get_page() is more or less keeping pace with the "pf_taken" stat,
> > then this is likely what's happening.
> 
> I did this and unfortunately, don't see any of these functions being called at
> all despite EPT_VIOLATIONs still being thrown and pf_taken still climbing.
> (Tried both with `trace-cmd -e ...` and using `bpftrace` and none of those
> functions are being called during the deadlock/guest being stuck.)

Well fudge.
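
For anyone repeating the tracepoint check quoted above by hand rather than via
trace-cmd, a minimal sketch of "enable everything with async in the name"
follows.  It assumes the per-event `enable` files under events/kvm in tracefs;
TRACEFS is a hypothetical override (defaulting to /sys/kernel/debug/tracing) so
the helper can be tried against a scratch directory first.

```shell
#!/bin/sh
# Sketch only.  TRACEFS is a hypothetical knob, not a kernel interface.

# Enable every KVM tracepoint whose name contains "async" and print its name.
enable_kvm_async_events() {
    events="${TRACEFS:-/sys/kernel/debug/tracing}/events/kvm"
    for dir in "$events"/*async*/; do
        # Skip the unexpanded glob and anything lacking an enable file.
        [ -f "${dir}enable" ] || continue
        echo 1 > "${dir}enable"
        basename "$dir"
    done
}
```

If nothing is printed, no matching tracepoints exist on that kernel, which is
a different result from the tracepoints existing but never firing.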

> > And then to really confirm, this small bpf program will yell if get_user_pages_remote()
> > fails when attempting to get a single page (which is always the case for KVM's async
> > #PF usage).
> >
> > $ tail gup_remote.bt
> > kretfunc:get_user_pages_remote
> > {
> >         if ( args->nr_pages == 1 && retval != 1 ) {
> >                 printf("Failed remote gup() on address %lx, ret = %d\n", args->start, retval);
> >         }
> > }
> >
> 
> Our hosts don't have kfunc/kretfunc support (`bpftrace --info` reports
> `kret: no`), but I tried just a kprobe to verify that get_user_pages_remote is
> being called at all - does not seem like it is, unfortunately:
> 
> ```
> # bpftrace -e 'kprobe:get_user_pages_remote { @[comm] = count(); }'
> Attaching 1 probe...
> ^C
> #
> ```
> 
> So I guess that disproves the async #PF theory?

Yeah.  Definitely not related to async page faults.

