From: Amaan Cheval <amaan.cheval@gmail.com>
To: Sean Christopherson <seanjc@google.com>
Cc: brak@gameservers.com, kvm@vger.kernel.org
Subject: Re: Deadlock due to EPT_VIOLATION
Date: Wed, 2 Aug 2023 22:15:36 +0530
Message-ID: <CAG+wEg2x-oGALCwKkHOxcrcdjP6ceU=K52UoQE2ht6ut1O46ug@mail.gmail.com>
In-Reply-To: <ZMp3bR2YkK2QGIFH@google.com>

> LOL, NUMA autobalancing.  I have a longstanding hatred of that feature.  I'm sure
> there are setups where it adds value, but from my perspective it's nothing but
> pain and misery.

Do you think autobalancing is increasing the odds of hitting some edge-case
race condition, perhaps?
I find it really curious that numa_balancing definitely affects this issue, but
particularly when thp=0. Could it be that, with transparent hugepages disabled,
there are simply far more EPT entries to install, which increases the
likelihood of a race condition or lock contention of some sort?
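
When comparing hosts, it can help to record the NUMA balancing and THP
settings actually in effect. The following is a minimal sketch, not something
taken from this thread: the helper names are hypothetical, and it assumes the
usual Linux locations /proc/sys/kernel/numa_balancing and
/sys/kernel/mm/transparent_hugepage/enabled. The "thp=0"/"thp=1" shorthand used
in this thread is not mapped to a specific mode here; the sketch only reports
the bracketed (active) mode.

```python
import re
from pathlib import Path

# Assumed standard locations; adjust if your kernel exposes them elsewhere.
NUMA_BALANCING = Path("/proc/sys/kernel/numa_balancing")
THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def active_thp_mode(text):
    """Return the bracketed mode from e.g. 'always [madvise] never'."""
    match = re.search(r"\[(\w+)\]", text)
    return match.group(1) if match else None


def read_host_settings(numa_path=NUMA_BALANCING, thp_path=THP_ENABLED):
    """Read both settings; a value is None if its file is missing or unparsable."""
    settings = {"numa_balancing": None, "thp": None}
    try:
        settings["numa_balancing"] = int(Path(numa_path).read_text().strip())
    except (OSError, ValueError):
        pass
    try:
        settings["thp"] = active_thp_mode(Path(thp_path).read_text())
    except OSError:
        pass
    return settings


if __name__ == "__main__":
    print(read_host_settings())
```

Logging this output alongside each lockup report makes it easier to check
whether a given setting combination really correlates with the issue.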

> > They still remain locked up, but that might be because the original cause of the
> > looping EPT_VIOLATIONs corrupted/crashed them in an unrecoverable way (are there
> > any ways you can think of that that might happen)?
>
> Define "remain locked up".  If the vCPUs are actively running in the guest and
> making forward progress, i.e. not looping on VM-Exits on a single RIP, then they
> aren't stuck from KVM's perspective.

Right, the traces look like they're not stuck (i.e. no looping on the same
RIP). By "remain locked up" I mean that the VM is unresponsive on both the
console and services (such as ssh) used to connect to it.
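
The "looping on the same RIP" check can be made mechanical. The following is a
rough sketch under stated assumptions, not a tool used in this thread: it
assumes kvm_exit trace lines contain "reason <NAME> rip 0x<hex>" (the exact
tracepoint format varies across kernel versions), and the function names and
thresholds are hypothetical.

```python
import re
from collections import Counter

# Assumed line shape: "... kvm_exit: ... reason EPT_VIOLATION rip 0x... ..."
EXIT_RE = re.compile(r"kvm_exit:.*?\breason (\S+) rip (0x[0-9a-fA-F]+)")


def rip_histogram(lines, reason="EPT_VIOLATION"):
    """Count exits with the given reason, keyed by guest RIP."""
    counts = Counter()
    for line in lines:
        match = EXIT_RE.search(line)
        if match and match.group(1) == reason:
            counts[int(match.group(2), 16)] += 1
    return counts


def looks_stuck(lines, reason="EPT_VIOLATION", threshold=0.9, min_exits=100):
    """Heuristic: True if one RIP accounts for most exits of this reason."""
    counts = rip_histogram(lines, reason)
    total = sum(counts.values())
    if total < min_exits:
        return False  # too few samples to say anything
    _, top_count = counts.most_common(1)[0]
    return top_count / total >= threshold
```

Feeding it the lines of a captured trace (for one vCPU at a time) gives a quick
yes/no, while rip_histogram() shows which RIPs dominate. The thresholds are
arbitrary starting points.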

> But that doesn't mean the guest didn't take punitive action when a vCPU was
> effectively stalled indefinitely by KVM, e.g. from the guest's perspective the
> stuck vCPU will likely manifest as a soft lockup, and that could lead to a panic()
> if the guest is a Linux kernel running with softlockup_panic=1.

So far, none of the guests that hit this issue were running with
softlockup_panic=1, so it's hard to confirm, but it makes sense that the guest
took punitive action in response to being stalled.

Any thoughts on how we might reproduce the issue or trace it down better?

Anything look suspect in the function_graph trace?
(Note that this was on a host that had numa_balancing=0,thp=1 from before
the guest booted, and it still ended up in the EPT_VIOLATION loop and
"locked up" (unresponsive on console).)

