From: Julian Stecklina <julian.stecklina@cyberus-technology.de>
To: "seanjc@google.com" <seanjc@google.com>
Cc: "kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	Thomas Prescher <thomas.prescher@cyberus-technology.de>
Subject: Re: Timer Signals vs KVM
Date: Tue, 16 Apr 2024 12:44:13 +0000
Message-ID: <af2ede328efee9dc3761333bd47648ee6f752686.camel@cyberus-technology.de>
In-Reply-To: <Zgszp5wvxGtu2YHS@google.com>

On Mon, 2024-04-01 at 15:22 -0700, Sean Christopherson wrote:
> On Wed, Mar 27, 2024, Julian Stecklina wrote:
> 
> > 
> > When we enable nested virtualization, we see what looks like corruption
> > in the nested guest. The guest trips over exceptions that shouldn't be
> > there. We are currently debugging this to find out details, but the
> > setup is pretty painful and it will take a bit. If we disable the timer
> > signals, this issue goes away (at the cost of broken VBox timers
> > obviously...). This is weird and has left us wondering, whether there
> > might be something broken with signals in this scenario, especially
> > since none of the other VMMs uses this method.
> 
> It's certainly possible there's a kernel bug, but it's probably more
> likely a problem in your userspace.  QEMU (and other VMMs) do use signals
> to interrupt vCPUs, e.g. to take control for live migration.  That's
> obviously different than what you're doing, and will have orders of
> magnitude lower volume of signals in nested guests, but the effective
> coverage isn't "zero".

After some weeks of bug hunting, my colleague Thomas found the issue, and we
have posted a patch:

https://lore.kernel.org/kvm/20240416123558.212040-1-julian.stecklina@cyberus-technology.de/T/#t

Given the complexity of the nesting code, we're not entirely sure whether this
is the best way of fixing this, though.

But with this patch we can run uXen (as used by HP Sure Click aka Bromium)
inside of VirtualBox. It also fixes the other nesting problems we saw with
VBox/KVM!

This triggers in VirtualBox but not in QEMU because there are cases where
VirtualBox marks CR4 dirty even though its value hasn't changed.
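
To illustrate what "dirty even though it hasn't changed" means, here is a
minimal sketch of a userspace register cache. All names in it (vcpu_cache,
set_cr4_always_dirty, set_cr4_if_changed) are hypothetical; this is neither
VirtualBox nor KVM code, and it is not the fix from the patch above. It only
shows how a VMM can end up flagging a register for resync when the value is
identical:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical VMM-side register cache; not VirtualBox or KVM code. */
struct vcpu_cache {
    uint64_t cr4;
    bool cr4_dirty; /* true: CR4 would be pushed to the kernel before the next run */
};

/* Unconditional variant: every write flags CR4 as dirty, even a no-op write. */
void set_cr4_always_dirty(struct vcpu_cache *c, uint64_t val)
{
    c->cr4 = val;
    c->cr4_dirty = true;
}

/* Compare-first variant: the dirty flag is only set when the value differs. */
void set_cr4_if_changed(struct vcpu_cache *c, uint64_t val)
{
    if (c->cr4 == val)
        return;
    c->cr4 = val;
    c->cr4_dirty = true;
}
```

With the unconditional variant, writing back the same CR4 value still leaves
cr4_dirty set, so an unchanged register would be handed to the kernel again.
The compare-first variant leaves the flag alone in that case.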

Thanks,

Julian
