From: Jim Mattson <jmattson@google.com>
To: Maxim Levitsky <mlevitsk@redhat.com>
Cc: "kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	Sean Christopherson <seanjc@google.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	David Gilbert <dgilbert@redhat.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	Peter Xu <peterx@redhat.com>
Subject: Re: Why do we need KVM_REQ_GET_NESTED_STATE_PAGES after all
Date: Thu, 27 Jan 2022 11:39:42 -0800
Message-ID: <CALMp9eT2cP7kdptoP3=acJX+5_Wg6MXNwoDh42pfb21-wdXvJg@mail.gmail.com>
In-Reply-To: <fc6bea3249f26e8dd973ce1bd1e3f6f42c142469.camel@redhat.com>

On Thu, Jan 27, 2022 at 8:04 AM Maxim Levitsky <mlevitsk@redhat.com> wrote:
>
> I would like to raise a question about this elephant in the room, which I have wanted to understand for
> quite a long time.
>
> For my nested AVIC work I once again need to change the KVM_REQ_GET_NESTED_STATE_PAGES code and once
> again I am asking myself, maybe we can get rid of this code, after all?

We (GCE) use it so that, during post-copy, a vCPU thread can exit to
userspace and demand these pages from the source itself, rather than
funneling all demands through a single "demand paging listener"
thread, which I believe is the equivalent of qemu's userfaultfd "fault
handler" thread. Our (internal) post-copy mechanism scales quite well,
because most demand paging requests are triggered by an EPT violation,
which happens to be a convenient place to exit to userspace. Very few
pages are typically demanded as a result of
kvm_vcpu_{read,write}_guest, where the vCPU thread is so deep in the
kernel call stack that it has to request the page via the demand
paging listener thread. With nested virtualization, the various vmcs12
pages consulted directly by kvm (bypassing the EPT tables) were a
scalability issue.

(Note that, unlike upstream, we don't call nested_get_vmcs12_pages
directly from VMLAUNCH/VMRESUME emulation; we always call it as a
result of this request that you don't like.)
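
The deferred-request pattern described above can be sketched as a toy
model. This is not KVM code: ToyVcpu, userspace_loop, the exit tuple
and the REQ_GET_NESTED_STATE_PAGES constant are hypothetical names
made up for illustration, and the model assumes that a missing vmcs12
page causes an exit to userspace while the request stays pending.

```python
from dataclasses import dataclass, field

# Hypothetical request bit; stands in for KVM_REQ_GET_NESTED_STATE_PAGES.
REQ_GET_NESTED_STATE_PAGES = 1 << 0


@dataclass
class ToyVcpu:
    resident: set        # guest frame numbers already present on the destination
    vmcs12_gfns: list    # pages the hypervisor reads directly, bypassing EPT
    requests: int = 0
    exits: list = field(default_factory=list)

    def set_nested_state(self):
        # Restoring nested state only records that the pages are needed;
        # it does not touch them yet.
        self.requests |= REQ_GET_NESTED_STATE_PAGES

    def run(self):
        # One KVM_RUN-like call. The request is serviced at a point where
        # exiting to userspace is still easy.
        if self.requests & REQ_GET_NESTED_STATE_PAGES:
            for gfn in self.vmcs12_gfns:
                if gfn not in self.resident:
                    self.exits.append(gfn)
                    return ("demand_page", gfn)  # request stays pending
            self.requests &= ~REQ_GET_NESTED_STATE_PAGES
        return ("guest_entered", None)


def userspace_loop(vcpu, fetch_from_source):
    # The vCPU thread itself fetches each missing page from the source,
    # instead of funneling the demand through a single listener thread.
    while True:
        reason, gfn = vcpu.run()
        if reason == "guest_entered":
            return
        fetch_from_source(gfn)
        vcpu.resident.add(gfn)
```

In this model each vCPU thread drives its own fetches, which is the
scalability property the paragraph above attributes to exiting to
userspace.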

As we work on converting from our (hacky) demand paging scheme to
userfaultfd, we will have to solve the scalability issue anyway
(unless someone else beats us to it). Eventually, I expect that our
need for this request will go away.

Honestly, without the exits to userspace, I don't really see how this
request buys you anything upstream. When I originally submitted it, I
was prepared for rejection, but Paolo said that qemu had a similar
need for it, and I happily never questioned that assertion.

Thread overview: 6+ messages
2022-01-27 16:03 Why do we need KVM_REQ_GET_NESTED_STATE_PAGES after all Maxim Levitsky
2022-01-27 19:39 ` Jim Mattson [this message]
2022-01-30 14:29   ` Maxim Levitsky
2022-01-30 23:46     ` Jim Mattson
2022-01-31 10:31       ` Maxim Levitsky
2022-01-31 17:37         ` Jim Mattson