kvm.vger.kernel.org archive mirror
From: Alexander Graf <graf@amazon.com>
To: Jim Mattson <jmattson@google.com>, Aaron Lewis <aaronlewis@google.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Jonathan Corbet <corbet@lwn.net>,
	Sean Christopherson <sean.j.christopherson@intel.com>,
	Wanpeng Li <wanpengli@tencent.com>,
	"Joerg Roedel" <joro@8bytes.org>, kvm list <kvm@vger.kernel.org>,
	<linux-doc@vger.kernel.org>, LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] KVM: x86: Deflect unknown MSR accesses to user space
Date: Wed, 29 Jul 2020 11:06:46 +0200
Message-ID: <14035057-ea80-603b-0466-bb50767f9f7e@amazon.com>
In-Reply-To: <CALMp9eQ3OxhQZYiHPiebX=KyvjWQgxQEO-owjSoxgPKsOMRvjw@mail.gmail.com>



On 28.07.20 19:13, Jim Mattson wrote:
> On Tue, Jul 28, 2020 at 5:41 AM Alexander Graf <graf@amazon.com> wrote:
>>
>>
>>
>> On 28.07.20 10:15, Vitaly Kuznetsov wrote:
>>>
>>> Alexander Graf <graf@amazon.com> writes:
>>>
>>>> MSRs are weird. Some of them are normal control registers, such as EFER.
>>>> Some however are registers that really are model specific, not very
>>>> interesting to virtualization workloads, and not performance critical.
>>>> Others again are really just windows into package configuration.
>>>>
>>>> Out of these MSRs, only the first category is necessary to implement in
>>>> kernel space. Rarely accessed MSRs, MSRs that should be fine-tuned against
>>>> certain CPU models and MSRs that contain information on the package level
>>>> are much better suited for user space to process. However, over time we have
>>>> accumulated a lot of MSRs that are not the first category, but still handled
>>>> by in-kernel KVM code.
>>>>
>>>> This patch adds a generic interface to handle WRMSR and RDMSR from user
>>>> space. With this, any future MSR that is part of the latter categories can
>>>> be handled in user space.
> 
> This sounds similar to Peter Hornyack's RFC from 5 years ago:
> https://www.mail-archive.com/kvm@vger.kernel.org/msg124448.html.

Yeah, looks very similar. Do you know why it never got merged? I
couldn't spot a non-RFC version of it on the ML.

> 
>>>> Furthermore, it allows us to replace the existing "ignore_msrs" logic with
>>>> something that applies per-VM rather than on the full system. That way you
>>>> can run production VMs in parallel to experimental ones where you don't care
>>>> about proper MSR handling.
>>>>
>>>
>>> In theory, we can go further: userspace will give KVM the list of MSRs
>>> it is interested in. This list may even contain MSRs which are normally
>>> handled by KVM, in this case userspace gets an option to mangle KVM's
>>> reply (RDMSR) or do something extra (WRMSR). I'm not sure if there is a
>>> real need behind this, just an idea.
>>>
>>> The problem with this approach is: if currently some MSR is not
>>> implemented in KVM you will get an exit. When later someone comes with a
>>> patch to implement this MSR your userspace handling will immediately get
>>> broken so the list of not implemented MSRs effectively becomes an API :-)
> 
> Indeed. This is a legitimate concern. At Google, we have experienced
> this problem already, using Peter Hornyack's approach. We ended up
> commenting out some MSRs from kvm, which is less than ideal.

Yeah :(.

> 
>> Yeah, I'm not quite sure how to do this without bloating the kernel's
>> memory footprint too much though.
>>
>> One option would be to create a shared bitmap with user space. But that
>> would need to be sparse and quite big to be able to address all of
>> today's possible MSR indexes. From a quick glimpse at Linux's MSR
>> defines, there are:
>>
>>     0x00000000 - 0x00001000 (Intel)
>>     0x00001000 - 0x00002000 (VIA)
>>     0x40000000 - 0x50000000 (PV)
>>     0xc0000000 - 0xc0003000 (AMD)
>>     0xc0010000 - 0xc0012000 (AMD)
>>     0x80860000 - 0x80870000 (Transmeta)
>>
>> Another idea would be to turn the logic around and implement an
>> allowlist in KVM with all of the MSRs that KVM should handle. Through
>> that API, user space could ask KVM for an array of the MSRs it supports.
>> User space could then bounce that array back to KVM to have all in-KVM
>> supported MSRs handled in the kernel. Or it could remove the entries
>> that it wants to handle on its own.
>>
>> KVM internally could then save the list as a dense bitmap, translating
>> every list entry into its corresponding bit.
>>
>> While it does feel a bit overengineered, it would solve the problem that
>> we're turning in-KVM handled MSRs into an ABI.
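
To make the two options above concrete, here is a minimal sketch in plain
user-space C. It is not KVM code: the supported-MSR table, the struct, and
all function names are hypothetical and chosen only for illustration. It
first adds up the size of a flat bitmap covering the index ranges listed
above, then shows how an allowlist could be folded into a dense bitmap
with one bit per entry of KVM's own supported-MSR list.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* MSR index ranges quoted in the mail; end is exclusive. */
static const struct { uint32_t start, end; } msr_ranges[] = {
    { 0x00000000, 0x00001000 }, /* Intel */
    { 0x00001000, 0x00002000 }, /* VIA */
    { 0x40000000, 0x50000000 }, /* PV */
    { 0xc0000000, 0xc0003000 }, /* AMD */
    { 0xc0010000, 0xc0012000 }, /* AMD */
    { 0x80860000, 0x80870000 }, /* Transmeta */
};

/* Bytes needed for a flat one-bit-per-index bitmap over all ranges. */
uint64_t flat_bitmap_bytes(void)
{
    uint64_t bits = 0;
    for (size_t i = 0; i < sizeof(msr_ranges) / sizeof(msr_ranges[0]); i++)
        bits += (uint64_t)msr_ranges[i].end - msr_ranges[i].start;
    return bits / 8;
}

/* Hypothetical stand-in for KVM's in-kernel MSR list; must stay sorted. */
static const uint32_t kvm_supported_msrs[] = {
    0x00000010, /* IA32_TIME_STAMP_COUNTER */
    0x00000174, /* IA32_SYSENTER_CS */
    0x40000000, /* HV_X64_MSR_GUEST_OS_ID */
    0xc0000080, /* EFER */
    0xc0000081, /* STAR */
};
#define NR_SUPPORTED (sizeof(kvm_supported_msrs) / sizeof(kvm_supported_msrs[0]))

/* Dense bitmap: bit i corresponds to kvm_supported_msrs[i]. */
struct msr_allow_bitmap {
    uint64_t bits[(NR_SUPPORTED + 63) / 64];
};

static int cmp_u32(const void *a, const void *b)
{
    uint32_t x = *(const uint32_t *)a, y = *(const uint32_t *)b;
    return (x > y) - (x < y);
}

/* Position of msr in the supported list, or -1 if KVM does not know it. */
long msr_dense_index(uint32_t msr)
{
    const uint32_t *hit = bsearch(&msr, kvm_supported_msrs, NR_SUPPORTED,
                                  sizeof(msr), cmp_u32);
    return hit ? (long)(hit - kvm_supported_msrs) : -1;
}

/* Translate the list user space bounced back into the dense bitmap.
 * Entries KVM does not support are silently skipped. */
void msr_allow_set(struct msr_allow_bitmap *bm, const uint32_t *list, size_t n)
{
    memset(bm, 0, sizeof(*bm));
    for (size_t i = 0; i < n; i++) {
        long idx = msr_dense_index(list[i]);
        if (idx >= 0)
            bm->bits[idx / 64] |= 1ULL << (idx % 64);
    }
}

/* true: handle in the kernel; false: deflect the access to user space. */
bool msr_handled_in_kernel(const struct msr_allow_bitmap *bm, uint32_t msr)
{
    long idx = msr_dense_index(msr);
    return idx >= 0 && ((bm->bits[idx / 64] >> (idx % 64)) & 1);
}
```

With the ranges as listed, the flat bitmap comes to roughly 32 MiB per VM,
almost all of it for the PV range, while the dense variant needs one bit
per MSR that KVM actually implements. Any index missing from the bitmap,
whether removed by user space or unknown to KVM, takes the user space path.
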
> 
> It seems unlikely that userspace is going to know what to do with a
> large number of MSRs. I suspect that a small enumerated list will
> suffice. In fact, +Aaron Lewis is working on upstreaming a local
> Google patch set that does just that.

I tend to disagree with that sentiment. One of the motivations behind
this patch is to propagate invalid MSR accesses to user space, so that
logic like "ignore_msrs"[1] can move into user space. This is not very
useful for the cloud use case, but it does come in handy when you want
to run VMs that tolerate unimplemented MSRs in parallel to ones that do
not.

So whatever we implement, I would ideally want a mechanism at the end of 
the day that allows me to "trap the rest" into user space.
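
As a rough illustration of what "trap the rest" would enable, the sketch
below shows a per-VM replacement for the global "ignore_msrs" behavior on
the user space side. Every name in it (the policy enum, the structs, the
handler) is hypothetical; it does not reflect the interface of this patch
or any existing KVM API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-VM policy for MSRs nobody implements. */
enum unknown_msr_policy {
    UNKNOWN_MSR_INJECT_GP, /* strict: guest gets #GP, like ignore_msrs=0 */
    UNKNOWN_MSR_IGNORE,    /* lenient: reads return 0, writes are dropped */
};

struct vm_config {
    enum unknown_msr_policy policy;
};

/* Hypothetical description of one deflected RDMSR/WRMSR. */
struct msr_access {
    uint32_t index;
    bool     is_write;
    uint64_t data;      /* in: value written; out: value read */
    bool     inject_gp; /* out: ask the kernel to inject #GP */
};

/* Called by the VMM for every MSR access that reached user space and
 * that the VMM itself has no emulation for. */
void handle_unknown_msr(const struct vm_config *vm, struct msr_access *acc)
{
    if (vm->policy == UNKNOWN_MSR_IGNORE) {
        if (!acc->is_write)
            acc->data = 0;
        acc->inject_gp = false;
    } else {
        acc->inject_gp = true;
    }
}
```

Because the policy lives in the per-VM configuration, a strict production
VM and a lenient experimental VM can run side by side on the same host.
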


Alex

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/kvm/x86.c#n114





