linux-kernel.vger.kernel.org archive mirror
From: Sean Christopherson <seanjc@google.com>
To: Yu Zhang <yu.c.zhang@linux.intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	Wanpeng Li <wanpengli@tencent.com>,
	Jim Mattson <jmattson@google.com>, Joerg Roedel <joro@8bytes.org>,
	kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	Eric Li <ercli@ucdavis.edu>, David Matlack <dmatlack@google.com>,
	Oliver Upton <oupton@google.com>,
	Liu Jingqi <jingqi.liu@intel.com>
Subject: Re: [PATCH v5 05/15] KVM: nVMX: Let userspace set nVMX MSR to any _host_ supported value
Date: Tue, 8 Nov 2022 18:35:21 +0000
Message-ID: <Y2qhaSr/d2ds+nqD@google.com>
In-Reply-To: <20221108102120.qdlgqlgvdi6wi22u@linux.intel.com>

On Tue, Nov 08, 2022, Yu Zhang wrote:
> On Mon, Nov 07, 2022 at 03:06:51PM +0000, Sean Christopherson wrote:
> > On Mon, Nov 07, 2022, Yu Zhang wrote:
> > > On Thu, Nov 03, 2022 at 04:53:11PM +0000, Sean Christopherson wrote:
> > > > Ideally, KVM should NEVER manipulate VMX MSRs in response to guest CPUID changes.
> > > > That's what I was referring to earlier by commits:
> > 
> > ...
> > 
> > > Thanks Sean. Let me try to rephrase my understanding of your statement
> > > (and please feel free to correct me):
> > > 
> > > 1> For now, what vmx_adjust_secondary_exec_control() does, is to enable/
> > > disable a feature in VMX MSR(and nVMX MSR) based on cpuid changes.
> > > 2> What makes sense is, if a feature is 
> > > 	a. disabled by guest CPUID, it shall not be exposed in guest VMX MSR;
> > > 	b. enabled by guest CPUID, it could be either exposed or hidden in
> > > 	guest VMX MSR.
> > > 3> So your previous change is to guarantee 2.a, and the userspace VMM can choose
> > > to follow either choice in 2.b (depending on whether it believes this
> > > feature is correctly supported by KVM in nested).
> > > 
> > > Is the above understanding correct?
> > 
> > Not quite.  Again, in an ideal world, KVM would not modify the VMX MSRs based on
> > guest CPUID.  But it's possible userspace is relying on KVM to hide a feature from
> > L2 if it's hidden from L1, so to avoid breaking an otherwise valid userspace config,
> > it's worth enforcing that in KVM.
> > 
> 
> Sorry, maybe I should understand this way:
> 
> In theory, KVM shall not modify guest VMX MSRs in response to guest CPUID
> updates. Therefore we shall not enforce the exposure of a feature in the guest VMX
> MSR just because it is enabled in guest CPUID (e.g., the userspace VMM can choose
> to hide such a feature so long as it believes KVM cannot provide correct nested
> support for it).
> 
> Conversely, it is not reasonable for the userspace VMM to expose a feature in
> the guest VMX MSR settings if that feature is disabled in the guest's CPUID. So
> KVM shall help to make sure such a feature is hidden when guest CPUID changes.

No.  Again, KVM _should never_ manipulate VMX MSRs in response to CPUID changes.
Keeping the existing behavior would be done purely to maintain backwards
compatibility with existing userspace, not because it's strictly the right thing to do.

E.g. as a strawman, a weird userspace could do KVM_SET_MSRS => KVM_SET_CPUID =>
KVM_SET_CPUID, where the first KVM_SET_CPUID resets to a base config and the second
KVM_SET_CPUID incorporates "optional" features.  In that case, clearing bits in
the VMX MSRs on the first KVM_SET_CPUID would do the wrong thing if the second
KVM_SET_CPUID enabled the relevant features.
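
The ordering problem in the strawman above can be made concrete with a toy
userspace model.  This is only a sketch and not KVM code: toy_vcpu,
toy_set_msr(), toy_set_cpuid_clear_only() and TOY_CTRL_FEATURE are hypothetical
names, and the model assumes a single control bit plus a "clear on disable,
never re-enable" policy.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for one secondary execution control bit. */
#define TOY_CTRL_FEATURE (1u << 0)

/* Toy vCPU state: only the allowed-1 half of one nested VMX control MSR. */
struct toy_vcpu {
	uint32_t secondary_ctls_high;
};

/* Models KVM_SET_MSRS: userspace writes the value it wants L1 to see. */
static void toy_set_msr(struct toy_vcpu *v, uint32_t allowed1)
{
	v->secondary_ctls_high = allowed1;
}

/*
 * Models KVM_SET_CPUID under an assumed "disable only" policy: the control
 * is cleared when the feature is absent from CPUID and never set again.
 */
static void toy_set_cpuid_clear_only(struct toy_vcpu *v, bool feature_in_cpuid)
{
	if (!feature_in_cpuid)
		v->secondary_ctls_high &= ~TOY_CTRL_FEATURE;
}

/* Replays the strawman sequence; returns true if the control survived. */
static bool toy_strawman_keeps_control(void)
{
	struct toy_vcpu v = { 0 };

	toy_set_msr(&v, TOY_CTRL_FEATURE);   /* KVM_SET_MSRS: expose control  */
	toy_set_cpuid_clear_only(&v, false); /* 1st KVM_SET_CPUID: base config */
	toy_set_cpuid_clear_only(&v, true);  /* 2nd KVM_SET_CPUID: feature on  */

	return (v.secondary_ctls_high & TOY_CTRL_FEATURE) != 0;
}
```

Under these assumptions toy_strawman_keeps_control() returns false: the bit
that userspace set via the MSR write is dropped by the base-config pass and
nothing restores it when the feature is enabled afterwards.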

AFAIK, no userspace actually does something odd like that, whereas there are VMMs
that do KVM_SET_MSRS before KVM_SET_CPUID, e.g. disable a feature in VMX MSRs but
later enable the feature in CPUID for L1.  And so disabling features is likely
safe-ish, but enabling features most definitely can cause problems for userspace.

Hrm, actually, there are likely older VMMs that never set VMX MSRs, and so dropping
the "enable features" code might not be safe either.  Grr.  The obvious solution
would be to add a quirk, but maybe we can avoid a quirk by skipping KVM's
misguided updates if userspace has set the MSR.  That should work for a userspace
that deliberately sets the MSR during setup, and for a userspace that blindly
migrates the MSR since the migrated value should already be correct/sane.

E.g.

diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h
index 45162c1bcd8f..671479cd7721 100644
--- a/arch/x86/kvm/vmx/capabilities.h
+++ b/arch/x86/kvm/vmx/capabilities.h
@@ -51,6 +51,7 @@ struct nested_vmx_msrs {
        u64 cr4_fixed1;
        u64 vmcs_enum;
        u64 vmfunc_controls;
+       bool secondary_set_by_userspace;
 };
 
 struct vmcs_config {
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 62e3967cf131..3f691ed169d8 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -1257,6 +1257,9 @@ vmx_restore_control_msr(struct vcpu_vmx *vmx, u32 msr_index, u64 data)
        if (!is_bitwise_subset(supported, data, GENMASK_ULL(63, 32)))
                return -EINVAL;
 
+       if (msr_index == MSR_IA32_VMX_PROCBASED_CTLS2)
+               vmx->nested.msrs.secondary_set_by_userspace = true;
+
        vmx_get_control_msr(&vmx->nested.msrs, msr_index, &lowp, &highp);
        *lowp = data;
        *highp = data >> 32;
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index ab89755dce66..8aadaae5b81e 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -4523,7 +4523,7 @@ vmx_adjust_secondary_exec_control(struct vcpu_vmx *vmx, u32 *exec_control,
         * Update the nested MSR settings so that a nested VMM can/can't set
         * controls for features that are/aren't exposed to the guest.
         */
-       if (nested) {
+       if (nested && !vmx->nested.msrs.secondary_set_by_userspace) {
                if (enabled)
                        vmx->nested.msrs.secondary_ctls_high |= control;
                else
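
For readers who want to poke at the intended behavior outside the kernel, here
is a minimal userspace sketch of the same idea.  It is an assumption-laden
model, not the patch itself: toy_nested_msrs, toy_restore_secondary(),
toy_adjust_on_cpuid() and TOY_CTRL_FEATURE are hypothetical names, only one
control bit is tracked, and the host-support check done by
vmx_restore_control_msr() is deliberately omitted.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for one secondary execution control bit. */
#define TOY_CTRL_FEATURE (1u << 0)

struct toy_nested_msrs {
	uint32_t secondary_ctls_high;
	bool secondary_set_by_userspace;
};

/* Models a userspace write of the secondary controls MSR (KVM_SET_MSRS). */
static void toy_restore_secondary(struct toy_nested_msrs *m, uint32_t allowed1)
{
	m->secondary_set_by_userspace = true;
	m->secondary_ctls_high = allowed1;
}

/* Models the CPUID-driven update, skipped once userspace owns the MSR. */
static void toy_adjust_on_cpuid(struct toy_nested_msrs *m, bool enabled)
{
	if (m->secondary_set_by_userspace)
		return;

	if (enabled)
		m->secondary_ctls_high |= TOY_CTRL_FEATURE;
	else
		m->secondary_ctls_high &= ~TOY_CTRL_FEATURE;
}

/* Older VMM: never writes the MSR, so CPUID still drives the value. */
static uint32_t toy_legacy_vmm(bool cpuid_enabled)
{
	struct toy_nested_msrs m = { 0 };

	toy_adjust_on_cpuid(&m, cpuid_enabled);
	return m.secondary_ctls_high;
}

/* VMM that writes the MSR before setting CPUID: its value is preserved. */
static uint32_t toy_explicit_vmm(uint32_t allowed1, bool cpuid_enabled)
{
	struct toy_nested_msrs m = { 0 };

	toy_restore_secondary(&m, allowed1);
	toy_adjust_on_cpuid(&m, cpuid_enabled);
	return m.secondary_ctls_high;
}
```

In this model toy_legacy_vmm() keeps tracking CPUID, while toy_explicit_vmm()
returns whatever userspace wrote regardless of the later CPUID setting, which
is the split between the two kinds of userspace described above.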


> BTW, I found my previous understanding of what vmx_adjust_secondary_exec_control()
> currently does was also wrong. It could also be used for EXITING controls. And
> for such flags(e.g., SECONDARY_EXEC_RDRAND_EXITING), values for the nested settings
> (vmx->nested.msrs.secondary_ctls_high) and for the L1 execution controls(*exec_control)
> could be opposite. So the statement:
> 	"1> For now, what vmx_adjust_secondary_exec_control() does, is to enable/
> 	 disable a feature in VMX MSR(and nVMX MSR) based on cpuid changes."
> is wrong.

No, it's correct.  The EXITING controls are just inverted feature flags.  E.g. if
RDRAND is disabled in CPUID, KVM sets the EXITING control so that KVM intercepts
RDRAND in order to inject #UD.

	[EXIT_REASON_RDRAND]                  = kvm_handle_invalid_op,
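
To illustrate the "inverted feature flag" point, here is a small userspace
sketch.  It is a simplified model under stated assumptions, not the kernel's
implementation: toy_state, toy_adjust() and toy_rdrand() are hypothetical
names, the bit position is illustrative, and exec_control is assumed to be
preset with the control bit before the adjustment runs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit for an RDRAND-exiting style control. */
#define TOY_RDRAND_EXITING (1u << 11)

struct toy_state {
	uint32_t exec_control;    /* controls the hypervisor programs for L1 */
	uint32_t nested_allowed1; /* controls L1 is allowed to set for L2    */
};

/*
 * Opt-in control (exiting == false): cleared when the feature is hidden.
 * Exiting control (exiting == true): cleared when the feature is exposed,
 * i.e. interception is only needed while the feature is hidden.
 */
static void toy_adjust(struct toy_state *s, uint32_t control,
		       bool enabled, bool exiting)
{
	if (enabled == exiting)
		s->exec_control &= ~control;

	/* The nested setting follows the feature, not the interception. */
	if (enabled)
		s->nested_allowed1 |= control;
	else
		s->nested_allowed1 &= ~control;
}

/* Runs the adjustment for RDRAND with the control preset in exec_control. */
static struct toy_state toy_rdrand(bool rdrand_in_cpuid)
{
	struct toy_state s = {
		.exec_control = TOY_RDRAND_EXITING,
		.nested_allowed1 = 0,
	};

	toy_adjust(&s, TOY_RDRAND_EXITING, rdrand_in_cpuid, true);
	return s;
}
```

With RDRAND hidden, the model leaves the exiting bit set in exec_control (so
the instruction is intercepted) and clear in nested_allowed1; with RDRAND
exposed, the two values swap.  The same feature-enable input drives both, just
with opposite polarity for the L1 control.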
