linux-kernel.vger.kernel.org archive mirror
From: Sean Christopherson <seanjc@google.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <seanjc@google.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	Wanpeng Li <wanpengli@tencent.com>,
	Jim Mattson <jmattson@google.com>, Joerg Roedel <joro@8bytes.org>,
	kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	Yu Zhang <yu.c.zhang@linux.intel.com>,
	Maxim Levitsky <mlevitsk@redhat.com>
Subject: [PATCH 07/54] KVM: x86: Alert userspace that KVM_SET_CPUID{,2} after KVM_RUN is broken
Date: Tue, 22 Jun 2021 10:56:52 -0700
Message-ID: <20210622175739.3610207-8-seanjc@google.com>
In-Reply-To: <20210622175739.3610207-1-seanjc@google.com>

Warn userspace that KVM_SET_CPUID{,2} after KVM_RUN "may" cause guest
instability.  Initialize last_vmentry_cpu to -1 and use it to detect if
the vCPU has been run at least once when its CPUID model is changed.
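
The detection scheme can be sketched in isolation as follows.  This is a
minimal userspace model, not KVM code: struct sketch_vcpu, SKETCH_NO_VMENTRY,
and the sketch_*() helpers are hypothetical names invented for illustration.
Host CPU 0 is a valid CPU number, so zero cannot serve as the "never run"
marker, and a signed field makes the comparison against -1 explicit.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical sentinel: no VM-entry has been attempted on this vCPU yet. */
#define SKETCH_NO_VMENTRY (-1)

/* Hypothetical stand-in for the one field of kvm_vcpu_arch that matters here. */
struct sketch_vcpu {
	int last_vmentry_cpu;
};

/* Mirrors the idea of initializing the field when the vCPU is created. */
static void sketch_vcpu_create(struct sketch_vcpu *vcpu)
{
	vcpu->last_vmentry_cpu = SKETCH_NO_VMENTRY;
}

/* Models a run: record the host CPU on which VM-entry was attempted. */
static void sketch_vcpu_run(struct sketch_vcpu *vcpu, int host_cpu)
{
	assert(host_cpu >= 0);
	vcpu->last_vmentry_cpu = host_cpu;
}

/* True if a CPUID update now would come after at least one run. */
static bool sketch_cpuid_update_is_late(const struct sketch_vcpu *vcpu)
{
	return vcpu->last_vmentry_cpu != SKETCH_NO_VMENTRY;
}
```

In this model, sketch_cpuid_update_is_late() is false right after
sketch_vcpu_create() and true after any sketch_vcpu_run(), including a run
on host CPU 0.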

KVM does not correctly handle changes to paging-related settings in the
guest's vCPU model after KVM_RUN, e.g. MAXPHYADDR, GBPAGES, etc.  KVM
could theoretically zap all shadow pages, but actually making that happen
is a mess due to lock inversion (vcpu->mutex is held).  And even then,
updating paging settings on the fly would only work if all vCPUs are
stopped, updated in concert with identical settings, then restarted.

To support running vCPUs with different vCPU models (that affect paging),
KVM would need to track all relevant information in kvm_mmu_page_role.
Note, that's the _page_ role, not the full mmu_role.  Updating mmu_role
isn't sufficient as a vCPU can reuse a shadow page translation that was
created by a vCPU with different settings and thus completely skip the
reserved bit checks (that are tied to CPUID).
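
The reuse problem can be illustrated with a deliberately simplified toy
model.  Nothing below is KVM's actual data structure or lookup path: every
toy_* name is hypothetical, the "role" carries only a level, and the cache
is a flat array.  The only point is that when the lookup key omits
MAXPHYADDR, a cache hit returns before the reserved bit check that depends
on it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_CACHE_SLOTS 8

/* Toy "page role": deliberately has no MAXPHYADDR field. */
struct toy_role {
	unsigned int level;
};

struct toy_shadow_page {
	bool valid;
	uint64_t gfn;
	struct toy_role role;
};

struct toy_vm {
	struct toy_shadow_page cache[TOY_CACHE_SLOTS];
};

struct toy_vcpu {
	unsigned int maxphyaddr;	/* per-vCPU, derived from CPUID */
};

enum toy_result { TOY_MAPPED, TOY_RSVD_FAULT };

/* Treat guest PTE bits [51:maxphyaddr] as reserved. */
static bool toy_has_rsvd_bits(const struct toy_vcpu *vcpu, uint64_t guest_pte)
{
	uint64_t rsvd;

	assert(vcpu->maxphyaddr > 0 && vcpu->maxphyaddr <= 52);
	rsvd = ((1ULL << 52) - 1) & ~((1ULL << vcpu->maxphyaddr) - 1);
	return (guest_pte & rsvd) != 0;
}

static enum toy_result toy_map(struct toy_vm *vm, const struct toy_vcpu *vcpu,
			       uint64_t gfn, uint64_t guest_pte)
{
	struct toy_role role = { .level = 1 };
	size_t i;

	/* Cache hit: reuse the translation, skipping the reserved bit check. */
	for (i = 0; i < TOY_CACHE_SLOTS; i++) {
		const struct toy_shadow_page *sp = &vm->cache[i];

		if (sp->valid && sp->gfn == gfn && sp->role.level == role.level)
			return TOY_MAPPED;
	}

	/* Cache miss: check against the settings of the faulting vCPU. */
	if (toy_has_rsvd_bits(vcpu, guest_pte))
		return TOY_RSVD_FAULT;

	/* Install in the first free slot; a full cache simply is not updated. */
	for (i = 0; i < TOY_CACHE_SLOTS; i++) {
		if (!vm->cache[i].valid) {
			vm->cache[i] = (struct toy_shadow_page){
				.valid = true, .gfn = gfn, .role = role,
			};
			break;
		}
	}
	return TOY_MAPPED;
}
```

With a guest PTE that has bit 40 set, a toy vCPU with maxphyaddr = 36 gets
TOY_RSVD_FAULT on an empty cache.  If a vCPU with maxphyaddr = 46 maps the
same gfn first, the narrower vCPU then hits the cached entry and gets
TOY_MAPPED, i.e. the fault is missed.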

Tracking CPUID state in kvm_mmu_page_role is _extremely_ undesirable as
it would require doubling gfn_track from a u16 to a u32, i.e. would
increase KVM's memory footprint by 2 bytes for every 4KiB of guest memory.
E.g. MAXPHYADDR (6 bits), GBPAGES, AMD vs. Intel (1 bit), and the SEV C-bit
would all need to be tracked.
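
As a rough worked example of that cost, the helper below applies the
"2 extra bytes per 4KiB page" figure from the paragraph above to a given
guest size.  It is only arithmetic, not kernel code, and the TOY_* macros
and toy_extra_tracking_bytes() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_SIZE		4096ULL	/* 4KiB guest page */
#define TOY_EXTRA_PER_PAGE	2ULL	/* sizeof(u32) - sizeof(u16) */

/* Extra tracking memory, in bytes, for a guest of the given size. */
static uint64_t toy_extra_tracking_bytes(uint64_t guest_bytes)
{
	assert(guest_bytes % TOY_PAGE_SIZE == 0);
	return guest_bytes / TOY_PAGE_SIZE * TOY_EXTRA_PER_PAGE;
}
```

For a 64GiB guest this works out to 32MiB of additional tracking memory,
and for a 1TiB guest to 512MiB.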

In practice, there is no remotely sane use case for changing any
paging-related CPUID entries on the fly, so just sweep it under the rug (after
yelling at userspace).

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 Documentation/virt/kvm/api.rst  | 11 ++++++++---
 arch/x86/include/asm/kvm_host.h |  2 +-
 arch/x86/kvm/mmu/mmu.c          | 18 ++++++++++++++++++
 arch/x86/kvm/x86.c              |  2 ++
 4 files changed, 29 insertions(+), 4 deletions(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index e328caa35d6c..06e82f07fe54 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -688,9 +688,14 @@ MSRs that have been set successfully.
 Defines the vcpu responses to the cpuid instruction.  Applications
 should use the KVM_SET_CPUID2 ioctl if available.
 
-Note, when this IOCTL fails, KVM gives no guarantees that previous valid CPUID
-configuration (if there is) is not corrupted. Userspace can get a copy of the
-resulting CPUID configuration through KVM_GET_CPUID2 in case.
+Caveat emptor:
+  - If this IOCTL fails, KVM gives no guarantees that previous valid CPUID
+    configuration (if there is) is not corrupted. Userspace can get a copy
+    of the resulting CPUID configuration through KVM_GET_CPUID2 in case.
+  - Using KVM_SET_CPUID{,2} after KVM_RUN, i.e. changing the guest vCPU model
+    after running the guest, may cause guest instability.
+  - Using heterogeneous CPUID configurations, modulo APIC IDs, topology, etc...
+    may cause guest instability.
 
 ::
 
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 4ac534766eff..19c88b445ee0 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -840,7 +840,7 @@ struct kvm_vcpu_arch {
 	bool l1tf_flush_l1d;
 
 	/* Host CPU on which VM-entry was most recently attempted */
-	unsigned int last_vmentry_cpu;
+	int last_vmentry_cpu;
 
 	/* AMD MSRC001_0015 Hardware Configuration */
 	u64 msr_hwcr;
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index e2668a9b5936..8d97d21d5241 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4913,6 +4913,24 @@ void kvm_mmu_after_set_cpuid(struct kvm_vcpu *vcpu)
 	vcpu->arch.guest_mmu.mmu_role.ext.valid = 0;
 	vcpu->arch.nested_mmu.mmu_role.ext.valid = 0;
 	kvm_mmu_reset_context(vcpu);
+
+	/*
+	 * KVM does not correctly handle changing guest CPUID after KVM_RUN, as
+	 * MAXPHYADDR, GBPAGES support, AMD reserved bit behavior, etc.. aren't
+	 * tracked in kvm_mmu_page_role.  As a result, KVM may miss guest page
+	 * faults due to reusing SPs/SPTEs.  Alert userspace, but otherwise
+	 * sweep the problem under the rug.
+	 *
+	 * KVM's horrific CPUID ABI makes the problem all but impossible to
+	 * solve, as correctly handling multiple vCPU models (with respect to
+	 * paging and physical address properties) in a single VM would require
+	 * tracking all relevant CPUID information in kvm_mmu_page_role.  That
+	 * is very undesirable as it would double the memory requirements for
+	 * gfn_track (see struct kvm_mmu_page_role comments), and in practice
+	 * no sane VMM mucks with the core vCPU model on the fly.
+	 */
+	if (vcpu->arch.last_vmentry_cpu != -1)
+		pr_warn_ratelimited("KVM: KVM_SET_CPUID{,2} after KVM_RUN may cause guest instability\n");
 }
 
 void kvm_mmu_reset_context(struct kvm_vcpu *vcpu)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 42608b515ce4..92b4a9305651 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -10583,6 +10583,8 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
 	struct page *page;
 	int r;
 
+	vcpu->arch.last_vmentry_cpu = -1;
+
 	if (!irqchip_in_kernel(vcpu->kvm) || kvm_vcpu_is_reset_bsp(vcpu))
 		vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
 	else
-- 
2.32.0.288.g62a8d224e6-goog


