All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <seanjc@google.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	Wanpeng Li <wanpengli@tencent.com>,
	Jim Mattson <jmattson@google.com>, Joerg Roedel <joro@8bytes.org>,
	kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	Yu Zhang <yu.c.zhang@linux.intel.com>,
	Maxim Levitsky <mlevitsk@redhat.com>
Subject: [PATCH 49/54] KVM: x86: Enhance comments for MMU roles and nested transition trickiness
Date: Tue, 22 Jun 2021 10:57:34 -0700	[thread overview]
Message-ID: <20210622175739.3610207-50-seanjc@google.com> (raw)
In-Reply-To: <20210622175739.3610207-1-seanjc@google.com>

Expand the comments for the MMU roles.  The interactions with gfn_track
PGD reuse in particular are hairy.

Regarding PGD reuse, add comments in the nested virtualization flows to
call out why kvm_init_mmu() is unconditionally called even when nested
TDP is used.

Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/include/asm/kvm_host.h | 59 +++++++++++++++++++++++++++------
 arch/x86/kvm/svm/nested.c       |  1 +
 arch/x86/kvm/vmx/nested.c       |  1 +
 3 files changed, 50 insertions(+), 11 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index be7088fb0594..2da8b5ddbd6a 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -269,12 +269,36 @@ enum x86_intercept_stage;
 struct kvm_kernel_irq_routing_entry;
 
 /*
- * the pages used as guest page table on soft mmu are tracked by
- * kvm_memory_slot.arch.gfn_track which is 16 bits, so the role bits used
- * by indirect shadow page can not be more than 15 bits.
+ * kvm_mmu_page_role tracks the properties of a shadow page (where shadow page
+ * also includes TDP pages) to determine whether or not a page can be used in
+ * the given MMU context.  This is a subset of the overall kvm_mmu_role to
+ * minimize the size of kvm_memory_slot.arch.gfn_track, i.e. allows allocating
+ * 2 bytes per gfn instead of 4 bytes per gfn.
  *
- * Currently, we used 14 bits that are @level, @gpte_is_8_bytes, @quadrant, @access,
- * @efer_nx, @cr0_wp, @smep_andnot_wp and @smap_andnot_wp.
+ * Indirect upper-level shadow pages are tracked for write-protection via
+ * gfn_track.  As above, gfn_track is a 16 bit counter, so KVM must not create
+ * more than 2^16-1 upper-level shadow pages at a single gfn, otherwise
+ * gfn_track will overflow and explosions will ensure.
+ *
+ * A unique shadow page (SP) for a gfn is created if and only if an existing SP
+ * cannot be reused.  The ability to reuse a SP is tracked by its role, which
+ * incorporates various mode bits and properties of the SP.  Roughly speaking,
+ * the number of unique SPs that can theoretically be created is 2^n, where n
+ * is the number of bits that are used to compute the role.
+ *
+ * But, even though there are 18 bits in the mask below, not all combinations
+ * of modes and flags are possible.  The maximum number of possible upper-level
+ * shadow pages for a single gfn is in the neighborhood of 2^13.
+ *
+ *   - invalid shadow pages are not accounted.
+ *   - level is effectively limited to four combinations, not 16 as the number
+ *     bits would imply, as 4k SPs are not tracked (allowed to go unsync).
+ *   - level is effectively unused for non-PAE paging because there is exactly
+ *     one upper level (see 4k SP exception above).
+ *   - quadrant is used only for non-PAE paging and is exclusive with
+ *     gpte_is_8_bytes.
+ *   - execonly and ad_disabled are used only for nested EPT, which makes it
+ *     exclusive with quadrant.
  */
 union kvm_mmu_page_role {
 	u32 word;
@@ -303,13 +327,26 @@ union kvm_mmu_page_role {
 	};
 };
 
+/*
+ * kvm_mmu_extended_role complements kvm_mmu_page_role, tracking properties
+ * relevant to the current MMU configuration.   When loading CR0, CR4, or EFER,
+ * including on nested transitions, if nothing in the full role changes then
+ * MMU re-configuration can be skipped. @valid bit is set on first usage so we
+ * don't treat all-zero structure as valid data.
+ *
+ * The properties that are tracked in the extended role but not the page role
+ * are for things that either (a) do not affect the validity of the shadow page
+ * or (b) are indirectly reflected in the shadow page's role.  For example,
+ * CR4.PKE only affects permission checks for software walks of the guest page
+ * tables (because KVM doesn't support Protection Keys with shadow paging), and
+ * CR0.PG, CR4.PAE, and CR4.PSE are indirectly reflected in role.level.
+ *
+ * Note, SMEP and SMAP are not redundant with sm*p_andnot_wp in the page role.
+ * If CR0.WP=1, KVM can reuse shadow pages for the guest regardless of SMEP and
+ * SMAP, but the MMU's permission checks for software walks need to be SMEP and
+ * SMAP aware regardless of CR0.WP.
+ */
 union kvm_mmu_extended_role {
-/*
- * This structure complements kvm_mmu_page_role caching everything needed for
- * MMU configuration. If nothing in both these structures changed, MMU
- * re-configuration can be skipped. @valid bit is set on first usage so we don't
- * treat all-zero structure as valid data.
- */
 	u32 word;
 	struct {
 		unsigned int valid:1;
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index 927e545591c3..94389f974ba9 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -424,6 +424,7 @@ static int nested_svm_load_cr3(struct kvm_vcpu *vcpu, unsigned long cr3,
 	vcpu->arch.cr3 = cr3;
 	kvm_register_mark_available(vcpu, VCPU_EXREG_CR3);
 
+	/* Re-initialize the MMU, e.g. to pick up CR4 MMU role changes. */
 	kvm_init_mmu(vcpu);
 
 	return 0;
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 183fd9d62fc5..77fc51a852cf 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -1098,6 +1098,7 @@ static int nested_vmx_load_cr3(struct kvm_vcpu *vcpu, unsigned long cr3,
 	vcpu->arch.cr3 = cr3;
 	kvm_register_mark_available(vcpu, VCPU_EXREG_CR3);
 
+	/* Re-initialize the MMU, e.g. to pick up CR4 MMU role changes. */
 	kvm_init_mmu(vcpu);
 
 	return 0;
-- 
2.32.0.288.g62a8d224e6-goog


  parent reply	other threads:[~2021-06-22 18:04 UTC|newest]

Thread overview: 104+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-22 17:56 [PATCH 00/54] KVM: x86/mmu: Bug fixes and summer cleaning Sean Christopherson
2021-06-22 17:56 ` [PATCH 01/54] KVM: x86/mmu: Remove broken WARN that fires on 32-bit KVM w/ nested EPT Sean Christopherson
2021-06-22 17:56 ` [PATCH 02/54] KVM: x86/mmu: Treat NX as used (not reserved) for all !TDP shadow MMUs Sean Christopherson
2021-06-22 17:56 ` [PATCH 03/54] KVM: x86: Properly reset MMU context at vCPU RESET/INIT Sean Christopherson
2021-06-23 13:59   ` Paolo Bonzini
2021-06-23 14:01   ` Paolo Bonzini
2021-06-23 14:50     ` Sean Christopherson
2021-06-22 17:56 ` [PATCH 04/54] KVM: x86/mmu: Use MMU's role to detect CR4.SMEP value in nested NPT walk Sean Christopherson
2021-06-22 17:56 ` [PATCH 05/54] Revert "KVM: x86/mmu: Drop kvm_mmu_extended_role.cr4_la57 hack" Sean Christopherson
2021-06-25  8:47   ` Yu Zhang
2021-06-25  8:57     ` Paolo Bonzini
2021-06-25  9:29       ` Yu Zhang
2021-06-25 10:25         ` Paolo Bonzini
2021-06-25 11:23           ` Yu Zhang
2021-06-22 17:56 ` [PATCH 06/54] KVM: x86: Force all MMUs to reinitialize if guest CPUID is modified Sean Christopherson
2021-06-22 17:56 ` [PATCH 07/54] KVM: x86: Alert userspace that KVM_SET_CPUID{,2} after KVM_RUN is broken Sean Christopherson
2021-06-23 14:16   ` Paolo Bonzini
2021-06-23 17:00     ` Jim Mattson
2021-06-23 17:11       ` Paolo Bonzini
2021-06-23 18:11         ` Jim Mattson
2021-06-23 18:49           ` Paolo Bonzini
2021-06-23 19:02             ` Jim Mattson
2021-06-23 19:53               ` Paolo Bonzini
2021-06-22 17:56 ` [PATCH 08/54] Revert "KVM: MMU: record maximum physical address width in kvm_mmu_extended_role" Sean Christopherson
2021-06-25  8:52   ` Yu Zhang
2021-06-22 17:56 ` [PATCH 09/54] KVM: x86/mmu: Unconditionally zap unsync SPs when creating >4k SP at GFN Sean Christopherson
2021-06-23 14:36   ` Paolo Bonzini
2021-06-23 15:08     ` Sean Christopherson
2021-06-23 16:38       ` Paolo Bonzini
2021-06-23 22:04         ` Sean Christopherson
2021-06-25  9:51   ` Yu Zhang
2021-06-25 10:26     ` Paolo Bonzini
2021-06-25 13:08       ` Yu Zhang
2021-06-22 17:56 ` [PATCH 10/54] KVM: x86/mmu: Replace EPT shadow page shenanigans with simpler check Sean Christopherson
2021-06-23 15:49   ` Paolo Bonzini
2021-06-23 16:17     ` Sean Christopherson
2021-06-23 16:41       ` Paolo Bonzini
2021-06-23 16:54         ` Sean Christopherson
2021-06-22 17:56 ` [PATCH 11/54] KVM: x86/mmu: WARN and zap SP when sync'ing if MMU role mismatches Sean Christopherson
2021-06-22 17:56 ` [PATCH 12/54] KVM: x86/mmu: Drop the intermediate "transient" __kvm_sync_page() Sean Christopherson
2021-06-23 16:54   ` Paolo Bonzini
2021-06-22 17:56 ` [PATCH 13/54] KVM: x86/mmu: Rename unsync helper and update related comments Sean Christopherson
2021-06-22 17:56 ` [PATCH 14/54] KVM: x86: Fix sizes used to pass around CR0, CR4, and EFER Sean Christopherson
2021-06-22 17:57 ` [PATCH 15/54] KVM: nSVM: Add a comment to document why nNPT uses vmcb01, not vCPU state Sean Christopherson
2021-06-23 17:06   ` Paolo Bonzini
2021-06-23 20:49     ` Sean Christopherson
2021-06-22 17:57 ` [PATCH 16/54] KVM: x86/mmu: Drop smep_andnot_wp check from "uses NX" for shadow MMUs Sean Christopherson
2021-06-23 17:11   ` Paolo Bonzini
2021-06-23 19:36     ` Sean Christopherson
2021-06-22 17:57 ` [PATCH 17/54] KVM: x86: Read and pass all CR0/CR4 role bits to shadow MMU helper Sean Christopherson
2021-06-22 17:57 ` [PATCH 18/54] KVM: x86/mmu: Move nested NPT reserved bit calculation into MMU proper Sean Christopherson
2021-06-23 17:13   ` Paolo Bonzini
2021-06-22 17:57 ` [PATCH 19/54] KVM: x86/mmu: Grab shadow root level from mmu_role for shadow MMUs Sean Christopherson
2021-06-22 17:57 ` [PATCH 20/54] KVM: x86/mmu: Add struct and helpers to retrieve MMU role bits from regs Sean Christopherson
2021-06-23  1:58   ` kernel test robot
2021-06-23  1:58     ` kernel test robot
2021-06-23 17:18   ` Paolo Bonzini
2021-06-22 17:57 ` [PATCH 21/54] KVM: x86/mmu: Consolidate misc updates into shadow_mmu_init_context() Sean Christopherson
2021-06-22 17:57 ` [PATCH 22/54] KVM: x86/mmu: Ignore CR0 and CR4 bits in nested EPT MMU role Sean Christopherson
2021-06-22 17:57 ` [PATCH 23/54] KVM: x86/mmu: Use MMU's role_regs, not vCPU state, to compute mmu_role Sean Christopherson
2021-06-22 17:57 ` [PATCH 24/54] KVM: x86/mmu: Rename "nxe" role bit to "efer_nx" for macro shenanigans Sean Christopherson
2021-06-22 17:57 ` [PATCH 25/54] KVM: x86/mmu: Add helpers to query mmu_role bits Sean Christopherson
2021-06-23 20:02   ` Paolo Bonzini
2021-06-23 20:47     ` Sean Christopherson
2021-06-23 20:53       ` Paolo Bonzini
2021-06-22 17:57 ` [PATCH 26/54] KVM: x86/mmu: Do not set paging-related bits in MMU role if CR0.PG=0 Sean Christopherson
2021-06-22 17:57 ` [PATCH 27/54] KVM: x86/mmu: Set CR4.PKE/LA57 in MMU role iff long mode is active Sean Christopherson
2021-06-22 17:57 ` [PATCH 28/54] KVM: x86/mmu: Always Set new mmu_role immediately after checking old role Sean Christopherson
2021-06-22 17:57 ` [PATCH 29/54] KVM: x86/mmu: Don't grab CR4.PSE for calculating shadow reserved bits Sean Christopherson
2021-06-22 17:57 ` [PATCH 30/54] KVM: x86/mmu: Use MMU's role to get CR4.PSE for computing rsvd bits Sean Christopherson
2021-06-22 17:57 ` [PATCH 31/54] KVM: x86/mmu: Drop vCPU param from reserved bits calculator Sean Christopherson
2021-06-22 17:57 ` [PATCH 32/54] KVM: x86/mmu: Use MMU's role to compute permission bitmask Sean Christopherson
2021-06-22 17:57 ` [PATCH 33/54] KVM: x86/mmu: Use MMU's role to compute PKRU bitmask Sean Christopherson
2021-06-22 17:57 ` [PATCH 34/54] KVM: x86/mmu: Use MMU's roles to compute last non-leaf level Sean Christopherson
2021-06-22 17:57 ` [PATCH 35/54] KVM: x86/mmu: Use MMU's role to detect EFER.NX in guest page walk Sean Christopherson
2021-06-22 17:57 ` [PATCH 36/54] KVM: x86/mmu: Use MMU's role/role_regs to compute context's metadata Sean Christopherson
2021-06-22 17:57 ` [PATCH 37/54] KVM: x86/mmu: Use MMU's role to get EFER.NX during MMU configuration Sean Christopherson
2021-06-22 17:57 ` [PATCH 38/54] KVM: x86/mmu: Drop "nx" from MMU context now that there are no readers Sean Christopherson
2021-06-22 17:57 ` [PATCH 39/54] KVM: x86/mmu: Get nested MMU's root level from the MMU's role Sean Christopherson
2021-06-22 17:57 ` [PATCH 40/54] KVM: x86/mmu: Use MMU role_regs to get LA57, and drop vCPU LA57 helper Sean Christopherson
2021-06-22 17:57 ` [PATCH 41/54] KVM: x86/mmu: Consolidate reset_rsvds_bits_mask() calls Sean Christopherson
2021-06-23 20:07   ` Paolo Bonzini
2021-06-23 20:53     ` Sean Christopherson
2021-06-22 17:57 ` [PATCH 42/54] KVM: x86/mmu: Don't update nested guest's paging bitmasks if CR0.PG=0 Sean Christopherson
2021-06-22 17:57 ` [PATCH 43/54] KVM: x86/mmu: Add helper to update paging metadata Sean Christopherson
2021-06-22 17:57 ` [PATCH 44/54] KVM: x86/mmu: Add a helper to calculate root from role_regs Sean Christopherson
2021-06-22 17:57 ` [PATCH 45/54] KVM: x86/mmu: Collapse 32-bit PAE and 64-bit statements for helpers Sean Christopherson
2021-06-22 17:57 ` [PATCH 46/54] KVM: x86/mmu: Use MMU's role to determine PTTYPE Sean Christopherson
2021-06-22 17:57 ` [PATCH 47/54] KVM: x86/mmu: Add helpers to do full reserved SPTE checks w/ generic MMU Sean Christopherson
2021-06-23 20:13   ` Paolo Bonzini
2021-06-22 17:57 ` [PATCH 48/54] KVM: x86/mmu: WARN on any reserved SPTE value when making a valid SPTE Sean Christopherson
2021-06-22 17:57 ` Sean Christopherson [this message]
2021-06-22 17:57 ` [PATCH 50/54] KVM: x86/mmu: Optimize and clean up so called "last nonleaf level" logic Sean Christopherson
2021-06-23 20:22   ` Paolo Bonzini
2021-06-23 20:58     ` Sean Christopherson
2021-06-22 17:57 ` [PATCH 51/54] KVM: x86/mmu: Drop redundant rsvd bits reset for nested NPT Sean Christopherson
2021-06-22 17:57 ` [PATCH 52/54] KVM: x86/mmu: Get CR0.WP from MMU, not vCPU, in shadow page fault Sean Christopherson
2021-06-22 17:57 ` [PATCH 53/54] KVM: x86/mmu: Get CR4.SMEP " Sean Christopherson
2021-06-22 17:57 ` [PATCH 54/54] KVM: x86/mmu: Let guest use GBPAGES if supported in hardware and TDP is on Sean Christopherson
2021-06-23 20:29 ` [PATCH 00/54] KVM: x86/mmu: Bug fixes and summer cleaning Paolo Bonzini
2021-06-23 21:06   ` Sean Christopherson
2021-06-23 21:33     ` Paolo Bonzini
2021-06-23 22:08       ` Sean Christopherson
2021-06-23 22:12         ` Paolo Bonzini

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210622175739.3610207-50-seanjc@google.com \
    --to=seanjc@google.com \
    --cc=jmattson@google.com \
    --cc=joro@8bytes.org \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mlevitsk@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=vkuznets@redhat.com \
    --cc=wanpengli@tencent.com \
    --cc=yu.c.zhang@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.