From: Sean Christopherson <seanjc@google.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <seanjc@google.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	Wanpeng Li <wanpengli@tencent.com>,
	Jim Mattson <jmattson@google.com>, Joerg Roedel <joro@8bytes.org>,
	kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	Yu Zhang <yu.c.zhang@linux.intel.com>,
	Maxim Levitsky <mlevitsk@redhat.com>
Subject: [PATCH 49/54] KVM: x86: Enhance comments for MMU roles and nested transition trickiness
Date: Tue, 22 Jun 2021 10:57:34 -0700
Message-ID: <20210622175739.3610207-50-seanjc@google.com>
In-Reply-To: <20210622175739.3610207-1-seanjc@google.com>

Expand the comments for the MMU roles.  The interactions with gfn_track
and PGD reuse in particular are hairy.

Regarding PGD reuse, add comments in the nested virtualization flows to
call out why kvm_init_mmu() is unconditionally called even when nested
TDP is used.

Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/include/asm/kvm_host.h | 59 +++++++++++++++++++++++++++------
 arch/x86/kvm/svm/nested.c       |  1 +
 arch/x86/kvm/vmx/nested.c       |  1 +
 3 files changed, 50 insertions(+), 11 deletions(-)
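
The new kvm_mmu_extended_role comment says MMU re-configuration can be
skipped when nothing in the full role changes, which is what makes an
unconditional kvm_init_mmu() call cheap on nested transitions.  A minimal
sketch of that idea follows.  Every sketch_* name, the bit layout, and the
reconfiguration counter are hypothetical stand-ins, not the kernel's
definitions or code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the full role: page role plus extended role. */
union sketch_mmu_role {
	uint64_t as_u64;
	struct {
		uint32_t base;	/* plays the part of kvm_mmu_page_role.word */
		uint32_t ext;	/* plays the part of kvm_mmu_extended_role.word */
	};
};

/* Hypothetical MMU context; nr_reconfigs exists only to observe behavior. */
struct sketch_mmu {
	union sketch_mmu_role role;
	unsigned int nr_reconfigs;
};

/* Assumed position of the @valid bit within the extended word. */
#define SKETCH_EXT_VALID (1u << 0)

/* Build the full role; @valid is always set in a computed role. */
static union sketch_mmu_role sketch_calc_role(uint32_t base_bits,
					      uint32_t ext_bits)
{
	union sketch_mmu_role role = { .as_u64 = 0 };

	role.base = base_bits;
	role.ext = ext_bits | SKETCH_EXT_VALID;
	return role;
}

/*
 * Safe to call unconditionally: returns false and does nothing when the
 * full role is unchanged, true when the context had to be reconfigured.
 */
static bool sketch_init_mmu(struct sketch_mmu *mmu, uint32_t base_bits,
			    uint32_t ext_bits)
{
	union sketch_mmu_role new_role;

	assert(mmu != NULL);

	new_role = sketch_calc_role(base_bits, ext_bits);
	if (new_role.as_u64 == mmu->role.as_u64)
		return false;

	mmu->role = new_role;
	mmu->nr_reconfigs++;
	return true;
}
```

Because a computed role always has the valid bit set, a zero-initialized
context never matches it, so the first call reconfigures even if every mode
bit happens to be zero.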

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index be7088fb0594..2da8b5ddbd6a 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -269,12 +269,36 @@ enum x86_intercept_stage;
 struct kvm_kernel_irq_routing_entry;
 
 /*
- * the pages used as guest page table on soft mmu are tracked by
- * kvm_memory_slot.arch.gfn_track which is 16 bits, so the role bits used
- * by indirect shadow page can not be more than 15 bits.
+ * kvm_mmu_page_role tracks the properties of a shadow page (where shadow page
+ * also includes TDP pages) to determine whether or not a page can be used in
+ * the given MMU context.  This is a subset of the overall kvm_mmu_role to
+ * minimize the size of kvm_memory_slot.arch.gfn_track, i.e. allows allocating
+ * 2 bytes per gfn instead of 4 bytes per gfn.
  *
- * Currently, we used 14 bits that are @level, @gpte_is_8_bytes, @quadrant, @access,
- * @efer_nx, @cr0_wp, @smep_andnot_wp and @smap_andnot_wp.
+ * Indirect upper-level shadow pages are tracked for write-protection via
+ * gfn_track.  As above, gfn_track is a 16 bit counter, so KVM must not create
+ * more than 2^16-1 upper-level shadow pages at a single gfn, otherwise
+ * gfn_track will overflow and explosions will ensue.
+ *
+ * A unique shadow page (SP) for a gfn is created if and only if an existing SP
+ * cannot be reused.  The ability to reuse a SP is tracked by its role, which
+ * incorporates various mode bits and properties of the SP.  Roughly speaking,
+ * the number of unique SPs that can theoretically be created is 2^n, where n
+ * is the number of bits that are used to compute the role.
+ *
+ * But, even though there are 18 bits in the mask below, not all combinations
+ * of modes and flags are possible.  The maximum number of possible upper-level
+ * shadow pages for a single gfn is in the neighborhood of 2^13.
+ *
+ *   - invalid shadow pages are not accounted.
+ *   - level is effectively limited to four combinations, not 16 as the number
+ *     of bits would imply, as 4k SPs are not tracked (allowed to go unsync).
+ *   - level is effectively unused for non-PAE paging because there is exactly
+ *     one upper level (see 4k SP exception above).
+ *   - quadrant is used only for non-PAE paging and is exclusive with
+ *     gpte_is_8_bytes.
+ *   - execonly and ad_disabled are used only for nested EPT, which makes it
+ *     exclusive with quadrant.
  */
 union kvm_mmu_page_role {
 	u32 word;
@@ -303,13 +327,26 @@ union kvm_mmu_page_role {
 	};
 };
 
+/*
+ * kvm_mmu_extended_role complements kvm_mmu_page_role, tracking properties
+ * relevant to the current MMU configuration.  When loading CR0, CR4, or EFER,
+ * including on nested transitions, if nothing in the full role changes then
+ * MMU re-configuration can be skipped.  The @valid bit is set on first usage
+ * so that an all-zero structure is not treated as valid data.
+ *
+ * The properties that are tracked in the extended role but not the page role
+ * are for things that either (a) do not affect the validity of the shadow page
+ * or (b) are indirectly reflected in the shadow page's role.  For example,
+ * CR4.PKE only affects permission checks for software walks of the guest page
+ * tables (because KVM doesn't support Protection Keys with shadow paging), and
+ * CR0.PG, CR4.PAE, and CR4.PSE are indirectly reflected in role.level.
+ *
+ * Note, SMEP and SMAP are not redundant with sm*p_andnot_wp in the page role.
+ * If CR0.WP=1, KVM can reuse shadow pages for the guest regardless of SMEP and
+ * SMAP, but the MMU's permission checks for software walks need to be SMEP and
+ * SMAP aware regardless of CR0.WP.
+ */
 union kvm_mmu_extended_role {
-/*
- * This structure complements kvm_mmu_page_role caching everything needed for
- * MMU configuration. If nothing in both these structures changed, MMU
- * re-configuration can be skipped. @valid bit is set on first usage so we don't
- * treat all-zero structure as valid data.
- */
 	u32 word;
 	struct {
 		unsigned int valid:1;
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index 927e545591c3..94389f974ba9 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -424,6 +424,7 @@ static int nested_svm_load_cr3(struct kvm_vcpu *vcpu, unsigned long cr3,
 	vcpu->arch.cr3 = cr3;
 	kvm_register_mark_available(vcpu, VCPU_EXREG_CR3);
 
+	/* Re-initialize the MMU, e.g. to pick up CR4 MMU role changes. */
 	kvm_init_mmu(vcpu);
 
 	return 0;
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 183fd9d62fc5..77fc51a852cf 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -1098,6 +1098,7 @@ static int nested_vmx_load_cr3(struct kvm_vcpu *vcpu, unsigned long cr3,
 	vcpu->arch.cr3 = cr3;
 	kvm_register_mark_available(vcpu, VCPU_EXREG_CR3);
 
+	/* Re-initialize the MMU, e.g. to pick up CR4 MMU role changes. */
 	kvm_init_mmu(vcpu);
 
 	return 0;
-- 
2.32.0.288.g62a8d224e6-goog
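
For readers following the gfn_track arithmetic in the kvm_mmu_page_role
comment, here is a rough sketch of why role equality decides shadow page
reuse and why the 16-bit counter is sufficient.  The union is a cut-down,
hypothetical layout that borrows a few field names from the comment, and
the sketch_* helpers are invented for illustration.  None of it mirrors the
real structure or any kernel function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, cut-down role: names echo the comment, the layout does not. */
union sketch_page_role {
	uint32_t word;
	struct {
		unsigned int level:4;
		unsigned int gpte_is_8_bytes:1;
		unsigned int quadrant:2;
		unsigned int access:3;
		unsigned int efer_nx:1;
		unsigned int cr0_wp:1;
	};
};

/* An existing SP can be reused only if its entire role matches. */
static bool sketch_can_reuse(union sketch_page_role sp,
			     union sketch_page_role want)
{
	return sp.word == want.word;
}

/* Upper bound on distinct roles if all role bits were independent: 2^n. */
static uint64_t sketch_max_roles(unsigned int nr_role_bits)
{
	assert(nr_role_bits < 64);
	return UINT64_C(1) << nr_role_bits;
}

/* True if that many upper-level SPs at one gfn fit in a 16-bit counter. */
static bool sketch_fits_u16(uint64_t nr_shadow_pages)
{
	return nr_shadow_pages <= UINT16_MAX;
}
```

With these helpers, sketch_fits_u16(sketch_max_roles(18)) is false while
sketch_fits_u16(sketch_max_roles(13)) is true, which is the distinction the
comment draws between the 18 bits in the mask and the roughly 2^13 roles
that are actually reachable.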

