linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: Sean Christopherson <seanjc@google.com>,
	Paolo Bonzini <pbonzini@redhat.com>
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	 Yan Zhao <yan.y.zhao@intel.com>,
	Isaku Yamahata <isaku.yamahata@intel.com>,
	 Michael Roth <michael.roth@amd.com>,
	Yu Zhang <yu.c.zhang@linux.intel.com>,
	 Chao Peng <chao.p.peng@linux.intel.com>,
	Fuad Tabba <tabba@google.com>,
	 David Matlack <dmatlack@google.com>
Subject: [PATCH 10/16] KVM: x86/mmu: Don't force emulation of L2 accesses to non-APIC internal slots
Date: Tue, 27 Feb 2024 18:41:41 -0800	[thread overview]
Message-ID: <20240228024147.41573-11-seanjc@google.com> (raw)
In-Reply-To: <20240228024147.41573-1-seanjc@google.com>

Allow mapping KVM's internal memslots used for EPT without unrestricted
guest into L2, i.e. allow mapping the hidden TSS and the identity mapped
page tables into L2.  Unlike the APIC access page, there is no correctness
issue with letting L2 access the "hidden" memory.  Allowing these memslots
to be mapped into L2 fixes a largely theoretical bug where KVM could
incorrectly emulate subsequent _L1_ accesses as MMIO, and also ensures
consistent KVM behavior for L2.

If KVM is using TDP, but L1 is using shadow paging for L2, then routing
through kvm_handle_noslot_fault() will incorrectly cache the gfn as MMIO,
and create an MMIO SPTE.  Creating an MMIO SPTE is ok, but only because
kvm_mmu_page_role.guest_mode ensure KVM uses different roots for L1 vs.
L2.  But vcpu->arch.mmio_gfn will remain valid, and could cause KVM to
incorrectly treat an L1 access to the hidden TSS or identity mapped page
tables as MMIO.

Furthermore, forcing L2 accesses to be treated as "no slot" faults doesn't
actually prevent exposing KVM's internal memslots to L2, it simply forces
KVM to emulate the access.  In most cases, that will trigger MMIO,
amusingly due to filling vcpu->arch.mmio_gfn, but also because
vcpu_is_mmio_gpa() unconditionally treats APIC accesses as MMIO, i.e. APIC
accesses are ok.  But the hidden TSS and identity mapped page tables could
go either way (MMIO or access the private memslot's backing memory).

Alternatively, the inconsistent emulator behavior could be addressed by
forcing MMIO emulation for L2 access to all internal memslots, not just to
the APIC.  But that's arguably less correct than letting L2 access the
hidden TSS and identity mapped page tables, not to mention that it's
*extremely* unlikely anyone cares what KVM does in this case.  From L1's
perspective there is R/W memory at those memslots, the memory just happens
to be initialized with non-zero data.  Making the memory disappear when it
is accessed by L2 is far more magical and arbitrary than the memory
existing in the first place.

The APIC access page is special because KVM _must_ emulate the access to
do the right thing (emulate an APIC access instead of reading/writing the
APIC access page).  And despite what commit 3a2936dedd20 ("kvm: mmu: Don't
expose private memslots to L2") said, it's not just necessary when L1 is
accelerating L2's virtual APIC, it's just as important (likely *more*
imporant for correctness when L1 is passing through its own APIC to L2.

Fixes: 3a2936dedd20 ("kvm: mmu: Don't expose private memslots to L2")
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/mmu/mmu.c | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 58c5ae8be66c..5c8caab64ba2 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4346,8 +4346,18 @@ static int __kvm_faultin_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
 	if (slot && (slot->flags & KVM_MEMSLOT_INVALID))
 		return RET_PF_RETRY;
 
-	if (!kvm_is_visible_memslot(slot)) {
-		/* Don't expose private memslots to L2. */
+	if (slot && slot->id == APIC_ACCESS_PAGE_PRIVATE_MEMSLOT) {
+		/*
+		 * Don't map L1's APIC access page into L2, KVM doesn't support
+		 * using APICv/AVIC to accelerate L2 accesses to L1's APIC,
+		 * i.e. the access needs to be emulated.  Emulating access to
+		 * L1's APIC is also correct if L1 is accelerating L2's own
+		 * virtual APIC, but for some reason L1 also maps _L1's_ APIC
+		 * into L2.  Note, vcpu_is_mmio_gpa() always treats access to
+		 * the APIC as MMIO.  Allow an MMIO SPTE to be created, as KVM
+		 * uses different roots for L1 vs. L2, i.e. there is no danger
+		 * of breaking APICv/AVIC for L1.
+		 */
 		if (is_guest_mode(vcpu)) {
 			fault->slot = NULL;
 			fault->pfn = KVM_PFN_NOSLOT;
@@ -4360,8 +4370,7 @@ static int __kvm_faultin_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
 		 * MMIO SPTE.  That way the cache doesn't need to be purged
 		 * when the AVIC is re-enabled.
 		 */
-		if (slot && slot->id == APIC_ACCESS_PAGE_PRIVATE_MEMSLOT &&
-		    !kvm_apicv_activated(vcpu->kvm))
+		if (!kvm_apicv_activated(vcpu->kvm))
 			return RET_PF_EMULATE;
 	}
 
-- 
2.44.0.278.ge034bb2e1d-goog


  parent reply	other threads:[~2024-02-28  2:42 UTC|newest]

Thread overview: 83+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-02-28  2:41 [PATCH 00/16] KVM: x86/mmu: Page fault and MMIO cleanups Sean Christopherson
2024-02-28  2:41 ` [PATCH 01/16] KVM: x86/mmu: Exit to userspace with -EFAULT if private fault hits emulation Sean Christopherson
2024-03-01  8:48   ` Xiaoyao Li
2024-03-07 12:52   ` Gupta, Pankaj
2024-03-12  2:59     ` Binbin Wu
2024-04-04 16:38       ` Sean Christopherson
2024-03-08  4:22   ` Yan Zhao
2024-04-04 16:45     ` Sean Christopherson
2024-02-28  2:41 ` [PATCH 02/16] KVM: x86: Remove separate "bit" defines for page fault error code masks Sean Christopherson
2024-02-29 12:44   ` Paolo Bonzini
2024-02-29 18:40     ` Sean Christopherson
2024-02-29 20:56       ` Paolo Bonzini
2024-02-29 13:43   ` Dongli Zhang
2024-02-29 15:25     ` Sean Christopherson
2024-02-28  2:41 ` [PATCH 03/16] KVM: x86: Define more SEV+ page fault error bits/flags for #NPF Sean Christopherson
2024-02-28  4:43   ` Dongli Zhang
2024-02-28 16:16     ` Sean Christopherson
2024-02-28  2:41 ` [PATCH 04/16] KVM: x86/mmu: Pass full 64-bit error code when handling page faults Sean Christopherson
2024-02-28  7:30   ` Dongli Zhang
2024-02-28 16:22     ` Sean Christopherson
2024-02-29 13:32       ` Dongli Zhang
2024-03-05  3:55   ` Xiaoyao Li
2024-02-28  2:41 ` [PATCH 05/16] KVM: x86/mmu: Use synthetic page fault error code to indicate private faults Sean Christopherson
2024-02-29 11:16   ` Huang, Kai
2024-02-29 15:17     ` Sean Christopherson
2024-03-06  9:43   ` Xu Yilun
2024-03-06 14:45     ` Sean Christopherson
2024-03-07  9:05       ` Xu Yilun
2024-03-07 14:36         ` Sean Christopherson
2024-03-12  5:34   ` Binbin Wu
2024-02-28  2:41 ` [PATCH 06/16] KVM: x86/mmu: WARN if upper 32 bits of legacy #PF error code are non-zero Sean Christopherson
2024-02-29 22:11   ` Huang, Kai
2024-02-29 23:07     ` Sean Christopherson
2024-03-12  5:44       ` Binbin Wu
2024-02-28  2:41 ` [PATCH 07/16] KVM: x86: Move synthetic PFERR_* sanity checks to SVM's #NPF handler Sean Christopherson
2024-02-29 22:19   ` Huang, Kai
2024-02-29 22:52     ` Sean Christopherson
2024-02-29 23:14       ` Huang, Kai
2024-03-12  9:44   ` Binbin Wu
2024-02-28  2:41 ` [PATCH 08/16] KVM: x86/mmu: WARN and skip MMIO cache on private, reserved page faults Sean Christopherson
2024-02-29 22:26   ` Huang, Kai
2024-02-29 23:06     ` Sean Christopherson
2024-02-29 23:21       ` Huang, Kai
2024-03-04 15:51         ` Sean Christopherson
2024-03-05 21:32           ` Huang, Kai
2024-03-06  0:25             ` Sean Christopherson
2024-02-28  2:41 ` [PATCH 09/16] KVM: x86/mmu: Move private vs. shared check above slot validity checks Sean Christopherson
2024-03-05 23:06   ` Huang, Kai
2024-03-06  0:38     ` Sean Christopherson
2024-03-06  1:22       ` Huang, Kai
2024-03-06  2:02         ` Sean Christopherson
2024-03-06 22:06           ` Huang, Kai
2024-03-06 23:49             ` Sean Christopherson
2024-03-07  0:28               ` Huang, Kai
2024-03-08  4:54   ` Xu Yilun
2024-03-08 23:28     ` Sean Christopherson
2024-03-11  4:43       ` Xu Yilun
2024-03-12  0:08         ` Sean Christopherson
2024-02-28  2:41 ` Sean Christopherson [this message]
2024-03-07  0:03   ` [PATCH 10/16] KVM: x86/mmu: Don't force emulation of L2 accesses to non-APIC internal slots Huang, Kai
2024-02-28  2:41 ` [PATCH 11/16] KVM: x86/mmu: Explicitly disallow private accesses to emulated MMIO Sean Christopherson
2024-03-06 22:35   ` Huang, Kai
2024-03-06 22:43     ` Sean Christopherson
2024-03-06 22:49       ` Huang, Kai
2024-03-06 23:01         ` Sean Christopherson
2024-03-06 23:20           ` Huang, Kai
2024-03-07 17:10         ` Kirill A. Shutemov
2024-03-08  0:09           ` Huang, Kai
2024-02-28  2:41 ` [PATCH 12/16] KVM: x86/mmu: Move slot checks from __kvm_faultin_pfn() to kvm_faultin_pfn() Sean Christopherson
2024-03-07  0:11   ` Huang, Kai
2024-02-28  2:41 ` [PATCH 13/16] KVM: x86/mmu: Handle no-slot faults at the beginning of kvm_faultin_pfn() Sean Christopherson
2024-03-07  0:48   ` Huang, Kai
2024-03-07  0:53     ` Sean Christopherson
2024-02-28  2:41 ` [PATCH 14/16] KVM: x86/mmu: Set kvm_page_fault.hva to KVM_HVA_ERR_BAD for "no slot" faults Sean Christopherson
2024-03-07  0:50   ` Huang, Kai
2024-03-07  1:01     ` Sean Christopherson
2024-02-28  2:41 ` [PATCH 15/16] KVM: x86/mmu: Initialize kvm_page_fault's pfn and hva to error values Sean Christopherson
2024-03-07  0:46   ` Huang, Kai
2024-02-28  2:41 ` [PATCH 16/16] KVM: x86/mmu: Sanity check that __kvm_faultin_pfn() doesn't create noslot pfns Sean Christopherson
2024-03-07  0:46   ` Huang, Kai
2024-04-17 12:48 ` [PATCH 00/16] KVM: x86/mmu: Page fault and MMIO cleanups Paolo Bonzini
2024-04-18 15:40   ` Sean Christopherson
2024-04-19  6:47   ` Xiaoyao Li

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240228024147.41573-11-seanjc@google.com \
    --to=seanjc@google.com \
    --cc=chao.p.peng@linux.intel.com \
    --cc=dmatlack@google.com \
    --cc=isaku.yamahata@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=michael.roth@amd.com \
    --cc=pbonzini@redhat.com \
    --cc=tabba@google.com \
    --cc=yan.y.zhao@intel.com \
    --cc=yu.c.zhang@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).