From: Michael Roth <michael.roth@amd.com>
To: <x86@kernel.org>
Cc: <kvm@vger.kernel.org>, <linux-coco@lists.linux.dev>,
	<linux-mm@kvack.org>, <linux-crypto@vger.kernel.org>,
	<linux-kernel@vger.kernel.org>, <tglx@linutronix.de>,
	<mingo@redhat.com>, <jroedel@suse.de>, <thomas.lendacky@amd.com>,
	<hpa@zytor.com>, <ardb@kernel.org>, <pbonzini@redhat.com>,
	<seanjc@google.com>, <vkuznets@redhat.com>, <jmattson@google.com>,
	<luto@kernel.org>, <dave.hansen@linux.intel.com>,
	<slp@redhat.com>, <pgonda@google.com>, <peterz@infradead.org>,
	<srinivas.pandruvada@linux.intel.com>, <rientjes@google.com>,
	<tobin@ibm.com>, <bp@alien8.de>, <vbabka@suse.cz>,
	<kirill@shutemov.name>, <ak@linux.intel.com>,
	<tony.luck@intel.com>,
	<sathyanarayanan.kuppuswamy@linux.intel.com>,
	<alpergun@google.com>, <jarkko@kernel.org>,
	<ashish.kalra@amd.com>, <nikunj.dadhania@amd.com>,
	<pankaj.gupta@amd.com>, <liam.merwick@oracle.com>,
	Brijesh Singh <brijesh.singh@amd.com>,
	Marc Orr <marcorr@google.com>
Subject: [PATCH v2 21/25] KVM: SEV: Make AVIC backing, VMSA and VMCB memory allocation SNP safe
Date: Thu, 25 Jan 2024 22:11:21 -0600
Message-ID: <20240126041126.1927228-22-michael.roth@amd.com>
In-Reply-To: <20240126041126.1927228-1-michael.roth@amd.com>

From: Brijesh Singh <brijesh.singh@amd.com>

Implement a workaround for an SNP erratum where the CPU will incorrectly
signal an RMP violation #PF if a hugepage (2MB or 1GB) collides with the
RMP entry of a VMCB, VMSA or AVIC backing page.

When SEV-SNP is globally enabled, the CPU marks the VMCB, VMSA, and AVIC
backing pages as "in-use" via a reserved bit in the corresponding RMP
entry after a successful VMRUN. This is done for _all_ VMs, not just
SNP-Active VMs.

If the hypervisor accesses an in-use page through a writable
translation, the CPU will throw an RMP violation #PF. On early SNP
hardware, if an in-use page is 2MB-aligned and software accesses any
part of the associated 2MB region with a hugepage, the CPU will
incorrectly treat the entire 2MB region as in-use and signal an RMP
violation #PF.
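
As an illustration only, the condition described above can be modeled in
user space with plain address arithmetic. This is a sketch, not kernel
code: the model_* names and constants are hypothetical, 4KB base pages
are assumed, and only the 2MB case is modeled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model constant: size of a 2MB hugepage region in bytes. */
#define MODEL_PMD_SIZE (UINT64_C(1) << 21)

/* True if the 4KB page at physical address pa starts a 2MB region. */
static bool model_page_is_2mb_aligned(uint64_t pa)
{
	return (pa & (MODEL_PMD_SIZE - 1)) == 0;
}

/* Base physical address of the 2MB region that contains pa. */
static uint64_t model_2mb_region_base(uint64_t pa)
{
	return pa & ~(MODEL_PMD_SIZE - 1);
}

/*
 * Model of the erratum as described in this commit message: a hugepage
 * access at access_pa is treated as touching an in-use page when the
 * in-use page is 2MB-aligned and access_pa lies in the 2MB region that
 * the in-use page starts.
 */
static bool model_hugepage_access_faults(uint64_t in_use_pa, uint64_t access_pa)
{
	return model_page_is_2mb_aligned(in_use_pa) &&
	       model_2mb_region_base(access_pa) == in_use_pa;
}
```

In this model, moving the in-use page off the 2MB boundary makes
model_hugepage_access_faults() return false for every access in the
region, which is the property the workaround relies on.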

To avoid this, the recommendation is to not use a 2MB-aligned page for
the VMCB, VMSA or AVIC pages. Add a generic allocator that will ensure
that the page returned is not 2MB-aligned and is safe to be used when
SEV-SNP is enabled. Also implement similar handling for the VMCB/VMSA
pages of nested guests.
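
The selection step of the allocator can be sketched outside the kernel
as follows. This is a user-space model, not the patch itself: the
model_* names are hypothetical, and the value 512 assumes 4KB pages with
2MB PMD mappings (the role PTRS_PER_PMD plays in the patch). It also
assumes a two-page allocation starts on an even PFN, so the second page
can never be 2MB-aligned.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model constant: number of 4KB pages in one 2MB region. */
#define MODEL_PTRS_PER_PMD 512u

/* True if the page frame number sits on a 2MB boundary. */
static bool model_pfn_is_2mb_aligned(uint64_t pfn)
{
	return (pfn % MODEL_PTRS_PER_PMD) == 0;
}

/*
 * Given the first PFN of a two-page allocation, return the PFN to keep.
 * If the first page is 2MB-aligned, keep the second page; otherwise keep
 * the first. The page that is not returned would be freed by the caller.
 */
static uint64_t model_snp_safe_pick(uint64_t first_pfn)
{
	if (model_pfn_is_2mb_aligned(first_pfn))
		return first_pfn + 1;

	return first_pfn;
}
```

For example, a pair starting at PFN 512 yields PFN 513, while a pair
starting at PFN 514 yields PFN 514. The cost of this approach is one
temporary extra page per allocation.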

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Co-developed-by: Marc Orr <marcorr@google.com>
Signed-off-by: Marc Orr <marcorr@google.com>
Reported-by: Alper Gun <alpergun@google.com> # for nested VMSA case
Co-developed-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
[mdr: squash in nested guest handling from Ashish, commit msg fixups]
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 arch/x86/include/asm/kvm-x86-ops.h |  1 +
 arch/x86/include/asm/kvm_host.h    |  1 +
 arch/x86/kvm/lapic.c               |  5 ++++-
 arch/x86/kvm/svm/nested.c          |  2 +-
 arch/x86/kvm/svm/sev.c             | 32 ++++++++++++++++++++++++++++++
 arch/x86/kvm/svm/svm.c             | 17 +++++++++++++---
 arch/x86/kvm/svm/svm.h             |  1 +
 7 files changed, 54 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
index 378ed944b849..ab24ce207988 100644
--- a/arch/x86/include/asm/kvm-x86-ops.h
+++ b/arch/x86/include/asm/kvm-x86-ops.h
@@ -138,6 +138,7 @@ KVM_X86_OP(complete_emulated_msr)
 KVM_X86_OP(vcpu_deliver_sipi_vector)
 KVM_X86_OP_OPTIONAL_RET0(vcpu_get_apicv_inhibit_reasons);
 KVM_X86_OP_OPTIONAL(get_untagged_addr)
+KVM_X86_OP_OPTIONAL(alloc_apic_backing_page)
 
 #undef KVM_X86_OP
 #undef KVM_X86_OP_OPTIONAL
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index b5b2d0fde579..5c12af29fd9b 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1794,6 +1794,7 @@ struct kvm_x86_ops {
 	unsigned long (*vcpu_get_apicv_inhibit_reasons)(struct kvm_vcpu *vcpu);
 
 	gva_t (*get_untagged_addr)(struct kvm_vcpu *vcpu, gva_t gva, unsigned int flags);
+	void *(*alloc_apic_backing_page)(struct kvm_vcpu *vcpu);
 };
 
 struct kvm_x86_nested_ops {
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 3242f3da2457..1edf93ee3395 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -2815,7 +2815,10 @@ int kvm_create_lapic(struct kvm_vcpu *vcpu, int timer_advance_ns)
 
 	vcpu->arch.apic = apic;
 
-	apic->regs = (void *)get_zeroed_page(GFP_KERNEL_ACCOUNT);
+	if (kvm_x86_ops.alloc_apic_backing_page)
+		apic->regs = static_call(kvm_x86_alloc_apic_backing_page)(vcpu);
+	else
+		apic->regs = (void *)get_zeroed_page(GFP_KERNEL_ACCOUNT);
 	if (!apic->regs) {
 		printk(KERN_ERR "malloc apic regs error for vcpu %x\n",
 		       vcpu->vcpu_id);
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index dee62362a360..55b9a6d96bcf 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -1181,7 +1181,7 @@ int svm_allocate_nested(struct vcpu_svm *svm)
 	if (svm->nested.initialized)
 		return 0;
 
-	vmcb02_page = alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO);
+	vmcb02_page = snp_safe_alloc_page(&svm->vcpu);
 	if (!vmcb02_page)
 		return -ENOMEM;
 	svm->nested.vmcb02.ptr = page_address(vmcb02_page);
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 564091f386f7..f99435b6648f 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -3163,3 +3163,35 @@ void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector)
 
 	ghcb_set_sw_exit_info_2(svm->sev_es.ghcb, 1);
 }
+
+struct page *snp_safe_alloc_page(struct kvm_vcpu *vcpu)
+{
+	unsigned long pfn;
+	struct page *p;
+
+	if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP))
+		return alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO);
+
+	/*
+	 * Allocate an SNP-safe page to work around the SNP erratum where
+	 * the CPU will incorrectly signal an RMP violation #PF if a
+	 * hugepage (2MB or 1GB) collides with the RMP entry of a
+	 * 2MB-aligned VMCB, VMSA, or AVIC backing page.
+	 *
+	 * Allocate one extra page, choose a page which is not
+	 * 2MB-aligned, and free the other.
+	 */
+	p = alloc_pages(GFP_KERNEL_ACCOUNT | __GFP_ZERO, 1);
+	if (!p)
+		return NULL;
+
+	split_page(p, 1);
+
+	pfn = page_to_pfn(p);
+	if (IS_ALIGNED(pfn, PTRS_PER_PMD))
+		__free_page(p++);
+	else
+		__free_page(p + 1);
+
+	return p;
+}
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 61f2bdc9f4f8..272d5ed37ce7 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -703,7 +703,7 @@ static int svm_cpu_init(int cpu)
 	int ret = -ENOMEM;
 
 	memset(sd, 0, sizeof(struct svm_cpu_data));
-	sd->save_area = alloc_page(GFP_KERNEL | __GFP_ZERO);
+	sd->save_area = snp_safe_alloc_page(NULL);
 	if (!sd->save_area)
 		return ret;
 
@@ -1421,7 +1421,7 @@ static int svm_vcpu_create(struct kvm_vcpu *vcpu)
 	svm = to_svm(vcpu);
 
 	err = -ENOMEM;
-	vmcb01_page = alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO);
+	vmcb01_page = snp_safe_alloc_page(vcpu);
 	if (!vmcb01_page)
 		goto out;
 
@@ -1430,7 +1430,7 @@ static int svm_vcpu_create(struct kvm_vcpu *vcpu)
 		 * SEV-ES guests require a separate VMSA page used to contain
 		 * the encrypted register state of the guest.
 		 */
-		vmsa_page = alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO);
+		vmsa_page = snp_safe_alloc_page(vcpu);
 		if (!vmsa_page)
 			goto error_free_vmcb_page;
 
@@ -4900,6 +4900,16 @@ static int svm_vm_init(struct kvm *kvm)
 	return 0;
 }
 
+static void *svm_alloc_apic_backing_page(struct kvm_vcpu *vcpu)
+{
+	struct page *page = snp_safe_alloc_page(vcpu);
+
+	if (!page)
+		return NULL;
+
+	return page_address(page);
+}
+
 static struct kvm_x86_ops svm_x86_ops __initdata = {
 	.name = KBUILD_MODNAME,
 
@@ -5031,6 +5041,7 @@ static struct kvm_x86_ops svm_x86_ops __initdata = {
 
 	.vcpu_deliver_sipi_vector = svm_vcpu_deliver_sipi_vector,
 	.vcpu_get_apicv_inhibit_reasons = avic_vcpu_get_apicv_inhibit_reasons,
+	.alloc_apic_backing_page = svm_alloc_apic_backing_page,
 };
 
 /*
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 8ef95139cd24..7f1fbd874c45 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -694,6 +694,7 @@ void sev_es_vcpu_reset(struct vcpu_svm *svm);
 void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector);
 void sev_es_prepare_switch_to_guest(struct sev_es_save_area *hostsa);
 void sev_es_unmap_ghcb(struct vcpu_svm *svm);
+struct page *snp_safe_alloc_page(struct kvm_vcpu *vcpu);
 
 /* vmenter.S */
 
-- 
2.25.1


Thread overview: 73+ messages
2024-01-26  4:11 [PATCH v2 00/25] Add AMD Secure Nested Paging (SEV-SNP) Initialization Support Michael Roth
2024-01-26  4:11 ` [PATCH v2 01/25] x86/cpufeatures: Add SEV-SNP CPU feature Michael Roth
2024-01-30 16:26   ` [tip: x86/sev] " tip-bot2 for Brijesh Singh
2024-01-26  4:11 ` [PATCH v2 02/25] x86/speculation: Do not enable Automatic IBRS if SEV SNP is enabled Michael Roth
2024-01-30 16:26   ` [tip: x86/sev] x86/speculation: Do not enable Automatic IBRS if SEV-SNP " tip-bot2 for Kim Phillips
2024-01-26  4:11 ` [PATCH v2 03/25] iommu/amd: Don't rely on external callers to enable IOMMU SNP support Michael Roth
2024-01-30 16:26   ` [tip: x86/sev] " tip-bot2 for Ashish Kalra
2024-01-26  4:11 ` [PATCH v2 04/25] x86/sev: Add the host SEV-SNP initialization support Michael Roth
2024-01-30 16:26   ` [tip: x86/sev] x86/sev: Add SEV-SNP host " tip-bot2 for Brijesh Singh
2024-01-26  4:11 ` [PATCH v2 05/25] x86/mtrr: Don't print errors if MtrrFixDramModEn is set when SNP enabled Michael Roth
2024-01-30 16:26   ` [tip: x86/sev] " tip-bot2 for Ashish Kalra
2024-01-26  4:11 ` [PATCH v2 06/25] x86/sev: Add RMP entry lookup helpers Michael Roth
2024-01-30 16:26   ` [tip: x86/sev] " tip-bot2 for Brijesh Singh
2024-01-26  4:11 ` [PATCH v2 07/25] x86/fault: Add helper for dumping RMP entries Michael Roth
2024-01-30 16:26   ` [tip: x86/sev] " tip-bot2 for Brijesh Singh
2024-01-26  4:11 ` [PATCH v2 08/25] x86/traps: Define RMP violation #PF error code Michael Roth
2024-01-30 16:26   ` [tip: x86/sev] " tip-bot2 for Brijesh Singh
2024-01-26  4:11 ` [PATCH v2 09/25] x86/fault: Dump RMP table information when RMP page faults occur Michael Roth
2024-01-30 16:26   ` [tip: x86/sev] " tip-bot2 for Michael Roth
2024-01-26  4:11 ` [PATCH v2 10/25] x86/sev: Add helper functions for RMPUPDATE and PSMASH instruction Michael Roth
2024-01-29 18:00   ` Liam Merwick
2024-01-29 19:28     ` Borislav Petkov
2024-01-29 19:33       ` Borislav Petkov
2024-01-30 16:26   ` [tip: x86/sev] " tip-bot2 for Brijesh Singh
2024-01-26  4:11 ` [PATCH v2 11/25] x86/sev: Adjust directmap to avoid inadvertant RMP faults Michael Roth
2024-01-26 15:34   ` Borislav Petkov
2024-01-26 17:04     ` Michael Roth
2024-01-26 18:43       ` Borislav Petkov
2024-01-26 23:54         ` Michael Roth
2024-01-27 11:42           ` Borislav Petkov
2024-01-27 15:45             ` Michael Roth
2024-01-27 16:02               ` Borislav Petkov
2024-01-29 11:59                 ` Borislav Petkov
2024-01-29 15:26                   ` Vlastimil Babka
2024-01-30 16:26   ` [tip: x86/sev] x86/sev: Adjust the directmap to avoid inadvertent " tip-bot2 for Michael Roth
2024-01-26  4:11 ` [PATCH v2 12/25] crypto: ccp: Define the SEV-SNP commands Michael Roth
2024-01-30 16:26   ` [tip: x86/sev] " tip-bot2 for Brijesh Singh
2024-01-26  4:11 ` [PATCH v2 13/25] crypto: ccp: Add support to initialize the AMD-SP for SEV-SNP Michael Roth
2024-01-29 17:58   ` Borislav Petkov
2024-01-30 16:26   ` [tip: x86/sev] " tip-bot2 for Brijesh Singh
2024-01-26  4:11 ` [PATCH v2 14/25] crypto: ccp: Provide API to issue SEV and SNP commands Michael Roth
2024-01-30 16:26   ` [tip: x86/sev] crypto: ccp: Provide an " tip-bot2 for Brijesh Singh
2024-01-26  4:11 ` [PATCH v2 15/25] x86/sev: Introduce snp leaked pages list Michael Roth
2024-01-29 14:26   ` Vlastimil Babka
2024-01-29 14:29     ` Borislav Petkov
2024-01-30 16:26   ` [tip: x86/sev] x86/sev: Introduce an SNP " tip-bot2 for Ashish Kalra
2024-01-26  4:11 ` [PATCH v2 16/25] crypto: ccp: Handle the legacy TMR allocation when SNP is enabled Michael Roth
2024-01-29 15:04   ` Borislav Petkov
2024-01-30 16:26   ` [tip: x86/sev] " tip-bot2 for Brijesh Singh
2024-01-26  4:11 ` [PATCH v2 17/25] crypto: ccp: Handle non-volatile INIT_EX data " Michael Roth
2024-01-29 15:12   ` Borislav Petkov
2024-01-30 16:26   ` [tip: x86/sev] " tip-bot2 for Tom Lendacky
2024-01-26  4:11 ` [PATCH v2 18/25] crypto: ccp: Handle legacy SEV commands " Michael Roth
2024-01-30 16:26   ` [tip: x86/sev] " tip-bot2 for Brijesh Singh
2024-01-26  4:11 ` [PATCH v2 19/25] iommu/amd: Clean up RMP entries for IOMMU pages during SNP shutdown Michael Roth
2024-01-30 16:26   ` [tip: x86/sev] " tip-bot2 for Ashish Kalra
2024-02-07 17:13   ` [tip: x86/sev] iommu/amd: Fix failure return from snp_lookup_rmpentry() tip-bot2 for Ashish Kalra
2024-01-26  4:11 ` [PATCH v2 20/25] crypto: ccp: Add panic notifier for SEV/SNP firmware shutdown on kdump Michael Roth
2024-01-30 16:25   ` [tip: x86/sev] " tip-bot2 for Ashish Kalra
2024-01-26  4:11 ` Michael Roth [this message]
2024-01-26 11:00   ` [PATCH v2 21/25] KVM: SEV: Make AVIC backing, VMSA and VMCB memory allocation SNP safe Paolo Bonzini
2024-01-30 16:25   ` [tip: x86/sev] " tip-bot2 for Brijesh Singh
2024-01-26  4:11 ` [PATCH v2 22/25] x86/cpufeatures: Enable/unmask SEV-SNP CPU feature Michael Roth
2024-01-30 16:25   ` [tip: x86/sev] " tip-bot2 for Michael Roth
2024-01-26  4:11 ` [PATCH v2 23/25] crypto: ccp: Add the SNP_PLATFORM_STATUS command Michael Roth
2024-01-30 16:25   ` [tip: x86/sev] " tip-bot2 for Brijesh Singh
2024-01-26  4:11 ` [PATCH v2 24/25] crypto: ccp: Add the SNP_COMMIT command Michael Roth
2024-01-30 16:25   ` [tip: x86/sev] " tip-bot2 for Tom Lendacky
2024-01-26  4:11 ` [PATCH v2 25/25] crypto: ccp: Add the SNP_SET_CONFIG command Michael Roth
2024-01-29 19:18   ` Liam Merwick
2024-01-29 20:10     ` Michael Roth
2024-01-30 16:25   ` [tip: x86/sev] " tip-bot2 for Brijesh Singh
2024-01-30 16:19 ` [PATCH v2 00/25] Add AMD Secure Nested Paging (SEV-SNP) Initialization Support Borislav Petkov
