linux-coco.lists.linux.dev archive mirror
* [PATCH v13 00/26] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support
@ 2024-04-18 19:41 Michael Roth
  2024-04-18 19:41 ` [PATCH v13 01/26] [TEMP] x86/kvm/Kconfig: Have KVM_AMD_SEV select ARCH_HAS_CC_PLATFORM Michael Roth
                   ` (26 more replies)
  0 siblings, 27 replies; 41+ messages in thread
From: Michael Roth @ 2024-04-18 19:41 UTC (permalink / raw)
  To: kvm
  Cc: linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta,
	liam.merwick

This patchset is also available at:

  https://github.com/amdese/linux/commits/snp-host-v13

and is based on commit 4d2deb62185f (as suggested by Paolo) from:

  https://git.kernel.org/pub/scm/virt/kvm/kvm.git/log/?h=kvm-coco-queue


Patch Layout
------------

01-03: These patches are minor dependencies for this series and are already
       included in both tip/master and mainline, so are only included here
       as a stop-gap until merged from one of those trees. These are needed
       by patch #8 in this series which makes use of CC_ATTR_HOST_SEV_SNP.

04:    This is a small general fix-up for guest_memfd that can be applied
       independently of this series.

05-08: These patches add some basic infrastructure and introduce a new
       KVM_X86_SNP_VM vm_type to handle differences versus the existing
       KVM_X86_SEV_VM and KVM_X86_SEV_ES_VM types.

09-11: These implement the KVM API to handle the creation of a
       cryptographic launch context, encrypt/measure the initial image
       into guest memory, and finalize it before launching it.

12-17: These implement handling for various guest-generated events such
       as page state changes, onlining of additional vCPUs, etc.

18-21: These implement the gmem hooks needed to prepare gmem-allocated
       pages before mapping them into guest private memory ranges as
       well as cleaning them up prior to returning them to the host for
       use as normal memory. Because this supplants certain activities
       like issuing WBINVDs during KVM MMU invalidations, there's also
       a patch that skips the duplicated work to avoid unnecessary
       overhead.

22:    With all the core support in place, this patch adds a kvm_amd module
       parameter to enable SNP support.

23-26: These patches all deal with the servicing of guest requests to handle
       things like attestation, as well as some related host-management
       interfaces.


Testing
-------

For testing this via QEMU, use the following tree:

  https://github.com/amdese/qemu/commits/snp-v4-wip3

A patched OVMF is also needed due to upstream KVM no longer supporting MMIO
ranges that are mapped as private. It is recommended you build the AmdSevX64
variant as it provides the kernel-hashing support present in this series:

  https://github.com/amdese/ovmf/commits/apic-mmio-fix1d

A basic command-line invocation for SNP would be:

 qemu-system-x86_64 -smp 32,maxcpus=255 -cpu EPYC-Milan-v2 \
  -machine q35,confidential-guest-support=sev0,memory-backend=ram1 \
  -object memory-backend-memfd,id=ram1,size=4G,share=true,reserve=false \
  -object sev-snp-guest,id=sev0,cbitpos=51,reduced-phys-bits=1,id-auth= \
  -bios OVMF_CODE-upstream-20240410-apic-mmio-fix1d-AmdSevX64.fd

With kernel-hashing and certificate data supplied:

 qemu-system-x86_64 -smp 32,maxcpus=255 -cpu EPYC-Milan-v2 \
  -machine q35,confidential-guest-support=sev0,memory-backend=ram1 \
  -object memory-backend-memfd,id=ram1,size=4G,share=true,reserve=false \
  -object sev-snp-guest,id=sev0,cbitpos=51,reduced-phys-bits=1,id-auth=,certs-path=/home/mroth/cert.blob,kernel-hashes=on \
  -bios OVMF_CODE-upstream-20240410-apic-mmio-fix1d-AmdSevX64.fd \
  -kernel /boot/vmlinuz-$ver \
  -initrd /boot/initrd.img-$ver \
  -append "root=UUID=d72a6d1c-06cf-4b79-af43-f1bac4f620f9 ro console=ttyS0,115200n8"

With standard X64 OVMF package with separate image for persistent NVRAM:

 qemu-system-x86_64 -smp 32,maxcpus=255 -cpu EPYC-Milan-v2 \
  -machine q35,confidential-guest-support=sev0,memory-backend=ram1 \
  -object memory-backend-memfd,id=ram1,size=4G,share=true,reserve=false \
  -object sev-snp-guest,id=sev0,cbitpos=51,reduced-phys-bits=1,id-auth= \
  -bios OVMF_CODE-upstream-20240410-apic-mmio-fix1d.fd \
  -drive if=pflash,format=raw,unit=0,file=OVMF_VARS-upstream-20240410-apic-mmio-fix1d.fd,readonly=off


Known issues / TODOs
--------------------

 * SEV-ES guests may trigger the following warning:

     WARNING: CPU: 151 PID: 4003 at arch/x86/kvm/mmu/mmu.c:5855 kvm_mmu_page_fault+0x33b/0x860 [kvm]

   It is assumed here that these will be resolved once the transition to
   PFERR_PRIVATE_ACCESS is fully completed, but if that's not the case let me
   know and I will investigate further.

 * Base tree in some cases reports "Unpatched return thunk in use. This should
   not happen!" the first time it runs an SVM/SEV/SNP guest. This is a recent
   regression upstream and unrelated to this series:

     https://lore.kernel.org/linux-kernel/CANpmjNOcKzEvLHoGGeL-boWDHJobwfwyVxUqMq2kWeka3N4tXA@mail.gmail.com/T/

 * 2MB hugepage support has been dropped pending discussion on how we plan to
   re-enable it in gmem.

 * Host kexec should work, but there is a known issue with host kdump support
   while SNP guests are running that will be addressed as a follow-up.

 * SNP kselftests are currently a WIP and will be included as part of SNP
   upstreaming efforts in the near-term.


SEV-SNP Overview
----------------

This part of the Secure Nested Paging (SEV-SNP) series focuses on the
changes required to add KVM support for SEV-SNP. This series builds upon
SEV-SNP guest support, which is now in mainline, and SEV-SNP host
initialization support, which is now in linux-next.

While this series provides the basic building blocks to support booting
SEV-SNP VMs, it does not cover all the security enhancements introduced by
SEV-SNP, such as interrupt protection, which will be added in the future.

With SNP, when pages are marked as guest-owned in the RMP table, they are
assigned to a specific guest/ASID, as well as a specific GFN within the
guest. Any attempt to map such a page in the RMP table to a different
guest/ASID, or to a different GFN within a guest/ASID, will result in an RMP
nested page fault.

Prior to accessing a guest-owned page, the guest must validate it with a
special PVALIDATE instruction which will set a special bit in the RMP table
for the guest. This is the only way to set the validated bit outside of the
initial pre-encrypted guest payload/image; any attempts outside the guest to
modify the RMP entry from that point forward will result in the validated
bit being cleared, at which point the guest will trigger an exception if it
attempts to access that page so it can be made aware of possible tampering.

One exception to this is the initial guest payload, which is pre-validated
by the firmware prior to launching. The guest can use Guest Message requests 
to fetch an attestation report which will include the measurement of the
initial image so that the guest can verify it was booted with the expected
image/environment.
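
As a rough illustration of the ownership and validation rules described
above, here is a toy userspace model in C. It is only a sketch: the
toy_rmp_entry layout and the toy_* helpers are hypothetical names made up
for this example, and the real RMP entry format is processor-specific. The
model also folds the two failure modes (RMP nested page fault on an
ASID/GFN mismatch, guest exception on a cleared validated bit) into a
single "access not OK" result.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical, simplified stand-in for one RMP entry. The real entry
 * format is defined per processor and is not modeled here.
 */
struct toy_rmp_entry {
	bool     assigned;   /* true: guest-owned, false: hypervisor-owned */
	bool     validated;  /* set by the guest via PVALIDATE */
	uint32_t asid;       /* owning guest */
	uint64_t gfn;        /* guest frame the page is bound to */
};

/*
 * Host side: bind a page to a guest/ASID at a specific GFN. Any host
 * modification of the entry clears the validated bit.
 */
void toy_rmp_make_private(struct toy_rmp_entry *e, uint32_t asid, uint64_t gfn)
{
	e->assigned = true;
	e->asid = asid;
	e->gfn = gfn;
	e->validated = false;
}

/*
 * Guest side: models PVALIDATE. Fails unless the page is assigned to
 * this guest at this GFN.
 */
bool toy_pvalidate(struct toy_rmp_entry *e, uint32_t asid, uint64_t gfn)
{
	if (!e->assigned || e->asid != asid || e->gfn != gfn)
		return false;
	e->validated = true;
	return true;
}

/*
 * Models the check applied to a guest access of a private page: the
 * access only proceeds when owner, GFN, and validated bit all match.
 */
bool toy_guest_access_ok(const struct toy_rmp_entry *e, uint32_t asid,
			 uint64_t gfn)
{
	return e->assigned && e->asid == asid && e->gfn == gfn && e->validated;
}
```

In this model, a page that the host re-assigns after the guest has
validated it fails toy_guest_access_ok() until the guest validates it
again, which is how the guest becomes aware of possible tampering.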

After boot, guests can use Page State Change requests to switch pages
between shared/hypervisor-owned and private/guest-owned to share data for
things like DMA, virtio buffers, and other GHCB requests.

In this implementation of SEV-SNP, private guest memory is managed by a new
kernel framework called guest_memfd (gmem). With gmem, a new
KVM_SET_MEMORY_ATTRIBUTES KVM ioctl has been added to tell the KVM
MMU whether a particular GFN should be backed by shared (normal) memory or
private (gmem-allocated) memory. To tie into this, Page State Change
requests are forwarded to userspace via KVM_EXIT_VMGEXIT exits, which will
then issue the corresponding KVM_SET_MEMORY_ATTRIBUTES call to set the
private/shared state in the KVM MMU.

The gmem / KVM MMU hooks implemented in this series will then update the RMP
table entries for the backing PFNs to set them to guest-owned/private when
mapping private pages into the guest via KVM MMU, or use the normal KVM MMU
handling in the case of shared pages where the corresponding RMP table
entries are left in the default shared/hypervisor-owned state.
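
To make the userspace side of that flow more concrete, the sketch below
shows how a VMM might translate a Page State Change request into the
arguments for a KVM_SET_MEMORY_ATTRIBUTES call. Everything named toy_* is
an assumption for illustration: toy_psc_request is not the exit layout
used by this series, and toy_memory_attributes only mirrors the
address/size/attributes/flags fields of struct kvm_memory_attributes from
<linux/kvm.h> so that the example builds without recent kernel headers. A
4K page size is assumed.

```c
#include <stdint.h>

#define TOY_PAGE_SHIFT   12            /* assumes 4K pages */
#define TOY_ATTR_PRIVATE (1ULL << 3)   /* mirrors KVM_MEMORY_ATTRIBUTE_PRIVATE */

/* Hypothetical decoded Page State Change request from the guest. */
enum toy_psc_op { TOY_PSC_SHARED, TOY_PSC_PRIVATE };

struct toy_psc_request {
	uint64_t        gfn;     /* first guest frame number */
	uint64_t        npages;  /* number of 4K pages */
	enum toy_psc_op op;      /* requested page state */
};

/* Local mirror of the fields of struct kvm_memory_attributes. */
struct toy_memory_attributes {
	uint64_t address;     /* guest physical address of the range */
	uint64_t size;        /* length of the range in bytes */
	uint64_t attributes;  /* private bit set, or 0 for shared */
	uint64_t flags;       /* left as 0 in this sketch */
};

/* Convert a PSC request into the attribute update userspace would issue. */
struct toy_memory_attributes toy_psc_to_attrs(const struct toy_psc_request *req)
{
	struct toy_memory_attributes attrs = {
		.address    = req->gfn << TOY_PAGE_SHIFT,
		.size       = req->npages << TOY_PAGE_SHIFT,
		.attributes = req->op == TOY_PSC_PRIVATE ? TOY_ATTR_PRIVATE : 0,
		.flags      = 0,
	};

	return attrs;
}
```

A real VMM would copy these values into a struct kvm_memory_attributes and
pass it to ioctl(vm_fd, KVM_SET_MEMORY_ATTRIBUTES, ...) before resuming
the vCPU; error handling and range validation are omitted here.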

Feedback/review is very much appreciated!

-Mike


Changes since v12:

 * rebased to latest kvm-coco-queue branch (commit 4d2deb62185f)
 * add more input validation for SNP_LAUNCH_START, especially for handling
   things like MBO/MBZ policy bits, and API major/minor minimums. (Paolo)
 * block SNP KVM instances from being able to run legacy SEV commands (Paolo)
 * don't attempt to measure VMSA for vcpu 0/BSP before the others, let
   userspace deal with the ordering just like with SEV-ES (Paolo)
 * fix up docs for SNP_LAUNCH_FINISH (Paolo)
 * introduce svm->sev_es.snp_has_guest_vmsa flag to better distinguish
   handling for guest-mapped vs non-guest-mapped VMSAs, rename
   'snp_ap_create' flag to 'snp_ap_waiting_for_reset' (Paolo)
 * drop "KVM: SEV: Use a VMSA physical address variable for populating VMCB"
   as it is no longer needed due to above VMSA rework
 * replace pr_debug_ratelimited() messages for RMP #NPFs with a single trace
   event
 * handle transient PSMASH_FAIL_INUSE return codes in kvm_gmem_invalidate(),
   switch to WARN_ON*()'s to indicate remaining error cases are not expected
   and should not be seen in practice. (Paolo)
 * add a cond_resched() in kvm_gmem_invalidate() to avoid soft lock-ups when
   cleaning up large guest memory ranges.
 * rename VLEK_REQUIRED to VCEK_DISABLE. it'll be more applicable if another
   key type ever gets added.
 * don't allow attestation to be paused while an attestation request is
   being processed by firmware (Tom)
 * add missing Documentation entry for SNP_VLEK_LOAD
 * collect Reviewed-by's from Paolo and Tom

Changes since v11:

 * Rebase series on kvm-coco-queue and re-work to leverage more
   infrastructure between SNP/TDX series.
 * Drop KVM_SNP_INIT in favor of the new KVM_SEV_INIT2 interface introduced
   here (Paolo):
     https://lore.kernel.org/lkml/20240318233352.2728327-1-pbonzini@redhat.com/
 * Drop exposure API fields related to things like VMPL levels, migration
   agents, etc., until they are actually supported/used (Sean)
 * Rework KVM_SEV_SNP_LAUNCH_UPDATE handling to use a new
   kvm_gmem_populate() interface instead of copying data directly into
   gmem-allocated pages (Sean)
 * Add support for SNP_LOAD_VLEK, rework the SNP_SET_CONFIG_{START,END} to
   have simpler semantics that are applicable to management of SNP_LOAD_VLEK
   updates as well, rename interfaces to the now more appropriate
   SNP_{PAUSE,RESUME}_ATTESTATION
 * Fix up documentation wording and do print warnings for
   userspace-triggerable failures (Peter, Sean)
 * Fix a race with AP_CREATION wake-up events (Jacob, Sean)
 * Fix a memory leak with VMSA pages (Sean)
 * Tighten up handling of RMP page faults to better distinguish between real
   and spurious cases (Tom)
 * Various patch/documentation rewording, cleanups, etc.


----------------------------------------------------------------
Ashish Kalra (1):
      KVM: SEV: Avoid WBINVD for HVA-based MMU notifications for SNP

Borislav Petkov (AMD) (3):
      [TEMP] x86/kvm/Kconfig: Have KVM_AMD_SEV select ARCH_HAS_CC_PLATFORM
      [TEMP] x86/cc: Add cc_platform_set/_clear() helpers
      [TEMP] x86/CPU/AMD: Track SNP host status with cc_platform_*()

Brijesh Singh (10):
      KVM: SEV: Add GHCB handling for Hypervisor Feature Support requests
      KVM: SEV: Add KVM_SEV_SNP_LAUNCH_START command
      KVM: SEV: Add KVM_SEV_SNP_LAUNCH_UPDATE command
      KVM: SEV: Add KVM_SEV_SNP_LAUNCH_FINISH command
      KVM: SEV: Add support to handle GHCB GPA register VMGEXIT
      KVM: SEV: Add support to handle MSR based Page State Change VMGEXIT
      KVM: SEV: Add support to handle Page State Change VMGEXIT
      KVM: SEV: Add support to handle RMP nested page faults
      KVM: SVM: Add module parameter to enable SEV-SNP
      KVM: SEV: Provide support for SNP_GUEST_REQUEST NAE event

Michael Roth (10):
      KVM: guest_memfd: Fix PTR_ERR() handling in __kvm_gmem_get_pfn()
      KVM: SEV: Select KVM_GENERIC_PRIVATE_MEM when CONFIG_KVM_AMD_SEV=y
      KVM: SEV: Add initial SEV-SNP support
      KVM: SEV: Add support for GHCB-based termination requests
      KVM: SEV: Implement gmem hook for initializing private pages
      KVM: SEV: Implement gmem hook for invalidating private pages
      KVM: x86: Implement gmem hook for determining max NPT mapping level
      crypto: ccp: Add the SNP_VLEK_LOAD command
      crypto: ccp: Add the SNP_{PAUSE,RESUME}_ATTESTATION commands
      KVM: SEV: Provide support for SNP_EXTENDED_GUEST_REQUEST NAE event

Tom Lendacky (2):
      KVM: SEV: Add support to handle AP reset MSR protocol
      KVM: SEV: Support SEV-SNP AP Creation NAE event

 Documentation/virt/coco/sev-guest.rst              |   69 +-
 Documentation/virt/kvm/api.rst                     |   73 +
 .../virt/kvm/x86/amd-memory-encryption.rst         |   88 +-
 arch/x86/coco/core.c                               |   52 +
 arch/x86/include/asm/kvm_host.h                    |    2 +
 arch/x86/include/asm/sev-common.h                  |   22 +-
 arch/x86/include/asm/sev.h                         |   19 +-
 arch/x86/include/asm/svm.h                         |    9 +-
 arch/x86/include/uapi/asm/kvm.h                    |   39 +
 arch/x86/kernel/cpu/amd.c                          |   38 +-
 arch/x86/kernel/cpu/mtrr/generic.c                 |    2 +-
 arch/x86/kernel/sev.c                              |   10 -
 arch/x86/kvm/Kconfig                               |    4 +
 arch/x86/kvm/mmu.h                                 |    2 -
 arch/x86/kvm/mmu/mmu.c                             |    1 +
 arch/x86/kvm/svm/sev.c                             | 1444 +++++++++++++++++++-
 arch/x86/kvm/svm/svm.c                             |   39 +-
 arch/x86/kvm/svm/svm.h                             |   50 +
 arch/x86/kvm/trace.h                               |   31 +
 arch/x86/kvm/x86.c                                 |   19 +-
 arch/x86/virt/svm/sev.c                            |  106 +-
 drivers/crypto/ccp/sev-dev.c                       |   85 +-
 drivers/iommu/amd/init.c                           |    4 +-
 include/linux/cc_platform.h                        |   12 +
 include/linux/psp-sev.h                            |    4 +-
 include/uapi/linux/kvm.h                           |   28 +
 include/uapi/linux/psp-sev.h                       |   39 +
 include/uapi/linux/sev-guest.h                     |    9 +
 virt/kvm/guest_memfd.c                             |    8 +-
 29 files changed, 2229 insertions(+), 79 deletions(-)




* [PATCH v13 01/26] [TEMP] x86/kvm/Kconfig: Have KVM_AMD_SEV select ARCH_HAS_CC_PLATFORM
  2024-04-18 19:41 [PATCH v13 00/26] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support Michael Roth
@ 2024-04-18 19:41 ` Michael Roth
  2024-04-18 19:41 ` [PATCH v13 02/26] [TEMP] x86/cc: Add cc_platform_set/_clear() helpers Michael Roth
                   ` (25 subsequent siblings)
  26 siblings, 0 replies; 41+ messages in thread
From: Michael Roth @ 2024-04-18 19:41 UTC (permalink / raw)
  To: kvm
  Cc: linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta,
	liam.merwick

From: "Borislav Petkov (AMD)" <bp@alien8.de>

The functionality to load SEV-SNP guests by the host will soon rely on
cc_platform* helpers because the cpu_feature* API with the early
patching is insufficient when SNP support needs to be disabled late.

Therefore, pull that functionality in.

Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 arch/x86/kvm/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index 7632fe6e4db9..d64fb2b3eb69 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -135,6 +135,7 @@ config KVM_AMD_SEV
 	default y
 	depends on KVM_AMD && X86_64
 	depends on CRYPTO_DEV_SP_PSP && !(KVM_AMD=y && CRYPTO_DEV_CCP_DD=m)
+	select ARCH_HAS_CC_PLATFORM
 	help
 	  Provides support for launching Encrypted VMs (SEV) and Encrypted VMs
 	  with Encrypted State (SEV-ES) on AMD processors.
-- 
2.25.1



* [PATCH v13 02/26] [TEMP] x86/cc: Add cc_platform_set/_clear() helpers
  2024-04-18 19:41 [PATCH v13 00/26] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support Michael Roth
  2024-04-18 19:41 ` [PATCH v13 01/26] [TEMP] x86/kvm/Kconfig: Have KVM_AMD_SEV select ARCH_HAS_CC_PLATFORM Michael Roth
@ 2024-04-18 19:41 ` Michael Roth
  2024-04-18 19:41 ` [PATCH v13 03/26] [TEMP] x86/CPU/AMD: Track SNP host status with cc_platform_*() Michael Roth
                   ` (24 subsequent siblings)
  26 siblings, 0 replies; 41+ messages in thread
From: Michael Roth @ 2024-04-18 19:41 UTC (permalink / raw)
  To: kvm
  Cc: linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta,
	liam.merwick

From: "Borislav Petkov (AMD)" <bp@alien8.de>

Add functionality to set and/or clear different attributes of the
machine as a confidential computing platform. Add the first one too:
whether the machine is running as a host for SEV-SNP guests.

Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 arch/x86/coco/core.c        | 52 +++++++++++++++++++++++++++++++++++++
 include/linux/cc_platform.h | 12 +++++++++
 2 files changed, 64 insertions(+)

diff --git a/arch/x86/coco/core.c b/arch/x86/coco/core.c
index d07be9d05cd0..8c3fae23d3c6 100644
--- a/arch/x86/coco/core.c
+++ b/arch/x86/coco/core.c
@@ -16,6 +16,11 @@
 enum cc_vendor cc_vendor __ro_after_init = CC_VENDOR_NONE;
 u64 cc_mask __ro_after_init;
 
+static struct cc_attr_flags {
+	__u64 host_sev_snp	: 1,
+	      __resv		: 63;
+} cc_flags;
+
 static bool noinstr intel_cc_platform_has(enum cc_attr attr)
 {
 	switch (attr) {
@@ -89,6 +94,9 @@ static bool noinstr amd_cc_platform_has(enum cc_attr attr)
 	case CC_ATTR_GUEST_SEV_SNP:
 		return sev_status & MSR_AMD64_SEV_SNP_ENABLED;
 
+	case CC_ATTR_HOST_SEV_SNP:
+		return cc_flags.host_sev_snp;
+
 	default:
 		return false;
 	}
@@ -148,3 +156,47 @@ u64 cc_mkdec(u64 val)
 	}
 }
 EXPORT_SYMBOL_GPL(cc_mkdec);
+
+static void amd_cc_platform_clear(enum cc_attr attr)
+{
+	switch (attr) {
+	case CC_ATTR_HOST_SEV_SNP:
+		cc_flags.host_sev_snp = 0;
+		break;
+	default:
+		break;
+	}
+}
+
+void cc_platform_clear(enum cc_attr attr)
+{
+	switch (cc_vendor) {
+	case CC_VENDOR_AMD:
+		amd_cc_platform_clear(attr);
+		break;
+	default:
+		break;
+	}
+}
+
+static void amd_cc_platform_set(enum cc_attr attr)
+{
+	switch (attr) {
+	case CC_ATTR_HOST_SEV_SNP:
+		cc_flags.host_sev_snp = 1;
+		break;
+	default:
+		break;
+	}
+}
+
+void cc_platform_set(enum cc_attr attr)
+{
+	switch (cc_vendor) {
+	case CC_VENDOR_AMD:
+		amd_cc_platform_set(attr);
+		break;
+	default:
+		break;
+	}
+}
diff --git a/include/linux/cc_platform.h b/include/linux/cc_platform.h
index cb0d6cd1c12f..60693a145894 100644
--- a/include/linux/cc_platform.h
+++ b/include/linux/cc_platform.h
@@ -90,6 +90,14 @@ enum cc_attr {
 	 * Examples include TDX Guest.
 	 */
 	CC_ATTR_HOTPLUG_DISABLED,
+
+	/**
+	 * @CC_ATTR_HOST_SEV_SNP: AMD SNP enabled on the host.
+	 *
+	 * The host kernel is running with the necessary features
+	 * enabled to run SEV-SNP guests.
+	 */
+	CC_ATTR_HOST_SEV_SNP,
 };
 
 #ifdef CONFIG_ARCH_HAS_CC_PLATFORM
@@ -107,10 +115,14 @@ enum cc_attr {
  * * FALSE - Specified Confidential Computing attribute is not active
  */
 bool cc_platform_has(enum cc_attr attr);
+void cc_platform_set(enum cc_attr attr);
+void cc_platform_clear(enum cc_attr attr);
 
 #else	/* !CONFIG_ARCH_HAS_CC_PLATFORM */
 
 static inline bool cc_platform_has(enum cc_attr attr) { return false; }
+static inline void cc_platform_set(enum cc_attr attr) { }
+static inline void cc_platform_clear(enum cc_attr attr) { }
 
 #endif	/* CONFIG_ARCH_HAS_CC_PLATFORM */
 
-- 
2.25.1



* [PATCH v13 03/26] [TEMP] x86/CPU/AMD: Track SNP host status with cc_platform_*()
  2024-04-18 19:41 [PATCH v13 00/26] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support Michael Roth
  2024-04-18 19:41 ` [PATCH v13 01/26] [TEMP] x86/kvm/Kconfig: Have KVM_AMD_SEV select ARCH_HAS_CC_PLATFORM Michael Roth
  2024-04-18 19:41 ` [PATCH v13 02/26] [TEMP] x86/cc: Add cc_platform_set/_clear() helpers Michael Roth
@ 2024-04-18 19:41 ` Michael Roth
  2024-04-18 19:41 ` [PATCH v13 04/26] KVM: guest_memfd: Fix PTR_ERR() handling in __kvm_gmem_get_pfn() Michael Roth
                   ` (23 subsequent siblings)
  26 siblings, 0 replies; 41+ messages in thread
From: Michael Roth @ 2024-04-18 19:41 UTC (permalink / raw)
  To: kvm
  Cc: linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta,
	liam.merwick

From: "Borislav Petkov (AMD)" <bp@alien8.de>

The host SNP worthiness can be determined later, after alternatives have
been patched, in snp_rmptable_init() depending on cmdline options like
iommu=pt which is incompatible with SNP, for example.

Which means that one cannot use X86_FEATURE_SEV_SNP and will need to
have a special flag for that control.

Use that newly added CC_ATTR_HOST_SEV_SNP in the appropriate places.

Move kdump_sev_callback() to its rightful place, while at it.

Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 arch/x86/include/asm/sev.h         |  4 ++--
 arch/x86/kernel/cpu/amd.c          | 38 ++++++++++++++++++------------
 arch/x86/kernel/cpu/mtrr/generic.c |  2 +-
 arch/x86/kernel/sev.c              | 10 --------
 arch/x86/kvm/svm/sev.c             |  2 +-
 arch/x86/virt/svm/sev.c            | 26 +++++++++++++-------
 drivers/crypto/ccp/sev-dev.c       |  2 +-
 drivers/iommu/amd/init.c           |  4 +++-
 8 files changed, 49 insertions(+), 39 deletions(-)

diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 9477b4053bce..780182cda3ab 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -228,7 +228,6 @@ int snp_issue_guest_request(u64 exit_code, struct snp_req_data *input, struct sn
 void snp_accept_memory(phys_addr_t start, phys_addr_t end);
 u64 snp_get_unsupported_features(u64 status);
 u64 sev_get_status(void);
-void kdump_sev_callback(void);
 void sev_show_status(void);
 #else
 static inline void sev_es_ist_enter(struct pt_regs *regs) { }
@@ -258,7 +257,6 @@ static inline int snp_issue_guest_request(u64 exit_code, struct snp_req_data *in
 static inline void snp_accept_memory(phys_addr_t start, phys_addr_t end) { }
 static inline u64 snp_get_unsupported_features(u64 status) { return 0; }
 static inline u64 sev_get_status(void) { return 0; }
-static inline void kdump_sev_callback(void) { }
 static inline void sev_show_status(void) { }
 #endif
 
@@ -270,6 +268,7 @@ int psmash(u64 pfn);
 int rmp_make_private(u64 pfn, u64 gpa, enum pg_level level, u32 asid, bool immutable);
 int rmp_make_shared(u64 pfn, enum pg_level level);
 void snp_leak_pages(u64 pfn, unsigned int npages);
+void kdump_sev_callback(void);
 #else
 static inline bool snp_probe_rmptable_info(void) { return false; }
 static inline int snp_lookup_rmpentry(u64 pfn, bool *assigned, int *level) { return -ENODEV; }
@@ -282,6 +281,7 @@ static inline int rmp_make_private(u64 pfn, u64 gpa, enum pg_level level, u32 as
 }
 static inline int rmp_make_shared(u64 pfn, enum pg_level level) { return -ENODEV; }
 static inline void snp_leak_pages(u64 pfn, unsigned int npages) {}
+static inline void kdump_sev_callback(void) { }
 #endif
 
 #endif
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 6d8677e80ddb..9bf17c9c29da 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -345,6 +345,28 @@ static void srat_detect_node(struct cpuinfo_x86 *c)
 #endif
 }
 
+static void bsp_determine_snp(struct cpuinfo_x86 *c)
+{
+#ifdef CONFIG_ARCH_HAS_CC_PLATFORM
+	cc_vendor = CC_VENDOR_AMD;
+
+	if (cpu_has(c, X86_FEATURE_SEV_SNP)) {
+		/*
+		 * RMP table entry format is not architectural and is defined by the
+		 * per-processor PPR. Restrict SNP support on the known CPU models
+		 * for which the RMP table entry format is currently defined for.
+		 */
+		if (!cpu_has(c, X86_FEATURE_HYPERVISOR) &&
+		    c->x86 >= 0x19 && snp_probe_rmptable_info()) {
+			cc_platform_set(CC_ATTR_HOST_SEV_SNP);
+		} else {
+			setup_clear_cpu_cap(X86_FEATURE_SEV_SNP);
+			cc_platform_clear(CC_ATTR_HOST_SEV_SNP);
+		}
+	}
+#endif
+}
+
 static void bsp_init_amd(struct cpuinfo_x86 *c)
 {
 	if (cpu_has(c, X86_FEATURE_CONSTANT_TSC)) {
@@ -452,21 +474,7 @@ static void bsp_init_amd(struct cpuinfo_x86 *c)
 		break;
 	}
 
-	if (cpu_has(c, X86_FEATURE_SEV_SNP)) {
-		/*
-		 * RMP table entry format is not architectural and it can vary by processor
-		 * and is defined by the per-processor PPR. Restrict SNP support on the
-		 * known CPU model and family for which the RMP table entry format is
-		 * currently defined for.
-		 */
-		if (!boot_cpu_has(X86_FEATURE_ZEN3) &&
-		    !boot_cpu_has(X86_FEATURE_ZEN4) &&
-		    !boot_cpu_has(X86_FEATURE_ZEN5))
-			setup_clear_cpu_cap(X86_FEATURE_SEV_SNP);
-		else if (!snp_probe_rmptable_info())
-			setup_clear_cpu_cap(X86_FEATURE_SEV_SNP);
-	}
-
+	bsp_determine_snp(c);
 	return;
 
 warn:
diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c
index 422a4ddc2ab7..7b29ebda024f 100644
--- a/arch/x86/kernel/cpu/mtrr/generic.c
+++ b/arch/x86/kernel/cpu/mtrr/generic.c
@@ -108,7 +108,7 @@ static inline void k8_check_syscfg_dram_mod_en(void)
 	      (boot_cpu_data.x86 >= 0x0f)))
 		return;
 
-	if (cpu_feature_enabled(X86_FEATURE_SEV_SNP))
+	if (cc_platform_has(CC_ATTR_HOST_SEV_SNP))
 		return;
 
 	rdmsr(MSR_AMD64_SYSCFG, lo, hi);
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index b59b09c2f284..1e1a3c3bd1e8 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -2287,16 +2287,6 @@ static int __init snp_init_platform_device(void)
 }
 device_initcall(snp_init_platform_device);
 
-void kdump_sev_callback(void)
-{
-	/*
-	 * Do wbinvd() on remote CPUs when SNP is enabled in order to
-	 * safely do SNP_SHUTDOWN on the local CPU.
-	 */
-	if (cpu_feature_enabled(X86_FEATURE_SEV_SNP))
-		wbinvd();
-}
-
 void sev_show_status(void)
 {
 	int i;
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 1642d7d49bde..598d78b4107f 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -3296,7 +3296,7 @@ struct page *snp_safe_alloc_page(struct kvm_vcpu *vcpu)
 	unsigned long pfn;
 	struct page *p;
 
-	if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP))
+	if (!cc_platform_has(CC_ATTR_HOST_SEV_SNP))
 		return alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO);
 
 	/*
diff --git a/arch/x86/virt/svm/sev.c b/arch/x86/virt/svm/sev.c
index cffe1157a90a..ab0e8448bb6e 100644
--- a/arch/x86/virt/svm/sev.c
+++ b/arch/x86/virt/svm/sev.c
@@ -77,7 +77,7 @@ static int __mfd_enable(unsigned int cpu)
 {
 	u64 val;
 
-	if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP))
+	if (!cc_platform_has(CC_ATTR_HOST_SEV_SNP))
 		return 0;
 
 	rdmsrl(MSR_AMD64_SYSCFG, val);
@@ -98,7 +98,7 @@ static int __snp_enable(unsigned int cpu)
 {
 	u64 val;
 
-	if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP))
+	if (!cc_platform_has(CC_ATTR_HOST_SEV_SNP))
 		return 0;
 
 	rdmsrl(MSR_AMD64_SYSCFG, val);
@@ -174,11 +174,11 @@ static int __init snp_rmptable_init(void)
 	u64 rmptable_size;
 	u64 val;
 
-	if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP))
+	if (!cc_platform_has(CC_ATTR_HOST_SEV_SNP))
 		return 0;
 
 	if (!amd_iommu_snp_en)
-		return 0;
+		goto nosnp;
 
 	if (!probed_rmp_size)
 		goto nosnp;
@@ -225,7 +225,7 @@ static int __init snp_rmptable_init(void)
 	return 0;
 
 nosnp:
-	setup_clear_cpu_cap(X86_FEATURE_SEV_SNP);
+	cc_platform_clear(CC_ATTR_HOST_SEV_SNP);
 	return -ENOSYS;
 }
 
@@ -246,7 +246,7 @@ static struct rmpentry *__snp_lookup_rmpentry(u64 pfn, int *level)
 {
 	struct rmpentry *large_entry, *entry;
 
-	if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP))
+	if (!cc_platform_has(CC_ATTR_HOST_SEV_SNP))
 		return ERR_PTR(-ENODEV);
 
 	entry = get_rmpentry(pfn);
@@ -363,7 +363,7 @@ int psmash(u64 pfn)
 	unsigned long paddr = pfn << PAGE_SHIFT;
 	int ret;
 
-	if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP))
+	if (!cc_platform_has(CC_ATTR_HOST_SEV_SNP))
 		return -ENODEV;
 
 	if (!pfn_valid(pfn))
@@ -472,7 +472,7 @@ static int rmpupdate(u64 pfn, struct rmp_state *state)
 	unsigned long paddr = pfn << PAGE_SHIFT;
 	int ret, level;
 
-	if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP))
+	if (!cc_platform_has(CC_ATTR_HOST_SEV_SNP))
 		return -ENODEV;
 
 	level = RMP_TO_PG_LEVEL(state->pagesize);
@@ -558,3 +558,13 @@ void snp_leak_pages(u64 pfn, unsigned int npages)
 	spin_unlock(&snp_leaked_pages_list_lock);
 }
 EXPORT_SYMBOL_GPL(snp_leak_pages);
+
+void kdump_sev_callback(void)
+{
+	/*
+	 * Do wbinvd() on remote CPUs when SNP is enabled in order to
+	 * safely do SNP_SHUTDOWN on the local CPU.
+	 */
+	if (cc_platform_has(CC_ATTR_HOST_SEV_SNP))
+		wbinvd();
+}
diff --git a/drivers/crypto/ccp/sev-dev.c b/drivers/crypto/ccp/sev-dev.c
index f44efbb89c34..2102377f727b 100644
--- a/drivers/crypto/ccp/sev-dev.c
+++ b/drivers/crypto/ccp/sev-dev.c
@@ -1090,7 +1090,7 @@ static int __sev_snp_init_locked(int *error)
 	void *arg = &data;
 	int cmd, rc = 0;
 
-	if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP))
+	if (!cc_platform_has(CC_ATTR_HOST_SEV_SNP))
 		return -ENODEV;
 
 	sev = psp->sev_data;
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index e7a44929f0da..33228c1c8980 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -3228,7 +3228,7 @@ static bool __init detect_ivrs(void)
 static void iommu_snp_enable(void)
 {
 #ifdef CONFIG_KVM_AMD_SEV
-	if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP))
+	if (!cc_platform_has(CC_ATTR_HOST_SEV_SNP))
 		return;
 	/*
 	 * The SNP support requires that IOMMU must be enabled, and is
@@ -3236,12 +3236,14 @@ static void iommu_snp_enable(void)
 	 */
 	if (no_iommu || iommu_default_passthrough()) {
 		pr_err("SNP: IOMMU disabled or configured in passthrough mode, SNP cannot be supported.\n");
+		cc_platform_clear(CC_ATTR_HOST_SEV_SNP);
 		return;
 	}
 
 	amd_iommu_snp_en = check_feature(FEATURE_SNP);
 	if (!amd_iommu_snp_en) {
 		pr_err("SNP: IOMMU SNP feature not enabled, SNP cannot be supported.\n");
+		cc_platform_clear(CC_ATTR_HOST_SEV_SNP);
 		return;
 	}
 
-- 
2.25.1



* [PATCH v13 04/26] KVM: guest_memfd: Fix PTR_ERR() handling in __kvm_gmem_get_pfn()
  2024-04-18 19:41 [PATCH v13 00/26] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support Michael Roth
                   ` (2 preceding siblings ...)
  2024-04-18 19:41 ` [PATCH v13 03/26] [TEMP] x86/CPU/AMD: Track SNP host status with cc_platform_*() Michael Roth
@ 2024-04-18 19:41 ` Michael Roth
  2024-04-19 12:58   ` David Hildenbrand
  2024-04-18 19:41 ` [PATCH v13 05/26] KVM: SEV: Select KVM_GENERIC_PRIVATE_MEM when CONFIG_KVM_AMD_SEV=y Michael Roth
                   ` (22 subsequent siblings)
  26 siblings, 1 reply; 41+ messages in thread
From: Michael Roth @ 2024-04-18 19:41 UTC (permalink / raw)
  To: kvm
  Cc: linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta,
	liam.merwick

kvm_gmem_get_folio() may return an ERR_PTR() rather than just NULL, in
particular -EEXIST when the FGP_CREAT_ONLY flag is used. Handle this
properly in __kvm_gmem_get_pfn().

Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 virt/kvm/guest_memfd.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
index ccf22e44f387..9d7c6a70c547 100644
--- a/virt/kvm/guest_memfd.c
+++ b/virt/kvm/guest_memfd.c
@@ -580,8 +580,8 @@ static int __kvm_gmem_get_pfn(struct file *file, struct kvm_memory_slot *slot,
 	}
 
 	folio = kvm_gmem_get_folio(file_inode(file), index, prepare);
-	if (!folio)
-		return -ENOMEM;
+	if (IS_ERR_OR_NULL(folio))
+		return folio ? PTR_ERR(folio) : -ENOMEM;
 
 	if (folio_test_hwpoison(folio)) {
 		r = -EHWPOISON;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread
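
The error-pointer handling described in patch 04 can be tried outside the
kernel. The following is a minimal user-space sketch, not the kernel code:
ERR_PTR(), PTR_ERR(), IS_ERR() and IS_ERR_OR_NULL() are simplified stand-ins
for the <linux/err.h> helpers, and fake_get_folio(), fake_get_pfn() and
enum lookup_mode are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified user-space stand-ins for the kernel's <linux/err.h> helpers. */
#define MAX_ERRNO	4095

static inline void *ERR_PTR(long error)
{
	return (void *)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)ptr;
}

static inline bool IS_ERR(const void *ptr)
{
	/* Error values occupy the top MAX_ERRNO addresses. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static inline bool IS_ERR_OR_NULL(const void *ptr)
{
	return !ptr || IS_ERR(ptr);
}

/* Hypothetical lookup outcomes, for illustration only. */
enum lookup_mode { LOOKUP_OK, LOOKUP_NULL, LOOKUP_EXISTS };

static int dummy_folio;

/* Hypothetical lookup that returns a valid pointer, NULL, or an error pointer. */
static void *fake_get_folio(enum lookup_mode mode)
{
	switch (mode) {
	case LOOKUP_OK:
		return &dummy_folio;
	case LOOKUP_EXISTS:
		return ERR_PTR(-EEXIST);
	default:
		return NULL;
	}
}

/* Same shape as the patched check: propagate the encoded errno, else -ENOMEM. */
static int fake_get_pfn(enum lookup_mode mode)
{
	void *folio = fake_get_folio(mode);

	if (IS_ERR_OR_NULL(folio))
		return folio ? (int)PTR_ERR(folio) : -ENOMEM;

	return 0;
}
```

With the old `if (!folio)` test, the LOOKUP_EXISTS case would fall through and
the error pointer would be used as if it were a valid folio; with the combined
check, the caller sees -EEXIST instead.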

* [PATCH v13 05/26] KVM: SEV: Select KVM_GENERIC_PRIVATE_MEM when CONFIG_KVM_AMD_SEV=y
  2024-04-18 19:41 [PATCH v13 00/26] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support Michael Roth
                   ` (3 preceding siblings ...)
  2024-04-18 19:41 ` [PATCH v13 04/26] KVM: guest_memfd: Fix PTR_ERR() handling in __kvm_gmem_get_pfn() Michael Roth
@ 2024-04-18 19:41 ` Michael Roth
  2024-04-18 19:41 ` [PATCH v13 06/26] KVM: SEV: Add support to handle AP reset MSR protocol Michael Roth
                   ` (21 subsequent siblings)
  26 siblings, 0 replies; 41+ messages in thread
From: Michael Roth @ 2024-04-18 19:41 UTC (permalink / raw)
  To: kvm
  Cc: linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta,
	liam.merwick

SEV-SNP relies on private memory support to run guests, so make sure to
enable that support via the CONFIG_KVM_GENERIC_PRIVATE_MEM config
option.

Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 arch/x86/kvm/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index d64fb2b3eb69..5e72faca4e8f 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -136,6 +136,7 @@ config KVM_AMD_SEV
 	depends on KVM_AMD && X86_64
 	depends on CRYPTO_DEV_SP_PSP && !(KVM_AMD=y && CRYPTO_DEV_CCP_DD=m)
 	select ARCH_HAS_CC_PLATFORM
+	select KVM_GENERIC_PRIVATE_MEM
 	help
 	  Provides support for launching Encrypted VMs (SEV) and Encrypted VMs
 	  with Encrypted State (SEV-ES) on AMD processors.
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v13 06/26] KVM: SEV: Add support to handle AP reset MSR protocol
  2024-04-18 19:41 [PATCH v13 00/26] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support Michael Roth
                   ` (4 preceding siblings ...)
  2024-04-18 19:41 ` [PATCH v13 05/26] KVM: SEV: Select KVM_GENERIC_PRIVATE_MEM when CONFIG_KVM_AMD_SEV=y Michael Roth
@ 2024-04-18 19:41 ` Michael Roth
  2024-04-18 19:41 ` [PATCH v13 07/26] KVM: SEV: Add GHCB handling for Hypervisor Feature Support requests Michael Roth
                   ` (20 subsequent siblings)
  26 siblings, 0 replies; 41+ messages in thread
From: Michael Roth @ 2024-04-18 19:41 UTC (permalink / raw)
  To: kvm
  Cc: linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta,
	liam.merwick, Brijesh Singh

From: Tom Lendacky <thomas.lendacky@amd.com>

Add support for AP Reset Hold being invoked using the GHCB MSR protocol,
available in version 2 of the GHCB specification.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 arch/x86/include/asm/sev-common.h |  6 ++--
 arch/x86/kvm/svm/sev.c            | 56 ++++++++++++++++++++++++++-----
 arch/x86/kvm/svm/svm.h            |  1 +
 3 files changed, 53 insertions(+), 10 deletions(-)

diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h
index b463fcbd4b90..01261f7054ad 100644
--- a/arch/x86/include/asm/sev-common.h
+++ b/arch/x86/include/asm/sev-common.h
@@ -54,8 +54,10 @@
 	(((unsigned long)fn) << 32))
 
 /* AP Reset Hold */
-#define GHCB_MSR_AP_RESET_HOLD_REQ	0x006
-#define GHCB_MSR_AP_RESET_HOLD_RESP	0x007
+#define GHCB_MSR_AP_RESET_HOLD_REQ		0x006
+#define GHCB_MSR_AP_RESET_HOLD_RESP		0x007
+#define GHCB_MSR_AP_RESET_HOLD_RESULT_POS	12
+#define GHCB_MSR_AP_RESET_HOLD_RESULT_MASK	GENMASK_ULL(51, 0)
 
 /* GHCB GPA Register */
 #define GHCB_MSR_REG_GPA_REQ		0x012
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 598d78b4107f..6e31cb408dd8 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -49,6 +49,10 @@ static bool sev_es_debug_swap_enabled = true;
 module_param_named(debug_swap, sev_es_debug_swap_enabled, bool, 0444);
 static u64 sev_supported_vmsa_features;
 
+#define AP_RESET_HOLD_NONE		0
+#define AP_RESET_HOLD_NAE_EVENT		1
+#define AP_RESET_HOLD_MSR_PROTO		2
+
 static u8 sev_enc_bit;
 static DECLARE_RWSEM(sev_deactivate_lock);
 static DEFINE_MUTEX(sev_bitmap_lock);
@@ -2727,6 +2731,9 @@ static int sev_es_validate_vmgexit(struct vcpu_svm *svm)
 
 void sev_es_unmap_ghcb(struct vcpu_svm *svm)
 {
+	/* Clear any indication that the vCPU is in a type of AP Reset Hold */
+	svm->sev_es.ap_reset_hold_type = AP_RESET_HOLD_NONE;
+
 	if (!svm->sev_es.ghcb)
 		return;
 
@@ -2938,6 +2945,22 @@ static int sev_handle_vmgexit_msr_protocol(struct vcpu_svm *svm)
 				  GHCB_MSR_INFO_POS);
 		break;
 	}
+	case GHCB_MSR_AP_RESET_HOLD_REQ:
+		svm->sev_es.ap_reset_hold_type = AP_RESET_HOLD_MSR_PROTO;
+		ret = kvm_emulate_ap_reset_hold(&svm->vcpu);
+
+		/*
+		 * Preset the result to a non-SIPI return and then only set
+		 * the result to non-zero when delivering a SIPI.
+		 */
+		set_ghcb_msr_bits(svm, 0,
+				  GHCB_MSR_AP_RESET_HOLD_RESULT_MASK,
+				  GHCB_MSR_AP_RESET_HOLD_RESULT_POS);
+
+		set_ghcb_msr_bits(svm, GHCB_MSR_AP_RESET_HOLD_RESP,
+				  GHCB_MSR_INFO_MASK,
+				  GHCB_MSR_INFO_POS);
+		break;
 	case GHCB_MSR_TERM_REQ: {
 		u64 reason_set, reason_code;
 
@@ -3037,6 +3060,7 @@ int sev_handle_vmgexit(struct kvm_vcpu *vcpu)
 		ret = 1;
 		break;
 	case SVM_VMGEXIT_AP_HLT_LOOP:
+		svm->sev_es.ap_reset_hold_type = AP_RESET_HOLD_NAE_EVENT;
 		ret = kvm_emulate_ap_reset_hold(vcpu);
 		break;
 	case SVM_VMGEXIT_AP_JUMP_TABLE: {
@@ -3280,15 +3304,31 @@ void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector)
 		return;
 	}
 
-	/*
-	 * Subsequent SIPI: Return from an AP Reset Hold VMGEXIT, where
-	 * the guest will set the CS and RIP. Set SW_EXIT_INFO_2 to a
-	 * non-zero value.
-	 */
-	if (!svm->sev_es.ghcb)
-		return;
+	/* Subsequent SIPI */
+	switch (svm->sev_es.ap_reset_hold_type) {
+	case AP_RESET_HOLD_NAE_EVENT:
+		/*
+		 * Return from an AP Reset Hold VMGEXIT, where the guest will
+		 * set the CS and RIP. Set SW_EXIT_INFO_2 to a non-zero value.
+		 */
+		ghcb_set_sw_exit_info_2(svm->sev_es.ghcb, 1);
+		break;
+	case AP_RESET_HOLD_MSR_PROTO:
+		/*
+		 * Return from an AP Reset Hold VMGEXIT, where the guest will
+		 * set the CS and RIP. Set GHCB data field to a non-zero value.
+		 */
+		set_ghcb_msr_bits(svm, 1,
+				  GHCB_MSR_AP_RESET_HOLD_RESULT_MASK,
+				  GHCB_MSR_AP_RESET_HOLD_RESULT_POS);
 
-	ghcb_set_sw_exit_info_2(svm->sev_es.ghcb, 1);
+		set_ghcb_msr_bits(svm, GHCB_MSR_AP_RESET_HOLD_RESP,
+				  GHCB_MSR_INFO_MASK,
+				  GHCB_MSR_INFO_POS);
+		break;
+	default:
+		break;
+	}
 }
 
 struct page *snp_safe_alloc_page(struct kvm_vcpu *vcpu)
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 323901782547..6fd0f5862681 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -199,6 +199,7 @@ struct vcpu_sev_es_state {
 	u8 valid_bitmap[16];
 	struct kvm_host_map ghcb_map;
 	bool received_first_sipi;
+	unsigned int ap_reset_hold_type;
 
 	/* SEV-ES scratch area support */
 	u64 sw_scratch;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread
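
For readers following the bit layout used by patch 06, here is a small
user-space sketch of how the AP Reset Hold response value is composed. It is
an illustration under assumptions, not the KVM implementation:
set_msr_bits() and ap_reset_hold_response() are hypothetical helpers that
operate on a plain 64-bit value rather than on the VMCB, and the
GHCB_MSR_INFO_MASK/GHCB_MSR_INFO_POS values (bits 11:0) are assumed, since
they are not shown in this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Local re-definition of the kernel macro so the sketch builds standalone. */
#define GENMASK_ULL(h, l) \
	(((~0ULL) >> (63 - (h))) & ((~0ULL) << (l)))

/* Assumed: the response code lives in GHCBData[11:0]. */
#define GHCB_MSR_INFO_POS			0
#define GHCB_MSR_INFO_MASK			GENMASK_ULL(11, 0)

/* Values taken from the sev-common.h hunk in this patch. */
#define GHCB_MSR_AP_RESET_HOLD_REQ		0x006
#define GHCB_MSR_AP_RESET_HOLD_RESP		0x007
#define GHCB_MSR_AP_RESET_HOLD_RESULT_POS	12
#define GHCB_MSR_AP_RESET_HOLD_RESULT_MASK	GENMASK_ULL(51, 0)

/* Hypothetical helper: replace the field (mask << pos) of msr with value. */
static uint64_t set_msr_bits(uint64_t msr, uint64_t value, uint64_t mask,
			     unsigned int pos)
{
	msr &= ~(mask << pos);
	msr |= (value & mask) << pos;
	return msr;
}

/*
 * Hypothetical helper: build the AP Reset Hold response. The result field
 * is zero unless a SIPI was delivered, in which case it is non-zero.
 */
static uint64_t ap_reset_hold_response(uint64_t msr, int sipi_delivered)
{
	msr = set_msr_bits(msr, sipi_delivered ? 1 : 0,
			   GHCB_MSR_AP_RESET_HOLD_RESULT_MASK,
			   GHCB_MSR_AP_RESET_HOLD_RESULT_POS);
	msr = set_msr_bits(msr, GHCB_MSR_AP_RESET_HOLD_RESP,
			   GHCB_MSR_INFO_MASK, GHCB_MSR_INFO_POS);
	return msr;
}
```

In this sketch a guest would read back 0x007 in the low 12 bits in both
cases and distinguish a SIPI wake-up from any other wake-up by whether bits
63:12 are non-zero.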

* [PATCH v13 07/26] KVM: SEV: Add GHCB handling for Hypervisor Feature Support requests
  2024-04-18 19:41 [PATCH v13 00/26] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support Michael Roth
                   ` (5 preceding siblings ...)
  2024-04-18 19:41 ` [PATCH v13 06/26] KVM: SEV: Add support to handle AP reset MSR protocol Michael Roth
@ 2024-04-18 19:41 ` Michael Roth
  2024-04-18 19:41 ` [PATCH v13 08/26] KVM: SEV: Add initial SEV-SNP support Michael Roth
                   ` (19 subsequent siblings)
  26 siblings, 0 replies; 41+ messages in thread
From: Michael Roth @ 2024-04-18 19:41 UTC (permalink / raw)
  To: kvm
  Cc: linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta,
	liam.merwick, Brijesh Singh

From: Brijesh Singh <brijesh.singh@amd.com>

Version 2 of the GHCB specification introduced advertisement of features
that are supported by the Hypervisor.

Now that KVM supports version 2 of the GHCB specification, bump the
maximum supported protocol version.

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 arch/x86/include/asm/sev-common.h |  2 ++
 arch/x86/kvm/svm/sev.c            | 16 +++++++++++++++-
 2 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h
index 01261f7054ad..5a8246dd532f 100644
--- a/arch/x86/include/asm/sev-common.h
+++ b/arch/x86/include/asm/sev-common.h
@@ -101,6 +101,8 @@ enum psc_op {
 /* GHCB Hypervisor Feature Request/Response */
 #define GHCB_MSR_HV_FT_REQ		0x080
 #define GHCB_MSR_HV_FT_RESP		0x081
+#define GHCB_MSR_HV_FT_POS		12
+#define GHCB_MSR_HV_FT_MASK		GENMASK_ULL(51, 0)
 #define GHCB_MSR_HV_FT_RESP_VAL(v)			\
 	/* GHCBData[63:12] */				\
 	(((u64)(v) & GENMASK_ULL(63, 12)) >> 12)
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 6e31cb408dd8..1d2264e93afe 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -33,9 +33,11 @@
 #include "cpuid.h"
 #include "trace.h"
 
-#define GHCB_VERSION_MAX	1ULL
+#define GHCB_VERSION_MAX	2ULL
 #define GHCB_VERSION_MIN	1ULL
 
+#define GHCB_HV_FT_SUPPORTED	GHCB_HV_FT_SNP
+
 /* enable/disable SEV support */
 static bool sev_enabled = true;
 module_param_named(sev, sev_enabled, bool, 0444);
@@ -2701,6 +2703,7 @@ static int sev_es_validate_vmgexit(struct vcpu_svm *svm)
 	case SVM_VMGEXIT_AP_HLT_LOOP:
 	case SVM_VMGEXIT_AP_JUMP_TABLE:
 	case SVM_VMGEXIT_UNSUPPORTED_EVENT:
+	case SVM_VMGEXIT_HV_FEATURES:
 		break;
 	default:
 		reason = GHCB_ERR_INVALID_EVENT;
@@ -2961,6 +2964,12 @@ static int sev_handle_vmgexit_msr_protocol(struct vcpu_svm *svm)
 				  GHCB_MSR_INFO_MASK,
 				  GHCB_MSR_INFO_POS);
 		break;
+	case GHCB_MSR_HV_FT_REQ:
+		set_ghcb_msr_bits(svm, GHCB_HV_FT_SUPPORTED,
+				  GHCB_MSR_HV_FT_MASK, GHCB_MSR_HV_FT_POS);
+		set_ghcb_msr_bits(svm, GHCB_MSR_HV_FT_RESP,
+				  GHCB_MSR_INFO_MASK, GHCB_MSR_INFO_POS);
+		break;
 	case GHCB_MSR_TERM_REQ: {
 		u64 reason_set, reason_code;
 
@@ -3085,6 +3094,11 @@ int sev_handle_vmgexit(struct kvm_vcpu *vcpu)
 		ret = 1;
 		break;
 	}
+	case SVM_VMGEXIT_HV_FEATURES:
+		ghcb_set_sw_exit_info_2(svm->sev_es.ghcb, GHCB_HV_FT_SUPPORTED);
+
+		ret = 1;
+		break;
 	case SVM_VMGEXIT_UNSUPPORTED_EVENT:
 		vcpu_unimpl(vcpu,
 			    "vmgexit: unsupported event - exit_info_1=%#llx, exit_info_2=%#llx\n",
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread
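
The Hypervisor Feature response in patch 07 uses the same layout: the
feature bitmap in GHCBData[63:12] and the response code in the low 12 bits.
The sketch below shows one way to encode and decode it in user space. It is
only an illustration: hv_features_response() and guest_sees_snp() are
hypothetical helpers, and the values of GHCB_HV_FT_SNP (bit 0) and
GHCB_MSR_INFO_MASK (bits 11:0) are assumptions, since neither definition
appears in this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Local re-definition of the kernel macro so the sketch builds standalone. */
#define GENMASK_ULL(h, l) \
	(((~0ULL) >> (63 - (h))) & ((~0ULL) << (l)))

/* Assumed values, not shown in the patch. */
#define GHCB_MSR_INFO_MASK	GENMASK_ULL(11, 0)
#define GHCB_HV_FT_SNP		(1ULL << 0)

/* Values taken from the sev-common.h hunk in this patch. */
#define GHCB_MSR_HV_FT_REQ	0x080
#define GHCB_MSR_HV_FT_RESP	0x081
#define GHCB_MSR_HV_FT_POS	12
#define GHCB_MSR_HV_FT_MASK	GENMASK_ULL(51, 0)
#define GHCB_MSR_HV_FT_RESP_VAL(v) \
	(((uint64_t)(v) & GENMASK_ULL(63, 12)) >> 12)

/* Hypothetical hypervisor side: pack the feature bitmap and response code. */
static uint64_t hv_features_response(uint64_t features)
{
	return ((features & GHCB_MSR_HV_FT_MASK) << GHCB_MSR_HV_FT_POS) |
	       GHCB_MSR_HV_FT_RESP;
}

/* Hypothetical guest side: check the response code, then test the SNP bit. */
static int guest_sees_snp(uint64_t msr)
{
	if ((msr & GHCB_MSR_INFO_MASK) != GHCB_MSR_HV_FT_RESP)
		return 0;

	return (GHCB_MSR_HV_FT_RESP_VAL(msr) & GHCB_HV_FT_SNP) != 0;
}
```

Checking the response code before reading the bitmap keeps a guest from
misreading an unrelated MSR-protocol response as a feature list.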

* [PATCH v13 08/26] KVM: SEV: Add initial SEV-SNP support
  2024-04-18 19:41 [PATCH v13 00/26] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support Michael Roth
                   ` (6 preceding siblings ...)
  2024-04-18 19:41 ` [PATCH v13 07/26] KVM: SEV: Add GHCB handling for Hypervisor Feature Support requests Michael Roth
@ 2024-04-18 19:41 ` Michael Roth
  2024-04-19 11:58   ` Paolo Bonzini
  2024-04-18 19:41 ` [PATCH v13 09/26] KVM: SEV: Add KVM_SEV_SNP_LAUNCH_START command Michael Roth
                   ` (18 subsequent siblings)
  26 siblings, 1 reply; 41+ messages in thread
From: Michael Roth @ 2024-04-18 19:41 UTC (permalink / raw)
  To: kvm
  Cc: linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta,
	liam.merwick, Brijesh Singh

SEV-SNP builds upon existing SEV and SEV-ES functionality while adding
new hardware-based security protection. SEV-SNP adds strong memory
encryption and integrity protection to help prevent malicious
hypervisor-based attacks such as data replay, memory re-mapping, and
more, to create an isolated execution environment.

Define a new KVM_X86_SNP_VM type which makes use of these capabilities
and extend the KVM_SEV_INIT2 ioctl to support it. Also add a basic
helper to check whether SNP is enabled.

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
[mdr: commit fixups, use similar ASID reporting as with SEV/SEV-ES]
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 arch/x86/include/asm/svm.h      |  3 ++-
 arch/x86/include/uapi/asm/kvm.h |  1 +
 arch/x86/kvm/svm/sev.c          | 21 ++++++++++++++++++++-
 arch/x86/kvm/svm/svm.c          |  3 ++-
 arch/x86/kvm/svm/svm.h          | 12 ++++++++++++
 arch/x86/kvm/x86.c              |  2 +-
 6 files changed, 38 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
index 728c98175b9c..544a43c1cf11 100644
--- a/arch/x86/include/asm/svm.h
+++ b/arch/x86/include/asm/svm.h
@@ -285,7 +285,8 @@ static_assert((X2AVIC_MAX_PHYSICAL_ID & AVIC_PHYSICAL_MAX_INDEX_MASK) == X2AVIC_
 
 #define AVIC_HPA_MASK	~((0xFFFULL << 52) | 0xFFF)
 
-#define SVM_SEV_FEAT_DEBUG_SWAP                        BIT(5)
+#define SVM_SEV_FEAT_SNP_ACTIVE				BIT(0)
+#define SVM_SEV_FEAT_DEBUG_SWAP				BIT(5)
 
 struct vmcb_seg {
 	u16 selector;
diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
index 72ad5ace118d..9a8b81d20314 100644
--- a/arch/x86/include/uapi/asm/kvm.h
+++ b/arch/x86/include/uapi/asm/kvm.h
@@ -872,5 +872,6 @@ struct kvm_hyperv_eventfd {
 #define KVM_X86_SW_PROTECTED_VM	1
 #define KVM_X86_SEV_VM		2
 #define KVM_X86_SEV_ES_VM	3
+#define KVM_X86_SNP_VM		4
 
 #endif /* _ASM_X86_KVM_H */
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 1d2264e93afe..c41cc73a1efe 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -46,6 +46,9 @@ module_param_named(sev, sev_enabled, bool, 0444);
 static bool sev_es_enabled = true;
 module_param_named(sev_es, sev_es_enabled, bool, 0444);
 
+/* enable/disable SEV-SNP support */
+static bool sev_snp_enabled;
+
 /* enable/disable SEV-ES DebugSwap support */
 static bool sev_es_debug_swap_enabled = true;
 module_param_named(debug_swap, sev_es_debug_swap_enabled, bool, 0444);
@@ -275,6 +278,9 @@ static int __sev_guest_init(struct kvm *kvm, struct kvm_sev_cmd *argp,
 	sev->es_active = es_active;
 	sev->vmsa_features = data->vmsa_features;
 
+	if (vm_type == KVM_X86_SNP_VM)
+		sev->vmsa_features |= SVM_SEV_FEAT_SNP_ACTIVE;
+
 	ret = sev_asid_new(sev);
 	if (ret)
 		goto e_no_asid;
@@ -326,7 +332,8 @@ static int sev_guest_init2(struct kvm *kvm, struct kvm_sev_cmd *argp)
 		return -EINVAL;
 
 	if (kvm->arch.vm_type != KVM_X86_SEV_VM &&
-	    kvm->arch.vm_type != KVM_X86_SEV_ES_VM)
+	    kvm->arch.vm_type != KVM_X86_SEV_ES_VM &&
+	    kvm->arch.vm_type != KVM_X86_SNP_VM)
 		return -EINVAL;
 
 	if (copy_from_user(&data, u64_to_user_ptr(argp->data), sizeof(data)))
@@ -2306,11 +2313,16 @@ void __init sev_set_cpu_caps(void)
 		kvm_cpu_cap_set(X86_FEATURE_SEV_ES);
 		kvm_caps.supported_vm_types |= BIT(KVM_X86_SEV_ES_VM);
 	}
+	if (sev_snp_enabled) {
+		kvm_cpu_cap_set(X86_FEATURE_SEV_SNP);
+		kvm_caps.supported_vm_types |= BIT(KVM_X86_SNP_VM);
+	}
 }
 
 void __init sev_hardware_setup(void)
 {
 	unsigned int eax, ebx, ecx, edx, sev_asid_count, sev_es_asid_count;
+	bool sev_snp_supported = false;
 	bool sev_es_supported = false;
 	bool sev_supported = false;
 
@@ -2391,6 +2403,7 @@ void __init sev_hardware_setup(void)
 	sev_es_asid_count = min_sev_asid - 1;
 	WARN_ON_ONCE(misc_cg_set_capacity(MISC_CG_RES_SEV_ES, sev_es_asid_count));
 	sev_es_supported = true;
+	sev_snp_supported = sev_snp_enabled && cc_platform_has(CC_ATTR_HOST_SEV_SNP);
 
 out:
 	if (boot_cpu_has(X86_FEATURE_SEV))
@@ -2403,9 +2416,15 @@ void __init sev_hardware_setup(void)
 		pr_info("SEV-ES %s (ASIDs %u - %u)\n",
 			sev_es_supported ? "enabled" : "disabled",
 			min_sev_asid > 1 ? 1 : 0, min_sev_asid - 1);
+	if (boot_cpu_has(X86_FEATURE_SEV_SNP))
+		pr_info("SEV-SNP %s (ASIDs %u - %u)\n",
+			sev_snp_supported ? "enabled" : "disabled",
+			min_sev_asid > 1 ? 1 : 0, min_sev_asid - 1);
 
 	sev_enabled = sev_supported;
 	sev_es_enabled = sev_es_supported;
+	sev_snp_enabled = sev_snp_supported;
+
 	if (!sev_es_enabled || !cpu_feature_enabled(X86_FEATURE_DEBUG_SWAP) ||
 	    !cpu_feature_enabled(X86_FEATURE_NO_NESTED_DATA_BP))
 		sev_es_debug_swap_enabled = false;
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 535018f152a3..d31404953bf1 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4899,7 +4899,8 @@ static int svm_vm_init(struct kvm *kvm)
 
 	if (type != KVM_X86_DEFAULT_VM &&
 	    type != KVM_X86_SW_PROTECTED_VM) {
-		kvm->arch.has_protected_state = (type == KVM_X86_SEV_ES_VM);
+		kvm->arch.has_protected_state =
+			(type == KVM_X86_SEV_ES_VM || type == KVM_X86_SNP_VM);
 		to_kvm_sev_info(kvm)->need_init = true;
 	}
 
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 6fd0f5862681..7f2e9c7fc4ca 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -348,6 +348,18 @@ static __always_inline bool sev_es_guest(struct kvm *kvm)
 #endif
 }
 
+static __always_inline bool sev_snp_guest(struct kvm *kvm)
+{
+#ifdef CONFIG_KVM_AMD_SEV
+	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
+
+	return (sev->vmsa_features & SVM_SEV_FEAT_SNP_ACTIVE) &&
+	       !WARN_ON_ONCE(!sev_es_guest(kvm));
+#else
+	return false;
+#endif
+}
+
 static inline void vmcb_mark_all_dirty(struct vmcb *vmcb)
 {
 	vmcb->control.clean = 0;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 83b8260443a3..9923921904a2 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -12598,7 +12598,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
 
 	kvm->arch.vm_type = type;
 	kvm->arch.has_private_mem =
-		(type == KVM_X86_SW_PROTECTED_VM);
+		(type == KVM_X86_SW_PROTECTED_VM || type == KVM_X86_SNP_VM);
 
 	ret = kvm_page_track_init(kvm);
 	if (ret)
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread
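
To summarize how patch 08 maps VM types to per-VM properties, the following
user-space sketch restates the conditions from the svm_vm_init() and
kvm_arch_init_vm() hunks as one function. It is a simplified model rather
than kernel code: struct vm_props, vm_props_for_type() and
vm_type_supported() are hypothetical names, and KVM_X86_DEFAULT_VM being 0
is an assumption (only values 1 through 4 appear in the uapi hunk).

```c
#include <assert.h>
#include <stdbool.h>

/* Values 1-4 are from the uapi hunk; 0 for the default type is assumed. */
#define KVM_X86_DEFAULT_VM		0
#define KVM_X86_SW_PROTECTED_VM		1
#define KVM_X86_SEV_VM			2
#define KVM_X86_SEV_ES_VM		3
#define KVM_X86_SNP_VM			4

/* Hypothetical container for the flags the patch sets on struct kvm. */
struct vm_props {
	bool has_protected_state;	/* register state hidden from the host */
	bool has_private_mem;		/* guest uses private (guest_memfd) memory */
	bool need_init;			/* KVM_SEV_INIT2 must be issued first */
};

/* Hypothetical helper combining the two checks touched by the patch. */
static struct vm_props vm_props_for_type(unsigned long type)
{
	struct vm_props p = { false, false, false };

	p.has_private_mem = (type == KVM_X86_SW_PROTECTED_VM ||
			     type == KVM_X86_SNP_VM);

	if (type != KVM_X86_DEFAULT_VM && type != KVM_X86_SW_PROTECTED_VM) {
		p.has_protected_state = (type == KVM_X86_SEV_ES_VM ||
					 type == KVM_X86_SNP_VM);
		p.need_init = true;
	}

	return p;
}

/* Hypothetical helper: test one VM type against a supported-types bitmask. */
static bool vm_type_supported(unsigned long supported_mask, unsigned long type)
{
	if (type >= 8 * sizeof(supported_mask))
		return false;

	return (supported_mask >> type) & 1UL;
}
```

In this model KVM_X86_SNP_VM is the only type that sets all three flags,
which matches the intent that SNP builds on SEV-ES while also requiring
private memory.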

* [PATCH v13 09/26] KVM: SEV: Add KVM_SEV_SNP_LAUNCH_START command
  2024-04-18 19:41 [PATCH v13 00/26] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support Michael Roth
                   ` (7 preceding siblings ...)
  2024-04-18 19:41 ` [PATCH v13 08/26] KVM: SEV: Add initial SEV-SNP support Michael Roth
@ 2024-04-18 19:41 ` Michael Roth
  2024-04-19 11:52   ` Paolo Bonzini
  2024-04-24 21:40   ` Sean Christopherson
  2024-04-18 19:41 ` [PATCH v13 10/26] KVM: SEV: Add KVM_SEV_SNP_LAUNCH_UPDATE command Michael Roth
                   ` (17 subsequent siblings)
  26 siblings, 2 replies; 41+ messages in thread
From: Michael Roth @ 2024-04-18 19:41 UTC (permalink / raw)
  To: kvm
  Cc: linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta,
	liam.merwick, Brijesh Singh

From: Brijesh Singh <brijesh.singh@amd.com>

KVM_SEV_SNP_LAUNCH_START begins the launch process for an SEV-SNP guest.
The command initializes a cryptographic digest context used to construct
the measurement of the guest. Other commands can then be used to
load/encrypt data into the guest's initial launch image.

For more information see the SEV-SNP specification.

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Co-developed-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
---
 .../virt/kvm/x86/amd-memory-encryption.rst    |  23 +-
 arch/x86/include/uapi/asm/kvm.h               |   8 +
 arch/x86/kvm/svm/sev.c                        | 208 +++++++++++++++++-
 arch/x86/kvm/svm/svm.h                        |   1 +
 4 files changed, 236 insertions(+), 4 deletions(-)

diff --git a/Documentation/virt/kvm/x86/amd-memory-encryption.rst b/Documentation/virt/kvm/x86/amd-memory-encryption.rst
index 3381556d596d..1b042f827eab 100644
--- a/Documentation/virt/kvm/x86/amd-memory-encryption.rst
+++ b/Documentation/virt/kvm/x86/amd-memory-encryption.rst
@@ -459,6 +459,25 @@ issued by the hypervisor to make the guest ready for execution.
 
 Returns: 0 on success, -negative on error
 
+18. KVM_SEV_SNP_LAUNCH_START
+----------------------------
+
+The KVM_SEV_SNP_LAUNCH_START command is used for creating the memory encryption
+context for the SEV-SNP guest.
+
+Parameters (in): struct  kvm_sev_snp_launch_start
+
+Returns: 0 on success, -negative on error
+
+::
+
+        struct kvm_sev_snp_launch_start {
+                __u64 policy;           /* Guest policy to use. */
+                __u8 gosvw[16];         /* Guest OS visible workarounds. */
+        };
+
+See the SEV-SNP spec [snp-fw-abi]_ for further detail on the launch input.
+
 Device attribute API
 ====================
 
@@ -490,9 +509,11 @@ References
 ==========
 
 
-See [white-paper]_, [api-spec]_, [amd-apm]_ and [kvm-forum]_ for more info.
+See [white-paper]_, [api-spec]_, [amd-apm]_, [kvm-forum]_, and [snp-fw-abi]_
+for more info.
 
 .. [white-paper] https://developer.amd.com/wordpress/media/2013/12/AMD_Memory_Encryption_Whitepaper_v7-Public.pdf
 .. [api-spec] https://support.amd.com/TechDocs/55766_SEV-KM_API_Specification.pdf
 .. [amd-apm] https://support.amd.com/TechDocs/24593.pdf (section 15.34)
 .. [kvm-forum]  https://www.linux-kvm.org/images/7/74/02x08A-Thomas_Lendacky-AMDs_Virtualizatoin_Memory_Encryption_Technology.pdf
+.. [snp-fw-abi] https://www.amd.com/system/files/TechDocs/56860.pdf
diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
index 9a8b81d20314..bdf8c5461a36 100644
--- a/arch/x86/include/uapi/asm/kvm.h
+++ b/arch/x86/include/uapi/asm/kvm.h
@@ -697,6 +697,9 @@ enum sev_cmd_id {
 	/* Second time is the charm; improved versions of the above ioctls.  */
 	KVM_SEV_INIT2,
 
+	/* SNP-specific commands */
+	KVM_SEV_SNP_LAUNCH_START = 100,
+
 	KVM_SEV_NR_MAX,
 };
 
@@ -822,6 +825,11 @@ struct kvm_sev_receive_update_data {
 	__u32 pad2;
 };
 
+struct kvm_sev_snp_launch_start {
+	__u64 policy;
+	__u8 gosvw[16];
+};
+
 #define KVM_X2APIC_API_USE_32BIT_IDS            (1ULL << 0)
 #define KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK  (1ULL << 1)
 
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index c41cc73a1efe..4c5abc0e7806 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -25,6 +25,7 @@
 #include <asm/fpu/xcr.h>
 #include <asm/fpu/xstate.h>
 #include <asm/debugreg.h>
+#include <asm/sev.h>
 
 #include "mmu.h"
 #include "x86.h"
@@ -58,6 +59,25 @@ static u64 sev_supported_vmsa_features;
 #define AP_RESET_HOLD_NAE_EVENT		1
 #define AP_RESET_HOLD_MSR_PROTO		2
 
+/* As defined by SEV-SNP Firmware ABI, under "Guest Policy". */
+#define SNP_POLICY_MASK_SMT		BIT_ULL(16)
+#define SNP_POLICY_MASK_RSVD_MBO	BIT_ULL(17)
+#define SNP_POLICY_MASK_DEBUG		BIT_ULL(19)
+#define SNP_POLICY_MASK_SINGLE_SOCKET	BIT_ULL(20)
+#define SNP_POLICY_MASK_API_MAJOR	GENMASK_ULL(15, 8)
+#define SNP_POLICY_MASK_API_MINOR	GENMASK_ULL(7, 0)
+
+#define SNP_POLICY_MASK_VALID		(SNP_POLICY_MASK_SMT		| \
+					 SNP_POLICY_MASK_RSVD_MBO	| \
+					 SNP_POLICY_MASK_DEBUG		| \
+					 SNP_POLICY_MASK_SINGLE_SOCKET	| \
+					 SNP_POLICY_MASK_API_MAJOR	| \
+					 SNP_POLICY_MASK_API_MINOR)
+
+/* KVM's SNP support is compatible with 1.51 of the SEV-SNP Firmware ABI. */
+#define SNP_POLICY_API_MAJOR		1
+#define SNP_POLICY_API_MINOR		51
+
 static u8 sev_enc_bit;
 static DECLARE_RWSEM(sev_deactivate_lock);
 static DEFINE_MUTEX(sev_bitmap_lock);
@@ -68,6 +88,8 @@ static unsigned int nr_asids;
 static unsigned long *sev_asid_bitmap;
 static unsigned long *sev_reclaim_asid_bitmap;
 
+static int snp_decommission_context(struct kvm *kvm);
+
 struct enc_region {
 	struct list_head list;
 	unsigned long npages;
@@ -94,12 +116,17 @@ static int sev_flush_asids(unsigned int min_asid, unsigned int max_asid)
 	down_write(&sev_deactivate_lock);
 
 	wbinvd_on_all_cpus();
-	ret = sev_guest_df_flush(&error);
+
+	if (sev_snp_enabled)
+		ret = sev_do_cmd(SEV_CMD_SNP_DF_FLUSH, NULL, &error);
+	else
+		ret = sev_guest_df_flush(&error);
 
 	up_write(&sev_deactivate_lock);
 
 	if (ret)
-		pr_err("SEV: DF_FLUSH failed, ret=%d, error=%#x\n", ret, error);
+		pr_err("SEV%s: DF_FLUSH failed, ret=%d, error=%#x\n",
+		       sev_snp_enabled ? "-SNP" : "", ret, error);
 
 	return ret;
 }
@@ -1976,6 +2003,134 @@ int sev_dev_get_attr(u32 group, u64 attr, u64 *val)
 	}
 }
 
+/*
+ * The guest context contains all the information, keys and metadata
+ * associated with the guest that the firmware tracks to implement SEV
+ * and SNP features. The firmware stores the guest context in a
+ * hypervisor-provided page via the SNP_GCTX_CREATE command.
+ */
+static void *snp_context_create(struct kvm *kvm, struct kvm_sev_cmd *argp)
+{
+	struct sev_data_snp_addr data = {};
+	void *context;
+	int rc;
+
+	/* Allocate memory for context page */
+	context = snp_alloc_firmware_page(GFP_KERNEL_ACCOUNT);
+	if (!context)
+		return NULL;
+
+	data.address = __psp_pa(context);
+	rc = __sev_issue_cmd(argp->sev_fd, SEV_CMD_SNP_GCTX_CREATE, &data, &argp->error);
+	if (rc) {
+		pr_warn("Failed to create SEV-SNP context, rc %d fw_error %d",
+			rc, argp->error);
+		snp_free_firmware_page(context);
+		return NULL;
+	}
+
+	return context;
+}
+
+static int snp_bind_asid(struct kvm *kvm, int *error)
+{
+	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
+	struct sev_data_snp_activate data = {0};
+
+	data.gctx_paddr = __psp_pa(sev->snp_context);
+	data.asid = sev_get_asid(kvm);
+	return sev_issue_cmd(kvm, SEV_CMD_SNP_ACTIVATE, &data, error);
+}
+
+static inline bool sev_version_greater_or_equal(u8 major, u8 minor)
+{
+	if (major < SNP_POLICY_API_MAJOR)
+		return true;
+
+	if (major == SNP_POLICY_API_MAJOR && minor <= SNP_POLICY_API_MINOR)
+		return true;
+
+	return false;
+}
+
+static int snp_launch_start(struct kvm *kvm, struct kvm_sev_cmd *argp)
+{
+	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
+	struct sev_data_snp_launch_start start = {0};
+	struct kvm_sev_snp_launch_start params;
+	u8 major, minor;
+	int rc;
+
+	if (!sev_snp_guest(kvm))
+		return -ENOTTY;
+
+	if (copy_from_user(&params, u64_to_user_ptr(argp->data), sizeof(params)))
+		return -EFAULT;
+
+	/* Don't allow userspace to allocate memory for more than 1 SNP context. */
+	if (sev->snp_context) {
+		pr_debug("SEV-SNP context already exists. Refusing to allocate an additional one.\n");
+		return -EINVAL;
+	}
+
+	sev->snp_context = snp_context_create(kvm, argp);
+	if (!sev->snp_context)
+		return -ENOTTY;
+
+	if (params.policy & ~SNP_POLICY_MASK_VALID) {
+		pr_debug("SEV-SNP hypervisor does not support requested policy %llx (supported %llx).\n",
+			 params.policy, SNP_POLICY_MASK_VALID);
+		return -EINVAL;
+	}
+
+	if (!(params.policy & SNP_POLICY_MASK_RSVD_MBO)) {
+		pr_debug("SEV-SNP hypervisor does not support requested policy %llx (must be set %llx).\n",
+			 params.policy, SNP_POLICY_MASK_RSVD_MBO);
+		return -EINVAL;
+	}
+
+	if (params.policy & SNP_POLICY_MASK_SINGLE_SOCKET) {
+		pr_debug("SEV-SNP hypervisor does not support limiting guests to a single socket.\n");
+		return -EINVAL;
+	}
+
+	if (!(params.policy & SNP_POLICY_MASK_SMT)) {
+		pr_debug("SEV-SNP hypervisor does not support limiting guests to a single SMT thread.\n");
+		return -EINVAL;
+	}
+
+	major = (params.policy & SNP_POLICY_MASK_API_MAJOR) >> 8;
+	minor = (params.policy & SNP_POLICY_MASK_API_MINOR);
+	if (!sev_version_greater_or_equal(major, minor)) {
+		pr_debug("SEV-SNP hypervisor does not support requested version %d.%d (have %d.%d).\n",
+			 major, minor, SNP_POLICY_API_MAJOR, SNP_POLICY_API_MINOR);
+		return -EINVAL;
+	}
+
+	start.gctx_paddr = __psp_pa(sev->snp_context);
+	start.policy = params.policy;
+	memcpy(start.gosvw, params.gosvw, sizeof(params.gosvw));
+	rc = __sev_issue_cmd(argp->sev_fd, SEV_CMD_SNP_LAUNCH_START, &start, &argp->error);
+	if (rc) {
+		pr_debug("SEV_CMD_SNP_LAUNCH_START firmware command failed, rc %d\n", rc);
+		goto e_free_context;
+	}
+
+	sev->fd = argp->sev_fd;
+	rc = snp_bind_asid(kvm, &argp->error);
+	if (rc) {
+		pr_debug("Failed to bind ASID to SEV-SNP context, rc %d\n", rc);
+		goto e_free_context;
+	}
+
+	return 0;
+
+e_free_context:
+	snp_decommission_context(kvm);
+
+	return rc;
+}
+
 int sev_mem_enc_ioctl(struct kvm *kvm, void __user *argp)
 {
 	struct kvm_sev_cmd sev_cmd;
@@ -1999,6 +2154,15 @@ int sev_mem_enc_ioctl(struct kvm *kvm, void __user *argp)
 		goto out;
 	}
 
+	/*
+	 * Once KVM_SEV_INIT2 initializes a KVM instance as an SNP guest, only
+	 * allow the use of SNP-specific commands.
+	 */
+	if (sev_snp_guest(kvm) && sev_cmd.id < KVM_SEV_SNP_LAUNCH_START) {
+		r = -EPERM;
+		goto out;
+	}
+
 	switch (sev_cmd.id) {
 	case KVM_SEV_ES_INIT:
 		if (!sev_es_enabled) {
@@ -2063,6 +2227,9 @@ int sev_mem_enc_ioctl(struct kvm *kvm, void __user *argp)
 	case KVM_SEV_RECEIVE_FINISH:
 		r = sev_receive_finish(kvm, &sev_cmd);
 		break;
+	case KVM_SEV_SNP_LAUNCH_START:
+		r = snp_launch_start(kvm, &sev_cmd);
+		break;
 	default:
 		r = -EINVAL;
 		goto out;
@@ -2258,6 +2425,33 @@ int sev_vm_copy_enc_context_from(struct kvm *kvm, unsigned int source_fd)
 	return ret;
 }
 
+static int snp_decommission_context(struct kvm *kvm)
+{
+	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
+	struct sev_data_snp_addr data = {};
+	int ret;
+
+	/* If context is not created then do nothing */
+	if (!sev->snp_context)
+		return 0;
+
+	data.address = __sme_pa(sev->snp_context);
+	down_write(&sev_deactivate_lock);
+	ret = sev_do_cmd(SEV_CMD_SNP_DECOMMISSION, &data, NULL);
+	if (WARN_ONCE(ret, "failed to release guest context")) {
+		up_write(&sev_deactivate_lock);
+		return ret;
+	}
+
+	up_write(&sev_deactivate_lock);
+
+	/* free the context page now */
+	snp_free_firmware_page(sev->snp_context);
+	sev->snp_context = NULL;
+
+	return 0;
+}
+
 void sev_vm_destroy(struct kvm *kvm)
 {
 	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
@@ -2299,7 +2493,15 @@ void sev_vm_destroy(struct kvm *kvm)
 		}
 	}
 
-	sev_unbind_asid(kvm, sev->handle);
+	if (sev_snp_guest(kvm)) {
+		if (snp_decommission_context(kvm)) {
+			WARN_ONCE(1, "Failed to free SNP guest context, leaking asid!\n");
+			return;
+		}
+	} else {
+		sev_unbind_asid(kvm, sev->handle);
+	}
+
 	sev_asid_free(sev);
 }
 
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 7f2e9c7fc4ca..0654fc91d4db 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -92,6 +92,7 @@ struct kvm_sev_info {
 	struct list_head mirror_entry; /* Use as a list entry of mirrors */
 	struct misc_cg *misc_cg; /* For misc cgroup accounting */
 	atomic_t migration_in_progress;
+	void *snp_context;      /* SNP guest context page */
 };
 
 struct kvm_svm {
-- 
2.25.1



* [PATCH v13 10/26] KVM: SEV: Add KVM_SEV_SNP_LAUNCH_UPDATE command
  2024-04-18 19:41 [PATCH v13 00/26] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support Michael Roth
                   ` (8 preceding siblings ...)
  2024-04-18 19:41 ` [PATCH v13 09/26] KVM: SEV: Add KVM_SEV_SNP_LAUNCH_START command Michael Roth
@ 2024-04-18 19:41 ` Michael Roth
  2024-04-19 11:56   ` Paolo Bonzini
  2024-04-18 19:41 ` [PATCH v13 11/26] KVM: SEV: Add KVM_SEV_SNP_LAUNCH_FINISH command Michael Roth
                   ` (16 subsequent siblings)
  26 siblings, 1 reply; 41+ messages in thread
From: Michael Roth @ 2024-04-18 19:41 UTC (permalink / raw)
  To: kvm
  Cc: linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta,
	liam.merwick, Brijesh Singh

From: Brijesh Singh <brijesh.singh@amd.com>

A key aspect of launching an SNP guest is initializing it with a
known/measured payload which is then encrypted into guest memory as
pre-validated private pages and then measured into the cryptographic
launch context created with KVM_SEV_SNP_LAUNCH_START so that the guest
can attest itself after booting.

Since all private pages are provided by guest_memfd, make use of the
kvm_gmem_populate() interface to handle this. The general flow is that
guest_memfd will handle allocating the pages associated with the GPA
ranges being initialized by each particular call of
KVM_SEV_SNP_LAUNCH_UPDATE, copying data from userspace into those pages,
and then the post_populate callback will do the work of setting the
RMP entries for these pages to private and issuing the SNP firmware
calls to encrypt/measure them.

For more information see the SEV-SNP specification.
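
As a rough illustration of the userspace side of this flow, the sketch
below validates and fills the arguments for one
KVM_SEV_SNP_LAUNCH_UPDATE call. It is not taken from this series: the
struct is a local mirror of the proposed uapi layout, the helper names
are hypothetical, and the GPA alignment check is an assumption (the
kernel code in this patch only checks the length and the page type).

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define SNP_PAGE_SIZE 4096u

/* Page types as defined by this patch (kept in sync with firmware). */
enum {
	SNP_PAGE_TYPE_NORMAL     = 0x1,
	SNP_PAGE_TYPE_ZERO       = 0x3,
	SNP_PAGE_TYPE_UNMEASURED = 0x4,
	SNP_PAGE_TYPE_SECRETS    = 0x5,
	SNP_PAGE_TYPE_CPUID      = 0x6,
};

/* Hypothetical local mirror of struct kvm_sev_snp_launch_update. */
struct snp_launch_update_args {
	uint64_t gfn_start;
	uint64_t uaddr;
	uint32_t len;
	uint8_t type;
};

static int snp_page_type_valid(uint8_t type)
{
	return type == SNP_PAGE_TYPE_NORMAL || type == SNP_PAGE_TYPE_ZERO ||
	       type == SNP_PAGE_TYPE_UNMEASURED ||
	       type == SNP_PAGE_TYPE_SECRETS || type == SNP_PAGE_TYPE_CPUID;
}

/*
 * Fill the arguments for one call. Returns 0 on success, or -EINVAL if
 * the GPA or length is not 4K-aligned or the page type is unknown.
 */
static int snp_update_args_init(struct snp_launch_update_args *args,
				uint64_t gpa, const void *src,
				uint32_t len, uint8_t type)
{
	assert(args);

	if ((gpa % SNP_PAGE_SIZE) || (len % SNP_PAGE_SIZE) ||
	    !snp_page_type_valid(type))
		return -EINVAL;

	args->gfn_start = gpa / SNP_PAGE_SIZE;	/* the API takes a GFN */
	args->uaddr = (uint64_t)(uintptr_t)src;
	args->len = len;
	args->type = type;
	return 0;
}
```

A VMM would then pass the filled structure through the data field of
struct kvm_sev_cmd, with id set to KVM_SEV_SNP_LAUNCH_UPDATE, after
marking the range private via KVM_SET_MEMORY_ATTRIBUTES.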

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Co-developed-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
---
 .../virt/kvm/x86/amd-memory-encryption.rst    |  39 ++++
 arch/x86/include/uapi/asm/kvm.h               |  15 ++
 arch/x86/kvm/svm/sev.c                        | 218 ++++++++++++++++++
 3 files changed, 272 insertions(+)

diff --git a/Documentation/virt/kvm/x86/amd-memory-encryption.rst b/Documentation/virt/kvm/x86/amd-memory-encryption.rst
index 1b042f827eab..1ee8401de72d 100644
--- a/Documentation/virt/kvm/x86/amd-memory-encryption.rst
+++ b/Documentation/virt/kvm/x86/amd-memory-encryption.rst
@@ -478,6 +478,45 @@ Returns: 0 on success, -negative on error
 
 See the SEV-SNP spec [snp-fw-abi]_ for further detail on the launch input.
 
+19. KVM_SEV_SNP_LAUNCH_UPDATE
+-----------------------------
+
+The KVM_SEV_SNP_LAUNCH_UPDATE command is used for loading userspace-provided
+data into a guest GPA range, measuring the contents into the SNP guest context
+created by KVM_SEV_SNP_LAUNCH_START, and then encrypting/validating that GPA
+range so that it will be immediately readable using the encryption key
+associated with the guest context once it is booted, after which point it can
+attest the measurement associated with its context before unlocking any
+secrets.
+
+It is required that the GPA ranges initialized by this command have had the
+KVM_MEMORY_ATTRIBUTE_PRIVATE attribute set in advance. See the documentation
+for KVM_SET_MEMORY_ATTRIBUTES for more details on this aspect.
+
+Parameters (in): struct kvm_sev_snp_launch_update
+
+Returns: 0 on success, -negative on error
+
+::
+
+        struct kvm_sev_snp_launch_update {
+                __u64 gfn_start;        /* Guest page number to load/encrypt data into. */
+                __u64 uaddr;            /* Userspace address of data to be loaded/encrypted. */
+                __u32 len;              /* 4k-aligned length in bytes to copy into guest memory. */
+                __u8 type;              /* The type of the guest pages being initialized. */
+        };
+
+where the allowed values for page_type are #define'd as::
+
+	KVM_SEV_SNP_PAGE_TYPE_NORMAL
+	KVM_SEV_SNP_PAGE_TYPE_ZERO
+	KVM_SEV_SNP_PAGE_TYPE_UNMEASURED
+	KVM_SEV_SNP_PAGE_TYPE_SECRETS
+	KVM_SEV_SNP_PAGE_TYPE_CPUID
+
+See the SEV-SNP spec [snp-fw-abi]_ for further details on how each page type is
+used/measured.
+
 Device attribute API
 ====================
 
diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
index bdf8c5461a36..8612aec97f55 100644
--- a/arch/x86/include/uapi/asm/kvm.h
+++ b/arch/x86/include/uapi/asm/kvm.h
@@ -699,6 +699,7 @@ enum sev_cmd_id {
 
 	/* SNP-specific commands */
 	KVM_SEV_SNP_LAUNCH_START = 100,
+	KVM_SEV_SNP_LAUNCH_UPDATE,
 
 	KVM_SEV_NR_MAX,
 };
@@ -830,6 +831,20 @@ struct kvm_sev_snp_launch_start {
 	__u8 gosvw[16];
 };
 
+/* Kept in sync with firmware values for simplicity. */
+#define KVM_SEV_SNP_PAGE_TYPE_NORMAL		0x1
+#define KVM_SEV_SNP_PAGE_TYPE_ZERO		0x3
+#define KVM_SEV_SNP_PAGE_TYPE_UNMEASURED	0x4
+#define KVM_SEV_SNP_PAGE_TYPE_SECRETS		0x5
+#define KVM_SEV_SNP_PAGE_TYPE_CPUID		0x6
+
+struct kvm_sev_snp_launch_update {
+	__u64 gfn_start;
+	__u64 uaddr;
+	__u32 len;
+	__u8 type;
+};
+
 #define KVM_X2APIC_API_USE_32BIT_IDS            (1ULL << 0)
 #define KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK  (1ULL << 1)
 
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 4c5abc0e7806..e721152bae00 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -262,6 +262,35 @@ static void sev_decommission(unsigned int handle)
 	sev_guest_decommission(&decommission, NULL);
 }
 
+static int snp_page_reclaim(u64 pfn)
+{
+	struct sev_data_snp_page_reclaim data = {0};
+	int err, rc;
+
+	data.paddr = __sme_set(pfn << PAGE_SHIFT);
+	rc = sev_do_cmd(SEV_CMD_SNP_PAGE_RECLAIM, &data, &err);
+	if (WARN_ON_ONCE(rc)) {
+		/*
+		 * This shouldn't happen under normal circumstances, but if the
+		 * reclaim failed, then the page is no longer safe to use.
+		 */
+		snp_leak_pages(pfn, 1);
+	}
+
+	return rc;
+}
+
+static int host_rmp_make_shared(u64 pfn, enum pg_level level)
+{
+	int rc;
+
+	rc = rmp_make_shared(pfn, level);
+	if (rc)
+		snp_leak_pages(pfn, page_level_size(level) >> PAGE_SHIFT);
+
+	return rc;
+}
+
 static void sev_unbind_asid(struct kvm *kvm, unsigned int handle)
 {
 	struct sev_data_deactivate deactivate;
@@ -2131,6 +2160,192 @@ static int snp_launch_start(struct kvm *kvm, struct kvm_sev_cmd *argp)
 	return rc;
 }
 
+struct sev_gmem_populate_args {
+	__u8 type;
+	int sev_fd;
+	int fw_error;
+};
+
+static int sev_gmem_post_populate(struct kvm *kvm, gfn_t gfn_start, kvm_pfn_t pfn,
+				  void __user *src, int order, void *opaque)
+{
+	struct sev_gmem_populate_args *sev_populate_args = opaque;
+	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
+	int n_private = 0, ret, i;
+	int npages = (1 << order);
+	gfn_t gfn;
+
+	pr_debug("%s: gfn_start %llx pfn_start %llx npages %d\n",
+		 __func__, gfn_start, pfn, npages);
+
+	for (gfn = gfn_start, i = 0; gfn < gfn_start + npages; gfn++, i++) {
+		struct sev_data_snp_launch_update fw_args = {0};
+		bool assigned;
+		void *vaddr;
+		int level;
+
+		if (!kvm_mem_is_private(kvm, gfn)) {
+			pr_debug("%s: Failed to ensure GFN 0x%llx has private memory attribute set\n",
+				 __func__, gfn);
+			ret = -EINVAL;
+			break;
+		}
+
+		ret = snp_lookup_rmpentry((u64)pfn + i, &assigned, &level);
+		if (ret || assigned) {
+			pr_debug("%s: Failed to ensure GFN 0x%llx RMP entry is initial shared state, ret: %d assigned: %d\n",
+				 __func__, gfn, ret, assigned);
+			ret = -EINVAL;
+			break;
+		}
+
+		vaddr = kmap_local_pfn(pfn + i);
+		ret = copy_from_user(vaddr, src + i * PAGE_SIZE, PAGE_SIZE);
+		if (ret) {
+			pr_debug("Failed to copy source page into GFN 0x%llx\n", gfn);
+			goto out_unmap;
+		}
+
+		ret = rmp_make_private(pfn + i, gfn << PAGE_SHIFT, PG_LEVEL_4K,
+				       sev_get_asid(kvm), true);
+		if (ret) {
+			pr_debug("%s: Failed to convert GFN 0x%llx to private, ret: %d\n",
+				 __func__, gfn, ret);
+			goto out_unmap;
+		}
+
+		n_private++;
+
+		fw_args.gctx_paddr = __psp_pa(sev->snp_context);
+		fw_args.address = __sme_set(pfn_to_hpa(pfn + i));
+		fw_args.page_size = PG_LEVEL_TO_RMP(PG_LEVEL_4K);
+		fw_args.page_type = sev_populate_args->type;
+		ret = __sev_issue_cmd(sev_populate_args->sev_fd, SEV_CMD_SNP_LAUNCH_UPDATE,
+				      &fw_args, &sev_populate_args->fw_error);
+		if (ret) {
+			pr_debug("%s: SEV-SNP launch update failed, ret: 0x%x, fw_error: 0x%x\n",
+				 __func__, ret, sev_populate_args->fw_error);
+
+			if (snp_page_reclaim(pfn + i))
+				goto out_unmap;
+
+			/*
+			 * When invalid CPUID function entries are detected,
+			 * firmware writes the expected values into the page and
+			 * leaves it unencrypted so it can be used for debugging
+			 * and error-reporting.
+			 *
+			 * Copy this page back into the source buffer so
+			 * userspace can use this information to provide
+			 * information on which CPUID leaves/fields failed CPUID
+			 * validation.
+			 */
+			if (sev_populate_args->type == KVM_SEV_SNP_PAGE_TYPE_CPUID &&
+			    sev_populate_args->fw_error == SEV_RET_INVALID_PARAM) {
+				host_rmp_make_shared(pfn + i, PG_LEVEL_4K);
+
+				if (copy_to_user(src + i * PAGE_SIZE,
+						 vaddr, PAGE_SIZE))
+					pr_debug("Failed to write CPUID page back to userspace\n");
+			}
+		}
+
+out_unmap:
+		kunmap_local(vaddr);
+		if (ret)
+			break;
+	}
+
+	if (ret) {
+		pr_debug("%s: exiting with error ret %d, undoing %d populated gmem pages.\n",
+			 __func__, ret, n_private);
+		for (i = 0; i < n_private; i++)
+			host_rmp_make_shared(pfn + i, PG_LEVEL_4K);
+	}
+
+	return ret;
+}
+
+static int snp_launch_update(struct kvm *kvm, struct kvm_sev_cmd *argp)
+{
+	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
+	struct sev_gmem_populate_args sev_populate_args = {0};
+	struct kvm_sev_snp_launch_update params;
+	struct kvm_memory_slot *memslot;
+	unsigned int npages;
+	int ret = 0;
+
+	if (!sev_snp_guest(kvm) || !sev->snp_context)
+		return -EINVAL;
+
+	if (copy_from_user(&params, u64_to_user_ptr(argp->data), sizeof(params)))
+		return -EFAULT;
+
+	if (!IS_ALIGNED(params.len, PAGE_SIZE) ||
+	    (params.type != KVM_SEV_SNP_PAGE_TYPE_NORMAL &&
+	     params.type != KVM_SEV_SNP_PAGE_TYPE_ZERO &&
+	     params.type != KVM_SEV_SNP_PAGE_TYPE_UNMEASURED &&
+	     params.type != KVM_SEV_SNP_PAGE_TYPE_SECRETS &&
+	     params.type != KVM_SEV_SNP_PAGE_TYPE_CPUID))
+		return -EINVAL;
+
+	npages = params.len / PAGE_SIZE;
+
+	pr_debug("%s: GFN range 0x%llx-0x%llx type %d\n", __func__,
+		 params.gfn_start, params.gfn_start + npages, params.type);
+
+	/*
+	 * For each GFN that's being prepared as part of the initial guest
+	 * state, the following pre-conditions are verified:
+	 *
+	 *   1) The backing memslot is a valid private memslot.
+	 *   2) The GFN has been set to private via KVM_SET_MEMORY_ATTRIBUTES
+	 *      beforehand.
+	 *   3) The PFN of the guest_memfd has not already been set to private
+	 *      in the RMP table.
+	 *
+	 * The KVM MMU relies on kvm->mmu_invalidate_seq to retry nested page
+	 * faults if there's a race between a fault and an attribute update via
+	 * KVM_SET_MEMORY_ATTRIBUTES, and a similar approach could be utilized
+	 * here. However, kvm->slots_lock guards against both this as well as
+	 * concurrent memslot updates occurring while these checks are being
+	 * performed, so use that here to make it easier to reason about the
+	 * initial expected state and better guard against unexpected
+	 * situations.
+	 */
+	mutex_lock(&kvm->slots_lock);
+
+	memslot = gfn_to_memslot(kvm, params.gfn_start);
+	if (!kvm_slot_can_be_private(memslot)) {
+		ret = -EINVAL;
+		goto out;
+	}
+
+	sev_populate_args.sev_fd = argp->sev_fd;
+	sev_populate_args.type = params.type;
+
+	ret = kvm_gmem_populate(kvm, params.gfn_start, u64_to_user_ptr(params.uaddr),
+				npages, sev_gmem_post_populate, &sev_populate_args);
+	if (ret < 0) {
+		argp->error = sev_populate_args.fw_error;
+		pr_debug("%s: kvm_gmem_populate failed, ret %d (fw_error %d)\n",
+			 __func__, ret, argp->error);
+	} else if (ret < npages) {
+		params.len = ret * PAGE_SIZE;
+		ret = -EINTR;
+	} else if (WARN_ONCE(ret > npages, "Completed page count %d exceeds requested amount %d",
+			     ret, npages)) {
+		ret = -EINVAL;
+	} else {
+		ret = 0;
+	}
+
+out:
+	mutex_unlock(&kvm->slots_lock);
+
+	return ret;
+}
+
 int sev_mem_enc_ioctl(struct kvm *kvm, void __user *argp)
 {
 	struct kvm_sev_cmd sev_cmd;
@@ -2230,6 +2445,9 @@ int sev_mem_enc_ioctl(struct kvm *kvm, void __user *argp)
 	case KVM_SEV_SNP_LAUNCH_START:
 		r = snp_launch_start(kvm, &sev_cmd);
 		break;
+	case KVM_SEV_SNP_LAUNCH_UPDATE:
+		r = snp_launch_update(kvm, &sev_cmd);
+		break;
 	default:
 		r = -EINVAL;
 		goto out;
-- 
2.25.1



* [PATCH v13 11/26] KVM: SEV: Add KVM_SEV_SNP_LAUNCH_FINISH command
  2024-04-18 19:41 [PATCH v13 00/26] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support Michael Roth
                   ` (9 preceding siblings ...)
  2024-04-18 19:41 ` [PATCH v13 10/26] KVM: SEV: Add KVM_SEV_SNP_LAUNCH_UPDATE command Michael Roth
@ 2024-04-18 19:41 ` Michael Roth
  2024-04-18 19:41 ` [PATCH v13 12/26] KVM: SEV: Add support to handle GHCB GPA register VMGEXIT Michael Roth
                   ` (15 subsequent siblings)
  26 siblings, 0 replies; 41+ messages in thread
From: Michael Roth @ 2024-04-18 19:41 UTC (permalink / raw)
  To: kvm
  Cc: linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta,
	liam.merwick, Brijesh Singh, Harald Hoyer

From: Brijesh Singh <brijesh.singh@amd.com>

Add a KVM_SEV_SNP_LAUNCH_FINISH command to finalize the cryptographic
launch digest which stores the measurement of the guest at launch time.
Also extend the existing SNP firmware data structures to support
disabling the use of Versioned Chip Endorsement Keys (VCEK) by guests as
part of this command.

While finalizing the launch flow, the code also issues the LAUNCH_UPDATE
SNP firmware commands to encrypt/measure the initial VMSA pages for each
configured vCPU, which requires setting the RMP entries for those pages
to private, so also add handling to clean up the RMP entries for these
pages when freeing vCPUs during shutdown.
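
For illustration only, a VMM might prepare the KVM_SEV_SNP_LAUNCH_FINISH
arguments along the following lines. The struct is a local mirror of the
uapi layout proposed in this patch and the helper names are
hypothetical; the only behavior relied on from the kernel side is that
the ID authentication blob is read only when id_block_en is set.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SNP_FINISH_DATA_SIZE 32

/* Hypothetical local mirror of struct kvm_sev_snp_launch_finish. */
struct snp_launch_finish_args {
	uint64_t id_block_uaddr;
	uint64_t id_auth_uaddr;
	uint8_t id_block_en;
	uint8_t auth_key_en;
	uint8_t vcek_disabled;
	uint8_t host_data[SNP_FINISH_DATA_SIZE];
	uint8_t pad0[5];
};

/* Start from an all-zero request: no ID block, VCEK not disabled. */
static void snp_finish_args_init(struct snp_launch_finish_args *args,
				 const uint8_t host_data[SNP_FINISH_DATA_SIZE])
{
	assert(args && host_data);
	memset(args, 0, sizeof(*args));
	memcpy(args->host_data, host_data, SNP_FINISH_DATA_SIZE);
}

/*
 * Attach an ID block (96 bytes) and an ID authentication structure
 * (4096 bytes), matching the sizes this patch copies from userspace.
 */
static void snp_finish_args_set_id(struct snp_launch_finish_args *args,
				   const void *id_block, const void *id_auth,
				   int auth_key_en)
{
	assert(args && id_block && id_auth);
	args->id_block_uaddr = (uint64_t)(uintptr_t)id_block;
	args->id_auth_uaddr = (uint64_t)(uintptr_t)id_auth;
	args->id_block_en = 1;
	args->auth_key_en = auth_key_en ? 1 : 0;
}
```

Zeroing the whole structure first keeps pad0 clear, which is the
conservative choice if the padding is ever given a meaning.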

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Co-developed-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Harald Hoyer <harald@profian.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
---
 .../virt/kvm/x86/amd-memory-encryption.rst    |  26 ++++
 arch/x86/include/uapi/asm/kvm.h               |  15 +++
 arch/x86/kvm/svm/sev.c                        | 123 ++++++++++++++++++
 include/linux/psp-sev.h                       |   4 +-
 4 files changed, 167 insertions(+), 1 deletion(-)

diff --git a/Documentation/virt/kvm/x86/amd-memory-encryption.rst b/Documentation/virt/kvm/x86/amd-memory-encryption.rst
index 1ee8401de72d..d2fea9874f68 100644
--- a/Documentation/virt/kvm/x86/amd-memory-encryption.rst
+++ b/Documentation/virt/kvm/x86/amd-memory-encryption.rst
@@ -517,6 +517,32 @@ where the allowed values for page_type are #define'd as::
 See the SEV-SNP spec [snp-fw-abi]_ for further details on how each page type is
 used/measured.
 
+20. KVM_SEV_SNP_LAUNCH_FINISH
+-----------------------------
+
+After completion of the SNP guest launch flow, the KVM_SEV_SNP_LAUNCH_FINISH
+command can be issued to make the guest ready for execution.
+
+Parameters (in): struct kvm_sev_snp_launch_finish
+
+Returns: 0 on success, -negative on error
+
+::
+
+        struct kvm_sev_snp_launch_finish {
+                __u64 id_block_uaddr;
+                __u64 id_auth_uaddr;
+                __u8 id_block_en;
+                __u8 auth_key_en;
+                __u8 vcek_disabled;
+                __u8 host_data[32];
+                __u8 pad0[5];
+        };
+
+
+See SNP_LAUNCH_FINISH in the SEV-SNP specification [snp-fw-abi]_ for further
+details on the input parameters in ``struct kvm_sev_snp_launch_finish``.
+
 Device attribute API
 ====================
 
diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
index 8612aec97f55..1d1f149d035e 100644
--- a/arch/x86/include/uapi/asm/kvm.h
+++ b/arch/x86/include/uapi/asm/kvm.h
@@ -700,6 +700,7 @@ enum sev_cmd_id {
 	/* SNP-specific commands */
 	KVM_SEV_SNP_LAUNCH_START = 100,
 	KVM_SEV_SNP_LAUNCH_UPDATE,
+	KVM_SEV_SNP_LAUNCH_FINISH,
 
 	KVM_SEV_NR_MAX,
 };
@@ -845,6 +846,20 @@ struct kvm_sev_snp_launch_update {
 	__u8 type;
 };
 
+#define KVM_SEV_SNP_ID_BLOCK_SIZE	96
+#define KVM_SEV_SNP_ID_AUTH_SIZE	4096
+#define KVM_SEV_SNP_FINISH_DATA_SIZE	32
+
+struct kvm_sev_snp_launch_finish {
+	__u64 id_block_uaddr;
+	__u64 id_auth_uaddr;
+	__u8 id_block_en;
+	__u8 auth_key_en;
+	__u8 vcek_disabled;
+	__u8 host_data[KVM_SEV_SNP_FINISH_DATA_SIZE];
+	__u8 pad0[5];
+};
+
 #define KVM_X2APIC_API_USE_32BIT_IDS            (1ULL << 0)
 #define KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK  (1ULL << 1)
 
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index e721152bae00..78412c7c6708 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -78,6 +78,8 @@ static u64 sev_supported_vmsa_features;
 #define SNP_POLICY_API_MAJOR		1
 #define SNP_POLICY_API_MINOR		51
 
+#define INITIAL_VMSA_GPA 0xFFFFFFFFF000
+
 static u8 sev_enc_bit;
 static DECLARE_RWSEM(sev_deactivate_lock);
 static DEFINE_MUTEX(sev_bitmap_lock);
@@ -2346,6 +2348,111 @@ static int snp_launch_update(struct kvm *kvm, struct kvm_sev_cmd *argp)
 	return ret;
 }
 
+static int snp_launch_update_vmsa(struct kvm *kvm, struct kvm_sev_cmd *argp)
+{
+	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
+	struct sev_data_snp_launch_update data = {};
+	struct kvm_vcpu *vcpu;
+	unsigned long i;
+	int ret;
+
+	data.gctx_paddr = __psp_pa(sev->snp_context);
+	data.page_type = SNP_PAGE_TYPE_VMSA;
+
+	kvm_for_each_vcpu(i, vcpu, kvm) {
+		struct vcpu_svm *svm = to_svm(vcpu);
+		u64 pfn = __pa(svm->sev_es.vmsa) >> PAGE_SHIFT;
+
+		/* Perform some pre-encryption checks against the VMSA */
+		ret = sev_es_sync_vmsa(svm);
+		if (ret)
+			return ret;
+
+		/* Transition the VMSA page to a firmware state. */
+		ret = rmp_make_private(pfn, INITIAL_VMSA_GPA, PG_LEVEL_4K, sev->asid, true);
+		if (ret)
+			return ret;
+
+		/* Issue the SNP command to encrypt the VMSA */
+		data.address = __sme_pa(svm->sev_es.vmsa);
+		ret = __sev_issue_cmd(argp->sev_fd, SEV_CMD_SNP_LAUNCH_UPDATE,
+				      &data, &argp->error);
+		if (ret) {
+			snp_page_reclaim(pfn);
+			return ret;
+		}
+
+		svm->vcpu.arch.guest_state_protected = true;
+	}
+
+	return 0;
+}
+
+static int snp_launch_finish(struct kvm *kvm, struct kvm_sev_cmd *argp)
+{
+	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
+	struct kvm_sev_snp_launch_finish params;
+	struct sev_data_snp_launch_finish *data;
+	void *id_block = NULL, *id_auth = NULL;
+	int ret;
+
+	if (!sev_snp_guest(kvm))
+		return -ENOTTY;
+
+	if (!sev->snp_context)
+		return -EINVAL;
+
+	if (copy_from_user(&params, u64_to_user_ptr(argp->data), sizeof(params)))
+		return -EFAULT;
+
+	/* Measure all vCPUs using LAUNCH_UPDATE before finalizing the launch flow. */
+	ret = snp_launch_update_vmsa(kvm, argp);
+	if (ret)
+		return ret;
+
+	data = kzalloc(sizeof(*data), GFP_KERNEL_ACCOUNT);
+	if (!data)
+		return -ENOMEM;
+
+	if (params.id_block_en) {
+		id_block = psp_copy_user_blob(params.id_block_uaddr, KVM_SEV_SNP_ID_BLOCK_SIZE);
+		if (IS_ERR(id_block)) {
+			ret = PTR_ERR(id_block);
+			goto e_free;
+		}
+
+		data->id_block_en = 1;
+		data->id_block_paddr = __sme_pa(id_block);
+
+		id_auth = psp_copy_user_blob(params.id_auth_uaddr, KVM_SEV_SNP_ID_AUTH_SIZE);
+		if (IS_ERR(id_auth)) {
+			ret = PTR_ERR(id_auth);
+			goto e_free_id_block;
+		}
+
+		data->id_auth_paddr = __sme_pa(id_auth);
+
+		if (params.auth_key_en)
+			data->auth_key_en = 1;
+	}
+
+	data->vcek_disabled = params.vcek_disabled;
+
+	memcpy(data->host_data, params.host_data, KVM_SEV_SNP_FINISH_DATA_SIZE);
+	data->gctx_paddr = __psp_pa(sev->snp_context);
+	ret = sev_issue_cmd(kvm, SEV_CMD_SNP_LAUNCH_FINISH, data, &argp->error);
+
+	kfree(id_auth);
+
+e_free_id_block:
+	kfree(id_block);
+
+e_free:
+	kfree(data);
+
+	return ret;
+}
+
 int sev_mem_enc_ioctl(struct kvm *kvm, void __user *argp)
 {
 	struct kvm_sev_cmd sev_cmd;
@@ -2448,6 +2555,9 @@ int sev_mem_enc_ioctl(struct kvm *kvm, void __user *argp)
 	case KVM_SEV_SNP_LAUNCH_UPDATE:
 		r = snp_launch_update(kvm, &sev_cmd);
 		break;
+	case KVM_SEV_SNP_LAUNCH_FINISH:
+		r = snp_launch_finish(kvm, &sev_cmd);
+		break;
 	default:
 		r = -EINVAL;
 		goto out;
@@ -2938,11 +3048,24 @@ void sev_free_vcpu(struct kvm_vcpu *vcpu)
 
 	svm = to_svm(vcpu);
 
+	/*
+	 * If it's an SNP guest, then the VMSA was marked in the RMP table as
+	 * a guest-owned page. Transition the page to hypervisor state before
+	 * releasing it back to the system.
+	 */
+	if (sev_snp_guest(vcpu->kvm)) {
+		u64 pfn = __pa(svm->sev_es.vmsa) >> PAGE_SHIFT;
+
+		if (host_rmp_make_shared(pfn, PG_LEVEL_4K))
+			goto skip_vmsa_free;
+	}
+
 	if (vcpu->arch.guest_state_protected)
 		sev_flush_encrypted_page(vcpu, svm->sev_es.vmsa);
 
 	__free_page(virt_to_page(svm->sev_es.vmsa));
 
+skip_vmsa_free:
 	if (svm->sev_es.ghcb_sa_free)
 		kvfree(svm->sev_es.ghcb_sa);
 }
diff --git a/include/linux/psp-sev.h b/include/linux/psp-sev.h
index 3705c2044fc0..903ddfea8585 100644
--- a/include/linux/psp-sev.h
+++ b/include/linux/psp-sev.h
@@ -658,6 +658,7 @@ struct sev_data_snp_launch_update {
  * @id_auth_paddr: system physical address of ID block authentication structure
  * @id_block_en: indicates whether ID block is present
  * @auth_key_en: indicates whether author key is present in authentication structure
+ * @vcek_disabled: indicates whether use of VCEK is disallowed for attestation reports
  * @rsvd: reserved
  * @host_data: host-supplied data for guest, not interpreted by firmware
  */
@@ -667,7 +668,8 @@ struct sev_data_snp_launch_finish {
 	u64 id_auth_paddr;
 	u8 id_block_en:1;
 	u8 auth_key_en:1;
-	u64 rsvd:62;
+	u8 vcek_disabled:1;
+	u64 rsvd:61;
 	u8 host_data[32];
 } __packed;
 
-- 
2.25.1



* [PATCH v13 12/26] KVM: SEV: Add support to handle GHCB GPA register VMGEXIT
  2024-04-18 19:41 [PATCH v13 00/26] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support Michael Roth
                   ` (10 preceding siblings ...)
  2024-04-18 19:41 ` [PATCH v13 11/26] KVM: SEV: Add KVM_SEV_SNP_LAUNCH_FINISH command Michael Roth
@ 2024-04-18 19:41 ` Michael Roth
  2024-04-18 19:41 ` [PATCH v13 13/26] KVM: SEV: Add support to handle MSR based Page State Change VMGEXIT Michael Roth
                   ` (14 subsequent siblings)
  26 siblings, 0 replies; 41+ messages in thread
From: Michael Roth @ 2024-04-18 19:41 UTC (permalink / raw)
  To: kvm
  Cc: linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta,
	liam.merwick, Brijesh Singh

From: Brijesh Singh <brijesh.singh@amd.com>

SEV-SNP guests are required to perform a GHCB GPA registration. Before
using a GHCB GPA for a vCPU the first time, a guest must register the
vCPU GHCB GPA. If the hypervisor can work with the guest-requested GPA,
it must respond with the same GPA; otherwise it must return -1.

On VMEXIT, verify that the GHCB GPA matches the registered value.
If a mismatch is detected, then abort the guest.
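
The registration exchange can be sketched in plain C as follows. This
is a standalone model rather than the KVM code: the request code and
the GPA value position/mask follow the definitions in sev-common.h,
while the assumption that the request code lives in GHCBData[11:0] and
all function and struct names are illustrative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define REG_GPA_REQ	0x012ULL
#define INFO_MASK	0xfffULL		/* assumed: GHCBData[11:0] */
#define GPA_VALUE_POS	12
#define GPA_VALUE_MASK	((1ULL << 52) - 1)	/* GENMASK_ULL(51, 0) */

/* Hypothetical per-vCPU state holding the registered GHCB GPA. */
struct vcpu_ghcb_state {
	uint64_t registered_gpa;
};

/* Guest side: build the MSR value that registers a page-aligned GPA. */
static uint64_t ghcb_msr_reg_gpa_req(uint64_t gpa)
{
	uint64_t gfn = gpa >> 12;

	return ((gfn & GPA_VALUE_MASK) << GPA_VALUE_POS) | REG_GPA_REQ;
}

/* Hypervisor side: record the GPA carried by a registration request. */
static bool handle_reg_gpa_req(struct vcpu_ghcb_state *st, uint64_t msr)
{
	uint64_t gfn;

	assert(st);
	if ((msr & INFO_MASK) != REG_GPA_REQ)
		return false;

	gfn = (msr >> GPA_VALUE_POS) & GPA_VALUE_MASK;
	st->registered_gpa = gfn << 12;
	return true;
}

/* Checked on every later VMGEXIT that uses a GHCB page. */
static bool ghcb_gpa_is_registered(const struct vcpu_ghcb_state *st,
				   uint64_t gpa)
{
	return st->registered_gpa == gpa;
}
```

For example, registering GPA 0x7f000 produces the MSR value 0x7f012,
and a later VMGEXIT with a GHCB at any other GPA fails the check.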

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 arch/x86/include/asm/sev-common.h |  8 ++++++++
 arch/x86/kvm/svm/sev.c            | 27 +++++++++++++++++++++++++++
 arch/x86/kvm/svm/svm.h            |  7 +++++++
 3 files changed, 42 insertions(+)

diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h
index 5a8246dd532f..1006bfffe07a 100644
--- a/arch/x86/include/asm/sev-common.h
+++ b/arch/x86/include/asm/sev-common.h
@@ -59,6 +59,14 @@
 #define GHCB_MSR_AP_RESET_HOLD_RESULT_POS	12
 #define GHCB_MSR_AP_RESET_HOLD_RESULT_MASK	GENMASK_ULL(51, 0)
 
+/* Preferred GHCB GPA Request */
+#define GHCB_MSR_PREF_GPA_REQ		0x010
+#define GHCB_MSR_GPA_VALUE_POS		12
+#define GHCB_MSR_GPA_VALUE_MASK		GENMASK_ULL(51, 0)
+
+#define GHCB_MSR_PREF_GPA_RESP		0x011
+#define GHCB_MSR_PREF_GPA_NONE		0xfffffffffffff
+
 /* GHCB GPA Register */
 #define GHCB_MSR_REG_GPA_REQ		0x012
 #define GHCB_MSR_REG_GPA_REQ_VAL(v)			\
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 78412c7c6708..bd7f46c61c64 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -3532,6 +3532,26 @@ static int sev_handle_vmgexit_msr_protocol(struct vcpu_svm *svm)
 		set_ghcb_msr_bits(svm, GHCB_MSR_HV_FT_RESP,
 				  GHCB_MSR_INFO_MASK, GHCB_MSR_INFO_POS);
 		break;
+	case GHCB_MSR_PREF_GPA_REQ:
+		set_ghcb_msr_bits(svm, GHCB_MSR_PREF_GPA_NONE, GHCB_MSR_GPA_VALUE_MASK,
+				  GHCB_MSR_GPA_VALUE_POS);
+		set_ghcb_msr_bits(svm, GHCB_MSR_PREF_GPA_RESP, GHCB_MSR_INFO_MASK,
+				  GHCB_MSR_INFO_POS);
+		break;
+	case GHCB_MSR_REG_GPA_REQ: {
+		u64 gfn;
+
+		gfn = get_ghcb_msr_bits(svm, GHCB_MSR_GPA_VALUE_MASK,
+					GHCB_MSR_GPA_VALUE_POS);
+
+		svm->sev_es.ghcb_registered_gpa = gfn_to_gpa(gfn);
+
+		set_ghcb_msr_bits(svm, gfn, GHCB_MSR_GPA_VALUE_MASK,
+				  GHCB_MSR_GPA_VALUE_POS);
+		set_ghcb_msr_bits(svm, GHCB_MSR_REG_GPA_RESP, GHCB_MSR_INFO_MASK,
+				  GHCB_MSR_INFO_POS);
+		break;
+	}
 	case GHCB_MSR_TERM_REQ: {
 		u64 reason_set, reason_code;
 
@@ -3595,6 +3615,13 @@ int sev_handle_vmgexit(struct kvm_vcpu *vcpu)
 	trace_kvm_vmgexit_enter(vcpu->vcpu_id, svm->sev_es.ghcb);
 
 	sev_es_sync_from_ghcb(svm);
+
+	/* An SEV-SNP guest must register its GHCB GPA before using it. */
+	if (sev_snp_guest(svm->vcpu.kvm) && !ghcb_gpa_is_registered(svm, ghcb_gpa)) {
+		vcpu_unimpl(&svm->vcpu, "vmgexit: GHCB GPA [%#llx] is not registered.\n", ghcb_gpa);
+		return -EINVAL;
+	}
+
 	ret = sev_es_validate_vmgexit(svm);
 	if (ret)
 		return ret;
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 0654fc91d4db..730f5ced2a2e 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -208,6 +208,8 @@ struct vcpu_sev_es_state {
 	u32 ghcb_sa_len;
 	bool ghcb_sa_sync;
 	bool ghcb_sa_free;
+
+	u64 ghcb_registered_gpa;
 };
 
 struct vcpu_svm {
@@ -361,6 +363,11 @@ static __always_inline bool sev_snp_guest(struct kvm *kvm)
 #endif
 }
 
+static inline bool ghcb_gpa_is_registered(struct vcpu_svm *svm, u64 val)
+{
+	return svm->sev_es.ghcb_registered_gpa == val;
+}
+
 static inline void vmcb_mark_all_dirty(struct vmcb *vmcb)
 {
 	vmcb->control.clean = 0;
-- 
2.25.1



* [PATCH v13 13/26] KVM: SEV: Add support to handle MSR based Page State Change VMGEXIT
  2024-04-18 19:41 [PATCH v13 00/26] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support Michael Roth
                   ` (11 preceding siblings ...)
  2024-04-18 19:41 ` [PATCH v13 12/26] KVM: SEV: Add support to handle GHCB GPA register VMGEXIT Michael Roth
@ 2024-04-18 19:41 ` Michael Roth
  2024-04-18 19:41 ` [PATCH v13 14/26] KVM: SEV: Add support to handle " Michael Roth
                   ` (13 subsequent siblings)
  26 siblings, 0 replies; 41+ messages in thread
From: Michael Roth @ 2024-04-18 19:41 UTC (permalink / raw)
  To: kvm
  Cc: linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta,
	liam.merwick, Brijesh Singh

From: Brijesh Singh <brijesh.singh@amd.com>

SEV-SNP VMs can ask the hypervisor to change the page state in the RMP
table to be private or shared using the Page State Change MSR protocol
as defined in the GHCB specification.

When using gmem, private/shared memory is allocated through separate
pools, and KVM relies on userspace issuing a KVM_SET_MEMORY_ATTRIBUTES
KVM ioctl to tell the KVM MMU whether a particular GFN should be backed
by private memory.

Forward these page state change requests to userspace so that it can
issue the expected KVM ioctls. The KVM MMU will handle updating the RMP
entries when it is ready to map a private page into a guest.

Define a new KVM_EXIT_VMGEXIT for exits of this type, and structure it
so that it can be extended for other cases where VMGEXITs need some
level of handling in userspace.
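
As a standalone sketch of the decoding involved, the helpers below
extract the GFN and operation from a Page State Change request using
the same bit ranges as the GHCB_MSR_PSC_REQ_TO_GFN and
GHCB_MSR_PSC_REQ_TO_OP macros added by this patch. The struct and
function names are hypothetical, and the assumption that the MSR op
values line up with KVM_USER_VMGEXIT_PSC_MSR_OP_PRIVATE (1) and
KVM_USER_VMGEXIT_PSC_MSR_OP_SHARED (2) is mine.

```c
#include <assert.h>
#include <stdint.h>

#define PSC_GFN_MASK	(((1ULL << 40) - 1) << 12)	/* GENMASK_ULL(51, 12) */
#define PSC_OP_MASK	(0xfULL << 52)			/* GENMASK_ULL(55, 52) */

#define PSC_RESP	0x015ULL
#define PSC_RESP_ERROR	((1ULL << 63) | PSC_RESP)	/* generic error */

#define PSC_OP_PRIVATE	1
#define PSC_OP_SHARED	2

/* Hypothetical decoded form of what is forwarded to userspace. */
struct psc_request {
	uint64_t gpa;
	uint8_t op;
};

static uint64_t psc_req_to_gfn(uint64_t msr)
{
	return (msr & PSC_GFN_MASK) >> 12;
}

static uint8_t psc_req_to_op(uint64_t msr)
{
	return (uint8_t)((msr & PSC_OP_MASK) >> 52);
}

/* Returns 0 and fills *out, or -1 if the requested op is not known. */
static int psc_msr_decode(uint64_t msr, struct psc_request *out)
{
	uint8_t op = psc_req_to_op(msr);

	assert(out);
	if (op != PSC_OP_PRIVATE && op != PSC_OP_SHARED)
		return -1;

	out->gpa = psc_req_to_gfn(msr) << 12;
	out->op = op;
	return 0;
}
```

Userspace would act on the decoded gpa/op pair by issuing
KVM_SET_MEMORY_ATTRIBUTES for that page and then reporting the result
back through the ret field.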

Co-developed-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
---
 Documentation/virt/kvm/api.rst    | 33 +++++++++++++++++++++++++++++++
 arch/x86/include/asm/sev-common.h |  6 ++++++
 arch/x86/kvm/svm/sev.c            | 33 +++++++++++++++++++++++++++++++
 include/uapi/linux/kvm.h          | 17 ++++++++++++++++
 4 files changed, 89 insertions(+)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index f0b76ff5030d..4a7a2945bc78 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -7060,6 +7060,39 @@ Please note that the kernel is allowed to use the kvm_run structure as the
 primary storage for certain register types. Therefore, the kernel may use the
 values in kvm_run even if the corresponding bit in kvm_dirty_regs is not set.
 
+::
+
+		/* KVM_EXIT_VMGEXIT */
+		struct kvm_user_vmgexit {
+		#define KVM_USER_VMGEXIT_PSC_MSR	1
+			__u32 type; /* KVM_USER_VMGEXIT_* type */
+			union {
+				struct {
+					__u64 gpa;
+		#define KVM_USER_VMGEXIT_PSC_MSR_OP_PRIVATE	1
+		#define KVM_USER_VMGEXIT_PSC_MSR_OP_SHARED	2
+					__u8 op;
+					__u32 ret;
+				} psc_msr;
+			};
+		};
+
+If exit reason is KVM_EXIT_VMGEXIT then it indicates that an SEV-SNP guest
+has issued a VMGEXIT instruction (as documented by the AMD Architecture
+Programmer's Manual (APM)) to the hypervisor that needs to be serviced by
+userspace. These are generally handled by the host kernel, but in some
+cases some aspects of handling a VMGEXIT are handled by userspace.
+
+A kvm_user_vmgexit structure is defined to encapsulate the data to be
+sent to or returned by userspace. The type field defines the specific type
+of exit that needs to be serviced, and that type is used as a discriminator
+to determine which union type should be used for input/output.
+
+For the KVM_USER_VMGEXIT_PSC_MSR type, the psc_msr union type is used. The
+kernel will supply the 'gpa' and 'op' fields, and userspace is expected to
+update the private/shared state of the GPA using the corresponding
+KVM_SET_MEMORY_ATTRIBUTES ioctl. The 'ret' field is to be set to 0 by
+userspace on success, or some non-zero value on failure.
 
 6. Capabilities that can be enabled on vCPUs
 ============================================
diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h
index 1006bfffe07a..6d68db812de1 100644
--- a/arch/x86/include/asm/sev-common.h
+++ b/arch/x86/include/asm/sev-common.h
@@ -101,11 +101,17 @@ enum psc_op {
 	/* GHCBData[11:0] */				\
 	GHCB_MSR_PSC_REQ)
 
+#define GHCB_MSR_PSC_REQ_TO_GFN(msr) (((msr) & GENMASK_ULL(51, 12)) >> 12)
+#define GHCB_MSR_PSC_REQ_TO_OP(msr) (((msr) & GENMASK_ULL(55, 52)) >> 52)
+
 #define GHCB_MSR_PSC_RESP		0x015
 #define GHCB_MSR_PSC_RESP_VAL(val)			\
 	/* GHCBData[63:32] */				\
 	(((u64)(val) & GENMASK_ULL(63, 32)) >> 32)
 
+/* Set highest bit as a generic error response */
+#define GHCB_MSR_PSC_RESP_ERROR (BIT_ULL(63) | GHCB_MSR_PSC_RESP)
+
 /* GHCB Hypervisor Feature Request/Response */
 #define GHCB_MSR_HV_FT_REQ		0x080
 #define GHCB_MSR_HV_FT_RESP		0x081
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index bd7f46c61c64..e982468554cb 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -3454,6 +3454,36 @@ static void set_ghcb_msr(struct vcpu_svm *svm, u64 value)
 	svm->vmcb->control.ghcb_gpa = value;
 }
 
+static int snp_complete_psc_msr(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_svm *svm = to_svm(vcpu);
+	u64 vmm_ret = vcpu->run->vmgexit.psc_msr.ret;
+
+	set_ghcb_msr(svm, (vmm_ret << 32) | GHCB_MSR_PSC_RESP);
+
+	return 1; /* resume guest */
+}
+
+static int snp_begin_psc_msr(struct kvm_vcpu *vcpu, u64 ghcb_msr)
+{
+	u64 gpa = gfn_to_gpa(GHCB_MSR_PSC_REQ_TO_GFN(ghcb_msr));
+	u8 op = GHCB_MSR_PSC_REQ_TO_OP(ghcb_msr);
+	struct vcpu_svm *svm = to_svm(vcpu);
+
+	if (op != SNP_PAGE_STATE_PRIVATE && op != SNP_PAGE_STATE_SHARED) {
+		set_ghcb_msr(svm, GHCB_MSR_PSC_RESP_ERROR);
+		return 1; /* resume guest */
+	}
+
+	vcpu->run->exit_reason = KVM_EXIT_VMGEXIT;
+	vcpu->run->vmgexit.type = KVM_USER_VMGEXIT_PSC_MSR;
+	vcpu->run->vmgexit.psc_msr.gpa = gpa;
+	vcpu->run->vmgexit.psc_msr.op = op;
+	vcpu->arch.complete_userspace_io = snp_complete_psc_msr;
+
+	return 0; /* forward request to userspace */
+}
+
 static int sev_handle_vmgexit_msr_protocol(struct vcpu_svm *svm)
 {
 	struct vmcb_control_area *control = &svm->vmcb->control;
@@ -3552,6 +3582,9 @@ static int sev_handle_vmgexit_msr_protocol(struct vcpu_svm *svm)
 				  GHCB_MSR_INFO_POS);
 		break;
 	}
+	case GHCB_MSR_PSC_REQ:
+		ret = snp_begin_psc_msr(vcpu, control->ghcb_gpa);
+		break;
 	case GHCB_MSR_TERM_REQ: {
 		u64 reason_set, reason_code;
 
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 2190adbe3002..54b81e46a9fa 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -135,6 +135,20 @@ struct kvm_xen_exit {
 	} u;
 };
 
+struct kvm_user_vmgexit {
+#define KVM_USER_VMGEXIT_PSC_MSR	1
+	__u32 type; /* KVM_USER_VMGEXIT_* type */
+	union {
+		struct {
+			__u64 gpa;
+#define KVM_USER_VMGEXIT_PSC_MSR_OP_PRIVATE	1
+#define KVM_USER_VMGEXIT_PSC_MSR_OP_SHARED	2
+			__u8 op;
+			__u32 ret;
+		} psc_msr;
+	};
+};
+
 #define KVM_S390_GET_SKEYS_NONE   1
 #define KVM_S390_SKEYS_MAX        1048576
 
@@ -178,6 +192,7 @@ struct kvm_xen_exit {
 #define KVM_EXIT_NOTIFY           37
 #define KVM_EXIT_LOONGARCH_IOCSR  38
 #define KVM_EXIT_MEMORY_FAULT     39
+#define KVM_EXIT_VMGEXIT          40
 
 /* For KVM_EXIT_INTERNAL_ERROR */
 /* Emulate instruction failed. */
@@ -433,6 +448,8 @@ struct kvm_run {
 			__u64 gpa;
 			__u64 size;
 		} memory_fault;
+		/* KVM_EXIT_VMGEXIT */
+		struct kvm_user_vmgexit vmgexit;
 		/* Fix the size of the union. */
 		char padding[256];
 	};
-- 
2.25.1



* [PATCH v13 14/26] KVM: SEV: Add support to handle Page State Change VMGEXIT
  2024-04-18 19:41 [PATCH v13 00/26] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support Michael Roth
                   ` (12 preceding siblings ...)
  2024-04-18 19:41 ` [PATCH v13 13/26] KVM: SEV: Add support to handle MSR based Page State Change VMGEXIT Michael Roth
@ 2024-04-18 19:41 ` Michael Roth
  2024-04-18 19:41 ` [PATCH v13 15/26] KVM: SEV: Add support to handle RMP nested page faults Michael Roth
                   ` (12 subsequent siblings)
  26 siblings, 0 replies; 41+ messages in thread
From: Michael Roth @ 2024-04-18 19:41 UTC (permalink / raw)
  To: kvm
  Cc: linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta,
	liam.merwick, Brijesh Singh

From: Brijesh Singh <brijesh.singh@amd.com>

SEV-SNP VMs can ask the hypervisor to change the page state in the RMP
table to be private or shared using the Page State Change NAE event
as defined in the GHCB specification version 2.

Forward these requests to userspace as KVM_EXIT_VMGEXITs, similar to how
it is done for requests that don't use a GHCB page.

Co-developed-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
---
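As a rough illustration of how a VMM might consume the 'type'
discriminator, here is a hedged sketch of a userspace dispatcher. It
uses a local mirror of the uapi structure rather than the real header,
and the vmm_ops hooks and demo_* stubs are hypothetical placeholders: a
real VMM would issue KVM_SET_MEMORY_ATTRIBUTES there and walk the Page
State Change structure as defined by the GHCB spec. The single-4K-page
size for the MSR protocol is an assumption of this sketch.

```c
#include <stdint.h>

/* Local mirror of the uapi layout in this series; illustration only. */
#define USER_VMGEXIT_PSC_MSR	1
#define USER_VMGEXIT_PSC	2
#define PSC_MSR_OP_PRIVATE	1
#define PSC_MSR_OP_SHARED	2

struct user_vmgexit {
	uint32_t type;
	union {
		struct {
			uint64_t gpa;
			uint8_t op;
			uint32_t ret;
		} psc_msr;
		struct {
			uint64_t shared_gpa;
			uint64_t ret;
		} psc;
	};
};

/* Hypothetical VMM hooks, not part of any real API. */
struct vmm_ops {
	/* Convert [gpa, gpa + size) to private/shared; 0 on success. */
	int (*set_private)(uint64_t gpa, uint64_t size, int private);
	/* Process the structure at shared_gpa; returns SW_EXITINFO2. */
	uint64_t (*process_psc)(uint64_t shared_gpa);
};

/* Demo stubs: reject unaligned GPAs, otherwise report success. */
static inline int demo_set_private(uint64_t gpa, uint64_t size, int private)
{
	(void)size;
	(void)private;
	return (gpa & 0xfff) ? -1 : 0;
}

static inline uint64_t demo_process_psc(uint64_t shared_gpa)
{
	(void)shared_gpa;
	return 0;
}

/* Returns 0 if the exit was serviced, -1 for an unknown type. */
static inline int handle_vmgexit(struct user_vmgexit *ex,
				 const struct vmm_ops *ops)
{
	switch (ex->type) {
	case USER_VMGEXIT_PSC_MSR: {
		int private = ex->psc_msr.op == PSC_MSR_OP_PRIVATE;
		int err = ops->set_private(ex->psc_msr.gpa, 4096, private);

		/* 0 on success, some non-zero value on failure. */
		ex->psc_msr.ret = err ? 1 : 0;
		return 0;
	}
	case USER_VMGEXIT_PSC:
		ex->psc.ret = ops->process_psc(ex->psc.shared_gpa);
		return 0;
	default:
		return -1;
	}
}
```

Keeping the per-type payloads in a union keyed by 'type' means new
VMGEXIT types can be added later without changing the handling of the
existing ones.
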
 Documentation/virt/kvm/api.rst | 14 ++++++++++++++
 arch/x86/kvm/svm/sev.c         | 16 ++++++++++++++++
 include/uapi/linux/kvm.h       |  5 +++++
 3 files changed, 35 insertions(+)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 4a7a2945bc78..85099198a10f 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -7065,6 +7065,7 @@ values in kvm_run even if the corresponding bit in kvm_dirty_regs is not set.
 		/* KVM_EXIT_VMGEXIT */
 		struct kvm_user_vmgexit {
 		#define KVM_USER_VMGEXIT_PSC_MSR	1
+		#define KVM_USER_VMGEXIT_PSC		2
 			__u32 type; /* KVM_USER_VMGEXIT_* type */
 			union {
 				struct {
@@ -7074,9 +7075,14 @@ values in kvm_run even if the corresponding bit in kvm_dirty_regs is not set.
 					__u8 op;
 					__u32 ret;
 				} psc_msr;
+				struct {
+					__u64 shared_gpa;
+					__u64 ret;
+				} psc;
 			};
 		};
 
+
 If exit reason is KVM_EXIT_VMGEXIT then it indicates that an SEV-SNP guest
 has issued a VMGEXIT instruction (as documented by the AMD Architecture
 Programmer's Manual (APM)) to the hypervisor that needs to be serviced by
@@ -7094,6 +7100,14 @@ update the private/shared state of the GPA using the corresponding
 KVM_SET_MEMORY_ATTRIBUTES ioctl. The 'ret' field is to be set to 0 by
 userspace on success, or some non-zero value on failure.
 
+For the KVM_USER_VMGEXIT_PSC type, the psc union type is used. The kernel
+will supply the GPA of the Page State Structure defined in the GHCB spec.
+Userspace will process this structure as defined by the GHCB, and issue
+KVM_SET_MEMORY_ATTRIBUTES ioctls to set the GPAs therein to the expected
+private/shared state. Userspace will return a value in 'ret' that is in
+agreement with the GHCB-defined return values that the guest will expect
+in the SW_EXITINFO2 field of the GHCB in response to these requests.
+
 6. Capabilities that can be enabled on vCPUs
 ============================================
 
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index e982468554cb..96e24a1e34e3 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -3266,6 +3266,7 @@ static int sev_es_validate_vmgexit(struct vcpu_svm *svm)
 	case SVM_VMGEXIT_AP_JUMP_TABLE:
 	case SVM_VMGEXIT_UNSUPPORTED_EVENT:
 	case SVM_VMGEXIT_HV_FEATURES:
+	case SVM_VMGEXIT_PSC:
 		break;
 	default:
 		reason = GHCB_ERR_INVALID_EVENT;
@@ -3484,6 +3485,15 @@ static int snp_begin_psc_msr(struct kvm_vcpu *vcpu, u64 ghcb_msr)
 	return 0; /* forward request to userspace */
 }
 
+static int snp_complete_psc(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_svm *svm = to_svm(vcpu);
+
+	ghcb_set_sw_exit_info_2(svm->sev_es.ghcb, vcpu->run->vmgexit.psc.ret);
+
+	return 1; /* resume guest */
+}
+
 static int sev_handle_vmgexit_msr_protocol(struct vcpu_svm *svm)
 {
 	struct vmcb_control_area *control = &svm->vmcb->control;
@@ -3721,6 +3731,12 @@ int sev_handle_vmgexit(struct kvm_vcpu *vcpu)
 
 		ret = 1;
 		break;
+	case SVM_VMGEXIT_PSC:
+		vcpu->run->exit_reason = KVM_EXIT_VMGEXIT;
+		vcpu->run->vmgexit.type = KVM_USER_VMGEXIT_PSC;
+		vcpu->run->vmgexit.psc.shared_gpa = svm->sev_es.sw_scratch;
+		vcpu->arch.complete_userspace_io = snp_complete_psc;
+		break;
 	case SVM_VMGEXIT_UNSUPPORTED_EVENT:
 		vcpu_unimpl(vcpu,
 			    "vmgexit: unsupported event - exit_info_1=%#llx, exit_info_2=%#llx\n",
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 54b81e46a9fa..e33c48bfbd67 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -137,6 +137,7 @@ struct kvm_xen_exit {
 
 struct kvm_user_vmgexit {
 #define KVM_USER_VMGEXIT_PSC_MSR	1
+#define KVM_USER_VMGEXIT_PSC		2
 	__u32 type; /* KVM_USER_VMGEXIT_* type */
 	union {
 		struct {
@@ -146,6 +147,10 @@ struct kvm_user_vmgexit {
 			__u8 op;
 			__u32 ret;
 		} psc_msr;
+		struct {
+			__u64 shared_gpa;
+			__u64 ret;
+		} psc;
 	};
 };
 
-- 
2.25.1



* [PATCH v13 15/26] KVM: SEV: Add support to handle RMP nested page faults
  2024-04-18 19:41 [PATCH v13 00/26] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support Michael Roth
                   ` (13 preceding siblings ...)
  2024-04-18 19:41 ` [PATCH v13 14/26] KVM: SEV: Add support to handle " Michael Roth
@ 2024-04-18 19:41 ` Michael Roth
  2024-04-18 19:41 ` [PATCH v13 16/26] KVM: SEV: Support SEV-SNP AP Creation NAE event Michael Roth
                   ` (11 subsequent siblings)
  26 siblings, 0 replies; 41+ messages in thread
From: Michael Roth @ 2024-04-18 19:41 UTC (permalink / raw)
  To: kvm
  Cc: linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta,
	liam.merwick, Brijesh Singh

From: Brijesh Singh <brijesh.singh@amd.com>

When SEV-SNP is enabled in the guest, the hardware places restrictions
on all memory accesses based on the contents of the RMP table. When the
hardware encounters an RMP check failure caused by a guest memory access,
it raises an #NPF. The error code contains additional information on
the access type. See the APM volume 2 for additional information.

When using gmem, RMP faults resulting from mismatches between the state
in the RMP table vs. what the guest expects via its page table result
in KVM_EXIT_MEMORY_FAULTs being forwarded to userspace to handle. This
means the only expected case that needs to be handled in the kernel is
when the page size of the entry in the RMP table is larger than the
mapping in the nested page table, in which case a PSMASH instruction
needs to be issued to split the large RMP entry into individual 4K
entries so that subsequent accesses can succeed.

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Co-developed-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
---
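To make the PSMASH handling easier to follow, here is a small
standalone sketch of the two pieces of logic in snp_rmptable_psmash():
rounding the PFN down to the 2M boundary that owns the RMP entry, and
retrying while the operation reports a concurrent update. The names
pfn_align_2m and psmash_retry, and the function-pointer stand-in for
psmash(), are assumptions made for illustration; only
PSMASH_FAIL_INUSE and its value come from this patch.

```c
#include <stdint.h>

#define PAGES_PER_2M		512ULL	/* 2M / 4K */
#define PSMASH_FAIL_INUSE	3	/* value defined by this patch */

/* Round a PFN down to the 2M-aligned PFN that owns its RMP entry. */
static inline uint64_t pfn_align_2m(uint64_t pfn)
{
	return pfn & ~(PAGES_PER_2M - 1);
}

/*
 * Retry a PSMASH-like operation for as long as it reports that another
 * CPU is modifying the entry. 'op' stands in for the real psmash().
 */
static inline int psmash_retry(int (*op)(uint64_t pfn), uint64_t pfn)
{
	int ret;

	pfn = pfn_align_2m(pfn);
	do {
		ret = op(pfn);
	} while (ret == PSMASH_FAIL_INUSE);

	return ret;
}
```

Any return value other than PSMASH_FAIL_INUSE ends the loop, so real
failures are still reported to the caller rather than retried.
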
 arch/x86/include/asm/kvm_host.h |   1 +
 arch/x86/include/asm/sev.h      |   3 +
 arch/x86/kvm/mmu.h              |   2 -
 arch/x86/kvm/mmu/mmu.c          |   1 +
 arch/x86/kvm/svm/sev.c          | 109 ++++++++++++++++++++++++++++++++
 arch/x86/kvm/svm/svm.c          |  21 ++++--
 arch/x86/kvm/svm/svm.h          |   3 +
 arch/x86/kvm/trace.h            |  31 +++++++++
 arch/x86/kvm/x86.c              |   1 +
 9 files changed, 166 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 744f8c920952..6f03e7649780 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1940,6 +1940,7 @@ void kvm_mmu_slot_leaf_clear_dirty(struct kvm *kvm,
 				   const struct kvm_memory_slot *memslot);
 void kvm_mmu_invalidate_mmio_sptes(struct kvm *kvm, u64 gen);
 void kvm_mmu_change_mmu_pages(struct kvm *kvm, unsigned long kvm_nr_mmu_pages);
+void kvm_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end);
 
 int load_pdptrs(struct kvm_vcpu *vcpu, unsigned long cr3);
 
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 780182cda3ab..234a998e2d2d 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -91,6 +91,9 @@ extern bool handle_vc_boot_ghcb(struct pt_regs *regs);
 /* RMUPDATE detected 4K page and 2MB page overlap. */
 #define RMPUPDATE_FAIL_OVERLAP		4
 
+/* PSMASH failed due to concurrent access by another CPU */
+#define PSMASH_FAIL_INUSE		3
+
 /* RMP page size */
 #define RMP_PG_SIZE_4K			0
 #define RMP_PG_SIZE_2M			1
diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index e8b620a85627..3317711540cd 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -252,8 +252,6 @@ static inline bool kvm_mmu_honors_guest_mtrrs(struct kvm *kvm)
 	return __kvm_mmu_honors_guest_mtrrs(kvm_arch_has_noncoherent_dma(kvm));
 }
 
-void kvm_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end);
-
 int kvm_arch_write_log_dirty(struct kvm_vcpu *vcpu);
 
 int kvm_mmu_post_init_vm(struct kvm *kvm);
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 748b9064567e..03b98c14cee1 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -6744,6 +6744,7 @@ static bool kvm_mmu_zap_collapsible_spte(struct kvm *kvm,
 
 	return need_tlb_flush;
 }
+EXPORT_SYMBOL_GPL(kvm_zap_gfn_range);
 
 static void kvm_rmap_zap_collapsible_sptes(struct kvm *kvm,
 					   const struct kvm_memory_slot *slot)
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 96e24a1e34e3..0f70b057bfb8 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -3455,6 +3455,23 @@ static void set_ghcb_msr(struct vcpu_svm *svm, u64 value)
 	svm->vmcb->control.ghcb_gpa = value;
 }
 
+static int snp_rmptable_psmash(kvm_pfn_t pfn)
+{
+	int ret;
+
+	pfn = pfn & ~(KVM_PAGES_PER_HPAGE(PG_LEVEL_2M) - 1);
+
+	/*
+	 * PSMASH_FAIL_INUSE indicates another processor is modifying the
+	 * entry, so retry until that's no longer the case.
+	 */
+	do {
+		ret = psmash(pfn);
+	} while (ret == PSMASH_FAIL_INUSE);
+
+	return ret;
+}
+
 static int snp_complete_psc_msr(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
@@ -4014,3 +4031,95 @@ struct page *snp_safe_alloc_page(struct kvm_vcpu *vcpu)
 
 	return p;
 }
+
+void sev_handle_rmp_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u64 error_code)
+{
+	struct kvm_memory_slot *slot;
+	struct kvm *kvm = vcpu->kvm;
+	int order, rmp_level, ret;
+	bool assigned;
+	kvm_pfn_t pfn;
+	gfn_t gfn;
+
+	gfn = gpa >> PAGE_SHIFT;
+
+	/*
+	 * The only time RMP faults occur for shared pages is when the guest is
+	 * triggering an RMP fault for an implicit page-state change from
+	 * shared->private. Implicit page-state changes are forwarded to
+	 * userspace via KVM_EXIT_MEMORY_FAULT events, however, so RMP faults
+	 * for shared pages should not end up here.
+	 */
+	if (!kvm_mem_is_private(kvm, gfn)) {
+		pr_warn_ratelimited("SEV: Unexpected RMP fault for non-private GPA 0x%llx\n",
+				    gpa);
+		return;
+	}
+
+	slot = gfn_to_memslot(kvm, gfn);
+	if (!kvm_slot_can_be_private(slot)) {
+		pr_warn_ratelimited("SEV: Unexpected RMP fault, non-private slot for GPA 0x%llx\n",
+				    gpa);
+		return;
+	}
+
+	ret = kvm_gmem_get_pfn(kvm, slot, gfn, &pfn, &order);
+	if (ret) {
+		pr_warn_ratelimited("SEV: Unexpected RMP fault, no backing page for private GPA 0x%llx\n",
+				    gpa);
+		return;
+	}
+
+	ret = snp_lookup_rmpentry(pfn, &assigned, &rmp_level);
+	if (ret || !assigned) {
+		pr_warn_ratelimited("SEV: Unexpected RMP fault, no assigned RMP entry found for GPA 0x%llx PFN 0x%llx error %d\n",
+				    gpa, pfn, ret);
+		goto out_no_trace;
+	}
+
+	/*
+	 * There are 2 cases where a PSMASH may be needed to resolve an #NPF
+	 * with PFERR_GUEST_RMP_BIT set:
+	 *
+	 * 1) RMPADJUST/PVALIDATE can trigger an #NPF with PFERR_GUEST_SIZEM
+	 *    bit set if the guest issues them with a smaller granularity than
+	 *    what is indicated by the page-size bit in the 2MB RMP entry for
+	 *    the PFN that backs the GPA.
+	 *
+	 * 2) Guest access via NPT can trigger an #NPF if the NPT mapping is
+	 *    smaller than what is indicated by the 2MB RMP entry for the PFN
+	 *    that backs the GPA.
+	 *
+	 * In both these cases, the corresponding 2M RMP entry needs to
+	 * be PSMASH'd to 512 4K RMP entries.  If the RMP entry is already
+	 * split into 4K RMP entries, then this is likely a spurious case which
+	 * can occur when there are concurrent accesses by the guest to a 2MB
+	 * GPA range that is backed by a 2MB-aligned PFN whose RMP entry is in
+	 * the process of being PSMASH'd into 4K entries. These cases should
+	 * resolve automatically on subsequent accesses, so just ignore them
+	 * here.
+	 */
+	if (rmp_level == PG_LEVEL_4K)
+		goto out;
+
+	ret = snp_rmptable_psmash(pfn);
+	if (ret) {
+		/*
+		 * Look it up again. If it's 4K now then the PSMASH may have
+		 * raced with another process and the issue has already resolved
+		 * itself.
+		 */
+		if (!snp_lookup_rmpentry(pfn, &assigned, &rmp_level) &&
+		    assigned && rmp_level == PG_LEVEL_4K)
+			goto out;
+
+		pr_warn_ratelimited("SEV: Unable to split RMP entry for GPA 0x%llx PFN 0x%llx ret %d\n",
+				    gpa, pfn, ret);
+	}
+
+	kvm_zap_gfn_range(kvm, gfn, gfn + PTRS_PER_PMD);
+out:
+	trace_kvm_rmp_fault(vcpu, gpa, pfn, error_code, rmp_level, ret);
+out_no_trace:
+	put_page(pfn_to_page(pfn));
+}
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index d31404953bf1..1cddf7a2aec1 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -2043,6 +2043,7 @@ static int pf_interception(struct kvm_vcpu *vcpu)
 static int npf_interception(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
+	int rc;
 
 	u64 fault_address = svm->vmcb->control.exit_info_2;
 	u64 error_code = svm->vmcb->control.exit_info_1;
@@ -2057,10 +2058,22 @@ static int npf_interception(struct kvm_vcpu *vcpu)
 		error_code &= ~PFERR_SYNTHETIC_MASK;
 
 	trace_kvm_page_fault(vcpu, fault_address, error_code);
-	return kvm_mmu_page_fault(vcpu, fault_address, error_code,
-			static_cpu_has(X86_FEATURE_DECODEASSISTS) ?
-			svm->vmcb->control.insn_bytes : NULL,
-			svm->vmcb->control.insn_len);
+	rc = kvm_mmu_page_fault(vcpu, fault_address, error_code,
+				static_cpu_has(X86_FEATURE_DECODEASSISTS) ?
+				svm->vmcb->control.insn_bytes : NULL,
+				svm->vmcb->control.insn_len);
+
+	/*
+	 * rc == 0 indicates a userspace exit is needed to handle page
+	 * transitions, so do that first before updating the RMP table.
+	 */
+	if (error_code & PFERR_GUEST_RMP_MASK) {
+		if (rc == 0)
+			return rc;
+		sev_handle_rmp_fault(vcpu, fault_address, error_code);
+	}
+
+	return rc;
 }
 
 static int db_interception(struct kvm_vcpu *vcpu)
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 730f5ced2a2e..d2b0ec27d4fe 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -722,6 +722,7 @@ void sev_hardware_unsetup(void);
 int sev_cpu_init(struct svm_cpu_data *sd);
 int sev_dev_get_attr(u32 group, u64 attr, u64 *val);
 extern unsigned int max_sev_asid;
+void sev_handle_rmp_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u64 error_code);
 #else
 static inline struct page *snp_safe_alloc_page(struct kvm_vcpu *vcpu) {
 	return alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO);
@@ -735,6 +736,8 @@ static inline void sev_hardware_unsetup(void) {}
 static inline int sev_cpu_init(struct svm_cpu_data *sd) { return 0; }
 static inline int sev_dev_get_attr(u32 group, u64 attr, u64 *val) { return -ENXIO; }
 #define max_sev_asid 0
+static inline void sev_handle_rmp_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u64 error_code) {}
+
 #endif
 
 /* vmenter.S */
diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h
index c6b4b1728006..3531a187d5d9 100644
--- a/arch/x86/kvm/trace.h
+++ b/arch/x86/kvm/trace.h
@@ -1834,6 +1834,37 @@ TRACE_EVENT(kvm_vmgexit_msr_protocol_exit,
 		  __entry->vcpu_id, __entry->ghcb_gpa, __entry->result)
 );
 
+/*
+ * Tracepoint for #NPFs due to RMP faults.
+ */
+TRACE_EVENT(kvm_rmp_fault,
+	TP_PROTO(struct kvm_vcpu *vcpu, u64 gpa, u64 pfn, u64 error_code,
+		 int rmp_level, int psmash_ret),
+	TP_ARGS(vcpu, gpa, pfn, error_code, rmp_level, psmash_ret),
+
+	TP_STRUCT__entry(
+		__field(unsigned int, vcpu_id)
+		__field(u64, gpa)
+		__field(u64, pfn)
+		__field(u64, error_code)
+		__field(int, rmp_level)
+		__field(int, psmash_ret)
+	),
+
+	TP_fast_assign(
+		__entry->vcpu_id	= vcpu->vcpu_id;
+		__entry->gpa		= gpa;
+		__entry->pfn		= pfn;
+		__entry->error_code	= error_code;
+		__entry->rmp_level	= rmp_level;
+		__entry->psmash_ret	= psmash_ret;
+	),
+
+	TP_printk("vcpu %u gpa %016llx pfn 0x%llx error_code 0x%llx rmp_level %d psmash_ret %d",
+		  __entry->vcpu_id, __entry->gpa, __entry->pfn,
+		  __entry->error_code, __entry->rmp_level, __entry->psmash_ret)
+);
+
 #endif /* _TRACE_KVM_H */
 
 #undef TRACE_INCLUDE_PATH
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 9923921904a2..a9d014961d2b 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -13996,6 +13996,7 @@ EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_vmgexit_enter);
 EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_vmgexit_exit);
 EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_vmgexit_msr_protocol_enter);
 EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_vmgexit_msr_protocol_exit);
+EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_rmp_fault);
 
 static int __init kvm_x86_init(void)
 {
-- 
2.25.1



* [PATCH v13 16/26] KVM: SEV: Support SEV-SNP AP Creation NAE event
  2024-04-18 19:41 [PATCH v13 00/26] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support Michael Roth
                   ` (14 preceding siblings ...)
  2024-04-18 19:41 ` [PATCH v13 15/26] KVM: SEV: Add support to handle RMP nested page faults Michael Roth
@ 2024-04-18 19:41 ` Michael Roth
  2024-04-19 12:01   ` Paolo Bonzini
  2024-04-18 19:41 ` [PATCH v13 17/26] KVM: SEV: Add support for GHCB-based termination requests Michael Roth
                   ` (10 subsequent siblings)
  26 siblings, 1 reply; 41+ messages in thread
From: Michael Roth @ 2024-04-18 19:41 UTC (permalink / raw)
  To: kvm
  Cc: linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta,
	liam.merwick, Brijesh Singh

From: Tom Lendacky <thomas.lendacky@amd.com>

Add support for the SEV-SNP AP Creation NAE event. This allows SEV-SNP
guests to alter the register state of the APs on their own, giving the
guest a way of simulating INIT-SIPI.

A new event, KVM_REQ_UPDATE_PROTECTED_GUEST_STATE, is created and used
so as to avoid updating the VMSA pointer while the vCPU is running.

For CREATE:
  The guest supplies the GPA of the VMSA to be used for the vCPU with
  the specified APIC ID. The GPA is saved in the svm struct of the
  target vCPU, the KVM_REQ_UPDATE_PROTECTED_GUEST_STATE event is added
  to the vCPU and then the vCPU is kicked.

For CREATE_ON_INIT:
  The guest supplies the GPA of the VMSA to be used for the vCPU with
  the specified APIC ID the next time an INIT is performed. The GPA is
  saved in the svm struct of the target vCPU.

For DESTROY:
  The guest indicates it wishes to stop the vCPU. The GPA is cleared
  from the svm struct, the KVM_REQ_UPDATE_PROTECTED_GUEST_STATE event is
  added to the vCPU and then the vCPU is kicked.

The KVM_REQ_UPDATE_PROTECTED_GUEST_STATE event handler will be invoked
as a result of the event or as a result of an INIT. If a new VMSA is to
be installed, the VMSA guest page is set as the VMSA in the vCPU VMCB
and the vCPU state is set to KVM_MP_STATE_RUNNABLE. If a new VMSA is not
to be installed, the VMSA is cleared in the vCPU VMCB and the vCPU state
is set to KVM_MP_STATE_HALTED to prevent it from being run.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Co-developed-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
---
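For reference, here is a minimal standalone sketch of how the request
fields are decoded and how the interrupt-injection-mode check works in
sev_snp_ap_creation(). The helper names are hypothetical; the bit
positions for the two injection modes are the ones added to asm/svm.h
by this patch, and the exit_info_1 split follows the
lower_32_bits()/upper_32_bits() usage below.

```c
#include <stdbool.h>
#include <stdint.h>

#define SEV_FEAT_RESTRICTED_INJECTION	(1ULL << 3)
#define SEV_FEAT_ALTERNATE_INJECTION	(1ULL << 4)
#define SEV_FEAT_INT_INJ_MODES \
	(SEV_FEAT_RESTRICTED_INJECTION | SEV_FEAT_ALTERNATE_INJECTION)

/* Lower 32 bits of exit_info_1 carry the request type. */
static inline uint32_t ap_request(uint64_t exit_info_1)
{
	return (uint32_t)exit_info_1;
}

/* Upper 32 bits of exit_info_1 carry the target APIC ID. */
static inline uint32_t ap_apic_id(uint64_t exit_info_1)
{
	return (uint32_t)(exit_info_1 >> 32);
}

/*
 * True if the SEV features requested for the AP (passed in RAX) select
 * a different interrupt injection mode than the VM's vmsa_features.
 * XOR leaves only the bits that differ; other feature bits are ignored.
 */
static inline bool ap_inj_mode_changed(uint64_t requested, uint64_t vm_feat)
{
	return ((requested ^ vm_feat) & SEV_FEAT_INT_INJ_MODES) != 0;
}
```

Because only the two injection-mode bits are masked, differences in
other feature bits (for example DEBUG_SWAP) do not cause the request to
be rejected by this particular check.
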
 arch/x86/include/asm/kvm_host.h |   1 +
 arch/x86/include/asm/svm.h      |   6 +
 arch/x86/kvm/svm/sev.c          | 229 +++++++++++++++++++++++++++++++-
 arch/x86/kvm/svm/svm.c          |  11 +-
 arch/x86/kvm/svm/svm.h          |   9 ++
 arch/x86/kvm/x86.c              |  11 ++
 6 files changed, 264 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 6f03e7649780..9943e989fadb 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -121,6 +121,7 @@
 	KVM_ARCH_REQ_FLAGS(31, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
 #define KVM_REQ_HV_TLB_FLUSH \
 	KVM_ARCH_REQ_FLAGS(32, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
+#define KVM_REQ_UPDATE_PROTECTED_GUEST_STATE	KVM_ARCH_REQ(34)
 
 #define CR0_RESERVED_BITS                                               \
 	(~(unsigned long)(X86_CR0_PE | X86_CR0_MP | X86_CR0_EM | X86_CR0_TS \
diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
index 544a43c1cf11..f0dea3750ca9 100644
--- a/arch/x86/include/asm/svm.h
+++ b/arch/x86/include/asm/svm.h
@@ -286,8 +286,14 @@ static_assert((X2AVIC_MAX_PHYSICAL_ID & AVIC_PHYSICAL_MAX_INDEX_MASK) == X2AVIC_
 #define AVIC_HPA_MASK	~((0xFFFULL << 52) | 0xFFF)
 
 #define SVM_SEV_FEAT_SNP_ACTIVE				BIT(0)
+#define SVM_SEV_FEAT_RESTRICTED_INJECTION		BIT(3)
+#define SVM_SEV_FEAT_ALTERNATE_INJECTION		BIT(4)
 #define SVM_SEV_FEAT_DEBUG_SWAP				BIT(5)
 
+#define SVM_SEV_FEAT_INT_INJ_MODES		\
+	(SVM_SEV_FEAT_RESTRICTED_INJECTION |	\
+	 SVM_SEV_FEAT_ALTERNATE_INJECTION)
+
 struct vmcb_seg {
 	u16 selector;
 	u16 attrib;
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 0f70b057bfb8..2de3006fec65 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -37,7 +37,7 @@
 #define GHCB_VERSION_MAX	2ULL
 #define GHCB_VERSION_MIN	1ULL
 
-#define GHCB_HV_FT_SUPPORTED	GHCB_HV_FT_SNP
+#define GHCB_HV_FT_SUPPORTED	(GHCB_HV_FT_SNP | GHCB_HV_FT_SNP_AP_CREATION)
 
 /* enable/disable SEV support */
 static bool sev_enabled = true;
@@ -3261,6 +3261,11 @@ static int sev_es_validate_vmgexit(struct vcpu_svm *svm)
 		if (!kvm_ghcb_sw_scratch_is_valid(svm))
 			goto vmgexit_err;
 		break;
+	case SVM_VMGEXIT_AP_CREATION:
+		if (lower_32_bits(control->exit_info_1) != SVM_VMGEXIT_AP_DESTROY)
+			if (!kvm_ghcb_rax_is_valid(svm))
+				goto vmgexit_err;
+		break;
 	case SVM_VMGEXIT_NMI_COMPLETE:
 	case SVM_VMGEXIT_AP_HLT_LOOP:
 	case SVM_VMGEXIT_AP_JUMP_TABLE:
@@ -3511,6 +3516,205 @@ static int snp_complete_psc(struct kvm_vcpu *vcpu)
 	return 1; /* resume guest */
 }
 
+static int __sev_snp_update_protected_guest_state(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_svm *svm = to_svm(vcpu);
+
+	WARN_ON(!mutex_is_locked(&svm->sev_es.snp_vmsa_mutex));
+
+	/* Mark the vCPU as offline and not runnable */
+	vcpu->arch.pv.pv_unhalted = false;
+	vcpu->arch.mp_state = KVM_MP_STATE_HALTED;
+
+	/* Clear use of the VMSA */
+	svm->vmcb->control.vmsa_pa = INVALID_PAGE;
+
+	if (VALID_PAGE(svm->sev_es.snp_vmsa_gpa)) {
+		gfn_t gfn = gpa_to_gfn(svm->sev_es.snp_vmsa_gpa);
+		struct kvm_memory_slot *slot;
+		kvm_pfn_t pfn;
+
+		slot = gfn_to_memslot(vcpu->kvm, gfn);
+		if (!slot)
+			return -EINVAL;
+
+		/*
+		 * The new VMSA will be private guest memory, so
+		 * retrieve the PFN from the gmem backend.
+		 */
+		if (kvm_gmem_get_pfn(vcpu->kvm, slot, gfn, &pfn, NULL))
+			return -EINVAL;
+
+		/*
+		 * From this point forward, the VMSA will always be a
+		 * guest-mapped page rather than the initial one allocated
+		 * by KVM in svm->sev_es.vmsa. In theory, svm->sev_es.vmsa
+		 * could be free'd and cleaned up here, but that involves
+		 * cleanups like wbinvd_on_all_cpus() which would ideally
+		 * be handled during teardown rather than guest boot.
+		 * Deferring that also allows the existing logic for SEV-ES
+		 * VMSAs to be re-used with minimal SNP-specific changes.
+		 */
+		svm->sev_es.snp_has_guest_vmsa = true;
+
+		/* Use the new VMSA */
+		svm->vmcb->control.vmsa_pa = pfn_to_hpa(pfn);
+
+		/* Mark the vCPU as runnable */
+		vcpu->arch.pv.pv_unhalted = false;
+		vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
+
+		svm->sev_es.snp_vmsa_gpa = INVALID_PAGE;
+
+		/*
+		 * gmem pages aren't currently migratable, but if this ever
+		 * changes then care should be taken to ensure
+		 * svm->sev_es.vmsa is pinned through some other means.
+		 */
+		kvm_release_pfn_clean(pfn);
+	}
+
+	/*
+	 * When replacing the VMSA during SEV-SNP AP creation,
+	 * mark the VMCB dirty so that full state is always reloaded.
+	 */
+	vmcb_mark_all_dirty(svm->vmcb);
+
+	return 0;
+}
+
+/*
+ * Invoked as part of svm_vcpu_reset() processing of an init event.
+ */
+void sev_snp_init_protected_guest_state(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_svm *svm = to_svm(vcpu);
+	int ret;
+
+	if (!sev_snp_guest(vcpu->kvm))
+		return;
+
+	mutex_lock(&svm->sev_es.snp_vmsa_mutex);
+
+	if (!svm->sev_es.snp_ap_waiting_for_reset)
+		goto unlock;
+
+	svm->sev_es.snp_ap_waiting_for_reset = false;
+
+	ret = __sev_snp_update_protected_guest_state(vcpu);
+	if (ret)
+		vcpu_unimpl(vcpu, "snp: AP state update on init failed\n");
+
+unlock:
+	mutex_unlock(&svm->sev_es.snp_vmsa_mutex);
+}
+
+static int sev_snp_ap_creation(struct vcpu_svm *svm)
+{
+	struct kvm_sev_info *sev = &to_kvm_svm(svm->vcpu.kvm)->sev_info;
+	struct kvm_vcpu *vcpu = &svm->vcpu;
+	struct kvm_vcpu *target_vcpu;
+	struct vcpu_svm *target_svm;
+	unsigned int request;
+	unsigned int apic_id;
+	bool kick;
+	int ret;
+
+	request = lower_32_bits(svm->vmcb->control.exit_info_1);
+	apic_id = upper_32_bits(svm->vmcb->control.exit_info_1);
+
+	/* Validate the APIC ID */
+	target_vcpu = kvm_get_vcpu_by_id(vcpu->kvm, apic_id);
+	if (!target_vcpu) {
+		vcpu_unimpl(vcpu, "vmgexit: invalid AP APIC ID [%#x] from guest\n",
+			    apic_id);
+		return -EINVAL;
+	}
+
+	ret = 0;
+
+	target_svm = to_svm(target_vcpu);
+
+	/*
+	 * The target vCPU is valid, so the vCPU will be kicked unless the
+	 * request is for CREATE_ON_INIT. For any errors at this stage, the
+	 * kick will place the vCPU in a non-runnable state.
+	 */
+	kick = true;
+
+	mutex_lock(&target_svm->sev_es.snp_vmsa_mutex);
+
+	target_svm->sev_es.snp_vmsa_gpa = INVALID_PAGE;
+	target_svm->sev_es.snp_ap_waiting_for_reset = true;
+
+	/* Interrupt injection mode shouldn't change for AP creation */
+	if (request < SVM_VMGEXIT_AP_DESTROY) {
+		u64 sev_features;
+
+		sev_features = vcpu->arch.regs[VCPU_REGS_RAX];
+		sev_features ^= sev->vmsa_features;
+
+		if (sev_features & SVM_SEV_FEAT_INT_INJ_MODES) {
+			vcpu_unimpl(vcpu, "vmgexit: invalid AP injection mode [%#lx] from guest\n",
+				    vcpu->arch.regs[VCPU_REGS_RAX]);
+			ret = -EINVAL;
+			goto out;
+		}
+	}
+
+	switch (request) {
+	case SVM_VMGEXIT_AP_CREATE_ON_INIT:
+		kick = false;
+		fallthrough;
+	case SVM_VMGEXIT_AP_CREATE:
+		if (!page_address_valid(vcpu, svm->vmcb->control.exit_info_2)) {
+			vcpu_unimpl(vcpu, "vmgexit: invalid AP VMSA address [%#llx] from guest\n",
+				    svm->vmcb->control.exit_info_2);
+			ret = -EINVAL;
+			goto out;
+		}
+
+		/*
+		 * A malicious guest can RMPADJUST a large page into a VMSA,
+		 * which hits the SNP erratum where the CPU incorrectly signals
+		 * an RMP violation #PF if a hugepage collides with the RMP
+		 * entry of the VMSA page. Reject the AP CREATE request if the
+		 * VMSA address from the guest is 2M aligned.
+		 */
+		if (IS_ALIGNED(svm->vmcb->control.exit_info_2, PMD_SIZE)) {
+			vcpu_unimpl(vcpu,
+				    "vmgexit: AP VMSA address [%llx] from guest is unsafe as it is 2M aligned\n",
+				    svm->vmcb->control.exit_info_2);
+			ret = -EINVAL;
+			goto out;
+		}
+
+		target_svm->sev_es.snp_vmsa_gpa = svm->vmcb->control.exit_info_2;
+		break;
+	case SVM_VMGEXIT_AP_DESTROY:
+		break;
+	default:
+		vcpu_unimpl(vcpu, "vmgexit: invalid AP creation request [%#x] from guest\n",
+			    request);
+		ret = -EINVAL;
+		break;
+	}
+
+out:
+	if (kick) {
+		kvm_make_request(KVM_REQ_UPDATE_PROTECTED_GUEST_STATE, target_vcpu);
+
+		if (target_vcpu->arch.mp_state == KVM_MP_STATE_UNINITIALIZED)
+			kvm_make_request(KVM_REQ_UNBLOCK, target_vcpu);
+
+		kvm_vcpu_kick(target_vcpu);
+	}
+
+	mutex_unlock(&target_svm->sev_es.snp_vmsa_mutex);
+
+	return ret;
+}
+
 static int sev_handle_vmgexit_msr_protocol(struct vcpu_svm *svm)
 {
 	struct vmcb_control_area *control = &svm->vmcb->control;
@@ -3754,6 +3958,15 @@ int sev_handle_vmgexit(struct kvm_vcpu *vcpu)
 		vcpu->run->vmgexit.psc.shared_gpa = svm->sev_es.sw_scratch;
 		vcpu->arch.complete_userspace_io = snp_complete_psc;
 		break;
+	case SVM_VMGEXIT_AP_CREATION:
+		ret = sev_snp_ap_creation(svm);
+		if (ret) {
+			ghcb_set_sw_exit_info_1(svm->sev_es.ghcb, 2);
+			ghcb_set_sw_exit_info_2(svm->sev_es.ghcb, GHCB_ERR_INVALID_INPUT);
+		}
+
+		ret = 1;
+		break;
 	case SVM_VMGEXIT_UNSUPPORTED_EVENT:
 		vcpu_unimpl(vcpu,
 			    "vmgexit: unsupported event - exit_info_1=%#llx, exit_info_2=%#llx\n",
@@ -3848,7 +4061,7 @@ static void sev_es_init_vmcb(struct vcpu_svm *svm)
 	 * the VMSA will be NULL if this vCPU is the destination for intrahost
 	 * migration, and will be copied later.
 	 */
-	if (svm->sev_es.vmsa)
+	if (!svm->sev_es.snp_has_guest_vmsa)
 		svm->vmcb->control.vmsa_pa = __pa(svm->sev_es.vmsa);
 
 	/* Can't intercept CR register access, HV can't modify CR registers */
@@ -3921,6 +4134,8 @@ void sev_es_vcpu_reset(struct vcpu_svm *svm)
 	set_ghcb_msr(svm, GHCB_MSR_SEV_INFO(GHCB_VERSION_MAX,
 					    GHCB_VERSION_MIN,
 					    sev_enc_bit));
+
+	mutex_init(&svm->sev_es.snp_vmsa_mutex);
 }
 
 void sev_es_prepare_switch_to_guest(struct vcpu_svm *svm, struct sev_es_save_area *hostsa)
@@ -4032,6 +4247,16 @@ struct page *snp_safe_alloc_page(struct kvm_vcpu *vcpu)
 	return p;
 }
 
+void sev_vcpu_unblocking(struct kvm_vcpu *vcpu)
+{
+	if (!sev_snp_guest(vcpu->kvm))
+		return;
+
+	if (kvm_test_request(KVM_REQ_UPDATE_PROTECTED_GUEST_STATE, vcpu) &&
+	    vcpu->arch.mp_state == KVM_MP_STATE_UNINITIALIZED)
+		vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
+}
+
 void sev_handle_rmp_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u64 error_code)
 {
 	struct kvm_memory_slot *slot;
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 1cddf7a2aec1..9dc929316c5d 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -1398,6 +1398,9 @@ static void svm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
 	svm->spec_ctrl = 0;
 	svm->virt_spec_ctrl = 0;
 
+	if (init_event)
+		sev_snp_init_protected_guest_state(vcpu);
+
 	init_vmcb(vcpu);
 
 	if (!init_event)
@@ -4939,6 +4942,12 @@ static void *svm_alloc_apic_backing_page(struct kvm_vcpu *vcpu)
 	return page_address(page);
 }
 
+static void svm_vcpu_unblocking(struct kvm_vcpu *vcpu)
+{
+	sev_vcpu_unblocking(vcpu);
+	avic_vcpu_unblocking(vcpu);
+}
+
 static struct kvm_x86_ops svm_x86_ops __initdata = {
 	.name = KBUILD_MODNAME,
 
@@ -4961,7 +4970,7 @@ static struct kvm_x86_ops svm_x86_ops __initdata = {
 	.vcpu_load = svm_vcpu_load,
 	.vcpu_put = svm_vcpu_put,
 	.vcpu_blocking = avic_vcpu_blocking,
-	.vcpu_unblocking = avic_vcpu_unblocking,
+	.vcpu_unblocking = svm_vcpu_unblocking,
 
 	.update_exception_bitmap = svm_update_exception_bitmap,
 	.get_msr_feature = svm_get_msr_feature,
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index d2b0ec27d4fe..81e335dca281 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -210,6 +210,11 @@ struct vcpu_sev_es_state {
 	bool ghcb_sa_free;
 
 	u64 ghcb_registered_gpa;
+
+	struct mutex snp_vmsa_mutex; /* Used to handle concurrent updates of VMSA. */
+	gpa_t snp_vmsa_gpa;
+	bool snp_ap_waiting_for_reset;
+	bool snp_has_guest_vmsa;
 };
 
 struct vcpu_svm {
@@ -723,6 +728,8 @@ int sev_cpu_init(struct svm_cpu_data *sd);
 int sev_dev_get_attr(u32 group, u64 attr, u64 *val);
 extern unsigned int max_sev_asid;
 void sev_handle_rmp_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u64 error_code);
+void sev_vcpu_unblocking(struct kvm_vcpu *vcpu);
+void sev_snp_init_protected_guest_state(struct kvm_vcpu *vcpu);
 #else
 static inline struct page *snp_safe_alloc_page(struct kvm_vcpu *vcpu) {
 	return alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO);
@@ -737,6 +744,8 @@ static inline int sev_cpu_init(struct svm_cpu_data *sd) { return 0; }
 static inline int sev_dev_get_attr(u32 group, u64 attr, u64 *val) { return -ENXIO; }
 #define max_sev_asid 0
 static inline void sev_handle_rmp_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u64 error_code) {}
+static inline void sev_vcpu_unblocking(struct kvm_vcpu *vcpu) {}
+static inline void sev_snp_init_protected_guest_state(struct kvm_vcpu *vcpu) {}
 
 #endif
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a9d014961d2b..436078b9e5aa 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -10938,6 +10938,14 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 
 		if (kvm_check_request(KVM_REQ_UPDATE_CPU_DIRTY_LOGGING, vcpu))
 			static_call(kvm_x86_update_cpu_dirty_logging)(vcpu);
+
+		if (kvm_check_request(KVM_REQ_UPDATE_PROTECTED_GUEST_STATE, vcpu)) {
+			kvm_vcpu_reset(vcpu, true);
+			if (vcpu->arch.mp_state != KVM_MP_STATE_RUNNABLE) {
+				r = 1;
+				goto out;
+			}
+		}
 	}
 
 	if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win ||
@@ -13145,6 +13153,9 @@ static inline bool kvm_vcpu_has_events(struct kvm_vcpu *vcpu)
 	if (kvm_test_request(KVM_REQ_PMI, vcpu))
 		return true;
 
+	if (kvm_test_request(KVM_REQ_UPDATE_PROTECTED_GUEST_STATE, vcpu))
+		return true;
+
 	if (kvm_arch_interrupt_allowed(vcpu) &&
 	    (kvm_cpu_has_interrupt(vcpu) ||
 	    kvm_guest_apic_has_interrupt(vcpu)))
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v13 17/26] KVM: SEV: Add support for GHCB-based termination requests
  2024-04-18 19:41 [PATCH v13 00/26] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support Michael Roth
                   ` (15 preceding siblings ...)
  2024-04-18 19:41 ` [PATCH v13 16/26] KVM: SEV: Support SEV-SNP AP Creation NAE event Michael Roth
@ 2024-04-18 19:41 ` Michael Roth
  2024-04-18 19:41 ` [PATCH v13 18/26] KVM: SEV: Implement gmem hook for initializing private pages Michael Roth
                   ` (9 subsequent siblings)
  26 siblings, 0 replies; 41+ messages in thread
From: Michael Roth @ 2024-04-18 19:41 UTC (permalink / raw)
  To: kvm
  Cc: linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta,
	liam.merwick

GHCB version 2 adds support for a GHCB-based termination request that
a guest can issue when it reaches an error state and wishes to inform
the hypervisor that it should be terminated. Implement support for that
similarly to GHCB MSR-based termination requests that are already
available to SEV-ES guests via earlier versions of the GHCB protocol.

See 'Termination Request' in the 'Invoking VMGEXIT' section of the GHCB
specification for more details.

Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 arch/x86/kvm/svm/sev.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 2de3006fec65..2e0e825b6436 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -3272,6 +3272,7 @@ static int sev_es_validate_vmgexit(struct vcpu_svm *svm)
 	case SVM_VMGEXIT_UNSUPPORTED_EVENT:
 	case SVM_VMGEXIT_HV_FEATURES:
 	case SVM_VMGEXIT_PSC:
+	case SVM_VMGEXIT_TERM_REQUEST:
 		break;
 	default:
 		reason = GHCB_ERR_INVALID_EVENT;
@@ -3967,6 +3968,14 @@ int sev_handle_vmgexit(struct kvm_vcpu *vcpu)
 
 		ret = 1;
 		break;
+	case SVM_VMGEXIT_TERM_REQUEST:
+		pr_info("SEV-ES guest requested termination: reason %#llx info %#llx\n",
+			control->exit_info_1, control->exit_info_2);
+		vcpu->run->exit_reason = KVM_EXIT_SYSTEM_EVENT;
+		vcpu->run->system_event.type = KVM_SYSTEM_EVENT_SEV_TERM;
+		vcpu->run->system_event.ndata = 1;
+		vcpu->run->system_event.data[0] = control->ghcb_gpa;
+		break;
 	case SVM_VMGEXIT_UNSUPPORTED_EVENT:
 		vcpu_unimpl(vcpu,
 			    "vmgexit: unsupported event - exit_info_1=%#llx, exit_info_2=%#llx\n",
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v13 18/26] KVM: SEV: Implement gmem hook for initializing private pages
  2024-04-18 19:41 [PATCH v13 00/26] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support Michael Roth
                   ` (16 preceding siblings ...)
  2024-04-18 19:41 ` [PATCH v13 17/26] KVM: SEV: Add support for GHCB-based termination requests Michael Roth
@ 2024-04-18 19:41 ` Michael Roth
  2024-04-18 19:41 ` [PATCH v13 19/26] KVM: SEV: Implement gmem hook for invalidating " Michael Roth
                   ` (8 subsequent siblings)
  26 siblings, 0 replies; 41+ messages in thread
From: Michael Roth @ 2024-04-18 19:41 UTC (permalink / raw)
  To: kvm
  Cc: linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta,
	liam.merwick

Implement the gmem_prepare hook for SEV-SNP. It performs the RMP table
updates needed to put a page into a private state before the page is
mapped into an SEV-SNP guest.
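
As a rough illustration of the 2M-vs-4K decision made by the hook, the
following userspace sketch models the check that the whole 2M-aligned
range is still shared before a single 2M RMP entry is used. The toy_*
names and the boolean "assigned" array are hypothetical stand-ins for
the RMP and snp_lookup_rmpentry(); this is not the kernel code itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PFNS_PER_2M 512u	/* 4K pages per 2M region (PTRS_PER_PMD on x86-64) */

/* Hypothetical stand-in for the RMP: one "assigned" flag per 4K PFN. */
struct toy_rmp {
	const bool *assigned;
	uint64_t nr_pfns;
};

/* Round a PFN down to the start of its 2M-aligned region. */
static uint64_t toy_align_down_2m(uint64_t pfn)
{
	return pfn & ~(uint64_t)(PFNS_PER_2M - 1);
}

/* True if no PFN in [start, end) is assigned, i.e. the range is shared. */
static bool toy_range_shared(const struct toy_rmp *rmp, uint64_t start,
			     uint64_t end)
{
	assert(rmp && rmp->assigned);

	if (end > rmp->nr_pfns)
		return false;

	for (uint64_t pfn = start; pfn < end; pfn++)
		if (rmp->assigned[pfn])
			return false;

	return true;
}

/*
 * A single 2M RMP entry is only usable when the backing folio spans at
 * least 2M (order >= 9) and every 4K page in the 2M-aligned range
 * containing @pfn is currently shared.
 */
static bool toy_large_rmp_possible(const struct toy_rmp *rmp, uint64_t pfn,
				   int order)
{
	uint64_t base = toy_align_down_2m(pfn);

	return order >= 9 && toy_range_shared(rmp, base, base + PFNS_PER_2M);
}
```

In this model, one already-assigned 4K page anywhere in the 2M range is
enough to force the 4K path for the page being prepared.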

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 arch/x86/kvm/Kconfig   |  1 +
 arch/x86/kvm/svm/sev.c | 98 ++++++++++++++++++++++++++++++++++++++++++
 arch/x86/kvm/svm/svm.c |  2 +
 arch/x86/kvm/svm/svm.h |  5 +++
 arch/x86/kvm/x86.c     |  5 +++
 virt/kvm/guest_memfd.c |  4 +-
 6 files changed, 113 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index 5e72faca4e8f..10768f13b240 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -137,6 +137,7 @@ config KVM_AMD_SEV
 	depends on CRYPTO_DEV_SP_PSP && !(KVM_AMD=y && CRYPTO_DEV_CCP_DD=m)
 	select ARCH_HAS_CC_PLATFORM
 	select KVM_GENERIC_PRIVATE_MEM
+	select HAVE_KVM_GMEM_PREPARE
 	help
 	  Provides support for launching Encrypted VMs (SEV) and Encrypted VMs
 	  with Encrypted State (SEV-ES) on AMD processors.
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 2e0e825b6436..29f6e8dc29c8 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -4357,3 +4357,101 @@ void sev_handle_rmp_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u64 error_code)
 out_no_trace:
 	put_page(pfn_to_page(pfn));
 }
+
+static bool is_pfn_range_shared(kvm_pfn_t start, kvm_pfn_t end)
+{
+	kvm_pfn_t pfn = start;
+
+	while (pfn < end) {
+		int ret, rmp_level;
+		bool assigned;
+
+		ret = snp_lookup_rmpentry(pfn, &assigned, &rmp_level);
+		if (ret) {
+			pr_warn_ratelimited("SEV: Failed to retrieve RMP entry: PFN 0x%llx PFN start 0x%llx PFN end 0x%llx RMP level %d error %d\n",
+					    pfn, start, end, rmp_level, ret);
+			return false;
+		}
+
+		if (assigned) {
+			pr_debug("%s: overlap detected, PFN 0x%llx start 0x%llx end 0x%llx RMP level %d\n",
+				 __func__, pfn, start, end, rmp_level);
+			return false;
+		}
+
+		pfn++;
+	}
+
+	return true;
+}
+
+static u8 max_level_for_order(int order)
+{
+	if (order >= KVM_HPAGE_GFN_SHIFT(PG_LEVEL_2M))
+		return PG_LEVEL_2M;
+
+	return PG_LEVEL_4K;
+}
+
+static bool is_large_rmp_possible(struct kvm *kvm, kvm_pfn_t pfn, int order)
+{
+	kvm_pfn_t pfn_aligned = ALIGN_DOWN(pfn, PTRS_PER_PMD);
+
+	/*
+	 * If this is a large folio, and the entire 2M range containing the
+	 * PFN is currently shared, then the entire 2M-aligned range can be
+	 * set to private via a single 2M RMP entry.
+	 */
+	if (max_level_for_order(order) > PG_LEVEL_4K &&
+	    is_pfn_range_shared(pfn_aligned, pfn_aligned + PTRS_PER_PMD))
+		return true;
+
+	return false;
+}
+
+int sev_gmem_prepare(struct kvm *kvm, kvm_pfn_t pfn, gfn_t gfn, int max_order)
+{
+	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
+	kvm_pfn_t pfn_aligned;
+	gfn_t gfn_aligned;
+	int level, rc;
+	bool assigned;
+
+	if (!sev_snp_guest(kvm))
+		return 0;
+
+	rc = snp_lookup_rmpentry(pfn, &assigned, &level);
+	if (rc) {
+		pr_err_ratelimited("SEV: Failed to look up RMP entry: GFN %llx PFN %llx error %d\n",
+				   gfn, pfn, rc);
+		return -ENOENT;
+	}
+
+	if (assigned) {
+		pr_debug("%s: already assigned: gfn %llx pfn %llx max_order %d level %d\n",
+			 __func__, gfn, pfn, max_order, level);
+		return 0;
+	}
+
+	if (is_large_rmp_possible(kvm, pfn, max_order)) {
+		level = PG_LEVEL_2M;
+		pfn_aligned = ALIGN_DOWN(pfn, PTRS_PER_PMD);
+		gfn_aligned = ALIGN_DOWN(gfn, PTRS_PER_PMD);
+	} else {
+		level = PG_LEVEL_4K;
+		pfn_aligned = pfn;
+		gfn_aligned = gfn;
+	}
+
+	rc = rmp_make_private(pfn_aligned, gfn_to_gpa(gfn_aligned), level, sev->asid, false);
+	if (rc) {
+		pr_err_ratelimited("SEV: Failed to update RMP entry: GFN %llx PFN %llx level %d error %d\n",
+				   gfn, pfn, level, rc);
+		return -EINVAL;
+	}
+
+	pr_debug("%s: updated: gfn %llx pfn %llx pfn_aligned %llx max_order %d level %d\n",
+		 __func__, gfn, pfn, pfn_aligned, max_order, level);
+
+	return 0;
+}
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 9dc929316c5d..3e8d0752bf1b 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -5080,6 +5080,8 @@ static struct kvm_x86_ops svm_x86_ops __initdata = {
 	.vcpu_deliver_sipi_vector = svm_vcpu_deliver_sipi_vector,
 	.vcpu_get_apicv_inhibit_reasons = avic_vcpu_get_apicv_inhibit_reasons,
 	.alloc_apic_backing_page = svm_alloc_apic_backing_page,
+
+	.gmem_prepare = sev_gmem_prepare,
 };
 
 /*
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 81e335dca281..7712ed90aae8 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -730,6 +730,7 @@ extern unsigned int max_sev_asid;
 void sev_handle_rmp_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u64 error_code);
 void sev_vcpu_unblocking(struct kvm_vcpu *vcpu);
 void sev_snp_init_protected_guest_state(struct kvm_vcpu *vcpu);
+int sev_gmem_prepare(struct kvm *kvm, kvm_pfn_t pfn, gfn_t gfn, int max_order);
 #else
 static inline struct page *snp_safe_alloc_page(struct kvm_vcpu *vcpu) {
 	return alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO);
@@ -746,6 +747,10 @@ static inline int sev_dev_get_attr(u32 group, u64 attr, u64 *val) { return -ENXI
 static inline void sev_handle_rmp_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u64 error_code) {}
 static inline void sev_vcpu_unblocking(struct kvm_vcpu *vcpu) {}
 static inline void sev_snp_init_protected_guest_state(struct kvm_vcpu *vcpu) {}
+static inline int sev_gmem_prepare(struct kvm *kvm, kvm_pfn_t pfn, gfn_t gfn, int max_order)
+{
+	return 0;
+}
 
 #endif
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 436078b9e5aa..2e911dc0a991 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -13610,6 +13610,11 @@ bool kvm_arch_no_poll(struct kvm_vcpu *vcpu)
 EXPORT_SYMBOL_GPL(kvm_arch_no_poll);
 
 #ifdef CONFIG_HAVE_KVM_GMEM_PREPARE
+bool kvm_arch_gmem_prepare_needed(struct kvm *kvm)
+{
+	return kvm->arch.vm_type == KVM_X86_SNP_VM;
+}
+
 int kvm_arch_gmem_prepare(struct kvm *kvm, gfn_t gfn, kvm_pfn_t pfn, int max_order)
 {
 	return static_call(kvm_x86_gmem_prepare)(kvm, pfn, gfn, max_order);
diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
index 9d7c6a70c547..b814e5d61f8e 100644
--- a/virt/kvm/guest_memfd.c
+++ b/virt/kvm/guest_memfd.c
@@ -46,8 +46,8 @@ static int kvm_gmem_prepare_folio(struct inode *inode, pgoff_t index, struct fol
 		gfn = slot->base_gfn + index - slot->gmem.pgoff;
 		rc = kvm_arch_gmem_prepare(kvm, gfn, pfn, compound_order(compound_head(page)));
 		if (rc) {
-			pr_warn_ratelimited("gmem: Failed to prepare folio for index %lx, error %d.\n",
-					    index, rc);
+			pr_warn_ratelimited("gmem: Failed to prepare folio for index %lx GFN %llx PFN %llx error %d.\n",
+					    index, gfn, pfn, rc);
 			return rc;
 		}
 	}
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v13 19/26] KVM: SEV: Implement gmem hook for invalidating private pages
  2024-04-18 19:41 [PATCH v13 00/26] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support Michael Roth
                   ` (17 preceding siblings ...)
  2024-04-18 19:41 ` [PATCH v13 18/26] KVM: SEV: Implement gmem hook for initializing private pages Michael Roth
@ 2024-04-18 19:41 ` Michael Roth
  2024-04-18 19:41 ` [PATCH v13 20/26] KVM: x86: Implement gmem hook for determining max NPT mapping level Michael Roth
                   ` (7 subsequent siblings)
  26 siblings, 0 replies; 41+ messages in thread
From: Michael Roth @ 2024-04-18 19:41 UTC (permalink / raw)
  To: kvm
  Cc: linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta,
	liam.merwick

Implement a platform hook to do the work of restoring the direct map
entries of gmem-managed pages and transitioning the corresponding RMP
table entries back to the default shared/hypervisor-owned state.
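
The per-PFN decision made while walking the range can be sketched in
userspace as follows. This is a simplified model under stated
assumptions: toy_pick_action() and the toy_action enum are hypothetical
names, and the caller is assumed to already know whether the RMP entry
covering the PFN is a 2M entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PFNS_PER_2M 512u	/* 4K pages per 2M region */

enum toy_action {
	TOY_SHARED_2M,		/* convert the whole 2M entry in one step */
	TOY_PSMASH_THEN_4K,	/* split the 2M entry, then convert one 4K page */
	TOY_SHARED_4K,		/* entry is already 4K, convert it directly */
};

/*
 * Pick the action for @pfn when invalidating [pfn, end).
 * @rmp_is_2m: the RMP entry covering @pfn is a 2M entry.
 */
static enum toy_action toy_pick_action(uint64_t pfn, uint64_t end,
				       bool rmp_is_2m)
{
	bool use_2m;

	assert(pfn < end);

	/* A 2M update needs an aligned PFN, a full 2M left in range, and a 2M entry. */
	use_2m = (pfn % PFNS_PER_2M == 0) &&
		 end >= pfn + PFNS_PER_2M &&
		 rmp_is_2m;

	if (use_2m)
		return TOY_SHARED_2M;

	return rmp_is_2m ? TOY_PSMASH_THEN_4K : TOY_SHARED_4K;
}

/* Number of PFNs to advance after handling one entry. */
static uint64_t toy_step(enum toy_action action)
{
	return action == TOY_SHARED_2M ? PFNS_PER_2M : 1;
}
```

The model only captures the choice between a 2M update, a PSMASH
followed by a 4K update, and a plain 4K update; error handling and the
cache flush are left out.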

Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 arch/x86/kvm/Kconfig   |  1 +
 arch/x86/kvm/svm/sev.c | 64 ++++++++++++++++++++++++++++++++++++++++++
 arch/x86/kvm/svm/svm.c |  1 +
 arch/x86/kvm/svm/svm.h |  2 ++
 4 files changed, 68 insertions(+)

diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index 10768f13b240..2a7f69abcac3 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -138,6 +138,7 @@ config KVM_AMD_SEV
 	select ARCH_HAS_CC_PLATFORM
 	select KVM_GENERIC_PRIVATE_MEM
 	select HAVE_KVM_GMEM_PREPARE
+	select HAVE_KVM_GMEM_INVALIDATE
 	help
 	  Provides support for launching Encrypted VMs (SEV) and Encrypted VMs
 	  with Encrypted State (SEV-ES) on AMD processors.
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 29f6e8dc29c8..f60bb8291494 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -4455,3 +4455,67 @@ int sev_gmem_prepare(struct kvm *kvm, kvm_pfn_t pfn, gfn_t gfn, int max_order)
 
 	return 0;
 }
+
+void sev_gmem_invalidate(kvm_pfn_t start, kvm_pfn_t end)
+{
+	kvm_pfn_t pfn;
+
+	pr_debug("%s: PFN start 0x%llx PFN end 0x%llx\n", __func__, start, end);
+
+	for (pfn = start; pfn < end;) {
+		bool use_2m_update = false;
+		int rc, rmp_level;
+		bool assigned;
+
+		rc = snp_lookup_rmpentry(pfn, &assigned, &rmp_level);
+		if (WARN_ONCE(rc, "SEV: Failed to retrieve RMP entry for PFN 0x%llx error %d\n",
+			      pfn, rc))
+			goto next_pfn;
+
+		if (!assigned)
+			goto next_pfn;
+
+		use_2m_update = IS_ALIGNED(pfn, PTRS_PER_PMD) &&
+				end >= (pfn + PTRS_PER_PMD) &&
+				rmp_level > PG_LEVEL_4K;
+
+		/*
+		 * If an unaligned PFN corresponds to a 2M region assigned as a
+		 * large page in the RMP table, PSMASH the region into individual
+		 * 4K RMP entries before attempting to convert a 4K sub-page.
+		 */
+		if (!use_2m_update && rmp_level > PG_LEVEL_4K) {
+			/*
+			 * This shouldn't fail. If it does, report it, but still
+			 * try to update the RMP entry to shared in the hope that
+			 * this was a spurious error that can be addressed later.
+			 */
+			rc = snp_rmptable_psmash(pfn);
+			WARN_ONCE(rc, "SEV: Failed to PSMASH RMP entry for PFN 0x%llx error %d\n",
+				  pfn, rc);
+		}
+
+		rc = rmp_make_shared(pfn, use_2m_update ? PG_LEVEL_2M : PG_LEVEL_4K);
+		if (WARN_ONCE(rc, "SEV: Failed to update RMP entry for PFN 0x%llx error %d\n",
+			      pfn, rc))
+			goto next_pfn;
+
+		/*
+		 * SEV-ES avoids host/guest cache coherency issues through
+		 * WBINVD hooks issued via MMU notifiers during run-time, and
+		 * KVM's VM destroy path at shutdown. Those MMU notifier events
+		 * don't cover gmem since there is no requirement to map pages
+		 * to a HVA in order to use them for a running guest. While the
+		 * shutdown path would still likely cover things for SNP guests,
+		 * userspace may also free gmem pages during run-time via
+		 * hole-punching operations on the guest_memfd, so flush the
+		 * cache entries for these pages before free'ing them back to
+		 * the host.
+		 */
+		clflush_cache_range(__va(pfn_to_hpa(pfn)),
+				    use_2m_update ? PMD_SIZE : PAGE_SIZE);
+next_pfn:
+		pfn += use_2m_update ? PTRS_PER_PMD : 1;
+		cond_resched();
+	}
+}
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 3e8d0752bf1b..60d121250b0d 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -5082,6 +5082,7 @@ static struct kvm_x86_ops svm_x86_ops __initdata = {
 	.alloc_apic_backing_page = svm_alloc_apic_backing_page,
 
 	.gmem_prepare = sev_gmem_prepare,
+	.gmem_invalidate = sev_gmem_invalidate,
 };
 
 /*
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 7712ed90aae8..6721e5c6cf73 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -731,6 +731,7 @@ void sev_handle_rmp_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u64 error_code);
 void sev_vcpu_unblocking(struct kvm_vcpu *vcpu);
 void sev_snp_init_protected_guest_state(struct kvm_vcpu *vcpu);
 int sev_gmem_prepare(struct kvm *kvm, kvm_pfn_t pfn, gfn_t gfn, int max_order);
+void sev_gmem_invalidate(kvm_pfn_t start, kvm_pfn_t end);
 #else
 static inline struct page *snp_safe_alloc_page(struct kvm_vcpu *vcpu) {
 	return alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO);
@@ -751,6 +752,7 @@ static inline int sev_gmem_prepare(struct kvm *kvm, kvm_pfn_t pfn, gfn_t gfn, in
 {
 	return 0;
 }
+static inline void sev_gmem_invalidate(kvm_pfn_t start, kvm_pfn_t end) {}
 
 #endif
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v13 20/26] KVM: x86: Implement gmem hook for determining max NPT mapping level
  2024-04-18 19:41 [PATCH v13 00/26] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support Michael Roth
                   ` (18 preceding siblings ...)
  2024-04-18 19:41 ` [PATCH v13 19/26] KVM: SEV: Implement gmem hook for invalidating " Michael Roth
@ 2024-04-18 19:41 ` Michael Roth
  2024-04-18 19:41 ` [PATCH v13 21/26] KVM: SEV: Avoid WBINVD for HVA-based MMU notifications for SNP Michael Roth
                   ` (6 subsequent siblings)
  26 siblings, 0 replies; 41+ messages in thread
From: Michael Roth @ 2024-04-18 19:41 UTC (permalink / raw)
  To: kvm
  Cc: linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta,
	liam.merwick

In the case of SEV-SNP, whether or not a 2MB page can be mapped via a
2MB mapping in the guest's nested page table depends on whether or not
any subpages within the range have already been initialized as private
in the RMP table. The existing mixed-attribute tracking in KVM is
insufficient here, for instance:

  - gmem allocates 2MB page
  - guest issues PVALIDATE on 2MB page
  - guest later converts a subpage to shared
  - SNP host code issues PSMASH to split 2MB RMP mapping to 4K
  - KVM MMU splits NPT mapping to 4K
  - guest later converts that shared page back to private

At this point there are no mixed attributes, and KVM would normally
allow for 2MB NPT mappings again, but this is actually not allowed
because the RMP table mappings are 4K and cannot be promoted on the
hypervisor side, so the NPT mappings must still be limited to 4K to
match this.

Implement a kvm_x86_ops.gmem_validate_fault() hook for SEV that checks
for this condition and adjusts the mapping level accordingly.
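
The level adjustment itself amounts to clamping the requested mapping
level to the level recorded in the RMP entry. A minimal userspace
sketch, assuming the RMP lookup result is passed in directly (the toy_*
names are hypothetical; the level values mirror PG_LEVEL_4K and
PG_LEVEL_2M):

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum { TOY_LEVEL_4K = 1, TOY_LEVEL_2M = 2 };

/*
 * @assigned and @rmp_level model the result of an RMP lookup for the
 * faulting PFN. A private fault on an unassigned page is an error;
 * otherwise the NPT mapping level is capped at the RMP level.
 */
static int toy_validate_fault(bool assigned, int rmp_level, uint8_t *max_level)
{
	assert(max_level != NULL);

	if (!assigned)
		return -EINVAL;

	if (rmp_level < *max_level)
		*max_level = (uint8_t)rmp_level;

	return 0;
}
```

For the sequence described above, the RMP entry is 4K after the PSMASH,
so a fault that would otherwise be allowed a 2M mapping is capped at 4K.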

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 arch/x86/kvm/svm/sev.c | 32 ++++++++++++++++++++++++++++++++
 arch/x86/kvm/svm/svm.c |  1 +
 arch/x86/kvm/svm/svm.h |  7 +++++++
 3 files changed, 40 insertions(+)

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index f60bb8291494..3fabd1ee718f 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -4519,3 +4519,35 @@ void sev_gmem_invalidate(kvm_pfn_t start, kvm_pfn_t end)
 		cond_resched();
 	}
 }
+
+/*
+ * Re-check whether an #NPF for a private/gmem page can still be serviced, and
+ * adjust maximum mapping level if needed.
+ */
+int sev_gmem_validate_fault(struct kvm *kvm, kvm_pfn_t pfn, gfn_t gfn, bool is_private,
+			    u8 *max_level)
+{
+	int level, rc;
+	bool assigned;
+
+	if (!sev_snp_guest(kvm))
+		return 0;
+
+	rc = snp_lookup_rmpentry(pfn, &assigned, &level);
+	if (rc) {
+		pr_err_ratelimited("SEV: RMP entry not found: GFN %llx PFN %llx level %d error %d\n",
+				   gfn, pfn, level, rc);
+		return -ENOENT;
+	}
+
+	if (!assigned) {
+		pr_err_ratelimited("SEV: RMP entry is not assigned: GFN %llx PFN %llx level %d\n",
+				   gfn, pfn, level);
+		return -EINVAL;
+	}
+
+	if (level < *max_level)
+		*max_level = level;
+
+	return 0;
+}
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 60d121250b0d..4b330b5ba4c5 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -5083,6 +5083,7 @@ static struct kvm_x86_ops svm_x86_ops __initdata = {
 
 	.gmem_prepare = sev_gmem_prepare,
 	.gmem_invalidate = sev_gmem_invalidate,
+	.gmem_validate_fault = sev_gmem_validate_fault,
 };
 
 /*
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 6721e5c6cf73..8a8ee475ad86 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -732,6 +732,8 @@ void sev_vcpu_unblocking(struct kvm_vcpu *vcpu);
 void sev_snp_init_protected_guest_state(struct kvm_vcpu *vcpu);
 int sev_gmem_prepare(struct kvm *kvm, kvm_pfn_t pfn, gfn_t gfn, int max_order);
 void sev_gmem_invalidate(kvm_pfn_t start, kvm_pfn_t end);
+int sev_gmem_validate_fault(struct kvm *kvm, kvm_pfn_t pfn, gfn_t gfn, bool is_private,
+			    u8 *max_level);
 #else
 static inline struct page *snp_safe_alloc_page(struct kvm_vcpu *vcpu) {
 	return alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO);
@@ -753,6 +755,11 @@ static inline int sev_gmem_prepare(struct kvm *kvm, kvm_pfn_t pfn, gfn_t gfn, in
 	return 0;
 }
 static inline void sev_gmem_invalidate(kvm_pfn_t start, kvm_pfn_t end) {}
+static inline int sev_gmem_validate_fault(struct kvm *kvm, kvm_pfn_t pfn, gfn_t gfn,
+					  bool is_private, u8 *max_level)
+{
+	return 0;
+}
 
 #endif
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v13 21/26] KVM: SEV: Avoid WBINVD for HVA-based MMU notifications for SNP
  2024-04-18 19:41 [PATCH v13 00/26] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support Michael Roth
                   ` (19 preceding siblings ...)
  2024-04-18 19:41 ` [PATCH v13 20/26] KVM: x86: Implement gmem hook for determining max NPT mapping level Michael Roth
@ 2024-04-18 19:41 ` Michael Roth
  2024-04-18 19:41 ` [PATCH v13 22/26] KVM: SVM: Add module parameter to enable SEV-SNP Michael Roth
                   ` (5 subsequent siblings)
  26 siblings, 0 replies; 41+ messages in thread
From: Michael Roth @ 2024-04-18 19:41 UTC (permalink / raw)
  To: kvm
  Cc: linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta,
	liam.merwick

From: Ashish Kalra <ashish.kalra@amd.com>

With SNP/guest_memfd, private/encrypted memory should not be mappable,
and MMU notifications for HVA-mapped memory will only be relevant to
unencrypted guest memory. Therefore, the rationale behind issuing a
wbinvd_on_all_cpus() in sev_guest_memory_reclaimed() should not apply
for SNP guests and can be ignored.

Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
[mdr: Add some clarifications in commit]
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 arch/x86/kvm/svm/sev.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 3fabd1ee718f..03bfb7b9732d 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -3033,7 +3033,14 @@ static void sev_flush_encrypted_page(struct kvm_vcpu *vcpu, void *va)
 
 void sev_guest_memory_reclaimed(struct kvm *kvm)
 {
-	if (!sev_guest(kvm))
+	/*
+	 * With SNP+gmem, private/encrypted memory should be
+	 * unreachable via the hva-based mmu notifiers. Additionally,
+	 * for shared->private transitions, H/W coherency will ensure
+	 * first guest access to the page would clear out any existing
+	 * dirty copies of that cacheline.
+	 */
+	if (!sev_guest(kvm) || sev_snp_guest(kvm))
 		return;
 
 	wbinvd_on_all_cpus();
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v13 22/26] KVM: SVM: Add module parameter to enable SEV-SNP
  2024-04-18 19:41 [PATCH v13 00/26] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support Michael Roth
                   ` (20 preceding siblings ...)
  2024-04-18 19:41 ` [PATCH v13 21/26] KVM: SEV: Avoid WBINVD for HVA-based MMU notifications for SNP Michael Roth
@ 2024-04-18 19:41 ` Michael Roth
  2024-04-18 19:41 ` [PATCH v13 23/26] KVM: SEV: Provide support for SNP_GUEST_REQUEST NAE event Michael Roth
                   ` (4 subsequent siblings)
  26 siblings, 0 replies; 41+ messages in thread
From: Michael Roth @ 2024-04-18 19:41 UTC (permalink / raw)
  To: kvm
  Cc: linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta,
	liam.merwick, Brijesh Singh

From: Brijesh Singh <brijesh.singh@amd.com>

Add a module parameter that can be used to enable or disable the SEV-SNP
feature. Now that KVM contains the support for SNP, set the GHCB
hypervisor feature flag to indicate that SNP is supported.

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
---
 arch/x86/kvm/svm/sev.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 03bfb7b9732d..de51c3aa0040 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -48,7 +48,8 @@ static bool sev_es_enabled = true;
 module_param_named(sev_es, sev_es_enabled, bool, 0444);
 
 /* enable/disable SEV-SNP support */
-static bool sev_snp_enabled;
+static bool sev_snp_enabled = true;
+module_param_named(sev_snp, sev_snp_enabled, bool, 0444);
 
 /* enable/disable SEV-ES DebugSwap support */
 static bool sev_es_debug_swap_enabled = true;
-- 
2.25.1



* [PATCH v13 23/26] KVM: SEV: Provide support for SNP_GUEST_REQUEST NAE event
  2024-04-18 19:41 [PATCH v13 00/26] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support Michael Roth
                   ` (21 preceding siblings ...)
  2024-04-18 19:41 ` [PATCH v13 22/26] KVM: SVM: Add module parameter to enable SEV-SNP Michael Roth
@ 2024-04-18 19:41 ` Michael Roth
  2024-04-18 19:41 ` [PATCH v13 24/26] crypto: ccp: Add the SNP_VLEK_LOAD command Michael Roth
                   ` (3 subsequent siblings)
  26 siblings, 0 replies; 41+ messages in thread
From: Michael Roth @ 2024-04-18 19:41 UTC (permalink / raw)
  To: kvm
  Cc: linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta,
	liam.merwick, Brijesh Singh, Alexey Kardashevskiy

From: Brijesh Singh <brijesh.singh@amd.com>

Version 2 of the GHCB specification added support for the SNP Guest
Request Message NAE event. The event allows an SEV-SNP guest to make
requests to the SEV-SNP firmware through the hypervisor using the
SNP_GUEST_REQUEST API defined in the SEV-SNP firmware specification.

This is used by guests primarily to request attestation reports from
firmware. Other request types are available as well, but the
specifics of what guest requests are being made are opaque to the
hypervisor, which only serves as a proxy for the guest requests and
firmware responses.

Implement handling for these events.

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Co-developed-by: Alexey Kardashevskiy <aik@amd.com>
Signed-off-by: Alexey Kardashevskiy <aik@amd.com>
Co-developed-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
[mdr: ensure FW command failures are indicated to guest, drop extended
 request handling to be re-written as separate patch, massage commit]
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 arch/x86/kvm/svm/sev.c         | 83 ++++++++++++++++++++++++++++++++++
 include/uapi/linux/sev-guest.h |  9 ++++
 2 files changed, 92 insertions(+)
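
Illustrative note: the handler reports its result to the guest through
SW_EXITINFO2, with the VMM error in the upper 32 bits and the firmware
error in the lower 32 bits. The sketch below mirrors the SNP_GUEST_ERR()
packing in plain userspace C so the layout is easy to check. The decode
helpers (snp_guest_vmm_err, snp_guest_fw_err) are hypothetical names
introduced for illustration and are not part of this patch.

```c
#include <stdint.h>

/* Userspace mirrors of the macros in include/uapi/linux/sev-guest.h. */
#define SNP_GUEST_FW_ERR_MASK     0xffffffffULL
#define SNP_GUEST_VMM_ERR_SHIFT   32
#define SNP_GUEST_VMM_ERR_GENERIC (1U << 31)	/* BIT(31) in the patch */

/* Combine a VMM error (upper 32 bits) with a firmware error (lower 32 bits). */
static inline uint64_t snp_guest_err(uint32_t vmm_err, uint32_t fw_err)
{
	return ((uint64_t)vmm_err << SNP_GUEST_VMM_ERR_SHIFT) |
	       ((uint64_t)fw_err & SNP_GUEST_FW_ERR_MASK);
}

/* Hypothetical helper: extract the VMM error from a packed value. */
static inline uint32_t snp_guest_vmm_err(uint64_t exit_info_2)
{
	return (uint32_t)(exit_info_2 >> SNP_GUEST_VMM_ERR_SHIFT);
}

/* Hypothetical helper: extract the firmware error from a packed value. */
static inline uint32_t snp_guest_fw_err(uint64_t exit_info_2)
{
	return (uint32_t)(exit_info_2 & SNP_GUEST_FW_ERR_MASK);
}
```

Because SNP_GUEST_VMM_ERR_GENERIC uses the top bit of the VMM half, it
stays clear of the small values (1, 2) that the GHCB spec defines.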

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index de51c3aa0040..953f00ddf31b 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -19,6 +19,7 @@
 #include <linux/misc_cgroup.h>
 #include <linux/processor.h>
 #include <linux/trace_events.h>
+#include <uapi/linux/sev-guest.h>
 
 #include <asm/pkru.h>
 #include <asm/trapnr.h>
@@ -3281,6 +3282,7 @@ static int sev_es_validate_vmgexit(struct vcpu_svm *svm)
 	case SVM_VMGEXIT_HV_FEATURES:
 	case SVM_VMGEXIT_PSC:
 	case SVM_VMGEXIT_TERM_REQUEST:
+	case SVM_VMGEXIT_GUEST_REQUEST:
 		break;
 	default:
 		reason = GHCB_ERR_INVALID_EVENT;
@@ -3724,6 +3726,83 @@ static int sev_snp_ap_creation(struct vcpu_svm *svm)
 	return ret;
 }
 
+static bool snp_setup_guest_buf(struct kvm *kvm, struct sev_data_snp_guest_request *data,
+				gpa_t req_gpa, gpa_t resp_gpa)
+{
+	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
+	kvm_pfn_t req_pfn, resp_pfn;
+
+	if (!PAGE_ALIGNED(req_gpa) || !PAGE_ALIGNED(resp_gpa))
+		return false;
+
+	req_pfn = gfn_to_pfn(kvm, gpa_to_gfn(req_gpa));
+	if (is_error_noslot_pfn(req_pfn))
+		return false;
+
+	resp_pfn = gfn_to_pfn(kvm, gpa_to_gfn(resp_gpa));
+	if (is_error_noslot_pfn(resp_pfn))
+		return false;
+
+	if (rmp_make_private(resp_pfn, 0, PG_LEVEL_4K, 0, true))
+		return false;
+
+	data->gctx_paddr = __psp_pa(sev->snp_context);
+	data->req_paddr = __sme_set(req_pfn << PAGE_SHIFT);
+	data->res_paddr = __sme_set(resp_pfn << PAGE_SHIFT);
+
+	return true;
+}
+
+static bool snp_cleanup_guest_buf(struct sev_data_snp_guest_request *data)
+{
+	u64 pfn = __sme_clr(data->res_paddr) >> PAGE_SHIFT;
+
+	if (snp_page_reclaim(pfn))
+		return false;
+
+	if (rmp_make_shared(pfn, PG_LEVEL_4K))
+		return false;
+
+	return true;
+}
+
+static bool __snp_handle_guest_req(struct kvm *kvm, gpa_t req_gpa, gpa_t resp_gpa,
+				   sev_ret_code *fw_err)
+{
+	struct sev_data_snp_guest_request data = {0};
+	struct kvm_sev_info *sev;
+	bool ret = true;
+
+	if (!sev_snp_guest(kvm))
+		return false;
+
+	sev = &to_kvm_svm(kvm)->sev_info;
+
+	if (!snp_setup_guest_buf(kvm, &data, req_gpa, resp_gpa))
+		return false;
+
+	if (sev_issue_cmd(kvm, SEV_CMD_SNP_GUEST_REQUEST, &data, fw_err))
+		ret = false;
+
+	if (!snp_cleanup_guest_buf(&data))
+		ret = false;
+
+	return ret;
+}
+
+static void snp_handle_guest_req(struct vcpu_svm *svm, gpa_t req_gpa, gpa_t resp_gpa)
+{
+	struct kvm_vcpu *vcpu = &svm->vcpu;
+	struct kvm *kvm = vcpu->kvm;
+	sev_ret_code fw_err = 0;
+	int vmm_ret = 0;
+
+	if (!__snp_handle_guest_req(kvm, req_gpa, resp_gpa, &fw_err))
+		vmm_ret = SNP_GUEST_VMM_ERR_GENERIC;
+
+	ghcb_set_sw_exit_info_2(svm->sev_es.ghcb, SNP_GUEST_ERR(vmm_ret, fw_err));
+}
+
 static int sev_handle_vmgexit_msr_protocol(struct vcpu_svm *svm)
 {
 	struct vmcb_control_area *control = &svm->vmcb->control;
@@ -3984,6 +4063,10 @@ int sev_handle_vmgexit(struct kvm_vcpu *vcpu)
 		vcpu->run->system_event.ndata = 1;
 		vcpu->run->system_event.data[0] = control->ghcb_gpa;
 		break;
+	case SVM_VMGEXIT_GUEST_REQUEST:
+		snp_handle_guest_req(svm, control->exit_info_1, control->exit_info_2);
+		ret = 1;
+		break;
 	case SVM_VMGEXIT_UNSUPPORTED_EVENT:
 		vcpu_unimpl(vcpu,
 			    "vmgexit: unsupported event - exit_info_1=%#llx, exit_info_2=%#llx\n",
diff --git a/include/uapi/linux/sev-guest.h b/include/uapi/linux/sev-guest.h
index 154a87a1eca9..7bd78e258569 100644
--- a/include/uapi/linux/sev-guest.h
+++ b/include/uapi/linux/sev-guest.h
@@ -89,8 +89,17 @@ struct snp_ext_report_req {
 #define SNP_GUEST_FW_ERR_MASK		GENMASK_ULL(31, 0)
 #define SNP_GUEST_VMM_ERR_SHIFT		32
 #define SNP_GUEST_VMM_ERR(x)		(((u64)x) << SNP_GUEST_VMM_ERR_SHIFT)
+#define SNP_GUEST_FW_ERR(x)		((x) & SNP_GUEST_FW_ERR_MASK)
+#define SNP_GUEST_ERR(vmm_err, fw_err)	(SNP_GUEST_VMM_ERR(vmm_err) | \
+					 SNP_GUEST_FW_ERR(fw_err))
 
+/*
+ * The GHCB spec only formally defines INVALID_LEN/BUSY VMM errors, but define
+ * a GENERIC error code such that it won't ever conflict with GHCB-defined
+ * errors if any get added in the future.
+ */
 #define SNP_GUEST_VMM_ERR_INVALID_LEN	1
 #define SNP_GUEST_VMM_ERR_BUSY		2
+#define SNP_GUEST_VMM_ERR_GENERIC	BIT(31)
 
 #endif /* __UAPI_LINUX_SEV_GUEST_H_ */
-- 
2.25.1



* [PATCH v13 24/26] crypto: ccp: Add the SNP_VLEK_LOAD command
  2024-04-18 19:41 [PATCH v13 00/26] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support Michael Roth
                   ` (22 preceding siblings ...)
  2024-04-18 19:41 ` [PATCH v13 23/26] KVM: SEV: Provide support for SNP_GUEST_REQUEST NAE event Michael Roth
@ 2024-04-18 19:41 ` Michael Roth
  2024-04-18 19:41 ` [PATCH v13 25/26] crypto: ccp: Add the SNP_{PAUSE,RESUME}_ATTESTATION commands Michael Roth
                   ` (2 subsequent siblings)
  26 siblings, 0 replies; 41+ messages in thread
From: Michael Roth @ 2024-04-18 19:41 UTC (permalink / raw)
  To: kvm
  Cc: linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta,
	liam.merwick

When requesting an attestation report a guest is able to specify whether
it wants SNP firmware to sign the report using either a Versioned Chip
Endorsement Key (VCEK), which is derived from chip-unique secrets, or a
Versioned Loaded Endorsement Key (VLEK) which is obtained from an AMD
Key Derivation Service (KDS) and derived from seeds allocated to
enrolled cloud service providers (CSPs).

For VLEK keys, an SNP_VLEK_LOAD SNP firmware command is used to load
them into the system after obtaining them from the KDS. Add a
corresponding userspace interface to allow the loading of VLEK keys
into the system.

See SEV-SNP Firmware ABI 1.54, SNP_VLEK_LOAD for more details.

Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 Documentation/virt/coco/sev-guest.rst | 19 ++++++++++++++
 drivers/crypto/ccp/sev-dev.c          | 36 +++++++++++++++++++++++++++
 include/uapi/linux/psp-sev.h          | 27 ++++++++++++++++++++
 3 files changed, 82 insertions(+)
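
Illustrative note: a minimal sketch of how userspace might prepare the
request so that it passes the checks in sev_ioctl_do_snp_vlek_load()
(len equal to the structure size, version 0). The structures below are
local mirrors of the uapi definitions, and snp_vlek_load_init() is a
hypothetical helper name, not something this patch provides. Issuing the
request through the SEV device ioctl is not shown.

```c
#include <stdint.h>
#include <string.h>

/* Local mirror of struct sev_user_data_snp_wrapped_vlek_hashstick. */
struct snp_wrapped_vlek_hashstick {
	uint8_t data[432];		/* opaque blob obtained from the KDS */
} __attribute__((packed));

/* Local mirror of struct sev_user_data_snp_vlek_load. */
struct snp_vlek_load {
	uint32_t len;			/* must equal sizeof(struct snp_vlek_load) */
	uint8_t  vlek_wrapped_version;	/* must be 0 */
	uint8_t  rsvd[3];
	uint64_t vlek_wrapped_address;	/* userspace pointer to the hashstick */
} __attribute__((packed));

/* Hypothetical helper: fill the request the way the handler's checks expect. */
static void snp_vlek_load_init(struct snp_vlek_load *req,
			       const struct snp_wrapped_vlek_hashstick *blob)
{
	memset(req, 0, sizeof(*req));
	req->len = sizeof(*req);
	req->vlek_wrapped_version = 0;
	req->vlek_wrapped_address = (uint64_t)(uintptr_t)blob;
}
```

The kernel side replaces vlek_wrapped_address with the physical address
of its own copy of the blob before handing the command to firmware.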

diff --git a/Documentation/virt/coco/sev-guest.rst b/Documentation/virt/coco/sev-guest.rst
index e1eaf6a830ce..de68d3a4b540 100644
--- a/Documentation/virt/coco/sev-guest.rst
+++ b/Documentation/virt/coco/sev-guest.rst
@@ -176,6 +176,25 @@ to SNP_CONFIG command defined in the SEV-SNP spec. The current values of
 the firmware parameters affected by this command can be queried via
 SNP_PLATFORM_STATUS.
 
+2.7 SNP_VLEK_LOAD
+-----------------
+:Technology: sev-snp
+:Type: hypervisor ioctl cmd
+:Parameters (in): struct sev_user_data_snp_vlek_load
+:Returns (out): 0 on success, -negative on error
+
+When requesting an attestation report a guest is able to specify whether
+it wants SNP firmware to sign the report using either a Versioned Chip
+Endorsement Key (VCEK), which is derived from chip-unique secrets, or a
+Versioned Loaded Endorsement Key (VLEK) which is obtained from an AMD
+Key Derivation Service (KDS) and derived from seeds allocated to
+enrolled cloud service providers.
+
+In the case of VLEK keys, the SNP_VLEK_LOAD SNP command is used to load
+them into the system after obtaining them from the KDS, and corresponds
+closely to the SNP_VLEK_LOAD firmware command specified in the SEV-SNP
+spec.
+
 3. SEV-SNP CPUID Enforcement
 ============================
 
diff --git a/drivers/crypto/ccp/sev-dev.c b/drivers/crypto/ccp/sev-dev.c
index 2102377f727b..97a7959406ee 100644
--- a/drivers/crypto/ccp/sev-dev.c
+++ b/drivers/crypto/ccp/sev-dev.c
@@ -2027,6 +2027,39 @@ static int sev_ioctl_do_snp_set_config(struct sev_issue_cmd *argp, bool writable
 	return __sev_do_cmd_locked(SEV_CMD_SNP_CONFIG, &config, &argp->error);
 }
 
+static int sev_ioctl_do_snp_vlek_load(struct sev_issue_cmd *argp, bool writable)
+{
+	struct sev_device *sev = psp_master->sev_data;
+	struct sev_user_data_snp_vlek_load input;
+	void *blob;
+	int ret;
+
+	if (!sev->snp_initialized || !argp->data)
+		return -EINVAL;
+
+	if (!writable)
+		return -EPERM;
+
+	if (copy_from_user(&input, u64_to_user_ptr(argp->data), sizeof(input)))
+		return -EFAULT;
+
+	if (input.len != sizeof(input) || input.vlek_wrapped_version != 0)
+		return -EINVAL;
+
+	blob = psp_copy_user_blob(input.vlek_wrapped_address,
+				  sizeof(struct sev_user_data_snp_wrapped_vlek_hashstick));
+	if (IS_ERR(blob))
+		return PTR_ERR(blob);
+
+	input.vlek_wrapped_address = __psp_pa(blob);
+
+	ret = __sev_do_cmd_locked(SEV_CMD_SNP_VLEK_LOAD, &input, &argp->error);
+
+	kfree(blob);
+
+	return ret;
+}
+
 static long sev_ioctl(struct file *file, unsigned int ioctl, unsigned long arg)
 {
 	void __user *argp = (void __user *)arg;
@@ -2087,6 +2120,9 @@ static long sev_ioctl(struct file *file, unsigned int ioctl, unsigned long arg)
 	case SNP_SET_CONFIG:
 		ret = sev_ioctl_do_snp_set_config(&input, writable);
 		break;
+	case SNP_VLEK_LOAD:
+		ret = sev_ioctl_do_snp_vlek_load(&input, writable);
+		break;
 	default:
 		ret = -EINVAL;
 		goto out;
diff --git a/include/uapi/linux/psp-sev.h b/include/uapi/linux/psp-sev.h
index b7a2c2ee35b7..2289b7c76c59 100644
--- a/include/uapi/linux/psp-sev.h
+++ b/include/uapi/linux/psp-sev.h
@@ -31,6 +31,7 @@ enum {
 	SNP_PLATFORM_STATUS,
 	SNP_COMMIT,
 	SNP_SET_CONFIG,
+	SNP_VLEK_LOAD,
 
 	SEV_MAX,
 };
@@ -214,6 +215,32 @@ struct sev_user_data_snp_config {
 	__u8 rsvd1[52];
 } __packed;
 
+/**
+ * struct sev_user_data_snp_vlek_load - SNP_VLEK_LOAD structure
+ *
+ * @len: length of the command buffer read by the PSP
+ * @vlek_wrapped_version: version of wrapped VLEK hashstick (Must be 0h)
+ * @rsvd: reserved
+ * @vlek_wrapped_address: address of a wrapped VLEK hashstick
+ *                        (struct sev_user_data_snp_wrapped_vlek_hashstick)
+ */
+struct sev_user_data_snp_vlek_load {
+	__u32 len;				/* In */
+	__u8 vlek_wrapped_version;		/* In */
+	__u8 rsvd[3];				/* In */
+	__u64 vlek_wrapped_address;		/* In */
+} __packed;
+
+/**
+ * struct sev_user_data_snp_wrapped_vlek_hashstick - Wrapped VLEK data
+ *
+ * @data: Opaque data provided by AMD KDS (as described in SEV-SNP Firmware ABI
+ *        1.54, SNP_VLEK_LOAD)
+ */
+struct sev_user_data_snp_wrapped_vlek_hashstick {
+	__u8 data[432];				/* In */
+} __packed;
+
 /**
  * struct sev_issue_cmd - SEV ioctl parameters
  *
-- 
2.25.1



* [PATCH v13 25/26] crypto: ccp: Add the SNP_{PAUSE,RESUME}_ATTESTATION commands
  2024-04-18 19:41 [PATCH v13 00/26] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support Michael Roth
                   ` (23 preceding siblings ...)
  2024-04-18 19:41 ` [PATCH v13 24/26] crypto: ccp: Add the SNP_VLEK_LOAD command Michael Roth
@ 2024-04-18 19:41 ` Michael Roth
  2024-04-18 19:41 ` [PATCH v13 26/26] KVM: SEV: Provide support for SNP_EXTENDED_GUEST_REQUEST NAE event Michael Roth
  2024-04-19 12:04 ` [PATCH v13 00/26] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support Paolo Bonzini
  26 siblings, 0 replies; 41+ messages in thread
From: Michael Roth @ 2024-04-18 19:41 UTC (permalink / raw)
  To: kvm
  Cc: linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta,
	liam.merwick

These commands can be used to pause servicing of guest attestation
requests. This is useful when updating the reported TCB or signing key with
commands such as SNP_SET_CONFIG/SNP_COMMIT/SNP_VLEK_LOAD, since they may
in turn require updates to userspace-supplied certificates, and if an
attestation request happens to be in-flight at the time those updates
are occurring there is potential for a guest to receive a certificate
blob that is out of sync with the effective signing key for the
attestation report.

These interfaces also provide some versatility with how similar
firmware/certificate update activities can be handled in the future.

Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 Documentation/virt/coco/sev-guest.rst | 50 +++++++++++++++++++++++++--
 arch/x86/include/asm/sev.h            |  6 ++++
 arch/x86/virt/svm/sev.c               | 43 +++++++++++++++++++++++
 drivers/crypto/ccp/sev-dev.c          | 47 +++++++++++++++++++++++++
 include/uapi/linux/psp-sev.h          | 12 +++++++
 5 files changed, 156 insertions(+), 2 deletions(-)
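
Illustrative note: the transaction ID scheme can be modelled in a few
lines of userspace C. This is a simplified single-threaded sketch of
the logic in snp_pause_attestation()/snp_resume_attestation() without
the mutex. The names model_pause, model_resume and
transaction_was_clean are hypothetical and exist only for this example.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified model of the state kept in arch/x86/virt/svm/sev.c. */
struct snp_attest_state {
	uint64_t transaction_id;
	bool paused;
};

/* Mirrors snp_pause_attestation(): report the ID that resume should return. */
static int model_pause(struct snp_attest_state *s, uint64_t *id)
{
	if (s->paused)
		return -EBUSY;
	*id = s->transaction_id + 1;
	s->paused = true;
	return 0;
}

/* Mirrors snp_resume_attestation(): always succeeds and bumps the ID. */
static void model_resume(struct snp_attest_state *s, uint64_t *id)
{
	s->paused = false;
	*id = ++s->transaction_id;
}

/* Hypothetical caller-side check: did our pause/resume pair run undisturbed? */
static bool transaction_was_clean(uint64_t start_id, uint64_t end_id)
{
	return start_id == end_id;
}
```

If another process resumes attestation in the middle of a transaction,
the counter advances early, so the original caller sees a different ID
on its own resume and can treat the update as suspect.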

diff --git a/Documentation/virt/coco/sev-guest.rst b/Documentation/virt/coco/sev-guest.rst
index de68d3a4b540..ab192a008ba7 100644
--- a/Documentation/virt/coco/sev-guest.rst
+++ b/Documentation/virt/coco/sev-guest.rst
@@ -128,8 +128,6 @@ the SEV-SNP specification for further details.
 
 The SNP_GET_EXT_REPORT ioctl is similar to the SNP_GET_REPORT. The difference is
 related to the additional certificate data that is returned with the report.
-The certificate data returned is being provided by the hypervisor through the
-SNP_SET_EXT_CONFIG.
 
 The ioctl uses the SNP_GUEST_REQUEST (MSG_REPORT_REQ) command provided by the SEV-SNP
 firmware to get the attestation report.
@@ -195,6 +193,54 @@ them into the system after obtaining them from the KDS, and corresponds
 closely to the SNP_VLEK_LOAD firmware command specified in the SEV-SNP
 spec.
 
+2.8 SNP_PAUSE_ATTESTATION / SNP_RESUME_ATTESTATION
+--------------------------------------------------
+:Technology: sev-snp
+:Type: hypervisor ioctl cmd
+:Parameters (out): struct sev_user_data_snp_pause_attestation
+:Returns (out): 0 on success, -negative on error
+
+When requesting attestation reports, SNP guests have the option of issuing
+an extended guest request which allows host userspace to supply additional
+certificate data that can be used to validate the signature used to sign
+the attestation report. This signature is generated using a key that is
+derived from the reported TCB that can be set via the SNP_SET_CONFIG and
+SNP_COMMIT ioctls, so the accompanying certificate data needs to be kept in
+sync with the changes made to the reported TCB via these ioctls.
+
+Similarly, interfaces like SNP_VLEK_LOAD can modify the key used to sign
+the attestation reports, which may in turn require updating the certificate
+data provided to guests via extended guest requests.
+
+To allow for updating the reported TCB, endorsement key, and any certificate
+data in a manner that is atomic to guests, the SNP_PAUSE_ATTESTATION and
+SNP_RESUME_ATTESTATION commands are provided.
+
+After SNP_PAUSE_ATTESTATION is issued, any attestation report requests via
+extended guest requests that are in-progress, or received after
+SNP_PAUSE_ATTESTATION is issued, will result in the guest receiving a
+GHCB-defined error message instructing it to retry the request. Once all
+the desired reported TCB, endorsement keys, or certificate data updates
+are completed on the host, the SNP_RESUME_ATTESTATION command must be
+issued to allow guest attestation requests to proceed.
+
+In general, hosts should serialize updates of this sort and never have more
+than 1 outstanding transaction in flight that could result in the
+interleaving of multiple SNP_PAUSE_ATTESTATION/SNP_RESUME_ATTESTATION pairs.
+To guard against this, SNP_PAUSE_ATTESTATION will fail if another process
+has already paused attestation requests.
+
+However, there may be occasions where a transaction needs to be aborted due
+to unexpected activity in userspace such as timeouts, crashes, etc., so
+SNP_RESUME_ATTESTATION will always succeed. Nonetheless, this could
+potentially lead to SNP_RESUME_ATTESTATION being called out of sequence, so
+to allow for callers of SNP_{PAUSE,RESUME}_ATTESTATION to detect such
+occurrences, each ioctl will return a transaction ID in the response so the
+caller can monitor whether the start/end ID both match. If they don't, the
+caller should assume that attestation has been paused/resumed unexpectedly,
+and take whatever measures it deems necessary such as logging, reporting,
+auditing the sequence of events.
+
 3. SEV-SNP CPUID Enforcement
 ============================
 
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 234a998e2d2d..baf223eb5633 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -82,6 +82,8 @@ extern void vc_no_ghcb(void);
 extern void vc_boot_ghcb(void);
 extern bool handle_vc_boot_ghcb(struct pt_regs *regs);
 
+extern struct mutex snp_pause_attestation_lock;
+
 /* PVALIDATE return codes */
 #define PVALIDATE_FAIL_SIZEMISMATCH	6
 
@@ -272,6 +274,8 @@ int rmp_make_private(u64 pfn, u64 gpa, enum pg_level level, u32 asid, bool immut
 int rmp_make_shared(u64 pfn, enum pg_level level);
 void snp_leak_pages(u64 pfn, unsigned int npages);
 void kdump_sev_callback(void);
+int snp_pause_attestation(u64 *transaction_id);
+void snp_resume_attestation(u64 *transaction_id);
 #else
 static inline bool snp_probe_rmptable_info(void) { return false; }
 static inline int snp_lookup_rmpentry(u64 pfn, bool *assigned, int *level) { return -ENODEV; }
@@ -285,6 +289,8 @@ static inline int rmp_make_private(u64 pfn, u64 gpa, enum pg_level level, u32 as
 static inline int rmp_make_shared(u64 pfn, enum pg_level level) { return -ENODEV; }
 static inline void snp_leak_pages(u64 pfn, unsigned int npages) {}
 static inline void kdump_sev_callback(void) { }
+static inline int snp_pause_attestation(u64 *transaction_id) { return 0; }
+static inline void snp_resume_attestation(u64 *transaction_id) {}
 #endif
 
 #endif
diff --git a/arch/x86/virt/svm/sev.c b/arch/x86/virt/svm/sev.c
index ab0e8448bb6e..b75f2e7d4012 100644
--- a/arch/x86/virt/svm/sev.c
+++ b/arch/x86/virt/svm/sev.c
@@ -70,6 +70,11 @@ static DEFINE_SPINLOCK(snp_leaked_pages_list_lock);
 
 static unsigned long snp_nr_leaked_pages;
 
+/* For synchronizing TCB/certificate updates with extended guest requests */
+DEFINE_MUTEX(snp_pause_attestation_lock);
+static u64 snp_transaction_id;
+static bool snp_attestation_paused;
+
 #undef pr_fmt
 #define pr_fmt(fmt)	"SEV-SNP: " fmt
 
@@ -568,3 +573,41 @@ void kdump_sev_callback(void)
 	if (cc_platform_has(CC_ATTR_HOST_SEV_SNP))
 		wbinvd();
 }
+
+int snp_pause_attestation(u64 *transaction_id)
+{
+	mutex_lock(&snp_pause_attestation_lock);
+
+	if (snp_attestation_paused) {
+		mutex_unlock(&snp_pause_attestation_lock);
+		return -EBUSY;
+	}
+
+	/*
+	 * The actual transaction ID update will happen when
+	 * snp_resume_attestation() is called, so return
+	 * the *anticipated* transaction ID that will be
+	 * returned by snp_resume_attestation(). This is
+	 * to ensure that unbalanced/aborted transactions will
+	 * be noticeable when the caller that started the
+	 * transaction calls snp_resume_attestation().
+	 */
+	*transaction_id = snp_transaction_id + 1;
+	snp_attestation_paused = true;
+
+	mutex_unlock(&snp_pause_attestation_lock);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(snp_pause_attestation);
+
+void snp_resume_attestation(u64 *transaction_id)
+{
+	mutex_lock(&snp_pause_attestation_lock);
+
+	snp_attestation_paused = false;
+	*transaction_id = ++snp_transaction_id;
+
+	mutex_unlock(&snp_pause_attestation_lock);
+}
+EXPORT_SYMBOL_GPL(snp_resume_attestation);
diff --git a/drivers/crypto/ccp/sev-dev.c b/drivers/crypto/ccp/sev-dev.c
index 97a7959406ee..7eb18a273731 100644
--- a/drivers/crypto/ccp/sev-dev.c
+++ b/drivers/crypto/ccp/sev-dev.c
@@ -2060,6 +2060,47 @@ static int sev_ioctl_do_snp_vlek_load(struct sev_issue_cmd *argp, bool writable)
 	return ret;
 }
 
+static int sev_ioctl_do_snp_pause_attestation(struct sev_issue_cmd *argp, bool writable)
+{
+	struct sev_user_data_snp_pause_attestation transaction = {0};
+	struct sev_device *sev = psp_master->sev_data;
+	int ret;
+
+	if (!sev->snp_initialized || !argp->data)
+		return -EINVAL;
+
+	if (!writable)
+		return -EPERM;
+
+	ret = snp_pause_attestation(&transaction.id);
+	if (ret)
+		return ret;
+
+	if (copy_to_user((void __user *)argp->data, &transaction, sizeof(transaction)))
+		return -EFAULT;
+
+	return 0;
+}
+
+static int sev_ioctl_do_snp_resume_attestation(struct sev_issue_cmd *argp, bool writable)
+{
+	struct sev_user_data_snp_pause_attestation transaction = {0};
+	struct sev_device *sev = psp_master->sev_data;
+
+	if (!sev->snp_initialized || !argp->data)
+		return -EINVAL;
+
+	if (!writable)
+		return -EPERM;
+
+	snp_resume_attestation(&transaction.id);
+
+	if (copy_to_user((void __user *)argp->data, &transaction, sizeof(transaction)))
+		return -EFAULT;
+
+	return 0;
+}
+
 static long sev_ioctl(struct file *file, unsigned int ioctl, unsigned long arg)
 {
 	void __user *argp = (void __user *)arg;
@@ -2123,6 +2164,12 @@ static long sev_ioctl(struct file *file, unsigned int ioctl, unsigned long arg)
 	case SNP_VLEK_LOAD:
 		ret = sev_ioctl_do_snp_vlek_load(&input, writable);
 		break;
+	case SNP_PAUSE_ATTESTATION:
+		ret = sev_ioctl_do_snp_pause_attestation(&input, writable);
+		break;
+	case SNP_RESUME_ATTESTATION:
+		ret = sev_ioctl_do_snp_resume_attestation(&input, writable);
+		break;
 	default:
 		ret = -EINVAL;
 		goto out;
diff --git a/include/uapi/linux/psp-sev.h b/include/uapi/linux/psp-sev.h
index 2289b7c76c59..7b35b2814a99 100644
--- a/include/uapi/linux/psp-sev.h
+++ b/include/uapi/linux/psp-sev.h
@@ -32,6 +32,8 @@ enum {
 	SNP_COMMIT,
 	SNP_SET_CONFIG,
 	SNP_VLEK_LOAD,
+	SNP_PAUSE_ATTESTATION,
+	SNP_RESUME_ATTESTATION,
 
 	SEV_MAX,
 };
@@ -241,6 +243,16 @@ struct sev_user_data_snp_wrapped_vlek_hashstick {
 	__u8 data[432];				/* In */
 } __packed;
 
+/**
+ * struct sev_user_data_snp_pause_attestation - metadata for pausing attestation
+ *
+ * @id: the ID of the transaction started/ended by a call to SNP_PAUSE_ATTESTATION
+ *	or SNP_RESUME_ATTESTATION, respectively.
+ */
+struct sev_user_data_snp_pause_attestation {
+	__u64 id;				/* Out */
+} __packed;
+
 /**
  * struct sev_issue_cmd - SEV ioctl parameters
  *
-- 
2.25.1



* [PATCH v13 26/26] KVM: SEV: Provide support for SNP_EXTENDED_GUEST_REQUEST NAE event
  2024-04-18 19:41 [PATCH v13 00/26] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support Michael Roth
                   ` (24 preceding siblings ...)
  2024-04-18 19:41 ` [PATCH v13 25/26] crypto: ccp: Add the SNP_{PAUSE,RESUME}_ATTESTATION commands Michael Roth
@ 2024-04-18 19:41 ` Michael Roth
  2024-04-19 12:04 ` [PATCH v13 00/26] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support Paolo Bonzini
  26 siblings, 0 replies; 41+ messages in thread
From: Michael Roth @ 2024-04-18 19:41 UTC (permalink / raw)
  To: kvm
  Cc: linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta,
	liam.merwick

Version 2 of the GHCB specification added support for the SNP Extended
Guest Request Message NAE event. This event serves a nearly identical
purpose to the previously-added SNP_GUEST_REQUEST event, but allows for
additional certificate data to be supplied via an additional
guest-supplied buffer to be used mainly for verifying the signature of
an attestation report as returned by firmware.

This certificate data is supplied by userspace, so unlike with
SNP_GUEST_REQUEST events, SNP_EXTENDED_GUEST_REQUEST events are first
forwarded to userspace via a KVM_EXIT_VMGEXIT exit type, and then the
firmware request is made only afterward.

Implement handling for these events.

Since there is a potential for race conditions where the
userspace-supplied certificate data may be out-of-sync relative to the
reported TCB or VLEK that firmware will use when signing attestation
reports, make use of the synchronization mechanisms wired up to the
SNP_{PAUSE,RESUME}_ATTESTATION SEV device ioctls such that the guest
will be told to retry the request while attestation has been paused due
to an update being underway on the system.

Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 Documentation/virt/kvm/api.rst | 26 +++++++++++
 arch/x86/include/asm/sev.h     |  6 +++
 arch/x86/kvm/svm/sev.c         | 82 ++++++++++++++++++++++++++++++++++
 arch/x86/kvm/svm/svm.h         |  3 ++
 arch/x86/virt/svm/sev.c        | 37 +++++++++++++++
 include/uapi/linux/kvm.h       |  6 +++
 6 files changed, 160 insertions(+)
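
Illustrative note: a minimal sketch of how a VMM might fill in the
ext_guest_req fields according to the return values documented in the
api.rst hunk below (0 on success, 1 when more pages are needed, 2 to
ask the guest to retry). The structure is a local mirror of the uapi
member. handle_ext_guest_req(), the 'dst' host mapping, the 'busy' flag
and the 4K page size are assumptions made for this example only.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GUEST_PAGE_SIZE 4096u	/* assumption: 4K guest pages */

/* Return codes described in the api.rst hunk of this patch. */
#define EXT_REQ_OK          0
#define EXT_REQ_INVALID_LEN 1
#define EXT_REQ_BUSY        2

/* Local mirror of the ext_guest_req member of struct kvm_user_vmgexit. */
struct ext_guest_req {
	uint64_t data_gpa;
	uint64_t data_npages;
	uint32_t ret;
};

/*
 * Hypothetical VMM handler. 'dst' stands in for the host mapping of
 * data_gpa, and 'busy' for whatever condition makes the VMM defer.
 */
static void handle_ext_guest_req(struct ext_guest_req *req, uint8_t *dst,
				 const uint8_t *certs, size_t certs_len,
				 int busy)
{
	uint64_t needed = (certs_len + GUEST_PAGE_SIZE - 1) / GUEST_PAGE_SIZE;

	if (busy) {
		req->ret = EXT_REQ_BUSY;
	} else if (req->data_npages < needed) {
		/* Tell the guest how many pages it has to supply. */
		req->data_npages = needed;
		req->ret = EXT_REQ_INVALID_LEN;
	} else {
		memcpy(dst, certs, certs_len);
		req->ret = EXT_REQ_OK;
	}
}
```

On the kernel side, snp_complete_ext_guest_req() copies data_npages
back into the guest's RBX only for the invalid-length case, so the
guest can retry with a larger buffer.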

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 85099198a10f..6cf186ed8f66 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -7066,6 +7066,7 @@ values in kvm_run even if the corresponding bit in kvm_dirty_regs is not set.
 		struct kvm_user_vmgexit {
 		#define KVM_USER_VMGEXIT_PSC_MSR	1
 		#define KVM_USER_VMGEXIT_PSC		2
+		#define KVM_USER_VMGEXIT_EXT_GUEST_REQ	3
 			__u32 type; /* KVM_USER_VMGEXIT_* type */
 			union {
 				struct {
@@ -7079,6 +7080,11 @@ values in kvm_run even if the corresponding bit in kvm_dirty_regs is not set.
 					__u64 shared_gpa;
 					__u64 ret;
 				} psc;
+				struct {
+					__u64 data_gpa;
+					__u64 data_npages;
+					__u32 ret;
+				} ext_guest_req;
 			};
 		};
 
@@ -7108,6 +7114,26 @@ private/shared state. Userspace will return a value in 'ret' that is in
 agreement with the GHCB-defined return values that the guest will expect
 in the SW_EXITINFO2 field of the GHCB in response to these requests.
 
+For the KVM_USER_VMGEXIT_EXT_GUEST_REQ type, the ext_guest_req union type
+is used. The kernel will supply in 'data_gpa' the value the guest supplies
+via the RAX field of the GHCB when issuing extended guest requests.
+'data_npages' will similarly contain the value the guest supplies in RBX
+denoting the number of shared pages available to write the certificate
+data into.
+
+  - If the supplied number of pages is sufficient, userspace should write
+    the certificate data blob (in the format defined by the GHCB spec) in
+    the address indicated by 'data_gpa' and set 'ret' to 0.
+
+  - If the number of pages supplied is not sufficient, userspace must write
+    the required number of pages in 'data_npages' and then set 'ret' to 1.
+
+  - If userspace is temporarily unable to handle the request, 'ret' should
+    be set to 2 to inform the guest to retry later.
+
+  - If some other error occurred, userspace should set 'ret' to a non-zero
+    value that is distinct from the specific return values mentioned above.
+
 6. Capabilities that can be enabled on vCPUs
 ============================================
 
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index baf223eb5633..65a012f6bcb4 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -276,6 +276,9 @@ void snp_leak_pages(u64 pfn, unsigned int npages);
 void kdump_sev_callback(void);
 int snp_pause_attestation(u64 *transaction_id);
 void snp_resume_attestation(u64 *transaction_id);
+u64 snp_transaction_get_id(void);
+bool __snp_transaction_is_stale(u64 transaction_id);
+bool snp_transaction_is_stale(u64 transaction_id);
 #else
 static inline bool snp_probe_rmptable_info(void) { return false; }
 static inline int snp_lookup_rmpentry(u64 pfn, bool *assigned, int *level) { return -ENODEV; }
@@ -291,6 +294,9 @@ static inline void snp_leak_pages(u64 pfn, unsigned int npages) {}
 static inline void kdump_sev_callback(void) { }
 static inline int snp_pause_attestation(u64 *transaction_id) { return 0; }
 static inline void snp_resume_attestation(u64 *transaction_id) {}
+static inline u64 snp_transaction_get_id(void) { return 0; }
+static inline bool __snp_transaction_is_stale(u64 transaction_id) { return false; }
+static inline bool snp_transaction_is_stale(u64 transaction_id) { return false; }
 #endif
 
 #endif
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 953f00ddf31b..8ba29b2b2b0a 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -3283,6 +3283,7 @@ static int sev_es_validate_vmgexit(struct vcpu_svm *svm)
 	case SVM_VMGEXIT_PSC:
 	case SVM_VMGEXIT_TERM_REQUEST:
 	case SVM_VMGEXIT_GUEST_REQUEST:
+	case SVM_VMGEXIT_EXT_GUEST_REQUEST:
 		break;
 	default:
 		reason = GHCB_ERR_INVALID_EVENT;
@@ -3803,6 +3804,84 @@ static void snp_handle_guest_req(struct vcpu_svm *svm, gpa_t req_gpa, gpa_t resp
 	ghcb_set_sw_exit_info_2(svm->sev_es.ghcb, SNP_GUEST_ERR(vmm_ret, fw_err));
 }
 
+static int snp_complete_ext_guest_req(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_svm *svm = to_svm(vcpu);
+	struct vmcb_control_area *control;
+	struct kvm *kvm = vcpu->kvm;
+	sev_ret_code fw_err = 0;
+	int vmm_ret;
+
+	vmm_ret = vcpu->run->vmgexit.ext_guest_req.ret;
+	if (vmm_ret) {
+		if (vmm_ret == SNP_GUEST_VMM_ERR_INVALID_LEN)
+			vcpu->arch.regs[VCPU_REGS_RBX] =
+				vcpu->run->vmgexit.ext_guest_req.data_npages;
+		goto abort_request;
+	}
+
+	control = &svm->vmcb->control;
+
+	/*
+	 * To avoid the message sequence number getting out of sync between the
+	 * actual value seen by firmware versus the value expected by the guest,
+	 * make sure attestations can't get paused on the write-side at this
+	 * point by holding the lock for the entire duration of the firmware
+	 * request so that there is no situation where SNP_GUEST_VMM_ERR_BUSY
+	 * would need to be returned after firmware sees the request.
+	 */
+	mutex_lock(&snp_pause_attestation_lock);
+
+	if (__snp_transaction_is_stale(svm->snp_transaction_id))
+		vmm_ret = SNP_GUEST_VMM_ERR_BUSY;
+	else if (!__snp_handle_guest_req(kvm, control->exit_info_1,
+					 control->exit_info_2, &fw_err))
+		vmm_ret = SNP_GUEST_VMM_ERR_GENERIC;
+
+	mutex_unlock(&snp_pause_attestation_lock);
+
+abort_request:
+	ghcb_set_sw_exit_info_2(svm->sev_es.ghcb, SNP_GUEST_ERR(vmm_ret, fw_err));
+
+	return 1; /* resume guest */
+}
+
+static int snp_begin_ext_guest_req(struct kvm_vcpu *vcpu)
+{
+	int vmm_ret = SNP_GUEST_VMM_ERR_GENERIC;
+	struct vcpu_svm *svm = to_svm(vcpu);
+	unsigned long data_npages;
+	sev_ret_code fw_err = 0;
+	gpa_t data_gpa;
+
+	if (!sev_snp_guest(vcpu->kvm))
+		goto abort_request;
+
+	data_gpa = vcpu->arch.regs[VCPU_REGS_RAX];
+	data_npages = vcpu->arch.regs[VCPU_REGS_RBX];
+
+	if (!IS_ALIGNED(data_gpa, PAGE_SIZE))
+		goto abort_request;
+
+	svm->snp_transaction_id = snp_transaction_get_id();
+	if (snp_transaction_is_stale(svm->snp_transaction_id)) {
+		vmm_ret = SNP_GUEST_VMM_ERR_BUSY;
+		goto abort_request;
+	}
+
+	vcpu->run->exit_reason = KVM_EXIT_VMGEXIT;
+	vcpu->run->vmgexit.type = KVM_USER_VMGEXIT_EXT_GUEST_REQ;
+	vcpu->run->vmgexit.ext_guest_req.data_gpa = data_gpa;
+	vcpu->run->vmgexit.ext_guest_req.data_npages = data_npages;
+	vcpu->arch.complete_userspace_io = snp_complete_ext_guest_req;
+
+	return 0; /* forward request to userspace */
+
+abort_request:
+	ghcb_set_sw_exit_info_2(svm->sev_es.ghcb, SNP_GUEST_ERR(vmm_ret, fw_err));
+	return 1; /* resume guest */
+}
+
 static int sev_handle_vmgexit_msr_protocol(struct vcpu_svm *svm)
 {
 	struct vmcb_control_area *control = &svm->vmcb->control;
@@ -4067,6 +4146,9 @@ int sev_handle_vmgexit(struct kvm_vcpu *vcpu)
 		snp_handle_guest_req(svm, control->exit_info_1, control->exit_info_2);
 		ret = 1;
 		break;
+	case SVM_VMGEXIT_EXT_GUEST_REQUEST:
+		ret = snp_begin_ext_guest_req(vcpu);
+		break;
 	case SVM_VMGEXIT_UNSUPPORTED_EVENT:
 		vcpu_unimpl(vcpu,
 			    "vmgexit: unsupported event - exit_info_1=%#llx, exit_info_2=%#llx\n",
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 8a8ee475ad86..28140bc8af27 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -303,6 +303,9 @@ struct vcpu_svm {
 
 	/* Guest GIF value, used when vGIF is not enabled */
 	bool guest_gif;
+
+	/* Transaction ID associated with SNP config updates */
+	u64 snp_transaction_id;
 };
 
 struct svm_cpu_data {
diff --git a/arch/x86/virt/svm/sev.c b/arch/x86/virt/svm/sev.c
index b75f2e7d4012..f1f7486a3dcf 100644
--- a/arch/x86/virt/svm/sev.c
+++ b/arch/x86/virt/svm/sev.c
@@ -72,6 +72,7 @@ static unsigned long snp_nr_leaked_pages;
 
 /* For synchronizing TCB/certificate updates with extended guest requests */
 DEFINE_MUTEX(snp_pause_attestation_lock);
+EXPORT_SYMBOL_GPL(snp_pause_attestation_lock);
 static u64 snp_transaction_id;
 static bool snp_attestation_paused;
 
@@ -611,3 +612,39 @@ void snp_resume_attestation(u64 *transaction_id)
 	mutex_unlock(&snp_pause_attestation_lock);
 }
 EXPORT_SYMBOL_GPL(snp_resume_attestation);
+
+u64 snp_transaction_get_id(void)
+{
+	u64 id;
+
+	mutex_lock(&snp_pause_attestation_lock);
+	id = snp_transaction_id;
+	mutex_unlock(&snp_pause_attestation_lock);
+
+	return id;
+}
+EXPORT_SYMBOL_GPL(snp_transaction_get_id);
+
+/* Must be called with snp_pause_attestation_lock held */
+bool __snp_transaction_is_stale(u64 transaction_id)
+{
+	lockdep_assert_held(&snp_pause_attestation_lock);
+
+	return (snp_attestation_paused ||
+		transaction_id != snp_transaction_id);
+}
+EXPORT_SYMBOL_GPL(__snp_transaction_is_stale);
+
+bool snp_transaction_is_stale(u64 transaction_id)
+{
+	bool stale;
+
+	mutex_lock(&snp_pause_attestation_lock);
+
+	stale = __snp_transaction_is_stale(transaction_id);
+
+	mutex_unlock(&snp_pause_attestation_lock);
+
+	return stale;
+}
+EXPORT_SYMBOL_GPL(snp_transaction_is_stale);
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index e33c48bfbd67..585de3a2591e 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -138,6 +138,7 @@ struct kvm_xen_exit {
 struct kvm_user_vmgexit {
 #define KVM_USER_VMGEXIT_PSC_MSR	1
 #define KVM_USER_VMGEXIT_PSC		2
+#define KVM_USER_VMGEXIT_EXT_GUEST_REQ	3
 	__u32 type; /* KVM_USER_VMGEXIT_* type */
 	union {
 		struct {
@@ -151,6 +152,11 @@ struct kvm_user_vmgexit {
 			__u64 shared_gpa;
 			__u64 ret;
 		} psc;
+		struct {
+			__u64 data_gpa;
+			__u64 data_npages;
+			__u32 ret;
+		} ext_guest_req;
 	};
 };
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [PATCH v13 09/26] KVM: SEV: Add KVM_SEV_SNP_LAUNCH_START command
  2024-04-18 19:41 ` [PATCH v13 09/26] KVM: SEV: Add KVM_SEV_SNP_LAUNCH_START command Michael Roth
@ 2024-04-19 11:52   ` Paolo Bonzini
  2024-04-19 14:19     ` Michael Roth
  2024-04-24 21:40   ` Sean Christopherson
  1 sibling, 1 reply; 41+ messages in thread
From: Paolo Bonzini @ 2024-04-19 11:52 UTC (permalink / raw)
  To: Michael Roth
  Cc: kvm, linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, seanjc, vkuznets,
	jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta,
	liam.merwick, Brijesh Singh

On Thu, Apr 18, 2024 at 9:42 PM Michael Roth <michael.roth@amd.com> wrote:
> +/* As defined by SEV-SNP Firmware ABI, under "Guest Policy". */
> +#define SNP_POLICY_MASK_API_MAJOR      GENMASK_ULL(15, 8)
> +#define SNP_POLICY_MASK_API_MINOR      GENMASK_ULL(7, 0)
> +
> +#define SNP_POLICY_MASK_VALID          (SNP_POLICY_MASK_SMT            | \
> +                                        SNP_POLICY_MASK_RSVD_MBO       | \
> +                                        SNP_POLICY_MASK_DEBUG          | \
> +                                        SNP_POLICY_MASK_SINGLE_SOCKET  | \
> +                                        SNP_POLICY_MASK_API_MAJOR      | \
> +                                        SNP_POLICY_MASK_API_MINOR)
> +
> +/* KVM's SNP support is compatible with 1.51 of the SEV-SNP Firmware ABI. */
> +#define SNP_POLICY_API_MAJOR           1
> +#define SNP_POLICY_API_MINOR           51

> +static inline bool sev_version_greater_or_equal(u8 major, u8 minor)
> +{
> +       if (major < SNP_POLICY_API_MAJOR)
> +               return true;

Should it perhaps refuse version 0.x? With something like a

#define SNP_POLICY_API_MAJOR_MIN    1

to make it a bit more future proof (and testable).

> +       major = (params.policy & SNP_POLICY_MASK_API_MAJOR);

This should be >> 8. Do the QEMU patches not set the API version? :)
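
A minimal standalone sketch of both points above (the lower bound on the
major version and the missing shift). The names POLICY_SHIFT_API_MAJOR,
POLICY_API_MAJOR_MIN and policy_api_version_ok() are hypothetical and not
taken from the patch, and the accepted range (1.0 through 1.51) is an
assumption about the intended semantics:

```c
#include <stdbool.h>
#include <stdint.h>

/* Bits 15:8 and 7:0 of the guest policy, matching the quoted masks. */
#define POLICY_MASK_API_MAJOR   0xff00ULL
#define POLICY_MASK_API_MINOR   0x00ffULL
#define POLICY_SHIFT_API_MAJOR  8	/* hypothetical name */

#define POLICY_API_MAJOR_MIN    1	/* hypothetical lower bound: refuse 0.x */
#define POLICY_API_MAJOR        1
#define POLICY_API_MINOR        51

static inline uint8_t policy_api_major(uint64_t policy)
{
	/* Without the shift, storing the masked value in a u8 truncates it to 0. */
	return (uint8_t)((policy & POLICY_MASK_API_MAJOR) >> POLICY_SHIFT_API_MAJOR);
}

static inline uint8_t policy_api_minor(uint64_t policy)
{
	return (uint8_t)(policy & POLICY_MASK_API_MINOR);
}

/* Assumed semantics: accept versions from MAJOR_MIN.0 up to MAJOR.MINOR. */
static inline bool policy_api_version_ok(uint64_t policy)
{
	uint8_t major = policy_api_major(policy);
	uint8_t minor = policy_api_minor(policy);

	if (major < POLICY_API_MAJOR_MIN || major > POLICY_API_MAJOR)
		return false;

	return major < POLICY_API_MAJOR || minor <= POLICY_API_MINOR;
}
```

With this shape, a policy requesting 1.51 (0x0133) passes, while 0.51
(0x0033) and 1.52 (0x0134) are refused.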

Paolo


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v13 10/26] KVM: SEV: Add KVM_SEV_SNP_LAUNCH_UPDATE command
  2024-04-18 19:41 ` [PATCH v13 10/26] KVM: SEV: Add KVM_SEV_SNP_LAUNCH_UPDATE command Michael Roth
@ 2024-04-19 11:56   ` Paolo Bonzini
  2024-04-19 16:12     ` Paolo Bonzini
  0 siblings, 1 reply; 41+ messages in thread
From: Paolo Bonzini @ 2024-04-19 11:56 UTC (permalink / raw)
  To: Michael Roth
  Cc: kvm, linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, seanjc, vkuznets,
	jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta,
	liam.merwick, Brijesh Singh

On Thu, Apr 18, 2024 at 9:42 PM Michael Roth <michael.roth@amd.com> wrote:
>
> From: Brijesh Singh <brijesh.singh@amd.com>
>
> A key aspect of launching an SNP guest is initializing it with a
> known/measured payload which is then encrypted into guest memory as
> pre-validated private pages and then measured into the cryptographic
> launch context created with KVM_SEV_SNP_LAUNCH_START so that the guest
> can attest itself after booting.
>
> Since all private pages are provided by guest_memfd, make use of the
> kvm_gmem_populate() interface to handle this. The general flow is that
> guest_memfd will handle allocating the pages associated with the GPA
> ranges being initialized by each particular call of
> KVM_SEV_SNP_LAUNCH_UPDATE, copying data from userspace into those pages,
> and then the post_populate callback will do the work of setting the
> RMP entries for these pages to private and issuing the SNP firmware
> calls to encrypt/measure them.
>
> For more information see the SEV-SNP specification.
>
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> Co-developed-by: Michael Roth <michael.roth@amd.com>
> Signed-off-by: Michael Roth <michael.roth@amd.com>
> Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
> ---
>  .../virt/kvm/x86/amd-memory-encryption.rst    |  39 ++++
>  arch/x86/include/uapi/asm/kvm.h               |  15 ++
>  arch/x86/kvm/svm/sev.c                        | 218 ++++++++++++++++++
>  3 files changed, 272 insertions(+)
>
> diff --git a/Documentation/virt/kvm/x86/amd-memory-encryption.rst b/Documentation/virt/kvm/x86/amd-memory-encryption.rst
> index 1b042f827eab..1ee8401de72d 100644
> --- a/Documentation/virt/kvm/x86/amd-memory-encryption.rst
> +++ b/Documentation/virt/kvm/x86/amd-memory-encryption.rst
> @@ -478,6 +478,45 @@ Returns: 0 on success, -negative on error
>
>  See the SEV-SNP spec [snp-fw-abi]_ for further detail on the launch input.
>
> +19. KVM_SEV_SNP_LAUNCH_UPDATE
> +-----------------------------
> +
> +The KVM_SEV_SNP_LAUNCH_UPDATE command is used for loading userspace-provided
> +data into a guest GPA range, measuring the contents into the SNP guest context
> +created by KVM_SEV_SNP_LAUNCH_START, and then encrypting/validating that GPA
> +range so that it will be immediately readable using the encryption key
> +associated with the guest context once it is booted, after which point it can
> +attest the measurement associated with its context before unlocking any
> +secrets.
> +
> +It is required that the GPA ranges initialized by this command have had the
> +KVM_MEMORY_ATTRIBUTE_PRIVATE attribute set in advance. See the documentation
> +for KVM_SET_MEMORY_ATTRIBUTES for more details on this aspect.
> +
> +Parameters (in): struct  kvm_sev_snp_launch_update
> +
> +Returns: 0 on success, -negative on error
> +
> +::
> +
> +        struct kvm_sev_snp_launch_update {
> +                __u64 gfn_start;        /* Guest page number to load/encrypt data into. */
> +                __u64 uaddr;            /* Userspace address of data to be loaded/encrypted. */
> +                __u32 len;              /* 4k-aligned length in bytes to copy into guest memory.*/
> +                __u8 type;              /* The type of the guest pages being initialized. */
> +        };
> +
> +where the allowed values for type are #define'd as::
> +
> +       KVM_SEV_SNP_PAGE_TYPE_NORMAL
> +       KVM_SEV_SNP_PAGE_TYPE_ZERO
> +       KVM_SEV_SNP_PAGE_TYPE_UNMEASURED
> +       KVM_SEV_SNP_PAGE_TYPE_SECRETS
> +       KVM_SEV_SNP_PAGE_TYPE_CPUID
> +
> +See the SEV-SNP spec [snp-fw-abi]_ for further details on how each page type is
> +used/measured.
> +
>  Device attribute API
>  ====================
>
> diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
> index bdf8c5461a36..8612aec97f55 100644
> --- a/arch/x86/include/uapi/asm/kvm.h
> +++ b/arch/x86/include/uapi/asm/kvm.h
> @@ -699,6 +699,7 @@ enum sev_cmd_id {
>
>         /* SNP-specific commands */
>         KVM_SEV_SNP_LAUNCH_START = 100,
> +       KVM_SEV_SNP_LAUNCH_UPDATE,
>
>         KVM_SEV_NR_MAX,
>  };
> @@ -830,6 +831,20 @@ struct kvm_sev_snp_launch_start {
>         __u8 gosvw[16];
>  };
>
> +/* Kept in sync with firmware values for simplicity. */
> +#define KVM_SEV_SNP_PAGE_TYPE_NORMAL           0x1
> +#define KVM_SEV_SNP_PAGE_TYPE_ZERO             0x3
> +#define KVM_SEV_SNP_PAGE_TYPE_UNMEASURED       0x4
> +#define KVM_SEV_SNP_PAGE_TYPE_SECRETS          0x5
> +#define KVM_SEV_SNP_PAGE_TYPE_CPUID            0x6
> +
> +struct kvm_sev_snp_launch_update {
> +       __u64 gfn_start;
> +       __u64 uaddr;
> +       __u32 len;
> +       __u8 type;
> +};
> +
>  #define KVM_X2APIC_API_USE_32BIT_IDS            (1ULL << 0)
>  #define KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK  (1ULL << 1)
>
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 4c5abc0e7806..e721152bae00 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -262,6 +262,35 @@ static void sev_decommission(unsigned int handle)
>         sev_guest_decommission(&decommission, NULL);
>  }
>
> +static int snp_page_reclaim(u64 pfn)
> +{
> +       struct sev_data_snp_page_reclaim data = {0};
> +       int err, rc;
> +
> +       data.paddr = __sme_set(pfn << PAGE_SHIFT);
> +       rc = sev_do_cmd(SEV_CMD_SNP_PAGE_RECLAIM, &data, &err);
> +       if (WARN_ON_ONCE(rc)) {
> +               /*
> +                * This shouldn't happen under normal circumstances, but if the
> +                * reclaim failed, then the page is no longer safe to use.
> +                */
> +               snp_leak_pages(pfn, 1);
> +       }
> +
> +       return rc;
> +}
> +
> +static int host_rmp_make_shared(u64 pfn, enum pg_level level)
> +{
> +       int rc;
> +
> +       rc = rmp_make_shared(pfn, level);
> +       if (rc)
> +               snp_leak_pages(pfn, page_level_size(level) >> PAGE_SHIFT);
> +
> +       return rc;
> +}
> +
>  static void sev_unbind_asid(struct kvm *kvm, unsigned int handle)
>  {
>         struct sev_data_deactivate deactivate;
> @@ -2131,6 +2160,192 @@ static int snp_launch_start(struct kvm *kvm, struct kvm_sev_cmd *argp)
>         return rc;
>  }
>
> +struct sev_gmem_populate_args {
> +       __u8 type;
> +       int sev_fd;
> +       int fw_error;
> +};
> +
> +static int sev_gmem_post_populate(struct kvm *kvm, gfn_t gfn_start, kvm_pfn_t pfn,
> +                                 void __user *src, int order, void *opaque)
> +{
> +       struct sev_gmem_populate_args *sev_populate_args = opaque;
> +       struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
> +       int n_private = 0, ret, i;
> +       int npages = (1 << order);
> +       gfn_t gfn;
> +
> +       pr_debug("%s: gfn_start %llx pfn_start %llx npages %d\n",
> +                __func__, gfn_start, pfn, npages);
> +
> +       for (gfn = gfn_start, i = 0; gfn < gfn_start + npages; gfn++, i++) {
> +               struct sev_data_snp_launch_update fw_args = {0};
> +               bool assigned;
> +               void *vaddr;
> +               int level;
> +
> +               if (!kvm_mem_is_private(kvm, gfn)) {
> +                       pr_debug("%s: Failed to ensure GFN 0x%llx has private memory attribute set\n",
> +                                __func__, gfn);
> +                       ret = -EINVAL;
> +                       break;
> +               }
> +
> +               ret = snp_lookup_rmpentry((u64)pfn + i, &assigned, &level);
> +               if (ret || assigned) {
> +                       pr_debug("%s: Failed to ensure GFN 0x%llx RMP entry is initial shared state, ret: %d assigned: %d\n",
> +                                __func__, gfn, ret, assigned);
> +                       ret = -EINVAL;
> +                       break;
> +               }
> +
> +               vaddr = kmap_local_pfn(pfn + i);
> +               ret = copy_from_user(vaddr, src + i * PAGE_SIZE, PAGE_SIZE);
> +               if (ret) {
> +                       pr_debug("Failed to copy source page into GFN 0x%llx\n", gfn);
> +                       goto out_unmap;
> +               }
> +
> +               ret = rmp_make_private(pfn + i, gfn << PAGE_SHIFT, PG_LEVEL_4K,
> +                                      sev_get_asid(kvm), true);
> +               if (ret) {
> +                       pr_debug("%s: Failed to convert GFN 0x%llx to private, ret: %d\n",
> +                                __func__, gfn, ret);
> +                       goto out_unmap;
> +               }
> +
> +               n_private++;
> +
> +               fw_args.gctx_paddr = __psp_pa(sev->snp_context);
> +               fw_args.address = __sme_set(pfn_to_hpa(pfn + i));
> +               fw_args.page_size = PG_LEVEL_TO_RMP(PG_LEVEL_4K);
> +               fw_args.page_type = sev_populate_args->type;
> +               ret = __sev_issue_cmd(sev_populate_args->sev_fd, SEV_CMD_SNP_LAUNCH_UPDATE,
> +                                     &fw_args, &sev_populate_args->fw_error);
> +               if (ret) {
> +                       pr_debug("%s: SEV-SNP launch update failed, ret: 0x%x, fw_error: 0x%x\n",
> +                                __func__, ret, sev_populate_args->fw_error);
> +
> +                       if (snp_page_reclaim(pfn + i))
> +                               goto out_unmap;
> +
> +                       /*
> +                        * When invalid CPUID function entries are detected,
> +                        * firmware writes the expected values into the page and
> +                        * leaves it unencrypted so it can be used for debugging
> +                        * and error-reporting.
> +                        *
> +                        * Copy this page back into the source buffer so
> +                        * userspace can use this information to provide
> +                        * information on which CPUID leaves/fields failed CPUID
> +                        * validation.
> +                        */
> +                       if (sev_populate_args->type == KVM_SEV_SNP_PAGE_TYPE_CPUID &&
> +                           sev_populate_args->fw_error == SEV_RET_INVALID_PARAM) {
> +                               host_rmp_make_shared(pfn + i, PG_LEVEL_4K);
> +
> +                               if (copy_to_user(src + i * PAGE_SIZE,
> +                                                vaddr, PAGE_SIZE))
> +                                       pr_debug("Failed to write CPUID page back to userspace\n");
> +                       }
> +               }
> +
> +out_unmap:
> +               kunmap_local(vaddr);
> +               if (ret)
> +                       break;
> +       }
> +
> +       if (ret) {
> +               pr_debug("%s: exiting with error ret %d, undoing %d populated gmem pages.\n",
> +                        __func__, ret, n_private);
> +               for (i = 0; i < n_private; i++)
> +                       host_rmp_make_shared(pfn + i, PG_LEVEL_4K);
> +       }
> +
> +       return ret;
> +}
> +
> +static int snp_launch_update(struct kvm *kvm, struct kvm_sev_cmd *argp)
> +{
> +       struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
> +       struct sev_gmem_populate_args sev_populate_args = {0};
> +       struct kvm_sev_snp_launch_update params;
> +       struct kvm_memory_slot *memslot;
> +       unsigned int npages;
> +       int ret = 0;
> +
> +       if (!sev_snp_guest(kvm) || !sev->snp_context)
> +               return -EINVAL;
> +
> +       if (copy_from_user(&params, u64_to_user_ptr(argp->data), sizeof(params)))
> +               return -EFAULT;
> +
> +       if (!IS_ALIGNED(params.len, PAGE_SIZE) ||
> +           (params.type != KVM_SEV_SNP_PAGE_TYPE_NORMAL &&
> +            params.type != KVM_SEV_SNP_PAGE_TYPE_ZERO &&
> +            params.type != KVM_SEV_SNP_PAGE_TYPE_UNMEASURED &&
> +            params.type != KVM_SEV_SNP_PAGE_TYPE_SECRETS &&
> +            params.type != KVM_SEV_SNP_PAGE_TYPE_CPUID))
> +               return -EINVAL;
> +
> +       npages = params.len / PAGE_SIZE;
> +
> +       pr_debug("%s: GFN range 0x%llx-0x%llx type %d\n", __func__,
> +                params.gfn_start, params.gfn_start + npages, params.type);
> +
> +       /*
> +        * For each GFN that's being prepared as part of the initial guest
> +        * state, the following pre-conditions are verified:
> +        *
> +        *   1) The backing memslot is a valid private memslot.
> +        *   2) The GFN has been set to private via KVM_SET_MEMORY_ATTRIBUTES
> +        *      beforehand.
> +        *   3) The PFN of the guest_memfd has not already been set to private
> +        *      in the RMP table.
> +        *
> +        * The KVM MMU relies on kvm->mmu_invalidate_seq to retry nested page
> +        * faults if there's a race between a fault and an attribute update via
> +        * KVM_SET_MEMORY_ATTRIBUTES, and a similar approach could be utilized
> +        * here. However, kvm->slots_lock guards against both this as well as
> +        * concurrent memslot updates occurring while these checks are being
> +        * performed, so use that here to make it easier to reason about the
> +        * initial expected state and better guard against unexpected
> +        * situations.
> +        */
> +       mutex_lock(&kvm->slots_lock);
> +
> +       memslot = gfn_to_memslot(kvm, params.gfn_start);
> +       if (!kvm_slot_can_be_private(memslot)) {
> +               ret = -EINVAL;
> +               goto out;
> +       }
> +
> +       sev_populate_args.sev_fd = argp->sev_fd;
> +       sev_populate_args.type = params.type;
> +
> +       ret = kvm_gmem_populate(kvm, params.gfn_start, u64_to_user_ptr(params.uaddr),
> +                               npages, sev_gmem_post_populate, &sev_populate_args);
> +       if (ret < 0) {
> +               argp->error = sev_populate_args.fw_error;
> +               pr_debug("%s: kvm_gmem_populate failed, ret %d (fw_error %d)\n",
> +                        __func__, ret, argp->error);
> +       } else if (ret < npages) {
> +               params.len = ret * PAGE_SIZE;
> +               ret = -EINTR;

This should probably 1) also update gfn_start and uaddr, and 2) return 0
for consistency with the planned KVM_PRE_FAULT_MEMORY ioctl (aka
KVM_MAP_MEMORY).
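
One possible reading of that suggestion, as a standalone sketch: on partial
completion the command advances gfn_start/uaddr, shrinks len to the
remaining range, and returns 0, so userspace simply reissues it until len
reaches zero. Everything below is illustrative; launch_update_advance(),
fake_launch_update() (a stand-in for the real ioctl that handles at most
two pages per call) and launch_update_all() are hypothetical names:

```c
#include <stdint.h>

#define SNP_PAGE_SIZE 4096u	/* hypothetical name for the 4K page size */

/* Mirrors the fields of struct kvm_sev_snp_launch_update quoted above. */
struct launch_update {
	uint64_t gfn_start;
	uint64_t uaddr;
	uint32_t len;
	uint8_t type;
};

/* Kernel side (sketch): report progress by advancing past 'done' pages. */
static void launch_update_advance(struct launch_update *p, uint32_t done)
{
	p->gfn_start += done;
	p->uaddr += (uint64_t)done * SNP_PAGE_SIZE;
	p->len -= done * SNP_PAGE_SIZE;
}

/* Stand-in for the ioctl: handles at most two pages per call. */
static int fake_launch_update(struct launch_update *p)
{
	uint32_t npages = p->len / SNP_PAGE_SIZE;

	/* The real command rejects lengths that are not 4K-aligned. */
	if (p->len % SNP_PAGE_SIZE)
		return -1;

	launch_update_advance(p, npages < 2 ? npages : 2);
	return 0;
}

/* Userspace side (sketch): reissue the command until nothing remains. */
static int launch_update_all(struct launch_update *p, int *calls)
{
	*calls = 0;

	while (p->len) {
		int ret = fake_launch_update(p);

		(*calls)++;
		if (ret)
			return ret;
	}

	return 0;
}
```

The point of returning 0 with updated fields is that the caller needs no
special-case handling for an interrupted call; the retry loop is the same
whether one call or several are needed.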

Paolo


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v13 08/26] KVM: SEV: Add initial SEV-SNP support
  2024-04-18 19:41 ` [PATCH v13 08/26] KVM: SEV: Add initial SEV-SNP support Michael Roth
@ 2024-04-19 11:58   ` Paolo Bonzini
  0 siblings, 0 replies; 41+ messages in thread
From: Paolo Bonzini @ 2024-04-19 11:58 UTC (permalink / raw)
  To: Michael Roth
  Cc: kvm, linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, seanjc, vkuznets,
	jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta,
	liam.merwick, Brijesh Singh

On Thu, Apr 18, 2024 at 9:51 PM Michael Roth <michael.roth@amd.com> wrote:
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 535018f152a3..d31404953bf1 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -4899,7 +4899,8 @@ static int svm_vm_init(struct kvm *kvm)
>
>         if (type != KVM_X86_DEFAULT_VM &&
>             type != KVM_X86_SW_PROTECTED_VM) {
> -               kvm->arch.has_protected_state = (type == KVM_X86_SEV_ES_VM);
> +               kvm->arch.has_protected_state =
> +                       (type == KVM_X86_SEV_ES_VM || type == KVM_X86_SNP_VM);

I'd rather set has_private_mem here too, rather than in x86.c

Also this patch is the place to have

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index d31404953bf1..6209f70ab11a 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -2056,6 +2056,9 @@ static int npf_interception(struct kvm_vcpu *vcpu)
     if (WARN_ON_ONCE(error_code & PFERR_SYNTHETIC_MASK))
         error_code &= ~PFERR_SYNTHETIC_MASK;

+    if (sev_snp_guest(vcpu->kvm) && (error_code & PFERR_GUEST_ENC_MASK))
+        error_code |= PFERR_PRIVATE_ACCESS;
+
     trace_kvm_page_fault(vcpu, fault_address, error_code);
     return kvm_mmu_page_fault(vcpu, fault_address, error_code,
             static_cpu_has(X86_FEATURE_DECODEASSISTS) ?


for the final shape of the MMU changes.

Paolo

> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 83b8260443a3..9923921904a2 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -12598,7 +12598,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
>
>         kvm->arch.vm_type = type;
>         kvm->arch.has_private_mem =
> -               (type == KVM_X86_SW_PROTECTED_VM);
> +               (type == KVM_X86_SW_PROTECTED_VM || type == KVM_X86_SNP_VM);
>
>         ret = kvm_page_track_init(kvm);
>         if (ret)
> --
> 2.25.1
>


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [PATCH v13 16/26] KVM: SEV: Support SEV-SNP AP Creation NAE event
  2024-04-18 19:41 ` [PATCH v13 16/26] KVM: SEV: Support SEV-SNP AP Creation NAE event Michael Roth
@ 2024-04-19 12:01   ` Paolo Bonzini
  0 siblings, 0 replies; 41+ messages in thread
From: Paolo Bonzini @ 2024-04-19 12:01 UTC (permalink / raw)
  To: Michael Roth
  Cc: kvm, linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, seanjc, vkuznets,
	jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta,
	liam.merwick, Brijesh Singh

On Thu, Apr 18, 2024 at 9:45 PM Michael Roth <michael.roth@amd.com> wrote:
>          * the VMSA will be NULL if this vCPU is the destination for intrahost
>          * migration, and will be copied later.
>          */
> -       if (svm->sev_es.vmsa)
> +       if (!svm->sev_es.snp_has_guest_vmsa)
>                 svm->vmcb->control.vmsa_pa = __pa(svm->sev_es.vmsa);
>
>         /* Can't intercept CR register access, HV can't modify CR registers */

This needs to be svm->sev_es.vmsa && ... (see existing comment above the "if").

Paolo


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v13 00/26] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support
  2024-04-18 19:41 [PATCH v13 00/26] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support Michael Roth
                   ` (25 preceding siblings ...)
  2024-04-18 19:41 ` [PATCH v13 26/26] KVM: SEV: Provide support for SNP_EXTENDED_GUEST_REQUEST NAE event Michael Roth
@ 2024-04-19 12:04 ` Paolo Bonzini
  2024-04-21 18:00   ` Michael Roth
  26 siblings, 1 reply; 41+ messages in thread
From: Paolo Bonzini @ 2024-04-19 12:04 UTC (permalink / raw)
  To: Michael Roth
  Cc: kvm, linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, seanjc, vkuznets,
	jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta,
	liam.merwick

On Thu, Apr 18, 2024 at 9:42 PM Michael Roth <michael.roth@amd.com> wrote:
>
> This patchset is also available at:
>
>   https://github.com/amdese/linux/commits/snp-host-v13
>
> and is based on commit 4d2deb62185f (as suggested by Paolo) from:
>
>   https://git.kernel.org/pub/scm/virt/kvm/kvm.git/log/?h=kvm-coco-queue

This is pretty much ready to go into kvm-coco-queue. Let me know if
you want to do a quick v14 with the few changes I suggested, or I can
do them too.

Then the next steps are:

1) get the mm acks

2) figure out the state of patches 1-3

3) wait for more reviews of course

4) merge everything into kvm/next.

Seems in good shape for a 6.10 target.

Paolo

>
> Patch Layout
> ------------
>
> 01-03: These patches are minor dependencies for this series and are already
>        included in both tip/master and mainline, so are only included here
>        as a stop-gap until merged from one of those trees. These are needed
>        by patch #8 in this series which makes use of CC_ATTR_HOST_SEV_SNP
>
> 04:    This is a small general fix-up for guest_memfd that can be applied
>        independently of this series.
>
> 05-08: These patches add some basic infrastructure and introduce a new
>        KVM_X86_SNP_VM vm_type to handle differences versus the existing
>        KVM_X86_SEV_VM and KVM_X86_SEV_ES_VM types.
>
> 09-11: These implement the KVM API to handle the creation of a
>        cryptographic launch context, encrypt/measure the initial image
>        into guest memory, and finalize it before launching it.
>
> 12-17: These implement handling for various guest-generated events such
>        as page state changes, onlining of additional vCPUs, etc.
>
> 18-21: These implement the gmem hooks needed to prepare gmem-allocated
>        pages before mapping them into guest private memory ranges as
>        well as cleaning them up prior to returning them to the host for
>        use as normal memory. Because this supplants certain activities
>        like issuing WBINVDs during KVM MMU invalidations, there's also
>        a patch that skips the duplicated work to avoid unnecessary
>        overhead.
>
> 22:    With all the core support in place, this patch adds a kvm_amd module
>        parameter to enable SNP support.
>
> 23-26: These patches all deal with the servicing of guest requests to handle
>        things like attestation, as well as some related host-management
>        interfaces.
>
>
> Testing
> -------
>
> For testing this via QEMU, use the following tree:
>
>   https://github.com/amdese/qemu/commits/snp-v4-wip3
>
> A patched OVMF is also needed due to upstream KVM no longer supporting MMIO
> ranges that are mapped as private. It is recommended you build the AmdSevX64
> variant as it provides the kernel-hashing support present in this series:
>
>   https://github.com/amdese/ovmf/commits/apic-mmio-fix1d
>
> A basic command-line invocation for SNP would be:
>
>  qemu-system-x86_64 -smp 32,maxcpus=255 -cpu EPYC-Milan-v2
>   -machine q35,confidential-guest-support=sev0,memory-backend=ram1
>   -object memory-backend-memfd,id=ram1,size=4G,share=true,reserve=false
>   -object sev-snp-guest,id=sev0,cbitpos=51,reduced-phys-bits=1,id-auth=
>   -bios OVMF_CODE-upstream-20240410-apic-mmio-fix1d-AmdSevX64.fd
>
> With kernel-hashing and certificate data supplied:
>
>  qemu-system-x86_64 -smp 32,maxcpus=255 -cpu EPYC-Milan-v2
>   -machine q35,confidential-guest-support=sev0,memory-backend=ram1
>   -object memory-backend-memfd,id=ram1,size=4G,share=true,reserve=false
>   -object sev-snp-guest,id=sev0,cbitpos=51,reduced-phys-bits=1,id-auth=,certs-path=/home/mroth/cert.blob,kernel-hashes=on
>   -bios OVMF_CODE-upstream-20240410-apic-mmio-fix1d-AmdSevX64.fd
>   -kernel /boot/vmlinuz-$ver
>   -initrd /boot/initrd.img-$ver
>   -append "root=UUID=d72a6d1c-06cf-4b79-af43-f1bac4f620f9 ro console=ttyS0,115200n8"
>
> With standard X64 OVMF package with separate image for persistent NVRAM:
>
>  qemu-system-x86_64 -smp 32,maxcpus=255 -cpu EPYC-Milan-v2 \
>   -machine q35,confidential-guest-support=sev0,memory-backend=ram1 \
>   -object memory-backend-memfd,id=ram1,size=4G,share=true,reserve=false \
>   -object sev-snp-guest,id=sev0,cbitpos=51,reduced-phys-bits=1,id-auth= \
>   -bios OVMF_CODE-upstream-20240410-apic-mmio-fix1d.fd \
>   -drive if=pflash,format=raw,unit=0,file=OVMF_VARS-upstream-20240410-apic-mmio-fix1d.fd,readonly=off
>
>
> Known issues / TODOs
> --------------------
>
>  * SEV-ES guests may trigger the following warning:
>
>      WARNING: CPU: 151 PID: 4003 at arch/x86/kvm/mmu/mmu.c:5855 kvm_mmu_page_fault+0x33b/0x860 [kvm]
>
>    It is assumed here that these will be resolved once the transition to
>    PFERR_PRIVATE_ACCESS is fully completed, but if that's not the case let me
>    know and I will investigate further.
>
>  * Base tree in some cases reports "Unpatched return thunk in use. This should
>    not happen!" the first time it runs an SVM/SEV/SNP guest. This is a recent
>    regression upstream and unrelated to this series:
>
>      https://lore.kernel.org/linux-kernel/CANpmjNOcKzEvLHoGGeL-boWDHJobwfwyVxUqMq2kWeka3N4tXA@mail.gmail.com/T/
>
>  * 2MB hugepage support has been dropped pending discussion on how we plan to
>    re-enable it in gmem.
>
>  * Host kexec should work, but there is a known issue with host kdump support
>    while SNP guests are running that will be addressed as a follow-up.
>
>  * SNP kselftests are currently a WIP and will be included as part of SNP
>    upstreaming efforts in the near-term.
>
>
> SEV-SNP Overview
> ----------------
>
> This part of the Secure Nested Paging (SEV-SNP) series focuses on the
> changes required to add KVM support for SEV-SNP. This series builds upon
> SEV-SNP guest support, which is now in mainline, and SEV-SNP host
> initialization support, which is now in linux-next.
>
> While this series provides the basic building blocks to support booting
> SEV-SNP VMs, it does not cover all the security enhancements introduced by
> SEV-SNP, such as interrupt protection, which will be added in the future.
>
> With SNP, when pages are marked as guest-owned in the RMP table, they are
> assigned to a specific guest/ASID, as well as a specific GFN within the
> guest. Any attempt to map such a page in the RMP table to a different
> guest/ASID, or to a different GFN within a guest/ASID, will result in an RMP
> nested page fault.
>
> Prior to accessing a guest-owned page, the guest must validate it with a
> special PVALIDATE instruction which will set a special bit in the RMP table
> for the guest. This is the only way to set the validated bit outside of the
> initial pre-encrypted guest payload/image; any attempts outside the guest to
> modify the RMP entry from that point forward will result in the validated
> bit being cleared, at which point the guest will trigger an exception if it
> attempts to access that page so it can be made aware of possible tampering.
>
> One exception to this is the initial guest payload, which is pre-validated
> by the firmware prior to launching. The guest can use Guest Message requests
> to fetch an attestation report which will include the measurement of the
> initial image so that the guest can verify it was booted with the expected
> image/environment.
>
> After boot, guests can use Page State Change requests to switch pages
> between shared/hypervisor-owned and private/guest-owned to share data for
> things like DMA, virtio buffers, and other GHCB requests.
>
> In this implementation of SEV-SNP, private guest memory is managed by a new
> kernel framework called guest_memfd (gmem). With gmem, a new
> KVM_SET_MEMORY_ATTRIBUTES KVM ioctl has been added to tell the KVM
> MMU whether a particular GFN should be backed by shared (normal) memory or
> private (gmem-allocated) memory. To tie into this, Page State Change
> requests are forwarded to userspace via KVM_EXIT_VMGEXIT exits, which will
> then issue the corresponding KVM_SET_MEMORY_ATTRIBUTES call to set the
> private/shared state in the KVM MMU.
>
> The gmem / KVM MMU hooks implemented in this series will then update the RMP
> table entries for the backing PFNs to set them to guest-owned/private when
> mapping private pages into the guest via KVM MMU, or use the normal KVM MMU
> handling in the case of shared pages where the corresponding RMP table
> entries are left in the default shared/hypervisor-owned state.
>
> Feedback/review is very much appreciated!
>
> -Mike
>
>
> Changes since v12:
>
>  * rebased to latest kvm-coco-queue branch (commit 4d2deb62185f)
>  * add more input validation for SNP_LAUNCH_START, especially for handling
>    things like MBO/MBZ policy bits, and API major/minor minimums. (Paolo)
>  * block SNP KVM instances from being able to run legacy SEV commands (Paolo)
>  * don't attempt to measure VMSA for vcpu 0/BSP before the others, let
>    userspace deal with the ordering just like with SEV-ES (Paolo)
>  * fix up docs for SNP_LAUNCH_FINISH (Paolo)
>  * introduce svm->sev_es.snp_has_guest_vmsa flag to better distinguish
>    handling for guest-mapped vs non-guest-mapped VMSAs, rename
>    'snp_ap_create' flag to 'snp_ap_waiting_for_reset' (Paolo)
>  * drop "KVM: SEV: Use a VMSA physical address variable for populating VMCB"
>    as it is no longer needed due to above VMSA rework
>  * replace pr_debug_ratelimited() messages for RMP #NPFs with a single trace
>    event
>  * handle transient PSMASH_FAIL_INUSE return codes in kvm_gmem_invalidate(),
>    switch to WARN_ON*()'s to indicate remaining error cases are not expected
>    and should not be seen in practice. (Paolo)
>  * add a cond_resched() in kvm_gmem_invalidate() to avoid soft lock-ups when
>    cleaning up large guest memory ranges.
>  * rename VLEK_REQUIRED to VCEK_DISABLE, which will be more applicable if
>    another key type ever gets added.
>  * don't allow attestation to be paused while an attestation request is
>    being processed by firmware (Tom)
>  * add missing Documentation entry for SNP_VLEK_LOAD
>  * collect Reviewed-by's from Paolo and Tom
>
> Changes since v11:
>
>  * Rebase series on kvm-coco-queue and re-work to leverage more
>    infrastructure between SNP/TDX series.
>  * Drop KVM_SNP_INIT in favor of the new KVM_SEV_INIT2 interface introduced
>    here (Paolo):
>      https://lore.kernel.org/lkml/20240318233352.2728327-1-pbonzini@redhat.com/
>  * Drop exposure API fields related to things like VMPL levels, migration
>    agents, etc., until they are actually supported/used (Sean)
>  * Rework KVM_SEV_SNP_LAUNCH_UPDATE handling to use a new
>    kvm_gmem_populate() interface instead of copying data directly into
>    gmem-allocated pages (Sean)
>  * Add support for SNP_LOAD_VLEK, rework the SNP_SET_CONFIG_{START,END} to
>    have simpler semantics that are applicable to management of SNP_LOAD_VLEK
>    updates as well, rename interfaces to the now more appropriate
>    SNP_{PAUSE,RESUME}_ATTESTATION
>  * Fix up documentation wording and do print warnings for
>    userspace-triggerable failures (Peter, Sean)
>  * Fix a race with AP_CREATION wake-up events (Jacob, Sean)
>  * Fix a memory leak with VMSA pages (Sean)
>  * Tighten up handling of RMP page faults to better distinguish between real
>    and spurious cases (Tom)
>  * Various patch/documentation rewording, cleanups, etc.
>
>
> ----------------------------------------------------------------
> Ashish Kalra (1):
>       KVM: SEV: Avoid WBINVD for HVA-based MMU notifications for SNP
>
> Borislav Petkov (AMD) (3):
>       [TEMP] x86/kvm/Kconfig: Have KVM_AMD_SEV select ARCH_HAS_CC_PLATFORM
>       [TEMP] x86/cc: Add cc_platform_set/_clear() helpers
>       [TEMP] x86/CPU/AMD: Track SNP host status with cc_platform_*()
>
> Brijesh Singh (10):
>       KVM: SEV: Add GHCB handling for Hypervisor Feature Support requests
>       KVM: SEV: Add KVM_SEV_SNP_LAUNCH_START command
>       KVM: SEV: Add KVM_SEV_SNP_LAUNCH_UPDATE command
>       KVM: SEV: Add KVM_SEV_SNP_LAUNCH_FINISH command
>       KVM: SEV: Add support to handle GHCB GPA register VMGEXIT
>       KVM: SEV: Add support to handle MSR based Page State Change VMGEXIT
>       KVM: SEV: Add support to handle Page State Change VMGEXIT
>       KVM: SEV: Add support to handle RMP nested page faults
>       KVM: SVM: Add module parameter to enable SEV-SNP
>       KVM: SEV: Provide support for SNP_GUEST_REQUEST NAE event
>
> Michael Roth (10):
>       KVM: guest_memfd: Fix PTR_ERR() handling in __kvm_gmem_get_pfn()
>       KVM: SEV: Select KVM_GENERIC_PRIVATE_MEM when CONFIG_KVM_AMD_SEV=y
>       KVM: SEV: Add initial SEV-SNP support
>       KVM: SEV: Add support for GHCB-based termination requests
>       KVM: SEV: Implement gmem hook for initializing private pages
>       KVM: SEV: Implement gmem hook for invalidating private pages
>       KVM: x86: Implement gmem hook for determining max NPT mapping level
>       crypto: ccp: Add the SNP_VLEK_LOAD command
>       crypto: ccp: Add the SNP_{PAUSE,RESUME}_ATTESTATION commands
>       KVM: SEV: Provide support for SNP_EXTENDED_GUEST_REQUEST NAE event
>
> Tom Lendacky (2):
>       KVM: SEV: Add support to handle AP reset MSR protocol
>       KVM: SEV: Support SEV-SNP AP Creation NAE event
>
>  Documentation/virt/coco/sev-guest.rst              |   69 +-
>  Documentation/virt/kvm/api.rst                     |   73 +
>  .../virt/kvm/x86/amd-memory-encryption.rst         |   88 +-
>  arch/x86/coco/core.c                               |   52 +
>  arch/x86/include/asm/kvm_host.h                    |    2 +
>  arch/x86/include/asm/sev-common.h                  |   22 +-
>  arch/x86/include/asm/sev.h                         |   19 +-
>  arch/x86/include/asm/svm.h                         |    9 +-
>  arch/x86/include/uapi/asm/kvm.h                    |   39 +
>  arch/x86/kernel/cpu/amd.c                          |   38 +-
>  arch/x86/kernel/cpu/mtrr/generic.c                 |    2 +-
>  arch/x86/kernel/sev.c                              |   10 -
>  arch/x86/kvm/Kconfig                               |    4 +
>  arch/x86/kvm/mmu.h                                 |    2 -
>  arch/x86/kvm/mmu/mmu.c                             |    1 +
>  arch/x86/kvm/svm/sev.c                             | 1444 +++++++++++++++++++-
>  arch/x86/kvm/svm/svm.c                             |   39 +-
>  arch/x86/kvm/svm/svm.h                             |   50 +
>  arch/x86/kvm/trace.h                               |   31 +
>  arch/x86/kvm/x86.c                                 |   19 +-
>  arch/x86/virt/svm/sev.c                            |  106 +-
>  drivers/crypto/ccp/sev-dev.c                       |   85 +-
>  drivers/iommu/amd/init.c                           |    4 +-
>  include/linux/cc_platform.h                        |   12 +
>  include/linux/psp-sev.h                            |    4 +-
>  include/uapi/linux/kvm.h                           |   28 +
>  include/uapi/linux/psp-sev.h                       |   39 +
>  include/uapi/linux/sev-guest.h                     |    9 +
>  virt/kvm/guest_memfd.c                             |    8 +-
>  29 files changed, 2229 insertions(+), 79 deletions(-)
>
>



* Re: [PATCH v13 04/26] KVM: guest_memfd: Fix PTR_ERR() handling in __kvm_gmem_get_pfn()
  2024-04-18 19:41 ` [PATCH v13 04/26] KVM: guest_memfd: Fix PTR_ERR() handling in __kvm_gmem_get_pfn() Michael Roth
@ 2024-04-19 12:58   ` David Hildenbrand
  2024-04-19 15:11     ` Michael Roth
  0 siblings, 1 reply; 41+ messages in thread
From: David Hildenbrand @ 2024-04-19 12:58 UTC (permalink / raw)
  To: Michael Roth, kvm
  Cc: linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta,
	liam.merwick

On 18.04.24 21:41, Michael Roth wrote:
> kvm_gmem_get_folio() may return an ERR_PTR() rather than just NULL, in
> particular when -EEXIST is returned because the FGP_CREAT_ONLY flag is
> used. Handle this properly in __kvm_gmem_get_pfn().
> 
> Signed-off-by: Michael Roth <michael.roth@amd.com>
> ---
>   virt/kvm/guest_memfd.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
> index ccf22e44f387..9d7c6a70c547 100644
> --- a/virt/kvm/guest_memfd.c
> +++ b/virt/kvm/guest_memfd.c
> @@ -580,8 +580,8 @@ static int __kvm_gmem_get_pfn(struct file *file, struct kvm_memory_slot *slot,
>   	}
>   
>   	folio = kvm_gmem_get_folio(file_inode(file), index, prepare);
> -	if (!folio)
> -		return -ENOMEM;
> +	if (IS_ERR_OR_NULL(folio))
> +		return folio ? PTR_ERR(folio) : -ENOMEM;

Will it even return NULL?  Staring at other filemap_grab_folio() users, 
they all check for IS_ERR().

>   
>   	if (folio_test_hwpoison(folio)) {
>   		r = -EHWPOISON;

Do we have a Fixes: tag?

-- 
Cheers,

David / dhildenb



* Re: [PATCH v13 09/26] KVM: SEV: Add KVM_SEV_SNP_LAUNCH_START command
  2024-04-19 11:52   ` Paolo Bonzini
@ 2024-04-19 14:19     ` Michael Roth
  2024-04-19 16:13       ` Paolo Bonzini
  0 siblings, 1 reply; 41+ messages in thread
From: Michael Roth @ 2024-04-19 14:19 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: kvm, linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, seanjc, vkuznets,
	jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta,
	liam.merwick, Brijesh Singh

On Fri, Apr 19, 2024 at 01:52:24PM +0200, Paolo Bonzini wrote:
> On Thu, Apr 18, 2024 at 9:42 PM Michael Roth <michael.roth@amd.com> wrote:
> > +/* As defined by SEV-SNP Firmware ABI, under "Guest Policy". */
> > +#define SNP_POLICY_MASK_API_MAJOR      GENMASK_ULL(15, 8)
> > +#define SNP_POLICY_MASK_API_MINOR      GENMASK_ULL(7, 0)
> > +
> > +#define SNP_POLICY_MASK_VALID          (SNP_POLICY_MASK_SMT            | \
> > +                                        SNP_POLICY_MASK_RSVD_MBO       | \
> > +                                        SNP_POLICY_MASK_DEBUG          | \
> > +                                        SNP_POLICY_MASK_SINGLE_SOCKET  | \
> > +                                        SNP_POLICY_MASK_API_MAJOR      | \
> > +                                        SNP_POLICY_MASK_API_MINOR)
> > +
> > +/* KVM's SNP support is compatible with 1.51 of the SEV-SNP Firmware ABI. */
> > +#define SNP_POLICY_API_MAJOR           1
> > +#define SNP_POLICY_API_MINOR           51
> 
> > +static inline bool sev_version_greater_or_equal(u8 major, u8 minor)
> > +{
> > +       if (major < SNP_POLICY_API_MAJOR)
> > +               return true;
> 
> Should it perhaps refuse version 0.x? With something like a
> 
> #define SNP_POLICY_API_MAJOR_MIN    1
> 
> to make it a bit more future proof (and testable).
> 
> > +       major = (params.policy & SNP_POLICY_MASK_API_MAJOR);
> 
> This should be >> 8. Do the QEMU patches not set the API version? :)

Argh...it does if you set it via the -object sev-snp-guest,policy=0x...
option. I tested with reserved ranges and other flags, but not with
non-zero major/minor API fields. =/
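
For reference, a minimal userspace sketch of why the missing shift matters. It assumes 'major' is held in a u8, as in the sev_version_greater_or_equal() signature quoted above. GENMASK_ULL() is re-implemented here as a stand-in for the kernel macro, policy_field() and policy_major_unshifted() are hypothetical helpers, and __builtin_ctzll() is a GCC/Clang builtin:

```c
#include <stdint.h>

/* Userspace stand-in for the kernel's GENMASK_ULL(h, l): bits h..l set. */
#define GENMASK_ULL(h, l) \
	(((~0ULL) << (l)) & (~0ULL >> (63 - (h))))

#define SNP_POLICY_MASK_API_MAJOR	GENMASK_ULL(15, 8)
#define SNP_POLICY_MASK_API_MINOR	GENMASK_ULL(7, 0)

/*
 * Hypothetical helper: mask the field, then shift it down by the position
 * of the mask's lowest set bit so the result fits in a u8.
 */
static inline uint8_t policy_field(uint64_t policy, uint64_t mask)
{
	return (uint8_t)((policy & mask) >> __builtin_ctzll(mask));
}

/*
 * Mask-only extraction: the major field stays in bits 15:8, so truncating
 * to u8 always yields 0 regardless of what the policy requested.
 */
static inline uint8_t policy_major_unshifted(uint64_t policy)
{
	return (uint8_t)(policy & SNP_POLICY_MASK_API_MAJOR);
}
```

With a policy whose API fields are 1.51, policy_field() yields 1 and 51, while the mask-only variant yields 0 for the major, which is consistent with the problem only showing up once non-zero API fields are tested.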

But I'm having second thoughts about trying to enforce the API version via
KVM_SEV_SNP_LAUNCH_START. In practice, the only sensible way to really
interpret it is as "the minimum firmware version on which userspace is
comfortable running a particular guest". And that enforcement
is already handled as part of the SNP_LAUNCH_START firmware command in
the SNP Firmware ABI: if the policy specifies a higher minimum version
than what firmware is currently running, then it will return an error
that will be reported by QEMU as:

  sev_snp_launch_start: SNP_LAUNCH_START ret=-5 fw_error=7 'Policy is not allowed'

On the firmware driver side (drivers/crypto/ccp/sev-dev.c), we already
enforce 1.51 as minimum supported SNP firmware, so that sort of already
covers the SNP_POLICY_API_MAJOR_MIN case as well. E.g. the test surface
KVM needs to concern itself with is already effectively 1.51+. In that
sense, whether the user decides to be less restrictive with what minimum
firmware version they allow is then totally up to the user, and has no
bearing on what firmware versions KVM needs to concern itself with.
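
To make the "minimum version" reading concrete, here is a rough model of the comparison being described. This is only an illustration of the semantics, not firmware or driver code, and policy_min_version_ok() is a hypothetical name:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model: the guest policy carries a minimum ABI version
 * (min_major.min_minor), and the launch is acceptable only if the running
 * firmware (fw_major.fw_minor) is at least that version. Major takes
 * precedence; minor is only compared when the majors match.
 */
static bool policy_min_version_ok(uint8_t fw_major, uint8_t fw_minor,
				  uint8_t min_major, uint8_t min_minor)
{
	if (fw_major != min_major)
		return fw_major > min_major;

	return fw_minor >= min_minor;
}
```

Under this reading a policy with 0.0 in the API fields is simply the least restrictive choice, and a policy asking for more than the running firmware provides is the case that ends in the 'Policy is not allowed' error above.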

Then the question of whether or not KVM fully *exposes* a particular
user-visible feature of a newer version of the firmware/ABI would be a
separate thing to be handled via the normal KVM capabilities/attributes
mechanisms.

So my current leaning is to send a v14 that backs out the major/minor
policy enforcement and let firmware handle that aspect. (and also
address your other comments).

But let me know if you think that should be handled differently.

Thanks!

-Mike

> 
> Paolo
> 


* Re: [PATCH v13 04/26] KVM: guest_memfd: Fix PTR_ERR() handling in __kvm_gmem_get_pfn()
  2024-04-19 12:58   ` David Hildenbrand
@ 2024-04-19 15:11     ` Michael Roth
  2024-04-19 16:17       ` Paolo Bonzini
  0 siblings, 1 reply; 41+ messages in thread
From: Michael Roth @ 2024-04-19 15:11 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: kvm, linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta,
	liam.merwick

On Fri, Apr 19, 2024 at 02:58:43PM +0200, David Hildenbrand wrote:
> On 18.04.24 21:41, Michael Roth wrote:
> > kvm_gmem_get_folio() may return an ERR_PTR() rather than just NULL, in
> > particular when -EEXIST is returned because the FGP_CREAT_ONLY flag is
> > used. Handle this properly in __kvm_gmem_get_pfn().
> > 
> > Signed-off-by: Michael Roth <michael.roth@amd.com>
> > ---
> >   virt/kvm/guest_memfd.c | 4 ++--
> >   1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
> > index ccf22e44f387..9d7c6a70c547 100644
> > --- a/virt/kvm/guest_memfd.c
> > +++ b/virt/kvm/guest_memfd.c
> > @@ -580,8 +580,8 @@ static int __kvm_gmem_get_pfn(struct file *file, struct kvm_memory_slot *slot,
> >   	}
> >   	folio = kvm_gmem_get_folio(file_inode(file), index, prepare);
> > -	if (!folio)
> > -		return -ENOMEM;
> > +	if (IS_ERR_OR_NULL(folio))
> > +		return folio ? PTR_ERR(folio) : -ENOMEM;
> 
> Will it even return NULL?  Staring at other filemap_grab_folio() users, they
> all check for IS_ERR().

Looks like the NULL case is handled with ERR_PTR(-ENOENT), so IS_ERR()
would be sufficient. I think in the past kvm_gmem_get_folio() itself
would return NULL in some cases, but as of commit 2b01b7e994e95 that's
no longer the case.

I'll fix this up to expect only an ERR_PTR() when I re-spin v14, and also
address the other kvm_gmem_get_folio() / __filemap_get_folio() call
sites.
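
As an illustration of the failure mode, here is a small userspace sketch. The ERR_PTR()/PTR_ERR()/IS_ERR() definitions below are simplified re-implementations of the kernel helpers, and lookup_folio() plus the two get_pfn_*() callers are hypothetical stand-ins, not the guest_memfd code:

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_ERRNO	4095

/* Simplified userspace versions of the kernel's error-pointer helpers. */
static inline void *ERR_PTR(long error)
{
	return (void *)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)ptr;
}

static inline bool IS_ERR(const void *ptr)
{
	/* Error pointers live in the top MAX_ERRNO values of the address space. */
	return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}

/* Hypothetical lookup that reports failure via ERR_PTR(), never NULL. */
static void *lookup_folio(int fail_with)
{
	static int dummy_folio;

	return fail_with ? ERR_PTR(-fail_with) : (void *)&dummy_folio;
}

/* NULL-only check: an error pointer is non-NULL, so it slips through. */
static int get_pfn_null_check(int fail_with)
{
	void *folio = lookup_folio(fail_with);

	if (!folio)
		return -ENOMEM;

	return 0;
}

/* IS_ERR() check: the encoded errno is propagated to the caller. */
static int get_pfn_is_err_check(int fail_with)
{
	void *folio = lookup_folio(fail_with);

	if (IS_ERR(folio))
		return (int)PTR_ERR(folio);

	return 0;
}
```

With an -EEXIST failure, the NULL-only variant returns 0 and would go on to dereference the error pointer, while the IS_ERR() variant returns -EEXIST.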

> 
> >   	if (folio_test_hwpoison(folio)) {
> >   		r = -EHWPOISON;
> 
> Do we have a Fixes: tag?

Fixes: 2b01b7e994e95 ("KVM: guest_memfd: pass error up from filemap_grab_folio")

Will add that in the re-spin as well.

Thanks!

-Mike

> 
> -- 
> Cheers,
> 
> David / dhildenb
> 


* Re: [PATCH v13 10/26] KVM: SEV: Add KVM_SEV_SNP_LAUNCH_UPDATE command
  2024-04-19 11:56   ` Paolo Bonzini
@ 2024-04-19 16:12     ` Paolo Bonzini
  2024-04-21 17:52       ` Michael Roth
  0 siblings, 1 reply; 41+ messages in thread
From: Paolo Bonzini @ 2024-04-19 16:12 UTC (permalink / raw)
  To: Michael Roth
  Cc: kvm, linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, seanjc, vkuznets,
	jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta,
	liam.merwick, Brijesh Singh

On Fri, Apr 19, 2024 at 1:56 PM Paolo Bonzini <pbonzini@redhat.com> wrote:
> > +       ret = kvm_gmem_populate(kvm, params.gfn_start, u64_to_user_ptr(params.uaddr),
> > +                               npages, sev_gmem_post_populate, &sev_populate_args);
> > +       if (ret < 0) {
> > +               argp->error = sev_populate_args.fw_error;
> > +               pr_debug("%s: kvm_gmem_populate failed, ret %d (fw_error %d)\n",
> > +                        __func__, ret, argp->error);
> > +       } else if (ret < npages) {
> > +               params.len = ret * PAGE_SIZE;
> > +               ret = -EINTR;
>
> This probably should 1) update also gfn_start and uaddr 2) return 0
> for consistency with the planned KVM_PRE_FAULT_MEMORY ioctl (aka
> KVM_MAP_MEMORY).

To be more precise, params.len should be set to the number of bytes *left*, i.e.

   params.len -= ret * PAGE_SIZE;
   params.gfn_start += ret;
   if (params.type != KVM_SEV_SNP_PAGE_TYPE_ZERO)
       params.uaddr += ret * PAGE_SIZE;
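
A standalone sketch of that bookkeeping, assuming gfn_start counts pages (as its comment in the struct says) while len and uaddr count bytes, so the GFN advances by the page count and the other two by bytes. The struct, the constants and advance_after_partial_populate() are hypothetical stand-ins for illustration only; the page-type value is a placeholder, not the UAPI value:

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE	4096ULL
#define SKETCH_PAGE_TYPE_ZERO	3	/* placeholder value for illustration */

/* Hypothetical subset of the launch-update parameters. */
struct launch_update_sketch {
	uint64_t gfn_start;	/* first guest page frame still to populate */
	uint64_t uaddr;		/* userspace source address for that page */
	uint64_t len;		/* bytes still to populate */
	uint8_t type;		/* page type; zero pages have no source data */
};

/*
 * After 'pages_done' pages were populated, rewrite the parameters so that
 * userspace can reissue the same call to continue where it left off.
 */
static void advance_after_partial_populate(struct launch_update_sketch *p,
					   uint64_t pages_done)
{
	p->gfn_start += pages_done;			/* page units */
	p->len -= pages_done * SKETCH_PAGE_SIZE;	/* byte units */
	if (p->type != SKETCH_PAGE_TYPE_ZERO)
		p->uaddr += pages_done * SKETCH_PAGE_SIZE;
}
```

Userspace would then loop until len reaches zero, which matches the "return 0 and report what is left" behaviour suggested above.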

Also this patch needs some other changes:

1) snp_launch_update() should have something like this:

   src = params.type == KVM_SEV_SNP_PAGE_TYPE_ZERO ? NULL :
         u64_to_user_ptr(params.uaddr);

so that then...

> +               vaddr = kmap_local_pfn(pfn + i);
> +               ret = copy_from_user(vaddr, src + i * PAGE_SIZE, PAGE_SIZE);
> +               if (ret) {
> +                       pr_debug("Failed to copy source page into GFN 0x%llx\n", gfn);
> +                       goto out_unmap;
> +               }

... the copy can be done only if src is non-NULL

2) the struct should have some more fields

> +        struct kvm_sev_snp_launch_update {
> +                __u64 gfn_start;        /* Guest page number to load/encrypt data into. */
> +                __u64 uaddr;            /* Userspace address of data to be loaded/encrypted. */
> +                __u32 len;              /* 4k-aligned length in bytes to copy into guest memory.*/
> +                __u8 type;              /* The type of the guest pages being initialized. */

__u8 pad0;
__u16 flags;   // must be zero
__u64 pad1[5];

with accompanying flags check in snp_launch_update().

If you think IMI can be implemented already (with a bit in flags) go
ahead and do it.

Paolo



* Re: [PATCH v13 09/26] KVM: SEV: Add KVM_SEV_SNP_LAUNCH_START command
  2024-04-19 14:19     ` Michael Roth
@ 2024-04-19 16:13       ` Paolo Bonzini
  0 siblings, 0 replies; 41+ messages in thread
From: Paolo Bonzini @ 2024-04-19 16:13 UTC (permalink / raw)
  To: Michael Roth
  Cc: kvm, linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, seanjc, vkuznets,
	jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta,
	liam.merwick, Brijesh Singh

On Fri, Apr 19, 2024 at 4:19 PM Michael Roth <michael.roth@amd.com> wrote:
> So my current leaning is to send a v14 that backs out the major/minor
> policy enforcement and let firmware handle that aspect. (and also
> address your other comments).

Sounds good to me!

In the meantime, I guess you can also update to use PFERR_PRIVATE_ACCESS.

Paolo



* Re: [PATCH v13 04/26] KVM: guest_memfd: Fix PTR_ERR() handling in __kvm_gmem_get_pfn()
  2024-04-19 15:11     ` Michael Roth
@ 2024-04-19 16:17       ` Paolo Bonzini
  0 siblings, 0 replies; 41+ messages in thread
From: Paolo Bonzini @ 2024-04-19 16:17 UTC (permalink / raw)
  To: Michael Roth
  Cc: David Hildenbrand, kvm, linux-coco, linux-mm, linux-crypto, x86,
	linux-kernel, tglx, mingo, jroedel, thomas.lendacky, hpa, ardb,
	seanjc, vkuznets, jmattson, luto, dave.hansen, slp, pgonda,
	peterz, srinivas.pandruvada, rientjes, dovmurik, tobin, bp,
	vbabka, kirill, ak, tony.luck, sathyanarayanan.kuppuswamy,
	alpergun, jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta,
	liam.merwick

On Fri, Apr 19, 2024 at 5:11 PM Michael Roth <michael.roth@amd.com> wrote:
>
> On Fri, Apr 19, 2024 at 02:58:43PM +0200, David Hildenbrand wrote:
> > On 18.04.24 21:41, Michael Roth wrote:
> > > > kvm_gmem_get_folio() may return an ERR_PTR() rather than just NULL, in
> > > > particular when -EEXIST is returned because the FGP_CREAT_ONLY flag is
> > > > used. Handle this properly in __kvm_gmem_get_pfn().
> > >
> > > Signed-off-by: Michael Roth <michael.roth@amd.com>
> > > ---
> > >   virt/kvm/guest_memfd.c | 4 ++--
> > >   1 file changed, 2 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
> > > index ccf22e44f387..9d7c6a70c547 100644
> > > --- a/virt/kvm/guest_memfd.c
> > > +++ b/virt/kvm/guest_memfd.c
> > > @@ -580,8 +580,8 @@ static int __kvm_gmem_get_pfn(struct file *file, struct kvm_memory_slot *slot,
> > >     }
> > >     folio = kvm_gmem_get_folio(file_inode(file), index, prepare);
> > > -   if (!folio)
> > > -           return -ENOMEM;
> > > +   if (IS_ERR_OR_NULL(folio))
> > > +           return folio ? PTR_ERR(folio) : -ENOMEM;
> >
> > Will it even return NULL?  Staring at other filemap_grab_folio() users, they
> > all check for IS_ERR().
>
> > Looks like the NULL case is handled with ERR_PTR(-ENOENT), so IS_ERR()
> > would be sufficient. I think in the past kvm_gmem_get_folio() itself
> > would return NULL in some cases, but as of commit 2b01b7e994e95 that's
> > no longer the case.
> >
> > I'll fix this up to expect only an ERR_PTR() when I re-spin v14, and also
> > address the other kvm_gmem_get_folio() / __filemap_get_folio() call
> > sites.
>
> >
> > >     if (folio_test_hwpoison(folio)) {
> > >             r = -EHWPOISON;
> >
> > Do we have a Fixes: tag?
>
> Fixes: 2b01b7e994e95 ("KVM: guest_memfd: pass error up from filemap_grab_folio")

I'll squash it so when you rebase on the new kvm-coco-queue it will go
away. Thanks to both!

Paolo



* Re: [PATCH v13 10/26] KVM: SEV: Add KVM_SEV_SNP_LAUNCH_UPDATE command
  2024-04-19 16:12     ` Paolo Bonzini
@ 2024-04-21 17:52       ` Michael Roth
  0 siblings, 0 replies; 41+ messages in thread
From: Michael Roth @ 2024-04-21 17:52 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: kvm, linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, seanjc, vkuznets,
	jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta,
	liam.merwick, Brijesh Singh

On Fri, Apr 19, 2024 at 06:12:11PM +0200, Paolo Bonzini wrote:
> On Fri, Apr 19, 2024 at 1:56 PM Paolo Bonzini <pbonzini@redhat.com> wrote:
> > > +       ret = kvm_gmem_populate(kvm, params.gfn_start, u64_to_user_ptr(params.uaddr),
> > > +                               npages, sev_gmem_post_populate, &sev_populate_args);
> > > +       if (ret < 0) {
> > > +               argp->error = sev_populate_args.fw_error;
> > > +               pr_debug("%s: kvm_gmem_populate failed, ret %d (fw_error %d)\n",
> > > +                        __func__, ret, argp->error);
> > > +       } else if (ret < npages) {
> > > +               params.len = ret * PAGE_SIZE;
> > > +               ret = -EINTR;
> >
> > This probably should 1) update also gfn_start and uaddr 2) return 0
> > for consistency with the planned KVM_PRE_FAULT_MEMORY ioctl (aka
> > KVM_MAP_MEMORY).
> 
> To be more precise, params.len should be set to the number of bytes *left*, i.e.
> 
>    params.len -= ret * PAGE_SIZE;
>    params.gfn_start += ret;
>    if (params.type != KVM_SEV_SNP_PAGE_TYPE_ZERO)
>        params.uaddr += ret * PAGE_SIZE;
> 
> Also this patch needs some other changes:
> 
> 1) snp_launch_update() should have something like this:
> 
>    src = params.type == KVM_SEV_SNP_PAGE_TYPE_ZERO ? NULL :
>          u64_to_user_ptr(params.uaddr);
> 
> so that then...
> 
> > +               vaddr = kmap_local_pfn(pfn + i);
> > +               ret = copy_from_user(vaddr, src + i * PAGE_SIZE, PAGE_SIZE);
> > +               if (ret) {
> > +                       pr_debug("Failed to copy source page into GFN 0x%llx\n", gfn);
> > +                       goto out_unmap;
> > +               }
> 
> ... the copy can be done only if src is non-NULL
> 
> 2) the struct should have some more fields
> 
> > +        struct kvm_sev_snp_launch_update {
> > +                __u64 gfn_start;        /* Guest page number to load/encrypt data into. */
> > +                __u64 uaddr;            /* Userspace address of data to be loaded/encrypted. */
> > +                __u32 len;              /* 4k-aligned length in bytes to copy into guest memory.*/
> > +                __u8 type;              /* The type of the guest pages being initialized. */
> 
> __u8 pad0;
> __u16 flags;   // must be zero
> __u64 pad1[5];
> 
> with accompanying flags check in snp_launch_update().

I have all of these addressed in v14, but I ended up making 'len' a __u64, so
the final struct looks like this:

  struct kvm_sev_snp_launch_update {
          __u64 gfn_start;
          __u64 uaddr;
          __u64 len;
          __u8 type;
          __u8 pad0;
          __u16 flags;
          __u32 pad1;
          __u64 pad2[4];
  };
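
A quick way to sanity-check that layout from userspace, as a sketch: the struct below mirrors the fields above using <stdint.h> types under a different (hypothetical) name, and launch_update_flags_valid() is a hypothetical stand-in for the "flags must be zero" check mentioned earlier in the thread:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace mirror of the struct above, for layout checking only. */
struct snp_launch_update_sketch {
	uint64_t gfn_start;
	uint64_t uaddr;
	uint64_t len;
	uint8_t type;
	uint8_t pad0;
	uint16_t flags;
	uint32_t pad1;
	uint64_t pad2[4];
};

/* The explicit padding leaves no compiler-inserted holes. */
static_assert(offsetof(struct snp_launch_update_sketch, type) == 24, "type offset");
static_assert(offsetof(struct snp_launch_update_sketch, flags) == 26, "flags offset");
static_assert(offsetof(struct snp_launch_update_sketch, pad1) == 28, "pad1 offset");
static_assert(offsetof(struct snp_launch_update_sketch, pad2) == 32, "pad2 offset");
static_assert(sizeof(struct snp_launch_update_sketch) == 64, "total size");

/* Hypothetical check: reject any flag bits until some are actually defined. */
static bool launch_update_flags_valid(const struct snp_launch_update_sketch *p)
{
	return p->flags == 0;
}
```

Rejecting non-zero flags now is what lets a later kernel assign meaning to those bits without breaking existing userspace.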

> 
> If you think IMI can be implemented already (with a bit in flags) go
> ahead and do it.

Migration will also need related flags in LAUNCH_START, and depending on how
we implement things, possibly in LAUNCH_FINISH. So for now I've left IMI
out, but added similar 'flags' and padding to those structs as well so we have
some flexibility with how we end up handling that.

-Mike

> 
> Paolo
> 


* Re: [PATCH v13 00/26] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support
  2024-04-19 12:04 ` [PATCH v13 00/26] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support Paolo Bonzini
@ 2024-04-21 18:00   ` Michael Roth
  0 siblings, 0 replies; 41+ messages in thread
From: Michael Roth @ 2024-04-21 18:00 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: kvm, linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, seanjc, vkuznets,
	jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta,
	liam.merwick

On Fri, Apr 19, 2024 at 02:04:54PM +0200, Paolo Bonzini wrote:
> On Thu, Apr 18, 2024 at 9:42 PM Michael Roth <michael.roth@amd.com> wrote:
> >
> > This patchset is also available at:
> >
> >   https://github.com/amdese/linux/commits/snp-host-v13
> >
> > and is based on commit 4d2deb62185f (as suggested by Paolo) from:
> >
> >   https://git.kernel.org/pub/scm/virt/kvm/kvm.git/log/?h=kvm-coco-queue
> 
> This is pretty much ready to go into kvm-coco-queue. Let me know if
> you want to do a quick v14 with the few changes I suggested, or I can
> do them too.

Submitted v14 based on 20cc50a0410f from latest kvm-coco-queue
(bf1390326099). Hoping that way you can easily replace v13 with v14 and
force-push, but let me know if you wanted to go about it a different way.

> 
> Then the next steps are:
> 
> 1) get the mm acks
> 
> 2) figure out the state of patches 1-3

With latest kvm-coco-queue these patches are now in the base tree so
I've dropped them from the series.

> 
> 3) wait for more reviews of course
> 
> 4) merge everything into kvm/next.
> 
> Seems in good shape for a 6.10 target.

Awesome! If anything needs attention just let me know.

Thanks,

Mike

> 
> Paolo
> 
> >
> > Patch Layout
> > ------------
> >
> > 01-03: These patches are minor dependencies for this series and are already
> >        included in both tip/master and mainline, so are only included here
> >        as a stop-gap until merged from one of those trees. These are needed
> >        by patch #8 in this series which makes use of CC_ATTR_HOST_SEV_SNP
> >
> > 04:    This is a small general fix-up for guest_memfd that can be applied
> >        independently of this series.
> >
> > 05-08: These patches add some basic infrastructure and introduce a new
> >        KVM_X86_SNP_VM vm_type to handle differences versus the existing
> >        KVM_X86_SEV_VM and KVM_X86_SEV_ES_VM types.
> >
> > 09-11: These implement the KVM API to handle the creation of a
> >        cryptographic launch context, encrypt/measure the initial image
> >        into guest memory, and finalize it before launching it.
> >
> > 12-17: These implement handling for various guest-generated events such
> >        as page state changes, onlining of additional vCPUs, etc.
> >
> > 18-21: These implement the gmem hooks needed to prepare gmem-allocated
> >        pages before mapping them into guest private memory ranges as
> >        well as cleaning them up prior to returning them to the host for
> >        use as normal memory. Because this supplants certain activities
> >        like issuing WBINVDs during KVM MMU invalidations, there's also
> >        a patch that skips the duplicated work to avoid unnecessary
> >        overhead.
> >
> > 22:    With all the core support in place, this patch adds a kvm_amd module
> >        parameter to enable SNP support.
> >
> > 23-26: These patches all deal with the servicing of guest requests to handle
> >        things like attestation, as well as some related host-management
> >        interfaces.
> >
> >
> > Testing
> > -------
> >
> > For testing this via QEMU, use the following tree:
> >
> >   https://github.com/amdese/qemu/commits/snp-v4-wip3
> >
> > A patched OVMF is also needed due to upstream KVM no longer supporting MMIO
> > ranges that are mapped as private. It is recommended you build the AmdSevX64
> > variant as it provides the kernel-hashing support present in this series:
> >
> >   https://github.com/amdese/ovmf/commits/apic-mmio-fix1d
> >
> > A basic command-line invocation for SNP would be:
> >
> >  qemu-system-x86_64 -smp 32,maxcpus=255 -cpu EPYC-Milan-v2
> >   -machine q35,confidential-guest-support=sev0,memory-backend=ram1
> >   -object memory-backend-memfd,id=ram1,size=4G,share=true,reserve=false
> >   -object sev-snp-guest,id=sev0,cbitpos=51,reduced-phys-bits=1,id-auth=
> >   -bios OVMF_CODE-upstream-20240410-apic-mmio-fix1d-AmdSevX64.fd
> >
> > With kernel-hashing and certificate data supplied:
> >
> >  qemu-system-x86_64 -smp 32,maxcpus=255 -cpu EPYC-Milan-v2
> >   -machine q35,confidential-guest-support=sev0,memory-backend=ram1
> >   -object memory-backend-memfd,id=ram1,size=4G,share=true,reserve=false
> >   -object sev-snp-guest,id=sev0,cbitpos=51,reduced-phys-bits=1,id-auth=,certs-path=/home/mroth/cert.blob,kernel-hashes=on
> >   -bios OVMF_CODE-upstream-20240410-apic-mmio-fix1d-AmdSevX64.fd
> >   -kernel /boot/vmlinuz-$ver
> >   -initrd /boot/initrd.img-$ver
> >   -append "root=UUID=d72a6d1c-06cf-4b79-af43-f1bac4f620f9 ro console=ttyS0,115200n8"
> >
> > With standard X64 OVMF package with separate image for persistent NVRAM:
> >
> >  qemu-system-x86_64 -smp 32,maxcpus=255 -cpu EPYC-Milan-v2
> >   -machine q35,confidential-guest-support=sev0,memory-backend=ram1
> >   -object memory-backend-memfd,id=ram1,size=4G,share=true,reserve=false
> >   -object sev-snp-guest,id=sev0,cbitpos=51,reduced-phys-bits=1,id-auth=
> >   -bios OVMF_CODE-upstream-20240410-apic-mmio-fix1d.fd
> >   -drive if=pflash,format=raw,unit=0,file=OVMF_VARS-upstream-20240410-apic-mmio-fix1d.fd,readonly=off
> >
> >
> > Known issues / TODOs
> > --------------------
> >
> >  * SEV-ES guests may trigger the following warning:
> >
> >      WARNING: CPU: 151 PID: 4003 at arch/x86/kvm/mmu/mmu.c:5855 kvm_mmu_page_fault+0x33b/0x860 [kvm]
> >
> >    It is assumed here that these will be resolved once the transition to
> >    PFERR_PRIVATE_ACCESS is fully completed, but if that's not the case let me
> >    know and I will investigate further.
> >
> >  * Base tree in some cases reports "Unpatched return thunk in use. This should
> >    not happen!" the first time it runs an SVM/SEV/SNP guest. This is a recent
> >    regression upstream and is unrelated to this series:
> >
> >      https://lore.kernel.org/linux-kernel/CANpmjNOcKzEvLHoGGeL-boWDHJobwfwyVxUqMq2kWeka3N4tXA@mail.gmail.com/T/
> >
> >  * 2MB hugepage support has been dropped pending discussion on how we plan to
> >    re-enable it in gmem.
> >
> >  * Host kexec should work, but there is a known issue with host kdump support
> >    while SNP guests are running that will be addressed as a follow-up.
> >
> >  * SNP kselftests are currently a WIP and will be included as part of SNP
> >    upstreaming efforts in the near-term.
> >
> >
> > SEV-SNP Overview
> > ----------------
> >
> > This part of the Secure Nested Paging (SEV-SNP) series focuses on the
> > changes required to add KVM support for SEV-SNP. This series builds upon
> > SEV-SNP guest support, which is now in mainline, and SEV-SNP host
> > initialization support, which is now in linux-next.
> >
> > While this series provides the basic building blocks to support booting
> > SEV-SNP VMs, it does not cover all the security enhancements introduced by
> > SEV-SNP, such as interrupt protection, which will be added in the future.
> >
> > With SNP, when pages are marked as guest-owned in the RMP table, they are
> > assigned to a specific guest/ASID, as well as a specific GFN within the
> > guest. Any attempts to map it in the RMP table to a different guest/ASID,
> > or a different GFN within a guest/ASID, will result in an RMP nested page
> > fault.
> >
> > Prior to accessing a guest-owned page, the guest must validate it with a
> > special PVALIDATE instruction which will set a special bit in the RMP table
> > for the guest. This is the only way to set the validated bit outside of the
> > initial pre-encrypted guest payload/image; any attempts outside the guest to
> > modify the RMP entry from that point forward will result in the validated
> > bit being cleared, at which point the guest will trigger an exception if it
> > attempts to access that page so it can be made aware of possible tampering.
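
The ownership and validation rules described in the two paragraphs above can
be sketched as a small userspace model. This is only an illustration under
simplifying assumptions: the struct layout, the sketch_* names, and the
result codes are hypothetical and are not the hardware RMP format or any
kernel API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified, hypothetical RMP entry; the real format has more fields. */
struct sketch_rmp_entry {
	bool assigned;		/* guest-owned (true) or hypervisor-owned */
	uint32_t asid;		/* owning guest, meaningful when assigned */
	uint64_t gfn;		/* guest frame the page is bound to */
	bool validated;		/* set by the guest via PVALIDATE */
};

enum sketch_access {
	SKETCH_ACCESS_OK,
	SKETCH_RMP_FAULT,	/* wrong owner or wrong GFN */
	SKETCH_NOT_VALIDATED,	/* guest would see an exception */
};

/* Host-side (re)assignment; any such change clears the validated bit. */
void sketch_rmp_assign(struct sketch_rmp_entry *e, uint32_t asid, uint64_t gfn)
{
	e->assigned = true;
	e->asid = asid;
	e->gfn = gfn;
	e->validated = false;
}

/* Stand-in for PVALIDATE: only the owning guest/GFN may validate. */
bool sketch_pvalidate(struct sketch_rmp_entry *e, uint32_t asid, uint64_t gfn)
{
	if (!e->assigned || e->asid != asid || e->gfn != gfn)
		return false;
	e->validated = true;
	return true;
}

/* Private access by guest 'asid' to 'gfn' backed by the page behind 'e'. */
enum sketch_access sketch_guest_access(const struct sketch_rmp_entry *e,
				       uint32_t asid, uint64_t gfn)
{
	if (!e->assigned || e->asid != asid || e->gfn != gfn)
		return SKETCH_RMP_FAULT;
	if (!e->validated)
		return SKETCH_NOT_VALIDATED;
	return SKETCH_ACCESS_OK;
}
```

In this model a freshly assigned page is not accessible until the owning
guest validates it, and re-assigning it from the host side drops it back to
the not-validated state, which is how the guest would notice tampering.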
> >
> > One exception to this is the initial guest payload, which is pre-validated
> > by the firmware prior to launching. The guest can use Guest Message requests
> > to fetch an attestation report which will include the measurement of the
> > initial image so that the guest can verify it was booted with the expected
> > image/environment.
> >
> > After boot, guests can use Page State Change requests to switch pages
> > between shared/hypervisor-owned and private/guest-owned to share data for
> > things like DMA, virtio buffers, and other GHCB requests.
> >
> > In this implementation of SEV-SNP, private guest memory is managed by a new
> > kernel framework called guest_memfd (gmem). With gmem, a new
> > KVM_SET_MEMORY_ATTRIBUTES KVM ioctl has been added to tell the KVM
> > MMU whether a particular GFN should be backed by shared (normal) memory or
> > private (gmem-allocated) memory. To tie into this, Page State Change
> > requests are forwarded to userspace via KVM_EXIT_VMGEXIT exits, which will
> > then issue the corresponding KVM_SET_MEMORY_ATTRIBUTES call to set the
> > private/shared state in the KVM MMU.
> >
> > The gmem / KVM MMU hooks implemented in this series will then update the RMP
> > table entries for the backing PFNs to set them to guest-owned/private when
> > mapping private pages into the guest via KVM MMU, or use the normal KVM MMU
> > handling in the case of shared pages where the corresponding RMP table
> > entries are left in the default shared/hypervisor-owned state.
> >
> > Feedback/review is very much appreciated!
> >
> > -Mike
> >
> >
> > Changes since v12:
> >
> >  * rebased to latest kvm-coco-queue branch (commit 4d2deb62185f)
> >  * add more input validation for SNP_LAUNCH_START, especially for handling
> >    things like MBO/MBZ policy bits, and API major/minor minimums. (Paolo)
> >  * block SNP KVM instances from being able to run legacy SEV commands (Paolo)
> >  * don't attempt to measure VMSA for vcpu 0/BSP before the others, let
> >    userspace deal with the ordering just like with SEV-ES (Paolo)
> >  * fix up docs for SNP_LAUNCH_FINISH (Paolo)
> >  * introduce svm->sev_es.snp_has_guest_vmsa flag to better distinguish
> >    handling for guest-mapped vs non-guest-mapped VMSAs, rename
> >    'snp_ap_create' flag to 'snp_ap_waiting_for_reset' (Paolo)
> >  * drop "KVM: SEV: Use a VMSA physical address variable for populating VMCB"
> >    as it is no longer needed due to above VMSA rework
> >  * replace pr_debug_ratelimited() messages for RMP #NPFs with a single trace
> >    event
> >  * handle transient PSMASH_FAIL_INUSE return codes in kvm_gmem_invalidate(),
> >    switch to WARN_ON*()'s to indicate remaining error cases are not expected
> >    and should not be seen in practice. (Paolo)
> >  * add a cond_resched() in kvm_gmem_invalidate() to avoid soft lock-ups when
> >    cleaning up large guest memory ranges.
> >  * rename VLEK_REQUIRED to VCEK_DISABLE. It's more applicable if another
> >    key type ever gets added.
> >  * don't allow attestation to be paused while an attestation request is
> >    being processed by firmware (Tom)
> >  * add missing Documentation entry for SNP_VLEK_LOAD
> >  * collect Reviewed-by's from Paolo and Tom
> >
> > Changes since v11:
> >
> >  * Rebase series on kvm-coco-queue and re-work to leverage more
> >    infrastructure between SNP/TDX series.
> >  * Drop KVM_SNP_INIT in favor of the new KVM_SEV_INIT2 interface introduced
> >    here (Paolo):
> >      https://lore.kernel.org/lkml/20240318233352.2728327-1-pbonzini@redhat.com/
> >  * Drop exposure of API fields related to things like VMPL levels, migration
> >    agents, etc., until they are actually supported/used (Sean)
> >  * Rework KVM_SEV_SNP_LAUNCH_UPDATE handling to use a new
> >    kvm_gmem_populate() interface instead of copying data directly into
> >    gmem-allocated pages (Sean)
> >  * Add support for SNP_LOAD_VLEK, rework the SNP_SET_CONFIG_{START,END} to
> >    have simpler semantics that are applicable to management of SNP_LOAD_VLEK
> >    updates as well, rename interfaces to the now more appropriate
> >    SNP_{PAUSE,RESUME}_ATTESTATION
> >  * Fix up documentation wording and do print warnings for
> >    userspace-triggerable failures (Peter, Sean)
> >  * Fix a race with AP_CREATION wake-up events (Jacob, Sean)
> >  * Fix a memory leak with VMSA pages (Sean)
> >  * Tighten up handling of RMP page faults to better distinguish between real
> >    and spurious cases (Tom)
> >  * Various patch/documentation rewording, cleanups, etc.
> >
> >
> > ----------------------------------------------------------------
> > Ashish Kalra (1):
> >       KVM: SEV: Avoid WBINVD for HVA-based MMU notifications for SNP
> >
> > Borislav Petkov (AMD) (3):
> >       [TEMP] x86/kvm/Kconfig: Have KVM_AMD_SEV select ARCH_HAS_CC_PLATFORM
> >       [TEMP] x86/cc: Add cc_platform_set/_clear() helpers
> >       [TEMP] x86/CPU/AMD: Track SNP host status with cc_platform_*()
> >
> > Brijesh Singh (10):
> >       KVM: SEV: Add GHCB handling for Hypervisor Feature Support requests
> >       KVM: SEV: Add KVM_SEV_SNP_LAUNCH_START command
> >       KVM: SEV: Add KVM_SEV_SNP_LAUNCH_UPDATE command
> >       KVM: SEV: Add KVM_SEV_SNP_LAUNCH_FINISH command
> >       KVM: SEV: Add support to handle GHCB GPA register VMGEXIT
> >       KVM: SEV: Add support to handle MSR based Page State Change VMGEXIT
> >       KVM: SEV: Add support to handle Page State Change VMGEXIT
> >       KVM: SEV: Add support to handle RMP nested page faults
> >       KVM: SVM: Add module parameter to enable SEV-SNP
> >       KVM: SEV: Provide support for SNP_GUEST_REQUEST NAE event
> >
> > Michael Roth (10):
> >       KVM: guest_memfd: Fix PTR_ERR() handling in __kvm_gmem_get_pfn()
> >       KVM: SEV: Select KVM_GENERIC_PRIVATE_MEM when CONFIG_KVM_AMD_SEV=y
> >       KVM: SEV: Add initial SEV-SNP support
> >       KVM: SEV: Add support for GHCB-based termination requests
> >       KVM: SEV: Implement gmem hook for initializing private pages
> >       KVM: SEV: Implement gmem hook for invalidating private pages
> >       KVM: x86: Implement gmem hook for determining max NPT mapping level
> >       crypto: ccp: Add the SNP_VLEK_LOAD command
> >       crypto: ccp: Add the SNP_{PAUSE,RESUME}_ATTESTATION commands
> >       KVM: SEV: Provide support for SNP_EXTENDED_GUEST_REQUEST NAE event
> >
> > Tom Lendacky (2):
> >       KVM: SEV: Add support to handle AP reset MSR protocol
> >       KVM: SEV: Support SEV-SNP AP Creation NAE event
> >
> >  Documentation/virt/coco/sev-guest.rst              |   69 +-
> >  Documentation/virt/kvm/api.rst                     |   73 +
> >  .../virt/kvm/x86/amd-memory-encryption.rst         |   88 +-
> >  arch/x86/coco/core.c                               |   52 +
> >  arch/x86/include/asm/kvm_host.h                    |    2 +
> >  arch/x86/include/asm/sev-common.h                  |   22 +-
> >  arch/x86/include/asm/sev.h                         |   19 +-
> >  arch/x86/include/asm/svm.h                         |    9 +-
> >  arch/x86/include/uapi/asm/kvm.h                    |   39 +
> >  arch/x86/kernel/cpu/amd.c                          |   38 +-
> >  arch/x86/kernel/cpu/mtrr/generic.c                 |    2 +-
> >  arch/x86/kernel/sev.c                              |   10 -
> >  arch/x86/kvm/Kconfig                               |    4 +
> >  arch/x86/kvm/mmu.h                                 |    2 -
> >  arch/x86/kvm/mmu/mmu.c                             |    1 +
> >  arch/x86/kvm/svm/sev.c                             | 1444 +++++++++++++++++++-
> >  arch/x86/kvm/svm/svm.c                             |   39 +-
> >  arch/x86/kvm/svm/svm.h                             |   50 +
> >  arch/x86/kvm/trace.h                               |   31 +
> >  arch/x86/kvm/x86.c                                 |   19 +-
> >  arch/x86/virt/svm/sev.c                            |  106 +-
> >  drivers/crypto/ccp/sev-dev.c                       |   85 +-
> >  drivers/iommu/amd/init.c                           |    4 +-
> >  include/linux/cc_platform.h                        |   12 +
> >  include/linux/psp-sev.h                            |    4 +-
> >  include/uapi/linux/kvm.h                           |   28 +
> >  include/uapi/linux/psp-sev.h                       |   39 +
> >  include/uapi/linux/sev-guest.h                     |    9 +
> >  virt/kvm/guest_memfd.c                             |    8 +-
> >  29 files changed, 2229 insertions(+), 79 deletions(-)
> >
> >
> 


* Re: [PATCH v13 09/26] KVM: SEV: Add KVM_SEV_SNP_LAUNCH_START command
  2024-04-18 19:41 ` [PATCH v13 09/26] KVM: SEV: Add KVM_SEV_SNP_LAUNCH_START command Michael Roth
  2024-04-19 11:52   ` Paolo Bonzini
@ 2024-04-24 21:40   ` Sean Christopherson
  1 sibling, 0 replies; 41+ messages in thread
From: Sean Christopherson @ 2024-04-24 21:40 UTC (permalink / raw)
  To: Michael Roth
  Cc: kvm, linux-coco, linux-mm, linux-crypto, x86, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, vkuznets,
	jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, dovmurik, tobin, bp, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta,
	liam.merwick, Brijesh Singh

On Thu, Apr 18, 2024, Michael Roth wrote:
> +static inline bool sev_version_greater_or_equal(u8 major, u8 minor)
> +{
> +	if (major < SNP_POLICY_API_MAJOR)
> +		return true;
> +
> +	if (major == SNP_POLICY_API_MAJOR && minor <= SNP_POLICY_API_MINOR)
> +		return true;
> +
> +	return false;
> +}
> +
> +static int snp_launch_start(struct kvm *kvm, struct kvm_sev_cmd *argp)
> +{
> +	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
> +	struct sev_data_snp_launch_start start = {0};
> +	struct kvm_sev_snp_launch_start params;
> +	u8 major, minor;
> +	int rc;
> +
> +	if (!sev_snp_guest(kvm))
> +		return -ENOTTY;
> +
> +	if (copy_from_user(&params, u64_to_user_ptr(argp->data), sizeof(params)))
> +		return -EFAULT;
> +
> +	/* Don't allow userspace to allocate memory for more than 1 SNP context. */
> +	if (sev->snp_context) {
> +		pr_debug("SEV-SNP context already exists. Refusing to allocate an additional one.\n");

What's the plan with all these printks?   There are far too many in this series.
Some might be useful, but many of them have no business landing upstream.

> +		return -EINVAL;
> +	}
> +
> +	sev->snp_context = snp_context_create(kvm, argp);
> +	if (!sev->snp_context)
> +		return -ENOTTY;
> +
> +	if (params.policy & ~SNP_POLICY_MASK_VALID) {
> +		pr_debug("SEV-SNP hypervisor does not support requested policy %llx (supported %llx).\n",

What does "SEV-SNP hypervisor" even mean?

> +			 params.policy, SNP_POLICY_MASK_VALID);
> +		return -EINVAL;
> +	}
> +
> +	if (!(params.policy & SNP_POLICY_MASK_RSVD_MBO)) {
> +		pr_debug("SEV-SNP hypervisor does not support requested policy %llx (must be set %llx).\n",
> +			 params.policy, SNP_POLICY_MASK_RSVD_MBO);
> +		return -EINVAL;
> +	}
> +
> +	if (params.policy & SNP_POLICY_MASK_SINGLE_SOCKET) {
> +		pr_debug("SEV-SNP hypervisor does not support limiting guests to a single socket.\n");
> +		return -EINVAL;
> +	}
> +
> +	if (!(params.policy & SNP_POLICY_MASK_SMT)) {
> +		pr_debug("SEV-SNP hypervisor does not support limiting guests to a single SMT thread.\n");
> +		return -EINVAL;
> +	}
> +
> +	major = (params.policy & SNP_POLICY_MASK_API_MAJOR);
> +	minor = (params.policy & SNP_POLICY_MASK_API_MINOR);
> +	if (!sev_version_greater_or_equal(major, minor)) {

Why does this need a somewhat weirdly named helper?  Isn't this just:

	if (major < SNP_POLICY_API_MAJOR ||
	    (major == SNP_POLICY_API_MAJOR && minor < SNP_POLICY_API_MINOR))
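
For reference, the open-coded condition above can be exercised on its own as
a small predicate. This is a minimal sketch: the SKETCH_API_* values are
placeholders picked for illustration, not the kernel's SNP_POLICY_API_MAJOR
and SNP_POLICY_API_MINOR definitions, and sketch_version_too_old() is a
hypothetical name.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder values for illustration only; not the kernel's definitions. */
#define SKETCH_API_MAJOR 1
#define SKETCH_API_MINOR 51

/*
 * Returns true when the requested (major, minor) pair is lower than the
 * placeholder pair above, i.e. when the open-coded condition would take
 * the error path.
 */
bool sketch_version_too_old(uint8_t major, uint8_t minor)
{
	return major < SKETCH_API_MAJOR ||
	       (major == SKETCH_API_MAJOR && minor < SKETCH_API_MINOR);
}
```

Note that this condition takes the error path for versions below the
constants, while the quoted helper, as written, takes it for versions above
them, so the two are not drop-in equivalents.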

> +		pr_debug("SEV-SNP hypervisor does not support requested version %d.%d (have %d,%d).\n",
> +			 major, minor, SNP_POLICY_API_MAJOR, SNP_POLICY_API_MINOR);
> +		return -EINVAL;
> +	}
> +
> +	start.gctx_paddr = __psp_pa(sev->snp_context);
> +	start.policy = params.policy;
> +	memcpy(start.gosvw, params.gosvw, sizeof(params.gosvw));
> +	rc = __sev_issue_cmd(argp->sev_fd, SEV_CMD_SNP_LAUNCH_START, &start, &argp->error);
> +	if (rc) {
> +		pr_debug("SEV_CMD_SNP_LAUNCH_START firmware command failed, rc %d\n", rc);
> +		goto e_free_context;
> +	}
> +
> +	sev->fd = argp->sev_fd;
> +	rc = snp_bind_asid(kvm, &argp->error);
> +	if (rc) {
> +		pr_debug("Failed to bind ASID to SEV-SNP context, rc %d\n", rc);
> +		goto e_free_context;
> +	}
> +
> +	return 0;
> +
> +e_free_context:
> +	snp_decommission_context(kvm);
> +
> +	return rc;
> +}
> +
>  int sev_mem_enc_ioctl(struct kvm *kvm, void __user *argp)
>  {
>  	struct kvm_sev_cmd sev_cmd;
> @@ -1999,6 +2154,15 @@ int sev_mem_enc_ioctl(struct kvm *kvm, void __user *argp)
>  		goto out;
>  	}
>  
> +	/*
> +	 * Once KVM_SEV_INIT2 initializes a KVM instance as an SNP guest, only
> +	 * allow the use of SNP-specific commands.
> +	 */
> +	if (sev_snp_guest(kvm) && sev_cmd.id < KVM_SEV_SNP_LAUNCH_START) {
> +		r = -EPERM;
> +		goto out;
> +	}
> +
>  	switch (sev_cmd.id) {
>  	case KVM_SEV_ES_INIT:
>  		if (!sev_es_enabled) {
> @@ -2063,6 +2227,9 @@ int sev_mem_enc_ioctl(struct kvm *kvm, void __user *argp)
>  	case KVM_SEV_RECEIVE_FINISH:
>  		r = sev_receive_finish(kvm, &sev_cmd);
>  		break;
> +	case KVM_SEV_SNP_LAUNCH_START:
> +		r = snp_launch_start(kvm, &sev_cmd);
> +		break;
>  	default:
>  		r = -EINVAL;
>  		goto out;
> @@ -2258,6 +2425,33 @@ int sev_vm_copy_enc_context_from(struct kvm *kvm, unsigned int source_fd)
>  	return ret;
>  }
>  
> +static int snp_decommission_context(struct kvm *kvm)
> +{
> +	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
> +	struct sev_data_snp_addr data = {};
> +	int ret;
> +
> +	/* If context is not created then do nothing */
> +	if (!sev->snp_context)
> +		return 0;
> +
> +	data.address = __sme_pa(sev->snp_context);
> +	down_write(&sev_deactivate_lock);
> +	ret = sev_do_cmd(SEV_CMD_SNP_DECOMMISSION, &data, NULL);
> +	if (WARN_ONCE(ret, "failed to release guest context")) {

WARN here, or WARN in the caller, not both.  And if you warn here, this can be

	down_write(&sev_deactivate_lock);
	ret = sev_do_cmd(SEV_CMD_SNP_DECOMMISSION, &data, NULL);
	up_write(&sev_deactivate_lock);

	if (WARN_ONCE(ret, "..."))

> +		up_write(&sev_deactivate_lock);
> +		return ret;
> +	}
> +
> +	up_write(&sev_deactivate_lock);
> +
> +	/* free the context page now */

This doesn't seem like a particularly useful comment.  What would be useful is
a comment explaining that the "decommission" unbinds the ASID.

> +	snp_free_firmware_page(sev->snp_context);
> +	sev->snp_context = NULL;
> +
> +	return 0;
> +}
> +
>  void sev_vm_destroy(struct kvm *kvm)
>  {
>  	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
> @@ -2299,7 +2493,15 @@ void sev_vm_destroy(struct kvm *kvm)
>  		}
>  	}
>  
> -	sev_unbind_asid(kvm, sev->handle);
> +	if (sev_snp_guest(kvm)) {
> +		if (snp_decommission_context(kvm)) {
> +			WARN_ONCE(1, "Failed to free SNP guest context, leaking asid!\n");

WARN on the actual failure, not '1'.  And a newline isn't needed.

		if (WARN_ONCE(snp_decommission_context(kvm),
			      "Failed to free SNP guest context, leaking asid!"))
			return;
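
The pattern works because WARN_ONCE() evaluates to its condition, so the
failing call can sit directly inside the if. A userspace sketch of that
control flow follows; sketch_warn_once(), sketch_decommission() and
sketch_destroy() are hypothetical stand-ins, with an explicit 'warned' flag
playing the role of the kernel macro's per-callsite static.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Userspace stand-in for WARN_ONCE(): prints the message the first time
 * the condition is true and returns the condition either way.
 */
bool sketch_warn_once(bool cond, bool *warned, const char *msg)
{
	if (cond && !*warned) {
		*warned = true;
		fprintf(stderr, "WARNING: %s\n", msg);
	}
	return cond;
}

/* Hypothetical teardown step; a nonzero return means failure. */
int sketch_decommission(int simulated_rc)
{
	return simulated_rc;
}

/* Returns true if teardown reached the ASID free, false if it bailed out. */
bool sketch_destroy(int simulated_rc, bool *warned)
{
	if (sketch_warn_once(sketch_decommission(simulated_rc) != 0, warned,
			     "Failed to free SNP guest context, leaking asid!"))
		return false;	/* bail out before freeing the ASID */

	return true;		/* the ASID free would happen here */
}
```

The warning is tied to the actual return value rather than a constant, and
repeated failures still take the early return while only the first one
prints.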

> +			return;
> +		}
> +	} else {
> +		sev_unbind_asid(kvm, sev->handle);
> +	}
> +
>  	sev_asid_free(sev);
>  }
>  
> diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
> index 7f2e9c7fc4ca..0654fc91d4db 100644
> --- a/arch/x86/kvm/svm/svm.h
> +++ b/arch/x86/kvm/svm/svm.h
> @@ -92,6 +92,7 @@ struct kvm_sev_info {
>  	struct list_head mirror_entry; /* Use as a list entry of mirrors */
>  	struct misc_cg *misc_cg; /* For misc cgroup accounting */
>  	atomic_t migration_in_progress;
> +	void *snp_context;      /* SNP guest context page */
>  };
>  
>  struct kvm_svm {
> -- 
> 2.25.1
> 


Thread overview: 41+ messages
2024-04-18 19:41 [PATCH v13 00/26] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support Michael Roth
2024-04-18 19:41 ` [PATCH v13 01/26] [TEMP] x86/kvm/Kconfig: Have KVM_AMD_SEV select ARCH_HAS_CC_PLATFORM Michael Roth
2024-04-18 19:41 ` [PATCH v13 02/26] [TEMP] x86/cc: Add cc_platform_set/_clear() helpers Michael Roth
2024-04-18 19:41 ` [PATCH v13 03/26] [TEMP] x86/CPU/AMD: Track SNP host status with cc_platform_*() Michael Roth
2024-04-18 19:41 ` [PATCH v13 04/26] KVM: guest_memfd: Fix PTR_ERR() handling in __kvm_gmem_get_pfn() Michael Roth
2024-04-19 12:58   ` David Hildenbrand
2024-04-19 15:11     ` Michael Roth
2024-04-19 16:17       ` Paolo Bonzini
2024-04-18 19:41 ` [PATCH v13 05/26] KVM: SEV: Select KVM_GENERIC_PRIVATE_MEM when CONFIG_KVM_AMD_SEV=y Michael Roth
2024-04-18 19:41 ` [PATCH v13 06/26] KVM: SEV: Add support to handle AP reset MSR protocol Michael Roth
2024-04-18 19:41 ` [PATCH v13 07/26] KVM: SEV: Add GHCB handling for Hypervisor Feature Support requests Michael Roth
2024-04-18 19:41 ` [PATCH v13 08/26] KVM: SEV: Add initial SEV-SNP support Michael Roth
2024-04-19 11:58   ` Paolo Bonzini
2024-04-18 19:41 ` [PATCH v13 09/26] KVM: SEV: Add KVM_SEV_SNP_LAUNCH_START command Michael Roth
2024-04-19 11:52   ` Paolo Bonzini
2024-04-19 14:19     ` Michael Roth
2024-04-19 16:13       ` Paolo Bonzini
2024-04-24 21:40   ` Sean Christopherson
2024-04-18 19:41 ` [PATCH v13 10/26] KVM: SEV: Add KVM_SEV_SNP_LAUNCH_UPDATE command Michael Roth
2024-04-19 11:56   ` Paolo Bonzini
2024-04-19 16:12     ` Paolo Bonzini
2024-04-21 17:52       ` Michael Roth
2024-04-18 19:41 ` [PATCH v13 11/26] KVM: SEV: Add KVM_SEV_SNP_LAUNCH_FINISH command Michael Roth
2024-04-18 19:41 ` [PATCH v13 12/26] KVM: SEV: Add support to handle GHCB GPA register VMGEXIT Michael Roth
2024-04-18 19:41 ` [PATCH v13 13/26] KVM: SEV: Add support to handle MSR based Page State Change VMGEXIT Michael Roth
2024-04-18 19:41 ` [PATCH v13 14/26] KVM: SEV: Add support to handle " Michael Roth
2024-04-18 19:41 ` [PATCH v13 15/26] KVM: SEV: Add support to handle RMP nested page faults Michael Roth
2024-04-18 19:41 ` [PATCH v13 16/26] KVM: SEV: Support SEV-SNP AP Creation NAE event Michael Roth
2024-04-19 12:01   ` Paolo Bonzini
2024-04-18 19:41 ` [PATCH v13 17/26] KVM: SEV: Add support for GHCB-based termination requests Michael Roth
2024-04-18 19:41 ` [PATCH v13 18/26] KVM: SEV: Implement gmem hook for initializing private pages Michael Roth
2024-04-18 19:41 ` [PATCH v13 19/26] KVM: SEV: Implement gmem hook for invalidating " Michael Roth
2024-04-18 19:41 ` [PATCH v13 20/26] KVM: x86: Implement gmem hook for determining max NPT mapping level Michael Roth
2024-04-18 19:41 ` [PATCH v13 21/26] KVM: SEV: Avoid WBINVD for HVA-based MMU notifications for SNP Michael Roth
2024-04-18 19:41 ` [PATCH v13 22/26] KVM: SVM: Add module parameter to enable SEV-SNP Michael Roth
2024-04-18 19:41 ` [PATCH v13 23/26] KVM: SEV: Provide support for SNP_GUEST_REQUEST NAE event Michael Roth
2024-04-18 19:41 ` [PATCH v13 24/26] crypto: ccp: Add the SNP_VLEK_LOAD command Michael Roth
2024-04-18 19:41 ` [PATCH v13 25/26] crypto: ccp: Add the SNP_{PAUSE,RESUME}_ATTESTATION commands Michael Roth
2024-04-18 19:41 ` [PATCH v13 26/26] KVM: SEV: Provide support for SNP_EXTENDED_GUEST_REQUEST NAE event Michael Roth
2024-04-19 12:04 ` [PATCH v13 00/26] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support Paolo Bonzini
2024-04-21 18:00   ` Michael Roth
