linux-crypto.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 00/25] Add AMD Secure Nested Paging (SEV-SNP) Initialization Support
@ 2024-01-26  4:11 Michael Roth
  2024-01-26  4:11 ` [PATCH v2 01/25] x86/cpufeatures: Add SEV-SNP CPU feature Michael Roth
                   ` (25 more replies)
  0 siblings, 26 replies; 47+ messages in thread
From: Michael Roth @ 2024-01-26  4:11 UTC (permalink / raw)
  To: x86
  Cc: kvm, linux-coco, linux-mm, linux-crypto, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, tobin, bp, vbabka, kirill, ak,
	tony.luck, sathyanarayanan.kuppuswamy, alpergun, jarkko,
	ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick

This patchset is also available at:

  https://github.com/amdese/linux/commits/snp-host-init-v2

and is based on top of linux-next tag next-20240125

These patches were originally included in v10 of the SNP KVM/hypervisor
patches[1], but have been split off from the general KVM support for easier
review and eventual merging into the x86 tree. They are based on linux-next
to help stay in sync with both tip and kvm-next.

There is 1 KVM-specific patch here since it is needed to avoid regressions
when running legacy SEV guests while the RMP table is enabled.

== OVERVIEW ==

AMD EPYC systems utilizing Zen 3 and newer microarchitectures add support
for a new feature called SEV-SNP, which adds Secure Nested Paging support
on top of the SEV/SEV-ES support already present on existing EPYC systems.

One of the main features of SNP is the addition of an RMP (Reverse Map)
table to enforce additional security protections for private guest memory.
This series primarily focuses on the various host initialization
requirements for enabling SNP on the system, while the actual KVM support
for running SNP guests is added as a separate series based on top of these
patches.

The basic requirements to initialize SNP support on a host when the feature
has been enabled in the BIOS are:

  - Discovering and initializing the RMP table
  - Initializing various MSRs to enable the capability across CPUs
  - Various tasks to maintain legacy functionality on the system, such as:
    - Setting up hooks for handling RMP-related changes for IOMMU pages
    - Initializing SNP in the firmware via the CCP driver, and implement
      additional requirements needed for continued operation of legacy
      SEV/SEV-ES guests

Additionally some basic SEV ioctl interfaces are added to configure various
aspects of SNP-enabled firmwares via the CCP driver.

More details are available in the SEV-SNP Firmware ABI[2].

Feedback/review is very much appreciated!

-Mike

[1] https://lore.kernel.org/kvm/20231016132819.1002933-1-michael.roth@amd.com/ 
[2] https://www.amd.com/en/developer/sev.html 

Changes since v1:

 * rebased on linux-next tag next-202401125
 * patch 26: crypto: ccp: Add the SNP_SET_CONFIG command
   - documentation updates (Boris)
 * patch 25: crypto: ccp: Add the SNP_COMMIT command
   - s/len/length/ in SNP_COMMIT kernel-doc (Boris)
   - s/length/len/ in struct for consistency with other cmds/structs
 * patch 24: crypto: ccp: Add the SNP_PLATFORM_STATUS command
   - add comments regarding the need for page reclaim of status page (Boris)
 * patch 21: crypto: ccp: Add panic notifier for SEV/SNP firmware shutdown on kdump
   - enable crash_kexec_post_notifiers by default for SNP (Ashish)
   - squash in refactorings from Boris, but keep wbinvd_on_all_cpus() so that
     non-SNP/non-panic case is still handled as it was prior to SNP.
 * patch 20: crypto: ccp: Add debug support for decrypting pages
   - not utilized in current code, dropped it for now (Sean)
 * patch 18: crypto: ccp: Handle legacy SEV commands when SNP is enabled
   - drop uneeded index variables used when rolling back descriptor mappings
     when failure occurs (Boris)
   - error message fixups (Boris)
   - fix indentation to align each argument description line to same column
     as recommended by kernel-doc documentation
   - attempt to unmap/reclaim all descriptors even if a previous unmap/reclaim
     failed
   - s/restored to/restored to hypervisor-owned/ in commit message
   - fix reclaim clobbering return value when a command failure occurs
   - reclaim cmd buffers and descriptors separately so that active cmd buffers
     can be marked !inuse even when reclaim for a descriptor fails
 * patch 17: crypto: ccp: Handle non-volatile INIT_EX data when SNP is enabled
   - drop usage of vmap() to access INIT_EX non-volatile data buffer, no longer
     needed now that rmpupdate() splits directmap rather than removes mappings
 * patch 15: x86/sev: Introduce snp leaked pages list
   - add Suggested-by: Vlastimil Babka <vbabka@suse.cz>
   - avoid the need to allocate memory when leaking unreclaimable pages
     (Vlastimil, Ashish)
   - use pr_warn() instead of pr_debug() when leaking (Sean)
 * patch 14: crypto: ccp: Provide API to issue SEV and SNP commands
   - s/SEV/SEV device/ in kernel-doc for sev_do_cmd() (Boris) 
 * patch 13: crypto: ccp: Add support to initialize the AMD-SP for SEV-SNP
   - account for non-page-aligned HV-Fixed ranges when calling SNP_INIT (Ashish)
   - various comment/commit fixups (Boris)
 * patch 12: crypto: ccp: Define the SEV-SNP commands
   - fix alignment of SEV_CMD_SNP_DOWNLOAD_FIRMWARE_EX enum
   - add "Incoming Migration Image" in all places that mention of IMI for
     purposes of documentation (Boris)
   - fix indentation to align each argument description line to same column
     in accordance with Documentation/doc-guide/kernel-doc.rst
   - rename sev_data_snp_addr.paddr_gctx to more appropriate
     sev_data_snp_addr.address, fix up affected patches
   - len/length in kernel-doc descriptions
 * patch 11: x86/sev: Invalidate pages from the direct map when adding them to the RMP table
   - replace old patch/handling with an implementation based on set_memory_4k()
     that splits directmap as-needed rather than adding/removing mappings in
     response to shared/private conversions.
   - lookup/confirm the mappings in advance of splitting, and avoid unecessary
     splits for 2M private ranges to reduce unecessary contention / TLB flushing
     (Dave, Boris, Tom, Mike, Vlastimil)
 * patch 10: x86/sev: Add helper functions for RMPUPDATE and PSMASH instruction
   - add additional comments clarifying RMPUPDATE retry logic (Boris)
 * patch 07: x86/fault: Add helper for dumping RMP entries
   - fix dump logic for 2MB+ pages (Tom)
   - dump full 2MB range if particular 4K RMP entry is not populated (Boris)
   - drop EXPORT_SYMBOL_GPL() until actually needed by a module (Boris)
 * patch 04: x86/sev: Add the host SEV-SNP initialization support
   - macro alignment fixups (Boris)
   - commit message cleanups (Boris)
   - work in Zen5 patch (Boris)
   - squash __snp_init_rmptable in helper back in (Boris)
 * patch 03: iommu/amd: Don't rely on external callers to enable IOMMU SNP support
   - add Joerg's acked-by
   - error message cleanups (Boris)

Changes since being split off from v10 hypervisor patches:

 * Move all host initialization patches to beginning of overall series and
   post as a separate patchset. Include only KVM patches that are necessary
   for maintaining legacy SVM/SEV functionality with SNP enabled.
   (Paolo, Boris)
 * Don't enable X86_FEATURE_SEV_SNP until all patches are in place to maintain
   legacy SVM/SEV/SEV-ES functionality when RMP table and SNP firmware support
   are enabled. (Paolo)
 * Re-write how firmware-owned buffers are handled when dealing with legacy
   SEV commands. Allocate on-demand rather than relying on pre-allocated pool,
   use a descriptor format for handling nested/pointer params instead of
   relying on macros. (Boris)
 * Don't introduce sev-host.h, re-use sev.h (Boris)
 * Various renames, cleanups, refactorings throughout the tree (Boris)
 * Rework leaked pages handling (Vlastimil)
 * Fix kernel-doc errors introduced by series (Boris)
 * Fix warnings when onlining/offlining CPUs (Jeremi, Ashish)
 * Ensure X86_FEATURE_SEV_SNP is cleared early enough that AutoIBRS will still
   be enabled if RMP table support is not available/configured. (Tom)
 * Only read the RMP base/end MSR values once via BSP (Boris)
 * Handle IOMMU SNP setup automatically based on state machine rather than via
   external caller (Boris)

----------------------------------------------------------------
Ashish Kalra (5):
      iommu/amd: Don't rely on external callers to enable IOMMU SNP support
      x86/mtrr: Don't print errors if MtrrFixDramModEn is set when SNP enabled
      x86/sev: Introduce snp leaked pages list
      iommu/amd: Clean up RMP entries for IOMMU pages during SNP shutdown
      crypto: ccp: Add panic notifier for SEV/SNP firmware shutdown on kdump

Brijesh Singh (14):
      x86/cpufeatures: Add SEV-SNP CPU feature
      x86/sev: Add the host SEV-SNP initialization support
      x86/sev: Add RMP entry lookup helpers
      x86/fault: Add helper for dumping RMP entries
      x86/traps: Define RMP violation #PF error code
      x86/sev: Add helper functions for RMPUPDATE and PSMASH instruction
      crypto: ccp: Define the SEV-SNP commands
      crypto: ccp: Add support to initialize the AMD-SP for SEV-SNP
      crypto: ccp: Provide API to issue SEV and SNP commands
      crypto: ccp: Handle the legacy TMR allocation when SNP is enabled
      crypto: ccp: Handle legacy SEV commands when SNP is enabled
      KVM: SEV: Make AVIC backing, VMSA and VMCB memory allocation SNP safe
      crypto: ccp: Add the SNP_PLATFORM_STATUS command
      crypto: ccp: Add the SNP_SET_CONFIG command

Kim Phillips (1):
      x86/speculation: Do not enable Automatic IBRS if SEV SNP is enabled

Michael Roth (3):
      x86/fault: Dump RMP table information when RMP page faults occur
      x86/sev: Adjust directmap to avoid inadvertant RMP faults
      x86/cpufeatures: Enable/unmask SEV-SNP CPU feature

Tom Lendacky (2):
      crypto: ccp: Handle non-volatile INIT_EX data when SNP is enabled
      crypto: ccp: Add the SNP_COMMIT command

 Documentation/virt/coco/sev-guest.rst    |   51 ++
 arch/x86/Kbuild                          |    2 +
 arch/x86/include/asm/cpufeatures.h       |    1 +
 arch/x86/include/asm/disabled-features.h |    8 +-
 arch/x86/include/asm/iommu.h             |    1 +
 arch/x86/include/asm/kvm-x86-ops.h       |    1 +
 arch/x86/include/asm/kvm_host.h          |    1 +
 arch/x86/include/asm/msr-index.h         |   11 +-
 arch/x86/include/asm/sev.h               |   38 +
 arch/x86/include/asm/trap_pf.h           |   20 +-
 arch/x86/kernel/cpu/amd.c                |   21 +-
 arch/x86/kernel/cpu/common.c             |    7 +-
 arch/x86/kernel/cpu/mtrr/generic.c       |    3 +
 arch/x86/kernel/crash.c                  |    3 +
 arch/x86/kernel/sev.c                    |   10 +
 arch/x86/kvm/lapic.c                     |    5 +-
 arch/x86/kvm/svm/nested.c                |    2 +-
 arch/x86/kvm/svm/sev.c                   |   37 +-
 arch/x86/kvm/svm/svm.c                   |   17 +-
 arch/x86/kvm/svm/svm.h                   |    1 +
 arch/x86/mm/fault.c                      |    5 +
 arch/x86/virt/svm/Makefile               |    3 +
 arch/x86/virt/svm/sev.c                  |  547 +++++++++++++++
 drivers/crypto/ccp/sev-dev.c             | 1122 ++++++++++++++++++++++++++++--
 drivers/crypto/ccp/sev-dev.h             |    5 +
 drivers/iommu/amd/amd_iommu.h            |    1 -
 drivers/iommu/amd/init.c                 |  120 +++-
 include/linux/amd-iommu.h                |    6 +-
 include/linux/psp-sev.h                  |  319 ++++++++-
 include/uapi/linux/psp-sev.h             |   59 ++
 tools/arch/x86/include/asm/cpufeatures.h |    1 +
 31 files changed, 2303 insertions(+), 125 deletions(-)
 create mode 100644 arch/x86/virt/svm/Makefile
 create mode 100644 arch/x86/virt/svm/sev.c




^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH v2 01/25] x86/cpufeatures: Add SEV-SNP CPU feature
  2024-01-26  4:11 [PATCH v2 00/25] Add AMD Secure Nested Paging (SEV-SNP) Initialization Support Michael Roth
@ 2024-01-26  4:11 ` Michael Roth
  2024-01-26  4:11 ` [PATCH v2 02/25] x86/speculation: Do not enable Automatic IBRS if SEV SNP is enabled Michael Roth
                   ` (24 subsequent siblings)
  25 siblings, 0 replies; 47+ messages in thread
From: Michael Roth @ 2024-01-26  4:11 UTC (permalink / raw)
  To: x86
  Cc: kvm, linux-coco, linux-mm, linux-crypto, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, tobin, bp, vbabka, kirill, ak,
	tony.luck, sathyanarayanan.kuppuswamy, alpergun, jarkko,
	ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick,
	Brijesh Singh, Jarkko Sakkinen, Ashish Kalra

From: Brijesh Singh <brijesh.singh@amd.com>

Add CPU feature detection for Secure Encrypted Virtualization with
Secure Nested Paging. This feature adds a strong memory integrity
protection to help prevent malicious hypervisor-based attacks like
data replay, memory re-mapping, and more.

Since enabling the SNP CPU feature imposes a number of additional
requirements on host initialization and handling legacy firmware APIs
for SEV/SEV-ES guests, only introduce the CPU feature bit so that the
relevant handling can be added, but leave it disabled via a
disabled-features mask.

Once all the necessary changes needed to maintain legacy SEV/SEV-ES
support are introduced in subsequent patches, the SNP feature bit will
be unmasked/enabled.

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Jarkko Sakkinen <jarkko@profian.com>
Signed-off-by: Ashish Kalra <Ashish.Kalra@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 arch/x86/include/asm/cpufeatures.h       | 1 +
 arch/x86/include/asm/disabled-features.h | 4 +++-
 arch/x86/kernel/cpu/amd.c                | 5 +++--
 tools/arch/x86/include/asm/cpufeatures.h | 1 +
 4 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index fdf723b6f6d0..0fa702673e73 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -440,6 +440,7 @@
 #define X86_FEATURE_SEV			(19*32+ 1) /* AMD Secure Encrypted Virtualization */
 #define X86_FEATURE_VM_PAGE_FLUSH	(19*32+ 2) /* "" VM Page Flush MSR is supported */
 #define X86_FEATURE_SEV_ES		(19*32+ 3) /* AMD Secure Encrypted Virtualization - Encrypted State */
+#define X86_FEATURE_SEV_SNP		(19*32+ 4) /* AMD Secure Encrypted Virtualization - Secure Nested Paging */
 #define X86_FEATURE_V_TSC_AUX		(19*32+ 9) /* "" Virtual TSC_AUX */
 #define X86_FEATURE_SME_COHERENT	(19*32+10) /* "" AMD hardware-enforced cache coherency */
 #define X86_FEATURE_DEBUG_SWAP		(19*32+14) /* AMD SEV-ES full debug state swap support */
diff --git a/arch/x86/include/asm/disabled-features.h b/arch/x86/include/asm/disabled-features.h
index 36d0c1e05e60..1ea64d4e7021 100644
--- a/arch/x86/include/asm/disabled-features.h
+++ b/arch/x86/include/asm/disabled-features.h
@@ -117,6 +117,8 @@
 #define DISABLE_IBT	(1 << (X86_FEATURE_IBT & 31))
 #endif
 
+#define DISABLE_SEV_SNP		(1 << (X86_FEATURE_SEV_SNP & 31))
+
 /*
  * Make sure to add features to the correct mask
  */
@@ -141,7 +143,7 @@
 			 DISABLE_ENQCMD)
 #define DISABLED_MASK17	0
 #define DISABLED_MASK18	(DISABLE_IBT)
-#define DISABLED_MASK19	0
+#define DISABLED_MASK19	(DISABLE_SEV_SNP)
 #define DISABLED_MASK20	0
 #define DISABLED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 21)
 
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 34e5c2cb8042..79153e9b92b5 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -602,8 +602,8 @@ static void early_detect_mem_encrypt(struct cpuinfo_x86 *c)
 	 *	      SME feature (set in scattered.c).
 	 *	      If the kernel has not enabled SME via any means then
 	 *	      don't advertise the SME feature.
-	 *   For SEV: If BIOS has not enabled SEV then don't advertise the
-	 *            SEV and SEV_ES feature (set in scattered.c).
+	 *   For SEV: If BIOS has not enabled SEV then don't advertise SEV and
+	 *	      any additional functionality based on it.
 	 *
 	 *   In all cases, since support for SME and SEV requires long mode,
 	 *   don't advertise the feature under CONFIG_X86_32.
@@ -638,6 +638,7 @@ static void early_detect_mem_encrypt(struct cpuinfo_x86 *c)
 clear_sev:
 		setup_clear_cpu_cap(X86_FEATURE_SEV);
 		setup_clear_cpu_cap(X86_FEATURE_SEV_ES);
+		setup_clear_cpu_cap(X86_FEATURE_SEV_SNP);
 	}
 }
 
diff --git a/tools/arch/x86/include/asm/cpufeatures.h b/tools/arch/x86/include/asm/cpufeatures.h
index f4542d2718f4..e58bd69356ee 100644
--- a/tools/arch/x86/include/asm/cpufeatures.h
+++ b/tools/arch/x86/include/asm/cpufeatures.h
@@ -437,6 +437,7 @@
 #define X86_FEATURE_SEV			(19*32+ 1) /* AMD Secure Encrypted Virtualization */
 #define X86_FEATURE_VM_PAGE_FLUSH	(19*32+ 2) /* "" VM Page Flush MSR is supported */
 #define X86_FEATURE_SEV_ES		(19*32+ 3) /* AMD Secure Encrypted Virtualization - Encrypted State */
+#define X86_FEATURE_SEV_SNP		(19*32+ 4) /* AMD Secure Encrypted Virtualization - Secure Nested Paging */
 #define X86_FEATURE_V_TSC_AUX		(19*32+ 9) /* "" Virtual TSC_AUX */
 #define X86_FEATURE_SME_COHERENT	(19*32+10) /* "" AMD hardware-enforced cache coherency */
 #define X86_FEATURE_DEBUG_SWAP		(19*32+14) /* AMD SEV-ES full debug state swap support */
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v2 02/25] x86/speculation: Do not enable Automatic IBRS if SEV SNP is enabled
  2024-01-26  4:11 [PATCH v2 00/25] Add AMD Secure Nested Paging (SEV-SNP) Initialization Support Michael Roth
  2024-01-26  4:11 ` [PATCH v2 01/25] x86/cpufeatures: Add SEV-SNP CPU feature Michael Roth
@ 2024-01-26  4:11 ` Michael Roth
  2024-01-26  4:11 ` [PATCH v2 03/25] iommu/amd: Don't rely on external callers to enable IOMMU SNP support Michael Roth
                   ` (23 subsequent siblings)
  25 siblings, 0 replies; 47+ messages in thread
From: Michael Roth @ 2024-01-26  4:11 UTC (permalink / raw)
  To: x86
  Cc: kvm, linux-coco, linux-mm, linux-crypto, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, tobin, bp, vbabka, kirill, ak,
	tony.luck, sathyanarayanan.kuppuswamy, alpergun, jarkko,
	ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick,
	Kim Phillips, Dave Hansen

From: Kim Phillips <kim.phillips@amd.com>

Without SEV-SNP, Automatic IBRS protects only the kernel. But when
SEV-SNP is enabled, the Automatic IBRS protection umbrella widens to all
host-side code, including userspace. This protection comes at a cost:
reduced userspace indirect branch performance.

To avoid this performance loss, don't use Automatic IBRS on SEV-SNP
hosts. Fall back to retpolines instead.

Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Acked-by: Dave Hansen <dave.hansen@intel.com>
[mdr: squash in changes from review discussion]
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 arch/x86/kernel/cpu/common.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 8f367d376520..6b253440ea72 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1355,8 +1355,13 @@ static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c)
 	/*
 	 * AMD's AutoIBRS is equivalent to Intel's eIBRS - use the Intel feature
 	 * flag and protect from vendor-specific bugs via the whitelist.
+	 *
+	 * Don't use AutoIBRS when SNP is enabled because it degrades host
+	 * userspace indirect branch performance.
 	 */
-	if ((ia32_cap & ARCH_CAP_IBRS_ALL) || cpu_has(c, X86_FEATURE_AUTOIBRS)) {
+	if ((ia32_cap & ARCH_CAP_IBRS_ALL) ||
+	    (cpu_has(c, X86_FEATURE_AUTOIBRS) &&
+	     !cpu_feature_enabled(X86_FEATURE_SEV_SNP))) {
 		setup_force_cpu_cap(X86_FEATURE_IBRS_ENHANCED);
 		if (!cpu_matches(cpu_vuln_whitelist, NO_EIBRS_PBRSB) &&
 		    !(ia32_cap & ARCH_CAP_PBRSB_NO))
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v2 03/25] iommu/amd: Don't rely on external callers to enable IOMMU SNP support
  2024-01-26  4:11 [PATCH v2 00/25] Add AMD Secure Nested Paging (SEV-SNP) Initialization Support Michael Roth
  2024-01-26  4:11 ` [PATCH v2 01/25] x86/cpufeatures: Add SEV-SNP CPU feature Michael Roth
  2024-01-26  4:11 ` [PATCH v2 02/25] x86/speculation: Do not enable Automatic IBRS if SEV SNP is enabled Michael Roth
@ 2024-01-26  4:11 ` Michael Roth
  2024-01-26  4:11 ` [PATCH v2 04/25] x86/sev: Add the host SEV-SNP initialization support Michael Roth
                   ` (22 subsequent siblings)
  25 siblings, 0 replies; 47+ messages in thread
From: Michael Roth @ 2024-01-26  4:11 UTC (permalink / raw)
  To: x86
  Cc: kvm, linux-coco, linux-mm, linux-crypto, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, tobin, bp, vbabka, kirill, ak,
	tony.luck, sathyanarayanan.kuppuswamy, alpergun, jarkko,
	ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick

From: Ashish Kalra <ashish.kalra@amd.com>

Currently the expectation is that the kernel will call
amd_iommu_snp_enable() to perform various checks and set the
amd_iommu_snp_en flag that the IOMMU uses to adjust its setup routines
to account for additional requirements on hosts where SNP is enabled.

This is somewhat fragile as it relies on this call being done prior to
IOMMU setup. It is more robust to just do this automatically as part of
IOMMU initialization, so rework the code accordingly.

There is still a need to export information about whether or not the
IOMMU is configured in a manner compatible with SNP, so relocate the
existing amd_iommu_snp_en flag so it can be used to convey that
information in place of the return code that was previously provided by
calls to amd_iommu_snp_enable().

While here, also adjust the kernel messages related to IOMMU SNP
enablement for consistency/grammar/clarity.

Suggested-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Acked-by: Joerg Roedel <jroedel@suse.de>
Co-developed-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 arch/x86/include/asm/iommu.h  |  1 +
 drivers/iommu/amd/amd_iommu.h |  1 -
 drivers/iommu/amd/init.c      | 69 ++++++++++++++++-------------------
 include/linux/amd-iommu.h     |  4 --
 4 files changed, 32 insertions(+), 43 deletions(-)

diff --git a/arch/x86/include/asm/iommu.h b/arch/x86/include/asm/iommu.h
index 2fd52b65deac..3be2451e7bc8 100644
--- a/arch/x86/include/asm/iommu.h
+++ b/arch/x86/include/asm/iommu.h
@@ -10,6 +10,7 @@ extern int force_iommu, no_iommu;
 extern int iommu_detected;
 extern int iommu_merge;
 extern int panic_on_overflow;
+extern bool amd_iommu_snp_en;
 
 #ifdef CONFIG_SWIOTLB
 extern bool x86_swiotlb_enable;
diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 8b3601f285fd..c970eae2313d 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -164,5 +164,4 @@ void amd_iommu_domain_set_pgtable(struct protection_domain *domain,
 				  u64 *root, int mode);
 struct dev_table_entry *get_dev_table(struct amd_iommu *iommu);
 
-extern bool amd_iommu_snp_en;
 #endif
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index c83bd0c2a1c9..3a4eeb26d515 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -3221,6 +3221,36 @@ static bool __init detect_ivrs(void)
 	return true;
 }
 
+static void iommu_snp_enable(void)
+{
+#ifdef CONFIG_KVM_AMD_SEV
+	if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP))
+		return;
+	/*
+	 * The SNP support requires that IOMMU must be enabled, and is
+	 * not configured in the passthrough mode.
+	 */
+	if (no_iommu || iommu_default_passthrough()) {
+		pr_err("SNP: IOMMU disabled or configured in passthrough mode, SNP cannot be supported.\n");
+		return;
+	}
+
+	amd_iommu_snp_en = check_feature(FEATURE_SNP);
+	if (!amd_iommu_snp_en) {
+		pr_err("SNP: IOMMU SNP feature not enabled, SNP cannot be supported.\n");
+		return;
+	}
+
+	pr_info("IOMMU SNP support enabled.\n");
+
+	/* Enforce IOMMU v1 pagetable when SNP is enabled. */
+	if (amd_iommu_pgtable != AMD_IOMMU_V1) {
+		pr_warn("Forcing use of AMD IOMMU v1 page table due to SNP.\n");
+		amd_iommu_pgtable = AMD_IOMMU_V1;
+	}
+#endif
+}
+
 /****************************************************************************
  *
  * AMD IOMMU Initialization State Machine
@@ -3256,6 +3286,7 @@ static int __init state_next(void)
 		break;
 	case IOMMU_ENABLED:
 		register_syscore_ops(&amd_iommu_syscore_ops);
+		iommu_snp_enable();
 		ret = amd_iommu_init_pci();
 		init_state = ret ? IOMMU_INIT_ERROR : IOMMU_PCI_INIT;
 		break;
@@ -3766,41 +3797,3 @@ int amd_iommu_pc_set_reg(struct amd_iommu *iommu, u8 bank, u8 cntr, u8 fxn, u64
 
 	return iommu_pc_get_set_reg(iommu, bank, cntr, fxn, value, true);
 }
-
-#ifdef CONFIG_AMD_MEM_ENCRYPT
-int amd_iommu_snp_enable(void)
-{
-	/*
-	 * The SNP support requires that IOMMU must be enabled, and is
-	 * not configured in the passthrough mode.
-	 */
-	if (no_iommu || iommu_default_passthrough()) {
-		pr_err("SNP: IOMMU is disabled or configured in passthrough mode, SNP cannot be supported");
-		return -EINVAL;
-	}
-
-	/*
-	 * Prevent enabling SNP after IOMMU_ENABLED state because this process
-	 * affect how IOMMU driver sets up data structures and configures
-	 * IOMMU hardware.
-	 */
-	if (init_state > IOMMU_ENABLED) {
-		pr_err("SNP: Too late to enable SNP for IOMMU.\n");
-		return -EINVAL;
-	}
-
-	amd_iommu_snp_en = check_feature(FEATURE_SNP);
-	if (!amd_iommu_snp_en)
-		return -EINVAL;
-
-	pr_info("SNP enabled\n");
-
-	/* Enforce IOMMU v1 pagetable when SNP is enabled. */
-	if (amd_iommu_pgtable != AMD_IOMMU_V1) {
-		pr_warn("Force to using AMD IOMMU v1 page table due to SNP\n");
-		amd_iommu_pgtable = AMD_IOMMU_V1;
-	}
-
-	return 0;
-}
-#endif
diff --git a/include/linux/amd-iommu.h b/include/linux/amd-iommu.h
index dc7ed2f46886..7365be00a795 100644
--- a/include/linux/amd-iommu.h
+++ b/include/linux/amd-iommu.h
@@ -85,8 +85,4 @@ int amd_iommu_pc_get_reg(struct amd_iommu *iommu, u8 bank, u8 cntr, u8 fxn,
 		u64 *value);
 struct amd_iommu *get_amd_iommu(unsigned int idx);
 
-#ifdef CONFIG_AMD_MEM_ENCRYPT
-int amd_iommu_snp_enable(void);
-#endif
-
 #endif /* _ASM_X86_AMD_IOMMU_H */
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v2 04/25] x86/sev: Add the host SEV-SNP initialization support
  2024-01-26  4:11 [PATCH v2 00/25] Add AMD Secure Nested Paging (SEV-SNP) Initialization Support Michael Roth
                   ` (2 preceding siblings ...)
  2024-01-26  4:11 ` [PATCH v2 03/25] iommu/amd: Don't rely on external callers to enable IOMMU SNP support Michael Roth
@ 2024-01-26  4:11 ` Michael Roth
  2024-01-26  4:11 ` [PATCH v2 05/25] x86/mtrr: Don't print errors if MtrrFixDramModEn is set when SNP enabled Michael Roth
                   ` (21 subsequent siblings)
  25 siblings, 0 replies; 47+ messages in thread
From: Michael Roth @ 2024-01-26  4:11 UTC (permalink / raw)
  To: x86
  Cc: kvm, linux-coco, linux-mm, linux-crypto, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, tobin, bp, vbabka, kirill, ak,
	tony.luck, sathyanarayanan.kuppuswamy, alpergun, jarkko,
	ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick,
	Brijesh Singh

From: Brijesh Singh <brijesh.singh@amd.com>

The memory integrity guarantees of SEV-SNP are enforced through a new
structure called the Reverse Map Table (RMP). The RMP is a single data
structure shared across the system that contains one entry for every 4K
page of DRAM that may be used by SEV-SNP VMs. The APM Volume 2 section
on Secure Nested Paging (SEV-SNP) details a number of steps needed to
detect/enable SEV-SNP and RMP table support on the host:

 - Detect SEV-SNP support based on CPUID bit
 - Initialize the RMP table memory reported by the RMP base/end MSR
   registers and configure IOMMU to be compatible with RMP access
   restrictions
 - Set the MtrrFixDramModEn bit in SYSCFG MSR
 - Set the SecureNestedPagingEn and VMPLEn bits in the SYSCFG MSR
 - Configure IOMMU

RMP table entry format is non-architectural and it can vary by
processor. It is defined by the PPR document for each respective CPU
family. Restrict SNP support to CPU models/families which are compatible
with the current RMP table entry format to guard against any undefined
behavior when running on other system types. Future models/support will
handle this through an architectural mechanism to allow for broader
compatibility.

SNP host code depends on CONFIG_KVM_AMD_SEV config flag which may be
enabled even when CONFIG_AMD_MEM_ENCRYPT isn't set, so update the
SNP-specific IOMMU helpers used here to rely on CONFIG_KVM_AMD_SEV
instead of CONFIG_AMD_MEM_ENCRYPT.

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Co-developed-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Co-developed-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Co-developed-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Co-developed-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 arch/x86/Kbuild                  |   2 +
 arch/x86/include/asm/msr-index.h |  11 +-
 arch/x86/include/asm/sev.h       |   6 +
 arch/x86/kernel/cpu/amd.c        |  16 +++
 arch/x86/virt/svm/Makefile       |   3 +
 arch/x86/virt/svm/sev.c          | 216 +++++++++++++++++++++++++++++++
 6 files changed, 253 insertions(+), 1 deletion(-)
 create mode 100644 arch/x86/virt/svm/Makefile
 create mode 100644 arch/x86/virt/svm/sev.c

diff --git a/arch/x86/Kbuild b/arch/x86/Kbuild
index 5a83da703e87..6a1f36df6a18 100644
--- a/arch/x86/Kbuild
+++ b/arch/x86/Kbuild
@@ -28,5 +28,7 @@ obj-y += net/
 
 obj-$(CONFIG_KEXEC_FILE) += purgatory/
 
+obj-y += virt/svm/
+
 # for cleaning
 subdir- += boot tools
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index f1bd7b91b3c6..f482bc6a5ae7 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -599,6 +599,8 @@
 #define MSR_AMD64_SEV_ENABLED		BIT_ULL(MSR_AMD64_SEV_ENABLED_BIT)
 #define MSR_AMD64_SEV_ES_ENABLED	BIT_ULL(MSR_AMD64_SEV_ES_ENABLED_BIT)
 #define MSR_AMD64_SEV_SNP_ENABLED	BIT_ULL(MSR_AMD64_SEV_SNP_ENABLED_BIT)
+#define MSR_AMD64_RMP_BASE		0xc0010132
+#define MSR_AMD64_RMP_END		0xc0010133
 
 /* SNP feature bits enabled by the hypervisor */
 #define MSR_AMD64_SNP_VTOM			BIT_ULL(3)
@@ -708,8 +710,15 @@
 #define MSR_K8_TOP_MEM1			0xc001001a
 #define MSR_K8_TOP_MEM2			0xc001001d
 #define MSR_AMD64_SYSCFG		0xc0010010
-#define MSR_AMD64_SYSCFG_MEM_ENCRYPT_BIT	23
+#define MSR_AMD64_SYSCFG_MEM_ENCRYPT_BIT 23
 #define MSR_AMD64_SYSCFG_MEM_ENCRYPT	BIT_ULL(MSR_AMD64_SYSCFG_MEM_ENCRYPT_BIT)
+#define MSR_AMD64_SYSCFG_SNP_EN_BIT	24
+#define MSR_AMD64_SYSCFG_SNP_EN		BIT_ULL(MSR_AMD64_SYSCFG_SNP_EN_BIT)
+#define MSR_AMD64_SYSCFG_SNP_VMPL_EN_BIT 25
+#define MSR_AMD64_SYSCFG_SNP_VMPL_EN	BIT_ULL(MSR_AMD64_SYSCFG_SNP_VMPL_EN_BIT)
+#define MSR_AMD64_SYSCFG_MFDM_BIT	19
+#define MSR_AMD64_SYSCFG_MFDM		BIT_ULL(MSR_AMD64_SYSCFG_MFDM_BIT)
+
 #define MSR_K8_INT_PENDING_MSG		0xc0010055
 /* C1E active bits in int pending message */
 #define K8_INTP_C1E_ACTIVE_MASK		0x18000000
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 5b4a1ce3d368..1f59d8ba9776 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -243,4 +243,10 @@ static inline u64 snp_get_unsupported_features(u64 status) { return 0; }
 static inline u64 sev_get_status(void) { return 0; }
 #endif
 
+#ifdef CONFIG_KVM_AMD_SEV
+bool snp_probe_rmptable_info(void);
+#else
+static inline bool snp_probe_rmptable_info(void) { return false; }
+#endif
+
 #endif
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 79153e9b92b5..f48c51640c65 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -20,6 +20,7 @@
 #include <asm/delay.h>
 #include <asm/debugreg.h>
 #include <asm/resctrl.h>
+#include <asm/sev.h>
 
 #ifdef CONFIG_X86_64
 # include <asm/mmconfig.h>
@@ -584,6 +585,21 @@ static void bsp_init_amd(struct cpuinfo_x86 *c)
 		break;
 	}
 
+	if (cpu_has(c, X86_FEATURE_SEV_SNP)) {
+		/*
+		 * RMP table entry format is not architectural and it can vary by processor
+		 * and is defined by the per-processor PPR. Restrict SNP support on the
+		 * known CPU model and family for which the RMP table entry format is
+		 * currently defined for.
+		 */
+		if (!boot_cpu_has(X86_FEATURE_ZEN3) &&
+		    !boot_cpu_has(X86_FEATURE_ZEN4) &&
+		    !boot_cpu_has(X86_FEATURE_ZEN5))
+			setup_clear_cpu_cap(X86_FEATURE_SEV_SNP);
+		else if (!snp_probe_rmptable_info())
+			setup_clear_cpu_cap(X86_FEATURE_SEV_SNP);
+	}
+
 	return;
 
 warn:
diff --git a/arch/x86/virt/svm/Makefile b/arch/x86/virt/svm/Makefile
new file mode 100644
index 000000000000..ef2a31bdcc70
--- /dev/null
+++ b/arch/x86/virt/svm/Makefile
@@ -0,0 +1,3 @@
+# SPDX-License-Identifier: GPL-2.0
+
+obj-$(CONFIG_KVM_AMD_SEV) += sev.o
diff --git a/arch/x86/virt/svm/sev.c b/arch/x86/virt/svm/sev.c
new file mode 100644
index 000000000000..575a9ff046cb
--- /dev/null
+++ b/arch/x86/virt/svm/sev.c
@@ -0,0 +1,216 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * AMD SVM-SEV Host Support.
+ *
+ * Copyright (C) 2023 Advanced Micro Devices, Inc.
+ *
+ * Author: Ashish Kalra <ashish.kalra@amd.com>
+ *
+ */
+
+#include <linux/cc_platform.h>
+#include <linux/printk.h>
+#include <linux/mm_types.h>
+#include <linux/set_memory.h>
+#include <linux/memblock.h>
+#include <linux/kernel.h>
+#include <linux/mm.h>
+#include <linux/cpumask.h>
+#include <linux/iommu.h>
+#include <linux/amd-iommu.h>
+
+#include <asm/sev.h>
+#include <asm/processor.h>
+#include <asm/setup.h>
+#include <asm/svm.h>
+#include <asm/smp.h>
+#include <asm/cpu.h>
+#include <asm/apic.h>
+#include <asm/cpuid.h>
+#include <asm/cmdline.h>
+#include <asm/iommu.h>
+
+/*
+ * The RMP entry format is not architectural. The format is defined in PPR
+ * Family 19h Model 01h, Rev B1 processor.
+ */
+struct rmpentry {
+	u64	assigned	: 1,
+		pagesize	: 1,
+		immutable	: 1,
+		rsvd1		: 9,
+		gpa		: 39,
+		asid		: 10,
+		vmsa		: 1,
+		validated	: 1,
+		rsvd2		: 1;
+	u64 rsvd3;
+} __packed;
+
+/*
+ * The first 16KB from the RMP_BASE is used by the processor for the
+ * bookkeeping, the range needs to be added during the RMP entry lookup.
+ */
+#define RMPTABLE_CPU_BOOKKEEPING_SZ	0x4000
+
+static u64 probed_rmp_base, probed_rmp_size;
+static struct rmpentry *rmptable __ro_after_init;
+static u64 rmptable_max_pfn __ro_after_init;
+
+#undef pr_fmt
+#define pr_fmt(fmt)	"SEV-SNP: " fmt
+
+static int __mfd_enable(unsigned int cpu)
+{
+	u64 val;
+
+	if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP))
+		return 0;
+
+	rdmsrl(MSR_AMD64_SYSCFG, val);
+
+	val |= MSR_AMD64_SYSCFG_MFDM;
+
+	wrmsrl(MSR_AMD64_SYSCFG, val);
+
+	return 0;
+}
+
+static __init void mfd_enable(void *arg)
+{
+	__mfd_enable(smp_processor_id());
+}
+
+static int __snp_enable(unsigned int cpu)
+{
+	u64 val;
+
+	if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP))
+		return 0;
+
+	rdmsrl(MSR_AMD64_SYSCFG, val);
+
+	val |= MSR_AMD64_SYSCFG_SNP_EN;
+	val |= MSR_AMD64_SYSCFG_SNP_VMPL_EN;
+
+	wrmsrl(MSR_AMD64_SYSCFG, val);
+
+	return 0;
+}
+
+static __init void snp_enable(void *arg)
+{
+	__snp_enable(smp_processor_id());
+}
+
+#define RMP_ADDR_MASK GENMASK_ULL(51, 13)
+
+bool snp_probe_rmptable_info(void)
+{
+	u64 max_rmp_pfn, calc_rmp_sz, rmp_sz, rmp_base, rmp_end;
+
+	rdmsrl(MSR_AMD64_RMP_BASE, rmp_base);
+	rdmsrl(MSR_AMD64_RMP_END, rmp_end);
+
+	if (!(rmp_base & RMP_ADDR_MASK) || !(rmp_end & RMP_ADDR_MASK)) {
+		pr_err("Memory for the RMP table has not been reserved by BIOS\n");
+		return false;
+	}
+
+	if (rmp_base > rmp_end) {
+		pr_err("RMP configuration not valid: base=%#llx, end=%#llx\n", rmp_base, rmp_end);
+		return false;
+	}
+
+	rmp_sz = rmp_end - rmp_base + 1;
+
+	/*
+	 * Calculate the amount the memory that must be reserved by the BIOS to
+	 * address the whole RAM, including the bookkeeping area. The RMP itself
+	 * must also be covered.
+	 */
+	max_rmp_pfn = max_pfn;
+	if (PHYS_PFN(rmp_end) > max_pfn)
+		max_rmp_pfn = PHYS_PFN(rmp_end);
+
+	calc_rmp_sz = (max_rmp_pfn << 4) + RMPTABLE_CPU_BOOKKEEPING_SZ;
+
+	if (calc_rmp_sz > rmp_sz) {
+		pr_err("Memory reserved for the RMP table does not cover full system RAM (expected 0x%llx got 0x%llx)\n",
+		       calc_rmp_sz, rmp_sz);
+		return false;
+	}
+
+	probed_rmp_base = rmp_base;
+	probed_rmp_size = rmp_sz;
+
+	pr_info("RMP table physical range [0x%016llx - 0x%016llx]\n",
+		probed_rmp_base, probed_rmp_base + probed_rmp_size - 1);
+
+	return true;
+}
+
+/*
+ * Do the necessary preparations which are verified by the firmware as
+ * described in the SNP_INIT_EX firmware command description in the SNP
+ * firmware ABI spec.
+ */
+static int __init snp_rmptable_init(void)
+{
+	void *rmptable_start;
+	u64 rmptable_size;
+	u64 val;
+
+	if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP))
+		return 0;
+
+	if (!amd_iommu_snp_en)
+		return 0;
+
+	if (!probed_rmp_size)
+		goto nosnp;
+
+	rmptable_start = memremap(probed_rmp_base, probed_rmp_size, MEMREMAP_WB);
+	if (!rmptable_start) {
+		pr_err("Failed to map RMP table\n");
+		return 1;
+	}
+
+	/*
+	 * Check if SEV-SNP is already enabled, this can happen in case of
+	 * kexec boot.
+	 */
+	rdmsrl(MSR_AMD64_SYSCFG, val);
+	if (val & MSR_AMD64_SYSCFG_SNP_EN)
+		goto skip_enable;
+
+	memset(rmptable_start, 0, probed_rmp_size);
+
+	/* Flush the caches to ensure that data is written before SNP is enabled. */
+	wbinvd_on_all_cpus();
+
+	/* MtrrFixDramModEn must be enabled on all the CPUs prior to enabling SNP. */
+	on_each_cpu(mfd_enable, NULL, 1);
+
+	on_each_cpu(snp_enable, NULL, 1);
+
+skip_enable:
+	rmptable_start += RMPTABLE_CPU_BOOKKEEPING_SZ;
+	rmptable_size = probed_rmp_size - RMPTABLE_CPU_BOOKKEEPING_SZ;
+
+	rmptable = (struct rmpentry *)rmptable_start;
+	rmptable_max_pfn = rmptable_size / sizeof(struct rmpentry) - 1;
+
+	cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "x86/rmptable_init:online", __snp_enable, NULL);
+
+	return 0;
+
+nosnp:
+	setup_clear_cpu_cap(X86_FEATURE_SEV_SNP);
+	return -ENOSYS;
+}
+
+/*
+ * This must be called after the IOMMU has been initialized.
+ */
+device_initcall(snp_rmptable_init);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v2 05/25] x86/mtrr: Don't print errors if MtrrFixDramModEn is set when SNP enabled
  2024-01-26  4:11 [PATCH v2 00/25] Add AMD Secure Nested Paging (SEV-SNP) Initialization Support Michael Roth
                   ` (3 preceding siblings ...)
  2024-01-26  4:11 ` [PATCH v2 04/25] x86/sev: Add the host SEV-SNP initialization support Michael Roth
@ 2024-01-26  4:11 ` Michael Roth
  2024-01-26  4:11 ` [PATCH v2 06/25] x86/sev: Add RMP entry lookup helpers Michael Roth
                   ` (20 subsequent siblings)
  25 siblings, 0 replies; 47+ messages in thread
From: Michael Roth @ 2024-01-26  4:11 UTC (permalink / raw)
  To: x86
  Cc: kvm, linux-coco, linux-mm, linux-crypto, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, tobin, bp, vbabka, kirill, ak,
	tony.luck, sathyanarayanan.kuppuswamy, alpergun, jarkko,
	ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick,
	Jeremi Piotrowski

From: Ashish Kalra <ashish.kalra@amd.com>

SNP enabled platforms require the MtrrFixDramModeEn bit to be set across
all CPUs when SNP is enabled. Therefore, don't print error messages when
MtrrFixDramModeEn is set when bringing CPUs online.

Reported-by: Jeremi Piotrowski <jpiotrowski@linux.microsoft.com>
Closes: https://lore.kernel.org/kvm/68b2d6bf-bce7-47f9-bebb-2652cc923ff9@linux.microsoft.com/
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 arch/x86/kernel/cpu/mtrr/generic.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c
index d3524778a545..422a4ddc2ab7 100644
--- a/arch/x86/kernel/cpu/mtrr/generic.c
+++ b/arch/x86/kernel/cpu/mtrr/generic.c
@@ -108,6 +108,9 @@ static inline void k8_check_syscfg_dram_mod_en(void)
 	      (boot_cpu_data.x86 >= 0x0f)))
 		return;
 
+	if (cpu_feature_enabled(X86_FEATURE_SEV_SNP))
+		return;
+
 	rdmsr(MSR_AMD64_SYSCFG, lo, hi);
 	if (lo & K8_MTRRFIXRANGE_DRAM_MODIFY) {
 		pr_err(FW_WARN "MTRR: CPU %u: SYSCFG[MtrrFixDramModEn]"
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v2 06/25] x86/sev: Add RMP entry lookup helpers
  2024-01-26  4:11 [PATCH v2 00/25] Add AMD Secure Nested Paging (SEV-SNP) Initialization Support Michael Roth
                   ` (4 preceding siblings ...)
  2024-01-26  4:11 ` [PATCH v2 05/25] x86/mtrr: Don't print errors if MtrrFixDramModEn is set when SNP enabled Michael Roth
@ 2024-01-26  4:11 ` Michael Roth
  2024-01-26  4:11 ` [PATCH v2 07/25] x86/fault: Add helper for dumping RMP entries Michael Roth
                   ` (19 subsequent siblings)
  25 siblings, 0 replies; 47+ messages in thread
From: Michael Roth @ 2024-01-26  4:11 UTC (permalink / raw)
  To: x86
  Cc: kvm, linux-coco, linux-mm, linux-crypto, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, tobin, bp, vbabka, kirill, ak,
	tony.luck, sathyanarayanan.kuppuswamy, alpergun, jarkko,
	ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick,
	Brijesh Singh

From: Brijesh Singh <brijesh.singh@amd.com>

Add a helper that can be used to access information contained in the RMP
entry corresponding to a particular PFN. This will be needed to make
decisions on how to handle setting up mappings in the NPT in response to
guest page-faults and handling things like cleaning up pages and setting
them back to the default hypervisor-owned state when they are no longer
being used for private data.

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Co-developed-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
[mdr: separate 'assigned' indicator from return code, and simplify
      function signatures for various helpers]
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 arch/x86/include/asm/sev.h |  3 +++
 arch/x86/virt/svm/sev.c    | 49 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 52 insertions(+)

diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 1f59d8ba9776..01ce61b283a3 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -90,6 +90,7 @@ extern bool handle_vc_boot_ghcb(struct pt_regs *regs);
 /* RMP page size */
 #define RMP_PG_SIZE_4K			0
 #define RMP_PG_SIZE_2M			1
+#define RMP_TO_PG_LEVEL(level)		(((level) == RMP_PG_SIZE_4K) ? PG_LEVEL_4K : PG_LEVEL_2M)
 
 #define RMPADJUST_VMSA_PAGE_BIT		BIT(16)
 
@@ -245,8 +246,10 @@ static inline u64 sev_get_status(void) { return 0; }
 
 #ifdef CONFIG_KVM_AMD_SEV
 bool snp_probe_rmptable_info(void);
+int snp_lookup_rmpentry(u64 pfn, bool *assigned, int *level);
 #else
 static inline bool snp_probe_rmptable_info(void) { return false; }
+static inline int snp_lookup_rmpentry(u64 pfn, bool *assigned, int *level) { return -ENODEV; }
 #endif
 
 #endif
diff --git a/arch/x86/virt/svm/sev.c b/arch/x86/virt/svm/sev.c
index 575a9ff046cb..7669b2ff0ec7 100644
--- a/arch/x86/virt/svm/sev.c
+++ b/arch/x86/virt/svm/sev.c
@@ -53,6 +53,9 @@ struct rmpentry {
  */
 #define RMPTABLE_CPU_BOOKKEEPING_SZ	0x4000
 
+/* Mask to apply to a PFN to get the first PFN of a 2MB page */
+#define PFN_PMD_MASK	GENMASK_ULL(63, PMD_SHIFT - PAGE_SHIFT)
+
 static u64 probed_rmp_base, probed_rmp_size;
 static struct rmpentry *rmptable __ro_after_init;
 static u64 rmptable_max_pfn __ro_after_init;
@@ -214,3 +217,49 @@ static int __init snp_rmptable_init(void)
  * This must be called after the IOMMU has been initialized.
  */
 device_initcall(snp_rmptable_init);
+
+static struct rmpentry *get_rmpentry(u64 pfn)
+{
+	if (WARN_ON_ONCE(pfn > rmptable_max_pfn))
+		return ERR_PTR(-EFAULT);
+
+	return &rmptable[pfn];
+}
+
+static struct rmpentry *__snp_lookup_rmpentry(u64 pfn, int *level)
+{
+	struct rmpentry *large_entry, *entry;
+
+	if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP))
+		return ERR_PTR(-ENODEV);
+
+	entry = get_rmpentry(pfn);
+	if (IS_ERR(entry))
+		return entry;
+
+	/*
+	 * Find the authoritative RMP entry for a PFN. This can be either a 4K
+	 * RMP entry or a special large RMP entry that is authoritative for a
+	 * whole 2M area.
+	 */
+	large_entry = get_rmpentry(pfn & PFN_PMD_MASK);
+	if (IS_ERR(large_entry))
+		return large_entry;
+
+	*level = RMP_TO_PG_LEVEL(large_entry->pagesize);
+
+	return entry;
+}
+
+int snp_lookup_rmpentry(u64 pfn, bool *assigned, int *level)
+{
+	struct rmpentry *e;
+
+	e = __snp_lookup_rmpentry(pfn, level);
+	if (IS_ERR(e))
+		return PTR_ERR(e);
+
+	*assigned = !!e->assigned;
+	return 0;
+}
+EXPORT_SYMBOL_GPL(snp_lookup_rmpentry);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v2 07/25] x86/fault: Add helper for dumping RMP entries
  2024-01-26  4:11 [PATCH v2 00/25] Add AMD Secure Nested Paging (SEV-SNP) Initialization Support Michael Roth
                   ` (5 preceding siblings ...)
  2024-01-26  4:11 ` [PATCH v2 06/25] x86/sev: Add RMP entry lookup helpers Michael Roth
@ 2024-01-26  4:11 ` Michael Roth
  2024-01-26  4:11 ` [PATCH v2 08/25] x86/traps: Define RMP violation #PF error code Michael Roth
                   ` (18 subsequent siblings)
  25 siblings, 0 replies; 47+ messages in thread
From: Michael Roth @ 2024-01-26  4:11 UTC (permalink / raw)
  To: x86
  Cc: kvm, linux-coco, linux-mm, linux-crypto, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, tobin, bp, vbabka, kirill, ak,
	tony.luck, sathyanarayanan.kuppuswamy, alpergun, jarkko,
	ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick,
	Brijesh Singh

From: Brijesh Singh <brijesh.singh@amd.com>

This information will be useful for debugging things like page faults
due to RMP access violations and RMPUPDATE failures.

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
[mdr: move helper to standalone patch, rework dump logic as suggested by
      Boris ]
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 arch/x86/include/asm/sev.h |  2 +
 arch/x86/virt/svm/sev.c    | 99 ++++++++++++++++++++++++++++++++++----
 2 files changed, 91 insertions(+), 10 deletions(-)

diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 01ce61b283a3..2c53e3de0b71 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -247,9 +247,11 @@ static inline u64 sev_get_status(void) { return 0; }
 #ifdef CONFIG_KVM_AMD_SEV
 bool snp_probe_rmptable_info(void);
 int snp_lookup_rmpentry(u64 pfn, bool *assigned, int *level);
+void snp_dump_hva_rmpentry(unsigned long address);
 #else
 static inline bool snp_probe_rmptable_info(void) { return false; }
 static inline int snp_lookup_rmpentry(u64 pfn, bool *assigned, int *level) { return -ENODEV; }
+static inline void snp_dump_hva_rmpentry(unsigned long address) {}
 #endif
 
 #endif
diff --git a/arch/x86/virt/svm/sev.c b/arch/x86/virt/svm/sev.c
index 7669b2ff0ec7..c74266e039b2 100644
--- a/arch/x86/virt/svm/sev.c
+++ b/arch/x86/virt/svm/sev.c
@@ -35,16 +35,21 @@
  * Family 19h Model 01h, Rev B1 processor.
  */
 struct rmpentry {
-	u64	assigned	: 1,
-		pagesize	: 1,
-		immutable	: 1,
-		rsvd1		: 9,
-		gpa		: 39,
-		asid		: 10,
-		vmsa		: 1,
-		validated	: 1,
-		rsvd2		: 1;
-	u64 rsvd3;
+	union {
+		struct {
+			u64 assigned	: 1,
+			    pagesize	: 1,
+			    immutable	: 1,
+			    rsvd1	: 9,
+			    gpa		: 39,
+			    asid	: 10,
+			    vmsa	: 1,
+			    validated	: 1,
+			    rsvd2	: 1;
+		};
+		u64 lo;
+	};
+	u64 hi;
 } __packed;
 
 /*
@@ -263,3 +268,77 @@ int snp_lookup_rmpentry(u64 pfn, bool *assigned, int *level)
 	return 0;
 }
 EXPORT_SYMBOL_GPL(snp_lookup_rmpentry);
+
+/*
+ * Dump the raw RMP entry for a particular PFN. These bits are documented in the
+ * PPR for a particular CPU model and provide useful information about how a
+ * particular PFN is being utilized by the kernel/firmware at the time certain
+ * unexpected events occur, such as RMP faults.
+ */
+static void dump_rmpentry(u64 pfn)
+{
+	u64 pfn_i, pfn_end;
+	struct rmpentry *e;
+	int level;
+
+	e = __snp_lookup_rmpentry(pfn, &level);
+	if (IS_ERR(e)) {
+		pr_err("Failed to read RMP entry for PFN 0x%llx, error %ld\n",
+		       pfn, PTR_ERR(e));
+		return;
+	}
+
+	if (e->assigned) {
+		pr_info("PFN 0x%llx, RMP entry: [0x%016llx - 0x%016llx]\n",
+			pfn, e->lo, e->hi);
+		return;
+	}
+
+	/*
+	 * If the RMP entry for a particular PFN is not in an assigned state,
+	 * then it is sometimes useful to get an idea of whether or not any RMP
+	 * entries for other PFNs within the same 2MB region are assigned, since
+	 * those too can affect the ability to access a particular PFN in
+	 * certain situations, such as when the PFN is being accessed via a 2MB
+	 * mapping in the host page table.
+	 */
+	pfn_i = ALIGN_DOWN(pfn, PTRS_PER_PMD);
+	pfn_end = pfn_i + PTRS_PER_PMD;
+
+	pr_info("PFN 0x%llx unassigned, dumping non-zero entries in 2M PFN region: [0x%llx - 0x%llx]\n",
+		pfn, pfn_i, pfn_end);
+
+	while (pfn_i < pfn_end) {
+		e = __snp_lookup_rmpentry(pfn_i, &level);
+		if (IS_ERR(e)) {
+			pr_err("Error %ld reading RMP entry for PFN 0x%llx\n",
+			       PTR_ERR(e), pfn_i);
+			pfn_i++;
+			continue;
+		}
+
+		if (e->lo || e->hi)
+			pr_info("PFN: 0x%llx, [0x%016llx - 0x%016llx]\n", pfn_i, e->lo, e->hi);
+		pfn_i++;
+	}
+}
+
+void snp_dump_hva_rmpentry(unsigned long hva)
+{
+	unsigned long paddr;
+	unsigned int level;
+	pgd_t *pgd;
+	pte_t *pte;
+
+	pgd = __va(read_cr3_pa());
+	pgd += pgd_index(hva);
+	pte = lookup_address_in_pgd(pgd, hva, &level);
+
+	if (!pte) {
+		pr_err("Can't dump RMP entry for HVA %lx: no PTE/PFN found\n", hva);
+		return;
+	}
+
+	paddr = PFN_PHYS(pte_pfn(*pte)) | (hva & ~page_level_mask(level));
+	dump_rmpentry(PHYS_PFN(paddr));
+}
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v2 08/25] x86/traps: Define RMP violation #PF error code
  2024-01-26  4:11 [PATCH v2 00/25] Add AMD Secure Nested Paging (SEV-SNP) Initialization Support Michael Roth
                   ` (6 preceding siblings ...)
  2024-01-26  4:11 ` [PATCH v2 07/25] x86/fault: Add helper for dumping RMP entries Michael Roth
@ 2024-01-26  4:11 ` Michael Roth
  2024-01-26  4:11 ` [PATCH v2 09/25] x86/fault: Dump RMP table information when RMP page faults occur Michael Roth
                   ` (17 subsequent siblings)
  25 siblings, 0 replies; 47+ messages in thread
From: Michael Roth @ 2024-01-26  4:11 UTC (permalink / raw)
  To: x86
  Cc: kvm, linux-coco, linux-mm, linux-crypto, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, tobin, bp, vbabka, kirill, ak,
	tony.luck, sathyanarayanan.kuppuswamy, alpergun, jarkko,
	ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick,
	Brijesh Singh, Dave Hansen

From: Brijesh Singh <brijesh.singh@amd.com>

Bit 31 in the page fault-error bit will be set when processor encounters
an RMP violation.

While at it, use the BIT() macro.

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off by: Ashish Kalra <ashish.kalra@amd.com>
Acked-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 arch/x86/include/asm/trap_pf.h | 20 ++++++++++++--------
 arch/x86/mm/fault.c            |  1 +
 2 files changed, 13 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/trap_pf.h b/arch/x86/include/asm/trap_pf.h
index afa524325e55..7fc4db774dd6 100644
--- a/arch/x86/include/asm/trap_pf.h
+++ b/arch/x86/include/asm/trap_pf.h
@@ -2,6 +2,8 @@
 #ifndef _ASM_X86_TRAP_PF_H
 #define _ASM_X86_TRAP_PF_H
 
+#include <linux/bits.h>  /* BIT() macro */
+
 /*
  * Page fault error code bits:
  *
@@ -13,16 +15,18 @@
  *   bit 5 ==				1: protection keys block access
  *   bit 6 ==				1: shadow stack access fault
  *   bit 15 ==				1: SGX MMU page-fault
+ *   bit 31 ==				1: fault was due to RMP violation
  */
 enum x86_pf_error_code {
-	X86_PF_PROT	=		1 << 0,
-	X86_PF_WRITE	=		1 << 1,
-	X86_PF_USER	=		1 << 2,
-	X86_PF_RSVD	=		1 << 3,
-	X86_PF_INSTR	=		1 << 4,
-	X86_PF_PK	=		1 << 5,
-	X86_PF_SHSTK	=		1 << 6,
-	X86_PF_SGX	=		1 << 15,
+	X86_PF_PROT	=		BIT(0),
+	X86_PF_WRITE	=		BIT(1),
+	X86_PF_USER	=		BIT(2),
+	X86_PF_RSVD	=		BIT(3),
+	X86_PF_INSTR	=		BIT(4),
+	X86_PF_PK	=		BIT(5),
+	X86_PF_SHSTK	=		BIT(6),
+	X86_PF_SGX	=		BIT(15),
+	X86_PF_RMP	=		BIT(31),
 };
 
 #endif /* _ASM_X86_TRAP_PF_H */
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index 679b09cfe241..8805e2e20df6 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -547,6 +547,7 @@ show_fault_oops(struct pt_regs *regs, unsigned long error_code, unsigned long ad
 		 !(error_code & X86_PF_PROT) ? "not-present page" :
 		 (error_code & X86_PF_RSVD)  ? "reserved bit violation" :
 		 (error_code & X86_PF_PK)    ? "protection keys violation" :
+		 (error_code & X86_PF_RMP)   ? "RMP violation" :
 					       "permissions violation");
 
 	if (!(error_code & X86_PF_USER) && user_mode(regs)) {
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v2 09/25] x86/fault: Dump RMP table information when RMP page faults occur
  2024-01-26  4:11 [PATCH v2 00/25] Add AMD Secure Nested Paging (SEV-SNP) Initialization Support Michael Roth
                   ` (7 preceding siblings ...)
  2024-01-26  4:11 ` [PATCH v2 08/25] x86/traps: Define RMP violation #PF error code Michael Roth
@ 2024-01-26  4:11 ` Michael Roth
  2024-01-26  4:11 ` [PATCH v2 10/25] x86/sev: Add helper functions for RMPUPDATE and PSMASH instruction Michael Roth
                   ` (16 subsequent siblings)
  25 siblings, 0 replies; 47+ messages in thread
From: Michael Roth @ 2024-01-26  4:11 UTC (permalink / raw)
  To: x86
  Cc: kvm, linux-coco, linux-mm, linux-crypto, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, tobin, bp, vbabka, kirill, ak,
	tony.luck, sathyanarayanan.kuppuswamy, alpergun, jarkko,
	ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick

RMP faults on kernel addresses are fatal and should never happen in
practice. They indicate a bug in the host kernel somewhere. Userspace
RMP faults shouldn't occur either, since even for VMs the memory used
for private pages is handled by guest_memfd and by design is not
mappable by userspace.

Dump RMP table information about the PFN corresponding to the faulting
HVA to help diagnose any issues of this sort when show_fault_oops() is
triggered by an RMP fault.

Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 arch/x86/mm/fault.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index 8805e2e20df6..859adcd123c9 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -34,6 +34,7 @@
 #include <asm/kvm_para.h>		/* kvm_handle_async_pf		*/
 #include <asm/vdso.h>			/* fixup_vdso_exception()	*/
 #include <asm/irq_stack.h>
+#include <asm/sev.h>			/* snp_dump_hva_rmpentry()	*/
 
 #define CREATE_TRACE_POINTS
 #include <asm/trace/exceptions.h>
@@ -580,6 +581,9 @@ show_fault_oops(struct pt_regs *regs, unsigned long error_code, unsigned long ad
 	}
 
 	dump_pagetable(address);
+
+	if (error_code & X86_PF_RMP)
+		snp_dump_hva_rmpentry(address);
 }
 
 static noinline void
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v2 10/25] x86/sev: Add helper functions for RMPUPDATE and PSMASH instruction
  2024-01-26  4:11 [PATCH v2 00/25] Add AMD Secure Nested Paging (SEV-SNP) Initialization Support Michael Roth
                   ` (8 preceding siblings ...)
  2024-01-26  4:11 ` [PATCH v2 09/25] x86/fault: Dump RMP table information when RMP page faults occur Michael Roth
@ 2024-01-26  4:11 ` Michael Roth
  2024-01-29 18:00   ` Liam Merwick
  2024-01-26  4:11 ` [PATCH v2 11/25] x86/sev: Adjust directmap to avoid inadvertant RMP faults Michael Roth
                   ` (15 subsequent siblings)
  25 siblings, 1 reply; 47+ messages in thread
From: Michael Roth @ 2024-01-26  4:11 UTC (permalink / raw)
  To: x86
  Cc: kvm, linux-coco, linux-mm, linux-crypto, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, tobin, bp, vbabka, kirill, ak,
	tony.luck, sathyanarayanan.kuppuswamy, alpergun, jarkko,
	ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick,
	Brijesh Singh

From: Brijesh Singh <brijesh.singh@amd.com>

The RMPUPDATE instruction updates the access restrictions for a page via
its corresponding entry in the RMP Table. The hypervisor will use the
instruction to enforce various access restrictions on pages used for
confidential guests and other specialized functionality. See APM3 for
details on the instruction operations.

The PSMASH instruction expands a 2MB RMP entry in the RMP table into a
corresponding set of contiguous 4KB RMP entries while retaining the
state of the validated bit from the original 2MB RMP entry. The
hypervisor will use this instruction in cases where it needs to re-map a
page as 4K rather than 2MB in a guest's nested page table.

Add helpers to make use of these instructions.

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
[mdr: add RMPUPDATE retry logic for transient FAIL_OVERLAP errors]
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 arch/x86/include/asm/sev.h | 23 ++++++++++
 arch/x86/virt/svm/sev.c    | 92 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 115 insertions(+)

diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 2c53e3de0b71..d3ccb7a0c7e9 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -87,10 +87,23 @@ extern bool handle_vc_boot_ghcb(struct pt_regs *regs);
 /* Software defined (when rFlags.CF = 1) */
 #define PVALIDATE_FAIL_NOUPDATE		255
 
+/* RMUPDATE detected 4K page and 2MB page overlap. */
+#define RMPUPDATE_FAIL_OVERLAP		4
+
 /* RMP page size */
 #define RMP_PG_SIZE_4K			0
 #define RMP_PG_SIZE_2M			1
 #define RMP_TO_PG_LEVEL(level)		(((level) == RMP_PG_SIZE_4K) ? PG_LEVEL_4K : PG_LEVEL_2M)
+#define PG_LEVEL_TO_RMP(level)		(((level) == PG_LEVEL_4K) ? RMP_PG_SIZE_4K : RMP_PG_SIZE_2M)
+
+struct rmp_state {
+	u64 gpa;
+	u8 assigned;
+	u8 pagesize;
+	u8 immutable;
+	u8 rsvd;
+	u32 asid;
+} __packed;
 
 #define RMPADJUST_VMSA_PAGE_BIT		BIT(16)
 
@@ -248,10 +261,20 @@ static inline u64 sev_get_status(void) { return 0; }
 bool snp_probe_rmptable_info(void);
 int snp_lookup_rmpentry(u64 pfn, bool *assigned, int *level);
 void snp_dump_hva_rmpentry(unsigned long address);
+int psmash(u64 pfn);
+int rmp_make_private(u64 pfn, u64 gpa, enum pg_level level, int asid, bool immutable);
+int rmp_make_shared(u64 pfn, enum pg_level level);
 #else
 static inline bool snp_probe_rmptable_info(void) { return false; }
 static inline int snp_lookup_rmpentry(u64 pfn, bool *assigned, int *level) { return -ENODEV; }
 static inline void snp_dump_hva_rmpentry(unsigned long address) {}
+static inline int psmash(u64 pfn) { return -ENODEV; }
+static inline int rmp_make_private(u64 pfn, u64 gpa, enum pg_level level, int asid,
+				   bool immutable)
+{
+	return -ENODEV;
+}
+static inline int rmp_make_shared(u64 pfn, enum pg_level level) { return -ENODEV; }
 #endif
 
 #endif
diff --git a/arch/x86/virt/svm/sev.c b/arch/x86/virt/svm/sev.c
index c74266e039b2..16b3d8139649 100644
--- a/arch/x86/virt/svm/sev.c
+++ b/arch/x86/virt/svm/sev.c
@@ -342,3 +342,95 @@ void snp_dump_hva_rmpentry(unsigned long hva)
 	paddr = PFN_PHYS(pte_pfn(*pte)) | (hva & ~page_level_mask(level));
 	dump_rmpentry(PHYS_PFN(paddr));
 }
+
+/*
+ * PSMASH a 2MB aligned page into 4K pages in the RMP table while preserving the
+ * Validated bit.
+ */
+int psmash(u64 pfn)
+{
+	unsigned long paddr = pfn << PAGE_SHIFT;
+	int ret;
+
+	if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP))
+		return -ENODEV;
+
+	if (!pfn_valid(pfn))
+		return -EINVAL;
+
+	/* Binutils version 2.36 supports the PSMASH mnemonic. */
+	asm volatile(".byte 0xF3, 0x0F, 0x01, 0xFF"
+		      : "=a" (ret)
+		      : "a" (paddr)
+		      : "memory", "cc");
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(psmash);
+
+/*
+ * It is expected that those operations are seldom enough so that no mutual
+ * exclusion of updaters is needed and thus the overlap error condition below
+ * should happen very seldomly and would get resolved relatively quickly by
+ * the firmware.
+ *
+ * If not, one could consider introducing a mutex or so here to sync concurrent
+ * RMP updates and thus diminish the amount of cases where firmware needs to
+ * lock 2M ranges to protect against concurrent updates.
+ *
+ * The optimal solution would be range locking to avoid locking disjoint
+ * regions unnecessarily but there's no support for that yet.
+ */
+static int rmpupdate(u64 pfn, struct rmp_state *state)
+{
+	unsigned long paddr = pfn << PAGE_SHIFT;
+	int ret;
+
+	if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP))
+		return -ENODEV;
+
+	do {
+		/* Binutils version 2.36 supports the RMPUPDATE mnemonic. */
+		asm volatile(".byte 0xF2, 0x0F, 0x01, 0xFE"
+			     : "=a" (ret)
+			     : "a" (paddr), "c" ((unsigned long)state)
+			     : "memory", "cc");
+	} while (ret == RMPUPDATE_FAIL_OVERLAP);
+
+	if (ret) {
+		pr_err("RMPUPDATE failed for PFN %llx, ret: %d\n", pfn, ret);
+		dump_rmpentry(pfn);
+		dump_stack();
+		return -EFAULT;
+	}
+
+	return 0;
+}
+
+/* Transition a page to guest-owned/private state in the RMP table. */
+int rmp_make_private(u64 pfn, u64 gpa, enum pg_level level, int asid, bool immutable)
+{
+	struct rmp_state state;
+
+	memset(&state, 0, sizeof(state));
+	state.assigned = 1;
+	state.asid = asid;
+	state.immutable = immutable;
+	state.gpa = gpa;
+	state.pagesize = PG_LEVEL_TO_RMP(level);
+
+	return rmpupdate(pfn, &state);
+}
+EXPORT_SYMBOL_GPL(rmp_make_private);
+
+/* Transition a page to hypervisor-owned/shared state in the RMP table. */
+int rmp_make_shared(u64 pfn, enum pg_level level)
+{
+	struct rmp_state state;
+
+	memset(&state, 0, sizeof(state));
+	state.pagesize = PG_LEVEL_TO_RMP(level);
+
+	return rmpupdate(pfn, &state);
+}
+EXPORT_SYMBOL_GPL(rmp_make_shared);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v2 11/25] x86/sev: Adjust directmap to avoid inadvertant RMP faults
  2024-01-26  4:11 [PATCH v2 00/25] Add AMD Secure Nested Paging (SEV-SNP) Initialization Support Michael Roth
                   ` (9 preceding siblings ...)
  2024-01-26  4:11 ` [PATCH v2 10/25] x86/sev: Add helper functions for RMPUPDATE and PSMASH instruction Michael Roth
@ 2024-01-26  4:11 ` Michael Roth
  2024-01-26 15:34   ` Borislav Petkov
  2024-01-26  4:11 ` [PATCH v2 12/25] crypto: ccp: Define the SEV-SNP commands Michael Roth
                   ` (14 subsequent siblings)
  25 siblings, 1 reply; 47+ messages in thread
From: Michael Roth @ 2024-01-26  4:11 UTC (permalink / raw)
  To: x86
  Cc: kvm, linux-coco, linux-mm, linux-crypto, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, tobin, bp, vbabka, kirill, ak,
	tony.luck, sathyanarayanan.kuppuswamy, alpergun, jarkko,
	ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick

If the kernel uses a 2MB or larger directmap mapping to write to an
address, and that mapping contains any 4KB pages that are set to private
in the RMP table, an RMP #PF will trigger and cause a host crash.
SNP-aware code that owns the private PFNs will never attempt such a
write, but other kernel tasks writing to other PFNs in the range may
trigger these checks inadvertantly due to writing to those other PFNs
via a large directmap mapping that happens to also map a private PFN.

Prevent this by splitting any 2MB+ mappings that might end up containing
a mix of private/shared PFNs as a result of a subsequent RMPUPDATE for
the PFN/rmp_level passed in.

Another way to handle this would be to limit the directmap to 4K
mappings in the case of hosts that support SNP, but there is potential
risk for performance regressions of certain host workloads. Handling it
as-needed results in the directmap being slowly split over time, which
lessens the risk of a performance regression since the more the
directmap gets split as a result of running SNP guests, the more likely
the host is being used primarily to run SNP guests, where a mostly-split
directmap is actually beneficial since there is less chance of TLB
flushing and cpa_lock contention being needed to perform these splits.

Cases where a host knows in advance it wants to primarily run SNP guests
and wishes to pre-split the directmap can be handled by adding a
tuneable in the future, but preliminary testing has shown this to not
provide a signficant benefit in the common case of guests that are
backed primarily by 2MB THPs, so it does not seem to be warranted
currently and can be added later if a need arises in the future.

Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 arch/x86/virt/svm/sev.c | 75 +++++++++++++++++++++++++++++++++++++++--
 1 file changed, 73 insertions(+), 2 deletions(-)

diff --git a/arch/x86/virt/svm/sev.c b/arch/x86/virt/svm/sev.c
index 16b3d8139649..1a13eff78c9d 100644
--- a/arch/x86/virt/svm/sev.c
+++ b/arch/x86/virt/svm/sev.c
@@ -368,6 +368,71 @@ int psmash(u64 pfn)
 }
 EXPORT_SYMBOL_GPL(psmash);
 
+/*
+ * If the kernel uses a 2MB or larger directmap mapping to write to an address,
+ * and that mapping contains any 4KB pages that are set to private in the RMP
+ * table, an RMP #PF will trigger and cause a host crash. Hypervisor code that
+ * owns the PFNs being transitioned will never attempt such a write, but other
+ * kernel tasks writing to other PFNs in the range may trigger these checks
+ * inadvertantly due a large directmap mapping that happens to overlap such a
+ * PFN.
+ *
+ * Prevent this by splitting any 2MB+ mappings that might end up containing a
+ * mix of private/shared PFNs as a result of a subsequent RMPUPDATE for the
+ * PFN/rmp_level passed in.
+ *
+ * Note that there is no attempt here to scan all the RMP entries for the 2MB
+ * physical range, since it would only be worthwhile in determining if a
+ * subsequent RMPUPDATE for a 4KB PFN would result in all the entries being of
+ * the same shared/private state, thus avoiding the need to split the mapping.
+ * But that would mean the entries are currently in a mixed state, and so the
+ * mapping would have already been split as a result of prior transitions.
+ * And since the 4K split is only done if the mapping is 2MB+, and there isn't
+ * currently a mechanism in place to restore 2MB+ mappings, such a check would
+ * not provide any usable benefit.
+ *
+ * More specifics on how these checks are carried out can be found in APM
+ * Volume 2, "RMP and VMPL Access Checks".
+ */
+static int adjust_direct_map(u64 pfn, int rmp_level)
+{
+	unsigned long vaddr = (unsigned long)pfn_to_kaddr(pfn);
+	unsigned int level;
+	int npages, ret;
+	pte_t *pte;
+
+	/* Only 4KB/2MB RMP entries are supported by current hardware. */
+	if (WARN_ON_ONCE(rmp_level > PG_LEVEL_2M))
+		return -EINVAL;
+
+	if (WARN_ON_ONCE(rmp_level == PG_LEVEL_2M && !IS_ALIGNED(pfn, PTRS_PER_PMD)))
+		return -EINVAL;
+
+	/*
+	 * If an entire 2MB physical range is being transitioned, then there is
+	 * no risk of RMP #PFs due to write accesses from overlapping mappings,
+	 * since even accesses from 1GB mappings will be treated as 2MB accesses
+	 * as far as RMP table checks are concerned.
+	 */
+	if (rmp_level == PG_LEVEL_2M)
+		return 0;
+
+	pte = lookup_address(vaddr, &level);
+	if (!pte || pte_none(*pte))
+		return 0;
+
+	if (level == PG_LEVEL_4K)
+		return 0;
+
+	npages = page_level_size(rmp_level) / PAGE_SIZE;
+	ret = set_memory_4k(vaddr, npages);
+	if (ret)
+		pr_warn("Failed to split direct map for PFN 0x%llx, ret: %d\n",
+			pfn, ret);
+
+	return ret;
+}
+
 /*
  * It is expected that those operations are seldom enough so that no mutual
  * exclusion of updaters is needed and thus the overlap error condition below
@@ -384,11 +449,16 @@ EXPORT_SYMBOL_GPL(psmash);
 static int rmpupdate(u64 pfn, struct rmp_state *state)
 {
 	unsigned long paddr = pfn << PAGE_SHIFT;
-	int ret;
+	int ret, level;
 
 	if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP))
 		return -ENODEV;
 
+	level = RMP_TO_PG_LEVEL(state->pagesize);
+
+	if (adjust_direct_map(pfn, level))
+		return -EFAULT;
+
 	do {
 		/* Binutils version 2.36 supports the RMPUPDATE mnemonic. */
 		asm volatile(".byte 0xF2, 0x0F, 0x01, 0xFE"
@@ -398,7 +468,8 @@ static int rmpupdate(u64 pfn, struct rmp_state *state)
 	} while (ret == RMPUPDATE_FAIL_OVERLAP);
 
 	if (ret) {
-		pr_err("RMPUPDATE failed for PFN %llx, ret: %d\n", pfn, ret);
+		pr_err("RMPUPDATE failed for PFN %llx, pg_level: %d, ret: %d\n",
+		       pfn, level, ret);
 		dump_rmpentry(pfn);
 		dump_stack();
 		return -EFAULT;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v2 12/25] crypto: ccp: Define the SEV-SNP commands
  2024-01-26  4:11 [PATCH v2 00/25] Add AMD Secure Nested Paging (SEV-SNP) Initialization Support Michael Roth
                   ` (10 preceding siblings ...)
  2024-01-26  4:11 ` [PATCH v2 11/25] x86/sev: Adjust directmap to avoid inadvertant RMP faults Michael Roth
@ 2024-01-26  4:11 ` Michael Roth
  2024-01-26  4:11 ` [PATCH v2 13/25] crypto: ccp: Add support to initialize the AMD-SP for SEV-SNP Michael Roth
                   ` (13 subsequent siblings)
  25 siblings, 0 replies; 47+ messages in thread
From: Michael Roth @ 2024-01-26  4:11 UTC (permalink / raw)
  To: x86
  Cc: kvm, linux-coco, linux-mm, linux-crypto, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, tobin, bp, vbabka, kirill, ak,
	tony.luck, sathyanarayanan.kuppuswamy, alpergun, jarkko,
	ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick,
	Brijesh Singh

From: Brijesh Singh <brijesh.singh@amd.com>

AMD introduced the next generation of SEV called SEV-SNP (Secure Nested
Paging). SEV-SNP builds upon existing SEV and SEV-ES functionality while
adding new hardware security protection.

Define the commands and structures used to communicate with the AMD-SP
when creating and managing the SEV-SNP guests. The SEV-SNP firmware spec
is available at developer.amd.com/sev.

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Co-developed-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
[mdr: update SNP command list and SNP status struct based on current
      spec, use C99 flexible arrays, fix kernel-doc issues]
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 drivers/crypto/ccp/sev-dev.c |  16 +++
 include/linux/psp-sev.h      | 265 +++++++++++++++++++++++++++++++++++
 include/uapi/linux/psp-sev.h |  56 ++++++++
 3 files changed, 337 insertions(+)

diff --git a/drivers/crypto/ccp/sev-dev.c b/drivers/crypto/ccp/sev-dev.c
index e4d3f45242f6..e38986d39b63 100644
--- a/drivers/crypto/ccp/sev-dev.c
+++ b/drivers/crypto/ccp/sev-dev.c
@@ -130,6 +130,8 @@ static int sev_cmd_buffer_len(int cmd)
 	switch (cmd) {
 	case SEV_CMD_INIT:			return sizeof(struct sev_data_init);
 	case SEV_CMD_INIT_EX:                   return sizeof(struct sev_data_init_ex);
+	case SEV_CMD_SNP_SHUTDOWN_EX:		return sizeof(struct sev_data_snp_shutdown_ex);
+	case SEV_CMD_SNP_INIT_EX:		return sizeof(struct sev_data_snp_init_ex);
 	case SEV_CMD_PLATFORM_STATUS:		return sizeof(struct sev_user_data_status);
 	case SEV_CMD_PEK_CSR:			return sizeof(struct sev_data_pek_csr);
 	case SEV_CMD_PEK_CERT_IMPORT:		return sizeof(struct sev_data_pek_cert_import);
@@ -158,6 +160,20 @@ static int sev_cmd_buffer_len(int cmd)
 	case SEV_CMD_GET_ID:			return sizeof(struct sev_data_get_id);
 	case SEV_CMD_ATTESTATION_REPORT:	return sizeof(struct sev_data_attestation_report);
 	case SEV_CMD_SEND_CANCEL:		return sizeof(struct sev_data_send_cancel);
+	case SEV_CMD_SNP_GCTX_CREATE:		return sizeof(struct sev_data_snp_addr);
+	case SEV_CMD_SNP_LAUNCH_START:		return sizeof(struct sev_data_snp_launch_start);
+	case SEV_CMD_SNP_LAUNCH_UPDATE:		return sizeof(struct sev_data_snp_launch_update);
+	case SEV_CMD_SNP_ACTIVATE:		return sizeof(struct sev_data_snp_activate);
+	case SEV_CMD_SNP_DECOMMISSION:		return sizeof(struct sev_data_snp_addr);
+	case SEV_CMD_SNP_PAGE_RECLAIM:		return sizeof(struct sev_data_snp_page_reclaim);
+	case SEV_CMD_SNP_GUEST_STATUS:		return sizeof(struct sev_data_snp_guest_status);
+	case SEV_CMD_SNP_LAUNCH_FINISH:		return sizeof(struct sev_data_snp_launch_finish);
+	case SEV_CMD_SNP_DBG_DECRYPT:		return sizeof(struct sev_data_snp_dbg);
+	case SEV_CMD_SNP_DBG_ENCRYPT:		return sizeof(struct sev_data_snp_dbg);
+	case SEV_CMD_SNP_PAGE_UNSMASH:		return sizeof(struct sev_data_snp_page_unsmash);
+	case SEV_CMD_SNP_PLATFORM_STATUS:	return sizeof(struct sev_data_snp_addr);
+	case SEV_CMD_SNP_GUEST_REQUEST:		return sizeof(struct sev_data_snp_guest_request);
+	case SEV_CMD_SNP_CONFIG:		return sizeof(struct sev_user_data_snp_config);
 	default:				return 0;
 	}
 
diff --git a/include/linux/psp-sev.h b/include/linux/psp-sev.h
index 7fd17e82bab4..006e4cdbeb78 100644
--- a/include/linux/psp-sev.h
+++ b/include/linux/psp-sev.h
@@ -78,6 +78,36 @@ enum sev_cmd {
 	SEV_CMD_DBG_DECRYPT		= 0x060,
 	SEV_CMD_DBG_ENCRYPT		= 0x061,
 
+	/* SNP specific commands */
+	SEV_CMD_SNP_INIT		= 0x081,
+	SEV_CMD_SNP_SHUTDOWN		= 0x082,
+	SEV_CMD_SNP_PLATFORM_STATUS	= 0x083,
+	SEV_CMD_SNP_DF_FLUSH		= 0x084,
+	SEV_CMD_SNP_INIT_EX		= 0x085,
+	SEV_CMD_SNP_SHUTDOWN_EX		= 0x086,
+	SEV_CMD_SNP_DECOMMISSION	= 0x090,
+	SEV_CMD_SNP_ACTIVATE		= 0x091,
+	SEV_CMD_SNP_GUEST_STATUS	= 0x092,
+	SEV_CMD_SNP_GCTX_CREATE		= 0x093,
+	SEV_CMD_SNP_GUEST_REQUEST	= 0x094,
+	SEV_CMD_SNP_ACTIVATE_EX		= 0x095,
+	SEV_CMD_SNP_LAUNCH_START	= 0x0A0,
+	SEV_CMD_SNP_LAUNCH_UPDATE	= 0x0A1,
+	SEV_CMD_SNP_LAUNCH_FINISH	= 0x0A2,
+	SEV_CMD_SNP_DBG_DECRYPT		= 0x0B0,
+	SEV_CMD_SNP_DBG_ENCRYPT		= 0x0B1,
+	SEV_CMD_SNP_PAGE_SWAP_OUT	= 0x0C0,
+	SEV_CMD_SNP_PAGE_SWAP_IN	= 0x0C1,
+	SEV_CMD_SNP_PAGE_MOVE		= 0x0C2,
+	SEV_CMD_SNP_PAGE_MD_INIT	= 0x0C3,
+	SEV_CMD_SNP_PAGE_SET_STATE	= 0x0C6,
+	SEV_CMD_SNP_PAGE_RECLAIM	= 0x0C7,
+	SEV_CMD_SNP_PAGE_UNSMASH	= 0x0C8,
+	SEV_CMD_SNP_CONFIG		= 0x0C9,
+	SEV_CMD_SNP_DOWNLOAD_FIRMWARE_EX = 0x0CA,
+	SEV_CMD_SNP_COMMIT		= 0x0CB,
+	SEV_CMD_SNP_VLEK_LOAD		= 0x0CD,
+
 	SEV_CMD_MAX,
 };
 
@@ -523,6 +553,241 @@ struct sev_data_attestation_report {
 	u32 len;				/* In/Out */
 } __packed;
 
+/**
+ * struct sev_data_snp_download_firmware - SNP_DOWNLOAD_FIRMWARE command params
+ *
+ * @address: physical address of firmware image
+ * @len: length of the firmware image
+ */
+struct sev_data_snp_download_firmware {
+	u64 address;				/* In */
+	u32 len;				/* In */
+} __packed;
+
+/**
+ * struct sev_data_snp_activate - SNP_ACTIVATE command params
+ *
+ * @gctx_paddr: system physical address guest context page
+ * @asid: ASID to bind to the guest
+ */
+struct sev_data_snp_activate {
+	u64 gctx_paddr;				/* In */
+	u32 asid;				/* In */
+} __packed;
+
+/**
+ * struct sev_data_snp_addr - generic SNP command params
+ *
+ * @address: physical address of generic data param
+ */
+struct sev_data_snp_addr {
+	u64 address;				/* In/Out */
+} __packed;
+
+/**
+ * struct sev_data_snp_launch_start - SNP_LAUNCH_START command params
+ *
+ * @gctx_paddr: system physical address of guest context page
+ * @policy: guest policy
+ * @ma_gctx_paddr: system physical address of migration agent
+ * @ma_en: the guest is associated with a migration agent
+ * @imi_en: launch flow is launching an IMI (Incoming Migration Image) for the
+ *          purpose of guest-assisted migration.
+ * @rsvd: reserved
+ * @gosvw: guest OS-visible workarounds, as defined by hypervisor
+ */
+struct sev_data_snp_launch_start {
+	u64 gctx_paddr;				/* In */
+	u64 policy;				/* In */
+	u64 ma_gctx_paddr;			/* In */
+	u32 ma_en:1;				/* In */
+	u32 imi_en:1;				/* In */
+	u32 rsvd:30;
+	u8 gosvw[16];				/* In */
+} __packed;
+
+/* SNP support page type */
+enum {
+	SNP_PAGE_TYPE_NORMAL		= 0x1,
+	SNP_PAGE_TYPE_VMSA		= 0x2,
+	SNP_PAGE_TYPE_ZERO		= 0x3,
+	SNP_PAGE_TYPE_UNMEASURED	= 0x4,
+	SNP_PAGE_TYPE_SECRET		= 0x5,
+	SNP_PAGE_TYPE_CPUID		= 0x6,
+
+	SNP_PAGE_TYPE_MAX
+};
+
+/**
+ * struct sev_data_snp_launch_update - SNP_LAUNCH_UPDATE command params
+ *
+ * @gctx_paddr: system physical address of guest context page
+ * @page_size: page size 0 indicates 4K and 1 indicates 2MB page
+ * @page_type: encoded page type
+ * @imi_page: indicates that this page is part of the IMI (Incoming Migration
+ *            Image) of the guest
+ * @rsvd: reserved
+ * @rsvd2: reserved
+ * @address: system physical address of destination page to encrypt
+ * @rsvd3: reserved
+ * @vmpl1_perms: VMPL permission mask for VMPL1
+ * @vmpl2_perms: VMPL permission mask for VMPL2
+ * @vmpl3_perms: VMPL permission mask for VMPL3
+ * @rsvd4: reserved
+ */
+struct sev_data_snp_launch_update {
+	u64 gctx_paddr;				/* In */
+	u32 page_size:1;			/* In */
+	u32 page_type:3;			/* In */
+	u32 imi_page:1;				/* In */
+	u32 rsvd:27;
+	u32 rsvd2;
+	u64 address;				/* In */
+	u32 rsvd3:8;
+	u32 vmpl1_perms:8;			/* In */
+	u32 vmpl2_perms:8;			/* In */
+	u32 vmpl3_perms:8;			/* In */
+	u32 rsvd4;
+} __packed;
+
+/**
+ * struct sev_data_snp_launch_finish - SNP_LAUNCH_FINISH command params
+ *
+ * @gctx_paddr: system physical address of guest context page
+ * @id_block_paddr: system physical address of ID block
+ * @id_auth_paddr: system physical address of ID block authentication structure
+ * @id_block_en: indicates whether ID block is present
+ * @auth_key_en: indicates whether author key is present in authentication structure
+ * @rsvd: reserved
+ * @host_data: host-supplied data for guest, not interpreted by firmware
+ */
+struct sev_data_snp_launch_finish {
+	u64 gctx_paddr;
+	u64 id_block_paddr;
+	u64 id_auth_paddr;
+	u8 id_block_en:1;
+	u8 auth_key_en:1;
+	u64 rsvd:62;
+	u8 host_data[32];
+} __packed;
+
+/**
+ * struct sev_data_snp_guest_status - SNP_GUEST_STATUS command params
+ *
+ * @gctx_paddr: system physical address of guest context page
+ * @address: system physical address of guest status page
+ */
+struct sev_data_snp_guest_status {
+	u64 gctx_paddr;
+	u64 address;
+} __packed;
+
+/**
+ * struct sev_data_snp_page_reclaim - SNP_PAGE_RECLAIM command params
+ *
+ * @paddr: system physical address of page to be claimed. The 0th bit in the
+ *         address indicates the page size. 0h indicates 4KB and 1h indicates
+ *         2MB page.
+ */
+struct sev_data_snp_page_reclaim {
+	u64 paddr;
+} __packed;
+
+/**
+ * struct sev_data_snp_page_unsmash - SNP_PAGE_UNSMASH command params
+ *
+ * @paddr: system physical address of page to be unsmashed. The 0th bit in the
+ *         address indicates the page size. 0h indicates 4 KB and 1h indicates
+ *         2 MB page.
+ */
+struct sev_data_snp_page_unsmash {
+	u64 paddr;
+} __packed;
+
+/**
+ * struct sev_data_snp_dbg - DBG_ENCRYPT/DBG_DECRYPT command parameters
+ *
+ * @gctx_paddr: system physical address of guest context page
+ * @src_addr: source address of data to operate on
+ * @dst_addr: destination address of data to operate on
+ */
+struct sev_data_snp_dbg {
+	u64 gctx_paddr;				/* In */
+	u64 src_addr;				/* In */
+	u64 dst_addr;				/* In */
+} __packed;
+
+/**
+ * struct sev_data_snp_guest_request - SNP_GUEST_REQUEST command params
+ *
+ * @gctx_paddr: system physical address of guest context page
+ * @req_paddr: system physical address of request page
+ * @res_paddr: system physical address of response page
+ */
+struct sev_data_snp_guest_request {
+	u64 gctx_paddr;				/* In */
+	u64 req_paddr;				/* In */
+	u64 res_paddr;				/* In */
+} __packed;
+
+/**
+ * struct sev_data_snp_init_ex - SNP_INIT_EX structure
+ *
+ * @init_rmp: indicate that the RMP should be initialized.
+ * @list_paddr_en: indicate that list_paddr is valid
+ * @rsvd: reserved
+ * @rsvd1: reserved
+ * @list_paddr: system physical address of range list
+ * @rsvd2: reserved
+ */
+struct sev_data_snp_init_ex {
+	u32 init_rmp:1;
+	u32 list_paddr_en:1;
+	u32 rsvd:30;
+	u32 rsvd1;
+	u64 list_paddr;
+	u8  rsvd2[48];
+} __packed;
+
+/**
+ * struct sev_data_range - RANGE structure
+ *
+ * @base: system physical address of first byte of range
+ * @page_count: number of 4KB pages in this range
+ * @rsvd: reserved
+ */
+struct sev_data_range {
+	u64 base;
+	u32 page_count;
+	u32 rsvd;
+} __packed;
+
+/**
+ * struct sev_data_range_list - RANGE_LIST structure
+ *
+ * @num_elements: number of elements in RANGE_ARRAY
+ * @rsvd: reserved
+ * @ranges: array of num_elements of type RANGE
+ */
+struct sev_data_range_list {
+	u32 num_elements;
+	u32 rsvd;
+	struct sev_data_range ranges[];
+} __packed;
+
+/**
+ * struct sev_data_snp_shutdown_ex - SNP_SHUTDOWN_EX structure
+ *
+ * @len: length of the command buffer read by the PSP
+ * @iommu_snp_shutdown: Disable enforcement of SNP in the IOMMU
+ * @rsvd1: reserved
+ */
+struct sev_data_snp_shutdown_ex {
+	u32 len;
+	u32 iommu_snp_shutdown:1;
+	u32 rsvd1:31;
+} __packed;
+
 #ifdef CONFIG_CRYPTO_DEV_SP_PSP
 
 /**
diff --git a/include/uapi/linux/psp-sev.h b/include/uapi/linux/psp-sev.h
index b44ba7dcdefc..207e34217528 100644
--- a/include/uapi/linux/psp-sev.h
+++ b/include/uapi/linux/psp-sev.h
@@ -69,6 +69,12 @@ typedef enum {
 	SEV_RET_RESOURCE_LIMIT,
 	SEV_RET_SECURE_DATA_INVALID,
 	SEV_RET_INVALID_KEY = 0x27,
+	SEV_RET_INVALID_PAGE_SIZE,
+	SEV_RET_INVALID_PAGE_STATE,
+	SEV_RET_INVALID_MDATA_ENTRY,
+	SEV_RET_INVALID_PAGE_OWNER,
+	SEV_RET_INVALID_PAGE_AEAD_OFLOW,
+	SEV_RET_RMP_INIT_REQUIRED,
 	SEV_RET_MAX,
 } sev_ret_code;
 
@@ -155,6 +161,56 @@ struct sev_user_data_get_id2 {
 	__u32 length;				/* In/Out */
 } __packed;
 
+/**
+ * struct sev_user_data_snp_status - SNP status
+ *
+ * @api_major: API major version
+ * @api_minor: API minor version
+ * @state: current platform state
+ * @is_rmp_initialized: whether RMP is initialized or not
+ * @rsvd: reserved
+ * @build_id: firmware build id for the API version
+ * @mask_chip_id: whether chip id is present in attestation reports or not
+ * @mask_chip_key: whether attestation reports are signed or not
+ * @vlek_en: VLEK (Version Loaded Endorsement Key) hashstick is loaded
+ * @rsvd1: reserved
+ * @guest_count: the number of guest currently managed by the firmware
+ * @current_tcb_version: current TCB version
+ * @reported_tcb_version: reported TCB version
+ */
+struct sev_user_data_snp_status {
+	__u8 api_major;			/* Out */
+	__u8 api_minor;			/* Out */
+	__u8 state;			/* Out */
+	__u8 is_rmp_initialized:1;	/* Out */
+	__u8 rsvd:7;
+	__u32 build_id;			/* Out */
+	__u32 mask_chip_id:1;		/* Out */
+	__u32 mask_chip_key:1;		/* Out */
+	__u32 vlek_en:1;		/* Out */
+	__u32 rsvd1:29;
+	__u32 guest_count;		/* Out */
+	__u64 current_tcb_version;	/* Out */
+	__u64 reported_tcb_version;	/* Out */
+} __packed;
+
+/**
+ * struct sev_user_data_snp_config - system wide configuration value for SNP.
+ *
+ * @reported_tcb: the TCB version to report in the guest attestation report.
+ * @mask_chip_id: whether chip id is present in attestation reports or not
+ * @mask_chip_key: whether attestation reports are signed or not
+ * @rsvd: reserved
+ * @rsvd1: reserved
+ */
+struct sev_user_data_snp_config {
+	__u64 reported_tcb  ;   /* In */
+	__u32 mask_chip_id:1;   /* In */
+	__u32 mask_chip_key:1;  /* In */
+	__u32 rsvd:30;          /* In */
+	__u8 rsvd1[52];
+} __packed;
+
 /**
  * struct sev_issue_cmd - SEV ioctl parameters
  *
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v2 13/25] crypto: ccp: Add support to initialize the AMD-SP for SEV-SNP
  2024-01-26  4:11 [PATCH v2 00/25] Add AMD Secure Nested Paging (SEV-SNP) Initialization Support Michael Roth
                   ` (11 preceding siblings ...)
  2024-01-26  4:11 ` [PATCH v2 12/25] crypto: ccp: Define the SEV-SNP commands Michael Roth
@ 2024-01-26  4:11 ` Michael Roth
  2024-01-29 17:58   ` Borislav Petkov
  2024-01-26  4:11 ` [PATCH v2 14/25] crypto: ccp: Provide API to issue SEV and SNP commands Michael Roth
                   ` (12 subsequent siblings)
  25 siblings, 1 reply; 47+ messages in thread
From: Michael Roth @ 2024-01-26  4:11 UTC (permalink / raw)
  To: x86
  Cc: kvm, linux-coco, linux-mm, linux-crypto, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, tobin, bp, vbabka, kirill, ak,
	tony.luck, sathyanarayanan.kuppuswamy, alpergun, jarkko,
	ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick,
	Brijesh Singh, Jarkko Sakkinen

From: Brijesh Singh <brijesh.singh@amd.com>

Before SNP VMs can be launched, the platform must be appropriately
configured and initialized. Platform initialization is accomplished via
the SNP_INIT command.

During the execution of SNP_INIT command, the firmware configures
and enables SNP security policy enforcement in many system components.
Some system components write to regions of memory reserved by early
x86 firmware (e.g. UEFI). Other system components write to regions
provided by the operation system, hypervisor, or x86 firmware.
Such system components can only write to HV-fixed pages or Default
pages. They will error when attempting to write to pages in other page
states after SNP_INIT enables their SNP enforcement.

Starting in SNP firmware v1.52, the SNP_INIT_EX command takes a list of
system physical address ranges to convert into the HV-fixed page states
during the RMP initialization. If INIT_RMP is 1, hypervisors should
provide all system physical address ranges that the hypervisor will
never assign to a guest until the next RMP re-initialization.
For instance, the memory that UEFI reserves should be included in the
range list. This allows system components that occasionally write to
memory (e.g. logging to UEFI reserved regions) to not fail due to
RMP initialization and SNP enablement.

Note that SNP_INIT(_EX) must not be executed while non-SEV guests are
executing, otherwise it is possible that the system could reset or hang.
The psp_init_on_probe module parameter was added for SEV/SEV-ES support
and the init_ex_path module parameter to allow for time for the
necessary file system to be mounted/available. SNP_INIT(_EX) does not
use the file associated with init_ex_path. So, to avoid running into
issues where SNP_INIT(_EX) is called while there are other running
guests, issue it during module probe regardless of the psp_init_on_probe
setting, but maintain the previous deferrable handling for SEV/SEV-ES
initialization.

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Co-developed-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Co-developed-by: Jarkko Sakkinen <jarkko@profian.com>
Signed-off-by: Jarkko Sakkinen <jarkko@profian.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
[mdr: squash in psp_init_on_probe changes from Tom, reduce
 proliferation of 'probe' function parameter where possible]
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 arch/x86/kvm/svm/sev.c       |   5 +-
 drivers/crypto/ccp/sev-dev.c | 280 ++++++++++++++++++++++++++++++++---
 drivers/crypto/ccp/sev-dev.h |   2 +
 include/linux/psp-sev.h      |  17 ++-
 4 files changed, 281 insertions(+), 23 deletions(-)

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index f760106c31f8..564091f386f7 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -246,6 +246,7 @@ static void sev_unbind_asid(struct kvm *kvm, unsigned int handle)
 static int sev_guest_init(struct kvm *kvm, struct kvm_sev_cmd *argp)
 {
 	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
+	struct sev_platform_init_args init_args = {0};
 	int asid, ret;
 
 	if (kvm->created_vcpus)
@@ -262,7 +263,8 @@ static int sev_guest_init(struct kvm *kvm, struct kvm_sev_cmd *argp)
 		goto e_no_asid;
 	sev->asid = asid;
 
-	ret = sev_platform_init(&argp->error);
+	init_args.probe = false;
+	ret = sev_platform_init(&init_args);
 	if (ret)
 		goto e_free;
 
@@ -274,6 +276,7 @@ static int sev_guest_init(struct kvm *kvm, struct kvm_sev_cmd *argp)
 	return 0;
 
 e_free:
+	argp->error = init_args.error;
 	sev_asid_free(sev);
 	sev->asid = 0;
 e_no_asid:
diff --git a/drivers/crypto/ccp/sev-dev.c b/drivers/crypto/ccp/sev-dev.c
index e38986d39b63..712964469612 100644
--- a/drivers/crypto/ccp/sev-dev.c
+++ b/drivers/crypto/ccp/sev-dev.c
@@ -29,6 +29,7 @@
 
 #include <asm/smp.h>
 #include <asm/cacheflush.h>
+#include <asm/e820/types.h>
 
 #include "psp-dev.h"
 #include "sev-dev.h"
@@ -37,6 +38,10 @@
 #define SEV_FW_FILE		"amd/sev.fw"
 #define SEV_FW_NAME_SIZE	64
 
+/* Minimum firmware version required for the SEV-SNP support */
+#define SNP_MIN_API_MAJOR	1
+#define SNP_MIN_API_MINOR	51
+
 static DEFINE_MUTEX(sev_cmd_mutex);
 static struct sev_misc_dev *misc_dev;
 
@@ -80,6 +85,13 @@ static void *sev_es_tmr;
 #define NV_LENGTH (32 * 1024)
 static void *sev_init_ex_buffer;
 
+/*
+ * SEV_DATA_RANGE_LIST:
+ *   Array containing range of pages that firmware transitions to HV-fixed
+ *   page state.
+ */
+struct sev_data_range_list *snp_range_list;
+
 static inline bool sev_version_greater_or_equal(u8 maj, u8 min)
 {
 	struct sev_device *sev = psp_master->sev_data;
@@ -480,20 +492,163 @@ static inline int __sev_do_init_locked(int *psp_ret)
 		return __sev_init_locked(psp_ret);
 }
 
-static int __sev_platform_init_locked(int *error)
+static void snp_set_hsave_pa(void *arg)
+{
+	wrmsrl(MSR_VM_HSAVE_PA, 0);
+}
+
+static int snp_filter_reserved_mem_regions(struct resource *rs, void *arg)
+{
+	struct sev_data_range_list *range_list = arg;
+	struct sev_data_range *range = &range_list->ranges[range_list->num_elements];
+	size_t size;
+
+	/*
+	 * Ensure the list of HV_FIXED pages that will be passed to firmware
+	 * do not exceed the page-sized argument buffer.
+	 */
+	if ((range_list->num_elements * sizeof(struct sev_data_range) +
+	     sizeof(struct sev_data_range_list)) > PAGE_SIZE)
+		return -E2BIG;
+
+	switch (rs->desc) {
+	case E820_TYPE_RESERVED:
+	case E820_TYPE_PMEM:
+	case E820_TYPE_ACPI:
+		range->base = rs->start & PAGE_MASK;
+		size = PAGE_ALIGN((rs->end + 1) - rs->start);
+		range->page_count = size >> PAGE_SHIFT;
+		range_list->num_elements++;
+		break;
+	default:
+		break;
+	}
+
+	return 0;
+}
+
+static int __sev_snp_init_locked(int *error)
 {
-	int rc = 0, psp_ret = SEV_RET_NO_FW_CALL;
 	struct psp_device *psp = psp_master;
+	struct sev_data_snp_init_ex data;
 	struct sev_device *sev;
+	void *arg = &data;
+	int cmd, rc = 0;
 
-	if (!psp || !psp->sev_data)
+	if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP))
 		return -ENODEV;
 
 	sev = psp->sev_data;
 
+	if (sev->snp_initialized)
+		return 0;
+
+	if (!sev_version_greater_or_equal(SNP_MIN_API_MAJOR, SNP_MIN_API_MINOR)) {
+		dev_dbg(sev->dev, "SEV-SNP support requires firmware version >= %d:%d\n",
+			SNP_MIN_API_MAJOR, SNP_MIN_API_MINOR);
+		return 0;
+	}
+
+	/* SNP_INIT requires MSR_VM_HSAVE_PA to be cleared on all CPUs. */
+	on_each_cpu(snp_set_hsave_pa, NULL, 1);
+
+	/*
+	 * Starting in SNP firmware v1.52, the SNP_INIT_EX command takes a list
+	 * of system physical address ranges to convert into HV-fixed page
+	 * states during the RMP initialization.  For instance, the memory that
+	 * UEFI reserves should be included in the that list. This allows system
+	 * components that occasionally write to memory (e.g. logging to UEFI
+	 * reserved regions) to not fail due to RMP initialization and SNP
+	 * enablement.
+	 *
+	 */
+	if (sev_version_greater_or_equal(SNP_MIN_API_MAJOR, 52)) {
+		/*
+		 * Firmware checks that the pages containing the ranges enumerated
+		 * in the RANGES structure are either in the default page state or in the
+		 * firmware page state.
+		 */
+		snp_range_list = kzalloc(PAGE_SIZE, GFP_KERNEL);
+		if (!snp_range_list) {
+			dev_err(sev->dev,
+				"SEV: SNP_INIT_EX range list memory allocation failed\n");
+			return -ENOMEM;
+		}
+
+		/*
+		 * Retrieve all reserved memory regions from the e820 memory map
+		 * to be setup as HV-fixed pages.
+		 */
+		rc = walk_iomem_res_desc(IORES_DESC_NONE, IORESOURCE_MEM, 0, ~0,
+					 snp_range_list, snp_filter_reserved_mem_regions);
+		if (rc) {
+			dev_err(sev->dev,
+				"SEV: SNP_INIT_EX walk_iomem_res_desc failed rc = %d\n", rc);
+			return rc;
+		}
+
+		memset(&data, 0, sizeof(data));
+		data.init_rmp = 1;
+		data.list_paddr_en = 1;
+		data.list_paddr = __psp_pa(snp_range_list);
+		cmd = SEV_CMD_SNP_INIT_EX;
+	} else {
+		cmd = SEV_CMD_SNP_INIT;
+		arg = NULL;
+	}
+
+	/*
+	 * The following sequence must be issued before launching the first SNP
+	 * guest to ensure all dirty cache lines are flushed, including from
+	 * updates to the RMP table itself via the RMPUPDATE instruction:
+	 *
+	 * - WBINVD on all running CPUs
+	 * - SEV_CMD_SNP_INIT[_EX] firmware command
+	 * - WBINVD on all running CPUs
+	 * - SEV_CMD_SNP_DF_FLUSH firmware command
+	 */
+	wbinvd_on_all_cpus();
+
+	rc = __sev_do_cmd_locked(cmd, arg, error);
+	if (rc)
+		return rc;
+
+	/* Prepare for first SNP guest launch after INIT. */
+	wbinvd_on_all_cpus();
+	rc = __sev_do_cmd_locked(SEV_CMD_SNP_DF_FLUSH, NULL, error);
+	if (rc)
+		return rc;
+
+	sev->snp_initialized = true;
+	dev_dbg(sev->dev, "SEV-SNP firmware initialized\n");
+
+	return rc;
+}
+
+static int __sev_platform_init_locked(int *error)
+{
+	int rc, psp_ret = SEV_RET_NO_FW_CALL;
+	struct sev_device *sev;
+
+	if (!psp_master || !psp_master->sev_data)
+		return -ENODEV;
+
+	sev = psp_master->sev_data;
+
 	if (sev->state == SEV_STATE_INIT)
 		return 0;
 
+	if (!sev_es_tmr) {
+		/* Obtain the TMR memory area for SEV-ES use */
+		sev_es_tmr = sev_fw_alloc(SEV_ES_TMR_SIZE);
+		if (sev_es_tmr)
+			/* Must flush the cache before giving it to the firmware */
+			clflush_cache_range(sev_es_tmr, SEV_ES_TMR_SIZE);
+		else
+			dev_warn(sev->dev,
+				 "SEV: TMR allocation failed, SEV-ES support unavailable\n");
+		}
+
 	if (sev_init_ex_buffer) {
 		rc = sev_read_init_ex_file();
 		if (rc)
@@ -536,12 +691,46 @@ static int __sev_platform_init_locked(int *error)
 	return 0;
 }
 
-int sev_platform_init(int *error)
+static int _sev_platform_init_locked(struct sev_platform_init_args *args)
+{
+	struct sev_device *sev;
+	int rc;
+
+	if (!psp_master || !psp_master->sev_data)
+		return -ENODEV;
+
+	sev = psp_master->sev_data;
+
+	if (sev->state == SEV_STATE_INIT)
+		return 0;
+
+	/*
+	 * Legacy guests cannot be running while SNP_INIT(_EX) is executing,
+	 * so perform SEV-SNP initialization at probe time.
+	 */
+	rc = __sev_snp_init_locked(&args->error);
+	if (rc && rc != -ENODEV) {
+		/*
+		 * Don't abort the probe if SNP INIT failed,
+		 * continue to initialize the legacy SEV firmware.
+		 */
+		dev_err(sev->dev, "SEV-SNP: failed to INIT rc %d, error %#x\n",
+			rc, args->error);
+	}
+
+	/* Defer legacy SEV/SEV-ES support if allowed by caller/module. */
+	if (args->probe && !psp_init_on_probe)
+		return 0;
+
+	return __sev_platform_init_locked(&args->error);
+}
+
+int sev_platform_init(struct sev_platform_init_args *args)
 {
 	int rc;
 
 	mutex_lock(&sev_cmd_mutex);
-	rc = __sev_platform_init_locked(error);
+	rc = _sev_platform_init_locked(args);
 	mutex_unlock(&sev_cmd_mutex);
 
 	return rc;
@@ -852,6 +1041,55 @@ static int sev_update_firmware(struct device *dev)
 	return ret;
 }
 
+static int __sev_snp_shutdown_locked(int *error)
+{
+	struct sev_device *sev = psp_master->sev_data;
+	struct sev_data_snp_shutdown_ex data;
+	int ret;
+
+	if (!sev->snp_initialized)
+		return 0;
+
+	memset(&data, 0, sizeof(data));
+	data.len = sizeof(data);
+	data.iommu_snp_shutdown = 1;
+
+	wbinvd_on_all_cpus();
+
+	ret = __sev_do_cmd_locked(SEV_CMD_SNP_SHUTDOWN_EX, &data, error);
+	/* SHUTDOWN may require DF_FLUSH */
+	if (*error == SEV_RET_DFFLUSH_REQUIRED) {
+		ret = __sev_do_cmd_locked(SEV_CMD_SNP_DF_FLUSH, NULL, NULL);
+		if (ret) {
+			dev_err(sev->dev, "SEV-SNP DF_FLUSH failed\n");
+			return ret;
+		}
+		/* reissue the shutdown command */
+		ret = __sev_do_cmd_locked(SEV_CMD_SNP_SHUTDOWN_EX, &data,
+					  error);
+	}
+	if (ret) {
+		dev_err(sev->dev, "SEV-SNP firmware shutdown failed\n");
+		return ret;
+	}
+
+	sev->snp_initialized = false;
+	dev_dbg(sev->dev, "SEV-SNP firmware shutdown\n");
+
+	return ret;
+}
+
+static int sev_snp_shutdown(int *error)
+{
+	int rc;
+
+	mutex_lock(&sev_cmd_mutex);
+	rc = __sev_snp_shutdown_locked(error);
+	mutex_unlock(&sev_cmd_mutex);
+
+	return rc;
+}
+
 static int sev_ioctl_do_pek_import(struct sev_issue_cmd *argp, bool writable)
 {
 	struct sev_device *sev = psp_master->sev_data;
@@ -1299,6 +1537,8 @@ int sev_dev_init(struct psp_device *psp)
 
 static void sev_firmware_shutdown(struct sev_device *sev)
 {
+	int error;
+
 	sev_platform_shutdown(NULL);
 
 	if (sev_es_tmr) {
@@ -1315,6 +1555,13 @@ static void sev_firmware_shutdown(struct sev_device *sev)
 			   get_order(NV_LENGTH));
 		sev_init_ex_buffer = NULL;
 	}
+
+	if (snp_range_list) {
+		kfree(snp_range_list);
+		snp_range_list = NULL;
+	}
+
+	sev_snp_shutdown(&error);
 }
 
 void sev_dev_destroy(struct psp_device *psp)
@@ -1345,7 +1592,8 @@ EXPORT_SYMBOL_GPL(sev_issue_cmd_external_user);
 void sev_pci_init(void)
 {
 	struct sev_device *sev = psp_master->sev_data;
-	int error, rc;
+	struct sev_platform_init_args args = {0};
+	int rc;
 
 	if (!sev)
 		return;
@@ -1370,23 +1618,15 @@ void sev_pci_init(void)
 		}
 	}
 
-	/* Obtain the TMR memory area for SEV-ES use */
-	sev_es_tmr = sev_fw_alloc(SEV_ES_TMR_SIZE);
-	if (sev_es_tmr)
-		/* Must flush the cache before giving it to the firmware */
-		clflush_cache_range(sev_es_tmr, SEV_ES_TMR_SIZE);
-	else
-		dev_warn(sev->dev,
-			 "SEV: TMR allocation failed, SEV-ES support unavailable\n");
-
-	if (!psp_init_on_probe)
-		return;
-
 	/* Initialize the platform */
-	rc = sev_platform_init(&error);
+	args.probe = true;
+	rc = sev_platform_init(&args);
 	if (rc)
 		dev_err(sev->dev, "SEV: failed to INIT error %#x, rc %d\n",
-			error, rc);
+			args.error, rc);
+
+	dev_info(sev->dev, "SEV%s API:%d.%d build:%d\n", sev->snp_initialized ?
+		"-SNP" : "", sev->api_major, sev->api_minor, sev->build);
 
 	return;
 
diff --git a/drivers/crypto/ccp/sev-dev.h b/drivers/crypto/ccp/sev-dev.h
index 778c95155e74..85506325051a 100644
--- a/drivers/crypto/ccp/sev-dev.h
+++ b/drivers/crypto/ccp/sev-dev.h
@@ -52,6 +52,8 @@ struct sev_device {
 	u8 build;
 
 	void *cmd_buf;
+
+	bool snp_initialized;
 };
 
 int sev_dev_init(struct psp_device *psp);
diff --git a/include/linux/psp-sev.h b/include/linux/psp-sev.h
index 006e4cdbeb78..8128de17f0f4 100644
--- a/include/linux/psp-sev.h
+++ b/include/linux/psp-sev.h
@@ -790,10 +790,23 @@ struct sev_data_snp_shutdown_ex {
 
 #ifdef CONFIG_CRYPTO_DEV_SP_PSP
 
+/**
+ * struct sev_platform_init_args
+ *
+ * @error: SEV firmware error code
+ * @probe: True if this is being called as part of CCP module probe, which
+ *  will defer SEV_INIT/SEV_INIT_EX firmware initialization until needed
+ *  unless psp_init_on_probe module param is set
+ */
+struct sev_platform_init_args {
+	int error;
+	bool probe;
+};
+
 /**
  * sev_platform_init - perform SEV INIT command
  *
- * @error: SEV command return code
+ * @args: struct sev_platform_init_args to pass in arguments
  *
  * Returns:
  * 0 if the SEV successfully processed the command
@@ -802,7 +815,7 @@ struct sev_data_snp_shutdown_ex {
  * -%ETIMEDOUT if the SEV command timed out
  * -%EIO       if the SEV returned a non-zero return code
  */
-int sev_platform_init(int *error);
+int sev_platform_init(struct sev_platform_init_args *args);
 
 /**
  * sev_platform_status - perform SEV PLATFORM_STATUS command
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v2 14/25] crypto: ccp: Provide API to issue SEV and SNP commands
  2024-01-26  4:11 [PATCH v2 00/25] Add AMD Secure Nested Paging (SEV-SNP) Initialization Support Michael Roth
                   ` (12 preceding siblings ...)
  2024-01-26  4:11 ` [PATCH v2 13/25] crypto: ccp: Add support to initialize the AMD-SP for SEV-SNP Michael Roth
@ 2024-01-26  4:11 ` Michael Roth
  2024-01-26  4:11 ` [PATCH v2 15/25] x86/sev: Introduce snp leaked pages list Michael Roth
                   ` (11 subsequent siblings)
  25 siblings, 0 replies; 47+ messages in thread
From: Michael Roth @ 2024-01-26  4:11 UTC (permalink / raw)
  To: x86
  Cc: kvm, linux-coco, linux-mm, linux-crypto, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, tobin, bp, vbabka, kirill, ak,
	tony.luck, sathyanarayanan.kuppuswamy, alpergun, jarkko,
	ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick,
	Brijesh Singh

From: Brijesh Singh <brijesh.singh@amd.com>

Export sev_do_cmd() as a generic API for the hypervisor to issue
commands to manage an SEV and SNP guest. The commands for SEV and SNP
are defined in the SEV and SEV-SNP firmware specifications.

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
[mdr: kernel-doc fixups]
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 drivers/crypto/ccp/sev-dev.c |  3 ++-
 include/linux/psp-sev.h      | 19 +++++++++++++++++++
 2 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/drivers/crypto/ccp/sev-dev.c b/drivers/crypto/ccp/sev-dev.c
index 712964469612..abee1a68d609 100644
--- a/drivers/crypto/ccp/sev-dev.c
+++ b/drivers/crypto/ccp/sev-dev.c
@@ -431,7 +431,7 @@ static int __sev_do_cmd_locked(int cmd, void *data, int *psp_ret)
 	return ret;
 }
 
-static int sev_do_cmd(int cmd, void *data, int *psp_ret)
+int sev_do_cmd(int cmd, void *data, int *psp_ret)
 {
 	int rc;
 
@@ -441,6 +441,7 @@ static int sev_do_cmd(int cmd, void *data, int *psp_ret)
 
 	return rc;
 }
+EXPORT_SYMBOL_GPL(sev_do_cmd);
 
 static int __sev_init_locked(int *error)
 {
diff --git a/include/linux/psp-sev.h b/include/linux/psp-sev.h
index 8128de17f0f4..c7dd6ff9f36b 100644
--- a/include/linux/psp-sev.h
+++ b/include/linux/psp-sev.h
@@ -915,6 +915,22 @@ int sev_guest_df_flush(int *error);
  */
 int sev_guest_decommission(struct sev_data_decommission *data, int *error);
 
+/**
+ * sev_do_cmd - issue an SEV or an SEV-SNP command
+ *
+ * @cmd: SEV or SEV-SNP firmware command to issue
+ * @data: arguments for firmware command
+ * @psp_ret: SEV command return code
+ *
+ * Returns:
+ * 0 if the SEV device successfully processed the command
+ * -%ENODEV    if the PSP device is not available
+ * -%ENOTSUPP  if PSP device does not support SEV
+ * -%ETIMEDOUT if the SEV command timed out
+ * -%EIO       if PSP device returned a non-zero return code
+ */
+int sev_do_cmd(int cmd, void *data, int *psp_ret);
+
 void *psp_copy_user_blob(u64 uaddr, u32 len);
 
 #else	/* !CONFIG_CRYPTO_DEV_SP_PSP */
@@ -930,6 +946,9 @@ sev_guest_deactivate(struct sev_data_deactivate *data, int *error) { return -ENO
 static inline int
 sev_guest_decommission(struct sev_data_decommission *data, int *error) { return -ENODEV; }
 
+static inline int
+sev_do_cmd(int cmd, void *data, int *psp_ret) { return -ENODEV; }
+
 static inline int
 sev_guest_activate(struct sev_data_activate *data, int *error) { return -ENODEV; }
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v2 15/25] x86/sev: Introduce snp leaked pages list
  2024-01-26  4:11 [PATCH v2 00/25] Add AMD Secure Nested Paging (SEV-SNP) Initialization Support Michael Roth
                   ` (13 preceding siblings ...)
  2024-01-26  4:11 ` [PATCH v2 14/25] crypto: ccp: Provide API to issue SEV and SNP commands Michael Roth
@ 2024-01-26  4:11 ` Michael Roth
  2024-01-29 14:26   ` Vlastimil Babka
  2024-01-26  4:11 ` [PATCH v2 16/25] crypto: ccp: Handle the legacy TMR allocation when SNP is enabled Michael Roth
                   ` (10 subsequent siblings)
  25 siblings, 1 reply; 47+ messages in thread
From: Michael Roth @ 2024-01-26  4:11 UTC (permalink / raw)
  To: x86
  Cc: kvm, linux-coco, linux-mm, linux-crypto, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, tobin, bp, vbabka, kirill, ak,
	tony.luck, sathyanarayanan.kuppuswamy, alpergun, jarkko,
	ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick

From: Ashish Kalra <ashish.kalra@amd.com>

Pages are unsafe to be released back to the page-allocator, if they
have been transitioned to firmware/guest state and can't be reclaimed
or transitioned back to hypervisor/shared state. In this case add
them to an internal leaked pages list to ensure that they are not freed
or touched/accessed to cause fatal page faults.

Suggested-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
[mdr: relocate to arch/x86/virt/svm/sev.c]
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 arch/x86/include/asm/sev.h |  2 ++
 arch/x86/virt/svm/sev.c    | 34 ++++++++++++++++++++++++++++++++++
 2 files changed, 36 insertions(+)

diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index d3ccb7a0c7e9..435ba9bc4510 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -264,6 +264,7 @@ void snp_dump_hva_rmpentry(unsigned long address);
 int psmash(u64 pfn);
 int rmp_make_private(u64 pfn, u64 gpa, enum pg_level level, int asid, bool immutable);
 int rmp_make_shared(u64 pfn, enum pg_level level);
+void snp_leak_pages(u64 pfn, unsigned int npages);
 #else
 static inline bool snp_probe_rmptable_info(void) { return false; }
 static inline int snp_lookup_rmpentry(u64 pfn, bool *assigned, int *level) { return -ENODEV; }
@@ -275,6 +276,7 @@ static inline int rmp_make_private(u64 pfn, u64 gpa, enum pg_level level, int as
 	return -ENODEV;
 }
 static inline int rmp_make_shared(u64 pfn, enum pg_level level) { return -ENODEV; }
+static inline void snp_leak_pages(u64 pfn, unsigned int npages) {}
 #endif
 
 #endif
diff --git a/arch/x86/virt/svm/sev.c b/arch/x86/virt/svm/sev.c
index 1a13eff78c9d..649ac1bb6b0e 100644
--- a/arch/x86/virt/svm/sev.c
+++ b/arch/x86/virt/svm/sev.c
@@ -65,6 +65,11 @@ static u64 probed_rmp_base, probed_rmp_size;
 static struct rmpentry *rmptable __ro_after_init;
 static u64 rmptable_max_pfn __ro_after_init;
 
+static LIST_HEAD(snp_leaked_pages_list);
+static DEFINE_SPINLOCK(snp_leaked_pages_list_lock);
+
+static unsigned long snp_nr_leaked_pages;
+
 #undef pr_fmt
 #define pr_fmt(fmt)	"SEV-SNP: " fmt
 
@@ -505,3 +510,32 @@ int rmp_make_shared(u64 pfn, enum pg_level level)
 	return rmpupdate(pfn, &state);
 }
 EXPORT_SYMBOL_GPL(rmp_make_shared);
+
+void snp_leak_pages(u64 pfn, unsigned int npages)
+{
+	struct page *page = pfn_to_page(pfn);
+
+	pr_warn("Leaking PFN range 0x%llx-0x%llx\n", pfn, pfn + npages);
+
+	spin_lock(&snp_leaked_pages_list_lock);
+	while (npages--) {
+		/*
+		 * Reuse the page's buddy list for chaining into the leaked
+		 * pages list. This page should not be on a free list currently
+		 * and is also unsafe to be added to a free list.
+		 */
+		if (likely(!PageCompound(page)) ||
+		    (PageHead(page) && compound_nr(page) <= npages))
+			/*
+			 * Skip inserting tail pages of compound page as
+			 * page->buddy_list of tail pages is not usable.
+			 */
+			list_add_tail(&page->buddy_list, &snp_leaked_pages_list);
+		dump_rmpentry(pfn);
+		snp_nr_leaked_pages++;
+		pfn++;
+		page++;
+	}
+	spin_unlock(&snp_leaked_pages_list_lock);
+}
+EXPORT_SYMBOL_GPL(snp_leak_pages);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v2 16/25] crypto: ccp: Handle the legacy TMR allocation when SNP is enabled
  2024-01-26  4:11 [PATCH v2 00/25] Add AMD Secure Nested Paging (SEV-SNP) Initialization Support Michael Roth
                   ` (14 preceding siblings ...)
  2024-01-26  4:11 ` [PATCH v2 15/25] x86/sev: Introduce snp leaked pages list Michael Roth
@ 2024-01-26  4:11 ` Michael Roth
  2024-01-29 15:04   ` Borislav Petkov
  2024-01-26  4:11 ` [PATCH v2 17/25] crypto: ccp: Handle non-volatile INIT_EX data " Michael Roth
                   ` (9 subsequent siblings)
  25 siblings, 1 reply; 47+ messages in thread
From: Michael Roth @ 2024-01-26  4:11 UTC (permalink / raw)
  To: x86
  Cc: kvm, linux-coco, linux-mm, linux-crypto, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, tobin, bp, vbabka, kirill, ak,
	tony.luck, sathyanarayanan.kuppuswamy, alpergun, jarkko,
	ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick,
	Brijesh Singh

From: Brijesh Singh <brijesh.singh@amd.com>

The behavior and requirement for the SEV-legacy command is altered when
the SNP firmware is in the INIT state. See SEV-SNP firmware ABI
specification for more details.

Allocate the Trusted Memory Region (TMR) as a 2MB-sized/aligned region
when SNP is enabled to satisfy new requirements for SNP. Continue
allocating a 1MB-sized region for !SNP configuration.

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Co-developed-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
[mdr: use struct sev_data_snp_page_reclaim instead of passing paddr
      directly to SEV_CMD_SNP_PAGE_RECLAIM]
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 drivers/crypto/ccp/sev-dev.c | 176 +++++++++++++++++++++++++++++++----
 include/linux/psp-sev.h      |   9 ++
 2 files changed, 165 insertions(+), 20 deletions(-)

diff --git a/drivers/crypto/ccp/sev-dev.c b/drivers/crypto/ccp/sev-dev.c
index abee1a68d609..fa992ce57ffe 100644
--- a/drivers/crypto/ccp/sev-dev.c
+++ b/drivers/crypto/ccp/sev-dev.c
@@ -30,6 +30,7 @@
 #include <asm/smp.h>
 #include <asm/cacheflush.h>
 #include <asm/e820/types.h>
+#include <asm/sev.h>
 
 #include "psp-dev.h"
 #include "sev-dev.h"
@@ -73,9 +74,14 @@ static int psp_timeout;
  *   The TMR is a 1MB area that must be 1MB aligned.  Use the page allocator
  *   to allocate the memory, which will return aligned memory for the specified
  *   allocation order.
+ *
+ * When SEV-SNP is enabled the TMR needs to be 2MB aligned and 2MB sized.
  */
-#define SEV_ES_TMR_SIZE		(1024 * 1024)
+#define SEV_TMR_SIZE		(1024 * 1024)
+#define SNP_TMR_SIZE		(2 * 1024 * 1024)
+
 static void *sev_es_tmr;
+static size_t sev_es_tmr_size = SEV_TMR_SIZE;
 
 /* INIT_EX NV Storage:
  *   The NV Storage is a 32Kb area and must be 4Kb page aligned.  Use the page
@@ -192,17 +198,6 @@ static int sev_cmd_buffer_len(int cmd)
 	return 0;
 }
 
-static void *sev_fw_alloc(unsigned long len)
-{
-	struct page *page;
-
-	page = alloc_pages(GFP_KERNEL, get_order(len));
-	if (!page)
-		return NULL;
-
-	return page_address(page);
-}
-
 static struct file *open_file_as_root(const char *filename, int flags, umode_t mode)
 {
 	struct file *fp;
@@ -333,6 +328,142 @@ static int sev_write_init_ex_file_if_required(int cmd_id)
 	return sev_write_init_ex_file();
 }
 
+/*
+ * snp_reclaim_pages() needs __sev_do_cmd_locked(), and __sev_do_cmd_locked()
+ * needs snp_reclaim_pages(), so a forward declaration is needed.
+ */
+static int __sev_do_cmd_locked(int cmd, void *data, int *psp_ret);
+
+static int snp_reclaim_pages(unsigned long paddr, unsigned int npages, bool locked)
+{
+	int ret, err, i;
+
+	paddr = __sme_clr(ALIGN_DOWN(paddr, PAGE_SIZE));
+
+	for (i = 0; i < npages; i++, paddr += PAGE_SIZE) {
+		struct sev_data_snp_page_reclaim data = {0};
+
+		data.paddr = paddr;
+
+		if (locked)
+			ret = __sev_do_cmd_locked(SEV_CMD_SNP_PAGE_RECLAIM, &data, &err);
+		else
+			ret = sev_do_cmd(SEV_CMD_SNP_PAGE_RECLAIM, &data, &err);
+
+		if (ret)
+			goto cleanup;
+
+		ret = rmp_make_shared(__phys_to_pfn(paddr), PG_LEVEL_4K);
+		if (ret)
+			goto cleanup;
+	}
+
+	return 0;
+
+cleanup:
+	/*
+	 * If there was a failure reclaiming the page then it is no longer safe
+	 * to release it back to the system; leak it instead.
+	 */
+	snp_leak_pages(__phys_to_pfn(paddr), npages - i);
+	return ret;
+}
+
+static int rmp_mark_pages_firmware(unsigned long paddr, unsigned int npages, bool locked)
+{
+	unsigned long pfn = __sme_clr(paddr) >> PAGE_SHIFT;
+	int rc, i;
+
+	for (i = 0; i < npages; i++, pfn++) {
+		rc = rmp_make_private(pfn, 0, PG_LEVEL_4K, 0, true);
+		if (rc)
+			goto cleanup;
+	}
+
+	return 0;
+
+cleanup:
+	/*
+	 * Try unrolling the firmware state changes by
+	 * reclaiming the pages which were already changed to the
+	 * firmware state.
+	 */
+	snp_reclaim_pages(paddr, i, locked);
+
+	return rc;
+}
+
+static struct page *__snp_alloc_firmware_pages(gfp_t gfp_mask, int order)
+{
+	unsigned long npages = 1ul << order, paddr;
+	struct sev_device *sev;
+	struct page *page;
+
+	if (!psp_master || !psp_master->sev_data)
+		return NULL;
+
+	page = alloc_pages(gfp_mask, order);
+	if (!page)
+		return NULL;
+
+	/* If SEV-SNP is initialized then add the page in RMP table. */
+	sev = psp_master->sev_data;
+	if (!sev->snp_initialized)
+		return page;
+
+	paddr = __pa((unsigned long)page_address(page));
+	if (rmp_mark_pages_firmware(paddr, npages, false))
+		return NULL;
+
+	return page;
+}
+
+void *snp_alloc_firmware_page(gfp_t gfp_mask)
+{
+	struct page *page;
+
+	page = __snp_alloc_firmware_pages(gfp_mask, 0);
+
+	return page ? page_address(page) : NULL;
+}
+EXPORT_SYMBOL_GPL(snp_alloc_firmware_page);
+
+static void __snp_free_firmware_pages(struct page *page, int order, bool locked)
+{
+	struct sev_device *sev = psp_master->sev_data;
+	unsigned long paddr, npages = 1ul << order;
+
+	if (!page)
+		return;
+
+	paddr = __pa((unsigned long)page_address(page));
+	if (sev->snp_initialized &&
+	    snp_reclaim_pages(paddr, npages, locked))
+		return;
+
+	__free_pages(page, order);
+}
+
+void snp_free_firmware_page(void *addr)
+{
+	if (!addr)
+		return;
+
+	__snp_free_firmware_pages(virt_to_page(addr), 0, false);
+}
+EXPORT_SYMBOL_GPL(snp_free_firmware_page);
+
+static void *sev_fw_alloc(unsigned long len)
+{
+	struct page *page;
+
+	page = __snp_alloc_firmware_pages(GFP_KERNEL, get_order(len));
+	if (!page)
+		return NULL;
+
+	return page_address(page);
+}
+
 static int __sev_do_cmd_locked(int cmd, void *data, int *psp_ret)
 {
 	struct psp_device *psp = psp_master;
@@ -456,7 +587,7 @@ static int __sev_init_locked(int *error)
 		data.tmr_address = __pa(sev_es_tmr);
 
 		data.flags |= SEV_INIT_FLAGS_SEV_ES;
-		data.tmr_len = SEV_ES_TMR_SIZE;
+		data.tmr_len = sev_es_tmr_size;
 	}
 
 	return __sev_do_cmd_locked(SEV_CMD_INIT, &data, error);
@@ -479,7 +610,7 @@ static int __sev_init_ex_locked(int *error)
 		data.tmr_address = __pa(sev_es_tmr);
 
 		data.flags |= SEV_INIT_FLAGS_SEV_ES;
-		data.tmr_len = SEV_ES_TMR_SIZE;
+		data.tmr_len = sev_es_tmr_size;
 	}
 
 	return __sev_do_cmd_locked(SEV_CMD_INIT_EX, &data, error);
@@ -623,6 +754,8 @@ static int __sev_snp_init_locked(int *error)
 	sev->snp_initialized = true;
 	dev_dbg(sev->dev, "SEV-SNP firmware initialized\n");
 
+	sev_es_tmr_size = SNP_TMR_SIZE;
+
 	return rc;
 }
 
@@ -641,14 +774,16 @@ static int __sev_platform_init_locked(int *error)
 
 	if (!sev_es_tmr) {
 		/* Obtain the TMR memory area for SEV-ES use */
-		sev_es_tmr = sev_fw_alloc(SEV_ES_TMR_SIZE);
-		if (sev_es_tmr)
+		sev_es_tmr = sev_fw_alloc(sev_es_tmr_size);
+		if (sev_es_tmr) {
 			/* Must flush the cache before giving it to the firmware */
-			clflush_cache_range(sev_es_tmr, SEV_ES_TMR_SIZE);
-		else
+			if (!sev->snp_initialized)
+				clflush_cache_range(sev_es_tmr, sev_es_tmr_size);
+		} else {
 			dev_warn(sev->dev,
 				 "SEV: TMR allocation failed, SEV-ES support unavailable\n");
 		}
+	}
 
 	if (sev_init_ex_buffer) {
 		rc = sev_read_init_ex_file();
@@ -1546,8 +1681,9 @@ static void sev_firmware_shutdown(struct sev_device *sev)
 		/* The TMR area was encrypted, flush it from the cache */
 		wbinvd_on_all_cpus();
 
-		free_pages((unsigned long)sev_es_tmr,
-			   get_order(SEV_ES_TMR_SIZE));
+		__snp_free_firmware_pages(virt_to_page(sev_es_tmr),
+					  get_order(sev_es_tmr_size),
+					  false);
 		sev_es_tmr = NULL;
 	}
 
diff --git a/include/linux/psp-sev.h b/include/linux/psp-sev.h
index c7dd6ff9f36b..7f9bc1979018 100644
--- a/include/linux/psp-sev.h
+++ b/include/linux/psp-sev.h
@@ -932,6 +932,8 @@ int sev_guest_decommission(struct sev_data_decommission *data, int *error);
 int sev_do_cmd(int cmd, void *data, int *psp_ret);
 
 void *psp_copy_user_blob(u64 uaddr, u32 len);
+void *snp_alloc_firmware_page(gfp_t mask);
+void snp_free_firmware_page(void *addr);
 
 #else	/* !CONFIG_CRYPTO_DEV_SP_PSP */
 
@@ -959,6 +961,13 @@ sev_issue_cmd_external_user(struct file *filep, unsigned int id, void *data, int
 
 static inline void *psp_copy_user_blob(u64 __user uaddr, u32 len) { return ERR_PTR(-EINVAL); }
 
+static inline void *snp_alloc_firmware_page(gfp_t mask)
+{
+	return NULL;
+}
+
+static inline void snp_free_firmware_page(void *addr) { }
+
 #endif	/* CONFIG_CRYPTO_DEV_SP_PSP */
 
 #endif	/* __PSP_SEV_H__ */
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v2 17/25] crypto: ccp: Handle non-volatile INIT_EX data when SNP is enabled
  2024-01-26  4:11 [PATCH v2 00/25] Add AMD Secure Nested Paging (SEV-SNP) Initialization Support Michael Roth
                   ` (15 preceding siblings ...)
  2024-01-26  4:11 ` [PATCH v2 16/25] crypto: ccp: Handle the legacy TMR allocation when SNP is enabled Michael Roth
@ 2024-01-26  4:11 ` Michael Roth
  2024-01-29 15:12   ` Borislav Petkov
  2024-01-26  4:11 ` [PATCH v2 18/25] crypto: ccp: Handle legacy SEV commands " Michael Roth
                   ` (8 subsequent siblings)
  25 siblings, 1 reply; 47+ messages in thread
From: Michael Roth @ 2024-01-26  4:11 UTC (permalink / raw)
  To: x86
  Cc: kvm, linux-coco, linux-mm, linux-crypto, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, tobin, bp, vbabka, kirill, ak,
	tony.luck, sathyanarayanan.kuppuswamy, alpergun, jarkko,
	ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick

From: Tom Lendacky <thomas.lendacky@amd.com>

For SEV/SEV-ES, a buffer can be used to access non-volatile data so it
can be initialized from a file specified by the init_ex_path CCP module
parameter instead of relying on the SPI bus for NV storage, and
afterward the buffer can be read from to sync new data back to the file.

When SNP is enabled, the pages comprising this buffer need to be set to
firmware-owned in the RMP table before they can be accessed by firmware
for subsequent updates to the initial contents.

Implement that handling here.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Co-developed-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 drivers/crypto/ccp/sev-dev.c | 47 ++++++++++++++++++++++++------------
 1 file changed, 32 insertions(+), 15 deletions(-)

diff --git a/drivers/crypto/ccp/sev-dev.c b/drivers/crypto/ccp/sev-dev.c
index fa992ce57ffe..97fdd98e958c 100644
--- a/drivers/crypto/ccp/sev-dev.c
+++ b/drivers/crypto/ccp/sev-dev.c
@@ -785,10 +785,38 @@ static int __sev_platform_init_locked(int *error)
 		}
 	}
 
-	if (sev_init_ex_buffer) {
+	/*
+	 * If an init_ex_path is provided allocate a buffer for the file and
+	 * read in the contents. Additionally, if SNP is initialized, convert
+	 * the buffer pages to firmware pages.
+	 */
+	if (init_ex_path && !sev_init_ex_buffer) {
+		struct page *page;
+
+		page = alloc_pages(GFP_KERNEL, get_order(NV_LENGTH));
+		if (!page) {
+			dev_err(sev->dev, "SEV: INIT_EX NV memory allocation failed\n");
+			return -ENOMEM;
+		}
+
+		sev_init_ex_buffer = page_address(page);
+
 		rc = sev_read_init_ex_file();
 		if (rc)
 			return rc;
+
+		/* If SEV-SNP is initialized, transition to firmware page. */
+		if (sev->snp_initialized) {
+			unsigned long npages;
+
+			npages = 1UL << get_order(NV_LENGTH);
+			if (rmp_mark_pages_firmware(__pa(sev_init_ex_buffer),
+						    npages, false)) {
+				dev_err(sev->dev,
+					"SEV: INIT_EX NV memory page state change failed.\n");
+				return -ENOMEM;
+			}
+		}
 	}
 
 	rc = __sev_do_init_locked(&psp_ret);
@@ -1688,8 +1716,9 @@ static void sev_firmware_shutdown(struct sev_device *sev)
 	}
 
 	if (sev_init_ex_buffer) {
-		free_pages((unsigned long)sev_init_ex_buffer,
-			   get_order(NV_LENGTH));
+		__snp_free_firmware_pages(virt_to_page(sev_init_ex_buffer),
+					  get_order(NV_LENGTH),
+					  true);
 		sev_init_ex_buffer = NULL;
 	}
 
@@ -1743,18 +1772,6 @@ void sev_pci_init(void)
 	if (sev_update_firmware(sev->dev) == 0)
 		sev_get_api_version();
 
-	/* If an init_ex_path is provided rely on INIT_EX for PSP initialization
-	 * instead of INIT.
-	 */
-	if (init_ex_path) {
-		sev_init_ex_buffer = sev_fw_alloc(NV_LENGTH);
-		if (!sev_init_ex_buffer) {
-			dev_err(sev->dev,
-				"SEV: INIT_EX NV memory allocation failed\n");
-			goto err;
-		}
-	}
-
 	/* Initialize the platform */
 	args.probe = true;
 	rc = sev_platform_init(&args);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v2 18/25] crypto: ccp: Handle legacy SEV commands when SNP is enabled
  2024-01-26  4:11 [PATCH v2 00/25] Add AMD Secure Nested Paging (SEV-SNP) Initialization Support Michael Roth
                   ` (16 preceding siblings ...)
  2024-01-26  4:11 ` [PATCH v2 17/25] crypto: ccp: Handle non-volatile INIT_EX data " Michael Roth
@ 2024-01-26  4:11 ` Michael Roth
  2024-01-26  4:11 ` [PATCH v2 19/25] iommu/amd: Clean up RMP entries for IOMMU pages during SNP shutdown Michael Roth
                   ` (7 subsequent siblings)
  25 siblings, 0 replies; 47+ messages in thread
From: Michael Roth @ 2024-01-26  4:11 UTC (permalink / raw)
  To: x86
  Cc: kvm, linux-coco, linux-mm, linux-crypto, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, tobin, bp, vbabka, kirill, ak,
	tony.luck, sathyanarayanan.kuppuswamy, alpergun, jarkko,
	ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick,
	Brijesh Singh

From: Brijesh Singh <brijesh.singh@amd.com>

The behavior of legacy SEV commands is altered when the firmware is
initialized for SNP support. In that case, all command buffer memory
that may get written to by legacy SEV commands must be marked as
firmware-owned in the RMP table prior to issuing the command.

Additionally, when a command buffer contains a system physical address
that points to additional buffers that firmware may write to, special
handling is needed depending on whether:

  1) the system physical address points to guest memory
  2) the system physical address points to host memory

To handle case #1, the pages of these buffers are changed to
firmware-owned in the RMP table before issuing the command, and restored
to hypervisor-owned after the command completes.

For case #2, a bounce buffer is used instead of the original address.

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Co-developed-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
---
 drivers/crypto/ccp/sev-dev.c | 418 ++++++++++++++++++++++++++++++++++-
 drivers/crypto/ccp/sev-dev.h |   3 +
 2 files changed, 411 insertions(+), 10 deletions(-)

diff --git a/drivers/crypto/ccp/sev-dev.c b/drivers/crypto/ccp/sev-dev.c
index 97fdd98e958c..b2ad41ce5f77 100644
--- a/drivers/crypto/ccp/sev-dev.c
+++ b/drivers/crypto/ccp/sev-dev.c
@@ -43,6 +43,15 @@
 #define SNP_MIN_API_MAJOR	1
 #define SNP_MIN_API_MINOR	51
 
+/*
+ * Maximum number of firmware-writable buffers that might be specified
+ * in the parameters of a legacy SEV command buffer.
+ */
+#define CMD_BUF_FW_WRITABLE_MAX 2
+
+/* Leave room in the descriptor array for an end-of-list indicator. */
+#define CMD_BUF_DESC_MAX (CMD_BUF_FW_WRITABLE_MAX + 1)
+
 static DEFINE_MUTEX(sev_cmd_mutex);
 static struct sev_misc_dev *misc_dev;
 
@@ -464,13 +473,344 @@ static void *sev_fw_alloc(unsigned long len)
 	return page_address(page);
 }
 
+/**
+ * struct cmd_buf_desc - descriptors for managing legacy SEV command address
+ * parameters corresponding to buffers that may be written to by firmware.
+ *
+ * @paddr_ptr: pointer the address parameter in the command buffer, which may
+ *             need to be saved/restored depending on whether a bounce buffer is
+ *             used. Must be NULL if this descriptor is only an end-of-list
+ *             indicator.
+ * @paddr_orig: storage for the original address parameter, which can be used to
+ *              restore the original value in @paddr_ptr in cases where it is
+ *              replaced with the address of a bounce buffer.
+ * @len: length of buffer located at the address originally stored at @paddr_ptr
+ * @guest_owned: true if the address corresponds to guest-owned pages, in which
+ *               case bounce buffers are not needed.
+ */
+struct cmd_buf_desc {
+	u64 *paddr_ptr;
+	u64 paddr_orig;
+	u32 len;
+	bool guest_owned;
+};
+
+/*
+ * If a legacy SEV command parameter is a memory address, those pages in
+ * turn need to be transitioned to/from firmware-owned before/after
+ * executing the firmware command.
+ *
+ * Additionally, in cases where those pages are not guest-owned, a bounce
+ * buffer is needed in place of the original memory address parameter.
+ *
+ * A set of descriptors are used to keep track of this handling, and
+ * initialized here based on the specific commands being executed.
+ */
+static void snp_populate_cmd_buf_desc_list(int cmd, void *cmd_buf,
+					   struct cmd_buf_desc *desc_list)
+{
+	switch (cmd) {
+	case SEV_CMD_PDH_CERT_EXPORT: {
+		struct sev_data_pdh_cert_export *data = cmd_buf;
+
+		desc_list[0].paddr_ptr = &data->pdh_cert_address;
+		desc_list[0].len = data->pdh_cert_len;
+		desc_list[1].paddr_ptr = &data->cert_chain_address;
+		desc_list[1].len = data->cert_chain_len;
+		break;
+	}
+	case SEV_CMD_GET_ID: {
+		struct sev_data_get_id *data = cmd_buf;
+
+		desc_list[0].paddr_ptr = &data->address;
+		desc_list[0].len = data->len;
+		break;
+	}
+	case SEV_CMD_PEK_CSR: {
+		struct sev_data_pek_csr *data = cmd_buf;
+
+		desc_list[0].paddr_ptr = &data->address;
+		desc_list[0].len = data->len;
+		break;
+	}
+	case SEV_CMD_LAUNCH_UPDATE_DATA: {
+		struct sev_data_launch_update_data *data = cmd_buf;
+
+		desc_list[0].paddr_ptr = &data->address;
+		desc_list[0].len = data->len;
+		desc_list[0].guest_owned = true;
+		break;
+	}
+	case SEV_CMD_LAUNCH_UPDATE_VMSA: {
+		struct sev_data_launch_update_vmsa *data = cmd_buf;
+
+		desc_list[0].paddr_ptr = &data->address;
+		desc_list[0].len = data->len;
+		desc_list[0].guest_owned = true;
+		break;
+	}
+	case SEV_CMD_LAUNCH_MEASURE: {
+		struct sev_data_launch_measure *data = cmd_buf;
+
+		desc_list[0].paddr_ptr = &data->address;
+		desc_list[0].len = data->len;
+		break;
+	}
+	case SEV_CMD_LAUNCH_UPDATE_SECRET: {
+		struct sev_data_launch_secret *data = cmd_buf;
+
+		desc_list[0].paddr_ptr = &data->guest_address;
+		desc_list[0].len = data->guest_len;
+		desc_list[0].guest_owned = true;
+		break;
+	}
+	case SEV_CMD_DBG_DECRYPT: {
+		struct sev_data_dbg *data = cmd_buf;
+
+		desc_list[0].paddr_ptr = &data->dst_addr;
+		desc_list[0].len = data->len;
+		desc_list[0].guest_owned = true;
+		break;
+	}
+	case SEV_CMD_DBG_ENCRYPT: {
+		struct sev_data_dbg *data = cmd_buf;
+
+		desc_list[0].paddr_ptr = &data->dst_addr;
+		desc_list[0].len = data->len;
+		desc_list[0].guest_owned = true;
+		break;
+	}
+	case SEV_CMD_ATTESTATION_REPORT: {
+		struct sev_data_attestation_report *data = cmd_buf;
+
+		desc_list[0].paddr_ptr = &data->address;
+		desc_list[0].len = data->len;
+		break;
+	}
+	case SEV_CMD_SEND_START: {
+		struct sev_data_send_start *data = cmd_buf;
+
+		desc_list[0].paddr_ptr = &data->session_address;
+		desc_list[0].len = data->session_len;
+		break;
+	}
+	case SEV_CMD_SEND_UPDATE_DATA: {
+		struct sev_data_send_update_data *data = cmd_buf;
+
+		desc_list[0].paddr_ptr = &data->hdr_address;
+		desc_list[0].len = data->hdr_len;
+		desc_list[1].paddr_ptr = &data->trans_address;
+		desc_list[1].len = data->trans_len;
+		break;
+	}
+	case SEV_CMD_SEND_UPDATE_VMSA: {
+		struct sev_data_send_update_vmsa *data = cmd_buf;
+
+		desc_list[0].paddr_ptr = &data->hdr_address;
+		desc_list[0].len = data->hdr_len;
+		desc_list[1].paddr_ptr = &data->trans_address;
+		desc_list[1].len = data->trans_len;
+		break;
+	}
+	case SEV_CMD_RECEIVE_UPDATE_DATA: {
+		struct sev_data_receive_update_data *data = cmd_buf;
+
+		desc_list[0].paddr_ptr = &data->guest_address;
+		desc_list[0].len = data->guest_len;
+		desc_list[0].guest_owned = true;
+		break;
+	}
+	case SEV_CMD_RECEIVE_UPDATE_VMSA: {
+		struct sev_data_receive_update_vmsa *data = cmd_buf;
+
+		desc_list[0].paddr_ptr = &data->guest_address;
+		desc_list[0].len = data->guest_len;
+		desc_list[0].guest_owned = true;
+		break;
+	}
+	default:
+		break;
+	}
+}
+
+static int snp_map_cmd_buf_desc(struct cmd_buf_desc *desc)
+{
+	unsigned int npages;
+
+	if (!desc->len)
+		return 0;
+
+	/* Allocate a bounce buffer if this isn't a guest owned page. */
+	if (!desc->guest_owned) {
+		struct page *page;
+
+		page = alloc_pages(GFP_KERNEL_ACCOUNT, get_order(desc->len));
+		if (!page) {
+			pr_warn("Failed to allocate bounce buffer for SEV legacy command.\n");
+			return -ENOMEM;
+		}
+
+		desc->paddr_orig = *desc->paddr_ptr;
+		*desc->paddr_ptr = __psp_pa(page_to_virt(page));
+	}
+
+	npages = PAGE_ALIGN(desc->len) >> PAGE_SHIFT;
+
+	/* Transition the buffer to firmware-owned. */
+	if (rmp_mark_pages_firmware(*desc->paddr_ptr, npages, true)) {
+		pr_warn("Error moving pages to firmware-owned state for SEV legacy command.\n");
+		return -EFAULT;
+	}
+
+	return 0;
+}
+
+static int snp_unmap_cmd_buf_desc(struct cmd_buf_desc *desc)
+{
+	unsigned int npages;
+
+	if (!desc->len)
+		return 0;
+
+	npages = PAGE_ALIGN(desc->len) >> PAGE_SHIFT;
+
+	/* Transition the buffers back to hypervisor-owned. */
+	if (snp_reclaim_pages(*desc->paddr_ptr, npages, true)) {
+		pr_warn("Failed to reclaim firmware-owned pages while issuing SEV legacy command.\n");
+		return -EFAULT;
+	}
+
+	/* Copy data from bounce buffer and then free it. */
+	if (!desc->guest_owned) {
+		void *bounce_buf = __va(__sme_clr(*desc->paddr_ptr));
+		void *dst_buf = __va(__sme_clr(desc->paddr_orig));
+
+		memcpy(dst_buf, bounce_buf, desc->len);
+		__free_pages(virt_to_page(bounce_buf), get_order(desc->len));
+
+		/* Restore the original address in the command buffer. */
+		*desc->paddr_ptr = desc->paddr_orig;
+	}
+
+	return 0;
+}
+
+static int snp_map_cmd_buf_desc_list(int cmd, void *cmd_buf, struct cmd_buf_desc *desc_list)
+{
+	int i;
+
+	snp_populate_cmd_buf_desc_list(cmd, cmd_buf, desc_list);
+
+	for (i = 0; i < CMD_BUF_DESC_MAX; i++) {
+		struct cmd_buf_desc *desc = &desc_list[i];
+
+		if (!desc->paddr_ptr)
+			break;
+
+		if (snp_map_cmd_buf_desc(desc))
+			goto err_unmap;
+	}
+
+	return 0;
+
+err_unmap:
+	for (i--; i >= 0; i--)
+		snp_unmap_cmd_buf_desc(&desc_list[i]);
+
+	return -EFAULT;
+}
+
+static int snp_unmap_cmd_buf_desc_list(struct cmd_buf_desc *desc_list)
+{
+	int i, ret = 0;
+
+	for (i = 0; i < CMD_BUF_DESC_MAX; i++) {
+		struct cmd_buf_desc *desc = &desc_list[i];
+
+		if (!desc->paddr_ptr)
+			break;
+
+		if (snp_unmap_cmd_buf_desc(&desc_list[i]))
+			ret = -EFAULT;
+	}
+
+	return ret;
+}
+
+static bool sev_cmd_buf_writable(int cmd)
+{
+	switch (cmd) {
+	case SEV_CMD_PLATFORM_STATUS:
+	case SEV_CMD_GUEST_STATUS:
+	case SEV_CMD_LAUNCH_START:
+	case SEV_CMD_RECEIVE_START:
+	case SEV_CMD_LAUNCH_MEASURE:
+	case SEV_CMD_SEND_START:
+	case SEV_CMD_SEND_UPDATE_DATA:
+	case SEV_CMD_SEND_UPDATE_VMSA:
+	case SEV_CMD_PEK_CSR:
+	case SEV_CMD_PDH_CERT_EXPORT:
+	case SEV_CMD_GET_ID:
+	case SEV_CMD_ATTESTATION_REPORT:
+		return true;
+	default:
+		return false;
+	}
+}
+
+/* After SNP is INIT'ed, the behavior of legacy SEV commands is changed. */
+static bool snp_legacy_handling_needed(int cmd)
+{
+	struct sev_device *sev = psp_master->sev_data;
+
+	return cmd < SEV_CMD_SNP_INIT && sev->snp_initialized;
+}
+
+static int snp_prep_cmd_buf(int cmd, void *cmd_buf, struct cmd_buf_desc *desc_list)
+{
+	if (!snp_legacy_handling_needed(cmd))
+		return 0;
+
+	if (snp_map_cmd_buf_desc_list(cmd, cmd_buf, desc_list))
+		return -EFAULT;
+
+	/*
+	 * Before command execution, the command buffer needs to be put into
+	 * the firmware-owned state.
+	 */
+	if (sev_cmd_buf_writable(cmd)) {
+		if (rmp_mark_pages_firmware(__pa(cmd_buf), 1, true))
+			return -EFAULT;
+	}
+
+	return 0;
+}
+
+static int snp_reclaim_cmd_buf(int cmd, void *cmd_buf)
+{
+	if (!snp_legacy_handling_needed(cmd))
+		return 0;
+
+	/*
+	 * After command completion, the command buffer needs to be put back
+	 * into the hypervisor-owned state.
+	 */
+	if (sev_cmd_buf_writable(cmd))
+		if (snp_reclaim_pages(__pa(cmd_buf), 1, true))
+			return -EFAULT;
+
+	return 0;
+}
+
 static int __sev_do_cmd_locked(int cmd, void *data, int *psp_ret)
 {
+	struct cmd_buf_desc desc_list[CMD_BUF_DESC_MAX] = {0};
 	struct psp_device *psp = psp_master;
 	struct sev_device *sev;
 	unsigned int cmdbuff_hi, cmdbuff_lo;
 	unsigned int phys_lsb, phys_msb;
 	unsigned int reg, ret = 0;
+	void *cmd_buf;
 	int buf_len;
 
 	if (!psp || !psp->sev_data)
@@ -490,12 +830,47 @@ static int __sev_do_cmd_locked(int cmd, void *data, int *psp_ret)
 	 * work for some memory, e.g. vmalloc'd addresses, and @data may not be
 	 * physically contiguous.
 	 */
-	if (data)
-		memcpy(sev->cmd_buf, data, buf_len);
+	if (data) {
+		/*
+		 * Commands are generally issued one at a time and require the
+		 * sev_cmd_mutex, but there could be recursive firmware requests
+		 * due to SEV_CMD_SNP_PAGE_RECLAIM needing to be issued while
+		 * preparing buffers for another command. This is the only known
+		 * case of nesting in the current code, so exactly one
+		 * additional command buffer is available for that purpose.
+		 */
+		if (!sev->cmd_buf_active) {
+			cmd_buf = sev->cmd_buf;
+			sev->cmd_buf_active = true;
+		} else if (!sev->cmd_buf_backup_active) {
+			cmd_buf = sev->cmd_buf_backup;
+			sev->cmd_buf_backup_active = true;
+		} else {
+			dev_err(sev->dev,
+				"SEV: too many firmware commands in progress, no command buffers available.\n");
+			return -EBUSY;
+		}
+
+		memcpy(cmd_buf, data, buf_len);
+
+		/*
+		 * The behavior of the SEV-legacy commands is altered when the
+		 * SNP firmware is in the INIT state.
+		 */
+		ret = snp_prep_cmd_buf(cmd, cmd_buf, desc_list);
+		if (ret) {
+			dev_err(sev->dev,
+				"SEV: failed to prepare buffer for legacy command 0x%x. Error: %d\n",
+				cmd, ret);
+			return ret;
+		}
+	} else {
+		cmd_buf = sev->cmd_buf;
+	}
 
 	/* Get the physical address of the command buffer */
-	phys_lsb = data ? lower_32_bits(__psp_pa(sev->cmd_buf)) : 0;
-	phys_msb = data ? upper_32_bits(__psp_pa(sev->cmd_buf)) : 0;
+	phys_lsb = data ? lower_32_bits(__psp_pa(cmd_buf)) : 0;
+	phys_msb = data ? upper_32_bits(__psp_pa(cmd_buf)) : 0;
 
 	dev_dbg(sev->dev, "sev command id %#x buffer 0x%08x%08x timeout %us\n",
 		cmd, phys_msb, phys_lsb, psp_timeout);
@@ -549,15 +924,36 @@ static int __sev_do_cmd_locked(int cmd, void *data, int *psp_ret)
 		ret = sev_write_init_ex_file_if_required(cmd);
 	}
 
-	print_hex_dump_debug("(out): ", DUMP_PREFIX_OFFSET, 16, 2, data,
-			     buf_len, false);
-
 	/*
 	 * Copy potential output from the PSP back to data.  Do this even on
 	 * failure in case the caller wants to glean something from the error.
 	 */
-	if (data)
-		memcpy(data, sev->cmd_buf, buf_len);
+	if (data) {
+		int ret_reclaim;
+		/*
+		 * Restore the page state after the command completes.
+		 */
+		ret_reclaim = snp_reclaim_cmd_buf(cmd, cmd_buf);
+		if (ret_reclaim) {
+			dev_err(sev->dev,
+				"SEV: failed to reclaim buffer for legacy command %#x. Error: %d\n",
+				cmd, ret_reclaim);
+			return ret_reclaim;
+		}
+
+		memcpy(data, cmd_buf, buf_len);
+
+		if (sev->cmd_buf_backup_active)
+			sev->cmd_buf_backup_active = false;
+		else
+			sev->cmd_buf_active = false;
+
+		if (snp_unmap_cmd_buf_desc_list(desc_list))
+			return -EFAULT;
+	}
+
+	print_hex_dump_debug("(out): ", DUMP_PREFIX_OFFSET, 16, 2, data,
+			     buf_len, false);
 
 	return ret;
 }
@@ -1657,10 +2053,12 @@ int sev_dev_init(struct psp_device *psp)
 	if (!sev)
 		goto e_err;
 
-	sev->cmd_buf = (void *)devm_get_free_pages(dev, GFP_KERNEL, 0);
+	sev->cmd_buf = (void *)devm_get_free_pages(dev, GFP_KERNEL, 1);
 	if (!sev->cmd_buf)
 		goto e_sev;
 
+	sev->cmd_buf_backup = (uint8_t *)sev->cmd_buf + PAGE_SIZE;
+
 	psp->sev_data = sev;
 
 	sev->dev = dev;
diff --git a/drivers/crypto/ccp/sev-dev.h b/drivers/crypto/ccp/sev-dev.h
index 85506325051a..3e4e5574e88a 100644
--- a/drivers/crypto/ccp/sev-dev.h
+++ b/drivers/crypto/ccp/sev-dev.h
@@ -52,6 +52,9 @@ struct sev_device {
 	u8 build;
 
 	void *cmd_buf;
+	void *cmd_buf_backup;
+	bool cmd_buf_active;
+	bool cmd_buf_backup_active;
 
 	bool snp_initialized;
 };
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v2 19/25] iommu/amd: Clean up RMP entries for IOMMU pages during SNP shutdown
  2024-01-26  4:11 [PATCH v2 00/25] Add AMD Secure Nested Paging (SEV-SNP) Initialization Support Michael Roth
                   ` (17 preceding siblings ...)
  2024-01-26  4:11 ` [PATCH v2 18/25] crypto: ccp: Handle legacy SEV commands " Michael Roth
@ 2024-01-26  4:11 ` Michael Roth
  2024-01-26  4:11 ` [PATCH v2 20/25] crypto: ccp: Add panic notifier for SEV/SNP firmware shutdown on kdump Michael Roth
                   ` (6 subsequent siblings)
  25 siblings, 0 replies; 47+ messages in thread
From: Michael Roth @ 2024-01-26  4:11 UTC (permalink / raw)
  To: x86
  Cc: kvm, linux-coco, linux-mm, linux-crypto, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, tobin, bp, vbabka, kirill, ak,
	tony.luck, sathyanarayanan.kuppuswamy, alpergun, jarkko,
	ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick

From: Ashish Kalra <ashish.kalra@amd.com>

Add a new IOMMU API interface amd_iommu_snp_disable() to transition
IOMMU pages to Hypervisor state from Reclaim state after SNP_SHUTDOWN_EX
command. Invoke this API from the CCP driver after SNP_SHUTDOWN_EX
command.

Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 drivers/crypto/ccp/sev-dev.c | 20 +++++++++
 drivers/iommu/amd/init.c     | 79 ++++++++++++++++++++++++++++++++++++
 include/linux/amd-iommu.h    |  6 +++
 3 files changed, 105 insertions(+)

diff --git a/drivers/crypto/ccp/sev-dev.c b/drivers/crypto/ccp/sev-dev.c
index b2ad41ce5f77..d26bff55ec93 100644
--- a/drivers/crypto/ccp/sev-dev.c
+++ b/drivers/crypto/ccp/sev-dev.c
@@ -26,6 +26,7 @@
 #include <linux/fs.h>
 #include <linux/fs_struct.h>
 #include <linux/psp.h>
+#include <linux/amd-iommu.h>
 
 #include <asm/smp.h>
 #include <asm/cacheflush.h>
@@ -1633,6 +1634,25 @@ static int __sev_snp_shutdown_locked(int *error)
 		return ret;
 	}
 
+	/*
+	 * SNP_SHUTDOWN_EX with IOMMU_SNP_SHUTDOWN set to 1 disables SNP
+	 * enforcement by the IOMMU and also transitions all pages
+	 * associated with the IOMMU to the Reclaim state.
+	 * Firmware was transitioning the IOMMU pages to Hypervisor state
+	 * before version 1.53. But, accounting for the number of assigned
+	 * 4kB pages in a 2M page was done incorrectly by not transitioning
+	 * to the Reclaim state. This resulted in RMP #PF when later accessing
+	 * the 2M page containing those pages during kexec boot. Hence, the
+	 * firmware now transitions these pages to Reclaim state and hypervisor
+	 * needs to transition these pages to shared state. SNP Firmware
+	 * version 1.53 and above are needed for kexec boot.
+	 */
+	ret = amd_iommu_snp_disable();
+	if (ret) {
+		dev_err(sev->dev, "SNP IOMMU shutdown failed\n");
+		return ret;
+	}
+
 	sev->snp_initialized = false;
 	dev_dbg(sev->dev, "SEV-SNP firmware shutdown\n");
 
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 3a4eeb26d515..88bb08ae39b2 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -30,6 +30,7 @@
 #include <asm/io_apic.h>
 #include <asm/irq_remapping.h>
 #include <asm/set_memory.h>
+#include <asm/sev.h>
 
 #include <linux/crash_dump.h>
 
@@ -3797,3 +3798,81 @@ int amd_iommu_pc_set_reg(struct amd_iommu *iommu, u8 bank, u8 cntr, u8 fxn, u64
 
 	return iommu_pc_get_set_reg(iommu, bank, cntr, fxn, value, true);
 }
+
+#ifdef CONFIG_KVM_AMD_SEV
+static int iommu_page_make_shared(void *page)
+{
+	unsigned long paddr, pfn;
+
+	paddr = iommu_virt_to_phys(page);
+	/* Cbit maybe set in the paddr */
+	pfn = __sme_clr(paddr) >> PAGE_SHIFT;
+
+	if (!(pfn % PTRS_PER_PMD)) {
+		int ret, level;
+		bool assigned;
+
+		ret = snp_lookup_rmpentry(pfn, &assigned, &level);
+		if (ret)
+			pr_warn("IOMMU PFN %lx RMP lookup failed, ret %d\n",
+				pfn, ret);
+
+		if (!assigned)
+			pr_warn("IOMMU PFN %lx not assigned in RMP table\n",
+				pfn);
+
+		if (level > PG_LEVEL_4K) {
+			ret = psmash(pfn);
+			if (ret) {
+				pr_warn("IOMMU PFN %lx had a huge RMP entry, but attempted psmash failed, ret: %d, level: %d\n",
+					pfn, ret, level);
+			}
+		}
+	}
+
+	return rmp_make_shared(pfn, PG_LEVEL_4K);
+}
+
+static int iommu_make_shared(void *va, size_t size)
+{
+	void *page;
+	int ret;
+
+	if (!va)
+		return 0;
+
+	for (page = va; page < (va + size); page += PAGE_SIZE) {
+		ret = iommu_page_make_shared(page);
+		if (ret)
+			return ret;
+	}
+
+	return 0;
+}
+
+int amd_iommu_snp_disable(void)
+{
+	struct amd_iommu *iommu;
+	int ret;
+
+	if (!amd_iommu_snp_en)
+		return 0;
+
+	for_each_iommu(iommu) {
+		ret = iommu_make_shared(iommu->evt_buf, EVT_BUFFER_SIZE);
+		if (ret)
+			return ret;
+
+		ret = iommu_make_shared(iommu->ppr_log, PPR_LOG_SIZE);
+		if (ret)
+			return ret;
+
+		ret = iommu_make_shared((void *)iommu->cmd_sem, PAGE_SIZE);
+		if (ret)
+			return ret;
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(amd_iommu_snp_disable);
+#endif
diff --git a/include/linux/amd-iommu.h b/include/linux/amd-iommu.h
index 7365be00a795..2b90c48a6a87 100644
--- a/include/linux/amd-iommu.h
+++ b/include/linux/amd-iommu.h
@@ -85,4 +85,10 @@ int amd_iommu_pc_get_reg(struct amd_iommu *iommu, u8 bank, u8 cntr, u8 fxn,
 		u64 *value);
 struct amd_iommu *get_amd_iommu(unsigned int idx);
 
+#ifdef CONFIG_KVM_AMD_SEV
+int amd_iommu_snp_disable(void);
+#else
+static inline int amd_iommu_snp_disable(void) { return 0; }
+#endif
+
 #endif /* _ASM_X86_AMD_IOMMU_H */
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v2 20/25] crypto: ccp: Add panic notifier for SEV/SNP firmware shutdown on kdump
  2024-01-26  4:11 [PATCH v2 00/25] Add AMD Secure Nested Paging (SEV-SNP) Initialization Support Michael Roth
                   ` (18 preceding siblings ...)
  2024-01-26  4:11 ` [PATCH v2 19/25] iommu/amd: Clean up RMP entries for IOMMU pages during SNP shutdown Michael Roth
@ 2024-01-26  4:11 ` Michael Roth
  2024-01-26  4:11 ` [PATCH v2 21/25] KVM: SEV: Make AVIC backing, VMSA and VMCB memory allocation SNP safe Michael Roth
                   ` (5 subsequent siblings)
  25 siblings, 0 replies; 47+ messages in thread
From: Michael Roth @ 2024-01-26  4:11 UTC (permalink / raw)
  To: x86
  Cc: kvm, linux-coco, linux-mm, linux-crypto, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, tobin, bp, vbabka, kirill, ak,
	tony.luck, sathyanarayanan.kuppuswamy, alpergun, jarkko,
	ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick

From: Ashish Kalra <ashish.kalra@amd.com>

Add a kdump safe version of sev_firmware_shutdown() and register it as a
crash_kexec_post_notifier so it will be invoked during panic/crash to do
SEV/SNP shutdown. This is required for transitioning all IOMMU pages to
reclaim/hypervisor state, otherwise re-init of IOMMU pages during
crashdump kernel boot fails and panics the crashdump kernel.

This panic notifier runs in atomic context, hence it ensures not to
acquire any locks/mutexes and polls for PSP command completion instead
of depending on PSP command completion interrupt.

Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
[mdr: remove use of "we" in comments]
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 arch/x86/include/asm/sev.h   |   2 +
 arch/x86/kernel/crash.c      |   3 +
 arch/x86/kernel/sev.c        |  10 ++++
 arch/x86/virt/svm/sev.c      |   6 ++
 drivers/crypto/ccp/sev-dev.c | 111 +++++++++++++++++++++++++----------
 5 files changed, 102 insertions(+), 30 deletions(-)

diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 435ba9bc4510..b3ba32c6fc9f 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -227,6 +227,7 @@ int snp_issue_guest_request(u64 exit_code, struct snp_req_data *input, struct sn
 void snp_accept_memory(phys_addr_t start, phys_addr_t end);
 u64 snp_get_unsupported_features(u64 status);
 u64 sev_get_status(void);
+void kdump_sev_callback(void);
 #else
 static inline void sev_es_ist_enter(struct pt_regs *regs) { }
 static inline void sev_es_ist_exit(void) { }
@@ -255,6 +256,7 @@ static inline int snp_issue_guest_request(u64 exit_code, struct snp_req_data *in
 static inline void snp_accept_memory(phys_addr_t start, phys_addr_t end) { }
 static inline u64 snp_get_unsupported_features(u64 status) { return 0; }
 static inline u64 sev_get_status(void) { return 0; }
+static inline void kdump_sev_callback(void) { }
 #endif
 
 #ifdef CONFIG_KVM_AMD_SEV
diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index b6b044356f1b..d184c29398db 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -40,6 +40,7 @@
 #include <asm/intel_pt.h>
 #include <asm/crash.h>
 #include <asm/cmdline.h>
+#include <asm/sev.h>
 
 /* Used while preparing memory map entries for second kernel */
 struct crash_memmap_data {
@@ -59,6 +60,8 @@ static void kdump_nmi_callback(int cpu, struct pt_regs *regs)
 	 */
 	cpu_emergency_stop_pt();
 
+	kdump_sev_callback();
+
 	disable_local_APIC();
 }
 
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index 1ec753331524..002af6c30601 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -2265,3 +2265,13 @@ static int __init snp_init_platform_device(void)
 	return 0;
 }
 device_initcall(snp_init_platform_device);
+
+void kdump_sev_callback(void)
+{
+	/*
+	 * Do wbinvd() on remote CPUs when SNP is enabled in order to
+	 * safely do SNP_SHUTDOWN on the local CPU.
+	 */
+	if (cpu_feature_enabled(X86_FEATURE_SEV_SNP))
+		wbinvd();
+}
diff --git a/arch/x86/virt/svm/sev.c b/arch/x86/virt/svm/sev.c
index 649ac1bb6b0e..3d33b75b4606 100644
--- a/arch/x86/virt/svm/sev.c
+++ b/arch/x86/virt/svm/sev.c
@@ -216,6 +216,12 @@ static int __init snp_rmptable_init(void)
 
 	cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "x86/rmptable_init:online", __snp_enable, NULL);
 
+	/*
+	 * Setting crash_kexec_post_notifiers to 'true' to ensure that SNP panic
+	 * notifier is invoked to do SNP IOMMU shutdown before kdump.
+	 */
+	crash_kexec_post_notifiers = true;
+
 	return 0;
 
 nosnp:
diff --git a/drivers/crypto/ccp/sev-dev.c b/drivers/crypto/ccp/sev-dev.c
index d26bff55ec93..9a395f0f9b10 100644
--- a/drivers/crypto/ccp/sev-dev.c
+++ b/drivers/crypto/ccp/sev-dev.c
@@ -21,6 +21,7 @@
 #include <linux/hw_random.h>
 #include <linux/ccp.h>
 #include <linux/firmware.h>
+#include <linux/panic_notifier.h>
 #include <linux/gfp.h>
 #include <linux/cpufeature.h>
 #include <linux/fs.h>
@@ -143,6 +144,25 @@ static int sev_wait_cmd_ioc(struct sev_device *sev,
 {
 	int ret;
 
+	/*
+	 * If invoked during panic handling, local interrupts are disabled,
+	 * so the PSP command completion interrupt can't be used. Poll for
+	 * PSP command completion instead.
+	 */
+	if (irqs_disabled()) {
+		unsigned long timeout_usecs = (timeout * USEC_PER_SEC) / 10;
+
+		/* Poll for SEV command completion: */
+		while (timeout_usecs--) {
+			*reg = ioread32(sev->io_regs + sev->vdata->cmdresp_reg);
+			if (*reg & PSP_CMDRESP_RESP)
+				return 0;
+
+			udelay(10);
+		}
+		return -ETIMEDOUT;
+	}
+
 	ret = wait_event_timeout(sev->int_queue,
 			sev->int_rcvd, timeout * HZ);
 	if (!ret)
@@ -1316,17 +1336,6 @@ static int __sev_platform_shutdown_locked(int *error)
 	return ret;
 }
 
-static int sev_platform_shutdown(int *error)
-{
-	int rc;
-
-	mutex_lock(&sev_cmd_mutex);
-	rc = __sev_platform_shutdown_locked(NULL);
-	mutex_unlock(&sev_cmd_mutex);
-
-	return rc;
-}
-
 static int sev_get_platform_state(int *state, int *error)
 {
 	struct sev_user_data_status data;
@@ -1602,7 +1611,7 @@ static int sev_update_firmware(struct device *dev)
 	return ret;
 }
 
-static int __sev_snp_shutdown_locked(int *error)
+static int __sev_snp_shutdown_locked(int *error, bool panic)
 {
 	struct sev_device *sev = psp_master->sev_data;
 	struct sev_data_snp_shutdown_ex data;
@@ -1615,7 +1624,16 @@ static int __sev_snp_shutdown_locked(int *error)
 	data.len = sizeof(data);
 	data.iommu_snp_shutdown = 1;
 
-	wbinvd_on_all_cpus();
+	/*
+	 * If invoked during panic handling, local interrupts are disabled
+	 * and all CPUs are stopped, so wbinvd_on_all_cpus() can't be called.
+	 * In that case, a wbinvd() is done on remote CPUs via the NMI
+	 * callback, so only a local wbinvd() is needed here.
+	 */
+	if (!panic)
+		wbinvd_on_all_cpus();
+	else
+		wbinvd();
 
 	ret = __sev_do_cmd_locked(SEV_CMD_SNP_SHUTDOWN_EX, &data, error);
 	/* SHUTDOWN may require DF_FLUSH */
@@ -1659,17 +1677,6 @@ static int __sev_snp_shutdown_locked(int *error)
 	return ret;
 }
 
-static int sev_snp_shutdown(int *error)
-{
-	int rc;
-
-	mutex_lock(&sev_cmd_mutex);
-	rc = __sev_snp_shutdown_locked(error);
-	mutex_unlock(&sev_cmd_mutex);
-
-	return rc;
-}
-
 static int sev_ioctl_do_pek_import(struct sev_issue_cmd *argp, bool writable)
 {
 	struct sev_device *sev = psp_master->sev_data;
@@ -2117,19 +2124,28 @@ int sev_dev_init(struct psp_device *psp)
 	return ret;
 }
 
-static void sev_firmware_shutdown(struct sev_device *sev)
+static void __sev_firmware_shutdown(struct sev_device *sev, bool panic)
 {
 	int error;
 
-	sev_platform_shutdown(NULL);
+	__sev_platform_shutdown_locked(NULL);
 
 	if (sev_es_tmr) {
-		/* The TMR area was encrypted, flush it from the cache */
-		wbinvd_on_all_cpus();
+		/*
+		 * The TMR area was encrypted, flush it from the cache
+		 *
+		 * If invoked during panic handling, local interrupts are
+		 * disabled and all CPUs are stopped, so wbinvd_on_all_cpus()
+		 * can't be used. In that case, wbinvd() is done on remote CPUs
+		 * via the NMI callback, and done for this CPU later during
+		 * SNP shutdown, so wbinvd_on_all_cpus() can be skipped.
+		 */
+		if (!panic)
+			wbinvd_on_all_cpus();
 
 		__snp_free_firmware_pages(virt_to_page(sev_es_tmr),
 					  get_order(sev_es_tmr_size),
-					  false);
+					  true);
 		sev_es_tmr = NULL;
 	}
 
@@ -2145,7 +2161,14 @@ static void sev_firmware_shutdown(struct sev_device *sev)
 		snp_range_list = NULL;
 	}
 
-	sev_snp_shutdown(&error);
+	__sev_snp_shutdown_locked(&error, panic);
+}
+
+static void sev_firmware_shutdown(struct sev_device *sev)
+{
+	mutex_lock(&sev_cmd_mutex);
+	__sev_firmware_shutdown(sev, false);
+	mutex_unlock(&sev_cmd_mutex);
 }
 
 void sev_dev_destroy(struct psp_device *psp)
@@ -2163,6 +2186,29 @@ void sev_dev_destroy(struct psp_device *psp)
 	psp_clear_sev_irq_handler(psp);
 }
 
+static int snp_shutdown_on_panic(struct notifier_block *nb,
+				 unsigned long reason, void *arg)
+{
+	struct sev_device *sev = psp_master->sev_data;
+
+	/*
+	 * If sev_cmd_mutex is already acquired, then it's likely
+	 * another PSP command is in flight and issuing a shutdown
+	 * would fail in unexpected ways. Rather than create even
+	 * more confusion during a panic, just bail out here.
+	 */
+	if (mutex_is_locked(&sev_cmd_mutex))
+		return NOTIFY_DONE;
+
+	__sev_firmware_shutdown(sev, true);
+
+	return NOTIFY_DONE;
+}
+
+static struct notifier_block snp_panic_notifier = {
+	.notifier_call = snp_shutdown_on_panic,
+};
+
 int sev_issue_cmd_external_user(struct file *filep, unsigned int cmd,
 				void *data, int *error)
 {
@@ -2200,6 +2246,8 @@ void sev_pci_init(void)
 	dev_info(sev->dev, "SEV%s API:%d.%d build:%d\n", sev->snp_initialized ?
 		"-SNP" : "", sev->api_major, sev->api_minor, sev->build);
 
+	atomic_notifier_chain_register(&panic_notifier_list,
+				       &snp_panic_notifier);
 	return;
 
 err:
@@ -2214,4 +2262,7 @@ void sev_pci_exit(void)
 		return;
 
 	sev_firmware_shutdown(sev);
+
+	atomic_notifier_chain_unregister(&panic_notifier_list,
+					 &snp_panic_notifier);
 }
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v2 21/25] KVM: SEV: Make AVIC backing, VMSA and VMCB memory allocation SNP safe
  2024-01-26  4:11 [PATCH v2 00/25] Add AMD Secure Nested Paging (SEV-SNP) Initialization Support Michael Roth
                   ` (19 preceding siblings ...)
  2024-01-26  4:11 ` [PATCH v2 20/25] crypto: ccp: Add panic notifier for SEV/SNP firmware shutdown on kdump Michael Roth
@ 2024-01-26  4:11 ` Michael Roth
  2024-01-26 11:00   ` Paolo Bonzini
  2024-01-26  4:11 ` [PATCH v2 22/25] x86/cpufeatures: Enable/unmask SEV-SNP CPU feature Michael Roth
                   ` (4 subsequent siblings)
  25 siblings, 1 reply; 47+ messages in thread
From: Michael Roth @ 2024-01-26  4:11 UTC (permalink / raw)
  To: x86
  Cc: kvm, linux-coco, linux-mm, linux-crypto, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, tobin, bp, vbabka, kirill, ak,
	tony.luck, sathyanarayanan.kuppuswamy, alpergun, jarkko,
	ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick,
	Brijesh Singh, Marc Orr

From: Brijesh Singh <brijesh.singh@amd.com>

Implement a workaround for an SNP erratum where the CPU will incorrectly
signal an RMP violation #PF if a hugepage (2MB or 1GB) collides with the
RMP entry of a VMCB, VMSA or AVIC backing page.

When SEV-SNP is globally enabled, the CPU marks the VMCB, VMSA, and AVIC
backing pages as "in-use" via a reserved bit in the corresponding RMP
entry after a successful VMRUN. This is done for _all_ VMs, not just
SNP-Active VMs.

If the hypervisor accesses an in-use page through a writable
translation, the CPU will throw an RMP violation #PF. On early SNP
hardware, if an in-use page is 2MB-aligned and software accesses any
part of the associated 2MB region with a hugepage, the CPU will
incorrectly treat the entire 2MB region as in-use and signal a an RMP
violation #PF.

To avoid this, the recommendation is to not use a 2MB-aligned page for
the VMCB, VMSA or AVIC pages. Add a generic allocator that will ensure
that the page returned is not 2MB-aligned and is safe to be used when
SEV-SNP is enabled. Also implement similar handling for the VMCB/VMSA
pages of nested guests.

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Co-developed-by: Marc Orr <marcorr@google.com>
Signed-off-by: Marc Orr <marcorr@google.com>
Reported-by: Alper Gun <alpergun@google.com> # for nested VMSA case
Co-developed-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
[mdr: squash in nested guest handling from Ashish, commit msg fixups]
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 arch/x86/include/asm/kvm-x86-ops.h |  1 +
 arch/x86/include/asm/kvm_host.h    |  1 +
 arch/x86/kvm/lapic.c               |  5 ++++-
 arch/x86/kvm/svm/nested.c          |  2 +-
 arch/x86/kvm/svm/sev.c             | 32 ++++++++++++++++++++++++++++++
 arch/x86/kvm/svm/svm.c             | 17 +++++++++++++---
 arch/x86/kvm/svm/svm.h             |  1 +
 7 files changed, 54 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
index 378ed944b849..ab24ce207988 100644
--- a/arch/x86/include/asm/kvm-x86-ops.h
+++ b/arch/x86/include/asm/kvm-x86-ops.h
@@ -138,6 +138,7 @@ KVM_X86_OP(complete_emulated_msr)
 KVM_X86_OP(vcpu_deliver_sipi_vector)
 KVM_X86_OP_OPTIONAL_RET0(vcpu_get_apicv_inhibit_reasons);
 KVM_X86_OP_OPTIONAL(get_untagged_addr)
+KVM_X86_OP_OPTIONAL(alloc_apic_backing_page)
 
 #undef KVM_X86_OP
 #undef KVM_X86_OP_OPTIONAL
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index b5b2d0fde579..5c12af29fd9b 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1794,6 +1794,7 @@ struct kvm_x86_ops {
 	unsigned long (*vcpu_get_apicv_inhibit_reasons)(struct kvm_vcpu *vcpu);
 
 	gva_t (*get_untagged_addr)(struct kvm_vcpu *vcpu, gva_t gva, unsigned int flags);
+	void *(*alloc_apic_backing_page)(struct kvm_vcpu *vcpu);
 };
 
 struct kvm_x86_nested_ops {
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 3242f3da2457..1edf93ee3395 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -2815,7 +2815,10 @@ int kvm_create_lapic(struct kvm_vcpu *vcpu, int timer_advance_ns)
 
 	vcpu->arch.apic = apic;
 
-	apic->regs = (void *)get_zeroed_page(GFP_KERNEL_ACCOUNT);
+	if (kvm_x86_ops.alloc_apic_backing_page)
+		apic->regs = static_call(kvm_x86_alloc_apic_backing_page)(vcpu);
+	else
+		apic->regs = (void *)get_zeroed_page(GFP_KERNEL_ACCOUNT);
 	if (!apic->regs) {
 		printk(KERN_ERR "malloc apic regs error for vcpu %x\n",
 		       vcpu->vcpu_id);
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index dee62362a360..55b9a6d96bcf 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -1181,7 +1181,7 @@ int svm_allocate_nested(struct vcpu_svm *svm)
 	if (svm->nested.initialized)
 		return 0;
 
-	vmcb02_page = alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO);
+	vmcb02_page = snp_safe_alloc_page(&svm->vcpu);
 	if (!vmcb02_page)
 		return -ENOMEM;
 	svm->nested.vmcb02.ptr = page_address(vmcb02_page);
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 564091f386f7..f99435b6648f 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -3163,3 +3163,35 @@ void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector)
 
 	ghcb_set_sw_exit_info_2(svm->sev_es.ghcb, 1);
 }
+
+struct page *snp_safe_alloc_page(struct kvm_vcpu *vcpu)
+{
+	unsigned long pfn;
+	struct page *p;
+
+	if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP))
+		return alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO);
+
+	/*
+	 * Allocate an SNP-safe page to workaround the SNP erratum where
+	 * the CPU will incorrectly signal an RMP violation #PF if a
+	 * hugepage (2MB or 1GB) collides with the RMP entry of a
+	 * 2MB-aligned VMCB, VMSA, or AVIC backing page.
+	 *
+	 * Allocate one extra page, choose a page which is not
+	 * 2MB-aligned, and free the other.
+	 */
+	p = alloc_pages(GFP_KERNEL_ACCOUNT | __GFP_ZERO, 1);
+	if (!p)
+		return NULL;
+
+	split_page(p, 1);
+
+	pfn = page_to_pfn(p);
+	if (IS_ALIGNED(pfn, PTRS_PER_PMD))
+		__free_page(p++);
+	else
+		__free_page(p + 1);
+
+	return p;
+}
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 61f2bdc9f4f8..272d5ed37ce7 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -703,7 +703,7 @@ static int svm_cpu_init(int cpu)
 	int ret = -ENOMEM;
 
 	memset(sd, 0, sizeof(struct svm_cpu_data));
-	sd->save_area = alloc_page(GFP_KERNEL | __GFP_ZERO);
+	sd->save_area = snp_safe_alloc_page(NULL);
 	if (!sd->save_area)
 		return ret;
 
@@ -1421,7 +1421,7 @@ static int svm_vcpu_create(struct kvm_vcpu *vcpu)
 	svm = to_svm(vcpu);
 
 	err = -ENOMEM;
-	vmcb01_page = alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO);
+	vmcb01_page = snp_safe_alloc_page(vcpu);
 	if (!vmcb01_page)
 		goto out;
 
@@ -1430,7 +1430,7 @@ static int svm_vcpu_create(struct kvm_vcpu *vcpu)
 		 * SEV-ES guests require a separate VMSA page used to contain
 		 * the encrypted register state of the guest.
 		 */
-		vmsa_page = alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO);
+		vmsa_page = snp_safe_alloc_page(vcpu);
 		if (!vmsa_page)
 			goto error_free_vmcb_page;
 
@@ -4900,6 +4900,16 @@ static int svm_vm_init(struct kvm *kvm)
 	return 0;
 }
 
+static void *svm_alloc_apic_backing_page(struct kvm_vcpu *vcpu)
+{
+	struct page *page = snp_safe_alloc_page(vcpu);
+
+	if (!page)
+		return NULL;
+
+	return page_address(page);
+}
+
 static struct kvm_x86_ops svm_x86_ops __initdata = {
 	.name = KBUILD_MODNAME,
 
@@ -5031,6 +5041,7 @@ static struct kvm_x86_ops svm_x86_ops __initdata = {
 
 	.vcpu_deliver_sipi_vector = svm_vcpu_deliver_sipi_vector,
 	.vcpu_get_apicv_inhibit_reasons = avic_vcpu_get_apicv_inhibit_reasons,
+	.alloc_apic_backing_page = svm_alloc_apic_backing_page,
 };
 
 /*
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 8ef95139cd24..7f1fbd874c45 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -694,6 +694,7 @@ void sev_es_vcpu_reset(struct vcpu_svm *svm);
 void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector);
 void sev_es_prepare_switch_to_guest(struct sev_es_save_area *hostsa);
 void sev_es_unmap_ghcb(struct vcpu_svm *svm);
+struct page *snp_safe_alloc_page(struct kvm_vcpu *vcpu);
 
 /* vmenter.S */
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v2 22/25] x86/cpufeatures: Enable/unmask SEV-SNP CPU feature
  2024-01-26  4:11 [PATCH v2 00/25] Add AMD Secure Nested Paging (SEV-SNP) Initialization Support Michael Roth
                   ` (20 preceding siblings ...)
  2024-01-26  4:11 ` [PATCH v2 21/25] KVM: SEV: Make AVIC backing, VMSA and VMCB memory allocation SNP safe Michael Roth
@ 2024-01-26  4:11 ` Michael Roth
  2024-01-26  4:11 ` [PATCH v2 23/25] crypto: ccp: Add the SNP_PLATFORM_STATUS command Michael Roth
                   ` (3 subsequent siblings)
  25 siblings, 0 replies; 47+ messages in thread
From: Michael Roth @ 2024-01-26  4:11 UTC (permalink / raw)
  To: x86
  Cc: kvm, linux-coco, linux-mm, linux-crypto, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, tobin, bp, vbabka, kirill, ak,
	tony.luck, sathyanarayanan.kuppuswamy, alpergun, jarkko,
	ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick

With all the required host changes in place, it should now be possible
to initialize SNP-related MSR bits, set up RMP table enforcement, and
initialize SNP support in firmware while maintaining legacy support for
SEV/SEV-ES guests. Go ahead and enable the SNP feature now.

Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 arch/x86/include/asm/disabled-features.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/x86/include/asm/disabled-features.h b/arch/x86/include/asm/disabled-features.h
index 1ea64d4e7021..85a7b5ce96c9 100644
--- a/arch/x86/include/asm/disabled-features.h
+++ b/arch/x86/include/asm/disabled-features.h
@@ -117,7 +117,11 @@
 #define DISABLE_IBT	(1 << (X86_FEATURE_IBT & 31))
 #endif
 
+#ifdef CONFIG_KVM_AMD_SEV
+#define DISABLE_SEV_SNP		0
+#else
 #define DISABLE_SEV_SNP		(1 << (X86_FEATURE_SEV_SNP & 31))
+#endif
 
 /*
  * Make sure to add features to the correct mask
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v2 23/25] crypto: ccp: Add the SNP_PLATFORM_STATUS command
  2024-01-26  4:11 [PATCH v2 00/25] Add AMD Secure Nested Paging (SEV-SNP) Initialization Support Michael Roth
                   ` (21 preceding siblings ...)
  2024-01-26  4:11 ` [PATCH v2 22/25] x86/cpufeatures: Enable/unmask SEV-SNP CPU feature Michael Roth
@ 2024-01-26  4:11 ` Michael Roth
  2024-01-26  4:11 ` [PATCH v2 24/25] crypto: ccp: Add the SNP_COMMIT command Michael Roth
                   ` (2 subsequent siblings)
  25 siblings, 0 replies; 47+ messages in thread
From: Michael Roth @ 2024-01-26  4:11 UTC (permalink / raw)
  To: x86
  Cc: kvm, linux-coco, linux-mm, linux-crypto, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, tobin, bp, vbabka, kirill, ak,
	tony.luck, sathyanarayanan.kuppuswamy, alpergun, jarkko,
	ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick,
	Brijesh Singh

From: Brijesh Singh <brijesh.singh@amd.com>

This command is used to query the SNP platform status. See the SEV-SNP
spec for more details.

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 Documentation/virt/coco/sev-guest.rst | 27 ++++++++++++++
 drivers/crypto/ccp/sev-dev.c          | 52 +++++++++++++++++++++++++++
 include/uapi/linux/psp-sev.h          |  1 +
 3 files changed, 80 insertions(+)

diff --git a/Documentation/virt/coco/sev-guest.rst b/Documentation/virt/coco/sev-guest.rst
index 68b0d2363af8..6d3d5d336e5f 100644
--- a/Documentation/virt/coco/sev-guest.rst
+++ b/Documentation/virt/coco/sev-guest.rst
@@ -67,6 +67,22 @@ counter (e.g. counter overflow), then -EIO will be returned.
                 };
         };
 
+The host ioctls are issued to a file descriptor of the /dev/sev device.
+The ioctl accepts the command ID/input structure documented below.
+
+::
+        struct sev_issue_cmd {
+                /* Command ID */
+                __u32 cmd;
+
+                /* Command request structure */
+                __u64 data;
+
+                /* Firmware error code on failure (see psp-sev.h) */
+                __u32 error;
+        };
+
+
 2.1 SNP_GET_REPORT
 ------------------
 
@@ -124,6 +140,17 @@ be updated with the expected value.
 
 See GHCB specification for further detail on how to parse the certificate blob.
 
+2.4 SNP_PLATFORM_STATUS
+-----------------------
+:Technology: sev-snp
+:Type: hypervisor ioctl cmd
+:Parameters (out): struct sev_user_data_snp_status
+:Returns (out): 0 on success, -negative on error
+
+The SNP_PLATFORM_STATUS command is used to query the SNP platform status. The
+status includes API major, minor version and more. See the SEV-SNP
+specification for further details.
+
 3. SEV-SNP CPUID Enforcement
 ============================
 
diff --git a/drivers/crypto/ccp/sev-dev.c b/drivers/crypto/ccp/sev-dev.c
index 9a395f0f9b10..9f6ee0d24781 100644
--- a/drivers/crypto/ccp/sev-dev.c
+++ b/drivers/crypto/ccp/sev-dev.c
@@ -1919,6 +1919,55 @@ static int sev_ioctl_do_pdh_export(struct sev_issue_cmd *argp, bool writable)
 	return ret;
 }
 
+static int sev_ioctl_do_snp_platform_status(struct sev_issue_cmd *argp)
+{
+	struct sev_device *sev = psp_master->sev_data;
+	struct sev_data_snp_addr buf;
+	struct page *status_page;
+	void *data;
+	int ret;
+
+	if (!sev->snp_initialized || !argp->data)
+		return -EINVAL;
+
+	status_page = alloc_page(GFP_KERNEL_ACCOUNT);
+	if (!status_page)
+		return -ENOMEM;
+
+	data = page_address(status_page);
+
+	/*
+	 * Firmware expects status page to be in firmware-owned state, otherwise
+	 * it will report firmware error code INVALID_PAGE_STATE (0x1A).
+	 */
+	if (rmp_mark_pages_firmware(__pa(data), 1, true)) {
+		ret = -EFAULT;
+		goto cleanup;
+	}
+
+	buf.address = __psp_pa(data);
+	ret = __sev_do_cmd_locked(SEV_CMD_SNP_PLATFORM_STATUS, &buf, &argp->error);
+
+	/*
+	 * Status page will be transitioned to Reclaim state upon success, or
+	 * left in Firmware state in failure. Use snp_reclaim_pages() to
+	 * transition either case back to Hypervisor-owned state.
+	 */
+	if (snp_reclaim_pages(__pa(data), 1, true))
+		return -EFAULT;
+
+	if (ret)
+		goto cleanup;
+
+	if (copy_to_user((void __user *)argp->data, data,
+			 sizeof(struct sev_user_data_snp_status)))
+		ret = -EFAULT;
+
+cleanup:
+	__free_pages(status_page, 0);
+	return ret;
+}
+
 static long sev_ioctl(struct file *file, unsigned int ioctl, unsigned long arg)
 {
 	void __user *argp = (void __user *)arg;
@@ -1970,6 +2019,9 @@ static long sev_ioctl(struct file *file, unsigned int ioctl, unsigned long arg)
 	case SEV_GET_ID2:
 		ret = sev_ioctl_do_get_id2(&input);
 		break;
+	case SNP_PLATFORM_STATUS:
+		ret = sev_ioctl_do_snp_platform_status(&input);
+		break;
 	default:
 		ret = -EINVAL;
 		goto out;
diff --git a/include/uapi/linux/psp-sev.h b/include/uapi/linux/psp-sev.h
index 207e34217528..f1e2c55a92b4 100644
--- a/include/uapi/linux/psp-sev.h
+++ b/include/uapi/linux/psp-sev.h
@@ -28,6 +28,7 @@ enum {
 	SEV_PEK_CERT_IMPORT,
 	SEV_GET_ID,	/* This command is deprecated, use SEV_GET_ID2 */
 	SEV_GET_ID2,
+	SNP_PLATFORM_STATUS,
 
 	SEV_MAX,
 };
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v2 24/25] crypto: ccp: Add the SNP_COMMIT command
  2024-01-26  4:11 [PATCH v2 00/25] Add AMD Secure Nested Paging (SEV-SNP) Initialization Support Michael Roth
                   ` (22 preceding siblings ...)
  2024-01-26  4:11 ` [PATCH v2 23/25] crypto: ccp: Add the SNP_PLATFORM_STATUS command Michael Roth
@ 2024-01-26  4:11 ` Michael Roth
  2024-01-26  4:11 ` [PATCH v2 25/25] crypto: ccp: Add the SNP_SET_CONFIG command Michael Roth
  2024-01-30 16:19 ` [PATCH v2 00/25] Add AMD Secure Nested Paging (SEV-SNP) Initialization Support Borislav Petkov
  25 siblings, 0 replies; 47+ messages in thread
From: Michael Roth @ 2024-01-26  4:11 UTC (permalink / raw)
  To: x86
  Cc: kvm, linux-coco, linux-mm, linux-crypto, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, tobin, bp, vbabka, kirill, ak,
	tony.luck, sathyanarayanan.kuppuswamy, alpergun, jarkko,
	ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick

From: Tom Lendacky <thomas.lendacky@amd.com>

The SNP_COMMIT command is used to commit the currently installed version
of the SEV firmware. Once committed, the firmware cannot be replaced
with a previous firmware version (cannot be rolled back). This command
will also update the reported TCB to match that of the currently
installed firmware.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
[mdr: note the reported TCB update in the documentation/commit]
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 Documentation/virt/coco/sev-guest.rst | 11 +++++++++++
 drivers/crypto/ccp/sev-dev.c          | 17 +++++++++++++++++
 include/linux/psp-sev.h               |  9 +++++++++
 include/uapi/linux/psp-sev.h          |  1 +
 4 files changed, 38 insertions(+)

diff --git a/Documentation/virt/coco/sev-guest.rst b/Documentation/virt/coco/sev-guest.rst
index 6d3d5d336e5f..007ae828aa2a 100644
--- a/Documentation/virt/coco/sev-guest.rst
+++ b/Documentation/virt/coco/sev-guest.rst
@@ -151,6 +151,17 @@ The SNP_PLATFORM_STATUS command is used to query the SNP platform status. The
 status includes API major, minor version and more. See the SEV-SNP
 specification for further details.
 
+2.5 SNP_COMMIT
+--------------
+:Technology: sev-snp
+:Type: hypervisor ioctl cmd
+:Returns (out): 0 on success, -negative on error
+
+SNP_COMMIT is used to commit the currently installed firmware using the
+SEV-SNP firmware SNP_COMMIT command. This prevents roll-back to a previously
+committed firmware version. This will also update the reported TCB to match
+that of the currently installed firmware.
+
 3. SEV-SNP CPUID Enforcement
 ============================
 
diff --git a/drivers/crypto/ccp/sev-dev.c b/drivers/crypto/ccp/sev-dev.c
index 9f6ee0d24781..73ace4064e5a 100644
--- a/drivers/crypto/ccp/sev-dev.c
+++ b/drivers/crypto/ccp/sev-dev.c
@@ -222,6 +222,7 @@ static int sev_cmd_buffer_len(int cmd)
 	case SEV_CMD_SNP_PLATFORM_STATUS:	return sizeof(struct sev_data_snp_addr);
 	case SEV_CMD_SNP_GUEST_REQUEST:		return sizeof(struct sev_data_snp_guest_request);
 	case SEV_CMD_SNP_CONFIG:		return sizeof(struct sev_user_data_snp_config);
+	case SEV_CMD_SNP_COMMIT:		return sizeof(struct sev_data_snp_commit);
 	default:				return 0;
 	}
 
@@ -1968,6 +1969,19 @@ static int sev_ioctl_do_snp_platform_status(struct sev_issue_cmd *argp)
 	return ret;
 }
 
+static int sev_ioctl_do_snp_commit(struct sev_issue_cmd *argp)
+{
+	struct sev_device *sev = psp_master->sev_data;
+	struct sev_data_snp_commit buf;
+
+	if (!sev->snp_initialized)
+		return -EINVAL;
+
+	buf.len = sizeof(buf);
+
+	return __sev_do_cmd_locked(SEV_CMD_SNP_COMMIT, &buf, &argp->error);
+}
+
 static long sev_ioctl(struct file *file, unsigned int ioctl, unsigned long arg)
 {
 	void __user *argp = (void __user *)arg;
@@ -2022,6 +2036,9 @@ static long sev_ioctl(struct file *file, unsigned int ioctl, unsigned long arg)
 	case SNP_PLATFORM_STATUS:
 		ret = sev_ioctl_do_snp_platform_status(&input);
 		break;
+	case SNP_COMMIT:
+		ret = sev_ioctl_do_snp_commit(&input);
+		break;
 	default:
 		ret = -EINVAL;
 		goto out;
diff --git a/include/linux/psp-sev.h b/include/linux/psp-sev.h
index 7f9bc1979018..beba10d6b39c 100644
--- a/include/linux/psp-sev.h
+++ b/include/linux/psp-sev.h
@@ -788,6 +788,15 @@ struct sev_data_snp_shutdown_ex {
 	u32 rsvd1:31;
 } __packed;
 
+/**
+ * struct sev_data_snp_commit - SNP_COMMIT structure
+ *
+ * @len: length of the command buffer read by the PSP
+ */
+struct sev_data_snp_commit {
+	u32 len;
+} __packed;
+
 #ifdef CONFIG_CRYPTO_DEV_SP_PSP
 
 /**
diff --git a/include/uapi/linux/psp-sev.h b/include/uapi/linux/psp-sev.h
index f1e2c55a92b4..35c207664e95 100644
--- a/include/uapi/linux/psp-sev.h
+++ b/include/uapi/linux/psp-sev.h
@@ -29,6 +29,7 @@ enum {
 	SEV_GET_ID,	/* This command is deprecated, use SEV_GET_ID2 */
 	SEV_GET_ID2,
 	SNP_PLATFORM_STATUS,
+	SNP_COMMIT,
 
 	SEV_MAX,
 };
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v2 25/25] crypto: ccp: Add the SNP_SET_CONFIG command
  2024-01-26  4:11 [PATCH v2 00/25] Add AMD Secure Nested Paging (SEV-SNP) Initialization Support Michael Roth
                   ` (23 preceding siblings ...)
  2024-01-26  4:11 ` [PATCH v2 24/25] crypto: ccp: Add the SNP_COMMIT command Michael Roth
@ 2024-01-26  4:11 ` Michael Roth
  2024-01-29 19:18   ` Liam Merwick
  2024-01-30 16:19 ` [PATCH v2 00/25] Add AMD Secure Nested Paging (SEV-SNP) Initialization Support Borislav Petkov
  25 siblings, 1 reply; 47+ messages in thread
From: Michael Roth @ 2024-01-26  4:11 UTC (permalink / raw)
  To: x86
  Cc: kvm, linux-coco, linux-mm, linux-crypto, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, tobin, bp, vbabka, kirill, ak,
	tony.luck, sathyanarayanan.kuppuswamy, alpergun, jarkko,
	ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick,
	Brijesh Singh, Alexey Kardashevskiy, Dionna Glaze

From: Brijesh Singh <brijesh.singh@amd.com>

The SEV-SNP firmware provides the SNP_CONFIG command used to set various
system-wide configuration values for SNP guests, such as the reported
TCB version used when signing guest attestation reports. Add an
interface to set this via userspace.

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Co-developed-by: Alexey Kardashevskiy <aik@amd.com>
Signed-off-by: Alexey Kardashevskiy <aik@amd.com>
Co-developed-by: Dionna Glaze <dionnaglaze@google.com>
Signed-off-by: Dionna Glaze <dionnaglaze@google.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
[mdr: squash in doc patch from Dionna, drop extended request/certificate
 handling and simplify this to a simple wrapper around SNP_CONFIG fw
 cmd]
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 Documentation/virt/coco/sev-guest.rst | 13 +++++++++++++
 drivers/crypto/ccp/sev-dev.c          | 20 ++++++++++++++++++++
 include/uapi/linux/psp-sev.h          |  1 +
 3 files changed, 34 insertions(+)

diff --git a/Documentation/virt/coco/sev-guest.rst b/Documentation/virt/coco/sev-guest.rst
index 007ae828aa2a..14c9de997b7d 100644
--- a/Documentation/virt/coco/sev-guest.rst
+++ b/Documentation/virt/coco/sev-guest.rst
@@ -162,6 +162,19 @@ SEV-SNP firmware SNP_COMMIT command. This prevents roll-back to a previously
 committed firmware version. This will also update the reported TCB to match
 that of the currently installed firmware.
 
+2.6 SNP_SET_CONFIG
+------------------
+:Technology: sev-snp
+:Type: hypervisor ioctl cmd
+:Parameters (in): struct sev_user_data_snp_config
+:Returns (out): 0 on success, -negative on error
+
+SNP_SET_CONFIG is used to set the system-wide configuration such as
+reported TCB version in the attestation report. The command is similar
+to SNP_CONFIG command defined in the SEV-SNP spec. The current values of
+the firmware parameters affected by this command can be queried via
+SNP_PLATFORM_STATUS.
+
 3. SEV-SNP CPUID Enforcement
 ============================
 
diff --git a/drivers/crypto/ccp/sev-dev.c b/drivers/crypto/ccp/sev-dev.c
index 73ace4064e5a..398ae932aa0b 100644
--- a/drivers/crypto/ccp/sev-dev.c
+++ b/drivers/crypto/ccp/sev-dev.c
@@ -1982,6 +1982,23 @@ static int sev_ioctl_do_snp_commit(struct sev_issue_cmd *argp)
 	return __sev_do_cmd_locked(SEV_CMD_SNP_COMMIT, &buf, &argp->error);
 }
 
+static int sev_ioctl_do_snp_set_config(struct sev_issue_cmd *argp, bool writable)
+{
+	struct sev_device *sev = psp_master->sev_data;
+	struct sev_user_data_snp_config config;
+
+	if (!sev->snp_initialized || !argp->data)
+		return -EINVAL;
+
+	if (!writable)
+		return -EPERM;
+
+	if (copy_from_user(&config, (void __user *)argp->data, sizeof(config)))
+		return -EFAULT;
+
+	return __sev_do_cmd_locked(SEV_CMD_SNP_CONFIG, &config, &argp->error);
+}
+
 static long sev_ioctl(struct file *file, unsigned int ioctl, unsigned long arg)
 {
 	void __user *argp = (void __user *)arg;
@@ -2039,6 +2056,9 @@ static long sev_ioctl(struct file *file, unsigned int ioctl, unsigned long arg)
 	case SNP_COMMIT:
 		ret = sev_ioctl_do_snp_commit(&input);
 		break;
+	case SNP_SET_CONFIG:
+		ret = sev_ioctl_do_snp_set_config(&input, writable);
+		break;
 	default:
 		ret = -EINVAL;
 		goto out;
diff --git a/include/uapi/linux/psp-sev.h b/include/uapi/linux/psp-sev.h
index 35c207664e95..b7a2c2ee35b7 100644
--- a/include/uapi/linux/psp-sev.h
+++ b/include/uapi/linux/psp-sev.h
@@ -30,6 +30,7 @@ enum {
 	SEV_GET_ID2,
 	SNP_PLATFORM_STATUS,
 	SNP_COMMIT,
+	SNP_SET_CONFIG,
 
 	SEV_MAX,
 };
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 21/25] KVM: SEV: Make AVIC backing, VMSA and VMCB memory allocation SNP safe
  2024-01-26  4:11 ` [PATCH v2 21/25] KVM: SEV: Make AVIC backing, VMSA and VMCB memory allocation SNP safe Michael Roth
@ 2024-01-26 11:00   ` Paolo Bonzini
  0 siblings, 0 replies; 47+ messages in thread
From: Paolo Bonzini @ 2024-01-26 11:00 UTC (permalink / raw)
  To: Michael Roth
  Cc: x86, kvm, linux-coco, linux-mm, linux-crypto, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, seanjc, vkuznets,
	jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, tobin, bp, vbabka, kirill, ak,
	tony.luck, sathyanarayanan.kuppuswamy, alpergun, jarkko,
	ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick,
	Brijesh Singh, Marc Orr

On Fri, Jan 26, 2024 at 5:45 AM Michael Roth <michael.roth@amd.com> wrote:
>
> From: Brijesh Singh <brijesh.singh@amd.com>
>
> Implement a workaround for an SNP erratum where the CPU will incorrectly
> signal an RMP violation #PF if a hugepage (2MB or 1GB) collides with the
> RMP entry of a VMCB, VMSA or AVIC backing page.
>
> When SEV-SNP is globally enabled, the CPU marks the VMCB, VMSA, and AVIC
> backing pages as "in-use" via a reserved bit in the corresponding RMP
> entry after a successful VMRUN. This is done for _all_ VMs, not just
> SNP-Active VMs.
>
> If the hypervisor accesses an in-use page through a writable
> translation, the CPU will throw an RMP violation #PF. On early SNP
> hardware, if an in-use page is 2MB-aligned and software accesses any
> part of the associated 2MB region with a hugepage, the CPU will
> incorrectly treat the entire 2MB region as in-use and signal a an RMP
> violation #PF.
>
> To avoid this, the recommendation is to not use a 2MB-aligned page for
> the VMCB, VMSA or AVIC pages. Add a generic allocator that will ensure
> that the page returned is not 2MB-aligned and is safe to be used when
> SEV-SNP is enabled. Also implement similar handling for the VMCB/VMSA
> pages of nested guests.
>
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> Co-developed-by: Marc Orr <marcorr@google.com>
> Signed-off-by: Marc Orr <marcorr@google.com>
> Reported-by: Alper Gun <alpergun@google.com> # for nested VMSA case
> Co-developed-by: Ashish Kalra <ashish.kalra@amd.com>
> Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
> Acked-by: Vlastimil Babka <vbabka@suse.cz>
> [mdr: squash in nested guest handling from Ashish, commit msg fixups]
> Signed-off-by: Michael Roth <michael.roth@amd.com>

Acked-by: Paolo Bonzini <pbonzini@redhat.com>


> ---
>  arch/x86/include/asm/kvm-x86-ops.h |  1 +
>  arch/x86/include/asm/kvm_host.h    |  1 +
>  arch/x86/kvm/lapic.c               |  5 ++++-
>  arch/x86/kvm/svm/nested.c          |  2 +-
>  arch/x86/kvm/svm/sev.c             | 32 ++++++++++++++++++++++++++++++
>  arch/x86/kvm/svm/svm.c             | 17 +++++++++++++---
>  arch/x86/kvm/svm/svm.h             |  1 +
>  7 files changed, 54 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
> index 378ed944b849..ab24ce207988 100644
> --- a/arch/x86/include/asm/kvm-x86-ops.h
> +++ b/arch/x86/include/asm/kvm-x86-ops.h
> @@ -138,6 +138,7 @@ KVM_X86_OP(complete_emulated_msr)
>  KVM_X86_OP(vcpu_deliver_sipi_vector)
>  KVM_X86_OP_OPTIONAL_RET0(vcpu_get_apicv_inhibit_reasons);
>  KVM_X86_OP_OPTIONAL(get_untagged_addr)
> +KVM_X86_OP_OPTIONAL(alloc_apic_backing_page)
>
>  #undef KVM_X86_OP
>  #undef KVM_X86_OP_OPTIONAL
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index b5b2d0fde579..5c12af29fd9b 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1794,6 +1794,7 @@ struct kvm_x86_ops {
>         unsigned long (*vcpu_get_apicv_inhibit_reasons)(struct kvm_vcpu *vcpu);
>
>         gva_t (*get_untagged_addr)(struct kvm_vcpu *vcpu, gva_t gva, unsigned int flags);
> +       void *(*alloc_apic_backing_page)(struct kvm_vcpu *vcpu);
>  };
>
>  struct kvm_x86_nested_ops {
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index 3242f3da2457..1edf93ee3395 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -2815,7 +2815,10 @@ int kvm_create_lapic(struct kvm_vcpu *vcpu, int timer_advance_ns)
>
>         vcpu->arch.apic = apic;
>
> -       apic->regs = (void *)get_zeroed_page(GFP_KERNEL_ACCOUNT);
> +       if (kvm_x86_ops.alloc_apic_backing_page)
> +               apic->regs = static_call(kvm_x86_alloc_apic_backing_page)(vcpu);
> +       else
> +               apic->regs = (void *)get_zeroed_page(GFP_KERNEL_ACCOUNT);
>         if (!apic->regs) {
>                 printk(KERN_ERR "malloc apic regs error for vcpu %x\n",
>                        vcpu->vcpu_id);
> diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
> index dee62362a360..55b9a6d96bcf 100644
> --- a/arch/x86/kvm/svm/nested.c
> +++ b/arch/x86/kvm/svm/nested.c
> @@ -1181,7 +1181,7 @@ int svm_allocate_nested(struct vcpu_svm *svm)
>         if (svm->nested.initialized)
>                 return 0;
>
> -       vmcb02_page = alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO);
> +       vmcb02_page = snp_safe_alloc_page(&svm->vcpu);
>         if (!vmcb02_page)
>                 return -ENOMEM;
>         svm->nested.vmcb02.ptr = page_address(vmcb02_page);
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 564091f386f7..f99435b6648f 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -3163,3 +3163,35 @@ void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector)
>
>         ghcb_set_sw_exit_info_2(svm->sev_es.ghcb, 1);
>  }
> +
> +struct page *snp_safe_alloc_page(struct kvm_vcpu *vcpu)
> +{
> +       unsigned long pfn;
> +       struct page *p;
> +
> +       if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP))
> +               return alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO);
> +
> +       /*
> +        * Allocate an SNP-safe page to workaround the SNP erratum where
> +        * the CPU will incorrectly signal an RMP violation #PF if a
> +        * hugepage (2MB or 1GB) collides with the RMP entry of a
> +        * 2MB-aligned VMCB, VMSA, or AVIC backing page.
> +        *
> +        * Allocate one extra page, choose a page which is not
> +        * 2MB-aligned, and free the other.
> +        */
> +       p = alloc_pages(GFP_KERNEL_ACCOUNT | __GFP_ZERO, 1);
> +       if (!p)
> +               return NULL;
> +
> +       split_page(p, 1);
> +
> +       pfn = page_to_pfn(p);
> +       if (IS_ALIGNED(pfn, PTRS_PER_PMD))
> +               __free_page(p++);
> +       else
> +               __free_page(p + 1);
> +
> +       return p;
> +}
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 61f2bdc9f4f8..272d5ed37ce7 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -703,7 +703,7 @@ static int svm_cpu_init(int cpu)
>         int ret = -ENOMEM;
>
>         memset(sd, 0, sizeof(struct svm_cpu_data));
> -       sd->save_area = alloc_page(GFP_KERNEL | __GFP_ZERO);
> +       sd->save_area = snp_safe_alloc_page(NULL);
>         if (!sd->save_area)
>                 return ret;
>
> @@ -1421,7 +1421,7 @@ static int svm_vcpu_create(struct kvm_vcpu *vcpu)
>         svm = to_svm(vcpu);
>
>         err = -ENOMEM;
> -       vmcb01_page = alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO);
> +       vmcb01_page = snp_safe_alloc_page(vcpu);
>         if (!vmcb01_page)
>                 goto out;
>
> @@ -1430,7 +1430,7 @@ static int svm_vcpu_create(struct kvm_vcpu *vcpu)
>                  * SEV-ES guests require a separate VMSA page used to contain
>                  * the encrypted register state of the guest.
>                  */
> -               vmsa_page = alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO);
> +               vmsa_page = snp_safe_alloc_page(vcpu);
>                 if (!vmsa_page)
>                         goto error_free_vmcb_page;
>
> @@ -4900,6 +4900,16 @@ static int svm_vm_init(struct kvm *kvm)
>         return 0;
>  }
>
> +static void *svm_alloc_apic_backing_page(struct kvm_vcpu *vcpu)
> +{
> +       struct page *page = snp_safe_alloc_page(vcpu);
> +
> +       if (!page)
> +               return NULL;
> +
> +       return page_address(page);
> +}
> +
>  static struct kvm_x86_ops svm_x86_ops __initdata = {
>         .name = KBUILD_MODNAME,
>
> @@ -5031,6 +5041,7 @@ static struct kvm_x86_ops svm_x86_ops __initdata = {
>
>         .vcpu_deliver_sipi_vector = svm_vcpu_deliver_sipi_vector,
>         .vcpu_get_apicv_inhibit_reasons = avic_vcpu_get_apicv_inhibit_reasons,
> +       .alloc_apic_backing_page = svm_alloc_apic_backing_page,
>  };
>
>  /*
> diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
> index 8ef95139cd24..7f1fbd874c45 100644
> --- a/arch/x86/kvm/svm/svm.h
> +++ b/arch/x86/kvm/svm/svm.h
> @@ -694,6 +694,7 @@ void sev_es_vcpu_reset(struct vcpu_svm *svm);
>  void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector);
>  void sev_es_prepare_switch_to_guest(struct sev_es_save_area *hostsa);
>  void sev_es_unmap_ghcb(struct vcpu_svm *svm);
> +struct page *snp_safe_alloc_page(struct kvm_vcpu *vcpu);
>
>  /* vmenter.S */
>
> --
> 2.25.1
>


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 11/25] x86/sev: Adjust directmap to avoid inadvertant RMP faults
  2024-01-26  4:11 ` [PATCH v2 11/25] x86/sev: Adjust directmap to avoid inadvertant RMP faults Michael Roth
@ 2024-01-26 15:34   ` Borislav Petkov
  2024-01-26 17:04     ` Michael Roth
  0 siblings, 1 reply; 47+ messages in thread
From: Borislav Petkov @ 2024-01-26 15:34 UTC (permalink / raw)
  To: Michael Roth
  Cc: x86, kvm, linux-coco, linux-mm, linux-crypto, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, tobin, vbabka, kirill, ak,
	tony.luck, sathyanarayanan.kuppuswamy, alpergun, jarkko,
	ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick

On Thu, Jan 25, 2024 at 10:11:11PM -0600, Michael Roth wrote:
> +static int adjust_direct_map(u64 pfn, int rmp_level)
> +{
> +	unsigned long vaddr = (unsigned long)pfn_to_kaddr(pfn);
> +	unsigned int level;
> +	int npages, ret;
> +	pte_t *pte;

Again, something I asked the last time but no reply:

Looking at Documentation/arch/x86/x86_64/mm.rst, the direct map starts
at page_offset_base so this here should at least check

	if (vaddr < __PAGE_OFFSET)
		return 0;

I'm not sure about the upper end. Right now, the adjusting should not
happen only for the direct map but also for the whole kernel address
space range because we don't want to cause any mismatch between page
mappings anywhere.

Which means, this function should be called adjust_kernel_map() or so...

Hmmm.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 11/25] x86/sev: Adjust directmap to avoid inadvertant RMP faults
  2024-01-26 15:34   ` Borislav Petkov
@ 2024-01-26 17:04     ` Michael Roth
  2024-01-26 18:43       ` Borislav Petkov
  0 siblings, 1 reply; 47+ messages in thread
From: Michael Roth @ 2024-01-26 17:04 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: x86, kvm, linux-coco, linux-mm, linux-crypto, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, tobin, vbabka, kirill, ak,
	tony.luck, sathyanarayanan.kuppuswamy, alpergun, jarkko,
	ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick

On Fri, Jan 26, 2024 at 04:34:51PM +0100, Borislav Petkov wrote:
> On Thu, Jan 25, 2024 at 10:11:11PM -0600, Michael Roth wrote:
> > +static int adjust_direct_map(u64 pfn, int rmp_level)
> > +{
> > +	unsigned long vaddr = (unsigned long)pfn_to_kaddr(pfn);
> > +	unsigned int level;
> > +	int npages, ret;
> > +	pte_t *pte;
> 
> Again, something I asked the last time but no reply:
> 
> Looking at Documentation/arch/x86/x86_64/mm.rst, the direct map starts
> at page_offset_base so this here should at least check
> 
> 	if (vaddr < __PAGE_OFFSET)
> 		return 0;

vaddr comes from pfn_to_kaddr(pfn), i.e. __va(paddr), so it will
necessarily be a direct-mapped address above __PAGE_OFFSET.

> 
> I'm not sure about the upper end. Right now, the adjusting should not

For upper-end, a pfn_valid(pfn) check might suffice, since only a valid
PFN would have a possibly-valid mapping wthin the directmap range.

> happen only for the direct map but also for the whole kernel address
> space range because we don't want to cause any mismatch between page
> mappings anywhere.
> 
> Which means, this function should be called adjust_kernel_map() or so...

These are PFNs that are owned/allocated-to the caller. Due to the nature
of the directmap it's possible non-owners would write to a mapping that
overlaps, but vmalloc()/etc. would not create mappings for any pages that
were not specifically part of an allocation that belongs to the caller,
so I don't see where there's any chance for an overlap there. And the caller
of these functions would not be adjusting directmap for PFNs that might be
mapped into other kernel address ranges like kernel-text/etc unless the
caller was specifically making SNP-aware adjustments to those ranges, in
which case it would be responsible for making those other adjustments,
or implementing the necessary helpers/etc.

I'm not aware of such cases in the current code, and I don't think it makes
sense to attempt to try to handle them here generically until such a case
arises, since it will likely involve more specific requirements than what
we can anticipate from a theoretical/generic standpoint.

-Mike

> 
> Hmmm.
> 
> -- 
> Regards/Gruss,
>     Boris.
> 
> https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 11/25] x86/sev: Adjust directmap to avoid inadvertant RMP faults
  2024-01-26 17:04     ` Michael Roth
@ 2024-01-26 18:43       ` Borislav Petkov
  2024-01-26 23:54         ` Michael Roth
  0 siblings, 1 reply; 47+ messages in thread
From: Borislav Petkov @ 2024-01-26 18:43 UTC (permalink / raw)
  To: Michael Roth
  Cc: x86, kvm, linux-coco, linux-mm, linux-crypto, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, tobin, vbabka, kirill, ak,
	tony.luck, sathyanarayanan.kuppuswamy, alpergun, jarkko,
	ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick

On Fri, Jan 26, 2024 at 11:04:15AM -0600, Michael Roth wrote:
> vaddr comes from pfn_to_kaddr(pfn), i.e. __va(paddr), so it will
> necessarily be a direct-mapped address above __PAGE_OFFSET.

Ah, true.

> For upper-end, a pfn_valid(pfn) check might suffice, since only a valid
> PFN would have a possibly-valid mapping wthin the directmap range.

Looking at it, yap, that could be a sensible thing to check.

> These are PFNs that are owned/allocated-to the caller. Due to the nature
> of the directmap it's possible non-owners would write to a mapping that
> overlaps, but vmalloc()/etc. would not create mappings for any pages that
> were not specifically part of an allocation that belongs to the caller,
> so I don't see where there's any chance for an overlap there. And the caller
> of these functions would not be adjusting directmap for PFNs that might be
> mapped into other kernel address ranges like kernel-text/etc unless the
> caller was specifically making SNP-aware adjustments to those ranges, in
> which case it would be responsible for making those other adjustments,
> or implementing the necessary helpers/etc.

Why does any of that matter?

If you can make this helper as generic as possible now, why don't you?

> I'm not aware of such cases in the current code, and I don't think it makes
> sense to attempt to try to handle them here generically until such a case
> arises, since it will likely involve more specific requirements than what
> we can anticipate from a theoretical/generic standpoint.

Then that's a different story. If it will likely involve more specific
handling, then that function should deal with pfns for which it can DTRT
and for others warn loudly so that the code gets fixed in time.

IOW, then it should check for the upper pfn of the direct map here and
we have two, depending on the page sizes used...

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 11/25] x86/sev: Adjust directmap to avoid inadvertant RMP faults
  2024-01-26 18:43       ` Borislav Petkov
@ 2024-01-26 23:54         ` Michael Roth
  2024-01-27 11:42           ` Borislav Petkov
  0 siblings, 1 reply; 47+ messages in thread
From: Michael Roth @ 2024-01-26 23:54 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: x86, kvm, linux-coco, linux-mm, linux-crypto, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, tobin, vbabka, kirill, ak,
	tony.luck, sathyanarayanan.kuppuswamy, alpergun, jarkko,
	ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick

On Fri, Jan 26, 2024 at 07:43:40PM +0100, Borislav Petkov wrote:
> On Fri, Jan 26, 2024 at 11:04:15AM -0600, Michael Roth wrote:
> > vaddr comes from pfn_to_kaddr(pfn), i.e. __va(paddr), so it will
> > necessarily be a direct-mapped address above __PAGE_OFFSET.
> 
> Ah, true.
> 
> > For upper-end, a pfn_valid(pfn) check might suffice, since only a valid
> > PFN would have a possibly-valid mapping wthin the directmap range.
> 
> Looking at it, yap, that could be a sensible thing to check.
> 
> > These are PFNs that are owned/allocated-to the caller. Due to the nature
> > of the directmap it's possible non-owners would write to a mapping that
> > overlaps, but vmalloc()/etc. would not create mappings for any pages that
> > were not specifically part of an allocation that belongs to the caller,
> > so I don't see where there's any chance for an overlap there. And the caller
> > of these functions would not be adjusting directmap for PFNs that might be
> > mapped into other kernel address ranges like kernel-text/etc unless the
> > caller was specifically making SNP-aware adjustments to those ranges, in
> > which case it would be responsible for making those other adjustments,
> > or implementing the necessary helpers/etc.
> 
> Why does any of that matter?
> 
> If you can make this helper as generic as possible now, why don't you?

In this case, it would make it more difficult to handle things
efficiently and implement proper bounds-checking/etc. For instance, if
the caller *knows* they are doing something different like splitting a
kernel-text mapping, then we could implement proper bounds-checking
based on expected ranges, and implement any special handling associated
with that use-case, and capture that in a nice/understandable
adjust_kernel_text_mapping() helper. Or maybe if these are adjustments
for non-static/non-linear mappings, it makes more sense to be given an
HVA rather than PFN, etc., since we might not have any sort of reverse-map
structure/function than can be used to do the PFN->HVA lookups efficiently.

It's just a lot to guess at. And just the directmap splitting itself has
been the source of so much discussion, investigation, re-work, benchmarking,
etc., that it hints that implementing similar handling for other use-cases
really needs to have a clear and necessary purpose and set of requirements 
hat can be evaluated/tested before enabling them and reasonably expecting
them to work as expected.

> 
> > I'm not aware of such cases in the current code, and I don't think it makes
> > sense to attempt to try to handle them here generically until such a case
> > arises, since it will likely involve more specific requirements than what
> > we can anticipate from a theoretical/generic standpoint.
> 
> Then that's a different story. If it will likely involve more specific
> handling, then that function should deal with pfns for which it can DTRT
> and for others warn loudly so that the code gets fixed in time.
> 
> IOW, then it should check for the upper pfn of the direct map here and
> we have two, depending on the page sizes used...

Is something like this close to what you're thinking? I've re-tested with
SNP guests and it seems to work as expected.

diff --git a/arch/x86/virt/svm/sev.c b/arch/x86/virt/svm/sev.c
index 846e9e53dff0..c09497487c08 100644
--- a/arch/x86/virt/svm/sev.c
+++ b/arch/x86/virt/svm/sev.c
@@ -421,7 +421,12 @@ static int adjust_direct_map(u64 pfn, int rmp_level)
        if (WARN_ON_ONCE(rmp_level > PG_LEVEL_2M))
                return -EINVAL;

-       if (WARN_ON_ONCE(rmp_level == PG_LEVEL_2M && !IS_ALIGNED(pfn, PTRS_PER_PMD)))
+       if (!pfn_valid(pfn))
+               return -EINVAL;
+
+       if (rmp_level == PG_LEVEL_2M &&
+           (!IS_ALIGNED(pfn, PTRS_PER_PMD) ||
+            !pfn_valid(pfn + PTRS_PER_PMD - 1)))
                return -EINVAL;

Note that I removed the WARN_ON_ONCE(), which I think was a bit overzealous
for this check. The one above it for rmp_level > PG_LEVEL_2M I think is more
warranted, since it returns error for a use-case that might in theory become
valid on future hardware, but would result in unecessary 1GB->2MB directmap
splitting if we were to try to implement handling for that before it's
actually needed (which may well be never).

-Mike

> 
> Thx.
> 
> -- 
> Regards/Gruss,
>     Boris.
> 
> https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 11/25] x86/sev: Adjust directmap to avoid inadvertant RMP faults
  2024-01-26 23:54         ` Michael Roth
@ 2024-01-27 11:42           ` Borislav Petkov
  2024-01-27 15:45             ` Michael Roth
  0 siblings, 1 reply; 47+ messages in thread
From: Borislav Petkov @ 2024-01-27 11:42 UTC (permalink / raw)
  To: Michael Roth
  Cc: x86, kvm, linux-coco, linux-mm, linux-crypto, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, tobin, vbabka, kirill, ak,
	tony.luck, sathyanarayanan.kuppuswamy, alpergun, jarkko,
	ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick

On Fri, Jan 26, 2024 at 05:54:20PM -0600, Michael Roth wrote:
> Is something like this close to what you're thinking? I've re-tested with
> SNP guests and it seems to work as expected.
> 
> diff --git a/arch/x86/virt/svm/sev.c b/arch/x86/virt/svm/sev.c
> index 846e9e53dff0..c09497487c08 100644
> --- a/arch/x86/virt/svm/sev.c
> +++ b/arch/x86/virt/svm/sev.c
> @@ -421,7 +421,12 @@ static int adjust_direct_map(u64 pfn, int rmp_level)
>         if (WARN_ON_ONCE(rmp_level > PG_LEVEL_2M))
>                 return -EINVAL;
> 
> -       if (WARN_ON_ONCE(rmp_level == PG_LEVEL_2M && !IS_ALIGNED(pfn, PTRS_PER_PMD)))
> +       if (!pfn_valid(pfn))

_text at VA 0xffffffff81000000 is also a valid pfn so no, this is not
enough.

Either this function should not have "direct map" in the name as it
converts *any* valid pfn not just the direct map ones or it should check
whether the pfn belongs to the direct map range.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 11/25] x86/sev: Adjust directmap to avoid inadvertant RMP faults
  2024-01-27 11:42           ` Borislav Petkov
@ 2024-01-27 15:45             ` Michael Roth
  2024-01-27 16:02               ` Borislav Petkov
  0 siblings, 1 reply; 47+ messages in thread
From: Michael Roth @ 2024-01-27 15:45 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: x86, kvm, linux-coco, linux-mm, linux-crypto, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, tobin, vbabka, kirill, ak,
	tony.luck, sathyanarayanan.kuppuswamy, alpergun, jarkko,
	ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick

On Sat, Jan 27, 2024 at 12:42:07PM +0100, Borislav Petkov wrote:
> On Fri, Jan 26, 2024 at 05:54:20PM -0600, Michael Roth wrote:
> > Is something like this close to what you're thinking? I've re-tested with
> > SNP guests and it seems to work as expected.
> > 
> > diff --git a/arch/x86/virt/svm/sev.c b/arch/x86/virt/svm/sev.c
> > index 846e9e53dff0..c09497487c08 100644
> > --- a/arch/x86/virt/svm/sev.c
> > +++ b/arch/x86/virt/svm/sev.c
> > @@ -421,7 +421,12 @@ static int adjust_direct_map(u64 pfn, int rmp_level)
> >         if (WARN_ON_ONCE(rmp_level > PG_LEVEL_2M))
> >                 return -EINVAL;
> > 
> > -       if (WARN_ON_ONCE(rmp_level == PG_LEVEL_2M && !IS_ALIGNED(pfn, PTRS_PER_PMD)))
> > +       if (!pfn_valid(pfn))
> 
> _text at VA 0xffffffff81000000 is also a valid pfn so no, this is not
> enough.

directmap maps all physical memory accessible by kernel, including text
pages, so those are valid PFNs as far as this function is concerned. We
can't generally guard against the caller passing in any random PFN that
might also be mapped into additional address ranges, similarly to how
we can't guard against something doing a write to some random PFN
__va(0x1234) and scribbling over memory that it doesn't own, or just
unmapping the entire directmap range and blowing up the kernel.

The expectation is that the caller is aware of what PFNs it is passing in,
whether those PFNs have additional mappings, and if those mappings are of
concern, implement the necessary handlers if new use-cases are ever
introducted, like the adjust_kernel_text_mapping() example I mentioned
earlier. 

> 
> Either this function should not have "direct map" in the name as it
> converts *any* valid pfn not just the direct map ones or it should check
> whether the pfn belongs to the direct map range.

This function only splits mappings in the 0xffff888000000000 directmap
range. It would be inaccurate to name it in such a way that suggests
that it does anything else. If a use-case ever arises for splitting
_text mappings at 0xffffffff81000000, or any other ranges, those too
would best served by dedicated helpers adjust_kernel_text_mapping()
that *actually* modify mappings for those virtual ranges, and implement
bounds-checking appropriate for those physical/virtual ranges. The
directmap range maps all kernel-accessible physical memory, so it's
appropriate that our bounds-checking for the purpose of
adjust_direct_map() is all kernel-accessible physical memory. If that
includes PFNs mapped to other virtual ranges as well, the caller needs
to consider that and implement additional helpers as necessary, but
they'd likely *still* need to call adjust_direct_map() to adjust the
directmap range in addition to those other ranges, so even if were
possible to do so reliably, we shouldn't try to selectively reject any
PFN ranges beyond what mm.rst suggests is valid for 0xffff888000000000.

-Mike

> 
> Thx.
> 
> -- 
> Regards/Gruss,
>     Boris.
> 
> https://people.kernel.org/tglx/notes-about-netiquette
> 

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 11/25] x86/sev: Adjust directmap to avoid inadvertant RMP faults
  2024-01-27 15:45             ` Michael Roth
@ 2024-01-27 16:02               ` Borislav Petkov
  2024-01-29 11:59                 ` Borislav Petkov
  0 siblings, 1 reply; 47+ messages in thread
From: Borislav Petkov @ 2024-01-27 16:02 UTC (permalink / raw)
  To: Michael Roth
  Cc: x86, kvm, linux-coco, linux-mm, linux-crypto, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, tobin, vbabka, kirill, ak,
	tony.luck, sathyanarayanan.kuppuswamy, alpergun, jarkko,
	ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick

On Sat, Jan 27, 2024 at 09:45:06AM -0600, Michael Roth wrote:
> directmap maps all physical memory accessible by kernel, including text
> pages, so those are valid PFNs as far as this function is concerned.

Why don't you have a look at

Documentation/arch/x86/x86_64/mm.rst

to sync up on the nomenclature first?

   ffff888000000000 | -119.5  TB | ffffc87fffffffff |   64 TB | direct mapping of all physical memory (page_offset_base)

   ...

   ffffffff80000000 |   -2    GB | ffffffff9fffffff |  512 MB | kernel text mapping, mapped to physical address 0

and so on.

> The expectation is that the caller is aware of what PFNs it is passing in,

There are no expectations. Have you written them down somewhere?

> This function only splits mappings in the 0xffff888000000000 directmap
> range.

This function takes any PFN it gets passed in as it is. I don't care
who its users are now or in the future and whether they pay attention
what they pass into - it needs to be properly defined.

Mike, please get on with the program. Use the right naming for the
function and basta.

IOW, this:

diff --git a/arch/x86/virt/svm/sev.c b/arch/x86/virt/svm/sev.c
index 0a8f9334ec6e..652ee63e87fd 100644
--- a/arch/x86/virt/svm/sev.c
+++ b/arch/x86/virt/svm/sev.c
@@ -394,7 +394,7 @@ EXPORT_SYMBOL_GPL(psmash);
  * More specifics on how these checks are carried out can be found in APM
  * Volume 2, "RMP and VMPL Access Checks".
  */
-static int adjust_direct_map(u64 pfn, int rmp_level)
+static int split_pfn(u64 pfn, int rmp_level)
 {
 	unsigned long vaddr = (unsigned long)pfn_to_kaddr(pfn);
 	unsigned int level;
@@ -405,7 +405,12 @@ static int adjust_direct_map(u64 pfn, int rmp_level)
 	if (WARN_ON_ONCE(rmp_level > PG_LEVEL_2M))
 		return -EINVAL;
 
-	if (WARN_ON_ONCE(rmp_level == PG_LEVEL_2M && !IS_ALIGNED(pfn, PTRS_PER_PMD)))
+       if (!pfn_valid(pfn))
+               return -EINVAL;
+
+       if (rmp_level == PG_LEVEL_2M &&
+           (!IS_ALIGNED(pfn, PTRS_PER_PMD) ||
+            !pfn_valid(pfn + PTRS_PER_PMD - 1)))
 		return -EINVAL;
 
 	/*
@@ -456,7 +461,7 @@ static int rmpupdate(u64 pfn, struct rmp_state *state)
 
 	level = RMP_TO_PG_LEVEL(state->pagesize);
 
-	if (adjust_direct_map(pfn, level))
+	if (split_pfn(pfn, level))
 		return -EFAULT;
 
 	do {


-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 11/25] x86/sev: Adjust directmap to avoid inadvertant RMP faults
  2024-01-27 16:02               ` Borislav Petkov
@ 2024-01-29 11:59                 ` Borislav Petkov
  2024-01-29 15:26                   ` Vlastimil Babka
  0 siblings, 1 reply; 47+ messages in thread
From: Borislav Petkov @ 2024-01-29 11:59 UTC (permalink / raw)
  To: Michael Roth
  Cc: x86, kvm, linux-coco, linux-mm, linux-crypto, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, tobin, vbabka, kirill, ak,
	tony.luck, sathyanarayanan.kuppuswamy, alpergun, jarkko,
	ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick

On Sat, Jan 27, 2024 at 05:02:49PM +0100, Borislav Petkov wrote:
> This function takes any PFN it gets passed in as it is. I don't care
> who its users are now or in the future and whether they pay attention
> what they pass into - it needs to be properly defined.

Ok, we solved it offlist, here's the final version I have. It has
a comment explaining what I was asking.

---
From: Michael Roth <michael.roth@amd.com>
Date: Thu, 25 Jan 2024 22:11:11 -0600
Subject: [PATCH] x86/sev: Adjust the directmap to avoid inadvertent RMP faults

If the kernel uses a 2MB or larger directmap mapping to write to an
address, and that mapping contains any 4KB pages that are set to private
in the RMP table, an RMP #PF will trigger and cause a host crash.

SNP-aware code that owns the private PFNs will never attempt such
a write, but other kernel tasks writing to other PFNs in the range may
trigger these checks inadvertently due to writing to those other PFNs
via a large directmap mapping that happens to also map a private PFN.

Prevent this by splitting any 2MB+ mappings that might end up containing
a mix of private/shared PFNs as a result of a subsequent RMPUPDATE for
the PFN/rmp_level passed in.

Another way to handle this would be to limit the directmap to 4K
mappings in the case of hosts that support SNP, but there is potential
risk for performance regressions of certain host workloads.

Handling it as-needed results in the directmap being slowly split over
time, which lessens the risk of a performance regression since the more
the directmap gets split as a result of running SNP guests, the more
likely the host is being used primarily to run SNP guests, where
a mostly-split directmap is actually beneficial since there is less
chance of TLB flushing and cpa_lock contention being needed to perform
these splits.

Cases where a host knows in advance it wants to primarily run SNP guests
and wishes to pre-split the directmap can be handled by adding
a tuneable in the future, but preliminary testing has shown this to not
provide a signficant benefit in the common case of guests that are
backed primarily by 2MB THPs, so it does not seem to be warranted
currently and can be added later if a need arises in the future.

Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20240126041126.1927228-12-michael.roth@amd.com
---
 arch/x86/virt/svm/sev.c | 86 ++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 84 insertions(+), 2 deletions(-)

diff --git a/arch/x86/virt/svm/sev.c b/arch/x86/virt/svm/sev.c
index c0b4c2306e8d..8da9c5330ff0 100644
--- a/arch/x86/virt/svm/sev.c
+++ b/arch/x86/virt/svm/sev.c
@@ -368,6 +368,82 @@ int psmash(u64 pfn)
 }
 EXPORT_SYMBOL_GPL(psmash);
 
+/*
+ * If the kernel uses a 2MB or larger directmap mapping to write to an address,
+ * and that mapping contains any 4KB pages that are set to private in the RMP
+ * table, an RMP #PF will trigger and cause a host crash. Hypervisor code that
+ * owns the PFNs being transitioned will never attempt such a write, but other
+ * kernel tasks writing to other PFNs in the range may trigger these checks
+ * inadvertently due a large directmap mapping that happens to overlap such a
+ * PFN.
+ *
+ * Prevent this by splitting any 2MB+ mappings that might end up containing a
+ * mix of private/shared PFNs as a result of a subsequent RMPUPDATE for the
+ * PFN/rmp_level passed in.
+ *
+ * Note that there is no attempt here to scan all the RMP entries for the 2MB
+ * physical range, since it would only be worthwhile in determining if a
+ * subsequent RMPUPDATE for a 4KB PFN would result in all the entries being of
+ * the same shared/private state, thus avoiding the need to split the mapping.
+ * But that would mean the entries are currently in a mixed state, and so the
+ * mapping would have already been split as a result of prior transitions.
+ * And since the 4K split is only done if the mapping is 2MB+, and there isn't
+ * currently a mechanism in place to restore 2MB+ mappings, such a check would
+ * not provide any usable benefit.
+ *
+ * More specifics on how these checks are carried out can be found in APM
+ * Volume 2, "RMP and VMPL Access Checks".
+ */
+static int adjust_direct_map(u64 pfn, int rmp_level)
+{
+	unsigned long vaddr;
+	unsigned int level;
+	int npages, ret;
+	pte_t *pte;
+
+	/*
+	 * pfn_to_kaddr() will return a vaddr only within the direct
+	 * map range.
+	 */
+	vaddr = (unsigned long)pfn_to_kaddr(pfn);
+
+	/* Only 4KB/2MB RMP entries are supported by current hardware. */
+	if (WARN_ON_ONCE(rmp_level > PG_LEVEL_2M))
+		return -EINVAL;
+
+       if (!pfn_valid(pfn))
+               return -EINVAL;
+
+       if (rmp_level == PG_LEVEL_2M &&
+           (!IS_ALIGNED(pfn, PTRS_PER_PMD) ||
+            !pfn_valid(pfn + PTRS_PER_PMD - 1)))
+		return -EINVAL;
+
+	/*
+	 * If an entire 2MB physical range is being transitioned, then there is
+	 * no risk of RMP #PFs due to write accesses from overlapping mappings,
+	 * since even accesses from 1GB mappings will be treated as 2MB accesses
+	 * as far as RMP table checks are concerned.
+	 */
+	if (rmp_level == PG_LEVEL_2M)
+		return 0;
+
+	pte = lookup_address(vaddr, &level);
+	if (!pte || pte_none(*pte))
+		return 0;
+
+	if (level == PG_LEVEL_4K)
+		return 0;
+
+	npages = page_level_size(rmp_level) / PAGE_SIZE;
+	ret = set_memory_4k(vaddr, npages);
+	if (ret)
+		pr_warn("Failed to split direct map for PFN 0x%llx, ret: %d\n",
+			pfn, ret);
+
+	return ret;
+}
+
 /*
  * It is expected that those operations are seldom enough so that no mutual
  * exclusion of updaters is needed and thus the overlap error condition below
@@ -384,11 +460,16 @@ EXPORT_SYMBOL_GPL(psmash);
 static int rmpupdate(u64 pfn, struct rmp_state *state)
 {
 	unsigned long paddr = pfn << PAGE_SHIFT;
-	int ret;
+	int ret, level;
 
 	if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP))
 		return -ENODEV;
 
+	level = RMP_TO_PG_LEVEL(state->pagesize);
+
+	if (adjust_direct_map(pfn, level))
+		return -EFAULT;
+
 	do {
 		/* Binutils version 2.36 supports the RMPUPDATE mnemonic. */
 		asm volatile(".byte 0xF2, 0x0F, 0x01, 0xFE"
@@ -398,7 +479,8 @@ static int rmpupdate(u64 pfn, struct rmp_state *state)
 	} while (ret == RMPUPDATE_FAIL_OVERLAP);
 
 	if (ret) {
-		pr_err("RMPUPDATE failed for PFN %llx, ret: %d\n", pfn, ret);
+		pr_err("RMPUPDATE failed for PFN %llx, pg_level: %d, ret: %d\n",
+		       pfn, level, ret);
 		dump_rmpentry(pfn);
 		dump_stack();
 		return -EFAULT;
-- 
2.43.0

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 15/25] x86/sev: Introduce snp leaked pages list
  2024-01-26  4:11 ` [PATCH v2 15/25] x86/sev: Introduce snp leaked pages list Michael Roth
@ 2024-01-29 14:26   ` Vlastimil Babka
  2024-01-29 14:29     ` Borislav Petkov
  0 siblings, 1 reply; 47+ messages in thread
From: Vlastimil Babka @ 2024-01-29 14:26 UTC (permalink / raw)
  To: Michael Roth, x86
  Cc: kvm, linux-coco, linux-mm, linux-crypto, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, tobin, bp, kirill, ak, tony.luck,
	sathyanarayanan.kuppuswamy, alpergun, jarkko, ashish.kalra,
	nikunj.dadhania, pankaj.gupta, liam.merwick

On 1/26/24 05:11, Michael Roth wrote:
> From: Ashish Kalra <ashish.kalra@amd.com>
> 
> Pages are unsafe to be released back to the page-allocator, if they
> have been transitioned to firmware/guest state and can't be reclaimed
> or transitioned back to hypervisor/shared state. In this case add
> them to an internal leaked pages list to ensure that they are not freed
> or touched/accessed to cause fatal page faults.
> 
> Suggested-by: Vlastimil Babka <vbabka@suse.cz>
> Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
> [mdr: relocate to arch/x86/virt/svm/sev.c]
> Signed-off-by: Michael Roth <michael.roth@amd.com>

Reviewed-by: Vlastimil Babka <vbabka@suse.cz>

Some minor nitpicks:

> ---
>  arch/x86/include/asm/sev.h |  2 ++
>  arch/x86/virt/svm/sev.c    | 34 ++++++++++++++++++++++++++++++++++
>  2 files changed, 36 insertions(+)
> 
> diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
> index d3ccb7a0c7e9..435ba9bc4510 100644
> --- a/arch/x86/include/asm/sev.h
> +++ b/arch/x86/include/asm/sev.h
> @@ -264,6 +264,7 @@ void snp_dump_hva_rmpentry(unsigned long address);
>  int psmash(u64 pfn);
>  int rmp_make_private(u64 pfn, u64 gpa, enum pg_level level, int asid, bool immutable);
>  int rmp_make_shared(u64 pfn, enum pg_level level);
> +void snp_leak_pages(u64 pfn, unsigned int npages);
>  #else
>  static inline bool snp_probe_rmptable_info(void) { return false; }
>  static inline int snp_lookup_rmpentry(u64 pfn, bool *assigned, int *level) { return -ENODEV; }
> @@ -275,6 +276,7 @@ static inline int rmp_make_private(u64 pfn, u64 gpa, enum pg_level level, int as
>  	return -ENODEV;
>  }
>  static inline int rmp_make_shared(u64 pfn, enum pg_level level) { return -ENODEV; }
> +static inline void snp_leak_pages(u64 pfn, unsigned int npages) {}
>  #endif
>  
>  #endif
> diff --git a/arch/x86/virt/svm/sev.c b/arch/x86/virt/svm/sev.c
> index 1a13eff78c9d..649ac1bb6b0e 100644
> --- a/arch/x86/virt/svm/sev.c
> +++ b/arch/x86/virt/svm/sev.c
> @@ -65,6 +65,11 @@ static u64 probed_rmp_base, probed_rmp_size;
>  static struct rmpentry *rmptable __ro_after_init;
>  static u64 rmptable_max_pfn __ro_after_init;
>  
> +static LIST_HEAD(snp_leaked_pages_list);
> +static DEFINE_SPINLOCK(snp_leaked_pages_list_lock);
> +
> +static unsigned long snp_nr_leaked_pages;
> +
>  #undef pr_fmt
>  #define pr_fmt(fmt)	"SEV-SNP: " fmt
>  
> @@ -505,3 +510,32 @@ int rmp_make_shared(u64 pfn, enum pg_level level)
>  	return rmpupdate(pfn, &state);
>  }
>  EXPORT_SYMBOL_GPL(rmp_make_shared);
> +
> +void snp_leak_pages(u64 pfn, unsigned int npages)
> +{
> +	struct page *page = pfn_to_page(pfn);
> +
> +	pr_warn("Leaking PFN range 0x%llx-0x%llx\n", pfn, pfn + npages);
> +
> +	spin_lock(&snp_leaked_pages_list_lock);
> +	while (npages--) {
> +		/*
> +		 * Reuse the page's buddy list for chaining into the leaked
> +		 * pages list. This page should not be on a free list currently
> +		 * and is also unsafe to be added to a free list.
> +		 */
> +		if (likely(!PageCompound(page)) ||
> +		    (PageHead(page) && compound_nr(page) <= npages))
> +			/*
> +			 * Skip inserting tail pages of compound page as
> +			 * page->buddy_list of tail pages is not usable.
> +			 */
> +			list_add_tail(&page->buddy_list, &snp_leaked_pages_list);

Even though it's not necessary for correctness, with the comment I'd put the
whole block into { } to make easier to follow. Or just move the comment
above the if() itself?

> +		dump_rmpentry(pfn);
> +		snp_nr_leaked_pages++;
> +		pfn++;
> +		page++;
> +	}
> +	spin_unlock(&snp_leaked_pages_list_lock);
> +}
> +EXPORT_SYMBOL_GPL(snp_leak_pages);


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 15/25] x86/sev: Introduce snp leaked pages list
  2024-01-29 14:26   ` Vlastimil Babka
@ 2024-01-29 14:29     ` Borislav Petkov
  0 siblings, 0 replies; 47+ messages in thread
From: Borislav Petkov @ 2024-01-29 14:29 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Michael Roth, x86, kvm, linux-coco, linux-mm, linux-crypto,
	linux-kernel, tglx, mingo, jroedel, thomas.lendacky, hpa, ardb,
	pbonzini, seanjc, vkuznets, jmattson, luto, dave.hansen, slp,
	pgonda, peterz, srinivas.pandruvada, rientjes, tobin, kirill, ak,
	tony.luck, sathyanarayanan.kuppuswamy, alpergun, jarkko,
	ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick

On Mon, Jan 29, 2024 at 03:26:29PM +0100, Vlastimil Babka wrote:
> Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
> 
> Some minor nitpicks:

Thanks, here's what I have applied:

commit c3875aff4e0739a6af385795470da70d675a7635
Author: Ashish Kalra <ashish.kalra@amd.com>
Date:   Thu Jan 25 22:11:15 2024 -0600

    x86/sev: Introduce an SNP leaked pages list
    
    Pages are unsafe to be released back to the page-allocator if they
    have been transitioned to firmware/guest state and can't be reclaimed
    or transitioned back to hypervisor/shared state. In this case, add them
    to an internal leaked pages list to ensure that they are not freed or
    touched/accessed to cause fatal page faults.
    
      [ mdr: Relocate to arch/x86/virt/svm/sev.c ]
    
    Suggested-by: Vlastimil Babka <vbabka@suse.cz>
    Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
    Signed-off-by: Michael Roth <michael.roth@amd.com>
    Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
    Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
    Link: https://lore.kernel.org/r/20240126041126.1927228-16-michael.roth@amd.com

diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index d3ccb7a0c7e9..435ba9bc4510 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -264,6 +264,7 @@ void snp_dump_hva_rmpentry(unsigned long address);
 int psmash(u64 pfn);
 int rmp_make_private(u64 pfn, u64 gpa, enum pg_level level, int asid, bool immutable);
 int rmp_make_shared(u64 pfn, enum pg_level level);
+void snp_leak_pages(u64 pfn, unsigned int npages);
 #else
 static inline bool snp_probe_rmptable_info(void) { return false; }
 static inline int snp_lookup_rmpentry(u64 pfn, bool *assigned, int *level) { return -ENODEV; }
@@ -275,6 +276,7 @@ static inline int rmp_make_private(u64 pfn, u64 gpa, enum pg_level level, int as
 	return -ENODEV;
 }
 static inline int rmp_make_shared(u64 pfn, enum pg_level level) { return -ENODEV; }
+static inline void snp_leak_pages(u64 pfn, unsigned int npages) {}
 #endif
 
 #endif
diff --git a/arch/x86/virt/svm/sev.c b/arch/x86/virt/svm/sev.c
index f1be56555ee6..901863a842d7 100644
--- a/arch/x86/virt/svm/sev.c
+++ b/arch/x86/virt/svm/sev.c
@@ -65,6 +65,11 @@ static u64 probed_rmp_base, probed_rmp_size;
 static struct rmpentry *rmptable __ro_after_init;
 static u64 rmptable_max_pfn __ro_after_init;
 
+static LIST_HEAD(snp_leaked_pages_list);
+static DEFINE_SPINLOCK(snp_leaked_pages_list_lock);
+
+static unsigned long snp_nr_leaked_pages;
+
 #undef pr_fmt
 #define pr_fmt(fmt)	"SEV-SNP: " fmt
 
@@ -515,3 +520,35 @@ int rmp_make_shared(u64 pfn, enum pg_level level)
 	return rmpupdate(pfn, &state);
 }
 EXPORT_SYMBOL_GPL(rmp_make_shared);
+
+void snp_leak_pages(u64 pfn, unsigned int npages)
+{
+	struct page *page = pfn_to_page(pfn);
+
+	pr_warn("Leaking PFN range 0x%llx-0x%llx\n", pfn, pfn + npages);
+
+	spin_lock(&snp_leaked_pages_list_lock);
+	while (npages--) {
+
+		/*
+		 * Reuse the page's buddy list for chaining into the leaked
+		 * pages list. This page should not be on a free list currently
+		 * and is also unsafe to be added to a free list.
+		 */
+		if (likely(!PageCompound(page)) ||
+
+			/*
+			 * Skip inserting tail pages of compound page as
+			 * page->buddy_list of tail pages is not usable.
+			 */
+		    (PageHead(page) && compound_nr(page) <= npages))
+			list_add_tail(&page->buddy_list, &snp_leaked_pages_list);
+
+		dump_rmpentry(pfn);
+		snp_nr_leaked_pages++;
+		pfn++;
+		page++;
+	}
+	spin_unlock(&snp_leaked_pages_list_lock);
+}
+EXPORT_SYMBOL_GPL(snp_leak_pages);


-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 16/25] crypto: ccp: Handle the legacy TMR allocation when SNP is enabled
  2024-01-26  4:11 ` [PATCH v2 16/25] crypto: ccp: Handle the legacy TMR allocation when SNP is enabled Michael Roth
@ 2024-01-29 15:04   ` Borislav Petkov
  0 siblings, 0 replies; 47+ messages in thread
From: Borislav Petkov @ 2024-01-29 15:04 UTC (permalink / raw)
  To: Michael Roth
  Cc: x86, kvm, linux-coco, linux-mm, linux-crypto, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, tobin, vbabka, kirill, ak,
	tony.luck, sathyanarayanan.kuppuswamy, alpergun, jarkko,
	ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick,
	Brijesh Singh

On Thu, Jan 25, 2024 at 10:11:16PM -0600, Michael Roth wrote:
> @@ -641,14 +774,16 @@ static int __sev_platform_init_locked(int *error)

That function - especially when looking at the next patch - becomes too
big and hard to follow.

Let's add subfunctions for each thing, diff ontop:

---
diff --git a/drivers/crypto/ccp/sev-dev.c b/drivers/crypto/ccp/sev-dev.c
index fa992ce57ffe..70aabd1d3d5f 100644
--- a/drivers/crypto/ccp/sev-dev.c
+++ b/drivers/crypto/ccp/sev-dev.c
@@ -759,6 +759,22 @@ static int __sev_snp_init_locked(int *error)
 	return rc;
 }
 
+static void __sev_platform_init_handle_tmr(struct sev_device *sev)
+{
+	if (sev_es_tmr)
+		return;
+
+	/* Obtain the TMR memory area for SEV-ES use */
+	sev_es_tmr = sev_fw_alloc(sev_es_tmr_size);
+	if (sev_es_tmr) {
+		/* Must flush the cache before giving it to the firmware */
+		if (!sev->snp_initialized)
+			clflush_cache_range(sev_es_tmr, sev_es_tmr_size);
+	} else {
+			dev_warn(sev->dev, "SEV: TMR allocation failed, SEV-ES support unavailable\n");
+	}
+}
+
 static int __sev_platform_init_locked(int *error)
 {
 	int rc, psp_ret = SEV_RET_NO_FW_CALL;
@@ -772,18 +788,7 @@ static int __sev_platform_init_locked(int *error)
 	if (sev->state == SEV_STATE_INIT)
 		return 0;
 
-	if (!sev_es_tmr) {
-		/* Obtain the TMR memory area for SEV-ES use */
-		sev_es_tmr = sev_fw_alloc(sev_es_tmr_size);
-		if (sev_es_tmr) {
-			/* Must flush the cache before giving it to the firmware */
-			if (!sev->snp_initialized)
-				clflush_cache_range(sev_es_tmr, sev_es_tmr_size);
-		} else {
-			dev_warn(sev->dev,
-				 "SEV: TMR allocation failed, SEV-ES support unavailable\n");
-		}
-	}
+	__sev_platform_init_handle_tmr(sev);
 
 	if (sev_init_ex_buffer) {
 		rc = sev_read_init_ex_file();

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 17/25] crypto: ccp: Handle non-volatile INIT_EX data when SNP is enabled
  2024-01-26  4:11 ` [PATCH v2 17/25] crypto: ccp: Handle non-volatile INIT_EX data " Michael Roth
@ 2024-01-29 15:12   ` Borislav Petkov
  0 siblings, 0 replies; 47+ messages in thread
From: Borislav Petkov @ 2024-01-29 15:12 UTC (permalink / raw)
  To: Michael Roth
  Cc: x86, kvm, linux-coco, linux-mm, linux-crypto, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, tobin, vbabka, kirill, ak,
	tony.luck, sathyanarayanan.kuppuswamy, alpergun, jarkko,
	ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick

On Thu, Jan 25, 2024 at 10:11:17PM -0600, Michael Roth wrote:
> -	if (sev_init_ex_buffer) {
> +	/*
> +	 * If an init_ex_path is provided allocate a buffer for the file and
> +	 * read in the contents. Additionally, if SNP is initialized, convert
> +	 * the buffer pages to firmware pages.
> +	 */
> +	if (init_ex_path && !sev_init_ex_buffer) {
> +		struct page *page;
> +
> +		page = alloc_pages(GFP_KERNEL, get_order(NV_LENGTH));
> +		if (!page) {
> +			dev_err(sev->dev, "SEV: INIT_EX NV memory allocation failed\n");
> +			return -ENOMEM;
> +		}
> +
> +		sev_init_ex_buffer = page_address(page);
> +
>  		rc = sev_read_init_ex_file();
>  		if (rc)
>  			return rc;
> +
> +		/* If SEV-SNP is initialized, transition to firmware page. */
> +		if (sev->snp_initialized) {
> +			unsigned long npages;
> +
> +			npages = 1UL << get_order(NV_LENGTH);
> +			if (rmp_mark_pages_firmware(__pa(sev_init_ex_buffer),
> +						    npages, false)) {
> +				dev_err(sev->dev,
> +					"SEV: INIT_EX NV memory page state change failed.\n");
> +				return -ENOMEM;
> +			}
> +		}
>  	}

Ontop:

---
diff --git a/drivers/crypto/ccp/sev-dev.c b/drivers/crypto/ccp/sev-dev.c
index c364ad33f376..5ec563611953 100644
--- a/drivers/crypto/ccp/sev-dev.c
+++ b/drivers/crypto/ccp/sev-dev.c
@@ -775,6 +775,48 @@ static void __sev_platform_init_handle_tmr(struct sev_device *sev)
 	}
 }
 
+/*
+ * If an init_ex_path is provided allocate a buffer for the file and
+ * read in the contents. Additionally, if SNP is initialized, convert
+ * the buffer pages to firmware pages.
+ */
+static int __sev_platform_init_handle_init_ex_path(struct sev_device *sev)
+{
+	struct page *page;
+	int rc;
+
+	if (!init_ex_path)
+		return 0;
+
+	if (sev_init_ex_buffer)
+		return 0;
+
+	page = alloc_pages(GFP_KERNEL, get_order(NV_LENGTH));
+	if (!page) {
+		dev_err(sev->dev, "SEV: INIT_EX NV memory allocation failed\n");
+		return -ENOMEM;
+	}
+
+	sev_init_ex_buffer = page_address(page);
+
+	rc = sev_read_init_ex_file();
+	if (rc)
+		return rc;
+
+	/* If SEV-SNP is initialized, transition to firmware page. */
+	if (sev->snp_initialized) {
+		unsigned long npages;
+
+		npages = 1UL << get_order(NV_LENGTH);
+		if (rmp_mark_pages_firmware(__pa(sev_init_ex_buffer), npages, false)) {
+			dev_err(sev->dev, "SEV: INIT_EX NV memory page state change failed.\n");
+			return -ENOMEM;
+		}
+	}
+
+	return 0;
+}
+
 static int __sev_platform_init_locked(int *error)
 {
 	int rc, psp_ret = SEV_RET_NO_FW_CALL;
@@ -790,39 +832,9 @@ static int __sev_platform_init_locked(int *error)
 
 	__sev_platform_init_handle_tmr(sev);
 
-	/*
-	 * If an init_ex_path is provided allocate a buffer for the file and
-	 * read in the contents. Additionally, if SNP is initialized, convert
-	 * the buffer pages to firmware pages.
-	 */
-	if (init_ex_path && !sev_init_ex_buffer) {
-		struct page *page;
-
-		page = alloc_pages(GFP_KERNEL, get_order(NV_LENGTH));
-		if (!page) {
-			dev_err(sev->dev, "SEV: INIT_EX NV memory allocation failed\n");
-			return -ENOMEM;
-		}
-
-		sev_init_ex_buffer = page_address(page);
-
-		rc = sev_read_init_ex_file();
-		if (rc)
-			return rc;
-
-		/* If SEV-SNP is initialized, transition to firmware page. */
-		if (sev->snp_initialized) {
-			unsigned long npages;
-
-			npages = 1UL << get_order(NV_LENGTH);
-			if (rmp_mark_pages_firmware(__pa(sev_init_ex_buffer),
-						    npages, false)) {
-				dev_err(sev->dev,
-					"SEV: INIT_EX NV memory page state change failed.\n");
-				return -ENOMEM;
-			}
-		}
-	}
+	rc = __sev_platform_init_handle_init_ex_path(sev);
+	if (rc)
+		return rc;
 
 	rc = __sev_do_init_locked(&psp_ret);
 	if (rc && psp_ret == SEV_RET_SECURE_DATA_INVALID) {


-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 11/25] x86/sev: Adjust directmap to avoid inadvertant RMP faults
  2024-01-29 11:59                 ` Borislav Petkov
@ 2024-01-29 15:26                   ` Vlastimil Babka
  0 siblings, 0 replies; 47+ messages in thread
From: Vlastimil Babka @ 2024-01-29 15:26 UTC (permalink / raw)
  To: Borislav Petkov, Michael Roth
  Cc: x86, kvm, linux-coco, linux-mm, linux-crypto, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, tobin, kirill, ak, tony.luck,
	sathyanarayanan.kuppuswamy, alpergun, jarkko, ashish.kalra,
	nikunj.dadhania, pankaj.gupta, liam.merwick

On 1/29/24 12:59, Borislav Petkov wrote:
> On Sat, Jan 27, 2024 at 05:02:49PM +0100, Borislav Petkov wrote:
>> This function takes any PFN it gets passed in as it is. I don't care
>> who its users are now or in the future and whether they pay attention
>> what they pass into - it needs to be properly defined.
> 
> Ok, we solved it offlist, here's the final version I have. It has
> a comment explaining what I was asking.
> 
> ---
> From: Michael Roth <michael.roth@amd.com>
> Date: Thu, 25 Jan 2024 22:11:11 -0600
> Subject: [PATCH] x86/sev: Adjust the directmap to avoid inadvertent RMP faults
> 
> If the kernel uses a 2MB or larger directmap mapping to write to an
> address, and that mapping contains any 4KB pages that are set to private
> in the RMP table, an RMP #PF will trigger and cause a host crash.
> 
> SNP-aware code that owns the private PFNs will never attempt such
> a write, but other kernel tasks writing to other PFNs in the range may
> trigger these checks inadvertently due to writing to those other PFNs
> via a large directmap mapping that happens to also map a private PFN.
> 
> Prevent this by splitting any 2MB+ mappings that might end up containing
> a mix of private/shared PFNs as a result of a subsequent RMPUPDATE for
> the PFN/rmp_level passed in.
> 
> Another way to handle this would be to limit the directmap to 4K
> mappings in the case of hosts that support SNP, but there is potential
> risk for performance regressions of certain host workloads.
> 
> Handling it as-needed results in the directmap being slowly split over
> time, which lessens the risk of a performance regression since the more
> the directmap gets split as a result of running SNP guests, the more
> likely the host is being used primarily to run SNP guests, where
> a mostly-split directmap is actually beneficial since there is less
> chance of TLB flushing and cpa_lock contention being needed to perform
> these splits.
> 
> Cases where a host knows in advance it wants to primarily run SNP guests
> and wishes to pre-split the directmap can be handled by adding
> a tuneable in the future, but preliminary testing has shown this to not
> provide a signficant benefit in the common case of guests that are
> backed primarily by 2MB THPs, so it does not seem to be warranted
> currently and can be added later if a need arises in the future.
> 
> Signed-off-by: Michael Roth <michael.roth@amd.com>
> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
> Link: https://lore.kernel.org/r/20240126041126.1927228-12-michael.roth@amd.com

Acked-by: Vlastimil Babka <vbabka@suse.cz>

Thanks!

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 13/25] crypto: ccp: Add support to initialize the AMD-SP for SEV-SNP
  2024-01-26  4:11 ` [PATCH v2 13/25] crypto: ccp: Add support to initialize the AMD-SP for SEV-SNP Michael Roth
@ 2024-01-29 17:58   ` Borislav Petkov
  0 siblings, 0 replies; 47+ messages in thread
From: Borislav Petkov @ 2024-01-29 17:58 UTC (permalink / raw)
  To: Michael Roth
  Cc: x86, kvm, linux-coco, linux-mm, linux-crypto, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, tobin, vbabka, kirill, ak,
	tony.luck, sathyanarayanan.kuppuswamy, alpergun, jarkko,
	ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick,
	Brijesh Singh, Jarkko Sakkinen

On Thu, Jan 25, 2024 at 10:11:13PM -0600, Michael Roth wrote:
> diff --git a/include/linux/psp-sev.h b/include/linux/psp-sev.h
> index 006e4cdbeb78..8128de17f0f4 100644
> --- a/include/linux/psp-sev.h
> +++ b/include/linux/psp-sev.h
> @@ -790,10 +790,23 @@ struct sev_data_snp_shutdown_ex {
>  
>  #ifdef CONFIG_CRYPTO_DEV_SP_PSP
>  
> +/**
> + * struct sev_platform_init_args
> + *
> + * @error: SEV firmware error code
> + * @probe: True if this is being called as part of CCP module probe, which
> + *  will defer SEV_INIT/SEV_INIT_EX firmware initialization until needed
> + *  unless psp_init_on_probe module param is set
> + */
> +struct sev_platform_init_args {
> +	int error;
> +	bool probe;
> +};

This struct definition cannot be under the ifdef, otherwise:

arch/x86/kvm/svm/sev.c: In function ‘sev_guest_init’:
arch/x86/kvm/svm/sev.c:267:33: error: passing argument 1 of ‘sev_platform_init’ from incompatible pointer type [-Werror=incompatible-pointer-types]
  267 |         ret = sev_platform_init(&init_args);
      |                                 ^~~~~~~~~~
      |                                 |
      |                                 struct sev_platform_init_args *
In file included from arch/x86/kvm/svm/sev.c:16:
./include/linux/psp-sev.h:952:42: note: expected ‘int *’ but argument is of type ‘struct sev_platform_init_args *’
  952 | static inline int sev_platform_init(int *error) { return -ENODEV; }
      |                                     ~~~~~^~~~~
cc1: all warnings being treated as errors

---

on a 32-bit allmodconfig.

Build fix:

---

diff --git a/include/linux/psp-sev.h b/include/linux/psp-sev.h
index beba10d6b39c..d0e184db9d37 100644
--- a/include/linux/psp-sev.h
+++ b/include/linux/psp-sev.h
@@ -797,8 +797,6 @@ struct sev_data_snp_commit {
 	u32 len;
 } __packed;
 
-#ifdef CONFIG_CRYPTO_DEV_SP_PSP
-
 /**
  * struct sev_platform_init_args
  *
@@ -812,6 +810,8 @@ struct sev_platform_init_args {
 	bool probe;
 };
 
+#ifdef CONFIG_CRYPTO_DEV_SP_PSP
+
 /**
  * sev_platform_init - perform SEV INIT command
  *
@@ -949,7 +949,7 @@ void snp_free_firmware_page(void *addr);
 static inline int
 sev_platform_status(struct sev_user_data_status *status, int *error) { return -ENODEV; }
 
-static inline int sev_platform_init(int *error) { return -ENODEV; }
+static inline int sev_platform_init(struct sev_platform_init_args *args) { return -ENODEV; }
 
 static inline int
 sev_guest_deactivate(struct sev_data_deactivate *data, int *error) { return -ENODEV; }

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 10/25] x86/sev: Add helper functions for RMPUPDATE and PSMASH instruction
  2024-01-26  4:11 ` [PATCH v2 10/25] x86/sev: Add helper functions for RMPUPDATE and PSMASH instruction Michael Roth
@ 2024-01-29 18:00   ` Liam Merwick
  2024-01-29 19:28     ` Borislav Petkov
  0 siblings, 1 reply; 47+ messages in thread
From: Liam Merwick @ 2024-01-29 18:00 UTC (permalink / raw)
  To: Michael Roth, x86
  Cc: kvm, linux-coco, linux-mm, linux-crypto, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, tobin, bp, vbabka, kirill, ak,
	tony.luck, sathyanarayanan.kuppuswamy, alpergun, jarkko,
	ashish.kalra, nikunj.dadhania, pankaj.gupta, Brijesh Singh,
	Liam Merwick

On 26/01/2024 04:11, Michael Roth wrote:
> From: Brijesh Singh <brijesh.singh@amd.com>
> 
> The RMPUPDATE instruction updates the access restrictions for a page via
> its corresponding entry in the RMP Table. The hypervisor will use the
> instruction to enforce various access restrictions on pages used for
> confidential guests and other specialized functionality. See APM3 for
> details on the instruction operations.
> 
> The PSMASH instruction expands a 2MB RMP entry in the RMP table into a
> corresponding set of contiguous 4KB RMP entries while retaining the
> state of the validated bit from the original 2MB RMP entry. The
> hypervisor will use this instruction in cases where it needs to re-map a
> page as 4K rather than 2MB in a guest's nested page table.
> 
> Add helpers to make use of these instructions.
> 
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
> [mdr: add RMPUPDATE retry logic for transient FAIL_OVERLAP errors]
> Signed-off-by: Michael Roth <michael.roth@amd.com>
> ---
>   arch/x86/include/asm/sev.h | 23 ++++++++++
>   arch/x86/virt/svm/sev.c    | 92 ++++++++++++++++++++++++++++++++++++++
>   2 files changed, 115 insertions(+)
> 
> diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
> index 2c53e3de0b71..d3ccb7a0c7e9 100644
> --- a/arch/x86/include/asm/sev.h
> +++ b/arch/x86/include/asm/sev.h
> @@ -87,10 +87,23 @@ extern bool handle_vc_boot_ghcb(struct pt_regs *regs);
>   /* Software defined (when rFlags.CF = 1) */
>   #define PVALIDATE_FAIL_NOUPDATE		255
>   
> +/* RMUPDATE detected 4K page and 2MB page overlap. */
> +#define RMPUPDATE_FAIL_OVERLAP		4
> +
>   /* RMP page size */
>   #define RMP_PG_SIZE_4K			0
>   #define RMP_PG_SIZE_2M			1
>   #define RMP_TO_PG_LEVEL(level)		(((level) == RMP_PG_SIZE_4K) ? PG_LEVEL_4K : PG_LEVEL_2M)
> +#define PG_LEVEL_TO_RMP(level)		(((level) == PG_LEVEL_4K) ? RMP_PG_SIZE_4K : RMP_PG_SIZE_2M)
> +
> +struct rmp_state {
> +	u64 gpa;
> +	u8 assigned;
> +	u8 pagesize;
> +	u8 immutable;
> +	u8 rsvd;
> +	u32 asid;
> +} __packed;
>   
>   #define RMPADJUST_VMSA_PAGE_BIT		BIT(16)
>   
> @@ -248,10 +261,20 @@ static inline u64 sev_get_status(void) { return 0; }
>   bool snp_probe_rmptable_info(void);
>   int snp_lookup_rmpentry(u64 pfn, bool *assigned, int *level);
>   void snp_dump_hva_rmpentry(unsigned long address);
> +int psmash(u64 pfn);
> +int rmp_make_private(u64 pfn, u64 gpa, enum pg_level level, int asid, bool immutable);
> +int rmp_make_shared(u64 pfn, enum pg_level level);
>   #else
>   static inline bool snp_probe_rmptable_info(void) { return false; }
>   static inline int snp_lookup_rmpentry(u64 pfn, bool *assigned, int *level) { return -ENODEV; }
>   static inline void snp_dump_hva_rmpentry(unsigned long address) {}
> +static inline int psmash(u64 pfn) { return -ENODEV; }
> +static inline int rmp_make_private(u64 pfn, u64 gpa, enum pg_level level, int asid,
> +				   bool immutable)
> +{
> +	return -ENODEV;
> +}
> +static inline int rmp_make_shared(u64 pfn, enum pg_level level) { return -ENODEV; }
>   #endif
>   
>   #endif
> diff --git a/arch/x86/virt/svm/sev.c b/arch/x86/virt/svm/sev.c
> index c74266e039b2..16b3d8139649 100644
> --- a/arch/x86/virt/svm/sev.c
> +++ b/arch/x86/virt/svm/sev.c
> @@ -342,3 +342,95 @@ void snp_dump_hva_rmpentry(unsigned long hva)
>   	paddr = PFN_PHYS(pte_pfn(*pte)) | (hva & ~page_level_mask(level));
>   	dump_rmpentry(PHYS_PFN(paddr));
>   }
> +
> +/*
> + * PSMASH a 2MB aligned page into 4K pages in the RMP table while preserving the
> + * Validated bit.
> + */
> +int psmash(u64 pfn)
> +{
> +	unsigned long paddr = pfn << PAGE_SHIFT;
> +	int ret;
> +
> +	if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP))
> +		return -ENODEV;
> +
> +	if (!pfn_valid(pfn))
> +		return -EINVAL;
> +
> +	/* Binutils version 2.36 supports the PSMASH mnemonic. */
> +	asm volatile(".byte 0xF3, 0x0F, 0x01, 0xFF"
> +		      : "=a" (ret)
> +		      : "a" (paddr)
> +		      : "memory", "cc");
> +
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(psmash);
> +
> +/*
> + * It is expected that those operations are seldom enough so that no mutual
> + * exclusion of updaters is needed and thus the overlap error condition below
> + * should happen very seldomly and would get resolved relatively quickly by
> + * the firmware.
> + *
> + * If not, one could consider introducing a mutex or so here to sync concurrent
> + * RMP updates and thus diminish the amount of cases where firmware needs to
> + * lock 2M ranges to protect against concurrent updates.
> + *
> + * The optimal solution would be range locking to avoid locking disjoint
> + * regions unnecessarily but there's no support for that yet.
> + */
> +static int rmpupdate(u64 pfn, struct rmp_state *state)
> +{
> +	unsigned long paddr = pfn << PAGE_SHIFT;
> +	int ret;
> +
> +	if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP))
> +		return -ENODEV;
> +
> +	do {
> +		/* Binutils version 2.36 supports the RMPUPDATE mnemonic. */
> +		asm volatile(".byte 0xF2, 0x0F, 0x01, 0xFE"
> +			     : "=a" (ret)
> +			     : "a" (paddr), "c" ((unsigned long)state)
> +			     : "memory", "cc");
> +	} while (ret == RMPUPDATE_FAIL_OVERLAP);
> +
> +	if (ret) {
> +		pr_err("RMPUPDATE failed for PFN %llx, ret: %d\n", pfn, ret);
> +		dump_rmpentry(pfn);
> +		dump_stack();
> +		return -EFAULT;
> +	}
> +
> +	return 0;
> +}
> +
> +/* Transition a page to guest-owned/private state in the RMP table. */
> +int rmp_make_private(u64 pfn, u64 gpa, enum pg_level level, int asid, bool immutable)


asid is typically a u32 (or at least unsigned) - is it better to avoid 
potential conversion issues with an 'int'?

Otherwise
Reviewed-by: Liam Merwick <liam.merwick@oracle.com>


> +{
> +	struct rmp_state state;
> +
> +	memset(&state, 0, sizeof(state));
> +	state.assigned = 1;
> +	state.asid = asid > +	state.immutable = immutable;
> +	state.gpa = gpa;
> +	state.pagesize = PG_LEVEL_TO_RMP(level);
> +
> +	return rmpupdate(pfn, &state);
> +}
> +EXPORT_SYMBOL_GPL(rmp_make_private);
> +
> +/* Transition a page to hypervisor-owned/shared state in the RMP table. */
> +int rmp_make_shared(u64 pfn, enum pg_level level)
> +{
> +	struct rmp_state state;
> +
> +	memset(&state, 0, sizeof(state));
> +	state.pagesize = PG_LEVEL_TO_RMP(level);
> +
> +	return rmpupdate(pfn, &state);
> +}
> +EXPORT_SYMBOL_GPL(rmp_make_shared);


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 25/25] crypto: ccp: Add the SNP_SET_CONFIG command
  2024-01-26  4:11 ` [PATCH v2 25/25] crypto: ccp: Add the SNP_SET_CONFIG command Michael Roth
@ 2024-01-29 19:18   ` Liam Merwick
  2024-01-29 20:10     ` Michael Roth
  0 siblings, 1 reply; 47+ messages in thread
From: Liam Merwick @ 2024-01-29 19:18 UTC (permalink / raw)
  To: Michael Roth, x86
  Cc: kvm, linux-coco, linux-mm, linux-crypto, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, tobin, bp, vbabka, kirill, ak,
	tony.luck, sathyanarayanan.kuppuswamy, alpergun, jarkko,
	ashish.kalra, nikunj.dadhania, pankaj.gupta, Brijesh Singh,
	Alexey Kardashevskiy, Dionna Glaze, Liam Merwick

On 26/01/2024 04:11, Michael Roth wrote:
> From: Brijesh Singh <brijesh.singh@amd.com>
> 
> The SEV-SNP firmware provides the SNP_CONFIG command used to set various
> system-wide configuration values for SNP guests, such as the reported
> TCB version used when signing guest attestation reports. Add an
> interface to set this via userspace.
> 
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> Co-developed-by: Alexey Kardashevskiy <aik@amd.com>
> Signed-off-by: Alexey Kardashevskiy <aik@amd.com>
> Co-developed-by: Dionna Glaze <dionnaglaze@google.com>
> Signed-off-by: Dionna Glaze <dionnaglaze@google.com>
> Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
> [mdr: squash in doc patch from Dionna, drop extended request/certificate
>   handling and simplify this to a simple wrapper around SNP_CONFIG fw
>   cmd]
> Signed-off-by: Michael Roth <michael.roth@amd.com>
> ---
>   Documentation/virt/coco/sev-guest.rst | 13 +++++++++++++
>   drivers/crypto/ccp/sev-dev.c          | 20 ++++++++++++++++++++
>   include/uapi/linux/psp-sev.h          |  1 +
>   3 files changed, 34 insertions(+)
> 
> diff --git a/Documentation/virt/coco/sev-guest.rst b/Documentation/virt/coco/sev-guest.rst
> index 007ae828aa2a..14c9de997b7d 100644
> --- a/Documentation/virt/coco/sev-guest.rst
> +++ b/Documentation/virt/coco/sev-guest.rst
> @@ -162,6 +162,19 @@ SEV-SNP firmware SNP_COMMIT command. This prevents roll-back to a previously
>   committed firmware version. This will also update the reported TCB to match
>   that of the currently installed firmware.
>   
> +2.6 SNP_SET_CONFIG
> +------------------
> +:Technology: sev-snp
> +:Type: hypervisor ioctl cmd
> +:Parameters (in): struct sev_user_data_snp_config
> +:Returns (out): 0 on success, -negative on error
> +
> +SNP_SET_CONFIG is used to set the system-wide configuration such as
> +reported TCB version in the attestation report. The command is similar
> +to SNP_CONFIG command defined in the SEV-SNP spec. The current values of
> +the firmware parameters affected by this command can be queried via
> +SNP_PLATFORM_STATUS.
> +
>   3. SEV-SNP CPUID Enforcement
>   ============================
>   
> diff --git a/drivers/crypto/ccp/sev-dev.c b/drivers/crypto/ccp/sev-dev.c
> index 73ace4064e5a..398ae932aa0b 100644
> --- a/drivers/crypto/ccp/sev-dev.c
> +++ b/drivers/crypto/ccp/sev-dev.c
> @@ -1982,6 +1982,23 @@ static int sev_ioctl_do_snp_commit(struct sev_issue_cmd *argp)
>   	return __sev_do_cmd_locked(SEV_CMD_SNP_COMMIT, &buf, &argp->error);
>   }
>   
> +static int sev_ioctl_do_snp_set_config(struct sev_issue_cmd *argp, bool writable)
> +{
> +	struct sev_device *sev = psp_master->sev_data;

Should this check that psp_master is not NULL? Like the fix for
https://lore.kernel.org/all/20240125231253.3122579-1-kim.phillips@amd.com/

Otherwise,
Reviewed-by: Liam Merwick <liam.merwick@oracle.com>

> +	struct sev_user_data_snp_config config;
> +
> +	if (!sev->snp_initialized || !argp->data)
> +		return -EINVAL;
> +
> +	if (!writable)
> +		return -EPERM;
> +
> +	if (copy_from_user(&config, (void __user *)argp->data, sizeof(config)))
> +		return -EFAULT;
> +
> +	return __sev_do_cmd_locked(SEV_CMD_SNP_CONFIG, &config, &argp->error);
> +}
> +
>   static long sev_ioctl(struct file *file, unsigned int ioctl, unsigned long arg)
>   {
>   	void __user *argp = (void __user *)arg;
> @@ -2039,6 +2056,9 @@ static long sev_ioctl(struct file *file, unsigned int ioctl, unsigned long arg)
>   	case SNP_COMMIT:
>   		ret = sev_ioctl_do_snp_commit(&input);
>   		break;
> +	case SNP_SET_CONFIG:
> +		ret = sev_ioctl_do_snp_set_config(&input, writable);
> +		break;
>   	default:
>   		ret = -EINVAL;
>   		goto out;
> diff --git a/include/uapi/linux/psp-sev.h b/include/uapi/linux/psp-sev.h
> index 35c207664e95..b7a2c2ee35b7 100644
> --- a/include/uapi/linux/psp-sev.h
> +++ b/include/uapi/linux/psp-sev.h
> @@ -30,6 +30,7 @@ enum {
>   	SEV_GET_ID2,
>   	SNP_PLATFORM_STATUS,
>   	SNP_COMMIT,
> +	SNP_SET_CONFIG,
>   
>   	SEV_MAX,
>   };


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 10/25] x86/sev: Add helper functions for RMPUPDATE and PSMASH instruction
  2024-01-29 18:00   ` Liam Merwick
@ 2024-01-29 19:28     ` Borislav Petkov
  2024-01-29 19:33       ` Borislav Petkov
  0 siblings, 1 reply; 47+ messages in thread
From: Borislav Petkov @ 2024-01-29 19:28 UTC (permalink / raw)
  To: Liam Merwick
  Cc: Michael Roth, x86, kvm, linux-coco, linux-mm, linux-crypto,
	linux-kernel, tglx, mingo, jroedel, thomas.lendacky, hpa, ardb,
	pbonzini, seanjc, vkuznets, jmattson, luto, dave.hansen, slp,
	pgonda, peterz, srinivas.pandruvada, rientjes, tobin, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta,
	Brijesh Singh

On Mon, Jan 29, 2024 at 06:00:36PM +0000, Liam Merwick wrote:
> asid is typically a u32 (or at least unsigned) - is it better to avoid 
> potential conversion issues with an 'int'?

struct rmp_state.asid is already u32 but yeah, lemme fix that.

> Otherwise
> Reviewed-by: Liam Merwick <liam.merwick@oracle.com>

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 10/25] x86/sev: Add helper functions for RMPUPDATE and PSMASH instruction
  2024-01-29 19:28     ` Borislav Petkov
@ 2024-01-29 19:33       ` Borislav Petkov
  0 siblings, 0 replies; 47+ messages in thread
From: Borislav Petkov @ 2024-01-29 19:33 UTC (permalink / raw)
  To: Liam Merwick
  Cc: Michael Roth, x86, kvm, linux-coco, linux-mm, linux-crypto,
	linux-kernel, tglx, mingo, jroedel, thomas.lendacky, hpa, ardb,
	pbonzini, seanjc, vkuznets, jmattson, luto, dave.hansen, slp,
	pgonda, peterz, srinivas.pandruvada, rientjes, tobin, vbabka,
	kirill, ak, tony.luck, sathyanarayanan.kuppuswamy, alpergun,
	jarkko, ashish.kalra, nikunj.dadhania, pankaj.gupta,
	Brijesh Singh

On Mon, Jan 29, 2024 at 08:28:20PM +0100, Borislav Petkov wrote:
> On Mon, Jan 29, 2024 at 06:00:36PM +0000, Liam Merwick wrote:
> > asid is typically a u32 (or at least unsigned) - is it better to avoid 
> > potential conversion issues with an 'int'?
> 
> struct rmp_state.asid is already u32 but yeah, lemme fix that.

Btw,

static int sev_get_asid(struct kvm *kvm)
{
        struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;

        return sev->asid;
}

whereas

struct kvm_sev_info {
	...
        unsigned int asid;      /* ASID used for this guest */

so that function needs fixing too.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 25/25] crypto: ccp: Add the SNP_SET_CONFIG command
  2024-01-29 19:18   ` Liam Merwick
@ 2024-01-29 20:10     ` Michael Roth
  0 siblings, 0 replies; 47+ messages in thread
From: Michael Roth @ 2024-01-29 20:10 UTC (permalink / raw)
  To: Liam Merwick
  Cc: x86, kvm, linux-coco, linux-mm, linux-crypto, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, tobin, bp, vbabka, kirill, ak,
	tony.luck, sathyanarayanan.kuppuswamy, alpergun, jarkko,
	ashish.kalra, nikunj.dadhania, pankaj.gupta, Brijesh Singh,
	Alexey Kardashevskiy, Dionna Glaze

On Mon, Jan 29, 2024 at 07:18:54PM +0000, Liam Merwick wrote:
> On 26/01/2024 04:11, Michael Roth wrote:
> > From: Brijesh Singh <brijesh.singh@amd.com>
> > 
> > The SEV-SNP firmware provides the SNP_CONFIG command used to set various
> > system-wide configuration values for SNP guests, such as the reported
> > TCB version used when signing guest attestation reports. Add an
> > interface to set this via userspace.
> > 
> > Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> > Co-developed-by: Alexey Kardashevskiy <aik@amd.com>
> > Signed-off-by: Alexey Kardashevskiy <aik@amd.com>
> > Co-developed-by: Dionna Glaze <dionnaglaze@google.com>
> > Signed-off-by: Dionna Glaze <dionnaglaze@google.com>
> > Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
> > [mdr: squash in doc patch from Dionna, drop extended request/certificate
> >   handling and simplify this to a simple wrapper around SNP_CONFIG fw
> >   cmd]
> > Signed-off-by: Michael Roth <michael.roth@amd.com>
> > ---
> >   Documentation/virt/coco/sev-guest.rst | 13 +++++++++++++
> >   drivers/crypto/ccp/sev-dev.c          | 20 ++++++++++++++++++++
> >   include/uapi/linux/psp-sev.h          |  1 +
> >   3 files changed, 34 insertions(+)
> > 
> > diff --git a/Documentation/virt/coco/sev-guest.rst b/Documentation/virt/coco/sev-guest.rst
> > index 007ae828aa2a..14c9de997b7d 100644
> > --- a/Documentation/virt/coco/sev-guest.rst
> > +++ b/Documentation/virt/coco/sev-guest.rst
> > @@ -162,6 +162,19 @@ SEV-SNP firmware SNP_COMMIT command. This prevents roll-back to a previously
> >   committed firmware version. This will also update the reported TCB to match
> >   that of the currently installed firmware.
> >   
> > +2.6 SNP_SET_CONFIG
> > +------------------
> > +:Technology: sev-snp
> > +:Type: hypervisor ioctl cmd
> > +:Parameters (in): struct sev_user_data_snp_config
> > +:Returns (out): 0 on success, -negative on error
> > +
> > +SNP_SET_CONFIG is used to set the system-wide configuration such as
> > +reported TCB version in the attestation report. The command is similar
> > +to SNP_CONFIG command defined in the SEV-SNP spec. The current values of
> > +the firmware parameters affected by this command can be queried via
> > +SNP_PLATFORM_STATUS.
> > +
> >   3. SEV-SNP CPUID Enforcement
> >   ============================
> >   
> > diff --git a/drivers/crypto/ccp/sev-dev.c b/drivers/crypto/ccp/sev-dev.c
> > index 73ace4064e5a..398ae932aa0b 100644
> > --- a/drivers/crypto/ccp/sev-dev.c
> > +++ b/drivers/crypto/ccp/sev-dev.c
> > @@ -1982,6 +1982,23 @@ static int sev_ioctl_do_snp_commit(struct sev_issue_cmd *argp)
> >   	return __sev_do_cmd_locked(SEV_CMD_SNP_COMMIT, &buf, &argp->error);
> >   }
> >   
> > +static int sev_ioctl_do_snp_set_config(struct sev_issue_cmd *argp, bool writable)
> > +{
> > +	struct sev_device *sev = psp_master->sev_data;
> 
> Should this check that psp_master is not NULL? Like the fix for
> https://lore.kernel.org/all/20240125231253.3122579-1-kim.phillips@amd.com/

It wouldn't hurt, but sev_ioctl() will check for it before getting here,
so I don't think it's a reachable condition.

There's a number of spots where existing SEV/legacy code relies on
psp_pci_init() always setting psp_master before sev_pci_init(). Kim's
patch fixes 1 exception that's reachable if you synthetically force a
driver remove via DEBUG_TEST_DRIVER_REMOVE before the psp_pci_init()
call is made during probe, which generally wouldn't happen in practice.

But there may be other cases like the one Kim hit where "not work"
isn't handled gracefully. Fortunately the other obvious spot,
sev_do_cmd(), also checks for !psp_master, so that covers most of the
internal users of sev-dev.c.

But a general cleanup to introduce a helper like get_sev_device() that
handles !psp_master might be a nice follow-up to make things more
robust.

-Mike

> 
> Otherwise,
> Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
> 
> > +	struct sev_user_data_snp_config config;
> > +
> > +	if (!sev->snp_initialized || !argp->data)
> > +		return -EINVAL;
> > +
> > +	if (!writable)
> > +		return -EPERM;
> > +
> > +	if (copy_from_user(&config, (void __user *)argp->data, sizeof(config)))
> > +		return -EFAULT;
> > +
> > +	return __sev_do_cmd_locked(SEV_CMD_SNP_CONFIG, &config, &argp->error);
> > +}
> > +
> >   static long sev_ioctl(struct file *file, unsigned int ioctl, unsigned long arg)
> >   {
> >   	void __user *argp = (void __user *)arg;
> > @@ -2039,6 +2056,9 @@ static long sev_ioctl(struct file *file, unsigned int ioctl, unsigned long arg)
> >   	case SNP_COMMIT:
> >   		ret = sev_ioctl_do_snp_commit(&input);
> >   		break;
> > +	case SNP_SET_CONFIG:
> > +		ret = sev_ioctl_do_snp_set_config(&input, writable);
> > +		break;
> >   	default:
> >   		ret = -EINVAL;
> >   		goto out;
> > diff --git a/include/uapi/linux/psp-sev.h b/include/uapi/linux/psp-sev.h
> > index 35c207664e95..b7a2c2ee35b7 100644
> > --- a/include/uapi/linux/psp-sev.h
> > +++ b/include/uapi/linux/psp-sev.h
> > @@ -30,6 +30,7 @@ enum {
> >   	SEV_GET_ID2,
> >   	SNP_PLATFORM_STATUS,
> >   	SNP_COMMIT,
> > +	SNP_SET_CONFIG,
> >   
> >   	SEV_MAX,
> >   };
> 

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 00/25] Add AMD Secure Nested Paging (SEV-SNP) Initialization Support
  2024-01-26  4:11 [PATCH v2 00/25] Add AMD Secure Nested Paging (SEV-SNP) Initialization Support Michael Roth
                   ` (24 preceding siblings ...)
  2024-01-26  4:11 ` [PATCH v2 25/25] crypto: ccp: Add the SNP_SET_CONFIG command Michael Roth
@ 2024-01-30 16:19 ` Borislav Petkov
  25 siblings, 0 replies; 47+ messages in thread
From: Borislav Petkov @ 2024-01-30 16:19 UTC (permalink / raw)
  To: Michael Roth, Paolo Bonzini, Sean Christopherson
  Cc: x86, kvm, linux-coco, linux-mm, linux-crypto, linux-kernel, tglx,
	mingo, jroedel, thomas.lendacky, hpa, ardb, pbonzini, seanjc,
	vkuznets, jmattson, luto, dave.hansen, slp, pgonda, peterz,
	srinivas.pandruvada, rientjes, tobin, vbabka, kirill, ak,
	tony.luck, sathyanarayanan.kuppuswamy, alpergun, jarkko,
	ashish.kalra, nikunj.dadhania, pankaj.gupta, liam.merwick

On Thu, Jan 25, 2024 at 10:11:00PM -0600, Michael Roth wrote:
> This patchset is also available at:
> 
>   https://github.com/amdese/linux/commits/snp-host-init-v2
> 
> and is based on top of linux-next tag next-20240125

Ok, I've queued this now after some testing.

As of now

https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/log/?h=x86/sev

is frozen and fixes should come ontop so that Paolo/Sean can use that
branch to base the remaining SNP host stuff ontop.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 47+ messages in thread

end of thread, other threads:[~2024-01-30 16:20 UTC | newest]

Thread overview: 47+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-01-26  4:11 [PATCH v2 00/25] Add AMD Secure Nested Paging (SEV-SNP) Initialization Support Michael Roth
2024-01-26  4:11 ` [PATCH v2 01/25] x86/cpufeatures: Add SEV-SNP CPU feature Michael Roth
2024-01-26  4:11 ` [PATCH v2 02/25] x86/speculation: Do not enable Automatic IBRS if SEV SNP is enabled Michael Roth
2024-01-26  4:11 ` [PATCH v2 03/25] iommu/amd: Don't rely on external callers to enable IOMMU SNP support Michael Roth
2024-01-26  4:11 ` [PATCH v2 04/25] x86/sev: Add the host SEV-SNP initialization support Michael Roth
2024-01-26  4:11 ` [PATCH v2 05/25] x86/mtrr: Don't print errors if MtrrFixDramModEn is set when SNP enabled Michael Roth
2024-01-26  4:11 ` [PATCH v2 06/25] x86/sev: Add RMP entry lookup helpers Michael Roth
2024-01-26  4:11 ` [PATCH v2 07/25] x86/fault: Add helper for dumping RMP entries Michael Roth
2024-01-26  4:11 ` [PATCH v2 08/25] x86/traps: Define RMP violation #PF error code Michael Roth
2024-01-26  4:11 ` [PATCH v2 09/25] x86/fault: Dump RMP table information when RMP page faults occur Michael Roth
2024-01-26  4:11 ` [PATCH v2 10/25] x86/sev: Add helper functions for RMPUPDATE and PSMASH instruction Michael Roth
2024-01-29 18:00   ` Liam Merwick
2024-01-29 19:28     ` Borislav Petkov
2024-01-29 19:33       ` Borislav Petkov
2024-01-26  4:11 ` [PATCH v2 11/25] x86/sev: Adjust directmap to avoid inadvertant RMP faults Michael Roth
2024-01-26 15:34   ` Borislav Petkov
2024-01-26 17:04     ` Michael Roth
2024-01-26 18:43       ` Borislav Petkov
2024-01-26 23:54         ` Michael Roth
2024-01-27 11:42           ` Borislav Petkov
2024-01-27 15:45             ` Michael Roth
2024-01-27 16:02               ` Borislav Petkov
2024-01-29 11:59                 ` Borislav Petkov
2024-01-29 15:26                   ` Vlastimil Babka
2024-01-26  4:11 ` [PATCH v2 12/25] crypto: ccp: Define the SEV-SNP commands Michael Roth
2024-01-26  4:11 ` [PATCH v2 13/25] crypto: ccp: Add support to initialize the AMD-SP for SEV-SNP Michael Roth
2024-01-29 17:58   ` Borislav Petkov
2024-01-26  4:11 ` [PATCH v2 14/25] crypto: ccp: Provide API to issue SEV and SNP commands Michael Roth
2024-01-26  4:11 ` [PATCH v2 15/25] x86/sev: Introduce snp leaked pages list Michael Roth
2024-01-29 14:26   ` Vlastimil Babka
2024-01-29 14:29     ` Borislav Petkov
2024-01-26  4:11 ` [PATCH v2 16/25] crypto: ccp: Handle the legacy TMR allocation when SNP is enabled Michael Roth
2024-01-29 15:04   ` Borislav Petkov
2024-01-26  4:11 ` [PATCH v2 17/25] crypto: ccp: Handle non-volatile INIT_EX data " Michael Roth
2024-01-29 15:12   ` Borislav Petkov
2024-01-26  4:11 ` [PATCH v2 18/25] crypto: ccp: Handle legacy SEV commands " Michael Roth
2024-01-26  4:11 ` [PATCH v2 19/25] iommu/amd: Clean up RMP entries for IOMMU pages during SNP shutdown Michael Roth
2024-01-26  4:11 ` [PATCH v2 20/25] crypto: ccp: Add panic notifier for SEV/SNP firmware shutdown on kdump Michael Roth
2024-01-26  4:11 ` [PATCH v2 21/25] KVM: SEV: Make AVIC backing, VMSA and VMCB memory allocation SNP safe Michael Roth
2024-01-26 11:00   ` Paolo Bonzini
2024-01-26  4:11 ` [PATCH v2 22/25] x86/cpufeatures: Enable/unmask SEV-SNP CPU feature Michael Roth
2024-01-26  4:11 ` [PATCH v2 23/25] crypto: ccp: Add the SNP_PLATFORM_STATUS command Michael Roth
2024-01-26  4:11 ` [PATCH v2 24/25] crypto: ccp: Add the SNP_COMMIT command Michael Roth
2024-01-26  4:11 ` [PATCH v2 25/25] crypto: ccp: Add the SNP_SET_CONFIG command Michael Roth
2024-01-29 19:18   ` Liam Merwick
2024-01-29 20:10     ` Michael Roth
2024-01-30 16:19 ` [PATCH v2 00/25] Add AMD Secure Nested Paging (SEV-SNP) Initialization Support Borislav Petkov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).