kvm.vger.kernel.org archive mirror
* [PATCH v8 00/40] Add AMD Secure Nested Paging (SEV-SNP) Guest Support
@ 2021-12-10 15:42 Brijesh Singh
  2021-12-10 15:42 ` [PATCH v8 01/40] x86/compressed/64: detect/setup SEV/SME features earlier in boot Brijesh Singh
                   ` (40 more replies)
  0 siblings, 41 replies; 183+ messages in thread
From: Brijesh Singh @ 2021-12-10 15:42 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

This part of the Secure Nested Paging (SEV-SNP) series focuses on the changes
required in a guest OS for SEV-SNP support.

SEV-SNP builds upon existing SEV and SEV-ES functionality while adding
new hardware-based memory protections. SEV-SNP adds strong memory integrity
protection to help prevent malicious hypervisor-based attacks like data
replay, memory re-mapping and more in order to create an isolated memory
encryption environment.
 
This series provides the basic building blocks needed to boot SEV-SNP VMs.
It does not cover all of the security enhancements introduced by SEV-SNP,
such as interrupt protection.

Many of the integrity guarantees of SEV-SNP are enforced through a new
structure called the Reverse Map Table (RMP). Adding a new page to an SEV-SNP
VM is a two-step process. First, the hypervisor assigns the page to the
guest using the new RMPUPDATE instruction, which transitions the page to
guest-invalid. Second, the guest validates the page using the new PVALIDATE
instruction. SEV-SNP VMs can use the new "Page State Change Request NAE"
defined in the GHCB specification to ask the hypervisor to add pages to or
remove pages from the RMP table.
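
As a rough illustration of the two-step flow described above, here is a toy
user-space model in C. It is only a sketch: the struct and the model_*()
names are hypothetical, the fields do not reflect the real RMP entry layout,
and the real RMPUPDATE and PVALIDATE instructions do considerably more.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one RMP entry. Illustrative only, not the hardware layout. */
struct rmp_entry_model {
	bool assigned;	/* page has been assigned to the guest */
	bool validated;	/* guest has accepted (validated) the page */
};

/* Step 1, hypervisor side: stands in for RMPUPDATE assigning the page. */
static void model_rmpupdate_assign(struct rmp_entry_model *e)
{
	e->assigned = true;
	e->validated = false;	/* page is now guest-invalid */
}

/* Step 2, guest side: stands in for PVALIDATE. Returns 0 on success. */
static int model_pvalidate(struct rmp_entry_model *e)
{
	if (!e->assigned)
		return -1;	/* cannot validate a page the guest does not own */
	e->validated = true;
	return 0;
}

/* In this model the guest may use a page only after both steps. */
static bool model_guest_can_use(const struct rmp_entry_model *e)
{
	return e->assigned && e->validated;
}

/* Walks one page through the full sequence. */
static void model_rmp_demo(void)
{
	struct rmp_entry_model e = { false, false };

	assert(model_pvalidate(&e) == -1);
	model_rmpupdate_assign(&e);
	assert(!model_guest_can_use(&e));
	assert(model_pvalidate(&e) == 0);
	assert(model_guest_can_use(&e));
}
```

The point of the model is the ordering: validation before assignment fails,
and a page that is assigned but not yet validated is not usable by the guest.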

Each page assigned to the SEV-SNP VM can either be validated or unvalidated,
as indicated by the Validated flag in the page's RMP entry. There are two
approaches that can be taken for the page validation: Pre-validation and
Lazy Validation.

Under pre-validation, pages are validated prior to first use. Under lazy
validation, pages are validated when first accessed. An access to an
unvalidated page results in a #VC exception, at which time the exception
handler may validate the page. Lazy validation requires careful tracking of
the validated pages to avoid validating the same GPA more than once. The
recently introduced "Unaccepted" memory type can be used to communicate the
unvalidated memory ranges to the guest OS.
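
To make the tracking requirement for lazy validation concrete, the sketch
below keeps one bit per 4K page and refuses to validate the same GPA twice.
This is not code from the series (which only implements pre-validation); the
MODEL_* constants, the bitmap and model_validate_once() are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT	12
#define MODEL_NR_PAGES		64	/* tiny guest, for illustration only */

/* One bit per 4K guest page; a set bit means "already validated". */
static uint64_t model_validated_bitmap;

/*
 * Returns true if this call validated the page, false if the page was
 * already validated or the GPA is outside the modelled range.
 */
static bool model_validate_once(uint64_t gpa)
{
	uint64_t pfn = gpa >> MODEL_PAGE_SHIFT;
	uint64_t bit;

	if (pfn >= MODEL_NR_PAGES)
		return false;

	bit = UINT64_C(1) << pfn;
	if (model_validated_bitmap & bit)
		return false;	/* never validate the same GPA twice */

	/* A real guest would execute PVALIDATE at this point. */
	model_validated_bitmap |= bit;
	return true;
}

/* Two accesses to the same page validate it only once. */
static void model_lazy_demo(void)
{
	assert(model_validate_once(0x1000));
	assert(!model_validate_once(0x1fff));
}
```

A #VC handler doing lazy validation would consult such a record before
validating the faulting page.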

At this time we only support pre-validation: the OVMF guest BIOS validates
all of RAM before control is handed over to the guest kernel. The
early_set_memory_{encrypt,decrypt} and set_memory_{encrypt,decrypt} helpers
are enlightened to perform page validation or invalidation while setting or
clearing the encryption attribute in the page table.

This series does not yet provide support for interrupt security, which will
be added after the base support.

The series is based on tip/master
  7f32a31b0a34 (origin/master, origin/HEAD) Merge branch into tip/master: 'core/entry'

Additional resources
---------------------
SEV-SNP whitepaper
https://www.amd.com/system/files/TechDocs/SEV-SNP-strengthening-vm-isolation-with-integrity-protection-and-more.pdf
 
APM 2: https://www.amd.com/system/files/TechDocs/24593.pdf
(section 15.36)

GHCB spec:
https://developer.amd.com/wp-content/resources/56421.pdf

SEV-SNP firmware specification:
https://developer.amd.com/sev/

v7: https://lore.kernel.org/linux-mm/20211110220731.2396491-40-brijesh.singh@amd.com/
v6: https://lore.kernel.org/linux-mm/20211008180453.462291-1-brijesh.singh@amd.com/
v5: https://lore.kernel.org/lkml/20210820151933.22401-1-brijesh.singh@amd.com/

Changes since v7:
 * sevguest: extend the get report structure to accept the vmpl from userspace.
 * In the compressed path, move the GHCB protocol negotiation from VC handler
   to sev_enable().
 * sev_enable(): don't expect SEV bit in status MSR when cpuid bit is present, update comments.
 * sme_enable(): call directly from head_64.S rather than as part of startup_64_setup_env, add comments
 * snp_find_cc_blob(), sev_prep_identity_maps(): add missing 'static' keywords to function prototypes

Changes since v6:
 * Add rmpadjust() helper to be used by AP creation and vmpl0 detect function.
 * Clear the VM communication key if guest detects that hypervisor is modifying
   the SNP_GUEST_REQ response header.
 * Move the per-cpu GHCB registration from first #VC to idt setup.
 * Consolidate initial SEV/SME setup into a common entry point that gets called
   early enough to also be used for SEV-SNP CPUID table setup.
 * SNP CPUID: separate initial SEV-SNP feature detection out into standalone
   snp_init() routines, then add CPUID table setup to it as a separate patch.
 * SNP CPUID: fix boot issue with Seabios due to ACPI relying on certain EFI
   config table lookup failures as fallthrough cases rather than error cases.
 * SNP CPUID: drop the use of a separate init routines to handle pointer fixups
   after switching to kernel virtual addresses, instead use a helper that uses
   RIP-relative addressing to access CPUID table when either on identity mapping
   or kernel virtual addresses.

Changes since v5:
 * move the seqno allocation in the sevguest driver.
 * extend snp_issue_guest_request() to accept the exit_info to simplify the logic.
 * use smaller structure names based on feedback.
 * explicitly clear the memory after the SNP guest request is completed.
 * cpuid validation: use a local copy of cpuid table instead of keeping
   firmware table mapped throughout boot.
 * cpuid validation: coding style fix-ups and refactor cpuid-related helpers
   as suggested.
 * cpuid validation: drop a number of BOOT_COMPRESSED-guarded defs/declarations
   by moving things like snp_cpuid_init*() out of sev-shared.c and keeping only
   the common bits there.
 * Break up EFI config table helpers and related acpi.c changes into separate
   patches.
 * re-enable stack protection for 32-bit kernels as well, not just 64-bit

Changes since v4:
 * Address the cpuid specific review comment
 * Simplified the macro based on the review feedback
 * Move macro definition to the patch that needs it
 * Fix the issues reported by checkpatch
 * Address the AP creation specific review comment

Changes since v3:
 * Add support to use the PSP filtered CPUID.
 * Add support for the extended guest request.
 * Move sevguest driver to drivers/virt/coco.
 * Add documentation for sevguest ioctl.
 * Add support to check the vmpl0.
 * Pass the VM encryption key and id to be used for encrypting guest messages
   through the platform drv data.
 * Multiple cleanups and fixes to address the review feedback.

Changes since v2:
 * Add support for AP startup using SNP specific vmgexit.
 * Add snp_prep_memory() helper.
 * Drop sev_snp_active() helper.
 * Add sev_feature_enabled() helper to check which SEV feature is active.
 * Sync the SNP guest message request header with latest SNP FW spec.
 * Multiple cleanups and fixes to address the review feedback.

Changes since v1:
 * Integrate the SNP support in sev.{ch}.
 * Add support to query the hypervisor feature and detect whether SNP is supported.
 * Define Linux specific reason code for the SNP guest termination.
 * Extend the setup_header to provide a way for the hypervisor to pass the secret
   and cpuid pages.
 * Add support to create a platform device and driver to query the attestation report
   and to derive a key.
 * Multiple cleanups and fixes to address Boris's review feedback.

Brijesh Singh (21):
  x86/mm: Extend cc_attr to include AMD SEV-SNP
  x86/sev: Define the Linux specific guest termination reasons
  x86/sev: Save the negotiated GHCB version
  x86/sev: Check SEV-SNP features support
  x86/sev: Add a helper for the PVALIDATE instruction
  x86/sev: Check the vmpl level
  x86/compressed: Add helper for validating pages in the decompression
    stage
  x86/compressed: Register GHCB memory when SEV-SNP is active
  x86/sev: Register GHCB memory when SEV-SNP is active
  x86/sev: Add helper for validating pages in early enc attribute
    changes
  x86/kernel: Make the bss.decrypted section shared in RMP table
  x86/kernel: Validate rom memory before accessing when SEV-SNP is
    active
  x86/mm: Add support to validate memory when changing C-bit
  KVM: SVM: Define sev_features and vmpl field in the VMSA
  KVM: SVM: Create a separate mapping for the SEV-ES save area
  x86/boot: Add Confidential Computing type to setup_data
  x86/sev: Provide support for SNP guest request NAEs
  x86/sev: Register SNP guest request platform device
  virt: Add SEV-SNP guest driver
  virt: sevguest: Add support to derive key
  virt: sevguest: Add support to get extended report

Michael Roth (16):
  x86/compressed/64: detect/setup SEV/SME features earlier in boot
  x86/sev: detect/setup SEV/SME features earlier in boot
  x86/head: re-enable stack protection for 32/64-bit builds
  x86/sev: move MSR-based VMGEXITs for CPUID to helper
  KVM: x86: move lookup of indexed CPUID leafs to helper
  x86/compressed/acpi: move EFI system table lookup to helper
  x86/compressed/acpi: move EFI config table lookup to helper
  x86/compressed/acpi: move EFI vendor table lookup to helper
  KVM: SEV: Add documentation for SEV-SNP CPUID Enforcement
  x86/compressed/64: add support for SEV-SNP CPUID table in #VC handlers
  x86/boot: add a pointer to Confidential Computing blob in bootparams
  x86/compressed: add SEV-SNP feature detection/setup
  x86/compressed: use firmware-validated CPUID for SEV-SNP guests
  x86/compressed/64: add identity mapping for Confidential Computing
    blob
  x86/sev: add SEV-SNP feature detection/setup
  x86/sev: use firmware-validated CPUID for SEV-SNP guests

Tom Lendacky (3):
  KVM: SVM: Create a separate mapping for the GHCB save area
  KVM: SVM: Update the SEV-ES save area mapping
  x86/sev: Use SEV-SNP AP creation to start secondary CPUs

 Documentation/virt/coco/sevguest.rst          | 121 +++
 .../virt/kvm/amd-memory-encryption.rst        |  28 +
 arch/x86/boot/compressed/Makefile             |   1 +
 arch/x86/boot/compressed/acpi.c               | 129 +--
 arch/x86/boot/compressed/efi.c                | 179 ++++
 arch/x86/boot/compressed/head_64.S            |  32 +-
 arch/x86/boot/compressed/ident_map_64.c       |  44 +-
 arch/x86/boot/compressed/mem_encrypt.S        |  36 -
 arch/x86/boot/compressed/misc.h               |  44 +-
 arch/x86/boot/compressed/sev.c                | 249 +++++-
 arch/x86/include/asm/bootparam_utils.h        |   1 +
 arch/x86/include/asm/cpuid.h                  |  26 +
 arch/x86/include/asm/msr-index.h              |   2 +
 arch/x86/include/asm/sev-common.h             |  82 ++
 arch/x86/include/asm/sev.h                    |  96 +-
 arch/x86/include/asm/svm.h                    | 171 +++-
 arch/x86/include/uapi/asm/bootparam.h         |   4 +-
 arch/x86/include/uapi/asm/svm.h               |  13 +
 arch/x86/kernel/Makefile                      |   1 -
 arch/x86/kernel/cc_platform.c                 |   2 +
 arch/x86/kernel/cpu/common.c                  |   4 +
 arch/x86/kernel/head64.c                      |  11 +-
 arch/x86/kernel/head_64.S                     |  37 +
 arch/x86/kernel/probe_roms.c                  |  13 +-
 arch/x86/kernel/sev-shared.c                  | 539 ++++++++++-
 arch/x86/kernel/sev.c                         | 841 ++++++++++++++++--
 arch/x86/kernel/smpboot.c                     |   3 +
 arch/x86/kvm/cpuid.c                          |  17 +-
 arch/x86/kvm/svm/sev.c                        |  24 +-
 arch/x86/kvm/svm/svm.c                        |   4 +-
 arch/x86/kvm/svm/svm.h                        |   2 +-
 arch/x86/mm/mem_encrypt.c                     |  55 +-
 arch/x86/mm/mem_encrypt_identity.c            |   8 +
 arch/x86/mm/pat/set_memory.c                  |  15 +
 drivers/virt/Kconfig                          |   3 +
 drivers/virt/Makefile                         |   1 +
 drivers/virt/coco/sevguest/Kconfig            |   9 +
 drivers/virt/coco/sevguest/Makefile           |   2 +
 drivers/virt/coco/sevguest/sevguest.c         | 738 +++++++++++++++
 drivers/virt/coco/sevguest/sevguest.h         |  98 ++
 include/linux/cc_platform.h                   |   8 +
 include/linux/efi.h                           |   1 +
 include/uapi/linux/sev-guest.h                |  77 ++
 43 files changed, 3474 insertions(+), 297 deletions(-)
 create mode 100644 Documentation/virt/coco/sevguest.rst
 create mode 100644 arch/x86/boot/compressed/efi.c
 create mode 100644 arch/x86/include/asm/cpuid.h
 create mode 100644 drivers/virt/coco/sevguest/Kconfig
 create mode 100644 drivers/virt/coco/sevguest/Makefile
 create mode 100644 drivers/virt/coco/sevguest/sevguest.c
 create mode 100644 drivers/virt/coco/sevguest/sevguest.h
 create mode 100644 include/uapi/linux/sev-guest.h

-- 
2.25.1



* [PATCH v8 01/40] x86/compressed/64: detect/setup SEV/SME features earlier in boot
  2021-12-10 15:42 [PATCH v8 00/40] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
@ 2021-12-10 15:42 ` Brijesh Singh
  2021-12-10 18:47   ` Dave Hansen
                     ` (2 more replies)
  2021-12-10 15:42 ` [PATCH v8 02/40] x86/sev: " Brijesh Singh
                   ` (39 subsequent siblings)
  40 siblings, 3 replies; 183+ messages in thread
From: Brijesh Singh @ 2021-12-10 15:42 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

From: Michael Roth <michael.roth@amd.com>

With upcoming SEV-SNP support, SEV-related features need to be
initialized earlier in boot, at the same point the initial #VC handler
is set up, so that the SEV-SNP CPUID table can be utilized during the
initial feature checks. Also, SEV-SNP feature detection will rely on
EFI helper functions to scan the EFI config table for the Confidential
Computing blob, and so would need to be implemented at least partially
in C.

Currently set_sev_encryption_mask() is used to initialize the
sev_status and sme_me_mask globals that advertise what SEV/SME features
are available in a guest. Rename it to sev_enable() to better reflect
that (SME is only enabled in the case of SEV guests in the
boot/compressed kernel), and move it to just after the stage1 #VC
handler is set up so that it can be used to initialize SEV-SNP as well
in future patches.

While at it, re-implement it as C code so that all SEV feature
detection can be better consolidated with upcoming SEV-SNP feature
detection, which will also be in C.

The 32-bit entry path remains unchanged, as it never relied on the
set_sev_encryption_mask() initialization to begin with, possibly due to
the normal rva() helper for accessing globals only being usable by code
in .head.text. Either way, 32-bit entry for SEV-SNP would likely only
be supported for non-EFI boot paths, and so wouldn't rely on existing
EFI helper functions, and so could be handled by a separate/simpler
32-bit initializer in the future if needed.

Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
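Note: the decision logic that sev_enable() applies when computing
sme_me_mask can be sketched as a pure function, with the CPUID and MSR
values passed in rather than read from hardware. This is a simplified
user-space model, not the kernel code: model_me_mask() and MODEL_SEV_ENABLED
are hypothetical names, and the max-leaf check and the sev_status assignment
are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors bit 0 of the SEV status MSR (MSR_AMD64_SEV_ENABLED). */
#define MODEL_SEV_ENABLED	(UINT64_C(1) << 0)

/*
 * cpuid_eax and cpuid_ebx stand for CPUID Fn8000_001F[EAX] and [EBX];
 * sev_status stands for the value read from the SEV status MSR.
 * Returns the encryption mask, or 0 if SEV is not supported or not active.
 */
static uint64_t model_me_mask(uint32_t cpuid_eax, uint32_t cpuid_ebx,
			      uint64_t sev_status)
{
	/* Bit 1 of EAX: Secure Encrypted Virtualization support */
	if (!(cpuid_eax & (UINT32_C(1) << 1)))
		return 0;

	/* Only an active SEV guest gets a mask in the compressed kernel */
	if (!(sev_status & MODEL_SEV_ENABLED))
		return 0;

	/* EBX bits 5:0 give the page-table bit position of the C-bit */
	return UINT64_C(1) << (cpuid_ebx & 0x3f);
}

/* Example: SEV supported and active, C-bit reported at position 51. */
static void model_me_mask_demo(void)
{
	assert(model_me_mask(0x2, 51, MODEL_SEV_ENABLED) == (UINT64_C(1) << 51));
	assert(model_me_mask(0x1, 51, MODEL_SEV_ENABLED) == 0);
}
```

Masking EBX with 0x3f keeps only the bit-position field, so unrelated bits
in EBX cannot affect the result.
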
 arch/x86/boot/compressed/head_64.S     | 32 ++++++++++--------
 arch/x86/boot/compressed/mem_encrypt.S | 36 ---------------------
 arch/x86/boot/compressed/misc.h        |  4 +--
 arch/x86/boot/compressed/sev.c         | 45 ++++++++++++++++++++++++++
 4 files changed, 66 insertions(+), 51 deletions(-)

diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index 572c535cf45b..20b174adca51 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -191,9 +191,8 @@ SYM_FUNC_START(startup_32)
 	/*
 	 * Mark SEV as active in sev_status so that startup32_check_sev_cbit()
 	 * will do a check. The sev_status memory will be fully initialized
-	 * with the contents of MSR_AMD_SEV_STATUS later in
-	 * set_sev_encryption_mask(). For now it is sufficient to know that SEV
-	 * is active.
+	 * with the contents of MSR_AMD_SEV_STATUS later via sev_enable(). For
+	 * now it is sufficient to know that SEV is active.
 	 */
 	movl	$1, rva(sev_status)(%ebp)
 1:
@@ -447,6 +446,23 @@ SYM_CODE_START(startup_64)
 	call	load_stage1_idt
 	popq	%rsi
 
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+	/*
+	 * Now that the stage1 interrupt handlers are set up, #VC exceptions from
+	 * CPUID instructions can be properly handled for SEV-ES guests.
+	 *
+	 * For SEV-SNP, the CPUID table also needs to be set up in advance of any
+	 * CPUID instructions being issued, so go ahead and do that now via
+	 * sev_enable(), which will also handle the rest of the SEV-related
+	 * detection/setup to ensure that has been done in advance of any dependent
+	 * code.
+	 */
+	pushq	%rsi
+	movq	%rsi, %rdi		/* real mode address */
+	call	sev_enable
+	popq	%rsi
+#endif
+
 	/*
 	 * paging_prepare() sets up the trampoline and checks if we need to
 	 * enable 5-level paging.
@@ -559,17 +575,7 @@ SYM_FUNC_START_LOCAL_NOALIGN(.Lrelocated)
 	shrq	$3, %rcx
 	rep	stosq
 
-/*
- * If running as an SEV guest, the encryption mask is required in the
- * page-table setup code below. When the guest also has SEV-ES enabled
- * set_sev_encryption_mask() will cause #VC exceptions, but the stage2
- * handler can't map its GHCB because the page-table is not set up yet.
- * So set up the encryption mask here while still on the stage1 #VC
- * handler. Then load stage2 IDT and switch to the kernel's own
- * page-table.
- */
 	pushq	%rsi
-	call	set_sev_encryption_mask
 	call	load_stage2_idt
 
 	/* Pass boot_params to initialize_identity_maps() */
diff --git a/arch/x86/boot/compressed/mem_encrypt.S b/arch/x86/boot/compressed/mem_encrypt.S
index c1e81a848b2a..311d40f35a4b 100644
--- a/arch/x86/boot/compressed/mem_encrypt.S
+++ b/arch/x86/boot/compressed/mem_encrypt.S
@@ -187,42 +187,6 @@ SYM_CODE_END(startup32_vc_handler)
 	.code64
 
 #include "../../kernel/sev_verify_cbit.S"
-SYM_FUNC_START(set_sev_encryption_mask)
-#ifdef CONFIG_AMD_MEM_ENCRYPT
-	push	%rbp
-	push	%rdx
-
-	movq	%rsp, %rbp		/* Save current stack pointer */
-
-	call	get_sev_encryption_bit	/* Get the encryption bit position */
-	testl	%eax, %eax
-	jz	.Lno_sev_mask
-
-	bts	%rax, sme_me_mask(%rip)	/* Create the encryption mask */
-
-	/*
-	 * Read MSR_AMD64_SEV again and store it to sev_status. Can't do this in
-	 * get_sev_encryption_bit() because this function is 32-bit code and
-	 * shared between 64-bit and 32-bit boot path.
-	 */
-	movl	$MSR_AMD64_SEV, %ecx	/* Read the SEV MSR */
-	rdmsr
-
-	/* Store MSR value in sev_status */
-	shlq	$32, %rdx
-	orq	%rdx, %rax
-	movq	%rax, sev_status(%rip)
-
-.Lno_sev_mask:
-	movq	%rbp, %rsp		/* Restore original stack pointer */
-
-	pop	%rdx
-	pop	%rbp
-#endif
-
-	xor	%rax, %rax
-	ret
-SYM_FUNC_END(set_sev_encryption_mask)
 
 	.data
 
diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
index 16ed360b6692..23e0e395084a 100644
--- a/arch/x86/boot/compressed/misc.h
+++ b/arch/x86/boot/compressed/misc.h
@@ -120,12 +120,12 @@ static inline void console_init(void)
 { }
 #endif
 
-void set_sev_encryption_mask(void);
-
 #ifdef CONFIG_AMD_MEM_ENCRYPT
+void sev_enable(struct boot_params *bp);
 void sev_es_shutdown_ghcb(void);
 extern bool sev_es_check_ghcb_fault(unsigned long address);
 #else
+static inline void sev_enable(struct boot_params *bp) { }
 static inline void sev_es_shutdown_ghcb(void) { }
 static inline bool sev_es_check_ghcb_fault(unsigned long address)
 {
diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index 28bcf04c022e..8eebdf589a90 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -204,3 +204,48 @@ void do_boot_stage2_vc(struct pt_regs *regs, unsigned long exit_code)
 	else if (result != ES_RETRY)
 		sev_es_terminate(GHCB_SEV_ES_GEN_REQ);
 }
+
+static inline u64 rd_sev_status_msr(void)
+{
+	unsigned long low, high;
+
+	asm volatile("rdmsr" : "=a" (low), "=d" (high) :
+			"c" (MSR_AMD64_SEV));
+
+	return ((high << 32) | low);
+}
+
+void sev_enable(struct boot_params *bp)
+{
+	unsigned int eax, ebx, ecx, edx;
+
+	/* Check for the SME/SEV support leaf */
+	eax = 0x80000000;
+	ecx = 0;
+	native_cpuid(&eax, &ebx, &ecx, &edx);
+	if (eax < 0x8000001f)
+		return;
+
+	/*
+	 * Check for the SME/SEV feature:
+	 *   CPUID Fn8000_001F[EAX]
+	 *   - Bit 0 - Secure Memory Encryption support
+	 *   - Bit 1 - Secure Encrypted Virtualization support
+	 *   CPUID Fn8000_001F[EBX]
+	 *   - Bits 5:0 - Pagetable bit position used to indicate encryption
+	 */
+	eax = 0x8000001f;
+	ecx = 0;
+	native_cpuid(&eax, &ebx, &ecx, &edx);
+	/* Check whether SEV is supported */
+	if (!(eax & BIT(1)))
+		return;
+
+	/* Set the SME mask if this is an SEV guest. */
+	sev_status   = rd_sev_status_msr();
+
+	if (!(sev_status & MSR_AMD64_SEV_ENABLED))
+		return;
+
+	sme_me_mask = BIT_ULL(ebx & 0x3f);
+}
-- 
2.25.1



* [PATCH v8 02/40] x86/sev: detect/setup SEV/SME features earlier in boot
  2021-12-10 15:42 [PATCH v8 00/40] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
  2021-12-10 15:42 ` [PATCH v8 01/40] x86/compressed/64: detect/setup SEV/SME features earlier in boot Brijesh Singh
@ 2021-12-10 15:42 ` Brijesh Singh
  2021-12-13 22:36   ` Venu Busireddy
  2021-12-10 15:42 ` [PATCH v8 03/40] x86/mm: Extend cc_attr to include AMD SEV-SNP Brijesh Singh
                   ` (38 subsequent siblings)
  40 siblings, 1 reply; 183+ messages in thread
From: Brijesh Singh @ 2021-12-10 15:42 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

From: Michael Roth <michael.roth@amd.com>

sme_enable() handles feature detection for both SEV and SME. Future
patches will also use it for SEV-SNP feature detection/setup, which
will need to be done immediately after the first #VC handler is set up.
Move it now in preparation.

Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/kernel/head64.c  |  3 ---
 arch/x86/kernel/head_64.S | 13 +++++++++++++
 2 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 3be9dd213dad..b01f64e8389b 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -192,9 +192,6 @@ unsigned long __head __startup_64(unsigned long physaddr,
 	if (load_delta & ~PMD_PAGE_MASK)
 		for (;;);
 
-	/* Activate Secure Memory Encryption (SME) if supported and enabled */
-	sme_enable(bp);
-
 	/* Include the SME encryption mask in the fixup value */
 	load_delta += sme_get_me_mask();
 
diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index d8b3ebd2bb85..99de8fd461e8 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -69,6 +69,19 @@ SYM_CODE_START_NOALIGN(startup_64)
 	call	startup_64_setup_env
 	popq	%rsi
 
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+	/*
+	 * Activate SEV/SME memory encryption if supported/enabled. This needs to
+	 * be done now, since this also includes setup of the SEV-SNP CPUID table,
+	 * which needs to be done before any CPUID instructions are executed in
+	 * subsequent code.
+	 */
+	movq	%rsi, %rdi
+	pushq	%rsi
+	call	sme_enable
+	popq	%rsi
+#endif
+
 	/* Now switch to __KERNEL_CS so IRET works reliably */
 	pushq	$__KERNEL_CS
 	leaq	.Lon_kernel_cs(%rip), %rax
-- 
2.25.1



* [PATCH v8 03/40] x86/mm: Extend cc_attr to include AMD SEV-SNP
  2021-12-10 15:42 [PATCH v8 00/40] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
  2021-12-10 15:42 ` [PATCH v8 01/40] x86/compressed/64: detect/setup SEV/SME features earlier in boot Brijesh Singh
  2021-12-10 15:42 ` [PATCH v8 02/40] x86/sev: " Brijesh Singh
@ 2021-12-10 15:42 ` Brijesh Singh
  2021-12-13 22:47   ` Venu Busireddy
  2021-12-14 15:53   ` Borislav Petkov
  2021-12-10 15:42 ` [PATCH v8 04/40] x86/sev: Define the Linux specific guest termination reasons Brijesh Singh
                   ` (37 subsequent siblings)
  40 siblings, 2 replies; 183+ messages in thread
From: Brijesh Singh @ 2021-12-10 15:42 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

The CC_ATTR_SEV_SNP attribute can be used by the guest to query whether the
SNP (Secure Nested Paging) feature is active.

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
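Note: the attribute check added here boils down to testing one bit of
sev_status. The stand-alone model below shows that mapping with the status
value passed in as a parameter. The MODEL_* names and model_cc_has() are
hypothetical; only the bit positions (0, 1, 2) come from the msr-index.h
hunk in this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as defined for the SEV status MSR in this patch. */
#define MODEL_SEV_ENABLED	(UINT64_C(1) << 0)
#define MODEL_SEV_ES_ENABLED	(UINT64_C(1) << 1)
#define MODEL_SEV_SNP_ENABLED	(UINT64_C(1) << 2)

/* Hypothetical stand-in for the two attributes relevant here. */
enum model_cc_attr {
	MODEL_ATTR_GUEST_STATE_ENCRYPT,
	MODEL_ATTR_SEV_SNP,
};

/* Maps an attribute query onto the corresponding status bit. */
static bool model_cc_has(uint64_t sev_status, enum model_cc_attr attr)
{
	switch (attr) {
	case MODEL_ATTR_GUEST_STATE_ENCRYPT:
		return !!(sev_status & MODEL_SEV_ES_ENABLED);
	case MODEL_ATTR_SEV_SNP:
		return !!(sev_status & MODEL_SEV_SNP_ENABLED);
	default:
		return false;
	}
}

/* An SEV-ES guest without SNP reports ES but not SNP. */
static void model_cc_demo(void)
{
	uint64_t status = MODEL_SEV_ENABLED | MODEL_SEV_ES_ENABLED;

	assert(model_cc_has(status, MODEL_ATTR_GUEST_STATE_ENCRYPT));
	assert(!model_cc_has(status, MODEL_ATTR_SEV_SNP));
}
```
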
 arch/x86/include/asm/msr-index.h | 2 ++
 arch/x86/kernel/cc_platform.c    | 2 ++
 arch/x86/mm/mem_encrypt.c        | 4 ++++
 include/linux/cc_platform.h      | 8 ++++++++
 4 files changed, 16 insertions(+)

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 01e2650b9585..98a64b230447 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -481,8 +481,10 @@
 #define MSR_AMD64_SEV			0xc0010131
 #define MSR_AMD64_SEV_ENABLED_BIT	0
 #define MSR_AMD64_SEV_ES_ENABLED_BIT	1
+#define MSR_AMD64_SEV_SNP_ENABLED_BIT	2
 #define MSR_AMD64_SEV_ENABLED		BIT_ULL(MSR_AMD64_SEV_ENABLED_BIT)
 #define MSR_AMD64_SEV_ES_ENABLED	BIT_ULL(MSR_AMD64_SEV_ES_ENABLED_BIT)
+#define MSR_AMD64_SEV_SNP_ENABLED	BIT_ULL(MSR_AMD64_SEV_SNP_ENABLED_BIT)
 
 #define MSR_AMD64_VIRT_SPEC_CTRL	0xc001011f
 
diff --git a/arch/x86/kernel/cc_platform.c b/arch/x86/kernel/cc_platform.c
index 03bb2f343ddb..e05310f5ec2f 100644
--- a/arch/x86/kernel/cc_platform.c
+++ b/arch/x86/kernel/cc_platform.c
@@ -50,6 +50,8 @@ static bool amd_cc_platform_has(enum cc_attr attr)
 	case CC_ATTR_GUEST_STATE_ENCRYPT:
 		return sev_status & MSR_AMD64_SEV_ES_ENABLED;
 
+	case CC_ATTR_SEV_SNP:
+		return sev_status & MSR_AMD64_SEV_SNP_ENABLED;
 	default:
 		return false;
 	}
diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index 35487305d8af..3ba801ff6afc 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -487,6 +487,10 @@ static void print_mem_encrypt_feature_info(void)
 	if (cc_platform_has(CC_ATTR_GUEST_STATE_ENCRYPT))
 		pr_cont(" SEV-ES");
 
+	/* Secure Nested Paging */
+	if (cc_platform_has(CC_ATTR_SEV_SNP))
+		pr_cont(" SEV-SNP");
+
 	pr_cont("\n");
 }
 
diff --git a/include/linux/cc_platform.h b/include/linux/cc_platform.h
index a075b70b9a70..ef5e2209c9b8 100644
--- a/include/linux/cc_platform.h
+++ b/include/linux/cc_platform.h
@@ -61,6 +61,14 @@ enum cc_attr {
 	 * Examples include SEV-ES.
 	 */
 	CC_ATTR_GUEST_STATE_ENCRYPT,
+
+	/**
+	 * @CC_ATTR_SEV_SNP: Guest SNP is active.
+	 *
+	 * The platform/OS is running as a guest/virtual machine and actively
+	 * using AMD SEV-SNP features.
+	 */
+	CC_ATTR_SEV_SNP = 0x100,
 };
 
 #ifdef CONFIG_ARCH_HAS_CC_PLATFORM
-- 
2.25.1



* [PATCH v8 04/40] x86/sev: Define the Linux specific guest termination reasons
  2021-12-10 15:42 [PATCH v8 00/40] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (2 preceding siblings ...)
  2021-12-10 15:42 ` [PATCH v8 03/40] x86/mm: Extend cc_attr to include AMD SEV-SNP Brijesh Singh
@ 2021-12-10 15:42 ` Brijesh Singh
  2021-12-14  0:13   ` Venu Busireddy
  2021-12-14 22:22   ` Borislav Petkov
  2021-12-10 15:42 ` [PATCH v8 05/40] x86/sev: Save the negotiated GHCB version Brijesh Singh
                   ` (36 subsequent siblings)
  40 siblings, 2 replies; 183+ messages in thread
From: Brijesh Singh @ 2021-12-10 15:42 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

The GHCB specification defines the reason codes for reason set 0. The reason
codes defined in set 0 do not cover all possible causes for a guest
to request termination.

The reason set 1 to 255 is reserved for the vendor-specific codes.
Reserve reason set 1 for the Linux guest. Define the error codes for
reason set 1.

While at it, change sev_es_terminate() to accept the reason set as a
parameter.

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
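Note: a small sketch of how the termination request value is composed once
sev_es_terminate() takes a reason set. Treat it as an illustration under
stated assumptions: the hunk below only shows the reason code going into
GHCBData[23:16]; the 0x100 request code and the placement of the reason set
in GHCBData[15:12] are assumptions made for this example, and
model_term_msr() is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value of the termination request code (GHCB_MSR_TERM_REQ). */
#define MODEL_TERM_REQ		UINT64_C(0x100)

/* Reason sets and codes as defined in this patch. */
#define MODEL_TERM_SET_GEN	0
#define MODEL_TERM_SET_LINUX	1
#define MODEL_TERM_PVALIDATE	2

/* Builds the value that would be written to the GHCB MSR. */
static uint64_t model_term_msr(unsigned int set, unsigned int reason)
{
	uint64_t val = MODEL_TERM_REQ;

	val |= ((uint64_t)set & 0xf) << 12;	/* assumed: GHCBData[15:12] */
	val |= ((uint64_t)reason & 0xff) << 16;	/* GHCBData[23:16] */

	return val;
}

/* A Linux-specific PVALIDATE failure lands in reason set 1. */
static void model_term_demo(void)
{
	assert(model_term_msr(MODEL_TERM_SET_LINUX, MODEL_TERM_PVALIDATE) ==
	       UINT64_C(0x21100));
}
```

Under these assumptions the same reason code in a different reason set
produces a different MSR value, which is what lets the hypervisor tell
Linux-specific reasons apart from the generic ones.
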
 arch/x86/boot/compressed/sev.c    |  6 +++---
 arch/x86/include/asm/sev-common.h |  8 ++++++++
 arch/x86/kernel/sev-shared.c      | 11 ++++-------
 arch/x86/kernel/sev.c             |  4 ++--
 4 files changed, 17 insertions(+), 12 deletions(-)

diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index 8eebdf589a90..0b6cc6402ac1 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -122,7 +122,7 @@ static enum es_result vc_read_mem(struct es_em_ctxt *ctxt,
 static bool early_setup_sev_es(void)
 {
 	if (!sev_es_negotiate_protocol())
-		sev_es_terminate(GHCB_SEV_ES_PROT_UNSUPPORTED);
+		sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_PROT_UNSUPPORTED);
 
 	if (set_page_decrypted((unsigned long)&boot_ghcb_page))
 		return false;
@@ -175,7 +175,7 @@ void do_boot_stage2_vc(struct pt_regs *regs, unsigned long exit_code)
 	enum es_result result;
 
 	if (!boot_ghcb && !early_setup_sev_es())
-		sev_es_terminate(GHCB_SEV_ES_GEN_REQ);
+		sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_GEN_REQ);
 
 	vc_ghcb_invalidate(boot_ghcb);
 	result = vc_init_em_ctxt(&ctxt, regs, exit_code);
@@ -202,7 +202,7 @@ void do_boot_stage2_vc(struct pt_regs *regs, unsigned long exit_code)
 	if (result == ES_OK)
 		vc_finish_insn(&ctxt);
 	else if (result != ES_RETRY)
-		sev_es_terminate(GHCB_SEV_ES_GEN_REQ);
+		sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_GEN_REQ);
 }
 
 static inline u64 rd_sev_status_msr(void)
diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h
index 1b2fd32b42fe..94f0ea574049 100644
--- a/arch/x86/include/asm/sev-common.h
+++ b/arch/x86/include/asm/sev-common.h
@@ -73,9 +73,17 @@
 	 /* GHCBData[23:16] */				\
 	((((u64)reason_val) & 0xff) << 16))
 
+/* Error codes from reason set 0 */
+#define SEV_TERM_SET_GEN		0
 #define GHCB_SEV_ES_GEN_REQ		0
 #define GHCB_SEV_ES_PROT_UNSUPPORTED	1
 
+/* Linux-specific reason codes (used with reason set 1) */
+#define SEV_TERM_SET_LINUX		1
+#define GHCB_TERM_REGISTER		0	/* GHCB GPA registration failure */
+#define GHCB_TERM_PSC			1	/* Page State Change failure */
+#define GHCB_TERM_PVALIDATE		2	/* Pvalidate failure */
+
 #define GHCB_RESP_CODE(v)		((v) & GHCB_MSR_INFO_MASK)
 
 /*
diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c
index ce987688bbc0..2abf8a7d75e5 100644
--- a/arch/x86/kernel/sev-shared.c
+++ b/arch/x86/kernel/sev-shared.c
@@ -24,15 +24,12 @@ static bool __init sev_es_check_cpu_features(void)
 	return true;
 }
 
-static void __noreturn sev_es_terminate(unsigned int reason)
+static void __noreturn sev_es_terminate(unsigned int set, unsigned int reason)
 {
 	u64 val = GHCB_MSR_TERM_REQ;
 
-	/*
-	 * Tell the hypervisor what went wrong - only reason-set 0 is
-	 * currently supported.
-	 */
-	val |= GHCB_SEV_TERM_REASON(0, reason);
+	/* Tell the hypervisor what went wrong. */
+	val |= GHCB_SEV_TERM_REASON(set, reason);
 
 	/* Request Guest Termination from Hypvervisor */
 	sev_es_wr_ghcb_msr(val);
@@ -221,7 +218,7 @@ void __init do_vc_no_ghcb(struct pt_regs *regs, unsigned long exit_code)
 
 fail:
 	/* Terminate the guest */
-	sev_es_terminate(GHCB_SEV_ES_GEN_REQ);
+	sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_GEN_REQ);
 }
 
 static enum es_result vc_insn_string_read(struct es_em_ctxt *ctxt,
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index e6d316a01fdd..19ad09712902 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -1337,7 +1337,7 @@ DEFINE_IDTENTRY_VC_KERNEL(exc_vmm_communication)
 		show_regs(regs);
 
 		/* Ask hypervisor to sev_es_terminate */
-		sev_es_terminate(GHCB_SEV_ES_GEN_REQ);
+		sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_GEN_REQ);
 
 		/* If that fails and we get here - just panic */
 		panic("Returned from Terminate-Request to Hypervisor\n");
@@ -1385,7 +1385,7 @@ bool __init handle_vc_boot_ghcb(struct pt_regs *regs)
 
 	/* Do initial setup or terminate the guest */
 	if (unlikely(boot_ghcb == NULL && !sev_es_setup_ghcb()))
-		sev_es_terminate(GHCB_SEV_ES_GEN_REQ);
+		sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_GEN_REQ);
 
 	vc_ghcb_invalidate(boot_ghcb);
 
-- 
2.25.1



* [PATCH v8 05/40] x86/sev: Save the negotiated GHCB version
  2021-12-10 15:42 [PATCH v8 00/40] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (3 preceding siblings ...)
  2021-12-10 15:42 ` [PATCH v8 04/40] x86/sev: Define the Linux specific guest termination reasons Brijesh Singh
@ 2021-12-10 15:42 ` Brijesh Singh
  2021-12-14  0:32   ` Venu Busireddy
  2021-12-10 15:42 ` [PATCH v8 06/40] x86/sev: Check SEV-SNP features support Brijesh Singh
                   ` (35 subsequent siblings)
  40 siblings, 1 reply; 183+ messages in thread
From: Brijesh Singh @ 2021-12-10 15:42 UTC (permalink / raw)

An SEV-ES guest calls sev_es_negotiate_protocol() to negotiate the
GHCB protocol version before establishing the GHCB. Cache the negotiated
GHCB version so that it can be used later.
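
The selection rule can be sketched in user space as follows. This is only an
illustration under stated assumptions: pick_ghcb_version() and its parameters
are hypothetical names, and the hypervisor range is passed in directly rather
than decoded from the GHCB MSR.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: the guest supports [guest_min, guest_max] and the
 * hypervisor advertises [hv_min, hv_max]. If the ranges overlap, pick the
 * highest version both sides support and report success.
 */
static bool pick_ghcb_version(uint16_t guest_min, uint16_t guest_max,
			      uint16_t hv_min, uint16_t hv_max,
			      uint16_t *version)
{
	/* No overlap between the two ranges: negotiation fails. */
	if (hv_max < guest_min || hv_min > guest_max)
		return false;

	*version = hv_max < guest_max ? hv_max : guest_max;

	return true;
}
```

With a guest range of 1..2, a hypervisor that only offers version 1 yields
version 1, while a hypervisor offering 2..3 yields version 2.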

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/include/asm/sev.h   |  2 +-
 arch/x86/kernel/sev-shared.c | 17 ++++++++++++++---
 2 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index ec060c433589..9b9c190e8c3b 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -12,7 +12,7 @@
 #include <asm/insn.h>
 #include <asm/sev-common.h>
 
-#define GHCB_PROTO_OUR		0x0001UL
+#define GHCB_PROTOCOL_MIN	1ULL
 #define GHCB_PROTOCOL_MAX	1ULL
 #define GHCB_DEFAULT_USAGE	0ULL
 
diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c
index 2abf8a7d75e5..91105f5a02a8 100644
--- a/arch/x86/kernel/sev-shared.c
+++ b/arch/x86/kernel/sev-shared.c
@@ -14,6 +14,15 @@
 #define has_cpuflag(f)	boot_cpu_has(f)
 #endif
 
+/*
+ * Since feature negotiation related variables are set early in the boot
+ * process they must reside in the .data section so as not to be zeroed
+ * out when the .bss section is later cleared.
+ *
+ * GHCB protocol version negotiated with the hypervisor.
+ */
+static u16 ghcb_version __ro_after_init;
+
 static bool __init sev_es_check_cpu_features(void)
 {
 	if (!has_cpuflag(X86_FEATURE_RDRAND)) {
@@ -51,10 +60,12 @@ static bool sev_es_negotiate_protocol(void)
 	if (GHCB_MSR_INFO(val) != GHCB_MSR_SEV_INFO_RESP)
 		return false;
 
-	if (GHCB_MSR_PROTO_MAX(val) < GHCB_PROTO_OUR ||
-	    GHCB_MSR_PROTO_MIN(val) > GHCB_PROTO_OUR)
+	if (GHCB_MSR_PROTO_MAX(val) < GHCB_PROTOCOL_MIN ||
+	    GHCB_MSR_PROTO_MIN(val) > GHCB_PROTOCOL_MAX)
 		return false;
 
+	ghcb_version = min_t(size_t, GHCB_MSR_PROTO_MAX(val), GHCB_PROTOCOL_MAX);
+
 	return true;
 }
 
@@ -127,7 +138,7 @@ enum es_result sev_es_ghcb_hv_call(struct ghcb *ghcb, bool set_ghcb_msr,
 				   u64 exit_info_1, u64 exit_info_2)
 {
 	/* Fill in protocol and format specifiers */
-	ghcb->protocol_version = GHCB_PROTOCOL_MAX;
+	ghcb->protocol_version = ghcb_version;
 	ghcb->ghcb_usage       = GHCB_DEFAULT_USAGE;
 
 	ghcb_set_sw_exit_code(ghcb, exit_code);
-- 
2.25.1



* [PATCH v8 06/40] x86/sev: Check SEV-SNP features support
  2021-12-10 15:42 [PATCH v8 00/40] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (4 preceding siblings ...)
  2021-12-10 15:42 ` [PATCH v8 05/40] x86/sev: Save the negotiated GHCB version Brijesh Singh
@ 2021-12-10 15:42 ` Brijesh Singh
  2021-12-16 15:47   ` Borislav Petkov
  2021-12-16 19:01   ` Venu Busireddy
  2021-12-10 15:42 ` [PATCH v8 07/40] x86/sev: Add a helper for the PVALIDATE instruction Brijesh Singh
                   ` (34 subsequent siblings)
  40 siblings, 2 replies; 183+ messages in thread
From: Brijesh Singh @ 2021-12-10 15:42 UTC (permalink / raw)

Version 2 of the GHCB specification added the advertisement of features
that are supported by the hypervisor. If the hypervisor supports SEV-SNP,
it must set the SEV-SNP feature bit to indicate that base SEV-SNP is
supported.

Check the SEV-SNP feature while establishing the GHCB; if the check fails,
terminate the guest.

Version 2 of the GHCB specification adds several new NAEs, most of which
are optional except the hypervisor feature. Now that the hypervisor feature
NAE is implemented, bump the maximum supported GHCB protocol version.

While at it, move the GHCB protocol negotiation check from the #VC
exception handler to sev_enable() so that all feature detection happens
before the first #VC exception.
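
A minimal user-space sketch of how the feature response could be decoded is
shown below. The response code and feature bit follow the definitions added
in this patch; the helper name hv_supports_snp() is hypothetical, and the
0xfff info mask is an assumption based on the response code occupying
GHCBData[11:0].

```c
#include <stdbool.h>
#include <stdint.h>

#define GHCB_MSR_INFO_MASK	0xfffULL	/* assumed: GHCBData[11:0] */
#define GHCB_MSR_HV_FT_RESP	0x081
#define GHCB_HV_FT_SNP		(1ULL << 0)

/*
 * Hypothetical helper: check that the MSR value is a hypervisor feature
 * response and that the SEV-SNP bit is set in GHCBData[63:12].
 */
static bool hv_supports_snp(uint64_t msr_val)
{
	uint64_t features;

	if ((msr_val & GHCB_MSR_INFO_MASK) != GHCB_MSR_HV_FT_RESP)
		return false;

	features = msr_val >> 12;

	return features & GHCB_HV_FT_SNP;
}
```

A value of 0x1081 decodes to a feature bitmap of 1, so the SNP check passes;
0x081 is a valid response with an empty bitmap, so it fails.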

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/boot/compressed/sev.c    | 21 ++++++++++++++++-----
 arch/x86/include/asm/sev-common.h |  6 ++++++
 arch/x86/include/asm/sev.h        |  2 +-
 arch/x86/include/uapi/asm/svm.h   |  2 ++
 arch/x86/kernel/sev-shared.c      | 20 ++++++++++++++++++++
 arch/x86/kernel/sev.c             | 16 ++++++++++++++++
 6 files changed, 61 insertions(+), 6 deletions(-)

diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index 0b6cc6402ac1..a0708f359a46 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -119,11 +119,8 @@ static enum es_result vc_read_mem(struct es_em_ctxt *ctxt,
 /* Include code for early handlers */
 #include "../../kernel/sev-shared.c"
 
-static bool early_setup_sev_es(void)
+static bool early_setup_ghcb(void)
 {
-	if (!sev_es_negotiate_protocol())
-		sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_PROT_UNSUPPORTED);
-
 	if (set_page_decrypted((unsigned long)&boot_ghcb_page))
 		return false;
 
@@ -174,7 +171,7 @@ void do_boot_stage2_vc(struct pt_regs *regs, unsigned long exit_code)
 	struct es_em_ctxt ctxt;
 	enum es_result result;
 
-	if (!boot_ghcb && !early_setup_sev_es())
+	if (!boot_ghcb && !early_setup_ghcb())
 		sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_GEN_REQ);
 
 	vc_ghcb_invalidate(boot_ghcb);
@@ -247,5 +244,19 @@ void sev_enable(struct boot_params *bp)
 	if (!(sev_status & MSR_AMD64_SEV_ENABLED))
 		return;
 
+	/* Negotiate the GHCB protocol version */
+	if (sev_status & MSR_AMD64_SEV_ES_ENABLED)
+		if (!sev_es_negotiate_protocol())
+			sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_PROT_UNSUPPORTED);
+
+	/*
+	 * SNP is supported in v2 of the GHCB spec which mandates support for HV
+	 * features. If SEV-SNP is enabled, then check if the hypervisor supports
+	 * the SEV-SNP features.
+	 */
+	if (sev_status & MSR_AMD64_SEV_SNP_ENABLED && !(get_hv_features() & GHCB_HV_FT_SNP))
+		sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SNP_UNSUPPORTED);
+
+
 	sme_me_mask = BIT_ULL(ebx & 0x3f);
 }
diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h
index 94f0ea574049..6f037c29a46e 100644
--- a/arch/x86/include/asm/sev-common.h
+++ b/arch/x86/include/asm/sev-common.h
@@ -60,6 +60,11 @@
 /* GHCB Hypervisor Feature Request/Response */
 #define GHCB_MSR_HV_FT_REQ		0x080
 #define GHCB_MSR_HV_FT_RESP		0x081
+#define GHCB_MSR_HV_FT_RESP_VAL(v)			\
+	/* GHCBData[63:12] */				\
+	(((u64)(v) & GENMASK_ULL(63, 12)) >> 12)
+
+#define GHCB_HV_FT_SNP			BIT_ULL(0)
 
 #define GHCB_MSR_TERM_REQ		0x100
 #define GHCB_MSR_TERM_REASON_SET_POS	12
@@ -77,6 +82,7 @@
 #define SEV_TERM_SET_GEN		0
 #define GHCB_SEV_ES_GEN_REQ		0
 #define GHCB_SEV_ES_PROT_UNSUPPORTED	1
+#define GHCB_SNP_UNSUPPORTED		2
 
 /* Linux-specific reason codes (used with reason set 1) */
 #define SEV_TERM_SET_LINUX		1
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 9b9c190e8c3b..17b75f6ee11a 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -13,7 +13,7 @@
 #include <asm/sev-common.h>
 
 #define GHCB_PROTOCOL_MIN	1ULL
-#define GHCB_PROTOCOL_MAX	1ULL
+#define GHCB_PROTOCOL_MAX	2ULL
 #define GHCB_DEFAULT_USAGE	0ULL
 
 #define	VMGEXIT()			{ asm volatile("rep; vmmcall\n\r"); }
diff --git a/arch/x86/include/uapi/asm/svm.h b/arch/x86/include/uapi/asm/svm.h
index efa969325ede..b0ad00f4c1e1 100644
--- a/arch/x86/include/uapi/asm/svm.h
+++ b/arch/x86/include/uapi/asm/svm.h
@@ -108,6 +108,7 @@
 #define SVM_VMGEXIT_AP_JUMP_TABLE		0x80000005
 #define SVM_VMGEXIT_SET_AP_JUMP_TABLE		0
 #define SVM_VMGEXIT_GET_AP_JUMP_TABLE		1
+#define SVM_VMGEXIT_HV_FEATURES			0x8000fffd
 #define SVM_VMGEXIT_UNSUPPORTED_EVENT		0x8000ffff
 
 /* Exit code reserved for hypervisor/software use */
@@ -218,6 +219,7 @@
 	{ SVM_VMGEXIT_NMI_COMPLETE,	"vmgexit_nmi_complete" }, \
 	{ SVM_VMGEXIT_AP_HLT_LOOP,	"vmgexit_ap_hlt_loop" }, \
 	{ SVM_VMGEXIT_AP_JUMP_TABLE,	"vmgexit_ap_jump_table" }, \
+	{ SVM_VMGEXIT_HV_FEATURES,	"vmgexit_hypervisor_feature" }, \
 	{ SVM_EXIT_ERR,         "invalid_guest_state" }
 
 
diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c
index 91105f5a02a8..4a876e684f67 100644
--- a/arch/x86/kernel/sev-shared.c
+++ b/arch/x86/kernel/sev-shared.c
@@ -48,6 +48,26 @@ static void __noreturn sev_es_terminate(unsigned int set, unsigned int reason)
 		asm volatile("hlt\n" : : : "memory");
 }
 
+/*
+ * The hypervisor features are available from GHCB version 2 onward.
+ */
+static u64 get_hv_features(void)
+{
+	u64 val;
+
+	if (ghcb_version < 2)
+		return 0;
+
+	sev_es_wr_ghcb_msr(GHCB_MSR_HV_FT_REQ);
+	VMGEXIT();
+
+	val = sev_es_rd_ghcb_msr();
+	if (GHCB_RESP_CODE(val) != GHCB_MSR_HV_FT_RESP)
+		return 0;
+
+	return GHCB_MSR_HV_FT_RESP_VAL(val);
+}
+
 static bool sev_es_negotiate_protocol(void)
 {
 	u64 val;
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index 19ad09712902..a0cada8398a4 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -43,6 +43,10 @@ static struct ghcb boot_ghcb_page __bss_decrypted __aligned(PAGE_SIZE);
  */
 static struct ghcb __initdata *boot_ghcb;
 
+/* Bitmap of SEV features supported by the hypervisor */
+static u64 sev_hv_features;
+
+
 /* #VC handler runtime per-CPU data */
 struct sev_es_runtime_data {
 	struct ghcb ghcb_page;
@@ -766,6 +770,18 @@ void __init sev_es_init_vc_handling(void)
 	if (!sev_es_check_cpu_features())
 		panic("SEV-ES CPU Features missing");
 
+	/*
+	 * SNP is supported in v2 of the GHCB spec which mandates support for HV
+	 * features. If SEV-SNP is enabled, then check if the hypervisor supports
+	 * the SEV-SNP features.
+	 */
+	if (cc_platform_has(CC_ATTR_SEV_SNP)) {
+		sev_hv_features = get_hv_features();
+
+		if (!(sev_hv_features & GHCB_HV_FT_SNP))
+			sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SNP_UNSUPPORTED);
+	}
+
 	/* Enable SEV-ES special handling */
 	static_branch_enable(&sev_es_enable_key);
 
-- 
2.25.1



* [PATCH v8 07/40] x86/sev: Add a helper for the PVALIDATE instruction
  2021-12-10 15:42 [PATCH v8 00/40] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (5 preceding siblings ...)
  2021-12-10 15:42 ` [PATCH v8 06/40] x86/sev: Check SEV-SNP features support Brijesh Singh
@ 2021-12-10 15:42 ` Brijesh Singh
  2021-12-16 20:20   ` Venu Busireddy
  2021-12-10 15:43 ` [PATCH v8 08/40] x86/sev: Check the vmpl level Brijesh Singh
                   ` (33 subsequent siblings)
  40 siblings, 1 reply; 183+ messages in thread
From: Brijesh Singh @ 2021-12-10 15:42 UTC (permalink / raw)

An SNP-active guest uses the PVALIDATE instruction to validate or
rescind the validation of a guest page’s RMP entry. Upon completion,
a return code is stored in EAX and rFLAGS bits are set based on the
return code. If the instruction completed successfully, the CF flag
indicates whether the contents of the RMP entry were changed.

See AMD APM Volume 3 for additional details.
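
The way the helper folds the two outputs into a single return value can be
modelled in plain C as below. This is only a model of the result handling,
not of the instruction itself; pvalidate_result() is a hypothetical name and
the EAX and CF values are passed in as ordinary arguments.

```c
#include <stdbool.h>

/* Software defined (when rFlags.CF = 1) */
#define PVALIDATE_FAIL_NOUPDATE	255

/*
 * Hypothetical model: a set carry flag means the RMP entry was left
 * unchanged, which is reported as a software-defined error; otherwise
 * the EAX return code is passed through.
 */
static inline int pvalidate_result(int eax, bool carry)
{
	if (carry)
		return PVALIDATE_FAIL_NOUPDATE;

	return eax;
}
```

Callers can then treat any non-zero return as a failure without inspecting
the flags separately.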

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/include/asm/sev.h | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 17b75f6ee11a..4ee98976aed8 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -60,6 +60,9 @@ extern void vc_no_ghcb(void);
 extern void vc_boot_ghcb(void);
 extern bool handle_vc_boot_ghcb(struct pt_regs *regs);
 
+/* Software defined (when rFlags.CF = 1) */
+#define PVALIDATE_FAIL_NOUPDATE		255
+
 #ifdef CONFIG_AMD_MEM_ENCRYPT
 extern struct static_key_false sev_es_enable_key;
 extern void __sev_es_ist_enter(struct pt_regs *regs);
@@ -87,12 +90,30 @@ extern enum es_result sev_es_ghcb_hv_call(struct ghcb *ghcb,
 					  struct es_em_ctxt *ctxt,
 					  u64 exit_code, u64 exit_info_1,
 					  u64 exit_info_2);
+static inline int pvalidate(unsigned long vaddr, bool rmp_psize, bool validate)
+{
+	bool no_rmpupdate;
+	int rc;
+
+	/* "pvalidate" mnemonic support in binutils 2.36 and newer */
+	asm volatile(".byte 0xF2, 0x0F, 0x01, 0xFF\n\t"
+		     CC_SET(c)
+		     : CC_OUT(c) (no_rmpupdate), "=a"(rc)
+		     : "a"(vaddr), "c"(rmp_psize), "d"(validate)
+		     : "memory", "cc");
+
+	if (no_rmpupdate)
+		return PVALIDATE_FAIL_NOUPDATE;
+
+	return rc;
+}
 #else
 static inline void sev_es_ist_enter(struct pt_regs *regs) { }
 static inline void sev_es_ist_exit(void) { }
 static inline int sev_es_setup_ap_jump_table(struct real_mode_header *rmh) { return 0; }
 static inline void sev_es_nmi_complete(void) { }
 static inline int sev_es_efi_map_ghcbs(pgd_t *pgd) { return 0; }
+static inline int pvalidate(unsigned long vaddr, bool rmp_psize, bool validate) { return 0; }
 #endif
 
 #endif
-- 
2.25.1



* [PATCH v8 08/40] x86/sev: Check the vmpl level
  2021-12-10 15:42 [PATCH v8 00/40] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (6 preceding siblings ...)
  2021-12-10 15:42 ` [PATCH v8 07/40] x86/sev: Add a helper for the PVALIDATE instruction Brijesh Singh
@ 2021-12-10 15:43 ` Brijesh Singh
  2021-12-16 20:24   ` Venu Busireddy
  2021-12-10 15:43 ` [PATCH v8 09/40] x86/compressed: Add helper for validating pages in the decompression stage Brijesh Singh
                   ` (32 subsequent siblings)
  40 siblings, 1 reply; 183+ messages in thread
From: Brijesh Singh @ 2021-12-10 15:43 UTC (permalink / raw)

The Virtual Machine Privilege Level (VMPL) feature in the SEV-SNP
architecture allows a guest VM to divide its address space into four
levels. The levels can be used to provide hardware-isolated abstraction
layers within a VM. VMPL0 is the most privileged level and VMPL3 is the
least privileged. Certain operations must be done by VMPL0 software,
such as:

* Validate or invalidate memory range (PVALIDATE instruction)
* Allocate VMSA page (RMPADJUST instruction when VMSA=1)

The initial SEV-SNP support requires that the guest kernel run at
VMPL0. Add a check to make sure that the kernel is running at VMPL0 before
continuing the boot. There is no easy method to query the current VMPL,
so use the RMPADJUST instruction to determine whether the guest is
running at VMPL0.
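
For orientation, the attribute operand passed to RMPADJUST can be sketched as
below. The field layout is an assumption taken from the RMPADJUST description
in the AMD APM (target VMPL in bits 7:0, permission mask in bits 15:8, VMSA
flag in bit 16), and rmpadjust_attrs() is a hypothetical helper that is not
part of this patch.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: build the RMPADJUST attribute operand.
 * Assumed layout: bits 7:0 target VMPL, bits 15:8 permission mask,
 * bit 16 VMSA flag.
 */
static inline uint64_t rmpadjust_attrs(uint8_t target_vmpl, uint8_t perm_mask,
				       bool vmsa)
{
	return (uint64_t)target_vmpl |
	       ((uint64_t)perm_mask << 8) |
	       ((uint64_t)vmsa << 16);
}
```

Under that layout, the value 1 used for the check corresponds to
rmpadjust_attrs(1, 0, false): target VMPL1 with an all-zero permission mask.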

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/boot/compressed/sev.c    | 34 ++++++++++++++++++++++++++++---
 arch/x86/include/asm/sev-common.h |  1 +
 arch/x86/include/asm/sev.h        | 16 +++++++++++++++
 3 files changed, 48 insertions(+), 3 deletions(-)

diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index a0708f359a46..9be369f72299 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -212,6 +212,31 @@ static inline u64 rd_sev_status_msr(void)
 	return ((high << 32) | low);
 }
 
+static void enforce_vmpl0(void)
+{
+	u64 attrs;
+	int err;
+
+	/*
+	 * There is no straightforward way to query the current VMPL level. The
+	 * simplest method is to use the RMPADJUST instruction to change a page
+	 * permission to a VMPL level-1, and if the guest kernel is launched at
+	 * a level >= 1, then RMPADJUST instruction will return an error.
+	 */
+	attrs = 1;
+
+	/*
+	 * Any page-aligned virtual address is sufficient to test the VMPL level.
+	 * The boot_ghcb_page is page-aligned memory, so use it for the test.
+	 *
+	 * The RMPADJUST operation below clears the permission for the boot_ghcb_page
+	 * on VMPL1. If the guest is booted at the VMPL0, then there is no need to
+	 * restore the permissions because VMPL1 permission will be all zero.
+	 */
+	if (rmpadjust((unsigned long)&boot_ghcb_page, RMP_PG_SIZE_4K, attrs))
+		sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_NOT_VMPL0);
+}
+
 void sev_enable(struct boot_params *bp)
 {
 	unsigned int eax, ebx, ecx, edx;
@@ -252,11 +277,14 @@ void sev_enable(struct boot_params *bp)
 	/*
 	 * SNP is supported in v2 of the GHCB spec which mandates support for HV
 	 * features. If SEV-SNP is enabled, then check if the hypervisor supports
-	 * the SEV-SNP features.
+	 * the SEV-SNP features and is launched at VMPL0 level.
 	 */
-	if (sev_status & MSR_AMD64_SEV_SNP_ENABLED && !(get_hv_features() & GHCB_HV_FT_SNP))
-		sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SNP_UNSUPPORTED);
+	if (sev_status & MSR_AMD64_SEV_SNP_ENABLED) {
+		if (!(get_hv_features() & GHCB_HV_FT_SNP))
+			sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SNP_UNSUPPORTED);
 
+		enforce_vmpl0();
+	}
 
 	sme_me_mask = BIT_ULL(ebx & 0x3f);
 }
diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h
index 6f037c29a46e..7ac5842e32b6 100644
--- a/arch/x86/include/asm/sev-common.h
+++ b/arch/x86/include/asm/sev-common.h
@@ -89,6 +89,7 @@
 #define GHCB_TERM_REGISTER		0	/* GHCB GPA registration failure */
 #define GHCB_TERM_PSC			1	/* Page State Change failure */
 #define GHCB_TERM_PVALIDATE		2	/* Pvalidate failure */
+#define GHCB_TERM_NOT_VMPL0		3	/* SNP guest is not running at VMPL-0 */
 
 #define GHCB_RESP_CODE(v)		((v) & GHCB_MSR_INFO_MASK)
 
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 4ee98976aed8..e37451849165 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -63,6 +63,9 @@ extern bool handle_vc_boot_ghcb(struct pt_regs *regs);
 /* Software defined (when rFlags.CF = 1) */
 #define PVALIDATE_FAIL_NOUPDATE		255
 
+/* RMP page size */
+#define RMP_PG_SIZE_4K			0
+
 #ifdef CONFIG_AMD_MEM_ENCRYPT
 extern struct static_key_false sev_es_enable_key;
 extern void __sev_es_ist_enter(struct pt_regs *regs);
@@ -90,6 +93,18 @@ extern enum es_result sev_es_ghcb_hv_call(struct ghcb *ghcb,
 					  struct es_em_ctxt *ctxt,
 					  u64 exit_code, u64 exit_info_1,
 					  u64 exit_info_2);
+static inline int rmpadjust(unsigned long vaddr, bool rmp_psize, unsigned long attrs)
+{
+	int rc;
+
+	/* "rmpadjust" mnemonic support in binutils 2.36 and newer */
+	asm volatile(".byte 0xF3,0x0F,0x01,0xFE\n\t"
+		     : "=a"(rc)
+		     : "a"(vaddr), "c"(rmp_psize), "d"(attrs)
+		     : "memory", "cc");
+
+	return rc;
+}
 static inline int pvalidate(unsigned long vaddr, bool rmp_psize, bool validate)
 {
 	bool no_rmpupdate;
@@ -114,6 +129,7 @@ static inline int sev_es_setup_ap_jump_table(struct real_mode_header *rmh) { ret
 static inline void sev_es_nmi_complete(void) { }
 static inline int sev_es_efi_map_ghcbs(pgd_t *pgd) { return 0; }
 static inline int pvalidate(unsigned long vaddr, bool rmp_psize, bool validate) { return 0; }
+static inline int rmpadjust(unsigned long vaddr, bool rmp_psize, unsigned long attrs) { return 0; }
 #endif
 
 #endif
-- 
2.25.1



* [PATCH v8 09/40] x86/compressed: Add helper for validating pages in the decompression stage
  2021-12-10 15:42 [PATCH v8 00/40] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (7 preceding siblings ...)
  2021-12-10 15:43 ` [PATCH v8 08/40] x86/sev: Check the vmpl level Brijesh Singh
@ 2021-12-10 15:43 ` Brijesh Singh
  2021-12-17 20:47   ` Venu Busireddy
  2021-12-21 13:01   ` Borislav Petkov
  2021-12-10 15:43 ` [PATCH v8 10/40] x86/compressed: Register GHCB memory when SEV-SNP is active Brijesh Singh
                   ` (31 subsequent siblings)
  40 siblings, 2 replies; 183+ messages in thread
From: Brijesh Singh @ 2021-12-10 15:43 UTC (permalink / raw)

Many of the integrity guarantees of SEV-SNP are enforced through the
Reverse Map Table (RMP). Each RMP entry contains the GPA at which a
particular page of DRAM should be mapped. A VM can ask the hypervisor
to add pages to the RMP table via the Page State Change VMGEXIT
defined in the GHCB specification. Inside each RMP entry is a Validated
flag; this flag is automatically cleared to 0 by the CPU hardware when a
new RMP entry is created for a guest. Each VM page can be either
validated or invalidated, as indicated by the Validated flag in the RMP
entry. Memory access to a private page that is not validated generates
a #VC. A VM must use the PVALIDATE instruction to validate a private page
before using it.

To maintain the security guarantee of SEV-SNP guests, when transitioning
pages from private to shared, the guest must invalidate the pages before
asking the hypervisor to change the page state to shared in the RMP table.

After the pages are mapped private in the page table, the guest must issue
a page state change VMGEXIT to make the pages private in the RMP table and
then validate them.

On boot, the BIOS should have validated the entire system memory. During
the kernel decompression stage, the #VC handler uses
set_page_decrypted() to make the GHCB page shared (i.e., clear the
encryption attribute). While exiting from the decompression stage, it
calls set_page_encrypted() to make the page private again.

Add snp_set_page_{private,shared}() helpers that are used on the
set_page_{decrypted,encrypted}() paths to change the page state in the
RMP table.
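
The MSR-protocol request and response values involved can be sketched in user
space as follows. The constants and bit positions follow the sev-common.h
additions in this patch; psc_req_val() and psc_resp_ok() are hypothetical
helper names, PAGE_SHIFT is assumed to be 12, and the 0xfff response-code
mask is assumed from the code occupying GHCBData[11:0].

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT		12	/* assumed 4K pages */
#define GHCB_MSR_PSC_REQ	0x014
#define GHCB_MSR_PSC_RESP	0x015

enum psc_op {
	SNP_PAGE_STATE_PRIVATE = 1,
	SNP_PAGE_STATE_SHARED,
};

/*
 * Hypothetical helper: operation in GHCBData[55:52], guest frame number
 * in GHCBData[51:12], request code in GHCBData[11:0].
 */
static inline uint64_t psc_req_val(uint64_t paddr, enum psc_op op)
{
	uint64_t gfn = paddr >> PAGE_SHIFT;

	return ((uint64_t)(op & 0xf) << 52) |
	       ((gfn & ((1ULL << 40) - 1)) << 12) |
	       GHCB_MSR_PSC_REQ;
}

/*
 * Hypothetical helper: the response is good when the code matches and
 * the error field in GHCBData[63:32] is zero.
 */
static inline bool psc_resp_ok(uint64_t val)
{
	return (val & 0xfff) == GHCB_MSR_PSC_RESP && (val >> 32) == 0;
}
```

The ordering around this exchange is what matters for security: for a
private to shared transition the page is invalidated with PVALIDATE before
the request is issued, and for a shared to private transition the page is
validated only after the hypervisor reports success.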

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/boot/compressed/ident_map_64.c | 18 +++++++++-
 arch/x86/boot/compressed/misc.h         |  4 +++
 arch/x86/boot/compressed/sev.c          | 46 +++++++++++++++++++++++++
 arch/x86/include/asm/sev-common.h       | 26 ++++++++++++++
 4 files changed, 93 insertions(+), 1 deletion(-)

diff --git a/arch/x86/boot/compressed/ident_map_64.c b/arch/x86/boot/compressed/ident_map_64.c
index f7213d0943b8..ef77453cc629 100644
--- a/arch/x86/boot/compressed/ident_map_64.c
+++ b/arch/x86/boot/compressed/ident_map_64.c
@@ -275,15 +275,31 @@ static int set_clr_page_flags(struct x86_mapping_info *info,
 	 * Changing encryption attributes of a page requires to flush it from
 	 * the caches.
 	 */
-	if ((set | clr) & _PAGE_ENC)
+	if ((set | clr) & _PAGE_ENC) {
 		clflush_page(address);
 
+		/*
+		 * If the encryption attribute is being cleared, then change
+		 * the page state to shared in the RMP table.
+		 */
+		if (clr)
+			snp_set_page_shared(pte_pfn(*ptep) << PAGE_SHIFT);
+	}
+
 	/* Update PTE */
 	pte = *ptep;
 	pte = pte_set_flags(pte, set);
 	pte = pte_clear_flags(pte, clr);
 	set_pte(ptep, pte);
 
+	/*
+	 * If the encryption attribute is being set, then change the page state to
+	 * private in the RMP entry. The page state change must be done after
+	 * the PTE is updated.
+	 */
+	if (set & _PAGE_ENC)
+		snp_set_page_private(__pa(address & PAGE_MASK));
+
 	/* Flush TLB after changing encryption attribute */
 	write_cr3(top_level_pgt);
 
diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
index 23e0e395084a..01cc13c12059 100644
--- a/arch/x86/boot/compressed/misc.h
+++ b/arch/x86/boot/compressed/misc.h
@@ -124,6 +124,8 @@ static inline void console_init(void)
 void sev_enable(struct boot_params *bp);
 void sev_es_shutdown_ghcb(void);
 extern bool sev_es_check_ghcb_fault(unsigned long address);
+void snp_set_page_private(unsigned long paddr);
+void snp_set_page_shared(unsigned long paddr);
 #else
 static inline void sev_enable(struct boot_params *bp) { }
 static inline void sev_es_shutdown_ghcb(void) { }
@@ -131,6 +133,8 @@ static inline bool sev_es_check_ghcb_fault(unsigned long address)
 {
 	return false;
 }
+static inline void snp_set_page_private(unsigned long paddr) { }
+static inline void snp_set_page_shared(unsigned long paddr) { }
 #endif
 
 /* acpi.c */
diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index 9be369f72299..12a93acc94ba 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -119,6 +119,52 @@ static enum es_result vc_read_mem(struct es_em_ctxt *ctxt,
 /* Include code for early handlers */
 #include "../../kernel/sev-shared.c"
 
+static inline bool sev_snp_enabled(void)
+{
+	return sev_status & MSR_AMD64_SEV_SNP_ENABLED;
+}
+
+static void __page_state_change(unsigned long paddr, enum psc_op op)
+{
+	u64 val;
+
+	if (!sev_snp_enabled())
+		return;
+
+	/*
+	 * If private -> shared then invalidate the page before requesting the
+	 * state change in the RMP table.
+	 */
+	if (op == SNP_PAGE_STATE_SHARED && pvalidate(paddr, RMP_PG_SIZE_4K, 0))
+		sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_PVALIDATE);
+
+	/* Issue VMGEXIT to change the page state in RMP table. */
+	sev_es_wr_ghcb_msr(GHCB_MSR_PSC_REQ_GFN(paddr >> PAGE_SHIFT, op));
+	VMGEXIT();
+
+	/* Read the response of the VMGEXIT. */
+	val = sev_es_rd_ghcb_msr();
+	if ((GHCB_RESP_CODE(val) != GHCB_MSR_PSC_RESP) || GHCB_MSR_PSC_RESP_VAL(val))
+		sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_PSC);
+
+	/*
+	 * Now that page is added in the RMP table, validate it so that it is
+	 * consistent with the RMP entry.
+	 */
+	if (op == SNP_PAGE_STATE_PRIVATE && pvalidate(paddr, RMP_PG_SIZE_4K, 1))
+		sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_PVALIDATE);
+}
+
+void snp_set_page_private(unsigned long paddr)
+{
+	__page_state_change(paddr, SNP_PAGE_STATE_PRIVATE);
+}
+
+void snp_set_page_shared(unsigned long paddr)
+{
+	__page_state_change(paddr, SNP_PAGE_STATE_SHARED);
+}
+
 static bool early_setup_ghcb(void)
 {
 	if (set_page_decrypted((unsigned long)&boot_ghcb_page))
diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h
index 7ac5842e32b6..a2f956cfafba 100644
--- a/arch/x86/include/asm/sev-common.h
+++ b/arch/x86/include/asm/sev-common.h
@@ -57,6 +57,32 @@
 #define GHCB_MSR_AP_RESET_HOLD_REQ	0x006
 #define GHCB_MSR_AP_RESET_HOLD_RESP	0x007
 
+/*
+ * SNP Page State Change Operation
+ *
+ * GHCBData[55:52] - Page operation:
+ *   0x0001 – Page assignment, Private
+ *   0x0002 – Page assignment, Shared
+ */
+enum psc_op {
+	SNP_PAGE_STATE_PRIVATE = 1,
+	SNP_PAGE_STATE_SHARED,
+};
+
+#define GHCB_MSR_PSC_REQ		0x014
+#define GHCB_MSR_PSC_REQ_GFN(gfn, op)			\
+	/* GHCBData[55:52] */				\
+	(((u64)((op) & 0xf) << 52) |			\
+	/* GHCBData[51:12] */				\
+	((u64)((gfn) & GENMASK_ULL(39, 0)) << 12) |	\
+	/* GHCBData[11:0] */				\
+	GHCB_MSR_PSC_REQ)
+
+#define GHCB_MSR_PSC_RESP		0x015
+#define GHCB_MSR_PSC_RESP_VAL(val)			\
+	/* GHCBData[63:32] */				\
+	(((u64)(val) & GENMASK_ULL(63, 32)) >> 32)
+
 /* GHCB Hypervisor Feature Request/Response */
 #define GHCB_MSR_HV_FT_REQ		0x080
 #define GHCB_MSR_HV_FT_RESP		0x081
-- 
2.25.1



* [PATCH v8 10/40] x86/compressed: Register GHCB memory when SEV-SNP is active
  2021-12-10 15:42 [PATCH v8 00/40] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (8 preceding siblings ...)
  2021-12-10 15:43 ` [PATCH v8 09/40] x86/compressed: Add helper for validating pages in the decompression stage Brijesh Singh
@ 2021-12-10 15:43 ` Brijesh Singh
  2022-01-03 19:54   ` Venu Busireddy
  2021-12-10 15:43 ` [PATCH v8 11/40] x86/sev: " Brijesh Singh
                   ` (30 subsequent siblings)
  40 siblings, 1 reply; 183+ messages in thread
From: Brijesh Singh @ 2021-12-10 15:43 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

The SEV-SNP guest is required by the GHCB spec to register the GHCB's
Guest Physical Address (GPA). This is because the hypervisor may prefer
that a guest use a consistent and/or specific GPA for the GHCB associated
with a vCPU. For more information, see the GHCB specification section
"GHCB GPA Registration".

If the hypervisor cannot work with the guest-provided GPA, then terminate
the guest boot.
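
The registration handshake described above can be sketched in user-space
C. This is only an illustration of the bit layout, not the kernel code:
the helper names reg_gpa_request() and reg_gpa_accepted() are hypothetical,
and the 12-bit response-code mask (0xfff) is an assumption about what
GHCB_RESP_CODE() extracts. The request and response codes (0x012, 0x013)
are the ones defined in this patch.

```c
#include <stdint.h>

/* Request/response codes as defined in the patch (GHCBData[11:0]). */
#define REG_GPA_REQ	0x012ULL
#define REG_GPA_RESP	0x013ULL
/* Assumed width of the response-code field. */
#define INFO_MASK	0xfffULL

/* Build the MSR value: GFN in GHCBData[63:12], request code in [11:0]. */
static uint64_t reg_gpa_request(uint64_t gfn)
{
	return ((gfn & ((1ULL << 52) - 1)) << 12) | REG_GPA_REQ;
}

/*
 * Return 1 only if the response carries the expected code and echoes
 * back exactly the GFN that was requested, otherwise 0.
 */
static int reg_gpa_accepted(uint64_t resp, uint64_t gfn)
{
	if ((resp & INFO_MASK) != REG_GPA_RESP)
		return 0;

	return (resp >> 12) == gfn;
}
```

A caller that gets 0 back from reg_gpa_accepted() corresponds to the
termination path in snp_register_ghcb_early().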

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/boot/compressed/sev.c    |  4 ++++
 arch/x86/include/asm/sev-common.h | 13 +++++++++++++
 arch/x86/kernel/sev-shared.c      | 16 ++++++++++++++++
 3 files changed, 33 insertions(+)

diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index 12a93acc94ba..348f7711c3ea 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -178,6 +178,10 @@ static bool early_setup_ghcb(void)
 	/* Initialize lookup tables for the instruction decoder */
 	inat_init_tables();
 
+	/* An SEV-SNP guest requires that the GHCB GPA be registered */
+	if (sev_snp_enabled())
+		snp_register_ghcb_early(__pa(&boot_ghcb_page));
+
 	return true;
 }
 
diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h
index a2f956cfafba..6dc27963690e 100644
--- a/arch/x86/include/asm/sev-common.h
+++ b/arch/x86/include/asm/sev-common.h
@@ -57,6 +57,19 @@
 #define GHCB_MSR_AP_RESET_HOLD_REQ	0x006
 #define GHCB_MSR_AP_RESET_HOLD_RESP	0x007
 
+/* GHCB GPA Register */
+#define GHCB_MSR_REG_GPA_REQ		0x012
+#define GHCB_MSR_REG_GPA_REQ_VAL(v)			\
+	/* GHCBData[63:12] */				\
+	(((u64)((v) & GENMASK_ULL(51, 0)) << 12) |	\
+	/* GHCBData[11:0] */				\
+	GHCB_MSR_REG_GPA_REQ)
+
+#define GHCB_MSR_REG_GPA_RESP		0x013
+#define GHCB_MSR_REG_GPA_RESP_VAL(v)			\
+	/* GHCBData[63:12] */				\
+	(((u64)(v) & GENMASK_ULL(63, 12)) >> 12)
+
 /*
  * SNP Page State Change Operation
  *
diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c
index 4a876e684f67..e9ff13cd90b0 100644
--- a/arch/x86/kernel/sev-shared.c
+++ b/arch/x86/kernel/sev-shared.c
@@ -68,6 +68,22 @@ static u64 get_hv_features(void)
 	return GHCB_MSR_HV_FT_RESP_VAL(val);
 }
 
+static void __maybe_unused snp_register_ghcb_early(unsigned long paddr)
+{
+	unsigned long pfn = paddr >> PAGE_SHIFT;
+	u64 val;
+
+	sev_es_wr_ghcb_msr(GHCB_MSR_REG_GPA_REQ_VAL(pfn));
+	VMGEXIT();
+
+	val = sev_es_rd_ghcb_msr();
+
+	/* If the response GPA is not ours then abort the guest */
+	if ((GHCB_RESP_CODE(val) != GHCB_MSR_REG_GPA_RESP) ||
+	    (GHCB_MSR_REG_GPA_RESP_VAL(val) != pfn))
+		sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_REGISTER);
+}
+
 static bool sev_es_negotiate_protocol(void)
 {
 	u64 val;
-- 
2.25.1



* [PATCH v8 11/40] x86/sev: Register GHCB memory when SEV-SNP is active
  2021-12-10 15:42 [PATCH v8 00/40] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (9 preceding siblings ...)
  2021-12-10 15:43 ` [PATCH v8 10/40] x86/compressed: Register GHCB memory when SEV-SNP is active Brijesh Singh
@ 2021-12-10 15:43 ` Brijesh Singh
  2021-12-22 13:16   ` Borislav Petkov
  2022-01-03 22:47   ` Venu Busireddy
  2021-12-10 15:43 ` [PATCH v8 12/40] x86/sev: Add helper for validating pages in early enc attribute changes Brijesh Singh
                   ` (29 subsequent siblings)
  40 siblings, 2 replies; 183+ messages in thread
From: Brijesh Singh @ 2021-12-10 15:43 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

The SEV-SNP guest is required by the GHCB spec to register the GHCB's
Guest Physical Address (GPA). This is because the hypervisor may prefer
that a guest use a consistent and/or specific GPA for the GHCB associated
with a vCPU. For more information, see the GHCB specification section
"GHCB GPA Registration".

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/include/asm/sev.h   |   2 +
 arch/x86/kernel/cpu/common.c |   4 ++
 arch/x86/kernel/head64.c     |   1 +
 arch/x86/kernel/sev-shared.c |   2 +-
 arch/x86/kernel/sev.c        | 120 ++++++++++++++++++++---------------
 5 files changed, 77 insertions(+), 52 deletions(-)

diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index e37451849165..0df508374a35 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -122,6 +122,7 @@ static inline int pvalidate(unsigned long vaddr, bool rmp_psize, bool validate)
 
 	return rc;
 }
+void sev_snp_register_ghcb(void);
 #else
 static inline void sev_es_ist_enter(struct pt_regs *regs) { }
 static inline void sev_es_ist_exit(void) { }
@@ -130,6 +131,7 @@ static inline void sev_es_nmi_complete(void) { }
 static inline int sev_es_efi_map_ghcbs(pgd_t *pgd) { return 0; }
 static inline int pvalidate(unsigned long vaddr, bool rmp_psize, bool validate) { return 0; }
 static inline int rmpadjust(unsigned long vaddr, bool rmp_psize, unsigned long attrs) { return 0; }
+static inline void sev_snp_register_ghcb(void) { }
 #endif
 
 #endif
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 0663642d6199..1146e8920b03 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -59,6 +59,7 @@
 #include <asm/cpu_device_id.h>
 #include <asm/uv/uv.h>
 #include <asm/sigframe.h>
+#include <asm/sev.h>
 
 #include "cpu.h"
 
@@ -1988,6 +1989,9 @@ void cpu_init_exception_handling(void)
 
 	load_TR_desc();
 
+	/* Register the GHCB before taking any VC exception */
+	sev_snp_register_ghcb();
+
 	/* Finally load the IDT */
 	load_current_idt();
 }
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index b01f64e8389b..fa02402dcb9b 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -588,6 +588,7 @@ void early_setup_idt(void)
 
 	bringup_idt_descr.address = (unsigned long)bringup_idt_table;
 	native_load_idt(&bringup_idt_descr);
+	sev_snp_register_ghcb();
 }
 
 /*
diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c
index e9ff13cd90b0..3aaef1a18ffe 100644
--- a/arch/x86/kernel/sev-shared.c
+++ b/arch/x86/kernel/sev-shared.c
@@ -68,7 +68,7 @@ static u64 get_hv_features(void)
 	return GHCB_MSR_HV_FT_RESP_VAL(val);
 }
 
-static void __maybe_unused snp_register_ghcb_early(unsigned long paddr)
+static void snp_register_ghcb_early(unsigned long paddr)
 {
 	unsigned long pfn = paddr >> PAGE_SHIFT;
 	u64 val;
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index a0cada8398a4..17ad603f62da 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -162,55 +162,6 @@ void noinstr __sev_es_ist_exit(void)
 	this_cpu_write(cpu_tss_rw.x86_tss.ist[IST_INDEX_VC], *(unsigned long *)ist);
 }
 
-/*
- * Nothing shall interrupt this code path while holding the per-CPU
- * GHCB. The backup GHCB is only for NMIs interrupting this path.
- *
- * Callers must disable local interrupts around it.
- */
-static noinstr struct ghcb *__sev_get_ghcb(struct ghcb_state *state)
-{
-	struct sev_es_runtime_data *data;
-	struct ghcb *ghcb;
-
-	WARN_ON(!irqs_disabled());
-
-	data = this_cpu_read(runtime_data);
-	ghcb = &data->ghcb_page;
-
-	if (unlikely(data->ghcb_active)) {
-		/* GHCB is already in use - save its contents */
-
-		if (unlikely(data->backup_ghcb_active)) {
-			/*
-			 * Backup-GHCB is also already in use. There is no way
-			 * to continue here so just kill the machine. To make
-			 * panic() work, mark GHCBs inactive so that messages
-			 * can be printed out.
-			 */
-			data->ghcb_active        = false;
-			data->backup_ghcb_active = false;
-
-			instrumentation_begin();
-			panic("Unable to handle #VC exception! GHCB and Backup GHCB are already in use");
-			instrumentation_end();
-		}
-
-		/* Mark backup_ghcb active before writing to it */
-		data->backup_ghcb_active = true;
-
-		state->ghcb = &data->backup_ghcb;
-
-		/* Backup GHCB content */
-		*state->ghcb = *ghcb;
-	} else {
-		state->ghcb = NULL;
-		data->ghcb_active = true;
-	}
-
-	return ghcb;
-}
-
 static inline u64 sev_es_rd_ghcb_msr(void)
 {
 	return __rdmsr(MSR_AMD64_SEV_ES_GHCB);
@@ -484,6 +435,55 @@ static enum es_result vc_slow_virt_to_phys(struct ghcb *ghcb, struct es_em_ctxt
 /* Include code shared with pre-decompression boot stage */
 #include "sev-shared.c"
 
+/*
+ * Nothing shall interrupt this code path while holding the per-CPU
+ * GHCB. The backup GHCB is only for NMIs interrupting this path.
+ *
+ * Callers must disable local interrupts around it.
+ */
+static noinstr struct ghcb *__sev_get_ghcb(struct ghcb_state *state)
+{
+	struct sev_es_runtime_data *data;
+	struct ghcb *ghcb;
+
+	WARN_ON(!irqs_disabled());
+
+	data = this_cpu_read(runtime_data);
+	ghcb = &data->ghcb_page;
+
+	if (unlikely(data->ghcb_active)) {
+		/* GHCB is already in use - save its contents */
+
+		if (unlikely(data->backup_ghcb_active)) {
+			/*
+			 * Backup-GHCB is also already in use. There is no way
+			 * to continue here so just kill the machine. To make
+			 * panic() work, mark GHCBs inactive so that messages
+			 * can be printed out.
+			 */
+			data->ghcb_active        = false;
+			data->backup_ghcb_active = false;
+
+			instrumentation_begin();
+			panic("Unable to handle #VC exception! GHCB and Backup GHCB are already in use");
+			instrumentation_end();
+		}
+
+		/* Mark backup_ghcb active before writing to it */
+		data->backup_ghcb_active = true;
+
+		state->ghcb = &data->backup_ghcb;
+
+		/* Backup GHCB content */
+		*state->ghcb = *ghcb;
+	} else {
+		state->ghcb = NULL;
+		data->ghcb_active = true;
+	}
+
+	return ghcb;
+}
+
 static noinstr void __sev_put_ghcb(struct ghcb_state *state)
 {
 	struct sev_es_runtime_data *data;
@@ -652,7 +652,7 @@ static enum es_result vc_handle_msr(struct ghcb *ghcb, struct es_em_ctxt *ctxt)
  * This function runs on the first #VC exception after the kernel
  * switched to virtual addresses.
  */
-static bool __init sev_es_setup_ghcb(void)
+static bool __init setup_ghcb(void)
 {
 	/* First make sure the hypervisor talks a supported protocol. */
 	if (!sev_es_negotiate_protocol())
@@ -667,6 +667,10 @@ static bool __init sev_es_setup_ghcb(void)
 	/* Alright - Make the boot-ghcb public */
 	boot_ghcb = &boot_ghcb_page;
 
+	/* An SEV-SNP guest requires that the GHCB GPA be registered. */
+	if (cc_platform_has(CC_ATTR_SEV_SNP))
+		snp_register_ghcb_early(__pa(&boot_ghcb_page));
+
 	return true;
 }
 
@@ -758,6 +762,20 @@ static void __init init_ghcb(int cpu)
 	data->backup_ghcb_active = false;
 }
 
+void sev_snp_register_ghcb(void)
+{
+	struct sev_es_runtime_data *data;
+	struct ghcb *ghcb;
+
+	if (!cc_platform_has(CC_ATTR_SEV_SNP))
+		return;
+
+	data = this_cpu_read(runtime_data);
+	ghcb = &data->ghcb_page;
+
+	snp_register_ghcb_early(__pa(ghcb));
+}
+
 void __init sev_es_init_vc_handling(void)
 {
 	int cpu;
@@ -1400,7 +1418,7 @@ bool __init handle_vc_boot_ghcb(struct pt_regs *regs)
 	enum es_result result;
 
 	/* Do initial setup or terminate the guest */
-	if (unlikely(boot_ghcb == NULL && !sev_es_setup_ghcb()))
+	if (unlikely(boot_ghcb == NULL && !setup_ghcb()))
 		sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_GEN_REQ);
 
 	vc_ghcb_invalidate(boot_ghcb);
-- 
2.25.1



* [PATCH v8 12/40] x86/sev: Add helper for validating pages in early enc attribute changes
  2021-12-10 15:42 [PATCH v8 00/40] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (10 preceding siblings ...)
  2021-12-10 15:43 ` [PATCH v8 11/40] x86/sev: " Brijesh Singh
@ 2021-12-10 15:43 ` Brijesh Singh
  2021-12-23 11:50   ` Borislav Petkov
  2022-01-03 23:28   ` Venu Busireddy
  2021-12-10 15:43 ` [PATCH v8 13/40] x86/kernel: Make the bss.decrypted section shared in RMP table Brijesh Singh
                   ` (28 subsequent siblings)
  40 siblings, 2 replies; 183+ messages in thread
From: Brijesh Singh @ 2021-12-10 15:43 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

The early_set_memory_{encrypt,decrypt}() helpers are used for changing a
page from decrypted (shared) to encrypted (private) and vice versa.
When SEV-SNP is active, the page state transition needs to go through
additional steps.

If the page is transitioned from shared to private, then perform the
following after the encryption attribute is set in the page table:

1. Issue the page state change VMGEXIT to add the page as private
   in the RMP table.
2. Validate the page after it is successfully added to the RMP table.

To maintain the security guarantees, if the page is transitioned from
private to shared, then perform the following before clearing the
encryption attribute from the page table.

1. Invalidate the page.
2. Issue the page state change VMGEXIT to make the page shared in the
   RMP table.

The early_set_memory_{encrypt,decrypt}() helpers can be called before the
GHCB is set up, so use the SNP page state MSR protocol VMGEXIT defined in
the GHCB specification to request the page state change in the RMP table.
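
As a rough illustration of the MSR protocol request mentioned above, the
following user-space sketch packs and checks the 64-bit value the same
way the GHCB_MSR_PSC_REQ_GFN() and GHCB_MSR_PSC_RESP_VAL() macros in this
series do. The helper names psc_request() and psc_failed() are
hypothetical, and the 0xfff response-code mask is an assumption.

```c
#include <stdint.h>

#define PSC_REQ		0x014ULL	/* GHCB_MSR_PSC_REQ */
#define PSC_RESP	0x015ULL	/* GHCB_MSR_PSC_RESP */

enum { PSC_PRIVATE = 1, PSC_SHARED = 2 };

/* op in GHCBData[55:52], GFN in GHCBData[51:12], code in GHCBData[11:0]. */
static uint64_t psc_request(uint64_t gfn, unsigned int op)
{
	return ((uint64_t)(op & 0xf) << 52) |
	       ((gfn & ((1ULL << 40) - 1)) << 12) |
	       PSC_REQ;
}

/*
 * Return 1 if the response is not a PSC response, or if the error
 * field in GHCBData[63:32] is non-zero. Return 0 on success.
 */
static int psc_failed(uint64_t resp)
{
	if ((resp & 0xfff) != PSC_RESP)
		return 1;

	return (resp >> 32) != 0;
}
```

Because the whole request fits in one MSR write, this path needs no
shared GHCB page, which is why it is usable this early in boot.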

While at it, add a helper snp_prep_memory() that can be used outside
the SEV-specific files to change the page state for a specified memory
range.

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/include/asm/sev.h |  10 ++++
 arch/x86/kernel/sev.c      | 102 +++++++++++++++++++++++++++++++++++++
 arch/x86/mm/mem_encrypt.c  |  51 +++++++++++++++++--
 3 files changed, 159 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 0df508374a35..eec2e1b9d557 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -123,6 +123,11 @@ static inline int pvalidate(unsigned long vaddr, bool rmp_psize, bool validate)
 	return rc;
 }
 void sev_snp_register_ghcb(void);
+void __init early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr,
+					 unsigned int npages);
+void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr,
+					unsigned int npages);
+void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op);
 #else
 static inline void sev_es_ist_enter(struct pt_regs *regs) { }
 static inline void sev_es_ist_exit(void) { }
@@ -132,6 +137,11 @@ static inline int sev_es_efi_map_ghcbs(pgd_t *pgd) { return 0; }
 static inline int pvalidate(unsigned long vaddr, bool rmp_psize, bool validate) { return 0; }
 static inline int rmpadjust(unsigned long vaddr, bool rmp_psize, unsigned long attrs) { return 0; }
 static inline void sev_snp_register_ghcb(void) { }
+static inline void __init
+early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr, unsigned int npages) { }
+static inline void __init
+early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr, unsigned int npages) { }
+static inline void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op) { }
 #endif
 
 #endif
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index 17ad603f62da..2971aa280ce6 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -557,6 +557,108 @@ static u64 get_jump_table_addr(void)
 	return ret;
 }
 
+static void pvalidate_pages(unsigned long vaddr, unsigned int npages, bool validate)
+{
+	unsigned long vaddr_end;
+	int rc;
+
+	vaddr = vaddr & PAGE_MASK;
+	vaddr_end = vaddr + (npages << PAGE_SHIFT);
+
+	while (vaddr < vaddr_end) {
+		rc = pvalidate(vaddr, RMP_PG_SIZE_4K, validate);
+		if (WARN(rc, "Failed to validate address 0x%lx ret %d", vaddr, rc))
+			sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_PVALIDATE);
+
+		vaddr = vaddr + PAGE_SIZE;
+	}
+}
+
+static void __init early_set_page_state(unsigned long paddr, unsigned int npages, enum psc_op op)
+{
+	unsigned long paddr_end;
+	u64 val;
+
+	paddr = paddr & PAGE_MASK;
+	paddr_end = paddr + (npages << PAGE_SHIFT);
+
+	while (paddr < paddr_end) {
+		/*
+		 * Use the MSR protocol because this function can be called before the GHCB
+		 * is established.
+		 */
+		sev_es_wr_ghcb_msr(GHCB_MSR_PSC_REQ_GFN(paddr >> PAGE_SHIFT, op));
+		VMGEXIT();
+
+		val = sev_es_rd_ghcb_msr();
+
+		if (WARN(GHCB_RESP_CODE(val) != GHCB_MSR_PSC_RESP,
+			 "Wrong PSC response code: 0x%x\n",
+			 (unsigned int)GHCB_RESP_CODE(val)))
+			goto e_term;
+
+		if (WARN(GHCB_MSR_PSC_RESP_VAL(val),
+			 "Failed to change page state to '%s' paddr 0x%lx error 0x%llx\n",
+			 op == SNP_PAGE_STATE_PRIVATE ? "private" : "shared",
+			 paddr, GHCB_MSR_PSC_RESP_VAL(val)))
+			goto e_term;
+
+		paddr = paddr + PAGE_SIZE;
+	}
+
+	return;
+
+e_term:
+	sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_PSC);
+}
+
+void __init early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr,
+					 unsigned int npages)
+{
+	if (!cc_platform_has(CC_ATTR_SEV_SNP))
+		return;
+
+	 /*
+	  * Ask the hypervisor to mark the memory pages as private in the RMP
+	  * table.
+	  */
+	early_set_page_state(paddr, npages, SNP_PAGE_STATE_PRIVATE);
+
+	/* Validate the memory pages after they've been added in the RMP table. */
+	pvalidate_pages(vaddr, npages, 1);
+}
+
+void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr,
+					unsigned int npages)
+{
+	if (!cc_platform_has(CC_ATTR_SEV_SNP))
+		return;
+
+	/*
+	 * Invalidate the memory pages before they are marked shared in the
+	 * RMP table.
+	 */
+	pvalidate_pages(vaddr, npages, 0);
+
+	 /* Ask hypervisor to mark the memory pages shared in the RMP table. */
+	early_set_page_state(paddr, npages, SNP_PAGE_STATE_SHARED);
+}
+
+void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op)
+{
+	unsigned long vaddr, npages;
+
+	vaddr = (unsigned long)__va(paddr);
+	npages = PAGE_ALIGN(sz) >> PAGE_SHIFT;
+
+	if (op == SNP_PAGE_STATE_PRIVATE)
+		early_snp_set_memory_private(vaddr, paddr, npages);
+	else if (op == SNP_PAGE_STATE_SHARED)
+		early_snp_set_memory_shared(vaddr, paddr, npages);
+	else
+		WARN(1, "invalid memory op %d\n", op);
+}
+
 int sev_es_setup_ap_jump_table(struct real_mode_header *rmh)
 {
 	u16 startup_cs, startup_ip;
diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index 3ba801ff6afc..5d19aad06670 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -31,6 +31,7 @@
 #include <asm/processor-flags.h>
 #include <asm/msr.h>
 #include <asm/cmdline.h>
+#include <asm/sev.h>
 
 #include "mm_internal.h"
 
@@ -49,6 +50,34 @@ EXPORT_SYMBOL_GPL(sev_enable_key);
 /* Buffer used for early in-place encryption by BSP, no locking needed */
 static char sme_early_buffer[PAGE_SIZE] __initdata __aligned(PAGE_SIZE);
 
+/*
+ * When SNP is active, change the page state from private to shared before
+ * copying the data from the source to destination and restore after the copy.
+ * This is required because the source address is mapped as decrypted by the
+ * caller of the routine.
+ */
+static inline void __init snp_memcpy(void *dst, void *src, size_t sz,
+				     unsigned long paddr, bool decrypt)
+{
+	unsigned long npages = PAGE_ALIGN(sz) >> PAGE_SHIFT;
+
+	if (!cc_platform_has(CC_ATTR_SEV_SNP) || !decrypt) {
+		memcpy(dst, src, sz);
+		return;
+	}
+
+	/*
+	 * With SNP, the paddr needs to be accessed decrypted, mark the page
+	 * shared in the RMP table before copying it.
+	 */
+	early_snp_set_memory_shared((unsigned long)__va(paddr), paddr, npages);
+
+	memcpy(dst, src, sz);
+
+	/* Restore the page state after the memcpy. */
+	early_snp_set_memory_private((unsigned long)__va(paddr), paddr, npages);
+}
+
 /*
  * This routine does not change the underlying encryption setting of the
  * page(s) that map this memory. It assumes that eventually the memory is
@@ -97,8 +126,8 @@ static void __init __sme_early_enc_dec(resource_size_t paddr,
 		 * Use a temporary buffer, of cache-line multiple size, to
 		 * avoid data corruption as documented in the APM.
 		 */
-		memcpy(sme_early_buffer, src, len);
-		memcpy(dst, sme_early_buffer, len);
+		snp_memcpy(sme_early_buffer, src, len, paddr, enc);
+		snp_memcpy(dst, sme_early_buffer, len, paddr, !enc);
 
 		early_memunmap(dst, len);
 		early_memunmap(src, len);
@@ -320,14 +349,28 @@ static void __init __set_clr_pte_enc(pte_t *kpte, int level, bool enc)
 	clflush_cache_range(__va(pa), size);
 
 	/* Encrypt/decrypt the contents in-place */
-	if (enc)
+	if (enc) {
 		sme_early_encrypt(pa, size);
-	else
+	} else {
 		sme_early_decrypt(pa, size);
 
+		/*
+		 * On SNP, the page state change in the RMP table must happen
+		 * before the page table updates.
+		 */
+		early_snp_set_memory_shared((unsigned long)__va(pa), pa, 1);
+	}
+
 	/* Change the page encryption mask. */
 	new_pte = pfn_pte(pfn, new_prot);
 	set_pte_atomic(kpte, new_pte);
+
+	/*
+	 * If page is set encrypted in the page table, then update the RMP table to
+	 * add this page as private.
+	 */
+	if (enc)
+		early_snp_set_memory_private((unsigned long)__va(pa), pa, 1);
 }
 
 static int __init early_set_memory_enc_dec(unsigned long vaddr,
-- 
2.25.1



* [PATCH v8 13/40] x86/kernel: Make the bss.decrypted section shared in RMP table
  2021-12-10 15:42 [PATCH v8 00/40] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (11 preceding siblings ...)
  2021-12-10 15:43 ` [PATCH v8 12/40] x86/sev: Add helper for validating pages in early enc attribute changes Brijesh Singh
@ 2021-12-10 15:43 ` Brijesh Singh
  2021-12-28 11:53   ` Borislav Petkov
  2022-01-04 17:56   ` Venu Busireddy
  2021-12-10 15:43 ` [PATCH v8 14/40] x86/kernel: Validate rom memory before accessing when SEV-SNP is active Brijesh Singh
                   ` (27 subsequent siblings)
  40 siblings, 2 replies; 183+ messages in thread
From: Brijesh Singh @ 2021-12-10 15:43 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

The encryption attribute for the bss.decrypted region is cleared in the
initial page table build. This is because the section contains data
that needs to be shared between the guest and the hypervisor.

When SEV-SNP is active, just clearing the encryption attribute in the
page table is not enough. The page state also needs to be updated in the
RMP table.
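
To give a feel for the amount of work involved, here is a small
user-space sketch of the arithmetic behind the loop in this patch: the
region is walked in PMD-sized steps, and each step covers PTRS_PER_PMD
4K pages in the RMP table. The constants assume x86-64 with 4K base
pages (PMD_SIZE of 2MB, 512 entries per PMD); the helper names are
hypothetical.

```c
/* Assumed x86-64 values. */
#define PMD_SHIFT	21
#define PMD_SIZE	(1UL << PMD_SHIFT)
#define PTRS_PER_PMD	512UL

/* Count the PMD-sized steps the loop takes over [start, end). */
static unsigned long pmd_chunks(unsigned long start, unsigned long end)
{
	unsigned long n = 0;

	for (; start < end; start += PMD_SIZE)
		n++;

	return n;
}

/* Each step asks for PTRS_PER_PMD 4K pages to be made shared. */
static unsigned long pages_made_shared(unsigned long start, unsigned long end)
{
	return pmd_chunks(start, end) * PTRS_PER_PMD;
}
```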

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/kernel/head64.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index fa02402dcb9b..72c5082a3ba4 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -143,7 +143,14 @@ static unsigned long sme_postprocess_startup(struct boot_params *bp, pmdval_t *p
 	if (sme_get_me_mask()) {
 		vaddr = (unsigned long)__start_bss_decrypted;
 		vaddr_end = (unsigned long)__end_bss_decrypted;
+
 		for (; vaddr < vaddr_end; vaddr += PMD_SIZE) {
+			/*
+			 * When SEV-SNP is active, transition the page to shared in the RMP
+			 * table so that it is consistent with the page table attribute change.
+			 */
+			early_snp_set_memory_shared(__pa(vaddr), __pa(vaddr), PTRS_PER_PMD);
+
 			i = pmd_index(vaddr);
 			pmd[i] -= sme_get_me_mask();
 		}
-- 
2.25.1



* [PATCH v8 14/40] x86/kernel: Validate rom memory before accessing when SEV-SNP is active
  2021-12-10 15:42 [PATCH v8 00/40] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (12 preceding siblings ...)
  2021-12-10 15:43 ` [PATCH v8 13/40] x86/kernel: Make the bss.decrypted section shared in RMP table Brijesh Singh
@ 2021-12-10 15:43 ` Brijesh Singh
  2021-12-28 15:40   ` Borislav Petkov
  2021-12-10 15:43 ` [PATCH v8 15/40] x86/mm: Add support to validate memory when changing C-bit Brijesh Singh
                   ` (26 subsequent siblings)
  40 siblings, 1 reply; 183+ messages in thread
From: Brijesh Singh @ 2021-12-10 15:43 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

probe_roms() accesses the memory range (0xc0000 - 0x100000) to probe
various ROMs. The memory range is not part of the E820 system RAM
range. The memory range is mapped as private (i.e. encrypted) in the
page table.

When SEV-SNP is active, all private memory must be validated before it
is accessed. The ROM range is not part of the E820 map, so the guest
BIOS did not validate it. An access to memory that has not been
validated causes a #VC exception. The guest does not yet support
handling the not-validated #VC exception, so validate the ROM memory
regions before they are accessed.
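
The size of the range passed to snp_prep_memory() can be worked out with
a short user-space sketch. It assumes the legacy values 0xc0000 for
video_rom_resource.start and 0xfffff for system_rom_resource.end (neither
is shown in this patch) and 4K pages; pages_for() and rom_range_pages()
are hypothetical helper names.

```c
#define PAGE_SHIFT	12
#define PAGE_SIZE	(1UL << PAGE_SHIFT)

/* Round a byte count up to whole 4K pages, like PAGE_ALIGN(sz) >> PAGE_SHIFT. */
static unsigned long pages_for(unsigned long size)
{
	return (size + PAGE_SIZE - 1) >> PAGE_SHIFT;
}

/* The end address is inclusive, hence the +1 as in the patch. */
static unsigned long rom_range_pages(unsigned long first, unsigned long last)
{
	return pages_for((last + 1) - first);
}
```

Under those assumptions, rom_range_pages(0xc0000, 0xfffff) covers 0x40000
bytes, i.e. 64 pages that are validated one 4K page at a time.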

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/kernel/probe_roms.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/probe_roms.c b/arch/x86/kernel/probe_roms.c
index 36e84d904260..d19a80565252 100644
--- a/arch/x86/kernel/probe_roms.c
+++ b/arch/x86/kernel/probe_roms.c
@@ -21,6 +21,7 @@
 #include <asm/sections.h>
 #include <asm/io.h>
 #include <asm/setup_arch.h>
+#include <asm/sev.h>
 
 static struct resource system_rom_resource = {
 	.name	= "System ROM",
@@ -197,11 +198,21 @@ static int __init romchecksum(const unsigned char *rom, unsigned long length)
 
 void __init probe_roms(void)
 {
-	const unsigned char *rom;
 	unsigned long start, length, upper;
+	const unsigned char *rom;
 	unsigned char c;
 	int i;
 
+	/*
+	 * The ROM memory is not part of the E820 system RAM and is not pre-validated
+	 * by the BIOS. The kernel page table maps the ROM region as encrypted memory,
+	 * and SEV-SNP requires that encrypted memory be validated before it is
+	 * accessed. Validate the ROM before accessing it.
+	 */
+	snp_prep_memory(video_rom_resource.start,
+			((system_rom_resource.end + 1) - video_rom_resource.start),
+			SNP_PAGE_STATE_PRIVATE);
+
 	/* video rom */
 	upper = adapter_rom_resources[0].start;
 	for (start = video_rom_resource.start; start < upper; start += 2048) {
-- 
2.25.1



* [PATCH v8 15/40] x86/mm: Add support to validate memory when changing C-bit
  2021-12-10 15:42 [PATCH v8 00/40] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (13 preceding siblings ...)
  2021-12-10 15:43 ` [PATCH v8 14/40] x86/kernel: Validate rom memory before accessing when SEV-SNP is active Brijesh Singh
@ 2021-12-10 15:43 ` Brijesh Singh
  2021-12-29 11:09   ` Borislav Petkov
  2022-01-04 22:31   ` Venu Busireddy
  2021-12-10 15:43 ` [PATCH v8 16/40] KVM: SVM: Define sev_features and vmpl field in the VMSA Brijesh Singh
                   ` (25 subsequent siblings)
  40 siblings, 2 replies; 183+ messages in thread
From: Brijesh Singh @ 2021-12-10 15:43 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

The set_memory_{encrypt,decrypt}() are used for changing the pages
from decrypted (shared) to encrypted (private) and vice versa.
When SEV-SNP is active, the page state transition needs to go through
additional steps.

If the page is transitioned from shared to private, then perform the
following after the encryption attribute is set in the page table:

1. Issue the page state change VMGEXIT to add the memory region in
   the RMP table.
2. Validate the memory region after the RMP entry is added.

To maintain the security guarantees, if the page is transitioned from
private to shared, then perform the following before the encryption
attribute is removed from the page table:

1. Invalidate the page.
2. Issue the page state change VMGEXIT to remove the page from the RMP
   table.

To change the page state in the RMP table, use the Page State Change
VMGEXIT defined in the GHCB specification.

The GHCB specification provides the flexibility to use either a 4K or 2MB
page size during the page state change (PSC) request. For now, use the
4K page size for all PSC requests until page size tracking is supported
in the kernel.
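
The descriptor layout added in this patch can be illustrated with a
user-space sketch. It packs one entry with explicit shifts instead of
bitfields, assuming the struct psc_entry fields are laid out from the
least significant bit upward (as on x86-64 with GCC), and it assumes
that a pagesize value of 0 selects 4K. The helper names are hypothetical.

```c
#include <stdint.h>

#define PSC_MAX_ENTRY	253	/* VMGEXIT_PSC_MAX_ENTRY in the patch */

/* cur_page[11:0] = 0, gfn[51:12], operation[55:52], pagesize[56]. */
static uint64_t psc_entry_pack(uint64_t gfn, unsigned int op,
			       unsigned int pagesize)
{
	return ((gfn & ((1ULL << 40) - 1)) << 12) |
	       ((uint64_t)(op & 0xf) << 52) |
	       ((uint64_t)(pagesize & 0x1) << 56);
}

/* Number of descriptors needed to cover npages 4K pages. */
static unsigned int psc_batches(unsigned int npages)
{
	return (npages + PSC_MAX_ENTRY - 1) / PSC_MAX_ENTRY;
}
```

With an 8-byte header and 253 8-byte entries, one descriptor occupies
2032 bytes, so a range larger than 253 pages has to be split across
several requests.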

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/include/asm/sev-common.h |  22 ++++
 arch/x86/include/asm/sev.h        |   4 +
 arch/x86/include/asm/svm.h        |   4 +-
 arch/x86/include/uapi/asm/svm.h   |   2 +
 arch/x86/kernel/sev.c             | 161 +++++++++++++++++++++++++++++-
 arch/x86/mm/pat/set_memory.c      |  15 +++
 6 files changed, 204 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h
index 6dc27963690e..123a96f7dff2 100644
--- a/arch/x86/include/asm/sev-common.h
+++ b/arch/x86/include/asm/sev-common.h
@@ -105,6 +105,28 @@ enum psc_op {
 
 #define GHCB_HV_FT_SNP			BIT_ULL(0)
 
+/* SNP Page State Change NAE event */
+#define VMGEXIT_PSC_MAX_ENTRY		253
+
+struct psc_hdr {
+	u16 cur_entry;
+	u16 end_entry;
+	u32 reserved;
+} __packed;
+
+struct psc_entry {
+	u64	cur_page	: 12,
+		gfn		: 40,
+		operation	: 4,
+		pagesize	: 1,
+		reserved	: 7;
+} __packed;
+
+struct snp_psc_desc {
+	struct psc_hdr hdr;
+	struct psc_entry entries[VMGEXIT_PSC_MAX_ENTRY];
+} __packed;
+
 #define GHCB_MSR_TERM_REQ		0x100
 #define GHCB_MSR_TERM_REASON_SET_POS	12
 #define GHCB_MSR_TERM_REASON_SET_MASK	0xf
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index eec2e1b9d557..f5d0569fd02b 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -128,6 +128,8 @@ void __init early_snp_set_memory_private(unsigned long vaddr, unsigned long padd
 void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr,
 					unsigned int npages);
 void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op);
+void snp_set_memory_shared(unsigned long vaddr, unsigned int npages);
+void snp_set_memory_private(unsigned long vaddr, unsigned int npages);
 #else
 static inline void sev_es_ist_enter(struct pt_regs *regs) { }
 static inline void sev_es_ist_exit(void) { }
@@ -142,6 +144,8 @@ early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr, unsigned
 static inline void __init
 early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr, unsigned int npages) { }
 static inline void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op) { }
+static inline void snp_set_memory_shared(unsigned long vaddr, unsigned int npages) { }
+static inline void snp_set_memory_private(unsigned long vaddr, unsigned int npages) { }
 #endif
 
 #endif
diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
index b00dbc5fac2b..d3277486a6c0 100644
--- a/arch/x86/include/asm/svm.h
+++ b/arch/x86/include/asm/svm.h
@@ -309,11 +309,13 @@ struct vmcb_save_area {
 	u64 x87_state_gpa;
 } __packed;
 
+#define GHCB_SHARED_BUF_SIZE	2032
+
 struct ghcb {
 	struct vmcb_save_area save;
 	u8 reserved_save[2048 - sizeof(struct vmcb_save_area)];
 
-	u8 shared_buffer[2032];
+	u8 shared_buffer[GHCB_SHARED_BUF_SIZE];
 
 	u8 reserved_1[10];
 	u16 protocol_version;	/* negotiated SEV-ES/GHCB protocol version */
diff --git a/arch/x86/include/uapi/asm/svm.h b/arch/x86/include/uapi/asm/svm.h
index b0ad00f4c1e1..0dcdb6e0c913 100644
--- a/arch/x86/include/uapi/asm/svm.h
+++ b/arch/x86/include/uapi/asm/svm.h
@@ -108,6 +108,7 @@
 #define SVM_VMGEXIT_AP_JUMP_TABLE		0x80000005
 #define SVM_VMGEXIT_SET_AP_JUMP_TABLE		0
 #define SVM_VMGEXIT_GET_AP_JUMP_TABLE		1
+#define SVM_VMGEXIT_PSC				0x80000010
 #define SVM_VMGEXIT_HV_FEATURES			0x8000fffd
 #define SVM_VMGEXIT_UNSUPPORTED_EVENT		0x8000ffff
 
@@ -219,6 +220,7 @@
 	{ SVM_VMGEXIT_NMI_COMPLETE,	"vmgexit_nmi_complete" }, \
 	{ SVM_VMGEXIT_AP_HLT_LOOP,	"vmgexit_ap_hlt_loop" }, \
 	{ SVM_VMGEXIT_AP_JUMP_TABLE,	"vmgexit_ap_jump_table" }, \
+	{ SVM_VMGEXIT_PSC,	"vmgexit_page_state_change" }, \
 	{ SVM_VMGEXIT_HV_FEATURES,	"vmgexit_hypervisor_feature" }, \
 	{ SVM_EXIT_ERR,         "invalid_guest_state" }
 
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index 2971aa280ce6..35c772bf9f6c 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -574,7 +574,7 @@ static void pvalidate_pages(unsigned long vaddr, unsigned int npages, bool valid
 	}
 }
 
-static void __init early_set_page_state(unsigned long paddr, unsigned int npages, enum psc_op op)
+static void __init early_set_pages_state(unsigned long paddr, unsigned int npages, enum psc_op op)
 {
 	unsigned long paddr_end;
 	u64 val;
@@ -622,7 +622,7 @@ void __init early_snp_set_memory_private(unsigned long vaddr, unsigned long padd
 	  * Ask the hypervisor to mark the memory pages as private in the RMP
 	  * table.
 	  */
-	early_set_page_state(paddr, npages, SNP_PAGE_STATE_PRIVATE);
+	early_set_pages_state(paddr, npages, SNP_PAGE_STATE_PRIVATE);
 
 	/* Validate the memory pages after they've been added in the RMP table. */
 	pvalidate_pages(vaddr, npages, 1);
@@ -641,7 +641,7 @@ void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr
 	pvalidate_pages(vaddr, npages, 0);
 
 	 /* Ask hypervisor to mark the memory pages shared in the RMP table. */
-	early_set_page_state(paddr, npages, SNP_PAGE_STATE_SHARED);
+	early_set_pages_state(paddr, npages, SNP_PAGE_STATE_SHARED);
 }
 
 void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op)
@@ -659,6 +659,161 @@ void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op
 		WARN(1, "invalid memory op %d\n", op);
 }
 
+static int vmgexit_psc(struct snp_psc_desc *desc)
+{
+	int cur_entry, end_entry, ret = 0;
+	struct snp_psc_desc *data;
+	struct ghcb_state state;
+	unsigned long flags;
+	struct ghcb *ghcb;
+
+	/* __sev_get_ghcb() needs to run with IRQs disabled because it uses a per-CPU GHCB */
+	local_irq_save(flags);
+
+	ghcb = __sev_get_ghcb(&state);
+	if (unlikely(!ghcb))
+		panic("SEV-SNP: Failed to get GHCB\n");
+
+	/* Copy the input desc into GHCB shared buffer */
+	data = (struct snp_psc_desc *)ghcb->shared_buffer;
+	memcpy(ghcb->shared_buffer, desc, min_t(int, GHCB_SHARED_BUF_SIZE, sizeof(*desc)));
+
+	/*
+	 * As per the GHCB specification, the hypervisor can resume the guest
+	 * before processing all the entries. Check whether all the entries
+	 * are processed. If not, then keep retrying.
+	 *
+	 * The strategy here is to wait for the hypervisor to change the page
+	 * state in the RMP table before the guest accesses the memory pages. If
+	 * the page state change was not successful, then a later memory access
+	 * will result in a crash.
+	 */
+	cur_entry = data->hdr.cur_entry;
+	end_entry = data->hdr.end_entry;
+
+	while (data->hdr.cur_entry <= data->hdr.end_entry) {
+		ghcb_set_sw_scratch(ghcb, (u64)__pa(data));
+
+		ret = sev_es_ghcb_hv_call(ghcb, true, NULL, SVM_VMGEXIT_PSC, 0, 0);
+
+		/*
+		 * Page State Change VMGEXIT can pass error code through
+		 * exit_info_2.
+		 */
+		if (WARN(ret || ghcb->save.sw_exit_info_2,
+			 "SEV-SNP: PSC failed ret=%d exit_info_2=%llx\n",
+			 ret, ghcb->save.sw_exit_info_2)) {
+			ret = 1;
+			goto out;
+		}
+
+		/* Verify that reserved bit is not set */
+		if (WARN(data->hdr.reserved, "Reserved bit is set in the PSC header\n")) {
+			ret = 1;
+			goto out;
+		}
+
+		/*
+		 * Sanity check that entry processing is not going backward.
+		 * This will happen only if the hypervisor is tricking us.
+		 */
+		if (WARN(data->hdr.end_entry > end_entry || cur_entry > data->hdr.cur_entry,
+"SEV-SNP:  PSC processing going backward, end_entry %d (got %d) cur_entry %d (got %d)\n",
+			 end_entry, data->hdr.end_entry, cur_entry, data->hdr.cur_entry)) {
+			ret = 1;
+			goto out;
+		}
+	}
+
+out:
+	__sev_put_ghcb(&state);
+	local_irq_restore(flags);
+
+	return ret;
+}
+
+static void __set_pages_state(struct snp_psc_desc *data, unsigned long vaddr,
+			      unsigned long vaddr_end, int op)
+{
+	struct psc_hdr *hdr;
+	struct psc_entry *e;
+	unsigned long pfn;
+	int i;
+
+	hdr = &data->hdr;
+	e = data->entries;
+
+	memset(data, 0, sizeof(*data));
+	i = 0;
+
+	while (vaddr < vaddr_end) {
+		if (is_vmalloc_addr((void *)vaddr))
+			pfn = vmalloc_to_pfn((void *)vaddr);
+		else
+			pfn = __pa(vaddr) >> PAGE_SHIFT;
+
+		e->gfn = pfn;
+		e->operation = op;
+		hdr->end_entry = i;
+		e->pagesize = RMP_PG_SIZE_4K;
+
+		vaddr = vaddr + PAGE_SIZE;
+		e++;
+		i++;
+	}
+
+	if (vmgexit_psc(data))
+		sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_PSC);
+}
+
+static void set_pages_state(unsigned long vaddr, unsigned int npages, int op)
+{
+	unsigned long vaddr_end, next_vaddr;
+	struct snp_psc_desc *desc;
+
+	desc = kmalloc(sizeof(*desc), GFP_KERNEL_ACCOUNT);
+	if (!desc)
+		panic("SEV-SNP: failed to allocate memory for PSC descriptor\n");
+
+	vaddr = vaddr & PAGE_MASK;
+	vaddr_end = vaddr + (npages << PAGE_SHIFT);
+
+	while (vaddr < vaddr_end) {
+		/*
+		 * Calculate the last vaddr that can be fit in one
+		 * struct snp_psc_desc.
+		 */
+		next_vaddr = min_t(unsigned long, vaddr_end,
+				   (VMGEXIT_PSC_MAX_ENTRY * PAGE_SIZE) + vaddr);
+
+		__set_pages_state(desc, vaddr, next_vaddr, op);
+
+		vaddr = next_vaddr;
+	}
+
+	kfree(desc);
+}
+
+void snp_set_memory_shared(unsigned long vaddr, unsigned int npages)
+{
+	if (!cc_platform_has(CC_ATTR_SEV_SNP))
+		return;
+
+	pvalidate_pages(vaddr, npages, 0);
+
+	set_pages_state(vaddr, npages, SNP_PAGE_STATE_SHARED);
+}
+
+void snp_set_memory_private(unsigned long vaddr, unsigned int npages)
+{
+	if (!cc_platform_has(CC_ATTR_SEV_SNP))
+		return;
+
+	set_pages_state(vaddr, npages, SNP_PAGE_STATE_PRIVATE);
+
+	pvalidate_pages(vaddr, npages, 1);
+}
+
 int sev_es_setup_ap_jump_table(struct real_mode_header *rmh)
 {
 	u16 startup_cs, startup_ip;
diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
index b4072115c8ef..5dc17d446204 100644
--- a/arch/x86/mm/pat/set_memory.c
+++ b/arch/x86/mm/pat/set_memory.c
@@ -32,6 +32,7 @@
 #include <asm/set_memory.h>
 #include <asm/hyperv-tlfs.h>
 #include <asm/mshyperv.h>
+#include <asm/sev.h>
 
 #include "../mm_internal.h"
 
@@ -2012,8 +2013,22 @@ static int __set_memory_enc_pgtable(unsigned long addr, int numpages, bool enc)
 	 */
 	cpa_flush(&cpa, !this_cpu_has(X86_FEATURE_SME_COHERENT));
 
+	/*
+	 * To maintain the security guarantees of an SEV-SNP guest, invalidate
+	 * the memory before clearing the encryption attribute.
+	 */
+	if (!enc)
+		snp_set_memory_shared(addr, numpages);
+
 	ret = __change_page_attr_set_clr(&cpa, 1);
 
+	/*
+	 * Now that the memory is mapped encrypted in the page table, validate
+	 * it so that it is consistent with the above page state.
+	 */
+	if (!ret && enc)
+		snp_set_memory_private(addr, numpages);
+
 	/*
 	 * After changing the encryption attribute, we need to flush TLBs again
 	 * in case any speculative TLB caching occurred (but no need to flush
-- 
2.25.1


* [PATCH v8 16/40] KVM: SVM: Define sev_features and vmpl field in the VMSA
  2021-12-10 15:42 [PATCH v8 00/40] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (14 preceding siblings ...)
  2021-12-10 15:43 ` [PATCH v8 15/40] x86/mm: Add support to validate memory when changing C-bit Brijesh Singh
@ 2021-12-10 15:43 ` Brijesh Singh
  2022-01-04 22:59   ` Venu Busireddy
  2021-12-10 15:43 ` [PATCH v8 17/40] KVM: SVM: Create a separate mapping for the SEV-ES save area Brijesh Singh
                   ` (24 subsequent siblings)
  40 siblings, 1 reply; 183+ messages in thread
From: Brijesh Singh @ 2021-12-10 15:43 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

The hypervisor uses the sev_features field (offset 3B0h) in the Save State
Area to control SEV-SNP guest features such as SNPActive, vTOM, ReflectVC,
etc. An SEV-SNP guest can read the SEV_FEATURES field through the
SEV_STATUS MSR.

While at it, update dump_vmcb() to log the VMPL.

See APM2 Table 15-34 and B-4 for more details.
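
The diff carves both new fields out of existing reserved space, so no
other offset moves. The following user-space sketch checks that for the
two fragments involved. The *_model structs are hypothetical cut-downs
of struct vmcb_save_area; the 0x3A8 padding before sw_scratch is my own
derivation from the 3B0h offset quoted above, not something taken from
the APM text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same field sizes as struct vmcb_seg: 16 bytes in total. */
struct seg_model {
	uint16_t selector;
	uint16_t attrib;
	uint32_t limit;
	uint64_t base;
} __attribute__((packed));

/* Start of the save area: ten segment registers, then vmpl/cpl. */
struct save_head_model {
	struct seg_model segs[10];
	uint8_t reserved_1[42];   /* was 43 before the patch */
	uint8_t vmpl;             /* takes the last reserved byte */
	uint8_t cpl;
	uint8_t reserved_2[4];
	uint64_t efer;
} __attribute__((packed));

/* Tail fragment around sev_features; 'before' is assumed padding. */
struct save_tail_model {
	uint8_t before[0x3A8];
	uint64_t sw_scratch;
	uint64_t sev_features;    /* takes the first 8 reserved bytes */
	uint8_t reserved_11[48];  /* was 56 before the patch */
	uint64_t xcr0;
} __attribute__((packed));

_Static_assert(offsetof(struct save_head_model, cpl) == 0xCB,
	       "cpl must not move");
_Static_assert(offsetof(struct save_tail_model, sev_features) == 0x3B0,
	       "sev_features expected at 3B0h");
```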

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/include/asm/svm.h | 6 ++++--
 arch/x86/kvm/svm/svm.c     | 4 ++--
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
index d3277486a6c0..c3fad5172584 100644
--- a/arch/x86/include/asm/svm.h
+++ b/arch/x86/include/asm/svm.h
@@ -238,7 +238,8 @@ struct vmcb_save_area {
 	struct vmcb_seg ldtr;
 	struct vmcb_seg idtr;
 	struct vmcb_seg tr;
-	u8 reserved_1[43];
+	u8 reserved_1[42];
+	u8 vmpl;
 	u8 cpl;
 	u8 reserved_2[4];
 	u64 efer;
@@ -303,7 +304,8 @@ struct vmcb_save_area {
 	u64 sw_exit_info_1;
 	u64 sw_exit_info_2;
 	u64 sw_scratch;
-	u8 reserved_11[56];
+	u64 sev_features;
+	u8 reserved_11[48];
 	u64 xcr0;
 	u8 valid_bitmap[16];
 	u64 x87_state_gpa;
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 65707bee208d..d3a6356fa1af 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3290,8 +3290,8 @@ static void dump_vmcb(struct kvm_vcpu *vcpu)
 	       "tr:",
 	       save01->tr.selector, save01->tr.attrib,
 	       save01->tr.limit, save01->tr.base);
-	pr_err("cpl:            %d                efer:         %016llx\n",
-		save->cpl, save->efer);
+	pr_err("vmpl: %d   cpl:  %d               efer:          %016llx\n",
+	       save->vmpl, save->cpl, save->efer);
 	pr_err("%-15s %016llx %-13s %016llx\n",
 	       "cr0:", save->cr0, "cr2:", save->cr2);
 	pr_err("%-15s %016llx %-13s %016llx\n",
-- 
2.25.1


* [PATCH v8 17/40] KVM: SVM: Create a separate mapping for the SEV-ES save area
  2021-12-10 15:42 [PATCH v8 00/40] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (15 preceding siblings ...)
  2021-12-10 15:43 ` [PATCH v8 16/40] KVM: SVM: Define sev_features and vmpl field in the VMSA Brijesh Singh
@ 2021-12-10 15:43 ` Brijesh Singh
  2021-12-30 12:19   ` Borislav Petkov
  2022-01-05  1:38   ` Venu Busireddy
  2021-12-10 15:43 ` [PATCH v8 18/40] KVM: SVM: Create a separate mapping for the GHCB " Brijesh Singh
                   ` (23 subsequent siblings)
  40 siblings, 2 replies; 183+ messages in thread
From: Brijesh Singh @ 2021-12-10 15:43 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

The save area for SEV-ES/SEV-SNP guests, as used by the hardware, is
different from the save area of a non SEV-ES/SEV-SNP guest.

This is the first step in defining the multiple save areas to keep them
separate and ensuring proper operation amongst the different types of
guests. Create an SEV-ES/SEV-SNP save area and adjust usage to the new
save area definition where needed.
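
One consequence visible in the diff is that sev_es_sync_vmsa() now
copies the legacy save area into the VMSA first and only then syncs the
registers, because "save" points at the destination rather than the
source. A minimal sketch of that ordering, using hypothetical and
heavily shortened stand-ins for the two layouts (it assumes, as the real
structures appear to, that the common fields sit at the same offsets):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, shortened stand-in for the legacy save area. */
struct legacy_save_model {
	uint64_t efer;
	uint64_t rax;
};

/* Hypothetical, shortened stand-in for the SEV-ES save area. */
struct es_save_model {
	uint64_t efer;
	uint64_t rax;
	uint64_t rbx;   /* only present in the larger SEV-ES layout */
};

static void sync_vmsa_model(struct es_save_model *vmsa,
			    const struct legacy_save_model *legacy,
			    uint64_t rax, uint64_t rbx)
{
	/* Copy the state built so far; the size is that of the source. */
	memcpy(vmsa, legacy, sizeof(*legacy));

	/* Then sync the registers straight into the destination. */
	vmsa->rax = rax;
	vmsa->rbx = rbx;
}
```

Doing the copy last, as the old code did, would overwrite the freshly
synced registers once "save" refers to the VMSA itself.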

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/include/asm/svm.h | 83 +++++++++++++++++++++++++++++---------
 arch/x86/kvm/svm/sev.c     | 24 +++++------
 arch/x86/kvm/svm/svm.h     |  2 +-
 3 files changed, 77 insertions(+), 32 deletions(-)

diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
index c3fad5172584..3ce2e575a2de 100644
--- a/arch/x86/include/asm/svm.h
+++ b/arch/x86/include/asm/svm.h
@@ -227,6 +227,7 @@ struct vmcb_seg {
 	u64 base;
 } __packed;
 
+/* Save area definition for legacy and SEV-MEM guests */
 struct vmcb_save_area {
 	struct vmcb_seg es;
 	struct vmcb_seg cs;
@@ -243,8 +244,58 @@ struct vmcb_save_area {
 	u8 cpl;
 	u8 reserved_2[4];
 	u64 efer;
+	u8 reserved_3[112];
+	u64 cr4;
+	u64 cr3;
+	u64 cr0;
+	u64 dr7;
+	u64 dr6;
+	u64 rflags;
+	u64 rip;
+	u8 reserved_4[88];
+	u64 rsp;
+	u64 s_cet;
+	u64 ssp;
+	u64 isst_addr;
+	u64 rax;
+	u64 star;
+	u64 lstar;
+	u64 cstar;
+	u64 sfmask;
+	u64 kernel_gs_base;
+	u64 sysenter_cs;
+	u64 sysenter_esp;
+	u64 sysenter_eip;
+	u64 cr2;
+	u8 reserved_5[32];
+	u64 g_pat;
+	u64 dbgctl;
+	u64 br_from;
+	u64 br_to;
+	u64 last_excp_from;
+	u64 last_excp_to;
+	u8 reserved_6[72];
+	u32 spec_ctrl;		/* Guest version of SPEC_CTRL at 0x2E0 */
+} __packed;
+
+/* Save area definition for SEV-ES and SEV-SNP guests */
+struct sev_es_save_area {
+	struct vmcb_seg es;
+	struct vmcb_seg cs;
+	struct vmcb_seg ss;
+	struct vmcb_seg ds;
+	struct vmcb_seg fs;
+	struct vmcb_seg gs;
+	struct vmcb_seg gdtr;
+	struct vmcb_seg ldtr;
+	struct vmcb_seg idtr;
+	struct vmcb_seg tr;
+	u8 reserved_1[43];
+	u8 cpl;
+	u8 reserved_2[4];
+	u64 efer;
 	u8 reserved_3[104];
-	u64 xss;		/* Valid for SEV-ES only */
+	u64 xss;
 	u64 cr4;
 	u64 cr3;
 	u64 cr0;
@@ -272,22 +323,14 @@ struct vmcb_save_area {
 	u64 br_to;
 	u64 last_excp_from;
 	u64 last_excp_to;
-
-	/*
-	 * The following part of the save area is valid only for
-	 * SEV-ES guests when referenced through the GHCB or for
-	 * saving to the host save area.
-	 */
-	u8 reserved_7[72];
-	u32 spec_ctrl;		/* Guest version of SPEC_CTRL at 0x2E0 */
-	u8 reserved_7b[4];
+	u8 reserved_7[80];
 	u32 pkru;
-	u8 reserved_7a[20];
-	u64 reserved_8;		/* rax already available at 0x01f8 */
+	u8 reserved_9[20];
+	u64 reserved_10;	/* rax already available at 0x01f8 */
 	u64 rcx;
 	u64 rdx;
 	u64 rbx;
-	u64 reserved_9;		/* rsp already available at 0x01d8 */
+	u64 reserved_11;	/* rsp already available at 0x01d8 */
 	u64 rbp;
 	u64 rsi;
 	u64 rdi;
@@ -299,13 +342,13 @@ struct vmcb_save_area {
 	u64 r13;
 	u64 r14;
 	u64 r15;
-	u8 reserved_10[16];
+	u8 reserved_12[16];
 	u64 sw_exit_code;
 	u64 sw_exit_info_1;
 	u64 sw_exit_info_2;
 	u64 sw_scratch;
 	u64 sev_features;
-	u8 reserved_11[48];
+	u8 reserved_13[48];
 	u64 xcr0;
 	u8 valid_bitmap[16];
 	u64 x87_state_gpa;
@@ -314,8 +357,8 @@ struct vmcb_save_area {
 #define GHCB_SHARED_BUF_SIZE	2032
 
 struct ghcb {
-	struct vmcb_save_area save;
-	u8 reserved_save[2048 - sizeof(struct vmcb_save_area)];
+	struct sev_es_save_area save;
+	u8 reserved_save[2048 - sizeof(struct sev_es_save_area)];
 
 	u8 shared_buffer[GHCB_SHARED_BUF_SIZE];
 
@@ -325,13 +368,15 @@ struct ghcb {
 } __packed;
 
 
-#define EXPECTED_VMCB_SAVE_AREA_SIZE		1032
+#define EXPECTED_VMCB_SAVE_AREA_SIZE		740
+#define EXPECTED_SEV_ES_SAVE_AREA_SIZE		1032
 #define EXPECTED_VMCB_CONTROL_AREA_SIZE		1024
 #define EXPECTED_GHCB_SIZE			PAGE_SIZE
 
 static inline void __unused_size_checks(void)
 {
 	BUILD_BUG_ON(sizeof(struct vmcb_save_area)	!= EXPECTED_VMCB_SAVE_AREA_SIZE);
+	BUILD_BUG_ON(sizeof(struct sev_es_save_area)	!= EXPECTED_SEV_ES_SAVE_AREA_SIZE);
 	BUILD_BUG_ON(sizeof(struct vmcb_control_area)	!= EXPECTED_VMCB_CONTROL_AREA_SIZE);
 	BUILD_BUG_ON(sizeof(struct ghcb)		!= EXPECTED_GHCB_SIZE);
 }
@@ -401,7 +446,7 @@ struct vmcb {
 /* GHCB Accessor functions */
 
 #define GHCB_BITMAP_IDX(field)							\
-	(offsetof(struct vmcb_save_area, field) / sizeof(u64))
+	(offsetof(struct sev_es_save_area, field) / sizeof(u64))
 
 #define DEFINE_GHCB_ACCESSORS(field)						\
 	static inline bool ghcb_##field##_is_valid(const struct ghcb *ghcb)	\
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 7656a2c5662a..63334af988af 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -558,12 +558,20 @@ static int sev_launch_update_data(struct kvm *kvm, struct kvm_sev_cmd *argp)
 
 static int sev_es_sync_vmsa(struct vcpu_svm *svm)
 {
-	struct vmcb_save_area *save = &svm->vmcb->save;
+	struct sev_es_save_area *save = svm->sev_es.vmsa;
 
 	/* Check some debug related fields before encrypting the VMSA */
-	if (svm->vcpu.guest_debug || (save->dr7 & ~DR7_FIXED_1))
+	if (svm->vcpu.guest_debug || (svm->vmcb->save.dr7 & ~DR7_FIXED_1))
 		return -EINVAL;
 
+	/*
+	 * SEV-ES will use a VMSA that is pointed to by the VMCB, not
+	 * the traditional VMSA that is part of the VMCB. Copy the
+	 * traditional VMSA as it has been built so far (in prep
+	 * for LAUNCH_UPDATE_VMSA) to be the initial SEV-ES state.
+	 */
+	memcpy(save, &svm->vmcb->save, sizeof(svm->vmcb->save));
+
 	/* Sync registgers */
 	save->rax = svm->vcpu.arch.regs[VCPU_REGS_RAX];
 	save->rbx = svm->vcpu.arch.regs[VCPU_REGS_RBX];
@@ -591,14 +599,6 @@ static int sev_es_sync_vmsa(struct vcpu_svm *svm)
 	save->xss  = svm->vcpu.arch.ia32_xss;
 	save->dr6  = svm->vcpu.arch.dr6;
 
-	/*
-	 * SEV-ES will use a VMSA that is pointed to by the VMCB, not
-	 * the traditional VMSA that is part of the VMCB. Copy the
-	 * traditional VMSA as it has been built so far (in prep
-	 * for LAUNCH_UPDATE_VMSA) to be the initial SEV-ES state.
-	 */
-	memcpy(svm->sev_es.vmsa, save, sizeof(*save));
-
 	return 0;
 }
 
@@ -2904,7 +2904,7 @@ void sev_es_vcpu_reset(struct vcpu_svm *svm)
 void sev_es_prepare_guest_switch(struct vcpu_svm *svm, unsigned int cpu)
 {
 	struct svm_cpu_data *sd = per_cpu(svm_data, cpu);
-	struct vmcb_save_area *hostsa;
+	struct sev_es_save_area *hostsa;
 
 	/*
 	 * As an SEV-ES guest, hardware will restore the host state on VMEXIT,
@@ -2914,7 +2914,7 @@ void sev_es_prepare_guest_switch(struct vcpu_svm *svm, unsigned int cpu)
 	vmsave(__sme_page_pa(sd->save_area));
 
 	/* XCR0 is restored on VMEXIT, save the current host value */
-	hostsa = (struct vmcb_save_area *)(page_address(sd->save_area) + 0x400);
+	hostsa = (struct sev_es_save_area *)(page_address(sd->save_area) + 0x400);
 	hostsa->xcr0 = xgetbv(XCR_XFEATURE_ENABLED_MASK);
 
 	/* PKRU is restored on VMEXIT, save the current host value */
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 1c7306c370fa..cecfcdb1a1b3 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -127,7 +127,7 @@ struct svm_nested_state {
 
 struct vcpu_sev_es_state {
 	/* SEV-ES support */
-	struct vmcb_save_area *vmsa;
+	struct sev_es_save_area *vmsa;
 	struct ghcb *ghcb;
 	struct kvm_host_map ghcb_map;
 	bool received_first_sipi;
-- 
2.25.1


* [PATCH v8 18/40] KVM: SVM: Create a separate mapping for the GHCB save area
  2021-12-10 15:42 [PATCH v8 00/40] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (16 preceding siblings ...)
  2021-12-10 15:43 ` [PATCH v8 17/40] KVM: SVM: Create a separate mapping for the SEV-ES save area Brijesh Singh
@ 2021-12-10 15:43 ` Brijesh Singh
  2022-01-05 18:41   ` Venu Busireddy
  2021-12-10 15:43 ` [PATCH v8 19/40] KVM: SVM: Update the SEV-ES save area mapping Brijesh Singh
                   ` (22 subsequent siblings)
  40 siblings, 1 reply; 183+ messages in thread
From: Brijesh Singh @ 2021-12-10 15:43 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

From: Tom Lendacky <thomas.lendacky@amd.com>

The initial implementation of the GHCB spec was based on trying to keep
the register state offsets the same relative to the VM save area. However,
the save area for SEV-ES has changed within the hardware, so the layout of
the SEV-ES save area no longer matches the layout of the GHCB save area.

This is the second step in defining the multiple save areas to keep them
separate and ensuring proper operation amongst the different types of
guests. Create a GHCB save area that matches the GHCB specification.
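
As a sanity check of the layout added in the diff, the structure can be
reproduced in user space and its offsets compared against the values the
existing comments mention (rsp at 0x01d8, rax at 0x01f8). The sketch
below does that; struct ghcb_save_model, GHCB_IDX() and the two bitmap
helpers are hypothetical names, and the byte-wise bitmap access assumes
a little-endian machine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct ghcb_save_model {
	uint8_t reserved_1[203];
	uint8_t cpl;
	uint8_t reserved_2[116];
	uint64_t xss;
	uint8_t reserved_3[24];
	uint64_t dr7;
	uint8_t reserved_4[16];
	uint64_t rip;
	uint8_t reserved_5[88];
	uint64_t rsp;
	uint8_t reserved_6[24];
	uint64_t rax;
	uint8_t reserved_7[264];
	uint64_t rcx, rdx, rbx;
	uint8_t reserved_8[8];
	uint64_t rbp, rsi, rdi;
	uint64_t r8, r9, r10, r11, r12, r13, r14, r15;
	uint8_t reserved_9[16];
	uint64_t sw_exit_code, sw_exit_info_1, sw_exit_info_2, sw_scratch;
	uint8_t reserved_10[56];
	uint64_t xcr0;
	uint8_t valid_bitmap[16];
	uint64_t x87_state_gpa;
} __attribute__((packed));

/* Same idea as GHCB_BITMAP_IDX(): one bit per 8-byte slot. */
#define GHCB_IDX(field) \
	(offsetof(struct ghcb_save_model, field) / sizeof(uint64_t))

/* Mark a field as valid in the bitmap (idx must be below 128). */
static void ghcb_mark_valid(struct ghcb_save_model *s, size_t idx)
{
	s->valid_bitmap[idx / 8] |= (uint8_t)(1u << (idx % 8));
}

/* Test whether a field is marked valid. */
static int ghcb_is_valid(const struct ghcb_save_model *s, size_t idx)
{
	return (s->valid_bitmap[idx / 8] >> (idx % 8)) & 1;
}
```

The 16-byte valid_bitmap holds 128 bits, enough to cover every 8-byte
slot below offset 0x400.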

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/include/asm/svm.h | 48 +++++++++++++++++++++++++++++++++++---
 1 file changed, 45 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
index 3ce2e575a2de..5ff1fa364a31 100644
--- a/arch/x86/include/asm/svm.h
+++ b/arch/x86/include/asm/svm.h
@@ -354,11 +354,51 @@ struct sev_es_save_area {
 	u64 x87_state_gpa;
 } __packed;
 
+struct ghcb_save_area {
+	u8 reserved_1[203];
+	u8 cpl;
+	u8 reserved_2[116];
+	u64 xss;
+	u8 reserved_3[24];
+	u64 dr7;
+	u8 reserved_4[16];
+	u64 rip;
+	u8 reserved_5[88];
+	u64 rsp;
+	u8 reserved_6[24];
+	u64 rax;
+	u8 reserved_7[264];
+	u64 rcx;
+	u64 rdx;
+	u64 rbx;
+	u8 reserved_8[8];
+	u64 rbp;
+	u64 rsi;
+	u64 rdi;
+	u64 r8;
+	u64 r9;
+	u64 r10;
+	u64 r11;
+	u64 r12;
+	u64 r13;
+	u64 r14;
+	u64 r15;
+	u8 reserved_9[16];
+	u64 sw_exit_code;
+	u64 sw_exit_info_1;
+	u64 sw_exit_info_2;
+	u64 sw_scratch;
+	u8 reserved_10[56];
+	u64 xcr0;
+	u8 valid_bitmap[16];
+	u64 x87_state_gpa;
+} __packed;
+
 #define GHCB_SHARED_BUF_SIZE	2032
 
 struct ghcb {
-	struct sev_es_save_area save;
-	u8 reserved_save[2048 - sizeof(struct sev_es_save_area)];
+	struct ghcb_save_area save;
+	u8 reserved_save[2048 - sizeof(struct ghcb_save_area)];
 
 	u8 shared_buffer[GHCB_SHARED_BUF_SIZE];
 
@@ -369,6 +409,7 @@ struct ghcb {
 
 
 #define EXPECTED_VMCB_SAVE_AREA_SIZE		740
+#define EXPECTED_GHCB_SAVE_AREA_SIZE		1032
 #define EXPECTED_SEV_ES_SAVE_AREA_SIZE		1032
 #define EXPECTED_VMCB_CONTROL_AREA_SIZE		1024
 #define EXPECTED_GHCB_SIZE			PAGE_SIZE
@@ -376,6 +417,7 @@ struct ghcb {
 static inline void __unused_size_checks(void)
 {
 	BUILD_BUG_ON(sizeof(struct vmcb_save_area)	!= EXPECTED_VMCB_SAVE_AREA_SIZE);
+	BUILD_BUG_ON(sizeof(struct ghcb_save_area)	!= EXPECTED_GHCB_SAVE_AREA_SIZE);
 	BUILD_BUG_ON(sizeof(struct sev_es_save_area)	!= EXPECTED_SEV_ES_SAVE_AREA_SIZE);
 	BUILD_BUG_ON(sizeof(struct vmcb_control_area)	!= EXPECTED_VMCB_CONTROL_AREA_SIZE);
 	BUILD_BUG_ON(sizeof(struct ghcb)		!= EXPECTED_GHCB_SIZE);
@@ -446,7 +488,7 @@ struct vmcb {
 /* GHCB Accessor functions */
 
 #define GHCB_BITMAP_IDX(field)							\
-	(offsetof(struct sev_es_save_area, field) / sizeof(u64))
+	(offsetof(struct ghcb_save_area, field) / sizeof(u64))
 
 #define DEFINE_GHCB_ACCESSORS(field)						\
 	static inline bool ghcb_##field##_is_valid(const struct ghcb *ghcb)	\
-- 
2.25.1


* [PATCH v8 19/40] KVM: SVM: Update the SEV-ES save area mapping
  2021-12-10 15:42 [PATCH v8 00/40] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (17 preceding siblings ...)
  2021-12-10 15:43 ` [PATCH v8 18/40] KVM: SVM: Create a separate mapping for the GHCB " Brijesh Singh
@ 2021-12-10 15:43 ` Brijesh Singh
  2022-01-05 18:54   ` Venu Busireddy
  2021-12-10 15:43 ` [PATCH v8 20/40] x86/sev: Use SEV-SNP AP creation to start secondary CPUs Brijesh Singh
                   ` (21 subsequent siblings)
  40 siblings, 1 reply; 183+ messages in thread
From: Brijesh Singh @ 2021-12-10 15:43 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

From: Tom Lendacky <thomas.lendacky@amd.com>

This is the final step in defining the multiple save areas to keep them
separate and ensuring proper operation amongst the different types of
guests. Update the SEV-ES/SEV-SNP save area to match the APM. This save
area will be used for the upcoming SEV-SNP AP Creation NAE event support.
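
The new expected size of 1648 bytes can be cross-checked from the
floating point area the diff appends. In the sketch below,
struct fp_area_model copies the field list from the diff, while
FP_AREA_OFFSET (0x400) is my own reading of where that area starts and
should be treated as an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed start of the floating point area within the save area. */
#define FP_AREA_OFFSET 0x400

/* Field list as added to struct sev_es_save_area in the diff. */
struct fp_area_model {
	uint64_t x87_dp;
	uint32_t mxcsr;
	uint16_t x87_ftw;
	uint16_t x87_fsw;
	uint16_t x87_fcw;
	uint16_t x87_fop;
	uint16_t x87_ds;
	uint16_t x87_cs;
	uint64_t x87_rip;
	uint8_t fpreg_x87[80];
	uint8_t fpreg_xmm[256];
	uint8_t fpreg_ymm[256];
} __attribute__((packed));

/* 8 + 4 + 6 * 2 + 8 + 80 + 256 + 256 = 624 bytes. */
_Static_assert(sizeof(struct fp_area_model) == 624,
	       "unexpected floating point area size");

/* Total save area size implied by the assumed offset. */
static unsigned int sev_es_save_area_size_model(void)
{
	return FP_AREA_OFFSET + (unsigned int)sizeof(struct fp_area_model);
}
```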

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/include/asm/svm.h | 66 +++++++++++++++++++++++++++++---------
 1 file changed, 50 insertions(+), 16 deletions(-)

diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
index 5ff1fa364a31..7d90321e7775 100644
--- a/arch/x86/include/asm/svm.h
+++ b/arch/x86/include/asm/svm.h
@@ -290,7 +290,13 @@ struct sev_es_save_area {
 	struct vmcb_seg ldtr;
 	struct vmcb_seg idtr;
 	struct vmcb_seg tr;
-	u8 reserved_1[43];
+	u64 vmpl0_ssp;
+	u64 vmpl1_ssp;
+	u64 vmpl2_ssp;
+	u64 vmpl3_ssp;
+	u64 u_cet;
+	u8 reserved_1[2];
+	u8 vmpl;
 	u8 cpl;
 	u8 reserved_2[4];
 	u64 efer;
@@ -303,9 +309,19 @@ struct sev_es_save_area {
 	u64 dr6;
 	u64 rflags;
 	u64 rip;
-	u8 reserved_4[88];
+	u64 dr0;
+	u64 dr1;
+	u64 dr2;
+	u64 dr3;
+	u64 dr0_addr_mask;
+	u64 dr1_addr_mask;
+	u64 dr2_addr_mask;
+	u64 dr3_addr_mask;
+	u8 reserved_4[24];
 	u64 rsp;
-	u8 reserved_5[24];
+	u64 s_cet;
+	u64 ssp;
+	u64 isst_addr;
 	u64 rax;
 	u64 star;
 	u64 lstar;
@@ -316,7 +332,7 @@ struct sev_es_save_area {
 	u64 sysenter_esp;
 	u64 sysenter_eip;
 	u64 cr2;
-	u8 reserved_6[32];
+	u8 reserved_5[32];
 	u64 g_pat;
 	u64 dbgctl;
 	u64 br_from;
@@ -325,12 +341,12 @@ struct sev_es_save_area {
 	u64 last_excp_to;
 	u8 reserved_7[80];
 	u32 pkru;
-	u8 reserved_9[20];
-	u64 reserved_10;	/* rax already available at 0x01f8 */
+	u8 reserved_8[20];
+	u64 reserved_9;		/* rax already available at 0x01f8 */
 	u64 rcx;
 	u64 rdx;
 	u64 rbx;
-	u64 reserved_11;	/* rsp already available at 0x01d8 */
+	u64 reserved_10;	/* rsp already available at 0x01d8 */
 	u64 rbp;
 	u64 rsi;
 	u64 rdi;
@@ -342,16 +358,34 @@ struct sev_es_save_area {
 	u64 r13;
 	u64 r14;
 	u64 r15;
-	u8 reserved_12[16];
-	u64 sw_exit_code;
-	u64 sw_exit_info_1;
-	u64 sw_exit_info_2;
-	u64 sw_scratch;
+	u8 reserved_11[16];
+	u64 guest_exit_info_1;
+	u64 guest_exit_info_2;
+	u64 guest_exit_int_info;
+	u64 guest_nrip;
 	u64 sev_features;
-	u8 reserved_13[48];
+	u64 vintr_ctrl;
+	u64 guest_exit_code;
+	u64 virtual_tom;
+	u64 tlb_id;
+	u64 pcpu_id;
+	u64 event_inj;
 	u64 xcr0;
-	u8 valid_bitmap[16];
-	u64 x87_state_gpa;
+	u8 reserved_12[16];
+
+	/* Floating point area */
+	u64 x87_dp;
+	u32 mxcsr;
+	u16 x87_ftw;
+	u16 x87_fsw;
+	u16 x87_fcw;
+	u16 x87_fop;
+	u16 x87_ds;
+	u16 x87_cs;
+	u64 x87_rip;
+	u8 fpreg_x87[80];
+	u8 fpreg_xmm[256];
+	u8 fpreg_ymm[256];
 } __packed;
 
 struct ghcb_save_area {
@@ -410,7 +444,7 @@ struct ghcb {
 
 #define EXPECTED_VMCB_SAVE_AREA_SIZE		740
 #define EXPECTED_GHCB_SAVE_AREA_SIZE		1032
-#define EXPECTED_SEV_ES_SAVE_AREA_SIZE		1032
+#define EXPECTED_SEV_ES_SAVE_AREA_SIZE		1648
 #define EXPECTED_VMCB_CONTROL_AREA_SIZE		1024
 #define EXPECTED_GHCB_SIZE			PAGE_SIZE
 
-- 
2.25.1


* [PATCH v8 20/40] x86/sev: Use SEV-SNP AP creation to start secondary CPUs
  2021-12-10 15:42 [PATCH v8 00/40] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (18 preceding siblings ...)
  2021-12-10 15:43 ` [PATCH v8 19/40] KVM: SVM: Update the SEV-ES save area mapping Brijesh Singh
@ 2021-12-10 15:43 ` Brijesh Singh
  2021-12-10 18:50   ` Dave Hansen
  2021-12-31 15:36   ` Borislav Petkov
  2021-12-10 15:43 ` [PATCH v8 21/40] x86/head: re-enable stack protection for 32/64-bit builds Brijesh Singh
                   ` (20 subsequent siblings)
  40 siblings, 2 replies; 183+ messages in thread
From: Brijesh Singh @ 2021-12-10 15:43 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

From: Tom Lendacky <thomas.lendacky@amd.com>

To provide a more secure way to start APs under SEV-SNP, use the SEV-SNP
AP Creation NAE event. This allows for guest control over the AP register
state rather than trusting the hypervisor with the SEV-ES Jump Table
address.

During native_smp_prepare_cpus(), invoke an SEV-SNP function that, if
SEV-SNP is active, will set/override apic->wakeup_secondary_cpu. This
will allow the SEV-SNP AP Creation NAE event method to be used to boot
the APs. As a result of installing the override when SEV-SNP is active,
this method of starting the APs becomes the required method. The override
function will fail to start the AP if the hypervisor does not have
support for AP creation.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/include/asm/sev-common.h |   1 +
 arch/x86/include/asm/sev.h        |   4 +
 arch/x86/include/uapi/asm/svm.h   |   5 +
 arch/x86/kernel/sev.c             | 229 ++++++++++++++++++++++++++++++
 arch/x86/kernel/smpboot.c         |   3 +
 5 files changed, 242 insertions(+)

diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h
index 123a96f7dff2..38c14601ae4a 100644
--- a/arch/x86/include/asm/sev-common.h
+++ b/arch/x86/include/asm/sev-common.h
@@ -104,6 +104,7 @@ enum psc_op {
 	(((u64)(v) & GENMASK_ULL(63, 12)) >> 12)
 
 #define GHCB_HV_FT_SNP			BIT_ULL(0)
+#define GHCB_HV_FT_SNP_AP_CREATION	(BIT_ULL(1) | GHCB_HV_FT_SNP)
 
 /* SNP Page State Change NAE event */
 #define VMGEXIT_PSC_MAX_ENTRY		253
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index f5d0569fd02b..f7cbd5164136 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -66,6 +66,8 @@ extern bool handle_vc_boot_ghcb(struct pt_regs *regs);
 /* RMP page size */
 #define RMP_PG_SIZE_4K			0
 
+#define RMPADJUST_VMSA_PAGE_BIT		BIT(16)
+
 #ifdef CONFIG_AMD_MEM_ENCRYPT
 extern struct static_key_false sev_es_enable_key;
 extern void __sev_es_ist_enter(struct pt_regs *regs);
@@ -130,6 +132,7 @@ void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr
 void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op);
 void snp_set_memory_shared(unsigned long vaddr, unsigned int npages);
 void snp_set_memory_private(unsigned long vaddr, unsigned int npages);
+void snp_set_wakeup_secondary_cpu(void);
 #else
 static inline void sev_es_ist_enter(struct pt_regs *regs) { }
 static inline void sev_es_ist_exit(void) { }
@@ -146,6 +149,7 @@ early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr, unsigned i
 static inline void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op) { }
 static inline void snp_set_memory_shared(unsigned long vaddr, unsigned int npages) { }
 static inline void snp_set_memory_private(unsigned long vaddr, unsigned int npages) { }
+static inline void snp_set_wakeup_secondary_cpu(void) { }
 #endif
 
 #endif
diff --git a/arch/x86/include/uapi/asm/svm.h b/arch/x86/include/uapi/asm/svm.h
index 0dcdb6e0c913..8b4c57baec52 100644
--- a/arch/x86/include/uapi/asm/svm.h
+++ b/arch/x86/include/uapi/asm/svm.h
@@ -109,6 +109,10 @@
 #define SVM_VMGEXIT_SET_AP_JUMP_TABLE		0
 #define SVM_VMGEXIT_GET_AP_JUMP_TABLE		1
 #define SVM_VMGEXIT_PSC				0x80000010
+#define SVM_VMGEXIT_AP_CREATION			0x80000013
+#define SVM_VMGEXIT_AP_CREATE_ON_INIT		0
+#define SVM_VMGEXIT_AP_CREATE			1
+#define SVM_VMGEXIT_AP_DESTROY			2
 #define SVM_VMGEXIT_HV_FEATURES			0x8000fffd
 #define SVM_VMGEXIT_UNSUPPORTED_EVENT		0x8000ffff
 
@@ -221,6 +225,7 @@
 	{ SVM_VMGEXIT_AP_HLT_LOOP,	"vmgexit_ap_hlt_loop" }, \
 	{ SVM_VMGEXIT_AP_JUMP_TABLE,	"vmgexit_ap_jump_table" }, \
 	{ SVM_VMGEXIT_PSC,	"vmgexit_page_state_change" }, \
+	{ SVM_VMGEXIT_AP_CREATION,	"vmgexit_ap_creation" }, \
 	{ SVM_VMGEXIT_HV_FEATURES,	"vmgexit_hypervisor_feature" }, \
 	{ SVM_EXIT_ERR,         "invalid_guest_state" }
 
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index 35c772bf9f6c..21926b094378 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -18,6 +18,7 @@
 #include <linux/memblock.h>
 #include <linux/kernel.h>
 #include <linux/mm.h>
+#include <linux/cpumask.h>
 
 #include <asm/cpu_entry_area.h>
 #include <asm/stacktrace.h>
@@ -31,6 +32,7 @@
 #include <asm/svm.h>
 #include <asm/smp.h>
 #include <asm/cpu.h>
+#include <asm/apic.h>
 
 #define DR7_RESET_VALUE        0x400
 
@@ -91,6 +93,8 @@ struct ghcb_state {
 static DEFINE_PER_CPU(struct sev_es_runtime_data*, runtime_data);
 DEFINE_STATIC_KEY_FALSE(sev_es_enable_key);
 
+static DEFINE_PER_CPU(struct sev_es_save_area *, snp_vmsa);
+
 static __always_inline bool on_vc_stack(struct pt_regs *regs)
 {
 	unsigned long sp = regs->sp;
@@ -814,6 +818,231 @@ void snp_set_memory_private(unsigned long vaddr, unsigned int npages)
 	pvalidate_pages(vaddr, npages, 1);
 }
 
+static int snp_set_vmsa(void *va, bool vmsa)
+{
+	u64 attrs;
+
+	/*
+	 * The RMPADJUST instruction is used to set or clear the VMSA bit for
+	 * a page. A change to the VMSA bit is only performed when running
+	 * at VMPL0 and is ignored at other VMPL levels. If too low of a target
+	 * VMPL level is specified, the instruction can succeed without changing
+	 * the VMSA bit should the kernel not be in VMPL0. Using a target VMPL
+	 * level of 1 will return a FAIL_PERMISSION error if the kernel is not
+	 * at VMPL0, thus ensuring that the VMSA bit has been properly set when
+	 * no error is returned.
+	 */
+	attrs = 1;
+	if (vmsa)
+		attrs |= RMPADJUST_VMSA_PAGE_BIT;
+
+	return rmpadjust((unsigned long)va, RMP_PG_SIZE_4K, attrs);
+}
+
+#define __ATTR_BASE		(SVM_SELECTOR_P_MASK | SVM_SELECTOR_S_MASK)
+#define INIT_CS_ATTRIBS		(__ATTR_BASE | SVM_SELECTOR_READ_MASK | SVM_SELECTOR_CODE_MASK)
+#define INIT_DS_ATTRIBS		(__ATTR_BASE | SVM_SELECTOR_WRITE_MASK)
+
+#define INIT_LDTR_ATTRIBS	(SVM_SELECTOR_P_MASK | 2)
+#define INIT_TR_ATTRIBS		(SVM_SELECTOR_P_MASK | 3)
+
+static void *snp_safe_alloc_page(void)
+{
+	unsigned long pfn;
+	struct page *p;
+
+	/*
+	 * Allocate an SNP-safe page to work around the SNP erratum where
+	 * the CPU will incorrectly signal an RMP violation #PF if a
+	 * hugepage (2MB or 1GB) collides with the RMP entry of a VMSA page.
+	 * The recommended workaround is to not use a large page.
+	 *
+	 * Allocate one extra page, use a page which is not 2MB-aligned
+	 * and free the other.
+	 */
+	p = alloc_pages(GFP_KERNEL_ACCOUNT | __GFP_ZERO, 1);
+	if (!p)
+		return NULL;
+
+	split_page(p, 1);
+
+	pfn = page_to_pfn(p);
+	if (IS_ALIGNED(__pfn_to_phys(pfn), PMD_SIZE)) {
+		pfn++;
+		__free_page(p);
+	} else {
+		__free_page(pfn_to_page(pfn + 1));
+	}
+
+	return page_address(pfn_to_page(pfn));
+}
+
+static int wakeup_cpu_via_vmgexit(int apic_id, unsigned long start_ip)
+{
+	struct sev_es_save_area *cur_vmsa, *vmsa;
+	struct ghcb_state state;
+	unsigned long flags;
+	struct ghcb *ghcb;
+	int cpu, err, ret;
+	u8 sipi_vector;
+	u64 cr4;
+
+	if ((sev_hv_features & GHCB_HV_FT_SNP_AP_CREATION) != GHCB_HV_FT_SNP_AP_CREATION)
+		return -EOPNOTSUPP;
+
+	/*
+	 * Verify the desired start IP against the known trampoline start IP
+	 * to catch any future new trampolines that may be introduced that
+	 * would require a new protected guest entry point.
+	 */
+	if (WARN_ONCE(start_ip != real_mode_header->trampoline_start,
+		      "Unsupported SEV-SNP start_ip: %lx\n", start_ip))
+		return -EINVAL;
+
+	/* Override start_ip with known protected guest start IP */
+	start_ip = real_mode_header->sev_es_trampoline_start;
+
+	/* Find the logical CPU for the APIC ID */
+	for_each_present_cpu(cpu) {
+		if (arch_match_cpu_phys_id(cpu, apic_id))
+			break;
+	}
+	if (cpu >= nr_cpu_ids)
+		return -EINVAL;
+
+	cur_vmsa = per_cpu(snp_vmsa, cpu);
+
+	/*
+	 * A new VMSA is created each time because there is no guarantee that
+	 * the current VMSA is the kernel's or that the vCPU is not running. If
+	 * an attempt was made to use the current VMSA with a running vCPU, a
+	 * #VMEXIT of that vCPU would wipe out all of the settings being done
+	 * here.
+	 */
+	vmsa = (struct sev_es_save_area *)snp_safe_alloc_page();
+	if (!vmsa)
+		return -ENOMEM;
+
+	/* CR4 should maintain the MCE value */
+	cr4 = native_read_cr4() & X86_CR4_MCE;
+
+	/* Set the CS value based on the start_ip converted to a SIPI vector */
+	sipi_vector		= (start_ip >> 12);
+	vmsa->cs.base		= sipi_vector << 12;
+	vmsa->cs.limit		= 0xffff;
+	vmsa->cs.attrib		= INIT_CS_ATTRIBS;
+	vmsa->cs.selector	= sipi_vector << 8;
+
+	/* Set the RIP value based on start_ip */
+	vmsa->rip		= start_ip & 0xfff;
+
+	/* Set VMSA entries to the INIT values as documented in the APM */
+	vmsa->ds.limit		= 0xffff;
+	vmsa->ds.attrib		= INIT_DS_ATTRIBS;
+	vmsa->es		= vmsa->ds;
+	vmsa->fs		= vmsa->ds;
+	vmsa->gs		= vmsa->ds;
+	vmsa->ss		= vmsa->ds;
+
+	vmsa->gdtr.limit	= 0xffff;
+	vmsa->ldtr.limit	= 0xffff;
+	vmsa->ldtr.attrib	= INIT_LDTR_ATTRIBS;
+	vmsa->idtr.limit	= 0xffff;
+	vmsa->tr.limit		= 0xffff;
+	vmsa->tr.attrib		= INIT_TR_ATTRIBS;
+
+	vmsa->efer		= 0x1000;	/* Must set SVME bit */
+	vmsa->cr4		= cr4;
+	vmsa->cr0		= 0x60000010;
+	vmsa->dr7		= 0x400;
+	vmsa->dr6		= 0xffff0ff0;
+	vmsa->rflags		= 0x2;
+	vmsa->g_pat		= 0x0007040600070406ULL;
+	vmsa->xcr0		= 0x1;
+	vmsa->mxcsr		= 0x1f80;
+	vmsa->x87_ftw		= 0x5555;
+	vmsa->x87_fcw		= 0x0040;
+
+	/*
+	 * Set the SNP-specific fields for this VMSA:
+	 *   VMPL level
+	 *   SEV_FEATURES (matches the SEV STATUS MSR right shifted 2 bits)
+	 */
+	vmsa->vmpl		= 0;
+	vmsa->sev_features	= sev_status >> 2;
+
+	/* Switch the page over to a VMSA page now that it is initialized */
+	ret = snp_set_vmsa(vmsa, true);
+	if (ret) {
+		pr_err("set VMSA page failed (%u)\n", ret);
+		free_page((unsigned long)vmsa);
+
+		return -EINVAL;
+	}
+
+	/* Issue VMGEXIT AP Creation NAE event */
+	local_irq_save(flags);
+
+	ghcb = __sev_get_ghcb(&state);
+
+	vc_ghcb_invalidate(ghcb);
+	ghcb_set_rax(ghcb, vmsa->sev_features);
+	ghcb_set_sw_exit_code(ghcb, SVM_VMGEXIT_AP_CREATION);
+	ghcb_set_sw_exit_info_1(ghcb, ((u64)apic_id << 32) | SVM_VMGEXIT_AP_CREATE);
+	ghcb_set_sw_exit_info_2(ghcb, __pa(vmsa));
+
+	sev_es_wr_ghcb_msr(__pa(ghcb));
+	VMGEXIT();
+
+	if (!ghcb_sw_exit_info_1_is_valid(ghcb) ||
+	    lower_32_bits(ghcb->save.sw_exit_info_1)) {
+		pr_alert("SNP AP Creation error\n");
+		ret = -EINVAL;
+	}
+
+	__sev_put_ghcb(&state);
+
+	local_irq_restore(flags);
+
+	/* Perform cleanup if there was an error */
+	if (ret) {
+		err = snp_set_vmsa(vmsa, false);
+		if (err)
+			pr_err("clear VMSA page failed (%u), leaking page\n", err);
+		else
+			free_page((unsigned long)vmsa);
+
+		vmsa = NULL;
+	}
+
+	/* Free up any previous VMSA page */
+	if (cur_vmsa) {
+		err = snp_set_vmsa(cur_vmsa, false);
+		if (err)
+			pr_err("clear VMSA page failed (%u), leaking page\n", err);
+		else
+			free_page((unsigned long)cur_vmsa);
+	}
+
+	/* Record the current VMSA page */
+	per_cpu(snp_vmsa, cpu) = vmsa;
+
+	return ret;
+}
+
+void snp_set_wakeup_secondary_cpu(void)
+{
+	if (!cc_platform_has(CC_ATTR_SEV_SNP))
+		return;
+
+	/*
+	 * Always set this override if SEV-SNP is enabled. This makes it the
+	 * required method to start APs under SEV-SNP. If the hypervisor does
+	 * not support AP creation, then no APs will be started.
+	 */
+	apic->wakeup_secondary_cpu = wakeup_cpu_via_vmgexit;
+}
+
 int sev_es_setup_ap_jump_table(struct real_mode_header *rmh)
 {
 	u16 startup_cs, startup_ip;
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index ac2909f0cab3..9eca0b8a72e9 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -82,6 +82,7 @@
 #include <asm/spec-ctrl.h>
 #include <asm/hw_irq.h>
 #include <asm/stackprotector.h>
+#include <asm/sev.h>
 
 #ifdef CONFIG_ACPI_CPPC_LIB
 #include <acpi/cppc_acpi.h>
@@ -1425,6 +1426,8 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus)
 	smp_quirk_init_udelay();
 
 	speculative_store_bypass_ht_init();
+
+	snp_set_wakeup_secondary_cpu();
 }
 
 void arch_thaw_secondary_cpus_begin(void)
-- 
2.25.1
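
For readers following the VMSA setup in wakeup_cpu_via_vmgexit(), here is a
minimal userspace sketch of the arithmetic that turns start_ip into the
real-mode CS/RIP pair, plus the SW_EXITINFO1 encoding for the AP Creation
request. The struct and function names (ap_entry, ap_entry_from_start_ip,
ap_create_exit_info_1) are hypothetical and exist only for illustration; the
shifts and masks mirror the patch above. The assert encodes an assumption of
this sketch (trampoline below 1MB), not a check the patch performs.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the values the patch stores in the VMSA. */
struct ap_entry {
	uint16_t cs_selector;
	uint32_t cs_base;
	uint64_t rip;
};

/* Sub-function value taken from the patch (SVM_VMGEXIT_AP_CREATE). */
#define AP_CREATE	1ULL

static struct ap_entry ap_entry_from_start_ip(uint64_t start_ip)
{
	struct ap_entry e;
	uint8_t sipi_vector;

	/* Assumption of this sketch: the trampoline lives below 1MB. */
	assert(start_ip < (1ULL << 20));

	/* A SIPI vector names a 4K-aligned real-mode page. */
	sipi_vector   = (uint8_t)(start_ip >> 12);
	e.cs_selector = (uint16_t)((uint16_t)sipi_vector << 8);
	e.cs_base     = (uint32_t)sipi_vector << 12;

	/* The remaining low 12 bits become the offset within that page. */
	e.rip         = start_ip & 0xfff;

	return e;
}

/* APIC ID goes in the upper 32 bits, the request type in the lower 32. */
static uint64_t ap_create_exit_info_1(uint32_t apic_id)
{
	return ((uint64_t)apic_id << 32) | AP_CREATE;
}
```

Note that cs_base equals cs_selector << 4, which is the usual real-mode
relationship between a selector and its segment base.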



* [PATCH v8 21/40] x86/head: re-enable stack protection for 32/64-bit builds
  2021-12-10 15:42 [PATCH v8 00/40] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (19 preceding siblings ...)
  2021-12-10 15:43 ` [PATCH v8 20/40] x86/sev: Use SEV-SNP AP creation to start secondary CPUs Brijesh Singh
@ 2021-12-10 15:43 ` Brijesh Singh
  2022-01-03 16:49   ` Borislav Petkov
  2021-12-10 15:43 ` [PATCH v8 22/40] x86/sev: move MSR-based VMGEXITs for CPUID to helper Brijesh Singh
                   ` (19 subsequent siblings)
  40 siblings, 1 reply; 183+ messages in thread
From: Brijesh Singh @ 2021-12-10 15:43 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

From: Michael Roth <michael.roth@amd.com>

As of commit 103a4908ad4d ("x86/head/64: Disable stack protection for
head$(BITS).o") kernel/head64.c is compiled with -fno-stack-protector
to allow a call to set_bringup_idt_handler(), which would otherwise
have stack protection enabled with CONFIG_STACKPROTECTOR_STRONG. While
sufficient for that case, there may still be issues with calls to any
external functions that were compiled with stack protection enabled that
in turn make stack-protected calls, or if the exception handlers set up
by set_bringup_idt_handler() make calls to stack-protected functions.
As part of 103a4908ad4d, stack protection was also disabled for
kernel/head32.c as a precaution.

Subsequent patches for SEV-SNP CPUID validation support will introduce
both such cases. Attempting to disable stack protection for everything
in scope to address that is prohibitive since much of the code, like the
SEV-ES #VC handler, is shared code that remains in use after boot and
could benefit from having stack protection enabled. Attempting to inline
calls is brittle and can quickly balloon out to library/helper code
where that's not really an option.

Instead, re-enable stack protection for head32.c/head64.c and make the
appropriate changes to ensure the segment used for the stack canary is
initialized in advance of any stack-protected C calls.

For head64.c:

- The BSP will enter from startup_64 and call into C code
  (startup_64_setup_env) shortly after setting up the stack, which may
  result in calls to stack-protected code. Set up %gs early to allow
  for this safely.
- APs will enter from secondary_startup_64*, and %gs will be set up
  soon after. There is one call to C code prior to this
  (__startup_secondary_64), but it is only to fetch sme_me_mask, and
  unlikely to be stack-protected, so leave things as they are, but add
  a note about this in case things change in the future.

For head32.c:

- BSPs/APs will set %fs to __BOOT_DS prior to any C calls. In recent
  kernels, the compiler is configured to access the stack canary at
  %fs:__stack_chk_guard, which overlaps with the initial per-cpu
  __stack_chk_guard variable in the initial/'master' .data..percpu
  area. This is sufficient to allow access to the canary for use
  during initial startup, so no changes are needed there.

Suggested-by: Joerg Roedel <jroedel@suse.de> #for 64-bit %gs set up
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/kernel/Makefile  |  1 -
 arch/x86/kernel/head_64.S | 24 ++++++++++++++++++++++++
 2 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 2ff3e600f426..4df8c8f7d2ac 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -48,7 +48,6 @@ endif
 # non-deterministic coverage.
 KCOV_INSTRUMENT		:= n
 
-CFLAGS_head$(BITS).o	+= -fno-stack-protector
 CFLAGS_cc_platform.o	+= -fno-stack-protector
 
 CFLAGS_irq.o := -I $(srctree)/$(src)/../include/asm/trace
diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index 99de8fd461e8..9f8a7e48aca7 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -65,6 +65,22 @@ SYM_CODE_START_NOALIGN(startup_64)
 	leaq	(__end_init_task - FRAME_SIZE)(%rip), %rsp
 
 	leaq	_text(%rip), %rdi
+
+	/*
+	 * initial_gs points to initial fixed_percpu_data struct with storage for
+	 * the stack protector canary. Global pointer fixups are needed at this
+	 * stage, so apply them as is done in fixup_pointer(), and initialize %gs
+	 * such that the canary can be accessed at %gs:40 for subsequent C calls.
+	 */
+	movl	$MSR_GS_BASE, %ecx
+	movq	initial_gs(%rip), %rax
+	movq	$_text, %rdx
+	subq	%rdx, %rax
+	addq	%rdi, %rax
+	movq	%rax, %rdx
+	shrq	$32,  %rdx
+	wrmsr
+
 	pushq	%rsi
 	call	startup_64_setup_env
 	popq	%rsi
@@ -146,6 +162,14 @@ SYM_INNER_LABEL(secondary_startup_64_no_verify, SYM_L_GLOBAL)
 	 * added to the initial pgdir entry that will be programmed into CR3.
 	 */
 	pushq	%rsi
+	/*
+	 * NOTE: %gs at this point is a stale data segment left over from the
+	 * real-mode trampoline, so the default stack protector canary location
+	 * at %gs:40 does not yet coincide with the expected fixed_percpu_data struct
+	 * that contains storage for the stack canary. So take care not to add
+	 * anything to the C functions in this path that would result in stack
+	 * protected C code being generated.
+	 */
 	call	__startup_secondary_64
 	popq	%rsi
 
-- 
2.25.1
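
The %gs setup added to startup_64 is a pointer fixup followed by a 64-bit
MSR write. A rough C rendering of that arithmetic, as a sketch only: the
helper names (fixup_address, split_for_wrmsr, gs_base_example) and the
example addresses are made up for illustration. The link-time address of
initial_gs is rebased from the link-time _text to the runtime _text held in
%rdi, and the result is split into the EDX:EAX halves that wrmsr expects.

```c
#include <assert.h>
#include <stdint.h>

/* Rebase a link-time address onto where the kernel actually runs. */
static uint64_t fixup_address(uint64_t link_addr, uint64_t text_link,
			      uint64_t text_runtime)
{
	return link_addr - text_link + text_runtime;
}

/* wrmsr takes the low half in EAX and the high half in EDX. */
struct msr_halves {
	uint32_t eax;
	uint32_t edx;
};

static struct msr_halves split_for_wrmsr(uint64_t value)
{
	struct msr_halves h = {
		.eax = (uint32_t)value,
		.edx = (uint32_t)(value >> 32),
	};

	return h;
}

/* Worked example with made-up addresses. */
static void gs_base_example(void)
{
	uint64_t base = fixup_address(0xffffffff82a00000ULL,	/* linked */
				      0xffffffff81000000ULL,	/* _text, linked */
				      0x0000000101000000ULL);	/* _text, runtime */
	struct msr_halves h = split_for_wrmsr(base);

	assert(base == 0x0000000102a00000ULL);
	assert(h.eax == 0x02a00000U);
	assert(h.edx == 0x1U);
}
```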



* [PATCH v8 22/40] x86/sev: move MSR-based VMGEXITs for CPUID to helper
  2021-12-10 15:42 [PATCH v8 00/40] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (20 preceding siblings ...)
  2021-12-10 15:43 ` [PATCH v8 21/40] x86/head: re-enable stack protection for 32/64-bit builds Brijesh Singh
@ 2021-12-10 15:43 ` Brijesh Singh
  2021-12-30 18:52   ` Sean Christopherson
  2022-01-06 18:38   ` Venu Busireddy
  2021-12-10 15:43 ` [PATCH v8 23/40] KVM: x86: move lookup of indexed CPUID leafs " Brijesh Singh
                   ` (18 subsequent siblings)
  40 siblings, 2 replies; 183+ messages in thread
From: Brijesh Singh @ 2021-12-10 15:43 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

From: Michael Roth <michael.roth@amd.com>

This code will also be used later for SEV-SNP-validated CPUID code in
some cases, so move it to a common helper.

Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/kernel/sev-shared.c | 84 +++++++++++++++++++++++++-----------
 1 file changed, 58 insertions(+), 26 deletions(-)

diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c
index 3aaef1a18ffe..d89481b31022 100644
--- a/arch/x86/kernel/sev-shared.c
+++ b/arch/x86/kernel/sev-shared.c
@@ -194,6 +194,58 @@ enum es_result sev_es_ghcb_hv_call(struct ghcb *ghcb, bool set_ghcb_msr,
 	return verify_exception_info(ghcb, ctxt);
 }
 
+static int sev_cpuid_hv(u32 func, u32 subfunc, u32 *eax, u32 *ebx,
+			u32 *ecx, u32 *edx)
+{
+	u64 val;
+
+	if (eax) {
+		sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(func, GHCB_CPUID_REQ_EAX));
+		VMGEXIT();
+		val = sev_es_rd_ghcb_msr();
+
+		if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
+			return -EIO;
+
+		*eax = (val >> 32);
+	}
+
+	if (ebx) {
+		sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(func, GHCB_CPUID_REQ_EBX));
+		VMGEXIT();
+		val = sev_es_rd_ghcb_msr();
+
+		if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
+			return -EIO;
+
+		*ebx = (val >> 32);
+	}
+
+	if (ecx) {
+		sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(func, GHCB_CPUID_REQ_ECX));
+		VMGEXIT();
+		val = sev_es_rd_ghcb_msr();
+
+		if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
+			return -EIO;
+
+		*ecx = (val >> 32);
+	}
+
+	if (edx) {
+		sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(func, GHCB_CPUID_REQ_EDX));
+		VMGEXIT();
+		val = sev_es_rd_ghcb_msr();
+
+		if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
+			return -EIO;
+
+		*edx = (val >> 32);
+	}
+
+	return 0;
+}
+
 /*
  * Boot VC Handler - This is the first VC handler during boot, there is no GHCB
  * page yet, so it only supports the MSR based communication with the
@@ -202,39 +254,19 @@ enum es_result sev_es_ghcb_hv_call(struct ghcb *ghcb, bool set_ghcb_msr,
 void __init do_vc_no_ghcb(struct pt_regs *regs, unsigned long exit_code)
 {
 	unsigned int fn = lower_bits(regs->ax, 32);
-	unsigned long val;
+	u32 eax, ebx, ecx, edx;
 
 	/* Only CPUID is supported via MSR protocol */
 	if (exit_code != SVM_EXIT_CPUID)
 		goto fail;
 
-	sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(fn, GHCB_CPUID_REQ_EAX));
-	VMGEXIT();
-	val = sev_es_rd_ghcb_msr();
-	if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
+	if (sev_cpuid_hv(fn, 0, &eax, &ebx, &ecx, &edx))
 		goto fail;
-	regs->ax = val >> 32;
 
-	sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(fn, GHCB_CPUID_REQ_EBX));
-	VMGEXIT();
-	val = sev_es_rd_ghcb_msr();
-	if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
-		goto fail;
-	regs->bx = val >> 32;
-
-	sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(fn, GHCB_CPUID_REQ_ECX));
-	VMGEXIT();
-	val = sev_es_rd_ghcb_msr();
-	if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
-		goto fail;
-	regs->cx = val >> 32;
-
-	sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(fn, GHCB_CPUID_REQ_EDX));
-	VMGEXIT();
-	val = sev_es_rd_ghcb_msr();
-	if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
-		goto fail;
-	regs->dx = val >> 32;
+	regs->ax = eax;
+	regs->bx = ebx;
+	regs->cx = ecx;
+	regs->dx = edx;
 
 	/*
 	 * This is a VC handler and the #VC is only raised when SEV-ES is
-- 
2.25.1
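
The four request/response exchanges in sev_cpuid_hv() differ only in the
register selector, so the pattern can be illustrated with a loop. The
following is a userspace sketch, not a proposed replacement: the hypervisor
is modelled by a caller-supplied function pointer (hv_exchange_fn, fake_hv
and cpuid_via_msr are all hypothetical names), and the bit layout (request
code 0x004, response code 0x005, register selector in bits 31:30, function
and returned value in bits 63:32) reflects my reading of the GHCB MSR
protocol and should be checked against the specification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MSR_CPUID_REQ	0x004ULL
#define MSR_CPUID_RESP	0x005ULL
#define MSR_INFO_MASK	0xfffULL

/* Stand-in for "write GHCB MSR, VMGEXIT, read GHCB MSR". */
typedef uint64_t (*hv_exchange_fn)(uint64_t request);

static uint64_t cpuid_req(uint32_t func, unsigned int reg)
{
	return MSR_CPUID_REQ | ((uint64_t)(reg & 0x3) << 30) |
	       ((uint64_t)func << 32);
}

/* regs[] holds EAX, EBX, ECX, EDX in order; NULL entries are skipped. */
static int cpuid_via_msr(hv_exchange_fn hv, uint32_t func, uint32_t *regs[4])
{
	assert(hv != NULL);

	for (unsigned int reg = 0; reg < 4; reg++) {
		uint64_t val;

		if (!regs[reg])
			continue;

		val = hv(cpuid_req(func, reg));
		if ((val & MSR_INFO_MASK) != MSR_CPUID_RESP)
			return -1;

		*regs[reg] = (uint32_t)(val >> 32);
	}

	return 0;
}

/* Toy hypervisor: answers every request with (function + register index). */
static uint64_t fake_hv(uint64_t request)
{
	uint32_t func = (uint32_t)(request >> 32);
	uint32_t reg = (uint32_t)((request >> 30) & 0x3);

	return ((uint64_t)(func + reg) << 32) | MSR_CPUID_RESP;
}
```

As in the patch, each requested register costs one round trip to the
hypervisor, and any malformed response aborts the whole lookup.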



* [PATCH v8 23/40] KVM: x86: move lookup of indexed CPUID leafs to helper
  2021-12-10 15:42 [PATCH v8 00/40] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (21 preceding siblings ...)
  2021-12-10 15:43 ` [PATCH v8 22/40] x86/sev: move MSR-based VMGEXITs for CPUID to helper Brijesh Singh
@ 2021-12-10 15:43 ` Brijesh Singh
  2022-01-06 18:46   ` Venu Busireddy
  2021-12-10 15:43 ` [PATCH v8 24/40] x86/compressed/acpi: move EFI system table lookup " Brijesh Singh
                   ` (17 subsequent siblings)
  40 siblings, 1 reply; 183+ messages in thread
From: Brijesh Singh @ 2021-12-10 15:43 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

From: Michael Roth <michael.roth@amd.com>

Determining which CPUID leafs have significant ECX/index values is
also needed by guest kernel code when doing SEV-SNP-validated CPUID
lookups. Move this to common code to keep future updates in sync.

Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/include/asm/cpuid.h | 26 ++++++++++++++++++++++++++
 arch/x86/kvm/cpuid.c         | 17 ++---------------
 2 files changed, 28 insertions(+), 15 deletions(-)
 create mode 100644 arch/x86/include/asm/cpuid.h

diff --git a/arch/x86/include/asm/cpuid.h b/arch/x86/include/asm/cpuid.h
new file mode 100644
index 000000000000..61426eb1f665
--- /dev/null
+++ b/arch/x86/include/asm/cpuid.h
@@ -0,0 +1,26 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_X86_CPUID_H
+#define _ASM_X86_CPUID_H
+
+static __always_inline bool cpuid_function_is_indexed(u32 function)
+{
+	switch (function) {
+	case 4:
+	case 7:
+	case 0xb:
+	case 0xd:
+	case 0xf:
+	case 0x10:
+	case 0x12:
+	case 0x14:
+	case 0x17:
+	case 0x18:
+	case 0x1f:
+	case 0x8000001d:
+		return true;
+	}
+
+	return false;
+}
+
+#endif /* _ASM_X86_CPUID_H */
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 07e9215e911d..6b99e8e87480 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -19,6 +19,7 @@
 #include <asm/user.h>
 #include <asm/fpu/xstate.h>
 #include <asm/sgx.h>
+#include <asm/cpuid.h>
 #include "cpuid.h"
 #include "lapic.h"
 #include "mmu.h"
@@ -626,22 +627,8 @@ static struct kvm_cpuid_entry2 *do_host_cpuid(struct kvm_cpuid_array *array,
 	cpuid_count(entry->function, entry->index,
 		    &entry->eax, &entry->ebx, &entry->ecx, &entry->edx);
 
-	switch (function) {
-	case 4:
-	case 7:
-	case 0xb:
-	case 0xd:
-	case 0xf:
-	case 0x10:
-	case 0x12:
-	case 0x14:
-	case 0x17:
-	case 0x18:
-	case 0x1f:
-	case 0x8000001d:
+	if (cpuid_function_is_indexed(function))
 		entry->flags |= KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
-		break;
-	}
 
 	return entry;
 }
-- 
2.25.1
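
To illustrate why a CPUID table lookup needs to know which functions are
indexed, here is a small userspace sketch. The case list is copied from
cpuid_function_is_indexed() in the patch above; the table layout
(struct cpuid_entry) and the cpuid_lookup() helper are hypothetical and are
not the structures used later in the series. For non-indexed functions the
ECX input is ignored, while for indexed functions it must match.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same case list as the helper introduced by the patch. */
static bool cpuid_function_is_indexed(uint32_t function)
{
	switch (function) {
	case 4:
	case 7:
	case 0xb:
	case 0xd:
	case 0xf:
	case 0x10:
	case 0x12:
	case 0x14:
	case 0x17:
	case 0x18:
	case 0x1f:
	case 0x8000001d:
		return true;
	}

	return false;
}

/* Hypothetical table entry, reduced to what the sketch needs. */
struct cpuid_entry {
	uint32_t func;
	uint32_t index;
	uint32_t eax;
};

static const struct cpuid_entry *
cpuid_lookup(const struct cpuid_entry *tbl, size_t n, uint32_t func,
	     uint32_t index)
{
	bool indexed = cpuid_function_is_indexed(func);

	assert(tbl != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (tbl[i].func != func)
			continue;
		/* Only compare the sub-leaf when ECX is significant. */
		if (indexed && tbl[i].index != index)
			continue;
		return &tbl[i];
	}

	return NULL;
}
```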



* [PATCH v8 24/40] x86/compressed/acpi: move EFI system table lookup to helper
  2021-12-10 15:42 [PATCH v8 00/40] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (22 preceding siblings ...)
  2021-12-10 15:43 ` [PATCH v8 23/40] KVM: x86: move lookup of indexed CPUID leafs " Brijesh Singh
@ 2021-12-10 15:43 ` Brijesh Singh
  2021-12-10 18:54   ` Dave Hansen
                     ` (2 more replies)
  2021-12-10 15:43 ` [PATCH v8 25/40] x86/compressed/acpi: move EFI config " Brijesh Singh
                   ` (16 subsequent siblings)
  40 siblings, 3 replies; 183+ messages in thread
From: Brijesh Singh @ 2021-12-10 15:43 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

From: Michael Roth <michael.roth@amd.com>

Future patches for SEV-SNP-validated CPUID will also require early
parsing of the EFI configuration. Incrementally move the related code
into a set of helpers that can be re-used for that purpose.

Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/boot/compressed/Makefile |  1 +
 arch/x86/boot/compressed/acpi.c   | 60 ++++++++++----------------
 arch/x86/boot/compressed/efi.c    | 72 +++++++++++++++++++++++++++++++
 arch/x86/boot/compressed/misc.h   | 14 ++++++
 4 files changed, 109 insertions(+), 38 deletions(-)
 create mode 100644 arch/x86/boot/compressed/efi.c

diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index 431bf7f846c3..d364192c2367 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -100,6 +100,7 @@ endif
 vmlinux-objs-$(CONFIG_ACPI) += $(obj)/acpi.o
 
 vmlinux-objs-$(CONFIG_EFI_MIXED) += $(obj)/efi_thunk_$(BITS).o
+vmlinux-objs-$(CONFIG_EFI) += $(obj)/efi.o
 efi-obj-$(CONFIG_EFI_STUB) = $(objtree)/drivers/firmware/efi/libstub/lib.a
 
 $(obj)/vmlinux: $(vmlinux-objs-y) $(efi-obj-y) FORCE
diff --git a/arch/x86/boot/compressed/acpi.c b/arch/x86/boot/compressed/acpi.c
index 8bcbcee54aa1..9e784bd7b2e6 100644
--- a/arch/x86/boot/compressed/acpi.c
+++ b/arch/x86/boot/compressed/acpi.c
@@ -86,8 +86,8 @@ static acpi_physical_address kexec_get_rsdp_addr(void)
 {
 	efi_system_table_64_t *systab;
 	struct efi_setup_data *esd;
-	struct efi_info *ei;
-	char *sig;
+	bool efi_64;
+	int ret;
 
 	esd = (struct efi_setup_data *)get_kexec_setup_data_addr();
 	if (!esd)
@@ -98,18 +98,16 @@ static acpi_physical_address kexec_get_rsdp_addr(void)
 		return 0;
 	}
 
-	ei = &boot_params->efi_info;
-	sig = (char *)&ei->efi_loader_signature;
-	if (strncmp(sig, EFI64_LOADER_SIGNATURE, 4)) {
+	/* Get systab from boot params. */
+	ret = efi_get_system_table(boot_params, (unsigned long *)&systab, &efi_64);
+	if (ret)
+		error("EFI system table not found in kexec boot_params.");
+
+	if (!efi_64) {
 		debug_putstr("Wrong kexec EFI loader signature.\n");
 		return 0;
 	}
 
-	/* Get systab from boot params. */
-	systab = (efi_system_table_64_t *) (ei->efi_systab | ((__u64)ei->efi_systab_hi << 32));
-	if (!systab)
-		error("EFI system table not found in kexec boot_params.");
-
 	return __efi_get_rsdp_addr((unsigned long)esd->tables, systab->nr_tables, true);
 }
 #else
@@ -119,45 +117,31 @@ static acpi_physical_address kexec_get_rsdp_addr(void) { return 0; }
 static acpi_physical_address efi_get_rsdp_addr(void)
 {
 #ifdef CONFIG_EFI
-	unsigned long systab, config_tables;
+	unsigned long systab_tbl_pa, config_tables;
 	unsigned int nr_tables;
-	struct efi_info *ei;
 	bool efi_64;
-	char *sig;
-
-	ei = &boot_params->efi_info;
-	sig = (char *)&ei->efi_loader_signature;
-
-	if (!strncmp(sig, EFI64_LOADER_SIGNATURE, 4)) {
-		efi_64 = true;
-	} else if (!strncmp(sig, EFI32_LOADER_SIGNATURE, 4)) {
-		efi_64 = false;
-	} else {
-		debug_putstr("Wrong EFI loader signature.\n");
-		return 0;
-	}
+	int ret;
 
-	/* Get systab from boot params. */
-#ifdef CONFIG_X86_64
-	systab = ei->efi_systab | ((__u64)ei->efi_systab_hi << 32);
-#else
-	if (ei->efi_systab_hi || ei->efi_memmap_hi) {
-		debug_putstr("Error getting RSDP address: EFI system table located above 4GB.\n");
+	/*
+	 * This function is called even for non-EFI BIOSes, and callers expect
+	 * failure to locate the EFI system table to result in 0 being returned
+	 * as indication that EFI is not available, rather than outright
+	 * failure/abort.
+	 */
+	ret = efi_get_system_table(boot_params, &systab_tbl_pa, &efi_64);
+	if (ret == -EOPNOTSUPP)
 		return 0;
-	}
-	systab = ei->efi_systab;
-#endif
-	if (!systab)
-		error("EFI system table not found.");
+	if (ret)
+		error("EFI support advertised, but unable to locate system table.");
 
 	/* Handle EFI bitness properly */
 	if (efi_64) {
-		efi_system_table_64_t *stbl = (efi_system_table_64_t *)systab;
+		efi_system_table_64_t *stbl = (efi_system_table_64_t *)systab_tbl_pa;
 
 		config_tables	= stbl->tables;
 		nr_tables	= stbl->nr_tables;
 	} else {
-		efi_system_table_32_t *stbl = (efi_system_table_32_t *)systab;
+		efi_system_table_32_t *stbl = (efi_system_table_32_t *)systab_tbl_pa;
 
 		config_tables	= stbl->tables;
 		nr_tables	= stbl->nr_tables;
diff --git a/arch/x86/boot/compressed/efi.c b/arch/x86/boot/compressed/efi.c
new file mode 100644
index 000000000000..1c626d28f07e
--- /dev/null
+++ b/arch/x86/boot/compressed/efi.c
@@ -0,0 +1,72 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Helpers for early access to EFI configuration table
+ *
+ * Copyright (C) 2021 Advanced Micro Devices, Inc.
+ *
+ * Author: Michael Roth <michael.roth@amd.com>
+ */
+
+#include "misc.h"
+#include <linux/efi.h>
+#include <asm/efi.h>
+
+/**
+ * efi_get_system_table - Given boot_params, retrieve the physical address of
+ *                        EFI system table.
+ *
+ * @boot_params:        pointer to boot_params
+ * @sys_tbl_pa:         location to store physical address of system table
+ * @is_efi_64:          location to store whether using 64-bit EFI or not
+ *
+ * Return: 0 on success. On error, return params are left unchanged.
+ *
+ * Note: Existing callers like ACPI will call this unconditionally even for
+ * non-EFI BIOSes. In such cases, those callers may treat cases where
+ * bootparams doesn't indicate that a valid EFI system table is available as
+ * non-fatal errors to allow fall-through to non-EFI alternatives. This
+ * class of errors is reported as EOPNOTSUPP and should be kept in sync with
+ * callers who check for that specific error.
+ */
+int efi_get_system_table(struct boot_params *boot_params, unsigned long *sys_tbl_pa,
+			 bool *is_efi_64)
+{
+	unsigned long sys_tbl;
+	struct efi_info *ei;
+	bool efi_64;
+	char *sig;
+
+	if (!sys_tbl_pa || !is_efi_64)
+		return -EINVAL;
+
+	ei = &boot_params->efi_info;
+	sig = (char *)&ei->efi_loader_signature;
+
+	if (!strncmp(sig, EFI64_LOADER_SIGNATURE, 4)) {
+		efi_64 = true;
+	} else if (!strncmp(sig, EFI32_LOADER_SIGNATURE, 4)) {
+		efi_64 = false;
+	} else {
+		debug_putstr("Wrong EFI loader signature.\n");
+		return -EOPNOTSUPP;
+	}
+
+	/* Get systab from boot params. */
+#ifdef CONFIG_X86_64
+	sys_tbl = ei->efi_systab | ((__u64)ei->efi_systab_hi << 32);
+#else
+	if (ei->efi_systab_hi || ei->efi_memmap_hi) {
+		debug_putstr("Error: EFI system table located above 4GB.\n");
+		return -EOPNOTSUPP;
+	}
+	sys_tbl = ei->efi_systab;
+#endif
+	if (!sys_tbl) {
+		debug_putstr("EFI system table not found.\n");
+		return -ENOENT;
+	}
+
+	*sys_tbl_pa = sys_tbl;
+	*is_efi_64 = efi_64;
+	return 0;
+}
diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
index 01cc13c12059..165640f64b71 100644
--- a/arch/x86/boot/compressed/misc.h
+++ b/arch/x86/boot/compressed/misc.h
@@ -23,6 +23,7 @@
 #include <linux/screen_info.h>
 #include <linux/elf.h>
 #include <linux/io.h>
+#include <linux/efi.h>
 #include <asm/page.h>
 #include <asm/boot.h>
 #include <asm/bootparam.h>
@@ -176,4 +177,17 @@ void boot_stage2_vc(void);
 
 unsigned long sev_verify_cbit(unsigned long cr3);
 
+#ifdef CONFIG_EFI
+/* helpers for early EFI config table access */
+int efi_get_system_table(struct boot_params *boot_params,
+			 unsigned long *sys_tbl_pa, bool *is_efi_64);
+#else
+static inline int
+efi_get_system_table(struct boot_params *boot_params,
+		     unsigned long *sys_tbl_pa, bool *is_efi_64)
+{
+	return -ENOENT;
+}
+#endif /* CONFIG_EFI */
+
 #endif /* BOOT_COMPRESSED_MISC_H */
-- 
2.25.1



* [PATCH v8 25/40] x86/compressed/acpi: move EFI config table lookup to helper
  2021-12-10 15:42 [PATCH v8 00/40] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (23 preceding siblings ...)
  2021-12-10 15:43 ` [PATCH v8 24/40] x86/compressed/acpi: move EFI system table lookup " Brijesh Singh
@ 2021-12-10 15:43 ` Brijesh Singh
  2022-01-06 20:33   ` Venu Busireddy
  2021-12-10 15:43 ` [PATCH v8 26/40] x86/compressed/acpi: move EFI vendor " Brijesh Singh
                   ` (15 subsequent siblings)
  40 siblings, 1 reply; 183+ messages in thread
From: Brijesh Singh @ 2021-12-10 15:43 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

From: Michael Roth <michael.roth@amd.com>

Future patches for SEV-SNP-validated CPUID will also require early
parsing of the EFI configuration. Incrementally move the related code
into a set of helpers that can be re-used for that purpose.

Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
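For reviewers who want to see the bitness handling in isolation, here is a
minimal userspace sketch of what the new helper does once the system table has
been located. It is not the kernel code: the mock_* names and the trimmed-down
structs are hypothetical stand-ins for efi_system_table_64_t and
efi_system_table_32_t, and only the "outputs are written on success only"
contract is modelled.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-ins for efi_system_table_{64,32}_t. */
struct mock_systab64 {
	uint64_t nr_tables;
	uint64_t tables;
};

struct mock_systab32 {
	uint32_t nr_tables;
	uint32_t tables;
};

/*
 * Mirrors the shape of efi_get_conf_table(): the output parameters are
 * written only on success, so callers may pre-initialize them.
 */
static int mock_get_conf_table(const void *systab, bool is_efi_64,
			       uint64_t *cfg_tbl_pa, unsigned int *cfg_tbl_len)
{
	if (!systab || !cfg_tbl_pa || !cfg_tbl_len)
		return -EINVAL;

	if (is_efi_64) {
		const struct mock_systab64 *stbl = systab;

		*cfg_tbl_pa  = stbl->tables;
		*cfg_tbl_len = (unsigned int)stbl->nr_tables;
	} else {
		const struct mock_systab32 *stbl = systab;

		*cfg_tbl_pa  = stbl->tables;
		*cfg_tbl_len = stbl->nr_tables;
	}

	return 0;
}
```

The caller in acpi.c follows the same pattern: it initializes cfg_tbl_pa to 0
and treats either a non-zero return or a zero address as "config table not
found".
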
 arch/x86/boot/compressed/acpi.c | 25 ++++++--------------
 arch/x86/boot/compressed/efi.c  | 42 +++++++++++++++++++++++++++++++++
 arch/x86/boot/compressed/misc.h |  9 +++++++
 3 files changed, 58 insertions(+), 18 deletions(-)

diff --git a/arch/x86/boot/compressed/acpi.c b/arch/x86/boot/compressed/acpi.c
index 9e784bd7b2e6..fea72a1504ff 100644
--- a/arch/x86/boot/compressed/acpi.c
+++ b/arch/x86/boot/compressed/acpi.c
@@ -117,8 +117,9 @@ static acpi_physical_address kexec_get_rsdp_addr(void) { return 0; }
 static acpi_physical_address efi_get_rsdp_addr(void)
 {
 #ifdef CONFIG_EFI
-	unsigned long systab_tbl_pa, config_tables;
-	unsigned int nr_tables;
+	unsigned long cfg_tbl_pa = 0;
+	unsigned long systab_tbl_pa;
+	unsigned int cfg_tbl_len;
 	bool efi_64;
 	int ret;
 
@@ -134,23 +135,11 @@ static acpi_physical_address efi_get_rsdp_addr(void)
 	if (ret)
 		error("EFI support advertised, but unable to locate system table.");
 
-	/* Handle EFI bitness properly */
-	if (efi_64) {
-		efi_system_table_64_t *stbl = (efi_system_table_64_t *)systab_tbl_pa;
+	ret = efi_get_conf_table(boot_params, &cfg_tbl_pa, &cfg_tbl_len, &efi_64);
+	if (ret || !cfg_tbl_pa)
+		error("EFI config table not found.");
 
-		config_tables	= stbl->tables;
-		nr_tables	= stbl->nr_tables;
-	} else {
-		efi_system_table_32_t *stbl = (efi_system_table_32_t *)systab_tbl_pa;
-
-		config_tables	= stbl->tables;
-		nr_tables	= stbl->nr_tables;
-	}
-
-	if (!config_tables)
-		error("EFI config tables not found.");
-
-	return __efi_get_rsdp_addr(config_tables, nr_tables, efi_64);
+	return __efi_get_rsdp_addr(cfg_tbl_pa, cfg_tbl_len, efi_64);
 #else
 	return 0;
 #endif
diff --git a/arch/x86/boot/compressed/efi.c b/arch/x86/boot/compressed/efi.c
index 1c626d28f07e..08ad517b0731 100644
--- a/arch/x86/boot/compressed/efi.c
+++ b/arch/x86/boot/compressed/efi.c
@@ -70,3 +70,45 @@ int efi_get_system_table(struct boot_params *boot_params, unsigned long *sys_tbl
 	*is_efi_64 = efi_64;
 	return 0;
 }
+
+/**
+ * efi_get_conf_table - Given boot_params, locate EFI system table from it
+ *                      and return the physical address of the EFI configuration table.
+ *
+ * @boot_params:        pointer to boot_params
+ * @cfg_tbl_pa:         location to store physical address of config table
+ * @cfg_tbl_len:        location to store number of config table entries
+ * @is_efi_64:          location to store whether using 64-bit EFI or not
+ *
+ * Return: 0 on success. On error, return params are left unchanged.
+ */
+int efi_get_conf_table(struct boot_params *boot_params, unsigned long *cfg_tbl_pa,
+		       unsigned int *cfg_tbl_len, bool *is_efi_64)
+{
+	unsigned long sys_tbl_pa = 0;
+	int ret;
+
+	if (!cfg_tbl_pa || !cfg_tbl_len || !is_efi_64)
+		return -EINVAL;
+
+	ret = efi_get_system_table(boot_params, &sys_tbl_pa, is_efi_64);
+	if (ret)
+		return ret;
+
+	/* Handle EFI bitness properly */
+	if (*is_efi_64) {
+		efi_system_table_64_t *stbl =
+			(efi_system_table_64_t *)sys_tbl_pa;
+
+		*cfg_tbl_pa	= stbl->tables;
+		*cfg_tbl_len	= stbl->nr_tables;
+	} else {
+		efi_system_table_32_t *stbl =
+			(efi_system_table_32_t *)sys_tbl_pa;
+
+		*cfg_tbl_pa	= stbl->tables;
+		*cfg_tbl_len	= stbl->nr_tables;
+	}
+
+	return 0;
+}
diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
index 165640f64b71..1c69592e83da 100644
--- a/arch/x86/boot/compressed/misc.h
+++ b/arch/x86/boot/compressed/misc.h
@@ -181,6 +181,8 @@ unsigned long sev_verify_cbit(unsigned long cr3);
 /* helpers for early EFI config table access */
 int efi_get_system_table(struct boot_params *boot_params,
 			 unsigned long *sys_tbl_pa, bool *is_efi_64);
+int efi_get_conf_table(struct boot_params *boot_params, unsigned long *cfg_tbl_pa,
+		       unsigned int *cfg_tbl_len, bool *is_efi_64);
 #else
 static inline int
 efi_get_system_table(struct boot_params *boot_params,
@@ -188,6 +190,13 @@ efi_get_system_table(struct boot_params *boot_params,
 {
 	return -ENOENT;
 }
+
+static inline int
+efi_get_conf_table(struct boot_params *boot_params, unsigned long *cfg_tbl_pa,
+		   unsigned int *cfg_tbl_len, bool *is_efi_64)
+{
+	return -ENOENT;
+}
 #endif /* CONFIG_EFI */
 
 #endif /* BOOT_COMPRESSED_MISC_H */
-- 
2.25.1



* [PATCH v8 26/40] x86/compressed/acpi: move EFI vendor table lookup to helper
  2021-12-10 15:42 [PATCH v8 00/40] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (24 preceding siblings ...)
  2021-12-10 15:43 ` [PATCH v8 25/40] x86/compressed/acpi: move EFI config " Brijesh Singh
@ 2021-12-10 15:43 ` Brijesh Singh
  2022-01-06 20:47   ` Venu Busireddy
  2021-12-10 15:43 ` [PATCH v8 27/40] x86/boot: Add Confidential Computing type to setup_data Brijesh Singh
                   ` (14 subsequent siblings)
  40 siblings, 1 reply; 183+ messages in thread
From: Brijesh Singh @ 2021-12-10 15:43 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

From: Michael Roth <michael.roth@amd.com>

Future patches for SEV-SNP-validated CPUID will also require early
parsing of the EFI configuration. Incrementally move the related code
into a set of helpers that can be re-used for that purpose.

Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
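The lookup order used for the RSDP after this change can be sketched in plain
userspace C as follows. This is only an illustration under stated assumptions:
the mock_* types are hypothetical stand-ins for efi_guid_t and
efi_config_table_64_t, the 32-bit entry layout is left out, and "0" is used as
the "no RSDP found" value just as __efi_get_rsdp_addr() does.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for efi_guid_t and efi_config_table_64_t. */
struct mock_guid {
	uint8_t b[16];
};

struct mock_cfg_entry {
	struct mock_guid guid;
	uint64_t table;
};

/* Linear search by GUID; *out is written only when a match is found. */
static int mock_find_vendor_table(const struct mock_cfg_entry *tbl,
				  unsigned int len,
				  const struct mock_guid *guid, uint64_t *out)
{
	unsigned int i;

	for (i = 0; i < len; i++) {
		if (!memcmp(&tbl[i].guid, guid, sizeof(*guid))) {
			*out = tbl[i].table;
			return 0;
		}
	}

	return -ENOENT;
}

/* Try the preferred GUID first, then the fallback; 0 means "not found". */
static uint64_t mock_get_rsdp(const struct mock_cfg_entry *tbl,
			      unsigned int len,
			      const struct mock_guid *preferred,
			      const struct mock_guid *fallback)
{
	uint64_t addr = 0;

	if (!mock_find_vendor_table(tbl, len, preferred, &addr))
		return addr;

	mock_find_vendor_table(tbl, len, fallback, &addr);
	return addr;
}
```

Compared with the open-coded loop being removed, this walks the table up to
twice instead of once. Config tables are small, so the extra pass buys a
reusable search helper at little cost.
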
 arch/x86/boot/compressed/acpi.c | 50 ++++++++-----------------
 arch/x86/boot/compressed/efi.c  | 65 +++++++++++++++++++++++++++++++++
 arch/x86/boot/compressed/misc.h |  9 +++++
 3 files changed, 90 insertions(+), 34 deletions(-)

diff --git a/arch/x86/boot/compressed/acpi.c b/arch/x86/boot/compressed/acpi.c
index fea72a1504ff..0670c8f8888a 100644
--- a/arch/x86/boot/compressed/acpi.c
+++ b/arch/x86/boot/compressed/acpi.c
@@ -20,46 +20,28 @@
  */
 struct mem_vector immovable_mem[MAX_NUMNODES*2];
 
-/*
- * Search EFI system tables for RSDP.  If both ACPI_20_TABLE_GUID and
- * ACPI_TABLE_GUID are found, take the former, which has more features.
- */
 static acpi_physical_address
-__efi_get_rsdp_addr(unsigned long config_tables, unsigned int nr_tables,
-		    bool efi_64)
+__efi_get_rsdp_addr(unsigned long cfg_tbl_pa, unsigned int cfg_tbl_len, bool efi_64)
 {
 	acpi_physical_address rsdp_addr = 0;
 
 #ifdef CONFIG_EFI
-	int i;
-
-	/* Get EFI tables from systab. */
-	for (i = 0; i < nr_tables; i++) {
-		acpi_physical_address table;
-		efi_guid_t guid;
-
-		if (efi_64) {
-			efi_config_table_64_t *tbl = (efi_config_table_64_t *)config_tables + i;
-
-			guid  = tbl->guid;
-			table = tbl->table;
-
-			if (!IS_ENABLED(CONFIG_X86_64) && table >> 32) {
-				debug_putstr("Error getting RSDP address: EFI config table located above 4GB.\n");
-				return 0;
-			}
-		} else {
-			efi_config_table_32_t *tbl = (efi_config_table_32_t *)config_tables + i;
-
-			guid  = tbl->guid;
-			table = tbl->table;
-		}
+	int ret;
 
-		if (!(efi_guidcmp(guid, ACPI_TABLE_GUID)))
-			rsdp_addr = table;
-		else if (!(efi_guidcmp(guid, ACPI_20_TABLE_GUID)))
-			return table;
-	}
+	/*
+	 * Search EFI system tables for RSDP. ACPI_20_TABLE_GUID is preferred over
+	 * ACPI_TABLE_GUID because it has more features.
+	 */
+	ret = efi_find_vendor_table(cfg_tbl_pa, cfg_tbl_len, ACPI_20_TABLE_GUID,
+				    efi_64, (unsigned long *)&rsdp_addr);
+	if (!ret)
+		return rsdp_addr;
+
+	/* No ACPI_20_TABLE_GUID found, fallback to ACPI_TABLE_GUID. */
+	ret = efi_find_vendor_table(cfg_tbl_pa, cfg_tbl_len, ACPI_TABLE_GUID,
+				    efi_64, (unsigned long *)&rsdp_addr);
+	if (ret)
+		debug_putstr("Error getting RSDP address.\n");
 #endif
 	return rsdp_addr;
 }
diff --git a/arch/x86/boot/compressed/efi.c b/arch/x86/boot/compressed/efi.c
index 08ad517b0731..c1ddc72ef4d9 100644
--- a/arch/x86/boot/compressed/efi.c
+++ b/arch/x86/boot/compressed/efi.c
@@ -112,3 +112,68 @@ int efi_get_conf_table(struct boot_params *boot_params, unsigned long *cfg_tbl_p
 
 	return 0;
 }
+
+/* Get vendor table address/guid from EFI config table at the given index */
+static int get_vendor_table(void *cfg_tbl, unsigned int idx,
+			    unsigned long *vendor_tbl_pa,
+			    efi_guid_t *vendor_tbl_guid,
+			    bool efi_64)
+{
+	if (efi_64) {
+		efi_config_table_64_t *tbl_entry =
+			(efi_config_table_64_t *)cfg_tbl + idx;
+
+		if (!IS_ENABLED(CONFIG_X86_64) && tbl_entry->table >> 32) {
+			debug_putstr("Error: EFI config table entry located above 4GB.\n");
+			return -EINVAL;
+		}
+
+		*vendor_tbl_pa		= tbl_entry->table;
+		*vendor_tbl_guid	= tbl_entry->guid;
+
+	} else {
+		efi_config_table_32_t *tbl_entry =
+			(efi_config_table_32_t *)cfg_tbl + idx;
+
+		*vendor_tbl_pa		= tbl_entry->table;
+		*vendor_tbl_guid	= tbl_entry->guid;
+	}
+
+	return 0;
+}
+
+/**
+ * efi_find_vendor_table - Given EFI config table, search it for the physical
+ *                         address of the vendor table associated with GUID.
+ *
+ * @cfg_tbl_pa:        pointer to EFI configuration table
+ * @cfg_tbl_len:       number of entries in EFI configuration table
+ * @guid:              GUID of vendor table
+ * @efi_64:            true if using 64-bit EFI
+ * @vendor_tbl_pa:     location to store physical address of vendor table
+ *
+ * Return: 0 on success. On error, return params are left unchanged.
+ */
+int efi_find_vendor_table(unsigned long cfg_tbl_pa, unsigned int cfg_tbl_len,
+			  efi_guid_t guid, bool efi_64, unsigned long *vendor_tbl_pa)
+{
+	unsigned int i;
+
+	for (i = 0; i < cfg_tbl_len; i++) {
+		unsigned long vendor_tbl_pa_tmp;
+		efi_guid_t vendor_tbl_guid;
+		int ret;
+
+		if (get_vendor_table((void *)cfg_tbl_pa, i,
+				     &vendor_tbl_pa_tmp,
+				     &vendor_tbl_guid, efi_64))
+			return -EINVAL;
+
+		if (!efi_guidcmp(guid, vendor_tbl_guid)) {
+			*vendor_tbl_pa = vendor_tbl_pa_tmp;
+			return 0;
+		}
+	}
+
+	return -ENOENT;
+}
diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
index 1c69592e83da..e9fde1482fbe 100644
--- a/arch/x86/boot/compressed/misc.h
+++ b/arch/x86/boot/compressed/misc.h
@@ -183,6 +183,8 @@ int efi_get_system_table(struct boot_params *boot_params,
 			 unsigned long *sys_tbl_pa, bool *is_efi_64);
 int efi_get_conf_table(struct boot_params *boot_params, unsigned long *cfg_tbl_pa,
 		       unsigned int *cfg_tbl_len, bool *is_efi_64);
+int efi_find_vendor_table(unsigned long cfg_tbl_pa, unsigned int cfg_tbl_len,
+			  efi_guid_t guid, bool efi_64, unsigned long *vendor_tbl_pa);
 #else
 static inline int
 efi_get_system_table(struct boot_params *boot_params,
@@ -197,6 +199,13 @@ efi_get_conf_table(struct boot_params *boot_params, unsigned long *cfg_tbl_pa,
 {
 	return -ENOENT;
 }
+
+static inline int
+efi_find_vendor_table(unsigned long cfg_tbl_pa, unsigned int cfg_tbl_len,
+		      efi_guid_t guid, bool efi_64, unsigned long *vendor_tbl_pa)
+{
+	return -ENOENT;
+}
 #endif /* CONFIG_EFI */
 
 #endif /* BOOT_COMPRESSED_MISC_H */
-- 
2.25.1



* [PATCH v8 27/40] x86/boot: Add Confidential Computing type to setup_data
  2021-12-10 15:42 [PATCH v8 00/40] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (25 preceding siblings ...)
  2021-12-10 15:43 ` [PATCH v8 26/40] x86/compressed/acpi: move EFI vendor " Brijesh Singh
@ 2021-12-10 15:43 ` Brijesh Singh
  2021-12-10 19:12   ` Dave Hansen
  2022-01-06 22:48   ` Venu Busireddy
  2021-12-10 15:43 ` [PATCH v8 28/40] KVM: SEV: Add documentation for SEV-SNP CPUID Enforcement Brijesh Singh
                   ` (13 subsequent siblings)
  40 siblings, 2 replies; 183+ messages in thread
From: Brijesh Singh @ 2021-12-10 15:43 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

When launching encrypted guests, the hypervisor may need to provide
additional information during guest boot. When booting under an EFI-based
BIOS, the EFI configuration table contains an entry for the confidential
computing blob that holds the required information.

To support booting encrypted guests on a non-EFI VM, the hypervisor needs
to pass this additional information to the kernel by a different method.

For this purpose, introduce the SETUP_CC_BLOB type in setup_data to hold
the physical address of the confidential computing blob. The boot loader
or hypervisor may choose to use this method instead of the EFI
configuration table. When scanning for the CC blob location, setup_data
should take precedence over the EFI configuration table.

In AMD SEV-SNP, the CC blob contains the addresses of the secrets and CPUID
pages. The secrets page includes information such as a VM-to-PSP
communication key, and the CPUID page contains PSP-filtered CPUID values.
Define the AMD SEV confidential computing blob structure.

While at it, define the EFI GUID for the confidential computing blob.

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
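To illustrate the intended precedence (setup_data first, EFI configuration
table second), here is a rough userspace sketch. It is not part of this patch
and makes several assumptions: the list node is a hypothetical simplification
of struct setup_data that uses plain pointers instead of physical addresses,
the EFI lookup result is passed in by the caller, and rejecting a blob whose
magic does not match is a policy chosen for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SETUP_CC_BLOB		7
#define CC_BLOB_SEV_HDR_MAGIC	0x45444d41

/* Same fields as struct cc_blob_sev_info in this patch. */
struct mock_cc_blob {
	uint32_t magic;
	uint16_t version;
	uint16_t reserved;
	uint64_t secrets_phys;
	uint32_t secrets_len;
	uint64_t cpuid_phys;
	uint32_t cpuid_len;
};

/* Hypothetical list node; the real setup_data chains physical addresses. */
struct mock_setup_data {
	const struct mock_setup_data *next;
	uint32_t type;
	const struct mock_cc_blob *blob;
};

static const struct mock_cc_blob *
mock_find_cc_blob(const struct mock_setup_data *sd,
		  const struct mock_cc_blob *efi_blob)
{
	const struct mock_cc_blob *blob = efi_blob;

	/* A SETUP_CC_BLOB entry, if present, takes precedence. */
	for (; sd; sd = sd->next) {
		if (sd->type == SETUP_CC_BLOB) {
			blob = sd->blob;
			break;
		}
	}

	/* Assumption: a blob without the expected magic is rejected. */
	if (!blob || blob->magic != CC_BLOB_SEV_HDR_MAGIC)
		return NULL;

	return blob;
}
```

Stored little-endian, the magic value 0x45444d41 reads as the ASCII bytes
"AMDE", which makes the header easy to spot in a memory dump.
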
 arch/x86/include/asm/sev.h            | 12 ++++++++++++
 arch/x86/include/uapi/asm/bootparam.h |  1 +
 include/linux/efi.h                   |  1 +
 3 files changed, 14 insertions(+)

diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index f7cbd5164136..f42fbe3c332f 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -44,6 +44,18 @@ struct es_em_ctxt {
 
 void do_vc_no_ghcb(struct pt_regs *regs, unsigned long exit_code);
 
+/* AMD SEV Confidential computing blob structure */
+#define CC_BLOB_SEV_HDR_MAGIC	0x45444d41
+struct cc_blob_sev_info {
+	u32 magic;
+	u16 version;
+	u16 reserved;
+	u64 secrets_phys;
+	u32 secrets_len;
+	u64 cpuid_phys;
+	u32 cpuid_len;
+};
+
 static inline u64 lower_bits(u64 val, unsigned int bits)
 {
 	u64 mask = (1ULL << bits) - 1;
diff --git a/arch/x86/include/uapi/asm/bootparam.h b/arch/x86/include/uapi/asm/bootparam.h
index b25d3f82c2f3..1ac5acca72ce 100644
--- a/arch/x86/include/uapi/asm/bootparam.h
+++ b/arch/x86/include/uapi/asm/bootparam.h
@@ -10,6 +10,7 @@
 #define SETUP_EFI			4
 #define SETUP_APPLE_PROPERTIES		5
 #define SETUP_JAILHOUSE			6
+#define SETUP_CC_BLOB			7
 
 #define SETUP_INDIRECT			(1<<31)
 
diff --git a/include/linux/efi.h b/include/linux/efi.h
index dbd39b20e034..a022aed7adb3 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -344,6 +344,7 @@ void efi_native_runtime_setup(void);
 #define EFI_CERT_SHA256_GUID			EFI_GUID(0xc1c41626, 0x504c, 0x4092, 0xac, 0xa9, 0x41, 0xf9, 0x36, 0x93, 0x43, 0x28)
 #define EFI_CERT_X509_GUID			EFI_GUID(0xa5c059a1, 0x94e4, 0x4aa7, 0x87, 0xb5, 0xab, 0x15, 0x5c, 0x2b, 0xf0, 0x72)
 #define EFI_CERT_X509_SHA256_GUID		EFI_GUID(0x3bd2a492, 0x96c0, 0x4079, 0xb4, 0x20, 0xfc, 0xf9, 0x8e, 0xf1, 0x03, 0xed)
+#define EFI_CC_BLOB_GUID			EFI_GUID(0x067b1f5f, 0xcf26, 0x44c5, 0x85, 0x54, 0x93, 0xd7, 0x77, 0x91, 0x2d, 0x42)
 
 /*
  * This GUID is used to pass to the kernel proper the struct screen_info
-- 
2.25.1



* [PATCH v8 28/40] KVM: SEV: Add documentation for SEV-SNP CPUID Enforcement
  2021-12-10 15:42 [PATCH v8 00/40] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (26 preceding siblings ...)
  2021-12-10 15:43 ` [PATCH v8 27/40] x86/boot: Add Confidential Computing type to setup_data Brijesh Singh
@ 2021-12-10 15:43 ` Brijesh Singh
  2022-01-07 13:22   ` Borislav Petkov
  2021-12-10 15:43 ` [PATCH v8 29/40] x86/compressed/64: add support for SEV-SNP CPUID table in #VC handlers Brijesh Singh
                   ` (12 subsequent siblings)
  40 siblings, 1 reply; 183+ messages in thread
From: Brijesh Singh @ 2021-12-10 15:43 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

From: Michael Roth <michael.roth@amd.com>

Update the documentation with SEV-SNP CPUID enforcement.

Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 .../virt/kvm/amd-memory-encryption.rst        | 28 +++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/Documentation/virt/kvm/amd-memory-encryption.rst b/Documentation/virt/kvm/amd-memory-encryption.rst
index 5c081c8c7164..aa8292fa579a 100644
--- a/Documentation/virt/kvm/amd-memory-encryption.rst
+++ b/Documentation/virt/kvm/amd-memory-encryption.rst
@@ -427,6 +427,34 @@ issued by the hypervisor to make the guest ready for execution.
 
 Returns: 0 on success, -negative on error
 
+SEV-SNP CPUID Enforcement
+=========================
+
+SEV-SNP guests can access a special page that contains a table of CPUID values
+that have been validated by the PSP as part of the SNP_LAUNCH_UPDATE firmware
+command. It provides the following assurances regarding the validity of CPUID
+values:
+
+ - Its address is obtained via bootloader/firmware (via CC blob), whose
+   binaries will be measured as part of the SEV-SNP attestation report.
+ - Its initial state will be encrypted/pvalidated, so attempts to modify
+   it during run-time will result in garbage being written, or #VC
+   exceptions being generated due to changes in validation state if the
+   hypervisor tries to swap the backing page.
+ - Attempts to bypass PSP checks by the hypervisor by using a normal page, or a
+   non-CPUID encrypted page will change the measurement provided by the
+   SEV-SNP attestation report.
+ - The CPUID page contents are *not* measured, but attempts to modify the
+   expected contents of a CPUID page as part of guest initialization will be
+   gated by the PSP CPUID enforcement policy checks performed on the page
+   during SNP_LAUNCH_UPDATE, and noticeable later if the guest owner
+   implements their own checks of the CPUID values.
+
+It is important to note that this last assurance is only useful if the kernel
+has taken care to make use of the SEV-SNP CPUID throughout all stages of boot.
+Otherwise guest owner attestation provides no assurance that the kernel wasn't
+fed incorrect values at some point during boot.
+
 References
 ==========
 
-- 
2.25.1



* [PATCH v8 29/40] x86/compressed/64: add support for SEV-SNP CPUID table in #VC handlers
  2021-12-10 15:42 [PATCH v8 00/40] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (27 preceding siblings ...)
  2021-12-10 15:43 ` [PATCH v8 28/40] KVM: SEV: Add documentation for SEV-SNP CPUID Enforcement Brijesh Singh
@ 2021-12-10 15:43 ` Brijesh Singh
  2022-01-13 13:16   ` Borislav Petkov
  2021-12-10 15:43 ` [PATCH v8 30/40] x86/boot: add a pointer to Confidential Computing blob in bootparams Brijesh Singh
                   ` (11 subsequent siblings)
  40 siblings, 1 reply; 183+ messages in thread
From: Brijesh Singh @ 2021-12-10 15:43 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

From: Michael Roth <michael.roth@amd.com>

CPUID instructions generate a #VC exception for SEV-ES/SEV-SNP guests,
which the early #VC handlers are already set up to handle. In the case
of SEV-SNP, guests can use a configurable location in guest memory
that has been pre-populated with a firmware-validated CPUID table to
look up the relevant CPUID values rather than requesting them from the
hypervisor via a VMGEXIT. Add the various hooks in the #VC handlers to
allow CPUID instructions to be handled via the table. The code to
actually configure/enable the table will be added in a subsequent
commit.

Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
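The table lookup and the XSAVE size calculation added below can be tried out
in userspace with a sketch along these lines. It is an illustration only: the
mock_* names are hypothetical, the entry struct drops the unused/reserved
fields of struct snp_cpuid_fn, and mock_is_indexed() covers just a handful of
leaves rather than everything the kernel's cpuid_function_is_indexed() knows
about.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Trimmed-down, hypothetical version of struct snp_cpuid_fn. */
struct mock_cpuid_fn {
	uint32_t eax_in, ecx_in;
	uint32_t eax, ebx, ecx, edx;
};

/* Assumption: only a few leaves are treated as indexed in this sketch. */
static bool mock_is_indexed(uint32_t func)
{
	return func == 0x4 || func == 0x7 || func == 0xB || func == 0xD;
}

/* Return the matching entry, or NULL if the table has no such leaf. */
static const struct mock_cpuid_fn *
mock_cpuid_find(const struct mock_cpuid_fn *tbl, size_t count,
		uint32_t func, uint32_t subfunc)
{
	size_t i;

	for (i = 0; i < count; i++) {
		if (tbl[i].eax_in != func)
			continue;
		if (mock_is_indexed(func) && tbl[i].ecx_in != subfunc)
			continue;
		return &tbl[i];
	}

	return NULL;
}

/*
 * XSAVE area size from leaf 0xD sub-leaves 2..63: EAX is the component
 * size and EBX its offset in the standard (non-compacted) format.
 */
static int mock_xsave_size(const struct mock_cpuid_fn *tbl, size_t count,
			   uint64_t xfeatures_en, uint32_t base_size,
			   bool compacted, uint32_t *size_out)
{
	uint32_t total = base_size;
	uint64_t found = 0;
	size_t i;

	for (i = 0; i < count; i++) {
		const struct mock_cpuid_fn *fn = &tbl[i];
		uint64_t bit;

		if (fn->eax_in != 0xD || fn->ecx_in < 2 || fn->ecx_in > 63)
			continue;

		bit = 1ULL << fn->ecx_in;
		if (!(xfeatures_en & bit) || (found & bit))
			continue;
		found |= bit;

		if (compacted)
			total += fn->eax;
		else if (fn->eax + fn->ebx > total)
			total = fn->eax + fn->ebx;
	}

	/* Every enabled extended feature (bits 63:2) must have an entry. */
	if (found != (xfeatures_en & ~3ULL))
		return -EINVAL;

	*size_out = total;
	return 0;
}
```

In the compacted format the enabled components are packed back to back, so
their sizes are summed; in the standard format each component sits at a fixed
offset, so the total is the largest offset plus size among enabled components.
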
 arch/x86/boot/compressed/sev.c    |   1 +
 arch/x86/include/asm/sev-common.h |   2 +
 arch/x86/kernel/sev-shared.c      | 320 ++++++++++++++++++++++++++++++
 arch/x86/kernel/sev.c             |   1 +
 4 files changed, 324 insertions(+)

diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index 348f7711c3ea..3514feb5b226 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -20,6 +20,7 @@
 #include <asm/fpu/xcr.h>
 #include <asm/ptrace.h>
 #include <asm/svm.h>
+#include <asm/cpuid.h>
 
 #include "error.h"
 
diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h
index 38c14601ae4a..673e6778194b 100644
--- a/arch/x86/include/asm/sev-common.h
+++ b/arch/x86/include/asm/sev-common.h
@@ -152,6 +152,8 @@ struct snp_psc_desc {
 #define GHCB_TERM_PSC			1	/* Page State Change failure */
 #define GHCB_TERM_PVALIDATE		2	/* Pvalidate failure */
 #define GHCB_TERM_NOT_VMPL0		3	/* SNP guest is not running at VMPL-0 */
+#define GHCB_TERM_CPUID			4	/* CPUID-validation failure */
+#define GHCB_TERM_CPUID_HV		5	/* CPUID failure during hypervisor fallback */
 
 #define GHCB_RESP_CODE(v)		((v) & GHCB_MSR_INFO_MASK)
 
diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c
index d89481b31022..dabb425498e0 100644
--- a/arch/x86/kernel/sev-shared.c
+++ b/arch/x86/kernel/sev-shared.c
@@ -14,6 +14,41 @@
 #define has_cpuflag(f)	boot_cpu_has(f)
 #endif
 
+/*
+ * Individual entries of the SEV-SNP CPUID table, as defined by the SEV-SNP
+ * Firmware ABI, Revision 0.9, Section 7.1, Table 14. Note that the XCR0_IN
+ * and XSS_IN are denoted here as __unused/__unused2, since they are not
+ * needed for the current guest implementation, where the sizes of the buffers
+ * needed to store enabled XSAVE-saved features are calculated rather than
+ * encoded in the CPUID table for each possible combination of XCR0_IN/XSS_IN
+ * to save space.
+ */
+struct snp_cpuid_fn {
+	u32 eax_in;
+	u32 ecx_in;
+	u64 __unused;
+	u64 __unused2;
+	u32 eax;
+	u32 ebx;
+	u32 ecx;
+	u32 edx;
+	u64 __reserved;
+} __packed;
+
+/*
+ * SEV-SNP CPUID table header, as defined by the SEV-SNP Firmware ABI,
+ * Revision 0.9, Section 8.14.2.6. Also noted there is the SEV-SNP
+ * firmware-enforced limit of 64 entries per CPUID table.
+ */
+#define SNP_CPUID_COUNT_MAX 64
+
+struct snp_cpuid_info {
+	u32 count;
+	u32 __reserved1;
+	u64 __reserved2;
+	struct snp_cpuid_fn fn[SNP_CPUID_COUNT_MAX];
+} __packed;
+
 /*
  * Since feature negotiation related variables are set early in the boot
  * process they must reside in the .data section so as not to be zeroed
@@ -23,6 +58,20 @@
  */
 static u16 ghcb_version __ro_after_init;
 
+/* Copy of the SNP firmware's CPUID page. */
+static struct snp_cpuid_info cpuid_info_copy __ro_after_init;
+static bool snp_cpuid_initialized __ro_after_init;
+
+/*
+ * These will be initialized based on CPUID table so that non-present
+ * all-zero leaves (for sparse tables) can be differentiated from
+ * invalid/out-of-range leaves. This is needed since all-zero leaves
+ * still need to be post-processed.
+ */
+u32 cpuid_std_range_max __ro_after_init;
+u32 cpuid_hyp_range_max __ro_after_init;
+u32 cpuid_ext_range_max __ro_after_init;
+
 static bool __init sev_es_check_cpu_features(void)
 {
 	if (!has_cpuflag(X86_FEATURE_RDRAND)) {
@@ -246,6 +295,244 @@ static int sev_cpuid_hv(u32 func, u32 subfunc, u32 *eax, u32 *ebx,
 	return 0;
 }
 
+static const struct snp_cpuid_info *
+snp_cpuid_info_get_ptr(void)
+{
+	void *ptr;
+
+	/*
+	 * This may be called early while still running on the initial identity
+	 * mapping. Use RIP-relative addressing to obtain the correct address
+	 * both for identity mapping and after switch-over to kernel virtual
+	 * addresses.
+	 */
+	asm ("lea cpuid_info_copy(%%rip), %0"
+	     : "=r" (ptr)
+	     : "p" (&cpuid_info_copy));
+
+	return ptr;
+}
+
+static inline bool snp_cpuid_active(void)
+{
+	return snp_cpuid_initialized;
+}
+
+static int snp_cpuid_calc_xsave_size(u64 xfeatures_en, u32 base_size,
+				     u32 *xsave_size, bool compacted)
+{
+	const struct snp_cpuid_info *cpuid_info = snp_cpuid_info_get_ptr();
+	u32 xsave_size_total = base_size;
+	u64 xfeatures_found = 0;
+	int i;
+
+	for (i = 0; i < cpuid_info->count; i++) {
+		const struct snp_cpuid_fn *fn = &cpuid_info->fn[i];
+
+		if (!(fn->eax_in == 0xD && fn->ecx_in > 1 && fn->ecx_in < 64))
+			continue;
+		if (!(xfeatures_en & (BIT_ULL(fn->ecx_in))))
+			continue;
+		if (xfeatures_found & (BIT_ULL(fn->ecx_in)))
+			continue;
+
+		xfeatures_found |= (BIT_ULL(fn->ecx_in));
+
+		if (compacted)
+			xsave_size_total += fn->eax;
+		else
+			xsave_size_total = max(xsave_size_total,
+					       fn->eax + fn->ebx);
+	}
+
+	/*
+	 * Either the guest set unsupported XCR0/XSS bits, or the corresponding
+	 * entries in the CPUID table were not present. This is not a valid
+	 * state to be in.
+	 */
+	if (xfeatures_found != (xfeatures_en & GENMASK_ULL(63, 2)))
+		return -EINVAL;
+
+	*xsave_size = xsave_size_total;
+
+	return 0;
+}
+
+static void snp_cpuid_hv(u32 func, u32 subfunc, u32 *eax, u32 *ebx, u32 *ecx,
+			 u32 *edx)
+{
+	/*
+	 * The MSR protocol does not support fetching indexed subfunctions, but is
+	 * sufficient to handle current fallback cases. Should that change,
+	 * make sure to terminate rather than ignoring the index and grabbing
+	 * random values. If this issue arises in the future, handling can be
+	 * added here to use GHCB-page protocol for cases that occur late
+	 * enough in boot that GHCB page is available.
+	 */
+	if (cpuid_function_is_indexed(func) && subfunc)
+		sev_es_terminate(1, GHCB_TERM_CPUID_HV);
+
+	if (sev_cpuid_hv(func, 0, eax, ebx, ecx, edx))
+		sev_es_terminate(1, GHCB_TERM_CPUID_HV);
+}
+
+static bool
+snp_cpuid_find_validated_func(u32 func, u32 subfunc, u32 *eax, u32 *ebx,
+			      u32 *ecx, u32 *edx)
+{
+	const struct snp_cpuid_info *cpuid_info = snp_cpuid_info_get_ptr();
+	int i;
+
+	for (i = 0; i < cpuid_info->count; i++) {
+		const struct snp_cpuid_fn *fn = &cpuid_info->fn[i];
+
+		if (fn->eax_in != func)
+			continue;
+
+		if (cpuid_function_is_indexed(func) && fn->ecx_in != subfunc)
+			continue;
+
+		*eax = fn->eax;
+		*ebx = fn->ebx;
+		*ecx = fn->ecx;
+		*edx = fn->edx;
+
+		return true;
+	}
+
+	return false;
+}
+
+static bool snp_cpuid_check_range(u32 func)
+{
+	if (func <= cpuid_std_range_max ||
+	    (func >= 0x40000000 && func <= cpuid_hyp_range_max) ||
+	    (func >= 0x80000000 && func <= cpuid_ext_range_max))
+		return true;
+
+	return false;
+}
+
+static int snp_cpuid_postprocess(u32 func, u32 subfunc, u32 *eax, u32 *ebx,
+				 u32 *ecx, u32 *edx)
+{
+	u32 ebx2, ecx2, edx2;
+
+	switch (func) {
+	case 0x1:
+		snp_cpuid_hv(func, subfunc, NULL, &ebx2, NULL, &edx2);
+
+		/* initial APIC ID */
+		*ebx = (ebx2 & GENMASK(31, 24)) | (*ebx & GENMASK(23, 0));
+		/* APIC enabled bit */
+		*edx = (edx2 & BIT(9)) | (*edx & ~BIT(9));
+
+		/* OSXSAVE enabled bit */
+		if (native_read_cr4() & X86_CR4_OSXSAVE)
+			*ecx |= BIT(27);
+		break;
+	case 0x7:
+		/* OSPKE enabled bit */
+		*ecx &= ~BIT(4);
+		if (native_read_cr4() & X86_CR4_PKE)
+			*ecx |= BIT(4);
+		break;
+	case 0xB:
+		/* extended APIC ID */
+		snp_cpuid_hv(func, 0, NULL, NULL, NULL, edx);
+		break;
+	case 0xD: {
+		bool compacted = false;
+		u64 xcr0 = 1, xss = 0;
+		u32 xsave_size;
+
+		if (subfunc != 0 && subfunc != 1)
+			return 0;
+
+		if (native_read_cr4() & X86_CR4_OSXSAVE)
+			xcr0 = xgetbv(XCR_XFEATURE_ENABLED_MASK);
+		if (subfunc == 1) {
+			/* Get XSS value if XSAVES is enabled. */
+			if (*eax & BIT(3)) {
+				unsigned long lo, hi;
+
+				asm volatile("rdmsr" : "=a" (lo), "=d" (hi)
+						     : "c" (MSR_IA32_XSS));
+				xss = (hi << 32) | lo;
+			}
+
+			/*
+			 * The PPR and APM aren't clear on what size should be
+			 * encoded in 0xD:0x1:EBX when compaction is not enabled
+			 * by either XSAVEC (feature bit 1) or XSAVES (feature
+			 * bit 3) since SNP-capable hardware has these feature
+			 * bits fixed as 1. KVM sets it to 0 in this case, but
+			 * to avoid this becoming an issue it's safer to simply
+			 * treat this as unsupported for SEV-SNP guests.
+			 */
+			if (!(*eax & (BIT(1) | BIT(3))))
+				return -EINVAL;
+
+			compacted = true;
+		}
+
+		if (snp_cpuid_calc_xsave_size(xcr0 | xss, *ebx, &xsave_size,
+					      compacted))
+			return -EINVAL;
+
+		*ebx = xsave_size;
+		}
+		break;
+	case 0x8000001E:
+		/* extended APIC ID */
+		snp_cpuid_hv(func, subfunc, eax, &ebx2, &ecx2, NULL);
+		/* compute ID */
+		*ebx = (*ebx & GENMASK(31, 8)) | (ebx2 & GENMASK(7, 0));
+		/* node ID */
+		*ecx = (*ecx & GENMASK(31, 8)) | (ecx2 & GENMASK(7, 0));
+		break;
+	default:
+		/* No fix-ups needed, use values as-is. */
+		break;
+	}
+
+	return 0;
+}
+
+/*
+ * Returns -EOPNOTSUPP if feature not enabled. Any other return value should be
+ * treated as fatal by caller.
+ */
+static int snp_cpuid(u32 func, u32 subfunc, u32 *eax, u32 *ebx, u32 *ecx,
+		     u32 *edx)
+{
+	if (!snp_cpuid_active())
+		return -EOPNOTSUPP;
+
+	if (!snp_cpuid_find_validated_func(func, subfunc, eax, ebx, ecx, edx)) {
+		/*
+		 * Some hypervisors will avoid keeping track of CPUID entries
+		 * where all values are zero, since they can be handled the
+		 * same as out-of-range values (all-zero). This is useful here
+		 * as well as it allows virtually all guest configurations to
+		 * work using a single SEV-SNP CPUID table.
+		 *
+		 * To allow for this, there is a need to distinguish between
+		 * out-of-range entries and in-range zero entries, since the
+		 * CPUID table entries are only a template that may need to be
+		 * augmented with additional values for things like
+		 * CPU-specific information during post-processing. So if it's
+		 * not in the table, but is still in the valid range, proceed
+		 * with the post-processing. Otherwise, just return zeros.
+		 */
+		*eax = *ebx = *ecx = *edx = 0;
+		if (!snp_cpuid_check_range(func))
+			return 0;
+	}
+
+	return snp_cpuid_postprocess(func, subfunc, eax, ebx, ecx, edx);
+}
+
 /*
  * Boot VC Handler - This is the first VC handler during boot, there is no GHCB
  * page yet, so it only supports the MSR based communication with the
@@ -253,16 +540,26 @@ static int sev_cpuid_hv(u32 func, u32 subfunc, u32 *eax, u32 *ebx,
  */
 void __init do_vc_no_ghcb(struct pt_regs *regs, unsigned long exit_code)
 {
+	unsigned int subfn = lower_bits(regs->cx, 32);
 	unsigned int fn = lower_bits(regs->ax, 32);
 	u32 eax, ebx, ecx, edx;
+	int ret;
 
 	/* Only CPUID is supported via MSR protocol */
 	if (exit_code != SVM_EXIT_CPUID)
 		goto fail;
 
+	ret = snp_cpuid(fn, subfn, &eax, &ebx, &ecx, &edx);
+	if (ret == 0)
+		goto cpuid_done;
+
+	if (ret != -EOPNOTSUPP)
+		goto fail;
+
 	if (sev_cpuid_hv(fn, 0, &eax, &ebx, &ecx, &edx))
 		goto fail;
 
+cpuid_done:
 	regs->ax = eax;
 	regs->bx = ebx;
 	regs->cx = ecx;
@@ -557,12 +854,35 @@ static enum es_result vc_handle_ioio(struct ghcb *ghcb, struct es_em_ctxt *ctxt)
 	return ret;
 }
 
+static int vc_handle_cpuid_snp(struct pt_regs *regs)
+{
+	u32 eax, ebx, ecx, edx;
+	int ret;
+
+	ret = snp_cpuid(regs->ax, regs->cx, &eax, &ebx, &ecx, &edx);
+	if (ret == 0) {
+		regs->ax = eax;
+		regs->bx = ebx;
+		regs->cx = ecx;
+		regs->dx = edx;
+	}
+
+	return ret;
+}
+
 static enum es_result vc_handle_cpuid(struct ghcb *ghcb,
 				      struct es_em_ctxt *ctxt)
 {
 	struct pt_regs *regs = ctxt->regs;
 	u32 cr4 = native_read_cr4();
 	enum es_result ret;
+	int snp_cpuid_ret;
+
+	snp_cpuid_ret = vc_handle_cpuid_snp(regs);
+	if (snp_cpuid_ret == 0)
+		return ES_OK;
+	if (snp_cpuid_ret != -EOPNOTSUPP)
+		return ES_VMM_ERROR;
 
 	ghcb_set_rax(ghcb, regs->ax);
 	ghcb_set_rcx(ghcb, regs->cx);
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index 21926b094378..32f60602ec29 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -33,6 +33,7 @@
 #include <asm/smp.h>
 #include <asm/cpu.h>
 #include <asm/apic.h>
+#include <asm/cpuid.h>
 
 #define DR7_RESET_VALUE        0x400
 
-- 
2.25.1



* [PATCH v8 30/40] x86/boot: add a pointer to Confidential Computing blob in bootparams
  2021-12-10 15:42 [PATCH v8 00/40] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (28 preceding siblings ...)
  2021-12-10 15:43 ` [PATCH v8 29/40] x86/compressed/64: add support for SEV-SNP CPUID table in #VC handlers Brijesh Singh
@ 2021-12-10 15:43 ` Brijesh Singh
  2022-01-17 18:14   ` Borislav Petkov
  2021-12-10 15:43 ` [PATCH v8 31/40] x86/compressed: add SEV-SNP feature detection/setup Brijesh Singh
                   ` (10 subsequent siblings)
  40 siblings, 1 reply; 183+ messages in thread
From: Brijesh Singh @ 2021-12-10 15:43 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

From: Michael Roth <michael.roth@amd.com>

The previously defined Confidential Computing blob is provided to the
kernel via a setup_data structure or EFI config table entry. Currently
the boot/compressed kernel checks for both in order to read the CPUID
table address from the blob for use with SEV-SNP CPUID enforcement.

To also enable SEV-SNP CPUID enforcement for the run-time kernel,
similar access to the CPUID table is needed early on, while the kernel
is still using the identity-mapped page table set up by boot/compressed
and global pointers need to be accessed via fixup_pointer().

This isn't much of an issue for accessing setup_data, and the EFI
config table helper code currently used in boot/compressed *could* be
used in this case as well since they both rely on identity-mapping.
However, it has some reliance on EFI helpers/string constants that
would need to be accessed via fixup_pointer(), and fixing it up while
making it shareable between boot/compressed and run-time kernel is
fragile and introduces a good bit of ugliness.

Instead, add a boot_params->cc_blob_address pointer that the
boot/compressed kernel can initialize so that the run-time kernel can
access the CC blob from there instead of re-scanning the EFI config
table.
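
As a rough illustration of the layout change described above, the
following user-space sketch checks that shrinking the padding by four
bytes places the new field at 0x13c without moving edid_info. The
struct bp_tail_sketch type and the helper names are hypothetical
stand-ins for the affected slice of struct boot_params, not the real
uapi definition:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Offset of the first field modeled below within the real structure. */
#define BP_TAIL_BASE 0x0c0

/*
 * Hypothetical, trimmed stand-in for the part of struct boot_params that
 * begins at offset 0x0c0. Only the fields around the new pointer are
 * modeled; edid_info is represented as an opaque 0x80-byte array.
 */
struct bp_tail_sketch {
	uint32_t ext_ramdisk_image;	/* 0x0c0 */
	uint32_t ext_ramdisk_size;	/* 0x0c4 */
	uint32_t ext_cmd_line_ptr;	/* 0x0c8 */
	uint8_t  _pad4[112];		/* 0x0cc, shrunk from 116 */
	uint32_t cc_blob_address;	/* 0x13c, carved out of the padding */
	uint8_t  edid_info[0x80];	/* 0x140, unchanged */
} __attribute__((packed));

/* Absolute offset of the new field within the modeled structure. */
static size_t bp_offset_of_cc_blob(void)
{
	return BP_TAIL_BASE + offsetof(struct bp_tail_sketch, cc_blob_address);
}

/* Absolute offset of the field that follows it. */
static size_t bp_offset_of_edid(void)
{
	return BP_TAIL_BASE + offsetof(struct bp_tail_sketch, edid_info);
}
```

Because the field is taken from existing padding, the offsets of all
later members stay the same, which is what keeps the change compatible
with existing boot loaders.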

Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/include/asm/bootparam_utils.h | 1 +
 arch/x86/include/uapi/asm/bootparam.h  | 3 ++-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/bootparam_utils.h b/arch/x86/include/asm/bootparam_utils.h
index 981fe923a59f..53e9b0620d96 100644
--- a/arch/x86/include/asm/bootparam_utils.h
+++ b/arch/x86/include/asm/bootparam_utils.h
@@ -74,6 +74,7 @@ static void sanitize_boot_params(struct boot_params *boot_params)
 			BOOT_PARAM_PRESERVE(hdr),
 			BOOT_PARAM_PRESERVE(e820_table),
 			BOOT_PARAM_PRESERVE(eddbuf),
+			BOOT_PARAM_PRESERVE(cc_blob_address),
 		};
 
 		memset(&scratch, 0, sizeof(scratch));
diff --git a/arch/x86/include/uapi/asm/bootparam.h b/arch/x86/include/uapi/asm/bootparam.h
index 1ac5acca72ce..bea5cdcdf532 100644
--- a/arch/x86/include/uapi/asm/bootparam.h
+++ b/arch/x86/include/uapi/asm/bootparam.h
@@ -188,7 +188,8 @@ struct boot_params {
 	__u32 ext_ramdisk_image;			/* 0x0c0 */
 	__u32 ext_ramdisk_size;				/* 0x0c4 */
 	__u32 ext_cmd_line_ptr;				/* 0x0c8 */
-	__u8  _pad4[116];				/* 0x0cc */
+	__u8  _pad4[112];				/* 0x0cc */
+	__u32 cc_blob_address;				/* 0x13c */
 	struct edid_info edid_info;			/* 0x140 */
 	struct efi_info efi_info;			/* 0x1c0 */
 	__u32 alt_mem_k;				/* 0x1e0 */
-- 
2.25.1



* [PATCH v8 31/40] x86/compressed: add SEV-SNP feature detection/setup
  2021-12-10 15:42 [PATCH v8 00/40] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (29 preceding siblings ...)
  2021-12-10 15:43 ` [PATCH v8 30/40] x86/boot: add a pointer to Confidential Computing blob in bootparams Brijesh Singh
@ 2021-12-10 15:43 ` Brijesh Singh
  2022-01-19 12:55   ` Borislav Petkov
  2021-12-10 15:43 ` [PATCH v8 32/40] x86/compressed: use firmware-validated CPUID for SEV-SNP guests Brijesh Singh
                   ` (9 subsequent siblings)
  40 siblings, 1 reply; 183+ messages in thread
From: Brijesh Singh @ 2021-12-10 15:43 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

From: Michael Roth <michael.roth@amd.com>

Initial/preliminary detection of SEV-SNP is done via the Confidential
Computing blob. Check for it prior to the normal SEV/SME feature
initialization, and add some sanity checks to confirm it agrees with
SEV-SNP CPUID/MSR bits.
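
The flow described above can be sketched in user space roughly as
follows. All sketch_* names are hypothetical, the structures are
simplified, and the numeric values of the type id and magic are
placeholders used only for illustration. The real code terminates the
guest on a mismatch rather than returning false:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_SETUP_CC_BLOB	7u		/* placeholder type id */
#define SKETCH_CC_BLOB_MAGIC	0x45444d41u	/* placeholder magic value */

/* Simplified setup_data header: a singly linked list keyed by type. */
struct sketch_setup_data {
	uint64_t next;
	uint32_t type;
	uint32_t len;
};

/* setup_data payload carrying the address of the CC blob. */
struct sketch_cc_setup_data {
	struct sketch_setup_data header;
	uint32_t cc_blob_address;
};

/* Simplified CC blob: only the fields this sketch looks at. */
struct sketch_cc_blob {
	uint32_t magic;
	uint64_t cpuid_phys;
	uint32_t cpuid_len;
};

/* Walk the list and return the first entry of the given type, or NULL. */
static const struct sketch_setup_data *
sketch_find_setup_data(uint64_t head, uint32_t type)
{
	const struct sketch_setup_data *hdr =
		(const struct sketch_setup_data *)(uintptr_t)head;

	while (hdr) {
		if (hdr->type == type)
			return hdr;
		hdr = (const struct sketch_setup_data *)(uintptr_t)hdr->next;
	}

	return NULL;
}

/* A blob is only trusted if it is present and carries the expected magic. */
static bool sketch_cc_blob_valid(const struct sketch_cc_blob *blob)
{
	return blob && blob->magic == SKETCH_CC_BLOB_MAGIC;
}

/*
 * Preliminary detection (blob found) must agree with the CPUID SEV bit
 * and the SNP bit of the SEV status MSR; any disagreement is an error.
 */
static bool sketch_snp_consistent(bool blob_found, bool cpuid_sev,
				  bool msr_snp)
{
	if (!blob_found)
		return true;

	return cpuid_sev && msr_snp;
}
```

The blob only provides a hint: a hypervisor could supply one without
SEV-SNP actually being enabled, which is why the CPUID and MSR checks
remain the deciding factor.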

Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/boot/compressed/sev.c | 91 +++++++++++++++++++++++++++++++++-
 arch/x86/include/asm/sev.h     | 13 +++++
 arch/x86/kernel/sev-shared.c   | 34 +++++++++++++
 3 files changed, 137 insertions(+), 1 deletion(-)

diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index 3514feb5b226..93e125da12cf 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -291,6 +291,13 @@ static void enforce_vmpl0(void)
 void sev_enable(struct boot_params *bp)
 {
 	unsigned int eax, ebx, ecx, edx;
+	bool snp;
+
+	/*
+	 * Setup/preliminary detection of SEV-SNP. This will be sanity-checked
+	 * against CPUID/MSR values later.
+	 */
+	snp = snp_init(bp);
 
 	/* Check for the SME/SEV support leaf */
 	eax = 0x80000000;
@@ -311,8 +318,11 @@ void sev_enable(struct boot_params *bp)
 	ecx = 0;
 	native_cpuid(&eax, &ebx, &ecx, &edx);
 	/* Check whether SEV is supported */
-	if (!(eax & BIT(1)))
+	if (!(eax & BIT(1))) {
+		if (snp)
+			error("SEV-SNP support indicated by CC blob, but not CPUID.");
 		return;
+	}
 
 	/* Set the SME mask if this is an SEV guest. */
 	sev_status   = rd_sev_status_msr();
@@ -337,5 +347,84 @@ void sev_enable(struct boot_params *bp)
 		enforce_vmpl0();
 	}
 
+	if (snp && !(sev_status & MSR_AMD64_SEV_SNP_ENABLED))
+		error("SEV-SNP support indicated by CC blob, but not SEV status MSR.");
+
 	sme_me_mask = BIT_ULL(ebx & 0x3f);
 }
+
+/* Search for Confidential Computing blob in the EFI config table. */
+static struct cc_blob_sev_info *snp_find_cc_blob_efi(struct boot_params *bp)
+{
+	struct cc_blob_sev_info *cc_info;
+	unsigned long conf_table_pa;
+	unsigned int conf_table_len;
+	bool efi_64;
+	int ret;
+
+	ret = efi_get_conf_table(bp, &conf_table_pa, &conf_table_len, &efi_64);
+	if (ret)
+		return NULL;
+
+	ret = efi_find_vendor_table(conf_table_pa, conf_table_len,
+				    EFI_CC_BLOB_GUID, efi_64,
+				    (unsigned long *)&cc_info);
+	if (ret)
+		return NULL;
+
+	return cc_info;
+}
+
+/*
+ * Initial set up of SEV-SNP relies on information provided by the
+ * Confidential Computing blob, which can be passed to the boot kernel
+ * by firmware/bootloader in the following ways:
+ *
+ * - via an entry in the EFI config table
+ * - via a setup_data structure, as defined by the Linux Boot Protocol
+ *
+ * Scan for the blob in that order.
+ */
+static struct cc_blob_sev_info *snp_find_cc_blob(struct boot_params *bp)
+{
+	struct cc_blob_sev_info *cc_info;
+
+	cc_info = snp_find_cc_blob_efi(bp);
+	if (cc_info)
+		goto found_cc_info;
+
+	cc_info = snp_find_cc_blob_setup_data(bp);
+	if (!cc_info)
+		return NULL;
+
+found_cc_info:
+	if (cc_info->magic != CC_BLOB_SEV_HDR_MAGIC)
+		sev_es_terminate(0, GHCB_SNP_UNSUPPORTED);
+
+	return cc_info;
+}
+
+bool snp_init(struct boot_params *bp)
+{
+	struct cc_blob_sev_info *cc_info;
+
+	if (!bp)
+		return false;
+
+	cc_info = snp_find_cc_blob(bp);
+	if (!cc_info)
+		return false;
+
+	/*
+	 * Pass run-time kernel a pointer to CC info via boot_params so EFI
+	 * config table doesn't need to be searched again during early startup
+	 * phase.
+	 */
+	bp->cc_blob_address = (u32)(unsigned long)cc_info;
+
+	/*
+	 * Indicate SEV-SNP based on presence of SEV-SNP-specific CC blob.
+	 * Subsequent checks will verify SEV-SNP CPUID/MSR bits.
+	 */
+	return true;
+}
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index f42fbe3c332f..cd189c20bcc4 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -11,6 +11,7 @@
 #include <linux/types.h>
 #include <asm/insn.h>
 #include <asm/sev-common.h>
+#include <asm/bootparam.h>
 
 #define GHCB_PROTOCOL_MIN	1ULL
 #define GHCB_PROTOCOL_MAX	2ULL
@@ -145,6 +146,17 @@ void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op
 void snp_set_memory_shared(unsigned long vaddr, unsigned int npages);
 void snp_set_memory_private(unsigned long vaddr, unsigned int npages);
 void snp_set_wakeup_secondary_cpu(void);
+bool snp_init(struct boot_params *bp);
+/*
+ * TODO: These are exported only temporarily while boot/compressed/sev.c is
+ * the only user. This is to avoid unused function warnings for kernel/sev.c
+ * during the build of kernel proper.
+ *
+ * Once the code is added to consume these in kernel proper these functions
+ * can be moved back to being statically-scoped to units that pull in
+ * sev-shared.c via #include and these declarations can be dropped.
+ */
+struct cc_blob_sev_info *snp_find_cc_blob_setup_data(struct boot_params *bp);
 #else
 static inline void sev_es_ist_enter(struct pt_regs *regs) { }
 static inline void sev_es_ist_exit(void) { }
@@ -162,6 +174,7 @@ static inline void __init snp_prep_memory(unsigned long paddr, unsigned int sz,
 static inline void snp_set_memory_shared(unsigned long vaddr, unsigned int npages) { }
 static inline void snp_set_memory_private(unsigned long vaddr, unsigned int npages) { }
 static inline void snp_set_wakeup_secondary_cpu(void) { }
+static inline bool snp_init(struct boot_params *bp) { return false; }
 #endif
 
 #endif
diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c
index dabb425498e0..bd58a4ce29c8 100644
--- a/arch/x86/kernel/sev-shared.c
+++ b/arch/x86/kernel/sev-shared.c
@@ -934,3 +934,37 @@ static enum es_result vc_handle_rdtsc(struct ghcb *ghcb,
 
 	return ES_OK;
 }
+
+struct cc_setup_data {
+	struct setup_data header;
+	u32 cc_blob_address;
+};
+
+static struct cc_setup_data *get_cc_setup_data(struct boot_params *bp)
+{
+	struct setup_data *hdr = (struct setup_data *)bp->hdr.setup_data;
+
+	while (hdr) {
+		if (hdr->type == SETUP_CC_BLOB)
+			return (struct cc_setup_data *)hdr;
+		hdr = (struct setup_data *)hdr->next;
+	}
+
+	return NULL;
+}
+
+/*
+ * Search for a Confidential Computing blob passed in as a setup_data entry
+ * via the Linux Boot Protocol.
+ */
+struct cc_blob_sev_info *
+snp_find_cc_blob_setup_data(struct boot_params *bp)
+{
+	struct cc_setup_data *sd;
+
+	sd = get_cc_setup_data(bp);
+	if (!sd)
+		return NULL;
+
+	return (struct cc_blob_sev_info *)(unsigned long)sd->cc_blob_address;
+}
-- 
2.25.1



* [PATCH v8 32/40] x86/compressed: use firmware-validated CPUID for SEV-SNP guests
  2021-12-10 15:42 [PATCH v8 00/40] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (30 preceding siblings ...)
  2021-12-10 15:43 ` [PATCH v8 31/40] x86/compressed: add SEV-SNP feature detection/setup Brijesh Singh
@ 2021-12-10 15:43 ` Brijesh Singh
  2022-01-20 12:18   ` Borislav Petkov
  2021-12-10 15:43 ` [PATCH v8 33/40] x86/compressed/64: add identity mapping for Confidential Computing blob Brijesh Singh
                   ` (8 subsequent siblings)
  40 siblings, 1 reply; 183+ messages in thread
From: Brijesh Singh @ 2021-12-10 15:43 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

From: Michael Roth <michael.roth@amd.com>

SEV-SNP guests will be provided the location of special 'secrets' and
'CPUID' pages via the Confidential Computing blob. This blob is
provided to the boot kernel either through an EFI config table entry,
or via a setup_data structure as defined by the Linux Boot Protocol.

Locate the Confidential Computing blob from these sources and, if found,
use the provided CPUID page/table address to create a copy that the
boot kernel will use when servicing cpuid instructions via a #VC
handler.
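
A minimal user-space sketch of the copy-then-lookup idea, under some
assumptions: the sketch_* names, the table capacity and the entry
layout are hypothetical and simplified, entries are matched on both
leaf and subleaf (the patch only compares the subleaf for indexed
functions), and validation failures return false where the real code
terminates the guest:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_CPUID_COUNT_MAX 64	/* placeholder capacity */

/* One table entry: inputs (leaf, subleaf) and the validated outputs. */
struct sketch_cpuid_fn {
	uint32_t eax_in, ecx_in;
	uint32_t eax, ebx, ecx, edx;
};

struct sketch_cpuid_info {
	uint32_t count;
	struct sketch_cpuid_fn fn[SKETCH_CPUID_COUNT_MAX];
};

struct sketch_cpuid_regs {
	uint32_t eax, ebx, ecx, edx;
};

/* Private copy, so later lookups do not depend on the firmware mapping. */
static struct sketch_cpuid_info sketch_table;
static bool sketch_table_ready;

/* Validate the firmware-provided table and copy it. */
static bool sketch_cpuid_info_create(const struct sketch_cpuid_info *fw)
{
	if (!fw || !fw->count || fw->count > SKETCH_CPUID_COUNT_MAX)
		return false;

	memcpy(&sketch_table, fw, sizeof(sketch_table));
	sketch_table_ready = true;

	return true;
}

/* Serve a CPUID request from the copy; false if there is no entry. */
static bool sketch_cpuid_lookup(uint32_t func, uint32_t subfunc,
				struct sketch_cpuid_regs *out)
{
	uint32_t i;

	if (!sketch_table_ready)
		return false;

	for (i = 0; i < sketch_table.count; i++) {
		const struct sketch_cpuid_fn *fn = &sketch_table.fn[i];

		if (fn->eax_in != func || fn->ecx_in != subfunc)
			continue;

		out->eax = fn->eax;
		out->ebx = fn->ebx;
		out->ecx = fn->ecx;
		out->edx = fn->edx;

		return true;
	}

	return false;
}
```

A #VC handler would call the lookup first and only fall back to the
hypervisor when the table is not in use.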

Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/boot/compressed/sev.c | 13 ++++++++++
 arch/x86/include/asm/sev.h     |  1 +
 arch/x86/kernel/sev-shared.c   | 43 ++++++++++++++++++++++++++++++++++
 3 files changed, 57 insertions(+)

diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index 93e125da12cf..29dfb34b5907 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -415,6 +415,19 @@ bool snp_init(struct boot_params *bp)
 	if (!cc_info)
 		return false;
 
+	/*
+	 * If SEV-SNP-specific Confidential Computing blob is present, then
+	 * firmware/bootloader have indicated SEV-SNP support. Verifying this
+	 * involves CPUID checks which will be more reliable if the SEV-SNP
+	 * CPUID table is used. See comments for snp_cpuid_info_create() for
+	 * more details.
+	 */
+	snp_cpuid_info_create(cc_info);
+
+	/* SEV-SNP CPUID table should be set up now. */
+	if (!snp_cpuid_active())
+		sev_es_terminate(1, GHCB_TERM_CPUID);
+
 	/*
 	 * Pass run-time kernel a pointer to CC info via boot_params so EFI
 	 * config table doesn't need to be searched again during early startup
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index cd189c20bcc4..4fa7ca20d7c9 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -157,6 +157,7 @@ bool snp_init(struct boot_params *bp);
  * sev-shared.c via #include and these declarations can be dropped.
  */
 struct cc_blob_sev_info *snp_find_cc_blob_setup_data(struct boot_params *bp);
+void snp_cpuid_info_create(const struct cc_blob_sev_info *cc_info);
 #else
 static inline void sev_es_ist_enter(struct pt_regs *regs) { }
 static inline void sev_es_ist_exit(void) { }
diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c
index bd58a4ce29c8..5cb8f87df4b3 100644
--- a/arch/x86/kernel/sev-shared.c
+++ b/arch/x86/kernel/sev-shared.c
@@ -403,6 +403,23 @@ snp_cpuid_find_validated_func(u32 func, u32 subfunc, u32 *eax, u32 *ebx,
 	return false;
 }
 
+static void __init snp_cpuid_set_ranges(void)
+{
+	const struct snp_cpuid_info *cpuid_info = snp_cpuid_info_get_ptr();
+	int i;
+
+	for (i = 0; i < cpuid_info->count; i++) {
+		const struct snp_cpuid_fn *fn = &cpuid_info->fn[i];
+
+		if (fn->eax_in == 0x0)
+			cpuid_std_range_max = fn->eax;
+		else if (fn->eax_in == 0x40000000)
+			cpuid_hyp_range_max = fn->eax;
+		else if (fn->eax_in == 0x80000000)
+			cpuid_ext_range_max = fn->eax;
+	}
+}
+
 static bool snp_cpuid_check_range(u32 func)
 {
 	if (func <= cpuid_std_range_max ||
@@ -968,3 +985,29 @@ snp_find_cc_blob_setup_data(struct boot_params *bp)
 
 	return (struct cc_blob_sev_info *)(unsigned long)sd->cc_blob_address;
 }
+
+/*
+ * Initialize the kernel's copy of the SEV-SNP CPUID table, and set up the
+ * pointer that will be used to access it.
+ *
+ * Maintaining a direct mapping of the SEV-SNP CPUID table used by firmware
+ * would be possible as an alternative, but the approach is brittle since the
+ * mapping needs to be updated in sync with all the changes to virtual memory
+ * layout and related mapping facilities throughout the boot process.
+ */
+void __init snp_cpuid_info_create(const struct cc_blob_sev_info *cc_info)
+{
+	const struct snp_cpuid_info *cpuid_info_fw, *cpuid_info;
+
+	if (!cc_info || !cc_info->cpuid_phys || cc_info->cpuid_len < PAGE_SIZE)
+		sev_es_terminate(1, GHCB_TERM_CPUID);
+
+	cpuid_info_fw = (const struct snp_cpuid_info *)cc_info->cpuid_phys;
+	if (!cpuid_info_fw->count || cpuid_info_fw->count > SNP_CPUID_COUNT_MAX)
+		sev_es_terminate(1, GHCB_TERM_CPUID);
+
+	cpuid_info = snp_cpuid_info_get_ptr();
+	memcpy((void *)cpuid_info, cpuid_info_fw, sizeof(*cpuid_info));
+	snp_cpuid_initialized = true;
+	snp_cpuid_set_ranges();
+}
-- 
2.25.1



* [PATCH v8 33/40] x86/compressed/64: add identity mapping for Confidential Computing blob
  2021-12-10 15:42 [PATCH v8 00/40] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (31 preceding siblings ...)
  2021-12-10 15:43 ` [PATCH v8 32/40] x86/compressed: use firmware-validated CPUID for SEV-SNP guests Brijesh Singh
@ 2021-12-10 15:43 ` Brijesh Singh
  2021-12-10 19:52   ` Dave Hansen
  2022-01-25 13:48   ` Borislav Petkov
  2021-12-10 15:43 ` [PATCH v8 34/40] x86/sev: add SEV-SNP feature detection/setup Brijesh Singh
                   ` (7 subsequent siblings)
  40 siblings, 2 replies; 183+ messages in thread
From: Brijesh Singh @ 2021-12-10 15:43 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

From: Michael Roth <michael.roth@amd.com>

The run-time kernel will need to access the Confidential Computing
blob very early in boot to access the CPUID table it points to. At
that stage of boot it will be relying on the identity-mapped page table
set up by boot/compressed kernel, so make sure the blob and the CPUID
table it points to are mapped in advance.
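
To make the two mapped ranges concrete, here is a small sketch of how
the address ranges for the blob and the CPUID table could be derived.
It assumes the mapping helper rounds to a 2 MiB granule, which is an
assumption of this sketch rather than something stated in this patch.
The sketch_* names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed mapping granule (2 MiB); an assumption made for illustration. */
#define SKETCH_MAP_GRANULE (2ULL << 20)

struct sketch_range {
	uint64_t start;	/* inclusive */
	uint64_t end;	/* exclusive */
};

/* Expand [addr, addr + len) outward to granule boundaries. */
static struct sketch_range sketch_ident_range(uint64_t addr, uint64_t len)
{
	struct sketch_range r;

	r.start = addr & ~(SKETCH_MAP_GRANULE - 1);
	r.end = (addr + len + SKETCH_MAP_GRANULE - 1) &
		~(SKETCH_MAP_GRANULE - 1);

	return r;
}
```

One range would cover the blob itself (its address plus the size of
the blob structure) and a second range the CPUID table, using the
cpuid_phys and cpuid_len values read from the blob.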

Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/boot/compressed/ident_map_64.c | 26 ++++++++++++++++++++++++-
 arch/x86/boot/compressed/misc.h         |  4 ++++
 arch/x86/boot/compressed/sev.c          |  2 +-
 3 files changed, 30 insertions(+), 2 deletions(-)

diff --git a/arch/x86/boot/compressed/ident_map_64.c b/arch/x86/boot/compressed/ident_map_64.c
index ef77453cc629..2a99b3274ec2 100644
--- a/arch/x86/boot/compressed/ident_map_64.c
+++ b/arch/x86/boot/compressed/ident_map_64.c
@@ -37,6 +37,8 @@
 #include <asm/setup.h>	/* For COMMAND_LINE_SIZE */
 #undef _SETUP
 
+#include <asm/sev.h> /* For ConfidentialComputing blob */
+
 extern unsigned long get_cmd_line_ptr(void);
 
 /* Used by PAGE_KERN* macros: */
@@ -106,6 +108,27 @@ static void add_identity_map(unsigned long start, unsigned long end)
 		error("Error: kernel_ident_mapping_init() failed\n");
 }
 
+static void sev_prep_identity_maps(void)
+{
+	/*
+	 * The ConfidentialComputing blob is used very early in uncompressed
+	 * kernel to find the in-memory cpuid table to handle cpuid
+	 * instructions. Make sure an identity-mapping exists so it can be
+	 * accessed after switchover.
+	 */
+	if (sev_snp_enabled()) {
+		struct cc_blob_sev_info *cc_info =
+			(void *)(unsigned long)boot_params->cc_blob_address;
+
+		add_identity_map((unsigned long)cc_info,
+				 (unsigned long)cc_info + sizeof(*cc_info));
+		add_identity_map((unsigned long)cc_info->cpuid_phys,
+				 (unsigned long)cc_info->cpuid_phys + cc_info->cpuid_len);
+	}
+
+	sev_verify_cbit(top_level_pgt);
+}
+
 /* Locates and clears a region for a new top level page table. */
 void initialize_identity_maps(void *rmode)
 {
@@ -163,8 +186,9 @@ void initialize_identity_maps(void *rmode)
 	cmdline = get_cmd_line_ptr();
 	add_identity_map(cmdline, cmdline + COMMAND_LINE_SIZE);
 
+	sev_prep_identity_maps();
+
 	/* Load the new page-table. */
-	sev_verify_cbit(top_level_pgt);
 	write_cr3(top_level_pgt);
 }
 
diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
index e9fde1482fbe..4b02bf5c8582 100644
--- a/arch/x86/boot/compressed/misc.h
+++ b/arch/x86/boot/compressed/misc.h
@@ -127,6 +127,8 @@ void sev_es_shutdown_ghcb(void);
 extern bool sev_es_check_ghcb_fault(unsigned long address);
 void snp_set_page_private(unsigned long paddr);
 void snp_set_page_shared(unsigned long paddr);
+bool sev_snp_enabled(void);
+
 #else
 static inline void sev_enable(struct boot_params *bp) { }
 static inline void sev_es_shutdown_ghcb(void) { }
@@ -136,6 +138,8 @@ static inline bool sev_es_check_ghcb_fault(unsigned long address)
 }
 static inline void snp_set_page_private(unsigned long paddr) { }
 static inline void snp_set_page_shared(unsigned long paddr) { }
+static inline bool sev_snp_enabled(void) { return false; }
+
 #endif
 
 /* acpi.c */
diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index 29dfb34b5907..c2bf99522e5e 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -120,7 +120,7 @@ static enum es_result vc_read_mem(struct es_em_ctxt *ctxt,
 /* Include code for early handlers */
 #include "../../kernel/sev-shared.c"
 
-static inline bool sev_snp_enabled(void)
+bool sev_snp_enabled(void)
 {
 	return sev_status & MSR_AMD64_SEV_SNP_ENABLED;
 }
-- 
2.25.1



* [PATCH v8 34/40] x86/sev: add SEV-SNP feature detection/setup
  2021-12-10 15:42 [PATCH v8 00/40] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (32 preceding siblings ...)
  2021-12-10 15:43 ` [PATCH v8 33/40] x86/compressed/64: add identity mapping for Confidential Computing blob Brijesh Singh
@ 2021-12-10 15:43 ` Brijesh Singh
  2022-01-25 18:43   ` Borislav Petkov
  2021-12-10 15:43 ` [PATCH v8 35/40] x86/sev: use firmware-validated CPUID for SEV-SNP guests Brijesh Singh
                   ` (6 subsequent siblings)
  40 siblings, 1 reply; 183+ messages in thread
From: Brijesh Singh @ 2021-12-10 15:43 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

From: Michael Roth <michael.roth@amd.com>

Initial/preliminary detection of SEV-SNP is done via the Confidential
Computing blob. Check for it prior to the normal SEV/SME feature
initialization, and add some sanity checks to confirm it agrees with
SEV-SNP CPUID/MSR bits.

Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/include/asm/sev.h         |  3 +-
 arch/x86/kernel/sev-shared.c       |  2 +-
 arch/x86/kernel/sev.c              | 65 ++++++++++++++++++++++++++++++
 arch/x86/mm/mem_encrypt_identity.c |  8 ++++
 4 files changed, 76 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 4fa7ca20d7c9..4d32af1348ed 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -147,6 +147,7 @@ void snp_set_memory_shared(unsigned long vaddr, unsigned int npages);
 void snp_set_memory_private(unsigned long vaddr, unsigned int npages);
 void snp_set_wakeup_secondary_cpu(void);
 bool snp_init(struct boot_params *bp);
+void snp_abort(void);
 /*
  * TODO: These are exported only temporarily while boot/compressed/sev.c is
  * the only user. This is to avoid unused function warnings for kernel/sev.c
@@ -156,7 +157,6 @@ bool snp_init(struct boot_params *bp);
  * can be moved back to being statically-scoped to units that pull in
  * sev-shared.c via #include and these declarations can be dropped.
  */
-struct cc_blob_sev_info *snp_find_cc_blob_setup_data(struct boot_params *bp);
 void snp_cpuid_info_create(const struct cc_blob_sev_info *cc_info);
 #else
 static inline void sev_es_ist_enter(struct pt_regs *regs) { }
@@ -176,6 +176,7 @@ static inline void snp_set_memory_shared(unsigned long vaddr, unsigned int npage
 static inline void snp_set_memory_private(unsigned long vaddr, unsigned int npages) { }
 static inline void snp_set_wakeup_secondary_cpu(void) { }
 static inline bool snp_init(struct boot_params *bp) { return false; }
+static inline void snp_abort(void) { }
 #endif
 
 #endif
diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c
index 5cb8f87df4b3..72836abcdbe2 100644
--- a/arch/x86/kernel/sev-shared.c
+++ b/arch/x86/kernel/sev-shared.c
@@ -974,7 +974,7 @@ static struct cc_setup_data *get_cc_setup_data(struct boot_params *bp)
  * Search for a Confidential Computing blob passed in as a setup_data entry
  * via the Linux Boot Protocol.
  */
-struct cc_blob_sev_info *
+static struct cc_blob_sev_info *
 snp_find_cc_blob_setup_data(struct boot_params *bp)
 {
 	struct cc_setup_data *sd;
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index 32f60602ec29..0e5c45eacc77 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -1949,3 +1949,68 @@ bool __init handle_vc_boot_ghcb(struct pt_regs *regs)
 	while (true)
 		halt();
 }
+
+/*
+ * Initial set up of SEV-SNP relies on information provided by the
+ * Confidential Computing blob, which can be passed to the kernel
+ * in the following ways, depending on how it is booted:
+ *
+ * - when booted via the boot/decompression kernel:
+ *   - via boot_params
+ *
+ * - when booted directly by firmware/bootloader (e.g. CONFIG_PVH):
+ *   - via a setup_data entry, as defined by the Linux Boot Protocol
+ *
+ * Scan for the blob in that order.
+ */
+static struct cc_blob_sev_info *snp_find_cc_blob(struct boot_params *bp)
+{
+	struct cc_blob_sev_info *cc_info;
+
+	/* Boot kernel would have passed the CC blob via boot_params. */
+	if (bp->cc_blob_address) {
+		cc_info = (struct cc_blob_sev_info *)
+			  (unsigned long)bp->cc_blob_address;
+		goto found_cc_info;
+	}
+
+	/*
+	 * If kernel was booted directly, without the use of the
+	 * boot/decompression kernel, the CC blob may have been passed via
+	 * setup_data instead.
+	 */
+	cc_info = snp_find_cc_blob_setup_data(bp);
+	if (!cc_info)
+		return NULL;
+
+found_cc_info:
+	if (cc_info->magic != CC_BLOB_SEV_HDR_MAGIC)
+		sev_es_terminate(1, GHCB_SNP_UNSUPPORTED);
+
+	return cc_info;
+}
+
+bool __init snp_init(struct boot_params *bp)
+{
+	struct cc_blob_sev_info *cc_info;
+
+	if (!bp)
+		return false;
+
+	cc_info = snp_find_cc_blob(bp);
+	if (!cc_info)
+		return false;
+
+	/*
+	 * The CC blob will be used later to access the secrets page. Cache
+	 * it here like the boot kernel does.
+	 */
+	bp->cc_blob_address = (u32)(unsigned long)cc_info;
+
+	return true;
+}
+
+void __init snp_abort(void)
+{
+	sev_es_terminate(1, GHCB_SNP_UNSUPPORTED);
+}
diff --git a/arch/x86/mm/mem_encrypt_identity.c b/arch/x86/mm/mem_encrypt_identity.c
index 3f0abb403340..2f723e106ed3 100644
--- a/arch/x86/mm/mem_encrypt_identity.c
+++ b/arch/x86/mm/mem_encrypt_identity.c
@@ -44,6 +44,7 @@
 #include <asm/setup.h>
 #include <asm/sections.h>
 #include <asm/cmdline.h>
+#include <asm/sev.h>
 
 #include "mm_internal.h"
 
@@ -508,8 +509,11 @@ void __init sme_enable(struct boot_params *bp)
 	bool active_by_default;
 	unsigned long me_mask;
 	char buffer[16];
+	bool snp;
 	u64 msr;
 
+	snp = snp_init(bp);
+
 	/* Check for the SME/SEV support leaf */
 	eax = 0x80000000;
 	ecx = 0;
@@ -541,6 +545,10 @@ void __init sme_enable(struct boot_params *bp)
 	sev_status   = __rdmsr(MSR_AMD64_SEV);
 	feature_mask = (sev_status & MSR_AMD64_SEV_ENABLED) ? AMD_SEV_BIT : AMD_SME_BIT;
 
+	/* The SEV-SNP CC blob should never be present unless SEV-SNP is enabled. */
+	if (snp && !(sev_status & MSR_AMD64_SEV_SNP_ENABLED))
+		snp_abort();
+
 	/* Check if memory encryption is enabled */
 	if (feature_mask == AMD_SME_BIT) {
 		/*
-- 
2.25.1



* [PATCH v8 35/40] x86/sev: use firmware-validated CPUID for SEV-SNP guests
  2021-12-10 15:42 [PATCH v8 00/40] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (33 preceding siblings ...)
  2021-12-10 15:43 ` [PATCH v8 34/40] x86/sev: add SEV-SNP feature detection/setup Brijesh Singh
@ 2021-12-10 15:43 ` Brijesh Singh
  2022-01-26 18:35   ` Borislav Petkov
  2021-12-10 15:43 ` [PATCH v8 36/40] x86/sev: Provide support for SNP guest request NAEs Brijesh Singh
                   ` (5 subsequent siblings)
  40 siblings, 1 reply; 183+ messages in thread
From: Brijesh Singh @ 2021-12-10 15:43 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

From: Michael Roth <michael.roth@amd.com>

SEV-SNP guests will be provided the location of special 'secrets' and
'CPUID' pages via the Confidential Computing blob. This blob is
provided to the run-time kernel either through a boot_params field that
was initialized by the boot/compressed kernel, or via a setup_data
structure as defined by the Linux Boot Protocol.

Locate the Confidential Computing blob from these sources and, if found,
use the provided CPUID page/table address to create a copy that the
run-time kernel will use when servicing CPUID instructions via a #VC
handler.

Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
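For readers following along, the lookup order described in the commit
message (boot_params field first, then setup_data, then a magic check) can
be sketched in plain userspace C. This is only an illustration under stated
assumptions: every type, field and constant below is a hypothetical
stand-in, the magic value is arbitrary, and the real code terminates the
guest on a bad magic instead of returning an error.

```c
#include <stddef.h>
#include <stdint.h>

/* Arbitrary placeholder, NOT the real CC_BLOB_SEV_HDR_MAGIC value. */
#define CC_BLOB_MAGIC_SKETCH 0x1234abcdu

/* Hypothetical stand-in for struct cc_blob_sev_info (only two fields). */
struct cc_blob_sketch {
	uint32_t magic;
	uint64_t cpuid_phys;
};

/* Hypothetical stand-in for the two places the blob may be advertised. */
struct boot_params_sketch {
	const struct cc_blob_sketch *cc_blob;         /* set by the boot/compressed kernel */
	const struct cc_blob_sketch *setup_data_blob; /* stands in for the setup_data scan */
};

enum cc_lookup { CC_NOT_FOUND, CC_FOUND, CC_BAD_MAGIC };

/* Check boot_params first, fall back to setup_data, then validate the magic. */
static enum cc_lookup find_cc_blob_sketch(const struct boot_params_sketch *bp,
					  const struct cc_blob_sketch **out)
{
	const struct cc_blob_sketch *blob;

	if (!bp)
		return CC_NOT_FOUND;

	blob = bp->cc_blob ? bp->cc_blob : bp->setup_data_blob;
	if (!blob)
		return CC_NOT_FOUND;

	/* The kernel code calls sev_es_terminate() at this point. */
	if (blob->magic != CC_BLOB_MAGIC_SKETCH)
		return CC_BAD_MAGIC;

	*out = blob;
	return CC_FOUND;
}
```

Note that a blob found through the boot_params field takes precedence even
if a setup_data entry is also present, which mirrors the "scan in that
order" comment in snp_find_cc_blob().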
 arch/x86/include/asm/sev.h   | 10 ----------
 arch/x86/kernel/sev-shared.c |  2 +-
 arch/x86/kernel/sev.c        | 37 ++++++++++++++++++++++++++++++++++++
 3 files changed, 38 insertions(+), 11 deletions(-)

diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 4d32af1348ed..76a208fd451b 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -148,16 +148,6 @@ void snp_set_memory_private(unsigned long vaddr, unsigned int npages);
 void snp_set_wakeup_secondary_cpu(void);
 bool snp_init(struct boot_params *bp);
 void snp_abort(void);
-/*
- * TODO: These are exported only temporarily while boot/compressed/sev.c is
- * the only user. This is to avoid unused function warnings for kernel/sev.c
- * during the build of kernel proper.
- *
- * Once the code is added to consume these in kernel proper these functions
- * can be moved back to being statically-scoped to units that pull in
- * sev-shared.c via #include and these declarations can be dropped.
- */
-void snp_cpuid_info_create(const struct cc_blob_sev_info *cc_info);
 #else
 static inline void sev_es_ist_enter(struct pt_regs *regs) { }
 static inline void sev_es_ist_exit(void) { }
diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c
index 72836abcdbe2..7bc7e297f88c 100644
--- a/arch/x86/kernel/sev-shared.c
+++ b/arch/x86/kernel/sev-shared.c
@@ -995,7 +995,7 @@ snp_find_cc_blob_setup_data(struct boot_params *bp)
  * mapping needs to be updated in sync with all the changes to virtual memory
  * layout and related mapping facilities throughout the boot process.
  */
-void __init snp_cpuid_info_create(const struct cc_blob_sev_info *cc_info)
+static void __init snp_cpuid_info_create(const struct cc_blob_sev_info *cc_info)
 {
 	const struct snp_cpuid_info *cpuid_info_fw, *cpuid_info;
 
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index 0e5c45eacc77..70e18b98bb68 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -2001,6 +2001,12 @@ bool __init snp_init(struct boot_params *bp)
 	if (!cc_info)
 		return false;
 
+	snp_cpuid_info_create(cc_info);
+
+	/* SEV-SNP CPUID table is set up now. Do some sanity checks. */
+	if (!snp_cpuid_active())
+		sev_es_terminate(1, GHCB_TERM_CPUID);
+
 	/*
 	 * The CC blob will be used later to access the secrets page. Cache
 	 * it here like the boot kernel does.
@@ -2014,3 +2020,34 @@ void __init snp_abort(void)
 {
 	sev_es_terminate(1, GHCB_SNP_UNSUPPORTED);
 }
+
+/*
+ * It is useful from an auditing/testing perspective to provide an easy way
+ * for the guest owner to know that the CPUID table has been initialized as
+ * expected, but that initialization happens too early in boot to print any
+ * sort of indicator, and there's not really any other good place to do it. So
+ * do it here, and while at it, go ahead and re-verify that nothing strange has
+ * happened between early boot and now.
+ */
+static int __init snp_cpuid_check_status(void)
+{
+	const struct snp_cpuid_info *cpuid_info = snp_cpuid_info_get_ptr();
+
+	if (!cc_platform_has(CC_ATTR_SEV_SNP)) {
+		/* Firmware should not have advertised the feature. */
+		if (snp_cpuid_active())
+			panic("Invalid use of SEV-SNP CPUID table.");
+		return 0;
+	}
+
+	/* CPUID table should always be available when SEV-SNP is enabled. */
+	if (!snp_cpuid_active())
+		sev_es_terminate(1, GHCB_TERM_CPUID);
+
+	pr_info("Using SEV-SNP CPUID table, %d entries present.\n",
+		cpuid_info->count);
+
+	return 0;
+}
+
+arch_initcall(snp_cpuid_check_status);
-- 
2.25.1



* [PATCH v8 36/40] x86/sev: Provide support for SNP guest request NAEs
  2021-12-10 15:42 [PATCH v8 00/40] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (34 preceding siblings ...)
  2021-12-10 15:43 ` [PATCH v8 35/40] x86/sev: use firmware-validated CPUID for SEV-SNP guests Brijesh Singh
@ 2021-12-10 15:43 ` Brijesh Singh
  2022-01-27 16:21   ` Borislav Petkov
  2021-12-10 15:43 ` [PATCH v8 37/40] x86/sev: Register SNP guest request platform device Brijesh Singh
                   ` (4 subsequent siblings)
  40 siblings, 1 reply; 183+ messages in thread
From: Brijesh Singh @ 2021-12-10 15:43 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

Version 2 of the GHCB specification provides the SNP_GUEST_REQUEST and
SNP_EXT_GUEST_REQUEST NAE events, which an SNP guest can use to
communicate with the PSP.

While at it, add a snp_issue_guest_request() helper that a driver or
other subsystem can use to issue the request to the PSP.

See the SEV-SNP and GHCB specifications for more details.

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
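As a reading aid, the way the helper interprets SW_EXITINFO2 after the
VMGEXIT can be sketched in userspace C. This is a simplified model, not the
kernel code: the struct and function names are hypothetical, the GHCB is
replaced by plain arguments, and only the error-reporting logic is shown.

```c
#include <errno.h>
#include <stdint.h>

/* Values taken from the definitions added by this patch. */
#define SNP_GUEST_REQ_INVALID_LEN	(1ULL << 32)
#define EXIT_GUEST_REQUEST		0x80000011ULL
#define EXIT_EXT_GUEST_REQUEST		0x80000012ULL

/* Hypothetical stand-in for the data_npages field of struct snp_req_data. */
struct req_sketch {
	unsigned int data_npages;
};

/*
 * Model of the post-call handling: a non-zero exit info value is an error.
 * For the extended request, an "invalid length" error also carries the
 * number of pages the caller should have supplied (returned in RBX).
 */
static int handle_exit_info_sketch(uint64_t exit_code, uint64_t sw_exit_info_2,
				   uint64_t rbx, struct req_sketch *req,
				   uint64_t *fw_err)
{
	if (!sw_exit_info_2)
		return 0;

	if (exit_code == EXIT_EXT_GUEST_REQUEST &&
	    sw_exit_info_2 == SNP_GUEST_REQ_INVALID_LEN)
		req->data_npages = (unsigned int)rbx;

	if (fw_err)
		*fw_err = sw_exit_info_2;

	return -EIO;
}
```

A caller of the extended request would typically inspect fw_err on -EIO
and, if it equals SNP_GUEST_REQ_INVALID_LEN, retry with a buffer of
data_npages pages.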
 arch/x86/include/asm/sev-common.h |  3 ++
 arch/x86/include/asm/sev.h        | 14 +++++++++
 arch/x86/include/uapi/asm/svm.h   |  4 +++
 arch/x86/kernel/sev.c             | 51 +++++++++++++++++++++++++++++++
 4 files changed, 72 insertions(+)

diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h
index 673e6778194b..346600724b84 100644
--- a/arch/x86/include/asm/sev-common.h
+++ b/arch/x86/include/asm/sev-common.h
@@ -128,6 +128,9 @@ struct snp_psc_desc {
 	struct psc_entry entries[VMGEXIT_PSC_MAX_ENTRY];
 } __packed;
 
+/* Guest message request error code */
+#define SNP_GUEST_REQ_INVALID_LEN	BIT_ULL(32)
+
 #define GHCB_MSR_TERM_REQ		0x100
 #define GHCB_MSR_TERM_REASON_SET_POS	12
 #define GHCB_MSR_TERM_REASON_SET_MASK	0xf
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 76a208fd451b..a47fa0f2547e 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -81,6 +81,14 @@ extern bool handle_vc_boot_ghcb(struct pt_regs *regs);
 
 #define RMPADJUST_VMSA_PAGE_BIT		BIT(16)
 
+/* SNP Guest message request */
+struct snp_req_data {
+	unsigned long req_gpa;
+	unsigned long resp_gpa;
+	unsigned long data_gpa;
+	unsigned int data_npages;
+};
+
 #ifdef CONFIG_AMD_MEM_ENCRYPT
 extern struct static_key_false sev_es_enable_key;
 extern void __sev_es_ist_enter(struct pt_regs *regs);
@@ -148,6 +156,7 @@ void snp_set_memory_private(unsigned long vaddr, unsigned int npages);
 void snp_set_wakeup_secondary_cpu(void);
 bool snp_init(struct boot_params *bp);
 void snp_abort(void);
+int snp_issue_guest_request(u64 exit_code, struct snp_req_data *input, unsigned long *fw_err);
 #else
 static inline void sev_es_ist_enter(struct pt_regs *regs) { }
 static inline void sev_es_ist_exit(void) { }
@@ -167,6 +176,11 @@ static inline void snp_set_memory_private(unsigned long vaddr, unsigned int npag
 static inline void snp_set_wakeup_secondary_cpu(void) { }
 static inline bool snp_init(struct boot_params *bp) { return false; }
 static inline void snp_abort(void) { }
+static inline int snp_issue_guest_request(u64 exit_code, struct snp_req_data *input,
+					  unsigned long *fw_err)
+{
+	return -ENOTTY;
+}
 #endif
 
 #endif
diff --git a/arch/x86/include/uapi/asm/svm.h b/arch/x86/include/uapi/asm/svm.h
index 8b4c57baec52..5b8bc2b65a5e 100644
--- a/arch/x86/include/uapi/asm/svm.h
+++ b/arch/x86/include/uapi/asm/svm.h
@@ -109,6 +109,8 @@
 #define SVM_VMGEXIT_SET_AP_JUMP_TABLE		0
 #define SVM_VMGEXIT_GET_AP_JUMP_TABLE		1
 #define SVM_VMGEXIT_PSC				0x80000010
+#define SVM_VMGEXIT_GUEST_REQUEST		0x80000011
+#define SVM_VMGEXIT_EXT_GUEST_REQUEST		0x80000012
 #define SVM_VMGEXIT_AP_CREATION			0x80000013
 #define SVM_VMGEXIT_AP_CREATE_ON_INIT		0
 #define SVM_VMGEXIT_AP_CREATE			1
@@ -225,6 +227,8 @@
 	{ SVM_VMGEXIT_AP_HLT_LOOP,	"vmgexit_ap_hlt_loop" }, \
 	{ SVM_VMGEXIT_AP_JUMP_TABLE,	"vmgexit_ap_jump_table" }, \
 	{ SVM_VMGEXIT_PSC,	"vmgexit_page_state_change" }, \
+	{ SVM_VMGEXIT_GUEST_REQUEST,		"vmgexit_guest_request" }, \
+	{ SVM_VMGEXIT_EXT_GUEST_REQUEST,	"vmgexit_ext_guest_request" }, \
 	{ SVM_VMGEXIT_AP_CREATION,	"vmgexit_ap_creation" }, \
 	{ SVM_VMGEXIT_HV_FEATURES,	"vmgexit_hypervisor_feature" }, \
 	{ SVM_EXIT_ERR,         "invalid_guest_state" }
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index 70e18b98bb68..289f93e1ab80 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -2051,3 +2051,54 @@ static int __init snp_cpuid_check_status(void)
 }
 
 arch_initcall(snp_cpuid_check_status);
+
+int snp_issue_guest_request(u64 exit_code, struct snp_req_data *input, unsigned long *fw_err)
+{
+	struct ghcb_state state;
+	unsigned long flags;
+	struct ghcb *ghcb;
+	int ret;
+
+	if (!cc_platform_has(CC_ATTR_SEV_SNP))
+		return -ENODEV;
+
+	/* __sev_get_ghcb() needs to run with IRQs disabled because it uses the per-CPU GHCB */
+	local_irq_save(flags);
+
+	ghcb = __sev_get_ghcb(&state);
+	if (!ghcb) {
+		ret = -EIO;
+		goto e_restore_irq;
+	}
+
+	vc_ghcb_invalidate(ghcb);
+
+	if (exit_code == SVM_VMGEXIT_EXT_GUEST_REQUEST) {
+		ghcb_set_rax(ghcb, input->data_gpa);
+		ghcb_set_rbx(ghcb, input->data_npages);
+	}
+
+	ret = sev_es_ghcb_hv_call(ghcb, true, NULL, exit_code, input->req_gpa, input->resp_gpa);
+	if (ret)
+		goto e_put;
+
+	if (ghcb->save.sw_exit_info_2) {
+		/* The number of expected pages is returned in RBX */
+		if (exit_code == SVM_VMGEXIT_EXT_GUEST_REQUEST &&
+		    ghcb->save.sw_exit_info_2 == SNP_GUEST_REQ_INVALID_LEN)
+			input->data_npages = ghcb_get_rbx(ghcb);
+
+		if (fw_err)
+			*fw_err = ghcb->save.sw_exit_info_2;
+
+		ret = -EIO;
+	}
+
+e_put:
+	__sev_put_ghcb(&state);
+e_restore_irq:
+	local_irq_restore(flags);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(snp_issue_guest_request);
-- 
2.25.1



* [PATCH v8 37/40] x86/sev: Register SNP guest request platform device
  2021-12-10 15:42 [PATCH v8 00/40] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (35 preceding siblings ...)
  2021-12-10 15:43 ` [PATCH v8 36/40] x86/sev: Provide support for SNP guest request NAEs Brijesh Singh
@ 2021-12-10 15:43 ` Brijesh Singh
  2021-12-10 15:43 ` [PATCH v8 38/40] virt: Add SEV-SNP guest driver Brijesh Singh
                   ` (3 subsequent siblings)
  40 siblings, 0 replies; 183+ messages in thread
From: Brijesh Singh @ 2021-12-10 15:43 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

Version 2 of the GHCB specification provides Non-Automatic Exit (NAE)
events that an SNP guest can use to communicate with the PSP without
risk from a malicious hypervisor that wishes to read, alter, drop or
replay the messages sent.

SNP_LAUNCH_UPDATE can insert two special pages into the guest’s memory:
the secrets page and the CPUID page. The PSP firmware populates the
contents of the secrets page. The secrets page contains encryption keys
used by the guest to interact with the firmware. Because the secrets page
is encrypted with the guest’s memory encryption key, the hypervisor cannot
read the keys. See the SNP FW ABI spec for further details about the
secrets page.

Create a platform device that the SNP guest driver can bind to in order
to get platform resources, such as the encryption key and message ID, to
use when communicating with the PSP. The SNP guest driver provides a
userspace interface for requesting the attestation report, key
derivation, the extended attestation report, etc.

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
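The smoke test applied to the secrets page before the platform device is
registered is small enough to model directly. The sketch below is a
userspace illustration only: the struct is a hypothetical two-field stand-in
for the CC blob, and SKETCH_PAGE_SIZE assumes 4 KiB pages.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumption: 4 KiB pages */

/* Hypothetical stand-in for the secrets fields of the CC blob. */
struct cc_blob_secrets_sketch {
	uint64_t secrets_phys;
	uint32_t secrets_len;
};

/*
 * Return the guest physical address of the secrets page, or 0 when the
 * blob is absent or the advertised region is not exactly one page.
 */
static uint64_t secrets_gpa_or_zero(const struct cc_blob_secrets_sketch *info)
{
	if (!info)
		return 0;

	if (!info->secrets_phys || info->secrets_len != SKETCH_PAGE_SIZE)
		return 0;

	return info->secrets_phys;
}
```

Returning 0 for every failure case lets the caller treat "no blob" and
"implausible blob" identically, which is how init_snp_platform_device()
decides to bail out with -ENODEV.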
 arch/x86/include/asm/sev.h |  4 +++
 arch/x86/kernel/sev.c      | 61 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 65 insertions(+)

diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index a47fa0f2547e..7a5934af9d47 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -89,6 +89,10 @@ struct snp_req_data {
 	unsigned int data_npages;
 };
 
+struct snp_guest_platform_data {
+	u64 secrets_gpa;
+};
+
 #ifdef CONFIG_AMD_MEM_ENCRYPT
 extern struct static_key_false sev_es_enable_key;
 extern void __sev_es_ist_enter(struct pt_regs *regs);
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index 289f93e1ab80..bb33e880a1fa 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -19,6 +19,9 @@
 #include <linux/kernel.h>
 #include <linux/mm.h>
 #include <linux/cpumask.h>
+#include <linux/efi.h>
+#include <linux/platform_device.h>
+#include <linux/io.h>
 
 #include <asm/cpu_entry_area.h>
 #include <asm/stacktrace.h>
@@ -34,6 +37,7 @@
 #include <asm/cpu.h>
 #include <asm/apic.h>
 #include <asm/cpuid.h>
+#include <asm/setup.h>
 
 #define DR7_RESET_VALUE        0x400
 
@@ -2102,3 +2106,60 @@ int snp_issue_guest_request(u64 exit_code, struct snp_req_data *input, unsigned
 	return ret;
 }
 EXPORT_SYMBOL_GPL(snp_issue_guest_request);
+
+static struct platform_device guest_req_device = {
+	.name		= "snp-guest",
+	.id		= -1,
+};
+
+static u64 get_secrets_page(void)
+{
+	u64 pa_data = boot_params.cc_blob_address;
+	struct cc_blob_sev_info info;
+	void *map;
+
+	/*
+	 * The CC blob contains the address of the secrets page, check if the
+	 * blob is present.
+	 */
+	if (!pa_data)
+		return 0;
+
+	map = early_memremap(pa_data, sizeof(info));
+	memcpy(&info, map, sizeof(info));
+	early_memunmap(map, sizeof(info));
+
+	/* smoke-test the secrets page passed */
+	if (!info.secrets_phys || info.secrets_len != PAGE_SIZE)
+		return 0;
+
+	return info.secrets_phys;
+}
+
+static int __init init_snp_platform_device(void)
+{
+	struct snp_guest_platform_data data;
+	u64 gpa;
+
+	if (!cc_platform_has(CC_ATTR_SEV_SNP))
+		return -ENODEV;
+
+	gpa = get_secrets_page();
+	if (!gpa)
+		return -ENODEV;
+
+	data.secrets_gpa = gpa;
+	if (platform_device_add_data(&guest_req_device, &data, sizeof(data)))
+		goto e_fail;
+
+	if (platform_device_register(&guest_req_device))
+		goto e_fail;
+
+	pr_info("SNP guest platform device initialized.\n");
+	return 0;
+
+e_fail:
+	pr_err("Failed to initialize SNP guest device\n");
+	return -ENODEV;
+}
+device_initcall(init_snp_platform_device);
-- 
2.25.1



* [PATCH v8 38/40] virt: Add SEV-SNP guest driver
  2021-12-10 15:42 [PATCH v8 00/40] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (36 preceding siblings ...)
  2021-12-10 15:43 ` [PATCH v8 37/40] x86/sev: Register SNP guest request platform device Brijesh Singh
@ 2021-12-10 15:43 ` Brijesh Singh
  2021-12-10 15:43 ` [PATCH v8 39/40] virt: sevguest: Add support to derive key Brijesh Singh
                   ` (2 subsequent siblings)
  40 siblings, 0 replies; 183+ messages in thread
From: Brijesh Singh @ 2021-12-10 15:43 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

The SEV-SNP specification provides the guest a mechanism to communicate
with the PSP without risk from a malicious hypervisor that wishes to read,
alter, drop or replay the messages sent. The driver uses
snp_issue_guest_request() to issue GHCB SNP_GUEST_REQUEST or
SNP_EXT_GUEST_REQUEST NAE events to submit the request to the PSP.

The PSP requires that all communication be encrypted using the key
specified through the platform_data.

Userspace can use the SNP_GET_REPORT ioctl() to query the guest
attestation report.

See the "Guest Messages" section of the SEV-SNP spec for more details.

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
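Replay protection in the driver rests on the message sequence number, which
also seeds the AES-GCM IV. The following userspace sketch models that
bookkeeping as done by snp_get_msg_seqno(), snp_inc_msg_seqno() and
__enc_payload() in this patch. The function names and the fixed 12-byte IV
length are assumptions made for the illustration; the driver queries the IV
size from the crypto API at run time.

```c
#include <stdint.h>
#include <string.h>

#define IV_LEN_SKETCH 12	/* assumption: 96-bit GCM IV */

/*
 * Next sequence number to use: stored counter + 1. Zero is returned when
 * the value would no longer fit the 32-bit storage; firmware treats zero
 * as invalid, so a caller that forgets to check still fails safely.
 */
static uint64_t next_seqno_sketch(uint32_t stored)
{
	uint64_t count = (uint64_t)stored + 1;

	if (count >= UINT32_MAX)
		return 0;

	return count;
}

/* The PSP also consumes one number for its response, so advance by two. */
static void commit_seqno_sketch(uint32_t *stored)
{
	*stored += 2;
}

/* IV = sequence number in the low bytes, remaining bytes zeroed. */
static void build_iv_sketch(uint8_t iv[IV_LEN_SKETCH], uint64_t seqno)
{
	memset(iv, 0, IV_LEN_SKETCH);
	memcpy(iv, &seqno, sizeof(seqno));
}
```

Because each request/response pair uses two consecutive numbers under the
same key, no IV is reused for that key as long as the counter is only
advanced after a successful exchange.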
 Documentation/virt/coco/sevguest.rst  |  81 ++++
 drivers/virt/Kconfig                  |   3 +
 drivers/virt/Makefile                 |   1 +
 drivers/virt/coco/sevguest/Kconfig    |   9 +
 drivers/virt/coco/sevguest/Makefile   |   2 +
 drivers/virt/coco/sevguest/sevguest.c | 604 ++++++++++++++++++++++++++
 drivers/virt/coco/sevguest/sevguest.h |  98 +++++
 include/uapi/linux/sev-guest.h        |  47 ++
 8 files changed, 845 insertions(+)
 create mode 100644 Documentation/virt/coco/sevguest.rst
 create mode 100644 drivers/virt/coco/sevguest/Kconfig
 create mode 100644 drivers/virt/coco/sevguest/Makefile
 create mode 100644 drivers/virt/coco/sevguest/sevguest.c
 create mode 100644 drivers/virt/coco/sevguest/sevguest.h
 create mode 100644 include/uapi/linux/sev-guest.h

diff --git a/Documentation/virt/coco/sevguest.rst b/Documentation/virt/coco/sevguest.rst
new file mode 100644
index 000000000000..47ef3b0821d5
--- /dev/null
+++ b/Documentation/virt/coco/sevguest.rst
@@ -0,0 +1,81 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===================================================================
+The Definitive SEV Guest API Documentation
+===================================================================
+
+1. General description
+======================
+
+The SEV API is a set of ioctls that are used by the guest or hypervisor
+to get or set certain aspects of the SEV virtual machine. The ioctls belong
+to the following classes:
+
+ - Hypervisor ioctls: These query and set global attributes which affect the
+   whole SEV firmware.  These ioctls are used by platform provisioning tools.
+
+ - Guest ioctls: These query and set attributes of the SEV virtual machine.
+
+2. API description
+==================
+
+This section describes ioctls that can be used to query or set SEV guests.
+For each ioctl, the following information is provided along with a
+description:
+
+  Technology:
+      which SEV technology provides this ioctl. sev, sev-es, sev-snp or all.
+
+  Type:
+      hypervisor or guest. The ioctl can be used inside the guest or the
+      hypervisor.
+
+  Parameters:
+      what parameters are accepted by the ioctl.
+
+  Returns:
+      the return value.  General error numbers (ENOMEM, EINVAL)
+      are not detailed, but errors with specific meanings are.
+
+The guest ioctl should be issued on a file descriptor of the /dev/sev-guest device.
+The ioctl accepts struct snp_guest_request_ioctl. The input and output structures are
+specified through the req_data and resp_data fields respectively. If the ioctl fails
+to execute due to a firmware error, then the fw_err code will be set; otherwise
+fw_err will be set to 0xff.
+
+::
+        struct snp_guest_request_ioctl {
+                /* Message version number */
+                __u32 msg_version;
+
+                /* Request and response structure address */
+                __u64 req_data;
+                __u64 resp_data;
+
+                /* firmware error code on failure (see psp-sev.h) */
+                __u64 fw_err;
+        };
+
+2.1 SNP_GET_REPORT
+------------------
+
+:Technology: sev-snp
+:Type: guest ioctl
+:Parameters (in): struct snp_report_req
+:Returns (out): struct snp_report_resp on success, -negative on error
+
+The SNP_GET_REPORT ioctl can be used to query the attestation report from the
+SEV-SNP firmware. The ioctl uses the SNP_GUEST_REQUEST (MSG_REPORT_REQ) command
+provided by the SEV-SNP firmware to query the attestation report.
+
+On success, snp_report_resp.data will contain the report. The report
+follows the format described in the SEV-SNP specification. See the SEV-SNP
+specification for further details.
+
+
+Reference
+---------
+
+SEV-SNP and GHCB specification: developer.amd.com/sev
+
+The driver is based on SEV-SNP firmware spec 0.9 and GHCB spec version 2.0.
diff --git a/drivers/virt/Kconfig b/drivers/virt/Kconfig
index 8061e8ef449f..e457e47610d3 100644
--- a/drivers/virt/Kconfig
+++ b/drivers/virt/Kconfig
@@ -36,4 +36,7 @@ source "drivers/virt/vboxguest/Kconfig"
 source "drivers/virt/nitro_enclaves/Kconfig"
 
 source "drivers/virt/acrn/Kconfig"
+
+source "drivers/virt/coco/sevguest/Kconfig"
+
 endif
diff --git a/drivers/virt/Makefile b/drivers/virt/Makefile
index 3e272ea60cd9..9c704a6fdcda 100644
--- a/drivers/virt/Makefile
+++ b/drivers/virt/Makefile
@@ -8,3 +8,4 @@ obj-y				+= vboxguest/
 
 obj-$(CONFIG_NITRO_ENCLAVES)	+= nitro_enclaves/
 obj-$(CONFIG_ACRN_HSM)		+= acrn/
+obj-$(CONFIG_SEV_GUEST)		+= coco/sevguest/
diff --git a/drivers/virt/coco/sevguest/Kconfig b/drivers/virt/coco/sevguest/Kconfig
new file mode 100644
index 000000000000..96190919cca8
--- /dev/null
+++ b/drivers/virt/coco/sevguest/Kconfig
@@ -0,0 +1,9 @@
+config SEV_GUEST
+	tristate "AMD SEV Guest driver"
+	default y
+	depends on AMD_MEM_ENCRYPT && CRYPTO_AEAD2
+	help
+	  The driver can be used by the SEV-SNP guest to communicate with the PSP to
+	  request the attestation report and more.
+
+	  If you choose 'M' here, this module will be called sevguest.
diff --git a/drivers/virt/coco/sevguest/Makefile b/drivers/virt/coco/sevguest/Makefile
new file mode 100644
index 000000000000..b1ffb2b4177b
--- /dev/null
+++ b/drivers/virt/coco/sevguest/Makefile
@@ -0,0 +1,2 @@
+# SPDX-License-Identifier: GPL-2.0-only
+obj-$(CONFIG_SEV_GUEST) += sevguest.o
diff --git a/drivers/virt/coco/sevguest/sevguest.c b/drivers/virt/coco/sevguest/sevguest.c
new file mode 100644
index 000000000000..b3b080c9b2d6
--- /dev/null
+++ b/drivers/virt/coco/sevguest/sevguest.c
@@ -0,0 +1,604 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * AMD SEV Secure Nested Paging (SEV-SNP) guest request interface
+ *
+ * Copyright (C) 2021 Advanced Micro Devices, Inc.
+ *
+ * Author: Brijesh Singh <brijesh.singh@amd.com>
+ */
+
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/types.h>
+#include <linux/mutex.h>
+#include <linux/io.h>
+#include <linux/platform_device.h>
+#include <linux/miscdevice.h>
+#include <linux/set_memory.h>
+#include <linux/fs.h>
+#include <crypto/aead.h>
+#include <linux/scatterlist.h>
+#include <linux/psp-sev.h>
+#include <uapi/linux/sev-guest.h>
+#include <uapi/linux/psp-sev.h>
+
+#include <asm/svm.h>
+#include <asm/sev.h>
+
+#include "sevguest.h"
+
+#define DEVICE_NAME	"sev-guest"
+#define AAD_LEN		48
+#define MSG_HDR_VER	1
+
+struct snp_guest_crypto {
+	struct crypto_aead *tfm;
+	u8 *iv, *authtag;
+	int iv_len, a_len;
+};
+
+struct snp_guest_dev {
+	struct device *dev;
+	struct miscdevice misc;
+
+	struct snp_guest_crypto *crypto;
+	struct snp_guest_msg *request, *response;
+	struct snp_secrets_page_layout *layout;
+	struct snp_req_data input;
+	u32 *os_area_msg_seqno;
+	u8 *vmpck;
+};
+
+static u32 vmpck_id;
+module_param(vmpck_id, uint, 0444);
+MODULE_PARM_DESC(vmpck_id, "The VMPCK ID to use when communicating with the PSP.");
+
+static DEFINE_MUTEX(snp_cmd_mutex);
+
+static bool is_vmpck_empty(struct snp_guest_dev *snp_dev)
+{
+	char zero_key[VMPCK_KEY_LEN] = {0};
+
+	if (snp_dev->vmpck)
+		return memcmp(snp_dev->vmpck, zero_key, VMPCK_KEY_LEN) == 0;
+
+	return true;
+}
+
+static void snp_disable_vmpck(struct snp_guest_dev *snp_dev)
+{
+	memzero_explicit(snp_dev->vmpck, VMPCK_KEY_LEN);
+	snp_dev->vmpck = NULL;
+}
+
+static inline u64 __snp_get_msg_seqno(struct snp_guest_dev *snp_dev)
+{
+	u64 count;
+
+	lockdep_assert_held(&snp_cmd_mutex);
+
+	/* Read the current message sequence counter from the secrets page */
+	count = *snp_dev->os_area_msg_seqno;
+
+	return count + 1;
+}
+
+/* Return a non-zero value on success */
+static u64 snp_get_msg_seqno(struct snp_guest_dev *snp_dev)
+{
+	u64 count = __snp_get_msg_seqno(snp_dev);
+
+	/*
+	 * The message sequence counter for the SNP guest request is a 64-bit
+	 * value, but version 2 of the GHCB specification defines 32-bit storage
+	 * for it. If the counter exceeds the 32-bit range then return zero.
+	 * The caller should check the return value, but if the caller happens to
+	 * not check the value and uses it, then the firmware treats zero as an
+	 * invalid number and will fail the message request.
+	 */
+	if (count >= UINT_MAX) {
+		pr_err_ratelimited("SNP guest request message sequence counter overflow\n");
+		return 0;
+	}
+
+	return count;
+}
+
+static void snp_inc_msg_seqno(struct snp_guest_dev *snp_dev)
+{
+	/*
+	 * The counter is also incremented by the PSP, so increment it by 2
+	 * and save in secrets page.
+	 */
+	*snp_dev->os_area_msg_seqno += 2;
+}
+
+static inline struct snp_guest_dev *to_snp_dev(struct file *file)
+{
+	struct miscdevice *dev = file->private_data;
+
+	return container_of(dev, struct snp_guest_dev, misc);
+}
+
+static struct snp_guest_crypto *init_crypto(struct snp_guest_dev *snp_dev, u8 *key, size_t keylen)
+{
+	struct snp_guest_crypto *crypto;
+
+	crypto = kzalloc(sizeof(*crypto), GFP_KERNEL_ACCOUNT);
+	if (!crypto)
+		return NULL;
+
+	crypto->tfm = crypto_alloc_aead("gcm(aes)", 0, 0);
+	if (IS_ERR(crypto->tfm))
+		goto e_free;
+
+	if (crypto_aead_setkey(crypto->tfm, key, keylen))
+		goto e_free_crypto;
+
+	crypto->iv_len = crypto_aead_ivsize(crypto->tfm);
+	if (crypto->iv_len < 12) {
+		dev_err(snp_dev->dev, "IV length is less than 12.\n");
+		goto e_free_crypto;
+	}
+
+	crypto->iv = kmalloc(crypto->iv_len, GFP_KERNEL_ACCOUNT);
+	if (!crypto->iv)
+		goto e_free_crypto;
+
+	if (crypto_aead_authsize(crypto->tfm) > MAX_AUTHTAG_LEN) {
+		if (crypto_aead_setauthsize(crypto->tfm, MAX_AUTHTAG_LEN)) {
+			dev_err(snp_dev->dev, "failed to set authsize to %d\n", MAX_AUTHTAG_LEN);
+			goto e_free_crypto;
+		}
+	}
+
+	crypto->a_len = crypto_aead_authsize(crypto->tfm);
+	crypto->authtag = kmalloc(crypto->a_len, GFP_KERNEL_ACCOUNT);
+	if (!crypto->authtag)
+		goto e_free_crypto;
+
+	return crypto;
+
+e_free_crypto:
+	crypto_free_aead(crypto->tfm);
+e_free:
+	kfree(crypto->iv);
+	kfree(crypto->authtag);
+	kfree(crypto);
+
+	return NULL;
+}
+
+static void deinit_crypto(struct snp_guest_crypto *crypto)
+{
+	crypto_free_aead(crypto->tfm);
+	kfree(crypto->iv);
+	kfree(crypto->authtag);
+	kfree(crypto);
+}
+
+static int enc_dec_message(struct snp_guest_crypto *crypto, struct snp_guest_msg *msg,
+			   u8 *src_buf, u8 *dst_buf, size_t len, bool enc)
+{
+	struct snp_guest_msg_hdr *hdr = &msg->hdr;
+	struct scatterlist src[3], dst[3];
+	DECLARE_CRYPTO_WAIT(wait);
+	struct aead_request *req;
+	int ret;
+
+	req = aead_request_alloc(crypto->tfm, GFP_KERNEL);
+	if (!req)
+		return -ENOMEM;
+
+	/*
+	 * AEAD memory operations:
+	 * +------ AAD -------+------- DATA -----+---- AUTHTAG----+
+	 * |  msg header      |  plaintext       |  hdr->authtag  |
+	 * | bytes 30h - 5Fh  |    or            |                |
+	 * |                  |   cipher         |                |
+	 * +------------------+------------------+----------------+
+	 */
+	sg_init_table(src, 3);
+	sg_set_buf(&src[0], &hdr->algo, AAD_LEN);
+	sg_set_buf(&src[1], src_buf, hdr->msg_sz);
+	sg_set_buf(&src[2], hdr->authtag, crypto->a_len);
+
+	sg_init_table(dst, 3);
+	sg_set_buf(&dst[0], &hdr->algo, AAD_LEN);
+	sg_set_buf(&dst[1], dst_buf, hdr->msg_sz);
+	sg_set_buf(&dst[2], hdr->authtag, crypto->a_len);
+
+	aead_request_set_ad(req, AAD_LEN);
+	aead_request_set_tfm(req, crypto->tfm);
+	aead_request_set_callback(req, 0, crypto_req_done, &wait);
+
+	aead_request_set_crypt(req, src, dst, len, crypto->iv);
+	ret = crypto_wait_req(enc ? crypto_aead_encrypt(req) : crypto_aead_decrypt(req), &wait);
+
+	aead_request_free(req);
+	return ret;
+}
+
+static int __enc_payload(struct snp_guest_dev *snp_dev, struct snp_guest_msg *msg,
+			 void *plaintext, size_t len)
+{
+	struct snp_guest_crypto *crypto = snp_dev->crypto;
+	struct snp_guest_msg_hdr *hdr = &msg->hdr;
+
+	memset(crypto->iv, 0, crypto->iv_len);
+	memcpy(crypto->iv, &hdr->msg_seqno, sizeof(hdr->msg_seqno));
+
+	return enc_dec_message(crypto, msg, plaintext, msg->payload, len, true);
+}
+
+static int dec_payload(struct snp_guest_dev *snp_dev, struct snp_guest_msg *msg,
+		       void *plaintext, size_t len)
+{
+	struct snp_guest_crypto *crypto = snp_dev->crypto;
+	struct snp_guest_msg_hdr *hdr = &msg->hdr;
+
+	/* Build IV with response buffer sequence number */
+	memset(crypto->iv, 0, crypto->iv_len);
+	memcpy(crypto->iv, &hdr->msg_seqno, sizeof(hdr->msg_seqno));
+
+	return enc_dec_message(crypto, msg, msg->payload, plaintext, len, false);
+}
+
+static int verify_and_dec_payload(struct snp_guest_dev *snp_dev, void *payload, u32 sz)
+{
+	struct snp_guest_crypto *crypto = snp_dev->crypto;
+	struct snp_guest_msg *resp = snp_dev->response;
+	struct snp_guest_msg *req = snp_dev->request;
+	struct snp_guest_msg_hdr *req_hdr = &req->hdr;
+	struct snp_guest_msg_hdr *resp_hdr = &resp->hdr;
+
+	dev_dbg(snp_dev->dev, "response [seqno %lld type %d version %d sz %d]\n",
+		resp_hdr->msg_seqno, resp_hdr->msg_type, resp_hdr->msg_version, resp_hdr->msg_sz);
+
+	/* Verify that the sequence counter is incremented by 1 */
+	if (unlikely(resp_hdr->msg_seqno != (req_hdr->msg_seqno + 1)))
+		return -EBADMSG;
+
+	/* Verify response message type and version number. */
+	if (resp_hdr->msg_type != (req_hdr->msg_type + 1) ||
+	    resp_hdr->msg_version != req_hdr->msg_version)
+		return -EBADMSG;
+
+	/*
+	 * If the message size is greater than our buffer length then return
+	 * an error.
+	 */
+	if (unlikely((resp_hdr->msg_sz + crypto->a_len) > sz))
+		return -EBADMSG;
+
+	/* Decrypt the payload */
+	return dec_payload(snp_dev, resp, payload, resp_hdr->msg_sz + crypto->a_len);
+}
+
+static int enc_payload(struct snp_guest_dev *snp_dev, u64 seqno, int version, u8 type,
+		       void *payload, size_t sz)
+{
+	struct snp_guest_msg *req = snp_dev->request;
+	struct snp_guest_msg_hdr *hdr = &req->hdr;
+
+	memset(req, 0, sizeof(*req));
+
+	hdr->algo = SNP_AEAD_AES_256_GCM;
+	hdr->hdr_version = MSG_HDR_VER;
+	hdr->hdr_sz = sizeof(*hdr);
+	hdr->msg_type = type;
+	hdr->msg_version = version;
+	hdr->msg_seqno = seqno;
+	hdr->msg_vmpck = vmpck_id;
+	hdr->msg_sz = sz;
+
+	/* Verify the sequence number is non-zero */
+	if (!hdr->msg_seqno)
+		return -ENOSR;
+
+	dev_dbg(snp_dev->dev, "request [seqno %lld type %d version %d sz %d]\n",
+		hdr->msg_seqno, hdr->msg_type, hdr->msg_version, hdr->msg_sz);
+
+	return __enc_payload(snp_dev, req, payload, sz);
+}
+
+static int handle_guest_request(struct snp_guest_dev *snp_dev, u64 exit_code, int msg_ver,
+				u8 type, void *req_buf, size_t req_sz, void *resp_buf,
+				u32 resp_sz, __u64 *fw_err)
+{
+	unsigned long err;
+	u64 seqno;
+	int rc;
+
+	/* Get the message sequence number and verify that it is non-zero */
+	seqno = snp_get_msg_seqno(snp_dev);
+	if (!seqno)
+		return -EIO;
+
+	memset(snp_dev->response, 0, sizeof(*snp_dev->response));
+
+	/* Encrypt the userspace provided payload */
+	rc = enc_payload(snp_dev, seqno, msg_ver, type, req_buf, req_sz);
+	if (rc)
+		return rc;
+
+	/* Call firmware to process the request */
+	rc = snp_issue_guest_request(exit_code, &snp_dev->input, &err);
+	if (fw_err)
+		*fw_err = err;
+
+	if (rc)
+		return rc;
+
+	rc = verify_and_dec_payload(snp_dev, resp_buf, resp_sz);
+	if (rc) {
+		/*
+		 * verify_and_dec_payload() will fail only if the hypervisor is
+		 * actively modifying the message header or corrupting the encrypted payload.
+		 * This hints that the hypervisor is acting in bad faith. Disable the VMPCK so that
+		 * the key cannot be used for any communication. The key is disabled to ensure
+		 * that AES-GCM does not use the same IV while encrypting the request payload.
+		 */
+		dev_alert(snp_dev->dev,
+			  "Detected unexpected decode failure, disabling the vmpck_id %d\n", vmpck_id);
+		snp_disable_vmpck(snp_dev);
+		return rc;
+	}
+
+	/* Increment to the new message sequence number after payload decryption was successful. */
+	snp_inc_msg_seqno(snp_dev);
+
+	return 0;
+}
+
+static int get_report(struct snp_guest_dev *snp_dev, struct snp_guest_request_ioctl *arg)
+{
+	struct snp_guest_crypto *crypto = snp_dev->crypto;
+	struct snp_report_resp *resp;
+	struct snp_report_req req;
+	int rc, resp_len;
+
+	if (!arg->req_data || !arg->resp_data)
+		return -EINVAL;
+
+	/* Copy the request payload from userspace */
+	if (copy_from_user(&req, (void __user *)arg->req_data, sizeof(req)))
+		return -EFAULT;
+
+	/*
+	 * The intermediate response buffer is used while decrypting the
+	 * response payload. Make sure that it has enough space to cover the
+	 * authtag.
+	 */
+	resp_len = sizeof(resp->data) + crypto->a_len;
+	resp = kzalloc(resp_len, GFP_KERNEL_ACCOUNT);
+	if (!resp)
+		return -ENOMEM;
+
+	/* Issue the command to get the attestation report */
+	rc = handle_guest_request(snp_dev, SVM_VMGEXIT_GUEST_REQUEST, arg->msg_version,
+				  SNP_MSG_REPORT_REQ, &req.user_data, sizeof(req.user_data),
+				  resp->data, resp_len, &arg->fw_err);
+	if (rc)
+		goto e_free;
+
+	/* Copy the response payload to userspace */
+	if (copy_to_user((void __user *)arg->resp_data, resp, sizeof(*resp)))
+		rc = -EFAULT;
+
+e_free:
+	kfree(resp);
+	return rc;
+}
+
+static long snp_guest_ioctl(struct file *file, unsigned int ioctl, unsigned long arg)
+{
+	struct snp_guest_dev *snp_dev = to_snp_dev(file);
+	void __user *argp = (void __user *)arg;
+	struct snp_guest_request_ioctl input;
+	int ret = -ENOTTY;
+
+	if (copy_from_user(&input, argp, sizeof(input)))
+		return -EFAULT;
+
+	input.fw_err = 0xff;
+
+	/* Message version must be non-zero */
+	if (!input.msg_version)
+		return -EINVAL;
+
+	mutex_lock(&snp_cmd_mutex);
+
+	/* Check if the VMPCK is not empty */
+	if (is_vmpck_empty(snp_dev)) {
+		dev_err_ratelimited(snp_dev->dev, "VMPCK is disabled\n");
+		mutex_unlock(&snp_cmd_mutex);
+		return -ENOTTY;
+	}
+
+	switch (ioctl) {
+	case SNP_GET_REPORT:
+		ret = get_report(snp_dev, &input);
+		break;
+	default:
+		break;
+	}
+
+	mutex_unlock(&snp_cmd_mutex);
+
+	if (input.fw_err && copy_to_user(argp, &input, sizeof(input)))
+		return -EFAULT;
+
+	return ret;
+}
+
+static void free_shared_pages(void *buf, size_t sz)
+{
+	unsigned int npages = PAGE_ALIGN(sz) >> PAGE_SHIFT;
+
+	if (!buf)
+		return;
+
+	/* If restoring the encryption mask fails, leak the pages. */
+	if (WARN_ONCE(set_memory_encrypted((unsigned long)buf, npages),
+		      "Failed to restore encryption mask (leak it)\n"))
+		return;
+
+	__free_pages(virt_to_page(buf), get_order(sz));
+}
+
+static void *alloc_shared_pages(size_t sz)
+{
+	unsigned int npages = PAGE_ALIGN(sz) >> PAGE_SHIFT;
+	struct page *page;
+	int ret;
+
+	page = alloc_pages(GFP_KERNEL_ACCOUNT, get_order(sz));
+	if (!page)
+		return NULL;
+
+	ret = set_memory_decrypted((unsigned long)page_address(page), npages);
+	if (ret) {
+		pr_err("SEV-SNP: failed to mark page shared, ret=%d\n", ret);
+		__free_pages(page, get_order(sz));
+		return NULL;
+	}
+
+	return page_address(page);
+}
+
+static const struct file_operations snp_guest_fops = {
+	.owner	= THIS_MODULE,
+	.unlocked_ioctl = snp_guest_ioctl,
+};
+
+static u8 *get_vmpck(int id, struct snp_secrets_page_layout *layout, u32 **seqno)
+{
+	u8 *key = NULL;
+
+	switch (id) {
+	case 0:
+		*seqno = &layout->os_area.msg_seqno_0;
+		key = layout->vmpck0;
+		break;
+	case 1:
+		*seqno = &layout->os_area.msg_seqno_1;
+		key = layout->vmpck1;
+		break;
+	case 2:
+		*seqno = &layout->os_area.msg_seqno_2;
+		key = layout->vmpck2;
+		break;
+	case 3:
+		*seqno = &layout->os_area.msg_seqno_3;
+		key = layout->vmpck3;
+		break;
+	default:
+		break;
+	}
+
+	return key;
+}
+
+static int __init snp_guest_probe(struct platform_device *pdev)
+{
+	struct snp_secrets_page_layout *layout;
+	struct snp_guest_platform_data *data;
+	struct device *dev = &pdev->dev;
+	struct snp_guest_dev *snp_dev;
+	struct miscdevice *misc;
+	int ret;
+
+	if (!dev->platform_data)
+		return -ENODEV;
+
+	data = (struct snp_guest_platform_data *)dev->platform_data;
+	layout = (__force void *)ioremap_encrypted(data->secrets_gpa, PAGE_SIZE);
+	if (!layout)
+		return -ENODEV;
+
+	ret = -ENOMEM;
+	snp_dev = devm_kzalloc(&pdev->dev, sizeof(struct snp_guest_dev), GFP_KERNEL);
+	if (!snp_dev)
+		goto e_fail;
+
+	ret = -EINVAL;
+	snp_dev->vmpck = get_vmpck(vmpck_id, layout, &snp_dev->os_area_msg_seqno);
+	if (!snp_dev->vmpck) {
+		dev_err(dev, "invalid vmpck id %d\n", vmpck_id);
+		goto e_fail;
+	}
+
+	/* Verify that VMPCK is not zero. */
+	if (is_vmpck_empty(snp_dev)) {
+		dev_err(dev, "vmpck id %d is null\n", vmpck_id);
+		goto e_fail;
+	}
+
+	platform_set_drvdata(pdev, snp_dev);
+	snp_dev->dev = dev;
+	snp_dev->layout = layout;
+
+	/* Allocate the shared page used for the request and response message. */
+	snp_dev->request = alloc_shared_pages(sizeof(struct snp_guest_msg));
+	if (!snp_dev->request)
+		goto e_fail;
+
+	snp_dev->response = alloc_shared_pages(sizeof(struct snp_guest_msg));
+	if (!snp_dev->response)
+		goto e_fail;
+
+	ret = -EIO;
+	snp_dev->crypto = init_crypto(snp_dev, snp_dev->vmpck, VMPCK_KEY_LEN);
+	if (!snp_dev->crypto)
+		goto e_fail;
+
+	misc = &snp_dev->misc;
+	misc->minor = MISC_DYNAMIC_MINOR;
+	misc->name = DEVICE_NAME;
+	misc->fops = &snp_guest_fops;
+
+	/* initial the input address for guest request */
+	snp_dev->input.req_gpa = __pa(snp_dev->request);
+	snp_dev->input.resp_gpa = __pa(snp_dev->response);
+
+	ret =  misc_register(misc);
+	if (ret)
+		goto e_fail;
+
+	dev_info(dev, "Initialized SNP guest driver (using vmpck_id %d)\n", vmpck_id);
+	return 0;
+
+e_fail:
+	iounmap(layout);
+	free_shared_pages(snp_dev->request, sizeof(struct snp_guest_msg));
+	free_shared_pages(snp_dev->response, sizeof(struct snp_guest_msg));
+
+	return ret;
+}
+
+static int __exit snp_guest_remove(struct platform_device *pdev)
+{
+	struct snp_guest_dev *snp_dev = platform_get_drvdata(pdev);
+
+	free_shared_pages(snp_dev->request, sizeof(struct snp_guest_msg));
+	free_shared_pages(snp_dev->response, sizeof(struct snp_guest_msg));
+	deinit_crypto(snp_dev->crypto);
+	misc_deregister(&snp_dev->misc);
+
+	return 0;
+}
+
+static struct platform_driver snp_guest_driver = {
+	.remove		= __exit_p(snp_guest_remove),
+	.driver		= {
+		.name = "snp-guest",
+	},
+};
+
+module_platform_driver_probe(snp_guest_driver, snp_guest_probe);
+
+MODULE_AUTHOR("Brijesh Singh <brijesh.singh@amd.com>");
+MODULE_LICENSE("GPL");
+MODULE_VERSION("1.0.0");
+MODULE_DESCRIPTION("AMD SNP Guest Driver");
diff --git a/drivers/virt/coco/sevguest/sevguest.h b/drivers/virt/coco/sevguest/sevguest.h
new file mode 100644
index 000000000000..cfa76cf8a21a
--- /dev/null
+++ b/drivers/virt/coco/sevguest/sevguest.h
@@ -0,0 +1,98 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2021 Advanced Micro Devices, Inc.
+ *
+ * Author: Brijesh Singh <brijesh.singh@amd.com>
+ *
+ * SEV-SNP API spec is available at https://developer.amd.com/sev
+ */
+
+#ifndef __LINUX_SEVGUEST_H_
+#define __LINUX_SEVGUEST_H_
+
+#include <linux/types.h>
+
+#define MAX_AUTHTAG_LEN		32
+
+/* See SNP spec SNP_GUEST_REQUEST section for the structure */
+enum msg_type {
+	SNP_MSG_TYPE_INVALID = 0,
+	SNP_MSG_CPUID_REQ,
+	SNP_MSG_CPUID_RSP,
+	SNP_MSG_KEY_REQ,
+	SNP_MSG_KEY_RSP,
+	SNP_MSG_REPORT_REQ,
+	SNP_MSG_REPORT_RSP,
+	SNP_MSG_EXPORT_REQ,
+	SNP_MSG_EXPORT_RSP,
+	SNP_MSG_IMPORT_REQ,
+	SNP_MSG_IMPORT_RSP,
+	SNP_MSG_ABSORB_REQ,
+	SNP_MSG_ABSORB_RSP,
+	SNP_MSG_VMRK_REQ,
+	SNP_MSG_VMRK_RSP,
+
+	SNP_MSG_TYPE_MAX
+};
+
+enum aead_algo {
+	SNP_AEAD_INVALID,
+	SNP_AEAD_AES_256_GCM,
+};
+
+struct snp_guest_msg_hdr {
+	u8 authtag[MAX_AUTHTAG_LEN];
+	u64 msg_seqno;
+	u8 rsvd1[8];
+	u8 algo;
+	u8 hdr_version;
+	u16 hdr_sz;
+	u8 msg_type;
+	u8 msg_version;
+	u16 msg_sz;
+	u32 rsvd2;
+	u8 msg_vmpck;
+	u8 rsvd3[35];
+} __packed;
+
+struct snp_guest_msg {
+	struct snp_guest_msg_hdr hdr;
+	u8 payload[4000];
+} __packed;
+
+/*
+ * The secrets page contains a 96-byte reserved area that can be used by
+ * the guest OS. The guest OS uses the area to save the message sequence
+ * number for each VMPCK.
+ *
+ * See the GHCB spec section Secret page layout for the format for this area.
+ */
+struct secrets_os_area {
+	u32 msg_seqno_0;
+	u32 msg_seqno_1;
+	u32 msg_seqno_2;
+	u32 msg_seqno_3;
+	u64 ap_jump_table_pa;
+	u8 rsvd[40];
+	u8 guest_usage[32];
+} __packed;
+
+#define VMPCK_KEY_LEN		32
+
+/* See the SNP spec version 0.9 for secrets page format */
+struct snp_secrets_page_layout {
+	u32 version;
+	u32 imien	: 1,
+	    rsvd1	: 31;
+	u32 fms;
+	u32 rsvd2;
+	u8 gosvw[16];
+	u8 vmpck0[VMPCK_KEY_LEN];
+	u8 vmpck1[VMPCK_KEY_LEN];
+	u8 vmpck2[VMPCK_KEY_LEN];
+	u8 vmpck3[VMPCK_KEY_LEN];
+	struct secrets_os_area os_area;
+	u8 rsvd3[3840];
+} __packed;
+
+#endif /* __LINUX_SEVGUEST_H_ */
diff --git a/include/uapi/linux/sev-guest.h b/include/uapi/linux/sev-guest.h
new file mode 100644
index 000000000000..0bfc162da465
--- /dev/null
+++ b/include/uapi/linux/sev-guest.h
@@ -0,0 +1,47 @@
+/* SPDX-License-Identifier: GPL-2.0-only WITH Linux-syscall-note */
+/*
+ * Userspace interface for AMD SEV and SEV-SNP guest driver.
+ *
+ * Copyright (C) 2021 Advanced Micro Devices, Inc.
+ *
+ * Author: Brijesh Singh <brijesh.singh@amd.com>
+ *
+ * SEV API specification is available at: https://developer.amd.com/sev/
+ */
+
+#ifndef __UAPI_LINUX_SEV_GUEST_H_
+#define __UAPI_LINUX_SEV_GUEST_H_
+
+#include <linux/types.h>
+
+struct snp_report_req {
+	/* user data that should be included in the report */
+	__u8 user_data[64];
+
+	/* The vmpl level to be included in the report */
+	__u32 vmpl;
+};
+
+struct snp_report_resp {
+	/* response data, see SEV-SNP spec for the format */
+	__u8 data[4000];
+};
+
+struct snp_guest_request_ioctl {
+	/* message version number (must be non-zero) */
+	__u8 msg_version;
+
+	/* Request and response structure address */
+	__u64 req_data;
+	__u64 resp_data;
+
+	/* firmware error code on failure (see psp-sev.h) */
+	__u64 fw_err;
+};
+
+#define SNP_GUEST_REQ_IOC_TYPE	'S'
+
+/* Get SNP attestation report */
+#define SNP_GET_REPORT _IOWR(SNP_GUEST_REQ_IOC_TYPE, 0x0, struct snp_guest_request_ioctl)
+
+#endif /* __UAPI_LINUX_SEV_GUEST_H_ */
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 183+ messages in thread

* [PATCH v8 39/40] virt: sevguest: Add support to derive key
  2021-12-10 15:42 [PATCH v8 00/40] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (37 preceding siblings ...)
  2021-12-10 15:43 ` [PATCH v8 38/40] virt: Add SEV-SNP guest driver Brijesh Singh
@ 2021-12-10 15:43 ` Brijesh Singh
  2021-12-10 22:27   ` Liam Merwick
  2021-12-10 15:43 ` [PATCH v8 40/40] virt: sevguest: Add support to get extended report Brijesh Singh
  2021-12-10 20:17 ` [PATCH v8 00/40] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Dave Hansen
  40 siblings, 1 reply; 183+ messages in thread
From: Brijesh Singh @ 2021-12-10 15:43 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

The SNP_GET_DERIVED_KEY ioctl interface can be used by the SNP guest to
ask the firmware to provide a key derived from a root key. The derived
key may be used by the guest for any purpose it chooses, such as a
sealing key or communicating with external entities.

See SEV-SNP firmware spec for more information.
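
As a rough illustration of how userspace might drive this interface, here
is a minimal sketch. It mirrors the UAPI structures from this patch locally
so that it builds without the new header. The helper name snp_derive_key()
and the choice of msg_version 1 are assumptions made for the example, and
the caller is expected to pass a file descriptor for the sevguest misc
device.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>

/*
 * Local mirrors of the UAPI definitions added by this patch, so the
 * sketch builds without <linux/sev-guest.h> installed.
 */
struct snp_derived_key_req {
	uint32_t root_key_select;
	uint32_t rsvd;
	uint64_t guest_field_select;
	uint32_t vmpl;
	uint32_t guest_svn;
	uint64_t tcb_version;
};

struct snp_derived_key_resp {
	uint8_t data[64];
};

struct snp_guest_request_ioctl {
	uint8_t msg_version;
	uint64_t req_data;
	uint64_t resp_data;
	uint64_t fw_err;
};

#define SNP_GUEST_REQ_IOC_TYPE	'S'
#define SNP_GET_DERIVED_KEY \
	_IOWR(SNP_GUEST_REQ_IOC_TYPE, 0x1, struct snp_guest_request_ioctl)

static_assert(sizeof(struct snp_derived_key_req) == 32,
	      "unexpected request layout");

/*
 * Hypothetical helper (not part of the patch): ask the driver for a
 * derived key. Returns 0 on success or a negative errno on failure.
 */
static int snp_derive_key(int fd, const struct snp_derived_key_req *req,
			  struct snp_derived_key_resp *resp, uint64_t *fw_err)
{
	struct snp_guest_request_ioctl arg;
	int ret = 0;

	memset(&arg, 0, sizeof(arg));
	arg.msg_version = 1;	/* assumed; the driver only requires non-zero */
	arg.req_data = (uint64_t)(uintptr_t)req;
	arg.resp_data = (uint64_t)(uintptr_t)resp;

	if (ioctl(fd, SNP_GET_DERIVED_KEY, &arg) < 0)
		ret = -errno;

	/* Pass back whatever firmware error the driver reported, if any. */
	if (fw_err)
		*fw_err = arg.fw_err;

	return ret;
}
```

On failure the helper hands back both the errno and the firmware error, so
a caller can tell a transport problem from a firmware rejection.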

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 Documentation/virt/coco/sevguest.rst  | 17 ++++++++++
 drivers/virt/coco/sevguest/sevguest.c | 45 +++++++++++++++++++++++++++
 include/uapi/linux/sev-guest.h        | 17 ++++++++++
 3 files changed, 79 insertions(+)

diff --git a/Documentation/virt/coco/sevguest.rst b/Documentation/virt/coco/sevguest.rst
index 47ef3b0821d5..8c22d514d44f 100644
--- a/Documentation/virt/coco/sevguest.rst
+++ b/Documentation/virt/coco/sevguest.rst
@@ -72,6 +72,23 @@ On success, the snp_report_resp.data will contains the report. The report
 contain the format described in the SEV-SNP specification. See the SEV-SNP
 specification for further details.
 
+2.2 SNP_GET_DERIVED_KEY
+-----------------------
+:Technology: sev-snp
+:Type: guest ioctl
+:Parameters (in): struct snp_derived_key_req
+:Returns (out): struct snp_derived_key_resp on success, -negative on error
+
+The SNP_GET_DERIVED_KEY ioctl can be used to get a key derived from a root key.
+The derived key can be used by the guest for any purpose, such as sealing keys
+or communicating with external entities.
+
+The ioctl uses the SNP_GUEST_REQUEST (MSG_KEY_REQ) command provided by the
+SEV-SNP firmware to derive the key. See SEV-SNP specification for further details
+on the various fields passed in the key derivation request.
+
+On success, the snp_derived_key_resp.data contains the derived key value. See
+the SEV-SNP specification for further details.
 
 Reference
 ---------
diff --git a/drivers/virt/coco/sevguest/sevguest.c b/drivers/virt/coco/sevguest/sevguest.c
index b3b080c9b2d6..d8dcafc32e11 100644
--- a/drivers/virt/coco/sevguest/sevguest.c
+++ b/drivers/virt/coco/sevguest/sevguest.c
@@ -391,6 +391,48 @@ static int get_report(struct snp_guest_dev *snp_dev, struct snp_guest_request_io
 	return rc;
 }
 
+static int get_derived_key(struct snp_guest_dev *snp_dev, struct snp_guest_request_ioctl *arg)
+{
+	struct snp_guest_crypto *crypto = snp_dev->crypto;
+	struct snp_derived_key_resp resp = {0};
+	struct snp_derived_key_req req;
+	int rc, resp_len;
+	u8 buf[64+16]; /* Response data is 64 bytes and max authsize for GCM is 16 bytes */
+
+	if (!arg->req_data || !arg->resp_data)
+		return -EINVAL;
+
+	/* Copy the request payload from userspace */
+	if (copy_from_user(&req, (void __user *)arg->req_data, sizeof(req)))
+		return -EFAULT;
+
+	/*
+	 * The intermediate response buffer is used while decrypting the
+	 * response payload. Make sure that it has enough space to cover the
+	 * authtag.
+	 */
+	resp_len = sizeof(resp.data) + crypto->a_len;
+	if (sizeof(buf) < resp_len)
+		return -ENOMEM;
+
+	/* Issue the command to get the derived key */
+	rc = handle_guest_request(snp_dev, SVM_VMGEXIT_GUEST_REQUEST, arg->msg_version,
+				  SNP_MSG_KEY_REQ, &req, sizeof(req), buf, resp_len,
+				  &arg->fw_err);
+	if (rc)
+		goto e_free;
+
+	/* Copy the response payload to userspace */
+	memcpy(resp.data, buf, sizeof(resp.data));
+	if (copy_to_user((void __user *)arg->resp_data, &resp, sizeof(resp)))
+		rc = -EFAULT;
+
+e_free:
+	memzero_explicit(buf, sizeof(buf));
+	memzero_explicit(&resp, sizeof(resp));
+	return rc;
+}
+
 static long snp_guest_ioctl(struct file *file, unsigned int ioctl, unsigned long arg)
 {
 	struct snp_guest_dev *snp_dev = to_snp_dev(file);
@@ -420,6 +462,9 @@ static long snp_guest_ioctl(struct file *file, unsigned int ioctl, unsigned long
 	case SNP_GET_REPORT:
 		ret = get_report(snp_dev, &input);
 		break;
+	case SNP_GET_DERIVED_KEY:
+		ret = get_derived_key(snp_dev, &input);
+		break;
 	default:
 		break;
 	}
diff --git a/include/uapi/linux/sev-guest.h b/include/uapi/linux/sev-guest.h
index 0bfc162da465..ce595539e00c 100644
--- a/include/uapi/linux/sev-guest.h
+++ b/include/uapi/linux/sev-guest.h
@@ -27,6 +27,20 @@ struct snp_report_resp {
 	__u8 data[4000];
 };
 
+struct snp_derived_key_req {
+	__u32 root_key_select;
+	__u32 rsvd;
+	__u64 guest_field_select;
+	__u32 vmpl;
+	__u32 guest_svn;
+	__u64 tcb_version;
+};
+
+struct snp_derived_key_resp {
+	/* response data, see SEV-SNP spec for the format */
+	__u8 data[64];
+};
+
 struct snp_guest_request_ioctl {
 	/* message version number (must be non-zero) */
 	__u8 msg_version;
@@ -44,4 +58,7 @@ struct snp_guest_request_ioctl {
 /* Get SNP attestation report */
 #define SNP_GET_REPORT _IOWR(SNP_GUEST_REQ_IOC_TYPE, 0x0, struct snp_guest_request_ioctl)
 
+/* Get a derived key from the root */
+#define SNP_GET_DERIVED_KEY _IOWR(SNP_GUEST_REQ_IOC_TYPE, 0x1, struct snp_guest_request_ioctl)
+
 #endif /* __UAPI_LINUX_SEV_GUEST_H_ */
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 183+ messages in thread

* [PATCH v8 40/40] virt: sevguest: Add support to get extended report
  2021-12-10 15:42 [PATCH v8 00/40] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (38 preceding siblings ...)
  2021-12-10 15:43 ` [PATCH v8 39/40] virt: sevguest: Add support to derive key Brijesh Singh
@ 2021-12-10 15:43 ` Brijesh Singh
  2021-12-10 20:17 ` [PATCH v8 00/40] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Dave Hansen
  40 siblings, 0 replies; 183+ messages in thread
From: Brijesh Singh @ 2021-12-10 15:43 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

Version 2 of the GHCB specification defines a Non-Automatic Exit (NAE)
event to get the extended guest report. It is similar to the
SNP_GET_REPORT ioctl. The main difference is the additional data that is
returned: a certificate blob that can be used by the SNP guest user. The
certificate blob layout is defined in the GHCB specification. The driver
simply treats the blob as opaque
data and copies it to userspace.
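
The driver in this patch rejects a certs_len that is not page aligned or
that exceeds SEV_FW_BLOB_MAX_SIZE, so a caller has to size its certificate
buffer accordingly. A minimal sketch of that sizing step follows. The
helper name snp_certs_buf_len() is hypothetical, and the page size and
maximum length are passed in as parameters because their values are not
spelled out in this patch.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: round the wanted certificate buffer size up to a
 * whole number of pages and check it against the driver's limit.
 * Returns 0 and stores the length in *out, or a negative errno.
 */
static int snp_certs_buf_len(size_t want, size_t page_size, size_t max_len,
			     uint32_t *out)
{
	size_t pages;

	/* The page size must be a non-zero power of two. */
	if (!page_size || (page_size & (page_size - 1)))
		return -EINVAL;

	/* Number of pages needed to hold 'want' bytes, rounded up. */
	pages = want / page_size + (want % page_size != 0);

	/* certs_len is a 32-bit field, and the driver caps the blob size. */
	if (max_len > UINT32_MAX || pages > max_len / page_size)
		return -E2BIG;

	*out = (uint32_t)(pages * page_size);
	return 0;
}
```

A length of zero is passed through unchanged, which matches the driver
treating certs_len == 0 as "no certificate buffer supplied".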

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 Documentation/virt/coco/sevguest.rst  | 23 +++++++
 drivers/virt/coco/sevguest/sevguest.c | 89 +++++++++++++++++++++++++++
 include/uapi/linux/sev-guest.h        | 13 ++++
 3 files changed, 125 insertions(+)

diff --git a/Documentation/virt/coco/sevguest.rst b/Documentation/virt/coco/sevguest.rst
index 8c22d514d44f..515af0d71469 100644
--- a/Documentation/virt/coco/sevguest.rst
+++ b/Documentation/virt/coco/sevguest.rst
@@ -90,6 +90,29 @@ on the various fields passed in the key derivation request.
 On success, the snp_derived_key_resp.data contains the derived key value. See
 the SEV-SNP specification for further details.
 
+
+2.3 SNP_GET_EXT_REPORT
+----------------------
+:Technology: sev-snp
+:Type: guest ioctl
+:Parameters (in/out): struct snp_ext_report_req
+:Returns (out): struct snp_report_resp on success, -negative on error
+
+The SNP_GET_EXT_REPORT ioctl is similar to SNP_GET_REPORT. The difference is
+the additional certificate data that is returned with the report.
+The certificate data is provided by the hypervisor through
+SNP_SET_EXT_CONFIG.
+
+The ioctl uses the SNP_GUEST_REQUEST (MSG_REPORT_REQ) command provided by the SEV-SNP
+firmware to get the attestation report.
+
+On success, the snp_report_resp.data will contain the attestation report
+and snp_ext_report_req.certs_address will contain the certificate blob. If the
+supplied buffer length is smaller than required, snp_ext_report_req.certs_len
+will be updated with the expected value.
+
+See the GHCB specification for further details on how to parse the certificate blob.
+
 Reference
 ---------
 
diff --git a/drivers/virt/coco/sevguest/sevguest.c b/drivers/virt/coco/sevguest/sevguest.c
index d8dcafc32e11..f86fa13b8e5b 100644
--- a/drivers/virt/coco/sevguest/sevguest.c
+++ b/drivers/virt/coco/sevguest/sevguest.c
@@ -41,6 +41,7 @@ struct snp_guest_dev {
 	struct device *dev;
 	struct miscdevice misc;
 
+	void *certs_data;
 	struct snp_guest_crypto *crypto;
 	struct snp_guest_msg *request, *response;
 	struct snp_secrets_page_layout *layout;
@@ -433,6 +434,84 @@ static int get_derived_key(struct snp_guest_dev *snp_dev, struct snp_guest_reque
 	return rc;
 }
 
+static int get_ext_report(struct snp_guest_dev *snp_dev, struct snp_guest_request_ioctl *arg)
+{
+	struct snp_guest_crypto *crypto = snp_dev->crypto;
+	struct snp_ext_report_req req;
+	struct snp_report_resp *resp;
+	int ret, npages = 0, resp_len;
+
+	if (!arg->req_data || !arg->resp_data)
+		return -EINVAL;
+
+	/* Copy the request payload from userspace */
+	if (copy_from_user(&req, (void __user *)arg->req_data, sizeof(req)))
+		return -EFAULT;
+
+	if (req.certs_len) {
+		if (req.certs_len > SEV_FW_BLOB_MAX_SIZE ||
+		    !IS_ALIGNED(req.certs_len, PAGE_SIZE))
+			return -EINVAL;
+	}
+
+	if (req.certs_address && req.certs_len) {
+		if (!access_ok(req.certs_address, req.certs_len))
+			return -EFAULT;
+
+		/*
+		 * Initialize the intermediate buffer with all zero's. This buffer
+		 * is used in the guest request message to get the certs blob from
+		 * the host. If host does not supply any certs in it, then copy
+		 * zeros to indicate that certificate data was not provided.
+		 */
+		memset(snp_dev->certs_data, 0, req.certs_len);
+
+		npages = req.certs_len >> PAGE_SHIFT;
+	}
+
+	/*
+	 * The intermediate response buffer is used while decrypting the
+	 * response payload. Make sure that it has enough space to cover the
+	 * authtag.
+	 */
+	resp_len = sizeof(resp->data) + crypto->a_len;
+	resp = kzalloc(resp_len, GFP_KERNEL_ACCOUNT);
+	if (!resp)
+		return -ENOMEM;
+
+	snp_dev->input.data_npages = npages;
+	ret = handle_guest_request(snp_dev, SVM_VMGEXIT_EXT_GUEST_REQUEST, arg->msg_version,
+				   SNP_MSG_REPORT_REQ, &req.data.user_data,
+				   sizeof(req.data.user_data), resp->data, resp_len, &arg->fw_err);
+
+	/* If certs length is invalid then copy the returned length */
+	if (arg->fw_err == SNP_GUEST_REQ_INVALID_LEN) {
+		req.certs_len = snp_dev->input.data_npages << PAGE_SHIFT;
+
+		if (copy_to_user((void __user *)arg->req_data, &req, sizeof(req)))
+			ret = -EFAULT;
+	}
+
+	if (ret)
+		goto e_free;
+
+	/* Copy the certificate data blob to userspace */
+	if (req.certs_address && req.certs_len &&
+	    copy_to_user((void __user *)req.certs_address, snp_dev->certs_data,
+			 req.certs_len)) {
+		ret = -EFAULT;
+		goto e_free;
+	}
+
+	/* Copy the response payload to userspace */
+	if (copy_to_user((void __user *)arg->resp_data, resp, sizeof(*resp)))
+		ret = -EFAULT;
+
+e_free:
+	kfree(resp);
+	return ret;
+}
+
 static long snp_guest_ioctl(struct file *file, unsigned int ioctl, unsigned long arg)
 {
 	struct snp_guest_dev *snp_dev = to_snp_dev(file);
@@ -465,6 +544,9 @@ static long snp_guest_ioctl(struct file *file, unsigned int ioctl, unsigned long
 	case SNP_GET_DERIVED_KEY:
 		ret = get_derived_key(snp_dev, &input);
 		break;
+	case SNP_GET_EXT_REPORT:
+		ret = get_ext_report(snp_dev, &input);
+		break;
 	default:
 		break;
 	}
@@ -593,6 +675,10 @@ static int __init snp_guest_probe(struct platform_device *pdev)
 	if (!snp_dev->response)
 		goto e_fail;
 
+	snp_dev->certs_data = alloc_shared_pages(SEV_FW_BLOB_MAX_SIZE);
+	if (!snp_dev->certs_data)
+		goto e_fail;
+
 	ret = -EIO;
 	snp_dev->crypto = init_crypto(snp_dev, snp_dev->vmpck, VMPCK_KEY_LEN);
 	if (!snp_dev->crypto)
@@ -606,6 +692,7 @@ static int __init snp_guest_probe(struct platform_device *pdev)
 	/* initial the input address for guest request */
 	snp_dev->input.req_gpa = __pa(snp_dev->request);
 	snp_dev->input.resp_gpa = __pa(snp_dev->response);
+	snp_dev->input.data_gpa = __pa(snp_dev->certs_data);
 
 	ret =  misc_register(misc);
 	if (ret)
@@ -616,6 +703,7 @@ static int __init snp_guest_probe(struct platform_device *pdev)
 
 e_fail:
 	iounmap(layout);
+	free_shared_pages(snp_dev->certs_data, SEV_FW_BLOB_MAX_SIZE);
 	free_shared_pages(snp_dev->request, sizeof(struct snp_guest_msg));
 	free_shared_pages(snp_dev->response, sizeof(struct snp_guest_msg));
 
@@ -628,6 +716,7 @@ static int __exit snp_guest_remove(struct platform_device *pdev)
 
 	free_shared_pages(snp_dev->request, sizeof(struct snp_guest_msg));
 	free_shared_pages(snp_dev->response, sizeof(struct snp_guest_msg));
+	free_shared_pages(snp_dev->certs_data, SEV_FW_BLOB_MAX_SIZE);
 	deinit_crypto(snp_dev->crypto);
 	misc_deregister(&snp_dev->misc);
 
diff --git a/include/uapi/linux/sev-guest.h b/include/uapi/linux/sev-guest.h
index ce595539e00c..43127aa18026 100644
--- a/include/uapi/linux/sev-guest.h
+++ b/include/uapi/linux/sev-guest.h
@@ -53,6 +53,16 @@ struct snp_guest_request_ioctl {
 	__u64 fw_err;
 };
 
+struct snp_ext_report_req {
+	struct snp_report_req data;
+
+	/* where to copy the certificate blob */
+	__u64 certs_address;
+
+	/* length of the certificate blob */
+	__u32 certs_len;
+};
+
 #define SNP_GUEST_REQ_IOC_TYPE	'S'
 
 /* Get SNP attestation report */
@@ -61,4 +71,7 @@ struct snp_guest_request_ioctl {
 /* Get a derived key from the root */
 #define SNP_GET_DERIVED_KEY _IOWR(SNP_GUEST_REQ_IOC_TYPE, 0x1, struct snp_guest_request_ioctl)
 
+/* Get SNP extended report as defined in the GHCB specification version 2. */
+#define SNP_GET_EXT_REPORT _IOWR(SNP_GUEST_REQ_IOC_TYPE, 0x2, struct snp_guest_request_ioctl)
+
 #endif /* __UAPI_LINUX_SEV_GUEST_H_ */
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 01/40] x86/compressed/64: detect/setup SEV/SME features earlier in boot
  2021-12-10 15:42 ` [PATCH v8 01/40] x86/compressed/64: detect/setup SEV/SME features earlier in boot Brijesh Singh
@ 2021-12-10 18:47   ` Dave Hansen
  2021-12-10 19:12   ` Borislav Petkov
  2021-12-13 19:09   ` Venu Busireddy
  2 siblings, 0 replies; 183+ messages in thread
From: Dave Hansen @ 2021-12-10 18:47 UTC (permalink / raw)
  To: Brijesh Singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On 12/10/21 7:42 AM, Brijesh Singh wrote:
> +	/* Set the SME mask if this is an SEV guest. */
> +	sev_status   = rd_sev_status_msr();

Nit: there's some weird extra whitespace there.  Might be some old
attempts at vertical alignment.

^ permalink raw reply	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 20/40] x86/sev: Use SEV-SNP AP creation to start secondary CPUs
  2021-12-10 15:43 ` [PATCH v8 20/40] x86/sev: Use SEV-SNP AP creation to start secondary CPUs Brijesh Singh
@ 2021-12-10 18:50   ` Dave Hansen
  2022-01-12 16:17     ` Brijesh Singh
  2021-12-31 15:36   ` Borislav Petkov
  1 sibling, 1 reply; 183+ messages in thread
From: Dave Hansen @ 2021-12-10 18:50 UTC (permalink / raw)
  To: Brijesh Singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On 12/10/21 7:43 AM, Brijesh Singh wrote:
> +	vmsa->efer		= 0x1000;	/* Must set SVME bit */
> +	vmsa->cr4		= cr4;
> +	vmsa->cr0		= 0x60000010;
> +	vmsa->dr7		= 0x400;
> +	vmsa->dr6		= 0xffff0ff0;
> +	vmsa->rflags		= 0x2;
> +	vmsa->g_pat		= 0x0007040600070406ULL;
> +	vmsa->xcr0		= 0x1;
> +	vmsa->mxcsr		= 0x1f80;
> +	vmsa->x87_ftw		= 0x5555;
> +	vmsa->x87_fcw		= 0x0040;

This is a big fat pile of magic numbers.  We also have nice macros for a
non-zero number of these, like:

	#define MXCSR_DEFAULT 0x1f80

I understand that this probably _works_ as-is, but it doesn't look very
friendly if someone else needs to go hack on it.
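
To illustrate the point about named constants, here is a minimal
userspace sketch. It is not taken from the patch: the SKETCH_* names and
struct sketch_ap_state are hypothetical and defined locally, and only
the fields whose bit meanings are architecturally fixed (EFER.SVME,
CR0.CD/NW/ET, the always-one RFLAGS bit, the MXCSR reset value) are
covered.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for this sketch; values match the quoted hunk. */
#define SKETCH_BIT(n)		(1ULL << (n))

#define SKETCH_EFER_SVME	SKETCH_BIT(12)	/* EFER.SVME */
#define SKETCH_CR0_ET		SKETCH_BIT(4)	/* CR0.ET */
#define SKETCH_CR0_NW		SKETCH_BIT(29)	/* CR0.NW */
#define SKETCH_CR0_CD		SKETCH_BIT(30)	/* CR0.CD */
#define SKETCH_RFLAGS_FIXED	SKETCH_BIT(1)	/* RFLAGS bit 1 always reads as 1 */
#define SKETCH_MXCSR_DEFAULT	0x1f80u		/* same value as MXCSR_DEFAULT above */

/* The composed values must equal the literals they replace. */
static_assert((SKETCH_CR0_CD | SKETCH_CR0_NW | SKETCH_CR0_ET) == 0x60000010ULL,
	      "CR0 reset value");
static_assert(SKETCH_EFER_SVME == 0x1000ULL, "EFER.SVME");

/* Cut-down stand-in for the VMSA fields touched in the quoted hunk. */
struct sketch_ap_state {
	uint64_t efer;
	uint64_t cr0;
	uint64_t rflags;
	uint32_t mxcsr;
};

static void sketch_init_ap_state(struct sketch_ap_state *s)
{
	s->efer   = SKETCH_EFER_SVME;
	s->cr0    = SKETCH_CR0_CD | SKETCH_CR0_NW | SKETCH_CR0_ET;
	s->rflags = SKETCH_RFLAGS_FIXED;
	s->mxcsr  = SKETCH_MXCSR_DEFAULT;
}
```

The static_assert lines are one way to show a reviewer that swapping a
literal for named bits did not change the value.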

^ permalink raw reply	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 24/40] x86/compressed/acpi: move EFI system table lookup to helper
  2021-12-10 15:43 ` [PATCH v8 24/40] x86/compressed/acpi: move EFI system table lookup " Brijesh Singh
@ 2021-12-10 18:54   ` Dave Hansen
  2021-12-13 15:47     ` Michael Roth
  2022-01-05 23:50   ` Borislav Petkov
  2022-01-06 19:59   ` Venu Busireddy
  2 siblings, 1 reply; 183+ messages in thread
From: Dave Hansen @ 2021-12-10 18:54 UTC (permalink / raw)
  To: Brijesh Singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On 12/10/21 7:43 AM, Brijesh Singh wrote:
> +/*
> + * Helpers for early access to EFI configuration table
> + *
> + * Copyright (C) 2021 Advanced Micro Devices, Inc.
> + *
> + * Author: Michael Roth <michael.roth@amd.com>
> + */

It doesn't seem quite right to slap this copyright on a file that's full
of content that came from other files.  It would be one thing if
arch/x86/boot/compressed/acpi.c had this banner in it already.  Also, a
bunch of the lines in this file seem to come from:

	commit 33f0df8d843deb9ec24116dcd79a40ca0ea8e8a9
	Author: Chao Fan <fanc.fnst@cn.fujitsu.com>
	Date:   Wed Jan 23 19:08:46 2019 +0800

^ permalink raw reply	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 27/40] x86/boot: Add Confidential Computing type to setup_data
  2021-12-10 15:43 ` [PATCH v8 27/40] x86/boot: Add Confidential Computing type to setup_data Brijesh Singh
@ 2021-12-10 19:12   ` Dave Hansen
  2021-12-10 20:18     ` Brijesh Singh
  2022-01-06 22:48   ` Venu Busireddy
  1 sibling, 1 reply; 183+ messages in thread
From: Dave Hansen @ 2021-12-10 19:12 UTC (permalink / raw)
  To: Brijesh Singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On 12/10/21 7:43 AM, Brijesh Singh wrote:
> +/* AMD SEV Confidential computing blob structure */
> +#define CC_BLOB_SEV_HDR_MAGIC	0x45444d41
> +struct cc_blob_sev_info {
> +	u32 magic;
> +	u16 version;
> +	u16 reserved;
> +	u64 secrets_phys;
> +	u32 secrets_len;
> +	u64 cpuid_phys;
> +	u32 cpuid_len;
> +};

This is an ABI structure rather than some purely kernel construct, right?

I searched through all of the specs to which you linked in the cover
letter.  I looked for "blob", "guid", the magic and part of the GUID
itself trying to find where this is defined to see if the struct is correct.

I couldn't find anything.

Where is the spec for this blob?  How large is it?  Did you mean to
leave a 4-byte hole after secrets_len and before cpuid_phys?
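
The hole in question can be measured with offsetof(). The following is a
userspace sketch only: struct sketch_cc_blob is a local mirror of the
quoted struct using <stdint.h> types, and the helper names are
hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Local mirror of the quoted struct, with fixed-width userspace types. */
struct sketch_cc_blob {
	uint32_t magic;
	uint16_t version;
	uint16_t reserved;
	uint64_t secrets_phys;
	uint32_t secrets_len;
	uint64_t cpuid_phys;
	uint32_t cpuid_len;
};

/* Bytes of compiler-inserted padding between secrets_len and cpuid_phys. */
static size_t sketch_hole_after_secrets_len(void)
{
	size_t end = offsetof(struct sketch_cc_blob, secrets_len) +
		     sizeof(uint32_t);

	return offsetof(struct sketch_cc_blob, cpuid_phys) - end;
}

/* Bytes of padding the compiler appends after the last member. */
static size_t sketch_tail_padding(void)
{
	size_t end = offsetof(struct sketch_cc_blob, cpuid_len) +
		     sizeof(uint32_t);

	return sizeof(struct sketch_cc_blob) - end;
}
```

Both helpers depend on the ABI's alignment for 64-bit integers: where
uint64_t is 8-byte aligned (x86-64) each reports 4 bytes, and where it
is 4-byte aligned (i386) each reports 0.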

^ permalink raw reply	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 01/40] x86/compressed/64: detect/setup SEV/SME features earlier in boot
  2021-12-10 15:42 ` [PATCH v8 01/40] x86/compressed/64: detect/setup SEV/SME features earlier in boot Brijesh Singh
  2021-12-10 18:47   ` Dave Hansen
@ 2021-12-10 19:12   ` Borislav Petkov
  2021-12-10 19:23     ` Dave Hansen
  2021-12-13 19:09   ` Venu Busireddy
  2 siblings, 1 reply; 183+ messages in thread
From: Borislav Petkov @ 2021-12-10 19:12 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	tony.luck, marcorr, sathyanarayanan.kuppuswamy

On Fri, Dec 10, 2021 at 09:42:53AM -0600, Brijesh Singh wrote:
> @@ -447,6 +446,23 @@ SYM_CODE_START(startup_64)
>  	call	load_stage1_idt
>  	popq	%rsi
>  
> +#ifdef CONFIG_AMD_MEM_ENCRYPT

I guess that ifdeffery is not needed.

> +	/*
> +	 * Now that the stage1 interrupt handlers are set up, #VC exceptions from
> +	 * CPUID instructions can be properly handled for SEV-ES guests.
> +	 *
> +	 * For SEV-SNP, the CPUID table also needs to be set up in advance of any
> +	 * CPUID instructions being issued, so go ahead and do that now via
> +	 * sev_enable(), which will also handle the rest of the SEV-related
> +	 * detection/setup to ensure that has been done in advance of any dependent
> +	 * code.
> +	 */
> +	pushq	%rsi
> +	movq	%rsi, %rdi		/* real mode address */
> +	call	sev_enable
> +	popq	%rsi
> +#endif
> +
>  	/*
>  	 * paging_prepare() sets up the trampoline and checks if we need to
>  	 * enable 5-level paging.

...

> +void sev_enable(struct boot_params *bp)
> +{
> +	unsigned int eax, ebx, ecx, edx;
> +
> +	/* Check for the SME/SEV support leaf */
> +	eax = 0x80000000;
> +	ecx = 0;
> +	native_cpuid(&eax, &ebx, &ecx, &edx);
> +	if (eax < 0x8000001f)
> +		return;
> +
> +	/*
> +	 * Check for the SME/SEV feature:
> +	 *   CPUID Fn8000_001F[EAX]
> +	 *   - Bit 0 - Secure Memory Encryption support
> +	 *   - Bit 1 - Secure Encrypted Virtualization support
> +	 *   CPUID Fn8000_001F[EBX]
> +	 *   - Bits 5:0 - Pagetable bit position used to indicate encryption
> +	 */
> +	eax = 0x8000001f;
> +	ecx = 0;
> +	native_cpuid(&eax, &ebx, &ecx, &edx);
> +	/* Check whether SEV is supported */
> +	if (!(eax & BIT(1)))
> +		return;
> +
> +	/* Set the SME mask if this is an SEV guest. */
> +	sev_status   = rd_sev_status_msr();
> +

^ Superfluous newline.

> +	if (!(sev_status & MSR_AMD64_SEV_ENABLED))
> +		return;
> +
> +	sme_me_mask = BIT_ULL(ebx & 0x3f);
> +}
> -- 
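
For readers following the checks in sev_enable(), the decision logic can
be written as a pure function over the register values. This is an
illustrative userspace sketch, not the patch's code: the function and
macro names are hypothetical, the CPUID and MSR reads are replaced by
parameters, and treating bit 0 of the SEV status MSR as "SEV enabled" is
an assumption about what MSR_AMD64_SEV_ENABLED means.

```c
#include <stdint.h>

#define SKETCH_SEV_LEAF		0x8000001fu	/* SME/SEV CPUID leaf */
#define SKETCH_SEV_SUPPORTED	(1u << 1)	/* Fn8000_001F[EAX] bit 1 */
#define SKETCH_SEV_ENABLED	(1ull << 0)	/* assumed MSR status bit */

/*
 * max_ext_leaf: EAX from CPUID 0x80000000
 * eax, ebx:     EAX/EBX from CPUID 0x8000001f
 * sev_status:   value of the SEV status MSR
 *
 * Returns the encryption mask, or 0 if SEV is absent or not active.
 */
static uint64_t sketch_sme_mask(uint32_t max_ext_leaf, uint32_t eax,
				uint32_t ebx, uint64_t sev_status)
{
	if (max_ext_leaf < SKETCH_SEV_LEAF)
		return 0;

	if (!(eax & SKETCH_SEV_SUPPORTED))
		return 0;

	if (!(sev_status & SKETCH_SEV_ENABLED))
		return 0;

	/* EBX bits 5:0 give the page-table bit used for encryption. */
	return 1ull << (ebx & 0x3f);
}
```
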

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 01/40] x86/compressed/64: detect/setup SEV/SME features earlier in boot
  2021-12-10 19:12   ` Borislav Petkov
@ 2021-12-10 19:23     ` Dave Hansen
  2021-12-10 19:33       ` Borislav Petkov
  0 siblings, 1 reply; 183+ messages in thread
From: Dave Hansen @ 2021-12-10 19:23 UTC (permalink / raw)
  To: Borislav Petkov, Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	tony.luck, marcorr, sathyanarayanan.kuppuswamy

On 12/10/21 11:12 AM, Borislav Petkov wrote:
> On Fri, Dec 10, 2021 at 09:42:53AM -0600, Brijesh Singh wrote:
>> @@ -447,6 +446,23 @@ SYM_CODE_START(startup_64)
>>  	call	load_stage1_idt
>>  	popq	%rsi
>>  
>> +#ifdef CONFIG_AMD_MEM_ENCRYPT
> 
> I guess that ifdeffery is not needed.

I think sev_enable() is only defined in arch/x86/boot/compressed/sev.c,
which is compiled via:

        vmlinux-objs-$(CONFIG_AMD_MEM_ENCRYPT) += $(obj)/sev.o

So I think we either need the #ifdef or a stub for sev_enable()
somewhere else.

>> +	/*
>> +	 * Now that the stage1 interrupt handlers are set up, #VC exceptions from
>> +	 * CPUID instructions can be properly handled for SEV-ES guests.
>> +	 *
>> +	 * For SEV-SNP, the CPUID table also needs to be set up in advance of any
>> +	 * CPUID instructions being issued, so go ahead and do that now via
>> +	 * sev_enable(), which will also handle the rest of the SEV-related
>> +	 * detection/setup to ensure that has been done in advance of any dependent
>> +	 * code.
>> +	 */
>> +	pushq	%rsi
>> +	movq	%rsi, %rdi		/* real mode address */
>> +	call	sev_enable
>> +	popq	%rsi
>> +#endif


^ permalink raw reply	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 01/40] x86/compressed/64: detect/setup SEV/SME features earlier in boot
  2021-12-10 19:23     ` Dave Hansen
@ 2021-12-10 19:33       ` Borislav Petkov
  0 siblings, 0 replies; 183+ messages in thread
From: Borislav Petkov @ 2021-12-10 19:33 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Brijesh Singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On Fri, Dec 10, 2021 at 11:23:24AM -0800, Dave Hansen wrote:
> So I think we either need the #ifdef or a stub for sev_enable()
> somewhere else.

Yeah, there's a stub but in the C header so that won't work for asm
files. Forget what I said.
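
For context, the header stub pattern being discussed usually looks like
the sketch below. The names are hypothetical (a sketch_ prefix and a
SKETCH_CONFIG_AMD_MEM_ENCRYPT macro stand in for the real ones), and it
is not copied from the tree.

```c
/* Sketch of a header fragment; struct boot_params stays opaque here. */
struct boot_params;

#ifdef SKETCH_CONFIG_AMD_MEM_ENCRYPT
/* Real implementation lives in the optionally built object file. */
void sketch_sev_enable(struct boot_params *bp);
#else
/* No-op stub so C callers compile without their own #ifdef. */
static inline void sketch_sev_enable(struct boot_params *bp)
{
	(void)bp;
}
#endif

/* A C caller can use the function unconditionally. */
static int sketch_early_setup(struct boot_params *bp)
{
	sketch_sev_enable(bp);
	return 0;
}
```

A static inline stub never emits a symbol, so a "call sev_enable" in a
.S file has nothing to link against when the option is off. That is why
the assembly call site keeps its #ifdef.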

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 33/40] x86/compressed/64: add identity mapping for Confidential Computing blob
  2021-12-10 15:43 ` [PATCH v8 33/40] x86/compressed/64: add identity mapping for Confidential Computing blob Brijesh Singh
@ 2021-12-10 19:52   ` Dave Hansen
  2021-12-13 17:54     ` Michael Roth
  2022-01-25 13:48   ` Borislav Petkov
  1 sibling, 1 reply; 183+ messages in thread
From: Dave Hansen @ 2021-12-10 19:52 UTC (permalink / raw)
  To: Brijesh Singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On 12/10/21 7:43 AM, Brijesh Singh wrote:
> +static void sev_prep_identity_maps(void)
> +{
> +	/*
> +	 * The ConfidentialComputing blob is used very early in uncompressed
> +	 * kernel to find the in-memory cpuid table to handle cpuid
> +	 * instructions. Make sure an identity-mapping exists so it can be
> +	 * accessed after switchover.
> +	 */
> +	if (sev_snp_enabled()) {
> +		struct cc_blob_sev_info *cc_info =
> +			(void *)(unsigned long)boot_params->cc_blob_address;
> +
> +		add_identity_map((unsigned long)cc_info,
> +				 (unsigned long)cc_info + sizeof(*cc_info));
> +		add_identity_map((unsigned long)cc_info->cpuid_phys,
> +				 (unsigned long)cc_info->cpuid_phys + cc_info->cpuid_len);
> +	}

The casting here is pretty ugly.  Also, isn't ->cpuid_phys already a
u64?  What's the idea behind casting it?

I also have a sneaking suspicion that a single "unsigned long cc_blob"
could remove virtually all the casting.  Does this work?

	unsigned long cc_blob = boot_params->cc_blob_address;
	struct cc_blob_sev_info *cc_info;

	add_identity_map(cc_blob, cc_blob + sizeof(*cc_info));

	cc_info = (struct cc_blob_sev_info *)cc_blob;
	add_identity_map(cc_info->cpuid_phys,
			 cc_info->cpuid_phys + cc_info->cpuid_len);

^ permalink raw reply	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 00/40] Add AMD Secure Nested Paging (SEV-SNP) Guest Support
  2021-12-10 15:42 [PATCH v8 00/40] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (39 preceding siblings ...)
  2021-12-10 15:43 ` [PATCH v8 40/40] virt: sevguest: Add support to get extended report Brijesh Singh
@ 2021-12-10 20:17 ` Dave Hansen
  2021-12-10 20:20   ` Brijesh Singh
  40 siblings, 1 reply; 183+ messages in thread
From: Dave Hansen @ 2021-12-10 20:17 UTC (permalink / raw)
  To: Brijesh Singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On 12/10/21 7:42 AM, Brijesh Singh wrote:
> The series is based on tip/master
>   7f32a31b0a34 (origin/master, origin/HEAD) Merge branch into tip/master: 'core/entry'

FWIW, this is rather useless since tip/master gets rebased all the time.
 Also, being a merge commit, it's rather impossible to even infer which
commit this might have been near.

Personally, I like to take my series', tag them, then throw them out in
a public git tree somewhere.  That has two advantages.  First, it makes
it easy for a reviewer to look at the series as a whole in its applied
state.  Second, it makes it utterly trivial to figure out where the
series was based because the entire history is there.  The entire
history will be there even if you based it off some tip branch that got
rebased away since.

^ permalink raw reply	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 27/40] x86/boot: Add Confidential Computing type to setup_data
  2021-12-10 19:12   ` Dave Hansen
@ 2021-12-10 20:18     ` Brijesh Singh
  2021-12-10 20:30       ` Dave Hansen
  0 siblings, 1 reply; 183+ messages in thread
From: Brijesh Singh @ 2021-12-10 20:18 UTC (permalink / raw)
  To: Dave Hansen, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm
  Cc: brijesh.singh, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy


On 12/10/21 1:12 PM, Dave Hansen wrote:
> On 12/10/21 7:43 AM, Brijesh Singh wrote:
>> +/* AMD SEV Confidential computing blob structure */
>> +#define CC_BLOB_SEV_HDR_MAGIC	0x45444d41
>> +struct cc_blob_sev_info {
>> +	u32 magic;
>> +	u16 version;
>> +	u16 reserved;
>> +	u64 secrets_phys;
>> +	u32 secrets_len;
>> +	u64 cpuid_phys;
>> +	u32 cpuid_len;
>> +};
> This is an ABI structure rather than some purely kernel construct, right?


This is ABI between the guest BIOS and Guest OS. It is defined in the OVMF.

https://github.com/tianocore/edk2/blob/master/OvmfPkg/Include/Guid/ConfidentialComputingSevSnpBlob.h

SEV-SNP FW spec does not have it documented; it's up to the guest BIOS
on how it wants to communicate the Secrets and CPUID page location to
guest OS.


> I searched through all of the specs to which you linked in the cover
> letter.  I looked for "blob", "guid", the magic and part of the GUID
> itself trying to find where this is defined to see if the struct is correct.
>
> I couldn't find anything.
>
> Where is the spec for this blob?  How large is it?  Did you mean to
> leave a 4-byte hole after secrets_len and before cpuid_phys?

Yes, the length is never going to be > 4GB.



^ permalink raw reply	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 00/40] Add AMD Secure Nested Paging (SEV-SNP) Guest Support
  2021-12-10 20:17 ` [PATCH v8 00/40] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Dave Hansen
@ 2021-12-10 20:20   ` Brijesh Singh
  0 siblings, 0 replies; 183+ messages in thread
From: Brijesh Singh @ 2021-12-10 20:20 UTC (permalink / raw)
  To: Dave Hansen, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm
  Cc: brijesh.singh, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy


On 12/10/21 2:17 PM, Dave Hansen wrote:
> On 12/10/21 7:42 AM, Brijesh Singh wrote:
>> The series is based on tip/master
>>   7f32a31b0a34 (origin/master, origin/HEAD) Merge branch into tip/master: 'core/entry'
> FWIW, this is rather useless since tip/master gets rebased all the time.
>  Also, being a merge commit, it's rather impossible to even infer which
> commit this might have been near.
>
> Personally, I like to take my series', tag them, then throw them out in
> a public git tree somewhere.  That has two advantages.  First, it makes
> it easy for a reviewer to look at the series as a whole in its applied
> state.  Second, it makes it utterly trivial to figure out where the
> series was based because the entire history is there.  The entire
> history will be there even if you based it off some tip branch that got
> rebased away since.

I thought I mentioned this in the cover letter, but I missed including
the below information.

The full tree is available here:
https://github.com/AMDESE/linux/tree/sev-snp-v8

-Brijesh


^ permalink raw reply	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 27/40] x86/boot: Add Confidential Computing type to setup_data
  2021-12-10 20:18     ` Brijesh Singh
@ 2021-12-10 20:30       ` Dave Hansen
  2021-12-13 14:49         ` Brijesh Singh
  0 siblings, 1 reply; 183+ messages in thread
From: Dave Hansen @ 2021-12-10 20:30 UTC (permalink / raw)
  To: Brijesh Singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On 12/10/21 12:18 PM, Brijesh Singh wrote:
> On 12/10/21 1:12 PM, Dave Hansen wrote:
>> On 12/10/21 7:43 AM, Brijesh Singh wrote:
>>> +/* AMD SEV Confidential computing blob structure */
>>> +#define CC_BLOB_SEV_HDR_MAGIC	0x45444d41
>>> +struct cc_blob_sev_info {
>>> +	u32 magic;
>>> +	u16 version;
>>> +	u16 reserved;
>>> +	u64 secrets_phys;
>>> +	u32 secrets_len;
>>> +	u64 cpuid_phys;
>>> +	u32 cpuid_len;
>>> +};
>> This is an ABI structure rather than some purely kernel construct, right?
> 
> This is ABI between the guest BIOS and Guest OS. It is defined in the OVMF.
> 
> https://github.com/tianocore/edk2/blob/master/OvmfPkg/Include/Guid/ConfidentialComputingSevSnpBlob.h
> 
> SEV-SNP FW spec does not have it documented; it's up to the guest BIOS
> on how it wants to communicate the Secrets and CPUID page location to
> guest OS.

Well, no matter where it is defined, could we please make it a bit
easier for folks to find it in the future?

>> I searched through all of the specs to which you linked in the cover
>> letter.  I looked for "blob", "guid", the magic and part of the GUID
>> itself trying to find where this is defined to see if the struct is correct.
>>
>> I couldn't find anything.
>>
>> Where is the spec for this blob?  How large is it?  Did you mean to
>> leave a 4-byte hole after secrets_len and before cpuid_phys?
> Yes, the length is never going to be > 4GB.

I was more concerned that this structure could change sizes if it were
compiled on 32-bit versus 64-bit code.  For kernel ABIs, we try not to
do that.

Is this somehow OK when talking to firmware?  Or can a 32-bit OS and
64-bit firmware never interact?
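
One common way to keep such a layout identical across 32-bit and 64-bit
builds is to name every padding byte explicitly. The sketch below is an
assumption about how that could look, not the layout defined by OVMF or
by this patch: the rsvd1/rsvd2 fields and the struct name are
hypothetical, and __attribute__((packed)) is a GCC/Clang extension.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct sketch_cc_blob_fixed {
	uint32_t magic;
	uint16_t version;
	uint16_t reserved;
	uint64_t secrets_phys;
	uint32_t secrets_len;
	uint32_t rsvd1;		/* explicit: replaces the implicit hole */
	uint64_t cpuid_phys;
	uint32_t cpuid_len;
	uint32_t rsvd2;		/* explicit: replaces implicit tail padding */
} __attribute__((packed));

/* Fail the build if any ABI ever lays this out differently. */
static_assert(sizeof(struct sketch_cc_blob_fixed) == 40,
	      "blob size must not depend on the ABI");
static_assert(offsetof(struct sketch_cc_blob_fixed, cpuid_phys) == 24,
	      "cpuid_phys offset must not depend on the ABI");
```

With every byte accounted for, the packed attribute changes nothing on
x86-64 and prevents an ABI with different alignment rules from
reintroducing padding.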

^ permalink raw reply	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 39/40] virt: sevguest: Add support to derive key
  2021-12-10 15:43 ` [PATCH v8 39/40] virt: sevguest: Add support to derive key Brijesh Singh
@ 2021-12-10 22:27   ` Liam Merwick
  0 siblings, 0 replies; 183+ messages in thread
From: Liam Merwick @ 2021-12-10 22:27 UTC (permalink / raw)
  To: Brijesh Singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On 10/12/2021 15:43, Brijesh Singh wrote:
> The SNP_GET_DERIVED_KEY ioctl interface can be used by the SNP guest to
> ask the firmware to provide a key derived from a root key. The derived
> key may be used by the guest for any purposes it choose, such as a

nit: choose -> chooses

> sealing key or communicating with the external entities.
> 
> See SEV-SNP firmware spec for more information.
> 
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> ---
>   Documentation/virt/coco/sevguest.rst  | 17 ++++++++++
>   drivers/virt/coco/sevguest/sevguest.c | 45 +++++++++++++++++++++++++++
>   include/uapi/linux/sev-guest.h        | 17 ++++++++++
>   3 files changed, 79 insertions(+)
> 
> diff --git a/Documentation/virt/coco/sevguest.rst b/Documentation/virt/coco/sevguest.rst
> index 47ef3b0821d5..8c22d514d44f 100644
> --- a/Documentation/virt/coco/sevguest.rst
> +++ b/Documentation/virt/coco/sevguest.rst
> @@ -72,6 +72,23 @@ On success, the snp_report_resp.data will contains the report. The report
>   contain the format described in the SEV-SNP specification. See the SEV-SNP
>   specification for further details.
>   
> +2.2 SNP_GET_DERIVED_KEY
> +-----------------------
> +:Technology: sev-snp
> +:Type: guest ioctl
> +:Parameters (in): struct snp_derived_key_req
> +:Returns (out): struct snp_derived_key_req on success, -negative on error
> +

Does it return 'struct snp_derived_key_resp' on success?


> +The SNP_GET_DERIVED_KEY ioctl can be used to get a key derive from a root key.

nit: derive -> derived ?

Otherwise

Reviewed-by: Liam Merwick <liam.merwick@oracle.com>

> +The derived key can be used by the guest for any purpose, such as sealing keys
> +or communicating with external entities.
> +
> +The ioctl uses the SNP_GUEST_REQUEST (MSG_KEY_REQ) command provided by the
> +SEV-SNP firmware to derive the key. See SEV-SNP specification for further details
> +on the various fields passed in the key derivation request.
> +
> +On success, the snp_derived_key_resp.data contains the derived key value. See
> +the SEV-SNP specification for further details.
>   
>   Reference
>   ---------
> diff --git a/drivers/virt/coco/sevguest/sevguest.c b/drivers/virt/coco/sevguest/sevguest.c
> index b3b080c9b2d6..d8dcafc32e11 100644
> --- a/drivers/virt/coco/sevguest/sevguest.c
> +++ b/drivers/virt/coco/sevguest/sevguest.c
> @@ -391,6 +391,48 @@ static int get_report(struct snp_guest_dev *snp_dev, struct snp_guest_request_io
>   	return rc;
>   }
>   
> +static int get_derived_key(struct snp_guest_dev *snp_dev, struct snp_guest_request_ioctl *arg)
> +{
> +	struct snp_guest_crypto *crypto = snp_dev->crypto;
> +	struct snp_derived_key_resp resp = {0};
> +	struct snp_derived_key_req req;
> +	int rc, resp_len;
> +	u8 buf[64+16]; /* Response data is 64 bytes and max authsize for GCM is 16 bytes */
> +
> +	if (!arg->req_data || !arg->resp_data)
> +		return -EINVAL;
> +
> +	/* Copy the request payload from userspace */
> +	if (copy_from_user(&req, (void __user *)arg->req_data, sizeof(req)))
> +		return -EFAULT;
> +
> +	/*
> +	 * The intermediate response buffer is used while decrypting the
> +	 * response payload. Make sure that it has enough space to cover the
> +	 * authtag.
> +	 */
> +	resp_len = sizeof(resp.data) + crypto->a_len;
> +	if (sizeof(buf) < resp_len)
> +		return -ENOMEM;
> +
> +	/* Issue the command to get the attestation report */
> +	rc = handle_guest_request(snp_dev, SVM_VMGEXIT_GUEST_REQUEST, arg->msg_version,
> +				  SNP_MSG_KEY_REQ, &req, sizeof(req), buf, resp_len,
> +				  &arg->fw_err);
> +	if (rc)
> +		goto e_free;
> +
> +	/* Copy the response payload to userspace */
> +	memcpy(resp.data, buf, sizeof(resp.data));
> +	if (copy_to_user((void __user *)arg->resp_data, &resp, sizeof(resp)))
> +		rc = -EFAULT;
> +
> +e_free:
> +	memzero_explicit(buf, sizeof(buf));
> +	memzero_explicit(&resp, sizeof(resp));
> +	return rc;
> +}
> +
>   static long snp_guest_ioctl(struct file *file, unsigned int ioctl, unsigned long arg)
>   {
>   	struct snp_guest_dev *snp_dev = to_snp_dev(file);
> @@ -420,6 +462,9 @@ static long snp_guest_ioctl(struct file *file, unsigned int ioctl, unsigned long
>   	case SNP_GET_REPORT:
>   		ret = get_report(snp_dev, &input);
>   		break;
> +	case SNP_GET_DERIVED_KEY:
> +		ret = get_derived_key(snp_dev, &input);
> +		break;
>   	default:
>   		break;
>   	}
> diff --git a/include/uapi/linux/sev-guest.h b/include/uapi/linux/sev-guest.h
> index 0bfc162da465..ce595539e00c 100644
> --- a/include/uapi/linux/sev-guest.h
> +++ b/include/uapi/linux/sev-guest.h
> @@ -27,6 +27,20 @@ struct snp_report_resp {
>   	__u8 data[4000];
>   };
>   
> +struct snp_derived_key_req {
> +	__u32 root_key_select;
> +	__u32 rsvd;
> +	__u64 guest_field_select;
> +	__u32 vmpl;
> +	__u32 guest_svn;
> +	__u64 tcb_version;
> +};
> +
> +struct snp_derived_key_resp {
> +	/* response data, see SEV-SNP spec for the format */
> +	__u8 data[64];
> +};
> +
>   struct snp_guest_request_ioctl {
>   	/* message version number (must be non-zero) */
>   	__u8 msg_version;
> @@ -44,4 +58,7 @@ struct snp_guest_request_ioctl {
>   /* Get SNP attestation report */
>   #define SNP_GET_REPORT _IOWR(SNP_GUEST_REQ_IOC_TYPE, 0x0, struct snp_guest_request_ioctl)
>   
> +/* Get a derived key from the root */
> +#define SNP_GET_DERIVED_KEY _IOWR(SNP_GUEST_REQ_IOC_TYPE, 0x1, struct snp_guest_request_ioctl)
> +
>   #endif /* __UAPI_LINUX_SEV_GUEST_H_ */
> 
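
The buffer sizing in get_derived_key() above reduces to one check:
64 bytes of key data plus the authentication tag must fit in the
80-byte intermediate buffer. A standalone sketch of that check follows.
The names are hypothetical and only the two sizes come from the quoted
code.

```c
#include <errno.h>
#include <stddef.h>

#define SKETCH_KEY_DATA_LEN	64	/* sizeof(resp.data) in the quoted struct */
#define SKETCH_MAX_AUTHTAG_LEN	16	/* max GCM authsize, per the quoted comment */
#define SKETCH_BUF_LEN		(SKETCH_KEY_DATA_LEN + SKETCH_MAX_AUTHTAG_LEN)

/*
 * Returns the number of bytes needed for payload plus authtag, or
 * -ENOMEM if that would exceed the intermediate buffer.
 */
static long sketch_key_resp_len(size_t authtag_len)
{
	if (authtag_len > SKETCH_BUF_LEN - SKETCH_KEY_DATA_LEN)
		return -ENOMEM;

	return (long)(SKETCH_KEY_DATA_LEN + authtag_len);
}
```

Comparing the tag length against the remaining space, rather than
comparing a signed sum against sizeof(buf), avoids a mixed
signed/unsigned comparison.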


^ permalink raw reply	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 27/40] x86/boot: Add Confidential Computing type to setup_data
  2021-12-10 20:30       ` Dave Hansen
@ 2021-12-13 14:49         ` Brijesh Singh
  2021-12-13 15:08           ` Dave Hansen
  0 siblings, 1 reply; 183+ messages in thread
From: Brijesh Singh @ 2021-12-13 14:49 UTC (permalink / raw)
  To: Dave Hansen, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm
  Cc: brijesh.singh, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy



On 12/10/21 2:30 PM, Dave Hansen wrote:
> On 12/10/21 12:18 PM, Brijesh Singh wrote:
>> On 12/10/21 1:12 PM, Dave Hansen wrote:
>>> On 12/10/21 7:43 AM, Brijesh Singh wrote:
>>>> +/* AMD SEV Confidential computing blob structure */
>>>> +#define CC_BLOB_SEV_HDR_MAGIC	0x45444d41
>>>> +struct cc_blob_sev_info {
>>>> +	u32 magic;
>>>> +	u16 version;
>>>> +	u16 reserved;
>>>> +	u64 secrets_phys;
>>>> +	u32 secrets_len;
>>>> +	u64 cpuid_phys;
>>>> +	u32 cpuid_len;
>>>> +};
>>> This is an ABI structure rather than some purely kernel construct, right?
>>
>> This is ABI between the guest BIOS and Guest OS. It is defined in the OVMF.
>>
>> https://github.com/tianocore/edk2/blob/master/OvmfPkg/Include/Guid/ConfidentialComputingSevSnpBlob.h
>>
>> SEV-SNP FW spec does not have it documented; it's up to the guest BIOS
>> on how it wants to communicate the Secrets and CPUID page location to
>> guest OS.
> 
> Well, no matter where it is defined, could we please make it a bit
> easier for folks to find it in the future?
> 

Noted, I will add a comment so that readers can find it easily. 
Additionally, I will create a doc and get it published on 
developer.amd.com/sev so that information is documented outside the 
source code files.

>>> I searched through all of the specs to which you linked in the cover
>>> letter.  I looked for "blob", "guid", the magic and part of the GUID
>>> itself trying to find where this is defined to see if the struct is correct.
>>>
>>> I couldn't find anything.
>>>
>>> Where is the spec for this blob?  How large is it?  Did you mean to
>>> leave a 4-byte hole after secrets_len and before cpuid_phys?
>> Yes, the length is never going to be > 4GB.
> 
> I was more concerned that this structure could change sizes if it were
> compiled on 32-bit versus 64-bit code.  For kernel ABIs, we try not to
> do that.
> 
> Is this somehow OK when talking to firmware?  Or can a 32-bit OS and
> 64-bit firmware never interact?
> 

For SNP, both the firmware and OS need to be 64-bit. IIRC, neither Linux nor
OVMF enables memory encryption for 32-bit builds.

thanks

* Re: [PATCH v8 27/40] x86/boot: Add Confidential Computing type to setup_data
  2021-12-13 14:49         ` Brijesh Singh
@ 2021-12-13 15:08           ` Dave Hansen
  2021-12-13 15:55             ` Brijesh Singh
  2022-01-07 11:54             ` Borislav Petkov
  0 siblings, 2 replies; 183+ messages in thread
From: Dave Hansen @ 2021-12-13 15:08 UTC (permalink / raw)
  To: Brijesh Singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On 12/13/21 6:49 AM, Brijesh Singh wrote:
>> I was more concerned that this structure could change sizes if it were
>> compiled on 32-bit versus 64-bit code.  For kernel ABIs, we try not to
>> do that.
>>
>> Is this somehow OK when talking to firmware?  Or can a 32-bit OS and
>> 64-bit firmware never interact?
> 
> For SNP, both the firmware and OS need to be 64-bit. IIRC, neither Linux nor
> OVMF enables memory encryption for 32-bit builds.

Could you please make the structure's size invariant?  That's great if
there's no problem in today's implementation, but it's best not to leave
little land mines like this around.  Let's say someone copies your code
as an example of something that interacts with a firmware table a few
years or months down the road.
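
One way to make the size invariant, sketched here as a userspace model rather
than the actual patch: spell out every hole as an explicit reserved field and
pack the struct, then let the compiler check the layout. The struct names and
the rsvd1/rsvd2 field names are hypothetical, and the stdint types stand in
for the kernel's u16/u32/u64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout as posted: the compiler decides the padding around the u64 fields. */
struct cc_blob_as_posted {
	uint32_t magic;
	uint16_t version;
	uint16_t reserved;
	uint64_t secrets_phys;
	uint32_t secrets_len;
	uint64_t cpuid_phys;	/* typically offset 24 on x86-64, 20 on i386 */
	uint32_t cpuid_len;
};

/*
 * Size-invariant variant: each hole is an explicit field (rsvd1 and rsvd2
 * are hypothetical names) and the struct is packed, so the ABI cannot add
 * or remove padding.
 */
struct cc_blob_fixed {
	uint32_t magic;
	uint16_t version;
	uint16_t reserved;
	uint64_t secrets_phys;
	uint32_t secrets_len;
	uint32_t rsvd1;
	uint64_t cpuid_phys;
	uint32_t cpuid_len;
	uint32_t rsvd2;
} __attribute__((packed));

/* Compile-time checks: the build fails if the layout ever drifts. */
static_assert(sizeof(struct cc_blob_fixed) == 40, "blob size must not vary");
static_assert(offsetof(struct cc_blob_fixed, secrets_phys) == 8, "secrets_phys");
static_assert(offsetof(struct cc_blob_fixed, cpuid_phys) == 24, "cpuid_phys");
static_assert(offsetof(struct cc_blob_fixed, cpuid_len) == 32, "cpuid_len");
```

On x86-64 the explicit fields land where the implicit padding of the posted
struct already sits, so a change along these lines should not alter what a
64-bit firmware and a 64-bit OS see today.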

* Re: [PATCH v8 24/40] x86/compressed/acpi: move EFI system table lookup to helper
  2021-12-10 18:54   ` Dave Hansen
@ 2021-12-13 15:47     ` Michael Roth
  2021-12-13 16:21       ` Dave Hansen
  2022-01-11  8:59       ` Chao Fan
  0 siblings, 2 replies; 183+ messages in thread
From: Michael Roth @ 2021-12-13 15:47 UTC (permalink / raw)
  To: Dave Hansen, fanc.fnst, j-nomura, bp
  Cc: Brijesh Singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum,
	Borislav Petkov, Vlastimil Babka, Kirill A . Shutemov,
	Andi Kleen, Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On Fri, Dec 10, 2021 at 10:54:35AM -0800, Dave Hansen wrote:
> On 12/10/21 7:43 AM, Brijesh Singh wrote:
> > +/*
> > + * Helpers for early access to EFI configuration table
> > + *
> > + * Copyright (C) 2021 Advanced Micro Devices, Inc.
> > + *
> > + * Author: Michael Roth <michael.roth@amd.com>
> > + */
> 
> It doesn't seem quite right to slap this copyright on a file that's full
> of content that came from other files.  It would be one thing if
> arch/x86/boot/compressed/acpi.c had this banner in it already.

Yah, acpi.c didn't have any copyright banner so I used my 'default'
template for new files here to cover any additions, but that does give
a misleading impression.

I'm not sure how this is normally addressed, but I'm planning on just
continuing the acpi.c tradition of *not* adding copyright notices for new
code, and simply documenting that the contents of the file are mostly moved
from acpi.c.

> Also, a
> bunch of the lines in this file seem to come from:
> 
> 	commit 33f0df8d843deb9ec24116dcd79a40ca0ea8e8a9
> 	Author: Chao Fan <fanc.fnst@cn.fujitsu.com>
> 	Date:   Wed Jan 23 19:08:46 2019 +0800

AFAICT the full author list for the changes in question is, in
alphabetical order:

  Chao Fan <fanc.fnst@cn.fujitsu.com>
  Junichi Nomura <j-nomura@ce.jp.nec.com>
  Borislav Petkov <bp@suse.de>

Chao, Junichi, Borislav,

If you would like to be listed as an author in efi.c (which is mainly just a
movement of EFI config table parsing code from acpi.c into re-usable helper
functions in efi.c), please let me know and I'll add you.

Otherwise, I'll plan on adopting the acpi.c precedent for this as well, which
is to not list individual authors, since it doesn't seem right to add Author
fields retroactively without their permission.

* Re: [PATCH v8 27/40] x86/boot: Add Confidential Computing type to setup_data
  2021-12-13 15:08           ` Dave Hansen
@ 2021-12-13 15:55             ` Brijesh Singh
  2022-01-07 11:54             ` Borislav Petkov
  1 sibling, 0 replies; 183+ messages in thread
From: Brijesh Singh @ 2021-12-13 15:55 UTC (permalink / raw)
  To: Dave Hansen, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm
  Cc: brijesh.singh, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy



On 12/13/21 9:08 AM, Dave Hansen wrote:
> On 12/13/21 6:49 AM, Brijesh Singh wrote:
>>> I was more concerned that this structure could change sizes if it were
>>> compiled on 32-bit versus 64-bit code.  For kernel ABIs, we try not to
>>> do that.
>>>
>>> Is this somehow OK when talking to firmware?  Or can a 32-bit OS and
>>> 64-bit firmware never interact?
>>
>> For SNP, both the firmware and OS need to be 64-bit. IIRC, neither Linux nor
>> OVMF enables memory encryption for 32-bit builds.
> 
> Could you please make the structure's size invariant?  

Ack. I will make the required changes.

> That's great if
> there's no problem in today's implementation, but it's best not to leave
> little land mines like this around.  Let's say someone copies your code
> as an example of something that interacts with a firmware table a few
> years or months down the road.
> 

* Re: [PATCH v8 24/40] x86/compressed/acpi: move EFI system table lookup to helper
  2021-12-13 15:47     ` Michael Roth
@ 2021-12-13 16:21       ` Dave Hansen
  2021-12-13 18:00         ` Michael Roth
  2022-01-11  8:59       ` Chao Fan
  1 sibling, 1 reply; 183+ messages in thread
From: Dave Hansen @ 2021-12-13 16:21 UTC (permalink / raw)
  To: Michael Roth, fanc.fnst, j-nomura, bp
  Cc: Brijesh Singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum,
	Borislav Petkov, Vlastimil Babka, Kirill A . Shutemov,
	Andi Kleen, Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On 12/13/21 7:47 AM, Michael Roth wrote:
> Otherwise, I'll plan on adopting the acpi.c precedent for this as well, which
> is to not list individual authors, since it doesn't seem right to add Author
> fields retroactively without their permission.

That's fine with me, especially if it follows precedent in the subsystem.

Could you also please take a quick scan over the rest of the series to
make sure there are no more of these?

* Re: [PATCH v8 33/40] x86/compressed/64: add identity mapping for Confidential Computing blob
  2021-12-10 19:52   ` Dave Hansen
@ 2021-12-13 17:54     ` Michael Roth
  0 siblings, 0 replies; 183+ messages in thread
From: Michael Roth @ 2021-12-13 17:54 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Brijesh Singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum,
	Borislav Petkov, Vlastimil Babka, Kirill A . Shutemov,
	Andi Kleen, Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On Fri, Dec 10, 2021 at 11:52:28AM -0800, Dave Hansen wrote:
> On 12/10/21 7:43 AM, Brijesh Singh wrote:
> > +static void sev_prep_identity_maps(void)
> > +{
> > +	/*
> > +	 * The ConfidentialComputing blob is used very early in uncompressed
> > +	 * kernel to find the in-memory cpuid table to handle cpuid
> > +	 * instructions. Make sure an identity-mapping exists so it can be
> > +	 * accessed after switchover.
> > +	 */
> > +	if (sev_snp_enabled()) {
> > +		struct cc_blob_sev_info *cc_info =
> > +			(void *)(unsigned long)boot_params->cc_blob_address;
> > +
> > +		add_identity_map((unsigned long)cc_info,
> > +				 (unsigned long)cc_info + sizeof(*cc_info));
> > +		add_identity_map((unsigned long)cc_info->cpuid_phys,
> > +				 (unsigned long)cc_info->cpuid_phys + cc_info->cpuid_len);
> > +	}
> 
> The casting here is pretty ugly.  Also, isn't ->cpuid_phys already a
> u64?  What's the idea behind casting it?
> 
> I also have a sneaking suspicion that a single "unsigned long cc_blob"
> could remove virtually all the casting.  Does this work?
> 
> 	unsigned long cc_blob = boot_params->cc_blob_address;
> 	struct cc_blob_sev_info *cc_info;
> 
> 	add_identity_map(cc_blob, cc_blob + sizeof(*cc_info));
> 
> 	cc_info = (struct cc_blob_sev_info *)cc_blob;
> 	add_identity_map(cc_info->cpuid_phys,
> 			 cc_info->cpuid_phys + cc_info->cpuid_len);

Yes, the cc_info->cpuid_phys cast is not needed, and your suggested
implementation is clearer and compiles/runs without any issues. I'll implement
it this way for the next spin. Thanks!
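
For anyone wanting to try the suggested shape outside the boot code, here is a
minimal userspace sketch. It makes several assumptions: add_identity_map() is
a stub that only records the requested ranges, the struct is trimmed to the
two fields used here, and the blob address is passed in as a parameter
instead of being read from boot_params.

```c
#include <assert.h>
#include <stdint.h>

/* Trimmed-down stand-in for the blob: only the fields this code touches. */
struct cc_blob_sev_info {
	uint64_t cpuid_phys;
	uint32_t cpuid_len;
};

/* Stub: record each start/end request instead of touching page tables. */
#define MAX_MAPS 4
static struct { unsigned long start, end; } maps[MAX_MAPS];
static int nr_maps;

static void add_identity_map(unsigned long start, unsigned long end)
{
	if (nr_maps < MAX_MAPS) {
		maps[nr_maps].start = start;
		maps[nr_maps].end = end;
		nr_maps++;
	}
}

/* cc_blob stands in for boot_params->cc_blob_address. */
static void sev_prep_identity_maps(unsigned long cc_blob)
{
	struct cc_blob_sev_info *cc_info;

	/* Map the blob itself before dereferencing anything inside it. */
	add_identity_map(cc_blob, cc_blob + sizeof(*cc_info));

	/* A single cast, then the u64 fields are used without further casts. */
	cc_info = (struct cc_blob_sev_info *)cc_blob;
	add_identity_map(cc_info->cpuid_phys,
			 cc_info->cpuid_phys + cc_info->cpuid_len);
}
```

The only cast left is the one that turns the address into a struct pointer;
the u64 fields convert implicitly to the unsigned long parameters on 64-bit.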

* Re: [PATCH v8 24/40] x86/compressed/acpi: move EFI system table lookup to helper
  2021-12-13 16:21       ` Dave Hansen
@ 2021-12-13 18:00         ` Michael Roth
  0 siblings, 0 replies; 183+ messages in thread
From: Michael Roth @ 2021-12-13 18:00 UTC (permalink / raw)
  To: Dave Hansen
  Cc: fanc.fnst, j-nomura, bp, Brijesh Singh, x86, linux-kernel, kvm,
	linux-efi, platform-driver-x86, linux-coco, linux-mm,
	Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	tony.luck, marcorr, sathyanarayanan.kuppuswamy

On Mon, Dec 13, 2021 at 08:21:44AM -0800, Dave Hansen wrote:
> On 12/13/21 7:47 AM, Michael Roth wrote:
> > Otherwise, I'll plan on adopting the acpi.c precedent for this as well, which
> > is to not list individual authors, since it doesn't seem right to add Author
> > fields retroactively without their permission.
> 
> That's fine with me, especially if it follows precedent in the subsystem.
> 
> Could you also please take a quick scan over the rest of the series to
> make sure there are no more of these?

Outside of the guest driver there's only one other new file addition in
the series:

  arch/x86/include/asm/cpuid.h

where I moved some code out of arch/x86/kvm/cpuid.c in similar fashion, so
cpuid.h should probably inherit cpuid.c's copyright banner. I'll make that
change for the next spin as well. Thanks for the catches.

* Re: [PATCH v8 01/40] x86/compressed/64: detect/setup SEV/SME features earlier in boot
  2021-12-10 15:42 ` [PATCH v8 01/40] x86/compressed/64: detect/setup SEV/SME features earlier in boot Brijesh Singh
  2021-12-10 18:47   ` Dave Hansen
  2021-12-10 19:12   ` Borislav Petkov
@ 2021-12-13 19:09   ` Venu Busireddy
  2021-12-13 19:17     ` Borislav Petkov
  2 siblings, 1 reply; 183+ messages in thread
From: Venu Busireddy @ 2021-12-13 19:09 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On 2021-12-10 09:42:53 -0600, Brijesh Singh wrote:
> From: Michael Roth <michael.roth@amd.com>
> 
> With upcoming SEV-SNP support, SEV-related features need to be
> initialized earlier in boot, at the same point the initial #VC handler
> is set up, so that the SEV-SNP CPUID table can be utilized during the
> initial feature checks. Also, SEV-SNP feature detection will rely on
> EFI helper functions to scan the EFI config table for the Confidential
> Computing blob, and so would need to be implemented at least partially
> in C.
> 
> Currently set_sev_encryption_mask() is used to initialize the
> sev_status and sme_me_mask globals that advertise what SEV/SME features
> are available in a guest. Rename it to sev_enable() to better reflect
> that (SME is only enabled in the case of SEV guests in the
> boot/compressed kernel), and move it to just after the stage1 #VC
> handler is set up so that it can be used to initialize SEV-SNP as well
> in future patches.
> 
> While at it, re-implement it as C code so that all SEV feature
> detection can be better consolidated with upcoming SEV-SNP feature
> detection, which will also be in C.
> 
> The 32-bit entry path remains unchanged, as it never relied on the
> set_sev_encryption_mask() initialization to begin with, possibly due to
> the normal rva() helper for accessing globals only being usable by code
> in .head.text. Either way, 32-bit entry for SEV-SNP would likely only
> be supported for non-EFI boot paths, and so wouldn't rely on existing
> EFI helper functions, and so could be handled by a separate/simpler
> 32-bit initializer in the future if needed.
> 
> Signed-off-by: Michael Roth <michael.roth@amd.com>
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> ---
>  arch/x86/boot/compressed/head_64.S     | 32 ++++++++++--------
>  arch/x86/boot/compressed/mem_encrypt.S | 36 ---------------------
>  arch/x86/boot/compressed/misc.h        |  4 +--
>  arch/x86/boot/compressed/sev.c         | 45 ++++++++++++++++++++++++++
>  4 files changed, 66 insertions(+), 51 deletions(-)
> 
> diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
> index 572c535cf45b..20b174adca51 100644
> --- a/arch/x86/boot/compressed/head_64.S
> +++ b/arch/x86/boot/compressed/head_64.S
> @@ -191,9 +191,8 @@ SYM_FUNC_START(startup_32)
>  	/*
>  	 * Mark SEV as active in sev_status so that startup32_check_sev_cbit()
>  	 * will do a check. The sev_status memory will be fully initialized
> -	 * with the contents of MSR_AMD_SEV_STATUS later in
> -	 * set_sev_encryption_mask(). For now it is sufficient to know that SEV
> -	 * is active.
> +	 * with the contents of MSR_AMD_SEV_STATUS later via sev_enable(). For
> +	 * now it is sufficient to know that SEV is active.
>  	 */
>  	movl	$1, rva(sev_status)(%ebp)
>  1:
> @@ -447,6 +446,23 @@ SYM_CODE_START(startup_64)
>  	call	load_stage1_idt
>  	popq	%rsi
>  
> +#ifdef CONFIG_AMD_MEM_ENCRYPT
> +	/*
> +	 * Now that the stage1 interrupt handlers are set up, #VC exceptions from
> +	 * CPUID instructions can be properly handled for SEV-ES guests.
> +	 *
> +	 * For SEV-SNP, the CPUID table also needs to be set up in advance of any
> +	 * CPUID instructions being issued, so go ahead and do that now via
> +	 * sev_enable(), which will also handle the rest of the SEV-related
> +	 * detection/setup to ensure that has been done in advance of any dependent
> +	 * code.
> +	 */
> +	pushq	%rsi
> +	movq	%rsi, %rdi		/* real mode address */
> +	call	sev_enable
> +	popq	%rsi
> +#endif
> +
>  	/*
>  	 * paging_prepare() sets up the trampoline and checks if we need to
>  	 * enable 5-level paging.
> @@ -559,17 +575,7 @@ SYM_FUNC_START_LOCAL_NOALIGN(.Lrelocated)
>  	shrq	$3, %rcx
>  	rep	stosq
>  
> -/*
> - * If running as an SEV guest, the encryption mask is required in the
> - * page-table setup code below. When the guest also has SEV-ES enabled
> - * set_sev_encryption_mask() will cause #VC exceptions, but the stage2
> - * handler can't map its GHCB because the page-table is not set up yet.
> - * So set up the encryption mask here while still on the stage1 #VC
> - * handler. Then load stage2 IDT and switch to the kernel's own
> - * page-table.
> - */
>  	pushq	%rsi
> -	call	set_sev_encryption_mask
>  	call	load_stage2_idt
>  
>  	/* Pass boot_params to initialize_identity_maps() */
> diff --git a/arch/x86/boot/compressed/mem_encrypt.S b/arch/x86/boot/compressed/mem_encrypt.S
> index c1e81a848b2a..311d40f35a4b 100644
> --- a/arch/x86/boot/compressed/mem_encrypt.S
> +++ b/arch/x86/boot/compressed/mem_encrypt.S
> @@ -187,42 +187,6 @@ SYM_CODE_END(startup32_vc_handler)
>  	.code64
>  
>  #include "../../kernel/sev_verify_cbit.S"
> -SYM_FUNC_START(set_sev_encryption_mask)
> -#ifdef CONFIG_AMD_MEM_ENCRYPT
> -	push	%rbp
> -	push	%rdx
> -
> -	movq	%rsp, %rbp		/* Save current stack pointer */
> -
> -	call	get_sev_encryption_bit	/* Get the encryption bit position */
> -	testl	%eax, %eax
> -	jz	.Lno_sev_mask
> -
> -	bts	%rax, sme_me_mask(%rip)	/* Create the encryption mask */
> -
> -	/*
> -	 * Read MSR_AMD64_SEV again and store it to sev_status. Can't do this in
> -	 * get_sev_encryption_bit() because this function is 32-bit code and
> -	 * shared between 64-bit and 32-bit boot path.
> -	 */
> -	movl	$MSR_AMD64_SEV, %ecx	/* Read the SEV MSR */
> -	rdmsr
> -
> -	/* Store MSR value in sev_status */
> -	shlq	$32, %rdx
> -	orq	%rdx, %rax
> -	movq	%rax, sev_status(%rip)
> -
> -.Lno_sev_mask:
> -	movq	%rbp, %rsp		/* Restore original stack pointer */
> -
> -	pop	%rdx
> -	pop	%rbp
> -#endif
> -
> -	xor	%rax, %rax
> -	ret
> -SYM_FUNC_END(set_sev_encryption_mask)
>  
>  	.data
>  
> diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
> index 16ed360b6692..23e0e395084a 100644
> --- a/arch/x86/boot/compressed/misc.h
> +++ b/arch/x86/boot/compressed/misc.h
> @@ -120,12 +120,12 @@ static inline void console_init(void)
>  { }
>  #endif
>  
> -void set_sev_encryption_mask(void);
> -
>  #ifdef CONFIG_AMD_MEM_ENCRYPT
> +void sev_enable(struct boot_params *bp);
>  void sev_es_shutdown_ghcb(void);
>  extern bool sev_es_check_ghcb_fault(unsigned long address);
>  #else
> +static inline void sev_enable(struct boot_params *bp) { }
>  static inline void sev_es_shutdown_ghcb(void) { }
>  static inline bool sev_es_check_ghcb_fault(unsigned long address)
>  {
> diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
> index 28bcf04c022e..8eebdf589a90 100644
> --- a/arch/x86/boot/compressed/sev.c
> +++ b/arch/x86/boot/compressed/sev.c
> @@ -204,3 +204,48 @@ void do_boot_stage2_vc(struct pt_regs *regs, unsigned long exit_code)
>  	else if (result != ES_RETRY)
>  		sev_es_terminate(GHCB_SEV_ES_GEN_REQ);
>  }
> +
> +static inline u64 rd_sev_status_msr(void)
> +{
> +	unsigned long low, high;
> +
> +	asm volatile("rdmsr" : "=a" (low), "=d" (high) :
> +			"c" (MSR_AMD64_SEV));
> +
> +	return ((high << 32) | low);
> +}
> +
> +void sev_enable(struct boot_params *bp)
> +{
> +	unsigned int eax, ebx, ecx, edx;
> +
> +	/* Check for the SME/SEV support leaf */
> +	eax = 0x80000000;
> +	ecx = 0;
> +	native_cpuid(&eax, &ebx, &ecx, &edx);
> +	if (eax < 0x8000001f)
> +		return;
> +
> +	/*
> +	 * Check for the SME/SEV feature:
> +	 *   CPUID Fn8000_001F[EAX]
> +	 *   - Bit 0 - Secure Memory Encryption support
> +	 *   - Bit 1 - Secure Encrypted Virtualization support
> +	 *   CPUID Fn8000_001F[EBX]
> +	 *   - Bits 5:0 - Pagetable bit position used to indicate encryption
> +	 */
> +	eax = 0x8000001f;
> +	ecx = 0;
> +	native_cpuid(&eax, &ebx, &ecx, &edx);
> +	/* Check whether SEV is supported */
> +	if (!(eax & BIT(1)))
> +		return;
> +
> +	/* Set the SME mask if this is an SEV guest. */
> +	sev_status   = rd_sev_status_msr();
> +
> +	if (!(sev_status & MSR_AMD64_SEV_ENABLED))
> +		return;
> +
> +	sme_me_mask = BIT_ULL(ebx & 0x3f);

I made this suggestion while reviewing v7 too, but it appears that it
fell through the cracks. Most of the code in sev_enable() is duplicated
from sme_enable(). Wouldn't it be better to put all that common code
in a different function, and call that function from sme_enable()
and sev_enable()?
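
For readers following the quoted hunk, the sequence of checks in sev_enable()
can be modelled as a pure function. This is only a userspace sketch of the
decision logic, not a proposal for a shared helper: the CPUID and MSR values
are passed in as plain arguments, all names below are hypothetical, and the
enable bit is assumed to be bit 0 of the SEV status MSR.

```c
#include <assert.h>
#include <stdint.h>

#define SEV_LEAF		0x8000001fu
#define SEV_SUPPORTED		(1u << 1)	/* Fn8000_001F[EAX] bit 1 */
#define SEV_STATUS_ENABLED	(1ull << 0)	/* assumed: MSR_AMD64_SEV_ENABLED */

/* Recombine the EDX:EAX halves returned by rdmsr into one 64-bit value. */
static uint64_t msr_value(uint32_t low, uint32_t high)
{
	return ((uint64_t)high << 32) | low;
}

/*
 * Model of the checks in the quoted sev_enable(): returns the encryption
 * mask, or 0 when any of the three checks fails.
 */
static uint64_t sev_me_mask(uint32_t max_ext_leaf, uint32_t eax_1f,
			    uint32_t ebx_1f, uint64_t sev_status)
{
	/* The SME/SEV leaf must exist at all. */
	if (max_ext_leaf < SEV_LEAF)
		return 0;

	/* The CPU must advertise SEV support. */
	if (!(eax_1f & SEV_SUPPORTED))
		return 0;

	/* SEV must actually be active for this guest. */
	if (!(sev_status & SEV_STATUS_ENABLED))
		return 0;

	/* EBX bits 5:0 give the page-table bit used for encryption. */
	return 1ull << (ebx_1f & 0x3f);
}
```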

Venu


* Re: [PATCH v8 01/40] x86/compressed/64: detect/setup SEV/SME features earlier in boot
  2021-12-13 19:09   ` Venu Busireddy
@ 2021-12-13 19:17     ` Borislav Petkov
  2021-12-14 17:46       ` Venu Busireddy
  0 siblings, 1 reply; 183+ messages in thread
From: Borislav Petkov @ 2021-12-13 19:17 UTC (permalink / raw)
  To: Venu Busireddy
  Cc: Brijesh Singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On Mon, Dec 13, 2021 at 01:09:19PM -0600, Venu Busireddy wrote:
> I made this suggestion while reviewing v7 too, but it appears that it
> fell through the cracks. Most of the code in sev_enable() is duplicated
> from sme_enable(). Wouldn't it be better to put all that common code
> in a different function, and call that function from sme_enable()
> and sev_enable()?

How about you look where both functions are defined? Which kernel stages?

And please trim your mails when you reply.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

* Re: [PATCH v8 02/40] x86/sev: detect/setup SEV/SME features earlier in boot
  2021-12-10 15:42 ` [PATCH v8 02/40] x86/sev: " Brijesh Singh
@ 2021-12-13 22:36   ` Venu Busireddy
  0 siblings, 0 replies; 183+ messages in thread
From: Venu Busireddy @ 2021-12-13 22:36 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On 2021-12-10 09:42:54 -0600, Brijesh Singh wrote:
> From: Michael Roth <michael.roth@amd.com>
> 
> sme_enable() handles feature detection for both SEV and SME. Future
> patches will also use it for SEV-SNP feature detection/setup, which
> will need to be done immediately after the first #VC handler is set up.
> Move it now in preparation.
> 
> Signed-off-by: Michael Roth <michael.roth@amd.com>
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>

Reviewed-by: Venu Busireddy <venu.busireddy@oracle.com>

> ---
>  arch/x86/kernel/head64.c  |  3 ---
>  arch/x86/kernel/head_64.S | 13 +++++++++++++
>  2 files changed, 13 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
> index 3be9dd213dad..b01f64e8389b 100644
> --- a/arch/x86/kernel/head64.c
> +++ b/arch/x86/kernel/head64.c
> @@ -192,9 +192,6 @@ unsigned long __head __startup_64(unsigned long physaddr,
>  	if (load_delta & ~PMD_PAGE_MASK)
>  		for (;;);
>  
> -	/* Activate Secure Memory Encryption (SME) if supported and enabled */
> -	sme_enable(bp);
> -
>  	/* Include the SME encryption mask in the fixup value */
>  	load_delta += sme_get_me_mask();
>  
> diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
> index d8b3ebd2bb85..99de8fd461e8 100644
> --- a/arch/x86/kernel/head_64.S
> +++ b/arch/x86/kernel/head_64.S
> @@ -69,6 +69,19 @@ SYM_CODE_START_NOALIGN(startup_64)
>  	call	startup_64_setup_env
>  	popq	%rsi
>  
> +#ifdef CONFIG_AMD_MEM_ENCRYPT
> +	/*
> +	 * Activate SEV/SME memory encryption if supported/enabled. This needs to
> +	 * be done now, since this also includes setup of the SEV-SNP CPUID table,
> +	 * which needs to be done before any CPUID instructions are executed in
> +	 * subsequent code.
> +	 */
> +	movq	%rsi, %rdi
> +	pushq	%rsi
> +	call	sme_enable
> +	popq	%rsi
> +#endif
> +
>  	/* Now switch to __KERNEL_CS so IRET works reliably */
>  	pushq	$__KERNEL_CS
>  	leaq	.Lon_kernel_cs(%rip), %rax
> -- 
> 2.25.1
> 

* Re: [PATCH v8 03/40] x86/mm: Extend cc_attr to include AMD SEV-SNP
  2021-12-10 15:42 ` [PATCH v8 03/40] x86/mm: Extend cc_attr to include AMD SEV-SNP Brijesh Singh
@ 2021-12-13 22:47   ` Venu Busireddy
  2021-12-14 15:53   ` Borislav Petkov
  1 sibling, 0 replies; 183+ messages in thread
From: Venu Busireddy @ 2021-12-13 22:47 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On 2021-12-10 09:42:55 -0600, Brijesh Singh wrote:
> The CC_ATTR_SEV_SNP can be used by the guest to query whether the SNP -
> Secure Nested Paging feature is active.
> 
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> ---
> +
> +	/**
> +	 * @CC_ATTR_SEV_SNP: Guest SNP is active.
> +	 *
> +	 * The platform/OS is running as a guest/virtual machine and actively
> +	 * using AMD SEV-SNP features.
> +	 */
> +	CC_ATTR_SEV_SNP = 0x100,

Perhaps add a note on why this is being set to 0x100?

With that...

Reviewed-by: Venu Busireddy <venu.busireddy@oracle.com>


* Re: [PATCH v8 04/40] x86/sev: Define the Linux specific guest termination reasons
  2021-12-10 15:42 ` [PATCH v8 04/40] x86/sev: Define the Linux specific guest termination reasons Brijesh Singh
@ 2021-12-14  0:13   ` Venu Busireddy
  2021-12-14 22:22   ` Borislav Petkov
  1 sibling, 0 replies; 183+ messages in thread
From: Venu Busireddy @ 2021-12-14  0:13 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On 2021-12-10 09:42:56 -0600, Brijesh Singh wrote:
> GHCB specification defines the reason code for reason set 0. The reason
> codes defined in the set 0 do not cover all possible causes for a guest
> to request termination.
> 
> The reason set 1 to 255 is reserved for the vendor-specific codes.

s/set 1 to 255 is/sets 1 to 255 are/

> Reserve the reason set 1 for the Linux guest. Define an error codes for

s/Define an/Define the/

> reason set 1.
> 
> While at it, change the sev_es_terminate() to accept the reason set
> parameter.
> 
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>

With that...

Reviewed-by: Venu Busireddy <venu.busireddy@oracle.com>

> ---
>  arch/x86/boot/compressed/sev.c    |  6 +++---
>  arch/x86/include/asm/sev-common.h |  8 ++++++++
>  arch/x86/kernel/sev-shared.c      | 11 ++++-------
>  arch/x86/kernel/sev.c             |  4 ++--
>  4 files changed, 17 insertions(+), 12 deletions(-)
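
As a rough illustration of what passing a reason set to sev_es_terminate()
involves, the sketch below packs a (set, reason) pair into a termination
request value. The bit positions are an assumption based on my reading of the
GHCB MSR protocol (request code 0x100 in the low bits, reason set in bits
15:12, reason code in bits 23:16) and should be checked against the
specification; all macro and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define TERM_REQ		0x100ull	/* assumed request code */
#define TERM_REASON_SET_POS	12		/* assumed: GHCBData[15:12] */
#define TERM_REASON_SET_MASK	0xfull
#define TERM_REASON_POS		16		/* assumed: GHCBData[23:16] */
#define TERM_REASON_MASK	0xffull

/* Hypothetical names for the two reason sets the commit message mentions. */
#define REASON_SET_GHCB_SPEC	0
#define REASON_SET_LINUX	1

/* Build the MSR value for a termination request with (set, reason). */
static uint64_t term_request(unsigned int set, unsigned int reason)
{
	uint64_t val = TERM_REQ;

	val |= (set & TERM_REASON_SET_MASK) << TERM_REASON_SET_POS;
	val |= (reason & TERM_REASON_MASK) << TERM_REASON_POS;
	return val;
}
```

With this shape, existing callers keep reason set 0 and Linux-specific
reasons go out under set 1, so the hypervisor can tell the two apart.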

* Re: [PATCH v8 05/40] x86/sev: Save the negotiated GHCB version
  2021-12-10 15:42 ` [PATCH v8 05/40] x86/sev: Save the negotiated GHCB version Brijesh Singh
@ 2021-12-14  0:32   ` Venu Busireddy
  0 siblings, 0 replies; 183+ messages in thread
From: Venu Busireddy @ 2021-12-14  0:32 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On 2021-12-10 09:42:57 -0600, Brijesh Singh wrote:
> The SEV-ES guest calls the sev_es_negotiate_protocol() to negotiate the
> GHCB protocol version before establishing the GHCB. Cache the negotiated
> GHCB version so that it can be used later.
> 
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>

Reviewed-by: Venu Busireddy <venu.busireddy@oracle.com>

> ---
>  arch/x86/include/asm/sev.h   |  2 +-
>  arch/x86/kernel/sev-shared.c | 17 ++++++++++++++---
>  2 files changed, 15 insertions(+), 4 deletions(-)
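
A minimal sketch of the caching idea, under stated assumptions: the guest
supports a fixed range of protocol versions, the hypervisor advertises its
own range, and the highest common version is stored for later use. The
function name, the macros, and the 1..2 guest range are all hypothetical, and
the real code reads the hypervisor's range through the GHCB MSR protocol
rather than taking it as arguments.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical range of protocol versions this guest can speak. */
#define GUEST_PROTO_MIN	1
#define GUEST_PROTO_MAX	2

/* Cached result of the negotiation; 0 means "not negotiated yet". */
static uint16_t ghcb_version;

/*
 * hv_min/hv_max: the range the hypervisor advertises. Returns false when
 * the two ranges do not overlap; otherwise caches the highest common version.
 */
static bool negotiate_protocol(uint16_t hv_min, uint16_t hv_max)
{
	if (hv_max < GUEST_PROTO_MIN || hv_min > GUEST_PROTO_MAX)
		return false;

	ghcb_version = hv_max < GUEST_PROTO_MAX ? hv_max : GUEST_PROTO_MAX;
	return true;
}
```

Later code can then branch on ghcb_version instead of renegotiating, which is
the point of caching it.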

* Re: [PATCH v8 03/40] x86/mm: Extend cc_attr to include AMD SEV-SNP
  2021-12-10 15:42 ` [PATCH v8 03/40] x86/mm: Extend cc_attr to include AMD SEV-SNP Brijesh Singh
  2021-12-13 22:47   ` Venu Busireddy
@ 2021-12-14 15:53   ` Borislav Petkov
  1 sibling, 0 replies; 183+ messages in thread
From: Borislav Petkov @ 2021-12-14 15:53 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	tony.luck, marcorr, sathyanarayanan.kuppuswamy

On Fri, Dec 10, 2021 at 09:42:55AM -0600, Brijesh Singh wrote:
> diff --git a/include/linux/cc_platform.h b/include/linux/cc_platform.h
> index a075b70b9a70..ef5e2209c9b8 100644
> --- a/include/linux/cc_platform.h
> +++ b/include/linux/cc_platform.h
> @@ -61,6 +61,14 @@ enum cc_attr {
>  	 * Examples include SEV-ES.
>  	 */
>  	CC_ATTR_GUEST_STATE_ENCRYPT,
> +
> +	/**
> +	 * @CC_ATTR_SEV_SNP: Guest SNP is active.
> +	 *
> +	 * The platform/OS is running as a guest/virtual machine and actively
> +	 * using AMD SEV-SNP features.
> +	 */
> +	CC_ATTR_SEV_SNP = 0x100,

I guess CC_ATTR_GUEST_SEV_SNP. The Intel one is called CC_ATTR_GUEST_TDX,
so this way they at least all say it is a guest thing.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 01/40] x86/compressed/64: detect/setup SEV/SME features earlier in boot
  2021-12-13 19:17     ` Borislav Petkov
@ 2021-12-14 17:46       ` Venu Busireddy
  2021-12-14 19:10         ` Borislav Petkov
  0 siblings, 1 reply; 183+ messages in thread
From: Venu Busireddy @ 2021-12-14 17:46 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Brijesh Singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On 2021-12-13 20:17:31 +0100, Borislav Petkov wrote:
> On Mon, Dec 13, 2021 at 01:09:19PM -0600, Venu Busireddy wrote:
> > I made this suggestion while reviewing v7 too, but it appears that it
> > fell through the cracks. Most of the code in sev_enable() is duplicated
> > from sme_enable(). Wouldn't it be better to put all that common code
> > in a different function, and call that function from sme_enable()
> > and sev_enable()?
> 
> How about you look where both functions are defined? Which kernel stages?

What I am suggesting should not have anything to do with the boot stage
of the kernel.

For example, both these functions call native_cpuid(), which is declared
as an inline function. I am merely suggesting to do something similar
to avoid the code duplication.

Venu


^ permalink raw reply	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 01/40] x86/compressed/64: detect/setup SEV/SME features earlier in boot
  2021-12-14 17:46       ` Venu Busireddy
@ 2021-12-14 19:10         ` Borislav Petkov
  2021-12-15  0:14           ` Venu Busireddy
  0 siblings, 1 reply; 183+ messages in thread
From: Borislav Petkov @ 2021-12-14 19:10 UTC (permalink / raw)
  To: Venu Busireddy
  Cc: Brijesh Singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On Tue, Dec 14, 2021 at 11:46:14AM -0600, Venu Busireddy wrote:
> What I am suggesting should not have anything to do with the boot stage
> of the kernel.

I know exactly what you're suggesting.

> For example, both these functions call native_cpuid(), which is declared
> as an inline function. I am merely suggesting to do something similar
> to avoid the code duplication.

Try it yourself. If you can come up with something halfway readable and
it builds, I'm willing to take a look.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 04/40] x86/sev: Define the Linux specific guest termination reasons
  2021-12-10 15:42 ` [PATCH v8 04/40] x86/sev: Define the Linux specific guest termination reasons Brijesh Singh
  2021-12-14  0:13   ` Venu Busireddy
@ 2021-12-14 22:22   ` Borislav Petkov
  1 sibling, 0 replies; 183+ messages in thread
From: Borislav Petkov @ 2021-12-14 22:22 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	tony.luck, marcorr, sathyanarayanan.kuppuswamy

On Fri, Dec 10, 2021 at 09:42:56AM -0600, Brijesh Singh wrote:
> GHCB specification defines the reason code for reason set 0. The reason
> codes defined in the set 0 do not cover all possible causes for a guest
> to request termination.
> 
> The reason set 1 to 255 is reserved for the vendor-specific codes.
> Reseve the reason set 1 for the Linux guest. Define an error codes for

Yah, your spellchecker is still broken:

Reseve the reason set 1 for the Linux guest. Define an error codes for
Unknown word [Reseve] in commit message, suggestions:
        ['Reeves', 'Reeve', 'Reserve', 'Res eve', 'Res-eve', 'Severe', 'Reverse', 'Sevres', 'Revers']

> reason set 1.

"... and use them in the Linux guest so that one can have meaningful
termination reasons and thus better guest failure diagnosis."

The *why* is very important.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 01/40] x86/compressed/64: detect/setup SEV/SME features earlier in boot
  2021-12-14 19:10         ` Borislav Petkov
@ 2021-12-15  0:14           ` Venu Busireddy
  2021-12-15 11:57             ` Borislav Petkov
                               ` (2 more replies)
  0 siblings, 3 replies; 183+ messages in thread
From: Venu Busireddy @ 2021-12-15  0:14 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Brijesh Singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On 2021-12-14 20:10:16 +0100, Borislav Petkov wrote:
> On Tue, Dec 14, 2021 at 11:46:14AM -0600, Venu Busireddy wrote:
> > What I am suggesting should not have anything to do with the boot stage
> > of the kernel.
> 
> I know exactly what you're suggesting.
> 
> > For example, both these functions call native_cpuid(), which is declared
> > as an inline function. I am merely suggesting to do something similar
> > to avoid the code duplication.
> 
> Try it yourself. If you can come up with something halfway readable and
> it builds, I'm willing to take a look.

Patch (to be applied on top of sev-snp-v8 branch of
https://github.com/AMDESE/linux.git) is attached at the end.

Here are a few things I did.

1. Moved all the common code that existed at the beginning of
   sme_enable() and sev_enable() to an inline function named
   get_pagetable_bit_pos().
2. sme_enable() was using AMD_SME_BIT and AMD_SEV_BIT, whereas
   sev_enable() was dealing with raw bits. Moved those definitions to
   sev.h, and changed sev_enable() to use those definitions.
3. Make consistent use of BIT_ULL.

Venu


diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index c2bf99522e5e..b44d6b37796e 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -291,6 +291,7 @@ static void enforce_vmpl0(void)
 void sev_enable(struct boot_params *bp)
 {
 	unsigned int eax, ebx, ecx, edx;
+	unsigned long pt_bit_pos;	/* Pagetable bit position */
 	bool snp;
 
 	/*
@@ -299,26 +300,8 @@ void sev_enable(struct boot_params *bp)
 	 */
 	snp = snp_init(bp);
 
-	/* Check for the SME/SEV support leaf */
-	eax = 0x80000000;
-	ecx = 0;
-	native_cpuid(&eax, &ebx, &ecx, &edx);
-	if (eax < 0x8000001f)
-		return;
-
-	/*
-	 * Check for the SME/SEV feature:
-	 *   CPUID Fn8000_001F[EAX]
-	 *   - Bit 0 - Secure Memory Encryption support
-	 *   - Bit 1 - Secure Encrypted Virtualization support
-	 *   CPUID Fn8000_001F[EBX]
-	 *   - Bits 5:0 - Pagetable bit position used to indicate encryption
-	 */
-	eax = 0x8000001f;
-	ecx = 0;
-	native_cpuid(&eax, &ebx, &ecx, &edx);
-	/* Check whether SEV is supported */
-	if (!(eax & BIT(1))) {
+	/* Get the pagetable bit position if SEV is supported */
+	if ((get_pagetable_bit_pos(&pt_bit_pos, AMD_SEV_BIT)) < 0) {
 		if (snp)
 			error("SEV-SNP support indicated by CC blob, but not CPUID.");
 		return;
@@ -350,7 +333,7 @@ void sev_enable(struct boot_params *bp)
 	if (snp && !(sev_status & MSR_AMD64_SEV_SNP_ENABLED))
 		error("SEV-SNP supported indicated by CC blob, but not SEV status MSR.");
 
-	sme_me_mask = BIT_ULL(ebx & 0x3f);
+	sme_me_mask = BIT_ULL(pt_bit_pos);
 }
 
 /* Search for Confidential Computing blob in the EFI config table. */
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 2c5f12ae7d04..41b096f28d02 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -224,6 +224,43 @@ static inline void native_cpuid(unsigned int *eax, unsigned int *ebx,
 	    : "memory");
 }
 
+/*
+ * Returns the pagetable bit position in pt_bit_pos,
+ * iff the specified features are supported.
+ */
+static inline int get_pagetable_bit_pos(unsigned long *pt_bit_pos,
+					unsigned long features)
+{
+	unsigned int eax, ebx, ecx, edx;
+
+	/* Check for the SME/SEV support leaf */
+	eax = 0x80000000;
+	ecx = 0;
+	native_cpuid(&eax, &ebx, &ecx, &edx);
+	if (eax < 0x8000001f)
+		return -1;
+
+	eax = 0x8000001f;
+	ecx = 0;
+	native_cpuid(&eax, &ebx, &ecx, &edx);
+
+	/* Check whether the specified features are supported.
+	 * SME/SEV features:
+	 *   CPUID Fn8000_001F[EAX]
+	 *   - Bit 0 - Secure Memory Encryption support
+	 *   - Bit 1 - Secure Encrypted Virtualization support
+	 */
+	if (!(eax & features))
+		return -1;
+
+	/*
+	 *   CPUID Fn8000_001F[EBX]
+	 *   - Bits 5:0 - Pagetable bit position used to indicate encryption
+	 */
+	*pt_bit_pos = (unsigned long)(ebx & 0x3f);
+	return 0;
+}
+
 #define native_cpuid_reg(reg)					\
 static inline unsigned int native_cpuid_##reg(unsigned int op)	\
 {								\
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 7a5934af9d47..1a2344362ec6 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -17,6 +17,9 @@
 #define GHCB_PROTOCOL_MAX	2ULL
 #define GHCB_DEFAULT_USAGE	0ULL
 
+#define AMD_SME_BIT		BIT(0)
+#define AMD_SEV_BIT		BIT(1)
+
 #define	VMGEXIT()			{ asm volatile("rep; vmmcall\n\r"); }
 
 enum es_result {
diff --git a/arch/x86/mm/mem_encrypt_identity.c b/arch/x86/mm/mem_encrypt_identity.c
index 2f723e106ed3..1ef50e969efd 100644
--- a/arch/x86/mm/mem_encrypt_identity.c
+++ b/arch/x86/mm/mem_encrypt_identity.c
@@ -508,38 +508,18 @@ void __init sme_enable(struct boot_params *bp)
 	unsigned long feature_mask;
 	bool active_by_default;
 	unsigned long me_mask;
+	unsigned long pt_bit_pos;	/* Pagetable bit position */
 	char buffer[16];
 	bool snp;
 	u64 msr;
 
 	snp = snp_init(bp);
 
-	/* Check for the SME/SEV support leaf */
-	eax = 0x80000000;
-	ecx = 0;
-	native_cpuid(&eax, &ebx, &ecx, &edx);
-	if (eax < 0x8000001f)
+	/* Get the pagetable bit position if SEV or SME are supported */
+	if ((get_pagetable_bit_pos(&pt_bit_pos, AMD_SEV_BIT | AMD_SME_BIT)) < 0)
 		return;
 
-#define AMD_SME_BIT	BIT(0)
-#define AMD_SEV_BIT	BIT(1)
-
-	/*
-	 * Check for the SME/SEV feature:
-	 *   CPUID Fn8000_001F[EAX]
-	 *   - Bit 0 - Secure Memory Encryption support
-	 *   - Bit 1 - Secure Encrypted Virtualization support
-	 *   CPUID Fn8000_001F[EBX]
-	 *   - Bits 5:0 - Pagetable bit position used to indicate encryption
-	 */
-	eax = 0x8000001f;
-	ecx = 0;
-	native_cpuid(&eax, &ebx, &ecx, &edx);
-	/* Check whether SEV or SME is supported */
-	if (!(eax & (AMD_SEV_BIT | AMD_SME_BIT)))
-		return;
-
-	me_mask = 1UL << (ebx & 0x3f);
+	me_mask = BIT_ULL(pt_bit_pos);
 
 	/* Check the SEV MSR whether SEV or SME is enabled */
 	sev_status   = __rdmsr(MSR_AMD64_SEV);

^ permalink raw reply related	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 01/40] x86/compressed/64: detect/setup SEV/SME features earlier in boot
  2021-12-15  0:14           ` Venu Busireddy
@ 2021-12-15 11:57             ` Borislav Petkov
  2021-12-15 14:43             ` Tom Lendacky
  2021-12-15 17:51             ` Michael Roth
  2 siblings, 0 replies; 183+ messages in thread
From: Borislav Petkov @ 2021-12-15 11:57 UTC (permalink / raw)
  To: Venu Busireddy
  Cc: Brijesh Singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On Tue, Dec 14, 2021 at 06:14:34PM -0600, Venu Busireddy wrote:
> diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
> index 2c5f12ae7d04..41b096f28d02 100644
> --- a/arch/x86/include/asm/processor.h
> +++ b/arch/x86/include/asm/processor.h
> @@ -224,6 +224,43 @@ static inline void native_cpuid(unsigned int *eax, unsigned int *ebx,
>  	    : "memory");
>  }
>  
> +/*
> + * Returns the pagetable bit position in pt_bit_pos,
> + * iff the specified features are supported.
> + */
> +static inline int get_pagetable_bit_pos(unsigned long *pt_bit_pos,
> +					unsigned long features)

You can simply return pt_bit_pos:

static inline unsigned int get_pagetable_bit_pos(unsigned long features)

and return a negative value on error.

Also, the only duplication this is saving is visual - that function will
get inlined at the call sites.
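
A minimal user-space sketch of that shape, for illustration only. The
struct cpuid_regs type and the regs parameter are hypothetical stand-ins
for the native_cpuid() calls so the logic can run outside the kernel,
and the return type is a signed int here because an unsigned return
type cannot carry a negative error value:

```c
#include <assert.h>

/* Feature bits in CPUID Fn8000_001F[EAX], as named in the patch. */
#define AMD_SME_BIT	(1u << 0)
#define AMD_SEV_BIT	(1u << 1)

/*
 * Hypothetical container for CPUID output (not a kernel type):
 * max_leaf is EAX of leaf 0x80000000, eax/ebx are the registers
 * of leaf 0x8000001f.
 */
struct cpuid_regs {
	unsigned int max_leaf;
	unsigned int eax;
	unsigned int ebx;
};

/*
 * Return the pagetable bit position (0..63) if any of the requested
 * features is advertised, or -1 otherwise.
 */
static inline int get_pagetable_bit_pos(const struct cpuid_regs *regs,
					unsigned int features)
{
	assert(regs);

	/* The SME/SEV support leaf must exist. */
	if (regs->max_leaf < 0x8000001f)
		return -1;

	/* At least one requested feature bit must be set in EAX. */
	if (!(regs->eax & features))
		return -1;

	/* EBX bits 5:0 hold the encryption bit position. */
	return regs->ebx & 0x3f;
}
```

A caller would then test the result for "< 0" before building the mask.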

Also, I'd love to separate the compressed kernel headers from the
kernel proper ones but I'm afraid that ship has sailed. But if I could,
that would have to be in a special header that gets included by both
stages...

So I don't mind this but I'd let Brijesh and Tom have a look at it too.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 01/40] x86/compressed/64: detect/setup SEV/SME features earlier in boot
  2021-12-15  0:14           ` Venu Busireddy
  2021-12-15 11:57             ` Borislav Petkov
@ 2021-12-15 14:43             ` Tom Lendacky
  2021-12-15 17:49               ` Michael Roth
  2021-12-15 18:58               ` Venu Busireddy
  2021-12-15 17:51             ` Michael Roth
  2 siblings, 2 replies; 183+ messages in thread
From: Tom Lendacky @ 2021-12-15 14:43 UTC (permalink / raw)
  To: Venu Busireddy, Borislav Petkov
  Cc: Brijesh Singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, H. Peter Anvin, Ard Biesheuvel,
	Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Jim Mattson, Andy Lutomirski, Dave Hansen, Sergio Lopez,
	Peter Gonda, Peter Zijlstra, Srinivas Pandruvada, David Rientjes,
	Dov Murik, Tobin Feldman-Fitzthum, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	tony.luck, marcorr, sathyanarayanan.kuppuswamy

On 12/14/21 6:14 PM, Venu Busireddy wrote:
> On 2021-12-14 20:10:16 +0100, Borislav Petkov wrote:
>> On Tue, Dec 14, 2021 at 11:46:14AM -0600, Venu Busireddy wrote:
>>> What I am suggesting should not have anything to do with the boot stage
>>> of the kernel.
>>
>> I know exactly what you're suggesting.
>>
>>> For example, both these functions call native_cpuid(), which is declared
>>> as an inline function. I am merely suggesting to do something similar
>>> to avoid the code duplication.
>>
>> Try it yourself. If you can come up with something halfway readable and
>> it builds, I'm willing to take a look.
> 
> Patch (to be applied on top of sev-snp-v8 branch of
> https://github.com/AMDESE/linux.git) is attached at the end.
> 
> Here are a few things I did.
> 
> 1. Moved all the common code that existed at the beginning of
>     sme_enable() and sev_enable() to an inline function named
>     get_pagetable_bit_pos().
> 2. sme_enable() was using AMD_SME_BIT and AMD_SEV_BIT, whereas
>     sev_enable() was dealing with raw bits. Moved those definitions to
>     sev.h, and changed sev_enable() to use those definitions.
> 3. Make consistent use of BIT_ULL.
> 
> Venu
> 
> 
> diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
> index c2bf99522e5e..b44d6b37796e 100644
> --- a/arch/x86/boot/compressed/sev.c
> +++ b/arch/x86/boot/compressed/sev.c
> @@ -291,6 +291,7 @@ static void enforce_vmpl0(void)
>   void sev_enable(struct boot_params *bp)
>   {
>   	unsigned int eax, ebx, ecx, edx;
> +	unsigned long pt_bit_pos;	/* Pagetable bit position */
>   	bool snp;
>   
>   	/*
> @@ -299,26 +300,8 @@ void sev_enable(struct boot_params *bp)
>   	 */
>   	snp = snp_init(bp);
>   
> -	/* Check for the SME/SEV support leaf */
> -	eax = 0x80000000;
> -	ecx = 0;
> -	native_cpuid(&eax, &ebx, &ecx, &edx);
> -	if (eax < 0x8000001f)
> -		return;
> -
> -	/*
> -	 * Check for the SME/SEV feature:
> -	 *   CPUID Fn8000_001F[EAX]
> -	 *   - Bit 0 - Secure Memory Encryption support
> -	 *   - Bit 1 - Secure Encrypted Virtualization support
> -	 *   CPUID Fn8000_001F[EBX]
> -	 *   - Bits 5:0 - Pagetable bit position used to indicate encryption
> -	 */
> -	eax = 0x8000001f;
> -	ecx = 0;
> -	native_cpuid(&eax, &ebx, &ecx, &edx);
> -	/* Check whether SEV is supported */
> -	if (!(eax & BIT(1))) {
> +	/* Get the pagetable bit position if SEV is supported */
> +	if ((get_pagetable_bit_pos(&pt_bit_pos, AMD_SEV_BIT)) < 0) {
>   		if (snp)
>   			error("SEV-SNP support indicated by CC blob, but not CPUID.");
>   		return;
> @@ -350,7 +333,7 @@ void sev_enable(struct boot_params *bp)
>   	if (snp && !(sev_status & MSR_AMD64_SEV_SNP_ENABLED))
>   		error("SEV-SNP supported indicated by CC blob, but not SEV status MSR.");
>   
> -	sme_me_mask = BIT_ULL(ebx & 0x3f);
> +	sme_me_mask = BIT_ULL(pt_bit_pos);
>   }
>   
>   /* Search for Confidential Computing blob in the EFI config table. */
> diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
> index 2c5f12ae7d04..41b096f28d02 100644
> --- a/arch/x86/include/asm/processor.h
> +++ b/arch/x86/include/asm/processor.h
> @@ -224,6 +224,43 @@ static inline void native_cpuid(unsigned int *eax, unsigned int *ebx,
>   	    : "memory");
>   }
>   
> +/*
> + * Returns the pagetable bit position in pt_bit_pos,
> + * iff the specified features are supported.
> + */
> +static inline int get_pagetable_bit_pos(unsigned long *pt_bit_pos,
> +					unsigned long features)

I'm not a fan of this name. You are specifically returning the encryption 
bit position but using a very generic name of get_pagetable_bit_pos() in a 
very common header file. Maybe something more like get_me_bit() and move 
the function to an existing SEV header file.

Also, this can probably just return an unsigned int that will be either 0 
or the bit position, right?  Then the check above can be for a zero value, 
e.g.:

	me_bit = get_me_bit();
	if (!me_bit) {

	...

	sme_me_mask = BIT_ULL(me_bit);

That should work below, too, but you'll need to verify that.
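
Spelled out as a user-space sketch, under assumptions: struct cpuid_regs
and the regs/features parameters are hypothetical (the call above takes
no arguments), me_mask_from_cpuid() is an illustrative caller and not an
existing function, and treating 0 as "not supported" assumes the
encryption bit is never reported at position 0:

```c
#include <assert.h>
#include <stdint.h>

#define AMD_SME_BIT	(1u << 0)
#define AMD_SEV_BIT	(1u << 1)
#define BIT_ULL(n)	(1ULL << (n))

/*
 * Hypothetical container for CPUID output (not a kernel type):
 * max_leaf is EAX of leaf 0x80000000, eax/ebx are the registers
 * of leaf 0x8000001f.
 */
struct cpuid_regs {
	unsigned int max_leaf;
	unsigned int eax;
	unsigned int ebx;
};

/* Return the encryption bit position, or 0 if unsupported. */
static inline unsigned int get_me_bit(const struct cpuid_regs *regs,
				      unsigned int features)
{
	assert(regs);

	if (regs->max_leaf < 0x8000001f)
		return 0;

	if (!(regs->eax & features))
		return 0;

	return regs->ebx & 0x3f;
}

/* Caller pattern: a zero bit position means no mask is set up. */
static inline uint64_t me_mask_from_cpuid(const struct cpuid_regs *regs,
					  unsigned int features)
{
	unsigned int me_bit = get_me_bit(regs, features);

	if (!me_bit)
		return 0;

	return BIT_ULL(me_bit);
}
```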

> +{
> +	unsigned int eax, ebx, ecx, edx;
> +
> +	/* Check for the SME/SEV support leaf */
> +	eax = 0x80000000;
> +	ecx = 0;
> +	native_cpuid(&eax, &ebx, &ecx, &edx);
> +	if (eax < 0x8000001f)
> +		return -1;

This can then be:

		return 0;

> +
> +	eax = 0x8000001f;
> +	ecx = 0;
> +	native_cpuid(&eax, &ebx, &ecx, &edx);
> +
> +	/* Check whether the specified features are supported.
> +	 * SME/SEV features:
> +	 *   CPUID Fn8000_001F[EAX]
> +	 *   - Bit 0 - Secure Memory Encryption support
> +	 *   - Bit 1 - Secure Encrypted Virtualization support
> +	 */
> +	if (!(eax & features))
> +		return -1;

and this can be:

		return 0;

> +
> +	/*
> +	 *   CPUID Fn8000_001F[EBX]
> +	 *   - Bits 5:0 - Pagetable bit position used to indicate encryption
> +	 */
> +	*pt_bit_pos = (unsigned long)(ebx & 0x3f);

and this can be:

	return ebx & 0x3f;

> +	return 0;
> +}
> +
>   #define native_cpuid_reg(reg)					\
>   static inline unsigned int native_cpuid_##reg(unsigned int op)	\
>   {								\
> diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
> index 7a5934af9d47..1a2344362ec6 100644
> --- a/arch/x86/include/asm/sev.h
> +++ b/arch/x86/include/asm/sev.h
> @@ -17,6 +17,9 @@
>   #define GHCB_PROTOCOL_MAX	2ULL
>   #define GHCB_DEFAULT_USAGE	0ULL
>   
> +#define AMD_SME_BIT		BIT(0)
> +#define AMD_SEV_BIT		BIT(1)
> +

Maybe this is where that new static inline function should go...

>   #define	VMGEXIT()			{ asm volatile("rep; vmmcall\n\r"); }
>   
>   enum es_result {
> diff --git a/arch/x86/mm/mem_encrypt_identity.c b/arch/x86/mm/mem_encrypt_identity.c
> index 2f723e106ed3..1ef50e969efd 100644
> --- a/arch/x86/mm/mem_encrypt_identity.c
> +++ b/arch/x86/mm/mem_encrypt_identity.c
> @@ -508,38 +508,18 @@ void __init sme_enable(struct boot_params *bp)
>   	unsigned long feature_mask;
>   	bool active_by_default;
>   	unsigned long me_mask;
> +	unsigned long pt_bit_pos;	/* Pagetable bit position */

unsigned int and me_bit or me_bit_pos.

Thanks,
Tom

>   	char buffer[16];
>   	bool snp;
>   	u64 msr;
>   
>   	snp = snp_init(bp);
>   
> -	/* Check for the SME/SEV support leaf */
> -	eax = 0x80000000;
> -	ecx = 0;
> -	native_cpuid(&eax, &ebx, &ecx, &edx);
> -	if (eax < 0x8000001f)
> +	/* Get the pagetable bit position if SEV or SME are supported */
> +	if ((get_pagetable_bit_pos(&pt_bit_pos, AMD_SEV_BIT | AMD_SME_BIT)) < 0)
>   		return;
>   
> -#define AMD_SME_BIT	BIT(0)
> -#define AMD_SEV_BIT	BIT(1)
> -
> -	/*
> -	 * Check for the SME/SEV feature:
> -	 *   CPUID Fn8000_001F[EAX]
> -	 *   - Bit 0 - Secure Memory Encryption support
> -	 *   - Bit 1 - Secure Encrypted Virtualization support
> -	 *   CPUID Fn8000_001F[EBX]
> -	 *   - Bits 5:0 - Pagetable bit position used to indicate encryption
> -	 */
> -	eax = 0x8000001f;
> -	ecx = 0;
> -	native_cpuid(&eax, &ebx, &ecx, &edx);
> -	/* Check whether SEV or SME is supported */
> -	if (!(eax & (AMD_SEV_BIT | AMD_SME_BIT)))
> -		return;
> -
> -	me_mask = 1UL << (ebx & 0x3f);
> +	me_mask = BIT_ULL(pt_bit_pos);
>   
>   	/* Check the SEV MSR whether SEV or SME is enabled */
>   	sev_status   = __rdmsr(MSR_AMD64_SEV);
> 

^ permalink raw reply	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 01/40] x86/compressed/64: detect/setup SEV/SME features earlier in boot
  2021-12-15 14:43             ` Tom Lendacky
@ 2021-12-15 17:49               ` Michael Roth
  2021-12-15 18:17                 ` Venu Busireddy
  2021-12-15 19:54                 ` Venu Busireddy
  2021-12-15 18:58               ` Venu Busireddy
  1 sibling, 2 replies; 183+ messages in thread
From: Michael Roth @ 2021-12-15 17:49 UTC (permalink / raw)
  To: Tom Lendacky
  Cc: Venu Busireddy, Borislav Petkov, Brijesh Singh, x86,
	linux-kernel, kvm, linux-efi, platform-driver-x86, linux-coco,
	linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Vlastimil Babka, Kirill A . Shutemov,
	Andi Kleen, Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On Wed, Dec 15, 2021 at 08:43:23AM -0600, Tom Lendacky wrote:
> On 12/14/21 6:14 PM, Venu Busireddy wrote:
> > On 2021-12-14 20:10:16 +0100, Borislav Petkov wrote:
> > > On Tue, Dec 14, 2021 at 11:46:14AM -0600, Venu Busireddy wrote:
> > > > What I am suggesting should not have anything to do with the boot stage
> > > > of the kernel.
> > > 
> > > I know exactly what you're suggesting.
> > > 
> > > > For example, both these functions call native_cpuid(), which is declared
> > > > as an inline function. I am merely suggesting to do something similar
> > > > to avoid the code duplication.
> > > 
> > > Try it yourself. If you can come up with something halfway readable and
> > > it builds, I'm willing to take a look.
> > 
> > Patch (to be applied on top of sev-snp-v8 branch of
> > https://github.com/AMDESE/linux.git) is attached at the end.
> > 
> > Here are a few things I did.
> > 
> > 1. Moved all the common code that existed at the beginning of
> >     sme_enable() and sev_enable() to an inline function named
> >     get_pagetable_bit_pos().
> > 2. sme_enable() was using AMD_SME_BIT and AMD_SEV_BIT, whereas
> >     sev_enable() was dealing with raw bits. Moved those definitions to
> >     sev.h, and changed sev_enable() to use those definitions.
> > 3. Make consistent use of BIT_ULL.
> > 
> > Venu
> > 
> > 
> > diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
> > index c2bf99522e5e..b44d6b37796e 100644
> > --- a/arch/x86/boot/compressed/sev.c
> > +++ b/arch/x86/boot/compressed/sev.c
> > @@ -291,6 +291,7 @@ static void enforce_vmpl0(void)
> >   void sev_enable(struct boot_params *bp)
> >   {
> >   	unsigned int eax, ebx, ecx, edx;
> > +	unsigned long pt_bit_pos;	/* Pagetable bit position */
> >   	bool snp;
> >   	/*
> > @@ -299,26 +300,8 @@ void sev_enable(struct boot_params *bp)
> >   	 */
> >   	snp = snp_init(bp);
> > -	/* Check for the SME/SEV support leaf */
> > -	eax = 0x80000000;
> > -	ecx = 0;
> > -	native_cpuid(&eax, &ebx, &ecx, &edx);
> > -	if (eax < 0x8000001f)
> > -		return;
> > -
> > -	/*
> > -	 * Check for the SME/SEV feature:
> > -	 *   CPUID Fn8000_001F[EAX]
> > -	 *   - Bit 0 - Secure Memory Encryption support
> > -	 *   - Bit 1 - Secure Encrypted Virtualization support
> > -	 *   CPUID Fn8000_001F[EBX]
> > -	 *   - Bits 5:0 - Pagetable bit position used to indicate encryption
> > -	 */
> > -	eax = 0x8000001f;
> > -	ecx = 0;
> > -	native_cpuid(&eax, &ebx, &ecx, &edx);
> > -	/* Check whether SEV is supported */
> > -	if (!(eax & BIT(1))) {
> > +	/* Get the pagetable bit position if SEV is supported */
> > +	if ((get_pagetable_bit_pos(&pt_bit_pos, AMD_SEV_BIT)) < 0) {
> >   		if (snp)
> >   			error("SEV-SNP support indicated by CC blob, but not CPUID.");
> >   		return;
> > @@ -350,7 +333,7 @@ void sev_enable(struct boot_params *bp)
> >   	if (snp && !(sev_status & MSR_AMD64_SEV_SNP_ENABLED))
> >   		error("SEV-SNP supported indicated by CC blob, but not SEV status MSR.");
> > -	sme_me_mask = BIT_ULL(ebx & 0x3f);
> > +	sme_me_mask = BIT_ULL(pt_bit_pos);
> >   }
> >   /* Search for Confidential Computing blob in the EFI config table. */
> > diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
> > index 2c5f12ae7d04..41b096f28d02 100644
> > --- a/arch/x86/include/asm/processor.h
> > +++ b/arch/x86/include/asm/processor.h
> > @@ -224,6 +224,43 @@ static inline void native_cpuid(unsigned int *eax, unsigned int *ebx,
> >   	    : "memory");
> >   }
> > +/*
> > + * Returns the pagetable bit position in pt_bit_pos,
> > + * iff the specified features are supported.
> > + */
> > +static inline int get_pagetable_bit_pos(unsigned long *pt_bit_pos,
> > +					unsigned long features)
> 
> I'm not a fan of this name. You are specifically returning the encryption
> bit position but using a very generic name of get_pagetable_bit_pos() in a
> very common header file. Maybe something more like get_me_bit() and move the
> function to an existing SEV header file.
> 
> Also, this can probably just return an unsigned int that will be either 0 or
> the bit position, right?  Then the check above can be for a zero value,
> e.g.:
> 
> 	me_bit = get_me_bit();
> 	if (!me_bit) {
> 
> 	...
> 
> 	sme_me_mask = BIT_ULL(me_bit);
> 
> That should work below, too, but you'll need to verify that.

I think in the greater context of consolidating all the SME/SEV setup
and re-using code, this helper stands a high chance of eventually becoming
something more along the lines of sme_sev_parse_cpuid(), since otherwise
we'd end up re-introducing multiple helpers to parse the same 0x8000001F
fields if we ever need to process any of the other fields advertised in
there. Given that, it makes sense to reserve the return value as an
indication that either SEV or SME is enabled, and then have a
pass-by-pointer parameter list to collect the individual feature
bits/encryption mask for cases where SEV/SME is enabled, which are only
treated as valid if sme_sev_parse_cpuid() returns 0.

So Venu's original approach of passing the encryption mask by pointer
seems a little closer toward that end, but I also agree Tom's approach
is cleaner for the current code base, so I'm fine either way, just
figured I'd mention this.

I think needing to pass in the SME/SEV CPUID bits to tell the helper when
to parse the encryption bit and when not to is a little bit awkward though.
If there's some agreement that this will ultimately serve the purpose of
handling all (or most) of SME/SEV-related CPUID parsing, then the caller
shouldn't really need to be aware of any individual bit positions.
Maybe a bool could handle that instead, e.g.:

  int get_me_bit(bool sev_only, ...)

  or

  int sme_sev_parse_cpuid(bool sev_only, ...)

where for boot/compressed sev_only=true, for kernel proper sev_only=false.
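
A rough user-space sketch of that second form, purely illustrative:
struct cpuid_regs and struct sme_sev_info are hypothetical, the output
fields are grouped in one struct instead of a list of pointers only to
keep the example short, and the regs parameter stands in for the
native_cpuid() calls:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define AMD_SME_BIT	(1u << 0)
#define AMD_SEV_BIT	(1u << 1)

/*
 * Hypothetical container for CPUID output (not a kernel type):
 * max_leaf is EAX of leaf 0x80000000, eax/ebx are the registers
 * of leaf 0x8000001f.
 */
struct cpuid_regs {
	unsigned int max_leaf;
	unsigned int eax;
	unsigned int ebx;
};

/* Hypothetical output block; only valid when the parser returns 0. */
struct sme_sev_info {
	bool sme;
	bool sev;
	uint64_t me_mask;
};

/*
 * Return 0 and fill *info if SEV (or, unless sev_only, SME) is
 * advertised; return -1 and leave *info untouched otherwise.
 */
static inline int sme_sev_parse_cpuid(const struct cpuid_regs *regs,
				      bool sev_only,
				      struct sme_sev_info *info)
{
	unsigned int wanted = sev_only ? AMD_SEV_BIT
				       : (AMD_SEV_BIT | AMD_SME_BIT);

	assert(regs && info);

	if (regs->max_leaf < 0x8000001f)
		return -1;

	if (!(regs->eax & wanted))
		return -1;

	info->sme = !!(regs->eax & AMD_SME_BIT);
	info->sev = !!(regs->eax & AMD_SEV_BIT);
	info->me_mask = 1ULL << (regs->ebx & 0x3f);
	return 0;
}
```

The caller never names a CPUID bit; it only says which stage it is in.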

^ permalink raw reply	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 01/40] x86/compressed/64: detect/setup SEV/SME features earlier in boot
  2021-12-15  0:14           ` Venu Busireddy
  2021-12-15 11:57             ` Borislav Petkov
  2021-12-15 14:43             ` Tom Lendacky
@ 2021-12-15 17:51             ` Michael Roth
  2 siblings, 0 replies; 183+ messages in thread
From: Michael Roth @ 2021-12-15 17:51 UTC (permalink / raw)
  To: Venu Busireddy
  Cc: Borislav Petkov, Brijesh Singh, x86, linux-kernel, kvm,
	linux-efi, platform-driver-x86, linux-coco, linux-mm,
	Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Vlastimil Babka, Kirill A . Shutemov,
	Andi Kleen, Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On Tue, Dec 14, 2021 at 06:14:34PM -0600, Venu Busireddy wrote:
> On 2021-12-14 20:10:16 +0100, Borislav Petkov wrote:
> > On Tue, Dec 14, 2021 at 11:46:14AM -0600, Venu Busireddy wrote:
> > > What I am suggesting should not have anything to do with the boot stage
> > > of the kernel.
> > 
> > I know exactly what you're suggesting.
> > 
> > > For example, both these functions call native_cpuid(), which is declared
> > > as an inline function. I am merely suggesting to do something similar
> > > to avoid the code duplication.
> > 
> > Try it yourself. If you can come up with something halfway readable and
> > it builds, I'm willing to take a look.
> 
> Patch (to be applied on top of sev-snp-v8 branch of
> https://github.com/AMDESE/linux.git) is attached at the end.
> 
> Here are a few things I did.
> 
> 1. Moved all the common code that existed at the beginning of
>    sme_enable() and sev_enable() to an inline function named
>    get_pagetable_bit_pos().
> 2. sme_enable() was using AMD_SME_BIT and AMD_SEV_BIT, whereas
>    sev_enable() was dealing with raw bits. Moved those definitions to
>    sev.h, and changed sev_enable() to use those definitions.
> 3. Make consistent use of BIT_ULL.

Hi Venu,

I know there's still comments floating around, but once there's consensus feel
free to respond with a separate precursor patch against tip which moves
sme_enable() cpuid code into your helper function, along with your S-o-B, and I
can include it directly in the next version. Otherwise, I can incorporate your
suggestions into the next spin, just let me know if it's okay to add:

  Co-authored-by: Venu Busireddy <venu.busireddy@oracle.com>
  Signed-off-by:  Venu Busireddy <venu.busireddy@oracle.com>

to the relevant commits.

Thank you (and Boris/Tom) for the suggestions!

-Mike

> 
> Venu
> 

^ permalink raw reply	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 01/40] x86/compressed/64: detect/setup SEV/SME features earlier in boot
  2021-12-15 17:49               ` Michael Roth
@ 2021-12-15 18:17                 ` Venu Busireddy
  2021-12-15 18:33                   ` Borislav Petkov
  2021-12-15 20:43                   ` Michael Roth
  2021-12-15 19:54                 ` Venu Busireddy
  1 sibling, 2 replies; 183+ messages in thread
From: Venu Busireddy @ 2021-12-15 18:17 UTC (permalink / raw)
  To: Michael Roth
  Cc: Tom Lendacky, Borislav Petkov, Brijesh Singh, x86, linux-kernel,
	kvm, linux-efi, platform-driver-x86, linux-coco, linux-mm,
	Thomas Gleixner, Ingo Molnar, Joerg Roedel, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On 2021-12-15 11:49:34 -0600, Michael Roth wrote:
> 
> I think in the greater context of consolidating all the SME/SEV setup
> and re-using code, this helper stands a high chance of eventually becoming
> something more along the lines of sme_sev_parse_cpuid(), since otherwise
> we'd end up re-introducing multiple helpers to parse the same 0x8000001F
> fields if we ever need to process any of the other fields advertised in
> there. Given that, it makes sense to reserve the return value as an
> indication that either SEV or SME are enabled, and then have a
> pass-by-pointer parameters list to collect the individual feature
> bits/encryption mask for cases where SEV/SME are enabled, which are only
> treated as valid if sme_sev_parse_cpuid() returns 0.
> 
> So Venu's original approach of passing the encryption mask by pointer
> seems a little closer toward that end, but I also agree Tom's approach
> is cleaner for the current code base, so I'm fine either way, just
> figured I'd mention this.
> 
> I think needing to pass in the SME/SEV CPUID bits to tell the helper when
> to parse encryption bit and when not to is a little bit awkward though.
> If there's some agreement that this will ultimately serve the purpose of
> handling all (or most) of SME/SEV-related CPUID parsing, then the caller
> shouldn't really need to be aware of any individual bit positions.
> Maybe a bool could handle that instead, e.g.:
> 
>   int get_me_bit(bool sev_only, ...)
> 
>   or
> 
>   int sme_sev_parse_cpuid(bool sev_only, ...)
> 
> where for boot/compressed sev_only=true, for kernel proper sev_only=false.

I can implement it this way too. But I am wondering if having a
boolean argument limits us from handling any future additions to the
bit positions.

Boris & Tom, which implementation would you prefer?

Venu



^ permalink raw reply	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 01/40] x86/compressed/64: detect/setup SEV/SME features earlier in boot
  2021-12-15 18:17                 ` Venu Busireddy
@ 2021-12-15 18:33                   ` Borislav Petkov
  2021-12-15 20:17                     ` Michael Roth
  2021-12-15 20:43                   ` Michael Roth
  1 sibling, 1 reply; 183+ messages in thread
From: Borislav Petkov @ 2021-12-15 18:33 UTC (permalink / raw)
  To: Venu Busireddy
  Cc: Michael Roth, Tom Lendacky, Brijesh Singh, x86, linux-kernel,
	kvm, linux-efi, platform-driver-x86, linux-coco, linux-mm,
	Thomas Gleixner, Ingo Molnar, Joerg Roedel, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On Wed, Dec 15, 2021 at 12:17:44PM -0600, Venu Busireddy wrote:
> Boris & Tom, which implementation would you prefer?

I'd like to see how that sme_sev_parse_cpuid() would look like. And that
function should be called sev_parse_cpuid(), btw.

Because if that function turns out to be a subset of your suggestion,
functionality-wise, then we should save us the churn and simply do one
generic helper.

Btw 2, that helper should be in arch/x86/kernel/sev-shared.c so that it
gets shared by both kernel stages instead of having an inline function in
some random header.

Btw 3, I'm not crazy about the feature testing with the @features param
either. Maybe that function should return the eYx register directly,
like the cpuid_eYx() variants in the kernel do, where Y in { a, b, c, d
}.

The caller can then do its own testing:

	eax = sev_parse_cpuid(RET_EAX, ...)
	if (eax > 0) {
		if (eax & BIT(1))
			...

Something along those lines, for example.
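
For illustration only, a register-returning helper in the spirit of the
cpuid_eYx() variants could be sketched as below. This is a userspace mock,
not proposed kernel code: the RET_* selector enum, struct cpuid_regs and
fake_cpuid() (a canned stand-in for native_cpuid()) are hypothetical names,
and the canned values (SME and SEV set, C-bit at position 51) are made up
for the example.

```c
#include <stdint.h>

/* Hypothetical selector for the register to return; not from any posted patch. */
enum cpuid_ret_reg { RET_EAX, RET_EBX, RET_ECX, RET_EDX };

struct cpuid_regs {
	uint32_t eax, ebx, ecx, edx;
};

/* Stand-in for native_cpuid(): returns canned values so the sketch runs anywhere. */
static struct cpuid_regs fake_cpuid(uint32_t leaf)
{
	struct cpuid_regs r = { 0, 0, 0, 0 };

	if (leaf == 0x80000000) {
		r.eax = 0x8000001f;	/* highest supported extended leaf */
	} else if (leaf == 0x8000001f) {
		r.eax = 0x3;		/* bit 0: SME, bit 1: SEV */
		r.ebx = 51;		/* bits 5:0: encryption bit position */
	}
	return r;
}

/* Returns the requested register of leaf 0x8000001F, or 0 if the leaf is absent. */
static uint32_t sev_parse_cpuid(enum cpuid_ret_reg reg)
{
	struct cpuid_regs r = fake_cpuid(0x80000000);

	if (r.eax < 0x8000001f)
		return 0;

	r = fake_cpuid(0x8000001f);
	switch (reg) {
	case RET_EAX: return r.eax;
	case RET_EBX: return r.ebx;
	case RET_ECX: return r.ecx;
	case RET_EDX: return r.edx;
	}
	return 0;
}
```

With this shape the leaf range check lives in one place, while the feature
testing (eax & BIT(1) and friends) stays with the caller.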

But I'd have to see a concrete diff from Michael to get a better idea
how that CPUID parsing from the CPUID page is going to look like.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 01/40] x86/compressed/64: detect/setup SEV/SME features earlier in boot
  2021-12-15 14:43             ` Tom Lendacky
  2021-12-15 17:49               ` Michael Roth
@ 2021-12-15 18:58               ` Venu Busireddy
  1 sibling, 0 replies; 183+ messages in thread
From: Venu Busireddy @ 2021-12-15 18:58 UTC (permalink / raw)
  To: Tom Lendacky
  Cc: Borislav Petkov, Brijesh Singh, x86, linux-kernel, kvm,
	linux-efi, platform-driver-x86, linux-coco, linux-mm,
	Thomas Gleixner, Ingo Molnar, Joerg Roedel, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On 2021-12-15 08:43:23 -0600, Tom Lendacky wrote:
> 
> I'm not a fan of this name. You are specifically returning the encryption
> bit position but using a very generic name of get_pagetable_bit_pos() in a
> very common header file. Maybe something more like get_me_bit() and move the
> function to an existing SEV header file.
> 
> Also, this can probably just return an unsigned int that will be either 0 or
> the bit position, right?  Then the check above can be for a zero value,
> e.g.:
> 
> 	me_bit = get_me_bit();
> 	if (!me_bit) {
> 
> 	...
> 
> 	sme_me_mask = BIT_ULL(me_bit);
> 
> That should work below, too, but you'll need to verify that.
> 

Implemented the changes as you suggested. Patch attached below. Will
submit another if we reach a different consensus.

Venu

---
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 7a5934af9d47..f0d5a00e490d 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -17,6 +17,45 @@
 #define GHCB_PROTOCOL_MAX	2ULL
 #define GHCB_DEFAULT_USAGE	0ULL
 
+#define AMD_SME_BIT		BIT(0)
+#define AMD_SEV_BIT		BIT(1)
+
+/*
+ * Returns the memory encryption bit position,
+ * if the specified features are supported.
+ * Returns 0, otherwise.
+ */
+static inline unsigned int get_me_bit_pos(unsigned long features)
+{
+	unsigned int eax, ebx, ecx, edx;
+
+	/* Check for the SME/SEV support leaf */
+	eax = 0x80000000;
+	ecx = 0;
+	native_cpuid(&eax, &ebx, &ecx, &edx);
+	if (eax < 0x8000001f)
+		return 0;
+
+	eax = 0x8000001f;
+	ecx = 0;
+	native_cpuid(&eax, &ebx, &ecx, &edx);
+
+	/* Check whether the specified features are supported.
+	 * SME/SEV features:
+	 *   CPUID Fn8000_001F[EAX]
+	 *   - Bit 0 - Secure Memory Encryption support
+	 *   - Bit 1 - Secure Encrypted Virtualization support
+	 */
+	if (!(eax & features))
+		return 0;
+
+	/*
+	 *   CPUID Fn8000_001F[EBX]
+	 *   - Bits 5:0 - Pagetable bit position used to indicate encryption
+	 */
+	return ebx & 0x3f;
+}
+
 #define	VMGEXIT()			{ asm volatile("rep; vmmcall\n\r"); }
 
 enum es_result {
diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index c2bf99522e5e..838c383f102b 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -291,6 +291,7 @@ static void enforce_vmpl0(void)
 void sev_enable(struct boot_params *bp)
 {
 	unsigned int eax, ebx, ecx, edx;
+	unsigned int me_bit_pos;
 	bool snp;
 
 	/*
@@ -299,26 +300,9 @@ void sev_enable(struct boot_params *bp)
 	 */
 	snp = snp_init(bp);
 
-	/* Check for the SME/SEV support leaf */
-	eax = 0x80000000;
-	ecx = 0;
-	native_cpuid(&eax, &ebx, &ecx, &edx);
-	if (eax < 0x8000001f)
-		return;
-
-	/*
-	 * Check for the SME/SEV feature:
-	 *   CPUID Fn8000_001F[EAX]
-	 *   - Bit 0 - Secure Memory Encryption support
-	 *   - Bit 1 - Secure Encrypted Virtualization support
-	 *   CPUID Fn8000_001F[EBX]
-	 *   - Bits 5:0 - Pagetable bit position used to indicate encryption
-	 */
-	eax = 0x8000001f;
-	ecx = 0;
-	native_cpuid(&eax, &ebx, &ecx, &edx);
-	/* Check whether SEV is supported */
-	if (!(eax & BIT(1))) {
+	/* Get the memory encryption bit position if SEV is supported */
+	me_bit_pos = get_me_bit_pos(AMD_SEV_BIT);
+	if (!me_bit_pos) {
 		if (snp)
 			error("SEV-SNP support indicated by CC blob, but not CPUID.");
 		return;
@@ -350,7 +334,7 @@ void sev_enable(struct boot_params *bp)
 	if (snp && !(sev_status & MSR_AMD64_SEV_SNP_ENABLED))
 		error("SEV-SNP supported indicated by CC blob, but not SEV status MSR.");
 
-	sme_me_mask = BIT_ULL(ebx & 0x3f);
+	sme_me_mask = BIT_ULL(me_bit_pos);
 }
 
 /* Search for Confidential Computing blob in the EFI config table. */
diff --git a/arch/x86/mm/mem_encrypt_identity.c b/arch/x86/mm/mem_encrypt_identity.c
index 2f723e106ed3..57bc77382288 100644
--- a/arch/x86/mm/mem_encrypt_identity.c
+++ b/arch/x86/mm/mem_encrypt_identity.c
@@ -508,38 +508,19 @@ void __init sme_enable(struct boot_params *bp)
 	unsigned long feature_mask;
 	bool active_by_default;
 	unsigned long me_mask;
+	unsigned int me_bit_pos;
 	char buffer[16];
 	bool snp;
 	u64 msr;
 
 	snp = snp_init(bp);
 
-	/* Check for the SME/SEV support leaf */
-	eax = 0x80000000;
-	ecx = 0;
-	native_cpuid(&eax, &ebx, &ecx, &edx);
-	if (eax < 0x8000001f)
+	/* Get the memory encryption bit position if SEV or SME are supported */
+	me_bit_pos = get_me_bit_pos(AMD_SEV_BIT | AMD_SME_BIT);
+	if (!me_bit_pos)
 		return;
 
-#define AMD_SME_BIT	BIT(0)
-#define AMD_SEV_BIT	BIT(1)
-
-	/*
-	 * Check for the SME/SEV feature:
-	 *   CPUID Fn8000_001F[EAX]
-	 *   - Bit 0 - Secure Memory Encryption support
-	 *   - Bit 1 - Secure Encrypted Virtualization support
-	 *   CPUID Fn8000_001F[EBX]
-	 *   - Bits 5:0 - Pagetable bit position used to indicate encryption
-	 */
-	eax = 0x8000001f;
-	ecx = 0;
-	native_cpuid(&eax, &ebx, &ecx, &edx);
-	/* Check whether SEV or SME is supported */
-	if (!(eax & (AMD_SEV_BIT | AMD_SME_BIT)))
-		return;
-
-	me_mask = 1UL << (ebx & 0x3f);
+	me_mask = BIT_ULL(me_bit_pos);
 
 	/* Check the SEV MSR whether SEV or SME is enabled */
 	sev_status   = __rdmsr(MSR_AMD64_SEV);

^ permalink raw reply related	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 01/40] x86/compressed/64: detect/setup SEV/SME features earlier in boot
  2021-12-15 17:49               ` Michael Roth
  2021-12-15 18:17                 ` Venu Busireddy
@ 2021-12-15 19:54                 ` Venu Busireddy
  1 sibling, 0 replies; 183+ messages in thread
From: Venu Busireddy @ 2021-12-15 19:54 UTC (permalink / raw)
  To: Michael Roth
  Cc: Tom Lendacky, Borislav Petkov, Brijesh Singh, x86, linux-kernel,
	kvm, linux-efi, platform-driver-x86, linux-coco, linux-mm,
	Thomas Gleixner, Ingo Molnar, Joerg Roedel, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On 2021-12-15 11:49:34 -0600, Michael Roth wrote:
> 
> I think needing to pass in the SME/SEV CPUID bits to tell the helper when
> to parse encryption bit and when not to is a little bit awkward though.
> If there's some agreement that this will ultimately serve the purpose of
> handling all (or most) of SME/SEV-related CPUID parsing, then the caller
> shouldn't really need to be aware of any individual bit positions.
> Maybe a bool could handle that instead, e.g.:
> 
>   int get_me_bit(bool sev_only, ...)
> 
>   or
> 
>   int sme_sev_parse_cpuid(bool sev_only, ...)
> 
> where for boot/compressed sev_only=true, for kernel proper sev_only=false.

Implemented using this suggestion, and the patch is at the end.

I feel that passing of "true" or "false" to get_me_bit_pos() from
sev_enable() and sme_enable() has become less clear now. It is not
obvious what the "true" and "false" values mean.

However, both implementations (Tom's suggestions and Tom's + Mike's
suggestions) are available now. We can pick one of these, or I will redo
this if we want a different implementation.

Venu

---
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 7a5934af9d47..eb202096a1fc 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -17,6 +17,48 @@
 #define GHCB_PROTOCOL_MAX	2ULL
 #define GHCB_DEFAULT_USAGE	0ULL
 
+#define AMD_SME_BIT		BIT(0)
+#define AMD_SEV_BIT		BIT(1)
+
+/*
+ * Returns the memory encryption bit position,
+ * if the specified features are supported.
+ * Returns 0, otherwise.
+ */
+static inline unsigned int get_me_bit_pos(bool sev_only)
+{
+	unsigned int eax, ebx, ecx, edx;
+	unsigned int features;
+
+	features = AMD_SEV_BIT | (sev_only ? 0 : AMD_SME_BIT);
+
+	/* Check for the SME/SEV support leaf */
+	eax = 0x80000000;
+	ecx = 0;
+	native_cpuid(&eax, &ebx, &ecx, &edx);
+	if (eax < 0x8000001f)
+		return 0;
+
+	eax = 0x8000001f;
+	ecx = 0;
+	native_cpuid(&eax, &ebx, &ecx, &edx);
+
+	/* Check whether the specified features are supported.
+	 * SME/SEV features:
+	 *   CPUID Fn8000_001F[EAX]
+	 *   - Bit 0 - Secure Memory Encryption support
+	 *   - Bit 1 - Secure Encrypted Virtualization support
+	 */
+	if (!(eax & features))
+		return 0;
+
+	/*
+	 *   CPUID Fn8000_001F[EBX]
+	 *   - Bits 5:0 - Pagetable bit position used to indicate encryption
+	 */
+	return ebx & 0x3f;
+}
+
 #define	VMGEXIT()			{ asm volatile("rep; vmmcall\n\r"); }
 
 enum es_result {
diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index c2bf99522e5e..9a8181893af7 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -291,6 +291,7 @@ static void enforce_vmpl0(void)
 void sev_enable(struct boot_params *bp)
 {
 	unsigned int eax, ebx, ecx, edx;
+	unsigned int me_bit_pos;
 	bool snp;
 
 	/*
@@ -299,26 +300,9 @@ void sev_enable(struct boot_params *bp)
 	 */
 	snp = snp_init(bp);
 
-	/* Check for the SME/SEV support leaf */
-	eax = 0x80000000;
-	ecx = 0;
-	native_cpuid(&eax, &ebx, &ecx, &edx);
-	if (eax < 0x8000001f)
-		return;
-
-	/*
-	 * Check for the SME/SEV feature:
-	 *   CPUID Fn8000_001F[EAX]
-	 *   - Bit 0 - Secure Memory Encryption support
-	 *   - Bit 1 - Secure Encrypted Virtualization support
-	 *   CPUID Fn8000_001F[EBX]
-	 *   - Bits 5:0 - Pagetable bit position used to indicate encryption
-	 */
-	eax = 0x8000001f;
-	ecx = 0;
-	native_cpuid(&eax, &ebx, &ecx, &edx);
-	/* Check whether SEV is supported */
-	if (!(eax & BIT(1))) {
+	/* Get the memory encryption bit position if SEV is supported */
+	me_bit_pos = get_me_bit_pos(true);
+	if (!me_bit_pos) {
 		if (snp)
 			error("SEV-SNP support indicated by CC blob, but not CPUID.");
 		return;
@@ -350,7 +334,7 @@ void sev_enable(struct boot_params *bp)
 	if (snp && !(sev_status & MSR_AMD64_SEV_SNP_ENABLED))
 		error("SEV-SNP supported indicated by CC blob, but not SEV status MSR.");
 
-	sme_me_mask = BIT_ULL(ebx & 0x3f);
+	sme_me_mask = BIT_ULL(me_bit_pos);
 }
 
 /* Search for Confidential Computing blob in the EFI config table. */
diff --git a/arch/x86/mm/mem_encrypt_identity.c b/arch/x86/mm/mem_encrypt_identity.c
index 2f723e106ed3..a4979f61ecc7 100644
--- a/arch/x86/mm/mem_encrypt_identity.c
+++ b/arch/x86/mm/mem_encrypt_identity.c
@@ -508,38 +508,19 @@ void __init sme_enable(struct boot_params *bp)
 	unsigned long feature_mask;
 	bool active_by_default;
 	unsigned long me_mask;
+	unsigned int me_bit_pos;
 	char buffer[16];
 	bool snp;
 	u64 msr;
 
 	snp = snp_init(bp);
 
-	/* Check for the SME/SEV support leaf */
-	eax = 0x80000000;
-	ecx = 0;
-	native_cpuid(&eax, &ebx, &ecx, &edx);
-	if (eax < 0x8000001f)
+	/* Get the memory encryption bit position if SEV or SME are supported */
+	me_bit_pos = get_me_bit_pos(false);
+	if (!me_bit_pos)
 		return;
 
-#define AMD_SME_BIT	BIT(0)
-#define AMD_SEV_BIT	BIT(1)
-
-	/*
-	 * Check for the SME/SEV feature:
-	 *   CPUID Fn8000_001F[EAX]
-	 *   - Bit 0 - Secure Memory Encryption support
-	 *   - Bit 1 - Secure Encrypted Virtualization support
-	 *   CPUID Fn8000_001F[EBX]
-	 *   - Bits 5:0 - Pagetable bit position used to indicate encryption
-	 */
-	eax = 0x8000001f;
-	ecx = 0;
-	native_cpuid(&eax, &ebx, &ecx, &edx);
-	/* Check whether SEV or SME is supported */
-	if (!(eax & (AMD_SEV_BIT | AMD_SME_BIT)))
-		return;
-
-	me_mask = 1UL << (ebx & 0x3f);
+	me_mask = BIT_ULL(me_bit_pos);
 
 	/* Check the SEV MSR whether SEV or SME is enabled */
 	sev_status   = __rdmsr(MSR_AMD64_SEV);

^ permalink raw reply related	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 01/40] x86/compressed/64: detect/setup SEV/SME features earlier in boot
  2021-12-15 18:33                   ` Borislav Petkov
@ 2021-12-15 20:17                     ` Michael Roth
  2021-12-15 20:38                       ` Borislav Petkov
  0 siblings, 1 reply; 183+ messages in thread
From: Michael Roth @ 2021-12-15 20:17 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Venu Busireddy, Tom Lendacky, Brijesh Singh, x86, linux-kernel,
	kvm, linux-efi, platform-driver-x86, linux-coco, linux-mm,
	Thomas Gleixner, Ingo Molnar, Joerg Roedel, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On Wed, Dec 15, 2021 at 07:33:47PM +0100, Borislav Petkov wrote:
> On Wed, Dec 15, 2021 at 12:17:44PM -0600, Venu Busireddy wrote:
> > Boris & Tom, which implementation would you prefer?
> 
> I'd like to see how that sme_sev_parse_cpuid() would look like. And that
> function should be called sev_parse_cpuid(), btw.
> 
> Because if that function turns out to be a subset of your suggestion,
> functionality-wise, then we should save us the churn and simply do one
> generic helper.

I was actually thinking this proposed sev_parse_cpuid() helper would be
a superset of what Venu currently has implemented. E.g. Venu's most recent
patch does:

sev_enable():
  unsigned int me_bit_pos;

  me_bit_pos = get_me_bit(AMD_SEV_BIT)
  if (!me_bit_pos)
    return;

  ...

Let's say in the future there's need to also grab say, the VTE bit. We
could introduce a new helper, get_vte_bit() that re-does all the
0x80000000-0x8000001F range checks, some sanity checks that SEV is set if
VTE bit is set, and then now have a nice single-purpose helper that
duplicates similar checks in get_me_bit(), or we could avoid the
duplication by expanding get_me_bit() so it could be used something like:

  me_bit_pos = get_me_bit(AMD_SEV_BIT, &vte_enabled)

at which point it makes more sense to just have it be a more generic
helper, called via:

  ret = sev_parse_cpuid(AMD_SEV_BIT, &me_bit_pos, &vte_enabled)

i.e. Venu's original patch basically, but with the helper function
renamed.

and if fields are added in the future:

  sev_parse_cpuid(AMD_SEV_BIT, &me_bit_pos, &vte_enabled, &new_feature_enabled, etc..)

or if that eventually becomes unwieldy it could later be changed to return
a feature mask.

> 
> Btw 2, that helper should be in arch/x86/kernel/sev-shared.c so that it
> gets shared by both kernel stages instead of having an inline function in
> some random header.
> 
> Btw 3, I'm not crazy about the feature testing with the @features param
> either. Maybe that function should return the eYx register directly,
> like the cpuid_eYx() variants in the kernel do, where Y in { a, b, c, d
> }.
> 
> The caller can then do its own testing:
> 
> 	eax = sev_parse_cpuid(RET_EAX, ...)
> 	if (eax > 0) {
> 		if (eax & BIT(1))
> 			...
> 
> Something along those lines, for example.

I think having sev_parse_cpuid() using a more "human-readable" format
for reporting features/fields will make it easier to abstract away the
nitty-gritty details and reduce the chances for more duplication
between boot/compressed and kernel proper in the future. That
"human-readable" format could be in the form of a boolean/int
parameter list that gets expanded over time as needed (like the above
examples), or a higher-level construct like a struct/bitmask/etc. But
either way it would be nice to only have to think about specific CPUID
bits when looking at sev_parse_cpuid(), and have callers instead rely
purely on the sev_parse_cpuid() function prototype/documentation to
know what's going on.
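
A minimal sketch of the struct flavor of such a "human-readable" interface,
assuming the raw EAX/EBX values of leaf 0x8000001F have already been read.
struct sev_cpuid_info, its fields and sev_parse_cpuid_regs() are
hypothetical names used only for this example; nothing like this has been
posted yet.

```c
#include <stdbool.h>
#include <stdint.h>

#define AMD_SME_BIT	(1u << 0)	/* CPUID Fn8000_001F[EAX] bit 0 */
#define AMD_SEV_BIT	(1u << 1)	/* CPUID Fn8000_001F[EAX] bit 1 */

/* Hypothetical result structure so callers never touch raw CPUID bits. */
struct sev_cpuid_info {
	bool sme_supported;
	bool sev_supported;
	unsigned int me_bit_pos;	/* CPUID Fn8000_001F[EBX] bits 5:0 */
};

/*
 * Returns 0 and fills *info if a required feature is advertised.
 * Returns -1 otherwise, in which case *info is left untouched and
 * must not be treated as valid.
 */
static int sev_parse_cpuid_regs(uint32_t eax, uint32_t ebx, bool sev_only,
				struct sev_cpuid_info *info)
{
	uint32_t required = AMD_SEV_BIT | (sev_only ? 0 : AMD_SME_BIT);

	if (!(eax & required))
		return -1;

	info->sme_supported = !!(eax & AMD_SME_BIT);
	info->sev_supported = !!(eax & AMD_SEV_BIT);
	info->me_bit_pos = ebx & 0x3f;
	return 0;
}
```

New fields would then be added to the struct rather than to the argument
list, so existing callers would not need to change.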

> 
> But I'd have to see a concrete diff from Michael to get a better idea
> how that CPUID parsing from the CPUID page is going to look like.

It should look the same with/without CPUID page, since the CPUID page
will have already been set up early in sev_enable()/sme_enable() based
on the presence of the CC blob via snp_init(), introduced in:

 [PATCH v8 31/40] x86/compressed: add SEV-SNP feature detection/setup

Thanks,

Mike

> 
> Thx.
> 
> -- 
> Regards/Gruss,
>     Boris.
> 
> https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 01/40] x86/compressed/64: detect/setup SEV/SME features earlier in boot
  2021-12-15 20:17                     ` Michael Roth
@ 2021-12-15 20:38                       ` Borislav Petkov
  2021-12-15 21:22                         ` Michael Roth
  0 siblings, 1 reply; 183+ messages in thread
From: Borislav Petkov @ 2021-12-15 20:38 UTC (permalink / raw)
  To: Michael Roth
  Cc: Venu Busireddy, Tom Lendacky, Brijesh Singh, x86, linux-kernel,
	kvm, linux-efi, platform-driver-x86, linux-coco, linux-mm,
	Thomas Gleixner, Ingo Molnar, Joerg Roedel, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On Wed, Dec 15, 2021 at 02:17:34PM -0600, Michael Roth wrote:
> and if fields are added in the future:
> 
>   sev_parse_cpuid(AMD_SEV_BIT, &me_bit_pos, &vte_enabled, &new_feature_enabled, etc..)

And that will end up being a vararg function because of who knows what
other feature bits will have to get passed in? You have even added the
ellipsis in there.

Nope. Definitely not.

> or if that eventually becomes unwieldy

The above example is already unwieldy.

> it could later be changed to return a feature mask.

Yes, that. Clean and simple.

But it is hard to discuss anything without patches so we can continue
the topic with concrete patches. But this unification is not
super-pressing so it can go on top of the SNP pile.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 01/40] x86/compressed/64: detect/setup SEV/SME features earlier in boot
  2021-12-15 18:17                 ` Venu Busireddy
  2021-12-15 18:33                   ` Borislav Petkov
@ 2021-12-15 20:43                   ` Michael Roth
  1 sibling, 0 replies; 183+ messages in thread
From: Michael Roth @ 2021-12-15 20:43 UTC (permalink / raw)
  To: Venu Busireddy
  Cc: Tom Lendacky, Borislav Petkov, Brijesh Singh, x86, linux-kernel,
	kvm, linux-efi, platform-driver-x86, linux-coco, linux-mm,
	Thomas Gleixner, Ingo Molnar, Joerg Roedel, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On Wed, Dec 15, 2021 at 12:17:44PM -0600, Venu Busireddy wrote:
> On 2021-12-15 11:49:34 -0600, Michael Roth wrote:
> > 
> > I think in the greater context of consolidating all the SME/SEV setup
> > and re-using code, this helper stands a high chance of eventually becoming
> > something more along the lines of sme_sev_parse_cpuid(), since otherwise
> > we'd end up re-introducing multiple helpers to parse the same 0x8000001F
> > fields if we ever need to process any of the other fields advertised in
> > there. Given that, it makes sense to reserve the return value as an
> > indication that either SEV or SME are enabled, and then have a
> > pass-by-pointer parameters list to collect the individual feature
> > bits/encryption mask for cases where SEV/SME are enabled, which are only
> > treated as valid if sme_sev_parse_cpuid() returns 0.
> > 
> > So Venu's original approach of passing the encryption mask by pointer
> > seems a little closer toward that end, but I also agree Tom's approach
> > is cleaner for the current code base, so I'm fine either way, just
> > figured I'd mention this.
> > 
> > I think needing to pass in the SME/SEV CPUID bits to tell the helper when
> > to parse encryption bit and when not to is a little bit awkward though.
> > If there's some agreement that this will ultimately serve the purpose of
> > handling all (or most) of SME/SEV-related CPUID parsing, then the caller
> > shouldn't really need to be aware of any individual bit positions.
> > Maybe a bool could handle that instead, e.g.:
> > 
> >   int get_me_bit(bool sev_only, ...)
> > 
> >   or
> > 
> >   int sme_sev_parse_cpuid(bool sev_only, ...)
> > 
> > where for boot/compressed sev_only=true, for kernel proper sev_only=false.
> 
> I can implement it this way too. But I am wondering if having a
> boolean argument limits us from handling any future additions to the
> bit positions.

That's the thing, we'll pretty much always want to parse cpuid in
boot/compressed if SEV is enabled, and in kernel proper if either SEV or
SME are enabled, because they both require, at a minimum, the c-bit
position. Extensions to either SEV/SME likely won't change this, but by
using CPUID feature masks to handle this it gives the impression that
this helper relies on individual features being present in the mask in
order for the corresponding fields to be parsed, when in reality it
boils down more to SEV features needing to be enabled earlier because
they don't trust the host during early boot.

I agree the boolean flag makes things a bit less readable without
checking the function prototype though. I was going to suggest 2
separate functions that use a common helper and hide away the
boolean, e.g:

  sev_parse_cpuid() //sev-only

and

  sme_parse_cpuid() //sev or sme

but the latter maybe is a bit misleading and I couldn't think of a
better name. It's really more like sev_sme_parse_cpuid(), but I'm
not sure that will fly. Maybe sme_parse_cpuid() is fine.

You could also just have it take an enum as the first arg though:

enum sev_parse_cpuid {
    SEV_PARSE_CPUID_SEV_ONLY = 0,
    SEV_PARSE_CPUID_SME_ONLY, //unused
    SEV_PARSE_CPUID_BOTH,
};

Personally I still prefer the boolean but just some alternatives
you could consider otherwise.
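
To make the wrapper idea a bit more concrete, here is a rough sketch where
the mode selection stays internal and the two call sites use named
wrappers. Everything in it is hypothetical (the names, the enum values, and
the choice to pass in raw EAX/EBX from leaf 0x8000001F instead of executing
CPUID), and it is only meant to show the shape:

```c
#include <stdint.h>

#define AMD_SME_BIT	(1u << 0)	/* CPUID Fn8000_001F[EAX] bit 0 */
#define AMD_SEV_BIT	(1u << 1)	/* CPUID Fn8000_001F[EAX] bit 1 */

/* Hypothetical internal mode; callers only see the wrappers below. */
enum sev_parse_mode {
	SEV_PARSE_SEV_ONLY,	/* boot/compressed: SEV guests only */
	SEV_PARSE_SEV_OR_SME,	/* kernel proper: SEV or SME */
};

/* Common helper: returns the encryption bit position, or 0 if unsupported. */
static unsigned int parse_me_bit_common(enum sev_parse_mode mode,
					uint32_t eax, uint32_t ebx)
{
	uint32_t features = AMD_SEV_BIT;

	if (mode == SEV_PARSE_SEV_OR_SME)
		features |= AMD_SME_BIT;

	if (!(eax & features))
		return 0;

	return ebx & 0x3f;
}

/* SEV only, as needed by boot/compressed. */
static unsigned int sev_parse_cpuid(uint32_t eax, uint32_t ebx)
{
	return parse_me_bit_common(SEV_PARSE_SEV_ONLY, eax, ebx);
}

/* SEV or SME, as needed by kernel proper. */
static unsigned int sme_parse_cpuid(uint32_t eax, uint32_t ebx)
{
	return parse_me_bit_common(SEV_PARSE_SEV_OR_SME, eax, ebx);
}
```

That keeps the call sites self-describing without a bare true/false
argument.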

> 
> Boris & Tom, which implementation would you prefer?
> 
> Venu
> 
> 

^ permalink raw reply	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 01/40] x86/compressed/64: detect/setup SEV/SME features earlier in boot
  2021-12-15 20:38                       ` Borislav Petkov
@ 2021-12-15 21:22                         ` Michael Roth
  2022-01-03 19:10                           ` Venu Busireddy
  0 siblings, 1 reply; 183+ messages in thread
From: Michael Roth @ 2021-12-15 21:22 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Venu Busireddy, Tom Lendacky, Brijesh Singh, x86, linux-kernel,
	kvm, linux-efi, platform-driver-x86, linux-coco, linux-mm,
	Thomas Gleixner, Ingo Molnar, Joerg Roedel, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On Wed, Dec 15, 2021 at 09:38:55PM +0100, Borislav Petkov wrote:
> On Wed, Dec 15, 2021 at 02:17:34PM -0600, Michael Roth wrote:
> > and if fields are added in the future:
> > 
> >   sev_parse_cpuid(AMD_SEV_BIT, &me_bit_pos, &vte_enabled, &new_feature_enabled, etc..)
> 
> And that will end up being a vararg function because of who knows what
> other feature bits will have to get passed in? You have even added the
> ellipsis in there.

Well, not varargs, just sort of anticipating how the function prototype
might change over time as it's modified to parse for new features.

> 
> Nope. Definitely not.
> 
> > or if that eventually becomes unwieldly 
> 
> The above example is already unwieldy.
> 
> > it could later be changed to return a feature mask.
> 
> Yes, that. Clean and simple.
> 
> But it is hard to discuss anything without patches so we can continue
> the topic with concrete patches. But this unification is not
> super-pressing so it can go ontop of the SNP pile.

Yah, it's all theoretical at this point. Didn't mean to derail things
though. I mainly brought it up to suggest that Venu's original approach of
returning the encryption bit via a pointer argument might make it easier to
expand it for other purposes in the future, and that naming it for that
future purpose might encourage future developers to focus their efforts
there instead of potentially re-introducing duplicate code.

But either way it's simple enough to rework things when we actually
cross that bridge. So totally fine with saving all of this as a future
follow-up, or picking up either of Venu's patches for now if you'd still
prefer.
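
To make the "return a feature mask" idea a little more concrete, here
is a rough sketch. The SEV_FEAT_* names, struct cpuid_regs and
sev_parse_features() are invented for this example; none of them exist
in the tree or in Venu's patches.

```c
#include <stdint.h>

#define AMD_SME_BIT (1u << 0)	/* CPUID Fn8000_001F EAX[0] */
#define AMD_SEV_BIT (1u << 1)	/* CPUID Fn8000_001F EAX[1] */

/* Hypothetical feature-mask bits handed back to the caller. */
#define SEV_FEAT_SME (1ull << 0)
#define SEV_FEAT_SEV (1ull << 1)

/* Hypothetical stand-in for the raw CPUID output. */
struct cpuid_regs {
	uint32_t eax;
	uint32_t ebx;
};

/*
 * Returns a mask of the detected features. The encryption bit position
 * is still returned through a pointer, and is only written when at
 * least one feature is present. A new feature needs a new mask bit,
 * not a new parameter.
 */
static uint64_t sev_parse_features(const struct cpuid_regs *r,
				   unsigned int *me_bit_pos)
{
	uint64_t feats = 0;

	if (r->eax & AMD_SME_BIT)
		feats |= SEV_FEAT_SME;
	if (r->eax & AMD_SEV_BIT)
		feats |= SEV_FEAT_SEV;

	if (feats && me_bit_pos)
		*me_bit_pos = r->ebx & 0x3f;

	return feats;
}
```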

Thanks,

Mike

> 
> Thx.
> 
> -- 
> Regards/Gruss,
>     Boris.
> 
> https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 06/40] x86/sev: Check SEV-SNP features support
  2021-12-10 15:42 ` [PATCH v8 06/40] x86/sev: Check SEV-SNP features support Brijesh Singh
@ 2021-12-16 15:47   ` Borislav Petkov
  2021-12-16 16:28     ` Brijesh Singh
  2021-12-16 19:01   ` Venu Busireddy
  1 sibling, 1 reply; 183+ messages in thread
From: Borislav Petkov @ 2021-12-16 15:47 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	tony.luck, marcorr, sathyanarayanan.kuppuswamy

On Fri, Dec 10, 2021 at 09:42:58AM -0600, Brijesh Singh wrote:
> Version 2 of the GHCB specification added the advertisement of features
> that are supported by the hypervisor. If hypervisor supports the SEV-SNP
> then it must set the SEV-SNP features bit to indicate that the base
> SEV-SNP is supported.
> 
> Check the SEV-SNP feature while establishing the GHCB, if failed,
> terminate the guest.
> 
> Version 2 of GHCB specification adds several new NAEs, most of them are
> optional except the hypervisor feature. Now that hypervisor feature NAE
> is implemented, so bump the GHCB maximum support protocol version.
> 
> While at it, move the GHCB protocol negotitation check from VC exception

Unknown word [negotitation] in commit message, suggestions:
        ['negotiation', 'negotiator', 'negotiate', 'abnegation', 'vegetation']

> handler to sev_enable() so that all feature detection happens before
> the first VC exception.
> 
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> ---
>  arch/x86/boot/compressed/sev.c    | 21 ++++++++++++++++-----
>  arch/x86/include/asm/sev-common.h |  6 ++++++
>  arch/x86/include/asm/sev.h        |  2 +-
>  arch/x86/include/uapi/asm/svm.h   |  2 ++
>  arch/x86/kernel/sev-shared.c      | 20 ++++++++++++++++++++
>  arch/x86/kernel/sev.c             | 16 ++++++++++++++++
>  6 files changed, 61 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
> index 0b6cc6402ac1..a0708f359a46 100644
> --- a/arch/x86/boot/compressed/sev.c
> +++ b/arch/x86/boot/compressed/sev.c
> @@ -119,11 +119,8 @@ static enum es_result vc_read_mem(struct es_em_ctxt *ctxt,
>  /* Include code for early handlers */
>  #include "../../kernel/sev-shared.c"
>  
> -static bool early_setup_sev_es(void)
> +static bool early_setup_ghcb(void)
>  {
> -	if (!sev_es_negotiate_protocol())
> -		sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_PROT_UNSUPPORTED);
> -
>  	if (set_page_decrypted((unsigned long)&boot_ghcb_page))
>  		return false;
>  
> @@ -174,7 +171,7 @@ void do_boot_stage2_vc(struct pt_regs *regs, unsigned long exit_code)
>  	struct es_em_ctxt ctxt;
>  	enum es_result result;
>  
> -	if (!boot_ghcb && !early_setup_sev_es())
> +	if (!boot_ghcb && !early_setup_ghcb())
>  		sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_GEN_REQ);

Can you set up the GHCB in sev_enable() too, after the protocol version
negotiation succeeds?

>  	vc_ghcb_invalidate(boot_ghcb);
> @@ -247,5 +244,19 @@ void sev_enable(struct boot_params *bp)
>  	if (!(sev_status & MSR_AMD64_SEV_ENABLED))
>  		return;
>  
> +	/* Negotiate the GHCB protocol version */
> +	if (sev_status & MSR_AMD64_SEV_ES_ENABLED)
> +		if (!sev_es_negotiate_protocol())
> +			sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_PROT_UNSUPPORTED);
> +
> +	/*
> +	 * SNP is supported in v2 of the GHCB spec which mandates support for HV
> +	 * features. If SEV-SNP is enabled, then check if the hypervisor supports
> +	 * the SEV-SNP features.
> +	 */
> +	if (sev_status & MSR_AMD64_SEV_SNP_ENABLED && !(get_hv_features() & GHCB_HV_FT_SNP))
> +		sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SNP_UNSUPPORTED);
> +
> +
^ Superfluous newline.

>  	sme_me_mask = BIT_ULL(ebx & 0x3f);

...

> diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
> index 19ad09712902..a0cada8398a4 100644
> --- a/arch/x86/kernel/sev.c
> +++ b/arch/x86/kernel/sev.c
> @@ -43,6 +43,10 @@ static struct ghcb boot_ghcb_page __bss_decrypted __aligned(PAGE_SIZE);
>   */
>  static struct ghcb __initdata *boot_ghcb;
>  
> +/* Bitmap of SEV features supported by the hypervisor */
> +static u64 sev_hv_features;

__ro_after_init

> +
> +
>  /* #VC handler runtime per-CPU data */
>  struct sev_es_runtime_data {
>  	struct ghcb ghcb_page;
> @@ -766,6 +770,18 @@ void __init sev_es_init_vc_handling(void)
>  	if (!sev_es_check_cpu_features())
>  		panic("SEV-ES CPU Features missing");
>  
> +	/*
> +	 * SNP is supported in v2 of the GHCB spec which mandates support for HV
> +	 * features. If SEV-SNP is enabled, then check if the hypervisor supports

s/SEV-SNP/SNP/g

And please do that everywhere in sev-specific files.

This file is called sev.c and there's way too many acronyms flying
around so the simpler the better.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 06/40] x86/sev: Check SEV-SNP features support
  2021-12-16 15:47   ` Borislav Petkov
@ 2021-12-16 16:28     ` Brijesh Singh
  2021-12-16 16:58       ` Borislav Petkov
  0 siblings, 1 reply; 183+ messages in thread
From: Brijesh Singh @ 2021-12-16 16:28 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: brijesh.singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy



On 12/16/21 9:47 AM, Borislav Petkov wrote:

>>   
>> -	if (!boot_ghcb && !early_setup_sev_es())
>> +	if (!boot_ghcb && !early_setup_ghcb())
>>   		sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_GEN_REQ);
> 
> Can you setup the GHCB in sev_enable() too, after the protocol version
> negotiation succeeds?

Good question; the GHCB page is needed only at the time of a #VC. If
the second stage VC handler is not called after sev_enable() during
the decompression stage, setting up the GHCB page in sev_enable() is a
waste. But in practice, the second stage VC handler will be called
during decompression. It also raises a similar question for the kernel
proper: should we do the same over there?

Joerg did the initial ES support and may have had other reasons for
choosing to set up the GHCB page in the handler. I was trying to avoid
the flow change. We can do this as a pre- or post-SNP patch; let me
know your thoughts.





>> +	 * SNP is supported in v2 of the GHCB spec which mandates support for HV
>> +	 * features. If SEV-SNP is enabled, then check if the hypervisor supports
> 
> s/SEV-SNP/SNP/g
> 
> And please do that everywhere in sev-specific files.
> 
> This file is called sev.c and there's way too many acronyms flying
> around so the simpler the better.
> 

Noted.

thanks

^ permalink raw reply	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 06/40] x86/sev: Check SEV-SNP features support
  2021-12-16 16:28     ` Brijesh Singh
@ 2021-12-16 16:58       ` Borislav Petkov
  0 siblings, 0 replies; 183+ messages in thread
From: Borislav Petkov @ 2021-12-16 16:58 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	tony.luck, marcorr, sathyanarayanan.kuppuswamy

On Thu, Dec 16, 2021 at 10:28:45AM -0600, Brijesh Singh wrote:
> A good question; the GHCB page is needed only at the time of #VC.  If the
> second stage VC handler is not called after the sev_enable() during the
> decompression stage, setting up the GHCB page in sev_enable() is a waste.

It would be a waste if no #VC would fire. But we set up a #VC handler so
we might just as well set up the GHCB for it too.

> But in practice, the second stage VC handler will be called during
> decompression. It also brings a similar question for the kernel
> proper, should we do the same over there?

I'd think so, yes.
 
> Joerg did the initial ES support and may have other reasons he chose to set
> up GHCB page in the handler. I was trying to avoid the flow change. We can
> do this as a pre or post-SNP patch; let me know your thoughts?

You can do a separate patch only with that change and if it causes
trouble, we can always debug/delay it.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 06/40] x86/sev: Check SEV-SNP features support
  2021-12-10 15:42 ` [PATCH v8 06/40] x86/sev: Check SEV-SNP features support Brijesh Singh
  2021-12-16 15:47   ` Borislav Petkov
@ 2021-12-16 19:01   ` Venu Busireddy
  1 sibling, 0 replies; 183+ messages in thread
From: Venu Busireddy @ 2021-12-16 19:01 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On 2021-12-10 09:42:58 -0600, Brijesh Singh wrote:
> Version 2 of the GHCB specification added the advertisement of features
> that are supported by the hypervisor. If hypervisor supports the SEV-SNP
> then it must set the SEV-SNP features bit to indicate that the base
> SEV-SNP is supported.
> 
> Check the SEV-SNP feature while establishing the GHCB, if failed,
> terminate the guest.
> 
> Version 2 of GHCB specification adds several new NAEs, most of them are
> optional except the hypervisor feature. Now that hypervisor feature NAE
> is implemented, so bump the GHCB maximum support protocol version.
> 
> While at it, move the GHCB protocol negotitation check from VC exception
> handler to sev_enable() so that all feature detection happens before
> the first VC exception.
> 
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> ---
>  arch/x86/boot/compressed/sev.c    | 21 ++++++++++++++++-----
>  arch/x86/include/asm/sev-common.h |  6 ++++++
>  arch/x86/include/asm/sev.h        |  2 +-
>  arch/x86/include/uapi/asm/svm.h   |  2 ++
>  arch/x86/kernel/sev-shared.c      | 20 ++++++++++++++++++++
>  arch/x86/kernel/sev.c             | 16 ++++++++++++++++
>  6 files changed, 61 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
> index 0b6cc6402ac1..a0708f359a46 100644
> --- a/arch/x86/boot/compressed/sev.c
> +++ b/arch/x86/boot/compressed/sev.c
> @@ -119,11 +119,8 @@ static enum es_result vc_read_mem(struct es_em_ctxt *ctxt,
>  /* Include code for early handlers */
>  #include "../../kernel/sev-shared.c"
>  
> -static bool early_setup_sev_es(void)
> +static bool early_setup_ghcb(void)
>  {
> -	if (!sev_es_negotiate_protocol())
> -		sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_PROT_UNSUPPORTED);

Should the name sev_es_terminate() be changed to a more generic
name, as we are simply terminating the guest, not SEV or ES as the
name implies?

Other than that...

Reviewed-by: Venu Busireddy <venu.busireddy@oracle.com>


^ permalink raw reply	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 07/40] x86/sev: Add a helper for the PVALIDATE instruction
  2021-12-10 15:42 ` [PATCH v8 07/40] x86/sev: Add a helper for the PVALIDATE instruction Brijesh Singh
@ 2021-12-16 20:20   ` Venu Busireddy
  0 siblings, 0 replies; 183+ messages in thread
From: Venu Busireddy @ 2021-12-16 20:20 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On 2021-12-10 09:42:59 -0600, Brijesh Singh wrote:
> An SNP-active guest uses the PVALIDATE instruction to validate or
> rescind the validation of a guest page’s RMP entry. Upon completion,
> a return code is stored in EAX and rFLAGS bits are set based on the
> return code. If the instruction completed successfully, the CF
> indicates if the content of the RMP were changed or not.
> 
> See AMD APM Volume 3 for additional details.
> 
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>

Reviewed-by: Venu Busireddy <venu.busireddy@oracle.com>


^ permalink raw reply	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 08/40] x86/sev: Check the vmpl level
  2021-12-10 15:43 ` [PATCH v8 08/40] x86/sev: Check the vmpl level Brijesh Singh
@ 2021-12-16 20:24   ` Venu Busireddy
  2021-12-16 23:39     ` Mikolaj Lisik
  0 siblings, 1 reply; 183+ messages in thread
From: Venu Busireddy @ 2021-12-16 20:24 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On 2021-12-10 09:43:00 -0600, Brijesh Singh wrote:
> Virtual Machine Privilege Level (VMPL) feature in the SEV-SNP architecture
> allows a guest VM to divide its address space into four levels. The level
> can be used to provide the hardware isolated abstraction layers with a VM.
> The VMPL0 is the highest privilege, and VMPL3 is the least privilege.
> Certain operations must be done by the VMPL0 software, such as:
> 
> * Validate or invalidate memory range (PVALIDATE instruction)
> * Allocate VMSA page (RMPADJUST instruction when VMSA=1)
> 
> The initial SEV-SNP support requires that the guest kernel is running on
> VMPL0. Add a check to make sure that kernel is running at VMPL0 before
> continuing the boot. There is no easy method to query the current VMPL
> level, so use the RMPADJUST instruction to determine whether the guest is
> running at the VMPL0.
> 
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> ---
>  arch/x86/boot/compressed/sev.c    | 34 ++++++++++++++++++++++++++++---
>  arch/x86/include/asm/sev-common.h |  1 +
>  arch/x86/include/asm/sev.h        | 16 +++++++++++++++
>  3 files changed, 48 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
> index a0708f359a46..9be369f72299 100644
> --- a/arch/x86/boot/compressed/sev.c
> +++ b/arch/x86/boot/compressed/sev.c
> @@ -212,6 +212,31 @@ static inline u64 rd_sev_status_msr(void)
>  	return ((high << 32) | low);
>  }
>  
> +static void enforce_vmpl0(void)
> +{
> +	u64 attrs;
> +	int err;
> +
> +	/*
> +	 * There is no straightforward way to query the current VMPL level. The
> +	 * simplest method is to use the RMPADJUST instruction to change a page
> +	 * permission to a VMPL level-1, and if the guest kernel is launched at
> +	 * a level <= 1, then RMPADJUST instruction will return an error.

Perhaps a nit. When you say "level <= 1", do you mean a level lower than or
equal to 1 semantically, or numerically?

Venu


^ permalink raw reply	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 08/40] x86/sev: Check the vmpl level
  2021-12-16 20:24   ` Venu Busireddy
@ 2021-12-16 23:39     ` Mikolaj Lisik
  2021-12-17 22:19       ` Brijesh Singh
  0 siblings, 1 reply; 183+ messages in thread
From: Mikolaj Lisik @ 2021-12-16 23:39 UTC (permalink / raw)
  To: Venu Busireddy
  Cc: Brijesh Singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum,
	Borislav Petkov, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	tony.luck, marcorr, sathyanarayanan.kuppuswamy

On Thu, Dec 16, 2021 at 12:24 PM Venu Busireddy
<venu.busireddy@oracle.com> wrote:
>
> On 2021-12-10 09:43:00 -0600, Brijesh Singh wrote:
> > Virtual Machine Privilege Level (VMPL) feature in the SEV-SNP architecture
> > allows a guest VM to divide its address space into four levels. The level
> > can be used to provide the hardware isolated abstraction layers with a VM.
> > The VMPL0 is the highest privilege, and VMPL3 is the least privilege.
> > Certain operations must be done by the VMPL0 software, such as:
> >
> > * Validate or invalidate memory range (PVALIDATE instruction)
> > * Allocate VMSA page (RMPADJUST instruction when VMSA=1)
> >
> > The initial SEV-SNP support requires that the guest kernel is running on
> > VMPL0. Add a check to make sure that kernel is running at VMPL0 before
> > continuing the boot. There is no easy method to query the current VMPL
> > level, so use the RMPADJUST instruction to determine whether the guest is
> > running at the VMPL0.
> >
> > Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> > ---
> >  arch/x86/boot/compressed/sev.c    | 34 ++++++++++++++++++++++++++++---
> >  arch/x86/include/asm/sev-common.h |  1 +
> >  arch/x86/include/asm/sev.h        | 16 +++++++++++++++
> >  3 files changed, 48 insertions(+), 3 deletions(-)
> >
> > diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
> > index a0708f359a46..9be369f72299 100644
> > --- a/arch/x86/boot/compressed/sev.c
> > +++ b/arch/x86/boot/compressed/sev.c
> > @@ -212,6 +212,31 @@ static inline u64 rd_sev_status_msr(void)
> >       return ((high << 32) | low);
> >  }
> >
> > +static void enforce_vmpl0(void)
> > +{
> > +     u64 attrs;
> > +     int err;
> > +
> > +     /*
> > +      * There is no straightforward way to query the current VMPL level. The
> > +      * simplest method is to use the RMPADJUST instruction to change a page
> > +      * permission to a VMPL level-1, and if the guest kernel is launched at
> > +      * a level <= 1, then RMPADJUST instruction will return an error.
>
> Perhaps a nit. When you say "level <= 1", do you mean a level lower than or
> equal to 1 semantically, or numerically?
>

+1 to this. Additionally I found the "level-1" confusing which I
interpreted as "level minus one".

Perhaps phrasing it as "level one", or "level=1" would be more explicit?

^ permalink raw reply	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 09/40] x86/compressed: Add helper for validating pages in the decompression stage
  2021-12-10 15:43 ` [PATCH v8 09/40] x86/compressed: Add helper for validating pages in the decompression stage Brijesh Singh
@ 2021-12-17 20:47   ` Venu Busireddy
  2021-12-17 23:24     ` Brijesh Singh
  2021-12-21 13:01   ` Borislav Petkov
  1 sibling, 1 reply; 183+ messages in thread
From: Venu Busireddy @ 2021-12-17 20:47 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On 2021-12-10 09:43:01 -0600, Brijesh Singh wrote:
> Many of the integrity guarantees of SEV-SNP are enforced through the
> Reverse Map Table (RMP). Each RMP entry contains the GPA at which a
> particular page of DRAM should be mapped. The VMs can request the
> hypervisor to add pages in the RMP table via the Page State Change VMGEXIT
> defined in the GHCB specification. Inside each RMP entry is a Validated
> flag; this flag is automatically cleared to 0 by the CPU hardware when a
> new RMP entry is created for a guest. Each VM page can be either
> validated or invalidated, as indicated by the Validated flag in the RMP
> entry. Memory access to a private page that is not validated generates
> a #VC. A VM must use PVALIDATE instruction to validate the private page
> before using it.
> 
> To maintain the security guarantee of SEV-SNP guests, when transitioning
> pages from private to shared, the guest must invalidate the pages before
> asking the hypervisor to change the page state to shared in the RMP table.
> 
> After the pages are mapped private in the page table, the guest must issue
> a page state change VMGEXIT to make the pages private in the RMP table and
> validate it.
> 
> On boot, BIOS should have validated the entire system memory. During
> the kernel decompression stage, the VC handler uses the
> set_memory_decrypted() to make the GHCB page shared (i.e clear encryption
> attribute). And while exiting from the decompression, it calls the
> set_page_encrypted() to make the page private.
> 
> Add sev_snp_set_page_{private,shared}() helper that is used by the

Since the functions being added are snp_set_page_{private,shared}(),

s/sev_snp_set_page_/snp_set_page_/

Also, s/helper that is/helpers that are/

> set_memory_{decrypt,encrypt}() to change the page state in the RMP table.

s/decrypt,encrypt/decrypted,encrypted/

> 
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> ---
>  arch/x86/boot/compressed/ident_map_64.c | 18 +++++++++-
>  arch/x86/boot/compressed/misc.h         |  4 +++
>  arch/x86/boot/compressed/sev.c          | 46 +++++++++++++++++++++++++
>  arch/x86/include/asm/sev-common.h       | 26 ++++++++++++++
>  4 files changed, 93 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/boot/compressed/ident_map_64.c b/arch/x86/boot/compressed/ident_map_64.c
> index f7213d0943b8..ef77453cc629 100644
> --- a/arch/x86/boot/compressed/ident_map_64.c
> +++ b/arch/x86/boot/compressed/ident_map_64.c
> @@ -275,15 +275,31 @@ static int set_clr_page_flags(struct x86_mapping_info *info,
>  	 * Changing encryption attributes of a page requires to flush it from
>  	 * the caches.
>  	 */
> -	if ((set | clr) & _PAGE_ENC)
> +	if ((set | clr) & _PAGE_ENC) {
>  		clflush_page(address);
>  
> +		/*
> +		 * If the encryption attribute is being cleared, then change
> +		 * the page state to shared in the RMP table.
> +		 */
> +		if (clr)

This function is also called by set_page_non_present() with clr set to
_PAGE_PRESENT. Do we want to change the page state to shared even when
the page is not present? If not, shouldn't the check be (clr & _PAGE_ENC)?
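
A minimal sketch of the stricter check, assuming made-up bit values (in
the boot code _PAGE_ENC is derived from the encryption mask, so the
value below is illustrative only) and a hypothetical helper name:

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative values only; the real _PAGE_ENC depends on the C-bit position. */
#define _PAGE_PRESENT (1ull << 0)
#define _PAGE_ENC     (1ull << 51)

/*
 * Hypothetical helper: true only when the encryption attribute itself
 * is among the bits being cleared, not when any bit is being cleared.
 */
static bool clears_encryption(uint64_t clr)
{
	return !!(clr & _PAGE_ENC);
}
```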

> +			snp_set_page_shared(pte_pfn(*ptep) << PAGE_SHIFT);
> +	}
> +
>  	/* Update PTE */
>  	pte = *ptep;
>  	pte = pte_set_flags(pte, set);
>  	pte = pte_clear_flags(pte, clr);
>  	set_pte(ptep, pte);
>  
> +	/*
> +	 * If the encryption attribute is being set, then change the page state to
> +	 * private in the RMP entry. The page state must be done after the PTE
> +	 * is updated.
> +	 */
> +	if (set & _PAGE_ENC)
> +		snp_set_page_private(__pa(address & PAGE_MASK));
> +
>  	/* Flush TLB after changing encryption attribute */
>  	write_cr3(top_level_pgt);
>  
> diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
> index 23e0e395084a..01cc13c12059 100644
> --- a/arch/x86/boot/compressed/misc.h
> +++ b/arch/x86/boot/compressed/misc.h
> @@ -124,6 +124,8 @@ static inline void console_init(void)
>  void sev_enable(struct boot_params *bp);
>  void sev_es_shutdown_ghcb(void);
>  extern bool sev_es_check_ghcb_fault(unsigned long address);
> +void snp_set_page_private(unsigned long paddr);
> +void snp_set_page_shared(unsigned long paddr);
>  #else
>  static inline void sev_enable(struct boot_params *bp) { }
>  static inline void sev_es_shutdown_ghcb(void) { }
> @@ -131,6 +133,8 @@ static inline bool sev_es_check_ghcb_fault(unsigned long address)
>  {
>  	return false;
>  }
> +static inline void snp_set_page_private(unsigned long paddr) { }
> +static inline void snp_set_page_shared(unsigned long paddr) { }
>  #endif
>  
>  /* acpi.c */
> diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
> index 9be369f72299..12a93acc94ba 100644
> --- a/arch/x86/boot/compressed/sev.c
> +++ b/arch/x86/boot/compressed/sev.c
> @@ -119,6 +119,52 @@ static enum es_result vc_read_mem(struct es_em_ctxt *ctxt,
>  /* Include code for early handlers */
>  #include "../../kernel/sev-shared.c"
>  
> +static inline bool sev_snp_enabled(void)
> +{
> +	return sev_status & MSR_AMD64_SEV_SNP_ENABLED;
> +}
> +
> +static void __page_state_change(unsigned long paddr, enum psc_op op)
> +{
> +	u64 val;
> +
> +	if (!sev_snp_enabled())
> +		return;
> +
> +	/*
> +	 * If private -> shared then invalidate the page before requesting the

This comment is confusing. We don't know what the present state is,
right? If we don't, shouldn't we just say:

    If the operation is SNP_PAGE_STATE_SHARED, invalidate the page before
    requesting the state change in the RMP table.

> +	 * state change in the RMP table.
> +	 */
> +	if (op == SNP_PAGE_STATE_SHARED && pvalidate(paddr, RMP_PG_SIZE_4K, 0))
> +		sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_PVALIDATE);
> +
> +	/* Issue VMGEXIT to change the page state in RMP table. */
> +	sev_es_wr_ghcb_msr(GHCB_MSR_PSC_REQ_GFN(paddr >> PAGE_SHIFT, op));
> +	VMGEXIT();
> +
> +	/* Read the response of the VMGEXIT. */
> +	val = sev_es_rd_ghcb_msr();
> +	if ((GHCB_RESP_CODE(val) != GHCB_MSR_PSC_RESP) || GHCB_MSR_PSC_RESP_VAL(val))
> +		sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_PSC);
> +
> +	/*
> +	 * Now that page is added in the RMP table, validate it so that it is
> +	 * consistent with the RMP entry.

The page is not "added", right? Shouldn't we just say:

    Validate the page so that it is consistent with the RMP entry.
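
As I read the commit message and the hunk above, the ordering depends
only on the requested operation, not on the current state. A small
model of that ordering, for illustration only: the trace structure, the
step names and page_state_change_model() are invented here, the enum
values are placeholders, and nothing in it executes PVALIDATE or
VMGEXIT.

```c
#include <stddef.h>

/* Placeholder values; only the names mirror the patch. */
enum psc_op {
	SNP_PAGE_STATE_PRIVATE = 1,
	SNP_PAGE_STATE_SHARED,
};

/* Hypothetical step names used to record the order of operations. */
enum step {
	STEP_PVALIDATE_RESCIND,	/* pvalidate(..., 0) */
	STEP_PSC_VMGEXIT,	/* page state change request to the hypervisor */
	STEP_PVALIDATE,		/* pvalidate(..., 1) */
};

struct trace {
	enum step steps[4];
	size_t n;
};

static void log_step(struct trace *t, enum step s)
{
	if (t->n < sizeof(t->steps) / sizeof(t->steps[0]))
		t->steps[t->n++] = s;
}

static void page_state_change_model(struct trace *t, enum psc_op op)
{
	/* Going shared: rescind the validation before the RMP change. */
	if (op == SNP_PAGE_STATE_SHARED)
		log_step(t, STEP_PVALIDATE_RESCIND);

	/* Ask the hypervisor to change the page state in the RMP table. */
	log_step(t, STEP_PSC_VMGEXIT);

	/* Going private: validate the page after the RMP change. */
	if (op == SNP_PAGE_STATE_PRIVATE)
		log_step(t, STEP_PVALIDATE);
}
```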

Venu

^ permalink raw reply	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 08/40] x86/sev: Check the vmpl level
  2021-12-16 23:39     ` Mikolaj Lisik
@ 2021-12-17 22:19       ` Brijesh Singh
  2021-12-17 22:33         ` Tom Lendacky
  0 siblings, 1 reply; 183+ messages in thread
From: Brijesh Singh @ 2021-12-17 22:19 UTC (permalink / raw)
  To: Mikolaj Lisik, Venu Busireddy
  Cc: brijesh.singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum,
	Borislav Petkov, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	tony.luck, marcorr, sathyanarayanan.kuppuswamy


On 12/16/21 5:39 PM, Mikolaj Lisik wrote:
> On Thu, Dec 16, 2021 at 12:24 PM Venu Busireddy
> <venu.busireddy@oracle.com> wrote:
>> On 2021-12-10 09:43:00 -0600, Brijesh Singh wrote:
>>> Virtual Machine Privilege Level (VMPL) feature in the SEV-SNP architecture
>>> allows a guest VM to divide its address space into four levels. The level
>>> can be used to provide the hardware isolated abstraction layers with a VM.
>>> The VMPL0 is the highest privilege, and VMPL3 is the least privilege.
>>> Certain operations must be done by the VMPL0 software, such as:
>>>
>>> * Validate or invalidate memory range (PVALIDATE instruction)
>>> * Allocate VMSA page (RMPADJUST instruction when VMSA=1)
>>>
>>> The initial SEV-SNP support requires that the guest kernel is running on
>>> VMPL0. Add a check to make sure that kernel is running at VMPL0 before
>>> continuing the boot. There is no easy method to query the current VMPL
>>> level, so use the RMPADJUST instruction to determine whether the guest is
>>> running at the VMPL0.
>>>
>>> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
>>> ---
>>>  arch/x86/boot/compressed/sev.c    | 34 ++++++++++++++++++++++++++++---
>>>  arch/x86/include/asm/sev-common.h |  1 +
>>>  arch/x86/include/asm/sev.h        | 16 +++++++++++++++
>>>  3 files changed, 48 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
>>> index a0708f359a46..9be369f72299 100644
>>> --- a/arch/x86/boot/compressed/sev.c
>>> +++ b/arch/x86/boot/compressed/sev.c
>>> @@ -212,6 +212,31 @@ static inline u64 rd_sev_status_msr(void)
>>>       return ((high << 32) | low);
>>>  }
>>>
>>> +static void enforce_vmpl0(void)
>>> +{
>>> +     u64 attrs;
>>> +     int err;
>>> +
>>> +     /*
>>> +      * There is no straightforward way to query the current VMPL level. The
>>> +      * simplest method is to use the RMPADJUST instruction to change a page
>>> +      * permission to a VMPL level-1, and if the guest kernel is launched at
>>> +      * a level <= 1, then RMPADJUST instruction will return an error.
>> Perhaps a nit. When you say "level <= 1", do you mean a level lower than or
>> equal to 1 semantically, or numerically?

It is meant numerically; please see the AMD APM vol 3.

Here is the snippet from the APM's RMPADJUST description:

IF (TARGET_VMPL <= CURRENT_VMPL)  // Only permissions for numerically
        EAX = FAIL_PERMISSION     // higher VMPL can be modified
        EXIT


> +1 to this. Additionally I found the "level-1" confusing which I
> interpreted as "level minus one".
>
> Perhaps phrasing it as "level one", or "level=1" would be more explicit?
>
Sure, I will make it clear that it's target VMPL level 1 and not (target
level - 1).

thanks




* Re: [PATCH v8 08/40] x86/sev: Check the vmpl level
  2021-12-17 22:19       ` Brijesh Singh
@ 2021-12-17 22:33         ` Tom Lendacky
  2021-12-20 18:10           ` Borislav Petkov
  0 siblings, 1 reply; 183+ messages in thread
From: Tom Lendacky @ 2021-12-17 22:33 UTC (permalink / raw)
  To: Brijesh Singh, Mikolaj Lisik, Venu Busireddy
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On 12/17/21 4:19 PM, Brijesh Singh wrote:
> 
> On 12/16/21 5:39 PM, Mikolaj Lisik wrote:
>> On Thu, Dec 16, 2021 at 12:24 PM Venu Busireddy
>> <venu.busireddy@oracle.com> wrote:
>>> On 2021-12-10 09:43:00 -0600, Brijesh Singh wrote:
>>>> Virtual Machine Privilege Level (VMPL) feature in the SEV-SNP architecture
>>>> allows a guest VM to divide its address space into four levels. The level
>>>> can be used to provide the hardware isolated abstraction layers with a VM.
>>>> The VMPL0 is the highest privilege, and VMPL3 is the least privilege.
>>>> Certain operations must be done by the VMPL0 software, such as:
>>>>
>>>> * Validate or invalidate memory range (PVALIDATE instruction)
>>>> * Allocate VMSA page (RMPADJUST instruction when VMSA=1)
>>>>
>>>> The initial SEV-SNP support requires that the guest kernel is running on
>>>> VMPL0. Add a check to make sure that kernel is running at VMPL0 before
>>>> continuing the boot. There is no easy method to query the current VMPL
>>>> level, so use the RMPADJUST instruction to determine whether the guest is
>>>> running at the VMPL0.
>>>>
>>>> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
>>>> ---
>>>>   arch/x86/boot/compressed/sev.c    | 34 ++++++++++++++++++++++++++++---
>>>>   arch/x86/include/asm/sev-common.h |  1 +
>>>>   arch/x86/include/asm/sev.h        | 16 +++++++++++++++
>>>>   3 files changed, 48 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
>>>> index a0708f359a46..9be369f72299 100644
>>>> --- a/arch/x86/boot/compressed/sev.c
>>>> +++ b/arch/x86/boot/compressed/sev.c
>>>> @@ -212,6 +212,31 @@ static inline u64 rd_sev_status_msr(void)
>>>>        return ((high << 32) | low);
>>>>   }
>>>>
>>>> +static void enforce_vmpl0(void)
>>>> +{
>>>> +     u64 attrs;
>>>> +     int err;
>>>> +
>>>> +     /*
>>>> +      * There is no straightforward way to query the current VMPL level. The
>>>> +      * simplest method is to use the RMPADJUST instruction to change a page
>>>> +      * permission to a VMPL level-1, and if the guest kernel is launched at
>>>> +      * a level <= 1, then RMPADJUST instruction will return an error.
>>> Perhaps a nit. When you say "level <= 1", do you mean a level lower than or
>>> equal to 1 semantically, or numerically?
> 
> Its numerically, please see the AMD APM vol 3.

Actually it is not numerically...  if it was numerically, then 0 <= 1 
would return an error, but VMPL0 is the highest permission level.

> 
> Here is the snippet from the APM RMPAJUST.
> 
> IF (TARGET_VMPL <= CURRENT_VMPL)  // Only permissions for numerically

Notice, that the target VMPL is checked against the current VMPL. So if 
the target VMPL is numerically less than or equal to the current VMPL 
(e.g. you are trying to modify permissions for VMPL1 when you are running 
at VMPL2), that is a permission error. So similar to CPL, 0 is the highest 
permission followed by 1 then 2 then 3.
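
To make the rule concrete, here is a minimal stand-alone C model of the
comparison in the APM pseudocode quoted above. It is only a sketch: the
names rmpadjust_permitted() and running_at_vmpl0() are hypothetical, and
the model covers the VMPL comparison alone, not the instruction itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the RMPADJUST permission rule; not kernel code. */
enum { VMPL_MAX = 3 };

/*
 * Returns true when software at current_vmpl may adjust the permissions
 * of target_vmpl. The APM pseudocode fails the instruction when
 * TARGET_VMPL <= CURRENT_VMPL, so only numerically higher (that is,
 * less privileged) targets are allowed.
 */
static bool rmpadjust_permitted(unsigned int current_vmpl,
				unsigned int target_vmpl)
{
	assert(current_vmpl <= VMPL_MAX && target_vmpl <= VMPL_MAX);
	return target_vmpl > current_vmpl;
}

/* The probe from the patch: targeting VMPL1 succeeds only from VMPL0. */
static bool running_at_vmpl0(unsigned int current_vmpl)
{
	return rmpadjust_permitted(current_vmpl, 1);
}
```

With this model, a caller at VMPL0 passes the probe and callers at
VMPL1 through VMPL3 do not, which is the property enforce_vmpl0()
relies on.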

Thanks,
Tom

> 
>          EAX = FAIL_PERMISSION                // higher VMPL can be modified
> 
>          EXIT
> 
> 
>> +1 to this. Additionally I found the "level-1" confusing which I
>> interpreted as "level minus one".
>>
>> Perhaps phrasing it as "level one", or "level=1" would be more explicit?
>>
> Sure, I will make it clear that its target vmpl level 1 and not (target
> level - 1).
> 
> thanks
> 
> 


* Re: [PATCH v8 09/40] x86/compressed: Add helper for validating pages in the decompression stage
  2021-12-17 20:47   ` Venu Busireddy
@ 2021-12-17 23:24     ` Brijesh Singh
  2022-01-03 18:43       ` Venu Busireddy
  0 siblings, 1 reply; 183+ messages in thread
From: Brijesh Singh @ 2021-12-17 23:24 UTC (permalink / raw)
  To: Venu Busireddy
  Cc: brijesh.singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum,
	Borislav Petkov, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	tony.luck, marcorr, sathyanarayanan.kuppuswamy


On 12/17/21 2:47 PM, Venu Busireddy wrote:

>>  	 * the caches.
>>  	 */
>> -	if ((set | clr) & _PAGE_ENC)
>> +	if ((set | clr) & _PAGE_ENC) {
>>  		clflush_page(address);
>>  
>> +		/*
>> +		 * If the encryption attribute is being cleared, then change
>> +		 * the page state to shared in the RMP table.
>> +		 */
>> +		if (clr)
> This function is also called by set_page_non_present() with clr set to
> _PAGE_PRESENT. Do we want to change the page state to shared even when
> the page is not present? If not, shouldn't the check be (clr & _PAGE_ENC)?

I am not able to follow your comment. Here we only pay attention to the
encryption attribute: if the encryption attribute is getting cleared, then
issue the PSC. In the case of set_page_non_present(), the outer if() block
will evaluate to false. Am I missing something?
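
The two variants of the inner check can be compared with a small
stand-alone model. This is a sketch under stated assumptions: the
DEMO_* bit positions are illustrative only (the real _PAGE_ENC mask is
determined at boot), and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit positions only; the real _PAGE_ENC mask is set at boot. */
#define DEMO_PAGE_PRESENT (UINT64_C(1) << 0)
#define DEMO_PAGE_RW      (UINT64_C(1) << 1)
#define DEMO_PAGE_ENC     (UINT64_C(1) << 51)

/* Outer test from the patch: does the call touch the encryption bit at all? */
static bool touches_enc(uint64_t set, uint64_t clr)
{
	return ((set | clr) & DEMO_PAGE_ENC) != 0;
}

/* Inner test as posted: any cleared bit counts. */
static bool shared_as_posted(uint64_t set, uint64_t clr)
{
	return touches_enc(set, clr) && clr != 0;
}

/* Inner test as suggested in review: only a cleared encryption bit counts. */
static bool shared_as_suggested(uint64_t set, uint64_t clr)
{
	return touches_enc(set, clr) && (clr & DEMO_PAGE_ENC) != 0;
}
```

For the set_page_non_present() case (set = 0, clr = present bit) both
variants skip the page state change, as described above. They differ
only if a caller sets the encryption bit while clearing some other bit
in the same call; whether any caller does that is not established here.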


>> +	/*
>> +	 * If private -> shared then invalidate the page before requesting the
> This comment is confusing. We don't know what the present state is,
> right? If we don't, shouldn't we just say:
>
>     If the operation is SNP_PAGE_STATE_SHARED, invalidate the page before
>     requesting the state change in the RMP table.
>
By default all the pages are private, so I don't see any issue with
saying "private -> shared".


>> +	 * state change in the RMP table.
>> +	 */
>> +	if (op == SNP_PAGE_STATE_SHARED && pvalidate(paddr, RMP_PG_SIZE_4K, 0))
>> +		sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_PVALIDATE);
>> +
>> +	/* Issue VMGEXIT to change the page state in RMP table. */
>> +	sev_es_wr_ghcb_msr(GHCB_MSR_PSC_REQ_GFN(paddr >> PAGE_SHIFT, op));
>> +	VMGEXIT();
>> +
>> +	/* Read the response of the VMGEXIT. */
>> +	val = sev_es_rd_ghcb_msr();
>> +	if ((GHCB_RESP_CODE(val) != GHCB_MSR_PSC_RESP) || GHCB_MSR_PSC_RESP_VAL(val))
>> +		sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_PSC);
>> +
>> +	/*
>> +	 * Now that page is added in the RMP table, validate it so that it is
>> +	 * consistent with the RMP entry.
> The page is not "added", right? Shouldn't we just say:

Technically, the PSC modifies the RMP entry, so I should say that instead
of "added".


>     Validate the page so that it is consistent with the RMP entry.

Yes, I am okay with it.


> Venu


* Re: [PATCH v8 08/40] x86/sev: Check the vmpl level
  2021-12-17 22:33         ` Tom Lendacky
@ 2021-12-20 18:10           ` Borislav Petkov
  2022-01-04 15:23             ` Brijesh Singh
  0 siblings, 1 reply; 183+ messages in thread
From: Borislav Petkov @ 2021-12-20 18:10 UTC (permalink / raw)
  To: Tom Lendacky
  Cc: Brijesh Singh, Mikolaj Lisik, Venu Busireddy, x86, linux-kernel,
	kvm, linux-efi, platform-driver-x86, linux-coco, linux-mm,
	Thomas Gleixner, Ingo Molnar, Joerg Roedel, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On Fri, Dec 17, 2021 at 04:33:02PM -0600, Tom Lendacky wrote:
> > > > > +      * There is no straightforward way to query the current VMPL level. The
> > > > > +      * simplest method is to use the RMPADJUST instruction to change a page
> > > > > +      * permission to a VMPL level-1, and if the guest kernel is launched at
> > > > > +      * a level <= 1, then RMPADJUST instruction will return an error.
> > > > Perhaps a nit. When you say "level <= 1", do you mean a level lower than or
> > > > equal to 1 semantically, or numerically?
> > 
> > Its numerically, please see the AMD APM vol 3.
> 
> Actually it is not numerically...  if it was numerically, then 0 <= 1 would
> return an error, but VMPL0 is the highest permission level.

Just write in that comment exactly what this function does:

"RMPADJUST modifies RMP permissions of a lesser-privileged (numerically
higher) privilege level. Here, clear the VMPL1 permission mask of the
GHCB page. If the guest is not running at VMPL0, this will fail.

If the guest is running at VMPL0, it will succeed. Even if that operation
modifies permission bits, it is still ok to do currently because Linux
SNP guests are supported only on VMPL0 so VMPL1 or higher permission
masks changing is a don't-care."

and then everything is clear wrt numbering, privilege, etc.

Ok?

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


* Re: [PATCH v8 09/40] x86/compressed: Add helper for validating pages in the decompression stage
  2021-12-10 15:43 ` [PATCH v8 09/40] x86/compressed: Add helper for validating pages in the decompression stage Brijesh Singh
  2021-12-17 20:47   ` Venu Busireddy
@ 2021-12-21 13:01   ` Borislav Petkov
  1 sibling, 0 replies; 183+ messages in thread
From: Borislav Petkov @ 2021-12-21 13:01 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	tony.luck, marcorr, sathyanarayanan.kuppuswamy

On Fri, Dec 10, 2021 at 09:43:01AM -0600, Brijesh Singh wrote:
> diff --git a/arch/x86/boot/compressed/ident_map_64.c b/arch/x86/boot/compressed/ident_map_64.c
> index f7213d0943b8..ef77453cc629 100644
> --- a/arch/x86/boot/compressed/ident_map_64.c
> +++ b/arch/x86/boot/compressed/ident_map_64.c
> @@ -275,15 +275,31 @@ static int set_clr_page_flags(struct x86_mapping_info *info,
>  	 * Changing encryption attributes of a page requires to flush it from
>  	 * the caches.
>  	 */
> -	if ((set | clr) & _PAGE_ENC)
> +	if ((set | clr) & _PAGE_ENC) {
>  		clflush_page(address);
>  
> +		/*
> +		 * If the encryption attribute is being cleared, then change
> +		 * the page state to shared in the RMP table.
> +		 */
> +		if (clr)
> +			snp_set_page_shared(pte_pfn(*ptep) << PAGE_SHIFT);

You forgot to change that one.

> +	}
> +
>  	/* Update PTE */
>  	pte = *ptep;
>  	pte = pte_set_flags(pte, set);
>  	pte = pte_clear_flags(pte, clr);
>  	set_pte(ptep, pte);
>  
> +	/*
> +	 * If the encryption attribute is being set, then change the page state to
> +	 * private in the RMP entry. The page state must be done after the PTE
                                                   ^
                                                 change

Geez, tell me, why should I be even bothering to review stuff if I have
to go look at the previous review I did and find that you haven't really
addressed it?!

> diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h
> index 7ac5842e32b6..a2f956cfafba 100644
> --- a/arch/x86/include/asm/sev-common.h
> +++ b/arch/x86/include/asm/sev-common.h
> @@ -57,6 +57,32 @@
>  #define GHCB_MSR_AP_RESET_HOLD_REQ	0x006
>  #define GHCB_MSR_AP_RESET_HOLD_RESP	0x007
>  
> +/*
> + * SNP Page State Change Operation
> + *
> + * GHCBData[55:52] - Page operation:
> + *   0x0001 – Page assignment, Private
> + *   0x0002 – Page assignment, Shared

I wonder how you've achieved that:

massage_diff: Warning: Unicode char [–] (0x2013) in line: + *   0x0001 – Page assignment, Private
massage_diff: Warning: Unicode char [–] (0x2013) in line: + *   0x0002 – Page assignment, Shared

See https://trojansource.codes/ for some background.
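
A check of this kind is easy to sketch. The function below is not
massage_diff, whose implementation is not shown in this thread; it is a
hypothetical helper that flags the first byte outside 7-bit ASCII in a
line, which is enough to catch an en dash pasted into a comment.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the offset of the first byte outside 7-bit ASCII, or -1 if the
 * line is clean. UTF-8 encodes every non-ASCII character with bytes that
 * have the top bit set, so a byte-wise scan is enough to flag them.
 */
static long first_non_ascii(const char *line)
{
	assert(line != NULL);
	for (size_t i = 0; line[i] != '\0'; i++) {
		if ((unsigned char)line[i] & 0x80)
			return (long)i;
	}
	return -1;
}
```

Running it over each added line of a diff before sending would have
caught the two lines reported above.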

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


* Re: [PATCH v8 11/40] x86/sev: Register GHCB memory when SEV-SNP is active
  2021-12-10 15:43 ` [PATCH v8 11/40] x86/sev: " Brijesh Singh
@ 2021-12-22 13:16   ` Borislav Petkov
  2021-12-22 15:16     ` Brijesh Singh
  2022-01-03 22:47   ` Venu Busireddy
  1 sibling, 1 reply; 183+ messages in thread
From: Borislav Petkov @ 2021-12-22 13:16 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	tony.luck, marcorr, sathyanarayanan.kuppuswamy

On Fri, Dec 10, 2021 at 09:43:03AM -0600, Brijesh Singh wrote:
> @@ -652,7 +652,7 @@ static enum es_result vc_handle_msr(struct ghcb *ghcb, struct es_em_ctxt *ctxt)
>   * This function runs on the first #VC exception after the kernel
>   * switched to virtual addresses.
>   */
> -static bool __init sev_es_setup_ghcb(void)
> +static bool __init setup_ghcb(void)
>  {
>  	/* First make sure the hypervisor talks a supported protocol. */
>  	if (!sev_es_negotiate_protocol())

Ok, let me stare at this for a while:

This gets called by handle_vc_boot_ghcb() which gets set at build time:

arch/x86/kernel/head_64.S:372:SYM_DATA(initial_vc_handler,      .quad handle_vc_boot_ghcb)

initial_vc_handler() gets called by vc_boot_ghcb() which gets set in

early_setup_idt()

and that function already does sev_snp_register_ghcb().

So why don't you concentrate the work setup_ghcb() does before the first
#VC and call it in early_setup_idt(), before the IDT is set?

And then you get rid of yet another setup-at-first-use case?
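
The difference between the two approaches can be shown with a small
stand-alone sketch. All names below are hypothetical stand-ins rather
than the kernel's functions, and the bodies are reduced to the bare
control flow.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; none of these are the kernel's functions. */
static bool ghcb_ready;
static int setup_calls;

static void demo_setup_ghcb(void)
{
	setup_calls++;
	ghcb_ready = true;
}

/* Setup-at-first-use: every exception pays for the "is it ready?" test. */
static void demo_vc_handler_lazy(void)
{
	if (!ghcb_ready)
		demo_setup_ghcb();
	/* ... handle the exception ... */
}

/* Eager variant: setup runs once, before the handler can be reached. */
static void demo_early_setup_idt(void)
{
	demo_setup_ghcb();
	/* ... install the IDT ... */
}

static void demo_vc_handler_eager(void)
{
	assert(ghcb_ready);	/* guaranteed by demo_early_setup_idt() */
	/* ... handle the exception ... */
}
```

In the eager variant the handler has no initialization branch left, so
there is one less state to reason about on the exception path.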

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


* Re: [PATCH v8 11/40] x86/sev: Register GHCB memory when SEV-SNP is active
  2021-12-22 13:16   ` Borislav Petkov
@ 2021-12-22 15:16     ` Brijesh Singh
  0 siblings, 0 replies; 183+ messages in thread
From: Brijesh Singh @ 2021-12-22 15:16 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: brijesh.singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy



On 12/22/21 7:16 AM, Borislav Petkov wrote:
> On Fri, Dec 10, 2021 at 09:43:03AM -0600, Brijesh Singh wrote:
>> @@ -652,7 +652,7 @@ static enum es_result vc_handle_msr(struct ghcb *ghcb, struct es_em_ctxt *ctxt)
>>    * This function runs on the first #VC exception after the kernel
>>    * switched to virtual addresses.
>>    */
>> -static bool __init sev_es_setup_ghcb(void)
>> +static bool __init setup_ghcb(void)
>>   {
>>   	/* First make sure the hypervisor talks a supported protocol. */
>>   	if (!sev_es_negotiate_protocol())
> 
> Ok, let me stare at this for a while:
> 
> This gets called by handle_vc_boot_ghcb() which gets set at build time:
> 
> arch/x86/kernel/head_64.S:372:SYM_DATA(initial_vc_handler,      .quad handle_vc_boot_ghcb)
> 
> initial_vc_handler() gets called by vc_boot_ghcb() which gets set in
> 
> early_setup_idt()
> 
> and that function already does sev_snp_register_ghcb().
> 
> So why don't you concentrate the work setup_ghcb() does before the first
> #VC and call it in early_setup_idt(), before the IDT is set?
> 
> And then you get rid of yet another setup-at-first-use case?
> 

I was following the existing SEV-ES implementation, in which the GHCB is
set up on the first #VC. But recently you recommended moving the setup
outside of the VC handler for the decompression path, and I was going to
do the same for the kernel proper. I have tried moving the GHCB setup
outside and it seems to be working okay for me (limited testing so
far). I will check with Joerg to see if there was any reason for doing
the GHCB setup inside the VC handler for the SEV-ES case.


* Re: [PATCH v8 12/40] x86/sev: Add helper for validating pages in early enc attribute changes
  2021-12-10 15:43 ` [PATCH v8 12/40] x86/sev: Add helper for validating pages in early enc attribute changes Brijesh Singh
@ 2021-12-23 11:50   ` Borislav Petkov
  2022-01-04 15:33     ` Brijesh Singh
  2022-01-03 23:28   ` Venu Busireddy
  1 sibling, 1 reply; 183+ messages in thread
From: Borislav Petkov @ 2021-12-23 11:50 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	tony.luck, marcorr, sathyanarayanan.kuppuswamy

On Fri, Dec 10, 2021 at 09:43:04AM -0600, Brijesh Singh wrote:
> The early_set_memory_{encrypt,decrypt}() are used for changing the
					^
					ed()


> page from decrypted (shared) to encrypted (private) and vice versa.
> When SEV-SNP is active, the page state transition needs to go through
> additional steps.
> 
> If the page is transitioned from shared to private, then perform the
> following after the encryption attribute is set in the page table:
> 
> 1. Issue the page state change VMGEXIT to add the page as a private
>    in the RMP table.
> 2. Validate the page after its successfully added in the RMP table.
> 
> To maintain the security guarantees, if the page is transitioned from
> private to shared, then perform the following before clearing the
> encryption attribute from the page table.
> 
> 1. Invalidate the page.
> 2. Issue the page state change VMGEXIT to make the page shared in the
>    RMP table.
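
The ordering described in the two lists above can be modelled in a few
lines of stand-alone C. This is a sketch, not the patch: struct
demo_page and both helpers are hypothetical, and each field assignment
stands in for the real page table update, page state change VMGEXIT,
or PVALIDATE.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one page; it only tracks ordering, not hardware. */
struct demo_page {
	bool c_bit;		/* encryption attribute in the page table */
	bool rmp_private;	/* page state recorded in the RMP table */
	bool validated;		/* set by PVALIDATE in the real flow */
};

/* shared -> private: set the C-bit, then PSC to private, then validate. */
static void demo_make_private(struct demo_page *p)
{
	p->c_bit = true;
	p->rmp_private = true;		/* page state change VMGEXIT */
	assert(p->rmp_private);		/* validate only after the RMP update */
	p->validated = true;
}

/* private -> shared: invalidate, then PSC to shared, then clear the C-bit. */
static void demo_make_shared(struct demo_page *p)
{
	p->validated = false;
	assert(!p->validated);		/* never share a still-validated page */
	p->rmp_private = false;		/* page state change VMGEXIT */
	p->c_bit = false;
}
```

The point of the model is the mirror-image order: validation is the
last step when a page becomes private and invalidation is the first
step when it becomes shared.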
> 
> The early_set_memory_{encrypt,decrypt} can be called before the GHCB

ditto.

> is setup, use the SNP page state MSR protocol VMGEXIT defined in the GHCB
> specification to request the page state change in the RMP table.
> 
> While at it, add a helper snp_prep_memory() that can be used outside
> the sev specific files to change the page state for a specified memory

"outside of the sev specific"? What is that trying to say?

/me goes and looks at the whole patchset...

Right, so that is used only in probe_roms(). So that should say:

"Add a helper ... which will be used in probe_roms(), in a later patch."

> range.
> 
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> ---
>  arch/x86/include/asm/sev.h |  10 ++++
>  arch/x86/kernel/sev.c      | 102 +++++++++++++++++++++++++++++++++++++
>  arch/x86/mm/mem_encrypt.c  |  51 +++++++++++++++++--

Right, for the next revision, that file is called mem_encrypt_amd.c now.

...

> diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
> index 3ba801ff6afc..5d19aad06670 100644
> --- a/arch/x86/mm/mem_encrypt.c
> +++ b/arch/x86/mm/mem_encrypt.c
> @@ -31,6 +31,7 @@
>  #include <asm/processor-flags.h>
>  #include <asm/msr.h>
>  #include <asm/cmdline.h>
> +#include <asm/sev.h>
>  
>  #include "mm_internal.h"
>  
> @@ -49,6 +50,34 @@ EXPORT_SYMBOL_GPL(sev_enable_key);
>  /* Buffer used for early in-place encryption by BSP, no locking needed */
>  static char sme_early_buffer[PAGE_SIZE] __initdata __aligned(PAGE_SIZE);
>  
> +/*
> + * When SNP is active, change the page state from private to shared before
> + * copying the data from the source to destination and restore after the copy.
> + * This is required because the source address is mapped as decrypted by the
> + * caller of the routine.
> + */
> +static inline void __init snp_memcpy(void *dst, void *src, size_t sz,
> +				     unsigned long paddr, bool decrypt)
> +{
> +	unsigned long npages = PAGE_ALIGN(sz) >> PAGE_SHIFT;
> +
> +	if (!cc_platform_has(CC_ATTR_SEV_SNP) || !decrypt) {

Yeah, looking at this again, I don't really like this multiplexing.
Let's do this instead, diff ontop:

---
diff --git a/arch/x86/mm/mem_encrypt_amd.c b/arch/x86/mm/mem_encrypt_amd.c
index c14fd8254198..e3f7a84449bb 100644
--- a/arch/x86/mm/mem_encrypt_amd.c
+++ b/arch/x86/mm/mem_encrypt_amd.c
@@ -49,24 +49,18 @@ EXPORT_SYMBOL(sme_me_mask);
 static char sme_early_buffer[PAGE_SIZE] __initdata __aligned(PAGE_SIZE);
 
 /*
- * When SNP is active, change the page state from private to shared before
- * copying the data from the source to destination and restore after the copy.
- * This is required because the source address is mapped as decrypted by the
- * caller of the routine.
+ * SNP-specific routine which needs to additionally change the page state from
+ * private to shared before copying the data from the source to destination and
+ * restore after the copy.
  */
 static inline void __init snp_memcpy(void *dst, void *src, size_t sz,
 				     unsigned long paddr, bool decrypt)
 {
 	unsigned long npages = PAGE_ALIGN(sz) >> PAGE_SHIFT;
 
-	if (!cc_platform_has(CC_ATTR_SEV_SNP) || !decrypt) {
-		memcpy(dst, src, sz);
-		return;
-	}
-
 	/*
-	 * With SNP, the paddr needs to be accessed decrypted, mark the page
-	 * shared in the RMP table before copying it.
+	 * @paddr needs to be accessed decrypted, mark the page shared in the
+	 * RMP table before copying it.
 	 */
 	early_snp_set_memory_shared((unsigned long)__va(paddr), paddr, npages);
 
@@ -124,8 +118,13 @@ static void __init __sme_early_enc_dec(resource_size_t paddr,
 		 * Use a temporary buffer, of cache-line multiple size, to
 		 * avoid data corruption as documented in the APM.
 		 */
-		snp_memcpy(sme_early_buffer, src, len, paddr, enc);
-		snp_memcpy(dst, sme_early_buffer, len, paddr, !enc);
+		if (cc_platform_has(CC_ATTR_SEV_SNP)) {
+			snp_memcpy(sme_early_buffer, src, len, paddr, enc);
+			snp_memcpy(dst, sme_early_buffer, len, paddr, !enc);
+		} else {
+			memcpy(sme_early_buffer, src, len);
+			memcpy(dst, sme_early_buffer, len);
+		}
 
 		early_memunmap(dst, len);
 		early_memunmap(src, len);

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


* Re: [PATCH v8 13/40] x86/kernel: Make the bss.decrypted section shared in RMP table
  2021-12-10 15:43 ` [PATCH v8 13/40] x86/kernel: Make the bss.decrypted section shared in RMP table Brijesh Singh
@ 2021-12-28 11:53   ` Borislav Petkov
  2022-01-04 17:56   ` Venu Busireddy
  1 sibling, 0 replies; 183+ messages in thread
From: Borislav Petkov @ 2021-12-28 11:53 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	tony.luck, marcorr, sathyanarayanan.kuppuswamy

On Fri, Dec 10, 2021 at 09:43:05AM -0600, Brijesh Singh wrote:
> The encryption attribute for the bss.decrypted region is cleared in the

s/region/section/

s/bss.decrypted/.bss..decrypted/g

if you're going to call it by its name, use the correct one pls.

Ditto in the Subject.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


* Re: [PATCH v8 14/40] x86/kernel: Validate rom memory before accessing when SEV-SNP is active
  2021-12-10 15:43 ` [PATCH v8 14/40] x86/kernel: Validate rom memory before accessing when SEV-SNP is active Brijesh Singh
@ 2021-12-28 15:40   ` Borislav Petkov
  0 siblings, 0 replies; 183+ messages in thread
From: Borislav Petkov @ 2021-12-28 15:40 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	tony.luck, marcorr, sathyanarayanan.kuppuswamy

On Fri, Dec 10, 2021 at 09:43:06AM -0600, Brijesh Singh wrote:

> Subject: Re: [PATCH v8 14/40] x86/kernel: Validate rom memory before accessing when SEV-SNP is active

s/rom/ROM/

> The probe_roms() access the memory range (0xc0000 - 0x10000) to probe

"probe_roms() accesses... "

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


* Re: [PATCH v8 15/40] x86/mm: Add support to validate memory when changing C-bit
  2021-12-10 15:43 ` [PATCH v8 15/40] x86/mm: Add support to validate memory when changing C-bit Brijesh Singh
@ 2021-12-29 11:09   ` Borislav Petkov
  2022-01-04 22:31   ` Venu Busireddy
  1 sibling, 0 replies; 183+ messages in thread
From: Borislav Petkov @ 2021-12-29 11:09 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	tony.luck, marcorr, sathyanarayanan.kuppuswamy

On Fri, Dec 10, 2021 at 09:43:07AM -0600, Brijesh Singh wrote:
> The set_memory_{encrypt,decrypt}() are used for changing the pages

$ git grep -E "set_memory_decrypt\W"
$

Please check all your commit messages whether you're quoting the proper
functions.

> from decrypted (shared) to encrypted (private) and vice versa.
> When SEV-SNP is active, the page state transition needs to go through
> additional steps.

		    ... "done by the guest."

I think it is important to state here who's supposed to do those
additional steps.

...

> @@ -659,6 +659,161 @@ void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op
>  		WARN(1, "invalid memory op %d\n", op);
>  }
>  
> +static int vmgexit_psc(struct snp_psc_desc *desc)
> +{
> +	int cur_entry, end_entry, ret = 0;
> +	struct snp_psc_desc *data;
> +	struct ghcb_state state;
> +	unsigned long flags;
> +	struct ghcb *ghcb;
> +
> +	/* __sev_get_ghcb() need to run with IRQs disabled because it using per-cpu GHCB */

"... because it uses a per-CPU GHCB."

> +	local_irq_save(flags);
> +
> +	ghcb = __sev_get_ghcb(&state);
> +	if (unlikely(!ghcb))
> +		panic("SEV-SNP: Failed to get GHCB\n");

__sev_get_ghcb() will already panic if even the backup GHCB is active so
you don't need to panic here too - just check the retval.

> +	/* Copy the input desc into GHCB shared buffer */
> +	data = (struct snp_psc_desc *)ghcb->shared_buffer;
> +	memcpy(ghcb->shared_buffer, desc, min_t(int, GHCB_SHARED_BUF_SIZE, sizeof(*desc)));
> +
> +	/*
> +	 * As per the GHCB specification, the hypervisor can resume the guest
> +	 * before processing all the entries. Check whether all the entries
> +	 * are processed. If not, then keep retrying.
> +	 *
> +	 * The stragtegy here is to wait for the hypervisor to change the page

+        * The stragtegy here is to wait for the hypervisor to change the page
Unknown word [stragtegy] in comment, suggestions:
        ['strategy', 'strategist']

> +	 * state in the RMP table before guest accesses the memory pages. If the
> +	 * page state change was not successful, then later memory access will result
> +	 * in a crash.
> +	 */
> +	cur_entry = data->hdr.cur_entry;
> +	end_entry = data->hdr.end_entry;
> +
> +	while (data->hdr.cur_entry <= data->hdr.end_entry) {
> +		ghcb_set_sw_scratch(ghcb, (u64)__pa(data));
> +

Add a comment here:

		/* This will advance the shared buffer data points to. */

I had asked about it already but nada:

"So then you *absolutely* want to use data->hdr everywhere and then also
write why in the comment above the check that data gets updated by the
HV call."

> +		ret = sev_es_ghcb_hv_call(ghcb, true, NULL, SVM_VMGEXIT_PSC, 0, 0);
> +
> +		/*
> +		 * Page State Change VMGEXIT can pass error code through
> +		 * exit_info_2.
> +		 */
> +		if (WARN(ret || ghcb->save.sw_exit_info_2,
> +			 "SEV-SNP: PSC failed ret=%d exit_info_2=%llx\n",
> +			 ret, ghcb->save.sw_exit_info_2)) {
> +			ret = 1;
> +			goto out;
> +		}
> +
> +		/* Verify that reserved bit is not set */
> +		if (WARN(data->hdr.reserved, "Reserved bit is set in the PSC header\n")) {
> +			ret = 1;
> +			goto out;
> +		}
> +
> +		/*
> +		 * Sanity check that entry processing is not going backward.

"... backwards."

> +		 * This will happen only if hypervisor is tricking us.
> +		 */
> +		if (WARN(data->hdr.end_entry > end_entry || cur_entry > data->hdr.cur_entry,
> +"SEV-SNP:  PSC processing going backward, end_entry %d (got %d) cur_entry %d (got %d)\n",
> +			 end_entry, data->hdr.end_entry, cur_entry, data->hdr.cur_entry)) {
> +			ret = 1;
> +			goto out;
> +		}
> +	}
> +
> +out:
> +	__sev_put_ghcb(&state);
> +	local_irq_restore(flags);
> +
> +	return ret;
> +}
> +
> +static void __set_pages_state(struct snp_psc_desc *data, unsigned long vaddr,
> +			      unsigned long vaddr_end, int op)
> +{
> +	struct psc_hdr *hdr;
> +	struct psc_entry *e;
> +	unsigned long pfn;
> +	int i;
> +
> +	hdr = &data->hdr;
> +	e = data->entries;
> +
> +	memset(data, 0, sizeof(*data));
> +	i = 0;
> +
> +	while (vaddr < vaddr_end) {
> +		if (is_vmalloc_addr((void *)vaddr))
> +			pfn = vmalloc_to_pfn((void *)vaddr);
> +		else
> +			pfn = __pa(vaddr) >> PAGE_SHIFT;
> +
> +		e->gfn = pfn;
> +		e->operation = op;
> +		hdr->end_entry = i;

		/*
		 * Current SNP implementation doesn't keep track of the page size so use
		 * 4K for simplicity.
		 */

> +		e->pagesize = RMP_PG_SIZE_4K;
> +
> +		vaddr = vaddr + PAGE_SIZE;
> +		e++;
> +		i++;
> +	}

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

* Re: [PATCH v8 17/40] KVM: SVM: Create a separate mapping for the SEV-ES save area
  2021-12-10 15:43 ` [PATCH v8 17/40] KVM: SVM: Create a separate mapping for the SEV-ES save area Brijesh Singh
@ 2021-12-30 12:19   ` Borislav Petkov
  2022-01-05  1:38   ` Venu Busireddy
  1 sibling, 0 replies; 183+ messages in thread
From: Borislav Petkov @ 2021-12-30 12:19 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	tony.luck, marcorr, sathyanarayanan.kuppuswamy

On Fri, Dec 10, 2021 at 09:43:09AM -0600, Brijesh Singh wrote:
> +/* Save area definition for SEV-ES and SEV-SNP guests */
> +struct sev_es_save_area {

I'd still call it sev_save_area for simplicity. And
EXPECTED_SEV_SAVE_AREA_SIZE and so on.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

* Re: [PATCH v8 22/40] x86/sev: move MSR-based VMGEXITs for CPUID to helper
  2021-12-10 15:43 ` [PATCH v8 22/40] x86/sev: move MSR-based VMGEXITs for CPUID to helper Brijesh Singh
@ 2021-12-30 18:52   ` Sean Christopherson
  2022-01-04 20:57     ` Borislav Petkov
  2022-01-04 23:36     ` Michael Roth
  2022-01-06 18:38   ` Venu Busireddy
  1 sibling, 2 replies; 183+ messages in thread
From: Sean Christopherson @ 2021-12-30 18:52 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum,
	Borislav Petkov, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	tony.luck, marcorr, sathyanarayanan.kuppuswamy

On Fri, Dec 10, 2021, Brijesh Singh wrote:
> From: Michael Roth <michael.roth@amd.com>
> 
> This code will also be used later for SEV-SNP-validated CPUID code in
> some cases, so move it to a common helper.
> 
> Signed-off-by: Michael Roth <michael.roth@amd.com>
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> ---
>  arch/x86/kernel/sev-shared.c | 84 +++++++++++++++++++++++++-----------
>  1 file changed, 58 insertions(+), 26 deletions(-)
> 
> diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c
> index 3aaef1a18ffe..d89481b31022 100644
> --- a/arch/x86/kernel/sev-shared.c
> +++ b/arch/x86/kernel/sev-shared.c
> @@ -194,6 +194,58 @@ enum es_result sev_es_ghcb_hv_call(struct ghcb *ghcb, bool set_ghcb_msr,
>  	return verify_exception_info(ghcb, ctxt);
>  }
>  
> +static int sev_cpuid_hv(u32 func, u32 subfunc, u32 *eax, u32 *ebx,

Having @subfunc, a.k.a. index, as a parameter is weird/confusing/fragile because it's
neither consumed nor checked.  Peeking ahead, it looks like all future users pass '0'.
Taking the index but dropping it on the floor is asking for future breakage.  Either
drop it or assert that it's zero.

> +			u32 *ecx, u32 *edx)
> +{
> +	u64 val;
> +
> +	if (eax) {
> +		sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(func, GHCB_CPUID_REQ_EAX));
> +		VMGEXIT();
> +		val = sev_es_rd_ghcb_msr();
> +
> +		if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
> +			return -EIO;
> +
> +		*eax = (val >> 32);
> +	}
> +
> +	if (ebx) {
> +		sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(func, GHCB_CPUID_REQ_EBX));
> +		VMGEXIT();
> +		val = sev_es_rd_ghcb_msr();
> +
> +		if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
> +			return -EIO;
> +
> +		*ebx = (val >> 32);
> +	}
> +
> +	if (ecx) {
> +		sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(func, GHCB_CPUID_REQ_ECX));
> +		VMGEXIT();
> +		val = sev_es_rd_ghcb_msr();
> +
> +		if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
> +			return -EIO;
> +
> +		*ecx = (val >> 32);
> +	}
> +
> +	if (edx) {
> +		sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(func, GHCB_CPUID_REQ_EDX));
> +		VMGEXIT();
> +		val = sev_es_rd_ghcb_msr();
> +
> +		if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
> +			return -EIO;
> +
> +		*edx = (val >> 32);
> +	}

That's a lot of pasta!  If you add

  static int __sev_cpuid_hv(u32 func, int reg_idx, u32 *reg)
  {
	u64 val;

	if (!reg)
		return 0;

	sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(func, reg_idx));
	VMGEXIT();
	val = sev_es_rd_ghcb_msr();
	if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
		return -EIO;

	*reg = (val >> 32);
	return 0;
  }

then this helper can become something like:

  static int sev_cpuid_hv(u32 func, u32 *eax, u32 *ebx, u32 *ecx, u32 *edx)
  {
	int ret;

	ret = __sev_cpuid_hv(func, GHCB_CPUID_REQ_EAX, eax);
	ret = ret ? : __sev_cpuid_hv(func, GHCB_CPUID_REQ_EBX, ebx);
	ret = ret ? : __sev_cpuid_hv(func, GHCB_CPUID_REQ_ECX, ecx);
	ret = ret ? : __sev_cpuid_hv(func, GHCB_CPUID_REQ_EDX, edx);

	return ret;
  }
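
For anyone who wants to convince themselves that the chaining above stops at
the first failure and skips NULL registers, a small userspace model follows.
It is only a sketch: mock_cpuid_reg() stands in for the MSR write, VMGEXIT
and MSR read, and mock_fail_idx and mock_calls are hypothetical test knobs.
Note that "ret ? : f()" relies on the GNU omitted-operand extension, which
gcc and clang accept by default.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mock state: which register index fails, and a call counter. */
static int mock_fail_idx = -1;
static int mock_calls;

/* Hypothetical stand-in for the MSR write + VMGEXIT + MSR read sequence. */
static int mock_cpuid_reg(uint32_t func, int reg_idx, uint32_t *reg)
{
	/* The caller does not want this register, so no exit is needed. */
	if (!reg)
		return 0;

	mock_calls++;
	if (reg_idx == mock_fail_idx)
		return -EIO;

	/* Arbitrary but recognizable value for the test. */
	*reg = func * 4 + reg_idx;
	return 0;
}

static int mock_cpuid(uint32_t func, uint32_t *eax, uint32_t *ebx,
		      uint32_t *ecx, uint32_t *edx)
{
	int ret;

	/* Each step runs only if every previous step returned 0. */
	ret = mock_cpuid_reg(func, 0, eax);
	ret = ret ? : mock_cpuid_reg(func, 1, ebx);
	ret = ret ? : mock_cpuid_reg(func, 2, ecx);
	ret = ret ? : mock_cpuid_reg(func, 3, edx);

	return ret;
}
```

With ebx set to fail, only the eax and ebx requests are issued and -EIO is
returned without touching ecx or edx.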

> +
> +	return 0;
> +}
> +

* Re: [PATCH v8 20/40] x86/sev: Use SEV-SNP AP creation to start secondary CPUs
  2021-12-10 15:43 ` [PATCH v8 20/40] x86/sev: Use SEV-SNP AP creation to start secondary CPUs Brijesh Singh
  2021-12-10 18:50   ` Dave Hansen
@ 2021-12-31 15:36   ` Borislav Petkov
  2022-01-03 18:10     ` Vlastimil Babka
  2022-01-12 16:33     ` Brijesh Singh
  1 sibling, 2 replies; 183+ messages in thread
From: Borislav Petkov @ 2021-12-31 15:36 UTC (permalink / raw)
  To: Brijesh Singh, Vlastimil Babka
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Michael Roth, Kirill A . Shutemov,
	Andi Kleen, Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On Fri, Dec 10, 2021 at 09:43:12AM -0600, Brijesh Singh wrote:
> diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h
> index 123a96f7dff2..38c14601ae4a 100644
> --- a/arch/x86/include/asm/sev-common.h
> +++ b/arch/x86/include/asm/sev-common.h
> @@ -104,6 +104,7 @@ enum psc_op {
>  	(((u64)(v) & GENMASK_ULL(63, 12)) >> 12)
>  
>  #define GHCB_HV_FT_SNP			BIT_ULL(0)
> +#define GHCB_HV_FT_SNP_AP_CREATION	(BIT_ULL(1) | GHCB_HV_FT_SNP)

Why is bit 0 ORed in? Because it "Requires SEV-SNP Feature."?

You can still enforce that requirement in the test though.

Or all those SEV features should not be bits but masks -
GHCB_HV_FT_SNP_AP_CREATION_MASK for example, seeing how the others
require the previous bits to be set too.
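
A rough sketch of the mask variant, assuming the _MASK suffix floated above.
BIT_ULL() is redefined locally so the snippet builds in userspace, and
hv_has_features() is a hypothetical helper name, not something from the
patch:

```c
#include <stdbool.h>
#include <stdint.h>

#define BIT_ULL(n)			(1ULL << (n))

#define GHCB_HV_FT_SNP			BIT_ULL(0)
/* Hypothetical name: AP creation also requires the base SNP feature bit. */
#define GHCB_HV_FT_SNP_AP_CREATION_MASK	(BIT_ULL(1) | GHCB_HV_FT_SNP)

/* True only if every bit in @mask is advertised in @hv_features. */
static bool hv_has_features(uint64_t hv_features, uint64_t mask)
{
	return (hv_features & mask) == mask;
}
```

A hypervisor advertising only bit 1 without bit 0 would then be rejected by
the same test that checks for AP creation support.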

...

>  static DEFINE_PER_CPU(struct sev_es_runtime_data*, runtime_data);
>  DEFINE_STATIC_KEY_FALSE(sev_es_enable_key);
>  
> +static DEFINE_PER_CPU(struct sev_es_save_area *, snp_vmsa);

This is what I mean: the struct is called "sev_es... " but the variable
"snp_...". I.e., it is all sev_<something>.

> +
>  static __always_inline bool on_vc_stack(struct pt_regs *regs)
>  {
>  	unsigned long sp = regs->sp;
> @@ -814,6 +818,231 @@ void snp_set_memory_private(unsigned long vaddr, unsigned int npages)
>  	pvalidate_pages(vaddr, npages, 1);
>  }
>  
> +static int snp_set_vmsa(void *va, bool vmsa)
> +{
> +	u64 attrs;
> +
> +	/*
> +	 * The RMPADJUST instruction is used to set or clear the VMSA bit for
> +	 * a page. A change to the VMSA bit is only performed when running
> +	 * at VMPL0 and is ignored at other VMPL levels. If too low of a target

What does "too low" mean here exactly?

The kernel is not at VMPL0 but the specified level is lower? Weird...

> +	 * VMPL level is specified, the instruction can succeed without changing
> +	 * the VMSA bit should the kernel not be in VMPL0. Using a target VMPL
> +	 * level of 1 will return a FAIL_PERMISSION error if the kernel is not
> +	 * at VMPL0, thus ensuring that the VMSA bit has been properly set when
> +	 * no error is returned.

We do check whether we run at VMPL0 earlier when starting the guest -
see enforce_vmpl0().

I don't think you need any of that additional verification here - just
assume you are at VMPL0.

> +	 */
> +	attrs = 1;
> +	if (vmsa)
> +		attrs |= RMPADJUST_VMSA_PAGE_BIT;
> +
> +	return rmpadjust((unsigned long)va, RMP_PG_SIZE_4K, attrs);
> +}
> +
> +#define __ATTR_BASE		(SVM_SELECTOR_P_MASK | SVM_SELECTOR_S_MASK)
> +#define INIT_CS_ATTRIBS		(__ATTR_BASE | SVM_SELECTOR_READ_MASK | SVM_SELECTOR_CODE_MASK)
> +#define INIT_DS_ATTRIBS		(__ATTR_BASE | SVM_SELECTOR_WRITE_MASK)
> +
> +#define INIT_LDTR_ATTRIBS	(SVM_SELECTOR_P_MASK | 2)
> +#define INIT_TR_ATTRIBS		(SVM_SELECTOR_P_MASK | 3)
> +
> +static void *snp_safe_alloc_page(void)

safe?

And you don't need to say "safe" - snp_alloc_vmsa_page() is perfectly fine.

> +{
> +	unsigned long pfn;
> +	struct page *p;
> +
> +	/*
> +	 * Allocate an SNP safe page to workaround the SNP erratum where
> +	 * the CPU will incorrectly signal an RMP violation  #PF if a
> +	 * hugepage (2mb or 1gb) collides with the RMP entry of VMSA page.

		2MB or 1GB

Collides how? The 4K frame is inside the hugepage?

> +	 * The recommeded workaround is to not use the large page.

Unknown word [recommeded] in comment, suggestions:
        ['recommended', 'recommend', 'recommitted', 'commended', 'commandeered']

> +	 *
> +	 * Allocate one extra page, use a page which is not 2mb aligned

2MB-aligned

> +	 * and free the other.
> +	 */
> +	p = alloc_pages(GFP_KERNEL_ACCOUNT | __GFP_ZERO, 1);
> +	if (!p)
> +		return NULL;
> +
> +	split_page(p, 1);
> +
> +	pfn = page_to_pfn(p);
> +	if (IS_ALIGNED(__pfn_to_phys(pfn), PMD_SIZE)) {
> +		pfn++;
> +		__free_page(p);
> +	} else {
> +		__free_page(pfn_to_page(pfn + 1));
> +	}

AFAICT, this is doing all this stuff so that you can make sure you get a
non-2M-aligned page. I wonder if there's a way to simply ask mm to give
you such page directly.

vbabka?

> +
> +	return page_address(pfn_to_page(pfn));
> +}
> +
> +static int wakeup_cpu_via_vmgexit(int apic_id, unsigned long start_ip)
> +{
> +	struct sev_es_save_area *cur_vmsa, *vmsa;
> +	struct ghcb_state state;
> +	unsigned long flags;
> +	struct ghcb *ghcb;
> +	int cpu, err, ret;
> +	u8 sipi_vector;
> +	u64 cr4;
> +
> +	if ((sev_hv_features & GHCB_HV_FT_SNP_AP_CREATION) != GHCB_HV_FT_SNP_AP_CREATION)
> +		return -EOPNOTSUPP;
> +
> +	/*
> +	 * Verify the desired start IP against the known trampoline start IP
> +	 * to catch any future new trampolines that may be introduced that
> +	 * would require a new protected guest entry point.
> +	 */
> +	if (WARN_ONCE(start_ip != real_mode_header->trampoline_start,
> +		      "Unsupported SEV-SNP start_ip: %lx\n", start_ip))
> +		return -EINVAL;
> +
> +	/* Override start_ip with known protected guest start IP */
> +	start_ip = real_mode_header->sev_es_trampoline_start;
> +
> +	/* Find the logical CPU for the APIC ID */
> +	for_each_present_cpu(cpu) {
> +		if (arch_match_cpu_phys_id(cpu, apic_id))
> +			break;
> +	}
> +	if (cpu >= nr_cpu_ids)
> +		return -EINVAL;
> +
> +	cur_vmsa = per_cpu(snp_vmsa, cpu);
> +
> +	/*
> +	 * A new VMSA is created each time because there is no guarantee that
> +	 * the current VMSA is the kernels or that the vCPU is not running. If

kernel's.

And if it is not the kernel's, whose is it?

> +	 * an attempt was done to use the current VMSA with a running vCPU, a
> +	 * #VMEXIT of that vCPU would wipe out all of the settings being done
> +	 * here.

I don't understand - this is waking up a CPU, how can it ever be a
running vCPU which is using the current VMSA?!

There is per_cpu(snp_vmsa, cpu), who else can be using that one currently?

> +	 */
> +	vmsa = (struct sev_es_save_area *)snp_safe_alloc_page();
> +	if (!vmsa)
> +		return -ENOMEM;
> +
> +	/* CR4 should maintain the MCE value */
> +	cr4 = native_read_cr4() & X86_CR4_MCE;
> +
> +	/* Set the CS value based on the start_ip converted to a SIPI vector */
> +	sipi_vector		= (start_ip >> 12);
> +	vmsa->cs.base		= sipi_vector << 12;
> +	vmsa->cs.limit		= 0xffff;
> +	vmsa->cs.attrib		= INIT_CS_ATTRIBS;
> +	vmsa->cs.selector	= sipi_vector << 8;
> +
> +	/* Set the RIP value based on start_ip */
> +	vmsa->rip		= start_ip & 0xfff;
> +
> +	/* Set VMSA entries to the INIT values as documented in the APM */
> +	vmsa->ds.limit		= 0xffff;
> +	vmsa->ds.attrib		= INIT_DS_ATTRIBS;
> +	vmsa->es		= vmsa->ds;
> +	vmsa->fs		= vmsa->ds;
> +	vmsa->gs		= vmsa->ds;
> +	vmsa->ss		= vmsa->ds;
> +
> +	vmsa->gdtr.limit	= 0xffff;
> +	vmsa->ldtr.limit	= 0xffff;
> +	vmsa->ldtr.attrib	= INIT_LDTR_ATTRIBS;
> +	vmsa->idtr.limit	= 0xffff;
> +	vmsa->tr.limit		= 0xffff;
> +	vmsa->tr.attrib		= INIT_TR_ATTRIBS;
> +
> +	vmsa->efer		= 0x1000;	/* Must set SVME bit */

verify_comment_style: Warning: No tail comments please:
 arch/x86/kernel/sev.c:954 [+   vmsa->efer              = 0x1000;       /* Must set SVME bit */]

> +	vmsa->cr4		= cr4;
> +	vmsa->cr0		= 0x60000010;
> +	vmsa->dr7		= 0x400;
> +	vmsa->dr6		= 0xffff0ff0;
> +	vmsa->rflags		= 0x2;
> +	vmsa->g_pat		= 0x0007040600070406ULL;
> +	vmsa->xcr0		= 0x1;
> +	vmsa->mxcsr		= 0x1f80;
> +	vmsa->x87_ftw		= 0x5555;
> +	vmsa->x87_fcw		= 0x0040;

Yah, those definitely need macros or at least comments on top denoting
what those naked values are.
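
Something along these lines, as a sketch only. The VMSA_INIT_* names are
made up for illustration and BIT_ULL() is redefined so this builds outside
the kernel; the bit meanings are the architectural ones for EFER, CR0, DR7,
RFLAGS and XCR0:

```c
#include <stdint.h>

#define BIT_ULL(n)		(1ULL << (n))

/* EFER.SVME (bit 12) must be set in the VMSA. */
#define VMSA_INIT_EFER		BIT_ULL(12)

/* CR0 after INIT: CD (bit 30), NW (bit 29) and ET (bit 4) are set. */
#define VMSA_INIT_CR0		(BIT_ULL(30) | BIT_ULL(29) | BIT_ULL(4))

/* DR7 after INIT: only the reserved, always-one bit 10 is set. */
#define VMSA_INIT_DR7		BIT_ULL(10)

/* RFLAGS after INIT: only the reserved, always-one bit 1 is set. */
#define VMSA_INIT_RFLAGS	BIT_ULL(1)

/* XCR0 after INIT: only x87 state (bit 0) is enabled. */
#define VMSA_INIT_XCR0		BIT_ULL(0)

/* Build-time check that the names match the literals used in the patch. */
_Static_assert(VMSA_INIT_EFER == 0x1000, "EFER init value");
_Static_assert(VMSA_INIT_CR0 == 0x60000010, "CR0 init value");
_Static_assert(VMSA_INIT_DR7 == 0x400, "DR7 init value");
_Static_assert(VMSA_INIT_RFLAGS == 0x2, "RFLAGS init value");
_Static_assert(VMSA_INIT_XCR0 == 0x1, "XCR0 init value");
```

The remaining values (DR6, PAT, MXCSR and the x87 words) are not simple
single-purpose bits, so a comment on top of each is probably the better fit
there.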

> +
> +	/*
> +	 * Set the SNP-specific fields for this VMSA:
> +	 *   VMPL level
> +	 *   SEV_FEATURES (matches the SEV STATUS MSR right shifted 2 bits)
> +	 */

Like this^^

> +	vmsa->vmpl		= 0;
> +	vmsa->sev_features	= sev_status >> 2;
> +
> +	/* Switch the page over to a VMSA page now that it is initialized */
> +	ret = snp_set_vmsa(vmsa, true);
> +	if (ret) {
> +		pr_err("set VMSA page failed (%u)\n", ret);
> +		free_page((unsigned long)vmsa);
> +
> +		return -EINVAL;
> +	}
> +
> +	/* Issue VMGEXIT AP Creation NAE event */
> +	local_irq_save(flags);
> +
> +	ghcb = __sev_get_ghcb(&state);
> +
> +	vc_ghcb_invalidate(ghcb);
> +	ghcb_set_rax(ghcb, vmsa->sev_features);
> +	ghcb_set_sw_exit_code(ghcb, SVM_VMGEXIT_AP_CREATION);
> +	ghcb_set_sw_exit_info_1(ghcb, ((u64)apic_id << 32) | SVM_VMGEXIT_AP_CREATE);
> +	ghcb_set_sw_exit_info_2(ghcb, __pa(vmsa));
> +
> +	sev_es_wr_ghcb_msr(__pa(ghcb));
> +	VMGEXIT();
> +
> +	if (!ghcb_sw_exit_info_1_is_valid(ghcb) ||
> +	    lower_32_bits(ghcb->save.sw_exit_info_1)) {
> +		pr_alert("SNP AP Creation error\n");

alert?

> +		ret = -EINVAL;
> +	}
> +
> +	__sev_put_ghcb(&state);
> +
> +	local_irq_restore(flags);
> +
> +	/* Perform cleanup if there was an error */
> +	if (ret) {
> +		err = snp_set_vmsa(vmsa, false);
> +		if (err)
> +			pr_err("clear VMSA page failed (%u), leaking page\n", err);
> +		else
> +			free_page((unsigned long)vmsa);

That...

> +
> +		vmsa = NULL;
> +	}
> +
> +	/* Free up any previous VMSA page */
> +	if (cur_vmsa) {
> +		err = snp_set_vmsa(cur_vmsa, false);
> +		if (err)
> +			pr_err("clear VMSA page failed (%u), leaking page\n", err);
> +		else
> +			free_page((unsigned long)cur_vmsa);

.. and that wants to be in a common helper.
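
A minimal userspace model of what such a helper could look like, assuming
the semantics of the two copies above: clear the VMSA bit first and free the
page only if that worked. snp_cleanup_vmsa_model() is a hypothetical name,
and mock_set_vmsa() and mock_free_page() merely stand in for snp_set_vmsa()
and free_page():

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical test knobs for this model. */
static int mock_rmpadjust_ret;
static int mock_pages_freed;

/* Stand-in for snp_set_vmsa(): returns 0 on success. */
static int mock_set_vmsa(void *va, bool vmsa)
{
	(void)va;
	(void)vmsa;
	return mock_rmpadjust_ret;
}

/* Stand-in for free_page(): only counts how often it was called. */
static void mock_free_page(void *va)
{
	(void)va;
	mock_pages_freed++;
}

/* Clear the VMSA bit, then free the page; leak it if the clear fails. */
static void snp_cleanup_vmsa_model(void *vmsa)
{
	int err;

	err = mock_set_vmsa(vmsa, false);
	if (err)
		fprintf(stderr, "clear VMSA page failed (%d), leaking page\n", err);
	else
		mock_free_page(vmsa);
}
```

Both the error path and the "free up any previous VMSA page" path would then
collapse into a single call each.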

> +	}
> +
> +	/* Record the current VMSA page */
> +	per_cpu(snp_vmsa, cpu) = vmsa;
> +
> +	return ret;
> +}

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

* Re: [PATCH v8 21/40] x86/head: re-enable stack protection for 32/64-bit builds
  2021-12-10 15:43 ` [PATCH v8 21/40] x86/head: re-enable stack protection for 32/64-bit builds Brijesh Singh
@ 2022-01-03 16:49   ` Borislav Petkov
  0 siblings, 0 replies; 183+ messages in thread
From: Borislav Petkov @ 2022-01-03 16:49 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	tony.luck, marcorr, sathyanarayanan.kuppuswamy

On Fri, Dec 10, 2021 at 09:43:13AM -0600, Brijesh Singh wrote:

> Subject: Re: [PATCH v8 21/40] x86/head: re-enable stack protection for 32/64-bit builds

The tip tree preferred format for patch subject prefixes is
'subsys/component:', e.g. 'x86/apic:', 'x86/mm/fault:', 'sched/fair:',
'genirq/core:'. Please do not use file names or complete file paths as
prefix. 'git log path/to/file' should give you a reasonable hint in most
cases.

The condensed patch description in the subject line should start with an
uppercase letter and should be written in imperative tone.

In this case:

x86/head/64: Re-enable stack protection

There's no need for 32/64-bit builds - we don't have anything else :-)

Please check all your subjects.

> From: Michael Roth <michael.roth@amd.com>
> 
> As of commit 103a4908ad4d ("x86/head/64: Disable stack protection for
> head$(BITS).o")

verify_commit_quotation: Warning: The proper commit quotation format is:
<newline>
[  ]<sha1, 12 chars> ("commit name")
<newline>

> kernel/head64.c is compiled with -fno-stack-protector
> to allow a call to set_bringup_idt_handler(), which would otherwise
> have stack protection enabled with CONFIG_STACKPROTECTOR_STRONG. While
> sufficient for that case, there may still be issues with calls to any
> external functions that were compiled with stack protection enabled that
> in-turn make stack-protected calls, or if the exception handlers set up
> by set_bringup_idt_handler() make calls to stack-protected functions.
> As part of 103a4908ad4d, stack protection was also disabled for
> kernel/head32.c as a precaution.
> 
> Subsequent patches for SEV-SNP CPUID validation support will introduce
> both such cases. Attempting to disable stack protection for everything
> in scope to address that is prohibitive since much of the code, like
> SEV-ES #VC handler, is shared code that remains in use after boot and
> could benefit from having stack protection enabled. Attempting to inline
> calls is brittle and can quickly balloon out to library/helper code
> where that's not really an option.
> 
> Instead, re-enable stack protection for head32.c/head64.c and make the
> appropriate changes to ensure the segment used for the stack canary is
> initialized in advance of any stack-protected C calls.
> 
> for head64.c:
> 
> - The BSP will enter from startup_64 and call into C code

Function names need to end with "()" so that it is clear they're
functions.

>   (startup_64_setup_env) shortly after setting up the stack, which may
>   result in calls to stack-protected code. Set up %gs early to allow
>   for this safely.
> - APs will enter from secondary_startup_64*, and %gs will be set up
>   soon after. There is one call to C code prior to this
>   (__startup_secondary_64), but it is only to fetch sme_me_mask, and
>   unlikely to be stack-protected, so leave things as they are, but add
>   a note about this in case things change in the future.
> 
> for head32.c:
> 
> - BSPs/APs will set %fs to __BOOT_DS prior to any C calls. In recent
>   kernels, the compiler is configured to access the stack canary at
>   %fs:__stack_chk_guard,

Add here somewhere:

"See

  3fb0fdb3bbe7 ("x86/stackprotector/32: Make the canary into a regular percpu variable")

for details."

> which overlaps with the initial per-cpu
>   __stack_chk_guard variable in the initial/'master' .data..percpu
>   area. This is sufficient to allow access to the canary for use
>   during initial startup, so no changes are needed there.
> 
> Suggested-by: Joerg Roedel <jroedel@suse.de> #for 64-bit %gs set up
> Signed-off-by: Michael Roth <michael.roth@amd.com>
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> ---
>  arch/x86/kernel/Makefile  |  1 -
>  arch/x86/kernel/head_64.S | 24 ++++++++++++++++++++++++
>  2 files changed, 24 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
> index 2ff3e600f426..4df8c8f7d2ac 100644
> --- a/arch/x86/kernel/Makefile
> +++ b/arch/x86/kernel/Makefile
> @@ -48,7 +48,6 @@ endif
>  # non-deterministic coverage.
>  KCOV_INSTRUMENT		:= n
>  
> -CFLAGS_head$(BITS).o	+= -fno-stack-protector
>  CFLAGS_cc_platform.o	+= -fno-stack-protector
>  
>  CFLAGS_irq.o := -I $(srctree)/$(src)/../include/asm/trace
> diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
> index 99de8fd461e8..9f8a7e48aca7 100644
> --- a/arch/x86/kernel/head_64.S
> +++ b/arch/x86/kernel/head_64.S
> @@ -65,6 +65,22 @@ SYM_CODE_START_NOALIGN(startup_64)
>  	leaq	(__end_init_task - FRAME_SIZE)(%rip), %rsp
>  
>  	leaq	_text(%rip), %rdi
> +
> +	/*
> +	 * initial_gs points to initial fixed_per_cpu struct with storage for

$ git grep fixed_per_cpu
$

??

Do you mean this:

SYM_DATA(initial_gs,    .quad INIT_PER_CPU_VAR(fixed_percpu_data))

?

> +	 * the stack protector canary. Global pointer fixups are needed at this
> +	 * stage, so apply them as is done in fixup_pointer(), and initialize %gs
> +	 * such that the canary can be accessed at %gs:40 for subsequent C calls.
> +	 */
> +	movl	$MSR_GS_BASE, %ecx
> +	movq	initial_gs(%rip), %rax
> +	movq	$_text, %rdx
> +	subq	%rdx, %rax
> +	addq	%rdi, %rax
> +	movq	%rax, %rdx
> +	shrq	$32,  %rdx
> +	wrmsr
> +
>  	pushq	%rsi
>  	call	startup_64_setup_env
>  	popq	%rsi
> @@ -146,6 +162,14 @@ SYM_INNER_LABEL(secondary_startup_64_no_verify, SYM_L_GLOBAL)
>  	 * added to the initial pgdir entry that will be programmed into CR3.
>  	 */
>  	pushq	%rsi

<---- newline here.

> +	/*
> +	 * NOTE: %gs at this point is a stale data segment left over from the
> +	 * real-mode trampoline, so the default stack protector canary location
> +	 * at %gs:40 does not yet coincide with the expected fixed_per_cpu struct
> +	 * that contains storage for the stack canary. So take care not to add
> +	 * anything to the C functions in this path that would result in stack
> +	 * protected C code being generated.
> +	 */
>  	call	__startup_secondary_64
>  	popq	%rsi

Can't you simply do

	movq    sme_me_mask, %rax

here instead and avoid the issue altogether?

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

* Re: [PATCH v8 20/40] x86/sev: Use SEV-SNP AP creation to start secondary CPUs
  2021-12-31 15:36   ` Borislav Petkov
@ 2022-01-03 18:10     ` Vlastimil Babka
  2022-01-12 16:33     ` Brijesh Singh
  1 sibling, 0 replies; 183+ messages in thread
From: Vlastimil Babka @ 2022-01-03 18:10 UTC (permalink / raw)
  To: Borislav Petkov, Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Michael Roth, Kirill A . Shutemov,
	Andi Kleen, Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On 12/31/21 16:36, Borislav Petkov wrote:
> On Fri, Dec 10, 2021 at 09:43:12AM -0600, Brijesh Singh wrote:
>> diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h
>> index 123a96f7dff2..38c14601ae4a 100644
> 
>> +{
>> +	unsigned long pfn;
>> +	struct page *p;
>> +
>> +	/*
>> +	 * Allocate an SNP safe page to workaround the SNP erratum where
>> +	 * the CPU will incorrectly signal an RMP violation  #PF if a
>> +	 * hugepage (2mb or 1gb) collides with the RMP entry of VMSA page.
> 
> 		2MB or 1GB
> 
> Collides how? The 4K frame is inside the hugepage?
> 
>> +	 * The recommeded workaround is to not use the large page.
> 
> Unknown word [recommeded] in comment, suggestions:
>         ['recommended', 'recommend', 'recommitted', 'commended', 'commandeered']
> 
>> +	 *
>> +	 * Allocate one extra page, use a page which is not 2mb aligned
> 
> 2MB-aligned
> 
>> +	 * and free the other.
>> +	 */
>> +	p = alloc_pages(GFP_KERNEL_ACCOUNT | __GFP_ZERO, 1);
>> +	if (!p)
>> +		return NULL;
>> +
>> +	split_page(p, 1);
>> +
>> +	pfn = page_to_pfn(p);
>> +	if (IS_ALIGNED(__pfn_to_phys(pfn), PMD_SIZE)) {
>> +		pfn++;
>> +		__free_page(p);
>> +	} else {
>> +		__free_page(pfn_to_page(pfn + 1));
>> +	}
> 
> AFAICT, this is doing all this stuff so that you can make sure you get a
> non-2M-aligned page. I wonder if there's a way to simply ask mm to give
> you such page directly.
> 
> vbabka?

AFAIK, no, as this is a very unusual constraint. Unless there are more
places that need it, it should be fine to solve it like above. Maybe also
be optimistic and try a plain order-0 allocation first and, only if it has
the undesired alignment (which should happen only once per 512 allocations),
fall back to the above?
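
The arithmetic behind "once per 512" and the choice within an order-1 pair
can be sketched in userspace as follows. This assumes 4K pages and 2M PMDs;
pfn_is_2m_aligned() and pick_from_pair() are hypothetical helper names used
only for this illustration:

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT	12
#define PMD_SHIFT	21
/* Number of 4K page frames covered by one 2M region: 2^(21 - 12). */
#define PFNS_PER_PMD	(1ULL << (PMD_SHIFT - PAGE_SHIFT))

/* A frame is 2M-aligned when its PFN is a multiple of PFNS_PER_PMD. */
static bool pfn_is_2m_aligned(uint64_t pfn)
{
	return (pfn & (PFNS_PER_PMD - 1)) == 0;
}

/*
 * Given the first PFN of an order-1 (two page) allocation, return the PFN
 * to keep. The second page of the pair has an odd PFN, so it can never be
 * 2M-aligned.
 */
static uint64_t pick_from_pair(uint64_t first_pfn)
{
	return pfn_is_2m_aligned(first_pfn) ? first_pfn + 1 : first_pfn;
}
```

With the optimistic variant, the order-1 path would only be taken when the
plain order-0 page fails the pfn_is_2m_aligned() test.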

* Re: [PATCH v8 09/40] x86/compressed: Add helper for validating pages in the decompression stage
  2021-12-17 23:24     ` Brijesh Singh
@ 2022-01-03 18:43       ` Venu Busireddy
  0 siblings, 0 replies; 183+ messages in thread
From: Venu Busireddy @ 2022-01-03 18:43 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On 2021-12-17 17:24:43 -0600, Brijesh Singh wrote:
> 
> On 12/17/21 2:47 PM, Venu Busireddy wrote:
> 
> >>  	 * the caches.
> >>  	 */
> >> -	if ((set | clr) & _PAGE_ENC)
> >> +	if ((set | clr) & _PAGE_ENC) {
> >>  		clflush_page(address);
> >>  
> >> +		/*
> >> +		 * If the encryption attribute is being cleared, then change
> >> +		 * the page state to shared in the RMP table.
> >> +		 */
> >> +		if (clr)
> > This function is also called by set_page_non_present() with clr set to
> > _PAGE_PRESENT. Do we want to change the page state to shared even when
> > the page is not present? If not, shouldn't the check be (clr & _PAGE_ENC)?
> 
> I am not able to follow your comment. Here we only pay attention to the
> encryption attribute, if encryption attribute is getting cleared then
> make PSC. In the case ov set_page_non_present(), the outer if() block
> will return false.  Am I missing something ?

You are right. I missed the outer check.
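
For the record, a small model of the two nested checks as quoted above. The
MODEL_* bit positions are chosen arbitrarily for this sketch and
psc_to_shared_needed() is a hypothetical name:

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions chosen arbitrarily for this model. */
#define MODEL_PAGE_PRESENT	(1ULL << 0)
#define MODEL_PAGE_ENC		(1ULL << 51)

/* Mirrors the quoted hunk: outer check on the ENC bit, inner check on clr. */
static bool psc_to_shared_needed(uint64_t set, uint64_t clr)
{
	if ((set | clr) & MODEL_PAGE_ENC) {
		if (clr)
			return true;
	}

	return false;
}
```

With set == 0 and clr == MODEL_PAGE_PRESENT, which models the
set_page_non_present() case, the outer test is false and no page state
change is requested.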

> > The page is not "added", right? Shouldn't we just say:
> 
> Technically, PSC modifies the RMP entry, so I should use that  instead
> of calling "added".
> 
> 
> >     Validate the page so that it is consistent with the RMP entry.
> 
> Yes, I am okay with it.

Thanks,

Venu

* Re: [PATCH v8 01/40] x86/compressed/64: detect/setup SEV/SME features earlier in boot
  2021-12-15 21:22                         ` Michael Roth
@ 2022-01-03 19:10                           ` Venu Busireddy
  2022-01-05 19:34                             ` Brijesh Singh
  0 siblings, 1 reply; 183+ messages in thread
From: Venu Busireddy @ 2022-01-03 19:10 UTC (permalink / raw)
  To: Michael Roth
  Cc: Borislav Petkov, Tom Lendacky, Brijesh Singh, x86, linux-kernel,
	kvm, linux-efi, platform-driver-x86, linux-coco, linux-mm,
	Thomas Gleixner, Ingo Molnar, Joerg Roedel, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On 2021-12-15 15:22:57 -0600, Michael Roth wrote:
> On Wed, Dec 15, 2021 at 09:38:55PM +0100, Borislav Petkov wrote:
> > 
> > But it is hard to discuss anything without patches so we can continue
> > the topic with concrete patches. But this unification is not
> > super-pressing so it can go ontop of the SNP pile.
> 
> Yah, it's all theoretical at this point. Didn't mean to derail things
> though. I mainly brought it up to suggest that Venu's original approach of
> returning the encryption bit via a pointer argument might make it easier to
> expand it for other purposes in the future, and that naming it for that
> future purpose might encourage future developers to focus their efforts
> there instead of potentially re-introducing duplicate code.
> 
> But either way it's simple enough to rework things when we actually
> cross that bridge. So totally fine with saving all of this as a future
> follow-up, or picking up either of Venu's patches for now if you'd still
> prefer.

So, what is the consensus? Do you want me to submit a patch after the
SNP changes go upstream? Or, do you want to roll in one of the patches
that I posted earlier?

Venu

> 

* Re: [PATCH v8 10/40] x86/compressed: Register GHCB memory when SEV-SNP is active
  2021-12-10 15:43 ` [PATCH v8 10/40] x86/compressed: Register GHCB memory when SEV-SNP is active Brijesh Singh
@ 2022-01-03 19:54   ` Venu Busireddy
  0 siblings, 0 replies; 183+ messages in thread
From: Venu Busireddy @ 2022-01-03 19:54 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On 2021-12-10 09:43:02 -0600, Brijesh Singh wrote:
> The SEV-SNP guest is required by the GHCB spec to register the GHCB's
> Guest Physical Address (GPA). This is because the hypervisor may prefer
> that a guest use a consistent and/or specific GPA for the GHCB associated
> with a vCPU. For more information, see the GHCB specification section
> "GHCB GPA Registration".
> 
> If the hypervisor cannot work with the guest-provided GPA, then terminate
> the guest boot.
> 
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>

Reviewed-by: Venu Busireddy <venu.busireddy@oracle.com>

> ---
>  arch/x86/boot/compressed/sev.c    |  4 ++++
>  arch/x86/include/asm/sev-common.h | 13 +++++++++++++
>  arch/x86/kernel/sev-shared.c      | 16 ++++++++++++++++
>  3 files changed, 33 insertions(+)
> 


* Re: [PATCH v8 11/40] x86/sev: Register GHCB memory when SEV-SNP is active
  2021-12-10 15:43 ` [PATCH v8 11/40] x86/sev: " Brijesh Singh
  2021-12-22 13:16   ` Borislav Petkov
@ 2022-01-03 22:47   ` Venu Busireddy
  1 sibling, 0 replies; 183+ messages in thread
From: Venu Busireddy @ 2022-01-03 22:47 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On 2021-12-10 09:43:03 -0600, Brijesh Singh wrote:
> The SEV-SNP guest is required by the GHCB spec to register the GHCB's
> Guest Physical Address (GPA). This is because the hypervisor may prefer
> that a guest use a consistent and/or specific GPA for the GHCB associated
> with a vCPU. For more information, see the GHCB specification section
> "GHCB GPA Registration".
> 
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>

Reviewed-by: Venu Busireddy <venu.busireddy@oracle.com>

> ---
>  arch/x86/include/asm/sev.h   |   2 +
>  arch/x86/kernel/cpu/common.c |   4 ++
>  arch/x86/kernel/head64.c     |   1 +
>  arch/x86/kernel/sev-shared.c |   2 +-
>  arch/x86/kernel/sev.c        | 120 ++++++++++++++++++++---------------
>  5 files changed, 77 insertions(+), 52 deletions(-)
> 


* Re: [PATCH v8 12/40] x86/sev: Add helper for validating pages in early enc attribute changes
  2021-12-10 15:43 ` [PATCH v8 12/40] x86/sev: Add helper for validating pages in early enc attribute changes Brijesh Singh
  2021-12-23 11:50   ` Borislav Petkov
@ 2022-01-03 23:28   ` Venu Busireddy
  2022-01-11 21:22     ` Brijesh Singh
  1 sibling, 1 reply; 183+ messages in thread
From: Venu Busireddy @ 2022-01-03 23:28 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On 2021-12-10 09:43:04 -0600, Brijesh Singh wrote:
> The early_set_memory_{encrypt,decrypt}() are used for changing the

s/encrypt,decrypt/encrypted,decrypted/

> page from decrypted (shared) to encrypted (private) and vice versa.
> When SEV-SNP is active, the page state transition needs to go through
> additional steps.
> 
> If the page is transitioned from shared to private, then perform the
> following after the encryption attribute is set in the page table:
> 
> 1. Issue the page state change VMGEXIT to add the page as private
>    in the RMP table.
> 2. Validate the page after it's successfully added in the RMP table.
> 
> To maintain the security guarantees, if the page is transitioned from
> private to shared, then perform the following before clearing the
> encryption attribute from the page table.
> 
> 1. Invalidate the page.
> 2. Issue the page state change VMGEXIT to make the page shared in the
>    RMP table.
> 
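A minimal sketch of the ordering described above, in case it helps other
reviewers. Everything here is a hypothetical stand-in: the mock_* names
and the trace structure are not part of the patch, and the functions only
record the order of the steps rather than touching the RMP or page tables.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical step labels; these only name the three operations. */
enum mock_step {
	MOCK_STEP_PTE,		/* set or clear the encryption attribute */
	MOCK_STEP_PSC,		/* page state change request to the hypervisor */
	MOCK_STEP_PVALIDATE	/* validate or invalidate the page */
};

struct mock_trace {
	enum mock_step steps[3];
	size_t len;
};

static void mock_record(struct mock_trace *t, enum mock_step s)
{
	assert(t->len < 3);
	t->steps[t->len++] = s;
}

/* Shared -> private: page table first, then the RMP update, then validate. */
static void mock_make_private(struct mock_trace *t)
{
	mock_record(t, MOCK_STEP_PTE);
	mock_record(t, MOCK_STEP_PSC);
	mock_record(t, MOCK_STEP_PVALIDATE);
}

/* Private -> shared: invalidate, then the RMP update, then the page table. */
static void mock_make_shared(struct mock_trace *t)
{
	mock_record(t, MOCK_STEP_PVALIDATE);
	mock_record(t, MOCK_STEP_PSC);
	mock_record(t, MOCK_STEP_PTE);
}
```

The two paths are mirror images of each other, which is the property the
commit message relies on for the security guarantees.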
> The early_set_memory_{encrypt,decrypt} can be called before the GHCB

s/encrypt,decrypt/encrypted,decrypted/

> is set up, so use the SNP page state MSR protocol VMGEXIT defined in the GHCB
> specification to request the page state change in the RMP table.
> 
> While at it, add a helper snp_prep_memory() that can be used outside
> the sev specific files to change the page state for a specified memory
> range.
> 
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>

And with a few other nits below:

Reviewed-by: Venu Busireddy <venu.busireddy@oracle.com>

> diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
> +
> +	 /*
> +	  * Ask the hypervisor to mark the memory pages as private in the RMP
> +	  * table.
> +	  */

Indentation is off. While at it, you may want to collapse it into a
one-line comment.

> +	early_set_page_state(paddr, npages, SNP_PAGE_STATE_PRIVATE);
> +
> +	/* Validate the memory pages after they've been added in the RMP table. */
> +	pvalidate_pages(vaddr, npages, 1);
> +}
> +
> +void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr,
> +					unsigned int npages)
> +{
> +	if (!cc_platform_has(CC_ATTR_SEV_SNP))
> +		return;
> +
> +	/*
> +	 * Invalidate the memory pages before they are marked shared in the
> +	 * RMP table.
> +	 */

Collapse into one line?

> +	pvalidate_pages(vaddr, npages, 0);
> +
> +	 /* Ask hypervisor to mark the memory pages shared in the RMP table. */

Indentation is off.

> +		/*
> +		 * ON SNP, the page state in the RMP table must happen
> +		 * before the page table updates.
> +		 */
> +		early_snp_set_memory_shared((unsigned long)__va(pa), pa, 1);

I know "1" implies "true", but to emphasize that the argument is
actually a boolean, could you please change the "1" to "true?"

> +	}
> +
>  	/* Change the page encryption mask. */
>  	new_pte = pfn_pte(pfn, new_prot);
>  	set_pte_atomic(kpte, new_pte);
> +
> +	/*
> +	 * If page is set encrypted in the page table, then update the RMP table to
> +	 * add this page as private.
> +	 */
> +	if (enc)
> +		early_snp_set_memory_private((unsigned long)__va(pa), pa, 1);

Here too, could you please change the "1" to "true?"

Venu



* Re: [PATCH v8 08/40] x86/sev: Check the vmpl level
  2021-12-20 18:10           ` Borislav Petkov
@ 2022-01-04 15:23             ` Brijesh Singh
  0 siblings, 0 replies; 183+ messages in thread
From: Brijesh Singh @ 2022-01-04 15:23 UTC (permalink / raw)
  To: Borislav Petkov, Tom Lendacky
  Cc: brijesh.singh, Mikolaj Lisik, Venu Busireddy, x86, linux-kernel,
	kvm, linux-efi, platform-driver-x86, linux-coco, linux-mm,
	Thomas Gleixner, Ingo Molnar, Joerg Roedel, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy



On 12/20/21 12:10 PM, Borislav Petkov wrote:
> On Fri, Dec 17, 2021 at 04:33:02PM -0600, Tom Lendacky wrote:
>>>>>> +      * There is no straightforward way to query the current VMPL level. The
>>>>>> +      * simplest method is to use the RMPADJUST instruction to change a page
>>>>>> +      * permission to a VMPL level-1, and if the guest kernel is launched at
>>>>>> +      * a level <= 1, then RMPADJUST instruction will return an error.
>>>>> Perhaps a nit. When you say "level <= 1", do you mean a level lower than or
>>>>> equal to 1 semantically, or numerically?
>>>
>>> It's numerically; please see the AMD APM vol 3.
>>
>> Actually it is not numerically...  if it was numerically, then 0 <= 1 would
>> return an error, but VMPL0 is the highest permission level.
> 
> Just write in that comment exactly what this function does:
> 
> "RMPADJUST modifies RMP permissions of a lesser-privileged (numerically
> higher) privilege level. Here, clear the VMPL1 permission mask of the
> GHCB page. If the guest is not running at VMPL0, this will fail.
> 
> If the guest is running at VMPL0, it will succeed. Even if that operation
> modifies permission bits, it is still ok to do currently because Linux
> SNP guests are supported only on VMPL0 so VMPL1 or higher permission
> masks changing is a don't-care."
> 
> and then everything is clear wrt numbering, privilege, etc.
> 
> Ok?
> 

Noted.

thanks
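
For anyone skimming the thread, a small sketch of the numbering rule as
stated in the suggested comment. The mock_* helpers are hypothetical and
only model the rule "RMPADJUST can modify permissions of a numerically
higher level"; they do not execute the instruction.

```c
#include <assert.h>
#include <stdbool.h>

/* VMPL0 is the most privileged level; larger numbers mean less privilege. */
static bool mock_vmpl_more_privileged(unsigned int a, unsigned int b)
{
	return a < b;
}

/*
 * Model of the rule quoted above: the adjustment is allowed only when the
 * target level is less privileged (numerically higher) than the caller.
 */
static bool mock_rmpadjust_allowed(unsigned int current_vmpl,
				   unsigned int target_vmpl)
{
	return mock_vmpl_more_privileged(current_vmpl, target_vmpl);
}

/* The probe: touching VMPL1 permissions succeeds only from VMPL0. */
static bool mock_running_at_vmpl0(unsigned int current_vmpl)
{
	return mock_rmpadjust_allowed(current_vmpl, 1);
}
```

With this model a guest at VMPL1, VMPL2 or VMPL3 fails the probe, which
matches "If the guest is not running at VMPL0, this will fail."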


* Re: [PATCH v8 12/40] x86/sev: Add helper for validating pages in early enc attribute changes
  2021-12-23 11:50   ` Borislav Petkov
@ 2022-01-04 15:33     ` Brijesh Singh
  0 siblings, 0 replies; 183+ messages in thread
From: Brijesh Singh @ 2022-01-04 15:33 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: brijesh.singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy



On 12/23/21 5:50 AM, Borislav Petkov wrote:
>> While at it, add a helper snp_prep_memory() that can be used outside
>> the sev specific files to change the page state for a specified memory
> 
> "outside of the sev specific"? What is that trying to say?
> 
> /me goes and looks at the whole patchset...
> 
> Right, so that is used only in probe_roms(). So that should say:
> 
> "Add a helper ... which will be used in probe_roms(), in a later patch."
> 

Currently the helper is used only by probe_roms(), but it can be
used by others in the future. I will go ahead and spell out that it
is for probe_roms().


> 
> Yeah, looking at this again, I don't really like this multiplexing.
> Let's do this instead, diff ontop:
> 

thanks, I will apply your diff.


* Re: [PATCH v8 13/40] x86/kernel: Make the bss.decrypted section shared in RMP table
  2021-12-10 15:43 ` [PATCH v8 13/40] x86/kernel: Make the bss.decrypted section shared in RMP table Brijesh Singh
  2021-12-28 11:53   ` Borislav Petkov
@ 2022-01-04 17:56   ` Venu Busireddy
  2022-01-05 19:52     ` Brijesh Singh
  1 sibling, 1 reply; 183+ messages in thread
From: Venu Busireddy @ 2022-01-04 17:56 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On 2021-12-10 09:43:05 -0600, Brijesh Singh wrote:
> The encryption attribute for the bss.decrypted region is cleared in the
> initial page table build. This is because the section contains data
> that needs to be shared between the guest and the hypervisor.
> 
> When SEV-SNP is active, just clearing the encryption attribute in the
> page table is not enough. The page state also needs to be updated in the
> RMP table.
> 
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> ---
>  arch/x86/kernel/head64.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
> index fa02402dcb9b..72c5082a3ba4 100644
> --- a/arch/x86/kernel/head64.c
> +++ b/arch/x86/kernel/head64.c
> @@ -143,7 +143,14 @@ static unsigned long sme_postprocess_startup(struct boot_params *bp, pmdval_t *p
>  	if (sme_get_me_mask()) {
>  		vaddr = (unsigned long)__start_bss_decrypted;
>  		vaddr_end = (unsigned long)__end_bss_decrypted;
> +
>  		for (; vaddr < vaddr_end; vaddr += PMD_SIZE) {
> +			/*
> +			 * When SEV-SNP is active then transition the page to shared in the RMP
> +			 * table so that it is consistent with the page table attribute change.
> +			 */
> +			early_snp_set_memory_shared(__pa(vaddr), __pa(vaddr), PTRS_PER_PMD);

Shouldn't the first argument be vaddr as below?

   			early_snp_set_memory_shared(vaddr, __pa(vaddr), PTRS_PER_PMD);
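
As a side note on the third argument, here is a rough sketch of the page
arithmetic in that loop. It assumes x86-64 with 4K base pages and 2M PMD
mappings; the MOCK_* macros and mock_pages_covered() are hypothetical
stand-ins, not kernel definitions.

```c
#include <assert.h>

/* Assumed x86-64 values: 4K pages, 2M PMD-level mappings. */
#define MOCK_PAGE_SHIFT		12UL
#define MOCK_PMD_SHIFT		21UL
#define MOCK_PAGE_SIZE		(1UL << MOCK_PAGE_SHIFT)
#define MOCK_PMD_SIZE		(1UL << MOCK_PMD_SHIFT)
#define MOCK_PTRS_PER_PMD	(MOCK_PMD_SIZE / MOCK_PAGE_SIZE)

/*
 * Count the 4K pages that would be passed along when walking [start, end)
 * in PMD-sized steps and handing MOCK_PTRS_PER_PMD pages to each call,
 * mirroring the shape of the loop in the hunk above.
 */
static unsigned long mock_pages_covered(unsigned long start, unsigned long end)
{
	unsigned long vaddr, pages = 0;

	for (vaddr = start; vaddr < end; vaddr += MOCK_PMD_SIZE)
		pages += MOCK_PTRS_PER_PMD;

	return pages;
}
```

Each iteration accounts for 512 pages, so a range that is not a multiple
of the PMD size is rounded up to the next full PMD in this sketch.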

Venu



* Re: [PATCH v8 22/40] x86/sev: move MSR-based VMGEXITs for CPUID to helper
  2021-12-30 18:52   ` Sean Christopherson
@ 2022-01-04 20:57     ` Borislav Petkov
  2022-01-04 23:36     ` Michael Roth
  1 sibling, 0 replies; 183+ messages in thread
From: Borislav Petkov @ 2022-01-04 20:57 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Brijesh Singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	tony.luck, marcorr, sathyanarayanan.kuppuswamy

On Thu, Dec 30, 2021 at 06:52:52PM +0000, Sean Christopherson wrote:
> Having @subfunc, a.k.a. index, in is weird/confusing/fragile because it's not consumed,
> nor is it checked.  Peeking ahead, it looks like all future users pass '0'.  Taking the
> index but dropping it on the floor is asking for future breakage.  Either drop it or
> assert that it's zero.

Yah, just drop it please. 

It can always be added later if needed.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


* Re: [PATCH v8 15/40] x86/mm: Add support to validate memory when changing C-bit
  2021-12-10 15:43 ` [PATCH v8 15/40] x86/mm: Add support to validate memory when changing C-bit Brijesh Singh
  2021-12-29 11:09   ` Borislav Petkov
@ 2022-01-04 22:31   ` Venu Busireddy
  1 sibling, 0 replies; 183+ messages in thread
From: Venu Busireddy @ 2022-01-04 22:31 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On 2021-12-10 09:43:07 -0600, Brijesh Singh wrote:
> The set_memory_{encrypt,decrypt}() are used for changing the pages

s/set_memory_{encrypt,decrypt}/snp_set_memory_{shared,private}/

> from decrypted (shared) to encrypted (private) and vice versa.
> When SEV-SNP is active, the page state transition needs to go through
> additional steps.
> 
> If the page is transitioned from shared to private, then perform the
> following after the encryption attribute is set in the page table:
> 
> 1. Issue the page state change VMGEXIT to add the memory region in
>    the RMP table.
> 2. Validate the memory region after the RMP entry is added.
> 
> To maintain the security guarantees, if the page is transitioned from
> private to shared, then perform the following before the encryption
> attribute is removed from the page table:
> 
> 1. Invalidate the page.
> 2. Issue the page state change VMGEXIT to remove the page from the RMP
>    table.
> 
> To change the page state in the RMP table, use the Page State Change
> VMGEXIT defined in the GHCB specification.
> 
> The GHCB specification provides the flexibility to use either 4K or 2MB
> page size during the page state change (PSC) request. For now, use the
> 4K page size for all PSC requests until page size tracking is supported in
> the kernel.
> 
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>

[snip]

> diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
> index 2971aa280ce6..35c772bf9f6c 100644
> --- a/arch/x86/kernel/sev.c
> +++ b/arch/x86/kernel/sev.c
> @@ -574,7 +574,7 @@ static void pvalidate_pages(unsigned long vaddr, unsigned int npages, bool valid
>  	}
>  }
>  
> -static void __init early_set_page_state(unsigned long paddr, unsigned int npages, enum psc_op op)
> +static void __init early_set_pages_state(unsigned long paddr, unsigned int npages, enum psc_op op)

Is there a need to change the name? "npages" can take a value of 1 too.
Hence, early_set_page_state() appears to be a better name!

> +		/*
> +		 * Page State Change VMGEXIT can pass error code through
> +		 * exit_info_2.
> +		 */

Collapse into one line?

> +void snp_set_memory_shared(unsigned long vaddr, unsigned int npages)
> +{
> +	if (!cc_platform_has(CC_ATTR_SEV_SNP))
> +		return;
> +
> +	pvalidate_pages(vaddr, npages, 0);

Replace '0' with "false"?

> +
> +	set_pages_state(vaddr, npages, SNP_PAGE_STATE_SHARED);
> +}
> +
> +void snp_set_memory_private(unsigned long vaddr, unsigned int npages)
> +{
> +	if (!cc_platform_has(CC_ATTR_SEV_SNP))
> +		return;
> +
> +	set_pages_state(vaddr, npages, SNP_PAGE_STATE_PRIVATE);
> +
> +	pvalidate_pages(vaddr, npages, 1);

Replace '1' with "true"?

Venu



* Re: [PATCH v8 16/40] KVM: SVM: Define sev_features and vmpl field in the VMSA
  2021-12-10 15:43 ` [PATCH v8 16/40] KVM: SVM: Define sev_features and vmpl field in the VMSA Brijesh Singh
@ 2022-01-04 22:59   ` Venu Busireddy
  0 siblings, 0 replies; 183+ messages in thread
From: Venu Busireddy @ 2022-01-04 22:59 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On 2021-12-10 09:43:08 -0600, Brijesh Singh wrote:
> The hypervisor uses the sev_features field (offset 3B0h) in the Save State
> Area to control the SEV-SNP guest features such as SNPActive, vTOM,
> ReflectVC etc. An SEV-SNP guest can read the SEV_FEATURES fields through
> the SEV_STATUS MSR.
> 
> While at it, update the dump_vmcb() to log the VMPL level.
> 
> See APM2 Table 15-34 and B-4 for more details.
> 
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>

Reviewed-by: Venu Busireddy <venu.busireddy@oracle.com>

> ---
>  arch/x86/include/asm/svm.h | 6 ++++--
>  arch/x86/kvm/svm/svm.c     | 4 ++--
>  2 files changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
> index d3277486a6c0..c3fad5172584 100644
> --- a/arch/x86/include/asm/svm.h
> +++ b/arch/x86/include/asm/svm.h
> @@ -238,7 +238,8 @@ struct vmcb_save_area {
>  	struct vmcb_seg ldtr;
>  	struct vmcb_seg idtr;
>  	struct vmcb_seg tr;
> -	u8 reserved_1[43];
> +	u8 reserved_1[42];
> +	u8 vmpl;
>  	u8 cpl;
>  	u8 reserved_2[4];
>  	u64 efer;
> @@ -303,7 +304,8 @@ struct vmcb_save_area {
>  	u64 sw_exit_info_1;
>  	u64 sw_exit_info_2;
>  	u64 sw_scratch;
> -	u8 reserved_11[56];
> +	u64 sev_features;
> +	u8 reserved_11[48];
>  	u64 xcr0;
>  	u8 valid_bitmap[16];
>  	u64 x87_state_gpa;
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 65707bee208d..d3a6356fa1af 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -3290,8 +3290,8 @@ static void dump_vmcb(struct kvm_vcpu *vcpu)
>  	       "tr:",
>  	       save01->tr.selector, save01->tr.attrib,
>  	       save01->tr.limit, save01->tr.base);
> -	pr_err("cpl:            %d                efer:         %016llx\n",
> -		save->cpl, save->efer);
> +	pr_err("vmpl: %d   cpl:  %d               efer:          %016llx\n",
                                                       ^
Extra space?

> +	       save->vmpl, save->cpl, save->efer);
>  	pr_err("%-15s %016llx %-13s %016llx\n",
>  	       "cr0:", save->cr0, "cr2:", save->cr2);
>  	pr_err("%-15s %016llx %-13s %016llx\n",
> -- 
> 2.25.1
> 
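
A quick way to sanity-check the two reserved-array changes in the svm.h
hunks is to confirm that the neighbouring fields keep their offsets. The
sketch below uses cut-down, hypothetical mock structs (only the fields
around each change, with stdint types instead of u8/u64), so it checks
the arithmetic and not the real vmcb_save_area layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* First hunk: reserved_1[43] becomes reserved_1[42] plus a one-byte vmpl. */
struct mock_cpl_before {
	uint8_t reserved_1[43];
	uint8_t cpl;
};

struct mock_cpl_after {
	uint8_t reserved_1[42];
	uint8_t vmpl;
	uint8_t cpl;
};

/* Second hunk: reserved_11[56] becomes a u64 sev_features plus 48 bytes. */
struct mock_feat_before {
	uint64_t sw_scratch;
	uint8_t reserved_11[56];
	uint64_t xcr0;
};

struct mock_feat_after {
	uint64_t sw_scratch;
	uint64_t sev_features;
	uint8_t reserved_11[48];
	uint64_t xcr0;
};

/* The fields that follow each change must not move. */
_Static_assert(offsetof(struct mock_cpl_before, cpl) ==
	       offsetof(struct mock_cpl_after, cpl), "cpl moved");
_Static_assert(offsetof(struct mock_feat_before, xcr0) ==
	       offsetof(struct mock_feat_after, xcr0), "xcr0 moved");
```

Both assertions hold because 42 + 1 == 43 and 8 + 48 == 56, so the new
fields are carved out of the reserved space without shifting anything.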


* Re: [PATCH v8 22/40] x86/sev: move MSR-based VMGEXITs for CPUID to helper
  2021-12-30 18:52   ` Sean Christopherson
  2022-01-04 20:57     ` Borislav Petkov
@ 2022-01-04 23:36     ` Michael Roth
  1 sibling, 0 replies; 183+ messages in thread
From: Michael Roth @ 2022-01-04 23:36 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Brijesh Singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	tony.luck, marcorr, sathyanarayanan.kuppuswamy

On Thu, Dec 30, 2021 at 06:52:52PM +0000, Sean Christopherson wrote:
> On Fri, Dec 10, 2021, Brijesh Singh wrote:
> > From: Michael Roth <michael.roth@amd.com>
> > 
> > This code will also be used later for SEV-SNP-validated CPUID code in
> > some cases, so move it to a common helper.
> > 
> > Signed-off-by: Michael Roth <michael.roth@amd.com>
> > Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> > ---
> >  arch/x86/kernel/sev-shared.c | 84 +++++++++++++++++++++++++-----------
> >  1 file changed, 58 insertions(+), 26 deletions(-)
> > 
> > diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c
> > index 3aaef1a18ffe..d89481b31022 100644
> > --- a/arch/x86/kernel/sev-shared.c
> > +++ b/arch/x86/kernel/sev-shared.c
> > @@ -194,6 +194,58 @@ enum es_result sev_es_ghcb_hv_call(struct ghcb *ghcb, bool set_ghcb_msr,
> >  	return verify_exception_info(ghcb, ctxt);
> >  }
> >  
> > +static int sev_cpuid_hv(u32 func, u32 subfunc, u32 *eax, u32 *ebx,
> 
> Having @subfunc, a.k.a. index, in is weird/confusing/fragile because it's not consumed,
> nor is it checked.  Peeking ahead, it looks like all future users pass '0'.  Taking the
> index but dropping it on the floor is asking for future breakage.  Either drop it or
> assert that it's zero.
> 
> > +			u32 *ecx, u32 *edx)
> > +{
> > +	u64 val;
> > +
> > +	if (eax) {
> > +		sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(func, GHCB_CPUID_REQ_EAX));
> > +		VMGEXIT();
> > +		val = sev_es_rd_ghcb_msr();
> > +
> > +		if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
> > +			return -EIO;
> > +
> > +		*eax = (val >> 32);
> > +	}
> > +
> > +	if (ebx) {
> > +		sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(func, GHCB_CPUID_REQ_EBX));
> > +		VMGEXIT();
> > +		val = sev_es_rd_ghcb_msr();
> > +
> > +		if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
> > +			return -EIO;
> > +
> > +		*ebx = (val >> 32);
> > +	}
> > +
> > +	if (ecx) {
> > +		sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(func, GHCB_CPUID_REQ_ECX));
> > +		VMGEXIT();
> > +		val = sev_es_rd_ghcb_msr();
> > +
> > +		if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
> > +			return -EIO;
> > +
> > +		*ecx = (val >> 32);
> > +	}
> > +
> > +	if (edx) {
> > +		sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(func, GHCB_CPUID_REQ_EDX));
> > +		VMGEXIT();
> > +		val = sev_es_rd_ghcb_msr();
> > +
> > +		if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
> > +			return -EIO;
> > +
> > +		*edx = (val >> 32);
> > +	}
> 
> That's a lot of pasta!  If you add
> 
>   static int __sev_cpuid_hv(u32 func, int reg_idx, u32 *reg)
>   {
> 	u64 val;
> 
> 	if (!reg)
> 		return 0;
> 
> 	sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(func, reg_idx));
> 	VMGEXIT();
> 	val = sev_es_rd_ghcb_msr();
> 	if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
> 		return -EIO;
> 
> 	*reg = (val >> 32);
> 	return 0;
>   }
> 
> then this helper can become something like:
> 
>   static int sev_cpuid_hv(u32 func, u32 *eax, u32 *ebx, u32 *ecx, u32 *edx)
>   {
> 	int ret;
> 
> 	ret = __sev_cpuid_hv(func, GHCB_CPUID_REQ_EAX, eax);
> 	ret = ret ? : __sev_cpuid_hv(func, GHCB_CPUID_REQ_EBX, ebx);
> 	ret = ret ? : __sev_cpuid_hv(func, GHCB_CPUID_REQ_ECX, ecx);
> 	ret = ret ? : __sev_cpuid_hv(func, GHCB_CPUID_REQ_EDX, edx);
> 
> 	return ret;

Looks good, will make these changes.

-Mike
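
For reference, a self-contained sketch of the control flow in the
suggested refactor. The "ret ? : x" form in the quoted code is the GNU C
shorthand for "ret ? ret : x"; the sketch spells it out with plain ifs so
it builds as standard C. All mock_* names, the failure knob and the
fabricated register values are hypothetical and stand in for the GHCB MSR
protocol, which is not modelled here.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register indices standing in for GHCB_CPUID_REQ_E{A,B,C,D}X. */
enum mock_reg { MOCK_EAX, MOCK_EBX, MOCK_ECX, MOCK_EDX };

/* Number of mock hypervisor round trips made so far. */
static unsigned int mock_exits;
/* Register index whose request should fail, or -1 for no failure. */
static int mock_fail_reg = -1;

/* Fetch one register; a NULL pointer means the caller does not want it. */
static int mock_cpuid_reg(uint32_t func, enum mock_reg idx, uint32_t *reg)
{
	if (!reg)
		return 0;

	mock_exits++;
	if ((int)idx == mock_fail_reg)
		return -EIO;

	*reg = func + (uint32_t)idx;	/* fabricated response value */
	return 0;
}

/* Stop at the first error, exactly like the chained "ret ? : ..." lines. */
static int mock_cpuid(uint32_t func, uint32_t *eax, uint32_t *ebx,
		      uint32_t *ecx, uint32_t *edx)
{
	int ret;

	ret = mock_cpuid_reg(func, MOCK_EAX, eax);
	if (!ret)
		ret = mock_cpuid_reg(func, MOCK_EBX, ebx);
	if (!ret)
		ret = mock_cpuid_reg(func, MOCK_ECX, ecx);
	if (!ret)
		ret = mock_cpuid_reg(func, MOCK_EDX, edx);

	return ret;
}
```

Skipped (NULL) registers cost no round trip, and the first failing
request short-circuits the remaining ones.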


* Re: [PATCH v8 17/40] KVM: SVM: Create a separate mapping for the SEV-ES save area
  2021-12-10 15:43 ` [PATCH v8 17/40] KVM: SVM: Create a separate mapping for the SEV-ES save area Brijesh Singh
  2021-12-30 12:19   ` Borislav Petkov
@ 2022-01-05  1:38   ` Venu Busireddy
  1 sibling, 0 replies; 183+ messages in thread
From: Venu Busireddy @ 2022-01-05  1:38 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On 2021-12-10 09:43:09 -0600, Brijesh Singh wrote:
> The save area for SEV-ES/SEV-SNP guests, as used by the hardware, is
> different from the save area of a non SEV-ES/SEV-SNP guest.
> 
> This is the first step in defining the multiple save areas to keep them
> separate and ensuring proper operation amongst the different types of
> guests. Create an SEV-ES/SEV-SNP save area and adjust usage to the new
> save area definition where needed.
> 
> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>

Reviewed-by: Venu Busireddy <venu.busireddy@oracle.com>

> ---
>  arch/x86/include/asm/svm.h | 83 +++++++++++++++++++++++++++++---------
>  arch/x86/kvm/svm/sev.c     | 24 +++++------
>  arch/x86/kvm/svm/svm.h     |  2 +-
>  3 files changed, 77 insertions(+), 32 deletions(-)
> 


* Re: [PATCH v8 18/40] KVM: SVM: Create a separate mapping for the GHCB save area
  2021-12-10 15:43 ` [PATCH v8 18/40] KVM: SVM: Create a separate mapping for the GHCB " Brijesh Singh
@ 2022-01-05 18:41   ` Venu Busireddy
  0 siblings, 0 replies; 183+ messages in thread
From: Venu Busireddy @ 2022-01-05 18:41 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On 2021-12-10 09:43:10 -0600, Brijesh Singh wrote:
> From: Tom Lendacky <thomas.lendacky@amd.com>
> 
> The initial implementation of the GHCB spec was based on trying to keep
> the register state offsets the same relative to the VM save area. However,
> the save area for SEV-ES has changed within the hardware, which changes how
> the SEV-ES save area relates to the GHCB save area.
> 
> This is the second step in defining the multiple save areas to keep them
> separate and ensuring proper operation amongst the different types of
> guests. Create a GHCB save area that matches the GHCB specification.
> 
> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>

Reviewed-by: Venu Busireddy <venu.busireddy@oracle.com>

> ---
>  arch/x86/include/asm/svm.h | 48 +++++++++++++++++++++++++++++++++++---
>  1 file changed, 45 insertions(+), 3 deletions(-)
> 


* Re: [PATCH v8 19/40] KVM: SVM: Update the SEV-ES save area mapping
  2021-12-10 15:43 ` [PATCH v8 19/40] KVM: SVM: Update the SEV-ES save area mapping Brijesh Singh
@ 2022-01-05 18:54   ` Venu Busireddy
  0 siblings, 0 replies; 183+ messages in thread
From: Venu Busireddy @ 2022-01-05 18:54 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On 2021-12-10 09:43:11 -0600, Brijesh Singh wrote:
> From: Tom Lendacky <thomas.lendacky@amd.com>
> 
> This is the final step in defining the multiple save areas to keep them
> separate and ensuring proper operation amongst the different types of
> guests. Update the SEV-ES/SEV-SNP save area to match the APM. This save
> area will be used for the upcoming SEV-SNP AP Creation NAE event support.
> 
> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>

Reviewed-by: Venu Busireddy <venu.busireddy@oracle.com>

> ---
>  arch/x86/include/asm/svm.h | 66 +++++++++++++++++++++++++++++---------
>  1 file changed, 50 insertions(+), 16 deletions(-)
> 


* Re: [PATCH v8 01/40] x86/compressed/64: detect/setup SEV/SME features earlier in boot
  2022-01-03 19:10                           ` Venu Busireddy
@ 2022-01-05 19:34                             ` Brijesh Singh
  2022-01-10 20:46                               ` Brijesh Singh
  0 siblings, 1 reply; 183+ messages in thread
From: Brijesh Singh @ 2022-01-05 19:34 UTC (permalink / raw)
  To: Venu Busireddy, Michael Roth
  Cc: brijesh.singh, Borislav Petkov, Tom Lendacky, x86, linux-kernel,
	kvm, linux-efi, platform-driver-x86, linux-coco, linux-mm,
	Thomas Gleixner, Ingo Molnar, Joerg Roedel, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy



On 1/3/22 1:10 PM, Venu Busireddy wrote:
> On 2021-12-15 15:22:57 -0600, Michael Roth wrote:
>> On Wed, Dec 15, 2021 at 09:38:55PM +0100, Borislav Petkov wrote:
>>>
>>> But it is hard to discuss anything without patches so we can continue
>>> the topic with concrete patches. But this unification is not
>>> super-pressing so it can go ontop of the SNP pile.
>>
>> Yah, it's all theoretical at this point. Didn't mean to derail things
>> though. I mainly brought it up to suggest that Venu's original approach of
>> returning the encryption bit via a pointer argument might make it easier to
>> expand it for other purposes in the future, and that naming it for that
>> future purpose might encourage future developers to focus their efforts
>> there instead of potentially re-introducing duplicate code.
>>
>> But either way it's simple enough to rework things when we actually
>> cross that bridge. So totally fine with saving all of this as a future
>> follow-up, or picking up either of Venu's patches for now if you'd still
>> prefer.
> 
> So, what is the consensus? Do you want me to submit a patch after the
> SNP changes go upstream? Or, do you want to roll in one of the patches
> that I posted earlier?
> 

I will incorporate your changes in v9 and see what others say about it.

-Brijesh

* Re: [PATCH v8 13/40] x86/kernel: Make the bss.decrypted section shared in RMP table
  2022-01-04 17:56   ` Venu Busireddy
@ 2022-01-05 19:52     ` Brijesh Singh
  2022-01-05 20:27       ` Dave Hansen
  0 siblings, 1 reply; 183+ messages in thread
From: Brijesh Singh @ 2022-01-05 19:52 UTC (permalink / raw)
  To: Venu Busireddy
  Cc: brijesh.singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum,
	Borislav Petkov, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	tony.luck, marcorr, sathyanarayanan.kuppuswamy



On 1/4/22 11:56 AM, Venu Busireddy wrote:
> On 2021-12-10 09:43:05 -0600, Brijesh Singh wrote:
>> The encryption attribute for the bss.decrypted region is cleared in the
>> initial page table build. This is because the section contains the data
>> that need to be shared between the guest and the hypervisor.
>>
>> When SEV-SNP is active, just clearing the encryption attribute in the
>> page table is not enough. The page state need to be updated in the RMP
>> table.
>>
>> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
>> ---
>>   arch/x86/kernel/head64.c | 7 +++++++
>>   1 file changed, 7 insertions(+)
>>
>> diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
>> index fa02402dcb9b..72c5082a3ba4 100644
>> --- a/arch/x86/kernel/head64.c
>> +++ b/arch/x86/kernel/head64.c
>> @@ -143,7 +143,14 @@ static unsigned long sme_postprocess_startup(struct boot_params *bp, pmdval_t *p
>>   	if (sme_get_me_mask()) {
>>   		vaddr = (unsigned long)__start_bss_decrypted;
>>   		vaddr_end = (unsigned long)__end_bss_decrypted;
>> +
>>   		for (; vaddr < vaddr_end; vaddr += PMD_SIZE) {
>> +			/*
>> +			 * When SEV-SNP is active then transition the page to shared in the RMP
>> +			 * table so that it is consistent with the page table attribute change.
>> +			 */
>> +			early_snp_set_memory_shared(__pa(vaddr), __pa(vaddr), PTRS_PER_PMD);
> 
> Shouldn't the first argument be vaddr as below?
> 

Nope, sme_postprocess_startup() is called while we are fixing the 
initial page table and running with identity mapping (so va == pa).

thanks

>     			early_snp_set_memory_shared(vaddr, __pa(vaddr), PTRS_PER_PMD);
> 
> Venu
> 

* Re: [PATCH v8 13/40] x86/kernel: Make the bss.decrypted section shared in RMP table
  2022-01-05 19:52     ` Brijesh Singh
@ 2022-01-05 20:27       ` Dave Hansen
  2022-01-05 21:39         ` Brijesh Singh
  0 siblings, 1 reply; 183+ messages in thread
From: Dave Hansen @ 2022-01-05 20:27 UTC (permalink / raw)
  To: Brijesh Singh, Venu Busireddy
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On 1/5/22 11:52, Brijesh Singh wrote:
>>>           for (; vaddr < vaddr_end; vaddr += PMD_SIZE) {
>>> +            /*
>>> +             * When SEV-SNP is active then transition the page to shared in the RMP
>>> +             * table so that it is consistent with the page table attribute change.
>>> +             */
>>> +            early_snp_set_memory_shared(__pa(vaddr), __pa(vaddr), PTRS_PER_PMD);
>>
>> Shouldn't the first argument be vaddr as below?
> 
> Nope, sme_postprocess_startup() is called while we are fixing the 
> initial page table and running with identity mapping (so va == pa).

I'm not sure I've ever seen a line of code that wanted a comment so badly.

* Re: [PATCH v8 13/40] x86/kernel: Make the bss.decrypted section shared in RMP table
  2022-01-05 20:27       ` Dave Hansen
@ 2022-01-05 21:39         ` Brijesh Singh
  2022-01-06 17:40           ` Venu Busireddy
  0 siblings, 1 reply; 183+ messages in thread
From: Brijesh Singh @ 2022-01-05 21:39 UTC (permalink / raw)
  To: Dave Hansen, Venu Busireddy
  Cc: brijesh.singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum,
	Borislav Petkov, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	tony.luck, marcorr, sathyanarayanan.kuppuswamy



On 1/5/22 2:27 PM, Dave Hansen wrote:
> On 1/5/22 11:52, Brijesh Singh wrote:
>>>>           for (; vaddr < vaddr_end; vaddr += PMD_SIZE) {
>>>> +            /*
>>>> +             * When SEV-SNP is active then transition the page to 
>>>> shared in the RMP
>>>> +             * table so that it is consistent with the page table 
>>>> attribute change.
>>>> +             */
>>>> +            early_snp_set_memory_shared(__pa(vaddr), __pa(vaddr), 
>>>> PTRS_PER_PMD);
>>>
>>> Shouldn't the first argument be vaddr as below?
>>
>> Nope, sme_postprocess_startup() is called while we are fixing the 
>> initial page table and running with identity mapping (so va == pa).
> 
> I'm not sure I've ever seen a line of code that wanted a comment so badly.

early_snp_set_memory_shared() calls the PVALIDATE instruction to clear
the validated bit for the bss.decrypted region. PVALIDATE takes a
virtual address, so we must pass the identity-mapped virtual address
that is currently in place for PVALIDATE to clear the validated bit.
I will add more comments to clarify this.

-Brijesh

* Re: [PATCH v8 24/40] x86/compressed/acpi: move EFI system table lookup to helper
  2021-12-10 15:43 ` [PATCH v8 24/40] x86/compressed/acpi: move EFI system table lookup " Brijesh Singh
  2021-12-10 18:54   ` Dave Hansen
@ 2022-01-05 23:50   ` Borislav Petkov
  2022-01-06 19:59   ` Venu Busireddy
  2 siblings, 0 replies; 183+ messages in thread
From: Borislav Petkov @ 2022-01-05 23:50 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	tony.luck, marcorr, sathyanarayanan.kuppuswamy

On Fri, Dec 10, 2021 at 09:43:16AM -0600, Brijesh Singh wrote:
> +int efi_get_system_table(struct boot_params *boot_params, unsigned long *sys_tbl_pa,
> +			 bool *is_efi_64)

Nah, that's doing two things in one function.

The signature should be a lot simpler:

unsigned long efi_get_system_table(struct boot_params *bp)

It returns 0 on failure; on success it returns the (non-zero) physical
address of the system table.

> +{
> +	unsigned long sys_tbl;
> +	struct efi_info *ei;
> +	bool efi_64;
> +	char *sig;
> +
> +	if (!sys_tbl_pa || !is_efi_64)
> +		return -EINVAL;
> +

This...

> +	ei = &boot_params->efi_info;
> +	sig = (char *)&ei->efi_loader_signature;
> +
> +	if (!strncmp(sig, EFI64_LOADER_SIGNATURE, 4)) {
> +		efi_64 = true;
> +	} else if (!strncmp(sig, EFI32_LOADER_SIGNATURE, 4)) {
> +		efi_64 = false;
> +	} else {
> +		debug_putstr("Wrong EFI loader signature.\n");
> +		return -EOPNOTSUPP;
> +	}

... up to here needs to be another function:

enum efi_type efi_get_type(sig)

where enum is { EFI64, EFI32, INVALID } or so.

And you call this function at the call sites:

	if (efi_get_type(sig) == INVALID)
		error...

	sys_tbl_pa = efi_get_system_table(bp);
	if (!sys_tbl_pa)
		error...


Completely pseudo but you get the idea.
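
For illustration only, the split described above could look roughly like
the user-space sketch below. It is not the code that was eventually
merged. struct efi_info_sketch is a trimmed, hypothetical stand-in for
the kernel's struct efi_info, the enum names are made up, and the
"EL64"/"EL32" strings stand in for EFI64_LOADER_SIGNATURE and
EFI32_LOADER_SIGNATURE. The 32-bit "table above 4GB" check from the
patch is left out to keep the sketch short.

```c
#include <stdint.h>
#include <string.h>

/* Trimmed, hypothetical stand-in for the kernel's struct efi_info. */
struct efi_info_sketch {
	uint32_t efi_loader_signature;
	uint32_t efi_systab;
	uint32_t efi_systab_hi;
};

/* Hypothetical names for the { EFI64, EFI32, INVALID } idea. */
enum efi_type { EFI_TYPE_64, EFI_TYPE_32, EFI_TYPE_NONE };

/* Step 1: classify the loader signature only; no table lookup here. */
static enum efi_type efi_get_type(const struct efi_info_sketch *ei)
{
	const char *sig = (const char *)&ei->efi_loader_signature;

	if (!strncmp(sig, "EL64", 4))
		return EFI_TYPE_64;
	if (!strncmp(sig, "EL32", 4))
		return EFI_TYPE_32;

	return EFI_TYPE_NONE;
}

/*
 * Step 2: return the physical address of the system table, or 0 when
 * boot params do not carry one. The caller decides whether 0 is fatal.
 */
static uint64_t efi_get_system_table(const struct efi_info_sketch *ei)
{
	return (uint64_t)ei->efi_systab |
	       ((uint64_t)ei->efi_systab_hi << 32);
}
```

A call site would then check efi_get_type() first and treat
EFI_TYPE_NONE as "no EFI, fall through", and only afterwards treat a
zero return from efi_get_system_table() as a hard error.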

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

* Re: [PATCH v8 13/40] x86/kernel: Make the bss.decrypted section shared in RMP table
  2022-01-05 21:39         ` Brijesh Singh
@ 2022-01-06 17:40           ` Venu Busireddy
  2022-01-06 19:06             ` Tom Lendacky
  0 siblings, 1 reply; 183+ messages in thread
From: Venu Busireddy @ 2022-01-06 17:40 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: Dave Hansen, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum,
	Borislav Petkov, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	tony.luck, marcorr, sathyanarayanan.kuppuswamy

On 2022-01-05 15:39:22 -0600, Brijesh Singh wrote:
> 
> 
> On 1/5/22 2:27 PM, Dave Hansen wrote:
> > On 1/5/22 11:52, Brijesh Singh wrote:
> > > > >           for (; vaddr < vaddr_end; vaddr += PMD_SIZE) {
> > > > > +            /*
> > > > > +             * When SEV-SNP is active then transition the
> > > > > page to shared in the RMP
> > > > > +             * table so that it is consistent with the page
> > > > > table attribute change.
> > > > > +             */
> > > > > +            early_snp_set_memory_shared(__pa(vaddr),
> > > > > __pa(vaddr), PTRS_PER_PMD);
> > > > 
> > > > Shouldn't the first argument be vaddr as below?
> > > 
> > > Nope, sme_postprocess_startup() is called while we are fixing the
> > > initial page table and running with identity mapping (so va == pa).
> > 
> > I'm not sure I've ever seen a line of code that wanted a comment so badly.
> 
> The early_snp_set_memory_shared() call the PVALIDATE instruction to clear
> the validated bit from the BSS region. The PVALIDATE instruction needs a
> virtual address, so we need to use the identity mapped virtual address so
> that PVALIDATE can clear the validated bit. I will add more comments to
> clarify it.

Looking forward to seeing your final comments explaining this. I still
can't follow why, when PVALIDATE needs the virtual address, we are doing
a __pa() on the vaddr.

Venu


* Re: [PATCH v8 22/40] x86/sev: move MSR-based VMGEXITs for CPUID to helper
  2021-12-10 15:43 ` [PATCH v8 22/40] x86/sev: move MSR-based VMGEXITs for CPUID to helper Brijesh Singh
  2021-12-30 18:52   ` Sean Christopherson
@ 2022-01-06 18:38   ` Venu Busireddy
  2022-01-06 20:21     ` Michael Roth
  1 sibling, 1 reply; 183+ messages in thread
From: Venu Busireddy @ 2022-01-06 18:38 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On 2021-12-10 09:43:14 -0600, Brijesh Singh wrote:
> From: Michael Roth <michael.roth@amd.com>
> 
> This code will also be used later for SEV-SNP-validated CPUID code in
> some cases, so move it to a common helper.
> 
> Signed-off-by: Michael Roth <michael.roth@amd.com>
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> ---
>  arch/x86/kernel/sev-shared.c | 84 +++++++++++++++++++++++++-----------
>  1 file changed, 58 insertions(+), 26 deletions(-)
> 
> diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c
> index 3aaef1a18ffe..d89481b31022 100644
> --- a/arch/x86/kernel/sev-shared.c
> +++ b/arch/x86/kernel/sev-shared.c
> @@ -194,6 +194,58 @@ enum es_result sev_es_ghcb_hv_call(struct ghcb *ghcb, bool set_ghcb_msr,
>  	return verify_exception_info(ghcb, ctxt);
>  }
>  
> +static int sev_cpuid_hv(u32 func, u32 subfunc, u32 *eax, u32 *ebx,
> +			u32 *ecx, u32 *edx)
> +{
> +	u64 val;
> +
> +	if (eax) {
> +		sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(func, GHCB_CPUID_REQ_EAX));
> +		VMGEXIT();
> +		val = sev_es_rd_ghcb_msr();
> +
> +		if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
> +			return -EIO;
> +
> +		*eax = (val >> 32);
> +	}
> +
> +	if (ebx) {
> +		sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(func, GHCB_CPUID_REQ_EBX));
> +		VMGEXIT();
> +		val = sev_es_rd_ghcb_msr();
> +
> +		if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
> +			return -EIO;
> +
> +		*ebx = (val >> 32);
> +	}
> +
> +	if (ecx) {
> +		sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(func, GHCB_CPUID_REQ_ECX));
> +		VMGEXIT();
> +		val = sev_es_rd_ghcb_msr();
> +
> +		if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
> +			return -EIO;
> +
> +		*ecx = (val >> 32);
> +	}
> +
> +	if (edx) {
> +		sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(func, GHCB_CPUID_REQ_EDX));
> +		VMGEXIT();
> +		val = sev_es_rd_ghcb_msr();
> +
> +		if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
> +			return -EIO;
> +
> +		*edx = (val >> 32);
> +	}
> +
> +	return 0;
> +}
> +
>  /*
>   * Boot VC Handler - This is the first VC handler during boot, there is no GHCB
>   * page yet, so it only supports the MSR based communication with the
> @@ -202,39 +254,19 @@ enum es_result sev_es_ghcb_hv_call(struct ghcb *ghcb, bool set_ghcb_msr,
>  void __init do_vc_no_ghcb(struct pt_regs *regs, unsigned long exit_code)
>  {
>  	unsigned int fn = lower_bits(regs->ax, 32);
> -	unsigned long val;
> +	u32 eax, ebx, ecx, edx;
>  
>  	/* Only CPUID is supported via MSR protocol */
>  	if (exit_code != SVM_EXIT_CPUID)
>  		goto fail;
>  
> -	sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(fn, GHCB_CPUID_REQ_EAX));
> -	VMGEXIT();
> -	val = sev_es_rd_ghcb_msr();
> -	if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
> +	if (sev_cpuid_hv(fn, 0, &eax, &ebx, &ecx, &edx))
>  		goto fail;
> -	regs->ax = val >> 32;
>  
> -	sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(fn, GHCB_CPUID_REQ_EBX));
> -	VMGEXIT();
> -	val = sev_es_rd_ghcb_msr();
> -	if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
> -		goto fail;
> -	regs->bx = val >> 32;
> -
> -	sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(fn, GHCB_CPUID_REQ_ECX));
> -	VMGEXIT();
> -	val = sev_es_rd_ghcb_msr();
> -	if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
> -		goto fail;
> -	regs->cx = val >> 32;
> -
> -	sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(fn, GHCB_CPUID_REQ_EDX));
> -	VMGEXIT();
> -	val = sev_es_rd_ghcb_msr();
> -	if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
> -		goto fail;
> -	regs->dx = val >> 32;
> +	regs->ax = eax;
> +	regs->bx = ebx;
> +	regs->cx = ecx;
> +	regs->dx = edx;

What is the intent behind declaring e?x as local variables, instead
of passing the addresses of regs->?x to sev_cpuid_hv()? Is it to
avoid touching any of the regs->?x unless sev_cpuid_hv() succeeds?
If so, wouldn't it be better to hide this logic from the callers by
declaring the local variables in sev_cpuid_hv() itself, and moving the
four "*e?x = (val >> 32);" statements to the end of the function (just
before the last return)? With that change, callers can safely pass the
addresses of regs->?x to sev_cpuid_hv(), knowing that the values will
only be touched if there is no error.
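
A rough user-space sketch of that idea, for illustration only. The
GHCB MSR and VMGEXIT are simulated here: fake_ghcb_msr, fake_vmgexit()
and fake_hv_fail_reg are invented for the sketch and are not kernel
interfaces. The request/response encoding follows the GHCB MSR protocol
used by the patch (request code 0x004, response code 0x005, register
selector in bits 31:30, value in bits 63:32). The four repeated blocks
are folded into a loop purely to keep the sketch short.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define GHCB_MSR_INFO_MASK	0xfffULL
#define GHCB_MSR_CPUID_REQ	0x004ULL
#define GHCB_MSR_CPUID_RESP	0x005ULL

/* Simulated GHCB MSR; a real guest uses wrmsr/rdmsr around VMGEXIT. */
static uint64_t fake_ghcb_msr;

/* Test knob: register selector (0..3) the fake hypervisor rejects. */
static int fake_hv_fail_reg = -1;

/* Simulated hypervisor: answers with the made-up value func + reg. */
static void fake_vmgexit(void)
{
	uint64_t req = fake_ghcb_msr;
	uint32_t func = (uint32_t)(req >> 32);
	unsigned int reg = (unsigned int)((req >> 30) & 0x3);

	if ((req & GHCB_MSR_INFO_MASK) != GHCB_MSR_CPUID_REQ ||
	    (int)reg == fake_hv_fail_reg) {
		fake_ghcb_msr = 0;	/* anything but a CPUID response */
		return;
	}

	fake_ghcb_msr = ((uint64_t)(func + reg) << 32) | GHCB_MSR_CPUID_RESP;
}

static int sev_cpuid_hv_sketch(uint32_t func, uint32_t *eax, uint32_t *ebx,
			       uint32_t *ecx, uint32_t *edx)
{
	uint32_t *out[4] = { eax, ebx, ecx, edx };
	uint32_t tmp[4] = { 0 };
	int i;

	for (i = 0; i < 4; i++) {
		if (!out[i])
			continue;

		fake_ghcb_msr = GHCB_MSR_CPUID_REQ | ((uint64_t)i << 30) |
				((uint64_t)func << 32);
		fake_vmgexit();

		if ((fake_ghcb_msr & GHCB_MSR_INFO_MASK) != GHCB_MSR_CPUID_RESP)
			return -EIO;	/* caller's outputs are untouched */

		tmp[i] = (uint32_t)(fake_ghcb_msr >> 32);
	}

	/* Commit only after every requested register was fetched. */
	for (i = 0; i < 4; i++)
		if (out[i])
			*out[i] = tmp[i];

	return 0;
}
```

With the temporaries kept inside the helper, a caller could hand in
pointers to its final destinations directly, since a failed request
leaves them unmodified.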

Venu

* Re: [PATCH v8 23/40] KVM: x86: move lookup of indexed CPUID leafs to helper
  2021-12-10 15:43 ` [PATCH v8 23/40] KVM: x86: move lookup of indexed CPUID leafs " Brijesh Singh
@ 2022-01-06 18:46   ` Venu Busireddy
  0 siblings, 0 replies; 183+ messages in thread
From: Venu Busireddy @ 2022-01-06 18:46 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On 2021-12-10 09:43:15 -0600, Brijesh Singh wrote:
> From: Michael Roth <michael.roth@amd.com>
> 
> Determining which CPUID leafs have significant ECX/index values is
> also needed by guest kernel code when doing SEV-SNP-validated CPUID
> lookups. Move this to common code to keep future updates in sync.
> 
> Signed-off-by: Michael Roth <michael.roth@amd.com>
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>

Reviewed-by: Venu Busireddy <venu.busireddy@oracle.com>

> ---
>  arch/x86/include/asm/cpuid.h | 26 ++++++++++++++++++++++++++
>  arch/x86/kvm/cpuid.c         | 17 ++---------------
>  2 files changed, 28 insertions(+), 15 deletions(-)
>  create mode 100644 arch/x86/include/asm/cpuid.h
> 
> diff --git a/arch/x86/include/asm/cpuid.h b/arch/x86/include/asm/cpuid.h
> new file mode 100644
> index 000000000000..61426eb1f665
> --- /dev/null
> +++ b/arch/x86/include/asm/cpuid.h
> @@ -0,0 +1,26 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _ASM_X86_CPUID_H
> +#define _ASM_X86_CPUID_H
> +
> +static __always_inline bool cpuid_function_is_indexed(u32 function)
> +{
> +	switch (function) {
> +	case 4:
> +	case 7:
> +	case 0xb:
> +	case 0xd:
> +	case 0xf:
> +	case 0x10:
> +	case 0x12:
> +	case 0x14:
> +	case 0x17:
> +	case 0x18:
> +	case 0x1f:
> +	case 0x8000001d:
> +		return true;
> +	}
> +
> +	return false;
> +}
> +
> +#endif /* _ASM_X86_CPUID_H */
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 07e9215e911d..6b99e8e87480 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -19,6 +19,7 @@
>  #include <asm/user.h>
>  #include <asm/fpu/xstate.h>
>  #include <asm/sgx.h>
> +#include <asm/cpuid.h>
>  #include "cpuid.h"
>  #include "lapic.h"
>  #include "mmu.h"
> @@ -626,22 +627,8 @@ static struct kvm_cpuid_entry2 *do_host_cpuid(struct kvm_cpuid_array *array,
>  	cpuid_count(entry->function, entry->index,
>  		    &entry->eax, &entry->ebx, &entry->ecx, &entry->edx);
>  
> -	switch (function) {
> -	case 4:
> -	case 7:
> -	case 0xb:
> -	case 0xd:
> -	case 0xf:
> -	case 0x10:
> -	case 0x12:
> -	case 0x14:
> -	case 0x17:
> -	case 0x18:
> -	case 0x1f:
> -	case 0x8000001d:
> +	if (cpuid_function_is_indexed(function))
>  		entry->flags |= KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
> -		break;
> -	}
>  
>  	return entry;
>  }
> -- 
> 2.25.1
> 
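
As a hedged illustration of why guest code would want the helper
quoted above: a lookup in a CPUID table should only compare the
ECX/index value for leaves where it is significant. The sketch below
is user-space C. struct cpuid_entry_sketch and cpuid_lookup_sketch()
are hypothetical and do not reflect the layout of the SNP CPUID table
or any kernel API; only the leaf list is copied from the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same leaf list as cpuid_function_is_indexed() in the quoted patch. */
static bool cpuid_function_is_indexed(uint32_t function)
{
	switch (function) {
	case 4:
	case 7:
	case 0xb:
	case 0xd:
	case 0xf:
	case 0x10:
	case 0x12:
	case 0x14:
	case 0x17:
	case 0x18:
	case 0x1f:
	case 0x8000001d:
		return true;
	}

	return false;
}

/* Hypothetical table entry, reduced to what the sketch needs. */
struct cpuid_entry_sketch {
	uint32_t function;
	uint32_t index;
	uint32_t eax;
};

/* Return the matching entry, or NULL if the table has none. */
static const struct cpuid_entry_sketch *
cpuid_lookup_sketch(const struct cpuid_entry_sketch *tbl, size_t n,
		    uint32_t function, uint32_t index)
{
	bool indexed = cpuid_function_is_indexed(function);
	size_t i;

	for (i = 0; i < n; i++) {
		if (tbl[i].function != function)
			continue;

		/* ECX selects a sub-leaf only for indexed functions. */
		if (indexed && tbl[i].index != index)
			continue;

		return &tbl[i];
	}

	return NULL;
}
```

For a non-indexed leaf such as 0x1 the caller's ECX value is ignored,
while for leaf 0xd a request for a sub-leaf that is not in the table
yields NULL.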

* Re: [PATCH v8 13/40] x86/kernel: Make the bss.decrypted section shared in RMP table
  2022-01-06 17:40           ` Venu Busireddy
@ 2022-01-06 19:06             ` Tom Lendacky
  2022-01-06 20:16               ` Venu Busireddy
  0 siblings, 1 reply; 183+ messages in thread
From: Tom Lendacky @ 2022-01-06 19:06 UTC (permalink / raw)
  To: Venu Busireddy, Brijesh Singh
  Cc: Dave Hansen, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, H. Peter Anvin, Ard Biesheuvel,
	Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Jim Mattson, Andy Lutomirski, Dave Hansen, Sergio Lopez,
	Peter Gonda, Peter Zijlstra, Srinivas Pandruvada, David Rientjes,
	Dov Murik, Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On 1/6/22 11:40 AM, Venu Busireddy wrote:
> On 2022-01-05 15:39:22 -0600, Brijesh Singh wrote:
>>
>>
>> On 1/5/22 2:27 PM, Dave Hansen wrote:
>>> On 1/5/22 11:52, Brijesh Singh wrote:
>>>>>>            for (; vaddr < vaddr_end; vaddr += PMD_SIZE) {
>>>>>> +            /*
>>>>>> +             * When SEV-SNP is active then transition the
>>>>>> page to shared in the RMP
>>>>>> +             * table so that it is consistent with the page
>>>>>> table attribute change.
>>>>>> +             */
>>>>>> +            early_snp_set_memory_shared(__pa(vaddr),
>>>>>> __pa(vaddr), PTRS_PER_PMD);
>>>>>
>>>>> Shouldn't the first argument be vaddr as below?
>>>>
>>>> Nope, sme_postprocess_startup() is called while we are fixing the
>>>> initial page table and running with identity mapping (so va == pa).
>>>
>>> I'm not sure I've ever seen a line of code that wanted a comment so badly.
>>
>> The early_snp_set_memory_shared() call the PVALIDATE instruction to clear
>> the validated bit from the BSS region. The PVALIDATE instruction needs a
>> virtual address, so we need to use the identity mapped virtual address so
>> that PVALIDATE can clear the validated bit. I will add more comments to
>> clarify it.
> 
> Looking forward to see your final comments explaining this. I can't
> still follow why, when PVALIDATE needs the virtual address, we are doing
> a __pa() on the vaddr.

It's because of the phase of booting that the kernel is in. At this point, 
the kernel is running in identity mapped mode (VA == PA). The 
__start_bss_decrypted address is a regular kernel address, e.g. for the 
kernel I'm on it is 0xffffffffa7600000. Since the PVALIDATE instruction 
requires a valid virtual address, the code needs to perform a __pa() 
against __start_bss_decrypted to get the identity mapped virtual address 
that is currently in place.

It is not until the .Ljump_to_C_code label in head_64.S that the 
addressing switches from identity mapped to kernel addresses.
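
To make the address arithmetic concrete, here is a user-space sketch
(not kernel code). It assumes a simplified __pa() that is only valid
for kernel-image addresses: subtract the x86-64 kernel text mapping
base (0xffffffff80000000) and add a relocation offset.
phys_base_sketch, pa_sketch(), struct shared_call and
share_bss_decrypted_sketch() are names invented for the sketch; the
real early_snp_set_memory_shared() is replaced by recording its
arguments.

```c
#include <stdint.h>

#define START_KERNEL_MAP_SKETCH	0xffffffff80000000ULL
#define PMD_SIZE_SKETCH		(1ULL << 21)	/* 2 MiB per PMD entry */
#define PTRS_PER_PMD_SKETCH	512		/* 4 KiB pages per 2 MiB */

/* Stand-in for the kernel's phys_base relocation offset (assumed). */
static uint64_t phys_base_sketch;

/* Simplified __pa(): valid only for kernel-image virtual addresses. */
static uint64_t pa_sketch(uint64_t kernel_vaddr)
{
	return kernel_vaddr - START_KERNEL_MAP_SKETCH + phys_base_sketch;
}

/* What a stubbed early_snp_set_memory_shared() would be handed. */
struct shared_call {
	uint64_t vaddr;
	uint64_t paddr;
	unsigned int npages;
};

static unsigned int
share_bss_decrypted_sketch(uint64_t start, uint64_t end,
			   struct shared_call *calls, unsigned int max)
{
	unsigned int n = 0;
	uint64_t vaddr;

	for (vaddr = start; vaddr < end && n < max;
	     vaddr += PMD_SIZE_SKETCH) {
		/*
		 * The identity mapping is still active, so the only
		 * virtual address that resolves right now is the one
		 * numerically equal to the physical address. The high
		 * kernel address in 'vaddr' is not usable yet.
		 */
		calls[n].vaddr = pa_sketch(vaddr);
		calls[n].paddr = pa_sketch(vaddr);
		calls[n].npages = PTRS_PER_PMD_SKETCH;
		n++;
	}

	return n;
}
```

Under the sketch's assumptions and a zero relocation offset, the
example address 0xffffffffa7600000 becomes 0x27600000, and that same
value is passed as both the virtual and the physical argument.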

Thanks,
Tom

> 
> Venu
> 

* Re: [PATCH v8 24/40] x86/compressed/acpi: move EFI system table lookup to helper
  2021-12-10 15:43 ` [PATCH v8 24/40] x86/compressed/acpi: move EFI system table lookup " Brijesh Singh
  2021-12-10 18:54   ` Dave Hansen
  2022-01-05 23:50   ` Borislav Petkov
@ 2022-01-06 19:59   ` Venu Busireddy
  2 siblings, 0 replies; 183+ messages in thread
From: Venu Busireddy @ 2022-01-06 19:59 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On 2021-12-10 09:43:16 -0600, Brijesh Singh wrote:
> From: Michael Roth <michael.roth@amd.com>
> 
> Future patches for SEV-SNP-validated CPUID will also require early
> parsing of the EFI configuration. Incrementally move the related code
> into a set of helpers that can be re-used for that purpose.
> 
> Signed-off-by: Michael Roth <michael.roth@amd.com>
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>

Reviewed-by: Venu Busireddy <venu.busireddy@oracle.com>

> ---
>  arch/x86/boot/compressed/Makefile |  1 +
>  arch/x86/boot/compressed/acpi.c   | 60 ++++++++++----------------
>  arch/x86/boot/compressed/efi.c    | 72 +++++++++++++++++++++++++++++++
>  arch/x86/boot/compressed/misc.h   | 14 ++++++
>  4 files changed, 109 insertions(+), 38 deletions(-)
>  create mode 100644 arch/x86/boot/compressed/efi.c
> 
> diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
> index 431bf7f846c3..d364192c2367 100644
> --- a/arch/x86/boot/compressed/Makefile
> +++ b/arch/x86/boot/compressed/Makefile
> @@ -100,6 +100,7 @@ endif
>  vmlinux-objs-$(CONFIG_ACPI) += $(obj)/acpi.o
>  
>  vmlinux-objs-$(CONFIG_EFI_MIXED) += $(obj)/efi_thunk_$(BITS).o
> +vmlinux-objs-$(CONFIG_EFI) += $(obj)/efi.o
>  efi-obj-$(CONFIG_EFI_STUB) = $(objtree)/drivers/firmware/efi/libstub/lib.a
>  
>  $(obj)/vmlinux: $(vmlinux-objs-y) $(efi-obj-y) FORCE
> diff --git a/arch/x86/boot/compressed/acpi.c b/arch/x86/boot/compressed/acpi.c
> index 8bcbcee54aa1..9e784bd7b2e6 100644
> --- a/arch/x86/boot/compressed/acpi.c
> +++ b/arch/x86/boot/compressed/acpi.c
> @@ -86,8 +86,8 @@ static acpi_physical_address kexec_get_rsdp_addr(void)
>  {
>  	efi_system_table_64_t *systab;
>  	struct efi_setup_data *esd;
> -	struct efi_info *ei;
> -	char *sig;
> +	bool efi_64;
> +	int ret;
>  
>  	esd = (struct efi_setup_data *)get_kexec_setup_data_addr();
>  	if (!esd)
> @@ -98,18 +98,16 @@ static acpi_physical_address kexec_get_rsdp_addr(void)
>  		return 0;
>  	}
>  
> -	ei = &boot_params->efi_info;
> -	sig = (char *)&ei->efi_loader_signature;
> -	if (strncmp(sig, EFI64_LOADER_SIGNATURE, 4)) {
> +	/* Get systab from boot params. */
> +	ret = efi_get_system_table(boot_params, (unsigned long *)&systab, &efi_64);
> +	if (ret)
> +		error("EFI system table not found in kexec boot_params.");
> +
> +	if (!efi_64) {
>  		debug_putstr("Wrong kexec EFI loader signature.\n");
>  		return 0;
>  	}
>  
> -	/* Get systab from boot params. */
> -	systab = (efi_system_table_64_t *) (ei->efi_systab | ((__u64)ei->efi_systab_hi << 32));
> -	if (!systab)
> -		error("EFI system table not found in kexec boot_params.");
> -
>  	return __efi_get_rsdp_addr((unsigned long)esd->tables, systab->nr_tables, true);
>  }
>  #else
> @@ -119,45 +117,31 @@ static acpi_physical_address kexec_get_rsdp_addr(void) { return 0; }
>  static acpi_physical_address efi_get_rsdp_addr(void)
>  {
>  #ifdef CONFIG_EFI
> -	unsigned long systab, config_tables;
> +	unsigned long systab_tbl_pa, config_tables;
>  	unsigned int nr_tables;
> -	struct efi_info *ei;
>  	bool efi_64;
> -	char *sig;
> -
> -	ei = &boot_params->efi_info;
> -	sig = (char *)&ei->efi_loader_signature;
> -
> -	if (!strncmp(sig, EFI64_LOADER_SIGNATURE, 4)) {
> -		efi_64 = true;
> -	} else if (!strncmp(sig, EFI32_LOADER_SIGNATURE, 4)) {
> -		efi_64 = false;
> -	} else {
> -		debug_putstr("Wrong EFI loader signature.\n");
> -		return 0;
> -	}
> +	int ret;
>  
> -	/* Get systab from boot params. */
> -#ifdef CONFIG_X86_64
> -	systab = ei->efi_systab | ((__u64)ei->efi_systab_hi << 32);
> -#else
> -	if (ei->efi_systab_hi || ei->efi_memmap_hi) {
> -		debug_putstr("Error getting RSDP address: EFI system table located above 4GB.\n");
> +	/*
> +	 * This function is called even for non-EFI BIOSes, and callers expect
> +	 * failure to locate the EFI system table to result in 0 being returned
> +	 * as indication that EFI is not available, rather than outright
> +	 * failure/abort.
> +	 */
> +	ret = efi_get_system_table(boot_params, &systab_tbl_pa, &efi_64);
> +	if (ret == -EOPNOTSUPP)
>  		return 0;
> -	}
> -	systab = ei->efi_systab;
> -#endif
> -	if (!systab)
> -		error("EFI system table not found.");
> +	if (ret)
> +		error("EFI support advertised, but unable to locate system table.");
>  
>  	/* Handle EFI bitness properly */
>  	if (efi_64) {
> -		efi_system_table_64_t *stbl = (efi_system_table_64_t *)systab;
> +		efi_system_table_64_t *stbl = (efi_system_table_64_t *)systab_tbl_pa;
>  
>  		config_tables	= stbl->tables;
>  		nr_tables	= stbl->nr_tables;
>  	} else {
> -		efi_system_table_32_t *stbl = (efi_system_table_32_t *)systab;
> +		efi_system_table_32_t *stbl = (efi_system_table_32_t *)systab_tbl_pa;
>  
>  		config_tables	= stbl->tables;
>  		nr_tables	= stbl->nr_tables;
> diff --git a/arch/x86/boot/compressed/efi.c b/arch/x86/boot/compressed/efi.c
> new file mode 100644
> index 000000000000..1c626d28f07e
> --- /dev/null
> +++ b/arch/x86/boot/compressed/efi.c
> @@ -0,0 +1,72 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Helpers for early access to EFI configuration table
> + *
> + * Copyright (C) 2021 Advanced Micro Devices, Inc.
> + *
> + * Author: Michael Roth <michael.roth@amd.com>
> + */
> +
> +#include "misc.h"
> +#include <linux/efi.h>
> +#include <asm/efi.h>
> +
> +/**
> + * efi_get_system_table - Given boot_params, retrieve the physical address of
> + *                        EFI system table.
> + *
> + * @boot_params:        pointer to boot_params
> + * @sys_tbl_pa:         location to store physical address of system table
> + * @is_efi_64:          location to store whether using 64-bit EFI or not
> + *
> + * Return: 0 on success. On error, return params are left unchanged.
> + *
> + * Note: Existing callers like ACPI will call this unconditionally even for
> + * non-EFI BIOSes. In such cases, those callers may treat cases where
> + * bootparams doesn't indicate that a valid EFI system table is available as
> + * non-fatal errors to allow fall-through to non-EFI alternatives. This
> + * class of errors are reported as EOPNOTSUPP and should be kept in sync with
> + * callers who check for that specific error.
> + */
> +int efi_get_system_table(struct boot_params *boot_params, unsigned long *sys_tbl_pa,
> +			 bool *is_efi_64)
> +{
> +	unsigned long sys_tbl;
> +	struct efi_info *ei;
> +	bool efi_64;
> +	char *sig;
> +
> +	if (!sys_tbl_pa || !is_efi_64)
> +		return -EINVAL;
> +
> +	ei = &boot_params->efi_info;
> +	sig = (char *)&ei->efi_loader_signature;
> +
> +	if (!strncmp(sig, EFI64_LOADER_SIGNATURE, 4)) {
> +		efi_64 = true;
> +	} else if (!strncmp(sig, EFI32_LOADER_SIGNATURE, 4)) {
> +		efi_64 = false;
> +	} else {
> +		debug_putstr("Wrong EFI loader signature.\n");
> +		return -EOPNOTSUPP;
> +	}
> +
> +	/* Get systab from boot params. */
> +#ifdef CONFIG_X86_64
> +	sys_tbl = ei->efi_systab | ((__u64)ei->efi_systab_hi << 32);
> +#else
> +	if (ei->efi_systab_hi || ei->efi_memmap_hi) {
> +		debug_putstr("Error: EFI system table located above 4GB.\n");
> +		return -EOPNOTSUPP;
> +	}
> +	sys_tbl = ei->efi_systab;
> +#endif
> +	if (!sys_tbl) {
> +		debug_putstr("EFI system table not found.");
> +		return -ENOENT;
> +	}
> +
> +	*sys_tbl_pa = sys_tbl;
> +	*is_efi_64 = efi_64;
> +	return 0;
> +}
> diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
> index 01cc13c12059..165640f64b71 100644
> --- a/arch/x86/boot/compressed/misc.h
> +++ b/arch/x86/boot/compressed/misc.h
> @@ -23,6 +23,7 @@
>  #include <linux/screen_info.h>
>  #include <linux/elf.h>
>  #include <linux/io.h>
> +#include <linux/efi.h>
>  #include <asm/page.h>
>  #include <asm/boot.h>
>  #include <asm/bootparam.h>
> @@ -176,4 +177,17 @@ void boot_stage2_vc(void);
>  
>  unsigned long sev_verify_cbit(unsigned long cr3);
>  
> +#ifdef CONFIG_EFI
> +/* helpers for early EFI config table access */
> +int efi_get_system_table(struct boot_params *boot_params,
> +			 unsigned long *sys_tbl_pa, bool *is_efi_64);
> +#else
> +static inline int
> +efi_get_system_table(struct boot_params *boot_params,
> +		     unsigned long *sys_tbl_pa, bool *is_efi_64)
> +{
> +	return -ENOENT;
> +}
> +#endif /* CONFIG_EFI */
> +
>  #endif /* BOOT_COMPRESSED_MISC_H */
> -- 
> 2.25.1
> 
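
The error contract described in the kernel-doc note above (EOPNOTSUPP means
"no usable EFI, fall through", anything else is a real failure) can be
sketched in user space roughly as follows. This is only an illustration:
struct fake_efi_info, fake_get_system_table() and classify() are hypothetical
names, the structure is a trimmed-down stand-in for the efi_info part of
boot_params, and only the 64-bit systab path is modelled.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, trimmed-down stand-in for boot_params->efi_info. */
struct fake_efi_info {
	uint32_t efi_loader_signature;
	uint32_t efi_systab;
	uint32_t efi_systab_hi;
};

/* Models the 64-bit path of the quoted helper ("EL64"/"EL32" signatures). */
static int fake_get_system_table(const struct fake_efi_info *ei,
				 uint64_t *sys_tbl_pa, bool *is_efi_64)
{
	const char *sig = (const char *)&ei->efi_loader_signature;
	uint64_t sys_tbl;
	bool efi_64;

	if (!sys_tbl_pa || !is_efi_64)
		return -EINVAL;

	if (!strncmp(sig, "EL64", 4))
		efi_64 = true;
	else if (!strncmp(sig, "EL32", 4))
		efi_64 = false;
	else
		return -EOPNOTSUPP;	/* not booted via EFI: non-fatal */

	/* Combine the low and high halves of the system table address. */
	sys_tbl = ei->efi_systab | ((uint64_t)ei->efi_systab_hi << 32);
	if (!sys_tbl)
		return -ENOENT;		/* EFI advertised, but no table */

	/* Outputs are written only on success. */
	*sys_tbl_pa = sys_tbl;
	*is_efi_64 = efi_64;
	return 0;
}

/*
 * Assumed caller policy: 0 = use the table, 1 = fall through to a
 * non-EFI alternative, -1 = treat as a hard error.
 */
static int classify(int ret)
{
	if (!ret)
		return 0;
	if (ret == -EOPNOTSUPP)
		return 1;
	return -1;
}
```

A caller such as the ACPI code would then only report an error when
classify() yields -1, which is why the set of conditions mapped to
EOPNOTSUPP has to stay in sync with the callers that check for it.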


* Re: [PATCH v8 13/40] x86/kernel: Make the bss.decrypted section shared in RMP table
  2022-01-06 19:06             ` Tom Lendacky
@ 2022-01-06 20:16               ` Venu Busireddy
  2022-01-06 20:50                 ` Tom Lendacky
  0 siblings, 1 reply; 183+ messages in thread
From: Venu Busireddy @ 2022-01-06 20:16 UTC (permalink / raw)
  To: Tom Lendacky
  Cc: Brijesh Singh, Dave Hansen, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, H. Peter Anvin, Ard Biesheuvel,
	Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Jim Mattson, Andy Lutomirski, Dave Hansen, Sergio Lopez,
	Peter Gonda, Peter Zijlstra, Srinivas Pandruvada, David Rientjes,
	Dov Murik, Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On 2022-01-06 13:06:13 -0600, Tom Lendacky wrote:
> On 1/6/22 11:40 AM, Venu Busireddy wrote:
> > On 2022-01-05 15:39:22 -0600, Brijesh Singh wrote:
> > > 
> > > 
> > > On 1/5/22 2:27 PM, Dave Hansen wrote:
> > > > On 1/5/22 11:52, Brijesh Singh wrote:
> > > > > > >            for (; vaddr < vaddr_end; vaddr += PMD_SIZE) {
> > > > > > > +            /*
> > > > > > > +             * When SEV-SNP is active then transition the
> > > > > > > page to shared in the RMP
> > > > > > > +             * table so that it is consistent with the page
> > > > > > > table attribute change.
> > > > > > > +             */
> > > > > > > +            early_snp_set_memory_shared(__pa(vaddr),
> > > > > > > __pa(vaddr), PTRS_PER_PMD);
> > > > > > 
> > > > > > Shouldn't the first argument be vaddr as below?
> > > > > 
> > > > > Nope, sme_postprocess_startup() is called while we are fixing the
> > > > > initial page table and running with identity mapping (so va == pa).
> > > > 
> > > > I'm not sure I've ever seen a line of code that wanted a comment so badly.
> > > 
> > > The early_snp_set_memory_shared() call the PVALIDATE instruction to clear
> > > the validated bit from the BSS region. The PVALIDATE instruction needs a
> > > virtual address, so we need to use the identity mapped virtual address so
> > > that PVALIDATE can clear the validated bit. I will add more comments to
> > > clarify it.
> > 
> > Looking forward to see your final comments explaining this. I can't
> > still follow why, when PVALIDATE needs the virtual address, we are doing
> > a __pa() on the vaddr.
> 
> It's because of the phase of booting that the kernel is in. At this point,
> the kernel is running in identity mapped mode (VA == PA). The
> __start_bss_decrypted address is a regular kernel address, e.g. for the
> kernel I'm on it is 0xffffffffa7600000. Since the PVALIDATE instruction
> requires a valid virtual address, the code needs to perform a __pa() against
> __start_bss_decrypted to get the identity mapped virtual address that is
> currently in place.

Perhaps my confusion stems from the fact that __pa(x) is defined either
as "((unsigned long ) (x))" (for the cases where paddr and vaddr are the
same), or as "__phys_addr((unsigned long )(x))", where a vaddr needs to
be converted to a paddr. If the paddr and vaddr are the same in our case,
what exactly is the __pa(vaddr) doing to the vaddr?

Venu


* Re: [PATCH v8 22/40] x86/sev: move MSR-based VMGEXITs for CPUID to helper
  2022-01-06 18:38   ` Venu Busireddy
@ 2022-01-06 20:21     ` Michael Roth
  2022-01-06 20:36       ` Venu Busireddy
  0 siblings, 1 reply; 183+ messages in thread
From: Michael Roth @ 2022-01-06 20:21 UTC (permalink / raw)
  To: Venu Busireddy
  Cc: Brijesh Singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum,
	Borislav Petkov, Vlastimil Babka, Kirill A . Shutemov,
	Andi Kleen, Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On Thu, Jan 06, 2022 at 12:38:37PM -0600, Venu Busireddy wrote:
> On 2021-12-10 09:43:14 -0600, Brijesh Singh wrote:
> > From: Michael Roth <michael.roth@amd.com>
> > 
> > This code will also be used later for SEV-SNP-validated CPUID code in
> > some cases, so move it to a common helper.
> > 
> > Signed-off-by: Michael Roth <michael.roth@amd.com>
> > Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> > ---
> >  arch/x86/kernel/sev-shared.c | 84 +++++++++++++++++++++++++-----------
> >  1 file changed, 58 insertions(+), 26 deletions(-)
> > 
> > diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c
> > index 3aaef1a18ffe..d89481b31022 100644
> > --- a/arch/x86/kernel/sev-shared.c
> > +++ b/arch/x86/kernel/sev-shared.c
> > @@ -194,6 +194,58 @@ enum es_result sev_es_ghcb_hv_call(struct ghcb *ghcb, bool set_ghcb_msr,
> >  	return verify_exception_info(ghcb, ctxt);
> >  }
> >  
> > +static int sev_cpuid_hv(u32 func, u32 subfunc, u32 *eax, u32 *ebx,
> > +			u32 *ecx, u32 *edx)
> > +{
> > +	u64 val;
> > +
> > +	if (eax) {
> > +		sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(func, GHCB_CPUID_REQ_EAX));
> > +		VMGEXIT();
> > +		val = sev_es_rd_ghcb_msr();
> > +
> > +		if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
> > +			return -EIO;
> > +
> > +		*eax = (val >> 32);
> > +	}
> > +
> > +	if (ebx) {
> > +		sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(func, GHCB_CPUID_REQ_EBX));
> > +		VMGEXIT();
> > +		val = sev_es_rd_ghcb_msr();
> > +
> > +		if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
> > +			return -EIO;
> > +
> > +		*ebx = (val >> 32);
> > +	}
> > +
> > +	if (ecx) {
> > +		sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(func, GHCB_CPUID_REQ_ECX));
> > +		VMGEXIT();
> > +		val = sev_es_rd_ghcb_msr();
> > +
> > +		if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
> > +			return -EIO;
> > +
> > +		*ecx = (val >> 32);
> > +	}
> > +
> > +	if (edx) {
> > +		sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(func, GHCB_CPUID_REQ_EDX));
> > +		VMGEXIT();
> > +		val = sev_es_rd_ghcb_msr();
> > +
> > +		if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
> > +			return -EIO;
> > +
> > +		*edx = (val >> 32);
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> >  /*
> >   * Boot VC Handler - This is the first VC handler during boot, there is no GHCB
> >   * page yet, so it only supports the MSR based communication with the
> > @@ -202,39 +254,19 @@ enum es_result sev_es_ghcb_hv_call(struct ghcb *ghcb, bool set_ghcb_msr,
> >  void __init do_vc_no_ghcb(struct pt_regs *regs, unsigned long exit_code)
> >  {
> >  	unsigned int fn = lower_bits(regs->ax, 32);
> > -	unsigned long val;
> > +	u32 eax, ebx, ecx, edx;
> >  
> >  	/* Only CPUID is supported via MSR protocol */
> >  	if (exit_code != SVM_EXIT_CPUID)
> >  		goto fail;
> >  
> > -	sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(fn, GHCB_CPUID_REQ_EAX));
> > -	VMGEXIT();
> > -	val = sev_es_rd_ghcb_msr();
> > -	if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
> > +	if (sev_cpuid_hv(fn, 0, &eax, &ebx, &ecx, &edx))
> >  		goto fail;
> > -	regs->ax = val >> 32;
> >  
> > -	sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(fn, GHCB_CPUID_REQ_EBX));
> > -	VMGEXIT();
> > -	val = sev_es_rd_ghcb_msr();
> > -	if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
> > -		goto fail;
> > -	regs->bx = val >> 32;
> > -
> > -	sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(fn, GHCB_CPUID_REQ_ECX));
> > -	VMGEXIT();
> > -	val = sev_es_rd_ghcb_msr();
> > -	if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
> > -		goto fail;
> > -	regs->cx = val >> 32;
> > -
> > -	sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(fn, GHCB_CPUID_REQ_EDX));
> > -	VMGEXIT();
> > -	val = sev_es_rd_ghcb_msr();
> > -	if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
> > -		goto fail;
> > -	regs->dx = val >> 32;
> > +	regs->ax = eax;
> > +	regs->bx = ebx;
> > +	regs->cx = ecx;
> > +	regs->dx = edx;
> 
> What is the intent behind declaring e?x as local variables, instead
> of passing the addresses of regs->?x to sev_cpuid_hv()? Is it to
> prevent touching any of the regs->?x unless there is no error from
> sev_cpuid_hv()? If so, wouldn't it be better to hide this logic from
> the callers by declaring the local variables in sev_cpuid_hv() itself,
> and moving the four "*e?x = (val >> 32);" statements there to the end
> of the function (just before last the return)? With that change, the
> callers can safely pass the addresses of regs->?x to do_vc_no_ghcb(),
> knowing that the values will only be touched if there is no error?

For me it was more about readability. E?X are well-defined as 32-bit
values, whereas regs->?x are longs. It seemed more readable to me to
have sev_cpuid_hv()/snp_cpuid() expect/return the actual native types,
and leave it up to the caller to cast/shift if necessary.

It also seems more robust for future re-use, since, for instance, if we
ever introduced another callsite that happened to already use u32 locally,
it seems like it would be a mess trying to set up temp long* args or do
casts to pass them into these functions and then shift/cast them back just
so we could save a few lines at this particular callsite.
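
As a rough user-space sketch of that split (helper works in native u32,
caller widens into long-sized registers only after success), something
like the following. All fake_* names are hypothetical, the hypervisor
round trip is replaced by a stub, and the 0x005 response code in the low
12 bits is assumed from the GHCB MSR protocol. The loop is a compressed
form of the four unrolled blocks in the patch, not the patch itself.

```c
#include <errno.h>
#include <stdint.h>

#define FAKE_RESP_CODE(v)	((v) & 0xfffULL)
#define FAKE_CPUID_RESP		0x005ULL	/* assumed CPUID response code */

/* Hypothetical stand-in for struct pt_regs: registers are long-sized. */
struct fake_regs { unsigned long ax, bx, cx, dx; };

/* Register index whose response is corrupted by the stub, -1 for none. */
static int fake_fail_on = -1;

/* Stub for "write GHCB MSR, VMGEXIT, read GHCB MSR". */
static uint64_t fake_hv_cpuid(uint32_t func, unsigned int reg)
{
	uint32_t value = func + reg;	/* made-up register contents */

	if ((int)reg == fake_fail_on)
		return 0x100ULL;	/* anything but a CPUID response */
	return ((uint64_t)value << 32) | FAKE_CPUID_RESP;
}

/* Helper in the style of sev_cpuid_hv(): native 32-bit outputs. */
static int fake_cpuid_hv(uint32_t func, uint32_t *eax, uint32_t *ebx,
			 uint32_t *ecx, uint32_t *edx)
{
	uint32_t *out[4] = { eax, ebx, ecx, edx };

	for (unsigned int i = 0; i < 4; i++) {
		uint64_t val;

		if (!out[i])
			continue;	/* caller did not ask for this one */

		val = fake_hv_cpuid(func, i);
		if (FAKE_RESP_CODE(val) != FAKE_CPUID_RESP)
			return -EIO;

		*out[i] = (uint32_t)(val >> 32);
	}
	return 0;
}

/* Caller in the style of do_vc_no_ghcb(): widen only on success. */
static int fake_vc_cpuid(struct fake_regs *regs)
{
	uint32_t eax, ebx, ecx, edx;

	if (fake_cpuid_hv((uint32_t)regs->ax, &eax, &ebx, &ecx, &edx))
		return -EIO;

	regs->ax = eax;
	regs->bx = ebx;
	regs->cx = ecx;
	regs->dx = edx;
	return 0;
}
```

A side effect of the u32 locals is that the register state is left
untouched when any of the four requests fails, without the helper having
to know how wide the caller's storage is.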

> 
> Venu


* Re: [PATCH v8 25/40] x86/compressed/acpi: move EFI config table lookup to helper
  2021-12-10 15:43 ` [PATCH v8 25/40] x86/compressed/acpi: move EFI config " Brijesh Singh
@ 2022-01-06 20:33   ` Venu Busireddy
  0 siblings, 0 replies; 183+ messages in thread
From: Venu Busireddy @ 2022-01-06 20:33 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On 2021-12-10 09:43:17 -0600, Brijesh Singh wrote:
> From: Michael Roth <michael.roth@amd.com>
> 
> Future patches for SEV-SNP-validated CPUID will also require early
> parsing of the EFI configuration. Incrementally move the related code
> into a set of helpers that can be re-used for that purpose.
> 
> Signed-off-by: Michael Roth <michael.roth@amd.com>
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>

Reviewed-by: Venu Busireddy <venu.busireddy@oracle.com>

> ---
>  arch/x86/boot/compressed/acpi.c | 25 ++++++--------------
>  arch/x86/boot/compressed/efi.c  | 42 +++++++++++++++++++++++++++++++++
>  arch/x86/boot/compressed/misc.h |  9 +++++++
>  3 files changed, 58 insertions(+), 18 deletions(-)
> 
> diff --git a/arch/x86/boot/compressed/acpi.c b/arch/x86/boot/compressed/acpi.c
> index 9e784bd7b2e6..fea72a1504ff 100644
> --- a/arch/x86/boot/compressed/acpi.c
> +++ b/arch/x86/boot/compressed/acpi.c
> @@ -117,8 +117,9 @@ static acpi_physical_address kexec_get_rsdp_addr(void) { return 0; }
>  static acpi_physical_address efi_get_rsdp_addr(void)
>  {
>  #ifdef CONFIG_EFI
> -	unsigned long systab_tbl_pa, config_tables;
> -	unsigned int nr_tables;
> +	unsigned long cfg_tbl_pa = 0;
> +	unsigned long systab_tbl_pa;
> +	unsigned int cfg_tbl_len;
>  	bool efi_64;
>  	int ret;
>  
> @@ -134,23 +135,11 @@ static acpi_physical_address efi_get_rsdp_addr(void)
>  	if (ret)
>  		error("EFI support advertised, but unable to locate system table.");
>  
> -	/* Handle EFI bitness properly */
> -	if (efi_64) {
> -		efi_system_table_64_t *stbl = (efi_system_table_64_t *)systab_tbl_pa;
> +	ret = efi_get_conf_table(boot_params, &cfg_tbl_pa, &cfg_tbl_len, &efi_64);
> +	if (ret || !cfg_tbl_pa)
> +		error("EFI config table not found.");
>  
> -		config_tables	= stbl->tables;
> -		nr_tables	= stbl->nr_tables;
> -	} else {
> -		efi_system_table_32_t *stbl = (efi_system_table_32_t *)systab_tbl_pa;
> -
> -		config_tables	= stbl->tables;
> -		nr_tables	= stbl->nr_tables;
> -	}
> -
> -	if (!config_tables)
> -		error("EFI config tables not found.");
> -
> -	return __efi_get_rsdp_addr(config_tables, nr_tables, efi_64);
> +	return __efi_get_rsdp_addr(cfg_tbl_pa, cfg_tbl_len, efi_64);
>  #else
>  	return 0;
>  #endif
> diff --git a/arch/x86/boot/compressed/efi.c b/arch/x86/boot/compressed/efi.c
> index 1c626d28f07e..08ad517b0731 100644
> --- a/arch/x86/boot/compressed/efi.c
> +++ b/arch/x86/boot/compressed/efi.c
> @@ -70,3 +70,45 @@ int efi_get_system_table(struct boot_params *boot_params, unsigned long *sys_tbl
>  	*is_efi_64 = efi_64;
>  	return 0;
>  }
> +
> +/**
> + * efi_get_conf_table - Given boot_params, locate EFI system table from it
> + *                        and return the physical address of the EFI configuration table.
> + *
> + * @boot_params:        pointer to boot_params
> + * @cfg_tbl_pa:         location to store physical address of config table
> + * @cfg_tbl_len:        location to store number of config table entries
> + * @is_efi_64:          location to store whether using 64-bit EFI or not
> + *
> + * Return: 0 on success. On error, return params are left unchanged.
> + */
> +int efi_get_conf_table(struct boot_params *boot_params, unsigned long *cfg_tbl_pa,
> +		       unsigned int *cfg_tbl_len, bool *is_efi_64)
> +{
> +	unsigned long sys_tbl_pa = 0;
> +	int ret;
> +
> +	if (!cfg_tbl_pa || !cfg_tbl_len || !is_efi_64)
> +		return -EINVAL;
> +
> +	ret = efi_get_system_table(boot_params, &sys_tbl_pa, is_efi_64);
> +	if (ret)
> +		return ret;
> +
> +	/* Handle EFI bitness properly */
> +	if (*is_efi_64) {
> +		efi_system_table_64_t *stbl =
> +			(efi_system_table_64_t *)sys_tbl_pa;
> +
> +		*cfg_tbl_pa	= stbl->tables;
> +		*cfg_tbl_len	= stbl->nr_tables;
> +	} else {
> +		efi_system_table_32_t *stbl =
> +			(efi_system_table_32_t *)sys_tbl_pa;
> +
> +		*cfg_tbl_pa	= stbl->tables;
> +		*cfg_tbl_len	= stbl->nr_tables;
> +	}
> +
> +	return 0;
> +}
> diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
> index 165640f64b71..1c69592e83da 100644
> --- a/arch/x86/boot/compressed/misc.h
> +++ b/arch/x86/boot/compressed/misc.h
> @@ -181,6 +181,8 @@ unsigned long sev_verify_cbit(unsigned long cr3);
>  /* helpers for early EFI config table access */
>  int efi_get_system_table(struct boot_params *boot_params,
>  			 unsigned long *sys_tbl_pa, bool *is_efi_64);
> +int efi_get_conf_table(struct boot_params *boot_params, unsigned long *cfg_tbl_pa,
> +		       unsigned int *cfg_tbl_len, bool *is_efi_64);
>  #else
>  static inline int
>  efi_get_system_table(struct boot_params *boot_params,
> @@ -188,6 +190,13 @@ efi_get_system_table(struct boot_params *boot_params,
>  {
>  	return -ENOENT;
>  }
> +
> +static inline int
> +efi_get_conf_table(struct boot_params *boot_params, unsigned long *cfg_tbl_pa,
> +		   unsigned int *cfg_tbl_len, bool *is_efi_64)
> +{
> +	return -ENOENT;
> +}
>  #endif /* CONFIG_EFI */
>  
>  #endif /* BOOT_COMPRESSED_MISC_H */
> -- 
> 2.25.1
> 


* Re: [PATCH v8 22/40] x86/sev: move MSR-based VMGEXITs for CPUID to helper
  2022-01-06 20:21     ` Michael Roth
@ 2022-01-06 20:36       ` Venu Busireddy
  0 siblings, 0 replies; 183+ messages in thread
From: Venu Busireddy @ 2022-01-06 20:36 UTC (permalink / raw)
  To: Michael Roth
  Cc: Brijesh Singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum,
	Borislav Petkov, Vlastimil Babka, Kirill A . Shutemov,
	Andi Kleen, Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On 2022-01-06 14:21:35 -0600, Michael Roth wrote:
> On Thu, Jan 06, 2022 at 12:38:37PM -0600, Venu Busireddy wrote:
> > On 2021-12-10 09:43:14 -0600, Brijesh Singh wrote:
> > > From: Michael Roth <michael.roth@amd.com>
> > > 
> > > This code will also be used later for SEV-SNP-validated CPUID code in
> > > some cases, so move it to a common helper.
> > > 
> > > Signed-off-by: Michael Roth <michael.roth@amd.com>
> > > Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>

Reviewed-by: Venu Busireddy <venu.busireddy@oracle.com>

> > > ---
> > >  arch/x86/kernel/sev-shared.c | 84 +++++++++++++++++++++++++-----------
> > >  1 file changed, 58 insertions(+), 26 deletions(-)
> > > 
> > > diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c
> > > index 3aaef1a18ffe..d89481b31022 100644
> > > --- a/arch/x86/kernel/sev-shared.c
> > > +++ b/arch/x86/kernel/sev-shared.c
> > > @@ -194,6 +194,58 @@ enum es_result sev_es_ghcb_hv_call(struct ghcb *ghcb, bool set_ghcb_msr,
> > >  	return verify_exception_info(ghcb, ctxt);
> > >  }
> > >  
> > > +static int sev_cpuid_hv(u32 func, u32 subfunc, u32 *eax, u32 *ebx,
> > > +			u32 *ecx, u32 *edx)
> > > +{
> > > +	u64 val;
> > > +
> > > +	if (eax) {
> > > +		sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(func, GHCB_CPUID_REQ_EAX));
> > > +		VMGEXIT();
> > > +		val = sev_es_rd_ghcb_msr();
> > > +
> > > +		if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
> > > +			return -EIO;
> > > +
> > > +		*eax = (val >> 32);
> > > +	}
> > > +
> > > +	if (ebx) {
> > > +		sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(func, GHCB_CPUID_REQ_EBX));
> > > +		VMGEXIT();
> > > +		val = sev_es_rd_ghcb_msr();
> > > +
> > > +		if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
> > > +			return -EIO;
> > > +
> > > +		*ebx = (val >> 32);
> > > +	}
> > > +
> > > +	if (ecx) {
> > > +		sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(func, GHCB_CPUID_REQ_ECX));
> > > +		VMGEXIT();
> > > +		val = sev_es_rd_ghcb_msr();
> > > +
> > > +		if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
> > > +			return -EIO;
> > > +
> > > +		*ecx = (val >> 32);
> > > +	}
> > > +
> > > +	if (edx) {
> > > +		sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(func, GHCB_CPUID_REQ_EDX));
> > > +		VMGEXIT();
> > > +		val = sev_es_rd_ghcb_msr();
> > > +
> > > +		if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
> > > +			return -EIO;
> > > +
> > > +		*edx = (val >> 32);
> > > +	}
> > > +
> > > +	return 0;
> > > +}
> > > +
> > >  /*
> > >   * Boot VC Handler - This is the first VC handler during boot, there is no GHCB
> > >   * page yet, so it only supports the MSR based communication with the
> > > @@ -202,39 +254,19 @@ enum es_result sev_es_ghcb_hv_call(struct ghcb *ghcb, bool set_ghcb_msr,
> > >  void __init do_vc_no_ghcb(struct pt_regs *regs, unsigned long exit_code)
> > >  {
> > >  	unsigned int fn = lower_bits(regs->ax, 32);
> > > -	unsigned long val;
> > > +	u32 eax, ebx, ecx, edx;
> > >  
> > >  	/* Only CPUID is supported via MSR protocol */
> > >  	if (exit_code != SVM_EXIT_CPUID)
> > >  		goto fail;
> > >  
> > > -	sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(fn, GHCB_CPUID_REQ_EAX));
> > > -	VMGEXIT();
> > > -	val = sev_es_rd_ghcb_msr();
> > > -	if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
> > > +	if (sev_cpuid_hv(fn, 0, &eax, &ebx, &ecx, &edx))
> > >  		goto fail;
> > > -	regs->ax = val >> 32;
> > >  
> > > -	sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(fn, GHCB_CPUID_REQ_EBX));
> > > -	VMGEXIT();
> > > -	val = sev_es_rd_ghcb_msr();
> > > -	if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
> > > -		goto fail;
> > > -	regs->bx = val >> 32;
> > > -
> > > -	sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(fn, GHCB_CPUID_REQ_ECX));
> > > -	VMGEXIT();
> > > -	val = sev_es_rd_ghcb_msr();
> > > -	if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
> > > -		goto fail;
> > > -	regs->cx = val >> 32;
> > > -
> > > -	sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(fn, GHCB_CPUID_REQ_EDX));
> > > -	VMGEXIT();
> > > -	val = sev_es_rd_ghcb_msr();
> > > -	if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
> > > -		goto fail;
> > > -	regs->dx = val >> 32;
> > > +	regs->ax = eax;
> > > +	regs->bx = ebx;
> > > +	regs->cx = ecx;
> > > +	regs->dx = edx;
> > 
> > What is the intent behind declaring e?x as local variables, instead
> > of passing the addresses of regs->?x to sev_cpuid_hv()? Is it to
> > prevent touching any of the regs->?x unless there is no error from
> > sev_cpuid_hv()? If so, wouldn't it be better to hide this logic from
> > the callers by declaring the local variables in sev_cpuid_hv() itself,
> > and moving the four "*e?x = (val >> 32);" statements there to the end
> > of the function (just before last the return)? With that change, the
> > callers can safely pass the addresses of regs->?x to do_vc_no_ghcb(),
> > knowing that the values will only be touched if there is no error?
> 
> For me it was more about readability. E?X are well-defined as 32-bit
> values, whereas regs->?x are longs. It seemed more readable to me to
> have sev_cpuid_hv()/snp_cpuid() expect/return the actual native types,
> and leave it up to the caller to cast/shift if necessary.
> 
> It also seems more robust for future re-use, since, for instance, if we
> ever introduced another callsite that happened to already use u32 locally,
> it seems like it would be a mess trying to set up temp long* args or do
> casts to pass them into these functions and then shift/cast them back just
> so we could save a few lines at this particular callsite.

Got it.

Venu


* Re: [PATCH v8 26/40] x86/compressed/acpi: move EFI vendor table lookup to helper
  2021-12-10 15:43 ` [PATCH v8 26/40] x86/compressed/acpi: move EFI vendor " Brijesh Singh
@ 2022-01-06 20:47   ` Venu Busireddy
  0 siblings, 0 replies; 183+ messages in thread
From: Venu Busireddy @ 2022-01-06 20:47 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On 2021-12-10 09:43:18 -0600, Brijesh Singh wrote:
> From: Michael Roth <michael.roth@amd.com>
> 
> Future patches for SEV-SNP-validated CPUID will also require early
> parsing of the EFI configuration. Incrementally move the related code
> into a set of helpers that can be re-used for that purpose.
> 
> Signed-off-by: Michael Roth <michael.roth@amd.com>
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>

Reviewed-by: Venu Busireddy <venu.busireddy@oracle.com>

> ---
>  arch/x86/boot/compressed/acpi.c | 50 ++++++++-----------------
>  arch/x86/boot/compressed/efi.c  | 65 +++++++++++++++++++++++++++++++++
>  arch/x86/boot/compressed/misc.h |  9 +++++
>  3 files changed, 90 insertions(+), 34 deletions(-)
> 
> diff --git a/arch/x86/boot/compressed/acpi.c b/arch/x86/boot/compressed/acpi.c
> index fea72a1504ff..0670c8f8888a 100644
> --- a/arch/x86/boot/compressed/acpi.c
> +++ b/arch/x86/boot/compressed/acpi.c
> @@ -20,46 +20,28 @@
>   */
>  struct mem_vector immovable_mem[MAX_NUMNODES*2];
>  
> -/*
> - * Search EFI system tables for RSDP.  If both ACPI_20_TABLE_GUID and
> - * ACPI_TABLE_GUID are found, take the former, which has more features.
> - */
>  static acpi_physical_address
> -__efi_get_rsdp_addr(unsigned long config_tables, unsigned int nr_tables,
> -		    bool efi_64)
> +__efi_get_rsdp_addr(unsigned long cfg_tbl_pa, unsigned int cfg_tbl_len, bool efi_64)
>  {
>  	acpi_physical_address rsdp_addr = 0;
>  
>  #ifdef CONFIG_EFI
> -	int i;
> -
> -	/* Get EFI tables from systab. */
> -	for (i = 0; i < nr_tables; i++) {
> -		acpi_physical_address table;
> -		efi_guid_t guid;
> -
> -		if (efi_64) {
> -			efi_config_table_64_t *tbl = (efi_config_table_64_t *)config_tables + i;
> -
> -			guid  = tbl->guid;
> -			table = tbl->table;
> -
> -			if (!IS_ENABLED(CONFIG_X86_64) && table >> 32) {
> -				debug_putstr("Error getting RSDP address: EFI config table located above 4GB.\n");
> -				return 0;
> -			}
> -		} else {
> -			efi_config_table_32_t *tbl = (efi_config_table_32_t *)config_tables + i;
> -
> -			guid  = tbl->guid;
> -			table = tbl->table;
> -		}
> +	int ret;
>  
> -		if (!(efi_guidcmp(guid, ACPI_TABLE_GUID)))
> -			rsdp_addr = table;
> -		else if (!(efi_guidcmp(guid, ACPI_20_TABLE_GUID)))
> -			return table;
> -	}
> +	/*
> +	 * Search EFI system tables for RSDP. Preferred is ACPI_20_TABLE_GUID to
> +	 * ACPI_TABLE_GUID because it has more features.
> +	 */
> +	ret = efi_find_vendor_table(cfg_tbl_pa, cfg_tbl_len, ACPI_20_TABLE_GUID,
> +				    efi_64, (unsigned long *)&rsdp_addr);
> +	if (!ret)
> +		return rsdp_addr;
> +
> +	/* No ACPI_20_TABLE_GUID found, fallback to ACPI_TABLE_GUID. */
> +	ret = efi_find_vendor_table(cfg_tbl_pa, cfg_tbl_len, ACPI_TABLE_GUID,
> +				    efi_64, (unsigned long *)&rsdp_addr);
> +	if (ret)
> +		debug_putstr("Error getting RSDP address.\n");
>  #endif
>  	return rsdp_addr;
>  }
> diff --git a/arch/x86/boot/compressed/efi.c b/arch/x86/boot/compressed/efi.c
> index 08ad517b0731..c1ddc72ef4d9 100644
> --- a/arch/x86/boot/compressed/efi.c
> +++ b/arch/x86/boot/compressed/efi.c
> @@ -112,3 +112,68 @@ int efi_get_conf_table(struct boot_params *boot_params, unsigned long *cfg_tbl_p
>  
>  	return 0;
>  }
> +
> +/* Get vendor table address/guid from EFI config table at the given index */
> +static int get_vendor_table(void *cfg_tbl, unsigned int idx,
> +			    unsigned long *vendor_tbl_pa,
> +			    efi_guid_t *vendor_tbl_guid,
> +			    bool efi_64)
> +{
> +	if (efi_64) {
> +		efi_config_table_64_t *tbl_entry =
> +			(efi_config_table_64_t *)cfg_tbl + idx;
> +
> +		if (!IS_ENABLED(CONFIG_X86_64) && tbl_entry->table >> 32) {
> +			debug_putstr("Error: EFI config table entry located above 4GB.\n");
> +			return -EINVAL;
> +		}
> +
> +		*vendor_tbl_pa		= tbl_entry->table;
> +		*vendor_tbl_guid	= tbl_entry->guid;
> +
> +	} else {
> +		efi_config_table_32_t *tbl_entry =
> +			(efi_config_table_32_t *)cfg_tbl + idx;
> +
> +		*vendor_tbl_pa		= tbl_entry->table;
> +		*vendor_tbl_guid	= tbl_entry->guid;
> +	}
> +
> +	return 0;
> +}
> +
> +/**
> + * efi_find_vendor_table - Given EFI config table, search it for the physical
> + *                         address of the vendor table associated with GUID.
> + *
> + * @cfg_tbl_pa:        pointer to EFI configuration table
> + * @cfg_tbl_len:       number of entries in EFI configuration table
> + * @guid:              GUID of vendor table
> + * @efi_64:            true if using 64-bit EFI
> + * @vendor_tbl_pa:     location to store physical address of vendor table
> + *
> + * Return: 0 on success. On error, return params are left unchanged.
> + */
> +int efi_find_vendor_table(unsigned long cfg_tbl_pa, unsigned int cfg_tbl_len,
> +			  efi_guid_t guid, bool efi_64, unsigned long *vendor_tbl_pa)
> +{
> +	unsigned int i;
> +
> +	for (i = 0; i < cfg_tbl_len; i++) {
> +		unsigned long vendor_tbl_pa_tmp;
> +		efi_guid_t vendor_tbl_guid;
> +		int ret;
> +
> +		if (get_vendor_table((void *)cfg_tbl_pa, i,
> +				     &vendor_tbl_pa_tmp,
> +				     &vendor_tbl_guid, efi_64))
> +			return -EINVAL;
> +
> +		if (!efi_guidcmp(guid, vendor_tbl_guid)) {
> +			*vendor_tbl_pa = vendor_tbl_pa_tmp;
> +			return 0;
> +		}
> +	}
> +
> +	return -ENOENT;
> +}
> diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
> index 1c69592e83da..e9fde1482fbe 100644
> --- a/arch/x86/boot/compressed/misc.h
> +++ b/arch/x86/boot/compressed/misc.h
> @@ -183,6 +183,8 @@ int efi_get_system_table(struct boot_params *boot_params,
>  			 unsigned long *sys_tbl_pa, bool *is_efi_64);
>  int efi_get_conf_table(struct boot_params *boot_params, unsigned long *cfg_tbl_pa,
>  		       unsigned int *cfg_tbl_len, bool *is_efi_64);
> +int efi_find_vendor_table(unsigned long cfg_tbl_pa, unsigned int cfg_tbl_len,
> +			  efi_guid_t guid, bool efi_64, unsigned long *vendor_tbl_pa);
>  #else
>  static inline int
>  efi_get_system_table(struct boot_params *boot_params,
> @@ -197,6 +199,13 @@ efi_get_conf_table(struct boot_params *boot_params, unsigned long *cfg_tbl_pa,
>  {
>  	return -ENOENT;
>  }
> +
> +static inline int
> +efi_find_vendor_table(unsigned long cfg_tbl_pa, unsigned int cfg_tbl_len,
> +		      efi_guid_t guid, bool efi_64, unsigned long *vendor_tbl_pa)
> +{
> +	return -ENOENT;
> +}
>  #endif /* CONFIG_EFI */
>  
>  #endif /* BOOT_COMPRESSED_MISC_H */
> -- 
> 2.25.1
> 
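
For readers following the refactoring, the lookup order in the quoted
patch (search for the ACPI 2.0 GUID first, fall back to the ACPI 1.0 GUID)
can be modelled in user space along these lines. This is a sketch under
assumptions: the fake_* types and functions are hypothetical, the GUIDs
passed in are placeholders rather than the real ACPI GUIDs, and the
above-4GB check for 32-bit kernels is left out.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

typedef struct { uint8_t b[16]; } fake_guid_t;

/* Config table entries differ in size between 64-bit and 32-bit EFI. */
struct fake_cfg_entry_64 { fake_guid_t guid; uint64_t table; };
struct fake_cfg_entry_32 { fake_guid_t guid; uint32_t table; };

/* Returns 0 and stores the table address if guid is found, else -ENOENT. */
static int fake_find_vendor_table(const void *cfg_tbl, unsigned int len,
				  fake_guid_t guid, bool efi_64,
				  uint64_t *vendor_tbl_pa)
{
	for (unsigned int i = 0; i < len; i++) {
		fake_guid_t entry_guid;
		uint64_t table;

		/* Index with the entry size that matches the EFI bitness. */
		if (efi_64) {
			const struct fake_cfg_entry_64 *e =
				(const struct fake_cfg_entry_64 *)cfg_tbl + i;

			entry_guid = e->guid;
			table = e->table;
		} else {
			const struct fake_cfg_entry_32 *e =
				(const struct fake_cfg_entry_32 *)cfg_tbl + i;

			entry_guid = e->guid;
			table = e->table;
		}

		if (!memcmp(&entry_guid, &guid, sizeof(guid))) {
			*vendor_tbl_pa = table;
			return 0;
		}
	}
	return -ENOENT;
}

/* Prefer the newer table; returns 0 if neither GUID is present. */
static uint64_t fake_get_rsdp(const void *cfg_tbl, unsigned int len,
			      bool efi_64, fake_guid_t acpi20,
			      fake_guid_t acpi10)
{
	uint64_t rsdp = 0;

	if (!fake_find_vendor_table(cfg_tbl, len, acpi20, efi_64, &rsdp))
		return rsdp;

	/* On a miss rsdp stays 0, since outputs are untouched on error. */
	fake_find_vendor_table(cfg_tbl, len, acpi10, efi_64, &rsdp);
	return rsdp;
}
```

Compared with the single loop that was removed, this makes two passes over
the table in the fallback case, which seems an acceptable trade for a
helper that other early-boot code can reuse.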


* Re: [PATCH v8 13/40] x86/kernel: Make the bss.decrypted section shared in RMP table
  2022-01-06 20:16               ` Venu Busireddy
@ 2022-01-06 20:50                 ` Tom Lendacky
  0 siblings, 0 replies; 183+ messages in thread
From: Tom Lendacky @ 2022-01-06 20:50 UTC (permalink / raw)
  To: Venu Busireddy
  Cc: Brijesh Singh, Dave Hansen, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, H. Peter Anvin, Ard Biesheuvel,
	Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Jim Mattson, Andy Lutomirski, Dave Hansen, Sergio Lopez,
	Peter Gonda, Peter Zijlstra, Srinivas Pandruvada, David Rientjes,
	Dov Murik, Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On 1/6/22 2:16 PM, Venu Busireddy wrote:
> On 2022-01-06 13:06:13 -0600, Tom Lendacky wrote:
>> On 1/6/22 11:40 AM, Venu Busireddy wrote:
>>> On 2022-01-05 15:39:22 -0600, Brijesh Singh wrote:
>>>>
>>>>
>>>> On 1/5/22 2:27 PM, Dave Hansen wrote:
>>>>> On 1/5/22 11:52, Brijesh Singh wrote:
>>>>>>>>             for (; vaddr < vaddr_end; vaddr += PMD_SIZE) {
>>>>>>>> +            /*
>>>>>>>> +             * When SEV-SNP is active then transition the
>>>>>>>> page to shared in the RMP
>>>>>>>> +             * table so that it is consistent with the page
>>>>>>>> table attribute change.
>>>>>>>> +             */
>>>>>>>> +            early_snp_set_memory_shared(__pa(vaddr),
>>>>>>>> __pa(vaddr), PTRS_PER_PMD);
>>>>>>>
>>>>>>> Shouldn't the first argument be vaddr as below?
>>>>>>
>>>>>> Nope, sme_postprocess_startup() is called while we are fixing the
>>>>>> initial page table and running with identity mapping (so va == pa).
>>>>>
>>>>> I'm not sure I've ever seen a line of code that wanted a comment so badly.
>>>>
>>>> The early_snp_set_memory_shared() call the PVALIDATE instruction to clear
>>>> the validated bit from the BSS region. The PVALIDATE instruction needs a
>>>> virtual address, so we need to use the identity mapped virtual address so
>>>> that PVALIDATE can clear the validated bit. I will add more comments to
>>>> clarify it.
>>>
>>> Looking forward to see your final comments explaining this. I can't
>>> still follow why, when PVALIDATE needs the virtual address, we are doing
>>> a __pa() on the vaddr.
>>
>> It's because of the phase of booting that the kernel is in. At this point,
>> the kernel is running in identity mapped mode (VA == PA). The
>> __start_bss_decrypted address is a regular kernel address, e.g. for the
>> kernel I'm on it is 0xffffffffa7600000. Since the PVALIDATE instruction
>> requires a valid virtual address, the code needs to perform a __pa() against
>> __start_bss_decrypted to get the identity mapped virtual address that is
>> currently in place.
> 
> Perhaps  my confusion stems from the fact that __pa(x) is defined either
> as "((unsigned long ) (x))" (for the cases where paddr and vaddr are
> same), or as "__phys_addr((unsigned long )(x))", where a vaddr needs to
> be converted to a paddr. If the paddr and vaddr are same in our case,
> what exactly is the _pa(vaddr) doing to the vaddr?

But they are not the same and the head64.c file is compiled without 
defining a value for __pa(), so __pa() is __phys_addr((unsigned long)(x)). 
The virtual address value of __start_bss_decrypted, for me, is 
0xffffffffa7600000, and that does not equal the physical address (take a 
look at your /proc/kallsyms). However, since the code is running identity 
mapped and with a page table without kernel virtual addresses, it cannot 
use that value. It needs to convert that value to the identity mapped 
virtual address and that is done using __pa(). Only after using __pa() on 
__start_bss_decrypted, do you get a virtual address that maps to and is 
equal to the physical address.
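
To make the arithmetic concrete, here is a minimal user-space sketch. It
assumes the usual x86-64 rule that a kernel-image virtual address converts
to a physical one as x - __START_KERNEL_map + phys_base. The helper names
and the phys_base values are made up for illustration, and the direct-map
case that the real __phys_addr() also handles is left out.

```c
#include <stdint.h>

/* Base of the x86-64 kernel image mapping (the value of __START_KERNEL_map). */
#define SKETCH_START_KERNEL_MAP 0xffffffff80000000ULL

/*
 * Simplified stand-in for __phys_addr() on a kernel-image address:
 * strip the image mapping base, then add the physical load offset.
 * phys_base is a caller-supplied, hypothetical value here.
 */
uint64_t sketch_phys_addr(uint64_t vaddr, uint64_t phys_base)
{
	return vaddr - SKETCH_START_KERNEL_MAP + phys_base;
}

/*
 * While the identity mapping is live, the address to dereference is
 * numerically the physical address, so the __pa() result doubles as
 * the virtual address that PVALIDATE can be given.
 */
uint64_t sketch_identity_vaddr(uint64_t kernel_vaddr, uint64_t phys_base)
{
	return sketch_phys_addr(kernel_vaddr, phys_base);
}
```

With the address quoted above and a phys_base of 0, the sketch yields
0x27600000, which is what the identity-mapped page table can resolve.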

You may want to step through the boot code using KVM to see what the 
environment is and why things are done the way they are.

Thanks,
Tom

> 
> Venu
> 

^ permalink raw reply	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 27/40] x86/boot: Add Confidential Computing type to setup_data
  2021-12-10 15:43 ` [PATCH v8 27/40] x86/boot: Add Confidential Computing type to setup_data Brijesh Singh
  2021-12-10 19:12   ` Dave Hansen
@ 2022-01-06 22:48   ` Venu Busireddy
  1 sibling, 0 replies; 183+ messages in thread
From: Venu Busireddy @ 2022-01-06 22:48 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On 2021-12-10 09:43:19 -0600, Brijesh Singh wrote:
> While launching the encrypted guests, the hypervisor may need to provide
> some additional information during the guest boot. When booting under the
> EFI based BIOS, the EFI configuration table contains an entry for the
> confidential computing blob that contains the required information.
> 
> To support booting encrypted guests on non-EFI VM, the hypervisor needs to
> pass this additional information to the kernel with a different method.
> 
> For this purpose, introduce SETUP_CC_BLOB type in setup_data to hold the
> physical address of the confidential computing blob location. The boot
> loader or hypervisor may choose to use this method instead of EFI
> configuration table. The CC blob location scanning should give preference
> to setup_data data over the EFI configuration table.
> 
> In AMD SEV-SNP, the CC blob contains the address of the secrets and CPUID
> pages. The secrets page includes information such as a VM to PSP
> communication key and CPUID page contains PSP filtered CPUID values.
> Define the AMD SEV confidential computing blob structure.
> 
> While at it, define the EFI GUID for the confidential computing blob.
> 
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>

Reviewed-by: Venu Busireddy <venu.busireddy@oracle.com>

> ---
>  arch/x86/include/asm/sev.h            | 12 ++++++++++++
>  arch/x86/include/uapi/asm/bootparam.h |  1 +
>  include/linux/efi.h                   |  1 +
>  3 files changed, 14 insertions(+)
> 
> diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
> index f7cbd5164136..f42fbe3c332f 100644
> --- a/arch/x86/include/asm/sev.h
> +++ b/arch/x86/include/asm/sev.h
> @@ -44,6 +44,18 @@ struct es_em_ctxt {
>  
>  void do_vc_no_ghcb(struct pt_regs *regs, unsigned long exit_code);
>  
> +/* AMD SEV Confidential computing blob structure */
> +#define CC_BLOB_SEV_HDR_MAGIC	0x45444d41
> +struct cc_blob_sev_info {
> +	u32 magic;
> +	u16 version;
> +	u16 reserved;
> +	u64 secrets_phys;
> +	u32 secrets_len;
> +	u64 cpuid_phys;
> +	u32 cpuid_len;
> +};
> +
>  static inline u64 lower_bits(u64 val, unsigned int bits)
>  {
>  	u64 mask = (1ULL << bits) - 1;
> diff --git a/arch/x86/include/uapi/asm/bootparam.h b/arch/x86/include/uapi/asm/bootparam.h
> index b25d3f82c2f3..1ac5acca72ce 100644
> --- a/arch/x86/include/uapi/asm/bootparam.h
> +++ b/arch/x86/include/uapi/asm/bootparam.h
> @@ -10,6 +10,7 @@
>  #define SETUP_EFI			4
>  #define SETUP_APPLE_PROPERTIES		5
>  #define SETUP_JAILHOUSE			6
> +#define SETUP_CC_BLOB			7
>  
>  #define SETUP_INDIRECT			(1<<31)
>  
> diff --git a/include/linux/efi.h b/include/linux/efi.h
> index dbd39b20e034..a022aed7adb3 100644
> --- a/include/linux/efi.h
> +++ b/include/linux/efi.h
> @@ -344,6 +344,7 @@ void efi_native_runtime_setup(void);
>  #define EFI_CERT_SHA256_GUID			EFI_GUID(0xc1c41626, 0x504c, 0x4092, 0xac, 0xa9, 0x41, 0xf9, 0x36, 0x93, 0x43, 0x28)
>  #define EFI_CERT_X509_GUID			EFI_GUID(0xa5c059a1, 0x94e4, 0x4aa7, 0x87, 0xb5, 0xab, 0x15, 0x5c, 0x2b, 0xf0, 0x72)
>  #define EFI_CERT_X509_SHA256_GUID		EFI_GUID(0x3bd2a492, 0x96c0, 0x4079, 0xb4, 0x20, 0xfc, 0xf9, 0x8e, 0xf1, 0x03, 0xed)
> +#define EFI_CC_BLOB_GUID			EFI_GUID(0x067b1f5f, 0xcf26, 0x44c5, 0x85, 0x54, 0x93, 0xd7, 0x77, 0x91, 0x2d, 0x42)
>  
>  /*
>   * This GUID is used to pass to the kernel proper the struct screen_info
> -- 
> 2.25.1
> 

^ permalink raw reply	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 27/40] x86/boot: Add Confidential Computing type to setup_data
  2021-12-13 15:08           ` Dave Hansen
  2021-12-13 15:55             ` Brijesh Singh
@ 2022-01-07 11:54             ` Borislav Petkov
  1 sibling, 0 replies; 183+ messages in thread
From: Borislav Petkov @ 2022-01-07 11:54 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Brijesh Singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On Mon, Dec 13, 2021 at 07:08:07AM -0800, Dave Hansen wrote:
> Could you please make the structure's size invariant?  That's great if
> there's no problem in today's implementation, but it's best no to leave
> little land mines like this around.  Let's say someone copies your code
> as an example of something that interacts with a firmware table a few
> years or months down the road.
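
One way to address the request above, as a sketch only: spell out the
padding and pack the structure so its size cannot depend on how the
compiler aligns u64. The struct names and the rsvd1/rsvd2 fields are
hypothetical, and __attribute__((packed)) assumes GCC or Clang. As posted,
the structure should come to 40 bytes where u64 is 8-byte aligned and 32
bytes where it is only 4-byte aligned (e.g. 32-bit x86).

```c
#include <stddef.h>
#include <stdint.h>

/* Layout as posted: padding, and thus sizeof(), depends on u64 alignment. */
struct cc_blob_as_posted {
	uint32_t magic;
	uint16_t version;
	uint16_t reserved;
	uint64_t secrets_phys;
	uint32_t secrets_len;
	uint64_t cpuid_phys;
	uint32_t cpuid_len;
};

/* Hypothetical fixed layout: every hole is an explicit field. */
struct cc_blob_fixed {
	uint32_t magic;
	uint16_t version;
	uint16_t reserved;
	uint64_t secrets_phys;
	uint32_t secrets_len;
	uint32_t rsvd1;		/* explicit hole before cpuid_phys */
	uint64_t cpuid_phys;
	uint32_t cpuid_len;
	uint32_t rsvd2;		/* explicit tail padding */
} __attribute__((packed));

/* Fail the build if the layout ever drifts. */
_Static_assert(sizeof(struct cc_blob_fixed) == 40, "blob size must not vary");
_Static_assert(offsetof(struct cc_blob_fixed, cpuid_phys) == 24,
	       "cpuid_phys offset must not vary");
```

The static assertions are the part someone copying the code would
benefit from most: they turn a silent layout change into a build failure.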

Btw, about that cc blob thing: is TDX going to need something like that
too and if so, can they use it too?

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 28/40] KVM: SEV: Add documentation for SEV-SNP CPUID Enforcement
  2021-12-10 15:43 ` [PATCH v8 28/40] KVM: SEV: Add documentation for SEV-SNP CPUID Enforcement Brijesh Singh
@ 2022-01-07 13:22   ` Borislav Petkov
  0 siblings, 0 replies; 183+ messages in thread
From: Borislav Petkov @ 2022-01-07 13:22 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	tony.luck, marcorr, sathyanarayanan.kuppuswamy

On Fri, Dec 10, 2021 at 09:43:20AM -0600, Brijesh Singh wrote:
> From: Michael Roth <michael.roth@amd.com>
> 
> Update the documentation with SEV-SNP CPUID enforcement.
> 
> Signed-off-by: Michael Roth <michael.roth@amd.com>
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> ---
>  .../virt/kvm/amd-memory-encryption.rst        | 28 +++++++++++++++++++
>  1 file changed, 28 insertions(+)
> 
> diff --git a/Documentation/virt/kvm/amd-memory-encryption.rst b/Documentation/virt/kvm/amd-memory-encryption.rst
> index 5c081c8c7164..aa8292fa579a 100644
> --- a/Documentation/virt/kvm/amd-memory-encryption.rst
> +++ b/Documentation/virt/kvm/amd-memory-encryption.rst
> @@ -427,6 +427,34 @@ issued by the hypervisor to make the guest ready for execution.
>  
>  Returns: 0 on success, -negative on error
>  
> +SEV-SNP CPUID Enforcement
> +=========================
> +
> +SEV-SNP guests can access a special page that contains a table of CPUID values
> +that have been validated by the PSP as part of SNP_LAUNCH_UPDATE firmware
						 ^
						 the

> +command. It provides the following assurances regarding the validity of CPUID
> +values:
> +
> + - Its address is obtained via bootloader/firmware (via CC blob), whose
> +   binares will be measured as part of the SEV-SNP attestation report.

Unknown word [binares] in Documentation.
Suggestions: ['binaries', 'Linares', 'bi nares', 'bi-nares', 'bin ares', 'bin-ares', 'nares']

Also:

s/whose binaries/and those binaries/

> + - Its initial state will be encrypted/pvalidated, so attempts to modify
> +   it during run-time will be result in garbage being written, or #VC

s/be //

> +   exceptions being generated due to changes in validation state if the
> +   hypervisor tries to swap the backing page.
> + - Attempts to bypass PSP checks by hypervisor by using a normal page, or a
				      ^
				      the

> +   non-CPUID encrypted page will change the measurement provided by the
> +   SEV-SNP attestation report.
> + - The CPUID page contents are *not* measured, but attempts to modify the
> +   expected contents of a CPUID page as part of guest initialization will be
> +   gated by the PSP CPUID enforcement policy checks performed on the page
> +   during SNP_LAUNCH_UPDATE, and noticeable later if the guest owner
> +   implements their own checks of the CPUID values.
> +
> +It is important to note that this last assurance is only useful if the kernel
> +has taken care to make use of the SEV-SNP CPUID throughout all stages of boot.
> +Otherwise guest owner attestation provides no assurance that the kernel wasn't
	    ^
	    ,

> +fed incorrect values at some point during boot.
> +
>  References
>  ==========
>  
> -- 
> 2.25.1
> 

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 01/40] x86/compressed/64: detect/setup SEV/SME features earlier in boot
  2022-01-05 19:34                             ` Brijesh Singh
@ 2022-01-10 20:46                               ` Brijesh Singh
  2022-01-10 21:17                                 ` Venu Busireddy
  0 siblings, 1 reply; 183+ messages in thread
From: Brijesh Singh @ 2022-01-10 20:46 UTC (permalink / raw)
  To: Venu Busireddy, Michael Roth
  Cc: brijesh.singh, Borislav Petkov, Tom Lendacky, x86, linux-kernel,
	kvm, linux-efi, platform-driver-x86, linux-coco, linux-mm,
	Thomas Gleixner, Ingo Molnar, Joerg Roedel, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

Hi Venu,

On 1/5/22 1:34 PM, Brijesh Singh wrote:
> 
> 
> On 1/3/22 1:10 PM, Venu Busireddy wrote:
>> On 2021-12-15 15:22:57 -0600, Michael Roth wrote:
>>> On Wed, Dec 15, 2021 at 09:38:55PM +0100, Borislav Petkov wrote:
>>>>
>>>> But it is hard to discuss anything without patches so we can continue
>>>> the topic with concrete patches. But this unification is not
>>>> super-pressing so it can go ontop of the SNP pile.
>>>
>>> Yah, it's all theoretical at this point. Didn't mean to derail things
>>> though. I mainly brought it up to suggest that Venu's original 
>>> approach of
>>> returning the encryption bit via a pointer argument might make it 
>>> easier to
>>> expand it for other purposes in the future, and that naming it for that
>>> future purpose might encourage future developers to focus their efforts
>>> there instead of potentially re-introducing duplicate code.
>>>
>>> But either way it's simple enough to rework things when we actually
>>> cross that bridge. So totally fine with saving all of this as a future
>>> follow-up, or picking up either of Venu's patches for now if you'd still
>>> prefer.
>>
>> So, what is the consensus? Do you want me to submit a patch after the
>> SNP changes go upstream? Or, do you want to roll in one of the patches
>> that I posted earlier?
>>
> 
> Will incorporate your changes in v9. And will see what others say about it.
> 

Now that I am incorporating the feedback in my wip branch, I am dropping 
your cleanup for now, mainly because some of the recommendations may 
require more rework down the line; you can submit them as a cleanup 
after the patches are in. I hope this is okay with you.

thanks

^ permalink raw reply	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 01/40] x86/compressed/64: detect/setup SEV/SME features earlier in boot
  2022-01-10 20:46                               ` Brijesh Singh
@ 2022-01-10 21:17                                 ` Venu Busireddy
  2022-01-10 21:38                                   ` Borislav Petkov
  0 siblings, 1 reply; 183+ messages in thread
From: Venu Busireddy @ 2022-01-10 21:17 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: Michael Roth, Borislav Petkov, Tom Lendacky, x86, linux-kernel,
	kvm, linux-efi, platform-driver-x86, linux-coco, linux-mm,
	Thomas Gleixner, Ingo Molnar, Joerg Roedel, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On 2022-01-10 14:46:27 -0600, Brijesh Singh wrote:
> Hi Venu,
> 
> On 1/5/22 1:34 PM, Brijesh Singh wrote:
> > 
> > 
> > On 1/3/22 1:10 PM, Venu Busireddy wrote:
> > > On 2021-12-15 15:22:57 -0600, Michael Roth wrote:
> > > > On Wed, Dec 15, 2021 at 09:38:55PM +0100, Borislav Petkov wrote:
> > > > > 
> > > > > But it is hard to discuss anything without patches so we can continue
> > > > > the topic with concrete patches. But this unification is not
> > > > > super-pressing so it can go ontop of the SNP pile.
> > > > 
> > > > Yah, it's all theoretical at this point. Didn't mean to derail things
> > > > though. I mainly brought it up to suggest that Venu's original
> > > > approach of
> > > > returning the encryption bit via a pointer argument might make
> > > > it easier to
> > > > expand it for other purposes in the future, and that naming it for that
> > > > future purpose might encourage future developers to focus their efforts
> > > > there instead of potentially re-introducing duplicate code.
> > > > 
> > > > But either way it's simple enough to rework things when we actually
> > > > cross that bridge. So totally fine with saving all of this as a future
> > > > follow-up, or picking up either of Venu's patches for now if you'd still
> > > > prefer.
> > > 
> > > So, what is the consensus? Do you want me to submit a patch after the
> > > SNP changes go upstream? Or, do you want to roll in one of the patches
> > > that I posted earlier?
> > > 
> > 
> > Will incorporate your changes in v9. And will see what others say about it.
> > 
> 
> Now that I am incorporating the feedback in my wip branch, at this time I am
> dropping your cleanup mainly because some of recommendation may require more
> rework down the line; you can submit your recommendation as cleanup after
> the patches are in. I hope this is okay with you.

Can't we do that rework (if any) as and when it is needed? I am worried
that we will never get this in!

Venu


^ permalink raw reply	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 01/40] x86/compressed/64: detect/setup SEV/SME features earlier in boot
  2022-01-10 21:17                                 ` Venu Busireddy
@ 2022-01-10 21:38                                   ` Borislav Petkov
  0 siblings, 0 replies; 183+ messages in thread
From: Borislav Petkov @ 2022-01-10 21:38 UTC (permalink / raw)
  To: Venu Busireddy
  Cc: Brijesh Singh, Michael Roth, Tom Lendacky, x86, linux-kernel,
	kvm, linux-efi, platform-driver-x86, linux-coco, linux-mm,
	Thomas Gleixner, Ingo Molnar, Joerg Roedel, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On Mon, Jan 10, 2022 at 03:17:56PM -0600, Venu Busireddy wrote:
> Can't we do that rework (if any) as and when it is needed? I am worried
> that we will never get this in!

In case you've missed it from a previous mail on that same thread:

"But this unification is not super-pressing so it can go ontop of the
SNP pile."

So such cleanups go on top, when the dust settles and when we realize
that there really are parts which can be unified. Right now, everything
is moving so first things first.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 24/40] x86/compressed/acpi: move EFI system table lookup to helper
  2021-12-13 15:47     ` Michael Roth
  2021-12-13 16:21       ` Dave Hansen
@ 2022-01-11  8:59       ` Chao Fan
  1 sibling, 0 replies; 183+ messages in thread
From: Chao Fan @ 2022-01-11  8:59 UTC (permalink / raw)
  To: Michael Roth
  Cc: Dave Hansen, fanc.fnst, j-nomura, bp, Brijesh Singh, x86,
	linux-kernel, kvm, linux-efi, platform-driver-x86, linux-coco,
	linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	tony.luck, marcorr, sathyanarayanan.kuppuswamy

Hi, I am the Chao Fan mentioned here; <fanc.fnst@cn.fujitsu.com> is no
longer in use. Please use fanchao.njupt@gmail.com for me instead.
Many thanks.

Thanks,
Chao Fan

On Tue, Dec 14, 2021 at 11:46, Michael Roth <michael.roth@amd.com> wrote:
>
> On Fri, Dec 10, 2021 at 10:54:35AM -0800, Dave Hansen wrote:
> > On 12/10/21 7:43 AM, Brijesh Singh wrote:
> > > +/*
> > > + * Helpers for early access to EFI configuration table
> > > + *
> > > + * Copyright (C) 2021 Advanced Micro Devices, Inc.
> > > + *
> > > + * Author: Michael Roth <michael.roth@amd.com>
> > > + */
> >
> > It doesn't seem quite right to slap this copyright on a file that's full
> > of content that came from other files.  It would be one thing if
> > arch/x86/boot/compressed/acpi.c had this banner in it already.  Also, a
>
> Yah, acpi.c didn't have any copyright banner so I used my 'default'
> template for new files here to cover any additions, but that does give
> a misleading impression.
>
> I'm not sure how this is normally addressed, but I'm planning on just
> continuing the acpi.c tradition of *not* adding copyright notices for new
> code, and simply document that the contents of the file are mostly movement
> from acpi.c
>
> > arch/x86/boot/compressed/acpi.c had this banner in it already.  Also, a
> > bunch of the lines in this file seem to come from:
> >
> >       commit 33f0df8d843deb9ec24116dcd79a40ca0ea8e8a9
> >       Author: Chao Fan <fanc.fnst@cn.fujitsu.com>
> >       Date:   Wed Jan 23 19:08:46 2019 +0800
>
> AFAICT the full author list for the changes in question are, in
> alphabetical order:
>
>   Chao Fan <fanc.fnst@cn.fujitsu.com>
>   Junichi Nomura <j-nomura@ce.jp.nec.com>
>   Borislav Petkov <bp@suse.de>
>
> Chao, Junichi, Borislav,
>
> If you would like to be listed as an author in efi.c (which is mainly just a
> movement of EFI config table parsing code from acpi.c into re-usable helper
> functions in efi.c), please let me know and I'll add you.
>
> Otherwise, I'll plan on adopting the acpi.c precedent for this as well, which
> is to not list individual authors, since it doesn't seem right to add Author
> fields retroactively without their permission.

^ permalink raw reply	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 12/40] x86/sev: Add helper for validating pages in early enc attribute changes
  2022-01-03 23:28   ` Venu Busireddy
@ 2022-01-11 21:22     ` Brijesh Singh
  2022-01-11 21:51       ` Venu Busireddy
  0 siblings, 1 reply; 183+ messages in thread
From: Brijesh Singh @ 2022-01-11 21:22 UTC (permalink / raw)
  To: Venu Busireddy
  Cc: brijesh.singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum,
	Borislav Petkov, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	tony.luck, marcorr, sathyanarayanan.kuppuswamy

Hi Venu,


On 1/3/22 5:28 PM, Venu Busireddy wrote:
...

>> +
>> +	 /*
>> +	  * Ask the hypervisor to mark the memory pages as private in the RMP
>> +	  * table.
>> +	  */
> 
> Indentation is off. While at it, you may want to collapse it into a one
> line comment.
> 

Based on previous review feedback, I tried to keep the comments within 
the 80-character limit.

>> +	early_set_page_state(paddr, npages, SNP_PAGE_STATE_PRIVATE);
>> +
>> +	/* Validate the memory pages after they've been added in the RMP table. */
>> +	pvalidate_pages(vaddr, npages, 1);
>> +}
>> +
>> +void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr,
>> +					unsigned int npages)
>> +{
>> +	if (!cc_platform_has(CC_ATTR_SEV_SNP))
>> +		return;
>> +
>> +	/*
>> +	 * Invalidate the memory pages before they are marked shared in the
>> +	 * RMP table.
>> +	 */
> 
> Collapse into one line?
> 

same as above.

...

>> +		/*
>> +		 * ON SNP, the page state in the RMP table must happen
>> +		 * before the page table updates.
>> +		 */
>> +		early_snp_set_memory_shared((unsigned long)__va(pa), pa, 1);
> 
> I know "1" implies "true", but to emphasize that the argument is
> actually a boolean, could you please change the "1" to "true?"
> 

I assume you mean the last argument to 
early_snp_set_memory_{private,shared}(). Please note that it is the number 
of pages (unsigned int), so 'true' does not make sense to me there.

>> +	}
>> +
>>   	/* Change the page encryption mask. */
>>   	new_pte = pfn_pte(pfn, new_prot);
>>   	set_pte_atomic(kpte, new_pte);
>> +
>> +	/*
>> +	 * If page is set encrypted in the page table, then update the RMP table to
>> +	 * add this page as private.
>> +	 */
>> +	if (enc)
>> +		early_snp_set_memory_private((unsigned long)__va(pa), pa, 1);
> 
> Here too, could you please change the "1" to "true?"
> 

same as above.

thanks

^ permalink raw reply	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 12/40] x86/sev: Add helper for validating pages in early enc attribute changes
  2022-01-11 21:22     ` Brijesh Singh
@ 2022-01-11 21:51       ` Venu Busireddy
  2022-01-11 21:57         ` Brijesh Singh
  0 siblings, 1 reply; 183+ messages in thread
From: Venu Busireddy @ 2022-01-11 21:51 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On 2022-01-11 15:22:01 -0600, Brijesh Singh wrote:
> Hi Venu,
> 
> 
> On 1/3/22 5:28 PM, Venu Busireddy wrote:
> ...
> 
> > > +
> > > +	 /*
> > > +	  * Ask the hypervisor to mark the memory pages as private in the RMP
> > > +	  * table.
> > > +	  */
> > 
> > Indentation is off. While at it, you may want to collapse it into a one
> > line comment.
> > 
> 
> Based on previous review feedback I tried to keep the comment to 80
> character limit.

Isn't the line length limit 100 now? Also, there are quite a few lines
that are longer than 80 characters in this file, and elsewhere.

But you can ignore my comment.

> > > +	early_set_page_state(paddr, npages, SNP_PAGE_STATE_PRIVATE);
> > > +
> > > +	/* Validate the memory pages after they've been added in the RMP table. */
> > > +	pvalidate_pages(vaddr, npages, 1);
> > > +}
> > > +
> > > +void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr,
> > > +					unsigned int npages)
> > > +{
> > > +	if (!cc_platform_has(CC_ATTR_SEV_SNP))
> > > +		return;
> > > +
> > > +	/*
> > > +	 * Invalidate the memory pages before they are marked shared in the
> > > +	 * RMP table.
> > > +	 */
> > 
> > Collapse into one line?
> > 
> 
> same as above.

Same as above.

> 
> ...
> 
> > > +		/*
> > > +		 * ON SNP, the page state in the RMP table must happen
> > > +		 * before the page table updates.
> > > +		 */
> > > +		early_snp_set_memory_shared((unsigned long)__va(pa), pa, 1);
> > 
> > I know "1" implies "true", but to emphasize that the argument is
> > actually a boolean, could you please change the "1" to "true?"
> > 
> 
> I assume you mean the last argument to the
> early_snp_set_memory_{private,shared}. Please note that its a number of
> pages (unsigned int). The 'true' does not make sense to me.

Sorry. While reading the code, I was looking at the invocations
of pvalidate_pages(), where 0 and 1 are passed instead of "false"
and "true" for the third argument. But while replying to the thread,
I marked my comment at the wrong place. I meant to suggest changing
the third argument to pvalidate_pages().
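
Something along these lines is what I have in mind; a user-space sketch
with stand-in functions, not the actual helpers. All sketch_* names are
hypothetical, and the stubs only record the order of operations described
in the patch comments: request the page state change and then validate when
going private, invalidate and then request the change when going shared.

```c
#include <stdbool.h>
#include <stddef.h>

enum sketch_step {
	STEP_PSC_PRIVATE,	/* page state change request: private */
	STEP_PSC_SHARED,	/* page state change request: shared */
	STEP_VALIDATE,		/* PVALIDATE with validate = true */
	STEP_INVALIDATE,	/* PVALIDATE with validate = false */
};

enum sketch_step sketch_log[8];
size_t sketch_log_len;

static void sketch_record(enum sketch_step step)
{
	if (sketch_log_len < 8)
		sketch_log[sketch_log_len++] = step;
}

/* Stand-in for asking the hypervisor to update the RMP entry. */
void sketch_set_page_state(unsigned long paddr, unsigned int npages,
			   bool make_private)
{
	(void)paddr;
	(void)npages;
	sketch_record(make_private ? STEP_PSC_PRIVATE : STEP_PSC_SHARED);
}

/* Stand-in for pvalidate_pages(); the third argument is a real bool. */
void sketch_pvalidate_pages(unsigned long vaddr, unsigned int npages,
			    bool validate)
{
	(void)vaddr;
	(void)npages;
	sketch_record(validate ? STEP_VALIDATE : STEP_INVALIDATE);
}

void sketch_set_memory_private(unsigned long vaddr, unsigned long paddr,
			       unsigned int npages)
{
	/* RMP update first, then validate the now-private pages. */
	sketch_set_page_state(paddr, npages, true);
	sketch_pvalidate_pages(vaddr, npages, true);
}

void sketch_set_memory_shared(unsigned long vaddr, unsigned long paddr,
			      unsigned int npages)
{
	/* Invalidate first, then mark the pages shared in the RMP. */
	sketch_pvalidate_pages(vaddr, npages, false);
	sketch_set_page_state(paddr, npages, false);
}
```

With true/false at the call sites, the last argument of pvalidate_pages()
can no longer be mistaken for a page count like the npages next to it.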


> > > +	}
> > > +
> > >   	/* Change the page encryption mask. */
> > >   	new_pte = pfn_pte(pfn, new_prot);
> > >   	set_pte_atomic(kpte, new_pte);
> > > +
> > > +	/*
> > > +	 * If page is set encrypted in the page table, then update the RMP table to
> > > +	 * add this page as private.
> > > +	 */
> > > +	if (enc)
> > > +		early_snp_set_memory_private((unsigned long)__va(pa), pa, 1);
> > 
> > Here too, could you please change the "1" to "true?"
> > 
> 
> same as above.
> 
> thanks

^ permalink raw reply	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 12/40] x86/sev: Add helper for validating pages in early enc attribute changes
  2022-01-11 21:51       ` Venu Busireddy
@ 2022-01-11 21:57         ` Brijesh Singh
  2022-01-11 22:42           ` Venu Busireddy
  0 siblings, 1 reply; 183+ messages in thread
From: Brijesh Singh @ 2022-01-11 21:57 UTC (permalink / raw)
  To: Venu Busireddy
  Cc: brijesh.singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum,
	Borislav Petkov, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	tony.luck, marcorr, sathyanarayanan.kuppuswamy



On 1/11/22 3:51 PM, Venu Busireddy wrote:
> On 2022-01-11 15:22:01 -0600, Brijesh Singh wrote:
>> Hi Venu,
>>
>>
>> On 1/3/22 5:28 PM, Venu Busireddy wrote:
>> ...
>>
>>>> +
>>>> +	 /*
>>>> +	  * Ask the hypervisor to mark the memory pages as private in the RMP
>>>> +	  * table.
>>>> +	  */
>>>
>>> Indentation is off. While at it, you may want to collapse it into a one
>>> line comment.
>>>
>>
>> Based on previous review feedback I tried to keep the comment to 80
>> character limit.
> 
> Isn't the line length limit 100 now? Also, there are quite a few lines
> that are longer than 80 characters in this file, and elsewhere.
> 
> But you can ignore my comment.
> 

Yes, the actual line limit is 100, but I was asked to keep the comments 
to 80 cols [1] to keep it consistent with other comments in this file.

[1] https://lore.kernel.org/lkml/f9a69ad8-54bb-70f1-d606-6497e5753bb0@amd.com/

thanks

^ permalink raw reply	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 12/40] x86/sev: Add helper for validating pages in early enc attribute changes
  2022-01-11 21:57         ` Brijesh Singh
@ 2022-01-11 22:42           ` Venu Busireddy
  0 siblings, 0 replies; 183+ messages in thread
From: Venu Busireddy @ 2022-01-11 22:42 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On 2022-01-11 15:57:13 -0600, Brijesh Singh wrote:
> 
> 
> On 1/11/22 3:51 PM, Venu Busireddy wrote:
> > On 2022-01-11 15:22:01 -0600, Brijesh Singh wrote:
> > > Hi Venu,
> > > 
> > > 
> > > On 1/3/22 5:28 PM, Venu Busireddy wrote:
> > > ...
> > > 
> > > > > +
> > > > > +	 /*
> > > > > +	  * Ask the hypervisor to mark the memory pages as private in the RMP
> > > > > +	  * table.
> > > > > +	  */
> > > > 
> > > > Indentation is off. While at it, you may want to collapse it into a one
> > > > line comment.
> > > > 
> > > 
> > > Based on previous review feedback I tried to keep the comment to 80
> > > character limit.
> > 
> > Isn't the line length limit 100 now? Also, there are quite a few lines
> > that are longer than 80 characters in this file, and elsewhere.
> > 
> > But you can ignore my comment.
> > 
> 
> Yes, the actual line limit is 100, but I was asked to keep the comments to
> 80 cols [1] to keep it consistent with other comments in this file.

Well, now that you mention it, the comment that immediately precedes this
one in the file is 91 characters long, and the comment that immediately
follows this one is 82 characters long! And both those lines are also
added as part of this patch.

Venu

> 
> https://lore.kernel.org/lkml/f9a69ad8-54bb-70f1-d606-6497e5753bb0@amd.com/
> 
> thanks


* Re: [PATCH v8 20/40] x86/sev: Use SEV-SNP AP creation to start secondary CPUs
  2021-12-10 18:50   ` Dave Hansen
@ 2022-01-12 16:17     ` Brijesh Singh
  0 siblings, 0 replies; 183+ messages in thread
From: Brijesh Singh @ 2022-01-12 16:17 UTC (permalink / raw)
  To: Dave Hansen, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm
  Cc: brijesh.singh, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy



On 12/10/21 12:50 PM, Dave Hansen wrote:
> On 12/10/21 7:43 AM, Brijesh Singh wrote:
>> +	vmsa->efer		= 0x1000;	/* Must set SVME bit */
>> +	vmsa->cr4		= cr4;
>> +	vmsa->cr0		= 0x60000010;
>> +	vmsa->dr7		= 0x400;
>> +	vmsa->dr6		= 0xffff0ff0;
>> +	vmsa->rflags		= 0x2;
>> +	vmsa->g_pat		= 0x0007040600070406ULL;
>> +	vmsa->xcr0		= 0x1;
>> +	vmsa->mxcsr		= 0x1f80;
>> +	vmsa->x87_ftw		= 0x5555;
>> +	vmsa->x87_fcw		= 0x0040;
> 
> This is a big fat pile of magic numbers.  We also have nice macros for a
> non-zero number of these, like:
> 
> 	#define MXCSR_DEFAULT 0x1f80
> 
> I understand that this probably _works_ as-is, but it doesn't look very
> friendly if someone else needs to go hack on it.
> 

The APM documents the default values for an AP following RESET or INIT;
I will define macros and use them accordingly.
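
A minimal sketch of what such macros could look like. The AP_INIT_* names,
the trimmed-down struct and the helper are hypothetical (only MXCSR_DEFAULT
is named in this thread); the values are the ones from the hunk quoted
above, and the bit breakdown in the comments follows the architectural
register layout.

```c
#include <stdint.h>

/* Hypothetical names; the values come from the hunk quoted above. */
#define AP_INIT_EFER_SVME	0x1000U		/* EFER.SVME (bit 12) */
#define AP_INIT_CR0_DEFAULT	0x60000010U	/* CR0.CD | CR0.NW | CR0.ET */
#define AP_INIT_DR7_DEFAULT	0x400U
#define AP_INIT_DR6_DEFAULT	0xffff0ff0U
#define AP_INIT_RFLAGS_DEFAULT	0x2U		/* bit 1 is always set */
#define AP_INIT_GPAT_DEFAULT	0x0007040600070406ULL
#define AP_INIT_XCR0_DEFAULT	0x1U		/* x87 state only */
#define AP_INIT_MXCSR_DEFAULT	0x1f80U
#define AP_INIT_X87_FTW_DEFAULT	0x5555U
#define AP_INIT_X87_FCW_DEFAULT	0x0040U

/* Stand-in for the few save-area fields touched here, not the real VMSA. */
struct ap_init_state {
	uint64_t efer, cr4, cr0, dr7, dr6, rflags, g_pat, xcr0;
	uint32_t mxcsr;
	uint16_t x87_ftw, x87_fcw;
};

/* Fill the reset-time register state; cr4 is supplied by the caller. */
static void ap_init_state_fill(struct ap_init_state *s, uint64_t cr4)
{
	s->efer    = AP_INIT_EFER_SVME;
	s->cr4     = cr4;
	s->cr0     = AP_INIT_CR0_DEFAULT;
	s->dr7     = AP_INIT_DR7_DEFAULT;
	s->dr6     = AP_INIT_DR6_DEFAULT;
	s->rflags  = AP_INIT_RFLAGS_DEFAULT;
	s->g_pat   = AP_INIT_GPAT_DEFAULT;
	s->xcr0    = AP_INIT_XCR0_DEFAULT;
	s->mxcsr   = AP_INIT_MXCSR_DEFAULT;
	s->x87_ftw = AP_INIT_X87_FTW_DEFAULT;
	s->x87_fcw = AP_INIT_X87_FCW_DEFAULT;
}
```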

thx


* Re: [PATCH v8 20/40] x86/sev: Use SEV-SNP AP creation to start secondary CPUs
  2021-12-31 15:36   ` Borislav Petkov
  2022-01-03 18:10     ` Vlastimil Babka
@ 2022-01-12 16:33     ` Brijesh Singh
  2022-01-12 17:10       ` Tom Lendacky
  2022-01-13 12:21       ` Borislav Petkov
  1 sibling, 2 replies; 183+ messages in thread
From: Brijesh Singh @ 2022-01-12 16:33 UTC (permalink / raw)
  To: Borislav Petkov, Vlastimil Babka
  Cc: brijesh.singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum, Michael Roth,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	tony.luck, marcorr, sathyanarayanan.kuppuswamy



On 12/31/21 9:36 AM, Borislav Petkov wrote:
> On Fri, Dec 10, 2021 at 09:43:12AM -0600, Brijesh Singh wrote:
>> diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h
>> index 123a96f7dff2..38c14601ae4a 100644
>> --- a/arch/x86/include/asm/sev-common.h
>> +++ b/arch/x86/include/asm/sev-common.h
>> @@ -104,6 +104,7 @@ enum psc_op {
>>   	(((u64)(v) & GENMASK_ULL(63, 12)) >> 12)
>>   
>>   #define GHCB_HV_FT_SNP			BIT_ULL(0)
>> +#define GHCB_HV_FT_SNP_AP_CREATION	(BIT_ULL(1) | GHCB_HV_FT_SNP)
> 
> Why is bit 0 ORed in? Because it "Requires SEV-SNP Feature."?
> 

Yes, the SEV-SNP feature is required. Anyway, I will improve the check. We
only reach AP creation after the SEV-SNP feature has been checked, so in the
AP creation routine we just need to check for the AP_CREATION-specific
feature flag; I will add a comment about it.
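
To illustrate why the ORed-in bit matters for the test itself, here is a
small sketch. The two GHCB_HV_FT_* definitions are the ones from the hunk
above; BIT_ULL is redefined locally and hv_has_features() is a
hypothetical helper, not something from the patch. With a multi-bit value,
a plain "features & mask" is already non-zero when only bit 0 is set, so
the comparison has to be against the full mask.

```c
#include <stdbool.h>
#include <stdint.h>

#define BIT_ULL(n)			(1ULL << (n))

#define GHCB_HV_FT_SNP			BIT_ULL(0)
/* AP creation requires SNP, hence bit 0 is ORed into the value. */
#define GHCB_HV_FT_SNP_AP_CREATION	(BIT_ULL(1) | GHCB_HV_FT_SNP)

/* Hypothetical helper: true only if every bit in @mask is advertised. */
static bool hv_has_features(uint64_t hv_features, uint64_t mask)
{
	return (hv_features & mask) == mask;
}
```

A hypervisor advertising only bit 0, or only bit 1, would then not pass
hv_has_features(features, GHCB_HV_FT_SNP_AP_CREATION).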

> You can still enforce that requirement in the test though.
> 
> Or all those SEV features should not be bits but masks -
> GHCB_HV_FT_SNP_AP_CREATION_MASK for example, seeing how the others
> require the previous bits to be set too.
> 

> ...
> 
>>   static DEFINE_PER_CPU(struct sev_es_runtime_data*, runtime_data);
>>   DEFINE_STATIC_KEY_FALSE(sev_es_enable_key);
>>   
>> +static DEFINE_PER_CPU(struct sev_es_save_area *, snp_vmsa);
> 
> This is what I mean: the struct is called "sev_es... " but the variable
> "snp_...". I.e., it is all sev_<something>.
> 

Sure, I will define the variable as sev_vmsa.

>> +
>>   static __always_inline bool on_vc_stack(struct pt_regs *regs)
>>   {
>>   	unsigned long sp = regs->sp;
>> @@ -814,6 +818,231 @@ void snp_set_memory_private(unsigned long vaddr, unsigned int npages)
>>   	pvalidate_pages(vaddr, npages, 1);
>>   }
>>   
>> +static int snp_set_vmsa(void *va, bool vmsa)
>> +{
>> +	u64 attrs;
>> +
>> +	/*
>> +	 * The RMPADJUST instruction is used to set or clear the VMSA bit for
>> +	 * a page. A change to the VMSA bit is only performed when running
>> +	 * at VMPL0 and is ignored at other VMPL levels. If too low of a target
> 
> What does "too low" mean here exactly?
> 

I believe it's saying that the target VMPL is lower than the current VMPL
level. Now that the VMPL0 check is enforced at the beginning, I will work
on improving the comment.

> The kernel is not at VMPL0 but the specified level is lower? Weird...
> 
>> +	 * VMPL level is specified, the instruction can succeed without changing
>> +	 * the VMSA bit should the kernel not be in VMPL0. Using a target VMPL
>> +	 * level of 1 will return a FAIL_PERMISSION error if the kernel is not
>> +	 * at VMPL0, thus ensuring that the VMSA bit has been properly set when
>> +	 * no error is returned.
> 
> We do check whether we run at VMPL0 earlier when starting the guest -
> see enforce_vmpl0().
> 
> I don't think you need any of that additional verification here - just
> assume you are at VMPL0.
> 

Yep.

>> +	 */
>> +	attrs = 1;
>> +	if (vmsa)
>> +		attrs |= RMPADJUST_VMSA_PAGE_BIT;
>> +
>> +	return rmpadjust((unsigned long)va, RMP_PG_SIZE_4K, attrs);
>> +}
>> +
>> +#define __ATTR_BASE		(SVM_SELECTOR_P_MASK | SVM_SELECTOR_S_MASK)
>> +#define INIT_CS_ATTRIBS		(__ATTR_BASE | SVM_SELECTOR_READ_MASK | SVM_SELECTOR_CODE_MASK)
>> +#define INIT_DS_ATTRIBS		(__ATTR_BASE | SVM_SELECTOR_WRITE_MASK)
>> +
>> +#define INIT_LDTR_ATTRIBS	(SVM_SELECTOR_P_MASK | 2)
>> +#define INIT_TR_ATTRIBS		(SVM_SELECTOR_P_MASK | 3)
>> +
>> +static void *snp_safe_alloc_page(void)
> 
> safe?
> 
> And you don't need to say "safe" - snp_alloc_vmsa_page() is perfectly fine.
> 

noted.

...

>> +
>> +	/*
>> +	 * A new VMSA is created each time because there is no guarantee that
>> +	 * the current VMSA is the kernels or that the vCPU is not running. If
> 
> kernel's.
> 
> And if it is not the kernel's, whose it is?

It could be the hypervisor's VMSA.

> 
>> +	 * an attempt was done to use the current VMSA with a running vCPU, a
>> +	 * #VMEXIT of that vCPU would wipe out all of the settings being done
>> +	 * here.
> 
> I don't understand - this is waking up a CPU, how can it ever be a
> running vCPU which is using the current VMSA?!
> 
> There is per_cpu(snp_vmsa, cpu), who else can be using that one currently?
> 

Maybe Tom can expand on it a bit more?

...

>> +
>> +	if (!ghcb_sw_exit_info_1_is_valid(ghcb) ||
>> +	    lower_32_bits(ghcb->save.sw_exit_info_1)) {
>> +		pr_alert("SNP AP Creation error\n");
> 
> alert?

I see that smpboot.c uses pr_err() when it fails to wake up a CPU; I
will switch to pr_err(). Let me know if you don't agree with it.


thx


* Re: [PATCH v8 20/40] x86/sev: Use SEV-SNP AP creation to start secondary CPUs
  2022-01-12 16:33     ` Brijesh Singh
@ 2022-01-12 17:10       ` Tom Lendacky
  2022-01-13 12:23         ` Borislav Petkov
  2022-01-13 12:21       ` Borislav Petkov
  1 sibling, 1 reply; 183+ messages in thread
From: Tom Lendacky @ 2022-01-12 17:10 UTC (permalink / raw)
  To: Brijesh Singh, Borislav Petkov, Vlastimil Babka
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Michael Roth, Kirill A . Shutemov,
	Andi Kleen, Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On 1/12/22 10:33 AM, Brijesh Singh wrote:
> On 12/31/21 9:36 AM, Borislav Petkov wrote:
>> On Fri, Dec 10, 2021 at 09:43:12AM -0600, Brijesh Singh wrote:

>>> +     * an attempt was done to use the current VMSA with a running vCPU, a
>>> +     * #VMEXIT of that vCPU would wipe out all of the settings being done
>>> +     * here.
>>
>> I don't understand - this is waking up a CPU, how can it ever be a
>> running vCPU which is using the current VMSA?!

Yes, in general. My thought was that nothing is stopping a malicious 
hypervisor from performing a VMRUN on that vCPU and then the VMSA would be 
in use.

Thanks,
Tom

>>
>> There is per_cpu(snp_vmsa, cpu), who else can be using that one currently?
>>
> 
> Maybe Tom can expand it bit more?
> 


* Re: [PATCH v8 20/40] x86/sev: Use SEV-SNP AP creation to start secondary CPUs
  2022-01-12 16:33     ` Brijesh Singh
  2022-01-12 17:10       ` Tom Lendacky
@ 2022-01-13 12:21       ` Borislav Petkov
  1 sibling, 0 replies; 183+ messages in thread
From: Borislav Petkov @ 2022-01-13 12:21 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: Vlastimil Babka, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum, Michael Roth,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	tony.luck, marcorr, sathyanarayanan.kuppuswamy

On Wed, Jan 12, 2022 at 10:33:40AM -0600, Brijesh Singh wrote:
> Yes, the SEV-SNP feature is required. Anyway, I will improve a check. We
> will reach to AP creation only after SEV-SNP feature is checked, so, in AP
> creation routine we just need to check for the AP_CREATION specific feature
> flag; I will add comment about it.

Right, at least a comment explaining why the bits are ORed.
> 
> > You can still enforce that requirement in the test though.
> > 
> > Or all those SEV features should not be bits but masks -
> > GHCB_HV_FT_SNP_AP_CREATION_MASK for example, seeing how the others
> > require the previous bits to be set too.

Thinking about this more, calling it a "mask" might not be optimal here
as you use masks usually to, well, mask out bits, etc. So I guess a
comment explaining why the OR-in of bit 0...

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


* Re: [PATCH v8 20/40] x86/sev: Use SEV-SNP AP creation to start secondary CPUs
  2022-01-12 17:10       ` Tom Lendacky
@ 2022-01-13 12:23         ` Borislav Petkov
  0 siblings, 0 replies; 183+ messages in thread
From: Borislav Petkov @ 2022-01-13 12:23 UTC (permalink / raw)
  To: Tom Lendacky
  Cc: Brijesh Singh, Vlastimil Babka, x86, linux-kernel, kvm,
	linux-efi, platform-driver-x86, linux-coco, linux-mm,
	Thomas Gleixner, Ingo Molnar, Joerg Roedel, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum, Michael Roth,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	tony.luck, marcorr, sathyanarayanan.kuppuswamy

On Wed, Jan 12, 2022 at 11:10:04AM -0600, Tom Lendacky wrote:
> On 1/12/22 10:33 AM, Brijesh Singh wrote:
> > On 12/31/21 9:36 AM, Borislav Petkov wrote:
> > > On Fri, Dec 10, 2021 at 09:43:12AM -0600, Brijesh Singh wrote:
> 
> > > > +     * an attempt was done to use the current VMSA with a running vCPU, a
> > > > +     * #VMEXIT of that vCPU would wipe out all of the settings being done
> > > > +     * here.
> > > 
> > > I don't understand - this is waking up a CPU, how can it ever be a
> > > running vCPU which is using the current VMSA?!
> 
> Yes, in general. My thought was that nothing is stopping a malicious
> hypervisor from performing a VMRUN on that vCPU and then the VMSA would be
> in use.

Ah, that's what you mean.

Ok, please extend that comment with it.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


* Re: [PATCH v8 29/40] x86/compressed/64: add support for SEV-SNP CPUID table in #VC handlers
  2021-12-10 15:43 ` [PATCH v8 29/40] x86/compressed/64: add support for SEV-SNP CPUID table in #VC handlers Brijesh Singh
@ 2022-01-13 13:16   ` Borislav Petkov
  2022-01-13 16:39     ` Michael Roth
  0 siblings, 1 reply; 183+ messages in thread
From: Borislav Petkov @ 2022-01-13 13:16 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	tony.luck, marcorr, sathyanarayanan.kuppuswamy

On Fri, Dec 10, 2021 at 09:43:21AM -0600, Brijesh Singh wrote:
> +/*
> + * Individual entries of the SEV-SNP CPUID table, as defined by the SEV-SNP
> + * Firmware ABI, Revision 0.9, Section 7.1, Table 14. Note that the XCR0_IN
> + * and XSS_IN are denoted here as __unused/__unused2, since they are not
> + * needed for the current guest implementation,

That's fine and great but you need to check in the function where you
iterate over those leafs below whether those unused variables are 0
and fail if not. Not that BIOS or whoever creates that table, starts
becoming creative...

> where the size of the buffers
> + * needed to store enabled XSAVE-saved features are calculated rather than
> + * encoded in the CPUID table for each possible combination of XCR0_IN/XSS_IN
> + * to save space.
> + */
> +struct snp_cpuid_fn {
> +	u32 eax_in;
> +	u32 ecx_in;
> +	u64 __unused;
> +	u64 __unused2;
> +	u32 eax;
> +	u32 ebx;
> +	u32 ecx;
> +	u32 edx;
> +	u64 __reserved;

Ditto.

> +} __packed;
> +
> +/*
> + * SEV-SNP CPUID table header, as defined by the SEV-SNP Firmware ABI,
> + * Revision 0.9, Section 8.14.2.6. Also noted there is the SEV-SNP
> + * firmware-enforced limit of 64 entries per CPUID table.
> + */
> +#define SNP_CPUID_COUNT_MAX 64
> +
> +struct snp_cpuid_info {
> +	u32 count;
> +	u32 __reserved1;
> +	u64 __reserved2;
> +	struct snp_cpuid_fn fn[SNP_CPUID_COUNT_MAX];
> +} __packed;
> +
>  /*
>   * Since feature negotiation related variables are set early in the boot
>   * process they must reside in the .data section so as not to be zeroed
> @@ -23,6 +58,20 @@
>   */
>  static u16 ghcb_version __ro_after_init;
>  
> +/* Copy of the SNP firmware's CPUID page. */
> +static struct snp_cpuid_info cpuid_info_copy __ro_after_init;
> +static bool snp_cpuid_initialized __ro_after_init;
> +
> +/*
> + * These will be initialized based on CPUID table so that non-present
> + * all-zero leaves (for sparse tables) can be differentiated from
> + * invalid/out-of-range leaves. This is needed since all-zero leaves
> + * still need to be post-processed.
> + */
> +u32 cpuid_std_range_max __ro_after_init;
> +u32 cpuid_hyp_range_max __ro_after_init;
> +u32 cpuid_ext_range_max __ro_after_init;

All of them: static.

>  static bool __init sev_es_check_cpu_features(void)
>  {
>  	if (!has_cpuflag(X86_FEATURE_RDRAND)) {
> @@ -246,6 +295,244 @@ static int sev_cpuid_hv(u32 func, u32 subfunc, u32 *eax, u32 *ebx,
>  	return 0;
>  }
>  
> +static const struct snp_cpuid_info *

No need for that linebreak here.

> +snp_cpuid_info_get_ptr(void)
> +{
> +	void *ptr;
> +
> +	/*
> +	 * This may be called early while still running on the initial identity
> +	 * mapping. Use RIP-relative addressing to obtain the correct address
> +	 * in both for identity mapping and after switch-over to kernel virtual
> +	 * addresses.
> +	 */

Put that comment over the function name.

And yah, that probably works but eww.

> +	asm ("lea cpuid_info_copy(%%rip), %0"
> +	     : "=r" (ptr)

Why not "=g" and let the compiler decide?

> +	     : "p" (&cpuid_info_copy));
> +
> +	return ptr;
> +}
> +
> +static inline bool snp_cpuid_active(void)
> +{
> +	return snp_cpuid_initialized;
> +}

That looks useless. That variable snp_cpuid_initialized either gets set
or the guest terminates, so practically, if the guest is still running,
you can assume SNP CPUID is properly initialized.

> +static int snp_cpuid_calc_xsave_size(u64 xfeatures_en, u32 base_size,
> +				     u32 *xsave_size, bool compacted)
> +{
> +	const struct snp_cpuid_info *cpuid_info = snp_cpuid_info_get_ptr();
> +	u32 xsave_size_total = base_size;
> +	u64 xfeatures_found = 0;
> +	int i;
> +
> +	for (i = 0; i < cpuid_info->count; i++) {
> +		const struct snp_cpuid_fn *fn = &cpuid_info->fn[i];
> +
> +		if (!(fn->eax_in == 0xD && fn->ecx_in > 1 && fn->ecx_in < 64))
> +			continue;

I guess that test can be as simple as

		if (fn->eax_in != 0xd)
			continue;

or why do you wanna check ECX too? Funky values coming from the CPUID
page?

> +		if (!(xfeatures_en & (BIT_ULL(fn->ecx_in))))
> +			continue;
> +		if (xfeatures_found & (BIT_ULL(fn->ecx_in)))
> +			continue;

What is that test for? Don't tell me the CPUID page allows duplicate
entries...

> +		xfeatures_found |= (BIT_ULL(fn->ecx_in));
> +
> +		if (compacted)
> +			xsave_size_total += fn->eax;
> +		else
> +			xsave_size_total = max(xsave_size_total,
> +					       fn->eax + fn->ebx);
> +	}
> +
> +	/*
> +	 * Either the guest set unsupported XCR0/XSS bits, or the corresponding
> +	 * entries in the CPUID table were not present. This is not a valid
> +	 * state to be in.
> +	 */
> +	if (xfeatures_found != (xfeatures_en & GENMASK_ULL(63, 2)))
> +		return -EINVAL;
> +
> +	*xsave_size = xsave_size_total;
> +
> +	return 0;

This function can return xsave_size in the success case and negative in
the error case so you don't need the IO param *xsave_size.

> +}
> +
> +static void snp_cpuid_hv(u32 func, u32 subfunc, u32 *eax, u32 *ebx, u32 *ecx,
> +			 u32 *edx)
> +{
> +	/*
> +	 * MSR protocol does not support fetching indexed subfunction, but is
> +	 * sufficient to handle current fallback cases. Should that change,
> +	 * make sure to terminate rather than ignoring the index and grabbing
> +	 * random values. If this issue arises in the future, handling can be
> +	 * added here to use GHCB-page protocol for cases that occur late
> +	 * enough in boot that GHCB page is available.
> +	 */
> +	if (cpuid_function_is_indexed(func) && subfunc)
> +		sev_es_terminate(1, GHCB_TERM_CPUID_HV);
> +
> +	if (sev_cpuid_hv(func, 0, eax, ebx, ecx, edx))
> +		sev_es_terminate(1, GHCB_TERM_CPUID_HV);
> +}
> +
> +static bool
> +snp_cpuid_find_validated_func(u32 func, u32 subfunc, u32 *eax, u32 *ebx,

snp_cpuid_get_validated_func()

> +			      u32 *ecx, u32 *edx)
> +{
> +	const struct snp_cpuid_info *cpuid_info = snp_cpuid_info_get_ptr();
> +	int i;
> +
> +	for (i = 0; i < cpuid_info->count; i++) {
> +		const struct snp_cpuid_fn *fn = &cpuid_info->fn[i];
> +
> +		if (fn->eax_in != func)
> +			continue;
> +
> +		if (cpuid_function_is_indexed(func) && fn->ecx_in != subfunc)
> +			continue;
> +
> +		*eax = fn->eax;
> +		*ebx = fn->ebx;
> +		*ecx = fn->ecx;
> +		*edx = fn->edx;
> +
> +		return true;
> +	}
> +
> +	return false;
> +}
> +
> +static bool snp_cpuid_check_range(u32 func)
> +{
> +	if (func <= cpuid_std_range_max ||
> +	    (func >= 0x40000000 && func <= cpuid_hyp_range_max) ||
> +	    (func >= 0x80000000 && func <= cpuid_ext_range_max))
> +		return true;
> +
> +	return false;
> +}
> +
> +static int snp_cpuid_postprocess(u32 func, u32 subfunc, u32 *eax, u32 *ebx,
> +				 u32 *ecx, u32 *edx)

I'm wondering if you could make everything a lot easier by doing

static int snp_cpuid_postprocess(struct cpuid_leaf *leaf)

and marshall around that struct cpuid_leaf which contains func, subfunc,
e[abcd]x instead of dealing with 6 parameters.

Callers of snp_cpuid() can simply allocate it on their stack and hand it
in and it is all in sev-shared.c so nicely self-contained...
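
Roughly something like the sketch below. The exact layout of struct
cpuid_leaf, the trimmed struct cpuid_entry, and both helper names are
only assumptions for illustration; leaf_is_indexed() stands in for
cpuid_function_is_indexed() and its list of leaves is deliberately
incomplete.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Inputs (fn, subfn) and outputs (eax..edx) bundled in one struct. */
struct cpuid_leaf {
	uint32_t fn;
	uint32_t subfn;
	uint32_t eax;
	uint32_t ebx;
	uint32_t ecx;
	uint32_t edx;
};

/* Trimmed stand-in for one CPUID table entry. */
struct cpuid_entry {
	uint32_t eax_in, ecx_in;
	uint32_t eax, ebx, ecx, edx;
};

/* Stand-in for cpuid_function_is_indexed(); list is not exhaustive. */
static bool leaf_is_indexed(uint32_t fn)
{
	return fn == 0x4 || fn == 0x7 || fn == 0xb || fn == 0xd;
}

/* Fill @leaf from the table; returns false if no entry matches. */
static bool cpuid_table_lookup(const struct cpuid_entry *tbl, size_t count,
			       struct cpuid_leaf *leaf)
{
	for (size_t i = 0; i < count; i++) {
		const struct cpuid_entry *e = &tbl[i];

		if (e->eax_in != leaf->fn)
			continue;

		/* The sub-function only matters for indexed leaves. */
		if (leaf_is_indexed(leaf->fn) && e->ecx_in != leaf->subfn)
			continue;

		leaf->eax = e->eax;
		leaf->ebx = e->ebx;
		leaf->ecx = e->ecx;
		leaf->edx = e->edx;

		return true;
	}

	return false;
}
```

A caller would then do "struct cpuid_leaf leaf = { .fn = fn, .subfn = subfn };"
on its stack and pass &leaf down, instead of threading six parameters
through every helper.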

...

> +/*
> + * Returns -EOPNOTSUPP if feature not enabled. Any other return value should be
> + * treated as fatal by caller.
> + */
> +static int snp_cpuid(u32 func, u32 subfunc, u32 *eax, u32 *ebx, u32 *ecx,
> +		     u32 *edx)
> +{
> +	if (!snp_cpuid_active())
> +		return -EOPNOTSUPP;

And this becomes superfluous.

> +
> +	if (!snp_cpuid_find_validated_func(func, subfunc, eax, ebx, ecx, edx)) {
> +		/*
> +		 * Some hypervisors will avoid keeping track of CPUID entries
> +		 * where all values are zero, since they can be handled the
> +		 * same as out-of-range values (all-zero). This is useful here
> +		 * as well as it allows virtually all guest configurations to
> +		 * work using a single SEV-SNP CPUID table.
> +		 *
> +		 * To allow for this, there is a need to distinguish between
> +		 * out-of-range entries and in-range zero entries, since the
> +		 * CPUID table entries are only a template that may need to be
> +		 * augmented with additional values for things like
> +		 * CPU-specific information during post-processing. So if it's
> +		 * not in the table, but is still in the valid range, proceed
> +		 * with the post-processing. Otherwise, just return zeros.
> +		 */
> +		*eax = *ebx = *ecx = *edx = 0;
> +		if (!snp_cpuid_check_range(func))
> +			return 0;

Do the check first and then assign.

> +	}
> +
> +	return snp_cpuid_postprocess(func, subfunc, eax, ebx, ecx, edx);
> +}
> +
>  /*
>   * Boot VC Handler - This is the first VC handler during boot, there is no GHCB
>   * page yet, so it only supports the MSR based communication with the
> @@ -253,16 +540,26 @@ static int sev_cpuid_hv(u32 func, u32 subfunc, u32 *eax, u32 *ebx,
>   */
>  void __init do_vc_no_ghcb(struct pt_regs *regs, unsigned long exit_code)
>  {
> +	unsigned int subfn = lower_bits(regs->cx, 32);
>  	unsigned int fn = lower_bits(regs->ax, 32);
>  	u32 eax, ebx, ecx, edx;
> +	int ret;
>  
>  	/* Only CPUID is supported via MSR protocol */
>  	if (exit_code != SVM_EXIT_CPUID)
>  		goto fail;
>  
> +	ret = snp_cpuid(fn, subfn, &eax, &ebx, &ecx, &edx);
> +	if (ret == 0)

	if (!ret)

> +		goto cpuid_done;
> +
> +	if (ret != -EOPNOTSUPP)
> +		goto fail;
> +
>  	if (sev_cpuid_hv(fn, 0, &eax, &ebx, &ecx, &edx))
>  		goto fail;
>  
> +cpuid_done:
>  	regs->ax = eax;
>  	regs->bx = ebx;
>  	regs->cx = ecx;
> @@ -557,12 +854,35 @@ static enum es_result vc_handle_ioio(struct ghcb *ghcb, struct es_em_ctxt *ctxt)
>  	return ret;
>  }
>  
> +static int vc_handle_cpuid_snp(struct pt_regs *regs)
> +{
> +	u32 eax, ebx, ecx, edx;
> +	int ret;
> +
> +	ret = snp_cpuid(regs->ax, regs->cx, &eax, &ebx, &ecx, &edx);
> +	if (ret == 0) {

	if (!ret)

> +		regs->ax = eax;
> +		regs->bx = ebx;
> +		regs->cx = ecx;
> +		regs->dx = edx;
> +	}
> +
> +	return ret;
> +}
> +
>  static enum es_result vc_handle_cpuid(struct ghcb *ghcb,
>  				      struct es_em_ctxt *ctxt)
>  {
>  	struct pt_regs *regs = ctxt->regs;
>  	u32 cr4 = native_read_cr4();
>  	enum es_result ret;
> +	int snp_cpuid_ret;
> +
> +	snp_cpuid_ret = vc_handle_cpuid_snp(regs);
> +	if (snp_cpuid_ret == 0)

	if (! ... - you get the idea.



-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


* Re: [PATCH v8 29/40] x86/compressed/64: add support for SEV-SNP CPUID table in #VC handlers
  2022-01-13 13:16   ` Borislav Petkov
@ 2022-01-13 16:39     ` Michael Roth
  2022-01-14 16:13       ` Borislav Petkov
  0 siblings, 1 reply; 183+ messages in thread
From: Michael Roth @ 2022-01-13 16:39 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Brijesh Singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On Thu, Jan 13, 2022 at 02:16:05PM +0100, Borislav Petkov wrote:
> On Fri, Dec 10, 2021 at 09:43:21AM -0600, Brijesh Singh wrote:
> > +/*
> > + * Individual entries of the SEV-SNP CPUID table, as defined by the SEV-SNP
> > + * Firmware ABI, Revision 0.9, Section 7.1, Table 14. Note that the XCR0_IN
> > + * and XSS_IN are denoted here as __unused/__unused2, since they are not
> > + * needed for the current guest implementation,
> 
> That's fine and great but you need to check in the function where you
> iterate over those leafs below whether those unused variables are 0
> and fail if not. Not that BIOS or whoever creates that table, starts
> becoming creative...

It's not actually necessary for the values to be 0 in this case. If a
hypervisor chooses to use them (which is allowed by current firmware
ABI, but should perhaps be updated to suggest against it), then guest
code written for that specific hypervisor implementation can choose to
make use of the fields.

But if a guest chooses to ignore the XCR0_IN/XSS_IN values, and instead
compute the XSAVE buffer size from scratch using its own knowledge of
what features are enabled in the actual XCR0/XSS registers, then it can
compute the same information those fields encode into the cpuid table
by simply walking 0xD sub-leaves 2-63 and calculating the XSAVE buffer
size manually based on the individual sizes encoded in those sub-leaves
(as per CPUID spec).

All entries for 0xD subleaves 0 and 1, with different XCR0_IN/XSS_IN
values, can then be treated as being identical, since the only thing
that changes from a sub-leaf entry with one XCR0_IN/XSS_IN combination to
another is the total XSAVE buffer size, which the guest doesn't need,
since it is computing that instead by summing up the 0xD subleaves 2-63.

Requiring these to be 0 places constraints on the hypervisor that are at
odds with the current firmware ABI, so that's not enforced here, but the
guest code should work regardless of how the hypervisor chooses to use
XCR0_IN/XSS_IN.
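
As a standalone illustration of the walk described above, here is a
sketch modelled on snp_cpuid_calc_xsave_size() from the patch. It is not
the patch code: the table is passed in explicitly, struct cpuid_entry is
a trimmed stand-in for struct snp_cpuid_fn, and calc_xsave_size() is a
hypothetical name. For 0xD sub-leaves >= 2, EAX is the size of the
feature's state and EBX its offset in the non-compacted layout.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BIT_ULL(n)	(1ULL << (n))

/* Trimmed stand-in for a table entry: eax = size, ebx = offset. */
struct cpuid_entry {
	uint32_t eax_in, ecx_in;
	uint32_t eax, ebx;
};

static int calc_xsave_size(const struct cpuid_entry *tbl, size_t count,
			   uint64_t xfeatures_en, uint32_t base_size,
			   bool compacted, uint32_t *xsave_size)
{
	uint32_t total = base_size;
	uint64_t found = 0;

	for (size_t i = 0; i < count; i++) {
		const struct cpuid_entry *e = &tbl[i];

		/* Only 0xD sub-leaves 2-63 describe individual features. */
		if (e->eax_in != 0xd || e->ecx_in < 2 || e->ecx_in > 63)
			continue;
		/* Feature not enabled in XCR0/XSS. */
		if (!(xfeatures_en & BIT_ULL(e->ecx_in)))
			continue;
		/* Repeated entry for a sub-leaf that was already counted. */
		if (found & BIT_ULL(e->ecx_in))
			continue;

		found |= BIT_ULL(e->ecx_in);

		if (compacted)
			total += e->eax;
		else if (e->eax + e->ebx > total)
			total = e->eax + e->ebx;
	}

	/* An enabled feature (bits 2-63) had no entry in the table. */
	if (found != (xfeatures_en & ~3ULL))
		return -EINVAL;

	*xsave_size = total;

	return 0;
}
```

Entries for sub-leaves 0 and 1 never contribute, whatever XCR0_IN/XSS_IN
they were generated for, which is the point made above.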

> 
> > where the size of the buffers
> > + * needed to store enabled XSAVE-saved features are calculated rather than
> > + * encoded in the CPUID table for each possible combination of XCR0_IN/XSS_IN
> > + * to save space.
> > + */
> > +struct snp_cpuid_fn {
> > +	u32 eax_in;
> > +	u32 ecx_in;
> > +	u64 __unused;
> > +	u64 __unused2;
> > +	u32 eax;
> > +	u32 ebx;
> > +	u32 ecx;
> > +	u32 edx;
> > +	u64 __reserved;
> 
> Ditto.

I was thinking a future hypervisor/spec might make use of this field for
new functionality, while still wanting to be backward-compatible with
existing guests, so it would be better to not enforce 0. The firmware
ABI does indeed document it as must-be-zero, but that seems to be more of
a constraint on what a hypervisor is currently allowed to place in the
CPUID table, rather than something the guest is meant to enforce/rely
on.

> 
> > +} __packed;
> > +
> > +/*
> > + * SEV-SNP CPUID table header, as defined by the SEV-SNP Firmware ABI,
> > + * Revision 0.9, Section 8.14.2.6. Also noted there is the SEV-SNP
> > + * firmware-enforced limit of 64 entries per CPUID table.
> > + */
> > +#define SNP_CPUID_COUNT_MAX 64
> > +
> > +struct snp_cpuid_info {
> > +	u32 count;
> > +	u32 __reserved1;
> > +	u64 __reserved2;
> > +	struct snp_cpuid_fn fn[SNP_CPUID_COUNT_MAX];
> > +} __packed;
> > +
> >  /*
> >   * Since feature negotiation related variables are set early in the boot
> >   * process they must reside in the .data section so as not to be zeroed
> > @@ -23,6 +58,20 @@
> >   */
> >  static u16 ghcb_version __ro_after_init;
> >  
> > +/* Copy of the SNP firmware's CPUID page. */
> > +static struct snp_cpuid_info cpuid_info_copy __ro_after_init;
> > +static bool snp_cpuid_initialized __ro_after_init;
> > +
> > +/*
> > + * These will be initialized based on CPUID table so that non-present
> > + * all-zero leaves (for sparse tables) can be differentiated from
> > + * invalid/out-of-range leaves. This is needed since all-zero leaves
> > + * still need to be post-processed.
> > + */
> > +u32 cpuid_std_range_max __ro_after_init;
> > +u32 cpuid_hyp_range_max __ro_after_init;
> > +u32 cpuid_ext_range_max __ro_after_init;
> 
> All of them: static.
> 
> >  static bool __init sev_es_check_cpu_features(void)
> >  {
> >  	if (!has_cpuflag(X86_FEATURE_RDRAND)) {
> > @@ -246,6 +295,244 @@ static int sev_cpuid_hv(u32 func, u32 subfunc, u32 *eax, u32 *ebx,
> >  	return 0;
> >  }
> >  
> > +static const struct snp_cpuid_info *
> 
> No need for that linebreak here.
> 
> > +snp_cpuid_info_get_ptr(void)
> > +{
> > +	void *ptr;
> > +
> > +	/*
> > +	 * This may be called early while still running on the initial identity
> > +	 * mapping. Use RIP-relative addressing to obtain the correct address
> > +	 * in both for identity mapping and after switch-over to kernel virtual
> > +	 * addresses.
> > +	 */
> 
> Put that comment over the function name.
> 
> And yah, that probably works but eww.

Yah... originally there was a cpuid table ptr that was initialized for
identity-mapping early on, then updated later after switch-over, but it
required an additional init routine or callback later in boot, which
seemed at odds with the goal of initializing everything in one spot, so
I switched to using this approach to avoid having to reintroduce
multiple stages of initialization throughout boot.

> 
> > +	asm ("lea cpuid_info_copy(%%rip), %0"
> > +	     : "=r" (ptr)
> 
> Why not "=g" and let the compiler decide?

I mainly re-used existing code from sme_enable() here, but I'll check
on this.

> 
> > +	     : "p" (&cpuid_info_copy));
> > +
> > +	return ptr;
> > +}
> > +
> > +static inline bool snp_cpuid_active(void)
> > +{
> > +	return snp_cpuid_initialized;
> > +}
> 
> That looks useless. That variable snp_cpuid_initialized either gets set
> or the guest terminates, so practically, if the guest is still running,
> you can assume SNP CPUID is properly initialized.

snp_cpuid_info_create() (which sets snp_cpuid_initialized) only gets
called if firmware indicates this is an SNP guest (via the cc_blob), but
the #VC handler still needs to know whether or not it should use the SNP
CPUID table, since SEV-ES guests will still make use of the handler, so it
uses snp_cpuid_active() to make that determination.

Previous versions of the series basically did:

  snp_cpuid_active():
    return snp_cpuid_table_ptr != NULL;

but now that the above snp_cpuid_info_get_ptr() accessor is used instead
of storing an actual pointer somewhere, a new variable is needed to
track that, which is why snp_cpuid_initialized was introduced here.
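
To make the role of the flag concrete, here is a sketch of the dispatch
the #VC handlers in the patch follow: try the table first, fall back to
the hypervisor only on -EOPNOTSUPP, and treat any other error as fatal.
Everything below is a stand-in for illustration: the stubs just return
dummy values, and the names (table_cpuid, hv_cpuid, handle_cpuid, the
enum) are hypothetical.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

struct cpuid_regs {
	uint32_t eax, ebx, ecx, edx;
};

/* Plays the role of snp_cpuid_initialized: set only for SNP guests. */
static bool table_initialized;

/* Stub for the table lookup; -EOPNOTSUPP means "no SNP CPUID table". */
static int table_cpuid(uint32_t fn, struct cpuid_regs *out)
{
	if (!table_initialized)
		return -EOPNOTSUPP;

	out->eax = fn;
	out->ebx = out->ecx = out->edx = 0;

	return 0;
}

/* Stub for asking the hypervisor (the SEV-ES path). */
static int hv_cpuid(uint32_t fn, struct cpuid_regs *out)
{
	out->eax = ~fn;
	out->ebx = out->ecx = out->edx = 0;

	return 0;
}

enum cpuid_source { CPUID_FROM_TABLE, CPUID_FROM_HV, CPUID_FAILED };

static enum cpuid_source handle_cpuid(uint32_t fn, struct cpuid_regs *out)
{
	int ret = table_cpuid(fn, out);

	if (!ret)
		return CPUID_FROM_TABLE;

	/* Anything other than "table not in use" is fatal. */
	if (ret != -EOPNOTSUPP)
		return CPUID_FAILED;

	if (hv_cpuid(fn, out))
		return CPUID_FAILED;

	return CPUID_FROM_HV;
}
```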

> 
> > +static int snp_cpuid_calc_xsave_size(u64 xfeatures_en, u32 base_size,
> > +				     u32 *xsave_size, bool compacted)
> > +{
> > +	const struct snp_cpuid_info *cpuid_info = snp_cpuid_info_get_ptr();
> > +	u32 xsave_size_total = base_size;
> > +	u64 xfeatures_found = 0;
> > +	int i;
> > +
> > +	for (i = 0; i < cpuid_info->count; i++) {
> > +		const struct snp_cpuid_fn *fn = &cpuid_info->fn[i];
> > +
> > +		if (!(fn->eax_in == 0xD && fn->ecx_in > 1 && fn->ecx_in < 64))
> > +			continue;
> 
> I guess that test can be as simple as
> 
> 		if (fn->eax_in != 0xd)
> 			continue;
> 
> or why do you wanna check ECX too? Funky values coming from the CPUID
> page?

This code is calculating the total XSAVE buffer size for whatever
features are enabled by the guest's XCR0/XSS registers. Those feature
bits correspond to the 0xD subleaves 2-63, which advertise the buffer
size for each particular feature. So that check needs to ignore anything
outside that range (including 0xD subleaves 0 and 1, which would normally
provide this total size dynamically based on current values of XCR0/XSS,
but here are instead calculated "manually" since we are not relying on
the XCR0_IN/XSS_IN fields in the table, due to the reasons mentioned
earlier in this thread).
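
To make the filtering concrete, here is a minimal standalone sketch of the
same calculation. It is not the patch's code: struct cpuid_entry and
calc_xsave_size() are hypothetical, simplified stand-ins, and the error
value is just -1. Only 0xD subleaves 2-63 contribute, each enabled feature
is counted once, and the result is rejected if an enabled feature has no
entry in the table.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified table entry; not the kernel's struct snp_cpuid_fn. */
struct cpuid_entry {
	uint32_t eax_in, ecx_in;
	uint32_t eax, ebx;	/* for 0xD subleaves >= 2: size, offset */
};

/*
 * Sum (compacted) or take the furthest end offset (standard format) of the
 * save areas for all features enabled in xfeatures_en.
 * Returns 0 and stores the size on success, -1 if an enabled feature in
 * bits 2-63 has no matching entry.
 */
static int calc_xsave_size(const struct cpuid_entry *tbl, int count,
			   uint64_t xfeatures_en, uint32_t base_size,
			   bool compacted, uint32_t *size_out)
{
	uint64_t found = 0;
	uint32_t total = base_size;

	for (int i = 0; i < count; i++) {
		const struct cpuid_entry *e = &tbl[i];
		uint64_t bit;

		/* Only 0xD subleaves 2..63 describe individual features. */
		if (e->eax_in != 0xD || e->ecx_in < 2 || e->ecx_in > 63)
			continue;

		bit = 1ULL << e->ecx_in;

		/* Skip features that are not enabled or were already counted. */
		if (!(xfeatures_en & bit) || (found & bit))
			continue;

		found |= bit;

		if (compacted)
			total += e->eax;
		else if (e->eax + e->ebx > total)
			total = e->eax + e->ebx;
	}

	/* Bits 0 and 1 (x87/SSE) are covered by base_size. */
	if (found != (xfeatures_en & ~3ULL))
		return -1;

	*size_out = total;
	return 0;
}
```

With a base size of 576 and entries for subleaf 2 (size 256, offset 576)
and subleaf 9 (size 8, offset 2688), XCR0 = 0x207 gives 2696 in the
standard format and 840 in the compacted format.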

> 
> > +		if (!(xfeatures_en & (BIT_ULL(fn->ecx_in))))
> > +			continue;
> > +		if (xfeatures_found & (BIT_ULL(fn->ecx_in)))
> > +			continue;
> 
> What is that test for? Don't tell me the CPUID page allows duplicate
> entries...

Not duplicate entries (though there's technically nothing in the spec
that says you can't), but I was more concerned here with multiple
entries corresponding to different combinations of XCR0_IN/XSS_IN.
There's no good reason for a hypervisor to use those fields for anything
other than 0xD subleaves 0 and 1, but a hypervisor could in theory encode
1 "duplicate" sub-leaf for each possible combination of XCR0_IN/XSS_IN,
similar to what it might do for subleaves 0 and 1, and not violate the
spec.

The current spec is a bit open-ended in some of these areas so the guest
code is trying to be as agnostic as possible to the underlying implementation
so there's less chance of breakage running on one hypervisor versus
another. We're working on updating the spec to encourage better
interoperability, but that would likely only be enforceable for future
firmware versions/guests.

> 
> > +		xfeatures_found |= (BIT_ULL(fn->ecx_in));
> > +
> > +		if (compacted)
> > +			xsave_size_total += fn->eax;
> > +		else
> > +			xsave_size_total = max(xsave_size_total,
> > +					       fn->eax + fn->ebx);
> > +	}
> > +
> > +	/*
> > +	 * Either the guest set unsupported XCR0/XSS bits, or the corresponding
> > +	 * entries in the CPUID table were not present. This is not a valid
> > +	 * state to be in.
> > +	 */
> > +	if (xfeatures_found != (xfeatures_en & GENMASK_ULL(63, 2)))
> > +		return -EINVAL;
> > +
> > +	*xsave_size = xsave_size_total;
> > +
> > +	return 0;
> 
> This function can return xsave_size in the success case and negative in
> the error case so you don't need the IO param *xsave_size.

Makes sense.

> 
> > +}
> > +
> > +static void snp_cpuid_hv(u32 func, u32 subfunc, u32 *eax, u32 *ebx, u32 *ecx,
> > +			 u32 *edx)
> > +{
> > +	/*
> > +	 * MSR protocol does not support fetching indexed subfunction, but is
> > +	 * sufficient to handle current fallback cases. Should that change,
> > +	 * make sure to terminate rather than ignoring the index and grabbing
> > +	 * random values. If this issue arises in the future, handling can be
> > +	 * added here to use GHCB-page protocol for cases that occur late
> > +	 * enough in boot that GHCB page is available.
> > +	 */
> > +	if (cpuid_function_is_indexed(func) && subfunc)
> > +		sev_es_terminate(1, GHCB_TERM_CPUID_HV);
> > +
> > +	if (sev_cpuid_hv(func, 0, eax, ebx, ecx, edx))
> > +		sev_es_terminate(1, GHCB_TERM_CPUID_HV);
> > +}
> > +
> > +static bool
> > +snp_cpuid_find_validated_func(u32 func, u32 subfunc, u32 *eax, u32 *ebx,
> 
> snp_cpuid_get_validated_func()
> 
> > +			      u32 *ecx, u32 *edx)
> > +{
> > +	const struct snp_cpuid_info *cpuid_info = snp_cpuid_info_get_ptr();
> > +	int i;
> > +
> > +	for (i = 0; i < cpuid_info->count; i++) {
> > +		const struct snp_cpuid_fn *fn = &cpuid_info->fn[i];
> > +
> > +		if (fn->eax_in != func)
> > +			continue;
> > +
> > +		if (cpuid_function_is_indexed(func) && fn->ecx_in != subfunc)
> > +			continue;
> > +
> > +		*eax = fn->eax;
> > +		*ebx = fn->ebx;
> > +		*ecx = fn->ecx;
> > +		*edx = fn->edx;
> > +
> > +		return true;
> > +	}
> > +
> > +	return false;
> > +}
> > +
> > +static bool snp_cpuid_check_range(u32 func)
> > +{
> > +	if (func <= cpuid_std_range_max ||
> > +	    (func >= 0x40000000 && func <= cpuid_hyp_range_max) ||
> > +	    (func >= 0x80000000 && func <= cpuid_ext_range_max))
> > +		return true;
> > +
> > +	return false;
> > +}
> > +
> > +static int snp_cpuid_postprocess(u32 func, u32 subfunc, u32 *eax, u32 *ebx,
> > +				 u32 *ecx, u32 *edx)
> 
> I'm wondering if you could make everything a lot easier by doing
> 
> static int snp_cpuid_postprocess(struct cpuid_leaf *leaf)
> 
> and marshall around that struct cpuid_leaf which contains func, subfunc,
> e[abcd]x instead of dealing with 6 parameters.
> 
> Callers of snp_cpuid() can simply allocate it on their stack and hand it
> in and it is all in sev-shared.c so nicely self-contained...
> 
> ...

True, I could have snp_cpuid_find_validated_func() return a pointer to the
actual entry and pass that through to postprocess(). I'll see what that
looks like.
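
For reference, a rough sketch of what passing a single leaf structure
around could look like. The layout of struct cpuid_leaf below is an
assumption (the suggestion only says it holds func, subfunc and e[abcd]x),
struct cpuid_entry and cpuid_lookup() are hypothetical names, and unlike
the patch this simplified lookup always compares the subfunction rather
than only for indexed functions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout: inputs and outputs of one CPUID query in one place. */
struct cpuid_leaf {
	uint32_t fn, subfn;		/* inputs */
	uint32_t eax, ebx, ecx, edx;	/* outputs */
};

/* Hypothetical, simplified table entry. */
struct cpuid_entry {
	uint32_t eax_in, ecx_in;
	uint32_t eax, ebx, ecx, edx;
};

/*
 * Fill in leaf->e[abcd]x from the table. Returns true if an entry for
 * leaf->fn/leaf->subfn was found; otherwise the leaf is left untouched.
 */
static bool cpuid_lookup(const struct cpuid_entry *tbl, size_t count,
			 struct cpuid_leaf *leaf)
{
	for (size_t i = 0; i < count; i++) {
		const struct cpuid_entry *e = &tbl[i];

		if (e->eax_in != leaf->fn || e->ecx_in != leaf->subfn)
			continue;

		leaf->eax = e->eax;
		leaf->ebx = e->ebx;
		leaf->ecx = e->ecx;
		leaf->edx = e->edx;
		return true;
	}

	return false;
}
```

A caller would allocate the leaf on its stack, set fn/subfn, and hand the
same pointer to the lookup and to any post-processing step instead of
threading six parameters through each call.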

> 
> > +/*
> > + * Returns -EOPNOTSUPP if feature not enabled. Any other return value should be
> > + * treated as fatal by caller.
> > + */
> > +static int snp_cpuid(u32 func, u32 subfunc, u32 *eax, u32 *ebx, u32 *ecx,
> > +		     u32 *edx)
> > +{
> > +	if (!snp_cpuid_active())
> > +		return -EOPNOTSUPP;
> 
> And this becomes superfluous.
> 
> > +
> > +	if (!snp_cpuid_find_validated_func(func, subfunc, eax, ebx, ecx, edx)) {
> > +		/*
> > +		 * Some hypervisors will avoid keeping track of CPUID entries
> > +		 * where all values are zero, since they can be handled the
> > +		 * same as out-of-range values (all-zero). This is useful here
> > +		 * as well as it allows virtually all guest configurations to
> > +		 * work using a single SEV-SNP CPUID table.
> > +		 *
> > +		 * To allow for this, there is a need to distinguish between
> > +		 * out-of-range entries and in-range zero entries, since the
> > +		 * CPUID table entries are only a template that may need to be
> > +		 * augmented with additional values for things like
> > +		 * CPU-specific information during post-processing. So if it's
> > +		 * not in the table, but is still in the valid range, proceed
> > +		 * with the post-processing. Otherwise, just return zeros.
> > +		 */
> > +		*eax = *ebx = *ecx = *edx = 0;
> > +		if (!snp_cpuid_check_range(func))
> > +			return 0;
> 
> Do the check first and then assign.
> 
> > +	}
> > +
> > +	return snp_cpuid_postprocess(func, subfunc, eax, ebx, ecx, edx);
> > +}
> > +
> >  /*
> >   * Boot VC Handler - This is the first VC handler during boot, there is no GHCB
> >   * page yet, so it only supports the MSR based communication with the
> > @@ -253,16 +540,26 @@ static int sev_cpuid_hv(u32 func, u32 subfunc, u32 *eax, u32 *ebx,
> >   */
> >  void __init do_vc_no_ghcb(struct pt_regs *regs, unsigned long exit_code)
> >  {
> > +	unsigned int subfn = lower_bits(regs->cx, 32);
> >  	unsigned int fn = lower_bits(regs->ax, 32);
> >  	u32 eax, ebx, ecx, edx;
> > +	int ret;
> >  
> >  	/* Only CPUID is supported via MSR protocol */
> >  	if (exit_code != SVM_EXIT_CPUID)
> >  		goto fail;
> >  
> > +	ret = snp_cpuid(fn, subfn, &eax, &ebx, &ecx, &edx);
> > +	if (ret == 0)
> 
> 	if (!ret)
> 
> > +		goto cpuid_done;
> > +
> > +	if (ret != -EOPNOTSUPP)
> > +		goto fail;
> > +
> >  	if (sev_cpuid_hv(fn, 0, &eax, &ebx, &ecx, &edx))
> >  		goto fail;
> >  
> > +cpuid_done:
> >  	regs->ax = eax;
> >  	regs->bx = ebx;
> >  	regs->cx = ecx;
> > @@ -557,12 +854,35 @@ static enum es_result vc_handle_ioio(struct ghcb *ghcb, struct es_em_ctxt *ctxt)
> >  	return ret;
> >  }
> >  
> > +static int vc_handle_cpuid_snp(struct pt_regs *regs)
> > +{
> > +	u32 eax, ebx, ecx, edx;
> > +	int ret;
> > +
> > +	ret = snp_cpuid(regs->ax, regs->cx, &eax, &ebx, &ecx, &edx);
> > +	if (ret == 0) {
> 
> 	if (!ret)
> 
> > +		regs->ax = eax;
> > +		regs->bx = ebx;
> > +		regs->cx = ecx;
> > +		regs->dx = edx;
> > +	}
> > +
> > +	return ret;
> > +}
> > +
> >  static enum es_result vc_handle_cpuid(struct ghcb *ghcb,
> >  				      struct es_em_ctxt *ctxt)
> >  {
> >  	struct pt_regs *regs = ctxt->regs;
> >  	u32 cr4 = native_read_cr4();
> >  	enum es_result ret;
> > +	int snp_cpuid_ret;
> > +
> > +	snp_cpuid_ret = vc_handle_cpuid_snp(regs);
> > +	if (snp_cpuid_ret == 0)
> 
> 	if (! ... - you get the idea.

Thanks for the suggestions, will get these all implemented unless
otherwise noted above.

-Mike

> 
> 
> 
> -- 
> Regards/Gruss,
>     Boris.
> 
> https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 29/40] x86/compressed/64: add support for SEV-SNP CPUID table in #VC handlers
  2022-01-13 16:39     ` Michael Roth
@ 2022-01-14 16:13       ` Borislav Petkov
  2022-01-18  4:35         ` Michael Roth
  0 siblings, 1 reply; 183+ messages in thread
From: Borislav Petkov @ 2022-01-14 16:13 UTC (permalink / raw)
  To: Michael Roth
  Cc: Brijesh Singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On Thu, Jan 13, 2022 at 10:39:13AM -0600, Michael Roth wrote:
> I was thinking a future hypervisor/spec might make use of this field for
> new functionality, while still wanting to be backward-compatible with
> existing guests, so it would be better to not enforce 0. The firmware
> ABI does indeed document it as must-be-zero,

Maybe there's a good reason for that.

> but that seems to be more of a constraint on what a hypervisor is
> currently allowed to place in the CPUID table, rather than something
> the guest is meant to enforce/rely on.

So imagine whoever creates those, starts putting stuff in those fields.
Then, in the future, the spec decides to rename those reserved/unused
fields into something else and starts putting concrete values in them.
I.e., it starts using them for something.

But, now the spec breaks existing usage because those fields are already
in use. And by then it doesn't matter what the spec says - existing
usage makes it an ABI.

So we start doing expensive and ugly workarounds just so that we don't
break the old, undocumented use which the spec simply silently allowed,
and accommodate that new feature the spec adds.

So no, what you're thinking is a bad bad idea.

> snp_cpuid_info_create() (which sets snp_cpuid_initialized) only gets
> called if firmware indicates this is an SNP guest (via the cc_blob), but
> the #VC handler still needs to know whether or not it should use the SNP
> CPUID table, since SEV-ES will still make use of the handler, so it uses
> snp_cpuid_active() to make that determination.

So I went and applied the rest of the series. And I think you mean
do_vc_no_ghcb() and it doing snp_cpuid().

Then, looking at sev_enable() and it calling snp_init(), you fail
further init if there's any discrepancy in the supplied data - CPUID,
SEV status MSR, etc.

So, practically, what you wanna test in all those places is whether
you're a SNP guest or not. Which we already have:

	sev_status & MSR_AMD64_SEV_SNP_ENABLED

so, unless I'm missing something, you don't need yet another
<bla>_active() helper.

> This code is calculating the total XSAVE buffer size for whatever
> features are enabled by the guest's XCR0/XSS registers. Those feature
> bits correspond to the 0xD subleaves 2-63, which advertise the buffer
> size for each particular feature. So that check needs to ignore anything
> outside that range (including 0xD subleaves 0 and 1, which would normally
> provide this total size dynamically based on current values of XCR0/XSS,
> but here are instead calculated "manually" since we are not relying on
> the XCR0_IN/XSS_IN fields in the table, due to the reasons mentioned
> earlier in this thread).

Yah, the gist of that needs to be as a comment of that line as it is not
obvious (at least to me).

> Not duplicate entries (though there's technically nothing in the spec
> that says you can't), but I was more concerned here with multiple
> entries corresponding to different combinations of XCR0_IN/XSS_IN.
> There's no good reason for a hypervisor to use those fields for anything
> other than 0xD subleaves 0 and 1, but a hypervisor could in theory encode
> 1 "duplicate" sub-leaf for each possible combination of XCR0_IN/XSS_IN,
> similar to what it might do for subleaves 0 and 1, and not violate the
> spec.


Ditto. Also a comment on top please.

> The current spec is a bit open-ended in some of these areas so the guest
> code is trying to be as agnostic as possible to the underlying implementation
> so there's less chance of breakage running on one hypervisor versus
> another. We're working on updating the spec to encourage better
> interoperability, but that would likely only be enforceable for future
> firmware versions/guests.

This has the same theoretical problem as the reserved/unused fields. If
you don't enforce it, people will do whatever and once it is implemented
in hypervisors and it has become an ABI, you can't change it anymore.

So I'd very strongly suggest you tighten it up upfront and only allow
stuff later, when it makes sense. Not the other way around.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 30/40] x86/boot: add a pointer to Confidential Computing blob in bootparams
  2021-12-10 15:43 ` [PATCH v8 30/40] x86/boot: add a pointer to Confidential Computing blob in bootparams Brijesh Singh
@ 2022-01-17 18:14   ` Borislav Petkov
  0 siblings, 0 replies; 183+ messages in thread
From: Borislav Petkov @ 2022-01-17 18:14 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	tony.luck, marcorr, sathyanarayanan.kuppuswamy

On Fri, Dec 10, 2021 at 09:43:22AM -0600, Brijesh Singh wrote:
> diff --git a/arch/x86/include/uapi/asm/bootparam.h b/arch/x86/include/uapi/asm/bootparam.h
> index 1ac5acca72ce..bea5cdcdf532 100644
> --- a/arch/x86/include/uapi/asm/bootparam.h
> +++ b/arch/x86/include/uapi/asm/bootparam.h
> @@ -188,7 +188,8 @@ struct boot_params {
>  	__u32 ext_ramdisk_image;			/* 0x0c0 */
>  	__u32 ext_ramdisk_size;				/* 0x0c4 */
>  	__u32 ext_cmd_line_ptr;				/* 0x0c8 */
> -	__u8  _pad4[116];				/* 0x0cc */
> +	__u8  _pad4[112];				/* 0x0cc */
> +	__u32 cc_blob_address;				/* 0x13c */
>  	struct edid_info edid_info;			/* 0x140 */
>  	struct efi_info efi_info;			/* 0x1c0 */
>  	__u32 alt_mem_k;				/* 0x1e0 */

Yes, you said that this is a boot/compressed stage -> kernel proper info
pass field but let's document it anyway, please, and say what it is,
just like:

	1E4/004         ALL     scratch                 Scratch field for the kernel setup code

is documented, for example.

And now that I look at it, acpi_rsdp_addr isn't documented either so if
you wanna add it too, while you're at it, that would be nice.
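
As a quick sanity check on the offsets in the hunk above: shrinking _pad4
from 116 to 112 bytes puts the new field at 0x0cc + 112 = 0x13c, right
before edid_info at 0x140. The sketch below verifies that arithmetic with
a reduced stand-in struct. struct bp_slice and BP_SLICE_BASE are
hypothetical names, and the struct only models the region starting at
0x0c0, not the real struct boot_params.

```c
#include <stddef.h>
#include <stdint.h>

/* Offset of the first modelled field within the real structure. */
#define BP_SLICE_BASE 0x0c0

/* Hypothetical stand-in for the boot_params region 0x0c0..0x13f. */
struct bp_slice {
	uint32_t ext_ramdisk_image;	/* 0x0c0 */
	uint32_t ext_ramdisk_size;	/* 0x0c4 */
	uint32_t ext_cmd_line_ptr;	/* 0x0c8 */
	uint8_t  _pad4[112];		/* 0x0cc */
	uint32_t cc_blob_address;	/* 0x13c */
};

/* The padding still starts at 0x0cc ... */
_Static_assert(BP_SLICE_BASE + offsetof(struct bp_slice, _pad4) == 0x0cc,
	       "_pad4 offset changed");
/* ... the new field lands at 0x13c ... */
_Static_assert(BP_SLICE_BASE + offsetof(struct bp_slice, cc_blob_address) == 0x13c,
	       "cc_blob_address is not at 0x13c");
/* ... and whatever follows still begins at 0x140. */
_Static_assert(BP_SLICE_BASE + sizeof(struct bp_slice) == 0x140,
	       "following field would no longer start at 0x140");
```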

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 29/40] x86/compressed/64: add support for SEV-SNP CPUID table in #VC handlers
  2022-01-14 16:13       ` Borislav Petkov
@ 2022-01-18  4:35         ` Michael Roth
  2022-01-18 14:02           ` Borislav Petkov
  0 siblings, 1 reply; 183+ messages in thread
From: Michael Roth @ 2022-01-18  4:35 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Brijesh Singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On Fri, Jan 14, 2022 at 05:13:30PM +0100, Borislav Petkov wrote:
> On Thu, Jan 13, 2022 at 10:39:13AM -0600, Michael Roth wrote:
> > I was thinking a future hypervisor/spec might make use of this field for
> > new functionality, while still wanting to be backward-compatible with
> > existing guests, so it would be better to not enforce 0. The firmware
> > ABI does indeed document it as must-be-zero,
> 
> Maybe there's a good reason for that.
> 
> > but that seems to be more of a constraint on what a hypervisor is
> > currently allowed to place in the CPUID table, rather than something
> > the guest is meant to enforce/rely on.
> 
> So imagine whoever creates those, starts putting stuff in those fields.
> Then, in the future, the spec decides to rename those reserved/unused
> fields into something else and starts putting concrete values in them.
> I.e., it starts using them for something.
> 
> But, now the spec breaks existing usage because those fields are already
> in use. And by then it doesn't matter what the spec says - existing
> usage makes it an ABI.
> 
> So we start doing expensive and ugly workarounds just so that we don't
> break the old, undocumented use which the spec simply silently allowed,
> and accommodate that new feature the spec adds.
> 
> So no, what you're thinking is a bad bad idea.

I can certainly see that argument. I'll add checks to enforce these
cases. If it breaks an existing hypervisor implementation, it's probably
better to have that happen early based on this initial reference
implementation rather than down the road when we actually need these
fields for something.
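
A minimal sketch of what such enforcement might look like, assuming a
table entry with a single must-be-zero field. The struct layout, the
field name "reserved" and the function name are all hypothetical; the
actual fields to check are the ones the firmware ABI documents as
must-be-zero.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical entry layout for illustration; field names are assumptions. */
struct cpuid_entry {
	uint32_t eax_in, ecx_in;
	uint32_t eax, ebx, ecx, edx;
	uint64_t reserved;	/* documented as must-be-zero */
};

/*
 * Reject the whole table if any entry carries a non-zero reserved field,
 * so that nobody can start depending on values placed there before the
 * spec assigns them a meaning.
 */
static bool cpuid_table_valid(const struct cpuid_entry *tbl, size_t count)
{
	for (size_t i = 0; i < count; i++) {
		if (tbl[i].reserved)
			return false;
	}

	return true;
}
```

A guest would run a check like this once when the table is first copied
and refuse to continue on failure, rather than checking on every lookup.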

> 
> > snp_cpuid_info_create() (which sets snp_cpuid_initialized) only gets
> > called if firmware indicates this is an SNP guest (via the cc_blob), but
> > the #VC handler still needs to know whether or not it should use the SNP
> > CPUID table, since SEV-ES will still make use of the handler, so it uses
> > snp_cpuid_active() to make that determination.
> 
> So I went and applied the rest of the series. And I think you mean
> do_vc_no_ghcb() and it doing snp_cpuid().

Yes that's correct.

> 
> Then, looking at sev_enable() and it calling snp_init(), you fail
> further init if there's any discrepancy in the supplied data - CPUID,
> SEV status MSR, etc.
> 
> So, practically, what you wanna test in all those places is whether
> you're a SNP guest or not. Which we already have:
> 
> 	sev_status & MSR_AMD64_SEV_SNP_ENABLED
> 
> so, unless I'm missing something, you don't need yet another
> <bla>_active() helper.

Unfortunately, in sev_enable(), between the point where snp_init() is
called, and sev_status is actually set, there are a number of cpuid
instructions which will make use of do_vc_no_ghcb() prior to sev_status
being set (and it needs to happen in that order to set sev_status
appropriately). After that point, snp_cpuid_active() would no longer be
necessary, but during that span some indicator is needed in case this
is just an SEV-ES guest triggering cpuid #VCs.

> 
> > This code is calculating the total XSAVE buffer size for whatever
> > features are enabled by the guest's XCR0/XSS registers. Those feature
> > bits correspond to the 0xD subleaves 2-63, which advertise the buffer
> > size for each particular feature. So that check needs to ignore anything
> > outside that range (including 0xD subleaves 0 and 1, which would normally
> > provide this total size dynamically based on current values of XCR0/XSS,
> > but here are instead calculated "manually" since we are not relying on
> > the XCR0_IN/XSS_IN fields in the table, due to the reasons mentioned
> > earlier in this thread).
> 
> Yah, the gist of that needs to be as a comment of that line as it is not
> obvious (at least to me).
> 
> > Not duplicate entries (though there's technically nothing in the spec
> > that says you can't), but I was more concerned here with multiple
> > entries corresponding to different combinations of XCR0_IN/XSS_IN.
> > There's no good reason for a hypervisor to use those fields for anything
> > other than 0xD subleaves 0 and 1, but a hypervisor could in theory encode
> > 1 "duplicate" sub-leaf for each possible combination of XCR0_IN/XSS_IN,
> > similar to what it might do for subleaves 0 and 1, and not violate the
> > spec.
> 
> 
> Ditto. Also a comment on top please.

Will do.

> 
> > The current spec is a bit open-ended in some of these areas so the guest
> > code is trying to be as agnostic as possible to the underlying implementation
> > so there's less chance of breakage running on one hypervisor versus
> > another. We're working on updating the spec to encourage better
> > interoperability, but that would likely only be enforceable for future
> > firmware versions/guests.
> 
> This has the same theoretical problem as the reserved/unused fields. If
> you don't enforce it, people will do whatever and once it is implemented
> in hypervisors and it has become an ABI, you can't change it anymore.
> 
> So I'd very strongly suggest you tighten it up upfront and only allow
> stuff later, when it makes sense. Not the other way around.

Yah, I was trying to avoid causing issues for other early
implementations trying to make do with the current open-ended spec, but
that's probably a losing battle and it's probably better to try to get
everyone using the proposed reference implementation early on. I'll tighten
the code up in this regard and also send a new version of reference
implementation document to SNP mailing list for awareness.

> 
> Thx.
> 
> -- 
> Regards/Gruss,
>     Boris.
> 
> https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 29/40] x86/compressed/64: add support for SEV-SNP CPUID table in #VC handlers
  2022-01-18  4:35         ` Michael Roth
@ 2022-01-18 14:02           ` Borislav Petkov
  2022-01-18 14:23             ` Michael Roth
  0 siblings, 1 reply; 183+ messages in thread
From: Borislav Petkov @ 2022-01-18 14:02 UTC (permalink / raw)
  To: Michael Roth
  Cc: Brijesh Singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On Mon, Jan 17, 2022 at 10:35:21PM -0600, Michael Roth wrote:
> Unfortunately, in sev_enable(), between the point where snp_init() is
> called, and sev_status is actually set, there are a number of cpuid
> instructions which will make use of do_vc_no_ghcb() prior to sev_status
> being set (and it needs to happen in that order to set sev_status
> appropriately). After that point, snp_cpuid_active() would no longer be
> necessary, but during that span some indicator is needed in case this
> is just an SEV-ES guest triggering cpuid #VCs.

You mean testing what snp_cpuid_info_create() set up is not enough?

diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c
index 7bc7e297f88c..17cfe804bad3 100644
--- a/arch/x86/kernel/sev-shared.c
+++ b/arch/x86/kernel/sev-shared.c
@@ -523,7 +523,9 @@ static int snp_cpuid_postprocess(u32 func, u32 subfunc, u32 *eax, u32 *ebx,
 static int snp_cpuid(u32 func, u32 subfunc, u32 *eax, u32 *ebx, u32 *ecx,
 		     u32 *edx)
 {
-	if (!snp_cpuid_active())
+	const struct snp_cpuid_info *c = snp_cpuid_info_get_ptr();
+
+	if (!c->count)
 		return -EOPNOTSUPP;
 
 	if (!snp_cpuid_find_validated_func(func, subfunc, eax, ebx, ecx, edx)) {

---

Btw, all those

        /* SEV-SNP CPUID table should be set up now. */
        if (!snp_cpuid_active())
                sev_es_terminate(1, GHCB_TERM_CPUID);

after snp_cpuid_info_create() has returned are useless too. If that
function returns, you know you're good to go wrt SNP.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply related	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 29/40] x86/compressed/64: add support for SEV-SNP CPUID table in #VC handlers
  2022-01-18 14:02           ` Borislav Petkov
@ 2022-01-18 14:23             ` Michael Roth
  2022-01-18 14:32               ` Michael Roth
  0 siblings, 1 reply; 183+ messages in thread
From: Michael Roth @ 2022-01-18 14:23 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Brijesh Singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On Tue, Jan 18, 2022 at 03:02:48PM +0100, Borislav Petkov wrote:
> On Mon, Jan 17, 2022 at 10:35:21PM -0600, Michael Roth wrote:
> > Unfortunately, in sev_enable(), between the point where snp_init() is
> > called, and sev_status is actually set, there are a number of cpuid
> > instructions which will make use of do_vc_no_ghcb() prior to sev_status
> > being set (and it needs to happen in that order to set sev_status
> > appropriately). After that point, snp_cpuid_active() would no longer be
> > necessary, but during that span some indicator is needed in case this
> > is just an SEV-ES guest triggering cpuid #VCs.
> 
> You mean testing what snp_cpuid_info_create() set up is not enough?
> 
> diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c
> index 7bc7e297f88c..17cfe804bad3 100644
> --- a/arch/x86/kernel/sev-shared.c
> +++ b/arch/x86/kernel/sev-shared.c
> @@ -523,7 +523,9 @@ static int snp_cpuid_postprocess(u32 func, u32 subfunc, u32 *eax, u32 *ebx,
>  static int snp_cpuid(u32 func, u32 subfunc, u32 *eax, u32 *ebx, u32 *ecx,
>  		     u32 *edx)
>  {
> -	if (!snp_cpuid_active())
> +	const struct snp_cpuid_info *c = snp_cpuid_info_get_ptr();
> +
> +	if (!c->count)
>  		return -EOPNOTSUPP;
>  
>  	if (!snp_cpuid_find_validated_func(func, subfunc, eax, ebx, ecx, edx)) {

snp_cpuid_info_get_ptr() will always return non-NULL, since it's a
pointer to the local copy of the cpuid page. But I can probably re-work
it slightly so that snp_cpuid_info_get_ptr() does the same check that
snp_cpuid_active() does, and have it return NULL if the copy hasn't been
initialized, if that seems preferable to the separate snp_cpuid_active()
function.

> 
> ---
> 
> Btw, all those
> 
>         /* SEV-SNP CPUID table should be set up now. */
>         if (!snp_cpuid_active())
>                 sev_es_terminate(1, GHCB_TERM_CPUID);
> 
> after snp_cpuid_info_create() has returned are useless too. If that
> function returns, you know you're good to go wrt SNP.

It seemed like a good thing to assert in case something slipped in later
that tried to use snp_cpuid() without the table being initialized, but
if I implement things the way you suggested above,
snp_cpuid_info_get_ptr() will return NULL in that case, so we get that
assurance for free. That does sound cleaner.

> 
> Thx.
> 
> -- 
> Regards/Gruss,
>     Boris.
> 
> https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 29/40] x86/compressed/64: add support for SEV-SNP CPUID table in #VC handlers
  2022-01-18 14:23             ` Michael Roth
@ 2022-01-18 14:32               ` Michael Roth
  2022-01-18 14:37                 ` Michael Roth
  0 siblings, 1 reply; 183+ messages in thread
From: Michael Roth @ 2022-01-18 14:32 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Brijesh Singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On Tue, Jan 18, 2022 at 08:23:45AM -0600, Michael Roth wrote:
> On Tue, Jan 18, 2022 at 03:02:48PM +0100, Borislav Petkov wrote:
> > On Mon, Jan 17, 2022 at 10:35:21PM -0600, Michael Roth wrote:
> > > Unfortunately, in sev_enable(), between the point where snp_init() is
> > > called, and sev_status is actually set, there are a number of cpuid
> > > instructions which will make use of do_vc_no_ghcb() prior to sev_status
> > > being set (and it needs to happen in that order to set sev_status
> > > appropriately). After that point, snp_cpuid_active() would no longer be
> > > necessary, but during that span some indicator is needed in case this
> > > is just an SEV-ES guest triggering cpuid #VCs.
> > 
> > You mean testing what snp_cpuid_info_create() set up is not enough?
> > 
> > diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c
> > index 7bc7e297f88c..17cfe804bad3 100644
> > --- a/arch/x86/kernel/sev-shared.c
> > +++ b/arch/x86/kernel/sev-shared.c
> > @@ -523,7 +523,9 @@ static int snp_cpuid_postprocess(u32 func, u32 subfunc, u32 *eax, u32 *ebx,
> >  static int snp_cpuid(u32 func, u32 subfunc, u32 *eax, u32 *ebx, u32 *ecx,
> >  		     u32 *edx)
> >  {
> > -	if (!snp_cpuid_active())
> > +	const struct snp_cpuid_info *c = snp_cpuid_info_get_ptr();
> > +
> > +	if (!c->count)
> >  		return -EOPNOTSUPP;
> >  
> >  	if (!snp_cpuid_find_validated_func(func, subfunc, eax, ebx, ecx, edx)) {
> 
> snp_cpuid_info_get_ptr() will always return non-NULL, since it's a
> pointer to the local copy of the cpuid page. But I can probably re-work

Doh, misread your patch. Yes I think checking the count would also work,
since a valid table should be non-zero.

^ permalink raw reply	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 29/40] x86/compressed/64: add support for SEV-SNP CPUID table in #VC handlers
  2022-01-18 14:32               ` Michael Roth
@ 2022-01-18 14:37                 ` Michael Roth
  2022-01-18 16:34                   ` Borislav Petkov
  0 siblings, 1 reply; 183+ messages in thread
From: Michael Roth @ 2022-01-18 14:37 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Brijesh Singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On Tue, Jan 18, 2022 at 08:32:38AM -0600, Michael Roth wrote:
> On Tue, Jan 18, 2022 at 08:23:45AM -0600, Michael Roth wrote:
> > On Tue, Jan 18, 2022 at 03:02:48PM +0100, Borislav Petkov wrote:
> > > On Mon, Jan 17, 2022 at 10:35:21PM -0600, Michael Roth wrote:
> > > > Unfortunately, in sev_enable(), between the point where snp_init() is
> > > > called, and sev_status is actually set, there are a number of cpuid
> > > > instructions which will make use of do_vc_no_ghcb() prior to sev_status
> > > > being set (and it needs to happen in that order to set sev_status
> > > > appropriately). After that point, snp_cpuid_active() would no longer be
> > > > necessary, but during that span some indicator is needed in case this
> > > > is just an SEV-ES guest triggering cpuid #VCs.
> > > 
> > > You mean testing what snp_cpuid_info_create() set up is not enough?
> > > 
> > > diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c
> > > index 7bc7e297f88c..17cfe804bad3 100644
> > > --- a/arch/x86/kernel/sev-shared.c
> > > +++ b/arch/x86/kernel/sev-shared.c
> > > @@ -523,7 +523,9 @@ static int snp_cpuid_postprocess(u32 func, u32 subfunc, u32 *eax, u32 *ebx,
> > >  static int snp_cpuid(u32 func, u32 subfunc, u32 *eax, u32 *ebx, u32 *ecx,
> > >  		     u32 *edx)
> > >  {
> > > -	if (!snp_cpuid_active())
> > > +	const struct snp_cpuid_info *c = snp_cpuid_info_get_ptr();
> > > +
> > > +	if (!c->count)
> > >  		return -EOPNOTSUPP;
> > >  
> > >  	if (!snp_cpuid_find_validated_func(func, subfunc, eax, ebx, ecx, edx)) {
> > 
> > snp_cpuid_info_get_ptr() will always return non-NULL, since it's a
> > pointer to the local copy of the cpuid page. But I can probably re-work
> 
> Doh, misread your patch. Yes I think checking the count would also work,
> since a valid table should be non-zero.

Actually, no, because doing that would give the hypervisor a means to
effectively disable the CPUID page for an SNP guest by providing a table
with count == 0, which needs to be guarded against.

But can still implement something similar by having snp_cpuid_info_get_ptr()
return NULL if local copy of cpuid page hasn't been initialized.
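
A minimal sketch of that idea, assuming a separate "initialized" flag
rather than 'count' as the indicator. All names here (cpuid_table,
cpuid_table_init, cpuid_table_get, CPUID_TABLE_MAX) are hypothetical and
simplified, and the sketch ignores the RIP-relative addressing the early
boot code needs:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CPUID_TABLE_MAX 64	/* assumed capacity, for illustration only */

/* Simplified entry layout; the real CPUID page format has more fields. */
struct cpuid_entry {
	uint32_t func, subfunc;
	uint32_t eax, ebx, ecx, edx;
};

struct cpuid_table {
	uint32_t count;
	struct cpuid_entry entries[CPUID_TABLE_MAX];
};

static struct cpuid_table table_copy;	/* local copy of the CPUID page */
static bool table_copy_valid;		/* set once the copy is populated */

/* Populate the local copy; reject tables that are empty or too large. */
static int cpuid_table_init(const struct cpuid_table *fw)
{
	if (!fw || fw->count == 0 || fw->count > CPUID_TABLE_MAX)
		return -1;	/* a real guest would terminate here instead */

	memcpy(&table_copy, fw, sizeof(table_copy));
	table_copy_valid = true;
	return 0;
}

/* NULL means "no CPUID table", independent of what 'count' holds. */
static const struct cpuid_table *cpuid_table_get(void)
{
	return table_copy_valid ? &table_copy : NULL;
}
```

With this shape, callers only need a NULL check, and a hypervisor-supplied
count == 0 never reaches them because initialization fails first.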

^ permalink raw reply	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 29/40] x86/compressed/64: add support for SEV-SNP CPUID table in #VC handlers
  2022-01-18 14:37                 ` Michael Roth
@ 2022-01-18 16:34                   ` Borislav Petkov
  2022-01-18 17:20                     ` Michael Roth
  0 siblings, 1 reply; 183+ messages in thread
From: Borislav Petkov @ 2022-01-18 16:34 UTC (permalink / raw)
  To: Michael Roth
  Cc: Brijesh Singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On Tue, Jan 18, 2022 at 08:37:30AM -0600, Michael Roth wrote:
> Actually, no, because doing that would give the hypervisor a means to
> effectively disable the CPUID page for an SNP guest by providing a table
> with count == 0, which needs to be guarded against.

Err, I'm confused.

Isn't that "SEV-SNP guests will be provided the location of special
'secrets' 'CPUID' pages via the Confidential Computing blob..." and the
HV has no say in there?

Why does the HV provide the CPUID page?

And when I read "secrets page" I think, encrypted/signed and given
directly to the guest, past the HV which cannot even touch it.

Hmmm.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 29/40] x86/compressed/64: add support for SEV-SNP CPUID table in #VC handlers
  2022-01-18 16:34                   ` Borislav Petkov
@ 2022-01-18 17:20                     ` Michael Roth
  2022-01-18 17:41                       ` Borislav Petkov
  0 siblings, 1 reply; 183+ messages in thread
From: Michael Roth @ 2022-01-18 17:20 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Brijesh Singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On Tue, Jan 18, 2022 at 05:34:49PM +0100, Borislav Petkov wrote:
> On Tue, Jan 18, 2022 at 08:37:30AM -0600, Michael Roth wrote:
> > Actually, no, because doing that would give the hypervisor a means to
> > effectively disable the CPUID page for an SNP guest by providing a table
> > with count == 0, which needs to be guarded against.
> 
> Err, I'm confused.
> 
> Isn't that "SEV-SNP guests will be provided the location of special
> 'secrets' 'CPUID' pages via the Confidential Computing blob..." and the
> HV has no say in there?
> 
> Why does the HV provide the CPUID page?

The HV fills out the initial contents of the CPUID page, which includes
the count. SNP/PSP firmware will validate the contents the HV tries to put
in the initial page, but does not currently enforce that the 'count' field
is non-zero. So we can't rely on the 'count' field as an indicator of
whether or not the CPUID page is active, we need to rely on the presence
of the ccblob as the true indicator, then treat a non-zero 'count' field
as an invalid state.

I think we discussed this to some extent in the past. The following
document was added to clarify the security model for CPUID page:

https://lore.kernel.org/lkml/20211210154332.11526-29-brijesh.singh@amd.com/

> 
> And when I read "secrets page" I think, encrypted/signed and given
> directly to the guest, past the HV which cannot even touch it.

The CPUID page is also encrypted, but its initial contents come from the
HV, which are then passed through the PSP for initial validation before
being placed in the CPUID page. But the count==0 case is not disallowed
by the PSP firmware, so can't be relied upon as a means to indicate that
the CPUID page is not active.

> 
> Hmmm.
> 
> -- 
> Regards/Gruss,
>     Boris.
> 
> https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 29/40] x86/compressed/64: add support for SEV-SNP CPUID table in #VC handlers
  2022-01-18 17:20                     ` Michael Roth
@ 2022-01-18 17:41                       ` Borislav Petkov
  2022-01-18 18:49                         ` Michael Roth
  0 siblings, 1 reply; 183+ messages in thread
From: Borislav Petkov @ 2022-01-18 17:41 UTC (permalink / raw)
  To: Michael Roth
  Cc: Brijesh Singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On Tue, Jan 18, 2022 at 11:20:43AM -0600, Michael Roth wrote:
> The HV fills out the initial contents of the CPUID page, which includes
> the count. SNP/PSP firmware will validate the contents the HV tries to put
> in the initial page, but does not currently enforce that the 'count' field
> is non-zero.

So if the HV sets count to 0, then the PSP can validate all it wants but
you basically don't have a CPUID page. And that's a pretty easy way to
defeat it, if you ask me.

So, if it is too late to change this, I guess the only way out of here
is to terminate the guest on count == 0.

And regardless, what if the HV fakes the count - how do you figure out
what the proper count is? You go and read the whole CPUID page and try
to make sense of what's there, even beyond the "last" function leaf.

> So we can't rely on the 'count' field as an indicator of whether or
> not the CPUID page is active, we need to rely on the presence of the
> ccblob as the true indicator, then treat a non-zero 'count' field as
> an invalid state.

treat a non-zero count field as invalid?

You mean, "a zero count" maybe...

But see above, how do you check whether the HV hasn't "hidden" some
entries by modifying the count field?

Either I'm missing something or this sounds really weird...

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 29/40] x86/compressed/64: add support for SEV-SNP CPUID table in #VC handlers
  2022-01-18 17:41                       ` Borislav Petkov
@ 2022-01-18 18:49                         ` Michael Roth
  2022-01-19  1:18                           ` Michael Roth
  0 siblings, 1 reply; 183+ messages in thread
From: Michael Roth @ 2022-01-18 18:49 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Brijesh Singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On Tue, Jan 18, 2022 at 06:41:16PM +0100, Borislav Petkov wrote:
> On Tue, Jan 18, 2022 at 11:20:43AM -0600, Michael Roth wrote:
> > The HV fills out the initial contents of the CPUID page, which includes
> > the count. SNP/PSP firmware will validate the contents the HV tries to put
> > in the initial page, but does not currently enforce that the 'count' field
> > is non-zero.
> 
> So if the HV sets count to 0, then the PSP can validate all it wants but
> you basically don't have a CPUID page. And that's a pretty easy way to
> defeat it, if you ask me.
> 
> So, if it is too late to change this, I guess the only way out of here
> is to terminate the guest on count == 0.

Right, which is already enforced as part of snp_cpuid_info_create(). So
snp_cpuid_info->count will always be non-zero for SNP guests...

Er... so I suppose we *could* use snp_cpuid_info->count==0 as an indicator
that the cpuid page isn't enabled after all... since if this was an SNP guest
then count==0 would've caused it to terminate...

Sorry I missed that, early versions of the code used count==0 as
indicator to bypass CPUID page before we realized that wasn't safe, and
I'd avoided relying on count==0 for anything since then. But in the
current code it should work, since count==0 causes SNP guests to
terminate, so a running guest with count==0 is clearly non-SNP.

> 
> And regardless, what if the HV fakes the count - how do you figure out
> what the proper count is? You go and read the whole CPUID page and try
> to make sense of what's there, even beyond the "last" function leaf.

The current code trusts the count value, as long as it is within the
bounds of the CPUID page. If the hypervisor provides a count that is
higher or lower than the number of entries added to the table, the PSP
will fail the guest launch.

count==0 is sort of a special case because of the reasons above, and
since it is never a valid CPUID configuration, so makes sense to
guard against.


> 
> > So we can't rely on the 'count' field as an indicator of whether or
> > not the CPUID page is active, we need to rely on the presence of the
> > ccblob as the true indicator, then treat a non-zero 'count' field as
> > an invalid state.
> 
> treat a non-zero count field as invalid?
> 
> You mean, "a zero count" maybe...

Yes, sorry for the confusion.

> 
> But see above, how do you check whether the HV hasn't "hidden" some
> entries by modifying the count field?
> 
> Either I'm missing something or this sounds really weird...

Yes, that's my fault. count must match the actual number of entries in
the table in all cases. If count==0 then there must also be no entries
in the table. count==0 is only special in that code might erroneously
decide to treat it as an indicator that cpuid table isn't enabled at
all, but since that case causes termination it should actually be ok.

Though I wonder if we should do something like this to still keep
callers from relying on checking count==0 directly:

  static const struct snp_cpuid_info *
  snp_cpuid_info_get_ptr(void)
  {
          const struct snp_cpuid_info *cpuid_info;
          void *ptr;
  
          /*
           * This may be called early while still running on the initial identity
           * mapping. Use RIP-relative addressing to obtain the correct address
           * both for the identity mapping and after switch-over to kernel
           * virtual addresses.
           */
          asm ("lea cpuid_info_copy(%%rip), %0"
               : "=r" (ptr)
               : "p" (&cpuid_info_copy));
  
          cpuid_info = ptr;
          if (cpuid_info->count == 0)
                  return NULL;
  
          return cpuid_info;
  }

Because then it's impossible for a caller to accidentally misconstrue
what count==0 means (0 entries? or cpuid table not present?), since the
table then simply becomes inaccessible for anything other than an SNP
guest, and callers just need a NULL check (and will get a free hint
(crash) if they don't).

> 
> -- 
> Regards/Gruss,
>     Boris.
> 
> https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 29/40] x86/compressed/64: add support for SEV-SNP CPUID table in #VC handlers
  2022-01-18 18:49                         ` Michael Roth
@ 2022-01-19  1:18                           ` Michael Roth
  2022-01-19 11:17                             ` Borislav Petkov
  0 siblings, 1 reply; 183+ messages in thread
From: Michael Roth @ 2022-01-19  1:18 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Brijesh Singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On Tue, Jan 18, 2022 at 12:49:30PM -0600, Michael Roth wrote:
> On Tue, Jan 18, 2022 at 06:41:16PM +0100, Borislav Petkov wrote:
> > On Tue, Jan 18, 2022 at 11:20:43AM -0600, Michael Roth wrote:
> > > The HV fills out the initial contents of the CPUID page, which includes
> > > the count. SNP/PSP firmware will validate the contents the HV tries to put
> > > in the initial page, but does not currently enforce that the 'count' field
> > > is non-zero.
> > 
> > And regardless, what if the HV fakes the count - how do you figure out
> > what the proper count is? You go and read the whole CPUID page and try
> > to make sense of what's there, even beyond the "last" function leaf.
> 
> 

<snip>

> > 
> > But see above, how do you check whether the HV hasn't "hidden" some
> > entries by modifying the count field?
> > 
> > Either I'm missing something or this sounds really weird...
> 
> ...count must match the actual number of entries in the table in all
> cases.

Turns out in my testing earlier there was a separate check that was
causing the PSP to fail, so I re-tested the behavior, and things are
actually a bit more interesting, but nothing too concerning:

If 'fake_count'/'reported_count' is greater than the actual number of
entries in the table, 'actual_count', then all table entries up to
'fake_count' will also need to pass validation. Generally the table
will be zero'd out initially, so those additional/bogus entries will
be interpreted as CPUID leaves where all fields are 0. Unfortunately,
that's still considered a valid leaf, even if it's a duplicate of the
*actual* 0x0 leaf present earlier in the table. The current code will
handle this fine, since it scans the table in order, and uses the
valid 0x0 leaf earlier in the table.
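
To illustrate why an in-order scan tolerates those trailing all-zero
entries, here is a small sketch of a first-match lookup. The struct layout
and the function name cpuid_find_first() are simplified assumptions, not
the actual table format or kernel helper:

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified entry layout; the real CPUID page format has more fields. */
struct cpuid_entry {
	uint32_t func, subfunc;
	uint32_t eax, ebx, ecx, edx;
};

/*
 * Scan the first 'count' entries in order and return the first one that
 * matches func/subfunc, or NULL if there is none. Later duplicates of an
 * already-matched leaf are never reached.
 */
static const struct cpuid_entry *cpuid_find_first(const struct cpuid_entry *tbl,
						  uint32_t count,
						  uint32_t func, uint32_t subfunc)
{
	for (uint32_t i = 0; i < count; i++) {
		if (tbl[i].func == func && tbl[i].subfunc == subfunc)
			return &tbl[i];
	}

	return NULL;
}
```

A lookup for leaf 0x0 returns the genuine entry at the front of the table,
so zeroed entries further down only matter if the genuine leaf is missing.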

This isn't really a special case though, it falls under the general
category of a hypervisor inserting garbage entries that happen to pass
validation, but don't reflect values that a guest would normally see.
This will be detectable as part of guest owner attestation, since the
guest code is careful to guarantee that the values seen after boot,
once the attestation stage is reached, will be identical to the values
seen during boot, so if this sort of manipulation of CPUID values
occurred, the guest owner will notice this during attestation, and can
abort the boot at that point. The Documentation patch addresses this
in more detail.

If 'fake_count' is less than 'actual_count', then the PSP skips
validation for anything >= 'fake_count', and leaves them in the table.
That should also be fine though, since guest code should never exceed
'fake_count'/'reported_count', as that's a blatant violation of the
spec, and it doesn't make any sense for a guest to do this. This will
effectively 'hide' entries, but those resulting missing CPUID leaves
will be noticeable to the guest owner once attestation phase is
reached.

This does all highlight the need for some very thorough guidelines
on how a guest owner should implement their attestation checks for
cpuid, however. I think a section in the reference implementation
notes/document that covers this would be a good starting point. I'll
also check with the PSP team on tightening up some of these CPUID
page checks to rule out some of these possibilities in the future.

> in the table. count==0 is only special in that code might erroneously
> decide to treat it as an indicator that cpuid table isn't enabled at
> all, but since that case causes termination it should actually be ok.
> 
> Though I wonder if we should do something like this to still keep
> callers from relying on checking count==0 directly:
> 
>   static const struct snp_cpuid_info *
>   snp_cpuid_info_get_ptr(void)
>   {
>           const struct snp_cpuid_info *cpuid_info;
>           void *ptr;
>   
>           /*
>            * This may be called early while still running on the initial identity
>            * mapping. Use RIP-relative addressing to obtain the correct address
>            * both for the identity mapping and after switch-over to kernel
>            * virtual addresses.
>            */
>           asm ("lea cpuid_info_copy(%%rip), %0"
>                : "=r" (ptr)
>                : "p" (&cpuid_info_copy));
>   
>           cpuid_info = ptr;
>           if (cpuid_info->count == 0)
>                   return NULL;
>   
>           return cpuid_info;
>   }

Nevermind, that doesn't work since snp_cpuid_info_get_ptr() is also called
by snp_cpuid_info_create() *prior* to initializing the table, so it ends
up seeing cpuid->count==0 and fails right away. So your initial suggestion
of checking cpuid->count==0 at the call-sites to determine if the table
is enabled is probably the best option.

Sorry for the noise/confusion.

^ permalink raw reply	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 29/40] x86/compressed/64: add support for SEV-SNP CPUID table in #VC handlers
  2022-01-19  1:18                           ` Michael Roth
@ 2022-01-19 11:17                             ` Borislav Petkov
  2022-01-19 16:27                               ` Michael Roth
  0 siblings, 1 reply; 183+ messages in thread
From: Borislav Petkov @ 2022-01-19 11:17 UTC (permalink / raw)
  To: Michael Roth
  Cc: Brijesh Singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On Tue, Jan 18, 2022 at 07:18:06PM -0600, Michael Roth wrote:
> If 'fake_count'/'reported_count' is greater than the actual number of
> entries in the table, 'actual_count', then all table entries up to
> 'fake_count' will also need to pass validation. Generally the table
> will be zero'd out initially, so those additional/bogus entries will
> be interpreted as CPUID leaves where all fields are 0. Unfortunately,
> that's still considered a valid leaf, even if it's a duplicate of the
> *actual* 0x0 leaf present earlier in the table. The current code will
> handle this fine, since it scans the table in order, and uses the
> valid 0x0 leaf earlier in the table.

I guess it would be prudent to have some warnings when enumerating those
leafs and when the count index "goes off into the weeds", so to speak,
and starts reading 0-CPUID entries. I.e., "dear guest owner, your HV is
giving you a big lie: a weird/bogus CPUID leaf count..."

:-)

And lemme make sure I understand it: the ->count itself is not
measured/encrypted because you want to be flexible here and supply
different blobs with different CPUID leafs?

> This isn't really a special case though, it falls under the general
> category of a hypervisor inserting garbage entries that happen to pass
> validation, but don't reflect values that a guest would normally see.
> This will be detectable as part of guest owner attestation, since the
> guest code is careful to guarantee that the values seen after boot,
> once the attestation stage is reached, will be identical to the values
> seen during boot, so if this sort of manipulation of CPUID values
> occurred, the guest owner will notice this during attestation, and can
> abort the boot at that point. The Documentation patch addresses this
> in more detail.

Yap, it is important this is properly explained there so that people can
pay attention to it during attestation.

> If 'fake_count' is less than 'actual_count', then the PSP skips
> validation for anything >= 'fake_count', and leaves them in the table.
> That should also be fine though, since guest code should never exceed
> 'fake_count'/'reported_count', as that's a blatant violation of the
> spec, and it doesn't make any sense for a guest to do this. This will
> effectively 'hide' entries, but those resulting missing CPUID leaves
> will be noticeable to the guest owner once attestation phase is
> reached.

Noticeable because the guest owner did supply a CPUID table with X
entries but the HV is reporting Y?

If so, you can make this part of the attestation process: guest owners
should always check the CPUID entries count to be of a certain value.

> This does all highlight the need for some very thorough guidelines
> on how a guest owner should implement their attestation checks for
> cpuid, however. I think a section in the reference implementation
> notes/document that covers this would be a good starting point. I'll
> also check with the PSP team on tightening up some of these CPUID
> page checks to rule out some of these possibilities in the future.

Now you're starting to grow the right amount of paranoia - I'm glad I
was able to sensitize you properly!

:-)))

> Nevermind, that doesn't work since snp_cpuid_info_get_ptr() is also called
> by snp_cpuid_info_create() *prior* to initializing the table, so it ends
> up seeing cpuid->count==0 and fails right away. So your initial suggestion
> of checking cpuid->count==0 at the call-sites to determine if the table
> is enabled is probably the best option.
> 
> Sorry for the noise/confusion.

No worries - the end result is important!

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 31/40] x86/compressed: add SEV-SNP feature detection/setup
  2021-12-10 15:43 ` [PATCH v8 31/40] x86/compressed: add SEV-SNP feature detection/setup Brijesh Singh
@ 2022-01-19 12:55   ` Borislav Petkov
  0 siblings, 0 replies; 183+ messages in thread
From: Borislav Petkov @ 2022-01-19 12:55 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	tony.luck, marcorr, sathyanarayanan.kuppuswamy

On Fri, Dec 10, 2021 at 09:43:23AM -0600, Brijesh Singh wrote:
> +/*
> + * TODO: These are exported only temporarily while boot/compressed/sev.c is
> + * the only user. This is to avoid unused function warnings for kernel/sev.c
> + * during the build of kernel proper.
> + *
> + * Once the code is added to consume these in kernel proper these functions
> + * can be moved back to being statically-scoped to units that pull in
> + * sev-shared.c via #include and these declarations can be dropped.
> + */
> +struct cc_blob_sev_info *snp_find_cc_blob_setup_data(struct boot_params *bp);

You don't need any of that - just add the function with the patch which
uses it.

> +/*
> + * Search for a Confidential Computing blob passed in as a setup_data entry
> + * via the Linux Boot Protocol.
> + */
> +struct cc_blob_sev_info *
> +snp_find_cc_blob_setup_data(struct boot_params *bp)

Please break lines like that only if absolutely necessary. Which doesn't
look like it here.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 183+ messages in thread

* Re: [PATCH v8 29/40] x86/compressed/64: add support for SEV-SNP CPUID table in #VC handlers
  2022-01-19 11:17                             ` Borislav Petkov
@ 2022-01-19 16:27                               ` Michael Roth
  2022-01-27 17:23                                 ` Michael Roth
  2022-01-28 22:58                                 ` Borislav Petkov
  0 siblings, 2 replies; 183+ messages in thread
From: Michael Roth @ 2022-01-19 16:27 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Brijesh Singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On Wed, Jan 19, 2022 at 12:17:22PM +0100, Borislav Petkov wrote:
> On Tue, Jan 18, 2022 at 07:18:06PM -0600, Michael Roth wrote:
> > If 'fake_count'/'reported_count' is greater than the actual number of
> > entries in the table, 'actual_count', then all table entries up to
> > 'fake_count' will also need to pass validation. Generally the table
> > will be zero'd out initially, so those additional/bogus entries will
> > be interpreted as CPUID leaves where all fields are 0. Unfortunately,
> > that's still considered a valid leaf, even if it's a duplicate of the
> > *actual* 0x0 leaf present earlier in the table. The current code will
> > handle this fine, since it scans the table in order, and uses the
> > valid 0x0 leaf earlier in the table.
> 
> I guess it would be prudent to have some warnings when enumerating those
> leafs and when the count index "goes off into the weeds", so to speak,
> and starts reading 0-CPUID entries. I.e., "dear guest owner, your HV is
> giving you a big lie: a weird/bogus CPUID leaf count..."
> 
> :-)

Ok, there's some sanity checks that happen a little later in boot via
snp_cpuid_check_status(), after printk is enabled, that reports some
basic details to dmesg like the number of entries in the table. I can
add some additional sanity checks to flag the above case (really,
all-zero entries never make sense, since CPUID 0x0 is supposed to report
the max standard-range CPUID leaf, and leaf 0x1 at least should always
be present). I'll print a warning for such cases, and maybe dump the
cpuid table in that case so it can be examined more easily by the
owner.
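
As a rough sketch of what such a check might look like (simplified entry
layout, hypothetical helper names, and no printk so it stays
self-contained), the two conditions mentioned above are "any all-zero
entry" and "leaf 0x1 missing":

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified entry layout; the real CPUID page format has more fields. */
struct cpuid_entry {
	uint32_t func, subfunc;
	uint32_t eax, ebx, ecx, edx;
};

/* An entry is "all-zero" when every input and output field is 0. */
static bool cpuid_entry_is_zero(const struct cpuid_entry *e)
{
	return !(e->func | e->subfunc | e->eax | e->ebx | e->ecx | e->edx);
}

/* Count all-zero entries among the first 'count' entries. */
static uint32_t cpuid_count_zero_entries(const struct cpuid_entry *tbl,
					 uint32_t count)
{
	uint32_t zeros = 0;

	for (uint32_t i = 0; i < count; i++) {
		if (cpuid_entry_is_zero(&tbl[i]))
			zeros++;
	}

	return zeros;
}

/* Flag a table that has any all-zero entry or lacks leaf 0x1. */
static bool cpuid_table_looks_bogus(const struct cpuid_entry *tbl,
				    uint32_t count)
{
	bool have_leaf1 = false;

	for (uint32_t i = 0; i < count; i++) {
		if (tbl[i].func == 0x1)
			have_leaf1 = true;
	}

	return cpuid_count_zero_entries(tbl, count) > 0 || !have_leaf1;
}
```

A caller could warn and dump the table whenever cpuid_table_looks_bogus()
returns true; whether to also terminate is a separate policy decision.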

> 
> And lemme make sure I understand it: the ->count itself is not
> measured/encrypted because you want to be flexible here and supply
> different blobs with different CPUID leafs?

Yes, but to be clear it's the entire CPUID page, including the count,
that's not measured (though it is encrypted after passing PSP
validation). Probably the biggest reason is the logistics of having
untrusted cloud vendors provide a copy of the CPUID values they plan
to pass to the guest. A new measurement would need to be calculated
for every new configuration (different guest cpuflags, SMP count,
etc.), so those table values would need to be made easily accessible
to the guest owner for all these measurement calculations. And since
they can't be trusted, each table would need to be checked either
manually or by some tooling that could be difficult to implement
unless it was something simple like "give me the expected CPUID
values and I'll check if the provided CPUID table agrees with that".

At that point it's much easier for the guest owner to just check the
CPUID values directly against known good values for a particular
configuration as part of their attestation process and leave the
untrusted cloud vendor out of it completely. So not measuring the
CPUID page as part of SNP attestation allows for that flexibility.

> 
> > This is isn't really a special case though, it falls under the general
> > category of a hypervisor inserting garbage entries that happen to pass
> > validation, but don't reflect values that a guest would normally see.
> > This will be detectable as part of guest owner attestation, since the
> > guest code is careful to guarantee that the values seen after boot,
> > once the attestation stage is reached, will be identical to the values
> > seen during boot, so if this sort of manipulation of CPUID values
> > occurred, the guest owner will notice this during attestation, and can
> > abort the boot at that point. The Documentation patch addresses this
> > in more detail.
> 
> Yap, it is important this is properly explained there so that people can
> pay attention to during attestation.
> 
> > If 'fake_count' is less than 'actual_count', then the PSP skips
> > validation for anything >= 'fake_count', and leaves them in the table.
> > That should also be fine though, since guest code should never exceed
> > 'fake_count'/'reported_count', as that's a blatant violation of the
> > spec, and it doesn't make any sense for a guest to do this. This will
> > effectively 'hide' entries, but those resulting missing CPUID leaves
> > will be noticeable to the guest owner once attestation phase is
> > reached.
> 
> Noticeable because the guest owner did supply a CPUID table with X
> entries but the HV is reporting Y?

Or even more simply, by the guest owner running 'cpuid -r -1' on
the guest after boot and making sure all the expected entries are
present. If the HV manipulated the count to be lower, there would be
missing entries. If they manipulated it to be higher, then there would
either be extra duplicate entries at the end of the table (which the
#VC handler would ignore, since it uses the first matching entry in
the table when doing lookups), or additional non-duplicate garbage
entries, which would show up in 'cpuid -r -1' as unexpected entries.

Really 'cpuid -r -1' is the guest owner/userspace view of things, so
some of these nuances about the table contents might be noteworthy,
but wouldn't actually affect guest behavior, which would be the main
thing the attestation process should be concerned with.
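
To make the "first matching entry" behavior concrete, here is a small
standalone sketch. The struct layout and cpuid_lookup_first() are
hypothetical stand-ins rather than the functions from the patches. It
shows why a trailing duplicate of leaf 0x0 is never returned, and why
lowering the count hides later entries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified entry layout (assumption for this sketch). */
struct cpuid_entry {
	uint32_t eax_in, ecx_in;
	uint32_t eax, ebx, ecx, edx;
};

/*
 * Scan the table in order and return the first entry matching the
 * requested leaf/subleaf, or NULL. Later duplicates are never reached,
 * and entries at index >= count are never looked at.
 */
static const struct cpuid_entry *
cpuid_lookup_first(const struct cpuid_entry *tbl, size_t count,
		   uint32_t fn, uint32_t subfn)
{
	size_t i;

	assert(tbl || !count);

	for (i = 0; i < count; i++)
		if (tbl[i].eax_in == fn && tbl[i].ecx_in == subfn)
			return &tbl[i];

	return NULL;
}
```

With a table holding leaf 0x0, leaf 0x1 and one all-zero entry, a lookup
of leaf 0x0 with count 3 returns the first entry, not the zeroed
duplicate. A lookup of leaf 0x1 with count 1 returns NULL.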

> 
> If so, you can make this part of the attestation process: guest owners
> should always check the CPUID entries count to be of a certain value.
> 
> > This does all highlight the need for some very thorough guidelines
> > on how a guest owner should implement their attestation checks for
> > cpuid, however. I think a section in the reference implementation
> > notes/document that covers this would be a good starting point. I'll
> > also check with the PSP team on tightening up some of these CPUID
> > page checks to rule out some of these possibilities in the future.
> 
> Now you're starting to grow the right amount of paranoia - I'm glad I
> was able to sensitize you properly!
> 
> :-)))

Hehe =*D

Thanks!

-Mike


* Re: [PATCH v8 32/40] x86/compressed: use firmware-validated CPUID for SEV-SNP guests
  2021-12-10 15:43 ` [PATCH v8 32/40] x86/compressed: use firmware-validated CPUID for SEV-SNP guests Brijesh Singh
@ 2022-01-20 12:18   ` Borislav Petkov
  0 siblings, 0 replies; 183+ messages in thread
From: Borislav Petkov @ 2022-01-20 12:18 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	tony.luck, marcorr, sathyanarayanan.kuppuswamy

On Fri, Dec 10, 2021 at 09:43:24AM -0600, Brijesh Singh wrote:
> Subject: Re: [PATCH v8 32/40] x86/compressed: use firmware-validated CPUID for SEV-SNP guests
									    ^
									    leafs

or so.

> From: Michael Roth <michael.roth@amd.com>
> 
> SEV-SNP guests will be provided the location of special 'secrets'
> 'CPUID' pages via the Confidential Computing blob. This blob is
> provided to the boot kernel either through an EFI config table entry,
> or via a setup_data structure as defined by the Linux Boot Protocol.
> 
> Locate the Confidential Computing from these sources and, if found,
> use the provided CPUID page/table address to create a copy that the
> boot kernel will use when servicing cpuid instructions via a #VC

CPUID

> handler.
> 
> Signed-off-by: Michael Roth <michael.roth@amd.com>
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> ---
>  arch/x86/boot/compressed/sev.c | 13 ++++++++++
>  arch/x86/include/asm/sev.h     |  1 +
>  arch/x86/kernel/sev-shared.c   | 43 ++++++++++++++++++++++++++++++++++
>  3 files changed, 57 insertions(+)
> 
> diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
> index 93e125da12cf..29dfb34b5907 100644
> --- a/arch/x86/boot/compressed/sev.c
> +++ b/arch/x86/boot/compressed/sev.c
> @@ -415,6 +415,19 @@ bool snp_init(struct boot_params *bp)
>  	if (!cc_info)
>  		return false;
>  
> +	/*
> +	 * If SEV-SNP-specific Confidential Computing blob is present, then
	     ^
	     a


> +	 * firmware/bootloader have indicated SEV-SNP support. Verifying this
> +	 * involves CPUID checks which will be more reliable if the SEV-SNP
> +	 * CPUID table is used. See comments for snp_cpuid_info_create() for

s/for/over/ ?

> +	 * more details.
> +	 */
> +	snp_cpuid_info_create(cc_info);
> +
> +	/* SEV-SNP CPUID table should be set up now. */
> +	if (!snp_cpuid_active())
> +		sev_es_terminate(1, GHCB_TERM_CPUID);

Right, that is not needed now.

>  	 * Pass run-time kernel a pointer to CC info via boot_params so EFI
>  	 * config table doesn't need to be searched again during early startup
> diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
> index cd189c20bcc4..4fa7ca20d7c9 100644
> --- a/arch/x86/include/asm/sev.h
> +++ b/arch/x86/include/asm/sev.h
> @@ -157,6 +157,7 @@ bool snp_init(struct boot_params *bp);
>   * sev-shared.c via #include and these declarations can be dropped.
>   */
>  struct cc_blob_sev_info *snp_find_cc_blob_setup_data(struct boot_params *bp);
> +void snp_cpuid_info_create(const struct cc_blob_sev_info *cc_info);
>  #else
>  static inline void sev_es_ist_enter(struct pt_regs *regs) { }
>  static inline void sev_es_ist_exit(void) { }
> diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c
> index bd58a4ce29c8..5cb8f87df4b3 100644
> --- a/arch/x86/kernel/sev-shared.c
> +++ b/arch/x86/kernel/sev-shared.c
> @@ -403,6 +403,23 @@ snp_cpuid_find_validated_func(u32 func, u32 subfunc, u32 *eax, u32 *ebx,
>  	return false;
>  }
>  
> +static void __init snp_cpuid_set_ranges(void)
> +{
> +	const struct snp_cpuid_info *cpuid_info = snp_cpuid_info_get_ptr();
> +	int i;
> +
> +	for (i = 0; i < cpuid_info->count; i++) {
> +		const struct snp_cpuid_fn *fn = &cpuid_info->fn[i];
> +
> +		if (fn->eax_in == 0x0)
> +			cpuid_std_range_max = fn->eax;
> +		else if (fn->eax_in == 0x40000000)
> +			cpuid_hyp_range_max = fn->eax;
> +		else if (fn->eax_in == 0x80000000)
> +			cpuid_ext_range_max = fn->eax;
> +	}
> +}

Kinda arbitrary to have a separate function which has a single caller.
You can just as well move the loop into snp_cpuid_info_create() and put
a comment above it:

	/* Set CPUID ranges. */
	for (i = 0; i < cpuid_info->count; i++) {
		...

Also, snp_cpuid_info_create() should be called snp_setup_cpuid_table()
which is what this thing does.
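
A rough userspace sketch of what the folded-together function could look
like, purely for illustration. The types, CPUID_TABLE_MAX and
setup_cpuid_table() are hypothetical stand-ins, not the kernel
definitions; only the three range-base leaves (0x0, 0x40000000,
0x80000000) are taken from the hunk above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CPUID_TABLE_MAX 64	/* assumed capacity for this sketch */

/* Hypothetical, simplified stand-ins for the table structures. */
struct cpuid_entry {
	uint32_t eax_in, ecx_in;
	uint32_t eax, ebx, ecx, edx;
};

struct cpuid_table {
	uint32_t count;
	struct cpuid_entry fn[CPUID_TABLE_MAX];
};

struct cpuid_ranges {
	uint32_t std_max, hyp_max, ext_max;
};

/* Copy the provided table and record the max leaf of each CPUID range. */
static void setup_cpuid_table(struct cpuid_table *dst,
			      const struct cpuid_table *src,
			      struct cpuid_ranges *r)
{
	uint32_t i;

	assert(dst && src && r);
	assert(src->count <= CPUID_TABLE_MAX);

	memcpy(dst, src, sizeof(*dst));
	memset(r, 0, sizeof(*r));

	/* Set CPUID ranges. */
	for (i = 0; i < dst->count; i++) {
		const struct cpuid_entry *fn = &dst->fn[i];

		if (fn->eax_in == 0x0)
			r->std_max = fn->eax;
		else if (fn->eax_in == 0x40000000)
			r->hyp_max = fn->eax;
		else if (fn->eax_in == 0x80000000)
			r->ext_max = fn->eax;
	}
}
```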

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


* Re: [PATCH v8 33/40] x86/compressed/64: add identity mapping for Confidential Computing blob
  2021-12-10 15:43 ` [PATCH v8 33/40] x86/compressed/64: add identity mapping for Confidential Computing blob Brijesh Singh
  2021-12-10 19:52   ` Dave Hansen
@ 2022-01-25 13:48   ` Borislav Petkov
  1 sibling, 0 replies; 183+ messages in thread
From: Borislav Petkov @ 2022-01-25 13:48 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	tony.luck, marcorr, sathyanarayanan.kuppuswamy

On Fri, Dec 10, 2021 at 09:43:25AM -0600, Brijesh Singh wrote:
> +static void sev_prep_identity_maps(void)
> +{
> +	/*
> +	 * The ConfidentialComputing blob is used very early in uncompressed
> +	 * kernel to find the in-memory cpuid table to handle cpuid
> +	 * instructions. Make sure an identity-mapping exists so it can be
> +	 * accessed after switchover.
> +	 */
> +	if (sev_snp_enabled()) {
> +		struct cc_blob_sev_info *cc_info =
> +			(void *)(unsigned long)boot_params->cc_blob_address;
> +
> +		add_identity_map((unsigned long)cc_info,
> +				 (unsigned long)cc_info + sizeof(*cc_info));
> +		add_identity_map((unsigned long)cc_info->cpuid_phys,
> +				 (unsigned long)cc_info->cpuid_phys + cc_info->cpuid_len);
> +	}
> +
> +	sev_verify_cbit(top_level_pgt);
> +}
> +

Also, that function can just as well live in compressed/sev.c and
you can export add_identity_map() instead.

That latter function calls kernel_ident_mapping_init() which is
already exported. add_identity_map() doesn't do anything special
and it is limited to the decompressor kernel so nothing stands in
the way of exporting it in a pre-patch and renaming it there to
kernel_add_identity_map() or so...

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


* Re: [PATCH v8 34/40] x86/sev: add SEV-SNP feature detection/setup
  2021-12-10 15:43 ` [PATCH v8 34/40] x86/sev: add SEV-SNP feature detection/setup Brijesh Singh
@ 2022-01-25 18:43   ` Borislav Petkov
  0 siblings, 0 replies; 183+ messages in thread
From: Borislav Petkov @ 2022-01-25 18:43 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	tony.luck, marcorr, sathyanarayanan.kuppuswamy

On Fri, Dec 10, 2021 at 09:43:26AM -0600, Brijesh Singh wrote:
> +static struct cc_blob_sev_info *snp_find_cc_blob(struct boot_params *bp)
> +{
> +	struct cc_blob_sev_info *cc_info;
> +
> +	/* Boot kernel would have passed the CC blob via boot_params. */
> +	if (bp->cc_blob_address) {
> +		cc_info = (struct cc_blob_sev_info *)
> +			  (unsigned long)bp->cc_blob_address;

No need to break that line.

> +		goto found_cc_info;
> +	}
> +
> +	/*
> +	 * If kernel was booted directly, without the use of the
> +	 * boot/decompression kernel, the CC blob may have been passed via
> +	 * setup_data instead.
> +	 */
> +	cc_info = snp_find_cc_blob_setup_data(bp);
> +	if (!cc_info)
> +		return NULL;
> +
> +found_cc_info:
> +	if (cc_info->magic != CC_BLOB_SEV_HDR_MAGIC)
> +		sev_es_terminate(1, GHCB_SNP_UNSUPPORTED);

snp_abort() if you're gonna call it that.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


* Re: [PATCH v8 35/40] x86/sev: use firmware-validated CPUID for SEV-SNP guests
  2021-12-10 15:43 ` [PATCH v8 35/40] x86/sev: use firmware-validated CPUID for SEV-SNP guests Brijesh Singh
@ 2022-01-26 18:35   ` Borislav Petkov
  0 siblings, 0 replies; 183+ messages in thread
From: Borislav Petkov @ 2022-01-26 18:35 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	tony.luck, marcorr, sathyanarayanan.kuppuswamy

On Fri, Dec 10, 2021 at 09:43:27AM -0600, Brijesh Singh wrote:
> From: Michael Roth <michael.roth@amd.com>
> 
> SEV-SNP guests will be provided the location of special 'secrets' and
> 'CPUID' pages via the Confidential Computing blob. This blob is
> provided to the run-time kernel either through bootparams field that
						^
						a


> was initialized by the boot/compressed kernel, or via a setup_data
> structure as defined by the Linux Boot Protocol.
> 
> Locate the Confidential Computing from these sources and, if found,
				   ^
				   blob

> use the provided CPUID page/table address to create a copy that the
> run-time kernel will use when servicing cpuid instructions via a #VC
					  ^^^^^

Please capitalize all instruction mnemonics in text.

> +/*
> + * It is useful from an auditing/testing perspective to provide an easy way
> + * for the guest owner to know that the CPUID table has been initialized as
> + * expected, but that initialization happens too early in boot to print any
> + * sort of indicator, and there's not really any other good place to do it. So
> + * do it here, and while at it, go ahead and re-verify that nothing strange has
> + * happened between early boot and now.
> + */
> +static int __init snp_cpuid_check_status(void)

That function's redundant now, I believe, since we terminate the guest
if there's something wrong with the CPUID page.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


* Re: [PATCH v8 36/40] x86/sev: Provide support for SNP guest request NAEs
  2021-12-10 15:43 ` [PATCH v8 36/40] x86/sev: Provide support for SNP guest request NAEs Brijesh Singh
@ 2022-01-27 16:21   ` Borislav Petkov
  2022-01-27 17:02     ` Brijesh Singh
  0 siblings, 1 reply; 183+ messages in thread
From: Borislav Petkov @ 2022-01-27 16:21 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	tony.luck, marcorr, sathyanarayanan.kuppuswamy

On Fri, Dec 10, 2021 at 09:43:28AM -0600, Brijesh Singh wrote:
> Version 2 of GHCB specification provides SNP_GUEST_REQUEST and
> SNP_EXT_GUEST_REQUEST NAE that can be used by the SNP guest to communicate
> with the PSP.
> 
> While at it, add a snp_issue_guest_request() helper that can be used by

Not "that can" but "that will".

> driver or other subsystem to issue the request to PSP.
> 
> See SEV-SNP and GHCB spec for more details.
> 
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> ---
>  arch/x86/include/asm/sev-common.h |  3 ++
>  arch/x86/include/asm/sev.h        | 14 +++++++++
>  arch/x86/include/uapi/asm/svm.h   |  4 +++
>  arch/x86/kernel/sev.c             | 51 +++++++++++++++++++++++++++++++
>  4 files changed, 72 insertions(+)
> 
> diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h
> index 673e6778194b..346600724b84 100644
> --- a/arch/x86/include/asm/sev-common.h
> +++ b/arch/x86/include/asm/sev-common.h
> @@ -128,6 +128,9 @@ struct snp_psc_desc {
>  	struct psc_entry entries[VMGEXIT_PSC_MAX_ENTRY];
>  } __packed;
>  
> +/* Guest message request error code */
> +#define SNP_GUEST_REQ_INVALID_LEN	BIT_ULL(32)

SZ_4G is more descriptive, perhaps...

> +int snp_issue_guest_request(u64 exit_code, struct snp_req_data *input, unsigned long *fw_err)
> +{
> +	struct ghcb_state state;
> +	unsigned long flags;
> +	struct ghcb *ghcb;
> +	int ret;
> +
> +	if (!cc_platform_has(CC_ATTR_SEV_SNP))
> +		return -ENODEV;
> +
> +	/* __sev_get_ghcb() need to run with IRQs disabled because it using per-cpu GHCB */

			   needs 				it is using a

> +	local_irq_save(flags);
> +
> +	ghcb = __sev_get_ghcb(&state);
> +	if (!ghcb) {
> +		ret = -EIO;
> +		goto e_restore_irq;
> +	}
> +
> +	vc_ghcb_invalidate(ghcb);
> +
> +	if (exit_code == SVM_VMGEXIT_EXT_GUEST_REQUEST) {
> +		ghcb_set_rax(ghcb, input->data_gpa);
> +		ghcb_set_rbx(ghcb, input->data_npages);
> +	}
> +
> +	ret = sev_es_ghcb_hv_call(ghcb, true, NULL, exit_code, input->req_gpa, input->resp_gpa);
					      ^^^^^

That's ctxt which is accessed without a NULL check in
verify_exception_info().

Why aren't you allocating a ctxt on stack like the other callers do?


-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


* Re: [PATCH v8 36/40] x86/sev: Provide support for SNP guest request NAEs
  2022-01-27 16:21   ` Borislav Petkov
@ 2022-01-27 17:02     ` Brijesh Singh
  2022-01-29 10:27       ` Borislav Petkov
  0 siblings, 1 reply; 183+ messages in thread
From: Brijesh Singh @ 2022-01-27 17:02 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: brijesh.singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy



On 1/27/22 10:21 AM, Borislav Petkov wrote:
> On Fri, Dec 10, 2021 at 09:43:28AM -0600, Brijesh Singh wrote:
>> Version 2 of GHCB specification provides SNP_GUEST_REQUEST and
>> SNP_EXT_GUEST_REQUEST NAE that can be used by the SNP guest to communicate
>> with the PSP.
>>
>> While at it, add a snp_issue_guest_request() helper that can be used by
> 
> Not "that can" but "that will".
> 
Noted.

>>   
>> +/* Guest message request error code */
>> +#define SNP_GUEST_REQ_INVALID_LEN	BIT_ULL(32)
> 
> SZ_4G is more descriptive, perhaps...
> 

I am okay with using SZ_4G, but the spec doesn't spell out that it is a
4G size. It says bit 32 will be set on error.



>> +
>> +	ret = sev_es_ghcb_hv_call(ghcb, true, NULL, exit_code, input->req_gpa, input->resp_gpa);
> 					      ^^^^^
> 
> That's ctxt which is accessed without a NULL check in
> verify_exception_info().
> 
> Why aren't you allocating a ctxt on stack like the other callers do?

Typically sev_es_ghcb_hv_call() is called from the #VC handler, which
provides the context structure. But in this case and the PSC case, the
caller is not a #VC handler, so we don't have a context structure. But as
you pointed out, we could allocate a context structure on the stack and
pass it down so that verify_exception_info() does not cause a panic with a
NULL dereference (when the HV violates the spec and injects an exception
while handling this NAE).
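
A minimal standalone sketch of that pattern, assuming a callee that
writes fault details into the context when the hypervisor reports an
exception. All names here (struct fault_ctxt, check_exit_info(),
issue_request()) and the bit layout of the second exit-info value are
hypothetical mock-ups, not the kernel's actual structures or functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the emulation context. */
struct fault_ctxt {
	uint64_t vector;
	uint64_t error_code;
};

enum hv_result { HV_OK, HV_EXCEPTION };

/*
 * Stand-in for a callee that records an injected exception in *ctxt.
 * With a NULL ctxt the stores below would fault, which is the problem
 * being discussed.
 */
static enum hv_result check_exit_info(uint64_t exit_info_1,
				      uint64_t exit_info_2,
				      struct fault_ctxt *ctxt)
{
	assert(ctxt);

	if (exit_info_1 == 0)
		return HV_OK;

	ctxt->vector = exit_info_2 & 0xff;
	ctxt->error_code = exit_info_2 >> 32;
	return HV_EXCEPTION;
}

/* Caller outside any exception handler: supply a context from the stack. */
static enum hv_result issue_request(uint64_t exit_info_1,
				    uint64_t exit_info_2,
				    uint64_t *vector_out)
{
	struct fault_ctxt ctxt = { 0 };
	enum hv_result ret;

	ret = check_exit_info(exit_info_1, exit_info_2, &ctxt);
	if (ret == HV_EXCEPTION && vector_out)
		*vector_out = ctxt.vector;

	return ret;
}
```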

thanks


* Re: [PATCH v8 29/40] x86/compressed/64: add support for SEV-SNP CPUID table in #VC handlers
  2022-01-19 16:27                               ` Michael Roth
@ 2022-01-27 17:23                                 ` Michael Roth
  2022-01-28 22:58                                 ` Borislav Petkov
  1 sibling, 0 replies; 183+ messages in thread
From: Michael Roth @ 2022-01-27 17:23 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Brijesh Singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On Wed, Jan 19, 2022 at 10:27:47AM -0600, Michael Roth wrote:
> On Wed, Jan 19, 2022 at 12:17:22PM +0100, Borislav Petkov wrote:
> > On Tue, Jan 18, 2022 at 07:18:06PM -0600, Michael Roth wrote:
> > > If 'fake_count'/'reported_count' is greater than the actual number of
> > > entries in the table, 'actual_count', then all table entries up to
> > > 'fake_count' will also need to pass validation. Generally the table
> > > will be zero'd out initially, so those additional/bogus entries will
> > > be interpreted as a CPUID leaves where all fields are 0. Unfortunately,
> > > that's still considered a valid leaf, even if it's a duplicate of the
> > > *actual* 0x0 leaf present earlier in the table. The current code will
> > > handle this fine, since it scans the table in order, and uses the
> > > valid 0x0 leaf earlier in the table.
> > 
> > I guess it would be prudent to have some warnings when enumerating those
> > leafs and when the count index "goes off into the weeds", so to speak,
> > and starts reading 0-CPUID entries. I.e., "dear guest owner, your HV is
> > giving you a big lie: a weird/bogus CPUID leaf count..."
> > 
> > :-)
> 

Sorry for the delay, this response got stuck in my mail queue apparently.

> Ok, there's some sanity checks that happen a little later in boot via
> snp_cpuid_check_status(), after printk is enabled, that reports some
> basic details to dmesg like the number of entries in the table. I can
> add some additional sanity checks to flag the above case (really,
> all-zero entries never make sense, since CPUID 0x0 is supposed to report
> the max standard-range CPUID leaf, and leaf 0x1 at least should always
> be present). I'll print a warning for such cases, add maybe dump the
> cpuid the table in that case so it can be examined more easily by
> owner.
> 
> > 
> > And lemme make sure I understand it: the ->count itself is not
> > measured/encrypted because you want to be flexible here and supply
> > different blobs with different CPUID leafs?
> 
> Yes, but to be clear it's the entire CPUID page, including the count,
> that's not measured (though it is encrypted after passing PSP
> validation). Probably the biggest reason is the logistics of having
> untrusted cloud vendors provide a copy of the CPUID values they plan
> to pass to the guest, since a new measurement would need to be
> calculated for every new configuration (using different guest
> cpuflags, SMP count, etc.), since those table values will need to be
> made easily-accessible to guest owner for all these measurement
> calculations, and they can't be trusted so each table would need to
> be checked either manually or by some tooling that could be difficult
> to implement unless it was something simple like "give me the expected
> CPUID values and I'll check if the provided CPUID table agrees with
> that".
> 
> At that point it's much easier for the guest owner to just check the
> CPUID values directly against known good values for a particular
> configuration as part of their attestation process and leave the
> untrusted cloud vendor out of it completely. So not measuring the
> CPUID page as part of SNP attestation allows for that flexibility.
> 
> > 
> > > This is isn't really a special case though, it falls under the general
> > > category of a hypervisor inserting garbage entries that happen to pass
> > > validation, but don't reflect values that a guest would normally see.
> > > This will be detectable as part of guest owner attestation, since the
> > > guest code is careful to guarantee that the values seen after boot,
> > > once the attestation stage is reached, will be identical to the values
> > > seen during boot, so if this sort of manipulation of CPUID values
> > > occurred, the guest owner will notice this during attestation, and can
> > > abort the boot at that point. The Documentation patch addresses this
> > > in more detail.
> > 
> > Yap, it is important this is properly explained there so that people can
> > pay attention to during attestation.
> > 
> > > If 'fake_count' is less than 'actual_count', then the PSP skips
> > > validation for anything >= 'fake_count', and leaves them in the table.
> > > That should also be fine though, since guest code should never exceed
> > > 'fake_count'/'reported_count', as that's a blatant violation of the
> > > spec, and it doesn't make any sense for a guest to do this. This will
> > > effectively 'hide' entries, but those resulting missing CPUID leaves
> > > will be noticeable to the guest owner once attestation phase is
> > > reached.
> > 
> > Noticeable because the guest owner did supply a CPUID table with X
> > entries but the HV is reporting Y?
> 
> Or even more simply by the guest owner simply running 'cpuid -r -1' on
> the guest after boot, and making sure all the expected entries are
> present. If the HV manipulated the count to be lower, there would be
> missing entries, if they manipulated it to be higher, then there would
> either be extra duplicate entries at the end of the table (which the
> #VC handler would ignore due to it using the first matching entry in
> the table when doing lookups), or additional non-duplicate garbage
> entries, which will show up in 'cpuid -r -1' as unexpected entries.
> 
> Really 'cpuid -r -1' is the guest owner/userspace view of things, so
> some of these nuances about the table contents might be noteworthy,
> but wouldn't actually affect guest behavior, which would be the main
> thing attestation process should be concerned with.
> 
> > 
> > If so, you can make this part of the attestation process: guest owners
> > should always check the CPUID entries count to be of a certain value.
> > 
> > > This does all highlight the need for some very thorough guidelines
> > > on how a guest owner should implement their attestation checks for
> > > cpuid, however. I think a section in the reference implementation
> > > notes/document that covers this would be a good starting point. I'll
> > > also check with the PSP team on tightening up some of these CPUID
> > > page checks to rule out some of these possibilities in the future.
> > 
> > Now you're starting to grow the right amount of paranoia - I'm glad I
> > was able to sensitize you properly!
> > 
> > :-)))
> 
> Hehe =*D
> 
> Thanks!
> 
> -Mike


* Re: [PATCH v8 29/40] x86/compressed/64: add support for SEV-SNP CPUID table in #VC handlers
  2022-01-19 16:27                               ` Michael Roth
  2022-01-27 17:23                                 ` Michael Roth
@ 2022-01-28 22:58                                 ` Borislav Petkov
  1 sibling, 0 replies; 183+ messages in thread
From: Borislav Petkov @ 2022-01-28 22:58 UTC (permalink / raw)
  To: Michael Roth
  Cc: Brijesh Singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On Wed, Jan 19, 2022 at 10:27:47AM -0600, Michael Roth wrote:
> At that point it's much easier for the guest owner to just check the
> CPUID values directly against known good values for a particular
> configuration as part of their attestation process and leave the
> untrusted cloud vendor out of it completely. So not measuring the
> CPUID page as part of SNP attestation allows for that flexibility.

Well, in that case, I guess you don't need the sanity-checking in the
guest either - you simply add it to the attestation TODO-list for the
guest owner to go through:

Upon booting, the guest owner should compare the CPUID leafs the guest
sees with the ones supplied during boot.
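
Such a comparison could be as simple as the following sketch. It assumes
the guest owner has both the expected leaves and the ones observed in the
guest (e.g. parsed from 'cpuid -r -1' output) in a simplified,
hypothetical struct cpuid_entry form. cpuid_find(), cpuid_same_output()
and cpuid_count_diffs() are made-up helper names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified record of one CPUID leaf/subleaf. */
struct cpuid_entry {
	uint32_t eax_in, ecx_in;
	uint32_t eax, ebx, ecx, edx;
};

/* Return the first entry for the given leaf/subleaf, or NULL. */
static const struct cpuid_entry *
cpuid_find(const struct cpuid_entry *tbl, size_t n, uint32_t fn, uint32_t subfn)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (tbl[i].eax_in == fn && tbl[i].ecx_in == subfn)
			return &tbl[i];

	return NULL;
}

static bool cpuid_same_output(const struct cpuid_entry *a,
			      const struct cpuid_entry *b)
{
	return a->eax == b->eax && a->ebx == b->ebx &&
	       a->ecx == b->ecx && a->edx == b->edx;
}

/*
 * Count expected leaves that are missing or altered in 'seen', plus
 * leaves in 'seen' that were never expected. Zero means the views agree.
 */
static size_t cpuid_count_diffs(const struct cpuid_entry *expected,
				size_t n_expected,
				const struct cpuid_entry *seen, size_t n_seen)
{
	size_t i, diffs = 0;

	assert(expected || !n_expected);
	assert(seen || !n_seen);

	for (i = 0; i < n_expected; i++) {
		const struct cpuid_entry *s =
			cpuid_find(seen, n_seen, expected[i].eax_in,
				   expected[i].ecx_in);

		if (!s || !cpuid_same_output(&expected[i], s))
			diffs++;
	}

	for (i = 0; i < n_seen; i++)
		if (!cpuid_find(expected, n_expected, seen[i].eax_in,
				seen[i].ecx_in))
			diffs++;

	return diffs;
}
```

Any non-zero result would be a reason for the guest owner to fail
attestation.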

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


* Re: [PATCH v8 36/40] x86/sev: Provide support for SNP guest request NAEs
  2022-01-27 17:02     ` Brijesh Singh
@ 2022-01-29 10:27       ` Borislav Petkov
  2022-01-29 11:49         ` Brijesh Singh
  0 siblings, 1 reply; 183+ messages in thread
From: Borislav Petkov @ 2022-01-29 10:27 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	tony.luck, marcorr, sathyanarayanan.kuppuswamy

On Thu, Jan 27, 2022 at 11:02:13AM -0600, Brijesh Singh wrote:
> I am okay with using SZ_4G but per the spec they don't spell that its 4G
> size. It says bit 32 will should be set on error.

What does the spec call it exactly? Is it "length"? Because that's what
confused me: SNP_GUEST_REQ_INVALID_LEN - that's a length and length you
don't usually specify with a bit position...

> Typically the sev_es_ghcb_hv_handler() is called from #VC handler, which
> provides the context structure. But in this and PSC case, the caller is not
> a #VC handler, so we don't have a context structure. But as you pointed, we
> could allocate context structure on the stack and pass it down so that
> verify_exception_info() does not cause a panic with NULL deference (when HV
> violates the spec and inject exception while handling this NAE).

Yap, exactly.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


* Re: [PATCH v8 36/40] x86/sev: Provide support for SNP guest request NAEs
  2022-01-29 10:27       ` Borislav Petkov
@ 2022-01-29 11:49         ` Brijesh Singh
  2022-01-29 12:02           ` Borislav Petkov
  0 siblings, 1 reply; 183+ messages in thread
From: Brijesh Singh @ 2022-01-29 11:49 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: brijesh.singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy


On 1/29/22 4:27 AM, Borislav Petkov wrote:
> On Thu, Jan 27, 2022 at 11:02:13AM -0600, Brijesh Singh wrote:
>> I am okay with using SZ_4G, but the spec doesn't spell out that it's a 4G
>> size. It says bit 32 will be set on error.
> What does the spec call it exactly? Is it "length"? Because that's what
> confused me: SNP_GUEST_REQ_INVALID_LEN - that's a length and length you
> don't usually specify with a bit position...

Here is the text from the spec:

----------

The hypervisor must validate that the guest has supplied enough pages to
hold the certificates that will be returned before performing the SNP
guest request. If there are not enough guest pages to hold the
certificate table and certificate data, the hypervisor will return the
required number of pages needed to hold the certificate table and
certificate data in the RBX register and set the SW_EXITINFO2 field to
0x0000000100000000.

---------

It does not spell it out as an invalid length. However, for a *similar* failure,
the SEV-SNP spec spells it out as INVALID_LENGTH, so I chose INVALID_LENGTH as
the macro name.

thanks



* Re: [PATCH v8 36/40] x86/sev: Provide support for SNP guest request NAEs
  2022-01-29 11:49         ` Brijesh Singh
@ 2022-01-29 12:02           ` Borislav Petkov
  0 siblings, 0 replies; 183+ messages in thread
From: Borislav Petkov @ 2022-01-29 12:02 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	tony.luck, marcorr, sathyanarayanan.kuppuswamy

On Sat, Jan 29, 2022 at 05:49:06AM -0600, Brijesh Singh wrote:
> The hypervisor must validate that the guest has supplied enough pages to
> hold the certificates that will be returned before performing the SNP
> guest request. If there are not enough guest pages to hold the
> certificate table and certificate data, the hypervisor will return the
> required number of pages needed to hold the certificate table and
> certificate data in the RBX register and set the SW_EXITINFO2 field to
> 0x0000000100000000.

Then you could call that one:

#define SNP_GUEST_REQ_ERR_NUM_PAGES       BIT_ULL(32)

or so, to say exactly what that error is. Because when you read the
code, it is more "self-descriptive" this way:

	...
	ghcb->save.sw_exit_info_2 == SNP_GUEST_REQ_ERR_NUM_PAGES)
		input->data_npages = ghcb_get_rbx(ghcb);
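
For illustration only, the check above can be sketched as a standalone
userspace snippet. The BIT_ULL() definition is a local stand-in for the
kernel helper, and struct fake_ghcb, struct fake_guest_req and
snp_update_npages() are hypothetical names made up for this sketch; they
are not part of the patch or of the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for the kernel's BIT_ULL() helper. */
#define BIT_ULL(n)	(1ULL << (n))

/* Name suggested in this thread for the "not enough pages" error. */
#define SNP_GUEST_REQ_ERR_NUM_PAGES	BIT_ULL(32)

/* The quoted spec text gives the SW_EXITINFO2 value as 0x0000000100000000. */
static_assert(SNP_GUEST_REQ_ERR_NUM_PAGES == 0x0000000100000000ULL,
	      "error code must be bit 32");

/* Hypothetical, reduced stand-in for the GHCB fields used here. */
struct fake_ghcb {
	uint64_t sw_exit_info_2;
	uint64_t rbx;
};

/* Hypothetical stand-in for the guest request input. */
struct fake_guest_req {
	unsigned int data_npages;
};

/*
 * If the hypervisor reported that too few pages were supplied, copy the
 * required page count from RBX into the request and return true.
 * Otherwise leave the request untouched and return false.
 */
static bool snp_update_npages(const struct fake_ghcb *ghcb,
			      struct fake_guest_req *input)
{
	if (ghcb->sw_exit_info_2 != SNP_GUEST_REQ_ERR_NUM_PAGES)
		return false;

	input->data_npages = (unsigned int)ghcb->rbx;
	return true;
}
```

A caller would pass in the exit information it got back from the
hypervisor and retry the request with input->data_npages pages when the
helper returns true.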

> It does not spell it out as an invalid length. However, for a *similar* failure,
> the SEV-SNP spec spells it out as INVALID_LENGTH, so I chose INVALID_LENGTH as
> the macro name.

You can simply define a separate one here called ...INVALID_LENGTH.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


end of thread, other threads:[~2022-01-29 12:02 UTC | newest]

Thread overview: 183+ messages
2021-12-10 15:42 [PATCH v8 00/40] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
2021-12-10 15:42 ` [PATCH v8 01/40] x86/compressed/64: detect/setup SEV/SME features earlier in boot Brijesh Singh
2021-12-10 18:47   ` Dave Hansen
2021-12-10 19:12   ` Borislav Petkov
2021-12-10 19:23     ` Dave Hansen
2021-12-10 19:33       ` Borislav Petkov
2021-12-13 19:09   ` Venu Busireddy
2021-12-13 19:17     ` Borislav Petkov
2021-12-14 17:46       ` Venu Busireddy
2021-12-14 19:10         ` Borislav Petkov
2021-12-15  0:14           ` Venu Busireddy
2021-12-15 11:57             ` Borislav Petkov
2021-12-15 14:43             ` Tom Lendacky
2021-12-15 17:49               ` Michael Roth
2021-12-15 18:17                 ` Venu Busireddy
2021-12-15 18:33                   ` Borislav Petkov
2021-12-15 20:17                     ` Michael Roth
2021-12-15 20:38                       ` Borislav Petkov
2021-12-15 21:22                         ` Michael Roth
2022-01-03 19:10                           ` Venu Busireddy
2022-01-05 19:34                             ` Brijesh Singh
2022-01-10 20:46                               ` Brijesh Singh
2022-01-10 21:17                                 ` Venu Busireddy
2022-01-10 21:38                                   ` Borislav Petkov
2021-12-15 20:43                   ` Michael Roth
2021-12-15 19:54                 ` Venu Busireddy
2021-12-15 18:58               ` Venu Busireddy
2021-12-15 17:51             ` Michael Roth
2021-12-10 15:42 ` [PATCH v8 02/40] x86/sev: " Brijesh Singh
2021-12-13 22:36   ` Venu Busireddy
2021-12-10 15:42 ` [PATCH v8 03/40] x86/mm: Extend cc_attr to include AMD SEV-SNP Brijesh Singh
2021-12-13 22:47   ` Venu Busireddy
2021-12-14 15:53   ` Borislav Petkov
2021-12-10 15:42 ` [PATCH v8 04/40] x86/sev: Define the Linux specific guest termination reasons Brijesh Singh
2021-12-14  0:13   ` Venu Busireddy
2021-12-14 22:22   ` Borislav Petkov
2021-12-10 15:42 ` [PATCH v8 05/40] x86/sev: Save the negotiated GHCB version Brijesh Singh
2021-12-14  0:32   ` Venu Busireddy
2021-12-10 15:42 ` [PATCH v8 06/40] x86/sev: Check SEV-SNP features support Brijesh Singh
2021-12-16 15:47   ` Borislav Petkov
2021-12-16 16:28     ` Brijesh Singh
2021-12-16 16:58       ` Borislav Petkov
2021-12-16 19:01   ` Venu Busireddy
2021-12-10 15:42 ` [PATCH v8 07/40] x86/sev: Add a helper for the PVALIDATE instruction Brijesh Singh
2021-12-16 20:20   ` Venu Busireddy
2021-12-10 15:43 ` [PATCH v8 08/40] x86/sev: Check the vmpl level Brijesh Singh
2021-12-16 20:24   ` Venu Busireddy
2021-12-16 23:39     ` Mikolaj Lisik
2021-12-17 22:19       ` Brijesh Singh
2021-12-17 22:33         ` Tom Lendacky
2021-12-20 18:10           ` Borislav Petkov
2022-01-04 15:23             ` Brijesh Singh
2021-12-10 15:43 ` [PATCH v8 09/40] x86/compressed: Add helper for validating pages in the decompression stage Brijesh Singh
2021-12-17 20:47   ` Venu Busireddy
2021-12-17 23:24     ` Brijesh Singh
2022-01-03 18:43       ` Venu Busireddy
2021-12-21 13:01   ` Borislav Petkov
2021-12-10 15:43 ` [PATCH v8 10/40] x86/compressed: Register GHCB memory when SEV-SNP is active Brijesh Singh
2022-01-03 19:54   ` Venu Busireddy
2021-12-10 15:43 ` [PATCH v8 11/40] x86/sev: " Brijesh Singh
2021-12-22 13:16   ` Borislav Petkov
2021-12-22 15:16     ` Brijesh Singh
2022-01-03 22:47   ` Venu Busireddy
2021-12-10 15:43 ` [PATCH v8 12/40] x86/sev: Add helper for validating pages in early enc attribute changes Brijesh Singh
2021-12-23 11:50   ` Borislav Petkov
2022-01-04 15:33     ` Brijesh Singh
2022-01-03 23:28   ` Venu Busireddy
2022-01-11 21:22     ` Brijesh Singh
2022-01-11 21:51       ` Venu Busireddy
2022-01-11 21:57         ` Brijesh Singh
2022-01-11 22:42           ` Venu Busireddy
2021-12-10 15:43 ` [PATCH v8 13/40] x86/kernel: Make the bss.decrypted section shared in RMP table Brijesh Singh
2021-12-28 11:53   ` Borislav Petkov
2022-01-04 17:56   ` Venu Busireddy
2022-01-05 19:52     ` Brijesh Singh
2022-01-05 20:27       ` Dave Hansen
2022-01-05 21:39         ` Brijesh Singh
2022-01-06 17:40           ` Venu Busireddy
2022-01-06 19:06             ` Tom Lendacky
2022-01-06 20:16               ` Venu Busireddy
2022-01-06 20:50                 ` Tom Lendacky
2021-12-10 15:43 ` [PATCH v8 14/40] x86/kernel: Validate rom memory before accessing when SEV-SNP is active Brijesh Singh
2021-12-28 15:40   ` Borislav Petkov
2021-12-10 15:43 ` [PATCH v8 15/40] x86/mm: Add support to validate memory when changing C-bit Brijesh Singh
2021-12-29 11:09   ` Borislav Petkov
2022-01-04 22:31   ` Venu Busireddy
2021-12-10 15:43 ` [PATCH v8 16/40] KVM: SVM: Define sev_features and vmpl field in the VMSA Brijesh Singh
2022-01-04 22:59   ` Venu Busireddy
2021-12-10 15:43 ` [PATCH v8 17/40] KVM: SVM: Create a separate mapping for the SEV-ES save area Brijesh Singh
2021-12-30 12:19   ` Borislav Petkov
2022-01-05  1:38   ` Venu Busireddy
2021-12-10 15:43 ` [PATCH v8 18/40] KVM: SVM: Create a separate mapping for the GHCB " Brijesh Singh
2022-01-05 18:41   ` Venu Busireddy
2021-12-10 15:43 ` [PATCH v8 19/40] KVM: SVM: Update the SEV-ES save area mapping Brijesh Singh
2022-01-05 18:54   ` Venu Busireddy
2021-12-10 15:43 ` [PATCH v8 20/40] x86/sev: Use SEV-SNP AP creation to start secondary CPUs Brijesh Singh
2021-12-10 18:50   ` Dave Hansen
2022-01-12 16:17     ` Brijesh Singh
2021-12-31 15:36   ` Borislav Petkov
2022-01-03 18:10     ` Vlastimil Babka
2022-01-12 16:33     ` Brijesh Singh
2022-01-12 17:10       ` Tom Lendacky
2022-01-13 12:23         ` Borislav Petkov
2022-01-13 12:21       ` Borislav Petkov
2021-12-10 15:43 ` [PATCH v8 21/40] x86/head: re-enable stack protection for 32/64-bit builds Brijesh Singh
2022-01-03 16:49   ` Borislav Petkov
2021-12-10 15:43 ` [PATCH v8 22/40] x86/sev: move MSR-based VMGEXITs for CPUID to helper Brijesh Singh
2021-12-30 18:52   ` Sean Christopherson
2022-01-04 20:57     ` Borislav Petkov
2022-01-04 23:36     ` Michael Roth
2022-01-06 18:38   ` Venu Busireddy
2022-01-06 20:21     ` Michael Roth
2022-01-06 20:36       ` Venu Busireddy
2021-12-10 15:43 ` [PATCH v8 23/40] KVM: x86: move lookup of indexed CPUID leafs " Brijesh Singh
2022-01-06 18:46   ` Venu Busireddy
2021-12-10 15:43 ` [PATCH v8 24/40] x86/compressed/acpi: move EFI system table lookup " Brijesh Singh
2021-12-10 18:54   ` Dave Hansen
2021-12-13 15:47     ` Michael Roth
2021-12-13 16:21       ` Dave Hansen
2021-12-13 18:00         ` Michael Roth
2022-01-11  8:59       ` Chao Fan
2022-01-05 23:50   ` Borislav Petkov
2022-01-06 19:59   ` Venu Busireddy
2021-12-10 15:43 ` [PATCH v8 25/40] x86/compressed/acpi: move EFI config " Brijesh Singh
2022-01-06 20:33   ` Venu Busireddy
2021-12-10 15:43 ` [PATCH v8 26/40] x86/compressed/acpi: move EFI vendor " Brijesh Singh
2022-01-06 20:47   ` Venu Busireddy
2021-12-10 15:43 ` [PATCH v8 27/40] x86/boot: Add Confidential Computing type to setup_data Brijesh Singh
2021-12-10 19:12   ` Dave Hansen
2021-12-10 20:18     ` Brijesh Singh
2021-12-10 20:30       ` Dave Hansen
2021-12-13 14:49         ` Brijesh Singh
2021-12-13 15:08           ` Dave Hansen
2021-12-13 15:55             ` Brijesh Singh
2022-01-07 11:54             ` Borislav Petkov
2022-01-06 22:48   ` Venu Busireddy
2021-12-10 15:43 ` [PATCH v8 28/40] KVM: SEV: Add documentation for SEV-SNP CPUID Enforcement Brijesh Singh
2022-01-07 13:22   ` Borislav Petkov
2021-12-10 15:43 ` [PATCH v8 29/40] x86/compressed/64: add support for SEV-SNP CPUID table in #VC handlers Brijesh Singh
2022-01-13 13:16   ` Borislav Petkov
2022-01-13 16:39     ` Michael Roth
2022-01-14 16:13       ` Borislav Petkov
2022-01-18  4:35         ` Michael Roth
2022-01-18 14:02           ` Borislav Petkov
2022-01-18 14:23             ` Michael Roth
2022-01-18 14:32               ` Michael Roth
2022-01-18 14:37                 ` Michael Roth
2022-01-18 16:34                   ` Borislav Petkov
2022-01-18 17:20                     ` Michael Roth
2022-01-18 17:41                       ` Borislav Petkov
2022-01-18 18:49                         ` Michael Roth
2022-01-19  1:18                           ` Michael Roth
2022-01-19 11:17                             ` Borislav Petkov
2022-01-19 16:27                               ` Michael Roth
2022-01-27 17:23                                 ` Michael Roth
2022-01-28 22:58                                 ` Borislav Petkov
2021-12-10 15:43 ` [PATCH v8 30/40] x86/boot: add a pointer to Confidential Computing blob in bootparams Brijesh Singh
2022-01-17 18:14   ` Borislav Petkov
2021-12-10 15:43 ` [PATCH v8 31/40] x86/compressed: add SEV-SNP feature detection/setup Brijesh Singh
2022-01-19 12:55   ` Borislav Petkov
2021-12-10 15:43 ` [PATCH v8 32/40] x86/compressed: use firmware-validated CPUID for SEV-SNP guests Brijesh Singh
2022-01-20 12:18   ` Borislav Petkov
2021-12-10 15:43 ` [PATCH v8 33/40] x86/compressed/64: add identity mapping for Confidential Computing blob Brijesh Singh
2021-12-10 19:52   ` Dave Hansen
2021-12-13 17:54     ` Michael Roth
2022-01-25 13:48   ` Borislav Petkov
2021-12-10 15:43 ` [PATCH v8 34/40] x86/sev: add SEV-SNP feature detection/setup Brijesh Singh
2022-01-25 18:43   ` Borislav Petkov
2021-12-10 15:43 ` [PATCH v8 35/40] x86/sev: use firmware-validated CPUID for SEV-SNP guests Brijesh Singh
2022-01-26 18:35   ` Borislav Petkov
2021-12-10 15:43 ` [PATCH v8 36/40] x86/sev: Provide support for SNP guest request NAEs Brijesh Singh
2022-01-27 16:21   ` Borislav Petkov
2022-01-27 17:02     ` Brijesh Singh
2022-01-29 10:27       ` Borislav Petkov
2022-01-29 11:49         ` Brijesh Singh
2022-01-29 12:02           ` Borislav Petkov
2021-12-10 15:43 ` [PATCH v8 37/40] x86/sev: Register SNP guest request platform device Brijesh Singh
2021-12-10 15:43 ` [PATCH v8 38/40] virt: Add SEV-SNP guest driver Brijesh Singh
2021-12-10 15:43 ` [PATCH v8 39/40] virt: sevguest: Add support to derive key Brijesh Singh
2021-12-10 22:27   ` Liam Merwick
2021-12-10 15:43 ` [PATCH v8 40/40] virt: sevguest: Add support to get extended report Brijesh Singh
2021-12-10 20:17 ` [PATCH v8 00/40] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Dave Hansen
2021-12-10 20:20   ` Brijesh Singh
