All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v9 00/43] Add AMD Secure Nested Paging (SEV-SNP) Guest Support
@ 2022-01-28 17:17 Brijesh Singh
  2022-01-28 17:17 ` [PATCH v9 01/43] KVM: SVM: Define sev_features and vmpl field in the VMSA Brijesh Singh
                   ` (42 more replies)
  0 siblings, 43 replies; 115+ messages in thread
From: Brijesh Singh @ 2022-01-28 17:17 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

This part of Secure Encrypted Paging (SEV-SNP) series focuses on the changes
required in a guest OS for SEV-SNP support.

SEV-SNP builds upon existing SEV and SEV-ES functionality while adding
new hardware-based memory protections. SEV-SNP adds strong memory integrity
protection to help prevent malicious hypervisor-based attacks like data
replay, memory re-mapping and more in order to create an isolated memory
encryption environment.
 
This series provides the basic building blocks to support booting the SEV-SNP
VMs, it does not cover all the security enhancement introduced by the SEV-SNP
such as interrupt protection.

Many of the integrity guarantees of SEV-SNP are enforced through a new
structure called the Reverse Map Table (RMP). Adding a new page to SEV-SNP
VM requires a 2-step process. First, the hypervisor assigns a page to the
guest using the new RMPUPDATE instruction. This transitions the page to
guest-invalid. Second, the guest validates the page using the new PVALIDATE
instruction. The SEV-SNP VMs can use the new "Page State Change Request NAE"
defined in the GHCB specification to ask hypervisor to add or remove page
from the RMP table.

Each page assigned to the SEV-SNP VM can either be validated or unvalidated,
as indicated by the Validated flag in the page's RMP entry. There are two
approaches that can be taken for the page validation: Pre-validation and
Lazy Validation.

Under pre-validation, the pages are validated prior to first use. And under
lazy validation, pages are validated when first accessed. An access to a
unvalidated page results in a #VC exception, at which time the exception
handler may validate the page. Lazy validation requires careful tracking of
the validated pages to avoid validating the same GPA more than once. The
recently introduced "Unaccepted" memory type can be used to communicate the
unvalidated memory ranges to the Guest OS.

At this time we only sypport the pre-validation, the OVMF guest BIOS
validates the entire RAM before the control is handed over to the guest kernel.
The early_set_memory_{encrypted,decrypted} and set_memory_{encrypted,decrypted} are
enlightened to perform the page validation or invalidation while setting or
clearing the encryption attribute from the page table.

This series does not provide support for the Interrupt security yet which will
be added after the base support.

The series is based on tip/master
  94985da003a4 (origin/master, origin/HEAD) Merge branch into tip/master: 'irq/urgent'

The complete branch is at https://github.com/AMDESE/linux/tree/sev-snp-v9

Patch 1-4 defines multiple VMSA save area to support SEV,SEV-ES and SEV-SNP guests.
It is a pre-requisite for the SEV-SNP guest support, and included in the
series for the completeness. Not sure if it will go through the tip or KVM.
It is also posted on KVM mailing list:
https://lore.kernel.org/lkml/20211213173356.138726-3-brijesh.singh@amd.com/T/#m7d6868f3e81624323ea933d3a63a68949b286103

Additional resources
---------------------
SEV-SNP whitepaper
https://www.amd.com/system/files/TechDocs/SEV-SNP-strengthening-vm-isolation-with-integrity-protection-and-more.pdf
 
APM 2: https://www.amd.com/system/files/TechDocs/24593.pdf
(section 15.36)

GHCB spec:
https://developer.amd.com/wp-content/resources/56421.pdf

SEV-SNP firmware specification:
https://developer.amd.com/sev/

v8: https://lore.kernel.org/lkml/20211210154332.11526-1-brijesh.singh@amd.com/
v7: https://lore.kernel.org/linux-mm/20211110220731.2396491-40-brijesh.singh@amd.com/
v6: https://lore.kernel.org/linux-mm/20211008180453.462291-1-brijesh.singh@amd.com/
v5: https://lore.kernel.org/lkml/20210820151933.22401-1-brijesh.singh@amd.com/

Changes since v8:
 * Setup the GHCB before taking the first #VC.
 * Make the CC blob structure size invariant.
 * Define the AP INIT macro and update the AP creation to use those macro
   instead of the hardcoded values.
 * Expand the comments to cover some of previous feedbacks.
 * Fix the commit messages based on the feedbacks.
 * Multiple fixes/cleanup on cpuid patches (based on Boris and Dave feedback)
   * drop is_efi64 return arguments in favor of a separate efi_get_type() helper.
   * drop is_efi64 input arguments in favor of calling efi_get_type() as-needed.
   * move acpi.c's kexec-specific handling into library code.
   * fix stack protection for 32/64-bit builds.
   * Export add_identity_map() to avoid SEV-specific code in ident_map_64.c.
   * use snp_abort() when terminating via initial ccblob scan.
   * fix the copyright header after the code refactor.
   * remove code duplication whereever possible.

Changes since v7:
 * sevguest: extend the get report structure to accept the vmpl from userspace.
 * In the compressed path, move the GHCB protocol negotiation from VC handler
   to sev_enable().
 * sev_enable(): don't expect SEV bit in status MSR when cpuid bit is present, update comments.
 * sme_enable(): call directly from head_64.S rather than as part of startup_64_setup_env, add comments
 * snp_find_cc_blob(), sev_prep_identity_maps(): add missing 'static' keywords to function prototypes

Changes since v6:
 * Add rmpadjust() helper to be used by AP creation and vmpl0 detect function.
 * Clear the VM communication key if guest detects that hypervisor is modifying
   the SNP_GUEST_REQ response header.
 * Move the per-cpu GHCB registration from first #VC to idt setup.
 * Consolidate initial SEV/SME setup into a common entry point that gets called
   early enough to also be used for SEV-SNP CPUID table setup.
 * SNP CPUID: separate initial SEV-SNP feature detection out into standalone
   snp_init() routines, then add CPUID table setup to it as a separate patch.
 * SNP CPUID: fix boot issue with Seabios due to ACPI relying on certain EFI
   config table lookup failures as fallthrough cases rather than error cases.
 * SNP CPUID: drop the use of a separate init routines to handle pointer fixups
   after switching to kernel virtual addresses, instead use a helper that uses
   RIP-relative addressing to access CPUID table when either on identity mapping
   or kernel virtual addresses.

Changes since v5:
 * move the seqno allocation in the sevguest driver.
 * extend snp_issue_guest_request() to accept the exit_info to simplify the logic.
 * use smaller structure names based on feedback.
 * explicitly clear the memory after the SNP guest request is completed.
 * cpuid validation: use a local copy of cpuid table instead of keeping
   firmware table mapped throughout boot.
 * cpuid validation: coding style fix-ups and refactor cpuid-related helpers
   as suggested.
 * cpuid validation: drop a number of BOOT_COMPRESSED-guarded defs/declarations
   by moving things like snp_cpuid_init*() out of sev-shared.c and keeping only
   the common bits there.
 * Break up EFI config table helpers and related acpi.c changes into separate
   patches.
 * re-enable stack protection for 32-bit kernels as well, not just 64-bit

Changes since v4:
 * Address the cpuid specific review comment
 * Simplified the macro based on the review feedback
 * Move macro definition to the patch that needs it
 * Fix the issues reported by the checkpath
 * Address the AP creation specific review comment

Changes since v3:
 * Add support to use the PSP filtered CPUID.
 * Add support for the extended guest request.
 * Move sevguest driver in driver/virt/coco.
 * Add documentation for sevguest ioctl.
 * Add support to check the vmpl0.
 * Pass the VM encryption key and id to be used for encrypting guest messages
   through the platform drv data.
 * Multiple cleanup and fixes to address the review feedbacks.

Changes since v2:
 * Add support for AP startup using SNP specific vmgexit.
 * Add snp_prep_memory() helper.
 * Drop sev_snp_active() helper.
 * Add sev_feature_enabled() helper to check which SEV feature is active.
 * Sync the SNP guest message request header with latest SNP FW spec.
 * Multiple cleanup and fixes to address the review feedbacks.

Changes since v1:
 * Integerate the SNP support in sev.{ch}.
 * Add support to query the hypervisor feature and detect whether SNP is supported.
 * Define Linux specific reason code for the SNP guest termination.
 * Extend the setup_header provide a way for hypervisor to pass secret and cpuid page.
 * Add support to create a platform device and driver to query the attestation report
   and the derive a key.
 * Multiple cleanup and fixes to address Boris's review fedback.

Brijesh Singh (20):
  KVM: SVM: Define sev_features and vmpl field in the VMSA
  x86/mm: Extend cc_attr to include AMD SEV-SNP
  x86/sev: Define the Linux specific guest termination reasons
  x86/sev: Save the negotiated GHCB version
  x86/sev: Check SEV-SNP features support
  x86/sev: Add a helper for the PVALIDATE instruction
  x86/sev: Check the vmpl level
  x86/compressed: Add helper for validating pages in the decompression
    stage
  x86/compressed: Register GHCB memory when SEV-SNP is active
  x86/sev: Register GHCB memory when SEV-SNP is active
  x86/sev: Add helper for validating pages in early enc attribute
    changes
  x86/kernel: Make the .bss..decrypted section shared in RMP table
  x86/kernel: Validate ROM memory before accessing when SEV-SNP is
    active
  x86/mm: Add support to validate memory when changing C-bit
  x86/boot: Add Confidential Computing type to setup_data
  x86/sev: Provide support for SNP guest request NAEs
  x86/sev: Register SEV-SNP guest request platform device
  virt: Add SEV-SNP guest driver
  virt: sevguest: Add support to derive key
  virt: sevguest: Add support to get extended report

Michael Roth (19):
  x86/compressed/64: Detect/setup SEV/SME features earlier in boot
  x86/sev: Detect/setup SEV/SME features earlier in boot
  x86/head/64: Re-enable stack protection
  x86/sev: Move MSR-based VMGEXITs for CPUID to helper
  KVM: x86: Move lookup of indexed CPUID leafs to helper
  x86/compressed/acpi: Move EFI detection to helper
  x86/compressed/acpi: Move EFI system table lookup to helper
  x86/compressed/acpi: Move EFI config table lookup to helper
  x86/compressed/acpi: Move EFI vendor table lookup to helper
  x86/compressed/acpi: Move EFI kexec handling into common code
  KVM: SEV: Add documentation for SEV-SNP CPUID Enforcement
  x86/compressed/64: Add support for SEV-SNP CPUID table in #VC handlers
  x86/boot: Add a pointer to Confidential Computing blob in bootparams
  x86/compressed: Add SEV-SNP feature detection/setup
  x86/compressed: Use firmware-validated CPUID leaves for SEV-SNP guests
  x86/compressed: Export and rename add_identity_map()
  x86/compressed/64: Add identity mapping for Confidential Computing
    blob
  x86/sev: Add SEV-SNP feature detection/setup
  x86/sev: Use firmware-validated CPUID for SEV-SNP guests

Tom Lendacky (4):
  KVM: SVM: Create a separate mapping for the SEV-ES save area
  KVM: SVM: Create a separate mapping for the GHCB save area
  KVM: SVM: Update the SEV-ES save area mapping
  x86/sev: Use SEV-SNP AP creation to start secondary CPUs

 Documentation/virt/coco/sevguest.rst          | 121 +++
 .../virt/kvm/amd-memory-encryption.rst        |  28 +
 arch/x86/boot/compressed/Makefile             |   1 +
 arch/x86/boot/compressed/acpi.c               | 172 +---
 arch/x86/boot/compressed/efi.c                | 243 +++++
 arch/x86/boot/compressed/head_64.S            |  32 +-
 arch/x86/boot/compressed/ident_map_64.c       |  39 +-
 arch/x86/boot/compressed/idt_64.c             |  10 +-
 arch/x86/boot/compressed/mem_encrypt.S        |  36 -
 arch/x86/boot/compressed/misc.h               |  55 +-
 arch/x86/boot/compressed/sev.c                | 256 ++++-
 arch/x86/include/asm/bootparam_utils.h        |   1 +
 arch/x86/include/asm/cpuid.h                  |  38 +
 arch/x86/include/asm/msr-index.h              |   2 +
 arch/x86/include/asm/setup.h                  |   1 -
 arch/x86/include/asm/sev-common.h             |  82 ++
 arch/x86/include/asm/sev.h                    | 102 +-
 arch/x86/include/asm/svm.h                    | 171 +++-
 arch/x86/include/uapi/asm/bootparam.h         |   4 +-
 arch/x86/include/uapi/asm/svm.h               |  13 +
 arch/x86/kernel/Makefile                      |   1 -
 arch/x86/kernel/cc_platform.c                 |   2 +
 arch/x86/kernel/cpu/common.c                  |   4 +
 arch/x86/kernel/head64.c                      |  30 +-
 arch/x86/kernel/head_64.S                     |  37 +-
 arch/x86/kernel/probe_roms.c                  |  13 +-
 arch/x86/kernel/sev-shared.c                  | 526 +++++++++-
 arch/x86/kernel/sev.c                         | 934 ++++++++++++++++--
 arch/x86/kernel/smpboot.c                     |   3 +
 arch/x86/kvm/cpuid.c                          |  19 +-
 arch/x86/kvm/svm/sev.c                        |  24 +-
 arch/x86/kvm/svm/svm.c                        |   4 +-
 arch/x86/kvm/svm/svm.h                        |   2 +-
 arch/x86/mm/mem_encrypt.c                     |   4 +
 arch/x86/mm/mem_encrypt_amd.c                 |  58 +-
 arch/x86/mm/mem_encrypt_identity.c            |   8 +
 arch/x86/mm/pat/set_memory.c                  |  15 +
 drivers/virt/Kconfig                          |   3 +
 drivers/virt/Makefile                         |   1 +
 drivers/virt/coco/sevguest/Kconfig            |  12 +
 drivers/virt/coco/sevguest/Makefile           |   2 +
 drivers/virt/coco/sevguest/sevguest.c         | 739 ++++++++++++++
 drivers/virt/coco/sevguest/sevguest.h         |  98 ++
 include/linux/cc_platform.h                   |   8 +
 include/linux/efi.h                           |   1 +
 include/uapi/linux/sev-guest.h                |  80 ++
 46 files changed, 3652 insertions(+), 383 deletions(-)
 create mode 100644 Documentation/virt/coco/sevguest.rst
 create mode 100644 arch/x86/boot/compressed/efi.c
 create mode 100644 arch/x86/include/asm/cpuid.h
 create mode 100644 drivers/virt/coco/sevguest/Kconfig
 create mode 100644 drivers/virt/coco/sevguest/Makefile
 create mode 100644 drivers/virt/coco/sevguest/sevguest.c
 create mode 100644 drivers/virt/coco/sevguest/sevguest.h
 create mode 100644 include/uapi/linux/sev-guest.h

-- 
2.25.1


^ permalink raw reply	[flat|nested] 115+ messages in thread

* [PATCH v9 01/43] KVM: SVM: Define sev_features and vmpl field in the VMSA
  2022-01-28 17:17 [PATCH v9 00/43] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
@ 2022-01-28 17:17 ` Brijesh Singh
  2022-01-28 17:17 ` [PATCH v9 02/43] KVM: SVM: Create a separate mapping for the SEV-ES save area Brijesh Singh
                   ` (41 subsequent siblings)
  42 siblings, 0 replies; 115+ messages in thread
From: Brijesh Singh @ 2022-01-28 17:17 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh, Venu Busireddy

The hypervisor uses the sev_features field (offset 3B0h) in the Save State
Area to control the SEV-SNP guest features such as SNPActive, vTOM,
ReflectVC etc. An SEV-SNP guest can read the SEV_FEATURES fields through
the SEV_STATUS MSR.

While at it, update the dump_vmcb() to log the VMPL level.

See APM2 Table 15-34 and B-4 for more details.

Reviewed-by: Venu Busireddy <venu.busireddy@oracle.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/include/asm/svm.h | 6 ++++--
 arch/x86/kvm/svm/svm.c     | 4 ++--
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
index b00dbc5fac2b..7c9cf4f3c164 100644
--- a/arch/x86/include/asm/svm.h
+++ b/arch/x86/include/asm/svm.h
@@ -238,7 +238,8 @@ struct vmcb_save_area {
 	struct vmcb_seg ldtr;
 	struct vmcb_seg idtr;
 	struct vmcb_seg tr;
-	u8 reserved_1[43];
+	u8 reserved_1[42];
+	u8 vmpl;
 	u8 cpl;
 	u8 reserved_2[4];
 	u64 efer;
@@ -303,7 +304,8 @@ struct vmcb_save_area {
 	u64 sw_exit_info_1;
 	u64 sw_exit_info_2;
 	u64 sw_scratch;
-	u8 reserved_11[56];
+	u64 sev_features;
+	u8 reserved_11[48];
 	u64 xcr0;
 	u8 valid_bitmap[16];
 	u64 x87_state_gpa;
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 2c99b18d76c0..dca191739c34 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3089,8 +3089,8 @@ static void dump_vmcb(struct kvm_vcpu *vcpu)
 	       "tr:",
 	       save01->tr.selector, save01->tr.attrib,
 	       save01->tr.limit, save01->tr.base);
-	pr_err("cpl:            %d                efer:         %016llx\n",
-		save->cpl, save->efer);
+	pr_err("vmpl: %d   cpl:  %d               efer:          %016llx\n",
+	       save->vmpl, save->cpl, save->efer);
 	pr_err("%-15s %016llx %-13s %016llx\n",
 	       "cr0:", save->cr0, "cr2:", save->cr2);
 	pr_err("%-15s %016llx %-13s %016llx\n",
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v9 02/43] KVM: SVM: Create a separate mapping for the SEV-ES save area
  2022-01-28 17:17 [PATCH v9 00/43] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
  2022-01-28 17:17 ` [PATCH v9 01/43] KVM: SVM: Define sev_features and vmpl field in the VMSA Brijesh Singh
@ 2022-01-28 17:17 ` Brijesh Singh
  2022-02-01 13:02   ` Borislav Petkov
  2022-01-28 17:17 ` [PATCH v9 03/43] KVM: SVM: Create a separate mapping for the GHCB " Brijesh Singh
                   ` (40 subsequent siblings)
  42 siblings, 1 reply; 115+ messages in thread
From: Brijesh Singh @ 2022-01-28 17:17 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Venu Busireddy, Brijesh Singh

From: Tom Lendacky <thomas.lendacky@amd.com>

The save area for SEV-ES/SEV-SNP guests, as used by the hardware, is
different from the save area of a non SEV-ES/SEV-SNP guest.

This is the first step in defining the multiple save areas to keep them
separate and ensuring proper operation amongst the different types of
guests. Create an SEV-ES/SEV-SNP save area and adjust usage to the new
save area definition where needed.

Reviewed-by: Venu Busireddy <venu.busireddy@oracle.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/include/asm/svm.h | 87 +++++++++++++++++++++++++++++---------
 arch/x86/kvm/svm/sev.c     | 24 +++++------
 arch/x86/kvm/svm/svm.h     |  2 +-
 3 files changed, 80 insertions(+), 33 deletions(-)

diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
index 7c9cf4f3c164..3ce2e575a2de 100644
--- a/arch/x86/include/asm/svm.h
+++ b/arch/x86/include/asm/svm.h
@@ -227,6 +227,7 @@ struct vmcb_seg {
 	u64 base;
 } __packed;
 
+/* Save area definition for legacy and SEV-MEM guests */
 struct vmcb_save_area {
 	struct vmcb_seg es;
 	struct vmcb_seg cs;
@@ -243,8 +244,58 @@ struct vmcb_save_area {
 	u8 cpl;
 	u8 reserved_2[4];
 	u64 efer;
+	u8 reserved_3[112];
+	u64 cr4;
+	u64 cr3;
+	u64 cr0;
+	u64 dr7;
+	u64 dr6;
+	u64 rflags;
+	u64 rip;
+	u8 reserved_4[88];
+	u64 rsp;
+	u64 s_cet;
+	u64 ssp;
+	u64 isst_addr;
+	u64 rax;
+	u64 star;
+	u64 lstar;
+	u64 cstar;
+	u64 sfmask;
+	u64 kernel_gs_base;
+	u64 sysenter_cs;
+	u64 sysenter_esp;
+	u64 sysenter_eip;
+	u64 cr2;
+	u8 reserved_5[32];
+	u64 g_pat;
+	u64 dbgctl;
+	u64 br_from;
+	u64 br_to;
+	u64 last_excp_from;
+	u64 last_excp_to;
+	u8 reserved_6[72];
+	u32 spec_ctrl;		/* Guest version of SPEC_CTRL at 0x2E0 */
+} __packed;
+
+/* Save area definition for SEV-ES and SEV-SNP guests */
+struct sev_es_save_area {
+	struct vmcb_seg es;
+	struct vmcb_seg cs;
+	struct vmcb_seg ss;
+	struct vmcb_seg ds;
+	struct vmcb_seg fs;
+	struct vmcb_seg gs;
+	struct vmcb_seg gdtr;
+	struct vmcb_seg ldtr;
+	struct vmcb_seg idtr;
+	struct vmcb_seg tr;
+	u8 reserved_1[43];
+	u8 cpl;
+	u8 reserved_2[4];
+	u64 efer;
 	u8 reserved_3[104];
-	u64 xss;		/* Valid for SEV-ES only */
+	u64 xss;
 	u64 cr4;
 	u64 cr3;
 	u64 cr0;
@@ -272,22 +323,14 @@ struct vmcb_save_area {
 	u64 br_to;
 	u64 last_excp_from;
 	u64 last_excp_to;
-
-	/*
-	 * The following part of the save area is valid only for
-	 * SEV-ES guests when referenced through the GHCB or for
-	 * saving to the host save area.
-	 */
-	u8 reserved_7[72];
-	u32 spec_ctrl;		/* Guest version of SPEC_CTRL at 0x2E0 */
-	u8 reserved_7b[4];
+	u8 reserved_7[80];
 	u32 pkru;
-	u8 reserved_7a[20];
-	u64 reserved_8;		/* rax already available at 0x01f8 */
+	u8 reserved_9[20];
+	u64 reserved_10;	/* rax already available at 0x01f8 */
 	u64 rcx;
 	u64 rdx;
 	u64 rbx;
-	u64 reserved_9;		/* rsp already available at 0x01d8 */
+	u64 reserved_11;	/* rsp already available at 0x01d8 */
 	u64 rbp;
 	u64 rsi;
 	u64 rdi;
@@ -299,23 +342,25 @@ struct vmcb_save_area {
 	u64 r13;
 	u64 r14;
 	u64 r15;
-	u8 reserved_10[16];
+	u8 reserved_12[16];
 	u64 sw_exit_code;
 	u64 sw_exit_info_1;
 	u64 sw_exit_info_2;
 	u64 sw_scratch;
 	u64 sev_features;
-	u8 reserved_11[48];
+	u8 reserved_13[48];
 	u64 xcr0;
 	u8 valid_bitmap[16];
 	u64 x87_state_gpa;
 } __packed;
 
+#define GHCB_SHARED_BUF_SIZE	2032
+
 struct ghcb {
-	struct vmcb_save_area save;
-	u8 reserved_save[2048 - sizeof(struct vmcb_save_area)];
+	struct sev_es_save_area save;
+	u8 reserved_save[2048 - sizeof(struct sev_es_save_area)];
 
-	u8 shared_buffer[2032];
+	u8 shared_buffer[GHCB_SHARED_BUF_SIZE];
 
 	u8 reserved_1[10];
 	u16 protocol_version;	/* negotiated SEV-ES/GHCB protocol version */
@@ -323,13 +368,15 @@ struct ghcb {
 } __packed;
 
 
-#define EXPECTED_VMCB_SAVE_AREA_SIZE		1032
+#define EXPECTED_VMCB_SAVE_AREA_SIZE		740
+#define EXPECTED_SEV_ES_SAVE_AREA_SIZE		1032
 #define EXPECTED_VMCB_CONTROL_AREA_SIZE		1024
 #define EXPECTED_GHCB_SIZE			PAGE_SIZE
 
 static inline void __unused_size_checks(void)
 {
 	BUILD_BUG_ON(sizeof(struct vmcb_save_area)	!= EXPECTED_VMCB_SAVE_AREA_SIZE);
+	BUILD_BUG_ON(sizeof(struct sev_es_save_area)	!= EXPECTED_SEV_ES_SAVE_AREA_SIZE);
 	BUILD_BUG_ON(sizeof(struct vmcb_control_area)	!= EXPECTED_VMCB_CONTROL_AREA_SIZE);
 	BUILD_BUG_ON(sizeof(struct ghcb)		!= EXPECTED_GHCB_SIZE);
 }
@@ -399,7 +446,7 @@ struct vmcb {
 /* GHCB Accessor functions */
 
 #define GHCB_BITMAP_IDX(field)							\
-	(offsetof(struct vmcb_save_area, field) / sizeof(u64))
+	(offsetof(struct sev_es_save_area, field) / sizeof(u64))
 
 #define DEFINE_GHCB_ACCESSORS(field)						\
 	static inline bool ghcb_##field##_is_valid(const struct ghcb *ghcb)	\
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 6a22798eaaee..44bad375b4d6 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -558,12 +558,20 @@ static int sev_launch_update_data(struct kvm *kvm, struct kvm_sev_cmd *argp)
 
 static int sev_es_sync_vmsa(struct vcpu_svm *svm)
 {
-	struct vmcb_save_area *save = &svm->vmcb->save;
+	struct sev_es_save_area *save = svm->sev_es.vmsa;
 
 	/* Check some debug related fields before encrypting the VMSA */
-	if (svm->vcpu.guest_debug || (save->dr7 & ~DR7_FIXED_1))
+	if (svm->vcpu.guest_debug || (svm->vmcb->save.dr7 & ~DR7_FIXED_1))
 		return -EINVAL;
 
+	/*
+	 * SEV-ES will use a VMSA that is pointed to by the VMCB, not
+	 * the traditional VMSA that is part of the VMCB. Copy the
+	 * traditional VMSA as it has been built so far (in prep
+	 * for LAUNCH_UPDATE_VMSA) to be the initial SEV-ES state.
+	 */
+	memcpy(save, &svm->vmcb->save, sizeof(svm->vmcb->save));
+
 	/* Sync registgers */
 	save->rax = svm->vcpu.arch.regs[VCPU_REGS_RAX];
 	save->rbx = svm->vcpu.arch.regs[VCPU_REGS_RBX];
@@ -591,14 +599,6 @@ static int sev_es_sync_vmsa(struct vcpu_svm *svm)
 	save->xss  = svm->vcpu.arch.ia32_xss;
 	save->dr6  = svm->vcpu.arch.dr6;
 
-	/*
-	 * SEV-ES will use a VMSA that is pointed to by the VMCB, not
-	 * the traditional VMSA that is part of the VMCB. Copy the
-	 * traditional VMSA as it has been built so far (in prep
-	 * for LAUNCH_UPDATE_VMSA) to be the initial SEV-ES state.
-	 */
-	memcpy(svm->sev_es.vmsa, save, sizeof(*save));
-
 	return 0;
 }
 
@@ -2905,7 +2905,7 @@ void sev_es_vcpu_reset(struct vcpu_svm *svm)
 void sev_es_prepare_guest_switch(struct vcpu_svm *svm, unsigned int cpu)
 {
 	struct svm_cpu_data *sd = per_cpu(svm_data, cpu);
-	struct vmcb_save_area *hostsa;
+	struct sev_es_save_area *hostsa;
 
 	/*
 	 * As an SEV-ES guest, hardware will restore the host state on VMEXIT,
@@ -2915,7 +2915,7 @@ void sev_es_prepare_guest_switch(struct vcpu_svm *svm, unsigned int cpu)
 	vmsave(__sme_page_pa(sd->save_area));
 
 	/* XCR0 is restored on VMEXIT, save the current host value */
-	hostsa = (struct vmcb_save_area *)(page_address(sd->save_area) + 0x400);
+	hostsa = (struct sev_es_save_area *)(page_address(sd->save_area) + 0x400);
 	hostsa->xcr0 = xgetbv(XCR_XFEATURE_ENABLED_MASK);
 
 	/* PKRU is restored on VMEXIT, save the current host value */
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 47ef8f4a9358..9597331e9a32 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -167,7 +167,7 @@ struct svm_nested_state {
 
 struct vcpu_sev_es_state {
 	/* SEV-ES support */
-	struct vmcb_save_area *vmsa;
+	struct sev_es_save_area *vmsa;
 	struct ghcb *ghcb;
 	struct kvm_host_map ghcb_map;
 	bool received_first_sipi;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v9 03/43] KVM: SVM: Create a separate mapping for the GHCB save area
  2022-01-28 17:17 [PATCH v9 00/43] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
  2022-01-28 17:17 ` [PATCH v9 01/43] KVM: SVM: Define sev_features and vmpl field in the VMSA Brijesh Singh
  2022-01-28 17:17 ` [PATCH v9 02/43] KVM: SVM: Create a separate mapping for the SEV-ES save area Brijesh Singh
@ 2022-01-28 17:17 ` Brijesh Singh
  2022-01-28 17:17 ` [PATCH v9 04/43] KVM: SVM: Update the SEV-ES save area mapping Brijesh Singh
                   ` (39 subsequent siblings)
  42 siblings, 0 replies; 115+ messages in thread
From: Brijesh Singh @ 2022-01-28 17:17 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Venu Busireddy, Brijesh Singh

From: Tom Lendacky <thomas.lendacky@amd.com>

The initial implementation of the GHCB spec was based on trying to keep
the register state offsets the same relative to the VM save area. However,
the save area for SEV-ES has changed within the hardware causing the
relation between the SEV-ES save area to change relative to the GHCB save
area.

This is the second step in defining the multiple save areas to keep them
separate and ensuring proper operation amongst the different types of
guests. Create a GHCB save area that matches the GHCB specification.

Reviewed-by: Venu Busireddy <venu.busireddy@oracle.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/include/asm/svm.h | 48 +++++++++++++++++++++++++++++++++++---
 1 file changed, 45 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
index 3ce2e575a2de..5ff1fa364a31 100644
--- a/arch/x86/include/asm/svm.h
+++ b/arch/x86/include/asm/svm.h
@@ -354,11 +354,51 @@ struct sev_es_save_area {
 	u64 x87_state_gpa;
 } __packed;
 
+struct ghcb_save_area {
+	u8 reserved_1[203];
+	u8 cpl;
+	u8 reserved_2[116];
+	u64 xss;
+	u8 reserved_3[24];
+	u64 dr7;
+	u8 reserved_4[16];
+	u64 rip;
+	u8 reserved_5[88];
+	u64 rsp;
+	u8 reserved_6[24];
+	u64 rax;
+	u8 reserved_7[264];
+	u64 rcx;
+	u64 rdx;
+	u64 rbx;
+	u8 reserved_8[8];
+	u64 rbp;
+	u64 rsi;
+	u64 rdi;
+	u64 r8;
+	u64 r9;
+	u64 r10;
+	u64 r11;
+	u64 r12;
+	u64 r13;
+	u64 r14;
+	u64 r15;
+	u8 reserved_9[16];
+	u64 sw_exit_code;
+	u64 sw_exit_info_1;
+	u64 sw_exit_info_2;
+	u64 sw_scratch;
+	u8 reserved_10[56];
+	u64 xcr0;
+	u8 valid_bitmap[16];
+	u64 x87_state_gpa;
+} __packed;
+
 #define GHCB_SHARED_BUF_SIZE	2032
 
 struct ghcb {
-	struct sev_es_save_area save;
-	u8 reserved_save[2048 - sizeof(struct sev_es_save_area)];
+	struct ghcb_save_area save;
+	u8 reserved_save[2048 - sizeof(struct ghcb_save_area)];
 
 	u8 shared_buffer[GHCB_SHARED_BUF_SIZE];
 
@@ -369,6 +409,7 @@ struct ghcb {
 
 
 #define EXPECTED_VMCB_SAVE_AREA_SIZE		740
+#define EXPECTED_GHCB_SAVE_AREA_SIZE		1032
 #define EXPECTED_SEV_ES_SAVE_AREA_SIZE		1032
 #define EXPECTED_VMCB_CONTROL_AREA_SIZE		1024
 #define EXPECTED_GHCB_SIZE			PAGE_SIZE
@@ -376,6 +417,7 @@ struct ghcb {
 static inline void __unused_size_checks(void)
 {
 	BUILD_BUG_ON(sizeof(struct vmcb_save_area)	!= EXPECTED_VMCB_SAVE_AREA_SIZE);
+	BUILD_BUG_ON(sizeof(struct ghcb_save_area)	!= EXPECTED_GHCB_SAVE_AREA_SIZE);
 	BUILD_BUG_ON(sizeof(struct sev_es_save_area)	!= EXPECTED_SEV_ES_SAVE_AREA_SIZE);
 	BUILD_BUG_ON(sizeof(struct vmcb_control_area)	!= EXPECTED_VMCB_CONTROL_AREA_SIZE);
 	BUILD_BUG_ON(sizeof(struct ghcb)		!= EXPECTED_GHCB_SIZE);
@@ -446,7 +488,7 @@ struct vmcb {
 /* GHCB Accessor functions */
 
 #define GHCB_BITMAP_IDX(field)							\
-	(offsetof(struct sev_es_save_area, field) / sizeof(u64))
+	(offsetof(struct ghcb_save_area, field) / sizeof(u64))
 
 #define DEFINE_GHCB_ACCESSORS(field)						\
 	static inline bool ghcb_##field##_is_valid(const struct ghcb *ghcb)	\
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v9 04/43] KVM: SVM: Update the SEV-ES save area mapping
  2022-01-28 17:17 [PATCH v9 00/43] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (2 preceding siblings ...)
  2022-01-28 17:17 ` [PATCH v9 03/43] KVM: SVM: Create a separate mapping for the GHCB " Brijesh Singh
@ 2022-01-28 17:17 ` Brijesh Singh
  2022-01-28 17:17 ` [PATCH v9 05/43] x86/compressed/64: Detect/setup SEV/SME features earlier in boot Brijesh Singh
                   ` (38 subsequent siblings)
  42 siblings, 0 replies; 115+ messages in thread
From: Brijesh Singh @ 2022-01-28 17:17 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Venu Busireddy, Brijesh Singh

From: Tom Lendacky <thomas.lendacky@amd.com>

This is the final step in defining the multiple save areas to keep them
separate and ensuring proper operation amongst the different types of
guests. Update the SEV-ES/SEV-SNP save area to match the APM. This save
area will be used for the upcoming SEV-SNP AP Creation NAE event support.

Reviewed-by: Venu Busireddy <venu.busireddy@oracle.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/include/asm/svm.h | 66 +++++++++++++++++++++++++++++---------
 1 file changed, 50 insertions(+), 16 deletions(-)

diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
index 5ff1fa364a31..7d90321e7775 100644
--- a/arch/x86/include/asm/svm.h
+++ b/arch/x86/include/asm/svm.h
@@ -290,7 +290,13 @@ struct sev_es_save_area {
 	struct vmcb_seg ldtr;
 	struct vmcb_seg idtr;
 	struct vmcb_seg tr;
-	u8 reserved_1[43];
+	u64 vmpl0_ssp;
+	u64 vmpl1_ssp;
+	u64 vmpl2_ssp;
+	u64 vmpl3_ssp;
+	u64 u_cet;
+	u8 reserved_1[2];
+	u8 vmpl;
 	u8 cpl;
 	u8 reserved_2[4];
 	u64 efer;
@@ -303,9 +309,19 @@ struct sev_es_save_area {
 	u64 dr6;
 	u64 rflags;
 	u64 rip;
-	u8 reserved_4[88];
+	u64 dr0;
+	u64 dr1;
+	u64 dr2;
+	u64 dr3;
+	u64 dr0_addr_mask;
+	u64 dr1_addr_mask;
+	u64 dr2_addr_mask;
+	u64 dr3_addr_mask;
+	u8 reserved_4[24];
 	u64 rsp;
-	u8 reserved_5[24];
+	u64 s_cet;
+	u64 ssp;
+	u64 isst_addr;
 	u64 rax;
 	u64 star;
 	u64 lstar;
@@ -316,7 +332,7 @@ struct sev_es_save_area {
 	u64 sysenter_esp;
 	u64 sysenter_eip;
 	u64 cr2;
-	u8 reserved_6[32];
+	u8 reserved_5[32];
 	u64 g_pat;
 	u64 dbgctl;
 	u64 br_from;
@@ -325,12 +341,12 @@ struct sev_es_save_area {
 	u64 last_excp_to;
 	u8 reserved_7[80];
 	u32 pkru;
-	u8 reserved_9[20];
-	u64 reserved_10;	/* rax already available at 0x01f8 */
+	u8 reserved_8[20];
+	u64 reserved_9;		/* rax already available at 0x01f8 */
 	u64 rcx;
 	u64 rdx;
 	u64 rbx;
-	u64 reserved_11;	/* rsp already available at 0x01d8 */
+	u64 reserved_10;	/* rsp already available at 0x01d8 */
 	u64 rbp;
 	u64 rsi;
 	u64 rdi;
@@ -342,16 +358,34 @@ struct sev_es_save_area {
 	u64 r13;
 	u64 r14;
 	u64 r15;
-	u8 reserved_12[16];
-	u64 sw_exit_code;
-	u64 sw_exit_info_1;
-	u64 sw_exit_info_2;
-	u64 sw_scratch;
+	u8 reserved_11[16];
+	u64 guest_exit_info_1;
+	u64 guest_exit_info_2;
+	u64 guest_exit_int_info;
+	u64 guest_nrip;
 	u64 sev_features;
-	u8 reserved_13[48];
+	u64 vintr_ctrl;
+	u64 guest_exit_code;
+	u64 virtual_tom;
+	u64 tlb_id;
+	u64 pcpu_id;
+	u64 event_inj;
 	u64 xcr0;
-	u8 valid_bitmap[16];
-	u64 x87_state_gpa;
+	u8 reserved_12[16];
+
+	/* Floating point area */
+	u64 x87_dp;
+	u32 mxcsr;
+	u16 x87_ftw;
+	u16 x87_fsw;
+	u16 x87_fcw;
+	u16 x87_fop;
+	u16 x87_ds;
+	u16 x87_cs;
+	u64 x87_rip;
+	u8 fpreg_x87[80];
+	u8 fpreg_xmm[256];
+	u8 fpreg_ymm[256];
 } __packed;
 
 struct ghcb_save_area {
@@ -410,7 +444,7 @@ struct ghcb {
 
 #define EXPECTED_VMCB_SAVE_AREA_SIZE		740
 #define EXPECTED_GHCB_SAVE_AREA_SIZE		1032
-#define EXPECTED_SEV_ES_SAVE_AREA_SIZE		1032
+#define EXPECTED_SEV_ES_SAVE_AREA_SIZE		1648
 #define EXPECTED_VMCB_CONTROL_AREA_SIZE		1024
 #define EXPECTED_GHCB_SIZE			PAGE_SIZE
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v9 05/43] x86/compressed/64: Detect/setup SEV/SME features earlier in boot
  2022-01-28 17:17 [PATCH v9 00/43] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (3 preceding siblings ...)
  2022-01-28 17:17 ` [PATCH v9 04/43] KVM: SVM: Update the SEV-ES save area mapping Brijesh Singh
@ 2022-01-28 17:17 ` Brijesh Singh
  2022-02-01 18:08   ` Borislav Petkov
  2022-01-28 17:17 ` [PATCH v9 06/43] x86/sev: " Brijesh Singh
                   ` (37 subsequent siblings)
  42 siblings, 1 reply; 115+ messages in thread
From: Brijesh Singh @ 2022-01-28 17:17 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

From: Michael Roth <michael.roth@amd.com>

With upcoming SEV-SNP support, SEV-related features need to be
initialized earlier in boot, at the same point the initial #VC handler
is set up, so that the SEV-SNP CPUID table can be utilized during the
initial feature checks. Also, SEV-SNP feature detection will rely on
EFI helper functions to scan the EFI config table for the Confidential
Computing blob, and so would need to be implemented at least partially
in C.

Currently set_sev_encryption_mask() is used to initialize the
sev_status and sme_me_mask globals that advertise what SEV/SME features
are available in a guest. Rename it to sev_enable() to better reflect
that (SME is only enabled in the case of SEV guests in the
boot/compressed kernel), and move it to just after the stage1 #VC
handler is set up so that it can be used to initialize SEV-SNP as well
in future patches.

While at it, re-implement it as C code so that all SEV feature
detection can be better consolidated with upcoming SEV-SNP feature
detection, which will also be in C.

The 32-bit entry path remains unchanged, as it never relied on the
set_sev_encryption_mask() initialization to begin with, possibly due to
the normal rva() helper for accessing globals only being usable by code
in .head.text. Either way, 32-bit entry for SEV-SNP would likely only
be supported for non-EFI boot paths, and so wouldn't rely on existing
EFI helper functions, and so could be handled by a separate/simpler
32-bit initializer in the future if needed.

Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/boot/compressed/head_64.S     | 32 +++++++++++--------
 arch/x86/boot/compressed/mem_encrypt.S | 36 ---------------------
 arch/x86/boot/compressed/misc.h        |  4 +--
 arch/x86/boot/compressed/sev.c         | 44 ++++++++++++++++++++++++++
 4 files changed, 65 insertions(+), 51 deletions(-)

diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index fd9441f40457..49064a9f96e2 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -191,9 +191,8 @@ SYM_FUNC_START(startup_32)
 	/*
 	 * Mark SEV as active in sev_status so that startup32_check_sev_cbit()
 	 * will do a check. The sev_status memory will be fully initialized
-	 * with the contents of MSR_AMD_SEV_STATUS later in
-	 * set_sev_encryption_mask(). For now it is sufficient to know that SEV
-	 * is active.
+	 * with the contents of MSR_AMD_SEV_STATUS later via sev_enable(). For
+	 * now it is sufficient to know that SEV is active.
 	 */
 	movl	$1, rva(sev_status)(%ebp)
 1:
@@ -447,6 +446,23 @@ SYM_CODE_START(startup_64)
 	call	load_stage1_idt
 	popq	%rsi
 
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+	/*
+	 * Now that the stage1 interrupt handlers are set up, #VC exceptions from
+	 * CPUID instructions can be properly handled for SEV-ES guests.
+	 *
+	 * For SEV-SNP, the CPUID table also needs to be set up in advance of any
+	 * CPUID instructions being issued, so go ahead and do that now via
+	 * sev_enable(), which will also handle the rest of the SEV-related
+	 * detection/setup to ensure that has been done in advance of any dependent
+	 * code.
+	 */
+	pushq	%rsi
+	movq	%rsi, %rdi		/* real mode address */
+	call	sev_enable
+	popq	%rsi
+#endif
+
 	/*
 	 * paging_prepare() sets up the trampoline and checks if we need to
 	 * enable 5-level paging.
@@ -559,17 +575,7 @@ SYM_FUNC_START_LOCAL_NOALIGN(.Lrelocated)
 	shrq	$3, %rcx
 	rep	stosq
 
-/*
- * If running as an SEV guest, the encryption mask is required in the
- * page-table setup code below. When the guest also has SEV-ES enabled
- * set_sev_encryption_mask() will cause #VC exceptions, but the stage2
- * handler can't map its GHCB because the page-table is not set up yet.
- * So set up the encryption mask here while still on the stage1 #VC
- * handler. Then load stage2 IDT and switch to the kernel's own
- * page-table.
- */
 	pushq	%rsi
-	call	set_sev_encryption_mask
 	call	load_stage2_idt
 
 	/* Pass boot_params to initialize_identity_maps() */
diff --git a/arch/x86/boot/compressed/mem_encrypt.S b/arch/x86/boot/compressed/mem_encrypt.S
index a63424d13627..a73e4d783cae 100644
--- a/arch/x86/boot/compressed/mem_encrypt.S
+++ b/arch/x86/boot/compressed/mem_encrypt.S
@@ -187,42 +187,6 @@ SYM_CODE_END(startup32_vc_handler)
 	.code64
 
 #include "../../kernel/sev_verify_cbit.S"
-SYM_FUNC_START(set_sev_encryption_mask)
-#ifdef CONFIG_AMD_MEM_ENCRYPT
-	push	%rbp
-	push	%rdx
-
-	movq	%rsp, %rbp		/* Save current stack pointer */
-
-	call	get_sev_encryption_bit	/* Get the encryption bit position */
-	testl	%eax, %eax
-	jz	.Lno_sev_mask
-
-	bts	%rax, sme_me_mask(%rip)	/* Create the encryption mask */
-
-	/*
-	 * Read MSR_AMD64_SEV again and store it to sev_status. Can't do this in
-	 * get_sev_encryption_bit() because this function is 32-bit code and
-	 * shared between 64-bit and 32-bit boot path.
-	 */
-	movl	$MSR_AMD64_SEV, %ecx	/* Read the SEV MSR */
-	rdmsr
-
-	/* Store MSR value in sev_status */
-	shlq	$32, %rdx
-	orq	%rdx, %rax
-	movq	%rax, sev_status(%rip)
-
-.Lno_sev_mask:
-	movq	%rbp, %rsp		/* Restore original stack pointer */
-
-	pop	%rdx
-	pop	%rbp
-#endif
-
-	xor	%rax, %rax
-	RET
-SYM_FUNC_END(set_sev_encryption_mask)
 
 	.data
 
diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
index 16ed360b6692..23e0e395084a 100644
--- a/arch/x86/boot/compressed/misc.h
+++ b/arch/x86/boot/compressed/misc.h
@@ -120,12 +120,12 @@ static inline void console_init(void)
 { }
 #endif
 
-void set_sev_encryption_mask(void);
-
 #ifdef CONFIG_AMD_MEM_ENCRYPT
+void sev_enable(struct boot_params *bp);
 void sev_es_shutdown_ghcb(void);
 extern bool sev_es_check_ghcb_fault(unsigned long address);
 #else
+static inline void sev_enable(struct boot_params *bp) { }
 static inline void sev_es_shutdown_ghcb(void) { }
 static inline bool sev_es_check_ghcb_fault(unsigned long address)
 {
diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index 28bcf04c022e..c88d7e17a71a 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -204,3 +204,47 @@ void do_boot_stage2_vc(struct pt_regs *regs, unsigned long exit_code)
 	else if (result != ES_RETRY)
 		sev_es_terminate(GHCB_SEV_ES_GEN_REQ);
 }
+
+static inline u64 rd_sev_status_msr(void)
+{
+	unsigned long low, high;
+
+	asm volatile("rdmsr" : "=a" (low), "=d" (high) :
+			"c" (MSR_AMD64_SEV));
+
+	return ((high << 32) | low);
+}
+
+void sev_enable(struct boot_params *bp)
+{
+	unsigned int eax, ebx, ecx, edx;
+
+	/* Check for the SME/SEV support leaf */
+	eax = 0x80000000;
+	ecx = 0;
+	native_cpuid(&eax, &ebx, &ecx, &edx);
+	if (eax < 0x8000001f)
+		return;
+
+	/*
+	 * Check for the SME/SEV feature:
+	 *   CPUID Fn8000_001F[EAX]
+	 *   - Bit 0 - Secure Memory Encryption support
+	 *   - Bit 1 - Secure Encrypted Virtualization support
+	 *   CPUID Fn8000_001F[EBX]
+	 *   - Bits 5:0 - Pagetable bit position used to indicate encryption
+	 */
+	eax = 0x8000001f;
+	ecx = 0;
+	native_cpuid(&eax, &ebx, &ecx, &edx);
+	/* Check whether SEV is supported */
+	if (!(eax & BIT(1)))
+		return;
+
+	/* Set the SME mask if this is an SEV guest. */
+	sev_status = rd_sev_status_msr();
+	if (!(sev_status & MSR_AMD64_SEV_ENABLED))
+		return;
+
+	sme_me_mask = BIT_ULL(ebx & 0x3f);
+}
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v9 06/43] x86/sev: Detect/setup SEV/SME features earlier in boot
  2022-01-28 17:17 [PATCH v9 00/43] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (4 preceding siblings ...)
  2022-01-28 17:17 ` [PATCH v9 05/43] x86/compressed/64: Detect/setup SEV/SME features earlier in boot Brijesh Singh
@ 2022-01-28 17:17 ` Brijesh Singh
  2022-01-28 17:17 ` [PATCH v9 07/43] x86/mm: Extend cc_attr to include AMD SEV-SNP Brijesh Singh
                   ` (36 subsequent siblings)
  42 siblings, 0 replies; 115+ messages in thread
From: Brijesh Singh @ 2022-01-28 17:17 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Venu Busireddy, Brijesh Singh

From: Michael Roth <michael.roth@amd.com>

sme_enable() handles feature detection for both SEV and SME. Future
patches will also use it for SEV-SNP feature detection/setup, which
will need to be done immediately after the first #VC handler is set up.
Move it now in preparation.

Reviewed-by: Venu Busireddy <venu.busireddy@oracle.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/kernel/head64.c  |  3 ---
 arch/x86/kernel/head_64.S | 13 +++++++++++++
 2 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index de563db9cdcd..66363f51a3ad 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -192,9 +192,6 @@ unsigned long __head __startup_64(unsigned long physaddr,
 	if (load_delta & ~PMD_PAGE_MASK)
 		for (;;);
 
-	/* Activate Secure Memory Encryption (SME) if supported and enabled */
-	sme_enable(bp);
-
 	/* Include the SME encryption mask in the fixup value */
 	load_delta += sme_get_me_mask();
 
diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index 9c63fc5988cd..9c2c3aff5ee4 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -69,6 +69,19 @@ SYM_CODE_START_NOALIGN(startup_64)
 	call	startup_64_setup_env
 	popq	%rsi
 
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+	/*
+	 * Activate SEV/SME memory encryption if supported/enabled. This needs to
+	 * be done now, since this also includes setup of the SEV-SNP CPUID table,
+	 * which needs to be done before any CPUID instructions are executed in
+	 * subsequent code.
+	 */
+	movq	%rsi, %rdi
+	pushq	%rsi
+	call	sme_enable
+	popq	%rsi
+#endif
+
 	/* Now switch to __KERNEL_CS so IRET works reliably */
 	pushq	$__KERNEL_CS
 	leaq	.Lon_kernel_cs(%rip), %rax
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v9 07/43] x86/mm: Extend cc_attr to include AMD SEV-SNP
  2022-01-28 17:17 [PATCH v9 00/43] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (5 preceding siblings ...)
  2022-01-28 17:17 ` [PATCH v9 06/43] x86/sev: " Brijesh Singh
@ 2022-01-28 17:17 ` Brijesh Singh
  2022-01-28 17:17 ` [PATCH v9 08/43] x86/sev: Define the Linux specific guest termination reasons Brijesh Singh
                   ` (35 subsequent siblings)
  42 siblings, 0 replies; 115+ messages in thread
From: Brijesh Singh @ 2022-01-28 17:17 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

The CC_ATTR_GUEST_SEV_SNP can be used by the guest to query whether the SNP -
Secure Nested Paging feature is active.

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/include/asm/msr-index.h | 2 ++
 arch/x86/kernel/cc_platform.c    | 2 ++
 arch/x86/mm/mem_encrypt.c        | 4 ++++
 include/linux/cc_platform.h      | 8 ++++++++
 4 files changed, 16 insertions(+)

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 3faf0f97edb1..0b3b4dcf55a7 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -481,8 +481,10 @@
 #define MSR_AMD64_SEV			0xc0010131
 #define MSR_AMD64_SEV_ENABLED_BIT	0
 #define MSR_AMD64_SEV_ES_ENABLED_BIT	1
+#define MSR_AMD64_SEV_SNP_ENABLED_BIT	2
 #define MSR_AMD64_SEV_ENABLED		BIT_ULL(MSR_AMD64_SEV_ENABLED_BIT)
 #define MSR_AMD64_SEV_ES_ENABLED	BIT_ULL(MSR_AMD64_SEV_ES_ENABLED_BIT)
+#define MSR_AMD64_SEV_SNP_ENABLED	BIT_ULL(MSR_AMD64_SEV_SNP_ENABLED_BIT)
 
 #define MSR_AMD64_VIRT_SPEC_CTRL	0xc001011f
 
diff --git a/arch/x86/kernel/cc_platform.c b/arch/x86/kernel/cc_platform.c
index 6a6ffcd978f6..b9d1361d112b 100644
--- a/arch/x86/kernel/cc_platform.c
+++ b/arch/x86/kernel/cc_platform.c
@@ -59,6 +59,8 @@ static bool amd_cc_platform_has(enum cc_attr attr)
 		return (sev_status & MSR_AMD64_SEV_ENABLED) &&
 			!(sev_status & MSR_AMD64_SEV_ES_ENABLED);
 
+	case CC_ATTR_GUEST_SEV_SNP:
+		return sev_status & MSR_AMD64_SEV_SNP_ENABLED;
 	default:
 		return false;
 	}
diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index 50d209939c66..f85868c031c6 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -62,6 +62,10 @@ static void print_mem_encrypt_feature_info(void)
 	if (cc_platform_has(CC_ATTR_GUEST_STATE_ENCRYPT))
 		pr_cont(" SEV-ES");
 
+	/* Secure Nested Paging */
+	if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
+		pr_cont(" SEV-SNP");
+
 	pr_cont("\n");
 }
 
diff --git a/include/linux/cc_platform.h b/include/linux/cc_platform.h
index efd8205282da..d08dd65b5c43 100644
--- a/include/linux/cc_platform.h
+++ b/include/linux/cc_platform.h
@@ -72,6 +72,14 @@ enum cc_attr {
 	 * Examples include TDX guest & SEV.
 	 */
 	CC_ATTR_GUEST_UNROLL_STRING_IO,
+
+	/**
+	 * @CC_ATTR_SEV_SNP: Guest SNP is active.
+	 *
+	 * The platform/OS is running as a guest/virtual machine and actively
+	 * using AMD SEV-SNP features.
+	 */
+	CC_ATTR_GUEST_SEV_SNP,
 };
 
 #ifdef CONFIG_ARCH_HAS_CC_PLATFORM
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v9 08/43] x86/sev: Define the Linux specific guest termination reasons
  2022-01-28 17:17 [PATCH v9 00/43] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (6 preceding siblings ...)
  2022-01-28 17:17 ` [PATCH v9 07/43] x86/mm: Extend cc_attr to include AMD SEV-SNP Brijesh Singh
@ 2022-01-28 17:17 ` Brijesh Singh
  2022-01-28 17:17 ` [PATCH v9 09/43] x86/sev: Save the negotiated GHCB version Brijesh Singh
                   ` (34 subsequent siblings)
  42 siblings, 0 replies; 115+ messages in thread
From: Brijesh Singh @ 2022-01-28 17:17 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh, Venu Busireddy

GHCB specification defines the reason code for reason set 0. The reason
codes defined in the set 0 do not cover all possible causes for a guest
to request termination.

The reason sets 1 to 255 are reserved for the vendor-specific codes.
Reserve the reason set 1 for the Linux guest. Define the error codes for
reason set 1 so that one can have meaningful termination reasons and thus
better guest failure diagnosis.

While at it, change the sev_es_terminate() to accept the reason set
parameter.

Reviewed-by: Venu Busireddy <venu.busireddy@oracle.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/boot/compressed/sev.c    |  6 +++---
 arch/x86/include/asm/sev-common.h |  8 ++++++++
 arch/x86/kernel/sev-shared.c      | 11 ++++-------
 arch/x86/kernel/sev.c             |  4 ++--
 4 files changed, 17 insertions(+), 12 deletions(-)

diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index c88d7e17a71a..06416113af22 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -122,7 +122,7 @@ static enum es_result vc_read_mem(struct es_em_ctxt *ctxt,
 static bool early_setup_sev_es(void)
 {
 	if (!sev_es_negotiate_protocol())
-		sev_es_terminate(GHCB_SEV_ES_PROT_UNSUPPORTED);
+		sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_PROT_UNSUPPORTED);
 
 	if (set_page_decrypted((unsigned long)&boot_ghcb_page))
 		return false;
@@ -175,7 +175,7 @@ void do_boot_stage2_vc(struct pt_regs *regs, unsigned long exit_code)
 	enum es_result result;
 
 	if (!boot_ghcb && !early_setup_sev_es())
-		sev_es_terminate(GHCB_SEV_ES_GEN_REQ);
+		sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_GEN_REQ);
 
 	vc_ghcb_invalidate(boot_ghcb);
 	result = vc_init_em_ctxt(&ctxt, regs, exit_code);
@@ -202,7 +202,7 @@ void do_boot_stage2_vc(struct pt_regs *regs, unsigned long exit_code)
 	if (result == ES_OK)
 		vc_finish_insn(&ctxt);
 	else if (result != ES_RETRY)
-		sev_es_terminate(GHCB_SEV_ES_GEN_REQ);
+		sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_GEN_REQ);
 }
 
 static inline u64 rd_sev_status_msr(void)
diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h
index 1b2fd32b42fe..94f0ea574049 100644
--- a/arch/x86/include/asm/sev-common.h
+++ b/arch/x86/include/asm/sev-common.h
@@ -73,9 +73,17 @@
 	 /* GHCBData[23:16] */				\
 	((((u64)reason_val) & 0xff) << 16))
 
+/* Error codes from reason set 0 */
+#define SEV_TERM_SET_GEN		0
 #define GHCB_SEV_ES_GEN_REQ		0
 #define GHCB_SEV_ES_PROT_UNSUPPORTED	1
 
+/* Linux-specific reason codes (used with reason set 1) */
+#define SEV_TERM_SET_LINUX		1
+#define GHCB_TERM_REGISTER		0	/* GHCB GPA registration failure */
+#define GHCB_TERM_PSC			1	/* Page State Change failure */
+#define GHCB_TERM_PVALIDATE		2	/* Pvalidate failure */
+
 #define GHCB_RESP_CODE(v)		((v) & GHCB_MSR_INFO_MASK)
 
 /*
diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c
index ce987688bbc0..2abf8a7d75e5 100644
--- a/arch/x86/kernel/sev-shared.c
+++ b/arch/x86/kernel/sev-shared.c
@@ -24,15 +24,12 @@ static bool __init sev_es_check_cpu_features(void)
 	return true;
 }
 
-static void __noreturn sev_es_terminate(unsigned int reason)
+static void __noreturn sev_es_terminate(unsigned int set, unsigned int reason)
 {
 	u64 val = GHCB_MSR_TERM_REQ;
 
-	/*
-	 * Tell the hypervisor what went wrong - only reason-set 0 is
-	 * currently supported.
-	 */
-	val |= GHCB_SEV_TERM_REASON(0, reason);
+	/* Tell the hypervisor what went wrong. */
+	val |= GHCB_SEV_TERM_REASON(set, reason);
 
 	/* Request Guest Termination from Hypvervisor */
 	sev_es_wr_ghcb_msr(val);
@@ -221,7 +218,7 @@ void __init do_vc_no_ghcb(struct pt_regs *regs, unsigned long exit_code)
 
 fail:
 	/* Terminate the guest */
-	sev_es_terminate(GHCB_SEV_ES_GEN_REQ);
+	sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_GEN_REQ);
 }
 
 static enum es_result vc_insn_string_read(struct es_em_ctxt *ctxt,
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index e6d316a01fdd..19ad09712902 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -1337,7 +1337,7 @@ DEFINE_IDTENTRY_VC_KERNEL(exc_vmm_communication)
 		show_regs(regs);
 
 		/* Ask hypervisor to sev_es_terminate */
-		sev_es_terminate(GHCB_SEV_ES_GEN_REQ);
+		sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_GEN_REQ);
 
 		/* If that fails and we get here - just panic */
 		panic("Returned from Terminate-Request to Hypervisor\n");
@@ -1385,7 +1385,7 @@ bool __init handle_vc_boot_ghcb(struct pt_regs *regs)
 
 	/* Do initial setup or terminate the guest */
 	if (unlikely(boot_ghcb == NULL && !sev_es_setup_ghcb()))
-		sev_es_terminate(GHCB_SEV_ES_GEN_REQ);
+		sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_GEN_REQ);
 
 	vc_ghcb_invalidate(boot_ghcb);
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v9 09/43] x86/sev: Save the negotiated GHCB version
  2022-01-28 17:17 [PATCH v9 00/43] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (7 preceding siblings ...)
  2022-01-28 17:17 ` [PATCH v9 08/43] x86/sev: Define the Linux specific guest termination reasons Brijesh Singh
@ 2022-01-28 17:17 ` Brijesh Singh
  2022-01-28 17:17 ` [PATCH v9 10/43] x86/sev: Check SEV-SNP features support Brijesh Singh
                   ` (33 subsequent siblings)
  42 siblings, 0 replies; 115+ messages in thread
From: Brijesh Singh @ 2022-01-28 17:17 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh, Venu Busireddy

The SEV-ES guest calls the sev_es_negotiate_protocol() to negotiate the
GHCB protocol version before establishing the GHCB. Cache the negotiated
GHCB version so that it can be used later.

Reviewed-by: Venu Busireddy <venu.busireddy@oracle.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/include/asm/sev.h   |  2 +-
 arch/x86/kernel/sev-shared.c | 17 ++++++++++++++---
 2 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index ec060c433589..9b9c190e8c3b 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -12,7 +12,7 @@
 #include <asm/insn.h>
 #include <asm/sev-common.h>
 
-#define GHCB_PROTO_OUR		0x0001UL
+#define GHCB_PROTOCOL_MIN	1ULL
 #define GHCB_PROTOCOL_MAX	1ULL
 #define GHCB_DEFAULT_USAGE	0ULL
 
diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c
index 2abf8a7d75e5..91105f5a02a8 100644
--- a/arch/x86/kernel/sev-shared.c
+++ b/arch/x86/kernel/sev-shared.c
@@ -14,6 +14,15 @@
 #define has_cpuflag(f)	boot_cpu_has(f)
 #endif
 
+/*
+ * Since feature negotiation related variables are set early in the boot
+ * process they must reside in the .data section so as not to be zeroed
+ * out when the .bss section is later cleared.
+ *
+ * GHCB protocol version negotiated with the hypervisor.
+ */
+static u16 ghcb_version __ro_after_init;
+
 static bool __init sev_es_check_cpu_features(void)
 {
 	if (!has_cpuflag(X86_FEATURE_RDRAND)) {
@@ -51,10 +60,12 @@ static bool sev_es_negotiate_protocol(void)
 	if (GHCB_MSR_INFO(val) != GHCB_MSR_SEV_INFO_RESP)
 		return false;
 
-	if (GHCB_MSR_PROTO_MAX(val) < GHCB_PROTO_OUR ||
-	    GHCB_MSR_PROTO_MIN(val) > GHCB_PROTO_OUR)
+	if (GHCB_MSR_PROTO_MAX(val) < GHCB_PROTOCOL_MIN ||
+	    GHCB_MSR_PROTO_MIN(val) > GHCB_PROTOCOL_MAX)
 		return false;
 
+	ghcb_version = min_t(size_t, GHCB_MSR_PROTO_MAX(val), GHCB_PROTOCOL_MAX);
+
 	return true;
 }
 
@@ -127,7 +138,7 @@ enum es_result sev_es_ghcb_hv_call(struct ghcb *ghcb, bool set_ghcb_msr,
 				   u64 exit_info_1, u64 exit_info_2)
 {
 	/* Fill in protocol and format specifiers */
-	ghcb->protocol_version = GHCB_PROTOCOL_MAX;
+	ghcb->protocol_version = ghcb_version;
 	ghcb->ghcb_usage       = GHCB_DEFAULT_USAGE;
 
 	ghcb_set_sw_exit_code(ghcb, exit_code);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v9 10/43] x86/sev: Check SEV-SNP features support
  2022-01-28 17:17 [PATCH v9 00/43] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (8 preceding siblings ...)
  2022-01-28 17:17 ` [PATCH v9 09/43] x86/sev: Save the negotiated GHCB version Brijesh Singh
@ 2022-01-28 17:17 ` Brijesh Singh
  2022-02-01 19:59   ` Borislav Petkov
  2022-01-28 17:17 ` [PATCH v9 11/43] x86/sev: Add a helper for the PVALIDATE instruction Brijesh Singh
                   ` (32 subsequent siblings)
  42 siblings, 1 reply; 115+ messages in thread
From: Brijesh Singh @ 2022-01-28 17:17 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

Version 2 of the GHCB specification added the advertisement of features
that are supported by the hypervisor. If hypervisor supports the SEV-SNP
then it must set the SEV-SNP features bit to indicate that the base
SEV-SNP is supported.

Check the SEV-SNP feature while establishing the GHCB, if failed,
terminate the guest.

Version 2 of GHCB specification adds several new NAEs, most of them are
optional except the hypervisor feature. Now that hypervisor feature NAE
is implemented, so bump the GHCB maximum support protocol version.

While at it, move the GHCB protocol negotiation check from VC exception
handler to sev_enable() so that all feature detection happens before
the first VC exception.

While at it, document why GHCB page cannot be setup from the
load_stage2_idt().

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/boot/compressed/idt_64.c | 10 +++++++++-
 arch/x86/boot/compressed/sev.c    | 21 ++++++++++++++++-----
 arch/x86/include/asm/sev-common.h |  6 ++++++
 arch/x86/include/asm/sev.h        |  2 +-
 arch/x86/include/uapi/asm/svm.h   |  2 ++
 arch/x86/kernel/sev-shared.c      | 20 ++++++++++++++++++++
 arch/x86/kernel/sev.c             | 15 +++++++++++++++
 7 files changed, 69 insertions(+), 7 deletions(-)

diff --git a/arch/x86/boot/compressed/idt_64.c b/arch/x86/boot/compressed/idt_64.c
index 9b93567d663a..63e9044ab1d6 100644
--- a/arch/x86/boot/compressed/idt_64.c
+++ b/arch/x86/boot/compressed/idt_64.c
@@ -39,7 +39,15 @@ void load_stage1_idt(void)
 	load_boot_idt(&boot_idt_desc);
 }
 
-/* Setup IDT after kernel jumping to  .Lrelocated */
+/*
+ * Setup IDT after kernel jumping to  .Lrelocated
+ *
+ * initialize_identity_maps() needs a PF handler setup. The PF handler setup
+ * needs to happen in load_stage2_idt() where the IDT is loaded and there the
+ * VC IDT entry gets setup too in order to handle VCs, one needs a GHCB which
+ * gets setup with an already setup table which is done in
+ * initialize_identity_maps() and this is where the circle is complete.
+ */
 void load_stage2_idt(void)
 {
 	boot_idt_desc.address = (unsigned long)boot_idt;
diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index 06416113af22..745b418866ea 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -119,11 +119,8 @@ static enum es_result vc_read_mem(struct es_em_ctxt *ctxt,
 /* Include code for early handlers */
 #include "../../kernel/sev-shared.c"
 
-static bool early_setup_sev_es(void)
+static bool early_setup_ghcb(void)
 {
-	if (!sev_es_negotiate_protocol())
-		sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_PROT_UNSUPPORTED);
-
 	if (set_page_decrypted((unsigned long)&boot_ghcb_page))
 		return false;
 
@@ -174,7 +171,7 @@ void do_boot_stage2_vc(struct pt_regs *regs, unsigned long exit_code)
 	struct es_em_ctxt ctxt;
 	enum es_result result;
 
-	if (!boot_ghcb && !early_setup_sev_es())
+	if (!boot_ghcb && !early_setup_ghcb())
 		sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_GEN_REQ);
 
 	vc_ghcb_invalidate(boot_ghcb);
@@ -246,5 +243,19 @@ void sev_enable(struct boot_params *bp)
 	if (!(sev_status & MSR_AMD64_SEV_ENABLED))
 		return;
 
+	/* Negotiate the GHCB protocol version. */
+	if (sev_status & MSR_AMD64_SEV_ES_ENABLED) {
+		if (!sev_es_negotiate_protocol())
+			sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_PROT_UNSUPPORTED);
+	}
+
+	/*
+	 * SEV-SNP is supported in v2 of the GHCB spec which mandates support for HV
+	 * features. If SEV-SNP is enabled, then check if the hypervisor supports
+	 * the SEV-SNP features.
+	 */
+	if (sev_status & MSR_AMD64_SEV_SNP_ENABLED && !(get_hv_features() & GHCB_HV_FT_SNP))
+		sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SNP_UNSUPPORTED);
+
 	sme_me_mask = BIT_ULL(ebx & 0x3f);
 }
diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h
index 94f0ea574049..6f037c29a46e 100644
--- a/arch/x86/include/asm/sev-common.h
+++ b/arch/x86/include/asm/sev-common.h
@@ -60,6 +60,11 @@
 /* GHCB Hypervisor Feature Request/Response */
 #define GHCB_MSR_HV_FT_REQ		0x080
 #define GHCB_MSR_HV_FT_RESP		0x081
+#define GHCB_MSR_HV_FT_RESP_VAL(v)			\
+	/* GHCBData[63:12] */				\
+	(((u64)(v) & GENMASK_ULL(63, 12)) >> 12)
+
+#define GHCB_HV_FT_SNP			BIT_ULL(0)
 
 #define GHCB_MSR_TERM_REQ		0x100
 #define GHCB_MSR_TERM_REASON_SET_POS	12
@@ -77,6 +82,7 @@
 #define SEV_TERM_SET_GEN		0
 #define GHCB_SEV_ES_GEN_REQ		0
 #define GHCB_SEV_ES_PROT_UNSUPPORTED	1
+#define GHCB_SNP_UNSUPPORTED		2
 
 /* Linux-specific reason codes (used with reason set 1) */
 #define SEV_TERM_SET_LINUX		1
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 9b9c190e8c3b..17b75f6ee11a 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -13,7 +13,7 @@
 #include <asm/sev-common.h>
 
 #define GHCB_PROTOCOL_MIN	1ULL
-#define GHCB_PROTOCOL_MAX	1ULL
+#define GHCB_PROTOCOL_MAX	2ULL
 #define GHCB_DEFAULT_USAGE	0ULL
 
 #define	VMGEXIT()			{ asm volatile("rep; vmmcall\n\r"); }
diff --git a/arch/x86/include/uapi/asm/svm.h b/arch/x86/include/uapi/asm/svm.h
index efa969325ede..b0ad00f4c1e1 100644
--- a/arch/x86/include/uapi/asm/svm.h
+++ b/arch/x86/include/uapi/asm/svm.h
@@ -108,6 +108,7 @@
 #define SVM_VMGEXIT_AP_JUMP_TABLE		0x80000005
 #define SVM_VMGEXIT_SET_AP_JUMP_TABLE		0
 #define SVM_VMGEXIT_GET_AP_JUMP_TABLE		1
+#define SVM_VMGEXIT_HV_FEATURES			0x8000fffd
 #define SVM_VMGEXIT_UNSUPPORTED_EVENT		0x8000ffff
 
 /* Exit code reserved for hypervisor/software use */
@@ -218,6 +219,7 @@
 	{ SVM_VMGEXIT_NMI_COMPLETE,	"vmgexit_nmi_complete" }, \
 	{ SVM_VMGEXIT_AP_HLT_LOOP,	"vmgexit_ap_hlt_loop" }, \
 	{ SVM_VMGEXIT_AP_JUMP_TABLE,	"vmgexit_ap_jump_table" }, \
+	{ SVM_VMGEXIT_HV_FEATURES,	"vmgexit_hypervisor_feature" }, \
 	{ SVM_EXIT_ERR,         "invalid_guest_state" }
 
 
diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c
index 91105f5a02a8..4a876e684f67 100644
--- a/arch/x86/kernel/sev-shared.c
+++ b/arch/x86/kernel/sev-shared.c
@@ -48,6 +48,26 @@ static void __noreturn sev_es_terminate(unsigned int set, unsigned int reason)
 		asm volatile("hlt\n" : : : "memory");
 }
 
+/*
+ * The hypervisor features are available from GHCB version 2 onward.
+ */
+static u64 get_hv_features(void)
+{
+	u64 val;
+
+	if (ghcb_version < 2)
+		return 0;
+
+	sev_es_wr_ghcb_msr(GHCB_MSR_HV_FT_REQ);
+	VMGEXIT();
+
+	val = sev_es_rd_ghcb_msr();
+	if (GHCB_RESP_CODE(val) != GHCB_MSR_HV_FT_RESP)
+		return 0;
+
+	return GHCB_MSR_HV_FT_RESP_VAL(val);
+}
+
 static bool sev_es_negotiate_protocol(void)
 {
 	u64 val;
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index 19ad09712902..24df739c9c05 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -43,6 +43,9 @@ static struct ghcb boot_ghcb_page __bss_decrypted __aligned(PAGE_SIZE);
  */
 static struct ghcb __initdata *boot_ghcb;
 
+/* Bitmap of SEV features supported by the hypervisor */
+static u64 sev_hv_features __ro_after_init;
+
 /* #VC handler runtime per-CPU data */
 struct sev_es_runtime_data {
 	struct ghcb ghcb_page;
@@ -766,6 +769,18 @@ void __init sev_es_init_vc_handling(void)
 	if (!sev_es_check_cpu_features())
 		panic("SEV-ES CPU Features missing");
 
+	/*
+	 * SEV-SNP is supported in v2 of the GHCB spec which mandates support for HV
+	 * features. If SEV-SNP is enabled, then check if the hypervisor supports
+	 * the SEV-SNP features.
+	 */
+	if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP)) {
+		sev_hv_features = get_hv_features();
+
+		if (!(sev_hv_features & GHCB_HV_FT_SNP))
+			sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SNP_UNSUPPORTED);
+	}
+
 	/* Enable SEV-ES special handling */
 	static_branch_enable(&sev_es_enable_key);
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v9 11/43] x86/sev: Add a helper for the PVALIDATE instruction
  2022-01-28 17:17 [PATCH v9 00/43] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (9 preceding siblings ...)
  2022-01-28 17:17 ` [PATCH v9 10/43] x86/sev: Check SEV-SNP features support Brijesh Singh
@ 2022-01-28 17:17 ` Brijesh Singh
  2022-01-28 17:17 ` [PATCH v9 12/43] x86/sev: Check the vmpl level Brijesh Singh
                   ` (31 subsequent siblings)
  42 siblings, 0 replies; 115+ messages in thread
From: Brijesh Singh @ 2022-01-28 17:17 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh, Venu Busireddy

An SNP-active guest uses the PVALIDATE instruction to validate or
rescind the validation of a guest page’s RMP entry. Upon completion,
a return code is stored in EAX and rFLAGS bits are set based on the
return code. If the instruction completed successfully, the CF
indicates if the content of the RMP were changed or not.

See AMD APM Volume 3 for additional details.

Reviewed-by: Venu Busireddy <venu.busireddy@oracle.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/include/asm/sev.h | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 17b75f6ee11a..4ee98976aed8 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -60,6 +60,9 @@ extern void vc_no_ghcb(void);
 extern void vc_boot_ghcb(void);
 extern bool handle_vc_boot_ghcb(struct pt_regs *regs);
 
+/* Software defined (when rFlags.CF = 1) */
+#define PVALIDATE_FAIL_NOUPDATE		255
+
 #ifdef CONFIG_AMD_MEM_ENCRYPT
 extern struct static_key_false sev_es_enable_key;
 extern void __sev_es_ist_enter(struct pt_regs *regs);
@@ -87,12 +90,30 @@ extern enum es_result sev_es_ghcb_hv_call(struct ghcb *ghcb,
 					  struct es_em_ctxt *ctxt,
 					  u64 exit_code, u64 exit_info_1,
 					  u64 exit_info_2);
+static inline int pvalidate(unsigned long vaddr, bool rmp_psize, bool validate)
+{
+	bool no_rmpupdate;
+	int rc;
+
+	/* "pvalidate" mnemonic support in binutils 2.36 and newer */
+	asm volatile(".byte 0xF2, 0x0F, 0x01, 0xFF\n\t"
+		     CC_SET(c)
+		     : CC_OUT(c) (no_rmpupdate), "=a"(rc)
+		     : "a"(vaddr), "c"(rmp_psize), "d"(validate)
+		     : "memory", "cc");
+
+	if (no_rmpupdate)
+		return PVALIDATE_FAIL_NOUPDATE;
+
+	return rc;
+}
 #else
 static inline void sev_es_ist_enter(struct pt_regs *regs) { }
 static inline void sev_es_ist_exit(void) { }
 static inline int sev_es_setup_ap_jump_table(struct real_mode_header *rmh) { return 0; }
 static inline void sev_es_nmi_complete(void) { }
 static inline int sev_es_efi_map_ghcbs(pgd_t *pgd) { return 0; }
+static inline int pvalidate(unsigned long vaddr, bool rmp_psize, bool validate) { return 0; }
 #endif
 
 #endif
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v9 12/43] x86/sev: Check the vmpl level
  2022-01-28 17:17 [PATCH v9 00/43] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (10 preceding siblings ...)
  2022-01-28 17:17 ` [PATCH v9 11/43] x86/sev: Add a helper for the PVALIDATE instruction Brijesh Singh
@ 2022-01-28 17:17 ` Brijesh Singh
  2022-01-28 17:17 ` [PATCH v9 13/43] x86/compressed: Add helper for validating pages in the decompression stage Brijesh Singh
                   ` (30 subsequent siblings)
  42 siblings, 0 replies; 115+ messages in thread
From: Brijesh Singh @ 2022-01-28 17:17 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

Virtual Machine Privilege Level (VMPL) feature in the SEV-SNP architecture
allows a guest VM to divide its address space into four levels. The level
can be used to provide the hardware isolated abstraction layers with a VM.
The VMPL0 is the highest privilege, and VMPL3 is the least privilege.
Certain operations must be done by the VMPL0 software, such as:

* Validate or invalidate memory range (PVALIDATE instruction)
* Allocate VMSA page (RMPADJUST instruction when VMSA=1)

The initial SEV-SNP support requires that the guest kernel is running on
VMPL0. Add a check to make sure that kernel is running at VMPL0 before
continuing the boot. There is no easy method to query the current VMPL
level, so use the RMPADJUST instruction to determine whether the guest is
running at the VMPL0.

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/boot/compressed/sev.c    | 30 +++++++++++++++++++++++++++---
 arch/x86/include/asm/sev-common.h |  1 +
 arch/x86/include/asm/sev.h        | 16 ++++++++++++++++
 3 files changed, 44 insertions(+), 3 deletions(-)

diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index 745b418866ea..adfec1d43a77 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -212,6 +212,26 @@ static inline u64 rd_sev_status_msr(void)
 	return ((high << 32) | low);
 }
 
+static void enforce_vmpl0(void)
+{
+	u64 attrs;
+	int err;
+
+	/*
+	 * RMPADJUST modifies RMP permissions of a lesser-privileged (numerically
+	 * higher) privilege level. Here, clear the VMPL1 permission mask of the
+	 * GHCB page. If the guest is not running at VMPL0, this will fail.
+	 *
+	 * If the guest is running at VMPL0, it will succeed. Even if that operation
+	 * modifies permission bits, it is still ok to do currently because Linux
+	 * SEV-SNP guests are supported only on VMPL0 so VMPL1 or higher permission
+	 * masks changing is a don't-care.
+	 */
+	attrs = 1;
+	if (rmpadjust((unsigned long)&boot_ghcb_page, RMP_PG_SIZE_4K, attrs))
+		sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_NOT_VMPL0);
+}
+
 void sev_enable(struct boot_params *bp)
 {
 	unsigned int eax, ebx, ecx, edx;
@@ -252,10 +272,14 @@ void sev_enable(struct boot_params *bp)
 	/*
 	 * SEV-SNP is supported in v2 of the GHCB spec which mandates support for HV
 	 * features. If SEV-SNP is enabled, then check if the hypervisor supports
-	 * the SEV-SNP features.
+	 * the SEV-SNP features and is launched at VMPL0 level.
 	 */
-	if (sev_status & MSR_AMD64_SEV_SNP_ENABLED && !(get_hv_features() & GHCB_HV_FT_SNP))
-		sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SNP_UNSUPPORTED);
+	if (sev_status & MSR_AMD64_SEV_SNP_ENABLED) {
+		if (!(get_hv_features() & GHCB_HV_FT_SNP))
+			sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SNP_UNSUPPORTED);
+
+		enforce_vmpl0();
+	}
 
 	sme_me_mask = BIT_ULL(ebx & 0x3f);
 }
diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h
index 6f037c29a46e..f2b6da96f79b 100644
--- a/arch/x86/include/asm/sev-common.h
+++ b/arch/x86/include/asm/sev-common.h
@@ -89,6 +89,7 @@
 #define GHCB_TERM_REGISTER		0	/* GHCB GPA registration failure */
 #define GHCB_TERM_PSC			1	/* Page State Change failure */
 #define GHCB_TERM_PVALIDATE		2	/* Pvalidate failure */
+#define GHCB_TERM_NOT_VMPL0		3	/* SEV-SNP guest is not running at VMPL-0 */
 
 #define GHCB_RESP_CODE(v)		((v) & GHCB_MSR_INFO_MASK)
 
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 4ee98976aed8..e37451849165 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -63,6 +63,9 @@ extern bool handle_vc_boot_ghcb(struct pt_regs *regs);
 /* Software defined (when rFlags.CF = 1) */
 #define PVALIDATE_FAIL_NOUPDATE		255
 
+/* RMP page size */
+#define RMP_PG_SIZE_4K			0
+
 #ifdef CONFIG_AMD_MEM_ENCRYPT
 extern struct static_key_false sev_es_enable_key;
 extern void __sev_es_ist_enter(struct pt_regs *regs);
@@ -90,6 +93,18 @@ extern enum es_result sev_es_ghcb_hv_call(struct ghcb *ghcb,
 					  struct es_em_ctxt *ctxt,
 					  u64 exit_code, u64 exit_info_1,
 					  u64 exit_info_2);
+static inline int rmpadjust(unsigned long vaddr, bool rmp_psize, unsigned long attrs)
+{
+	int rc;
+
+	/* "rmpadjust" mnemonic support in binutils 2.36 and newer */
+	asm volatile(".byte 0xF3,0x0F,0x01,0xFE\n\t"
+		     : "=a"(rc)
+		     : "a"(vaddr), "c"(rmp_psize), "d"(attrs)
+		     : "memory", "cc");
+
+	return rc;
+}
 static inline int pvalidate(unsigned long vaddr, bool rmp_psize, bool validate)
 {
 	bool no_rmpupdate;
@@ -114,6 +129,7 @@ static inline int sev_es_setup_ap_jump_table(struct real_mode_header *rmh) { ret
 static inline void sev_es_nmi_complete(void) { }
 static inline int sev_es_efi_map_ghcbs(pgd_t *pgd) { return 0; }
 static inline int pvalidate(unsigned long vaddr, bool rmp_psize, bool validate) { return 0; }
+static inline int rmpadjust(unsigned long vaddr, bool rmp_psize, unsigned long attrs) { return 0; }
 #endif
 
 #endif
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v9 13/43] x86/compressed: Add helper for validating pages in the decompression stage
  2022-01-28 17:17 [PATCH v9 00/43] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (11 preceding siblings ...)
  2022-01-28 17:17 ` [PATCH v9 12/43] x86/sev: Check the vmpl level Brijesh Singh
@ 2022-01-28 17:17 ` Brijesh Singh
  2022-01-28 17:17 ` [PATCH v9 14/43] x86/compressed: Register GHCB memory when SEV-SNP is active Brijesh Singh
                   ` (29 subsequent siblings)
  42 siblings, 0 replies; 115+ messages in thread
From: Brijesh Singh @ 2022-01-28 17:17 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

Many of the integrity guarantees of SEV-SNP are enforced through the
Reverse Map Table (RMP). Each RMP entry contains the GPA at which a
particular page of DRAM should be mapped. The VMs can request the
hypervisor to add pages in the RMP table via the Page State Change VMGEXIT
defined in the GHCB specification. Inside each RMP entry is a Validated
flag; this flag is automatically cleared to 0 by the CPU hardware when a
new RMP entry is created for a guest. Each VM page can be either
validated or invalidated, as indicated by the Validated flag in the RMP
entry. Memory access to a private page that is not validated generates
a #VC. A VM must use PVALIDATE instruction to validate the private page
before using it.

To maintain the security guarantee of SEV-SNP guests, when transitioning
pages from private to shared, the guest must invalidate the pages before
asking the hypervisor to change the page state to shared in the RMP table.

After the pages are mapped private in the page table, the guest must issue
a page state change VMGEXIT to make the pages private in the RMP table and
validate it.

On boot, BIOS should have validated the entire system memory. During
the kernel decompression stage, the early_setup_ghcb() uses the
set_page_decrypted() to make the GHCB page shared (i.e clear encryption
attribute). And while exiting from the decompression, it calls the
set_page_encrypted() to make the page private.

Add snp_set_page_{private,shared}() helpers that are used by the
set_page_{decrypted,encrypted}() to change the page state in the RMP table.

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/boot/compressed/ident_map_64.c | 18 +++++++++-
 arch/x86/boot/compressed/misc.h         |  4 +++
 arch/x86/boot/compressed/sev.c          | 46 +++++++++++++++++++++++++
 arch/x86/include/asm/sev-common.h       | 26 ++++++++++++++
 4 files changed, 93 insertions(+), 1 deletion(-)

diff --git a/arch/x86/boot/compressed/ident_map_64.c b/arch/x86/boot/compressed/ident_map_64.c
index f7213d0943b8..3d566964b829 100644
--- a/arch/x86/boot/compressed/ident_map_64.c
+++ b/arch/x86/boot/compressed/ident_map_64.c
@@ -275,15 +275,31 @@ static int set_clr_page_flags(struct x86_mapping_info *info,
 	 * Changing encryption attributes of a page requires to flush it from
 	 * the caches.
 	 */
-	if ((set | clr) & _PAGE_ENC)
+	if ((set | clr) & _PAGE_ENC) {
 		clflush_page(address);
 
+		/*
+		 * If the encryption attribute is being cleared, then change
+		 * the page state to shared in the RMP table.
+		 */
+		if (clr)
+			snp_set_page_shared(__pa(address & PAGE_MASK));
+	}
+
 	/* Update PTE */
 	pte = *ptep;
 	pte = pte_set_flags(pte, set);
 	pte = pte_clear_flags(pte, clr);
 	set_pte(ptep, pte);
 
+	/*
+	 * If the encryption attribute is being set, then change the page state to
+	 * private in the RMP entry. The page state change must be done after the PTE
+	 * is updated.
+	 */
+	if (set & _PAGE_ENC)
+		snp_set_page_private(__pa(address & PAGE_MASK));
+
 	/* Flush TLB after changing encryption attribute */
 	write_cr3(top_level_pgt);
 
diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
index 23e0e395084a..01cc13c12059 100644
--- a/arch/x86/boot/compressed/misc.h
+++ b/arch/x86/boot/compressed/misc.h
@@ -124,6 +124,8 @@ static inline void console_init(void)
 void sev_enable(struct boot_params *bp);
 void sev_es_shutdown_ghcb(void);
 extern bool sev_es_check_ghcb_fault(unsigned long address);
+void snp_set_page_private(unsigned long paddr);
+void snp_set_page_shared(unsigned long paddr);
 #else
 static inline void sev_enable(struct boot_params *bp) { }
 static inline void sev_es_shutdown_ghcb(void) { }
@@ -131,6 +133,8 @@ static inline bool sev_es_check_ghcb_fault(unsigned long address)
 {
 	return false;
 }
+static inline void snp_set_page_private(unsigned long paddr) { }
+static inline void snp_set_page_shared(unsigned long paddr) { }
 #endif
 
 /* acpi.c */
diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index adfec1d43a77..1305267372d1 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -119,6 +119,52 @@ static enum es_result vc_read_mem(struct es_em_ctxt *ctxt,
 /* Include code for early handlers */
 #include "../../kernel/sev-shared.c"
 
+static inline bool sev_snp_enabled(void)
+{
+	return sev_status & MSR_AMD64_SEV_SNP_ENABLED;
+}
+
+static void __page_state_change(unsigned long paddr, enum psc_op op)
+{
+	u64 val;
+
+	if (!sev_snp_enabled())
+		return;
+
+	/*
+	 * If private -> shared then invalidate the page before requesting the
+	 * state change in the RMP table.
+	 */
+	if (op == SNP_PAGE_STATE_SHARED && pvalidate(paddr, RMP_PG_SIZE_4K, 0))
+		sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_PVALIDATE);
+
+	/* Issue VMGEXIT to change the page state in RMP table. */
+	sev_es_wr_ghcb_msr(GHCB_MSR_PSC_REQ_GFN(paddr >> PAGE_SHIFT, op));
+	VMGEXIT();
+
+	/* Read the response of the VMGEXIT. */
+	val = sev_es_rd_ghcb_msr();
+	if ((GHCB_RESP_CODE(val) != GHCB_MSR_PSC_RESP) || GHCB_MSR_PSC_RESP_VAL(val))
+		sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_PSC);
+
+	/*
+	 * Now that page state is changed in the RMP table, validate it so that it is
+	 * consistent with the RMP entry.
+	 */
+	if (op == SNP_PAGE_STATE_PRIVATE && pvalidate(paddr, RMP_PG_SIZE_4K, 1))
+		sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_PVALIDATE);
+}
+
+void snp_set_page_private(unsigned long paddr)
+{
+	__page_state_change(paddr, SNP_PAGE_STATE_PRIVATE);
+}
+
+void snp_set_page_shared(unsigned long paddr)
+{
+	__page_state_change(paddr, SNP_PAGE_STATE_SHARED);
+}
+
 static bool early_setup_ghcb(void)
 {
 	if (set_page_decrypted((unsigned long)&boot_ghcb_page))
diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h
index f2b6da96f79b..dbb4635f2bb5 100644
--- a/arch/x86/include/asm/sev-common.h
+++ b/arch/x86/include/asm/sev-common.h
@@ -57,6 +57,32 @@
 #define GHCB_MSR_AP_RESET_HOLD_REQ	0x006
 #define GHCB_MSR_AP_RESET_HOLD_RESP	0x007
 
+/*
+ * SNP Page State Change Operation
+ *
+ * GHCBData[55:52] - Page operation:
+ *   0x0001	Page assignment, Private
+ *   0x0002	Page assignment, Shared
+ */
+enum psc_op {
+	SNP_PAGE_STATE_PRIVATE = 1,
+	SNP_PAGE_STATE_SHARED,
+};
+
+#define GHCB_MSR_PSC_REQ		0x014
+#define GHCB_MSR_PSC_REQ_GFN(gfn, op)			\
+	/* GHCBData[55:52] */				\
+	(((u64)((op) & 0xf) << 52) |			\
+	/* GHCBData[51:12] */				\
+	((u64)((gfn) & GENMASK_ULL(39, 0)) << 12) |	\
+	/* GHCBData[11:0] */				\
+	GHCB_MSR_PSC_REQ)
+
+#define GHCB_MSR_PSC_RESP		0x015
+#define GHCB_MSR_PSC_RESP_VAL(val)			\
+	/* GHCBData[63:32] */				\
+	(((u64)(val) & GENMASK_ULL(63, 32)) >> 32)
+
 /* GHCB Hypervisor Feature Request/Response */
 #define GHCB_MSR_HV_FT_REQ		0x080
 #define GHCB_MSR_HV_FT_RESP		0x081
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v9 14/43] x86/compressed: Register GHCB memory when SEV-SNP is active
  2022-01-28 17:17 [PATCH v9 00/43] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (12 preceding siblings ...)
  2022-01-28 17:17 ` [PATCH v9 13/43] x86/compressed: Add helper for validating pages in the decompression stage Brijesh Singh
@ 2022-01-28 17:17 ` Brijesh Singh
  2022-01-28 17:17 ` [PATCH v9 15/43] x86/sev: " Brijesh Singh
                   ` (28 subsequent siblings)
  42 siblings, 0 replies; 115+ messages in thread
From: Brijesh Singh @ 2022-01-28 17:17 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh, Venu Busireddy

The SEV-SNP guest is required by the GHCB spec to register the GHCB's
Guest Physical Address (GPA). This is because the hypervisor may prefer
that a guest use a consistent and/or specific GPA for the GHCB associated
with a vCPU. For more information, see the GHCB specification section
"GHCB GPA Registration".

If hypervisor can not work with the guest provided GPA then terminate the
guest boot.

Reviewed-by: Venu Busireddy <venu.busireddy@oracle.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/boot/compressed/sev.c    |  4 ++++
 arch/x86/include/asm/sev-common.h | 13 +++++++++++++
 arch/x86/kernel/sev-shared.c      | 16 ++++++++++++++++
 3 files changed, 33 insertions(+)

diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index 1305267372d1..1e5aa6b65025 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -178,6 +178,10 @@ static bool early_setup_ghcb(void)
 	/* Initialize lookup tables for the instruction decoder */
 	inat_init_tables();
 
+	/* SEV-SNP guest requires the GHCB GPA must be registered */
+	if (sev_snp_enabled())
+		snp_register_ghcb_early(__pa(&boot_ghcb_page));
+
 	return true;
 }
 
diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h
index dbb4635f2bb5..891d03408f93 100644
--- a/arch/x86/include/asm/sev-common.h
+++ b/arch/x86/include/asm/sev-common.h
@@ -57,6 +57,19 @@
 #define GHCB_MSR_AP_RESET_HOLD_REQ	0x006
 #define GHCB_MSR_AP_RESET_HOLD_RESP	0x007
 
+/* GHCB GPA Register */
+#define GHCB_MSR_REG_GPA_REQ		0x012
+#define GHCB_MSR_REG_GPA_REQ_VAL(v)			\
+	/* GHCBData[63:12] */				\
+	(((u64)((v) & GENMASK_ULL(51, 0)) << 12) |	\
+	/* GHCBData[11:0] */				\
+	GHCB_MSR_REG_GPA_REQ)
+
+#define GHCB_MSR_REG_GPA_RESP		0x013
+#define GHCB_MSR_REG_GPA_RESP_VAL(v)			\
+	/* GHCBData[63:12] */				\
+	(((u64)(v) & GENMASK_ULL(63, 12)) >> 12)
+
 /*
  * SNP Page State Change Operation
  *
diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c
index 4a876e684f67..e9ff13cd90b0 100644
--- a/arch/x86/kernel/sev-shared.c
+++ b/arch/x86/kernel/sev-shared.c
@@ -68,6 +68,22 @@ static u64 get_hv_features(void)
 	return GHCB_MSR_HV_FT_RESP_VAL(val);
 }
 
+static void __maybe_unused snp_register_ghcb_early(unsigned long paddr)
+{
+	unsigned long pfn = paddr >> PAGE_SHIFT;
+	u64 val;
+
+	sev_es_wr_ghcb_msr(GHCB_MSR_REG_GPA_REQ_VAL(pfn));
+	VMGEXIT();
+
+	val = sev_es_rd_ghcb_msr();
+
+	/* If the response GPA is not ours then abort the guest */
+	if ((GHCB_RESP_CODE(val) != GHCB_MSR_REG_GPA_RESP) ||
+	    (GHCB_MSR_REG_GPA_RESP_VAL(val) != pfn))
+		sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_REGISTER);
+}
+
 static bool sev_es_negotiate_protocol(void)
 {
 	u64 val;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v9 15/43] x86/sev: Register GHCB memory when SEV-SNP is active
  2022-01-28 17:17 [PATCH v9 00/43] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (13 preceding siblings ...)
  2022-01-28 17:17 ` [PATCH v9 14/43] x86/compressed: Register GHCB memory when SEV-SNP is active Brijesh Singh
@ 2022-01-28 17:17 ` Brijesh Singh
  2022-02-02 10:34   ` Borislav Petkov
  2022-01-28 17:17 ` [PATCH v9 16/43] x86/sev: Add helper for validating pages in early enc attribute changes Brijesh Singh
                   ` (27 subsequent siblings)
  42 siblings, 1 reply; 115+ messages in thread
From: Brijesh Singh @ 2022-01-28 17:17 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

The SEV-SNP guest is required by the GHCB spec to register the GHCB's
Guest Physical Address (GPA). This is because the hypervisor may prefer
that a guest use a consistent and/or specific GPA for the GHCB associated
with a vCPU. For more information, see the GHCB specification section
"GHCB GPA Registration".

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/include/asm/sev.h   |   2 +
 arch/x86/kernel/cpu/common.c |   4 +
 arch/x86/kernel/head64.c     |   4 +-
 arch/x86/kernel/sev-shared.c |   2 +-
 arch/x86/kernel/sev.c        | 145 ++++++++++++++++++++---------------
 5 files changed, 94 insertions(+), 63 deletions(-)

diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index e37451849165..48df02713ee0 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -122,6 +122,7 @@ static inline int pvalidate(unsigned long vaddr, bool rmp_psize, bool validate)
 
 	return rc;
 }
+void setup_ghcb(void);
 #else
 static inline void sev_es_ist_enter(struct pt_regs *regs) { }
 static inline void sev_es_ist_exit(void) { }
@@ -130,6 +131,7 @@ static inline void sev_es_nmi_complete(void) { }
 static inline int sev_es_efi_map_ghcbs(pgd_t *pgd) { return 0; }
 static inline int pvalidate(unsigned long vaddr, bool rmp_psize, bool validate) { return 0; }
 static inline int rmpadjust(unsigned long vaddr, bool rmp_psize, unsigned long attrs) { return 0; }
+static inline void setup_ghcb(void) { }
 #endif
 
 #endif
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 7b8382c11788..d3c319f2ffd2 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -59,6 +59,7 @@
 #include <asm/cpu_device_id.h>
 #include <asm/uv/uv.h>
 #include <asm/sigframe.h>
+#include <asm/sev.h>
 
 #include "cpu.h"
 
@@ -1988,6 +1989,9 @@ void cpu_init_exception_handling(void)
 
 	load_TR_desc();
 
+	/* GHCB need to be setup to handle #VC. */
+	setup_ghcb();
+
 	/* Finally load the IDT */
 	load_current_idt();
 }
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 66363f51a3ad..8075e91cff2b 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -597,8 +597,10 @@ static void startup_64_load_idt(unsigned long physbase)
 void early_setup_idt(void)
 {
 	/* VMM Communication Exception */
-	if (IS_ENABLED(CONFIG_AMD_MEM_ENCRYPT))
+	if (IS_ENABLED(CONFIG_AMD_MEM_ENCRYPT)) {
+		setup_ghcb();
 		set_bringup_idt_handler(bringup_idt_table, X86_TRAP_VC, vc_boot_ghcb);
+	}
 
 	bringup_idt_descr.address = (unsigned long)bringup_idt_table;
 	native_load_idt(&bringup_idt_descr);
diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c
index e9ff13cd90b0..3aaef1a18ffe 100644
--- a/arch/x86/kernel/sev-shared.c
+++ b/arch/x86/kernel/sev-shared.c
@@ -68,7 +68,7 @@ static u64 get_hv_features(void)
 	return GHCB_MSR_HV_FT_RESP_VAL(val);
 }
 
-static void __maybe_unused snp_register_ghcb_early(unsigned long paddr)
+static void snp_register_ghcb_early(unsigned long paddr)
 {
 	unsigned long pfn = paddr >> PAGE_SHIFT;
 	u64 val;
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index 24df739c9c05..b86b48b66a44 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -41,7 +41,7 @@ static struct ghcb boot_ghcb_page __bss_decrypted __aligned(PAGE_SIZE);
  * Needs to be in the .data section because we need it NULL before bss is
  * cleared
  */
-static struct ghcb __initdata *boot_ghcb;
+static struct ghcb *boot_ghcb __section(".data");
 
 /* Bitmap of SEV features supported by the hypervisor */
 static u64 sev_hv_features __ro_after_init;
@@ -161,55 +161,6 @@ void noinstr __sev_es_ist_exit(void)
 	this_cpu_write(cpu_tss_rw.x86_tss.ist[IST_INDEX_VC], *(unsigned long *)ist);
 }
 
-/*
- * Nothing shall interrupt this code path while holding the per-CPU
- * GHCB. The backup GHCB is only for NMIs interrupting this path.
- *
- * Callers must disable local interrupts around it.
- */
-static noinstr struct ghcb *__sev_get_ghcb(struct ghcb_state *state)
-{
-	struct sev_es_runtime_data *data;
-	struct ghcb *ghcb;
-
-	WARN_ON(!irqs_disabled());
-
-	data = this_cpu_read(runtime_data);
-	ghcb = &data->ghcb_page;
-
-	if (unlikely(data->ghcb_active)) {
-		/* GHCB is already in use - save its contents */
-
-		if (unlikely(data->backup_ghcb_active)) {
-			/*
-			 * Backup-GHCB is also already in use. There is no way
-			 * to continue here so just kill the machine. To make
-			 * panic() work, mark GHCBs inactive so that messages
-			 * can be printed out.
-			 */
-			data->ghcb_active        = false;
-			data->backup_ghcb_active = false;
-
-			instrumentation_begin();
-			panic("Unable to handle #VC exception! GHCB and Backup GHCB are already in use");
-			instrumentation_end();
-		}
-
-		/* Mark backup_ghcb active before writing to it */
-		data->backup_ghcb_active = true;
-
-		state->ghcb = &data->backup_ghcb;
-
-		/* Backup GHCB content */
-		*state->ghcb = *ghcb;
-	} else {
-		state->ghcb = NULL;
-		data->ghcb_active = true;
-	}
-
-	return ghcb;
-}
-
 static inline u64 sev_es_rd_ghcb_msr(void)
 {
 	return __rdmsr(MSR_AMD64_SEV_ES_GHCB);
@@ -483,6 +434,55 @@ static enum es_result vc_slow_virt_to_phys(struct ghcb *ghcb, struct es_em_ctxt
 /* Include code shared with pre-decompression boot stage */
 #include "sev-shared.c"
 
+/*
+ * Nothing shall interrupt this code path while holding the per-CPU
+ * GHCB. The backup GHCB is only for NMIs interrupting this path.
+ *
+ * Callers must disable local interrupts around it.
+ */
+static noinstr struct ghcb *__sev_get_ghcb(struct ghcb_state *state)
+{
+	struct sev_es_runtime_data *data;
+	struct ghcb *ghcb;
+
+	WARN_ON(!irqs_disabled());
+
+	data = this_cpu_read(runtime_data);
+	ghcb = &data->ghcb_page;
+
+	if (unlikely(data->ghcb_active)) {
+		/* GHCB is already in use - save its contents */
+
+		if (unlikely(data->backup_ghcb_active)) {
+			/*
+			 * Backup-GHCB is also already in use. There is no way
+			 * to continue here so just kill the machine. To make
+			 * panic() work, mark GHCBs inactive so that messages
+			 * can be printed out.
+			 */
+			data->ghcb_active        = false;
+			data->backup_ghcb_active = false;
+
+			instrumentation_begin();
+			panic("Unable to handle #VC exception! GHCB and Backup GHCB are already in use");
+			instrumentation_end();
+		}
+
+		/* Mark backup_ghcb active before writing to it */
+		data->backup_ghcb_active = true;
+
+		state->ghcb = &data->backup_ghcb;
+
+		/* Backup GHCB content */
+		*state->ghcb = *ghcb;
+	} else {
+		state->ghcb = NULL;
+		data->ghcb_active = true;
+	}
+
+	return ghcb;
+}
+
 static noinstr void __sev_put_ghcb(struct ghcb_state *state)
 {
 	struct sev_es_runtime_data *data;
@@ -647,15 +647,40 @@ static enum es_result vc_handle_msr(struct ghcb *ghcb, struct es_em_ctxt *ctxt)
 	return ret;
 }
 
-/*
- * This function runs on the first #VC exception after the kernel
- * switched to virtual addresses.
- */
-static bool __init sev_es_setup_ghcb(void)
+static void snp_register_per_cpu_ghcb(void)
+{
+	struct sev_es_runtime_data *data;
+	struct ghcb *ghcb;
+
+	data = this_cpu_read(runtime_data);
+	ghcb = &data->ghcb_page;
+
+	snp_register_ghcb_early(__pa(ghcb));
+}
+
+void setup_ghcb(void)
 {
+	if (!cc_platform_has(CC_ATTR_GUEST_STATE_ENCRYPT))
+		return;
+
 	/* First make sure the hypervisor talks a supported protocol. */
 	if (!sev_es_negotiate_protocol())
-		return false;
+		sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_GEN_REQ);
+
+	/*
+	 * Check whether the runtime #VC exception handler is active.
+	 * The runtime exception handler uses the per-CPU GHCB page, and
+	 * the GHCB page would be setup by sev_es_init_vc_handling().
+	 *
+	 * If SEV-SNP is active, then register the per-CPU GHCB page so
+	 * that runtime exception handler can use it.
+	 */
+	if (initial_vc_handler == (unsigned long)kernel_exc_vmm_communication) {
+		if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
+			snp_register_per_cpu_ghcb();
+
+		return;
+	}
 
 	/*
 	 * Clear the boot_ghcb. The first exception comes in before the bss
@@ -666,7 +691,9 @@ static bool __init sev_es_setup_ghcb(void)
 	/* Alright - Make the boot-ghcb public */
 	boot_ghcb = &boot_ghcb_page;
 
-	return true;
+	/* SEV-SNP guest requires that GHCB GPA must be registered. */
+	if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
+		snp_register_ghcb_early(__pa(&boot_ghcb_page));
 }
 
 #ifdef CONFIG_HOTPLUG_CPU
@@ -1398,10 +1425,6 @@ bool __init handle_vc_boot_ghcb(struct pt_regs *regs)
 	struct es_em_ctxt ctxt;
 	enum es_result result;
 
-	/* Do initial setup or terminate the guest */
-	if (unlikely(boot_ghcb == NULL && !sev_es_setup_ghcb()))
-		sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_GEN_REQ);
-
 	vc_ghcb_invalidate(boot_ghcb);
 
 	result = vc_init_em_ctxt(&ctxt, regs, exit_code);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v9 16/43] x86/sev: Add helper for validating pages in early enc attribute changes
  2022-01-28 17:17 [PATCH v9 00/43] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (14 preceding siblings ...)
  2022-01-28 17:17 ` [PATCH v9 15/43] x86/sev: " Brijesh Singh
@ 2022-01-28 17:17 ` Brijesh Singh
  2022-01-28 17:17 ` [PATCH v9 17/43] x86/kernel: Make the .bss..decrypted section shared in RMP table Brijesh Singh
                   ` (26 subsequent siblings)
  42 siblings, 0 replies; 115+ messages in thread
From: Brijesh Singh @ 2022-01-28 17:17 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh, Venu Busireddy

The early_set_memory_{encrypted,decrypted}() are used for changing the
page from decrypted (shared) to encrypted (private) and vice versa.
When SEV-SNP is active, the page state transition needs to go through
additional steps.

If the page is transitioned from shared to private, then perform the
following after the encryption attribute is set in the page table:

1. Issue the page state change VMGEXIT to add the page as a private
   in the RMP table.
2. Validate the page after its successfully added in the RMP table.

To maintain the security guarantees, if the page is transitioned from
private to shared, then perform the following before clearing the
encryption attribute from the page table.

1. Invalidate the page.
2. Issue the page state change VMGEXIT to make the page shared in the
   RMP table.

The early_set_memory_{encrypted,decrypted} can be called before the
GHCB is setup, use the SNP page state MSR protocol VMGEXIT defined in
the GHCB specification to request the page state change in the RMP
table.

While at it, add a helper snp_prep_memory() which will be used in
probe_roms(), in a later patch.

Reviewed-by: Venu Busireddy <venu.busireddy@oracle.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/include/asm/sev.h    | 10 ++++
 arch/x86/kernel/sev.c         | 99 +++++++++++++++++++++++++++++++++++
 arch/x86/mm/mem_encrypt_amd.c | 58 ++++++++++++++++++--
 3 files changed, 163 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 48df02713ee0..f65d257e3d4a 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -123,6 +123,11 @@ static inline int pvalidate(unsigned long vaddr, bool rmp_psize, bool validate)
 	return rc;
 }
 void setup_ghcb(void);
+void __init early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr,
+					 unsigned int npages);
+void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr,
+					unsigned int npages);
+void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op);
 #else
 static inline void sev_es_ist_enter(struct pt_regs *regs) { }
 static inline void sev_es_ist_exit(void) { }
@@ -132,6 +137,11 @@ static inline int sev_es_efi_map_ghcbs(pgd_t *pgd) { return 0; }
 static inline int pvalidate(unsigned long vaddr, bool rmp_psize, bool validate) { return 0; }
 static inline int rmpadjust(unsigned long vaddr, bool rmp_psize, unsigned long attrs) { return 0; }
 static inline void setup_ghcb(void) { }
+static inline void __init
+early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr, unsigned int npages) { }
+static inline void __init
+early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr, unsigned int npages) { }
+static inline void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op) { }
 #endif
 
 #endif
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index b86b48b66a44..5ee0fbd98d0d 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -556,6 +556,105 @@ static u64 get_jump_table_addr(void)
 	return ret;
 }
 
+static void pvalidate_pages(unsigned long vaddr, unsigned int npages, bool validate)
+{
+	unsigned long vaddr_end;
+	int rc;
+
+	vaddr = vaddr & PAGE_MASK;
+	vaddr_end = vaddr + (npages << PAGE_SHIFT);
+
+	while (vaddr < vaddr_end) {
+		rc = pvalidate(vaddr, RMP_PG_SIZE_4K, validate);
+		if (WARN(rc, "Failed to validate address 0x%lx ret %d", vaddr, rc))
+			sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_PVALIDATE);
+
+		vaddr = vaddr + PAGE_SIZE;
+	}
+}
+
+static void __init early_set_pages_state(unsigned long paddr, unsigned int npages, enum psc_op op)
+{
+	unsigned long paddr_end;
+	u64 val;
+
+	paddr = paddr & PAGE_MASK;
+	paddr_end = paddr + (npages << PAGE_SHIFT);
+
+	while (paddr < paddr_end) {
+		/*
+		 * Use the MSR protocol because this function can be called before
+		 * the GHCB is established.
+		 */
+		sev_es_wr_ghcb_msr(GHCB_MSR_PSC_REQ_GFN(paddr >> PAGE_SHIFT, op));
+		VMGEXIT();
+
+		val = sev_es_rd_ghcb_msr();
+
+		if (WARN(GHCB_RESP_CODE(val) != GHCB_MSR_PSC_RESP,
+			 "Wrong PSC response code: 0x%x\n",
+			 (unsigned int)GHCB_RESP_CODE(val)))
+			goto e_term;
+
+		if (WARN(GHCB_MSR_PSC_RESP_VAL(val),
+			 "Failed to change page state to '%s' paddr 0x%lx error 0x%llx\n",
+			 op == SNP_PAGE_STATE_PRIVATE ? "private" : "shared",
+			 paddr, GHCB_MSR_PSC_RESP_VAL(val)))
+			goto e_term;
+
+		paddr = paddr + PAGE_SIZE;
+	}
+
+	return;
+
+e_term:
+	sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_PSC);
+}
+
+void __init early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr,
+					 unsigned int npages)
+{
+	if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
+		return;
+
+	 /*
+	  * Ask the hypervisor to mark the memory pages as private in the RMP
+	  * table.
+	  */
+	early_set_pages_state(paddr, npages, SNP_PAGE_STATE_PRIVATE);
+
+	/* Validate the memory pages after they've been added in the RMP table. */
+	pvalidate_pages(vaddr, npages, true);
+}
+
+void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr,
+					unsigned int npages)
+{
+	if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
+		return;
+
+	/* Invalidate the memory pages before they are marked shared in the RMP table. */
+	pvalidate_pages(vaddr, npages, false);
+
+	 /* Ask hypervisor to mark the memory pages shared in the RMP table. */
+	early_set_pages_state(paddr, npages, SNP_PAGE_STATE_SHARED);
+}
+
+void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op)
+{
+	unsigned long vaddr, npages;
+
+	vaddr = (unsigned long)__va(paddr);
+	npages = PAGE_ALIGN(sz) >> PAGE_SHIFT;
+
+	if (op == SNP_PAGE_STATE_PRIVATE)
+		early_snp_set_memory_private(vaddr, paddr, npages);
+	else if (op == SNP_PAGE_STATE_SHARED)
+		early_snp_set_memory_shared(vaddr, paddr, npages);
+	else
+		WARN(1, "invalid memory op %d\n", op);
+}
+
 int sev_es_setup_ap_jump_table(struct real_mode_header *rmh)
 {
 	u16 startup_cs, startup_ip;
diff --git a/arch/x86/mm/mem_encrypt_amd.c b/arch/x86/mm/mem_encrypt_amd.c
index 2b2d018ea345..b59be6aac97b 100644
--- a/arch/x86/mm/mem_encrypt_amd.c
+++ b/arch/x86/mm/mem_encrypt_amd.c
@@ -31,6 +31,7 @@
 #include <asm/processor-flags.h>
 #include <asm/msr.h>
 #include <asm/cmdline.h>
+#include <asm/sev.h>
 
 #include "mm_internal.h"
 
@@ -47,6 +48,36 @@ EXPORT_SYMBOL(sme_me_mask);
 /* Buffer used for early in-place encryption by BSP, no locking needed */
 static char sme_early_buffer[PAGE_SIZE] __initdata __aligned(PAGE_SIZE);
 
+/*
+ * SNP-specific routine which needs to additionally change the page state from
+ * private to shared before copying the data from the source to destination and
+ * restore after the copy.
+ */
+static inline void __init snp_memcpy(void *dst, void *src, size_t sz,
+				     unsigned long paddr, bool decrypt)
+{
+	unsigned long npages = PAGE_ALIGN(sz) >> PAGE_SHIFT;
+
+	if (decrypt) {
+		/*
+		 * @paddr needs to be accessed decrypted, mark the page shared in
+		 * the RMP table before copying it.
+		 */
+		early_snp_set_memory_shared((unsigned long)__va(paddr), paddr, npages);
+
+		memcpy(dst, src, sz);
+
+		/* Restore the page state after the memcpy. */
+		early_snp_set_memory_private((unsigned long)__va(paddr), paddr, npages);
+	} else {
+		/*
+		 * @paddr need to be accessed encrypted, no need for the page state
+		 * change.
+		 */
+		memcpy(dst, src, sz);
+	}
+}
+
 /*
  * This routine does not change the underlying encryption setting of the
  * page(s) that map this memory. It assumes that eventually the memory is
@@ -95,8 +126,13 @@ static void __init __sme_early_enc_dec(resource_size_t paddr,
 		 * Use a temporary buffer, of cache-line multiple size, to
 		 * avoid data corruption as documented in the APM.
 		 */
-		memcpy(sme_early_buffer, src, len);
-		memcpy(dst, sme_early_buffer, len);
+		if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP)) {
+			snp_memcpy(sme_early_buffer, src, len, paddr, enc);
+			snp_memcpy(dst, sme_early_buffer, len, paddr, !enc);
+		} else {
+			memcpy(sme_early_buffer, src, len);
+			memcpy(dst, sme_early_buffer, len);
+		}
 
 		early_memunmap(dst, len);
 		early_memunmap(src, len);
@@ -318,14 +354,28 @@ static void __init __set_clr_pte_enc(pte_t *kpte, int level, bool enc)
 	clflush_cache_range(__va(pa), size);
 
 	/* Encrypt/decrypt the contents in-place */
-	if (enc)
+	if (enc) {
 		sme_early_encrypt(pa, size);
-	else
+	} else {
 		sme_early_decrypt(pa, size);
 
+		/*
+		 * ON SNP, the page state in the RMP table must happen
+		 * before the page table updates.
+		 */
+		early_snp_set_memory_shared((unsigned long)__va(pa), pa, 1);
+	}
+
 	/* Change the page encryption mask. */
 	new_pte = pfn_pte(pfn, new_prot);
 	set_pte_atomic(kpte, new_pte);
+
+	/*
+	 * If page is set encrypted in the page table, then update the RMP table to
+	 * add this page as private.
+	 */
+	if (enc)
+		early_snp_set_memory_private((unsigned long)__va(pa), pa, 1);
 }
 
 static int __init early_set_memory_enc_dec(unsigned long vaddr,
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v9 17/43] x86/kernel: Make the .bss..decrypted section shared in RMP table
  2022-01-28 17:17 [PATCH v9 00/43] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (15 preceding siblings ...)
  2022-01-28 17:17 ` [PATCH v9 16/43] x86/sev: Add helper for validating pages in early enc attribute changes Brijesh Singh
@ 2022-01-28 17:17 ` Brijesh Singh
  2022-02-02 11:06   ` Borislav Petkov
  2022-01-28 17:17 ` [PATCH v9 18/43] x86/kernel: Validate ROM memory before accessing when SEV-SNP is active Brijesh Singh
                   ` (25 subsequent siblings)
  42 siblings, 1 reply; 115+ messages in thread
From: Brijesh Singh @ 2022-01-28 17:17 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

The encryption attribute for the .bss..decrypted section is cleared in the
initial page table build. This is because the section contains the data
that need to be shared between the guest and the hypervisor.

When SEV-SNP is active, just clearing the encryption attribute in the
page table is not enough. The page state need to be updated in the RMP
table.

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/kernel/head64.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 8075e91cff2b..1239bc104cda 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -143,7 +143,21 @@ static unsigned long sme_postprocess_startup(struct boot_params *bp, pmdval_t *p
 	if (sme_get_me_mask()) {
 		vaddr = (unsigned long)__start_bss_decrypted;
 		vaddr_end = (unsigned long)__end_bss_decrypted;
+
 		for (; vaddr < vaddr_end; vaddr += PMD_SIZE) {
+			/*
+			 * When SEV-SNP is active then transition the page to
+			 * shared in the RMP table so that it is consistent with
+			 * the page table attribute change.
+			 *
+			 * At this point, kernel is running in identity mapped mode.
+			 * The __start_bss_decrypted is a regular kernel address. The
+			 * early_snp_set_memory_shared() requires a valid virtual
+			 * address, so use __pa() against __start_bss_decrypted to
+			 * get valid virtual address.
+			 */
+			early_snp_set_memory_shared(__pa(vaddr), __pa(vaddr), PTRS_PER_PMD);
+
 			i = pmd_index(vaddr);
 			pmd[i] -= sme_get_me_mask();
 		}
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v9 18/43] x86/kernel: Validate ROM memory before accessing when SEV-SNP is active
  2022-01-28 17:17 [PATCH v9 00/43] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (16 preceding siblings ...)
  2022-01-28 17:17 ` [PATCH v9 17/43] x86/kernel: Make the .bss..decrypted section shared in RMP table Brijesh Singh
@ 2022-01-28 17:17 ` Brijesh Singh
  2022-02-02 15:41   ` Borislav Petkov
  2022-01-28 17:17 ` [PATCH v9 19/43] x86/mm: Add support to validate memory when changing C-bit Brijesh Singh
                   ` (24 subsequent siblings)
  42 siblings, 1 reply; 115+ messages in thread
From: Brijesh Singh @ 2022-01-28 17:17 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

probe_roms() accesses the memory range (0xc0000 - 0x10000) to probe
various ROMs. The memory range is not part of the E820 system RAM
range. The memory range is mapped as private (i.e encrypted) in page
table.

When SEV-SNP is active, all the private memory must be validated before
the access. The ROM range was not part of E820 map, so the guest BIOS
did not validate it. An access to invalidated memory will cause a VC
exception. The guest does not support handling not-validated VC exception
yet, so validate the ROM memory regions before it is accessed.

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/kernel/probe_roms.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/probe_roms.c b/arch/x86/kernel/probe_roms.c
index 36e84d904260..d19a80565252 100644
--- a/arch/x86/kernel/probe_roms.c
+++ b/arch/x86/kernel/probe_roms.c
@@ -21,6 +21,7 @@
 #include <asm/sections.h>
 #include <asm/io.h>
 #include <asm/setup_arch.h>
+#include <asm/sev.h>
 
 static struct resource system_rom_resource = {
 	.name	= "System ROM",
@@ -197,11 +198,21 @@ static int __init romchecksum(const unsigned char *rom, unsigned long length)
 
 void __init probe_roms(void)
 {
-	const unsigned char *rom;
 	unsigned long start, length, upper;
+	const unsigned char *rom;
 	unsigned char c;
 	int i;
 
+	/*
+	 * The ROM memory is not part of the E820 system RAM and is not pre-validated
+	 * by the BIOS. The kernel page table maps the ROM region as encrypted memory,
+	 * the SEV-SNP requires the encrypted memory must be validated before the
+	 * access. Validate the ROM before accessing it.
+	 */
+	snp_prep_memory(video_rom_resource.start,
+			((system_rom_resource.end + 1) - video_rom_resource.start),
+			SNP_PAGE_STATE_PRIVATE);
+
 	/* video rom */
 	upper = adapter_rom_resources[0].start;
 	for (start = video_rom_resource.start; start < upper; start += 2048) {
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v9 19/43] x86/mm: Add support to validate memory when changing C-bit
  2022-01-28 17:17 [PATCH v9 00/43] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (17 preceding siblings ...)
  2022-01-28 17:17 ` [PATCH v9 18/43] x86/kernel: Validate ROM memory before accessing when SEV-SNP is active Brijesh Singh
@ 2022-01-28 17:17 ` Brijesh Singh
  2022-02-02 16:10   ` Borislav Petkov
  2022-01-28 17:17 ` [PATCH v9 20/43] x86/sev: Use SEV-SNP AP creation to start secondary CPUs Brijesh Singh
                   ` (23 subsequent siblings)
  42 siblings, 1 reply; 115+ messages in thread
From: Brijesh Singh @ 2022-01-28 17:17 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

The set_memory_{encrypted,decrypted}() are used for changing the pages
from decrypted (shared) to encrypted (private) and vice versa.
When SEV-SNP is active, the page state transition needs to go through
additional steps done by the guest.

If the page is transitioned from shared to private, then perform the
following after the encryption attribute is set in the page table:

1. Issue the page state change VMGEXIT to add the memory region in
   the RMP table.
2. Validate the memory region after the RMP entry is added.

To maintain the security guarantees, if the page is transitioned from
private to shared, then perform the following before encryption attribute
is removed from the page table:

1. Invalidate the page.
2. Issue the page state change VMGEXIT to remove the page from RMP table.

To change the page state in the RMP table, use the Page State Change
VMGEXIT defined in the GHCB specification.

The GHCB specification provides the flexibility to use either 4K or 2MB
page size in during the page state change (PSC) request. For now use the
4K page size for all the PSC until RMP page size tracking is supported
in the kernel.

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/include/asm/sev-common.h |  22 ++++
 arch/x86/include/asm/sev.h        |   4 +
 arch/x86/include/uapi/asm/svm.h   |   2 +
 arch/x86/kernel/sev.c             | 167 ++++++++++++++++++++++++++++++
 arch/x86/mm/pat/set_memory.c      |  15 +++
 5 files changed, 210 insertions(+)

diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h
index 891d03408f93..956b8b49528a 100644
--- a/arch/x86/include/asm/sev-common.h
+++ b/arch/x86/include/asm/sev-common.h
@@ -105,6 +105,28 @@ enum psc_op {
 
 #define GHCB_HV_FT_SNP			BIT_ULL(0)
 
+/* SNP Page State Change NAE event */
+#define VMGEXIT_PSC_MAX_ENTRY		253
+
+struct psc_hdr {
+	u16 cur_entry;
+	u16 end_entry;
+	u32 reserved;
+} __packed;
+
+struct psc_entry {
+	u64	cur_page	: 12,
+		gfn		: 40,
+		operation	: 4,
+		pagesize	: 1,
+		reserved	: 7;
+} __packed;
+
+struct snp_psc_desc {
+	struct psc_hdr hdr;
+	struct psc_entry entries[VMGEXIT_PSC_MAX_ENTRY];
+} __packed;
+
 #define GHCB_MSR_TERM_REQ		0x100
 #define GHCB_MSR_TERM_REASON_SET_POS	12
 #define GHCB_MSR_TERM_REASON_SET_MASK	0xf
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index f65d257e3d4a..feeb93e6ec97 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -128,6 +128,8 @@ void __init early_snp_set_memory_private(unsigned long vaddr, unsigned long padd
 void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr,
 					unsigned int npages);
 void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op);
+void snp_set_memory_shared(unsigned long vaddr, unsigned int npages);
+void snp_set_memory_private(unsigned long vaddr, unsigned int npages);
 #else
 static inline void sev_es_ist_enter(struct pt_regs *regs) { }
 static inline void sev_es_ist_exit(void) { }
@@ -142,6 +144,8 @@ early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr, unsigned
 static inline void __init
 early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr, unsigned int npages) { }
 static inline void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op) { }
+static inline void snp_set_memory_shared(unsigned long vaddr, unsigned int npages) { }
+static inline void snp_set_memory_private(unsigned long vaddr, unsigned int npages) { }
 #endif
 
 #endif
diff --git a/arch/x86/include/uapi/asm/svm.h b/arch/x86/include/uapi/asm/svm.h
index b0ad00f4c1e1..0dcdb6e0c913 100644
--- a/arch/x86/include/uapi/asm/svm.h
+++ b/arch/x86/include/uapi/asm/svm.h
@@ -108,6 +108,7 @@
 #define SVM_VMGEXIT_AP_JUMP_TABLE		0x80000005
 #define SVM_VMGEXIT_SET_AP_JUMP_TABLE		0
 #define SVM_VMGEXIT_GET_AP_JUMP_TABLE		1
+#define SVM_VMGEXIT_PSC				0x80000010
 #define SVM_VMGEXIT_HV_FEATURES			0x8000fffd
 #define SVM_VMGEXIT_UNSUPPORTED_EVENT		0x8000ffff
 
@@ -219,6 +220,7 @@
 	{ SVM_VMGEXIT_NMI_COMPLETE,	"vmgexit_nmi_complete" }, \
 	{ SVM_VMGEXIT_AP_HLT_LOOP,	"vmgexit_ap_hlt_loop" }, \
 	{ SVM_VMGEXIT_AP_JUMP_TABLE,	"vmgexit_ap_jump_table" }, \
+	{ SVM_VMGEXIT_PSC,	"vmgexit_page_state_change" }, \
 	{ SVM_VMGEXIT_HV_FEATURES,	"vmgexit_hypervisor_feature" }, \
 	{ SVM_EXIT_ERR,         "invalid_guest_state" }
 
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index 5ee0fbd98d0d..b7ae741a8c66 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -655,6 +655,173 @@ void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op
 		WARN(1, "invalid memory op %d\n", op);
 }
 
+static int vmgexit_psc(struct snp_psc_desc *desc)
+{
+	int cur_entry, end_entry, ret = 0;
+	struct snp_psc_desc *data;
+	struct ghcb_state state;
+	struct es_em_ctxt ctxt;
+	unsigned long flags;
+	struct ghcb *ghcb;
+
+	/*
+	 * __sev_get_ghcb() needs to run with IRQs disabled because it is using
+	 * a per-CPU GHCB.
+	 */
+	local_irq_save(flags);
+
+	ghcb = __sev_get_ghcb(&state);
+	if (unlikely(!ghcb))
+		return 1;
+
+	/* Copy the input desc into GHCB shared buffer */
+	data = (struct snp_psc_desc *)ghcb->shared_buffer;
+	memcpy(ghcb->shared_buffer, desc, min_t(int, GHCB_SHARED_BUF_SIZE, sizeof(*desc)));
+
+	/*
+	 * As per the GHCB specification, the hypervisor can resume the guest
+	 * before processing all the entries. Check whether all the entries
+	 * are processed. If not, then keep retrying. Note, the hypervisor
+	 * will update the data memory directly to indicate the status, so
+	 * reference the data->hdr everywhere.
+	 *
+	 * The strategy here is to wait for the hypervisor to change the page
+	 * state in the RMP table before guest accesses the memory pages. If the
+	 * page state change was not successful, then later memory access will
+	 * result in a crash.
+	 */
+	cur_entry = data->hdr.cur_entry;
+	end_entry = data->hdr.end_entry;
+
+	while (data->hdr.cur_entry <= data->hdr.end_entry) {
+		ghcb_set_sw_scratch(ghcb, (u64)__pa(data));
+
+		/* This will advance the shared buffer data points to. */
+		ret = sev_es_ghcb_hv_call(ghcb, true, &ctxt, SVM_VMGEXIT_PSC, 0, 0);
+
+		/*
+		 * Page State Change VMGEXIT can pass error code through
+		 * exit_info_2.
+		 */
+		if (WARN(ret || ghcb->save.sw_exit_info_2,
+			 "SEV-SNP: PSC failed ret=%d exit_info_2=%llx\n",
+			 ret, ghcb->save.sw_exit_info_2)) {
+			ret = 1;
+			goto out;
+		}
+
+		/* Verify that reserved bit is not set */
+		if (WARN(data->hdr.reserved, "Reserved bit is set in the PSC header\n")) {
+			ret = 1;
+			goto out;
+		}
+
+		/*
+		 * Sanity check that entry processing is not going backwards.
+		 * This will happen only if hypervisor is tricking us.
+		 */
+		if (WARN(data->hdr.end_entry > end_entry || cur_entry > data->hdr.cur_entry,
+"SEV-SNP:  PSC processing going backward, end_entry %d (got %d) cur_entry %d (got %d)\n",
+			 end_entry, data->hdr.end_entry, cur_entry, data->hdr.cur_entry)) {
+			ret = 1;
+			goto out;
+		}
+	}
+
+out:
+	__sev_put_ghcb(&state);
+	local_irq_restore(flags);
+
+	return ret;
+}
+
+static void __set_pages_state(struct snp_psc_desc *data, unsigned long vaddr,
+			      unsigned long vaddr_end, int op)
+{
+	struct psc_hdr *hdr;
+	struct psc_entry *e;
+	unsigned long pfn;
+	int i;
+
+	hdr = &data->hdr;
+	e = data->entries;
+
+	memset(data, 0, sizeof(*data));
+	i = 0;
+
+	while (vaddr < vaddr_end) {
+		if (is_vmalloc_addr((void *)vaddr))
+			pfn = vmalloc_to_pfn((void *)vaddr);
+		else
+			pfn = __pa(vaddr) >> PAGE_SHIFT;
+
+		e->gfn = pfn;
+		e->operation = op;
+		hdr->end_entry = i;
+
+		/*
+		 * Current SEV-SNP implementation doesn't keep track of the RMP
+		 * page size so use 4K for simplicity.
+		 */
+		e->pagesize = RMP_PG_SIZE_4K;
+
+		vaddr = vaddr + PAGE_SIZE;
+		e++;
+		i++;
+	}
+
+	if (vmgexit_psc(data))
+		sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_PSC);
+}
+
+static void set_pages_state(unsigned long vaddr, unsigned int npages, int op)
+{
+	unsigned long vaddr_end, next_vaddr;
+	struct snp_psc_desc *desc;
+
+	desc = kmalloc(sizeof(*desc), GFP_KERNEL_ACCOUNT);
+	if (!desc)
+		panic("SEV-SNP: failed to allocate memory for PSC descriptor\n");
+
+	vaddr = vaddr & PAGE_MASK;
+	vaddr_end = vaddr + (npages << PAGE_SHIFT);
+
+	while (vaddr < vaddr_end) {
+		/*
+		 * Calculate the last vaddr that can be fit in one
+		 * struct snp_psc_desc.
+		 */
+		next_vaddr = min_t(unsigned long, vaddr_end,
+				   (VMGEXIT_PSC_MAX_ENTRY * PAGE_SIZE) + vaddr);
+
+		__set_pages_state(desc, vaddr, next_vaddr, op);
+
+		vaddr = next_vaddr;
+	}
+
+	kfree(desc);
+}
+
+void snp_set_memory_shared(unsigned long vaddr, unsigned int npages)
+{
+	if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
+		return;
+
+	pvalidate_pages(vaddr, npages, false);
+
+	set_pages_state(vaddr, npages, SNP_PAGE_STATE_SHARED);
+}
+
+void snp_set_memory_private(unsigned long vaddr, unsigned int npages)
+{
+	if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
+		return;
+
+	set_pages_state(vaddr, npages, SNP_PAGE_STATE_PRIVATE);
+
+	pvalidate_pages(vaddr, npages, true);
+}
+
 int sev_es_setup_ap_jump_table(struct real_mode_header *rmh)
 {
 	u16 startup_cs, startup_ip;
diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
index b4072115c8ef..1bc15b9d15f3 100644
--- a/arch/x86/mm/pat/set_memory.c
+++ b/arch/x86/mm/pat/set_memory.c
@@ -32,6 +32,7 @@
 #include <asm/set_memory.h>
 #include <asm/hyperv-tlfs.h>
 #include <asm/mshyperv.h>
+#include <asm/sev.h>
 
 #include "../mm_internal.h"
 
@@ -2012,8 +2013,22 @@ static int __set_memory_enc_pgtable(unsigned long addr, int numpages, bool enc)
 	 */
 	cpa_flush(&cpa, !this_cpu_has(X86_FEATURE_SME_COHERENT));
 
+	/*
+	 * To maintain the security guarantees of SEV-SNP guest invalidate the
+	 * memory before clearing the encryption attribute.
+	 */
+	if (!enc)
+		snp_set_memory_shared(addr, numpages);
+
 	ret = __change_page_attr_set_clr(&cpa, 1);
 
+	/*
+	 * Now that memory is mapped encrypted in the page table, validate it
+	 * so that is consistent with the above page state.
+	 */
+	if (!ret && enc)
+		snp_set_memory_private(addr, numpages);
+
 	/*
 	 * After changing the encryption attribute, we need to flush TLBs again
 	 * in case any speculative TLB caching occurred (but no need to flush
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v9 20/43] x86/sev: Use SEV-SNP AP creation to start secondary CPUs
  2022-01-28 17:17 [PATCH v9 00/43] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (18 preceding siblings ...)
  2022-01-28 17:17 ` [PATCH v9 19/43] x86/mm: Add support to validate memory when changing C-bit Brijesh Singh
@ 2022-01-28 17:17 ` Brijesh Singh
  2022-02-03  6:50   ` Borislav Petkov
  2022-01-28 17:17 ` [PATCH v9 21/43] x86/head/64: Re-enable stack protection Brijesh Singh
                   ` (22 subsequent siblings)
  42 siblings, 1 reply; 115+ messages in thread
From: Brijesh Singh @ 2022-01-28 17:17 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

From: Tom Lendacky <thomas.lendacky@amd.com>

To provide a more secure way to start APs under SEV-SNP, use the SEV-SNP
AP Creation NAE event. This allows for guest control over the AP register
state rather than trusting the hypervisor with the SEV-ES Jump Table
address.

During native_smp_prepare_cpus(), invoke an SEV-SNP function that, if
SEV-SNP is active, will set/override apic->wakeup_secondary_cpu. This
will allow the SEV-SNP AP Creation NAE event method to be used to boot
the APs. As a result of installing the override when SEV-SNP is active,
this method of starting the APs becomes the required method. The override
function will fail to start the AP if the hypervisor does not have
support for AP creation.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/include/asm/sev-common.h |   1 +
 arch/x86/include/asm/sev.h        |   4 +
 arch/x86/include/uapi/asm/svm.h   |   5 +
 arch/x86/kernel/sev.c             | 250 ++++++++++++++++++++++++++++++
 arch/x86/kernel/smpboot.c         |   3 +
 5 files changed, 263 insertions(+)

diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h
index 956b8b49528a..6d90f7c688b1 100644
--- a/arch/x86/include/asm/sev-common.h
+++ b/arch/x86/include/asm/sev-common.h
@@ -104,6 +104,7 @@ enum psc_op {
 	(((u64)(v) & GENMASK_ULL(63, 12)) >> 12)
 
 #define GHCB_HV_FT_SNP			BIT_ULL(0)
+#define GHCB_HV_FT_SNP_AP_CREATION	BIT_ULL(1)
 
 /* SNP Page State Change NAE event */
 #define VMGEXIT_PSC_MAX_ENTRY		253
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index feeb93e6ec97..a3203b2caaca 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -66,6 +66,8 @@ extern bool handle_vc_boot_ghcb(struct pt_regs *regs);
 /* RMP page size */
 #define RMP_PG_SIZE_4K			0
 
+#define RMPADJUST_VMSA_PAGE_BIT		BIT(16)
+
 #ifdef CONFIG_AMD_MEM_ENCRYPT
 extern struct static_key_false sev_es_enable_key;
 extern void __sev_es_ist_enter(struct pt_regs *regs);
@@ -130,6 +132,7 @@ void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr
 void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op);
 void snp_set_memory_shared(unsigned long vaddr, unsigned int npages);
 void snp_set_memory_private(unsigned long vaddr, unsigned int npages);
+void snp_set_wakeup_secondary_cpu(void);
 #else
 static inline void sev_es_ist_enter(struct pt_regs *regs) { }
 static inline void sev_es_ist_exit(void) { }
@@ -146,6 +149,7 @@ early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr, unsigned i
 static inline void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op) { }
 static inline void snp_set_memory_shared(unsigned long vaddr, unsigned int npages) { }
 static inline void snp_set_memory_private(unsigned long vaddr, unsigned int npages) { }
+static inline void snp_set_wakeup_secondary_cpu(void) { }
 #endif
 
 #endif
diff --git a/arch/x86/include/uapi/asm/svm.h b/arch/x86/include/uapi/asm/svm.h
index 0dcdb6e0c913..8b4c57baec52 100644
--- a/arch/x86/include/uapi/asm/svm.h
+++ b/arch/x86/include/uapi/asm/svm.h
@@ -109,6 +109,10 @@
 #define SVM_VMGEXIT_SET_AP_JUMP_TABLE		0
 #define SVM_VMGEXIT_GET_AP_JUMP_TABLE		1
 #define SVM_VMGEXIT_PSC				0x80000010
+#define SVM_VMGEXIT_AP_CREATION			0x80000013
+#define SVM_VMGEXIT_AP_CREATE_ON_INIT		0
+#define SVM_VMGEXIT_AP_CREATE			1
+#define SVM_VMGEXIT_AP_DESTROY			2
 #define SVM_VMGEXIT_HV_FEATURES			0x8000fffd
 #define SVM_VMGEXIT_UNSUPPORTED_EVENT		0x8000ffff
 
@@ -221,6 +225,7 @@
 	{ SVM_VMGEXIT_AP_HLT_LOOP,	"vmgexit_ap_hlt_loop" }, \
 	{ SVM_VMGEXIT_AP_JUMP_TABLE,	"vmgexit_ap_jump_table" }, \
 	{ SVM_VMGEXIT_PSC,	"vmgexit_page_state_change" }, \
+	{ SVM_VMGEXIT_AP_CREATION,	"vmgexit_ap_creation" }, \
 	{ SVM_VMGEXIT_HV_FEATURES,	"vmgexit_hypervisor_feature" }, \
 	{ SVM_EXIT_ERR,         "invalid_guest_state" }
 
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index b7ae741a8c66..19a8ffcbd83b 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -18,6 +18,7 @@
 #include <linux/memblock.h>
 #include <linux/kernel.h>
 #include <linux/mm.h>
+#include <linux/cpumask.h>
 
 #include <asm/cpu_entry_area.h>
 #include <asm/stacktrace.h>
@@ -31,9 +32,26 @@
 #include <asm/svm.h>
 #include <asm/smp.h>
 #include <asm/cpu.h>
+#include <asm/apic.h>
 
 #define DR7_RESET_VALUE        0x400
 
+/* AP INIT values as documented in the APM2  section "Processor Initialization State" */
+#define AP_INIT_CS_LIMIT		0xffff
+#define AP_INIT_DS_LIMIT		0xffff
+#define AP_INIT_LDTR_LIMIT		0xffff
+#define AP_INIT_GDTR_LIMIT		0xffff
+#define AP_INIT_IDTR_LIMIT		0xffff
+#define AP_INIT_TR_LIMIT		0xffff
+#define AP_INIT_RFLAGS_DEFAULT		0x2
+#define AP_INIT_DR6_DEFAULT		0xffff0ff0
+#define AP_INIT_GPAT_DEFAULT		0x0007040600070406ULL
+#define AP_INIT_XCR0_DEFAULT		0x1
+#define AP_INIT_X87_FTW_DEFAULT		0x5555
+#define AP_INIT_X87_FCW_DEFAULT		0x0040
+#define AP_INIT_CR0_DEFAULT		0x60000010
+#define AP_INIT_MXCSR_DEFAULT		0x1f80
+
 /* For early boot hypervisor communication in SEV-ES enabled guests */
 static struct ghcb boot_ghcb_page __bss_decrypted __aligned(PAGE_SIZE);
 
@@ -90,6 +108,8 @@ struct ghcb_state {
 static DEFINE_PER_CPU(struct sev_es_runtime_data*, runtime_data);
 DEFINE_STATIC_KEY_FALSE(sev_es_enable_key);
 
+static DEFINE_PER_CPU(struct sev_es_save_area *, sev_vmsa);
+
 static __always_inline bool on_vc_stack(struct pt_regs *regs)
 {
 	unsigned long sp = regs->sp;
@@ -822,6 +842,236 @@ void snp_set_memory_private(unsigned long vaddr, unsigned int npages)
 	pvalidate_pages(vaddr, npages, true);
 }
 
+static int snp_set_vmsa(void *va, bool vmsa)
+{
+	u64 attrs;
+
+	/*
+	 * Running at VMPL0 allows the kernel to change the VMSA bit for a page
+	 * using the RMPADJUST instruction. However, for the instruction to
+	 * succeed it must target the permissions of a lesser privileged
+	 * VMPL level, so use VMPL1 (refer to the RMPADJUST instruction in the
+	 * AMD64 APM Volume 3).
+	 */
+	attrs = 1;
+	if (vmsa)
+		attrs |= RMPADJUST_VMSA_PAGE_BIT;
+
+	return rmpadjust((unsigned long)va, RMP_PG_SIZE_4K, attrs);
+}
+
+#define __ATTR_BASE		(SVM_SELECTOR_P_MASK | SVM_SELECTOR_S_MASK)
+#define INIT_CS_ATTRIBS		(__ATTR_BASE | SVM_SELECTOR_READ_MASK | SVM_SELECTOR_CODE_MASK)
+#define INIT_DS_ATTRIBS		(__ATTR_BASE | SVM_SELECTOR_WRITE_MASK)
+
+#define INIT_LDTR_ATTRIBS	(SVM_SELECTOR_P_MASK | 2)
+#define INIT_TR_ATTRIBS		(SVM_SELECTOR_P_MASK | 3)
+
+static void *snp_alloc_vmsa_page(void)
+{
+	unsigned long pfn;
+	struct page *p;
+
+	/*
+	 * Allocate vmsa page to workaround the SEV-SNP erratum where the CPU
+	 * will incorrectly signal an RMP violation #PF if a large page (2MB
+	 * or 1GB) collides with the RMP entry of VMSA page. The recommended
+	 * workaround is to not use the large page.
+	 *
+	 * Allocate one extra page, use a page which is not 2MB-aligned
+	 * and free the other.
+	 */
+	p = alloc_pages(GFP_KERNEL_ACCOUNT | __GFP_ZERO, 1);
+	if (!p)
+		return NULL;
+
+	split_page(p, 1);
+
+	pfn = page_to_pfn(p);
+	if (IS_ALIGNED(__pfn_to_phys(pfn), PMD_SIZE)) {
+		pfn++;
+		__free_page(p);
+	} else {
+		__free_page(pfn_to_page(pfn + 1));
+	}
+
+	return page_address(pfn_to_page(pfn));
+}
+
+static void snp_cleanup_vmsa(struct sev_es_save_area *vmsa)
+{
+	int err;
+
+	err = snp_set_vmsa(vmsa, false);
+	if (err)
+		pr_err("clear VMSA page failed (%u), leaking page\n", err);
+	else
+		free_page((unsigned long)vmsa);
+}
+
+static int wakeup_cpu_via_vmgexit(int apic_id, unsigned long start_ip)
+{
+	struct sev_es_save_area *cur_vmsa, *vmsa;
+	struct ghcb_state state;
+	unsigned long flags;
+	struct ghcb *ghcb;
+	u8 sipi_vector;
+	int cpu, ret;
+	u64 cr4;
+
+	/*
+	 * SNP-SNP AP creation requires that the hypervisor must support SEV-SNP
+	 * feature. The SEV-SNP feature check is already performed, so just check
+	 * for the AP_CREATION feature flag.
+	 */
+	if (!(sev_hv_features & GHCB_HV_FT_SNP_AP_CREATION))
+		return -EOPNOTSUPP;
+
+	/*
+	 * Verify the desired start IP against the known trampoline start IP
+	 * to catch any future new trampolines that may be introduced that
+	 * would require a new protected guest entry point.
+	 */
+	if (WARN_ONCE(start_ip != real_mode_header->trampoline_start,
+		      "Unsupported SEV-SNP start_ip: %lx\n", start_ip))
+		return -EINVAL;
+
+	/* Override start_ip with known protected guest start IP */
+	start_ip = real_mode_header->sev_es_trampoline_start;
+
+	/* Find the logical CPU for the APIC ID */
+	for_each_present_cpu(cpu) {
+		if (arch_match_cpu_phys_id(cpu, apic_id))
+			break;
+	}
+	if (cpu >= nr_cpu_ids)
+		return -EINVAL;
+
+	cur_vmsa = per_cpu(sev_vmsa, cpu);
+
+	/*
+	 * A new VMSA is created each time because there is no guarantee that
+	 * the current VMSA is the kernels or that the vCPU is not running. If
+	 * an attempt was done to use the current VMSA with a running vCPU, a
+	 * #VMEXIT of that vCPU would wipe out all of the settings being done
+	 * here.
+	 */
+	vmsa = (struct sev_es_save_area *)snp_alloc_vmsa_page();
+	if (!vmsa)
+		return -ENOMEM;
+
+	/* CR4 should maintain the MCE value */
+	cr4 = native_read_cr4() & X86_CR4_MCE;
+
+	/* Set the CS value based on the start_ip converted to a SIPI vector */
+	sipi_vector		= (start_ip >> 12);
+	vmsa->cs.base		= sipi_vector << 12;
+	vmsa->cs.limit		= AP_INIT_CS_LIMIT;
+	vmsa->cs.attrib		= INIT_CS_ATTRIBS;
+	vmsa->cs.selector	= sipi_vector << 8;
+
+	/* Set the RIP value based on start_ip */
+	vmsa->rip		= start_ip & 0xfff;
+
+	/* Set AP INIT defaults as documented in the APM */
+	vmsa->ds.limit		= AP_INIT_DS_LIMIT;
+	vmsa->ds.attrib		= INIT_DS_ATTRIBS;
+	vmsa->es		= vmsa->ds;
+	vmsa->fs		= vmsa->ds;
+	vmsa->gs		= vmsa->ds;
+	vmsa->ss		= vmsa->ds;
+
+	vmsa->gdtr.limit	= AP_INIT_GDTR_LIMIT;
+	vmsa->ldtr.limit	= AP_INIT_LDTR_LIMIT;
+	vmsa->ldtr.attrib	= INIT_LDTR_ATTRIBS;
+	vmsa->idtr.limit	= AP_INIT_IDTR_LIMIT;
+	vmsa->tr.limit		= AP_INIT_TR_LIMIT;
+	vmsa->tr.attrib		= INIT_TR_ATTRIBS;
+
+	vmsa->cr4		= cr4;
+	vmsa->cr0		= AP_INIT_CR0_DEFAULT;
+	vmsa->dr7		= DR7_RESET_VALUE;
+	vmsa->dr6		= AP_INIT_DR6_DEFAULT;
+	vmsa->rflags		= AP_INIT_RFLAGS_DEFAULT;
+	vmsa->g_pat		= AP_INIT_GPAT_DEFAULT;
+	vmsa->xcr0		= AP_INIT_XCR0_DEFAULT;
+	vmsa->mxcsr		= AP_INIT_MXCSR_DEFAULT;
+	vmsa->x87_ftw		= AP_INIT_X87_FTW_DEFAULT;
+	vmsa->x87_fcw		= AP_INIT_X87_FCW_DEFAULT;
+
+	/* SVME must be set. */
+	vmsa->efer		= EFER_SVME;
+
+	/*
+	 * Set the SNP-specific fields for this VMSA:
+	 *   VMPL level
+	 *   SEV_FEATURES (matches the SEV STATUS MSR right shifted 2 bits)
+	 */
+	vmsa->vmpl		= 0;
+	vmsa->sev_features	= sev_status >> 2;
+
+	/* Switch the page over to a VMSA page now that it is initialized */
+	ret = snp_set_vmsa(vmsa, true);
+	if (ret) {
+		pr_err("set VMSA page failed (%u)\n", ret);
+		free_page((unsigned long)vmsa);
+
+		return -EINVAL;
+	}
+
+	/* Issue VMGEXIT AP Creation NAE event */
+	local_irq_save(flags);
+
+	ghcb = __sev_get_ghcb(&state);
+
+	vc_ghcb_invalidate(ghcb);
+	ghcb_set_rax(ghcb, vmsa->sev_features);
+	ghcb_set_sw_exit_code(ghcb, SVM_VMGEXIT_AP_CREATION);
+	ghcb_set_sw_exit_info_1(ghcb, ((u64)apic_id << 32) | SVM_VMGEXIT_AP_CREATE);
+	ghcb_set_sw_exit_info_2(ghcb, __pa(vmsa));
+
+	sev_es_wr_ghcb_msr(__pa(ghcb));
+	VMGEXIT();
+
+	if (!ghcb_sw_exit_info_1_is_valid(ghcb) ||
+	    lower_32_bits(ghcb->save.sw_exit_info_1)) {
+		pr_err("SNP AP Creation error\n");
+		ret = -EINVAL;
+	}
+
+	__sev_put_ghcb(&state);
+
+	local_irq_restore(flags);
+
+	/* Perform cleanup if there was an error */
+	if (ret) {
+		snp_cleanup_vmsa(vmsa);
+		vmsa = NULL;
+	}
+
+	/* Free up any previous VMSA page */
+	if (cur_vmsa)
+		snp_cleanup_vmsa(cur_vmsa);
+
+	/* Record the current VMSA page */
+	per_cpu(sev_vmsa, cpu) = vmsa;
+
+	return ret;
+}
+
+void snp_set_wakeup_secondary_cpu(void)
+{
+	if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
+		return;
+
+	/*
+	 * Always set this override if SEV-SNP is enabled. This makes it the
+	 * required method to start APs under SEV-SNP. If the hypervisor does
+	 * not support AP creation, then no APs will be started.
+	 */
+	apic->wakeup_secondary_cpu = wakeup_cpu_via_vmgexit;
+}
+
 int sev_es_setup_ap_jump_table(struct real_mode_header *rmh)
 {
 	u16 startup_cs, startup_ip;
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 617012f4619f..ad23d53b39ac 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -82,6 +82,7 @@
 #include <asm/spec-ctrl.h>
 #include <asm/hw_irq.h>
 #include <asm/stackprotector.h>
+#include <asm/sev.h>
 
 #ifdef CONFIG_ACPI_CPPC_LIB
 #include <acpi/cppc_acpi.h>
@@ -1436,6 +1437,8 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus)
 	smp_quirk_init_udelay();
 
 	speculative_store_bypass_ht_init();
+
+	snp_set_wakeup_secondary_cpu();
 }
 
 void arch_thaw_secondary_cpus_begin(void)
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v9 21/43] x86/head/64: Re-enable stack protection
  2022-01-28 17:17 [PATCH v9 00/43] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (19 preceding siblings ...)
  2022-01-28 17:17 ` [PATCH v9 20/43] x86/sev: Use SEV-SNP AP creation to start secondary CPUs Brijesh Singh
@ 2022-01-28 17:17 ` Brijesh Singh
  2022-01-28 17:17 ` [PATCH v9 22/43] x86/sev: Move MSR-based VMGEXITs for CPUID to helper Brijesh Singh
                   ` (21 subsequent siblings)
  42 siblings, 0 replies; 115+ messages in thread
From: Brijesh Singh @ 2022-01-28 17:17 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

From: Michael Roth <michael.roth@amd.com>

Due to the following commit:

  103a4908ad4d ("x86/head/64: Disable stack protection for head$(BITS).o")

kernel/head{32,64}.c are compiled with -fno-stack-protector to allow a
call to set_bringup_idt_handler(), which would otherwise have stack
protection enabled with CONFIG_STACKPROTECTOR_STRONG. While sufficient
for that case, there may still be issues with calls to any external
functions that were compiled with stack protection enabled that in-turn
make stack-protected calls, or if the exception handlers set up by
set_bringup_idt_handler() make calls to stack-protected functions.

Subsequent patches for SEV-SNP CPUID validation support will introduce
both such cases. Attempting to disable stack protection for everything
in scope to address that is prohibitive since much of the code, like
SEV-ES #VC handler, is shared code that remains in use after boot and
could benefit from having stack protection enabled. Attempting to inline
calls is brittle and can quickly balloon out to library/helper code
where that's not really an option.

Instead, re-enable stack protection for head32.c/head64.c and make the
appropriate changes to ensure the segment used for the stack canary is
initialized in advance of any stack-protected C calls.

for head64.c:

- The BSP will enter from startup_64() and call into C code
  (startup_64_setup_env()) shortly after setting up the stack, which
  may result in calls to stack-protected code. Set up %gs early to allow
  for this safely.
- APs will enter from secondary_startup_64*(), and %gs will be set up
  soon after. There is one call to C code prior to %gs being setup
  (__startup_secondary_64()), but it is only to fetch 'sme_me_mask'
  global, so just load 'sme_me_mask' directly instead, and remove the
  now-unused __startup_secondary_64() function.

for head32.c:

- BSPs/APs will set %fs to __BOOT_DS prior to any C calls. In recent
  kernels, the compiler is configured to access the stack canary at
  %fs:__stack_chk_guard [1], which overlaps with the initial per-cpu
  '__stack_chk_guard' variable in the initial/"master" .data..percpu
  area. This is sufficient to allow access to the canary for use
  during initial startup, so no changes are needed there.

[1] 3fb0fdb3bbe7 ("x86/stackprotector/32: Make the canary into a regular percpu variable")

Suggested-by: Joerg Roedel <jroedel@suse.de> #for 64-bit %gs set up
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/include/asm/setup.h |  1 -
 arch/x86/kernel/Makefile     |  1 -
 arch/x86/kernel/head64.c     |  9 ---------
 arch/x86/kernel/head_64.S    | 24 +++++++++++++++++++++---
 4 files changed, 21 insertions(+), 14 deletions(-)

diff --git a/arch/x86/include/asm/setup.h b/arch/x86/include/asm/setup.h
index a12458a7a8d4..72ede9159951 100644
--- a/arch/x86/include/asm/setup.h
+++ b/arch/x86/include/asm/setup.h
@@ -49,7 +49,6 @@ extern unsigned long saved_video_mode;
 extern void reserve_standard_io_resources(void);
 extern void i386_reserve_resources(void);
 extern unsigned long __startup_64(unsigned long physaddr, struct boot_params *bp);
-extern unsigned long __startup_secondary_64(void);
 extern void startup_64_setup_env(unsigned long physbase);
 extern void early_setup_idt(void);
 extern void __init do_early_exception(struct pt_regs *regs, int trapnr);
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 6aef9ee28a39..bd45e5ee6fe3 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -48,7 +48,6 @@ endif
 # non-deterministic coverage.
 KCOV_INSTRUMENT		:= n
 
-CFLAGS_head$(BITS).o	+= -fno-stack-protector
 CFLAGS_cc_platform.o	+= -fno-stack-protector
 
 CFLAGS_irq.o := -I $(srctree)/$(src)/../include/asm/trace
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 1239bc104cda..c80952dded32 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -319,15 +319,6 @@ unsigned long __head __startup_64(unsigned long physaddr,
 	return sme_postprocess_startup(bp, pmd);
 }
 
-unsigned long __startup_secondary_64(void)
-{
-	/*
-	 * Return the SME encryption mask (if SME is active) to be used as a
-	 * modifier for the initial pgdir entry programmed into CR3.
-	 */
-	return sme_get_me_mask();
-}
-
 /* Wipe all early page tables except for the kernel symbol map */
 static void __init reset_early_page_tables(void)
 {
diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index 9c2c3aff5ee4..9e84263bcb94 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -65,6 +65,22 @@ SYM_CODE_START_NOALIGN(startup_64)
 	leaq	(__end_init_task - FRAME_SIZE)(%rip), %rsp
 
 	leaq	_text(%rip), %rdi
+
+	/*
+	 * initial_gs points to initial fixed_percpu_data struct with storage for
+	 * the stack protector canary. Global pointer fixups are needed at this
+	 * stage, so apply them as is done in fixup_pointer(), and initialize %gs
+	 * such that the canary can be accessed at %gs:40 for subsequent C calls.
+	 */
+	movl	$MSR_GS_BASE, %ecx
+	movq	initial_gs(%rip), %rax
+	movq	$_text, %rdx
+	subq	%rdx, %rax
+	addq	%rdi, %rax
+	movq	%rax, %rdx
+	shrq	$32,  %rdx
+	wrmsr
+
 	pushq	%rsi
 	call	startup_64_setup_env
 	popq	%rsi
@@ -145,9 +161,11 @@ SYM_INNER_LABEL(secondary_startup_64_no_verify, SYM_L_GLOBAL)
 	 * Retrieve the modifier (SME encryption mask if SME is active) to be
 	 * added to the initial pgdir entry that will be programmed into CR3.
 	 */
-	pushq	%rsi
-	call	__startup_secondary_64
-	popq	%rsi
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+	movq	sme_me_mask, %rax
+#else
+	xorq	%rax, %rax
+#endif
 
 	/* Form the CR3 value being sure to include the CR3 modifier */
 	addq	$(init_top_pgt - __START_KERNEL_map), %rax
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v9 22/43] x86/sev: Move MSR-based VMGEXITs for CPUID to helper
  2022-01-28 17:17 [PATCH v9 00/43] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (20 preceding siblings ...)
  2022-01-28 17:17 ` [PATCH v9 21/43] x86/head/64: Re-enable stack protection Brijesh Singh
@ 2022-01-28 17:17 ` Brijesh Singh
  2022-02-03 13:59   ` Borislav Petkov
  2022-01-28 17:17 ` [PATCH v9 23/43] KVM: x86: Move lookup of indexed CPUID leafs " Brijesh Singh
                   ` (20 subsequent siblings)
  42 siblings, 1 reply; 115+ messages in thread
From: Brijesh Singh @ 2022-01-28 17:17 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

From: Michael Roth <michael.roth@amd.com>

This code will also be used later for SEV-SNP-validated CPUID code in
some cases, so move it to a common helper.

Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/kernel/sev-shared.c | 62 +++++++++++++++++++++---------------
 1 file changed, 36 insertions(+), 26 deletions(-)

diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c
index 3aaef1a18ffe..633f1f93b6e1 100644
--- a/arch/x86/kernel/sev-shared.c
+++ b/arch/x86/kernel/sev-shared.c
@@ -194,6 +194,36 @@ enum es_result sev_es_ghcb_hv_call(struct ghcb *ghcb, bool set_ghcb_msr,
 	return verify_exception_info(ghcb, ctxt);
 }
 
+static int __sev_cpuid_hv(u32 func, int reg_idx, u32 *reg)
+{
+	u64 val;
+
+	if (!reg)
+		return 0;
+
+	sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(func, reg_idx));
+	VMGEXIT();
+	val = sev_es_rd_ghcb_msr();
+	if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
+		return -EIO;
+
+	*reg = (val >> 32);
+
+	return 0;
+}
+
+static int sev_cpuid_hv(u32 func, u32 *eax, u32 *ebx, u32 *ecx, u32 *edx)
+{
+	int ret;
+
+	ret = __sev_cpuid_hv(func, GHCB_CPUID_REQ_EAX, eax);
+	ret = ret ? : __sev_cpuid_hv(func, GHCB_CPUID_REQ_EBX, ebx);
+	ret = ret ? : __sev_cpuid_hv(func, GHCB_CPUID_REQ_ECX, ecx);
+	ret = ret ? : __sev_cpuid_hv(func, GHCB_CPUID_REQ_EDX, edx);
+
+	return ret;
+}
+
 /*
  * Boot VC Handler - This is the first VC handler during boot, there is no GHCB
  * page yet, so it only supports the MSR based communication with the
@@ -202,39 +232,19 @@ enum es_result sev_es_ghcb_hv_call(struct ghcb *ghcb, bool set_ghcb_msr,
 void __init do_vc_no_ghcb(struct pt_regs *regs, unsigned long exit_code)
 {
 	unsigned int fn = lower_bits(regs->ax, 32);
-	unsigned long val;
+	u32 eax, ebx, ecx, edx;
 
 	/* Only CPUID is supported via MSR protocol */
 	if (exit_code != SVM_EXIT_CPUID)
 		goto fail;
 
-	sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(fn, GHCB_CPUID_REQ_EAX));
-	VMGEXIT();
-	val = sev_es_rd_ghcb_msr();
-	if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
+	if (sev_cpuid_hv(fn, &eax, &ebx, &ecx, &edx))
 		goto fail;
-	regs->ax = val >> 32;
 
-	sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(fn, GHCB_CPUID_REQ_EBX));
-	VMGEXIT();
-	val = sev_es_rd_ghcb_msr();
-	if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
-		goto fail;
-	regs->bx = val >> 32;
-
-	sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(fn, GHCB_CPUID_REQ_ECX));
-	VMGEXIT();
-	val = sev_es_rd_ghcb_msr();
-	if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
-		goto fail;
-	regs->cx = val >> 32;
-
-	sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(fn, GHCB_CPUID_REQ_EDX));
-	VMGEXIT();
-	val = sev_es_rd_ghcb_msr();
-	if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
-		goto fail;
-	regs->dx = val >> 32;
+	regs->ax = eax;
+	regs->bx = ebx;
+	regs->cx = ecx;
+	regs->dx = edx;
 
 	/*
 	 * This is a VC handler and the #VC is only raised when SEV-ES is
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v9 23/43] KVM: x86: Move lookup of indexed CPUID leafs to helper
  2022-01-28 17:17 [PATCH v9 00/43] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (21 preceding siblings ...)
  2022-01-28 17:17 ` [PATCH v9 22/43] x86/sev: Move MSR-based VMGEXITs for CPUID to helper Brijesh Singh
@ 2022-01-28 17:17 ` Brijesh Singh
  2022-02-03 15:16   ` Borislav Petkov
  2022-01-28 17:17 ` [PATCH v9 24/43] x86/compressed/acpi: Move EFI detection " Brijesh Singh
                   ` (19 subsequent siblings)
  42 siblings, 1 reply; 115+ messages in thread
From: Brijesh Singh @ 2022-01-28 17:17 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Venu Busireddy, Brijesh Singh

From: Michael Roth <michael.roth@amd.com>

Determining which CPUID leafs have significant ECX/index values is
also needed by guest kernel code when doing SEV-SNP-validated CPUID
lookups. Move this to common code to keep future updates in sync.

Reviewed-by: Venu Busireddy <venu.busireddy@oracle.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/include/asm/cpuid.h | 38 ++++++++++++++++++++++++++++++++++++
 arch/x86/kvm/cpuid.c         | 19 ++----------------
 2 files changed, 40 insertions(+), 17 deletions(-)
 create mode 100644 arch/x86/include/asm/cpuid.h

diff --git a/arch/x86/include/asm/cpuid.h b/arch/x86/include/asm/cpuid.h
new file mode 100644
index 000000000000..00408aded67c
--- /dev/null
+++ b/arch/x86/include/asm/cpuid.h
@@ -0,0 +1,38 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Kernel-based Virtual Machine driver for Linux cpuid support routines
+ *
+ * derived from arch/x86/kvm/x86.c
+ * derived from arch/x86/kvm/cpuid.c
+ *
+ * Copyright 2011 Red Hat, Inc. and/or its affiliates.
+ * Copyright IBM Corporation, 2008
+ */
+
+#ifndef _ASM_X86_CPUID_H
+#define _ASM_X86_CPUID_H
+
+static __always_inline bool cpuid_function_is_indexed(u32 function)
+{
+	switch (function) {
+	case 4:
+	case 7:
+	case 0xb:
+	case 0xd:
+	case 0xf:
+	case 0x10:
+	case 0x12:
+	case 0x14:
+	case 0x17:
+	case 0x18:
+	case 0x1d:
+	case 0x1e:
+	case 0x1f:
+	case 0x8000001d:
+		return true;
+	}
+
+	return false;
+}
+
+#endif /* _ASM_X86_CPUID_H */
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 3902c28fb6cb..3458dd3272a0 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -19,6 +19,7 @@
 #include <asm/user.h>
 #include <asm/fpu/xstate.h>
 #include <asm/sgx.h>
+#include <asm/cpuid.h>
 #include "cpuid.h"
 #include "lapic.h"
 #include "mmu.h"
@@ -699,24 +700,8 @@ static struct kvm_cpuid_entry2 *do_host_cpuid(struct kvm_cpuid_array *array,
 	cpuid_count(entry->function, entry->index,
 		    &entry->eax, &entry->ebx, &entry->ecx, &entry->edx);
 
-	switch (function) {
-	case 4:
-	case 7:
-	case 0xb:
-	case 0xd:
-	case 0xf:
-	case 0x10:
-	case 0x12:
-	case 0x14:
-	case 0x17:
-	case 0x18:
-	case 0x1d:
-	case 0x1e:
-	case 0x1f:
-	case 0x8000001d:
+	if (cpuid_function_is_indexed(function))
 		entry->flags |= KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
-		break;
-	}
 
 	return entry;
 }
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v9 24/43] x86/compressed/acpi: Move EFI detection to helper
  2022-01-28 17:17 [PATCH v9 00/43] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (22 preceding siblings ...)
  2022-01-28 17:17 ` [PATCH v9 23/43] KVM: x86: Move lookup of indexed CPUID leafs " Brijesh Singh
@ 2022-01-28 17:17 ` Brijesh Singh
  2022-02-03 14:39   ` Borislav Petkov
  2022-01-28 17:17 ` [PATCH v9 25/43] x86/compressed/acpi: Move EFI system table lookup " Brijesh Singh
                   ` (18 subsequent siblings)
  42 siblings, 1 reply; 115+ messages in thread
From: Brijesh Singh @ 2022-01-28 17:17 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

From: Michael Roth <michael.roth@amd.com>

Future patches for SEV-SNP-validated CPUID will also require early
parsing of the EFI configuration. Incrementally move the related
code into a set of helpers that can be re-used for that purpose.

Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/boot/compressed/Makefile |  1 +
 arch/x86/boot/compressed/acpi.c   | 28 +++++++----------
 arch/x86/boot/compressed/efi.c    | 50 +++++++++++++++++++++++++++++++
 arch/x86/boot/compressed/misc.h   | 16 ++++++++++
 4 files changed, 77 insertions(+), 18 deletions(-)
 create mode 100644 arch/x86/boot/compressed/efi.c

diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index 6115274fe10f..e69c3d2e0628 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -103,6 +103,7 @@ endif
 vmlinux-objs-$(CONFIG_ACPI) += $(obj)/acpi.o
 
 vmlinux-objs-$(CONFIG_EFI_MIXED) += $(obj)/efi_thunk_$(BITS).o
+vmlinux-objs-$(CONFIG_EFI) += $(obj)/efi.o
 efi-obj-$(CONFIG_EFI_STUB) = $(objtree)/drivers/firmware/efi/libstub/lib.a
 
 $(obj)/vmlinux: $(vmlinux-objs-y) $(efi-obj-y) FORCE
diff --git a/arch/x86/boot/compressed/acpi.c b/arch/x86/boot/compressed/acpi.c
index 8bcbcee54aa1..db6c561920f0 100644
--- a/arch/x86/boot/compressed/acpi.c
+++ b/arch/x86/boot/compressed/acpi.c
@@ -87,7 +87,7 @@ static acpi_physical_address kexec_get_rsdp_addr(void)
 	efi_system_table_64_t *systab;
 	struct efi_setup_data *esd;
 	struct efi_info *ei;
-	char *sig;
+	enum efi_type et;
 
 	esd = (struct efi_setup_data *)get_kexec_setup_data_addr();
 	if (!esd)
@@ -98,10 +98,9 @@ static acpi_physical_address kexec_get_rsdp_addr(void)
 		return 0;
 	}
 
-	ei = &boot_params->efi_info;
-	sig = (char *)&ei->efi_loader_signature;
-	if (strncmp(sig, EFI64_LOADER_SIGNATURE, 4)) {
-		debug_putstr("Wrong kexec EFI loader signature.\n");
+	et = efi_get_type(boot_params);
+	if (et != EFI_TYPE_64) {
+		debug_putstr("Unexpected kexec EFI environment (expected 64-bit EFI).\n");
 		return 0;
 	}
 
@@ -122,29 +121,22 @@ static acpi_physical_address efi_get_rsdp_addr(void)
 	unsigned long systab, config_tables;
 	unsigned int nr_tables;
 	struct efi_info *ei;
+	enum efi_type et;
 	bool efi_64;
-	char *sig;
-
-	ei = &boot_params->efi_info;
-	sig = (char *)&ei->efi_loader_signature;
 
-	if (!strncmp(sig, EFI64_LOADER_SIGNATURE, 4)) {
+	et = efi_get_type(boot_params);
+	if (et == EFI_TYPE_64)
 		efi_64 = true;
-	} else if (!strncmp(sig, EFI32_LOADER_SIGNATURE, 4)) {
+	else if (et == EFI_TYPE_32)
 		efi_64 = false;
-	} else {
-		debug_putstr("Wrong EFI loader signature.\n");
+	else
 		return 0;
-	}
 
 	/* Get systab from boot params. */
+	ei = &boot_params->efi_info;
 #ifdef CONFIG_X86_64
 	systab = ei->efi_systab | ((__u64)ei->efi_systab_hi << 32);
 #else
-	if (ei->efi_systab_hi || ei->efi_memmap_hi) {
-		debug_putstr("Error getting RSDP address: EFI system table located above 4GB.\n");
-		return 0;
-	}
 	systab = ei->efi_systab;
 #endif
 	if (!systab)
diff --git a/arch/x86/boot/compressed/efi.c b/arch/x86/boot/compressed/efi.c
new file mode 100644
index 000000000000..daa73efdc7a5
--- /dev/null
+++ b/arch/x86/boot/compressed/efi.c
@@ -0,0 +1,50 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Helpers for early access to EFI configuration table.
+ *
+ * Originally derived from arch/x86/boot/compressed/acpi.c
+ */
+
+#include "misc.h"
+#include <linux/efi.h>
+#include <asm/efi.h>
+
+/**
+ * efi_get_type - Given boot_params, determine the type of EFI environment.
+ *
+ * @boot_params:        pointer to boot_params
+ *
+ * Return: EFI_TYPE_{32,64} for valid EFI environments, EFI_TYPE_NONE otherwise.
+ */
+enum efi_type efi_get_type(struct boot_params *boot_params)
+{
+	struct efi_info *ei;
+	enum efi_type et;
+	const char *sig;
+
+	ei = &boot_params->efi_info;
+	sig = (char *)&ei->efi_loader_signature;
+
+	if (!strncmp(sig, EFI64_LOADER_SIGNATURE, 4)) {
+		et = EFI_TYPE_64;
+	} else if (!strncmp(sig, EFI32_LOADER_SIGNATURE, 4)) {
+		et = EFI_TYPE_32;
+	} else {
+		debug_putstr("No EFI environment detected.\n");
+		et = EFI_TYPE_NONE;
+	}
+
+#ifndef CONFIG_X86_64
+	/*
+	 * Existing callers like acpi.c treat this case as an indicator to
+	 * fall-through to non-EFI, rather than an error, so maintain that
+	 * functionality here as well.
+	 */
+	if (ei->efi_systab_hi || ei->efi_memmap_hi) {
+		debug_putstr("EFI system table is located above 4GB and cannot be accessed.\n");
+		et = EFI_TYPE_NONE;
+	}
+#endif
+
+	return et;
+}
diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
index 01cc13c12059..a26244c0fe01 100644
--- a/arch/x86/boot/compressed/misc.h
+++ b/arch/x86/boot/compressed/misc.h
@@ -176,4 +176,20 @@ void boot_stage2_vc(void);
 
 unsigned long sev_verify_cbit(unsigned long cr3);
 
+enum efi_type {
+	EFI_TYPE_64,
+	EFI_TYPE_32,
+	EFI_TYPE_NONE,
+};
+
+#ifdef CONFIG_EFI
+/* helpers for early EFI config table access */
+enum efi_type efi_get_type(struct boot_params *boot_params);
+#else
+static inline enum efi_type efi_get_type(struct boot_params *boot_params)
+{
+	return EFI_TYPE_NONE;
+}
+#endif /* CONFIG_EFI */
+
 #endif /* BOOT_COMPRESSED_MISC_H */
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v9 25/43] x86/compressed/acpi: Move EFI system table lookup to helper
  2022-01-28 17:17 [PATCH v9 00/43] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (23 preceding siblings ...)
  2022-01-28 17:17 ` [PATCH v9 24/43] x86/compressed/acpi: Move EFI detection " Brijesh Singh
@ 2022-01-28 17:17 ` Brijesh Singh
  2022-02-03 14:48   ` Borislav Petkov
  2022-01-28 17:17 ` [PATCH v9 26/43] x86/compressed/acpi: Move EFI config " Brijesh Singh
                   ` (17 subsequent siblings)
  42 siblings, 1 reply; 115+ messages in thread
From: Brijesh Singh @ 2022-01-28 17:17 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

From: Michael Roth <michael.roth@amd.com>

Future patches for SEV-SNP-validated CPUID will also require early
parsing of the EFI configuration. Incrementally move the related
code into a set of helpers that can be re-used for that purpose.

Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/boot/compressed/acpi.c | 21 +++++++--------------
 arch/x86/boot/compressed/efi.c  | 29 +++++++++++++++++++++++++++++
 arch/x86/boot/compressed/misc.h |  6 ++++++
 3 files changed, 42 insertions(+), 14 deletions(-)

diff --git a/arch/x86/boot/compressed/acpi.c b/arch/x86/boot/compressed/acpi.c
index db6c561920f0..58a3d3f3e305 100644
--- a/arch/x86/boot/compressed/acpi.c
+++ b/arch/x86/boot/compressed/acpi.c
@@ -105,7 +105,7 @@ static acpi_physical_address kexec_get_rsdp_addr(void)
 	}
 
 	/* Get systab from boot params. */
-	systab = (efi_system_table_64_t *) (ei->efi_systab | ((__u64)ei->efi_systab_hi << 32));
+	systab = (efi_system_table_64_t *)efi_get_system_table(boot_params);
 	if (!systab)
 		error("EFI system table not found in kexec boot_params.");
 
@@ -118,9 +118,8 @@ static acpi_physical_address kexec_get_rsdp_addr(void) { return 0; }
 static acpi_physical_address efi_get_rsdp_addr(void)
 {
 #ifdef CONFIG_EFI
-	unsigned long systab, config_tables;
+	unsigned long systab_pa, config_tables;
 	unsigned int nr_tables;
-	struct efi_info *ei;
 	enum efi_type et;
 	bool efi_64;
 
@@ -132,24 +131,18 @@ static acpi_physical_address efi_get_rsdp_addr(void)
 	else
 		return 0;
 
-	/* Get systab from boot params. */
-	ei = &boot_params->efi_info;
-#ifdef CONFIG_X86_64
-	systab = ei->efi_systab | ((__u64)ei->efi_systab_hi << 32);
-#else
-	systab = ei->efi_systab;
-#endif
-	if (!systab)
-		error("EFI system table not found.");
+	systab_pa = efi_get_system_table(boot_params);
+	if (!systab_pa)
+		error("EFI support advertised, but unable to locate system table.");
 
 	/* Handle EFI bitness properly */
 	if (efi_64) {
-		efi_system_table_64_t *stbl = (efi_system_table_64_t *)systab;
+		efi_system_table_64_t *stbl = (efi_system_table_64_t *)systab_pa;
 
 		config_tables	= stbl->tables;
 		nr_tables	= stbl->nr_tables;
 	} else {
-		efi_system_table_32_t *stbl = (efi_system_table_32_t *)systab;
+		efi_system_table_32_t *stbl = (efi_system_table_32_t *)systab_pa;
 
 		config_tables	= stbl->tables;
 		nr_tables	= stbl->nr_tables;
diff --git a/arch/x86/boot/compressed/efi.c b/arch/x86/boot/compressed/efi.c
index daa73efdc7a5..bf99768cd229 100644
--- a/arch/x86/boot/compressed/efi.c
+++ b/arch/x86/boot/compressed/efi.c
@@ -48,3 +48,32 @@ enum efi_type efi_get_type(struct boot_params *boot_params)
 
 	return et;
 }
+
+/*
+ * efi_get_system_table - Given boot_params, retrieve the physical address of
+ *                        EFI system table.
+ *
+ * @boot_params:        pointer to boot_params
+ *
+ * Return: EFI system table address on success. On error, return 0.
+ */
+unsigned long efi_get_system_table(struct boot_params *boot_params)
+{
+	unsigned long sys_tbl_pa;
+	struct efi_info *ei;
+	enum efi_type et;
+
+	/* Get systab from boot params. */
+	ei = &boot_params->efi_info;
+#ifdef CONFIG_X86_64
+	sys_tbl_pa = ei->efi_systab | ((__u64)ei->efi_systab_hi << 32);
+#else
+	sys_tbl_pa = ei->efi_systab;
+#endif
+	if (!sys_tbl_pa) {
+		debug_putstr("EFI system table not found.");
+		return 0;
+	}
+
+	return sys_tbl_pa;
+}
diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
index a26244c0fe01..24522be8c21d 100644
--- a/arch/x86/boot/compressed/misc.h
+++ b/arch/x86/boot/compressed/misc.h
@@ -185,11 +185,17 @@ enum efi_type {
 #ifdef CONFIG_EFI
 /* helpers for early EFI config table access */
 enum efi_type efi_get_type(struct boot_params *boot_params);
+unsigned long efi_get_system_table(struct boot_params *boot_params);
 #else
 static inline enum efi_type efi_get_type(struct boot_params *boot_params)
 {
 	return EFI_TYPE_NONE;
 }
+
+static inline unsigned long efi_get_system_table(struct boot_params *boot_params)
+{
+	return 0;
+}
 #endif /* CONFIG_EFI */
 
 #endif /* BOOT_COMPRESSED_MISC_H */
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v9 26/43] x86/compressed/acpi: Move EFI config table lookup to helper
  2022-01-28 17:17 [PATCH v9 00/43] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (24 preceding siblings ...)
  2022-01-28 17:17 ` [PATCH v9 25/43] x86/compressed/acpi: Move EFI system table lookup " Brijesh Singh
@ 2022-01-28 17:17 ` Brijesh Singh
  2022-02-03 15:13   ` Borislav Petkov
  2022-01-28 17:17 ` [PATCH v9 27/43] x86/compressed/acpi: Move EFI vendor " Brijesh Singh
                   ` (16 subsequent siblings)
  42 siblings, 1 reply; 115+ messages in thread
From: Brijesh Singh @ 2022-01-28 17:17 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

From: Michael Roth <michael.roth@amd.com>

Future patches for SEV-SNP-validated CPUID will also require early
parsing of the EFI configuration. Incrementally move the related code
into a set of helpers that can be re-used for that purpose.

Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/boot/compressed/acpi.c | 25 ++++++------------
 arch/x86/boot/compressed/efi.c  | 45 +++++++++++++++++++++++++++++++++
 arch/x86/boot/compressed/misc.h |  9 +++++++
 3 files changed, 62 insertions(+), 17 deletions(-)

diff --git a/arch/x86/boot/compressed/acpi.c b/arch/x86/boot/compressed/acpi.c
index 58a3d3f3e305..9a824af69961 100644
--- a/arch/x86/boot/compressed/acpi.c
+++ b/arch/x86/boot/compressed/acpi.c
@@ -118,10 +118,13 @@ static acpi_physical_address kexec_get_rsdp_addr(void) { return 0; }
 static acpi_physical_address efi_get_rsdp_addr(void)
 {
 #ifdef CONFIG_EFI
-	unsigned long systab_pa, config_tables;
+	unsigned long cfg_tbl_pa = 0;
+	unsigned int cfg_tbl_len;
+	unsigned long systab_pa;
 	unsigned int nr_tables;
 	enum efi_type et;
 	bool efi_64;
+	int ret;
 
 	et = efi_get_type(boot_params);
 	if (et == EFI_TYPE_64)
@@ -135,23 +138,11 @@ static acpi_physical_address efi_get_rsdp_addr(void)
 	if (!systab_pa)
 		error("EFI support advertised, but unable to locate system table.");
 
-	/* Handle EFI bitness properly */
-	if (efi_64) {
-		efi_system_table_64_t *stbl = (efi_system_table_64_t *)systab_pa;
-
-		config_tables	= stbl->tables;
-		nr_tables	= stbl->nr_tables;
-	} else {
-		efi_system_table_32_t *stbl = (efi_system_table_32_t *)systab_pa;
-
-		config_tables	= stbl->tables;
-		nr_tables	= stbl->nr_tables;
-	}
-
-	if (!config_tables)
-		error("EFI config tables not found.");
+	ret = efi_get_conf_table(boot_params, &cfg_tbl_pa, &cfg_tbl_len);
+	if (ret || !cfg_tbl_pa)
+		error("EFI config table not found.");
 
-	return __efi_get_rsdp_addr(config_tables, nr_tables, efi_64);
+	return __efi_get_rsdp_addr(cfg_tbl_pa, cfg_tbl_len, efi_64);
 #else
 	return 0;
 #endif
diff --git a/arch/x86/boot/compressed/efi.c b/arch/x86/boot/compressed/efi.c
index bf99768cd229..feb00c6b4919 100644
--- a/arch/x86/boot/compressed/efi.c
+++ b/arch/x86/boot/compressed/efi.c
@@ -77,3 +77,48 @@ unsigned long efi_get_system_table(struct boot_params *boot_params)
 
 	return sys_tbl_pa;
 }
+
+/**
+ * efi_get_conf_table - Given boot_params, locate EFI system table from it and
+ *                      return the physical address of EFI configuration table.
+ *
+ * @boot_params:        pointer to boot_params
+ * @cfg_tbl_pa:         location to store physical address of config table
+ * @cfg_tbl_len:        location to store number of config table entries
+ *
+ * Return: 0 on success. On error, return params are left unchanged.
+ */
+int efi_get_conf_table(struct boot_params *boot_params, unsigned long *cfg_tbl_pa,
+		       unsigned int *cfg_tbl_len)
+{
+	unsigned long sys_tbl_pa = 0;
+	enum efi_type et;
+	int ret;
+
+	if (!cfg_tbl_pa || !cfg_tbl_len)
+		return -EINVAL;
+
+	sys_tbl_pa = efi_get_system_table(boot_params);
+	if (!sys_tbl_pa)
+		return -EINVAL;
+
+	/* Handle EFI bitness properly */
+	et = efi_get_type(boot_params);
+	if (et == EFI_TYPE_64) {
+		efi_system_table_64_t *stbl =
+			(efi_system_table_64_t *)sys_tbl_pa;
+
+		*cfg_tbl_pa = stbl->tables;
+		*cfg_tbl_len = stbl->nr_tables;
+	} else if (et == EFI_TYPE_32) {
+		efi_system_table_32_t *stbl =
+			(efi_system_table_32_t *)sys_tbl_pa;
+
+		*cfg_tbl_pa = stbl->tables;
+		*cfg_tbl_len = stbl->nr_tables;
+	} else {
+		return -EINVAL;
+	}
+
+	return 0;
+}
diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
index 24522be8c21d..162dbd7443eb 100644
--- a/arch/x86/boot/compressed/misc.h
+++ b/arch/x86/boot/compressed/misc.h
@@ -186,6 +186,8 @@ enum efi_type {
 /* helpers for early EFI config table access */
 enum efi_type efi_get_type(struct boot_params *boot_params);
 unsigned long efi_get_system_table(struct boot_params *boot_params);
+int efi_get_conf_table(struct boot_params *boot_params, unsigned long *cfg_tbl_pa,
+		       unsigned int *cfg_tbl_len);
 #else
 static inline enum efi_type efi_get_type(struct boot_params *boot_params)
 {
@@ -196,6 +198,13 @@ static inline unsigned long efi_get_system_table(struct boot_params *boot_params
 {
 	return 0;
 }
+
+static inline int efi_get_conf_table(struct boot_params *boot_params,
+				     unsigned long *cfg_tbl_pa,
+				     unsigned int *cfg_tbl_len)
+{
+	return -ENOENT;
+}
 #endif /* CONFIG_EFI */
 
 #endif /* BOOT_COMPRESSED_MISC_H */
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v9 27/43] x86/compressed/acpi: Move EFI vendor table lookup to helper
  2022-01-28 17:17 [PATCH v9 00/43] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (25 preceding siblings ...)
  2022-01-28 17:17 ` [PATCH v9 26/43] x86/compressed/acpi: Move EFI config " Brijesh Singh
@ 2022-01-28 17:17 ` Brijesh Singh
  2022-01-28 17:17 ` [PATCH v9 28/43] x86/compressed/acpi: Move EFI kexec handling into common code Brijesh Singh
                   ` (15 subsequent siblings)
  42 siblings, 0 replies; 115+ messages in thread
From: Brijesh Singh @ 2022-01-28 17:17 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

From: Michael Roth <michael.roth@amd.com>

Future patches for SEV-SNP-validated CPUID will also require early
parsing of the EFI configuration. Incrementally move the related code
into a set of helpers that can be re-used for that purpose.

Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/boot/compressed/acpi.c | 67 +++++++++++-------------------
 arch/x86/boot/compressed/efi.c  | 72 +++++++++++++++++++++++++++++++++
 arch/x86/boot/compressed/misc.h | 13 ++++++
 3 files changed, 108 insertions(+), 44 deletions(-)

diff --git a/arch/x86/boot/compressed/acpi.c b/arch/x86/boot/compressed/acpi.c
index 9a824af69961..d505335bcc25 100644
--- a/arch/x86/boot/compressed/acpi.c
+++ b/arch/x86/boot/compressed/acpi.c
@@ -20,48 +20,31 @@
  */
 struct mem_vector immovable_mem[MAX_NUMNODES*2];
 
-/*
- * Search EFI system tables for RSDP.  If both ACPI_20_TABLE_GUID and
- * ACPI_TABLE_GUID are found, take the former, which has more features.
- */
 static acpi_physical_address
-__efi_get_rsdp_addr(unsigned long config_tables, unsigned int nr_tables,
-		    bool efi_64)
+__efi_get_rsdp_addr(unsigned long cfg_tbl_pa, unsigned int cfg_tbl_len)
 {
-	acpi_physical_address rsdp_addr = 0;
-
 #ifdef CONFIG_EFI
-	int i;
-
-	/* Get EFI tables from systab. */
-	for (i = 0; i < nr_tables; i++) {
-		acpi_physical_address table;
-		efi_guid_t guid;
-
-		if (efi_64) {
-			efi_config_table_64_t *tbl = (efi_config_table_64_t *)config_tables + i;
-
-			guid  = tbl->guid;
-			table = tbl->table;
-
-			if (!IS_ENABLED(CONFIG_X86_64) && table >> 32) {
-				debug_putstr("Error getting RSDP address: EFI config table located above 4GB.\n");
-				return 0;
-			}
-		} else {
-			efi_config_table_32_t *tbl = (efi_config_table_32_t *)config_tables + i;
-
-			guid  = tbl->guid;
-			table = tbl->table;
-		}
+	unsigned long rsdp_addr;
+	int ret;
 
-		if (!(efi_guidcmp(guid, ACPI_TABLE_GUID)))
-			rsdp_addr = table;
-		else if (!(efi_guidcmp(guid, ACPI_20_TABLE_GUID)))
-			return table;
-	}
+	/*
+	 * Search EFI system tables for RSDP. Preferred is ACPI_20_TABLE_GUID to
+	 * ACPI_TABLE_GUID because it has more features.
+	 */
+	rsdp_addr = efi_find_vendor_table(boot_params, cfg_tbl_pa, cfg_tbl_len,
+					  ACPI_20_TABLE_GUID);
+	if (rsdp_addr)
+		return (acpi_physical_address)rsdp_addr;
+
+	/* No ACPI_20_TABLE_GUID found, fallback to ACPI_TABLE_GUID. */
+	rsdp_addr = efi_find_vendor_table(boot_params, cfg_tbl_pa, cfg_tbl_len,
+					  ACPI_TABLE_GUID);
+	if (rsdp_addr)
+		return (acpi_physical_address)rsdp_addr;
+
+	debug_putstr("Error getting RSDP address.\n");
 #endif
-	return rsdp_addr;
+	return 0;
 }
 
 /* EFI/kexec support is 64-bit only. */
@@ -109,7 +92,7 @@ static acpi_physical_address kexec_get_rsdp_addr(void)
 	if (!systab)
 		error("EFI system table not found in kexec boot_params.");
 
-	return __efi_get_rsdp_addr((unsigned long)esd->tables, systab->nr_tables, true);
+	return __efi_get_rsdp_addr((unsigned long)esd->tables, systab->nr_tables);
 }
 #else
 static acpi_physical_address kexec_get_rsdp_addr(void) { return 0; }
@@ -127,11 +110,7 @@ static acpi_physical_address efi_get_rsdp_addr(void)
 	int ret;
 
 	et = efi_get_type(boot_params);
-	if (et == EFI_TYPE_64)
-		efi_64 = true;
-	else if (et == EFI_TYPE_32)
-		efi_64 = false;
-	else
+	if (et == EFI_TYPE_NONE)
 		return 0;
 
 	systab_pa = efi_get_system_table(boot_params);
@@ -142,7 +121,7 @@ static acpi_physical_address efi_get_rsdp_addr(void)
 	if (ret || !cfg_tbl_pa)
 		error("EFI config table not found.");
 
-	return __efi_get_rsdp_addr(cfg_tbl_pa, cfg_tbl_len, efi_64);
+	return __efi_get_rsdp_addr(cfg_tbl_pa, cfg_tbl_len);
 #else
 	return 0;
 #endif
diff --git a/arch/x86/boot/compressed/efi.c b/arch/x86/boot/compressed/efi.c
index feb00c6b4919..6413a808b2e7 100644
--- a/arch/x86/boot/compressed/efi.c
+++ b/arch/x86/boot/compressed/efi.c
@@ -122,3 +122,75 @@ int efi_get_conf_table(struct boot_params *boot_params, unsigned long *cfg_tbl_p
 
 	return 0;
 }
+
+/* Get vendor table address/guid from EFI config table at the given index */
+static int get_vendor_table(void *cfg_tbl, unsigned int idx,
+			    unsigned long *vendor_tbl_pa,
+			    efi_guid_t *vendor_tbl_guid,
+			    enum efi_type et)
+{
+	if (et == EFI_TYPE_64) {
+		efi_config_table_64_t *tbl_entry =
+			(efi_config_table_64_t *)cfg_tbl + idx;
+
+		if (!IS_ENABLED(CONFIG_X86_64) && tbl_entry->table >> 32) {
+			debug_putstr("Error: EFI config table entry located above 4GB.\n");
+			return -EINVAL;
+		}
+
+		*vendor_tbl_pa = tbl_entry->table;
+		*vendor_tbl_guid = tbl_entry->guid;
+
+	} else if (et == EFI_TYPE_32) {
+		efi_config_table_32_t *tbl_entry =
+			(efi_config_table_32_t *)cfg_tbl + idx;
+
+		*vendor_tbl_pa = tbl_entry->table;
+		*vendor_tbl_guid = tbl_entry->guid;
+	} else {
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+/**
+ * efi_find_vendor_table - Given EFI config table, search it for the physical
+ *                         address of the vendor table associated with GUID.
+ *
+ * @boot_params:       pointer to boot_params
+ * @cfg_tbl_pa:        pointer to EFI configuration table
+ * @cfg_tbl_len:       number of entries in EFI configuration table
+ * @guid:              GUID of vendor table
+ *
+ * Return: vendor table address on success. On error, return 0.
+ */
+unsigned long efi_find_vendor_table(struct boot_params *boot_params,
+				    unsigned long cfg_tbl_pa,
+				    unsigned int cfg_tbl_len,
+				    efi_guid_t guid)
+{
+	enum efi_type et;
+	unsigned int i;
+
+	et = efi_get_type(boot_params);
+	if (et == EFI_TYPE_NONE)
+		return 0;
+
+	for (i = 0; i < cfg_tbl_len; i++) {
+		unsigned long vendor_tbl_pa;
+		efi_guid_t vendor_tbl_guid;
+		int ret;
+
+		ret = get_vendor_table((void *)cfg_tbl_pa, i,
+				       &vendor_tbl_pa,
+				       &vendor_tbl_guid, et);
+		if (ret)
+			return 0;
+
+		if (!efi_guidcmp(guid, vendor_tbl_guid))
+			return vendor_tbl_pa;
+	}
+
+	return 0;
+}
diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
index 162dbd7443eb..991b46170914 100644
--- a/arch/x86/boot/compressed/misc.h
+++ b/arch/x86/boot/compressed/misc.h
@@ -23,6 +23,7 @@
 #include <linux/screen_info.h>
 #include <linux/elf.h>
 #include <linux/io.h>
+#include <linux/efi.h>
 #include <asm/page.h>
 #include <asm/boot.h>
 #include <asm/bootparam.h>
@@ -188,6 +189,10 @@ enum efi_type efi_get_type(struct boot_params *boot_params);
 unsigned long efi_get_system_table(struct boot_params *boot_params);
 int efi_get_conf_table(struct boot_params *boot_params, unsigned long *cfg_tbl_pa,
 		       unsigned int *cfg_tbl_len);
+unsigned long efi_find_vendor_table(struct boot_params *boot_params,
+				    unsigned long cfg_tbl_pa,
+				    unsigned int cfg_tbl_len,
+				    efi_guid_t guid);
 #else
 static inline enum efi_type efi_get_type(struct boot_params *boot_params)
 {
@@ -205,6 +210,14 @@ static inline int efi_get_conf_table(struct boot_params *boot_params,
 {
 	return -ENOENT;
 }
+
+static inline unsigned long efi_find_vendor_table(struct boot_params *boot_params,
+						  unsigned long cfg_tbl_pa,
+						  unsigned int cfg_tbl_len,
+						  efi_guid_t guid)
+{
+	return 0;
+}
 #endif /* CONFIG_EFI */
 
 #endif /* BOOT_COMPRESSED_MISC_H */
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v9 28/43] x86/compressed/acpi: Move EFI kexec handling into common code
  2022-01-28 17:17 [PATCH v9 00/43] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (26 preceding siblings ...)
  2022-01-28 17:17 ` [PATCH v9 27/43] x86/compressed/acpi: Move EFI vendor " Brijesh Singh
@ 2022-01-28 17:17 ` Brijesh Singh
  2022-02-04 16:09   ` Borislav Petkov
  2022-01-28 17:17 ` [PATCH v9 29/43] x86/boot: Add Confidential Computing type to setup_data Brijesh Singh
                   ` (14 subsequent siblings)
  42 siblings, 1 reply; 115+ messages in thread
From: Brijesh Singh @ 2022-01-28 17:17 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

From: Michael Roth <michael.roth@amd.com>

Future patches for SEV-SNP-validated CPUID will also require early
parsing of the EFI configuration. Incrementally move the related code
into a set of helpers that can be re-used for that purpose.

In this instance, the current acpi.c kexec handling is mainly used to
get the alternative EFI config table address provided by kexec via a
setup_data entry of type SETUP_EFI. If not present, the code then falls
back to normal EFI config table address provided by EFI system table.
This would need to be done by all call-sites attempting to access the
EFI config table, so just have efi_get_conf_table() handle that
automatically.

Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/boot/compressed/acpi.c | 59 ---------------------------------
 arch/x86/boot/compressed/efi.c  | 49 ++++++++++++++++++++++++++-
 2 files changed, 48 insertions(+), 60 deletions(-)

diff --git a/arch/x86/boot/compressed/acpi.c b/arch/x86/boot/compressed/acpi.c
index d505335bcc25..c726da7eed64 100644
--- a/arch/x86/boot/compressed/acpi.c
+++ b/arch/x86/boot/compressed/acpi.c
@@ -47,57 +47,6 @@ __efi_get_rsdp_addr(unsigned long cfg_tbl_pa, unsigned int cfg_tbl_len)
 	return 0;
 }
 
-/* EFI/kexec support is 64-bit only. */
-#ifdef CONFIG_X86_64
-static struct efi_setup_data *get_kexec_setup_data_addr(void)
-{
-	struct setup_data *data;
-	u64 pa_data;
-
-	pa_data = boot_params->hdr.setup_data;
-	while (pa_data) {
-		data = (struct setup_data *)pa_data;
-		if (data->type == SETUP_EFI)
-			return (struct efi_setup_data *)(pa_data + sizeof(struct setup_data));
-
-		pa_data = data->next;
-	}
-	return NULL;
-}
-
-static acpi_physical_address kexec_get_rsdp_addr(void)
-{
-	efi_system_table_64_t *systab;
-	struct efi_setup_data *esd;
-	struct efi_info *ei;
-	enum efi_type et;
-
-	esd = (struct efi_setup_data *)get_kexec_setup_data_addr();
-	if (!esd)
-		return 0;
-
-	if (!esd->tables) {
-		debug_putstr("Wrong kexec SETUP_EFI data.\n");
-		return 0;
-	}
-
-	et = efi_get_type(boot_params);
-	if (et != EFI_TYPE_64) {
-		debug_putstr("Unexpected kexec EFI environment (expected 64-bit EFI).\n");
-		return 0;
-	}
-
-	/* Get systab from boot params. */
-	systab = (efi_system_table_64_t *)efi_get_system_table(boot_params);
-	if (!systab)
-		error("EFI system table not found in kexec boot_params.");
-
-	return __efi_get_rsdp_addr((unsigned long)esd->tables, systab->nr_tables);
-}
-#else
-static acpi_physical_address kexec_get_rsdp_addr(void) { return 0; }
-#endif /* CONFIG_X86_64 */
-
 static acpi_physical_address efi_get_rsdp_addr(void)
 {
 #ifdef CONFIG_EFI
@@ -211,14 +160,6 @@ acpi_physical_address get_rsdp_addr(void)
 
 	pa = boot_params->acpi_rsdp_addr;
 
-	/*
-	 * Try to get EFI data from setup_data. This can happen when we're a
-	 * kexec'ed kernel and kexec(1) has passed all the required EFI info to
-	 * us.
-	 */
-	if (!pa)
-		pa = kexec_get_rsdp_addr();
-
 	if (!pa)
 		pa = efi_get_rsdp_addr();
 
diff --git a/arch/x86/boot/compressed/efi.c b/arch/x86/boot/compressed/efi.c
index 6413a808b2e7..aeed0f0f9f2e 100644
--- a/arch/x86/boot/compressed/efi.c
+++ b/arch/x86/boot/compressed/efi.c
@@ -78,6 +78,49 @@ unsigned long efi_get_system_table(struct boot_params *boot_params)
 	return sys_tbl_pa;
 }
 
+/*
+ * EFI config table address changes to virtual address after boot, which may
+ * not be accessible for the kexec'd kernel. To address this, kexec provides
+ * the initial physical address via a struct setup_data entry, which is
+ * checked for here, along with some sanity checks.
+ */
+static struct efi_setup_data *get_kexec_setup_data(struct boot_params *boot_params,
+						   enum efi_type et)
+{
+#ifdef CONFIG_X86_64
+	struct efi_setup_data *esd = NULL;
+	struct setup_data *data;
+	u64 pa_data;
+
+	if (et != EFI_TYPE_64)
+		return NULL;
+
+	pa_data = boot_params->hdr.setup_data;
+	while (pa_data) {
+		data = (struct setup_data *)pa_data;
+		if (data->type == SETUP_EFI) {
+			esd = (struct efi_setup_data *)(pa_data + sizeof(struct setup_data));
+			break;
+		}
+
+		pa_data = data->next;
+	}
+
+	/*
+	 * Original ACPI code falls back to attempting normal EFI boot in these
+	 * cases, so maintain existing behavior by indicating non-kexec
+	 * environment to the caller, but print them for debugging.
+	 */
+	if (esd && !esd->tables) {
+		debug_putstr("kexec EFI environment missing valid configuration table.\n");
+		return NULL;
+	}
+
+	return esd;
+#endif
+	return NULL;
+}
+
 /**
  * efi_get_conf_table - Given boot_params, locate EFI system table from it and
  *                      return the physical address of EFI configuration table.
@@ -107,8 +150,12 @@ int efi_get_conf_table(struct boot_params *boot_params, unsigned long *cfg_tbl_p
 	if (et == EFI_TYPE_64) {
 		efi_system_table_64_t *stbl =
 			(efi_system_table_64_t *)sys_tbl_pa;
+		struct efi_setup_data *esd;
 
-		*cfg_tbl_pa = stbl->tables;
+		/* kexec provides an alternative EFI conf table, check for it. */
+		esd = get_kexec_setup_data(boot_params, et);
+
+		*cfg_tbl_pa = esd ? esd->tables : stbl->tables;
 		*cfg_tbl_len = stbl->nr_tables;
 	} else if (et == EFI_TYPE_32) {
 		efi_system_table_32_t *stbl =
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v9 29/43] x86/boot: Add Confidential Computing type to setup_data
  2022-01-28 17:17 [PATCH v9 00/43] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (27 preceding siblings ...)
  2022-01-28 17:17 ` [PATCH v9 28/43] x86/compressed/acpi: Move EFI kexec handling into common code Brijesh Singh
@ 2022-01-28 17:17 ` Brijesh Singh
  2022-02-04 16:21   ` Borislav Petkov
  2022-01-28 17:17 ` [PATCH v9 30/43] KVM: SEV: Add documentation for SEV-SNP CPUID Enforcement Brijesh Singh
                   ` (13 subsequent siblings)
  42 siblings, 1 reply; 115+ messages in thread
From: Brijesh Singh @ 2022-01-28 17:17 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

While launching the encrypted guests, the hypervisor may need to provide
some additional information during the guest boot. When booting under the
EFI based BIOS, the EFI configuration table contains an entry for the
confidential computing blob that contains the required information.

To support booting encrypted guests on non-EFI VM, the hypervisor needs to
pass this additional information to the kernel with a different method.

For this purpose, introduce SETUP_CC_BLOB type in setup_data to hold the
physical address of the confidential computing blob location. The boot
loader or hypervisor may choose to use this method instead of EFI
configuration table. The CC blob location scanning should give preference
to setup_data data over the EFI configuration table.

In AMD SEV-SNP, the CC blob contains the address of the secrets and CPUID
pages. The secrets page includes information such as a VM to PSP
communication key and CPUID page contains PSP filtered CPUID values.
Define the AMD SEV confidential computing blob structure.

While at it, define the EFI GUID for the confidential computing blob.

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/include/asm/sev.h            | 18 ++++++++++++++++++
 arch/x86/include/uapi/asm/bootparam.h |  1 +
 include/linux/efi.h                   |  1 +
 3 files changed, 20 insertions(+)

diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index a3203b2caaca..1a7e21bb6eea 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -42,6 +42,24 @@ struct es_em_ctxt {
 	struct es_fault_info fi;
 };
 
+/*
+ * AMD SEV Confidential computing blob structure. The structure is
+ * defined in OVMF UEFI firmware header:
+ * https://github.com/tianocore/edk2/blob/master/OvmfPkg/Include/Guid/ConfidentialComputingSevSnpBlob.h
+ */
+#define CC_BLOB_SEV_HDR_MAGIC	0x45444d41
+struct cc_blob_sev_info {
+	u32 magic;
+	u16 version;
+	u16 reserved;
+	u64 secrets_phys;
+	u32 secrets_len;
+	u32 rsvd1;
+	u64 cpuid_phys;
+	u32 cpuid_len;
+	u32 rsvd2;
+};
+
 void do_vc_no_ghcb(struct pt_regs *regs, unsigned long exit_code);
 
 static inline u64 lower_bits(u64 val, unsigned int bits)
diff --git a/arch/x86/include/uapi/asm/bootparam.h b/arch/x86/include/uapi/asm/bootparam.h
index b25d3f82c2f3..1ac5acca72ce 100644
--- a/arch/x86/include/uapi/asm/bootparam.h
+++ b/arch/x86/include/uapi/asm/bootparam.h
@@ -10,6 +10,7 @@
 #define SETUP_EFI			4
 #define SETUP_APPLE_PROPERTIES		5
 #define SETUP_JAILHOUSE			6
+#define SETUP_CC_BLOB			7
 
 #define SETUP_INDIRECT			(1<<31)
 
diff --git a/include/linux/efi.h b/include/linux/efi.h
index ccd4d3f91c98..984aa688997a 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -390,6 +390,7 @@ void efi_native_runtime_setup(void);
 #define EFI_CERT_SHA256_GUID			EFI_GUID(0xc1c41626, 0x504c, 0x4092, 0xac, 0xa9, 0x41, 0xf9, 0x36, 0x93, 0x43, 0x28)
 #define EFI_CERT_X509_GUID			EFI_GUID(0xa5c059a1, 0x94e4, 0x4aa7, 0x87, 0xb5, 0xab, 0x15, 0x5c, 0x2b, 0xf0, 0x72)
 #define EFI_CERT_X509_SHA256_GUID		EFI_GUID(0x3bd2a492, 0x96c0, 0x4079, 0xb4, 0x20, 0xfc, 0xf9, 0x8e, 0xf1, 0x03, 0xed)
+#define EFI_CC_BLOB_GUID			EFI_GUID(0x067b1f5f, 0xcf26, 0x44c5, 0x85, 0x54, 0x93, 0xd7, 0x77, 0x91, 0x2d, 0x42)
 
 /*
  * This GUID is used to pass to the kernel proper the struct screen_info
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v9 30/43] KVM: SEV: Add documentation for SEV-SNP CPUID Enforcement
  2022-01-28 17:17 [PATCH v9 00/43] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (28 preceding siblings ...)
  2022-01-28 17:17 ` [PATCH v9 29/43] x86/boot: Add Confidential Computing type to setup_data Brijesh Singh
@ 2022-01-28 17:17 ` Brijesh Singh
  2022-02-07 23:48   ` Sean Christopherson
  2022-01-28 17:17 ` [PATCH v9 31/43] x86/compressed/64: Add support for SEV-SNP CPUID table in #VC handlers Brijesh Singh
                   ` (12 subsequent siblings)
  42 siblings, 1 reply; 115+ messages in thread
From: Brijesh Singh @ 2022-01-28 17:17 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

From: Michael Roth <michael.roth@amd.com>

Update the documentation with SEV-SNP CPUID enforcement.

Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 .../virt/kvm/amd-memory-encryption.rst        | 28 +++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/Documentation/virt/kvm/amd-memory-encryption.rst b/Documentation/virt/kvm/amd-memory-encryption.rst
index 1c6847fff304..0c72f44cc11a 100644
--- a/Documentation/virt/kvm/amd-memory-encryption.rst
+++ b/Documentation/virt/kvm/amd-memory-encryption.rst
@@ -433,6 +433,34 @@ issued by the hypervisor to make the guest ready for execution.
 
 Returns: 0 on success, -negative on error
 
+SEV-SNP CPUID Enforcement
+=========================
+
+SEV-SNP guests can access a special page that contains a table of CPUID values
+that have been validated by the PSP as part of the SNP_LAUNCH_UPDATE firmware
+command. It provides the following assurances regarding the validity of CPUID
+values:
+
+ - Its address is obtained via bootloader/firmware (via CC blob), and those
+   binaries will be measured as part of the SEV-SNP attestation report.
+ - Its initial state will be encrypted/pvalidated, so attempts to modify
+   it during run-time will result in garbage being written, or #VC exceptions
+   being generated due to changes in validation state if the hypervisor tries
+   to swap the backing page.
+ - Attempts to bypass PSP checks by the hypervisor by using a normal page, or
+   a non-CPUID encrypted page will change the measurement provided by the
+   SEV-SNP attestation report.
+ - The CPUID page contents are *not* measured, but attempts to modify the
+   expected contents of a CPUID page as part of guest initialization will be
+   gated by the PSP CPUID enforcement policy checks performed on the page
+   during SNP_LAUNCH_UPDATE, and noticeable later if the guest owner
+   implements their own checks of the CPUID values.
+
+It is important to note that this last assurance is only useful if the kernel
+has taken care to make use of the SEV-SNP CPUID throughout all stages of boot.
+Otherwise, guest owner attestation provides no assurance that the kernel wasn't
+fed incorrect values at some point during boot.
+
 References
 ==========
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v9 31/43] x86/compressed/64: Add support for SEV-SNP CPUID table in #VC handlers
  2022-01-28 17:17 [PATCH v9 00/43] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (29 preceding siblings ...)
  2022-01-28 17:17 ` [PATCH v9 30/43] KVM: SEV: Add documentation for SEV-SNP CPUID Enforcement Brijesh Singh
@ 2022-01-28 17:17 ` Brijesh Singh
  2022-02-05 10:54   ` Borislav Petkov
  2022-01-28 17:17 ` [PATCH v9 32/43] x86/boot: Add a pointer to Confidential Computing blob in bootparams Brijesh Singh
                   ` (11 subsequent siblings)
  42 siblings, 1 reply; 115+ messages in thread
From: Brijesh Singh @ 2022-01-28 17:17 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

From: Michael Roth <michael.roth@amd.com>

CPUID instructions generate a #VC exception for SEV-ES/SEV-SNP guests,
for which early handlers are currently set up to handle. In the case
of SEV-SNP, guests can use a configurable location in guest memory
that has been pre-populated with a firmware-validated CPUID table to
look up the relevant CPUID values rather than requesting them from
hypervisor via a VMGEXIT. Add the various hooks in the #VC handlers to
allow CPUID instructions to be handled via the table. The code to
actually configure/enable the table will be added in a subsequent
commit.

Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/boot/compressed/sev.c    |   1 +
 arch/x86/include/asm/sev-common.h |   2 +
 arch/x86/kernel/sev-shared.c      | 336 ++++++++++++++++++++++++++++++
 arch/x86/kernel/sev.c             |   1 +
 4 files changed, 340 insertions(+)

diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index 1e5aa6b65025..1b80c1d0ea1f 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -20,6 +20,7 @@
 #include <asm/fpu/xcr.h>
 #include <asm/ptrace.h>
 #include <asm/svm.h>
+#include <asm/cpuid.h>
 
 #include "error.h"
 
diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h
index 6d90f7c688b1..cd769984e929 100644
--- a/arch/x86/include/asm/sev-common.h
+++ b/arch/x86/include/asm/sev-common.h
@@ -152,6 +152,8 @@ struct snp_psc_desc {
 #define GHCB_TERM_PSC			1	/* Page State Change failure */
 #define GHCB_TERM_PVALIDATE		2	/* Pvalidate failure */
 #define GHCB_TERM_NOT_VMPL0		3	/* SEV-SNP guest is not running at VMPL-0 */
+#define GHCB_TERM_CPUID			4	/* CPUID-validation failure */
+#define GHCB_TERM_CPUID_HV		5	/* CPUID failure during hypervisor fallback */
 
 #define GHCB_RESP_CODE(v)		((v) & GHCB_MSR_INFO_MASK)
 
diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c
index 633f1f93b6e1..dea9f7b28620 100644
--- a/arch/x86/kernel/sev-shared.c
+++ b/arch/x86/kernel/sev-shared.c
@@ -14,6 +14,36 @@
 #define has_cpuflag(f)	boot_cpu_has(f)
 #endif
 
+/*
+ * Individual entries of the SEV-SNP CPUID table, as defined by the SEV-SNP
+ * Firmware ABI, Revision 0.9, Section 7.1, Table 14.
+ */
+struct snp_cpuid_fn {
+	u32 eax_in;
+	u32 ecx_in;
+	u64 xcr0_in;
+	u64 xss_in;
+	u32 eax;
+	u32 ebx;
+	u32 ecx;
+	u32 edx;
+	u64 __reserved;
+} __packed;
+
+/*
+ * SEV-SNP CPUID table header, as defined by the SEV-SNP Firmware ABI,
+ * Revision 0.9, Section 8.14.2.6. Also noted there is the SEV-SNP
+ * firmware-enforced limit of 64 entries per CPUID table.
+ */
+#define SNP_CPUID_COUNT_MAX 64
+
+struct snp_cpuid_info {
+	u32 count;
+	u32 __reserved1;
+	u64 __reserved2;
+	struct snp_cpuid_fn fn[SNP_CPUID_COUNT_MAX];
+} __packed;
+
 /*
  * Since feature negotiation related variables are set early in the boot
  * process they must reside in the .data section so as not to be zeroed
@@ -23,6 +53,19 @@
  */
 static u16 ghcb_version __ro_after_init;
 
+/* Copy of the SNP firmware's CPUID page. */
+static struct snp_cpuid_info cpuid_info_copy __ro_after_init;
+
+/*
+ * These will be initialized based on CPUID table so that non-present
+ * all-zero leaves (for sparse tables) can be differentiated from
+ * invalid/out-of-range leaves. This is needed since all-zero leaves
+ * still need to be post-processed.
+ */
+static u32 cpuid_std_range_max __ro_after_init;
+static u32 cpuid_hyp_range_max __ro_after_init;
+static u32 cpuid_ext_range_max __ro_after_init;
+
 static bool __init sev_es_check_cpu_features(void)
 {
 	if (!has_cpuflag(X86_FEATURE_RDRAND)) {
@@ -224,6 +267,266 @@ static int sev_cpuid_hv(u32 func, u32 *eax, u32 *ebx, u32 *ecx, u32 *edx)
 	return ret;
 }
 
+/*
+ * This may be called early while still running on the initial identity
+ * mapping. Use RIP-relative addressing to obtain the correct address
+ * while running with the initial identity mapping as well as the
+ * switch-over to kernel virtual addresses later.
+ */
+static const struct snp_cpuid_info *snp_cpuid_info_get_ptr(void)
+{
+	void *ptr;
+
+	asm ("lea cpuid_info_copy(%%rip), %0"
+	     : "=r" (ptr)
+	     : "p" (&cpuid_info_copy));
+
+	return ptr;
+}
+
+/*
+ * The SEV-SNP Firmware ABI, Revision 0.9, Section 7.1, details the use of
+ * XCR0_IN and XSS_IN to encode multiple versions of 0xD subfunctions 0
+ * and 1 based on the corresponding features enabled by a particular
+ * combination of XCR0 and XSS registers so that a guest can look up the
+ * version corresponding to the features currently enabled in its XCR0/XSS
+ * registers. The only values that differ between these versions/table
+ * entries is the enabled XSAVE area size advertised via EBX.
+ *
+ * While hypervisors may choose to make use of this support, it is more
+ * robust/secure for a guest to simply find the entry corresponding to the
+ * base/legacy XSAVE area size (XCR0=1 or XCR0=3), and then calculate the
+ * XSAVE area size using subfunctions 2 through 64, as documented in APM
+ * Volume 3, Rev 3.31, Appendix E.3.8, which is what is done here.
+ *
+ * Since base/legacy XSAVE area size is documented as 0x240, use that value
+ * directly rather than relying on the base size in the CPUID table.
+ *
+ * Return: XSAVE area size on success, 0 otherwise.
+ */
+static u32 snp_cpuid_calc_xsave_size(u64 xfeatures_en, bool compacted)
+{
+	const struct snp_cpuid_info *cpuid_info = snp_cpuid_info_get_ptr();
+	u64 xfeatures_found = 0;
+	u32 xsave_size = 0x240;
+	int i;
+
+	for (i = 0; i < cpuid_info->count; i++) {
+		const struct snp_cpuid_fn *fn = &cpuid_info->fn[i];
+
+		if (!(fn->eax_in == 0xD && fn->ecx_in > 1 && fn->ecx_in < 64))
+			continue;
+		if (!(xfeatures_en & (BIT_ULL(fn->ecx_in))))
+			continue;
+		if (xfeatures_found & (BIT_ULL(fn->ecx_in)))
+			continue;
+
+		xfeatures_found |= (BIT_ULL(fn->ecx_in));
+
+		if (compacted)
+			xsave_size += fn->eax;
+		else
+			xsave_size = max(xsave_size, fn->eax + fn->ebx);
+	}
+
+	/*
+	 * Either the guest set unsupported XCR0/XSS bits, or the corresponding
+	 * entries in the CPUID table were not present. This is not a valid
+	 * state to be in.
+	 */
+	if (xfeatures_found != (xfeatures_en & GENMASK_ULL(63, 2)))
+		return 0;
+
+	return xsave_size;
+}
+
+static void snp_cpuid_hv(u32 func, u32 subfunc, u32 *eax, u32 *ebx, u32 *ecx,
+			 u32 *edx)
+{
+	/*
+	 * MSR protocol does not support fetching indexed subfunction, but is
+	 * sufficient to handle current fallback cases. Should that change,
+	 * make sure to terminate rather than ignoring the index and grabbing
+	 * random values. If this issue arises in the future, handling can be
+	 * added here to use GHCB-page protocol for cases that occur late
+	 * enough in boot that GHCB page is available.
+	 */
+	if (cpuid_function_is_indexed(func) && subfunc)
+		sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_CPUID_HV);
+
+	if (sev_cpuid_hv(func, eax, ebx, ecx, edx))
+		sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_CPUID_HV);
+}
+
+static bool
+snp_cpuid_get_validated_func(u32 func, u32 subfunc, u32 *eax, u32 *ebx,
+			     u32 *ecx, u32 *edx)
+{
+	const struct snp_cpuid_info *cpuid_info = snp_cpuid_info_get_ptr();
+	int i;
+
+	for (i = 0; i < cpuid_info->count; i++) {
+		const struct snp_cpuid_fn *fn = &cpuid_info->fn[i];
+
+		if (fn->eax_in != func)
+			continue;
+
+		if (cpuid_function_is_indexed(func) && fn->ecx_in != subfunc)
+			continue;
+
+		/*
+		 * For 0xD subfunctions 0 and 1, only use the entry corresponding
+		 * to the base/legacy XSAVE area size (XCR0=1 or XCR0=3, XSS=0).
+		 * See the comments above snp_cpuid_calc_xsave_size() for more
+		 * details.
+		 */
+		if (fn->eax_in == 0xD && (fn->ecx_in == 0 || fn->ecx_in == 1))
+			if (!(fn->xcr0_in == 1 || fn->xcr0_in == 3) || fn->xss_in)
+				continue;
+
+		*eax = fn->eax;
+		*ebx = fn->ebx;
+		*ecx = fn->ecx;
+		*edx = fn->edx;
+
+		return true;
+	}
+
+	return false;
+}
+
+static bool snp_cpuid_check_range(u32 func)
+{
+	if (func <= cpuid_std_range_max ||
+	    (func >= 0x40000000 && func <= cpuid_hyp_range_max) ||
+	    (func >= 0x80000000 && func <= cpuid_ext_range_max))
+		return true;
+
+	return false;
+}
+
+static int snp_cpuid_postprocess(u32 func, u32 subfunc, u32 *eax, u32 *ebx,
+				 u32 *ecx, u32 *edx)
+{
+	u32 ebx2, ecx2, edx2;
+
+	switch (func) {
+	case 0x1:
+		snp_cpuid_hv(func, subfunc, NULL, &ebx2, NULL, &edx2);
+
+		/* initial APIC ID */
+		*ebx = (ebx2 & GENMASK(31, 24)) | (*ebx & GENMASK(23, 0));
+		/* APIC enabled bit */
+		*edx = (edx2 & BIT(9)) | (*edx & ~BIT(9));
+
+		/* OSXSAVE enabled bit */
+		if (native_read_cr4() & X86_CR4_OSXSAVE)
+			*ecx |= BIT(27);
+		break;
+	case 0x7:
+		/* OSPKE enabled bit */
+		*ecx &= ~BIT(4);
+		if (native_read_cr4() & X86_CR4_PKE)
+			*ecx |= BIT(4);
+		break;
+	case 0xB:
+		/* extended APIC ID */
+		snp_cpuid_hv(func, 0, NULL, NULL, NULL, edx);
+		break;
+	case 0xD: {
+		bool compacted = false;
+		u64 xcr0 = 1, xss = 0;
+		u32 xsave_size;
+
+		if (subfunc != 0 && subfunc != 1)
+			return 0;
+
+		if (native_read_cr4() & X86_CR4_OSXSAVE)
+			xcr0 = xgetbv(XCR_XFEATURE_ENABLED_MASK);
+		if (subfunc == 1) {
+			/* Get XSS value if XSAVES is enabled. */
+			if (*eax & BIT(3)) {
+				unsigned long lo, hi;
+
+				asm volatile("rdmsr" : "=a" (lo), "=d" (hi)
+						     : "c" (MSR_IA32_XSS));
+				xss = (hi << 32) | lo;
+			}
+
+			/*
+			 * The PPR and APM aren't clear on what size should be
+			 * encoded in 0xD:0x1:EBX when compaction is not enabled
+			 * by either XSAVEC (feature bit 1) or XSAVES (feature
+			 * bit 3) since SNP-capable hardware has these feature
+			 * bits fixed as 1. KVM sets it to 0 in this case, but
+			 * to avoid this becoming an issue it's safer to simply
+			 * treat this as unsupported for SEV-SNP guests.
+			 */
+			if (!(*eax & (BIT(1) | BIT(3))))
+				return -EINVAL;
+
+			compacted = true;
+		}
+
+		xsave_size = snp_cpuid_calc_xsave_size(xcr0 | xss, compacted);
+		if (!xsave_size)
+			return -EINVAL;
+
+		*ebx = xsave_size;
+		}
+		break;
+	case 0x8000001E:
+		/* extended APIC ID */
+		snp_cpuid_hv(func, subfunc, eax, &ebx2, &ecx2, NULL);
+		/* compute ID */
+		*ebx = (*ebx & GENMASK(31, 8)) | (ebx2 & GENMASK(7, 0));
+		/* node ID */
+		*ecx = (*ecx & GENMASK(31, 8)) | (ecx2 & GENMASK(7, 0));
+		break;
+	default:
+		/* No fix-ups needed, use values as-is. */
+		break;
+	}
+
+	return 0;
+}
+
+/*
+ * Returns -EOPNOTSUPP if feature not enabled. Any other non-zero return value
+ * should be treated as fatal by caller.
+ */
+static int snp_cpuid(u32 func, u32 subfunc, u32 *eax, u32 *ebx, u32 *ecx,
+		     u32 *edx)
+{
+	const struct snp_cpuid_info *cpuid_info = snp_cpuid_info_get_ptr();
+
+	if (!cpuid_info->count)
+		return -EOPNOTSUPP;
+
+	if (!snp_cpuid_get_validated_func(func, subfunc, eax, ebx, ecx, edx)) {
+		/*
+		 * Some hypervisors will avoid keeping track of CPUID entries
+		 * where all values are zero, since they can be handled the
+		 * same as out-of-range values (all-zero). This is useful here
+		 * as well as it allows virtually all guest configurations to
+		 * work using a single SEV-SNP CPUID table.
+		 *
+		 * To allow for this, there is a need to distinguish between
+		 * out-of-range entries and in-range zero entries, since the
+		 * CPUID table entries are only a template that may need to be
+		 * augmented with additional values for things like
+		 * CPU-specific information during post-processing. So if it's
+		 * not in the table, but is still in the valid range, proceed
+		 * with the post-processing. Otherwise, just return zeros.
+		 */
+		*eax = *ebx = *ecx = *edx = 0;
+		if (!snp_cpuid_check_range(func))
+			return 0;
+	}
+
+	return snp_cpuid_postprocess(func, subfunc, eax, ebx, ecx, edx);
+}
+
 /*
  * Boot VC Handler - This is the first VC handler during boot, there is no GHCB
  * page yet, so it only supports the MSR based communication with the
@@ -231,16 +534,26 @@ static int sev_cpuid_hv(u32 func, u32 *eax, u32 *ebx, u32 *ecx, u32 *edx)
  */
 void __init do_vc_no_ghcb(struct pt_regs *regs, unsigned long exit_code)
 {
+	unsigned int subfn = lower_bits(regs->cx, 32);
 	unsigned int fn = lower_bits(regs->ax, 32);
 	u32 eax, ebx, ecx, edx;
+	int ret;
 
 	/* Only CPUID is supported via MSR protocol */
 	if (exit_code != SVM_EXIT_CPUID)
 		goto fail;
 
+	ret = snp_cpuid(fn, subfn, &eax, &ebx, &ecx, &edx);
+	if (!ret)
+		goto cpuid_done;
+
+	if (ret != -EOPNOTSUPP)
+		goto fail;
+
 	if (sev_cpuid_hv(fn, &eax, &ebx, &ecx, &edx))
 		goto fail;
 
+cpuid_done:
 	regs->ax = eax;
 	regs->bx = ebx;
 	regs->cx = ecx;
@@ -535,12 +848,35 @@ static enum es_result vc_handle_ioio(struct ghcb *ghcb, struct es_em_ctxt *ctxt)
 	return ret;
 }
 
+static int vc_handle_cpuid_snp(struct pt_regs *regs)
+{
+	u32 eax, ebx, ecx, edx;
+	int ret;
+
+	ret = snp_cpuid(regs->ax, regs->cx, &eax, &ebx, &ecx, &edx);
+	if (!ret) {
+		regs->ax = eax;
+		regs->bx = ebx;
+		regs->cx = ecx;
+		regs->dx = edx;
+	}
+
+	return ret;
+}
+
 static enum es_result vc_handle_cpuid(struct ghcb *ghcb,
 				      struct es_em_ctxt *ctxt)
 {
 	struct pt_regs *regs = ctxt->regs;
 	u32 cr4 = native_read_cr4();
 	enum es_result ret;
+	int snp_cpuid_ret;
+
+	snp_cpuid_ret = vc_handle_cpuid_snp(regs);
+	if (!snp_cpuid_ret)
+		return ES_OK;
+	if (snp_cpuid_ret != -EOPNOTSUPP)
+		return ES_VMM_ERROR;
 
 	ghcb_set_rax(ghcb, regs->ax);
 	ghcb_set_rcx(ghcb, regs->cx);
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index 19a8ffcbd83b..c2bc07eb97e9 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -33,6 +33,7 @@
 #include <asm/smp.h>
 #include <asm/cpu.h>
 #include <asm/apic.h>
+#include <asm/cpuid.h>
 
 #define DR7_RESET_VALUE        0x400
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v9 32/43] x86/boot: Add a pointer to Confidential Computing blob in bootparams
  2022-01-28 17:17 [PATCH v9 00/43] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (30 preceding siblings ...)
  2022-01-28 17:17 ` [PATCH v9 31/43] x86/compressed/64: Add support for SEV-SNP CPUID table in #VC handlers Brijesh Singh
@ 2022-01-28 17:17 ` Brijesh Singh
  2022-02-05 13:07   ` Borislav Petkov
  2022-01-28 17:17 ` [PATCH v9 33/43] x86/compressed: Add SEV-SNP feature detection/setup Brijesh Singh
                   ` (10 subsequent siblings)
  42 siblings, 1 reply; 115+ messages in thread
From: Brijesh Singh @ 2022-01-28 17:17 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

From: Michael Roth <michael.roth@amd.com>

The previously defined Confidential Computing blob is provided to the
kernel via a setup_data structure or EFI config table entry. Currently
these are both checked for by boot/compressed kernel to access the
CPUID table address within it for use with SEV-SNP CPUID enforcement.

To also enable SEV-SNP CPUID enforcement for the run-time kernel,
similar early access to the CPUID table is needed early on while it's
still using the identity-mapped page table set up by boot/compressed,
where global pointers need to be accessed via fixup_pointer().

This isn't much of an issue for accessing setup_data, and the EFI
config table helper code currently used in boot/compressed *could* be
used in this case as well since they both rely on identity-mapping.
However, it has some reliance on EFI helpers/string constants that
would need to be accessed via fixup_pointer(), and fixing it up while
making it shareable between boot/compressed and run-time kernel is
fragile and introduces a good bit of uglyness.

Instead, add a boot_params->cc_blob_address pointer that the
boot/compressed kernel can initialize so that the run-time kernel can
access the CC blob from there instead of re-scanning the EFI config
table.

Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/include/asm/bootparam_utils.h | 1 +
 arch/x86/include/uapi/asm/bootparam.h  | 3 ++-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/bootparam_utils.h b/arch/x86/include/asm/bootparam_utils.h
index 981fe923a59f..53e9b0620d96 100644
--- a/arch/x86/include/asm/bootparam_utils.h
+++ b/arch/x86/include/asm/bootparam_utils.h
@@ -74,6 +74,7 @@ static void sanitize_boot_params(struct boot_params *boot_params)
 			BOOT_PARAM_PRESERVE(hdr),
 			BOOT_PARAM_PRESERVE(e820_table),
 			BOOT_PARAM_PRESERVE(eddbuf),
+			BOOT_PARAM_PRESERVE(cc_blob_address),
 		};
 
 		memset(&scratch, 0, sizeof(scratch));
diff --git a/arch/x86/include/uapi/asm/bootparam.h b/arch/x86/include/uapi/asm/bootparam.h
index 1ac5acca72ce..bea5cdcdf532 100644
--- a/arch/x86/include/uapi/asm/bootparam.h
+++ b/arch/x86/include/uapi/asm/bootparam.h
@@ -188,7 +188,8 @@ struct boot_params {
 	__u32 ext_ramdisk_image;			/* 0x0c0 */
 	__u32 ext_ramdisk_size;				/* 0x0c4 */
 	__u32 ext_cmd_line_ptr;				/* 0x0c8 */
-	__u8  _pad4[116];				/* 0x0cc */
+	__u8  _pad4[112];				/* 0x0cc */
+	__u32 cc_blob_address;				/* 0x13c */
 	struct edid_info edid_info;			/* 0x140 */
 	struct efi_info efi_info;			/* 0x1c0 */
 	__u32 alt_mem_k;				/* 0x1e0 */
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v9 33/43] x86/compressed: Add SEV-SNP feature detection/setup
  2022-01-28 17:17 [PATCH v9 00/43] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (31 preceding siblings ...)
  2022-01-28 17:17 ` [PATCH v9 32/43] x86/boot: Add a pointer to Confidential Computing blob in bootparams Brijesh Singh
@ 2022-01-28 17:17 ` Brijesh Singh
  2022-02-06 16:41   ` Borislav Petkov
  2022-01-28 17:17 ` [PATCH v9 34/43] x86/compressed: Use firmware-validated CPUID leaves for SEV-SNP guests Brijesh Singh
                   ` (9 subsequent siblings)
  42 siblings, 1 reply; 115+ messages in thread
From: Brijesh Singh @ 2022-01-28 17:17 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

From: Michael Roth <michael.roth@amd.com>

Initial/preliminary detection of SEV-SNP is done via the Confidential
Computing blob. Check for it prior to the normal SEV/SME feature
initialization, and add some sanity checks to confirm it agrees with
SEV-SNP CPUID/MSR bits.

Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/boot/compressed/sev.c | 118 ++++++++++++++++++++++++++++++++-
 arch/x86/include/asm/sev.h     |   3 +
 2 files changed, 120 insertions(+), 1 deletion(-)

diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index 1b80c1d0ea1f..04cabff015ba 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -286,6 +286,13 @@ static void enforce_vmpl0(void)
 void sev_enable(struct boot_params *bp)
 {
 	unsigned int eax, ebx, ecx, edx;
+	bool snp;
+
+	/*
+	 * Setup/preliminary detection of SEV-SNP. This will be sanity-checked
+	 * against CPUID/MSR values later.
+	 */
+	snp = snp_init(bp);
 
 	/* Check for the SME/SEV support leaf */
 	eax = 0x80000000;
@@ -306,8 +313,11 @@ void sev_enable(struct boot_params *bp)
 	ecx = 0;
 	native_cpuid(&eax, &ebx, &ecx, &edx);
 	/* Check whether SEV is supported */
-	if (!(eax & BIT(1)))
+	if (!(eax & BIT(1))) {
+		if (snp)
+			error("SEV-SNP support indicated by CC blob, but not CPUID.");
 		return;
+	}
 
 	/* Set the SME mask if this is an SEV guest. */
 	sev_status = rd_sev_status_msr();
@@ -332,5 +342,111 @@ void sev_enable(struct boot_params *bp)
 		enforce_vmpl0();
 	}
 
+	if (snp && !(sev_status & MSR_AMD64_SEV_SNP_ENABLED))
+		error("SEV-SNP supported indicated by CC blob, but not SEV status MSR.");
+
 	sme_me_mask = BIT_ULL(ebx & 0x3f);
 }
+
+/* Search for Confidential Computing blob in the EFI config table. */
+static struct cc_blob_sev_info *snp_find_cc_blob_efi(struct boot_params *bp)
+{
+	unsigned long cfg_table_pa;
+	unsigned int cfg_table_len;
+	int ret;
+
+	ret = efi_get_conf_table(bp, &cfg_table_pa, &cfg_table_len);
+	if (ret)
+		return NULL;
+
+	return (struct cc_blob_sev_info *)efi_find_vendor_table(bp, cfg_table_pa,
+								cfg_table_len,
+								EFI_CC_BLOB_GUID);
+}
+
+struct cc_setup_data {
+	struct setup_data header;
+	u32 cc_blob_address;
+};
+
+static struct cc_setup_data *get_cc_setup_data(struct boot_params *bp)
+{
+	struct setup_data *hdr = (struct setup_data *)bp->hdr.setup_data;
+
+	while (hdr) {
+		if (hdr->type == SETUP_CC_BLOB)
+			return (struct cc_setup_data *)hdr;
+		hdr = (struct setup_data *)hdr->next;
+	}
+
+	return NULL;
+}
+
+/*
+ * Search for a Confidential Computing blob passed in as a setup_data entry
+ * via the Linux Boot Protocol.
+ */
+static struct cc_blob_sev_info *snp_find_cc_blob_setup_data(struct boot_params *bp)
+{
+	struct cc_setup_data *sd;
+
+	sd = get_cc_setup_data(bp);
+	if (!sd)
+		return NULL;
+
+	return (struct cc_blob_sev_info *)(unsigned long)sd->cc_blob_address;
+}
+
+/*
+ * Initial set up of SEV-SNP relies on information provided by the
+ * Confidential Computing blob, which can be passed to the boot kernel
+ * by firmware/bootloader in the following ways:
+ *
+ * - via an entry in the EFI config table
+ * - via a setup_data structure, as defined by the Linux Boot Protocol
+ *
+ * Scan for the blob in that order.
+ */
+static struct cc_blob_sev_info *snp_find_cc_blob(struct boot_params *bp)
+{
+	struct cc_blob_sev_info *cc_info;
+
+	cc_info = snp_find_cc_blob_efi(bp);
+	if (cc_info)
+		goto found_cc_info;
+
+	cc_info = snp_find_cc_blob_setup_data(bp);
+	if (!cc_info)
+		return NULL;
+
+found_cc_info:
+	if (cc_info->magic != CC_BLOB_SEV_HDR_MAGIC)
+		sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SNP_UNSUPPORTED);
+
+	return cc_info;
+}
+
+bool snp_init(struct boot_params *bp)
+{
+	struct cc_blob_sev_info *cc_info;
+
+	if (!bp)
+		return false;
+
+	cc_info = snp_find_cc_blob(bp);
+	if (!cc_info)
+		return false;
+
+	/*
+	 * Pass run-time kernel a pointer to CC info via boot_params so EFI
+	 * config table doesn't need to be searched again during early startup
+	 * phase.
+	 */
+	bp->cc_blob_address = (u32)(unsigned long)cc_info;
+
+	/*
+	 * Indicate SEV-SNP based on presence of SEV-SNP-specific CC blob.
+	 * Subsequent checks will verify SEV-SNP CPUID/MSR bits.
+	 */
+	return true;
+}
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 1a7e21bb6eea..4e3909042001 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -11,6 +11,7 @@
 #include <linux/types.h>
 #include <asm/insn.h>
 #include <asm/sev-common.h>
+#include <asm/bootparam.h>
 
 #define GHCB_PROTOCOL_MIN	1ULL
 #define GHCB_PROTOCOL_MAX	2ULL
@@ -151,6 +152,7 @@ void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op
 void snp_set_memory_shared(unsigned long vaddr, unsigned int npages);
 void snp_set_memory_private(unsigned long vaddr, unsigned int npages);
 void snp_set_wakeup_secondary_cpu(void);
+bool snp_init(struct boot_params *bp);
 #else
 static inline void sev_es_ist_enter(struct pt_regs *regs) { }
 static inline void sev_es_ist_exit(void) { }
@@ -168,6 +170,7 @@ static inline void __init snp_prep_memory(unsigned long paddr, unsigned int sz,
 static inline void snp_set_memory_shared(unsigned long vaddr, unsigned int npages) { }
 static inline void snp_set_memory_private(unsigned long vaddr, unsigned int npages) { }
 static inline void snp_set_wakeup_secondary_cpu(void) { }
+static inline bool snp_init(struct boot_params *bp) { return false; }
 #endif
 
 #endif
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v9 34/43] x86/compressed: Use firmware-validated CPUID leaves for SEV-SNP guests
  2022-01-28 17:17 [PATCH v9 00/43] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (32 preceding siblings ...)
  2022-01-28 17:17 ` [PATCH v9 33/43] x86/compressed: Add SEV-SNP feature detection/setup Brijesh Singh
@ 2022-01-28 17:17 ` Brijesh Singh
  2022-01-28 17:17 ` [PATCH v9 35/43] x86/compressed: Export and rename add_identity_map() Brijesh Singh
                   ` (8 subsequent siblings)
  42 siblings, 0 replies; 115+ messages in thread
From: Brijesh Singh @ 2022-01-28 17:17 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

From: Michael Roth <michael.roth@amd.com>

SEV-SNP guests will be provided the location of special 'secrets'
'CPUID' pages via the Confidential Computing blob. This blob is
provided to the boot kernel either through an EFI config table entry,
or via a setup_data structure as defined by the Linux Boot Protocol.

Locate the Confidential Computing from these sources and, if found,
use the provided CPUID page/table address to create a copy that the
boot kernel will use when servicing cpuid instructions via a #VC CPUID
handler.

Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/boot/compressed/sev.c | 46 ++++++++++++++++++++++++++++++++++
 1 file changed, 46 insertions(+)

diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index 04cabff015ba..e1596bfc13e6 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -426,6 +426,43 @@ static struct cc_blob_sev_info *snp_find_cc_blob(struct boot_params *bp)
 	return cc_info;
 }
 
+/*
+ * Initialize the kernel's copy of the SEV-SNP CPUID table, and set up the
+ * pointer that will be used to access it.
+ *
+ * Maintaining a direct mapping of the SEV-SNP CPUID table used by firmware
+ * would be possible as an alternative, but the approach is brittle since the
+ * mapping needs to be updated in sync with all the changes to virtual memory
+ * layout and related mapping facilities throughout the boot process.
+ */
+static void snp_setup_cpuid_table(const struct cc_blob_sev_info *cc_info)
+{
+	const struct snp_cpuid_info *cpuid_info_fw, *cpuid_info;
+	int i;
+
+	if (!cc_info || !cc_info->cpuid_phys || cc_info->cpuid_len < PAGE_SIZE)
+		sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_CPUID);
+
+	cpuid_info_fw = (const struct snp_cpuid_info *)cc_info->cpuid_phys;
+	if (!cpuid_info_fw->count || cpuid_info_fw->count > SNP_CPUID_COUNT_MAX)
+		sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_CPUID);
+
+	cpuid_info = snp_cpuid_info_get_ptr();
+	memcpy((void *)cpuid_info, cpuid_info_fw, sizeof(*cpuid_info));
+
+	/* Initialize CPUID ranges for range-checking. */
+	for (i = 0; i < cpuid_info->count; i++) {
+		const struct snp_cpuid_fn *fn = &cpuid_info->fn[i];
+
+		if (fn->eax_in == 0x0)
+			cpuid_std_range_max = fn->eax;
+		else if (fn->eax_in == 0x40000000)
+			cpuid_hyp_range_max = fn->eax;
+		else if (fn->eax_in == 0x80000000)
+			cpuid_ext_range_max = fn->eax;
+	}
+}
+
 bool snp_init(struct boot_params *bp)
 {
 	struct cc_blob_sev_info *cc_info;
@@ -437,6 +474,15 @@ bool snp_init(struct boot_params *bp)
 	if (!cc_info)
 		return false;
 
+	/*
+	 * If a SEV-SNP-specific Confidential Computing blob is present, then
+	 * firmware/bootloader have indicated SEV-SNP support. Verifying this
+	 * involves CPUID checks which will be more reliable if the SEV-SNP
+	 * CPUID table is used. See comments over snp_setup_cpuid_table() for
+	 * more details.
+	 */
+	snp_setup_cpuid_table(cc_info);
+
 	/*
 	 * Pass run-time kernel a pointer to CC info via boot_params so EFI
 	 * config table doesn't need to be searched again during early startup
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v9 35/43] x86/compressed: Export and rename add_identity_map()
  2022-01-28 17:17 [PATCH v9 00/43] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (33 preceding siblings ...)
  2022-01-28 17:17 ` [PATCH v9 34/43] x86/compressed: Use firmware-validated CPUID leaves for SEV-SNP guests Brijesh Singh
@ 2022-01-28 17:17 ` Brijesh Singh
  2022-02-06 19:01   ` Borislav Petkov
  2022-01-28 17:17 ` [PATCH v9 36/43] x86/compressed/64: Add identity mapping for Confidential Computing blob Brijesh Singh
                   ` (7 subsequent siblings)
  42 siblings, 1 reply; 115+ messages in thread
From: Brijesh Singh @ 2022-01-28 17:17 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Borislav Petkov

From: Michael Roth <michael.roth@amd.com>

SEV-specific code will need to add some additional mappings, but doing
this within ident_map_64.c requires some SEV-specific helpers to be
exported and some SEV-specific struct definitions to be pulled into
ident_map_64.c. Instead, export add_identity_map() so SEV-specific (and
other subsystem-specific) code can be better contained outside of
ident_map_64.c.

While at it, rename the function to kernel_add_identity_map(), similar
to the kernel_ident_mapping_init() function it relies upon.

Suggested-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 arch/x86/boot/compressed/ident_map_64.c | 18 +++++++++---------
 arch/x86/boot/compressed/misc.h         |  1 +
 2 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/arch/x86/boot/compressed/ident_map_64.c b/arch/x86/boot/compressed/ident_map_64.c
index 3d566964b829..7975680f521f 100644
--- a/arch/x86/boot/compressed/ident_map_64.c
+++ b/arch/x86/boot/compressed/ident_map_64.c
@@ -90,7 +90,7 @@ static struct x86_mapping_info mapping_info;
 /*
  * Adds the specified range to the identity mappings.
  */
-static void add_identity_map(unsigned long start, unsigned long end)
+void kernel_add_identity_map(unsigned long start, unsigned long end)
 {
 	int ret;
 
@@ -157,11 +157,11 @@ void initialize_identity_maps(void *rmode)
 	 * explicitly here in case the compressed kernel does not touch them,
 	 * or does not touch all the pages covering them.
 	 */
-	add_identity_map((unsigned long)_head, (unsigned long)_end);
+	kernel_add_identity_map((unsigned long)_head, (unsigned long)_end);
 	boot_params = rmode;
-	add_identity_map((unsigned long)boot_params, (unsigned long)(boot_params + 1));
+	kernel_add_identity_map((unsigned long)boot_params, (unsigned long)(boot_params + 1));
 	cmdline = get_cmd_line_ptr();
-	add_identity_map(cmdline, cmdline + COMMAND_LINE_SIZE);
+	kernel_add_identity_map(cmdline, cmdline + COMMAND_LINE_SIZE);
 
 	/* Load the new page-table. */
 	sev_verify_cbit(top_level_pgt);
@@ -246,10 +246,10 @@ static int set_clr_page_flags(struct x86_mapping_info *info,
 	 * It should already exist, but keep things generic.
 	 *
 	 * To map the page just read from it and fault it in if there is no
-	 * mapping yet. add_identity_map() can't be called here because that
-	 * would unconditionally map the address on PMD level, destroying any
-	 * PTE-level mappings that might already exist. Use assembly here so
-	 * the access won't be optimized away.
+	 * mapping yet. kernel_add_identity_map() can't be called here because
+	 * that would unconditionally map the address on PMD level, destroying
+	 * any PTE-level mappings that might already exist. Use assembly here
+	 * so the access won't be optimized away.
 	 */
 	asm volatile("mov %[address], %%r9"
 		     :: [address] "g" (*(unsigned long *)address)
@@ -363,5 +363,5 @@ void do_boot_page_fault(struct pt_regs *regs, unsigned long error_code)
 	 * Error code is sane - now identity map the 2M region around
 	 * the faulting address.
 	 */
-	add_identity_map(address, end);
+	kernel_add_identity_map(address, end);
 }
diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
index 991b46170914..cfa0663bf931 100644
--- a/arch/x86/boot/compressed/misc.h
+++ b/arch/x86/boot/compressed/misc.h
@@ -156,6 +156,7 @@ static inline int count_immovable_mem_regions(void) { return 0; }
 #ifdef CONFIG_X86_5LEVEL
 extern unsigned int __pgtable_l5_enabled, pgdir_shift, ptrs_per_p4d;
 #endif
+extern void kernel_add_identity_map(unsigned long start, unsigned long end);
 
 /* Used by PAGE_KERN* macros: */
 extern pteval_t __default_kernel_pte_mask;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v9 36/43] x86/compressed/64: Add identity mapping for Confidential Computing blob
  2022-01-28 17:17 [PATCH v9 00/43] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (34 preceding siblings ...)
  2022-01-28 17:17 ` [PATCH v9 35/43] x86/compressed: Export and rename add_identity_map() Brijesh Singh
@ 2022-01-28 17:17 ` Brijesh Singh
  2022-02-06 19:21   ` Borislav Petkov
  2022-01-28 17:17 ` [PATCH v9 37/43] x86/sev: Add SEV-SNP feature detection/setup Brijesh Singh
                   ` (6 subsequent siblings)
  42 siblings, 1 reply; 115+ messages in thread
From: Brijesh Singh @ 2022-01-28 17:17 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

From: Michael Roth <michael.roth@amd.com>

The run-time kernel will need to access the Confidential Computing
blob very early in boot to access the CPUID table it points to. At
that stage of boot it will be relying on the identity-mapped page table
set up by boot/compressed kernel, so make sure the blob and the CPUID
table it points to are mapped in advance.

Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/boot/compressed/ident_map_64.c |  3 ++-
 arch/x86/boot/compressed/misc.h         |  2 ++
 arch/x86/boot/compressed/sev.c          | 22 ++++++++++++++++++++++
 3 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/arch/x86/boot/compressed/ident_map_64.c b/arch/x86/boot/compressed/ident_map_64.c
index 7975680f521f..e4b093a0862d 100644
--- a/arch/x86/boot/compressed/ident_map_64.c
+++ b/arch/x86/boot/compressed/ident_map_64.c
@@ -163,8 +163,9 @@ void initialize_identity_maps(void *rmode)
 	cmdline = get_cmd_line_ptr();
 	kernel_add_identity_map(cmdline, cmdline + COMMAND_LINE_SIZE);
 
+	sev_prep_identity_maps(top_level_pgt);
+
 	/* Load the new page-table. */
-	sev_verify_cbit(top_level_pgt);
 	write_cr3(top_level_pgt);
 }
 
diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
index cfa0663bf931..72eda6c26c11 100644
--- a/arch/x86/boot/compressed/misc.h
+++ b/arch/x86/boot/compressed/misc.h
@@ -127,6 +127,7 @@ void sev_es_shutdown_ghcb(void);
 extern bool sev_es_check_ghcb_fault(unsigned long address);
 void snp_set_page_private(unsigned long paddr);
 void snp_set_page_shared(unsigned long paddr);
+void sev_prep_identity_maps(unsigned long top_level_pgt);
 #else
 static inline void sev_enable(struct boot_params *bp) { }
 static inline void sev_es_shutdown_ghcb(void) { }
@@ -136,6 +137,7 @@ static inline bool sev_es_check_ghcb_fault(unsigned long address)
 }
 static inline void snp_set_page_private(unsigned long paddr) { }
 static inline void snp_set_page_shared(unsigned long paddr) { }
+static inline void sev_prep_identity_maps(unsigned long top_level_pgt) { }
 #endif
 
 /* acpi.c */
diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index e1596bfc13e6..faf432684870 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -496,3 +496,25 @@ bool snp_init(struct boot_params *bp)
 	 */
 	return true;
 }
+
+void sev_prep_identity_maps(unsigned long top_level_pgt)
+{
+	/*
+	 * The ConfidentialComputing blob is used very early in uncompressed
+	 * kernel to find the in-memory cpuid table to handle cpuid
+	 * instructions. Make sure an identity-mapping exists so it can be
+	 * accessed after switchover.
+	 */
+	if (sev_snp_enabled()) {
+		unsigned long cc_info_pa = boot_params->cc_blob_address;
+		struct cc_blob_sev_info *cc_info;
+
+		kernel_add_identity_map(cc_info_pa,
+					cc_info_pa + sizeof(*cc_info));
+		cc_info = (struct cc_blob_sev_info *)cc_info_pa;
+		kernel_add_identity_map((unsigned long)cc_info->cpuid_phys,
+					(unsigned long)cc_info->cpuid_phys + cc_info->cpuid_len);
+	}
+
+	sev_verify_cbit(top_level_pgt);
+}
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v9 37/43] x86/sev: Add SEV-SNP feature detection/setup
  2022-01-28 17:17 [PATCH v9 00/43] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (35 preceding siblings ...)
  2022-01-28 17:17 ` [PATCH v9 36/43] x86/compressed/64: Add identity mapping for Confidential Computing blob Brijesh Singh
@ 2022-01-28 17:17 ` Brijesh Singh
  2022-02-06 19:38   ` Borislav Petkov
  2022-01-28 17:17 ` [PATCH v9 38/43] x86/sev: Use firmware-validated CPUID for SEV-SNP guests Brijesh Singh
                   ` (5 subsequent siblings)
  42 siblings, 1 reply; 115+ messages in thread
From: Brijesh Singh @ 2022-01-28 17:17 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

From: Michael Roth <michael.roth@amd.com>

Initial/preliminary detection of SEV-SNP is done via the Confidential
Computing blob. Check for it prior to the normal SEV/SME feature
initialization, and add some sanity checks to confirm it agrees with
SEV-SNP CPUID/MSR bits.

Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/boot/compressed/sev.c     | 33 ---------------
 arch/x86/include/asm/sev.h         |  2 +
 arch/x86/kernel/sev-shared.c       | 33 +++++++++++++++
 arch/x86/kernel/sev.c              | 64 ++++++++++++++++++++++++++++++
 arch/x86/mm/mem_encrypt_identity.c |  8 ++++
 5 files changed, 107 insertions(+), 33 deletions(-)

diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index faf432684870..cde5658e1007 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -364,39 +364,6 @@ static struct cc_blob_sev_info *snp_find_cc_blob_efi(struct boot_params *bp)
 								EFI_CC_BLOB_GUID);
 }
 
-struct cc_setup_data {
-	struct setup_data header;
-	u32 cc_blob_address;
-};
-
-static struct cc_setup_data *get_cc_setup_data(struct boot_params *bp)
-{
-	struct setup_data *hdr = (struct setup_data *)bp->hdr.setup_data;
-
-	while (hdr) {
-		if (hdr->type == SETUP_CC_BLOB)
-			return (struct cc_setup_data *)hdr;
-		hdr = (struct setup_data *)hdr->next;
-	}
-
-	return NULL;
-}
-
-/*
- * Search for a Confidential Computing blob passed in as a setup_data entry
- * via the Linux Boot Protocol.
- */
-static struct cc_blob_sev_info *snp_find_cc_blob_setup_data(struct boot_params *bp)
-{
-	struct cc_setup_data *sd;
-
-	sd = get_cc_setup_data(bp);
-	if (!sd)
-		return NULL;
-
-	return (struct cc_blob_sev_info *)(unsigned long)sd->cc_blob_address;
-}
-
 /*
  * Initial set up of SEV-SNP relies on information provided by the
  * Confidential Computing blob, which can be passed to the boot kernel
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 4e3909042001..219abb4590f2 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -153,6 +153,7 @@ void snp_set_memory_shared(unsigned long vaddr, unsigned int npages);
 void snp_set_memory_private(unsigned long vaddr, unsigned int npages);
 void snp_set_wakeup_secondary_cpu(void);
 bool snp_init(struct boot_params *bp);
+void snp_abort(void);
 #else
 static inline void sev_es_ist_enter(struct pt_regs *regs) { }
 static inline void sev_es_ist_exit(void) { }
@@ -171,6 +172,7 @@ static inline void snp_set_memory_shared(unsigned long vaddr, unsigned int npage
 static inline void snp_set_memory_private(unsigned long vaddr, unsigned int npages) { }
 static inline void snp_set_wakeup_secondary_cpu(void) { }
 static inline bool snp_init(struct boot_params *bp) { return false; }
+static inline void snp_abort(void) { }
 #endif
 
 #endif
diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c
index dea9f7b28620..2bddbfea079b 100644
--- a/arch/x86/kernel/sev-shared.c
+++ b/arch/x86/kernel/sev-shared.c
@@ -928,3 +928,36 @@ static enum es_result vc_handle_rdtsc(struct ghcb *ghcb,
 
 	return ES_OK;
 }
+
+struct cc_setup_data {
+	struct setup_data header;
+	u32 cc_blob_address;
+};
+
+static struct cc_setup_data *get_cc_setup_data(struct boot_params *bp)
+{
+	struct setup_data *hdr = (struct setup_data *)bp->hdr.setup_data;
+
+	while (hdr) {
+		if (hdr->type == SETUP_CC_BLOB)
+			return (struct cc_setup_data *)hdr;
+		hdr = (struct setup_data *)hdr->next;
+	}
+
+	return NULL;
+}
+
+/*
+ * Search for a Confidential Computing blob passed in as a setup_data entry
+ * via the Linux Boot Protocol.
+ */
+static struct cc_blob_sev_info *snp_find_cc_blob_setup_data(struct boot_params *bp)
+{
+	struct cc_setup_data *sd;
+
+	sd = get_cc_setup_data(bp);
+	if (!sd)
+		return NULL;
+
+	return (struct cc_blob_sev_info *)(unsigned long)sd->cc_blob_address;
+}
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index c2bc07eb97e9..5c72b21c7e7b 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -1983,3 +1983,67 @@ bool __init handle_vc_boot_ghcb(struct pt_regs *regs)
 	while (true)
 		halt();
 }
+
+/*
+ * Initial set up of SEV-SNP relies on information provided by the
+ * Confidential Computing blob, which can be passed to the kernel
+ * in the following ways, depending on how it is booted:
+ *
+ * - when booted via the boot/decompress kernel:
+ *   - via boot_params
+ *
+ * - when booted directly by firmware/bootloader (e.g. CONFIG_PVH):
+ *   - via a setup_data entry, as defined by the Linux Boot Protocol
+ *
+ * Scan for the blob in that order.
+ */
+static __init struct cc_blob_sev_info *snp_find_cc_blob(struct boot_params *bp)
+{
+	struct cc_blob_sev_info *cc_info;
+
+	/* Boot kernel would have passed the CC blob via boot_params. */
+	if (bp->cc_blob_address) {
+		cc_info = (struct cc_blob_sev_info *)(unsigned long)bp->cc_blob_address;
+		goto found_cc_info;
+	}
+
+	/*
+	 * If kernel was booted directly, without the use of the
+	 * boot/decompression kernel, the CC blob may have been passed via
+	 * setup_data instead.
+	 */
+	cc_info = snp_find_cc_blob_setup_data(bp);
+	if (!cc_info)
+		return NULL;
+
+found_cc_info:
+	if (cc_info->magic != CC_BLOB_SEV_HDR_MAGIC)
+		snp_abort();
+
+	return cc_info;
+}
+
+bool __init snp_init(struct boot_params *bp)
+{
+	struct cc_blob_sev_info *cc_info;
+
+	if (!bp)
+		return false;
+
+	cc_info = snp_find_cc_blob(bp);
+	if (!cc_info)
+		return false;
+
+	/*
+	 * The CC blob will be used later to access the secrets page. Cache
+	 * it here like the boot kernel does.
+	 */
+	bp->cc_blob_address = (u32)(unsigned long)cc_info;
+
+	return true;
+}
+
+void __init snp_abort(void)
+{
+	sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SNP_UNSUPPORTED);
+}
diff --git a/arch/x86/mm/mem_encrypt_identity.c b/arch/x86/mm/mem_encrypt_identity.c
index 3f0abb403340..2f723e106ed3 100644
--- a/arch/x86/mm/mem_encrypt_identity.c
+++ b/arch/x86/mm/mem_encrypt_identity.c
@@ -44,6 +44,7 @@
 #include <asm/setup.h>
 #include <asm/sections.h>
 #include <asm/cmdline.h>
+#include <asm/sev.h>
 
 #include "mm_internal.h"
 
@@ -508,8 +509,11 @@ void __init sme_enable(struct boot_params *bp)
 	bool active_by_default;
 	unsigned long me_mask;
 	char buffer[16];
+	bool snp;
 	u64 msr;
 
+	snp = snp_init(bp);
+
 	/* Check for the SME/SEV support leaf */
 	eax = 0x80000000;
 	ecx = 0;
@@ -541,6 +545,10 @@ void __init sme_enable(struct boot_params *bp)
 	sev_status   = __rdmsr(MSR_AMD64_SEV);
 	feature_mask = (sev_status & MSR_AMD64_SEV_ENABLED) ? AMD_SEV_BIT : AMD_SME_BIT;
 
+	/* The SEV-SNP CC blob should never be present unless SEV-SNP is enabled. */
+	if (snp && !(sev_status & MSR_AMD64_SEV_SNP_ENABLED))
+		snp_abort();
+
 	/* Check if memory encryption is enabled */
 	if (feature_mask == AMD_SME_BIT) {
 		/*
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v9 38/43] x86/sev: Use firmware-validated CPUID for SEV-SNP guests
  2022-01-28 17:17 [PATCH v9 00/43] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (36 preceding siblings ...)
  2022-01-28 17:17 ` [PATCH v9 37/43] x86/sev: Add SEV-SNP feature detection/setup Brijesh Singh
@ 2022-01-28 17:17 ` Brijesh Singh
  2022-02-05 17:19   ` Michael Roth
  2022-02-06 19:50   ` Borislav Petkov
  2022-01-28 17:18 ` [PATCH v9 39/43] x86/sev: Provide support for SNP guest request NAEs Brijesh Singh
                   ` (4 subsequent siblings)
  42 siblings, 2 replies; 115+ messages in thread
From: Brijesh Singh @ 2022-01-28 17:17 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

From: Michael Roth <michael.roth@amd.com>

SEV-SNP guests will be provided the location of special 'secrets' and
'CPUID' pages via the Confidential Computing blob. This blob is
provided to the run-time kernel either through a bootparams field that
was initialized by the boot/compressed kernel, or via a setup_data
structure as defined by the Linux Boot Protocol.

Locate the Confidential Computing blob from these sources and, if found,
use the provided CPUID page/table address to create a copy that the
run-time kernel will use when servicing CPUID instructions via a #VC
handler.

Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/boot/compressed/sev.c | 37 -----------------
 arch/x86/kernel/sev-shared.c   | 37 +++++++++++++++++
 arch/x86/kernel/sev.c          | 75 ++++++++++++++++++++++++++++++++++
 3 files changed, 112 insertions(+), 37 deletions(-)

diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index cde5658e1007..21cdafac71e3 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -393,43 +393,6 @@ static struct cc_blob_sev_info *snp_find_cc_blob(struct boot_params *bp)
 	return cc_info;
 }
 
-/*
- * Initialize the kernel's copy of the SEV-SNP CPUID table, and set up the
- * pointer that will be used to access it.
- *
- * Maintaining a direct mapping of the SEV-SNP CPUID table used by firmware
- * would be possible as an alternative, but the approach is brittle since the
- * mapping needs to be updated in sync with all the changes to virtual memory
- * layout and related mapping facilities throughout the boot process.
- */
-static void snp_setup_cpuid_table(const struct cc_blob_sev_info *cc_info)
-{
-	const struct snp_cpuid_info *cpuid_info_fw, *cpuid_info;
-	int i;
-
-	if (!cc_info || !cc_info->cpuid_phys || cc_info->cpuid_len < PAGE_SIZE)
-		sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_CPUID);
-
-	cpuid_info_fw = (const struct snp_cpuid_info *)cc_info->cpuid_phys;
-	if (!cpuid_info_fw->count || cpuid_info_fw->count > SNP_CPUID_COUNT_MAX)
-		sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_CPUID);
-
-	cpuid_info = snp_cpuid_info_get_ptr();
-	memcpy((void *)cpuid_info, cpuid_info_fw, sizeof(*cpuid_info));
-
-	/* Initialize CPUID ranges for range-checking. */
-	for (i = 0; i < cpuid_info->count; i++) {
-		const struct snp_cpuid_fn *fn = &cpuid_info->fn[i];
-
-		if (fn->eax_in == 0x0)
-			cpuid_std_range_max = fn->eax;
-		else if (fn->eax_in == 0x40000000)
-			cpuid_hyp_range_max = fn->eax;
-		else if (fn->eax_in == 0x80000000)
-			cpuid_ext_range_max = fn->eax;
-	}
-}
-
 bool snp_init(struct boot_params *bp)
 {
 	struct cc_blob_sev_info *cc_info;
diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c
index 2bddbfea079b..795426ee6108 100644
--- a/arch/x86/kernel/sev-shared.c
+++ b/arch/x86/kernel/sev-shared.c
@@ -961,3 +961,40 @@ static struct cc_blob_sev_info *snp_find_cc_blob_setup_data(struct boot_params *
 
 	return (struct cc_blob_sev_info *)(unsigned long)sd->cc_blob_address;
 }
+
+/*
+ * Initialize the kernel's copy of the SEV-SNP CPUID table, and set up the
+ * pointer that will be used to access it.
+ *
+ * Maintaining a direct mapping of the SEV-SNP CPUID table used by firmware
+ * would be possible as an alternative, but the approach is brittle since the
+ * mapping needs to be updated in sync with all the changes to virtual memory
+ * layout and related mapping facilities throughout the boot process.
+ */
+static void __init snp_setup_cpuid_table(const struct cc_blob_sev_info *cc_info)
+{
+	const struct snp_cpuid_info *cpuid_info_fw, *cpuid_info;
+	int i;
+
+	if (!cc_info || !cc_info->cpuid_phys || cc_info->cpuid_len < PAGE_SIZE)
+		sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_CPUID);
+
+	cpuid_info_fw = (const struct snp_cpuid_info *)cc_info->cpuid_phys;
+	if (!cpuid_info_fw->count || cpuid_info_fw->count > SNP_CPUID_COUNT_MAX)
+		sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_CPUID);
+
+	cpuid_info = snp_cpuid_info_get_ptr();
+	memcpy((void *)cpuid_info, cpuid_info_fw, sizeof(*cpuid_info));
+
+	/* Initialize CPUID ranges for range-checking. */
+	for (i = 0; i < cpuid_info->count; i++) {
+		const struct snp_cpuid_fn *fn = &cpuid_info->fn[i];
+
+		if (fn->eax_in == 0x0)
+			cpuid_std_range_max = fn->eax;
+		else if (fn->eax_in == 0x40000000)
+			cpuid_hyp_range_max = fn->eax;
+		else if (fn->eax_in == 0x80000000)
+			cpuid_ext_range_max = fn->eax;
+	}
+}
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index 5c72b21c7e7b..cb97200bfda7 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -2034,6 +2034,8 @@ bool __init snp_init(struct boot_params *bp)
 	if (!cc_info)
 		return false;
 
+	snp_setup_cpuid_table(cc_info);
+
 	/*
 	 * The CC blob will be used later to access the secrets page. Cache
 	 * it here like the boot kernel does.
@@ -2047,3 +2049,76 @@ void __init snp_abort(void)
 {
 	sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SNP_UNSUPPORTED);
 }
+
+static void snp_dump_cpuid_table(void)
+{
+	const struct snp_cpuid_info *cpuid_info = snp_cpuid_info_get_ptr();
+	int i = 0;
+
+	pr_info("count=%d reserved=0x%x reserved2=0x%llx\n",
+		cpuid_info->count, cpuid_info->__reserved1, cpuid_info->__reserved2);
+
+	for (i = 0; i < SNP_CPUID_COUNT_MAX; i++) {
+		const struct snp_cpuid_fn *fn = &cpuid_info->fn[i];
+
+		pr_info("index=%03d fn=0x%08x subfn=0x%08x: eax=0x%08x ebx=0x%08x ecx=0x%08x edx=0x%08x xcr0_in=0x%016llx xss_in=0x%016llx reserved=0x%016llx\n",
+			i, fn->eax_in, fn->ecx_in, fn->eax, fn->ebx, fn->ecx,
+			fn->edx, fn->xcr0_in, fn->xss_in, fn->__reserved);
+	}
+}
+
+/*
+ * It is useful from an auditing/testing perspective to provide an easy way
+ * for the guest owner to know that the CPUID table has been initialized as
+ * expected, but that initialization happens too early in boot to print any
+ * sort of indicator, and there's not really any other good place to do it.
+ * So do it here. This is also a good place to flag unexpected usage of the
+ * CPUID table that does not violate the SNP specification, but may still
+ * be worth noting to the user.
+ */
+static int __init snp_check_cpuid_table(void)
+{
+	const struct snp_cpuid_info *cpuid_info = snp_cpuid_info_get_ptr();
+	bool dump_table = false;
+	int i;
+
+	if (!cpuid_info->count)
+		return 0;
+
+	pr_info("Using SEV-SNP CPUID table, %d entries present.\n",
+		cpuid_info->count);
+
+	if (cpuid_info->__reserved1 || cpuid_info->__reserved2) {
+		pr_warn("Unexpected use of reserved fields in CPUID table header.\n");
+		dump_table = true;
+	}
+
+	for (i = 0; i < cpuid_info->count; i++) {
+		const struct snp_cpuid_fn *fn = &cpuid_info->fn[i];
+		struct snp_cpuid_fn zero_fn = {0};
+
+		/*
+		 * Though allowed, an all-zero entry is not a valid leaf for linux
+		 * guests, and likely the result of an incorrect entry count.
+		 */
+		if (!memcmp(fn, &zero_fn, sizeof(struct snp_cpuid_fn))) {
+			pr_warn("CPUID table contains all-zero entry at index=%d\n", i);
+			dump_table = true;
+		}
+
+		if (fn->__reserved) {
+			pr_warn("Unexpected use of reserved fields in CPUID table entry at index=%d\n",
+				i);
+			dump_table = true;
+		}
+	}
+
+	if (dump_table) {
+		pr_warn("Dumping CPUID table:\n");
+		snp_dump_cpuid_table();
+	}
+
+	return 0;
+}
+
+arch_initcall(snp_check_cpuid_table);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v9 39/43] x86/sev: Provide support for SNP guest request NAEs
  2022-01-28 17:17 [PATCH v9 00/43] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (37 preceding siblings ...)
  2022-01-28 17:17 ` [PATCH v9 38/43] x86/sev: Use firmware-validated CPUID for SEV-SNP guests Brijesh Singh
@ 2022-01-28 17:18 ` Brijesh Singh
  2022-02-01 20:17   ` Peter Gonda
  2022-01-28 17:18 ` [PATCH v9 40/43] x86/sev: Register SEV-SNP guest request platform device Brijesh Singh
                   ` (3 subsequent siblings)
  42 siblings, 1 reply; 115+ messages in thread
From: Brijesh Singh @ 2022-01-28 17:18 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

Version 2 of GHCB specification provides SNP_GUEST_REQUEST and
SNP_EXT_GUEST_REQUEST NAE that can be used by the SNP guest to communicate
with the PSP.

While at it, add a snp_issue_guest_request() helper that will be used by
driver or other subsystem to issue the request to PSP.

See SEV-SNP firmware and GHCB spec for more details.

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/include/asm/sev-common.h |  3 ++
 arch/x86/include/asm/sev.h        | 14 ++++++++
 arch/x86/include/uapi/asm/svm.h   |  4 +++
 arch/x86/kernel/sev.c             | 55 +++++++++++++++++++++++++++++++
 4 files changed, 76 insertions(+)

diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h
index cd769984e929..442614879dad 100644
--- a/arch/x86/include/asm/sev-common.h
+++ b/arch/x86/include/asm/sev-common.h
@@ -128,6 +128,9 @@ struct snp_psc_desc {
 	struct psc_entry entries[VMGEXIT_PSC_MAX_ENTRY];
 } __packed;
 
+/* Guest message request error code */
+#define SNP_GUEST_REQ_INVALID_LEN	BIT_ULL(32)
+
 #define GHCB_MSR_TERM_REQ		0x100
 #define GHCB_MSR_TERM_REASON_SET_POS	12
 #define GHCB_MSR_TERM_REASON_SET_MASK	0xf
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 219abb4590f2..9830ee1d6ef0 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -87,6 +87,14 @@ extern bool handle_vc_boot_ghcb(struct pt_regs *regs);
 
 #define RMPADJUST_VMSA_PAGE_BIT		BIT(16)
 
+/* SNP Guest message request */
+struct snp_req_data {
+	unsigned long req_gpa;
+	unsigned long resp_gpa;
+	unsigned long data_gpa;
+	unsigned int data_npages;
+};
+
 #ifdef CONFIG_AMD_MEM_ENCRYPT
 extern struct static_key_false sev_es_enable_key;
 extern void __sev_es_ist_enter(struct pt_regs *regs);
@@ -154,6 +162,7 @@ void snp_set_memory_private(unsigned long vaddr, unsigned int npages);
 void snp_set_wakeup_secondary_cpu(void);
 bool snp_init(struct boot_params *bp);
 void snp_abort(void);
+int snp_issue_guest_request(u64 exit_code, struct snp_req_data *input, unsigned long *fw_err);
 #else
 static inline void sev_es_ist_enter(struct pt_regs *regs) { }
 static inline void sev_es_ist_exit(void) { }
@@ -173,6 +182,11 @@ static inline void snp_set_memory_private(unsigned long vaddr, unsigned int npag
 static inline void snp_set_wakeup_secondary_cpu(void) { }
 static inline bool snp_init(struct boot_params *bp) { return false; }
 static inline void snp_abort(void) { }
+static inline int snp_issue_guest_request(u64 exit_code, struct snp_req_data *input,
+					  unsigned long *fw_err)
+{
+	return -ENOTTY;
+}
 #endif
 
 #endif
diff --git a/arch/x86/include/uapi/asm/svm.h b/arch/x86/include/uapi/asm/svm.h
index 8b4c57baec52..5b8bc2b65a5e 100644
--- a/arch/x86/include/uapi/asm/svm.h
+++ b/arch/x86/include/uapi/asm/svm.h
@@ -109,6 +109,8 @@
 #define SVM_VMGEXIT_SET_AP_JUMP_TABLE		0
 #define SVM_VMGEXIT_GET_AP_JUMP_TABLE		1
 #define SVM_VMGEXIT_PSC				0x80000010
+#define SVM_VMGEXIT_GUEST_REQUEST		0x80000011
+#define SVM_VMGEXIT_EXT_GUEST_REQUEST		0x80000012
 #define SVM_VMGEXIT_AP_CREATION			0x80000013
 #define SVM_VMGEXIT_AP_CREATE_ON_INIT		0
 #define SVM_VMGEXIT_AP_CREATE			1
@@ -225,6 +227,8 @@
 	{ SVM_VMGEXIT_AP_HLT_LOOP,	"vmgexit_ap_hlt_loop" }, \
 	{ SVM_VMGEXIT_AP_JUMP_TABLE,	"vmgexit_ap_jump_table" }, \
 	{ SVM_VMGEXIT_PSC,	"vmgexit_page_state_change" }, \
+	{ SVM_VMGEXIT_GUEST_REQUEST,		"vmgexit_guest_request" }, \
+	{ SVM_VMGEXIT_EXT_GUEST_REQUEST,	"vmgexit_ext_guest_request" }, \
 	{ SVM_VMGEXIT_AP_CREATION,	"vmgexit_ap_creation" }, \
 	{ SVM_VMGEXIT_HV_FEATURES,	"vmgexit_hypervisor_feature" }, \
 	{ SVM_EXIT_ERR,         "invalid_guest_state" }
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index cb97200bfda7..1d3ac83226fc 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -2122,3 +2122,58 @@ static int __init snp_check_cpuid_table(void)
 }
 
 arch_initcall(snp_check_cpuid_table);
+
+int snp_issue_guest_request(u64 exit_code, struct snp_req_data *input, unsigned long *fw_err)
+{
+	struct ghcb_state state;
+	struct es_em_ctxt ctxt;
+	unsigned long flags;
+	struct ghcb *ghcb;
+	int ret;
+
+	if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
+		return -ENODEV;
+
+	/*
+	 * __sev_get_ghcb() needs to run with IRQs disabled because it is using
+	 * a per-CPU GHCB.
+	 */
+	local_irq_save(flags);
+
+	ghcb = __sev_get_ghcb(&state);
+	if (!ghcb) {
+		ret = -EIO;
+		goto e_restore_irq;
+	}
+
+	vc_ghcb_invalidate(ghcb);
+
+	if (exit_code == SVM_VMGEXIT_EXT_GUEST_REQUEST) {
+		ghcb_set_rax(ghcb, input->data_gpa);
+		ghcb_set_rbx(ghcb, input->data_npages);
+	}
+
+	ret = sev_es_ghcb_hv_call(ghcb, true, &ctxt, exit_code, input->req_gpa, input->resp_gpa);
+	if (ret)
+		goto e_put;
+
+	if (ghcb->save.sw_exit_info_2) {
+		/* Number of expected pages are returned in RBX */
+		if (exit_code == SVM_VMGEXIT_EXT_GUEST_REQUEST &&
+		    ghcb->save.sw_exit_info_2 == SNP_GUEST_REQ_INVALID_LEN)
+			input->data_npages = ghcb_get_rbx(ghcb);
+
+		if (fw_err)
+			*fw_err = ghcb->save.sw_exit_info_2;
+
+		ret = -EIO;
+	}
+
+e_put:
+	__sev_put_ghcb(&state);
+e_restore_irq:
+	local_irq_restore(flags);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(snp_issue_guest_request);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v9 40/43] x86/sev: Register SEV-SNP guest request platform device
  2022-01-28 17:17 [PATCH v9 00/43] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (38 preceding siblings ...)
  2022-01-28 17:18 ` [PATCH v9 39/43] x86/sev: Provide support for SNP guest request NAEs Brijesh Singh
@ 2022-01-28 17:18 ` Brijesh Singh
  2022-02-01 20:21   ` Peter Gonda
  2022-02-06 20:05   ` Borislav Petkov
  2022-01-28 17:18 ` [PATCH v9 41/43] virt: Add SEV-SNP guest driver Brijesh Singh
                   ` (2 subsequent siblings)
  42 siblings, 2 replies; 115+ messages in thread
From: Brijesh Singh @ 2022-01-28 17:18 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

Version 2 of GHCB specification provides Non Automatic Exit (NAE) that can
be used by the SEV-SNP guest to communicate with the PSP without risk from
a malicious hypervisor who wishes to read, alter, drop or replay the
messages sent.

SNP_LAUNCH_UPDATE can insert two special pages into the guest’s memory:
the secrets page and the CPUID page. The PSP firmware populate the contents
of the secrets page. The secrets page contains encryption keys used by the
guest to interact with the firmware. Because the secrets page is encrypted
with the guest’s memory encryption key, the hypervisor cannot read the
keys. See SEV-SNP firmware spec for further details on the secrets page
format.

Create a platform device that the SEV-SNP guest driver can bind to get the
platform resources such as encryption key and message id to use to
communicate with the PSP. The SEV-SNP guest driver provides a userspace
interface to get the attestation report, key derivation, extended
attestation report etc.

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/include/asm/sev.h |  4 +++
 arch/x86/kernel/sev.c      | 61 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 65 insertions(+)

diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 9830ee1d6ef0..ca977493eb72 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -95,6 +95,10 @@ struct snp_req_data {
 	unsigned int data_npages;
 };
 
+struct snp_guest_platform_data {
+	u64 secrets_gpa;
+};
+
 #ifdef CONFIG_AMD_MEM_ENCRYPT
 extern struct static_key_false sev_es_enable_key;
 extern void __sev_es_ist_enter(struct pt_regs *regs);
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index 1d3ac83226fc..1e56ab00d1f4 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -19,6 +19,9 @@
 #include <linux/kernel.h>
 #include <linux/mm.h>
 #include <linux/cpumask.h>
+#include <linux/efi.h>
+#include <linux/platform_device.h>
+#include <linux/io.h>
 
 #include <asm/cpu_entry_area.h>
 #include <asm/stacktrace.h>
@@ -34,6 +37,7 @@
 #include <asm/cpu.h>
 #include <asm/apic.h>
 #include <asm/cpuid.h>
+#include <asm/setup.h>
 
 #define DR7_RESET_VALUE        0x400
 
@@ -2177,3 +2181,60 @@ int snp_issue_guest_request(u64 exit_code, struct snp_req_data *input, unsigned
 	return ret;
 }
 EXPORT_SYMBOL_GPL(snp_issue_guest_request);
+
+static struct platform_device guest_req_device = {
+	.name		= "snp-guest",
+	.id		= -1,
+};
+
+static u64 get_secrets_page(void)
+{
+	u64 pa_data = boot_params.cc_blob_address;
+	struct cc_blob_sev_info info;
+	void *map;
+
+	/*
+	 * The CC blob contains the address of the secrets page, check if the
+	 * blob is present.
+	 */
+	if (!pa_data)
+		return 0;
+
+	map = early_memremap(pa_data, sizeof(info));
+	memcpy(&info, map, sizeof(info));
+	early_memunmap(map, sizeof(info));
+
+	/* smoke-test the secrets page passed */
+	if (!info.secrets_phys || info.secrets_len != PAGE_SIZE)
+		return 0;
+
+	return info.secrets_phys;
+}
+
+static int __init init_snp_platform_device(void)
+{
+	struct snp_guest_platform_data data;
+	u64 gpa;
+
+	if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
+		return -ENODEV;
+
+	gpa = get_secrets_page();
+	if (!gpa)
+		return -ENODEV;
+
+	data.secrets_gpa = gpa;
+	if (platform_device_add_data(&guest_req_device, &data, sizeof(data)))
+		goto e_fail;
+
+	if (platform_device_register(&guest_req_device))
+		goto e_fail;
+
+	pr_info("SNP guest platform device initialized.\n");
+	return 0;
+
+e_fail:
+	pr_err("Failed to initialize SNP guest device\n");
+	return -ENODEV;
+}
+device_initcall(init_snp_platform_device);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v9 41/43] virt: Add SEV-SNP guest driver
  2022-01-28 17:17 [PATCH v9 00/43] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (39 preceding siblings ...)
  2022-01-28 17:18 ` [PATCH v9 40/43] x86/sev: Register SEV-SNP guest request platform device Brijesh Singh
@ 2022-01-28 17:18 ` Brijesh Singh
  2022-02-01 20:33   ` Peter Gonda
  2022-02-06 22:39   ` Borislav Petkov
  2022-01-28 17:18 ` [PATCH v9 42/43] virt: sevguest: Add support to derive key Brijesh Singh
  2022-01-28 17:18 ` [PATCH v9 43/43] virt: sevguest: Add support to get extended report Brijesh Singh
  42 siblings, 2 replies; 115+ messages in thread
From: Brijesh Singh @ 2022-01-28 17:18 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

SEV-SNP specification provides the guest a mechanism to communicate with
the PSP without risk from a malicious hypervisor who wishes to read, alter,
drop or replay the messages sent. The driver uses snp_issue_guest_request()
to issue GHCB SNP_GUEST_REQUEST or SNP_EXT_GUEST_REQUEST NAE events to
submit the request to PSP.

The PSP requires that all communication should be encrypted using key
specified through the platform_data.

The userspace can use SNP_GET_REPORT ioctl() to query the guest
attestation report.

See SEV-SNP spec section Guest Messages for more details.

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 Documentation/virt/coco/sevguest.rst  |  81 ++++
 drivers/virt/Kconfig                  |   3 +
 drivers/virt/Makefile                 |   1 +
 drivers/virt/coco/sevguest/Kconfig    |  12 +
 drivers/virt/coco/sevguest/Makefile   |   2 +
 drivers/virt/coco/sevguest/sevguest.c | 605 ++++++++++++++++++++++++++
 drivers/virt/coco/sevguest/sevguest.h |  98 +++++
 include/uapi/linux/sev-guest.h        |  50 +++
 8 files changed, 852 insertions(+)
 create mode 100644 Documentation/virt/coco/sevguest.rst
 create mode 100644 drivers/virt/coco/sevguest/Kconfig
 create mode 100644 drivers/virt/coco/sevguest/Makefile
 create mode 100644 drivers/virt/coco/sevguest/sevguest.c
 create mode 100644 drivers/virt/coco/sevguest/sevguest.h
 create mode 100644 include/uapi/linux/sev-guest.h

diff --git a/Documentation/virt/coco/sevguest.rst b/Documentation/virt/coco/sevguest.rst
new file mode 100644
index 000000000000..47ef3b0821d5
--- /dev/null
+++ b/Documentation/virt/coco/sevguest.rst
@@ -0,0 +1,81 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===================================================================
+The Definitive SEV Guest API Documentation
+===================================================================
+
+1. General description
+======================
+
+The SEV API is a set of ioctls that are used by the guest or hypervisor
+to get or set certain aspect of the SEV virtual machine. The ioctls belong
+to the following classes:
+
+ - Hypervisor ioctls: These query and set global attributes which affect the
+   whole SEV firmware.  These ioctl are used by platform provision tools.
+
+ - Guest ioctls: These query and set attributes of the SEV virtual machine.
+
+2. API description
+==================
+
+This section describes ioctls that can be used to query or set SEV guests.
+For each ioctl, the following information is provided along with a
+description:
+
+  Technology:
+      which SEV technology provides this ioctl. sev, sev-es, sev-snp or all.
+
+  Type:
+      hypervisor or guest. The ioctl can be used inside the guest or the
+      hypervisor.
+
+  Parameters:
+      what parameters are accepted by the ioctl.
+
+  Returns:
+      the return value.  General error numbers (ENOMEM, EINVAL)
+      are not detailed, but errors with specific meanings are.
+
+The guest ioctl should be issued on a file descriptor of the /dev/sev-guest device.
+The ioctl accepts struct snp_user_guest_request. The input and output structure is
+specified through the req_data and resp_data field respectively. If the ioctl fails
+to execute due to a firmware error, then fw_err code will be set otherwise the
+fw_err will be set to 0xff.
+
+::
+        struct snp_guest_request_ioctl {
+                /* Message version number */
+                __u32 msg_version;
+
+                /* Request and response structure address */
+                __u64 req_data;
+                __u64 resp_data;
+
+                /* firmware error code on failure (see psp-sev.h) */
+                __u64 fw_err;
+        };
+
+2.1 SNP_GET_REPORT
+------------------
+
+:Technology: sev-snp
+:Type: guest ioctl
+:Parameters (in): struct snp_report_req
+:Returns (out): struct snp_report_resp on success, -negative on error
+
+The SNP_GET_REPORT ioctl can be used to query the attestation report from the
+SEV-SNP firmware. The ioctl uses the SNP_GUEST_REQUEST (MSG_REPORT_REQ) command
+provided by the SEV-SNP firmware to query the attestation report.
+
+On success, the snp_report_resp.data will contains the report. The report
+contain the format described in the SEV-SNP specification. See the SEV-SNP
+specification for further details.
+
+
+Reference
+---------
+
+SEV-SNP and GHCB specification: developer.amd.com/sev
+
+The driver is based on SEV-SNP firmware spec 0.9 and GHCB spec version 2.0.
diff --git a/drivers/virt/Kconfig b/drivers/virt/Kconfig
index 8061e8ef449f..e457e47610d3 100644
--- a/drivers/virt/Kconfig
+++ b/drivers/virt/Kconfig
@@ -36,4 +36,7 @@ source "drivers/virt/vboxguest/Kconfig"
 source "drivers/virt/nitro_enclaves/Kconfig"
 
 source "drivers/virt/acrn/Kconfig"
+
+source "drivers/virt/coco/sevguest/Kconfig"
+
 endif
diff --git a/drivers/virt/Makefile b/drivers/virt/Makefile
index 3e272ea60cd9..9c704a6fdcda 100644
--- a/drivers/virt/Makefile
+++ b/drivers/virt/Makefile
@@ -8,3 +8,4 @@ obj-y				+= vboxguest/
 
 obj-$(CONFIG_NITRO_ENCLAVES)	+= nitro_enclaves/
 obj-$(CONFIG_ACRN_HSM)		+= acrn/
+obj-$(CONFIG_SEV_GUEST)		+= coco/sevguest/
diff --git a/drivers/virt/coco/sevguest/Kconfig b/drivers/virt/coco/sevguest/Kconfig
new file mode 100644
index 000000000000..07ab9ec6471c
--- /dev/null
+++ b/drivers/virt/coco/sevguest/Kconfig
@@ -0,0 +1,12 @@
+config SEV_GUEST
+	tristate "AMD SEV Guest driver"
+	default y
+	depends on AMD_MEM_ENCRYPT && CRYPTO_AEAD2
+	help
+	  SEV-SNP firmware provides the guest a mechanism to communicate with
+	  the PSP without risk from a malicious hypervisor who wishes to read,
+	  alter, drop or replay the messages sent. The driver provides
+	  userspace interface to communicate with the PSP to request the
+	  attestation report and more.
+
+	  If you choose 'M' here, this module will be called sevguest.
diff --git a/drivers/virt/coco/sevguest/Makefile b/drivers/virt/coco/sevguest/Makefile
new file mode 100644
index 000000000000..b1ffb2b4177b
--- /dev/null
+++ b/drivers/virt/coco/sevguest/Makefile
@@ -0,0 +1,2 @@
+# SPDX-License-Identifier: GPL-2.0-only
+obj-$(CONFIG_SEV_GUEST) += sevguest.o
diff --git a/drivers/virt/coco/sevguest/sevguest.c b/drivers/virt/coco/sevguest/sevguest.c
new file mode 100644
index 000000000000..6dc0785ddd4b
--- /dev/null
+++ b/drivers/virt/coco/sevguest/sevguest.c
@@ -0,0 +1,605 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * AMD Secure Encrypted Virtualization Nested Paging (SEV-SNP) guest request interface
+ *
+ * Copyright (C) 2021 Advanced Micro Devices, Inc.
+ *
+ * Author: Brijesh Singh <brijesh.singh@amd.com>
+ */
+
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/types.h>
+#include <linux/mutex.h>
+#include <linux/io.h>
+#include <linux/platform_device.h>
+#include <linux/miscdevice.h>
+#include <linux/set_memory.h>
+#include <linux/fs.h>
+#include <crypto/aead.h>
+#include <linux/scatterlist.h>
+#include <linux/psp-sev.h>
+#include <uapi/linux/sev-guest.h>
+#include <uapi/linux/psp-sev.h>
+
+#include <asm/svm.h>
+#include <asm/sev.h>
+
+#include "sevguest.h"
+
+#define DEVICE_NAME	"sev-guest"
+#define AAD_LEN		48
+#define MSG_HDR_VER	1
+
+struct snp_guest_crypto {
+	struct crypto_aead *tfm;
+	u8 *iv, *authtag;
+	int iv_len, a_len;
+};
+
+struct snp_guest_dev {
+	struct device *dev;
+	struct miscdevice misc;
+
+	struct snp_guest_crypto *crypto;
+	struct snp_guest_msg *request, *response;
+	struct snp_secrets_page_layout *layout;
+	struct snp_req_data input;
+	u32 *os_area_msg_seqno;
+	u8 *vmpck;
+};
+
+static u32 vmpck_id;
+module_param(vmpck_id, uint, 0444);
+MODULE_PARM_DESC(vmpck_id, "The VMPCK ID to use when communicating with the PSP.");
+
+static DEFINE_MUTEX(snp_cmd_mutex);
+
+static bool is_vmpck_empty(struct snp_guest_dev *snp_dev)
+{
+	char zero_key[VMPCK_KEY_LEN] = {0};
+
+	if (snp_dev->vmpck)
+		return memcmp(snp_dev->vmpck, zero_key, VMPCK_KEY_LEN) == 0;
+
+	return true;
+}
+
+static void snp_disable_vmpck(struct snp_guest_dev *snp_dev)
+{
+	memzero_explicit(snp_dev->vmpck, VMPCK_KEY_LEN);
+	snp_dev->vmpck = NULL;
+}
+
+static inline u64 __snp_get_msg_seqno(struct snp_guest_dev *snp_dev)
+{
+	u64 count;
+
+	lockdep_assert_held(&snp_cmd_mutex);
+
+	/* Read the current message sequence counter from secrets pages */
+	count = *snp_dev->os_area_msg_seqno;
+
+	return count + 1;
+}
+
+/* Return a non-zero on success */
+static u64 snp_get_msg_seqno(struct snp_guest_dev *snp_dev)
+{
+	u64 count = __snp_get_msg_seqno(snp_dev);
+
+	/*
+	 * The message sequence counter for the SNP guest request is a  64-bit
+	 * value but the version 2 of GHCB specification defines a 32-bit storage
+	 * for it. If the counter exceeds the 32-bit value then return zero.
+	 * The caller should check the return value, but if the caller happens to
+	 * not check the value and use it, then the firmware treats zero as an
+	 * invalid number and will fail the  message request.
+	 */
+	if (count >= UINT_MAX) {
+		pr_err_ratelimited("SNP guest request message sequence counter overflow\n");
+		return 0;
+	}
+
+	return count;
+}
+
+static void snp_inc_msg_seqno(struct snp_guest_dev *snp_dev)
+{
+	/*
+	 * The counter is also incremented by the PSP, so increment it by 2
+	 * and save in secrets page.
+	 */
+	*snp_dev->os_area_msg_seqno += 2;
+}
+
+static inline struct snp_guest_dev *to_snp_dev(struct file *file)
+{
+	struct miscdevice *dev = file->private_data;
+
+	return container_of(dev, struct snp_guest_dev, misc);
+}
+
+static struct snp_guest_crypto *init_crypto(struct snp_guest_dev *snp_dev, u8 *key, size_t keylen)
+{
+	struct snp_guest_crypto *crypto;
+
+	crypto = kzalloc(sizeof(*crypto), GFP_KERNEL_ACCOUNT);
+	if (!crypto)
+		return NULL;
+
+	crypto->tfm = crypto_alloc_aead("gcm(aes)", 0, 0);
+	if (IS_ERR(crypto->tfm))
+		goto e_free;
+
+	if (crypto_aead_setkey(crypto->tfm, key, keylen))
+		goto e_free_crypto;
+
+	crypto->iv_len = crypto_aead_ivsize(crypto->tfm);
+	if (crypto->iv_len < 12) {
+		dev_err(snp_dev->dev, "IV length is less than 12.\n");
+		goto e_free_crypto;
+	}
+
+	crypto->iv = kmalloc(crypto->iv_len, GFP_KERNEL_ACCOUNT);
+	if (!crypto->iv)
+		goto e_free_crypto;
+
+	if (crypto_aead_authsize(crypto->tfm) > MAX_AUTHTAG_LEN) {
+		if (crypto_aead_setauthsize(crypto->tfm, MAX_AUTHTAG_LEN)) {
+			dev_err(snp_dev->dev, "failed to set authsize to %d\n", MAX_AUTHTAG_LEN);
+			goto e_free_crypto;
+		}
+	}
+
+	crypto->a_len = crypto_aead_authsize(crypto->tfm);
+	crypto->authtag = kmalloc(crypto->a_len, GFP_KERNEL_ACCOUNT);
+	if (!crypto->authtag)
+		goto e_free_crypto;
+
+	return crypto;
+
+e_free_crypto:
+	crypto_free_aead(crypto->tfm);
+e_free:
+	kfree(crypto->iv);
+	kfree(crypto->authtag);
+	kfree(crypto);
+
+	return NULL;
+}
+
+static void deinit_crypto(struct snp_guest_crypto *crypto)
+{
+	crypto_free_aead(crypto->tfm);
+	kfree(crypto->iv);
+	kfree(crypto->authtag);
+	kfree(crypto);
+}
+
+static int enc_dec_message(struct snp_guest_crypto *crypto, struct snp_guest_msg *msg,
+			   u8 *src_buf, u8 *dst_buf, size_t len, bool enc)
+{
+	struct snp_guest_msg_hdr *hdr = &msg->hdr;
+	struct scatterlist src[3], dst[3];
+	DECLARE_CRYPTO_WAIT(wait);
+	struct aead_request *req;
+	int ret;
+
+	req = aead_request_alloc(crypto->tfm, GFP_KERNEL);
+	if (!req)
+		return -ENOMEM;
+
+	/*
+	 * AEAD memory operations:
+	 * +------ AAD -------+------- DATA -----+---- AUTHTAG----+
+	 * |  msg header      |  plaintext       |  hdr->authtag  |
+	 * | bytes 30h - 5Fh  |    or            |                |
+	 * |                  |   cipher         |                |
+	 * +------------------+------------------+----------------+
+	 */
+	sg_init_table(src, 3);
+	sg_set_buf(&src[0], &hdr->algo, AAD_LEN);
+	sg_set_buf(&src[1], src_buf, hdr->msg_sz);
+	sg_set_buf(&src[2], hdr->authtag, crypto->a_len);
+
+	sg_init_table(dst, 3);
+	sg_set_buf(&dst[0], &hdr->algo, AAD_LEN);
+	sg_set_buf(&dst[1], dst_buf, hdr->msg_sz);
+	sg_set_buf(&dst[2], hdr->authtag, crypto->a_len);
+
+	aead_request_set_ad(req, AAD_LEN);
+	aead_request_set_tfm(req, crypto->tfm);
+	aead_request_set_callback(req, 0, crypto_req_done, &wait);
+
+	aead_request_set_crypt(req, src, dst, len, crypto->iv);
+	ret = crypto_wait_req(enc ? crypto_aead_encrypt(req) : crypto_aead_decrypt(req), &wait);
+
+	aead_request_free(req);
+	return ret;
+}
+
+static int __enc_payload(struct snp_guest_dev *snp_dev, struct snp_guest_msg *msg,
+			 void *plaintext, size_t len)
+{
+	struct snp_guest_crypto *crypto = snp_dev->crypto;
+	struct snp_guest_msg_hdr *hdr = &msg->hdr;
+
+	memset(crypto->iv, 0, crypto->iv_len);
+	memcpy(crypto->iv, &hdr->msg_seqno, sizeof(hdr->msg_seqno));
+
+	return enc_dec_message(crypto, msg, plaintext, msg->payload, len, true);
+}
+
+static int dec_payload(struct snp_guest_dev *snp_dev, struct snp_guest_msg *msg,
+		       void *plaintext, size_t len)
+{
+	struct snp_guest_crypto *crypto = snp_dev->crypto;
+	struct snp_guest_msg_hdr *hdr = &msg->hdr;
+
+	/* Build IV with response buffer sequence number */
+	memset(crypto->iv, 0, crypto->iv_len);
+	memcpy(crypto->iv, &hdr->msg_seqno, sizeof(hdr->msg_seqno));
+
+	return enc_dec_message(crypto, msg, msg->payload, plaintext, len, false);
+}
+
+static int verify_and_dec_payload(struct snp_guest_dev *snp_dev, void *payload, u32 sz)
+{
+	struct snp_guest_crypto *crypto = snp_dev->crypto;
+	struct snp_guest_msg *resp = snp_dev->response;
+	struct snp_guest_msg *req = snp_dev->request;
+	struct snp_guest_msg_hdr *req_hdr = &req->hdr;
+	struct snp_guest_msg_hdr *resp_hdr = &resp->hdr;
+
+	dev_dbg(snp_dev->dev, "response [seqno %lld type %d version %d sz %d]\n",
+		resp_hdr->msg_seqno, resp_hdr->msg_type, resp_hdr->msg_version, resp_hdr->msg_sz);
+
+	/* Verify that the sequence counter is incremented by 1 */
+	if (unlikely(resp_hdr->msg_seqno != (req_hdr->msg_seqno + 1)))
+		return -EBADMSG;
+
+	/* Verify response message type and version number. */
+	if (resp_hdr->msg_type != (req_hdr->msg_type + 1) ||
+	    resp_hdr->msg_version != req_hdr->msg_version)
+		return -EBADMSG;
+
+	/*
+	 * If the message size is greater than our buffer length then return
+	 * an error.
+	 */
+	if (unlikely((resp_hdr->msg_sz + crypto->a_len) > sz))
+		return -EBADMSG;
+
+	/* Decrypt the payload */
+	return dec_payload(snp_dev, resp, payload, resp_hdr->msg_sz + crypto->a_len);
+}
+
+static bool enc_payload(struct snp_guest_dev *snp_dev, u64 seqno, int version, u8 type,
+			void *payload, size_t sz)
+{
+	struct snp_guest_msg *req = snp_dev->request;
+	struct snp_guest_msg_hdr *hdr = &req->hdr;
+
+	memset(req, 0, sizeof(*req));
+
+	hdr->algo = SNP_AEAD_AES_256_GCM;
+	hdr->hdr_version = MSG_HDR_VER;
+	hdr->hdr_sz = sizeof(*hdr);
+	hdr->msg_type = type;
+	hdr->msg_version = version;
+	hdr->msg_seqno = seqno;
+	hdr->msg_vmpck = vmpck_id;
+	hdr->msg_sz = sz;
+
+	/* Verify the sequence number is non-zero */
+	if (!hdr->msg_seqno)
+		return -ENOSR;
+
+	dev_dbg(snp_dev->dev, "request [seqno %lld type %d version %d sz %d]\n",
+		hdr->msg_seqno, hdr->msg_type, hdr->msg_version, hdr->msg_sz);
+
+	return __enc_payload(snp_dev, req, payload, sz);
+}
+
+static int handle_guest_request(struct snp_guest_dev *snp_dev, u64 exit_code, int msg_ver,
+				u8 type, void *req_buf, size_t req_sz, void *resp_buf,
+				u32 resp_sz, __u64 *fw_err)
+{
+	unsigned long err;
+	u64 seqno;
+	int rc;
+
+	/* Get message sequence and verify that its a non-zero */
+	seqno = snp_get_msg_seqno(snp_dev);
+	if (!seqno)
+		return -EIO;
+
+	memset(snp_dev->response, 0, sizeof(*snp_dev->response));
+
+	/* Encrypt the userspace provided payload */
+	rc = enc_payload(snp_dev, seqno, msg_ver, type, req_buf, req_sz);
+	if (rc)
+		return rc;
+
+	/* Call firmware to process the request */
+	rc = snp_issue_guest_request(exit_code, &snp_dev->input, &err);
+	if (fw_err)
+		*fw_err = err;
+
+	if (rc)
+		return rc;
+
+	rc = verify_and_dec_payload(snp_dev, resp_buf, resp_sz);
+	if (rc) {
+		/*
+		 * The verify_and_dec_payload() will fail only if the hypervisor is
+		 * actively modifying the message header or corrupting the encrypted payload.
+		 * This hints that hypervisor is acting in a bad faith. Disable the VMPCK so that
+		 * the key cannot be used for any communication. The key is disabled to ensure
+		 * that AES-GCM does not use the same IV while encrypting the request payload.
+		 */
+		dev_alert(snp_dev->dev,
+			  "Detected unexpected decode failure, disabling the vmpck_id %d\n",
+			  vmpck_id);
+		snp_disable_vmpck(snp_dev);
+		return rc;
+	}
+
+	/* Increment to new message sequence after payload decryption was successful. */
+	snp_inc_msg_seqno(snp_dev);
+
+	return 0;
+}
+
+static int get_report(struct snp_guest_dev *snp_dev, struct snp_guest_request_ioctl *arg)
+{
+	struct snp_guest_crypto *crypto = snp_dev->crypto;
+	struct snp_report_req req = {0};
+	struct snp_report_resp *resp;
+	int rc, resp_len;
+
+	if (!arg->req_data || !arg->resp_data)
+		return -EINVAL;
+
+	/* Copy the request payload from userspace */
+	if (copy_from_user(&req, (void __user *)arg->req_data, sizeof(req)))
+		return -EFAULT;
+
+	/*
+	 * The intermediate response buffer is used while decrypting the
+	 * response payload. Make sure that it has enough space to cover the
+	 * authtag.
+	 */
+	resp_len = sizeof(resp->data) + crypto->a_len;
+	resp = kzalloc(resp_len, GFP_KERNEL_ACCOUNT);
+	if (!resp)
+		return -ENOMEM;
+
+	/* Issue the command to get the attestation report */
+	rc = handle_guest_request(snp_dev, SVM_VMGEXIT_GUEST_REQUEST, arg->msg_version,
+				  SNP_MSG_REPORT_REQ, &req, sizeof(req), resp->data,
+				  resp_len, &arg->fw_err);
+	if (rc)
+		goto e_free;
+
+	/* Copy the response payload to userspace */
+	if (copy_to_user((void __user *)arg->resp_data, resp, sizeof(*resp)))
+		rc = -EFAULT;
+
+e_free:
+	kfree(resp);
+	return rc;
+}
+
+static long snp_guest_ioctl(struct file *file, unsigned int ioctl, unsigned long arg)
+{
+	struct snp_guest_dev *snp_dev = to_snp_dev(file);
+	void __user *argp = (void __user *)arg;
+	struct snp_guest_request_ioctl input;
+	int ret = -ENOTTY;
+
+	if (copy_from_user(&input, argp, sizeof(input)))
+		return -EFAULT;
+
+	input.fw_err = 0xff;
+
+	/* Message version must be non-zero */
+	if (!input.msg_version)
+		return -EINVAL;
+
+	mutex_lock(&snp_cmd_mutex);
+
+	/* Check if the VMPCK is not empty */
+	if (is_vmpck_empty(snp_dev)) {
+		dev_err_ratelimited(snp_dev->dev, "VMPCK is disabled\n");
+		mutex_unlock(&snp_cmd_mutex);
+		return -ENOTTY;
+	}
+
+	switch (ioctl) {
+	case SNP_GET_REPORT:
+		ret = get_report(snp_dev, &input);
+		break;
+	default:
+		break;
+	}
+
+	mutex_unlock(&snp_cmd_mutex);
+
+	if (input.fw_err && copy_to_user(argp, &input, sizeof(input)))
+		return -EFAULT;
+
+	return ret;
+}
+
+static void free_shared_pages(void *buf, size_t sz)
+{
+	unsigned int npages = PAGE_ALIGN(sz) >> PAGE_SHIFT;
+
+	if (!buf)
+		return;
+
+	/* If fail to restore the encryption mask then leak it. */
+	if (WARN_ONCE(set_memory_encrypted((unsigned long)buf, npages),
+		      "Failed to restore encryption mask (leak it)\n"))
+		return;
+
+	__free_pages(virt_to_page(buf), get_order(sz));
+}
+
+static void *alloc_shared_pages(size_t sz)
+{
+	unsigned int npages = PAGE_ALIGN(sz) >> PAGE_SHIFT;
+	struct page *page;
+	int ret;
+
+	page = alloc_pages(GFP_KERNEL_ACCOUNT, get_order(sz));
+	if (IS_ERR(page))
+		return NULL;
+
+	ret = set_memory_decrypted((unsigned long)page_address(page), npages);
+	if (ret) {
+		pr_err("SEV-SNP: failed to mark page shared, ret=%d\n", ret);
+		__free_pages(page, get_order(sz));
+		return NULL;
+	}
+
+	return page_address(page);
+}
+
+static const struct file_operations snp_guest_fops = {
+	.owner	= THIS_MODULE,
+	.unlocked_ioctl = snp_guest_ioctl,
+};
+
+static u8 *get_vmpck(int id, struct snp_secrets_page_layout *layout, u32 **seqno)
+{
+	u8 *key = NULL;
+
+	switch (id) {
+	case 0:
+		*seqno = &layout->os_area.msg_seqno_0;
+		key = layout->vmpck0;
+		break;
+	case 1:
+		*seqno = &layout->os_area.msg_seqno_1;
+		key = layout->vmpck1;
+		break;
+	case 2:
+		*seqno = &layout->os_area.msg_seqno_2;
+		key = layout->vmpck2;
+		break;
+	case 3:
+		*seqno = &layout->os_area.msg_seqno_3;
+		key = layout->vmpck3;
+		break;
+	default:
+		break;
+	}
+
+	return key;
+}
+
+static int __init snp_guest_probe(struct platform_device *pdev)
+{
+	struct snp_secrets_page_layout *layout;
+	struct snp_guest_platform_data *data;
+	struct device *dev = &pdev->dev;
+	struct snp_guest_dev *snp_dev;
+	struct miscdevice *misc;
+	int ret;
+
+	if (!dev->platform_data)
+		return -ENODEV;
+
+	data = (struct snp_guest_platform_data *)dev->platform_data;
+	layout = (__force void *)ioremap_encrypted(data->secrets_gpa, PAGE_SIZE);
+	if (!layout)
+		return -ENODEV;
+
+	ret = -ENOMEM;
+	snp_dev = devm_kzalloc(&pdev->dev, sizeof(struct snp_guest_dev), GFP_KERNEL);
+	if (!snp_dev)
+		goto e_fail;
+
+	ret = -EINVAL;
+	snp_dev->vmpck = get_vmpck(vmpck_id, layout, &snp_dev->os_area_msg_seqno);
+	if (!snp_dev->vmpck) {
+		dev_err(dev, "invalid vmpck id %d\n", vmpck_id);
+		goto e_fail;
+	}
+
+	/* Verify that VMPCK is not zero. */
+	if (is_vmpck_empty(snp_dev)) {
+		dev_err(dev, "vmpck id %d is null\n", vmpck_id);
+		goto e_fail;
+	}
+
+	platform_set_drvdata(pdev, snp_dev);
+	snp_dev->dev = dev;
+	snp_dev->layout = layout;
+
+	/* Allocate the shared page used for the request and response message. */
+	snp_dev->request = alloc_shared_pages(sizeof(struct snp_guest_msg));
+	if (!snp_dev->request)
+		goto e_fail;
+
+	snp_dev->response = alloc_shared_pages(sizeof(struct snp_guest_msg));
+	if (!snp_dev->response)
+		goto e_fail;
+
+	ret = -EIO;
+	snp_dev->crypto = init_crypto(snp_dev, snp_dev->vmpck, VMPCK_KEY_LEN);
+	if (!snp_dev->crypto)
+		goto e_fail;
+
+	misc = &snp_dev->misc;
+	misc->minor = MISC_DYNAMIC_MINOR;
+	misc->name = DEVICE_NAME;
+	misc->fops = &snp_guest_fops;
+
+	/* initial the input address for guest request */
+	snp_dev->input.req_gpa = __pa(snp_dev->request);
+	snp_dev->input.resp_gpa = __pa(snp_dev->response);
+
+	ret =  misc_register(misc);
+	if (ret)
+		goto e_fail;
+
+	dev_info(dev, "Initialized SEV-SNP guest driver (using vmpck_id %d)\n", vmpck_id);
+	return 0;
+
+e_fail:
+	iounmap(layout);
+	free_shared_pages(snp_dev->request, sizeof(struct snp_guest_msg));
+	free_shared_pages(snp_dev->response, sizeof(struct snp_guest_msg));
+
+	return ret;
+}
+
+static int __exit snp_guest_remove(struct platform_device *pdev)
+{
+	struct snp_guest_dev *snp_dev = platform_get_drvdata(pdev);
+
+	free_shared_pages(snp_dev->request, sizeof(struct snp_guest_msg));
+	free_shared_pages(snp_dev->response, sizeof(struct snp_guest_msg));
+	deinit_crypto(snp_dev->crypto);
+	misc_deregister(&snp_dev->misc);
+
+	return 0;
+}
+
+static struct platform_driver snp_guest_driver = {
+	.remove		= __exit_p(snp_guest_remove),
+	.driver		= {
+		.name = "snp-guest",
+	},
+};
+
+module_platform_driver_probe(snp_guest_driver, snp_guest_probe);
+
+MODULE_AUTHOR("Brijesh Singh <brijesh.singh@amd.com>");
+MODULE_LICENSE("GPL");
+MODULE_VERSION("1.0.0");
+MODULE_DESCRIPTION("AMD SNP Guest Driver");
diff --git a/drivers/virt/coco/sevguest/sevguest.h b/drivers/virt/coco/sevguest/sevguest.h
new file mode 100644
index 000000000000..cfa76cf8a21a
--- /dev/null
+++ b/drivers/virt/coco/sevguest/sevguest.h
@@ -0,0 +1,98 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2021 Advanced Micro Devices, Inc.
+ *
+ * Author: Brijesh Singh <brijesh.singh@amd.com>
+ *
+ * SEV-SNP API spec is available at https://developer.amd.com/sev
+ */
+
+#ifndef __LINUX_SEVGUEST_H_
+#define __LINUX_SEVGUEST_H_
+
+#include <linux/types.h>
+
+#define MAX_AUTHTAG_LEN		32
+
+/* See SNP spec SNP_GUEST_REQUEST section for the structure */
+enum msg_type {
+	SNP_MSG_TYPE_INVALID = 0,
+	SNP_MSG_CPUID_REQ,
+	SNP_MSG_CPUID_RSP,
+	SNP_MSG_KEY_REQ,
+	SNP_MSG_KEY_RSP,
+	SNP_MSG_REPORT_REQ,
+	SNP_MSG_REPORT_RSP,
+	SNP_MSG_EXPORT_REQ,
+	SNP_MSG_EXPORT_RSP,
+	SNP_MSG_IMPORT_REQ,
+	SNP_MSG_IMPORT_RSP,
+	SNP_MSG_ABSORB_REQ,
+	SNP_MSG_ABSORB_RSP,
+	SNP_MSG_VMRK_REQ,
+	SNP_MSG_VMRK_RSP,
+
+	SNP_MSG_TYPE_MAX
+};
+
+enum aead_algo {
+	SNP_AEAD_INVALID,
+	SNP_AEAD_AES_256_GCM,
+};
+
+struct snp_guest_msg_hdr {
+	u8 authtag[MAX_AUTHTAG_LEN];
+	u64 msg_seqno;
+	u8 rsvd1[8];
+	u8 algo;
+	u8 hdr_version;
+	u16 hdr_sz;
+	u8 msg_type;
+	u8 msg_version;
+	u16 msg_sz;
+	u32 rsvd2;
+	u8 msg_vmpck;
+	u8 rsvd3[35];
+} __packed;
+
+struct snp_guest_msg {
+	struct snp_guest_msg_hdr hdr;
+	u8 payload[4000];
+} __packed;
+
+/*
+ * The secrets page contains 96-bytes of reserved field that can be used by
+ * the guest OS. The guest OS uses the area to save the message sequence
+ * number for each VMPCK.
+ *
+ * See the GHCB spec section Secret page layout for the format for this area.
+ */
+struct secrets_os_area {
+	u32 msg_seqno_0;
+	u32 msg_seqno_1;
+	u32 msg_seqno_2;
+	u32 msg_seqno_3;
+	u64 ap_jump_table_pa;
+	u8 rsvd[40];
+	u8 guest_usage[32];
+} __packed;
+
+#define VMPCK_KEY_LEN		32
+
+/* See the SNP spec version 0.9 for secrets page format */
+struct snp_secrets_page_layout {
+	u32 version;
+	u32 imien	: 1,
+	    rsvd1	: 31;
+	u32 fms;
+	u32 rsvd2;
+	u8 gosvw[16];
+	u8 vmpck0[VMPCK_KEY_LEN];
+	u8 vmpck1[VMPCK_KEY_LEN];
+	u8 vmpck2[VMPCK_KEY_LEN];
+	u8 vmpck3[VMPCK_KEY_LEN];
+	struct secrets_os_area os_area;
+	u8 rsvd3[3840];
+} __packed;
+
+#endif /* __LINUX_SNP_GUEST_H__ */
diff --git a/include/uapi/linux/sev-guest.h b/include/uapi/linux/sev-guest.h
new file mode 100644
index 000000000000..081d314a6279
--- /dev/null
+++ b/include/uapi/linux/sev-guest.h
@@ -0,0 +1,50 @@
+/* SPDX-License-Identifier: GPL-2.0-only WITH Linux-syscall-note */
+/*
+ * Userspace interface for AMD SEV and SEV-SNP guest driver.
+ *
+ * Copyright (C) 2021 Advanced Micro Devices, Inc.
+ *
+ * Author: Brijesh Singh <brijesh.singh@amd.com>
+ *
+ * SEV API specification is available at: https://developer.amd.com/sev/
+ */
+
+#ifndef __UAPI_LINUX_SEV_GUEST_H_
+#define __UAPI_LINUX_SEV_GUEST_H_
+
+#include <linux/types.h>
+
+struct snp_report_req {
+	/* user data that should be included in the report */
+	__u8 user_data[64];
+
+	/* The vmpl level to be included in the report */
+	__u32 vmpl;
+
+	/* Must be zero filled */
+	__u8 rsvd[28];
+};
+
+struct snp_report_resp {
+	/* response data, see SEV-SNP spec for the format */
+	__u8 data[4000];
+};
+
+struct snp_guest_request_ioctl {
+	/* message version number (must be non-zero) */
+	__u8 msg_version;
+
+	/* Request and response structure address */
+	__u64 req_data;
+	__u64 resp_data;
+
+	/* firmware error code on failure (see psp-sev.h) */
+	__u64 fw_err;
+};
+
+#define SNP_GUEST_REQ_IOC_TYPE	'S'
+
+/* Get SNP attestation report */
+#define SNP_GET_REPORT _IOWR(SNP_GUEST_REQ_IOC_TYPE, 0x0, struct snp_guest_request_ioctl)
+
+#endif /* __UAPI_LINUX_SEV_GUEST_H_ */
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v9 42/43] virt: sevguest: Add support to derive key
  2022-01-28 17:17 [PATCH v9 00/43] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (40 preceding siblings ...)
  2022-01-28 17:18 ` [PATCH v9 41/43] virt: Add SEV-SNP guest driver Brijesh Singh
@ 2022-01-28 17:18 ` Brijesh Singh
  2022-02-01 20:39   ` Peter Gonda
  2022-02-07  8:52   ` Borislav Petkov
  2022-01-28 17:18 ` [PATCH v9 43/43] virt: sevguest: Add support to get extended report Brijesh Singh
  42 siblings, 2 replies; 115+ messages in thread
From: Brijesh Singh @ 2022-01-28 17:18 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh, Liam Merwick

The SNP_GET_DERIVED_KEY ioctl interface can be used by the SNP guest to
ask the firmware to provide a key derived from a root key. The derived
key may be used by the guest for any purposes it chooses, such as a
sealing key or communicating with the external entities.

See SEV-SNP firmware spec for more information.

Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 Documentation/virt/coco/sevguest.rst  | 17 ++++++++++
 drivers/virt/coco/sevguest/sevguest.c | 45 +++++++++++++++++++++++++++
 include/uapi/linux/sev-guest.h        | 17 ++++++++++
 3 files changed, 79 insertions(+)

diff --git a/Documentation/virt/coco/sevguest.rst b/Documentation/virt/coco/sevguest.rst
index 47ef3b0821d5..aafc9bce9aef 100644
--- a/Documentation/virt/coco/sevguest.rst
+++ b/Documentation/virt/coco/sevguest.rst
@@ -72,6 +72,23 @@ On success, the snp_report_resp.data will contains the report. The report
 contain the format described in the SEV-SNP specification. See the SEV-SNP
 specification for further details.
 
+2.2 SNP_GET_DERIVED_KEY
+-----------------------
+:Technology: sev-snp
+:Type: guest ioctl
+:Parameters (in): struct snp_derived_key_req
+:Returns (out): struct snp_derived_key_resp on success, -negative on error
+
+The SNP_GET_DERIVED_KEY ioctl can be used to get a key derive from a root key.
+The derived key can be used by the guest for any purpose, such as sealing keys
+or communicating with external entities.
+
+The ioctl uses the SNP_GUEST_REQUEST (MSG_KEY_REQ) command provided by the
+SEV-SNP firmware to derive the key. See SEV-SNP specification for further details
+on the various fields passed in the key derivation request.
+
+On success, the snp_derived_key_resp.data contains the derived key value. See
+the SEV-SNP specification for further details.
 
 Reference
 ---------
diff --git a/drivers/virt/coco/sevguest/sevguest.c b/drivers/virt/coco/sevguest/sevguest.c
index 6dc0785ddd4b..4369e55df9a6 100644
--- a/drivers/virt/coco/sevguest/sevguest.c
+++ b/drivers/virt/coco/sevguest/sevguest.c
@@ -392,6 +392,48 @@ static int get_report(struct snp_guest_dev *snp_dev, struct snp_guest_request_io
 	return rc;
 }
 
+static int get_derived_key(struct snp_guest_dev *snp_dev, struct snp_guest_request_ioctl *arg)
+{
+	struct snp_guest_crypto *crypto = snp_dev->crypto;
+	struct snp_derived_key_resp resp = {0};
+	struct snp_derived_key_req req = {0};
+	int rc, resp_len;
+	u8 buf[64+16]; /* Response data is 64 bytes and max authsize for GCM is 16 bytes */
+
+	if (!arg->req_data || !arg->resp_data)
+		return -EINVAL;
+
+	/* Copy the request payload from userspace */
+	if (copy_from_user(&req, (void __user *)arg->req_data, sizeof(req)))
+		return -EFAULT;
+
+	/*
+	 * The intermediate response buffer is used while decrypting the
+	 * response payload. Make sure that it has enough space to cover the
+	 * authtag.
+	 */
+	resp_len = sizeof(resp.data) + crypto->a_len;
+	if (sizeof(buf) < resp_len)
+		return -ENOMEM;
+
+	/* Issue the command to get the attestation report */
+	rc = handle_guest_request(snp_dev, SVM_VMGEXIT_GUEST_REQUEST, arg->msg_version,
+				  SNP_MSG_KEY_REQ, &req, sizeof(req), buf, resp_len,
+				  &arg->fw_err);
+	if (rc)
+		goto e_free;
+
+	/* Copy the response payload to userspace */
+	memcpy(resp.data, buf, sizeof(resp.data));
+	if (copy_to_user((void __user *)arg->resp_data, &resp, sizeof(resp)))
+		rc = -EFAULT;
+
+e_free:
+	memzero_explicit(buf, sizeof(buf));
+	memzero_explicit(&resp, sizeof(resp));
+	return rc;
+}
+
 static long snp_guest_ioctl(struct file *file, unsigned int ioctl, unsigned long arg)
 {
 	struct snp_guest_dev *snp_dev = to_snp_dev(file);
@@ -421,6 +463,9 @@ static long snp_guest_ioctl(struct file *file, unsigned int ioctl, unsigned long
 	case SNP_GET_REPORT:
 		ret = get_report(snp_dev, &input);
 		break;
+	case SNP_GET_DERIVED_KEY:
+		ret = get_derived_key(snp_dev, &input);
+		break;
 	default:
 		break;
 	}
diff --git a/include/uapi/linux/sev-guest.h b/include/uapi/linux/sev-guest.h
index 081d314a6279..bcd00a6d4501 100644
--- a/include/uapi/linux/sev-guest.h
+++ b/include/uapi/linux/sev-guest.h
@@ -30,6 +30,20 @@ struct snp_report_resp {
 	__u8 data[4000];
 };
 
+struct snp_derived_key_req {
+	__u32 root_key_select;
+	__u32 rsvd;
+	__u64 guest_field_select;
+	__u32 vmpl;
+	__u32 guest_svn;
+	__u64 tcb_version;
+};
+
+struct snp_derived_key_resp {
+	/* response data, see SEV-SNP spec for the format */
+	__u8 data[64];
+};
+
 struct snp_guest_request_ioctl {
 	/* message version number (must be non-zero) */
 	__u8 msg_version;
@@ -47,4 +61,7 @@ struct snp_guest_request_ioctl {
 /* Get SNP attestation report */
 #define SNP_GET_REPORT _IOWR(SNP_GUEST_REQ_IOC_TYPE, 0x0, struct snp_guest_request_ioctl)
 
+/* Get a derived key from the root */
+#define SNP_GET_DERIVED_KEY _IOWR(SNP_GUEST_REQ_IOC_TYPE, 0x1, struct snp_guest_request_ioctl)
+
 #endif /* __UAPI_LINUX_SEV_GUEST_H_ */
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 115+ messages in thread

* [PATCH v9 43/43] virt: sevguest: Add support to get extended report
  2022-01-28 17:17 [PATCH v9 00/43] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
                   ` (41 preceding siblings ...)
  2022-01-28 17:18 ` [PATCH v9 42/43] virt: sevguest: Add support to derive key Brijesh Singh
@ 2022-01-28 17:18 ` Brijesh Singh
  2022-02-01 20:43   ` Peter Gonda
  2022-02-07  9:16   ` Borislav Petkov
  42 siblings, 2 replies; 115+ messages in thread
From: Brijesh Singh @ 2022-01-28 17:18 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm
  Cc: Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Brijesh Singh

Version 2 of GHCB specification defines Non-Automatic-Exit(NAE) to get
the extended guest report. It is similar to the SNP_GET_REPORT ioctl.
The main difference is related to the additional data that will be
returned. The additional data returned is a certificate blob that can
be used by the SNP guest user. The certificate blob layout is defined
in the GHCB specification. The driver simply treats the blob as a opaque
data and copies it to userspace.

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 Documentation/virt/coco/sevguest.rst  | 23 +++++++
 drivers/virt/coco/sevguest/sevguest.c | 89 +++++++++++++++++++++++++++
 include/uapi/linux/sev-guest.h        | 13 ++++
 3 files changed, 125 insertions(+)

diff --git a/Documentation/virt/coco/sevguest.rst b/Documentation/virt/coco/sevguest.rst
index aafc9bce9aef..b9fe20e92d06 100644
--- a/Documentation/virt/coco/sevguest.rst
+++ b/Documentation/virt/coco/sevguest.rst
@@ -90,6 +90,29 @@ on the various fields passed in the key derivation request.
 On success, the snp_derived_key_resp.data contains the derived key value. See
 the SEV-SNP specification for further details.
 
+
+2.3 SNP_GET_EXT_REPORT
+----------------------
+:Technology: sev-snp
+:Type: guest ioctl
+:Parameters (in/out): struct snp_ext_report_req
+:Returns (out): struct snp_report_resp on success, -negative on error
+
+The SNP_GET_EXT_REPORT ioctl is similar to the SNP_GET_REPORT. The difference is
+related to the additional certificate data that is returned with the report.
+The certificate data returned is being provided by the hypervisor through the
+SNP_SET_EXT_CONFIG.
+
+The ioctl uses the SNP_GUEST_REQUEST (MSG_REPORT_REQ) command provided by the SEV-SNP
+firmware to get the attestation report.
+
+On success, the snp_ext_report_resp.data will contain the attestation report
+and snp_ext_report_req.certs_address will contain the certificate blob. If the
+length of the blob is smaller than expected then snp_ext_report_req.certs_len will
+be updated with the expected value.
+
+See GHCB specification for further detail on how to parse the certificate blob.
+
 Reference
 ---------
 
diff --git a/drivers/virt/coco/sevguest/sevguest.c b/drivers/virt/coco/sevguest/sevguest.c
index 4369e55df9a6..a53854353944 100644
--- a/drivers/virt/coco/sevguest/sevguest.c
+++ b/drivers/virt/coco/sevguest/sevguest.c
@@ -41,6 +41,7 @@ struct snp_guest_dev {
 	struct device *dev;
 	struct miscdevice misc;
 
+	void *certs_data;
 	struct snp_guest_crypto *crypto;
 	struct snp_guest_msg *request, *response;
 	struct snp_secrets_page_layout *layout;
@@ -434,6 +435,84 @@ static int get_derived_key(struct snp_guest_dev *snp_dev, struct snp_guest_reque
 	return rc;
 }
 
+static int get_ext_report(struct snp_guest_dev *snp_dev, struct snp_guest_request_ioctl *arg)
+{
+	struct snp_guest_crypto *crypto = snp_dev->crypto;
+	struct snp_ext_report_req req = {0};
+	struct snp_report_resp *resp;
+	int ret, npages = 0, resp_len;
+
+	if (!arg->req_data || !arg->resp_data)
+		return -EINVAL;
+
+	/* Copy the request payload from userspace */
+	if (copy_from_user(&req, (void __user *)arg->req_data, sizeof(req)))
+		return -EFAULT;
+
+	if (req.certs_len) {
+		if (req.certs_len > SEV_FW_BLOB_MAX_SIZE ||
+		    !IS_ALIGNED(req.certs_len, PAGE_SIZE))
+			return -EINVAL;
+	}
+
+	if (req.certs_address && req.certs_len) {
+		if (!access_ok(req.certs_address, req.certs_len))
+			return -EFAULT;
+
+		/*
+		 * Initialize the intermediate buffer with all zero's. This buffer
+		 * is used in the guest request message to get the certs blob from
+		 * the host. If host does not supply any certs in it, then copy
+		 * zeros to indicate that certificate data was not provided.
+		 */
+		memset(snp_dev->certs_data, 0, req.certs_len);
+
+		npages = req.certs_len >> PAGE_SHIFT;
+	}
+
+	/*
+	 * The intermediate response buffer is used while decrypting the
+	 * response payload. Make sure that it has enough space to cover the
+	 * authtag.
+	 */
+	resp_len = sizeof(resp->data) + crypto->a_len;
+	resp = kzalloc(resp_len, GFP_KERNEL_ACCOUNT);
+	if (!resp)
+		return -ENOMEM;
+
+	snp_dev->input.data_npages = npages;
+	ret = handle_guest_request(snp_dev, SVM_VMGEXIT_EXT_GUEST_REQUEST, arg->msg_version,
+				   SNP_MSG_REPORT_REQ, &req.data,
+				   sizeof(req.data), resp->data, resp_len, &arg->fw_err);
+
+	/* If certs length is invalid then copy the returned length */
+	if (arg->fw_err == SNP_GUEST_REQ_INVALID_LEN) {
+		req.certs_len = snp_dev->input.data_npages << PAGE_SHIFT;
+
+		if (copy_to_user((void __user *)arg->req_data, &req, sizeof(req)))
+			ret = -EFAULT;
+	}
+
+	if (ret)
+		goto e_free;
+
+	/* Copy the certificate data blob to userspace */
+	if (req.certs_address && req.certs_len &&
+	    copy_to_user((void __user *)req.certs_address, snp_dev->certs_data,
+			 req.certs_len)) {
+		ret = -EFAULT;
+		goto e_free;
+	}
+
+	/* Copy the response payload to userspace */
+	if (copy_to_user((void __user *)arg->resp_data, resp, sizeof(*resp)))
+		ret = -EFAULT;
+
+e_free:
+	kfree(resp);
+	return ret;
+}
+
 static long snp_guest_ioctl(struct file *file, unsigned int ioctl, unsigned long arg)
 {
 	struct snp_guest_dev *snp_dev = to_snp_dev(file);
@@ -466,6 +545,9 @@ static long snp_guest_ioctl(struct file *file, unsigned int ioctl, unsigned long
 	case SNP_GET_DERIVED_KEY:
 		ret = get_derived_key(snp_dev, &input);
 		break;
+	case SNP_GET_EXT_REPORT:
+		ret = get_ext_report(snp_dev, &input);
+		break;
 	default:
 		break;
 	}
@@ -594,6 +676,10 @@ static int __init snp_guest_probe(struct platform_device *pdev)
 	if (!snp_dev->response)
 		goto e_fail;
 
+	snp_dev->certs_data = alloc_shared_pages(SEV_FW_BLOB_MAX_SIZE);
+	if (!snp_dev->certs_data)
+		goto e_fail;
+
 	ret = -EIO;
 	snp_dev->crypto = init_crypto(snp_dev, snp_dev->vmpck, VMPCK_KEY_LEN);
 	if (!snp_dev->crypto)
@@ -607,6 +693,7 @@ static int __init snp_guest_probe(struct platform_device *pdev)
 	/* initial the input address for guest request */
 	snp_dev->input.req_gpa = __pa(snp_dev->request);
 	snp_dev->input.resp_gpa = __pa(snp_dev->response);
+	snp_dev->input.data_gpa = __pa(snp_dev->certs_data);
 
 	ret =  misc_register(misc);
 	if (ret)
@@ -617,6 +704,7 @@ static int __init snp_guest_probe(struct platform_device *pdev)
 
 e_fail:
 	iounmap(layout);
+	free_shared_pages(snp_dev->certs_data, SEV_FW_BLOB_MAX_SIZE);
 	free_shared_pages(snp_dev->request, sizeof(struct snp_guest_msg));
 	free_shared_pages(snp_dev->response, sizeof(struct snp_guest_msg));
 
@@ -629,6 +717,7 @@ static int __exit snp_guest_remove(struct platform_device *pdev)
 
 	free_shared_pages(snp_dev->request, sizeof(struct snp_guest_msg));
 	free_shared_pages(snp_dev->response, sizeof(struct snp_guest_msg));
+	free_shared_pages(snp_dev->certs_data, SEV_FW_BLOB_MAX_SIZE);
 	deinit_crypto(snp_dev->crypto);
 	misc_deregister(&snp_dev->misc);
 
diff --git a/include/uapi/linux/sev-guest.h b/include/uapi/linux/sev-guest.h
index bcd00a6d4501..0a47b6627c78 100644
--- a/include/uapi/linux/sev-guest.h
+++ b/include/uapi/linux/sev-guest.h
@@ -56,6 +56,16 @@ struct snp_guest_request_ioctl {
 	__u64 fw_err;
 };
 
+struct snp_ext_report_req {
+	struct snp_report_req data;
+
+	/* where to copy the certificate blob */
+	__u64 certs_address;
+
+	/* length of the certificate blob */
+	__u32 certs_len;
+};
+
 #define SNP_GUEST_REQ_IOC_TYPE	'S'
 
 /* Get SNP attestation report */
@@ -64,4 +74,7 @@ struct snp_guest_request_ioctl {
 /* Get a derived key from the root */
 #define SNP_GET_DERIVED_KEY _IOWR(SNP_GUEST_REQ_IOC_TYPE, 0x1, struct snp_guest_request_ioctl)
 
+/* Get SNP extended report as defined in the GHCB specification version 2. */
+#define SNP_GET_EXT_REPORT _IOWR(SNP_GUEST_REQ_IOC_TYPE, 0x2, struct snp_guest_request_ioctl)
+
 #endif /* __UAPI_LINUX_SEV_GUEST_H_ */
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 02/43] KVM: SVM: Create a separate mapping for the SEV-ES save area
  2022-01-28 17:17 ` [PATCH v9 02/43] KVM: SVM: Create a separate mapping for the SEV-ES save area Brijesh Singh
@ 2022-02-01 13:02   ` Borislav Petkov
  2022-02-09 15:02     ` Brijesh Singh
  0 siblings, 1 reply; 115+ messages in thread
From: Borislav Petkov @ 2022-02-01 13:02 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	brijesh.ksingh, tony.luck, marcorr, sathyanarayanan.kuppuswamy,
	Venu Busireddy

On Fri, Jan 28, 2022 at 11:17:23AM -0600, Brijesh Singh wrote:
> From: Tom Lendacky <thomas.lendacky@amd.com>
> 
> The save area for SEV-ES/SEV-SNP guests, as used by the hardware, is
> different from the save area of a non SEV-ES/SEV-SNP guest.
> 
> This is the first step in defining the multiple save areas to keep them
> separate and ensuring proper operation amongst the different types of
> guests. Create an SEV-ES/SEV-SNP save area and adjust usage to the new
> save area definition where needed.
> 
> Reviewed-by: Venu Busireddy <venu.busireddy@oracle.com>
> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> ---
>  arch/x86/include/asm/svm.h | 87 +++++++++++++++++++++++++++++---------
>  arch/x86/kvm/svm/sev.c     | 24 +++++------
>  arch/x86/kvm/svm/svm.h     |  2 +-
>  3 files changed, 80 insertions(+), 33 deletions(-)

https://lore.kernel.org/r/Yc2jzOunYej4vwSc@zn.tnic

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 05/43] x86/compressed/64: Detect/setup SEV/SME features earlier in boot
  2022-01-28 17:17 ` [PATCH v9 05/43] x86/compressed/64: Detect/setup SEV/SME features earlier in boot Brijesh Singh
@ 2022-02-01 18:08   ` Borislav Petkov
  2022-02-01 20:35     ` Michael Roth
  0 siblings, 1 reply; 115+ messages in thread
From: Borislav Petkov @ 2022-02-01 18:08 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	brijesh.ksingh, tony.luck, marcorr, sathyanarayanan.kuppuswamy

On Fri, Jan 28, 2022 at 11:17:26AM -0600, Brijesh Singh wrote:
> diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
> index fd9441f40457..49064a9f96e2 100644
> --- a/arch/x86/boot/compressed/head_64.S
> +++ b/arch/x86/boot/compressed/head_64.S
> @@ -191,9 +191,8 @@ SYM_FUNC_START(startup_32)
>  	/*
>  	 * Mark SEV as active in sev_status so that startup32_check_sev_cbit()
>  	 * will do a check. The sev_status memory will be fully initialized

That "sev_status memory" formulation is just weird. Pls fix it while
you're touching that comment.

> +static inline u64 rd_sev_status_msr(void)
> +{
> +	unsigned long low, high;
> +
> +	asm volatile("rdmsr" : "=a" (low), "=d" (high) :
> +			"c" (MSR_AMD64_SEV));
> +
> +	return ((high << 32) | low);
> +}

Don't you see sev_es_rd_ghcb_msr() in that same file above? Do a common
rdmsr() helper and call it where needed, pls, instead of duplicating
code.

misc.h looks like a good place.

Extra bonus points will be given if you unify callers in
arch/x86/boot/cpucheck.c too but you don't have to - I can do that
ontop.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 10/43] x86/sev: Check SEV-SNP features support
  2022-01-28 17:17 ` [PATCH v9 10/43] x86/sev: Check SEV-SNP features support Brijesh Singh
@ 2022-02-01 19:59   ` Borislav Petkov
  2022-02-02 14:28     ` Brijesh Singh
  0 siblings, 1 reply; 115+ messages in thread
From: Borislav Petkov @ 2022-02-01 19:59 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	brijesh.ksingh, tony.luck, marcorr, sathyanarayanan.kuppuswamy

On Fri, Jan 28, 2022 at 11:17:31AM -0600, Brijesh Singh wrote:
> diff --git a/arch/x86/boot/compressed/idt_64.c b/arch/x86/boot/compressed/idt_64.c
> index 9b93567d663a..63e9044ab1d6 100644
> --- a/arch/x86/boot/compressed/idt_64.c
> +++ b/arch/x86/boot/compressed/idt_64.c
> @@ -39,7 +39,15 @@ void load_stage1_idt(void)
>  	load_boot_idt(&boot_idt_desc);
>  }
>  
> -/* Setup IDT after kernel jumping to  .Lrelocated */
> +/*
> + * Setup IDT after kernel jumping to  .Lrelocated
> + *
> + * initialize_identity_maps() needs a PF handler setup. The PF handler setup
> + * needs to happen in load_stage2_idt() where the IDT is loaded and there the
> + * VC IDT entry gets setup too in order to handle VCs, one needs a GHCB which
> + * gets setup with an already setup table which is done in
> + * initialize_identity_maps() and this is where the circle is complete.
> + */

I've beefed it up more, please use this one instead:

/*
 * Setup IDT after kernel jumping to  .Lrelocated.
 *
 * initialize_identity_maps() needs a #PF handler to be setup
 * in order to be able to fault-in identity mapping ranges; see
 * do_boot_page_fault().
 *
 * This #PF handler setup needs to happen in load_stage2_idt() where the
 * IDT is loaded and there the #VC IDT entry gets setup too.
 *
 * In order to be able to handle #VCs, one needs a GHCB which
 * gets setup with an already set up pagetable, which is done in
 * initialize_identity_maps(). And there's the catch 22: the boot #VC
 * handler do_boot_stage2_vc() needs to call early_setup_ghcb() itself
 * (and, especially set_page_decrypted()) because the SEV-ES setup code
 * cannot initialize a GHCB as there's no #PF handler yet...
 */

> diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
> index 19ad09712902..24df739c9c05 100644
> --- a/arch/x86/kernel/sev.c
> +++ b/arch/x86/kernel/sev.c
> @@ -43,6 +43,9 @@ static struct ghcb boot_ghcb_page __bss_decrypted __aligned(PAGE_SIZE);
>   */
>  static struct ghcb __initdata *boot_ghcb;
>  
> +/* Bitmap of SEV features supported by the hypervisor */
> +static u64 sev_hv_features __ro_after_init;
> +
>  /* #VC handler runtime per-CPU data */
>  struct sev_es_runtime_data {
>  	struct ghcb ghcb_page;
> @@ -766,6 +769,18 @@ void __init sev_es_init_vc_handling(void)
>  	if (!sev_es_check_cpu_features())
>  		panic("SEV-ES CPU Features missing");
>  
> +	/*
> +	 * SEV-SNP is supported in v2 of the GHCB spec which mandates support for HV
> +	 * features. If SEV-SNP is enabled, then check if the hypervisor supports
> +	 * the SEV-SNP features.

You guys have been completely brainwashed by marketing. I say:

"s/SEV-SNP/SNP/g

And please do that everywhere in sev-specific files."

and you go and slap that "SEV-" thing everywhere instead. Why? That file
is already called sev.c so it must be SEV-something. Lemme simplify that
comment for ya:

	/*
	 * SNP is supported in v2 of the GHCB spec which mandates support for HV
	 * features.
	 */

That's it, no more needed - the rest should be visible from the code.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 39/43] x86/sev: Provide support for SNP guest request NAEs
  2022-01-28 17:18 ` [PATCH v9 39/43] x86/sev: Provide support for SNP guest request NAEs Brijesh Singh
@ 2022-02-01 20:17   ` Peter Gonda
  2022-03-03 14:53     ` Brijesh Singh
  0 siblings, 1 reply; 115+ messages in thread
From: Peter Gonda @ 2022-02-01 20:17 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: the arch/x86 maintainers, LKML, kvm list, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum,
	Borislav Petkov, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	brijesh.ksingh, Tony Luck, Marc Orr, Sathyanarayanan Kuppuswamy

On Fri, Jan 28, 2022 at 10:19 AM Brijesh Singh <brijesh.singh@amd.com> wrote:
>
> Version 2 of GHCB specification provides SNP_GUEST_REQUEST and
> SNP_EXT_GUEST_REQUEST NAE that can be used by the SNP guest to communicate
> with the PSP.
>
> While at it, add a snp_issue_guest_request() helper that will be used by
> driver or other subsystem to issue the request to PSP.
>
> See SEV-SNP firmware and GHCB spec for more details.
>
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> ---
>  arch/x86/include/asm/sev-common.h |  3 ++
>  arch/x86/include/asm/sev.h        | 14 ++++++++
>  arch/x86/include/uapi/asm/svm.h   |  4 +++
>  arch/x86/kernel/sev.c             | 55 +++++++++++++++++++++++++++++++
>  4 files changed, 76 insertions(+)
>
> diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h
> index cd769984e929..442614879dad 100644
> --- a/arch/x86/include/asm/sev-common.h
> +++ b/arch/x86/include/asm/sev-common.h
> @@ -128,6 +128,9 @@ struct snp_psc_desc {
>         struct psc_entry entries[VMGEXIT_PSC_MAX_ENTRY];
>  } __packed;
>
> +/* Guest message request error code */
> +#define SNP_GUEST_REQ_INVALID_LEN      BIT_ULL(32)
> +
>  #define GHCB_MSR_TERM_REQ              0x100
>  #define GHCB_MSR_TERM_REASON_SET_POS   12
>  #define GHCB_MSR_TERM_REASON_SET_MASK  0xf
> diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
> index 219abb4590f2..9830ee1d6ef0 100644
> --- a/arch/x86/include/asm/sev.h
> +++ b/arch/x86/include/asm/sev.h
> @@ -87,6 +87,14 @@ extern bool handle_vc_boot_ghcb(struct pt_regs *regs);
>
>  #define RMPADJUST_VMSA_PAGE_BIT                BIT(16)
>
> +/* SNP Guest message request */
> +struct snp_req_data {
> +       unsigned long req_gpa;
> +       unsigned long resp_gpa;
> +       unsigned long data_gpa;
> +       unsigned int data_npages;
> +};
> +
>  #ifdef CONFIG_AMD_MEM_ENCRYPT
>  extern struct static_key_false sev_es_enable_key;
>  extern void __sev_es_ist_enter(struct pt_regs *regs);
> @@ -154,6 +162,7 @@ void snp_set_memory_private(unsigned long vaddr, unsigned int npages);
>  void snp_set_wakeup_secondary_cpu(void);
>  bool snp_init(struct boot_params *bp);
>  void snp_abort(void);
> +int snp_issue_guest_request(u64 exit_code, struct snp_req_data *input, unsigned long *fw_err);
>  #else
>  static inline void sev_es_ist_enter(struct pt_regs *regs) { }
>  static inline void sev_es_ist_exit(void) { }
> @@ -173,6 +182,11 @@ static inline void snp_set_memory_private(unsigned long vaddr, unsigned int npag
>  static inline void snp_set_wakeup_secondary_cpu(void) { }
>  static inline bool snp_init(struct boot_params *bp) { return false; }
>  static inline void snp_abort(void) { }
> +static inline int snp_issue_guest_request(u64 exit_code, struct snp_req_data *input,
> +                                         unsigned long *fw_err)
> +{
> +       return -ENOTTY;
> +}
>  #endif
>
>  #endif
> diff --git a/arch/x86/include/uapi/asm/svm.h b/arch/x86/include/uapi/asm/svm.h
> index 8b4c57baec52..5b8bc2b65a5e 100644
> --- a/arch/x86/include/uapi/asm/svm.h
> +++ b/arch/x86/include/uapi/asm/svm.h
> @@ -109,6 +109,8 @@
>  #define SVM_VMGEXIT_SET_AP_JUMP_TABLE          0
>  #define SVM_VMGEXIT_GET_AP_JUMP_TABLE          1
>  #define SVM_VMGEXIT_PSC                                0x80000010
> +#define SVM_VMGEXIT_GUEST_REQUEST              0x80000011
> +#define SVM_VMGEXIT_EXT_GUEST_REQUEST          0x80000012
>  #define SVM_VMGEXIT_AP_CREATION                        0x80000013
>  #define SVM_VMGEXIT_AP_CREATE_ON_INIT          0
>  #define SVM_VMGEXIT_AP_CREATE                  1
> @@ -225,6 +227,8 @@
>         { SVM_VMGEXIT_AP_HLT_LOOP,      "vmgexit_ap_hlt_loop" }, \
>         { SVM_VMGEXIT_AP_JUMP_TABLE,    "vmgexit_ap_jump_table" }, \
>         { SVM_VMGEXIT_PSC,      "vmgexit_page_state_change" }, \
> +       { SVM_VMGEXIT_GUEST_REQUEST,            "vmgexit_guest_request" }, \
> +       { SVM_VMGEXIT_EXT_GUEST_REQUEST,        "vmgexit_ext_guest_request" }, \
>         { SVM_VMGEXIT_AP_CREATION,      "vmgexit_ap_creation" }, \
>         { SVM_VMGEXIT_HV_FEATURES,      "vmgexit_hypervisor_feature" }, \
>         { SVM_EXIT_ERR,         "invalid_guest_state" }
> diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
> index cb97200bfda7..1d3ac83226fc 100644
> --- a/arch/x86/kernel/sev.c
> +++ b/arch/x86/kernel/sev.c
> @@ -2122,3 +2122,58 @@ static int __init snp_check_cpuid_table(void)
>  }
>
>  arch_initcall(snp_check_cpuid_table);
> +
> +int snp_issue_guest_request(u64 exit_code, struct snp_req_data *input, unsigned long *fw_err)
> +{
> +       struct ghcb_state state;
> +       struct es_em_ctxt ctxt;
> +       unsigned long flags;
> +       struct ghcb *ghcb;
> +       int ret;
> +
> +       if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
> +               return -ENODEV;
> +
> +       /*
> +        * __sev_get_ghcb() needs to run with IRQs disabled because it is using
> +        * a per-CPU GHCB.
> +        */
> +       local_irq_save(flags);
> +
> +       ghcb = __sev_get_ghcb(&state);
> +       if (!ghcb) {
> +               ret = -EIO;
> +               goto e_restore_irq;
> +       }
> +
> +       vc_ghcb_invalidate(ghcb);
> +
> +       if (exit_code == SVM_VMGEXIT_EXT_GUEST_REQUEST) {
> +               ghcb_set_rax(ghcb, input->data_gpa);
> +               ghcb_set_rbx(ghcb, input->data_npages);
> +       }
> +
> +       ret = sev_es_ghcb_hv_call(ghcb, true, &ctxt, exit_code, input->req_gpa, input->resp_gpa);
> +       if (ret)
> +               goto e_put;
> +
> +       if (ghcb->save.sw_exit_info_2) {
> +               /* Number of expected pages are returned in RBX */
> +               if (exit_code == SVM_VMGEXIT_EXT_GUEST_REQUEST &&
> +                   ghcb->save.sw_exit_info_2 == SNP_GUEST_REQ_INVALID_LEN)
> +                       input->data_npages = ghcb_get_rbx(ghcb);
> +
> +               if (fw_err)
> +                       *fw_err = ghcb->save.sw_exit_info_2;

In the PSP driver we've had a bit of discussion around the fw_err and
the return code and that it would be preferable to have fw_err be a
required parameter. And then we can easily make sure fw_err is always
non-zero when the return code is non-zero. Thoughts about doing the
same inside the guest?


> +
> +               ret = -EIO;
> +       }
> +
> +e_put:
> +       __sev_put_ghcb(&state);
> +e_restore_irq:
> +       local_irq_restore(flags);
> +
> +       return ret;
> +}
> +EXPORT_SYMBOL_GPL(snp_issue_guest_request);
> --
> 2.25.1
>

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 40/43] x86/sev: Register SEV-SNP guest request platform device
  2022-01-28 17:18 ` [PATCH v9 40/43] x86/sev: Register SEV-SNP guest request platform device Brijesh Singh
@ 2022-02-01 20:21   ` Peter Gonda
  2022-02-02 16:27     ` Brijesh Singh
  2022-02-06 20:05   ` Borislav Petkov
  1 sibling, 1 reply; 115+ messages in thread
From: Peter Gonda @ 2022-02-01 20:21 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: the arch/x86 maintainers, LKML, kvm list, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum,
	Borislav Petkov, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	brijesh.ksingh, Tony Luck, Marc Orr, Sathyanarayanan Kuppuswamy

On Fri, Jan 28, 2022 at 10:19 AM Brijesh Singh <brijesh.singh@amd.com> wrote:
>
> Version 2 of GHCB specification provides Non Automatic Exit (NAE) that can
> be used by the SEV-SNP guest to communicate with the PSP without risk from
> a malicious hypervisor who wishes to read, alter, drop or replay the
> messages sent.
>
> SNP_LAUNCH_UPDATE can insert two special pages into the guest’s memory:
> the secrets page and the CPUID page. The PSP firmware populate the contents
> of the secrets page. The secrets page contains encryption keys used by the
> guest to interact with the firmware. Because the secrets page is encrypted
> with the guest’s memory encryption key, the hypervisor cannot read the
> keys. See SEV-SNP firmware spec for further details on the secrets page
> format.
>
> Create a platform device that the SEV-SNP guest driver can bind to get the
> platform resources such as encryption key and message id to use to
> communicate with the PSP. The SEV-SNP guest driver provides a userspace
> interface to get the attestation report, key derivation, extended
> attestation report etc.
>
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> ---
>  arch/x86/include/asm/sev.h |  4 +++
>  arch/x86/kernel/sev.c      | 61 ++++++++++++++++++++++++++++++++++++++
>  2 files changed, 65 insertions(+)
>
> diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
> index 9830ee1d6ef0..ca977493eb72 100644
> --- a/arch/x86/include/asm/sev.h
> +++ b/arch/x86/include/asm/sev.h
> @@ -95,6 +95,10 @@ struct snp_req_data {
>         unsigned int data_npages;
>  };
>
> +struct snp_guest_platform_data {
> +       u64 secrets_gpa;
> +};
> +
>  #ifdef CONFIG_AMD_MEM_ENCRYPT
>  extern struct static_key_false sev_es_enable_key;
>  extern void __sev_es_ist_enter(struct pt_regs *regs);
> diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
> index 1d3ac83226fc..1e56ab00d1f4 100644
> --- a/arch/x86/kernel/sev.c
> +++ b/arch/x86/kernel/sev.c
> @@ -19,6 +19,9 @@
>  #include <linux/kernel.h>
>  #include <linux/mm.h>
>  #include <linux/cpumask.h>
> +#include <linux/efi.h>
> +#include <linux/platform_device.h>
> +#include <linux/io.h>
>
>  #include <asm/cpu_entry_area.h>
>  #include <asm/stacktrace.h>
> @@ -34,6 +37,7 @@
>  #include <asm/cpu.h>
>  #include <asm/apic.h>
>  #include <asm/cpuid.h>
> +#include <asm/setup.h>
>
>  #define DR7_RESET_VALUE        0x400
>
> @@ -2177,3 +2181,60 @@ int snp_issue_guest_request(u64 exit_code, struct snp_req_data *input, unsigned
>         return ret;
>  }
>  EXPORT_SYMBOL_GPL(snp_issue_guest_request);
> +
> +static struct platform_device guest_req_device = {
> +       .name           = "snp-guest",
> +       .id             = -1,
> +};
> +
> +static u64 get_secrets_page(void)
> +{
> +       u64 pa_data = boot_params.cc_blob_address;
> +       struct cc_blob_sev_info info;
> +       void *map;
> +
> +       /*
> +        * The CC blob contains the address of the secrets page, check if the
> +        * blob is present.
> +        */
> +       if (!pa_data)
> +               return 0;
> +
> +       map = early_memremap(pa_data, sizeof(info));
> +       memcpy(&info, map, sizeof(info));
> +       early_memunmap(map, sizeof(info));
> +
> +       /* smoke-test the secrets page passed */
> +       if (!info.secrets_phys || info.secrets_len != PAGE_SIZE)
> +               return 0;

This seems like an error condition worth noting. If no cc_blob_address
is passed it makes sense not to log but what if the address passed
fails this smoke test, why not log?

> +
> +       return info.secrets_phys;
> +}
> +
> +static int __init init_snp_platform_device(void)
> +{
> +       struct snp_guest_platform_data data;
> +       u64 gpa;
> +
> +       if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
> +               return -ENODEV;
> +
> +       gpa = get_secrets_page();
> +       if (!gpa)
> +               return -ENODEV;
> +
> +       data.secrets_gpa = gpa;
> +       if (platform_device_add_data(&guest_req_device, &data, sizeof(data)))
> +               goto e_fail;
> +
> +       if (platform_device_register(&guest_req_device))
> +               goto e_fail;
> +
> +       pr_info("SNP guest platform device initialized.\n");
> +       return 0;
> +
> +e_fail:
> +       pr_err("Failed to initialize SNP guest device\n");
> +       return -ENODEV;
> +}
> +device_initcall(init_snp_platform_device);
> --
> 2.25.1
>

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 41/43] virt: Add SEV-SNP guest driver
  2022-01-28 17:18 ` [PATCH v9 41/43] virt: Add SEV-SNP guest driver Brijesh Singh
@ 2022-02-01 20:33   ` Peter Gonda
  2022-02-06 22:39   ` Borislav Petkov
  1 sibling, 0 replies; 115+ messages in thread
From: Peter Gonda @ 2022-02-01 20:33 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: the arch/x86 maintainers, LKML, kvm list, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum,
	Borislav Petkov, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	brijesh.ksingh, Tony Luck, Marc Orr, Sathyanarayanan Kuppuswamy

On Fri, Jan 28, 2022 at 10:19 AM Brijesh Singh <brijesh.singh@amd.com> wrote:
>
> SEV-SNP specification provides the guest a mechanism to communicate with
> the PSP without risk from a malicious hypervisor who wishes to read, alter,
> drop or replay the messages sent. The driver uses snp_issue_guest_request()
> to issue GHCB SNP_GUEST_REQUEST or SNP_EXT_GUEST_REQUEST NAE events to
> submit the request to PSP.
>
> The PSP requires that all communication should be encrypted using key
> specified through the platform_data.
>
> The userspace can use SNP_GET_REPORT ioctl() to query the guest
> attestation report.
>
> See SEV-SNP spec section Guest Messages for more details.
>
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>

Reviewed-by: Peter Gonda <pgonda@google.com>

I'll try and do some basic functionality testing done with this later.

> ---
>  Documentation/virt/coco/sevguest.rst  |  81 ++++
>  drivers/virt/Kconfig                  |   3 +
>  drivers/virt/Makefile                 |   1 +
>  drivers/virt/coco/sevguest/Kconfig    |  12 +
>  drivers/virt/coco/sevguest/Makefile   |   2 +
>  drivers/virt/coco/sevguest/sevguest.c | 605 ++++++++++++++++++++++++++
>  drivers/virt/coco/sevguest/sevguest.h |  98 +++++
>  include/uapi/linux/sev-guest.h        |  50 +++
>  8 files changed, 852 insertions(+)
>  create mode 100644 Documentation/virt/coco/sevguest.rst
>  create mode 100644 drivers/virt/coco/sevguest/Kconfig
>  create mode 100644 drivers/virt/coco/sevguest/Makefile
>  create mode 100644 drivers/virt/coco/sevguest/sevguest.c
>  create mode 100644 drivers/virt/coco/sevguest/sevguest.h
>  create mode 100644 include/uapi/linux/sev-guest.h
>
> diff --git a/Documentation/virt/coco/sevguest.rst b/Documentation/virt/coco/sevguest.rst
> new file mode 100644
> index 000000000000..47ef3b0821d5
> --- /dev/null
> +++ b/Documentation/virt/coco/sevguest.rst
> @@ -0,0 +1,81 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +===================================================================
> +The Definitive SEV Guest API Documentation
> +===================================================================
> +
> +1. General description
> +======================
> +
> +The SEV API is a set of ioctls that are used by the guest or hypervisor
> +to get or set certain aspect of the SEV virtual machine. The ioctls belong
> +to the following classes:
> +
> + - Hypervisor ioctls: These query and set global attributes which affect the
> +   whole SEV firmware.  These ioctl are used by platform provision tools.
> +
> + - Guest ioctls: These query and set attributes of the SEV virtual machine.
> +
> +2. API description
> +==================
> +
> +This section describes ioctls that can be used to query or set SEV guests.
> +For each ioctl, the following information is provided along with a
> +description:
> +
> +  Technology:
> +      which SEV technology provides this ioctl. sev, sev-es, sev-snp or all.
> +
> +  Type:
> +      hypervisor or guest. The ioctl can be used inside the guest or the
> +      hypervisor.
> +
> +  Parameters:
> +      what parameters are accepted by the ioctl.
> +
> +  Returns:
> +      the return value.  General error numbers (ENOMEM, EINVAL)
> +      are not detailed, but errors with specific meanings are.
> +
> +The guest ioctl should be issued on a file descriptor of the /dev/sev-guest device.
> +The ioctl accepts struct snp_user_guest_request. The input and output structure is
> +specified through the req_data and resp_data field respectively. If the ioctl fails
> +to execute due to a firmware error, then fw_err code will be set otherwise the
> +fw_err will be set to 0xff.
> +
> +::
> +        struct snp_guest_request_ioctl {
> +                /* Message version number */
> +                __u32 msg_version;
> +
> +                /* Request and response structure address */
> +                __u64 req_data;
> +                __u64 resp_data;
> +
> +                /* firmware error code on failure (see psp-sev.h) */
> +                __u64 fw_err;
> +        };
> +
> +2.1 SNP_GET_REPORT
> +------------------
> +
> +:Technology: sev-snp
> +:Type: guest ioctl
> +:Parameters (in): struct snp_report_req
> +:Returns (out): struct snp_report_resp on success, -negative on error
> +
> +The SNP_GET_REPORT ioctl can be used to query the attestation report from the
> +SEV-SNP firmware. The ioctl uses the SNP_GUEST_REQUEST (MSG_REPORT_REQ) command
> +provided by the SEV-SNP firmware to query the attestation report.
> +
> +On success, the snp_report_resp.data will contains the report. The report
> +contain the format described in the SEV-SNP specification. See the SEV-SNP
> +specification for further details.
> +
> +
> +Reference
> +---------
> +
> +SEV-SNP and GHCB specification: developer.amd.com/sev
> +
> +The driver is based on SEV-SNP firmware spec 0.9 and GHCB spec version 2.0.
> diff --git a/drivers/virt/Kconfig b/drivers/virt/Kconfig
> index 8061e8ef449f..e457e47610d3 100644
> --- a/drivers/virt/Kconfig
> +++ b/drivers/virt/Kconfig
> @@ -36,4 +36,7 @@ source "drivers/virt/vboxguest/Kconfig"
>  source "drivers/virt/nitro_enclaves/Kconfig"
>
>  source "drivers/virt/acrn/Kconfig"
> +
> +source "drivers/virt/coco/sevguest/Kconfig"
> +
>  endif
> diff --git a/drivers/virt/Makefile b/drivers/virt/Makefile
> index 3e272ea60cd9..9c704a6fdcda 100644
> --- a/drivers/virt/Makefile
> +++ b/drivers/virt/Makefile
> @@ -8,3 +8,4 @@ obj-y                           += vboxguest/
>
>  obj-$(CONFIG_NITRO_ENCLAVES)   += nitro_enclaves/
>  obj-$(CONFIG_ACRN_HSM)         += acrn/
> +obj-$(CONFIG_SEV_GUEST)                += coco/sevguest/
> diff --git a/drivers/virt/coco/sevguest/Kconfig b/drivers/virt/coco/sevguest/Kconfig
> new file mode 100644
> index 000000000000..07ab9ec6471c
> --- /dev/null
> +++ b/drivers/virt/coco/sevguest/Kconfig
> @@ -0,0 +1,12 @@
> +config SEV_GUEST
> +       tristate "AMD SEV Guest driver"
> +       default y
> +       depends on AMD_MEM_ENCRYPT && CRYPTO_AEAD2
> +       help
> +         SEV-SNP firmware provides the guest a mechanism to communicate with
> +         the PSP without risk from a malicious hypervisor who wishes to read,
> +         alter, drop or replay the messages sent. The driver provides
> +         userspace interface to communicate with the PSP to request the
> +         attestation report and more.
> +
> +         If you choose 'M' here, this module will be called sevguest.
> diff --git a/drivers/virt/coco/sevguest/Makefile b/drivers/virt/coco/sevguest/Makefile
> new file mode 100644
> index 000000000000..b1ffb2b4177b
> --- /dev/null
> +++ b/drivers/virt/coco/sevguest/Makefile
> @@ -0,0 +1,2 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +obj-$(CONFIG_SEV_GUEST) += sevguest.o
> diff --git a/drivers/virt/coco/sevguest/sevguest.c b/drivers/virt/coco/sevguest/sevguest.c
> new file mode 100644
> index 000000000000..6dc0785ddd4b
> --- /dev/null
> +++ b/drivers/virt/coco/sevguest/sevguest.c
> @@ -0,0 +1,605 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * AMD Secure Encrypted Virtualization Nested Paging (SEV-SNP) guest request interface
> + *
> + * Copyright (C) 2021 Advanced Micro Devices, Inc.
> + *
> + * Author: Brijesh Singh <brijesh.singh@amd.com>
> + */
> +
> +#include <linux/module.h>
> +#include <linux/kernel.h>
> +#include <linux/types.h>
> +#include <linux/mutex.h>
> +#include <linux/io.h>
> +#include <linux/platform_device.h>
> +#include <linux/miscdevice.h>
> +#include <linux/set_memory.h>
> +#include <linux/fs.h>
> +#include <crypto/aead.h>
> +#include <linux/scatterlist.h>
> +#include <linux/psp-sev.h>
> +#include <uapi/linux/sev-guest.h>
> +#include <uapi/linux/psp-sev.h>
> +
> +#include <asm/svm.h>
> +#include <asm/sev.h>
> +
> +#include "sevguest.h"
> +
> +#define DEVICE_NAME    "sev-guest"
> +#define AAD_LEN                48
> +#define MSG_HDR_VER    1
> +
> +struct snp_guest_crypto {
> +       struct crypto_aead *tfm;
> +       u8 *iv, *authtag;
> +       int iv_len, a_len;
> +};
> +
> +struct snp_guest_dev {
> +       struct device *dev;
> +       struct miscdevice misc;
> +
> +       struct snp_guest_crypto *crypto;
> +       struct snp_guest_msg *request, *response;
> +       struct snp_secrets_page_layout *layout;
> +       struct snp_req_data input;
> +       u32 *os_area_msg_seqno;
> +       u8 *vmpck;
> +};
> +
> +static u32 vmpck_id;
> +module_param(vmpck_id, uint, 0444);
> +MODULE_PARM_DESC(vmpck_id, "The VMPCK ID to use when communicating with the PSP.");
> +
> +static DEFINE_MUTEX(snp_cmd_mutex);
> +
> +static bool is_vmpck_empty(struct snp_guest_dev *snp_dev)
> +{
> +       char zero_key[VMPCK_KEY_LEN] = {0};
> +
> +       if (snp_dev->vmpck)
> +               return memcmp(snp_dev->vmpck, zero_key, VMPCK_KEY_LEN) == 0;
> +
> +       return true;
> +}
> +
> +static void snp_disable_vmpck(struct snp_guest_dev *snp_dev)
> +{
> +       memzero_explicit(snp_dev->vmpck, VMPCK_KEY_LEN);
> +       snp_dev->vmpck = NULL;
> +}
> +
> +static inline u64 __snp_get_msg_seqno(struct snp_guest_dev *snp_dev)
> +{
> +       u64 count;
> +
> +       lockdep_assert_held(&snp_cmd_mutex);
> +
> +       /* Read the current message sequence counter from secrets pages */
> +       count = *snp_dev->os_area_msg_seqno;
> +
> +       return count + 1;
> +}
> +
> +/* Return a non-zero on success */
> +static u64 snp_get_msg_seqno(struct snp_guest_dev *snp_dev)
> +{
> +       u64 count = __snp_get_msg_seqno(snp_dev);
> +
> +       /*
> +        * The message sequence counter for the SNP guest request is a  64-bit
> +        * value but the version 2 of GHCB specification defines a 32-bit storage
> +        * for it. If the counter exceeds the 32-bit value then return zero.
> +        * The caller should check the return value, but if the caller happens to
> +        * not check the value and use it, then the firmware treats zero as an
> +        * invalid number and will fail the  message request.
> +        */
> +       if (count >= UINT_MAX) {
> +               pr_err_ratelimited("SNP guest request message sequence counter overflow\n");
> +               return 0;
> +       }
> +
> +       return count;
> +}
> +
> +static void snp_inc_msg_seqno(struct snp_guest_dev *snp_dev)
> +{
> +       /*
> +        * The counter is also incremented by the PSP, so increment it by 2
> +        * and save in secrets page.
> +        */
> +       *snp_dev->os_area_msg_seqno += 2;
> +}
> +
> +static inline struct snp_guest_dev *to_snp_dev(struct file *file)
> +{
> +       struct miscdevice *dev = file->private_data;
> +
> +       return container_of(dev, struct snp_guest_dev, misc);
> +}
> +
> +static struct snp_guest_crypto *init_crypto(struct snp_guest_dev *snp_dev, u8 *key, size_t keylen)
> +{
> +       struct snp_guest_crypto *crypto;
> +
> +       crypto = kzalloc(sizeof(*crypto), GFP_KERNEL_ACCOUNT);
> +       if (!crypto)
> +               return NULL;
> +
> +       crypto->tfm = crypto_alloc_aead("gcm(aes)", 0, 0);
> +       if (IS_ERR(crypto->tfm))
> +               goto e_free;
> +
> +       if (crypto_aead_setkey(crypto->tfm, key, keylen))
> +               goto e_free_crypto;
> +
> +       crypto->iv_len = crypto_aead_ivsize(crypto->tfm);
> +       if (crypto->iv_len < 12) {
> +               dev_err(snp_dev->dev, "IV length is less than 12.\n");
> +               goto e_free_crypto;
> +       }
> +
> +       crypto->iv = kmalloc(crypto->iv_len, GFP_KERNEL_ACCOUNT);
> +       if (!crypto->iv)
> +               goto e_free_crypto;
> +
> +       if (crypto_aead_authsize(crypto->tfm) > MAX_AUTHTAG_LEN) {
> +               if (crypto_aead_setauthsize(crypto->tfm, MAX_AUTHTAG_LEN)) {
> +                       dev_err(snp_dev->dev, "failed to set authsize to %d\n", MAX_AUTHTAG_LEN);
> +                       goto e_free_crypto;
> +               }
> +       }
> +
> +       crypto->a_len = crypto_aead_authsize(crypto->tfm);
> +       crypto->authtag = kmalloc(crypto->a_len, GFP_KERNEL_ACCOUNT);
> +       if (!crypto->authtag)
> +               goto e_free_crypto;
> +
> +       return crypto;
> +
> +e_free_crypto:
> +       crypto_free_aead(crypto->tfm);
> +e_free:
> +       kfree(crypto->iv);
> +       kfree(crypto->authtag);
> +       kfree(crypto);
> +
> +       return NULL;
> +}
> +
> +static void deinit_crypto(struct snp_guest_crypto *crypto)
> +{
> +       crypto_free_aead(crypto->tfm);
> +       kfree(crypto->iv);
> +       kfree(crypto->authtag);
> +       kfree(crypto);
> +}
> +
> +static int enc_dec_message(struct snp_guest_crypto *crypto, struct snp_guest_msg *msg,
> +                          u8 *src_buf, u8 *dst_buf, size_t len, bool enc)
> +{
> +       struct snp_guest_msg_hdr *hdr = &msg->hdr;
> +       struct scatterlist src[3], dst[3];
> +       DECLARE_CRYPTO_WAIT(wait);
> +       struct aead_request *req;
> +       int ret;
> +
> +       req = aead_request_alloc(crypto->tfm, GFP_KERNEL);
> +       if (!req)
> +               return -ENOMEM;
> +
> +       /*
> +        * AEAD memory operations:
> +        * +------ AAD -------+------- DATA -----+---- AUTHTAG----+
> +        * |  msg header      |  plaintext       |  hdr->authtag  |
> +        * | bytes 30h - 5Fh  |    or            |                |
> +        * |                  |   cipher         |                |
> +        * +------------------+------------------+----------------+
> +        */
> +       sg_init_table(src, 3);
> +       sg_set_buf(&src[0], &hdr->algo, AAD_LEN);
> +       sg_set_buf(&src[1], src_buf, hdr->msg_sz);
> +       sg_set_buf(&src[2], hdr->authtag, crypto->a_len);
> +
> +       sg_init_table(dst, 3);
> +       sg_set_buf(&dst[0], &hdr->algo, AAD_LEN);
> +       sg_set_buf(&dst[1], dst_buf, hdr->msg_sz);
> +       sg_set_buf(&dst[2], hdr->authtag, crypto->a_len);
> +
> +       aead_request_set_ad(req, AAD_LEN);
> +       aead_request_set_tfm(req, crypto->tfm);
> +       aead_request_set_callback(req, 0, crypto_req_done, &wait);
> +
> +       aead_request_set_crypt(req, src, dst, len, crypto->iv);
> +       ret = crypto_wait_req(enc ? crypto_aead_encrypt(req) : crypto_aead_decrypt(req), &wait);
> +
> +       aead_request_free(req);
> +       return ret;
> +}
> +
> +static int __enc_payload(struct snp_guest_dev *snp_dev, struct snp_guest_msg *msg,
> +                        void *plaintext, size_t len)
> +{
> +       struct snp_guest_crypto *crypto = snp_dev->crypto;
> +       struct snp_guest_msg_hdr *hdr = &msg->hdr;
> +
> +       memset(crypto->iv, 0, crypto->iv_len);
> +       memcpy(crypto->iv, &hdr->msg_seqno, sizeof(hdr->msg_seqno));
> +
> +       return enc_dec_message(crypto, msg, plaintext, msg->payload, len, true);
> +}
> +
> +static int dec_payload(struct snp_guest_dev *snp_dev, struct snp_guest_msg *msg,
> +                      void *plaintext, size_t len)
> +{
> +       struct snp_guest_crypto *crypto = snp_dev->crypto;
> +       struct snp_guest_msg_hdr *hdr = &msg->hdr;
> +
> +       /* Build IV with response buffer sequence number */
> +       memset(crypto->iv, 0, crypto->iv_len);
> +       memcpy(crypto->iv, &hdr->msg_seqno, sizeof(hdr->msg_seqno));
> +
> +       return enc_dec_message(crypto, msg, msg->payload, plaintext, len, false);
> +}
> +
> +static int verify_and_dec_payload(struct snp_guest_dev *snp_dev, void *payload, u32 sz)
> +{
> +       struct snp_guest_crypto *crypto = snp_dev->crypto;
> +       struct snp_guest_msg *resp = snp_dev->response;
> +       struct snp_guest_msg *req = snp_dev->request;
> +       struct snp_guest_msg_hdr *req_hdr = &req->hdr;
> +       struct snp_guest_msg_hdr *resp_hdr = &resp->hdr;
> +
> +       dev_dbg(snp_dev->dev, "response [seqno %lld type %d version %d sz %d]\n",
> +               resp_hdr->msg_seqno, resp_hdr->msg_type, resp_hdr->msg_version, resp_hdr->msg_sz);
> +
> +       /* Verify that the sequence counter is incremented by 1 */
> +       if (unlikely(resp_hdr->msg_seqno != (req_hdr->msg_seqno + 1)))
> +               return -EBADMSG;
> +
> +       /* Verify response message type and version number. */
> +       if (resp_hdr->msg_type != (req_hdr->msg_type + 1) ||
> +           resp_hdr->msg_version != req_hdr->msg_version)
> +               return -EBADMSG;
> +
> +       /*
> +        * If the message size is greater than our buffer length then return
> +        * an error.
> +        */
> +       if (unlikely((resp_hdr->msg_sz + crypto->a_len) > sz))
> +               return -EBADMSG;
> +
> +       /* Decrypt the payload */
> +       return dec_payload(snp_dev, resp, payload, resp_hdr->msg_sz + crypto->a_len);
> +}
> +
> +static bool enc_payload(struct snp_guest_dev *snp_dev, u64 seqno, int version, u8 type,
> +                       void *payload, size_t sz)
> +{
> +       struct snp_guest_msg *req = snp_dev->request;
> +       struct snp_guest_msg_hdr *hdr = &req->hdr;
> +
> +       memset(req, 0, sizeof(*req));
> +
> +       hdr->algo = SNP_AEAD_AES_256_GCM;
> +       hdr->hdr_version = MSG_HDR_VER;
> +       hdr->hdr_sz = sizeof(*hdr);
> +       hdr->msg_type = type;
> +       hdr->msg_version = version;
> +       hdr->msg_seqno = seqno;
> +       hdr->msg_vmpck = vmpck_id;
> +       hdr->msg_sz = sz;
> +
> +       /* Verify the sequence number is non-zero */
> +       if (!hdr->msg_seqno)
> +               return -ENOSR;
> +
> +       dev_dbg(snp_dev->dev, "request [seqno %lld type %d version %d sz %d]\n",
> +               hdr->msg_seqno, hdr->msg_type, hdr->msg_version, hdr->msg_sz);
> +
> +       return __enc_payload(snp_dev, req, payload, sz);
> +}
> +
> +static int handle_guest_request(struct snp_guest_dev *snp_dev, u64 exit_code, int msg_ver,
> +                               u8 type, void *req_buf, size_t req_sz, void *resp_buf,
> +                               u32 resp_sz, __u64 *fw_err)
> +{
> +       unsigned long err;
> +       u64 seqno;
> +       int rc;
> +
> +       /* Get message sequence and verify that its a non-zero */
> +       seqno = snp_get_msg_seqno(snp_dev);
> +       if (!seqno)
> +               return -EIO;
> +
> +       memset(snp_dev->response, 0, sizeof(*snp_dev->response));
> +
> +       /* Encrypt the userspace provided payload */
> +       rc = enc_payload(snp_dev, seqno, msg_ver, type, req_buf, req_sz);
> +       if (rc)
> +               return rc;
> +
> +       /* Call firmware to process the request */
> +       rc = snp_issue_guest_request(exit_code, &snp_dev->input, &err);
> +       if (fw_err)
> +               *fw_err = err;

All the other pointers are required. Why not require this one to make
it simpler?

> +
> +       if (rc)
> +               return rc;
> +
> +       rc = verify_and_dec_payload(snp_dev, resp_buf, resp_sz);
> +       if (rc) {
> +               /*
> +                * The verify_and_dec_payload() will fail only if the hypervisor is
> +                * actively modifying the message header or corrupting the encrypted payload.
> +                * This hints that hypervisor is acting in a bad faith. Disable the VMPCK so that
> +                * the key cannot be used for any communication. The key is disabled to ensure
> +                * that AES-GCM does not use the same IV while encrypting the request payload.
> +                */
> +               dev_alert(snp_dev->dev,
> +                         "Detected unexpected decode failure, disabling the vmpck_id %d\n",
> +                         vmpck_id);
> +               snp_disable_vmpck(snp_dev);
> +               return rc;
> +       }
> +
> +       /* Increment to new message sequence after payload decryption was successful. */
> +       snp_inc_msg_seqno(snp_dev);
> +
> +       return 0;
> +}
> +
> +static int get_report(struct snp_guest_dev *snp_dev, struct snp_guest_request_ioctl *arg)
> +{
> +       struct snp_guest_crypto *crypto = snp_dev->crypto;
> +       struct snp_report_req req = {0};
> +       struct snp_report_resp *resp;
> +       int rc, resp_len;
> +
> +       if (!arg->req_data || !arg->resp_data)
> +               return -EINVAL;
> +
> +       /* Copy the request payload from userspace */
> +       if (copy_from_user(&req, (void __user *)arg->req_data, sizeof(req)))
> +               return -EFAULT;
> +
> +       /*
> +        * The intermediate response buffer is used while decrypting the
> +        * response payload. Make sure that it has enough space to cover the
> +        * authtag.
> +        */
> +       resp_len = sizeof(resp->data) + crypto->a_len;
> +       resp = kzalloc(resp_len, GFP_KERNEL_ACCOUNT);
> +       if (!resp)
> +               return -ENOMEM;
> +
> +       /* Issue the command to get the attestation report */
> +       rc = handle_guest_request(snp_dev, SVM_VMGEXIT_GUEST_REQUEST, arg->msg_version,
> +                                 SNP_MSG_REPORT_REQ, &req, sizeof(req), resp->data,
> +                                 resp_len, &arg->fw_err);
> +       if (rc)
> +               goto e_free;
> +
> +       /* Copy the response payload to userspace */
> +       if (copy_to_user((void __user *)arg->resp_data, resp, sizeof(*resp)))
> +               rc = -EFAULT;
> +
> +e_free:
> +       kfree(resp);
> +       return rc;
> +}
> +
> +static long snp_guest_ioctl(struct file *file, unsigned int ioctl, unsigned long arg)
> +{
> +       struct snp_guest_dev *snp_dev = to_snp_dev(file);
> +       void __user *argp = (void __user *)arg;
> +       struct snp_guest_request_ioctl input;
> +       int ret = -ENOTTY;
> +
> +       if (copy_from_user(&input, argp, sizeof(input)))
> +               return -EFAULT;
> +
> +       input.fw_err = 0xff;
> +
> +       /* Message version must be non-zero */
> +       if (!input.msg_version)
> +               return -EINVAL;
> +
> +       mutex_lock(&snp_cmd_mutex);
> +
> +       /* Check if the VMPCK is not empty */
> +       if (is_vmpck_empty(snp_dev)) {
> +               dev_err_ratelimited(snp_dev->dev, "VMPCK is disabled\n");
> +               mutex_unlock(&snp_cmd_mutex);
> +               return -ENOTTY;
> +       }
> +
> +       switch (ioctl) {
> +       case SNP_GET_REPORT:
> +               ret = get_report(snp_dev, &input);
> +               break;
> +       default:
> +               break;
> +       }
> +
> +       mutex_unlock(&snp_cmd_mutex);
> +
> +       if (input.fw_err && copy_to_user(argp, &input, sizeof(input)))
> +               return -EFAULT;
> +
> +       return ret;
> +}
> +
> +static void free_shared_pages(void *buf, size_t sz)
> +{
> +       unsigned int npages = PAGE_ALIGN(sz) >> PAGE_SHIFT;
> +
> +       if (!buf)
> +               return;
> +
> +       /* If fail to restore the encryption mask then leak it. */
> +       if (WARN_ONCE(set_memory_encrypted((unsigned long)buf, npages),
> +                     "Failed to restore encryption mask (leak it)\n"))
> +               return;
> +
> +       __free_pages(virt_to_page(buf), get_order(sz));
> +}
> +
> +static void *alloc_shared_pages(size_t sz)
> +{
> +       unsigned int npages = PAGE_ALIGN(sz) >> PAGE_SHIFT;
> +       struct page *page;
> +       int ret;
> +
> +       page = alloc_pages(GFP_KERNEL_ACCOUNT, get_order(sz));
> +       if (IS_ERR(page))
> +               return NULL;
> +
> +       ret = set_memory_decrypted((unsigned long)page_address(page), npages);
> +       if (ret) {
> +               pr_err("SEV-SNP: failed to mark page shared, ret=%d\n", ret);
> +               __free_pages(page, get_order(sz));
> +               return NULL;
> +       }
> +
> +       return page_address(page);
> +}
> +
> +static const struct file_operations snp_guest_fops = {
> +       .owner  = THIS_MODULE,
> +       .unlocked_ioctl = snp_guest_ioctl,
> +};
> +
> +static u8 *get_vmpck(int id, struct snp_secrets_page_layout *layout, u32 **seqno)
> +{
> +       u8 *key = NULL;
> +
> +       switch (id) {
> +       case 0:
> +               *seqno = &layout->os_area.msg_seqno_0;
> +               key = layout->vmpck0;
> +               break;
> +       case 1:
> +               *seqno = &layout->os_area.msg_seqno_1;
> +               key = layout->vmpck1;
> +               break;
> +       case 2:
> +               *seqno = &layout->os_area.msg_seqno_2;
> +               key = layout->vmpck2;
> +               break;
> +       case 3:
> +               *seqno = &layout->os_area.msg_seqno_3;
> +               key = layout->vmpck3;
> +               break;
> +       default:
> +               break;
> +       }
> +
> +       return key;
> +}
> +
> +static int __init snp_guest_probe(struct platform_device *pdev)
> +{
> +       struct snp_secrets_page_layout *layout;
> +       struct snp_guest_platform_data *data;
> +       struct device *dev = &pdev->dev;
> +       struct snp_guest_dev *snp_dev;
> +       struct miscdevice *misc;
> +       int ret;
> +
> +       if (!dev->platform_data)
> +               return -ENODEV;
> +
> +       data = (struct snp_guest_platform_data *)dev->platform_data;
> +       layout = (__force void *)ioremap_encrypted(data->secrets_gpa, PAGE_SIZE);
> +       if (!layout)
> +               return -ENODEV;
> +
> +       ret = -ENOMEM;
> +       snp_dev = devm_kzalloc(&pdev->dev, sizeof(struct snp_guest_dev), GFP_KERNEL);
> +       if (!snp_dev)
> +               goto e_fail;
> +
> +       ret = -EINVAL;
> +       snp_dev->vmpck = get_vmpck(vmpck_id, layout, &snp_dev->os_area_msg_seqno);
> +       if (!snp_dev->vmpck) {
> +               dev_err(dev, "invalid vmpck id %d\n", vmpck_id);
> +               goto e_fail;
> +       }
> +
> +       /* Verify that VMPCK is not zero. */
> +       if (is_vmpck_empty(snp_dev)) {
> +               dev_err(dev, "vmpck id %d is null\n", vmpck_id);
> +               goto e_fail;
> +       }
> +
> +       platform_set_drvdata(pdev, snp_dev);
> +       snp_dev->dev = dev;
> +       snp_dev->layout = layout;
> +
> +       /* Allocate the shared page used for the request and response message. */
> +       snp_dev->request = alloc_shared_pages(sizeof(struct snp_guest_msg));
> +       if (!snp_dev->request)
> +               goto e_fail;
> +
> +       snp_dev->response = alloc_shared_pages(sizeof(struct snp_guest_msg));
> +       if (!snp_dev->response)
> +               goto e_fail;
> +
> +       ret = -EIO;
> +       snp_dev->crypto = init_crypto(snp_dev, snp_dev->vmpck, VMPCK_KEY_LEN);
> +       if (!snp_dev->crypto)
> +               goto e_fail;
> +
> +       misc = &snp_dev->misc;
> +       misc->minor = MISC_DYNAMIC_MINOR;
> +       misc->name = DEVICE_NAME;
> +       misc->fops = &snp_guest_fops;
> +
> +       /* initial the input address for guest request */
> +       snp_dev->input.req_gpa = __pa(snp_dev->request);
> +       snp_dev->input.resp_gpa = __pa(snp_dev->response);
> +
> +       ret =  misc_register(misc);
> +       if (ret)
> +               goto e_fail;
> +
> +       dev_info(dev, "Initialized SEV-SNP guest driver (using vmpck_id %d)\n", vmpck_id);
> +       return 0;
> +
> +e_fail:
> +       iounmap(layout);
> +       free_shared_pages(snp_dev->request, sizeof(struct snp_guest_msg));
> +       free_shared_pages(snp_dev->response, sizeof(struct snp_guest_msg));
> +
> +       return ret;
> +}
> +
> +static int __exit snp_guest_remove(struct platform_device *pdev)
> +{
> +       struct snp_guest_dev *snp_dev = platform_get_drvdata(pdev);
> +
> +       free_shared_pages(snp_dev->request, sizeof(struct snp_guest_msg));
> +       free_shared_pages(snp_dev->response, sizeof(struct snp_guest_msg));
> +       deinit_crypto(snp_dev->crypto);
> +       misc_deregister(&snp_dev->misc);
> +
> +       return 0;
> +}
> +
> +static struct platform_driver snp_guest_driver = {
> +       .remove         = __exit_p(snp_guest_remove),
> +       .driver         = {
> +               .name = "snp-guest",
> +       },
> +};
> +
> +module_platform_driver_probe(snp_guest_driver, snp_guest_probe);
> +
> +MODULE_AUTHOR("Brijesh Singh <brijesh.singh@amd.com>");
> +MODULE_LICENSE("GPL");
> +MODULE_VERSION("1.0.0");
> +MODULE_DESCRIPTION("AMD SNP Guest Driver");
> diff --git a/drivers/virt/coco/sevguest/sevguest.h b/drivers/virt/coco/sevguest/sevguest.h
> new file mode 100644
> index 000000000000..cfa76cf8a21a
> --- /dev/null
> +++ b/drivers/virt/coco/sevguest/sevguest.h
> @@ -0,0 +1,98 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (C) 2021 Advanced Micro Devices, Inc.
> + *
> + * Author: Brijesh Singh <brijesh.singh@amd.com>
> + *
> + * SEV-SNP API spec is available at https://developer.amd.com/sev
> + */
> +
> +#ifndef __LINUX_SEVGUEST_H_
> +#define __LINUX_SEVGUEST_H_
> +
> +#include <linux/types.h>
> +
> +#define MAX_AUTHTAG_LEN                32
> +
> +/* See SNP spec SNP_GUEST_REQUEST section for the structure */
> +enum msg_type {
> +       SNP_MSG_TYPE_INVALID = 0,
> +       SNP_MSG_CPUID_REQ,
> +       SNP_MSG_CPUID_RSP,
> +       SNP_MSG_KEY_REQ,
> +       SNP_MSG_KEY_RSP,
> +       SNP_MSG_REPORT_REQ,
> +       SNP_MSG_REPORT_RSP,
> +       SNP_MSG_EXPORT_REQ,
> +       SNP_MSG_EXPORT_RSP,
> +       SNP_MSG_IMPORT_REQ,
> +       SNP_MSG_IMPORT_RSP,
> +       SNP_MSG_ABSORB_REQ,
> +       SNP_MSG_ABSORB_RSP,
> +       SNP_MSG_VMRK_REQ,
> +       SNP_MSG_VMRK_RSP,
> +
> +       SNP_MSG_TYPE_MAX
> +};
> +
> +enum aead_algo {
> +       SNP_AEAD_INVALID,
> +       SNP_AEAD_AES_256_GCM,
> +};
> +
> +struct snp_guest_msg_hdr {
> +       u8 authtag[MAX_AUTHTAG_LEN];
> +       u64 msg_seqno;
> +       u8 rsvd1[8];
> +       u8 algo;
> +       u8 hdr_version;
> +       u16 hdr_sz;
> +       u8 msg_type;
> +       u8 msg_version;
> +       u16 msg_sz;
> +       u32 rsvd2;
> +       u8 msg_vmpck;
> +       u8 rsvd3[35];
> +} __packed;
> +
> +struct snp_guest_msg {
> +       struct snp_guest_msg_hdr hdr;
> +       u8 payload[4000];
> +} __packed;
> +
> +/*
> + * The secrets page contains 96-bytes of reserved field that can be used by
> + * the guest OS. The guest OS uses the area to save the message sequence
> + * number for each VMPCK.
> + *
> + * See the GHCB spec section Secret page layout for the format for this area.
> + */
> +struct secrets_os_area {
> +       u32 msg_seqno_0;
> +       u32 msg_seqno_1;
> +       u32 msg_seqno_2;
> +       u32 msg_seqno_3;
> +       u64 ap_jump_table_pa;
> +       u8 rsvd[40];
> +       u8 guest_usage[32];
> +} __packed;
> +
> +#define VMPCK_KEY_LEN          32
> +
> +/* See the SNP spec version 0.9 for secrets page format */
> +struct snp_secrets_page_layout {
> +       u32 version;
> +       u32 imien       : 1,
> +           rsvd1       : 31;
> +       u32 fms;
> +       u32 rsvd2;
> +       u8 gosvw[16];
> +       u8 vmpck0[VMPCK_KEY_LEN];
> +       u8 vmpck1[VMPCK_KEY_LEN];
> +       u8 vmpck2[VMPCK_KEY_LEN];
> +       u8 vmpck3[VMPCK_KEY_LEN];
> +       struct secrets_os_area os_area;
> +       u8 rsvd3[3840];
> +} __packed;
> +
> +#endif /* __LINUX_SNP_GUEST_H__ */
> diff --git a/include/uapi/linux/sev-guest.h b/include/uapi/linux/sev-guest.h
> new file mode 100644
> index 000000000000..081d314a6279
> --- /dev/null
> +++ b/include/uapi/linux/sev-guest.h
> @@ -0,0 +1,50 @@
> +/* SPDX-License-Identifier: GPL-2.0-only WITH Linux-syscall-note */
> +/*
> + * Userspace interface for AMD SEV and SEV-SNP guest driver.
> + *
> + * Copyright (C) 2021 Advanced Micro Devices, Inc.
> + *
> + * Author: Brijesh Singh <brijesh.singh@amd.com>
> + *
> + * SEV API specification is available at: https://developer.amd.com/sev/
> + */
> +
> +#ifndef __UAPI_LINUX_SEV_GUEST_H_
> +#define __UAPI_LINUX_SEV_GUEST_H_
> +
> +#include <linux/types.h>
> +
> +struct snp_report_req {
> +       /* user data that should be included in the report */
> +       __u8 user_data[64];
> +
> +       /* The vmpl level to be included in the report */
> +       __u32 vmpl;
> +
> +       /* Must be zero filled */
> +       __u8 rsvd[28];
> +};
> +
> +struct snp_report_resp {
> +       /* response data, see SEV-SNP spec for the format */
> +       __u8 data[4000];
> +};
> +
> +struct snp_guest_request_ioctl {
> +       /* message version number (must be non-zero) */
> +       __u8 msg_version;
> +
> +       /* Request and response structure address */
> +       __u64 req_data;
> +       __u64 resp_data;
> +
> +       /* firmware error code on failure (see psp-sev.h) */
> +       __u64 fw_err;
> +};
> +
> +#define SNP_GUEST_REQ_IOC_TYPE 'S'
> +
> +/* Get SNP attestation report */
> +#define SNP_GET_REPORT _IOWR(SNP_GUEST_REQ_IOC_TYPE, 0x0, struct snp_guest_request_ioctl)
> +
> +#endif /* __UAPI_LINUX_SEV_GUEST_H_ */
> --
> 2.25.1
>

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 05/43] x86/compressed/64: Detect/setup SEV/SME features earlier in boot
  2022-02-01 18:08   ` Borislav Petkov
@ 2022-02-01 20:35     ` Michael Roth
  2022-02-01 21:28       ` Borislav Petkov
  0 siblings, 1 reply; 115+ messages in thread
From: Michael Roth @ 2022-02-01 20:35 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Brijesh Singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On Tue, Feb 01, 2022 at 07:08:21PM +0100, Borislav Petkov wrote:
> On Fri, Jan 28, 2022 at 11:17:26AM -0600, Brijesh Singh wrote:
> > diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
> > index fd9441f40457..49064a9f96e2 100644
> > --- a/arch/x86/boot/compressed/head_64.S
> > +++ b/arch/x86/boot/compressed/head_64.S
> > @@ -191,9 +191,8 @@ SYM_FUNC_START(startup_32)
> >  	/*
> >  	 * Mark SEV as active in sev_status so that startup32_check_sev_cbit()
> >  	 * will do a check. The sev_status memory will be fully initialized
> 
> That "sev_status memory" formulation is just weird. Pls fix it while
> you're touching that comment.

Will do.

> 
> > +static inline u64 rd_sev_status_msr(void)
> > +{
> > +	unsigned long low, high;
> > +
> > +	asm volatile("rdmsr" : "=a" (low), "=d" (high) :
> > +			"c" (MSR_AMD64_SEV));
> > +
> > +	return ((high << 32) | low);
> > +}
> 
> Don't you see sev_es_rd_ghcb_msr() in that same file above? Do a common
> rdmsr() helper and call it where needed, pls, instead of duplicating
> code.

Unfortunately rdmsr()/wrmsr()/__rdmsr()/__wrmsr() etc. definitions are all
already getting pulled in via:

  misc.h:
    #include linux/elf.h
      #include linux/thread_info.h
        #include linux/cpufeature.h
          #include linux/processor.h
            #include linux/msr.h

Those definitions aren't usable in boot/compressed because of __ex_table
and possibly some other dependency hellishness.

Would read_msr()/write_msr() be reasonable alternative names for these new
helpers, or something else that better distinguishes them from the
kernel proper definitions?

> 
> misc.h looks like a good place.

It doesn't look like anything in boot/ pulls in boot/compressed/
headers. It seems to be the other way around, with boot/compressed
pulling in headers and whole C files from boot/.

So perhaps these new definitions should be added to a small boot/msr.h
header and pulled in from there?

> 
> Extra bonus points will be given if you unify callers in
> arch/x86/boot/cpucheck.c too but you don't have to - I can do that
> ontop.

I have these new helpers defined with similar signatures to
__rdmsr/__wrmsr:

  /* rdmsr/wrmsr helpers */
  static inline u64 read_msr(unsigned int msr)
  {
         u64 low, high;
  
         asm volatile("rdmsr" : "=a" (low), "=d" (high) : "c" (msr));
  
         return ((high << 32) | low);
  }
  
  static inline void write_msr(unsigned int msr, u32 low, u32 high)
  {
         asm volatile("wrmsr" : : "c" (msr), "a"(low), "d" (high) : "memory");
  }

but cpucheck.c code flow really lends itself to having a read_msr()
variant that loads into 2 separate high/low u32 values, like what
native_rdmsr does:

  #define native_rdmsr(msr, val1, val2)                   \
  do {                                                    \
          u64 __val = __rdmsr((msr));                     \
          (void)((val1) = (u32)__val);                    \
          (void)((val2) = (u32)(__val >> 32));            \
  } while (0)

Should we introduce something like this as well for cpucheck.c? Or
re-write cpucheck.c to make use of the u64 versions? Or just set the
cpucheck.c rework aside for now? (but still introduce the above helpers
as boot/msr.h in preparation)?

Thanks,

Mike

> 
> Thx.
> 
> -- 
> Regards/Gruss,
>     Boris.
> 
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpeople.kernel.org%2Ftglx%2Fnotes-about-netiquette&amp;data=04%7C01%7Cmichael.roth%40amd.com%7Cec7f8621a6934039cfff08d9e5addaca%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637793357136301050%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata=FMoP5ZskuxwanWTe5DxMnIYNPBSi%2FhRrOExp9hIHaCo%3D&amp;reserved=0

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 42/43] virt: sevguest: Add support to derive key
  2022-01-28 17:18 ` [PATCH v9 42/43] virt: sevguest: Add support to derive key Brijesh Singh
@ 2022-02-01 20:39   ` Peter Gonda
  2022-02-02 22:31     ` Brijesh Singh
  2022-02-07  8:52   ` Borislav Petkov
  1 sibling, 1 reply; 115+ messages in thread
From: Peter Gonda @ 2022-02-01 20:39 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: the arch/x86 maintainers, LKML, kvm list, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum,
	Borislav Petkov, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	brijesh.ksingh, Tony Luck, Marc Orr, Sathyanarayanan Kuppuswamy,
	Liam Merwick

On Fri, Jan 28, 2022 at 10:19 AM Brijesh Singh <brijesh.singh@amd.com> wrote:
>
> The SNP_GET_DERIVED_KEY ioctl interface can be used by the SNP guest to
> ask the firmware to provide a key derived from a root key. The derived
> key may be used by the guest for any purposes it chooses, such as a
> sealing key or communicating with the external entities.
>
> See SEV-SNP firmware spec for more information.
>
> Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>

Reviewed-by: Peter Gonda <pgonda@google.com>

> ---
>  Documentation/virt/coco/sevguest.rst  | 17 ++++++++++
>  drivers/virt/coco/sevguest/sevguest.c | 45 +++++++++++++++++++++++++++
>  include/uapi/linux/sev-guest.h        | 17 ++++++++++
>  3 files changed, 79 insertions(+)
>
> diff --git a/Documentation/virt/coco/sevguest.rst b/Documentation/virt/coco/sevguest.rst
> index 47ef3b0821d5..aafc9bce9aef 100644
> --- a/Documentation/virt/coco/sevguest.rst
> +++ b/Documentation/virt/coco/sevguest.rst
> @@ -72,6 +72,23 @@ On success, the snp_report_resp.data will contains the report. The report
>  contain the format described in the SEV-SNP specification. See the SEV-SNP
>  specification for further details.
>
> +2.2 SNP_GET_DERIVED_KEY
> +-----------------------
> +:Technology: sev-snp
> +:Type: guest ioctl
> +:Parameters (in): struct snp_derived_key_req
> +:Returns (out): struct snp_derived_key_resp on success, -negative on error
> +
> +The SNP_GET_DERIVED_KEY ioctl can be used to get a key derive from a root key.

derived from ...

> +The derived key can be used by the guest for any purpose, such as sealing keys
> +or communicating with external entities.

Question: How would this be used to communicate with external
entities? Reading Section 7.2 it seems like we could pick the VCEK and
have no guest specific inputs and we'd get the same derived key as we
would on another guest on the same platform with, is that correct?

> +
> +The ioctl uses the SNP_GUEST_REQUEST (MSG_KEY_REQ) command provided by the
> +SEV-SNP firmware to derive the key. See SEV-SNP specification for further details
> +on the various fields passed in the key derivation request.
> +
> +On success, the snp_derived_key_resp.data contains the derived key value. See
> +the SEV-SNP specification for further details.
>
>  Reference
>  ---------
> diff --git a/drivers/virt/coco/sevguest/sevguest.c b/drivers/virt/coco/sevguest/sevguest.c
> index 6dc0785ddd4b..4369e55df9a6 100644
> --- a/drivers/virt/coco/sevguest/sevguest.c
> +++ b/drivers/virt/coco/sevguest/sevguest.c
> @@ -392,6 +392,48 @@ static int get_report(struct snp_guest_dev *snp_dev, struct snp_guest_request_io
>         return rc;
>  }
>
> +static int get_derived_key(struct snp_guest_dev *snp_dev, struct snp_guest_request_ioctl *arg)
> +{
> +       struct snp_guest_crypto *crypto = snp_dev->crypto;
> +       struct snp_derived_key_resp resp = {0};
> +       struct snp_derived_key_req req = {0};
> +       int rc, resp_len;
> +       u8 buf[64+16]; /* Response data is 64 bytes and max authsize for GCM is 16 bytes */
> +
> +       if (!arg->req_data || !arg->resp_data)
> +               return -EINVAL;
> +
> +       /* Copy the request payload from userspace */
> +       if (copy_from_user(&req, (void __user *)arg->req_data, sizeof(req)))
> +               return -EFAULT;
> +
> +       /*
> +        * The intermediate response buffer is used while decrypting the
> +        * response payload. Make sure that it has enough space to cover the
> +        * authtag.
> +        */
> +       resp_len = sizeof(resp.data) + crypto->a_len;
> +       if (sizeof(buf) < resp_len)
> +               return -ENOMEM;
> +
> +       /* Issue the command to get the attestation report */
> +       rc = handle_guest_request(snp_dev, SVM_VMGEXIT_GUEST_REQUEST, arg->msg_version,
> +                                 SNP_MSG_KEY_REQ, &req, sizeof(req), buf, resp_len,
> +                                 &arg->fw_err);
> +       if (rc)
> +               goto e_free;
> +
> +       /* Copy the response payload to userspace */
> +       memcpy(resp.data, buf, sizeof(resp.data));
> +       if (copy_to_user((void __user *)arg->resp_data, &resp, sizeof(resp)))
> +               rc = -EFAULT;
> +
> +e_free:
> +       memzero_explicit(buf, sizeof(buf));
> +       memzero_explicit(&resp, sizeof(resp));
> +       return rc;
> +}
> +
>  static long snp_guest_ioctl(struct file *file, unsigned int ioctl, unsigned long arg)
>  {
>         struct snp_guest_dev *snp_dev = to_snp_dev(file);
> @@ -421,6 +463,9 @@ static long snp_guest_ioctl(struct file *file, unsigned int ioctl, unsigned long
>         case SNP_GET_REPORT:
>                 ret = get_report(snp_dev, &input);
>                 break;
> +       case SNP_GET_DERIVED_KEY:
> +               ret = get_derived_key(snp_dev, &input);
> +               break;
>         default:
>                 break;
>         }
> diff --git a/include/uapi/linux/sev-guest.h b/include/uapi/linux/sev-guest.h
> index 081d314a6279..bcd00a6d4501 100644
> --- a/include/uapi/linux/sev-guest.h
> +++ b/include/uapi/linux/sev-guest.h
> @@ -30,6 +30,20 @@ struct snp_report_resp {
>         __u8 data[4000];
>  };
>
> +struct snp_derived_key_req {
> +       __u32 root_key_select;
> +       __u32 rsvd;
> +       __u64 guest_field_select;
> +       __u32 vmpl;
> +       __u32 guest_svn;
> +       __u64 tcb_version;
> +};
> +
> +struct snp_derived_key_resp {
> +       /* response data, see SEV-SNP spec for the format */
> +       __u8 data[64];
> +};
> +
>  struct snp_guest_request_ioctl {
>         /* message version number (must be non-zero) */
>         __u8 msg_version;
> @@ -47,4 +61,7 @@ struct snp_guest_request_ioctl {
>  /* Get SNP attestation report */
>  #define SNP_GET_REPORT _IOWR(SNP_GUEST_REQ_IOC_TYPE, 0x0, struct snp_guest_request_ioctl)
>
> +/* Get a derived key from the root */
> +#define SNP_GET_DERIVED_KEY _IOWR(SNP_GUEST_REQ_IOC_TYPE, 0x1, struct snp_guest_request_ioctl)
> +
>  #endif /* __UAPI_LINUX_SEV_GUEST_H_ */
> --
> 2.25.1
>

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 43/43] virt: sevguest: Add support to get extended report
  2022-01-28 17:18 ` [PATCH v9 43/43] virt: sevguest: Add support to get extended report Brijesh Singh
@ 2022-02-01 20:43   ` Peter Gonda
  2022-02-07  9:16   ` Borislav Petkov
  1 sibling, 0 replies; 115+ messages in thread
From: Peter Gonda @ 2022-02-01 20:43 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: the arch/x86 maintainers, LKML, kvm list, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum,
	Borislav Petkov, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	brijesh.ksingh, Tony Luck, Marc Orr, Sathyanarayanan Kuppuswamy

On Fri, Jan 28, 2022 at 10:19 AM Brijesh Singh <brijesh.singh@amd.com> wrote:
>
> Version 2 of GHCB specification defines Non-Automatic-Exit(NAE) to get
> the extended guest report. It is similar to the SNP_GET_REPORT ioctl.
> The main difference is related to the additional data that will be
> returned. The additional data returned is a certificate blob that can
> be used by the SNP guest user. The certificate blob layout is defined
> in the GHCB specification. The driver simply treats the blob as a opaque
> data and copies it to userspace.
>
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> ---
>  Documentation/virt/coco/sevguest.rst  | 23 +++++++
>  drivers/virt/coco/sevguest/sevguest.c | 89 +++++++++++++++++++++++++++
>  include/uapi/linux/sev-guest.h        | 13 ++++
>  3 files changed, 125 insertions(+)
>
> diff --git a/Documentation/virt/coco/sevguest.rst b/Documentation/virt/coco/sevguest.rst
> index aafc9bce9aef..b9fe20e92d06 100644
> --- a/Documentation/virt/coco/sevguest.rst
> +++ b/Documentation/virt/coco/sevguest.rst
> @@ -90,6 +90,29 @@ on the various fields passed in the key derivation request.
>  On success, the snp_derived_key_resp.data contains the derived key value. See
>  the SEV-SNP specification for further details.
>
> +
> +2.3 SNP_GET_EXT_REPORT
> +----------------------
> +:Technology: sev-snp
> +:Type: guest ioctl
> +:Parameters (in/out): struct snp_ext_report_req
> +:Returns (out): struct snp_report_resp on success, -negative on error
> +
> +The SNP_GET_EXT_REPORT ioctl is similar to the SNP_GET_REPORT. The difference is
> +related to the additional certificate data that is returned with the report.
> +The certificate data returned is being provided by the hypervisor through the
> +SNP_SET_EXT_CONFIG.
> +
> +The ioctl uses the SNP_GUEST_REQUEST (MSG_REPORT_REQ) command provided by the SEV-SNP
> +firmware to get the attestation report.
> +
> +On success, the snp_ext_report_resp.data will contain the attestation report
> +and snp_ext_report_req.certs_address will contain the certificate blob. If the
> +length of the blob is smaller than expected then snp_ext_report_req.certs_len will
> +be updated with the expected value.
> +
> +See GHCB specification for further detail on how to parse the certificate blob.
> +
>  Reference
>  ---------
>
> diff --git a/drivers/virt/coco/sevguest/sevguest.c b/drivers/virt/coco/sevguest/sevguest.c
> index 4369e55df9a6..a53854353944 100644
> --- a/drivers/virt/coco/sevguest/sevguest.c
> +++ b/drivers/virt/coco/sevguest/sevguest.c
> @@ -41,6 +41,7 @@ struct snp_guest_dev {
>         struct device *dev;
>         struct miscdevice misc;
>
> +       void *certs_data;
>         struct snp_guest_crypto *crypto;
>         struct snp_guest_msg *request, *response;
>         struct snp_secrets_page_layout *layout;
> @@ -434,6 +435,84 @@ static int get_derived_key(struct snp_guest_dev *snp_dev, struct snp_guest_reque
>         return rc;
>  }
>
> +static int get_ext_report(struct snp_guest_dev *snp_dev, struct snp_guest_request_ioctl *arg)
> +{
> +       struct snp_guest_crypto *crypto = snp_dev->crypto;
> +       struct snp_ext_report_req req = {0};
> +       struct snp_report_resp *resp;
> +       int ret, npages = 0, resp_len;
> +
> +       if (!arg->req_data || !arg->resp_data)
> +               return -EINVAL;
> +
> +       /* Copy the request payload from userspace */
> +       if (copy_from_user(&req, (void __user *)arg->req_data, sizeof(req)))
> +               return -EFAULT;
> +
> +       if (req.certs_len) {
> +               if (req.certs_len > SEV_FW_BLOB_MAX_SIZE ||
> +                   !IS_ALIGNED(req.certs_len, PAGE_SIZE))
> +                       return -EINVAL;
> +       }
> +
> +       if (req.certs_address && req.certs_len) {
> +               if (!access_ok(req.certs_address, req.certs_len))
> +                       return -EFAULT;
> +
> +               /*
> +                * Initialize the intermediate buffer with all zero's. This buffer

zeros

> +                * is used in the guest request message to get the certs blob from
> +                * the host. If host does not supply any certs in it, then copy
> +                * zeros to indicate that certificate data was not provided.
> +                */
> +               memset(snp_dev->certs_data, 0, req.certs_len);
> +
> +               npages = req.certs_len >> PAGE_SHIFT;
> +       }
> +
> +       /*
> +        * The intermediate response buffer is used while decrypting the
> +        * response payload. Make sure that it has enough space to cover the
> +        * authtag.
> +        */
> +       resp_len = sizeof(resp->data) + crypto->a_len;
> +       resp = kzalloc(resp_len, GFP_KERNEL_ACCOUNT);
> +       if (!resp)
> +               return -ENOMEM;

Can we pull this duplicated code from get_report() into a helper?

> +
> +       snp_dev->input.data_npages = npages;
> +       ret = handle_guest_request(snp_dev, SVM_VMGEXIT_EXT_GUEST_REQUEST, arg->msg_version,
> +                                  SNP_MSG_REPORT_REQ, &req.data,
> +                                  sizeof(req.data), resp->data, resp_len, &arg->fw_err);
> +
> +       /* If certs length is invalid then copy the returned length */
> +       if (arg->fw_err == SNP_GUEST_REQ_INVALID_LEN) {
> +               req.certs_len = snp_dev->input.data_npages << PAGE_SHIFT;
> +
> +               if (copy_to_user((void __user *)arg->req_data, &req, sizeof(req)))
> +                       ret = -EFAULT;
> +       }
> +
> +       if (ret)
> +               goto e_free;
> +
> +       /* Copy the certificate data blob to userspace */
> +       if (req.certs_address && req.certs_len &&
> +           copy_to_user((void __user *)req.certs_address, snp_dev->certs_data,
> +                        req.certs_len)) {
> +               ret = -EFAULT;
> +               goto e_free;
> +       }
> +
> +       /* Copy the response payload to userspace */
> +       if (copy_to_user((void __user *)arg->resp_data, resp, sizeof(*resp)))
> +               ret = -EFAULT;
> +
> +e_free:
> +       kfree(resp);
> +       return ret;
> +}
> +
>  static long snp_guest_ioctl(struct file *file, unsigned int ioctl, unsigned long arg)
>  {
>         struct snp_guest_dev *snp_dev = to_snp_dev(file);
> @@ -466,6 +545,9 @@ static long snp_guest_ioctl(struct file *file, unsigned int ioctl, unsigned long
>         case SNP_GET_DERIVED_KEY:
>                 ret = get_derived_key(snp_dev, &input);
>                 break;
> +       case SNP_GET_EXT_REPORT:
> +               ret = get_ext_report(snp_dev, &input);
> +               break;
>         default:
>                 break;
>         }
> @@ -594,6 +676,10 @@ static int __init snp_guest_probe(struct platform_device *pdev)
>         if (!snp_dev->response)
>                 goto e_fail;
>
> +       snp_dev->certs_data = alloc_shared_pages(SEV_FW_BLOB_MAX_SIZE);
> +       if (!snp_dev->certs_data)
> +               goto e_fail;
> +
>         ret = -EIO;
>         snp_dev->crypto = init_crypto(snp_dev, snp_dev->vmpck, VMPCK_KEY_LEN);
>         if (!snp_dev->crypto)
> @@ -607,6 +693,7 @@ static int __init snp_guest_probe(struct platform_device *pdev)
>         /* initial the input address for guest request */
>         snp_dev->input.req_gpa = __pa(snp_dev->request);
>         snp_dev->input.resp_gpa = __pa(snp_dev->response);
> +       snp_dev->input.data_gpa = __pa(snp_dev->certs_data);
>
>         ret =  misc_register(misc);
>         if (ret)
> @@ -617,6 +704,7 @@ static int __init snp_guest_probe(struct platform_device *pdev)
>
>  e_fail:
>         iounmap(layout);
> +       free_shared_pages(snp_dev->certs_data, SEV_FW_BLOB_MAX_SIZE);
>         free_shared_pages(snp_dev->request, sizeof(struct snp_guest_msg));
>         free_shared_pages(snp_dev->response, sizeof(struct snp_guest_msg));
>
> @@ -629,6 +717,7 @@ static int __exit snp_guest_remove(struct platform_device *pdev)
>
>         free_shared_pages(snp_dev->request, sizeof(struct snp_guest_msg));
>         free_shared_pages(snp_dev->response, sizeof(struct snp_guest_msg));
> +       free_shared_pages(snp_dev->certs_data, SEV_FW_BLOB_MAX_SIZE);
>         deinit_crypto(snp_dev->crypto);
>         misc_deregister(&snp_dev->misc);
>
> diff --git a/include/uapi/linux/sev-guest.h b/include/uapi/linux/sev-guest.h
> index bcd00a6d4501..0a47b6627c78 100644
> --- a/include/uapi/linux/sev-guest.h
> +++ b/include/uapi/linux/sev-guest.h
> @@ -56,6 +56,16 @@ struct snp_guest_request_ioctl {
>         __u64 fw_err;
>  };
>
> +struct snp_ext_report_req {
> +       struct snp_report_req data;
> +
> +       /* where to copy the certificate blob */
> +       __u64 certs_address;
> +
> +       /* length of the certificate blob */
> +       __u32 certs_len;
> +};
> +
>  #define SNP_GUEST_REQ_IOC_TYPE 'S'
>
>  /* Get SNP attestation report */
> @@ -64,4 +74,7 @@ struct snp_guest_request_ioctl {
>  /* Get a derived key from the root */
>  #define SNP_GET_DERIVED_KEY _IOWR(SNP_GUEST_REQ_IOC_TYPE, 0x1, struct snp_guest_request_ioctl)
>
> +/* Get SNP extended report as defined in the GHCB specification version 2. */
> +#define SNP_GET_EXT_REPORT _IOWR(SNP_GUEST_REQ_IOC_TYPE, 0x2, struct snp_guest_request_ioctl)
> +
>  #endif /* __UAPI_LINUX_SEV_GUEST_H_ */
> --
> 2.25.1
>

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 05/43] x86/compressed/64: Detect/setup SEV/SME features earlier in boot
  2022-02-01 20:35     ` Michael Roth
@ 2022-02-01 21:28       ` Borislav Petkov
  2022-02-02  0:52         ` Michael Roth
  0 siblings, 1 reply; 115+ messages in thread
From: Borislav Petkov @ 2022-02-01 21:28 UTC (permalink / raw)
  To: Michael Roth
  Cc: Brijesh Singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On Tue, Feb 01, 2022 at 02:35:07PM -0600, Michael Roth wrote:
> Unfortunately rdmsr()/wrmsr()/__rdmsr()/__wrmsr() etc. definitions are all
> already getting pulled in via:
> 
>   misc.h:
>     #include linux/elf.h
>       #include linux/thread_info.h
>         #include linux/cpufeature.h
>           #include linux/processor.h
>             #include linux/msr.h
> 
> Those definitions aren't usable in boot/compressed because of __ex_table
> and possibly some other dependency hellishness.

And they should not be. Mixing kernel proper and decompressor code needs
to stop and untangling that is a multi-year effort, unfortunately. ;-\

> Would read_msr()/write_msr() be reasonable alternative names for these new
> helpers, or something else that better distinguishes them from the
> kernel proper definitions?

Nah, just call them rdmsr/wrmsr(). There is already {read,write}_msr()
tracepoint symbols in kernel proper and there's no point in keeping them
apart using different names - that ship has long sailed.

> It doesn't look like anything in boot/ pulls in boot/compressed/
> headers. It seems to be the other way around, with boot/compressed
> pulling in headers and whole C files from boot/.
> 
> So perhaps these new definitions should be added to a small boot/msr.h
> header and pulled in from there?

That sounds good too.

> Should we introduce something like this as well for cpucheck.c? Or
> re-write cpucheck.c to make use of the u64 versions? Or just set the
> cpucheck.c rework aside for now? (but still introduce the above helpers
> as boot/msr.h in preparation)?

How about you model it after

static int msr_read(u32 msr, struct msr *m)

from arch/x86/lib/msr.c which takes struct msr from which you can return
either u32s or a u64?

The stuff you share between the decompressor and kernel proper you put
in a arch/x86/include/asm/shared/ folder, for an example, see what we do
there in the TDX patchset:

https://lore.kernel.org/r/20220124150215.36893-11-kirill.shutemov@linux.intel.com

I.e., you move struct msr in such a shared header and then you include
it everywhere needed.

The arch/x86/boot/ msr helpers are then plain and simple, without
tracepoints and exception fixups and you define them in ...boot/msr.c or
so.

If the patch gets too big, make sure to split it in a couple so that it
is clear what happens at each step.

How does that sound?

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 05/43] x86/compressed/64: Detect/setup SEV/SME features earlier in boot
  2022-02-01 21:28       ` Borislav Petkov
@ 2022-02-02  0:52         ` Michael Roth
  2022-02-02  6:09           ` Borislav Petkov
  0 siblings, 1 reply; 115+ messages in thread
From: Michael Roth @ 2022-02-02  0:52 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Brijesh Singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On Tue, Feb 01, 2022 at 10:28:39PM +0100, Borislav Petkov wrote:
> On Tue, Feb 01, 2022 at 02:35:07PM -0600, Michael Roth wrote:
> > Unfortunately rdmsr()/wrmsr()/__rdmsr()/__wrmsr() etc. definitions are all
> > already getting pulled in via:
> > 
> >   misc.h:
> >     #include linux/elf.h
> >       #include linux/thread_info.h
> >         #include linux/cpufeature.h
> >           #include linux/processor.h
> >             #include linux/msr.h
> > 
> > Those definitions aren't usable in boot/compressed because of __ex_table
> > and possibly some other dependency hellishness.
> 
> And they should not be. Mixing kernel proper and decompressor code needs
> to stop and untangling that is a multi-year effort, unfortunately. ;-\
> 
> > Would read_msr()/write_msr() be reasonable alternative names for these new
> > helpers, or something else that better distinguishes them from the
> > kernel proper definitions?
> 
> Nah, just call them rdmsr/wrmsr(). There is already {read,write}_msr()
> tracepoint symbols in kernel proper and there's no point in keeping them
> apart using different names - that ship has long sailed.

Since the kernel proper rdmsr()/wrmsr() definitions are getting pulled in via
misc.h, I have to use a different name to avoid compiler errors. For now I've
gone with rd_msr()/wr_msr(), but no problem changing those if needed.

> > Should we introduce something like this as well for cpucheck.c? Or
> > re-write cpucheck.c to make use of the u64 versions? Or just set the
> > cpucheck.c rework aside for now? (but still introduce the above helpers
> > as boot/msr.h in preparation)?
> 
> How about you model it after
> 
> static int msr_read(u32 msr, struct msr *m)
> 
> from arch/x86/lib/msr.c which takes struct msr from which you can return
> either u32s or a u64?
> 
> The stuff you share between the decompressor and kernel proper you put
> in a arch/x86/include/asm/shared/ folder, for an example, see what we do
> there in the TDX patchset:
> 
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fr%2F20220124150215.36893-11-kirill.shutemov%40linux.intel.com&amp;data=04%7C01%7Cmichael.roth%40amd.com%7C1662b963f1c54f3663df08d9e5c9d6bd%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637793477399827883%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata=rtdo9ci3XjOHn2HphvNE7ciFR6tKh1pVclFuNhXkNGs%3D&amp;reserved=0
> 
> I.e., you move struct msr in such a shared header and then you include
> it everywhere needed.
> 
> The arch/x86/boot/ msr helpers are then plain and simple, without
> tracepoints and exception fixups and you define them in ...boot/msr.c or
> so.
> 
> If the patch gets too big, make sure to split it in a couple so that it
> is clear what happens at each step.
> 
> How does that sound?

Thanks for the suggestions, this works out nicely, but as far as defining
them in boot/msr.c, it looks like the Makefile in boot/compressed doesn't
currently have any instances of linking to objects in boot/, and the status
quo seems to be #include'ing the whole C file in cases where boot/ code is
needed in boot/compressed.

Since the rd_msr/wr_msr are oneliners, it seemed like it might be a
little cleaner to just define them in boot/msr.h as static inline and
include them directly as part of the header.

Here's what it looks like on top of this tree, and roughly how I plan to
split the patches for v10:

- define the rd_msr/wr_msr helpers
  https://github.com/mdroth/linux/commit/982c6c5741478c8f634db8ac0ba36575b5eff946

- use the helpers in boot/compressed/sev.c and boot/cpucheck.c
  https://github.com/mdroth/linux/commit/a16e11f727c01fc478d3b741e1bdd2fd44975d7c

For v10 though I'll likely just drop rd_sev_status_msr() completely and use
rd_msr() directly.

Let me know if I should make any changes and I'll make sure to get those in for
the next spin.

> 
> Thx.
> 
> -- 
> Regards/Gruss,
>     Boris.
> 
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpeople.kernel.org%2Ftglx%2Fnotes-about-netiquette&amp;data=04%7C01%7Cmichael.roth%40amd.com%7C1662b963f1c54f3663df08d9e5c9d6bd%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637793477399827883%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata=njDai06MS2mcyMLZ5YvLMcgXSXwfoO01U2c0D%2BE3HG4%3D&amp;reserved=0

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 05/43] x86/compressed/64: Detect/setup SEV/SME features earlier in boot
  2022-02-02  0:52         ` Michael Roth
@ 2022-02-02  6:09           ` Borislav Petkov
  2022-02-02 17:28             ` Michael Roth
  0 siblings, 1 reply; 115+ messages in thread
From: Borislav Petkov @ 2022-02-02  6:09 UTC (permalink / raw)
  To: Michael Roth
  Cc: Brijesh Singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On Tue, Feb 01, 2022 at 06:52:12PM -0600, Michael Roth wrote:
> Since the kernel proper rdmsr()/wrmsr() definitions are getting pulled in via
> misc.h, I have to use a different name to avoid compiler errors. For now I've
> gone with rd_msr()/wr_msr(), but no problem changing those if needed.

Does that fix it too?

diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
index 16ed360b6692..346c46d072c8 100644
--- a/arch/x86/boot/compressed/misc.h
+++ b/arch/x86/boot/compressed/misc.h
@@ -21,7 +21,6 @@
 
 #include <linux/linkage.h>
 #include <linux/screen_info.h>
-#include <linux/elf.h>
 #include <linux/io.h>
 #include <asm/page.h>
 #include <asm/boot.h>
---

This is exactly what I mean with a multi-year effort of untangling what
has been mindlessly mixed in over the years...

> Since the rd_msr/wr_msr are oneliners, it seemed like it might be a
> little cleaner to just define them in boot/msr.h as static inline and
> include them directly as part of the header.

Ok, that's fine too.

> Here's what it looks like on top of this tree, and roughly how I plan to
> split the patches for v10:
> 
> - define the rd_msr/wr_msr helpers
>   https://github.com/mdroth/linux/commit/982c6c5741478c8f634db8ac0ba36575b5eff946
> 
> - use the helpers in boot/compressed/sev.c and boot/cpucheck.c
>   https://github.com/mdroth/linux/commit/a16e11f727c01fc478d3b741e1bdd2fd44975d7c
> 
> For v10 though I'll likely just drop rd_sev_status_msr() completely and use
> rd_msr() directly.
> 
> Let me know if I should make any changes and I'll make sure to get those in for
> the next spin.

Yap, looks good.

Thanks for doing that!

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply related	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 15/43] x86/sev: Register GHCB memory when SEV-SNP is active
  2022-01-28 17:17 ` [PATCH v9 15/43] x86/sev: " Brijesh Singh
@ 2022-02-02 10:34   ` Borislav Petkov
  2022-02-02 14:29     ` Brijesh Singh
  0 siblings, 1 reply; 115+ messages in thread
From: Borislav Petkov @ 2022-02-02 10:34 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	brijesh.ksingh, tony.luck, marcorr, sathyanarayanan.kuppuswamy

On Fri, Jan 28, 2022 at 11:17:36AM -0600, Brijesh Singh wrote:
> diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
> index 24df739c9c05..b86b48b66a44 100644
> --- a/arch/x86/kernel/sev.c
> +++ b/arch/x86/kernel/sev.c
> @@ -41,7 +41,7 @@ static struct ghcb boot_ghcb_page __bss_decrypted __aligned(PAGE_SIZE);
>   * Needs to be in the .data section because we need it NULL before bss is
>   * cleared
>   */
> -static struct ghcb __initdata *boot_ghcb;
> +static struct ghcb *boot_ghcb __section(".data");
>  
>  /* Bitmap of SEV features supported by the hypervisor */
>  static u64 sev_hv_features __ro_after_init;
> @@ -161,55 +161,6 @@ void noinstr __sev_es_ist_exit(void)
>  	this_cpu_write(cpu_tss_rw.x86_tss.ist[IST_INDEX_VC], *(unsigned long *)ist);
>  }
>  
> -/*
> - * Nothing shall interrupt this code path while holding the per-CPU
> - * GHCB. The backup GHCB is only for NMIs interrupting this path.
> - *
> - * Callers must disable local interrupts around it.
> - */
> -static noinstr struct ghcb *__sev_get_ghcb(struct ghcb_state *state)

That move doesn't look like it's needed anymore, does it?

I mean, it builds even without it.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 17/43] x86/kernel: Make the .bss..decrypted section shared in RMP table
  2022-01-28 17:17 ` [PATCH v9 17/43] x86/kernel: Make the .bss..decrypted section shared in RMP table Brijesh Singh
@ 2022-02-02 11:06   ` Borislav Petkov
  0 siblings, 0 replies; 115+ messages in thread
From: Borislav Petkov @ 2022-02-02 11:06 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	brijesh.ksingh, tony.luck, marcorr, sathyanarayanan.kuppuswamy

On Fri, Jan 28, 2022 at 11:17:38AM -0600, Brijesh Singh wrote:
> The encryption attribute for the .bss..decrypted section is cleared in the
> initial page table build. This is because the section contains the data
> that need to be shared between the guest and the hypervisor.
> 
> When SEV-SNP is active, just clearing the encryption attribute in the
> page table is not enough. The page state need to be updated in the RMP
> table.
> 
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> ---
>  arch/x86/kernel/head64.c | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)
> 
> diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
> index 8075e91cff2b..1239bc104cda 100644
> --- a/arch/x86/kernel/head64.c
> +++ b/arch/x86/kernel/head64.c
> @@ -143,7 +143,21 @@ static unsigned long sme_postprocess_startup(struct boot_params *bp, pmdval_t *p
>  	if (sme_get_me_mask()) {
>  		vaddr = (unsigned long)__start_bss_decrypted;
>  		vaddr_end = (unsigned long)__end_bss_decrypted;
> +
>  		for (; vaddr < vaddr_end; vaddr += PMD_SIZE) {
> +			/*
> +			 * When SEV-SNP is active then transition the page to
> +			 * shared in the RMP table so that it is consistent with
> +			 * the page table attribute change.
> +			 *
> +			 * At this point, kernel is running in identity mapped mode.
> +			 * The __start_bss_decrypted is a regular kernel address. The
> +			 * early_snp_set_memory_shared() requires a valid virtual
> +			 * address, so use __pa() against __start_bss_decrypted to
> +			 * get valid virtual address.
> +			 */

How's that?

                        /*
                         * On SNP, transition the page to shared in the RMP table so that
                         * it is consistent with the page table attribute change.
                         *
                         * __start_bss_decrypted has a virtual address in the high range
                         * mapping (kernel .text). PVALIDATE, by way of
                         * early_snp_set_memory_shared(), requires a valid virtual
                         * address but the kernel is currently running off of the identity
                         * mapping so use __pa() to get a *currently* valid virtual address.
                         */

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 10/43] x86/sev: Check SEV-SNP features support
  2022-02-01 19:59   ` Borislav Petkov
@ 2022-02-02 14:28     ` Brijesh Singh
  2022-02-02 15:37       ` Borislav Petkov
  0 siblings, 1 reply; 115+ messages in thread
From: Brijesh Singh @ 2022-02-02 14:28 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: brijesh.singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy



On 2/1/22 1:59 PM, Borislav Petkov wrote:
...

> 
>> diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
>> index 19ad09712902..24df739c9c05 100644
>> --- a/arch/x86/kernel/sev.c
>> +++ b/arch/x86/kernel/sev.c
>> @@ -43,6 +43,9 @@ static struct ghcb boot_ghcb_page __bss_decrypted __aligned(PAGE_SIZE);
>>    */
>>   static struct ghcb __initdata *boot_ghcb;
>>   
>> +/* Bitmap of SEV features supported by the hypervisor */
>> +static u64 sev_hv_features __ro_after_init;
>> +
>>   /* #VC handler runtime per-CPU data */
>>   struct sev_es_runtime_data {
>>   	struct ghcb ghcb_page;
>> @@ -766,6 +769,18 @@ void __init sev_es_init_vc_handling(void)
>>   	if (!sev_es_check_cpu_features())
>>   		panic("SEV-ES CPU Features missing");
>>   
>> +	/*
>> +	 * SEV-SNP is supported in v2 of the GHCB spec which mandates support for HV
>> +	 * features. If SEV-SNP is enabled, then check if the hypervisor supports
>> +	 * the SEV-SNP features.
> 
> You guys have been completely brainwashed by marketing. I say:
> 
> "s/SEV-SNP/SNP/g
> 
> And please do that everywhere in sev-specific files."

Yeah, most of the documentation explicitly calls SEV-SNP, I was unsure 
about the trademark, so I used it in the comments/logs. I am okay with 
the SEV prefix removed; I am not in the marketing team, and hopefully, 
they will *never* see kernel code ;)

~ Brijesh




^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 15/43] x86/sev: Register GHCB memory when SEV-SNP is active
  2022-02-02 10:34   ` Borislav Petkov
@ 2022-02-02 14:29     ` Brijesh Singh
  0 siblings, 0 replies; 115+ messages in thread
From: Brijesh Singh @ 2022-02-02 14:29 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: brijesh.singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy



On 2/2/22 4:34 AM, Borislav Petkov wrote:
> That move doesn't look like it's needed anymore, does it?

Ah, yes, we no longer need it. I will be remove it in the next rev.

-Brijesh

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 10/43] x86/sev: Check SEV-SNP features support
  2022-02-02 14:28     ` Brijesh Singh
@ 2022-02-02 15:37       ` Borislav Petkov
  0 siblings, 0 replies; 115+ messages in thread
From: Borislav Petkov @ 2022-02-02 15:37 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	brijesh.ksingh, tony.luck, marcorr, sathyanarayanan.kuppuswamy

On Wed, Feb 02, 2022 at 08:28:17AM -0600, Brijesh Singh wrote:
> Yeah, most of the documentation explicitly calls SEV-SNP, I was unsure about
> the trademark, so I used it in the comments/logs. I am okay with the SEV
> prefix removed; I am not in the marketing team, and hopefully, they will
> *never* see kernel code ;)

They better! :-)

Also, in our kernel team here, the importance lies on having stuff
clearly and succinctly explained, without any bla or fluff. When you
look at that code later, you should go "ah, ok, that's why we're doing
this here" - not, "uuh, I need to sit down and parse that comment
first."

That's why I'd like for comments to have only good and important words -
no deadweight marketing bla.

:-)

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 18/43] x86/kernel: Validate ROM memory before accessing when SEV-SNP is active
  2022-01-28 17:17 ` [PATCH v9 18/43] x86/kernel: Validate ROM memory before accessing when SEV-SNP is active Brijesh Singh
@ 2022-02-02 15:41   ` Borislav Petkov
  0 siblings, 0 replies; 115+ messages in thread
From: Borislav Petkov @ 2022-02-02 15:41 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	brijesh.ksingh, tony.luck, marcorr, sathyanarayanan.kuppuswamy

On Fri, Jan 28, 2022 at 11:17:39AM -0600, Brijesh Singh wrote:
> @@ -197,11 +198,21 @@ static int __init romchecksum(const unsigned char *rom, unsigned long length)
>  
>  void __init probe_roms(void)
>  {
> -	const unsigned char *rom;
>  	unsigned long start, length, upper;
> +	const unsigned char *rom;
>  	unsigned char c;
>  	int i;
>  
> +	/*
> +	 * The ROM memory is not part of the E820 system RAM and is not pre-validated
> +	 * by the BIOS. The kernel page table maps the ROM region as encrypted memory,
> +	 * the SEV-SNP requires the encrypted memory must be validated before the
> +	 * access. Validate the ROM before accessing it.
> +	 */

Lemme massage it:

        /*
         * The ROM memory range is not part of the e820 table and is therefore not
         * pre-validated by BIOS. The kernel page table maps the ROM region as encrypted
         * memory, and SEV-SNP requires encrypted memory to be validated before access.
         * Do that here.
         */

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 19/43] x86/mm: Add support to validate memory when changing C-bit
  2022-01-28 17:17 ` [PATCH v9 19/43] x86/mm: Add support to validate memory when changing C-bit Brijesh Singh
@ 2022-02-02 16:10   ` Borislav Petkov
  0 siblings, 0 replies; 115+ messages in thread
From: Borislav Petkov @ 2022-02-02 16:10 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	brijesh.ksingh, tony.luck, marcorr, sathyanarayanan.kuppuswamy

On Fri, Jan 28, 2022 at 11:17:40AM -0600, Brijesh Singh wrote:
> diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
> index 5ee0fbd98d0d..b7ae741a8c66 100644
> --- a/arch/x86/kernel/sev.c
> +++ b/arch/x86/kernel/sev.c
> @@ -655,6 +655,173 @@ void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op
>  		WARN(1, "invalid memory op %d\n", op);
>  }
>  
> +static int vmgexit_psc(struct snp_psc_desc *desc)
> +{
> +	int cur_entry, end_entry, ret = 0;
> +	struct snp_psc_desc *data;
> +	struct ghcb_state state;
> +	struct es_em_ctxt ctxt;
> +	unsigned long flags;
> +	struct ghcb *ghcb;
> +
> +	/*
> +	 * __sev_get_ghcb() needs to run with IRQs disabled because it is using
> +	 * a per-CPU GHCB.
> +	 */
> +	local_irq_save(flags);
> +

I know the guest will terminate anyway but still...

> +	ghcb = __sev_get_ghcb(&state);
> +	if (unlikely(!ghcb))
> +		return 1;

This needs to be (btw, do you really need the unlikely()?)

	if (!ghcb) {
		ret = 1;
		goto out_unlock;
	}

and at the end you have

out:
        __sev_put_ghcb(&state);

out_unlock:
        local_irq_restore(flags);

...

> +static void set_pages_state(unsigned long vaddr, unsigned int npages, int op)
> +{
> +	unsigned long vaddr_end, next_vaddr;
> +	struct snp_psc_desc *desc;
> +
> +	desc = kmalloc(sizeof(*desc), GFP_KERNEL_ACCOUNT);
> +	if (!desc)
> +		panic("SEV-SNP: failed to allocate memory for PSC descriptor\n");
> +
> +	vaddr = vaddr & PAGE_MASK;
> +	vaddr_end = vaddr + (npages << PAGE_SHIFT);
> +
> +	while (vaddr < vaddr_end) {
> +		/*
> +		 * Calculate the last vaddr that can be fit in one

"... that fits in one ... "

and then the comment *fits* :) on a single line too:

		/* Calculate the last vaddr that fits in one struct snp_psc_desc. */

> +		 * struct snp_psc_desc.
> +		 */
> +		next_vaddr = min_t(unsigned long, vaddr_end,
> +				   (VMGEXIT_PSC_MAX_ENTRY * PAGE_SIZE) + vaddr);
> +
> +		__set_pages_state(desc, vaddr, next_vaddr, op);
> +
> +		vaddr = next_vaddr;
> +	}
> +
> +	kfree(desc);
> +}

...

> diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
> index b4072115c8ef..1bc15b9d15f3 100644
> --- a/arch/x86/mm/pat/set_memory.c
> +++ b/arch/x86/mm/pat/set_memory.c
> @@ -32,6 +32,7 @@
>  #include <asm/set_memory.h>
>  #include <asm/hyperv-tlfs.h>
>  #include <asm/mshyperv.h>
> +#include <asm/sev.h>
>  
>  #include "../mm_internal.h"
>  
> @@ -2012,8 +2013,22 @@ static int __set_memory_enc_pgtable(unsigned long addr, int numpages, bool enc)
>  	 */
>  	cpa_flush(&cpa, !this_cpu_has(X86_FEATURE_SME_COHERENT));
>  
> +	/*
> +	 * To maintain the security guarantees of SEV-SNP guest invalidate the
> +	 * memory before clearing the encryption attribute.
> +	 */
> +	if (!enc)
> +		snp_set_memory_shared(addr, numpages);
> +
>  	ret = __change_page_attr_set_clr(&cpa, 1);
>  
> +	/*
> +	 * Now that memory is mapped encrypted in the page table, validate it
> +	 * so that is consistent with the above page state.

" ... so that it is consistent... "

> +	 */
> +	if (!ret && enc)
> +		snp_set_memory_private(addr, numpages);
> +
>  	/*
>  	 * After changing the encryption attribute, we need to flush TLBs again
>  	 * in case any speculative TLB caching occurred (but no need to flush
> -- 
> 2.25.1
> 

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 40/43] x86/sev: Register SEV-SNP guest request platform device
  2022-02-01 20:21   ` Peter Gonda
@ 2022-02-02 16:27     ` Brijesh Singh
  0 siblings, 0 replies; 115+ messages in thread
From: Brijesh Singh @ 2022-02-02 16:27 UTC (permalink / raw)
  To: Peter Gonda
  Cc: brijesh.singh, the arch/x86 maintainers, LKML, kvm list,
	linux-efi, platform-driver-x86, linux-coco, linux-mm,
	Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Zijlstra,
	Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, Tony Luck, Marc Orr,
	Sathyanarayanan Kuppuswamy



On 2/1/22 2:21 PM, Peter Gonda wrote:

>> +       /* smoke-test the secrets page passed */
>> +       if (!info.secrets_phys || info.secrets_len != PAGE_SIZE)
>> +               return 0;
> 
> This seems like an error condition worth noting. If no cc_blob_address
> is passed it makes sense not to log but what if the address passed
> fails this smoke test, why not log?


If the smoke test fails, there will not be the /dev/sev-guest device, 
and userspace should log it accordingly. Having said so, I am not 
against logging if it helps.


thanks

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 05/43] x86/compressed/64: Detect/setup SEV/SME features earlier in boot
  2022-02-02  6:09           ` Borislav Petkov
@ 2022-02-02 17:28             ` Michael Roth
  2022-02-02 18:57               ` Borislav Petkov
  0 siblings, 1 reply; 115+ messages in thread
From: Michael Roth @ 2022-02-02 17:28 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Brijesh Singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On Wed, Feb 02, 2022 at 07:09:24AM +0100, Borislav Petkov wrote:
> On Tue, Feb 01, 2022 at 06:52:12PM -0600, Michael Roth wrote:
> > Since the kernel proper rdmsr()/wrmsr() definitions are getting pulled in via
> > misc.h, I have to use a different name to avoid compiler errors. For now I've
> > gone with rd_msr()/wr_msr(), but no problem changing those if needed.
> 
> Does that fix it too?
> 
> diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
> index 16ed360b6692..346c46d072c8 100644
> --- a/arch/x86/boot/compressed/misc.h
> +++ b/arch/x86/boot/compressed/misc.h
> @@ -21,7 +21,6 @@
>  
>  #include <linux/linkage.h>
>  #include <linux/screen_info.h>
> -#include <linux/elf.h>
>  #include <linux/io.h>
>  #include <asm/page.h>
>  #include <asm/boot.h>
> ---
> 
> This is exactly what I mean with a multi-year effort of untangling what
> has been mindlessly mixed in over the years...

Indeed... it looks like linux/{elf,io,efi,acpi}.h all end up pulling in
kernel proper's rdmsr()/wrmsr() definitions, and pulling them out ends up
breaking a bunch of other stuff, so I think we might be stuck using a
different name like rd_msr()/wr_msr() in the meantime.

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 05/43] x86/compressed/64: Detect/setup SEV/SME features earlier in boot
  2022-02-02 17:28             ` Michael Roth
@ 2022-02-02 18:57               ` Borislav Petkov
  0 siblings, 0 replies; 115+ messages in thread
From: Borislav Petkov @ 2022-02-02 18:57 UTC (permalink / raw)
  To: Michael Roth
  Cc: Brijesh Singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On Wed, Feb 02, 2022 at 11:28:01AM -0600, Michael Roth wrote:
> Indeed... it looks like linux/{elf,io,efi,acpi}.h all end up pulling in
> kernel proper's rdmsr()/wrmsr() definitions, and pulling them out ends up
> breaking a bunch of other stuff,

It's a nightmare - just gave it a try. No wonder they call it include
hell.

> so I think we might be stuck using a different name like
> rd_msr()/wr_msr() in the meantime.

Ok, but then pls call them boot_rdmsr() and boot_wrmsr() so that there's
a clear distinction from all the other msr helpers. And put a comment
above them in arch/x86/boot/msr.h explaining why they're called this
way.

One fine day I'll have this mess untangled and clean...

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 42/43] virt: sevguest: Add support to derive key
  2022-02-01 20:39   ` Peter Gonda
@ 2022-02-02 22:31     ` Brijesh Singh
  0 siblings, 0 replies; 115+ messages in thread
From: Brijesh Singh @ 2022-02-02 22:31 UTC (permalink / raw)
  To: Peter Gonda
  Cc: brijesh.singh, the arch/x86 maintainers, LKML, kvm list,
	linux-efi, platform-driver-x86, linux-coco, linux-mm,
	Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Zijlstra,
	Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, Tony Luck, Marc Orr,
	Sathyanarayanan Kuppuswamy, Liam Merwick


On 2/1/22 2:39 PM, Peter Gonda wrote:
> On Fri, Jan 28, 2022 at 10:19 AM Brijesh Singh <brijesh.singh@amd.com> wrote:
>> The SNP_GET_DERIVED_KEY ioctl interface can be used by the SNP guest to
>> ask the firmware to provide a key derived from a root key. The derived
>> key may be used by the guest for any purposes it chooses, such as a
>> sealing key or communicating with the external entities.
>>
>> See SEV-SNP firmware spec for more information.
>>
>> Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
>> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> Reviewed-by: Peter Gonda <pgonda@google.com>
>
>> ---
>>  Documentation/virt/coco/sevguest.rst  | 17 ++++++++++
>>  drivers/virt/coco/sevguest/sevguest.c | 45 +++++++++++++++++++++++++++
>>  include/uapi/linux/sev-guest.h        | 17 ++++++++++
>>  3 files changed, 79 insertions(+)
>>
>> diff --git a/Documentation/virt/coco/sevguest.rst b/Documentation/virt/coco/sevguest.rst
>> index 47ef3b0821d5..aafc9bce9aef 100644
>> --- a/Documentation/virt/coco/sevguest.rst
>> +++ b/Documentation/virt/coco/sevguest.rst
>> @@ -72,6 +72,23 @@ On success, the snp_report_resp.data will contains the report. The report
>>  contain the format described in the SEV-SNP specification. See the SEV-SNP
>>  specification for further details.
>>
>> +2.2 SNP_GET_DERIVED_KEY
>> +-----------------------
>> +:Technology: sev-snp
>> +:Type: guest ioctl
>> +:Parameters (in): struct snp_derived_key_req
>> +:Returns (out): struct snp_derived_key_resp on success, -negative on error
>> +
>> +The SNP_GET_DERIVED_KEY ioctl can be used to get a key derive from a root key.
> derived from ...
>
>> +The derived key can be used by the guest for any purpose, such as sealing keys
>> +or communicating with external entities.
> Question: How would this be used to communicate with external
> entities? Reading Section 7.2 it seems like we could pick the VCEK and
> have no guest specific inputs and we'd get the same derived key as we
> would on another guest on the same platform with, is that correct?

That could work. This method is using the idea that the guests would derive identical keys removing the need for a complex key establishment protocol. 

However, it's probably better to approach this slightly differently. The derived key can be used to produce a persistent guest identity key that can be recovered across instances. That key can sign an key establishment exchange (e.g. an ephemeral ECDH key) for establishing trusted channels with remote parties.

Further, that identity key could be signed by a centralized CA. A way to approach that could be placing the fingerprint of the identity key into the REPORT_DATA parameter of the attestation request message. The attestation report will come back signed by the security processor and will contain the fingerprint. A CSR to the CA can be accompanied by the attestation report to prove the key came from the guest described in the attestation report. 

If the guest does not require keys or secrets to be persisted, or if the guest has other means for persisting keys or secrets, the derivation messages are not necessary. It's an optional security primitive for guests to use.

thanks


^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 20/43] x86/sev: Use SEV-SNP AP creation to start secondary CPUs
  2022-01-28 17:17 ` [PATCH v9 20/43] x86/sev: Use SEV-SNP AP creation to start secondary CPUs Brijesh Singh
@ 2022-02-03  6:50   ` Borislav Petkov
  0 siblings, 0 replies; 115+ messages in thread
From: Borislav Petkov @ 2022-02-03  6:50 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	brijesh.ksingh, tony.luck, marcorr, sathyanarayanan.kuppuswamy

On Fri, Jan 28, 2022 at 11:17:41AM -0600, Brijesh Singh wrote:
> @@ -822,6 +842,236 @@ void snp_set_memory_private(unsigned long vaddr, unsigned int npages)
>  	pvalidate_pages(vaddr, npages, true);
>  }
>  
> +static int snp_set_vmsa(void *va, bool vmsa)
> +{
> +	u64 attrs;
> +
> +	/*
> +	 * Running at VMPL0 allows the kernel to change the VMSA bit for a page
> +	 * using the RMPADJUST instruction. However, for the instruction to
> +	 * succeed it must target the permissions of a lesser privileged

"lesser privileged/higher number VMPL level"

so that it is perfectly clear what this means.

> +	 * VMPL level, so use VMPL1 (refer to the RMPADJUST instruction in the
> +	 * AMD64 APM Volume 3).
> +	 */
> +	attrs = 1;
> +	if (vmsa)
> +		attrs |= RMPADJUST_VMSA_PAGE_BIT;
> +
> +	return rmpadjust((unsigned long)va, RMP_PG_SIZE_4K, attrs);
> +}

...

> +static int wakeup_cpu_via_vmgexit(int apic_id, unsigned long start_ip)
> +{
> +	struct sev_es_save_area *cur_vmsa, *vmsa;
> +	struct ghcb_state state;
> +	unsigned long flags;
> +	struct ghcb *ghcb;
> +	u8 sipi_vector;
> +	int cpu, ret;
> +	u64 cr4;
> +
> +	/*
> +	 * SNP-SNP AP creation requires that the hypervisor must support SEV-SNP
	   ^^^^^^^

See what I mean? :-)

That marketing has brainwashed y'all.

> +	 * feature. The SEV-SNP feature check is already performed, so just check
> +	 * for the AP_CREATION feature flag.
> +	 */

Let's clean this one:

	/*
	 * The hypervisor SNP feature support check has happened earlier, just check
	 * the AP_CREATION one here.
	 */


> +	if (!(sev_hv_features & GHCB_HV_FT_SNP_AP_CREATION))
> +		return -EOPNOTSUPP;
> +
> +	/*
> +	 * Verify the desired start IP against the known trampoline start IP
> +	 * to catch any future new trampolines that may be introduced that
> +	 * would require a new protected guest entry point.
> +	 */
> +	if (WARN_ONCE(start_ip != real_mode_header->trampoline_start,
> +		      "Unsupported SEV-SNP start_ip: %lx\n", start_ip))
> +		return -EINVAL;
> +
> +	/* Override start_ip with known protected guest start IP */
> +	start_ip = real_mode_header->sev_es_trampoline_start;

Yah, I'd like to get rid of that ->sev_es_trampoline_start and use the
normal ->trampoline_start. TDX is introducing a third one even and
they're all mutually-exclusive u32 values.

...

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 22/43] x86/sev: Move MSR-based VMGEXITs for CPUID to helper
  2022-01-28 17:17 ` [PATCH v9 22/43] x86/sev: Move MSR-based VMGEXITs for CPUID to helper Brijesh Singh
@ 2022-02-03 13:59   ` Borislav Petkov
  0 siblings, 0 replies; 115+ messages in thread
From: Borislav Petkov @ 2022-02-03 13:59 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	brijesh.ksingh, tony.luck, marcorr, sathyanarayanan.kuppuswamy

On Fri, Jan 28, 2022 at 11:17:43AM -0600, Brijesh Singh wrote:
> From: Michael Roth <michael.roth@amd.com>
> 
> This code will also be used later for SEV-SNP-validated CPUID code in
> some cases, so move it to a common helper.

Suggested-by: Sean Christopherson <seanjc@google.com>

unless he really doesn't want to be the "suggestor" :)

> Signed-off-by: Michael Roth <michael.roth@amd.com>
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> ---
>  arch/x86/kernel/sev-shared.c | 62 +++++++++++++++++++++---------------
>  1 file changed, 36 insertions(+), 26 deletions(-)
> 
> diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c
> index 3aaef1a18ffe..633f1f93b6e1 100644
> --- a/arch/x86/kernel/sev-shared.c
> +++ b/arch/x86/kernel/sev-shared.c
> @@ -194,6 +194,36 @@ enum es_result sev_es_ghcb_hv_call(struct ghcb *ghcb, bool set_ghcb_msr,
>  	return verify_exception_info(ghcb, ctxt);
>  }
>  
> +static int __sev_cpuid_hv(u32 func, int reg_idx, u32 *reg)
> +{
> +	u64 val;
> +
> +	if (!reg)
> +		return 0;

I don't know what that's supposed to catch? In case callers are
interested only in some subset of the CPUID leaf?

Meh, I guess...

> +	sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(func, reg_idx));
> +	VMGEXIT();
> +	val = sev_es_rd_ghcb_msr();
> +	if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
> +		return -EIO;
> +
> +	*reg = (val >> 32);
> +
> +	return 0;
> +}
> +
> +static int sev_cpuid_hv(u32 func, u32 *eax, u32 *ebx, u32 *ecx, u32 *edx)
> +{
> +	int ret;
> +
> +	ret = __sev_cpuid_hv(func, GHCB_CPUID_REQ_EAX, eax);
> +	ret = ret ? : __sev_cpuid_hv(func, GHCB_CPUID_REQ_EBX, ebx);
> +	ret = ret ? : __sev_cpuid_hv(func, GHCB_CPUID_REQ_ECX, ecx);
> +	ret = ret ? : __sev_cpuid_hv(func, GHCB_CPUID_REQ_EDX, edx);

You can format that this way:

        ret =         __sev_cpuid_hv(func, GHCB_CPUID_REQ_EAX, eax);
        ret = ret ? : __sev_cpuid_hv(func, GHCB_CPUID_REQ_EBX, ebx);
        ret = ret ? : __sev_cpuid_hv(func, GHCB_CPUID_REQ_ECX, ecx);
        ret = ret ? : __sev_cpuid_hv(func, GHCB_CPUID_REQ_EDX, edx);


and then it is visible at a quick glance what this does, due to the
regularity.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 24/43] x86/compressed/acpi: Move EFI detection to helper
  2022-01-28 17:17 ` [PATCH v9 24/43] x86/compressed/acpi: Move EFI detection " Brijesh Singh
@ 2022-02-03 14:39   ` Borislav Petkov
  0 siblings, 0 replies; 115+ messages in thread
From: Borislav Petkov @ 2022-02-03 14:39 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	brijesh.ksingh, tony.luck, marcorr, sathyanarayanan.kuppuswamy

On Fri, Jan 28, 2022 at 11:17:45AM -0600, Brijesh Singh wrote:
> diff --git a/arch/x86/boot/compressed/efi.c b/arch/x86/boot/compressed/efi.c
> new file mode 100644
> index 000000000000..daa73efdc7a5
> --- /dev/null
> +++ b/arch/x86/boot/compressed/efi.c
> @@ -0,0 +1,50 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Helpers for early access to EFI configuration table.
> + *
> + * Originally derived from arch/x86/boot/compressed/acpi.c
> + */
> +
> +#include "misc.h"
> +#include <linux/efi.h>
> +#include <asm/efi.h>

Yap, it is includes like that which cause this whole decompressor
include hell. One day...

> +
> +/**
> + * efi_get_type - Given boot_params, determine the type of EFI environment.
> + *
> + * @boot_params:        pointer to boot_params
> + *
> + * Return: EFI_TYPE_{32,64} for valid EFI environments, EFI_TYPE_NONE otherwise.
> + */
> +enum efi_type efi_get_type(struct boot_params *boot_params)

				struct boot_params *bp

> +{
> +	struct efi_info *ei;
> +	enum efi_type et;
> +	const char *sig;
> +
> +	ei = &boot_params->efi_info;
> +	sig = (char *)&ei->efi_loader_signature;
> +
> +	if (!strncmp(sig, EFI64_LOADER_SIGNATURE, 4)) {
> +		et = EFI_TYPE_64;
> +	} else if (!strncmp(sig, EFI32_LOADER_SIGNATURE, 4)) {
> +		et = EFI_TYPE_32;
> +	} else {
> +		debug_putstr("No EFI environment detected.\n");
> +		et = EFI_TYPE_NONE;
> +	}
> +
> +#ifndef CONFIG_X86_64
> +	/*
> +	 * Existing callers like acpi.c treat this case as an indicator to
> +	 * fall-through to non-EFI, rather than an error, so maintain that
> +	 * functionality here as well.
> +	 */
> +	if (ei->efi_systab_hi || ei->efi_memmap_hi) {
> +		debug_putstr("EFI system table is located above 4GB and cannot be accessed.\n");
> +		et = EFI_TYPE_NONE;
> +	}
> +#endif
> +
> +	return et;
> +}
> diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
> index 01cc13c12059..a26244c0fe01 100644
> --- a/arch/x86/boot/compressed/misc.h
> +++ b/arch/x86/boot/compressed/misc.h
> @@ -176,4 +176,20 @@ void boot_stage2_vc(void);
>  
>  unsigned long sev_verify_cbit(unsigned long cr3);
>  
> +enum efi_type {
> +	EFI_TYPE_64,
> +	EFI_TYPE_32,
> +	EFI_TYPE_NONE,
> +};

Haha, EFI folks will be wondering where in the spec that thing is... :-)))

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 25/43] x86/compressed/acpi: Move EFI system table lookup to helper
  2022-01-28 17:17 ` [PATCH v9 25/43] x86/compressed/acpi: Move EFI system table lookup " Brijesh Singh
@ 2022-02-03 14:48   ` Borislav Petkov
  0 siblings, 0 replies; 115+ messages in thread
From: Borislav Petkov @ 2022-02-03 14:48 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	brijesh.ksingh, tony.luck, marcorr, sathyanarayanan.kuppuswamy

On Fri, Jan 28, 2022 at 11:17:46AM -0600, Brijesh Singh wrote:
> diff --git a/arch/x86/boot/compressed/efi.c b/arch/x86/boot/compressed/efi.c
> index daa73efdc7a5..bf99768cd229 100644
> --- a/arch/x86/boot/compressed/efi.c
> +++ b/arch/x86/boot/compressed/efi.c
> @@ -48,3 +48,32 @@ enum efi_type efi_get_type(struct boot_params *boot_params)
>  
>  	return et;
>  }
> +
> +/*

/**

kernel-doc comment pls.

> + * efi_get_system_table - Given boot_params, retrieve the physical address of
> + *                        EFI system table.
> + *
> + * @boot_params:        pointer to boot_params
> + *
> + * Return: EFI system table address on success. On error, return 0.
> + */
> +unsigned long efi_get_system_table(struct boot_params *boot_params)

... *bp) - as before.

Please go through all those functions taking in struct boot_params ptr.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 26/43] x86/compressed/acpi: Move EFI config table lookup to helper
  2022-01-28 17:17 ` [PATCH v9 26/43] x86/compressed/acpi: Move EFI config " Brijesh Singh
@ 2022-02-03 15:13   ` Borislav Petkov
  0 siblings, 0 replies; 115+ messages in thread
From: Borislav Petkov @ 2022-02-03 15:13 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	brijesh.ksingh, tony.luck, marcorr, sathyanarayanan.kuppuswamy

On Fri, Jan 28, 2022 at 11:17:47AM -0600, Brijesh Singh wrote:
> +int efi_get_conf_table(struct boot_params *boot_params, unsigned long *cfg_tbl_pa,
> +		       unsigned int *cfg_tbl_len)
> +{
> +	unsigned long sys_tbl_pa = 0;
> +	enum efi_type et;
> +	int ret;
> +
> +	if (!cfg_tbl_pa || !cfg_tbl_len)
> +		return -EINVAL;
> +
> +	sys_tbl_pa = efi_get_system_table(boot_params);
> +	if (!sys_tbl_pa)
> +		return -EINVAL;
> +
> +	/* Handle EFI bitness properly */
> +	et = efi_get_type(boot_params);
> +	if (et == EFI_TYPE_64) {
> +		efi_system_table_64_t *stbl =
> +			(efi_system_table_64_t *)sys_tbl_pa;
> +
> +		*cfg_tbl_pa = stbl->tables;
> +		*cfg_tbl_len = stbl->nr_tables;
> +	} else if (et == EFI_TYPE_32) {
> +		efi_system_table_32_t *stbl =
> +			(efi_system_table_32_t *)sys_tbl_pa;

You don't have to break those assignment lines - the 80 cols rule is not
a hard one. Readability is more important.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 23/43] KVM: x86: Move lookup of indexed CPUID leafs to helper
  2022-01-28 17:17 ` [PATCH v9 23/43] KVM: x86: Move lookup of indexed CPUID leafs " Brijesh Singh
@ 2022-02-03 15:16   ` Borislav Petkov
  2022-02-03 16:44     ` Michael Roth
  0 siblings, 1 reply; 115+ messages in thread
From: Borislav Petkov @ 2022-02-03 15:16 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	brijesh.ksingh, tony.luck, marcorr, sathyanarayanan.kuppuswamy,
	Venu Busireddy

On Fri, Jan 28, 2022 at 11:17:44AM -0600, Brijesh Singh wrote:
> From: Michael Roth <michael.roth@amd.com>
> 
> Determining which CPUID leafs have significant ECX/index values is
> also needed by guest kernel code when doing SEV-SNP-validated CPUID
> lookups. Move this to common code to keep future updates in sync.
> 
> Reviewed-by: Venu Busireddy <venu.busireddy@oracle.com>
> Signed-off-by: Michael Roth <michael.roth@amd.com>
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> ---
>  arch/x86/include/asm/cpuid.h | 38 ++++++++++++++++++++++++++++++++++++
>  arch/x86/kvm/cpuid.c         | 19 ++----------------
>  2 files changed, 40 insertions(+), 17 deletions(-)
>  create mode 100644 arch/x86/include/asm/cpuid.h
> 
> diff --git a/arch/x86/include/asm/cpuid.h b/arch/x86/include/asm/cpuid.h
> new file mode 100644
> index 000000000000..00408aded67c
> --- /dev/null
> +++ b/arch/x86/include/asm/cpuid.h
> @@ -0,0 +1,38 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Kernel-based Virtual Machine driver for Linux cpuid support routines
> + *
> + * derived from arch/x86/kvm/x86.c
> + * derived from arch/x86/kvm/cpuid.c
> + *
> + * Copyright 2011 Red Hat, Inc. and/or its affiliates.
> + * Copyright IBM Corporation, 2008
> + */

I have no clue what you're trying to achieve by copying the copyright of
the file this comes from. As dhansen properly points out, those lines
in that function come from other folks/companies too so why even bother
with this?

git history holds the correct and full copyright anyway...

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 23/43] KVM: x86: Move lookup of indexed CPUID leafs to helper
  2022-02-03 15:16   ` Borislav Petkov
@ 2022-02-03 16:44     ` Michael Roth
  2022-02-05 12:58       ` Borislav Petkov
  0 siblings, 1 reply; 115+ messages in thread
From: Michael Roth @ 2022-02-03 16:44 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Brijesh Singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Venu Busireddy

On Thu, Feb 03, 2022 at 04:16:33PM +0100, Borislav Petkov wrote:
> On Fri, Jan 28, 2022 at 11:17:44AM -0600, Brijesh Singh wrote:
> > From: Michael Roth <michael.roth@amd.com>
> > 
> > Determining which CPUID leafs have significant ECX/index values is
> > also needed by guest kernel code when doing SEV-SNP-validated CPUID
> > lookups. Move this to common code to keep future updates in sync.
> > 
> > Reviewed-by: Venu Busireddy <venu.busireddy@oracle.com>
> > Signed-off-by: Michael Roth <michael.roth@amd.com>
> > Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> > ---
> >  arch/x86/include/asm/cpuid.h | 38 ++++++++++++++++++++++++++++++++++++
> >  arch/x86/kvm/cpuid.c         | 19 ++----------------
> >  2 files changed, 40 insertions(+), 17 deletions(-)
> >  create mode 100644 arch/x86/include/asm/cpuid.h
> > 
> > diff --git a/arch/x86/include/asm/cpuid.h b/arch/x86/include/asm/cpuid.h
> > new file mode 100644
> > index 000000000000..00408aded67c
> > --- /dev/null
> > +++ b/arch/x86/include/asm/cpuid.h
> > @@ -0,0 +1,38 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Kernel-based Virtual Machine driver for Linux cpuid support routines
> > + *
> > + * derived from arch/x86/kvm/x86.c
> > + * derived from arch/x86/kvm/cpuid.c
> > + *
> > + * Copyright 2011 Red Hat, Inc. and/or its affiliates.
> > + * Copyright IBM Corporation, 2008
> > + */
> 
> I have no clue what you're trying to achieve by copying the copyright of
> the file this comes from. As dhansen properly points out, those lines
> in that function come from other folks/companies too so why even bother
> with this?

I think Dave's main concern was that I'd added an AMD copyright banner
to a new file that was mostly derived from acpi.c. I thought we had some
agreement on simply adopting the file-wide copyright banner of whatever
source file the new one was derived from, since dropping an existing
copyright seemed similarly in bad taste, but if it's sufficient to lean
on git for getting a more accurate picture of copyright sources then
that sounds good to me and I'll adopt that for the next spin if there
are no objections.

  https://lore.kernel.org/linux-efi/16afaa00-06a9-dc58-6c59-3d1dfb819009@amd.com/T/#m88a765b6090ec794872f73bf0ee6642fd39db947

(In the case of acpi.c it happened to not have a file-wide copyright banner
so things were a little more straightforward for the acpi.c->efi.c movement)

> 
> git history holds the correct and full copyright anyway...
> 
> -- 
> Regards/Gruss,
>     Boris.
> 
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpeople.kernel.org%2Ftglx%2Fnotes-about-netiquette&amp;data=04%7C01%7CMichael.Roth%40amd.com%7C189a90d4aa3e459f7d7908d9e7282f9c%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637794982059320697%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata=T95Les9sx71RFXYAImDag9%2FclmImsnjMbzPkOvIsbbY%3D&amp;reserved=0

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 28/43] x86/compressed/acpi: Move EFI kexec handling into common code
  2022-01-28 17:17 ` [PATCH v9 28/43] x86/compressed/acpi: Move EFI kexec handling into common code Brijesh Singh
@ 2022-02-04 16:09   ` Borislav Petkov
  0 siblings, 0 replies; 115+ messages in thread
From: Borislav Petkov @ 2022-02-04 16:09 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	brijesh.ksingh, tony.luck, marcorr, sathyanarayanan.kuppuswamy

On Fri, Jan 28, 2022 at 11:17:49AM -0600, Brijesh Singh wrote:
> +static struct efi_setup_data *get_kexec_setup_data(struct boot_params *boot_params,
> +						   enum efi_type et)
> +{
> +#ifdef CONFIG_X86_64
> +	struct efi_setup_data *esd = NULL;
> +	struct setup_data *data;
> +	u64 pa_data;
> +
> +	if (et != EFI_TYPE_64)
> +		return NULL;

Why that check here? That function is called for EFI_TYPE_64 only anyway.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 29/43] x86/boot: Add Confidential Computing type to setup_data
  2022-01-28 17:17 ` [PATCH v9 29/43] x86/boot: Add Confidential Computing type to setup_data Brijesh Singh
@ 2022-02-04 16:21   ` Borislav Petkov
  2022-02-04 17:41     ` Brijesh Singh
  0 siblings, 1 reply; 115+ messages in thread
From: Borislav Petkov @ 2022-02-04 16:21 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	brijesh.ksingh, tony.luck, marcorr, sathyanarayanan.kuppuswamy

On Fri, Jan 28, 2022 at 11:17:50AM -0600, Brijesh Singh wrote:
> +/*
> + * AMD SEV Confidential computing blob structure. The structure is
> + * defined in OVMF UEFI firmware header:
> + * https://github.com/tianocore/edk2/blob/master/OvmfPkg/Include/Guid/ConfidentialComputingSevSnpBlob.h

So looking at that typedef struct CONFIDENTIAL_COMPUTING_SNP_BLOB_LOCATION there:

typedef struct {
  UINT32    Header;
  UINT16    Version;
  UINT16    Reserved1;
  UINT64    SecretsPhysicalAddress;
  UINT32    SecretsSize;
  UINT64    CpuidPhysicalAddress;
  UINT32    CpuidLSize;
} CONFIDENTIAL_COMPUTING_SNP_BLOB_LOCATION;

> + */
> +#define CC_BLOB_SEV_HDR_MAGIC	0x45444d41
> +struct cc_blob_sev_info {
> +	u32 magic;

That's called "Header" there.

> +	u16 version;
> +	u16 reserved;
> +	u64 secrets_phys;
> +	u32 secrets_len;
> +	u32 rsvd1;

You've added that member for padding but the fw blob one doesn't have
it.

But if we get a blob from the firmware and the structure layout differs,
how is this supposed to even work?

> +	u64 cpuid_phys;
> +	u32 cpuid_len;
> +	u32 rsvd2;

That one too.

Or are you going to change the blob layout in ovmf too, to match?

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 29/43] x86/boot: Add Confidential Computing type to setup_data
  2022-02-04 16:21   ` Borislav Petkov
@ 2022-02-04 17:41     ` Brijesh Singh
  0 siblings, 0 replies; 115+ messages in thread
From: Brijesh Singh @ 2022-02-04 17:41 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: brijesh.singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy


On 2/4/22 10:21 AM, Borislav Petkov wrote:
> On Fri, Jan 28, 2022 at 11:17:50AM -0600, Brijesh Singh wrote:
>> +/*
>> + * AMD SEV Confidential computing blob structure. The structure is
>> + * defined in OVMF UEFI firmware header:
>> + * https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Ftianocore%2Fedk2%2Fblob%2Fmaster%2FOvmfPkg%2FInclude%2FGuid%2FConfidentialComputingSevSnpBlob.h&amp;data=04%7C01%7Cbrijesh.singh%40amd.com%7C334b7454d7a541f3497d08d9e7fa796c%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637795885258984697%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata=XNPvA7re7WHpAgzeuOC%2Buu0ql18P6KOIbP5ZriFsxEY%3D&amp;reserved=0
> So looking at that typedef struct CONFIDENTIAL_COMPUTING_SNP_BLOB_LOCATION there:
>
> typedef struct {
>   UINT32    Header;
>   UINT16    Version;
>   UINT16    Reserved1;
>   UINT64    SecretsPhysicalAddress;
>   UINT32    SecretsSize;
>   UINT64    CpuidPhysicalAddress;
>   UINT32    CpuidLSize;
> } CONFIDENTIAL_COMPUTING_SNP_BLOB_LOCATION;
>
>> + */
>> +#define CC_BLOB_SEV_HDR_MAGIC	0x45444d41
>> +struct cc_blob_sev_info {
>> +	u32 magic;
> That's called "Header" there.

I will rename it to keep it consistent with OVMF header.


>> +	u16 version;
>> +	u16 reserved;
>> +	u64 secrets_phys;
>> +	u32 secrets_len;
>> +	u32 rsvd1;
> You've added that member for padding but the fw blob one doesn't have
> it.
>
> But if we get a blob from the firmware and the structure layout differs,
> how is this supposed to even work?
>
>> +	u64 cpuid_phys;
>> +	u32 cpuid_len;
>> +	u32 rsvd2;
> That one too.
>
> Or are you going to change the blob layout in ovmf too, to match?

Yes, that's the plan. I have tested my OVMF with v9 and everything looks
good at the runtime. Will send OVMF patch to match the blob layout.



^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 31/43] x86/compressed/64: Add support for SEV-SNP CPUID table in #VC handlers
  2022-01-28 17:17 ` [PATCH v9 31/43] x86/compressed/64: Add support for SEV-SNP CPUID table in #VC handlers Brijesh Singh
@ 2022-02-05 10:54   ` Borislav Petkov
  2022-02-05 15:42     ` Michael Roth
  2022-02-05 16:22     ` Michael Roth
  0 siblings, 2 replies; 115+ messages in thread
From: Borislav Petkov @ 2022-02-05 10:54 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	brijesh.ksingh, tony.luck, marcorr, sathyanarayanan.kuppuswamy

On Fri, Jan 28, 2022 at 11:17:52AM -0600, Brijesh Singh wrote:
> +/*
> + * Individual entries of the SEV-SNP CPUID table, as defined by the SEV-SNP
> + * Firmware ABI, Revision 0.9, Section 7.1, Table 14.
> + */
> +struct snp_cpuid_fn {
> +	u32 eax_in;
> +	u32 ecx_in;
> +	u64 xcr0_in;
> +	u64 xss_in;

So what's the end result here:

-+	u64 __unused;
-+	u64 __unused2;
++	u64 xcr0_in;
++	u64 xss_in;

those are not unused fields anymore but xcr0 and xss input values?

Looking at the FW abi doc, they're only mentioned in "Table 14.
CPUID_FUNCTION Structure" that they're XCR0 and XSS at the time of the
CPUID execution.

But those values are input values to what exactly, guest or firmware?

There's a typo in the FW doc, btw:

"The guest constructs an MSG_CPUID_REQ message as defined in Table 13.
This message contains an array of CPUID function structures as defined
in Table 13."

That second "Table" is 14 not 13.

So, if an array CPUID_FUNCTION[] is passed as part of an MSG_CPUID_REQ
command, then, the two _IN variables contain what the guest received
from the HV for XCR0 and XSS values. Which means, this is the guest
asking the FW whether those values the HV gave the guest are kosher.

Am I close?

> +static const struct snp_cpuid_info *snp_cpuid_info_get_ptr(void)
> +{
> +	void *ptr;
> +
> +	asm ("lea cpuid_info_copy(%%rip), %0"
> +	     : "=r" (ptr)

Same question as the last time:

Why not "=g" and let the compiler decide?

> +	     : "p" (&cpuid_info_copy));
> +
> +	return ptr;
> +}

...

> +static bool snp_cpuid_check_range(u32 func)
> +{
> +	if (func <= cpuid_std_range_max ||
> +	    (func >= 0x40000000 && func <= cpuid_hyp_range_max) ||
> +	    (func >= 0x80000000 && func <= cpuid_ext_range_max))
> +		return true;
> +
> +	return false;
> +}
> +
> +static int snp_cpuid_postprocess(u32 func, u32 subfunc, u32 *eax, u32 *ebx,
> +				 u32 *ecx, u32 *edx)

And again, same question as the last time:

I'm wondering if you could make everything a lot easier by doing

static int snp_cpuid_postprocess(struct cpuid_leaf *leaf)

and marshall around that struct cpuid_leaf which contains func, subfunc,
e[abcd]x instead of dealing with 6 parameters.

Callers of snp_cpuid() can simply allocate it on their stack and hand it
in and it is all in sev-shared.c so nicely self-contained...

Ok I'm ignoring this patch for now and I'll review it only after you've
worked in all comments from the previous review.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 23/43] KVM: x86: Move lookup of indexed CPUID leafs to helper
  2022-02-03 16:44     ` Michael Roth
@ 2022-02-05 12:58       ` Borislav Petkov
  0 siblings, 0 replies; 115+ messages in thread
From: Borislav Petkov @ 2022-02-05 12:58 UTC (permalink / raw)
  To: Michael Roth
  Cc: Brijesh Singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Venu Busireddy, Greg KH

On Thu, Feb 03, 2022 at 10:44:43AM -0600, Michael Roth wrote:
> I think Dave's main concern was that I'd added an AMD copyright banner
> to a new file that was mostly derived from acpi.c. I thought we had some
> agreement on simply adopting the file-wide copyright banner of whatever
> source file the new one was derived from, since dropping an existing
> copyright seemed similarly in bad taste,...

Well, I think simply saying where this function was carved out from:

	arch/x86/kvm/cpuid.c

is good enough. If someone is so much interested in a copyright, someone
can read that file's copyright. Especially since that copyright line
in the original file is not telling you a whole lot about who are all
copyright owners.

And before we delve into lawyer-land, let's simply point to where we got
this from and be done with it.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 32/43] x86/boot: Add a pointer to Confidential Computing blob in bootparams
  2022-01-28 17:17 ` [PATCH v9 32/43] x86/boot: Add a pointer to Confidential Computing blob in bootparams Brijesh Singh
@ 2022-02-05 13:07   ` Borislav Petkov
  0 siblings, 0 replies; 115+ messages in thread
From: Borislav Petkov @ 2022-02-05 13:07 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	brijesh.ksingh, tony.luck, marcorr, sathyanarayanan.kuppuswamy

On Fri, Jan 28, 2022 at 11:17:53AM -0600, Brijesh Singh wrote:
> From: Michael Roth <michael.roth@amd.com>
> 
> The previously defined Confidential Computing blob is provided to the
> kernel via a setup_data structure or EFI config table entry. Currently
> these are both checked for by boot/compressed kernel to access the
> CPUID table address within it for use with SEV-SNP CPUID enforcement.
> 
> To also enable SEV-SNP CPUID enforcement for the run-time kernel,
> similar early access to the CPUID table is needed early on while it's
> still using the identity-mapped page table set up by boot/compressed,
> where global pointers need to be accessed via fixup_pointer().
> 
> This isn't much of an issue for accessing setup_data, and the EFI
> config table helper code currently used in boot/compressed *could* be
> used in this case as well since they both rely on identity-mapping.
> However, it has some reliance on EFI helpers/string constants that
> would need to be accessed via fixup_pointer(), and fixing it up while
> making it shareable between boot/compressed and run-time kernel is
> fragile and introduces a good bit of uglyness.
> 
> Instead, add a boot_params->cc_blob_address pointer that the
> boot/compressed kernel can initialize so that the run-time kernel can
> access the CC blob from there instead of re-scanning the EFI config
> table.
> 
> Signed-off-by: Michael Roth <michael.roth@amd.com>
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> ---
>  arch/x86/include/asm/bootparam_utils.h | 1 +
>  arch/x86/include/uapi/asm/bootparam.h  | 3 ++-
>  2 files changed, 3 insertions(+), 1 deletion(-)

Another review comment ignored:

https://lore.kernel.org/r/YeWyCtr11rL7dxpT@zn.tnic

/me ignores this patch too.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 31/43] x86/compressed/64: Add support for SEV-SNP CPUID table in #VC handlers
  2022-02-05 10:54   ` Borislav Petkov
@ 2022-02-05 15:42     ` Michael Roth
  2022-02-05 16:22     ` Michael Roth
  1 sibling, 0 replies; 115+ messages in thread
From: Michael Roth @ 2022-02-05 15:42 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Brijesh Singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On Sat, Feb 05, 2022 at 11:54:01AM +0100, Borislav Petkov wrote:
> On Fri, Jan 28, 2022 at 11:17:52AM -0600, Brijesh Singh wrote:
> > +/*
> > + * Individual entries of the SEV-SNP CPUID table, as defined by the SEV-SNP
> > + * Firmware ABI, Revision 0.9, Section 7.1, Table 14.
> > + */
> > +struct snp_cpuid_fn {
> > +	u32 eax_in;
> > +	u32 ecx_in;
> > +	u64 xcr0_in;
> > +	u64 xss_in;
> 
> So what's the end result here:
> 
> -+	u64 __unused;
> -+	u64 __unused2;
> ++	u64 xcr0_in;
> ++	u64 xss_in;
> 
> those are not unused fields anymore but xcr0 and xss input values?
> 
> Looking at the FW abi doc, they're only mentioned in "Table 14.
> CPUID_FUNCTION Structure" that they're XCR0 and XSS at the time of the
> CPUID execution.
> 
> But those values are input values to what exactly, guest or firmware?

These are inputs to firmware, either through a MSG_CPUID_REQ guest
message that gets passed along by the HV to firmware for validation,
or, in this case, through SNP_LAUNCH_UPDATE when the HV passes the
initial CPUID table contents to firmware for validation before it
gets mapped into guest memory.

> 
> There's a typo in the FW doc, btw:
> 
> "The guest constructs an MSG_CPUID_REQ message as defined in Table 13.
> This message contains an array of CPUID function structures as defined
> in Table 13."
> 
> That second "Table" is 14 not 13.
> 
> So, if an array CPUID_FUNCTION[] is passed as part of an MSG_CPUID_REQ
> command, then, the two _IN variables contain what the guest received
> from the HV for XCR0 and XSS values. Which means, this is the guest
> asking the FW whether those values the HV gave the guest are kosher.
> 
> Am I close?

Most of that documentation is written in the context of the
MSG_CPUID_REQ guest message interface. In that case, a guest has been
provided a certain sets of CPUID values by KVM CPUID instruction
emulation (or potentially the CPUID table), and wants firmware to
validate whether all/some of them are valid for that particular
system/configuration.

In the case of 0xD subfunction 0x0 and 0x1, the CPUID leaf on real
hardware will change depending on what the guest sets in its actual
XCR0 control register (APM Volume 2 11.5.2) or IA32_XSS_MSR register,
whose combined bits are OR'd to form the XFEATURE_ENABLED_MASK.

CPUID leaf 0xD subfunctions 0x0 and 0x1 advertise the size of the
XSAVE area required for the features currently enabled in
XFEATURE_ENABLED_MASK via the EBX return value from the corresponding
CPUID instruction, so for firmware to validate whether that EBX value
is valid for a particular system/configuration, it needs to know
what the guest's currently XFEATURE_ENABLED_MASK is, so it needs to
know what the guest has set in its XCR0 and IA32_XSS_MSR registers.

That's where the XCR0_IN and XSS_IN inputs come into play: they are
inputs to the firmware so it can do the proper validation based on
the guest's current state / XFEATURE_ENABLED_MASK.

But that guest message interface will likely be deprecated at some
point in favor of the static CPUID table (SNP FW ABI 8.14.2.6),
since all the firmware validation has already been done in advance,
and any additional validation is redundant, so this series relies
purely on the CPUID table.

In the case of the CPUID table, we don't know what the guest's
XFEATURE_ENABLED_MASK is going to be at the time a CPUID instruction
is issued for 0xD:0x0,0xD,0x1, and those values will change as the
guest boots and begins enabling additional XSAVE features. The only
thing that's certain is that there needs to be at least one CPUID
table entry for 0xD:0x0 and 0xD:0x1 corresponding to the initial
XFEATURE_ENABLED_MASK, which will correspond to XCR0_IN=1/XSS_IN=0,
or XCR0_IN=3/XSS_IN=0 depending on the hypervisor/guest
implementation.

What to do about other combinations of XCR0_IN/XSS_IN gets a bit
weird, since encoding all those permutations would not scale well,
and alternative methods of trying to encode them in the table would
almost certainly result in lots of different incompatibile guest
implementations floating around.

So our recommendation has been for hypervisors to encode only the
0xD:0x0/0xD:0x1 entries corresponding to these initial guest states
in the CPUID table, and for guests to ignore the XCR0_IN/XSS_IN
values completely, and instead have the guest calculate the XSAVE
save area size dynamically (EBX). This has multiple benefits:

1) Guests that implement this approach would be compatible even
   with hypervisors that try to encode additional permutations of
   XCR0_IN/XSS_IN in the table based on their read of the spec
2) It is just as secure, since it requires only that the guest
   trust itself and the APM specification, which is the ultimate
   authority on what firmware would be validationa against
3) The other leaf registers, EAX/ECX/EDX are not dependent on
   XCR0_IN/XSS_IN/XFEATURE_ENABLED_MASK, and so any
   0xD:0x0/0xD:0x1 entry will do as the starting point for the
   calculation.

That's why in v8 these fields were documented as unused1/unused2.
But when you suggested that we should enforce 0 in that case, it
became clear we should adjust our recommendation to go ahead and
check for the specific XCR0_IN={1,3}/XSS_IN=0 values when
searching for the base 0xD:0x0/0xD:0x1 entry to use, because a
hypervisor that chooses to use XCR0_IN/XSS_IN is not out-of-spec,
and should be supported, and there isn't really any read of the
spec from the hypervisor perspective where XCR0_IN=0/XSS_IN=0
would be valid.

So for v9 I went ahead and added the XCR0_IN/XSS_IN checks, and
updated our recommendations (will send an updated version to
SNP mailing list by Monday) to not ignore them in the guest.

I added some comments in snp_cpuid_get_validated_func() and
snp_cpuid_calc_xsave_size() to clarify how XCR0_IN/XSS_IN are
being used there, but let me know if those need to be beefed up
a bit.

-Mike

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 31/43] x86/compressed/64: Add support for SEV-SNP CPUID table in #VC handlers
  2022-02-05 10:54   ` Borislav Petkov
  2022-02-05 15:42     ` Michael Roth
@ 2022-02-05 16:22     ` Michael Roth
  2022-02-06 13:37       ` Borislav Petkov
  1 sibling, 1 reply; 115+ messages in thread
From: Michael Roth @ 2022-02-05 16:22 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Brijesh Singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr

On Sat, Feb 05, 2022 at 11:54:01AM +0100, Borislav Petkov wrote:
> On Fri, Jan 28, 2022 at 11:17:52AM -0600, Brijesh Singh wrote:
> 
> > +static const struct snp_cpuid_info *snp_cpuid_info_get_ptr(void)
> > +{
> > +	void *ptr;
> > +
> > +	asm ("lea cpuid_info_copy(%%rip), %0"
> > +	     : "=r" (ptr)
> 
> Same question as the last time:
> 
> Why not "=g" and let the compiler decide?

The documentation for lea (APM Volume 3 Chapter 3) seemed to require
that the destination register be a general purpose register, so it
seemed like there was potential for breakage in allowing GCC to use
anything otherwise. Maybe GCC is smart enough to figure that out, but
since we know the constraint in advance it seemed safer to stick
with the current approach of enforcing that constraint.

> 
> > +	     : "p" (&cpuid_info_copy));
> > +
> > +	return ptr;
> > +}
> 
> ...
> 
> > +static bool snp_cpuid_check_range(u32 func)
> > +{
> > +	if (func <= cpuid_std_range_max ||
> > +	    (func >= 0x40000000 && func <= cpuid_hyp_range_max) ||
> > +	    (func >= 0x80000000 && func <= cpuid_ext_range_max))
> > +		return true;
> > +
> > +	return false;
> > +}
> > +
> > +static int snp_cpuid_postprocess(u32 func, u32 subfunc, u32 *eax, u32 *ebx,
> > +				 u32 *ecx, u32 *edx)
> 
> And again, same question as the last time:
> 
> I'm wondering if you could make everything a lot easier by doing
> 
> static int snp_cpuid_postprocess(struct cpuid_leaf *leaf)
> 
> and marshall around that struct cpuid_leaf which contains func, subfunc,
> e[abcd]x instead of dealing with 6 parameters.
> 
> Callers of snp_cpuid() can simply allocate it on their stack and hand it
> in and it is all in sev-shared.c so nicely self-contained...
> 
> Ok I'm ignoring this patch for now and I'll review it only after you've
> worked in all comments from the previous review.

I did look into it and honestly it just seemed to add more abstractions that
made it harder to parse the specific operations taken place here. For
instance, post-processing of 0x8000001E entry, we have e{a,b,c,d}x from
the CPUID table, then to post process:

  switch (func):
  case 0x8000001E:
    /* extended APIC ID */
    snp_cpuid_hv(func, subfunc, eax, &ebx2, &ecx2, NULL);
                                |    |      |      |
                                |    |      |      edx from cpuid table is used as-is
                                |    |      |  
                                |    |      |
                                |    |      load HV value into tmp ecx2
                                |    |
                                |    load HV value into tmp ebx2
                                |
                                |
                                replace eax completely with the HV value

    # then do the remaining fixups for final ebx/ecx

    /* compute ID */
    *ebx = (*ebx & GENMASK(31, 8)) | (ebx2 & GENMASK(7, 0));
    /* node ID */
    *ecx = (*ecx & GENMASK(31, 8)) | (ecx2 & GENMASK(7, 0));

and it all reads in a clear/familiar way to all the other
cpuid()/native_cpuid() users throughout the kernel, and from the
persective of someone auditing this from a security perspective that
needs to quickly check what registers come from the CPUID table, what
registers come from HV, what the final result is, it all just seems very
clear and familiar to me.

But if we start passing around this higher-level structure that does
not do anything other than abstract away e{a,b,c,x} to save on function
arguments, things become muddier, and there's more pointer dereference
operations and abstractions to sift through. I saved the diff from when
I looked into it previously (was just a rough-sketch, not build-tested),
and included it below for reference, but it just didn't seem to help with
readability to me, which I think is important here since this is probably
one of the most security-sensitive piece of the CPUID table handling,
since we're dealing with untrusted CPUID sources here and it needs to be
clear what exactly is ending up in the E{A,B,C,D} registers we're
returning for a particular CPUID instruction:

(There are some possible optimizations below, like added a mask parameter
so control specifically what EAX/EBX/ECX/EDX field should be modified,
possibly reworking snp_cpuid_info structure definitions to re-use the
cpuid_leaf struct internally, also modifying __sev_cpuid_hv() to take
the cpuid_leaf struct, etc., but none of that really seemed like
it would help much with the key issue of readability, so I ended up
setting it aside for v9)

diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c
index b2defbf7e66b..53534a6b1dcc 100644
--- a/arch/x86/kernel/sev-shared.c
+++ b/arch/x86/kernel/sev-shared.c
@@ -49,6 +49,13 @@ struct snp_cpuid_info {
 	struct snp_cpuid_fn fn[SNP_CPUID_COUNT_MAX];
 } __packed;
 
+struct cpuid_leaf {
+	u32 eax;
+	u32 ebx;
+	u32 ecx;
+	u32 edx;
+};
+
 /*
  * Since feature negotiation related variables are set early in the boot
  * process they must reside in the .data section so as not to be zeroed
@@ -260,14 +267,14 @@ static int __sev_cpuid_hv(u32 func, int reg_idx, u32 *reg)
 	return 0;
 }
 
-static int sev_cpuid_hv(u32 func, u32 *eax, u32 *ebx, u32 *ecx, u32 *edx)
+static int sev_cpuid_hv(u32 func, struct cpuid_leaf *leaf)
 {
 	int ret;
 
-	ret = __sev_cpuid_hv(func, GHCB_CPUID_REQ_EAX, eax);
-	ret = ret ? : __sev_cpuid_hv(func, GHCB_CPUID_REQ_EBX, ebx);
-	ret = ret ? : __sev_cpuid_hv(func, GHCB_CPUID_REQ_ECX, ecx);
-	ret = ret ? : __sev_cpuid_hv(func, GHCB_CPUID_REQ_EDX, edx);
+	ret = __sev_cpuid_hv(func, GHCB_CPUID_REQ_EAX, &leaf->eax);
+	ret = ret ? : __sev_cpuid_hv(func, GHCB_CPUID_REQ_EBX, &leaf->ebx);
+	ret = ret ? : __sev_cpuid_hv(func, GHCB_CPUID_REQ_ECX, &leaf->ecx);
+	ret = ret ? : __sev_cpuid_hv(func, GHCB_CPUID_REQ_EDX, &leaf->edx);
 
 	return ret;
 }
@@ -328,8 +335,7 @@ static int snp_cpuid_calc_xsave_size(u64 xfeatures_en, bool compacted)
 	return xsave_size;
 }
 
-static void snp_cpuid_hv(u32 func, u32 subfunc, u32 *eax, u32 *ebx, u32 *ecx,
-			 u32 *edx)
+static void snp_cpuid_hv(u32 func, u32 subfunc, struct cpuid_leaf *leaf)
 {
 	/*
 	 * MSR protocol does not support fetching indexed subfunction, but is
@@ -342,13 +348,12 @@ static void snp_cpuid_hv(u32 func, u32 subfunc, u32 *eax, u32 *ebx, u32 *ecx,
 	if (cpuid_function_is_indexed(func) && subfunc)
 		sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_CPUID_HV);
 
-	if (sev_cpuid_hv(func, eax, ebx, ecx, edx))
+	if (sev_cpuid_hv(func, leaf))
 		sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_CPUID_HV);
 }
 
 static bool
-snp_cpuid_get_validated_func(u32 func, u32 subfunc, u32 *eax, u32 *ebx,
-			     u32 *ecx, u32 *edx)
+snp_cpuid_get_validated_func(u32 func, u32 subfunc, struct cpuid_leaf *leaf)
 {
 	const struct snp_cpuid_info *cpuid_info = snp_cpuid_info_get_ptr();
 	int i;
@@ -362,10 +367,10 @@ snp_cpuid_get_validated_func(u32 func, u32 subfunc, u32 *eax, u32 *ebx,
 		if (cpuid_function_is_indexed(func) && fn->ecx_in != subfunc)
 			continue;
 
-		*eax = fn->eax;
-		*ebx = fn->ebx;
-		*ecx = fn->ecx;
-		*edx = fn->edx;
+		leaf->eax = fn->eax;
+		leaf->ebx = fn->ebx;
+		leaf->ecx = fn->ecx;
+		leaf->edx = fn->edx;
 
 		return true;
 	}
@@ -383,33 +388,34 @@ static bool snp_cpuid_check_range(u32 func)
 	return false;
 }
 
-static int snp_cpuid_postprocess(u32 func, u32 subfunc, u32 *eax, u32 *ebx,
-				 u32 *ecx, u32 *edx)
+static int snp_cpuid_postprocess(u32 func, u32 subfunc, struct cpuid_leaf *leaf)
 {
-	u32 ebx2, ecx2, edx2;
+	struct cpuid_leaf leaf_tmp;
 
 	switch (func) {
 	case 0x1:
-		snp_cpuid_hv(func, subfunc, NULL, &ebx2, NULL, &edx2);
+		snp_cpuid_hv(func, subfunc, &leaf_tmp);
 
 		/* initial APIC ID */
-		*ebx = (ebx2 & GENMASK(31, 24)) | (*ebx & GENMASK(23, 0));
+		leaf->ebx = (leaf_tmp.ebx & GENMASK(31, 24)) | (leaf->ebx & GENMASK(23, 0));
 		/* APIC enabled bit */
-		*edx = (edx2 & BIT(9)) | (*edx & ~BIT(9));
+		leaf->edx = (leaf_tmp.edx & BIT(9)) | (leaf->edx & ~BIT(9));
 
 		/* OSXSAVE enabled bit */
 		if (native_read_cr4() & X86_CR4_OSXSAVE)
-			*ecx |= BIT(27);
+			leaf->ecx |= BIT(27);
 		break;
 	case 0x7:
 		/* OSPKE enabled bit */
-		*ecx &= ~BIT(4);
+		leaf->ecx &= ~BIT(4);
 		if (native_read_cr4() & X86_CR4_PKE)
-			*ecx |= BIT(4);
+			leaf->ecx |= BIT(4);
 		break;
 	case 0xB:
 		/* extended APIC ID */
-		snp_cpuid_hv(func, 0, NULL, NULL, NULL, edx);
+		snp_cpuid_hv(func, 0, &leaf_tmp);
+		leaf->edx = leaf_tmp.edx;
+
 		break;
 	case 0xD: {
 		bool compacted = false;
@@ -440,7 +446,7 @@ static int snp_cpuid_postprocess(u32 func, u32 subfunc, u32 *eax, u32 *ebx,
 			 * to avoid this becoming an issue it's safer to simply
 			 * treat this as unsupported for SEV-SNP guests.
 			 */
-			if (!(*eax & (BIT(1) | BIT(3))))
+			if (!(leaf->eax & (BIT(1) | BIT(3))))
 				return -EINVAL;
 
 			compacted = true;
@@ -450,16 +456,17 @@ static int snp_cpuid_postprocess(u32 func, u32 subfunc, u32 *eax, u32 *ebx,
 		if (xsave_size < 0)
 			return -EINVAL;
 
-		*ebx = xsave_size;
+		leaf->ebx = xsave_size;
 		}
 		break;
 	case 0x8000001E:
 		/* extended APIC ID */
-		snp_cpuid_hv(func, subfunc, eax, &ebx2, &ecx2, NULL);
+		snp_cpuid_hv(func, subfunc, &leaf_tmp);
+		leaf->eax = leaf_tmp.eax;
 		/* compute ID */
-		*ebx = (*ebx & GENMASK(31, 8)) | (ebx2 & GENMASK(7, 0));
+		leaf->ebx = (leaf->ebx & GENMASK(31, 8)) | (leaf_tmp.ebx & GENMASK(7, 0));
 		/* node ID */
-		*ecx = (*ecx & GENMASK(31, 8)) | (ecx2 & GENMASK(7, 0));
+		leaf->ecx = (leaf->ecx & GENMASK(31, 8)) | (leaf_tmp.ecx & GENMASK(7, 0));
 		break;
 	default:
 		/* No fix-ups needed, use values as-is. */
@@ -473,15 +480,14 @@ static int snp_cpuid_postprocess(u32 func, u32 subfunc, u32 *eax, u32 *ebx,
  * Returns -EOPNOTSUPP if feature not enabled. Any other return value should be
  * treated as fatal by caller.
  */
-static int snp_cpuid(u32 func, u32 subfunc, u32 *eax, u32 *ebx, u32 *ecx,
-		     u32 *edx)
+static int snp_cpuid(u32 func, u32 subfunc, struct cpuid_leaf *leaf)
 {
 	const struct snp_cpuid_info *cpuid_info = snp_cpuid_info_get_ptr();
 
 	if (!cpuid_info->count)
 		return -EOPNOTSUPP;
 
-	if (!snp_cpuid_get_validated_func(func, subfunc, eax, ebx, ecx, edx)) {
+	if (!snp_cpuid_get_validated_func(func, subfunc, leaf)) {
 		/*
 		 * Some hypervisors will avoid keeping track of CPUID entries
 		 * where all values are zero, since they can be handled the
@@ -497,12 +503,12 @@ static int snp_cpuid(u32 func, u32 subfunc, u32 *eax, u32 *ebx, u32 *ecx,
 		 * not in the table, but is still in the valid range, proceed
 		 * with the post-processing. Otherwise, just return zeros.
 		 */
-		*eax = *ebx = *ecx = *edx = 0;
+		leaf->eax = leaf->ebx = leaf->ecx = leaf->edx = 0;
 		if (!snp_cpuid_check_range(func))
 			return 0;
 	}
 
-	return snp_cpuid_postprocess(func, subfunc, eax, ebx, ecx, edx);
+	return snp_cpuid_postprocess(func, subfunc, leaf);
 }
 
 /*
@@ -514,28 +520,28 @@ void __init do_vc_no_ghcb(struct pt_regs *regs, unsigned long exit_code)
 {
 	unsigned int subfn = lower_bits(regs->cx, 32);
 	unsigned int fn = lower_bits(regs->ax, 32);
-	u32 eax, ebx, ecx, edx;
+	struct cpuid_leaf *leaf;
 	int ret;
 
 	/* Only CPUID is supported via MSR protocol */
 	if (exit_code != SVM_EXIT_CPUID)
 		goto fail;
 
-	ret = snp_cpuid(fn, subfn, &eax, &ebx, &ecx, &edx);
+	ret = snp_cpuid(fn, subfn, leaf);
 	if (!ret)
 		goto cpuid_done;
 
 	if (ret != -EOPNOTSUPP)
 		goto fail;
 
-	if (sev_cpuid_hv(fn, &eax, &ebx, &ecx, &edx))
+	if (sev_cpuid_hv(fn, leaf))
 		goto fail;
 
 cpuid_done:
-	regs->ax = eax;
-	regs->bx = ebx;
-	regs->cx = ecx;
-	regs->dx = edx;
+	regs->ax = leaf->eax;
+	regs->bx = leaf->ebx;
+	regs->cx = leaf->ecx;
+	regs->dx = leaf->edx;
 
 	/*
 	 * This is a VC handler and the #VC is only raised when SEV-ES is


> 
> -- 
> Regards/Gruss,
>     Boris.
> 
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpeople.kernel.org%2Ftglx%2Fnotes-about-netiquette&amp;data=04%7C01%7CMichael.Roth%40amd.com%7C6bc14b8b5b854a38d7c008d9e895da5b%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637796552649205409%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata=Lc5o1tYKrtqUy2h%2B8onmgdaydqUWTlnj7V9rfuBEU0s%3D&amp;reserved=0

^ permalink raw reply related	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 38/43] x86/sev: Use firmware-validated CPUID for SEV-SNP guests
  2022-01-28 17:17 ` [PATCH v9 38/43] x86/sev: Use firmware-validated CPUID for SEV-SNP guests Brijesh Singh
@ 2022-02-05 17:19   ` Michael Roth
  2022-02-06 15:46     ` Borislav Petkov
  2022-02-06 19:50   ` Borislav Petkov
  1 sibling, 1 reply; 115+ messages in thread
From: Michael Roth @ 2022-02-05 17:19 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, brijesh.singh, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	brijesh.ksingh, tony.luck, marcorr, sathyanarayanan.kuppuswamy

On Fri, Jan 28, 2022 at 11:17:59AM -0600, Brijesh Singh wrote:
> From: Michael Roth <michael.roth@amd.com>
> 
> SEV-SNP guests will be provided the location of special 'secrets' and
> 'CPUID' pages via the Confidential Computing blob. This blob is
> provided to the run-time kernel either through a bootparams field that
> was initialized by the boot/compressed kernel, or via a setup_data
> structure as defined by the Linux Boot Protocol.
> 
> Locate the Confidential Computing blob from these sources and, if found,
> use the provided CPUID page/table address to create a copy that the
> run-time kernel will use when servicing CPUID instructions via a #VC
> handler.
> 
> Signed-off-by: Michael Roth <michael.roth@amd.com>
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> ---
>  arch/x86/boot/compressed/sev.c | 37 -----------------
>  arch/x86/kernel/sev-shared.c   | 37 +++++++++++++++++
>  arch/x86/kernel/sev.c          | 75 ++++++++++++++++++++++++++++++++++
>  3 files changed, 112 insertions(+), 37 deletions(-)
> 
<snip>
> diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
> index 5c72b21c7e7b..cb97200bfda7 100644
> --- a/arch/x86/kernel/sev.c
> +++ b/arch/x86/kernel/sev.c
> @@ -2034,6 +2034,8 @@ bool __init snp_init(struct boot_params *bp)
>  	if (!cc_info)
>  		return false;
>  
> +	snp_setup_cpuid_table(cc_info);
> +
>  	/*
>  	 * The CC blob will be used later to access the secrets page. Cache
>  	 * it here like the boot kernel does.
> @@ -2047,3 +2049,76 @@ void __init snp_abort(void)
>  {
>  	sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SNP_UNSUPPORTED);
>  }
> +
> +static void snp_dump_cpuid_table(void)
> +{
> +	const struct snp_cpuid_info *cpuid_info = snp_cpuid_info_get_ptr();
> +	int i = 0;
> +
> +	pr_info("count=%d reserved=0x%x reserved2=0x%llx\n",
> +		cpuid_info->count, cpuid_info->__reserved1, cpuid_info->__reserved2);
> +
> +	for (i = 0; i < SNP_CPUID_COUNT_MAX; i++) {
> +		const struct snp_cpuid_fn *fn = &cpuid_info->fn[i];
> +
> +		pr_info("index=%03d fn=0x%08x subfn=0x%08x: eax=0x%08x ebx=0x%08x ecx=0x%08x edx=0x%08x xcr0_in=0x%016llx xss_in=0x%016llx reserved=0x%016llx\n",
> +			i, fn->eax_in, fn->ecx_in, fn->eax, fn->ebx, fn->ecx,
> +			fn->edx, fn->xcr0_in, fn->xss_in, fn->__reserved);
> +	}
> +}
> +
> +/*
> + * It is useful from an auditing/testing perspective to provide an easy way
> + * for the guest owner to know that the CPUID table has been initialized as
> + * expected, but that initialization happens too early in boot to print any
> + * sort of indicator, and there's not really any other good place to do it.
> + * So do it here. This is also a good place to flag unexpected usage of the
> + * CPUID table that does not violate the SNP specification, but may still
> + * be worth noting to the user.
> + */
> +static int __init snp_check_cpuid_table(void)
> +{
> +	const struct snp_cpuid_info *cpuid_info = snp_cpuid_info_get_ptr();
> +	bool dump_table = false;
> +	int i;
> +
> +	if (!cpuid_info->count)
> +		return 0;
> +
> +	pr_info("Using SEV-SNP CPUID table, %d entries present.\n",
> +		cpuid_info->count);
> +
> +	if (cpuid_info->__reserved1 || cpuid_info->__reserved2) {
> +		pr_warn("Unexpected use of reserved fields in CPUID table header.\n");
> +		dump_table = true;
> +	}
> +
> +	for (i = 0; i < cpuid_info->count; i++) {
> +		const struct snp_cpuid_fn *fn = &cpuid_info->fn[i];
> +		struct snp_cpuid_fn zero_fn = {0};
> +
> +		/*
> +		 * Though allowed, an all-zero entry is not a valid leaf for linux
> +		 * guests, and likely the result of an incorrect entry count.
> +		 */
> +		if (!memcmp(fn, &zero_fn, sizeof(struct snp_cpuid_fn))) {
> +			pr_warn("CPUID table contains all-zero entry at index=%d\n", i);
> +			dump_table = true;
> +		}
> +
> +		if (fn->__reserved) {
> +			pr_warn("Unexpected use of reserved fields in CPUID table entry at index=%d\n",
> +				i);
> +			dump_table = true;
> +		}
> +	}
> +
> +	if (dump_table) {
> +		pr_warn("Dumping CPUID table:\n");
> +		snp_dump_cpuid_table();
> +	}
> +
> +	return 0;
> +}
> +
> +arch_initcall(snp_check_cpuid_table);

Just a quick note on this one. I know you'd recently suggested dropping
this check since it was redundant:

  https://lore.kernel.org/all/YfGUWLmg82G+l4jU@zn.tnic/

I've dropped the redundant checks from in this version, but with the
additional sanity checks you suggested adding, this function remained
relevant for v9 so we can provide user-readable warnings for "weird"
stuff in the table that doesn't necessarily violate the spec, but also
doesn't seem to make much sense:

  https://lore.kernel.org/all/YefzQuqrV8kdLr9z@zn.tnic/

But I also agree with your later response here, where sanity-checking
doesn't gain us much since ultimately what really matter is the
resulting values we provide in response to CPUID instructions, which
would be ultimately what gets used for guest owner's attestation flow:

  https://lore.kernel.org/all/YfR1GNb%2FyzKu4n5+@zn.tnic/

So I agree we should drop the sanity-check in this function since
there's no cases covered here that would actually effect the end
result of those CPUID instructions, nor do any of the checks here
violate the SNP spec...

...except possibly the checks that enforce that the Reserved fields
here are all 0, which we discussed here:

  https://lore.kernel.org/all/YeGhKll2fTcTr2wS@zn.tnic/

I followed up with our firmware team on this to confirm whether our
intended handling and the potential backward-compatibility issues
aligned with the intent of the SNP FW spec, and the guidance I got
on that is that those Reserved/MBZ constraints are meant to apply
to what a hypervisor places in the initial CPUID table input to
SNP_LAUNCH_UPDATE (or what the guest places in the MSG_CPUID_REQ
structure in the guest message version). They are not meant to
cover what the guest should expect to read for these Reserved fields
in the resulting CPUID table: the guest is supposed to ignore them
completely and not enforce any value.

I mentioned the concern you raised about out-of-spec hypervisors
using non-zero for Reserved fields, and potentially breaking future
guests that attempt to make use of them if they ever get re-purposed
for some other functionality, but their take on that is that such a
hypervisor is clearly out-of-spec, and would not be supported. Their
preference would be to update the spec for clarity on this if we
have any better suggested wording, rather than implementing anything
on the guest side that might effect backward-compatibility in the
future. Another possibility is enforcing 0 in the firmware itself.

So given their guidance on the Reserved fields, and your guidance
on not doing the other sanity-checks, my current plan to to drop
snp_check_cpuid_table() completely for v10, but if you have other
thoughts on that just let me know.

Thanks!

-Mike


> -- 
> 2.25.1
> 

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 31/43] x86/compressed/64: Add support for SEV-SNP CPUID table in #VC handlers
  2022-02-05 16:22     ` Michael Roth
@ 2022-02-06 13:37       ` Borislav Petkov
  2022-02-07 15:37         ` Michael Roth
  0 siblings, 1 reply; 115+ messages in thread
From: Borislav Petkov @ 2022-02-06 13:37 UTC (permalink / raw)
  To: Michael Roth
  Cc: Brijesh Singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr

First of all,

let me give you a very important piece of advice for the future:
ignoring review feedback is a very unfriendly thing to do:

- if you agree with the feedback, you work it in in the next revision

- if you don't agree, you *say* *why* you don't

What you should avoid is what you've done - ignore it silently. Because
reviewing submitters code is the most ungrateful work around the kernel,
and, at the same time, it is the most important one.

So please make sure you don't do that in the future when submitting
patches for upstream inclusion. I'm sure you can imagine, the ignoring
can go both ways.

On Sat, Feb 05, 2022 at 10:22:49AM -0600, Michael Roth wrote:
> The documentation for lea (APM Volume 3 Chapter 3) seemed to require
> that the destination register be a general purpose register, so it
> seemed like there was potential for breakage in allowing GCC to use
> anything otherwise.

There's no such potential: binutils encodes the unstruction operands
and what types are allowed. IOW, the assembler knows that there goes a
register.

> Maybe GCC is smart enough to figure that out, but since we know the
> constraint in advance it seemed safer to stick with the current
> approach of enforcing that constraint.

I guess in this particular case it won't matter whether it is "=r" or
"=g" but still...

> I did look into it and honestly it just seemed to add more abstractions that
> made it harder to parse the specific operations taken place here. For
> instance, post-processing of 0x8000001E entry, we have e{a,b,c,d}x from
> the CPUID table, then to post process:
> 
>   switch (func):
>   case 0x8000001E:
>     /* extended APIC ID */
>     snp_cpuid_hv(func, subfunc, eax, &ebx2, &ecx2, NULL);

Well, my suggestion was to put it *all* in the subleaf struct - not just
the regs:

struct cpuid_leaf {
	u32 func;
	u32 subfunc;
	u32 eax;
	u32 ebx;
	u32 ecx;
	u32 edx;
};

so that the call signature is:

	snp_cpuid_postprocess(struct cpuid_leaf *leaf)


> and it all reads in a clear/familiar way to all the other
> cpuid()/native_cpuid() users throughout the kernel,

maybe it should read differently *on* *purpose*. Exactly because this is
*not* the usual CPUID handling code but CPUID verification code for SNP
guests.

And please explain to me what's so unclear about leaf->eax vs *eax?!

> and from the persective of someone auditing this from a security
> perspective that needs to quickly check what registers come from the
> CPUID table, what registers come from HV

Having a struct which is properly named will actually be beneficial in
this case:

	hv_leaf->eax

or even

	hv_reported_leaf->eax

vs

	*eax

Now guess which is which.

> But if we start passing around this higher-level structure that does
> not do anything other than abstract away e{a,b,c,x} to save on function
> arguments, things become muddier, and there's more pointer dereference
> operations and abstractions to sift through.

Huh?! I'm sorry but I cannot follow that logic here.

> I saved the diff from when I looked into it previously (was just a
> rough-sketch, not build-tested), and included it below for reference,
> but it just didn't seem to help with readability to me,

Well, looking at it, the only difference is that the IO is done
over a struct instead of separate pointers. And the diff is pretty
straight-forward.

So no, having a struct cpuid_leaf containing it all is actually better
in this case because you know which is which if you name it properly and
you have a single pointer argument which you pass around - something
which is done all around the kernel.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 38/43] x86/sev: Use firmware-validated CPUID for SEV-SNP guests
  2022-02-05 17:19   ` Michael Roth
@ 2022-02-06 15:46     ` Borislav Petkov
  2022-02-07 17:00       ` Michael Roth
  0 siblings, 1 reply; 115+ messages in thread
From: Borislav Petkov @ 2022-02-06 15:46 UTC (permalink / raw)
  To: Michael Roth
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, brijesh.singh, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	brijesh.ksingh, tony.luck, marcorr, sathyanarayanan.kuppuswamy

On Sat, Feb 05, 2022 at 11:19:01AM -0600, Michael Roth wrote:
> I mentioned the concern you raised about out-of-spec hypervisors
> using non-zero for Reserved fields, and potentially breaking future
> guests that attempt to make use of them if they ever get re-purposed
> for some other functionality, but their take on that is that such a
> hypervisor is clearly out-of-spec, and would not be supported.

Yah, like stating that something should not be done in the spec is
going to stop people from doing it anyway. You folks need to understand
one thing: the smaller the attack surface, the better. And you need to
*enforce* that - not state it in a spec. No one cares about the spec
when it comes to poking holes in the architecture. And people do poke
and will poke holes *especially* in this architecture, as its main goal
is security.

> Another possibility is enforcing 0 in the firmware itself.

Yes, this is the thing to do. If they're going to be reused in the
future, then guests can be changed to handle that. Like we do all the
time anyway.

> So given their guidance on the Reserved fields, and your guidance
> on not doing the other sanity-checks, my current plan to to drop
> snp_check_cpuid_table() completely for v10, but if you have other
> thoughts on that just let me know.

Yes, and pls fix the firmware to zero them out, because from reading
your very detailed explanation - btw, thanks for taking the time to
explain properly what's not in the ABI doc:

https://lore.kernel.org/r/20220205154243.s33gwghqwbb4efyl@amd.com

it sounds like those two input fields are not really needed. So the
earlier you fix them in the firmware and deprecate them, the better.

Thx!

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 33/43] x86/compressed: Add SEV-SNP feature detection/setup
  2022-01-28 17:17 ` [PATCH v9 33/43] x86/compressed: Add SEV-SNP feature detection/setup Brijesh Singh
@ 2022-02-06 16:41   ` Borislav Petkov
  2022-02-08 13:50     ` Michael Roth
  0 siblings, 1 reply; 115+ messages in thread
From: Borislav Petkov @ 2022-02-06 16:41 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	brijesh.ksingh, tony.luck, marcorr, sathyanarayanan.kuppuswamy

On Fri, Jan 28, 2022 at 11:17:54AM -0600, Brijesh Singh wrote:
> +static struct cc_setup_data *get_cc_setup_data(struct boot_params *bp)
> +{
> +	struct setup_data *hdr = (struct setup_data *)bp->hdr.setup_data;
> +
> +	while (hdr) {
> +		if (hdr->type == SETUP_CC_BLOB)
> +			return (struct cc_setup_data *)hdr;
> +		hdr = (struct setup_data *)hdr->next;
> +	}
> +
> +	return NULL;
> +}

Merge that function into its only caller.

...

> +static struct cc_blob_sev_info *snp_find_cc_blob(struct boot_params *bp)

static function, no need for the "snp_" prefix. Please audit all your
patches for that and remove that prefix from all static functions.

> +{
> +	struct cc_blob_sev_info *cc_info;
> +
> +	cc_info = snp_find_cc_blob_efi(bp);
> +	if (cc_info)
> +		goto found_cc_info;
> +
> +	cc_info = snp_find_cc_blob_setup_data(bp);
> +	if (!cc_info)
> +		return NULL;
> +
> +found_cc_info:
> +	if (cc_info->magic != CC_BLOB_SEV_HDR_MAGIC)
> +		sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SNP_UNSUPPORTED);
> +
> +	return cc_info;
> +}
> +
> +bool snp_init(struct boot_params *bp)
> +{
> +	struct cc_blob_sev_info *cc_info;
> +
> +	if (!bp)
> +		return false;
> +
> +	cc_info = snp_find_cc_blob(bp);
> +	if (!cc_info)
> +		return false;
> +
> +	/*
> +	 * Pass run-time kernel a pointer to CC info via boot_params so EFI
> +	 * config table doesn't need to be searched again during early startup
> +	 * phase.
> +	 */
> +	bp->cc_blob_address = (u32)(unsigned long)cc_info;
> +
> +	/*
> +	 * Indicate SEV-SNP based on presence of SEV-SNP-specific CC blob.
> +	 * Subsequent checks will verify SEV-SNP CPUID/MSR bits.
> +	 */

Put that comment over the function name.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 35/43] x86/compressed: Export and rename add_identity_map()
  2022-01-28 17:17 ` [PATCH v9 35/43] x86/compressed: Export and rename add_identity_map() Brijesh Singh
@ 2022-02-06 19:01   ` Borislav Petkov
  0 siblings, 0 replies; 115+ messages in thread
From: Borislav Petkov @ 2022-02-06 19:01 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	brijesh.ksingh, tony.luck, marcorr, sathyanarayanan.kuppuswamy,
	Borislav Petkov

On Fri, Jan 28, 2022 at 11:17:56AM -0600, Brijesh Singh wrote:
> From: Michael Roth <michael.roth@amd.com>
> 
> SEV-specific code will need to add some additional mappings, but doing
> this within ident_map_64.c requires some SEV-specific helpers to be
> exported and some SEV-specific struct definitions to be pulled into
> ident_map_64.c. Instead, export add_identity_map() so SEV-specific (and
> other subsystem-specific) code can be better contained outside of
> ident_map_64.c.
> 
> While at it, rename the function to kernel_add_identity_map(), similar
> to the kernel_ident_mapping_init() function it relies upon.

Add here:

"No functional changes."

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 36/43] x86/compressed/64: Add identity mapping for Confidential Computing blob
  2022-01-28 17:17 ` [PATCH v9 36/43] x86/compressed/64: Add identity mapping for Confidential Computing blob Brijesh Singh
@ 2022-02-06 19:21   ` Borislav Petkov
  0 siblings, 0 replies; 115+ messages in thread
From: Borislav Petkov @ 2022-02-06 19:21 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	brijesh.ksingh, tony.luck, marcorr, sathyanarayanan.kuppuswamy

On Fri, Jan 28, 2022 at 11:17:57AM -0600, Brijesh Singh wrote:
> From: Michael Roth <michael.roth@amd.com>
> 
> The run-time kernel will need to access the Confidential Computing
> blob very early in boot to access the CPUID table it points to. At
> that stage of boot it will be relying on the identity-mapped page table
> set up by boot/compressed kernel, so make sure the blob and the CPUID
> table it points to are mapped in advance.
> 
> Signed-off-by: Michael Roth <michael.roth@amd.com>
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> ---
>  arch/x86/boot/compressed/ident_map_64.c |  3 ++-
>  arch/x86/boot/compressed/misc.h         |  2 ++
>  arch/x86/boot/compressed/sev.c          | 22 ++++++++++++++++++++++
>  3 files changed, 26 insertions(+), 1 deletion(-)

Do this ontop:

---
diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index faf432684870..a5a9210d73b6 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -500,7 +500,7 @@ bool snp_init(struct boot_params *bp)
 void sev_prep_identity_maps(unsigned long top_level_pgt)
 {
 	/*
-	 * The ConfidentialComputing blob is used very early in uncompressed
+	 * The Confidential Computing blob is used very early in uncompressed
 	 * kernel to find the in-memory cpuid table to handle cpuid
 	 * instructions. Make sure an identity-mapping exists so it can be
 	 * accessed after switchover.
@@ -509,11 +509,10 @@ void sev_prep_identity_maps(unsigned long top_level_pgt)
 		unsigned long cc_info_pa = boot_params->cc_blob_address;
 		struct cc_blob_sev_info *cc_info;
 
-		kernel_add_identity_map(cc_info_pa,
-					cc_info_pa + sizeof(*cc_info));
+		kernel_add_identity_map(cc_info_pa, cc_info_pa + sizeof(*cc_info));
+
 		cc_info = (struct cc_blob_sev_info *)cc_info_pa;
-		kernel_add_identity_map((unsigned long)cc_info->cpuid_phys,
-					(unsigned long)cc_info->cpuid_phys + cc_info->cpuid_len);
+		kernel_add_identity_map(cc_info->cpuid_phys, cc_info->cpuid_phys + cc_info->cpuid_len);
 	}
 
 	sev_verify_cbit(top_level_pgt);

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply related	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 37/43] x86/sev: Add SEV-SNP feature detection/setup
  2022-01-28 17:17 ` [PATCH v9 37/43] x86/sev: Add SEV-SNP feature detection/setup Brijesh Singh
@ 2022-02-06 19:38   ` Borislav Petkov
  2022-02-08  5:25     ` Michael Roth
  0 siblings, 1 reply; 115+ messages in thread
From: Borislav Petkov @ 2022-02-06 19:38 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	brijesh.ksingh, tony.luck, marcorr, sathyanarayanan.kuppuswamy

On Fri, Jan 28, 2022 at 11:17:58AM -0600, Brijesh Singh wrote:
> +static __init struct cc_blob_sev_info *snp_find_cc_blob(struct boot_params *bp)
> +{
> +	struct cc_blob_sev_info *cc_info;
> +
> +	/* Boot kernel would have passed the CC blob via boot_params. */
> +	if (bp->cc_blob_address) {
> +		cc_info = (struct cc_blob_sev_info *)(unsigned long)bp->cc_blob_address;
> +		goto found_cc_info;
> +	}

What is the difference here, why aren't you looking for the blob in an
EFI table?

Even if you're booted directly by firmware, there should still be EFI
there or?

And if so, then I think you should share some of the code through
sev-shared.c so that there's not so much duplication...

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 38/43] x86/sev: Use firmware-validated CPUID for SEV-SNP guests
  2022-01-28 17:17 ` [PATCH v9 38/43] x86/sev: Use firmware-validated CPUID for SEV-SNP guests Brijesh Singh
  2022-02-05 17:19   ` Michael Roth
@ 2022-02-06 19:50   ` Borislav Petkov
  1 sibling, 0 replies; 115+ messages in thread
From: Borislav Petkov @ 2022-02-06 19:50 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	brijesh.ksingh, tony.luck, marcorr, sathyanarayanan.kuppuswamy

On Fri, Jan 28, 2022 at 11:17:59AM -0600, Brijesh Singh wrote:
> +/*
> + * It is useful from an auditing/testing perspective to provide an easy way
> + * for the guest owner to know that the CPUID table has been initialized as
> + * expected, but that initialization happens too early in boot to print any
> + * sort of indicator, and there's not really any other good place to do it.

If you really wanna do it - and I think it might make sense to be able
to dump that table for debugging or whatever purposes, you could define
a cmdline parameter like

	sev=debug

or so, which would enable checking and dumping of the cpuid table.

For example...

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 40/43] x86/sev: Register SEV-SNP guest request platform device
  2022-01-28 17:18 ` [PATCH v9 40/43] x86/sev: Register SEV-SNP guest request platform device Brijesh Singh
  2022-02-01 20:21   ` Peter Gonda
@ 2022-02-06 20:05   ` Borislav Petkov
  1 sibling, 0 replies; 115+ messages in thread
From: Borislav Petkov @ 2022-02-06 20:05 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	brijesh.ksingh, tony.luck, marcorr, sathyanarayanan.kuppuswamy

On Fri, Jan 28, 2022 at 11:18:01AM -0600, Brijesh Singh wrote:
> Version 2 of GHCB specification provides Non Automatic Exit (NAE) that can
	      ^				  ^			   ^
	      the			  a			   event type

> be used by the SEV-SNP guest to communicate with the PSP without risk from
> a malicious hypervisor who wishes to read, alter, drop or replay the
> messages sent.
> 
> SNP_LAUNCH_UPDATE can insert two special pages into the guest’s memory:
> the secrets page and the CPUID page. The PSP firmware populate the contents

"populates"

> of the secrets page. The secrets page contains encryption keys used by the
> guest to interact with the firmware. Because the secrets page is encrypted
> with the guest’s memory encryption key, the hypervisor cannot read the
> keys. See SEV-SNP firmware spec for further details on the secrets page
> format.
> 
> Create a platform device that the SEV-SNP guest driver can bind to get the
> platform resources such as encryption key and message id to use to
> communicate with the PSP. The SEV-SNP guest driver provides a userspace
> interface to get the attestation report, key derivation, extended
> attestation report etc.

...

> +static int __init init_snp_platform_device(void)

snp_init_platform_device()

> +{
> +	struct snp_guest_platform_data data;
> +	u64 gpa;
> +
> +	if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
> +		return -ENODEV;
> +
> +	gpa = get_secrets_page();
> +	if (!gpa)
> +		return -ENODEV;
> +
> +	data.secrets_gpa = gpa;
> +	if (platform_device_add_data(&guest_req_device, &data, sizeof(data)))
> +		goto e_fail;
> +
> +	if (platform_device_register(&guest_req_device))
> +		goto e_fail;
> +
> +	pr_info("SNP guest platform device initialized.\n");
> +	return 0;
> +
> +e_fail:
> +	pr_err("Failed to initialize SNP guest device\n");
> +	return -ENODEV;

So when someone tries to debug why the platform device doesn't register
properly, this error message is ambiguous and two of the error paths
don't even issue one.

Either issue a different error message before you return each time or
remove it completely and let someone who really needs it, add it.

I'd vote for former...

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 41/43] virt: Add SEV-SNP guest driver
  2022-01-28 17:18 ` [PATCH v9 41/43] virt: Add SEV-SNP guest driver Brijesh Singh
  2022-02-01 20:33   ` Peter Gonda
@ 2022-02-06 22:39   ` Borislav Petkov
  2022-02-07 14:41     ` Brijesh Singh
  1 sibling, 1 reply; 115+ messages in thread
From: Borislav Petkov @ 2022-02-06 22:39 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	brijesh.ksingh, tony.luck, marcorr, sathyanarayanan.kuppuswamy

On Fri, Jan 28, 2022 at 11:18:02AM -0600, Brijesh Singh wrote:
> SEV-SNP specification provides the guest a mechanism to communicate with
 ^
 The

> the PSP without risk from a malicious hypervisor who wishes to read, alter,
> drop or replay the messages sent. The driver uses snp_issue_guest_request()
> to issue GHCB SNP_GUEST_REQUEST or SNP_EXT_GUEST_REQUEST NAE events to
> submit the request to PSP.
> 
> The PSP requires that all communication should be encrypted using key
> specified through the platform_data.
> 
> The userspace can use SNP_GET_REPORT ioctl() to query the guest

s/The u/U/

> attestation report.
> 
> See SEV-SNP spec section Guest Messages for more details.
> 
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> ---
>  Documentation/virt/coco/sevguest.rst  |  81 ++++
>  drivers/virt/Kconfig                  |   3 +
>  drivers/virt/Makefile                 |   1 +
>  drivers/virt/coco/sevguest/Kconfig    |  12 +
>  drivers/virt/coco/sevguest/Makefile   |   2 +
>  drivers/virt/coco/sevguest/sevguest.c | 605 ++++++++++++++++++++++++++
>  drivers/virt/coco/sevguest/sevguest.h |  98 +++++
>  include/uapi/linux/sev-guest.h        |  50 +++
>  8 files changed, 852 insertions(+)
>  create mode 100644 Documentation/virt/coco/sevguest.rst
>  create mode 100644 drivers/virt/coco/sevguest/Kconfig
>  create mode 100644 drivers/virt/coco/sevguest/Makefile
>  create mode 100644 drivers/virt/coco/sevguest/sevguest.c
>  create mode 100644 drivers/virt/coco/sevguest/sevguest.h
>  create mode 100644 include/uapi/linux/sev-guest.h
> 
> diff --git a/Documentation/virt/coco/sevguest.rst b/Documentation/virt/coco/sevguest.rst
> new file mode 100644
> index 000000000000..47ef3b0821d5
> --- /dev/null
> +++ b/Documentation/virt/coco/sevguest.rst

Adding documentation is all fine and good but you need to link that file
into the TOC somewhere so that it is visible when the documentation gets
generated... I guess something like this:

diff --git a/Documentation/virt/index.rst b/Documentation/virt/index.rst
index edea7fea95a8..40ad0d20032e 100644
--- a/Documentation/virt/index.rst
+++ b/Documentation/virt/index.rst
@@ -13,6 +13,7 @@ Linux Virtualization Support
    guest-halt-polling
    ne_overview
    acrn/index
+   coco/sevguest
 
 .. only:: html and subproject
 


And once you do, do "make htmldocs" to see whether it complains about
some formatting issues or other errors etc.

/me goes and does it...

Ah, here we go:

Documentation/virt/coco/sevguest.rst:48: WARNING: Inline emphasis start-string without end-string.
Documentation/virt/coco/sevguest.rst:51: WARNING: Inline emphasis start-string without end-string.
Documentation/virt/coco/sevguest.rst:55: WARNING: Inline emphasis start-string without end-string.
Documentation/virt/coco/sevguest.rst:57: WARNING: Definition list ends without a blank line; unexpected unindent.

There's something it doesn't like about the struct. Yeah, when I look at
the html output, it is all weird and not monospaced...

> @@ -0,0 +1,81 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +===================================================================
> +The Definitive SEV Guest API Documentation
> +===================================================================
> +
> +1. General description
> +======================
> +
> +The SEV API is a set of ioctls that are used by the guest or hypervisor
> +to get or set certain aspect of the SEV virtual machine. The ioctls belong
		^
		a

> +to the following classes:
> +
> + - Hypervisor ioctls: These query and set global attributes which affect the
> +   whole SEV firmware.  These ioctl are used by platform provision tools.

"provisioning" maybe?

> +
> + - Guest ioctls: These query and set attributes of the SEV virtual machine.
> +
> +2. API description
> +==================
> +
> +This section describes ioctls that can be used to query or set SEV guests.
							      ^^^^^^^^^^^^^^^

That sounds weird.

> +For each ioctl, the following information is provided along with a
> +description:
> +
> +  Technology:
> +      which SEV technology provides this ioctl. sev, sev-es, sev-snp or all.
						   ^^^^^^^^^^^^^^^^^^^^

Marketing is going to scold you - those need to be in all caps. :-P

> +
> +  Type:
> +      hypervisor or guest. The ioctl can be used inside the guest or the
> +      hypervisor.
> +
> +  Parameters:
> +      what parameters are accepted by the ioctl.
> +
> +  Returns:
> +      the return value.  General error numbers (ENOMEM, EINVAL)

Those are negative: -ENOMEM, -EINVAL

> +      are not detailed, but errors with specific meanings are.
> +
> +The guest ioctl should be issued on a file descriptor of the /dev/sev-guest device.
> +The ioctl accepts struct snp_user_guest_request. The input and output structure is
> +specified through the req_data and resp_data field respectively. If the ioctl fails
> +to execute due to a firmware error, then fw_err code will be set otherwise the
> +fw_err will be set to 0xff.

fw_err is u64. What does 0xff mean? Everything above the least
significant byte is reserved 0?

> diff --git a/drivers/virt/coco/sevguest/Kconfig b/drivers/virt/coco/sevguest/Kconfig
> new file mode 100644
> index 000000000000..07ab9ec6471c
> --- /dev/null
> +++ b/drivers/virt/coco/sevguest/Kconfig
> @@ -0,0 +1,12 @@
> +config SEV_GUEST
> +	tristate "AMD SEV Guest driver"
> +	default y

Definitely not. We don't enable drivers by default unless they're
ubiquitous.

> +	depends on AMD_MEM_ENCRYPT && CRYPTO_AEAD2
> +	help
> +	  SEV-SNP firmware provides the guest a mechanism to communicate with
> +	  the PSP without risk from a malicious hypervisor who wishes to read,
> +	  alter, drop or replay the messages sent. The driver provides
> +	  userspace interface to communicate with the PSP to request the
> +	  attestation report and more.
> +
> +	  If you choose 'M' here, this module will be called sevguest.

...

> +static bool is_vmpck_empty(struct snp_guest_dev *snp_dev)
> +{
> +	char zero_key[VMPCK_KEY_LEN] = {0};
> +
> +	if (snp_dev->vmpck)
> +		return memcmp(snp_dev->vmpck, zero_key, VMPCK_KEY_LEN) == 0;

		return !memcmp(...);
> +
> +	return true;
> +}
> +
> +static void snp_disable_vmpck(struct snp_guest_dev *snp_dev)
> +{
> +	memzero_explicit(snp_dev->vmpck, VMPCK_KEY_LEN);
> +	snp_dev->vmpck = NULL;
> +}
> +
> +static inline u64 __snp_get_msg_seqno(struct snp_guest_dev *snp_dev)
> +{
> +	u64 count;
> +
> +	lockdep_assert_held(&snp_cmd_mutex);
> +
> +	/* Read the current message sequence counter from secrets pages */
> +	count = *snp_dev->os_area_msg_seqno;
> +
> +	return count + 1;
> +}
> +
> +/* Return a non-zero on success */
> +static u64 snp_get_msg_seqno(struct snp_guest_dev *snp_dev)
> +{
> +	u64 count = __snp_get_msg_seqno(snp_dev);
> +
> +	/*
> +	 * The message sequence counter for the SNP guest request is a  64-bit
> +	 * value but the version 2 of GHCB specification defines a 32-bit storage
> +	 * for it. If the counter exceeds the 32-bit value then return zero.
> +	 * The caller should check the return value, but if the caller happens to
> +	 * not check the value and use it, then the firmware treats zero as an
> +	 * invalid number and will fail the  message request.
> +	 */
> +	if (count >= UINT_MAX) {
> +		pr_err_ratelimited("SNP guest request message sequence counter overflow\n");

How does error message help the user? Is she supposed to reboot the
machine or so?

Because it sounds to me like if this goes over 32-bit, this driver stops
working. So what resets those sequence numbers?

> +		return 0;
> +	}
> +
> +	return count;
> +}
> +
> +static void snp_inc_msg_seqno(struct snp_guest_dev *snp_dev)
> +{
> +	/*
> +	 * The counter is also incremented by the PSP, so increment it by 2
> +	 * and save in secrets page.
> +	 */
> +	*snp_dev->os_area_msg_seqno += 2;
> +}
> +
> +static inline struct snp_guest_dev *to_snp_dev(struct file *file)
> +{
> +	struct miscdevice *dev = file->private_data;
> +
> +	return container_of(dev, struct snp_guest_dev, misc);
> +}
> +
> +static struct snp_guest_crypto *init_crypto(struct snp_guest_dev *snp_dev, u8 *key, size_t keylen)
> +{
> +	struct snp_guest_crypto *crypto;
> +
> +	crypto = kzalloc(sizeof(*crypto), GFP_KERNEL_ACCOUNT);
> +	if (!crypto)
> +		return NULL;
> +
> +	crypto->tfm = crypto_alloc_aead("gcm(aes)", 0, 0);
> +	if (IS_ERR(crypto->tfm))
> +		goto e_free;
> +
> +	if (crypto_aead_setkey(crypto->tfm, key, keylen))
> +		goto e_free_crypto;
> +
> +	crypto->iv_len = crypto_aead_ivsize(crypto->tfm);
> +	if (crypto->iv_len < 12) {
> +		dev_err(snp_dev->dev, "IV length is less than 12.\n");

And? < 12 is bad? Make that error message more user-friendly pls.

> +		goto e_free_crypto;
> +	}
> +
> +	crypto->iv = kmalloc(crypto->iv_len, GFP_KERNEL_ACCOUNT);
> +	if (!crypto->iv)
> +		goto e_free_crypto;
> +
> +	if (crypto_aead_authsize(crypto->tfm) > MAX_AUTHTAG_LEN) {
> +		if (crypto_aead_setauthsize(crypto->tfm, MAX_AUTHTAG_LEN)) {
> +			dev_err(snp_dev->dev, "failed to set authsize to %d\n", MAX_AUTHTAG_LEN);
> +			goto e_free_crypto;
> +		}
> +	}
> +
> +	crypto->a_len = crypto_aead_authsize(crypto->tfm);
> +	crypto->authtag = kmalloc(crypto->a_len, GFP_KERNEL_ACCOUNT);
> +	if (!crypto->authtag)
> +		goto e_free_crypto;
> +
> +	return crypto;
> +
> +e_free_crypto:
> +	crypto_free_aead(crypto->tfm);
> +e_free:
> +	kfree(crypto->iv);
> +	kfree(crypto->authtag);
> +	kfree(crypto);

The order of those free calls needs to be the opposite of the kmallocs
above.

> +
> +	return NULL;
> +}

...

> +static int handle_guest_request(struct snp_guest_dev *snp_dev, u64 exit_code, int msg_ver,
> +				u8 type, void *req_buf, size_t req_sz, void *resp_buf,
> +				u32 resp_sz, __u64 *fw_err)
> +{
> +	unsigned long err;
> +	u64 seqno;
> +	int rc;
> +
> +	/* Get message sequence and verify that its a non-zero */
> +	seqno = snp_get_msg_seqno(snp_dev);
> +	if (!seqno)
> +		return -EIO;
> +
> +	memset(snp_dev->response, 0, sizeof(*snp_dev->response));

				     sizeof(struct snp_guest_msg)

> +
> +	/* Encrypt the userspace provided payload */
> +	rc = enc_payload(snp_dev, seqno, msg_ver, type, req_buf, req_sz);
> +	if (rc)
> +		return rc;
> +
> +	/* Call firmware to process the request */
> +	rc = snp_issue_guest_request(exit_code, &snp_dev->input, &err);
> +	if (fw_err)
> +		*fw_err = err;
> +
> +	if (rc)
> +		return rc;
> +
> +	rc = verify_and_dec_payload(snp_dev, resp_buf, resp_sz);
> +	if (rc) {
> +		/*
> +		 * The verify_and_dec_payload() will fail only if the hypervisor is
> +		 * actively modifying the message header or corrupting the encrypted payload.
> +		 * This hints that hypervisor is acting in a bad faith. Disable the VMPCK so that
> +		 * the key cannot be used for any communication. The key is disabled to ensure
> +		 * that AES-GCM does not use the same IV while encrypting the request payload.
> +		 */

Put that comment over the function call.

> +		dev_alert(snp_dev->dev,
> +			  "Detected unexpected decode failure, disabling the vmpck_id %d\n",
> +			  vmpck_id);
> +		snp_disable_vmpck(snp_dev);
> +		return rc;
> +	}
> +
> +	/* Increment to new message sequence after payload decryption was successful. */
> +	snp_inc_msg_seqno(snp_dev);
> +
> +	return 0;
> +}
> +
> +static int get_report(struct snp_guest_dev *snp_dev, struct snp_guest_request_ioctl *arg)
> +{
> +	struct snp_guest_crypto *crypto = snp_dev->crypto;
> +	struct snp_report_req req = {0};
> +	struct snp_report_resp *resp;
> +	int rc, resp_len;

	lockdep_assert_held(&snp_cmd_mutex);

> +	if (!arg->req_data || !arg->resp_data)
> +		return -EINVAL;
> +
> +	/* Copy the request payload from userspace */
> +	if (copy_from_user(&req, (void __user *)arg->req_data, sizeof(req)))
> +		return -EFAULT;
> +
> +	/*
> +	 * The intermediate response buffer is used while decrypting the
> +	 * response payload. Make sure that it has enough space to cover the
> +	 * authtag.
> +	 */
> +	resp_len = sizeof(resp->data) + crypto->a_len;
> +	resp = kzalloc(resp_len, GFP_KERNEL_ACCOUNT);
> +	if (!resp)
> +		return -ENOMEM;
> +
> +	/* Issue the command to get the attestation report */
> +	rc = handle_guest_request(snp_dev, SVM_VMGEXIT_GUEST_REQUEST, arg->msg_version,
> +				  SNP_MSG_REPORT_REQ, &req, sizeof(req), resp->data,
> +				  resp_len, &arg->fw_err);
> +	if (rc)
> +		goto e_free;
> +
> +	/* Copy the response payload to userspace */
> +	if (copy_to_user((void __user *)arg->resp_data, resp, sizeof(*resp)))
> +		rc = -EFAULT;
> +
> +e_free:
> +	kfree(resp);
> +	return rc;
> +}
> +
> +static long snp_guest_ioctl(struct file *file, unsigned int ioctl, unsigned long arg)
> +{
> +	struct snp_guest_dev *snp_dev = to_snp_dev(file);
> +	void __user *argp = (void __user *)arg;
> +	struct snp_guest_request_ioctl input;
> +	int ret = -ENOTTY;
> +
> +	if (copy_from_user(&input, argp, sizeof(input)))
> +		return -EFAULT;
> +
> +	input.fw_err = 0xff;
> +
> +	/* Message version must be non-zero */
> +	if (!input.msg_version)
> +		return -EINVAL;
> +
> +	mutex_lock(&snp_cmd_mutex);

That mutex probably is to be held while issuing SNP commands but then
you hold it here already for...

> +
> +	/* Check if the VMPCK is not empty */
> +	if (is_vmpck_empty(snp_dev)) {

... this here which is not really a SNP command issuing.

Should that mutex be grabbed only around handle_guest_request() above or
is it supposed to protect more stuff?

> +		dev_err_ratelimited(snp_dev->dev, "VMPCK is disabled\n");
> +		mutex_unlock(&snp_cmd_mutex);
> +		return -ENOTTY;
> +	}
> +
> +	switch (ioctl) {
> +	case SNP_GET_REPORT:
> +		ret = get_report(snp_dev, &input);
> +		break;
> +	default:
> +		break;
> +	}
> +
> +	mutex_unlock(&snp_cmd_mutex);
> +
> +	if (input.fw_err && copy_to_user(argp, &input, sizeof(input)))
> +		return -EFAULT;
> +
> +	return ret;
> +}
> +
> +static void free_shared_pages(void *buf, size_t sz)
> +{
> +	unsigned int npages = PAGE_ALIGN(sz) >> PAGE_SHIFT;
> +
> +	if (!buf)
> +		return;
> +
> +	/* If fail to restore the encryption mask then leak it. */

Useless comment - the error message says the same.

> +	if (WARN_ONCE(set_memory_encrypted((unsigned long)buf, npages),
> +		      "Failed to restore encryption mask (leak it)\n"))
> +		return;
> +
> +	__free_pages(virt_to_page(buf), get_order(sz));
> +}
> +
> +static void *alloc_shared_pages(size_t sz)
> +{
> +	unsigned int npages = PAGE_ALIGN(sz) >> PAGE_SHIFT;
> +	struct page *page;
> +	int ret;
> +
> +	page = alloc_pages(GFP_KERNEL_ACCOUNT, get_order(sz));
> +	if (IS_ERR(page))
> +		return NULL;
> +
> +	ret = set_memory_decrypted((unsigned long)page_address(page), npages);
> +	if (ret) {
> +		pr_err("SEV-SNP: failed to mark page shared, ret=%d\n", ret);

Use pr_fmt, grep the tree for an example how and drop "SEV-SNP:" - that
prefix should be "sevguest:".

> +		__free_pages(page, get_order(sz));
> +		return NULL;
> +	}
> +
> +	return page_address(page);
> +}

...

> +static int __init snp_guest_probe(struct platform_device *pdev)
> +{
> +	struct snp_secrets_page_layout *layout;
> +	struct snp_guest_platform_data *data;
> +	struct device *dev = &pdev->dev;
> +	struct snp_guest_dev *snp_dev;
> +	struct miscdevice *misc;
> +	int ret;
> +
> +	if (!dev->platform_data)
> +		return -ENODEV;
> +
> +	data = (struct snp_guest_platform_data *)dev->platform_data;
> +	layout = (__force void *)ioremap_encrypted(data->secrets_gpa, PAGE_SIZE);
> +	if (!layout)
> +		return -ENODEV;
> +
> +	ret = -ENOMEM;
> +	snp_dev = devm_kzalloc(&pdev->dev, sizeof(struct snp_guest_dev), GFP_KERNEL);
> +	if (!snp_dev)
> +		goto e_fail;
> +
> +	ret = -EINVAL;
> +	snp_dev->vmpck = get_vmpck(vmpck_id, layout, &snp_dev->os_area_msg_seqno);
> +	if (!snp_dev->vmpck) {
> +		dev_err(dev, "invalid vmpck id %d\n", vmpck_id);
> +		goto e_fail;
> +	}
> +
> +	/* Verify that VMPCK is not zero. */
> +	if (is_vmpck_empty(snp_dev)) {
> +		dev_err(dev, "vmpck id %d is null\n", vmpck_id);
> +		goto e_fail;
> +	}
> +
> +	platform_set_drvdata(pdev, snp_dev);
> +	snp_dev->dev = dev;
> +	snp_dev->layout = layout;
> +
> +	/* Allocate the shared page used for the request and response message. */
> +	snp_dev->request = alloc_shared_pages(sizeof(struct snp_guest_msg));
> +	if (!snp_dev->request)
> +		goto e_fail;
> +
> +	snp_dev->response = alloc_shared_pages(sizeof(struct snp_guest_msg));
> +	if (!snp_dev->response)
> +		goto e_fail;
> +
> +	ret = -EIO;
> +	snp_dev->crypto = init_crypto(snp_dev, snp_dev->vmpck, VMPCK_KEY_LEN);
> +	if (!snp_dev->crypto)
> +		goto e_fail;
> +
> +	misc = &snp_dev->misc;
> +	misc->minor = MISC_DYNAMIC_MINOR;
> +	misc->name = DEVICE_NAME;
> +	misc->fops = &snp_guest_fops;
> +
> +	/* initial the input address for guest request */
> +	snp_dev->input.req_gpa = __pa(snp_dev->request);
> +	snp_dev->input.resp_gpa = __pa(snp_dev->response);
> +
> +	ret =  misc_register(misc);
> +	if (ret)
> +		goto e_fail;
> +
> +	dev_info(dev, "Initialized SEV-SNP guest driver (using vmpck_id %d)\n", vmpck_id);
> +	return 0;
> +
> +e_fail:
> +	iounmap(layout);
> +	free_shared_pages(snp_dev->request, sizeof(struct snp_guest_msg));
> +	free_shared_pages(snp_dev->response, sizeof(struct snp_guest_msg));

The freeing needs to happen in the opposite order of the allocations.
Audit all gotos.

> +
> +	return ret;
> +}
> +
> +static int __exit snp_guest_remove(struct platform_device *pdev)
> +{
> +	struct snp_guest_dev *snp_dev = platform_get_drvdata(pdev);
> +
> +	free_shared_pages(snp_dev->request, sizeof(struct snp_guest_msg));
> +	free_shared_pages(snp_dev->response, sizeof(struct snp_guest_msg));
> +	deinit_crypto(snp_dev->crypto);
> +	misc_deregister(&snp_dev->misc);
> +
> +	return 0;
> +}
> +
> +static struct platform_driver snp_guest_driver = {
> +	.remove		= __exit_p(snp_guest_remove),
> +	.driver		= {
> +		.name = "snp-guest",
> +	},
> +};
> +
> +module_platform_driver_probe(snp_guest_driver, snp_guest_probe);
> +
> +MODULE_AUTHOR("Brijesh Singh <brijesh.singh@amd.com>");
> +MODULE_LICENSE("GPL");
> +MODULE_VERSION("1.0.0");
> +MODULE_DESCRIPTION("AMD SNP Guest Driver");
> diff --git a/drivers/virt/coco/sevguest/sevguest.h b/drivers/virt/coco/sevguest/sevguest.h
> new file mode 100644
> index 000000000000..cfa76cf8a21a
> --- /dev/null
> +++ b/drivers/virt/coco/sevguest/sevguest.h
> @@ -0,0 +1,98 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (C) 2021 Advanced Micro Devices, Inc.
> + *
> + * Author: Brijesh Singh <brijesh.singh@amd.com>
> + *
> + * SEV-SNP API spec is available at https://developer.amd.com/sev
> + */
> +
> +#ifndef __LINUX_SEVGUEST_H_

I guess VIRT_SEVGUEST_H is better fitting.

> +#define __LINUX_SEVGUEST_H_

...

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply related	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 42/43] virt: sevguest: Add support to derive key
  2022-01-28 17:18 ` [PATCH v9 42/43] virt: sevguest: Add support to derive key Brijesh Singh
  2022-02-01 20:39   ` Peter Gonda
@ 2022-02-07  8:52   ` Borislav Petkov
  2022-02-07 16:23     ` Brijesh Singh
  1 sibling, 1 reply; 115+ messages in thread
From: Borislav Petkov @ 2022-02-07  8:52 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	brijesh.ksingh, tony.luck, marcorr, sathyanarayanan.kuppuswamy,
	Liam Merwick

On Fri, Jan 28, 2022 at 11:18:03AM -0600, Brijesh Singh wrote:
> +static int get_derived_key(struct snp_guest_dev *snp_dev, struct snp_guest_request_ioctl *arg)
> +{
> +	struct snp_guest_crypto *crypto = snp_dev->crypto;
> +	struct snp_derived_key_resp resp = {0};
> +	struct snp_derived_key_req req = {0};
> +	int rc, resp_len;
> +	u8 buf[64+16]; /* Response data is 64 bytes and max authsize for GCM is 16 bytes */

verify_comment_style: Warning: No tail comments please:
 drivers/virt/coco/sevguest/sevguest.c:401 [+	u8 buf[64+16]; /* Response data is 64 bytes and max authsize for GCM is 16 bytes */]

> +	if (!arg->req_data || !arg->resp_data)
> +		return -EINVAL;
> +
> +	/* Copy the request payload from userspace */

That comment looks useless.

> +	if (copy_from_user(&req, (void __user *)arg->req_data, sizeof(req)))
> +		return -EFAULT;
> +
> +	/*
> +	 * The intermediate response buffer is used while decrypting the
> +	 * response payload. Make sure that it has enough space to cover the
> +	 * authtag.
> +	 */
> +	resp_len = sizeof(resp.data) + crypto->a_len;
> +	if (sizeof(buf) < resp_len)
> +		return -ENOMEM;

That test can happen before the copy_from_user() above.

> +
> +	/* Issue the command to get the attestation report */

Also useless.

> +	rc = handle_guest_request(snp_dev, SVM_VMGEXIT_GUEST_REQUEST, arg->msg_version,
> +				  SNP_MSG_KEY_REQ, &req, sizeof(req), buf, resp_len,
> +				  &arg->fw_err);
> +	if (rc)
> +		goto e_free;
> +
> +	/* Copy the response payload to userspace */

Ditto.

> +	memcpy(resp.data, buf, sizeof(resp.data));
> +	if (copy_to_user((void __user *)arg->resp_data, &resp, sizeof(resp)))
> +		rc = -EFAULT;
> +
> +e_free:
> +	memzero_explicit(buf, sizeof(buf));
> +	memzero_explicit(&resp, sizeof(resp));

Those are allocated on stack, why are you clearing them?

> +	return rc;
> +}

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 43/43] virt: sevguest: Add support to get extended report
  2022-01-28 17:18 ` [PATCH v9 43/43] virt: sevguest: Add support to get extended report Brijesh Singh
  2022-02-01 20:43   ` Peter Gonda
@ 2022-02-07  9:16   ` Borislav Petkov
  1 sibling, 0 replies; 115+ messages in thread
From: Borislav Petkov @ 2022-02-07  9:16 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	brijesh.ksingh, tony.luck, marcorr, sathyanarayanan.kuppuswamy

On Fri, Jan 28, 2022 at 11:18:04AM -0600, Brijesh Singh wrote:
> +static int get_ext_report(struct snp_guest_dev *snp_dev, struct snp_guest_request_ioctl *arg)
> +{
> +	struct snp_guest_crypto *crypto = snp_dev->crypto;
> +	struct snp_ext_report_req req = {0};
> +	struct snp_report_resp *resp;
> +	int ret, npages = 0, resp_len;
> +
> +	if (!arg->req_data || !arg->resp_data)
> +		return -EINVAL;
> +
> +	/* Copy the request payload from userspace */

Useless comment.

> +	if (copy_from_user(&req, (void __user *)arg->req_data, sizeof(req)))
> +		return -EFAULT;
> +
> +	if (req.certs_len) {
> +		if (req.certs_len > SEV_FW_BLOB_MAX_SIZE ||
> +		    !IS_ALIGNED(req.certs_len, PAGE_SIZE))
> +			return -EINVAL;
> +	}
> +
> +	if (req.certs_address && req.certs_len) {
> +		if (!access_ok(req.certs_address, req.certs_len))
> +			return -EFAULT;
> +
> +		/*
> +		 * Initialize the intermediate buffer with all zero's. This buffer

"zeros"

> +		 * is used in the guest request message to get the certs blob from
> +		 * the host. If host does not supply any certs in it, then copy
> +		 * zeros to indicate that certificate data was not provided.
> +		 */
> +		memset(snp_dev->certs_data, 0, req.certs_len);
> +
> +		npages = req.certs_len >> PAGE_SHIFT;
> +	}
> +
> +	/*
> +	 * The intermediate response buffer is used while decrypting the
> +	 * response payload. Make sure that it has enough space to cover the
> +	 * authtag.
> +	 */
> +	resp_len = sizeof(resp->data) + crypto->a_len;
> +	resp = kzalloc(resp_len, GFP_KERNEL_ACCOUNT);
> +	if (!resp)
> +		return -ENOMEM;
> +
> +	snp_dev->input.data_npages = npages;
> +	ret = handle_guest_request(snp_dev, SVM_VMGEXIT_EXT_GUEST_REQUEST, arg->msg_version,
> +				   SNP_MSG_REPORT_REQ, &req.data,
> +				   sizeof(req.data), resp->data, resp_len, &arg->fw_err);
> +
> +	/* If certs length is invalid then copy the returned length */
> +	if (arg->fw_err == SNP_GUEST_REQ_INVALID_LEN) {
> +		req.certs_len = snp_dev->input.data_npages << PAGE_SHIFT;
> +
> +		if (copy_to_user((void __user *)arg->req_data, &req, sizeof(req)))
> +			ret = -EFAULT;
> +	}
> +
> +	if (ret)
> +		goto e_free;
> +
> +	/* Copy the certificate data blob to userspace */
> +	if (req.certs_address && req.certs_len &&
> +	    copy_to_user((void __user *)req.certs_address, snp_dev->certs_data,
> +			 req.certs_len)) {
> +		ret = -EFAULT;
> +		goto e_free;
> +	}
> +
> +	/* Copy the response payload to userspace */

Both comments are not needed.

> +	if (copy_to_user((void __user *)arg->resp_data, resp, sizeof(*resp)))
> +		ret = -EFAULT;
> +
> +e_free:
> +	kfree(resp);
> +	return ret;
> +}

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 41/43] virt: Add SEV-SNP guest driver
  2022-02-06 22:39   ` Borislav Petkov
@ 2022-02-07 14:41     ` Brijesh Singh
  2022-02-07 15:22       ` Borislav Petkov
  0 siblings, 1 reply; 115+ messages in thread
From: Brijesh Singh @ 2022-02-07 14:41 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: brijesh.singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy



On 2/6/22 4:39 PM, Borislav Petkov wrote:

> And once you do, do "make htmldocs" to see whether it complains about
> some formatting issues or other errors etc.
> 
> /me goes and does it...
> 
> Ah, here we go:
> 
> Documentation/virt/coco/sevguest.rst:48: WARNING: Inline emphasis start-string without end-string.
> Documentation/virt/coco/sevguest.rst:51: WARNING: Inline emphasis start-string without end-string.
> Documentation/virt/coco/sevguest.rst:55: WARNING: Inline emphasis start-string without end-string.
> Documentation/virt/coco/sevguest.rst:57: WARNING: Definition list ends without a blank line; unexpected unindent.
> 
> There's something it doesn't like about the struct. Yeah, when I look at
> the html output, it is all weird and not monospaced...

I will fix those in next rev.


>> +
>> +The guest ioctl should be issued on a file descriptor of the /dev/sev-guest device.
>> +The ioctl accepts struct snp_user_guest_request. The input and output structure is
>> +specified through the req_data and resp_data field respectively. If the ioctl fails
>> +to execute due to a firmware error, then fw_err code will be set otherwise the
>> +fw_err will be set to 0xff.
> 
> fw_err is u64. What does 0xff mean? Everything above the least
> significant byte is reserved 0?
> 

Yep, I will explicitly say that it should be set to 0x00000000000000FF.


>> diff --git a/drivers/virt/coco/sevguest/Kconfig b/drivers/virt/coco/sevguest/Kconfig
>> new file mode 100644
>> index 000000000000..07ab9ec6471c
>> --- /dev/null
>> +++ b/drivers/virt/coco/sevguest/Kconfig
>> @@ -0,0 +1,12 @@
>> +config SEV_GUEST
>> +	tristate "AMD SEV Guest driver"
>> +	default y
> 
> Definitely not. We don't enable drivers by default unless they're
> ubiquitous.
> 

Randy asked me similar question on v7, and here is my response to it.

https://lore.kernel.org/linux-mm/e6b412e4-f38e-d212-f52a-e7bdc9a26eff@infradead.org/

Let me know if you still think that we should make it 'n'. I am not dead 
against it but I have feeling that once distro's starts building SNP 
aware guest kernel, then we may get asked to enable it by default so 
that attestation report can be obtained by the initial ramdisk.



>> +	 */
>> +	if (count >= UINT_MAX) {
>> +		pr_err_ratelimited("SNP guest request message sequence counter overflow\n");
> 
> How does error message help the user? Is she supposed to reboot the
> machine or so?
> 
> Because it sounds to me like if this goes over 32-bit, this driver stops
> working. So what resets those sequence numbers?

After this condition is met, a guest will no longer get the attestation 
report. It's up to the userspace to reboot the guest or continue without 
attestation.

The only thing that will reset the counter is re-launching the guest to 
go through the entire PSP initialization sequence once again.


>> +
>> +	crypto->iv_len = crypto_aead_ivsize(crypto->tfm);
>> +	if (crypto->iv_len < 12) {
>> +		dev_err(snp_dev->dev, "IV length is less than 12.\n");
> 
> And? < 12 is bad? Make that error message more user-friendly pls.
> 
Okay.


> 
> The order of those free calls needs to be the opposite of the kmallocs
> above.
> 
Okay


>> +
>> +	/* Message version must be non-zero */
>> +	if (!input.msg_version)
>> +		return -EINVAL;
>> +
>> +	mutex_lock(&snp_cmd_mutex);
> 
> That mutex probably is to be held while issuing SNP commands but then
> you hold it here already for...
> 
>> +
>> +	/* Check if the VMPCK is not empty */
>> +	if (is_vmpck_empty(snp_dev)) {
> 
> ... this here which is not really a SNP command issuing.
> 
> Should that mutex be grabbed only around handle_guest_request() above or
> is it supposed to protect more stuff?


Yep, it need to protect more stuff.

We allocate a shared buffers (request, response, cert-chain) that gets 
populated before issuing the command, and then we copy the result from 
reponse shared to callers buffer after the command completes. So, we 
also want to ensure that the shared buffer is not touched before the 
previous ioctl is finished.


-Brijesh

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 41/43] virt: Add SEV-SNP guest driver
  2022-02-07 14:41     ` Brijesh Singh
@ 2022-02-07 15:22       ` Borislav Petkov
  0 siblings, 0 replies; 115+ messages in thread
From: Borislav Petkov @ 2022-02-07 15:22 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	brijesh.ksingh, tony.luck, marcorr, sathyanarayanan.kuppuswamy

On Mon, Feb 07, 2022 at 08:41:47AM -0600, Brijesh Singh wrote:
> Randy asked me similar question on v7, and here is my response to it.
> 
> https://lore.kernel.org/linux-mm/e6b412e4-f38e-d212-f52a-e7bdc9a26eff@infradead.org/
> 
> Let me know if you still think that we should make it 'n'. I am not dead
> against it but I have feeling that once distro's starts building SNP aware
> guest kernel, then we may get asked to enable it by default so that
> attestation report can be obtained by the initial ramdisk.

Well, let's see:

$ make oldconfig
...

#
# No change to .config
#
$

So it didn't even ask me. Because

# CONFIG_VIRT_DRIVERS is not set

so what's the point of this "default y"?

If the distros are your worry, then you probably will have to ask
them to do so explicitly anyway because at least we edit our configs
ourselves and decide what to enable or what not.

> After this condition is met, a guest will no longer get the attestation
> report. It's up to the userspace to reboot the guest or continue without
> attestation.
> 
> The only thing that will reset the counter is re-launching the guest to go
> through the entire PSP initialization sequence once again.

Well, but you need to explain that somewhere to the guest owners.
I guess either here in that error message or in some higher-level
glue which will do the attestation. Just saying that some counter has
overflown is not very user-friendly, I'd say.

> Yep, it need to protect more stuff.
> 
> We allocate a shared buffers (request, response, cert-chain) that gets
> populated before issuing the command, and then we copy the result from
> reponse shared to callers buffer after the command completes. So, we also
> want to ensure that the shared buffer is not touched before the previous
> ioctl is finished.

So you need to rename that mutex and slap a comment above it what it
protects.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 31/43] x86/compressed/64: Add support for SEV-SNP CPUID table in #VC handlers
  2022-02-06 13:37       ` Borislav Petkov
@ 2022-02-07 15:37         ` Michael Roth
  2022-02-07 17:52           ` Borislav Petkov
  0 siblings, 1 reply; 115+ messages in thread
From: Michael Roth @ 2022-02-07 15:37 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Brijesh Singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr

On Sun, Feb 06, 2022 at 02:37:59PM +0100, Borislav Petkov wrote:
> First of all,
> 
> let me give you a very important piece of advice for the future:
> ignoring review feedback is a very unfriendly thing to do:
> 
> - if you agree with the feedback, you work it in in the next revision
> 
> - if you don't agree, you *say* *why* you don't
> 
> What you should avoid is what you've done - ignore it silently. Because
> reviewing submitters code is the most ungrateful work around the kernel,
> and, at the same time, it is the most important one.
> 
> So please make sure you don't do that in the future when submitting
> patches for upstream inclusion. I'm sure you can imagine, the ignoring
> can go both ways.

Absolutely, I know a thorough review is grueling work, and would never
want to give the impression that I don't appreciate it. Was just hoping
to revisit these in the context of v9 since there were some concerning
things in flight WRT the spec handling and I was sort of focused on
getting ahead of those in case they involved firmware/spec changes. But
I realize that's resulted in a waste of your time and I should have at
least provided some indication of where I was with these before your
review. Won't happen again.

> 
> On Sat, Feb 05, 2022 at 10:22:49AM -0600, Michael Roth wrote:
> > The documentation for lea (APM Volume 3 Chapter 3) seemed to require
> > that the destination register be a general purpose register, so it
> > seemed like there was potential for breakage in allowing GCC to use
> > anything otherwise.
> 
> There's no such potential: binutils encodes the unstruction operands
> and what types are allowed. IOW, the assembler knows that there goes a
> register.
> 
> > Maybe GCC is smart enough to figure that out, but since we know the
> > constraint in advance it seemed safer to stick with the current
> > approach of enforcing that constraint.
> 
> I guess in this particular case it won't matter whether it is "=r" or
> "=g" but still...
> 
> > I did look into it and honestly it just seemed to add more abstractions that
> > made it harder to parse the specific operations taken place here. For
> > instance, post-processing of 0x8000001E entry, we have e{a,b,c,d}x from
> > the CPUID table, then to post process:
> > 
> >   switch (func):
> >   case 0x8000001E:
> >     /* extended APIC ID */
> >     snp_cpuid_hv(func, subfunc, eax, &ebx2, &ecx2, NULL);
> 
> Well, my suggestion was to put it *all* in the subleaf struct - not just
> the regs:
> 
> struct cpuid_leaf {
> 	u32 func;
> 	u32 subfunc;
> 	u32 eax;
> 	u32 ebx;
> 	u32 ecx;
> 	u32 edx;
> };
> 
> so that the call signature is:
> 
> 	snp_cpuid_postprocess(struct cpuid_leaf *leaf)
> 
> 
> > and it all reads in a clear/familiar way to all the other
> > cpuid()/native_cpuid() users throughout the kernel,
> 
> maybe it should read differently *on* *purpose*. Exactly because this is
> *not* the usual CPUID handling code but CPUID verification code for SNP
> guests.

Well, there's also sev_cpuid_hv(), which really is sort of a variant of
cpuid()/native_cpuid() helpers that happens to use the GHCB protocol, and
snp_cpuid() gets called at roughly the same callsite in the #VC handler,
so it made sense to me to just stick to a similar signature across all of
them. But no problem implementing it as you suggested, that was just my
reasoning there.

> 
> And please explain to me what's so unclear about leaf->eax vs *eax?!

It's not that one is unclear, it's just that, in the context of reading
through a function which has a lot of similar/repetitive cases for
various leaves:

  snp_cpuid_hv(fn, subfn, &eax, &ebx2, NULL, NULL) 

  ebx = fixup(ebx2)

seems slightly easier to process to me than:

  snp_cpuid_hv(&leaf_hv)

  leaf->eax = leaf_hv->eax
  leaf->ebx = fixup(leaf_hv->ebx)

because I already have advance indication from the function call that eax
will be overwritten by hv, ebx2 will getting fixed up in subsequent lines,
and ecx/edx will be used as is, and don't need to parse multiple lines to
discern that.

But I know I may very well be biased from staring at the existing
implementation so much, and that maybe your approach would be clearer to
someone looking at the code who isn't as familiar with that pattern.

> 
> > I saved the diff from when I looked into it previously (was just a
> > rough-sketch, not build-tested), and included it below for reference,
> > but it just didn't seem to help with readability to me,
> 
> Well, looking at it, the only difference is that the IO is done
> over a struct instead of separate pointers. And the diff is pretty
> straight-forward.
> 
> So no, having a struct cpuid_leaf containing it all is actually better
> in this case because you know which is which if you name it properly and
> you have a single pointer argument which you pass around - something
> which is done all around the kernel.

Ok, will work this in for v10. My plan is to introduce this struct:

  struct cpuid_leaf {
      u32 fn;
      u32 subfn;
      u32 eax;
      u32 ebx;
      u32 ecx;
      u32 edx;
  }

as part of the patch which introduces sev_cpuid_hv():

  x86/sev: Move MSR-based VMGEXITs for CPUID to helper

and then utilize that for the function parameters there, here, and any
other patches in the SNP code involved with fetching/manipulating cpuid
values before returning them to the #VC handler.

Thanks!

-Mike

> 
> -- 
> Regards/Gruss,
>     Boris.
> 
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpeople.kernel.org%2Ftglx%2Fnotes-about-netiquette&amp;data=04%7C01%7Cmichael.roth%40amd.com%7Cd409d10fc3cc41c4cbfb08d9e975ebfc%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637797515011518153%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata=%2FDjnNrQ7a8%2F6eI%2FRVJJ2vmLEpeDaAeDVOJcW9CBnggU%3D&amp;reserved=0

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 42/43] virt: sevguest: Add support to derive key
  2022-02-07  8:52   ` Borislav Petkov
@ 2022-02-07 16:23     ` Brijesh Singh
  2022-02-07 19:09       ` Dov Murik
  0 siblings, 1 reply; 115+ messages in thread
From: Brijesh Singh @ 2022-02-07 16:23 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: brijesh.singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Liam Merwick



On 2/7/22 2:52 AM, Borislav Petkov wrote:
> Those are allocated on stack, why are you clearing them?

Yep, no need to explicitly clear it. I'll take it out in next rev.

thanks

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 38/43] x86/sev: Use firmware-validated CPUID for SEV-SNP guests
  2022-02-06 15:46     ` Borislav Petkov
@ 2022-02-07 17:00       ` Michael Roth
  2022-02-07 18:43         ` Borislav Petkov
  0 siblings, 1 reply; 115+ messages in thread
From: Michael Roth @ 2022-02-07 17:00 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, brijesh.singh, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	brijesh.ksingh, tony.luck, marcorr, sathyanarayanan.kuppuswamy

On Sun, Feb 06, 2022 at 04:46:08PM +0100, Borislav Petkov wrote:
> On Sat, Feb 05, 2022 at 11:19:01AM -0600, Michael Roth wrote:
> > I mentioned the concern you raised about out-of-spec hypervisors
> > using non-zero for Reserved fields, and potentially breaking future
> > guests that attempt to make use of them if they ever get re-purposed
> > for some other functionality, but their take on that is that such a
> > hypervisor is clearly out-of-spec, and would not be supported.
> 
> Yah, like stating that something should not be done in the spec is
> going to stop people from doing it anyway. You folks need to understand
> one thing: the smaller the attack surface, the better. And you need to
> *enforce* that - not state it in a spec. No one cares about the spec
> when it comes to poking holes in the architecture. And people do poke
> and will poke holes *especially* in this architecture, as its main goal
> is security.

Agreed, but SNP never relies on the hypervisor to be the single authority
on what security features are available, if these fields were ever
re-purposed for anything in the future that would almost assuredly be
accompanied by a new SEV status MSR bit (that can't be intercepted),
a new policy bit (which affects measurement), a new "type2" CPUID page
that would effect measurement (contents don't affect measurement in
current firmware, but the metadata, like what type of page it is, is),
etc.

At that point any guest code changes to make use of those new fields
could then fail any out-of-spec hypervisors trying to use those fields
without the requisite firmware/hardware support. So please don't take
my summary as an indication that the security relies on hypervisors
abiding by the spec, this is more a statement that an out-of-spec
hypervisor should not expect that their guests will continue working
in future firmware versions, and what's being determined here is
whether to break those out-of-spec hypervisor now, or later when/if
we actually make use of the fields in the guest code, and whether we
need to make clarifications in the spec to help implementations avoid
such breakage.

> 
> > Another possibility is enforcing 0 in the firmware itself.
> 
> Yes, this is the thing to do. If they're going to be reused in the
> future, then guests can be changed to handle that. Like we do all the
> time anyway.

Ok, I'll follow up with the firmware team on this. But just to be clear,
what they're suggesting is that the firmware could enforce the MBZ checks
on the CPUID page, so out-of-spec hypervisors will fail immediately,
rather than in some future version of the spec/cpuid page, and guests
can continue ignoring them in the meantime.

I'll also note the type you spotted with the Table 13 reference and
see if there's anything else that can be cleared up there.

> 
> > So given their guidance on the Reserved fields, and your guidance
> > on not doing the other sanity-checks, my current plan to to drop
> > snp_check_cpuid_table() completely for v10, but if you have other
> > thoughts on that just let me know.
> 
> Yes, and pls fix the firmware to zero them out, because from reading
> your very detailed explanation - btw, thanks for taking the time to
> explain properly what's not in the ABI doc:
> 
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fr%2F20220205154243.s33gwghqwbb4efyl%40amd.com&amp;data=04%7C01%7Cmichael.roth%40amd.com%7C377208b805be40f55c0f08d9e987d19e%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637797591858315372%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata=L9McNfcn514uHmHLCtFPI2TqLKoCK0%2FKDMPk32lO8r8%3D&amp;reserved=0
> 
> it sounds like those two input fields are not really needed. So the
> earlier you fix them in the firmware and deprecate them, the better.

XCR0_IN/XSS_IN aren't needed by a guest if it follows the recommended
implementation and computes them on the fly, but they do still serve
a purpose in the context of firmware validation of 0xD,0x0/0xD,0x1
leaves, since those are validated relative to a particular
XCR0_IN/XSS_IN.

Whether a guest chooses to use the resulting firmware-validated version
of EBX for those, or compute it on it's own, is considered outside the
scope of the SNP firmware ABI, so I think the plan is to leave those
fields in the spec at least for current SNP version, and rely on the
implementation recommendations to document anything outside of that.

But since the recommendations need to be compatibile with the SNP
firmware ABI, I've updated the recommendations to have the guest to go
ahead and check for XCR0_IN={1,3}/XSS_IN=0 when searching for base
0xD,0x0/0xD,0x1 entries, so that there's no question of whether a guest
is supposed to expect XCR0_IN/XSS_IN to be 0.

Getting that document ready to send if a few.

Hope that helps clear things up, but please let me know if there's
anything else that needs clarification.

Thanks!

-Mike

> 
> Thx!
> 
> -- 
> Regards/Gruss,
>     Boris.
> 
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpeople.kernel.org%2Ftglx%2Fnotes-about-netiquette&amp;data=04%7C01%7Cmichael.roth%40amd.com%7C377208b805be40f55c0f08d9e987d19e%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637797591858315372%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata=0uNI0Ojku3ZIgiQRbNf0UxNXruycXsOYIUNIY9I2IiY%3D&amp;reserved=0

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 31/43] x86/compressed/64: Add support for SEV-SNP CPUID table in #VC handlers
  2022-02-07 15:37         ` Michael Roth
@ 2022-02-07 17:52           ` Borislav Petkov
  0 siblings, 0 replies; 115+ messages in thread
From: Borislav Petkov @ 2022-02-07 17:52 UTC (permalink / raw)
  To: Michael Roth
  Cc: Brijesh Singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr

On Mon, Feb 07, 2022 at 09:37:39AM -0600, Michael Roth wrote:
> Absolutely, I know a thorough review is grueling work, and would never
> want to give the impression that I don't appreciate it. Was just hoping
> to revisit these in the context of v9 since there were some concerning
> things in flight WRT the spec handling and I was sort of focused on
> getting ahead of those in case they involved firmware/spec changes. But
> I realize that's resulted in a waste of your time and I should have at
> least provided some indication of where I was with these before your
> review. Won't happen again.

Thanks, that's appreciated.

And in case you're wondering, the kernel is the most flexible thing from
all parties involved so even if you have to change the spec/fw, fixing
the kernel is a lot easier than any of the other things. So make sure
you do a good job with the spec/fw - the kernel will be fine. :-)

> Ok, will work this in for v10. My plan is to introduce this struct:
> 
>   struct cpuid_leaf {
>       u32 fn;
>       u32 subfn;
>       u32 eax;
>       u32 ebx;
>       u32 ecx;
>       u32 edx;
>   }

Ok.

> as part of the patch which introduces sev_cpuid_hv():
> 
>   x86/sev: Move MSR-based VMGEXITs for CPUID to helper
> 
> and then utilize that for the function parameters there, here, and any
> other patches in the SNP code involved with fetching/manipulating cpuid
> values before returning them to the #VC handler.

Sounds good.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 38/43] x86/sev: Use firmware-validated CPUID for SEV-SNP guests
  2022-02-07 17:00       ` Michael Roth
@ 2022-02-07 18:43         ` Borislav Petkov
  0 siblings, 0 replies; 115+ messages in thread
From: Borislav Petkov @ 2022-02-07 18:43 UTC (permalink / raw)
  To: Michael Roth
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, brijesh.singh, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	brijesh.ksingh, tony.luck, marcorr, sathyanarayanan.kuppuswamy

On Mon, Feb 07, 2022 at 11:00:18AM -0600, Michael Roth wrote:
> this is more a statement that an out-of-spec hypervisor should not
> expect that their guests will continue working in future firmware
> versions, and what's being determined here is whether to break
> those out-of-spec hypervisor now, or later when/if we actually
> make use of the fields in the guest code,

I think you're missing a very important aspect here called reality.

Let's say that HV is a huge cloud vendor who means a lot of $ and a
huge use case for SEV tech. And let's say that same HV is doing those
incompatible things.

Now imagine you break it with the spec change. But they already have
a gazillion of deployments on real hw which they can't simply update
just like that. Hell, cloud vendors are even trying to dictate how CPU
vendors should do microcode updates on a live system, without rebooting,
and we're talking about some wimpy fields in some table.

Now imagine your business unit calls your engineering and says, you need
to fix this because a very persuasive chunk of money.

What you most likely will end up with is an ugly ugly workaround after a
lot of managers screaming at each other and you won't even think about
breaking that HV.

Now imagine you've designed it the right and unambiguous way from the
getgo. You wake up and realize, it was all just a bad dream...

> Ok, I'll follow up with the firmware team on this. But just to be clear,
> what they're suggesting is that the firmware could enforce the MBZ checks
> on the CPUID page, so out-of-spec hypervisors will fail immediately,
> rather than in some future version of the spec/cpuid page, and guests
> can continue ignoring them in the meantime.

Yes, exactly. Fail immediately.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 42/43] virt: sevguest: Add support to derive key
  2022-02-07 16:23     ` Brijesh Singh
@ 2022-02-07 19:09       ` Dov Murik
  2022-02-07 20:08         ` Brijesh Singh
  0 siblings, 1 reply; 115+ messages in thread
From: Dov Murik @ 2022-02-07 19:09 UTC (permalink / raw)
  To: Brijesh Singh, Borislav Petkov
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes,
	Tobin Feldman-Fitzthum, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	brijesh.ksingh, tony.luck, marcorr, sathyanarayanan.kuppuswamy,
	Liam Merwick, Dov Murik



On 07/02/2022 18:23, Brijesh Singh wrote:
> 
> 
> On 2/7/22 2:52 AM, Borislav Petkov wrote:
>> Those are allocated on stack, why are you clearing them?
> 
> Yep, no need to explicitly clear it. I'll take it out in next rev.
> 

Wait, this is key material generated by PSP and passed to userspace.
Why leave copies of it floating around kernel memory?  I thought that's
the whole reason for these memzero_explicit() calls (maybe add a comment?).

As an example, in arch/x86/crypto/aesni-intel_glue.c there are two calls
to memzero_explicit(), both on stack variables; the only reason for
these calls (as I understand it) is to avoid some future possible leak
of this sensitive data (keys, cipher context, etc.).  I'm sure there are
other examples in the kernel code.


-Dov

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 42/43] virt: sevguest: Add support to derive key
  2022-02-07 19:09       ` Dov Murik
@ 2022-02-07 20:08         ` Brijesh Singh
  2022-02-07 20:28           ` Borislav Petkov
  2022-02-08  7:56           ` Dov Murik
  0 siblings, 2 replies; 115+ messages in thread
From: Brijesh Singh @ 2022-02-07 20:08 UTC (permalink / raw)
  To: Dov Murik, Borislav Petkov
  Cc: brijesh.singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Tobin Feldman-Fitzthum, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Liam Merwick



On 2/7/22 1:09 PM, Dov Murik wrote:
> 
> 
> On 07/02/2022 18:23, Brijesh Singh wrote:
>>
>>
>> On 2/7/22 2:52 AM, Borislav Petkov wrote:
>>> Those are allocated on stack, why are you clearing them?
>>
>> Yep, no need to explicitly clear it. I'll take it out in next rev.
>>
> 
> Wait, this is key material generated by PSP and passed to userspace.
> Why leave copies of it floating around kernel memory?  I thought that's
> the whole reason for these memzero_explicit() calls (maybe add a comment?).
> 


Ah, now I remember I added the memzero_explicit() to address your review 
feedback :) In that patch version, we were using the kmalloc() to store 
the response data; since then, we switched to stack. We will leak the 
key outside when the stack is converted private-> shared; I don't know 
if any of these are going to happen. I can add a comment and keep the 
memzero_explicit() call.

Boris, let me know if you are okay with it?


> As an example, in arch/x86/crypto/aesni-intel_glue.c there are two calls
> to memzero_explicit(), both on stack variables; the only reason for
> these calls (as I understand it) is to avoid some future possible leak
> of this sensitive data (keys, cipher context, etc.).  I'm sure there are
> other examples in the kernel code.
> 

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 42/43] virt: sevguest: Add support to derive key
  2022-02-07 20:08         ` Brijesh Singh
@ 2022-02-07 20:28           ` Borislav Petkov
  2022-02-08  7:56           ` Dov Murik
  1 sibling, 0 replies; 115+ messages in thread
From: Borislav Petkov @ 2022-02-07 20:28 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: Dov Murik, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Tobin Feldman-Fitzthum, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Liam Merwick

On Mon, Feb 07, 2022 at 02:08:17PM -0600, Brijesh Singh wrote:
> Boris, let me know if you are okay with it?

Yes, most definitely as long as there's a comment explaining why.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 30/43] KVM: SEV: Add documentation for SEV-SNP CPUID Enforcement
  2022-01-28 17:17 ` [PATCH v9 30/43] KVM: SEV: Add documentation for SEV-SNP CPUID Enforcement Brijesh Singh
@ 2022-02-07 23:48   ` Sean Christopherson
  2022-02-08 14:54     ` Michael Roth
  2022-02-08 15:11     ` Borislav Petkov
  0 siblings, 2 replies; 115+ messages in thread
From: Sean Christopherson @ 2022-02-07 23:48 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum,
	Borislav Petkov, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	brijesh.ksingh, tony.luck, marcorr, sathyanarayanan.kuppuswamy

On Fri, Jan 28, 2022, Brijesh Singh wrote:
> From: Michael Roth <michael.roth@amd.com>
> 
> Update the documentation with SEV-SNP CPUID enforcement.
> 
> Signed-off-by: Michael Roth <michael.roth@amd.com>
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> ---
>  .../virt/kvm/amd-memory-encryption.rst        | 28 +++++++++++++++++++
>  1 file changed, 28 insertions(+)
> 
> diff --git a/Documentation/virt/kvm/amd-memory-encryption.rst b/Documentation/virt/kvm/amd-memory-encryption.rst
> index 1c6847fff304..0c72f44cc11a 100644
> --- a/Documentation/virt/kvm/amd-memory-encryption.rst
> +++ b/Documentation/virt/kvm/amd-memory-encryption.rst

This doc is specifically for KVM's host-side implemenation, whereas the below is
(a) mostly targeted at the guest and (b) has nothing to do with KVM.

Documentation/x86/amd-memory-encryption.rst isn't a great fit either.

Since TDX will need a fair bit of documentation, and SEV-ES could retroactively
use docs as well, what about adding a sub-directory:

	Documentation/virt/confidential_compute

to match the "cc_platform_has" stuffr, and then we can add sev.rst and tdx.rst
there?  Or sev-es.rst, sev-snp.rst, etc... if we want to split things up more.

It might be worth extracting the SEV details from x86/amd-memory-encryption.rst
into virt/ as well.  A big chunk of that file appears to be SEV specific, and it
appears to have gotten a little out-of-whack.  E.g. this section no longer makes
sense as the last paragraph below appears to be talking about SME (bit 23 in MSR
0xc0010010), but walking back "this bit" would reference SEV.  I suspect a
mostly-standalone sev.rst would be easier to follow than an intertwined SME+SEV.

  If support for SME is present, MSR 0xc00100010 (MSR_AMD64_SYSCFG) can be used to
  determine if SME is enabled and/or to enable memory encryption::

          0xc0010010:
                  Bit[23]   0 = memory encryption features are disabled
                            1 = memory encryption features are enabled

  If SEV is supported, MSR 0xc0010131 (MSR_AMD64_SEV) can be used to determine if
  SEV is active::

          0xc0010131:
                  Bit[0]    0 = memory encryption is not active
                            1 = memory encryption is active

  Linux relies on BIOS to set this bit if BIOS has determined that the reduction
  in the physical address space as a result of enabling memory encryption (see
  CPUID information above) will not conflict with the address space resource
  requirements for the system.  If this bit is not set upon Linux startup then
  Linux itself will not set it and memory encryption will not be possible.

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 37/43] x86/sev: Add SEV-SNP feature detection/setup
  2022-02-06 19:38   ` Borislav Petkov
@ 2022-02-08  5:25     ` Michael Roth
  0 siblings, 0 replies; 115+ messages in thread
From: Michael Roth @ 2022-02-08  5:25 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Brijesh Singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On Sun, Feb 06, 2022 at 08:38:19PM +0100, Borislav Petkov wrote:
> On Fri, Jan 28, 2022 at 11:17:58AM -0600, Brijesh Singh wrote:
> > +static __init struct cc_blob_sev_info *snp_find_cc_blob(struct boot_params *bp)
> > +{
> > +	struct cc_blob_sev_info *cc_info;
> > +
> > +	/* Boot kernel would have passed the CC blob via boot_params. */
> > +	if (bp->cc_blob_address) {
> > +		cc_info = (struct cc_blob_sev_info *)(unsigned long)bp->cc_blob_address;
> > +		goto found_cc_info;
> > +	}
> 
> What is the difference here, why aren't you looking for the blob in an
> EFI table?
> 
> Even if you're booted directly by firmware, there should still be EFI
> there or?
> 
> And if so, then I think you should share some of the code through
> sev-shared.c so that there's not so much duplication...

In order to scan EFI this early in the boot of kernel proper, we'd need
to pull in the helpers from arch/x86/boot/compressed/efi.c, since the
normal EFI facilities don't get set up until later. It's doable, but
since any entry via boot/compressed kernel will have already done that
work and stashed it in boot_params->cc_blob_address, it seems like it
would only introduce more complexity and potential for breakage.

For direct entry to kernel proper, our thinking is that it would be for
things like the PVH entry path:

  https://stefano-garzarella.github.io/posts/2019-08-23-qemu-linux-kernel-pvh/

which doesn't use EFI, or other container-focused applications which use
lightweight non-EFI firmwares like qboot. So to support direct entry into
kernel proper, relying on the CC blob setup_data structure instead seems
more suited for these cases, so that's why kernel proper only uses the
setup_data structure and relies on boot/compressed to handle EFI entry.

> 
> -- 
> Regards/Gruss,
>     Boris.
> 
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpeople.kernel.org%2Ftglx%2Fnotes-about-netiquette&amp;data=04%7C01%7Cmichael.roth%40amd.com%7Ccb8e665b2ed14dbd400c08d9e9a841ff%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637797731199858639%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata=5Pd7PA4BdfnfWzcXpah8HkrtVfu4h6nUIR8b3mB%2BIxM%3D&amp;reserved=0

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 42/43] virt: sevguest: Add support to derive key
  2022-02-07 20:08         ` Brijesh Singh
  2022-02-07 20:28           ` Borislav Petkov
@ 2022-02-08  7:56           ` Dov Murik
  2022-02-08 10:51             ` Borislav Petkov
  2022-02-08 14:14             ` Brijesh Singh
  1 sibling, 2 replies; 115+ messages in thread
From: Dov Murik @ 2022-02-08  7:56 UTC (permalink / raw)
  To: Brijesh Singh, Borislav Petkov
  Cc: x86, linux-kernel, kvm, linux-efi, platform-driver-x86,
	linux-coco, linux-mm, Thomas Gleixner, Ingo Molnar, Joerg Roedel,
	Tom Lendacky, H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes,
	Tobin Feldman-Fitzthum, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	brijesh.ksingh, tony.luck, marcorr, sathyanarayanan.kuppuswamy,
	Liam Merwick, Dov Murik



On 07/02/2022 22:08, Brijesh Singh wrote:
> 
> 
> On 2/7/22 1:09 PM, Dov Murik wrote:
>>
>>
>> On 07/02/2022 18:23, Brijesh Singh wrote:
>>>
>>>
>>> On 2/7/22 2:52 AM, Borislav Petkov wrote:
>>>> Those are allocated on stack, why are you clearing them?
>>>
>>> Yep, no need to explicitly clear it. I'll take it out in next rev.
>>>
>>
>> Wait, this is key material generated by PSP and passed to userspace.
>> Why leave copies of it floating around kernel memory?  I thought that's
>> the whole reason for these memzero_explicit() calls (maybe add a
>> comment?).
>>
> 
> 
> Ah, now I remember I added the memzero_explicit() to address your review
> feedback :) In that patch version, we were using the kmalloc() to store
> the response data; since then, we switched to stack. We will leak the
> key outside when the stack is converted private-> shared; I don't know
> if any of these are going to happen. I can add a comment and keep the
> memzero_explicit() call.
> 

Just to be clear, I didn't mean necessarily "leak the key to the
untrusted host" (even if a page is converted back from private to
shared, it is encrypted, so host can't read its contents).  But even
*inside* the guest, when dealing with sensitive data like keys, we
should minimize the amount of copies that float around (I assume this is
the reason for most of the uses of memzero_explicit() in the kernel).

-Dov



> Boris, let me know if you are okay with it?
> 
> 
>> As an example, in arch/x86/crypto/aesni-intel_glue.c there are two calls
>> to memzero_explicit(), both on stack variables; the only reason for
>> these calls (as I understand it) is to avoid some future possible leak
>> of this sensitive data (keys, cipher context, etc.).  I'm sure there are
>> other examples in the kernel code.
>>

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 42/43] virt: sevguest: Add support to derive key
  2022-02-08  7:56           ` Dov Murik
@ 2022-02-08 10:51             ` Borislav Petkov
  2022-02-08 14:14             ` Brijesh Singh
  1 sibling, 0 replies; 115+ messages in thread
From: Borislav Petkov @ 2022-02-08 10:51 UTC (permalink / raw)
  To: Dov Murik
  Cc: Brijesh Singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Tobin Feldman-Fitzthum, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Liam Merwick

On Tue, Feb 08, 2022 at 09:56:52AM +0200, Dov Murik wrote:
> Just to be clear, I didn't mean necessarily "leak the key to the
> untrusted host" (even if a page is converted back from private to
> shared, it is encrypted, so host can't read its contents).  But even
> *inside* the guest, when dealing with sensitive data like keys, we
> should minimize the amount of copies that float around (I assume this is
> the reason for most of the uses of memzero_explicit() in the kernel).

I don't know about Brijesh but I understood you exactly as you mean it.
And yap, I agree we should always clear such sensitive buffers.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 33/43] x86/compressed: Add SEV-SNP feature detection/setup
  2022-02-06 16:41   ` Borislav Petkov
@ 2022-02-08 13:50     ` Michael Roth
  2022-02-08 15:02       ` Borislav Petkov
  0 siblings, 1 reply; 115+ messages in thread
From: Michael Roth @ 2022-02-08 13:50 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Brijesh Singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On Sun, Feb 06, 2022 at 05:41:26PM +0100, Borislav Petkov wrote:
> On Fri, Jan 28, 2022 at 11:17:54AM -0600, Brijesh Singh wrote:
> > +static struct cc_setup_data *get_cc_setup_data(struct boot_params *bp)
> > +{
> > +	struct setup_data *hdr = (struct setup_data *)bp->hdr.setup_data;
> > +
> > +	while (hdr) {
> > +		if (hdr->type == SETUP_CC_BLOB)
> > +			return (struct cc_setup_data *)hdr;
> > +		hdr = (struct setup_data *)hdr->next;
> > +	}
> > +
> > +	return NULL;
> > +}
> 
> Merge that function into its only caller.
> 
> ...
> 
> > +static struct cc_blob_sev_info *snp_find_cc_blob(struct boot_params *bp)
> 
> static function, no need for the "snp_" prefix. Please audit all your
> patches for that and remove that prefix from all static functions.
> 

I'm a little unsure how to handle some of these others cases:

We have this which is defined in sev-shared.c and used "externally"
by kernel/sev.c or boot/compressed/sev.c:

  snp_setup_cpuid_table()

I'm assuming that would be considered 'non-static' since it would need
to be exported if sev-shared.c was compiled separately instead of
directly #include'd.

And then there's also these which are static helpers that are only used
within sev-shared.c:

  snp_cpuid_info_get_ptr()
  snp_cpuid_calc_xsave_size()
  snp_cpuid_get_validated_func()
  snp_cpuid_check_range()
  snp_cpuid_hv()
  snp_cpuid_postprocess()
  snp_cpuid()

but in those cases it seems useful to keep them grouped under the
snp_cpuid_* prefix since they become ambiguous otherwise, and
just using cpuid_* as a prefix (or suffix/etc) makes it unclear
that they are only used for SNP and not for general CPUID handling.
Should we leave those as-is?

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 42/43] virt: sevguest: Add support to derive key
  2022-02-08  7:56           ` Dov Murik
  2022-02-08 10:51             ` Borislav Petkov
@ 2022-02-08 14:14             ` Brijesh Singh
  1 sibling, 0 replies; 115+ messages in thread
From: Brijesh Singh @ 2022-02-08 14:14 UTC (permalink / raw)
  To: Dov Murik, Borislav Petkov
  Cc: brijesh.singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Tobin Feldman-Fitzthum, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Liam Merwick



On 2/8/22 1:56 AM, Dov Murik wrote:
...

> 
> Just to be clear, I didn't mean necessarily "leak the key to the
> untrusted host" (even if a page is converted back from private to
> shared, it is encrypted, so host can't read its contents).  But even
> *inside* the guest, when dealing with sensitive data like keys, we
> should minimize the amount of copies that float around (I assume this is
> the reason for most of the uses of memzero_explicit() in the kernel).
> 

Yap, I agree with your point and will keep the memzero_explicit().

-Brijesh


^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 30/43] KVM: SEV: Add documentation for SEV-SNP CPUID Enforcement
  2022-02-07 23:48   ` Sean Christopherson
@ 2022-02-08 14:54     ` Michael Roth
  2022-02-08 15:11     ` Borislav Petkov
  1 sibling, 0 replies; 115+ messages in thread
From: Michael Roth @ 2022-02-08 14:54 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Brijesh Singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	brijesh.ksingh, tony.luck, marcorr, sathyanarayanan.kuppuswamy

On Mon, Feb 07, 2022 at 11:48:11PM +0000, Sean Christopherson wrote:
> On Fri, Jan 28, 2022, Brijesh Singh wrote:
> > From: Michael Roth <michael.roth@amd.com>
> > 
> > Update the documentation with SEV-SNP CPUID enforcement.
> > 
> > Signed-off-by: Michael Roth <michael.roth@amd.com>
> > Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> > ---
> >  .../virt/kvm/amd-memory-encryption.rst        | 28 +++++++++++++++++++
> >  1 file changed, 28 insertions(+)
> > 
> > diff --git a/Documentation/virt/kvm/amd-memory-encryption.rst b/Documentation/virt/kvm/amd-memory-encryption.rst
> > index 1c6847fff304..0c72f44cc11a 100644
> > --- a/Documentation/virt/kvm/amd-memory-encryption.rst
> > +++ b/Documentation/virt/kvm/amd-memory-encryption.rst
> 
> This doc is specifically for KVM's host-side implemenation, whereas the below is
> (a) mostly targeted at the guest and (b) has nothing to do with KVM.
> 
> Documentation/x86/amd-memory-encryption.rst isn't a great fit either.
> 
> Since TDX will need a fair bit of documentation, and SEV-ES could retroactively
> use docs as well, what about adding a sub-directory:
> 
> 	Documentation/virt/confidential_compute

There's actually a Documentation/virt/coco/sevguest.rst that was added
in this series as part of:

  "virt: Add SEV-SNP guest driver"

Maybe that's good choice?

I've been wondering about potentially adding the:

  "Guest/Hypervisor Implementation Notes for SEV-SNP CPUID Enforcement"

document that was sent to SNP mailing list under Documentation/
somewhere. If we were to do that, it would be a good place to move the
documentation from this patch into as well. Any thoughts on that?

> 
> to match the "cc_platform_has" stuffr, and then we can add sev.rst and tdx.rst
> there?  Or sev-es.rst, sev-snp.rst, etc... if we want to split things up more.
> 
> It might be worth extracting the SEV details from x86/amd-memory-encryption.rst
> into virt/ as well.  A big chunk of that file appears to be SEV specific, and it
> appears to have gotten a little out-of-whack.  E.g. this section no longer makes
> sense as the last paragraph below appears to be talking about SME (bit 23 in MSR
> 0xc0010010), but walking back "this bit" would reference SEV.  I suspect a
> mostly-standalone sev.rst would be easier to follow than an intertwined SME+SEV.
> 
>   If support for SME is present, MSR 0xc00100010 (MSR_AMD64_SYSCFG) can be used to
>   determine if SME is enabled and/or to enable memory encryption::
> 
>           0xc0010010:
>                   Bit[23]   0 = memory encryption features are disabled
>                             1 = memory encryption features are enabled
> 
>   If SEV is supported, MSR 0xc0010131 (MSR_AMD64_SEV) can be used to determine if
>   SEV is active::
> 
>           0xc0010131:
>                   Bit[0]    0 = memory encryption is not active
>                             1 = memory encryption is active
> 
>   Linux relies on BIOS to set this bit if BIOS has determined that the reduction
>   in the physical address space as a result of enabling memory encryption (see
>   CPUID information above) will not conflict with the address space resource
>   requirements for the system.  If this bit is not set upon Linux startup then
>   Linux itself will not set it and memory encryption will not be possible.

I'll check with Brijesh on these.

Thanks!

-Mike




^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 33/43] x86/compressed: Add SEV-SNP feature detection/setup
  2022-02-08 13:50     ` Michael Roth
@ 2022-02-08 15:02       ` Borislav Petkov
  0 siblings, 0 replies; 115+ messages in thread
From: Borislav Petkov @ 2022-02-08 15:02 UTC (permalink / raw)
  To: Michael Roth
  Cc: Brijesh Singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy

On Tue, Feb 08, 2022 at 07:50:09AM -0600, Michael Roth wrote:
> I'm assuming that would be considered 'non-static' since it would need
> to be exported if sev-shared.c was compiled separately instead of
> directly #include'd.

No, that one can lose the prefix too. We'll cross that bridge when we
get to it.

> And then there's also these which are static helpers that are only used
> within sev-shared.c:
> 
>   snp_cpuid_info_get_ptr()

So looking at that one - and it felt weird reading that code because it
said "cpuid_info" but that's not an "info" - that should be:

struct snp_cpuid_table {
        u32 count;
        u32 __reserved1;
        u64 __reserved2;
        struct snp_cpuid_fn fn[SNP_CPUID_COUNT_MAX];
} __packed;

just call it what it is - a SNP CPUID *table*.

And then you can have

	const struct snp_cpuid_table *cpuid_tbl = snp_cpuid_get_table();

and that makes it crystal clear what this does.

>   snp_cpuid_calc_xsave_size()
>   snp_cpuid_get_validated_func()
>
>   snp_cpuid_check_range()

You can merge that small function into its only call site and put a
comment above the code:

	/* Check function is within the supported CPUID leafs ranges */

>   snp_cpuid_hv()
>   snp_cpuid_postprocess()
>   snp_cpuid()
> 
> but in those cases it seems useful to keep them grouped under the
> snp_cpuid_* prefix since they become ambiguous otherwise, and
> just using cpuid_* as a prefix (or suffix/etc) makes it unclear
> that they are only used for SNP and not for general CPUID handling.
> Should we leave those as-is?

Yap, the rest make sense to denote to what functionality they belong to.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 30/43] KVM: SEV: Add documentation for SEV-SNP CPUID Enforcement
  2022-02-07 23:48   ` Sean Christopherson
  2022-02-08 14:54     ` Michael Roth
@ 2022-02-08 15:11     ` Borislav Petkov
  1 sibling, 0 replies; 115+ messages in thread
From: Borislav Petkov @ 2022-02-08 15:11 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Brijesh Singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
	Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Michael Roth, Vlastimil Babka,
	Kirill A . Shutemov, Andi Kleen, Dr . David Alan Gilbert,
	brijesh.ksingh, tony.luck, marcorr, sathyanarayanan.kuppuswamy

On Mon, Feb 07, 2022 at 11:48:11PM +0000, Sean Christopherson wrote:
> It might be worth extracting the SEV details from x86/amd-memory-encryption.rst
> into virt/ as well.  A big chunk of that file appears to be SEV specific, and it

My personal preference would be to keep *all* SEV stuff in a single file
for easier grepping and have everything together.

But that's just me.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 02/43] KVM: SVM: Create a separate mapping for the SEV-ES save area
  2022-02-01 13:02   ` Borislav Petkov
@ 2022-02-09 15:02     ` Brijesh Singh
  0 siblings, 0 replies; 115+ messages in thread
From: Brijesh Singh @ 2022-02-09 15:02 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: brijesh.singh, x86, linux-kernel, kvm, linux-efi,
	platform-driver-x86, linux-coco, linux-mm, Thomas Gleixner,
	Ingo Molnar, Joerg Roedel, Tom Lendacky, H. Peter Anvin,
	Ard Biesheuvel, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Jim Mattson, Andy Lutomirski, Dave Hansen,
	Sergio Lopez, Peter Gonda, Peter Zijlstra, Srinivas Pandruvada,
	David Rientjes, Dov Murik, Tobin Feldman-Fitzthum, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, brijesh.ksingh, tony.luck, marcorr,
	sathyanarayanan.kuppuswamy, Venu Busireddy



On 2/1/22 7:02 AM, Borislav Petkov wrote:
> On Fri, Jan 28, 2022 at 11:17:23AM -0600, Brijesh Singh wrote:
>> From: Tom Lendacky <thomas.lendacky@amd.com>
>>
>> The save area for SEV-ES/SEV-SNP guests, as used by the hardware, is
>> different from the save area of a non SEV-ES/SEV-SNP guest.
>>
>> This is the first step in defining the multiple save areas to keep them
>> separate and ensuring proper operation amongst the different types of
>> guests. Create an SEV-ES/SEV-SNP save area and adjust usage to the new
>> save area definition where needed.
>>
>> Reviewed-by: Venu Busireddy <venu.busireddy@oracle.com>
>> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
>> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
>> ---
>>   arch/x86/include/asm/svm.h | 87 +++++++++++++++++++++++++++++---------
>>   arch/x86/kvm/svm/sev.c     | 24 +++++------
>>   arch/x86/kvm/svm/svm.h     |  2 +-
>>   3 files changed, 80 insertions(+), 33 deletions(-)
> 
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fr%2FYc2jzOunYej4vwSc%40zn.tnic&amp;data=04%7C01%7Cbrijesh.singh%40amd.com%7C909b88bb6fb14010859508d9e583268e%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637793173738404187%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata=uOR2wyzO1%2B6K%2B%2FhZCkwlRrfQutFqpSzXKh8RVbub9cA%3D&amp;reserved=0
> 

The rename may touch more files in KVM subsystem, so, I'll work on this 
feedback while adding the SNP host support, and will submit it as a 
per-requisite for the host support.

-Brijesh

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: [PATCH v9 39/43] x86/sev: Provide support for SNP guest request NAEs
  2022-02-01 20:17   ` Peter Gonda
@ 2022-03-03 14:53     ` Brijesh Singh
  0 siblings, 0 replies; 115+ messages in thread
From: Brijesh Singh @ 2022-03-03 14:53 UTC (permalink / raw)
  To: Peter Gonda
  Cc: Brijesh Singh, the arch/x86 maintainers, LKML, kvm list,
	linux-efi, platform-driver-x86, linux-coco, linux-mm,
	Thomas Gleixner, Ingo Molnar, Joerg Roedel, Tom Lendacky,
	H. Peter Anvin, Ard Biesheuvel, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Zijlstra,
	Srinivas Pandruvada, David Rientjes, Dov Murik,
	Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth,
	Vlastimil Babka, Kirill A . Shutemov, Andi Kleen,
	Dr . David Alan Gilbert, Tony Luck, Marc Orr,
	Sathyanarayanan Kuppuswamy

[-- Attachment #1: Type: text/plain, Size: 6846 bytes --]

On Tue, Feb 1, 2022 at 2:17 PM Peter Gonda <pgonda@google.com> wrote:

> On Fri, Jan 28, 2022 at 10:19 AM Brijesh Singh <brijesh.singh@amd.com>
> wrote:
> >
> > Version 2 of GHCB specification provides SNP_GUEST_REQUEST and
> > SNP_EXT_GUEST_REQUEST NAE that can be used by the SNP guest to
> communicate
> > with the PSP.
> >
> > While at it, add a snp_issue_guest_request() helper that will be used by
> > driver or other subsystem to issue the request to PSP.
> >
> > See SEV-SNP firmware and GHCB spec for more details.
> >
> > Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> > ---
> >  arch/x86/include/asm/sev-common.h |  3 ++
> >  arch/x86/include/asm/sev.h        | 14 ++++++++
> >  arch/x86/include/uapi/asm/svm.h   |  4 +++
> >  arch/x86/kernel/sev.c             | 55 +++++++++++++++++++++++++++++++
> >  4 files changed, 76 insertions(+)
> >
> > diff --git a/arch/x86/include/asm/sev-common.h
> b/arch/x86/include/asm/sev-common.h
> > index cd769984e929..442614879dad 100644
> > --- a/arch/x86/include/asm/sev-common.h
> > +++ b/arch/x86/include/asm/sev-common.h
> > @@ -128,6 +128,9 @@ struct snp_psc_desc {
> >         struct psc_entry entries[VMGEXIT_PSC_MAX_ENTRY];
> >  } __packed;
> >
> > +/* Guest message request error code */
> > +#define SNP_GUEST_REQ_INVALID_LEN      BIT_ULL(32)
> > +
> >  #define GHCB_MSR_TERM_REQ              0x100
> >  #define GHCB_MSR_TERM_REASON_SET_POS   12
> >  #define GHCB_MSR_TERM_REASON_SET_MASK  0xf
> > diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
> > index 219abb4590f2..9830ee1d6ef0 100644
> > --- a/arch/x86/include/asm/sev.h
> > +++ b/arch/x86/include/asm/sev.h
> > @@ -87,6 +87,14 @@ extern bool handle_vc_boot_ghcb(struct pt_regs *regs);
> >
> >  #define RMPADJUST_VMSA_PAGE_BIT                BIT(16)
> >
> > +/* SNP Guest message request */
> > +struct snp_req_data {
> > +       unsigned long req_gpa;
> > +       unsigned long resp_gpa;
> > +       unsigned long data_gpa;
> > +       unsigned int data_npages;
> > +};
> > +
> >  #ifdef CONFIG_AMD_MEM_ENCRYPT
> >  extern struct static_key_false sev_es_enable_key;
> >  extern void __sev_es_ist_enter(struct pt_regs *regs);
> > @@ -154,6 +162,7 @@ void snp_set_memory_private(unsigned long vaddr,
> unsigned int npages);
> >  void snp_set_wakeup_secondary_cpu(void);
> >  bool snp_init(struct boot_params *bp);
> >  void snp_abort(void);
> > +int snp_issue_guest_request(u64 exit_code, struct snp_req_data *input,
> unsigned long *fw_err);
> >  #else
> >  static inline void sev_es_ist_enter(struct pt_regs *regs) { }
> >  static inline void sev_es_ist_exit(void) { }
> > @@ -173,6 +182,11 @@ static inline void snp_set_memory_private(unsigned
> long vaddr, unsigned int npag
> >  static inline void snp_set_wakeup_secondary_cpu(void) { }
> >  static inline bool snp_init(struct boot_params *bp) { return false; }
> >  static inline void snp_abort(void) { }
> > +static inline int snp_issue_guest_request(u64 exit_code, struct
> snp_req_data *input,
> > +                                         unsigned long *fw_err)
> > +{
> > +       return -ENOTTY;
> > +}
> >  #endif
> >
> >  #endif
> > diff --git a/arch/x86/include/uapi/asm/svm.h
> b/arch/x86/include/uapi/asm/svm.h
> > index 8b4c57baec52..5b8bc2b65a5e 100644
> > --- a/arch/x86/include/uapi/asm/svm.h
> > +++ b/arch/x86/include/uapi/asm/svm.h
> > @@ -109,6 +109,8 @@
> >  #define SVM_VMGEXIT_SET_AP_JUMP_TABLE          0
> >  #define SVM_VMGEXIT_GET_AP_JUMP_TABLE          1
> >  #define SVM_VMGEXIT_PSC                                0x80000010
> > +#define SVM_VMGEXIT_GUEST_REQUEST              0x80000011
> > +#define SVM_VMGEXIT_EXT_GUEST_REQUEST          0x80000012
> >  #define SVM_VMGEXIT_AP_CREATION                        0x80000013
> >  #define SVM_VMGEXIT_AP_CREATE_ON_INIT          0
> >  #define SVM_VMGEXIT_AP_CREATE                  1
> > @@ -225,6 +227,8 @@
> >         { SVM_VMGEXIT_AP_HLT_LOOP,      "vmgexit_ap_hlt_loop" }, \
> >         { SVM_VMGEXIT_AP_JUMP_TABLE,    "vmgexit_ap_jump_table" }, \
> >         { SVM_VMGEXIT_PSC,      "vmgexit_page_state_change" }, \
> > +       { SVM_VMGEXIT_GUEST_REQUEST,            "vmgexit_guest_request"
> }, \
> > +       { SVM_VMGEXIT_EXT_GUEST_REQUEST,
> "vmgexit_ext_guest_request" }, \
> >         { SVM_VMGEXIT_AP_CREATION,      "vmgexit_ap_creation" }, \
> >         { SVM_VMGEXIT_HV_FEATURES,      "vmgexit_hypervisor_feature" }, \
> >         { SVM_EXIT_ERR,         "invalid_guest_state" }
> > diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
> > index cb97200bfda7..1d3ac83226fc 100644
> > --- a/arch/x86/kernel/sev.c
> > +++ b/arch/x86/kernel/sev.c
> > @@ -2122,3 +2122,58 @@ static int __init snp_check_cpuid_table(void)
> >  }
> >
> >  arch_initcall(snp_check_cpuid_table);
> > +
> > +int snp_issue_guest_request(u64 exit_code, struct snp_req_data *input,
> unsigned long *fw_err)
> > +{
> > +       struct ghcb_state state;
> > +       struct es_em_ctxt ctxt;
> > +       unsigned long flags;
> > +       struct ghcb *ghcb;
> > +       int ret;
> > +
> > +       if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
> > +               return -ENODEV;
> > +
> > +       /*
> > +        * __sev_get_ghcb() needs to run with IRQs disabled because it
> is using
> > +        * a per-CPU GHCB.
> > +        */
> > +       local_irq_save(flags);
> > +
> > +       ghcb = __sev_get_ghcb(&state);
> > +       if (!ghcb) {
> > +               ret = -EIO;
> > +               goto e_restore_irq;
> > +       }
> > +
> > +       vc_ghcb_invalidate(ghcb);
> > +
> > +       if (exit_code == SVM_VMGEXIT_EXT_GUEST_REQUEST) {
> > +               ghcb_set_rax(ghcb, input->data_gpa);
> > +               ghcb_set_rbx(ghcb, input->data_npages);
> > +       }
> > +
> > +       ret = sev_es_ghcb_hv_call(ghcb, true, &ctxt, exit_code,
> input->req_gpa, input->resp_gpa);
> > +       if (ret)
> > +               goto e_put;
> > +
> > +       if (ghcb->save.sw_exit_info_2) {
> > +               /* Number of expected pages are returned in RBX */
> > +               if (exit_code == SVM_VMGEXIT_EXT_GUEST_REQUEST &&
> > +                   ghcb->save.sw_exit_info_2 ==
> SNP_GUEST_REQ_INVALID_LEN)
> > +                       input->data_npages = ghcb_get_rbx(ghcb);
> > +
> > +               if (fw_err)
> > +                       *fw_err = ghcb->save.sw_exit_info_2;
>
> In the PSP driver we've had a bit of discussion around the fw_err and
> the return code and that it would be preferable to have fw_err be a
> required parameter. And then we can easily make sure fw_err is always
> non-zero when the return code is non-zero. Thoughts about doing the
> same inside the guest?
>
>
As per the GHCB spec, we will always have a non-zero error code. So, yes, I
can drop the if() check.

FYI, somehow this email did not show up on my @amd.com so I did not address
it in v10.

thanks

[-- Attachment #2: Type: text/html, Size: 8805 bytes --]

^ permalink raw reply	[flat|nested] 115+ messages in thread

end of thread, other threads:[~2022-03-03 14:54 UTC | newest]

Thread overview: 115+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-01-28 17:17 [PATCH v9 00/43] Add AMD Secure Nested Paging (SEV-SNP) Guest Support Brijesh Singh
2022-01-28 17:17 ` [PATCH v9 01/43] KVM: SVM: Define sev_features and vmpl field in the VMSA Brijesh Singh
2022-01-28 17:17 ` [PATCH v9 02/43] KVM: SVM: Create a separate mapping for the SEV-ES save area Brijesh Singh
2022-02-01 13:02   ` Borislav Petkov
2022-02-09 15:02     ` Brijesh Singh
2022-01-28 17:17 ` [PATCH v9 03/43] KVM: SVM: Create a separate mapping for the GHCB " Brijesh Singh
2022-01-28 17:17 ` [PATCH v9 04/43] KVM: SVM: Update the SEV-ES save area mapping Brijesh Singh
2022-01-28 17:17 ` [PATCH v9 05/43] x86/compressed/64: Detect/setup SEV/SME features earlier in boot Brijesh Singh
2022-02-01 18:08   ` Borislav Petkov
2022-02-01 20:35     ` Michael Roth
2022-02-01 21:28       ` Borislav Petkov
2022-02-02  0:52         ` Michael Roth
2022-02-02  6:09           ` Borislav Petkov
2022-02-02 17:28             ` Michael Roth
2022-02-02 18:57               ` Borislav Petkov
2022-01-28 17:17 ` [PATCH v9 06/43] x86/sev: " Brijesh Singh
2022-01-28 17:17 ` [PATCH v9 07/43] x86/mm: Extend cc_attr to include AMD SEV-SNP Brijesh Singh
2022-01-28 17:17 ` [PATCH v9 08/43] x86/sev: Define the Linux specific guest termination reasons Brijesh Singh
2022-01-28 17:17 ` [PATCH v9 09/43] x86/sev: Save the negotiated GHCB version Brijesh Singh
2022-01-28 17:17 ` [PATCH v9 10/43] x86/sev: Check SEV-SNP features support Brijesh Singh
2022-02-01 19:59   ` Borislav Petkov
2022-02-02 14:28     ` Brijesh Singh
2022-02-02 15:37       ` Borislav Petkov
2022-01-28 17:17 ` [PATCH v9 11/43] x86/sev: Add a helper for the PVALIDATE instruction Brijesh Singh
2022-01-28 17:17 ` [PATCH v9 12/43] x86/sev: Check the vmpl level Brijesh Singh
2022-01-28 17:17 ` [PATCH v9 13/43] x86/compressed: Add helper for validating pages in the decompression stage Brijesh Singh
2022-01-28 17:17 ` [PATCH v9 14/43] x86/compressed: Register GHCB memory when SEV-SNP is active Brijesh Singh
2022-01-28 17:17 ` [PATCH v9 15/43] x86/sev: " Brijesh Singh
2022-02-02 10:34   ` Borislav Petkov
2022-02-02 14:29     ` Brijesh Singh
2022-01-28 17:17 ` [PATCH v9 16/43] x86/sev: Add helper for validating pages in early enc attribute changes Brijesh Singh
2022-01-28 17:17 ` [PATCH v9 17/43] x86/kernel: Make the .bss..decrypted section shared in RMP table Brijesh Singh
2022-02-02 11:06   ` Borislav Petkov
2022-01-28 17:17 ` [PATCH v9 18/43] x86/kernel: Validate ROM memory before accessing when SEV-SNP is active Brijesh Singh
2022-02-02 15:41   ` Borislav Petkov
2022-01-28 17:17 ` [PATCH v9 19/43] x86/mm: Add support to validate memory when changing C-bit Brijesh Singh
2022-02-02 16:10   ` Borislav Petkov
2022-01-28 17:17 ` [PATCH v9 20/43] x86/sev: Use SEV-SNP AP creation to start secondary CPUs Brijesh Singh
2022-02-03  6:50   ` Borislav Petkov
2022-01-28 17:17 ` [PATCH v9 21/43] x86/head/64: Re-enable stack protection Brijesh Singh
2022-01-28 17:17 ` [PATCH v9 22/43] x86/sev: Move MSR-based VMGEXITs for CPUID to helper Brijesh Singh
2022-02-03 13:59   ` Borislav Petkov
2022-01-28 17:17 ` [PATCH v9 23/43] KVM: x86: Move lookup of indexed CPUID leafs " Brijesh Singh
2022-02-03 15:16   ` Borislav Petkov
2022-02-03 16:44     ` Michael Roth
2022-02-05 12:58       ` Borislav Petkov
2022-01-28 17:17 ` [PATCH v9 24/43] x86/compressed/acpi: Move EFI detection " Brijesh Singh
2022-02-03 14:39   ` Borislav Petkov
2022-01-28 17:17 ` [PATCH v9 25/43] x86/compressed/acpi: Move EFI system table lookup " Brijesh Singh
2022-02-03 14:48   ` Borislav Petkov
2022-01-28 17:17 ` [PATCH v9 26/43] x86/compressed/acpi: Move EFI config " Brijesh Singh
2022-02-03 15:13   ` Borislav Petkov
2022-01-28 17:17 ` [PATCH v9 27/43] x86/compressed/acpi: Move EFI vendor " Brijesh Singh
2022-01-28 17:17 ` [PATCH v9 28/43] x86/compressed/acpi: Move EFI kexec handling into common code Brijesh Singh
2022-02-04 16:09   ` Borislav Petkov
2022-01-28 17:17 ` [PATCH v9 29/43] x86/boot: Add Confidential Computing type to setup_data Brijesh Singh
2022-02-04 16:21   ` Borislav Petkov
2022-02-04 17:41     ` Brijesh Singh
2022-01-28 17:17 ` [PATCH v9 30/43] KVM: SEV: Add documentation for SEV-SNP CPUID Enforcement Brijesh Singh
2022-02-07 23:48   ` Sean Christopherson
2022-02-08 14:54     ` Michael Roth
2022-02-08 15:11     ` Borislav Petkov
2022-01-28 17:17 ` [PATCH v9 31/43] x86/compressed/64: Add support for SEV-SNP CPUID table in #VC handlers Brijesh Singh
2022-02-05 10:54   ` Borislav Petkov
2022-02-05 15:42     ` Michael Roth
2022-02-05 16:22     ` Michael Roth
2022-02-06 13:37       ` Borislav Petkov
2022-02-07 15:37         ` Michael Roth
2022-02-07 17:52           ` Borislav Petkov
2022-01-28 17:17 ` [PATCH v9 32/43] x86/boot: Add a pointer to Confidential Computing blob in bootparams Brijesh Singh
2022-02-05 13:07   ` Borislav Petkov
2022-01-28 17:17 ` [PATCH v9 33/43] x86/compressed: Add SEV-SNP feature detection/setup Brijesh Singh
2022-02-06 16:41   ` Borislav Petkov
2022-02-08 13:50     ` Michael Roth
2022-02-08 15:02       ` Borislav Petkov
2022-01-28 17:17 ` [PATCH v9 34/43] x86/compressed: Use firmware-validated CPUID leaves for SEV-SNP guests Brijesh Singh
2022-01-28 17:17 ` [PATCH v9 35/43] x86/compressed: Export and rename add_identity_map() Brijesh Singh
2022-02-06 19:01   ` Borislav Petkov
2022-01-28 17:17 ` [PATCH v9 36/43] x86/compressed/64: Add identity mapping for Confidential Computing blob Brijesh Singh
2022-02-06 19:21   ` Borislav Petkov
2022-01-28 17:17 ` [PATCH v9 37/43] x86/sev: Add SEV-SNP feature detection/setup Brijesh Singh
2022-02-06 19:38   ` Borislav Petkov
2022-02-08  5:25     ` Michael Roth
2022-01-28 17:17 ` [PATCH v9 38/43] x86/sev: Use firmware-validated CPUID for SEV-SNP guests Brijesh Singh
2022-02-05 17:19   ` Michael Roth
2022-02-06 15:46     ` Borislav Petkov
2022-02-07 17:00       ` Michael Roth
2022-02-07 18:43         ` Borislav Petkov
2022-02-06 19:50   ` Borislav Petkov
2022-01-28 17:18 ` [PATCH v9 39/43] x86/sev: Provide support for SNP guest request NAEs Brijesh Singh
2022-02-01 20:17   ` Peter Gonda
2022-03-03 14:53     ` Brijesh Singh
2022-01-28 17:18 ` [PATCH v9 40/43] x86/sev: Register SEV-SNP guest request platform device Brijesh Singh
2022-02-01 20:21   ` Peter Gonda
2022-02-02 16:27     ` Brijesh Singh
2022-02-06 20:05   ` Borislav Petkov
2022-01-28 17:18 ` [PATCH v9 41/43] virt: Add SEV-SNP guest driver Brijesh Singh
2022-02-01 20:33   ` Peter Gonda
2022-02-06 22:39   ` Borislav Petkov
2022-02-07 14:41     ` Brijesh Singh
2022-02-07 15:22       ` Borislav Petkov
2022-01-28 17:18 ` [PATCH v9 42/43] virt: sevguest: Add support to derive key Brijesh Singh
2022-02-01 20:39   ` Peter Gonda
2022-02-02 22:31     ` Brijesh Singh
2022-02-07  8:52   ` Borislav Petkov
2022-02-07 16:23     ` Brijesh Singh
2022-02-07 19:09       ` Dov Murik
2022-02-07 20:08         ` Brijesh Singh
2022-02-07 20:28           ` Borislav Petkov
2022-02-08  7:56           ` Dov Murik
2022-02-08 10:51             ` Borislav Petkov
2022-02-08 14:14             ` Brijesh Singh
2022-01-28 17:18 ` [PATCH v9 43/43] virt: sevguest: Add support to get extended report Brijesh Singh
2022-02-01 20:43   ` Peter Gonda
2022-02-07  9:16   ` Borislav Petkov

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.