* [PATCH v2 0/3] x86: Fix SEV guest regression
@ 2018-08-28 22:12 Brijesh Singh
  2018-08-28 22:12 ` [PATCH v2 1/3] x86/mm: Restructure sme_encrypt_kernel() Brijesh Singh
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Brijesh Singh @ 2018-08-28 22:12 UTC (permalink / raw)
  To: x86, linux-kernel, kvm
  Cc: Brijesh Singh, Tom Lendacky, Thomas Gleixner, Borislav Petkov,
	Paolo Bonzini, Sean Christopherson, Radim Krčmář

The following commit:

  368a540e0232 ("x86/kvmclock: Remove memblock dependency")

  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=368a540e0232ad446931f5a4e8a5e06f69f21343

introduced a SEV guest regression.

The guest physical addresses holding wall_clock and hv_clock_boot are
shared with the hypervisor and must be mapped with C=0 when SEV is
active. To clear the C-bit, kernel_physical_mapping_init() is used to
split the large pages. The above commit moved the kvmclock
initialization very early in the boot process, where
kernel_physical_mapping_init() fails to allocate memory while splitting
the large pages.

To solve this, add a special .data..decrypted section that can be used
to hold such shared variables. Early boot code maps this section with
C=0. The section is PMD-aligned and sized to avoid the need to split
the pages. Callers can use the __decrypted attribute to place variables
in the .data..decrypted section.
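
For example, a shared variable can then be declared with the attribute,
as done for kvmclock in patch 3/3:

  static struct pvclock_wall_clock wall_clock __decrypted;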

Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@suse.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>

Changes since v1:
 - move the mapping re-arrangement logic into a new patch
 - move the definition of __start_data_* into mem_encrypt.h
 - map the workarea buffer as encrypted when SEV is enabled
 - enhance sme_populate_pgd() to update the pte/pmd flags when a
   mapping exists

Brijesh Singh (3):
  x86/mm: Restructure sme_encrypt_kernel()
  x86/mm: add .data..decrypted section to hold shared variables
  x86/kvm: use __decrypted attribute when declaring shared variables

 arch/x86/include/asm/mem_encrypt.h |   6 +
 arch/x86/kernel/head64.c           |   9 ++
 arch/x86/kernel/kvmclock.c         |  30 ++++-
 arch/x86/kernel/vmlinux.lds.S      |  17 +++
 arch/x86/mm/mem_encrypt_identity.c | 232 +++++++++++++++++++++++++++----------
 5 files changed, 227 insertions(+), 67 deletions(-)

-- 
2.7.4


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v2 1/3] x86/mm: Restructure sme_encrypt_kernel()
  2018-08-28 22:12 [PATCH v2 0/3] x86: Fix SEV guest regression Brijesh Singh
@ 2018-08-28 22:12 ` Brijesh Singh
  2018-08-29 13:13   ` Borislav Petkov
  2018-08-28 22:12 ` [PATCH v2 2/3] x86/mm: add .data..decrypted section to hold shared variables Brijesh Singh
  2018-08-28 22:12 ` [PATCH v2 3/3] x86/kvm: use __decrypted attribute when declaring " Brijesh Singh
  2 siblings, 1 reply; 13+ messages in thread
From: Brijesh Singh @ 2018-08-28 22:12 UTC (permalink / raw)
  To: x86, linux-kernel, kvm
  Cc: Brijesh Singh, stable, Tom Lendacky, Thomas Gleixner,
	Borislav Petkov, H. Peter Anvin, Paolo Bonzini,
	Sean Christopherson, Radim Krčmář

Re-arrange sme_encrypt_kernel() by moving the workarea map/unmap
logic into a separate static function. There are no functional changes
in this patch. The restructuring will allow sme_encrypt_kernel() to be
expanded in the future.

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Cc: stable@vger.kernel.org
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: kvm@vger.kernel.org
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@suse.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: linux-kernel@vger.kernel.org
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: kvm@vger.kernel.org
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
---
 arch/x86/mm/mem_encrypt_identity.c | 160 ++++++++++++++++++++++++-------------
 1 file changed, 104 insertions(+), 56 deletions(-)

diff --git a/arch/x86/mm/mem_encrypt_identity.c b/arch/x86/mm/mem_encrypt_identity.c
index 7ae3686..bf6097e 100644
--- a/arch/x86/mm/mem_encrypt_identity.c
+++ b/arch/x86/mm/mem_encrypt_identity.c
@@ -72,6 +72,22 @@ struct sme_populate_pgd_data {
 	unsigned long vaddr_end;
 };
 
+struct sme_workarea_data {
+	unsigned long kernel_start;
+	unsigned long kernel_end;
+	unsigned long kernel_len;
+
+	unsigned long initrd_start;
+	unsigned long initrd_end;
+	unsigned long initrd_len;
+
+	unsigned long workarea_start;
+	unsigned long workarea_end;
+	unsigned long workarea_len;
+
+	unsigned long decrypted_base;
+};
+
 static char sme_cmdline_arg[] __initdata = "mem_encrypt";
 static char sme_cmdline_on[]  __initdata = "on";
 static char sme_cmdline_off[] __initdata = "off";
@@ -266,19 +282,17 @@ static unsigned long __init sme_pgtable_calc(unsigned long len)
 	return entries + tables;
 }
 
-void __init sme_encrypt_kernel(struct boot_params *bp)
+static void __init build_workarea_map(struct boot_params *bp,
+				      struct sme_workarea_data *wa,
+				      struct sme_populate_pgd_data *ppd)
 {
 	unsigned long workarea_start, workarea_end, workarea_len;
 	unsigned long execute_start, execute_end, execute_len;
 	unsigned long kernel_start, kernel_end, kernel_len;
 	unsigned long initrd_start, initrd_end, initrd_len;
-	struct sme_populate_pgd_data ppd;
 	unsigned long pgtable_area_len;
 	unsigned long decrypted_base;
 
-	if (!sme_active())
-		return;
-
 	/*
 	 * Prepare for encrypting the kernel and initrd by building new
 	 * pagetables with the necessary attributes needed to encrypt the
@@ -358,17 +372,17 @@ void __init sme_encrypt_kernel(struct boot_params *bp)
 	 * pagetables and when the new encrypted and decrypted kernel
 	 * mappings are populated.
 	 */
-	ppd.pgtable_area = (void *)execute_end;
+	ppd->pgtable_area = (void *)execute_end;
 
 	/*
 	 * Make sure the current pagetable structure has entries for
 	 * addressing the workarea.
 	 */
-	ppd.pgd = (pgd_t *)native_read_cr3_pa();
-	ppd.paddr = workarea_start;
-	ppd.vaddr = workarea_start;
-	ppd.vaddr_end = workarea_end;
-	sme_map_range_decrypted(&ppd);
+	ppd->pgd = (pgd_t *)native_read_cr3_pa();
+	ppd->paddr = workarea_start;
+	ppd->vaddr = workarea_start;
+	ppd->vaddr_end = workarea_end;
+	sme_map_range_decrypted(ppd);
 
 	/* Flush the TLB - no globals so cr3 is enough */
 	native_write_cr3(__native_read_cr3());
@@ -379,9 +393,9 @@ void __init sme_encrypt_kernel(struct boot_params *bp)
 	 * then be populated with new PUDs and PMDs as the encrypted and
 	 * decrypted kernel mappings are created.
 	 */
-	ppd.pgd = ppd.pgtable_area;
-	memset(ppd.pgd, 0, sizeof(pgd_t) * PTRS_PER_PGD);
-	ppd.pgtable_area += sizeof(pgd_t) * PTRS_PER_PGD;
+	ppd->pgd = ppd->pgtable_area;
+	memset(ppd->pgd, 0, sizeof(pgd_t) * PTRS_PER_PGD);
+	ppd->pgtable_area += sizeof(pgd_t) * PTRS_PER_PGD;
 
 	/*
 	 * A different PGD index/entry must be used to get different
@@ -399,75 +413,109 @@ void __init sme_encrypt_kernel(struct boot_params *bp)
 	decrypted_base <<= PGDIR_SHIFT;
 
 	/* Add encrypted kernel (identity) mappings */
-	ppd.paddr = kernel_start;
-	ppd.vaddr = kernel_start;
-	ppd.vaddr_end = kernel_end;
-	sme_map_range_encrypted(&ppd);
+	ppd->paddr = kernel_start;
+	ppd->vaddr = kernel_start;
+	ppd->vaddr_end = kernel_end;
+	sme_map_range_encrypted(ppd);
 
 	/* Add decrypted, write-protected kernel (non-identity) mappings */
-	ppd.paddr = kernel_start;
-	ppd.vaddr = kernel_start + decrypted_base;
-	ppd.vaddr_end = kernel_end + decrypted_base;
-	sme_map_range_decrypted_wp(&ppd);
+	ppd->paddr = kernel_start;
+	ppd->vaddr = kernel_start + decrypted_base;
+	ppd->vaddr_end = kernel_end + decrypted_base;
+	sme_map_range_decrypted_wp(ppd);
 
 	if (initrd_len) {
 		/* Add encrypted initrd (identity) mappings */
-		ppd.paddr = initrd_start;
-		ppd.vaddr = initrd_start;
-		ppd.vaddr_end = initrd_end;
-		sme_map_range_encrypted(&ppd);
+		ppd->paddr = initrd_start;
+		ppd->vaddr = initrd_start;
+		ppd->vaddr_end = initrd_end;
+		sme_map_range_encrypted(ppd);
 		/*
 		 * Add decrypted, write-protected initrd (non-identity) mappings
 		 */
-		ppd.paddr = initrd_start;
-		ppd.vaddr = initrd_start + decrypted_base;
-		ppd.vaddr_end = initrd_end + decrypted_base;
-		sme_map_range_decrypted_wp(&ppd);
+		ppd->paddr = initrd_start;
+		ppd->vaddr = initrd_start + decrypted_base;
+		ppd->vaddr_end = initrd_end + decrypted_base;
+		sme_map_range_decrypted_wp(ppd);
 	}
 
 	/* Add decrypted workarea mappings to both kernel mappings */
-	ppd.paddr = workarea_start;
-	ppd.vaddr = workarea_start;
-	ppd.vaddr_end = workarea_end;
-	sme_map_range_decrypted(&ppd);
+	ppd->paddr = workarea_start;
+	ppd->vaddr = workarea_start;
+	ppd->vaddr_end = workarea_end;
+	sme_map_range_decrypted(ppd);
 
-	ppd.paddr = workarea_start;
-	ppd.vaddr = workarea_start + decrypted_base;
-	ppd.vaddr_end = workarea_end + decrypted_base;
-	sme_map_range_decrypted(&ppd);
+	ppd->paddr = workarea_start;
+	ppd->vaddr = workarea_start + decrypted_base;
+	ppd->vaddr_end = workarea_end + decrypted_base;
+	sme_map_range_decrypted(ppd);
 
-	/* Perform the encryption */
-	sme_encrypt_execute(kernel_start, kernel_start + decrypted_base,
-			    kernel_len, workarea_start, (unsigned long)ppd.pgd);
+	wa->kernel_start = kernel_start;
+	wa->kernel_end = kernel_end;
+	wa->kernel_len = kernel_len;
 
-	if (initrd_len)
-		sme_encrypt_execute(initrd_start, initrd_start + decrypted_base,
-				    initrd_len, workarea_start,
-				    (unsigned long)ppd.pgd);
+	wa->initrd_start = initrd_start;
+	wa->initrd_end = initrd_end;
+	wa->initrd_len = initrd_len;
+
+	wa->workarea_start = workarea_start;
+	wa->workarea_end = workarea_end;
+	wa->workarea_len = workarea_len;
+
+	wa->decrypted_base = decrypted_base;
+}
 
+static void __init remove_workarea_map(struct sme_workarea_data *wa,
+				       struct sme_populate_pgd_data *ppd)
+{
 	/*
 	 * At this point we are running encrypted.  Remove the mappings for
 	 * the decrypted areas - all that is needed for this is to remove
 	 * the PGD entry/entries.
 	 */
-	ppd.vaddr = kernel_start + decrypted_base;
-	ppd.vaddr_end = kernel_end + decrypted_base;
-	sme_clear_pgd(&ppd);
-
-	if (initrd_len) {
-		ppd.vaddr = initrd_start + decrypted_base;
-		ppd.vaddr_end = initrd_end + decrypted_base;
-		sme_clear_pgd(&ppd);
+	ppd->vaddr = wa->kernel_start + wa->decrypted_base;
+	ppd->vaddr_end = wa->kernel_end + wa->decrypted_base;
+	sme_clear_pgd(ppd);
+
+	if (wa->initrd_len) {
+		ppd->vaddr = wa->initrd_start + wa->decrypted_base;
+		ppd->vaddr_end = wa->initrd_end + wa->decrypted_base;
+		sme_clear_pgd(ppd);
 	}
 
-	ppd.vaddr = workarea_start + decrypted_base;
-	ppd.vaddr_end = workarea_end + decrypted_base;
-	sme_clear_pgd(&ppd);
+	ppd->vaddr = wa->workarea_start + wa->decrypted_base;
+	ppd->vaddr_end = wa->workarea_end + wa->decrypted_base;
+	sme_clear_pgd(ppd);
 
 	/* Flush the TLB - no globals so cr3 is enough */
 	native_write_cr3(__native_read_cr3());
 }
 
+void __init sme_encrypt_kernel(struct boot_params *bp)
+{
+	struct sme_populate_pgd_data ppd;
+	struct sme_workarea_data wa;
+
+	if (!sme_active())
+		return;
+
+	build_workarea_map(bp, &wa, &ppd);
+
+	/* When SEV is active, encrypt kernel and initrd */
+	sme_encrypt_execute(wa.kernel_start,
+			    wa.kernel_start + wa.decrypted_base,
+			    wa.kernel_len, wa.workarea_start,
+			    (unsigned long)ppd.pgd);
+
+	if (wa.initrd_len)
+		sme_encrypt_execute(wa.initrd_start,
+				    wa.initrd_start + wa.decrypted_base,
+				    wa.initrd_len, wa.workarea_start,
+				    (unsigned long)ppd.pgd);
+
+	remove_workarea_map(&wa, &ppd);
+}
+
 void __init sme_enable(struct boot_params *bp)
 {
 	const char *cmdline_ptr, *cmdline_arg, *cmdline_on, *cmdline_off;
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v2 2/3] x86/mm: add .data..decrypted section to hold shared variables
  2018-08-28 22:12 [PATCH v2 0/3] x86: Fix SEV guest regression Brijesh Singh
  2018-08-28 22:12 ` [PATCH v2 1/3] x86/mm: Restructure sme_encrypt_kernel() Brijesh Singh
@ 2018-08-28 22:12 ` Brijesh Singh
  2018-08-29 13:59   ` Borislav Petkov
  2018-08-29 15:03   ` Sean Christopherson
  2018-08-28 22:12 ` [PATCH v2 3/3] x86/kvm: use __decrypted attribute when declaring " Brijesh Singh
  2 siblings, 2 replies; 13+ messages in thread
From: Brijesh Singh @ 2018-08-28 22:12 UTC (permalink / raw)
  To: x86, linux-kernel, kvm
  Cc: Brijesh Singh, stable, Tom Lendacky, Thomas Gleixner,
	Borislav Petkov, H. Peter Anvin, Paolo Bonzini,
	Sean Christopherson, Radim Krčmář

kvmclock defines few static variables which are shared with hypervisor
during the kvmclock initialization.

When SEV is active, memory is encrypted with a guest-specific key, and
if guest OS wants to share the memory region with hypervisor then it must
clear the C-bit before sharing it. Currently, we use
kernel_physical_mapping_init() to split large pages before clearing the
C-bit on shared pages. But the kernel_physical_mapping_init fails when
called from the kvmclock initialization (mainly because memblock allocator
was not ready).

The '__decrypted' can be used to define a shared variable; the variables
will be put in the .data.decryption section. This section is mapped with
C=0 early in the boot, we also ensure that the initialized values are
updated to match with C=0 (i.e perform an in-place decryption). The
.data..decrypted section is PMD aligned and sized so that we avoid the
need to split the large pages when mapping this section.
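
As a quick sanity check of the resulting layout (not part of this
patch), the boundary symbols of the new section can be inspected in the
built image, e.g.:

  $ nm vmlinux | grep data_decrypted

Both reported addresses should come out 2MB (PMD_SIZE) aligned.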

The sme_encrypt_kernel() was used to perform the in-place encryption
of the Linux kernel and initrd when SME is active. The routine has been
enhanced to decrypt the .data..decryption section for both SME and SEV
cases.

While reusing the sme_populate_pgd() we found that the function does not
update the flags if the pte/pmd entry already exists. The patch updates
the function to take care of it.

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Cc: stable@vger.kernel.org
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: kvm@vger.kernel.org
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@suse.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: linux-kernel@vger.kernel.org
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: kvm@vger.kernel.org
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
---
 arch/x86/include/asm/mem_encrypt.h |   6 +++
 arch/x86/kernel/head64.c           |   9 ++++
 arch/x86/kernel/vmlinux.lds.S      |  17 +++++++
 arch/x86/mm/mem_encrypt_identity.c | 100 +++++++++++++++++++++++++++++--------
 4 files changed, 112 insertions(+), 20 deletions(-)

diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h
index c064383..802b2eb 100644
--- a/arch/x86/include/asm/mem_encrypt.h
+++ b/arch/x86/include/asm/mem_encrypt.h
@@ -52,6 +52,8 @@ void __init mem_encrypt_init(void);
 bool sme_active(void);
 bool sev_active(void);
 
+#define __decrypted __attribute__((__section__(".data..decrypted")))
+
 #else	/* !CONFIG_AMD_MEM_ENCRYPT */
 
 #define sme_me_mask	0ULL
@@ -77,6 +79,8 @@ early_set_memory_decrypted(unsigned long vaddr, unsigned long size) { return 0;
 static inline int __init
 early_set_memory_encrypted(unsigned long vaddr, unsigned long size) { return 0; }
 
+#define __decrypted
+
 #endif	/* CONFIG_AMD_MEM_ENCRYPT */
 
 /*
@@ -88,6 +92,8 @@ early_set_memory_encrypted(unsigned long vaddr, unsigned long size) { return 0;
 #define __sme_pa(x)		(__pa(x) | sme_me_mask)
 #define __sme_pa_nodebug(x)	(__pa_nodebug(x) | sme_me_mask)
 
+extern char __start_data_decrypted[], __end_data_decrypted[];
+
 #endif	/* __ASSEMBLY__ */
 
 #endif	/* __X86_MEM_ENCRYPT_H__ */
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 8047379..3e03129 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -112,6 +112,7 @@ static bool __head check_la57_support(unsigned long physaddr)
 unsigned long __head __startup_64(unsigned long physaddr,
 				  struct boot_params *bp)
 {
+	unsigned long vaddr, vaddr_end;
 	unsigned long load_delta, *p;
 	unsigned long pgtable_flags;
 	pgdval_t *pgd;
@@ -234,6 +235,14 @@ unsigned long __head __startup_64(unsigned long physaddr,
 	/* Encrypt the kernel and related (if SME is active) */
 	sme_encrypt_kernel(bp);
 
+	/* Clear the memory encryption mask from the decrypted section */
+	vaddr = (unsigned long)__start_data_decrypted;
+	vaddr_end = (unsigned long)__end_data_decrypted;
+	for (; vaddr < vaddr_end; vaddr += PMD_SIZE) {
+		i = pmd_index(vaddr);
+		pmd[i] -= sme_get_me_mask();
+	}
+
 	/*
 	 * Return the SME encryption mask (if SME is active) to be used as a
 	 * modifier for the initial pgdir entry programmed into CR3.
diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index 8bde0a4..0ef9320 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -89,6 +89,21 @@ PHDRS {
 	note PT_NOTE FLAGS(0);          /* ___ */
 }
 
+/*
+ * This section contains data which will be mapped as decrypted. Memory
+ * encryption operates on a page basis. But we make this section a pmd
+ * aligned to avoid spliting the pages while mapping the section early.
+ *
+ * Note: We use a separate section so that only this section gets
+ * decrypted to avoid exposing more than we wish.
+ */
+#define DATA_DECRYPTED_SECTION						\
+	. = ALIGN(PMD_SIZE);						\
+	__start_data_decrypted = .;					\
+	*(.data..decrypted);						\
+	. = ALIGN(PMD_SIZE);						\
+	__end_data_decrypted = .;					\
+
 SECTIONS
 {
 #ifdef CONFIG_X86_32
@@ -171,6 +186,8 @@ SECTIONS
 		/* rarely changed data like cpu maps */
 		READ_MOSTLY_DATA(INTERNODE_CACHE_BYTES)
 
+		DATA_DECRYPTED_SECTION
+
 		/* End of data section */
 		_edata = .;
 	} :data
diff --git a/arch/x86/mm/mem_encrypt_identity.c b/arch/x86/mm/mem_encrypt_identity.c
index bf6097e..88c1cce 100644
--- a/arch/x86/mm/mem_encrypt_identity.c
+++ b/arch/x86/mm/mem_encrypt_identity.c
@@ -51,6 +51,8 @@
 				 (_PAGE_PAT | _PAGE_PWT))
 
 #define PMD_FLAGS_ENC		(PMD_FLAGS_LARGE | _PAGE_ENC)
+#define PMD_FLAGS_ENC_WP	((PMD_FLAGS_ENC & ~_PAGE_CACHE_MASK) | \
+				 (_PAGE_PAT | _PAGE_PWT))
 
 #define PTE_FLAGS		(__PAGE_KERNEL_EXEC & ~_PAGE_GLOBAL)
 
@@ -59,6 +61,8 @@
 				 (_PAGE_PAT | _PAGE_PWT))
 
 #define PTE_FLAGS_ENC		(PTE_FLAGS | _PAGE_ENC)
+#define PTE_FLAGS_ENC_WP	((PTE_FLAGS_ENC & ~_PAGE_CACHE_MASK) | \
+				 (_PAGE_PAT | _PAGE_PWT))
 
 struct sme_populate_pgd_data {
 	void    *pgtable_area;
@@ -154,9 +158,6 @@ static void __init sme_populate_pgd_large(struct sme_populate_pgd_data *ppd)
 		return;
 
 	pmd = pmd_offset(pud, ppd->vaddr);
-	if (pmd_large(*pmd))
-		return;
-
 	set_pmd(pmd, __pmd(ppd->paddr | ppd->pmd_flags));
 }
 
@@ -182,8 +183,7 @@ static void __init sme_populate_pgd(struct sme_populate_pgd_data *ppd)
 		return;
 
 	pte = pte_offset_map(pmd, ppd->vaddr);
-	if (pte_none(*pte))
-		set_pte(pte, __pte(ppd->paddr | ppd->pte_flags));
+	set_pte(pte, __pte(ppd->paddr | ppd->pte_flags));
 }
 
 static void __init __sme_map_range_pmd(struct sme_populate_pgd_data *ppd)
@@ -235,6 +235,11 @@ static void __init sme_map_range_encrypted(struct sme_populate_pgd_data *ppd)
 	__sme_map_range(ppd, PMD_FLAGS_ENC, PTE_FLAGS_ENC);
 }
 
+static void __init sme_map_range_encrypted_wp(struct sme_populate_pgd_data *ppd)
+{
+	__sme_map_range(ppd, PMD_FLAGS_ENC_WP, PTE_FLAGS_ENC_WP);
+}
+
 static void __init sme_map_range_decrypted(struct sme_populate_pgd_data *ppd)
 {
 	__sme_map_range(ppd, PMD_FLAGS_DEC, PTE_FLAGS_DEC);
@@ -382,7 +387,10 @@ static void __init build_workarea_map(struct boot_params *bp,
 	ppd->paddr = workarea_start;
 	ppd->vaddr = workarea_start;
 	ppd->vaddr_end = workarea_end;
-	sme_map_range_decrypted(ppd);
+	if (sev_active())
+		sme_map_range_encrypted(ppd);
+	else
+		sme_map_range_decrypted(ppd);
 
 	/* Flush the TLB - no globals so cr3 is enough */
 	native_write_cr3(__native_read_cr3());
@@ -439,16 +447,27 @@ static void __init build_workarea_map(struct boot_params *bp,
 		sme_map_range_decrypted_wp(ppd);
 	}
 
-	/* Add decrypted workarea mappings to both kernel mappings */
+	/*
+	 * When SEV is active, kernel is already encrypted hence mapping
+	 * the initial workarea_start as encrypted. When SME is active,
+	 * the kernel is not encrypted hence add a decrypted workarea
+	 * mappings to both kernel mappings.
+	 */
 	ppd->paddr = workarea_start;
 	ppd->vaddr = workarea_start;
 	ppd->vaddr_end = workarea_end;
-	sme_map_range_decrypted(ppd);
+	if (sev_active())
+		sme_map_range_encrypted(ppd);
+	else
+		sme_map_range_decrypted(ppd);
 
 	ppd->paddr = workarea_start;
 	ppd->vaddr = workarea_start + decrypted_base;
 	ppd->vaddr_end = workarea_end + decrypted_base;
-	sme_map_range_decrypted(ppd);
+	if (sev_active())
+		sme_map_range_encrypted(ppd);
+	else
+		sme_map_range_decrypted(ppd);
 
 	wa->kernel_start = kernel_start;
 	wa->kernel_end = kernel_end;
@@ -491,28 +510,69 @@ static void __init remove_workarea_map(struct sme_workarea_data *wa,
 	native_write_cr3(__native_read_cr3());
 }
 
+static void __init decrypt_data_decrypted_section(struct sme_workarea_data *wa,
+						  struct sme_populate_pgd_data *ppd)
+{
+	unsigned long decrypted_start, decrypted_end, decrypted_len;
+
+	/* Physical addresses of decrypted data section */
+	decrypted_start = __pa_symbol(__start_data_decrypted);
+	decrypted_end = ALIGN(__pa_symbol(__end_data_decrypted), PMD_PAGE_SIZE);
+	decrypted_len = decrypted_end - decrypted_start;
+
+	if (!decrypted_len)
+		return;
+
+	/* Add decrypted mapping for the section (identity) */
+	ppd->paddr = decrypted_start;
+	ppd->vaddr = decrypted_start;
+	ppd->vaddr_end = decrypted_end;
+	sme_map_range_decrypted(ppd);
+
+	/* Add encrypted-wp mapping for the section (non-identity) */
+	ppd->paddr = decrypted_start;
+	ppd->vaddr = decrypted_start + wa->decrypted_base;
+	ppd->vaddr_end = decrypted_end + wa->decrypted_base;
+	sme_map_range_encrypted_wp(ppd);
+
+	/* Perform in-place decryption */
+	sme_encrypt_execute(decrypted_start,
+			    decrypted_start + wa->decrypted_base,
+			    decrypted_len, wa->workarea_start,
+			    (unsigned long)ppd->pgd);
+
+	ppd->vaddr = decrypted_start + wa->decrypted_base;
+	ppd->vaddr_end = decrypted_end + wa->decrypted_base;
+	sme_clear_pgd(ppd);
+}
+
 void __init sme_encrypt_kernel(struct boot_params *bp)
 {
 	struct sme_populate_pgd_data ppd;
 	struct sme_workarea_data wa;
 
-	if (!sme_active())
+	if (!mem_encrypt_active())
 		return;
 
 	build_workarea_map(bp, &wa, &ppd);
 
-	/* When SEV is active, encrypt kernel and initrd */
-	sme_encrypt_execute(wa.kernel_start,
-			    wa.kernel_start + wa.decrypted_base,
-			    wa.kernel_len, wa.workarea_start,
-			    (unsigned long)ppd.pgd);
-
-	if (wa.initrd_len)
-		sme_encrypt_execute(wa.initrd_start,
-				    wa.initrd_start + wa.decrypted_base,
-				    wa.initrd_len, wa.workarea_start,
+	/* When SME is active, encrypt kernel and initrd */
+	if (sme_active()) {
+		sme_encrypt_execute(wa.kernel_start,
+				    wa.kernel_start + wa.decrypted_base,
+				    wa.kernel_len, wa.workarea_start,
 				    (unsigned long)ppd.pgd);
 
+		if (wa.initrd_len)
+			sme_encrypt_execute(wa.initrd_start,
+					    wa.initrd_start + wa.decrypted_base,
+					    wa.initrd_len, wa.workarea_start,
+					    (unsigned long)ppd.pgd);
+	}
+
+	/* Decrypt the contents of .data..decrypted section */
+	decrypt_data_decrypted_section(&wa, &ppd);
+
 	remove_workarea_map(&wa, &ppd);
 }
 
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v2 3/3] x86/kvm: use __decrypted attribute when declaring shared variables
  2018-08-28 22:12 [PATCH v2 0/3] x86: Fix SEV guest regression Brijesh Singh
  2018-08-28 22:12 ` [PATCH v2 1/3] x86/mm: Restructure sme_encrypt_kernel() Brijesh Singh
  2018-08-28 22:12 ` [PATCH v2 2/3] x86/mm: add .data..decrypted section to hold shared variables Brijesh Singh
@ 2018-08-28 22:12 ` Brijesh Singh
  2 siblings, 0 replies; 13+ messages in thread
From: Brijesh Singh @ 2018-08-28 22:12 UTC (permalink / raw)
  To: x86, linux-kernel, kvm
  Cc: Brijesh Singh, stable, Tom Lendacky, Thomas Gleixner,
	Borislav Petkov, H. Peter Anvin, Paolo Bonzini,
	Sean Christopherson, Radim Krčmář

The following commit:

  368a540e0232 ("x86/kvmclock: Remove memblock dependency")

caused a SEV guest regression. When SEV is active, the shared variables
(wall_clock and hv_clock_boot) are mapped with C=0 to ensure that both
the guest and the hypervisor are able to access the data. To map the
variables, kernel_physical_mapping_init() is used to split the large
pages, but this routine fails to allocate a new page. Before the above
commit, kvmclock initialization was called after the memory allocator
was available, but now it is called very early in the boot process.

The previous patch added a special .data..decrypted section to hold
such shared variables; this section is mapped with C=0 very early
during boot. Use the __decrypted attribute to place wall_clock and
hv_clock_boot in the .data..decrypted section so that they are mapped
with C=0.

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Fixes: 368a540e0232 ("x86/kvmclock: Remove memblock dependency")
Cc: stable@vger.kernel.org
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: kvm@vger.kernel.org
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@suse.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: linux-kernel@vger.kernel.org
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: kvm@vger.kernel.org
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
---
 arch/x86/kernel/kvmclock.c | 30 +++++++++++++++++++++++++-----
 1 file changed, 25 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
index 1e67646..08f5f8a 100644
--- a/arch/x86/kernel/kvmclock.c
+++ b/arch/x86/kernel/kvmclock.c
@@ -28,6 +28,7 @@
 #include <linux/sched/clock.h>
 #include <linux/mm.h>
 #include <linux/slab.h>
+#include <linux/set_memory.h>
 
 #include <asm/hypervisor.h>
 #include <asm/mem_encrypt.h>
@@ -61,8 +62,8 @@ early_param("no-kvmclock-vsyscall", parse_no_kvmclock_vsyscall);
 	(PAGE_SIZE / sizeof(struct pvclock_vsyscall_time_info))
 
 static struct pvclock_vsyscall_time_info
-			hv_clock_boot[HVC_BOOT_ARRAY_SIZE] __aligned(PAGE_SIZE);
-static struct pvclock_wall_clock wall_clock;
+			hv_clock_boot[HVC_BOOT_ARRAY_SIZE] __decrypted __aligned(PAGE_SIZE);
+static struct pvclock_wall_clock wall_clock __decrypted;
 static DEFINE_PER_CPU(struct pvclock_vsyscall_time_info *, hv_clock_per_cpu);
 
 static inline struct pvclock_vcpu_time_info *this_cpu_pvti(void)
@@ -267,10 +268,29 @@ static int kvmclock_setup_percpu(unsigned int cpu)
 		return 0;
 
 	/* Use the static page for the first CPUs, allocate otherwise */
-	if (cpu < HVC_BOOT_ARRAY_SIZE)
+	if (cpu < HVC_BOOT_ARRAY_SIZE) {
 		p = &hv_clock_boot[cpu];
-	else
-		p = kzalloc(sizeof(*p), GFP_KERNEL);
+	} else {
+		int rc;
+		unsigned int sz = sizeof(*p);
+
+		if (sev_active())
+			sz = PAGE_ALIGN(sz);
+
+		p = kzalloc(sz, GFP_KERNEL);
+
+		/*
+		 * The physical address of per-cpu variable will be shared with
+		 * the hypervisor. Let's clear the C-bit before we assign the
+		 * memory to per_cpu variable.
+		 */
+		if (p && sev_active()) {
+			rc = set_memory_decrypted((unsigned long)p, sz >> PAGE_SHIFT);
+			if (rc)
+				return rc;
+			memset(p, 0, sz);
+		}
+	}
 
 	per_cpu(hv_clock_per_cpu, cpu) = p;
 	return p ? 0 : -ENOMEM;
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 1/3] x86/mm: Restructure sme_encrypt_kernel()
  2018-08-28 22:12 ` [PATCH v2 1/3] x86/mm: Restructure sme_encrypt_kernel() Brijesh Singh
@ 2018-08-29 13:13   ` Borislav Petkov
  0 siblings, 0 replies; 13+ messages in thread
From: Borislav Petkov @ 2018-08-29 13:13 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, stable, Tom Lendacky, Thomas Gleixner,
	H. Peter Anvin, Paolo Bonzini, Sean Christopherson,
	Radim Krčmář

On Tue, Aug 28, 2018 at 05:12:55PM -0500, Brijesh Singh wrote:
> Re-arrange sme_encrypt_kernel() by moving the workarea map/unmap
> logic into a separate static function. There are no functional changes
> in this patch. The restructuring will allow sme_encrypt_kernel() to be
> expanded in the future.
> 
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> Cc: stable@vger.kernel.org

This patch is going to be a prerequisite for the actual fix so you
probably should write here:

Cc: stable@vger.kernel.org # prerequisite for <patch name>

or better yet do a separate backport/submission to stable@ once those
have been upstreamed.

...

> +void __init sme_encrypt_kernel(struct boot_params *bp)
> +{
> +	struct sme_populate_pgd_data ppd;
> +	struct sme_workarea_data wa;
> +
> +	if (!sme_active())
> +		return;
> +
> +	build_workarea_map(bp, &wa, &ppd);
> +
> +	/* When SEV is active, encrypt kernel and initrd */
> +	sme_encrypt_execute(wa.kernel_start,
> +			    wa.kernel_start + wa.decrypted_base,
> +			    wa.kernel_len, wa.workarea_start,
> +			    (unsigned long)ppd.pgd);
> +
> +	if (wa.initrd_len)
> +		sme_encrypt_execute(wa.initrd_start,
> +				    wa.initrd_start + wa.decrypted_base,
> +				    wa.initrd_len, wa.workarea_start,
> +				    (unsigned long)ppd.pgd);
> +
> +	remove_workarea_map(&wa, &ppd);

teardown_workarea_map() is a better naming counterpart to
build_workarea_map().

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 2/3] x86/mm: add .data..decrypted section to hold shared variables
  2018-08-28 22:12 ` [PATCH v2 2/3] x86/mm: add .data..decrypted section to hold shared variables Brijesh Singh
@ 2018-08-29 13:59   ` Borislav Petkov
  2018-08-29 14:37     ` Brijesh Singh
  2018-08-29 15:54     ` Brijesh Singh
  2018-08-29 15:03   ` Sean Christopherson
  1 sibling, 2 replies; 13+ messages in thread
From: Borislav Petkov @ 2018-08-29 13:59 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, stable, Tom Lendacky, Thomas Gleixner,
	H. Peter Anvin, Paolo Bonzini, Sean Christopherson,
	Radim Krčmář

On Tue, Aug 28, 2018 at 05:12:56PM -0500, Brijesh Singh wrote:
> kvmclock defines few static variables which are shared with hypervisor

							... with the hypervisor ...

> during the kvmclock initialization.
> 
> When SEV is active, memory is encrypted with a guest-specific key, and
> if guest OS wants to share the memory region with hypervisor then it must
> clear the C-bit before sharing it. Currently, we use
> kernel_physical_mapping_init() to split large pages before clearing the
> C-bit on shared pages. But the kernel_physical_mapping_init fails when

"But it fails when... "

> called from the kvmclock initialization (mainly because memblock allocator
> was not ready).

"... is not ready that early during boot)."

> The '__decrypted' can be used to define a shared variable; the variables

"Add a __decrypted section attribute which can be used when defining
such shared variable. The so-defined variables will be placed..."

> will be put in the .data.decryption section. This section is mapped with

" ... in the .data..decrypted section."

> C=0 early in the boot, we also ensure that the initialized values are

"... early during boot, "

> updated to match with C=0 (i.e perform an in-place decryption). The
> .data..decrypted section is PMD aligned and sized so that we avoid the

"... PMD-aligned ... "

> need to split the large pages when mapping this section.
> 
> The sme_encrypt_kernel() was used to perform the in-place encryption
> of the Linux kernel and initrd when SME is active. The routine has been
> enhanced to decrypt the .data..decryption section for both SME and SEV
> cases.

".data..decrypted"

> 
> While reusing the sme_populate_pgd() we found that the function does not
> update the flags if the pte/pmd entry already exists. The patch updates
> the function to take care of it.


Change the tone to impartial:

"While at it, fix sme_populate_pgd() to update page flags if the PMD/PTE
entry already exists."

And avoid using "This patch" - what this patch does, should be visible
to the enlightened onlooker.

> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> Cc: stable@vger.kernel.org
> Cc: Tom Lendacky <thomas.lendacky@amd.com>
> Cc: kvm@vger.kernel.org
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Borislav Petkov <bp@suse.de>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: linux-kernel@vger.kernel.org
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Sean Christopherson <sean.j.christopherson@intel.com>
> Cc: kvm@vger.kernel.org
> Cc: "Radim Krčmář" <rkrcmar@redhat.com>
> ---
>  arch/x86/include/asm/mem_encrypt.h |   6 +++
>  arch/x86/kernel/head64.c           |   9 ++++
>  arch/x86/kernel/vmlinux.lds.S      |  17 +++++++
>  arch/x86/mm/mem_encrypt_identity.c | 100 +++++++++++++++++++++++++++++--------
>  4 files changed, 112 insertions(+), 20 deletions(-)
> 
> diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h
> index c064383..802b2eb 100644
> --- a/arch/x86/include/asm/mem_encrypt.h
> +++ b/arch/x86/include/asm/mem_encrypt.h
> @@ -52,6 +52,8 @@ void __init mem_encrypt_init(void);
>  bool sme_active(void);
>  bool sev_active(void);
>  
> +#define __decrypted __attribute__((__section__(".data..decrypted")))
> +
>  #else	/* !CONFIG_AMD_MEM_ENCRYPT */
>  
>  #define sme_me_mask	0ULL
> @@ -77,6 +79,8 @@ early_set_memory_decrypted(unsigned long vaddr, unsigned long size) { return 0;
>  static inline int __init
>  early_set_memory_encrypted(unsigned long vaddr, unsigned long size) { return 0; }
>  
> +#define __decrypted
> +
>  #endif	/* CONFIG_AMD_MEM_ENCRYPT */
>  
>  /*
> @@ -88,6 +92,8 @@ early_set_memory_encrypted(unsigned long vaddr, unsigned long size) { return 0;
>  #define __sme_pa(x)		(__pa(x) | sme_me_mask)
>  #define __sme_pa_nodebug(x)	(__pa_nodebug(x) | sme_me_mask)
>  
> +extern char __start_data_decrypted[], __end_data_decrypted[];
> +
>  #endif	/* __ASSEMBLY__ */
>  
>  #endif	/* __X86_MEM_ENCRYPT_H__ */
> diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
> index 8047379..3e03129 100644
> --- a/arch/x86/kernel/head64.c
> +++ b/arch/x86/kernel/head64.c
> @@ -112,6 +112,7 @@ static bool __head check_la57_support(unsigned long physaddr)
>  unsigned long __head __startup_64(unsigned long physaddr,
>  				  struct boot_params *bp)
>  {
> +	unsigned long vaddr, vaddr_end;
>  	unsigned long load_delta, *p;
>  	unsigned long pgtable_flags;
>  	pgdval_t *pgd;
> @@ -234,6 +235,14 @@ unsigned long __head __startup_64(unsigned long physaddr,
>  	/* Encrypt the kernel and related (if SME is active) */
>  	sme_encrypt_kernel(bp);
>  
> +	/* Clear the memory encryption mask from the decrypted section */

End sentences with a fullstop.

> +	vaddr = (unsigned long)__start_data_decrypted;
> +	vaddr_end = (unsigned long)__end_data_decrypted;
> +	for (; vaddr < vaddr_end; vaddr += PMD_SIZE) {
> +		i = pmd_index(vaddr);
> +		pmd[i] -= sme_get_me_mask();
> +	}

This needs to be a no-op on !SME machines. Hint: if (sme_active())...

>  	/*
>  	 * Return the SME encryption mask (if SME is active) to be used as a
>  	 * modifier for the initial pgdir entry programmed into CR3.
> diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
> index 8bde0a4..0ef9320 100644
> --- a/arch/x86/kernel/vmlinux.lds.S
> +++ b/arch/x86/kernel/vmlinux.lds.S
> @@ -89,6 +89,21 @@ PHDRS {
>  	note PT_NOTE FLAGS(0);          /* ___ */
>  }
>  
> +/*
> + * This section contains data which will be mapped as decrypted. Memory
> + * encryption operates on a page basis. But we make this section a pmd

"... make this section PMD-aligned ..."

Also, avoid the "we" and formulate in passive voice.

> + * aligned to avoid spliting the pages while mapping the section early.
> + *
> + * Note: We use a separate section so that only this section gets
> + * decrypted to avoid exposing more than we wish.
> + */
> +#define DATA_DECRYPTED_SECTION						\

DATA_DECRYPTED is perfectly fine.

> +	. = ALIGN(PMD_SIZE);						\
> +	__start_data_decrypted = .;					\
> +	*(.data..decrypted);						\
> +	. = ALIGN(PMD_SIZE);						\
> +	__end_data_decrypted = .;					\
> +
>  SECTIONS
>  {
>  #ifdef CONFIG_X86_32
> @@ -171,6 +186,8 @@ SECTIONS
>  		/* rarely changed data like cpu maps */
>  		READ_MOSTLY_DATA(INTERNODE_CACHE_BYTES)
>  
> +		DATA_DECRYPTED_SECTION
> +
>  		/* End of data section */
>  		_edata = .;
>  	} :data
> diff --git a/arch/x86/mm/mem_encrypt_identity.c b/arch/x86/mm/mem_encrypt_identity.c
> index bf6097e..88c1cce 100644
> --- a/arch/x86/mm/mem_encrypt_identity.c
> +++ b/arch/x86/mm/mem_encrypt_identity.c
> @@ -51,6 +51,8 @@
>  				 (_PAGE_PAT | _PAGE_PWT))
>  
>  #define PMD_FLAGS_ENC		(PMD_FLAGS_LARGE | _PAGE_ENC)
> +#define PMD_FLAGS_ENC_WP	((PMD_FLAGS_ENC & ~_PAGE_CACHE_MASK) | \
> +				 (_PAGE_PAT | _PAGE_PWT))
>  
>  #define PTE_FLAGS		(__PAGE_KERNEL_EXEC & ~_PAGE_GLOBAL)
>  
> @@ -59,6 +61,8 @@
>  				 (_PAGE_PAT | _PAGE_PWT))
>  
>  #define PTE_FLAGS_ENC		(PTE_FLAGS | _PAGE_ENC)
> +#define PTE_FLAGS_ENC_WP	((PTE_FLAGS_ENC & ~_PAGE_CACHE_MASK) | \
> +				 (_PAGE_PAT | _PAGE_PWT))
>  
>  struct sme_populate_pgd_data {
>  	void    *pgtable_area;
> @@ -154,9 +158,6 @@ static void __init sme_populate_pgd_large(struct sme_populate_pgd_data *ppd)
>  		return;
>  
>  	pmd = pmd_offset(pud, ppd->vaddr);
> -	if (pmd_large(*pmd))
> -		return;
> -
>  	set_pmd(pmd, __pmd(ppd->paddr | ppd->pmd_flags));
>  }
>  
> @@ -182,8 +183,7 @@ static void __init sme_populate_pgd(struct sme_populate_pgd_data *ppd)
>  		return;
>  
>  	pte = pte_offset_map(pmd, ppd->vaddr);
> -	if (pte_none(*pte))
> -		set_pte(pte, __pte(ppd->paddr | ppd->pte_flags));
> +	set_pte(pte, __pte(ppd->paddr | ppd->pte_flags));
>  }
>  

This looks like it belongs in a prepatch fix.

>  static void __init __sme_map_range_pmd(struct sme_populate_pgd_data *ppd)
> @@ -235,6 +235,11 @@ static void __init sme_map_range_encrypted(struct sme_populate_pgd_data *ppd)
>  	__sme_map_range(ppd, PMD_FLAGS_ENC, PTE_FLAGS_ENC);
>  }
>  
> +static void __init sme_map_range_encrypted_wp(struct sme_populate_pgd_data *ppd)
> +{
> +	__sme_map_range(ppd, PMD_FLAGS_ENC_WP, PTE_FLAGS_ENC_WP);
> +}
> +
>  static void __init sme_map_range_decrypted(struct sme_populate_pgd_data *ppd)
>  {
>  	__sme_map_range(ppd, PMD_FLAGS_DEC, PTE_FLAGS_DEC);

These changes with the _WP flags and helper addition belong in a pre-patch.

> @@ -382,7 +387,10 @@ static void __init build_workarea_map(struct boot_params *bp,
>  	ppd->paddr = workarea_start;
>  	ppd->vaddr = workarea_start;
>  	ppd->vaddr_end = workarea_end;
> -	sme_map_range_decrypted(ppd);
> +	if (sev_active())
> +		sme_map_range_encrypted(ppd);
> +	else
> +		sme_map_range_decrypted(ppd);
>  
>  	/* Flush the TLB - no globals so cr3 is enough */
>  	native_write_cr3(__native_read_cr3());
> @@ -439,16 +447,27 @@ static void __init build_workarea_map(struct boot_params *bp,
>  		sme_map_range_decrypted_wp(ppd);
>  	}
>  
> -	/* Add decrypted workarea mappings to both kernel mappings */
> +	/*
> +	 * When SEV is active, kernel is already encrypted hence mapping
> +	 * the initial workarea_start as encrypted. When SME is active,
> +	 * the kernel is not encrypted hence add a decrypted workarea

s/ a//

> +	 * mappings to both kernel mappings.
> +	 */
>  	ppd->paddr = workarea_start;
>  	ppd->vaddr = workarea_start;
>  	ppd->vaddr_end = workarea_end;
> -	sme_map_range_decrypted(ppd);
> +	if (sev_active())
> +		sme_map_range_encrypted(ppd);
> +	else
> +		sme_map_range_decrypted(ppd);
>  
>  	ppd->paddr = workarea_start;
>  	ppd->vaddr = workarea_start + decrypted_base;
>  	ppd->vaddr_end = workarea_end + decrypted_base;
> -	sme_map_range_decrypted(ppd);
> +	if (sev_active())
> +		sme_map_range_encrypted(ppd);
> +	else
> +		sme_map_range_decrypted(ppd);
>  
>  	wa->kernel_start = kernel_start;
>  	wa->kernel_end = kernel_end;
> @@ -491,28 +510,69 @@ static void __init remove_workarea_map(struct sme_workarea_data *wa,
>  	native_write_cr3(__native_read_cr3());
>  }
>  
> +static void __init decrypt_data_decrypted_section(struct sme_workarea_data *wa,

That function name could use some clarifying change...

> +						  struct sme_populate_pgd_data *ppd)
> +{
> +	unsigned long decrypted_start, decrypted_end, decrypted_len;
> +
> +	/* Physical addresses of decrypted data section */
> +	decrypted_start = __pa_symbol(__start_data_decrypted);
> +	decrypted_end = ALIGN(__pa_symbol(__end_data_decrypted), PMD_PAGE_SIZE);
> +	decrypted_len = decrypted_end - decrypted_start;
> +
> +	if (!decrypted_len)
> +		return;
> +
> +	/* Add decrypted mapping for the section (identity) */
> +	ppd->paddr = decrypted_start;
> +	ppd->vaddr = decrypted_start;
> +	ppd->vaddr_end = decrypted_end;
> +	sme_map_range_decrypted(ppd);
> +
> +	/* Add encrypted-wp mapping for the section (non-identity) */
> +	ppd->paddr = decrypted_start;
> +	ppd->vaddr = decrypted_start + wa->decrypted_base;
> +	ppd->vaddr_end = decrypted_end + wa->decrypted_base;
> +	sme_map_range_encrypted_wp(ppd);
> +
> +	/* Perform in-place decryption */
> +	sme_encrypt_execute(decrypted_start,
> +			    decrypted_start + wa->decrypted_base,
> +			    decrypted_len, wa->workarea_start,
> +			    (unsigned long)ppd->pgd);
> +
> +	ppd->vaddr = decrypted_start + wa->decrypted_base;
> +	ppd->vaddr_end = decrypted_end + wa->decrypted_base;
> +	sme_clear_pgd(ppd);
> +}
> +
>  void __init sme_encrypt_kernel(struct boot_params *bp)
>  {
>  	struct sme_populate_pgd_data ppd;
>  	struct sme_workarea_data wa;
>  
> -	if (!sme_active())
> +	if (!mem_encrypt_active())
>  		return;
>  
>  	build_workarea_map(bp, &wa, &ppd);
>  
> -	/* When SEV is active, encrypt kernel and initrd */
> -	sme_encrypt_execute(wa.kernel_start,
> -			    wa.kernel_start + wa.decrypted_base,
> -			    wa.kernel_len, wa.workarea_start,
> -			    (unsigned long)ppd.pgd);
> -
> -	if (wa.initrd_len)
> -		sme_encrypt_execute(wa.initrd_start,
> -				    wa.initrd_start + wa.decrypted_base,
> -				    wa.initrd_len, wa.workarea_start,
> +	/* When SME is active, encrypt kernel and initrd */
> +	if (sme_active()) {
> +		sme_encrypt_execute(wa.kernel_start,
> +				    wa.kernel_start + wa.decrypted_base,
> +				    wa.kernel_len, wa.workarea_start,
>  				    (unsigned long)ppd.pgd);
>  
> +		if (wa.initrd_len)
> +			sme_encrypt_execute(wa.initrd_start,
> +					    wa.initrd_start + wa.decrypted_base,
> +					    wa.initrd_len, wa.workarea_start,
> +					    (unsigned long)ppd.pgd);
> +	}
> +
> +	/* Decrypt the contents of .data..decrypted section */
> +	decrypt_data_decrypted_section(&wa, &ppd);
> +
>  	remove_workarea_map(&wa, &ppd);
>  }
>  
> -- 
> 2.7.4
> 

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 2/3] x86/mm: add .data..decrypted section to hold shared variables
  2018-08-29 13:59   ` Borislav Petkov
@ 2018-08-29 14:37     ` Brijesh Singh
  2018-08-30  9:21       ` Borislav Petkov
  2018-08-29 15:54     ` Brijesh Singh
  1 sibling, 1 reply; 13+ messages in thread
From: Brijesh Singh @ 2018-08-29 14:37 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: brijesh.singh, x86, linux-kernel, kvm, stable, Tom Lendacky,
	Thomas Gleixner, H. Peter Anvin, Paolo Bonzini,
	Sean Christopherson, Radim Krčmář



On 08/29/2018 08:59 AM, Borislav Petkov wrote:
> On Tue, Aug 28, 2018 at 05:12:56PM -0500, Brijesh Singh wrote:
>> kvmclock defines few static variables which are shared with hypervisor
> 
> 							... with the hypervisor ...
> 
>> during the kvmclock initialization.
>>
>> When SEV is active, memory is encrypted with a guest-specific key, and
>> if guest OS wants to share the memory region with hypervisor then it must
>> clear the C-bit before sharing it. Currently, we use
>> kernel_physical_mapping_init() to split large pages before clearing the
>> C-bit on shared pages. But the kernel_physical_mapping_init fails when
> 
> "But it fails when..."
> 
>> called from the kvmclock initialization (mainly because memblock allocator
>> was not ready).
> 
> "... is not ready that early during boot)."
> 
>> The '__decrypted' can be used to define a shared variable; the variables
> 
> "Add a __decrypted section attribute which can be used when defining
> such shared variable. The so-defined variables will be placed..."
> 
>> will be put in the .data.decryption section. This section is mapped with
> 
> " ... in the .data..decrypted section."
> 
>> C=0 early in the boot, we also ensure that the initialized values are
> 
> "... early during boot,"
> 
>> updated to match with C=0 (i.e perform an in-place decryption). The
>> .data..decrypted section is PMD aligned and sized so that we avoid the
> 
> "... PMD-aligned ..."
> 
>> need to split the large pages when mapping this section.
>>
>> The sme_encrypt_kernel() was used to perform the in-place encryption
>> of the Linux kernel and initrd when SME is active. The routine has been
>> enhanced to decrypt the .data..decryption section for both SME and SEV
>> cases.
> 
> ".data..decrypted"
> 
>>
>> While reusing the sme_populate_pgd() we found that the function does not
>> update the flags if the pte/pmd entry already exists. The patch updates
>> the function to take care of it.
> 
> 
> Change the tone to impartial:
> 
> "While at it, fix sme_populate_pgd() to update page flags if the PMD/PTE
> entry already exists."
> 
> And avoid using "This patch" - what this patch does, should be visible
> to the enlightened onlooker.
> 


Thanks Boris, I will incorporate your edits into the commit message.


>> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
>> Cc: stable@vger.kernel.org
>> Cc: Tom Lendacky <thomas.lendacky@amd.com>
>> Cc: kvm@vger.kernel.org
>> Cc: Thomas Gleixner <tglx@linutronix.de>
>> Cc: Borislav Petkov <bp@suse.de>
>> Cc: "H. Peter Anvin" <hpa@zytor.com>
>> Cc: linux-kernel@vger.kernel.org
>> Cc: Paolo Bonzini <pbonzini@redhat.com>
>> Cc: Sean Christopherson <sean.j.christopherson@intel.com>
>> Cc: kvm@vger.kernel.org
>> Cc: "Radim Krčmář" <rkrcmar@redhat.com>
>> ---
>>   arch/x86/include/asm/mem_encrypt.h |   6 +++
>>   arch/x86/kernel/head64.c           |   9 ++++
>>   arch/x86/kernel/vmlinux.lds.S      |  17 +++++++
>>   arch/x86/mm/mem_encrypt_identity.c | 100 +++++++++++++++++++++++++++++--------
>>   4 files changed, 112 insertions(+), 20 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h
>> index c064383..802b2eb 100644
>> --- a/arch/x86/include/asm/mem_encrypt.h
>> +++ b/arch/x86/include/asm/mem_encrypt.h
>> @@ -52,6 +52,8 @@ void __init mem_encrypt_init(void);
>>   bool sme_active(void);
>>   bool sev_active(void);
>>   
>> +#define __decrypted __attribute__((__section__(".data..decrypted")))
>> +
>>   #else	/* !CONFIG_AMD_MEM_ENCRYPT */
>>   
>>   #define sme_me_mask	0ULL
>> @@ -77,6 +79,8 @@ early_set_memory_decrypted(unsigned long vaddr, unsigned long size) { return 0;
>>   static inline int __init
>>   early_set_memory_encrypted(unsigned long vaddr, unsigned long size) { return 0; }
>>   
>> +#define __decrypted
>> +
>>   #endif	/* CONFIG_AMD_MEM_ENCRYPT */
>>   
>>   /*
>> @@ -88,6 +92,8 @@ early_set_memory_encrypted(unsigned long vaddr, unsigned long size) { return 0;
>>   #define __sme_pa(x)		(__pa(x) | sme_me_mask)
>>   #define __sme_pa_nodebug(x)	(__pa_nodebug(x) | sme_me_mask)
>>   
>> +extern char __start_data_decrypted[], __end_data_decrypted[];
>> +
>>   #endif	/* __ASSEMBLY__ */
>>   
>>   #endif	/* __X86_MEM_ENCRYPT_H__ */
>> diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
>> index 8047379..3e03129 100644
>> --- a/arch/x86/kernel/head64.c
>> +++ b/arch/x86/kernel/head64.c
>> @@ -112,6 +112,7 @@ static bool __head check_la57_support(unsigned long physaddr)
>>   unsigned long __head __startup_64(unsigned long physaddr,
>>   				  struct boot_params *bp)
>>   {
>> +	unsigned long vaddr, vaddr_end;
>>   	unsigned long load_delta, *p;
>>   	unsigned long pgtable_flags;
>>   	pgdval_t *pgd;
>> @@ -234,6 +235,14 @@ unsigned long __head __startup_64(unsigned long physaddr,
>>   	/* Encrypt the kernel and related (if SME is active) */
>>   	sme_encrypt_kernel(bp);
>>   
>> +	/* Clear the memory encryption mask from the decrypted section */
> 
> End sentences with a fullstop.


Noted.

> 
>> +	vaddr = (unsigned long)__start_data_decrypted;
>> +	vaddr_end = (unsigned long)__end_data_decrypted;
>> +	for (; vaddr < vaddr_end; vaddr += PMD_SIZE) {
>> +		i = pmd_index(vaddr);
>> +		pmd[i] -= sme_get_me_mask();
>> +	}
> 
> This needs to be a no-op on !SME machines. Hint: if (sme_active())...

Sure, I will add the check before calling this loop.

> 
>>   	/*
>>   	 * Return the SME encryption mask (if SME is active) to be used as a
>>   	 * modifier for the initial pgdir entry programmed into CR3.
>> diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
>> index 8bde0a4..0ef9320 100644
>> --- a/arch/x86/kernel/vmlinux.lds.S
>> +++ b/arch/x86/kernel/vmlinux.lds.S
>> @@ -89,6 +89,21 @@ PHDRS {
>>   	note PT_NOTE FLAGS(0);          /* ___ */
>>   }
>>   
>> +/*
>> + * This section contains data which will be mapped as decrypted. Memory
>> + * encryption operates on a page basis. But we make this section a pmd
> 
> "... make this section PMD-aligned ..."
> 
> Also, avoid the "we" and formulate in passive voice.


Noted.


> 
>> + * aligned to avoid spliting the pages while mapping the section early.
>> + *
>> + * Note: We use a separate section so that only this section gets
>> + * decrypted to avoid exposing more than we wish.
>> + */
>> +#define DATA_DECRYPTED_SECTION						\
> 
> DATA_DECRYPTED is perfectly fine.


Noted.

> 
>> +	. = ALIGN(PMD_SIZE);						\
>> +	__start_data_decrypted = .;					\
>> +	*(.data..decrypted);						\
>> +	. = ALIGN(PMD_SIZE);						\
>> +	__end_data_decrypted = .;					\
>> +
>>   SECTIONS
>>   {
>>   #ifdef CONFIG_X86_32
>> @@ -171,6 +186,8 @@ SECTIONS
>>   		/* rarely changed data like cpu maps */
>>   		READ_MOSTLY_DATA(INTERNODE_CACHE_BYTES)
>>   
>> +		DATA_DECRYPTED_SECTION
>> +
>>   		/* End of data section */
>>   		_edata = .;
>>   	} :data
>> diff --git a/arch/x86/mm/mem_encrypt_identity.c b/arch/x86/mm/mem_encrypt_identity.c
>> index bf6097e..88c1cce 100644
>> --- a/arch/x86/mm/mem_encrypt_identity.c
>> +++ b/arch/x86/mm/mem_encrypt_identity.c
>> @@ -51,6 +51,8 @@
>>   				 (_PAGE_PAT | _PAGE_PWT))
>>   
>>   #define PMD_FLAGS_ENC		(PMD_FLAGS_LARGE | _PAGE_ENC)
>> +#define PMD_FLAGS_ENC_WP	((PMD_FLAGS_ENC & ~_PAGE_CACHE_MASK) | \
>> +				 (_PAGE_PAT | _PAGE_PWT))
>>   
>>   #define PTE_FLAGS		(__PAGE_KERNEL_EXEC & ~_PAGE_GLOBAL)
>>   
>> @@ -59,6 +61,8 @@
>>   				 (_PAGE_PAT | _PAGE_PWT))
>>   
>>   #define PTE_FLAGS_ENC		(PTE_FLAGS | _PAGE_ENC)
>> +#define PTE_FLAGS_ENC_WP	((PTE_FLAGS_ENC & ~_PAGE_CACHE_MASK) | \
>> +				 (_PAGE_PAT | _PAGE_PWT))
>>   
>>   struct sme_populate_pgd_data {
>>   	void    *pgtable_area;
>> @@ -154,9 +158,6 @@ static void __init sme_populate_pgd_large(struct sme_populate_pgd_data *ppd)
>>   		return;
>>   
>>   	pmd = pmd_offset(pud, ppd->vaddr);
>> -	if (pmd_large(*pmd))
>> -		return;
>> -
>>   	set_pmd(pmd, __pmd(ppd->paddr | ppd->pmd_flags));
>>   }
>>   
>> @@ -182,8 +183,7 @@ static void __init sme_populate_pgd(struct sme_populate_pgd_data *ppd)
>>   		return;
>>   
>>   	pte = pte_offset_map(pmd, ppd->vaddr);
>> -	if (pte_none(*pte))
>> -		set_pte(pte, __pte(ppd->paddr | ppd->pte_flags));
>> +	set_pte(pte, __pte(ppd->paddr | ppd->pte_flags));
>>   }
>>   
> 
> This looks like it belongs in a prepatch fix.


Sure, I can do a prepatch for this change.


> 
>>   static void __init __sme_map_range_pmd(struct sme_populate_pgd_data *ppd)
>> @@ -235,6 +235,11 @@ static void __init sme_map_range_encrypted(struct sme_populate_pgd_data *ppd)
>>   	__sme_map_range(ppd, PMD_FLAGS_ENC, PTE_FLAGS_ENC);
>>   }
>>   
>> +static void __init sme_map_range_encrypted_wp(struct sme_populate_pgd_data *ppd)
>> +{
>> +	__sme_map_range(ppd, PMD_FLAGS_ENC_WP, PTE_FLAGS_ENC_WP);
>> +}
>> +
>>   static void __init sme_map_range_decrypted(struct sme_populate_pgd_data *ppd)
>>   {
>>   	__sme_map_range(ppd, PMD_FLAGS_DEC, PTE_FLAGS_DEC);
> 
> These changes with the _WP flags and helper addition belong in a pre-patch.
> 
>> @@ -382,7 +387,10 @@ static void __init build_workarea_map(struct boot_params *bp,
>>   	ppd->paddr = workarea_start;
>>   	ppd->vaddr = workarea_start;
>>   	ppd->vaddr_end = workarea_end;
>> -	sme_map_range_decrypted(ppd);
>> +	if (sev_active())
>> +		sme_map_range_encrypted(ppd);
>> +	else
>> +		sme_map_range_decrypted(ppd);
>>   
>>   	/* Flush the TLB - no globals so cr3 is enough */
>>   	native_write_cr3(__native_read_cr3());
>> @@ -439,16 +447,27 @@ static void __init build_workarea_map(struct boot_params *bp,
>>   		sme_map_range_decrypted_wp(ppd);
>>   	}
>>   
>> -	/* Add decrypted workarea mappings to both kernel mappings */
>> +	/*
>> +	 * When SEV is active, kernel is already encrypted hence mapping
>> +	 * the initial workarea_start as encrypted. When SME is active,
>> +	 * the kernel is not encrypted hence add a decrypted workarea
> 
> s/ a//

Noted.


> 
>> +	 * mappings to both kernel mappings.
>> +	 */
>>   	ppd->paddr = workarea_start;
>>   	ppd->vaddr = workarea_start;
>>   	ppd->vaddr_end = workarea_end;
>> -	sme_map_range_decrypted(ppd);
>> +	if (sev_active())
>> +		sme_map_range_encrypted(ppd);
>> +	else
>> +		sme_map_range_decrypted(ppd);
>>   
>>   	ppd->paddr = workarea_start;
>>   	ppd->vaddr = workarea_start + decrypted_base;
>>   	ppd->vaddr_end = workarea_end + decrypted_base;
>> -	sme_map_range_decrypted(ppd);
>> +	if (sev_active())
>> +		sme_map_range_encrypted(ppd);
>> +	else
>> +		sme_map_range_decrypted(ppd);
>>   
>>   	wa->kernel_start = kernel_start;
>>   	wa->kernel_end = kernel_end;
>> @@ -491,28 +510,69 @@ static void __init remove_workarea_map(struct sme_workarea_data *wa,
>>   	native_write_cr3(__native_read_cr3());
>>   }
>>   
>> +static void __init decrypt_data_decrypted_section(struct sme_workarea_data *wa,
> 
> That function name could use some clarifying change...


How about decrypt_shared_data()?


> 
>> +						  struct sme_populate_pgd_data *ppd)
>> +{
>> +	unsigned long decrypted_start, decrypted_end, decrypted_len;
>> +
>> +	/* Physical addresses of decrypted data section */
>> +	decrypted_start = __pa_symbol(__start_data_decrypted);
>> +	decrypted_end = ALIGN(__pa_symbol(__end_data_decrypted), PMD_PAGE_SIZE);
>> +	decrypted_len = decrypted_end - decrypted_start;
>> +
>> +	if (!decrypted_len)
>> +		return;
>> +
>> +	/* Add decrypted mapping for the section (identity) */
>> +	ppd->paddr = decrypted_start;
>> +	ppd->vaddr = decrypted_start;
>> +	ppd->vaddr_end = decrypted_end;
>> +	sme_map_range_decrypted(ppd);
>> +
>> +	/* Add encrypted-wp mapping for the section (non-identity) */
>> +	ppd->paddr = decrypted_start;
>> +	ppd->vaddr = decrypted_start + wa->decrypted_base;
>> +	ppd->vaddr_end = decrypted_end + wa->decrypted_base;
>> +	sme_map_range_encrypted_wp(ppd);
>> +
>> +	/* Perform in-place decryption */
>> +	sme_encrypt_execute(decrypted_start,
>> +			    decrypted_start + wa->decrypted_base,
>> +			    decrypted_len, wa->workarea_start,
>> +			    (unsigned long)ppd->pgd);
>> +
>> +	ppd->vaddr = decrypted_start + wa->decrypted_base;
>> +	ppd->vaddr_end = decrypted_end + wa->decrypted_base;
>> +	sme_clear_pgd(ppd);
>> +}
>> +
>>   void __init sme_encrypt_kernel(struct boot_params *bp)
>>   {
>>   	struct sme_populate_pgd_data ppd;
>>   	struct sme_workarea_data wa;
>>   
>> -	if (!sme_active())
>> +	if (!mem_encrypt_active())
>>   		return;
>>   
>>   	build_workarea_map(bp, &wa, &ppd);
>>   
>> -	/* When SEV is active, encrypt kernel and initrd */
>> -	sme_encrypt_execute(wa.kernel_start,
>> -			    wa.kernel_start + wa.decrypted_base,
>> -			    wa.kernel_len, wa.workarea_start,
>> -			    (unsigned long)ppd.pgd);
>> -
>> -	if (wa.initrd_len)
>> -		sme_encrypt_execute(wa.initrd_start,
>> -				    wa.initrd_start + wa.decrypted_base,
>> -				    wa.initrd_len, wa.workarea_start,
>> +	/* When SME is active, encrypt kernel and initrd */
>> +	if (sme_active()) {
>> +		sme_encrypt_execute(wa.kernel_start,
>> +				    wa.kernel_start + wa.decrypted_base,
>> +				    wa.kernel_len, wa.workarea_start,
>>   				    (unsigned long)ppd.pgd);
>>   
>> +		if (wa.initrd_len)
>> +			sme_encrypt_execute(wa.initrd_start,
>> +					    wa.initrd_start + wa.decrypted_base,
>> +					    wa.initrd_len, wa.workarea_start,
>> +					    (unsigned long)ppd.pgd);
>> +	}
>> +
>> +	/* Decrypt the contents of .data..decrypted section */
>> +	decrypt_data_decrypted_section(&wa, &ppd);
>> +
>>   	remove_workarea_map(&wa, &ppd);
>>   }
>>   
>> -- 
>> 2.7.4
>>
> 
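
Pulling the hunks together, the restructured function ends up looking
roughly like this (a condensed sketch assembled from the diff above,
using the decrypt_shared_data() name proposed earlier in this reply;
not the literal file contents):

void __init sme_encrypt_kernel(struct boot_params *bp)
{
	struct sme_populate_pgd_data ppd;
	struct sme_workarea_data wa;

	/* Run for both bare-metal SME and SEV guests */
	if (!mem_encrypt_active())
		return;

	build_workarea_map(bp, &wa, &ppd);

	/* Only bare-metal SME encrypts the kernel and initrd in place */
	if (sme_active()) {
		sme_encrypt_execute(wa.kernel_start,
				    wa.kernel_start + wa.decrypted_base,
				    wa.kernel_len, wa.workarea_start,
				    (unsigned long)ppd.pgd);
		if (wa.initrd_len)
			sme_encrypt_execute(wa.initrd_start,
					    wa.initrd_start + wa.decrypted_base,
					    wa.initrd_len, wa.workarea_start,
					    (unsigned long)ppd.pgd);
	}

	/* SME and SEV alike: make .data..decrypted shareable (C=0) */
	decrypt_shared_data(&wa, &ppd);

	remove_workarea_map(&wa, &ppd);
}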

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 2/3] x86/mm: add .data..decrypted section to hold shared variables
  2018-08-28 22:12 ` [PATCH v2 2/3] x86/mm: add .data..decrypted section to hold shared variables Brijesh Singh
  2018-08-29 13:59   ` Borislav Petkov
@ 2018-08-29 15:03   ` Sean Christopherson
  2018-08-29 15:33     ` Brijesh Singh
  1 sibling, 1 reply; 13+ messages in thread
From: Sean Christopherson @ 2018-08-29 15:03 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, stable, Tom Lendacky, Thomas Gleixner,
	Borislav Petkov, H. Peter Anvin, Paolo Bonzini,
	Radim Krčmář

On Tue, Aug 28, 2018 at 05:12:56PM -0500, Brijesh Singh wrote:
> kvmclock defines a few static variables which are shared with the
> hypervisor during kvmclock initialization.
> 
> When SEV is active, memory is encrypted with a guest-specific key, and
> if the guest OS wants to share a memory region with the hypervisor then
> it must clear the C-bit before sharing it. Currently, we use
> kernel_physical_mapping_init() to split large pages before clearing the
> C-bit on shared pages. But kernel_physical_mapping_init() fails when
> called from kvmclock initialization (mainly because the memblock
> allocator is not ready that early).
> 
> The '__decrypted' attribute can be used to define a shared variable;
> such variables are placed in the .data..decrypted section. This section
> is mapped with C=0 early in boot, and we also ensure that the
> initialized values are updated to match the C=0 mapping (i.e., an
> in-place decryption is performed). The .data..decrypted section is PMD
> aligned and sized so that we avoid the need to split large pages when
> mapping it.

What about naming the attribute (and section) '__unencrypted' instead
of '__decrypted'?  The attribute should be a property describing how
the data must be accessed; it shouldn't imply anything regarding the
history of the data.  Decrypted implies that data was once encrypted,
whereas unencrypted simply states that the data is stored in plain
text.  All data that has been decrypted is also unencrypted, but the
reverse does not hold true.
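
Whichever name wins, everything hanging off it is small: an attribute
that drops a variable into the special section, plus a PMD-aligned
block in the linker script. A minimal sketch, assuming the v2 names
from the commit message (the exact definitions may differ in the
actual patch):

/* arch/x86/include/asm/mem_encrypt.h (sketch) */
#define __decrypted __attribute__((__section__(".data..decrypted")))

/* e.g. in kvmclock, for a variable shared with the hypervisor */
static struct pvclock_wall_clock wall_clock __decrypted;

/* arch/x86/kernel/vmlinux.lds.S (sketch): pad to PMD_SIZE on both
 * ends so clearing the C-bit never needs to split a large page */
. = ALIGN(PMD_SIZE);
__start_data_decrypted = .;
*(.data..decrypted)
. = ALIGN(PMD_SIZE);
__end_data_decrypted = .;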

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 2/3] x86/mm: add .data..decrypted section to hold shared variables
  2018-08-29 15:03   ` Sean Christopherson
@ 2018-08-29 15:33     ` Brijesh Singh
  2018-08-30  9:26       ` Borislav Petkov
  0 siblings, 1 reply; 13+ messages in thread
From: Brijesh Singh @ 2018-08-29 15:33 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: brijesh.singh, x86, linux-kernel, kvm, stable, Tom Lendacky,
	Thomas Gleixner, Borislav Petkov, H. Peter Anvin, Paolo Bonzini,
	Radim Krčmář



On 08/29/2018 10:03 AM, Sean Christopherson wrote:
> On Tue, Aug 28, 2018 at 05:12:56PM -0500, Brijesh Singh wrote:
>> kvmclock defines a few static variables which are shared with the
>> hypervisor during kvmclock initialization.
>>
>> When SEV is active, memory is encrypted with a guest-specific key, and
>> if the guest OS wants to share a memory region with the hypervisor then
>> it must clear the C-bit before sharing it. Currently, we use
>> kernel_physical_mapping_init() to split large pages before clearing the
>> C-bit on shared pages. But kernel_physical_mapping_init() fails when
>> called from kvmclock initialization (mainly because the memblock
>> allocator is not ready that early).
>>
>> The '__decrypted' attribute can be used to define a shared variable;
>> such variables are placed in the .data..decrypted section. This section
>> is mapped with C=0 early in boot, and we also ensure that the
>> initialized values are updated to match the C=0 mapping (i.e., an
>> in-place decryption is performed). The .data..decrypted section is PMD
>> aligned and sized so that we avoid the need to split large pages when
>> mapping it.
> 
> What about naming the attribute (and section) '__unencrypted' instead
> of '__decrypted'?  The attribute should be a property describing how
> the data must be accessed; it shouldn't imply anything regarding the
> history of the data.  Decrypted implies that data was once encrypted,
> whereas unencrypted simply states that the data is stored in plain
> text.  All data that has been decrypted is also unencrypted, but the
> reverse does not hold true.
> 


During the initial SEV/SME patch review cycle we had some discussion
about using 'decrypted' vs 'unencrypted'. At that time the consensus
was that a memory range mapped with C=0 should be referred to as
'decrypted'. Having said that, I do see your point and I am not opposed
to calling it 'unencrypted' if others agree.

Tom and Boris, thoughts ?

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 2/3] x86/mm: add .data..decrypted section to hold shared variables
  2018-08-29 13:59   ` Borislav Petkov
  2018-08-29 14:37     ` Brijesh Singh
@ 2018-08-29 15:54     ` Brijesh Singh
  2018-08-30  9:22       ` Borislav Petkov
  1 sibling, 1 reply; 13+ messages in thread
From: Brijesh Singh @ 2018-08-29 15:54 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: brijesh.singh, x86, linux-kernel, kvm, stable, Tom Lendacky,
	Thomas Gleixner, H. Peter Anvin, Paolo Bonzini,
	Sean Christopherson, Radim Krčmář

Hi Boris,


On 08/29/2018 08:59 AM, Borislav Petkov wrote:

...

> 
>>   static void __init __sme_map_range_pmd(struct sme_populate_pgd_data *ppd)
>> @@ -235,6 +235,11 @@ static void __init sme_map_range_encrypted(struct sme_populate_pgd_data *ppd)
>>   	__sme_map_range(ppd, PMD_FLAGS_ENC, PTE_FLAGS_ENC);
>>   }
>>   
>> +static void __init sme_map_range_encrypted_wp(struct sme_populate_pgd_data *ppd)
>> +{
>> +	__sme_map_range(ppd, PMD_FLAGS_ENC_WP, PTE_FLAGS_ENC_WP);
>> +}
>> +
>>   static void __init sme_map_range_decrypted(struct sme_populate_pgd_data *ppd)
>>   {
>>   	__sme_map_range(ppd, PMD_FLAGS_DEC, PTE_FLAGS_DEC);
> 
> These changes with the _WP flags and helper addition belong in a pre-patch.
> 


Please note that the _WP flags and helper functions are used by this
patch only. Introducing the helper in a separate patch would cause a
build warning. I am leaning toward keeping the helper in this patch,
but if you think it should be done in a separate patch then let me
know.
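
For reference, the flag definitions behind the helper follow the
existing PMD_FLAGS_DEC_WP/PTE_FLAGS_DEC_WP pattern (PAT+PWT selecting
the write-protect memory type); roughly:

/* Encrypted mapping with the write-protect (WP) cache attribute,
 * analogous to the existing *_DEC_WP flags used for in-place
 * encryption (sketch, not the literal patch): */
#define PMD_FLAGS_ENC_WP	((PMD_FLAGS_ENC & ~_PAGE_CACHE_MASK) | \
				 (_PAGE_PAT | _PAGE_PWT))

#define PTE_FLAGS_ENC_WP	((PTE_FLAGS_ENC & ~_PAGE_CACHE_MASK) | \
				 (_PAGE_PAT | _PAGE_PWT))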


-Brijesh

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 2/3] x86/mm: add .data..decrypted section to hold shared variables
  2018-08-29 14:37     ` Brijesh Singh
@ 2018-08-30  9:21       ` Borislav Petkov
  0 siblings, 0 replies; 13+ messages in thread
From: Borislav Petkov @ 2018-08-30  9:21 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, stable, Tom Lendacky, Thomas Gleixner,
	H. Peter Anvin, Paolo Bonzini, Sean Christopherson,
	Radim Krčmář

On Wed, Aug 29, 2018 at 09:37:46AM -0500, Brijesh Singh wrote:
> > > @@ -491,28 +510,69 @@ static void __init remove_workarea_map(struct sme_workarea_data *wa,
> > >   	native_write_cr3(__native_read_cr3());
> > >   }
> > > +static void __init decrypt_data_decrypted_section(struct sme_workarea_data *wa,
> > 
> > That function name could use some clarifying change...
> 
> 
> How about decrypt_shared_data() ?

Much better.

Thx.

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 2/3] x86/mm: add .data..decrypted section to hold shared variables
  2018-08-29 15:54     ` Brijesh Singh
@ 2018-08-30  9:22       ` Borislav Petkov
  0 siblings, 0 replies; 13+ messages in thread
From: Borislav Petkov @ 2018-08-30  9:22 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: x86, linux-kernel, kvm, stable, Tom Lendacky, Thomas Gleixner,
	H. Peter Anvin, Paolo Bonzini, Sean Christopherson,
	Radim Krčmář

On Wed, Aug 29, 2018 at 10:54:37AM -0500, Brijesh Singh wrote:
> Please note that the _WP flags and helper functions are used by this
> patch only. Introducing the helper in a separate patch would cause a
> build warning.

... a build warning which goes away with the next patch, adding the
users, right?
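
Concretely, that's gcc's -Wunused-function firing on a static helper
that has no caller yet at that point in the series, e.g.:

/* helper added in patch N, first caller only arrives in patch N+1 */
static void __init sme_map_range_encrypted_wp(struct sme_populate_pgd_data *ppd)
{
	__sme_map_range(ppd, PMD_FLAGS_ENC_WP, PTE_FLAGS_ENC_WP);
}

/* warning: 'sme_map_range_encrypted_wp' defined but not used
 * [-Wunused-function] */

It's transient - it disappears as soon as the caller lands in the next
patch.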

> I am leaning to keep the helper in this patch but if you think
> it should be done in separate patch then let me know.

Well, if you're doing a bunch of different things in a single patch, it
would be better, IMO, to have each thing in a separate patch.

Thx.

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 2/3] x86/mm: add .data..decrypted section to hold shared variables
  2018-08-29 15:33     ` Brijesh Singh
@ 2018-08-30  9:26       ` Borislav Petkov
  0 siblings, 0 replies; 13+ messages in thread
From: Borislav Petkov @ 2018-08-30  9:26 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: Sean Christopherson, x86, linux-kernel, kvm, Tom Lendacky,
	Thomas Gleixner, H. Peter Anvin, Paolo Bonzini,
	Radim Krčmář

dropping stable@

On Wed, Aug 29, 2018 at 10:33:24AM -0500, Brijesh Singh wrote:
> During the initial SEV/SME patch review cycle we had some discussion about
> using decrypted vs unencrypted. At that time the consensus was
> that a memory range mapped with C=0 should be referred as 'decrypted'.

Yes, the idea was to avoid having "unencrypted" *and* "decrypted" mean
pretty much the same thing, for ease of understanding just by looking
at the name.

Also, whether the data was initially unencrypted or was decrypted is
immaterial - you only need to know how to access it.

> Having said so, I do see your point and I am not oppose to calling it
> 'unencrypted' if others agrees to it.

No, please don't. Let's stick with "decrypted" for everything.

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 13+ messages in thread
