linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH v1 00/18] x86: Secure Memory Encryption (AMD)
@ 2016-04-26 22:45 Tom Lendacky
  2016-04-26 22:45 ` [RFC PATCH v1 01/18] x86: Set the write-protect cache mode for AMD processors Tom Lendacky
                   ` (9 more replies)
  0 siblings, 10 replies; 30+ messages in thread
From: Tom Lendacky @ 2016-04-26 22:45 UTC (permalink / raw)
  To: linux-arch, linux-efi, kvm, linux-doc, x86, linux-kernel,
	kasan-dev, linux-mm, iommu
  Cc: Radim Krčmář,
	Arnd Bergmann, Jonathan Corbet, Matt Fleming, Joerg Roedel,
	Konrad Rzeszutek Wilk, Paolo Bonzini, Ingo Molnar,
	Borislav Petkov, H. Peter Anvin, Andrey Ryabinin,
	Alexander Potapenko, Thomas Gleixner, Dmitry Vyukov

This RFC patch series provides support for AMD's new Secure Memory
Encryption (SME) feature.

SME can be used to mark individual pages of memory as encrypted through the
page tables. A page of memory that is marked encrypted will be automatically
decrypted when read from DRAM and will be automatically encrypted when
written to DRAM. Details on SME can found in the links below.

The SME feature is identified through a CPUID function and enabled through
the SYSCFG MSR. Once enabled, page table entries will determine how the
memory is accessed. If a page table entry has the memory encryption mask set,
then that memory will be accessed as encrypted memory. The memory encryption
mask (as well as other related information) is determined from settings
returned through the same CPUID function that identifies the presence of the
feature.

The approach that this patch series takes is to encrypt everything possible
starting early in the boot where the kernel is encrypted. Using the page
table macros the encryption mask can be incorporated into all page table
entries and page allocations. By updating the protection map, userspace
allocations are also marked encrypted. Certain data must be accounted for
as having been placed in memory before SME was enabled (EFI, initrd, etc.)
and accessed accordingly.

This patch series is a pre-cursor to another AMD processor feature called
Secure Encrypted Virtualization (SEV). The support for SEV will build upon
the SME support and will be submitted later. Details on SEV can be found
in the links below.

The following links provide additional detail:

AMD Memory Encryption whitepaper:
   http://amd-dev.wpengine.netdna-cdn.com/wordpress/media/2013/12/AMD_Memory_Encryption_Whitepaper_v7-Public.pdf

AMD64 Architecture Programmer's Manual:
   http://support.amd.com/TechDocs/24593.pdf
   SME is section 7.10
   SEV is section 15.34

This patch series is based off of the master branch of tip.
  Commit 8d54fcebd9b3 ("Merge branch 'x86/urgent'")

---

Tom Lendacky (18):
      x86: Set the write-protect cache mode for AMD processors
      x86: Secure Memory Encryption (SME) build enablement
      x86: Secure Memory Encryption (SME) support
      x86: Add the Secure Memory Encryption cpu feature
      x86: Handle reduction in physical address size with SME
      x86: Provide general kernel support for memory encryption
      x86: Extend the early_memmap support with additional attrs
      x86: Add support for early encryption/decryption of memory
      x86: Insure that memory areas are encrypted when possible
      x86/efi: Access EFI related tables in the clear
      x86: Decrypt trampoline area if memory encryption is active
      x86: Access device tree in the clear
      x86: DMA support for memory encryption
      iommu/amd: AMD IOMMU support for memory encryption
      x86: Enable memory encryption on the APs
      x86: Do not specify encrypted memory for VGA mapping
      x86/kvm: Enable Secure Memory Encryption of nested page tables
      x86: Add support to turn on Secure Memory Encryption


 Documentation/kernel-parameters.txt  |    3 
 arch/x86/Kconfig                     |    9 +
 arch/x86/include/asm/cacheflush.h    |    3 
 arch/x86/include/asm/cpufeature.h    |    1 
 arch/x86/include/asm/cpufeatures.h   |    5 
 arch/x86/include/asm/dma-mapping.h   |    5 
 arch/x86/include/asm/fixmap.h        |   16 ++
 arch/x86/include/asm/kvm_host.h      |    2 
 arch/x86/include/asm/mem_encrypt.h   |   99 ++++++++++
 arch/x86/include/asm/msr-index.h     |    2 
 arch/x86/include/asm/pgtable_types.h |   49 +++--
 arch/x86/include/asm/processor.h     |    3 
 arch/x86/include/asm/realmode.h      |   12 +
 arch/x86/include/asm/vga.h           |   13 +
 arch/x86/kernel/Makefile             |    2 
 arch/x86/kernel/asm-offsets.c        |    2 
 arch/x86/kernel/cpu/common.c         |    2 
 arch/x86/kernel/cpu/scattered.c      |    1 
 arch/x86/kernel/devicetree.c         |    6 -
 arch/x86/kernel/espfix_64.c          |    2 
 arch/x86/kernel/head64.c             |  100 +++++++++-
 arch/x86/kernel/head_64.S            |   42 +++-
 arch/x86/kernel/machine_kexec_64.c   |    2 
 arch/x86/kernel/mem_encrypt.S        |  343 ++++++++++++++++++++++++++++++++++
 arch/x86/kernel/pci-dma.c            |   11 +
 arch/x86/kernel/pci-nommu.c          |    2 
 arch/x86/kernel/pci-swiotlb.c        |    8 +
 arch/x86/kernel/setup.c              |   14 +
 arch/x86/kernel/x8664_ksyms_64.c     |    6 +
 arch/x86/kvm/mmu.c                   |    7 -
 arch/x86/kvm/vmx.c                   |    2 
 arch/x86/kvm/x86.c                   |    3 
 arch/x86/mm/Makefile                 |    1 
 arch/x86/mm/fault.c                  |    5 
 arch/x86/mm/ioremap.c                |   31 +++
 arch/x86/mm/kasan_init_64.c          |    4 
 arch/x86/mm/mem_encrypt.c            |  201 ++++++++++++++++++++
 arch/x86/mm/pageattr.c               |   78 ++++++++
 arch/x86/mm/pat.c                    |   11 +
 arch/x86/platform/efi/efi.c          |   26 +--
 arch/x86/platform/efi/efi_64.c       |    9 +
 arch/x86/platform/efi/quirks.c       |   12 +
 arch/x86/realmode/init.c             |   13 +
 arch/x86/realmode/rm/trampoline_64.S |   14 +
 drivers/firmware/efi/efi.c           |   18 +-
 drivers/firmware/efi/esrt.c          |   12 +
 drivers/iommu/amd_iommu.c            |   10 +
 include/asm-generic/early_ioremap.h  |    2 
 include/linux/efi.h                  |    3 
 include/linux/swiotlb.h              |    1 
 init/main.c                          |    6 +
 lib/swiotlb.c                        |   64 ++++++
 mm/early_ioremap.c                   |   15 +
 53 files changed, 1217 insertions(+), 96 deletions(-)
 create mode 100644 arch/x86/include/asm/mem_encrypt.h
 create mode 100644 arch/x86/kernel/mem_encrypt.S
 create mode 100644 arch/x86/mm/mem_encrypt.c

-- 
Tom Lendacky

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [RFC PATCH v1 01/18] x86: Set the write-protect cache mode for AMD processors
  2016-04-26 22:45 [RFC PATCH v1 00/18] x86: Secure Memory Encryption (AMD) Tom Lendacky
@ 2016-04-26 22:45 ` Tom Lendacky
  2016-04-26 22:45 ` [RFC PATCH v1 02/18] x86: Secure Memory Encryption (SME) build enablement Tom Lendacky
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 30+ messages in thread
From: Tom Lendacky @ 2016-04-26 22:45 UTC (permalink / raw)
  To: linux-arch, linux-efi, kvm, linux-doc, x86, linux-kernel,
	kasan-dev, linux-mm, iommu
  Cc: Radim Krčmář,
	Arnd Bergmann, Jonathan Corbet, Matt Fleming, Joerg Roedel,
	Konrad Rzeszutek Wilk, Paolo Bonzini, Ingo Molnar,
	Borislav Petkov, H. Peter Anvin, Andrey Ryabinin,
	Alexander Potapenko, Thomas Gleixner, Dmitry Vyukov

For AMD processors that support PAT, set the write-protect cache mode
(_PAGE_CACHE_MODE_WP) entry to the actual write-protect value (x05).

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
---
 arch/x86/mm/pat.c |   11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index fb0604f..dda78ed 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -345,6 +345,8 @@ void pat_init(void)
 		 * we lose performance without causing a correctness issue.
 		 * Pentium 4 erratum N46 is an example for such an erratum,
 		 * although we try not to use PAT at all on affected CPUs.
+		 * AMD processors support write-protect so initialize the
+		 * PAT slot 5 appropriately.
 		 *
 		 *  PTE encoding:
 		 *      PAT
@@ -356,7 +358,7 @@ void pat_init(void)
 		 *      010    2    UC-: _PAGE_CACHE_MODE_UC_MINUS
 		 *      011    3    UC : _PAGE_CACHE_MODE_UC
 		 *      100    4    WB : Reserved
-		 *      101    5    WC : Reserved
+		 *      101    5    WC : Reserved (AMD: _PAGE_CACHE_MODE_WP)
 		 *      110    6    UC-: Reserved
 		 *      111    7    WT : _PAGE_CACHE_MODE_WT
 		 *
@@ -364,7 +366,12 @@ void pat_init(void)
 		 * corresponding types in the presence of PAT errata.
 		 */
 		pat = PAT(0, WB) | PAT(1, WC) | PAT(2, UC_MINUS) | PAT(3, UC) |
-		      PAT(4, WB) | PAT(5, WC) | PAT(6, UC_MINUS) | PAT(7, WT);
+		      PAT(4, WB) | PAT(6, UC_MINUS) | PAT(7, WT);
+
+		if (c->x86_vendor == X86_VENDOR_AMD)
+			pat |= PAT(5, WP);
+		else
+			pat |= PAT(5, WC);
 	}
 
 	if (!boot_cpu_done) {

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [RFC PATCH v1 02/18] x86: Secure Memory Encryption (SME) build enablement
  2016-04-26 22:45 [RFC PATCH v1 00/18] x86: Secure Memory Encryption (AMD) Tom Lendacky
  2016-04-26 22:45 ` [RFC PATCH v1 01/18] x86: Set the write-protect cache mode for AMD processors Tom Lendacky
@ 2016-04-26 22:45 ` Tom Lendacky
  2016-04-26 22:45 ` [RFC PATCH v1 03/18] x86: Secure Memory Encryption (SME) support Tom Lendacky
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 30+ messages in thread
From: Tom Lendacky @ 2016-04-26 22:45 UTC (permalink / raw)
  To: linux-arch, linux-efi, kvm, linux-doc, x86, linux-kernel,
	kasan-dev, linux-mm, iommu
  Cc: Radim Krčmář,
	Arnd Bergmann, Jonathan Corbet, Matt Fleming, Joerg Roedel,
	Konrad Rzeszutek Wilk, Paolo Bonzini, Ingo Molnar,
	Borislav Petkov, H. Peter Anvin, Andrey Ryabinin,
	Alexander Potapenko, Thomas Gleixner, Dmitry Vyukov

Provide the Kconfig support to build the SME support in the kernel.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
---
 arch/x86/Kconfig |    9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 7bb1574..13249b5 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1356,6 +1356,15 @@ config X86_DIRECT_GBPAGES
 	  supports them), so don't confuse the user by printing
 	  that we have them enabled.
 
+config AMD_MEM_ENCRYPT
+	bool "Secure Memory Encryption support for AMD"
+	depends on X86_64 && CPU_SUP_AMD
+	---help---
+	  Say yes to enable the encryption of system memory. This requires
+	  an AMD processor that supports Secure Memory Encryption (SME).
+	  The encryption of system memory is disabled by default but can be
+	  enabled with the mem_encrypt=on command line option.
+
 # Common NUMA Features
 config NUMA
 	bool "Numa Memory Allocation and Scheduler Support"

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [RFC PATCH v1 03/18] x86: Secure Memory Encryption (SME) support
  2016-04-26 22:45 [RFC PATCH v1 00/18] x86: Secure Memory Encryption (AMD) Tom Lendacky
  2016-04-26 22:45 ` [RFC PATCH v1 01/18] x86: Set the write-protect cache mode for AMD processors Tom Lendacky
  2016-04-26 22:45 ` [RFC PATCH v1 02/18] x86: Secure Memory Encryption (SME) build enablement Tom Lendacky
@ 2016-04-26 22:45 ` Tom Lendacky
  2016-04-26 22:45 ` [RFC PATCH v1 04/18] x86: Add the Secure Memory Encryption cpu feature Tom Lendacky
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 30+ messages in thread
From: Tom Lendacky @ 2016-04-26 22:45 UTC (permalink / raw)
  To: linux-arch, linux-efi, kvm, linux-doc, x86, linux-kernel,
	kasan-dev, linux-mm, iommu
  Cc: Radim Krčmář,
	Arnd Bergmann, Jonathan Corbet, Matt Fleming, Joerg Roedel,
	Konrad Rzeszutek Wilk, Paolo Bonzini, Ingo Molnar,
	Borislav Petkov, H. Peter Anvin, Andrey Ryabinin,
	Alexander Potapenko, Thomas Gleixner, Dmitry Vyukov

Provide support for Secure Memory Encryption (SME). This initial support
defines the memory encryption mask as a variable for quick access and an
accessor for retrieving the number of physical addressing bits lost if
SME is enabled.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
---
 arch/x86/include/asm/mem_encrypt.h |   37 ++++++++++++++++++++++++++++++++++++
 arch/x86/kernel/Makefile           |    2 ++
 arch/x86/kernel/mem_encrypt.S      |   29 ++++++++++++++++++++++++++++
 arch/x86/kernel/x8664_ksyms_64.c   |    6 ++++++
 4 files changed, 74 insertions(+)
 create mode 100644 arch/x86/include/asm/mem_encrypt.h
 create mode 100644 arch/x86/kernel/mem_encrypt.S

diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h
new file mode 100644
index 0000000..747fc52
--- /dev/null
+++ b/arch/x86/include/asm/mem_encrypt.h
@@ -0,0 +1,37 @@
+/*
+ * AMD Memory Encryption Support
+ *
+ * Copyright (C) 2016 Advanced Micro Devices, Inc.
+ *
+ * Author: Tom Lendacky <thomas.lendacky@amd.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#ifndef __X86_MEM_ENCRYPT_H__
+#define __X86_MEM_ENCRYPT_H__
+
+#ifndef __ASSEMBLY__
+
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+
+extern unsigned long sme_me_mask;
+
+u8 sme_get_me_loss(void);
+
+#else	/* !CONFIG_AMD_MEM_ENCRYPT */
+
+#define sme_me_mask		0UL
+
+static inline u8 sme_get_me_loss(void)
+{
+	return 0;
+}
+
+#endif	/* CONFIG_AMD_MEM_ENCRYPT */
+
+#endif	/* __ASSEMBLY__ */
+
+#endif	/* __X86_MEM_ENCRYPT_H__ */
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 9abf855..11536d9 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -126,6 +126,8 @@ obj-$(CONFIG_EFI)			+= sysfb_efi.o
 obj-$(CONFIG_PERF_EVENTS)		+= perf_regs.o
 obj-$(CONFIG_TRACING)			+= tracepoint.o
 
+obj-y					+= mem_encrypt.o
+
 ###
 # 64 bit specific files
 ifeq ($(CONFIG_X86_64),y)
diff --git a/arch/x86/kernel/mem_encrypt.S b/arch/x86/kernel/mem_encrypt.S
new file mode 100644
index 0000000..ef7f325
--- /dev/null
+++ b/arch/x86/kernel/mem_encrypt.S
@@ -0,0 +1,29 @@
+/*
+ * AMD Memory Encryption Support
+ *
+ * Copyright (C) 2016 Advanced Micro Devices, Inc.
+ *
+ * Author: Tom Lendacky <thomas.lendacky@amd.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/linkage.h>
+
+	.text
+	.code64
+ENTRY(sme_get_me_loss)
+	xor	%rax, %rax
+	mov	sme_me_loss(%rip), %al
+	ret
+ENDPROC(sme_get_me_loss)
+
+	.data
+	.align 16
+ENTRY(sme_me_mask)
+	.quad	0x0000000000000000
+sme_me_loss:
+	.byte	0x00
+	.align	8
diff --git a/arch/x86/kernel/x8664_ksyms_64.c b/arch/x86/kernel/x8664_ksyms_64.c
index cd05942..72cb689 100644
--- a/arch/x86/kernel/x8664_ksyms_64.c
+++ b/arch/x86/kernel/x8664_ksyms_64.c
@@ -11,6 +11,7 @@
 #include <asm/uaccess.h>
 #include <asm/desc.h>
 #include <asm/ftrace.h>
+#include <asm/mem_encrypt.h>
 
 #ifdef CONFIG_FUNCTION_TRACER
 /* mcount and __fentry__ are defined in assembly */
@@ -79,3 +80,8 @@ EXPORT_SYMBOL(native_load_gs_index);
 EXPORT_SYMBOL(___preempt_schedule);
 EXPORT_SYMBOL(___preempt_schedule_notrace);
 #endif
+
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+EXPORT_SYMBOL_GPL(sme_me_mask);
+EXPORT_SYMBOL_GPL(sme_get_me_loss);
+#endif

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [RFC PATCH v1 04/18] x86: Add the Secure Memory Encryption cpu feature
  2016-04-26 22:45 [RFC PATCH v1 00/18] x86: Secure Memory Encryption (AMD) Tom Lendacky
                   ` (2 preceding siblings ...)
  2016-04-26 22:45 ` [RFC PATCH v1 03/18] x86: Secure Memory Encryption (SME) support Tom Lendacky
@ 2016-04-26 22:45 ` Tom Lendacky
  2016-04-26 22:46 ` [RFC PATCH v1 05/18] x86: Handle reduction in physical address size with SME Tom Lendacky
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 30+ messages in thread
From: Tom Lendacky @ 2016-04-26 22:45 UTC (permalink / raw)
  To: linux-arch, linux-efi, kvm, linux-doc, x86, linux-kernel,
	kasan-dev, linux-mm, iommu
  Cc: Radim Krčmář,
	Arnd Bergmann, Jonathan Corbet, Matt Fleming, Joerg Roedel,
	Konrad Rzeszutek Wilk, Paolo Bonzini, Ingo Molnar,
	Borislav Petkov, H. Peter Anvin, Andrey Ryabinin,
	Alexander Potapenko, Thomas Gleixner, Dmitry Vyukov

Update the cpu features to include identifying and reporting on the
Secure Memory Encryption feature.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
---
 arch/x86/include/asm/cpufeature.h  |    1 +
 arch/x86/include/asm/cpufeatures.h |    5 ++++-
 arch/x86/kernel/cpu/scattered.c    |    1 +
 3 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index 07c942d..e27e352 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -27,6 +27,7 @@ enum cpuid_leafs
 	CPUID_6_EAX,
 	CPUID_8000_000A_EDX,
 	CPUID_7_ECX,
+	CPUID_8000_001F_EAX,
 };
 
 #ifdef CONFIG_X86_FEATURE_NAMES
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 47b5056..4aea205 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -12,7 +12,7 @@
 /*
  * Defines x86 CPU feature bits
  */
-#define NCAPINTS	17	/* N 32-bit words worth of info */
+#define NCAPINTS	18	/* N 32-bit words worth of info */
 #define NBUGINTS	1	/* N 32-bit bug flags */
 
 /*
@@ -282,6 +282,9 @@
 #define X86_FEATURE_PKU		(16*32+ 3) /* Protection Keys for Userspace */
 #define X86_FEATURE_OSPKE	(16*32+ 4) /* OS Protection Keys Enable */
 
+/* AMD SME Feature Identification, CPUID level 0x8000001f (eax), word 17 */
+#define X86_FEATURE_SME		(17*32+ 0) /* Secure Memory Encryption support */
+
 /*
  * BUG word(s)
  */
diff --git a/arch/x86/kernel/cpu/scattered.c b/arch/x86/kernel/cpu/scattered.c
index 8cb57df..d86d9a5 100644
--- a/arch/x86/kernel/cpu/scattered.c
+++ b/arch/x86/kernel/cpu/scattered.c
@@ -37,6 +37,7 @@ void init_scattered_cpuid_features(struct cpuinfo_x86 *c)
 		{ X86_FEATURE_HW_PSTATE,	CR_EDX, 7, 0x80000007, 0 },
 		{ X86_FEATURE_CPB,		CR_EDX, 9, 0x80000007, 0 },
 		{ X86_FEATURE_PROC_FEEDBACK,	CR_EDX,11, 0x80000007, 0 },
+		{ X86_FEATURE_SME,		CR_EAX, 0, 0x8000001f, 0 },
 		{ 0, 0, 0, 0, 0 }
 	};
 

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [RFC PATCH v1 05/18] x86: Handle reduction in physical address size with SME
  2016-04-26 22:45 [RFC PATCH v1 00/18] x86: Secure Memory Encryption (AMD) Tom Lendacky
                   ` (3 preceding siblings ...)
  2016-04-26 22:45 ` [RFC PATCH v1 04/18] x86: Add the Secure Memory Encryption cpu feature Tom Lendacky
@ 2016-04-26 22:46 ` Tom Lendacky
  2016-04-26 22:46 ` [RFC PATCH v1 06/18] x86: Provide general kernel support for memory encryption Tom Lendacky
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 30+ messages in thread
From: Tom Lendacky @ 2016-04-26 22:46 UTC (permalink / raw)
  To: linux-arch, linux-efi, kvm, linux-doc, x86, linux-kernel,
	kasan-dev, linux-mm, iommu
  Cc: Radim Krčmář,
	Arnd Bergmann, Jonathan Corbet, Matt Fleming, Joerg Roedel,
	Konrad Rzeszutek Wilk, Paolo Bonzini, Ingo Molnar,
	Borislav Petkov, H. Peter Anvin, Andrey Ryabinin,
	Alexander Potapenko, Thomas Gleixner, Dmitry Vyukov

When System Memory Encryption (SME) is enabled, the physical address
space is reduced. Adjust the x86_phys_bits value to reflect this
reduction.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
---
 arch/x86/kernel/cpu/common.c |    2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 6bfa36d..b49e7fc 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -43,6 +43,7 @@
 #include <asm/pat.h>
 #include <asm/microcode.h>
 #include <asm/microcode_intel.h>
+#include <asm/mem_encrypt.h>
 
 #ifdef CONFIG_X86_LOCAL_APIC
 #include <asm/uv/uv.h>
@@ -722,6 +723,7 @@ void get_cpu_cap(struct cpuinfo_x86 *c)
 
 		c->x86_virt_bits = (eax >> 8) & 0xff;
 		c->x86_phys_bits = eax & 0xff;
+		c->x86_phys_bits -= sme_get_me_loss();
 		c->x86_capability[CPUID_8000_0008_EBX] = ebx;
 	}
 #ifdef CONFIG_X86_32

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [RFC PATCH v1 06/18] x86: Provide general kernel support for memory encryption
  2016-04-26 22:45 [RFC PATCH v1 00/18] x86: Secure Memory Encryption (AMD) Tom Lendacky
                   ` (4 preceding siblings ...)
  2016-04-26 22:46 ` [RFC PATCH v1 05/18] x86: Handle reduction in physical address size with SME Tom Lendacky
@ 2016-04-26 22:46 ` Tom Lendacky
  2016-04-26 22:46 ` [RFC PATCH v1 07/18] x86: Extend the early_memmap support with additional attrs Tom Lendacky
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 30+ messages in thread
From: Tom Lendacky @ 2016-04-26 22:46 UTC (permalink / raw)
  To: linux-arch, linux-efi, kvm, linux-doc, x86, linux-kernel,
	kasan-dev, linux-mm, iommu
  Cc: Radim Krčmář,
	Arnd Bergmann, Jonathan Corbet, Matt Fleming, Joerg Roedel,
	Konrad Rzeszutek Wilk, Paolo Bonzini, Ingo Molnar,
	Borislav Petkov, H. Peter Anvin, Andrey Ryabinin,
	Alexander Potapenko, Thomas Gleixner, Dmitry Vyukov

Adding general kernel support for memory encryption includes:
- Modify and create some page table macros to include the Secure Memory
  Encryption (SME) memory encryption mask
- Update kernel boot support to call an SME routine that checks for and
  sets the SME capability (the SME routine will grow later and for now
  is just a stub routine)
- Update kernel boot support to call an SME routine that encrypts the
  kernel (the SME routine will grow later and for now is just a stub
  routine)
- Provide an SME initialization routine to update the protection map with
  the memory encryption mask so that it is used by default

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
---
 arch/x86/include/asm/fixmap.h        |    7 ++++++
 arch/x86/include/asm/mem_encrypt.h   |   18 +++++++++++++++
 arch/x86/include/asm/pgtable_types.h |   41 ++++++++++++++++++++++-----------
 arch/x86/include/asm/processor.h     |    3 ++
 arch/x86/kernel/espfix_64.c          |    2 +-
 arch/x86/kernel/head64.c             |   10 ++++++--
 arch/x86/kernel/head_64.S            |   42 ++++++++++++++++++++++++++--------
 arch/x86/kernel/machine_kexec_64.c   |    2 +-
 arch/x86/kernel/mem_encrypt.S        |    8 ++++++
 arch/x86/mm/Makefile                 |    1 +
 arch/x86/mm/fault.c                  |    5 ++--
 arch/x86/mm/ioremap.c                |    3 ++
 arch/x86/mm/kasan_init_64.c          |    4 ++-
 arch/x86/mm/mem_encrypt.c            |   30 ++++++++++++++++++++++++
 arch/x86/mm/pageattr.c               |    3 ++
 15 files changed, 145 insertions(+), 34 deletions(-)
 create mode 100644 arch/x86/mm/mem_encrypt.c

diff --git a/arch/x86/include/asm/fixmap.h b/arch/x86/include/asm/fixmap.h
index 8554f96..83e91f0 100644
--- a/arch/x86/include/asm/fixmap.h
+++ b/arch/x86/include/asm/fixmap.h
@@ -153,6 +153,13 @@ static inline void __set_fixmap(enum fixed_addresses idx,
 }
 #endif
 
+/*
+ * Fixmap settings used with memory encryption
+ *   - FIXMAP_PAGE_NOCACHE is used for MMIO so make sure the memory
+ *     encryption mask is not part of the page attributes
+ */
+#define FIXMAP_PAGE_NOCACHE PAGE_KERNEL_IO_NOCACHE
+
 #include <asm-generic/fixmap.h>
 
 #define __late_set_fixmap(idx, phys, flags) __set_fixmap(idx, phys, flags)
diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h
index 747fc52..9f3e762 100644
--- a/arch/x86/include/asm/mem_encrypt.h
+++ b/arch/x86/include/asm/mem_encrypt.h
@@ -15,12 +15,21 @@
 
 #ifndef __ASSEMBLY__
 
+#include <linux/init.h>
+
 #ifdef CONFIG_AMD_MEM_ENCRYPT
 
 extern unsigned long sme_me_mask;
 
 u8 sme_get_me_loss(void);
 
+void __init sme_early_init(void);
+
+#define __sme_pa(x)		(__pa((x)) | sme_me_mask)
+#define __sme_pa_nodebug(x)	(__pa_nodebug((x)) | sme_me_mask)
+
+#define __sme_va(x)		(__va((x) & ~sme_me_mask))
+
 #else	/* !CONFIG_AMD_MEM_ENCRYPT */
 
 #define sme_me_mask		0UL
@@ -30,6 +39,15 @@ static inline u8 sme_get_me_loss(void)
 	return 0;
 }
 
+static inline void __init sme_early_init(void)
+{
+}
+
+#define __sme_pa		__pa
+#define __sme_pa_nodebug	__pa_nodebug
+
+#define __sme_va		__va
+
 #endif	/* CONFIG_AMD_MEM_ENCRYPT */
 
 #endif	/* __ASSEMBLY__ */
diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h
index 7b5efe2..fda7877 100644
--- a/arch/x86/include/asm/pgtable_types.h
+++ b/arch/x86/include/asm/pgtable_types.h
@@ -3,6 +3,7 @@
 
 #include <linux/const.h>
 #include <asm/page_types.h>
+#include <asm/mem_encrypt.h>
 
 #define FIRST_USER_ADDRESS	0UL
 
@@ -115,9 +116,9 @@
 
 #define _PAGE_PROTNONE	(_AT(pteval_t, 1) << _PAGE_BIT_PROTNONE)
 
-#define _PAGE_TABLE	(_PAGE_PRESENT | _PAGE_RW | _PAGE_USER |	\
+#define __PAGE_TABLE	(_PAGE_PRESENT | _PAGE_RW | _PAGE_USER |	\
 			 _PAGE_ACCESSED | _PAGE_DIRTY)
-#define _KERNPG_TABLE	(_PAGE_PRESENT | _PAGE_RW | _PAGE_ACCESSED |	\
+#define __KERNPG_TABLE	(_PAGE_PRESENT | _PAGE_RW | _PAGE_ACCESSED |	\
 			 _PAGE_DIRTY)
 
 /*
@@ -185,18 +186,30 @@ enum page_cache_mode {
 #define __PAGE_KERNEL_IO		(__PAGE_KERNEL)
 #define __PAGE_KERNEL_IO_NOCACHE	(__PAGE_KERNEL_NOCACHE)
 
-#define PAGE_KERNEL			__pgprot(__PAGE_KERNEL)
-#define PAGE_KERNEL_RO			__pgprot(__PAGE_KERNEL_RO)
-#define PAGE_KERNEL_EXEC		__pgprot(__PAGE_KERNEL_EXEC)
-#define PAGE_KERNEL_RX			__pgprot(__PAGE_KERNEL_RX)
-#define PAGE_KERNEL_NOCACHE		__pgprot(__PAGE_KERNEL_NOCACHE)
-#define PAGE_KERNEL_LARGE		__pgprot(__PAGE_KERNEL_LARGE)
-#define PAGE_KERNEL_LARGE_EXEC		__pgprot(__PAGE_KERNEL_LARGE_EXEC)
-#define PAGE_KERNEL_VSYSCALL		__pgprot(__PAGE_KERNEL_VSYSCALL)
-#define PAGE_KERNEL_VVAR		__pgprot(__PAGE_KERNEL_VVAR)
-
-#define PAGE_KERNEL_IO			__pgprot(__PAGE_KERNEL_IO)
-#define PAGE_KERNEL_IO_NOCACHE		__pgprot(__PAGE_KERNEL_IO_NOCACHE)
+#ifndef __ASSEMBLY__
+
+#define _PAGE_ENC	sme_me_mask
+
+/* Redefine macros to inclue the memory encryption mask */
+#define _PAGE_TABLE	(_PAGE_PRESENT | _PAGE_RW | _PAGE_USER |	\
+			 _PAGE_ACCESSED | _PAGE_DIRTY | _PAGE_ENC)
+#define _KERNPG_TABLE	(_PAGE_PRESENT | _PAGE_RW | _PAGE_ACCESSED |	\
+			 _PAGE_DIRTY | _PAGE_ENC)
+
+#define PAGE_KERNEL		__pgprot(__PAGE_KERNEL | _PAGE_ENC)
+#define PAGE_KERNEL_RO		__pgprot(__PAGE_KERNEL_RO | _PAGE_ENC)
+#define PAGE_KERNEL_EXEC	__pgprot(__PAGE_KERNEL_EXEC | _PAGE_ENC)
+#define PAGE_KERNEL_RX		__pgprot(__PAGE_KERNEL_RX | _PAGE_ENC)
+#define PAGE_KERNEL_NOCACHE	__pgprot(__PAGE_KERNEL_NOCACHE | _PAGE_ENC)
+#define PAGE_KERNEL_LARGE	__pgprot(__PAGE_KERNEL_LARGE | _PAGE_ENC)
+#define PAGE_KERNEL_LARGE_EXEC	__pgprot(__PAGE_KERNEL_LARGE_EXEC | _PAGE_ENC)
+#define PAGE_KERNEL_VSYSCALL	__pgprot(__PAGE_KERNEL_VSYSCALL | _PAGE_ENC)
+#define PAGE_KERNEL_VVAR	__pgprot(__PAGE_KERNEL_VVAR | _PAGE_ENC)
+
+#define PAGE_KERNEL_IO		__pgprot(__PAGE_KERNEL_IO)
+#define PAGE_KERNEL_IO_NOCACHE	__pgprot(__PAGE_KERNEL_IO_NOCACHE)
+
+#endif	/* __ASSEMBLY__ */
 
 /*         xwr */
 #define __P000	PAGE_NONE
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 8d326e8..1fae737 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -22,6 +22,7 @@ struct vm86;
 #include <asm/nops.h>
 #include <asm/special_insns.h>
 #include <asm/fpu/types.h>
+#include <asm/mem_encrypt.h>
 
 #include <linux/personality.h>
 #include <linux/cache.h>
@@ -207,7 +208,7 @@ static inline void native_cpuid(unsigned int *eax, unsigned int *ebx,
 
 static inline void load_cr3(pgd_t *pgdir)
 {
-	write_cr3(__pa(pgdir));
+	write_cr3(__sme_pa(pgdir));
 }
 
 #ifdef CONFIG_X86_32
diff --git a/arch/x86/kernel/espfix_64.c b/arch/x86/kernel/espfix_64.c
index 4d38416..3385377 100644
--- a/arch/x86/kernel/espfix_64.c
+++ b/arch/x86/kernel/espfix_64.c
@@ -193,7 +193,7 @@ void init_espfix_ap(int cpu)
 
 	pte_p = pte_offset_kernel(&pmd, addr);
 	stack_page = page_address(alloc_pages_node(node, GFP_KERNEL, 0));
-	pte = __pte(__pa(stack_page) | (__PAGE_KERNEL_RO & ptemask));
+	pte = __pte(__pa(stack_page) | ((__PAGE_KERNEL_RO | _PAGE_ENC) & ptemask));
 	for (n = 0; n < ESPFIX_PTE_CLONES; n++)
 		set_pte(&pte_p[n*PTE_STRIDE], pte);
 
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index b72fb0b..3516f9f 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -28,6 +28,7 @@
 #include <asm/bootparam_utils.h>
 #include <asm/microcode.h>
 #include <asm/kasan.h>
+#include <asm/mem_encrypt.h>
 
 /*
  * Manage page tables very early on.
@@ -42,7 +43,7 @@ static void __init reset_early_page_tables(void)
 {
 	memset(early_level4_pgt, 0, sizeof(pgd_t)*(PTRS_PER_PGD-1));
 	next_early_pgt = 0;
-	write_cr3(__pa_nodebug(early_level4_pgt));
+	write_cr3(__sme_pa_nodebug(early_level4_pgt));
 }
 
 /* Create a new PMD entry */
@@ -54,7 +55,7 @@ int __init early_make_pgtable(unsigned long address)
 	pmdval_t pmd, *pmd_p;
 
 	/* Invalid address or early pgt is done ?  */
-	if (physaddr >= MAXMEM || read_cr3() != __pa_nodebug(early_level4_pgt))
+	if (physaddr >= MAXMEM || read_cr3() != __sme_pa_nodebug(early_level4_pgt))
 		return -1;
 
 again:
@@ -157,6 +158,11 @@ asmlinkage __visible void __init x86_64_start_kernel(char * real_mode_data)
 
 	clear_page(init_level4_pgt);
 
+	/* Update the early_pmd_flags with the memory encryption mask */
+	early_pmd_flags |= _PAGE_ENC;
+
+	sme_early_init();
+
 	kasan_early_init();
 
 	for (i = 0; i < NUM_EXCEPTION_VECTORS; i++)
diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index 5df831e..0f3ad72 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -95,6 +95,13 @@ startup_64:
 	jnz	bad_address
 
 	/*
+	 * Enable memory encryption (if available). Add the memory encryption
+	 * mask to %rbp to include it in the the page table fixup.
+	 */
+	call	sme_enable
+	addq	sme_me_mask(%rip), %rbp
+
+	/*
 	 * Fixup the physical addresses in the page table
 	 */
 	addq	%rbp, early_level4_pgt + (L4_START_KERNEL*8)(%rip)
@@ -116,7 +123,8 @@ startup_64:
 	movq	%rdi, %rax
 	shrq	$PGDIR_SHIFT, %rax
 
-	leaq	(4096 + _KERNPG_TABLE)(%rbx), %rdx
+	leaq	(4096 + __KERNPG_TABLE)(%rbx), %rdx
+	addq	sme_me_mask(%rip), %rdx		/* Apply mem encryption mask */
 	movq	%rdx, 0(%rbx,%rax,8)
 	movq	%rdx, 8(%rbx,%rax,8)
 
@@ -133,6 +141,7 @@ startup_64:
 	movq	%rdi, %rax
 	shrq	$PMD_SHIFT, %rdi
 	addq	$(__PAGE_KERNEL_LARGE_EXEC & ~_PAGE_GLOBAL), %rax
+	addq	sme_me_mask(%rip), %rax		/* Apply mem encryption mask */
 	leaq	(_end - 1)(%rip), %rcx
 	shrq	$PMD_SHIFT, %rcx
 	subq	%rdi, %rcx
@@ -163,9 +172,19 @@ startup_64:
 	cmp	%r8, %rdi
 	jne	1b
 
-	/* Fixup phys_base */
+	/*
+	 * Fixup phys_base, remove the memory encryption mask from %rbp
+	 * to obtain the true physical address.
+	 */
+	subq	sme_me_mask(%rip), %rbp
 	addq	%rbp, phys_base(%rip)
 
+	/*
+	 * The page tables have been updated with the memory encryption mask,
+	 * so encrypt the kernel if memory encryption is active
+	 */
+	call	sme_encrypt_kernel
+
 	movq	$(early_level4_pgt - __START_KERNEL_map), %rax
 	jmp 1f
 ENTRY(secondary_startup_64)
@@ -189,6 +208,9 @@ ENTRY(secondary_startup_64)
 	movq	$(init_level4_pgt - __START_KERNEL_map), %rax
 1:
 
+	/* Add the memory encryption mask to RAX */
+	addq	sme_me_mask(%rip), %rax
+
 	/* Enable PAE mode and PGE */
 	movl	$(X86_CR4_PAE | X86_CR4_PGE), %ecx
 	movq	%rcx, %cr4
@@ -416,7 +438,7 @@ GLOBAL(name)
 	__INITDATA
 NEXT_PAGE(early_level4_pgt)
 	.fill	511,8,0
-	.quad	level3_kernel_pgt - __START_KERNEL_map + _PAGE_TABLE
+	.quad	level3_kernel_pgt - __START_KERNEL_map + __PAGE_TABLE
 
 NEXT_PAGE(early_dynamic_pgts)
 	.fill	512*EARLY_DYNAMIC_PAGE_TABLES,8,0
@@ -428,15 +450,15 @@ NEXT_PAGE(init_level4_pgt)
 	.fill	512,8,0
 #else
 NEXT_PAGE(init_level4_pgt)
-	.quad   level3_ident_pgt - __START_KERNEL_map + _KERNPG_TABLE
+	.quad   level3_ident_pgt - __START_KERNEL_map + __KERNPG_TABLE
 	.org    init_level4_pgt + L4_PAGE_OFFSET*8, 0
-	.quad   level3_ident_pgt - __START_KERNEL_map + _KERNPG_TABLE
+	.quad   level3_ident_pgt - __START_KERNEL_map + __KERNPG_TABLE
 	.org    init_level4_pgt + L4_START_KERNEL*8, 0
 	/* (2^48-(2*1024*1024*1024))/(2^39) = 511 */
-	.quad   level3_kernel_pgt - __START_KERNEL_map + _PAGE_TABLE
+	.quad   level3_kernel_pgt - __START_KERNEL_map + __PAGE_TABLE
 
 NEXT_PAGE(level3_ident_pgt)
-	.quad	level2_ident_pgt - __START_KERNEL_map + _KERNPG_TABLE
+	.quad	level2_ident_pgt - __START_KERNEL_map + __KERNPG_TABLE
 	.fill	511, 8, 0
 NEXT_PAGE(level2_ident_pgt)
 	/* Since I easily can, map the first 1G.
@@ -448,8 +470,8 @@ NEXT_PAGE(level2_ident_pgt)
 NEXT_PAGE(level3_kernel_pgt)
 	.fill	L3_START_KERNEL,8,0
 	/* (2^48-(2*1024*1024*1024)-((2^39)*511))/(2^30) = 510 */
-	.quad	level2_kernel_pgt - __START_KERNEL_map + _KERNPG_TABLE
-	.quad	level2_fixmap_pgt - __START_KERNEL_map + _PAGE_TABLE
+	.quad	level2_kernel_pgt - __START_KERNEL_map + __KERNPG_TABLE
+	.quad	level2_fixmap_pgt - __START_KERNEL_map + __PAGE_TABLE
 
 NEXT_PAGE(level2_kernel_pgt)
 	/*
@@ -467,7 +489,7 @@ NEXT_PAGE(level2_kernel_pgt)
 
 NEXT_PAGE(level2_fixmap_pgt)
 	.fill	506,8,0
-	.quad	level1_fixmap_pgt - __START_KERNEL_map + _PAGE_TABLE
+	.quad	level1_fixmap_pgt - __START_KERNEL_map + __PAGE_TABLE
 	/* 8MB reserved for vsyscalls + a 2MB hole = 4 + 1 entries */
 	.fill	5,8,0
 
diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
index ba7fbba..eb2faee 100644
--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -103,7 +103,7 @@ static int init_pgtable(struct kimage *image, unsigned long start_pgtable)
 	struct x86_mapping_info info = {
 		.alloc_pgt_page	= alloc_pgt_page,
 		.context	= image,
-		.pmd_flag	= __PAGE_KERNEL_LARGE_EXEC,
+		.pmd_flag	= __PAGE_KERNEL_LARGE_EXEC | _PAGE_ENC,
 	};
 	unsigned long mstart, mend;
 	pgd_t *level4p;
diff --git a/arch/x86/kernel/mem_encrypt.S b/arch/x86/kernel/mem_encrypt.S
index ef7f325..f2e0536 100644
--- a/arch/x86/kernel/mem_encrypt.S
+++ b/arch/x86/kernel/mem_encrypt.S
@@ -14,6 +14,14 @@
 
 	.text
 	.code64
+ENTRY(sme_enable)
+	ret
+ENDPROC(sme_enable)
+
+ENTRY(sme_encrypt_kernel)
+	ret
+ENDPROC(sme_encrypt_kernel)
+
 ENTRY(sme_get_me_loss)
 	xor	%rax, %rax
 	mov	sme_me_loss(%rip), %al
diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile
index f989132..21f8ea3 100644
--- a/arch/x86/mm/Makefile
+++ b/arch/x86/mm/Makefile
@@ -39,3 +39,4 @@ obj-$(CONFIG_NUMA_EMU)		+= numa_emulation.o
 obj-$(CONFIG_X86_INTEL_MPX)	+= mpx.o
 obj-$(CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS) += pkeys.o
 
+obj-$(CONFIG_AMD_MEM_ENCRYPT)	+= mem_encrypt.o
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index 5ce1ed0..9e93a0d 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -23,6 +23,7 @@
 #include <asm/vsyscall.h>		/* emulate_vsyscall		*/
 #include <asm/vm86.h>			/* struct vm86			*/
 #include <asm/mmu_context.h>		/* vma_pkey()			*/
+#include <asm/mem_encrypt.h>		/* __sme_va()			*/
 
 #define CREATE_TRACE_POINTS
 #include <asm/trace/exceptions.h>
@@ -523,7 +524,7 @@ static int bad_address(void *p)
 
 static void dump_pagetable(unsigned long address)
 {
-	pgd_t *base = __va(read_cr3() & PHYSICAL_PAGE_MASK);
+	pgd_t *base = __sme_va(read_cr3() & PHYSICAL_PAGE_MASK);
 	pgd_t *pgd = base + pgd_index(address);
 	pud_t *pud;
 	pmd_t *pmd;
@@ -659,7 +660,7 @@ show_fault_oops(struct pt_regs *regs, unsigned long error_code,
 		pgd_t *pgd;
 		pte_t *pte;
 
-		pgd = __va(read_cr3() & PHYSICAL_PAGE_MASK);
+		pgd = __sme_va(read_cr3() & PHYSICAL_PAGE_MASK);
 		pgd += pgd_index(address);
 
 		pte = lookup_address_in_pgd(pgd, address, &level);
diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index f089491..77dadf5 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -21,6 +21,7 @@
 #include <asm/tlbflush.h>
 #include <asm/pgalloc.h>
 #include <asm/pat.h>
+#include <asm/mem_encrypt.h>
 
 #include "physaddr.h"
 
@@ -424,7 +425,7 @@ static pte_t bm_pte[PAGE_SIZE/sizeof(pte_t)] __page_aligned_bss;
 static inline pmd_t * __init early_ioremap_pmd(unsigned long addr)
 {
 	/* Don't assume we're using swapper_pg_dir at this point */
-	pgd_t *base = __va(read_cr3());
+	pgd_t *base = __sme_va(read_cr3());
 	pgd_t *pgd = &base[pgd_index(addr)];
 	pud_t *pud = pud_offset(pgd, addr);
 	pmd_t *pmd = pmd_offset(pud, addr);
diff --git a/arch/x86/mm/kasan_init_64.c b/arch/x86/mm/kasan_init_64.c
index 1b1110f..7102408 100644
--- a/arch/x86/mm/kasan_init_64.c
+++ b/arch/x86/mm/kasan_init_64.c
@@ -68,7 +68,7 @@ static struct notifier_block kasan_die_notifier = {
 void __init kasan_early_init(void)
 {
 	int i;
-	pteval_t pte_val = __pa_nodebug(kasan_zero_page) | __PAGE_KERNEL;
+	pteval_t pte_val = __pa_nodebug(kasan_zero_page) | __PAGE_KERNEL | _PAGE_ENC;
 	pmdval_t pmd_val = __pa_nodebug(kasan_zero_pte) | _KERNPG_TABLE;
 	pudval_t pud_val = __pa_nodebug(kasan_zero_pmd) | _KERNPG_TABLE;
 
@@ -130,7 +130,7 @@ void __init kasan_init(void)
 	 */
 	memset(kasan_zero_page, 0, PAGE_SIZE);
 	for (i = 0; i < PTRS_PER_PTE; i++) {
-		pte_t pte = __pte(__pa(kasan_zero_page) | __PAGE_KERNEL_RO);
+		pte_t pte = __pte(__pa(kasan_zero_page) | __PAGE_KERNEL_RO | _PAGE_ENC);
 		set_pte(&kasan_zero_pte[i], pte);
 	}
 	/* Flush TLBs again to be sure that write protection applied. */
diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
new file mode 100644
index 0000000..00eb705
--- /dev/null
+++ b/arch/x86/mm/mem_encrypt.c
@@ -0,0 +1,30 @@
+/*
+ * AMD Memory Encryption Support
+ *
+ * Copyright (C) 2016 Advanced Micro Devices, Inc.
+ *
+ * Author: Tom Lendacky <thomas.lendacky@amd.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/init.h>
+#include <linux/mm.h>
+
+#include <asm/mem_encrypt.h>
+
+void __init sme_early_init(void)
+{
+	unsigned int i;
+
+	if (!sme_me_mask)
+		return;
+
+	__supported_pte_mask |= sme_me_mask;
+
+	/* Update the protection map with memory encryption mask */
+	for (i = 0; i < ARRAY_SIZE(protection_map); i++)
+		protection_map[i] = __pgprot(pgprot_val(protection_map[i]) | sme_me_mask);
+}
diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index bbf462f..c055302 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -1976,6 +1976,9 @@ int kernel_map_pages_in_pgd(pgd_t *pgd, u64 pfn, unsigned long address,
 	if (!(page_flags & _PAGE_RW))
 		cpa.mask_clr = __pgprot(_PAGE_RW);
 
+	if (!(page_flags & _PAGE_ENC))
+		cpa.mask_clr = __pgprot(pgprot_val(cpa.mask_clr) | _PAGE_ENC);
+
 	cpa.mask_set = __pgprot(_PAGE_PRESENT | page_flags);
 
 	retval = __change_page_attr_set_clr(&cpa, 0);

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [RFC PATCH v1 07/18] x86: Extend the early_memmap support with additional attrs
  2016-04-26 22:45 [RFC PATCH v1 00/18] x86: Secure Memory Encryption (AMD) Tom Lendacky
                   ` (5 preceding siblings ...)
  2016-04-26 22:46 ` [RFC PATCH v1 06/18] x86: Provide general kernel support for memory encryption Tom Lendacky
@ 2016-04-26 22:46 ` Tom Lendacky
  2016-04-26 22:46 ` [RFC PATCH v1 08/18] x86: Add support for early encryption/decryption of memory Tom Lendacky
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 30+ messages in thread
From: Tom Lendacky @ 2016-04-26 22:46 UTC (permalink / raw)
  To: linux-arch, linux-efi, kvm, linux-doc, x86, linux-kernel,
	kasan-dev, linux-mm, iommu
  Cc: Radim Krčmář,
	Arnd Bergmann, Jonathan Corbet, Matt Fleming, Joerg Roedel,
	Konrad Rzeszutek Wilk, Paolo Bonzini, Ingo Molnar,
	Borislav Petkov, H. Peter Anvin, Andrey Ryabinin,
	Alexander Potapenko, Thomas Gleixner, Dmitry Vyukov

Add to the early_memmap support to be able to specify encrypted and
un-encrypted mappings with and without write-protection. The use of
write-protection is necessary when encrypting data "in place". The
write-protect attribute is considered cacheable for loads, but not
stores. This implies that the hardware will never give the core a
dirty line with this memtype.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
---
 arch/x86/include/asm/fixmap.h        |    9 +++++++++
 arch/x86/include/asm/pgtable_types.h |    8 ++++++++
 arch/x86/mm/ioremap.c                |   28 ++++++++++++++++++++++++++++
 include/asm-generic/early_ioremap.h  |    2 ++
 mm/early_ioremap.c                   |   15 +++++++++++++++
 5 files changed, 62 insertions(+)

diff --git a/arch/x86/include/asm/fixmap.h b/arch/x86/include/asm/fixmap.h
index 83e91f0..4d41878 100644
--- a/arch/x86/include/asm/fixmap.h
+++ b/arch/x86/include/asm/fixmap.h
@@ -160,6 +160,15 @@ static inline void __set_fixmap(enum fixed_addresses idx,
  */
 #define FIXMAP_PAGE_NOCACHE PAGE_KERNEL_IO_NOCACHE
 
+void __init *early_memremap_enc(resource_size_t phys_addr,
+				unsigned long size);
+void __init *early_memremap_enc_wp(resource_size_t phys_addr,
+				   unsigned long size);
+void __init *early_memremap_dec(resource_size_t phys_addr,
+				unsigned long size);
+void __init *early_memremap_dec_wp(resource_size_t phys_addr,
+				   unsigned long size);
+
 #include <asm-generic/fixmap.h>
 
 #define __late_set_fixmap(idx, phys, flags) __set_fixmap(idx, phys, flags)
diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h
index fda7877..6291248 100644
--- a/arch/x86/include/asm/pgtable_types.h
+++ b/arch/x86/include/asm/pgtable_types.h
@@ -154,6 +154,7 @@ enum page_cache_mode {
 
 #define _PAGE_CACHE_MASK	(_PAGE_PAT | _PAGE_PCD | _PAGE_PWT)
 #define _PAGE_NOCACHE		(cachemode2protval(_PAGE_CACHE_MODE_UC))
+#define _PAGE_CACHE_WP		(cachemode2protval(_PAGE_CACHE_MODE_WP))
 
 #define PAGE_NONE	__pgprot(_PAGE_PROTNONE | _PAGE_ACCESSED)
 #define PAGE_SHARED	__pgprot(_PAGE_PRESENT | _PAGE_RW | _PAGE_USER | \
@@ -182,6 +183,7 @@ enum page_cache_mode {
 #define __PAGE_KERNEL_VVAR		(__PAGE_KERNEL_RO | _PAGE_USER)
 #define __PAGE_KERNEL_LARGE		(__PAGE_KERNEL | _PAGE_PSE)
 #define __PAGE_KERNEL_LARGE_EXEC	(__PAGE_KERNEL_EXEC | _PAGE_PSE)
+#define __PAGE_KERNEL_WP		(__PAGE_KERNEL | _PAGE_CACHE_WP)
 
 #define __PAGE_KERNEL_IO		(__PAGE_KERNEL)
 #define __PAGE_KERNEL_IO_NOCACHE	(__PAGE_KERNEL_NOCACHE)
@@ -196,6 +198,12 @@ enum page_cache_mode {
 #define _KERNPG_TABLE	(_PAGE_PRESENT | _PAGE_RW | _PAGE_ACCESSED |	\
 			 _PAGE_DIRTY | _PAGE_ENC)
 
+#define __PAGE_KERNEL_ENC	(__PAGE_KERNEL | _PAGE_ENC)
+#define __PAGE_KERNEL_ENC_WP	(__PAGE_KERNEL_WP | _PAGE_ENC)
+
+#define __PAGE_KERNEL_DEC	(__PAGE_KERNEL)
+#define __PAGE_KERNEL_DEC_WP	(__PAGE_KERNEL_WP)
+
 #define PAGE_KERNEL		__pgprot(__PAGE_KERNEL | _PAGE_ENC)
 #define PAGE_KERNEL_RO		__pgprot(__PAGE_KERNEL_RO | _PAGE_ENC)
 #define PAGE_KERNEL_EXEC	__pgprot(__PAGE_KERNEL_EXEC | _PAGE_ENC)
diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index 77dadf5..14c7ed5 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -420,6 +420,34 @@ void unxlate_dev_mem_ptr(phys_addr_t phys, void *addr)
 	iounmap((void __iomem *)((unsigned long)addr & PAGE_MASK));
 }
 
+/* Remap memory with encryption */
+void __init *early_memremap_enc(resource_size_t phys_addr,
+				unsigned long size)
+{
+	return early_memremap_prot(phys_addr, size, __PAGE_KERNEL_ENC);
+}
+
+/* Remap memory with encryption and write-protected */
+void __init *early_memremap_enc_wp(resource_size_t phys_addr,
+				   unsigned long size)
+{
+	return early_memremap_prot(phys_addr, size, __PAGE_KERNEL_ENC_WP);
+}
+
+/* Remap memory without encryption */
+void __init *early_memremap_dec(resource_size_t phys_addr,
+				unsigned long size)
+{
+	return early_memremap_prot(phys_addr, size, __PAGE_KERNEL_DEC);
+}
+
+/* Remap memory without encryption and write-protected */
+void __init *early_memremap_dec_wp(resource_size_t phys_addr,
+				   unsigned long size)
+{
+	return early_memremap_prot(phys_addr, size, __PAGE_KERNEL_DEC_WP);
+}
+
 static pte_t bm_pte[PAGE_SIZE/sizeof(pte_t)] __page_aligned_bss;
 
 static inline pmd_t * __init early_ioremap_pmd(unsigned long addr)
diff --git a/include/asm-generic/early_ioremap.h b/include/asm-generic/early_ioremap.h
index 734ad4d..2edef8d 100644
--- a/include/asm-generic/early_ioremap.h
+++ b/include/asm-generic/early_ioremap.h
@@ -13,6 +13,8 @@ extern void *early_memremap(resource_size_t phys_addr,
 			    unsigned long size);
 extern void *early_memremap_ro(resource_size_t phys_addr,
 			       unsigned long size);
+extern void *early_memremap_prot(resource_size_t phys_addr,
+				 unsigned long size, unsigned long prot_val);
 extern void early_iounmap(void __iomem *addr, unsigned long size);
 extern void early_memunmap(void *addr, unsigned long size);
 
diff --git a/mm/early_ioremap.c b/mm/early_ioremap.c
index 6d5717b..d71b98b 100644
--- a/mm/early_ioremap.c
+++ b/mm/early_ioremap.c
@@ -226,6 +226,14 @@ early_memremap_ro(resource_size_t phys_addr, unsigned long size)
 }
 #endif
 
+void __init *
+early_memremap_prot(resource_size_t phys_addr, unsigned long size,
+		    unsigned long prot_val)
+{
+	return (__force void *)__early_ioremap(phys_addr, size,
+					       __pgprot(prot_val));
+}
+
 #define MAX_MAP_CHUNK	(NR_FIX_BTMAPS << PAGE_SHIFT)
 
 void __init copy_from_early_mem(void *dest, phys_addr_t src, unsigned long size)
@@ -267,6 +275,13 @@ early_memremap_ro(resource_size_t phys_addr, unsigned long size)
 	return (void *)phys_addr;
 }
 
+void __init *
+early_memremap_prot(resource_size_t phys_addr, unsigned long size,
+		    unsigned long prot_val)
+{
+	return (void *)phys_addr;
+}
+
 void __init early_iounmap(void __iomem *addr, unsigned long size)
 {
 }

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [RFC PATCH v1 08/18] x86: Add support for early encryption/decryption of memory
  2016-04-26 22:45 [RFC PATCH v1 00/18] x86: Secure Memory Encryption (AMD) Tom Lendacky
                   ` (6 preceding siblings ...)
  2016-04-26 22:46 ` [RFC PATCH v1 07/18] x86: Extend the early_memmap support with additional attrs Tom Lendacky
@ 2016-04-26 22:46 ` Tom Lendacky
  2016-04-26 22:46 ` [RFC PATCH v1 09/18] x86: Insure that memory areas are encrypted when possible Tom Lendacky
  2016-04-26 22:47 ` [RFC PATCH v1 10/18] x86/efi: Access EFI related tables in the clear Tom Lendacky
  9 siblings, 0 replies; 30+ messages in thread
From: Tom Lendacky @ 2016-04-26 22:46 UTC (permalink / raw)
  To: linux-arch, linux-efi, kvm, linux-doc, x86, linux-kernel,
	kasan-dev, linux-mm, iommu
  Cc: Radim Krčmář,
	Arnd Bergmann, Jonathan Corbet, Matt Fleming, Joerg Roedel,
	Konrad Rzeszutek Wilk, Paolo Bonzini, Ingo Molnar,
	Borislav Petkov, H. Peter Anvin, Andrey Ryabinin,
	Alexander Potapenko, Thomas Gleixner, Dmitry Vyukov

This adds support to be able to either encrypt or decrypt data during
the early stages of booting the kernel. This does not change the memory
encryption attribute - it is used for ensuring that data present in
either an encrypted or un-encrypted memory area is in the proper state
(for example the initrd will have been loaded by the boot loader and
will not be encrypted, but the memory that it resides in is marked as
encrypted).

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
---
 arch/x86/include/asm/mem_encrypt.h |   15 ++++++
 arch/x86/mm/mem_encrypt.c          |   89 ++++++++++++++++++++++++++++++++++++
 2 files changed, 104 insertions(+)

diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h
index 9f3e762..2785493 100644
--- a/arch/x86/include/asm/mem_encrypt.h
+++ b/arch/x86/include/asm/mem_encrypt.h
@@ -23,6 +23,11 @@ extern unsigned long sme_me_mask;
 
 u8 sme_get_me_loss(void);
 
+void __init sme_early_mem_enc(resource_size_t paddr,
+			      unsigned long size);
+void __init sme_early_mem_dec(resource_size_t paddr,
+			      unsigned long size);
+
 void __init sme_early_init(void);
 
 #define __sme_pa(x)		(__pa((x)) | sme_me_mask)
@@ -39,6 +44,16 @@ static inline u8 sme_get_me_loss(void)
 	return 0;
 }
 
+static inline void __init sme_early_mem_enc(resource_size_t paddr,
+					    unsigned long size)
+{
+}
+
+static inline void __init sme_early_mem_dec(resource_size_t paddr,
+					    unsigned long size)
+{
+}
+
 static inline void __init sme_early_init(void)
 {
 }
diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index 00eb705..5f19ede 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -14,6 +14,95 @@
 #include <linux/mm.h>
 
 #include <asm/mem_encrypt.h>
+#include <asm/tlbflush.h>
+#include <asm/fixmap.h>
+
+/* Buffer used for early in-place encryption by BSP, no locking needed */
+static char me_early_buffer[PAGE_SIZE] __aligned(PAGE_SIZE);
+
+void __init sme_early_mem_enc(resource_size_t paddr, unsigned long size)
+{
+	void *src, *dst;
+	size_t len;
+
+	if (!sme_me_mask)
+		return;
+
+	local_flush_tlb();
+	wbinvd();
+
+	/*
+	 * There are limited number of early mapping slots, so map (at most)
+	 * one page at time.
+	 */
+	while (size) {
+		len = min_t(size_t, sizeof(me_early_buffer), size);
+
+		/* Create a mapping for non-encrypted write-protected memory */
+		src = early_memremap_dec_wp(paddr, len);
+
+		/* Create a mapping for encrypted memory */
+		dst = early_memremap_enc(paddr, len);
+
+		/*
+		 * If a mapping can't be obtained to perform the encryption,
+		 * then encrypted access to that area will end up causing
+		 * a crash.
+		 */
+		BUG_ON(!src || !dst);
+
+		memcpy(me_early_buffer, src, len);
+		memcpy(dst, me_early_buffer, len);
+
+		early_memunmap(dst, len);
+		early_memunmap(src, len);
+
+		paddr += len;
+		size -= len;
+	}
+}
+
+void __init sme_early_mem_dec(resource_size_t paddr, unsigned long size)
+{
+	void *src, *dst;
+	size_t len;
+
+	if (!sme_me_mask)
+		return;
+
+	local_flush_tlb();
+	wbinvd();
+
+	/*
+	 * There are limited number of early mapping slots, so map (at most)
+	 * one page at time.
+	 */
+	while (size) {
+		len = min_t(size_t, sizeof(me_early_buffer), size);
+
+		/* Create a mapping for encrypted write-protected memory */
+		src = early_memremap_enc_wp(paddr, len);
+
+		/* Create a mapping for non-encrypted memory */
+		dst = early_memremap_dec(paddr, len);
+
+		/*
+		 * If a mapping can't be obtained to perform the decryption,
+		 * then un-encrypted access to that area will end up causing
+		 * a crash.
+		 */
+		BUG_ON(!src || !dst);
+
+		memcpy(me_early_buffer, src, len);
+		memcpy(dst, me_early_buffer, len);
+
+		early_memunmap(dst, len);
+		early_memunmap(src, len);
+
+		paddr += len;
+		size -= len;
+	}
+}
 
 void __init sme_early_init(void)
 {

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [RFC PATCH v1 09/18] x86: Insure that memory areas are encrypted when possible
  2016-04-26 22:45 [RFC PATCH v1 00/18] x86: Secure Memory Encryption (AMD) Tom Lendacky
                   ` (7 preceding siblings ...)
  2016-04-26 22:46 ` [RFC PATCH v1 08/18] x86: Add support for early encryption/decryption of memory Tom Lendacky
@ 2016-04-26 22:46 ` Tom Lendacky
  2016-04-26 22:47 ` [RFC PATCH v1 10/18] x86/efi: Access EFI related tables in the clear Tom Lendacky
  9 siblings, 0 replies; 30+ messages in thread
From: Tom Lendacky @ 2016-04-26 22:46 UTC (permalink / raw)
  To: linux-arch, linux-efi, kvm, linux-doc, x86, linux-kernel,
	kasan-dev, linux-mm, iommu
  Cc: Radim Krčmář,
	Arnd Bergmann, Jonathan Corbet, Matt Fleming, Joerg Roedel,
	Konrad Rzeszutek Wilk, Paolo Bonzini, Ingo Molnar,
	Borislav Petkov, H. Peter Anvin, Andrey Ryabinin,
	Alexander Potapenko, Thomas Gleixner, Dmitry Vyukov

Encrypt memory areas in place when possible (e.g. zero page, etc.) so
that special handling isn't needed afterwards.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
---
 arch/x86/kernel/head64.c |   90 +++++++++++++++++++++++++++++++++++++++++++---
 arch/x86/kernel/setup.c  |    8 ++++
 2 files changed, 93 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 3516f9f..ac3a2bf 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -47,12 +47,12 @@ static void __init reset_early_page_tables(void)
 }
 
 /* Create a new PMD entry */
-int __init early_make_pgtable(unsigned long address)
+static int __init __early_make_pgtable(unsigned long address, pmdval_t pmd)
 {
 	unsigned long physaddr = address - __PAGE_OFFSET;
 	pgdval_t pgd, *pgd_p;
 	pudval_t pud, *pud_p;
-	pmdval_t pmd, *pmd_p;
+	pmdval_t *pmd_p;
 
 	/* Invalid address or early pgt is done ?  */
 	if (physaddr >= MAXMEM || read_cr3() != __sme_pa_nodebug(early_level4_pgt))
@@ -94,12 +94,92 @@ again:
 		memset(pmd_p, 0, sizeof(*pmd_p) * PTRS_PER_PMD);
 		*pud_p = (pudval_t)pmd_p - __START_KERNEL_map + phys_base + _KERNPG_TABLE;
 	}
-	pmd = (physaddr & PMD_MASK) + early_pmd_flags;
 	pmd_p[pmd_index(address)] = pmd;
 
 	return 0;
 }
 
+int __init early_make_pgtable(unsigned long address)
+{
+	unsigned long physaddr = address - __PAGE_OFFSET;
+	pmdval_t pmd;
+
+	pmd = (physaddr & PMD_MASK) + early_pmd_flags;
+
+	return __early_make_pgtable(address, pmd);
+}
+
+static void __init create_unencrypted_mapping(void *address, unsigned long size)
+{
+	unsigned long physaddr = (unsigned long)address - __PAGE_OFFSET;
+	pmdval_t pmd_flags, pmd;
+
+	if (!sme_me_mask)
+		return;
+
+	/* Clear the encryption mask from the early_pmd_flags */
+	pmd_flags = early_pmd_flags & ~sme_me_mask;
+
+	do {
+		pmd = (physaddr & PMD_MASK) + pmd_flags;
+		__early_make_pgtable((unsigned long)address, pmd);
+
+		address += PMD_SIZE;
+		physaddr += PMD_SIZE;
+		size = (size < PMD_SIZE) ? 0 : size - PMD_SIZE;
+	} while (size);
+}
+
+static void __init __clear_mapping(unsigned long address)
+{
+	unsigned long physaddr = address - __PAGE_OFFSET;
+	pgdval_t pgd, *pgd_p;
+	pudval_t pud, *pud_p;
+	pmdval_t *pmd_p;
+
+	/* Invalid address or early pgt is done ?  */
+	if (physaddr >= MAXMEM ||
+	    read_cr3() != __sme_pa_nodebug(early_level4_pgt))
+		return;
+
+	pgd_p = &early_level4_pgt[pgd_index(address)].pgd;
+	pgd = *pgd_p;
+
+	if (!pgd)
+		return;
+
+	/*
+	 * The use of __START_KERNEL_map rather than __PAGE_OFFSET here matches
+	 * __early_make_pgtable where the entry was created.
+	 */
+	pud_p = (pudval_t *)((pgd & PTE_PFN_MASK) + __START_KERNEL_map - phys_base);
+	pud_p += pud_index(address);
+	pud = *pud_p;
+
+	if (!pud)
+		return;
+
+	pmd_p = (pmdval_t *)((pud & PTE_PFN_MASK) + __START_KERNEL_map - phys_base);
+	pmd_p[pmd_index(address)] = 0;
+}
+
+static void __init clear_mapping(void *address, unsigned long size)
+{
+	do {
+		__clear_mapping((unsigned long)address);
+
+		address += PMD_SIZE;
+		size = (size < PMD_SIZE) ? 0 : size - PMD_SIZE;
+	} while (size);
+}
+
+static void __init sme_memcpy(void *dst, void *src, unsigned long size)
+{
+	create_unencrypted_mapping(src, size);
+	memcpy(dst, src, size);
+	clear_mapping(src, size);
+}
+
 /* Don't add a printk in there. printk relies on the PDA which is not initialized 
    yet. */
 static void __init clear_bss(void)
@@ -122,12 +202,12 @@ static void __init copy_bootdata(char *real_mode_data)
 	char * command_line;
 	unsigned long cmd_line_ptr;
 
-	memcpy(&boot_params, real_mode_data, sizeof boot_params);
+	sme_memcpy(&boot_params, real_mode_data, sizeof boot_params);
 	sanitize_boot_params(&boot_params);
 	cmd_line_ptr = get_cmd_line_ptr();
 	if (cmd_line_ptr) {
 		command_line = __va(cmd_line_ptr);
-		memcpy(boot_command_line, command_line, COMMAND_LINE_SIZE);
+		sme_memcpy(boot_command_line, command_line, COMMAND_LINE_SIZE);
 	}
 }
 
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 2367ae0..1d29cf9 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -113,6 +113,7 @@
 #include <asm/prom.h>
 #include <asm/microcode.h>
 #include <asm/mmu_context.h>
+#include <asm/mem_encrypt.h>
 
 /*
  * max_low_pfn_mapped: highest direct mapped pfn under 4GB
@@ -375,6 +376,13 @@ static void __init reserve_initrd(void)
 	    !ramdisk_image || !ramdisk_size)
 		return;		/* No initrd provided by bootloader */
 
+	/*
+	 * This memory is marked encrypted by the kernel but the ramdisk
+	 * was loaded in the clear by the bootloader, so make sure that
+	 * the ramdisk image is encrypted.
+	 */
+	sme_early_mem_enc(ramdisk_image, ramdisk_end - ramdisk_image);
+
 	initrd_start = 0;
 
 	mapped_size = memblock_mem_size(max_pfn_mapped);

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [RFC PATCH v1 10/18] x86/efi: Access EFI related tables in the clear
  2016-04-26 22:45 [RFC PATCH v1 00/18] x86: Secure Memory Encryption (AMD) Tom Lendacky
                   ` (8 preceding siblings ...)
  2016-04-26 22:46 ` [RFC PATCH v1 09/18] x86: Insure that memory areas are encrypted when possible Tom Lendacky
@ 2016-04-26 22:47 ` Tom Lendacky
  9 siblings, 0 replies; 30+ messages in thread
From: Tom Lendacky @ 2016-04-26 22:47 UTC (permalink / raw)
  To: linux-arch, linux-efi, kvm, linux-doc, x86, linux-kernel,
	kasan-dev, linux-mm, iommu
  Cc: Radim Krčmář,
	Arnd Bergmann, Jonathan Corbet, Matt Fleming, Joerg Roedel,
	Konrad Rzeszutek Wilk, Paolo Bonzini, Ingo Molnar,
	Borislav Petkov, H. Peter Anvin, Andrey Ryabinin,
	Alexander Potapenko, Thomas Gleixner, Dmitry Vyukov

The EFI tables are not encrypted and need to be accessed as such. Be sure
to memmap them without the encryption attribute set. For EFI support that
lives outside of the arch/x86 tree, create a routine that uses the __weak
attribute so that it can be overridden by an architecture specific routine.

When freeing boot services related memory, since it has been mapped as
un-encrypted, be sure to change the mapping to encrypted for future use.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
---
 arch/x86/include/asm/cacheflush.h  |    3 +
 arch/x86/include/asm/mem_encrypt.h |   22 +++++++++++
 arch/x86/kernel/setup.c            |    6 +--
 arch/x86/mm/mem_encrypt.c          |   56 +++++++++++++++++++++++++++
 arch/x86/mm/pageattr.c             |   75 ++++++++++++++++++++++++++++++++++++
 arch/x86/platform/efi/efi.c        |   26 +++++++-----
 arch/x86/platform/efi/efi_64.c     |    9 +++-
 arch/x86/platform/efi/quirks.c     |   12 +++++-
 drivers/firmware/efi/efi.c         |   18 +++++++--
 drivers/firmware/efi/esrt.c        |   12 +++---
 include/linux/efi.h                |    3 +
 11 files changed, 212 insertions(+), 30 deletions(-)

diff --git a/arch/x86/include/asm/cacheflush.h b/arch/x86/include/asm/cacheflush.h
index 61518cf..bfb08e5 100644
--- a/arch/x86/include/asm/cacheflush.h
+++ b/arch/x86/include/asm/cacheflush.h
@@ -13,6 +13,7 @@
  * Executability : eXeutable, NoteXecutable
  * Read/Write    : ReadOnly, ReadWrite
  * Presence      : NotPresent
+ * Encryption    : ENCrypted, DECrypted
  *
  * Within a category, the attributes are mutually exclusive.
  *
@@ -48,6 +49,8 @@ int set_memory_ro(unsigned long addr, int numpages);
 int set_memory_rw(unsigned long addr, int numpages);
 int set_memory_np(unsigned long addr, int numpages);
 int set_memory_4k(unsigned long addr, int numpages);
+int set_memory_enc(unsigned long addr, int numpages);
+int set_memory_dec(unsigned long addr, int numpages);
 
 int set_memory_array_uc(unsigned long *addr, int addrinarray);
 int set_memory_array_wc(unsigned long *addr, int addrinarray);
diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h
index 2785493..42868f5 100644
--- a/arch/x86/include/asm/mem_encrypt.h
+++ b/arch/x86/include/asm/mem_encrypt.h
@@ -23,13 +23,23 @@ extern unsigned long sme_me_mask;
 
 u8 sme_get_me_loss(void);
 
+int sme_set_mem_enc(void *vaddr, unsigned long size);
+int sme_set_mem_dec(void *vaddr, unsigned long size);
+
 void __init sme_early_mem_enc(resource_size_t paddr,
 			      unsigned long size);
 void __init sme_early_mem_dec(resource_size_t paddr,
 			      unsigned long size);
 
+void __init *sme_early_memremap(resource_size_t paddr,
+				unsigned long size);
+
 void __init sme_early_init(void);
 
+/* Architecture __weak replacement functions */
+void __init *efi_me_early_memremap(resource_size_t paddr,
+				   unsigned long size);
+
 #define __sme_pa(x)		(__pa((x)) | sme_me_mask)
 #define __sme_pa_nodebug(x)	(__pa_nodebug((x)) | sme_me_mask)
 
@@ -44,6 +54,16 @@ static inline u8 sme_get_me_loss(void)
 	return 0;
 }
 
+static inline int sme_set_mem_enc(void *vaddr, unsigned long size)
+{
+	return 0;
+}
+
+static inline int sme_set_mem_dec(void *vaddr, unsigned long size)
+{
+	return 0;
+}
+
 static inline void __init sme_early_mem_enc(resource_size_t paddr,
 					    unsigned long size)
 {
@@ -63,6 +83,8 @@ static inline void __init sme_early_init(void)
 
 #define __sme_va		__va
 
+#define sme_early_memremap	early_memremap
+
 #endif	/* CONFIG_AMD_MEM_ENCRYPT */
 
 #endif	/* __ASSEMBLY__ */
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 1d29cf9..2e460fb 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -424,7 +424,7 @@ static void __init parse_setup_data(void)
 	while (pa_data) {
 		u32 data_len, data_type;
 
-		data = early_memremap(pa_data, sizeof(*data));
+		data = sme_early_memremap(pa_data, sizeof(*data));
 		data_len = data->len + sizeof(struct setup_data);
 		data_type = data->type;
 		pa_next = data->next;
@@ -457,7 +457,7 @@ static void __init e820_reserve_setup_data(void)
 		return;
 
 	while (pa_data) {
-		data = early_memremap(pa_data, sizeof(*data));
+		data = sme_early_memremap(pa_data, sizeof(*data));
 		e820_update_range(pa_data, sizeof(*data)+data->len,
 			 E820_RAM, E820_RESERVED_KERN);
 		pa_data = data->next;
@@ -477,7 +477,7 @@ static void __init memblock_x86_reserve_range_setup_data(void)
 
 	pa_data = boot_params.hdr.setup_data;
 	while (pa_data) {
-		data = early_memremap(pa_data, sizeof(*data));
+		data = sme_early_memremap(pa_data, sizeof(*data));
 		memblock_reserve(pa_data, sizeof(*data) + data->len);
 		pa_data = data->next;
 		early_memunmap(data, sizeof(*data));
diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index 5f19ede..7d56d1b 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -14,12 +14,55 @@
 #include <linux/mm.h>
 
 #include <asm/mem_encrypt.h>
+#include <asm/cacheflush.h>
 #include <asm/tlbflush.h>
 #include <asm/fixmap.h>
 
 /* Buffer used for early in-place encryption by BSP, no locking needed */
 static char me_early_buffer[PAGE_SIZE] __aligned(PAGE_SIZE);
 
+int sme_set_mem_enc(void *vaddr, unsigned long size)
+{
+	unsigned long addr, numpages;
+
+	if (!sme_me_mask)
+		return 0;
+
+	addr = (unsigned long)vaddr & PAGE_MASK;
+	numpages = PAGE_ALIGN(size) >> PAGE_SHIFT;
+
+	/*
+	 * The set_memory_xxx functions take an integer for numpages, make
+	 * sure it doesn't exceed that.
+	 */
+	if (numpages > INT_MAX)
+		return -EINVAL;
+
+	return set_memory_enc(addr, numpages);
+}
+EXPORT_SYMBOL_GPL(sme_set_mem_enc);
+
+int sme_set_mem_dec(void *vaddr, unsigned long size)
+{
+	unsigned long addr, numpages;
+
+	if (!sme_me_mask)
+		return 0;
+
+	addr = (unsigned long)vaddr & PAGE_MASK;
+	numpages = PAGE_ALIGN(size) >> PAGE_SHIFT;
+
+	/*
+	 * The set_memory_xxx functions take an integer for numpages, make
+	 * sure it doesn't exceed that.
+	 */
+	if (numpages > INT_MAX)
+		return -EINVAL;
+
+	return set_memory_dec(addr, numpages);
+}
+EXPORT_SYMBOL_GPL(sme_set_mem_dec);
+
 void __init sme_early_mem_enc(resource_size_t paddr, unsigned long size)
 {
 	void *src, *dst;
@@ -104,6 +147,12 @@ void __init sme_early_mem_dec(resource_size_t paddr, unsigned long size)
 	}
 }
 
+void __init *sme_early_memremap(resource_size_t paddr,
+				unsigned long size)
+{
+	return early_memremap_dec(paddr, size);
+}
+
 void __init sme_early_init(void)
 {
 	unsigned int i;
@@ -117,3 +166,10 @@ void __init sme_early_init(void)
 	for (i = 0; i < ARRAY_SIZE(protection_map); i++)
 		protection_map[i] = __pgprot(pgprot_val(protection_map[i]) | sme_me_mask);
 }
+
+/* Architecture __weak replacement functions */
+void __init *efi_me_early_memremap(resource_size_t paddr,
+				   unsigned long size)
+{
+	return sme_early_memremap(paddr, size);
+}
diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index c055302..0384fb3 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -1731,6 +1731,81 @@ int set_memory_4k(unsigned long addr, int numpages)
 					__pgprot(0), 1, 0, NULL);
 }
 
+static int __set_memory_enc_dec(struct cpa_data *cpa)
+{
+	unsigned long addr;
+	int numpages;
+	int ret;
+
+	if (*cpa->vaddr & ~PAGE_MASK) {
+		*cpa->vaddr &= PAGE_MASK;
+
+		/* People should not be passing in unaligned addresses */
+		WARN_ON_ONCE(1);
+	}
+
+	addr = *cpa->vaddr;
+	numpages = cpa->numpages;
+
+	/* Must avoid aliasing mappings in the highmem code */
+	kmap_flush_unused();
+	vm_unmap_aliases();
+
+	ret = __change_page_attr_set_clr(cpa, 1);
+
+	/* Check whether we really changed something */
+	if (!(cpa->flags & CPA_FLUSHTLB))
+		goto out;
+
+	/*
+	 * On success we use CLFLUSH, when the CPU supports it to
+	 * avoid the WBINVD.
+	 */
+	if (!ret && static_cpu_has(X86_FEATURE_CLFLUSH))
+		cpa_flush_range(addr, numpages, 1);
+	else
+		cpa_flush_all(1);
+
+out:
+	return ret;
+}
+
+int set_memory_enc(unsigned long addr, int numpages)
+{
+	struct cpa_data cpa;
+
+	if (!sme_me_mask)
+		return 0;
+
+	memset(&cpa, 0, sizeof(cpa));
+	cpa.vaddr = &addr;
+	cpa.numpages = numpages;
+	cpa.mask_set = __pgprot(_PAGE_ENC);
+	cpa.mask_clr = __pgprot(0);
+	cpa.pgd = init_mm.pgd;
+
+	return __set_memory_enc_dec(&cpa);
+}
+EXPORT_SYMBOL(set_memory_enc);
+
+int set_memory_dec(unsigned long addr, int numpages)
+{
+	struct cpa_data cpa;
+
+	if (!sme_me_mask)
+		return 0;
+
+	memset(&cpa, 0, sizeof(cpa));
+	cpa.vaddr = &addr;
+	cpa.numpages = numpages;
+	cpa.mask_set = __pgprot(0);
+	cpa.mask_clr = __pgprot(_PAGE_ENC);
+	cpa.pgd = init_mm.pgd;
+
+	return __set_memory_enc_dec(&cpa);
+}
+EXPORT_SYMBOL(set_memory_dec);
+
 int set_pages_uc(struct page *page, int numpages)
 {
 	unsigned long addr = (unsigned long)page_address(page);
diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
index 994a7df8..871b213 100644
--- a/arch/x86/platform/efi/efi.c
+++ b/arch/x86/platform/efi/efi.c
@@ -53,6 +53,7 @@
 #include <asm/x86_init.h>
 #include <asm/rtc.h>
 #include <asm/uv/uv.h>
+#include <asm/mem_encrypt.h>
 
 #define EFI_DEBUG
 
@@ -261,12 +262,12 @@ static int __init efi_systab_init(void *phys)
 		u64 tmp = 0;
 
 		if (efi_setup) {
-			data = early_memremap(efi_setup, sizeof(*data));
+			data = sme_early_memremap(efi_setup, sizeof(*data));
 			if (!data)
 				return -ENOMEM;
 		}
-		systab64 = early_memremap((unsigned long)phys,
-					 sizeof(*systab64));
+		systab64 = sme_early_memremap((unsigned long)phys,
+					      sizeof(*systab64));
 		if (systab64 == NULL) {
 			pr_err("Couldn't map the system table!\n");
 			if (data)
@@ -314,8 +315,8 @@ static int __init efi_systab_init(void *phys)
 	} else {
 		efi_system_table_32_t *systab32;
 
-		systab32 = early_memremap((unsigned long)phys,
-					 sizeof(*systab32));
+		systab32 = sme_early_memremap((unsigned long)phys,
+					      sizeof(*systab32));
 		if (systab32 == NULL) {
 			pr_err("Couldn't map the system table!\n");
 			return -ENOMEM;
@@ -361,8 +362,8 @@ static int __init efi_runtime_init32(void)
 {
 	efi_runtime_services_32_t *runtime;
 
-	runtime = early_memremap((unsigned long)efi.systab->runtime,
-			sizeof(efi_runtime_services_32_t));
+	runtime = sme_early_memremap((unsigned long)efi.systab->runtime,
+				     sizeof(efi_runtime_services_32_t));
 	if (!runtime) {
 		pr_err("Could not map the runtime service table!\n");
 		return -ENOMEM;
@@ -385,8 +386,8 @@ static int __init efi_runtime_init64(void)
 {
 	efi_runtime_services_64_t *runtime;
 
-	runtime = early_memremap((unsigned long)efi.systab->runtime,
-			sizeof(efi_runtime_services_64_t));
+	runtime = sme_early_memremap((unsigned long)efi.systab->runtime,
+				     sizeof(efi_runtime_services_64_t));
 	if (!runtime) {
 		pr_err("Could not map the runtime service table!\n");
 		return -ENOMEM;
@@ -444,8 +445,8 @@ static int __init efi_memmap_init(void)
 		return 0;
 
 	/* Map the EFI memory map */
-	memmap.map = early_memremap((unsigned long)memmap.phys_map,
-				   memmap.nr_map * memmap.desc_size);
+	memmap.map = sme_early_memremap((unsigned long)memmap.phys_map,
+					memmap.nr_map * memmap.desc_size);
 	if (memmap.map == NULL) {
 		pr_err("Could not map the memory map!\n");
 		return -ENOMEM;
@@ -490,7 +491,7 @@ void __init efi_init(void)
 	/*
 	 * Show what we know for posterity
 	 */
-	c16 = tmp = early_memremap(efi.systab->fw_vendor, 2);
+	c16 = tmp = sme_early_memremap(efi.systab->fw_vendor, 2);
 	if (c16) {
 		for (i = 0; i < sizeof(vendor) - 1 && *c16; ++i)
 			vendor[i] = *c16++;
@@ -690,6 +691,7 @@ static void *realloc_pages(void *old_memmap, int old_shift)
 	ret = (void *)__get_free_pages(GFP_KERNEL, old_shift + 1);
 	if (!ret)
 		goto out;
+	sme_set_mem_dec(ret, PAGE_SIZE << (old_shift + 1));
 
 	/*
 	 * A first-time allocation doesn't have anything to copy.
diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c
index 49e4dd4..834a992 100644
--- a/arch/x86/platform/efi/efi_64.c
+++ b/arch/x86/platform/efi/efi_64.c
@@ -223,7 +223,7 @@ int __init efi_setup_page_tables(unsigned long pa_memmap, unsigned num_pages)
 	if (efi_enabled(EFI_OLD_MEMMAP))
 		return 0;
 
-	efi_scratch.efi_pgt = (pgd_t *)__pa(efi_pgd);
+	efi_scratch.efi_pgt = (pgd_t *)__sme_pa(efi_pgd);
 	pgd = efi_pgd;
 
 	/*
@@ -262,7 +262,8 @@ int __init efi_setup_page_tables(unsigned long pa_memmap, unsigned num_pages)
 		pfn = md->phys_addr >> PAGE_SHIFT;
 		npages = md->num_pages;
 
-		if (kernel_map_pages_in_pgd(pgd, pfn, md->phys_addr, npages, _PAGE_RW)) {
+		if (kernel_map_pages_in_pgd(pgd, pfn, md->phys_addr, npages,
+					    _PAGE_RW | _PAGE_ENC)) {
 			pr_err("Failed to map 1:1 memory\n");
 			return 1;
 		}
@@ -272,6 +273,7 @@ int __init efi_setup_page_tables(unsigned long pa_memmap, unsigned num_pages)
 	if (!page)
 		panic("Unable to allocate EFI runtime stack < 4GB\n");
 
+	sme_set_mem_dec(page_address(page), PAGE_SIZE);
 	efi_scratch.phys_stack = virt_to_phys(page_address(page));
 	efi_scratch.phys_stack += PAGE_SIZE; /* stack grows down */
 
@@ -279,7 +281,8 @@ int __init efi_setup_page_tables(unsigned long pa_memmap, unsigned num_pages)
 	text = __pa(_text);
 	pfn = text >> PAGE_SHIFT;
 
-	if (kernel_map_pages_in_pgd(pgd, pfn, text, npages, _PAGE_RW)) {
+	if (kernel_map_pages_in_pgd(pgd, pfn, text, npages,
+				    _PAGE_RW | _PAGE_ENC)) {
 		pr_err("Failed to map kernel text 1:1\n");
 		return 1;
 	}
diff --git a/arch/x86/platform/efi/quirks.c b/arch/x86/platform/efi/quirks.c
index ab50ada..dde4fb6b 100644
--- a/arch/x86/platform/efi/quirks.c
+++ b/arch/x86/platform/efi/quirks.c
@@ -13,6 +13,7 @@
 #include <linux/dmi.h>
 #include <asm/efi.h>
 #include <asm/uv/uv.h>
+#include <asm/mem_encrypt.h>
 
 #define EFI_MIN_RESERVE 5120
 
@@ -265,6 +266,13 @@ void __init efi_free_boot_services(void)
 		if (md->attribute & EFI_MEMORY_RUNTIME)
 			continue;
 
+		/*
+		 * Change the mapping to encrypted memory before freeing.
+		 * This insures any future allocations of this mapped area
+		 * are used encrypted.
+		 */
+		sme_set_mem_enc(__va(start), size);
+
 		free_bootmem_late(start, size);
 	}
 
@@ -292,7 +300,7 @@ int __init efi_reuse_config(u64 tables, int nr_tables)
 	if (!efi_enabled(EFI_64BIT))
 		return 0;
 
-	data = early_memremap(efi_setup, sizeof(*data));
+	data = sme_early_memremap(efi_setup, sizeof(*data));
 	if (!data) {
 		ret = -ENOMEM;
 		goto out;
@@ -303,7 +311,7 @@ int __init efi_reuse_config(u64 tables, int nr_tables)
 
 	sz = sizeof(efi_config_table_64_t);
 
-	p = tablep = early_memremap(tables, nr_tables * sz);
+	p = tablep = sme_early_memremap(tables, nr_tables * sz);
 	if (!p) {
 		pr_err("Could not map Configuration table!\n");
 		ret = -ENOMEM;
diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
index 3a69ed5..25010c7 100644
--- a/drivers/firmware/efi/efi.c
+++ b/drivers/firmware/efi/efi.c
@@ -76,6 +76,16 @@ static int __init parse_efi_cmdline(char *str)
 }
 early_param("efi", parse_efi_cmdline);
 
+/*
+ * If memory encryption is supported, then an override to this function
+ * will be provided.
+ */
+void __weak __init *efi_me_early_memremap(resource_size_t phys_addr,
+					  unsigned long size)
+{
+	return early_memremap(phys_addr, size);
+}
+
 struct kobject *efi_kobj;
 
 /*
@@ -289,9 +299,9 @@ int __init efi_mem_desc_lookup(u64 phys_addr, efi_memory_desc_t *out_md)
 		 * So just always get our own virtual map on the CPU.
 		 *
 		 */
-		md = early_memremap(p, sizeof (*md));
+		md = efi_me_early_memremap(p, sizeof (*md));
 		if (!md) {
-			pr_err_once("early_memremap(%pa, %zu) failed.\n",
+			pr_err_once("efi_me_early_memremap(%pa, %zu) failed.\n",
 				    &p, sizeof (*md));
 			return -ENOMEM;
 		}
@@ -431,8 +441,8 @@ int __init efi_config_init(efi_config_table_type_t *arch_tables)
 	/*
 	 * Let's see what config tables the firmware passed to us.
 	 */
-	config_tables = early_memremap(efi.systab->tables,
-				       efi.systab->nr_tables * sz);
+	config_tables = efi_me_early_memremap(efi.systab->tables,
+					      efi.systab->nr_tables * sz);
 	if (config_tables == NULL) {
 		pr_err("Could not map Configuration table!\n");
 		return -ENOMEM;
diff --git a/drivers/firmware/efi/esrt.c b/drivers/firmware/efi/esrt.c
index 75feb3f..7a96bc6 100644
--- a/drivers/firmware/efi/esrt.c
+++ b/drivers/firmware/efi/esrt.c
@@ -273,10 +273,10 @@ void __init efi_esrt_init(void)
 		return;
 	}
 
-	va = early_memremap(efi.esrt, size);
+	va = efi_me_early_memremap(efi.esrt, size);
 	if (!va) {
-		pr_err("early_memremap(%p, %zu) failed.\n", (void *)efi.esrt,
-		       size);
+		pr_err("efi_me_early_memremap(%p, %zu) failed.\n",
+		       (void *)efi.esrt, size);
 		return;
 	}
 
@@ -323,10 +323,10 @@ void __init efi_esrt_init(void)
 	/* remap it with our (plausible) new pages */
 	early_memunmap(va, size);
 	size += entries_size;
-	va = early_memremap(efi.esrt, size);
+	va = efi_me_early_memremap(efi.esrt, size);
 	if (!va) {
-		pr_err("early_memremap(%p, %zu) failed.\n", (void *)efi.esrt,
-		       size);
+		pr_err("efi_me_early_memremap(%p, %zu) failed.\n",
+		       (void *)efi.esrt, size);
 		return;
 	}
 
diff --git a/include/linux/efi.h b/include/linux/efi.h
index 1626474..557c774 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -957,6 +957,9 @@ extern void __init efi_fake_memmap(void);
 static inline void efi_fake_memmap(void) { }
 #endif
 
+extern void __weak __init *efi_me_early_memremap(resource_size_t phys_addr,
+						 unsigned long size);
+
 /* Iterate through an efi_memory_map */
 #define for_each_efi_memory_desc(m, md)					   \
 	for ((md) = (m)->map;						   \

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH v1 10/18] x86/efi: Access EFI related tables in the clear
  2016-06-16 14:38           ` Tom Lendacky
@ 2016-06-17 15:51             ` Matt Fleming
  0 siblings, 0 replies; 30+ messages in thread
From: Matt Fleming @ 2016-06-17 15:51 UTC (permalink / raw)
  To: Tom Lendacky
  Cc: linux-arch, linux-efi, kvm, linux-doc, x86, linux-kernel,
	kasan-dev, linux-mm, iommu, Radim Krčmář,
	Arnd Bergmann, Jonathan Corbet, Joerg Roedel,
	Konrad Rzeszutek Wilk, Paolo Bonzini, Ingo Molnar,
	Borislav Petkov, H. Peter Anvin, Andrey Ryabinin,
	Alexander Potapenko, Thomas Gleixner, Dmitry Vyukov

On Thu, 16 Jun, at 09:38:31AM, Tom Lendacky wrote:
> 
> Ok, I think this was happening before the commit to build our own
> EFI page table structures:
> 
> commit 67a9108ed ("x86/efi: Build our own page table structures")
> 
> Before this commit the boot services ended up mapped into the kernel
> page table entries as un-encrypted during efi_map_regions() and I needed
> to change those entries back to encrypted. With your change above,
> this appears to no longer be needed.

Great news! Things are as they should be ;)

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH v1 10/18] x86/efi: Access EFI related tables in the clear
  2016-06-15 13:17         ` Tom Lendacky
@ 2016-06-16 14:38           ` Tom Lendacky
  2016-06-17 15:51             ` Matt Fleming
  0 siblings, 1 reply; 30+ messages in thread
From: Tom Lendacky @ 2016-06-16 14:38 UTC (permalink / raw)
  To: Matt Fleming
  Cc: linux-arch, linux-efi, kvm, linux-doc, x86, linux-kernel,
	kasan-dev, linux-mm, iommu, Radim Krčmář,
	Arnd Bergmann, Jonathan Corbet, Joerg Roedel,
	Konrad Rzeszutek Wilk, Paolo Bonzini, Ingo Molnar,
	Borislav Petkov, H. Peter Anvin, Andrey Ryabinin,
	Alexander Potapenko, Thomas Gleixner, Dmitry Vyukov

On 06/15/2016 08:17 AM, Tom Lendacky wrote:
> On 06/13/2016 08:51 AM, Matt Fleming wrote:
>> On Thu, 09 Jun, at 01:33:30PM, Tom Lendacky wrote:
>>>

[...]

>>
>>> I'll look further into this, but I saw that this area of virtual memory
>>> was mapped un-encrypted and after freeing the boot services the
>>> mappings were somehow reused as un-encrypted for DMA which assumes
>>> (unless using swiotlb) encrypted. This resulted in DMA data being
>>> transferred in as encrypted and then accessed un-encrypted.
>>
>> That the mappings were re-used isn't a surprise.
>>
>> efi_free_boot_services() lifts the reservation that was put in place
>> during efi_reserve_boot_services() and releases the pages to the
>> kernel's memory allocators.
>>
>> What is surprising is that they were marked unencrypted at all.
>> There's nothing special about these pages as far as the __va() region
>> is concerned.
> 
> Right, let me keep looking into this to see if I can pin down what
> was (or is) happening.

Ok, I think this was happening before the commit to build our own
EFI page table structures:

commit 67a9108ed ("x86/efi: Build our own page table structures")

Before this commit the boot services ended up mapped into the kernel
page table entries as un-encrypted during efi_map_regions() and I needed
to change those entries back to encrypted. With your change above,
this appears to no longer be needed.

Thanks,
Tom

> 
> Thanks,
> Tom
> 
>>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH v1 10/18] x86/efi: Access EFI related tables in the clear
  2016-06-13 13:51       ` Matt Fleming
@ 2016-06-15 13:17         ` Tom Lendacky
  2016-06-16 14:38           ` Tom Lendacky
  0 siblings, 1 reply; 30+ messages in thread
From: Tom Lendacky @ 2016-06-15 13:17 UTC (permalink / raw)
  To: Matt Fleming
  Cc: linux-arch, linux-efi, kvm, linux-doc, x86, linux-kernel,
	kasan-dev, linux-mm, iommu, Radim Krčmář,
	Arnd Bergmann, Jonathan Corbet, Joerg Roedel,
	Konrad Rzeszutek Wilk, Paolo Bonzini, Ingo Molnar,
	Borislav Petkov, H. Peter Anvin, Andrey Ryabinin,
	Alexander Potapenko, Thomas Gleixner, Dmitry Vyukov

On 06/13/2016 08:51 AM, Matt Fleming wrote:
> On Thu, 09 Jun, at 01:33:30PM, Tom Lendacky wrote:
>>
>> I was trying to play it safe here, but as you say, the firmware should
>> be using our page tables so we can get rid of this call. The problem
>> will actually be if we transition to a 32-bit efi. The encryption bit
>> will be lost in cr3 and so the pgd table will have to be un-encrypted.
>> The entries in the pgd can have the encryption bit set so I would only
>> need to worry about the pgd itself. I'll have to update the
>> efi_alloc_page_tables routine.
>  
> Interesting, I hadn't expected 32-bit EFI to be an option for
> platforms with the SME technology. I'd assumed we could just ignore
> that.

We may be able to do that.

> 
> Are you saying that the encryption bit isn't supported in 32-bit
> compatibility mode? We don't do a "full" switch to 32-bit protected
> mode when in mixed mode, just load a 32-bit code segment descriptor.
> The page tables are not modified at all.

The encryption bit is supported in 32-bit compatibility mode and since
we're not doing the "full" switch the cr3 register will remain as a
64-bit register so we can leave the pgd table encrypted.

> 
>> The encryption bit in the cr3 register will indicate if the pgd table
>> is encrypted or not. Based on my comment above about the pgd having
>> to be un-encrypted in case we have to transition to 32-bit efi, this
>> can be removed.
>  
> I'm not (yet) sure that the pgd needs to be unencrypted for 32-bit EFI
> when running a 64-bit kernel. In the AMD Programmer's Manual, Section
> 7.10.3 Operating Modes seems to indicate that running encrypted should
> work fine.
> 
>> I'll look into this a bit more. From looking at it I don't want the
>> _PAGE_ENC bit set for the memmap unless it gets re-allocated (which
>> I missed in these patches). Let me see what I can do with this.
>  
> I don't understand your comment about re-allocating the memmap.
> 
> The kernel builds its own EFI memory map at runtime, initially based
> on the memory map provided by the firmware. We always allocate a new
> memory map.

Sorry, I mis-interpreted the efi_map_regions function/loop and see
that the memmap is always allocated by the kernel.

> 
> In efi_setup_page_tables() we're building our own page tables, which
> should be encrypted, and mapping EFI regions described by the memmap
> into those page tables.
> 
> So unless we're mapping an MMIO region (in which case _PAGE_PCD is set
> in @flags for kernel_map_pages_in_pgd()) I would expect _PAGE_ENC to
> be set.
> 
>> I'll look further into this, but I saw that this area of virtual memory
>> was mapped un-encrypted and after freeing the boot services the
>> mappings were somehow reused as un-encrypted for DMA which assumes
>> (unless using swiotlb) encrypted. This resulted in DMA data being
>> transferred in as encrypted and then accessed un-encrypted.
> 
> That the mappings were re-used isn't a surprise.
> 
> efi_free_boot_services() lifts the reservation that was put in place
> during efi_reserve_boot_services() and releases the pages to the
> kernel's memory allocators.
> 
> What is surprising is that they were marked unencrypted at all.
> There's nothing special about these pages as far as the __va() region
> is concerned.

Right, let me keep looking into this to see if I can pin down what
was (or is) happening.

Thanks,
Tom

> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH v1 10/18] x86/efi: Access EFI related tables in the clear
  2016-06-13 12:03                   ` Matt Fleming
  2016-06-13 12:34                     ` Matt Fleming
@ 2016-06-13 15:16                     ` Tom Lendacky
  1 sibling, 0 replies; 30+ messages in thread
From: Tom Lendacky @ 2016-06-13 15:16 UTC (permalink / raw)
  To: Matt Fleming
  Cc: Borislav Petkov, Leif Lindholm, Mark Salter, Daniel Kiper,
	linux-arch, linux-efi, kvm, linux-doc, x86, linux-kernel,
	kasan-dev, linux-mm, iommu, Radim Krčmář,
	Arnd Bergmann, Jonathan Corbet, Joerg Roedel,
	Konrad Rzeszutek Wilk, Paolo Bonzini, Ingo Molnar,
	H. Peter Anvin, Andrey Ryabinin, Alexander Potapenko,
	Thomas Gleixner, Dmitry Vyukov, Ard Biesheuvel

On 06/13/2016 07:03 AM, Matt Fleming wrote:
> On Thu, 09 Jun, at 11:16:40AM, Tom Lendacky wrote:
>>
>> So maybe something along the lines of an enum that would have entries
>> (initially) like KERNEL_DATA (equal to zero) and EFI_DATA. Others could
>> be added later as needed.
>  
> Sure, that works for me, though maybe BOOT_DATA would be more
> applicable considering the devicetree case too.
> 
>> Would you then want to allow the protection attributes to be updated
>> by architecture specific code through something like a __weak function?
>> In the x86 case I can add this function as a non-SME specific function
>> that would initially just have the SME-related mask modification in it.
> 
> Would we need a new function? Couldn't we just have a new
> FIXMAP_PAGE_* constant? e.g. would something like this work?

Looking forward to the virtualization support (SEV), the VM will be
completely encrypted from the time it is started. In this case all of
the UEFI data will be encrypted and I would need to insure that the
mapping reflects that. When I do the SEV patches, I can change the
FIXMAP #define to add some logic to return a value, so I think the
FIXMAP_PAGE_ idea can work.

Thanks,
Tom

> 
> ---
> 
> enum memremap_owner {
> 	KERNEL_DATA = 0,
> 	BOOT_DATA,
> };
> 
> void __init *
> early_memremap(resource_size_t phys_addr, unsigned long size,
> 	       enum memremap_owner owner)
> {
> 	pgprot_t prot;
> 
> 	switch (owner) {
> 	case BOOT_DATA:
> 		prot = FIXMAP_PAGE_BOOT;
> 		break;
> 	case KERNEL_DATA:	/* FALLTHROUGH */
> 	default:
> 		prot = FIXMAP_PAGE_NORMAL;
> 		
> 	}
> 
> 	return (__force void *)__early_ioremap(phys_addr, size, prot);
> }
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH v1 10/18] x86/efi: Access EFI related tables in the clear
  2016-06-09 18:33     ` Tom Lendacky
@ 2016-06-13 13:51       ` Matt Fleming
  2016-06-15 13:17         ` Tom Lendacky
  0 siblings, 1 reply; 30+ messages in thread
From: Matt Fleming @ 2016-06-13 13:51 UTC (permalink / raw)
  To: Tom Lendacky
  Cc: linux-arch, linux-efi, kvm, linux-doc, x86, linux-kernel,
	kasan-dev, linux-mm, iommu, Radim Krčmář,
	Arnd Bergmann, Jonathan Corbet, Joerg Roedel,
	Konrad Rzeszutek Wilk, Paolo Bonzini, Ingo Molnar,
	Borislav Petkov, H. Peter Anvin, Andrey Ryabinin,
	Alexander Potapenko, Thomas Gleixner, Dmitry Vyukov

On Thu, 09 Jun, at 01:33:30PM, Tom Lendacky wrote:
> 
> I was trying to play it safe here, but as you say, the firmware should
> be using our page tables so we can get rid of this call. The problem
> will actually be if we transition to a 32-bit efi. The encryption bit
> will be lost in cr3 and so the pgd table will have to be un-encrypted.
> The entries in the pgd can have the encryption bit set so I would only
> need to worry about the pgd itself. I'll have to update the
> efi_alloc_page_tables routine.
 
Interesting, I hadn't expected 32-bit EFI to be an option for
platforms with the SME technology. I'd assumed we could just ignore
that.

Are you saying that the encryption bit isn't supported in 32-bit
compatibility mode? We don't do a "full" switch to 32-bit protected
mode when in mixed mode, just load a 32-bit code segment descriptor.
The page tables are not modified at all.

> The encryption bit in the cr3 register will indicate if the pgd table
> is encrypted or not. Based on my comment above about the pgd having
> to be un-encrypted in case we have to transition to 32-bit efi, this
> can be removed.
 
I'm not (yet) sure that the pgd needs to be unencrypted for 32-bit EFI
when running a 64-bit kernel. In the AMD Programmer's Manual, Section
7.10.3 Operating Modes seems to indicate that running encrypted should
work fine.

> I'll look into this a bit more. From looking at it I don't want the
> _PAGE_ENC bit set for the memmap unless it gets re-allocated (which
> I missed in these patches). Let me see what I can do with this.
 
I don't understand your comment about re-allocating the memmap.

The kernel builds its own EFI memory map at runtime, initially based
on the memory map provided by the firmware. We always allocate a new
memory map.

In efi_setup_page_tables() we're building our own page tables, which
should be encrypted, and mapping EFI regions described by the memmap
into those page tables.

So unless we're mapping an MMIO region (in which case _PAGE_PCD is set
in @flags for kernel_map_pages_in_pgd()) I would expect _PAGE_ENC to
be set.

> I'll look further into this, but I saw that this area of virtual memory
> was mapped un-encrypted and after freeing the boot services the
> mappings were somehow reused as un-encrypted for DMA which assumes
> (unless using swiotlb) encrypted. This resulted in DMA data being
> transferred in as encrypted and then accessed un-encrypted.

That the mappings were re-used isn't a surprise.

efi_free_boot_services() lifts the reservation that was put in place
during efi_reserve_boot_services() and releases the pages to the
kernel's memory allocators.

What is surprising is that they were marked unencrypted at all.
There's nothing special about these pages as far as the __va() region
is concerned.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH v1 10/18] x86/efi: Access EFI related tables in the clear
  2016-06-13 12:03                   ` Matt Fleming
@ 2016-06-13 12:34                     ` Matt Fleming
  2016-06-13 15:16                     ` Tom Lendacky
  1 sibling, 0 replies; 30+ messages in thread
From: Matt Fleming @ 2016-06-13 12:34 UTC (permalink / raw)
  To: Tom Lendacky
  Cc: Borislav Petkov, Leif Lindholm, Mark Salter, Daniel Kiper,
	linux-arch, linux-efi, kvm, linux-doc, x86, linux-kernel,
	kasan-dev, linux-mm, iommu, Radim Krčmář,
	Arnd Bergmann, Jonathan Corbet, Joerg Roedel,
	Konrad Rzeszutek Wilk, Paolo Bonzini, Ingo Molnar,
	H. Peter Anvin, Andrey Ryabinin, Alexander Potapenko,
	Thomas Gleixner, Dmitry Vyukov, Ard Biesheuvel

On Mon, 13 Jun, at 01:03:22PM, Matt Fleming wrote:
> 
> Would we need a new function? Couldn't we just have a new
> FIXMAP_PAGE_* constant? e.g. would something like this work?
> 
> ---
> 
> enum memremap_owner {
> 	KERNEL_DATA = 0,
> 	BOOT_DATA,
> };
> 
> void __init *
> early_memremap(resource_size_t phys_addr, unsigned long size,
> 	       enum memremap_owner owner)
> {
> 	pgprot_t prot;
> 
> 	switch (owner) {
> 	case BOOT_DATA:
> 		prot = FIXMAP_PAGE_BOOT;
> 		break;
> 	case KERNEL_DATA:	/* FALLTHROUGH */
> 	default:
> 		prot = FIXMAP_PAGE_NORMAL;
> 		
> 	}
> 
> 	return (__force void *)__early_ioremap(phys_addr, size, prot);
> }

Although it occurs to me that if there's a trivial 1:1 mapping between
memremap_owner and FIXMAP_PAGE_* we might as well just add a new
early_memremap_boot() that uses the correct FIXMAP_PAGE_* constant,
akin to early_memremap_ro().

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH v1 10/18] x86/efi: Access EFI related tables in the clear
  2016-06-09 16:16                 ` Tom Lendacky
@ 2016-06-13 12:03                   ` Matt Fleming
  2016-06-13 12:34                     ` Matt Fleming
  2016-06-13 15:16                     ` Tom Lendacky
  0 siblings, 2 replies; 30+ messages in thread
From: Matt Fleming @ 2016-06-13 12:03 UTC (permalink / raw)
  To: Tom Lendacky
  Cc: Borislav Petkov, Leif Lindholm, Mark Salter, Daniel Kiper,
	linux-arch, linux-efi, kvm, linux-doc, x86, linux-kernel,
	kasan-dev, linux-mm, iommu, Radim Krčmář,
	Arnd Bergmann, Jonathan Corbet, Joerg Roedel,
	Konrad Rzeszutek Wilk, Paolo Bonzini, Ingo Molnar,
	H. Peter Anvin, Andrey Ryabinin, Alexander Potapenko,
	Thomas Gleixner, Dmitry Vyukov, Ard Biesheuvel

On Thu, 09 Jun, at 11:16:40AM, Tom Lendacky wrote:
> 
> So maybe something along the lines of an enum that would have entries
> (initially) like KERNEL_DATA (equal to zero) and EFI_DATA. Others could
> be added later as needed.
 
Sure, that works for me, though maybe BOOT_DATA would be more
applicable considering the devicetree case too.

> Would you then want to allow the protection attributes to be updated
> by architecture specific code through something like a __weak function?
> In the x86 case I can add this function as a non-SME specific function
> that would initially just have the SME-related mask modification in it.

Would we need a new function? Couldn't we just have a new
FIXMAP_PAGE_* constant? e.g. would something like this work?

---

enum memremap_owner {
	KERNEL_DATA = 0,
	BOOT_DATA,
};

void __init *
early_memremap(resource_size_t phys_addr, unsigned long size,
	       enum memremap_owner owner)
{
	pgprot_t prot;

	switch (owner) {
	case BOOT_DATA:
		prot = FIXMAP_PAGE_BOOT;
		break;
	case KERNEL_DATA:	/* FALLTHROUGH */
	default:
		prot = FIXMAP_PAGE_NORMAL;
		
	}

	return (__force void *)__early_ioremap(phys_addr, size, prot);
}

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH v1 10/18] x86/efi: Access EFI related tables in the clear
  2016-06-08 11:18   ` Matt Fleming
@ 2016-06-09 18:33     ` Tom Lendacky
  2016-06-13 13:51       ` Matt Fleming
  0 siblings, 1 reply; 30+ messages in thread
From: Tom Lendacky @ 2016-06-09 18:33 UTC (permalink / raw)
  To: Matt Fleming
  Cc: linux-arch, linux-efi, kvm, linux-doc, x86, linux-kernel,
	kasan-dev, linux-mm, iommu, Radim Krčmář,
	Arnd Bergmann, Jonathan Corbet, Joerg Roedel,
	Konrad Rzeszutek Wilk, Paolo Bonzini, Ingo Molnar,
	Borislav Petkov, H. Peter Anvin, Andrey Ryabinin,
	Alexander Potapenko, Thomas Gleixner, Dmitry Vyukov

On 06/08/2016 06:18 AM, Matt Fleming wrote:
> On Tue, 26 Apr, at 05:57:40PM, Tom Lendacky wrote:
>> The EFI tables are not encrypted and need to be accessed as such. Be sure
>> to memmap them without the encryption attribute set. For EFI support that
>> lives outside of the arch/x86 tree, create a routine that uses the __weak
>> attribute so that it can be overridden by an architecture specific routine.
>>
>> When freeing boot services related memory, since it has been mapped as
>> un-encrypted, be sure to change the mapping to encrypted for future use.
>>
>> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
>> ---
>>  arch/x86/include/asm/cacheflush.h  |    3 +
>>  arch/x86/include/asm/mem_encrypt.h |   22 +++++++++++
>>  arch/x86/kernel/setup.c            |    6 +--
>>  arch/x86/mm/mem_encrypt.c          |   56 +++++++++++++++++++++++++++
>>  arch/x86/mm/pageattr.c             |   75 ++++++++++++++++++++++++++++++++++++
>>  arch/x86/platform/efi/efi.c        |   26 +++++++-----
>>  arch/x86/platform/efi/efi_64.c     |    9 +++-
>>  arch/x86/platform/efi/quirks.c     |   12 +++++-
>>  drivers/firmware/efi/efi.c         |   18 +++++++--
>>  drivers/firmware/efi/esrt.c        |   12 +++---
>>  include/linux/efi.h                |    3 +
>>  11 files changed, 212 insertions(+), 30 deletions(-)
> 
> [...]
> 
>> diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
>> index 994a7df8..871b213 100644
>> --- a/arch/x86/platform/efi/efi.c
>> +++ b/arch/x86/platform/efi/efi.c
>> @@ -53,6 +53,7 @@
>>  #include <asm/x86_init.h>
>>  #include <asm/rtc.h>
>>  #include <asm/uv/uv.h>
>> +#include <asm/mem_encrypt.h>
>>  
>>  #define EFI_DEBUG
>>  
>> @@ -261,12 +262,12 @@ static int __init efi_systab_init(void *phys)
>>  		u64 tmp = 0;
>>  
>>  		if (efi_setup) {
>> -			data = early_memremap(efi_setup, sizeof(*data));
>> +			data = sme_early_memremap(efi_setup, sizeof(*data));
>>  			if (!data)
>>  				return -ENOMEM;
>>  		}
> 
> Beware, this data comes from a previous kernel that kexec'd this
> kernel. Unless you've updated bzImage64_load() to allocate an
> unencrypted region 'efi_setup' will in fact be encrypted.

Yes, I missed the kexec path originally and need to take that into
account in general.

> 
>> @@ -690,6 +691,7 @@ static void *realloc_pages(void *old_memmap, int old_shift)
>>  	ret = (void *)__get_free_pages(GFP_KERNEL, old_shift + 1);
>>  	if (!ret)
>>  		goto out;
>> +	sme_set_mem_dec(ret, PAGE_SIZE << (old_shift + 1));
>>  
>>  	/*
>>  	 * A first-time allocation doesn't have anything to copy.
> 
> I'm not sure why it's necessary to mark this region as unencrypted,
> because at this point the kernel controls the platform and when we
> call into the firmware it should be using our page tables. I wouldn't
> expect the firmware to mess with the SYSCFG MSR either.
> 
> Have you come across a situation where the above was required?

I was trying to play it safe here, but as you say, the firmware should
be using our page tables so we can get rid of this call. The problem
will actually be if we transition to a 32-bit efi. The encryption bit
will be lost in cr3 and so the pgd table will have to be un-encrypted.
The entries in the pgd can have the encryption bit set so I would only
need to worry about the pgd itself. I'll have to update the
efi_alloc_page_tables routine.

> 
>> diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c
>> index 49e4dd4..834a992 100644
>> --- a/arch/x86/platform/efi/efi_64.c
>> +++ b/arch/x86/platform/efi/efi_64.c
>> @@ -223,7 +223,7 @@ int __init efi_setup_page_tables(unsigned long pa_memmap, unsigned num_pages)
>>  	if (efi_enabled(EFI_OLD_MEMMAP))
>>  		return 0;
>>  
>> -	efi_scratch.efi_pgt = (pgd_t *)__pa(efi_pgd);
>> +	efi_scratch.efi_pgt = (pgd_t *)__sme_pa(efi_pgd);
>>  	pgd = efi_pgd;
>>  
>>  	/*
> 
> Huh? Why does __pa() now OR in sme_mas_mask? I thought SME only
> required the page table structures to be modified, not the end
> address?

The encryption bit in the cr3 register will indicate if the pgd table
is encrypted or not. Based on my comment above about the pgd having
to be un-encrypted in case we have to transition to 32-bit efi, this
can be removed.

> 
>> @@ -262,7 +262,8 @@ int __init efi_setup_page_tables(unsigned long pa_memmap, unsigned num_pages)
>>  		pfn = md->phys_addr >> PAGE_SHIFT;
>>  		npages = md->num_pages;
>>  
>> -		if (kernel_map_pages_in_pgd(pgd, pfn, md->phys_addr, npages, _PAGE_RW)) {
>> +		if (kernel_map_pages_in_pgd(pgd, pfn, md->phys_addr, npages,
>> +					    _PAGE_RW | _PAGE_ENC)) {
>>  			pr_err("Failed to map 1:1 memory\n");
>>  			return 1;
>>  		}
> 
> Could you push the _PAGE_ENC addition down into
> kernel_map_pages_in_pgd()? Other flags are also handled that way, see
> _PAGE_PRESENT.

I'll look into this a bit more. From looking at it I don't want the
_PAGE_ENC bit set for the memmap unless it gets re-allocated (which
I missed in these patches). Let me see what I can do with this.

> 
>> @@ -272,6 +273,7 @@ int __init efi_setup_page_tables(unsigned long pa_memmap, unsigned num_pages)
>>  	if (!page)
>>  		panic("Unable to allocate EFI runtime stack < 4GB\n");
>>  
>> +	sme_set_mem_dec(page_address(page), PAGE_SIZE);
>>  	efi_scratch.phys_stack = virt_to_phys(page_address(page));
>>  	efi_scratch.phys_stack += PAGE_SIZE; /* stack grows down */
>>  
> 
> We should not need to mark the stack as unencrypted, the firmware
> should respect our SME settings, right?

Yup, you're correct. I think we can get rid of this call, too.

> 
>> diff --git a/arch/x86/platform/efi/quirks.c b/arch/x86/platform/efi/quirks.c
>> index ab50ada..dde4fb6b 100644
>> --- a/arch/x86/platform/efi/quirks.c
>> +++ b/arch/x86/platform/efi/quirks.c
>> @@ -13,6 +13,7 @@
>>  #include <linux/dmi.h>
>>  #include <asm/efi.h>
>>  #include <asm/uv/uv.h>
>> +#include <asm/mem_encrypt.h>
>>  
>>  #define EFI_MIN_RESERVE 5120
>>  
>> @@ -265,6 +266,13 @@ void __init efi_free_boot_services(void)
>>  		if (md->attribute & EFI_MEMORY_RUNTIME)
>>  			continue;
>>  
>> +		/*
>> +		 * Change the mapping to encrypted memory before freeing.
>> +		 * This insures any future allocations of this mapped area
>> +		 * are used encrypted.
>> +		 */
>> +		sme_set_mem_enc(__va(start), size);
>> +
>>  		free_bootmem_late(start, size);
>>  	}
>>  
> 
> I don't think it's necessary to have to mark the __va() mapping of
> these regions as encrypted at this point. They should be setup that
> way initially.
> 
> The reason is that it'd be a bug if these regions were accessed via
> the __va() mappings before this point. Unless there's something I'm
> missing.

I'll look further into this, but I saw that this area of virtual memory
was mapped un-encrypted and after freeing the boot services the
mappings were somehow reused as un-encrypted for DMA which assumes
(unless using swiotlb) encrypted. This resulted in DMA data being
transferred in as encrypted and then accessed un-encrypted.

Thanks,
Tom

> 
>> diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
>> index 3a69ed5..25010c7 100644
>> --- a/drivers/firmware/efi/efi.c
>> +++ b/drivers/firmware/efi/efi.c
>> @@ -76,6 +76,16 @@ static int __init parse_efi_cmdline(char *str)
>>  }
>>  early_param("efi", parse_efi_cmdline);
>>  
>> +/*
>> + * If memory encryption is supported, then an override to this function
>> + * will be provided.
>> + */
>> +void __weak __init *efi_me_early_memremap(resource_size_t phys_addr,
>> +					  unsigned long size)
>> +{
>> +	return early_memremap(phys_addr, size);
>> +}
>> +
>>  struct kobject *efi_kobj;
>>  
>>  /*
> 
> Like I said in my other mail, I'd much prefer to see this buried in
> arch/x86 by passing a flag to early_memremap() which can be parsed in
> arch directories.
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH v1 10/18] x86/efi: Access EFI related tables in the clear
  2016-06-08 10:07               ` Matt Fleming
@ 2016-06-09 16:16                 ` Tom Lendacky
  2016-06-13 12:03                   ` Matt Fleming
  0 siblings, 1 reply; 30+ messages in thread
From: Tom Lendacky @ 2016-06-09 16:16 UTC (permalink / raw)
  To: Matt Fleming
  Cc: Borislav Petkov, Leif Lindholm, Mark Salter, Daniel Kiper,
	linux-arch, linux-efi, kvm, linux-doc, x86, linux-kernel,
	kasan-dev, linux-mm, iommu, Radim Krčmář,
	Arnd Bergmann, Jonathan Corbet, Joerg Roedel,
	Konrad Rzeszutek Wilk, Paolo Bonzini, Ingo Molnar,
	H. Peter Anvin, Andrey Ryabinin, Alexander Potapenko,
	Thomas Gleixner, Dmitry Vyukov, Ard Biesheuvel

On 06/08/2016 05:07 AM, Matt Fleming wrote:
> (Sorry for the delay)

No worries, thanks for all the feedback.

> 
> On Thu, 26 May, at 08:45:58AM, Tom Lendacky wrote:
>>
>> The patch in question is patch 6/18 where PAGE_KERNEL is changed to
>> include the _PAGE_ENC attribute (the encryption mask). This now
>> makes FIXMAP_PAGE_NORMAL contain the encryption mask while
>> FIXMAP_PAGE_IO does not. In this way, anything mapped using the
>> early_ioremap call won't be mapped encrypted.
> 
> There are semantics attached to early_ioremap() that do not apply in
> this case; that you're mapping an MMIO region but for EFI we just care
> about noting where the firmware (not the kernel) populated the region
> with data. Similar problems exist for other early boot code such as
> the devicetree stuff.
> 
> early_ioremap() is not the answer.
> 
> What you really want is just some way to distinguish kernel-owned
> regions from those owned by "somebody else".
> 
> I have no problem updating early_memremap() to take a @flags argument
> to make that distinction, provided that the naming is generic and not
> tied to AMD's SME technology via an "sme" prefix/suffix.

So maybe something along the lines of an enum that would have entries
(initially) like KERNEL_DATA (equal to zero) and EFI_DATA. Others could
be added later as needed.

Would you then want to allow the protection attributes to be updated
by architecture specific code through something like a __weak function?
In the x86 case I can add this function as a non-SME specific function
that would initially just have the SME-related mask modification in it.

Thanks,
Tom

> 
> And making it generic should allow it to be easily sprinkled into the
> shared architecture code in drivers/firmware/efi/ without issue.
> 
> I'm going to follow up with some additional comments/questions on
> PATCH 10.
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH v1 10/18] x86/efi: Access EFI related tables in the clear
  2016-04-26 22:57 ` [RFC PATCH v1 10/18] x86/efi: Access EFI related tables in the clear Tom Lendacky
  2016-05-10 13:43   ` Matt Fleming
@ 2016-06-08 11:18   ` Matt Fleming
  2016-06-09 18:33     ` Tom Lendacky
  1 sibling, 1 reply; 30+ messages in thread
From: Matt Fleming @ 2016-06-08 11:18 UTC (permalink / raw)
  To: Tom Lendacky
  Cc: linux-arch, linux-efi, kvm, linux-doc, x86, linux-kernel,
	kasan-dev, linux-mm, iommu, Radim Krčmář,
	Arnd Bergmann, Jonathan Corbet, Joerg Roedel,
	Konrad Rzeszutek Wilk, Paolo Bonzini, Ingo Molnar,
	Borislav Petkov, H. Peter Anvin, Andrey Ryabinin,
	Alexander Potapenko, Thomas Gleixner, Dmitry Vyukov

On Tue, 26 Apr, at 05:57:40PM, Tom Lendacky wrote:
> The EFI tables are not encrypted and need to be accessed as such. Be sure
> to memmap them without the encryption attribute set. For EFI support that
> lives outside of the arch/x86 tree, create a routine that uses the __weak
> attribute so that it can be overridden by an architecture specific routine.
> 
> When freeing boot services related memory, since it has been mapped as
> un-encrypted, be sure to change the mapping to encrypted for future use.
> 
> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
> ---
>  arch/x86/include/asm/cacheflush.h  |    3 +
>  arch/x86/include/asm/mem_encrypt.h |   22 +++++++++++
>  arch/x86/kernel/setup.c            |    6 +--
>  arch/x86/mm/mem_encrypt.c          |   56 +++++++++++++++++++++++++++
>  arch/x86/mm/pageattr.c             |   75 ++++++++++++++++++++++++++++++++++++
>  arch/x86/platform/efi/efi.c        |   26 +++++++-----
>  arch/x86/platform/efi/efi_64.c     |    9 +++-
>  arch/x86/platform/efi/quirks.c     |   12 +++++-
>  drivers/firmware/efi/efi.c         |   18 +++++++--
>  drivers/firmware/efi/esrt.c        |   12 +++---
>  include/linux/efi.h                |    3 +
>  11 files changed, 212 insertions(+), 30 deletions(-)

[...]

> diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
> index 994a7df8..871b213 100644
> --- a/arch/x86/platform/efi/efi.c
> +++ b/arch/x86/platform/efi/efi.c
> @@ -53,6 +53,7 @@
>  #include <asm/x86_init.h>
>  #include <asm/rtc.h>
>  #include <asm/uv/uv.h>
> +#include <asm/mem_encrypt.h>
>  
>  #define EFI_DEBUG
>  
> @@ -261,12 +262,12 @@ static int __init efi_systab_init(void *phys)
>  		u64 tmp = 0;
>  
>  		if (efi_setup) {
> -			data = early_memremap(efi_setup, sizeof(*data));
> +			data = sme_early_memremap(efi_setup, sizeof(*data));
>  			if (!data)
>  				return -ENOMEM;
>  		}

Beware, this data comes from a previous kernel that kexec'd this
kernel. Unless you've updated bzImage64_load() to allocate an
unencrypted region 'efi_setup' will in fact be encrypted.

> @@ -690,6 +691,7 @@ static void *realloc_pages(void *old_memmap, int old_shift)
>  	ret = (void *)__get_free_pages(GFP_KERNEL, old_shift + 1);
>  	if (!ret)
>  		goto out;
> +	sme_set_mem_dec(ret, PAGE_SIZE << (old_shift + 1));
>  
>  	/*
>  	 * A first-time allocation doesn't have anything to copy.

I'm not sure why it's necessary to mark this region as unencrypted,
because at this point the kernel controls the platform and when we
call into the firmware it should be using our page tables. I wouldn't
expect the firmware to mess with the SYSCFG MSR either.

Have you come across a situation where the above was required?

> diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c
> index 49e4dd4..834a992 100644
> --- a/arch/x86/platform/efi/efi_64.c
> +++ b/arch/x86/platform/efi/efi_64.c
> @@ -223,7 +223,7 @@ int __init efi_setup_page_tables(unsigned long pa_memmap, unsigned num_pages)
>  	if (efi_enabled(EFI_OLD_MEMMAP))
>  		return 0;
>  
> -	efi_scratch.efi_pgt = (pgd_t *)__pa(efi_pgd);
> +	efi_scratch.efi_pgt = (pgd_t *)__sme_pa(efi_pgd);
>  	pgd = efi_pgd;
>  
>  	/*

Huh? Why does __pa() now OR in sme_mas_mask? I thought SME only
required the page table structures to be modified, not the end
address?

> @@ -262,7 +262,8 @@ int __init efi_setup_page_tables(unsigned long pa_memmap, unsigned num_pages)
>  		pfn = md->phys_addr >> PAGE_SHIFT;
>  		npages = md->num_pages;
>  
> -		if (kernel_map_pages_in_pgd(pgd, pfn, md->phys_addr, npages, _PAGE_RW)) {
> +		if (kernel_map_pages_in_pgd(pgd, pfn, md->phys_addr, npages,
> +					    _PAGE_RW | _PAGE_ENC)) {
>  			pr_err("Failed to map 1:1 memory\n");
>  			return 1;
>  		}

Could you push the _PAGE_ENC addition down into
kernel_map_pages_in_pgd()? Other flags are also handled that way, see
_PAGE_PRESENT.

> @@ -272,6 +273,7 @@ int __init efi_setup_page_tables(unsigned long pa_memmap, unsigned num_pages)
>  	if (!page)
>  		panic("Unable to allocate EFI runtime stack < 4GB\n");
>  
> +	sme_set_mem_dec(page_address(page), PAGE_SIZE);
>  	efi_scratch.phys_stack = virt_to_phys(page_address(page));
>  	efi_scratch.phys_stack += PAGE_SIZE; /* stack grows down */
>  

We should not need to mark the stack as unencrypted, the firmware
should respect our SME settings, right?

> diff --git a/arch/x86/platform/efi/quirks.c b/arch/x86/platform/efi/quirks.c
> index ab50ada..dde4fb6b 100644
> --- a/arch/x86/platform/efi/quirks.c
> +++ b/arch/x86/platform/efi/quirks.c
> @@ -13,6 +13,7 @@
>  #include <linux/dmi.h>
>  #include <asm/efi.h>
>  #include <asm/uv/uv.h>
> +#include <asm/mem_encrypt.h>
>  
>  #define EFI_MIN_RESERVE 5120
>  
> @@ -265,6 +266,13 @@ void __init efi_free_boot_services(void)
>  		if (md->attribute & EFI_MEMORY_RUNTIME)
>  			continue;
>  
> +		/*
> +		 * Change the mapping to encrypted memory before freeing.
> +		 * This insures any future allocations of this mapped area
> +		 * are used encrypted.
> +		 */
> +		sme_set_mem_enc(__va(start), size);
> +
>  		free_bootmem_late(start, size);
>  	}
>  

I don't think it's necessary to have to mark the __va() mapping of
these regions as encrypted at this point. They should be setup that
way initially.

The reason is that it'd be a bug if these regions were accessed via
the __va() mappings before this point. Unless there's something I'm
missing.

> diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
> index 3a69ed5..25010c7 100644
> --- a/drivers/firmware/efi/efi.c
> +++ b/drivers/firmware/efi/efi.c
> @@ -76,6 +76,16 @@ static int __init parse_efi_cmdline(char *str)
>  }
>  early_param("efi", parse_efi_cmdline);
>  
> +/*
> + * If memory encryption is supported, then an override to this function
> + * will be provided.
> + */
> +void __weak __init *efi_me_early_memremap(resource_size_t phys_addr,
> +					  unsigned long size)
> +{
> +	return early_memremap(phys_addr, size);
> +}
> +
>  struct kobject *efi_kobj;
>  
>  /*

Like I said in my other mail, I'd much prefer to see this buried in
arch/x86 by passing a flag to early_memremap() which can be parsed in
arch directories.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH v1 10/18] x86/efi: Access EFI related tables in the clear
  2016-05-26 13:45             ` Tom Lendacky
@ 2016-06-08 10:07               ` Matt Fleming
  2016-06-09 16:16                 ` Tom Lendacky
  0 siblings, 1 reply; 30+ messages in thread
From: Matt Fleming @ 2016-06-08 10:07 UTC (permalink / raw)
  To: Tom Lendacky
  Cc: Borislav Petkov, Leif Lindholm, Mark Salter, Daniel Kiper,
	linux-arch, linux-efi, kvm, linux-doc, x86, linux-kernel,
	kasan-dev, linux-mm, iommu, Radim Krčmář,
	Arnd Bergmann, Jonathan Corbet, Joerg Roedel,
	Konrad Rzeszutek Wilk, Paolo Bonzini, Ingo Molnar,
	H. Peter Anvin, Andrey Ryabinin, Alexander Potapenko,
	Thomas Gleixner, Dmitry Vyukov, Ard Biesheuvel

(Sorry for the delay)

On Thu, 26 May, at 08:45:58AM, Tom Lendacky wrote:
> 
> The patch in question is patch 6/18 where PAGE_KERNEL is changed to
> include the _PAGE_ENC attribute (the encryption mask). This now
> makes FIXMAP_PAGE_NORMAL contain the encryption mask while
> FIXMAP_PAGE_IO does not. In this way, anything mapped using the
> early_ioremap call won't be mapped encrypted.

There are semantics attached to early_ioremap() that do not apply in
this case; that you're mapping an MMIO region but for EFI we just care
about noting where the firmware (not the kernel) populated the region
with data. Similar problems exist for other early boot code such as
the devicetree stuff.

early_ioremap() is not the answer.

What you really want is just some way to distinguish kernel-owned
regions from those owned by "somebody else".

I have no problem updating early_memremap() to take a @flags argument
to make that distinction, provided that the naming is generic and not
tied to AMD's SME technology via an "sme" prefix/suffix.

And making it generic should allow it to be easily sprinkled into the
shared architecture code in drivers/firmware/efi/ without issue.

I'm going to follow up with some additional comments/questions on
PATCH 10.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH v1 10/18] x86/efi: Access EFI related tables in the clear
  2016-05-25 19:30           ` Matt Fleming
@ 2016-05-26 13:45             ` Tom Lendacky
  2016-06-08 10:07               ` Matt Fleming
  0 siblings, 1 reply; 30+ messages in thread
From: Tom Lendacky @ 2016-05-26 13:45 UTC (permalink / raw)
  To: Matt Fleming
  Cc: Borislav Petkov, Leif Lindholm, Mark Salter, Daniel Kiper,
	linux-arch, linux-efi, kvm, linux-doc, x86, linux-kernel,
	kasan-dev, linux-mm, iommu, Radim Krčmář,
	Arnd Bergmann, Jonathan Corbet, Joerg Roedel,
	Konrad Rzeszutek Wilk, Paolo Bonzini, Ingo Molnar,
	H. Peter Anvin, Andrey Ryabinin, Alexander Potapenko,
	Thomas Gleixner, Dmitry Vyukov, Ard Biesheuvel

On 05/25/2016 02:30 PM, Matt Fleming wrote:
> On Tue, 24 May, at 09:54:31AM, Tom Lendacky wrote:
>>
>> I looked into this and this would be a large change also to parse tables
>> and build lists.  It occurred to me that this could all be taken care of
>> if the early_memremap calls were changed to early_ioremap calls. Looking
>> in the git log I see that they were originally early_ioremap calls but
>> were changed to early_memremap calls with this commit:
>>
>> commit abc93f8eb6e4 ("efi: Use early_mem*() instead of early_io*()")
>>
>> Looking at the early_memremap code and the early_ioremap code they both
>> call __early_ioremap so I don't see how this change makes any
>> difference (especially since FIXMAP_PAGE_NORMAL and FIXMAP_PAGE_IO are
>> identical in this case).
>>
>> Is it safe to change these back to early_ioremap calls (at least on
>> x86)?
> 
> I really don't want to begin mixing early_ioremap() calls and
> early_memremap() calls for any of the EFI code if it can be avoided.

I definitely wouldn't mix them, it would be all or nothing.

> 
> There is slow but steady progress to move more and more of the
> architecture specific EFI code out into generic code. Swapping
> early_memremap() for early_ioremap() would be a step backwards,
> because FIXMAP_PAGE_NORMAL and FIXMAP_PAGE_IO are not identical on
> ARM/arm64.

Maybe adding something similar to __acpi_map_table would be more
appropriate?

> 
> Could you point me at the patch that in this series that fixes up
> early_ioremap() to work with mem encrypt/decrypt? I took another
> (quick) look through but couldn't find it.

The patch in question is patch 6/18 where PAGE_KERNEL is changed to
include the _PAGE_ENC attribute (the encryption mask). This now
makes FIXMAP_PAGE_NORMAL contain the encryption mask while
FIXMAP_PAGE_IO does not. In this way, anything mapped using the
early_ioremap call won't be mapped encrypted.

Thanks,
Tom

> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH v1 10/18] x86/efi: Access EFI related tables in the clear
  2016-05-24 14:54         ` Tom Lendacky
  2016-05-25 16:09           ` Daniel Kiper
@ 2016-05-25 19:30           ` Matt Fleming
  2016-05-26 13:45             ` Tom Lendacky
  1 sibling, 1 reply; 30+ messages in thread
From: Matt Fleming @ 2016-05-25 19:30 UTC (permalink / raw)
  To: Tom Lendacky
  Cc: Borislav Petkov, Leif Lindholm, Mark Salter, Daniel Kiper,
	linux-arch, linux-efi, kvm, linux-doc, x86, linux-kernel,
	kasan-dev, linux-mm, iommu, Radim Krčmář,
	Arnd Bergmann, Jonathan Corbet, Joerg Roedel,
	Konrad Rzeszutek Wilk, Paolo Bonzini, Ingo Molnar,
	H. Peter Anvin, Andrey Ryabinin, Alexander Potapenko,
	Thomas Gleixner, Dmitry Vyukov, Ard Biesheuvel

On Tue, 24 May, at 09:54:31AM, Tom Lendacky wrote:
> 
> I looked into this and this would be a large change also to parse tables
> and build lists.  It occurred to me that this could all be taken care of
> if the early_memremap calls were changed to early_ioremap calls. Looking
> in the git log I see that they were originally early_ioremap calls but
> were changed to early_memremap calls with this commit:
> 
> commit abc93f8eb6e4 ("efi: Use early_mem*() instead of early_io*()")
> 
> Looking at the early_memremap code and the early_ioremap code they both
> call __early_ioremap so I don't see how this change makes any
> difference (especially since FIXMAP_PAGE_NORMAL and FIXMAP_PAGE_IO are
> identical in this case).
> 
> Is it safe to change these back to early_ioremap calls (at least on
> x86)?

I really don't want to begin mixing early_ioremap() calls and
early_memremap() calls for any of the EFI code if it can be avoided.

There is slow but steady progress to move more and more of the
architecture specific EFI code out into generic code. Swapping
early_memremap() for early_ioremap() would be a step backwards,
because FIXMAP_PAGE_NORMAL and FIXMAP_PAGE_IO are not identical on
ARM/arm64.

Could you point me at the patch that in this series that fixes up
early_ioremap() to work with mem encrypt/decrypt? I took another
(quick) look through but couldn't find it.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH v1 10/18] x86/efi: Access EFI related tables in the clear
  2016-05-24 14:54         ` Tom Lendacky
@ 2016-05-25 16:09           ` Daniel Kiper
  2016-05-25 19:30           ` Matt Fleming
  1 sibling, 0 replies; 30+ messages in thread
From: Daniel Kiper @ 2016-05-25 16:09 UTC (permalink / raw)
  To: Tom Lendacky
  Cc: Borislav Petkov, Matt Fleming, Leif Lindholm, Mark Salter,
	linux-arch, linux-efi, kvm, linux-doc, x86, linux-kernel,
	kasan-dev, linux-mm, iommu, Radim Krčmář,
	Arnd Bergmann, Jonathan Corbet, Joerg Roedel,
	Konrad Rzeszutek Wilk, Paolo Bonzini, Ingo Molnar,
	H. Peter Anvin, Andrey Ryabinin, Alexander Potapenko,
	Thomas Gleixner, Dmitry Vyukov

On Tue, May 24, 2016 at 09:54:31AM -0500, Tom Lendacky wrote:
> On 05/12/2016 01:20 PM, Tom Lendacky wrote:
> > On 05/10/2016 08:57 AM, Borislav Petkov wrote:
> >> On Tue, May 10, 2016 at 02:43:58PM +0100, Matt Fleming wrote:
> >>> Is it not possible to maintain some kind of kernel virtual address
> >>> mapping so memremap*() and friends can figure out when to twiddle the
> >>> mapping attributes and map with/without encryption?
> >>
> >> I guess we can move the sme_* specific stuff one indirection layer
> >> below, i.e., in the *memremap() routines so that callers don't have to
> >> care... That should keep the churn down...
> >>
> >
> > We could do that, but we'll have to generate that list of addresses so
> > that it can be checked against the range being mapped.  Since this is
> > part of early memmap support searching that list every time might not be
> > too bad. I'll have to look into that and see what that looks like.
>
> I looked into this and this would be a large change also to parse tables
> and build lists.  It occurred to me that this could all be taken care of
> if the early_memremap calls were changed to early_ioremap calls. Looking
> in the git log I see that they were originally early_ioremap calls but
> were changed to early_memremap calls with this commit:
>
> commit abc93f8eb6e4 ("efi: Use early_mem*() instead of early_io*()")
>
> Looking at the early_memremap code and the early_ioremap code they both
> call __early_ioremap so I don't see how this change makes any
> difference (especially since FIXMAP_PAGE_NORMAL and FIXMAP_PAGE_IO are
> identical in this case).
>
> Is it safe to change these back to early_ioremap calls (at least on
> x86)?

Commit f955371ca9d3986bca100666041fcfa9b6d21962 (x86: remove the Xen-specific
_PAGE_IOMAP PTE flag) made commit abc93f8eb6e4 unnecessary. Though, IMO, it
is still valid code cleanup. So, if it is not very strongly needed I would
not revert this change.

Daniel

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH v1 10/18] x86/efi: Access EFI related tables in the clear
  2016-05-12 18:20       ` Tom Lendacky
@ 2016-05-24 14:54         ` Tom Lendacky
  2016-05-25 16:09           ` Daniel Kiper
  2016-05-25 19:30           ` Matt Fleming
  0 siblings, 2 replies; 30+ messages in thread
From: Tom Lendacky @ 2016-05-24 14:54 UTC (permalink / raw)
  To: Borislav Petkov, Matt Fleming, Leif Lindholm, Mark Salter, Daniel Kiper
  Cc: linux-arch, linux-efi, kvm, linux-doc, x86, linux-kernel,
	kasan-dev, linux-mm, iommu, Radim Krčmář,
	Arnd Bergmann, Jonathan Corbet, Joerg Roedel,
	Konrad Rzeszutek Wilk, Paolo Bonzini, Ingo Molnar,
	H. Peter Anvin, Andrey Ryabinin, Alexander Potapenko,
	Thomas Gleixner, Dmitry Vyukov

On 05/12/2016 01:20 PM, Tom Lendacky wrote:
> On 05/10/2016 08:57 AM, Borislav Petkov wrote:
>> On Tue, May 10, 2016 at 02:43:58PM +0100, Matt Fleming wrote:
>>> Is it not possible to maintain some kind of kernel virtual address
>>> mapping so memremap*() and friends can figure out when to twiddle the
>>> mapping attributes and map with/without encryption?
>>
>> I guess we can move the sme_* specific stuff one indirection layer
>> below, i.e., in the *memremap() routines so that callers don't have to
>> care... That should keep the churn down...
>>
> 
> We could do that, but we'll have to generate that list of addresses so
> that it can be checked against the range being mapped.  Since this is
> part of early memmap support searching that list every time might not be
> too bad. I'll have to look into that and see what that looks like.

I looked into this and this would be a large change also to parse tables
and build lists.  It occurred to me that this could all be taken care of
if the early_memremap calls were changed to early_ioremap calls. Looking
in the git log I see that they were originally early_ioremap calls but
were changed to early_memremap calls with this commit:

commit abc93f8eb6e4 ("efi: Use early_mem*() instead of early_io*()")

Looking at the early_memremap code and the early_ioremap code they both
call __early_ioremap so I don't see how this change makes any
difference (especially since FIXMAP_PAGE_NORMAL and FIXMAP_PAGE_IO are
identical in this case).

Is it safe to change these back to early_ioremap calls (at least on
x86)?

Thanks,
Tom

> 
> Thanks,
> Tom
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH v1 10/18] x86/efi: Access EFI related tables in the clear
  2016-05-10 13:57     ` Borislav Petkov
@ 2016-05-12 18:20       ` Tom Lendacky
  2016-05-24 14:54         ` Tom Lendacky
  0 siblings, 1 reply; 30+ messages in thread
From: Tom Lendacky @ 2016-05-12 18:20 UTC (permalink / raw)
  To: Borislav Petkov, Matt Fleming
  Cc: linux-arch, linux-efi, kvm, linux-doc, x86, linux-kernel,
	kasan-dev, linux-mm, iommu, Radim Krčmář,
	Arnd Bergmann, Jonathan Corbet, Joerg Roedel,
	Konrad Rzeszutek Wilk, Paolo Bonzini, Ingo Molnar,
	H. Peter Anvin, Andrey Ryabinin, Alexander Potapenko,
	Thomas Gleixner, Dmitry Vyukov

On 05/10/2016 08:57 AM, Borislav Petkov wrote:
> On Tue, May 10, 2016 at 02:43:58PM +0100, Matt Fleming wrote:
>> Is it not possible to maintain some kind of kernel virtual address
>> mapping so memremap*() and friends can figure out when to twiddle the
>> mapping attributes and map with/without encryption?
> 
> I guess we can move the sme_* specific stuff one indirection layer
> below, i.e., in the *memremap() routines so that callers don't have to
> care... That should keep the churn down...
> 

We could do that, but we'll have to generate that list of addresses so
that it can be checked against the range being mapped.  Since this is
part of early memmap support searching that list every time might not be
too bad. I'll have to look into that and see what that looks like.

Thanks,
Tom

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH v1 10/18] x86/efi: Access EFI related tables in the clear
  2016-05-10 13:43   ` Matt Fleming
@ 2016-05-10 13:57     ` Borislav Petkov
  2016-05-12 18:20       ` Tom Lendacky
  0 siblings, 1 reply; 30+ messages in thread
From: Borislav Petkov @ 2016-05-10 13:57 UTC (permalink / raw)
  To: Matt Fleming
  Cc: Tom Lendacky, linux-arch, linux-efi, kvm, linux-doc, x86,
	linux-kernel, kasan-dev, linux-mm, iommu,
	Radim Krčmář,
	Arnd Bergmann, Jonathan Corbet, Joerg Roedel,
	Konrad Rzeszutek Wilk, Paolo Bonzini, Ingo Molnar,
	H. Peter Anvin, Andrey Ryabinin, Alexander Potapenko,
	Thomas Gleixner, Dmitry Vyukov

On Tue, May 10, 2016 at 02:43:58PM +0100, Matt Fleming wrote:
> Is it not possible to maintain some kind of kernel virtual address
> mapping so memremap*() and friends can figure out when to twiddle the
> mapping attributes and map with/without encryption?

I guess we can move the sme_* specific stuff one indirection layer
below, i.e., in the *memremap() routines so that callers don't have to
care... That should keep the churn down...

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH v1 10/18] x86/efi: Access EFI related tables in the clear
  2016-04-26 22:57 ` [RFC PATCH v1 10/18] x86/efi: Access EFI related tables in the clear Tom Lendacky
@ 2016-05-10 13:43   ` Matt Fleming
  2016-05-10 13:57     ` Borislav Petkov
  2016-06-08 11:18   ` Matt Fleming
  1 sibling, 1 reply; 30+ messages in thread
From: Matt Fleming @ 2016-05-10 13:43 UTC (permalink / raw)
  To: Tom Lendacky
  Cc: linux-arch, linux-efi, kvm, linux-doc, x86, linux-kernel,
	kasan-dev, linux-mm, iommu, Radim Krčmář,
	Arnd Bergmann, Jonathan Corbet, Joerg Roedel,
	Konrad Rzeszutek Wilk, Paolo Bonzini, Ingo Molnar,
	Borislav Petkov, H. Peter Anvin, Andrey Ryabinin,
	Alexander Potapenko, Thomas Gleixner, Dmitry Vyukov

On Tue, 26 Apr, at 05:57:40PM, Tom Lendacky wrote:
> The EFI tables are not encrypted and need to be accessed as such. Be sure
> to memmap them without the encryption attribute set. For EFI support that
> lives outside of the arch/x86 tree, create a routine that uses the __weak
> attribute so that it can be overridden by an architecture specific routine.
> 
> When freeing boot services related memory, since it has been mapped as
> un-encrypted, be sure to change the mapping to encrypted for future use.
> 
> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
> ---
>  arch/x86/include/asm/cacheflush.h  |    3 +
>  arch/x86/include/asm/mem_encrypt.h |   22 +++++++++++
>  arch/x86/kernel/setup.c            |    6 +--
>  arch/x86/mm/mem_encrypt.c          |   56 +++++++++++++++++++++++++++
>  arch/x86/mm/pageattr.c             |   75 ++++++++++++++++++++++++++++++++++++
>  arch/x86/platform/efi/efi.c        |   26 +++++++-----
>  arch/x86/platform/efi/efi_64.c     |    9 +++-
>  arch/x86/platform/efi/quirks.c     |   12 +++++-
>  drivers/firmware/efi/efi.c         |   18 +++++++--
>  drivers/firmware/efi/esrt.c        |   12 +++---
>  include/linux/efi.h                |    3 +
>  11 files changed, 212 insertions(+), 30 deletions(-)

The size of this change is completely unexpected. Why is there so much
churn to workaround this new feature?

Is it not possible to maintain some kind of kernel virtual address
mapping so memremap*() and friends can figure out when to twiddle the
mapping attributes and map with/without encryption?

These API changes place an undue burden on developers that don't even
care about SME.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [RFC PATCH v1 10/18] x86/efi: Access EFI related tables in the clear
  2016-04-26 22:55 [RFC PATCH v1 00/18] x86: Secure Memory Encryption (AMD) Tom Lendacky
@ 2016-04-26 22:57 ` Tom Lendacky
  2016-05-10 13:43   ` Matt Fleming
  2016-06-08 11:18   ` Matt Fleming
  0 siblings, 2 replies; 30+ messages in thread
From: Tom Lendacky @ 2016-04-26 22:57 UTC (permalink / raw)
  To: linux-arch, linux-efi, kvm, linux-doc, x86, linux-kernel,
	kasan-dev, linux-mm, iommu
  Cc: Radim Krčmář,
	Arnd Bergmann, Jonathan Corbet, Matt Fleming, Joerg Roedel,
	Konrad Rzeszutek Wilk, Paolo Bonzini, Ingo Molnar,
	Borislav Petkov, H. Peter Anvin, Andrey Ryabinin,
	Alexander Potapenko, Thomas Gleixner, Dmitry Vyukov

The EFI tables are not encrypted and need to be accessed as such. Be sure
to memmap them without the encryption attribute set. For EFI support that
lives outside of the arch/x86 tree, create a routine that uses the __weak
attribute so that it can be overridden by an architecture specific routine.

When freeing boot services related memory, since it has been mapped as
un-encrypted, be sure to change the mapping to encrypted for future use.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
---
 arch/x86/include/asm/cacheflush.h  |    3 +
 arch/x86/include/asm/mem_encrypt.h |   22 +++++++++++
 arch/x86/kernel/setup.c            |    6 +--
 arch/x86/mm/mem_encrypt.c          |   56 +++++++++++++++++++++++++++
 arch/x86/mm/pageattr.c             |   75 ++++++++++++++++++++++++++++++++++++
 arch/x86/platform/efi/efi.c        |   26 +++++++-----
 arch/x86/platform/efi/efi_64.c     |    9 +++-
 arch/x86/platform/efi/quirks.c     |   12 +++++-
 drivers/firmware/efi/efi.c         |   18 +++++++--
 drivers/firmware/efi/esrt.c        |   12 +++---
 include/linux/efi.h                |    3 +
 11 files changed, 212 insertions(+), 30 deletions(-)

diff --git a/arch/x86/include/asm/cacheflush.h b/arch/x86/include/asm/cacheflush.h
index 61518cf..bfb08e5 100644
--- a/arch/x86/include/asm/cacheflush.h
+++ b/arch/x86/include/asm/cacheflush.h
@@ -13,6 +13,7 @@
  * Executability : eXeutable, NoteXecutable
  * Read/Write    : ReadOnly, ReadWrite
  * Presence      : NotPresent
+ * Encryption    : ENCrypted, DECrypted
  *
  * Within a category, the attributes are mutually exclusive.
  *
@@ -48,6 +49,8 @@ int set_memory_ro(unsigned long addr, int numpages);
 int set_memory_rw(unsigned long addr, int numpages);
 int set_memory_np(unsigned long addr, int numpages);
 int set_memory_4k(unsigned long addr, int numpages);
+int set_memory_enc(unsigned long addr, int numpages);
+int set_memory_dec(unsigned long addr, int numpages);
 
 int set_memory_array_uc(unsigned long *addr, int addrinarray);
 int set_memory_array_wc(unsigned long *addr, int addrinarray);
diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h
index 2785493..42868f5 100644
--- a/arch/x86/include/asm/mem_encrypt.h
+++ b/arch/x86/include/asm/mem_encrypt.h
@@ -23,13 +23,23 @@ extern unsigned long sme_me_mask;
 
 u8 sme_get_me_loss(void);
 
+int sme_set_mem_enc(void *vaddr, unsigned long size);
+int sme_set_mem_dec(void *vaddr, unsigned long size);
+
 void __init sme_early_mem_enc(resource_size_t paddr,
 			      unsigned long size);
 void __init sme_early_mem_dec(resource_size_t paddr,
 			      unsigned long size);
 
+void __init *sme_early_memremap(resource_size_t paddr,
+				unsigned long size);
+
 void __init sme_early_init(void);
 
+/* Architecture __weak replacement functions */
+void __init *efi_me_early_memremap(resource_size_t paddr,
+				   unsigned long size);
+
 #define __sme_pa(x)		(__pa((x)) | sme_me_mask)
 #define __sme_pa_nodebug(x)	(__pa_nodebug((x)) | sme_me_mask)
 
@@ -44,6 +54,16 @@ static inline u8 sme_get_me_loss(void)
 	return 0;
 }
 
+static inline int sme_set_mem_enc(void *vaddr, unsigned long size)
+{
+	return 0;
+}
+
+static inline int sme_set_mem_dec(void *vaddr, unsigned long size)
+{
+	return 0;
+}
+
 static inline void __init sme_early_mem_enc(resource_size_t paddr,
 					    unsigned long size)
 {
@@ -63,6 +83,8 @@ static inline void __init sme_early_init(void)
 
 #define __sme_va		__va
 
+#define sme_early_memremap	early_memremap
+
 #endif	/* CONFIG_AMD_MEM_ENCRYPT */
 
 #endif	/* __ASSEMBLY__ */
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 1d29cf9..2e460fb 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -424,7 +424,7 @@ static void __init parse_setup_data(void)
 	while (pa_data) {
 		u32 data_len, data_type;
 
-		data = early_memremap(pa_data, sizeof(*data));
+		data = sme_early_memremap(pa_data, sizeof(*data));
 		data_len = data->len + sizeof(struct setup_data);
 		data_type = data->type;
 		pa_next = data->next;
@@ -457,7 +457,7 @@ static void __init e820_reserve_setup_data(void)
 		return;
 
 	while (pa_data) {
-		data = early_memremap(pa_data, sizeof(*data));
+		data = sme_early_memremap(pa_data, sizeof(*data));
 		e820_update_range(pa_data, sizeof(*data)+data->len,
 			 E820_RAM, E820_RESERVED_KERN);
 		pa_data = data->next;
@@ -477,7 +477,7 @@ static void __init memblock_x86_reserve_range_setup_data(void)
 
 	pa_data = boot_params.hdr.setup_data;
 	while (pa_data) {
-		data = early_memremap(pa_data, sizeof(*data));
+		data = sme_early_memremap(pa_data, sizeof(*data));
 		memblock_reserve(pa_data, sizeof(*data) + data->len);
 		pa_data = data->next;
 		early_memunmap(data, sizeof(*data));
diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index 5f19ede..7d56d1b 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -14,12 +14,55 @@
 #include <linux/mm.h>
 
 #include <asm/mem_encrypt.h>
+#include <asm/cacheflush.h>
 #include <asm/tlbflush.h>
 #include <asm/fixmap.h>
 
 /* Buffer used for early in-place encryption by BSP, no locking needed */
 static char me_early_buffer[PAGE_SIZE] __aligned(PAGE_SIZE);
 
+int sme_set_mem_enc(void *vaddr, unsigned long size)
+{
+	unsigned long addr, numpages;
+
+	if (!sme_me_mask)
+		return 0;
+
+	addr = (unsigned long)vaddr & PAGE_MASK;
+	numpages = PAGE_ALIGN(size) >> PAGE_SHIFT;
+
+	/*
+	 * The set_memory_xxx functions take an integer for numpages, make
+	 * sure it doesn't exceed that.
+	 */
+	if (numpages > INT_MAX)
+		return -EINVAL;
+
+	return set_memory_enc(addr, numpages);
+}
+EXPORT_SYMBOL_GPL(sme_set_mem_enc);
+
+int sme_set_mem_dec(void *vaddr, unsigned long size)
+{
+	unsigned long addr, numpages;
+
+	if (!sme_me_mask)
+		return 0;
+
+	addr = (unsigned long)vaddr & PAGE_MASK;
+	numpages = PAGE_ALIGN(size) >> PAGE_SHIFT;
+
+	/*
+	 * The set_memory_xxx functions take an integer for numpages, make
+	 * sure it doesn't exceed that.
+	 */
+	if (numpages > INT_MAX)
+		return -EINVAL;
+
+	return set_memory_dec(addr, numpages);
+}
+EXPORT_SYMBOL_GPL(sme_set_mem_dec);
+
 void __init sme_early_mem_enc(resource_size_t paddr, unsigned long size)
 {
 	void *src, *dst;
@@ -104,6 +147,12 @@ void __init sme_early_mem_dec(resource_size_t paddr, unsigned long size)
 	}
 }
 
+void __init *sme_early_memremap(resource_size_t paddr,
+				unsigned long size)
+{
+	return early_memremap_dec(paddr, size);
+}
+
 void __init sme_early_init(void)
 {
 	unsigned int i;
@@ -117,3 +166,10 @@ void __init sme_early_init(void)
 	for (i = 0; i < ARRAY_SIZE(protection_map); i++)
 		protection_map[i] = __pgprot(pgprot_val(protection_map[i]) | sme_me_mask);
 }
+
+/* Architecture __weak replacement functions */
+void __init *efi_me_early_memremap(resource_size_t paddr,
+				   unsigned long size)
+{
+	return sme_early_memremap(paddr, size);
+}
diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index c055302..0384fb3 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -1731,6 +1731,81 @@ int set_memory_4k(unsigned long addr, int numpages)
 					__pgprot(0), 1, 0, NULL);
 }
 
+static int __set_memory_enc_dec(struct cpa_data *cpa)
+{
+	unsigned long addr;
+	int numpages;
+	int ret;
+
+	if (*cpa->vaddr & ~PAGE_MASK) {
+		*cpa->vaddr &= PAGE_MASK;
+
+		/* People should not be passing in unaligned addresses */
+		WARN_ON_ONCE(1);
+	}
+
+	addr = *cpa->vaddr;
+	numpages = cpa->numpages;
+
+	/* Must avoid aliasing mappings in the highmem code */
+	kmap_flush_unused();
+	vm_unmap_aliases();
+
+	ret = __change_page_attr_set_clr(cpa, 1);
+
+	/* Check whether we really changed something */
+	if (!(cpa->flags & CPA_FLUSHTLB))
+		goto out;
+
+	/*
+	 * On success we use CLFLUSH, when the CPU supports it to
+	 * avoid the WBINVD.
+	 */
+	if (!ret && static_cpu_has(X86_FEATURE_CLFLUSH))
+		cpa_flush_range(addr, numpages, 1);
+	else
+		cpa_flush_all(1);
+
+out:
+	return ret;
+}
+
+int set_memory_enc(unsigned long addr, int numpages)
+{
+	struct cpa_data cpa;
+
+	if (!sme_me_mask)
+		return 0;
+
+	memset(&cpa, 0, sizeof(cpa));
+	cpa.vaddr = &addr;
+	cpa.numpages = numpages;
+	cpa.mask_set = __pgprot(_PAGE_ENC);
+	cpa.mask_clr = __pgprot(0);
+	cpa.pgd = init_mm.pgd;
+
+	return __set_memory_enc_dec(&cpa);
+}
+EXPORT_SYMBOL(set_memory_enc);
+
+int set_memory_dec(unsigned long addr, int numpages)
+{
+	struct cpa_data cpa;
+
+	if (!sme_me_mask)
+		return 0;
+
+	memset(&cpa, 0, sizeof(cpa));
+	cpa.vaddr = &addr;
+	cpa.numpages = numpages;
+	cpa.mask_set = __pgprot(0);
+	cpa.mask_clr = __pgprot(_PAGE_ENC);
+	cpa.pgd = init_mm.pgd;
+
+	return __set_memory_enc_dec(&cpa);
+}
+EXPORT_SYMBOL(set_memory_dec);
+
 int set_pages_uc(struct page *page, int numpages)
 {
 	unsigned long addr = (unsigned long)page_address(page);
diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
index 994a7df8..871b213 100644
--- a/arch/x86/platform/efi/efi.c
+++ b/arch/x86/platform/efi/efi.c
@@ -53,6 +53,7 @@
 #include <asm/x86_init.h>
 #include <asm/rtc.h>
 #include <asm/uv/uv.h>
+#include <asm/mem_encrypt.h>
 
 #define EFI_DEBUG
 
@@ -261,12 +262,12 @@ static int __init efi_systab_init(void *phys)
 		u64 tmp = 0;
 
 		if (efi_setup) {
-			data = early_memremap(efi_setup, sizeof(*data));
+			data = sme_early_memremap(efi_setup, sizeof(*data));
 			if (!data)
 				return -ENOMEM;
 		}
-		systab64 = early_memremap((unsigned long)phys,
-					 sizeof(*systab64));
+		systab64 = sme_early_memremap((unsigned long)phys,
+					      sizeof(*systab64));
 		if (systab64 == NULL) {
 			pr_err("Couldn't map the system table!\n");
 			if (data)
@@ -314,8 +315,8 @@ static int __init efi_systab_init(void *phys)
 	} else {
 		efi_system_table_32_t *systab32;
 
-		systab32 = early_memremap((unsigned long)phys,
-					 sizeof(*systab32));
+		systab32 = sme_early_memremap((unsigned long)phys,
+					      sizeof(*systab32));
 		if (systab32 == NULL) {
 			pr_err("Couldn't map the system table!\n");
 			return -ENOMEM;
@@ -361,8 +362,8 @@ static int __init efi_runtime_init32(void)
 {
 	efi_runtime_services_32_t *runtime;
 
-	runtime = early_memremap((unsigned long)efi.systab->runtime,
-			sizeof(efi_runtime_services_32_t));
+	runtime = sme_early_memremap((unsigned long)efi.systab->runtime,
+				     sizeof(efi_runtime_services_32_t));
 	if (!runtime) {
 		pr_err("Could not map the runtime service table!\n");
 		return -ENOMEM;
@@ -385,8 +386,8 @@ static int __init efi_runtime_init64(void)
 {
 	efi_runtime_services_64_t *runtime;
 
-	runtime = early_memremap((unsigned long)efi.systab->runtime,
-			sizeof(efi_runtime_services_64_t));
+	runtime = sme_early_memremap((unsigned long)efi.systab->runtime,
+				     sizeof(efi_runtime_services_64_t));
 	if (!runtime) {
 		pr_err("Could not map the runtime service table!\n");
 		return -ENOMEM;
@@ -444,8 +445,8 @@ static int __init efi_memmap_init(void)
 		return 0;
 
 	/* Map the EFI memory map */
-	memmap.map = early_memremap((unsigned long)memmap.phys_map,
-				   memmap.nr_map * memmap.desc_size);
+	memmap.map = sme_early_memremap((unsigned long)memmap.phys_map,
+					memmap.nr_map * memmap.desc_size);
 	if (memmap.map == NULL) {
 		pr_err("Could not map the memory map!\n");
 		return -ENOMEM;
@@ -490,7 +491,7 @@ void __init efi_init(void)
 	/*
 	 * Show what we know for posterity
 	 */
-	c16 = tmp = early_memremap(efi.systab->fw_vendor, 2);
+	c16 = tmp = sme_early_memremap(efi.systab->fw_vendor, 2);
 	if (c16) {
 		for (i = 0; i < sizeof(vendor) - 1 && *c16; ++i)
 			vendor[i] = *c16++;
@@ -690,6 +691,7 @@ static void *realloc_pages(void *old_memmap, int old_shift)
 	ret = (void *)__get_free_pages(GFP_KERNEL, old_shift + 1);
 	if (!ret)
 		goto out;
+	sme_set_mem_dec(ret, PAGE_SIZE << (old_shift + 1));
 
 	/*
 	 * A first-time allocation doesn't have anything to copy.
diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c
index 49e4dd4..834a992 100644
--- a/arch/x86/platform/efi/efi_64.c
+++ b/arch/x86/platform/efi/efi_64.c
@@ -223,7 +223,7 @@ int __init efi_setup_page_tables(unsigned long pa_memmap, unsigned num_pages)
 	if (efi_enabled(EFI_OLD_MEMMAP))
 		return 0;
 
-	efi_scratch.efi_pgt = (pgd_t *)__pa(efi_pgd);
+	efi_scratch.efi_pgt = (pgd_t *)__sme_pa(efi_pgd);
 	pgd = efi_pgd;
 
 	/*
@@ -262,7 +262,8 @@ int __init efi_setup_page_tables(unsigned long pa_memmap, unsigned num_pages)
 		pfn = md->phys_addr >> PAGE_SHIFT;
 		npages = md->num_pages;
 
-		if (kernel_map_pages_in_pgd(pgd, pfn, md->phys_addr, npages, _PAGE_RW)) {
+		if (kernel_map_pages_in_pgd(pgd, pfn, md->phys_addr, npages,
+					    _PAGE_RW | _PAGE_ENC)) {
 			pr_err("Failed to map 1:1 memory\n");
 			return 1;
 		}
@@ -272,6 +273,7 @@ int __init efi_setup_page_tables(unsigned long pa_memmap, unsigned num_pages)
 	if (!page)
 		panic("Unable to allocate EFI runtime stack < 4GB\n");
 
+	sme_set_mem_dec(page_address(page), PAGE_SIZE);
 	efi_scratch.phys_stack = virt_to_phys(page_address(page));
 	efi_scratch.phys_stack += PAGE_SIZE; /* stack grows down */
 
@@ -279,7 +281,8 @@ int __init efi_setup_page_tables(unsigned long pa_memmap, unsigned num_pages)
 	text = __pa(_text);
 	pfn = text >> PAGE_SHIFT;
 
-	if (kernel_map_pages_in_pgd(pgd, pfn, text, npages, _PAGE_RW)) {
+	if (kernel_map_pages_in_pgd(pgd, pfn, text, npages,
+				    _PAGE_RW | _PAGE_ENC)) {
 		pr_err("Failed to map kernel text 1:1\n");
 		return 1;
 	}
diff --git a/arch/x86/platform/efi/quirks.c b/arch/x86/platform/efi/quirks.c
index ab50ada..dde4fb6b 100644
--- a/arch/x86/platform/efi/quirks.c
+++ b/arch/x86/platform/efi/quirks.c
@@ -13,6 +13,7 @@
 #include <linux/dmi.h>
 #include <asm/efi.h>
 #include <asm/uv/uv.h>
+#include <asm/mem_encrypt.h>
 
 #define EFI_MIN_RESERVE 5120
 
@@ -265,6 +266,13 @@ void __init efi_free_boot_services(void)
 		if (md->attribute & EFI_MEMORY_RUNTIME)
 			continue;
 
+		/*
+		 * Change the mapping to encrypted memory before freeing.
+		 * This insures any future allocations of this mapped area
+		 * are used encrypted.
+		 */
+		sme_set_mem_enc(__va(start), size);
+
 		free_bootmem_late(start, size);
 	}
 
@@ -292,7 +300,7 @@ int __init efi_reuse_config(u64 tables, int nr_tables)
 	if (!efi_enabled(EFI_64BIT))
 		return 0;
 
-	data = early_memremap(efi_setup, sizeof(*data));
+	data = sme_early_memremap(efi_setup, sizeof(*data));
 	if (!data) {
 		ret = -ENOMEM;
 		goto out;
@@ -303,7 +311,7 @@ int __init efi_reuse_config(u64 tables, int nr_tables)
 
 	sz = sizeof(efi_config_table_64_t);
 
-	p = tablep = early_memremap(tables, nr_tables * sz);
+	p = tablep = sme_early_memremap(tables, nr_tables * sz);
 	if (!p) {
 		pr_err("Could not map Configuration table!\n");
 		ret = -ENOMEM;
diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
index 3a69ed5..25010c7 100644
--- a/drivers/firmware/efi/efi.c
+++ b/drivers/firmware/efi/efi.c
@@ -76,6 +76,16 @@ static int __init parse_efi_cmdline(char *str)
 }
 early_param("efi", parse_efi_cmdline);
 
+/*
+ * If memory encryption is supported, then an override to this function
+ * will be provided.
+ */
+void __weak __init *efi_me_early_memremap(resource_size_t phys_addr,
+					  unsigned long size)
+{
+	return early_memremap(phys_addr, size);
+}
+
 struct kobject *efi_kobj;
 
 /*
@@ -289,9 +299,9 @@ int __init efi_mem_desc_lookup(u64 phys_addr, efi_memory_desc_t *out_md)
 		 * So just always get our own virtual map on the CPU.
 		 *
 		 */
-		md = early_memremap(p, sizeof (*md));
+		md = efi_me_early_memremap(p, sizeof (*md));
 		if (!md) {
-			pr_err_once("early_memremap(%pa, %zu) failed.\n",
+			pr_err_once("efi_me_early_memremap(%pa, %zu) failed.\n",
 				    &p, sizeof (*md));
 			return -ENOMEM;
 		}
@@ -431,8 +441,8 @@ int __init efi_config_init(efi_config_table_type_t *arch_tables)
 	/*
 	 * Let's see what config tables the firmware passed to us.
 	 */
-	config_tables = early_memremap(efi.systab->tables,
-				       efi.systab->nr_tables * sz);
+	config_tables = efi_me_early_memremap(efi.systab->tables,
+					      efi.systab->nr_tables * sz);
 	if (config_tables == NULL) {
 		pr_err("Could not map Configuration table!\n");
 		return -ENOMEM;
diff --git a/drivers/firmware/efi/esrt.c b/drivers/firmware/efi/esrt.c
index 75feb3f..7a96bc6 100644
--- a/drivers/firmware/efi/esrt.c
+++ b/drivers/firmware/efi/esrt.c
@@ -273,10 +273,10 @@ void __init efi_esrt_init(void)
 		return;
 	}
 
-	va = early_memremap(efi.esrt, size);
+	va = efi_me_early_memremap(efi.esrt, size);
 	if (!va) {
-		pr_err("early_memremap(%p, %zu) failed.\n", (void *)efi.esrt,
-		       size);
+		pr_err("efi_me_early_memremap(%p, %zu) failed.\n",
+		       (void *)efi.esrt, size);
 		return;
 	}
 
@@ -323,10 +323,10 @@ void __init efi_esrt_init(void)
 	/* remap it with our (plausible) new pages */
 	early_memunmap(va, size);
 	size += entries_size;
-	va = early_memremap(efi.esrt, size);
+	va = efi_me_early_memremap(efi.esrt, size);
 	if (!va) {
-		pr_err("early_memremap(%p, %zu) failed.\n", (void *)efi.esrt,
-		       size);
+		pr_err("efi_me_early_memremap(%p, %zu) failed.\n",
+		       (void *)efi.esrt, size);
 		return;
 	}
 
diff --git a/include/linux/efi.h b/include/linux/efi.h
index 1626474..557c774 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -957,6 +957,9 @@ extern void __init efi_fake_memmap(void);
 static inline void efi_fake_memmap(void) { }
 #endif
 
+extern void __weak __init *efi_me_early_memremap(resource_size_t phys_addr,
+						 unsigned long size);
+
 /* Iterate through an efi_memory_map */
 #define for_each_efi_memory_desc(m, md)					   \
 	for ((md) = (m)->map;						   \

^ permalink raw reply related	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2016-06-17 15:51 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-04-26 22:45 [RFC PATCH v1 00/18] x86: Secure Memory Encryption (AMD) Tom Lendacky
2016-04-26 22:45 ` [RFC PATCH v1 01/18] x86: Set the write-protect cache mode for AMD processors Tom Lendacky
2016-04-26 22:45 ` [RFC PATCH v1 02/18] x86: Secure Memory Encryption (SME) build enablement Tom Lendacky
2016-04-26 22:45 ` [RFC PATCH v1 03/18] x86: Secure Memory Encryption (SME) support Tom Lendacky
2016-04-26 22:45 ` [RFC PATCH v1 04/18] x86: Add the Secure Memory Encryption cpu feature Tom Lendacky
2016-04-26 22:46 ` [RFC PATCH v1 05/18] x86: Handle reduction in physical address size with SME Tom Lendacky
2016-04-26 22:46 ` [RFC PATCH v1 06/18] x86: Provide general kernel support for memory encryption Tom Lendacky
2016-04-26 22:46 ` [RFC PATCH v1 07/18] x86: Extend the early_memmap support with additional attrs Tom Lendacky
2016-04-26 22:46 ` [RFC PATCH v1 08/18] x86: Add support for early encryption/decryption of memory Tom Lendacky
2016-04-26 22:46 ` [RFC PATCH v1 09/18] x86: Insure that memory areas are encrypted when possible Tom Lendacky
2016-04-26 22:47 ` [RFC PATCH v1 10/18] x86/efi: Access EFI related tables in the clear Tom Lendacky
2016-04-26 22:55 [RFC PATCH v1 00/18] x86: Secure Memory Encryption (AMD) Tom Lendacky
2016-04-26 22:57 ` [RFC PATCH v1 10/18] x86/efi: Access EFI related tables in the clear Tom Lendacky
2016-05-10 13:43   ` Matt Fleming
2016-05-10 13:57     ` Borislav Petkov
2016-05-12 18:20       ` Tom Lendacky
2016-05-24 14:54         ` Tom Lendacky
2016-05-25 16:09           ` Daniel Kiper
2016-05-25 19:30           ` Matt Fleming
2016-05-26 13:45             ` Tom Lendacky
2016-06-08 10:07               ` Matt Fleming
2016-06-09 16:16                 ` Tom Lendacky
2016-06-13 12:03                   ` Matt Fleming
2016-06-13 12:34                     ` Matt Fleming
2016-06-13 15:16                     ` Tom Lendacky
2016-06-08 11:18   ` Matt Fleming
2016-06-09 18:33     ` Tom Lendacky
2016-06-13 13:51       ` Matt Fleming
2016-06-15 13:17         ` Tom Lendacky
2016-06-16 14:38           ` Tom Lendacky
2016-06-17 15:51             ` Matt Fleming

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).