linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/5 V6] Support kdump for AMD secure memory encryption(SME)
@ 2018-08-31  8:19 Lianbo Jiang
  2018-08-31  8:19 ` [PATCH 1/5 V6] x86/ioremap: add a function ioremap_encrypted() to remap kdump old memroy Lianbo Jiang
                   ` (4 more replies)
  0 siblings, 5 replies; 15+ messages in thread
From: Lianbo Jiang @ 2018-08-31  8:19 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, tglx, hpa, ebiederm, joro, thomas.lendacky, dyoung, kexec,
	iommu, bhe

When SME is enabled on AMD machine, we also need to support kdump. Because
the memory is encrypted in the first kernel, we will remap the old memory
to the kdump kernel for dumping data, and SME is also enabled in the kdump
kernel, otherwise the old memory can not be decrypted.

For the kdump, it is necessary to distinguish whether the memory is encrypted.
Furthermore, we should also know which part of the memory is encrypted or
decrypted. We will appropriately remap the memory according to the specific
situation in order to tell cpu how to access the memory.

As we know, a page of memory that is marked as encrypted, which will be
automatically decrypted when read from DRAM, and will also be automatically
encrypted when written to DRAM. If the old memory is encrypted, we have to
remap the old memory with the memory encryption mask, which will automatically
decrypt the old memory when we read those data.

For kdump(SME), there are two cases that doesn't support:

 ----------------------------------------------
| first-kernel | second-kernel | kdump support |
|      (mem_encrypt=on|off)    |   (yes|no)    |
|--------------+---------------+---------------|
|     on       |     on        |     yes       |
|     off      |     off       |     yes       |
|     on       |     off       |     no        |
|     off      |     on        |     no        |
|______________|_______________|_______________|

1. SME is enabled in the first kernel, but SME is disabled in kdump kernel
In this case, because the old memory is encrypted, we can't decrypt the
old memory.

2. SME is disabled in the first kernel, but SME is enabled in kdump kernel
It is unnecessary to support in this case, because the old memory is
unencrypted, the old memory can be dumped as usual, we don't need to enable
SME in kdump kernel. Another, If we must support the scenario, it will
increase the complexity of the code, we will have to consider how to pass
the SME flag from the first kernel to the kdump kernel, in order to let the
kdump kernel know that whether the old memory is encrypted.

There are two methods to pass the SME flag to the kdump kernel. The first
method is to modify the assembly code, which includes some common code and
the path is too long. The second method is to use kexec tool, which could
require the SME flag to be exported in the first kernel by "proc" or "sysfs",
kexec tools will read the SME flag from "proc" or "sysfs" when we use kexec
tools to load image, subsequently the SME flag will be saved in boot_params,
we can properly remap the old memory according to the previously saved SME
flag. But it is too expensive to do this.

This patches are only for SME kdump, the patches don't support SEV kdump.

Test tools:
makedumpfile[v1.6.3]: https://github.com/LianboJ/makedumpfile
commit e1de103eca8f (A draft for kdump vmcore about AMD SME)
Note: This patch can only dump vmcore in the case of SME enabled.

crash-7.2.3: https://github.com/crash-utility/crash.git
commit 001f77a05585 (Fix for Linux 4.19-rc1 and later kernels that contain
                     kernel commit7290d58095712a89f845e1bca05334796dd49ed2)

Test environment:
HP ProLiant DL385Gen10 AMD EPYC 7251
8-Core Processor
32768 MB memory
600 GB disk space

Linux 4.19-rc1:
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
commit 5b394b2ddf0347bef56e50c69a58773c94343ff3

Reference:
AMD64 Architecture Programmer's Manual
https://support.amd.com/TechDocs/24593.pdf

Changes since v5:
1. modified patches log.

Some known issues:
1. about SME
Upstream kernel will hang on HP machine(DL385Gen10 AMD EPYC 7251) when
we execute the kexec command as follow:

# kexec -l /boot/vmlinuz-4.19.0-rc1+ --initrd=/boot/initramfs-4.19.0-rc1+.img --command-line="root=/dev/mapper/rhel_hp--dl385g10--03-root ro mem_encrypt=on rd.lvm.lv=rhel_hp-dl385g10-03/root rd.lvm.lv=rhel_hp-dl385g10-03/swap console=ttyS0,115200n81 LANG=en_US.UTF-8 earlyprintk=serial debug nokaslr"
# kexec -e (or reboot)

But this issue can not be reproduced on speedway machine, and this issue
is irrelevant to my posted patches.

The kernel log:
[ 1248.932239] kexec_core: Starting new kernel
early console in extract_kernel
input_data: 0x000000087e91c3b4
input_len: 0x000000000067fcbd
output: 0x000000087d400000
output_len: 0x0000000001b6fa90
kernel_total_size: 0x0000000001a9d000
trampoline_32bit: 0x0000000000099000

Decompressing Linux...
Parsing ELF...        [---Here the system will hang]

2. about SEV
Upstream kernel(Host OS) doesn't work in host side, some drivers about
SEV always go wrong in host side. We can't boot SEV Guest OS to test
kdump patch. Maybe it is more reasonable to improve SEV in another
patch. When some drivers can work in host side and it can also boot
Virtual Machine(SEV Guest OS), it will be suitable to fix SEV for kdump.

[  369.426131] INFO: task systemd-udevd:865 blocked for more than 120 seconds.
[  369.433177]       Not tainted 4.17.0-rc5+ #60
[  369.437585] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  369.445783] systemd-udevd   D    0   865    813 0x80000004
[  369.451323] Call Trace:
[  369.453815]  ? __schedule+0x290/0x870
[  369.457523]  schedule+0x32/0x80
[  369.460714]  __sev_do_cmd_locked+0x1f6/0x2a0 [ccp]
[  369.465556]  ? cleanup_uevent_env+0x10/0x10
[  369.470084]  ? remove_wait_queue+0x60/0x60
[  369.474219]  ? 0xffffffffc0247000
[  369.477572]  __sev_platform_init_locked+0x2b/0x70 [ccp]
[  369.482843]  sev_platform_init+0x1d/0x30 [ccp]
[  369.487333]  psp_pci_init+0x40/0xe0 [ccp]
[  369.491380]  ? 0xffffffffc0247000
[  369.494936]  sp_mod_init+0x18/0x1000 [ccp]
[  369.499071]  do_one_initcall+0x4e/0x1d4
[  369.502944]  ? _cond_resched+0x15/0x30
[  369.506728]  ? kmem_cache_alloc_trace+0xae/0x1d0
[  369.511386]  ? do_init_module+0x22/0x220
[  369.515345]  do_init_module+0x5a/0x220
[  369.519444]  load_module+0x21cb/0x2a50
[  369.523227]  ? m_show+0x1c0/0x1c0
[  369.526571]  ? security_capable+0x3f/0x60
[  369.530611]  __do_sys_finit_module+0x94/0xe0
[  369.534915]  do_syscall_64+0x5b/0x180
[  369.538607]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  369.543698] RIP: 0033:0x7f708e6311b9
[  369.547536] RSP: 002b:00007ffff9d32aa8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[  369.555162] RAX: ffffffffffffffda RBX: 000055602a04c2d0 RCX: 00007f708e6311b9
[  369.562346] RDX: 0000000000000000 RSI: 00007f708ef52039 RDI: 0000000000000008
[  369.569801] RBP: 00007f708ef52039 R08: 0000000000000000 R09: 000055602a048b20
[  369.576988] R10: 0000000000000008 R11: 0000000000000246 R12: 0000000000000000
[  369.584177] R13: 000055602a075260 R14: 0000000000020000 R15: 0000000000000000

Lianbo Jiang (5):
  x86/ioremap: add a function ioremap_encrypted() to remap kdump old
    memroy
  x86/ioremap: strengthen the logic in early_memremap_pgprot_adjust() to
    adjust encryption mask
  kexec: allocate unencrypted control pages for kdump in case SME is
    enabled
  iommu/amd_iommu: remap the device table of IOMMU with the memory
    encryption mask for kdump
  kdump/vmcore: support encrypted old memory with SME enabled

 arch/x86/include/asm/io.h            |  3 ++
 arch/x86/kernel/Makefile             |  1 +
 arch/x86/kernel/crash_dump_encrypt.c | 53 ++++++++++++++++++++++++++++
 arch/x86/mm/ioremap.c                | 34 +++++++++++++-----
 drivers/iommu/amd_iommu_init.c       | 14 ++++++--
 fs/proc/vmcore.c                     | 21 +++++++----
 include/linux/crash_dump.h           | 12 +++++++
 kernel/kexec_core.c                  | 12 +++++++
 8 files changed, 133 insertions(+), 17 deletions(-)
 create mode 100644 arch/x86/kernel/crash_dump_encrypt.c

-- 
2.17.1


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 1/5 V6] x86/ioremap: add a function ioremap_encrypted() to remap kdump old memroy
  2018-08-31  8:19 [PATCH 0/5 V6] Support kdump for AMD secure memory encryption(SME) Lianbo Jiang
@ 2018-08-31  8:19 ` Lianbo Jiang
  2018-08-31  8:19 ` [PATCH 2/5 V6] x86/ioremap: strengthen the logic in early_memremap_pgprot_adjust() to adjust encryption mask Lianbo Jiang
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 15+ messages in thread
From: Lianbo Jiang @ 2018-08-31  8:19 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, tglx, hpa, ebiederm, joro, thomas.lendacky, dyoung, kexec,
	iommu, bhe

When SME is enabled on AMD machine, the memory is encrypted in the first
kernel. In this case, SME also needs to be enabled in kdump kernel, and
we have to remap the old memory with the memory encryption mask.

Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
---
 arch/x86/include/asm/io.h |  3 +++
 arch/x86/mm/ioremap.c     | 25 +++++++++++++++++--------
 2 files changed, 20 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h
index 6de64840dd22..f8795f9581c7 100644
--- a/arch/x86/include/asm/io.h
+++ b/arch/x86/include/asm/io.h
@@ -192,6 +192,9 @@ extern void __iomem *ioremap_cache(resource_size_t offset, unsigned long size);
 #define ioremap_cache ioremap_cache
 extern void __iomem *ioremap_prot(resource_size_t offset, unsigned long size, unsigned long prot_val);
 #define ioremap_prot ioremap_prot
+extern void __iomem *ioremap_encrypted(resource_size_t phys_addr,
+					unsigned long size);
+#define ioremap_encrypted ioremap_encrypted
 
 /**
  * ioremap     -   map bus memory into CPU space
diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index c63a545ec199..e01e6c695add 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -24,6 +24,7 @@
 #include <asm/pgalloc.h>
 #include <asm/pat.h>
 #include <asm/setup.h>
+#include <linux/crash_dump.h>
 
 #include "physaddr.h"
 
@@ -131,7 +132,8 @@ static void __ioremap_check_mem(resource_size_t addr, unsigned long size,
  * caller shouldn't need to know that small detail.
  */
 static void __iomem *__ioremap_caller(resource_size_t phys_addr,
-		unsigned long size, enum page_cache_mode pcm, void *caller)
+		unsigned long size, enum page_cache_mode pcm,
+		void *caller, bool encrypted)
 {
 	unsigned long offset, vaddr;
 	resource_size_t last_addr;
@@ -199,7 +201,7 @@ static void __iomem *__ioremap_caller(resource_size_t phys_addr,
 	 * resulting mapping.
 	 */
 	prot = PAGE_KERNEL_IO;
-	if (sev_active() && mem_flags.desc_other)
+	if ((sev_active() && mem_flags.desc_other) || encrypted)
 		prot = pgprot_encrypted(prot);
 
 	switch (pcm) {
@@ -291,7 +293,7 @@ void __iomem *ioremap_nocache(resource_size_t phys_addr, unsigned long size)
 	enum page_cache_mode pcm = _PAGE_CACHE_MODE_UC_MINUS;
 
 	return __ioremap_caller(phys_addr, size, pcm,
-				__builtin_return_address(0));
+				__builtin_return_address(0), false);
 }
 EXPORT_SYMBOL(ioremap_nocache);
 
@@ -324,7 +326,7 @@ void __iomem *ioremap_uc(resource_size_t phys_addr, unsigned long size)
 	enum page_cache_mode pcm = _PAGE_CACHE_MODE_UC;
 
 	return __ioremap_caller(phys_addr, size, pcm,
-				__builtin_return_address(0));
+				__builtin_return_address(0), false);
 }
 EXPORT_SYMBOL_GPL(ioremap_uc);
 
@@ -341,7 +343,7 @@ EXPORT_SYMBOL_GPL(ioremap_uc);
 void __iomem *ioremap_wc(resource_size_t phys_addr, unsigned long size)
 {
 	return __ioremap_caller(phys_addr, size, _PAGE_CACHE_MODE_WC,
-					__builtin_return_address(0));
+					__builtin_return_address(0), false);
 }
 EXPORT_SYMBOL(ioremap_wc);
 
@@ -358,14 +360,21 @@ EXPORT_SYMBOL(ioremap_wc);
 void __iomem *ioremap_wt(resource_size_t phys_addr, unsigned long size)
 {
 	return __ioremap_caller(phys_addr, size, _PAGE_CACHE_MODE_WT,
-					__builtin_return_address(0));
+					__builtin_return_address(0), false);
 }
 EXPORT_SYMBOL(ioremap_wt);
 
+void __iomem *ioremap_encrypted(resource_size_t phys_addr, unsigned long size)
+{
+	return __ioremap_caller(phys_addr, size, _PAGE_CACHE_MODE_WB,
+				__builtin_return_address(0), true);
+}
+EXPORT_SYMBOL(ioremap_encrypted);
+
 void __iomem *ioremap_cache(resource_size_t phys_addr, unsigned long size)
 {
 	return __ioremap_caller(phys_addr, size, _PAGE_CACHE_MODE_WB,
-				__builtin_return_address(0));
+				__builtin_return_address(0), false);
 }
 EXPORT_SYMBOL(ioremap_cache);
 
@@ -374,7 +383,7 @@ void __iomem *ioremap_prot(resource_size_t phys_addr, unsigned long size,
 {
 	return __ioremap_caller(phys_addr, size,
 				pgprot2cachemode(__pgprot(prot_val)),
-				__builtin_return_address(0));
+				__builtin_return_address(0), false);
 }
 EXPORT_SYMBOL(ioremap_prot);
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 2/5 V6] x86/ioremap: strengthen the logic in early_memremap_pgprot_adjust() to adjust encryption mask
  2018-08-31  8:19 [PATCH 0/5 V6] Support kdump for AMD secure memory encryption(SME) Lianbo Jiang
  2018-08-31  8:19 ` [PATCH 1/5 V6] x86/ioremap: add a function ioremap_encrypted() to remap kdump old memroy Lianbo Jiang
@ 2018-08-31  8:19 ` Lianbo Jiang
  2018-09-03  2:45   ` Dave Young
  2018-08-31  8:19 ` [PATCH 3/5 V6] kexec: allocate unencrypted control pages for kdump in case SME is enabled Lianbo Jiang
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 15+ messages in thread
From: Lianbo Jiang @ 2018-08-31  8:19 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, tglx, hpa, ebiederm, joro, thomas.lendacky, dyoung, kexec,
	iommu, bhe

For kdump kernel, when SME is enabled, the acpi table and dmi table will need
to be remapped without the memory encryption mask. So we have to strengthen
the logic in early_memremap_pgprot_adjust(), which makes us have an opportunity
to adjust the memory encryption mask.

Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
---
 arch/x86/mm/ioremap.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index e01e6c695add..f9d9a39955f3 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -689,8 +689,15 @@ pgprot_t __init early_memremap_pgprot_adjust(resource_size_t phys_addr,
 	encrypted_prot = true;
 
 	if (sme_active()) {
+                /*
+                 * In kdump kernel, the acpi table and dmi table will need
+                 * to be remapped without the memory encryption mask. Here
+                 * we have to strengthen the logic to adjust the memory
+                 * encryption mask.
+                 */
 		if (early_memremap_is_setup_data(phys_addr, size) ||
-		    memremap_is_efi_data(phys_addr, size))
+		    memremap_is_efi_data(phys_addr, size) ||
+		    is_kdump_kernel())
 			encrypted_prot = false;
 	}
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 3/5 V6] kexec: allocate unencrypted control pages for kdump in case SME is enabled
  2018-08-31  8:19 [PATCH 0/5 V6] Support kdump for AMD secure memory encryption(SME) Lianbo Jiang
  2018-08-31  8:19 ` [PATCH 1/5 V6] x86/ioremap: add a function ioremap_encrypted() to remap kdump old memroy Lianbo Jiang
  2018-08-31  8:19 ` [PATCH 2/5 V6] x86/ioremap: strengthen the logic in early_memremap_pgprot_adjust() to adjust encryption mask Lianbo Jiang
@ 2018-08-31  8:19 ` Lianbo Jiang
  2018-08-31  8:19 ` [PATCH 4/5 V6] iommu/amd_iommu: remap the device table of IOMMU with the memory encryption mask for kdump Lianbo Jiang
  2018-08-31  8:19 ` [PATCH 5/5 V6] kdump/vmcore: support encrypted old memory with SME enabled Lianbo Jiang
  4 siblings, 0 replies; 15+ messages in thread
From: Lianbo Jiang @ 2018-08-31  8:19 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, tglx, hpa, ebiederm, joro, thomas.lendacky, dyoung, kexec,
	iommu, bhe

When SME is enabled in the first kernel, we will allocate unencrypted pages
for kdump in order to be able to boot the kdump kernel like kexec.

Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
---
 kernel/kexec_core.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
index 23a83a4da38a..e7efcd1a977b 100644
--- a/kernel/kexec_core.c
+++ b/kernel/kexec_core.c
@@ -471,6 +471,16 @@ static struct page *kimage_alloc_crash_control_pages(struct kimage *image,
 		}
 	}
 
+	if (pages) {
+		/*
+		 * For kdump, we need to ensure that these pages are
+		 * unencrypted pages if SME is enabled.
+		 * By the way, it is unnecessary to call the arch_
+		 * kexec_pre_free_pages(), which will make the code
+		 * become more simple.
+		 */
+		arch_kexec_post_alloc_pages(page_address(pages), 1 << order, 0);
+	}
 	return pages;
 }
 
@@ -867,6 +877,7 @@ static int kimage_load_crash_segment(struct kimage *image,
 			result  = -ENOMEM;
 			goto out;
 		}
+		arch_kexec_post_alloc_pages(page_address(page), 1, 0);
 		ptr = kmap(page);
 		ptr += maddr & ~PAGE_MASK;
 		mchunk = min_t(size_t, mbytes,
@@ -884,6 +895,7 @@ static int kimage_load_crash_segment(struct kimage *image,
 			result = copy_from_user(ptr, buf, uchunk);
 		kexec_flush_icache_page(page);
 		kunmap(page);
+		arch_kexec_pre_free_pages(page_address(page), 1);
 		if (result) {
 			result = -EFAULT;
 			goto out;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 4/5 V6] iommu/amd_iommu: remap the device table of IOMMU with the memory encryption mask for kdump
  2018-08-31  8:19 [PATCH 0/5 V6] Support kdump for AMD secure memory encryption(SME) Lianbo Jiang
                   ` (2 preceding siblings ...)
  2018-08-31  8:19 ` [PATCH 3/5 V6] kexec: allocate unencrypted control pages for kdump in case SME is enabled Lianbo Jiang
@ 2018-08-31  8:19 ` Lianbo Jiang
  2018-08-31  8:19 ` [PATCH 5/5 V6] kdump/vmcore: support encrypted old memory with SME enabled Lianbo Jiang
  4 siblings, 0 replies; 15+ messages in thread
From: Lianbo Jiang @ 2018-08-31  8:19 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, tglx, hpa, ebiederm, joro, thomas.lendacky, dyoung, kexec,
	iommu, bhe

In kdump kernel, it will copy the device table of IOMMU from the old device
table, which is encrypted when SME is enabled in the first kernel. So we
have to remap the old device table with the memory encryption mask.

Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
---
 drivers/iommu/amd_iommu_init.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
index 84b3e4445d46..3931c7de7c69 100644
--- a/drivers/iommu/amd_iommu_init.c
+++ b/drivers/iommu/amd_iommu_init.c
@@ -902,12 +902,22 @@ static bool copy_device_table(void)
 		}
 	}
 
-	old_devtb_phys = entry & PAGE_MASK;
+	/*
+	 * When SME is enabled in the first kernel, the entry includes the
+	 * memory encryption mask(sme_me_mask), we must remove the memory
+	 * encryption mask to obtain the true physical address in kdump kernel.
+	 */
+	old_devtb_phys = __sme_clr(entry) & PAGE_MASK;
+
 	if (old_devtb_phys >= 0x100000000ULL) {
 		pr_err("The address of old device table is above 4G, not trustworthy!\n");
 		return false;
 	}
-	old_devtb = memremap(old_devtb_phys, dev_table_size, MEMREMAP_WB);
+	old_devtb = (sme_active() && is_kdump_kernel())
+		    ? (__force void *)ioremap_encrypted(old_devtb_phys,
+							dev_table_size)
+		    : memremap(old_devtb_phys, dev_table_size, MEMREMAP_WB);
+
 	if (!old_devtb)
 		return false;
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 5/5 V6] kdump/vmcore: support encrypted old memory with SME enabled
  2018-08-31  8:19 [PATCH 0/5 V6] Support kdump for AMD secure memory encryption(SME) Lianbo Jiang
                   ` (3 preceding siblings ...)
  2018-08-31  8:19 ` [PATCH 4/5 V6] iommu/amd_iommu: remap the device table of IOMMU with the memory encryption mask for kdump Lianbo Jiang
@ 2018-08-31  8:19 ` Lianbo Jiang
  4 siblings, 0 replies; 15+ messages in thread
From: Lianbo Jiang @ 2018-08-31  8:19 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, tglx, hpa, ebiederm, joro, thomas.lendacky, dyoung, kexec,
	iommu, bhe

In kdump kernel, we need to dump the old memory into vmcore file,if SME
is enabled in the first kernel, we have to remap the old memory with the
memory encryption mask, which will be automatically decrypted when we
read from DRAM.

For SME kdump, there are two cases that doesn't support:

 ----------------------------------------------
| first-kernel | second-kernel | kdump support |
|      (mem_encrypt=on|off)    |   (yes|no)    |
|--------------+---------------+---------------|
|     on       |     on        |     yes       |
|     off      |     off       |     yes       |
|     on       |     off       |     no        |
|     off      |     on        |     no        |
|______________|_______________|_______________|

1. SME is enabled in the first kernel, but SME is disabled in kdump kernel
In this case, because the old memory is encrypted, we can't decrypt the
old memory.

2. SME is disabled in the first kernel, but SME is enabled in kdump kernel
On the one hand, the old memory is unencrypted, the old memory can be dumped
as usual, we don't need to enable SME in kdump kernel; On the other hand, it
will increase the complexity of the code, we will have to consider how to
pass the SME flag from the first kernel to the kdump kernel, it is really
too expensive to do this.

This patches are only for SME kdump, the patches don't support SEV kdump.

Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
---
 arch/x86/kernel/Makefile             |  1 +
 arch/x86/kernel/crash_dump_encrypt.c | 53 ++++++++++++++++++++++++++++
 fs/proc/vmcore.c                     | 21 +++++++----
 include/linux/crash_dump.h           | 12 +++++++
 4 files changed, 81 insertions(+), 6 deletions(-)
 create mode 100644 arch/x86/kernel/crash_dump_encrypt.c

diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 8824d01c0c35..dfbeae0e35ce 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -97,6 +97,7 @@ obj-$(CONFIG_KEXEC_CORE)	+= machine_kexec_$(BITS).o
 obj-$(CONFIG_KEXEC_CORE)	+= relocate_kernel_$(BITS).o crash.o
 obj-$(CONFIG_KEXEC_FILE)	+= kexec-bzimage64.o
 obj-$(CONFIG_CRASH_DUMP)	+= crash_dump_$(BITS).o
+obj-$(CONFIG_AMD_MEM_ENCRYPT)	+= crash_dump_encrypt.o
 obj-y				+= kprobes/
 obj-$(CONFIG_MODULES)		+= module.o
 obj-$(CONFIG_DOUBLEFAULT)	+= doublefault.o
diff --git a/arch/x86/kernel/crash_dump_encrypt.c b/arch/x86/kernel/crash_dump_encrypt.c
new file mode 100644
index 000000000000..e1b1a577f197
--- /dev/null
+++ b/arch/x86/kernel/crash_dump_encrypt.c
@@ -0,0 +1,53 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ *	Memory preserving reboot related code.
+ *
+ *	Created by: Lianbo Jiang (lijiang@redhat.com)
+ *	Copyright (C) RedHat Corporation, 2018. All rights reserved
+ */
+
+#include <linux/errno.h>
+#include <linux/crash_dump.h>
+#include <linux/uaccess.h>
+#include <linux/io.h>
+
+/**
+ * copy_oldmem_page_encrypted - copy one page from "oldmem encrypted"
+ * @pfn: page frame number to be copied
+ * @buf: target memory address for the copy; this can be in kernel address
+ *	space or user address space (see @userbuf)
+ * @csize: number of bytes to copy
+ * @offset: offset in bytes into the page (based on pfn) to begin the copy
+ * @userbuf: if set, @buf is in user address space, use copy_to_user(),
+ *	otherwise @buf is in kernel address space, use memcpy().
+ *
+ * Copy a page from "oldmem encrypted". For this page, there is no pte
+ * mapped in the current kernel. We stitch up a pte, similar to
+ * kmap_atomic.
+ */
+
+ssize_t copy_oldmem_page_encrypted(unsigned long pfn, char *buf,
+		size_t csize, unsigned long offset, int userbuf)
+{
+	void  *vaddr;
+
+	if (!csize)
+		return 0;
+
+	vaddr = (__force void *)ioremap_encrypted(pfn << PAGE_SHIFT,
+						  PAGE_SIZE);
+	if (!vaddr)
+		return -ENOMEM;
+
+	if (userbuf) {
+		if (copy_to_user((void __user *)buf, vaddr + offset, csize)) {
+			iounmap((void __iomem *)vaddr);
+			return -EFAULT;
+		}
+	} else
+		memcpy(buf, vaddr + offset, csize);
+
+	set_iounmap_nonlazy();
+	iounmap((void __iomem *)vaddr);
+	return csize;
+}
diff --git a/fs/proc/vmcore.c b/fs/proc/vmcore.c
index cbde728f8ac6..3065c8bada6a 100644
--- a/fs/proc/vmcore.c
+++ b/fs/proc/vmcore.c
@@ -25,6 +25,9 @@
 #include <linux/pagemap.h>
 #include <linux/uaccess.h>
 #include <asm/io.h>
+#include <linux/io.h>
+#include <linux/mem_encrypt.h>
+#include <asm/pgtable.h>
 #include "internal.h"
 
 /* List representing chunks of contiguous memory areas and their offsets in
@@ -98,7 +101,8 @@ static int pfn_is_ram(unsigned long pfn)
 
 /* Reads a page from the oldmem device from given offset. */
 static ssize_t read_from_oldmem(char *buf, size_t count,
-				u64 *ppos, int userbuf)
+				u64 *ppos, int userbuf,
+				bool encrypted)
 {
 	unsigned long pfn, offset;
 	size_t nr_bytes;
@@ -120,8 +124,11 @@ static ssize_t read_from_oldmem(char *buf, size_t count,
 		if (pfn_is_ram(pfn) == 0)
 			memset(buf, 0, nr_bytes);
 		else {
-			tmp = copy_oldmem_page(pfn, buf, nr_bytes,
-						offset, userbuf);
+			tmp = encrypted ? copy_oldmem_page_encrypted(pfn,
+					    buf, nr_bytes, offset, userbuf)
+					: copy_oldmem_page(pfn, buf, nr_bytes,
+							   offset, userbuf);
+
 			if (tmp < 0)
 				return tmp;
 		}
@@ -155,7 +162,7 @@ void __weak elfcorehdr_free(unsigned long long addr)
  */
 ssize_t __weak elfcorehdr_read(char *buf, size_t count, u64 *ppos)
 {
-	return read_from_oldmem(buf, count, ppos, 0);
+	return read_from_oldmem(buf, count, ppos, 0, false);
 }
 
 /*
@@ -163,7 +170,7 @@ ssize_t __weak elfcorehdr_read(char *buf, size_t count, u64 *ppos)
  */
 ssize_t __weak elfcorehdr_read_notes(char *buf, size_t count, u64 *ppos)
 {
-	return read_from_oldmem(buf, count, ppos, 0);
+	return read_from_oldmem(buf, count, ppos, 0, sme_active());
 }
 
 /*
@@ -173,6 +180,7 @@ int __weak remap_oldmem_pfn_range(struct vm_area_struct *vma,
 				  unsigned long from, unsigned long pfn,
 				  unsigned long size, pgprot_t prot)
 {
+	prot = pgprot_encrypted(prot);
 	return remap_pfn_range(vma, from, pfn, size, prot);
 }
 
@@ -351,7 +359,8 @@ static ssize_t __read_vmcore(char *buffer, size_t buflen, loff_t *fpos,
 					    m->offset + m->size - *fpos,
 					    buflen);
 			start = m->paddr + *fpos - m->offset;
-			tmp = read_from_oldmem(buffer, tsz, &start, userbuf);
+			tmp = read_from_oldmem(buffer, tsz, &start, userbuf,
+						sme_active());
 			if (tmp < 0)
 				return tmp;
 			buflen -= tsz;
diff --git a/include/linux/crash_dump.h b/include/linux/crash_dump.h
index 3e4ba9d753c8..a7e7be8b0502 100644
--- a/include/linux/crash_dump.h
+++ b/include/linux/crash_dump.h
@@ -26,6 +26,18 @@ extern int remap_oldmem_pfn_range(struct vm_area_struct *vma,
 
 extern ssize_t copy_oldmem_page(unsigned long, char *, size_t,
 						unsigned long, int);
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+extern ssize_t copy_oldmem_page_encrypted(unsigned long pfn, char *buf,
+					   size_t csize, unsigned long offset,
+					   int userbuf);
+#else
+static inline
+ssize_t copy_oldmem_page_encrypted(unsigned long pfn, char *buf, size_t csize,
+				    unsigned long offset, int userbuf)
+{
+	return 0;
+}
+#endif
 void vmcore_cleanup(void);
 
 /* Architecture code defines this if there are other possible ELF
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/5 V6] x86/ioremap: strengthen the logic in early_memremap_pgprot_adjust() to adjust encryption mask
  2018-08-31  8:19 ` [PATCH 2/5 V6] x86/ioremap: strengthen the logic in early_memremap_pgprot_adjust() to adjust encryption mask Lianbo Jiang
@ 2018-09-03  2:45   ` Dave Young
  2018-09-03 14:06     ` lijiang
  0 siblings, 1 reply; 15+ messages in thread
From: Dave Young @ 2018-09-03  2:45 UTC (permalink / raw)
  To: Lianbo Jiang
  Cc: linux-kernel, mingo, tglx, hpa, ebiederm, joro, thomas.lendacky,
	kexec, iommu, bhe

On 08/31/18 at 04:19pm, Lianbo Jiang wrote:
> For kdump kernel, when SME is enabled, the acpi table and dmi table will need
> to be remapped without the memory encryption mask. So we have to strengthen
> the logic in early_memremap_pgprot_adjust(), which makes us have an opportunity
> to adjust the memory encryption mask.
> 
> Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
> ---
>  arch/x86/mm/ioremap.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
> index e01e6c695add..f9d9a39955f3 100644
> --- a/arch/x86/mm/ioremap.c
> +++ b/arch/x86/mm/ioremap.c
> @@ -689,8 +689,15 @@ pgprot_t __init early_memremap_pgprot_adjust(resource_size_t phys_addr,
>  	encrypted_prot = true;
>  
>  	if (sme_active()) {
> +                /*
> +                 * In kdump kernel, the acpi table and dmi table will need
> +                 * to be remapped without the memory encryption mask. Here
> +                 * we have to strengthen the logic to adjust the memory
> +                 * encryption mask.

Assume the acpi/dmi tables are identical for both 1st kernel and kdump
kernel, I'm not sure what is the difference, why need special handling
for kdump. Can you add more explanations?

> +                 */
>  		if (early_memremap_is_setup_data(phys_addr, size) ||
> -		    memremap_is_efi_data(phys_addr, size))
> +		    memremap_is_efi_data(phys_addr, size) ||
> +		    is_kdump_kernel())
>  			encrypted_prot = false;
>  	}
>  
> -- 
> 2.17.1
> 

Thanks
Dave

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/5 V6] x86/ioremap: strengthen the logic in early_memremap_pgprot_adjust() to adjust encryption mask
  2018-09-03  2:45   ` Dave Young
@ 2018-09-03 14:06     ` lijiang
  2018-09-04  0:44       ` Dave Young
  0 siblings, 1 reply; 15+ messages in thread
From: lijiang @ 2018-09-03 14:06 UTC (permalink / raw)
  To: Dave Young
  Cc: linux-kernel, mingo, tglx, hpa, ebiederm, joro, thomas.lendacky,
	kexec, iommu, bhe

在 2018年09月03日 10:45, Dave Young 写道:
> On 08/31/18 at 04:19pm, Lianbo Jiang wrote:
>> For kdump kernel, when SME is enabled, the acpi table and dmi table will need
>> to be remapped without the memory encryption mask. So we have to strengthen
>> the logic in early_memremap_pgprot_adjust(), which makes us have an opportunity
>> to adjust the memory encryption mask.
>>
>> Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
>> ---
>>  arch/x86/mm/ioremap.c | 9 ++++++++-
>>  1 file changed, 8 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
>> index e01e6c695add..f9d9a39955f3 100644
>> --- a/arch/x86/mm/ioremap.c
>> +++ b/arch/x86/mm/ioremap.c
>> @@ -689,8 +689,15 @@ pgprot_t __init early_memremap_pgprot_adjust(resource_size_t phys_addr,
>>  	encrypted_prot = true;
>>  
>>  	if (sme_active()) {
>> +                /*
>> +                 * In kdump kernel, the acpi table and dmi table will need
>> +                 * to be remapped without the memory encryption mask. Here
>> +                 * we have to strengthen the logic to adjust the memory
>> +                 * encryption mask.
> 
> Assume the acpi/dmi tables are identical for both 1st kernel and kdump
> kernel, I'm not sure what is the difference, why need special handling
> for kdump. Can you add more explanations?
> 

Ok, i will use a dmi example to explain this issue.

There are significant differences about E820 between the 1st kernel and kdump kernel. I pasted them at bottom.

Firstly, we need to know how they are called.
__acpi_map_table()\                                                        / early_memremap_is_setup_data()
                   |-> early_memremap()-> early_memremap_pgprot_adjust()-> | memremap_is_efi_data()
 dmi_early_remap()/                                                        \ memremap_should_map_decrypted()-> e820__get_entry_type()

Secondly, we also need to understand the memremap_should_map_decrypted(), which is illustrated by the fake code.
static bool memremap_should_map_decrypted(resource_size_t phys_addr,
                                          unsigned long size)
{

    /* code ... */

    switch (e820__get_entry_type(phys_addr, phys_addr + size - 1)) {
        case E820_TYPE_RESERVED:
        case E820_TYPE_ACPI:
        case E820_TYPE_NVS:
        case E820_TYPE_UNUSABLE:
                /* For SEV, these areas are encrypted */
                if (sev_active())
                        break;
                /* Fallthrough */

        case E820_TYPE_PRAM:
                /* For SME, these areas are decrypted */
                return true;
        default:
                /* these areas are encrypted by default*/
                break;
    }

    return false;
}

For the dmi case, the dmi base address is 0x6286b000 in my test machine.

In the 1st kernel, the e820__get_entry_type() can get a valid entry and type by the dmi address, and we can also find the dmi base address from e820.
(see the 1st kernel log)
0x6286b000 ∈ [mem 0x000000006286b000-0x000000006286efff]
So, these areas are decrypted according to the memremap_should_map_decrypted().

In kdump kernel, the dmi base address is still 0x6286b000, but we can not find the dmi base address from e820 any more. The e820__get_entry_type() can
not get a valid entry and type by the dmi base address, it will go into the default branch. That is to say, these areas become encrypted. In fact, these
areas are also decrypted, so we have to strengthen the logic of adjusting the memory encryption mask.


The 1st kernel log:

[    0.000000] BIOS-provided physical RAM map:
[    0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000008bfff] usable
[    0.000000] BIOS-e820: [mem 0x000000000008c000-0x000000000009ffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000000e0000-0x00000000000fffff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000000100000-0x0000000029920fff] usable
[    0.000000] BIOS-e820: [mem 0x0000000029921000-0x0000000029921fff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000029922000-0x0000000062256fff] usable
[    0.000000] BIOS-e820: [mem 0x0000000062257000-0x0000000062356fff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000062357000-0x000000006235cfff] ACPI data
[    0.000000] BIOS-e820: [mem 0x000000006235d000-0x00000000623dbfff] usable
[    0.000000] BIOS-e820: [mem 0x00000000623dc000-0x000000006261bfff] reserved
[    0.000000] BIOS-e820: [mem 0x000000006261c000-0x000000006263dfff] usable
[    0.000000] BIOS-e820: [mem 0x000000006263e000-0x000000006269dfff] reserved
[    0.000000] BIOS-e820: [mem 0x000000006269e000-0x00000000627d6fff] usable
[    0.000000] BIOS-e820: [mem 0x00000000627d7000-0x00000000627e3fff] ACPI data
[    0.000000] BIOS-e820: [mem 0x00000000627e4000-0x00000000627e4fff] ACPI NVS
[    0.000000] BIOS-e820: [mem 0x00000000627e5000-0x00000000627e8fff] ACPI data
[    0.000000] BIOS-e820: [mem 0x00000000627e9000-0x00000000627eafff] usable
[    0.000000] BIOS-e820: [mem 0x00000000627eb000-0x00000000627ebfff] ACPI data
[    0.000000] BIOS-e820: [mem 0x00000000627ec000-0x000000006286afff] usable
[    0.000000] BIOS-e820: [mem 0x000000006286b000-0x000000006286efff] reserved
[    0.000000] BIOS-e820: [mem 0x000000006286f000-0x00000000682f8fff] usable
[    0.000000] BIOS-e820: [mem 0x00000000682f9000-0x0000000068b05fff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000068b06000-0x0000000068b09fff] ACPI NVS
[    0.000000] BIOS-e820: [mem 0x0000000068b0a000-0x0000000068b1afff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000068b1b000-0x0000000068b1dfff] ACPI NVS
[    0.000000] BIOS-e820: [mem 0x0000000068b1e000-0x0000000071d1dfff] usable
[    0.000000] BIOS-e820: [mem 0x0000000071d1e000-0x0000000071d2dfff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000071d2e000-0x0000000071d3dfff] ACPI NVS
[    0.000000] BIOS-e820: [mem 0x0000000071d3e000-0x0000000071d4dfff] ACPI data
[    0.000000] BIOS-e820: [mem 0x0000000071d4e000-0x0000000077ffffff] usable
[    0.000000] BIOS-e820: [mem 0x0000000078000000-0x000000008fffffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fed80000-0x00000000fed80fff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000087effffff] usable
[    0.000000] BIOS-e820: [mem 0x000000087f000000-0x000000087fffffff] reserved

The kdump kernel log:

[    0.000000] BIOS-provided physical RAM map:
[    0.000000] BIOS-e820: [mem 0x0000000000001000-0x000000000008bfff] usable
[    0.000000] BIOS-e820: [mem 0x0000000052000000-0x0000000061ffffff] usable
[    0.000000] BIOS-e820: [mem 0x00000000622ee000-0x0000000062300fff] ACPI data
[    0.000000] BIOS-e820: [mem 0x0000000062301000-0x0000000062301fff] ACPI NVS
[    0.000000] BIOS-e820: [mem 0x0000000062703000-0x0000000062703fff] ACPI data
[    0.000000] BIOS-e820: [mem 0x0000000062735000-0x0000000062737fff] ACPI data
[    0.000000] BIOS-e820: [mem 0x000000006273a000-0x000000006273afff] ACPI data
[    0.000000] BIOS-e820: [mem 0x0000000068b06000-0x0000000068b09fff] ACPI NVS
[    0.000000] BIOS-e820: [mem 0x0000000068b1b000-0x0000000068b1dfff] ACPI NVS
[    0.000000] BIOS-e820: [mem 0x0000000071d2e000-0x0000000071d3dfff] ACPI NVS
[    0.000000] BIOS-e820: [mem 0x0000000071d3e000-0x0000000071d4dfff] ACPI data
[    0.000000] BIOS-e820: [mem 0x00000007fe000000-0x000000087df70fff] usable

>> +                 */
>>  		if (early_memremap_is_setup_data(phys_addr, size) ||
>> -		    memremap_is_efi_data(phys_addr, size))
>> +		    memremap_is_efi_data(phys_addr, size) ||
>> +		    is_kdump_kernel())
>>  			encrypted_prot = false;
>>  	}
>>  
>> -- 
>> 2.17.1
>>
> 
> Thanks
> Dave
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/5 V6] x86/ioremap: strengthen the logic in early_memremap_pgprot_adjust() to adjust encryption mask
  2018-09-03 14:06     ` lijiang
@ 2018-09-04  0:44       ` Dave Young
  2018-09-04  1:29         ` Dave Young
  2018-09-04  1:44         ` lijiang
  0 siblings, 2 replies; 15+ messages in thread
From: Dave Young @ 2018-09-04  0:44 UTC (permalink / raw)
  To: lijiang
  Cc: linux-kernel, mingo, tglx, hpa, ebiederm, joro, thomas.lendacky,
	kexec, iommu, bhe

On 09/03/18 at 10:06pm, lijiang wrote:
> 在 2018年09月03日 10:45, Dave Young 写道:
> > On 08/31/18 at 04:19pm, Lianbo Jiang wrote:
> >> For kdump kernel, when SME is enabled, the acpi table and dmi table will need
> >> to be remapped without the memory encryption mask. So we have to strengthen
> >> the logic in early_memremap_pgprot_adjust(), which makes us have an opportunity
> >> to adjust the memory encryption mask.
> >>
> >> Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
> >> ---
> >>  arch/x86/mm/ioremap.c | 9 ++++++++-
> >>  1 file changed, 8 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
> >> index e01e6c695add..f9d9a39955f3 100644
> >> --- a/arch/x86/mm/ioremap.c
> >> +++ b/arch/x86/mm/ioremap.c
> >> @@ -689,8 +689,15 @@ pgprot_t __init early_memremap_pgprot_adjust(resource_size_t phys_addr,
> >>  	encrypted_prot = true;
> >>  
> >>  	if (sme_active()) {
> >> +                /*
> >> +                 * In kdump kernel, the acpi table and dmi table will need
> >> +                 * to be remapped without the memory encryption mask. Here
> >> +                 * we have to strengthen the logic to adjust the memory
> >> +                 * encryption mask.
> > 
> > Assume the acpi/dmi tables are identical for both 1st kernel and kdump
> > kernel, I'm not sure what is the difference, why need special handling
> > for kdump. Can you add more explanations?
> > 
> 
> Ok, i will use a dmi example to explain this issue.
> 
> There are significant differences about E820 between the 1st kernel and kdump kernel. I pasted them at bottom.
> 
> Firstly, we need to know how they are called.
> __acpi_map_table()\                                                        / early_memremap_is_setup_data()
>                    |-> early_memremap()-> early_memremap_pgprot_adjust()-> | memremap_is_efi_data()
>  dmi_early_remap()/                                                        \ memremap_should_map_decrypted()-> e820__get_entry_type()
> 
> Secondly, we also need to understand the memremap_should_map_decrypted(), which is illustrated by the fake code.
> static bool memremap_should_map_decrypted(resource_size_t phys_addr,
>                                           unsigned long size)
> {
> 
>     /* code ... */
> 
>     switch (e820__get_entry_type(phys_addr, phys_addr + size - 1)) {
>         case E820_TYPE_RESERVED:
>         case E820_TYPE_ACPI:
>         case E820_TYPE_NVS:
>         case E820_TYPE_UNUSABLE:
>                 /* For SEV, these areas are encrypted */
>                 if (sev_active())
>                         break;
>                 /* Fallthrough */
> 
>         case E820_TYPE_PRAM:
>                 /* For SME, these areas are decrypted */
>                 return true;
>         default:
>                 /* these areas are encrypted by default*/
>                 break;
>     }
> 
>     return false;
> }
> 
> For the dmi case, the dmi base address is 0x6286b000 in my test machine.
> 
> In the 1st kernel, the e820__get_entry_type() can get a valid entry and type by the dmi address, and we can also find the dmi base address from e820.
> (see the 1st kernel log)
> 0x6286b000 ∈ [mem 0x000000006286b000-0x000000006286efff]
> So, these areas are decrypted according to the memremap_should_map_decrypted().
> 
> In kdump kernel, the dmi base address is still 0x6286b000, but we can not find the dmi base address from e820 any more. The e820__get_entry_type() can
> not get a valid entry and type by the dmi base address, it will go into the default branch. That is to say, these areas become encrypted. In fact, these
> areas are also decrypted, so we have to strengthen the logic of adjusting the memory encryption mask.
> 
> 
> The 1st kernel log:
> 
> [    0.000000] BIOS-provided physical RAM map:
> [    0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000008bfff] usable
> [    0.000000] BIOS-e820: [mem 0x000000000008c000-0x000000000009ffff] reserved
> [    0.000000] BIOS-e820: [mem 0x00000000000e0000-0x00000000000fffff] reserved
> [    0.000000] BIOS-e820: [mem 0x0000000000100000-0x0000000029920fff] usable
> [    0.000000] BIOS-e820: [mem 0x0000000029921000-0x0000000029921fff] reserved
> [    0.000000] BIOS-e820: [mem 0x0000000029922000-0x0000000062256fff] usable
> [    0.000000] BIOS-e820: [mem 0x0000000062257000-0x0000000062356fff] reserved
> [    0.000000] BIOS-e820: [mem 0x0000000062357000-0x000000006235cfff] ACPI data
> [    0.000000] BIOS-e820: [mem 0x000000006235d000-0x00000000623dbfff] usable
> [    0.000000] BIOS-e820: [mem 0x00000000623dc000-0x000000006261bfff] reserved
> [    0.000000] BIOS-e820: [mem 0x000000006261c000-0x000000006263dfff] usable
> [    0.000000] BIOS-e820: [mem 0x000000006263e000-0x000000006269dfff] reserved
> [    0.000000] BIOS-e820: [mem 0x000000006269e000-0x00000000627d6fff] usable
> [    0.000000] BIOS-e820: [mem 0x00000000627d7000-0x00000000627e3fff] ACPI data
> [    0.000000] BIOS-e820: [mem 0x00000000627e4000-0x00000000627e4fff] ACPI NVS
> [    0.000000] BIOS-e820: [mem 0x00000000627e5000-0x00000000627e8fff] ACPI data
> [    0.000000] BIOS-e820: [mem 0x00000000627e9000-0x00000000627eafff] usable
> [    0.000000] BIOS-e820: [mem 0x00000000627eb000-0x00000000627ebfff] ACPI data
> [    0.000000] BIOS-e820: [mem 0x00000000627ec000-0x000000006286afff] usable
> [    0.000000] BIOS-e820: [mem 0x000000006286b000-0x000000006286efff] reserved
> [    0.000000] BIOS-e820: [mem 0x000000006286f000-0x00000000682f8fff] usable
> [    0.000000] BIOS-e820: [mem 0x00000000682f9000-0x0000000068b05fff] reserved
> [    0.000000] BIOS-e820: [mem 0x0000000068b06000-0x0000000068b09fff] ACPI NVS
> [    0.000000] BIOS-e820: [mem 0x0000000068b0a000-0x0000000068b1afff] reserved
> [    0.000000] BIOS-e820: [mem 0x0000000068b1b000-0x0000000068b1dfff] ACPI NVS
> [    0.000000] BIOS-e820: [mem 0x0000000068b1e000-0x0000000071d1dfff] usable
> [    0.000000] BIOS-e820: [mem 0x0000000071d1e000-0x0000000071d2dfff] reserved
> [    0.000000] BIOS-e820: [mem 0x0000000071d2e000-0x0000000071d3dfff] ACPI NVS
> [    0.000000] BIOS-e820: [mem 0x0000000071d3e000-0x0000000071d4dfff] ACPI data
> [    0.000000] BIOS-e820: [mem 0x0000000071d4e000-0x0000000077ffffff] usable
> [    0.000000] BIOS-e820: [mem 0x0000000078000000-0x000000008fffffff] reserved
> [    0.000000] BIOS-e820: [mem 0x00000000fed80000-0x00000000fed80fff] reserved
> [    0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000087effffff] usable
> [    0.000000] BIOS-e820: [mem 0x000000087f000000-0x000000087fffffff] reserved
> 
> The kdump kernel log:
> 
> [    0.000000] BIOS-provided physical RAM map:
> [    0.000000] BIOS-e820: [mem 0x0000000000001000-0x000000000008bfff] usable
> [    0.000000] BIOS-e820: [mem 0x0000000052000000-0x0000000061ffffff] usable
> [    0.000000] BIOS-e820: [mem 0x00000000622ee000-0x0000000062300fff] ACPI data
> [    0.000000] BIOS-e820: [mem 0x0000000062301000-0x0000000062301fff] ACPI NVS
> [    0.000000] BIOS-e820: [mem 0x0000000062703000-0x0000000062703fff] ACPI data
> [    0.000000] BIOS-e820: [mem 0x0000000062735000-0x0000000062737fff] ACPI data
> [    0.000000] BIOS-e820: [mem 0x000000006273a000-0x000000006273afff] ACPI data
> [    0.000000] BIOS-e820: [mem 0x0000000068b06000-0x0000000068b09fff] ACPI NVS
> [    0.000000] BIOS-e820: [mem 0x0000000068b1b000-0x0000000068b1dfff] ACPI NVS
> [    0.000000] BIOS-e820: [mem 0x0000000071d2e000-0x0000000071d3dfff] ACPI NVS
> [    0.000000] BIOS-e820: [mem 0x0000000071d3e000-0x0000000071d4dfff] ACPI data
> [    0.000000] BIOS-e820: [mem 0x00000007fe000000-0x000000087df70fff] usable
> 

Can you provide the efi memmap dmesg?  boot with efi=debug?

> >> +                 */
> >>  		if (early_memremap_is_setup_data(phys_addr, size) ||
> >> -		    memremap_is_efi_data(phys_addr, size))
> >> +		    memremap_is_efi_data(phys_addr, size) ||
> >> +		    is_kdump_kernel())
> >>  			encrypted_prot = false;
> >>  	}
> >>  
> >> -- 
> >> 2.17.1
> >>
> > 
> > Thanks
> > Dave
> > 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/5 V6] x86/ioremap: strengthen the logic in early_memremap_pgprot_adjust() to adjust encryption mask
  2018-09-04  0:44       ` Dave Young
@ 2018-09-04  1:29         ` Dave Young
  2018-09-04  1:51           ` Dave Young
  2018-09-04  1:44         ` lijiang
  1 sibling, 1 reply; 15+ messages in thread
From: Dave Young @ 2018-09-04  1:29 UTC (permalink / raw)
  To: lijiang
  Cc: linux-kernel, mingo, tglx, hpa, ebiederm, joro, thomas.lendacky,
	kexec, iommu, bhe, Ard Biesheuvel, linux-efi

On 09/04/18 at 08:44am, Dave Young wrote:
> On 09/03/18 at 10:06pm, lijiang wrote:
> > 在 2018年09月03日 10:45, Dave Young 写道:
> > > On 08/31/18 at 04:19pm, Lianbo Jiang wrote:
> > >> For kdump kernel, when SME is enabled, the acpi table and dmi table will need
> > >> to be remapped without the memory encryption mask. So we have to strengthen
> > >> the logic in early_memremap_pgprot_adjust(), which makes us have an opportunity
> > >> to adjust the memory encryption mask.
> > >>
> > >> Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
> > >> ---
> > >>  arch/x86/mm/ioremap.c | 9 ++++++++-
> > >>  1 file changed, 8 insertions(+), 1 deletion(-)
> > >>
> > >> diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
> > >> index e01e6c695add..f9d9a39955f3 100644
> > >> --- a/arch/x86/mm/ioremap.c
> > >> +++ b/arch/x86/mm/ioremap.c
> > >> @@ -689,8 +689,15 @@ pgprot_t __init early_memremap_pgprot_adjust(resource_size_t phys_addr,
> > >>  	encrypted_prot = true;
> > >>  
> > >>  	if (sme_active()) {
> > >> +                /*
> > >> +                 * In kdump kernel, the acpi table and dmi table will need
> > >> +                 * to be remapped without the memory encryption mask. Here
> > >> +                 * we have to strengthen the logic to adjust the memory
> > >> +                 * encryption mask.
> > > 
> > > Assume the acpi/dmi tables are identical for both 1st kernel and kdump
> > > kernel, I'm not sure what is the difference, why need special handling
> > > for kdump. Can you add more explanations?
> > > 
> > 
> > Ok, i will use a dmi example to explain this issue.
> > 
> > There are significant differences about E820 between the 1st kernel and kdump kernel. I pasted them at bottom.
> > 
> > Firstly, we need to know how they are called.
> > __acpi_map_table()\                                                        / early_memremap_is_setup_data()
> >                    |-> early_memremap()-> early_memremap_pgprot_adjust()-> | memremap_is_efi_data()
> >  dmi_early_remap()/                                                        \ memremap_should_map_decrypted()-> e820__get_entry_type()
> > 
> > Secondly, we also need to understand the memremap_should_map_decrypted(), which is illustrated by the fake code.
> > static bool memremap_should_map_decrypted(resource_size_t phys_addr,
> >                                           unsigned long size)
> > {
> > 
> >     /* code ... */
> > 
> >     switch (e820__get_entry_type(phys_addr, phys_addr + size - 1)) {
> >         case E820_TYPE_RESERVED:
> >         case E820_TYPE_ACPI:
> >         case E820_TYPE_NVS:
> >         case E820_TYPE_UNUSABLE:
> >                 /* For SEV, these areas are encrypted */
> >                 if (sev_active())
> >                         break;
> >                 /* Fallthrough */
> > 
> >         case E820_TYPE_PRAM:
> >                 /* For SME, these areas are decrypted */
> >                 return true;
> >         default:
> >                 /* these areas are encrypted by default*/
> >                 break;
> >     }
> > 
> >     return false;
> > }
> > 
> > For the dmi case, the dmi base address is 0x6286b000 in my test machine.
> > 
> > In the 1st kernel, the e820__get_entry_type() can get a valid entry and type by the dmi address, and we can also find the dmi base address from e820.
> > (see the 1st kernel log)
> > 0x6286b000 ∈ [mem 0x000000006286b000-0x000000006286efff]
> > So, these areas are decrypted according to the memremap_should_map_decrypted().
> > 
> > In kdump kernel, the dmi base address is still 0x6286b000, but we can not find the dmi base address from e820 any more. The e820__get_entry_type() can
> > not get a valid entry and type by the dmi base address, it will go into the default branch. That is to say, these areas become encrypted. In fact, these
> > areas are also decrypted, so we have to strengthen the logic of adjusting the memory encryption mask.
> > 
> > 
> > The 1st kernel log:
> > 
> > [    0.000000] BIOS-provided physical RAM map:
> > [    0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000008bfff] usable
> > [    0.000000] BIOS-e820: [mem 0x000000000008c000-0x000000000009ffff] reserved
> > [    0.000000] BIOS-e820: [mem 0x00000000000e0000-0x00000000000fffff] reserved
> > [    0.000000] BIOS-e820: [mem 0x0000000000100000-0x0000000029920fff] usable
> > [    0.000000] BIOS-e820: [mem 0x0000000029921000-0x0000000029921fff] reserved
> > [    0.000000] BIOS-e820: [mem 0x0000000029922000-0x0000000062256fff] usable
> > [    0.000000] BIOS-e820: [mem 0x0000000062257000-0x0000000062356fff] reserved
> > [    0.000000] BIOS-e820: [mem 0x0000000062357000-0x000000006235cfff] ACPI data
> > [    0.000000] BIOS-e820: [mem 0x000000006235d000-0x00000000623dbfff] usable
> > [    0.000000] BIOS-e820: [mem 0x00000000623dc000-0x000000006261bfff] reserved
> > [    0.000000] BIOS-e820: [mem 0x000000006261c000-0x000000006263dfff] usable
> > [    0.000000] BIOS-e820: [mem 0x000000006263e000-0x000000006269dfff] reserved
> > [    0.000000] BIOS-e820: [mem 0x000000006269e000-0x00000000627d6fff] usable
> > [    0.000000] BIOS-e820: [mem 0x00000000627d7000-0x00000000627e3fff] ACPI data
> > [    0.000000] BIOS-e820: [mem 0x00000000627e4000-0x00000000627e4fff] ACPI NVS
> > [    0.000000] BIOS-e820: [mem 0x00000000627e5000-0x00000000627e8fff] ACPI data
> > [    0.000000] BIOS-e820: [mem 0x00000000627e9000-0x00000000627eafff] usable
> > [    0.000000] BIOS-e820: [mem 0x00000000627eb000-0x00000000627ebfff] ACPI data
> > [    0.000000] BIOS-e820: [mem 0x00000000627ec000-0x000000006286afff] usable
> > [    0.000000] BIOS-e820: [mem 0x000000006286b000-0x000000006286efff] reserved
> > [    0.000000] BIOS-e820: [mem 0x000000006286f000-0x00000000682f8fff] usable
> > [    0.000000] BIOS-e820: [mem 0x00000000682f9000-0x0000000068b05fff] reserved
> > [    0.000000] BIOS-e820: [mem 0x0000000068b06000-0x0000000068b09fff] ACPI NVS
> > [    0.000000] BIOS-e820: [mem 0x0000000068b0a000-0x0000000068b1afff] reserved
> > [    0.000000] BIOS-e820: [mem 0x0000000068b1b000-0x0000000068b1dfff] ACPI NVS
> > [    0.000000] BIOS-e820: [mem 0x0000000068b1e000-0x0000000071d1dfff] usable
> > [    0.000000] BIOS-e820: [mem 0x0000000071d1e000-0x0000000071d2dfff] reserved
> > [    0.000000] BIOS-e820: [mem 0x0000000071d2e000-0x0000000071d3dfff] ACPI NVS
> > [    0.000000] BIOS-e820: [mem 0x0000000071d3e000-0x0000000071d4dfff] ACPI data
> > [    0.000000] BIOS-e820: [mem 0x0000000071d4e000-0x0000000077ffffff] usable
> > [    0.000000] BIOS-e820: [mem 0x0000000078000000-0x000000008fffffff] reserved
> > [    0.000000] BIOS-e820: [mem 0x00000000fed80000-0x00000000fed80fff] reserved
> > [    0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000087effffff] usable
> > [    0.000000] BIOS-e820: [mem 0x000000087f000000-0x000000087fffffff] reserved
> > 
> > The kdump kernel log:
> > 
> > [    0.000000] BIOS-provided physical RAM map:
> > [    0.000000] BIOS-e820: [mem 0x0000000000001000-0x000000000008bfff] usable
> > [    0.000000] BIOS-e820: [mem 0x0000000052000000-0x0000000061ffffff] usable
> > [    0.000000] BIOS-e820: [mem 0x00000000622ee000-0x0000000062300fff] ACPI data
> > [    0.000000] BIOS-e820: [mem 0x0000000062301000-0x0000000062301fff] ACPI NVS
> > [    0.000000] BIOS-e820: [mem 0x0000000062703000-0x0000000062703fff] ACPI data
> > [    0.000000] BIOS-e820: [mem 0x0000000062735000-0x0000000062737fff] ACPI data
> > [    0.000000] BIOS-e820: [mem 0x000000006273a000-0x000000006273afff] ACPI data
> > [    0.000000] BIOS-e820: [mem 0x0000000068b06000-0x0000000068b09fff] ACPI NVS
> > [    0.000000] BIOS-e820: [mem 0x0000000068b1b000-0x0000000068b1dfff] ACPI NVS
> > [    0.000000] BIOS-e820: [mem 0x0000000071d2e000-0x0000000071d3dfff] ACPI NVS
> > [    0.000000] BIOS-e820: [mem 0x0000000071d3e000-0x0000000071d4dfff] ACPI data
> > [    0.000000] BIOS-e820: [mem 0x00000007fe000000-0x000000087df70fff] usable
> > 
> 
> Can you provide the efi memmap dmesg?  boot with efi=debug?

The right way should be checking the efi mem types instead of only
checking is_kdump_kernel.

Something like below, probably also check the region size with something
like efi_mem_range_type(addr, size), return -EINVAL in case cross
different type efi memory desc, added efi people in cc:

diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index c63a545ec199..4a24e138c0d0 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -527,6 +527,13 @@ static bool memremap_should_map_decrypted(resource_size_t phys_addr,
 		break;
 	}
 
+	if (is_kdump_kernel() {
+		switch (efi_mem_type(phys_addr)) {
+		/* refer to arch/x86/boot/compressed/eboot.c -> setup_e820()*/
+		case ...
+		}
+	}
+
 	return false;
 }
 

> 
> > >> +                 */
> > >>  		if (early_memremap_is_setup_data(phys_addr, size) ||
> > >> -		    memremap_is_efi_data(phys_addr, size))
> > >> +		    memremap_is_efi_data(phys_addr, size) ||
> > >> +		    is_kdump_kernel())
> > >>  			encrypted_prot = false;
> > >>  	}
> > >>  
> > >> -- 
> > >> 2.17.1
> > >>
> > > 
> > > Thanks
> > > Dave
> > > 

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/5 V6] x86/ioremap: strengthen the logic in early_memremap_pgprot_adjust() to adjust encryption mask
  2018-09-04  0:44       ` Dave Young
  2018-09-04  1:29         ` Dave Young
@ 2018-09-04  1:44         ` lijiang
  1 sibling, 0 replies; 15+ messages in thread
From: lijiang @ 2018-09-04  1:44 UTC (permalink / raw)
  To: Dave Young
  Cc: linux-kernel, mingo, tglx, hpa, ebiederm, joro, thomas.lendacky,
	kexec, iommu, bhe

在 2018年09月04日 08:44, Dave Young 写道:
> On 09/03/18 at 10:06pm, lijiang wrote:
>> 在 2018年09月03日 10:45, Dave Young 写道:
>>> On 08/31/18 at 04:19pm, Lianbo Jiang wrote:
>>>> For kdump kernel, when SME is enabled, the acpi table and dmi table will need
>>>> to be remapped without the memory encryption mask. So we have to strengthen
>>>> the logic in early_memremap_pgprot_adjust(), which makes us have an opportunity
>>>> to adjust the memory encryption mask.
>>>>
>>>> Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
>>>> ---
>>>>  arch/x86/mm/ioremap.c | 9 ++++++++-
>>>>  1 file changed, 8 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
>>>> index e01e6c695add..f9d9a39955f3 100644
>>>> --- a/arch/x86/mm/ioremap.c
>>>> +++ b/arch/x86/mm/ioremap.c
>>>> @@ -689,8 +689,15 @@ pgprot_t __init early_memremap_pgprot_adjust(resource_size_t phys_addr,
>>>>  	encrypted_prot = true;
>>>>  
>>>>  	if (sme_active()) {
>>>> +                /*
>>>> +                 * In kdump kernel, the acpi table and dmi table will need
>>>> +                 * to be remapped without the memory encryption mask. Here
>>>> +                 * we have to strengthen the logic to adjust the memory
>>>> +                 * encryption mask.
>>>
>>> Assume the acpi/dmi tables are identical for both 1st kernel and kdump
>>> kernel, I'm not sure what is the difference, why need special handling
>>> for kdump. Can you add more explanations?
>>>
>>
>> Ok, i will use a dmi example to explain this issue.
>>
>> There are significant differences about E820 between the 1st kernel and kdump kernel. I pasted them at bottom.
>>
>> Firstly, we need to know how they are called.
>> __acpi_map_table()\                                                        / early_memremap_is_setup_data()
>>                    |-> early_memremap()-> early_memremap_pgprot_adjust()-> | memremap_is_efi_data()
>>  dmi_early_remap()/                                                        \ memremap_should_map_decrypted()-> e820__get_entry_type()
>>
>> Secondly, we also need to understand the memremap_should_map_decrypted(), which is illustrated by the fake code.
>> static bool memremap_should_map_decrypted(resource_size_t phys_addr,
>>                                           unsigned long size)
>> {
>>
>>     /* code ... */
>>
>>     switch (e820__get_entry_type(phys_addr, phys_addr + size - 1)) {
>>         case E820_TYPE_RESERVED:
>>         case E820_TYPE_ACPI:
>>         case E820_TYPE_NVS:
>>         case E820_TYPE_UNUSABLE:
>>                 /* For SEV, these areas are encrypted */
>>                 if (sev_active())
>>                         break;
>>                 /* Fallthrough */
>>
>>         case E820_TYPE_PRAM:
>>                 /* For SME, these areas are decrypted */
>>                 return true;
>>         default:
>>                 /* these areas are encrypted by default*/
>>                 break;
>>     }
>>
>>     return false;
>> }
>>
>> For the dmi case, the dmi base address is 0x6286b000 in my test machine.
>>
>> In the 1st kernel, the e820__get_entry_type() can get a valid entry and type by the dmi address, and we can also find the dmi base address from e820.
>> (see the 1st kernel log)
>> 0x6286b000 ∈ [mem 0x000000006286b000-0x000000006286efff]
>> So, these areas are decrypted according to the memremap_should_map_decrypted().
>>
>> In kdump kernel, the dmi base address is still 0x6286b000, but we can not find the dmi base address from e820 any more. The e820__get_entry_type() can
>> not get a valid entry and type by the dmi base address, it will go into the default branch. That is to say, these areas become encrypted. In fact, these
>> areas are also decrypted, so we have to strengthen the logic of adjusting the memory encryption mask.
>>
>>
>> The 1st kernel log:
>>
>> [    0.000000] BIOS-provided physical RAM map:
>> [    0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000008bfff] usable
>> [    0.000000] BIOS-e820: [mem 0x000000000008c000-0x000000000009ffff] reserved
>> [    0.000000] BIOS-e820: [mem 0x00000000000e0000-0x00000000000fffff] reserved
>> [    0.000000] BIOS-e820: [mem 0x0000000000100000-0x0000000029920fff] usable
>> [    0.000000] BIOS-e820: [mem 0x0000000029921000-0x0000000029921fff] reserved
>> [    0.000000] BIOS-e820: [mem 0x0000000029922000-0x0000000062256fff] usable
>> [    0.000000] BIOS-e820: [mem 0x0000000062257000-0x0000000062356fff] reserved
>> [    0.000000] BIOS-e820: [mem 0x0000000062357000-0x000000006235cfff] ACPI data
>> [    0.000000] BIOS-e820: [mem 0x000000006235d000-0x00000000623dbfff] usable
>> [    0.000000] BIOS-e820: [mem 0x00000000623dc000-0x000000006261bfff] reserved
>> [    0.000000] BIOS-e820: [mem 0x000000006261c000-0x000000006263dfff] usable
>> [    0.000000] BIOS-e820: [mem 0x000000006263e000-0x000000006269dfff] reserved
>> [    0.000000] BIOS-e820: [mem 0x000000006269e000-0x00000000627d6fff] usable
>> [    0.000000] BIOS-e820: [mem 0x00000000627d7000-0x00000000627e3fff] ACPI data
>> [    0.000000] BIOS-e820: [mem 0x00000000627e4000-0x00000000627e4fff] ACPI NVS
>> [    0.000000] BIOS-e820: [mem 0x00000000627e5000-0x00000000627e8fff] ACPI data
>> [    0.000000] BIOS-e820: [mem 0x00000000627e9000-0x00000000627eafff] usable
>> [    0.000000] BIOS-e820: [mem 0x00000000627eb000-0x00000000627ebfff] ACPI data
>> [    0.000000] BIOS-e820: [mem 0x00000000627ec000-0x000000006286afff] usable
>> [    0.000000] BIOS-e820: [mem 0x000000006286b000-0x000000006286efff] reserved
>> [    0.000000] BIOS-e820: [mem 0x000000006286f000-0x00000000682f8fff] usable
>> [    0.000000] BIOS-e820: [mem 0x00000000682f9000-0x0000000068b05fff] reserved
>> [    0.000000] BIOS-e820: [mem 0x0000000068b06000-0x0000000068b09fff] ACPI NVS
>> [    0.000000] BIOS-e820: [mem 0x0000000068b0a000-0x0000000068b1afff] reserved
>> [    0.000000] BIOS-e820: [mem 0x0000000068b1b000-0x0000000068b1dfff] ACPI NVS
>> [    0.000000] BIOS-e820: [mem 0x0000000068b1e000-0x0000000071d1dfff] usable
>> [    0.000000] BIOS-e820: [mem 0x0000000071d1e000-0x0000000071d2dfff] reserved
>> [    0.000000] BIOS-e820: [mem 0x0000000071d2e000-0x0000000071d3dfff] ACPI NVS
>> [    0.000000] BIOS-e820: [mem 0x0000000071d3e000-0x0000000071d4dfff] ACPI data
>> [    0.000000] BIOS-e820: [mem 0x0000000071d4e000-0x0000000077ffffff] usable
>> [    0.000000] BIOS-e820: [mem 0x0000000078000000-0x000000008fffffff] reserved
>> [    0.000000] BIOS-e820: [mem 0x00000000fed80000-0x00000000fed80fff] reserved
>> [    0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000087effffff] usable
>> [    0.000000] BIOS-e820: [mem 0x000000087f000000-0x000000087fffffff] reserved
>>
>> The kdump kernel log:
>>
>> [    0.000000] BIOS-provided physical RAM map:
>> [    0.000000] BIOS-e820: [mem 0x0000000000001000-0x000000000008bfff] usable
>> [    0.000000] BIOS-e820: [mem 0x0000000052000000-0x0000000061ffffff] usable
>> [    0.000000] BIOS-e820: [mem 0x00000000622ee000-0x0000000062300fff] ACPI data
>> [    0.000000] BIOS-e820: [mem 0x0000000062301000-0x0000000062301fff] ACPI NVS
>> [    0.000000] BIOS-e820: [mem 0x0000000062703000-0x0000000062703fff] ACPI data
>> [    0.000000] BIOS-e820: [mem 0x0000000062735000-0x0000000062737fff] ACPI data
>> [    0.000000] BIOS-e820: [mem 0x000000006273a000-0x000000006273afff] ACPI data
>> [    0.000000] BIOS-e820: [mem 0x0000000068b06000-0x0000000068b09fff] ACPI NVS
>> [    0.000000] BIOS-e820: [mem 0x0000000068b1b000-0x0000000068b1dfff] ACPI NVS
>> [    0.000000] BIOS-e820: [mem 0x0000000071d2e000-0x0000000071d3dfff] ACPI NVS
>> [    0.000000] BIOS-e820: [mem 0x0000000071d3e000-0x0000000071d4dfff] ACPI data
>> [    0.000000] BIOS-e820: [mem 0x00000007fe000000-0x000000087df70fff] usable
>>
> 
> Can you provide the efi memmap dmesg?  boot with efi=debug?
> 

Sorry, this machine is ProLiant DL385Gen10, which is not EFI boot.
I might not be able to provide the EFI information.

>>>> +                 */
>>>>  		if (early_memremap_is_setup_data(phys_addr, size) ||
>>>> -		    memremap_is_efi_data(phys_addr, size))
>>>> +		    memremap_is_efi_data(phys_addr, size) ||
>>>> +		    is_kdump_kernel())
>>>>  			encrypted_prot = false;
>>>>  	}
>>>>  
>>>> -- 
>>>> 2.17.1
>>>>
>>>
>>> Thanks
>>> Dave
>>>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/5 V6] x86/ioremap: strengthen the logic in early_memremap_pgprot_adjust() to adjust encryption mask
  2018-09-04  1:29         ` Dave Young
@ 2018-09-04  1:51           ` Dave Young
  2018-09-05  6:35             ` lijiang
  0 siblings, 1 reply; 15+ messages in thread
From: Dave Young @ 2018-09-04  1:51 UTC (permalink / raw)
  To: lijiang
  Cc: linux-kernel, mingo, tglx, hpa, ebiederm, joro, thomas.lendacky,
	kexec, iommu, bhe, Ard Biesheuvel, linux-efi

On 09/04/18 at 09:29am, Dave Young wrote:
> On 09/04/18 at 08:44am, Dave Young wrote:
> > On 09/03/18 at 10:06pm, lijiang wrote:
> > > 在 2018年09月03日 10:45, Dave Young 写道:
> > > > On 08/31/18 at 04:19pm, Lianbo Jiang wrote:
> > > >> For kdump kernel, when SME is enabled, the acpi table and dmi table will need
> > > >> to be remapped without the memory encryption mask. So we have to strengthen
> > > >> the logic in early_memremap_pgprot_adjust(), which makes us have an opportunity
> > > >> to adjust the memory encryption mask.
> > > >>
> > > >> Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
> > > >> ---
> > > >>  arch/x86/mm/ioremap.c | 9 ++++++++-
> > > >>  1 file changed, 8 insertions(+), 1 deletion(-)
> > > >>
> > > >> diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
> > > >> index e01e6c695add..f9d9a39955f3 100644
> > > >> --- a/arch/x86/mm/ioremap.c
> > > >> +++ b/arch/x86/mm/ioremap.c
> > > >> @@ -689,8 +689,15 @@ pgprot_t __init early_memremap_pgprot_adjust(resource_size_t phys_addr,
> > > >>  	encrypted_prot = true;
> > > >>  
> > > >>  	if (sme_active()) {
> > > >> +                /*
> > > >> +                 * In kdump kernel, the acpi table and dmi table will need
> > > >> +                 * to be remapped without the memory encryption mask. Here
> > > >> +                 * we have to strengthen the logic to adjust the memory
> > > >> +                 * encryption mask.
> > > > 
> > > > Assume the acpi/dmi tables are identical for both 1st kernel and kdump
> > > > kernel, I'm not sure what is the difference, why need special handling
> > > > for kdump. Can you add more explanations?
> > > > 
> > > 
> > > Ok, i will use a dmi example to explain this issue.
> > > 
> > > There are significant differences about E820 between the 1st kernel and kdump kernel. I pasted them at bottom.
> > > 
> > > Firstly, we need to know how they are called.
> > > __acpi_map_table()\                                                        / early_memremap_is_setup_data()
> > >                    |-> early_memremap()-> early_memremap_pgprot_adjust()-> | memremap_is_efi_data()
> > >  dmi_early_remap()/                                                        \ memremap_should_map_decrypted()-> e820__get_entry_type()
> > > 
> > > Secondly, we also need to understand the memremap_should_map_decrypted(), which is illustrated by the fake code.
> > > static bool memremap_should_map_decrypted(resource_size_t phys_addr,
> > >                                           unsigned long size)
> > > {
> > > 
> > >     /* code ... */
> > > 
> > >     switch (e820__get_entry_type(phys_addr, phys_addr + size - 1)) {
> > >         case E820_TYPE_RESERVED:
> > >         case E820_TYPE_ACPI:
> > >         case E820_TYPE_NVS:
> > >         case E820_TYPE_UNUSABLE:
> > >                 /* For SEV, these areas are encrypted */
> > >                 if (sev_active())
> > >                         break;
> > >                 /* Fallthrough */
> > > 
> > >         case E820_TYPE_PRAM:
> > >                 /* For SME, these areas are decrypted */
> > >                 return true;
> > >         default:
> > >                 /* these areas are encrypted by default*/
> > >                 break;
> > >     }
> > > 
> > >     return false;
> > > }
> > > 
> > > For the dmi case, the dmi base address is 0x6286b000 in my test machine.
> > > 
> > > In the 1st kernel, the e820__get_entry_type() can get a valid entry and type by the dmi address, and we can also find the dmi base address from e820.
> > > (see the 1st kernel log)
> > > 0x6286b000 ∈ [mem 0x000000006286b000-0x000000006286efff]
> > > So, these areas are decrypted according to the memremap_should_map_decrypted().
> > > 
> > > In kdump kernel, the dmi base address is still 0x6286b000, but we can not find the dmi base address from e820 any more. The e820__get_entry_type() can
> > > not get a valid entry and type by the dmi base address, it will go into the default branch. That is to say, these areas become encrypted. In fact, these
> > > areas are also decrypted, so we have to strengthen the logic of adjusting the memory encryption mask.
> > > 
> > > 
> > > The 1st kernel log:
> > > 
> > > [    0.000000] BIOS-provided physical RAM map:
> > > [    0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000008bfff] usable
> > > [    0.000000] BIOS-e820: [mem 0x000000000008c000-0x000000000009ffff] reserved
> > > [    0.000000] BIOS-e820: [mem 0x00000000000e0000-0x00000000000fffff] reserved
> > > [    0.000000] BIOS-e820: [mem 0x0000000000100000-0x0000000029920fff] usable
> > > [    0.000000] BIOS-e820: [mem 0x0000000029921000-0x0000000029921fff] reserved
> > > [    0.000000] BIOS-e820: [mem 0x0000000029922000-0x0000000062256fff] usable
> > > [    0.000000] BIOS-e820: [mem 0x0000000062257000-0x0000000062356fff] reserved
> > > [    0.000000] BIOS-e820: [mem 0x0000000062357000-0x000000006235cfff] ACPI data
> > > [    0.000000] BIOS-e820: [mem 0x000000006235d000-0x00000000623dbfff] usable
> > > [    0.000000] BIOS-e820: [mem 0x00000000623dc000-0x000000006261bfff] reserved
> > > [    0.000000] BIOS-e820: [mem 0x000000006261c000-0x000000006263dfff] usable
> > > [    0.000000] BIOS-e820: [mem 0x000000006263e000-0x000000006269dfff] reserved
> > > [    0.000000] BIOS-e820: [mem 0x000000006269e000-0x00000000627d6fff] usable
> > > [    0.000000] BIOS-e820: [mem 0x00000000627d7000-0x00000000627e3fff] ACPI data
> > > [    0.000000] BIOS-e820: [mem 0x00000000627e4000-0x00000000627e4fff] ACPI NVS
> > > [    0.000000] BIOS-e820: [mem 0x00000000627e5000-0x00000000627e8fff] ACPI data
> > > [    0.000000] BIOS-e820: [mem 0x00000000627e9000-0x00000000627eafff] usable
> > > [    0.000000] BIOS-e820: [mem 0x00000000627eb000-0x00000000627ebfff] ACPI data
> > > [    0.000000] BIOS-e820: [mem 0x00000000627ec000-0x000000006286afff] usable
> > > [    0.000000] BIOS-e820: [mem 0x000000006286b000-0x000000006286efff] reserved
> > > [    0.000000] BIOS-e820: [mem 0x000000006286f000-0x00000000682f8fff] usable
> > > [    0.000000] BIOS-e820: [mem 0x00000000682f9000-0x0000000068b05fff] reserved
> > > [    0.000000] BIOS-e820: [mem 0x0000000068b06000-0x0000000068b09fff] ACPI NVS
> > > [    0.000000] BIOS-e820: [mem 0x0000000068b0a000-0x0000000068b1afff] reserved
> > > [    0.000000] BIOS-e820: [mem 0x0000000068b1b000-0x0000000068b1dfff] ACPI NVS
> > > [    0.000000] BIOS-e820: [mem 0x0000000068b1e000-0x0000000071d1dfff] usable
> > > [    0.000000] BIOS-e820: [mem 0x0000000071d1e000-0x0000000071d2dfff] reserved
> > > [    0.000000] BIOS-e820: [mem 0x0000000071d2e000-0x0000000071d3dfff] ACPI NVS
> > > [    0.000000] BIOS-e820: [mem 0x0000000071d3e000-0x0000000071d4dfff] ACPI data
> > > [    0.000000] BIOS-e820: [mem 0x0000000071d4e000-0x0000000077ffffff] usable
> > > [    0.000000] BIOS-e820: [mem 0x0000000078000000-0x000000008fffffff] reserved
> > > [    0.000000] BIOS-e820: [mem 0x00000000fed80000-0x00000000fed80fff] reserved
> > > [    0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000087effffff] usable
> > > [    0.000000] BIOS-e820: [mem 0x000000087f000000-0x000000087fffffff] reserved
> > > 
> > > The kdump kernel log:
> > > 
> > > [    0.000000] BIOS-provided physical RAM map:
> > > [    0.000000] BIOS-e820: [mem 0x0000000000001000-0x000000000008bfff] usable
> > > [    0.000000] BIOS-e820: [mem 0x0000000052000000-0x0000000061ffffff] usable
> > > [    0.000000] BIOS-e820: [mem 0x00000000622ee000-0x0000000062300fff] ACPI data
> > > [    0.000000] BIOS-e820: [mem 0x0000000062301000-0x0000000062301fff] ACPI NVS
> > > [    0.000000] BIOS-e820: [mem 0x0000000062703000-0x0000000062703fff] ACPI data
> > > [    0.000000] BIOS-e820: [mem 0x0000000062735000-0x0000000062737fff] ACPI data
> > > [    0.000000] BIOS-e820: [mem 0x000000006273a000-0x000000006273afff] ACPI data
> > > [    0.000000] BIOS-e820: [mem 0x0000000068b06000-0x0000000068b09fff] ACPI NVS
> > > [    0.000000] BIOS-e820: [mem 0x0000000068b1b000-0x0000000068b1dfff] ACPI NVS
> > > [    0.000000] BIOS-e820: [mem 0x0000000071d2e000-0x0000000071d3dfff] ACPI NVS
> > > [    0.000000] BIOS-e820: [mem 0x0000000071d3e000-0x0000000071d4dfff] ACPI data
> > > [    0.000000] BIOS-e820: [mem 0x00000007fe000000-0x000000087df70fff] usable
> > > 
> > 
> > Can you provide the efi memmap dmesg?  boot with efi=debug?
> 
> The right way should be checking the efi mem types instead of only
> checking is_kdump_kernel.
> 
> Something like below, probably also check the region size with something
> like efi_mem_range_type(addr, size), return -EINVAL in case cross
> different type efi memory desc, added efi people in cc:
> 
> diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
> index c63a545ec199..4a24e138c0d0 100644
> --- a/arch/x86/mm/ioremap.c
> +++ b/arch/x86/mm/ioremap.c
> @@ -527,6 +527,13 @@ static bool memremap_should_map_decrypted(resource_size_t phys_addr,
>  		break;
>  	}
>  
> +	if (is_kdump_kernel() {
> +		switch (efi_mem_type(phys_addr)) {
> +		/* refer to arch/x86/boot/compressed/eboot.c -> setup_e820()*/
> +		case ...
> +		}
> +	}
> +
>  	return false;
>  }
>  

Hold on, I suppose kexec reboot also need this, but if it works without
a fix then there might be thing to be made clear.

kexec-tools will read /proc/iomem and recreate the e820 ranges for 2nd
kernel so in theory we should be fine without a fix.

Can you debug the kexec-tools load process about the e820 creating code
path?

> 
> > 
> > > >> +                 */
> > > >>  		if (early_memremap_is_setup_data(phys_addr, size) ||
> > > >> -		    memremap_is_efi_data(phys_addr, size))
> > > >> +		    memremap_is_efi_data(phys_addr, size) ||
> > > >> +		    is_kdump_kernel())
> > > >>  			encrypted_prot = false;
> > > >>  	}
> > > >>  
> > > >> -- 
> > > >> 2.17.1
> > > >>
> > > > 
> > > > Thanks
> > > > Dave
> > > > 

Thanks
Dave

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/5 V6] x86/ioremap: strengthen the logic in early_memremap_pgprot_adjust() to adjust encryption mask
  2018-09-04  1:51           ` Dave Young
@ 2018-09-05  6:35             ` lijiang
  2018-09-05  6:46               ` Dave Young
  0 siblings, 1 reply; 15+ messages in thread
From: lijiang @ 2018-09-05  6:35 UTC (permalink / raw)
  To: Dave Young
  Cc: linux-kernel, mingo, tglx, hpa, ebiederm, joro, thomas.lendacky,
	kexec, iommu, bhe, Ard Biesheuvel, linux-efi

在 2018年09月04日 09:51, Dave Young 写道:
> On 09/04/18 at 09:29am, Dave Young wrote:
>> On 09/04/18 at 08:44am, Dave Young wrote:
>>> On 09/03/18 at 10:06pm, lijiang wrote:
>>>> 在 2018年09月03日 10:45, Dave Young 写道:
>>>>> On 08/31/18 at 04:19pm, Lianbo Jiang wrote:
>>>>>> For kdump kernel, when SME is enabled, the acpi table and dmi table will need
>>>>>> to be remapped without the memory encryption mask. So we have to strengthen
>>>>>> the logic in early_memremap_pgprot_adjust(), which makes us have an opportunity
>>>>>> to adjust the memory encryption mask.
>>>>>>
>>>>>> Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
>>>>>> ---
>>>>>>  arch/x86/mm/ioremap.c | 9 ++++++++-
>>>>>>  1 file changed, 8 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
>>>>>> index e01e6c695add..f9d9a39955f3 100644
>>>>>> --- a/arch/x86/mm/ioremap.c
>>>>>> +++ b/arch/x86/mm/ioremap.c
>>>>>> @@ -689,8 +689,15 @@ pgprot_t __init early_memremap_pgprot_adjust(resource_size_t phys_addr,
>>>>>>  	encrypted_prot = true;
>>>>>>  
>>>>>>  	if (sme_active()) {
>>>>>> +                /*
>>>>>> +                 * In kdump kernel, the acpi table and dmi table will need
>>>>>> +                 * to be remapped without the memory encryption mask. Here
>>>>>> +                 * we have to strengthen the logic to adjust the memory
>>>>>> +                 * encryption mask.
>>>>>
>>>>> Assume the acpi/dmi tables are identical for both 1st kernel and kdump
>>>>> kernel, I'm not sure what is the difference, why need special handling
>>>>> for kdump. Can you add more explanations?
>>>>>
>>>>
>>>> Ok, i will use a dmi example to explain this issue.
>>>>
>>>> There are significant differences about E820 between the 1st kernel and kdump kernel. I pasted them at bottom.
>>>>
>>>> Firstly, we need to know how they are called.
>>>> __acpi_map_table()\                                                        / early_memremap_is_setup_data()
>>>>                    |-> early_memremap()-> early_memremap_pgprot_adjust()-> | memremap_is_efi_data()
>>>>  dmi_early_remap()/                                                        \ memremap_should_map_decrypted()-> e820__get_entry_type()
>>>>
>>>> Secondly, we also need to understand the memremap_should_map_decrypted(), which is illustrated by the fake code.
>>>> static bool memremap_should_map_decrypted(resource_size_t phys_addr,
>>>>                                           unsigned long size)
>>>> {
>>>>
>>>>     /* code ... */
>>>>
>>>>     switch (e820__get_entry_type(phys_addr, phys_addr + size - 1)) {
>>>>         case E820_TYPE_RESERVED:
>>>>         case E820_TYPE_ACPI:
>>>>         case E820_TYPE_NVS:
>>>>         case E820_TYPE_UNUSABLE:
>>>>                 /* For SEV, these areas are encrypted */
>>>>                 if (sev_active())
>>>>                         break;
>>>>                 /* Fallthrough */
>>>>
>>>>         case E820_TYPE_PRAM:
>>>>                 /* For SME, these areas are decrypted */
>>>>                 return true;
>>>>         default:
>>>>                 /* these areas are encrypted by default*/
>>>>                 break;
>>>>     }
>>>>
>>>>     return false;
>>>> }
>>>>
>>>> For the dmi case, the dmi base address is 0x6286b000 in my test machine.
>>>>
>>>> In the 1st kernel, the e820__get_entry_type() can get a valid entry and type by the dmi address, and we can also find the dmi base address from e820.
>>>> (see the 1st kernel log)
>>>> 0x6286b000 ∈ [mem 0x000000006286b000-0x000000006286efff]
>>>> So, these areas are decrypted according to the memremap_should_map_decrypted().
>>>>
>>>> In kdump kernel, the dmi base address is still 0x6286b000, but we can not find the dmi base address from e820 any more. The e820__get_entry_type() can
>>>> not get a valid entry and type by the dmi base address, it will go into the default branch. That is to say, these areas become encrypted. In fact, these
>>>> areas are also decrypted, so we have to strengthen the logic of adjusting the memory encryption mask.
>>>>
>>>>
>>>> The 1st kernel log:
>>>>
>>>> [    0.000000] BIOS-provided physical RAM map:
>>>> [    0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000008bfff] usable
>>>> [    0.000000] BIOS-e820: [mem 0x000000000008c000-0x000000000009ffff] reserved
>>>> [    0.000000] BIOS-e820: [mem 0x00000000000e0000-0x00000000000fffff] reserved
>>>> [    0.000000] BIOS-e820: [mem 0x0000000000100000-0x0000000029920fff] usable
>>>> [    0.000000] BIOS-e820: [mem 0x0000000029921000-0x0000000029921fff] reserved
>>>> [    0.000000] BIOS-e820: [mem 0x0000000029922000-0x0000000062256fff] usable
>>>> [    0.000000] BIOS-e820: [mem 0x0000000062257000-0x0000000062356fff] reserved
>>>> [    0.000000] BIOS-e820: [mem 0x0000000062357000-0x000000006235cfff] ACPI data
>>>> [    0.000000] BIOS-e820: [mem 0x000000006235d000-0x00000000623dbfff] usable
>>>> [    0.000000] BIOS-e820: [mem 0x00000000623dc000-0x000000006261bfff] reserved
>>>> [    0.000000] BIOS-e820: [mem 0x000000006261c000-0x000000006263dfff] usable
>>>> [    0.000000] BIOS-e820: [mem 0x000000006263e000-0x000000006269dfff] reserved
>>>> [    0.000000] BIOS-e820: [mem 0x000000006269e000-0x00000000627d6fff] usable
>>>> [    0.000000] BIOS-e820: [mem 0x00000000627d7000-0x00000000627e3fff] ACPI data
>>>> [    0.000000] BIOS-e820: [mem 0x00000000627e4000-0x00000000627e4fff] ACPI NVS
>>>> [    0.000000] BIOS-e820: [mem 0x00000000627e5000-0x00000000627e8fff] ACPI data
>>>> [    0.000000] BIOS-e820: [mem 0x00000000627e9000-0x00000000627eafff] usable
>>>> [    0.000000] BIOS-e820: [mem 0x00000000627eb000-0x00000000627ebfff] ACPI data
>>>> [    0.000000] BIOS-e820: [mem 0x00000000627ec000-0x000000006286afff] usable
>>>> [    0.000000] BIOS-e820: [mem 0x000000006286b000-0x000000006286efff] reserved
>>>> [    0.000000] BIOS-e820: [mem 0x000000006286f000-0x00000000682f8fff] usable
>>>> [    0.000000] BIOS-e820: [mem 0x00000000682f9000-0x0000000068b05fff] reserved
>>>> [    0.000000] BIOS-e820: [mem 0x0000000068b06000-0x0000000068b09fff] ACPI NVS
>>>> [    0.000000] BIOS-e820: [mem 0x0000000068b0a000-0x0000000068b1afff] reserved
>>>> [    0.000000] BIOS-e820: [mem 0x0000000068b1b000-0x0000000068b1dfff] ACPI NVS
>>>> [    0.000000] BIOS-e820: [mem 0x0000000068b1e000-0x0000000071d1dfff] usable
>>>> [    0.000000] BIOS-e820: [mem 0x0000000071d1e000-0x0000000071d2dfff] reserved
>>>> [    0.000000] BIOS-e820: [mem 0x0000000071d2e000-0x0000000071d3dfff] ACPI NVS
>>>> [    0.000000] BIOS-e820: [mem 0x0000000071d3e000-0x0000000071d4dfff] ACPI data
>>>> [    0.000000] BIOS-e820: [mem 0x0000000071d4e000-0x0000000077ffffff] usable
>>>> [    0.000000] BIOS-e820: [mem 0x0000000078000000-0x000000008fffffff] reserved
>>>> [    0.000000] BIOS-e820: [mem 0x00000000fed80000-0x00000000fed80fff] reserved
>>>> [    0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000087effffff] usable
>>>> [    0.000000] BIOS-e820: [mem 0x000000087f000000-0x000000087fffffff] reserved
>>>>
>>>> The kdump kernel log:
>>>>
>>>> [    0.000000] BIOS-provided physical RAM map:
>>>> [    0.000000] BIOS-e820: [mem 0x0000000000001000-0x000000000008bfff] usable
>>>> [    0.000000] BIOS-e820: [mem 0x0000000052000000-0x0000000061ffffff] usable
>>>> [    0.000000] BIOS-e820: [mem 0x00000000622ee000-0x0000000062300fff] ACPI data
>>>> [    0.000000] BIOS-e820: [mem 0x0000000062301000-0x0000000062301fff] ACPI NVS
>>>> [    0.000000] BIOS-e820: [mem 0x0000000062703000-0x0000000062703fff] ACPI data
>>>> [    0.000000] BIOS-e820: [mem 0x0000000062735000-0x0000000062737fff] ACPI data
>>>> [    0.000000] BIOS-e820: [mem 0x000000006273a000-0x000000006273afff] ACPI data
>>>> [    0.000000] BIOS-e820: [mem 0x0000000068b06000-0x0000000068b09fff] ACPI NVS
>>>> [    0.000000] BIOS-e820: [mem 0x0000000068b1b000-0x0000000068b1dfff] ACPI NVS
>>>> [    0.000000] BIOS-e820: [mem 0x0000000071d2e000-0x0000000071d3dfff] ACPI NVS
>>>> [    0.000000] BIOS-e820: [mem 0x0000000071d3e000-0x0000000071d4dfff] ACPI data
>>>> [    0.000000] BIOS-e820: [mem 0x00000007fe000000-0x000000087df70fff] usable
>>>>
>>>
>>> Can you provide the efi memmap dmesg?  boot with efi=debug?
>>
>> The right way should be checking the efi mem types instead of only
>> checking is_kdump_kernel.
>>
>> Something like below, probably also check the region size with something
>> like efi_mem_range_type(addr, size), return -EINVAL in case cross
>> different type efi memory desc, added efi people in cc:
>>
>> diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
>> index c63a545ec199..4a24e138c0d0 100644
>> --- a/arch/x86/mm/ioremap.c
>> +++ b/arch/x86/mm/ioremap.c
>> @@ -527,6 +527,13 @@ static bool memremap_should_map_decrypted(resource_size_t phys_addr,
>>  		break;
>>  	}
>>  
>> +	if (is_kdump_kernel() {
>> +		switch (efi_mem_type(phys_addr)) {
>> +		/* refer to arch/x86/boot/compressed/eboot.c -> setup_e820()*/
>> +		case ...
>> +		}
>> +	}
>> +
>>  	return false;
>>  }
>>  
> 
> Hold on, I suppose kexec reboot also need this, but if it works without
> a fix then there might be thing to be made clear.
> 
> kexec-tools will read /proc/iomem and recreate the e820 ranges for 2nd
> kernel so in theory we should be fine without a fix.
> 

As previously mentioned, there are also many differences between kexec and kdump. In general,
kexec needs to look at all of available physical memory, but kdump doesn't need.

For kexec, kexec-tools will read /sys/firmware/memmap and recreate the e820 ranges for the 2nd
kernel. If it fails, will use /proc/iomem.

For kdump, kexec-tools will read /proc/iomem and recreate the e820 ranges for kdump kernel.
BTW: we can not get the range of persistent memory from /proc/iomem. So e820 ranges don't contain
the persistent memory in kdump kernel, this is the real reason why i need to strengthen the logic
of adjusting memory encryption mask.

If kexec-tools also use /sys/firmware/memmap for kdump(like kexec), kdump kernel can also work
without a fix, but the kexec-tools will have to be modified. Are you sure that you want me to
fix kexec-tools instead of kernel?


Thanks.

> Can you debug the kexec-tools load process about the e820 creating code
> path?
> 
>>
>>>
>>>>>> +                 */
>>>>>>  		if (early_memremap_is_setup_data(phys_addr, size) ||
>>>>>> -		    memremap_is_efi_data(phys_addr, size))
>>>>>> +		    memremap_is_efi_data(phys_addr, size) ||
>>>>>> +		    is_kdump_kernel())
>>>>>>  			encrypted_prot = false;
>>>>>>  	}
>>>>>>  
>>>>>> -- 
>>>>>> 2.17.1
>>>>>>
>>>>>
>>>>> Thanks
>>>>> Dave
>>>>>
> 
> Thanks
> Dave
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/5 V6] x86/ioremap: strengthen the logic in early_memremap_pgprot_adjust() to adjust encryption mask
  2018-09-05  6:35             ` lijiang
@ 2018-09-05  6:46               ` Dave Young
  2018-09-05 14:04                 ` lijiang
  0 siblings, 1 reply; 15+ messages in thread
From: Dave Young @ 2018-09-05  6:46 UTC (permalink / raw)
  To: lijiang
  Cc: linux-kernel, mingo, tglx, hpa, ebiederm, joro, thomas.lendacky,
	kexec, iommu, bhe, Ard Biesheuvel, linux-efi

[snip]
> 
> As previously mentioned, there are also many differences between kexec and kdump. In general,
> kexec needs to look at all of available physical memory, but kdump doesn't need.
> 
> For kexec, kexec-tools will read /sys/firmware/memmap and recreate the e820 ranges for the 2nd
> kernel. If it fails, will use /proc/iomem.
> 
> For kdump, kexec-tools will read /proc/iomem and recreate the e820 ranges for kdump kernel.
> BTW: we can not get the range of persistent memory from /proc/iomem. So e820 ranges don't contain
> the persistent memory in kdump kernel, this is the real reason why i need to strengthen the logic
> of adjusting memory encryption mask.

"persistent memory" is different, I think you meant about some reserved
memory instead

> 
> If kexec-tools also use /sys/firmware/memmap for kdump(like kexec), kdump kernel can also work
> without a fix, but the kexec-tools will have to be modified. Are you sure that you want me to
> fix kexec-tools instead of kernel?

Yes, please fix kexec-tools to pass reserved ranges in e820, you will
not need this patch then.

Thanks
Dave

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/5 V6] x86/ioremap: strengthen the logic in early_memremap_pgprot_adjust() to adjust encryption mask
  2018-09-05  6:46               ` Dave Young
@ 2018-09-05 14:04                 ` lijiang
  0 siblings, 0 replies; 15+ messages in thread
From: lijiang @ 2018-09-05 14:04 UTC (permalink / raw)
  To: Dave Young
  Cc: linux-kernel, mingo, tglx, hpa, ebiederm, joro, thomas.lendacky,
	kexec, iommu, bhe, Ard Biesheuvel, linux-efi

在 2018年09月05日 14:46, Dave Young 写道:
> [snip]
>>
>> As previously mentioned, there are also many differences between kexec and kdump. In general,
>> kexec needs to look at all of available physical memory, but kdump doesn't need.
>>
>> For kexec, kexec-tools will read /sys/firmware/memmap and recreate the e820 ranges for the 2nd
>> kernel. If it fails, will use /proc/iomem.
>>
>> For kdump, kexec-tools will read /proc/iomem and recreate the e820 ranges for kdump kernel.
>> BTW: we can not get the range of persistent memory from /proc/iomem. So e820 ranges don't contain
>> the persistent memory in kdump kernel, this is the real reason why i need to strengthen the logic
>> of adjusting memory encryption mask.
> 
> "persistent memory" is different, I think you meant about some reserved
> memory instead
> 
>>
>> If kexec-tools also use /sys/firmware/memmap for kdump(like kexec), kdump kernel can also work
>> without a fix, but the kexec-tools will have to be modified. Are you sure that you want me to
>> fix kexec-tools instead of kernel?
> 
> Yes, please fix kexec-tools to pass reserved ranges in e820, you will
> not need this patch then.
> 

This might be a kexec-tools bug, i have posted a patch for kexec-tools(please check kexec@lists.infradead.org).
Thanks.

> Thanks
> Dave
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2018-09-05 14:05 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-08-31  8:19 [PATCH 0/5 V6] Support kdump for AMD secure memory encryption(SME) Lianbo Jiang
2018-08-31  8:19 ` [PATCH 1/5 V6] x86/ioremap: add a function ioremap_encrypted() to remap kdump old memroy Lianbo Jiang
2018-08-31  8:19 ` [PATCH 2/5 V6] x86/ioremap: strengthen the logic in early_memremap_pgprot_adjust() to adjust encryption mask Lianbo Jiang
2018-09-03  2:45   ` Dave Young
2018-09-03 14:06     ` lijiang
2018-09-04  0:44       ` Dave Young
2018-09-04  1:29         ` Dave Young
2018-09-04  1:51           ` Dave Young
2018-09-05  6:35             ` lijiang
2018-09-05  6:46               ` Dave Young
2018-09-05 14:04                 ` lijiang
2018-09-04  1:44         ` lijiang
2018-08-31  8:19 ` [PATCH 3/5 V6] kexec: allocate unencrypted control pages for kdump in case SME is enabled Lianbo Jiang
2018-08-31  8:19 ` [PATCH 4/5 V6] iommu/amd_iommu: remap the device table of IOMMU with the memory encryption mask for kdump Lianbo Jiang
2018-08-31  8:19 ` [PATCH 5/5 V6] kdump/vmcore: support encrypted old memory with SME enabled Lianbo Jiang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).