linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/3 v4] x86/kdump: Fix 'kmem -s' reported an invalid freepointer when SME was active
@ 2019-10-17  9:43 Lianbo Jiang
  2019-10-17  9:43 ` [PATCH 1/3 v4] x86/kdump: always reserve the low 1MiB when the crashkernel option is specified Lianbo Jiang
                   ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Lianbo Jiang @ 2019-10-17  9:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: tglx, mingo, bp, hpa, x86, bhe, dyoung, jgross, dhowells,
	Thomas.Lendacky, ebiederm, vgoyal, kexec

purgatory() currently does two main things:

[1] Verify the sha256 hashes of the various segments.
    Keep this code and do not touch its logic.

[2] Copy the contents of the first 640k to a backup region.
    Remove this and clean up all code related to the backup region.

This patch series removes the backup region, because the current
handling of copying the first 640k runs into problems when SME is
active (https://bugzilla.kernel.org/show_bug.cgi?id=204793).

The low 1MiB region will always be reserved when the crashkernel kernel
command line option is specified. This makes it unnecessary to do
anything with the low 1MiB region, because memory allocated later
won't fall into the low 1MiB area.
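
To illustrate why reserving the low 1MiB is enough, here is a minimal
user-space sketch. It is not kernel code: struct toy_memmap, toy_reserve()
and toy_alloc() are hypothetical stand-ins for memblock, assuming a simple
bottom-up first-fit search. Once [0, 1MiB) is on the reserved list, no later
allocation can land below 1MiB.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LOW_1M       (1ULL << 20)
#define MAX_RESERVED 8

/* Half-open physical range [start, end). */
struct toy_range {
	uint64_t start;
	uint64_t end;
};

/* Hypothetical stand-in for memblock: RAM is [0, ram_end). */
struct toy_memmap {
	uint64_t ram_end;
	struct toy_range reserved[MAX_RESERVED];
	size_t nr_reserved;
};

/* Mark [base, base + size) as unavailable; returns 0 on success. */
int toy_reserve(struct toy_memmap *m, uint64_t base, uint64_t size)
{
	if (m->nr_reserved == MAX_RESERVED)
		return -1;
	m->reserved[m->nr_reserved].start = base;
	m->reserved[m->nr_reserved].end = base + size;
	m->nr_reserved++;
	return 0;
}

/* Bottom-up first fit: lowest address not overlapping any reservation. */
int toy_alloc(struct toy_memmap *m, uint64_t size, uint64_t *out)
{
	uint64_t cand = 0;
	int moved;

	assert(size > 0);
	do {
		moved = 0;
		for (size_t i = 0; i < m->nr_reserved; i++) {
			const struct toy_range *r = &m->reserved[i];

			if (cand < r->end && cand + size > r->start) {
				cand = r->end;	/* skip past this reservation */
				moved = 1;
			}
		}
	} while (moved);

	/* Record the allocation so that later requests do not reuse it. */
	if (cand + size > m->ram_end || toy_reserve(m, cand, size))
		return -1;
	*out = cand;
	return 0;
}
```

Without the reservation, the first toy_alloc() call returns address 0; after
toy_reserve(&m, 0, LOW_1M) every returned address is at or above 1MiB.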

This series includes three patches:
[1] x86/kdump: always reserve the low 1MiB when the crashkernel option
    is specified
    The low 1MiB region will always be reserved when the crashkernel
    kernel command line option is specified, which ensures that the
    memory allocated later won't fall into the low 1MiB area.

[2] x86/kdump: remove the unused crash_copy_backup_region()
    The crash_copy_backup_region() has never been used, so clean
    up the redundant code.

[3] x86/kdump: clean up all the code related to the backup region
    Remove the backup region and clean up.

Changes since v1:
[1] Add extra checking condition: when the crashkernel option is
    specified, reserve the low 640k area.

Changes since v2:
[1] Reserve the low 1MiB region only when the crashkernel option is
    specified (suggested by Eric).

[2] Remove the unused crash_copy_backup_region()

[3] Remove the backup region and clean up

[4] Split them into three patches

Changes since v3:
[1] Improve the first patch's log
[2] Improve the third patch based on Eric's suggestions

Lianbo Jiang (3):
  x86/kdump: always reserve the low 1MiB when the crashkernel option is
    specified
  x86/kdump: remove the unused crash_copy_backup_region()
  x86/kdump: clean up all the code related to the backup region

 arch/x86/include/asm/crash.h       |  1 -
 arch/x86/include/asm/kexec.h       | 10 ----
 arch/x86/include/asm/purgatory.h   | 10 ----
 arch/x86/kernel/crash.c            | 87 ++++--------------------------
 arch/x86/kernel/machine_kexec_64.c | 47 ----------------
 arch/x86/purgatory/purgatory.c     | 19 -------
 arch/x86/realmode/init.c           | 11 ++++
 7 files changed, 22 insertions(+), 163 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 1/3 v4] x86/kdump: always reserve the low 1MiB when the crashkernel option is specified
  2019-10-17  9:43 [PATCH 0/3 v4] x86/kdump: Fix 'kmem -s' reported an invalid freepointer when SME was active Lianbo Jiang
@ 2019-10-17  9:43 ` Lianbo Jiang
  2019-10-22  8:30   ` Borislav Petkov
  2019-10-17  9:43 ` [PATCH 2/3 v4] x86/kdump: remove the unused crash_copy_backup_region() Lianbo Jiang
  2019-10-17  9:43 ` [PATCH 3/3 v4] x86/kdump: clean up all the code related to the backup region Lianbo Jiang
  2 siblings, 1 reply; 16+ messages in thread
From: Lianbo Jiang @ 2019-10-17  9:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: tglx, mingo, bp, hpa, x86, bhe, dyoung, jgross, dhowells,
	Thomas.Lendacky, ebiederm, vgoyal, kexec

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204793

Kdump kernel will reuse the first 640k region because of some reasons,
for example: the trampline and conventional PC system BIOS region may
require to allocate memory in this area. Obviously, kdump kernel will
also overwrite the first 640k region, therefore, kernel has to copy
the contents of the first 640k area to a backup area, which is done in
purgatory(), because vmcore may need the old memory. When vmcore is
dumped, kdump kernel will read the old memory from the backup area of
the first 640k area.

Basically, the main reason should be clear, kernel does not correctly
handle the first 640k region when SME is active, which causes that
kernel does not properly copy these old memory to the backup area in
purgatory(). Therefore, kdump kernel reads out the incorrect contents
from the backup area when dumping vmcore. Finally, the phenomenon is
as follow:

[root linux]$ crash vmlinux /var/crash/127.0.0.1-2019-09-19-08\:31\:27/vmcore
WARNING: kernel relocated [240MB]: patching 97110 gdb minimal_symbol values

      KERNEL: /var/crash/127.0.0.1-2019-09-19-08:31:27/vmlinux
    DUMPFILE: /var/crash/127.0.0.1-2019-09-19-08:31:27/vmcore  [PARTIAL DUMP]
        CPUS: 128
        DATE: Thu Sep 19 08:31:18 2019
      UPTIME: 00:01:21
LOAD AVERAGE: 0.16, 0.07, 0.02
       TASKS: 1343
    NODENAME: amd-ethanol
     RELEASE: 5.3.0-rc7+
     VERSION: #4 SMP Thu Sep 19 08:14:00 EDT 2019
     MACHINE: x86_64  (2195 Mhz)
      MEMORY: 127.9 GB
       PANIC: "Kernel panic - not syncing: sysrq triggered crash"
         PID: 9789
     COMMAND: "bash"
        TASK: "ffff89711894ae80  [THREAD_INFO: ffff89711894ae80]"
         CPU: 83
       STATE: TASK_RUNNING (PANIC)

crash> kmem -s|grep -i invalid
kmem: dma-kmalloc-512: slab:ffffd77680001c00 invalid freepointer:a6086ac099f0c5a4
kmem: dma-kmalloc-512: slab:ffffd77680001c00 invalid freepointer:a6086ac099f0c5a4
crash>

BTW: I also tried to fix the above problem in purgatory(), but there
are too many restricts in purgatory() context, for example: i can't
allocate new memory to create the identity mapping page table for SME
situation.

Currently, there are two places where the first 640k area is needed,
the first one is in the find_trampoline_placement(), another one is
in the reserve_real_mode(), and their content doesn't matter.

To avoid the above error, when the crashkernel kernel command line
option is specified, lets reserve the remaining low 1MiB memory(
after reserving real mode memroy) so that the allocated memory does
not fall into the low 1MiB area, which makes us not to copy the first
640k content to a backup region in purgatory(). This indicates that
it does not need to be included in crash dumps or used for anything
execept the processor trampolines that must live in the low 1MiB.

In addition, also need to clean all the code related to the backup
region later.

Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
---
 arch/x86/realmode/init.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/arch/x86/realmode/init.c b/arch/x86/realmode/init.c
index 7dce39c8c034..1f0492830f2c 100644
--- a/arch/x86/realmode/init.c
+++ b/arch/x86/realmode/init.c
@@ -34,6 +34,17 @@ void __init reserve_real_mode(void)
 
 	memblock_reserve(mem, size);
 	set_real_mode_mem(mem);
+
+#ifdef CONFIG_KEXEC_CORE
+	/*
+	 * When the crashkernel option is specified, only use the low
+	 * 1MiB for the real mode trampoline.
+	 */
+	if (strstr(boot_command_line, "crashkernel=")) {
+		memblock_reserve(0, 1<<20);
+		pr_info("Reserving the low 1MiB of memory for crashkernel\n");
+	}
+#endif /* CONFIG_KEXEC_CORE */
 }
 
 static void __init setup_real_mode(void)
-- 
2.17.1



* [PATCH 2/3 v4] x86/kdump: remove the unused crash_copy_backup_region()
  2019-10-17  9:43 [PATCH 0/3 v4] x86/kdump: Fix 'kmem -s' reported an invalid freepointer when SME was active Lianbo Jiang
  2019-10-17  9:43 ` [PATCH 1/3 v4] x86/kdump: always reserve the low 1MiB when the crashkernel option is specified Lianbo Jiang
@ 2019-10-17  9:43 ` Lianbo Jiang
  2019-10-17  9:43 ` [PATCH 3/3 v4] x86/kdump: clean up all the code related to the backup region Lianbo Jiang
  2 siblings, 0 replies; 16+ messages in thread
From: Lianbo Jiang @ 2019-10-17  9:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: tglx, mingo, bp, hpa, x86, bhe, dyoung, jgross, dhowells,
	Thomas.Lendacky, ebiederm, vgoyal, kexec

The crash_copy_backup_region() has never been used, so clean
up the redundant code.

Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
---
 arch/x86/include/asm/crash.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/x86/include/asm/crash.h b/arch/x86/include/asm/crash.h
index 0acf5ee45a21..089b2850f9d1 100644
--- a/arch/x86/include/asm/crash.h
+++ b/arch/x86/include/asm/crash.h
@@ -3,7 +3,6 @@
 #define _ASM_X86_CRASH_H
 
 int crash_load_segments(struct kimage *image);
-int crash_copy_backup_region(struct kimage *image);
 int crash_setup_memmap_entries(struct kimage *image,
 		struct boot_params *params);
 void crash_smp_send_stop(void);
-- 
2.17.1



* [PATCH 3/3 v4] x86/kdump: clean up all the code related to the backup region
  2019-10-17  9:43 [PATCH 0/3 v4] x86/kdump: Fix 'kmem -s' reported an invalid freepointer when SME was active Lianbo Jiang
  2019-10-17  9:43 ` [PATCH 1/3 v4] x86/kdump: always reserve the low 1MiB when the crashkernel option is specified Lianbo Jiang
  2019-10-17  9:43 ` [PATCH 2/3 v4] x86/kdump: remove the unused crash_copy_backup_region() Lianbo Jiang
@ 2019-10-17  9:43 ` Lianbo Jiang
  2019-10-22 12:15   ` Borislav Petkov
  2 siblings, 1 reply; 16+ messages in thread
From: Lianbo Jiang @ 2019-10-17  9:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: tglx, mingo, bp, hpa, x86, bhe, dyoung, jgross, dhowells,
	Thomas.Lendacky, ebiederm, vgoyal, kexec

When the crashkernel kernel command line option is specified, the
low 1MiB memory will always be reserved, which makes that the memory
allocated later won't fall into the low 1MiB area, thereby, it's not
necessary to create a backup region and also no need to copy the first
640k content to a backup region.

Currently, the code related to the backup region can be safely removed,
so lets clean up.

Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
---
 arch/x86/include/asm/kexec.h       | 10 ----
 arch/x86/include/asm/purgatory.h   | 10 ----
 arch/x86/kernel/crash.c            | 87 ++++--------------------------
 arch/x86/kernel/machine_kexec_64.c | 47 ----------------
 arch/x86/purgatory/purgatory.c     | 19 -------
 5 files changed, 11 insertions(+), 162 deletions(-)

diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h
index 5e7d6b46de97..6802c59e8252 100644
--- a/arch/x86/include/asm/kexec.h
+++ b/arch/x86/include/asm/kexec.h
@@ -66,10 +66,6 @@ struct kimage;
 # define KEXEC_ARCH KEXEC_ARCH_X86_64
 #endif
 
-/* Memory to backup during crash kdump */
-#define KEXEC_BACKUP_SRC_START	(0UL)
-#define KEXEC_BACKUP_SRC_END	(640 * 1024UL - 1)	/* 640K */
-
 /*
  * This function is responsible for capturing register states if coming
  * via panic otherwise just fix up the ss and sp if coming via kernel
@@ -154,12 +150,6 @@ struct kimage_arch {
 	pud_t *pud;
 	pmd_t *pmd;
 	pte_t *pte;
-	/* Details of backup region */
-	unsigned long backup_src_start;
-	unsigned long backup_src_sz;
-
-	/* Physical address of backup segment */
-	unsigned long backup_load_addr;
 
 	/* Core ELF header buffer */
 	void *elf_headers;
diff --git a/arch/x86/include/asm/purgatory.h b/arch/x86/include/asm/purgatory.h
index 92c34e517da1..5528e9325049 100644
--- a/arch/x86/include/asm/purgatory.h
+++ b/arch/x86/include/asm/purgatory.h
@@ -6,16 +6,6 @@
 #include <linux/purgatory.h>
 
 extern void purgatory(void);
-/*
- * These forward declarations serve two purposes:
- *
- * 1) Make sparse happy when checking arch/purgatory
- * 2) Document that these are required to be global so the symbol
- *    lookup in kexec works
- */
-extern unsigned long purgatory_backup_dest;
-extern unsigned long purgatory_backup_src;
-extern unsigned long purgatory_backup_sz;
 #endif	/* __ASSEMBLY__ */
 
 #endif /* _ASM_PURGATORY_H */
diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index eb651fbde92a..ef54b3ffb0f6 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -173,8 +173,6 @@ void native_machine_crash_shutdown(struct pt_regs *regs)
 
 #ifdef CONFIG_KEXEC_FILE
 
-static unsigned long crash_zero_bytes;
-
 static int get_nr_ram_ranges_callback(struct resource *res, void *arg)
 {
 	unsigned int *nr_ranges = arg;
@@ -217,6 +215,11 @@ static int elf_header_exclude_ranges(struct crash_mem *cmem)
 {
 	int ret = 0;
 
+	/* Exclude the low 1MiB because it is always reserved */
+	ret = crash_exclude_mem_range(cmem, 0, 1<<20);
+	if (ret)
+		return ret;
+
 	/* Exclude crashkernel region */
 	ret = crash_exclude_mem_range(cmem, crashk_res.start, crashk_res.end);
 	if (ret)
@@ -246,9 +249,7 @@ static int prepare_elf_headers(struct kimage *image, void **addr,
 					unsigned long *sz)
 {
 	struct crash_mem *cmem;
-	Elf64_Ehdr *ehdr;
-	Elf64_Phdr *phdr;
-	int ret, i;
+	int ret;
 
 	cmem = fill_up_crash_elf_data();
 	if (!cmem)
@@ -267,22 +268,7 @@ static int prepare_elf_headers(struct kimage *image, void **addr,
 	/* By default prepare 64bit headers */
 	ret =  crash_prepare_elf64_headers(cmem,
 				IS_ENABLED(CONFIG_X86_64), addr, sz);
-	if (ret)
-		goto out;
 
-	/*
-	 * If a range matches backup region, adjust offset to backup
-	 * segment.
-	 */
-	ehdr = (Elf64_Ehdr *)*addr;
-	phdr = (Elf64_Phdr *)(ehdr + 1);
-	for (i = 0; i < ehdr->e_phnum; phdr++, i++)
-		if (phdr->p_type == PT_LOAD &&
-				phdr->p_paddr == image->arch.backup_src_start &&
-				phdr->p_memsz == image->arch.backup_src_sz) {
-			phdr->p_offset = image->arch.backup_load_addr;
-			break;
-		}
 out:
 	vfree(cmem);
 	return ret;
@@ -321,19 +307,11 @@ static int memmap_exclude_ranges(struct kimage *image, struct crash_mem *cmem,
 				 unsigned long long mend)
 {
 	unsigned long start, end;
-	int ret = 0;
 
 	cmem->ranges[0].start = mstart;
 	cmem->ranges[0].end = mend;
 	cmem->nr_ranges = 1;
 
-	/* Exclude Backup region */
-	start = image->arch.backup_load_addr;
-	end = start + image->arch.backup_src_sz - 1;
-	ret = crash_exclude_mem_range(cmem, start, end);
-	if (ret)
-		return ret;
-
 	/* Exclude elf header region */
 	start = image->arch.elf_load_addr;
 	end = start + image->arch.elf_headers_sz - 1;
@@ -356,11 +334,11 @@ int crash_setup_memmap_entries(struct kimage *image, struct boot_params *params)
 	memset(&cmd, 0, sizeof(struct crash_memmap_data));
 	cmd.params = params;
 
-	/* Add first 640K segment */
-	ei.addr = image->arch.backup_src_start;
-	ei.size = image->arch.backup_src_sz;
-	ei.type = E820_TYPE_RAM;
-	add_e820_entry(params, &ei);
+	/* Add the low 1MiB */
+	cmd.type = E820_TYPE_RAM;
+	flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
+	walk_iomem_res_desc(IORES_DESC_NONE, flags, 0, (1<<20)-1, &cmd,
+			memmap_entry_callback);
 
 	/* Add ACPI tables */
 	cmd.type = E820_TYPE_ACPI;
@@ -409,55 +387,12 @@ int crash_setup_memmap_entries(struct kimage *image, struct boot_params *params)
 	return ret;
 }
 
-static int determine_backup_region(struct resource *res, void *arg)
-{
-	struct kimage *image = arg;
-
-	image->arch.backup_src_start = res->start;
-	image->arch.backup_src_sz = resource_size(res);
-
-	/* Expecting only one range for backup region */
-	return 1;
-}
-
 int crash_load_segments(struct kimage *image)
 {
 	int ret;
 	struct kexec_buf kbuf = { .image = image, .buf_min = 0,
 				  .buf_max = ULONG_MAX, .top_down = false };
 
-	/*
-	 * Determine and load a segment for backup area. First 640K RAM
-	 * region is backup source
-	 */
-
-	ret = walk_system_ram_res(KEXEC_BACKUP_SRC_START, KEXEC_BACKUP_SRC_END,
-				image, determine_backup_region);
-
-	/* Zero or postive return values are ok */
-	if (ret < 0)
-		return ret;
-
-	/* Add backup segment. */
-	if (image->arch.backup_src_sz) {
-		kbuf.buffer = &crash_zero_bytes;
-		kbuf.bufsz = sizeof(crash_zero_bytes);
-		kbuf.memsz = image->arch.backup_src_sz;
-		kbuf.buf_align = PAGE_SIZE;
-		/*
-		 * Ideally there is no source for backup segment. This is
-		 * copied in purgatory after crash. Just add a zero filled
-		 * segment for now to make sure checksum logic works fine.
-		 */
-		ret = kexec_add_buffer(&kbuf);
-		if (ret)
-			return ret;
-		image->arch.backup_load_addr = kbuf.mem;
-		pr_debug("Loaded backup region at 0x%lx backup_start=0x%lx memsz=0x%lx\n",
-			 image->arch.backup_load_addr,
-			 image->arch.backup_src_start, kbuf.memsz);
-	}
-
 	/* Prepare elf headers and add a segment */
 	ret = prepare_elf_headers(image, &kbuf.buffer, &kbuf.bufsz);
 	if (ret)
diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
index 5dcd438ad8f2..16e125a50b33 100644
--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -298,48 +298,6 @@ static void load_segments(void)
 		);
 }
 
-#ifdef CONFIG_KEXEC_FILE
-/* Update purgatory as needed after various image segments have been prepared */
-static int arch_update_purgatory(struct kimage *image)
-{
-	int ret = 0;
-
-	if (!image->file_mode)
-		return 0;
-
-	/* Setup copying of backup region */
-	if (image->type == KEXEC_TYPE_CRASH) {
-		ret = kexec_purgatory_get_set_symbol(image,
-				"purgatory_backup_dest",
-				&image->arch.backup_load_addr,
-				sizeof(image->arch.backup_load_addr), 0);
-		if (ret)
-			return ret;
-
-		ret = kexec_purgatory_get_set_symbol(image,
-				"purgatory_backup_src",
-				&image->arch.backup_src_start,
-				sizeof(image->arch.backup_src_start), 0);
-		if (ret)
-			return ret;
-
-		ret = kexec_purgatory_get_set_symbol(image,
-				"purgatory_backup_sz",
-				&image->arch.backup_src_sz,
-				sizeof(image->arch.backup_src_sz), 0);
-		if (ret)
-			return ret;
-	}
-
-	return ret;
-}
-#else /* !CONFIG_KEXEC_FILE */
-static inline int arch_update_purgatory(struct kimage *image)
-{
-	return 0;
-}
-#endif /* CONFIG_KEXEC_FILE */
-
 int machine_kexec_prepare(struct kimage *image)
 {
 	unsigned long start_pgtable;
@@ -353,11 +311,6 @@ int machine_kexec_prepare(struct kimage *image)
 	if (result)
 		return result;
 
-	/* update purgatory as needed */
-	result = arch_update_purgatory(image);
-	if (result)
-		return result;
-
 	return 0;
 }
 
diff --git a/arch/x86/purgatory/purgatory.c b/arch/x86/purgatory/purgatory.c
index 3b95410ff0f8..2961234d0795 100644
--- a/arch/x86/purgatory/purgatory.c
+++ b/arch/x86/purgatory/purgatory.c
@@ -14,28 +14,10 @@
 
 #include "../boot/string.h"
 
-unsigned long purgatory_backup_dest __section(.kexec-purgatory);
-unsigned long purgatory_backup_src __section(.kexec-purgatory);
-unsigned long purgatory_backup_sz __section(.kexec-purgatory);
-
 u8 purgatory_sha256_digest[SHA256_DIGEST_SIZE] __section(.kexec-purgatory);
 
 struct kexec_sha_region purgatory_sha_regions[KEXEC_SEGMENT_MAX] __section(.kexec-purgatory);
 
-/*
- * On x86, second kernel requries first 640K of memory to boot. Copy
- * first 640K to a backup region in reserved memory range so that second
- * kernel can use first 640K.
- */
-static int copy_backup_region(void)
-{
-	if (purgatory_backup_dest) {
-		memcpy((void *)purgatory_backup_dest,
-		       (void *)purgatory_backup_src, purgatory_backup_sz);
-	}
-	return 0;
-}
-
 static int verify_sha256_digest(void)
 {
 	struct kexec_sha_region *ptr, *end;
@@ -66,7 +48,6 @@ void purgatory(void)
 		for (;;)
 			;
 	}
-	copy_backup_region();
 }
 
 /*
-- 
2.17.1



* Re: [PATCH 1/3 v4] x86/kdump: always reserve the low 1MiB when the crashkernel option is specified
  2019-10-17  9:43 ` [PATCH 1/3 v4] x86/kdump: always reserve the low 1MiB when the crashkernel option is specified Lianbo Jiang
@ 2019-10-22  8:30   ` Borislav Petkov
  2019-10-23  5:23     ` lijiang
  2019-10-23  5:35     ` lijiang
  0 siblings, 2 replies; 16+ messages in thread
From: Borislav Petkov @ 2019-10-22  8:30 UTC (permalink / raw)
  To: Lianbo Jiang
  Cc: linux-kernel, tglx, mingo, hpa, x86, bhe, dyoung, jgross,
	dhowells, Thomas.Lendacky, ebiederm, vgoyal, kexec

On Thu, Oct 17, 2019 at 05:43:45PM +0800, Lianbo Jiang wrote:
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204793

Put that as a Link: below.

> Kdump kernel will reuse the first 640k region because of some reasons,

s/ of some reasons//

> for example: the trampline and conventional PC system BIOS region may

spellcheck: s/trampline/trampoline/

I see two more typos in here and if you had a spellchecker enabled in
your editor where you write the commit message, you'll see them too.
Please use one.

> require to allocate memory in this area. Obviously, kdump kernel will
> also overwrite the first 640k region,

Well, it is not obvious to me. Please be more specific: why would the
kdump kernel do that?

> therefore, kernel has to copy
> the contents of the first 640k area to a backup area, which is done in
> purgatory(), because vmcore may need the old memory. When vmcore is
> dumped, kdump kernel will read the old memory from the backup area of
> the first 640k area.
> 
> Basically, the main reason should be clear, kernel does not correctly
> handle the first 640k region when SME is active,

If you mention the actual reason here, that sentence would be clearer:

"When SME is enabled in the first kernel, the kdump kernel must access
the first kernel's memory with the encryption bit set."

Something like that. 
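
The point about the encryption bit can be pictured with a toy model. This is
only a sketch under loud assumptions: the XOR "cipher", TOY_KEY, toy_write()
and toy_read() are all hypothetical, and real SME hardware uses a much
stronger cipher in the memory controller. The model shows only that data
written through an encrypted mapping reads back as ciphertext when accessed
without the encryption bit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_KEY 0x5aU	/* hypothetical key; not how SME derives keys */

/* Write through a mapping; with the C-bit set, DRAM holds ciphertext. */
void toy_write(uint8_t *dram, size_t off, uint8_t val, int c_bit)
{
	assert(dram);
	dram[off] = c_bit ? (uint8_t)(val ^ TOY_KEY) : val;
}

/* Read through a mapping; only a C-bit mapping decrypts the contents. */
uint8_t toy_read(const uint8_t *dram, size_t off, int c_bit)
{
	assert(dram);
	return c_bit ? (uint8_t)(dram[off] ^ TOY_KEY) : dram[off];
}
```

In this model, a copy loop that reads the first kernel's memory with
c_bit == 0 moves ciphertext into the destination, which is the kind of
content the dump tooling then fails to interpret.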

> which causes that
> kernel does not properly copy these old memory to the backup area in
> purgatory(). Therefore, kdump kernel reads out the incorrect contents

s/incorrect/encrypted/

> from the backup area when dumping vmcore. Finally, the phenomenon is

phenomenon?

> as follow:
> 
> [root linux]$ crash vmlinux /var/crash/127.0.0.1-2019-09-19-08\:31\:27/vmcore
> WARNING: kernel relocated [240MB]: patching 97110 gdb minimal_symbol values
> 
>       KERNEL: /var/crash/127.0.0.1-2019-09-19-08:31:27/vmlinux
>     DUMPFILE: /var/crash/127.0.0.1-2019-09-19-08:31:27/vmcore  [PARTIAL DUMP]
>         CPUS: 128
>         DATE: Thu Sep 19 08:31:18 2019
>       UPTIME: 00:01:21
> LOAD AVERAGE: 0.16, 0.07, 0.02
>        TASKS: 1343
>     NODENAME: amd-ethanol
>      RELEASE: 5.3.0-rc7+
>      VERSION: #4 SMP Thu Sep 19 08:14:00 EDT 2019
>      MACHINE: x86_64  (2195 Mhz)
>       MEMORY: 127.9 GB
>        PANIC: "Kernel panic - not syncing: sysrq triggered crash"
>          PID: 9789
>      COMMAND: "bash"
>         TASK: "ffff89711894ae80  [THREAD_INFO: ffff89711894ae80]"
>          CPU: 83
>        STATE: TASK_RUNNING (PANIC)
> 
> crash> kmem -s|grep -i invalid
> kmem: dma-kmalloc-512: slab:ffffd77680001c00 invalid freepointer:a6086ac099f0c5a4
> kmem: dma-kmalloc-512: slab:ffffd77680001c00 invalid freepointer:a6086ac099f0c5a4
> crash>

I fail to see what that's trying to tell me? You have invalid pointers?

> BTW: I also tried to fix the above problem in purgatory(), but there
> are too many restricts in purgatory() context, for example: i can't
> allocate new memory to create the identity mapping page table for SME
> situation.

This paragraph belongs under the "---" line below.

> Currently, there are two places where the first 640k area is needed,
> the first one is in the find_trampoline_placement(), another one is
> in the reserve_real_mode(), and their content doesn't matter.
> 
> To avoid the above error, when the crashkernel kernel command line
> option is specified, lets reserve the remaining low 1MiB memory(
> after reserving real mode memroy) so that the allocated memory does
> not fall into the low 1MiB area, which makes us not to copy the first
> 640k content to a backup region in purgatory(). This indicates that
> it does not need to be included in crash dumps or used for anything
> execept the processor trampolines that must live in the low 1MiB.
> 
> In addition, also need to clean all the code related to the backup
> region later.

Ditto.

> Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
> ---
>  arch/x86/realmode/init.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/arch/x86/realmode/init.c b/arch/x86/realmode/init.c
> index 7dce39c8c034..1f0492830f2c 100644
> --- a/arch/x86/realmode/init.c
> +++ b/arch/x86/realmode/init.c
> @@ -34,6 +34,17 @@ void __init reserve_real_mode(void)
>  
>  	memblock_reserve(mem, size);
>  	set_real_mode_mem(mem);
> +
> +#ifdef CONFIG_KEXEC_CORE
> +	/*
> +	 * When the crashkernel option is specified, only use the low
> +	 * 1MiB for the real mode trampoline.
> +	 */
> +	if (strstr(boot_command_line, "crashkernel=")) {
> +		memblock_reserve(0, 1<<20);
> +		pr_info("Reserving the low 1MiB of memory for crashkernel\n");
> +	}
> +#endif /* CONFIG_KEXEC_CORE */

This ifdeffery needs to be a function in kernel/kexec_core.c which is
called by reserve_real_mode(), instead.
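
A minimal sketch of that shape, compilable in user space. Everything here
apart from reserve_real_mode(), boot_command_line and the "crashkernel="
check is an assumption: the helper name kexec_reserve_low_1MiB() is
hypothetical, memblock_reserve() is a recording stub, and printf() stands in
for pr_info().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define CONFIG_KEXEC_CORE 1	/* stand-in for the Kconfig symbol */

/* User-space stand-ins so the sketch compiles outside the kernel. */
const char *boot_command_line = "";
unsigned long long last_reserved_base, last_reserved_size;

/* Stub: only records the most recent reservation. */
int memblock_reserve(unsigned long long base, unsigned long long size)
{
	assert(size > 0);
	last_reserved_base = base;
	last_reserved_size = size;
	return 0;
}

#ifdef CONFIG_KEXEC_CORE
/* Hypothetical helper; in the kernel it would live in kernel/kexec_core.c. */
void kexec_reserve_low_1MiB(void)
{
	if (strstr(boot_command_line, "crashkernel=")) {
		memblock_reserve(0, 1 << 20);
		printf("Reserving the low 1MiB of memory for crashkernel\n");
	}
}
#else
/* Empty stub, normally supplied by a header when kexec is not built in. */
static inline void kexec_reserve_low_1MiB(void) { }
#endif

/* The caller stays free of #ifdef blocks. */
void reserve_real_mode(void)
{
	/* ... trampoline reservation as in the existing code ... */
	kexec_reserve_low_1MiB();
}
```

The design choice is that the conditional compilation is confined to the
helper and its stub, so reserve_real_mode() reads the same with or without
CONFIG_KEXEC_CORE.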

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


* Re: [PATCH 3/3 v4] x86/kdump: clean up all the code related to the backup region
  2019-10-17  9:43 ` [PATCH 3/3 v4] x86/kdump: clean up all the code related to the backup region Lianbo Jiang
@ 2019-10-22 12:15   ` Borislav Petkov
  0 siblings, 0 replies; 16+ messages in thread
From: Borislav Petkov @ 2019-10-22 12:15 UTC (permalink / raw)
  To: Lianbo Jiang
  Cc: linux-kernel, tglx, mingo, hpa, x86, bhe, dyoung, jgross,
	dhowells, Thomas.Lendacky, ebiederm, vgoyal, kexec

On Thu, Oct 17, 2019 at 05:43:47PM +0800, Lianbo Jiang wrote:
> When the crashkernel kernel command line option is specified, the
> low 1MiB memory will always be reserved, which makes that the memory
> allocated later won't fall into the low 1MiB area, thereby, it's not
> necessary to create a backup region and also no need to copy the first
> 640k content to a backup region.
> 
> Currently, the code related to the backup region can be safely removed,
> so lets clean up.
> 
> Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
> ---
>  arch/x86/include/asm/kexec.h       | 10 ----
>  arch/x86/include/asm/purgatory.h   | 10 ----
>  arch/x86/kernel/crash.c            | 87 ++++--------------------------
>  arch/x86/kernel/machine_kexec_64.c | 47 ----------------
>  arch/x86/purgatory/purgatory.c     | 19 -------
>  5 files changed, 11 insertions(+), 162 deletions(-)

That's a diffstat one cannot object to nowadays. :)

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


* Re: [PATCH 1/3 v4] x86/kdump: always reserve the low 1MiB when the crashkernel option is specified
  2019-10-22  8:30   ` Borislav Petkov
@ 2019-10-23  5:23     ` lijiang
  2019-10-23  7:43       ` Borislav Petkov
  2019-10-23  5:35     ` lijiang
  1 sibling, 1 reply; 16+ messages in thread
From: lijiang @ 2019-10-23  5:23 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: linux-kernel, tglx, mingo, hpa, x86, bhe, dyoung, jgross,
	dhowells, Thomas.Lendacky, ebiederm, vgoyal, kexec

On 2019-10-22 16:30, Borislav Petkov wrote:
> On Thu, Oct 17, 2019 at 05:43:45PM +0800, Lianbo Jiang wrote:
>> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204793
> 
Thanks for your comment.

> Put that as a Link: below.
> 
Looks better. OK.

>> Kdump kernel will reuse the first 640k region because of some reasons,
> 
> s/ of some reasons//
> 
>> for example: the trampline and conventional PC system BIOS region may
> 
> spellcheck: s/trampline/trampoline/
> 
> I see two more typos in here and if you had a spellchecker enabled in
> your editor where you write the commit message, you'll see them too.
> Please use one.
> 
Good point. I just enabled the spellchecker in vim and it works well
now. Thanks. :-)

>> require to allocate memory in this area. Obviously, kdump kernel will
>> also overwrite the first 640k region,
> 
> Well, it is not obvious to me. Please be more specific: why would the
> kdump kernel do that?
> 
The kdump kernel reuses the first 640k region because the real mode
trampoline has to live in this area. When the vmcore is dumped, the
old memory in this area may be accessed, so the kernel has to copy
the contents of the first 640k area to a backup region, which is
done in purgatory(). The kdump kernel can then read the old memory
from that backup region.
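
The backup scheme described above can be sketched in user space as follows.
This is a toy model, not the kernel implementation: struct toy_mem,
toy_copy_backup_region() and toy_read_oldmem() are hypothetical names, and
the redirect in toy_read_oldmem() only mimics the effect of pointing the
vmcore segment for the first 640k at the backup copy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BACKUP_SRC_SZ (640 * 1024UL)	/* first 640k, as in the old code */

/* Hypothetical model of physical memory plus the location of the backup. */
struct toy_mem {
	uint8_t *ram;
	size_t ram_size;
	size_t backup_off;	/* where the 640k copy lives */
};

/* What purgatory() used to do before jumping to the kdump kernel. */
void toy_copy_backup_region(struct toy_mem *m)
{
	assert(m->backup_off >= BACKUP_SRC_SZ);
	assert(m->backup_off + BACKUP_SRC_SZ <= m->ram_size);
	memcpy(m->ram + m->backup_off, m->ram, BACKUP_SRC_SZ);
}

/* Read one byte of the crashed kernel's memory, redirecting low reads. */
uint8_t toy_read_oldmem(const struct toy_mem *m, size_t paddr)
{
	assert(paddr < m->ram_size);
	if (paddr < BACKUP_SRC_SZ)
		return m->ram[m->backup_off + paddr];
	return m->ram[paddr];
}
```

After the copy, the second kernel may overwrite the low 640k freely, and
reads of the old contents are still served from the backup.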

>> therefore, kernel has to copy
>> the contents of the first 640k area to a backup area, which is done in
>> purgatory(), because vmcore may need the old memory. When vmcore is
>> dumped, kdump kernel will read the old memory from the backup area of
>> the first 640k area.
>>
>> Basically, the main reason should be clear, kernel does not correctly
>> handle the first 640k region when SME is active,
> 
> If you mention the actual reason here, that sentence would be clearer:
> 
> "When SME is enabled in the first kernel, the kdump kernel must access
> the first kernel's memory with the encryption bit set."
> 
> Something like that. 
> 
Looks good.

>> which causes that
>> kernel does not properly copy these old memory to the backup area in
>> purgatory(). Therefore, kdump kernel reads out the incorrect contents
> 
> s/incorrect/encrypted/
> 
Exactly.

>> from the backup area when dumping vmcore. Finally, the phenomenon is
> 
> phenomenon?
> 
Finally, it caused the following errors.

>> as follow:
>>
>> [root linux]$ crash vmlinux /var/crash/127.0.0.1-2019-09-19-08\:31\:27/vmcore
>> WARNING: kernel relocated [240MB]: patching 97110 gdb minimal_symbol values
>>
>>       KERNEL: /var/crash/127.0.0.1-2019-09-19-08:31:27/vmlinux
>>     DUMPFILE: /var/crash/127.0.0.1-2019-09-19-08:31:27/vmcore  [PARTIAL DUMP]
>>         CPUS: 128
>>         DATE: Thu Sep 19 08:31:18 2019
>>       UPTIME: 00:01:21
>> LOAD AVERAGE: 0.16, 0.07, 0.02
>>        TASKS: 1343
>>     NODENAME: amd-ethanol
>>      RELEASE: 5.3.0-rc7+
>>      VERSION: #4 SMP Thu Sep 19 08:14:00 EDT 2019
>>      MACHINE: x86_64  (2195 Mhz)
>>       MEMORY: 127.9 GB
>>        PANIC: "Kernel panic - not syncing: sysrq triggered crash"
>>          PID: 9789
>>      COMMAND: "bash"
>>         TASK: "ffff89711894ae80  [THREAD_INFO: ffff89711894ae80]"
>>          CPU: 83
>>        STATE: TASK_RUNNING (PANIC)
>>
>> crash> kmem -s|grep -i invalid
>> kmem: dma-kmalloc-512: slab:ffffd77680001c00 invalid freepointer:a6086ac099f0c5a4
>> kmem: dma-kmalloc-512: slab:ffffd77680001c00 invalid freepointer:a6086ac099f0c5a4
>> crash>
> 
> I fail to see what that's trying to tell me? You have invalid pointers?
> 
Yes. When parsing the vmcore with the crash tool, the above errors
occur: the crash tool gets invalid pointers.

>> BTW: I also tried to fix the above problem in purgatory(), but there
>> are too many restricts in purgatory() context, for example: i can't
>> allocate new memory to create the identity mapping page table for SME
>> situation.
> 
> This paragraph belongs under the "---" line below.
> 
OK. Thanks.

>> Currently, there are two places where the first 640k area is needed,
>> the first one is in the find_trampoline_placement(), another one is
>> in the reserve_real_mode(), and their content doesn't matter.
>>
>> To avoid the above error, when the crashkernel kernel command line
>> option is specified, lets reserve the remaining low 1MiB memory(
>> after reserving real mode memroy) so that the allocated memory does
>> not fall into the low 1MiB area, which makes us not to copy the first
>> 640k content to a backup region in purgatory(). This indicates that
>> it does not need to be included in crash dumps or used for anything
>> execept the processor trampolines that must live in the low 1MiB.
>>
>> In addition, also need to clean all the code related to the backup
>> region later.
> 
> Ditto.
> 
>> Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
>> ---
>>  arch/x86/realmode/init.c | 11 +++++++++++
>>  1 file changed, 11 insertions(+)
>>
>> diff --git a/arch/x86/realmode/init.c b/arch/x86/realmode/init.c
>> index 7dce39c8c034..1f0492830f2c 100644
>> --- a/arch/x86/realmode/init.c
>> +++ b/arch/x86/realmode/init.c
>> @@ -34,6 +34,17 @@ void __init reserve_real_mode(void)
>>  
>>  	memblock_reserve(mem, size);
>>  	set_real_mode_mem(mem);
>> +
>> +#ifdef CONFIG_KEXEC_CORE
>> +	/*
>> +	 * When the crashkernel option is specified, only use the low
>> +	 * 1MiB for the real mode trampoline.
>> +	 */
>> +	if (strstr(boot_command_line, "crashkernel=")) {
>> +		memblock_reserve(0, 1<<20);
>> +		pr_info("Reserving the low 1MiB of memory for crashkernel\n");
>> +	}
>> +#endif /* CONFIG_KEXEC_CORE */
> 
> This ifdeffery needs to be a function in kernel/kexec_core.c which is
> called by reserve_real_mode(), instead.
> 
Understood. I will improve it later.

Thanks.
Lianbo
> Thx.
> 



* Re: [PATCH 1/3 v4] x86/kdump: always reserve the low 1MiB when the crashkernel option is specified
  2019-10-22  8:30   ` Borislav Petkov
  2019-10-23  5:23     ` lijiang
@ 2019-10-23  5:35     ` lijiang
  2019-10-23  7:46       ` Borislav Petkov
                         ` (2 more replies)
  1 sibling, 3 replies; 16+ messages in thread
From: lijiang @ 2019-10-23  5:35 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: linux-kernel, tglx, mingo, hpa, x86, bhe, dyoung, jgross,
	dhowells, Thomas.Lendacky, ebiederm, vgoyal, kexec

On 2019-10-22 16:30, Borislav Petkov wrote:
> This ifdeffery needs to be a function in kernel/kexec_core.c which is
> called by reserve_real_mode(), instead.

Would you mind if I improve this patch as follows? Thanks.

From 5804abec62279585f374d78ace1250505c44c6b7 Mon Sep 17 00:00:00 2001
From: Lianbo Jiang <lijiang@redhat.com>
Date: Wed, 23 Oct 2019 11:27:04 +0800
Subject: [PATCH] x86/kdump: always reserve the low 1MiB when the crashkernel
 option is specified

The kdump kernel reuses the first 640k region because the real mode
trampoline has to work in this area. When the vmcore is dumped, the
old memory in this area may be accessed. Therefore, the kernel has to
copy the contents of the first 640k area to a backup region, so that
the kdump kernel can read the old memory from the backup area. This
copy is done in purgatory().

However, the current handling of copying the first 640k area runs into
problems when SME is enabled: the kernel does not properly copy this
old memory to the backup area in purgatory(), so the kdump kernel
reads out encrypted contents. This is because the kdump kernel must
access the first kernel's memory with the encryption bit set when SME
is enabled in the first kernel. Please refer to this link:

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204793

As a result, the crash tool gets invalid pointers when parsing the
vmcore and reports the following errors:

crash> kmem -s|grep -i invalid
kmem: dma-kmalloc-512: slab:ffffd77680001c00 invalid freepointer:a6086ac099f0c5a4
kmem: dma-kmalloc-512: slab:ffffd77680001c00 invalid freepointer:a6086ac099f0c5a4
crash>

To avoid the above errors, when the crashkernel option is specified,
reserve the remaining low 1MiB of memory (after reserving the real mode
memory) so that later allocations do not fall into the low 1MiB area.
This makes it unnecessary to copy the first 640k content to a backup
region in purgatory(). It also means that the low 1MiB does not need
to be included in crash dumps or used for anything except the processor
trampolines that must live in the low 1MiB.

Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
---
BTW: I also tried to fix the above problem in purgatory(), but there
are too many restrictions in the purgatory() context. For example, I
can't allocate new memory to create the identity mapping page table
for the SME case.

Currently, there are two places where the first 640k area is needed:
the first is find_trampoline_placement() and the other is
reserve_real_mode(). Their content doesn't matter.

In addition, all the code related to the backup region needs to be
cleaned up later.

 arch/x86/realmode/init.c |  2 ++
 include/linux/kexec.h    |  2 ++
 kernel/kexec_core.c      | 13 +++++++++++++
 3 files changed, 17 insertions(+)

diff --git a/arch/x86/realmode/init.c b/arch/x86/realmode/init.c
index 7dce39c8c034..064cc79a015d 100644
--- a/arch/x86/realmode/init.c
+++ b/arch/x86/realmode/init.c
@@ -3,6 +3,7 @@
 #include <linux/slab.h>
 #include <linux/memblock.h>
 #include <linux/mem_encrypt.h>
+#include <linux/kexec.h>
 
 #include <asm/set_memory.h>
 #include <asm/pgtable.h>
@@ -34,6 +35,7 @@ void __init reserve_real_mode(void)
 
 	memblock_reserve(mem, size);
 	set_real_mode_mem(mem);
+	kexec_reserve_low_1MiB();
 }
 
 static void __init setup_real_mode(void)
diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index 1776eb2e43a4..30acf1d738bc 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -306,6 +306,7 @@ extern void __crash_kexec(struct pt_regs *);
 extern void crash_kexec(struct pt_regs *);
 int kexec_should_crash(struct task_struct *);
 int kexec_crash_loaded(void);
+void kexec_reserve_low_1MiB(void);
 void crash_save_cpu(struct pt_regs *regs, int cpu);
 extern int kimage_crash_copy_vmcoreinfo(struct kimage *image);
 
@@ -397,6 +398,7 @@ static inline void __crash_kexec(struct pt_regs *regs) { }
 static inline void crash_kexec(struct pt_regs *regs) { }
 static inline int kexec_should_crash(struct task_struct *p) { return 0; }
 static inline int kexec_crash_loaded(void) { return 0; }
+static inline void kexec_reserve_low_1MiB(void) { }
 #define kexec_in_progress false
 #endif /* CONFIG_KEXEC_CORE */
 
diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
index 15d70a90b50d..5bd89f1fee42 100644
--- a/kernel/kexec_core.c
+++ b/kernel/kexec_core.c
@@ -37,6 +37,7 @@
 #include <linux/compiler.h>
 #include <linux/hugetlb.h>
 #include <linux/frame.h>
+#include <linux/memblock.h>
 
 #include <asm/page.h>
 #include <asm/sections.h>
@@ -70,6 +71,18 @@ struct resource crashk_low_res = {
 	.desc  = IORES_DESC_CRASH_KERNEL
 };
 
+/*
+ * When the crashkernel option is specified, only use the low
+ * 1MiB for the real mode trampoline.
+ */
+void kexec_reserve_low_1MiB(void)
+{
+	if (strstr(boot_command_line, "crashkernel=")) {
+		memblock_reserve(0, 1<<20);
+		pr_info("Reserving the low 1MiB of memory for crashkernel\n");
+	}
+}
+
 int kexec_should_crash(struct task_struct *p)
 {
 	/*
-- 
2.17.1



* Re: [PATCH 1/3 v4] x86/kdump: always reserve the low 1MiB when the crashkernel option is specified
  2019-10-23  5:23     ` lijiang
@ 2019-10-23  7:43       ` Borislav Petkov
  0 siblings, 0 replies; 16+ messages in thread
From: Borislav Petkov @ 2019-10-23  7:43 UTC (permalink / raw)
  To: lijiang
  Cc: linux-kernel, tglx, mingo, hpa, x86, bhe, dyoung, jgross,
	dhowells, Thomas.Lendacky, ebiederm, vgoyal, kexec

On Wed, Oct 23, 2019 at 01:23:33PM +0800, lijiang wrote:
> Kdump kernel will reuse the first 640k region because the real mode
> trampoline has to work in this area. When the vmcore is dumped, the
> old memory in this area may be accessed, therefore, kernel has to
> copy the contents of the first 640k area to a backup region so that
> kdump kernel can read the old memory from the backup area of the
> first 640k area, which is done in the purgatory().

That sounds better. :)

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


* Re: [PATCH 1/3 v4] x86/kdump: always reserve the low 1MiB when the crashkernel option is specified
  2019-10-23  5:35     ` lijiang
@ 2019-10-23  7:46       ` Borislav Petkov
  2019-10-23  9:20         ` lijiang
  2019-10-24  8:13       ` d.hatayama
  2019-10-24 22:12       ` [PATCH] x86/kdump: always reserve the low 1MiB when the crashkernel kbuild test robot
  2 siblings, 1 reply; 16+ messages in thread
From: Borislav Petkov @ 2019-10-23  7:46 UTC (permalink / raw)
  To: lijiang
  Cc: linux-kernel, tglx, mingo, hpa, x86, bhe, dyoung, jgross,
	dhowells, Thomas.Lendacky, ebiederm, vgoyal, kexec

On Wed, Oct 23, 2019 at 01:35:09PM +0800, lijiang wrote:
> Would you mind if i improve this patch as follow? Thanks.

Yap, looks good to me.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


* Re: [PATCH 1/3 v4] x86/kdump: always reserve the low 1MiB when the crashkernel option is specified
  2019-10-23  7:46       ` Borislav Petkov
@ 2019-10-23  9:20         ` lijiang
  0 siblings, 0 replies; 16+ messages in thread
From: lijiang @ 2019-10-23  9:20 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: linux-kernel, tglx, mingo, hpa, x86, bhe, dyoung, jgross,
	dhowells, Thomas.Lendacky, ebiederm, vgoyal, kexec

On 2019-10-23 15:46, Borislav Petkov wrote:
> On Wed, Oct 23, 2019 at 01:35:09PM +0800, lijiang wrote:
>> Would you mind if i improve this patch as follow? Thanks.
> 
> Yap, looks good to me.
> 
Thanks for your comment.

OK. I will post this one and the third patch in this series later.

Thanks.
Lianbo


> Thx.
> 



* RE: [PATCH 1/3 v4] x86/kdump: always reserve the low 1MiB when the crashkernel option is specified
  2019-10-23  5:35     ` lijiang
  2019-10-23  7:46       ` Borislav Petkov
@ 2019-10-24  8:13       ` d.hatayama
  2019-10-24  9:10         ` Borislav Petkov
  2019-10-24 11:24         ` lijiang
  2019-10-24 22:12       ` [PATCH] x86/kdump: always reserve the low 1MiB when the crashkernel kbuild test robot
  2 siblings, 2 replies; 16+ messages in thread
From: d.hatayama @ 2019-10-24  8:13 UTC (permalink / raw)
  To: 'lijiang'
  Cc: linux-kernel, tglx, mingo, hpa, x86, bhe, dyoung, jgross,
	dhowells, Thomas.Lendacky, ebiederm, vgoyal, kexec,
	Borislav Petkov

I don't find the corresponding patch in the v5 patchset, so I comment here.

> -----Original Message-----
> From: linux-kernel-owner@vger.kernel.org
> [mailto:linux-kernel-owner@vger.kernel.org] On Behalf Of lijiang
> Sent: Wednesday, October 23, 2019 2:35 PM
> To: Borislav Petkov <bp@alien8.de>
> Cc: linux-kernel@vger.kernel.org; tglx@linutronix.de; mingo@redhat.com;
> hpa@zytor.com; x86@kernel.org; bhe@redhat.com; dyoung@redhat.com;
> jgross@suse.com; dhowells@redhat.com; Thomas.Lendacky@amd.com;
> ebiederm@xmission.com; vgoyal@redhat.com; kexec@lists.infradead.org
> Subject: Re: [PATCH 1/3 v4] x86/kdump: always reserve the low 1MiB when the
> crashkernel option is specified
> 
> On 2019-10-22 16:30, Borislav Petkov wrote:
> > This ifdeffery needs to be a function in kernel/kexec_core.c which is
> > called by reserve_real_mode(), instead.
> 
> Would you mind if i improve this patch as follow? Thanks.
> 
> From 5804abec62279585f374d78ace1250505c44c6b7 Mon Sep 17 00:00:00 2001
> From: Lianbo Jiang <lijiang@redhat.com>
> Date: Wed, 23 Oct 2019 11:27:04 +0800
> Subject: [PATCH] x86/kdump: always reserve the low 1MiB when the crashkernel
>  option is specified
> 
> Kdump kernel will reuse the first 640k region because the real mode
> trampoline has to work in this area. When the vmcore is dumped, the
> old memory in this area may be accessed, therefore, kernel has to
> copy the contents of the first 640k area to a backup region so that
> kdump kernel can read the old memory from the backup area of the
> first 640k area, which is done in the purgatory().
> 
> But, the current handling of copying the first 640k area runs into
> problems when SME is enabled, kernel does not properly copy these
> old memory to the backup area in the purgatory(), thereby, kdump
> kernel reads out the encrypted contents, because the kdump kernel
> must access the first kernel's memory with the encryption bit set
> when SME is enabled in the first kernel. Please refer to this link:
> 
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204793
> 
> Finally, it causes the following errors, and the crash tool gets
> invalid pointers when parsing the vmcore.
> 
> crash> kmem -s|grep -i invalid
> kmem: dma-kmalloc-512: slab:ffffd77680001c00 invalid freepointer:a6086ac099f0c5a4
> kmem: dma-kmalloc-512: slab:ffffd77680001c00 invalid freepointer:a6086ac099f0c5a4
> crash>
> 
> To avoid the above errors, when the crashkernel option is specified,
> lets reserve the remaining low 1MiB memory(after reserving real mode
> memory) so that the allocated memory does not fall into the low 1MiB
> area, which makes us not to copy the first 640k content to a backup
> region in purgatory(). This indicates that it does not need to be
> included in crash dumps or used for anything except the processor
> trampolines that must live in the low 1MiB.
> 
> Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
> ---
> BTW:I also tried to fix the above problem in purgatory(), but there
> are too many restricts in purgatory() context, for example: i can't
> allocate new memory to create the identity mapping page table for
> SME situation.
> 
> Currently, there are two places where the first 640k area is needed,
> the first one is in the find_trampoline_placement(), another one is
> in the reserve_real_mode(), and their content doesn't matter.
> 
> In addition, also need to clean all the code related to the backup
> region later.
> 
>  arch/x86/realmode/init.c |  2 ++
>  include/linux/kexec.h    |  2 ++
>  kernel/kexec_core.c      | 13 +++++++++++++
>  3 files changed, 17 insertions(+)
> 
> diff --git a/arch/x86/realmode/init.c b/arch/x86/realmode/init.c
> index 7dce39c8c034..064cc79a015d 100644
> --- a/arch/x86/realmode/init.c
> +++ b/arch/x86/realmode/init.c
> @@ -3,6 +3,7 @@
>  #include <linux/slab.h>
>  #include <linux/memblock.h>
>  #include <linux/mem_encrypt.h>
> +#include <linux/kexec.h>
> 
>  #include <asm/set_memory.h>
>  #include <asm/pgtable.h>
> @@ -34,6 +35,7 @@ void __init reserve_real_mode(void)
> 
>  	memblock_reserve(mem, size);
>  	set_real_mode_mem(mem);
> +	kexec_reserve_low_1MiB();
>  }
> 
>  static void __init setup_real_mode(void)
> diff --git a/include/linux/kexec.h b/include/linux/kexec.h
> index 1776eb2e43a4..30acf1d738bc 100644
> --- a/include/linux/kexec.h
> +++ b/include/linux/kexec.h
> @@ -306,6 +306,7 @@ extern void __crash_kexec(struct pt_regs *);
>  extern void crash_kexec(struct pt_regs *);
>  int kexec_should_crash(struct task_struct *);
>  int kexec_crash_loaded(void);
> +void kexec_reserve_low_1MiB(void);
>  void crash_save_cpu(struct pt_regs *regs, int cpu);
>  extern int kimage_crash_copy_vmcoreinfo(struct kimage *image);
> 
> @@ -397,6 +398,7 @@ static inline void __crash_kexec(struct pt_regs *regs) { }
>  static inline void crash_kexec(struct pt_regs *regs) { }
>  static inline int kexec_should_crash(struct task_struct *p) { return 0; }
>  static inline int kexec_crash_loaded(void) { return 0; }
> +static inline void kexec_reserve_low_1MiB(void) { }
>  #define kexec_in_progress false
>  #endif /* CONFIG_KEXEC_CORE */
> 
> diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
> index 15d70a90b50d..5bd89f1fee42 100644
> --- a/kernel/kexec_core.c
> +++ b/kernel/kexec_core.c
> @@ -37,6 +37,7 @@
>  #include <linux/compiler.h>
>  #include <linux/hugetlb.h>
>  #include <linux/frame.h>
> +#include <linux/memblock.h>
> 
>  #include <asm/page.h>
>  #include <asm/sections.h>
> @@ -70,6 +71,18 @@ struct resource crashk_low_res = {
>  	.desc  = IORES_DESC_CRASH_KERNEL
>  };
> 
> +/*
> + * When the crashkernel option is specified, only use the low
> + * 1MiB for the real mode trampoline.
> + */
> +void kexec_reserve_low_1MiB(void)
> +{
> +	if (strstr(boot_command_line, "crashkernel=")) {

strstr() also matches, for example, ANYEXTRACHARACTERScrashkernel=ANYEXTRACHARACTERS.

Would it be enough to use cmdline_find_option_bool()?

> +		memblock_reserve(0, 1<<20);
> +		pr_info("Reserving the low 1MiB of memory for crashkernel\n");
> +	}
> +}
> +
>  int kexec_should_crash(struct task_struct *p)
>  {
>  	/*
> --
> 2.17.1



* Re: [PATCH 1/3 v4] x86/kdump: always reserve the low 1MiB when the crashkernel option is specified
  2019-10-24  8:13       ` d.hatayama
@ 2019-10-24  9:10         ` Borislav Petkov
  2019-10-24 11:24         ` lijiang
  1 sibling, 0 replies; 16+ messages in thread
From: Borislav Petkov @ 2019-10-24  9:10 UTC (permalink / raw)
  To: d.hatayama
  Cc: 'lijiang',
	linux-kernel, tglx, mingo, hpa, x86, bhe, dyoung, jgross,
	dhowells, Thomas.Lendacky, ebiederm, vgoyal, kexec

On Thu, Oct 24, 2019 at 08:13:25AM +0000, d.hatayama@fujitsu.com wrote:
> I don't find the corresponding patch in the v5 patchset, so I comment here.

You don't?

https://lore.kernel.org/lkml/20191023141912.29110-2-lijiang@redhat.com/

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


* Re: [PATCH 1/3 v4] x86/kdump: always reserve the low 1MiB when the crashkernel option is specified
  2019-10-24  8:13       ` d.hatayama
  2019-10-24  9:10         ` Borislav Petkov
@ 2019-10-24 11:24         ` lijiang
  1 sibling, 0 replies; 16+ messages in thread
From: lijiang @ 2019-10-24 11:24 UTC (permalink / raw)
  To: d.hatayama
  Cc: linux-kernel, tglx, mingo, hpa, x86, bhe, dyoung, jgross,
	dhowells, Thomas.Lendacky, ebiederm, vgoyal, kexec,
	Borislav Petkov

On 2019-10-24 16:13, d.hatayama@fujitsu.com wrote:
> I don't find the corresponding patch in the v5 patchset, so I comment here.
> 
Thanks for your comment.

>> -----Original Message-----
>> From: linux-kernel-owner@vger.kernel.org
>> [mailto:linux-kernel-owner@vger.kernel.org] On Behalf Of lijiang
>> Sent: Wednesday, October 23, 2019 2:35 PM
>> To: Borislav Petkov <bp@alien8.de>
>> Cc: linux-kernel@vger.kernel.org; tglx@linutronix.de; mingo@redhat.com;
>> hpa@zytor.com; x86@kernel.org; bhe@redhat.com; dyoung@redhat.com;
>> jgross@suse.com; dhowells@redhat.com; Thomas.Lendacky@amd.com;
>> ebiederm@xmission.com; vgoyal@redhat.com; kexec@lists.infradead.org
>> Subject: Re: [PATCH 1/3 v4] x86/kdump: always reserve the low 1MiB when the
>> crashkernel option is specified
>>
>> On 2019-10-22 16:30, Borislav Petkov wrote:
>>> This ifdeffery needs to be a function in kernel/kexec_core.c which is
>>> called by reserve_real_mode(), instead.
>>
>> Would you mind if i improve this patch as follow? Thanks.
>>
>> From 5804abec62279585f374d78ace1250505c44c6b7 Mon Sep 17 00:00:00 2001
>> From: Lianbo Jiang <lijiang@redhat.com>
>> Date: Wed, 23 Oct 2019 11:27:04 +0800
>> Subject: [PATCH] x86/kdump: always reserve the low 1MiB when the crashkernel
>>  option is specified
>>
>> Kdump kernel will reuse the first 640k region because the real mode
>> trampoline has to work in this area. When the vmcore is dumped, the
>> old memory in this area may be accessed, therefore, kernel has to
>> copy the contents of the first 640k area to a backup region so that
>> kdump kernel can read the old memory from the backup area of the
>> first 640k area, which is done in the purgatory().
>>
>> But, the current handling of copying the first 640k area runs into
>> problems when SME is enabled, kernel does not properly copy these
>> old memory to the backup area in the purgatory(), thereby, kdump
>> kernel reads out the encrypted contents, because the kdump kernel
>> must access the first kernel's memory with the encryption bit set
>> when SME is enabled in the first kernel. Please refer to this link:
>>
>> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204793
>>
>> Finally, it causes the following errors, and the crash tool gets
>> invalid pointers when parsing the vmcore.
>>
>> crash> kmem -s|grep -i invalid
>> kmem: dma-kmalloc-512: slab:ffffd77680001c00 invalid freepointer:a6086ac099f0c5a4
>> kmem: dma-kmalloc-512: slab:ffffd77680001c00 invalid freepointer:a6086ac099f0c5a4
>> crash>
>>
>> To avoid the above errors, when the crashkernel option is specified,
>> lets reserve the remaining low 1MiB memory(after reserving real mode
>> memory) so that the allocated memory does not fall into the low 1MiB
>> area, which makes us not to copy the first 640k content to a backup
>> region in purgatory(). This indicates that it does not need to be
>> included in crash dumps or used for anything except the processor
>> trampolines that must live in the low 1MiB.
>>
>> Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
>> ---
>> BTW:I also tried to fix the above problem in purgatory(), but there
>> are too many restricts in purgatory() context, for example: i can't
>> allocate new memory to create the identity mapping page table for
>> SME situation.
>>
>> Currently, there are two places where the first 640k area is needed,
>> the first one is in the find_trampoline_placement(), another one is
>> in the reserve_real_mode(), and their content doesn't matter.
>>
>> In addition, also need to clean all the code related to the backup
>> region later.
>>
>>  arch/x86/realmode/init.c |  2 ++
>>  include/linux/kexec.h    |  2 ++
>>  kernel/kexec_core.c      | 13 +++++++++++++
>>  3 files changed, 17 insertions(+)
>>
>> diff --git a/arch/x86/realmode/init.c b/arch/x86/realmode/init.c
>> index 7dce39c8c034..064cc79a015d 100644
>> --- a/arch/x86/realmode/init.c
>> +++ b/arch/x86/realmode/init.c
>> @@ -3,6 +3,7 @@
>>  #include <linux/slab.h>
>>  #include <linux/memblock.h>
>>  #include <linux/mem_encrypt.h>
>> +#include <linux/kexec.h>
>>
>>  #include <asm/set_memory.h>
>>  #include <asm/pgtable.h>
>> @@ -34,6 +35,7 @@ void __init reserve_real_mode(void)
>>
>>  	memblock_reserve(mem, size);
>>  	set_real_mode_mem(mem);
>> +	kexec_reserve_low_1MiB();
>>  }
>>
>>  static void __init setup_real_mode(void)
>> diff --git a/include/linux/kexec.h b/include/linux/kexec.h
>> index 1776eb2e43a4..30acf1d738bc 100644
>> --- a/include/linux/kexec.h
>> +++ b/include/linux/kexec.h
>> @@ -306,6 +306,7 @@ extern void __crash_kexec(struct pt_regs *);
>>  extern void crash_kexec(struct pt_regs *);
>>  int kexec_should_crash(struct task_struct *);
>>  int kexec_crash_loaded(void);
>> +void kexec_reserve_low_1MiB(void);
>>  void crash_save_cpu(struct pt_regs *regs, int cpu);
>>  extern int kimage_crash_copy_vmcoreinfo(struct kimage *image);
>>
>> @@ -397,6 +398,7 @@ static inline void __crash_kexec(struct pt_regs *regs) { }
>>  static inline void crash_kexec(struct pt_regs *regs) { }
>>  static inline int kexec_should_crash(struct task_struct *p) { return 0; }
>>  static inline int kexec_crash_loaded(void) { return 0; }
>> +static inline void kexec_reserve_low_1MiB(void) { }
>>  #define kexec_in_progress false
>>  #endif /* CONFIG_KEXEC_CORE */
>>
>> diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
>> index 15d70a90b50d..5bd89f1fee42 100644
>> --- a/kernel/kexec_core.c
>> +++ b/kernel/kexec_core.c
>> @@ -37,6 +37,7 @@
>>  #include <linux/compiler.h>
>>  #include <linux/hugetlb.h>
>>  #include <linux/frame.h>
>> +#include <linux/memblock.h>
>>
>>  #include <asm/page.h>
>>  #include <asm/sections.h>
>> @@ -70,6 +71,18 @@ struct resource crashk_low_res = {
>>  	.desc  = IORES_DESC_CRASH_KERNEL
>>  };
>>
>> +/*
>> + * When the crashkernel option is specified, only use the low
>> + * 1MiB for the real mode trampoline.
>> + */
>> +void kexec_reserve_low_1MiB(void)
>> +{
>> +	if (strstr(boot_command_line, "crashkernel=")) {
> 
> strstr() matches for example, ANYEXTRACHARACTERScrashkernel=ANYEXTRACHARACTERS.
> 
> Is it enough to use cmdline_find_option_bool()?
> 
cmdline_find_option_bool() looks for a boolean option, but crashkernel is
not a boolean option, so using it here would look odd. Would it be better
to use cmdline_find_option() instead?

+#include <asm/cmdline.h>

 void __init kexec_reserve_low_1MiB(void)
 {
-       if (strstr(boot_command_line, "crashkernel=")) {
+       char buffer[4];
+
+       if (cmdline_find_option(boot_command_line, "crashkernel",
+                               buffer, sizeof(buffer)) >= 0) {
                memblock_reserve(0, 1<<20);
                pr_info("Reserving the low 1MiB of memory for crashkernel\n");
        }

Here there is no need to parse the arguments of crashkernel (which sometimes
have a complicated syntax), so a small buffer should be enough. What's your opinion?
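
To illustrate the difference between substring matching and whole-word
option matching, here is a minimal userspace sketch. It is only a model
and not the kernel implementation: find_option_len() and substring_match()
are hypothetical names, and quoting inside the command line is ignored.
It assumes the convention that, as far as I understand it,
cmdline_find_option() follows: the option name is given without the
trailing '=', and the result is the argument length, or -1 when the
option is absent.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/*
 * Userspace model (hypothetical helper, not kernel code): return the
 * length of the argument of "opt=" if opt appears as a whole word in
 * cmdline, or -1 if it does not appear.
 */
static int find_option_len(const char *cmdline, const char *opt)
{
	const char *p = cmdline;
	size_t optlen;

	assert(cmdline != NULL && opt != NULL);
	optlen = strlen(opt);

	while (*p) {
		/* Skip whitespace between words. */
		while (*p && isspace((unsigned char)*p))
			p++;

		/* A match must start at a word boundary and be followed by '='. */
		if (strncmp(p, opt, optlen) == 0 && p[optlen] == '=') {
			const char *arg = p + optlen + 1;
			int len = 0;

			while (arg[len] && !isspace((unsigned char)arg[len]))
				len++;
			return len;
		}

		/* No match here: skip the rest of this word. */
		while (*p && !isspace((unsigned char)*p))
			p++;
	}

	return -1;
}

/* The substring check used by the strstr() variant, for comparison. */
static int substring_match(const char *cmdline)
{
	return strstr(cmdline, "crashkernel=") != NULL;
}
```

With this model, "foocrashkernel=256M" is accepted by substring_match()
but rejected by find_option_len(), which is the case raised in the
comment about strstr(). Note that a check of the form "if (result)"
would treat the not-found value -1 as true, so the result should be
compared against 0 explicitly.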

Thanks
Lianbo
 
>> +		memblock_reserve(0, 1<<20);
>> +		pr_info("Reserving the low 1MiB of memory for crashkernel\n");
>> +	}
>> +}
>> +
>>  int kexec_should_crash(struct task_struct *p)
>>  {
>>  	/*
>> --
>> 2.17.1
> 



* Re: [PATCH] x86/kdump: always reserve the low 1MiB when the crashkernel
  2019-10-23  5:35     ` lijiang
  2019-10-23  7:46       ` Borislav Petkov
  2019-10-24  8:13       ` d.hatayama
@ 2019-10-24 22:12       ` kbuild test robot
  2019-10-24 23:55         ` lijiang
  2 siblings, 1 reply; 16+ messages in thread
From: kbuild test robot @ 2019-10-24 22:12 UTC (permalink / raw)
  To: lijiang
  Cc: kbuild-all, Borislav Petkov, linux-kernel, tglx, mingo, hpa, x86,
	bhe, dyoung, jgross, dhowells, Thomas.Lendacky, ebiederm, vgoyal,
	kexec

[-- Attachment #1: Type: text/plain, Size: 1886 bytes --]

Hi lijiang,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on linus/master]
[cannot apply to v5.4-rc4 next-20191024]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url:    https://github.com/0day-ci/linux/commits/lijiang/x86-kdump-always-reserve-the-low-1MiB-when-the-crashkernel/20191025-030439
base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git f116b96685a046a89c25d4a6ba2da489145c8888
config: i386-defconfig (attached as .config)
compiler: gcc-7 (Debian 7.4.0-14) 7.4.0
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386 

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

>> WARNING: vmlinux.o(.text+0xe39b7): Section mismatch in reference from the function kexec_reserve_low_1MiB() to the variable .init.data:boot_command_line
   The function kexec_reserve_low_1MiB() references
   the variable __initdata boot_command_line.
   This is often because kexec_reserve_low_1MiB lacks a __initdata
   annotation or the annotation of boot_command_line is wrong.
--
>> WARNING: vmlinux.o(.text+0xe39d0): Section mismatch in reference from the function kexec_reserve_low_1MiB() to the function .meminit.text:memblock_reserve()
   The function kexec_reserve_low_1MiB() references
   the function __meminit memblock_reserve().
   This is often because kexec_reserve_low_1MiB lacks a __meminit
   annotation or the annotation of memblock_reserve is wrong.

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 28148 bytes --]


* Re: [PATCH] x86/kdump: always reserve the low 1MiB when the crashkernel
  2019-10-24 22:12       ` [PATCH] x86/kdump: always reserve the low 1MiB when the crashkernel kbuild test robot
@ 2019-10-24 23:55         ` lijiang
  0 siblings, 0 replies; 16+ messages in thread
From: lijiang @ 2019-10-24 23:55 UTC (permalink / raw)
  To: kbuild test robot
  Cc: kbuild-all, Borislav Petkov, linux-kernel, tglx, mingo, hpa, x86,
	bhe, dyoung, jgross, dhowells, Thomas.Lendacky, ebiederm, vgoyal,
	kexec

On 2019-10-25 06:12, kbuild test robot wrote:
> Hi lijiang,
> 
> Thank you for the patch! Perhaps something to improve:
> 
> [auto build test WARNING on linus/master]
> [cannot apply to v5.4-rc4 next-20191024]
> [if your patch is applied to the wrong git tree, please drop us a note to help
> improve the system. BTW, we also suggest to use '--base' option to specify the
> base tree in git format-patch, please see https://stackoverflow.com/a/37406982]
> 
> url:    https://github.com/0day-ci/linux/commits/lijiang/x86-kdump-always-reserve-the-low-1MiB-when-the-crashkernel/20191025-030439
> base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git f116b96685a046a89c25d4a6ba2da489145c8888
> config: i386-defconfig (attached as .config)
> compiler: gcc-7 (Debian 7.4.0-14) 7.4.0
> reproduce:
>         # save the attached .config to linux build tree
>         make ARCH=i386 
> 
> If you fix the issue, kindly add following tag
> Reported-by: kbuild test robot <lkp@intel.com>
> 
> All warnings (new ones prefixed by >>):
> 
>>> WARNING: vmlinux.o(.text+0xe39b7): Section mismatch in reference from the function kexec_reserve_low_1MiB() to the variable .init.data:boot_command_line
>    The function kexec_reserve_low_1MiB() references
>    the variable __initdata boot_command_line.
>    This is often because kexec_reserve_low_1MiB lacks a __initdata
>    annotation or the annotation of boot_command_line is wrong.
> --
>>> WARNING: vmlinux.o(.text+0xe39d0): Section mismatch in reference from the function kexec_reserve_low_1MiB() to the function .meminit.text:memblock_reserve()
>    The function kexec_reserve_low_1MiB() references
>    the function __meminit memblock_reserve().
>    This is often because kexec_reserve_low_1MiB lacks a __meminit
>    annotation or the annotation of memblock_reserve is wrong.
> 
These warnings have been fixed in patch v5. Please refer to the latest patch v5.
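
For context, warnings of this kind are raised when a function that stays
resident after boot references symbols placed in sections that are
discarded once initialization completes. The usual remedy is to put the
caller into an init section as well. Below is a minimal userspace sketch
of that idea. The __init and __initdata macros, the ".demo.*" section
names and the demo_* identifiers are stand-ins made up for illustration
and are not the kernel's definitions; the sketch assumes GCC or Clang on
an ELF target.

```c
#include <assert.h>
#include <string.h>

/* Stand-ins for the kernel annotations; the section names are hypothetical. */
#define __init     __attribute__((__section__(".demo.init.text")))
#define __initdata __attribute__((__section__(".demo.init.data")))

/* Boot-time-only data, modeled after boot_command_line. */
static char demo_command_line[64] __initdata =
	"root=/dev/sda1 crashkernel=256M";

/*
 * Because this function is itself placed in an init section, a
 * reference from it to init-only data has a matching lifetime.
 * Without the __init marker the function would land in the regular
 * text section, which is the situation the warnings describe.
 */
static int __init demo_wants_low_1mib(void)
{
	/* The buffer is zero-padded, so it must stay NUL-terminated. */
	assert(demo_command_line[sizeof(demo_command_line) - 1] == '\0');
	return strstr(demo_command_line, "crashkernel=") != NULL;
}
```

In userspace nothing is actually discarded, so the sketch only shows the
section placement and not the freeing of init memory.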

Thanks.
Lianbo

> ---
> 0-DAY kernel test infrastructure                Open Source Technology Center
> https://lists.01.org/pipermail/kbuild-all                   Intel Corporation
> 



Thread overview: 16+ messages
2019-10-17  9:43 [PATCH 0/3 v4] x86/kdump: Fix 'kmem -s' reported an invalid freepointer when SME was active Lianbo Jiang
2019-10-17  9:43 ` [PATCH 1/3 v4] x86/kdump: always reserve the low 1MiB when the crashkernel option is specified Lianbo Jiang
2019-10-22  8:30   ` Borislav Petkov
2019-10-23  5:23     ` lijiang
2019-10-23  7:43       ` Borislav Petkov
2019-10-23  5:35     ` lijiang
2019-10-23  7:46       ` Borislav Petkov
2019-10-23  9:20         ` lijiang
2019-10-24  8:13       ` d.hatayama
2019-10-24  9:10         ` Borislav Petkov
2019-10-24 11:24         ` lijiang
2019-10-24 22:12       ` [PATCH] x86/kdump: always reserve the low 1MiB when the crashkernel kbuild test robot
2019-10-24 23:55         ` lijiang
2019-10-17  9:43 ` [PATCH 2/3 v4] x86/kdump: remove the unused crash_copy_backup_region() Lianbo Jiang
2019-10-17  9:43 ` [PATCH 3/3 v4] x86/kdump: clean up all the code related to the backup region Lianbo Jiang
2019-10-22 12:15   ` Borislav Petkov
