linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2 v5] x86/kdump: Fix 'kmem -s' reported an invalid freepointer when SME was active
@ 2019-10-23 14:19 Lianbo Jiang
  2019-10-23 14:19 ` [PATCH 1/2 v5] x86/kdump: always reserve the low 1MiB when the crashkernel option is specified Lianbo Jiang
  2019-10-23 14:19 ` [PATCH 2/2 v5] x86/kdump: clean up all the code related to the backup region Lianbo Jiang
  0 siblings, 2 replies; 10+ messages in thread
From: Lianbo Jiang @ 2019-10-23 14:19 UTC (permalink / raw)
  To: linux-kernel
  Cc: tglx, mingo, bp, hpa, x86, bhe, dyoung, jgross, dhowells,
	Thomas.Lendacky, ebiederm, vgoyal, kexec

In purgatory(), the main things are as below:

[1] verify sha256 hashes for various segments.
    Lets keep these codes, and do not touch the logic.

[2] copy the first 640k content to a backup region.
    Lets safely remove it and clean all code related to backup region.

This patch series will remove the backup region, because the current
handling of copying the first 640k runs into problems when SME is
active(https://bugzilla.kernel.org/show_bug.cgi?id=204793).

The low 1MiB region will always be reserved when the crashkernel kernel
command line option is specified. And this way makes it unnecessary to
do anything with the low 1MiB region, because the memory allocated later
won't fall into the low 1MiB area.

This series includes two patches:
[1] x86/kdump: always reserve the low 1MiB when the crashkernel option
    is specified
    The low 1MiB region will always be reserved when the crashkernel
    kernel command line option is specified, which ensures that the
    memory allocated later won't fall into the low 1MiB area.

[2] x86/kdump: clean up all the code related to the backup region
    Remove the backup region and clean up.

Changes since v1:
[1] Add extra checking condition: when the crashkernel option is
    specified, reserve the low 640k area.

Changes since v2:
[1] Reserve the low 1MiB region when the crashkernel option is only
    specified.(Suggested by Eric)

[2] Remove the unused crash_copy_backup_region()

[3] Remove the backup region and clean up

[4] Split them into three patches

Changes since v3:
[1] Improve the first patch's log

[2] Improve the third patch based on Eric's suggestions

Changes since v4:
[1] Correct some typos, and also improve the first patch's log

[2] Add a new function kexec_reserve_low_1MiB() in kernel/kexec_core.c
    and which is called by reserve_real_mode(). (Suggested by Boris) 

Lianbo Jiang (2):
  x86/kdump: always reserve the low 1MiB when the crashkernel option is
    specified
  x86/kdump: clean up all the code related to the backup region

 arch/x86/include/asm/kexec.h       | 10 ----
 arch/x86/include/asm/purgatory.h   | 10 ----
 arch/x86/kernel/crash.c            | 87 ++++--------------------------
 arch/x86/kernel/machine_kexec_64.c | 47 ----------------
 arch/x86/purgatory/purgatory.c     | 19 -------
 arch/x86/realmode/init.c           |  2 +
 include/linux/kexec.h              |  2 +
 kernel/kexec_core.c                | 13 +++++
 8 files changed, 28 insertions(+), 162 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/2 v5] x86/kdump: always reserve the low 1MiB when the crashkernel option is specified
  2019-10-23 14:19 [PATCH 0/2 v5] x86/kdump: Fix 'kmem -s' reported an invalid freepointer when SME was active Lianbo Jiang
@ 2019-10-23 14:19 ` Lianbo Jiang
  2019-10-24 10:07   ` Simon Horman
  2019-10-23 14:19 ` [PATCH 2/2 v5] x86/kdump: clean up all the code related to the backup region Lianbo Jiang
  1 sibling, 1 reply; 10+ messages in thread
From: Lianbo Jiang @ 2019-10-23 14:19 UTC (permalink / raw)
  To: linux-kernel
  Cc: tglx, mingo, bp, hpa, x86, bhe, dyoung, jgross, dhowells,
	Thomas.Lendacky, ebiederm, vgoyal, kexec

Kdump kernel will reuse the first 640k region because the real mode
trampoline has to work in this area. When the vmcore is dumped, the
old memory in this area may be accessed, therefore, kernel has to
copy the contents of the first 640k area to a backup region so that
kdump kernel can read the old memory from the backup area of the
first 640k area, which is done in the purgatory().

But, the current handling of copying the first 640k area runs into
problems when SME is enabled, kernel does not properly copy these
old memory to the backup area in the purgatory(), thereby, kdump
kernel reads out the encrypted contents, because the kdump kernel
must access the first kernel's memory with the encryption bit set
when SME is enabled in the first kernel. Please refer to this link:

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204793

Finally, it causes the following errors, and the crash tool gets
invalid pointers when parsing the vmcore.

crash> kmem -s|grep -i invalid
kmem: dma-kmalloc-512: slab:ffffd77680001c00 invalid freepointer:a6086ac099f0c5a4
kmem: dma-kmalloc-512: slab:ffffd77680001c00 invalid freepointer:a6086ac099f0c5a4
crash>

To avoid the above errors, when the crashkernel option is specified,
lets reserve the remaining low 1MiB memory(after reserving real mode
memory) so that the allocated memory does not fall into the low 1MiB
area, which makes us not to copy the first 640k content to a backup
region in purgatory(). This indicates that it does not need to be
included in crash dumps or used for anything except the processor
trampolines that must live in the low 1MiB.

Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
---
BTW:I also tried to fix the above problem in purgatory(), but there
are too many restricts in purgatory() context, for example: i can't
allocate new memory to create the identity mapping page table for
SME situation.

Currently, there are two places where the first 640k area is needed,
the first one is in the find_trampoline_placement(), another one is
in the reserve_real_mode(), and their content doesn't matter.

In addition, also need to clean all the code related to the backup
region later.

 arch/x86/realmode/init.c |  2 ++
 include/linux/kexec.h    |  2 ++
 kernel/kexec_core.c      | 13 +++++++++++++
 3 files changed, 17 insertions(+)

diff --git a/arch/x86/realmode/init.c b/arch/x86/realmode/init.c
index 7dce39c8c034..064cc79a015d 100644
--- a/arch/x86/realmode/init.c
+++ b/arch/x86/realmode/init.c
@@ -3,6 +3,7 @@
 #include <linux/slab.h>
 #include <linux/memblock.h>
 #include <linux/mem_encrypt.h>
+#include <linux/kexec.h>
 
 #include <asm/set_memory.h>
 #include <asm/pgtable.h>
@@ -34,6 +35,7 @@ void __init reserve_real_mode(void)
 
 	memblock_reserve(mem, size);
 	set_real_mode_mem(mem);
+	kexec_reserve_low_1MiB();
 }
 
 static void __init setup_real_mode(void)
diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index 1776eb2e43a4..30acf1d738bc 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -306,6 +306,7 @@ extern void __crash_kexec(struct pt_regs *);
 extern void crash_kexec(struct pt_regs *);
 int kexec_should_crash(struct task_struct *);
 int kexec_crash_loaded(void);
+void __init kexec_reserve_low_1MiB(void);
 void crash_save_cpu(struct pt_regs *regs, int cpu);
 extern int kimage_crash_copy_vmcoreinfo(struct kimage *image);
 
@@ -397,6 +398,7 @@ static inline void __crash_kexec(struct pt_regs *regs) { }
 static inline void crash_kexec(struct pt_regs *regs) { }
 static inline int kexec_should_crash(struct task_struct *p) { return 0; }
 static inline int kexec_crash_loaded(void) { return 0; }
+static inline void __init kexec_reserve_low_1MiB(void) { }
 #define kexec_in_progress false
 #endif /* CONFIG_KEXEC_CORE */
 
diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
index 15d70a90b50d..5bd89f1fee42 100644
--- a/kernel/kexec_core.c
+++ b/kernel/kexec_core.c
@@ -37,6 +37,7 @@
 #include <linux/compiler.h>
 #include <linux/hugetlb.h>
 #include <linux/frame.h>
+#include <linux/memblock.h>
 
 #include <asm/page.h>
 #include <asm/sections.h>
@@ -70,6 +71,18 @@ struct resource crashk_low_res = {
 	.desc  = IORES_DESC_CRASH_KERNEL
 };
 
+/*
+ * When the crashkernel option is specified, only use the low
+ * 1MiB for the real mode trampoline.
+ */
+void __init kexec_reserve_low_1MiB(void)
+{
+	if (strstr(boot_command_line, "crashkernel=")) {
+		memblock_reserve(0, 1<<20);
+		pr_info("Reserving the low 1MiB of memory for crashkernel\n");
+	}
+}
+
 int kexec_should_crash(struct task_struct *p)
 {
 	/*
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 2/2 v5] x86/kdump: clean up all the code related to the backup region
  2019-10-23 14:19 [PATCH 0/2 v5] x86/kdump: Fix 'kmem -s' reported an invalid freepointer when SME was active Lianbo Jiang
  2019-10-23 14:19 ` [PATCH 1/2 v5] x86/kdump: always reserve the low 1MiB when the crashkernel option is specified Lianbo Jiang
@ 2019-10-23 14:19 ` Lianbo Jiang
  1 sibling, 0 replies; 10+ messages in thread
From: Lianbo Jiang @ 2019-10-23 14:19 UTC (permalink / raw)
  To: linux-kernel
  Cc: tglx, mingo, bp, hpa, x86, bhe, dyoung, jgross, dhowells,
	Thomas.Lendacky, ebiederm, vgoyal, kexec

When the crashkernel kernel command line option is specified, the
low 1MiB memory will always be reserved, which makes that the memory
allocated later won't fall into the low 1MiB area, thereby, it's not
necessary to create a backup region and also no need to copy the first
640k content to a backup region.

Currently, the code related to the backup region can be safely removed,
so lets clean up.

Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
---
 arch/x86/include/asm/kexec.h       | 10 ----
 arch/x86/include/asm/purgatory.h   | 10 ----
 arch/x86/kernel/crash.c            | 87 ++++--------------------------
 arch/x86/kernel/machine_kexec_64.c | 47 ----------------
 arch/x86/purgatory/purgatory.c     | 19 -------
 5 files changed, 11 insertions(+), 162 deletions(-)

diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h
index 5e7d6b46de97..6802c59e8252 100644
--- a/arch/x86/include/asm/kexec.h
+++ b/arch/x86/include/asm/kexec.h
@@ -66,10 +66,6 @@ struct kimage;
 # define KEXEC_ARCH KEXEC_ARCH_X86_64
 #endif
 
-/* Memory to backup during crash kdump */
-#define KEXEC_BACKUP_SRC_START	(0UL)
-#define KEXEC_BACKUP_SRC_END	(640 * 1024UL - 1)	/* 640K */
-
 /*
  * This function is responsible for capturing register states if coming
  * via panic otherwise just fix up the ss and sp if coming via kernel
@@ -154,12 +150,6 @@ struct kimage_arch {
 	pud_t *pud;
 	pmd_t *pmd;
 	pte_t *pte;
-	/* Details of backup region */
-	unsigned long backup_src_start;
-	unsigned long backup_src_sz;
-
-	/* Physical address of backup segment */
-	unsigned long backup_load_addr;
 
 	/* Core ELF header buffer */
 	void *elf_headers;
diff --git a/arch/x86/include/asm/purgatory.h b/arch/x86/include/asm/purgatory.h
index 92c34e517da1..5528e9325049 100644
--- a/arch/x86/include/asm/purgatory.h
+++ b/arch/x86/include/asm/purgatory.h
@@ -6,16 +6,6 @@
 #include <linux/purgatory.h>
 
 extern void purgatory(void);
-/*
- * These forward declarations serve two purposes:
- *
- * 1) Make sparse happy when checking arch/purgatory
- * 2) Document that these are required to be global so the symbol
- *    lookup in kexec works
- */
-extern unsigned long purgatory_backup_dest;
-extern unsigned long purgatory_backup_src;
-extern unsigned long purgatory_backup_sz;
 #endif	/* __ASSEMBLY__ */
 
 #endif /* _ASM_PURGATORY_H */
diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index eb651fbde92a..ef54b3ffb0f6 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -173,8 +173,6 @@ void native_machine_crash_shutdown(struct pt_regs *regs)
 
 #ifdef CONFIG_KEXEC_FILE
 
-static unsigned long crash_zero_bytes;
-
 static int get_nr_ram_ranges_callback(struct resource *res, void *arg)
 {
 	unsigned int *nr_ranges = arg;
@@ -217,6 +215,11 @@ static int elf_header_exclude_ranges(struct crash_mem *cmem)
 {
 	int ret = 0;
 
+	/* Exclude the low 1MiB because it is always reserved */
+	ret = crash_exclude_mem_range(cmem, 0, 1<<20);
+	if (ret)
+		return ret;
+
 	/* Exclude crashkernel region */
 	ret = crash_exclude_mem_range(cmem, crashk_res.start, crashk_res.end);
 	if (ret)
@@ -246,9 +249,7 @@ static int prepare_elf_headers(struct kimage *image, void **addr,
 					unsigned long *sz)
 {
 	struct crash_mem *cmem;
-	Elf64_Ehdr *ehdr;
-	Elf64_Phdr *phdr;
-	int ret, i;
+	int ret;
 
 	cmem = fill_up_crash_elf_data();
 	if (!cmem)
@@ -267,22 +268,7 @@ static int prepare_elf_headers(struct kimage *image, void **addr,
 	/* By default prepare 64bit headers */
 	ret =  crash_prepare_elf64_headers(cmem,
 				IS_ENABLED(CONFIG_X86_64), addr, sz);
-	if (ret)
-		goto out;
 
-	/*
-	 * If a range matches backup region, adjust offset to backup
-	 * segment.
-	 */
-	ehdr = (Elf64_Ehdr *)*addr;
-	phdr = (Elf64_Phdr *)(ehdr + 1);
-	for (i = 0; i < ehdr->e_phnum; phdr++, i++)
-		if (phdr->p_type == PT_LOAD &&
-				phdr->p_paddr == image->arch.backup_src_start &&
-				phdr->p_memsz == image->arch.backup_src_sz) {
-			phdr->p_offset = image->arch.backup_load_addr;
-			break;
-		}
 out:
 	vfree(cmem);
 	return ret;
@@ -321,19 +307,11 @@ static int memmap_exclude_ranges(struct kimage *image, struct crash_mem *cmem,
 				 unsigned long long mend)
 {
 	unsigned long start, end;
-	int ret = 0;
 
 	cmem->ranges[0].start = mstart;
 	cmem->ranges[0].end = mend;
 	cmem->nr_ranges = 1;
 
-	/* Exclude Backup region */
-	start = image->arch.backup_load_addr;
-	end = start + image->arch.backup_src_sz - 1;
-	ret = crash_exclude_mem_range(cmem, start, end);
-	if (ret)
-		return ret;
-
 	/* Exclude elf header region */
 	start = image->arch.elf_load_addr;
 	end = start + image->arch.elf_headers_sz - 1;
@@ -356,11 +334,11 @@ int crash_setup_memmap_entries(struct kimage *image, struct boot_params *params)
 	memset(&cmd, 0, sizeof(struct crash_memmap_data));
 	cmd.params = params;
 
-	/* Add first 640K segment */
-	ei.addr = image->arch.backup_src_start;
-	ei.size = image->arch.backup_src_sz;
-	ei.type = E820_TYPE_RAM;
-	add_e820_entry(params, &ei);
+	/* Add the low 1MiB */
+	cmd.type = E820_TYPE_RAM;
+	flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
+	walk_iomem_res_desc(IORES_DESC_NONE, flags, 0, (1<<20)-1, &cmd,
+			memmap_entry_callback);
 
 	/* Add ACPI tables */
 	cmd.type = E820_TYPE_ACPI;
@@ -409,55 +387,12 @@ int crash_setup_memmap_entries(struct kimage *image, struct boot_params *params)
 	return ret;
 }
 
-static int determine_backup_region(struct resource *res, void *arg)
-{
-	struct kimage *image = arg;
-
-	image->arch.backup_src_start = res->start;
-	image->arch.backup_src_sz = resource_size(res);
-
-	/* Expecting only one range for backup region */
-	return 1;
-}
-
 int crash_load_segments(struct kimage *image)
 {
 	int ret;
 	struct kexec_buf kbuf = { .image = image, .buf_min = 0,
 				  .buf_max = ULONG_MAX, .top_down = false };
 
-	/*
-	 * Determine and load a segment for backup area. First 640K RAM
-	 * region is backup source
-	 */
-
-	ret = walk_system_ram_res(KEXEC_BACKUP_SRC_START, KEXEC_BACKUP_SRC_END,
-				image, determine_backup_region);
-
-	/* Zero or postive return values are ok */
-	if (ret < 0)
-		return ret;
-
-	/* Add backup segment. */
-	if (image->arch.backup_src_sz) {
-		kbuf.buffer = &crash_zero_bytes;
-		kbuf.bufsz = sizeof(crash_zero_bytes);
-		kbuf.memsz = image->arch.backup_src_sz;
-		kbuf.buf_align = PAGE_SIZE;
-		/*
-		 * Ideally there is no source for backup segment. This is
-		 * copied in purgatory after crash. Just add a zero filled
-		 * segment for now to make sure checksum logic works fine.
-		 */
-		ret = kexec_add_buffer(&kbuf);
-		if (ret)
-			return ret;
-		image->arch.backup_load_addr = kbuf.mem;
-		pr_debug("Loaded backup region at 0x%lx backup_start=0x%lx memsz=0x%lx\n",
-			 image->arch.backup_load_addr,
-			 image->arch.backup_src_start, kbuf.memsz);
-	}
-
 	/* Prepare elf headers and add a segment */
 	ret = prepare_elf_headers(image, &kbuf.buffer, &kbuf.bufsz);
 	if (ret)
diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
index 5dcd438ad8f2..16e125a50b33 100644
--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -298,48 +298,6 @@ static void load_segments(void)
 		);
 }
 
-#ifdef CONFIG_KEXEC_FILE
-/* Update purgatory as needed after various image segments have been prepared */
-static int arch_update_purgatory(struct kimage *image)
-{
-	int ret = 0;
-
-	if (!image->file_mode)
-		return 0;
-
-	/* Setup copying of backup region */
-	if (image->type == KEXEC_TYPE_CRASH) {
-		ret = kexec_purgatory_get_set_symbol(image,
-				"purgatory_backup_dest",
-				&image->arch.backup_load_addr,
-				sizeof(image->arch.backup_load_addr), 0);
-		if (ret)
-			return ret;
-
-		ret = kexec_purgatory_get_set_symbol(image,
-				"purgatory_backup_src",
-				&image->arch.backup_src_start,
-				sizeof(image->arch.backup_src_start), 0);
-		if (ret)
-			return ret;
-
-		ret = kexec_purgatory_get_set_symbol(image,
-				"purgatory_backup_sz",
-				&image->arch.backup_src_sz,
-				sizeof(image->arch.backup_src_sz), 0);
-		if (ret)
-			return ret;
-	}
-
-	return ret;
-}
-#else /* !CONFIG_KEXEC_FILE */
-static inline int arch_update_purgatory(struct kimage *image)
-{
-	return 0;
-}
-#endif /* CONFIG_KEXEC_FILE */
-
 int machine_kexec_prepare(struct kimage *image)
 {
 	unsigned long start_pgtable;
@@ -353,11 +311,6 @@ int machine_kexec_prepare(struct kimage *image)
 	if (result)
 		return result;
 
-	/* update purgatory as needed */
-	result = arch_update_purgatory(image);
-	if (result)
-		return result;
-
 	return 0;
 }
 
diff --git a/arch/x86/purgatory/purgatory.c b/arch/x86/purgatory/purgatory.c
index 3b95410ff0f8..2961234d0795 100644
--- a/arch/x86/purgatory/purgatory.c
+++ b/arch/x86/purgatory/purgatory.c
@@ -14,28 +14,10 @@
 
 #include "../boot/string.h"
 
-unsigned long purgatory_backup_dest __section(.kexec-purgatory);
-unsigned long purgatory_backup_src __section(.kexec-purgatory);
-unsigned long purgatory_backup_sz __section(.kexec-purgatory);
-
 u8 purgatory_sha256_digest[SHA256_DIGEST_SIZE] __section(.kexec-purgatory);
 
 struct kexec_sha_region purgatory_sha_regions[KEXEC_SEGMENT_MAX] __section(.kexec-purgatory);
 
-/*
- * On x86, second kernel requries first 640K of memory to boot. Copy
- * first 640K to a backup region in reserved memory range so that second
- * kernel can use first 640K.
- */
-static int copy_backup_region(void)
-{
-	if (purgatory_backup_dest) {
-		memcpy((void *)purgatory_backup_dest,
-		       (void *)purgatory_backup_src, purgatory_backup_sz);
-	}
-	return 0;
-}
-
 static int verify_sha256_digest(void)
 {
 	struct kexec_sha_region *ptr, *end;
@@ -66,7 +48,6 @@ void purgatory(void)
 		for (;;)
 			;
 	}
-	copy_backup_region();
 }
 
 /*
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2 v5] x86/kdump: always reserve the low 1MiB when the crashkernel option is specified
  2019-10-23 14:19 ` [PATCH 1/2 v5] x86/kdump: always reserve the low 1MiB when the crashkernel option is specified Lianbo Jiang
@ 2019-10-24 10:07   ` Simon Horman
  2019-10-24 11:33     ` lijiang
  0 siblings, 1 reply; 10+ messages in thread
From: Simon Horman @ 2019-10-24 10:07 UTC (permalink / raw)
  To: Lianbo Jiang
  Cc: linux-kernel, jgross, Thomas.Lendacky, bhe, x86, kexec, dhowells,
	mingo, bp, ebiederm, hpa, tglx, dyoung, vgoyal, d.hatayama

Hi Linbo,

thanks for your patch.

On Wed, Oct 23, 2019 at 10:19:11PM +0800, Lianbo Jiang wrote:
> Kdump kernel will reuse the first 640k region because the real mode
> trampoline has to work in this area. When the vmcore is dumped, the
> old memory in this area may be accessed, therefore, kernel has to
> copy the contents of the first 640k area to a backup region so that
> kdump kernel can read the old memory from the backup area of the
> first 640k area, which is done in the purgatory().
> 
> But, the current handling of copying the first 640k area runs into
> problems when SME is enabled, kernel does not properly copy these
> old memory to the backup area in the purgatory(), thereby, kdump
> kernel reads out the encrypted contents, because the kdump kernel
> must access the first kernel's memory with the encryption bit set
> when SME is enabled in the first kernel. Please refer to this link:
> 
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204793
> 
> Finally, it causes the following errors, and the crash tool gets
> invalid pointers when parsing the vmcore.
> 
> crash> kmem -s|grep -i invalid
> kmem: dma-kmalloc-512: slab:ffffd77680001c00 invalid freepointer:a6086ac099f0c5a4
> kmem: dma-kmalloc-512: slab:ffffd77680001c00 invalid freepointer:a6086ac099f0c5a4
> crash>
> 
> To avoid the above errors, when the crashkernel option is specified,
> lets reserve the remaining low 1MiB memory(after reserving real mode
> memory) so that the allocated memory does not fall into the low 1MiB
> area, which makes us not to copy the first 640k content to a backup
> region in purgatory(). This indicates that it does not need to be
> included in crash dumps or used for anything except the processor
> trampolines that must live in the low 1MiB.
> 
> Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
> ---
> BTW:I also tried to fix the above problem in purgatory(), but there
> are too many restricts in purgatory() context, for example: i can't
> allocate new memory to create the identity mapping page table for
> SME situation.
> 
> Currently, there are two places where the first 640k area is needed,
> the first one is in the find_trampoline_placement(), another one is
> in the reserve_real_mode(), and their content doesn't matter.
> 
> In addition, also need to clean all the code related to the backup
> region later.
> 
>  arch/x86/realmode/init.c |  2 ++
>  include/linux/kexec.h    |  2 ++
>  kernel/kexec_core.c      | 13 +++++++++++++
>  3 files changed, 17 insertions(+)
> 
> diff --git a/arch/x86/realmode/init.c b/arch/x86/realmode/init.c
> index 7dce39c8c034..064cc79a015d 100644
> --- a/arch/x86/realmode/init.c
> +++ b/arch/x86/realmode/init.c
> @@ -3,6 +3,7 @@
>  #include <linux/slab.h>
>  #include <linux/memblock.h>
>  #include <linux/mem_encrypt.h>
> +#include <linux/kexec.h>
>  
>  #include <asm/set_memory.h>
>  #include <asm/pgtable.h>
> @@ -34,6 +35,7 @@ void __init reserve_real_mode(void)
>  
>  	memblock_reserve(mem, size);
>  	set_real_mode_mem(mem);
> +	kexec_reserve_low_1MiB();
>  }
>  
>  static void __init setup_real_mode(void)
> diff --git a/include/linux/kexec.h b/include/linux/kexec.h
> index 1776eb2e43a4..30acf1d738bc 100644
> --- a/include/linux/kexec.h
> +++ b/include/linux/kexec.h
> @@ -306,6 +306,7 @@ extern void __crash_kexec(struct pt_regs *);
>  extern void crash_kexec(struct pt_regs *);
>  int kexec_should_crash(struct task_struct *);
>  int kexec_crash_loaded(void);
> +void __init kexec_reserve_low_1MiB(void);
>  void crash_save_cpu(struct pt_regs *regs, int cpu);
>  extern int kimage_crash_copy_vmcoreinfo(struct kimage *image);
>  
> @@ -397,6 +398,7 @@ static inline void __crash_kexec(struct pt_regs *regs) { }
>  static inline void crash_kexec(struct pt_regs *regs) { }
>  static inline int kexec_should_crash(struct task_struct *p) { return 0; }
>  static inline int kexec_crash_loaded(void) { return 0; }
> +static inline void __init kexec_reserve_low_1MiB(void) { }
>  #define kexec_in_progress false
>  #endif /* CONFIG_KEXEC_CORE */
>  
> diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
> index 15d70a90b50d..5bd89f1fee42 100644
> --- a/kernel/kexec_core.c
> +++ b/kernel/kexec_core.c
> @@ -37,6 +37,7 @@
>  #include <linux/compiler.h>
>  #include <linux/hugetlb.h>
>  #include <linux/frame.h>
> +#include <linux/memblock.h>
>  
>  #include <asm/page.h>
>  #include <asm/sections.h>
> @@ -70,6 +71,18 @@ struct resource crashk_low_res = {
>  	.desc  = IORES_DESC_CRASH_KERNEL
>  };
>  
> +/*
> + * When the crashkernel option is specified, only use the low
> + * 1MiB for the real mode trampoline.
> + */
> +void __init kexec_reserve_low_1MiB(void)
> +{
> +	if (strstr(boot_command_line, "crashkernel=")) {

Could you comment on the issue of using strstr which
was raised by Hatayama-san in response to an earlier revision
of this patch?

Thanks in advance!

> +		memblock_reserve(0, 1<<20);
> +		pr_info("Reserving the low 1MiB of memory for crashkernel\n");
> +	}
> +}
> +
>  int kexec_should_crash(struct task_struct *p)
>  {
>  	/*
> -- 
> 2.17.1
> 
> 
> _______________________________________________
> kexec mailing list
> kexec@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2 v5] x86/kdump: always reserve the low 1MiB when the crashkernel option is specified
  2019-10-24 10:07   ` Simon Horman
@ 2019-10-24 11:33     ` lijiang
  2019-10-25  1:31       ` lijiang
  0 siblings, 1 reply; 10+ messages in thread
From: lijiang @ 2019-10-24 11:33 UTC (permalink / raw)
  To: Simon Horman
  Cc: linux-kernel, jgross, Thomas.Lendacky, bhe, x86, kexec, dhowells,
	mingo, bp, ebiederm, hpa, tglx, dyoung, vgoyal, d.hatayama,
	d.hatayama

在 2019年10月24日 18:07, Simon Horman 写道:
> Hi Linbo,
> 
> thanks for your patch.
> 
> On Wed, Oct 23, 2019 at 10:19:11PM +0800, Lianbo Jiang wrote:
>> Kdump kernel will reuse the first 640k region because the real mode
>> trampoline has to work in this area. When the vmcore is dumped, the
>> old memory in this area may be accessed, therefore, kernel has to
>> copy the contents of the first 640k area to a backup region so that
>> kdump kernel can read the old memory from the backup area of the
>> first 640k area, which is done in the purgatory().
>>
>> But, the current handling of copying the first 640k area runs into
>> problems when SME is enabled, kernel does not properly copy these
>> old memory to the backup area in the purgatory(), thereby, kdump
>> kernel reads out the encrypted contents, because the kdump kernel
>> must access the first kernel's memory with the encryption bit set
>> when SME is enabled in the first kernel. Please refer to this link:
>>
>> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204793
>>
>> Finally, it causes the following errors, and the crash tool gets
>> invalid pointers when parsing the vmcore.
>>
>> crash> kmem -s|grep -i invalid
>> kmem: dma-kmalloc-512: slab:ffffd77680001c00 invalid freepointer:a6086ac099f0c5a4
>> kmem: dma-kmalloc-512: slab:ffffd77680001c00 invalid freepointer:a6086ac099f0c5a4
>> crash>
>>
>> To avoid the above errors, when the crashkernel option is specified,
>> lets reserve the remaining low 1MiB memory(after reserving real mode
>> memory) so that the allocated memory does not fall into the low 1MiB
>> area, which makes us not to copy the first 640k content to a backup
>> region in purgatory(). This indicates that it does not need to be
>> included in crash dumps or used for anything except the processor
>> trampolines that must live in the low 1MiB.
>>
>> Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
>> ---
>> BTW:I also tried to fix the above problem in purgatory(), but there
>> are too many restricts in purgatory() context, for example: i can't
>> allocate new memory to create the identity mapping page table for
>> SME situation.
>>
>> Currently, there are two places where the first 640k area is needed,
>> the first one is in the find_trampoline_placement(), another one is
>> in the reserve_real_mode(), and their content doesn't matter.
>>
>> In addition, also need to clean all the code related to the backup
>> region later.
>>
>>  arch/x86/realmode/init.c |  2 ++
>>  include/linux/kexec.h    |  2 ++
>>  kernel/kexec_core.c      | 13 +++++++++++++
>>  3 files changed, 17 insertions(+)
>>
>> diff --git a/arch/x86/realmode/init.c b/arch/x86/realmode/init.c
>> index 7dce39c8c034..064cc79a015d 100644
>> --- a/arch/x86/realmode/init.c
>> +++ b/arch/x86/realmode/init.c
>> @@ -3,6 +3,7 @@
>>  #include <linux/slab.h>
>>  #include <linux/memblock.h>
>>  #include <linux/mem_encrypt.h>
>> +#include <linux/kexec.h>
>>  
>>  #include <asm/set_memory.h>
>>  #include <asm/pgtable.h>
>> @@ -34,6 +35,7 @@ void __init reserve_real_mode(void)
>>  
>>  	memblock_reserve(mem, size);
>>  	set_real_mode_mem(mem);
>> +	kexec_reserve_low_1MiB();
>>  }
>>  
>>  static void __init setup_real_mode(void)
>> diff --git a/include/linux/kexec.h b/include/linux/kexec.h
>> index 1776eb2e43a4..30acf1d738bc 100644
>> --- a/include/linux/kexec.h
>> +++ b/include/linux/kexec.h
>> @@ -306,6 +306,7 @@ extern void __crash_kexec(struct pt_regs *);
>>  extern void crash_kexec(struct pt_regs *);
>>  int kexec_should_crash(struct task_struct *);
>>  int kexec_crash_loaded(void);
>> +void __init kexec_reserve_low_1MiB(void);
>>  void crash_save_cpu(struct pt_regs *regs, int cpu);
>>  extern int kimage_crash_copy_vmcoreinfo(struct kimage *image);
>>  
>> @@ -397,6 +398,7 @@ static inline void __crash_kexec(struct pt_regs *regs) { }
>>  static inline void crash_kexec(struct pt_regs *regs) { }
>>  static inline int kexec_should_crash(struct task_struct *p) { return 0; }
>>  static inline int kexec_crash_loaded(void) { return 0; }
>> +static inline void __init kexec_reserve_low_1MiB(void) { }
>>  #define kexec_in_progress false
>>  #endif /* CONFIG_KEXEC_CORE */
>>  
>> diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
>> index 15d70a90b50d..5bd89f1fee42 100644
>> --- a/kernel/kexec_core.c
>> +++ b/kernel/kexec_core.c
>> @@ -37,6 +37,7 @@
>>  #include <linux/compiler.h>
>>  #include <linux/hugetlb.h>
>>  #include <linux/frame.h>
>> +#include <linux/memblock.h>
>>  
>>  #include <asm/page.h>
>>  #include <asm/sections.h>
>> @@ -70,6 +71,18 @@ struct resource crashk_low_res = {
>>  	.desc  = IORES_DESC_CRASH_KERNEL
>>  };
>>  
>> +/*
>> + * When the crashkernel option is specified, only use the low
>> + * 1MiB for the real mode trampoline.
>> + */
>> +void __init kexec_reserve_low_1MiB(void)
>> +{
>> +	if (strstr(boot_command_line, "crashkernel=")) {
> 
> Could you comment on the issue of using strstr which
> was raised by Hatayama-san in response to an earlier revision
> of this patch?
> 

Thank you, Simon and Hatayama-san. Lets talk about it here.

> strstr() matches for example, ANYEXTRACHARACTERScrashkernel=ANYEXTRACHARACTERS.
> 
> Is it enough to use cmdline_find_option_bool()?
> 

The cmdline_find_option_bool() will find a boolean option, but the crashkernel option
is not a boolean option, maybe it looks odd. So, should we use the cmdline_find_option()
better?

+#include <asm/cmdline.h>

 void __init kexec_reserve_low_1MiB(void)
 {
-       if (strstr(boot_command_line, "crashkernel=")) {
+       char buffer[4];
+
+       if (cmdline_find_option(boot_command_line, "crashkernel=",
+                               buffer, sizeof(buffer))) {
                memblock_reserve(0, 1<<20);
                pr_info("Reserving the low 1MiB of memory for crashkernel\n");
        }

And here, no need to parse the arguments of crashkernel(sometimes, which has a
complicated syntax), so the size of buffer should be enough. What's your opinion?

Thanks
Lianbo

> Thanks in advance!
> 
>> +		memblock_reserve(0, 1<<20);
>> +		pr_info("Reserving the low 1MiB of memory for crashkernel\n");
>> +	}
>> +}
>> +
>>  int kexec_should_crash(struct task_struct *p)
>>  {
>>  	/*
>> -- 
>> 2.17.1
>>
>>
>> _______________________________________________
>> kexec mailing list
>> kexec@lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/kexec
>>


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2 v5] x86/kdump: always reserve the low 1MiB when the crashkernel option is specified
  2019-10-24 11:33     ` lijiang
@ 2019-10-25  1:31       ` lijiang
  2019-10-25  1:38         ` d.hatayama
  0 siblings, 1 reply; 10+ messages in thread
From: lijiang @ 2019-10-25  1:31 UTC (permalink / raw)
  To: Simon Horman, d.hatayama
  Cc: linux-kernel, jgross, Thomas.Lendacky, bhe, x86, kexec, dhowells,
	mingo, bp, ebiederm, hpa, tglx, dyoung, vgoyal

在 2019年10月24日 19:33, lijiang 写道:
> 在 2019年10月24日 18:07, Simon Horman 写道:
>> Hi Linbo,
>>
>> thanks for your patch.
>>
>> On Wed, Oct 23, 2019 at 10:19:11PM +0800, Lianbo Jiang wrote:
>>> Kdump kernel will reuse the first 640k region because the real mode
>>> trampoline has to work in this area. When the vmcore is dumped, the
>>> old memory in this area may be accessed, therefore, kernel has to
>>> copy the contents of the first 640k area to a backup region so that
>>> kdump kernel can read the old memory from the backup area of the
>>> first 640k area, which is done in the purgatory().
>>>
>>> But, the current handling of copying the first 640k area runs into
>>> problems when SME is enabled, kernel does not properly copy these
>>> old memory to the backup area in the purgatory(), thereby, kdump
>>> kernel reads out the encrypted contents, because the kdump kernel
>>> must access the first kernel's memory with the encryption bit set
>>> when SME is enabled in the first kernel. Please refer to this link:
>>>
>>> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204793
>>>
>>> Finally, it causes the following errors, and the crash tool gets
>>> invalid pointers when parsing the vmcore.
>>>
>>> crash> kmem -s|grep -i invalid
>>> kmem: dma-kmalloc-512: slab:ffffd77680001c00 invalid freepointer:a6086ac099f0c5a4
>>> kmem: dma-kmalloc-512: slab:ffffd77680001c00 invalid freepointer:a6086ac099f0c5a4
>>> crash>
>>>
>>> To avoid the above errors, when the crashkernel option is specified,
>>> lets reserve the remaining low 1MiB memory(after reserving real mode
>>> memory) so that the allocated memory does not fall into the low 1MiB
>>> area, which makes us not to copy the first 640k content to a backup
>>> region in purgatory(). This indicates that it does not need to be
>>> included in crash dumps or used for anything except the processor
>>> trampolines that must live in the low 1MiB.
>>>
>>> Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
>>> ---
>>> BTW:I also tried to fix the above problem in purgatory(), but there
>>> are too many restricts in purgatory() context, for example: i can't
>>> allocate new memory to create the identity mapping page table for
>>> SME situation.
>>>
>>> Currently, there are two places where the first 640k area is needed,
>>> the first one is in the find_trampoline_placement(), another one is
>>> in the reserve_real_mode(), and their content doesn't matter.
>>>
>>> In addition, also need to clean all the code related to the backup
>>> region later.
>>>
>>>  arch/x86/realmode/init.c |  2 ++
>>>  include/linux/kexec.h    |  2 ++
>>>  kernel/kexec_core.c      | 13 +++++++++++++
>>>  3 files changed, 17 insertions(+)
>>>
>>> diff --git a/arch/x86/realmode/init.c b/arch/x86/realmode/init.c
>>> index 7dce39c8c034..064cc79a015d 100644
>>> --- a/arch/x86/realmode/init.c
>>> +++ b/arch/x86/realmode/init.c
>>> @@ -3,6 +3,7 @@
>>>  #include <linux/slab.h>
>>>  #include <linux/memblock.h>
>>>  #include <linux/mem_encrypt.h>
>>> +#include <linux/kexec.h>
>>>  
>>>  #include <asm/set_memory.h>
>>>  #include <asm/pgtable.h>
>>> @@ -34,6 +35,7 @@ void __init reserve_real_mode(void)
>>>  
>>>  	memblock_reserve(mem, size);
>>>  	set_real_mode_mem(mem);
>>> +	kexec_reserve_low_1MiB();
>>>  }
>>>  
>>>  static void __init setup_real_mode(void)
>>> diff --git a/include/linux/kexec.h b/include/linux/kexec.h
>>> index 1776eb2e43a4..30acf1d738bc 100644
>>> --- a/include/linux/kexec.h
>>> +++ b/include/linux/kexec.h
>>> @@ -306,6 +306,7 @@ extern void __crash_kexec(struct pt_regs *);
>>>  extern void crash_kexec(struct pt_regs *);
>>>  int kexec_should_crash(struct task_struct *);
>>>  int kexec_crash_loaded(void);
>>> +void __init kexec_reserve_low_1MiB(void);
>>>  void crash_save_cpu(struct pt_regs *regs, int cpu);
>>>  extern int kimage_crash_copy_vmcoreinfo(struct kimage *image);
>>>  
>>> @@ -397,6 +398,7 @@ static inline void __crash_kexec(struct pt_regs *regs) { }
>>>  static inline void crash_kexec(struct pt_regs *regs) { }
>>>  static inline int kexec_should_crash(struct task_struct *p) { return 0; }
>>>  static inline int kexec_crash_loaded(void) { return 0; }
>>> +static inline void __init kexec_reserve_low_1MiB(void) { }
>>>  #define kexec_in_progress false
>>>  #endif /* CONFIG_KEXEC_CORE */
>>>  
>>> diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
>>> index 15d70a90b50d..5bd89f1fee42 100644
>>> --- a/kernel/kexec_core.c
>>> +++ b/kernel/kexec_core.c
>>> @@ -37,6 +37,7 @@
>>>  #include <linux/compiler.h>
>>>  #include <linux/hugetlb.h>
>>>  #include <linux/frame.h>
>>> +#include <linux/memblock.h>
>>>  
>>>  #include <asm/page.h>
>>>  #include <asm/sections.h>
>>> @@ -70,6 +71,18 @@ struct resource crashk_low_res = {
>>>  	.desc  = IORES_DESC_CRASH_KERNEL
>>>  };
>>>  
>>> +/*
>>> + * When the crashkernel option is specified, only use the low
>>> + * 1MiB for the real mode trampoline.
>>> + */
>>> +void __init kexec_reserve_low_1MiB(void)
>>> +{
>>> +	if (strstr(boot_command_line, "crashkernel=")) {
>>
>> Could you comment on the issue of using strstr which
>> was raised by Hatayama-san in response to an earlier revision
>> of this patch?
>>
> 
> Thank you, Simon and Hatayama-san. Lets talk about it here.
> 
>> strstr() matches for example, ANYEXTRACHARACTERScrashkernel=ANYEXTRACHARACTERS.
>>
>> Is it enough to use cmdline_find_option_bool()?
>>
> 
> The cmdline_find_option_bool() will find a boolean option, but the crashkernel option
> is not a boolean option, maybe it looks odd. So, should we use the cmdline_find_option()
> better?
> 
> +#include <asm/cmdline.h>
> 
>  void __init kexec_reserve_low_1MiB(void)
>  {
> -       if (strstr(boot_command_line, "crashkernel=")) {
> +       char buffer[4];
> +
> +       if (cmdline_find_option(boot_command_line, "crashkernel=",
> +                               buffer, sizeof(buffer))) {
Maybe it is simpler as follow:

+       if (cmdline_find_option(boot_command_line, "crashkernel=",
+                               NULL, 0)) {

Any thoughts?

Thanks
Lianbo
>                 memblock_reserve(0, 1<<20);
>                 pr_info("Reserving the low 1MiB of memory for crashkernel\n");
>         }
> 
> And here, no need to parse the arguments of crashkernel(sometimes, which has a
> complicated syntax), so the size of buffer should be enough. What's your opinion?
> 
> Thanks
> Lianbo
> 
>> Thanks in advance!
>>
>>> +		memblock_reserve(0, 1<<20);
>>> +		pr_info("Reserving the low 1MiB of memory for crashkernel\n");
>>> +	}
>>> +}
>>> +
>>>  int kexec_should_crash(struct task_struct *p)
>>>  {
>>>  	/*
>>> -- 
>>> 2.17.1
>>>
>>>
>>> _______________________________________________
>>> kexec mailing list
>>> kexec@lists.infradead.org
>>> http://lists.infradead.org/mailman/listinfo/kexec
>>>


^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: [PATCH 1/2 v5] x86/kdump: always reserve the low 1MiB when the crashkernel option is specified
  2019-10-25  1:31       ` lijiang
@ 2019-10-25  1:38         ` d.hatayama
  2019-10-25  3:14           ` lijiang
  0 siblings, 1 reply; 10+ messages in thread
From: d.hatayama @ 2019-10-25  1:38 UTC (permalink / raw)
  To: 'lijiang', Simon Horman
  Cc: linux-kernel, jgross, Thomas.Lendacky, bhe, x86, kexec, dhowells,
	mingo, bp, ebiederm, hpa, tglx, dyoung, vgoyal

[-- Attachment #1: Type: text/plain, Size: 9268 bytes --]



> -----Original Message-----
> From: lijiang [mailto:lijiang@redhat.com]
> Sent: Friday, October 25, 2019 10:31 AM
> To: Simon Horman <horms@verge.net.au>; Hatayama, Daisuke/畑山 大輔
> <d.hatayama@fujitsu.com>
> Cc: linux-kernel@vger.kernel.org; jgross@suse.com; Thomas.Lendacky@amd.com;
> bhe@redhat.com; x86@kernel.org; kexec@lists.infradead.org;
> dhowells@redhat.com; mingo@redhat.com; bp@alien8.de; ebiederm@xmission.com;
> hpa@zytor.com; tglx@linutronix.de; dyoung@redhat.com; vgoyal@redhat.com
> Subject: Re: [PATCH 1/2 v5] x86/kdump: always reserve the low 1MiB when the
> crashkernel option is specified
> 
> 在 2019年10月24日 19:33, lijiang 写道:
> > 在 2019年10月24日 18:07, Simon Horman 写道:
> >> Hi Linbo,
> >>
> >> thanks for your patch.
> >>
> >> On Wed, Oct 23, 2019 at 10:19:11PM +0800, Lianbo Jiang wrote:
> >>> Kdump kernel will reuse the first 640k region because the real mode
> >>> trampoline has to work in this area. When the vmcore is dumped, the
> >>> old memory in this area may be accessed, therefore, kernel has to
> >>> copy the contents of the first 640k area to a backup region so that
> >>> kdump kernel can read the old memory from the backup area of the
> >>> first 640k area, which is done in the purgatory().
> >>>
> >>> But, the current handling of copying the first 640k area runs into
> >>> problems when SME is enabled, kernel does not properly copy these
> >>> old memory to the backup area in the purgatory(), thereby, kdump
> >>> kernel reads out the encrypted contents, because the kdump kernel
> >>> must access the first kernel's memory with the encryption bit set
> >>> when SME is enabled in the first kernel. Please refer to this link:
> >>>
> >>> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204793
> >>>
> >>> Finally, it causes the following errors, and the crash tool gets
> >>> invalid pointers when parsing the vmcore.
> >>>
> >>> crash> kmem -s|grep -i invalid
> >>> kmem: dma-kmalloc-512: slab:ffffd77680001c00 invalid
> freepointer:a6086ac099f0c5a4
> >>> kmem: dma-kmalloc-512: slab:ffffd77680001c00 invalid
> freepointer:a6086ac099f0c5a4
> >>> crash>
> >>>
> >>> To avoid the above errors, when the crashkernel option is specified,
> >>> lets reserve the remaining low 1MiB memory(after reserving real mode
> >>> memory) so that the allocated memory does not fall into the low 1MiB
> >>> area, which makes us not to copy the first 640k content to a backup
> >>> region in purgatory(). This indicates that it does not need to be
> >>> included in crash dumps or used for anything except the processor
> >>> trampolines that must live in the low 1MiB.
> >>>
> >>> Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
> >>> ---
> >>> BTW:I also tried to fix the above problem in purgatory(), but there
> >>> are too many restricts in purgatory() context, for example: i can't
> >>> allocate new memory to create the identity mapping page table for
> >>> SME situation.
> >>>
> >>> Currently, there are two places where the first 640k area is needed,
> >>> the first one is in the find_trampoline_placement(), another one is
> >>> in the reserve_real_mode(), and their content doesn't matter.
> >>>
> >>> In addition, also need to clean all the code related to the backup
> >>> region later.
> >>>
> >>>  arch/x86/realmode/init.c |  2 ++
> >>>  include/linux/kexec.h    |  2 ++
> >>>  kernel/kexec_core.c      | 13 +++++++++++++
> >>>  3 files changed, 17 insertions(+)
> >>>
> >>> diff --git a/arch/x86/realmode/init.c b/arch/x86/realmode/init.c
> >>> index 7dce39c8c034..064cc79a015d 100644
> >>> --- a/arch/x86/realmode/init.c
> >>> +++ b/arch/x86/realmode/init.c
> >>> @@ -3,6 +3,7 @@
> >>>  #include <linux/slab.h>
> >>>  #include <linux/memblock.h>
> >>>  #include <linux/mem_encrypt.h>
> >>> +#include <linux/kexec.h>
> >>>
> >>>  #include <asm/set_memory.h>
> >>>  #include <asm/pgtable.h>
> >>> @@ -34,6 +35,7 @@ void __init reserve_real_mode(void)
> >>>
> >>>  	memblock_reserve(mem, size);
> >>>  	set_real_mode_mem(mem);
> >>> +	kexec_reserve_low_1MiB();
> >>>  }
> >>>
> >>>  static void __init setup_real_mode(void)
> >>> diff --git a/include/linux/kexec.h b/include/linux/kexec.h
> >>> index 1776eb2e43a4..30acf1d738bc 100644
> >>> --- a/include/linux/kexec.h
> >>> +++ b/include/linux/kexec.h
> >>> @@ -306,6 +306,7 @@ extern void __crash_kexec(struct pt_regs *);
> >>>  extern void crash_kexec(struct pt_regs *);
> >>>  int kexec_should_crash(struct task_struct *);
> >>>  int kexec_crash_loaded(void);
> >>> +void __init kexec_reserve_low_1MiB(void);
> >>>  void crash_save_cpu(struct pt_regs *regs, int cpu);
> >>>  extern int kimage_crash_copy_vmcoreinfo(struct kimage *image);
> >>>
> >>> @@ -397,6 +398,7 @@ static inline void __crash_kexec(struct pt_regs *regs)
> { }
> >>>  static inline void crash_kexec(struct pt_regs *regs) { }
> >>>  static inline int kexec_should_crash(struct task_struct *p) { return 0; }
> >>>  static inline int kexec_crash_loaded(void) { return 0; }
> >>> +static inline void __init kexec_reserve_low_1MiB(void) { }
> >>>  #define kexec_in_progress false
> >>>  #endif /* CONFIG_KEXEC_CORE */
> >>>
> >>> diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
> >>> index 15d70a90b50d..5bd89f1fee42 100644
> >>> --- a/kernel/kexec_core.c
> >>> +++ b/kernel/kexec_core.c
> >>> @@ -37,6 +37,7 @@
> >>>  #include <linux/compiler.h>
> >>>  #include <linux/hugetlb.h>
> >>>  #include <linux/frame.h>
> >>> +#include <linux/memblock.h>
> >>>
> >>>  #include <asm/page.h>
> >>>  #include <asm/sections.h>
> >>> @@ -70,6 +71,18 @@ struct resource crashk_low_res = {
> >>>  	.desc  = IORES_DESC_CRASH_KERNEL
> >>>  };
> >>>
> >>> +/*
> >>> + * When the crashkernel option is specified, only use the low
> >>> + * 1MiB for the real mode trampoline.
> >>> + */
> >>> +void __init kexec_reserve_low_1MiB(void)
> >>> +{
> >>> +	if (strstr(boot_command_line, "crashkernel=")) {
> >>
> >> Could you comment on the issue of using strstr which
> >> was raised by Hatayama-san in response to an earlier revision
> >> of this patch?
> >>
> >
> > Thank you, Simon and Hatayama-san. Lets talk about it here.
> >
> >> strstr() matches for example,
> ANYEXTRACHARACTERScrashkernel=ANYEXTRACHARACTERS.
> >>
> >> Is it enough to use cmdline_find_option_bool()?
> >>
> >
> > The cmdline_find_option_bool() will find a boolean option, but the
> crashkernel option
> > is not a boolean option, maybe it looks odd. So, should we use the
> cmdline_find_option()
> > better?
> >
> > +#include <asm/cmdline.h>
> >
> >  void __init kexec_reserve_low_1MiB(void)
> >  {
> > -       if (strstr(boot_command_line, "crashkernel=")) {
> > +       char buffer[4];
> > +
> > +       if (cmdline_find_option(boot_command_line, "crashkernel=",
> > +                               buffer, sizeof(buffer))) {
> Maybe it is simpler as follow:
> 
> +       if (cmdline_find_option(boot_command_line, "crashkernel=",
> +                               NULL, 0)) {
> 
> Any thoughts?

I wrote a test kernel module and it works as expected.

static int __init testmod_init(void)
{
        char cmdline1[] = "x y crashkernel z";
        char cmdline2[] = "x y crashkernel=128M z";

        printk("\"1: %d\n",
               cmdline_find_option_bool(cmdline1, "crashkernel"));
        printk("\"2: %d\n",
               cmdline_find_option_bool(cmdline1, "crashkernel="));
        printk("\"3: %d\n",
               cmdline_find_option_bool(cmdline2, "crashkernel"));
        printk("\"4: %d\n",
               cmdline_find_option_bool(cmdline2, "crashkernel="));

        printk("\"5: %d\n",
               cmdline_find_option(cmdline1, "crashkernel", NULL, 0));
        printk("\"6: %d\n",
               cmdline_find_option(cmdline1, "crashkernel=", NULL, 0));
        printk("\"7: %d\n",
               cmdline_find_option(cmdline2, "crashkernel", NULL, 0));
        printk("\"8: %d\n",
               cmdline_find_option(cmdline2, "crashkernel=", NULL, 0));

        return 0;
}

# dmesg | tail
[85335.355459] "7: 4
[85335.356923] "8: -1
[85349.763849] "1: 5
[85349.765128] "2: 0
[85349.766159] "3: 0
[85349.767145] "4: 0
[85349.768157] "5: -1
[85349.769259] "6: -1
[85349.770423] "7: 4
[85349.771512] "8: -1

> 
> Thanks
> Lianbo
> >                 memblock_reserve(0, 1<<20);
> >                 pr_info("Reserving the low 1MiB of memory for
> crashkernel\n");
> >         }
> >
> > And here, no need to parse the arguments of crashkernel(sometimes, which has
> a
> > complicated syntax), so the size of buffer should be enough. What's your
> opinion?
> >
> > Thanks
> > Lianbo
> >
> >> Thanks in advance!
> >>
> >>> +		memblock_reserve(0, 1<<20);
> >>> +		pr_info("Reserving the low 1MiB of memory for
> crashkernel\n");
> >>> +	}
> >>> +}
> >>> +
> >>>  int kexec_should_crash(struct task_struct *p)
> >>>  {
> >>>  	/*
> >>> --
> >>> 2.17.1
> >>>
> >>>
> >>> _______________________________________________
> >>> kexec mailing list
> >>> kexec@lists.infradead.org
> >>> http://lists.infradead.org/mailman/listinfo/kexec
> >>>


[-- Attachment #2: testmod.c --]
[-- Type: text/plain, Size: 5863 bytes --]

#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/string.h>
#include <linux/ctype.h>
#include <asm/setup.h>

MODULE_AUTHOR("FUJITSU LIMITED");
MODULE_LICENSE("GPL v2");

static inline int myisspace(u8 c)
{
	return c <= ' ';        /* Close enough approximation */
}

/**
 * Find a boolean option (like quiet,noapic,nosmp....)
 *
 * @cmdline: the cmdline string
 * @option: option string to look for
 *
 * Returns the position of that @option (starts counting with 1)
 * or 0 on not found.  @option will only be found if it is found
 * as an entire word in @cmdline.  For instance, if @option="car"
 * then a cmdline which contains "cart" will not match.
 */
static int
__cmdline_find_option_bool(const char *cmdline, int max_cmdline_size,
			   const char *option)
{
	char c;
	int pos = 0, wstart = 0;
	const char *opptr = NULL;
	enum {
		st_wordstart = 0,       /* Start of word/after whitespace */
		st_wordcmp,     /* Comparing this word */
		st_wordskip,    /* Miscompare, skip */
	} state = st_wordstart;

	if (!cmdline)
		return -1;      /* No command line */

/*
 * This 'pos' check ensures we do not overrun
 * a non-NULL-terminated 'cmdline'
 */
	while (pos < max_cmdline_size) {
		c = *(char *)cmdline++;
		pos++;

		switch (state) {
		case st_wordstart:
			if (!c)
				return 0;
			else if (myisspace(c))
				break;

			state = st_wordcmp;
			opptr = option;
			wstart = pos;
			/* fall through */

		case st_wordcmp:
			if (!*opptr) {
				/*
				 * We matched all the way to the end of the
				 * option we were looking for.  If the
				 * command-line has a space _or_ ends, then
				 * we matched!
				 */
				if (!c || myisspace(c))
					return wstart;
				/*
				 * We hit the end of the option, but _not_
				 * the end of a word on the cmdline.  Not
				 * a match.
				 */
			} else if (!c) {
				/*
				 * Hit the NULL terminator on the end of
				 * cmdline.
				 */
				return 0;
			} else if (c == *opptr++) {
				/*
				 * We are currently matching, so continue
				 * to the next character on the cmdline.
				 */
				break;
			}
			state = st_wordskip;
			/* fall through */

		case st_wordskip:
			if (!c)
				return 0;
			else if (myisspace(c))
				state = st_wordstart;
			break;
		}
	}

	return 0;       /* Buffer overrun */
}

/*
 * Find a non-boolean option (i.e. option=argument). In accordance with
 * standard Linux practice, if this option is repeated, this returns the
 * last instance on the command line.
 *
 * @cmdline: the cmdline string
 * @max_cmdline_size: the maximum size of cmdline
 * @option: option string to look for
 * @buffer: memory buffer to return the option argument
 * @bufsize: size of the supplied memory buffer
 *
 * Returns the length of the argument (regardless of if it was
 * truncated to fit in the buffer), or -1 on not found.
 */
static int
__cmdline_find_option(const char *cmdline, int max_cmdline_size,
		      const char *option, char *buffer, int bufsize)
{
	char c;
	int pos = 0, len = -1;
	const char *opptr = NULL;
	char *bufptr = buffer;
	enum {
		st_wordstart = 0,       /* Start of word/after whitespace */
		st_wordcmp,     /* Comparing this word */
		st_wordskip,    /* Miscompare, skip */
		st_bufcpy,      /* Copying this to buffer */
	} state = st_wordstart;

	if (!cmdline)
		return -1;      /* No command line */
	
/*
 * This 'pos' check ensures we do not overrun
 * a non-NULL-terminated 'cmdline'
 */
	while (pos++ < max_cmdline_size) {
		c = *(char *)cmdline++;
		if (!c)
			break;

		switch (state) {
		case st_wordstart:
			if (myisspace(c))
				break;

			state = st_wordcmp;
			opptr = option;
			/* fall through */

		case st_wordcmp:
			if ((c == '=') && !*opptr) {
				/*
				 * We matched all the way to the end of the
				 * option we were looking for, prepare to
				 * copy the argument.
				 */
				len = 0;
				bufptr = buffer;
				state = st_bufcpy;
				break;
			} else if (c == *opptr++) {
				/*
				 * We are currently matching, so continue
				 * to the next character on the cmdline.
				 */
				break;
			}
			state = st_wordskip;
			/* fall through */

		case st_wordskip:
			if (myisspace(c))
				state = st_wordstart;
			break;

		case st_bufcpy:
			if (myisspace(c)) {
				state = st_wordstart;
			} else {
				/*
				 * Increment len, but don't overrun the
				 * supplied buffer and leave room for the
				 * NULL terminator.
				 */
				if (++len < bufsize)
					*bufptr++ = c;
			}
			break;
		}
	}

	if (bufsize)
		*bufptr = '\0';

	return len;
}

int cmdline_find_option_bool(const char *cmdline, const char *option)
{
	return __cmdline_find_option_bool(cmdline, COMMAND_LINE_SIZE, option);
}

int cmdline_find_option(const char *cmdline, const char *option, char *buffer,
			int bufsize)
{
	return __cmdline_find_option(cmdline, COMMAND_LINE_SIZE, option,
				     buffer, bufsize);
}

static int __init testmod_init(void)
{
        char cmdline1[] = "x y crashkernel z";
        char cmdline2[] = "x y crashkernel=128M z";

        printk("\"1: %d\n",
               cmdline_find_option_bool(cmdline1, "crashkernel"));
        printk("\"2: %d\n",
               cmdline_find_option_bool(cmdline1, "crashkernel="));
        printk("\"3: %d\n",
               cmdline_find_option_bool(cmdline2, "crashkernel"));
        printk("\"4: %d\n",
               cmdline_find_option_bool(cmdline2, "crashkernel="));

        printk("\"5: %d\n",
               cmdline_find_option(cmdline1, "crashkernel", NULL, 0));
        printk("\"6: %d\n",
               cmdline_find_option(cmdline1, "crashkernel=", NULL, 0));
        printk("\"7: %d\n",
               cmdline_find_option(cmdline2, "crashkernel", NULL, 0));
        printk("\"8: %d\n",
               cmdline_find_option(cmdline2, "crashkernel=", NULL, 0));

        return 0;
}

static void __exit testmod_exit(void)
{
}

module_init(testmod_init);
module_exit(testmod_exit);

[-- Attachment #3: Makefile --]
[-- Type: application/octet-stream, Size: 272 bytes --]

obj-m += testmod.o

default:
	make -C /lib/modules/$(shell uname -r)/build M=$(shell pwd) V=1 modules

install:
	make -C /lib/modules/$(shell uname -r)/build M=$(shell pwd) V=1 modules_install

clean:
	rm -rf .tmp_versions Module.symvers *.o *.ko *fefpcl_stub.mod.[co] *~

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2 v5] x86/kdump: always reserve the low 1MiB when the crashkernel option is specified
  2019-10-25  1:38         ` d.hatayama
@ 2019-10-25  3:14           ` lijiang
  2019-10-25  3:39             ` Eric W. Biederman
  0 siblings, 1 reply; 10+ messages in thread
From: lijiang @ 2019-10-25  3:14 UTC (permalink / raw)
  To: d.hatayama, Simon Horman
  Cc: linux-kernel, jgross, Thomas.Lendacky, bhe, x86, kexec, dhowells,
	mingo, bp, ebiederm, hpa, tglx, dyoung, vgoyal

在 2019年10月25日 09:38, d.hatayama@fujitsu.com 写道:> 
> 
>> -----Original Message-----
>> From: lijiang [mailto:lijiang@redhat.com]
>> Sent: Friday, October 25, 2019 10:31 AM
>> To: Simon Horman <horms@verge.net.au>; Hatayama, Daisuke/畑山 大輔
>> <d.hatayama@fujitsu.com>
>> Cc: linux-kernel@vger.kernel.org; jgross@suse.com; Thomas.Lendacky@amd.com;
>> bhe@redhat.com; x86@kernel.org; kexec@lists.infradead.org;
>> dhowells@redhat.com; mingo@redhat.com; bp@alien8.de; ebiederm@xmission.com;
>> hpa@zytor.com; tglx@linutronix.de; dyoung@redhat.com; vgoyal@redhat.com
>> Subject: Re: [PATCH 1/2 v5] x86/kdump: always reserve the low 1MiB when the
>> crashkernel option is specified
>>
>> 在 2019年10月24日 19:33, lijiang 写道:
>>> 在 2019年10月24日 18:07, Simon Horman 写道:
>>>> Hi Linbo,
>>>>
>>>> thanks for your patch.
>>>>
>>>> On Wed, Oct 23, 2019 at 10:19:11PM +0800, Lianbo Jiang wrote:
>>>>> Kdump kernel will reuse the first 640k region because the real mode
>>>>> trampoline has to work in this area. When the vmcore is dumped, the
>>>>> old memory in this area may be accessed, therefore, kernel has to
>>>>> copy the contents of the first 640k area to a backup region so that
>>>>> kdump kernel can read the old memory from the backup area of the
>>>>> first 640k area, which is done in the purgatory().
>>>>>
>>>>> But, the current handling of copying the first 640k area runs into
>>>>> problems when SME is enabled, kernel does not properly copy these
>>>>> old memory to the backup area in the purgatory(), thereby, kdump
>>>>> kernel reads out the encrypted contents, because the kdump kernel
>>>>> must access the first kernel's memory with the encryption bit set
>>>>> when SME is enabled in the first kernel. Please refer to this link:
>>>>>
>>>>> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204793
>>>>>
>>>>> Finally, it causes the following errors, and the crash tool gets
>>>>> invalid pointers when parsing the vmcore.
>>>>>
>>>>> crash> kmem -s|grep -i invalid
>>>>> kmem: dma-kmalloc-512: slab:ffffd77680001c00 invalid
>> freepointer:a6086ac099f0c5a4
>>>>> kmem: dma-kmalloc-512: slab:ffffd77680001c00 invalid
>> freepointer:a6086ac099f0c5a4
>>>>> crash>
>>>>>
>>>>> To avoid the above errors, when the crashkernel option is specified,
>>>>> lets reserve the remaining low 1MiB memory(after reserving real mode
>>>>> memory) so that the allocated memory does not fall into the low 1MiB
>>>>> area, which makes us not to copy the first 640k content to a backup
>>>>> region in purgatory(). This indicates that it does not need to be
>>>>> included in crash dumps or used for anything except the processor
>>>>> trampolines that must live in the low 1MiB.
>>>>>
>>>>> Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
>>>>> ---
>>>>> BTW:I also tried to fix the above problem in purgatory(), but there
>>>>> are too many restricts in purgatory() context, for example: i can't
>>>>> allocate new memory to create the identity mapping page table for
>>>>> SME situation.
>>>>>
>>>>> Currently, there are two places where the first 640k area is needed,
>>>>> the first one is in the find_trampoline_placement(), another one is
>>>>> in the reserve_real_mode(), and their content doesn't matter.
>>>>>
>>>>> In addition, also need to clean all the code related to the backup
>>>>> region later.
>>>>>
>>>>>  arch/x86/realmode/init.c |  2 ++
>>>>>  include/linux/kexec.h    |  2 ++
>>>>>  kernel/kexec_core.c      | 13 +++++++++++++
>>>>>  3 files changed, 17 insertions(+)
>>>>>
>>>>> diff --git a/arch/x86/realmode/init.c b/arch/x86/realmode/init.c
>>>>> index 7dce39c8c034..064cc79a015d 100644
>>>>> --- a/arch/x86/realmode/init.c
>>>>> +++ b/arch/x86/realmode/init.c
>>>>> @@ -3,6 +3,7 @@
>>>>>  #include <linux/slab.h>
>>>>>  #include <linux/memblock.h>
>>>>>  #include <linux/mem_encrypt.h>
>>>>> +#include <linux/kexec.h>
>>>>>
>>>>>  #include <asm/set_memory.h>
>>>>>  #include <asm/pgtable.h>
>>>>> @@ -34,6 +35,7 @@ void __init reserve_real_mode(void)
>>>>>
>>>>>  	memblock_reserve(mem, size);
>>>>>  	set_real_mode_mem(mem);
>>>>> +	kexec_reserve_low_1MiB();
>>>>>  }
>>>>>
>>>>>  static void __init setup_real_mode(void)
>>>>> diff --git a/include/linux/kexec.h b/include/linux/kexec.h
>>>>> index 1776eb2e43a4..30acf1d738bc 100644
>>>>> --- a/include/linux/kexec.h
>>>>> +++ b/include/linux/kexec.h
>>>>> @@ -306,6 +306,7 @@ extern void __crash_kexec(struct pt_regs *);
>>>>>  extern void crash_kexec(struct pt_regs *);
>>>>>  int kexec_should_crash(struct task_struct *);
>>>>>  int kexec_crash_loaded(void);
>>>>> +void __init kexec_reserve_low_1MiB(void);
>>>>>  void crash_save_cpu(struct pt_regs *regs, int cpu);
>>>>>  extern int kimage_crash_copy_vmcoreinfo(struct kimage *image);
>>>>>
>>>>> @@ -397,6 +398,7 @@ static inline void __crash_kexec(struct pt_regs *regs)
>> { }
>>>>>  static inline void crash_kexec(struct pt_regs *regs) { }
>>>>>  static inline int kexec_should_crash(struct task_struct *p) { return 0; }
>>>>>  static inline int kexec_crash_loaded(void) { return 0; }
>>>>> +static inline void __init kexec_reserve_low_1MiB(void) { }
>>>>>  #define kexec_in_progress false
>>>>>  #endif /* CONFIG_KEXEC_CORE */
>>>>>
>>>>> diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
>>>>> index 15d70a90b50d..5bd89f1fee42 100644
>>>>> --- a/kernel/kexec_core.c
>>>>> +++ b/kernel/kexec_core.c
>>>>> @@ -37,6 +37,7 @@
>>>>>  #include <linux/compiler.h>
>>>>>  #include <linux/hugetlb.h>
>>>>>  #include <linux/frame.h>
>>>>> +#include <linux/memblock.h>
>>>>>
>>>>>  #include <asm/page.h>
>>>>>  #include <asm/sections.h>
>>>>> @@ -70,6 +71,18 @@ struct resource crashk_low_res = {
>>>>>  	.desc  = IORES_DESC_CRASH_KERNEL
>>>>>  };
>>>>>
>>>>> +/*
>>>>> + * When the crashkernel option is specified, only use the low
>>>>> + * 1MiB for the real mode trampoline.
>>>>> + */
>>>>> +void __init kexec_reserve_low_1MiB(void)
>>>>> +{
>>>>> +	if (strstr(boot_command_line, "crashkernel=")) {
>>>>
>>>> Could you comment on the issue of using strstr which
>>>> was raised by Hatayama-san in response to an earlier revision
>>>> of this patch?
>>>>
>>>
>>> Thank you, Simon and Hatayama-san. Lets talk about it here.
>>>
>>>> strstr() matches for example,
>> ANYEXTRACHARACTERScrashkernel=ANYEXTRACHARACTERS.
>>>>
>>>> Is it enough to use cmdline_find_option_bool()?
>>>>
>>>
>>> The cmdline_find_option_bool() will find a boolean option, but the
>> crashkernel option
>>> is not a boolean option, maybe it looks odd. So, should we use the
>> cmdline_find_option()
>>> better?
>>>
>>> +#include <asm/cmdline.h>
>>>
>>>  void __init kexec_reserve_low_1MiB(void)
>>>  {
>>> -       if (strstr(boot_command_line, "crashkernel=")) {
>>> +       char buffer[4];
>>> +
>>> +       if (cmdline_find_option(boot_command_line, "crashkernel=",
>>> +                               buffer, sizeof(buffer))) {
>> Maybe it is simpler as follow:
>>
>> +       if (cmdline_find_option(boot_command_line, "crashkernel=",
>> +                               NULL, 0)) {
>>
>> Any thoughts?
> 
> I wrote a test kernel module and it works as expected.
> 
Thank you, Hatayama. This is a very good test case.

> static int __init testmod_init(void)
> {
>         char cmdline1[] = "x y crashkernel z";
>         char cmdline2[] = "x y crashkernel=128M z";
> 
>         printk("\"1: %d\n",
>                cmdline_find_option_bool(cmdline1, "crashkernel"));
>         printk("\"2: %d\n",
>                cmdline_find_option_bool(cmdline1, "crashkernel="));
>         printk("\"3: %d\n",
>                cmdline_find_option_bool(cmdline2, "crashkernel"));
>         printk("\"4: %d\n",
>                cmdline_find_option_bool(cmdline2, "crashkernel="));
> 
>         printk("\"5: %d\n",
>                cmdline_find_option(cmdline1, "crashkernel", NULL, 0));
>         printk("\"6: %d\n",
>                cmdline_find_option(cmdline1, "crashkernel=", NULL, 0));
>         printk("\"7: %d\n",
>                cmdline_find_option(cmdline2, "crashkernel", NULL, 0));
>         printk("\"8: %d\n",
>                cmdline_find_option(cmdline2, "crashkernel=", NULL, 0));
> 
>         return 0;
> }
> 
> # dmesg | tail
> [85335.355459] "7: 4
> [85335.356923] "8: -1
> [85349.763849] "1: 5
> [85349.765128] "2: 0
> [85349.766159] "3: 0
> [85349.767145] "4: 0
> [85349.768157] "5: -1
> [85349.769259] "6: -1
> [85349.770423] "7: 4
> [85349.771512] "8: -1
> 
 * Returns the length of the argument (regardless of if it was
 * truncated to fit in the buffer), or -1 on not found.
 */
static int
__cmdline_find_option(const char *cmdline, int max_cmdline_size,
                      const char *option, char *buffer, int bufsize)


According to the above code comment, it should be better like this:

+       if (cmdline_find_option(boot_command_line, "crashkernel",
+                               NULL, 0) > 0) {

After i test, i will post again.

Thanks.
Lianbo

>>
>> Thanks
>> Lianbo
>>>                 memblock_reserve(0, 1<<20);
>>>                 pr_info("Reserving the low 1MiB of memory for
>> crashkernel\n");
>>>         }
>>>
>>> And here, no need to parse the arguments of crashkernel(sometimes, which has
>> a
>>> complicated syntax), so the size of buffer should be enough. What's your
>> opinion?
>>>
>>> Thanks
>>> Lianbo
>>>
>>>> Thanks in advance!
>>>>
>>>>> +		memblock_reserve(0, 1<<20);
>>>>> +		pr_info("Reserving the low 1MiB of memory for
>> crashkernel\n");
>>>>> +	}
>>>>> +}
>>>>> +
>>>>>  int kexec_should_crash(struct task_struct *p)
>>>>>  {
>>>>>  	/*
>>>>> --
>>>>> 2.17.1
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> kexec mailing list
>>>>> kexec@lists.infradead.org
>>>>> http://lists.infradead.org/mailman/listinfo/kexec
>>>>>
> 


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2 v5] x86/kdump: always reserve the low 1MiB when the crashkernel option is specified
  2019-10-25  3:14           ` lijiang
@ 2019-10-25  3:39             ` Eric W. Biederman
  2019-10-25  8:52               ` lijiang
  0 siblings, 1 reply; 10+ messages in thread
From: Eric W. Biederman @ 2019-10-25  3:39 UTC (permalink / raw)
  To: lijiang
  Cc: d.hatayama, Simon Horman, linux-kernel, jgross, Thomas.Lendacky,
	bhe, x86, kexec, dhowells, mingo, bp, hpa, tglx, dyoung, vgoyal

lijiang <lijiang@redhat.com> writes:

>  * Returns the length of the argument (regardless of if it was
>  * truncated to fit in the buffer), or -1 on not found.
>  */
> static int
> __cmdline_find_option(const char *cmdline, int max_cmdline_size,
>                       const char *option, char *buffer, int bufsize)
>
>
> According to the above code comment, it should be better like this:
>
> +       if (cmdline_find_option(boot_command_line, "crashkernel",
> +                               NULL, 0) > 0) {
>
> After i test, i will post again.
>

This seems reasonable as we are dealing with x86 only code.

It wound be nice if someone could generalize cmdline_find_option to be
arch independent so that crash_core.c:parse_crashkernel could use it.
I don't think for this patchset, but it looks like an overdue cleanup.

We run the risk with parse_crashkernel using strstr and this using
another algorithm of having different kernel command line parsers
giving different results and disagreeing if "crashkernel=" is present
or not on the kernel command line.

Eric



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2 v5] x86/kdump: always reserve the low 1MiB when the crashkernel option is specified
  2019-10-25  3:39             ` Eric W. Biederman
@ 2019-10-25  8:52               ` lijiang
  0 siblings, 0 replies; 10+ messages in thread
From: lijiang @ 2019-10-25  8:52 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: d.hatayama, Simon Horman, linux-kernel, jgross, Thomas.Lendacky,
	bhe, x86, kexec, dhowells, mingo, bp, hpa, tglx, dyoung, vgoyal

在 2019年10月25日 11:39, Eric W. Biederman 写道:
> lijiang <lijiang@redhat.com> writes:
> 
>>  * Returns the length of the argument (regardless of if it was
>>  * truncated to fit in the buffer), or -1 on not found.
>>  */
>> static int
>> __cmdline_find_option(const char *cmdline, int max_cmdline_size,
>>                       const char *option, char *buffer, int bufsize)
>>
>>
>> According to the above code comment, it should be better like this:
>>
>> +       if (cmdline_find_option(boot_command_line, "crashkernel",
>> +                               NULL, 0) > 0) {
>>
>> After i test, i will post again.
>>
> 
> This seems reasonable as we are dealing with x86 only code.
> 
When we compile the non-x86 kernel, that could cause the the compile error
because the cmdline_find_option() won't be defined on non-x86 architecture.
So i will define a weak function in the kernel/kexec_core.c like this:
+
+void __init __weak kexec_reserve_low_1MiB(void)
+{}

and implement the kexec_reserve_low_1MiB() in the arch/x86/kernel/machine_kexec_64.c.

+/*
+ * When the crashkernel option is specified, only use the low
+ * 1MiB for the real mode trampoline.
+ */
+void __init kexec_reserve_low_1MiB(void)
+{
+       if (cmdline_find_option(boot_command_line, "crashkernel",
+                               NULL, 0) > 0) {
+               memblock_reserve(0, 1<<20);
+               pr_info("Reserving the low 1MiB of memory for crashkernel\n");
+       }
+} 

That will solve the compile error on the non-x86 kernel, and it also works well on
the x86 kernel.

BTW: i pasted the code at the end, please refer to it.

> It wound be nice if someone could generalize cmdline_find_option to be
> arch independent so that crash_core.c:parse_crashkernel could use it.

Good point, that could be done in the future.

> I don't think for this patchset, but it looks like an overdue cleanup.
> 
> We run the risk with parse_crashkernel using strstr and this using
> another algorithm of having different kernel command line parsers
> giving different results and disagreeing if "crashkernel=" is present
> or not on the kernel command line.
> 
Indeed, but sometimes, the crashkernel has a complicated syntax, maybe
that could be a reason.

Thanks.
Lianbo

> Eric
> 
> 

---
 arch/x86/kernel/machine_kexec_64.c | 15 +++++++++++++++
 arch/x86/realmode/init.c           |  2 ++
 include/linux/kexec.h              |  2 ++
 kernel/kexec_core.c                |  3 +++
 4 files changed, 22 insertions(+)

diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
index 5dcd438ad8f2..42d7c15c45f1 100644
--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -17,6 +17,7 @@
 #include <linux/suspend.h>
 #include <linux/vmalloc.h>
 #include <linux/efi.h>
+#include <linux/memblock.h>
 
 #include <asm/init.h>
 #include <asm/pgtable.h>
@@ -27,6 +28,7 @@
 #include <asm/kexec-bzimage64.h>
 #include <asm/setup.h>
 #include <asm/set_memory.h>
+#include <asm/cmdline.h>
 
 #ifdef CONFIG_ACPI
 /*
@@ -687,3 +689,16 @@ void arch_kexec_pre_free_pages(void *vaddr, unsigned int pages)
 	 */
 	set_memory_encrypted((unsigned long)vaddr, pages);
 }
+
+/*
+ * When the crashkernel option is specified, only use the low
+ * 1MiB for the real mode trampoline.
+ */
+void __init kexec_reserve_low_1MiB(void)
+{
+	if (cmdline_find_option(boot_command_line, "crashkernel",
+				NULL, 0) > 0) {
+		memblock_reserve(0, 1<<20);
+		pr_info("Reserving the low 1MiB of memory for crashkernel\n");
+	}
+}
diff --git a/arch/x86/realmode/init.c b/arch/x86/realmode/init.c
index 7dce39c8c034..064cc79a015d 100644
--- a/arch/x86/realmode/init.c
+++ b/arch/x86/realmode/init.c
@@ -3,6 +3,7 @@
 #include <linux/slab.h>
 #include <linux/memblock.h>
 #include <linux/mem_encrypt.h>
+#include <linux/kexec.h>
 
 #include <asm/set_memory.h>
 #include <asm/pgtable.h>
@@ -34,6 +35,7 @@ void __init reserve_real_mode(void)
 
 	memblock_reserve(mem, size);
 	set_real_mode_mem(mem);
+	kexec_reserve_low_1MiB();
 }
 
 static void __init setup_real_mode(void)
diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index 1776eb2e43a4..988bf2de51a7 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -306,6 +306,7 @@ extern void __crash_kexec(struct pt_regs *);
 extern void crash_kexec(struct pt_regs *);
 int kexec_should_crash(struct task_struct *);
 int kexec_crash_loaded(void);
+void __init kexec_reserve_low_1MiB(void);
 void crash_save_cpu(struct pt_regs *regs, int cpu);
 extern int kimage_crash_copy_vmcoreinfo(struct kimage *image);
 
@@ -397,6 +398,7 @@ static inline void __crash_kexec(struct pt_regs *regs) { }
 static inline void crash_kexec(struct pt_regs *regs) { }
 static inline int kexec_should_crash(struct task_struct *p) { return 0; }
 static inline int kexec_crash_loaded(void) { return 0; }
+static inline void __init kexec_reserve_low_1MiB(void) { }
 #define kexec_in_progress false
 #endif /* CONFIG_KEXEC_CORE */
 
diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
index 15d70a90b50d..8856047bcdc8 100644
--- a/kernel/kexec_core.c
+++ b/kernel/kexec_core.c
@@ -1213,3 +1213,6 @@ void __weak arch_kexec_protect_crashkres(void)
 
 void __weak arch_kexec_unprotect_crashkres(void)
 {}
+
+void __init __weak kexec_reserve_low_1MiB(void)
+{}
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2019-10-25  8:53 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-23 14:19 [PATCH 0/2 v5] x86/kdump: Fix 'kmem -s' reported an invalid freepointer when SME was active Lianbo Jiang
2019-10-23 14:19 ` [PATCH 1/2 v5] x86/kdump: always reserve the low 1MiB when the crashkernel option is specified Lianbo Jiang
2019-10-24 10:07   ` Simon Horman
2019-10-24 11:33     ` lijiang
2019-10-25  1:31       ` lijiang
2019-10-25  1:38         ` d.hatayama
2019-10-25  3:14           ` lijiang
2019-10-25  3:39             ` Eric W. Biederman
2019-10-25  8:52               ` lijiang
2019-10-23 14:19 ` [PATCH 2/2 v5] x86/kdump: clean up all the code related to the backup region Lianbo Jiang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).