All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH 1/3 v4] x86/kdump: always reserve the low 1MiB when the crashkernel option is specified
       [not found] <mailman.5359.1571746554.2486.kexec@lists.infradead.org>
@ 2019-10-22 13:38 ` Dave Anderson
  0 siblings, 0 replies; 22+ messages in thread
From: Dave Anderson @ 2019-10-22 13:38 UTC (permalink / raw)
  To: kexec



----- Original Message -----

> > 
> > [root linux]$ crash vmlinux
> > /var/crash/127.0.0.1-2019-09-19-08\:31\:27/vmcore
> > WARNING: kernel relocated [240MB]: patching 97110 gdb minimal_symbol values
> > 
> >       KERNEL: /var/crash/127.0.0.1-2019-09-19-08:31:27/vmlinux
> >     DUMPFILE: /var/crash/127.0.0.1-2019-09-19-08:31:27/vmcore  [PARTIAL DUMP]
> >         CPUS: 128
> >         DATE: Thu Sep 19 08:31:18 2019
> >       UPTIME: 00:01:21
> > LOAD AVERAGE: 0.16, 0.07, 0.02
> >        TASKS: 1343
> >     NODENAME: amd-ethanol
> >      RELEASE: 5.3.0-rc7+
> >      VERSION: #4 SMP Thu Sep 19 08:14:00 EDT 2019
> >      MACHINE: x86_64  (2195 Mhz)
> >       MEMORY: 127.9 GB
> >        PANIC: "Kernel panic - not syncing: sysrq triggered crash"
> >          PID: 9789
> >      COMMAND: "bash"
> >         TASK: "ffff89711894ae80  [THREAD_INFO: ffff89711894ae80]"
> >          CPU: 83
> >        STATE: TASK_RUNNING (PANIC)
> > 
> > crash> kmem -s|grep -i invalid
> > kmem: dma-kmalloc-512: slab:ffffd77680001c00 invalid freepointer:a6086ac099f0c5a4
> > kmem: dma-kmalloc-512: slab:ffffd77680001c00 invalid freepointer:a6086ac099f0c5a4
> > crash>
> 
> I fail to see what that's trying to tell me? You have invalid pointers?

Correct, because the pointer values are encrypted.  The command is walking through the
singly-linked list of free objects in a slab from the dma-kmalloc-512 slab cache.  The
slab memory had been allocated from low memory, and because of the  problem at hand,
it was was copied to the vmcore in its encrypted state.

Dave 


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/3 v4] x86/kdump: always reserve the low 1MiB when the crashkernel option is specified
  2019-10-24  8:13         ` d.hatayama
@ 2019-10-24 11:24           ` lijiang
  -1 siblings, 0 replies; 22+ messages in thread
From: lijiang @ 2019-10-24 11:24 UTC (permalink / raw)
  To: d.hatayama
  Cc: linux-kernel, tglx, mingo, hpa, x86, bhe, dyoung, jgross,
	dhowells, Thomas.Lendacky, ebiederm, vgoyal, kexec,
	Borislav Petkov

在 2019年10月24日 16:13, d.hatayama@fujitsu.com 写道:
> I don't find the corresponding patch in the v5 patchset, so I comment here.
> 
Thanks for your comment.

>> -----Original Message-----
>> From: linux-kernel-owner@vger.kernel.org
>> [mailto:linux-kernel-owner@vger.kernel.org] On Behalf Of lijiang
>> Sent: Wednesday, October 23, 2019 2:35 PM
>> To: Borislav Petkov <bp@alien8.de>
>> Cc: linux-kernel@vger.kernel.org; tglx@linutronix.de; mingo@redhat.com;
>> hpa@zytor.com; x86@kernel.org; bhe@redhat.com; dyoung@redhat.com;
>> jgross@suse.com; dhowells@redhat.com; Thomas.Lendacky@amd.com;
>> ebiederm@xmission.com; vgoyal@redhat.com; kexec@lists.infradead.org
>> Subject: Re: [PATCH 1/3 v4] x86/kdump: always reserve the low 1MiB when the
>> crashkernel option is specified
>>
>> 在 2019年10月22日 16:30, Borislav Petkov 写道:
>>> This ifdeffery needs to be a function in kernel/kexec_core.c which is
>>> called by reserve_real_mode(), instead.
>>
>> Would you mind if i improve this patch as follow? Thanks.
>>
>> From 5804abec62279585f374d78ace1250505c44c6b7 Mon Sep 17 00:00:00 2001
>> From: Lianbo Jiang <lijiang@redhat.com>
>> Date: Wed, 23 Oct 2019 11:27:04 +0800
>> Subject: [PATCH] x86/kdump: always reserve the low 1MiB when the crashkernel
>>  option is specified
>>
>> Kdump kernel will reuse the first 640k region because the real mode
>> trampoline has to work in this area. When the vmcore is dumped, the
>> old memory in this area may be accessed, therefore, kernel has to
>> copy the contents of the first 640k area to a backup region so that
>> kdump kernel can read the old memory from the backup area of the
>> first 640k area, which is done in the purgatory().
>>
>> But, the current handling of copying the first 640k area runs into
>> problems when SME is enabled, kernel does not properly copy these
>> old memory to the backup area in the purgatory(), thereby, kdump
>> kernel reads out the encrypted contents, because the kdump kernel
>> must access the first kernel's memory with the encryption bit set
>> when SME is enabled in the first kernel. Please refer to this link:
>>
>> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204793
>>
>> Finally, it causes the following errors, and the crash tool gets
>> invalid pointers when parsing the vmcore.
>>
>> crash> kmem -s|grep -i invalid
>> kmem: dma-kmalloc-512: slab:ffffd77680001c00 invalid
>> freepointer:a6086ac099f0c5a4
>> kmem: dma-kmalloc-512: slab:ffffd77680001c00 invalid
>> freepointer:a6086ac099f0c5a4
>> crash>
>>
>> To avoid the above errors, when the crashkernel option is specified,
>> lets reserve the remaining low 1MiB memory(after reserving real mode
>> memory) so that the allocated memory does not fall into the low 1MiB
>> area, which makes us not to copy the first 640k content to a backup
>> region in purgatory(). This indicates that it does not need to be
>> included in crash dumps or used for anything except the processor
>> trampolines that must live in the low 1MiB.
>>
>> Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
>> ---
>> BTW:I also tried to fix the above problem in purgatory(), but there
>> are too many restricts in purgatory() context, for example: i can't
>> allocate new memory to create the identity mapping page table for
>> SME situation.
>>
>> Currently, there are two places where the first 640k area is needed,
>> the first one is in the find_trampoline_placement(), another one is
>> in the reserve_real_mode(), and their content doesn't matter.
>>
>> In addition, also need to clean all the code related to the backup
>> region later.
>>
>>  arch/x86/realmode/init.c |  2 ++
>>  include/linux/kexec.h    |  2 ++
>>  kernel/kexec_core.c      | 13 +++++++++++++
>>  3 files changed, 17 insertions(+)
>>
>> diff --git a/arch/x86/realmode/init.c b/arch/x86/realmode/init.c
>> index 7dce39c8c034..064cc79a015d 100644
>> --- a/arch/x86/realmode/init.c
>> +++ b/arch/x86/realmode/init.c
>> @@ -3,6 +3,7 @@
>>  #include <linux/slab.h>
>>  #include <linux/memblock.h>
>>  #include <linux/mem_encrypt.h>
>> +#include <linux/kexec.h>
>>
>>  #include <asm/set_memory.h>
>>  #include <asm/pgtable.h>
>> @@ -34,6 +35,7 @@ void __init reserve_real_mode(void)
>>
>>  	memblock_reserve(mem, size);
>>  	set_real_mode_mem(mem);
>> +	kexec_reserve_low_1MiB();
>>  }
>>
>>  static void __init setup_real_mode(void)
>> diff --git a/include/linux/kexec.h b/include/linux/kexec.h
>> index 1776eb2e43a4..30acf1d738bc 100644
>> --- a/include/linux/kexec.h
>> +++ b/include/linux/kexec.h
>> @@ -306,6 +306,7 @@ extern void __crash_kexec(struct pt_regs *);
>>  extern void crash_kexec(struct pt_regs *);
>>  int kexec_should_crash(struct task_struct *);
>>  int kexec_crash_loaded(void);
>> +void kexec_reserve_low_1MiB(void);
>>  void crash_save_cpu(struct pt_regs *regs, int cpu);
>>  extern int kimage_crash_copy_vmcoreinfo(struct kimage *image);
>>
>> @@ -397,6 +398,7 @@ static inline void __crash_kexec(struct pt_regs *regs) { }
>>  static inline void crash_kexec(struct pt_regs *regs) { }
>>  static inline int kexec_should_crash(struct task_struct *p) { return 0; }
>>  static inline int kexec_crash_loaded(void) { return 0; }
>> +static inline void kexec_reserve_low_1MiB(void) { }
>>  #define kexec_in_progress false
>>  #endif /* CONFIG_KEXEC_CORE */
>>
>> diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
>> index 15d70a90b50d..5bd89f1fee42 100644
>> --- a/kernel/kexec_core.c
>> +++ b/kernel/kexec_core.c
>> @@ -37,6 +37,7 @@
>>  #include <linux/compiler.h>
>>  #include <linux/hugetlb.h>
>>  #include <linux/frame.h>
>> +#include <linux/memblock.h>
>>
>>  #include <asm/page.h>
>>  #include <asm/sections.h>
>> @@ -70,6 +71,18 @@ struct resource crashk_low_res = {
>>  	.desc  = IORES_DESC_CRASH_KERNEL
>>  };
>>
>> +/*
>> + * When the crashkernel option is specified, only use the low
>> + * 1MiB for the real mode trampoline.
>> + */
>> +void kexec_reserve_low_1MiB(void)
>> +{
>> +	if (strstr(boot_command_line, "crashkernel=")) {
> 
> strstr() matches for example, ANYEXTRACHARACTERScrashkernel=ANYEXTRACHARACTERS.
> 
> Is it enough to use cmdline_find_option_bool()?
> 
The cmdline_find_option_bool() will find a boolean option, but the crashkernel option
is not a boolean option, maybe it looks odd. So, should we use the cmdline_find_option()
better?

+#include <asm/cmdline.h>

 void __init kexec_reserve_low_1MiB(void)
 {
-       if (strstr(boot_command_line, "crashkernel=")) {
+       char buffer[4];
+
+       if (cmdline_find_option(boot_command_line, "crashkernel=",
+                               buffer, sizeof(buffer))) {
                memblock_reserve(0, 1<<20);
                pr_info("Reserving the low 1MiB of memory for crashkernel\n");
        }

And here, no need to parse the arguments of crashkernel(sometimes, which has a
complicated syntax), so the size of buffer should be enough. What's your opinion?

Thanks
Lianbo
 
>> +		memblock_reserve(0, 1<<20);
>> +		pr_info("Reserving the low 1MiB of memory for
>> crashkernel\n");
>> +	}
>> +}
>> +
>>  int kexec_should_crash(struct task_struct *p)
>>  {
>>  	/*
>> --
>> 2.17.1
> 


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/3 v4] x86/kdump: always reserve the low 1MiB when the crashkernel option is specified
@ 2019-10-24 11:24           ` lijiang
  0 siblings, 0 replies; 22+ messages in thread
From: lijiang @ 2019-10-24 11:24 UTC (permalink / raw)
  To: d.hatayama
  Cc: jgross, Thomas.Lendacky, bhe, x86, kexec, linux-kernel, dhowells,
	mingo, Borislav Petkov, ebiederm, hpa, tglx, dyoung, vgoyal

在 2019年10月24日 16:13, d.hatayama@fujitsu.com 写道:
> I don't find the corresponding patch in the v5 patchset, so I comment here.
> 
Thanks for your comment.

>> -----Original Message-----
>> From: linux-kernel-owner@vger.kernel.org
>> [mailto:linux-kernel-owner@vger.kernel.org] On Behalf Of lijiang
>> Sent: Wednesday, October 23, 2019 2:35 PM
>> To: Borislav Petkov <bp@alien8.de>
>> Cc: linux-kernel@vger.kernel.org; tglx@linutronix.de; mingo@redhat.com;
>> hpa@zytor.com; x86@kernel.org; bhe@redhat.com; dyoung@redhat.com;
>> jgross@suse.com; dhowells@redhat.com; Thomas.Lendacky@amd.com;
>> ebiederm@xmission.com; vgoyal@redhat.com; kexec@lists.infradead.org
>> Subject: Re: [PATCH 1/3 v4] x86/kdump: always reserve the low 1MiB when the
>> crashkernel option is specified
>>
>> 在 2019年10月22日 16:30, Borislav Petkov 写道:
>>> This ifdeffery needs to be a function in kernel/kexec_core.c which is
>>> called by reserve_real_mode(), instead.
>>
>> Would you mind if i improve this patch as follow? Thanks.
>>
>> From 5804abec62279585f374d78ace1250505c44c6b7 Mon Sep 17 00:00:00 2001
>> From: Lianbo Jiang <lijiang@redhat.com>
>> Date: Wed, 23 Oct 2019 11:27:04 +0800
>> Subject: [PATCH] x86/kdump: always reserve the low 1MiB when the crashkernel
>>  option is specified
>>
>> Kdump kernel will reuse the first 640k region because the real mode
>> trampoline has to work in this area. When the vmcore is dumped, the
>> old memory in this area may be accessed, therefore, kernel has to
>> copy the contents of the first 640k area to a backup region so that
>> kdump kernel can read the old memory from the backup area of the
>> first 640k area, which is done in the purgatory().
>>
>> But, the current handling of copying the first 640k area runs into
>> problems when SME is enabled, kernel does not properly copy these
>> old memory to the backup area in the purgatory(), thereby, kdump
>> kernel reads out the encrypted contents, because the kdump kernel
>> must access the first kernel's memory with the encryption bit set
>> when SME is enabled in the first kernel. Please refer to this link:
>>
>> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204793
>>
>> Finally, it causes the following errors, and the crash tool gets
>> invalid pointers when parsing the vmcore.
>>
>> crash> kmem -s|grep -i invalid
>> kmem: dma-kmalloc-512: slab:ffffd77680001c00 invalid
>> freepointer:a6086ac099f0c5a4
>> kmem: dma-kmalloc-512: slab:ffffd77680001c00 invalid
>> freepointer:a6086ac099f0c5a4
>> crash>
>>
>> To avoid the above errors, when the crashkernel option is specified,
>> lets reserve the remaining low 1MiB memory(after reserving real mode
>> memory) so that the allocated memory does not fall into the low 1MiB
>> area, which makes us not to copy the first 640k content to a backup
>> region in purgatory(). This indicates that it does not need to be
>> included in crash dumps or used for anything except the processor
>> trampolines that must live in the low 1MiB.
>>
>> Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
>> ---
>> BTW:I also tried to fix the above problem in purgatory(), but there
>> are too many restricts in purgatory() context, for example: i can't
>> allocate new memory to create the identity mapping page table for
>> SME situation.
>>
>> Currently, there are two places where the first 640k area is needed,
>> the first one is in the find_trampoline_placement(), another one is
>> in the reserve_real_mode(), and their content doesn't matter.
>>
>> In addition, also need to clean all the code related to the backup
>> region later.
>>
>>  arch/x86/realmode/init.c |  2 ++
>>  include/linux/kexec.h    |  2 ++
>>  kernel/kexec_core.c      | 13 +++++++++++++
>>  3 files changed, 17 insertions(+)
>>
>> diff --git a/arch/x86/realmode/init.c b/arch/x86/realmode/init.c
>> index 7dce39c8c034..064cc79a015d 100644
>> --- a/arch/x86/realmode/init.c
>> +++ b/arch/x86/realmode/init.c
>> @@ -3,6 +3,7 @@
>>  #include <linux/slab.h>
>>  #include <linux/memblock.h>
>>  #include <linux/mem_encrypt.h>
>> +#include <linux/kexec.h>
>>
>>  #include <asm/set_memory.h>
>>  #include <asm/pgtable.h>
>> @@ -34,6 +35,7 @@ void __init reserve_real_mode(void)
>>
>>  	memblock_reserve(mem, size);
>>  	set_real_mode_mem(mem);
>> +	kexec_reserve_low_1MiB();
>>  }
>>
>>  static void __init setup_real_mode(void)
>> diff --git a/include/linux/kexec.h b/include/linux/kexec.h
>> index 1776eb2e43a4..30acf1d738bc 100644
>> --- a/include/linux/kexec.h
>> +++ b/include/linux/kexec.h
>> @@ -306,6 +306,7 @@ extern void __crash_kexec(struct pt_regs *);
>>  extern void crash_kexec(struct pt_regs *);
>>  int kexec_should_crash(struct task_struct *);
>>  int kexec_crash_loaded(void);
>> +void kexec_reserve_low_1MiB(void);
>>  void crash_save_cpu(struct pt_regs *regs, int cpu);
>>  extern int kimage_crash_copy_vmcoreinfo(struct kimage *image);
>>
>> @@ -397,6 +398,7 @@ static inline void __crash_kexec(struct pt_regs *regs) { }
>>  static inline void crash_kexec(struct pt_regs *regs) { }
>>  static inline int kexec_should_crash(struct task_struct *p) { return 0; }
>>  static inline int kexec_crash_loaded(void) { return 0; }
>> +static inline void kexec_reserve_low_1MiB(void) { }
>>  #define kexec_in_progress false
>>  #endif /* CONFIG_KEXEC_CORE */
>>
>> diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
>> index 15d70a90b50d..5bd89f1fee42 100644
>> --- a/kernel/kexec_core.c
>> +++ b/kernel/kexec_core.c
>> @@ -37,6 +37,7 @@
>>  #include <linux/compiler.h>
>>  #include <linux/hugetlb.h>
>>  #include <linux/frame.h>
>> +#include <linux/memblock.h>
>>
>>  #include <asm/page.h>
>>  #include <asm/sections.h>
>> @@ -70,6 +71,18 @@ struct resource crashk_low_res = {
>>  	.desc  = IORES_DESC_CRASH_KERNEL
>>  };
>>
>> +/*
>> + * When the crashkernel option is specified, only use the low
>> + * 1MiB for the real mode trampoline.
>> + */
>> +void kexec_reserve_low_1MiB(void)
>> +{
>> +	if (strstr(boot_command_line, "crashkernel=")) {
> 
> strstr() matches for example, ANYEXTRACHARACTERScrashkernel=ANYEXTRACHARACTERS.
> 
> Is it enough to use cmdline_find_option_bool()?
> 
The cmdline_find_option_bool() will find a boolean option, but the crashkernel option
is not a boolean option, maybe it looks odd. So, should we use the cmdline_find_option()
better?

+#include <asm/cmdline.h>

 void __init kexec_reserve_low_1MiB(void)
 {
-       if (strstr(boot_command_line, "crashkernel=")) {
+       char buffer[4];
+
+       if (cmdline_find_option(boot_command_line, "crashkernel=",
+                               buffer, sizeof(buffer))) {
                memblock_reserve(0, 1<<20);
                pr_info("Reserving the low 1MiB of memory for crashkernel\n");
        }

And here, no need to parse the arguments of crashkernel(sometimes, which has a
complicated syntax), so the size of buffer should be enough. What's your opinion?

Thanks
Lianbo
 
>> +		memblock_reserve(0, 1<<20);
>> +		pr_info("Reserving the low 1MiB of memory for
>> crashkernel\n");
>> +	}
>> +}
>> +
>>  int kexec_should_crash(struct task_struct *p)
>>  {
>>  	/*
>> --
>> 2.17.1
> 


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/3 v4] x86/kdump: always reserve the low 1MiB when the crashkernel option is specified
  2019-10-24  8:13         ` d.hatayama
@ 2019-10-24  9:10           ` Borislav Petkov
  -1 siblings, 0 replies; 22+ messages in thread
From: Borislav Petkov @ 2019-10-24  9:10 UTC (permalink / raw)
  To: d.hatayama
  Cc: 'lijiang',
	linux-kernel, tglx, mingo, hpa, x86, bhe, dyoung, jgross,
	dhowells, Thomas.Lendacky, ebiederm, vgoyal, kexec

On Thu, Oct 24, 2019 at 08:13:25AM +0000, d.hatayama@fujitsu.com wrote:
> I don't find the corresponding patch in the v5 patchset, so I comment here.

You don't?

https://lore.kernel.org/lkml/20191023141912.29110-2-lijiang@redhat.com/

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/3 v4] x86/kdump: always reserve the low 1MiB when the crashkernel option is specified
@ 2019-10-24  9:10           ` Borislav Petkov
  0 siblings, 0 replies; 22+ messages in thread
From: Borislav Petkov @ 2019-10-24  9:10 UTC (permalink / raw)
  To: d.hatayama
  Cc: jgross, Thomas.Lendacky, 'lijiang',
	bhe, x86, kexec, linux-kernel, dhowells, mingo, ebiederm, hpa,
	tglx, dyoung, vgoyal

On Thu, Oct 24, 2019 at 08:13:25AM +0000, d.hatayama@fujitsu.com wrote:
> I don't find the corresponding patch in the v5 patchset, so I comment here.

You don't?

https://lore.kernel.org/lkml/20191023141912.29110-2-lijiang@redhat.com/

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 22+ messages in thread

* RE: [PATCH 1/3 v4] x86/kdump: always reserve the low 1MiB when the crashkernel option is specified
  2019-10-23  5:35       ` lijiang
@ 2019-10-24  8:13         ` d.hatayama
  -1 siblings, 0 replies; 22+ messages in thread
From: d.hatayama @ 2019-10-24  8:13 UTC (permalink / raw)
  To: 'lijiang'
  Cc: linux-kernel, tglx, mingo, hpa, x86, bhe, dyoung, jgross,
	dhowells, Thomas.Lendacky, ebiederm, vgoyal, kexec,
	Borislav Petkov

I don't find the corresponding patch in the v5 patchset, so I comment here.

> -----Original Message-----
> From: linux-kernel-owner@vger.kernel.org
> [mailto:linux-kernel-owner@vger.kernel.org] On Behalf Of lijiang
> Sent: Wednesday, October 23, 2019 2:35 PM
> To: Borislav Petkov <bp@alien8.de>
> Cc: linux-kernel@vger.kernel.org; tglx@linutronix.de; mingo@redhat.com;
> hpa@zytor.com; x86@kernel.org; bhe@redhat.com; dyoung@redhat.com;
> jgross@suse.com; dhowells@redhat.com; Thomas.Lendacky@amd.com;
> ebiederm@xmission.com; vgoyal@redhat.com; kexec@lists.infradead.org
> Subject: Re: [PATCH 1/3 v4] x86/kdump: always reserve the low 1MiB when the
> crashkernel option is specified
> 
> 在 2019年10月22日 16:30, Borislav Petkov 写道:
> > This ifdeffery needs to be a function in kernel/kexec_core.c which is
> > called by reserve_real_mode(), instead.
> 
> Would you mind if i improve this patch as follow? Thanks.
> 
> From 5804abec62279585f374d78ace1250505c44c6b7 Mon Sep 17 00:00:00 2001
> From: Lianbo Jiang <lijiang@redhat.com>
> Date: Wed, 23 Oct 2019 11:27:04 +0800
> Subject: [PATCH] x86/kdump: always reserve the low 1MiB when the crashkernel
>  option is specified
> 
> Kdump kernel will reuse the first 640k region because the real mode
> trampoline has to work in this area. When the vmcore is dumped, the
> old memory in this area may be accessed, therefore, kernel has to
> copy the contents of the first 640k area to a backup region so that
> kdump kernel can read the old memory from the backup area of the
> first 640k area, which is done in the purgatory().
> 
> But, the current handling of copying the first 640k area runs into
> problems when SME is enabled, kernel does not properly copy these
> old memory to the backup area in the purgatory(), thereby, kdump
> kernel reads out the encrypted contents, because the kdump kernel
> must access the first kernel's memory with the encryption bit set
> when SME is enabled in the first kernel. Please refer to this link:
> 
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204793
> 
> Finally, it causes the following errors, and the crash tool gets
> invalid pointers when parsing the vmcore.
> 
> crash> kmem -s|grep -i invalid
> kmem: dma-kmalloc-512: slab:ffffd77680001c00 invalid
> freepointer:a6086ac099f0c5a4
> kmem: dma-kmalloc-512: slab:ffffd77680001c00 invalid
> freepointer:a6086ac099f0c5a4
> crash>
> 
> To avoid the above errors, when the crashkernel option is specified,
> lets reserve the remaining low 1MiB memory(after reserving real mode
> memory) so that the allocated memory does not fall into the low 1MiB
> area, which makes us not to copy the first 640k content to a backup
> region in purgatory(). This indicates that it does not need to be
> included in crash dumps or used for anything except the processor
> trampolines that must live in the low 1MiB.
> 
> Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
> ---
> BTW:I also tried to fix the above problem in purgatory(), but there
> are too many restricts in purgatory() context, for example: i can't
> allocate new memory to create the identity mapping page table for
> SME situation.
> 
> Currently, there are two places where the first 640k area is needed,
> the first one is in the find_trampoline_placement(), another one is
> in the reserve_real_mode(), and their content doesn't matter.
> 
> In addition, also need to clean all the code related to the backup
> region later.
> 
>  arch/x86/realmode/init.c |  2 ++
>  include/linux/kexec.h    |  2 ++
>  kernel/kexec_core.c      | 13 +++++++++++++
>  3 files changed, 17 insertions(+)
> 
> diff --git a/arch/x86/realmode/init.c b/arch/x86/realmode/init.c
> index 7dce39c8c034..064cc79a015d 100644
> --- a/arch/x86/realmode/init.c
> +++ b/arch/x86/realmode/init.c
> @@ -3,6 +3,7 @@
>  #include <linux/slab.h>
>  #include <linux/memblock.h>
>  #include <linux/mem_encrypt.h>
> +#include <linux/kexec.h>
> 
>  #include <asm/set_memory.h>
>  #include <asm/pgtable.h>
> @@ -34,6 +35,7 @@ void __init reserve_real_mode(void)
> 
>  	memblock_reserve(mem, size);
>  	set_real_mode_mem(mem);
> +	kexec_reserve_low_1MiB();
>  }
> 
>  static void __init setup_real_mode(void)
> diff --git a/include/linux/kexec.h b/include/linux/kexec.h
> index 1776eb2e43a4..30acf1d738bc 100644
> --- a/include/linux/kexec.h
> +++ b/include/linux/kexec.h
> @@ -306,6 +306,7 @@ extern void __crash_kexec(struct pt_regs *);
>  extern void crash_kexec(struct pt_regs *);
>  int kexec_should_crash(struct task_struct *);
>  int kexec_crash_loaded(void);
> +void kexec_reserve_low_1MiB(void);
>  void crash_save_cpu(struct pt_regs *regs, int cpu);
>  extern int kimage_crash_copy_vmcoreinfo(struct kimage *image);
> 
> @@ -397,6 +398,7 @@ static inline void __crash_kexec(struct pt_regs *regs) { }
>  static inline void crash_kexec(struct pt_regs *regs) { }
>  static inline int kexec_should_crash(struct task_struct *p) { return 0; }
>  static inline int kexec_crash_loaded(void) { return 0; }
> +static inline void kexec_reserve_low_1MiB(void) { }
>  #define kexec_in_progress false
>  #endif /* CONFIG_KEXEC_CORE */
> 
> diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
> index 15d70a90b50d..5bd89f1fee42 100644
> --- a/kernel/kexec_core.c
> +++ b/kernel/kexec_core.c
> @@ -37,6 +37,7 @@
>  #include <linux/compiler.h>
>  #include <linux/hugetlb.h>
>  #include <linux/frame.h>
> +#include <linux/memblock.h>
> 
>  #include <asm/page.h>
>  #include <asm/sections.h>
> @@ -70,6 +71,18 @@ struct resource crashk_low_res = {
>  	.desc  = IORES_DESC_CRASH_KERNEL
>  };
> 
> +/*
> + * When the crashkernel option is specified, only use the low
> + * 1MiB for the real mode trampoline.
> + */
> +void kexec_reserve_low_1MiB(void)
> +{
> +	if (strstr(boot_command_line, "crashkernel=")) {

strstr() matches for example, ANYEXTRACHARACTERScrashkernel=ANYEXTRACHARACTERS.

Is it enough to use cmdline_find_option_bool()?

> +		memblock_reserve(0, 1<<20);
> +		pr_info("Reserving the low 1MiB of memory for
> crashkernel\n");
> +	}
> +}
> +
>  int kexec_should_crash(struct task_struct *p)
>  {
>  	/*
> --
> 2.17.1


^ permalink raw reply	[flat|nested] 22+ messages in thread

* RE: [PATCH 1/3 v4] x86/kdump: always reserve the low 1MiB when the crashkernel option is specified
@ 2019-10-24  8:13         ` d.hatayama
  0 siblings, 0 replies; 22+ messages in thread
From: d.hatayama @ 2019-10-24  8:13 UTC (permalink / raw)
  To: 'lijiang'
  Cc: jgross, Thomas.Lendacky, bhe, x86, kexec, linux-kernel, dhowells,
	mingo, Borislav Petkov, ebiederm, hpa, tglx, dyoung, vgoyal

I don't find the corresponding patch in the v5 patchset, so I comment here.

> -----Original Message-----
> From: linux-kernel-owner@vger.kernel.org
> [mailto:linux-kernel-owner@vger.kernel.org] On Behalf Of lijiang
> Sent: Wednesday, October 23, 2019 2:35 PM
> To: Borislav Petkov <bp@alien8.de>
> Cc: linux-kernel@vger.kernel.org; tglx@linutronix.de; mingo@redhat.com;
> hpa@zytor.com; x86@kernel.org; bhe@redhat.com; dyoung@redhat.com;
> jgross@suse.com; dhowells@redhat.com; Thomas.Lendacky@amd.com;
> ebiederm@xmission.com; vgoyal@redhat.com; kexec@lists.infradead.org
> Subject: Re: [PATCH 1/3 v4] x86/kdump: always reserve the low 1MiB when the
> crashkernel option is specified
> 
> 在 2019年10月22日 16:30, Borislav Petkov 写道:
> > This ifdeffery needs to be a function in kernel/kexec_core.c which is
> > called by reserve_real_mode(), instead.
> 
> Would you mind if i improve this patch as follow? Thanks.
> 
> From 5804abec62279585f374d78ace1250505c44c6b7 Mon Sep 17 00:00:00 2001
> From: Lianbo Jiang <lijiang@redhat.com>
> Date: Wed, 23 Oct 2019 11:27:04 +0800
> Subject: [PATCH] x86/kdump: always reserve the low 1MiB when the crashkernel
>  option is specified
> 
> Kdump kernel will reuse the first 640k region because the real mode
> trampoline has to work in this area. When the vmcore is dumped, the
> old memory in this area may be accessed, therefore, kernel has to
> copy the contents of the first 640k area to a backup region so that
> kdump kernel can read the old memory from the backup area of the
> first 640k area, which is done in the purgatory().
> 
> But, the current handling of copying the first 640k area runs into
> problems when SME is enabled, kernel does not properly copy these
> old memory to the backup area in the purgatory(), thereby, kdump
> kernel reads out the encrypted contents, because the kdump kernel
> must access the first kernel's memory with the encryption bit set
> when SME is enabled in the first kernel. Please refer to this link:
> 
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204793
> 
> Finally, it causes the following errors, and the crash tool gets
> invalid pointers when parsing the vmcore.
> 
> crash> kmem -s|grep -i invalid
> kmem: dma-kmalloc-512: slab:ffffd77680001c00 invalid
> freepointer:a6086ac099f0c5a4
> kmem: dma-kmalloc-512: slab:ffffd77680001c00 invalid
> freepointer:a6086ac099f0c5a4
> crash>
> 
> To avoid the above errors, when the crashkernel option is specified,
> lets reserve the remaining low 1MiB memory(after reserving real mode
> memory) so that the allocated memory does not fall into the low 1MiB
> area, which makes us not to copy the first 640k content to a backup
> region in purgatory(). This indicates that it does not need to be
> included in crash dumps or used for anything except the processor
> trampolines that must live in the low 1MiB.
> 
> Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
> ---
> BTW:I also tried to fix the above problem in purgatory(), but there
> are too many restricts in purgatory() context, for example: i can't
> allocate new memory to create the identity mapping page table for
> SME situation.
> 
> Currently, there are two places where the first 640k area is needed,
> the first one is in the find_trampoline_placement(), another one is
> in the reserve_real_mode(), and their content doesn't matter.
> 
> In addition, also need to clean all the code related to the backup
> region later.
> 
>  arch/x86/realmode/init.c |  2 ++
>  include/linux/kexec.h    |  2 ++
>  kernel/kexec_core.c      | 13 +++++++++++++
>  3 files changed, 17 insertions(+)
> 
> diff --git a/arch/x86/realmode/init.c b/arch/x86/realmode/init.c
> index 7dce39c8c034..064cc79a015d 100644
> --- a/arch/x86/realmode/init.c
> +++ b/arch/x86/realmode/init.c
> @@ -3,6 +3,7 @@
>  #include <linux/slab.h>
>  #include <linux/memblock.h>
>  #include <linux/mem_encrypt.h>
> +#include <linux/kexec.h>
> 
>  #include <asm/set_memory.h>
>  #include <asm/pgtable.h>
> @@ -34,6 +35,7 @@ void __init reserve_real_mode(void)
> 
>  	memblock_reserve(mem, size);
>  	set_real_mode_mem(mem);
> +	kexec_reserve_low_1MiB();
>  }
> 
>  static void __init setup_real_mode(void)
> diff --git a/include/linux/kexec.h b/include/linux/kexec.h
> index 1776eb2e43a4..30acf1d738bc 100644
> --- a/include/linux/kexec.h
> +++ b/include/linux/kexec.h
> @@ -306,6 +306,7 @@ extern void __crash_kexec(struct pt_regs *);
>  extern void crash_kexec(struct pt_regs *);
>  int kexec_should_crash(struct task_struct *);
>  int kexec_crash_loaded(void);
> +void kexec_reserve_low_1MiB(void);
>  void crash_save_cpu(struct pt_regs *regs, int cpu);
>  extern int kimage_crash_copy_vmcoreinfo(struct kimage *image);
> 
> @@ -397,6 +398,7 @@ static inline void __crash_kexec(struct pt_regs *regs) { }
>  static inline void crash_kexec(struct pt_regs *regs) { }
>  static inline int kexec_should_crash(struct task_struct *p) { return 0; }
>  static inline int kexec_crash_loaded(void) { return 0; }
> +static inline void kexec_reserve_low_1MiB(void) { }
>  #define kexec_in_progress false
>  #endif /* CONFIG_KEXEC_CORE */
> 
> diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
> index 15d70a90b50d..5bd89f1fee42 100644
> --- a/kernel/kexec_core.c
> +++ b/kernel/kexec_core.c
> @@ -37,6 +37,7 @@
>  #include <linux/compiler.h>
>  #include <linux/hugetlb.h>
>  #include <linux/frame.h>
> +#include <linux/memblock.h>
> 
>  #include <asm/page.h>
>  #include <asm/sections.h>
> @@ -70,6 +71,18 @@ struct resource crashk_low_res = {
>  	.desc  = IORES_DESC_CRASH_KERNEL
>  };
> 
> +/*
> + * When the crashkernel option is specified, only use the low
> + * 1MiB for the real mode trampoline.
> + */
> +void kexec_reserve_low_1MiB(void)
> +{
> +	if (strstr(boot_command_line, "crashkernel=")) {

strstr() matches for example, ANYEXTRACHARACTERScrashkernel=ANYEXTRACHARACTERS.

Is it enough to use cmdline_find_option_bool()?

> +		memblock_reserve(0, 1<<20);
> +		pr_info("Reserving the low 1MiB of memory for
> crashkernel\n");
> +	}
> +}
> +
>  int kexec_should_crash(struct task_struct *p)
>  {
>  	/*
> --
> 2.17.1

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/3 v4] x86/kdump: always reserve the low 1MiB when the crashkernel option is specified
  2019-10-23  7:46         ` Borislav Petkov
@ 2019-10-23  9:20           ` lijiang
  -1 siblings, 0 replies; 22+ messages in thread
From: lijiang @ 2019-10-23  9:20 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: linux-kernel, tglx, mingo, hpa, x86, bhe, dyoung, jgross,
	dhowells, Thomas.Lendacky, ebiederm, vgoyal, kexec

在 2019年10月23日 15:46, Borislav Petkov 写道:
> On Wed, Oct 23, 2019 at 01:35:09PM +0800, lijiang wrote:
>> Would you mind if i improve this patch as follow? Thanks.
> 
> Yap, looks good to me.
> 
Thanks for your comment.

OK. I will post this one and the third patch in this series later.

Thanks.
Lianbo


> Thx.
> 


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/3 v4] x86/kdump: always reserve the low 1MiB when the crashkernel option is specified
@ 2019-10-23  9:20           ` lijiang
  0 siblings, 0 replies; 22+ messages in thread
From: lijiang @ 2019-10-23  9:20 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: jgross, Thomas.Lendacky, bhe, x86, kexec, linux-kernel, dhowells,
	mingo, ebiederm, hpa, tglx, dyoung, vgoyal

在 2019年10月23日 15:46, Borislav Petkov 写道:
> On Wed, Oct 23, 2019 at 01:35:09PM +0800, lijiang wrote:
>> Would you mind if i improve this patch as follow? Thanks.
> 
> Yap, looks good to me.
> 
Thanks for your comment.

OK. I will post this one and the third patch in this series later.

Thanks.
Lianbo


> Thx.
> 


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/3 v4] x86/kdump: always reserve the low 1MiB when the crashkernel option is specified
  2019-10-23  5:35       ` lijiang
@ 2019-10-23  7:46         ` Borislav Petkov
  -1 siblings, 0 replies; 22+ messages in thread
From: Borislav Petkov @ 2019-10-23  7:46 UTC (permalink / raw)
  To: lijiang
  Cc: linux-kernel, tglx, mingo, hpa, x86, bhe, dyoung, jgross,
	dhowells, Thomas.Lendacky, ebiederm, vgoyal, kexec

On Wed, Oct 23, 2019 at 01:35:09PM +0800, lijiang wrote:
> Would you mind if i improve this patch as follow? Thanks.

Yap, looks good to me.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/3 v4] x86/kdump: always reserve the low 1MiB when the crashkernel option is specified
@ 2019-10-23  7:46         ` Borislav Petkov
  0 siblings, 0 replies; 22+ messages in thread
From: Borislav Petkov @ 2019-10-23  7:46 UTC (permalink / raw)
  To: lijiang
  Cc: jgross, Thomas.Lendacky, bhe, x86, kexec, linux-kernel, dhowells,
	mingo, ebiederm, hpa, tglx, dyoung, vgoyal

On Wed, Oct 23, 2019 at 01:35:09PM +0800, lijiang wrote:
> Would you mind if i improve this patch as follow? Thanks.

Yap, looks good to me.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/3 v4] x86/kdump: always reserve the low 1MiB when the crashkernel option is specified
  2019-10-23  5:23       ` lijiang
@ 2019-10-23  7:43         ` Borislav Petkov
  -1 siblings, 0 replies; 22+ messages in thread
From: Borislav Petkov @ 2019-10-23  7:43 UTC (permalink / raw)
  To: lijiang
  Cc: linux-kernel, tglx, mingo, hpa, x86, bhe, dyoung, jgross,
	dhowells, Thomas.Lendacky, ebiederm, vgoyal, kexec

On Wed, Oct 23, 2019 at 01:23:33PM +0800, lijiang wrote:
> Kdump kernel will reuse the first 640k region because the real mode
> trampoline has to work in this area. When the vmcore is dumped, the
> old memory in this area may be accessed, therefore, kernel has to
> copy the contents of the first 640k area to a backup region so that
> kdump kernel can read the old memory from the backup area of the
> first 640k area, which is done in the purgatory().

That sounds better. :)

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/3 v4] x86/kdump: always reserve the low 1MiB when the crashkernel option is specified
@ 2019-10-23  7:43         ` Borislav Petkov
  0 siblings, 0 replies; 22+ messages in thread
From: Borislav Petkov @ 2019-10-23  7:43 UTC (permalink / raw)
  To: lijiang
  Cc: jgross, Thomas.Lendacky, bhe, x86, kexec, linux-kernel, dhowells,
	mingo, ebiederm, hpa, tglx, dyoung, vgoyal

On Wed, Oct 23, 2019 at 01:23:33PM +0800, lijiang wrote:
> Kdump kernel will reuse the first 640k region because the real mode
> trampoline has to work in this area. When the vmcore is dumped, the
> old memory in this area may be accessed, therefore, kernel has to
> copy the contents of the first 640k area to a backup region so that
> kdump kernel can read the old memory from the backup area of the
> first 640k area, which is done in the purgatory().

That sounds better. :)

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/3 v4] x86/kdump: always reserve the low 1MiB when the crashkernel option is specified
  2019-10-22  8:30     ` Borislav Petkov
@ 2019-10-23  5:35       ` lijiang
  -1 siblings, 0 replies; 22+ messages in thread
From: lijiang @ 2019-10-23  5:35 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: linux-kernel, tglx, mingo, hpa, x86, bhe, dyoung, jgross,
	dhowells, Thomas.Lendacky, ebiederm, vgoyal, kexec

在 2019年10月22日 16:30, Borislav Petkov 写道:
> This ifdeffery needs to be a function in kernel/kexec_core.c which is
> called by reserve_real_mode(), instead.

Would you mind if i improve this patch as follow? Thanks.

From 5804abec62279585f374d78ace1250505c44c6b7 Mon Sep 17 00:00:00 2001
From: Lianbo Jiang <lijiang@redhat.com>
Date: Wed, 23 Oct 2019 11:27:04 +0800
Subject: [PATCH] x86/kdump: always reserve the low 1MiB when the crashkernel
 option is specified

Kdump kernel will reuse the first 640k region because the real mode
trampoline has to work in this area. When the vmcore is dumped, the
old memory in this area may be accessed, therefore, kernel has to
copy the contents of the first 640k area to a backup region so that
kdump kernel can read the old memory from the backup area of the
first 640k area, which is done in the purgatory().

But, the current handling of copying the first 640k area runs into
problems when SME is enabled, kernel does not properly copy these
old memory to the backup area in the purgatory(), thereby, kdump
kernel reads out the encrypted contents, because the kdump kernel
must access the first kernel's memory with the encryption bit set
when SME is enabled in the first kernel. Please refer to this link:

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204793

Finally, it causes the following errors, and the crash tool gets
invalid pointers when parsing the vmcore.

crash> kmem -s|grep -i invalid
kmem: dma-kmalloc-512: slab:ffffd77680001c00 invalid freepointer:a6086ac099f0c5a4
kmem: dma-kmalloc-512: slab:ffffd77680001c00 invalid freepointer:a6086ac099f0c5a4
crash>

To avoid the above errors, when the crashkernel option is specified,
lets reserve the remaining low 1MiB memory(after reserving real mode
memory) so that the allocated memory does not fall into the low 1MiB
area, which makes us not to copy the first 640k content to a backup
region in purgatory(). This indicates that it does not need to be
included in crash dumps or used for anything except the processor
trampolines that must live in the low 1MiB.

Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
---
BTW:I also tried to fix the above problem in purgatory(), but there
are too many restricts in purgatory() context, for example: i can't
allocate new memory to create the identity mapping page table for
SME situation.

Currently, there are two places where the first 640k area is needed,
the first one is in the find_trampoline_placement(), another one is
in the reserve_real_mode(), and their content doesn't matter.

In addition, also need to clean all the code related to the backup
region later.

 arch/x86/realmode/init.c |  2 ++
 include/linux/kexec.h    |  2 ++
 kernel/kexec_core.c      | 13 +++++++++++++
 3 files changed, 17 insertions(+)

diff --git a/arch/x86/realmode/init.c b/arch/x86/realmode/init.c
index 7dce39c8c034..064cc79a015d 100644
--- a/arch/x86/realmode/init.c
+++ b/arch/x86/realmode/init.c
@@ -3,6 +3,7 @@
 #include <linux/slab.h>
 #include <linux/memblock.h>
 #include <linux/mem_encrypt.h>
+#include <linux/kexec.h>
 
 #include <asm/set_memory.h>
 #include <asm/pgtable.h>
@@ -34,6 +35,7 @@ void __init reserve_real_mode(void)
 
 	memblock_reserve(mem, size);
 	set_real_mode_mem(mem);
+	kexec_reserve_low_1MiB();
 }
 
 static void __init setup_real_mode(void)
diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index 1776eb2e43a4..30acf1d738bc 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -306,6 +306,7 @@ extern void __crash_kexec(struct pt_regs *);
 extern void crash_kexec(struct pt_regs *);
 int kexec_should_crash(struct task_struct *);
 int kexec_crash_loaded(void);
+void kexec_reserve_low_1MiB(void);
 void crash_save_cpu(struct pt_regs *regs, int cpu);
 extern int kimage_crash_copy_vmcoreinfo(struct kimage *image);
 
@@ -397,6 +398,7 @@ static inline void __crash_kexec(struct pt_regs *regs) { }
 static inline void crash_kexec(struct pt_regs *regs) { }
 static inline int kexec_should_crash(struct task_struct *p) { return 0; }
 static inline int kexec_crash_loaded(void) { return 0; }
+static inline void kexec_reserve_low_1MiB(void) { }
 #define kexec_in_progress false
 #endif /* CONFIG_KEXEC_CORE */
 
diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
index 15d70a90b50d..5bd89f1fee42 100644
--- a/kernel/kexec_core.c
+++ b/kernel/kexec_core.c
@@ -37,6 +37,7 @@
 #include <linux/compiler.h>
 #include <linux/hugetlb.h>
 #include <linux/frame.h>
+#include <linux/memblock.h>
 
 #include <asm/page.h>
 #include <asm/sections.h>
@@ -70,6 +71,18 @@ struct resource crashk_low_res = {
 	.desc  = IORES_DESC_CRASH_KERNEL
 };
 
+/*
+ * When the crashkernel option is specified, only use the low
+ * 1MiB for the real mode trampoline.
+ */
+void kexec_reserve_low_1MiB(void)
+{
+	if (strstr(boot_command_line, "crashkernel=")) {
+		memblock_reserve(0, 1<<20);
+		pr_info("Reserving the low 1MiB of memory for crashkernel\n");
+	}
+}
+
 int kexec_should_crash(struct task_struct *p)
 {
 	/*
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/3 v4] x86/kdump: always reserve the low 1MiB when the crashkernel option is specified
@ 2019-10-23  5:35       ` lijiang
  0 siblings, 0 replies; 22+ messages in thread
From: lijiang @ 2019-10-23  5:35 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: jgross, Thomas.Lendacky, bhe, x86, kexec, linux-kernel, dhowells,
	mingo, ebiederm, hpa, tglx, dyoung, vgoyal

在 2019年10月22日 16:30, Borislav Petkov 写道:
> This ifdeffery needs to be a function in kernel/kexec_core.c which is
> called by reserve_real_mode(), instead.

Would you mind if i improve this patch as follow? Thanks.

From 5804abec62279585f374d78ace1250505c44c6b7 Mon Sep 17 00:00:00 2001
From: Lianbo Jiang <lijiang@redhat.com>
Date: Wed, 23 Oct 2019 11:27:04 +0800
Subject: [PATCH] x86/kdump: always reserve the low 1MiB when the crashkernel
 option is specified

Kdump kernel will reuse the first 640k region because the real mode
trampoline has to work in this area. When the vmcore is dumped, the
old memory in this area may be accessed, therefore, kernel has to
copy the contents of the first 640k area to a backup region so that
kdump kernel can read the old memory from the backup area of the
first 640k area, which is done in the purgatory().

But, the current handling of copying the first 640k area runs into
problems when SME is enabled, kernel does not properly copy these
old memory to the backup area in the purgatory(), thereby, kdump
kernel reads out the encrypted contents, because the kdump kernel
must access the first kernel's memory with the encryption bit set
when SME is enabled in the first kernel. Please refer to this link:

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204793

Finally, it causes the following errors, and the crash tool gets
invalid pointers when parsing the vmcore.

crash> kmem -s|grep -i invalid
kmem: dma-kmalloc-512: slab:ffffd77680001c00 invalid freepointer:a6086ac099f0c5a4
kmem: dma-kmalloc-512: slab:ffffd77680001c00 invalid freepointer:a6086ac099f0c5a4
crash>

To avoid the above errors, when the crashkernel option is specified,
lets reserve the remaining low 1MiB memory(after reserving real mode
memory) so that the allocated memory does not fall into the low 1MiB
area, which makes us not to copy the first 640k content to a backup
region in purgatory(). This indicates that it does not need to be
included in crash dumps or used for anything except the processor
trampolines that must live in the low 1MiB.

Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
---
BTW:I also tried to fix the above problem in purgatory(), but there
are too many restricts in purgatory() context, for example: i can't
allocate new memory to create the identity mapping page table for
SME situation.

Currently, there are two places where the first 640k area is needed,
the first one is in the find_trampoline_placement(), another one is
in the reserve_real_mode(), and their content doesn't matter.

In addition, also need to clean all the code related to the backup
region later.

 arch/x86/realmode/init.c |  2 ++
 include/linux/kexec.h    |  2 ++
 kernel/kexec_core.c      | 13 +++++++++++++
 3 files changed, 17 insertions(+)

diff --git a/arch/x86/realmode/init.c b/arch/x86/realmode/init.c
index 7dce39c8c034..064cc79a015d 100644
--- a/arch/x86/realmode/init.c
+++ b/arch/x86/realmode/init.c
@@ -3,6 +3,7 @@
 #include <linux/slab.h>
 #include <linux/memblock.h>
 #include <linux/mem_encrypt.h>
+#include <linux/kexec.h>
 
 #include <asm/set_memory.h>
 #include <asm/pgtable.h>
@@ -34,6 +35,7 @@ void __init reserve_real_mode(void)
 
 	memblock_reserve(mem, size);
 	set_real_mode_mem(mem);
+	kexec_reserve_low_1MiB();
 }
 
 static void __init setup_real_mode(void)
diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index 1776eb2e43a4..30acf1d738bc 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -306,6 +306,7 @@ extern void __crash_kexec(struct pt_regs *);
 extern void crash_kexec(struct pt_regs *);
 int kexec_should_crash(struct task_struct *);
 int kexec_crash_loaded(void);
+void kexec_reserve_low_1MiB(void);
 void crash_save_cpu(struct pt_regs *regs, int cpu);
 extern int kimage_crash_copy_vmcoreinfo(struct kimage *image);
 
@@ -397,6 +398,7 @@ static inline void __crash_kexec(struct pt_regs *regs) { }
 static inline void crash_kexec(struct pt_regs *regs) { }
 static inline int kexec_should_crash(struct task_struct *p) { return 0; }
 static inline int kexec_crash_loaded(void) { return 0; }
+static inline void kexec_reserve_low_1MiB(void) { }
 #define kexec_in_progress false
 #endif /* CONFIG_KEXEC_CORE */
 
diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
index 15d70a90b50d..5bd89f1fee42 100644
--- a/kernel/kexec_core.c
+++ b/kernel/kexec_core.c
@@ -37,6 +37,7 @@
 #include <linux/compiler.h>
 #include <linux/hugetlb.h>
 #include <linux/frame.h>
+#include <linux/memblock.h>
 
 #include <asm/page.h>
 #include <asm/sections.h>
@@ -70,6 +71,18 @@ struct resource crashk_low_res = {
 	.desc  = IORES_DESC_CRASH_KERNEL
 };
 
+/*
+ * When the crashkernel option is specified, only use the low
+ * 1MiB for the real mode trampoline.
+ */
+void kexec_reserve_low_1MiB(void)
+{
+	if (strstr(boot_command_line, "crashkernel=")) {
+		memblock_reserve(0, 1<<20);
+		pr_info("Reserving the low 1MiB of memory for crashkernel\n");
+	}
+}
+
 int kexec_should_crash(struct task_struct *p)
 {
 	/*
-- 
2.17.1


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/3 v4] x86/kdump: always reserve the low 1MiB when the crashkernel option is specified
  2019-10-22  8:30     ` Borislav Petkov
@ 2019-10-23  5:23       ` lijiang
  -1 siblings, 0 replies; 22+ messages in thread
From: lijiang @ 2019-10-23  5:23 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: linux-kernel, tglx, mingo, hpa, x86, bhe, dyoung, jgross,
	dhowells, Thomas.Lendacky, ebiederm, vgoyal, kexec

在 2019年10月22日 16:30, Borislav Petkov 写道:
> On Thu, Oct 17, 2019 at 05:43:45PM +0800, Lianbo Jiang wrote:
>> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204793
> 
Thanks for your comment.

> Put that as a Link: below.
> 
Looks better. OK.

>> Kdump kernel will reuse the first 640k region because of some reasons,
> 
> s/ of some reasons//
> 
>> for example: the trampline and conventional PC system BIOS region may
> 
> spellcheck: s/trampline/trampoline/
> 
> I see two more typos in here and if you had a spellchecker enabled in
> your editor where you write the commit message, you'll see them too.
> Please use one.
> 
Good point. I just tried to enable the spellchecker in the vim and now it
has worked well. Thanks. :-) 

>> require to allocate memory in this area. Obviously, kdump kernel will
>> also overwrite the first 640k region,
> 
> Well, it is not obvious to me. Please be more specific: why would the
> kdump kernel do that?
> 
Kdump kernel will reuse the first 640k region because the real mode
trampoline has to work in this area. When the vmcore is dumped, the
old memory in this area may be accessed, therefore, kernel has to
copy the contents of the first 640k area to a backup region so that
kdump kernel can read the old memory from the backup area of the
first 640k area, which is done in the purgatory().

>> therefore, kernel has to copy
>> the contents of the first 640k area to a backup area, which is done in
>> purgatory(), because vmcore may need the old memory. When vmcore is
>> dumped, kdump kernel will read the old memory from the backup area of
>> the first 640k area.
>>
>> Basically, the main reason should be clear, kernel does not correctly
>> handle the first 640k region when SME is active,
> 
> If you mention the actual reason here, that sentence would be clearer:
> 
> "When SME is enabled in the first kernel, the kdump kernel must access
> the first kernel's memory with the encryption bit set."
> 
> Something like that. 
> 
Looks good.

>> which causes that
>> kernel does not properly copy these old memory to the backup area in
>> purgatory(). Therefore, kdump kernel reads out the incorrect contents
> 
> s/incorrect/encrypted/
> 
Exactly.

>> from the backup area when dumping vmcore. Finally, the phenomenon is
> 
> phenomenon?
> 
Finally, it caused the following errors.

>> as follow:
>>
>> [root linux]$ crash vmlinux /var/crash/127.0.0.1-2019-09-19-08\:31\:27/vmcore
>> WARNING: kernel relocated [240MB]: patching 97110 gdb minimal_symbol values
>>
>>       KERNEL: /var/crash/127.0.0.1-2019-09-19-08:31:27/vmlinux
>>     DUMPFILE: /var/crash/127.0.0.1-2019-09-19-08:31:27/vmcore  [PARTIAL DUMP]
>>         CPUS: 128
>>         DATE: Thu Sep 19 08:31:18 2019
>>       UPTIME: 00:01:21
>> LOAD AVERAGE: 0.16, 0.07, 0.02
>>        TASKS: 1343
>>     NODENAME: amd-ethanol
>>      RELEASE: 5.3.0-rc7+
>>      VERSION: #4 SMP Thu Sep 19 08:14:00 EDT 2019
>>      MACHINE: x86_64  (2195 Mhz)
>>       MEMORY: 127.9 GB
>>        PANIC: "Kernel panic - not syncing: sysrq triggered crash"
>>          PID: 9789
>>      COMMAND: "bash"
>>         TASK: "ffff89711894ae80  [THREAD_INFO: ffff89711894ae80]"
>>          CPU: 83
>>        STATE: TASK_RUNNING (PANIC)
>>
>> crash> kmem -s|grep -i invalid
>> kmem: dma-kmalloc-512: slab:ffffd77680001c00 invalid freepointer:a6086ac099f0c5a4
>> kmem: dma-kmalloc-512: slab:ffffd77680001c00 invalid freepointer:a6086ac099f0c5a4
>> crash>
> 
> I fail to see what that's trying to tell me? You have invalid pointers?
> 
Yes, when parsing the vmcore via crash tool, it occurs the above errors,
the crash tool gets invalid pointers. 

>> BTW: I also tried to fix the above problem in purgatory(), but there
>> are too many restricts in purgatory() context, for example: i can't
>> allocate new memory to create the identity mapping page table for SME
>> situation.
> 
> This paragraph belongs under the "---" line below.
> 
OK. Thanks.

>> Currently, there are two places where the first 640k area is needed,
>> the first one is in the find_trampoline_placement(), another one is
>> in the reserve_real_mode(), and their content doesn't matter.
>>
>> To avoid the above error, when the crashkernel kernel command line
>> option is specified, lets reserve the remaining low 1MiB memory(
>> after reserving real mode memroy) so that the allocated memory does
>> not fall into the low 1MiB area, which makes us not to copy the first
>> 640k content to a backup region in purgatory(). This indicates that
>> it does not need to be included in crash dumps or used for anything
>> execept the processor trampolines that must live in the low 1MiB.
>>
>> In addition, also need to clean all the code related to the backup
>> region later.
> 
> Ditto.
> 
>> Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
>> ---
>>  arch/x86/realmode/init.c | 11 +++++++++++
>>  1 file changed, 11 insertions(+)
>>
>> diff --git a/arch/x86/realmode/init.c b/arch/x86/realmode/init.c
>> index 7dce39c8c034..1f0492830f2c 100644
>> --- a/arch/x86/realmode/init.c
>> +++ b/arch/x86/realmode/init.c
>> @@ -34,6 +34,17 @@ void __init reserve_real_mode(void)
>>  
>>  	memblock_reserve(mem, size);
>>  	set_real_mode_mem(mem);
>> +
>> +#ifdef CONFIG_KEXEC_CORE
>> +	/*
>> +	 * When the crashkernel option is specified, only use the low
>> +	 * 1MiB for the real mode trampoline.
>> +	 */
>> +	if (strstr(boot_command_line, "crashkernel=")) {
>> +		memblock_reserve(0, 1<<20);
>> +		pr_info("Reserving the low 1MiB of memory for crashkernel\n");
>> +	}
>> +#endif /* CONFIG_KEXEC_CORE */
> 
> This ifdeffery needs to be a function in kernel/kexec_core.c which is
> called by reserve_real_mode(), instead.
> 
Good understanding. I will try to improve it later.

Thanks.
Lianbo
> Thx.
> 


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/3 v4] x86/kdump: always reserve the low 1MiB when the crashkernel option is specified
@ 2019-10-23  5:23       ` lijiang
  0 siblings, 0 replies; 22+ messages in thread
From: lijiang @ 2019-10-23  5:23 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: jgross, Thomas.Lendacky, bhe, x86, kexec, linux-kernel, dhowells,
	mingo, ebiederm, hpa, tglx, dyoung, vgoyal

在 2019年10月22日 16:30, Borislav Petkov 写道:
> On Thu, Oct 17, 2019 at 05:43:45PM +0800, Lianbo Jiang wrote:
>> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204793
> 
Thanks for your comment.

> Put that as a Link: below.
> 
Looks better. OK.

>> Kdump kernel will reuse the first 640k region because of some reasons,
> 
> s/ of some reasons//
> 
>> for example: the trampline and conventional PC system BIOS region may
> 
> spellcheck: s/trampline/trampoline/
> 
> I see two more typos in here and if you had a spellchecker enabled in
> your editor where you write the commit message, you'll see them too.
> Please use one.
> 
Good point. I just tried to enable the spellchecker in the vim and now it
has worked well. Thanks. :-) 

>> require to allocate memory in this area. Obviously, kdump kernel will
>> also overwrite the first 640k region,
> 
> Well, it is not obvious to me. Please be more specific: why would the
> kdump kernel do that?
> 
Kdump kernel will reuse the first 640k region because the real mode
trampoline has to work in this area. When the vmcore is dumped, the
old memory in this area may be accessed, therefore, kernel has to
copy the contents of the first 640k area to a backup region so that
kdump kernel can read the old memory from the backup area of the
first 640k area, which is done in the purgatory().

>> therefore, kernel has to copy
>> the contents of the first 640k area to a backup area, which is done in
>> purgatory(), because vmcore may need the old memory. When vmcore is
>> dumped, kdump kernel will read the old memory from the backup area of
>> the first 640k area.
>>
>> Basically, the main reason should be clear, kernel does not correctly
>> handle the first 640k region when SME is active,
> 
> If you mention the actual reason here, that sentence would be clearer:
> 
> "When SME is enabled in the first kernel, the kdump kernel must access
> the first kernel's memory with the encryption bit set."
> 
> Something like that. 
> 
Looks good.

>> which causes that
>> kernel does not properly copy these old memory to the backup area in
>> purgatory(). Therefore, kdump kernel reads out the incorrect contents
> 
> s/incorrect/encrypted/
> 
Exactly.

>> from the backup area when dumping vmcore. Finally, the phenomenon is
> 
> phenomenon?
> 
Finally, it caused the following errors.

>> as follow:
>>
>> [root linux]$ crash vmlinux /var/crash/127.0.0.1-2019-09-19-08\:31\:27/vmcore
>> WARNING: kernel relocated [240MB]: patching 97110 gdb minimal_symbol values
>>
>>       KERNEL: /var/crash/127.0.0.1-2019-09-19-08:31:27/vmlinux
>>     DUMPFILE: /var/crash/127.0.0.1-2019-09-19-08:31:27/vmcore  [PARTIAL DUMP]
>>         CPUS: 128
>>         DATE: Thu Sep 19 08:31:18 2019
>>       UPTIME: 00:01:21
>> LOAD AVERAGE: 0.16, 0.07, 0.02
>>        TASKS: 1343
>>     NODENAME: amd-ethanol
>>      RELEASE: 5.3.0-rc7+
>>      VERSION: #4 SMP Thu Sep 19 08:14:00 EDT 2019
>>      MACHINE: x86_64  (2195 Mhz)
>>       MEMORY: 127.9 GB
>>        PANIC: "Kernel panic - not syncing: sysrq triggered crash"
>>          PID: 9789
>>      COMMAND: "bash"
>>         TASK: "ffff89711894ae80  [THREAD_INFO: ffff89711894ae80]"
>>          CPU: 83
>>        STATE: TASK_RUNNING (PANIC)
>>
>> crash> kmem -s|grep -i invalid
>> kmem: dma-kmalloc-512: slab:ffffd77680001c00 invalid freepointer:a6086ac099f0c5a4
>> kmem: dma-kmalloc-512: slab:ffffd77680001c00 invalid freepointer:a6086ac099f0c5a4
>> crash>
> 
> I fail to see what that's trying to tell me? You have invalid pointers?
> 
Yes, when parsing the vmcore via crash tool, it occurs the above errors,
the crash tool gets invalid pointers. 

>> BTW: I also tried to fix the above problem in purgatory(), but there
>> are too many restricts in purgatory() context, for example: i can't
>> allocate new memory to create the identity mapping page table for SME
>> situation.
> 
> This paragraph belongs under the "---" line below.
> 
OK. Thanks.

>> Currently, there are two places where the first 640k area is needed,
>> the first one is in the find_trampoline_placement(), another one is
>> in the reserve_real_mode(), and their content doesn't matter.
>>
>> To avoid the above error, when the crashkernel kernel command line
>> option is specified, lets reserve the remaining low 1MiB memory(
>> after reserving real mode memroy) so that the allocated memory does
>> not fall into the low 1MiB area, which makes us not to copy the first
>> 640k content to a backup region in purgatory(). This indicates that
>> it does not need to be included in crash dumps or used for anything
>> execept the processor trampolines that must live in the low 1MiB.
>>
>> In addition, also need to clean all the code related to the backup
>> region later.
> 
> Ditto.
> 
>> Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
>> ---
>>  arch/x86/realmode/init.c | 11 +++++++++++
>>  1 file changed, 11 insertions(+)
>>
>> diff --git a/arch/x86/realmode/init.c b/arch/x86/realmode/init.c
>> index 7dce39c8c034..1f0492830f2c 100644
>> --- a/arch/x86/realmode/init.c
>> +++ b/arch/x86/realmode/init.c
>> @@ -34,6 +34,17 @@ void __init reserve_real_mode(void)
>>  
>>  	memblock_reserve(mem, size);
>>  	set_real_mode_mem(mem);
>> +
>> +#ifdef CONFIG_KEXEC_CORE
>> +	/*
>> +	 * When the crashkernel option is specified, only use the low
>> +	 * 1MiB for the real mode trampoline.
>> +	 */
>> +	if (strstr(boot_command_line, "crashkernel=")) {
>> +		memblock_reserve(0, 1<<20);
>> +		pr_info("Reserving the low 1MiB of memory for crashkernel\n");
>> +	}
>> +#endif /* CONFIG_KEXEC_CORE */
> 
> This ifdeffery needs to be a function in kernel/kexec_core.c which is
> called by reserve_real_mode(), instead.
> 
Good understanding. I will try to improve it later.

Thanks.
Lianbo
> Thx.
> 


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/3 v4] x86/kdump: always reserve the low 1MiB when the crashkernel option is specified
       [not found] <55902207.7797907.1571751831873.JavaMail.zimbra@redhat.com>
@ 2019-10-22 13:43 ` Dave Anderson
  0 siblings, 0 replies; 22+ messages in thread
From: Dave Anderson @ 2019-10-22 13:43 UTC (permalink / raw)
  To: Linux Kernel Mailing List


---- Original Message -----

> > 
> > [root linux]$ crash vmlinux
> > /var/crash/127.0.0.1-2019-09-19-08\:31\:27/vmcore
> > WARNING: kernel relocated [240MB]: patching 97110 gdb minimal_symbol values
> > 
> >       KERNEL: /var/crash/127.0.0.1-2019-09-19-08:31:27/vmlinux
> >     DUMPFILE: /var/crash/127.0.0.1-2019-09-19-08:31:27/vmcore  [PARTIAL DUMP]
> >         CPUS: 128
> >         DATE: Thu Sep 19 08:31:18 2019
> >       UPTIME: 00:01:21
> > LOAD AVERAGE: 0.16, 0.07, 0.02
> >        TASKS: 1343
> >     NODENAME: amd-ethanol
> >      RELEASE: 5.3.0-rc7+
> >      VERSION: #4 SMP Thu Sep 19 08:14:00 EDT 2019
> >      MACHINE: x86_64  (2195 Mhz)
> >       MEMORY: 127.9 GB
> >        PANIC: "Kernel panic - not syncing: sysrq triggered crash"
> >          PID: 9789
> >      COMMAND: "bash"
> >         TASK: "ffff89711894ae80  [THREAD_INFO: ffff89711894ae80]"
> >          CPU: 83
> >        STATE: TASK_RUNNING (PANIC)
> > 
> > crash> kmem -s|grep -i invalid
> > kmem: dma-kmalloc-512: slab:ffffd77680001c00 invalid freepointer:a6086ac099f0c5a4
> > kmem: dma-kmalloc-512: slab:ffffd77680001c00 invalid freepointer:a6086ac099f0c5a4
> > crash>
> 
> I fail to see what that's trying to tell me? You have invalid pointers?

Correct, because the pointer values are encrypted.  The command is walking through the
singly-linked list of free objects in a slab from the dma-kmalloc-512 slab cache.  The
slab memory had been allocated from low memory, and because of the  problem at hand,
it was was copied to the vmcore in its encrypted state.

Dave 


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/3 v4] x86/kdump: always reserve the low 1MiB when the crashkernel option is specified
  2019-10-17  9:43   ` Lianbo Jiang
@ 2019-10-22  8:30     ` Borislav Petkov
  -1 siblings, 0 replies; 22+ messages in thread
From: Borislav Petkov @ 2019-10-22  8:30 UTC (permalink / raw)
  To: Lianbo Jiang
  Cc: linux-kernel, tglx, mingo, hpa, x86, bhe, dyoung, jgross,
	dhowells, Thomas.Lendacky, ebiederm, vgoyal, kexec

On Thu, Oct 17, 2019 at 05:43:45PM +0800, Lianbo Jiang wrote:
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204793

Put that as a Link: below.

> Kdump kernel will reuse the first 640k region because of some reasons,

s/ of some reasons//

> for example: the trampline and conventional PC system BIOS region may

spellcheck: s/trampline/trampoline/

I see two more typos in here and if you had a spellchecker enabled in
your editor where you write the commit message, you'll see them too.
Please use one.

> require to allocate memory in this area. Obviously, kdump kernel will
> also overwrite the first 640k region,

Well, it is not obvious to me. Please be more specific: why would the
kdump kernel do that?

> therefore, kernel has to copy
> the contents of the first 640k area to a backup area, which is done in
> purgatory(), because vmcore may need the old memory. When vmcore is
> dumped, kdump kernel will read the old memory from the backup area of
> the first 640k area.
> 
> Basically, the main reason should be clear, kernel does not correctly
> handle the first 640k region when SME is active,

If you mention the actual reason here, that sentence would be clearer:

"When SME is enabled in the first kernel, the kdump kernel must access
the first kernel's memory with the encryption bit set."

Something like that. 

> which causes that
> kernel does not properly copy these old memory to the backup area in
> purgatory(). Therefore, kdump kernel reads out the incorrect contents

s/incorrect/encrypted/

> from the backup area when dumping vmcore. Finally, the phenomenon is

phenomenon?

> as follow:
> 
> [root linux]$ crash vmlinux /var/crash/127.0.0.1-2019-09-19-08\:31\:27/vmcore
> WARNING: kernel relocated [240MB]: patching 97110 gdb minimal_symbol values
> 
>       KERNEL: /var/crash/127.0.0.1-2019-09-19-08:31:27/vmlinux
>     DUMPFILE: /var/crash/127.0.0.1-2019-09-19-08:31:27/vmcore  [PARTIAL DUMP]
>         CPUS: 128
>         DATE: Thu Sep 19 08:31:18 2019
>       UPTIME: 00:01:21
> LOAD AVERAGE: 0.16, 0.07, 0.02
>        TASKS: 1343
>     NODENAME: amd-ethanol
>      RELEASE: 5.3.0-rc7+
>      VERSION: #4 SMP Thu Sep 19 08:14:00 EDT 2019
>      MACHINE: x86_64  (2195 Mhz)
>       MEMORY: 127.9 GB
>        PANIC: "Kernel panic - not syncing: sysrq triggered crash"
>          PID: 9789
>      COMMAND: "bash"
>         TASK: "ffff89711894ae80  [THREAD_INFO: ffff89711894ae80]"
>          CPU: 83
>        STATE: TASK_RUNNING (PANIC)
> 
> crash> kmem -s|grep -i invalid
> kmem: dma-kmalloc-512: slab:ffffd77680001c00 invalid freepointer:a6086ac099f0c5a4
> kmem: dma-kmalloc-512: slab:ffffd77680001c00 invalid freepointer:a6086ac099f0c5a4
> crash>

I fail to see what that's trying to tell me? You have invalid pointers?

> BTW: I also tried to fix the above problem in purgatory(), but there
> are too many restricts in purgatory() context, for example: i can't
> allocate new memory to create the identity mapping page table for SME
> situation.

This paragraph belongs under the "---" line below.

> Currently, there are two places where the first 640k area is needed,
> the first one is in the find_trampoline_placement(), another one is
> in the reserve_real_mode(), and their content doesn't matter.
> 
> To avoid the above error, when the crashkernel kernel command line
> option is specified, lets reserve the remaining low 1MiB memory(
> after reserving real mode memroy) so that the allocated memory does
> not fall into the low 1MiB area, which makes us not to copy the first
> 640k content to a backup region in purgatory(). This indicates that
> it does not need to be included in crash dumps or used for anything
> execept the processor trampolines that must live in the low 1MiB.
> 
> In addition, also need to clean all the code related to the backup
> region later.

Ditto.

> Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
> ---
>  arch/x86/realmode/init.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/arch/x86/realmode/init.c b/arch/x86/realmode/init.c
> index 7dce39c8c034..1f0492830f2c 100644
> --- a/arch/x86/realmode/init.c
> +++ b/arch/x86/realmode/init.c
> @@ -34,6 +34,17 @@ void __init reserve_real_mode(void)
>  
>  	memblock_reserve(mem, size);
>  	set_real_mode_mem(mem);
> +
> +#ifdef CONFIG_KEXEC_CORE
> +	/*
> +	 * When the crashkernel option is specified, only use the low
> +	 * 1MiB for the real mode trampoline.
> +	 */
> +	if (strstr(boot_command_line, "crashkernel=")) {
> +		memblock_reserve(0, 1<<20);
> +		pr_info("Reserving the low 1MiB of memory for crashkernel\n");
> +	}
> +#endif /* CONFIG_KEXEC_CORE */

This ifdeffery needs to be a function in kernel/kexec_core.c which is
called by reserve_real_mode(), instead.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/3 v4] x86/kdump: always reserve the low 1MiB when the crashkernel option is specified
@ 2019-10-22  8:30     ` Borislav Petkov
  0 siblings, 0 replies; 22+ messages in thread
From: Borislav Petkov @ 2019-10-22  8:30 UTC (permalink / raw)
  To: Lianbo Jiang
  Cc: jgross, Thomas.Lendacky, bhe, x86, kexec, linux-kernel, dhowells,
	mingo, ebiederm, hpa, tglx, dyoung, vgoyal

On Thu, Oct 17, 2019 at 05:43:45PM +0800, Lianbo Jiang wrote:
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204793

Put that as a Link: below.

> Kdump kernel will reuse the first 640k region because of some reasons,

s/ of some reasons//

> for example: the trampline and conventional PC system BIOS region may

spellcheck: s/trampline/trampoline/

I see two more typos in here and if you had a spellchecker enabled in
your editor where you write the commit message, you'll see them too.
Please use one.

> require to allocate memory in this area. Obviously, kdump kernel will
> also overwrite the first 640k region,

Well, it is not obvious to me. Please be more specific: why would the
kdump kernel do that?

> therefore, kernel has to copy
> the contents of the first 640k area to a backup area, which is done in
> purgatory(), because vmcore may need the old memory. When vmcore is
> dumped, kdump kernel will read the old memory from the backup area of
> the first 640k area.
> 
> Basically, the main reason should be clear, kernel does not correctly
> handle the first 640k region when SME is active,

If you mention the actual reason here, that sentence would be clearer:

"When SME is enabled in the first kernel, the kdump kernel must access
the first kernel's memory with the encryption bit set."

Something like that. 

> which causes that
> kernel does not properly copy these old memory to the backup area in
> purgatory(). Therefore, kdump kernel reads out the incorrect contents

s/incorrect/encrypted/

> from the backup area when dumping vmcore. Finally, the phenomenon is

phenomenon?

> as follow:
> 
> [root linux]$ crash vmlinux /var/crash/127.0.0.1-2019-09-19-08\:31\:27/vmcore
> WARNING: kernel relocated [240MB]: patching 97110 gdb minimal_symbol values
> 
>       KERNEL: /var/crash/127.0.0.1-2019-09-19-08:31:27/vmlinux
>     DUMPFILE: /var/crash/127.0.0.1-2019-09-19-08:31:27/vmcore  [PARTIAL DUMP]
>         CPUS: 128
>         DATE: Thu Sep 19 08:31:18 2019
>       UPTIME: 00:01:21
> LOAD AVERAGE: 0.16, 0.07, 0.02
>        TASKS: 1343
>     NODENAME: amd-ethanol
>      RELEASE: 5.3.0-rc7+
>      VERSION: #4 SMP Thu Sep 19 08:14:00 EDT 2019
>      MACHINE: x86_64  (2195 Mhz)
>       MEMORY: 127.9 GB
>        PANIC: "Kernel panic - not syncing: sysrq triggered crash"
>          PID: 9789
>      COMMAND: "bash"
>         TASK: "ffff89711894ae80  [THREAD_INFO: ffff89711894ae80]"
>          CPU: 83
>        STATE: TASK_RUNNING (PANIC)
> 
> crash> kmem -s|grep -i invalid
> kmem: dma-kmalloc-512: slab:ffffd77680001c00 invalid freepointer:a6086ac099f0c5a4
> kmem: dma-kmalloc-512: slab:ffffd77680001c00 invalid freepointer:a6086ac099f0c5a4
> crash>

I fail to see what that's trying to tell me? You have invalid pointers?

> BTW: I also tried to fix the above problem in purgatory(), but there
> are too many restricts in purgatory() context, for example: i can't
> allocate new memory to create the identity mapping page table for SME
> situation.

This paragraph belongs under the "---" line below.

> Currently, there are two places where the first 640k area is needed,
> the first one is in the find_trampoline_placement(), another one is
> in the reserve_real_mode(), and their content doesn't matter.
> 
> To avoid the above error, when the crashkernel kernel command line
> option is specified, lets reserve the remaining low 1MiB memory(
> after reserving real mode memroy) so that the allocated memory does
> not fall into the low 1MiB area, which makes us not to copy the first
> 640k content to a backup region in purgatory(). This indicates that
> it does not need to be included in crash dumps or used for anything
> execept the processor trampolines that must live in the low 1MiB.
> 
> In addition, also need to clean all the code related to the backup
> region later.

Ditto.

> Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
> ---
>  arch/x86/realmode/init.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/arch/x86/realmode/init.c b/arch/x86/realmode/init.c
> index 7dce39c8c034..1f0492830f2c 100644
> --- a/arch/x86/realmode/init.c
> +++ b/arch/x86/realmode/init.c
> @@ -34,6 +34,17 @@ void __init reserve_real_mode(void)
>  
>  	memblock_reserve(mem, size);
>  	set_real_mode_mem(mem);
> +
> +#ifdef CONFIG_KEXEC_CORE
> +	/*
> +	 * When the crashkernel option is specified, only use the low
> +	 * 1MiB for the real mode trampoline.
> +	 */
> +	if (strstr(boot_command_line, "crashkernel=")) {
> +		memblock_reserve(0, 1<<20);
> +		pr_info("Reserving the low 1MiB of memory for crashkernel\n");
> +	}
> +#endif /* CONFIG_KEXEC_CORE */

This ifdeffery needs to be a function in kernel/kexec_core.c which is
called by reserve_real_mode(), instead.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 1/3 v4] x86/kdump: always reserve the low 1MiB when the crashkernel option is specified
  2019-10-17  9:43 [PATCH 0/3 v4] x86/kdump: Fix 'kmem -s' reported an invalid freepointer when SME was active Lianbo Jiang
@ 2019-10-17  9:43   ` Lianbo Jiang
  0 siblings, 0 replies; 22+ messages in thread
From: Lianbo Jiang @ 2019-10-17  9:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: tglx, mingo, bp, hpa, x86, bhe, dyoung, jgross, dhowells,
	Thomas.Lendacky, ebiederm, vgoyal, kexec

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204793

Kdump kernel will reuse the first 640k region because of some reasons,
for example: the trampline and conventional PC system BIOS region may
require to allocate memory in this area. Obviously, kdump kernel will
also overwrite the first 640k region, therefore, kernel has to copy
the contents of the first 640k area to a backup area, which is done in
purgatory(), because vmcore may need the old memory. When vmcore is
dumped, kdump kernel will read the old memory from the backup area of
the first 640k area.

Basically, the main reason should be clear, kernel does not correctly
handle the first 640k region when SME is active, which causes that
kernel does not properly copy these old memory to the backup area in
purgatory(). Therefore, kdump kernel reads out the incorrect contents
from the backup area when dumping vmcore. Finally, the phenomenon is
as follow:

[root linux]$ crash vmlinux /var/crash/127.0.0.1-2019-09-19-08\:31\:27/vmcore
WARNING: kernel relocated [240MB]: patching 97110 gdb minimal_symbol values

      KERNEL: /var/crash/127.0.0.1-2019-09-19-08:31:27/vmlinux
    DUMPFILE: /var/crash/127.0.0.1-2019-09-19-08:31:27/vmcore  [PARTIAL DUMP]
        CPUS: 128
        DATE: Thu Sep 19 08:31:18 2019
      UPTIME: 00:01:21
LOAD AVERAGE: 0.16, 0.07, 0.02
       TASKS: 1343
    NODENAME: amd-ethanol
     RELEASE: 5.3.0-rc7+
     VERSION: #4 SMP Thu Sep 19 08:14:00 EDT 2019
     MACHINE: x86_64  (2195 Mhz)
      MEMORY: 127.9 GB
       PANIC: "Kernel panic - not syncing: sysrq triggered crash"
         PID: 9789
     COMMAND: "bash"
        TASK: "ffff89711894ae80  [THREAD_INFO: ffff89711894ae80]"
         CPU: 83
       STATE: TASK_RUNNING (PANIC)

crash> kmem -s|grep -i invalid
kmem: dma-kmalloc-512: slab:ffffd77680001c00 invalid freepointer:a6086ac099f0c5a4
kmem: dma-kmalloc-512: slab:ffffd77680001c00 invalid freepointer:a6086ac099f0c5a4
crash>

BTW: I also tried to fix the above problem in purgatory(), but there
are too many restricts in purgatory() context, for example: i can't
allocate new memory to create the identity mapping page table for SME
situation.

Currently, there are two places where the first 640k area is needed,
the first one is in the find_trampoline_placement(), another one is
in the reserve_real_mode(), and their content doesn't matter.

To avoid the above error, when the crashkernel kernel command line
option is specified, lets reserve the remaining low 1MiB memory(
after reserving real mode memroy) so that the allocated memory does
not fall into the low 1MiB area, which makes us not to copy the first
640k content to a backup region in purgatory(). This indicates that
it does not need to be included in crash dumps or used for anything
execept the processor trampolines that must live in the low 1MiB.

In addition, also need to clean all the code related to the backup
region later.

Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
---
 arch/x86/realmode/init.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/arch/x86/realmode/init.c b/arch/x86/realmode/init.c
index 7dce39c8c034..1f0492830f2c 100644
--- a/arch/x86/realmode/init.c
+++ b/arch/x86/realmode/init.c
@@ -34,6 +34,17 @@ void __init reserve_real_mode(void)
 
 	memblock_reserve(mem, size);
 	set_real_mode_mem(mem);
+
+#ifdef CONFIG_KEXEC_CORE
+	/*
+	 * When the crashkernel option is specified, only use the low
+	 * 1MiB for the real mode trampoline.
+	 */
+	if (strstr(boot_command_line, "crashkernel=")) {
+		memblock_reserve(0, 1<<20);
+		pr_info("Reserving the low 1MiB of memory for crashkernel\n");
+	}
+#endif /* CONFIG_KEXEC_CORE */
 }
 
 static void __init setup_real_mode(void)
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 1/3 v4] x86/kdump: always reserve the low 1MiB when the crashkernel option is specified
@ 2019-10-17  9:43   ` Lianbo Jiang
  0 siblings, 0 replies; 22+ messages in thread
From: Lianbo Jiang @ 2019-10-17  9:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: jgross, Thomas.Lendacky, bhe, x86, kexec, dhowells, mingo, bp,
	ebiederm, hpa, tglx, dyoung, vgoyal

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204793

Kdump kernel will reuse the first 640k region because of some reasons,
for example: the trampline and conventional PC system BIOS region may
require to allocate memory in this area. Obviously, kdump kernel will
also overwrite the first 640k region, therefore, kernel has to copy
the contents of the first 640k area to a backup area, which is done in
purgatory(), because vmcore may need the old memory. When vmcore is
dumped, kdump kernel will read the old memory from the backup area of
the first 640k area.

Basically, the main reason should be clear, kernel does not correctly
handle the first 640k region when SME is active, which causes that
kernel does not properly copy these old memory to the backup area in
purgatory(). Therefore, kdump kernel reads out the incorrect contents
from the backup area when dumping vmcore. Finally, the phenomenon is
as follow:

[root linux]$ crash vmlinux /var/crash/127.0.0.1-2019-09-19-08\:31\:27/vmcore
WARNING: kernel relocated [240MB]: patching 97110 gdb minimal_symbol values

      KERNEL: /var/crash/127.0.0.1-2019-09-19-08:31:27/vmlinux
    DUMPFILE: /var/crash/127.0.0.1-2019-09-19-08:31:27/vmcore  [PARTIAL DUMP]
        CPUS: 128
        DATE: Thu Sep 19 08:31:18 2019
      UPTIME: 00:01:21
LOAD AVERAGE: 0.16, 0.07, 0.02
       TASKS: 1343
    NODENAME: amd-ethanol
     RELEASE: 5.3.0-rc7+
     VERSION: #4 SMP Thu Sep 19 08:14:00 EDT 2019
     MACHINE: x86_64  (2195 Mhz)
      MEMORY: 127.9 GB
       PANIC: "Kernel panic - not syncing: sysrq triggered crash"
         PID: 9789
     COMMAND: "bash"
        TASK: "ffff89711894ae80  [THREAD_INFO: ffff89711894ae80]"
         CPU: 83
       STATE: TASK_RUNNING (PANIC)

crash> kmem -s|grep -i invalid
kmem: dma-kmalloc-512: slab:ffffd77680001c00 invalid freepointer:a6086ac099f0c5a4
kmem: dma-kmalloc-512: slab:ffffd77680001c00 invalid freepointer:a6086ac099f0c5a4
crash>

BTW: I also tried to fix the above problem in purgatory(), but there
are too many restricts in purgatory() context, for example: i can't
allocate new memory to create the identity mapping page table for SME
situation.

Currently, there are two places where the first 640k area is needed,
the first one is in the find_trampoline_placement(), another one is
in the reserve_real_mode(), and their content doesn't matter.

To avoid the above error, when the crashkernel kernel command line
option is specified, lets reserve the remaining low 1MiB memory(
after reserving real mode memroy) so that the allocated memory does
not fall into the low 1MiB area, which makes us not to copy the first
640k content to a backup region in purgatory(). This indicates that
it does not need to be included in crash dumps or used for anything
execept the processor trampolines that must live in the low 1MiB.

In addition, also need to clean all the code related to the backup
region later.

Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
---
 arch/x86/realmode/init.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/arch/x86/realmode/init.c b/arch/x86/realmode/init.c
index 7dce39c8c034..1f0492830f2c 100644
--- a/arch/x86/realmode/init.c
+++ b/arch/x86/realmode/init.c
@@ -34,6 +34,17 @@ void __init reserve_real_mode(void)
 
 	memblock_reserve(mem, size);
 	set_real_mode_mem(mem);
+
+#ifdef CONFIG_KEXEC_CORE
+	/*
+	 * When the crashkernel option is specified, only use the low
+	 * 1MiB for the real mode trampoline.
+	 */
+	if (strstr(boot_command_line, "crashkernel=")) {
+		memblock_reserve(0, 1<<20);
+		pr_info("Reserving the low 1MiB of memory for crashkernel\n");
+	}
+#endif /* CONFIG_KEXEC_CORE */
 }
 
 static void __init setup_real_mode(void)
-- 
2.17.1


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply related	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2019-10-24 11:25 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <mailman.5359.1571746554.2486.kexec@lists.infradead.org>
2019-10-22 13:38 ` [PATCH 1/3 v4] x86/kdump: always reserve the low 1MiB when the crashkernel option is specified Dave Anderson
     [not found] <55902207.7797907.1571751831873.JavaMail.zimbra@redhat.com>
2019-10-22 13:43 ` Dave Anderson
2019-10-17  9:43 [PATCH 0/3 v4] x86/kdump: Fix 'kmem -s' reported an invalid freepointer when SME was active Lianbo Jiang
2019-10-17  9:43 ` [PATCH 1/3 v4] x86/kdump: always reserve the low 1MiB when the crashkernel option is specified Lianbo Jiang
2019-10-17  9:43   ` Lianbo Jiang
2019-10-22  8:30   ` Borislav Petkov
2019-10-22  8:30     ` Borislav Petkov
2019-10-23  5:23     ` lijiang
2019-10-23  5:23       ` lijiang
2019-10-23  7:43       ` Borislav Petkov
2019-10-23  7:43         ` Borislav Petkov
2019-10-23  5:35     ` lijiang
2019-10-23  5:35       ` lijiang
2019-10-23  7:46       ` Borislav Petkov
2019-10-23  7:46         ` Borislav Petkov
2019-10-23  9:20         ` lijiang
2019-10-23  9:20           ` lijiang
2019-10-24  8:13       ` d.hatayama
2019-10-24  8:13         ` d.hatayama
2019-10-24  9:10         ` Borislav Petkov
2019-10-24  9:10           ` Borislav Petkov
2019-10-24 11:24         ` lijiang
2019-10-24 11:24           ` lijiang

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.