From: Dave Young <dyoung@redhat.com>
To: x86@kernel.org
Cc: Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	"H. Peter Anvin" <hpa@zytor.com>,
	linux-kernel@vger.kernel.org, kexec@lists.infradead.org,
	Baoquan He <bhe@redhat.com>,
	Eric Biederman <ebiederm@xmission.com>,
	"Huang, Kai" <kai.huang@intel.com>
Subject: [PATCH V2] x86/kexec: do not update E820 kexec table for setup_data
Date: Thu, 21 Mar 2024 17:23:20 +0800
Message-ID: <Zfv8iCL6CT2JqLIC@darkstar.users.ipa.redhat.com>

crashkernel reservation failed on a Thinkpad t440s laptop recently.
Actually the memblock reservation succeeded, but later insert_resource()
failed.

Test steps:
kexec load -> /* make sure add crashkernel param eg. crashkernel=160M */
    kexec reboot ->
        dmesg|grep "crashkernel reserved";
            crashkernel memory range like below reserved successfully:
              0x00000000d0000000 - 0x00000000da000000
            But no such "Crash kernel" region in /proc/iomem

The background story is like below:

Currently E820 code reserves setup_data regions for both the current
kernel and the kexec kernel, and it inserts them into the resources list.
Before the kexec kernel reboots nobody passes the old setup_data, and
kexec only passes fresh SETUP_EFI and SETUP_IMA if needed.  Thus the old
setup data memory is not used at all.

Due to old kernel updates the kexec e820 table as well so kexec kernel
sees them as E820_TYPE_RESERVED_KERN regions, and later the old setup_data
regions are inserted into resources list in the kexec kernel by
e820__reserve_resources().

Note, due to no setup_data is passed in for those old regions they are not
early reserved (by function early_reserve_memory), and the crashkernel
memblock reservation will just treat them as usable memory and it could
reserve the crashkernel region which overlaps with the old setup_data
regions. And just like the bug I noticed here, kdump insert_resource
failed because e820__reserve_resources has added the overlapped chunks
in /proc/iomem already.

Finally, looking at the code, the old setup_data regions are not used
at all as no setup_data is passed in by the kexec boot loader. Although
something like SETUP_PCI etc could be needed, kexec should pass
the info as new setup_data so that kexec kernel can take care of them.
This should be taken care of in other separate patches if needed.

Thus drop the useless buggy code here.

Signed-off-by: Dave Young <dyoung@redhat.com>
---
V2: changelog grammar fixes [suggestions from Huang Kai]
 arch/x86/kernel/e820.c |   16 +---------------
 1 file changed, 1 insertion(+), 15 deletions(-)

Index: linux/arch/x86/kernel/e820.c
===================================================================
--- linux.orig/arch/x86/kernel/e820.c
+++ linux/arch/x86/kernel/e820.c
@@ -1015,16 +1015,6 @@ void __init e820__reserve_setup_data(voi
 		pa_next = data->next;
 
 		e820__range_update(pa_data, sizeof(*data)+data->len, E820_TYPE_RAM, E820_TYPE_RESERVED_KERN);
-
-		/*
-		 * SETUP_EFI and SETUP_IMA are supplied by kexec and do not need
-		 * to be reserved.
-		 */
-		if (data->type != SETUP_EFI && data->type != SETUP_IMA)
-			e820__range_update_kexec(pa_data,
-						 sizeof(*data) + data->len,
-						 E820_TYPE_RAM, E820_TYPE_RESERVED_KERN);
-
 		if (data->type == SETUP_INDIRECT) {
 			len += data->len;
 			early_memunmap(data, sizeof(*data));
@@ -1036,12 +1026,9 @@ void __init e820__reserve_setup_data(voi
 
 			indirect = (struct setup_indirect *)data->data;
 
-			if (indirect->type != SETUP_INDIRECT) {
+			if (indirect->type != SETUP_INDIRECT)
 				e820__range_update(indirect->addr, indirect->len,
 						   E820_TYPE_RAM, E820_TYPE_RESERVED_KERN);
-				e820__range_update_kexec(indirect->addr, indirect->len,
-							 E820_TYPE_RAM, E820_TYPE_RESERVED_KERN);
-			}
 		}
 
 		pa_data = pa_next;
@@ -1049,7 +1036,6 @@ void __init e820__reserve_setup_data(voi
 	}
 
 	e820__update_table(e820_table);
-	e820__update_table(e820_table_kexec);
 
 	pr_info("extended physical RAM map:\n");
 	e820__print_table("reserve setup_data");