* [PATCH v3 0/5] RISC-V: Add kexec/kdump support
From: Nick Kossifidis @ 2021-04-05 8:57 UTC
To: linux-riscv, palmer; +Cc: paul.walmsley, linux-kernel, Nick Kossifidis

This patch series adds kexec/kdump and crash kernel support on RISC-V.
For testing the patches a patched version of kexec-tools is needed
(still a work in progress), which can be found at:

https://riscv.ics.forth.gr/kexec-tools-patched.tar.xz

v3:
 * Rebase on newer kernel tree
 * Minor cleanups
 * Split UAPI changes to a separate patch
 * Improve / cleanup init_resources
 * Resolve Palmer's comments

v2:
 * Rebase on newer kernel tree
 * Minor cleanups
 * Properly populate the ioresources tree, so that it can be used later
   on for implementing strict /dev/mem
 * Use linux,usable-memory on /memory instead of a new binding
 * Use a reserved-memory node for ELF core header

Nick Kossifidis (5):
  RISC-V: Add EM_RISCV to kexec UAPI header
  RISC-V: Add kexec support
  RISC-V: Improve init_resources
  RISC-V: Add kdump support
  RISC-V: Add crash kernel support

 arch/riscv/Kconfig                  |  25 ++++
 arch/riscv/include/asm/elf.h        |   6 +
 arch/riscv/include/asm/kexec.h      |  54 +++++++
 arch/riscv/kernel/Makefile          |   6 +
 arch/riscv/kernel/crash_dump.c      |  46 ++++++
 arch/riscv/kernel/crash_save_regs.S |  56 +++++++
 arch/riscv/kernel/kexec_relocate.S  | 222 ++++++++++++++++++++++++++++
 arch/riscv/kernel/machine_kexec.c   | 193 ++++++++++++++++++++++++
 arch/riscv/kernel/setup.c           | 113 ++++++++------
 arch/riscv/mm/init.c                | 110 ++++++++++++++
 include/uapi/linux/kexec.h          |   1 +
 11 files changed, 787 insertions(+), 45 deletions(-)
 create mode 100644 arch/riscv/include/asm/kexec.h
 create mode 100644 arch/riscv/kernel/crash_dump.c
 create mode 100644 arch/riscv/kernel/crash_save_regs.S
 create mode 100644 arch/riscv/kernel/kexec_relocate.S
 create mode 100644 arch/riscv/kernel/machine_kexec.c

-- 
2.26.2

* [PATCH v3 1/5] RISC-V: Add EM_RISCV to kexec UAPI header
From: Nick Kossifidis @ 2021-04-05 8:57 UTC
To: linux-riscv, palmer; +Cc: paul.walmsley, linux-kernel, Nick Kossifidis

Add RISC-V to the list of supported kexec architectures; we need to
add the definition early on so that later patches can use it.

EM_RISCV is 243 as per ELF psABI specification here:
https://github.com/riscv/riscv-elf-psabi-doc/blob/master/riscv-elf.md

Signed-off-by: Nick Kossifidis <mick@ics.forth.gr>
---
 include/uapi/linux/kexec.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/uapi/linux/kexec.h b/include/uapi/linux/kexec.h
index 05669c87a..778dc191c 100644
--- a/include/uapi/linux/kexec.h
+++ b/include/uapi/linux/kexec.h
@@ -42,6 +42,7 @@
 #define KEXEC_ARCH_MIPS_LE (10 << 16)
 #define KEXEC_ARCH_MIPS    ( 8 << 16)
 #define KEXEC_ARCH_AARCH64 (183 << 16)
+#define KEXEC_ARCH_RISCV   (243 << 16)
 
 /* The artificial cap on the number of segments passed to kexec_load. */
 #define KEXEC_SEGMENT_MAX 16
-- 
2.26.2
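
For readers unfamiliar with the encoding: each KEXEC_ARCH_* constant
carries the ELF e_machine number in the upper 16 bits of the kexec_load()
flags word, which is why the new value is EM_RISCV (243) shifted left by
16. The following is a minimal user-space sketch of that packing, not
kernel code. KEXEC_ARCH_MASK and KEXEC_ON_CRASH are assumed to have the
values defined elsewhere in the same UAPI header (they are not visible in
the hunk above), and kexec_flags_to_machine() is a hypothetical helper
named for illustration only.

```c
/* Values as in include/uapi/linux/kexec.h; the last one is the new define. */
#define KEXEC_ON_CRASH     0x00000001UL  /* assumed: low bits hold flags  */
#define KEXEC_ARCH_MASK    0xffff0000UL  /* assumed: high bits hold arch  */
#define KEXEC_ARCH_AARCH64 (183 << 16)
#define KEXEC_ARCH_RISCV   (243 << 16)

/*
 * Hypothetical helper: recover the ELF e_machine number from a
 * kexec_load() flags word by masking off the flag bits and shifting.
 */
static unsigned int kexec_flags_to_machine(unsigned long flags)
{
	return (unsigned int)((flags & KEXEC_ARCH_MASK) >> 16);
}
```

Because the architecture lives in the high half of the word, it can be
OR-ed with flag bits such as KEXEC_ON_CRASH without either field
disturbing the other.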

* Re: [PATCH v3 1/5] RISC-V: Add EM_RISCV to kexec UAPI header
From: Palmer Dabbelt @ 2021-04-23 3:30 UTC
To: mick, ebiederm, kexec; +Cc: linux-riscv, Paul Walmsley, linux-kernel, mick

On Mon, 05 Apr 2021 01:57:08 PDT (-0700), mick@ics.forth.gr wrote:
> Add RISC-V to the list of supported kexec architectures; we need to
> add the definition early on so that later patches can use it.
>
> EM_RISCV is 243 as per ELF psABI specification here:
> https://github.com/riscv/riscv-elf-psabi-doc/blob/master/riscv-elf.md
>
> Signed-off-by: Nick Kossifidis <mick@ics.forth.gr>
> ---
>  include/uapi/linux/kexec.h | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/include/uapi/linux/kexec.h b/include/uapi/linux/kexec.h
> index 05669c87a..778dc191c 100644
> --- a/include/uapi/linux/kexec.h
> +++ b/include/uapi/linux/kexec.h
> @@ -42,6 +42,7 @@
>  #define KEXEC_ARCH_MIPS_LE (10 << 16)
>  #define KEXEC_ARCH_MIPS    ( 8 << 16)
>  #define KEXEC_ARCH_AARCH64 (183 << 16)
> +#define KEXEC_ARCH_RISCV   (243 << 16)
>
>  /* The artificial cap on the number of segments passed to kexec_load. */
>  #define KEXEC_SEGMENT_MAX 16

This is missing the kexec maintainers, who I've added.

I'm happy to just take this along with the rest of the patch set, as
that's probably easiest. I usually like to get an Ack on this sort of
thing, but I'm just going to speculate that this isn't controversial and
put this on riscv/for-next. LMK if you want me to do something more
complicated like a shared tag, but I see the arm64 stuff went in via the
arm64 tree so I'm assuming this is fine.

Thanks!

* [PATCH v3 2/5] RISC-V: Add kexec support
From: Nick Kossifidis @ 2021-04-05 8:57 UTC
To: linux-riscv, palmer; +Cc: paul.walmsley, linux-kernel, Nick Kossifidis

This patch adds support for kexec on RISC-V. On SMP systems it depends
on HOTPLUG_CPU in order to be able to bring up all harts after kexec.
It also needs a recent OpenSBI version that supports the HSM extension.
I tested it on riscv64 QEMU on both an smp and a non-smp system.

v5:
 * For now depend on MMU, further changes needed for NOMMU support
 * Make sure stvec is aligned
 * Cleanup some unneeded fences
 * Verify control code's buffer size
 * Compile kexec_relocate.S with medany and norelax

v4:
 * No functional changes, just re-based

v3:
 * Use the new smp_shutdown_nonboot_cpus() call.
 * Move riscv_kexec_relocate to .rodata

v2:
 * Pass needed parameters as arguments to riscv_kexec_relocate
   instead of using global variables.
 * Use kimage_arch to hold the fdt address of the included fdt.
 * Use SYM_* macros on kexec_relocate.S.
 * Compatibility with STRICT_KERNEL_RWX.
 * Compatibility with HOTPLUG_CPU for SMP
 * Small cleanups

Signed-off-by: Nick Kossifidis <mick@ics.forth.gr>
---
 arch/riscv/Kconfig                 |  15 +++
 arch/riscv/include/asm/kexec.h     |  47 ++++++++
 arch/riscv/kernel/Makefile         |   5 +
 arch/riscv/kernel/kexec_relocate.S | 156 ++++++++++++++++++++++++
 arch/riscv/kernel/machine_kexec.c  | 186 +++++++++++++++++++++++++++++
 5 files changed, 409 insertions(+)
 create mode 100644 arch/riscv/include/asm/kexec.h
 create mode 100644 arch/riscv/kernel/kexec_relocate.S
 create mode 100644 arch/riscv/kernel/machine_kexec.c

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 8ea60a0a1..3716262ef 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -389,6 +389,21 @@ config RISCV_SBI_V01
 	help
 	  This config allows kernel to use SBI v0.1 APIs. This will be
 	  deprecated in future once legacy M-mode software are no longer in use.
+
+config KEXEC
+	bool "Kexec system call"
+	select KEXEC_CORE
+	select HOTPLUG_CPU if SMP
+	depends on MMU
+	help
+	  kexec is a system call that implements the ability to shutdown your
+	  current kernel, and to start another kernel. It is like a reboot
+	  but it is independent of the system firmware. And like a reboot
+	  you can start any kernel with it, not just Linux.
+
+	  The name comes from the similarity to the exec system call.
+
+
 endmenu
 
 menu "Boot options"
diff --git a/arch/riscv/include/asm/kexec.h b/arch/riscv/include/asm/kexec.h
new file mode 100644
index 000000000..efc69feb4
--- /dev/null
+++ b/arch/riscv/include/asm/kexec.h
@@ -0,0 +1,47 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2019 FORTH-ICS/CARV
+ *  Nick Kossifidis <mick@ics.forth.gr>
+ */
+
+#ifndef _RISCV_KEXEC_H
+#define _RISCV_KEXEC_H
+
+/* Maximum physical address we can use pages from */
+#define KEXEC_SOURCE_MEMORY_LIMIT (-1UL)
+
+/* Maximum address we can reach in physical address mode */
+#define KEXEC_DESTINATION_MEMORY_LIMIT (-1UL)
+
+/* Maximum address we can use for the control code buffer */
+#define KEXEC_CONTROL_MEMORY_LIMIT (-1UL)
+
+/* Reserve a page for the control code buffer */
+#define KEXEC_CONTROL_PAGE_SIZE 4096
+
+#define KEXEC_ARCH KEXEC_ARCH_RISCV
+
+static inline void
+crash_setup_regs(struct pt_regs *newregs,
+		 struct pt_regs *oldregs)
+{
+	/* Dummy implementation for now */
+}
+
+
+#define ARCH_HAS_KIMAGE_ARCH
+
+struct kimage_arch {
+	unsigned long fdt_addr;
+};
+
+extern const unsigned char riscv_kexec_relocate[];
+extern const unsigned int riscv_kexec_relocate_size;
+
+typedef void (*riscv_kexec_do_relocate)(unsigned long first_ind_entry,
+					unsigned long jump_addr,
+					unsigned long fdt_addr,
+					unsigned long hartid,
+					unsigned long va_pa_off);
+
+#endif
diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile
index 3dc0abde9..c2594018c 100644
--- a/arch/riscv/kernel/Makefile
+++ b/arch/riscv/kernel/Makefile
@@ -9,6 +9,10 @@ CFLAGS_REMOVE_patch.o = $(CC_FLAGS_FTRACE)
 CFLAGS_REMOVE_sbi.o = $(CC_FLAGS_FTRACE)
 endif
 
+ifdef CONFIG_KEXEC
+AFLAGS_kexec_relocate.o := -mcmodel=medany -mno-relax
+endif
+
 extra-y += head.o
 extra-y += vmlinux.lds
 
@@ -54,6 +58,7 @@ obj-$(CONFIG_SMP) += cpu_ops_sbi.o
 endif
 obj-$(CONFIG_HOTPLUG_CPU) += cpu-hotplug.o
 obj-$(CONFIG_KGDB) += kgdb.o
+obj-${CONFIG_KEXEC} += kexec_relocate.o machine_kexec.o
 
 obj-$(CONFIG_JUMP_LABEL) += jump_label.o
 
diff --git a/arch/riscv/kernel/kexec_relocate.S b/arch/riscv/kernel/kexec_relocate.S
new file mode 100644
index 000000000..616c20771
--- /dev/null
+++ b/arch/riscv/kernel/kexec_relocate.S
@@ -0,0 +1,156 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2019 FORTH-ICS/CARV
+ *  Nick Kossifidis <mick@ics.forth.gr>
+ */
+
+#include <asm/asm.h>	/* For RISCV_* and REG_* macros */
+#include <asm/page.h>	/* For PAGE_SHIFT */
+#include <linux/linkage.h> /* For SYM_* macros */
+
+.section ".rodata"
+SYM_CODE_START(riscv_kexec_relocate)
+
+	/*
+	 * s0: Pointer to the current entry
+	 * s1: (const) Phys address to jump to after relocation
+	 * s2: (const) Phys address of the FDT image
+	 * s3: (const) The hartid of the current hart
+	 * s4: Pointer to the destination address for the relocation
+	 * s5: (const) Number of words per page
+	 * s6: (const) 1, used for subtraction
+	 * s7: (const) va_pa_offset, used when switching MMU off
+	 * s8: (const) Physical address of the main loop
+	 * s9: (debug) indirection page counter
+	 * s10: (debug) entry counter
+	 * s11: (debug) copied words counter
+	 */
+	mv	s0, a0
+	mv	s1, a1
+	mv	s2, a2
+	mv	s3, a3
+	mv	s4, zero
+	li	s5, ((1 << PAGE_SHIFT) / RISCV_SZPTR)
+	li	s6, 1
+	mv	s7, a4
+	mv	s8, zero
+	mv	s9, zero
+	mv	s10, zero
+	mv	s11, zero
+
+	/* Disable / cleanup interrupts */
+	csrw	sie, zero
+	csrw	sip, zero
+
+	/*
+	 * When we switch SATP.MODE to "Bare" we'll only
+	 * play with physical addresses. However the first time
+	 * we try to jump somewhere, the offset on the jump
+	 * will be relative to pc which will still be on VA. To
+	 * deal with this we set stvec to the physical address at
+	 * the start of the loop below so that we jump there in
+	 * any case.
+	 */
+	la	s8, 1f
+	sub	s8, s8, s7
+	csrw	stvec, s8
+
+	/* Process entries in a loop */
+.align 2
+1:
+	addi	s10, s10, 1
+	REG_L	t0, 0(s0)		/* t0 = *image->entry */
+	addi	s0, s0, RISCV_SZPTR	/* image->entry++ */
+
+	/* IND_DESTINATION entry ? -> save destination address */
+	andi	t1, t0, 0x1
+	beqz	t1, 2f
+	andi	s4, t0, ~0x1
+	j	1b
+
+2:
+	/* IND_INDIRECTION entry ? -> update next entry ptr (PA) */
+	andi	t1, t0, 0x2
+	beqz	t1, 2f
+	andi	s0, t0, ~0x2
+	addi	s9, s9, 1
+	csrw	sptbr, zero
+	jalr	zero, s8, 0
+
+2:
+	/* IND_DONE entry ? -> jump to done label */
+	andi	t1, t0, 0x4
+	beqz	t1, 2f
+	j	4f
+
+2:
+	/*
+	 * IND_SOURCE entry ? -> copy page word by word to the
+	 * destination address we got from IND_DESTINATION
+	 */
+	andi	t1, t0, 0x8
+	beqz	t1, 1b		/* Unknown entry type, ignore it */
+	andi	t0, t0, ~0x8
+	mv	t3, s5		/* i = num words per page */
+3:	/* copy loop */
+	REG_L	t1, (t0)	/* t1 = *src_ptr */
+	REG_S	t1, (s4)	/* *dst_ptr = *src_ptr */
+	addi	t0, t0, RISCV_SZPTR /* src_ptr++ */
+	addi	s4, s4, RISCV_SZPTR /* dst_ptr++ */
+	sub	t3, t3, s6	/* i-- */
+	addi	s11, s11, 1	/* c++ */
+	beqz	t3, 1b		/* copy done ? */
+	j	3b
+
+4:
+	/* Pass the arguments to the next kernel / Cleanup */
+	mv	a0, s3
+	mv	a1, s2
+	mv	a2, s1
+
+	/* Cleanup */
+	mv	a3, zero
+	mv	a4, zero
+	mv	a5, zero
+	mv	a6, zero
+	mv	a7, zero
+
+	mv	s0, zero
+	mv	s1, zero
+	mv	s2, zero
+	mv	s3, zero
+	mv	s4, zero
+	mv	s5, zero
+	mv	s6, zero
+	mv	s7, zero
+	mv	s8, zero
+	mv	s9, zero
+	mv	s10, zero
+	mv	s11, zero
+
+	mv	t0, zero
+	mv	t1, zero
+	mv	t2, zero
+	mv	t3, zero
+	mv	t4, zero
+	mv	t5, zero
+	mv	t6, zero
+	csrw	sepc, zero
+	csrw	scause, zero
+	csrw	sscratch, zero
+
+	/*
+	 * Make sure the relocated code is visible
+	 * and jump to the new kernel
+	 */
+	fence.i
+
+	jalr	zero, a2, 0
+
+SYM_CODE_END(riscv_kexec_relocate)
+riscv_kexec_relocate_end:
+
+	.section ".rodata"
+SYM_DATA(riscv_kexec_relocate_size,
+	.long riscv_kexec_relocate_end - riscv_kexec_relocate)
+
diff --git a/arch/riscv/kernel/machine_kexec.c b/arch/riscv/kernel/machine_kexec.c
new file mode 100644
index 000000000..2ce6c3daf
--- /dev/null
+++ b/arch/riscv/kernel/machine_kexec.c
@@ -0,0 +1,186 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2019 FORTH-ICS/CARV
+ *  Nick Kossifidis <mick@ics.forth.gr>
+ */
+
+#include <linux/kexec.h>
+#include <asm/kexec.h>		/* For riscv_kexec_* symbol defines */
+#include <linux/smp.h>		/* For smp_send_stop () */
+#include <asm/cacheflush.h>	/* For local_flush_icache_all() */
+#include <asm/barrier.h>	/* For smp_wmb() */
+#include <asm/page.h>		/* For PAGE_MASK */
+#include <linux/libfdt.h>	/* For fdt_check_header() */
+#include <asm/set_memory.h>	/* For set_memory_x() */
+#include <linux/compiler.h>	/* For unreachable() */
+#include <linux/cpu.h>		/* For cpu_down() */
+
+/**
+ * kexec_image_info - Print received image details
+ */
+static void
+kexec_image_info(const struct kimage *image)
+{
+	unsigned long i;
+
+	pr_debug("Kexec image info:\n");
+	pr_debug("\ttype: %d\n", image->type);
+	pr_debug("\tstart: %lx\n", image->start);
+	pr_debug("\thead: %lx\n", image->head);
+	pr_debug("\tnr_segments: %lu\n", image->nr_segments);
+
+	for (i = 0; i < image->nr_segments; i++) {
+		pr_debug("\t segment[%lu]: %016lx - %016lx", i,
+			image->segment[i].mem,
+			image->segment[i].mem + image->segment[i].memsz);
+		pr_debug("\t\t0x%lx bytes, %lu pages\n",
+			(unsigned long) image->segment[i].memsz,
+			(unsigned long) image->segment[i].memsz / PAGE_SIZE);
+	}
+}
+
+/**
+ * machine_kexec_prepare - Initialize kexec
+ *
+ * This function is called from do_kexec_load, when the user has
+ * provided us with an image to be loaded. Its goal is to validate
+ * the image and prepare the control code buffer as needed.
+ * Note that kimage_alloc_init has already been called and the
+ * control buffer has already been allocated.
+ */
+int
+machine_kexec_prepare(struct kimage *image)
+{
+	struct kimage_arch *internal = &image->arch;
+	struct fdt_header fdt = {0};
+	void *control_code_buffer = NULL;
+	unsigned int control_code_buffer_sz = 0;
+	int i = 0;
+
+	kexec_image_info(image);
+
+	if (image->type == KEXEC_TYPE_CRASH) {
+		pr_warn("Loading a crash kernel is unsupported for now.\n");
+		return -EINVAL;
+	}
+
+	/* Find the Flattened Device Tree and save its physical address */
+	for (i = 0; i < image->nr_segments; i++) {
+		if (image->segment[i].memsz <= sizeof(fdt))
+			continue;
+
+		if (copy_from_user(&fdt, image->segment[i].buf, sizeof(fdt)))
+			continue;
+
+		if (fdt_check_header(&fdt))
+			continue;
+
+		internal->fdt_addr = (unsigned long) image->segment[i].mem;
+		break;
+	}
+
+	if (!internal->fdt_addr) {
+		pr_err("Device tree not included in the provided image\n");
+		return -EINVAL;
+	}
+
+	/* Copy the assembler code for relocation to the control page */
+	control_code_buffer = page_address(image->control_code_page);
+	control_code_buffer_sz = page_size(image->control_code_page);
+	if (unlikely(riscv_kexec_relocate_size > control_code_buffer_sz)) {
+		pr_err("Relocation code doesn't fit within a control page\n");
+		return -EINVAL;
+	}
+	memcpy(control_code_buffer, riscv_kexec_relocate,
+		riscv_kexec_relocate_size);
+
+	/* Mark the control page executable */
+	set_memory_x((unsigned long) control_code_buffer, 1);
+
+	return 0;
+}
+
+
+/**
+ * machine_kexec_cleanup - Cleanup any leftovers from
+ *			   machine_kexec_prepare
+ *
+ * This function is called by kimage_free to handle any arch-specific
+ * allocations done on machine_kexec_prepare. Since we didn't do any
+ * allocations there, this is just an empty function. Note that the
+ * control buffer is freed by kimage_free.
+ */
+void
+machine_kexec_cleanup(struct kimage *image)
+{
+}
+
+
+/*
+ * machine_shutdown - Prepare for a kexec reboot
+ *
+ * This function is called by kernel_kexec just before machine_kexec
+ * below. Its goal is to prepare the rest of the system (the other
+ * harts and possibly devices etc) for a kexec reboot.
+ */
+void machine_shutdown(void)
+{
+	/*
+	 * No more interrupts on this hart
+	 * until we are back up.
+	 */
+	local_irq_disable();
+
+#if defined(CONFIG_HOTPLUG_CPU) && defined(CONFIG_SMP)
+	smp_shutdown_nonboot_cpus(smp_processor_id());
+#endif
+}
+
+/**
+ * machine_crash_shutdown - Prepare to kexec after a kernel crash
+ *
+ * This function is called by crash_kexec just before machine_kexec
+ * below and its goal is similar to machine_shutdown, but in case of
+ * a kernel crash. Since we don't handle such cases yet, this function
+ * is empty.
+ */
+void
+machine_crash_shutdown(struct pt_regs *regs)
+{
+}
+
+/**
+ * machine_kexec - Jump to the loaded kimage
+ *
+ * This function is called by kernel_kexec which is called by the
+ * reboot system call when the reboot cmd is LINUX_REBOOT_CMD_KEXEC,
+ * or by crash_kernel which is called by the kernel's arch-specific
+ * trap handler in case of a kernel panic. It's the final stage of
+ * the kexec process where the pre-loaded kimage is ready to be
+ * executed. We assume at this point that all other harts are
+ * suspended and this hart will be the new boot hart.
+ */
+void __noreturn
+machine_kexec(struct kimage *image)
+{
+	struct kimage_arch *internal = &image->arch;
+	unsigned long jump_addr = (unsigned long) image->start;
+	unsigned long first_ind_entry = (unsigned long) &image->head;
+	unsigned long this_hart_id = raw_smp_processor_id();
+	unsigned long fdt_addr = internal->fdt_addr;
+	void *control_code_buffer = page_address(image->control_code_page);
+	riscv_kexec_do_relocate do_relocate = control_code_buffer;
+
+	pr_notice("Will call new kernel at %08lx from hart id %lx\n",
+		  jump_addr, this_hart_id);
+	pr_notice("FDT image at %08lx\n", fdt_addr);
+
+	/* Make sure the relocation code is visible to the hart */
+	local_flush_icache_all();
+
+	/* Jump to the relocation code */
+	pr_notice("Bye...\n");
+	do_relocate(first_ind_entry, jump_addr, fdt_addr,
+		    this_hart_id, va_pa_offset);
+	unreachable();
+}
-- 
2.26.2
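
The entry walk in riscv_kexec_relocate may be easier to follow in C. The
sketch below models the same loop in user space, assuming the flag values
the assembly tests for (0x1 destination, 0x2 indirection, 0x4 done, 0x8
source) and page-aligned buffers so the low bits are free for flags. It
deliberately leaves out everything that makes the real code delicate:
turning translation off, the stvec trick for landing on a physical
address, and the register cleanup before the final jump. It also uses
memcpy() where the assembly copies word by word. sketch_walk_entries()
and SKETCH_PAGE_SIZE are hypothetical names used only for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumption: 4 KiB pages */

/* Flag bits as tested by the assembly above. */
#define IND_DESTINATION 0x1
#define IND_INDIRECTION 0x2
#define IND_DONE        0x4
#define IND_SOURCE      0x8

/*
 * Hypothetical model of the relocation loop. Walks the entry list
 * starting at 'entry' and returns the number of pages copied.
 */
static size_t sketch_walk_entries(const uintptr_t *entry)
{
	unsigned char *dst = NULL;
	size_t copied = 0;

	for (;;) {
		uintptr_t e = *entry++;

		if (e & IND_DESTINATION) {
			/* Remember where the following source pages go. */
			dst = (unsigned char *)(e & ~(uintptr_t)IND_DESTINATION);
		} else if (e & IND_INDIRECTION) {
			/* Continue reading entries from another page. */
			entry = (const uintptr_t *)(e & ~(uintptr_t)IND_INDIRECTION);
		} else if (e & IND_DONE) {
			return copied;
		} else if (e & IND_SOURCE) {
			/* Guard added for the sketch; the assembly has none. */
			if (dst == NULL)
				return copied;
			memcpy(dst, (const void *)(e & ~(uintptr_t)IND_SOURCE),
			       SKETCH_PAGE_SIZE);
			dst += SKETCH_PAGE_SIZE;
			copied++;
		}
		/* Entries with none of the four bits set are skipped. */
	}
}
```

The order of the checks mirrors the assembly: destination, indirection,
done, then source. In the real code the indirection case is also where
the MMU is switched off, since every pointer read after that point is a
physical address.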
* [PATCH v3 2/5] RISC-V: Add kexec support @ 2021-04-05 8:57 ` Nick Kossifidis 0 siblings, 0 replies; 53+ messages in thread From: Nick Kossifidis @ 2021-04-05 8:57 UTC (permalink / raw) To: linux-riscv, palmer; +Cc: paul.walmsley, linux-kernel, Nick Kossifidis This patch adds support for kexec on RISC-V. On SMP systems it depends on HOTPLUG_CPU in order to be able to bring up all harts after kexec. It also needs a recent OpenSBI version that supports the HSM extension. I tested it on riscv64 QEMU on both an smp and a non-smp system. v5: * For now depend on MMU, further changes needed for NOMMU support * Make sure stvec is aligned * Cleanup some unneeded fences * Verify control code's buffer size * Compile kexec_relocate.S with medany and norelax v4: * No functional changes, just re-based v3: * Use the new smp_shutdown_nonboot_cpus() call. * Move riscv_kexec_relocate to .rodata v2: * Pass needed parameters as arguments to riscv_kexec_relocate instead of using global variables. * Use kimage_arch to hold the fdt address of the included fdt. * Use SYM_* macros on kexec_relocate.S. * Compatibility with STRICT_KERNEL_RWX. * Compatibility with HOTPLUG_CPU for SMP * Small cleanups Signed-off-by: Nick Kossifidis <mick@ics.forth.gr> --- arch/riscv/Kconfig | 15 +++ arch/riscv/include/asm/kexec.h | 47 ++++++++ arch/riscv/kernel/Makefile | 5 + arch/riscv/kernel/kexec_relocate.S | 156 ++++++++++++++++++++++++ arch/riscv/kernel/machine_kexec.c | 186 +++++++++++++++++++++++++++++ 5 files changed, 409 insertions(+) create mode 100644 arch/riscv/include/asm/kexec.h create mode 100644 arch/riscv/kernel/kexec_relocate.S create mode 100644 arch/riscv/kernel/machine_kexec.c diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig index 8ea60a0a1..3716262ef 100644 --- a/arch/riscv/Kconfig +++ b/arch/riscv/Kconfig @@ -389,6 +389,21 @@ config RISCV_SBI_V01 help This config allows kernel to use SBI v0.1 APIs. This will be deprecated in future once legacy M-mode software are no longer in use. 
+ +config KEXEC + bool "Kexec system call" + select KEXEC_CORE + select HOTPLUG_CPU if SMP + depends on MMU + help + kexec is a system call that implements the ability to shutdown your + current kernel, and to start another kernel. It is like a reboot + but it is independent of the system firmware. And like a reboot + you can start any kernel with it, not just Linux. + + The name comes from the similarity to the exec system call. + + endmenu menu "Boot options" diff --git a/arch/riscv/include/asm/kexec.h b/arch/riscv/include/asm/kexec.h new file mode 100644 index 000000000..efc69feb4 --- /dev/null +++ b/arch/riscv/include/asm/kexec.h @@ -0,0 +1,47 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (C) 2019 FORTH-ICS/CARV + * Nick Kossifidis <mick@ics.forth.gr> + */ + +#ifndef _RISCV_KEXEC_H +#define _RISCV_KEXEC_H + +/* Maximum physical address we can use pages from */ +#define KEXEC_SOURCE_MEMORY_LIMIT (-1UL) + +/* Maximum address we can reach in physical address mode */ +#define KEXEC_DESTINATION_MEMORY_LIMIT (-1UL) + +/* Maximum address we can use for the control code buffer */ +#define KEXEC_CONTROL_MEMORY_LIMIT (-1UL) + +/* Reserve a page for the control code buffer */ +#define KEXEC_CONTROL_PAGE_SIZE 4096 + +#define KEXEC_ARCH KEXEC_ARCH_RISCV + +static inline void +crash_setup_regs(struct pt_regs *newregs, + struct pt_regs *oldregs) +{ + /* Dummy implementation for now */ +} + + +#define ARCH_HAS_KIMAGE_ARCH + +struct kimage_arch { + unsigned long fdt_addr; +}; + +const extern unsigned char riscv_kexec_relocate[]; +const extern unsigned int riscv_kexec_relocate_size; + +typedef void (*riscv_kexec_do_relocate)(unsigned long first_ind_entry, + unsigned long jump_addr, + unsigned long fdt_addr, + unsigned long hartid, + unsigned long va_pa_off); + +#endif diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile index 3dc0abde9..c2594018c 100644 --- a/arch/riscv/kernel/Makefile +++ b/arch/riscv/kernel/Makefile @@ -9,6 +9,10 @@ 
CFLAGS_REMOVE_patch.o = $(CC_FLAGS_FTRACE) CFLAGS_REMOVE_sbi.o = $(CC_FLAGS_FTRACE) endif +ifdef CONFIG_KEXEC +AFLAGS_kexec_relocate.o := -mcmodel=medany -mno-relax +endif + extra-y += head.o extra-y += vmlinux.lds @@ -54,6 +58,7 @@ obj-$(CONFIG_SMP) += cpu_ops_sbi.o endif obj-$(CONFIG_HOTPLUG_CPU) += cpu-hotplug.o obj-$(CONFIG_KGDB) += kgdb.o +obj-${CONFIG_KEXEC} += kexec_relocate.o machine_kexec.o obj-$(CONFIG_JUMP_LABEL) += jump_label.o diff --git a/arch/riscv/kernel/kexec_relocate.S b/arch/riscv/kernel/kexec_relocate.S new file mode 100644 index 000000000..616c20771 --- /dev/null +++ b/arch/riscv/kernel/kexec_relocate.S @@ -0,0 +1,156 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (C) 2019 FORTH-ICS/CARV + * Nick Kossifidis <mick@ics.forth.gr> + */ + +#include <asm/asm.h> /* For RISCV_* and REG_* macros */ +#include <asm/page.h> /* For PAGE_SHIFT */ +#include <linux/linkage.h> /* For SYM_* macros */ + +.section ".rodata" +SYM_CODE_START(riscv_kexec_relocate) + + /* + * s0: Pointer to the current entry + * s1: (const) Phys address to jump to after relocation + * s2: (const) Phys address of the FDT image + * s3: (const) The hartid of the current hart + * s4: Pointer to the destination address for the relocation + * s5: (const) Number of words per page + * s6: (const) 1, used for subtraction + * s7: (const) va_pa_offset, used when switching MMU off + * s8: (const) Physical address of the main loop + * s9: (debug) indirection page counter + * s10: (debug) entry counter + * s11: (debug) copied words counter + */ + mv s0, a0 + mv s1, a1 + mv s2, a2 + mv s3, a3 + mv s4, zero + li s5, ((1 << PAGE_SHIFT) / RISCV_SZPTR) + li s6, 1 + mv s7, a4 + mv s8, zero + mv s9, zero + mv s10, zero + mv s11, zero + + /* Disable / cleanup interrupts */ + csrw sie, zero + csrw sip, zero + + /* + * When we switch SATP.MODE to "Bare" we'll only + * play with physical addresses. 
However the first time + * we try to jump somewhere, the offset on the jump + * will be relative to pc which will still be on VA. To + * deal with this we set stvec to the physical address at + * the start of the loop below so that we jump there in + * any case. + */ + la s8, 1f + sub s8, s8, s7 + csrw stvec, s8 + + /* Process entries in a loop */ +.align 2 +1: + addi s10, s10, 1 + REG_L t0, 0(s0) /* t0 = *image->entry */ + addi s0, s0, RISCV_SZPTR /* image->entry++ */ + + /* IND_DESTINATION entry ? -> save destination address */ + andi t1, t0, 0x1 + beqz t1, 2f + andi s4, t0, ~0x1 + j 1b + +2: + /* IND_INDIRECTION entry ? -> update next entry ptr (PA) */ + andi t1, t0, 0x2 + beqz t1, 2f + andi s0, t0, ~0x2 + addi s9, s9, 1 + csrw sptbr, zero + jalr zero, s8, 0 + +2: + /* IND_DONE entry ? -> jump to done label */ + andi t1, t0, 0x4 + beqz t1, 2f + j 4f + +2: + /* + * IND_SOURCE entry ? -> copy page word by word to the + * destination address we got from IND_DESTINATION + */ + andi t1, t0, 0x8 + beqz t1, 1b /* Unknown entry type, ignore it */ + andi t0, t0, ~0x8 + mv t3, s5 /* i = num words per page */ +3: /* copy loop */ + REG_L t1, (t0) /* t1 = *src_ptr */ + REG_S t1, (s4) /* *dst_ptr = *src_ptr */ + addi t0, t0, RISCV_SZPTR /* stc_ptr++ */ + addi s4, s4, RISCV_SZPTR /* dst_ptr++ */ + sub t3, t3, s6 /* i-- */ + addi s11, s11, 1 /* c++ */ + beqz t3, 1b /* copy done ? 
*/ + j 3b + +4: + /* Pass the arguments to the next kernel / Cleanup*/ + mv a0, s3 + mv a1, s2 + mv a2, s1 + + /* Cleanup */ + mv a3, zero + mv a4, zero + mv a5, zero + mv a6, zero + mv a7, zero + + mv s0, zero + mv s1, zero + mv s2, zero + mv s3, zero + mv s4, zero + mv s5, zero + mv s6, zero + mv s7, zero + mv s8, zero + mv s9, zero + mv s10, zero + mv s11, zero + + mv t0, zero + mv t1, zero + mv t2, zero + mv t3, zero + mv t4, zero + mv t5, zero + mv t6, zero + csrw sepc, zero + csrw scause, zero + csrw sscratch, zero + + /* + * Make sure the relocated code is visible + * and jump to the new kernel + */ + fence.i + + jalr zero, a2, 0 + +SYM_CODE_END(riscv_kexec_relocate) +riscv_kexec_relocate_end: + + .section ".rodata" +SYM_DATA(riscv_kexec_relocate_size, + .long riscv_kexec_relocate_end - riscv_kexec_relocate) + diff --git a/arch/riscv/kernel/machine_kexec.c b/arch/riscv/kernel/machine_kexec.c new file mode 100644 index 000000000..2ce6c3daf --- /dev/null +++ b/arch/riscv/kernel/machine_kexec.c @@ -0,0 +1,186 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2019 FORTH-ICS/CARV + * Nick Kossifidis <mick@ics.forth.gr> + */ + +#include <linux/kexec.h> +#include <asm/kexec.h> /* For riscv_kexec_* symbol defines */ +#include <linux/smp.h> /* For smp_send_stop () */ +#include <asm/cacheflush.h> /* For local_flush_icache_all() */ +#include <asm/barrier.h> /* For smp_wmb() */ +#include <asm/page.h> /* For PAGE_MASK */ +#include <linux/libfdt.h> /* For fdt_check_header() */ +#include <asm/set_memory.h> /* For set_memory_x() */ +#include <linux/compiler.h> /* For unreachable() */ +#include <linux/cpu.h> /* For cpu_down() */ + +/** + * kexec_image_info - Print received image details + */ +static void +kexec_image_info(const struct kimage *image) +{ + unsigned long i; + + pr_debug("Kexec image info:\n"); + pr_debug("\ttype: %d\n", image->type); + pr_debug("\tstart: %lx\n", image->start); + pr_debug("\thead: %lx\n", image->head); + pr_debug("\tnr_segments: 
%lu\n", image->nr_segments); + + for (i = 0; i < image->nr_segments; i++) { + pr_debug("\t segment[%lu]: %016lx - %016lx", i, + image->segment[i].mem, + image->segment[i].mem + image->segment[i].memsz); + pr_debug("\t\t0x%lx bytes, %lu pages\n", + (unsigned long) image->segment[i].memsz, + (unsigned long) image->segment[i].memsz / PAGE_SIZE); + } +} + +/** + * machine_kexec_prepare - Initialize kexec + * + * This function is called from do_kexec_load, when the user has + * provided us with an image to be loaded. Its goal is to validate + * the image and prepare the control code buffer as needed. + * Note that kimage_alloc_init has already been called and the + * control buffer has already been allocated. + */ +int +machine_kexec_prepare(struct kimage *image) +{ + struct kimage_arch *internal = &image->arch; + struct fdt_header fdt = {0}; + void *control_code_buffer = NULL; + unsigned int control_code_buffer_sz = 0; + int i = 0; + + kexec_image_info(image); + + if (image->type == KEXEC_TYPE_CRASH) { + pr_warn("Loading a crash kernel is unsupported for now.\n"); + return -EINVAL; + } + + /* Find the Flattened Device Tree and save its physical address */ + for (i = 0; i < image->nr_segments; i++) { + if (image->segment[i].memsz <= sizeof(fdt)) + continue; + + if (copy_from_user(&fdt, image->segment[i].buf, sizeof(fdt))) + continue; + + if (fdt_check_header(&fdt)) + continue; + + internal->fdt_addr = (unsigned long) image->segment[i].mem; + break; + } + + if (!internal->fdt_addr) { + pr_err("Device tree not included in the provided image\n"); + return -EINVAL; + } + + /* Copy the assembler code for relocation to the control page */ + control_code_buffer = page_address(image->control_code_page); + control_code_buffer_sz = page_size(image->control_code_page); + if (unlikely(riscv_kexec_relocate_size > control_code_buffer_sz)) { + pr_err("Relocation code doesn't fit within a control page\n"); + return -EINVAL; + } + memcpy(control_code_buffer, riscv_kexec_relocate, + 
riscv_kexec_relocate_size); + + /* Mark the control page executable */ + set_memory_x((unsigned long) control_code_buffer, 1); + + return 0; +} + + +/** + * machine_kexec_cleanup - Cleanup any leftovers from + * machine_kexec_prepare + * + * This function is called by kimage_free to handle any arch-specific + * allocations done on machine_kexec_prepare. Since we didn't do any + * allocations there, this is just an empty function. Note that the + * control buffer is freed by kimage_free. + */ +void +machine_kexec_cleanup(struct kimage *image) +{ +} + + +/* + * machine_shutdown - Prepare for a kexec reboot + * + * This function is called by kernel_kexec just before machine_kexec + * below. Its goal is to prepare the rest of the system (the other + * harts and possibly devices etc) for a kexec reboot. + */ +void machine_shutdown(void) +{ + /* + * No more interrupts on this hart + * until we are back up. + */ + local_irq_disable(); + +#if defined(CONFIG_HOTPLUG_CPU) && (CONFIG_SMP) + smp_shutdown_nonboot_cpus(smp_processor_id()); +#endif +} + +/** + * machine_crash_shutdown - Prepare to kexec after a kernel crash + * + * This function is called by crash_kexec just before machine_kexec + * below and its goal is similar to machine_shutdown, but in case of + * a kernel crash. Since we don't handle such cases yet, this function + * is empty. + */ +void +machine_crash_shutdown(struct pt_regs *regs) +{ +} + +/** + * machine_kexec - Jump to the loaded kimage + * + * This function is called by kernel_kexec which is called by the + * reboot system call when the reboot cmd is LINUX_REBOOT_CMD_KEXEC, + * or by crash_kernel which is called by the kernel's arch-specific + * trap handler in case of a kernel panic. It's the final stage of + * the kexec process where the pre-loaded kimage is ready to be + * executed. We assume at this point that all other harts are + * suspended and this hart will be the new boot hart. 
+ */ +void __noreturn +machine_kexec(struct kimage *image) +{ + struct kimage_arch *internal = &image->arch; + unsigned long jump_addr = (unsigned long) image->start; + unsigned long first_ind_entry = (unsigned long) &image->head; + unsigned long this_hart_id = raw_smp_processor_id(); + unsigned long fdt_addr = internal->fdt_addr; + void *control_code_buffer = page_address(image->control_code_page); + riscv_kexec_do_relocate do_relocate = control_code_buffer; + + pr_notice("Will call new kernel at %08lx from hart id %lx\n", + jump_addr, this_hart_id); + pr_notice("FDT image at %08lx\n", fdt_addr); + + /* Make sure the relocation code is visible to the hart */ + local_flush_icache_all(); + + /* Jump to the relocation code */ + pr_notice("Bye...\n"); + do_relocate(first_ind_entry, jump_addr, fdt_addr, + this_hart_id, va_pa_offset); + unreachable(); +} -- 2.26.2
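The entry-processing loop in kexec_relocate.S can be easier to follow in C. What follows is only a minimal user-space sketch of that loop, not the patch's implementation: sketch_relocate() and SKETCH_PAGE_SIZE are hypothetical names, a 4 KiB page is assumed, and the satp/stvec handling that the assembly performs on each indirection entry is left out entirely. The flag values are the masks the assembly tests (0x1, 0x2, 0x4, 0x8).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Flag bits, matching the masks tested in kexec_relocate.S */
#define IND_DESTINATION 0x1UL
#define IND_INDIRECTION 0x2UL
#define IND_DONE        0x4UL
#define IND_SOURCE      0x8UL

#define SKETCH_PAGE_SIZE 4096UL	/* assumption: 4 KiB pages */

/*
 * Hypothetical helper: walk a kimage-style entry list and copy every
 * source page to the current destination. Returns the number of pages
 * copied. Addresses must have their low four bits clear, because those
 * bits carry the entry type.
 */
static size_t sketch_relocate(const uintptr_t *entry)
{
	unsigned char *dst = NULL;
	size_t copied = 0;

	assert(entry != NULL);

	for (;;) {
		uintptr_t e = *entry++;

		if (e & IND_DESTINATION) {
			/* Remember where the following pages go */
			dst = (unsigned char *)(e & ~IND_DESTINATION);
		} else if (e & IND_INDIRECTION) {
			/* Continue with the next page of entries */
			entry = (const uintptr_t *)(e & ~IND_INDIRECTION);
		} else if (e & IND_DONE) {
			return copied;
		} else if ((e & IND_SOURCE) && dst != NULL) {
			/* The assembly copies word by word; memcpy is equivalent here */
			memcpy(dst, (const void *)(e & ~IND_SOURCE),
			       SKETCH_PAGE_SIZE);
			dst += SKETCH_PAGE_SIZE;
			copied++;
		}
		/* Any other entry is skipped (the dst check is an addition of this sketch) */
	}
}
```

In the real code the same walk runs from the control page with the MMU being switched off, which is why the assembly keeps the loop's physical address in s8 and programs it into stvec.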
* Re: [PATCH v3 2/5] RISC-V: Add kexec support 2021-04-05 8:57 ` Nick Kossifidis @ 2021-04-06 18:38 ` Alex Ghiti From: Alex Ghiti @ 2021-04-06 18:38 UTC To: Nick Kossifidis, linux-riscv, palmer; +Cc: paul.walmsley, linux-kernel On 4/5/21 at 4:57 AM, Nick Kossifidis wrote: > This patch adds support for kexec on RISC-V. On SMP systems it depends > on HOTPLUG_CPU in order to be able to bring up all harts after kexec. > It also needs a recent OpenSBI version that supports the HSM extension. > I tested it on riscv64 QEMU on both an smp and a non-smp system. > > v5: > * For now depend on MMU, further changes needed for NOMMU support > * Make sure stvec is aligned > * Cleanup some unneeded fences > * Verify control code's buffer size > * Compile kexec_relocate.S with medany and norelax > > v4: > * No functional changes, just re-based > > v3: > * Use the new smp_shutdown_nonboot_cpus() call. > * Move riscv_kexec_relocate to .rodata > > v2: > * Pass needed parameters as arguments to riscv_kexec_relocate > instead of using global variables. > * Use kimage_arch to hold the fdt address of the included fdt. > * Use SYM_* macros on kexec_relocate.S. > * Compatibility with STRICT_KERNEL_RWX.
> * Compatibility with HOTPLUG_CPU for SMP > * Small cleanups > > Signed-off-by: Nick Kossifidis <mick@ics.forth.gr> > --- > arch/riscv/Kconfig | 15 +++ > arch/riscv/include/asm/kexec.h | 47 ++++++++ > arch/riscv/kernel/Makefile | 5 + > arch/riscv/kernel/kexec_relocate.S | 156 ++++++++++++++++++++++++ > arch/riscv/kernel/machine_kexec.c | 186 +++++++++++++++++++++++++++++ > 5 files changed, 409 insertions(+) > create mode 100644 arch/riscv/include/asm/kexec.h > create mode 100644 arch/riscv/kernel/kexec_relocate.S > create mode 100644 arch/riscv/kernel/machine_kexec.c > > diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig > index 8ea60a0a1..3716262ef 100644 > --- a/arch/riscv/Kconfig > +++ b/arch/riscv/Kconfig > @@ -389,6 +389,21 @@ config RISCV_SBI_V01 > help > This config allows kernel to use SBI v0.1 APIs. This will be > deprecated in future once legacy M-mode software are no longer in use. > + > +config KEXEC > + bool "Kexec system call" > + select KEXEC_CORE > + select HOTPLUG_CPU if SMP > + depends on MMU > + help > + kexec is a system call that implements the ability to shutdown your > + current kernel, and to start another kernel. It is like a reboot > + but it is independent of the system firmware. And like a reboot > + you can start any kernel with it, not just Linux. > + > + The name comes from the similarity to the exec system call. 
> + > + > endmenu > > menu "Boot options" > diff --git a/arch/riscv/include/asm/kexec.h b/arch/riscv/include/asm/kexec.h > new file mode 100644 > index 000000000..efc69feb4 > --- /dev/null > +++ b/arch/riscv/include/asm/kexec.h > @@ -0,0 +1,47 @@ > +/* SPDX-License-Identifier: GPL-2.0 */ > +/* > + * Copyright (C) 2019 FORTH-ICS/CARV > + * Nick Kossifidis <mick@ics.forth.gr> > + */ > + > +#ifndef _RISCV_KEXEC_H > +#define _RISCV_KEXEC_H > + > +/* Maximum physical address we can use pages from */ > +#define KEXEC_SOURCE_MEMORY_LIMIT (-1UL) > + > +/* Maximum address we can reach in physical address mode */ > +#define KEXEC_DESTINATION_MEMORY_LIMIT (-1UL) > + > +/* Maximum address we can use for the control code buffer */ > +#define KEXEC_CONTROL_MEMORY_LIMIT (-1UL) > + > +/* Reserve a page for the control code buffer */ > +#define KEXEC_CONTROL_PAGE_SIZE 4096 Why not use PAGE_SIZE here instead of a hard-coded 4096? > + > +#define KEXEC_ARCH KEXEC_ARCH_RISCV > + > +static inline void > +crash_setup_regs(struct pt_regs *newregs, > + struct pt_regs *oldregs) > +{ > + /* Dummy implementation for now */ > +} > + > + > +#define ARCH_HAS_KIMAGE_ARCH > + > +struct kimage_arch { > + unsigned long fdt_addr; > +}; > + > +const extern unsigned char riscv_kexec_relocate[]; > +const extern unsigned int riscv_kexec_relocate_size; > + > +typedef void (*riscv_kexec_do_relocate)(unsigned long first_ind_entry, > + unsigned long jump_addr, > + unsigned long fdt_addr, > + unsigned long hartid, > + unsigned long va_pa_off); > + > +#endif > diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile > index 3dc0abde9..c2594018c 100644 > --- a/arch/riscv/kernel/Makefile > +++ b/arch/riscv/kernel/Makefile > @@ -9,6 +9,10 @@ CFLAGS_REMOVE_patch.o = $(CC_FLAGS_FTRACE) > CFLAGS_REMOVE_sbi.o = $(CC_FLAGS_FTRACE) > endif > > +ifdef CONFIG_KEXEC > +AFLAGS_kexec_relocate.o := -mcmodel=medany -mno-relax > +endif > + > extra-y += head.o > extra-y += vmlinux.lds > > @@ -54,6 +58,7 @@ obj-$(CONFIG_SMP) += cpu_ops_sbi.o > endif > 
obj-$(CONFIG_HOTPLUG_CPU) += cpu-hotplug.o > obj-$(CONFIG_KGDB) += kgdb.o > +obj-${CONFIG_KEXEC} += kexec_relocate.o machine_kexec.o The other obj-$() entries use parentheses, not braces. > > obj-$(CONFIG_JUMP_LABEL) += jump_label.o > > diff --git a/arch/riscv/kernel/kexec_relocate.S b/arch/riscv/kernel/kexec_relocate.S > new file mode 100644 > index 000000000..616c20771 > --- /dev/null > +++ b/arch/riscv/kernel/kexec_relocate.S > @@ -0,0 +1,156 @@ > +/* SPDX-License-Identifier: GPL-2.0 */ > +/* > + * Copyright (C) 2019 FORTH-ICS/CARV > + * Nick Kossifidis <mick@ics.forth.gr> > + */ > + > +#include <asm/asm.h> /* For RISCV_* and REG_* macros */ > +#include <asm/page.h> /* For PAGE_SHIFT */ > +#include <linux/linkage.h> /* For SYM_* macros */ > + > +.section ".rodata" > +SYM_CODE_START(riscv_kexec_relocate) > + > + /* > + * s0: Pointer to the current entry > + * s1: (const) Phys address to jump to after relocation > + * s2: (const) Phys address of the FDT image > + * s3: (const) The hartid of the current hart > + * s4: Pointer to the destination address for the relocation > + * s5: (const) Number of words per page > + * s6: (const) 1, used for subtraction > + * s7: (const) va_pa_offset, used when switching MMU off > + * s8: (const) Physical address of the main loop > + * s9: (debug) indirection page counter > + * s10: (debug) entry counter > + * s11: (debug) copied words counter > + */ > + mv s0, a0 > + mv s1, a1 > + mv s2, a2 > + mv s3, a3 > + mv s4, zero > + li s5, ((1 << PAGE_SHIFT) / RISCV_SZPTR) (1 << PAGE_SHIFT) is just PAGE_SIZE. > + li s6, 1 > + mv s7, a4 > + mv s8, zero > + mv s9, zero > + mv s10, zero > + mv s11, zero > + > + /* Disable / cleanup interrupts */ > + csrw sie, zero > + csrw sip, zero > + > + /* > + * When we switch SATP.MODE to "Bare" we'll only > + * play with physical addresses. However the first time > + * we try to jump somewhere, the offset on the jump > + * will be relative to pc which will still be on VA.
To > + * deal with this we set stvec to the physical address at > + * the start of the loop below so that we jump there in > + * any case. > + */ > + la s8, 1f > + sub s8, s8, s7 > + csrw stvec, s8 > + > + /* Process entries in a loop */ > +.align 2 > +1: > + addi s10, s10, 1 > + REG_L t0, 0(s0) /* t0 = *image->entry */ > + addi s0, s0, RISCV_SZPTR /* image->entry++ */ > + > + /* IND_DESTINATION entry ? -> save destination address */ > + andi t1, t0, 0x1 > + beqz t1, 2f > + andi s4, t0, ~0x1 > + j 1b > + > +2: > + /* IND_INDIRECTION entry ? -> update next entry ptr (PA) */ > + andi t1, t0, 0x2 > + beqz t1, 2f > + andi s0, t0, ~0x2 > + addi s9, s9, 1 > + csrw sptbr, zero > + jalr zero, s8, 0 > + > +2: > + /* IND_DONE entry ? -> jump to done label */ > + andi t1, t0, 0x4 > + beqz t1, 2f > + j 4f > + > +2: > + /* > + * IND_SOURCE entry ? -> copy page word by word to the > + * destination address we got from IND_DESTINATION > + */ > + andi t1, t0, 0x8 > + beqz t1, 1b /* Unknown entry type, ignore it */ > + andi t0, t0, ~0x8 > + mv t3, s5 /* i = num words per page */ > +3: /* copy loop */ > + REG_L t1, (t0) /* t1 = *src_ptr */ > + REG_S t1, (s4) /* *dst_ptr = *src_ptr */ > + addi t0, t0, RISCV_SZPTR /* stc_ptr++ */ > + addi s4, s4, RISCV_SZPTR /* dst_ptr++ */ > + sub t3, t3, s6 /* i-- */ > + addi s11, s11, 1 /* c++ */ > + beqz t3, 1b /* copy done ? 
*/ > + j 3b > + > +4: > + /* Pass the arguments to the next kernel / Cleanup*/ > + mv a0, s3 > + mv a1, s2 > + mv a2, s1 > + > + /* Cleanup */ > + mv a3, zero > + mv a4, zero > + mv a5, zero > + mv a6, zero > + mv a7, zero > + > + mv s0, zero > + mv s1, zero > + mv s2, zero > + mv s3, zero > + mv s4, zero > + mv s5, zero > + mv s6, zero > + mv s7, zero > + mv s8, zero > + mv s9, zero > + mv s10, zero > + mv s11, zero > + > + mv t0, zero > + mv t1, zero > + mv t2, zero > + mv t3, zero > + mv t4, zero > + mv t5, zero > + mv t6, zero > + csrw sepc, zero > + csrw scause, zero > + csrw sscratch, zero > + > + /* > + * Make sure the relocated code is visible > + * and jump to the new kernel > + */ > + fence.i > + > + jalr zero, a2, 0 > + > +SYM_CODE_END(riscv_kexec_relocate) > +riscv_kexec_relocate_end: > + > + .section ".rodata" > +SYM_DATA(riscv_kexec_relocate_size, > + .long riscv_kexec_relocate_end - riscv_kexec_relocate) > + > diff --git a/arch/riscv/kernel/machine_kexec.c b/arch/riscv/kernel/machine_kexec.c > new file mode 100644 > index 000000000..2ce6c3daf > --- /dev/null > +++ b/arch/riscv/kernel/machine_kexec.c > @@ -0,0 +1,186 @@ > +// SPDX-License-Identifier: GPL-2.0 > +/* > + * Copyright (C) 2019 FORTH-ICS/CARV > + * Nick Kossifidis <mick@ics.forth.gr> > + */ > + > +#include <linux/kexec.h> > +#include <asm/kexec.h> /* For riscv_kexec_* symbol defines */ > +#include <linux/smp.h> /* For smp_send_stop () */ > +#include <asm/cacheflush.h> /* For local_flush_icache_all() */ > +#include <asm/barrier.h> /* For smp_wmb() */ > +#include <asm/page.h> /* For PAGE_MASK */ > +#include <linux/libfdt.h> /* For fdt_check_header() */ > +#include <asm/set_memory.h> /* For set_memory_x() */ > +#include <linux/compiler.h> /* For unreachable() */ > +#include <linux/cpu.h> /* For cpu_down() */ > + > +/** > + * kexec_image_info - Print received image details > + */ > +static void > +kexec_image_info(const struct kimage *image) > +{ > + unsigned long i;
> + > + pr_debug("Kexec image info:\n"); > + pr_debug("\ttype: %d\n", image->type); > + pr_debug("\tstart: %lx\n", image->start); > + pr_debug("\thead: %lx\n", image->head); > + pr_debug("\tnr_segments: %lu\n", image->nr_segments); > + > + for (i = 0; i < image->nr_segments; i++) { > + pr_debug("\t segment[%lu]: %016lx - %016lx", i, > + image->segment[i].mem, > + image->segment[i].mem + image->segment[i].memsz); > + pr_debug("\t\t0x%lx bytes, %lu pages\n", > + (unsigned long) image->segment[i].memsz, > + (unsigned long) image->segment[i].memsz / PAGE_SIZE); > + } > +} > + > +/** > + * machine_kexec_prepare - Initialize kexec > + * > + * This function is called from do_kexec_load, when the user has > + * provided us with an image to be loaded. Its goal is to validate > + * the image and prepare the control code buffer as needed. > + * Note that kimage_alloc_init has already been called and the > + * control buffer has already been allocated.
> + */ > +int > +machine_kexec_prepare(struct kimage *image) > +{ > + struct kimage_arch *internal = &image->arch; > + struct fdt_header fdt = {0}; > + void *control_code_buffer = NULL; > + unsigned int control_code_buffer_sz = 0; > + int i = 0; > + > + kexec_image_info(image); > + > + if (image->type == KEXEC_TYPE_CRASH) { > + pr_warn("Loading a crash kernel is unsupported for now.\n"); > + return -EINVAL; > + } > + > + /* Find the Flattened Device Tree and save its physical address */ > + for (i = 0; i < image->nr_segments; i++) { > + if (image->segment[i].memsz <= sizeof(fdt)) > + continue; > + > + if (copy_from_user(&fdt, image->segment[i].buf, sizeof(fdt))) > + continue; > + > + if (fdt_check_header(&fdt)) > + continue; > + > + internal->fdt_addr = (unsigned long) image->segment[i].mem; > + break; > + } > + > + if (!internal->fdt_addr) { > + pr_err("Device tree not included in the provided image\n"); > + return -EINVAL; > + } > + > + /* Copy the assembler code for relocation to the control page */ > + control_code_buffer = page_address(image->control_code_page); > + control_code_buffer_sz = page_size(image->control_code_page); > + if (unlikely(riscv_kexec_relocate_size > control_code_buffer_sz)) { > + pr_err("Relocation code doesn't fit within a control page\n"); > + return -EINVAL; > + } > + memcpy(control_code_buffer, riscv_kexec_relocate, > + riscv_kexec_relocate_size); > + > + /* Mark the control page executable */ > + set_memory_x((unsigned long) control_code_buffer, 1); > + > + return 0; > +} > + > + > +/** > + * machine_kexec_cleanup - Cleanup any leftovers from > + * machine_kexec_prepare > + * > + * This function is called by kimage_free to handle any arch-specific > + * allocations done on machine_kexec_prepare. Since we didn't do any > + * allocations there, this is just an empty function. Note that the > + * control buffer is freed by kimage_free. 
> + */ > +void > +machine_kexec_cleanup(struct kimage *image) > +{ > +} > + > + > +/* > + * machine_shutdown - Prepare for a kexec reboot > + * > + * This function is called by kernel_kexec just before machine_kexec > + * below. Its goal is to prepare the rest of the system (the other > + * harts and possibly devices etc) for a kexec reboot. > + */ > +void machine_shutdown(void) > +{ > + /* > + * No more interrupts on this hart > + * until we are back up. > + */ > + local_irq_disable(); > + > +#if defined(CONFIG_HOTPLUG_CPU) && (CONFIG_SMP) Shouldn't the second condition be defined(CONFIG_SMP) rather than a bare (CONFIG_SMP)? > + smp_shutdown_nonboot_cpus(smp_processor_id()); > +#endif > +} > + > +/** > + * machine_crash_shutdown - Prepare to kexec after a kernel crash > + * > + * This function is called by crash_kexec just before machine_kexec > + * below and its goal is similar to machine_shutdown, but in case of > + * a kernel crash. Since we don't handle such cases yet, this function > + * is empty. > + */ > +void > +machine_crash_shutdown(struct pt_regs *regs) > +{ > +} > + > +/** > + * machine_kexec - Jump to the loaded kimage > + * > + * This function is called by kernel_kexec which is called by the > + * reboot system call when the reboot cmd is LINUX_REBOOT_CMD_KEXEC, > + * or by crash_kernel which is called by the kernel's arch-specific > + * trap handler in case of a kernel panic. It's the final stage of > + * the kexec process where the pre-loaded kimage is ready to be > + * executed. We assume at this point that all other harts are > + * suspended and this hart will be the new boot hart.
> + */ > +void __noreturn > +machine_kexec(struct kimage *image) > +{ > + struct kimage_arch *internal = &image->arch; > + unsigned long jump_addr = (unsigned long) image->start; > + unsigned long first_ind_entry = (unsigned long) &image->head; > + unsigned long this_hart_id = raw_smp_processor_id(); > + unsigned long fdt_addr = internal->fdt_addr; > + void *control_code_buffer = page_address(image->control_code_page); > + riscv_kexec_do_relocate do_relocate = control_code_buffer; > + > + pr_notice("Will call new kernel at %08lx from hart id %lx\n", > + jump_addr, this_hart_id); > + pr_notice("FDT image at %08lx\n", fdt_addr); > + > + /* Make sure the relocation code is visible to the hart */ > + local_flush_icache_all(); > + > + /* Jump to the relocation code */ > + pr_notice("Bye...\n"); > + do_relocate(first_ind_entry, jump_addr, fdt_addr, > + this_hart_id, va_pa_offset); > + unreachable(); > +} >
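The review question about `#if defined(CONFIG_HOTPLUG_CPU) && (CONFIG_SMP)` comes down to how the C preprocessor treats a bare identifier versus the defined() operator. The sketch below illustrates the difference with hypothetical SKETCH_OPT_* macros standing in for Kconfig symbols; it assumes, as is the case for the kernel's generated configuration header, that an enabled bool option is emitted as a macro with the value 1 and a disabled one is not defined at all.

```c
#include <assert.h>

/* Hypothetical stand-ins for Kconfig symbols (not real kernel options) */
#define SKETCH_OPT_ENABLED 1	/* how an enabled bool option is emitted */
#define SKETCH_OPT_ZERO    0	/* defined, but with the value 0 */
/* SKETCH_OPT_ABSENT is deliberately never defined, like a disabled option */

/* Branch taken by "#if (SYMBOL)" for each stand-in */
static const int sketch_bare_form[3] = {
#if (SKETCH_OPT_ENABLED)
	1,
#else
	0,
#endif
#if (SKETCH_OPT_ZERO)
	1,
#else
	0,
#endif
#if (SKETCH_OPT_ABSENT)	/* an undefined identifier evaluates to 0 in #if */
	1,
#else
	0,
#endif
};

/* Branch taken by "#if defined(SYMBOL)" for each stand-in */
static const int sketch_defined_form[3] = {
#if defined(SKETCH_OPT_ENABLED)
	1,
#else
	0,
#endif
#if defined(SKETCH_OPT_ZERO)
	1,
#else
	0,
#endif
#if defined(SKETCH_OPT_ABSENT)
	1,
#else
	0,
#endif
};

/* Returns 1 when both spellings select the same branch for symbol idx */
static int sketch_forms_agree(int idx)
{
	assert(idx >= 0 && idx < 3);
	return sketch_bare_form[idx] == sketch_defined_form[idx];
}
```

Under that assumption the two spellings agree for enabled and for absent options, so the line in the patch would behave as intended; defined() is still the more conventional spelling, and a bare identifier triggers a warning when the compiler is run with -Wundef.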
* Re: [PATCH v3 2/5] RISC-V: Add kexec support @ 2021-04-06 18:38 ` Alex Ghiti From: Alex Ghiti @ 2021-04-06 18:38 UTC To: Nick Kossifidis, linux-riscv, palmer; +Cc: paul.walmsley, linux-kernel On 4/5/21 at 4:57 AM, Nick Kossifidis wrote: > This patch adds support for kexec on RISC-V. On SMP systems it depends > on HOTPLUG_CPU in order to be able to bring up all harts after kexec. > It also needs a recent OpenSBI version that supports the HSM extension. > I tested it on riscv64 QEMU on both an smp and a non-smp system. > > v5: > * For now depend on MMU, further changes needed for NOMMU support > * Make sure stvec is aligned > * Cleanup some unneeded fences > * Verify control code's buffer size > * Compile kexec_relocate.S with medany and norelax > > v4: > * No functional changes, just re-based > > v3: > * Use the new smp_shutdown_nonboot_cpus() call. > * Move riscv_kexec_relocate to .rodata > > v2: > * Pass needed parameters as arguments to riscv_kexec_relocate > instead of using global variables. > * Use kimage_arch to hold the fdt address of the included fdt. > * Use SYM_* macros on kexec_relocate.S. > * Compatibility with STRICT_KERNEL_RWX.
> * Compatibility with HOTPLUG_CPU for SMP > * Small cleanups > > Signed-off-by: Nick Kossifidis <mick@ics.forth.gr> > --- > arch/riscv/Kconfig | 15 +++ > arch/riscv/include/asm/kexec.h | 47 ++++++++ > arch/riscv/kernel/Makefile | 5 + > arch/riscv/kernel/kexec_relocate.S | 156 ++++++++++++++++++++++++ > arch/riscv/kernel/machine_kexec.c | 186 +++++++++++++++++++++++++++++ > 5 files changed, 409 insertions(+) > create mode 100644 arch/riscv/include/asm/kexec.h > create mode 100644 arch/riscv/kernel/kexec_relocate.S > create mode 100644 arch/riscv/kernel/machine_kexec.c > > diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig > index 8ea60a0a1..3716262ef 100644 > --- a/arch/riscv/Kconfig > +++ b/arch/riscv/Kconfig > @@ -389,6 +389,21 @@ config RISCV_SBI_V01 > help > This config allows kernel to use SBI v0.1 APIs. This will be > deprecated in future once legacy M-mode software are no longer in use. > + > +config KEXEC > + bool "Kexec system call" > + select KEXEC_CORE > + select HOTPLUG_CPU if SMP > + depends on MMU > + help > + kexec is a system call that implements the ability to shutdown your > + current kernel, and to start another kernel. It is like a reboot > + but it is independent of the system firmware. And like a reboot > + you can start any kernel with it, not just Linux. > + > + The name comes from the similarity to the exec system call. 
> + > + > endmenu > > menu "Boot options" > diff --git a/arch/riscv/include/asm/kexec.h b/arch/riscv/include/asm/kexec.h > new file mode 100644 > index 000000000..efc69feb4 > --- /dev/null > +++ b/arch/riscv/include/asm/kexec.h > @@ -0,0 +1,47 @@ > +/* SPDX-License-Identifier: GPL-2.0 */ > +/* > + * Copyright (C) 2019 FORTH-ICS/CARV > + * Nick Kossifidis <mick@ics.forth.gr> > + */ > + > +#ifndef _RISCV_KEXEC_H > +#define _RISCV_KEXEC_H > + > +/* Maximum physical address we can use pages from */ > +#define KEXEC_SOURCE_MEMORY_LIMIT (-1UL) > + > +/* Maximum address we can reach in physical address mode */ > +#define KEXEC_DESTINATION_MEMORY_LIMIT (-1UL) > + > +/* Maximum address we can use for the control code buffer */ > +#define KEXEC_CONTROL_MEMORY_LIMIT (-1UL) > + > +/* Reserve a page for the control code buffer */ > +#define KEXEC_CONTROL_PAGE_SIZE 4096 Why not use PAGE_SIZE here instead of a hard-coded 4096? > + > +#define KEXEC_ARCH KEXEC_ARCH_RISCV > + > +static inline void > +crash_setup_regs(struct pt_regs *newregs, > + struct pt_regs *oldregs) > +{ > + /* Dummy implementation for now */ > +} > + > + > +#define ARCH_HAS_KIMAGE_ARCH > + > +struct kimage_arch { > + unsigned long fdt_addr; > +}; > + > +const extern unsigned char riscv_kexec_relocate[]; > +const extern unsigned int riscv_kexec_relocate_size; > + > +typedef void (*riscv_kexec_do_relocate)(unsigned long first_ind_entry, > + unsigned long jump_addr, > + unsigned long fdt_addr, > + unsigned long hartid, > + unsigned long va_pa_off); > + > +#endif > diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile > index 3dc0abde9..c2594018c 100644 > --- a/arch/riscv/kernel/Makefile > +++ b/arch/riscv/kernel/Makefile > @@ -9,6 +9,10 @@ CFLAGS_REMOVE_patch.o = $(CC_FLAGS_FTRACE) > CFLAGS_REMOVE_sbi.o = $(CC_FLAGS_FTRACE) > endif > > +ifdef CONFIG_KEXEC > +AFLAGS_kexec_relocate.o := -mcmodel=medany -mno-relax > +endif > + > extra-y += head.o > extra-y += vmlinux.lds > > @@ -54,6 +58,7 @@ obj-$(CONFIG_SMP) += cpu_ops_sbi.o > endif > 
obj-$(CONFIG_HOTPLUG_CPU) += cpu-hotplug.o > obj-$(CONFIG_KGDB) += kgdb.o > +obj-${CONFIG_KEXEC} += kexec_relocate.o machine_kexec.o The other obj-$() entries use parentheses, not braces. > > obj-$(CONFIG_JUMP_LABEL) += jump_label.o > > diff --git a/arch/riscv/kernel/kexec_relocate.S b/arch/riscv/kernel/kexec_relocate.S > new file mode 100644 > index 000000000..616c20771 > --- /dev/null > +++ b/arch/riscv/kernel/kexec_relocate.S > @@ -0,0 +1,156 @@ > +/* SPDX-License-Identifier: GPL-2.0 */ > +/* > + * Copyright (C) 2019 FORTH-ICS/CARV > + * Nick Kossifidis <mick@ics.forth.gr> > + */ > + > +#include <asm/asm.h> /* For RISCV_* and REG_* macros */ > +#include <asm/page.h> /* For PAGE_SHIFT */ > +#include <linux/linkage.h> /* For SYM_* macros */ > + > +.section ".rodata" > +SYM_CODE_START(riscv_kexec_relocate) > + > + /* > + * s0: Pointer to the current entry > + * s1: (const) Phys address to jump to after relocation > + * s2: (const) Phys address of the FDT image > + * s3: (const) The hartid of the current hart > + * s4: Pointer to the destination address for the relocation > + * s5: (const) Number of words per page > + * s6: (const) 1, used for subtraction > + * s7: (const) va_pa_offset, used when switching MMU off > + * s8: (const) Physical address of the main loop > + * s9: (debug) indirection page counter > + * s10: (debug) entry counter > + * s11: (debug) copied words counter > + */ > + mv s0, a0 > + mv s1, a1 > + mv s2, a2 > + mv s3, a3 > + mv s4, zero > + li s5, ((1 << PAGE_SHIFT) / RISCV_SZPTR) (1 << PAGE_SHIFT) is just PAGE_SIZE. > + li s6, 1 > + mv s7, a4 > + mv s8, zero > + mv s9, zero > + mv s10, zero > + mv s11, zero > + > + /* Disable / cleanup interrupts */ > + csrw sie, zero > + csrw sip, zero > + > + /* > + * When we switch SATP.MODE to "Bare" we'll only > + * play with physical addresses. However the first time > + * we try to jump somewhere, the offset on the jump > + * will be relative to pc which will still be on VA.
To > + * deal with this we set stvec to the physical address at > + * the start of the loop below so that we jump there in > + * any case. > + */ > + la s8, 1f > + sub s8, s8, s7 > + csrw stvec, s8 > + > + /* Process entries in a loop */ > +.align 2 > +1: > + addi s10, s10, 1 > + REG_L t0, 0(s0) /* t0 = *image->entry */ > + addi s0, s0, RISCV_SZPTR /* image->entry++ */ > + > + /* IND_DESTINATION entry ? -> save destination address */ > + andi t1, t0, 0x1 > + beqz t1, 2f > + andi s4, t0, ~0x1 > + j 1b > + > +2: > + /* IND_INDIRECTION entry ? -> update next entry ptr (PA) */ > + andi t1, t0, 0x2 > + beqz t1, 2f > + andi s0, t0, ~0x2 > + addi s9, s9, 1 > + csrw sptbr, zero > + jalr zero, s8, 0 > + > +2: > + /* IND_DONE entry ? -> jump to done label */ > + andi t1, t0, 0x4 > + beqz t1, 2f > + j 4f > + > +2: > + /* > + * IND_SOURCE entry ? -> copy page word by word to the > + * destination address we got from IND_DESTINATION > + */ > + andi t1, t0, 0x8 > + beqz t1, 1b /* Unknown entry type, ignore it */ > + andi t0, t0, ~0x8 > + mv t3, s5 /* i = num words per page */ > +3: /* copy loop */ > + REG_L t1, (t0) /* t1 = *src_ptr */ > + REG_S t1, (s4) /* *dst_ptr = *src_ptr */ > + addi t0, t0, RISCV_SZPTR /* stc_ptr++ */ > + addi s4, s4, RISCV_SZPTR /* dst_ptr++ */ > + sub t3, t3, s6 /* i-- */ > + addi s11, s11, 1 /* c++ */ > + beqz t3, 1b /* copy done ? 
*/ > + j 3b > + > +4: > + /* Pass the arguments to the next kernel / Cleanup*/ > + mv a0, s3 > + mv a1, s2 > + mv a2, s1 > + > + /* Cleanup */ > + mv a3, zero > + mv a4, zero > + mv a5, zero > + mv a6, zero > + mv a7, zero > + > + mv s0, zero > + mv s1, zero > + mv s2, zero > + mv s3, zero > + mv s4, zero > + mv s5, zero > + mv s6, zero > + mv s7, zero > + mv s8, zero > + mv s9, zero > + mv s10, zero > + mv s11, zero > + > + mv t0, zero > + mv t1, zero > + mv t2, zero > + mv t3, zero > + mv t4, zero > + mv t5, zero > + mv t6, zero > + csrw sepc, zero > + csrw scause, zero > + csrw sscratch, zero > + > + /* > + * Make sure the relocated code is visible > + * and jump to the new kernel > + */ > + fence.i > + > + jalr zero, a2, 0 > + > +SYM_CODE_END(riscv_kexec_relocate) > +riscv_kexec_relocate_end: > + > + .section ".rodata" > +SYM_DATA(riscv_kexec_relocate_size, > + .long riscv_kexec_relocate_end - riscv_kexec_relocate) > + > diff --git a/arch/riscv/kernel/machine_kexec.c b/arch/riscv/kernel/machine_kexec.c > new file mode 100644 > index 000000000..2ce6c3daf > --- /dev/null > +++ b/arch/riscv/kernel/machine_kexec.c > @@ -0,0 +1,186 @@ > +// SPDX-License-Identifier: GPL-2.0 > +/* > + * Copyright (C) 2019 FORTH-ICS/CARV > + * Nick Kossifidis <mick@ics.forth.gr> > + */ > + > +#include <linux/kexec.h> > +#include <asm/kexec.h> /* For riscv_kexec_* symbol defines */ > +#include <linux/smp.h> /* For smp_send_stop () */ > +#include <asm/cacheflush.h> /* For local_flush_icache_all() */ > +#include <asm/barrier.h> /* For smp_wmb() */ > +#include <asm/page.h> /* For PAGE_MASK */ > +#include <linux/libfdt.h> /* For fdt_check_header() */ > +#include <asm/set_memory.h> /* For set_memory_x() */ > +#include <linux/compiler.h> /* For unreachable() */ > +#include <linux/cpu.h> /* For cpu_down() */ > + > +/** > + * kexec_image_info - Print received image details > + */ > +static void > +kexec_image_info(const struct kimage *image) > +{ > + unsigned long i;
> + > + pr_debug("Kexec image info:\n"); > + pr_debug("\ttype: %d\n", image->type); > + pr_debug("\tstart: %lx\n", image->start); > + pr_debug("\thead: %lx\n", image->head); > + pr_debug("\tnr_segments: %lu\n", image->nr_segments); > + > + for (i = 0; i < image->nr_segments; i++) { > + pr_debug("\t segment[%lu]: %016lx - %016lx", i, > + image->segment[i].mem, > + image->segment[i].mem + image->segment[i].memsz); > + pr_debug("\t\t0x%lx bytes, %lu pages\n", > + (unsigned long) image->segment[i].memsz, > + (unsigned long) image->segment[i].memsz / PAGE_SIZE); > + } > +} > + > +/** > + * machine_kexec_prepare - Initialize kexec > + * > + * This function is called from do_kexec_load, when the user has > + * provided us with an image to be loaded. Its goal is to validate > + * the image and prepare the control code buffer as needed. > + * Note that kimage_alloc_init has already been called and the > + * control buffer has already been allocated.
> + */ > +int > +machine_kexec_prepare(struct kimage *image) > +{ > + struct kimage_arch *internal = &image->arch; > + struct fdt_header fdt = {0}; > + void *control_code_buffer = NULL; > + unsigned int control_code_buffer_sz = 0; > + int i = 0; > + > + kexec_image_info(image); > + > + if (image->type == KEXEC_TYPE_CRASH) { > + pr_warn("Loading a crash kernel is unsupported for now.\n"); > + return -EINVAL; > + } > + > + /* Find the Flattened Device Tree and save its physical address */ > + for (i = 0; i < image->nr_segments; i++) { > + if (image->segment[i].memsz <= sizeof(fdt)) > + continue; > + > + if (copy_from_user(&fdt, image->segment[i].buf, sizeof(fdt))) > + continue; > + > + if (fdt_check_header(&fdt)) > + continue; > + > + internal->fdt_addr = (unsigned long) image->segment[i].mem; > + break; > + } > + > + if (!internal->fdt_addr) { > + pr_err("Device tree not included in the provided image\n"); > + return -EINVAL; > + } > + > + /* Copy the assembler code for relocation to the control page */ > + control_code_buffer = page_address(image->control_code_page); > + control_code_buffer_sz = page_size(image->control_code_page); > + if (unlikely(riscv_kexec_relocate_size > control_code_buffer_sz)) { > + pr_err("Relocation code doesn't fit within a control page\n"); > + return -EINVAL; > + } > + memcpy(control_code_buffer, riscv_kexec_relocate, > + riscv_kexec_relocate_size); > + > + /* Mark the control page executable */ > + set_memory_x((unsigned long) control_code_buffer, 1); > + > + return 0; > +} > + > + > +/** > + * machine_kexec_cleanup - Cleanup any leftovers from > + * machine_kexec_prepare > + * > + * This function is called by kimage_free to handle any arch-specific > + * allocations done on machine_kexec_prepare. Since we didn't do any > + * allocations there, this is just an empty function. Note that the > + * control buffer is freed by kimage_free. 
> + */ > +void > +machine_kexec_cleanup(struct kimage *image) > +{ > +} > + > + > +/* > + * machine_shutdown - Prepare for a kexec reboot > + * > + * This function is called by kernel_kexec just before machine_kexec > + * below. Its goal is to prepare the rest of the system (the other > + * harts and possibly devices etc) for a kexec reboot. > + */ > +void machine_shutdown(void) > +{ > + /* > + * No more interrupts on this hart > + * until we are back up. > + */ > + local_irq_disable(); > + > +#if defined(CONFIG_HOTPLUG_CPU) && (CONFIG_SMP)

Shouldn't it be defined(CONFIG_SMP) ?

> + smp_shutdown_nonboot_cpus(smp_processor_id()); > +#endif > +} > + > +/** > + * machine_crash_shutdown - Prepare to kexec after a kernel crash > + * > + * This function is called by crash_kexec just before machine_kexec > + * below and its goal is similar to machine_shutdown, but in case of > + * a kernel crash. Since we don't handle such cases yet, this function > + * is empty. > + */ > +void > +machine_crash_shutdown(struct pt_regs *regs) > +{ > +} > + > +/** > + * machine_kexec - Jump to the loaded kimage > + * > + * This function is called by kernel_kexec which is called by the > + * reboot system call when the reboot cmd is LINUX_REBOOT_CMD_KEXEC, > + * or by crash_kernel which is called by the kernel's arch-specific > + * trap handler in case of a kernel panic. It's the final stage of > + * the kexec process where the pre-loaded kimage is ready to be > + * executed. We assume at this point that all other harts are > + * suspended and this hart will be the new boot hart.
> + */ > +void __noreturn > +machine_kexec(struct kimage *image) > +{ > + struct kimage_arch *internal = &image->arch; > + unsigned long jump_addr = (unsigned long) image->start; > + unsigned long first_ind_entry = (unsigned long) &image->head; > + unsigned long this_hart_id = raw_smp_processor_id(); > + unsigned long fdt_addr = internal->fdt_addr; > + void *control_code_buffer = page_address(image->control_code_page); > + riscv_kexec_do_relocate do_relocate = control_code_buffer; > + > + pr_notice("Will call new kernel at %08lx from hart id %lx\n", > + jump_addr, this_hart_id); > + pr_notice("FDT image at %08lx\n", fdt_addr); > + > + /* Make sure the relocation code is visible to the hart */ > + local_flush_icache_all(); > + > + /* Jump to the relocation code */ > + pr_notice("Bye...\n"); > + do_relocate(first_ind_entry, jump_addr, fdt_addr, > + this_hart_id, va_pa_offset); > + unreachable(); > +} > _______________________________________________ linux-riscv mailing list linux-riscv@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-riscv ^ permalink raw reply [flat|nested] 53+ messages in thread
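The segment scan in machine_kexec_prepare() quoted above relies on fdt_check_header() to recognise the device tree blob. As a rough illustration of the first thing such a check looks at, here is a minimal standalone sketch that only tests the big-endian FDT magic number (0xd00dfeed). The name demo_looks_like_fdt is hypothetical and not part of libfdt; the real fdt_check_header() validates more than the magic (version and size fields, for example).

```c
#include <stddef.h>
#include <stdint.h>

/* FDT magic as stored (big-endian) in the first four bytes of the blob. */
#define DEMO_FDT_MAGIC 0xd00dfeedU

/*
 * Hypothetical helper, for illustration only: returns 1 when the buffer
 * starts with the FDT magic, 0 otherwise. The bytes are assembled by hand
 * so the result does not depend on host endianness.
 */
int demo_looks_like_fdt(const unsigned char *buf, size_t len)
{
	uint32_t magic;

	if (len < 4)
		return 0;

	magic = ((uint32_t)buf[0] << 24) |
		((uint32_t)buf[1] << 16) |
		((uint32_t)buf[2] << 8) |
		(uint32_t)buf[3];

	return magic == DEMO_FDT_MAGIC;
}
```

A caller would pass the first bytes of each candidate segment and treat a zero result as "not the FDT, keep looking", which mirrors the `continue` in the quoted loop.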
* Re: [PATCH v3 2/5] RISC-V: Add kexec support
  2021-04-06 18:38 ` Alex Ghiti
@ 2021-04-09 10:19 ` Nick Kossifidis
  -1 siblings, 0 replies; 53+ messages in thread
From: Nick Kossifidis @ 2021-04-09 10:19 UTC (permalink / raw)
  To: Alex Ghiti
  Cc: Nick Kossifidis, linux-riscv, palmer, paul.walmsley, linux-kernel

On 2021-04-06 21:38, Alex Ghiti wrote:
> On 4/5/21 at 4:57 AM, Nick Kossifidis wrote:
>> +
>> +/* Reserve a page for the control code buffer */
>> +#define KEXEC_CONTROL_PAGE_SIZE 4096
>
> PAGE_SIZE instead ?
>

Yup, I'll change it.

>> +obj-${CONFIG_KEXEC} += kexec_relocate.o machine_kexec.o
>
> Other obj-$() use parenthesis.
>

ACK

>> + li s5, ((1 << PAGE_SHIFT) / RISCV_SZPTR)
>
> 1 << PAGE_SHIFT = PAGE_SIZE
>

ACK

>> +#if defined(CONFIG_HOTPLUG_CPU) && (CONFIG_SMP)
>
> Shouldn't it be defined(CONFIG_SMP) ?
>

It depends on SMP anyway, I'll remove the second part.

^ permalink raw reply [flat|nested] 53+ messages in thread
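For readers following the `#if` exchange, the difference between `(CONFIG_SMP)` and `defined(CONFIG_SMP)` can be reproduced outside the kernel. The sketch below is not kernel code: the DEMO_* macros are hypothetical stand-ins, assuming the usual Kconfig convention that an enabled bool option is defined as 1 and a disabled one is left undefined. Under that assumption both spellings select the same branch, because an undefined identifier evaluates to 0 inside `#if`; the `defined()` spelling is simply the one that states the intent and stays quiet under -Wundef.

```c
/* Hypothetical stand-ins for Kconfig symbols (assumption: enabled == 1,
 * disabled == not defined at all). */
#define DEMO_HOTPLUG_CPU 1
#define DEMO_SMP_ON 1
/* DEMO_SMP_OFF is intentionally left undefined. */

/* Bare-identifier form, as in the quoted patch. */
#if defined(DEMO_HOTPLUG_CPU) && (DEMO_SMP_ON)
#define DEMO_BARE_ON 1
#else
#define DEMO_BARE_ON 0
#endif

/* An undefined identifier in #if is replaced by 0, so this compiles too. */
#if defined(DEMO_HOTPLUG_CPU) && (DEMO_SMP_OFF)
#define DEMO_BARE_OFF 1
#else
#define DEMO_BARE_OFF 0
#endif

/* defined() form suggested in the review. */
#if defined(DEMO_HOTPLUG_CPU) && defined(DEMO_SMP_ON)
#define DEMO_DEFINED_ON 1
#else
#define DEMO_DEFINED_ON 0
#endif

#if defined(DEMO_HOTPLUG_CPU) && defined(DEMO_SMP_OFF)
#define DEMO_DEFINED_OFF 1
#else
#define DEMO_DEFINED_OFF 0
#endif

/* Report which branch each spelling selected. */
int demo_bare_form(int smp_on)
{
	return smp_on ? DEMO_BARE_ON : DEMO_BARE_OFF;
}

int demo_defined_form(int smp_on)
{
	return smp_on ? DEMO_DEFINED_ON : DEMO_DEFINED_OFF;
}
```

The two forms would only diverge for a symbol defined to 0 or defined with an empty body, which the assumption above rules out.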
* Re: [PATCH v3 2/5] RISC-V: Add kexec support
  2021-04-05 8:57 ` Nick Kossifidis
@ 2021-04-23 3:30 ` Palmer Dabbelt
  -1 siblings, 0 replies; 53+ messages in thread
From: Palmer Dabbelt @ 2021-04-23 3:30 UTC (permalink / raw)
  To: mick; +Cc: linux-riscv, Paul Walmsley, linux-kernel, mick

On Mon, 05 Apr 2021 01:57:09 PDT (-0700), mick@ics.forth.gr wrote:
> This patch adds support for kexec on RISC-V. On SMP systems it depends
> on HOTPLUG_CPU in order to be able to bring up all harts after kexec.
> It also needs a recent OpenSBI version that supports the HSM extension.
> I tested it on riscv64 QEMU on both an smp and a non-smp system.
>
> v5:
> * For now depend on MMU, further changes needed for NOMMU support
> * Make sure stvec is aligned
> * Cleanup some unneeded fences
> * Verify control code's buffer size
> * Compile kexec_relocate.S with medany and norelax
>
> v4:
> * No functional changes, just re-based
>
> v3:
> * Use the new smp_shutdown_nonboot_cpus() call.
> * Move riscv_kexec_relocate to .rodata
>
> v2:
> * Pass needed parameters as arguments to riscv_kexec_relocate
>   instead of using global variables.
> * Use kimage_arch to hold the fdt address of the included fdt.
> * Use SYM_* macros on kexec_relocate.S.
> * Compatibility with STRICT_KERNEL_RWX.
> * Compatibility with HOTPLUG_CPU for SMP
> * Small cleanups

If you put these below a "---" then I don't have to manually remove them, but the best thing to do is to include the changelog as part of the cover letter when you have one as it's pretty tough to track changelogs on single patches.

> > Signed-off-by: Nick Kossifidis <mick@ics.forth.gr> > --- > arch/riscv/Kconfig | 15 +++ > arch/riscv/include/asm/kexec.h | 47 ++++++++ > arch/riscv/kernel/Makefile | 5 + > arch/riscv/kernel/kexec_relocate.S | 156 ++++++++++++++++++++++++ > arch/riscv/kernel/machine_kexec.c | 186 +++++++++++++++++++++++++++++ > 5 files changed, 409 insertions(+) > create mode 100644 arch/riscv/include/asm/kexec.h > create mode 100644 arch/riscv/kernel/kexec_relocate.S > create mode 100644 arch/riscv/kernel/machine_kexec.c > > diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig > index 8ea60a0a1..3716262ef 100644 > --- a/arch/riscv/Kconfig > +++ b/arch/riscv/Kconfig > @@ -389,6 +389,21 @@ config RISCV_SBI_V01 > help > This config allows kernel to use SBI v0.1 APIs. This will be > deprecated in future once legacy M-mode software are no longer in use. > + > +config KEXEC > + bool "Kexec system call" > + select KEXEC_CORE > + select HOTPLUG_CPU if SMP > + depends on MMU > + help > + kexec is a system call that implements the ability to shutdown your > + current kernel, and to start another kernel. It is like a reboot > + but it is independent of the system firmware. And like a reboot > + you can start any kernel with it, not just Linux. > + > + The name comes from the similarity to the exec system call. 
> + > + > endmenu > > menu "Boot options" > diff --git a/arch/riscv/include/asm/kexec.h b/arch/riscv/include/asm/kexec.h > new file mode 100644 > index 000000000..efc69feb4 > --- /dev/null > +++ b/arch/riscv/include/asm/kexec.h > @@ -0,0 +1,47 @@ > +/* SPDX-License-Identifier: GPL-2.0 */ > +/* > + * Copyright (C) 2019 FORTH-ICS/CARV > + * Nick Kossifidis <mick@ics.forth.gr> > + */ > + > +#ifndef _RISCV_KEXEC_H > +#define _RISCV_KEXEC_H > + > +/* Maximum physical address we can use pages from */ > +#define KEXEC_SOURCE_MEMORY_LIMIT (-1UL) > + > +/* Maximum address we can reach in physical address mode */ > +#define KEXEC_DESTINATION_MEMORY_LIMIT (-1UL) > + > +/* Maximum address we can use for the control code buffer */ > +#define KEXEC_CONTROL_MEMORY_LIMIT (-1UL) > + > +/* Reserve a page for the control code buffer */ > +#define KEXEC_CONTROL_PAGE_SIZE 4096 > + > +#define KEXEC_ARCH KEXEC_ARCH_RISCV > + > +static inline void > +crash_setup_regs(struct pt_regs *newregs, > + struct pt_regs *oldregs) > +{ > + /* Dummy implementation for now */ > +} > + > + > +#define ARCH_HAS_KIMAGE_ARCH > + > +struct kimage_arch { > + unsigned long fdt_addr; > +}; > + > +const extern unsigned char riscv_kexec_relocate[]; > +const extern unsigned int riscv_kexec_relocate_size; > + > +typedef void (*riscv_kexec_do_relocate)(unsigned long first_ind_entry, > + unsigned long jump_addr, > + unsigned long fdt_addr, > + unsigned long hartid, > + unsigned long va_pa_off); > + > +#endif > diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile > index 3dc0abde9..c2594018c 100644 > --- a/arch/riscv/kernel/Makefile > +++ b/arch/riscv/kernel/Makefile > @@ -9,6 +9,10 @@ CFLAGS_REMOVE_patch.o = $(CC_FLAGS_FTRACE) > CFLAGS_REMOVE_sbi.o = $(CC_FLAGS_FTRACE) > endif > > +ifdef CONFIG_KEXEC > +AFLAGS_kexec_relocate.o := -mcmodel=medany -mno-relax > +endif > + > extra-y += head.o > extra-y += vmlinux.lds > > @@ -54,6 +58,7 @@ obj-$(CONFIG_SMP) += cpu_ops_sbi.o > endif > 
obj-$(CONFIG_HOTPLUG_CPU) += cpu-hotplug.o > obj-$(CONFIG_KGDB) += kgdb.o > +obj-${CONFIG_KEXEC} += kexec_relocate.o machine_kexec.o > > obj-$(CONFIG_JUMP_LABEL) += jump_label.o > > diff --git a/arch/riscv/kernel/kexec_relocate.S b/arch/riscv/kernel/kexec_relocate.S > new file mode 100644 > index 000000000..616c20771 > --- /dev/null > +++ b/arch/riscv/kernel/kexec_relocate.S > @@ -0,0 +1,156 @@ > +/* SPDX-License-Identifier: GPL-2.0 */ > +/* > + * Copyright (C) 2019 FORTH-ICS/CARV > + * Nick Kossifidis <mick@ics.forth.gr> > + */ > + > +#include <asm/asm.h> /* For RISCV_* and REG_* macros */ > +#include <asm/page.h> /* For PAGE_SHIFT */ > +#include <linux/linkage.h> /* For SYM_* macros */ > + > +.section ".rodata" > +SYM_CODE_START(riscv_kexec_relocate) > + > + /* > + * s0: Pointer to the current entry > + * s1: (const) Phys address to jump to after relocation > + * s2: (const) Phys address of the FDT image > + * s3: (const) The hartid of the current hart > + * s4: Pointer to the destination address for the relocation > + * s5: (const) Number of words per page > + * s6: (const) 1, used for subtraction > + * s7: (const) va_pa_offset, used when switching MMU off > + * s8: (const) Physical address of the main loop > + * s9: (debug) indirection page counter > + * s10: (debug) entry counter > + * s11: (debug) copied words counter > + */ > + mv s0, a0 > + mv s1, a1 > + mv s2, a2 > + mv s3, a3 > + mv s4, zero > + li s5, ((1 << PAGE_SHIFT) / RISCV_SZPTR) > + li s6, 1 > + mv s7, a4 > + mv s8, zero > + mv s9, zero > + mv s10, zero > + mv s11, zero > + > + /* Disable / cleanup interrupts */ > + csrw sie, zero > + csrw sip, zero > + > + /* > + * When we switch SATP.MODE to "Bare" we'll only > + * play with physical addresses. However the first time > + * we try to jump somewhere, the offset on the jump > + * will be relative to pc which will still be on VA. 
To > + * deal with this we set stvec to the physical address at > + * the start of the loop below so that we jump there in > + * any case. > + */ > + la s8, 1f > + sub s8, s8, s7 > + csrw stvec, s8 > + > + /* Process entries in a loop */ > +.align 2 > +1: > + addi s10, s10, 1 > + REG_L t0, 0(s0) /* t0 = *image->entry */ > + addi s0, s0, RISCV_SZPTR /* image->entry++ */ > + > + /* IND_DESTINATION entry ? -> save destination address */ > + andi t1, t0, 0x1 > + beqz t1, 2f > + andi s4, t0, ~0x1 > + j 1b > + > +2: > + /* IND_INDIRECTION entry ? -> update next entry ptr (PA) */ > + andi t1, t0, 0x2 > + beqz t1, 2f > + andi s0, t0, ~0x2 > + addi s9, s9, 1 > + csrw sptbr, zero > + jalr zero, s8, 0 > + > +2: > + /* IND_DONE entry ? -> jump to done label */ > + andi t1, t0, 0x4 > + beqz t1, 2f > + j 4f > + > +2: > + /* > + * IND_SOURCE entry ? -> copy page word by word to the > + * destination address we got from IND_DESTINATION > + */ > + andi t1, t0, 0x8 > + beqz t1, 1b /* Unknown entry type, ignore it */ > + andi t0, t0, ~0x8 > + mv t3, s5 /* i = num words per page */ > +3: /* copy loop */ > + REG_L t1, (t0) /* t1 = *src_ptr */ > + REG_S t1, (s4) /* *dst_ptr = *src_ptr */ > + addi t0, t0, RISCV_SZPTR /* stc_ptr++ */ > + addi s4, s4, RISCV_SZPTR /* dst_ptr++ */ > + sub t3, t3, s6 /* i-- */ > + addi s11, s11, 1 /* c++ */ > + beqz t3, 1b /* copy done ? 
*/ > + j 3b > + > +4: > + /* Pass the arguments to the next kernel / Cleanup*/ > + mv a0, s3 > + mv a1, s2 > + mv a2, s1 > + > + /* Cleanup */ > + mv a3, zero > + mv a4, zero > + mv a5, zero > + mv a6, zero > + mv a7, zero > + > + mv s0, zero > + mv s1, zero > + mv s2, zero > + mv s3, zero > + mv s4, zero > + mv s5, zero > + mv s6, zero > + mv s7, zero > + mv s8, zero > + mv s9, zero > + mv s10, zero > + mv s11, zero > + > + mv t0, zero > + mv t1, zero > + mv t2, zero > + mv t3, zero > + mv t4, zero > + mv t5, zero > + mv t6, zero > + csrw sepc, zero > + csrw scause, zero > + csrw sscratch, zero > + > + /* > + * Make sure the relocated code is visible > + * and jump to the new kernel > + */ > + fence.i > + > + jalr zero, a2, 0 > + > +SYM_CODE_END(riscv_kexec_relocate) > +riscv_kexec_relocate_end: > + > + .section ".rodata" > +SYM_DATA(riscv_kexec_relocate_size, > + .long riscv_kexec_relocate_end - riscv_kexec_relocate) > + > diff --git a/arch/riscv/kernel/machine_kexec.c b/arch/riscv/kernel/machine_kexec.c > new file mode 100644 > index 000000000..2ce6c3daf > --- /dev/null > +++ b/arch/riscv/kernel/machine_kexec.c > @@ -0,0 +1,186 @@ > +// SPDX-License-Identifier: GPL-2.0 > +/* > + * Copyright (C) 2019 FORTH-ICS/CARV > + * Nick Kossifidis <mick@ics.forth.gr> > + */ > + > +#include <linux/kexec.h> > +#include <asm/kexec.h> /* For riscv_kexec_* symbol defines */ > +#include <linux/smp.h> /* For smp_send_stop () */ > +#include <asm/cacheflush.h> /* For local_flush_icache_all() */ > +#include <asm/barrier.h> /* For smp_wmb() */ > +#include <asm/page.h> /* For PAGE_MASK */ > +#include <linux/libfdt.h> /* For fdt_check_header() */ > +#include <asm/set_memory.h> /* For set_memory_x() */ > +#include <linux/compiler.h> /* For unreachable() */ > +#include <linux/cpu.h> /* For cpu_down() */ > + > +/** > + * kexec_image_info - Print received image details > + */ > +static void > +kexec_image_info(const struct kimage *image) > +{ > + unsigned long i; > + > + pr_debug("Kexec 
image info:\n"); > + pr_debug("\ttype: %d\n", image->type); > + pr_debug("\tstart: %lx\n", image->start); > + pr_debug("\thead: %lx\n", image->head); > + pr_debug("\tnr_segments: %lu\n", image->nr_segments); > + > + for (i = 0; i < image->nr_segments; i++) { > + pr_debug("\t segment[%lu]: %016lx - %016lx", i, > + image->segment[i].mem, > + image->segment[i].mem + image->segment[i].memsz); > + pr_debug("\t\t0x%lx bytes, %lu pages\n", > + (unsigned long) image->segment[i].memsz, > + (unsigned long) image->segment[i].memsz / PAGE_SIZE); > + } > +} > + > +/** > + * machine_kexec_prepare - Initialize kexec > + * > + * This function is called from do_kexec_load, when the user has > + * provided us with an image to be loaded. Its goal is to validate > + * the image and prepare the control code buffer as needed. > + * Note that kimage_alloc_init has already been called and the > + * control buffer has already been allocated. > + */ > +int > +machine_kexec_prepare(struct kimage *image) > +{ > + struct kimage_arch *internal = &image->arch; > + struct fdt_header fdt = {0}; > + void *control_code_buffer = NULL; > + unsigned int control_code_buffer_sz = 0; > + int i = 0; > + > + kexec_image_info(image); > + > + if (image->type == KEXEC_TYPE_CRASH) { > + pr_warn("Loading a crash kernel is unsupported for now.\n"); > + return -EINVAL; > + } > + > + /* Find the Flattened Device Tree and save its physical address */ > + for (i = 0; i < image->nr_segments; i++) { > + if (image->segment[i].memsz <= sizeof(fdt)) > + continue; > + > + if (copy_from_user(&fdt, image->segment[i].buf, sizeof(fdt))) > + continue; > + > + if (fdt_check_header(&fdt)) > + continue; > + > + internal->fdt_addr = (unsigned long) image->segment[i].mem; > + break; > + } > + > + if (!internal->fdt_addr) { > + pr_err("Device tree not included in the provided image\n"); > + return -EINVAL; > + } > + > + /* Copy the assembler code for relocation to the control page */ > + control_code_buffer = 
page_address(image->control_code_page); > + control_code_buffer_sz = page_size(image->control_code_page); > + if (unlikely(riscv_kexec_relocate_size > control_code_buffer_sz)) { > + pr_err("Relocation code doesn't fit within a control page\n"); > + return -EINVAL; > + } > + memcpy(control_code_buffer, riscv_kexec_relocate, > + riscv_kexec_relocate_size); > + > + /* Mark the control page executable */ > + set_memory_x((unsigned long) control_code_buffer, 1); > + > + return 0; > +} > + > + > +/** > + * machine_kexec_cleanup - Cleanup any leftovers from > + * machine_kexec_prepare > + * > + * This function is called by kimage_free to handle any arch-specific > + * allocations done on machine_kexec_prepare. Since we didn't do any > + * allocations there, this is just an empty function. Note that the > + * control buffer is freed by kimage_free. > + */ > +void > +machine_kexec_cleanup(struct kimage *image) > +{ > +} > + > + > +/* > + * machine_shutdown - Prepare for a kexec reboot > + * > + * This function is called by kernel_kexec just before machine_kexec > + * below. Its goal is to prepare the rest of the system (the other > + * harts and possibly devices etc) for a kexec reboot. > + */ > +void machine_shutdown(void) > +{ > + /* > + * No more interrupts on this hart > + * until we are back up. > + */ > + local_irq_disable(); > + > +#if defined(CONFIG_HOTPLUG_CPU) && (CONFIG_SMP) > + smp_shutdown_nonboot_cpus(smp_processor_id()); > +#endif > +} > + > +/** > + * machine_crash_shutdown - Prepare to kexec after a kernel crash > + * > + * This function is called by crash_kexec just before machine_kexec > + * below and its goal is similar to machine_shutdown, but in case of > + * a kernel crash. Since we don't handle such cases yet, this function > + * is empty. 
> + */ > +void > +machine_crash_shutdown(struct pt_regs *regs) > +{ > +} > + > +/** > + * machine_kexec - Jump to the loaded kimage > + * > + * This function is called by kernel_kexec which is called by the > + * reboot system call when the reboot cmd is LINUX_REBOOT_CMD_KEXEC, > + * or by crash_kernel which is called by the kernel's arch-specific > + * trap handler in case of a kernel panic. It's the final stage of > + * the kexec process where the pre-loaded kimage is ready to be > + * executed. We assume at this point that all other harts are > + * suspended and this hart will be the new boot hart. > + */ > +void __noreturn > +machine_kexec(struct kimage *image) > +{ > + struct kimage_arch *internal = &image->arch; > + unsigned long jump_addr = (unsigned long) image->start; > + unsigned long first_ind_entry = (unsigned long) &image->head; > + unsigned long this_hart_id = raw_smp_processor_id(); > + unsigned long fdt_addr = internal->fdt_addr; > + void *control_code_buffer = page_address(image->control_code_page); > + riscv_kexec_do_relocate do_relocate = control_code_buffer; > + > + pr_notice("Will call new kernel at %08lx from hart id %lx\n", > + jump_addr, this_hart_id); > + pr_notice("FDT image at %08lx\n", fdt_addr); > + > + /* Make sure the relocation code is visible to the hart */ > + local_flush_icache_all(); > + > + /* Jump to the relocation code */ > + pr_notice("Bye...\n"); > + do_relocate(first_ind_entry, jump_addr, fdt_addr, > + this_hart_id, va_pa_offset); > + unreachable(); > +} ^ permalink raw reply [flat|nested] 53+ messages in thread
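The main loop of riscv_kexec_relocate in the quoted patch is easier to follow as C. What follows is a userspace model of that loop under simplifying assumptions, not the kernel's implementation: entries hold ordinary pointers instead of physical addresses, a "page" is only DEMO_PAGE_WORDS words long, and the MMU switch performed on the first indirection entry is left out. The IND_* values match the masks used by the assembly (0x1, 0x2, 0x4, 0x8); the demo_* names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Flag bits carried in the low bits of each entry, as tested by the asm. */
#define IND_DESTINATION 0x1UL
#define IND_INDIRECTION 0x2UL
#define IND_DONE        0x4UL
#define IND_SOURCE      0x8UL

/* Assumption for the demo: a tiny "page" instead of PAGE_SIZE / RISCV_SZPTR. */
#define DEMO_PAGE_WORDS 8

/* Walk the entry list and copy source pages; returns the words copied. */
size_t demo_walk_entries(const uintptr_t *entry)
{
	uintptr_t *dst = NULL;
	size_t copied = 0;
	size_t i;

	for (;;) {
		uintptr_t e = *entry++;

		if (e & IND_DESTINATION) {
			/* Remember where the following source pages go. */
			dst = (uintptr_t *)(e & ~IND_DESTINATION);
		} else if (e & IND_INDIRECTION) {
			/* Continue with the next page of entries. */
			entry = (const uintptr_t *)(e & ~IND_INDIRECTION);
		} else if (e & IND_DONE) {
			break;
		} else if (e & IND_SOURCE) {
			const uintptr_t *src =
				(const uintptr_t *)(e & ~IND_SOURCE);

			/* Demo-only guard; the assembly has no such check. */
			if (!dst)
				continue;

			for (i = 0; i < DEMO_PAGE_WORDS; i++) {
				*dst++ = *src++;
				copied++;
			}
		}
		/* Unknown entry types are ignored, as in the assembly. */
	}

	return copied;
}

/* Build a two-level entry list, run the walk, and verify the copy. */
int demo_run(void)
{
	/* 16-byte alignment keeps the four flag bits of each address clear. */
	static _Alignas(16) uintptr_t src[DEMO_PAGE_WORDS];
	static _Alignas(16) uintptr_t dst[DEMO_PAGE_WORDS];
	static _Alignas(16) uintptr_t second[2];
	static _Alignas(16) uintptr_t first[2];
	size_t i;

	for (i = 0; i < DEMO_PAGE_WORDS; i++) {
		src[i] = i + 1;
		dst[i] = 0;
	}

	second[0] = (uintptr_t)src | IND_SOURCE;
	second[1] = IND_DONE;
	first[0] = (uintptr_t)dst | IND_DESTINATION;
	first[1] = (uintptr_t)second | IND_INDIRECTION;

	if (demo_walk_entries(first) != DEMO_PAGE_WORDS)
		return 0;

	for (i = 0; i < DEMO_PAGE_WORDS; i++)
		if (dst[i] != src[i])
			return 0;

	return 1;
}
```

The order of the flag tests follows the assembly (destination, indirection, done, source), and the destination pointer keeps advancing across consecutive source entries, which is how several source pages end up in adjacent destination pages.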
page_address(image->control_code_page); > + control_code_buffer_sz = page_size(image->control_code_page); > + if (unlikely(riscv_kexec_relocate_size > control_code_buffer_sz)) { > + pr_err("Relocation code doesn't fit within a control page\n"); > + return -EINVAL; > + } > + memcpy(control_code_buffer, riscv_kexec_relocate, > + riscv_kexec_relocate_size); > + > + /* Mark the control page executable */ > + set_memory_x((unsigned long) control_code_buffer, 1); > + > + return 0; > +} > + > + > +/** > + * machine_kexec_cleanup - Cleanup any leftovers from > + * machine_kexec_prepare > + * > + * This function is called by kimage_free to handle any arch-specific > + * allocations done on machine_kexec_prepare. Since we didn't do any > + * allocations there, this is just an empty function. Note that the > + * control buffer is freed by kimage_free. > + */ > +void > +machine_kexec_cleanup(struct kimage *image) > +{ > +} > + > + > +/* > + * machine_shutdown - Prepare for a kexec reboot > + * > + * This function is called by kernel_kexec just before machine_kexec > + * below. Its goal is to prepare the rest of the system (the other > + * harts and possibly devices etc) for a kexec reboot. > + */ > +void machine_shutdown(void) > +{ > + /* > + * No more interrupts on this hart > + * until we are back up. > + */ > + local_irq_disable(); > + > +#if defined(CONFIG_HOTPLUG_CPU) && (CONFIG_SMP) > + smp_shutdown_nonboot_cpus(smp_processor_id()); > +#endif > +} > + > +/** > + * machine_crash_shutdown - Prepare to kexec after a kernel crash > + * > + * This function is called by crash_kexec just before machine_kexec > + * below and its goal is similar to machine_shutdown, but in case of > + * a kernel crash. Since we don't handle such cases yet, this function > + * is empty. 
> + */ > +void > +machine_crash_shutdown(struct pt_regs *regs) > +{ > +} > + > +/** > + * machine_kexec - Jump to the loaded kimage > + * > + * This function is called by kernel_kexec which is called by the > + * reboot system call when the reboot cmd is LINUX_REBOOT_CMD_KEXEC, > + * or by crash_kernel which is called by the kernel's arch-specific > + * trap handler in case of a kernel panic. It's the final stage of > + * the kexec process where the pre-loaded kimage is ready to be > + * executed. We assume at this point that all other harts are > + * suspended and this hart will be the new boot hart. > + */ > +void __noreturn > +machine_kexec(struct kimage *image) > +{ > + struct kimage_arch *internal = &image->arch; > + unsigned long jump_addr = (unsigned long) image->start; > + unsigned long first_ind_entry = (unsigned long) &image->head; > + unsigned long this_hart_id = raw_smp_processor_id(); > + unsigned long fdt_addr = internal->fdt_addr; > + void *control_code_buffer = page_address(image->control_code_page); > + riscv_kexec_do_relocate do_relocate = control_code_buffer; > + > + pr_notice("Will call new kernel at %08lx from hart id %lx\n", > + jump_addr, this_hart_id); > + pr_notice("FDT image at %08lx\n", fdt_addr); > + > + /* Make sure the relocation code is visible to the hart */ > + local_flush_icache_all(); > + > + /* Jump to the relocation code */ > + pr_notice("Bye...\n"); > + do_relocate(first_ind_entry, jump_addr, fdt_addr, > + this_hart_id, va_pa_offset); > + unreachable(); > +} _______________________________________________ linux-riscv mailing list linux-riscv@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-riscv ^ permalink raw reply [flat|nested] 53+ messages in thread
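For readers following the relocation loop in kexec_relocate.S, the entry walk can be restated in C. This is a minimal userspace sketch, not the patch's code: the flag values are taken from the assembly comments above, while `sketch_relocate`, `sketch_demo` and `SKETCH_PAGE_SIZE` are hypothetical names, and a 4 KiB page size is assumed. It ignores everything the assembly does around the loop (stvec, disabling translation, register cleanup).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Entry type flags, as tested by the assembly (low bits of each entry). */
#define IND_DESTINATION 0x1UL
#define IND_INDIRECTION 0x2UL
#define IND_DONE        0x4UL
#define IND_SOURCE      0x8UL

#define SKETCH_PAGE_SIZE 4096UL /* assumption: 4 KiB pages */

/*
 * Walk an entry list and copy every source page to the current
 * destination. Returns the number of pages copied. Like the assembly,
 * it trusts the list: a source entry is expected to follow a
 * destination entry, and the list must end with IND_DONE.
 */
static unsigned long sketch_relocate(const uintptr_t *entry)
{
	unsigned char *dst = NULL;
	unsigned long copied = 0;

	for (;;) {
		uintptr_t e = *entry++;

		if (e & IND_DESTINATION) {
			/* Remember where the next source pages go. */
			dst = (unsigned char *)(e & ~IND_DESTINATION);
		} else if (e & IND_INDIRECTION) {
			/* Continue reading entries from another page. */
			entry = (const uintptr_t *)(e & ~IND_INDIRECTION);
		} else if (e & IND_DONE) {
			break;
		} else if (e & IND_SOURCE) {
			memcpy(dst, (const void *)(e & ~IND_SOURCE),
			       SKETCH_PAGE_SIZE);
			dst += SKETCH_PAGE_SIZE; /* dst_ptr advances with the copy */
			copied++;
		}
		/* Any other entry type is ignored. */
	}
	return copied;
}

/* Build a two-page image split across two entry lists; 0 on success. */
static int sketch_demo(void)
{
	unsigned char *src = aligned_alloc(SKETCH_PAGE_SIZE, 2 * SKETCH_PAGE_SIZE);
	unsigned char *dst = aligned_alloc(SKETCH_PAGE_SIZE, 2 * SKETCH_PAGE_SIZE);
	uintptr_t *second = aligned_alloc(SKETCH_PAGE_SIZE, SKETCH_PAGE_SIZE);
	uintptr_t first[3];
	int ok;

	if (!src || !dst || !second)
		return -1;

	memset(src, 0xAB, SKETCH_PAGE_SIZE);
	memset(src + SKETCH_PAGE_SIZE, 0xCD, SKETCH_PAGE_SIZE);
	memset(dst, 0, 2 * SKETCH_PAGE_SIZE);

	second[0] = (uintptr_t)(src + SKETCH_PAGE_SIZE) | IND_SOURCE;
	second[1] = IND_DONE;

	first[0] = (uintptr_t)dst | IND_DESTINATION;
	first[1] = (uintptr_t)src | IND_SOURCE;
	first[2] = (uintptr_t)second | IND_INDIRECTION;

	ok = sketch_relocate(first) == 2 &&
	     dst[0] == 0xAB && dst[SKETCH_PAGE_SIZE] == 0xCD;

	free(src);
	free(dst);
	free(second);
	return ok ? 0 : -1;
}
```

The flags live in the low bits of each entry because every address stored in the list is page aligned, which is why the assembly can mask them off with a single andi.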
* [PATCH v3 3/5] RISC-V: Improve init_resources 2021-04-05 8:57 ` Nick Kossifidis @ 2021-04-05 8:57 ` Nick Kossifidis -1 siblings, 0 replies; 53+ messages in thread From: Nick Kossifidis @ 2021-04-05 8:57 UTC (permalink / raw) To: linux-riscv, palmer Cc: paul.walmsley, linux-kernel, Nick Kossifidis, Geert Uytterhoeven * Kernel region is always present and we know where it is, no need to look for it inside the loop, just ignore it like the rest of the reserved regions within system's memory. * Don't call memblock_free inside the loop, if called it'll split the region of pre-allocated resources in two parts, messing things up, just re-use the previous pre-allocated resource and free any unused resources after both loops finish. * memblock_alloc may add a region when called, so increase the number of pre-allocated regions by one to be on the safe side (reported and patched by Geert Uytterhoeven) Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Nick Kossifidis <mick@ics.forth.gr> --- arch/riscv/kernel/setup.c | 90 ++++++++++++++++++++------------------- 1 file changed, 46 insertions(+), 44 deletions(-) diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c index e85bacff1..030554bab 100644 --- a/arch/riscv/kernel/setup.c +++ b/arch/riscv/kernel/setup.c @@ -60,6 +60,7 @@ static DEFINE_PER_CPU(struct cpu, cpu_devices); * also add "System RAM" regions for compatibility with other * archs, and the rest of the known regions for completeness. 
*/ +static struct resource kimage_res = { .name = "Kernel image", }; static struct resource code_res = { .name = "Kernel code", }; static struct resource data_res = { .name = "Kernel data", }; static struct resource rodata_res = { .name = "Kernel rodata", }; @@ -80,45 +81,54 @@ static int __init add_resource(struct resource *parent, return 1; } -static int __init add_kernel_resources(struct resource *res) +static int __init add_kernel_resources(void) { int ret = 0; /* * The memory region of the kernel image is continuous and - * was reserved on setup_bootmem, find it here and register - * it as a resource, then register the various segments of - * the image as child nodes + * was reserved on setup_bootmem, register it here as a + * resource, with the various segments of the image as + * child nodes. */ - if (!(res->start <= code_res.start && res->end >= data_res.end)) - return 0; - res->name = "Kernel image"; - res->flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY; + code_res.start = __pa_symbol(_text); + code_res.end = __pa_symbol(_etext) - 1; + code_res.flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY; - /* - * We removed a part of this region on setup_bootmem so - * we need to expand the resource for the bss to fit in. 
- */ - res->end = bss_res.end; + rodata_res.start = __pa_symbol(__start_rodata); + rodata_res.end = __pa_symbol(__end_rodata) - 1; + rodata_res.flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY; - ret = add_resource(&iomem_resource, res); + data_res.start = __pa_symbol(_data); + data_res.end = __pa_symbol(_edata) - 1; + data_res.flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY; + + bss_res.start = __pa_symbol(__bss_start); + bss_res.end = __pa_symbol(__bss_stop) - 1; + bss_res.flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY; + + kimage_res.start = code_res.start; + kimage_res.end = bss_res.end; + kimage_res.flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY; + + ret = add_resource(&iomem_resource, &kimage_res); if (ret < 0) return ret; - ret = add_resource(res, &code_res); + ret = add_resource(&kimage_res, &code_res); if (ret < 0) return ret; - ret = add_resource(res, &rodata_res); + ret = add_resource(&kimage_res, &rodata_res); if (ret < 0) return ret; - ret = add_resource(res, &data_res); + ret = add_resource(&kimage_res, &data_res); if (ret < 0) return ret; - ret = add_resource(res, &bss_res); + ret = add_resource(&kimage_res, &bss_res); return ret; } @@ -129,53 +139,42 @@ static void __init init_resources(void) struct resource *res = NULL; struct resource *mem_res = NULL; size_t mem_res_sz = 0; - int ret = 0, i = 0; - - code_res.start = __pa_symbol(_text); - code_res.end = __pa_symbol(_etext) - 1; - code_res.flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY; - - rodata_res.start = __pa_symbol(__start_rodata); - rodata_res.end = __pa_symbol(__end_rodata) - 1; - rodata_res.flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY; - - data_res.start = __pa_symbol(_data); - data_res.end = __pa_symbol(_edata) - 1; - data_res.flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY; + int num_resources = 0, res_idx = 0; + int ret = 0; - bss_res.start = __pa_symbol(__bss_start); - bss_res.end = __pa_symbol(__bss_stop) - 1; - bss_res.flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY; + /* + 1 as 
memblock_alloc() might increase memblock.reserved.cnt */ + num_resources = memblock.memory.cnt + memblock.reserved.cnt + 1; + res_idx = num_resources - 1; - mem_res_sz = (memblock.memory.cnt + memblock.reserved.cnt) * sizeof(*mem_res); + mem_res_sz = num_resources * sizeof(*mem_res); mem_res = memblock_alloc(mem_res_sz, SMP_CACHE_BYTES); if (!mem_res) panic("%s: Failed to allocate %zu bytes\n", __func__, mem_res_sz); + /* * Start by adding the reserved regions, if they overlap * with /memory regions, insert_resource later on will take * care of it. */ + ret = add_kernel_resources(); + if (ret < 0) + goto error; + for_each_reserved_mem_region(region) { - res = &mem_res[i++]; + res = &mem_res[res_idx--]; res->name = "Reserved"; res->flags = IORESOURCE_MEM | IORESOURCE_BUSY; res->start = __pfn_to_phys(memblock_region_reserved_base_pfn(region)); res->end = __pfn_to_phys(memblock_region_reserved_end_pfn(region)) - 1; - ret = add_kernel_resources(res); - if (ret < 0) - goto error; - else if (ret) - continue; - /* * Ignore any other reserved regions within * system memory. */ if (memblock_is_memory(res->start)) { - memblock_free((phys_addr_t) res, sizeof(struct resource)); + /* Re-use this pre-allocated resource */ + res_idx++; continue; } @@ -186,7 +185,7 @@ static void __init init_resources(void) /* Add /memory regions to the resource tree */ for_each_mem_region(region) { - res = &mem_res[i++]; + res = &mem_res[res_idx--]; if (unlikely(memblock_is_nomap(region))) { res->name = "Reserved"; @@ -204,6 +203,9 @@ static void __init init_resources(void) goto error; } + /* Clean-up any unused pre-allocated resources */ + mem_res_sz = (num_resources - res_idx + 1) * sizeof(*mem_res); + memblock_free((phys_addr_t) mem_res, mem_res_sz); return; error: -- 2.26.2 ^ permalink raw reply related [flat|nested] 53+ messages in thread
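The index bookkeeping this patch introduces (slots handed out from the end of the pre-allocated array, with a skipped slot handed back via res_idx++) can be modelled in a few lines of C. This is only a sketch of the accounting under that reading of the diff: `struct sketch_pool` and its helpers are hypothetical names, and nothing here calls memblock or reproduces the size expression the patch passes to memblock_free().

```c
#include <assert.h>

/* Model of the pre-allocated resource array's bookkeeping. */
struct sketch_pool {
	int num_resources; /* total pre-allocated slots */
	int res_idx;       /* next free slot, counting down from the end */
};

/* memory.cnt + reserved.cnt + 1, as in the patch. */
static void sketch_pool_init(struct sketch_pool *p, int mem_cnt, int reserved_cnt)
{
	p->num_resources = mem_cnt + reserved_cnt + 1;
	p->res_idx = p->num_resources - 1;
}

/* Equivalent of "res = &mem_res[res_idx--]": returns the slot index. */
static int sketch_pool_take(struct sketch_pool *p)
{
	return p->res_idx--;
}

/* Equivalent of "res_idx++": hand back the slot taken most recently. */
static void sketch_pool_reuse(struct sketch_pool *p)
{
	p->res_idx++;
}

/* Slots never handed out are indices 0..res_idx. */
static int sketch_pool_unused(const struct sketch_pool *p)
{
	return p->res_idx + 1;
}

/* 2 memory regions, 3 reserved regions, 2 of the reserved ones skipped. */
static int sketch_pool_demo(void)
{
	struct sketch_pool p;
	int i;

	sketch_pool_init(&p, 2, 3);

	for (i = 0; i < 3; i++) {
		sketch_pool_take(&p);
		if (i < 2) /* region lies within system memory: skip it */
			sketch_pool_reuse(&p);
	}
	for (i = 0; i < 2; i++)
		sketch_pool_take(&p);

	return sketch_pool_unused(&p);
}
```

With these made-up counts the array has 6 slots, 3 end up in use at indices 5, 4 and 3, and 3 remain unused at the start of the array, so the unused part is one contiguous range that can be released in a single call once both loops have finished.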
* Re: [PATCH v3 3/5] RISC-V: Improve init_resources 2021-04-05 8:57 ` Nick Kossifidis @ 2021-04-06 7:19 ` Geert Uytterhoeven -1 siblings, 0 replies; 53+ messages in thread From: Geert Uytterhoeven @ 2021-04-06 7:19 UTC (permalink / raw) To: Nick Kossifidis Cc: linux-riscv, Palmer Dabbelt, Paul Walmsley, Linux Kernel Mailing List Hi Nick, Thanks for your patch! On Mon, Apr 5, 2021 at 10:57 AM Nick Kossifidis <mick@ics.forth.gr> wrote: > * Kernel region is always present and we know where it is, no > need to look for it inside the loop, just ignore it like the > rest of the reserved regions within system's memory. > > * Don't call memblock_free inside the loop, if called it'll split > the region of pre-allocated resources in two parts, messing things > up, just re-use the previous pre-allocated resource and free any > unused resources after both loops finish. > > * memblock_alloc may add a region when called, so increase the > number of pre-allocated regions by one to be on the safe side > (reported and patched by Geert Uytterhoeven) > > Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Where does this SoB come from? 
> Signed-off-by: Nick Kossifidis <mick@ics.forth.gr> > --- a/arch/riscv/kernel/setup.c > +++ b/arch/riscv/kernel/setup.c > @@ -129,53 +139,42 @@ static void __init init_resources(void) > struct resource *res = NULL; > struct resource *mem_res = NULL; > size_t mem_res_sz = 0; > - int ret = 0, i = 0; > - > - code_res.start = __pa_symbol(_text); > - code_res.end = __pa_symbol(_etext) - 1; > - code_res.flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY; > - > - rodata_res.start = __pa_symbol(__start_rodata); > - rodata_res.end = __pa_symbol(__end_rodata) - 1; > - rodata_res.flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY; > - > - data_res.start = __pa_symbol(_data); > - data_res.end = __pa_symbol(_edata) - 1; > - data_res.flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY; > + int num_resources = 0, res_idx = 0; > + int ret = 0; > > - bss_res.start = __pa_symbol(__bss_start); > - bss_res.end = __pa_symbol(__bss_stop) - 1; > - bss_res.flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY; > + /* + 1 as memblock_alloc() might increase memblock.reserved.cnt */ > + num_resources = memblock.memory.cnt + memblock.reserved.cnt + 1; > + res_idx = num_resources - 1; > > - mem_res_sz = (memblock.memory.cnt + memblock.reserved.cnt) * sizeof(*mem_res); Oh, you incorporated my commit ce989f1472ae350e ("RISC-V: Fix out-of-bounds accesses in init_resources()") (from v5.12-rc4) into your patch. Why? This means your patch does not apply against upstream. > + mem_res_sz = num_resources * sizeof(*mem_res); > mem_res = memblock_alloc(mem_res_sz, SMP_CACHE_BYTES); > if (!mem_res) > panic("%s: Failed to allocate %zu bytes\n", __func__, mem_res_sz); Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v3 3/5] RISC-V: Improve init_resources 2021-04-06 7:19 ` Geert Uytterhoeven @ 2021-04-06 8:11 ` Nick Kossifidis -1 siblings, 0 replies; 53+ messages in thread From: Nick Kossifidis @ 2021-04-06 8:11 UTC (permalink / raw) To: Geert Uytterhoeven Cc: Nick Kossifidis, linux-riscv, Palmer Dabbelt, Paul Walmsley, Linux Kernel Mailing List Hello Geert, Στις 2021-04-06 10:19, Geert Uytterhoeven έγραψε: > Hi Nick, > > Thanks for your patch! > > On Mon, Apr 5, 2021 at 10:57 AM Nick Kossifidis <mick@ics.forth.gr> > wrote: >> * Kernel region is always present and we know where it is, no >> need to look for it inside the loop, just ignore it like the >> rest of the reserved regions within system's memory. >> >> * Don't call memblock_free inside the loop, if called it'll split >> the region of pre-allocated resources in two parts, messing things >> up, just re-use the previous pre-allocated resource and free any >> unused resources after both loops finish. >> >> * memblock_alloc may add a region when called, so increase the >> number of pre-allocated regions by one to be on the safe side >> (reported and patched by Geert Uytterhoeven) >> >> Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> > > Where does this SoB come from? 
> >> Signed-off-by: Nick Kossifidis <mick@ics.forth.gr> > >> --- a/arch/riscv/kernel/setup.c >> +++ b/arch/riscv/kernel/setup.c > >> @@ -129,53 +139,42 @@ static void __init init_resources(void) >> struct resource *res = NULL; >> struct resource *mem_res = NULL; >> size_t mem_res_sz = 0; >> - int ret = 0, i = 0; >> - >> - code_res.start = __pa_symbol(_text); >> - code_res.end = __pa_symbol(_etext) - 1; >> - code_res.flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY; >> - >> - rodata_res.start = __pa_symbol(__start_rodata); >> - rodata_res.end = __pa_symbol(__end_rodata) - 1; >> - rodata_res.flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY; >> - >> - data_res.start = __pa_symbol(_data); >> - data_res.end = __pa_symbol(_edata) - 1; >> - data_res.flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY; >> + int num_resources = 0, res_idx = 0; >> + int ret = 0; >> >> - bss_res.start = __pa_symbol(__bss_start); >> - bss_res.end = __pa_symbol(__bss_stop) - 1; >> - bss_res.flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY; >> + /* + 1 as memblock_alloc() might increase >> memblock.reserved.cnt */ >> + num_resources = memblock.memory.cnt + memblock.reserved.cnt + >> 1; >> + res_idx = num_resources - 1; >> >> - mem_res_sz = (memblock.memory.cnt + memblock.reserved.cnt) * >> sizeof(*mem_res); > > Oh, you incorporated my commit ce989f1472ae350e ("RISC-V: Fix > out-of-bounds > accesses in init_resources()") (from v5.12-rc4) into your patch. > Why? This means your patch does not apply against upstream. > Sorry if this looks awkward, I'm under the impression that new features go on for-next instead of fixes and your patch hasn't been merged on for-next yet. I thought it would be cleaner to have one patch to merge for init_resources instead of two, and simpler for people to test the series. I can rebase this on top of fixes if that works better for you or Palmer. Regards, Nick ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v3 3/5] RISC-V: Improve init_resources
  2021-04-06  8:11 ` Nick Kossifidis
@ 2021-04-06  8:22   ` Geert Uytterhoeven
From: Geert Uytterhoeven @ 2021-04-06 8:22 UTC
  To: Nick Kossifidis
  Cc: linux-riscv, Palmer Dabbelt, Paul Walmsley, Linux Kernel Mailing List

Hi Nick,

On Tue, Apr 6, 2021 at 10:11 AM Nick Kossifidis <mick@ics.forth.gr> wrote:
> Hello Geert,
> On 2021-04-06 10:19, Geert Uytterhoeven wrote:
> > On Mon, Apr 5, 2021 at 10:57 AM Nick Kossifidis <mick@ics.forth.gr> wrote:
> >> * Kernel region is always present and we know where it is, no
> >> need to look for it inside the loop, just ignore it like the
> >> rest of the reserved regions within system's memory.
> >>
> >> * Don't call memblock_free inside the loop, if called it'll split
> >> the region of pre-allocated resources in two parts, messing things
> >> up, just re-use the previous pre-allocated resource and free any
> >> unused resources after both loops finish.
> >>
> >> * memblock_alloc may add a region when called, so increase the
> >> number of pre-allocated regions by one to be on the safe side
> >> (reported and patched by Geert Uytterhoeven)
> >>
> >> Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
> >
> > Where does this SoB come from?
> >
> >> Signed-off-by: Nick Kossifidis <mick@ics.forth.gr>
> >
> >> --- a/arch/riscv/kernel/setup.c
> >> +++ b/arch/riscv/kernel/setup.c
> >
> >> @@ -129,53 +139,42 @@ static void __init init_resources(void)
> >>  	struct resource *res = NULL;
> >>  	struct resource *mem_res = NULL;
> >>  	size_t mem_res_sz = 0;
> >> -	int ret = 0, i = 0;
> >> -
> >> -	code_res.start = __pa_symbol(_text);
> >> -	code_res.end = __pa_symbol(_etext) - 1;
> >> -	code_res.flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
> >> -
> >> -	rodata_res.start = __pa_symbol(__start_rodata);
> >> -	rodata_res.end = __pa_symbol(__end_rodata) - 1;
> >> -	rodata_res.flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
> >> -
> >> -	data_res.start = __pa_symbol(_data);
> >> -	data_res.end = __pa_symbol(_edata) - 1;
> >> -	data_res.flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
> >> +	int num_resources = 0, res_idx = 0;
> >> +	int ret = 0;
> >>
> >> -	bss_res.start = __pa_symbol(__bss_start);
> >> -	bss_res.end = __pa_symbol(__bss_stop) - 1;
> >> -	bss_res.flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
> >> +	/* + 1 as memblock_alloc() might increase memblock.reserved.cnt */
> >> +	num_resources = memblock.memory.cnt + memblock.reserved.cnt + 1;
> >> +	res_idx = num_resources - 1;
> >>
> >> -	mem_res_sz = (memblock.memory.cnt + memblock.reserved.cnt) * sizeof(*mem_res);
> >
> > Oh, you incorporated my commit ce989f1472ae350e ("RISC-V: Fix out-of-bounds
> > accesses in init_resources()") (from v5.12-rc4) into your patch.
> > Why? This means your patch does not apply against upstream.
> >
>
> Sorry if this looks awkward, I'm under the impression that new features
> go on for-next instead of fixes and your patch hasn't been merged on
> for-next yet. I thought it would be cleaner to have one patch to merge
> for init_resources instead of two, and simpler for people to test the
> series. I can rebase this on top of fixes if that works better for you
> or Palmer.

Ideally the fixes branch is part of the next branch. That also helps
to avoid other people having to fix conflicts when merging both.

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
* Re: [PATCH v3 3/5] RISC-V: Improve init_resources
  2021-04-06  8:22 ` Geert Uytterhoeven
@ 2021-04-09 10:11   ` Nick Kossifidis
From: Nick Kossifidis @ 2021-04-09 10:11 UTC
  To: Geert Uytterhoeven
  Cc: Nick Kossifidis, linux-riscv, Palmer Dabbelt, Paul Walmsley, Linux Kernel Mailing List

On 2021-04-06 11:22, Geert Uytterhoeven wrote:
> Hi Nick,
>
> On Tue, Apr 6, 2021 at 10:11 AM Nick Kossifidis <mick@ics.forth.gr> wrote:
>> Hello Geert,
>> On 2021-04-06 10:19, Geert Uytterhoeven wrote:
>> > On Mon, Apr 5, 2021 at 10:57 AM Nick Kossifidis <mick@ics.forth.gr> wrote:
>> >> * Kernel region is always present and we know where it is, no
>> >> need to look for it inside the loop, just ignore it like the
>> >> rest of the reserved regions within system's memory.
>> >>
>> >> * Don't call memblock_free inside the loop, if called it'll split
>> >> the region of pre-allocated resources in two parts, messing things
>> >> up, just re-use the previous pre-allocated resource and free any
>> >> unused resources after both loops finish.
>> >>
>> >> * memblock_alloc may add a region when called, so increase the
>> >> number of pre-allocated regions by one to be on the safe side
>> >> (reported and patched by Geert Uytterhoeven)
>> >>
>> >> Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
>> >
>> > Where does this SoB come from?
>> >
>> >> Signed-off-by: Nick Kossifidis <mick@ics.forth.gr>
>> >
>> >> --- a/arch/riscv/kernel/setup.c
>> >> +++ b/arch/riscv/kernel/setup.c
>> >
>> >> @@ -129,53 +139,42 @@ static void __init init_resources(void)
>> >>  	struct resource *res = NULL;
>> >>  	struct resource *mem_res = NULL;
>> >>  	size_t mem_res_sz = 0;
>> >> -	int ret = 0, i = 0;
>> >> -
>> >> -	code_res.start = __pa_symbol(_text);
>> >> -	code_res.end = __pa_symbol(_etext) - 1;
>> >> -	code_res.flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
>> >> -
>> >> -	rodata_res.start = __pa_symbol(__start_rodata);
>> >> -	rodata_res.end = __pa_symbol(__end_rodata) - 1;
>> >> -	rodata_res.flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
>> >> -
>> >> -	data_res.start = __pa_symbol(_data);
>> >> -	data_res.end = __pa_symbol(_edata) - 1;
>> >> -	data_res.flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
>> >> +	int num_resources = 0, res_idx = 0;
>> >> +	int ret = 0;
>> >>
>> >> -	bss_res.start = __pa_symbol(__bss_start);
>> >> -	bss_res.end = __pa_symbol(__bss_stop) - 1;
>> >> -	bss_res.flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
>> >> +	/* + 1 as memblock_alloc() might increase memblock.reserved.cnt */
>> >> +	num_resources = memblock.memory.cnt + memblock.reserved.cnt + 1;
>> >> +	res_idx = num_resources - 1;
>> >>
>> >> -	mem_res_sz = (memblock.memory.cnt + memblock.reserved.cnt) * sizeof(*mem_res);
>> >
>> > Oh, you incorporated my commit ce989f1472ae350e ("RISC-V: Fix out-of-bounds
>> > accesses in init_resources()") (from v5.12-rc4) into your patch.
>> > Why? This means your patch does not apply against upstream.
>> >
>>
>> Sorry if this looks awkward, I'm under the impression that new features
>> go on for-next instead of fixes and your patch hasn't been merged on
>> for-next yet. I thought it would be cleaner to have one patch to merge
>> for init_resources instead of two, and simpler for people to test the
>> series. I can rebase this on top of fixes if that works better for you
>> or Palmer.
>
> Ideally the fixes branch is part of the next branch. That also helps
> to avoid other people having to fix conflicts when merging both.
>

OK I'll re-base this on top of fixes instead.
* Re: [PATCH v3 3/5] RISC-V: Improve init_resources
  2021-04-05  8:57 ` Nick Kossifidis
@ 2021-04-23  3:30   ` Palmer Dabbelt
From: Palmer Dabbelt @ 2021-04-23 3:30 UTC
  To: mick; +Cc: linux-riscv, Paul Walmsley, linux-kernel, mick, geert

On Mon, 05 Apr 2021 01:57:10 PDT (-0700), mick@ics.forth.gr wrote:
> * Kernel region is always present and we know where it is, no
> need to look for it inside the loop, just ignore it like the
> rest of the reserved regions within system's memory.
>
> * Don't call memblock_free inside the loop, if called it'll split
> the region of pre-allocated resources in two parts, messing things
> up, just re-use the previous pre-allocated resource and free any
> unused resources after both loops finish.
>
> * memblock_alloc may add a region when called, so increase the
> number of pre-allocated regions by one to be on the safe side
> (reported and patched by Geert Uytterhoeven)

IIUC this one has already been fixed on for-next. Either way, it caused a
merge conflict. I think I've fixed it up, LMK if something went wrong.

Also: I cleaned up the commit text a bit, as this is an odd way to do it.
It's probably best to just have split this into two commits.

>
> Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
> Signed-off-by: Nick Kossifidis <mick@ics.forth.gr>
> ---
>  arch/riscv/kernel/setup.c | 90 ++++++++++++++++++++-------------------
>  1 file changed, 46 insertions(+), 44 deletions(-)
>
> diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
> index e85bacff1..030554bab 100644
> --- a/arch/riscv/kernel/setup.c
> +++ b/arch/riscv/kernel/setup.c
> @@ -60,6 +60,7 @@ static DEFINE_PER_CPU(struct cpu, cpu_devices);
>   * also add "System RAM" regions for compatibility with other
>   * archs, and the rest of the known regions for completeness.
>   */
> +static struct resource kimage_res = { .name = "Kernel image", };
>  static struct resource code_res = { .name = "Kernel code", };
>  static struct resource data_res = { .name = "Kernel data", };
>  static struct resource rodata_res = { .name = "Kernel rodata", };
> @@ -80,45 +81,54 @@ static int __init add_resource(struct resource *parent,
>  	return 1;
>  }
>
> -static int __init add_kernel_resources(struct resource *res)
> +static int __init add_kernel_resources(void)
>  {
>  	int ret = 0;
>
>  	/*
>  	 * The memory region of the kernel image is continuous and
> -	 * was reserved on setup_bootmem, find it here and register
> -	 * it as a resource, then register the various segments of
> -	 * the image as child nodes
> +	 * was reserved on setup_bootmem, register it here as a
> +	 * resource, with the various segments of the image as
> +	 * child nodes.
>  	 */
> -	if (!(res->start <= code_res.start && res->end >= data_res.end))
> -		return 0;
>
> -	res->name = "Kernel image";
> -	res->flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
> +	code_res.start = __pa_symbol(_text);
> +	code_res.end = __pa_symbol(_etext) - 1;
> +	code_res.flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
>
> -	/*
> -	 * We removed a part of this region on setup_bootmem so
> -	 * we need to expand the resource for the bss to fit in.
> -	 */
> -	res->end = bss_res.end;
> +	rodata_res.start = __pa_symbol(__start_rodata);
> +	rodata_res.end = __pa_symbol(__end_rodata) - 1;
> +	rodata_res.flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
>
> -	ret = add_resource(&iomem_resource, res);
> +	data_res.start = __pa_symbol(_data);
> +	data_res.end = __pa_symbol(_edata) - 1;
> +	data_res.flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
> +
> +	bss_res.start = __pa_symbol(__bss_start);
> +	bss_res.end = __pa_symbol(__bss_stop) - 1;
> +	bss_res.flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
> +
> +	kimage_res.start = code_res.start;
> +	kimage_res.end = bss_res.end;
> +	kimage_res.flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
> +
> +	ret = add_resource(&iomem_resource, &kimage_res);
>  	if (ret < 0)
>  		return ret;
>
> -	ret = add_resource(res, &code_res);
> +	ret = add_resource(&kimage_res, &code_res);
>  	if (ret < 0)
>  		return ret;
>
> -	ret = add_resource(res, &rodata_res);
> +	ret = add_resource(&kimage_res, &rodata_res);
>  	if (ret < 0)
>  		return ret;
>
> -	ret = add_resource(res, &data_res);
> +	ret = add_resource(&kimage_res, &data_res);
>  	if (ret < 0)
>  		return ret;
>
> -	ret = add_resource(res, &bss_res);
> +	ret = add_resource(&kimage_res, &bss_res);
>
>  	return ret;
>  }
> @@ -129,53 +139,42 @@ static void __init init_resources(void)
>  	struct resource *res = NULL;
>  	struct resource *mem_res = NULL;
>  	size_t mem_res_sz = 0;
> -	int ret = 0, i = 0;
> -
> -	code_res.start = __pa_symbol(_text);
> -	code_res.end = __pa_symbol(_etext) - 1;
> -	code_res.flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
> -
> -	rodata_res.start = __pa_symbol(__start_rodata);
> -	rodata_res.end = __pa_symbol(__end_rodata) - 1;
> -	rodata_res.flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
> -
> -	data_res.start = __pa_symbol(_data);
> -	data_res.end = __pa_symbol(_edata) - 1;
> -	data_res.flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
> +	int num_resources = 0, res_idx = 0;
> +	int ret = 0;
>
> -	bss_res.start = __pa_symbol(__bss_start);
> -	bss_res.end = __pa_symbol(__bss_stop) - 1;
> -	bss_res.flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
> +	/* + 1 as memblock_alloc() might increase memblock.reserved.cnt */
> +	num_resources = memblock.memory.cnt + memblock.reserved.cnt + 1;
> +	res_idx = num_resources - 1;
>
> -	mem_res_sz = (memblock.memory.cnt + memblock.reserved.cnt) * sizeof(*mem_res);
> +	mem_res_sz = num_resources * sizeof(*mem_res);
>  	mem_res = memblock_alloc(mem_res_sz, SMP_CACHE_BYTES);
>  	if (!mem_res)
>  		panic("%s: Failed to allocate %zu bytes\n", __func__, mem_res_sz);
> +
>  	/*
>  	 * Start by adding the reserved regions, if they overlap
>  	 * with /memory regions, insert_resource later on will take
>  	 * care of it.
>  	 */
> +	ret = add_kernel_resources();
> +	if (ret < 0)
> +		goto error;
> +
>  	for_each_reserved_mem_region(region) {
> -		res = &mem_res[i++];
> +		res = &mem_res[res_idx--];
>
>  		res->name = "Reserved";
>  		res->flags = IORESOURCE_MEM | IORESOURCE_BUSY;
>  		res->start = __pfn_to_phys(memblock_region_reserved_base_pfn(region));
>  		res->end = __pfn_to_phys(memblock_region_reserved_end_pfn(region)) - 1;
>
> -		ret = add_kernel_resources(res);
> -		if (ret < 0)
> -			goto error;
> -		else if (ret)
> -			continue;
> -
>  		/*
>  		 * Ignore any other reserved regions within
>  		 * system memory.
>  		 */
>  		if (memblock_is_memory(res->start)) {
> -			memblock_free((phys_addr_t) res, sizeof(struct resource));
> +			/* Re-use this pre-allocated resource */
> +			res_idx++;
>  			continue;
>  		}
>
> @@ -186,7 +185,7 @@ static void __init init_resources(void)
>
>  	/* Add /memory regions to the resource tree */
>  	for_each_mem_region(region) {
> -		res = &mem_res[i++];
> +		res = &mem_res[res_idx--];
>
>  		if (unlikely(memblock_is_nomap(region))) {
>  			res->name = "Reserved";
> @@ -204,6 +203,9 @@ static void __init init_resources(void)
>  			goto error;
>  	}
>
> +	/* Clean-up any unused pre-allocated resources */
> +	mem_res_sz = (num_resources - res_idx + 1) * sizeof(*mem_res);
> +	memblock_free((phys_addr_t) mem_res, mem_res_sz);
>  	return;
>
>  error:
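For anyone following the slot bookkeeping in the init_resources() hunk, here is a minimal userspace sketch of the "fill from the end, hand the slot back when a region is skipped" pattern. It is only an illustration under stated assumptions: struct slot, struct region, fill_slots(), used_slots() and unused_slots() are hypothetical names invented for this note, not kernel types or functions, and the regions are plain arrays rather than memblock lists.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct resource; not the kernel type. */
struct slot {
	const char *name;
	unsigned long start, end;
};

/* Hypothetical region descriptor, used only by this sketch. */
struct region {
	unsigned long start, end;
	int skip;	/* non-zero: region is ignored and its slot re-used */
};

/*
 * Fill slots from the end of the array downwards, the way res_idx is
 * used in the hunk. Returns the index of the next free slot, so the
 * slots at indices 0..ret are the ones that were never consumed.
 */
static int fill_slots(struct slot *slots, int num_slots,
		      const struct region *regions, int num_regions)
{
	int idx = num_slots - 1;
	int i;

	for (i = 0; i < num_regions; i++) {
		struct slot *s = &slots[idx--];

		s->name = "Reserved";
		s->start = regions[i].start;
		s->end = regions[i].end;

		if (regions[i].skip) {
			/* Hand the slot back instead of freeing it */
			idx++;
			continue;
		}

		/* A real caller would register the slot here. */
	}

	return idx;
}

/* Number of slots consumed at the top of the array. */
static int used_slots(int num_slots, int idx)
{
	return num_slots - 1 - idx;
}

/* Number of leading slots (indices 0..idx) that were never consumed. */
static int unused_slots(int idx)
{
	return idx + 1;
}
```

With three regions, one of them skipped, and four pre-allocated slots (one spare, mirroring the "+ 1" in the hunk), fill_slots() returns 1: slots 3 and 2 are in use and slots 0 and 1 are left over. In this sketch the leftover count is idx + 1, which may be worth comparing against the size passed to memblock_free() at the end of the quoted hunk.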
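The add_kernel_resources() rework registers one "Kernel image" parent that spans from the start of the code segment to the end of bss, with the segments as children. A rough sketch of that containment relationship, assuming inclusive end addresses as in the patch (end symbol minus one): struct range, range_make(), range_span() and range_contains() are hypothetical helpers written for this note, not kernel APIs, and any addresses fed to them are made-up examples.

```c
#include <stddef.h>

/* Hypothetical address range with an inclusive end, like struct resource. */
struct range {
	const char *name;
	unsigned long start;
	unsigned long end;	/* inclusive */
};

/* Build a range from a start address and an exclusive end address. */
static struct range range_make(const char *name, unsigned long start,
			       unsigned long end_exclusive)
{
	struct range r = { name, start, end_exclusive - 1 };

	return r;
}

/* Parent range covering everything from first->start to last->end. */
static struct range range_span(const char *name, const struct range *first,
			       const struct range *last)
{
	struct range r = { name, first->start, last->end };

	return r;
}

/* Non-zero when child lies entirely within parent. */
static int range_contains(const struct range *parent,
			  const struct range *child)
{
	return parent->start <= child->start && child->end <= parent->end;
}
```

For example, a code range made with range_make("Kernel code", 0x80200000UL, 0x80600000UL) and a bss range made with range_make("Kernel bss", 0x80a00000UL, 0x80a40000UL) give a span of 0x80200000 to 0x80a3ffff, and range_contains() holds for both segments. Building the parent from known symbols this way is what removes the need to search the reserved regions for it.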
* [PATCH v3 4/5] RISC-V: Add kdump support
  2021-04-05  8:57 ` Nick Kossifidis
@ 2021-04-05  8:57   ` Nick Kossifidis
From: Nick Kossifidis @ 2021-04-05 8:57 UTC
  To: linux-riscv, palmer; +Cc: paul.walmsley, linux-kernel, Nick Kossifidis

This patch adds support for kdump: the kernel will reserve a region
for the crash kernel and jump there on panic. In order for userspace
tools (kexec-tools) to prepare the crash kernel kexec image, we also
need to expose some information on /proc/iomem for the memory regions
used by the kernel and for the region reserved for the crash kernel.
Note that in userspace the device tree is used to determine the
system's memory layout, so the "System RAM" entries in /proc/iomem are
ignored.

I tested this on riscv64 qemu and it works as expected. You may test it
by triggering a crash through /proc/sysrq-trigger:

echo c > /proc/sysrq-trigger

v3:
 * Move ELF_CORE_COPY_REGS to asm/elf.h instead of uapi/asm/elf.h
 * Set stvec when disabling MMU
 * Minor cleanups and re-base

v2:
 * Properly populate the ioresources tree, so that it can be used
   later on for implementing strict /dev/mem.
* Minor cleanups and re-base Signed-off-by: Nick Kossifidis <mick@ics.forth.gr> --- arch/riscv/include/asm/elf.h | 6 +++ arch/riscv/include/asm/kexec.h | 19 ++++--- arch/riscv/kernel/Makefile | 2 +- arch/riscv/kernel/crash_save_regs.S | 56 +++++++++++++++++++++ arch/riscv/kernel/kexec_relocate.S | 68 ++++++++++++++++++++++++- arch/riscv/kernel/machine_kexec.c | 43 +++++++++------- arch/riscv/kernel/setup.c | 11 ++++- arch/riscv/mm/init.c | 77 +++++++++++++++++++++++++++++ 8 files changed, 255 insertions(+), 27 deletions(-) create mode 100644 arch/riscv/kernel/crash_save_regs.S diff --git a/arch/riscv/include/asm/elf.h b/arch/riscv/include/asm/elf.h index 5c725e1df..f4b490cd0 100644 --- a/arch/riscv/include/asm/elf.h +++ b/arch/riscv/include/asm/elf.h @@ -81,4 +81,10 @@ extern int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp); #endif /* CONFIG_MMU */ +#define ELF_CORE_COPY_REGS(dest, regs) \ +do { \ + *(struct user_regs_struct *)&(dest) = \ + *(struct user_regs_struct *)regs; \ +} while (0); + #endif /* _ASM_RISCV_ELF_H */ diff --git a/arch/riscv/include/asm/kexec.h b/arch/riscv/include/asm/kexec.h index efc69feb4..4fd583acc 100644 --- a/arch/riscv/include/asm/kexec.h +++ b/arch/riscv/include/asm/kexec.h @@ -21,11 +21,16 @@ #define KEXEC_ARCH KEXEC_ARCH_RISCV +extern void riscv_crash_save_regs(struct pt_regs *newregs); + static inline void crash_setup_regs(struct pt_regs *newregs, struct pt_regs *oldregs) { - /* Dummy implementation for now */ + if (oldregs) + memcpy(newregs, oldregs, sizeof(struct pt_regs)); + else + riscv_crash_save_regs(newregs); } @@ -38,10 +43,12 @@ struct kimage_arch { const extern unsigned char riscv_kexec_relocate[]; const extern unsigned int riscv_kexec_relocate_size; -typedef void (*riscv_kexec_do_relocate)(unsigned long first_ind_entry, - unsigned long jump_addr, - unsigned long fdt_addr, - unsigned long hartid, - unsigned long va_pa_off); +typedef void (*riscv_kexec_method)(unsigned long first_ind_entry, + 
unsigned long jump_addr, + unsigned long fdt_addr, + unsigned long hartid, + unsigned long va_pa_off); + +extern riscv_kexec_method riscv_kexec_norelocate; #endif diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile index c2594018c..07f676ad3 100644 --- a/arch/riscv/kernel/Makefile +++ b/arch/riscv/kernel/Makefile @@ -58,7 +58,7 @@ obj-$(CONFIG_SMP) += cpu_ops_sbi.o endif obj-$(CONFIG_HOTPLUG_CPU) += cpu-hotplug.o obj-$(CONFIG_KGDB) += kgdb.o -obj-${CONFIG_KEXEC} += kexec_relocate.o machine_kexec.o +obj-${CONFIG_KEXEC} += kexec_relocate.o crash_save_regs.o machine_kexec.o obj-$(CONFIG_JUMP_LABEL) += jump_label.o diff --git a/arch/riscv/kernel/crash_save_regs.S b/arch/riscv/kernel/crash_save_regs.S new file mode 100644 index 000000000..7832fb763 --- /dev/null +++ b/arch/riscv/kernel/crash_save_regs.S @@ -0,0 +1,56 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (C) 2020 FORTH-ICS/CARV + * Nick Kossifidis <mick@ics.forth.gr> + */ + +#include <asm/asm.h> /* For RISCV_* and REG_* macros */ +#include <asm/csr.h> /* For CSR_* macros */ +#include <asm/asm-offsets.h> /* For offsets on pt_regs */ +#include <linux/linkage.h> /* For SYM_* macros */ + +.section ".text" +SYM_CODE_START(riscv_crash_save_regs) + REG_S ra, PT_RA(a0) /* x1 */ + REG_S sp, PT_SP(a0) /* x2 */ + REG_S gp, PT_GP(a0) /* x3 */ + REG_S tp, PT_TP(a0) /* x4 */ + REG_S t0, PT_T0(a0) /* x5 */ + REG_S t1, PT_T1(a0) /* x6 */ + REG_S t2, PT_T2(a0) /* x7 */ + REG_S s0, PT_S0(a0) /* x8/fp */ + REG_S s1, PT_S1(a0) /* x9 */ + REG_S a0, PT_A0(a0) /* x10 */ + REG_S a1, PT_A1(a0) /* x11 */ + REG_S a2, PT_A2(a0) /* x12 */ + REG_S a3, PT_A3(a0) /* x13 */ + REG_S a4, PT_A4(a0) /* x14 */ + REG_S a5, PT_A5(a0) /* x15 */ + REG_S a6, PT_A6(a0) /* x16 */ + REG_S a7, PT_A7(a0) /* x17 */ + REG_S s2, PT_S2(a0) /* x18 */ + REG_S s3, PT_S3(a0) /* x19 */ + REG_S s4, PT_S4(a0) /* x20 */ + REG_S s5, PT_S5(a0) /* x21 */ + REG_S s6, PT_S6(a0) /* x22 */ + REG_S s7, PT_S7(a0) /* x23 */ + REG_S s8, PT_S8(a0) /* 
x24 */ + REG_S s9, PT_S9(a0) /* x25 */ + REG_S s10, PT_S10(a0) /* x26 */ + REG_S s11, PT_S11(a0) /* x27 */ + REG_S t3, PT_T3(a0) /* x28 */ + REG_S t4, PT_T4(a0) /* x29 */ + REG_S t5, PT_T5(a0) /* x30 */ + REG_S t6, PT_T6(a0) /* x31 */ + + csrr t1, CSR_STATUS + csrr t2, CSR_EPC + csrr t3, CSR_TVAL + csrr t4, CSR_CAUSE + + REG_S t1, PT_STATUS(a0) + REG_S t2, PT_EPC(a0) + REG_S t3, PT_BADADDR(a0) + REG_S t4, PT_CAUSE(a0) + ret +SYM_CODE_END(riscv_crash_save_regs) diff --git a/arch/riscv/kernel/kexec_relocate.S b/arch/riscv/kernel/kexec_relocate.S index 616c20771..14220f70f 100644 --- a/arch/riscv/kernel/kexec_relocate.S +++ b/arch/riscv/kernel/kexec_relocate.S @@ -150,7 +150,73 @@ SYM_CODE_START(riscv_kexec_relocate) SYM_CODE_END(riscv_kexec_relocate) riscv_kexec_relocate_end: - .section ".rodata" + +/* Used for jumping to crashkernel */ +.section ".text" +SYM_CODE_START(riscv_kexec_norelocate) + /* + * s0: (const) Phys address to jump to + * s1: (const) Phys address of the FDT image + * s2: (const) The hartid of the current hart + * s3: (const) va_pa_offset, used when switching MMU off + */ + mv s0, a1 + mv s1, a2 + mv s2, a3 + mv s3, a4 + + /* Disable / cleanup interrupts */ + csrw sie, zero + csrw sip, zero + + /* Switch to physical addressing */ + la s4, 1f + sub s4, s4, s3 + csrw stvec, s4 + csrw sptbr, zero + +.align 2 +1: + /* Pass the arguments to the next kernel / Cleanup*/ + mv a0, s2 + mv a1, s1 + mv a2, s0 + + /* Cleanup */ + mv a3, zero + mv a4, zero + mv a5, zero + mv a6, zero + mv a7, zero + + mv s0, zero + mv s1, zero + mv s2, zero + mv s3, zero + mv s4, zero + mv s5, zero + mv s6, zero + mv s7, zero + mv s8, zero + mv s9, zero + mv s10, zero + mv s11, zero + + mv t0, zero + mv t1, zero + mv t2, zero + mv t3, zero + mv t4, zero + mv t5, zero + mv t6, zero + csrw sepc, zero + csrw scause, zero + csrw sscratch, zero + + jalr zero, a2, 0 +SYM_CODE_END(riscv_kexec_norelocate) + +.section ".rodata" SYM_DATA(riscv_kexec_relocate_size, .long 
riscv_kexec_relocate_end - riscv_kexec_relocate) diff --git a/arch/riscv/kernel/machine_kexec.c b/arch/riscv/kernel/machine_kexec.c index 2ce6c3daf..e0596c0ac 100644 --- a/arch/riscv/kernel/machine_kexec.c +++ b/arch/riscv/kernel/machine_kexec.c @@ -59,11 +59,6 @@ machine_kexec_prepare(struct kimage *image) kexec_image_info(image); - if (image->type == KEXEC_TYPE_CRASH) { - pr_warn("Loading a crash kernel is unsupported for now.\n"); - return -EINVAL; - } - /* Find the Flattened Device Tree and save its physical address */ for (i = 0; i < image->nr_segments; i++) { if (image->segment[i].memsz <= sizeof(fdt)) @@ -85,17 +80,21 @@ machine_kexec_prepare(struct kimage *image) } /* Copy the assembler code for relocation to the control page */ - control_code_buffer = page_address(image->control_code_page); - control_code_buffer_sz = page_size(image->control_code_page); - if (unlikely(riscv_kexec_relocate_size > control_code_buffer_sz)) { - pr_err("Relocation code doesn't fit within a control page\n"); - return -EINVAL; - } - memcpy(control_code_buffer, riscv_kexec_relocate, - riscv_kexec_relocate_size); + if (image->type != KEXEC_TYPE_CRASH) { + control_code_buffer = page_address(image->control_code_page); + control_code_buffer_sz = page_size(image->control_code_page); - /* Mark the control page executable */ - set_memory_x((unsigned long) control_code_buffer, 1); + if (unlikely(riscv_kexec_relocate_size > control_code_buffer_sz)) { + pr_err("Relocation code doesn't fit within a control page\n"); + return -EINVAL; + } + + memcpy(control_code_buffer, riscv_kexec_relocate, + riscv_kexec_relocate_size); + + /* Mark the control page executable */ + set_memory_x((unsigned long) control_code_buffer, 1); + } return 0; } @@ -147,6 +146,9 @@ void machine_shutdown(void) void machine_crash_shutdown(struct pt_regs *regs) { + crash_save_cpu(regs, smp_processor_id()); + machine_shutdown(); + pr_info("Starting crashdump kernel...\n"); } /** @@ -169,7 +171,12 @@ machine_kexec(struct 
kimage *image) unsigned long this_hart_id = raw_smp_processor_id(); unsigned long fdt_addr = internal->fdt_addr; void *control_code_buffer = page_address(image->control_code_page); - riscv_kexec_do_relocate do_relocate = control_code_buffer; + riscv_kexec_method kexec_method = NULL; + + if (image->type != KEXEC_TYPE_CRASH) + kexec_method = control_code_buffer; + else + kexec_method = (riscv_kexec_method) &riscv_kexec_norelocate; pr_notice("Will call new kernel at %08lx from hart id %lx\n", jump_addr, this_hart_id); @@ -180,7 +187,7 @@ machine_kexec(struct kimage *image) /* Jump to the relocation code */ pr_notice("Bye...\n"); - do_relocate(first_ind_entry, jump_addr, fdt_addr, - this_hart_id, va_pa_offset); + kexec_method(first_ind_entry, jump_addr, fdt_addr, + this_hart_id, va_pa_offset); unreachable(); } diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c index 030554bab..31866dce9 100644 --- a/arch/riscv/kernel/setup.c +++ b/arch/riscv/kernel/setup.c @@ -20,6 +20,7 @@ #include <linux/swiotlb.h> #include <linux/smp.h> #include <linux/efi.h> +#include <linux/crash_dump.h> #include <asm/cpu_ops.h> #include <asm/early_ioremap.h> @@ -160,6 +161,14 @@ static void __init init_resources(void) if (ret < 0) goto error; +#ifdef CONFIG_KEXEC_CORE + if (crashk_res.start != crashk_res.end) { + ret = add_resource(&iomem_resource, &crashk_res); + if (ret < 0) + goto error; + } +#endif + for_each_reserved_mem_region(region) { res = &mem_res[res_idx--]; @@ -252,7 +261,6 @@ void __init setup_arch(char **cmdline_p) efi_init(); setup_bootmem(); paging_init(); - init_resources(); #if IS_ENABLED(CONFIG_BUILTIN_DTB) unflatten_and_copy_device_tree(); #else @@ -263,6 +271,7 @@ void __init setup_arch(char **cmdline_p) #endif misc_mem_init(); + init_resources(); sbi_init(); if (IS_ENABLED(CONFIG_STRICT_KERNEL_RWX)) diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c index 7f5036fbe..e71b35cec 100644 --- a/arch/riscv/mm/init.c +++ b/arch/riscv/mm/init.c @@ -2,6 +2,8 @@ 
 /*
  * Copyright (C) 2012 Regents of the University of California
  * Copyright (C) 2019 Western Digital Corporation or its affiliates.
+ * Copyright (C) 2020 FORTH-ICS/CARV
+ *  Nick Kossifidis <mick@ics.forth.gr>
  */

 #include <linux/init.h>
@@ -14,6 +16,7 @@
 #include <linux/libfdt.h>
 #include <linux/set_memory.h>
 #include <linux/dma-map-ops.h>
+#include <linux/crash_dump.h>

 #include <asm/fixmap.h>
 #include <asm/tlbflush.h>
@@ -586,6 +589,77 @@ void mark_rodata_ro(void)
 }
 #endif

+#ifdef CONFIG_KEXEC_CORE
+/*
+ * reserve_crashkernel() - reserves memory for crash kernel
+ *
+ * This function reserves memory area given in "crashkernel=" kernel command
+ * line parameter. The memory reserved is used by dump capture kernel when
+ * primary kernel is crashing.
+ */
+static void __init reserve_crashkernel(void)
+{
+	unsigned long long crash_base = 0;
+	unsigned long long crash_size = 0;
+	unsigned long search_start = memblock_start_of_DRAM();
+	unsigned long search_end = memblock_end_of_DRAM();
+
+	int ret = 0;
+
+	ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
+				&crash_size, &crash_base);
+	if (ret || !crash_size)
+		return;
+
+	crash_size = PAGE_ALIGN(crash_size);
+
+	if (crash_base == 0) {
+		/*
+		 * Current riscv boot protocol requires 2MB alignment for
+		 * RV64 and 4MB alignment for RV32 (hugepage size)
+		 */
+		crash_base = memblock_find_in_range(search_start, search_end,
+#ifdef CONFIG_64BIT
+						    crash_size, SZ_2M);
+#else
+						    crash_size, SZ_4M);
+#endif
+		if (crash_base == 0) {
+			pr_warn("crashkernel: couldn't allocate %lldKB\n",
+				crash_size >> 10);
+			return;
+		}
+	} else {
+		/* User specifies base address explicitly. */
+		if (!memblock_is_region_memory(crash_base, crash_size)) {
+			pr_warn("crashkernel: requested region is not memory\n");
+			return;
+		}
+
+		if (memblock_is_region_reserved(crash_base, crash_size)) {
+			pr_warn("crashkernel: requested region is reserved\n");
+			return;
+		}
+
+#ifdef CONFIG_64BIT
+		if (!IS_ALIGNED(crash_base, SZ_2M)) {
+#else
+		if (!IS_ALIGNED(crash_base, SZ_4M)) {
+#endif
+			pr_warn("crashkernel: requested region is misaligned\n");
+			return;
+		}
+	}
+	memblock_reserve(crash_base, crash_size);
+
+	pr_info("crashkernel: reserved 0x%016llx - 0x%016llx (%lld MB)\n",
+		crash_base, crash_base + crash_size, crash_size >> 20);
+
+	crashk_res.start = crash_base;
+	crashk_res.end = crash_base + crash_size - 1;
+}
+#endif /* CONFIG_KEXEC_CORE */
+
 void __init paging_init(void)
 {
 	setup_vm_final();
@@ -598,6 +672,9 @@ void __init misc_mem_init(void)
 	arch_numa_init();
 	sparse_init();
 	zone_sizes_init();
+#ifdef CONFIG_KEXEC_CORE
+	reserve_crashkernel();
+#endif
 	memblock_dump_all();
 }

-- 
2.26.2
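As a reading aid for the explicit-base path of reserve_crashkernel() above, here is a minimal userspace sketch of three pieces of its arithmetic: rounding the size up to a page, rejecting a misaligned base, and recording the region with an inclusive end address. All `sketch_*` names and the 4 KiB / 2 MiB constants are assumptions made for illustration; they are not kernel APIs, and the memblock checks the patch performs are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative constants (assumed values, not taken from kernel headers). */
#define SKETCH_PAGE_SIZE 4096ULL
#define SKETCH_SZ_2M     (2ULL << 20)

/* Inclusive [start, end] range, mirroring how crashk_res is filled in. */
struct sketch_range {
	uint64_t start;
	uint64_t end;
};

/* Round size up to a whole number of pages, as PAGE_ALIGN() does. */
static uint64_t sketch_page_align(uint64_t size)
{
	return (size + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
}

/* True when base is a multiple of align; align must be a power of two. */
static bool sketch_is_aligned(uint64_t base, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (base & (align - 1)) == 0;
}

/*
 * Validate an explicitly requested base and record the region.
 * Returns false and leaves *out untouched when size is zero or the
 * base is misaligned.
 */
static bool sketch_reserve(uint64_t base, uint64_t size, uint64_t align,
			   struct sketch_range *out)
{
	if (size == 0 || !sketch_is_aligned(base, align))
		return false;

	size = sketch_page_align(size);
	out->start = base;
	out->end = base + size - 1;	/* inclusive end */
	return true;
}

/* Same test init_resources() applies to crashk_res: start != end. */
static bool sketch_range_is_set(const struct sketch_range *r)
{
	return r->start != r->end;
}
```

Because the end address is inclusive, a range left at its zero-initialised state (start == end == 0) is what the `start != end` test treats as "nothing reserved".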
* Re: [PATCH v3 4/5] RISC-V: Add kdump support
From: Alex Ghiti @ 2021-04-06 18:36 UTC
To: Nick Kossifidis, linux-riscv, palmer; +Cc: paul.walmsley, linux-kernel

Hi Nick,

On 4/5/21 at 4:57 AM, Nick Kossifidis wrote:
> This patch adds support for kdump, the kernel will reserve a
> region for the crash kernel and jump there on panic. In order
> for userspace tools (kexec-tools) to prepare the crash kernel
> kexec image, we also need to expose some information on
> /proc/iomem for the memory regions used by the kernel and for
> the region reserved for crash kernel. Note that on userspace
> the device tree is used to determine the system's memory
> layout so the "System RAM" on /proc/iomem is ignored.
>
> I tested this on riscv64 qemu and works as expected, you may
> test it by triggering a crash through /proc/sysrq-trigger:
>
> echo c > /proc/sysrq-trigger
>
> v3:
> * Move ELF_CORE_COPY_REGS to asm/elf.h instead of uapi/asm/elf.h
> * Set stvec when disabling MMU
> * Minor cleanups and re-base
>
> v2:
> * Properly populate the ioresources tree, so that it can be
>   used later on for implementing strict /dev/mem.
> * Minor cleanups and re-base > > Signed-off-by: Nick Kossifidis <mick@ics.forth.gr> > --- > arch/riscv/include/asm/elf.h | 6 +++ > arch/riscv/include/asm/kexec.h | 19 ++++--- > arch/riscv/kernel/Makefile | 2 +- > arch/riscv/kernel/crash_save_regs.S | 56 +++++++++++++++++++++ > arch/riscv/kernel/kexec_relocate.S | 68 ++++++++++++++++++++++++- > arch/riscv/kernel/machine_kexec.c | 43 +++++++++------- > arch/riscv/kernel/setup.c | 11 ++++- > arch/riscv/mm/init.c | 77 +++++++++++++++++++++++++++++ > 8 files changed, 255 insertions(+), 27 deletions(-) > create mode 100644 arch/riscv/kernel/crash_save_regs.S > > diff --git a/arch/riscv/include/asm/elf.h b/arch/riscv/include/asm/elf.h > index 5c725e1df..f4b490cd0 100644 > --- a/arch/riscv/include/asm/elf.h > +++ b/arch/riscv/include/asm/elf.h > @@ -81,4 +81,10 @@ extern int arch_setup_additional_pages(struct linux_binprm *bprm, > int uses_interp); > #endif /* CONFIG_MMU */ > > +#define ELF_CORE_COPY_REGS(dest, regs) \ > +do { \ > + *(struct user_regs_struct *)&(dest) = \ > + *(struct user_regs_struct *)regs; \ > +} while (0); > + > #endif /* _ASM_RISCV_ELF_H */ > diff --git a/arch/riscv/include/asm/kexec.h b/arch/riscv/include/asm/kexec.h > index efc69feb4..4fd583acc 100644 > --- a/arch/riscv/include/asm/kexec.h > +++ b/arch/riscv/include/asm/kexec.h > @@ -21,11 +21,16 @@ > > #define KEXEC_ARCH KEXEC_ARCH_RISCV > > +extern void riscv_crash_save_regs(struct pt_regs *newregs); > + > static inline void > crash_setup_regs(struct pt_regs *newregs, > struct pt_regs *oldregs) > { > - /* Dummy implementation for now */ > + if (oldregs) > + memcpy(newregs, oldregs, sizeof(struct pt_regs)); > + else > + riscv_crash_save_regs(newregs); > } > > > @@ -38,10 +43,12 @@ struct kimage_arch { > const extern unsigned char riscv_kexec_relocate[]; > const extern unsigned int riscv_kexec_relocate_size; > > -typedef void (*riscv_kexec_do_relocate)(unsigned long first_ind_entry, > - unsigned long jump_addr, > - unsigned long fdt_addr, > - 
unsigned long hartid, > - unsigned long va_pa_off); > +typedef void (*riscv_kexec_method)(unsigned long first_ind_entry, > + unsigned long jump_addr, > + unsigned long fdt_addr, > + unsigned long hartid, > + unsigned long va_pa_off); > + > +extern riscv_kexec_method riscv_kexec_norelocate; > > #endif > diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile > index c2594018c..07f676ad3 100644 > --- a/arch/riscv/kernel/Makefile > +++ b/arch/riscv/kernel/Makefile > @@ -58,7 +58,7 @@ obj-$(CONFIG_SMP) += cpu_ops_sbi.o > endif > obj-$(CONFIG_HOTPLUG_CPU) += cpu-hotplug.o > obj-$(CONFIG_KGDB) += kgdb.o > -obj-${CONFIG_KEXEC} += kexec_relocate.o machine_kexec.o > +obj-${CONFIG_KEXEC} += kexec_relocate.o crash_save_regs.o machine_kexec.o > > obj-$(CONFIG_JUMP_LABEL) += jump_label.o > > diff --git a/arch/riscv/kernel/crash_save_regs.S b/arch/riscv/kernel/crash_save_regs.S > new file mode 100644 > index 000000000..7832fb763 > --- /dev/null > +++ b/arch/riscv/kernel/crash_save_regs.S > @@ -0,0 +1,56 @@ > +/* SPDX-License-Identifier: GPL-2.0 */ > +/* > + * Copyright (C) 2020 FORTH-ICS/CARV > + * Nick Kossifidis <mick@ics.forth.gr> > + */ > + > +#include <asm/asm.h> /* For RISCV_* and REG_* macros */ > +#include <asm/csr.h> /* For CSR_* macros */ > +#include <asm/asm-offsets.h> /* For offsets on pt_regs */ > +#include <linux/linkage.h> /* For SYM_* macros */ > + > +.section ".text" > +SYM_CODE_START(riscv_crash_save_regs) > + REG_S ra, PT_RA(a0) /* x1 */ > + REG_S sp, PT_SP(a0) /* x2 */ > + REG_S gp, PT_GP(a0) /* x3 */ > + REG_S tp, PT_TP(a0) /* x4 */ > + REG_S t0, PT_T0(a0) /* x5 */ > + REG_S t1, PT_T1(a0) /* x6 */ > + REG_S t2, PT_T2(a0) /* x7 */ > + REG_S s0, PT_S0(a0) /* x8/fp */ > + REG_S s1, PT_S1(a0) /* x9 */ > + REG_S a0, PT_A0(a0) /* x10 */ > + REG_S a1, PT_A1(a0) /* x11 */ > + REG_S a2, PT_A2(a0) /* x12 */ > + REG_S a3, PT_A3(a0) /* x13 */ > + REG_S a4, PT_A4(a0) /* x14 */ > + REG_S a5, PT_A5(a0) /* x15 */ > + REG_S a6, PT_A6(a0) /* x16 */ > + REG_S a7, 
PT_A7(a0)	/* x17 */
> +	REG_S s2, PT_S2(a0)	/* x18 */
> +	REG_S s3, PT_S3(a0)	/* x19 */
> +	REG_S s4, PT_S4(a0)	/* x20 */
> +	REG_S s5, PT_S5(a0)	/* x21 */
> +	REG_S s6, PT_S6(a0)	/* x22 */
> +	REG_S s7, PT_S7(a0)	/* x23 */
> +	REG_S s8, PT_S8(a0)	/* x24 */
> +	REG_S s9, PT_S9(a0)	/* x25 */
> +	REG_S s10, PT_S10(a0)	/* x26 */
> +	REG_S s11, PT_S11(a0)	/* x27 */
> +	REG_S t3, PT_T3(a0)	/* x28 */
> +	REG_S t4, PT_T4(a0)	/* x29 */
> +	REG_S t5, PT_T5(a0)	/* x30 */
> +	REG_S t6, PT_T6(a0)	/* x31 */
> +
> +	csrr t1, CSR_STATUS
> +	csrr t2, CSR_EPC
> +	csrr t3, CSR_TVAL
> +	csrr t4, CSR_CAUSE
> +
> +	REG_S t1, PT_STATUS(a0)
> +	REG_S t2, PT_EPC(a0)
> +	REG_S t3, PT_BADADDR(a0)
> +	REG_S t4, PT_CAUSE(a0)
> +	ret
> +SYM_CODE_END(riscv_crash_save_regs)
> diff --git a/arch/riscv/kernel/kexec_relocate.S b/arch/riscv/kernel/kexec_relocate.S
> index 616c20771..14220f70f 100644
> --- a/arch/riscv/kernel/kexec_relocate.S
> +++ b/arch/riscv/kernel/kexec_relocate.S
> @@ -150,7 +150,73 @@ SYM_CODE_START(riscv_kexec_relocate)
>  SYM_CODE_END(riscv_kexec_relocate)
>  riscv_kexec_relocate_end:
>
> -	.section ".rodata"
> +
> +/* Used for jumping to crashkernel */
> +.section ".text"
> +SYM_CODE_START(riscv_kexec_norelocate)
> +	/*
> +	 * s0: (const) Phys address to jump to
> +	 * s1: (const) Phys address of the FDT image
> +	 * s2: (const) The hartid of the current hart
> +	 * s3: (const) va_pa_offset, used when switching MMU off
> +	 */
> +	mv s0, a1
> +	mv s1, a2
> +	mv s2, a3
> +	mv s3, a4
> +
> +	/* Disable / cleanup interrupts */
> +	csrw sie, zero
> +	csrw sip, zero
> +
> +	/* Switch to physical addressing */
> +	la s4, 1f
> +	sub s4, s4, s3
> +	csrw stvec, s4
> +	csrw sptbr, zero

satp is used everywhere instead of sptbr. And maybe you could use CSR_*
naming, like you did in riscv_crash_save_regs and like it's done in
head.S too.

> + > +.align 2 > +1: > + /* Pass the arguments to the next kernel / Cleanup*/ > + mv a0, s2 > + mv a1, s1 > + mv a2, s0 > + > + /* Cleanup */ > + mv a3, zero > + mv a4, zero > + mv a5, zero > + mv a6, zero > + mv a7, zero > + > + mv s0, zero > + mv s1, zero > + mv s2, zero > + mv s3, zero > + mv s4, zero > + mv s5, zero > + mv s6, zero > + mv s7, zero > + mv s8, zero > + mv s9, zero > + mv s10, zero > + mv s11, zero > + > + mv t0, zero > + mv t1, zero > + mv t2, zero > + mv t3, zero > + mv t4, zero > + mv t5, zero > + mv t6, zero > + csrw sepc, zero > + csrw scause, zero > + csrw sscratch, zero > + > + jalr zero, a2, 0 > +SYM_CODE_END(riscv_kexec_norelocate) > + > +.section ".rodata" > SYM_DATA(riscv_kexec_relocate_size, > .long riscv_kexec_relocate_end - riscv_kexec_relocate) > > diff --git a/arch/riscv/kernel/machine_kexec.c b/arch/riscv/kernel/machine_kexec.c > index 2ce6c3daf..e0596c0ac 100644 > --- a/arch/riscv/kernel/machine_kexec.c > +++ b/arch/riscv/kernel/machine_kexec.c > @@ -59,11 +59,6 @@ machine_kexec_prepare(struct kimage *image) > > kexec_image_info(image); > > - if (image->type == KEXEC_TYPE_CRASH) { > - pr_warn("Loading a crash kernel is unsupported for now.\n"); > - return -EINVAL; > - } > - > /* Find the Flattened Device Tree and save its physical address */ > for (i = 0; i < image->nr_segments; i++) { > if (image->segment[i].memsz <= sizeof(fdt)) > @@ -85,17 +80,21 @@ machine_kexec_prepare(struct kimage *image) > } > > /* Copy the assembler code for relocation to the control page */ > - control_code_buffer = page_address(image->control_code_page); > - control_code_buffer_sz = page_size(image->control_code_page); > - if (unlikely(riscv_kexec_relocate_size > control_code_buffer_sz)) { > - pr_err("Relocation code doesn't fit within a control page\n"); > - return -EINVAL; > - } > - memcpy(control_code_buffer, riscv_kexec_relocate, > - riscv_kexec_relocate_size); > + if (image->type != KEXEC_TYPE_CRASH) { > + control_code_buffer = 
page_address(image->control_code_page); > + control_code_buffer_sz = page_size(image->control_code_page); > > - /* Mark the control page executable */ > - set_memory_x((unsigned long) control_code_buffer, 1); > + if (unlikely(riscv_kexec_relocate_size > control_code_buffer_sz)) { > + pr_err("Relocation code doesn't fit within a control page\n"); > + return -EINVAL; > + } > + > + memcpy(control_code_buffer, riscv_kexec_relocate, > + riscv_kexec_relocate_size); > + > + /* Mark the control page executable */ > + set_memory_x((unsigned long) control_code_buffer, 1); > + } > > return 0; > } > @@ -147,6 +146,9 @@ void machine_shutdown(void) > void > machine_crash_shutdown(struct pt_regs *regs) > { > + crash_save_cpu(regs, smp_processor_id()); > + machine_shutdown(); > + pr_info("Starting crashdump kernel...\n"); > } > > /** > @@ -169,7 +171,12 @@ machine_kexec(struct kimage *image) > unsigned long this_hart_id = raw_smp_processor_id(); > unsigned long fdt_addr = internal->fdt_addr; > void *control_code_buffer = page_address(image->control_code_page); > - riscv_kexec_do_relocate do_relocate = control_code_buffer; > + riscv_kexec_method kexec_method = NULL; > + > + if (image->type != KEXEC_TYPE_CRASH) > + kexec_method = control_code_buffer; > + else > + kexec_method = (riscv_kexec_method) &riscv_kexec_norelocate; > > pr_notice("Will call new kernel at %08lx from hart id %lx\n", > jump_addr, this_hart_id); > @@ -180,7 +187,7 @@ machine_kexec(struct kimage *image) > > /* Jump to the relocation code */ > pr_notice("Bye...\n"); > - do_relocate(first_ind_entry, jump_addr, fdt_addr, > - this_hart_id, va_pa_offset); > + kexec_method(first_ind_entry, jump_addr, fdt_addr, > + this_hart_id, va_pa_offset); > unreachable(); > } > diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c > index 030554bab..31866dce9 100644 > --- a/arch/riscv/kernel/setup.c > +++ b/arch/riscv/kernel/setup.c > @@ -20,6 +20,7 @@ > #include <linux/swiotlb.h> > #include <linux/smp.h> > #include 
<linux/efi.h> > +#include <linux/crash_dump.h> > > #include <asm/cpu_ops.h> > #include <asm/early_ioremap.h> > @@ -160,6 +161,14 @@ static void __init init_resources(void) > if (ret < 0) > goto error; > > +#ifdef CONFIG_KEXEC_CORE > + if (crashk_res.start != crashk_res.end) { > + ret = add_resource(&iomem_resource, &crashk_res); > + if (ret < 0) > + goto error; > + } > +#endif > + > for_each_reserved_mem_region(region) { > res = &mem_res[res_idx--]; > > @@ -252,7 +261,6 @@ void __init setup_arch(char **cmdline_p) > efi_init(); > setup_bootmem(); > paging_init(); > - init_resources(); > #if IS_ENABLED(CONFIG_BUILTIN_DTB) > unflatten_and_copy_device_tree(); > #else > @@ -263,6 +271,7 @@ void __init setup_arch(char **cmdline_p) > #endif > misc_mem_init(); > > + init_resources(); > sbi_init(); > > if (IS_ENABLED(CONFIG_STRICT_KERNEL_RWX)) > diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c > index 7f5036fbe..e71b35cec 100644 > --- a/arch/riscv/mm/init.c > +++ b/arch/riscv/mm/init.c > @@ -2,6 +2,8 @@ > /* > * Copyright (C) 2012 Regents of the University of California > * Copyright (C) 2019 Western Digital Corporation or its affiliates. > + * Copyright (C) 2020 FORTH-ICS/CARV > + * Nick Kossifidis <mick@ics.forth.gr> > */ > > #include <linux/init.h> > @@ -14,6 +16,7 @@ > #include <linux/libfdt.h> > #include <linux/set_memory.h> > #include <linux/dma-map-ops.h> > +#include <linux/crash_dump.h> > > #include <asm/fixmap.h> > #include <asm/tlbflush.h> > @@ -586,6 +589,77 @@ void mark_rodata_ro(void) > } > #endif > > +#ifdef CONFIG_KEXEC_CORE > +/* > + * reserve_crashkernel() - reserves memory for crash kernel > + * > + * This function reserves memory area given in "crashkernel=" kernel command > + * line parameter. The memory reserved is used by dump capture kernel when > + * primary kernel is crashing. 
> + */ > +static void __init reserve_crashkernel(void) > +{ > + unsigned long long crash_base = 0; > + unsigned long long crash_size = 0; > + unsigned long search_start = memblock_start_of_DRAM(); > + unsigned long search_end = memblock_end_of_DRAM(); > + > + int ret = 0; > + > + ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(), > + &crash_size, &crash_base); > + if (ret || !crash_size) > + return; > + > + crash_size = PAGE_ALIGN(crash_size); > + > + if (crash_base == 0) { > + /* > + * Current riscv boot protocol requires 2MB alignment for > + * RV64 and 4MB alignment for RV32 (hugepage size) > + */ > + crash_base = memblock_find_in_range(search_start, search_end, > +#ifdef CONFIG_64BIT > + crash_size, SZ_2M); > +#else > + crash_size, SZ_4M); > +#endif You can use PMD_SIZE here and get rid of #ifdef. > + if (crash_base == 0) { > + pr_warn("crashkernel: couldn't allocate %lldKB\n", > + crash_size >> 10); > + return; > + } > + } else { > + /* User specifies base address explicitly. */ > + if (!memblock_is_region_memory(crash_base, crash_size)) { > + pr_warn("crashkernel: requested region is not memory\n"); > + return; > + } > + > + if (memblock_is_region_reserved(crash_base, crash_size)) { > + pr_warn("crashkernel: requested region is reserved\n"); > + return; > + } > + > +#ifdef CONFIG_64BIT > + if (!IS_ALIGNED(crash_base, SZ_2M)) { > +#else > + if (!IS_ALIGNED(crash_base, SZ_4M)) { > +#endif Ditto here. 
> + pr_warn("crashkernel: requested region is misaligned\n"); > + return; > + } > + } > + memblock_reserve(crash_base, crash_size); > + > + pr_info("crashkernel: reserved 0x%016llx - 0x%016llx (%lld MB)\n", > + crash_base, crash_base + crash_size, crash_size >> 20); > + > + crashk_res.start = crash_base; > + crashk_res.end = crash_base + crash_size - 1; > +} > +#endif /* CONFIG_KEXEC_CORE */ > + > void __init paging_init(void) > { > setup_vm_final(); > @@ -598,6 +672,9 @@ void __init misc_mem_init(void) > arch_numa_init(); > sparse_init(); > zone_sizes_init(); > +#ifdef CONFIG_KEXEC_CORE > + reserve_crashkernel(); > +#endif > memblock_dump_all(); > } > > ^ permalink raw reply [flat|nested] 53+ messages in thread
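Editorial note on the reserve_crashkernel() hunk quoted above: the patch records the reserved region in crashk_res as an inclusive range (end = base + size - 1), and init_resources() registers it only when start and end differ. The bookkeeping can be sketched in user-space C as below. This is a minimal illustration, not the kernel code: struct crash_range and the three helpers are hypothetical stand-ins, and only the arithmetic is taken from the quoted patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the two struct resource fields the patch touches. */
struct crash_range {
	uint64_t start;
	uint64_t end;	/* inclusive, like crashk_res.end */
};

/* Fill the range the way the quoted patch fills crashk_res. */
static void crash_range_set(struct crash_range *r, uint64_t base, uint64_t size)
{
	assert(size > 0);	/* the patch returns early when crash_size is 0 */
	r->start = base;
	r->end = base + size - 1;
}

/* An inclusive range covers end - start + 1 bytes. */
static uint64_t crash_range_size(const struct crash_range *r)
{
	return r->end - r->start + 1;
}

/* Same test the quoted init_resources() hunk applies before add_resource(). */
static int crash_range_is_set(const struct crash_range *r)
{
	return r->start != r->end;
}
```

A zero-initialised range reports "not set", so nothing is added to the resource tree when no crashkernel= parameter was given. The inclusive end also explains why the pr_info() line in the patch prints crash_base + crash_size while crashk_res.end holds one byte less.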
* Re: [PATCH v3 4/5] RISC-V: Add kdump support
  2021-04-06 18:36 ` Alex Ghiti
@ 2021-04-09 10:21 ` Nick Kossifidis
  -1 siblings, 0 replies; 53+ messages in thread
From: Nick Kossifidis @ 2021-04-09 10:21 UTC (permalink / raw)
  To: Alex Ghiti
  Cc: Nick Kossifidis, linux-riscv, palmer, paul.walmsley, linux-kernel

On 2021-04-06 21:36, Alex Ghiti wrote:
>
>> +	/* Switch to physical addressing */
>> +	la	s4, 1f
>> +	sub	s4, s4, s3
>> +	csrw	stvec, s4
>> +	csrw	sptbr, zero
>
> satp is used everywhere instead of sptbr. And maybe you could use CSR_****
> naming, like you did in riscv_crash_save_regs and like it's done in
> head.S too.
>

ACK

>> +		crash_base = memblock_find_in_range(search_start, search_end,
>> +#ifdef CONFIG_64BIT
>> +						    crash_size, SZ_2M);
>> +#else
>> +						    crash_size, SZ_4M);
>> +#endif
>
> You can use PMD_SIZE here and get rid of #ifdef.
>
>> +
>> +#ifdef CONFIG_64BIT
>> +		if (!IS_ALIGNED(crash_base, SZ_2M)) {
>> +#else
>> +		if (!IS_ALIGNED(crash_base, SZ_4M)) {
>> +#endif
>
> Ditto here.
>

Will do.

Thanks a lot for your review!

^ permalink raw reply	[flat|nested] 53+ messages in thread
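Editorial note on the PMD_SIZE suggestion acknowledged above: the idea is that a single constant can replace the two #ifdef CONFIG_64BIT branches, because the constant already evaluates to the hugepage size of the configuration being built. A rough user-space sketch follows. SKETCH_PMD_SIZE and SKETCH_IS_ALIGNED are hypothetical stand-ins for the kernel's PMD_SIZE and IS_ALIGNED, and picking the value from the host pointer width is only an assumption made so the example compiles outside the kernel.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the kernel's PMD_SIZE: 2 MB when pointers are
 * 64-bit (imitating RV64), 4 MB otherwise (imitating RV32).
 */
#define SKETCH_PMD_SIZE \
	((sizeof(void *) == 8) ? (2ULL << 20) : (4ULL << 20))

/* Power-of-two alignment test, same shape as the kernel's IS_ALIGNED(). */
#define SKETCH_IS_ALIGNED(x, a)	(((x) & ((a) - 1)) == 0)

/* One check covers both configurations, with no #ifdef at the call site. */
static int crash_base_is_aligned(uint64_t crash_base)
{
	assert(SKETCH_IS_ALIGNED(SKETCH_PMD_SIZE, 4096ULL)); /* sanity check */
	return SKETCH_IS_ALIGNED(crash_base, SKETCH_PMD_SIZE);
}
```

The same constant would be passed as the alignment argument of memblock_find_in_range() in the automatic-placement branch, so both spots flagged in the review collapse to one expression each.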
* Re: [PATCH v3 4/5] RISC-V: Add kdump support 2021-04-05 8:57 ` Nick Kossifidis @ 2021-04-23 3:30 ` Palmer Dabbelt -1 siblings, 0 replies; 53+ messages in thread From: Palmer Dabbelt @ 2021-04-23 3:30 UTC (permalink / raw) To: mick; +Cc: linux-riscv, Paul Walmsley, linux-kernel, mick On Mon, 05 Apr 2021 01:57:11 PDT (-0700), mick@ics.forth.gr wrote: > This patch adds support for kdump, the kernel will reserve a > region for the crash kernel and jump there on panic. In order > for userspace tools (kexec-tools) to prepare the crash kernel > kexec image, we also need to expose some information on > /proc/iomem for the memory regions used by the kernel and for > the region reserved for crash kernel. Note that on userspace > the device tree is used to determine the system's memory > layout so the "System RAM" on /proc/iomem is ignored. > > I tested this on riscv64 qemu and works as expected, you may > test it by triggering a crash through /proc/sysrq_trigger: > > echo c > /proc/sysrq_trigger > > v3: > * Move ELF_CORE_COPY_REGS to asm/elf.h instead of uapi/asm/elf.h > * Set stvec when disabling MMU > * Minor cleanups and re-base > > v2: > * Properly populate the ioresources tree, so that it can be > used later on for implementing strict /dev/mem. 
> * Minor cleanups and re-base > > Signed-off-by: Nick Kossifidis <mick@ics.forth.gr> > --- > arch/riscv/include/asm/elf.h | 6 +++ > arch/riscv/include/asm/kexec.h | 19 ++++--- > arch/riscv/kernel/Makefile | 2 +- > arch/riscv/kernel/crash_save_regs.S | 56 +++++++++++++++++++++ > arch/riscv/kernel/kexec_relocate.S | 68 ++++++++++++++++++++++++- > arch/riscv/kernel/machine_kexec.c | 43 +++++++++------- > arch/riscv/kernel/setup.c | 11 ++++- > arch/riscv/mm/init.c | 77 +++++++++++++++++++++++++++++ > 8 files changed, 255 insertions(+), 27 deletions(-) > create mode 100644 arch/riscv/kernel/crash_save_regs.S > > diff --git a/arch/riscv/include/asm/elf.h b/arch/riscv/include/asm/elf.h > index 5c725e1df..f4b490cd0 100644 > --- a/arch/riscv/include/asm/elf.h > +++ b/arch/riscv/include/asm/elf.h > @@ -81,4 +81,10 @@ extern int arch_setup_additional_pages(struct linux_binprm *bprm, > int uses_interp); > #endif /* CONFIG_MMU */ > > +#define ELF_CORE_COPY_REGS(dest, regs) \ > +do { \ > + *(struct user_regs_struct *)&(dest) = \ > + *(struct user_regs_struct *)regs; \ > +} while (0); > + > #endif /* _ASM_RISCV_ELF_H */ > diff --git a/arch/riscv/include/asm/kexec.h b/arch/riscv/include/asm/kexec.h > index efc69feb4..4fd583acc 100644 > --- a/arch/riscv/include/asm/kexec.h > +++ b/arch/riscv/include/asm/kexec.h > @@ -21,11 +21,16 @@ > > #define KEXEC_ARCH KEXEC_ARCH_RISCV > > +extern void riscv_crash_save_regs(struct pt_regs *newregs); > + > static inline void > crash_setup_regs(struct pt_regs *newregs, > struct pt_regs *oldregs) > { > - /* Dummy implementation for now */ > + if (oldregs) > + memcpy(newregs, oldregs, sizeof(struct pt_regs)); > + else > + riscv_crash_save_regs(newregs); > } > > > @@ -38,10 +43,12 @@ struct kimage_arch { > const extern unsigned char riscv_kexec_relocate[]; > const extern unsigned int riscv_kexec_relocate_size; > > -typedef void (*riscv_kexec_do_relocate)(unsigned long first_ind_entry, > - unsigned long jump_addr, > - unsigned long fdt_addr, > - 
unsigned long hartid, > - unsigned long va_pa_off); > +typedef void (*riscv_kexec_method)(unsigned long first_ind_entry, > + unsigned long jump_addr, > + unsigned long fdt_addr, > + unsigned long hartid, > + unsigned long va_pa_off); > + > +extern riscv_kexec_method riscv_kexec_norelocate; > > #endif > diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile > index c2594018c..07f676ad3 100644 > --- a/arch/riscv/kernel/Makefile > +++ b/arch/riscv/kernel/Makefile > @@ -58,7 +58,7 @@ obj-$(CONFIG_SMP) += cpu_ops_sbi.o > endif > obj-$(CONFIG_HOTPLUG_CPU) += cpu-hotplug.o > obj-$(CONFIG_KGDB) += kgdb.o > -obj-${CONFIG_KEXEC} += kexec_relocate.o machine_kexec.o > +obj-${CONFIG_KEXEC} += kexec_relocate.o crash_save_regs.o machine_kexec.o > > obj-$(CONFIG_JUMP_LABEL) += jump_label.o > > diff --git a/arch/riscv/kernel/crash_save_regs.S b/arch/riscv/kernel/crash_save_regs.S > new file mode 100644 > index 000000000..7832fb763 > --- /dev/null > +++ b/arch/riscv/kernel/crash_save_regs.S > @@ -0,0 +1,56 @@ > +/* SPDX-License-Identifier: GPL-2.0 */ > +/* > + * Copyright (C) 2020 FORTH-ICS/CARV > + * Nick Kossifidis <mick@ics.forth.gr> > + */ > + > +#include <asm/asm.h> /* For RISCV_* and REG_* macros */ > +#include <asm/csr.h> /* For CSR_* macros */ > +#include <asm/asm-offsets.h> /* For offsets on pt_regs */ > +#include <linux/linkage.h> /* For SYM_* macros */ > + > +.section ".text" > +SYM_CODE_START(riscv_crash_save_regs) > + REG_S ra, PT_RA(a0) /* x1 */ > + REG_S sp, PT_SP(a0) /* x2 */ > + REG_S gp, PT_GP(a0) /* x3 */ > + REG_S tp, PT_TP(a0) /* x4 */ > + REG_S t0, PT_T0(a0) /* x5 */ > + REG_S t1, PT_T1(a0) /* x6 */ > + REG_S t2, PT_T2(a0) /* x7 */ > + REG_S s0, PT_S0(a0) /* x8/fp */ > + REG_S s1, PT_S1(a0) /* x9 */ > + REG_S a0, PT_A0(a0) /* x10 */ > + REG_S a1, PT_A1(a0) /* x11 */ > + REG_S a2, PT_A2(a0) /* x12 */ > + REG_S a3, PT_A3(a0) /* x13 */ > + REG_S a4, PT_A4(a0) /* x14 */ > + REG_S a5, PT_A5(a0) /* x15 */ > + REG_S a6, PT_A6(a0) /* x16 */ > + REG_S a7, 
PT_A7(a0) /* x17 */ > + REG_S s2, PT_S2(a0) /* x18 */ > + REG_S s3, PT_S3(a0) /* x19 */ > + REG_S s4, PT_S4(a0) /* x20 */ > + REG_S s5, PT_S5(a0) /* x21 */ > + REG_S s6, PT_S6(a0) /* x22 */ > + REG_S s7, PT_S7(a0) /* x23 */ > + REG_S s8, PT_S8(a0) /* x24 */ > + REG_S s9, PT_S9(a0) /* x25 */ > + REG_S s10, PT_S10(a0) /* x26 */ > + REG_S s11, PT_S11(a0) /* x27 */ > + REG_S t3, PT_T3(a0) /* x28 */ > + REG_S t4, PT_T4(a0) /* x29 */ > + REG_S t5, PT_T5(a0) /* x30 */ > + REG_S t6, PT_T6(a0) /* x31 */ > + > + csrr t1, CSR_STATUS > + csrr t2, CSR_EPC > + csrr t3, CSR_TVAL > + csrr t4, CSR_CAUSE > + > + REG_S t1, PT_STATUS(a0) > + REG_S t2, PT_EPC(a0) > + REG_S t3, PT_BADADDR(a0) > + REG_S t4, PT_CAUSE(a0) > + ret > +SYM_CODE_END(riscv_crash_save_regs) > diff --git a/arch/riscv/kernel/kexec_relocate.S b/arch/riscv/kernel/kexec_relocate.S > index 616c20771..14220f70f 100644 > --- a/arch/riscv/kernel/kexec_relocate.S > +++ b/arch/riscv/kernel/kexec_relocate.S > @@ -150,7 +150,73 @@ SYM_CODE_START(riscv_kexec_relocate) > SYM_CODE_END(riscv_kexec_relocate) > riscv_kexec_relocate_end: > > - .section ".rodata" > + > +/* Used for jumping to crashkernel */ > +.section ".text" > +SYM_CODE_START(riscv_kexec_norelocate) > + /* > + * s0: (const) Phys address to jump to > + * s1: (const) Phys address of the FDT image > + * s2: (const) The hartid of the current hart > + * s3: (const) va_pa_offset, used when switching MMU off > + */ > + mv s0, a1 > + mv s1, a2 > + mv s2, a3 > + mv s3, a4 > + > + /* Disable / cleanup interrupts */ > + csrw sie, zero > + csrw sip, zero > + > + /* Switch to physical addressing */ > + la s4, 1f > + sub s4, s4, s3 > + csrw stvec, s4 > + csrw sptbr, zero > + > +.align 2 > +1: > + /* Pass the arguments to the next kernel / Cleanup*/ > + mv a0, s2 > + mv a1, s1 > + mv a2, s0 > + > + /* Cleanup */ > + mv a3, zero > + mv a4, zero > + mv a5, zero > + mv a6, zero > + mv a7, zero > + > + mv s0, zero > + mv s1, zero > + mv s2, zero > + mv s3, zero > + mv s4, zero > + 
mv s5, zero > + mv s6, zero > + mv s7, zero > + mv s8, zero > + mv s9, zero > + mv s10, zero > + mv s11, zero > + > + mv t0, zero > + mv t1, zero > + mv t2, zero > + mv t3, zero > + mv t4, zero > + mv t5, zero > + mv t6, zero > + csrw sepc, zero > + csrw scause, zero > + csrw sscratch, zero > + > + jalr zero, a2, 0 > +SYM_CODE_END(riscv_kexec_norelocate) > + > +.section ".rodata" > SYM_DATA(riscv_kexec_relocate_size, > .long riscv_kexec_relocate_end - riscv_kexec_relocate) > > diff --git a/arch/riscv/kernel/machine_kexec.c b/arch/riscv/kernel/machine_kexec.c > index 2ce6c3daf..e0596c0ac 100644 > --- a/arch/riscv/kernel/machine_kexec.c > +++ b/arch/riscv/kernel/machine_kexec.c > @@ -59,11 +59,6 @@ machine_kexec_prepare(struct kimage *image) > > kexec_image_info(image); > > - if (image->type == KEXEC_TYPE_CRASH) { > - pr_warn("Loading a crash kernel is unsupported for now.\n"); > - return -EINVAL; > - } > - > /* Find the Flattened Device Tree and save its physical address */ > for (i = 0; i < image->nr_segments; i++) { > if (image->segment[i].memsz <= sizeof(fdt)) > @@ -85,17 +80,21 @@ machine_kexec_prepare(struct kimage *image) > } > > /* Copy the assembler code for relocation to the control page */ > - control_code_buffer = page_address(image->control_code_page); > - control_code_buffer_sz = page_size(image->control_code_page); > - if (unlikely(riscv_kexec_relocate_size > control_code_buffer_sz)) { > - pr_err("Relocation code doesn't fit within a control page\n"); > - return -EINVAL; > - } > - memcpy(control_code_buffer, riscv_kexec_relocate, > - riscv_kexec_relocate_size); > + if (image->type != KEXEC_TYPE_CRASH) { > + control_code_buffer = page_address(image->control_code_page); > + control_code_buffer_sz = page_size(image->control_code_page); > > - /* Mark the control page executable */ > - set_memory_x((unsigned long) control_code_buffer, 1); > + if (unlikely(riscv_kexec_relocate_size > control_code_buffer_sz)) { > + pr_err("Relocation code doesn't fit within a 
control page\n");
> +			return -EINVAL;
> +		}
> +
> +		memcpy(control_code_buffer, riscv_kexec_relocate,
> +			riscv_kexec_relocate_size);
> +
> +		/* Mark the control page executable */
> +		set_memory_x((unsigned long) control_code_buffer, 1);
> +	}
>
>  	return 0;
>  }
> @@ -147,6 +146,9 @@ void machine_shutdown(void)
>  void
>  machine_crash_shutdown(struct pt_regs *regs)
>  {
> +	crash_save_cpu(regs, smp_processor_id());
> +	machine_shutdown();
> +	pr_info("Starting crashdump kernel...\n");
>  }
>
>  /**
> @@ -169,7 +171,12 @@ machine_kexec(struct kimage *image)
>  	unsigned long this_hart_id = raw_smp_processor_id();
>  	unsigned long fdt_addr = internal->fdt_addr;
>  	void *control_code_buffer = page_address(image->control_code_page);
> -	riscv_kexec_do_relocate do_relocate = control_code_buffer;
> +	riscv_kexec_method kexec_method = NULL;
> +
> +	if (image->type != KEXEC_TYPE_CRASH)
> +		kexec_method = control_code_buffer;
> +	else
> +		kexec_method = (riscv_kexec_method) &riscv_kexec_norelocate;
>
>  	pr_notice("Will call new kernel at %08lx from hart id %lx\n",
>  		  jump_addr, this_hart_id);
> @@ -180,7 +187,7 @@ machine_kexec(struct kimage *image)
>
>  	/* Jump to the relocation code */
>  	pr_notice("Bye...\n");
> -	do_relocate(first_ind_entry, jump_addr, fdt_addr,
> -		    this_hart_id, va_pa_offset);
> +	kexec_method(first_ind_entry, jump_addr, fdt_addr,
> +		     this_hart_id, va_pa_offset);
>  	unreachable();
>  }
> diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
> index 030554bab..31866dce9 100644
> --- a/arch/riscv/kernel/setup.c
> +++ b/arch/riscv/kernel/setup.c
> @@ -20,6 +20,7 @@
>  #include <linux/swiotlb.h>
>  #include <linux/smp.h>
>  #include <linux/efi.h>
> +#include <linux/crash_dump.h>
>
>  #include <asm/cpu_ops.h>
>  #include <asm/early_ioremap.h>
> @@ -160,6 +161,14 @@ static void __init init_resources(void)
>  	if (ret < 0)
>  		goto error;
>
> +#ifdef CONFIG_KEXEC_CORE
> +	if (crashk_res.start != crashk_res.end) {
> +		ret = add_resource(&iomem_resource, &crashk_res);
> +		if (ret < 0)
> +			goto error;
> +	}
> +#endif
> +
>  	for_each_reserved_mem_region(region) {
>  		res = &mem_res[res_idx--];
>
> @@ -252,7 +261,6 @@ void __init setup_arch(char **cmdline_p)
>  	efi_init();
>  	setup_bootmem();
>  	paging_init();
> -	init_resources();
>  #if IS_ENABLED(CONFIG_BUILTIN_DTB)
>  	unflatten_and_copy_device_tree();
>  #else
> @@ -263,6 +271,7 @@ void __init setup_arch(char **cmdline_p)
>  #endif
>  	misc_mem_init();
>
> +	init_resources();
>  	sbi_init();

This one also caused a merge conflict.

>
>  	if (IS_ENABLED(CONFIG_STRICT_KERNEL_RWX))
> diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
> index 7f5036fbe..e71b35cec 100644
> --- a/arch/riscv/mm/init.c
> +++ b/arch/riscv/mm/init.c
> @@ -2,6 +2,8 @@
>  /*
>   * Copyright (C) 2012 Regents of the University of California
>   * Copyright (C) 2019 Western Digital Corporation or its affiliates.
> + * Copyright (C) 2020 FORTH-ICS/CARV
> + *  Nick Kossifidis <mick@ics.forth.gr>
>   */
>
>  #include <linux/init.h>
> @@ -14,6 +16,7 @@
>  #include <linux/libfdt.h>
>  #include <linux/set_memory.h>
>  #include <linux/dma-map-ops.h>
> +#include <linux/crash_dump.h>
>
>  #include <asm/fixmap.h>
>  #include <asm/tlbflush.h>
> @@ -586,6 +589,77 @@ void mark_rodata_ro(void)
>  }
>  #endif
>
> +#ifdef CONFIG_KEXEC_CORE
> +/*
> + * reserve_crashkernel() - reserves memory for crash kernel
> + *
> + * This function reserves memory area given in "crashkernel=" kernel command
> + * line parameter. The memory reserved is used by dump capture kernel when
> + * primary kernel is crashing.
> + */
> +static void __init reserve_crashkernel(void)
> +{
> +	unsigned long long crash_base = 0;
> +	unsigned long long crash_size = 0;
> +	unsigned long search_start = memblock_start_of_DRAM();
> +	unsigned long search_end = memblock_end_of_DRAM();
> +
> +	int ret = 0;
> +
> +	ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
> +				&crash_size, &crash_base);
> +	if (ret || !crash_size)
> +		return;
> +
> +	crash_size = PAGE_ALIGN(crash_size);
> +
> +	if (crash_base == 0) {
> +		/*
> +		 * Current riscv boot protocol requires 2MB alignment for
> +		 * RV64 and 4MB alignment for RV32 (hugepage size)
> +		 */
> +		crash_base = memblock_find_in_range(search_start, search_end,
> +#ifdef CONFIG_64BIT
> +						    crash_size, SZ_2M);
> +#else
> +						    crash_size, SZ_4M);
> +#endif
> +		if (crash_base == 0) {
> +			pr_warn("crashkernel: couldn't allocate %lldKB\n",
> +				crash_size >> 10);
> +			return;
> +		}
> +	} else {
> +		/* User specifies base address explicitly. */
> +		if (!memblock_is_region_memory(crash_base, crash_size)) {
> +			pr_warn("crashkernel: requested region is not memory\n");
> +			return;
> +		}
> +
> +		if (memblock_is_region_reserved(crash_base, crash_size)) {
> +			pr_warn("crashkernel: requested region is reserved\n");
> +			return;
> +		}
> +
> +#ifdef CONFIG_64BIT
> +		if (!IS_ALIGNED(crash_base, SZ_2M)) {
> +#else
> +		if (!IS_ALIGNED(crash_base, SZ_4M)) {
> +#endif
> +			pr_warn("crashkernel: requested region is misaligned\n");
> +			return;
> +		}
> +	}
> +	memblock_reserve(crash_base, crash_size);
> +
> +	pr_info("crashkernel: reserved 0x%016llx - 0x%016llx (%lld MB)\n",
> +		crash_base, crash_base + crash_size, crash_size >> 20);
> +
> +	crashk_res.start = crash_base;
> +	crashk_res.end = crash_base + crash_size - 1;
> +}
> +#endif /* CONFIG_KEXEC_CORE */
> +
>  void __init paging_init(void)
>  {
>  	setup_vm_final();
> @@ -598,6 +672,9 @@ void __init misc_mem_init(void)
>  	arch_numa_init();
>  	sparse_init();
>  	zone_sizes_init();
> +#ifdef CONFIG_KEXEC_CORE
> +	reserve_crashkernel();
> +#endif
>  	memblock_dump_all();
>  }
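For readers following the reserve_crashkernel() hunk quoted above: when the user passes an explicit base, the checks come down to an alignment test (2MB on RV64, 4MB on RV32) followed by an inclusive end address for crashk_res. The following is only a user-space sketch of that arithmetic, not the kernel code. struct region, is_aligned(), crash_align() and crash_region_init() are hypothetical names, the zero-size test is an added assumption, and the memblock "is memory" / "is reserved" checks are left out.

```c
#include <stdbool.h>
#include <stdint.h>

#define SZ_2M (2ULL << 20)
#define SZ_4M (4ULL << 20)

/* Hypothetical stand-in for the kernel's struct resource. */
struct region {
	uint64_t start;
	uint64_t end;	/* inclusive, like crashk_res.end in the patch */
};

/* Power-of-two alignment test, the same idea as the kernel's IS_ALIGNED(). */
static bool is_aligned(uint64_t addr, uint64_t align)
{
	return (addr & (align - 1)) == 0;
}

/* Alignment the patch asks for: 2MB on RV64, 4MB on RV32. */
static uint64_t crash_align(bool is_64bit)
{
	return is_64bit ? SZ_2M : SZ_4M;
}

/*
 * Validate a user-supplied base and fill in an inclusive range.
 * Returns false for a zero size (an assumption of this sketch)
 * or for a misaligned base.
 */
static bool crash_region_init(struct region *res, uint64_t base,
			      uint64_t size, bool is_64bit)
{
	if (size == 0 || !is_aligned(base, crash_align(is_64bit)))
		return false;

	res->start = base;
	res->end = base + size - 1;	/* last byte of the region */
	return true;
}
```

Note that the patch prints crash_base + crash_size (an exclusive end) in its pr_info() line while storing the inclusive end in crashk_res, so the two values differ by one.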
* [PATCH v3 5/5] RISC-V: Add crash kernel support
From: Nick Kossifidis @ 2021-04-05 8:57 UTC
To: linux-riscv, palmer
Cc: paul.walmsley, linux-kernel, Nick Kossifidis, Nick Kossifidis

From: Nick Kossifidis <mickflemm@gmail.com>

This patch allows Linux to act as a crash kernel for use with
kdump. Userspace will let the crash kernel know about the
memory region it can use through linux,usable-memory property
on the /memory node (overriding its reg property), and about the
memory region where the elf core header of the previous kernel
is saved, through a reserved-memory node with a compatible string
of "linux,elfcorehdr". This approach is the least invasive and
re-uses functionality already present.

I tested this on riscv64 qemu and it works as expected, you
may test it by retrieving the dmesg of the previous kernel
through /proc/vmcore, using the vmcore-dmesg utility from
kexec-tools.

v3:
 * Rebase

v2:
 * Use linux,usable-memory on /memory instead of a new binding
 * Use a reserved-memory node for ELF core header

Signed-off-by: Nick Kossifidis <mick@ics.forth.gr>
---
 arch/riscv/Kconfig             | 10 ++++++++
 arch/riscv/kernel/Makefile     |  1 +
 arch/riscv/kernel/crash_dump.c | 46 ++++++++++++++++++++++++++++++++++
 arch/riscv/kernel/setup.c      | 12 +++++++++
 arch/riscv/mm/init.c           | 33 ++++++++++++++++++++++++
 5 files changed, 102 insertions(+)
 create mode 100644 arch/riscv/kernel/crash_dump.c

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 3716262ef..553c2dced 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -403,6 +403,16 @@ config KEXEC
 
 	  The name comes from the similarity to the exec system call.
 
+config CRASH_DUMP
+	bool "Build kdump crash kernel"
+	help
+	  Generate crash dump after being started by kexec. This should
+	  be normally only set in special crash dump kernels which are
+	  loaded in the main kernel with kexec-tools into a specially
+	  reserved region and then later executed after a crash by
+	  kdump/kexec.
+
+	  For more details see Documentation/admin-guide/kdump/kdump.rst
 
 endmenu
 
diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile
index 07f676ad3..bd66d2ce0 100644
--- a/arch/riscv/kernel/Makefile
+++ b/arch/riscv/kernel/Makefile
@@ -59,6 +59,7 @@ endif
 obj-$(CONFIG_HOTPLUG_CPU)	+= cpu-hotplug.o
 obj-$(CONFIG_KGDB)		+= kgdb.o
 obj-${CONFIG_KEXEC}		+= kexec_relocate.o crash_save_regs.o machine_kexec.o
+obj-$(CONFIG_CRASH_DUMP)	+= crash_dump.o
 
 obj-$(CONFIG_JUMP_LABEL)	+= jump_label.o
 
diff --git a/arch/riscv/kernel/crash_dump.c b/arch/riscv/kernel/crash_dump.c
new file mode 100644
index 000000000..86cc0ada5
--- /dev/null
+++ b/arch/riscv/kernel/crash_dump.c
@@ -0,0 +1,46 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * This code comes from arch/arm64/kernel/crash_dump.c
+ * Created by: AKASHI Takahiro <takahiro.akashi@linaro.org>
+ * Copyright (C) 2017 Linaro Limited
+ */
+
+#include <linux/crash_dump.h>
+#include <linux/io.h>
+
+/**
+ * copy_oldmem_page() - copy one page from old kernel memory
+ * @pfn: page frame number to be copied
+ * @buf: buffer where the copied page is placed
+ * @csize: number of bytes to copy
+ * @offset: offset in bytes into the page
+ * @userbuf: if set, @buf is in a user address space
+ *
+ * This function copies one page from old kernel memory into buffer pointed by
+ * @buf. If @buf is in userspace, set @userbuf to %1. Returns number of bytes
+ * copied or negative error in case of failure.
+ */
+ssize_t copy_oldmem_page(unsigned long pfn, char *buf,
+			 size_t csize, unsigned long offset,
+			 int userbuf)
+{
+	void *vaddr;
+
+	if (!csize)
+		return 0;
+
+	vaddr = memremap(__pfn_to_phys(pfn), PAGE_SIZE, MEMREMAP_WB);
+	if (!vaddr)
+		return -ENOMEM;
+
+	if (userbuf) {
+		if (copy_to_user((char __user *)buf, vaddr + offset, csize)) {
+			memunmap(vaddr);
+			return -EFAULT;
+		}
+	} else
+		memcpy(buf, vaddr + offset, csize);
+
+	memunmap(vaddr);
+	return csize;
+}
diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
index 31866dce9..ff398a3d8 100644
--- a/arch/riscv/kernel/setup.c
+++ b/arch/riscv/kernel/setup.c
@@ -66,6 +66,9 @@ static struct resource code_res = { .name = "Kernel code", };
 static struct resource data_res = { .name = "Kernel data", };
 static struct resource rodata_res = { .name = "Kernel rodata", };
 static struct resource bss_res = { .name = "Kernel bss", };
+#ifdef CONFIG_CRASH_DUMP
+static struct resource elfcorehdr_res = { .name = "ELF Core hdr", };
+#endif
 
 static int __init add_resource(struct resource *parent,
 				struct resource *res)
@@ -169,6 +172,15 @@ static void __init init_resources(void)
 	}
 #endif
 
+#ifdef CONFIG_CRASH_DUMP
+	if (elfcorehdr_size > 0) {
+		elfcorehdr_res.start = elfcorehdr_addr;
+		elfcorehdr_res.end = elfcorehdr_addr + elfcorehdr_size - 1;
+		elfcorehdr_res.flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
+		add_resource(&iomem_resource, &elfcorehdr_res);
+	}
+#endif
+
 	for_each_reserved_mem_region(region) {
 		res = &mem_res[res_idx--];
 
diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
index e71b35cec..f66011816 100644
--- a/arch/riscv/mm/init.c
+++ b/arch/riscv/mm/init.c
@@ -13,6 +13,7 @@
 #include <linux/swap.h>
 #include <linux/sizes.h>
 #include <linux/of_fdt.h>
+#include <linux/of_reserved_mem.h>
 #include <linux/libfdt.h>
 #include <linux/set_memory.h>
 #include <linux/dma-map-ops.h>
@@ -606,6 +607,18 @@ static void __init reserve_crashkernel(void)
 
 	int ret = 0;
 
+	/*
+	 * Don't reserve a region for a crash kernel on a crash kernel
+	 * since it doesn't make much sense and we have limited memory
+	 * resources.
+	 */
+#ifdef CONFIG_CRASH_DUMP
+	if (is_kdump_kernel()) {
+		pr_info("crashkernel: ignoring reservation request\n");
+		return;
+	}
+#endif
+
 	ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
 				&crash_size, &crash_base);
 	if (ret || !crash_size)
@@ -660,6 +673,26 @@ static void __init reserve_crashkernel(void)
 }
 #endif /* CONFIG_KEXEC_CORE */
 
+#ifdef CONFIG_CRASH_DUMP
+/*
+ * We keep track of the ELF core header of the crashed
+ * kernel with a reserved-memory region with compatible
+ * string "linux,elfcorehdr". Here we register a callback
+ * to populate elfcorehdr_addr/size when this region is
+ * present. Note that this region will be marked as
+ * reserved once we call early_init_fdt_scan_reserved_mem()
+ * later on.
+ */
+static int elfcore_hdr_setup(struct reserved_mem *rmem)
+{
+	elfcorehdr_addr = rmem->base;
+	elfcorehdr_size = rmem->size;
+	return 0;
+}
+
+RESERVEDMEM_OF_DECLARE(elfcorehdr, "linux,elfcorehdr", elfcore_hdr_setup);
+#endif
+
 void __init paging_init(void)
 {
 	setup_vm_final();
-- 
2.26.2
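The copy_oldmem_page() added in this patch maps a single page per call, so a read that crosses a page boundary has to be split by the caller. The user-space sketch below models that split over a flat byte array, as an illustration only. copy_old_page() and read_old_range() are hypothetical names, the array stands in for the memremap()'d memory of the old kernel, and the bounds check on offset and csize is an assumption added here that the kernel function leaves to its caller.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 4096UL

/*
 * Hypothetical model of a one-page copy: copies csize bytes that start
 * at 'offset' inside page 'pfn' of the flat array 'oldmem'.
 * Returns the number of bytes copied, or -1 if the request would
 * run past the end of the page.
 */
static long copy_old_page(const unsigned char *oldmem, unsigned long pfn,
			  char *buf, size_t csize, unsigned long offset)
{
	if (!csize)
		return 0;
	if (offset >= PAGE_SIZE || csize > PAGE_SIZE - offset)
		return -1;

	memcpy(buf, oldmem + pfn * PAGE_SIZE + offset, csize);
	return (long)csize;
}

/*
 * Split an arbitrary byte range into per-page copies, the way a
 * caller of a page-granular copy routine has to.
 */
static long read_old_range(const unsigned char *oldmem, uint64_t pos,
			   char *buf, size_t count)
{
	size_t done = 0;

	while (count) {
		unsigned long pfn = (unsigned long)(pos / PAGE_SIZE);
		unsigned long offset = (unsigned long)(pos % PAGE_SIZE);
		size_t chunk = PAGE_SIZE - offset;
		long ret;

		if (chunk > count)
			chunk = count;

		ret = copy_old_page(oldmem, pfn, buf + done, chunk, offset);
		if (ret < 0)
			return ret;

		pos += chunk;
		done += chunk;
		count -= chunk;
	}
	return (long)done;
}
```

Keeping the copy page-granular means only one page of the old kernel is mapped at any time, which suits a crash kernel that runs with very little memory.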
* Re: [PATCH v3 5/5] RISC-V: Add crash kernel support 2021-04-05 8:57 ` Nick Kossifidis @ 2021-04-23 3:30 ` Palmer Dabbelt -1 siblings, 0 replies; 53+ messages in thread From: Palmer Dabbelt @ 2021-04-23 3:30 UTC (permalink / raw) To: mick; +Cc: linux-riscv, Paul Walmsley, linux-kernel, mickflemm, mick On Mon, 05 Apr 2021 01:57:12 PDT (-0700), mick@ics.forth.gr wrote: > From: Nick Kossifidis <mickflemm@gmail.com> This doesn't match the SOB. I just fixed it up. IIUC that's not generally the right way to go, but since it came from the right address I'm OK with it this time. > > This patch allows Linux to act as a crash kernel for use with > kdump. Userspace will let the crash kernel know about the > memory region it can use through linux,usable-memory property > on the /memory node (overriding its reg property), and about the > memory region where the elf core header of the previous kernel > is saved, through a reserved-memory node with a compatible string > of "linux,elfcorehdr". This approach is the least invasive and > re-uses functionality already present. > > I tested this on riscv64 qemu and it works as expected, you > may test it by retrieving the dmesg of the previous kernel > through /proc/vmcore, using the vmcore-dmesg utility from > kexec-tools. 
> > v3: > * Rebase > > v2: > * Use linux,usable-memory on /memory instead of a new binding > * Use a reserved-memory node for ELF core header > > Signed-off-by: Nick Kossifidis <mick@ics.forth.gr> > --- > arch/riscv/Kconfig | 10 ++++++++ > arch/riscv/kernel/Makefile | 1 + > arch/riscv/kernel/crash_dump.c | 46 ++++++++++++++++++++++++++++++++++ > arch/riscv/kernel/setup.c | 12 +++++++++ > arch/riscv/mm/init.c | 33 ++++++++++++++++++++++++ > 5 files changed, 102 insertions(+) > create mode 100644 arch/riscv/kernel/crash_dump.c > > diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig > index 3716262ef..553c2dced 100644 > --- a/arch/riscv/Kconfig > +++ b/arch/riscv/Kconfig > @@ -403,6 +403,16 @@ config KEXEC > > The name comes from the similarity to the exec system call. > > +config CRASH_DUMP > + bool "Build kdump crash kernel" > + help > + Generate crash dump after being started by kexec. This should > + be normally only set in special crash dump kernels which are > + loaded in the main kernel with kexec-tools into a specially > + reserved region and then later executed after a crash by > + kdump/kexec. 
> + > + For more details see Documentation/admin-guide/kdump/kdump.rst > > endmenu > > diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile > index 07f676ad3..bd66d2ce0 100644 > --- a/arch/riscv/kernel/Makefile > +++ b/arch/riscv/kernel/Makefile > @@ -59,6 +59,7 @@ endif > obj-$(CONFIG_HOTPLUG_CPU) += cpu-hotplug.o > obj-$(CONFIG_KGDB) += kgdb.o > obj-${CONFIG_KEXEC} += kexec_relocate.o crash_save_regs.o machine_kexec.o > +obj-$(CONFIG_CRASH_DUMP) += crash_dump.o > > obj-$(CONFIG_JUMP_LABEL) += jump_label.o > > diff --git a/arch/riscv/kernel/crash_dump.c b/arch/riscv/kernel/crash_dump.c > new file mode 100644 > index 000000000..86cc0ada5 > --- /dev/null > +++ b/arch/riscv/kernel/crash_dump.c > @@ -0,0 +1,46 @@ > +// SPDX-License-Identifier: GPL-2.0 > +/* > + * This code comes from arch/arm64/kernel/crash_dump.c > + * Created by: AKASHI Takahiro <takahiro.akashi@linaro.org> > + * Copyright (C) 2017 Linaro Limited > + */ > + > +#include <linux/crash_dump.h> > +#include <linux/io.h> > + > +/** > + * copy_oldmem_page() - copy one page from old kernel memory > + * @pfn: page frame number to be copied > + * @buf: buffer where the copied page is placed > + * @csize: number of bytes to copy > + * @offset: offset in bytes into the page > + * @userbuf: if set, @buf is in a user address space > + * > + * This function copies one page from old kernel memory into buffer pointed by > + * @buf. If @buf is in userspace, set @userbuf to %1. Returns number of bytes > + * copied or negative error in case of failure. 
> + */ > +ssize_t copy_oldmem_page(unsigned long pfn, char *buf, > + size_t csize, unsigned long offset, > + int userbuf) > +{ > + void *vaddr; > + > + if (!csize) > + return 0; > + > + vaddr = memremap(__pfn_to_phys(pfn), PAGE_SIZE, MEMREMAP_WB); > + if (!vaddr) > + return -ENOMEM; > + > + if (userbuf) { > + if (copy_to_user((char __user *)buf, vaddr + offset, csize)) { > + memunmap(vaddr); > + return -EFAULT; > + } > + } else > + memcpy(buf, vaddr + offset, csize); > + > + memunmap(vaddr); > + return csize; > +} > diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c > index 31866dce9..ff398a3d8 100644 > --- a/arch/riscv/kernel/setup.c > +++ b/arch/riscv/kernel/setup.c > @@ -66,6 +66,9 @@ static struct resource code_res = { .name = "Kernel code", }; > static struct resource data_res = { .name = "Kernel data", }; > static struct resource rodata_res = { .name = "Kernel rodata", }; > static struct resource bss_res = { .name = "Kernel bss", }; > +#ifdef CONFIG_CRASH_DUMP > +static struct resource elfcorehdr_res = { .name = "ELF Core hdr", }; > +#endif > > static int __init add_resource(struct resource *parent, > struct resource *res) > @@ -169,6 +172,15 @@ static void __init init_resources(void) > } > #endif > > +#ifdef CONFIG_CRASH_DUMP > + if (elfcorehdr_size > 0) { > + elfcorehdr_res.start = elfcorehdr_addr; > + elfcorehdr_res.end = elfcorehdr_addr + elfcorehdr_size - 1; > + elfcorehdr_res.flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY; > + add_resource(&iomem_resource, &elfcorehdr_res); > + } > +#endif > + > for_each_reserved_mem_region(region) { > res = &mem_res[res_idx--]; > > diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c > index e71b35cec..f66011816 100644 > --- a/arch/riscv/mm/init.c > +++ b/arch/riscv/mm/init.c > @@ -13,6 +13,7 @@ > #include <linux/swap.h> > #include <linux/sizes.h> > #include <linux/of_fdt.h> > +#include <linux/of_reserved_mem.h> > #include <linux/libfdt.h> > #include <linux/set_memory.h> > #include 
<linux/dma-map-ops.h> > @@ -606,6 +607,18 @@ static void __init reserve_crashkernel(void) > > int ret = 0; > > + /* > + * Don't reserve a region for a crash kernel on a crash kernel > + * since it doesn't make much sense and we have limited memory > + * resources. > + */ > +#ifdef CONFIG_CRASH_DUMP > + if (is_kdump_kernel()) { > + pr_info("crashkernel: ignoring reservation request\n"); > + return; > + } > +#endif > + > ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(), > &crash_size, &crash_base); > if (ret || !crash_size) > @@ -660,6 +673,26 @@ static void __init reserve_crashkernel(void) > } > #endif /* CONFIG_KEXEC_CORE */ > > +#ifdef CONFIG_CRASH_DUMP > +/* > + * We keep track of the ELF core header of the crashed > + * kernel with a reserved-memory region with compatible > + * string "linux,elfcorehdr". Here we register a callback > + * to populate elfcorehdr_addr/size when this region is > + * present. Note that this region will be marked as > + * reserved once we call early_init_fdt_scan_reserved_mem() > + * later on. > + */ > +static int elfcore_hdr_setup(struct reserved_mem *rmem) > +{ > + elfcorehdr_addr = rmem->base; > + elfcorehdr_size = rmem->size; > + return 0; > +} > + > +RESERVEDMEM_OF_DECLARE(elfcorehdr, "linux,elfcorehdr", elfcore_hdr_setup); > +#endif > + > void __init paging_init(void) > { > setup_vm_final(); ^ permalink raw reply [flat|nested] 53+ messages in thread
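The way the quoted patch turns the "linux,elfcorehdr" reserved-memory region into an iomem resource can be sketched in plain C. This is a user-space illustration under stated assumptions: `struct sketch_rmem`, `struct sketch_res` and `sketch_elfcorehdr_res()` are hypothetical stand-ins for struct reserved_mem, struct resource and the logic split between elfcore_hdr_setup() and init_resources(). The point it shows is that the resource end is inclusive, hence the `- 1`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the base/size fields of struct reserved_mem. */
struct sketch_rmem {
	uint64_t base;
	uint64_t size;
};

/* Hypothetical stand-in for the start/end fields of struct resource. */
struct sketch_res {
	uint64_t start;
	uint64_t end;	/* inclusive, as in the kernel's struct resource */
	int valid;	/* 0 when no ELF core header region was found */
};

static struct sketch_res sketch_elfcorehdr_res(const struct sketch_rmem *rmem)
{
	struct sketch_res res = { 0, 0, 0 };

	assert(rmem != NULL);

	/* Mirrors the "elfcorehdr_size > 0" guard in init_resources(). */
	if (rmem->size == 0)
		return res;

	res.start = rmem->base;
	res.end = rmem->base + rmem->size - 1;
	res.valid = 1;
	return res;
}
```

For example, a 4 KiB region at 0x80000000 would yield the range 0x80000000-0x80000fff.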
* Re: [PATCH v3 0/5] RISC-V: Add kexec/kdump support
  2021-04-05 8:57 ` Nick Kossifidis
@ 2021-04-07 7:45 ` Yixun Lan
  -1 siblings, 0 replies; 53+ messages in thread
From: Yixun Lan @ 2021-04-07 7:45 UTC (permalink / raw)
  To: Nick Kossifidis, linux-riscv, Palmer Dabbelt
  Cc: Paul Walmsley, linux-kernel@vger.kernel.org List

Hi Nick

On 4/5/21 8:57 AM, Nick Kossifidis wrote:
> This patch series adds kexec/kdump and crash kernel
> support on RISC-V. For testing the patches a patched
> version of kexec-tools is needed (still a work in
> progress) which can be found at:
>
> https://riscv.ics.forth.gr/kexec-tools-patched.tar.xz
>
> v3:
> * Rebase on newer kernel tree
> * Minor cleanups
> * Split UAPI changes to a separate patch
> * Improve / cleanup init_resources
> * Resolve Palmer's comments
>
> v2:
> * Rebase on newer kernel tree
> * Minor cleanups
> * Properly populate the ioresources tre, so that it
> can be used later on for implementing strict /dev/mem
> * Use linux,usable-memory on /memory instead of a new binding
> * USe a reserved-memory node for ELF core header
>
> Nick Kossifidis (5):
> RISC-V: Add EM_RISCV to kexec UAPI header
> RISC-V: Add kexec support
> RISC-V: Improve init_resources
> RISC-V: Add kdump support
> RISC-V: Add crash kernel support
>
> arch/riscv/Kconfig                  |  25 ++++
> arch/riscv/include/asm/elf.h        |   6 +
> arch/riscv/include/asm/kexec.h      |  54 +++++++
> arch/riscv/kernel/Makefile          |   6 +
> arch/riscv/kernel/crash_dump.c      |  46 ++++++
> arch/riscv/kernel/crash_save_regs.S |  56 +++++++
> arch/riscv/kernel/kexec_relocate.S  | 222 ++++++++++++++++++++++++++++
> arch/riscv/kernel/machine_kexec.c   | 193 ++++++++++++++++++++++++
> arch/riscv/kernel/setup.c           | 113 ++++++++------
> arch/riscv/mm/init.c                | 110 ++++++++++++++
> include/uapi/linux/kexec.h          |   1 +
> 11 files changed, 787 insertions(+), 45 deletions(-)
> create mode 100644 arch/riscv/include/asm/kexec.h
> create mode 100644 arch/riscv/kernel/crash_dump.c
> create mode 100644 arch/riscv/kernel/crash_save_regs.S
> create mode 100644 arch/riscv/kernel/kexec_relocate.S
> create mode 100644 arch/riscv/kernel/machine_kexec.c
>

Just asking out of curiosity (maybe off topic): is the crash analysis
utility [1] capable of parsing a RISC-V kdump image? If not, is there
any plan to work on it?

[1] https://github.com/crash-utility/crash

Yixun Lan

^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v3 0/5] RISC-V: Add kexec/kdump support
  2021-04-05 8:57 ` Nick Kossifidis
@ 2021-04-07 16:29 ` Rob Herring
  -1 siblings, 0 replies; 53+ messages in thread
From: Rob Herring @ 2021-04-07 16:29 UTC (permalink / raw)
  To: Nick Kossifidis; +Cc: linux-riscv, palmer, paul.walmsley, linux-kernel

On Mon, Apr 05, 2021 at 11:57:07AM +0300, Nick Kossifidis wrote:
> This patch series adds kexec/kdump and crash kernel
> support on RISC-V. For testing the patches a patched
> version of kexec-tools is needed (still a work in
> progress) which can be found at:
>
> https://riscv.ics.forth.gr/kexec-tools-patched.tar.xz
>
> v3:
> * Rebase on newer kernel tree
> * Minor cleanups
> * Split UAPI changes to a separate patch
> * Improve / cleanup init_resources
> * Resolve Palmer's comments
>
> v2:
> * Rebase on newer kernel tree
> * Minor cleanups
> * Properly populate the ioresources tre, so that it
> can be used later on for implementing strict /dev/mem
> * Use linux,usable-memory on /memory instead of a new binding

Where? In any case, that's not going to work well with EFI support,
assuming that, like arm64, 'memory' is passed in UEFI structures instead
of DT. That's why there's now a /chosen linux,usable-memory-ranges
property.

Isn't the preferred kexec interface the file-based interface? I'd
expect a new arch to only support that. And there's common kexec DT
handling for that pending for 5.13.

Rob

^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v3 0/5] RISC-V: Add kexec/kdump support
  2021-04-07 16:29 ` Rob Herring
@ 2021-04-09 10:02 ` Nick Kossifidis
  -1 siblings, 0 replies; 53+ messages in thread
From: Nick Kossifidis @ 2021-04-09 10:02 UTC (permalink / raw)
  To: Rob Herring
  Cc: Nick Kossifidis, linux-riscv, palmer, paul.walmsley, linux-kernel

On 2021-04-07 19:29, Rob Herring wrote:
> On Mon, Apr 05, 2021 at 11:57:07AM +0300, Nick Kossifidis wrote:
>> This patch series adds kexec/kdump and crash kernel
>> support on RISC-V. For testing the patches a patched
>> version of kexec-tools is needed (still a work in
>> progress) which can be found at:
>>
>> https://riscv.ics.forth.gr/kexec-tools-patched.tar.xz
>>
>> v3:
>> * Rebase on newer kernel tree
>> * Minor cleanups
>> * Split UAPI changes to a separate patch
>> * Improve / cleanup init_resources
>> * Resolve Palmer's comments
>>
>> v2:
>> * Rebase on newer kernel tree
>> * Minor cleanups
>> * Properly populate the ioresources tre, so that it
>> can be used later on for implementing strict /dev/mem
>> * Use linux,usable-memory on /memory instead of a new binding
>
> Where? In any case, that's not going to work well with EFI support
> assuming like arm64, 'memory' is passed in UEFI structures instead of
> DT. That's why there's now a /chosen linux,usable-memory-ranges
> property.
>

Here:
https://elixir.bootlin.com/linux/v5.12-rc5/source/drivers/of/fdt.c#L1001

The "linux,usable-memory" binding is already defined and is part of
early_init_dt_scan_memory(), which we call in mm/init.c to determine the
system's memory layout. It's simple and clean, and I don't see a reason
to use another binding on /chosen and add extra code for this when we
already handle it in early_init_dt_scan_memory() anyway.

As for EFI, even when it is enabled we still use DT to determine the
system memory layout, not EFI structures. I also don't see how EFI is
relevant here: the bootloader in kexec's case is Linux, not EFI.

BTW the /memory node is mandatory in any case; it should exist in the DT
regardless of EFI. The /chosen node, on the other hand, is in general
optional, and we can still boot a riscv system without a /chosen node
present (we only require it for the built-in cmdline to work).

Also, a simple grep for "linux,usable-memory-ranges" on the latest
kernel sources didn't return anything, and there is nothing in
chosen.txt either. Where is that binding documented/implemented?

> Isn't the preferred kexec interface the file based interface? I'd
> expect a new arch to only support that. And there's common kexec DT
> handling for that pending for 5.13.
>

Both approaches have their pros and cons, which is why both are
available. In no way is CONFIG_KEXEC deprecated in favor of
CONFIG_KEXEC_FILE, at least not as far as I know. The main point of the
file-based syscall is to support secure boot: the image is loaded by the
kernel directly, without any processing by the userspace tools, so it
can be pre-signed by the kernel's "vendor". On the other hand, the
kernel part is more complicated and you can't pass a new device tree;
the kernel needs to re-use the existing one (or modify it in-kernel),
and you can only override the cmdline.

This doesn't work for our use cases in FORTH, where we use kexec not
only to re-boot our systems, but also to boot to a system with a
different hw layout (e.g. FPGA prototypes or systems with FPGAs on the
side). Device tree overlays don't cover our use cases either. To give
you an idea, we can add/remove/modify devices, move them to another
region etc. and still use kexec to avoid going through the full boot
cycle. We just unload their drivers, perform a full or partial
re-programming of the FPGA from within Linux, and kexec to the new
system with the new device tree. The file-based syscall can't cover this
scenario. In general it's less flexible and it's only there for secure
boot, not for using custom-built kernels or custom device tree images.

Security-wise, the file load syscall provides guarantees for integrity
and authenticity, but depending on the kernel "vendor"'s infrastructure
and signing process it may still be possible to load an older/vulnerable
kernel through kexec and get away with it. As far as I know there is no
check to make sure the loaded kernel is at least as new as the running
kernel; the assumption is that the "vendor" will use a different signing
key/cert for each kernel and that you'll kexec to a kernel/crash kernel
that's the same version as the running one. Until we have clear
guidelines on how this is meant to be used and have a discussion on
secure boot within RISC-V (we have something on the TEE TG but we'll
probably switch to a SIG committee for this), I don't see how this
feature is a priority compared to the more generic CONFIG_KEXEC.

Regards,
Nick

^ permalink raw reply [flat|nested] 53+ messages in thread
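The override described in the reply above, where "linux,usable-memory" on the /memory node takes precedence over "reg", can be illustrated with a small user-space model. This is a sketch under stated assumptions, not the code in drivers/of/fdt.c: `struct sketch_range`, `struct sketch_memory_node` and `sketch_pick_memory()` are hypothetical names, and a NULL pointer stands for a property that is absent from the node.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical (base, size) pair as it would be decoded from a DT property. */
struct sketch_range {
	uint64_t base;
	uint64_t size;
};

/* Hypothetical model of a /memory node; NULL means "property not present". */
struct sketch_memory_node {
	const struct sketch_range *usable_memory;	/* "linux,usable-memory" */
	const struct sketch_range *reg;			/* "reg" */
};

/*
 * Prefer "linux,usable-memory" when it exists and fall back to "reg"
 * otherwise, so a crash kernel only sees the region handed to it.
 */
static const struct sketch_range *
sketch_pick_memory(const struct sketch_memory_node *node)
{
	assert(node != NULL);

	if (node->usable_memory)
		return node->usable_memory;
	return node->reg;
}
```

On a normal boot only `reg` is set, so the full range is used; for a kdump boot, userspace would add the usable-memory property to the device tree it passes along, and the smaller range wins.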
* Re: [PATCH v3 0/5] RISC-V: Add kexec/kdump support
  2021-04-05  8:57 ` Nick Kossifidis
@ 2021-04-23  3:30 ` Palmer Dabbelt
From: Palmer Dabbelt @ 2021-04-23  3:30 UTC
To: mick; +Cc: linux-riscv, Paul Walmsley, linux-kernel, mick

On Mon, 05 Apr 2021 01:57:07 PDT (-0700), mick@ics.forth.gr wrote:
> This patch series adds kexec/kdump and crash kernel
> support on RISC-V. For testing the patches a patched
> version of kexec-tools is needed (still a work in
> progress) which can be found at:
>
> https://riscv.ics.forth.gr/kexec-tools-patched.tar.xz
>
> v3:
>  * Rebase on newer kernel tree
>  * Minor cleanups
>  * Split UAPI changes to a separate patch
>  * Improve / cleanup init_resources
>  * Resolve Palmer's comments
>
> v2:
>  * Rebase on newer kernel tree
>  * Minor cleanups
>  * Properly populate the ioresources tre, so that it
>    can be used later on for implementing strict /dev/mem
>  * Use linux,usable-memory on /memory instead of a new binding
>  * USe a reserved-memory node for ELF core header
>
> Nick Kossifidis (5):
>   RISC-V: Add EM_RISCV to kexec UAPI header
>   RISC-V: Add kexec support
>   RISC-V: Improve init_resources
>   RISC-V: Add kdump support
>   RISC-V: Add crash kernel support
>
>  arch/riscv/Kconfig                  |  25 ++++
>  arch/riscv/include/asm/elf.h        |   6 +
>  arch/riscv/include/asm/kexec.h      |  54 +++++++
>  arch/riscv/kernel/Makefile          |   6 +
>  arch/riscv/kernel/crash_dump.c      |  46 ++++++
>  arch/riscv/kernel/crash_save_regs.S |  56 +++++++
>  arch/riscv/kernel/kexec_relocate.S  | 222 ++++++++++++++++++++++++++++
>  arch/riscv/kernel/machine_kexec.c   | 193 ++++++++++++++++++++++++
>  arch/riscv/kernel/setup.c           | 113 ++++++++------
>  arch/riscv/mm/init.c                | 110 ++++++++++++++
>  include/uapi/linux/kexec.h          |   1 +
>  11 files changed, 787 insertions(+), 45 deletions(-)
>  create mode 100644 arch/riscv/include/asm/kexec.h
>  create mode 100644 arch/riscv/kernel/crash_dump.c
>  create mode 100644 arch/riscv/kernel/crash_save_regs.S
>  create mode 100644 arch/riscv/kernel/kexec_relocate.S
>  create mode 100644 arch/riscv/kernel/machine_kexec.c

Thanks.  There were some minor issues and some merge conflicts, I put
this on for-next with some fixups.
* Re: [PATCH v3 0/5] RISC-V: Add kexec/kdump support
  2021-04-23  3:30 ` Palmer Dabbelt
@ 2021-04-23  3:36 ` Nick Kossifidis
From: Nick Kossifidis @ 2021-04-23  3:36 UTC
To: Palmer Dabbelt; +Cc: mick, linux-riscv, Paul Walmsley, linux-kernel

On 2021-04-23 06:30, Palmer Dabbelt wrote:
> On Mon, 05 Apr 2021 01:57:07 PDT (-0700), mick@ics.forth.gr wrote:
>> This patch series adds kexec/kdump and crash kernel
>> support on RISC-V. For testing the patches a patched
>> version of kexec-tools is needed (still a work in
>> progress) which can be found at:
>>
>> https://riscv.ics.forth.gr/kexec-tools-patched.tar.xz
>>
>> v3:
>>  * Rebase on newer kernel tree
>>  * Minor cleanups
>>  * Split UAPI changes to a separate patch
>>  * Improve / cleanup init_resources
>>  * Resolve Palmer's comments
>>
>> v2:
>>  * Rebase on newer kernel tree
>>  * Minor cleanups
>>  * Properly populate the ioresources tre, so that it
>>    can be used later on for implementing strict /dev/mem
>>  * Use linux,usable-memory on /memory instead of a new binding
>>  * USe a reserved-memory node for ELF core header
>>
>> Nick Kossifidis (5):
>>   RISC-V: Add EM_RISCV to kexec UAPI header
>>   RISC-V: Add kexec support
>>   RISC-V: Improve init_resources
>>   RISC-V: Add kdump support
>>   RISC-V: Add crash kernel support
>>
>>  arch/riscv/Kconfig                  |  25 ++++
>>  arch/riscv/include/asm/elf.h        |   6 +
>>  arch/riscv/include/asm/kexec.h      |  54 +++++++
>>  arch/riscv/kernel/Makefile          |   6 +
>>  arch/riscv/kernel/crash_dump.c      |  46 ++++++
>>  arch/riscv/kernel/crash_save_regs.S |  56 +++++++
>>  arch/riscv/kernel/kexec_relocate.S  | 222
>> ++++++++++++++++++++++++++++
>>  arch/riscv/kernel/machine_kexec.c   | 193 ++++++++++++++++++++++++
>>  arch/riscv/kernel/setup.c           | 113 ++++++++------
>>  arch/riscv/mm/init.c                | 110 ++++++++++++++
>>  include/uapi/linux/kexec.h          |   1 +
>>  11 files changed, 787 insertions(+), 45 deletions(-)
>>  create mode 100644 arch/riscv/include/asm/kexec.h
>>  create mode 100644 arch/riscv/kernel/crash_dump.c
>>  create mode 100644 arch/riscv/kernel/crash_save_regs.S
>>  create mode 100644 arch/riscv/kernel/kexec_relocate.S
>>  create mode 100644 arch/riscv/kernel/machine_kexec.c
>
> Thanks.  There were some minor issues and some merge conflicts, I put
> this on for-next with some fixups.

I've sent a v4 that shouldn't have merge conflicts and that also
addresses some comments from Alex. Could you use that instead, or is it
too late?
* Re: [PATCH v3 0/5] RISC-V: Add kexec/kdump support
  2021-04-23  3:36 ` Nick Kossifidis
@ 2021-04-23  3:48 ` Palmer Dabbelt
From: Palmer Dabbelt @ 2021-04-23  3:48 UTC
To: mick; +Cc: mick, linux-riscv, Paul Walmsley, linux-kernel

On Thu, 22 Apr 2021 20:36:56 PDT (-0700), mick@ics.forth.gr wrote:
> On 2021-04-23 06:30, Palmer Dabbelt wrote:
>> On Mon, 05 Apr 2021 01:57:07 PDT (-0700), mick@ics.forth.gr wrote:
>>> This patch series adds kexec/kdump and crash kernel
>>> support on RISC-V. For testing the patches a patched
>>> version of kexec-tools is needed (still a work in
>>> progress) which can be found at:
>>>
>>> https://riscv.ics.forth.gr/kexec-tools-patched.tar.xz
>>>
>>> v3:
>>>  * Rebase on newer kernel tree
>>>  * Minor cleanups
>>>  * Split UAPI changes to a separate patch
>>>  * Improve / cleanup init_resources
>>>  * Resolve Palmer's comments
>>>
>>> v2:
>>>  * Rebase on newer kernel tree
>>>  * Minor cleanups
>>>  * Properly populate the ioresources tre, so that it
>>>    can be used later on for implementing strict /dev/mem
>>>  * Use linux,usable-memory on /memory instead of a new binding
>>>  * USe a reserved-memory node for ELF core header
>>>
>>> Nick Kossifidis (5):
>>>   RISC-V: Add EM_RISCV to kexec UAPI header
>>>   RISC-V: Add kexec support
>>>   RISC-V: Improve init_resources
>>>   RISC-V: Add kdump support
>>>   RISC-V: Add crash kernel support
>>>
>>>  arch/riscv/Kconfig                  |  25 ++++
>>>  arch/riscv/include/asm/elf.h        |   6 +
>>>  arch/riscv/include/asm/kexec.h      |  54 +++++++
>>>  arch/riscv/kernel/Makefile          |   6 +
>>>  arch/riscv/kernel/crash_dump.c      |  46 ++++++
>>>  arch/riscv/kernel/crash_save_regs.S |  56 +++++++
>>>  arch/riscv/kernel/kexec_relocate.S  | 222
>>> ++++++++++++++++++++++++++++
>>>  arch/riscv/kernel/machine_kexec.c   | 193 ++++++++++++++++++++++++
>>>  arch/riscv/kernel/setup.c           | 113 ++++++++------
>>>  arch/riscv/mm/init.c                | 110 ++++++++++++++
>>>  include/uapi/linux/kexec.h          |   1 +
>>>  11 files changed, 787 insertions(+), 45 deletions(-)
>>>  create mode 100644 arch/riscv/include/asm/kexec.h
>>>  create mode 100644 arch/riscv/kernel/crash_dump.c
>>>  create mode 100644 arch/riscv/kernel/crash_save_regs.S
>>>  create mode 100644 arch/riscv/kernel/kexec_relocate.S
>>>  create mode 100644 arch/riscv/kernel/machine_kexec.c
>>
>> Thanks.  There were some minor issues and some merge conflicts, I put
>> this on for-next with some fixups.
>
> I've sent a v4 that shouldn't have merge conflicts, addressing some
> comments from Alex as well, could you use that instead or is it too late
> ?

Thanks, for some reason I didn't see it when poking around.  There was
still that one init_resources() merge conflict, and I fixed up some of
the commit texts. It's now on for-next as

    b94394119804 (HEAD -> for-next, riscv/for-next) RISC-V: Add crash kernel support
    6e8451782c90 RISC-V: Add kdump support
    0a0652429bdb RISC-V: Improve init_resources()
    d9a8897d6b5d RISC-V: Add kexec support
    f59938095b94 RISC-V: Add EM_RISCV to kexec UAPI header
* Re: [PATCH v3 0/5] RISC-V: Add kexec/kdump support
  2021-04-23  3:48 ` Palmer Dabbelt
@ 2021-04-23  3:53 ` Nick Kossifidis
From: Nick Kossifidis @ 2021-04-23  3:53 UTC
To: Palmer Dabbelt; +Cc: mick, linux-riscv, Paul Walmsley, linux-kernel

On 2021-04-23 06:48, Palmer Dabbelt wrote:
> On Thu, 22 Apr 2021 20:36:56 PDT (-0700), mick@ics.forth.gr wrote:
>> On 2021-04-23 06:30, Palmer Dabbelt wrote:
>>> On Mon, 05 Apr 2021 01:57:07 PDT (-0700), mick@ics.forth.gr wrote:
>>>> This patch series adds kexec/kdump and crash kernel
>>>> support on RISC-V. For testing the patches a patched
>>>> version of kexec-tools is needed (still a work in
>>>> progress) which can be found at:
>>>>
>>>> https://riscv.ics.forth.gr/kexec-tools-patched.tar.xz
>>>>
>>>> v3:
>>>>  * Rebase on newer kernel tree
>>>>  * Minor cleanups
>>>>  * Split UAPI changes to a separate patch
>>>>  * Improve / cleanup init_resources
>>>>  * Resolve Palmer's comments
>>>>
>>>> v2:
>>>>  * Rebase on newer kernel tree
>>>>  * Minor cleanups
>>>>  * Properly populate the ioresources tre, so that it
>>>>    can be used later on for implementing strict /dev/mem
>>>>  * Use linux,usable-memory on /memory instead of a new binding
>>>>  * USe a reserved-memory node for ELF core header
>>>>
>>>> Nick Kossifidis (5):
>>>>   RISC-V: Add EM_RISCV to kexec UAPI header
>>>>   RISC-V: Add kexec support
>>>>   RISC-V: Improve init_resources
>>>>   RISC-V: Add kdump support
>>>>   RISC-V: Add crash kernel support
>>>>
>>>>  arch/riscv/Kconfig                  |  25 ++++
>>>>  arch/riscv/include/asm/elf.h        |   6 +
>>>>  arch/riscv/include/asm/kexec.h      |  54 +++++++
>>>>  arch/riscv/kernel/Makefile          |   6 +
>>>>  arch/riscv/kernel/crash_dump.c      |  46 ++++++
>>>>  arch/riscv/kernel/crash_save_regs.S |  56 +++++++
>>>>  arch/riscv/kernel/kexec_relocate.S  | 222
>>>> ++++++++++++++++++++++++++++
>>>>  arch/riscv/kernel/machine_kexec.c   | 193 ++++++++++++++++++++++++
>>>>  arch/riscv/kernel/setup.c           | 113 ++++++++------
>>>>  arch/riscv/mm/init.c                | 110 ++++++++++++++
>>>>  include/uapi/linux/kexec.h          |   1 +
>>>>  11 files changed, 787 insertions(+), 45 deletions(-)
>>>>  create mode 100644 arch/riscv/include/asm/kexec.h
>>>>  create mode 100644 arch/riscv/kernel/crash_dump.c
>>>>  create mode 100644 arch/riscv/kernel/crash_save_regs.S
>>>>  create mode 100644 arch/riscv/kernel/kexec_relocate.S
>>>>  create mode 100644 arch/riscv/kernel/machine_kexec.c
>>>
>>> Thanks.  There were some minor issues and some merge conflicts, I put
>>> this on for-next with some fixups.
>>
>> I've sent a v4 that shouldn't have merge conflicts, addressing some
>> comments from Alex as well, could you use that instead or is it too
>> late
>> ?
>
> Thanks, for some reason I didn't see it when poking around.  There was
> still that one init_resources() merge conflict and I fixed up some of
> the commit texts, it's now on for-next as
>
>    b94394119804 (HEAD -> for-next, riscv/for-next) RISC-V: Add crash
> kernel support
>    6e8451782c90 RISC-V: Add kdump support
>    0a0652429bdb RISC-V: Improve init_resources()
>    d9a8897d6b5d RISC-V: Add kexec support
>    f59938095b94 RISC-V: Add EM_RISCV to kexec UAPI header

Thanks a lot! I'll keep working on the user-space part and submit it to
kexec-tools later on.
end of thread, other threads:[~2021-04-23  3:54 UTC | newest]

Thread overview: 53+ messages
2021-04-05  8:57 [PATCH v3 0/5] RISC-V: Add kexec/kdump support Nick Kossifidis
2021-04-05  8:57 ` [PATCH v3 1/5] RISC-V: Add EM_RISCV to kexec UAPI header Nick Kossifidis
2021-04-23  3:30 ` Palmer Dabbelt
2021-04-05  8:57 ` [PATCH v3 2/5] RISC-V: Add kexec support Nick Kossifidis
2021-04-06 18:38 ` Alex Ghiti
2021-04-09 10:19 ` Nick Kossifidis
2021-04-23  3:30 ` Palmer Dabbelt
2021-04-05  8:57 ` [PATCH v3 3/5] RISC-V: Improve init_resources Nick Kossifidis
2021-04-06  7:19 ` Geert Uytterhoeven
2021-04-06  8:11 ` Nick Kossifidis
2021-04-06  8:22 ` Geert Uytterhoeven
2021-04-09 10:11 ` Nick Kossifidis
2021-04-23  3:30 ` Palmer Dabbelt
2021-04-05  8:57 ` [PATCH v3 4/5] RISC-V: Add kdump support Nick Kossifidis
2021-04-06 18:36 ` Alex Ghiti
2021-04-09 10:21 ` Nick Kossifidis
2021-04-23  3:30 ` Palmer Dabbelt
2021-04-05  8:57 ` [PATCH v3 5/5] RISC-V: Add crash kernel support Nick Kossifidis
2021-04-23  3:30 ` Palmer Dabbelt
2021-04-07  7:45 ` [PATCH v3 0/5] RISC-V: Add kexec/kdump support Yixun Lan
2021-04-07 16:29 ` Rob Herring
2021-04-09 10:02 ` Nick Kossifidis
2021-04-23  3:30 ` Palmer Dabbelt
2021-04-23  3:36 ` Nick Kossifidis
2021-04-23  3:48 ` Palmer Dabbelt
2021-04-23  3:53 ` Nick Kossifidis