linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [v2 0/5] arm64: add kdump support
@ 2015-04-24  7:53 AKASHI Takahiro
  2015-04-24  7:53 ` [v2 1/5] arm64: kdump: reserve memory for crash dump kernel AKASHI Takahiro
                   ` (5 more replies)
  0 siblings, 6 replies; 29+ messages in thread
From: AKASHI Takahiro @ 2015-04-24  7:53 UTC (permalink / raw)
  To: catalin.marinas, will.deacon, vgoyal, hbabus
  Cc: geoff, broonie, david.griego, kexec, linux-arm-kernel,
	linaro-kernel, linux-kernel, AKASHI Takahiro

This patch set enables kdump (crash dump kernel) support on arm64 on top of
Geoff's kexec patchset.

In this version, there are some arm64-specific usage/constraints:
1) "mem=" boot parameter must be specified on crash dump kernel
   if the system starts on uefi.
2) Kvm will not be enabled on crash dump kernel even if configured
See commit messages and Documentation/kdump/kdump.txt for details.

The only concern I have is whether or not we can use the exact same kernel
as both system kernel and crash dump kernel. The current arm64 kernel is
not relocatable in the exact sense but I see no problems in using the same
binary when testing kdump.

I tested the code with
	- ATF v1.1 + EDK2(UEFI) v3.0-rc0
	- kernel v4.0 + Geoff' kexec v9
on
	- Base fast model, and
	- MediaTek MT8173-EVB
using my own kexec-tools [1], currently v0.12.

You may want to start a kernel with the following boot parameter:
	crashkernel=64M (or so, on model)
and try
	$ kexec -p --load <vmlinux> --append ...
	$ echo c > /proc/sysrq-trigger

To examine vmcore (/proc/vmcore), you should use
	- gdb v7.7 or later
	- crash + a small patch (to recognize v4.0 kernel)

Changes from v1:
* rebased to Geoff's v9
* tested this patchset on real hardware and fixed bugs:
- added cache flush operation in ipi_cpu_stop() when shutting down
  the system. Otherwise, data saved in vmcore's note sections by
  crash_save_cpu() might not be flushed to dumped memory and crash command
  fail to fetch correct data.
  I will address Mark's commit[2] after Geoff takes care of it on kexec.
- modified to use ioremap_cache() instead of ioremap() when reading
  crash memory. Otherwise, accessing /proc/vmcore on crash dump kernel
  might cause an alignment fault.
* allows reserve_crashkernel() to handle "crashkernel=xyz[MG]" correctly,
  thanks to Pratyush Anand. And it now also enforces memory limit.
* moved reserve_crashkernel() and reserve_elfcorehdr() to
  arm64_memblock_init() to clarify that they should be called before
  dma_contignuous_reserve().

[1] https://git.linaro.org/people/takahiro.akashi/kexec-tools.git
[2] http://lists.infradead.org/pipermail/linux-arm-kernel/2015-April/338171.html


AKASHI Takahiro (5):
  arm64: kdump: reserve memory for crash dump kernel
  arm64: kdump: implement machine_crash_shutdown()
  arm64: kdump: do not go into EL2 before starting a crash dump kernel
  arm64: add kdump support
  arm64: enable kdump in the arm64 defconfig

 Documentation/kdump/kdump.txt     |   31 +++++++++++++-
 arch/arm64/Kconfig                |   12 ++++++
 arch/arm64/configs/defconfig      |    1 +
 arch/arm64/include/asm/kexec.h    |   34 ++++++++++++++-
 arch/arm64/kernel/Makefile        |    1 +
 arch/arm64/kernel/crash_dump.c    |   71 +++++++++++++++++++++++++++++++
 arch/arm64/kernel/machine_kexec.c |   55 +++++++++++++++++++++++-
 arch/arm64/kernel/process.c       |    7 +++-
 arch/arm64/kernel/setup.c         |    8 +++-
 arch/arm64/kernel/smp.c           |   12 +++++-
 arch/arm64/mm/init.c              |   84 +++++++++++++++++++++++++++++++++++++
 11 files changed, 309 insertions(+), 7 deletions(-)
 create mode 100644 arch/arm64/kernel/crash_dump.c

-- 
1.7.9.5


^ permalink raw reply	[flat|nested] 29+ messages in thread

* [v2 1/5] arm64: kdump: reserve memory for crash dump kernel
  2015-04-24  7:53 [v2 0/5] arm64: add kdump support AKASHI Takahiro
@ 2015-04-24  7:53 ` AKASHI Takahiro
  2015-04-24 10:11   ` Mark Rutland
  2015-04-28  9:19   ` Baoquan He
  2015-04-24  7:53 ` [v2 2/5] arm64: kdump: implement machine_crash_shutdown() AKASHI Takahiro
                   ` (4 subsequent siblings)
  5 siblings, 2 replies; 29+ messages in thread
From: AKASHI Takahiro @ 2015-04-24  7:53 UTC (permalink / raw)
  To: catalin.marinas, will.deacon, vgoyal, hbabus
  Cc: geoff, broonie, david.griego, kexec, linux-arm-kernel,
	linaro-kernel, linux-kernel, AKASHI Takahiro

On system kernel, the memory region used by crash dump kernel must be
specified by "crashkernel=X@Y" boot parameter. reserve_crashkernel()
will allocate the region in "System RAM" and reserve it for later use.

On crash dump kernel, memory region information in system kernel is
described in a specific region specified by "elfcorehdr=X@Y" boot parameter.
reserve_elfcorehdr() will set aside the region to avoid data destruction
by the kernel.

Crash dump kernel will access memory regions in system kernel via
copy_oldmem_page(), which reads a page by ioremap'ing it assuming that
such pages are not part of main memory of crash dump kernel.
This is true under non-UEFI environment because kexec-tools modifies
a device tree adding "usablemem" attributes to memory sections.
Under UEFI, however, this is not true because UEFI remove memory sections
in a device tree and export all the memory regions, even though they belong
to system kernel.

So we should add "mem=X[MG]" boot parameter to limit the memory size and
avoid hitting the following assertion in ioremap():
	if (WARN_ON(pfn_valid(__phys_to_pfn(phys_addr))))
		return NULL;

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
---
 arch/arm64/kernel/Makefile     |    1 +
 arch/arm64/kernel/crash_dump.c |   71 +++++++++++++++++++++++++++++++++
 arch/arm64/kernel/setup.c      |    8 +++-
 arch/arm64/mm/init.c           |   84 ++++++++++++++++++++++++++++++++++++++++
 4 files changed, 163 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm64/kernel/crash_dump.c

diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index ac3c2e2..6fcc602 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -36,6 +36,7 @@ arm64-obj-$(CONFIG_EFI)			+= efi.o efi-stub.o efi-entry.o
 arm64-obj-$(CONFIG_PCI)			+= pci.o
 arm64-obj-$(CONFIG_ARMV8_DEPRECATED)	+= armv8_deprecated.o
 arm64-obj-$(CONFIG_KEXEC)		+= machine_kexec.o relocate_kernel.o
+arm64-obj-$(CONFIG_CRASH_DUMP)		+= crash_dump.o
 
 obj-y					+= $(arm64-obj-y) vdso/
 obj-m					+= $(arm64-obj-m)
diff --git a/arch/arm64/kernel/crash_dump.c b/arch/arm64/kernel/crash_dump.c
new file mode 100644
index 0000000..3d86c0a
--- /dev/null
+++ b/arch/arm64/kernel/crash_dump.c
@@ -0,0 +1,71 @@
+/*
+ * Routines for doing kexec-based kdump
+ *
+ * Copyright (C) 2014 Linaro Limited
+ * Author: AKASHI Takahiro <takahiro.akashi@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/crash_dump.h>
+#include <linux/errno.h>
+#include <linux/io.h>
+#include <linux/memblock.h>
+#include <linux/uaccess.h>
+#include <asm/memory.h>
+
+/**
+ * copy_oldmem_page() - copy one page from old kernel memory
+ * @pfn: page frame number to be copied
+ * @buf: buffer where the copied page is placed
+ * @csize: number of bytes to copy
+ * @offset: offset in bytes into the page
+ * @userbuf: if set, @buf is in a user address space
+ *
+ * This function copies one page from old kernel memory into buffer pointed by
+ * @buf. If @buf is in userspace, set @userbuf to %1. Returns number of bytes
+ * copied or negative error in case of failure.
+ */
+ssize_t copy_oldmem_page(unsigned long pfn, char *buf,
+			 size_t csize, unsigned long offset,
+			 int userbuf)
+{
+	void *vaddr;
+
+	if (!csize)
+		return 0;
+
+	vaddr = ioremap_cache(pfn << PAGE_SHIFT, PAGE_SIZE);
+	if (!vaddr)
+		return -ENOMEM;
+
+	if (userbuf) {
+		if (copy_to_user(buf, vaddr + offset, csize)) {
+			iounmap(vaddr);
+			return -EFAULT;
+		}
+	} else {
+		memcpy(buf, vaddr + offset, csize);
+	}
+
+	iounmap(vaddr);
+
+	return csize;
+}
+
+/**
+ * elfcorehdr_read - read from ELF core header
+ * @buf: buffer where the data is placed
+ * @csize: number of bytes to read
+ * @ppos: address in the memory
+ *
+ * This function reads @count bytes from elf core header which exists
+ * on crash dump kernel's memory.
+ */
+ssize_t elfcorehdr_read(char *buf, size_t count, u64 *ppos)
+{
+	memcpy(buf, phys_to_virt((phys_addr_t)*ppos), count);
+	return count;
+}
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index 51ef972..7932bd0 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -31,7 +31,6 @@
 #include <linux/screen_info.h>
 #include <linux/init.h>
 #include <linux/kexec.h>
-#include <linux/crash_dump.h>
 #include <linux/root_dev.h>
 #include <linux/clk-provider.h>
 #include <linux/cpu.h>
@@ -364,6 +363,12 @@ static void __init request_standard_resources(void)
 		    kernel_data.end <= res->end)
 			request_resource(res, &kernel_data);
 	}
+
+#ifdef CONFIG_KEXEC
+	/* User space tools will detect the region with /proc/iomem */
+	if (crashk_res.end)
+		insert_resource(&iomem_resource, &crashk_res);
+#endif
 }
 
 u64 __cpu_logical_map[NR_CPUS] = { [0 ... NR_CPUS-1] = INVALID_HWID };
@@ -393,6 +398,7 @@ void __init setup_arch(char **cmdline_p)
 	local_async_enable();
 
 	efi_init();
+
 	arm64_memblock_init();
 
 	paging_init();
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index ae85da6..ea70d41 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -34,6 +34,8 @@
 #include <linux/dma-contiguous.h>
 #include <linux/efi.h>
 #include <linux/swiotlb.h>
+#include <linux/kexec.h>
+#include <linux/crash_dump.h>
 
 #include <asm/fixmap.h>
 #include <asm/memory.h>
@@ -66,6 +68,81 @@ static int __init early_initrd(char *p)
 early_param("initrd", early_initrd);
 #endif
 
+#ifdef CONFIG_KEXEC
+/*
+ * reserve_crashkernel() - reserves memory for crash kernel
+ *
+ * This function reserves memory area given in "crashkernel=" kernel command
+ * line parameter. The memory reserved is used by a dump capture kernel when
+ * primary kernel is crashing.
+ */
+static void __init reserve_crashkernel(phys_addr_t limit)
+{
+	unsigned long long crash_size = 0, crash_base = 0;
+	int ret;
+
+	ret = parse_crashkernel(boot_command_line, limit,
+				&crash_size, &crash_base);
+	if (ret)
+		return;
+
+	if (crash_base == 0) {
+		crash_base = memblock_alloc(crash_size, 1 << 20);
+		if (crash_base == 0) {
+			pr_warn("crashkernel allocation failed (size:%llx)\n",
+				crash_size);
+			return;
+		}
+	} else {
+		/* User specifies base address explicitly. Sanity check */
+		if (!memblock_is_region_memory(crash_base, crash_size) ||
+			memblock_is_region_reserved(crash_base, crash_size)) {
+			pr_warn("crashkernel= has wrong address or size\n");
+			return;
+		}
+
+		if (memblock_reserve(crash_base, crash_size)) {
+			pr_warn("crashkernel reservation failed - out of memory\n");
+			return;
+		}
+	}
+
+	pr_info("Reserving %lldMB of memory at %lldMB for crashkernel\n",
+		crash_size >> 20, crash_base >> 20);
+
+	crashk_res.start = crash_base;
+	crashk_res.end = crash_base + crash_size - 1;
+}
+#endif /* CONFIG_KEXEC */
+
+#ifdef CONFIG_CRASH_DUMP
+/*
+ * reserve_elfcorehdr() - reserves memory for elf core header
+ *
+ * This function reserves memory area given in "elfcorehdr=" kernel command
+ * line parameter. The memory reserved is used by a dump capture kernel to
+ * identify the memory used by primary kernel.
+ */
+static void __init reserve_elfcorehdr(void)
+{
+	if (!elfcorehdr_size)
+		return;
+
+	if (memblock_is_region_reserved(elfcorehdr_addr, elfcorehdr_size)) {
+		pr_warn("elfcorehdr reservation failed - memory is in use (0x%llx)\n",
+			elfcorehdr_addr);
+		return;
+	}
+
+	if (memblock_reserve(elfcorehdr_addr, elfcorehdr_size)) {
+		pr_warn("elfcorehdr reservation failed - out of memory\n");
+		return;
+	}
+
+	pr_info("Reserving %lldKB of memory at %lldMB for elfcorehdr\n",
+		elfcorehdr_size >> 10, elfcorehdr_addr >> 20);
+}
+#endif /* CONFIG_CRASH_DUMP */
 /*
  * Return the maximum physical address for ZONE_DMA (DMA_BIT_MASK(32)). It
  * currently assumes that for memory starting above 4G, 32-bit devices will
@@ -170,6 +247,13 @@ void __init arm64_memblock_init(void)
 		memblock_reserve(__virt_to_phys(initrd_start), initrd_end - initrd_start);
 #endif
 
+#ifdef CONFIG_KEXEC
+	reserve_crashkernel(memory_limit);
+#endif
+#ifdef CONFIG_CRASH_DUMP
+	reserve_elfcorehdr();
+#endif
+
 	early_init_fdt_scan_reserved_mem();
 
 	/* 4GB maximum for 32-bit only capable devices */
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [v2 2/5] arm64: kdump: implement machine_crash_shutdown()
  2015-04-24  7:53 [v2 0/5] arm64: add kdump support AKASHI Takahiro
  2015-04-24  7:53 ` [v2 1/5] arm64: kdump: reserve memory for crash dump kernel AKASHI Takahiro
@ 2015-04-24  7:53 ` AKASHI Takahiro
  2015-04-24 10:39   ` Mark Rutland
  2015-04-24  7:53 ` [v2 3/5] arm64: kdump: do not go into EL2 before starting a crash dump kernel AKASHI Takahiro
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 29+ messages in thread
From: AKASHI Takahiro @ 2015-04-24  7:53 UTC (permalink / raw)
  To: catalin.marinas, will.deacon, vgoyal, hbabus
  Cc: geoff, broonie, david.griego, kexec, linux-arm-kernel,
	linaro-kernel, linux-kernel, AKASHI Takahiro

kdump calls machine_crash_shutdown() to shut down non-boot cpus and
save per-cpu general-purpose registers before restarting the crash dump
kernel. See kernel_kexec().
ipi_cpu_stop() is used and a bit modified to support this behavior.

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
---
 arch/arm64/include/asm/kexec.h    |   34 ++++++++++++++++++++++-
 arch/arm64/kernel/machine_kexec.c |   55 ++++++++++++++++++++++++++++++++++++-
 arch/arm64/kernel/smp.c           |   12 ++++++--
 3 files changed, 97 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
index 3530ff5..eaf3fcb 100644
--- a/arch/arm64/include/asm/kexec.h
+++ b/arch/arm64/include/asm/kexec.h
@@ -30,6 +30,8 @@
 
 #if !defined(__ASSEMBLY__)
 
+extern bool in_crash_kexec;
+
 /**
  * crash_setup_regs() - save registers for the panic kernel
  *
@@ -40,7 +42,37 @@
 static inline void crash_setup_regs(struct pt_regs *newregs,
 				    struct pt_regs *oldregs)
 {
-	/* Empty routine needed to avoid build errors. */
+	if (oldregs) {
+		memcpy(newregs, oldregs, sizeof(*newregs));
+	} else {
+		__asm__ __volatile__ (
+			"stp	 x0,   x1, [%3]\n\t"
+			"stp	 x2,   x3, [%3, 0x10]\n\t"
+			"stp	 x4,   x5, [%3, 0x20]\n\t"
+			"stp	 x6,   x7, [%3, 0x30]\n\t"
+			"stp	 x8,   x9, [%3, 0x40]\n\t"
+			"stp	x10,  x11, [%3, 0x50]\n\t"
+			"stp	x12,  x13, [%3, 0x60]\n\t"
+			"stp	x14,  x15, [%3, 0x70]\n\t"
+			"stp	x16,  x17, [%3, 0x80]\n\t"
+			"stp	x18,  x19, [%3, 0x90]\n\t"
+			"stp	x20,  x21, [%3, 0xa0]\n\t"
+			"stp	x22,  x23, [%3, 0xb0]\n\t"
+			"stp	x24,  x25, [%3, 0xc0]\n\t"
+			"stp	x26,  x27, [%3, 0xd0]\n\t"
+			"stp	x28,  x29, [%3, 0xe0]\n\t"
+			"str	x30,	   [%3, 0xf0]\n\t"
+			"mov	%0, sp\n\t"
+			"adr	%1, 1f\n\t"
+			"mrs	%2, spsr_el1\n\t"
+		"1:"
+			: "=r" (newregs->sp),
+			  "=r" (newregs->pc),
+			  "=r" (newregs->pstate)
+			: "r"  (&newregs->regs)
+			: "memory"
+		);
+	}
 }
 
 #endif /* !defined(__ASSEMBLY__) */
diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
index fd6b742..6f7887b 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -9,6 +9,8 @@
  * published by the Free Software Foundation.
  */
 
+#include <linux/irq.h>
+#include <linux/kernel.h>
 #include <linux/kexec.h>
 #include <linux/of_fdt.h>
 #include <linux/slab.h>
@@ -24,6 +26,8 @@ extern unsigned long arm64_kexec_dtb_addr;
 extern unsigned long arm64_kexec_kimage_head;
 extern unsigned long arm64_kexec_kimage_start;
 
+bool in_crash_kexec = false;
+
 /**
  * kexec_is_dtb - Helper routine to check the device tree header signature.
  */
@@ -178,7 +182,56 @@ void machine_kexec(struct kimage *image)
 	soft_restart(reboot_code_buffer_phys);
 }
 
+static void machine_kexec_mask_interrupts(void)
+{
+	unsigned int i;
+	struct irq_desc *desc;
+
+	for_each_irq_desc(i, desc) {
+		struct irq_chip *chip;
+
+		chip = irq_desc_get_chip(desc);
+		if (!chip)
+			continue;
+
+		if (chip->irq_eoi && irqd_irq_inprogress(&desc->irq_data))
+			chip->irq_eoi(&desc->irq_data);
+
+		if (chip->irq_mask)
+			chip->irq_mask(&desc->irq_data);
+
+		if (chip->irq_disable && !irqd_irq_disabled(&desc->irq_data))
+			chip->irq_disable(&desc->irq_data);
+	}
+}
+
+/**
+ * machine_crash_shutdown - shutdown non-boot cpus and save registers
+ */
 void machine_crash_shutdown(struct pt_regs *regs)
 {
-	/* Empty routine needed to avoid build errors. */
+	struct pt_regs dummy_regs;
+	int cpu;
+
+	local_irq_disable();
+
+	in_crash_kexec = true;
+
+	/*
+	 * clear and initialize the per-cpu info. This is necessary
+	 * because, otherwise, slots for offline cpus would not be
+	 * filled up. See smp_send_stop().
+	 */
+	memset(&dummy_regs, 0, sizeof(dummy_regs));
+	for_each_possible_cpu(cpu)
+		crash_save_cpu(&dummy_regs, cpu);
+
+	/* shutdown non-boot cpus */
+	smp_send_stop();
+
+	/* for this cpu */
+	crash_save_cpu(regs, smp_processor_id());
+	machine_kexec_mask_interrupts();
+
+	pr_info("Loading crashdump kernel...\n");
 }
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index ffe8e1b..cf7f361 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -36,6 +36,7 @@
 #include <linux/completion.h>
 #include <linux/of.h>
 #include <linux/irq_work.h>
+#include <linux/kexec.h>
 
 #include <asm/alternative.h>
 #include <asm/atomic.h>
@@ -542,7 +543,7 @@ static DEFINE_RAW_SPINLOCK(stop_lock);
 /*
  * ipi_cpu_stop - handle IPI from smp_send_stop()
  */
-static void ipi_cpu_stop(unsigned int cpu)
+static void ipi_cpu_stop(unsigned int cpu, struct pt_regs *regs)
 {
 	if (system_state == SYSTEM_BOOTING ||
 	    system_state == SYSTEM_RUNNING) {
@@ -556,6 +557,13 @@ static void ipi_cpu_stop(unsigned int cpu)
 
 	local_irq_disable();
 
+#ifdef CONFIG_KEXEC
+	if (in_crash_kexec) {
+		crash_save_cpu(regs, cpu);
+		flush_cache_all();
+	}
+#endif /* CONFIG_KEXEC */
+
 	while (1)
 		cpu_relax();
 }
@@ -586,7 +594,7 @@ void handle_IPI(int ipinr, struct pt_regs *regs)
 
 	case IPI_CPU_STOP:
 		irq_enter();
-		ipi_cpu_stop(cpu);
+		ipi_cpu_stop(cpu, regs);
 		irq_exit();
 		break;
 
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [v2 3/5] arm64: kdump: do not go into EL2 before starting a crash dump kernel
  2015-04-24  7:53 [v2 0/5] arm64: add kdump support AKASHI Takahiro
  2015-04-24  7:53 ` [v2 1/5] arm64: kdump: reserve memory for crash dump kernel AKASHI Takahiro
  2015-04-24  7:53 ` [v2 2/5] arm64: kdump: implement machine_crash_shutdown() AKASHI Takahiro
@ 2015-04-24  7:53 ` AKASHI Takahiro
  2015-04-24  7:53 ` [v2 4/5] arm64: add kdump support AKASHI Takahiro
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 29+ messages in thread
From: AKASHI Takahiro @ 2015-04-24  7:53 UTC (permalink / raw)
  To: catalin.marinas, will.deacon, vgoyal, hbabus
  Cc: geoff, broonie, david.griego, kexec, linux-arm-kernel,
	linaro-kernel, linux-kernel, AKASHI Takahiro

Unlike normal kexec case, we don't have a chance to reset EL2 context in
a generic way because bad exceptions may directly invoke crash_kexec().
(See die().)
Kvm is not useful on crash dump kernel anyway, and so we let it
un-initialized across rebooting.

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
---
 arch/arm64/kernel/process.c |    7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index d894d3e..9859f5c 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -48,6 +48,7 @@
 #include <asm/compat.h>
 #include <asm/cacheflush.h>
 #include <asm/fpsimd.h>
+#include <asm/kexec.h>
 #include <asm/mmu_context.h>
 #include <asm/processor.h>
 #include <asm/stacktrace.h>
@@ -64,7 +65,11 @@ void soft_restart(unsigned long addr)
 	setup_mm_for_reboot();
 
 	cpu_soft_restart(virt_to_phys(cpu_reset),
-		is_hyp_mode_available(), addr);
+#ifdef CONFIG_KEXEC
+		!in_crash_kexec &&
+#endif
+		is_hyp_mode_available(),
+		addr);
 
 	/* Should never get here */
 	BUG();
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [v2 4/5] arm64: add kdump support
  2015-04-24  7:53 [v2 0/5] arm64: add kdump support AKASHI Takahiro
                   ` (2 preceding siblings ...)
  2015-04-24  7:53 ` [v2 3/5] arm64: kdump: do not go into EL2 before starting a crash dump kernel AKASHI Takahiro
@ 2015-04-24  7:53 ` AKASHI Takahiro
  2015-05-08 12:19   ` Dave Young
  2015-04-24  7:53 ` [v2 5/5] arm64: enable kdump in the arm64 defconfig AKASHI Takahiro
  2015-04-24  9:53 ` [v2 0/5] arm64: add kdump support Mark Rutland
  5 siblings, 1 reply; 29+ messages in thread
From: AKASHI Takahiro @ 2015-04-24  7:53 UTC (permalink / raw)
  To: catalin.marinas, will.deacon, vgoyal, hbabus
  Cc: geoff, broonie, david.griego, kexec, linux-arm-kernel,
	linaro-kernel, linux-kernel, AKASHI Takahiro

Please read the following commits for arm64-specific constraints:
    arm64: kdump: reserve memory for crash dump kernel
    arm64: kdump: do not go into EL2 before starting a crash dump kernel

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
---
 Documentation/kdump/kdump.txt |   31 ++++++++++++++++++++++++++++++-
 arch/arm64/Kconfig            |   12 ++++++++++++
 2 files changed, 42 insertions(+), 1 deletion(-)

diff --git a/Documentation/kdump/kdump.txt b/Documentation/kdump/kdump.txt
index bc4bd5a..f9cf6f5 100644
--- a/Documentation/kdump/kdump.txt
+++ b/Documentation/kdump/kdump.txt
@@ -18,7 +18,7 @@ memory image to a dump file on the local disk, or across the network to
 a remote system.
 
 Kdump and kexec are currently supported on the x86, x86_64, ppc64, ia64,
-s390x and arm architectures.
+s390x, arm and arm64 architectures.
 
 When the system kernel boots, it reserves a small section of memory for
 the dump-capture kernel. This ensures that ongoing Direct Memory Access
@@ -249,6 +249,29 @@ Dump-capture kernel config options (Arch Dependent, arm)
 
     AUTO_ZRELADDR=y
 
+Dump-capture kernel config options (Arch Dependent, arm64)
+----------------------------------------------------------
+
+1) Disable symmetric multi-processing support under "Processor type and
+   features":
+
+   CONFIG_SMP=n
+
+   (If CONFIG_SMP=y, then specify maxcpus=1 on the kernel command line
+   when loading the dump-capture kernel, see section "Load the Dump-capture
+   Kernel".)
+
+2) Under uefi, the maximum memory size on the dump-capture kernel must be
+   limited by specifying:
+
+   mem=X[MG]
+
+   where X should be less than or equal to the size in "crashkernel="
+   boot parameter. Kexec-tools will automatically add this.
+
+3) Currently, kvm will not be initialized on the dump-capture kernel even
+   if it is configured.
+
 Extended crashkernel syntax
 ===========================
 
@@ -312,6 +335,7 @@ Boot into System Kernel
    any space below the alignment point may be overwritten by the dump-capture kernel,
    which means it is possible that the vmcore is not that precise as expected.
 
+   On arm64, use "crashkernel=Y@X".
 
 Load the Dump-capture Kernel
 ============================
@@ -334,6 +358,8 @@ For s390x:
 	- Use image or bzImage
 For arm:
 	- Use zImage
+For arm64:
+	- Use vmlinux
 
 If you are using a uncompressed vmlinux image then use following command
 to load dump-capture kernel.
@@ -377,6 +403,9 @@ For s390x:
 For arm:
 	"1 maxcpus=1 reset_devices"
 
+For arm64:
+	"1 maxcpus=1 mem=X[MG] reset_devices"
+
 Notes on loading the dump-capture kernel:
 
 * By default, the ELF headers are stored in ELF64 format to support
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 5716edf..8e2f545 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -562,6 +562,18 @@ config KEXEC
 	  but it is independent of the system firmware.   And like a reboot
 	  you can start any kernel with it, not just Linux.
 
+config CRASH_DUMP
+	bool "Build kdump crash kernel"
+	help
+	  Generate crash dump after being started by kexec. This should
+	  be normally only set in special crash dump kernels which are
+	  loaded in the main kernel with kexec-tools into a specially
+	  reserved region and then later executed after a crash by
+	  kdump/kexec. The crash dump kernel must be compiled to a
+	  memory address not used by the main kernel.
+
+	  For more details see Documentation/kdump/kdump.txt
+
 config XEN_DOM0
 	def_bool y
 	depends on XEN
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [v2 5/5] arm64: enable kdump in the arm64 defconfig
  2015-04-24  7:53 [v2 0/5] arm64: add kdump support AKASHI Takahiro
                   ` (3 preceding siblings ...)
  2015-04-24  7:53 ` [v2 4/5] arm64: add kdump support AKASHI Takahiro
@ 2015-04-24  7:53 ` AKASHI Takahiro
  2015-04-24  9:53 ` [v2 0/5] arm64: add kdump support Mark Rutland
  5 siblings, 0 replies; 29+ messages in thread
From: AKASHI Takahiro @ 2015-04-24  7:53 UTC (permalink / raw)
  To: catalin.marinas, will.deacon, vgoyal, hbabus
  Cc: geoff, broonie, david.griego, kexec, linux-arm-kernel,
	linaro-kernel, linux-kernel, AKASHI Takahiro

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
---
 arch/arm64/configs/defconfig |    1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig
index 583cacb..7b3fa3a 100644
--- a/arch/arm64/configs/defconfig
+++ b/arch/arm64/configs/defconfig
@@ -49,6 +49,7 @@ CONFIG_KSM=y
 CONFIG_TRANSPARENT_HUGEPAGE=y
 CONFIG_CMA=y
 CONFIG_KEXEC=y
+CONFIG_CRASH_DUMP=y
 CONFIG_CMDLINE="console=ttyAMA0"
 # CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS is not set
 CONFIG_COMPAT=y
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [v2 0/5] arm64: add kdump support
  2015-04-24  7:53 [v2 0/5] arm64: add kdump support AKASHI Takahiro
                   ` (4 preceding siblings ...)
  2015-04-24  7:53 ` [v2 5/5] arm64: enable kdump in the arm64 defconfig AKASHI Takahiro
@ 2015-04-24  9:53 ` Mark Rutland
  2015-05-11  6:16   ` AKASHI Takahiro
  5 siblings, 1 reply; 29+ messages in thread
From: Mark Rutland @ 2015-04-24  9:53 UTC (permalink / raw)
  To: AKASHI Takahiro
  Cc: Catalin Marinas, Will Deacon, vgoyal, hbabus, linaro-kernel,
	geoff, kexec, linux-kernel, broonie, david.griego,
	linux-arm-kernel, ard.biesheuvel

Hi,

On Fri, Apr 24, 2015 at 08:53:03AM +0100, AKASHI Takahiro wrote:
> This patch set enables kdump (crash dump kernel) support on arm64 on top of
> Geoff's kexec patchset.
> 
> In this version, there are some arm64-specific usage/constraints:
> 1) "mem=" boot parameter must be specified on crash dump kernel
>    if the system starts on uefi.

This sounds very painful. Why is this the case, and how do x86 and/or
ia64 get around that?

> 2) Kvm will not be enabled on crash dump kernel even if configured
> See commit messages and Documentation/kdump/kdump.txt for details.
> 
> The only concern I have is whether or not we can use the exact same kernel
> as both system kernel and crash dump kernel. The current arm64 kernel is
> not relocatable in the exact sense but I see no problems in using the same
> binary when testing kdump.

Ard has been working on decoupling the kernel text/data, FDT, and linear
memory mappings, which would allow the kernel to be loaded anywhere and
still be able to access all of memory [3]. I assume that's what you mean
by "relocatable"?

> I tested the code with
> 	- ATF v1.1 + EDK2(UEFI) v3.0-rc0
> 	- kernel v4.0 + Geoff' kexec v9
> on
> 	- Base fast model, and
> 	- MediaTek MT8173-EVB
> using my own kexec-tools [1], currently v0.12.
> 
> You may want to start a kernel with the following boot parameter:
> 	crashkernel=64M (or so, on model)
> and try
> 	$ kexec -p --load <vmlinux> --append ...
> 	$ echo c > /proc/sysrq-trigger
> 
> To examine vmcore (/proc/vmcore), you should use
> 	- gdb v7.7 or later
> 	- crash + a small patch (to recognize v4.0 kernel)
> 
> Changes from v1:
> * rebased to Geoff's v9
> * tested this patchset on real hardware and fixed bugs:
> - added cache flush operation in ipi_cpu_stop() when shutting down
>   the system. Otherwise, data saved in vmcore's note sections by
>   crash_save_cpu() might not be flushed to dumped memory and crash command
>   fail to fetch correct data.

We'll need to give this a go on something with a system cache too (e.g.
Seattle or X-Gene). Even if that's only UP it would give me much greater
confidence in the cache maintenance.

>   I will address Mark's commit[2] after Geoff takes care of it on kexec.
> - modified to use ioremap_cache() instead of ioremap() when reading
>   crash memory. Otherwise, accessing /proc/vmcore on crash dump kernel
>   might cause an alignment fault.
> * allows reserve_crashkernel() to handle "crashkernel=xyz[MG]" correctly,
>   thanks to Pratyush Anand. And it now also enforces memory limit.

I worry that there could be potentially bad interaction between this and
Ard's patches, depending on how the memory area to use is chosen. It is
probably fine, but we should make sure that it is.

> * moved reserve_crashkernel() and reserve_elfcorehdr() to
>   arm64_memblock_init() to clarify that they should be called before
>   dma_contignuous_reserve().
> 
> [1] https://git.linaro.org/people/takahiro.akashi/kexec-tools.git
> [2] http://lists.infradead.org/pipermail/linux-arm-kernel/2015-April/338171.html

[3] http://lists.infradead.org/pipermail/linux-arm-kernel/2015-April/337596.html

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [v2 1/5] arm64: kdump: reserve memory for crash dump kernel
  2015-04-24  7:53 ` [v2 1/5] arm64: kdump: reserve memory for crash dump kernel AKASHI Takahiro
@ 2015-04-24 10:11   ` Mark Rutland
  2015-05-11  6:44     ` AKASHI Takahiro
  2015-04-28  9:19   ` Baoquan He
  1 sibling, 1 reply; 29+ messages in thread
From: Mark Rutland @ 2015-04-24 10:11 UTC (permalink / raw)
  To: AKASHI Takahiro
  Cc: Catalin Marinas, Will Deacon, vgoyal, hbabus, linaro-kernel,
	geoff, kexec, linux-kernel, broonie, david.griego,
	linux-arm-kernel

On Fri, Apr 24, 2015 at 08:53:04AM +0100, AKASHI Takahiro wrote:
> On system kernel, the memory region used by crash dump kernel must be
> specified by "crashkernel=X@Y" boot parameter. reserve_crashkernel()
> will allocate the region in "System RAM" and reserve it for later use.
> 
> On crash dump kernel, memory region information in system kernel is
> described in a specific region specified by "elfcorehdr=X@Y" boot parameter.
> reserve_elfcorehdr() will set aside the region to avoid data destruction
> by the kernel.
> 
> Crash dump kernel will access memory regions in system kernel via
> copy_oldmem_page(), which reads a page by ioremap'ing it assuming that
> such pages are not part of main memory of crash dump kernel.
> This is true under non-UEFI environment because kexec-tools modifies
> a device tree adding "usablemem" attributes to memory sections.

I'm not sure what you mean by "usablemem" here.

Do you just mean that the memory nodes are altered such that they only
cover memory usable by the crash kernel?

Why not _always_ require a command line argument for the crash kernel
that restricts its memory usage to a particular range? That way it
doesn't matter whether we're using UEFI or not.

> Under UEFI, however, this is not true because UEFI remove memory sections
> in a device tree and export all the memory regions, even though they belong
> to system kernel.
> 
> So we should add "mem=X[MG]" boot parameter to limit the memory size and
> avoid hitting the following assertion in ioremap():
>         if (WARN_ON(pfn_valid(__phys_to_pfn(phys_addr))))
>                 return NULL;

That looks suspicious. What is being ioremapped at that point?

[...]

> @@ -393,6 +398,7 @@ void __init setup_arch(char **cmdline_p)
>         local_async_enable();
> 
>         efi_init();
> +
>         arm64_memblock_init();
> 
>         paging_init();

Nit: unrelated whitespace change.

> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index ae85da6..ea70d41 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -34,6 +34,8 @@
>  #include <linux/dma-contiguous.h>
>  #include <linux/efi.h>
>  #include <linux/swiotlb.h>
> +#include <linux/kexec.h>
> +#include <linux/crash_dump.h>

Nit: please keep these ordered.

[...]

> +               if (memblock_reserve(crash_base, crash_size)) {
> +                       pr_warn("crashkernel reservation failed - out of memory\n");
> +                       return;
> +               }

If we can remove this memory rather than reserving it, we can limit the
first kernel's ability to accidentally clobber the crash kernel, at the
expense of having to explicitly map/unmap around loading it.

Mark.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [v2 2/5] arm64: kdump: implement machine_crash_shutdown()
  2015-04-24  7:53 ` [v2 2/5] arm64: kdump: implement machine_crash_shutdown() AKASHI Takahiro
@ 2015-04-24 10:39   ` Mark Rutland
  2015-04-24 10:43     ` Marc Zyngier
  2015-05-11  7:10     ` AKASHI Takahiro
  0 siblings, 2 replies; 29+ messages in thread
From: Mark Rutland @ 2015-04-24 10:39 UTC (permalink / raw)
  To: AKASHI Takahiro
  Cc: Catalin Marinas, Will Deacon, vgoyal, hbabus, linaro-kernel,
	geoff, kexec, linux-kernel, broonie, david.griego,
	linux-arm-kernel, marc.zyngier

On Fri, Apr 24, 2015 at 08:53:05AM +0100, AKASHI Takahiro wrote:
> kdump calls machine_crash_shutdown() to shut down non-boot cpus and
> save per-cpu general-purpose registers before restarting the crash dump
> kernel. See kernel_kexec().
> ipi_cpu_stop() is used and a bit modified to support this behavior.
> 
> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
> ---
>  arch/arm64/include/asm/kexec.h    |   34 ++++++++++++++++++++++-
>  arch/arm64/kernel/machine_kexec.c |   55 ++++++++++++++++++++++++++++++++++++-
>  arch/arm64/kernel/smp.c           |   12 ++++++--
>  3 files changed, 97 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
> index 3530ff5..eaf3fcb 100644
> --- a/arch/arm64/include/asm/kexec.h
> +++ b/arch/arm64/include/asm/kexec.h
> @@ -30,6 +30,8 @@
>  
>  #if !defined(__ASSEMBLY__)
>  
> +extern bool in_crash_kexec;
> +
>  /**
>   * crash_setup_regs() - save registers for the panic kernel
>   *
> @@ -40,7 +42,37 @@
>  static inline void crash_setup_regs(struct pt_regs *newregs,
>  				    struct pt_regs *oldregs)
>  {
> -	/* Empty routine needed to avoid build errors. */
> +	if (oldregs) {
> +		memcpy(newregs, oldregs, sizeof(*newregs));
> +	} else {
> +		__asm__ __volatile__ (
> +			"stp	 x0,   x1, [%3]\n\t"

Why the tabs?

Please use #16 * N as the offset for consistency with entry.S, with 0
for the first N.

[...]

> +static void machine_kexec_mask_interrupts(void)
> +{
> +	unsigned int i;
> +	struct irq_desc *desc;
> +
> +	for_each_irq_desc(i, desc) {
> +		struct irq_chip *chip;
> +
> +		chip = irq_desc_get_chip(desc);
> +		if (!chip)
> +			continue;
> +
> +		if (chip->irq_eoi && irqd_irq_inprogress(&desc->irq_data))
> +			chip->irq_eoi(&desc->irq_data);
> +
> +		if (chip->irq_mask)
> +			chip->irq_mask(&desc->irq_data);
> +
> +		if (chip->irq_disable && !irqd_irq_disabled(&desc->irq_data))
> +			chip->irq_disable(&desc->irq_data);
> +	}
> +}

I'm surprised that this isn't left to the irqchip driver init code in
the crash kernel. For all we know this state could be corrupt anyway.

Is there any reason we can't get the GIC driver to nuke all of this at
probe time?

[...]

> @@ -542,7 +543,7 @@ static DEFINE_RAW_SPINLOCK(stop_lock);
>  /*
>   * ipi_cpu_stop - handle IPI from smp_send_stop()
>   */
> -static void ipi_cpu_stop(unsigned int cpu)
> +static void ipi_cpu_stop(unsigned int cpu, struct pt_regs *regs)
>  {
>  	if (system_state == SYSTEM_BOOTING ||
>  	    system_state == SYSTEM_RUNNING) {
> @@ -556,6 +557,13 @@ static void ipi_cpu_stop(unsigned int cpu)
>  
>  	local_irq_disable();
>  
> +#ifdef CONFIG_KEXEC
> +	if (in_crash_kexec) {
> +		crash_save_cpu(regs, cpu);
> +		flush_cache_all();

Any cache maintenance will need to be by VA; flush_cache_all doesn't do
what the name implies, though may appear to work by chance.

Is kdump implemented for ARM? I don't see equivalent for in the arch/arm
ipi_cpu_stop.

Mark.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [v2 2/5] arm64: kdump: implement machine_crash_shutdown()
  2015-04-24 10:39   ` Mark Rutland
@ 2015-04-24 10:43     ` Marc Zyngier
  2015-08-06  7:09       ` AKASHI Takahiro
  2015-05-11  7:10     ` AKASHI Takahiro
  1 sibling, 1 reply; 29+ messages in thread
From: Marc Zyngier @ 2015-04-24 10:43 UTC (permalink / raw)
  To: Mark Rutland, AKASHI Takahiro
  Cc: Catalin Marinas, Will Deacon, vgoyal, hbabus, linaro-kernel,
	geoff, kexec, linux-kernel, broonie, david.griego,
	linux-arm-kernel

On 24/04/15 11:39, Mark Rutland wrote:
> On Fri, Apr 24, 2015 at 08:53:05AM +0100, AKASHI Takahiro wrote:
>> kdump calls machine_crash_shutdown() to shut down non-boot cpus and
>> save per-cpu general-purpose registers before restarting the crash dump
>> kernel. See kernel_kexec().
>> ipi_cpu_stop() is used and a bit modified to support this behavior.
>>
>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>> ---
>>  arch/arm64/include/asm/kexec.h    |   34 ++++++++++++++++++++++-
>>  arch/arm64/kernel/machine_kexec.c |   55 ++++++++++++++++++++++++++++++++++++-
>>  arch/arm64/kernel/smp.c           |   12 ++++++--
>>  3 files changed, 97 insertions(+), 4 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
>> index 3530ff5..eaf3fcb 100644
>> --- a/arch/arm64/include/asm/kexec.h
>> +++ b/arch/arm64/include/asm/kexec.h
>> @@ -30,6 +30,8 @@
>>  
>>  #if !defined(__ASSEMBLY__)
>>  
>> +extern bool in_crash_kexec;
>> +
>>  /**
>>   * crash_setup_regs() - save registers for the panic kernel
>>   *
>> @@ -40,7 +42,37 @@
>>  static inline void crash_setup_regs(struct pt_regs *newregs,
>>  				    struct pt_regs *oldregs)
>>  {
>> -	/* Empty routine needed to avoid build errors. */
>> +	if (oldregs) {
>> +		memcpy(newregs, oldregs, sizeof(*newregs));
>> +	} else {
>> +		__asm__ __volatile__ (
>> +			"stp	 x0,   x1, [%3]\n\t"
> 
> Why the tabs?
> 
> Please use #16 * N as the offset for consistency with entry.S, with 0
> for the first N.
> 
> [...]
> 
>> +static void machine_kexec_mask_interrupts(void)
>> +{
>> +	unsigned int i;
>> +	struct irq_desc *desc;
>> +
>> +	for_each_irq_desc(i, desc) {
>> +		struct irq_chip *chip;
>> +
>> +		chip = irq_desc_get_chip(desc);
>> +		if (!chip)
>> +			continue;
>> +
>> +		if (chip->irq_eoi && irqd_irq_inprogress(&desc->irq_data))
>> +			chip->irq_eoi(&desc->irq_data);
>> +
>> +		if (chip->irq_mask)
>> +			chip->irq_mask(&desc->irq_data);
>> +
>> +		if (chip->irq_disable && !irqd_irq_disabled(&desc->irq_data))
>> +			chip->irq_disable(&desc->irq_data);
>> +	}
>> +}
> 
> I'm surprised that this isn't left to the irqchip driver init code in
> the crash kernel. For all we know this state could be corrupt anyway.

Indeed, parsing the irqdesc list is a recipe for disaster. Who knows
which locks have been taken or simply corrupted, pointers nuked...

> Is there any reason we can't get the GIC driver to nuke all of this at
> probe time?

This feels like the better option. I can cook a patch or two for that.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [v2 1/5] arm64: kdump: reserve memory for crash dump kernel
  2015-04-24  7:53 ` [v2 1/5] arm64: kdump: reserve memory for crash dump kernel AKASHI Takahiro
  2015-04-24 10:11   ` Mark Rutland
@ 2015-04-28  9:19   ` Baoquan He
  2015-05-11  7:38     ` AKASHI Takahiro
  1 sibling, 1 reply; 29+ messages in thread
From: Baoquan He @ 2015-04-28  9:19 UTC (permalink / raw)
  To: AKASHI Takahiro
  Cc: catalin.marinas, will.deacon, vgoyal, hbabus, geoff, broonie,
	david.griego, kexec, linux-arm-kernel, linaro-kernel,
	linux-kernel

> +#ifdef CONFIG_CRASH_DUMP
> +/*
> + * reserve_elfcorehdr() - reserves memory for elf core header
> + *
> + * This function reserves memory area given in "elfcorehdr=" kernel command
> + * line parameter. The memory reserved is used by a dump capture kernel to
> + * identify the memory used by primary kernel.
> + */

Hi AKASHI,

May I know why elfcorehdr need be reserved separately but not locate a
memory region in crashkernel reserved region like all other ARCHs? Is
there any special reason?

Thanks
Baoquan

> +static void __init reserve_elfcorehdr(void)
> +{
> +	if (!elfcorehdr_size)
> +		return;
> +
> +	if (memblock_is_region_reserved(elfcorehdr_addr, elfcorehdr_size)) {
> +		pr_warn("elfcorehdr reservation failed - memory is in use (0x%llx)\n",
> +			elfcorehdr_addr);
> +		return;
> +	}
> +
> +	if (memblock_reserve(elfcorehdr_addr, elfcorehdr_size)) {
> +		pr_warn("elfcorehdr reservation failed - out of memory\n");
> +		return;
> +	}
> +
> +	pr_info("Reserving %lldKB of memory at %lldMB for elfcorehdr\n",
> +		elfcorehdr_size >> 10, elfcorehdr_addr >> 20);
> +}
> +#endif /* CONFIG_CRASH_DUMP */
>  /*
>   * Return the maximum physical address for ZONE_DMA (DMA_BIT_MASK(32)). It
>   * currently assumes that for memory starting above 4G, 32-bit devices will
> @@ -170,6 +247,13 @@ void __init arm64_memblock_init(void)
>  		memblock_reserve(__virt_to_phys(initrd_start), initrd_end - initrd_start);
>  #endif
>  
> +#ifdef CONFIG_KEXEC
> +	reserve_crashkernel(memory_limit);
> +#endif
> +#ifdef CONFIG_CRASH_DUMP
> +	reserve_elfcorehdr();
> +#endif
> +
>  	early_init_fdt_scan_reserved_mem();
>  
>  	/* 4GB maximum for 32-bit only capable devices */
> -- 
> 1.7.9.5
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [v2 4/5] arm64: add kdump support
  2015-04-24  7:53 ` [v2 4/5] arm64: add kdump support AKASHI Takahiro
@ 2015-05-08 12:19   ` Dave Young
  2015-05-11  7:47     ` Dave Young
  0 siblings, 1 reply; 29+ messages in thread
From: Dave Young @ 2015-05-08 12:19 UTC (permalink / raw)
  To: AKASHI Takahiro
  Cc: catalin.marinas, will.deacon, vgoyal, hbabus, linaro-kernel,
	geoff, kexec, linux-kernel, broonie, david.griego,
	linux-arm-kernel, panand

Hi,

On 04/24/15 at 04:53pm, AKASHI Takahiro wrote:
> Please read the following commits for arm64-specific constraints:
>     arm64: kdump: reserve memory for crash dump kernel
>     arm64: kdump: do not go into EL2 before starting a crash dump kernel
> 
> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
> ---
>  Documentation/kdump/kdump.txt |   31 ++++++++++++++++++++++++++++++-
>  arch/arm64/Kconfig            |   12 ++++++++++++
>  2 files changed, 42 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/kdump/kdump.txt b/Documentation/kdump/kdump.txt
> index bc4bd5a..f9cf6f5 100644
> --- a/Documentation/kdump/kdump.txt
> +++ b/Documentation/kdump/kdump.txt
> @@ -18,7 +18,7 @@ memory image to a dump file on the local disk, or across the network to
>  a remote system.
>  
>  Kdump and kexec are currently supported on the x86, x86_64, ppc64, ia64,
> -s390x and arm architectures.
> +s390x, arm and arm64 architectures.
>  
>  When the system kernel boots, it reserves a small section of memory for
>  the dump-capture kernel. This ensures that ongoing Direct Memory Access
> @@ -249,6 +249,29 @@ Dump-capture kernel config options (Arch Dependent, arm)
>  
>      AUTO_ZRELADDR=y
>  
> +Dump-capture kernel config options (Arch Dependent, arm64)
> +----------------------------------------------------------
> +
> +1) Disable symmetric multi-processing support under "Processor type and
> +   features":
> +
> +   CONFIG_SMP=n
> +
> +   (If CONFIG_SMP=y, then specify maxcpus=1 on the kernel command line
> +   when loading the dump-capture kernel, see section "Load the Dump-capture
> +   Kernel".)
> +
> +2) Under uefi, the maximum memory size on the dump-capture kernel must be
> +   limited by specifying:
> +
> +   mem=X[MG]
> +
> +   where X should be less than or equal to the size in "crashkernel="
> +   boot parameter. Kexec-tools will automatically add this.

I noticed you are passing mem=X in kexec-tools, Pratyush found a problem
that vmcore is corrupted because kdump kernel is using the crash notes memory.

How does kdump kernel know the system ram range especially the mem_start?
Only with the size from mem=? parammeter?

In X86 because it use E820 so kexec-tools can pass the memory range to kdump
kernel even for UEFI booting. But in arm64 may need find other way to
communicate with 2nd kernel like memmap=exactmap in X86..

> +
> +3) Currently, kvm will not be initialized on the dump-capture kernel even
> +   if it is configured.
> +
>  Extended crashkernel syntax
>  ===========================
>  
> @@ -312,6 +335,7 @@ Boot into System Kernel
>     any space below the alignment point may be overwritten by the dump-capture kernel,
>     which means it is possible that the vmcore is not that precise as expected.
>  
> +   On arm64, use "crashkernel=Y@X".
>  
>  Load the Dump-capture Kernel
>  ============================
> @@ -334,6 +358,8 @@ For s390x:
>  	- Use image or bzImage
>  For arm:
>  	- Use zImage
> +For arm64:
> +	- Use vmlinux
>  
>  If you are using a uncompressed vmlinux image then use following command
>  to load dump-capture kernel.
> @@ -377,6 +403,9 @@ For s390x:
>  For arm:
>  	"1 maxcpus=1 reset_devices"
>  
> +For arm64:
> +	"1 maxcpus=1 mem=X[MG] reset_devices"
> +
>  Notes on loading the dump-capture kernel:
>  
>  * By default, the ELF headers are stored in ELF64 format to support
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 5716edf..8e2f545 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -562,6 +562,18 @@ config KEXEC
>  	  but it is independent of the system firmware.   And like a reboot
>  	  you can start any kernel with it, not just Linux.
>  
> +config CRASH_DUMP
> +	bool "Build kdump crash kernel"
> +	help
> +	  Generate crash dump after being started by kexec. This should
> +	  be normally only set in special crash dump kernels which are
> +	  loaded in the main kernel with kexec-tools into a specially
> +	  reserved region and then later executed after a crash by
> +	  kdump/kexec. The crash dump kernel must be compiled to a
> +	  memory address not used by the main kernel.
> +
> +	  For more details see Documentation/kdump/kdump.txt
> +
>  config XEN_DOM0
>  	def_bool y
>  	depends on XEN
> -- 
> 1.7.9.5
> 
> 
> _______________________________________________
> kexec mailing list
> kexec@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec
> 
> 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [v2 0/5] arm64: add kdump support
  2015-04-24  9:53 ` [v2 0/5] arm64: add kdump support Mark Rutland
@ 2015-05-11  6:16   ` AKASHI Takahiro
  2015-05-12  5:43     ` Dave Young
  0 siblings, 1 reply; 29+ messages in thread
From: AKASHI Takahiro @ 2015-05-11  6:16 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Catalin Marinas, Will Deacon, vgoyal, hbabus, linaro-kernel,
	geoff, kexec, linux-kernel, broonie, david.griego,
	linux-arm-kernel, ard.biesheuvel

Hi

Sorry for late response. I was on vacation.

On 04/24/2015 06:53 PM, Mark Rutland wrote:
> Hi,
>
> On Fri, Apr 24, 2015 at 08:53:03AM +0100, AKASHI Takahiro wrote:
>> This patch set enables kdump (crash dump kernel) support on arm64 on top of
>> Geoff's kexec patchset.
>>
>> In this version, there are some arm64-specific usage/constraints:
>> 1) "mem=" boot parameter must be specified on crash dump kernel
>>     if the system starts on uefi.
>
> This sounds very painful. Why is this the case, and how do x86 and/or
> ia64 get around that?

As Dave (Young) said, x86 uses "memmap=XX" kernel commandline parameters
to specify usable memory for crash dump kernel.
On my arm64 implementation, "linux,usable-memory" property is added
to device tree blob by kexec-tools for this purpose.
This is because, when I first implemented kdump on arm64, ppc is the only
architecture that supports kdump AND utilizes device trees.
Since kexec-tools as well as the kernel already has this framework,
I believed that device-tree approach was smarter than a commandline
parameter.

However, uefi-based kernel ignores all the memory-related properties
in a device tree and so this "mem=" workaround was added.


>> 2) Kvm will not be enabled on crash dump kernel even if configured
>> See commit messages and Documentation/kdump/kdump.txt for details.
>>
>> The only concern I have is whether or not we can use the exact same kernel
>> as both system kernel and crash dump kernel. The current arm64 kernel is
>> not relocatable in the exact sense but I see no problems in using the same
>> binary when testing kdump.
>
> Ard has been working on decoupling the kernel text/data, FDT, and linear
> memory mappings, which would allow the kernel to be loaded anywhere and
> still be able to access all of memory [3]. I assume that's what you mean
> by "relocatable"?

I'm still trying to understand Ard's patchset, but I think yes.

>> I tested the code with
>> 	- ATF v1.1 + EDK2(UEFI) v3.0-rc0
>> 	- kernel v4.0 + Geoff' kexec v9
>> on
>> 	- Base fast model, and
>> 	- MediaTek MT8173-EVB
>> using my own kexec-tools [1], currently v0.12.
>>
>> You may want to start a kernel with the following boot parameter:
>> 	crashkernel=64M (or so, on model)
>> and try
>> 	$ kexec -p --load <vmlinux> --append ...
>> 	$ echo c > /proc/sysrq-trigger
>>
>> To examine vmcore (/proc/vmcore), you should use
>> 	- gdb v7.7 or later
>> 	- crash + a small patch (to recognize v4.0 kernel)
>>
>> Changes from v1:
>> * rebased to Geoff's v9
>> * tested this patchset on real hardware and fixed bugs:
>> - added cache flush operation in ipi_cpu_stop() when shutting down
>>    the system. Otherwise, data saved in vmcore's note sections by
>>    crash_save_cpu() might not be flushed to dumped memory and crash command
>>    fail to fetch correct data.
>
> We'll need to give this a go on something with a system cache too (e.g.
> Seattle or X-Gene). Even if that's only UP it would give me much greater
> confidence in the cache maintenance.

Is there any genric interface to do so?

>>    I will address Mark's commit[2] after Geoff takes care of it on kexec.
>> - modified to use ioremap_cache() instead of ioremap() when reading
>>    crash memory. Otherwise, accessing /proc/vmcore on crash dump kernel
>>    might cause an alignment fault.
>> * allows reserve_crashkernel() to handle "crashkernel=xyz[MG]" correctly,
>>    thanks to Pratyush Anand. And it now also enforces memory limit.
>
> I worry that there could be potentially bad interaction between this and
> Ard's patches, depending on how the memory area to use is chosen. It is
> probably fine, but we should make sure that it is.

I'm not sure what the point is, but I will try to check it.

Thanks,
-Takahiro AKASHI

>> * moved reserve_crashkernel() and reserve_elfcorehdr() to
>>    arm64_memblock_init() to clarify that they should be called before
>>    dma_contignuous_reserve().
>>
>> [1] https://git.linaro.org/people/takahiro.akashi/kexec-tools.git
>> [2] http://lists.infradead.org/pipermail/linux-arm-kernel/2015-April/338171.html
>
> [3] http://lists.infradead.org/pipermail/linux-arm-kernel/2015-April/337596.html
>
> Thanks,
> Mark.
>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [v2 1/5] arm64: kdump: reserve memory for crash dump kernel
  2015-04-24 10:11   ` Mark Rutland
@ 2015-05-11  6:44     ` AKASHI Takahiro
  0 siblings, 0 replies; 29+ messages in thread
From: AKASHI Takahiro @ 2015-05-11  6:44 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Catalin Marinas, Will Deacon, vgoyal, hbabus, linaro-kernel,
	geoff, kexec, linux-kernel, broonie, david.griego,
	linux-arm-kernel

On 04/24/2015 07:11 PM, Mark Rutland wrote:
> On Fri, Apr 24, 2015 at 08:53:04AM +0100, AKASHI Takahiro wrote:
>> On system kernel, the memory region used by crash dump kernel must be
>> specified by "crashkernel=X@Y" boot parameter. reserve_crashkernel()
>> will allocate the region in "System RAM" and reserve it for later use.
>>
>> On crash dump kernel, memory region information in system kernel is
>> described in a specific region specified by "elfcorehdr=X@Y" boot parameter.
>> reserve_elfcorehdr() will set aside the region to avoid data destruction
>> by the kernel.
>>
>> Crash dump kernel will access memory regions in system kernel via
>> copy_oldmem_page(), which reads a page by ioremap'ing it assuming that
>> such pages are not part of main memory of crash dump kernel.
>> This is true under non-UEFI environment because kexec-tools modifies
>> a device tree adding "usablemem" attributes to memory sections.
>
> I'm not sure what you mean by "usablemem" here.

I think I explained it in my previous reply.

> Do you just mean that the memory nodes are altered such that they only
> cover memory usable by the crash kernel?
>
> Why not _always_ require a command line argument for the crash kernel
> that restricts its memory usage to a particular range? That way it
> doesn't matter whether we're using UEFI or not.

This is one option, but why does uefi ignore all the memory properties?

>> Under UEFI, however, this is not true because UEFI remove memory sections
>> in a device tree and export all the memory regions, even though they belong
>> to system kernel.
>>
>> So we should add "mem=X[MG]" boot parameter to limit the memory size and
>> avoid hitting the following assertion in ioremap():
>>          if (WARN_ON(pfn_valid(__phys_to_pfn(phys_addr))))
>>                  return NULL;
>
> That looks suspicious. What is being ioremapped at that point?

As explained so far, all the memory regions are exposed to crash dump kernel,
and it recognizes any pages which should belong to the old kernel also as
part of crash kernel's memory. So pfn_valid() returns true.


> [...]
>
>> @@ -393,6 +398,7 @@ void __init setup_arch(char **cmdline_p)
>>          local_async_enable();
>>
>>          efi_init();
>> +
>>          arm64_memblock_init();
>>
>>          paging_init();
>
> Nit: unrelated whitespace change.

Ok. Will fix it.

>> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
>> index ae85da6..ea70d41 100644
>> --- a/arch/arm64/mm/init.c
>> +++ b/arch/arm64/mm/init.c
>> @@ -34,6 +34,8 @@
>>   #include <linux/dma-contiguous.h>
>>   #include <linux/efi.h>
>>   #include <linux/swiotlb.h>
>> +#include <linux/kexec.h>
>> +#include <linux/crash_dump.h>
>
> Nit: please keep these ordered.

Yeah, but others "linux/*.h" in this file are already in a random order.


> [...]
>
>> +               if (memblock_reserve(crash_base, crash_size)) {
>> +                       pr_warn("crashkernel reservation failed - out of memory\n");
>> +                       return;
>> +               }
>
> If we can remove this memory rather than reserving it, we can limit the
> first kernel's ability to accidentally clobber the crash kernel, at the
> expense of having to explicitly map/unmap around loading it.

Do you mean that we should remove mmu mapping of crash kernel memory?
Might be a good idea, but it requires modifying kernel/kexec.c.

-Takahiro AKASHI

> Mark.
>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [v2 2/5] arm64: kdump: implement machine_crash_shutdown()
  2015-04-24 10:39   ` Mark Rutland
  2015-04-24 10:43     ` Marc Zyngier
@ 2015-05-11  7:10     ` AKASHI Takahiro
  2015-05-22  5:56       ` AKASHI Takahiro
  1 sibling, 1 reply; 29+ messages in thread
From: AKASHI Takahiro @ 2015-05-11  7:10 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Catalin Marinas, Will Deacon, vgoyal, hbabus, linaro-kernel,
	geoff, kexec, linux-kernel, broonie, david.griego,
	linux-arm-kernel, marc.zyngier

On 04/24/2015 07:39 PM, Mark Rutland wrote:
> On Fri, Apr 24, 2015 at 08:53:05AM +0100, AKASHI Takahiro wrote:
>> kdump calls machine_crash_shutdown() to shut down non-boot cpus and
>> save per-cpu general-purpose registers before restarting the crash dump
>> kernel. See kernel_kexec().
>> ipi_cpu_stop() is used and a bit modified to support this behavior.
>>
>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>> ---
>>   arch/arm64/include/asm/kexec.h    |   34 ++++++++++++++++++++++-
>>   arch/arm64/kernel/machine_kexec.c |   55 ++++++++++++++++++++++++++++++++++++-
>>   arch/arm64/kernel/smp.c           |   12 ++++++--
>>   3 files changed, 97 insertions(+), 4 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
>> index 3530ff5..eaf3fcb 100644
>> --- a/arch/arm64/include/asm/kexec.h
>> +++ b/arch/arm64/include/asm/kexec.h
>> @@ -30,6 +30,8 @@
>>
>>   #if !defined(__ASSEMBLY__)
>>
>> +extern bool in_crash_kexec;
>> +
>>   /**
>>    * crash_setup_regs() - save registers for the panic kernel
>>    *
>> @@ -40,7 +42,37 @@
>>   static inline void crash_setup_regs(struct pt_regs *newregs,
>>   				    struct pt_regs *oldregs)
>>   {
>> -	/* Empty routine needed to avoid build errors. */
>> +	if (oldregs) {
>> +		memcpy(newregs, oldregs, sizeof(*newregs));
>> +	} else {
>> +		__asm__ __volatile__ (
>> +			"stp	 x0,   x1, [%3]\n\t"
>
> Why the tabs?

Don't know. Will remove them.

> Please use #16 * N as the offset for consistency with entry.S, with 0
> for the first N.

OK.

> [...]
>
>> +static void machine_kexec_mask_interrupts(void)
>> +{
>> +	unsigned int i;
>> +	struct irq_desc *desc;
>> +
>> +	for_each_irq_desc(i, desc) {
>> +		struct irq_chip *chip;
>> +
>> +		chip = irq_desc_get_chip(desc);
>> +		if (!chip)
>> +			continue;
>> +
>> +		if (chip->irq_eoi && irqd_irq_inprogress(&desc->irq_data))
>> +			chip->irq_eoi(&desc->irq_data);
>> +
>> +		if (chip->irq_mask)
>> +			chip->irq_mask(&desc->irq_data);
>> +
>> +		if (chip->irq_disable && !irqd_irq_disabled(&desc->irq_data))
>> +			chip->irq_disable(&desc->irq_data);
>> +	}
>> +}
>
> I'm surprised that this isn't left to the irqchip driver init code in
> the crash kernel. For all we know this state could be corrupt anyway.
>
> Is there any reason we can't get the GIC driver to nuke all of this at
> probe time?

I don't get the point. You mean that the code be put in probe() of GIC driver?
To be honest, this function was copied from arm's implementation.

> [...]
>
>> @@ -542,7 +543,7 @@ static DEFINE_RAW_SPINLOCK(stop_lock);
>>   /*
>>    * ipi_cpu_stop - handle IPI from smp_send_stop()
>>    */
>> -static void ipi_cpu_stop(unsigned int cpu)
>> +static void ipi_cpu_stop(unsigned int cpu, struct pt_regs *regs)
>>   {
>>   	if (system_state == SYSTEM_BOOTING ||
>>   	    system_state == SYSTEM_RUNNING) {
>> @@ -556,6 +557,13 @@ static void ipi_cpu_stop(unsigned int cpu)
>>
>>   	local_irq_disable();
>>
>> +#ifdef CONFIG_KEXEC
>> +	if (in_crash_kexec) {
>> +		crash_save_cpu(regs, cpu);
>> +		flush_cache_all();
>
> Any cache maintenance will need to be by VA; flush_cache_all doesn't do
> what the name implies, though may appear to work by chance.
>
> Is kdump implemented for ARM? I don't see equivalent for in the arch/arm
> ipi_cpu_stop.

Arm has a dedicated function in arch/arm/kernel/machine_kexec.c:
    machine_crash_shutdown()
       => machine_crash_nonpanic_core()

Thanks,
-Takahiro AKASHI

> Mark.
>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [v2 1/5] arm64: kdump: reserve memory for crash dump kernel
  2015-04-28  9:19   ` Baoquan He
@ 2015-05-11  7:38     ` AKASHI Takahiro
  2015-05-11  7:54       ` Baoquan He
  0 siblings, 1 reply; 29+ messages in thread
From: AKASHI Takahiro @ 2015-05-11  7:38 UTC (permalink / raw)
  To: Baoquan He
  Cc: catalin.marinas, will.deacon, vgoyal, hbabus, geoff, broonie,
	david.griego, kexec, linux-arm-kernel, linaro-kernel,
	linux-kernel

Hi Baoquan,

On 04/28/2015 06:19 PM, Baoquan He wrote:
>> +#ifdef CONFIG_CRASH_DUMP
>> +/*
>> + * reserve_elfcorehdr() - reserves memory for elf core header
>> + *
>> + * This function reserves memory area given in "elfcorehdr=" kernel command
>> + * line parameter. The memory reserved is used by a dump capture kernel to
>> + * identify the memory used by primary kernel.
>> + */
>
> Hi AKASHI,
>
> May I know why elfcorehdr need be reserved separately but not locate a
> memory region in crashkernel reserved region like all other ARCHs? Is
> there any special reason?

I don't get your point, but arm as well as arm64 locates elfcorehdr
in a crash kernel's memory region.
See kexec/arch/arm{,64}/crashdump-arm{,64}.c in kexec-tools.

And this region is reserved at boot time *on crash kernel* because we don't want
to corrupt it accidentally.
(After Mark's comment, we might better remove the mmu mapping for this region, too.)


Make sense?

-Takahiro AKASHI

> Thanks
> Baoquan
>
>> +static void __init reserve_elfcorehdr(void)
>> +{
>> +	if (!elfcorehdr_size)
>> +		return;
>> +
>> +	if (memblock_is_region_reserved(elfcorehdr_addr, elfcorehdr_size)) {
>> +		pr_warn("elfcorehdr reservation failed - memory is in use (0x%llx)\n",
>> +			elfcorehdr_addr);
>> +		return;
>> +	}
>> +
>> +	if (memblock_reserve(elfcorehdr_addr, elfcorehdr_size)) {
>> +		pr_warn("elfcorehdr reservation failed - out of memory\n");
>> +		return;
>> +	}
>> +
>> +	pr_info("Reserving %lldKB of memory at %lldMB for elfcorehdr\n",
>> +		elfcorehdr_size >> 10, elfcorehdr_addr >> 20);
>> +}
>> +#endif /* CONFIG_CRASH_DUMP */
>>   /*
>>    * Return the maximum physical address for ZONE_DMA (DMA_BIT_MASK(32)). It
>>    * currently assumes that for memory starting above 4G, 32-bit devices will
>> @@ -170,6 +247,13 @@ void __init arm64_memblock_init(void)
>>   		memblock_reserve(__virt_to_phys(initrd_start), initrd_end - initrd_start);
>>   #endif
>>
>> +#ifdef CONFIG_KEXEC
>> +	reserve_crashkernel(memory_limit);
>> +#endif
>> +#ifdef CONFIG_CRASH_DUMP
>> +	reserve_elfcorehdr();
>> +#endif
>> +
>>   	early_init_fdt_scan_reserved_mem();
>>
>>   	/* 4GB maximum for 32-bit only capable devices */
>> --
>> 1.7.9.5
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [v2 4/5] arm64: add kdump support
  2015-05-08 12:19   ` Dave Young
@ 2015-05-11  7:47     ` Dave Young
  2015-05-11  7:58       ` AKASHI Takahiro
  0 siblings, 1 reply; 29+ messages in thread
From: Dave Young @ 2015-05-11  7:47 UTC (permalink / raw)
  To: AKASHI Takahiro
  Cc: linux-arm-kernel, hbabus, linaro-kernel, geoff, catalin.marinas,
	panand, will.deacon, linux-kernel, broonie, david.griego, kexec,
	vgoyal

On 05/08/15 at 08:19pm, Dave Young wrote:
> Hi,
> 
> On 04/24/15 at 04:53pm, AKASHI Takahiro wrote:
> > Please read the following commits for arm64-specific constraints:
> >     arm64: kdump: reserve memory for crash dump kernel
> >     arm64: kdump: do not go into EL2 before starting a crash dump kernel
> > 
> > Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
> > ---
> >  Documentation/kdump/kdump.txt |   31 ++++++++++++++++++++++++++++++-
> >  arch/arm64/Kconfig            |   12 ++++++++++++
> >  2 files changed, 42 insertions(+), 1 deletion(-)
> > 
> > diff --git a/Documentation/kdump/kdump.txt b/Documentation/kdump/kdump.txt
> > index bc4bd5a..f9cf6f5 100644
> > --- a/Documentation/kdump/kdump.txt
> > +++ b/Documentation/kdump/kdump.txt
> > @@ -18,7 +18,7 @@ memory image to a dump file on the local disk, or across the network to
> >  a remote system.
> >  
> >  Kdump and kexec are currently supported on the x86, x86_64, ppc64, ia64,
> > -s390x and arm architectures.
> > +s390x, arm and arm64 architectures.
> >  
> >  When the system kernel boots, it reserves a small section of memory for
> >  the dump-capture kernel. This ensures that ongoing Direct Memory Access
> > @@ -249,6 +249,29 @@ Dump-capture kernel config options (Arch Dependent, arm)
> >  
> >      AUTO_ZRELADDR=y
> >  
> > +Dump-capture kernel config options (Arch Dependent, arm64)
> > +----------------------------------------------------------
> > +
> > +1) Disable symmetric multi-processing support under "Processor type and
> > +   features":
> > +
> > +   CONFIG_SMP=n
> > +
> > +   (If CONFIG_SMP=y, then specify maxcpus=1 on the kernel command line
> > +   when loading the dump-capture kernel, see section "Load the Dump-capture
> > +   Kernel".)
> > +
> > +2) Under uefi, the maximum memory size on the dump-capture kernel must be
> > +   limited by specifying:
> > +
> > +   mem=X[MG]
> > +
> > +   where X should be less than or equal to the size in "crashkernel="
> > +   boot parameter. Kexec-tools will automatically add this.
> 
> I noticed you are passing mem=X in kexec-tools, Pratyush found a problem
> that vmcore is corrupted because kdump kernel is using the crash notes memory.
> 
> How does kdump kernel know the system ram range especially the mem_start?
> Only with the size from mem=? parammeter?
> 
> In X86 because it use E820 so kexec-tools can pass the memory range to kdump
> kernel even for UEFI booting. But in arm64 may need find other way to
> communicate with 2nd kernel like memmap=exactmap in X86..

Discussed with Pratyush in irc, he mentioned that start address is get from
kernel->entry which was passed by kexec-tools.

So sorry for the noise, it will works fine with the UEFI mem= limitation 
fix from Mark Salter

> 
> > +
> > +3) Currently, kvm will not be initialized on the dump-capture kernel even
> > +   if it is configured.
> > +
> >  Extended crashkernel syntax
> >  ===========================
> >  
> > @@ -312,6 +335,7 @@ Boot into System Kernel
> >     any space below the alignment point may be overwritten by the dump-capture kernel,
> >     which means it is possible that the vmcore is not that precise as expected.
> >  
> > +   On arm64, use "crashkernel=Y@X".
> >  
> >  Load the Dump-capture Kernel
> >  ============================
> > @@ -334,6 +358,8 @@ For s390x:
> >  	- Use image or bzImage
> >  For arm:
> >  	- Use zImage
> > +For arm64:
> > +	- Use vmlinux
> >  
> >  If you are using a uncompressed vmlinux image then use following command
> >  to load dump-capture kernel.
> > @@ -377,6 +403,9 @@ For s390x:
> >  For arm:
> >  	"1 maxcpus=1 reset_devices"
> >  
> > +For arm64:
> > +	"1 maxcpus=1 mem=X[MG] reset_devices"
> > +
> >  Notes on loading the dump-capture kernel:
> >  
> >  * By default, the ELF headers are stored in ELF64 format to support
> > diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> > index 5716edf..8e2f545 100644
> > --- a/arch/arm64/Kconfig
> > +++ b/arch/arm64/Kconfig
> > @@ -562,6 +562,18 @@ config KEXEC
> >  	  but it is independent of the system firmware.   And like a reboot
> >  	  you can start any kernel with it, not just Linux.
> >  
> > +config CRASH_DUMP
> > +	bool "Build kdump crash kernel"
> > +	help
> > +	  Generate crash dump after being started by kexec. This should
> > +	  be normally only set in special crash dump kernels which are
> > +	  loaded in the main kernel with kexec-tools into a specially
> > +	  reserved region and then later executed after a crash by
> > +	  kdump/kexec. The crash dump kernel must be compiled to a
> > +	  memory address not used by the main kernel.
> > +
> > +	  For more details see Documentation/kdump/kdump.txt
> > +
> >  config XEN_DOM0
> >  	def_bool y
> >  	depends on XEN
> > -- 
> > 1.7.9.5
> > 
> > 
> > _______________________________________________
> > kexec mailing list
> > kexec@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/kexec
> > 
> > 
> 
> _______________________________________________
> kexec mailing list
> kexec@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec
> 
> 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [v2 1/5] arm64: kdump: reserve memory for crash dump kernel
  2015-05-11  7:38     ` AKASHI Takahiro
@ 2015-05-11  7:54       ` Baoquan He
  2015-05-11  8:17         ` AKASHI Takahiro
  0 siblings, 1 reply; 29+ messages in thread
From: Baoquan He @ 2015-05-11  7:54 UTC (permalink / raw)
  To: AKASHI Takahiro
  Cc: catalin.marinas, will.deacon, vgoyal, hbabus, geoff, broonie,
	david.griego, kexec, linux-arm-kernel, linaro-kernel,
	linux-kernel

On 05/11/15 at 04:38pm, AKASHI Takahiro wrote:
> Hi Baoquan,
> 
> On 04/28/2015 06:19 PM, Baoquan He wrote:
> >>+#ifdef CONFIG_CRASH_DUMP
> >>+/*
> >>+ * reserve_elfcorehdr() - reserves memory for elf core header
> >>+ *
> >>+ * This function reserves memory area given in "elfcorehdr=" kernel command
> >>+ * line parameter. The memory reserved is used by a dump capture kernel to
> >>+ * identify the memory used by primary kernel.
> >>+ */
> >
> >Hi AKASHI,
> >
> >May I know why elfcorehdr need be reserved separately but not locate a
> >memory region in crashkernel reserved region like all other ARCHs? Is
> >there any special reason?
> 
> I don't get your point, but arm as well as arm64 locates elfcorehdr
> in a crash kernel's memory region.
> See kexec/arch/arm{,64}/crashdump-arm{,64}.c in kexec-tools.
> 
> And this region is reserved at boot time *on crash kernel* because we don't want
> to corrupt it accidentally.
> (After Mark's comment, we might better remove the mmu mapping for this region, too.)


Sorry, I don't make myself clear.

In this patch you reserve a separate memory region in 1st kernel to
store elfcorehdr. I am wondering why you don't call add_buffer in
kexec-tools directly. Like this you can get a region from reserved
crashkernel region. Then you don't need reserve_elfcorehdr() to reserve
memory for elfcorehdr specifically. Like other ARCHs do only one memory
region is reserved in 1st kernel, that's crashkernel region.

Thanks
Baoquan
> 
> 
> Make sense?
> 
> -Takahiro AKASHI
> 
> >Thanks
> >Baoquan
> >
> >>+static void __init reserve_elfcorehdr(void)
> >>+{
> >>+	if (!elfcorehdr_size)
> >>+		return;
> >>+
> >>+	if (memblock_is_region_reserved(elfcorehdr_addr, elfcorehdr_size)) {
> >>+		pr_warn("elfcorehdr reservation failed - memory is in use (0x%llx)\n",
> >>+			elfcorehdr_addr);
> >>+		return;
> >>+	}
> >>+
> >>+	if (memblock_reserve(elfcorehdr_addr, elfcorehdr_size)) {
> >>+		pr_warn("elfcorehdr reservation failed - out of memory\n");
> >>+		return;
> >>+	}
> >>+
> >>+	pr_info("Reserving %lldKB of memory at %lldMB for elfcorehdr\n",
> >>+		elfcorehdr_size >> 10, elfcorehdr_addr >> 20);
> >>+}
> >>+#endif /* CONFIG_CRASH_DUMP */
> >>  /*
> >>   * Return the maximum physical address for ZONE_DMA (DMA_BIT_MASK(32)). It
> >>   * currently assumes that for memory starting above 4G, 32-bit devices will
> >>@@ -170,6 +247,13 @@ void __init arm64_memblock_init(void)
> >>  		memblock_reserve(__virt_to_phys(initrd_start), initrd_end - initrd_start);
> >>  #endif
> >>
> >>+#ifdef CONFIG_KEXEC
> >>+	reserve_crashkernel(memory_limit);
> >>+#endif
> >>+#ifdef CONFIG_CRASH_DUMP
> >>+	reserve_elfcorehdr();
> >>+#endif
> >>+
> >>  	early_init_fdt_scan_reserved_mem();
> >>
> >>  	/* 4GB maximum for 32-bit only capable devices */
> >>--
> >>1.7.9.5
> >>
> >>--
> >>To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> >>the body of a message to majordomo@vger.kernel.org
> >>More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >>Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [v2 4/5] arm64: add kdump support
  2015-05-11  7:47     ` Dave Young
@ 2015-05-11  7:58       ` AKASHI Takahiro
  2015-05-11  8:39         ` Dave Young
  0 siblings, 1 reply; 29+ messages in thread
From: AKASHI Takahiro @ 2015-05-11  7:58 UTC (permalink / raw)
  To: Dave Young
  Cc: linux-arm-kernel, hbabus, linaro-kernel, geoff, catalin.marinas,
	panand, will.deacon, linux-kernel, broonie, david.griego, kexec,
	vgoyal

Dave,

On 05/11/2015 04:47 PM, Dave Young wrote:
> On 05/08/15 at 08:19pm, Dave Young wrote:
>> Hi,
>>
>> On 04/24/15 at 04:53pm, AKASHI Takahiro wrote:
>>> Please read the following commits for arm64-specific constraints:
>>>      arm64: kdump: reserve memory for crash dump kernel
>>>      arm64: kdump: do not go into EL2 before starting a crash dump kernel
>>>
>>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>> ---
>>>   Documentation/kdump/kdump.txt |   31 ++++++++++++++++++++++++++++++-
>>>   arch/arm64/Kconfig            |   12 ++++++++++++
>>>   2 files changed, 42 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/Documentation/kdump/kdump.txt b/Documentation/kdump/kdump.txt
>>> index bc4bd5a..f9cf6f5 100644
>>> --- a/Documentation/kdump/kdump.txt
>>> +++ b/Documentation/kdump/kdump.txt
>>> @@ -18,7 +18,7 @@ memory image to a dump file on the local disk, or across the network to
>>>   a remote system.
>>>
>>>   Kdump and kexec are currently supported on the x86, x86_64, ppc64, ia64,
>>> -s390x and arm architectures.
>>> +s390x, arm and arm64 architectures.
>>>
>>>   When the system kernel boots, it reserves a small section of memory for
>>>   the dump-capture kernel. This ensures that ongoing Direct Memory Access
>>> @@ -249,6 +249,29 @@ Dump-capture kernel config options (Arch Dependent, arm)
>>>
>>>       AUTO_ZRELADDR=y
>>>
>>> +Dump-capture kernel config options (Arch Dependent, arm64)
>>> +----------------------------------------------------------
>>> +
>>> +1) Disable symmetric multi-processing support under "Processor type and
>>> +   features":
>>> +
>>> +   CONFIG_SMP=n
>>> +
>>> +   (If CONFIG_SMP=y, then specify maxcpus=1 on the kernel command line
>>> +   when loading the dump-capture kernel, see section "Load the Dump-capture
>>> +   Kernel".)
>>> +
>>> +2) Under uefi, the maximum memory size on the dump-capture kernel must be
>>> +   limited by specifying:
>>> +
>>> +   mem=X[MG]
>>> +
>>> +   where X should be less than or equal to the size in "crashkernel="
>>> +   boot parameter. Kexec-tools will automatically add this.
>>
>> I noticed you are passing mem=X in kexec-tools, Pratyush found a problem
>> that vmcore is corrupted because kdump kernel is using the crash notes memory.
>>
>> How does kdump kernel know the system ram range especially the mem_start?
>> Only with the size from mem=? parammeter?
>>
>> In X86 because it use E820 so kexec-tools can pass the memory range to kdump
>> kernel even for UEFI booting. But in arm64 may need find other way to
>> communicate with 2nd kernel like memmap=exactmap in X86..
>
> Discussed with Pratyush in irc, he mentioned that start address is get from
> kernel->entry which was passed by kexec-tools.
>
> So sorry for the noise, it will works fine with the UEFI mem= limitation
> fix from Mark Salter

No, not at all.
As you know, in another thread (with Mark), we're discussing this.
If we conclude that we should go for "memmap=" approach, I will update my patch.

Thanks,
-Takahiro AKASHI


>>
>>> +
>>> +3) Currently, kvm will not be initialized on the dump-capture kernel even
>>> +   if it is configured.
>>> +
>>>   Extended crashkernel syntax
>>>   ===========================
>>>
>>> @@ -312,6 +335,7 @@ Boot into System Kernel
>>>      any space below the alignment point may be overwritten by the dump-capture kernel,
>>>      which means it is possible that the vmcore is not that precise as expected.
>>>
>>> +   On arm64, use "crashkernel=Y@X".
>>>
>>>   Load the Dump-capture Kernel
>>>   ============================
>>> @@ -334,6 +358,8 @@ For s390x:
>>>   	- Use image or bzImage
>>>   For arm:
>>>   	- Use zImage
>>> +For arm64:
>>> +	- Use vmlinux
>>>
>>>   If you are using a uncompressed vmlinux image then use following command
>>>   to load dump-capture kernel.
>>> @@ -377,6 +403,9 @@ For s390x:
>>>   For arm:
>>>   	"1 maxcpus=1 reset_devices"
>>>
>>> +For arm64:
>>> +	"1 maxcpus=1 mem=X[MG] reset_devices"
>>> +
>>>   Notes on loading the dump-capture kernel:
>>>
>>>   * By default, the ELF headers are stored in ELF64 format to support
>>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>>> index 5716edf..8e2f545 100644
>>> --- a/arch/arm64/Kconfig
>>> +++ b/arch/arm64/Kconfig
>>> @@ -562,6 +562,18 @@ config KEXEC
>>>   	  but it is independent of the system firmware.   And like a reboot
>>>   	  you can start any kernel with it, not just Linux.
>>>
>>> +config CRASH_DUMP
>>> +	bool "Build kdump crash kernel"
>>> +	help
>>> +	  Generate crash dump after being started by kexec. This should
>>> +	  be normally only set in special crash dump kernels which are
>>> +	  loaded in the main kernel with kexec-tools into a specially
>>> +	  reserved region and then later executed after a crash by
>>> +	  kdump/kexec. The crash dump kernel must be compiled to a
>>> +	  memory address not used by the main kernel.
>>> +
>>> +	  For more details see Documentation/kdump/kdump.txt
>>> +
>>>   config XEN_DOM0
>>>   	def_bool y
>>>   	depends on XEN
>>> --
>>> 1.7.9.5
>>>
>>>
>>> _______________________________________________
>>> kexec mailing list
>>> kexec@lists.infradead.org
>>> http://lists.infradead.org/mailman/listinfo/kexec
>>>
>>>
>>
>> _______________________________________________
>> kexec mailing list
>> kexec@lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/kexec
>>
>>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [v2 1/5] arm64: kdump: reserve memory for crash dump kernel
  2015-05-11  7:54       ` Baoquan He
@ 2015-05-11  8:17         ` AKASHI Takahiro
  2015-05-11  9:41           ` Baoquan He
  0 siblings, 1 reply; 29+ messages in thread
From: AKASHI Takahiro @ 2015-05-11  8:17 UTC (permalink / raw)
  To: Baoquan He
  Cc: catalin.marinas, will.deacon, vgoyal, hbabus, geoff, broonie,
	david.griego, kexec, linux-arm-kernel, linaro-kernel,
	linux-kernel

On 05/11/2015 04:54 PM, Baoquan He wrote:
> On 05/11/15 at 04:38pm, AKASHI Takahiro wrote:
>> Hi Baoquan,
>>
>> On 04/28/2015 06:19 PM, Baoquan He wrote:
>>>> +#ifdef CONFIG_CRASH_DUMP
>>>> +/*
>>>> + * reserve_elfcorehdr() - reserves memory for elf core header
>>>> + *
>>>> + * This function reserves memory area given in "elfcorehdr=" kernel command
>>>> + * line parameter. The memory reserved is used by a dump capture kernel to
>>>> + * identify the memory used by primary kernel.
>>>> + */
>>>
>>> Hi AKASHI,
>>>
>>> May I know why elfcorehdr need be reserved separately but not locate a
>>> memory region in crashkernel reserved region like all other ARCHs? Is
>>> there any special reason?
>>
>> I don't get your point, but arm as well as arm64 locates elfcorehdr
>> in a crash kernel's memory region.
>> See kexec/arch/arm{,64}/crashdump-arm{,64}.c in kexec-tools.
>>
>> And this region is reserved at boot time *on crash kernel* because we don't want
>> to corrupt it accidentally.
>> (After Mark's comment, we might better remove the mmu mapping for this region, too.)
>
>
> Sorry, I don't make myself clear.
>
> In this patch you reserve a separate memory region in 1st kernel to
> store elfcorehdr. I am wondering why you don't call add_buffer in
> kexec-tools directly. Like this you can get a region from reserved
> crashkernel region. Then you don't need reserve_elfcorehdr() to reserve
> memory for elfcorehdr specifically. Like other ARCHs do only one memory
> region is reserved in 1st kernel, that's crashkernel region.

I think that you misunderstand somewhat.
* Kexec-tools only locates/identifies a small region for elfcore header within crash kernel's
memory region while 1st kernel is running.
* the data in elfcore header is filled up by kexec_load system call on 1st kernel.
* 1st kernel doesn't reserve any region for elfcore header because the kernel
commandline parameters don't contains "elfcorehdr=" parameter, then elfcorehdr_size=0.
* Crash dump kernel does reserve the region, as I said, because we don't want to
corrupt the info in elfcore header accidentally while crash kernel is running.

Clear?

-Takahiro AKASHI

> Thanks
> Baoquan
>>
>>
>> Make sense?
>>
>> -Takahiro AKASHI
>>
>>> Thanks
>>> Baoquan
>>>
>>>> +static void __init reserve_elfcorehdr(void)
>>>> +{
>>>> +	if (!elfcorehdr_size)
>>>> +		return;
>>>> +
>>>> +	if (memblock_is_region_reserved(elfcorehdr_addr, elfcorehdr_size)) {
>>>> +		pr_warn("elfcorehdr reservation failed - memory is in use (0x%llx)\n",
>>>> +			elfcorehdr_addr);
>>>> +		return;
>>>> +	}
>>>> +
>>>> +	if (memblock_reserve(elfcorehdr_addr, elfcorehdr_size)) {
>>>> +		pr_warn("elfcorehdr reservation failed - out of memory\n");
>>>> +		return;
>>>> +	}
>>>> +
>>>> +	pr_info("Reserving %lldKB of memory at %lldMB for elfcorehdr\n",
>>>> +		elfcorehdr_size >> 10, elfcorehdr_addr >> 20);
>>>> +}
>>>> +#endif /* CONFIG_CRASH_DUMP */
>>>>   /*
>>>>    * Return the maximum physical address for ZONE_DMA (DMA_BIT_MASK(32)). It
>>>>    * currently assumes that for memory starting above 4G, 32-bit devices will
>>>> @@ -170,6 +247,13 @@ void __init arm64_memblock_init(void)
>>>>   		memblock_reserve(__virt_to_phys(initrd_start), initrd_end - initrd_start);
>>>>   #endif
>>>>
>>>> +#ifdef CONFIG_KEXEC
>>>> +	reserve_crashkernel(memory_limit);
>>>> +#endif
>>>> +#ifdef CONFIG_CRASH_DUMP
>>>> +	reserve_elfcorehdr();
>>>> +#endif
>>>> +
>>>>   	early_init_fdt_scan_reserved_mem();
>>>>
>>>>   	/* 4GB maximum for 32-bit only capable devices */
>>>> --
>>>> 1.7.9.5
>>>>
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>>>> the body of a message to majordomo@vger.kernel.org
>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>> Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [v2 4/5] arm64: add kdump support
  2015-05-11  7:58       ` AKASHI Takahiro
@ 2015-05-11  8:39         ` Dave Young
  0 siblings, 0 replies; 29+ messages in thread
From: Dave Young @ 2015-05-11  8:39 UTC (permalink / raw)
  To: AKASHI Takahiro
  Cc: panand, hbabus, linaro-kernel, geoff, catalin.marinas,
	will.deacon, linux-kernel, vgoyal, broonie, david.griego, kexec,
	linux-arm-kernel

On 05/11/15 at 04:58pm, AKASHI Takahiro wrote:
> Dave,
> 
> On 05/11/2015 04:47 PM, Dave Young wrote:
> >On 05/08/15 at 08:19pm, Dave Young wrote:
> >>Hi,
> >>
> >>On 04/24/15 at 04:53pm, AKASHI Takahiro wrote:
> >>>Please read the following commits for arm64-specific constraints:
> >>>     arm64: kdump: reserve memory for crash dump kernel
> >>>     arm64: kdump: do not go into EL2 before starting a crash dump kernel
> >>>
> >>>Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
> >>>---
> >>>  Documentation/kdump/kdump.txt |   31 ++++++++++++++++++++++++++++++-
> >>>  arch/arm64/Kconfig            |   12 ++++++++++++
> >>>  2 files changed, 42 insertions(+), 1 deletion(-)
> >>>
> >>>diff --git a/Documentation/kdump/kdump.txt b/Documentation/kdump/kdump.txt
> >>>index bc4bd5a..f9cf6f5 100644
> >>>--- a/Documentation/kdump/kdump.txt
> >>>+++ b/Documentation/kdump/kdump.txt
> >>>@@ -18,7 +18,7 @@ memory image to a dump file on the local disk, or across the network to
> >>>  a remote system.
> >>>
> >>>  Kdump and kexec are currently supported on the x86, x86_64, ppc64, ia64,
> >>>-s390x and arm architectures.
> >>>+s390x, arm and arm64 architectures.
> >>>
> >>>  When the system kernel boots, it reserves a small section of memory for
> >>>  the dump-capture kernel. This ensures that ongoing Direct Memory Access
> >>>@@ -249,6 +249,29 @@ Dump-capture kernel config options (Arch Dependent, arm)
> >>>
> >>>      AUTO_ZRELADDR=y
> >>>
> >>>+Dump-capture kernel config options (Arch Dependent, arm64)
> >>>+----------------------------------------------------------
> >>>+
> >>>+1) Disable symmetric multi-processing support under "Processor type and
> >>>+   features":
> >>>+
> >>>+   CONFIG_SMP=n
> >>>+
> >>>+   (If CONFIG_SMP=y, then specify maxcpus=1 on the kernel command line
> >>>+   when loading the dump-capture kernel, see section "Load the Dump-capture
> >>>+   Kernel".)
> >>>+
> >>>+2) Under uefi, the maximum memory size on the dump-capture kernel must be
> >>>+   limited by specifying:
> >>>+
> >>>+   mem=X[MG]
> >>>+
> >>>+   where X should be less than or equal to the size in "crashkernel="
> >>>+   boot parameter. Kexec-tools will automatically add this.
> >>
> >>I noticed you are passing mem=X in kexec-tools, Pratyush found a problem
> >>that vmcore is corrupted because kdump kernel is using the crash notes memory.
> >>
> >>How does kdump kernel know the system ram range especially the mem_start?
> >>Only with the size from mem=? parammeter?
> >>
> >>In X86 because it use E820 so kexec-tools can pass the memory range to kdump
> >>kernel even for UEFI booting. But in arm64 may need find other way to
> >>communicate with 2nd kernel like memmap=exactmap in X86..
> >
> >Discussed with Pratyush in irc, he mentioned that start address is get from
> >kernel->entry which was passed by kexec-tools.
> >
> >So sorry for the noise, it will works fine with the UEFI mem= limitation
> >fix from Mark Salter
> 
> No, not at all.
> As you know, in another thread (with Mark), we're discussing this.
> If we conclude that we should go for "memmap=" approach, I will update my patch.

Ok, good to know. I did not notice the thread, will take a look.

There's another problem for memmap= approach though. In x86 we have problems when
we need pass reserved memory ranges as well because of some pci mmconfig issue.
In that case kernel command line buffer is too small and there's too many memory
ranges for reserved ranges. Thus later we switched to pass e820 in x86 boot params
directly.

If there's same potential problems we may consider to use memmap= only for system
ram, and for any other reserved types we can still get from same dtb in 1st kernel.

Thanks
Dave

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [v2 1/5] arm64: kdump: reserve memory for crash dump kernel
  2015-05-11  8:17         ` AKASHI Takahiro
@ 2015-05-11  9:41           ` Baoquan He
  2015-05-12  7:32             ` AKASHI Takahiro
  0 siblings, 1 reply; 29+ messages in thread
From: Baoquan He @ 2015-05-11  9:41 UTC (permalink / raw)
  To: AKASHI Takahiro
  Cc: linux-arm-kernel, hbabus, linaro-kernel, geoff, catalin.marinas,
	will.deacon, linux-kernel, broonie, david.griego, kexec, vgoyal

On 05/11/15 at 05:17pm, AKASHI Takahiro wrote:
> On 05/11/2015 04:54 PM, Baoquan He wrote:
> >In this patch you reserve a separate memory region in 1st kernel to
> >store elfcorehdr. I am wondering why you don't call add_buffer in
> >kexec-tools directly. Like this you can get a region from reserved
> >crashkernel region. Then you don't need reserve_elfcorehdr() to reserve
> >memory for elfcorehdr specifically. Like other ARCHs do only one memory
> >region is reserved in 1st kernel, that's crashkernel region.
> 
> I think that you misunderstand somewhat.
> * Kexec-tools only locates/identifies a small region for elfcore header within crash kernel's
> memory region while 1st kernel is running.
> * the data in elfcore header is filled up by kexec_load system call on 1st kernel.
> * 1st kernel doesn't reserve any region for elfcore header because the kernel
> commandline parameters don't contains "elfcorehdr=" parameter, then elfcorehdr_size=0.
> * Crash dump kernel does reserve the region, as I said, because we don't want to
> corrupt the info in elfcore header accidentally while crash kernel is running.
> 
> Clear?

OK, got it now.

Then I am wondering why "elfcorehdr=" can't be contained in kernel
cmdline as other ARCH does. Maybe I need go over all related threads
then know why it is. Thanks for explanation.

Thanks
Baoquan

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [v2 0/5] arm64: add kdump support
  2015-05-11  6:16   ` AKASHI Takahiro
@ 2015-05-12  5:43     ` Dave Young
  2015-05-18  8:08       ` AKASHI Takahiro
  0 siblings, 1 reply; 29+ messages in thread
From: Dave Young @ 2015-05-12  5:43 UTC (permalink / raw)
  To: AKASHI Takahiro
  Cc: Mark Rutland, Catalin Marinas, Will Deacon, vgoyal, hbabus,
	linaro-kernel, geoff, kexec, linux-kernel, broonie, david.griego,
	linux-arm-kernel, ard.biesheuvel

On 05/11/15 at 03:16pm, AKASHI Takahiro wrote:
> Hi
> 
> Sorry for late response. I was on vacation.
> 
> On 04/24/2015 06:53 PM, Mark Rutland wrote:
> >Hi,
> >
> >On Fri, Apr 24, 2015 at 08:53:03AM +0100, AKASHI Takahiro wrote:
> >>This patch set enables kdump (crash dump kernel) support on arm64 on top of
> >>Geoff's kexec patchset.
> >>
> >>In this version, there are some arm64-specific usage/constraints:
> >>1) "mem=" boot parameter must be specified on crash dump kernel
> >>    if the system starts on uefi.
> >
> >This sounds very painful. Why is this the case, and how do x86 and/or
> >ia64 get around that?
> 
> As Dave (Young) said, x86 uses "memmap=XX" kernel commandline parameters
> to specify usable memory for crash dump kernel.

Originally x86 use memmap=exactmap memmap=XX to specify each section of
memories for 2nd kernel. But later because a lot of reserved type ranges
need to be passed ie. for pci mmconfig, and kernel cmdline buffer is
limited so kexec-tools later switch to passing these in x86 boot params as
E820 memory ranges directly.

> On my arm64 implementation, "linux,usable-memory" property is added
> to device tree blob by kexec-tools for this purpose.
> This is because, when I first implemented kdump on arm64, ppc is the only
> architecture that supports kdump AND utilizes device trees.
> Since kexec-tools as well as the kernel already has this framework,
> I believed that device-tree approach was smarter than a commandline
> parameter.
> 
> However, uefi-based kernel ignores all the memory-related properties
> in a device tree and so this "mem=" workaround was added.

Kdump kernel reuses the memmap info getting from firmware during 1st kernel
boot, I do not think the memmap info can be cooked for crash kernel usable
memory. But it might be a better way to use a special fdt node for crash
kernel memory even for UEFI..

Another way is introducing a similar memmap=, but maybe consider only
system_ram type ranges. For other memory areas still use UEFI memmap.

Thanks
Dave

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [v2 1/5] arm64: kdump: reserve memory for crash dump kernel
  2015-05-11  9:41           ` Baoquan He
@ 2015-05-12  7:32             ` AKASHI Takahiro
  0 siblings, 0 replies; 29+ messages in thread
From: AKASHI Takahiro @ 2015-05-12  7:32 UTC (permalink / raw)
  To: Baoquan He
  Cc: linux-arm-kernel, hbabus, linaro-kernel, geoff, catalin.marinas,
	will.deacon, linux-kernel, broonie, david.griego, kexec, vgoyal

On 05/11/2015 06:41 PM, Baoquan He wrote:
> On 05/11/15 at 05:17pm, AKASHI Takahiro wrote:
>> On 05/11/2015 04:54 PM, Baoquan He wrote:
>>> In this patch you reserve a separate memory region in 1st kernel to
>>> store elfcorehdr. I am wondering why you don't call add_buffer in
>>> kexec-tools directly. Like this you can get a region from reserved
>>> crashkernel region. Then you don't need reserve_elfcorehdr() to reserve
>>> memory for elfcorehdr specifically. Like other ARCHs do only one memory
>>> region is reserved in 1st kernel, that's crashkernel region.
>>
>> I think that you misunderstand somewhat.
>> * Kexec-tools only locates/identifies a small region for elfcore header within crash kernel's
>> memory region while 1st kernel is running.
>> * the data in elfcore header is filled up by kexec_load system call on 1st kernel.
>> * 1st kernel doesn't reserve any region for elfcore header because the kernel
>> commandline parameters don't contains "elfcorehdr=" parameter, then elfcorehdr_size=0.
>> * Crash dump kernel does reserve the region, as I said, because we don't want to
>> corrupt the info in elfcore header accidentally while crash kernel is running.
>>
>> Clear?
>
> OK, got it now.
>
> Then I am wondering why "elfcorehdr=" can't be contained in kernel
> cmdline as other ARCH does. Maybe I need go over all related threads
> then know why it is. Thanks for explanation.

Kexec-tools on 1st kernel appends "elfcorehdr=" to the kernel command line,
actually chosen/bootargs in a device tree, that is passed to *crash dump kernel*.
So when crash dump kernel boots up, it can recognizes that area (and reserves it
for later use of managing /proc/vmcore.)

Thanks,
-Takahiro AKASHI

> Thanks
> Baoquan
>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [v2 0/5] arm64: add kdump support
  2015-05-12  5:43     ` Dave Young
@ 2015-05-18  8:08       ` AKASHI Takahiro
  0 siblings, 0 replies; 29+ messages in thread
From: AKASHI Takahiro @ 2015-05-18  8:08 UTC (permalink / raw)
  To: Dave Young
  Cc: Mark Rutland, Catalin Marinas, Will Deacon, vgoyal, hbabus,
	linaro-kernel, geoff, kexec, linux-kernel, broonie, david.griego,
	linux-arm-kernel, ard.biesheuvel

Mark,

On 05/12/2015 02:43 PM, Dave Young wrote:
> On 05/11/15 at 03:16pm, AKASHI Takahiro wrote:
>> Hi
>>
>> Sorry for late response. I was on vacation.
>>
>> On 04/24/2015 06:53 PM, Mark Rutland wrote:
>>> Hi,
>>>
>>> On Fri, Apr 24, 2015 at 08:53:03AM +0100, AKASHI Takahiro wrote:
>>>> This patch set enables kdump (crash dump kernel) support on arm64 on top of
>>>> Geoff's kexec patchset.
>>>>
>>>> In this version, there are some arm64-specific usage/constraints:
>>>> 1) "mem=" boot parameter must be specified on crash dump kernel
>>>>     if the system starts on uefi.
>>>
>>> This sounds very painful. Why is this the case, and how do x86 and/or
>>> ia64 get around that?
>>
>> As Dave (Young) said, x86 uses "memmap=XX" kernel commandline parameters
>> to specify usable memory for crash dump kernel.
>
> Originally x86 use memmap=exactmap memmap=XX to specify each section of
> memories for 2nd kernel. But later because a lot of reserved type ranges
> need to be passed ie. for pci mmconfig, and kernel cmdline buffer is
> limited so kexec-tools later switch to passing these in x86 boot params as
> E820 memory ranges directly.
>
>> On my arm64 implementation, "linux,usable-memory" property is added
>> to device tree blob by kexec-tools for this purpose.
>> This is because, when I first implemented kdump on arm64, ppc is the only
>> architecture that supports kdump AND utilizes device trees.
>> Since kexec-tools as well as the kernel already has this framework,
>> I believed that device-tree approach was smarter than a commandline
>> parameter.
>>
>> However, uefi-based kernel ignores all the memory-related properties
>> in a device tree and so this "mem=" workaround
> Kdump kernel reuses the memmap info getting from firmware during 1st kernel
> boot, I do not think the memmap info can be cooked for crash kernel usable
> memory. But it might be a better way to use a special fdt node for crash
> kernel memory even for UEFI..

Do you still prefer "memmap=" approach?

Just FYI,

       x86                     arm            arm64            powerpc{,64}

(a)   cmdline(crashkernel=)   cmdline        cmdline          cmdline
(b)   iomem                   iomem          iomem            dev tree
(c)   cmdline(memmap=)        cmdline(mem=)  dev tree         dev tree
                                              & cmdline(mem=)

(a) how we specify/reserve crash dump kernel's memory on 1st kernel
(b) how we inform userspace tool of crash dump kernel's memory on 1st kernel
     - iomem: add "Crash kernel" entry in /proc/iomem
     - dev tree: add /chosen/linux,crashkernel-base (and crashkernel-size)
(c) how we specify usable memory on crash dump kernel
     - dev tree: add /chosen/memory/linux,usable-memory


As you see, arm64, and arm as well, does things in a *hybrid* way of x86 and ppc.
I think we should go for a uniform way with device tree as ppc does.
(Even for (a), we can use a device tree.)

> Another way is introducing a similar memmap=, but maybe consider only
> system_ram type ranges. For other memory areas still use UEFI memmap.

Yeah, I don't know whether we need additional formats of memmap=, like
memmap=XX$YY or memmap=XX#YY, for ACPI stuffs.

Thanks,
-Takahiro AKASHI

> Thanks
> Dave
>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [v2 2/5] arm64: kdump: implement machine_crash_shutdown()
  2015-05-11  7:10     ` AKASHI Takahiro
@ 2015-05-22  5:56       ` AKASHI Takahiro
  0 siblings, 0 replies; 29+ messages in thread
From: AKASHI Takahiro @ 2015-05-22  5:56 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Catalin Marinas, Will Deacon, vgoyal, hbabus, linaro-kernel,
	geoff, kexec, linux-kernel, broonie, david.griego,
	linux-arm-kernel, marc.zyngier

Mark,

On 05/11/2015 04:10 PM, AKASHI Takahiro wrote:
> On 04/24/2015 07:39 PM, Mark Rutland wrote:
>> On Fri, Apr 24, 2015 at 08:53:05AM +0100, AKASHI Takahiro wrote:
>>> kdump calls machine_crash_shutdown() to shut down non-boot cpus and
>>> save per-cpu general-purpose registers before restarting the crash dump
>>> kernel. See kernel_kexec().
>>> ipi_cpu_stop() is used and a bit modified to support this behavior.
>>>
>>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>> ---
>>>   arch/arm64/include/asm/kexec.h    |   34 ++++++++++++++++++++++-
>>>   arch/arm64/kernel/machine_kexec.c |   55 ++++++++++++++++++++++++++++++++++++-
>>>   arch/arm64/kernel/smp.c           |   12 ++++++--
>>>   3 files changed, 97 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
>>> index 3530ff5..eaf3fcb 100644
>>> --- a/arch/arm64/include/asm/kexec.h
>>> +++ b/arch/arm64/include/asm/kexec.h
>>> @@ -30,6 +30,8 @@
>>>
>>>   #if !defined(__ASSEMBLY__)
>>>
>>> +extern bool in_crash_kexec;
>>> +
>>>   /**
>>>    * crash_setup_regs() - save registers for the panic kernel
>>>    *
>>> @@ -40,7 +42,37 @@
>>>   static inline void crash_setup_regs(struct pt_regs *newregs,
>>>                       struct pt_regs *oldregs)
>>>   {
>>> -    /* Empty routine needed to avoid build errors. */
>>> +    if (oldregs) {
>>> +        memcpy(newregs, oldregs, sizeof(*newregs));
>>> +    } else {
>>> +        __asm__ __volatile__ (
>>> +            "stp     x0,   x1, [%3]\n\t"
>>
>> Why the tabs?
>
> Don't know. Will remove them.
>
>> Please use #16 * N as the offset for consistency with entry.S, with 0
>> for the first N.
>
> OK.
>
>> [...]
>>
>>> +static void machine_kexec_mask_interrupts(void)
>>> +{
>>> +    unsigned int i;
>>> +    struct irq_desc *desc;
>>> +
>>> +    for_each_irq_desc(i, desc) {
>>> +        struct irq_chip *chip;
>>> +
>>> +        chip = irq_desc_get_chip(desc);
>>> +        if (!chip)
>>> +            continue;
>>> +
>>> +        if (chip->irq_eoi && irqd_irq_inprogress(&desc->irq_data))
>>> +            chip->irq_eoi(&desc->irq_data);
>>> +
>>> +        if (chip->irq_mask)
>>> +            chip->irq_mask(&desc->irq_data);
>>> +
>>> +        if (chip->irq_disable && !irqd_irq_disabled(&desc->irq_data))
>>> +            chip->irq_disable(&desc->irq_data);
>>> +    }
>>> +}
>>
>> I'm surprised that this isn't left to the irqchip driver init code in
>> the crash kernel. For all we know this state could be corrupt anyway.
>>
>> Is there any reason we can't get the GIC driver to nuke all of this at
>> probe time?
>
> I don't get the point. You mean that the code be put in probe() of GIC driver?
> To be honest, this function was copied from arm's implementation.

Could you please elaborate a bit more details so that I can address the issue?

Thanks,
-Takahiro AKASHI

>> [...]
>>
>>> @@ -542,7 +543,7 @@ static DEFINE_RAW_SPINLOCK(stop_lock);
>>>   /*
>>>    * ipi_cpu_stop - handle IPI from smp_send_stop()
>>>    */
>>> -static void ipi_cpu_stop(unsigned int cpu)
>>> +static void ipi_cpu_stop(unsigned int cpu, struct pt_regs *regs)
>>>   {
>>>       if (system_state == SYSTEM_BOOTING ||
>>>           system_state == SYSTEM_RUNNING) {
>>> @@ -556,6 +557,13 @@ static void ipi_cpu_stop(unsigned int cpu)
>>>
>>>       local_irq_disable();
>>>
>>> +#ifdef CONFIG_KEXEC
>>> +    if (in_crash_kexec) {
>>> +        crash_save_cpu(regs, cpu);
>>> +        flush_cache_all();
>>
>> Any cache maintenance will need to be by VA; flush_cache_all doesn't do
>> what the name implies, though may appear to work by chance.
>>
>> Is kdump implemented for ARM? I don't see equivalent for in the arch/arm
>> ipi_cpu_stop.
>
> Arm has a dedicated function in arch/arm/kernel/machine_kexec.c:
>     machine_crash_shutdown()
>        => machine_crash_nonpanic_core()
>
> Thanks,
> -Takahiro AKASHI
>
>> Mark.
>>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [v2 2/5] arm64: kdump: implement machine_crash_shutdown()
  2015-04-24 10:43     ` Marc Zyngier
@ 2015-08-06  7:09       ` AKASHI Takahiro
  2015-08-06 15:51         ` Marc Zyngier
  0 siblings, 1 reply; 29+ messages in thread
From: AKASHI Takahiro @ 2015-08-06  7:09 UTC (permalink / raw)
  To: Marc Zyngier, Mark Rutland
  Cc: Catalin Marinas, Will Deacon, vgoyal, hbabus, linaro-kernel,
	geoff, kexec, linux-kernel, broonie, david.griego,
	linux-arm-kernel

Marc, Mark

Sorry for not revisiting your comment below for a while.

On 04/24/2015 07:43 PM, Marc Zyngier wrote:
> On 24/04/15 11:39, Mark Rutland wrote:
>> On Fri, Apr 24, 2015 at 08:53:05AM +0100, AKASHI Takahiro wrote:
>>> kdump calls machine_crash_shutdown() to shut down non-boot cpus and
>>> save per-cpu general-purpose registers before restarting the crash dump
>>> kernel. See kernel_kexec().
>>> ipi_cpu_stop() is used and a bit modified to support this behavior.
>>>
>>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>> ---
>>>   arch/arm64/include/asm/kexec.h    |   34 ++++++++++++++++++++++-
>>>   arch/arm64/kernel/machine_kexec.c |   55 ++++++++++++++++++++++++++++++++++++-
>>>   arch/arm64/kernel/smp.c           |   12 ++++++--
>>>   3 files changed, 97 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
>>> index 3530ff5..eaf3fcb 100644
>>> --- a/arch/arm64/include/asm/kexec.h
>>> +++ b/arch/arm64/include/asm/kexec.h
>>> @@ -30,6 +30,8 @@
>>>
>>>   #if !defined(__ASSEMBLY__)
>>>
>>> +extern bool in_crash_kexec;
>>> +
>>>   /**
>>>    * crash_setup_regs() - save registers for the panic kernel
>>>    *
>>> @@ -40,7 +42,37 @@
>>>   static inline void crash_setup_regs(struct pt_regs *newregs,
>>>   				    struct pt_regs *oldregs)
>>>   {
>>> -	/* Empty routine needed to avoid build errors. */
>>> +	if (oldregs) {
>>> +		memcpy(newregs, oldregs, sizeof(*newregs));
>>> +	} else {
>>> +		__asm__ __volatile__ (
>>> +			"stp	 x0,   x1, [%3]\n\t"
>>
>> Why the tabs?
>>
>> Please use #16 * N as the offset for consistency with entry.S, with 0
>> for the first N.
>>
>> [...]
>>
>>> +static void machine_kexec_mask_interrupts(void)
>>> +{
>>> +	unsigned int i;
>>> +	struct irq_desc *desc;
>>> +
>>> +	for_each_irq_desc(i, desc) {
>>> +		struct irq_chip *chip;
>>> +
>>> +		chip = irq_desc_get_chip(desc);
>>> +		if (!chip)
>>> +			continue;
>>> +
>>> +		if (chip->irq_eoi && irqd_irq_inprogress(&desc->irq_data))
>>> +			chip->irq_eoi(&desc->irq_data);
>>> +
>>> +		if (chip->irq_mask)
>>> +			chip->irq_mask(&desc->irq_data);
>>> +
>>> +		if (chip->irq_disable && !irqd_irq_disabled(&desc->irq_data))
>>> +			chip->irq_disable(&desc->irq_data);
>>> +	}
>>> +}
>>
>> I'm surprised that this isn't left to the irqchip driver init code in
>> the crash kernel. For all we know this state could be corrupt anyway.
>
> Indeed, parsing the irqdesc list is a recipe for disaster. Who knows
> which locks have been taken or simply corrupted, pointers nuked...
>
>> Is there any reason we can't get the GIC driver to nuke all of this at
>> probe time?

Is it just enough to remove kexec_mask_interrupts() and add gic_eoi_irq()
at the beginning of gic_cpu_init() in irq-gic.c and irq-gic-v3.c?

> This feels like the better option. I can cook a patch or two for that.

If you do, that will be much better :)

BTW, in arm-gic-v3.h, GICD_CTRL_ARE_NS is defined as
     (1U << 4)
but should it be 5?
(I'm referring to the page 8-415 in IHI0069A.)

Thanks,
-Takahiro AKASHI

> Thanks,
>
> 	M.
>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [v2 2/5] arm64: kdump: implement machine_crash_shutdown()
  2015-08-06  7:09       ` AKASHI Takahiro
@ 2015-08-06 15:51         ` Marc Zyngier
  2015-08-07  4:24           ` AKASHI Takahiro
  0 siblings, 1 reply; 29+ messages in thread
From: Marc Zyngier @ 2015-08-06 15:51 UTC (permalink / raw)
  To: AKASHI Takahiro, Mark Rutland
  Cc: Catalin Marinas, Will Deacon, vgoyal, hbabus, linaro-kernel,
	geoff, kexec, linux-kernel, broonie, david.griego,
	linux-arm-kernel

Hi,

On 06/08/15 08:09, AKASHI Takahiro wrote:
> Marc, Mark
> 
> Sorry for not revisiting your comment below for a while.

Wow. It took me a few minutes to page the context back in.

> On 04/24/2015 07:43 PM, Marc Zyngier wrote:
>> On 24/04/15 11:39, Mark Rutland wrote:
>>> On Fri, Apr 24, 2015 at 08:53:05AM +0100, AKASHI Takahiro wrote:
>>>> kdump calls machine_crash_shutdown() to shut down non-boot cpus and
>>>> save per-cpu general-purpose registers before restarting the crash dump
>>>> kernel. See kernel_kexec().
>>>> ipi_cpu_stop() is used and a bit modified to support this behavior.
>>>>
>>>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>> ---
>>>>   arch/arm64/include/asm/kexec.h    |   34 ++++++++++++++++++++++-
>>>>   arch/arm64/kernel/machine_kexec.c |   55 ++++++++++++++++++++++++++++++++++++-
>>>>   arch/arm64/kernel/smp.c           |   12 ++++++--
>>>>   3 files changed, 97 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
>>>> index 3530ff5..eaf3fcb 100644
>>>> --- a/arch/arm64/include/asm/kexec.h
>>>> +++ b/arch/arm64/include/asm/kexec.h
>>>> @@ -30,6 +30,8 @@
>>>>
>>>>   #if !defined(__ASSEMBLY__)
>>>>
>>>> +extern bool in_crash_kexec;
>>>> +
>>>>   /**
>>>>    * crash_setup_regs() - save registers for the panic kernel
>>>>    *
>>>> @@ -40,7 +42,37 @@
>>>>   static inline void crash_setup_regs(struct pt_regs *newregs,
>>>>   				    struct pt_regs *oldregs)
>>>>   {
>>>> -	/* Empty routine needed to avoid build errors. */
>>>> +	if (oldregs) {
>>>> +		memcpy(newregs, oldregs, sizeof(*newregs));
>>>> +	} else {
>>>> +		__asm__ __volatile__ (
>>>> +			"stp	 x0,   x1, [%3]\n\t"
>>>
>>> Why the tabs?
>>>
>>> Please use #16 * N as the offset for consistency with entry.S, with 0
>>> for the first N.
>>>
>>> [...]
>>>
>>>> +static void machine_kexec_mask_interrupts(void)
>>>> +{
>>>> +	unsigned int i;
>>>> +	struct irq_desc *desc;
>>>> +
>>>> +	for_each_irq_desc(i, desc) {
>>>> +		struct irq_chip *chip;
>>>> +
>>>> +		chip = irq_desc_get_chip(desc);
>>>> +		if (!chip)
>>>> +			continue;
>>>> +
>>>> +		if (chip->irq_eoi && irqd_irq_inprogress(&desc->irq_data))
>>>> +			chip->irq_eoi(&desc->irq_data);
>>>> +
>>>> +		if (chip->irq_mask)
>>>> +			chip->irq_mask(&desc->irq_data);
>>>> +
>>>> +		if (chip->irq_disable && !irqd_irq_disabled(&desc->irq_data))
>>>> +			chip->irq_disable(&desc->irq_data);
>>>> +	}
>>>> +}
>>>
>>> I'm surprised that this isn't left to the irqchip driver init code in
>>> the crash kernel. For all we know this state could be corrupt anyway.
>>
>> Indeed, parsing the irqdesc list is a recipe for disaster. Who knows
>> which locks have been taken or simply corrupted, pointers nuked...
>>
>>> Is there any reason we can't get the GIC driver to nuke all of this at
>>> probe time?
> 
> Is it just enough to remove kexec_mask_interrupts() and add gic_eoi_irq()
> at the beginning of gic_cpu_init() in irq-gic.c and irq-gic-v3.c?

No, doing an EOI is definitely the wrong thing to do. If you do it in
the wrong order, you just screw up the GIC state machine. Plus, you have
no idea what to write there...

The only real solution is to zero the "active" registers.

>> This feels like the better option. I can cook a patch or two for that.
> 
> If you do, that will be much better :)

OK, I'll prepare something that we can merge at the same time kexec
comes back from the dead (if it ever does - I'm not holding my breath).

> 
> BTW, in arm-gic-v3.h, GICD_CTRL_ARE_NS is defined as
>      (1U << 4)
> but should it be 5?
> (I'm referring to the page 8-415 in IHI0069A.)

No, look at the definition ARE_NS has when the access is non-secure or
on a system supporting a single security state. The definition you're
referring to is for a secure access (firmware).

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [v2 2/5] arm64: kdump: implement machine_crash_shutdown()
  2015-08-06 15:51         ` Marc Zyngier
@ 2015-08-07  4:24           ` AKASHI Takahiro
  0 siblings, 0 replies; 29+ messages in thread
From: AKASHI Takahiro @ 2015-08-07  4:24 UTC (permalink / raw)
  To: Marc Zyngier, Mark Rutland
  Cc: Catalin Marinas, Will Deacon, vgoyal, hbabus, linaro-kernel,
	geoff, kexec, linux-kernel, broonie, david.griego,
	linux-arm-kernel

Marc,

On 08/07/2015 12:51 AM, Marc Zyngier wrote:
> Hi,
>
> On 06/08/15 08:09, AKASHI Takahiro wrote:
>> Marc, Mark
>>
>> Sorry for not revisiting your comment below for a while.
>
> Wow. It took me a few minutes to page the context back in.

Please don't purge the page from your cache for a while :)

>> On 04/24/2015 07:43 PM, Marc Zyngier wrote:
>>> On 24/04/15 11:39, Mark Rutland wrote:
>>>> On Fri, Apr 24, 2015 at 08:53:05AM +0100, AKASHI Takahiro wrote:
>>>>> kdump calls machine_crash_shutdown() to shut down non-boot cpus and
>>>>> save per-cpu general-purpose registers before restarting the crash dump
>>>>> kernel. See kernel_kexec().
>>>>> ipi_cpu_stop() is used and a bit modified to support this behavior.
>>>>>
>>>>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>>> ---
>>>>>    arch/arm64/include/asm/kexec.h    |   34 ++++++++++++++++++++++-
>>>>>    arch/arm64/kernel/machine_kexec.c |   55 ++++++++++++++++++++++++++++++++++++-
>>>>>    arch/arm64/kernel/smp.c           |   12 ++++++--
>>>>>    3 files changed, 97 insertions(+), 4 deletions(-)
>>>>>
>>>>> diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
>>>>> index 3530ff5..eaf3fcb 100644
>>>>> --- a/arch/arm64/include/asm/kexec.h
>>>>> +++ b/arch/arm64/include/asm/kexec.h
>>>>> @@ -30,6 +30,8 @@
>>>>>
>>>>>    #if !defined(__ASSEMBLY__)
>>>>>
>>>>> +extern bool in_crash_kexec;
>>>>> +
>>>>>    /**
>>>>>     * crash_setup_regs() - save registers for the panic kernel
>>>>>     *
>>>>> @@ -40,7 +42,37 @@
>>>>>    static inline void crash_setup_regs(struct pt_regs *newregs,
>>>>>    				    struct pt_regs *oldregs)
>>>>>    {
>>>>> -	/* Empty routine needed to avoid build errors. */
>>>>> +	if (oldregs) {
>>>>> +		memcpy(newregs, oldregs, sizeof(*newregs));
>>>>> +	} else {
>>>>> +		__asm__ __volatile__ (
>>>>> +			"stp	 x0,   x1, [%3]\n\t"
>>>>
>>>> Why the tabs?
>>>>
>>>> Please use #16 * N as the offset for consistency with entry.S, with 0
>>>> for the first N.
>>>>
>>>> [...]
>>>>
>>>>> +static void machine_kexec_mask_interrupts(void)
>>>>> +{
>>>>> +	unsigned int i;
>>>>> +	struct irq_desc *desc;
>>>>> +
>>>>> +	for_each_irq_desc(i, desc) {
>>>>> +		struct irq_chip *chip;
>>>>> +
>>>>> +		chip = irq_desc_get_chip(desc);
>>>>> +		if (!chip)
>>>>> +			continue;
>>>>> +
>>>>> +		if (chip->irq_eoi && irqd_irq_inprogress(&desc->irq_data))
>>>>> +			chip->irq_eoi(&desc->irq_data);
>>>>> +
>>>>> +		if (chip->irq_mask)
>>>>> +			chip->irq_mask(&desc->irq_data);
>>>>> +
>>>>> +		if (chip->irq_disable && !irqd_irq_disabled(&desc->irq_data))
>>>>> +			chip->irq_disable(&desc->irq_data);
>>>>> +	}
>>>>> +}
>>>>
>>>> I'm surprised that this isn't left to the irqchip driver init code in
>>>> the crash kernel. For all we know this state could be corrupt anyway.
>>>
>>> Indeed, parsing the irqdesc list is a recipe for disaster. Who knows
>>> which locks have been taken or simply corrupted, pointers nuked...
>>>
>>>> Is there any reason we can't get the GIC driver to nuke all of this at
>>>> probe time?
>>
>> Is it just enough to remove kexec_mask_interrupts() and add gic_eoi_irq()
>> at the beginning of gic_cpu_init() in irq-gic.c and irq-gic-v3.c?
>
> No, doing an EOI is definitely the wrong thing to do. If you do it in
> the wrong order, you just screw up the GIC state machine. Plus, you have
> no idea what to write there...
>
> The only real solution is to zero the "active" registers.
>
>>> This feels like the better option. I can cook a patch or two for that.
>>
>> If you do, that will be much better :)
>
> OK, I'll prepare something that we can merge at the same time kexec
> comes back from the dead (if it ever does - I'm not holding my breath).

Thank you.
Please note that the same function, machine_kexec_mask_interrupts(),
is already there on arm(/kernel/machine_kexec.c).

Well, kexec/kdump stuff is not dead.
Hopefully I and Geoff will submit a full series of patchset in a few weeks
although the main logic will be the same.

>>
>> BTW, in arm-gic-v3.h, GICD_CTRL_ARE_NS is defined as
>>       (1U << 4)
>> but should it be 5?
>> (I'm referring to the page 8-415 in IHI0069A.)
>
> No, look at the definition ARE_NS has when the access is non-secure or
> on a system supporting a single security state. The definition you're
> referring to is for a secure access (firmware).

Aha, I should remember that arm has "multiple personalities."

-Takahiro AKASHI

> Thanks,
>
> 	M.
>

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2015-08-07  4:25 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-04-24  7:53 [v2 0/5] arm64: add kdump support AKASHI Takahiro
2015-04-24  7:53 ` [v2 1/5] arm64: kdump: reserve memory for crash dump kernel AKASHI Takahiro
2015-04-24 10:11   ` Mark Rutland
2015-05-11  6:44     ` AKASHI Takahiro
2015-04-28  9:19   ` Baoquan He
2015-05-11  7:38     ` AKASHI Takahiro
2015-05-11  7:54       ` Baoquan He
2015-05-11  8:17         ` AKASHI Takahiro
2015-05-11  9:41           ` Baoquan He
2015-05-12  7:32             ` AKASHI Takahiro
2015-04-24  7:53 ` [v2 2/5] arm64: kdump: implement machine_crash_shutdown() AKASHI Takahiro
2015-04-24 10:39   ` Mark Rutland
2015-04-24 10:43     ` Marc Zyngier
2015-08-06  7:09       ` AKASHI Takahiro
2015-08-06 15:51         ` Marc Zyngier
2015-08-07  4:24           ` AKASHI Takahiro
2015-05-11  7:10     ` AKASHI Takahiro
2015-05-22  5:56       ` AKASHI Takahiro
2015-04-24  7:53 ` [v2 3/5] arm64: kdump: do not go into EL2 before starting a crash dump kernel AKASHI Takahiro
2015-04-24  7:53 ` [v2 4/5] arm64: add kdump support AKASHI Takahiro
2015-05-08 12:19   ` Dave Young
2015-05-11  7:47     ` Dave Young
2015-05-11  7:58       ` AKASHI Takahiro
2015-05-11  8:39         ` Dave Young
2015-04-24  7:53 ` [v2 5/5] arm64: enable kdump in the arm64 defconfig AKASHI Takahiro
2015-04-24  9:53 ` [v2 0/5] arm64: add kdump support Mark Rutland
2015-05-11  6:16   ` AKASHI Takahiro
2015-05-12  5:43     ` Dave Young
2015-05-18  8:08       ` AKASHI Takahiro

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).