All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v12 00/16] arm64 kexec kernel patches v12
@ 2015-11-24 22:25 ` Geoff Levand
  0 siblings, 0 replies; 178+ messages in thread
From: Geoff Levand @ 2015-11-24 22:25 UTC (permalink / raw)
  To: linux-arm-kernel

Hi All,

This series adds the core support for kexec re-boot and kdump on ARM64.  This
version of the series combines Takahiro's kdump patches with my kexec patches.

Changes since v11:

o No changes, rebase to v4.4-rc2.

Changes since v10:

o Move the flush of the new image from arm64_relocate_new_kernel to machine_kexec.
o Pass values to arm64_relocate_new_kernel in registers, not in global variables.
o Fixups to setting the sctlr_el1 and sctlr_el2 flags.

To load a second stage kernel and execute a kexec re-boot or to work with kdump
on ARM64 systems a series of patches to kexec-tools [2], which have not yet been
merged upstream, are needed.

I have tested kexec with the ARM Foundation model, and Takahiro has reported
that kdump is working on the 96boards HiKey developer board.  Kexec on EFI
systems works correctly.  More ACPI + kexec testing is needed.

Patch 1 here moves the macros from proc-macros.S to asm/assembler.h so that the
dcache_line_size macro it defines can be uesd by kexec's relocate kernel
routine.

Patches 2 & 3 rework the ARM64 hcall mechanism to give the CPU reset routines
the ability to switch exception levels from EL1 to EL2 for kernels that were
entered in EL2.

Patch 4 allows KVM to handle a CPU reset.

Patches 5 - 7 add back the ARM64 CPU reset support that was recently removed
from the kernel.

Patches 8 - 10 add kexec support.

Patches 11-16 add kdump support.

Please consider all patches for inclusion.

[1]  https://git.kernel.org/cgit/linux/kernel/git/geoff/linux-kexec.git
[2]  https://git.kernel.org/cgit/linux/kernel/git/geoff/kexec-tools.git

-Geoff

The following changes since commit 1ec218373b8ebda821aec00bb156a9c94fad9cd4:

  Linux 4.4-rc2 (2015-11-22 16:45:59 -0800)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/geoff/linux-kexec.git kexec-v12

for you to fetch changes up to 83db66df70fc1f7cd4da8b580d264b9320991cf0:

  arm64: kdump: relax BUG_ON() if more than one cpus are still active (2015-11-24 13:59:27 -0800)

----------------------------------------------------------------
AKASHI Takahiro (7):
      arm64: kvm: allows kvm cpu hotplug
      arm64: kdump: reserve memory for crash dump kernel
      arm64: kdump: implement machine_crash_shutdown()
      arm64: kdump: add kdump support
      arm64: kdump: update a kernel doc
      arm64: kdump: enable kdump in the arm64 defconfig
      arm64: kdump: relax BUG_ON() if more than one cpus are still active

Geoff Levand (9):
      arm64: Fold proc-macros.S into assembler.h
      arm64: Convert hcalls to use HVC immediate value
      arm64: Add new hcall HVC_CALL_FUNC
      arm64: Add back cpu_reset routines
      Revert "arm64: mm: remove unused cpu_set_idmap_tcr_t0sz function"
      Revert "arm64: remove dead code"
      arm64/kexec: Add core kexec support
      arm64/kexec: Add pr_devel output
      arm64/kexec: Enable kexec in the arm64 defconfig

 Documentation/kdump/kdump.txt        |  32 ++++-
 arch/arm/include/asm/kvm_host.h      |  10 +-
 arch/arm/include/asm/kvm_mmu.h       |   1 +
 arch/arm/kvm/arm.c                   |  79 ++++++-----
 arch/arm/kvm/mmu.c                   |   5 +
 arch/arm64/Kconfig                   |  22 ++++
 arch/arm64/configs/defconfig         |   2 +
 arch/arm64/include/asm/assembler.h   |  48 ++++++-
 arch/arm64/include/asm/kexec.h       |  80 +++++++++++
 arch/arm64/include/asm/kvm_host.h    |  16 ++-
 arch/arm64/include/asm/kvm_mmu.h     |   1 +
 arch/arm64/include/asm/mmu.h         |   1 +
 arch/arm64/include/asm/mmu_context.h |  35 +++--
 arch/arm64/include/asm/virt.h        |  49 +++++++
 arch/arm64/kernel/Makefile           |   3 +
 arch/arm64/kernel/cpu-reset.S        |  84 ++++++++++++
 arch/arm64/kernel/cpu-reset.h        |  22 ++++
 arch/arm64/kernel/crash_dump.c       |  71 ++++++++++
 arch/arm64/kernel/hyp-stub.S         |  43 ++++--
 arch/arm64/kernel/machine_kexec.c    | 248 +++++++++++++++++++++++++++++++++++
 arch/arm64/kernel/relocate_kernel.S  | 131 ++++++++++++++++++
 arch/arm64/kernel/setup.c            |   7 +-
 arch/arm64/kernel/smp.c              |  16 ++-
 arch/arm64/kvm/hyp-init.S            |  34 ++++-
 arch/arm64/kvm/hyp.S                 |  44 +++++--
 arch/arm64/mm/cache.S                |   2 -
 arch/arm64/mm/init.c                 |  83 ++++++++++++
 arch/arm64/mm/mmu.c                  |  11 ++
 arch/arm64/mm/proc-macros.S          |  64 ---------
 arch/arm64/mm/proc.S                 |   3 -
 include/uapi/linux/kexec.h           |   1 +
 31 files changed, 1097 insertions(+), 151 deletions(-)
 create mode 100644 arch/arm64/include/asm/kexec.h
 create mode 100644 arch/arm64/kernel/cpu-reset.S
 create mode 100644 arch/arm64/kernel/cpu-reset.h
 create mode 100644 arch/arm64/kernel/crash_dump.c
 create mode 100644 arch/arm64/kernel/machine_kexec.c
 create mode 100644 arch/arm64/kernel/relocate_kernel.S
 delete mode 100644 arch/arm64/mm/proc-macros.S

-- 
2.5.0

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 02/16] arm64: Convert hcalls to use HVC immediate value
  2015-11-24 22:25 ` Geoff Levand
@ 2015-11-24 22:25   ` Geoff Levand
  -1 siblings, 0 replies; 178+ messages in thread
From: Geoff Levand @ 2015-11-24 22:25 UTC (permalink / raw)
  To: linux-arm-kernel

The existing arm64 hcall implementations are limited in that they only allow
for two distinct hcalls; with the x0 register either zero or not zero.  Also,
the API of the hyp-stub exception vector routines and the KVM exception vector
routines differ; hyp-stub uses a non-zero value in x0 to implement
__hyp_set_vectors, whereas KVM uses it to implement kvm_call_hyp.

To allow for additional hcalls to be defined and to make the arm64 hcall API
more consistent across exception vector routines, change the hcall
implementations to use the 16 bit immediate value of the HVC instruction to
specify the hcall type.

Define three new preprocessor macros HVC_CALL_HYP, HVC_GET_VECTORS, and
HVC_SET_VECTORS to be used as hcall type specifiers and convert the
existing __hyp_get_vectors(), __hyp_set_vectors() and kvm_call_hyp() routines
to use these new macros when executing an HVC call.  Also, change the
corresponding hyp-stub and KVM el1_sync exception vector routines to use these
new macros.

Signed-off-by: Geoff Levand <geoff@infradead.org>
---
 arch/arm64/include/asm/virt.h | 27 +++++++++++++++++++++++++++
 arch/arm64/kernel/hyp-stub.S  | 32 +++++++++++++++++++++-----------
 arch/arm64/kvm/hyp.S          | 16 +++++++++-------
 3 files changed, 57 insertions(+), 18 deletions(-)

diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h
index 7a5df52..eb10368 100644
--- a/arch/arm64/include/asm/virt.h
+++ b/arch/arm64/include/asm/virt.h
@@ -18,6 +18,33 @@
 #ifndef __ASM__VIRT_H
 #define __ASM__VIRT_H
 
+/*
+ * The arm64 hcall implementation uses the ISS field of the ESR_EL2 register to
+ * specify the hcall type.  The exception handlers are allowed to use registers
+ * x17 and x18 in their implementation.  Any routine issuing an hcall must not
+ * expect these registers to be preserved.
+ */
+
+/*
+ * HVC_CALL_HYP - Execute a hyp routine.
+ */
+
+#define HVC_CALL_HYP 0
+
+/*
+ * HVC_GET_VECTORS - Return the value of the vbar_el2 register.
+ */
+
+#define HVC_GET_VECTORS 1
+
+/*
+ * HVC_SET_VECTORS - Set the value of the vbar_el2 register.
+ *
+ * @x0: Physical address of the new vector table.
+ */
+
+#define HVC_SET_VECTORS 2
+
 #define BOOT_CPU_MODE_EL1	(0xe11)
 #define BOOT_CPU_MODE_EL2	(0xe12)
 
diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S
index a272f33..017ab519 100644
--- a/arch/arm64/kernel/hyp-stub.S
+++ b/arch/arm64/kernel/hyp-stub.S
@@ -22,6 +22,7 @@
 #include <linux/irqchip/arm-gic-v3.h>
 
 #include <asm/assembler.h>
+#include <asm/kvm_arm.h>
 #include <asm/ptrace.h>
 #include <asm/virt.h>
 
@@ -53,14 +54,22 @@ ENDPROC(__hyp_stub_vectors)
 	.align 11
 
 el1_sync:
-	mrs	x1, esr_el2
-	lsr	x1, x1, #26
-	cmp	x1, #0x16
+	mrs	x18, esr_el2
+	lsr	x17, x18, #ESR_ELx_EC_SHIFT
+	and	x18, x18, #ESR_ELx_ISS_MASK
+
+	cmp	x17, #ESR_ELx_EC_HVC64
 	b.ne	2f				// Not an HVC trap
-	cbz	x0, 1f
-	msr	vbar_el2, x0			// Set vbar_el2
+
+	cmp	x18, #HVC_GET_VECTORS
+	b.ne	1f
+	mrs	x0, vbar_el2
 	b	2f
-1:	mrs	x0, vbar_el2			// Return vbar_el2
+
+1:	cmp	x18, #HVC_SET_VECTORS
+	b.ne	2f
+	msr	vbar_el2, x0
+
 2:	eret
 ENDPROC(el1_sync)
 
@@ -100,11 +109,12 @@ ENDPROC(\label)
  * initialisation entry point.
  */
 
-ENTRY(__hyp_get_vectors)
-	mov	x0, xzr
-	// fall through
 ENTRY(__hyp_set_vectors)
-	hvc	#0
+	hvc	#HVC_SET_VECTORS
 	ret
-ENDPROC(__hyp_get_vectors)
 ENDPROC(__hyp_set_vectors)
+
+ENTRY(__hyp_get_vectors)
+	hvc	#HVC_GET_VECTORS
+	ret
+ENDPROC(__hyp_get_vectors)
diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
index 1599701..1bef8db 100644
--- a/arch/arm64/kvm/hyp.S
+++ b/arch/arm64/kvm/hyp.S
@@ -29,6 +29,7 @@
 #include <asm/kvm_asm.h>
 #include <asm/kvm_mmu.h>
 #include <asm/memory.h>
+#include <asm/virt.h>
 
 #define CPU_GP_REG_OFFSET(x)	(CPU_GP_REGS + x)
 #define CPU_XREG_OFFSET(x)	CPU_GP_REG_OFFSET(CPU_USER_PT_REGS + 8*x)
@@ -932,12 +933,9 @@ __hyp_panic_str:
  * in Hyp mode (see init_hyp_mode in arch/arm/kvm/arm.c).  Return values are
  * passed in r0 and r1.
  *
- * A function pointer with a value of 0 has a special meaning, and is
- * used to implement __hyp_get_vectors in the same way as in
- * arch/arm64/kernel/hyp_stub.S.
  */
 ENTRY(kvm_call_hyp)
-	hvc	#0
+	hvc	#HVC_CALL_HYP
 	ret
 ENDPROC(kvm_call_hyp)
 
@@ -968,6 +966,7 @@ el1_sync:					// Guest trapped into EL2
 
 	mrs	x1, esr_el2
 	lsr	x2, x1, #ESR_ELx_EC_SHIFT
+	and	x0, x1, #ESR_ELx_ISS_MASK
 
 	cmp	x2, #ESR_ELx_EC_HVC64
 	b.ne	el1_trap
@@ -976,15 +975,18 @@ el1_sync:					// Guest trapped into EL2
 	cbnz	x3, el1_trap			// called HVC
 
 	/* Here, we're pretty sure the host called HVC. */
+	mov	x18, x0
 	pop	x2, x3
 	pop	x0, x1
 
-	/* Check for __hyp_get_vectors */
-	cbnz	x0, 1f
+	cmp	x18, #HVC_GET_VECTORS
+	b.ne	1f
 	mrs	x0, vbar_el2
 	b	2f
 
-1:	push	lr, xzr
+1:	/* Default to HVC_CALL_HYP. */
+
+	push	lr, xzr
 
 	/*
 	 * Compute the function address in EL2, and shuffle the parameters.
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 178+ messages in thread

* [PATCH v12 04/16] arm64: kvm: allows kvm cpu hotplug
  2015-11-24 22:25 ` Geoff Levand
@ 2015-11-24 22:25   ` Geoff Levand
  -1 siblings, 0 replies; 178+ messages in thread
From: Geoff Levand @ 2015-11-24 22:25 UTC (permalink / raw)
  To: linux-arm-kernel

From: AKASHI Takahiro <takahiro.akashi@linaro.org>

The current kvm implementation on arm64 does cpu-specific initialization
at system boot, and has no way to gracefully shutdown a core in terms of
kvm. This prevents, especially, kexec from rebooting the system on a boot
core in EL2.

This patch adds a cpu tear-down function and also puts an existing cpu-init
code into a separate function, kvm_arch_hardware_disable() and
kvm_arch_hardware_enable() respectively.
We don't need arm64-specific cpu hotplug hook any more.

Since this patch modifies common part of code between arm and arm64, one
stub definition, __cpu_reset_hyp_mode(), is added on arm side to avoid
compiling errors.

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
---
 arch/arm/include/asm/kvm_host.h   | 10 ++++-
 arch/arm/include/asm/kvm_mmu.h    |  1 +
 arch/arm/kvm/arm.c                | 79 ++++++++++++++++++---------------------
 arch/arm/kvm/mmu.c                |  5 +++
 arch/arm64/include/asm/kvm_host.h | 16 +++++++-
 arch/arm64/include/asm/kvm_mmu.h  |  1 +
 arch/arm64/include/asm/virt.h     |  9 +++++
 arch/arm64/kvm/hyp-init.S         | 33 ++++++++++++++++
 arch/arm64/kvm/hyp.S              | 32 ++++++++++++++--
 9 files changed, 138 insertions(+), 48 deletions(-)

diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index 6692982..9242765 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -214,6 +214,15 @@ static inline void __cpu_init_hyp_mode(phys_addr_t boot_pgd_ptr,
 	kvm_call_hyp((void*)hyp_stack_ptr, vector_ptr, pgd_ptr);
 }
 
+static inline void __cpu_reset_hyp_mode(phys_addr_t boot_pgd_ptr,
+					phys_addr_t phys_idmap_start)
+{
+	/*
+	 * TODO
+	 * kvm_call_reset(boot_pgd_ptr, phys_idmap_start);
+	 */
+}
+
 static inline int kvm_arch_dev_ioctl_check_extension(long ext)
 {
 	return 0;
@@ -226,7 +235,6 @@ void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot);
 
 struct kvm_vcpu *kvm_mpidr_to_vcpu(struct kvm *kvm, unsigned long mpidr);
 
-static inline void kvm_arch_hardware_disable(void) {}
 static inline void kvm_arch_hardware_unsetup(void) {}
 static inline void kvm_arch_sync_events(struct kvm *kvm) {}
 static inline void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index 405aa18..dc6fadf 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -66,6 +66,7 @@ void kvm_mmu_free_memory_caches(struct kvm_vcpu *vcpu);
 phys_addr_t kvm_mmu_get_httbr(void);
 phys_addr_t kvm_mmu_get_boot_httbr(void);
 phys_addr_t kvm_get_idmap_vector(void);
+phys_addr_t kvm_get_idmap_start(void);
 int kvm_mmu_init(void);
 void kvm_clear_hyp_idmap(void);
 
diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index eab83b2..a5d9d74 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -16,7 +16,6 @@
  * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
  */
 
-#include <linux/cpu.h>
 #include <linux/cpu_pm.h>
 #include <linux/errno.h>
 #include <linux/err.h>
@@ -61,6 +60,8 @@ static atomic64_t kvm_vmid_gen = ATOMIC64_INIT(1);
 static u8 kvm_next_vmid;
 static DEFINE_SPINLOCK(kvm_vmid_lock);
 
+static DEFINE_PER_CPU(unsigned char, kvm_arm_hardware_enabled);
+
 static void kvm_arm_set_running_vcpu(struct kvm_vcpu *vcpu)
 {
 	BUG_ON(preemptible());
@@ -85,11 +86,6 @@ struct kvm_vcpu * __percpu *kvm_get_running_vcpus(void)
 	return &kvm_arm_running_vcpu;
 }
 
-int kvm_arch_hardware_enable(void)
-{
-	return 0;
-}
-
 int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu)
 {
 	return kvm_vcpu_exiting_guest_mode(vcpu) == IN_GUEST_MODE;
@@ -959,7 +955,7 @@ long kvm_arch_vm_ioctl(struct file *filp,
 	}
 }
 
-static void cpu_init_hyp_mode(void *dummy)
+int kvm_arch_hardware_enable(void)
 {
 	phys_addr_t boot_pgd_ptr;
 	phys_addr_t pgd_ptr;
@@ -967,6 +963,9 @@ static void cpu_init_hyp_mode(void *dummy)
 	unsigned long stack_page;
 	unsigned long vector_ptr;
 
+	if (__hyp_get_vectors() != hyp_default_vectors)
+		return 0;
+
 	/* Switch from the HYP stub to our own HYP init vector */
 	__hyp_set_vectors(kvm_get_idmap_vector());
 
@@ -979,38 +978,50 @@ static void cpu_init_hyp_mode(void *dummy)
 	__cpu_init_hyp_mode(boot_pgd_ptr, pgd_ptr, hyp_stack_ptr, vector_ptr);
 
 	kvm_arm_init_debug();
+
+	return 0;
 }
 
-static int hyp_init_cpu_notify(struct notifier_block *self,
-			       unsigned long action, void *cpu)
+void kvm_arch_hardware_disable(void)
 {
-	switch (action) {
-	case CPU_STARTING:
-	case CPU_STARTING_FROZEN:
-		if (__hyp_get_vectors() == hyp_default_vectors)
-			cpu_init_hyp_mode(NULL);
-		break;
-	}
+	phys_addr_t boot_pgd_ptr;
+	phys_addr_t phys_idmap_start;
 
-	return NOTIFY_OK;
-}
+	if (__hyp_get_vectors() == hyp_default_vectors)
+		return;
 
-static struct notifier_block hyp_init_cpu_nb = {
-	.notifier_call = hyp_init_cpu_notify,
-};
+	boot_pgd_ptr = kvm_mmu_get_boot_httbr();
+	phys_idmap_start = kvm_get_idmap_start();
+
+	__cpu_reset_hyp_mode(boot_pgd_ptr, phys_idmap_start);
+}
 
 #ifdef CONFIG_CPU_PM
 static int hyp_init_cpu_pm_notifier(struct notifier_block *self,
 				    unsigned long cmd,
 				    void *v)
 {
-	if (cmd == CPU_PM_EXIT &&
-	    __hyp_get_vectors() == hyp_default_vectors) {
-		cpu_init_hyp_mode(NULL);
+	switch (cmd) {
+	case CPU_PM_ENTER:
+		if (__hyp_get_vectors() != hyp_default_vectors)
+			__this_cpu_write(kvm_arm_hardware_enabled, 1);
+		else
+			__this_cpu_write(kvm_arm_hardware_enabled, 0);
+		/*
+		 * don't call kvm_arch_hardware_disable() in case of
+		 * CPU_PM_ENTER because it does't actually save any state.
+		 */
+
+		return NOTIFY_OK;
+	case CPU_PM_EXIT:
+		if (__this_cpu_read(kvm_arm_hardware_enabled))
+			kvm_arch_hardware_enable();
+
 		return NOTIFY_OK;
-	}
 
-	return NOTIFY_DONE;
+	default:
+		return NOTIFY_DONE;
+	}
 }
 
 static struct notifier_block hyp_init_cpu_pm_nb = {
@@ -1108,11 +1119,6 @@ static int init_hyp_mode(void)
 	}
 
 	/*
-	 * Execute the init code on each CPU.
-	 */
-	on_each_cpu(cpu_init_hyp_mode, NULL, 1);
-
-	/*
 	 * Init HYP view of VGIC
 	 */
 	err = kvm_vgic_hyp_init();
@@ -1186,26 +1192,15 @@ int kvm_arch_init(void *opaque)
 		}
 	}
 
-	cpu_notifier_register_begin();
-
 	err = init_hyp_mode();
 	if (err)
 		goto out_err;
 
-	err = __register_cpu_notifier(&hyp_init_cpu_nb);
-	if (err) {
-		kvm_err("Cannot register HYP init CPU notifier (%d)\n", err);
-		goto out_err;
-	}
-
-	cpu_notifier_register_done();
-
 	hyp_cpu_pm_init();
 
 	kvm_coproc_table_init();
 	return 0;
 out_err:
-	cpu_notifier_register_done();
 	return err;
 }
 
diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
index 6984342..69b4a33 100644
--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -1644,6 +1644,11 @@ phys_addr_t kvm_get_idmap_vector(void)
 	return hyp_idmap_vector;
 }
 
+phys_addr_t kvm_get_idmap_start(void)
+{
+	return hyp_idmap_start;
+}
+
 int kvm_mmu_init(void)
 {
 	int err;
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index a35ce72..0b540f8 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -223,6 +223,7 @@ struct kvm_vcpu *kvm_arm_get_running_vcpu(void);
 struct kvm_vcpu * __percpu *kvm_get_running_vcpus(void);
 
 u64 kvm_call_hyp(void *hypfn, ...);
+void kvm_call_reset(phys_addr_t boot_pgd_ptr, phys_addr_t phys_idmap_start);
 void force_vm_exit(const cpumask_t *mask);
 void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot);
 
@@ -247,7 +248,20 @@ static inline void __cpu_init_hyp_mode(phys_addr_t boot_pgd_ptr,
 		     hyp_stack_ptr, vector_ptr);
 }
 
-static inline void kvm_arch_hardware_disable(void) {}
+static inline void __cpu_reset_hyp_mode(phys_addr_t boot_pgd_ptr,
+					phys_addr_t phys_idmap_start)
+{
+	/*
+	 * Call reset code, and switch back to stub hyp vectors.
+	 */
+	kvm_call_reset(boot_pgd_ptr, phys_idmap_start);
+}
+
+struct vgic_sr_vectors {
+	void	*save_vgic;
+	void	*restore_vgic;
+};
+
 static inline void kvm_arch_hardware_unsetup(void) {}
 static inline void kvm_arch_sync_events(struct kvm *kvm) {}
 static inline void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 6150567..ff5a087 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -98,6 +98,7 @@ void kvm_mmu_free_memory_caches(struct kvm_vcpu *vcpu);
 phys_addr_t kvm_mmu_get_httbr(void);
 phys_addr_t kvm_mmu_get_boot_httbr(void);
 phys_addr_t kvm_get_idmap_vector(void);
+phys_addr_t kvm_get_idmap_start(void);
 int kvm_mmu_init(void);
 void kvm_clear_hyp_idmap(void);
 
diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h
index 3070096..bca79f9 100644
--- a/arch/arm64/include/asm/virt.h
+++ b/arch/arm64/include/asm/virt.h
@@ -58,9 +58,18 @@
 
 #define HVC_CALL_FUNC 3
 
+/*
+ * HVC_RESET_CPU - Reset cpu in EL2 to initial state.
+ *
+ * @x0: entry address in trampoline code in va
+ * @x1: identical mapping page table in pa
+ */
+
 #define BOOT_CPU_MODE_EL1	(0xe11)
 #define BOOT_CPU_MODE_EL2	(0xe12)
 
+#define HVC_RESET_CPU 4
+
 #ifndef __ASSEMBLY__
 
 /*
diff --git a/arch/arm64/kvm/hyp-init.S b/arch/arm64/kvm/hyp-init.S
index 2e67a48..1925163 100644
--- a/arch/arm64/kvm/hyp-init.S
+++ b/arch/arm64/kvm/hyp-init.S
@@ -139,6 +139,39 @@ merged:
 	eret
 ENDPROC(__kvm_hyp_init)
 
+	/*
+	 * x0: HYP boot pgd
+	 * x1: HYP phys_idmap_start
+	 */
+ENTRY(__kvm_hyp_reset)
+	/* We're in trampoline code in VA, switch back to boot page tables */
+	msr	ttbr0_el2, x0
+	isb
+
+	/* Invalidate the old TLBs */
+	tlbi	alle2
+	dsb	sy
+
+	/* Branch into PA space */
+	adr	x0, 1f
+	bfi	x1, x0, #0, #PAGE_SHIFT
+	br	x1
+
+	/* We're now in idmap, disable MMU */
+1:	mrs	x0, sctlr_el2
+	ldr	x1, =SCTLR_EL2_FLAGS
+	bic	x0, x0, x1		// Clear SCTL_M and etc
+	msr	sctlr_el2, x0
+	isb
+
+	/* Install stub vectors */
+	adrp	x0, __hyp_stub_vectors
+	add	x0, x0, #:lo12:__hyp_stub_vectors
+	msr	vbar_el2, x0
+
+	eret
+ENDPROC(__kvm_hyp_reset)
+
 	.ltorg
 
 	.popsection
diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
index 1bef8db..aca11d6 100644
--- a/arch/arm64/kvm/hyp.S
+++ b/arch/arm64/kvm/hyp.S
@@ -939,6 +939,11 @@ ENTRY(kvm_call_hyp)
 	ret
 ENDPROC(kvm_call_hyp)
 
+ENTRY(kvm_call_reset)
+	hvc	#HVC_RESET_CPU
+	ret
+ENDPROC(kvm_call_reset)
+
 .macro invalid_vector	label, target
 	.align	2
 \label:
@@ -982,10 +987,27 @@ el1_sync:					// Guest trapped into EL2
 	cmp	x18, #HVC_GET_VECTORS
 	b.ne	1f
 	mrs	x0, vbar_el2
-	b	2f
-
-1:	/* Default to HVC_CALL_HYP. */
+	b	do_eret
 
+	/* jump into trampoline code */
+1:	cmp	x18, #HVC_RESET_CPU
+	b.ne	2f
+	/*
+	 * Entry point is:
+	 *	TRAMPOLINE_VA
+	 *	+ (__kvm_hyp_reset - (__hyp_idmap_text_start & PAGE_MASK))
+	 */
+	adrp	x2, __kvm_hyp_reset
+	add	x2, x2, #:lo12:__kvm_hyp_reset
+	adrp	x3, __hyp_idmap_text_start
+	add	x3, x3, #:lo12:__hyp_idmap_text_start
+	and	x3, x3, PAGE_MASK
+	sub	x2, x2, x3
+	ldr	x3, =TRAMPOLINE_VA
+	add	x2, x2, x3
+	br	x2				// no return
+
+2:	/* Default to HVC_CALL_HYP. */
 	push	lr, xzr
 
 	/*
@@ -999,7 +1021,9 @@ el1_sync:					// Guest trapped into EL2
 	blr	lr
 
 	pop	lr, xzr
-2:	eret
+
+do_eret:
+	eret
 
 el1_trap:
 	/*
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 178+ messages in thread

* [PATCH v12 03/16] arm64: Add new hcall HVC_CALL_FUNC
  2015-11-24 22:25 ` Geoff Levand
@ 2015-11-24 22:25   ` Geoff Levand
  -1 siblings, 0 replies; 178+ messages in thread
From: Geoff Levand @ 2015-11-24 22:25 UTC (permalink / raw)
  To: linux-arm-kernel

Add the new hcall HVC_CALL_FUNC that allows execution of a function at EL2.
During CPU reset the CPU must be brought to the exception level it had on
entry to the kernel.  The HVC_CALL_FUNC hcall will provide the mechanism
needed for this exception level switch.

To allow the HVC_CALL_FUNC exception vector to work without a stack, which is
needed to support an hcall at CPU reset, this implementation uses register x18
to store the link register across the caller provided function.  This dictates
that the caller provided function must preserve the contents of register x18.

Signed-off-by: Geoff Levand <geoff@infradead.org>
---
 arch/arm64/include/asm/virt.h | 13 +++++++++++++
 arch/arm64/kernel/hyp-stub.S  | 13 ++++++++++++-
 2 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h
index eb10368..3070096 100644
--- a/arch/arm64/include/asm/virt.h
+++ b/arch/arm64/include/asm/virt.h
@@ -45,6 +45,19 @@
 
 #define HVC_SET_VECTORS 2
 
+/*
+ * HVC_CALL_FUNC - Execute a function at EL2.
+ *
+ * @x0: Physical address of the function to be executed.
+ * @x1: Passed as the first argument to the function.
+ * @x2: Passed as the second argument to the function.
+ * @x3: Passed as the third argument to the function.
+ *
+ * The called function must preserve the contents of register x18.
+ */
+
+#define HVC_CALL_FUNC 3
+
 #define BOOT_CPU_MODE_EL1	(0xe11)
 #define BOOT_CPU_MODE_EL2	(0xe12)
 
diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S
index 017ab519..e8febe9 100644
--- a/arch/arm64/kernel/hyp-stub.S
+++ b/arch/arm64/kernel/hyp-stub.S
@@ -67,8 +67,19 @@ el1_sync:
 	b	2f
 
 1:	cmp	x18, #HVC_SET_VECTORS
-	b.ne	2f
+	b.ne	1f
 	msr	vbar_el2, x0
+	b	2f
+
+1:	cmp	x18, #HVC_CALL_FUNC
+	b.ne	2f
+	mov	x18, lr
+	mov	lr, x0
+	mov	x0, x1
+	mov	x1, x2
+	mov	x2, x3
+	blr	lr
+	mov	lr, x18
 
 2:	eret
 ENDPROC(el1_sync)
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 178+ messages in thread

* [PATCH v12 07/16] Revert "arm64: remove dead code"
  2015-11-24 22:25 ` Geoff Levand
@ 2015-11-24 22:25   ` Geoff Levand
  -1 siblings, 0 replies; 178+ messages in thread
From: Geoff Levand @ 2015-11-24 22:25 UTC (permalink / raw)
  To: linux-arm-kernel

This reverts commit b08d4640a3dca68670fc5af2fe9205b395a02388.

Add back the setup_mm_for_reboot() needed for kexec.

Signed-off-by: Geoff Levand <geoff@infradead.org>
---
 arch/arm64/include/asm/mmu.h |  1 +
 arch/arm64/mm/mmu.c          | 11 +++++++++++
 2 files changed, 12 insertions(+)

diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h
index 990124a..6326d11 100644
--- a/arch/arm64/include/asm/mmu.h
+++ b/arch/arm64/include/asm/mmu.h
@@ -29,6 +29,7 @@ typedef struct {
 #define ASID(mm)	((mm)->context.id.counter & 0xffff)
 
 extern void paging_init(void);
+extern void setup_mm_for_reboot(void);
 extern void __iomem *early_io_map(phys_addr_t phys, unsigned long virt);
 extern void init_mem_pgprot(void);
 extern void create_pgd_mapping(struct mm_struct *mm, phys_addr_t phys,
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index abb66f8..c5bce3c 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -515,6 +515,17 @@ void __init paging_init(void)
 }
 
 /*
+ * Enable the identity mapping to allow the MMU disabling.
+ */
+void setup_mm_for_reboot(void)
+{
+	cpu_set_reserved_ttbr0();
+	flush_tlb_all();
+	cpu_set_idmap_tcr_t0sz();
+	cpu_switch_mm(idmap_pg_dir, &init_mm);
+}
+
+/*
  * Check whether a kernel address is valid (derived from arch/x86/).
  */
 int kern_addr_valid(unsigned long addr)
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 178+ messages in thread

* [PATCH v12 01/16] arm64: Fold proc-macros.S into assembler.h
  2015-11-24 22:25 ` Geoff Levand
@ 2015-11-24 22:25   ` Geoff Levand
  -1 siblings, 0 replies; 178+ messages in thread
From: Geoff Levand @ 2015-11-24 22:25 UTC (permalink / raw)
  To: linux-arm-kernel

To allow the assembler macros defined in arch/arm64/mm/proc-macros.S to be used
outside the mm code move the contents of proc-macros.S to asm/assembler.h.  Also,
delete proc-macros.S, and fix up all references to proc-macros.S.

Signed-off-by: Geoff Levand <geoff@infradead.org>
---
 arch/arm64/include/asm/assembler.h | 48 +++++++++++++++++++++++++++-
 arch/arm64/kvm/hyp-init.S          |  1 -
 arch/arm64/mm/cache.S              |  2 --
 arch/arm64/mm/proc-macros.S        | 64 --------------------------------------
 arch/arm64/mm/proc.S               |  3 --
 5 files changed, 47 insertions(+), 71 deletions(-)
 delete mode 100644 arch/arm64/mm/proc-macros.S

diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index 12eff92..21979a4 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -1,5 +1,5 @@
 /*
- * Based on arch/arm/include/asm/assembler.h
+ * Based on arch/arm/include/asm/assembler.h, arch/arm/mm/proc-macros.S
  *
  * Copyright (C) 1996-2000 Russell King
  * Copyright (C) 2012 ARM Ltd.
@@ -23,6 +23,8 @@
 #ifndef __ASM_ASSEMBLER_H
 #define __ASM_ASSEMBLER_H
 
+#include <asm/asm-offsets.h>
+#include <asm/pgtable-hwdef.h>
 #include <asm/ptrace.h>
 #include <asm/thread_info.h>
 
@@ -194,6 +196,50 @@ lr	.req	x30		// link register
 	.endm
 
 /*
+ * vma_vm_mm - get mm pointer from vma pointer (vma->vm_mm)
+ */
+	.macro	vma_vm_mm, rd, rn
+	ldr	\rd, [\rn, #VMA_VM_MM]
+	.endm
+
+/*
+ * mmid - get context id from mm pointer (mm->context.id)
+ */
+	.macro	mmid, rd, rn
+	ldr	\rd, [\rn, #MM_CONTEXT_ID]
+	.endm
+
+/*
+ * dcache_line_size - get the minimum D-cache line size from the CTR register.
+ */
+	.macro	dcache_line_size, reg, tmp
+	mrs	\tmp, ctr_el0			// read CTR
+	ubfm	\tmp, \tmp, #16, #19		// cache line size encoding
+	mov	\reg, #4			// bytes per word
+	lsl	\reg, \reg, \tmp		// actual cache line size
+	.endm
+
+/*
+ * icache_line_size - get the minimum I-cache line size from the CTR register.
+ */
+	.macro	icache_line_size, reg, tmp
+	mrs	\tmp, ctr_el0			// read CTR
+	and	\tmp, \tmp, #0xf		// cache line size encoding
+	mov	\reg, #4			// bytes per word
+	lsl	\reg, \reg, \tmp		// actual cache line size
+	.endm
+
+/*
+ * tcr_set_idmap_t0sz - update TCR.T0SZ so that we can load the ID map
+ */
+	.macro	tcr_set_idmap_t0sz, valreg, tmpreg
+#ifndef CONFIG_ARM64_VA_BITS_48
+	ldr_l	\tmpreg, idmap_t0sz
+	bfi	\valreg, \tmpreg, #TCR_T0SZ_OFFSET, #TCR_TxSZ_WIDTH
+#endif
+	.endm
+
+/*
  * Annotate a function as position independent, i.e., safe to be called before
  * the kernel virtual mapping is activated.
  */
diff --git a/arch/arm64/kvm/hyp-init.S b/arch/arm64/kvm/hyp-init.S
index 178ba22..2e67a48 100644
--- a/arch/arm64/kvm/hyp-init.S
+++ b/arch/arm64/kvm/hyp-init.S
@@ -20,7 +20,6 @@
 #include <asm/assembler.h>
 #include <asm/kvm_arm.h>
 #include <asm/kvm_mmu.h>
-#include <asm/pgtable-hwdef.h>
 
 	.text
 	.pushsection	.hyp.idmap.text, "ax"
diff --git a/arch/arm64/mm/cache.S b/arch/arm64/mm/cache.S
index cfa44a6..f49041d 100644
--- a/arch/arm64/mm/cache.S
+++ b/arch/arm64/mm/cache.S
@@ -24,8 +24,6 @@
 #include <asm/cpufeature.h>
 #include <asm/alternative.h>
 
-#include "proc-macros.S"
-
 /*
  *	flush_icache_range(start,end)
  *
diff --git a/arch/arm64/mm/proc-macros.S b/arch/arm64/mm/proc-macros.S
deleted file mode 100644
index 4c4d93c..0000000
--- a/arch/arm64/mm/proc-macros.S
+++ /dev/null
@@ -1,64 +0,0 @@
-/*
- * Based on arch/arm/mm/proc-macros.S
- *
- * Copyright (C) 2012 ARM Ltd.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with this program.  If not, see <http://www.gnu.org/licenses/>.
- */
-
-#include <asm/asm-offsets.h>
-#include <asm/thread_info.h>
-
-/*
- * vma_vm_mm - get mm pointer from vma pointer (vma->vm_mm)
- */
-	.macro	vma_vm_mm, rd, rn
-	ldr	\rd, [\rn, #VMA_VM_MM]
-	.endm
-
-/*
- * mmid - get context id from mm pointer (mm->context.id)
- */
-	.macro	mmid, rd, rn
-	ldr	\rd, [\rn, #MM_CONTEXT_ID]
-	.endm
-
-/*
- * dcache_line_size - get the minimum D-cache line size from the CTR register.
- */
-	.macro	dcache_line_size, reg, tmp
-	mrs	\tmp, ctr_el0			// read CTR
-	ubfm	\tmp, \tmp, #16, #19		// cache line size encoding
-	mov	\reg, #4			// bytes per word
-	lsl	\reg, \reg, \tmp		// actual cache line size
-	.endm
-
-/*
- * icache_line_size - get the minimum I-cache line size from the CTR register.
- */
-	.macro	icache_line_size, reg, tmp
-	mrs	\tmp, ctr_el0			// read CTR
-	and	\tmp, \tmp, #0xf		// cache line size encoding
-	mov	\reg, #4			// bytes per word
-	lsl	\reg, \reg, \tmp		// actual cache line size
-	.endm
-
-/*
- * tcr_set_idmap_t0sz - update TCR.T0SZ so that we can load the ID map
- */
-	.macro	tcr_set_idmap_t0sz, valreg, tmpreg
-#ifndef CONFIG_ARM64_VA_BITS_48
-	ldr_l	\tmpreg, idmap_t0sz
-	bfi	\valreg, \tmpreg, #TCR_T0SZ_OFFSET, #TCR_TxSZ_WIDTH
-#endif
-	.endm
diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
index cacecc4..7ab3a90 100644
--- a/arch/arm64/mm/proc.S
+++ b/arch/arm64/mm/proc.S
@@ -23,11 +23,8 @@
 #include <asm/assembler.h>
 #include <asm/asm-offsets.h>
 #include <asm/hwcap.h>
-#include <asm/pgtable-hwdef.h>
 #include <asm/pgtable.h>
 
-#include "proc-macros.S"
-
 #ifdef CONFIG_ARM64_64K_PAGES
 #define TCR_TG_FLAGS	TCR_TG0_64K | TCR_TG1_64K
 #elif defined(CONFIG_ARM64_16K_PAGES)
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 178+ messages in thread

* [PATCH v12 06/16] Revert "arm64: mm: remove unused cpu_set_idmap_tcr_t0sz function"
  2015-11-24 22:25 ` Geoff Levand
@ 2015-11-24 22:25   ` Geoff Levand
  -1 siblings, 0 replies; 178+ messages in thread
From: Geoff Levand @ 2015-11-24 22:25 UTC (permalink / raw)
  To: linux-arm-kernel

This reverts commit c51e97d89e526368eb697f87cd4d391b9e19f369.

Add back the cpu_set_idmap_tcr_t0sz function needed by setup_mmu_for_reboot.
---
 arch/arm64/include/asm/mmu_context.h | 35 +++++++++++++++++++++++------------
 1 file changed, 23 insertions(+), 12 deletions(-)

diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h
index 2416578..7567030 100644
--- a/arch/arm64/include/asm/mmu_context.h
+++ b/arch/arm64/include/asm/mmu_context.h
@@ -70,23 +70,34 @@ static inline bool __cpu_uses_extended_idmap(void)
 		unlikely(idmap_t0sz != TCR_T0SZ(VA_BITS)));
 }
 
+static inline void __cpu_set_tcr_t0sz(u64 t0sz)
+{
+	unsigned long tcr;
+
+	if (__cpu_uses_extended_idmap())
+		asm volatile (
+		"	mrs	%0, tcr_el1	;"
+		"	bfi	%0, %1, %2, %3	;"
+		"	msr	tcr_el1, %0	;"
+		"	isb"
+		: "=&r" (tcr)
+		: "r"(t0sz), "I"(TCR_T0SZ_OFFSET), "I"(TCR_TxSZ_WIDTH));
+}
+
+/*
+ * Set TCR.T0SZ to the value appropriate for activating the identity map.
+ */
+static inline void cpu_set_idmap_tcr_t0sz(void)
+{
+	__cpu_set_tcr_t0sz(idmap_t0sz);
+}
+
 /*
  * Set TCR.T0SZ to its default value (based on VA_BITS)
  */
 static inline void cpu_set_default_tcr_t0sz(void)
 {
-	unsigned long tcr;
-
-	if (!__cpu_uses_extended_idmap())
-		return;
-
-	asm volatile (
-	"	mrs	%0, tcr_el1	;"
-	"	bfi	%0, %1, %2, %3	;"
-	"	msr	tcr_el1, %0	;"
-	"	isb"
-	: "=&r" (tcr)
-	: "r"(TCR_T0SZ(VA_BITS)), "I"(TCR_T0SZ_OFFSET), "I"(TCR_TxSZ_WIDTH));
+	__cpu_set_tcr_t0sz(TCR_T0SZ(VA_BITS));
 }
 
 /*
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 178+ messages in thread

* [PATCH v12 05/16] arm64: Add back cpu_reset routines
  2015-11-24 22:25 ` Geoff Levand
@ 2015-11-24 22:25   ` Geoff Levand
  -1 siblings, 0 replies; 178+ messages in thread
From: Geoff Levand @ 2015-11-24 22:25 UTC (permalink / raw)
  To: linux-arm-kernel

Commit 68234df4ea7939f98431aa81113fbdce10c4a84b (arm64: kill flush_cache_all())
removed the global arm64 routines cpu_reset() and cpu_soft_restart() needed by
the arm64 kexec and kdump support.  Add simplified versions of those two
routines back with some changes needed for kexec in the new files cpu_reset.S,
and cpu_reset.h.

When a CPU is reset it needs to be put into the exception level it had
when it entered the kernel. Update cpu_reset() to accept an argument
which signals if the reset address needs to be entered at EL1 or EL2.

Signed-off-by: Geoff Levand <geoff@infradead.org>
---
 arch/arm64/kernel/cpu-reset.S | 84 +++++++++++++++++++++++++++++++++++++++++++
 arch/arm64/kernel/cpu-reset.h | 22 ++++++++++++
 2 files changed, 106 insertions(+)
 create mode 100644 arch/arm64/kernel/cpu-reset.S
 create mode 100644 arch/arm64/kernel/cpu-reset.h

diff --git a/arch/arm64/kernel/cpu-reset.S b/arch/arm64/kernel/cpu-reset.S
new file mode 100644
index 0000000..0423f27
--- /dev/null
+++ b/arch/arm64/kernel/cpu-reset.S
@@ -0,0 +1,84 @@
+/*
+ * cpu reset routines
+ *
+ * Copyright (C) 2001 Deep Blue Solutions Ltd.
+ * Copyright (C) 2012 ARM Ltd.
+ * Copyright (C) 2015 Huawei Futurewei Technologies.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/linkage.h>
+#include <asm/assembler.h>
+#include <asm/virt.h>
+
+#define SCTLR_ELx_I	(1 << 12)
+#define SCTLR_ELx_SA	(1 << 3)
+#define SCTLR_ELx_C	(1 << 2)
+#define SCTLR_ELx_A	(1 << 1)
+#define SCTLR_ELx_M	1
+#define SCTLR_ELx_FLAGS	(SCTLR_ELx_M | SCTLR_ELx_A | SCTLR_ELx_C |	\
+			 SCTLR_ELx_SA | SCTLR_ELx_I)
+
+.text
+.pushsection    .idmap.text, "ax"
+
+.align 5
+
+/*
+ * cpu_soft_restart(cpu_reset, el2_switch, entry, arg0, arg1) - Perform a cpu
+ * soft reset.
+ *
+ * @cpu_reset: Physical address of the cpu_reset routine.
+ * @el2_switch: Flag to indicate a swich to EL2 is needed, passed to cpu_reset.
+ * @entry: Location to jump to for soft reset, passed to cpu_reset.
+ * arg0: First argument passed to @entry.
+ * arg1: Second argument passed to @entry.
+ */
+
+ENTRY(cpu_soft_restart)
+	mov	x18, x0				// cpu_reset
+	mov	x0, x1				// el2_switch
+	mov	x1, x2				// entry
+	mov	x2, x3				// arg0
+	mov	x3, x4				// arg1
+	ret	x18
+ENDPROC(cpu_soft_restart)
+
+/*
+ * cpu_reset(el2_switch, entry, arg0, arg1) - Helper for cpu_soft_restart.
+ *
+ * @el2_switch: Flag to indicate a swich to EL2 is needed.
+ * @entry: Location to jump to for soft reset.
+ * arg0: First argument passed to @entry.
+ * arg1: Second argument passed to @entry.
+ *
+ * Put the CPU into the same state as it would be if it had been reset, and
+ * branch to what would be the reset vector. It must be executed with the
+ * flat identity mapping.
+ */
+
+ENTRY(cpu_reset)
+	/* Clear sctlr_el1 flags. */
+	mrs	x12, sctlr_el1
+	ldr	x13, =SCTLR_ELx_FLAGS
+	bic	x12, x12, x13
+	msr	sctlr_el1, x12
+	isb
+
+	cbz	x0, 1f				// el2_switch?
+
+	mov	x0, x1				// entry
+	mov	x1, x2				// arg0
+	mov	x2, x3				// arg1
+	hvc	#HVC_CALL_FUNC			// no return
+
+1:	mov	x18, x1				// entry
+	mov	x0, x2				// arg0
+	mov	x1, x3				// arg1
+	ret	x18
+ENDPROC(cpu_reset)
+
+.popsection
diff --git a/arch/arm64/kernel/cpu-reset.h b/arch/arm64/kernel/cpu-reset.h
new file mode 100644
index 0000000..79816b6
--- /dev/null
+++ b/arch/arm64/kernel/cpu-reset.h
@@ -0,0 +1,22 @@
+/*
+ * cpu reset routines
+ *
+ * Copyright (C) 2015 Huawei Futurewei Technologies.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#if !defined(_ARM64_CPU_RESET_H)
+#define _ARM64_CPU_RESET_H
+
+#include <asm/virt.h>
+
+void __attribute__((noreturn)) cpu_reset(unsigned long el2_switch,
+	unsigned long entry, unsigned long arg0, unsigned long arg1);
+void __attribute__((noreturn)) cpu_soft_restart(phys_addr_t cpu_reset,
+	unsigned long el2_switch, unsigned long entry, unsigned long arg0,
+	unsigned long arg1);
+
+#endif
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 178+ messages in thread

* [PATCH v12 00/16] arm64 kexec kernel patches v12
@ 2015-11-24 22:25 ` Geoff Levand
  0 siblings, 0 replies; 178+ messages in thread
From: Geoff Levand @ 2015-11-24 22:25 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon
  Cc: Mark Rutland, marc.zyngier, kexec, AKASHI Takahiro,
	christoffer.dall, linux-arm-kernel

Hi All,

This series adds the core support for kexec re-boot and kdump on ARM64.  This
version of the series combines Takahiro's kdump patches with my kexec patches.

Changes since v11:

o No changes, rebase to v4.4-rc2.

Changes since v10:

o Move the flush of the new image from arm64_relocate_new_kernel to machine_kexec.
o Pass values to arm64_relocate_new_kernel in registers, not in global variables.
o Fixups to setting the sctlr_el1 and sctlr_el2 flags.

To load a second stage kernel and execute a kexec re-boot or to work with kdump
on ARM64 systems a series of patches to kexec-tools [2], which have not yet been
merged upstream, are needed.

I have tested kexec with the ARM Foundation model, and Takahiro has reported
that kdump is working on the 96boards HiKey developer board.  Kexec on EFI
systems works correctly.  More ACPI + kexec testing is needed.

Patch 1 here moves the macros from proc-macros.S to asm/assembler.h so that the
dcache_line_size macro it defines can be uesd by kexec's relocate kernel
routine.

Patches 2 & 3 rework the ARM64 hcall mechanism to give the CPU reset routines
the ability to switch exception levels from EL1 to EL2 for kernels that were
entered in EL2.

Patch 4 allows KVM to handle a CPU reset.

Patches 5 - 7 add back the ARM64 CPU reset support that was recently removed
from the kernel.

Patches 8 - 10 add kexec support.

Patches 11-16 add kdump support.

Please consider all patches for inclusion.

[1]  https://git.kernel.org/cgit/linux/kernel/git/geoff/linux-kexec.git
[2]  https://git.kernel.org/cgit/linux/kernel/git/geoff/kexec-tools.git

-Geoff

The following changes since commit 1ec218373b8ebda821aec00bb156a9c94fad9cd4:

  Linux 4.4-rc2 (2015-11-22 16:45:59 -0800)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/geoff/linux-kexec.git kexec-v12

for you to fetch changes up to 83db66df70fc1f7cd4da8b580d264b9320991cf0:

  arm64: kdump: relax BUG_ON() if more than one cpus are still active (2015-11-24 13:59:27 -0800)

----------------------------------------------------------------
AKASHI Takahiro (7):
      arm64: kvm: allows kvm cpu hotplug
      arm64: kdump: reserve memory for crash dump kernel
      arm64: kdump: implement machine_crash_shutdown()
      arm64: kdump: add kdump support
      arm64: kdump: update a kernel doc
      arm64: kdump: enable kdump in the arm64 defconfig
      arm64: kdump: relax BUG_ON() if more than one cpus are still active

Geoff Levand (9):
      arm64: Fold proc-macros.S into assembler.h
      arm64: Convert hcalls to use HVC immediate value
      arm64: Add new hcall HVC_CALL_FUNC
      arm64: Add back cpu_reset routines
      Revert "arm64: mm: remove unused cpu_set_idmap_tcr_t0sz function"
      Revert "arm64: remove dead code"
      arm64/kexec: Add core kexec support
      arm64/kexec: Add pr_devel output
      arm64/kexec: Enable kexec in the arm64 defconfig

 Documentation/kdump/kdump.txt        |  32 ++++-
 arch/arm/include/asm/kvm_host.h      |  10 +-
 arch/arm/include/asm/kvm_mmu.h       |   1 +
 arch/arm/kvm/arm.c                   |  79 ++++++-----
 arch/arm/kvm/mmu.c                   |   5 +
 arch/arm64/Kconfig                   |  22 ++++
 arch/arm64/configs/defconfig         |   2 +
 arch/arm64/include/asm/assembler.h   |  48 ++++++-
 arch/arm64/include/asm/kexec.h       |  80 +++++++++++
 arch/arm64/include/asm/kvm_host.h    |  16 ++-
 arch/arm64/include/asm/kvm_mmu.h     |   1 +
 arch/arm64/include/asm/mmu.h         |   1 +
 arch/arm64/include/asm/mmu_context.h |  35 +++--
 arch/arm64/include/asm/virt.h        |  49 +++++++
 arch/arm64/kernel/Makefile           |   3 +
 arch/arm64/kernel/cpu-reset.S        |  84 ++++++++++++
 arch/arm64/kernel/cpu-reset.h        |  22 ++++
 arch/arm64/kernel/crash_dump.c       |  71 ++++++++++
 arch/arm64/kernel/hyp-stub.S         |  43 ++++--
 arch/arm64/kernel/machine_kexec.c    | 248 +++++++++++++++++++++++++++++++++++
 arch/arm64/kernel/relocate_kernel.S  | 131 ++++++++++++++++++
 arch/arm64/kernel/setup.c            |   7 +-
 arch/arm64/kernel/smp.c              |  16 ++-
 arch/arm64/kvm/hyp-init.S            |  34 ++++-
 arch/arm64/kvm/hyp.S                 |  44 +++++--
 arch/arm64/mm/cache.S                |   2 -
 arch/arm64/mm/init.c                 |  83 ++++++++++++
 arch/arm64/mm/mmu.c                  |  11 ++
 arch/arm64/mm/proc-macros.S          |  64 ---------
 arch/arm64/mm/proc.S                 |   3 -
 include/uapi/linux/kexec.h           |   1 +
 31 files changed, 1097 insertions(+), 151 deletions(-)
 create mode 100644 arch/arm64/include/asm/kexec.h
 create mode 100644 arch/arm64/kernel/cpu-reset.S
 create mode 100644 arch/arm64/kernel/cpu-reset.h
 create mode 100644 arch/arm64/kernel/crash_dump.c
 create mode 100644 arch/arm64/kernel/machine_kexec.c
 create mode 100644 arch/arm64/kernel/relocate_kernel.S
 delete mode 100644 arch/arm64/mm/proc-macros.S

-- 
2.5.0


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 02/16] arm64: Convert hcalls to use HVC immediate value
@ 2015-11-24 22:25   ` Geoff Levand
  0 siblings, 0 replies; 178+ messages in thread
From: Geoff Levand @ 2015-11-24 22:25 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon
  Cc: Mark Rutland, marc.zyngier, kexec, AKASHI Takahiro,
	christoffer.dall, linux-arm-kernel

The existing arm64 hcall implementations are limited in that they only allow
for two distinct hcalls; with the x0 register either zero or not zero.  Also,
the API of the hyp-stub exception vector routines and the KVM exception vector
routines differ; hyp-stub uses a non-zero value in x0 to implement
__hyp_set_vectors, whereas KVM uses it to implement kvm_call_hyp.

To allow for additional hcalls to be defined and to make the arm64 hcall API
more consistent across exception vector routines, change the hcall
implementations to use the 16 bit immediate value of the HVC instruction to
specify the hcall type.

Define three new preprocessor macros HVC_CALL_HYP, HVC_GET_VECTORS, and
HVC_SET_VECTORS to be used as hcall type specifiers and convert the
existing __hyp_get_vectors(), __hyp_set_vectors() and kvm_call_hyp() routines
to use these new macros when executing an HVC call.  Also, change the
corresponding hyp-stub and KVM el1_sync exception vector routines to use these
new macros.

Signed-off-by: Geoff Levand <geoff@infradead.org>
---
 arch/arm64/include/asm/virt.h | 27 +++++++++++++++++++++++++++
 arch/arm64/kernel/hyp-stub.S  | 32 +++++++++++++++++++++-----------
 arch/arm64/kvm/hyp.S          | 16 +++++++++-------
 3 files changed, 57 insertions(+), 18 deletions(-)

diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h
index 7a5df52..eb10368 100644
--- a/arch/arm64/include/asm/virt.h
+++ b/arch/arm64/include/asm/virt.h
@@ -18,6 +18,33 @@
 #ifndef __ASM__VIRT_H
 #define __ASM__VIRT_H
 
+/*
+ * The arm64 hcall implementation uses the ISS field of the ESR_EL2 register to
+ * specify the hcall type.  The exception handlers are allowed to use registers
+ * x17 and x18 in their implementation.  Any routine issuing an hcall must not
+ * expect these registers to be preserved.
+ */
+
+/*
+ * HVC_CALL_HYP - Execute a hyp routine.
+ */
+
+#define HVC_CALL_HYP 0
+
+/*
+ * HVC_GET_VECTORS - Return the value of the vbar_el2 register.
+ */
+
+#define HVC_GET_VECTORS 1
+
+/*
+ * HVC_SET_VECTORS - Set the value of the vbar_el2 register.
+ *
+ * @x0: Physical address of the new vector table.
+ */
+
+#define HVC_SET_VECTORS 2
+
 #define BOOT_CPU_MODE_EL1	(0xe11)
 #define BOOT_CPU_MODE_EL2	(0xe12)
 
diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S
index a272f33..017ab519 100644
--- a/arch/arm64/kernel/hyp-stub.S
+++ b/arch/arm64/kernel/hyp-stub.S
@@ -22,6 +22,7 @@
 #include <linux/irqchip/arm-gic-v3.h>
 
 #include <asm/assembler.h>
+#include <asm/kvm_arm.h>
 #include <asm/ptrace.h>
 #include <asm/virt.h>
 
@@ -53,14 +54,22 @@ ENDPROC(__hyp_stub_vectors)
 	.align 11
 
 el1_sync:
-	mrs	x1, esr_el2
-	lsr	x1, x1, #26
-	cmp	x1, #0x16
+	mrs	x18, esr_el2
+	lsr	x17, x18, #ESR_ELx_EC_SHIFT
+	and	x18, x18, #ESR_ELx_ISS_MASK
+
+	cmp	x17, #ESR_ELx_EC_HVC64
 	b.ne	2f				// Not an HVC trap
-	cbz	x0, 1f
-	msr	vbar_el2, x0			// Set vbar_el2
+
+	cmp	x18, #HVC_GET_VECTORS
+	b.ne	1f
+	mrs	x0, vbar_el2
 	b	2f
-1:	mrs	x0, vbar_el2			// Return vbar_el2
+
+1:	cmp	x18, #HVC_SET_VECTORS
+	b.ne	2f
+	msr	vbar_el2, x0
+
 2:	eret
 ENDPROC(el1_sync)
 
@@ -100,11 +109,12 @@ ENDPROC(\label)
  * initialisation entry point.
  */
 
-ENTRY(__hyp_get_vectors)
-	mov	x0, xzr
-	// fall through
 ENTRY(__hyp_set_vectors)
-	hvc	#0
+	hvc	#HVC_SET_VECTORS
 	ret
-ENDPROC(__hyp_get_vectors)
 ENDPROC(__hyp_set_vectors)
+
+ENTRY(__hyp_get_vectors)
+	hvc	#HVC_GET_VECTORS
+	ret
+ENDPROC(__hyp_get_vectors)
diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
index 1599701..1bef8db 100644
--- a/arch/arm64/kvm/hyp.S
+++ b/arch/arm64/kvm/hyp.S
@@ -29,6 +29,7 @@
 #include <asm/kvm_asm.h>
 #include <asm/kvm_mmu.h>
 #include <asm/memory.h>
+#include <asm/virt.h>
 
 #define CPU_GP_REG_OFFSET(x)	(CPU_GP_REGS + x)
 #define CPU_XREG_OFFSET(x)	CPU_GP_REG_OFFSET(CPU_USER_PT_REGS + 8*x)
@@ -932,12 +933,9 @@ __hyp_panic_str:
  * in Hyp mode (see init_hyp_mode in arch/arm/kvm/arm.c).  Return values are
  * passed in r0 and r1.
  *
- * A function pointer with a value of 0 has a special meaning, and is
- * used to implement __hyp_get_vectors in the same way as in
- * arch/arm64/kernel/hyp_stub.S.
  */
 ENTRY(kvm_call_hyp)
-	hvc	#0
+	hvc	#HVC_CALL_HYP
 	ret
 ENDPROC(kvm_call_hyp)
 
@@ -968,6 +966,7 @@ el1_sync:					// Guest trapped into EL2
 
 	mrs	x1, esr_el2
 	lsr	x2, x1, #ESR_ELx_EC_SHIFT
+	and	x0, x1, #ESR_ELx_ISS_MASK
 
 	cmp	x2, #ESR_ELx_EC_HVC64
 	b.ne	el1_trap
@@ -976,15 +975,18 @@ el1_sync:					// Guest trapped into EL2
 	cbnz	x3, el1_trap			// called HVC
 
 	/* Here, we're pretty sure the host called HVC. */
+	mov	x18, x0
 	pop	x2, x3
 	pop	x0, x1
 
-	/* Check for __hyp_get_vectors */
-	cbnz	x0, 1f
+	cmp	x18, #HVC_GET_VECTORS
+	b.ne	1f
 	mrs	x0, vbar_el2
 	b	2f
 
-1:	push	lr, xzr
+1:	/* Default to HVC_CALL_HYP. */
+
+	push	lr, xzr
 
 	/*
 	 * Compute the function address in EL2, and shuffle the parameters.
-- 
2.5.0



_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply related	[flat|nested] 178+ messages in thread

* [PATCH v12 04/16] arm64: kvm: allows kvm cpu hotplug
@ 2015-11-24 22:25   ` Geoff Levand
  0 siblings, 0 replies; 178+ messages in thread
From: Geoff Levand @ 2015-11-24 22:25 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon
  Cc: Mark Rutland, marc.zyngier, kexec, AKASHI Takahiro,
	christoffer.dall, linux-arm-kernel

From: AKASHI Takahiro <takahiro.akashi@linaro.org>

The current kvm implementation on arm64 does cpu-specific initialization
at system boot, and has no way to gracefully shutdown a core in terms of
kvm. This prevents, especially, kexec from rebooting the system on a boot
core in EL2.

This patch adds a cpu tear-down function and also puts an existing cpu-init
code into a separate function, kvm_arch_hardware_disable() and
kvm_arch_hardware_enable() respectively.
We don't need arm64-specific cpu hotplug hook any more.

Since this patch modifies common part of code between arm and arm64, one
stub definition, __cpu_reset_hyp_mode(), is added on arm side to avoid
compiling errors.

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
---
 arch/arm/include/asm/kvm_host.h   | 10 ++++-
 arch/arm/include/asm/kvm_mmu.h    |  1 +
 arch/arm/kvm/arm.c                | 79 ++++++++++++++++++---------------------
 arch/arm/kvm/mmu.c                |  5 +++
 arch/arm64/include/asm/kvm_host.h | 16 +++++++-
 arch/arm64/include/asm/kvm_mmu.h  |  1 +
 arch/arm64/include/asm/virt.h     |  9 +++++
 arch/arm64/kvm/hyp-init.S         | 33 ++++++++++++++++
 arch/arm64/kvm/hyp.S              | 32 ++++++++++++++--
 9 files changed, 138 insertions(+), 48 deletions(-)

diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index 6692982..9242765 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -214,6 +214,15 @@ static inline void __cpu_init_hyp_mode(phys_addr_t boot_pgd_ptr,
 	kvm_call_hyp((void*)hyp_stack_ptr, vector_ptr, pgd_ptr);
 }
 
+static inline void __cpu_reset_hyp_mode(phys_addr_t boot_pgd_ptr,
+					phys_addr_t phys_idmap_start)
+{
+	/*
+	 * TODO
+	 * kvm_call_reset(boot_pgd_ptr, phys_idmap_start);
+	 */
+}
+
 static inline int kvm_arch_dev_ioctl_check_extension(long ext)
 {
 	return 0;
@@ -226,7 +235,6 @@ void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot);
 
 struct kvm_vcpu *kvm_mpidr_to_vcpu(struct kvm *kvm, unsigned long mpidr);
 
-static inline void kvm_arch_hardware_disable(void) {}
 static inline void kvm_arch_hardware_unsetup(void) {}
 static inline void kvm_arch_sync_events(struct kvm *kvm) {}
 static inline void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index 405aa18..dc6fadf 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -66,6 +66,7 @@ void kvm_mmu_free_memory_caches(struct kvm_vcpu *vcpu);
 phys_addr_t kvm_mmu_get_httbr(void);
 phys_addr_t kvm_mmu_get_boot_httbr(void);
 phys_addr_t kvm_get_idmap_vector(void);
+phys_addr_t kvm_get_idmap_start(void);
 int kvm_mmu_init(void);
 void kvm_clear_hyp_idmap(void);
 
diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index eab83b2..a5d9d74 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -16,7 +16,6 @@
  * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
  */
 
-#include <linux/cpu.h>
 #include <linux/cpu_pm.h>
 #include <linux/errno.h>
 #include <linux/err.h>
@@ -61,6 +60,8 @@ static atomic64_t kvm_vmid_gen = ATOMIC64_INIT(1);
 static u8 kvm_next_vmid;
 static DEFINE_SPINLOCK(kvm_vmid_lock);
 
+static DEFINE_PER_CPU(unsigned char, kvm_arm_hardware_enabled);
+
 static void kvm_arm_set_running_vcpu(struct kvm_vcpu *vcpu)
 {
 	BUG_ON(preemptible());
@@ -85,11 +86,6 @@ struct kvm_vcpu * __percpu *kvm_get_running_vcpus(void)
 	return &kvm_arm_running_vcpu;
 }
 
-int kvm_arch_hardware_enable(void)
-{
-	return 0;
-}
-
 int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu)
 {
 	return kvm_vcpu_exiting_guest_mode(vcpu) == IN_GUEST_MODE;
@@ -959,7 +955,7 @@ long kvm_arch_vm_ioctl(struct file *filp,
 	}
 }
 
-static void cpu_init_hyp_mode(void *dummy)
+int kvm_arch_hardware_enable(void)
 {
 	phys_addr_t boot_pgd_ptr;
 	phys_addr_t pgd_ptr;
@@ -967,6 +963,9 @@ static void cpu_init_hyp_mode(void *dummy)
 	unsigned long stack_page;
 	unsigned long vector_ptr;
 
+	if (__hyp_get_vectors() != hyp_default_vectors)
+		return 0;
+
 	/* Switch from the HYP stub to our own HYP init vector */
 	__hyp_set_vectors(kvm_get_idmap_vector());
 
@@ -979,38 +978,50 @@ static void cpu_init_hyp_mode(void *dummy)
 	__cpu_init_hyp_mode(boot_pgd_ptr, pgd_ptr, hyp_stack_ptr, vector_ptr);
 
 	kvm_arm_init_debug();
+
+	return 0;
 }
 
-static int hyp_init_cpu_notify(struct notifier_block *self,
-			       unsigned long action, void *cpu)
+void kvm_arch_hardware_disable(void)
 {
-	switch (action) {
-	case CPU_STARTING:
-	case CPU_STARTING_FROZEN:
-		if (__hyp_get_vectors() == hyp_default_vectors)
-			cpu_init_hyp_mode(NULL);
-		break;
-	}
+	phys_addr_t boot_pgd_ptr;
+	phys_addr_t phys_idmap_start;
 
-	return NOTIFY_OK;
-}
+	if (__hyp_get_vectors() == hyp_default_vectors)
+		return;
 
-static struct notifier_block hyp_init_cpu_nb = {
-	.notifier_call = hyp_init_cpu_notify,
-};
+	boot_pgd_ptr = kvm_mmu_get_boot_httbr();
+	phys_idmap_start = kvm_get_idmap_start();
+
+	__cpu_reset_hyp_mode(boot_pgd_ptr, phys_idmap_start);
+}
 
 #ifdef CONFIG_CPU_PM
 static int hyp_init_cpu_pm_notifier(struct notifier_block *self,
 				    unsigned long cmd,
 				    void *v)
 {
-	if (cmd == CPU_PM_EXIT &&
-	    __hyp_get_vectors() == hyp_default_vectors) {
-		cpu_init_hyp_mode(NULL);
+	switch (cmd) {
+	case CPU_PM_ENTER:
+		if (__hyp_get_vectors() != hyp_default_vectors)
+			__this_cpu_write(kvm_arm_hardware_enabled, 1);
+		else
+			__this_cpu_write(kvm_arm_hardware_enabled, 0);
+		/*
+		 * don't call kvm_arch_hardware_disable() in case of
+		 * CPU_PM_ENTER because it does't actually save any state.
+		 */
+
+		return NOTIFY_OK;
+	case CPU_PM_EXIT:
+		if (__this_cpu_read(kvm_arm_hardware_enabled))
+			kvm_arch_hardware_enable();
+
 		return NOTIFY_OK;
-	}
 
-	return NOTIFY_DONE;
+	default:
+		return NOTIFY_DONE;
+	}
 }
 
 static struct notifier_block hyp_init_cpu_pm_nb = {
@@ -1108,11 +1119,6 @@ static int init_hyp_mode(void)
 	}
 
 	/*
-	 * Execute the init code on each CPU.
-	 */
-	on_each_cpu(cpu_init_hyp_mode, NULL, 1);
-
-	/*
 	 * Init HYP view of VGIC
 	 */
 	err = kvm_vgic_hyp_init();
@@ -1186,26 +1192,15 @@ int kvm_arch_init(void *opaque)
 		}
 	}
 
-	cpu_notifier_register_begin();
-
 	err = init_hyp_mode();
 	if (err)
 		goto out_err;
 
-	err = __register_cpu_notifier(&hyp_init_cpu_nb);
-	if (err) {
-		kvm_err("Cannot register HYP init CPU notifier (%d)\n", err);
-		goto out_err;
-	}
-
-	cpu_notifier_register_done();
-
 	hyp_cpu_pm_init();
 
 	kvm_coproc_table_init();
 	return 0;
 out_err:
-	cpu_notifier_register_done();
 	return err;
 }
 
diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
index 6984342..69b4a33 100644
--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -1644,6 +1644,11 @@ phys_addr_t kvm_get_idmap_vector(void)
 	return hyp_idmap_vector;
 }
 
+phys_addr_t kvm_get_idmap_start(void)
+{
+	return hyp_idmap_start;
+}
+
 int kvm_mmu_init(void)
 {
 	int err;
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index a35ce72..0b540f8 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -223,6 +223,7 @@ struct kvm_vcpu *kvm_arm_get_running_vcpu(void);
 struct kvm_vcpu * __percpu *kvm_get_running_vcpus(void);
 
 u64 kvm_call_hyp(void *hypfn, ...);
+void kvm_call_reset(phys_addr_t boot_pgd_ptr, phys_addr_t phys_idmap_start);
 void force_vm_exit(const cpumask_t *mask);
 void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot);
 
@@ -247,7 +248,20 @@ static inline void __cpu_init_hyp_mode(phys_addr_t boot_pgd_ptr,
 		     hyp_stack_ptr, vector_ptr);
 }
 
-static inline void kvm_arch_hardware_disable(void) {}
+static inline void __cpu_reset_hyp_mode(phys_addr_t boot_pgd_ptr,
+					phys_addr_t phys_idmap_start)
+{
+	/*
+	 * Call reset code, and switch back to stub hyp vectors.
+	 */
+	kvm_call_reset(boot_pgd_ptr, phys_idmap_start);
+}
+
+struct vgic_sr_vectors {
+	void	*save_vgic;
+	void	*restore_vgic;
+};
+
 static inline void kvm_arch_hardware_unsetup(void) {}
 static inline void kvm_arch_sync_events(struct kvm *kvm) {}
 static inline void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 6150567..ff5a087 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -98,6 +98,7 @@ void kvm_mmu_free_memory_caches(struct kvm_vcpu *vcpu);
 phys_addr_t kvm_mmu_get_httbr(void);
 phys_addr_t kvm_mmu_get_boot_httbr(void);
 phys_addr_t kvm_get_idmap_vector(void);
+phys_addr_t kvm_get_idmap_start(void);
 int kvm_mmu_init(void);
 void kvm_clear_hyp_idmap(void);
 
diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h
index 3070096..bca79f9 100644
--- a/arch/arm64/include/asm/virt.h
+++ b/arch/arm64/include/asm/virt.h
@@ -58,9 +58,18 @@
 
 #define HVC_CALL_FUNC 3
 
+/*
+ * HVC_RESET_CPU - Reset cpu in EL2 to initial state.
+ *
+ * @x0: entry address in trampoline code in va
+ * @x1: identical mapping page table in pa
+ */
+
 #define BOOT_CPU_MODE_EL1	(0xe11)
 #define BOOT_CPU_MODE_EL2	(0xe12)
 
+#define HVC_RESET_CPU 4
+
 #ifndef __ASSEMBLY__
 
 /*
diff --git a/arch/arm64/kvm/hyp-init.S b/arch/arm64/kvm/hyp-init.S
index 2e67a48..1925163 100644
--- a/arch/arm64/kvm/hyp-init.S
+++ b/arch/arm64/kvm/hyp-init.S
@@ -139,6 +139,39 @@ merged:
 	eret
 ENDPROC(__kvm_hyp_init)
 
+	/*
+	 * x0: HYP boot pgd
+	 * x1: HYP phys_idmap_start
+	 */
+ENTRY(__kvm_hyp_reset)
+	/* We're in trampoline code in VA, switch back to boot page tables */
+	msr	ttbr0_el2, x0
+	isb
+
+	/* Invalidate the old TLBs */
+	tlbi	alle2
+	dsb	sy
+
+	/* Branch into PA space */
+	adr	x0, 1f
+	bfi	x1, x0, #0, #PAGE_SHIFT
+	br	x1
+
+	/* We're now in idmap, disable MMU */
+1:	mrs	x0, sctlr_el2
+	ldr	x1, =SCTLR_EL2_FLAGS
+	bic	x0, x0, x1		// Clear SCTL_M and etc
+	msr	sctlr_el2, x0
+	isb
+
+	/* Install stub vectors */
+	adrp	x0, __hyp_stub_vectors
+	add	x0, x0, #:lo12:__hyp_stub_vectors
+	msr	vbar_el2, x0
+
+	eret
+ENDPROC(__kvm_hyp_reset)
+
 	.ltorg
 
 	.popsection
diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
index 1bef8db..aca11d6 100644
--- a/arch/arm64/kvm/hyp.S
+++ b/arch/arm64/kvm/hyp.S
@@ -939,6 +939,11 @@ ENTRY(kvm_call_hyp)
 	ret
 ENDPROC(kvm_call_hyp)
 
+ENTRY(kvm_call_reset)
+	hvc	#HVC_RESET_CPU
+	ret
+ENDPROC(kvm_call_reset)
+
 .macro invalid_vector	label, target
 	.align	2
 \label:
@@ -982,10 +987,27 @@ el1_sync:					// Guest trapped into EL2
 	cmp	x18, #HVC_GET_VECTORS
 	b.ne	1f
 	mrs	x0, vbar_el2
-	b	2f
-
-1:	/* Default to HVC_CALL_HYP. */
+	b	do_eret
 
+	/* jump into trampoline code */
+1:	cmp	x18, #HVC_RESET_CPU
+	b.ne	2f
+	/*
+	 * Entry point is:
+	 *	TRAMPOLINE_VA
+	 *	+ (__kvm_hyp_reset - (__hyp_idmap_text_start & PAGE_MASK))
+	 */
+	adrp	x2, __kvm_hyp_reset
+	add	x2, x2, #:lo12:__kvm_hyp_reset
+	adrp	x3, __hyp_idmap_text_start
+	add	x3, x3, #:lo12:__hyp_idmap_text_start
+	and	x3, x3, PAGE_MASK
+	sub	x2, x2, x3
+	ldr	x3, =TRAMPOLINE_VA
+	add	x2, x2, x3
+	br	x2				// no return
+
+2:	/* Default to HVC_CALL_HYP. */
 	push	lr, xzr
 
 	/*
@@ -999,7 +1021,9 @@ el1_sync:					// Guest trapped into EL2
 	blr	lr
 
 	pop	lr, xzr
-2:	eret
+
+do_eret:
+	eret
 
 el1_trap:
 	/*
-- 
2.5.0



_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply related	[flat|nested] 178+ messages in thread

* [PATCH v12 03/16] arm64: Add new hcall HVC_CALL_FUNC
@ 2015-11-24 22:25   ` Geoff Levand
  0 siblings, 0 replies; 178+ messages in thread
From: Geoff Levand @ 2015-11-24 22:25 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon
  Cc: Mark Rutland, marc.zyngier, kexec, AKASHI Takahiro,
	christoffer.dall, linux-arm-kernel

Add the new hcall HVC_CALL_FUNC that allows execution of a function at EL2.
During CPU reset the CPU must be brought to the exception level it had on
entry to the kernel.  The HVC_CALL_FUNC hcall will provide the mechanism
needed for this exception level switch.

To allow the HVC_CALL_FUNC exception vector to work without a stack, which is
needed to support an hcall at CPU reset, this implementation uses register x18
to store the link register across the caller provided function.  This dictates
that the caller provided function must preserve the contents of register x18.

Signed-off-by: Geoff Levand <geoff@infradead.org>
---
 arch/arm64/include/asm/virt.h | 13 +++++++++++++
 arch/arm64/kernel/hyp-stub.S  | 13 ++++++++++++-
 2 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h
index eb10368..3070096 100644
--- a/arch/arm64/include/asm/virt.h
+++ b/arch/arm64/include/asm/virt.h
@@ -45,6 +45,19 @@
 
 #define HVC_SET_VECTORS 2
 
+/*
+ * HVC_CALL_FUNC - Execute a function at EL2.
+ *
+ * @x0: Physical address of the function to be executed.
+ * @x1: Passed as the first argument to the function.
+ * @x2: Passed as the second argument to the function.
+ * @x3: Passed as the third argument to the function.
+ *
+ * The called function must preserve the contents of register x18.
+ */
+
+#define HVC_CALL_FUNC 3
+
 #define BOOT_CPU_MODE_EL1	(0xe11)
 #define BOOT_CPU_MODE_EL2	(0xe12)
 
diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S
index 017ab519..e8febe9 100644
--- a/arch/arm64/kernel/hyp-stub.S
+++ b/arch/arm64/kernel/hyp-stub.S
@@ -67,8 +67,19 @@ el1_sync:
 	b	2f
 
 1:	cmp	x18, #HVC_SET_VECTORS
-	b.ne	2f
+	b.ne	1f
 	msr	vbar_el2, x0
+	b	2f
+
+1:	cmp	x18, #HVC_CALL_FUNC
+	b.ne	2f
+	mov	x18, lr
+	mov	lr, x0
+	mov	x0, x1
+	mov	x1, x2
+	mov	x2, x3
+	blr	lr
+	mov	lr, x18
 
 2:	eret
 ENDPROC(el1_sync)
-- 
2.5.0



_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply related	[flat|nested] 178+ messages in thread

* [PATCH v12 07/16] Revert "arm64: remove dead code"
@ 2015-11-24 22:25   ` Geoff Levand
  0 siblings, 0 replies; 178+ messages in thread
From: Geoff Levand @ 2015-11-24 22:25 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon
  Cc: Mark Rutland, marc.zyngier, kexec, AKASHI Takahiro,
	christoffer.dall, linux-arm-kernel

This reverts commit b08d4640a3dca68670fc5af2fe9205b395a02388.

Add back the setup_mm_for_reboot() needed for kexec.

Signed-off-by: Geoff Levand <geoff@infradead.org>
---
 arch/arm64/include/asm/mmu.h |  1 +
 arch/arm64/mm/mmu.c          | 11 +++++++++++
 2 files changed, 12 insertions(+)

diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h
index 990124a..6326d11 100644
--- a/arch/arm64/include/asm/mmu.h
+++ b/arch/arm64/include/asm/mmu.h
@@ -29,6 +29,7 @@ typedef struct {
 #define ASID(mm)	((mm)->context.id.counter & 0xffff)
 
 extern void paging_init(void);
+extern void setup_mm_for_reboot(void);
 extern void __iomem *early_io_map(phys_addr_t phys, unsigned long virt);
 extern void init_mem_pgprot(void);
 extern void create_pgd_mapping(struct mm_struct *mm, phys_addr_t phys,
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index abb66f8..c5bce3c 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -515,6 +515,17 @@ void __init paging_init(void)
 }
 
 /*
+ * Enable the identity mapping to allow the MMU disabling.
+ */
+void setup_mm_for_reboot(void)
+{
+	cpu_set_reserved_ttbr0();
+	flush_tlb_all();
+	cpu_set_idmap_tcr_t0sz();
+	cpu_switch_mm(idmap_pg_dir, &init_mm);
+}
+
+/*
  * Check whether a kernel address is valid (derived from arch/x86/).
  */
 int kern_addr_valid(unsigned long addr)
-- 
2.5.0



_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply related	[flat|nested] 178+ messages in thread

* [PATCH v12 01/16] arm64: Fold proc-macros.S into assembler.h
@ 2015-11-24 22:25   ` Geoff Levand
  0 siblings, 0 replies; 178+ messages in thread
From: Geoff Levand @ 2015-11-24 22:25 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon
  Cc: Mark Rutland, marc.zyngier, kexec, AKASHI Takahiro,
	christoffer.dall, linux-arm-kernel

To allow the assembler macros defined in arch/arm64/mm/proc-macros.S to be used
outside the mm code move the contents of proc-macros.S to asm/assembler.h.  Also,
delete proc-macros.S, and fix up all references to proc-macros.S.

Signed-off-by: Geoff Levand <geoff@infradead.org>
---
 arch/arm64/include/asm/assembler.h | 48 +++++++++++++++++++++++++++-
 arch/arm64/kvm/hyp-init.S          |  1 -
 arch/arm64/mm/cache.S              |  2 --
 arch/arm64/mm/proc-macros.S        | 64 --------------------------------------
 arch/arm64/mm/proc.S               |  3 --
 5 files changed, 47 insertions(+), 71 deletions(-)
 delete mode 100644 arch/arm64/mm/proc-macros.S

diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index 12eff92..21979a4 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -1,5 +1,5 @@
 /*
- * Based on arch/arm/include/asm/assembler.h
+ * Based on arch/arm/include/asm/assembler.h, arch/arm/mm/proc-macros.S
  *
  * Copyright (C) 1996-2000 Russell King
  * Copyright (C) 2012 ARM Ltd.
@@ -23,6 +23,8 @@
 #ifndef __ASM_ASSEMBLER_H
 #define __ASM_ASSEMBLER_H
 
+#include <asm/asm-offsets.h>
+#include <asm/pgtable-hwdef.h>
 #include <asm/ptrace.h>
 #include <asm/thread_info.h>
 
@@ -194,6 +196,50 @@ lr	.req	x30		// link register
 	.endm
 
 /*
+ * vma_vm_mm - get mm pointer from vma pointer (vma->vm_mm)
+ */
+	.macro	vma_vm_mm, rd, rn
+	ldr	\rd, [\rn, #VMA_VM_MM]
+	.endm
+
+/*
+ * mmid - get context id from mm pointer (mm->context.id)
+ */
+	.macro	mmid, rd, rn
+	ldr	\rd, [\rn, #MM_CONTEXT_ID]
+	.endm
+
+/*
+ * dcache_line_size - get the minimum D-cache line size from the CTR register.
+ */
+	.macro	dcache_line_size, reg, tmp
+	mrs	\tmp, ctr_el0			// read CTR
+	ubfm	\tmp, \tmp, #16, #19		// cache line size encoding
+	mov	\reg, #4			// bytes per word
+	lsl	\reg, \reg, \tmp		// actual cache line size
+	.endm
+
+/*
+ * icache_line_size - get the minimum I-cache line size from the CTR register.
+ */
+	.macro	icache_line_size, reg, tmp
+	mrs	\tmp, ctr_el0			// read CTR
+	and	\tmp, \tmp, #0xf		// cache line size encoding
+	mov	\reg, #4			// bytes per word
+	lsl	\reg, \reg, \tmp		// actual cache line size
+	.endm
+
+/*
+ * tcr_set_idmap_t0sz - update TCR.T0SZ so that we can load the ID map
+ */
+	.macro	tcr_set_idmap_t0sz, valreg, tmpreg
+#ifndef CONFIG_ARM64_VA_BITS_48
+	ldr_l	\tmpreg, idmap_t0sz
+	bfi	\valreg, \tmpreg, #TCR_T0SZ_OFFSET, #TCR_TxSZ_WIDTH
+#endif
+	.endm
+
+/*
  * Annotate a function as position independent, i.e., safe to be called before
  * the kernel virtual mapping is activated.
  */
diff --git a/arch/arm64/kvm/hyp-init.S b/arch/arm64/kvm/hyp-init.S
index 178ba22..2e67a48 100644
--- a/arch/arm64/kvm/hyp-init.S
+++ b/arch/arm64/kvm/hyp-init.S
@@ -20,7 +20,6 @@
 #include <asm/assembler.h>
 #include <asm/kvm_arm.h>
 #include <asm/kvm_mmu.h>
-#include <asm/pgtable-hwdef.h>
 
 	.text
 	.pushsection	.hyp.idmap.text, "ax"
diff --git a/arch/arm64/mm/cache.S b/arch/arm64/mm/cache.S
index cfa44a6..f49041d 100644
--- a/arch/arm64/mm/cache.S
+++ b/arch/arm64/mm/cache.S
@@ -24,8 +24,6 @@
 #include <asm/cpufeature.h>
 #include <asm/alternative.h>
 
-#include "proc-macros.S"
-
 /*
  *	flush_icache_range(start,end)
  *
diff --git a/arch/arm64/mm/proc-macros.S b/arch/arm64/mm/proc-macros.S
deleted file mode 100644
index 4c4d93c..0000000
--- a/arch/arm64/mm/proc-macros.S
+++ /dev/null
@@ -1,64 +0,0 @@
-/*
- * Based on arch/arm/mm/proc-macros.S
- *
- * Copyright (C) 2012 ARM Ltd.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with this program.  If not, see <http://www.gnu.org/licenses/>.
- */
-
-#include <asm/asm-offsets.h>
-#include <asm/thread_info.h>
-
-/*
- * vma_vm_mm - get mm pointer from vma pointer (vma->vm_mm)
- */
-	.macro	vma_vm_mm, rd, rn
-	ldr	\rd, [\rn, #VMA_VM_MM]
-	.endm
-
-/*
- * mmid - get context id from mm pointer (mm->context.id)
- */
-	.macro	mmid, rd, rn
-	ldr	\rd, [\rn, #MM_CONTEXT_ID]
-	.endm
-
-/*
- * dcache_line_size - get the minimum D-cache line size from the CTR register.
- */
-	.macro	dcache_line_size, reg, tmp
-	mrs	\tmp, ctr_el0			// read CTR
-	ubfm	\tmp, \tmp, #16, #19		// cache line size encoding
-	mov	\reg, #4			// bytes per word
-	lsl	\reg, \reg, \tmp		// actual cache line size
-	.endm
-
-/*
- * icache_line_size - get the minimum I-cache line size from the CTR register.
- */
-	.macro	icache_line_size, reg, tmp
-	mrs	\tmp, ctr_el0			// read CTR
-	and	\tmp, \tmp, #0xf		// cache line size encoding
-	mov	\reg, #4			// bytes per word
-	lsl	\reg, \reg, \tmp		// actual cache line size
-	.endm
-
-/*
- * tcr_set_idmap_t0sz - update TCR.T0SZ so that we can load the ID map
- */
-	.macro	tcr_set_idmap_t0sz, valreg, tmpreg
-#ifndef CONFIG_ARM64_VA_BITS_48
-	ldr_l	\tmpreg, idmap_t0sz
-	bfi	\valreg, \tmpreg, #TCR_T0SZ_OFFSET, #TCR_TxSZ_WIDTH
-#endif
-	.endm
diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
index cacecc4..7ab3a90 100644
--- a/arch/arm64/mm/proc.S
+++ b/arch/arm64/mm/proc.S
@@ -23,11 +23,8 @@
 #include <asm/assembler.h>
 #include <asm/asm-offsets.h>
 #include <asm/hwcap.h>
-#include <asm/pgtable-hwdef.h>
 #include <asm/pgtable.h>
 
-#include "proc-macros.S"
-
 #ifdef CONFIG_ARM64_64K_PAGES
 #define TCR_TG_FLAGS	TCR_TG0_64K | TCR_TG1_64K
 #elif defined(CONFIG_ARM64_16K_PAGES)
-- 
2.5.0



_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply related	[flat|nested] 178+ messages in thread

* [PATCH v12 06/16] Revert "arm64: mm: remove unused cpu_set_idmap_tcr_t0sz function"
@ 2015-11-24 22:25   ` Geoff Levand
  0 siblings, 0 replies; 178+ messages in thread
From: Geoff Levand @ 2015-11-24 22:25 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon
  Cc: Mark Rutland, marc.zyngier, kexec, AKASHI Takahiro,
	christoffer.dall, linux-arm-kernel

This reverts commit c51e97d89e526368eb697f87cd4d391b9e19f369.

Add back the cpu_set_idmap_tcr_t0sz function needed by setup_mmu_for_reboot.
---
 arch/arm64/include/asm/mmu_context.h | 35 +++++++++++++++++++++++------------
 1 file changed, 23 insertions(+), 12 deletions(-)

diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h
index 2416578..7567030 100644
--- a/arch/arm64/include/asm/mmu_context.h
+++ b/arch/arm64/include/asm/mmu_context.h
@@ -70,23 +70,34 @@ static inline bool __cpu_uses_extended_idmap(void)
 		unlikely(idmap_t0sz != TCR_T0SZ(VA_BITS)));
 }
 
+static inline void __cpu_set_tcr_t0sz(u64 t0sz)
+{
+	unsigned long tcr;
+
+	if (__cpu_uses_extended_idmap())
+		asm volatile (
+		"	mrs	%0, tcr_el1	;"
+		"	bfi	%0, %1, %2, %3	;"
+		"	msr	tcr_el1, %0	;"
+		"	isb"
+		: "=&r" (tcr)
+		: "r"(t0sz), "I"(TCR_T0SZ_OFFSET), "I"(TCR_TxSZ_WIDTH));
+}
+
+/*
+ * Set TCR.T0SZ to the value appropriate for activating the identity map.
+ */
+static inline void cpu_set_idmap_tcr_t0sz(void)
+{
+	__cpu_set_tcr_t0sz(idmap_t0sz);
+}
+
 /*
  * Set TCR.T0SZ to its default value (based on VA_BITS)
  */
 static inline void cpu_set_default_tcr_t0sz(void)
 {
-	unsigned long tcr;
-
-	if (!__cpu_uses_extended_idmap())
-		return;
-
-	asm volatile (
-	"	mrs	%0, tcr_el1	;"
-	"	bfi	%0, %1, %2, %3	;"
-	"	msr	tcr_el1, %0	;"
-	"	isb"
-	: "=&r" (tcr)
-	: "r"(TCR_T0SZ(VA_BITS)), "I"(TCR_T0SZ_OFFSET), "I"(TCR_TxSZ_WIDTH));
+	__cpu_set_tcr_t0sz(TCR_T0SZ(VA_BITS));
 }
 
 /*
-- 
2.5.0



_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply related	[flat|nested] 178+ messages in thread

* [PATCH v12 05/16] arm64: Add back cpu_reset routines
@ 2015-11-24 22:25   ` Geoff Levand
  0 siblings, 0 replies; 178+ messages in thread
From: Geoff Levand @ 2015-11-24 22:25 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon
  Cc: Mark Rutland, marc.zyngier, kexec, AKASHI Takahiro,
	christoffer.dall, linux-arm-kernel

Commit 68234df4ea7939f98431aa81113fbdce10c4a84b (arm64: kill flush_cache_all())
removed the global arm64 routines cpu_reset() and cpu_soft_restart() needed by
the arm64 kexec and kdump support.  Add simplified versions of those two
routines back with some changes needed for kexec in the new files cpu_reset.S,
and cpu_reset.h.

When a CPU is reset it needs to be put into the exception level it had
when it entered the kernel. Update cpu_reset() to accept an argument
which signals if the reset address needs to be entered at EL1 or EL2.

Signed-off-by: Geoff Levand <geoff@infradead.org>
---
 arch/arm64/kernel/cpu-reset.S | 84 +++++++++++++++++++++++++++++++++++++++++++
 arch/arm64/kernel/cpu-reset.h | 22 ++++++++++++
 2 files changed, 106 insertions(+)
 create mode 100644 arch/arm64/kernel/cpu-reset.S
 create mode 100644 arch/arm64/kernel/cpu-reset.h

diff --git a/arch/arm64/kernel/cpu-reset.S b/arch/arm64/kernel/cpu-reset.S
new file mode 100644
index 0000000..0423f27
--- /dev/null
+++ b/arch/arm64/kernel/cpu-reset.S
@@ -0,0 +1,84 @@
+/*
+ * cpu reset routines
+ *
+ * Copyright (C) 2001 Deep Blue Solutions Ltd.
+ * Copyright (C) 2012 ARM Ltd.
+ * Copyright (C) 2015 Huawei Futurewei Technologies.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/linkage.h>
+#include <asm/assembler.h>
+#include <asm/virt.h>
+
+#define SCTLR_ELx_I	(1 << 12)
+#define SCTLR_ELx_SA	(1 << 3)
+#define SCTLR_ELx_C	(1 << 2)
+#define SCTLR_ELx_A	(1 << 1)
+#define SCTLR_ELx_M	1
+#define SCTLR_ELx_FLAGS	(SCTLR_ELx_M | SCTLR_ELx_A | SCTLR_ELx_C |	\
+			 SCTLR_ELx_SA | SCTLR_ELx_I)
+
+.text
+.pushsection    .idmap.text, "ax"
+
+.align 5
+
+/*
+ * cpu_soft_restart(cpu_reset, el2_switch, entry, arg0, arg1) - Perform a cpu
+ * soft reset.
+ *
+ * @cpu_reset: Physical address of the cpu_reset routine.
+ * @el2_switch: Flag to indicate a swich to EL2 is needed, passed to cpu_reset.
+ * @entry: Location to jump to for soft reset, passed to cpu_reset.
+ * arg0: First argument passed to @entry.
+ * arg1: Second argument passed to @entry.
+ */
+
+ENTRY(cpu_soft_restart)
+	mov	x18, x0				// cpu_reset
+	mov	x0, x1				// el2_switch
+	mov	x1, x2				// entry
+	mov	x2, x3				// arg0
+	mov	x3, x4				// arg1
+	ret	x18
+ENDPROC(cpu_soft_restart)
+
+/*
+ * cpu_reset(el2_switch, entry, arg0, arg1) - Helper for cpu_soft_restart.
+ *
+ * @el2_switch: Flag to indicate a swich to EL2 is needed.
+ * @entry: Location to jump to for soft reset.
+ * arg0: First argument passed to @entry.
+ * arg1: Second argument passed to @entry.
+ *
+ * Put the CPU into the same state as it would be if it had been reset, and
+ * branch to what would be the reset vector. It must be executed with the
+ * flat identity mapping.
+ */
+
+ENTRY(cpu_reset)
+	/* Clear sctlr_el1 flags. */
+	mrs	x12, sctlr_el1
+	ldr	x13, =SCTLR_ELx_FLAGS
+	bic	x12, x12, x13
+	msr	sctlr_el1, x12
+	isb
+
+	cbz	x0, 1f				// el2_switch?
+
+	mov	x0, x1				// entry
+	mov	x1, x2				// arg0
+	mov	x2, x3				// arg1
+	hvc	#HVC_CALL_FUNC			// no return
+
+1:	mov	x18, x1				// entry
+	mov	x0, x2				// arg0
+	mov	x1, x3				// arg1
+	ret	x18
+ENDPROC(cpu_reset)
+
+.popsection
diff --git a/arch/arm64/kernel/cpu-reset.h b/arch/arm64/kernel/cpu-reset.h
new file mode 100644
index 0000000..79816b6
--- /dev/null
+++ b/arch/arm64/kernel/cpu-reset.h
@@ -0,0 +1,22 @@
+/*
+ * cpu reset routines
+ *
+ * Copyright (C) 2015 Huawei Futurewei Technologies.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#if !defined(_ARM64_CPU_RESET_H)
+#define _ARM64_CPU_RESET_H
+
+#include <asm/virt.h>
+
+void __attribute__((noreturn)) cpu_reset(unsigned long el2_switch,
+	unsigned long entry, unsigned long arg0, unsigned long arg1);
+void __attribute__((noreturn)) cpu_soft_restart(phys_addr_t cpu_reset,
+	unsigned long el2_switch, unsigned long entry, unsigned long arg0,
+	unsigned long arg1);
+
+#endif
-- 
2.5.0



_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply related	[flat|nested] 178+ messages in thread

* [PATCH v12 08/16] arm64/kexec: Add core kexec support
  2015-11-24 22:25 ` Geoff Levand
@ 2015-11-24 22:25   ` Geoff Levand
  -1 siblings, 0 replies; 178+ messages in thread
From: Geoff Levand @ 2015-11-24 22:25 UTC (permalink / raw)
  To: linux-arm-kernel

Add three new files, kexec.h, machine_kexec.c and relocate_kernel.S to the
arm64 architecture that add support for the kexec re-boot mechanism
(CONFIG_KEXEC) on arm64 platforms.

Signed-off-by: Geoff Levand <geoff@infradead.org>
---
 arch/arm64/Kconfig                  |  10 +++
 arch/arm64/include/asm/kexec.h      |  48 ++++++++++++
 arch/arm64/kernel/Makefile          |   2 +
 arch/arm64/kernel/machine_kexec.c   | 152 ++++++++++++++++++++++++++++++++++++
 arch/arm64/kernel/relocate_kernel.S | 131 +++++++++++++++++++++++++++++++
 include/uapi/linux/kexec.h          |   1 +
 6 files changed, 344 insertions(+)
 create mode 100644 arch/arm64/include/asm/kexec.h
 create mode 100644 arch/arm64/kernel/machine_kexec.c
 create mode 100644 arch/arm64/kernel/relocate_kernel.S

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 9ac16a4..c23fd77 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -535,6 +535,16 @@ config SECCOMP
 	  and the task is only allowed to execute a few safe syscalls
 	  defined by each seccomp mode.
 
+config KEXEC
+	depends on PM_SLEEP_SMP
+	select KEXEC_CORE
+	bool "kexec system call"
+	---help---
+	  kexec is a system call that implements the ability to shutdown your
+	  current kernel, and to start another kernel.  It is like a reboot
+	  but it is independent of the system firmware.   And like a reboot
+	  you can start any kernel with it, not just Linux.
+
 config XEN_DOM0
 	def_bool y
 	depends on XEN
diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
new file mode 100644
index 0000000..46d63cd
--- /dev/null
+++ b/arch/arm64/include/asm/kexec.h
@@ -0,0 +1,48 @@
+/*
+ * kexec for arm64
+ *
+ * Copyright (C) Linaro.
+ * Copyright (C) Huawei Futurewei Technologies.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#if !defined(_ARM64_KEXEC_H)
+#define _ARM64_KEXEC_H
+
+/* Maximum physical address we can use pages from */
+
+#define KEXEC_SOURCE_MEMORY_LIMIT (-1UL)
+
+/* Maximum address we can reach in physical address mode */
+
+#define KEXEC_DESTINATION_MEMORY_LIMIT (-1UL)
+
+/* Maximum address we can use for the control code buffer */
+
+#define KEXEC_CONTROL_MEMORY_LIMIT (-1UL)
+
+#define KEXEC_CONTROL_PAGE_SIZE	4096
+
+#define KEXEC_ARCH KEXEC_ARCH_ARM64
+
+#if !defined(__ASSEMBLY__)
+
+/**
+ * crash_setup_regs() - save registers for the panic kernel
+ *
+ * @newregs: registers are saved here
+ * @oldregs: registers to be saved (may be %NULL)
+ */
+
+static inline void crash_setup_regs(struct pt_regs *newregs,
+				    struct pt_regs *oldregs)
+{
+	/* Empty routine needed to avoid build errors. */
+}
+
+#endif /* !defined(__ASSEMBLY__) */
+
+#endif
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 474691f..f68420d 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -41,6 +41,8 @@ arm64-obj-$(CONFIG_EFI)			+= efi.o efi-entry.stub.o
 arm64-obj-$(CONFIG_PCI)			+= pci.o
 arm64-obj-$(CONFIG_ARMV8_DEPRECATED)	+= armv8_deprecated.o
 arm64-obj-$(CONFIG_ACPI)		+= acpi.o
+arm64-obj-$(CONFIG_KEXEC)		+= machine_kexec.o relocate_kernel.o	\
+					   cpu-reset.o
 
 obj-y					+= $(arm64-obj-y) vdso/
 obj-m					+= $(arm64-obj-m)
diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
new file mode 100644
index 0000000..8b990b8
--- /dev/null
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -0,0 +1,152 @@
+/*
+ * kexec for arm64
+ *
+ * Copyright (C) Linaro.
+ * Copyright (C) Huawei Futurewei Technologies.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/kexec.h>
+#include <linux/of_fdt.h>
+#include <linux/slab.h>
+#include <linux/uaccess.h>
+
+#include <asm/cacheflush.h>
+#include <asm/system_misc.h>
+
+#include "cpu-reset.h"
+
+/* Global variables for the arm64_relocate_new_kernel routine. */
+extern const unsigned char arm64_relocate_new_kernel[];
+extern const unsigned long arm64_relocate_new_kernel_size;
+
+static unsigned long kimage_start;
+
+void machine_kexec_cleanup(struct kimage *kimage)
+{
+	/* Empty routine needed to avoid build errors. */
+}
+
+/**
+ * machine_kexec_prepare - Prepare for a kexec reboot.
+ *
+ * Called from the core kexec code when a kernel image is loaded.
+ */
+int machine_kexec_prepare(struct kimage *kimage)
+{
+	kimage_start = kimage->start;
+	return 0;
+}
+
+/**
+ * kexec_list_flush - Helper to flush the kimage list to PoC.
+ */
+static void kexec_list_flush(unsigned long kimage_head)
+{
+	unsigned long *entry;
+
+	for (entry = &kimage_head; ; entry++) {
+		unsigned int flag = *entry & IND_FLAGS;
+		void *addr = phys_to_virt(*entry & PAGE_MASK);
+
+		switch (flag) {
+		case IND_INDIRECTION:
+			entry = (unsigned long *)addr - 1;
+			__flush_dcache_area(addr, PAGE_SIZE);
+			break;
+		case IND_DESTINATION:
+			break;
+		case IND_SOURCE:
+			__flush_dcache_area(addr, PAGE_SIZE);
+			break;
+		case IND_DONE:
+			return;
+		default:
+			BUG();
+		}
+	}
+}
+
+/**
+ * kexec_segment_flush - Helper to flush the kimage segments to PoC.
+ */
+static void kexec_segment_flush(const struct kimage *kimage)
+{
+	unsigned long i;
+
+	pr_devel("%s:\n", __func__);
+
+	for (i = 0; i < kimage->nr_segments; i++) {
+		pr_devel("  segment[%lu]: %016lx - %016lx, %lx bytes, %lu pages\n",
+			i,
+			kimage->segment[i].mem,
+			kimage->segment[i].mem + kimage->segment[i].memsz,
+			kimage->segment[i].memsz,
+			kimage->segment[i].memsz /  PAGE_SIZE);
+
+		__flush_dcache_area(phys_to_virt(kimage->segment[i].mem),
+			kimage->segment[i].memsz);
+	}
+}
+
+/**
+ * machine_kexec - Do the kexec reboot.
+ *
+ * Called from the core kexec code for a sys_reboot with LINUX_REBOOT_CMD_KEXEC.
+ */
+void machine_kexec(struct kimage *kimage)
+{
+	phys_addr_t reboot_code_buffer_phys;
+	void *reboot_code_buffer;
+
+	BUG_ON(num_online_cpus() > 1);
+
+	reboot_code_buffer_phys = page_to_phys(kimage->control_code_page);
+	reboot_code_buffer = phys_to_virt(reboot_code_buffer_phys);
+
+	/*
+	 * Copy arm64_relocate_new_kernel to the reboot_code_buffer for use
+	 * after the kernel is shut down.
+	 */
+	memcpy(reboot_code_buffer, arm64_relocate_new_kernel,
+		arm64_relocate_new_kernel_size);
+
+	/* Flush the reboot_code_buffer in preparation for its execution. */
+	__flush_dcache_area(reboot_code_buffer, arm64_relocate_new_kernel_size);
+
+	/* Flush the new image. */
+	kexec_segment_flush(kimage);
+
+	/* Flush the kimage list. */
+	kexec_list_flush(kimage->head);
+
+	pr_info("Bye!\n");
+
+	/* Disable all DAIF exceptions. */
+	asm volatile ("msr daifset, #0xf" : : : "memory");
+
+	setup_mm_for_reboot();
+
+	/*
+	 * cpu_soft_restart will shutdown the MMU, disable data caches, then
+	 * transfer control to the reboot_code_buffer which contains a copy of
+	 * the arm64_relocate_new_kernel routine.  arm64_relocate_new_kernel
+	 * uses physical addressing to relocate the new image to its final
+	 * position and transfers control to the image entry point when the
+	 * relocation is complete.
+	 */
+
+	cpu_soft_restart(virt_to_phys(cpu_reset),
+		is_hyp_mode_available(),
+		reboot_code_buffer_phys, kimage->head, kimage_start);
+
+	BUG(); /* Should never get here. */
+}
+
+void machine_crash_shutdown(struct pt_regs *regs)
+{
+	/* Empty routine needed to avoid build errors. */
+}
diff --git a/arch/arm64/kernel/relocate_kernel.S b/arch/arm64/kernel/relocate_kernel.S
new file mode 100644
index 0000000..71cab0e
--- /dev/null
+++ b/arch/arm64/kernel/relocate_kernel.S
@@ -0,0 +1,131 @@
+/*
+ * kexec for arm64
+ *
+ * Copyright (C) Linaro.
+ * Copyright (C) Huawei Futurewei Technologies.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/kexec.h>
+
+#include <asm/assembler.h>
+#include <asm/kexec.h>
+#include <asm/kvm_arm.h>
+#include <asm/page.h>
+
+/*
+ * arm64_relocate_new_kernel - Put a 2nd stage image in place and boot it.
+ *
+ * The memory that the old kernel occupies may be overwritten when coping the
+ * new image to its final location.  To assure that the
+ * arm64_relocate_new_kernel routine which does that copy is not overwritten,
+ * all code and data needed by arm64_relocate_new_kernel must be between the
+ * symbols arm64_relocate_new_kernel and arm64_relocate_new_kernel_end.  The
+ * machine_kexec() routine will copy arm64_relocate_new_kernel to the kexec
+ * control_code_page, a special page which has been set up to be preserved
+ * during the copy operation.
+ */
+.globl arm64_relocate_new_kernel
+arm64_relocate_new_kernel:
+
+	/* Setup the list loop variables. */
+	mov	x18, x0				/* x18 = kimage_head */
+	mov	x17, x1				/* x17 = kimage_start */
+	dcache_line_size x16, x0		/* x16 = dcache line size */
+	mov	x15, xzr			/* x15 = segment start */
+	mov	x14, xzr			/* x14 = entry ptr */
+	mov	x13, xzr			/* x13 = copy dest */
+
+	/* Clear the sctlr_el2 flags. */
+	mrs	x0, CurrentEL
+	cmp	x0, #CurrentEL_EL2
+	b.ne	1f
+	mrs	x0, sctlr_el2
+	ldr	x1, =SCTLR_EL2_FLAGS
+	bic	x0, x0, x1
+	msr	sctlr_el2, x0
+	isb
+1:
+
+	/* Check if the new image needs relocation. */
+	cbz	x18, .Ldone
+	tbnz	x18, IND_DONE_BIT, .Ldone
+
+.Lloop:
+	and	x12, x18, PAGE_MASK		/* x12 = addr */
+
+	/* Test the entry flags. */
+.Ltest_source:
+	tbz	x18, IND_SOURCE_BIT, .Ltest_indirection
+
+	mov x20, x13				/*  x20 = copy dest */
+	mov x21, x12				/*  x21 = copy src */
+
+	/* Copy page. */
+1:	ldp	x22, x23, [x21]
+	ldp	x24, x25, [x21, #16]
+	ldp	x26, x27, [x21, #32]
+	ldp	x28, x29, [x21, #48]
+	add	x21, x21, #64
+	stnp	x22, x23, [x20]
+	stnp	x24, x25, [x20, #16]
+	stnp	x26, x27, [x20, #32]
+	stnp	x28, x29, [x20, #48]
+	add	x20, x20, #64
+	tst	x21, #(PAGE_SIZE - 1)
+	b.ne	1b
+
+	/* dest += PAGE_SIZE */
+	add	x13, x13, PAGE_SIZE
+	b	.Lnext
+
+.Ltest_indirection:
+	tbz	x18, IND_INDIRECTION_BIT, .Ltest_destination
+
+	/* ptr = addr */
+	mov	x14, x12
+	b	.Lnext
+
+.Ltest_destination:
+	tbz	x18, IND_DESTINATION_BIT, .Lnext
+
+	mov	x15, x12
+
+	/* dest = addr */
+	mov	x13, x12
+
+.Lnext:
+	/* entry = *ptr++ */
+	ldr	x18, [x14], #8
+
+	/* while (!(entry & DONE)) */
+	tbz	x18, IND_DONE_BIT, .Lloop
+
+.Ldone:
+	dsb	sy
+	ic	ialluis
+	dsb	sy
+	isb
+
+	/* Start new image. */
+	mov	x0, xzr
+	mov	x1, xzr
+	mov	x2, xzr
+	mov	x3, xzr
+	br	x17
+
+.align 3	/* To keep the 64-bit values below naturally aligned. */
+
+.Lcopy_end:
+.org	KEXEC_CONTROL_PAGE_SIZE
+
+/*
+ * arm64_relocate_new_kernel_size - Number of bytes to copy to the
+ * control_code_page.
+ */
+.globl arm64_relocate_new_kernel_size
+arm64_relocate_new_kernel_size:
+	.quad	.Lcopy_end - arm64_relocate_new_kernel
diff --git a/include/uapi/linux/kexec.h b/include/uapi/linux/kexec.h
index 99048e5..ccec467 100644
--- a/include/uapi/linux/kexec.h
+++ b/include/uapi/linux/kexec.h
@@ -39,6 +39,7 @@
 #define KEXEC_ARCH_SH      (42 << 16)
 #define KEXEC_ARCH_MIPS_LE (10 << 16)
 #define KEXEC_ARCH_MIPS    ( 8 << 16)
+#define KEXEC_ARCH_ARM64   (183 << 16)
 
 /* The artificial cap on the number of segments passed to kexec_load. */
 #define KEXEC_SEGMENT_MAX 16
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 178+ messages in thread

* [PATCH v12 09/16] arm64/kexec: Add pr_devel output
  2015-11-24 22:25 ` Geoff Levand
@ 2015-11-24 22:25   ` Geoff Levand
  -1 siblings, 0 replies; 178+ messages in thread
From: Geoff Levand @ 2015-11-24 22:25 UTC (permalink / raw)
  To: linux-arm-kernel

To aid in debugging kexec problems or when adding new functionality to kexec add
a new routine kexec_image_info() and several inline pr_devel statements.

Signed-off-by: Geoff Levand <geoff@infradead.org>
---
 arch/arm64/kernel/machine_kexec.c | 63 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 63 insertions(+)

diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
index 8b990b8..da28a26 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -25,6 +25,48 @@ extern const unsigned long arm64_relocate_new_kernel_size;
 
 static unsigned long kimage_start;
 
+/**
+ * kexec_is_dtb - Helper routine to check the device tree header signature.
+ */
+static bool kexec_is_dtb(const void *dtb)
+{
+	__be32 magic;
+
+	return get_user(magic, (__be32 *)dtb) ? false :
+		(be32_to_cpu(magic) == OF_DT_HEADER);
+}
+
+/**
+ * kexec_image_info - For debugging output.
+ */
+#define kexec_image_info(_i) _kexec_image_info(__func__, __LINE__, _i)
+static void _kexec_image_info(const char *func, int line,
+	const struct kimage *kimage)
+{
+	unsigned long i;
+
+#if !defined(DEBUG)
+	return;
+#endif
+	pr_devel("%s:%d:\n", func, line);
+	pr_devel("  kexec kimage info:\n");
+	pr_devel("    type:        %d\n", kimage->type);
+	pr_devel("    start:       %lx\n", kimage->start);
+	pr_devel("    head:        %lx\n", kimage->head);
+	pr_devel("    nr_segments: %lu\n", kimage->nr_segments);
+
+	for (i = 0; i < kimage->nr_segments; i++) {
+		pr_devel("      segment[%lu]: %016lx - %016lx, %lx bytes, %lu pages%s\n",
+			i,
+			kimage->segment[i].mem,
+			kimage->segment[i].mem + kimage->segment[i].memsz,
+			kimage->segment[i].memsz,
+			kimage->segment[i].memsz /  PAGE_SIZE,
+			(kexec_is_dtb(kimage->segment[i].buf) ?
+				", dtb segment" : ""));
+	}
+}
+
 void machine_kexec_cleanup(struct kimage *kimage)
 {
 	/* Empty routine needed to avoid build errors. */
@@ -38,6 +80,8 @@ void machine_kexec_cleanup(struct kimage *kimage)
 int machine_kexec_prepare(struct kimage *kimage)
 {
 	kimage_start = kimage->start;
+	kexec_image_info(kimage);
+
 	return 0;
 }
 
@@ -107,6 +151,25 @@ void machine_kexec(struct kimage *kimage)
 	reboot_code_buffer_phys = page_to_phys(kimage->control_code_page);
 	reboot_code_buffer = phys_to_virt(reboot_code_buffer_phys);
 
+	kexec_image_info(kimage);
+
+	pr_devel("%s:%d: control_code_page:        %p\n", __func__, __LINE__,
+		kimage->control_code_page);
+	pr_devel("%s:%d: reboot_code_buffer_phys:  %pa\n", __func__, __LINE__,
+		&reboot_code_buffer_phys);
+	pr_devel("%s:%d: reboot_code_buffer:       %p\n", __func__, __LINE__,
+		reboot_code_buffer);
+	pr_devel("%s:%d: relocate_new_kernel:      %p\n", __func__, __LINE__,
+		arm64_relocate_new_kernel);
+	pr_devel("%s:%d: relocate_new_kernel_size: 0x%lx(%lu) bytes\n",
+		__func__, __LINE__, arm64_relocate_new_kernel_size,
+		arm64_relocate_new_kernel_size);
+
+	pr_devel("%s:%d: kimage_head:              %lx\n", __func__, __LINE__,
+		kimage->head);
+	pr_devel("%s:%d: kimage_start:             %lx\n", __func__, __LINE__,
+		kimage_start);
+
 	/*
 	 * Copy arm64_relocate_new_kernel to the reboot_code_buffer for use
 	 * after the kernel is shut down.
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 178+ messages in thread

* [PATCH v12 10/16] arm64/kexec: Enable kexec in the arm64 defconfig
  2015-11-24 22:25 ` Geoff Levand
@ 2015-11-24 22:25   ` Geoff Levand
  -1 siblings, 0 replies; 178+ messages in thread
From: Geoff Levand @ 2015-11-24 22:25 UTC (permalink / raw)
  To: linux-arm-kernel

Signed-off-by: Geoff Levand <geoff@infradead.org>
---
 arch/arm64/configs/defconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig
index bdd7aa3..0470fdf 100644
--- a/arch/arm64/configs/defconfig
+++ b/arch/arm64/configs/defconfig
@@ -58,6 +58,7 @@ CONFIG_PREEMPT=y
 CONFIG_KSM=y
 CONFIG_TRANSPARENT_HUGEPAGE=y
 CONFIG_CMA=y
+CONFIG_KEXEC=y
 CONFIG_CMDLINE="console=ttyAMA0"
 # CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS is not set
 CONFIG_COMPAT=y
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 178+ messages in thread

* [PATCH v12 12/16] arm64: kdump: implement machine_crash_shutdown()
  2015-11-24 22:25 ` Geoff Levand
@ 2015-11-24 22:25   ` Geoff Levand
  -1 siblings, 0 replies; 178+ messages in thread
From: Geoff Levand @ 2015-11-24 22:25 UTC (permalink / raw)
  To: linux-arm-kernel

From: AKASHI Takahiro <takahiro.akashi@linaro.org>

kdump calls machine_crash_shutdown() to shut down non-boot cpus and
save registers' status in per-cpu ELF notes before starting the crash
dump kernel. See kernel_kexec().

ipi_cpu_stop() is a bit modified and used to support this behavior.

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
---
 arch/arm64/include/asm/kexec.h    | 34 +++++++++++++++++++++++++++++++++-
 arch/arm64/kernel/machine_kexec.c | 31 +++++++++++++++++++++++++++++--
 arch/arm64/kernel/smp.c           | 16 ++++++++++++++--
 3 files changed, 76 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
index 46d63cd..555a955 100644
--- a/arch/arm64/include/asm/kexec.h
+++ b/arch/arm64/include/asm/kexec.h
@@ -30,6 +30,8 @@
 
 #if !defined(__ASSEMBLY__)
 
+extern bool in_crash_kexec;
+
 /**
  * crash_setup_regs() - save registers for the panic kernel
  *
@@ -40,7 +42,37 @@
 static inline void crash_setup_regs(struct pt_regs *newregs,
 				    struct pt_regs *oldregs)
 {
-	/* Empty routine needed to avoid build errors. */
+	if (oldregs) {
+		memcpy(newregs, oldregs, sizeof(*newregs));
+	} else {
+		__asm__ __volatile__ (
+			"stp	 x0,   x1, [%3, #16 *  0]\n"
+			"stp	 x2,   x3, [%3, #16 *  1]\n"
+			"stp	 x4,   x5, [%3, #16 *  2]\n"
+			"stp	 x6,   x7, [%3, #16 *  3]\n"
+			"stp	 x8,   x9, [%3, #16 *  4]\n"
+			"stp	x10,  x11, [%3, #16 *  5]\n"
+			"stp	x12,  x13, [%3, #16 *  6]\n"
+			"stp	x14,  x15, [%3, #16 *  7]\n"
+			"stp	x16,  x17, [%3, #16 *  8]\n"
+			"stp	x18,  x19, [%3, #16 *  9]\n"
+			"stp	x20,  x21, [%3, #16 * 10]\n"
+			"stp	x22,  x23, [%3, #16 * 11]\n"
+			"stp	x24,  x25, [%3, #16 * 12]\n"
+			"stp	x26,  x27, [%3, #16 * 13]\n"
+			"stp	x28,  x29, [%3, #16 * 14]\n"
+			"str	x30,	   [%3, #16 * 15]\n"
+			"mov	%0, sp\n"
+			"adr	%1, 1f\n"
+			"mrs	%2, spsr_el1\n"
+		"1:"
+			: "=r" (newregs->sp),
+			  "=r" (newregs->pc),
+			  "=r" (newregs->pstate)
+			: "r"  (&newregs->regs)
+			: "memory"
+		);
+	}
 }
 
 #endif /* !defined(__ASSEMBLY__) */
diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
index da28a26..d2d7e90 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -9,6 +9,7 @@
  * published by the Free Software Foundation.
  */
 
+#include <linux/kernel.h>
 #include <linux/kexec.h>
 #include <linux/of_fdt.h>
 #include <linux/slab.h>
@@ -23,6 +24,7 @@
 extern const unsigned char arm64_relocate_new_kernel[];
 extern const unsigned long arm64_relocate_new_kernel_size;
 
+bool in_crash_kexec;
 static unsigned long kimage_start;
 
 /**
@@ -203,13 +205,38 @@ void machine_kexec(struct kimage *kimage)
 	 */
 
 	cpu_soft_restart(virt_to_phys(cpu_reset),
-		is_hyp_mode_available(),
+		in_crash_kexec ? 0 : is_hyp_mode_available(),
 		reboot_code_buffer_phys, kimage->head, kimage_start);
 
 	BUG(); /* Should never get here. */
 }
 
+/**
+ * machine_crash_shutdown - shutdown non-boot cpus and save registers
+ */
 void machine_crash_shutdown(struct pt_regs *regs)
 {
-	/* Empty routine needed to avoid build errors. */
+	struct pt_regs dummy_regs;
+	int cpu;
+
+	local_irq_disable();
+
+	in_crash_kexec = true;
+
+	/*
+	 * clear and initialize the per-cpu info. This is necessary
+	 * because, otherwise, slots for offline cpus would never be
+	 * filled up. See smp_send_stop().
+	 */
+	memset(&dummy_regs, 0, sizeof(dummy_regs));
+	for_each_possible_cpu(cpu)
+		crash_save_cpu(&dummy_regs, cpu);
+
+	/* shutdown non-boot cpus */
+	smp_send_stop();
+
+	/* for boot cpu */
+	crash_save_cpu(regs, smp_processor_id());
+
+	pr_info("Starting crashdump kernel...\n");
 }
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index b1adc51..15aabef 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -37,6 +37,7 @@
 #include <linux/completion.h>
 #include <linux/of.h>
 #include <linux/irq_work.h>
+#include <linux/kexec.h>
 
 #include <asm/alternative.h>
 #include <asm/atomic.h>
@@ -54,6 +55,8 @@
 #include <asm/ptrace.h>
 #include <asm/virt.h>
 
+#include "cpu-reset.h"
+
 #define CREATE_TRACE_POINTS
 #include <trace/events/ipi.h>
 
@@ -683,8 +686,12 @@ static DEFINE_RAW_SPINLOCK(stop_lock);
 /*
  * ipi_cpu_stop - handle IPI from smp_send_stop()
  */
-static void ipi_cpu_stop(unsigned int cpu)
+static void ipi_cpu_stop(unsigned int cpu, struct pt_regs *regs)
 {
+#ifdef CONFIG_KEXEC
+	/* printing messages may slow down the shutdown. */
+	if (!in_crash_kexec)
+#endif
 	if (system_state == SYSTEM_BOOTING ||
 	    system_state == SYSTEM_RUNNING) {
 		raw_spin_lock(&stop_lock);
@@ -697,6 +704,11 @@ static void ipi_cpu_stop(unsigned int cpu)
 
 	local_irq_disable();
 
+#ifdef CONFIG_KEXEC
+	if (in_crash_kexec)
+		crash_save_cpu(regs, cpu);
+#endif /* CONFIG_KEXEC */
+
 	while (1)
 		cpu_relax();
 }
@@ -727,7 +739,7 @@ void handle_IPI(int ipinr, struct pt_regs *regs)
 
 	case IPI_CPU_STOP:
 		irq_enter();
-		ipi_cpu_stop(cpu);
+		ipi_cpu_stop(cpu, regs);
 		irq_exit();
 		break;
 
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 178+ messages in thread

* [PATCH v12 11/16] arm64: kdump: reserve memory for crash dump kernel
  2015-11-24 22:25 ` Geoff Levand
@ 2015-11-24 22:25   ` Geoff Levand
  -1 siblings, 0 replies; 178+ messages in thread
From: Geoff Levand @ 2015-11-24 22:25 UTC (permalink / raw)
  To: linux-arm-kernel

From: AKASHI Takahiro <takahiro.akashi@linaro.org>

On primary kernel, the memory region used by crash dump kernel must be
specified by "crashkernel=" boot parameter. reserve_crashkernel()
will allocate and reserve the region for later use.

User space tools will be able to find the region marked as "Crash kernel"
in /proc/iomem.

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
---
 arch/arm64/kernel/setup.c |  7 +++++-
 arch/arm64/mm/init.c      | 54 +++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 60 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index 8119479..f9fffc9 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -31,7 +31,6 @@
 #include <linux/screen_info.h>
 #include <linux/init.h>
 #include <linux/kexec.h>
-#include <linux/crash_dump.h>
 #include <linux/root_dev.h>
 #include <linux/cpu.h>
 #include <linux/interrupt.h>
@@ -221,6 +220,12 @@ static void __init request_standard_resources(void)
 		    kernel_data.end <= res->end)
 			request_resource(res, &kernel_data);
 	}
+
+#ifdef CONFIG_KEXEC
+	/* User space tools will find "Crash kernel" region in /proc/iomem. */
+	if (crashk_res.end)
+		insert_resource(&iomem_resource, &crashk_res);
+#endif
 }
 
 #ifdef CONFIG_BLK_DEV_INITRD
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 17bf39a..24f0a1c 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -34,6 +34,7 @@
 #include <linux/dma-contiguous.h>
 #include <linux/efi.h>
 #include <linux/swiotlb.h>
+#include <linux/kexec.h>
 
 #include <asm/fixmap.h>
 #include <asm/memory.h>
@@ -66,6 +67,55 @@ static int __init early_initrd(char *p)
 early_param("initrd", early_initrd);
 #endif
 
+#ifdef CONFIG_KEXEC
+/*
+ * reserve_crashkernel() - reserves memory for crash kernel
+ *
+ * This function reserves memory area given in "crashkernel=" kernel command
+ * line parameter. The memory reserved is used by dump capture kernel when
+ * primary kernel is crashing.
+ */
+static void __init reserve_crashkernel(void)
+{
+	unsigned long long crash_size = 0, crash_base = 0;
+	int ret;
+
+	ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
+				&crash_size, &crash_base);
+	if (ret)
+		return;
+
+	if (crash_base == 0) {
+		crash_base = memblock_alloc(crash_size, 1 << 21);
+		if (crash_base == 0) {
+			pr_warn("Unable to allocate crashkernel (size:%llx)\n",
+				crash_size);
+			return;
+		}
+	} else {
+		/* User specifies base address explicitly. */
+		if (!memblock_is_region_memory(crash_base, crash_size) ||
+			memblock_is_region_reserved(crash_base, crash_size)) {
+			pr_warn("crashkernel has wrong address or size\n");
+			return;
+		}
+
+		if (crash_base & ((1 << 21) - 1)) {
+			pr_warn("crashkernel base address is not 2MB aligned\n");
+			return;
+		}
+
+		memblock_reserve(crash_base, crash_size);
+	}
+
+	pr_info("Reserving %lldMB of memory@%lldMB for crashkernel\n",
+		crash_size >> 20, crash_base >> 20);
+
+	crashk_res.start = crash_base;
+	crashk_res.end = crash_base + crash_size - 1;
+}
+#endif /* CONFIG_KEXEC */
+
 /*
  * Return the maximum physical address for ZONE_DMA (DMA_BIT_MASK(32)). It
  * currently assumes that for memory starting above 4G, 32-bit devices will
@@ -171,6 +221,10 @@ void __init arm64_memblock_init(void)
 		memblock_reserve(__virt_to_phys(initrd_start), initrd_end - initrd_start);
 #endif
 
+#ifdef CONFIG_KEXEC
+	reserve_crashkernel();
+#endif
+
 	early_init_fdt_scan_reserved_mem();
 
 	/* 4GB maximum for 32-bit only capable devices */
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 178+ messages in thread

* [PATCH v12 16/16] arm64: kdump: relax BUG_ON() if more than one cpus are still active
  2015-11-24 22:25 ` Geoff Levand
@ 2015-11-24 22:25   ` Geoff Levand
  -1 siblings, 0 replies; 178+ messages in thread
From: Geoff Levand @ 2015-11-24 22:25 UTC (permalink / raw)
  To: linux-arm-kernel

From: AKASHI Takahiro <takahiro.akashi@linaro.org>

We should try best in case of kdump.
So even if not all secondary cpus have shut down, we do kdump anyway.
---
 arch/arm64/kernel/machine_kexec.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
index d2d7e90..482aae7 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -148,7 +148,13 @@ void machine_kexec(struct kimage *kimage)
 	phys_addr_t reboot_code_buffer_phys;
 	void *reboot_code_buffer;
 
-	BUG_ON(num_online_cpus() > 1);
+	if (num_online_cpus() > 1) {
+		if (in_crash_kexec)
+			pr_warn("kdump might fail because %d cpus are still online\n",
+					num_online_cpus());
+		else
+			BUG();
+	}
 
 	reboot_code_buffer_phys = page_to_phys(kimage->control_code_page);
 	reboot_code_buffer = phys_to_virt(reboot_code_buffer_phys);
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 178+ messages in thread

* [PATCH v12 15/16] arm64: kdump: enable kdump in the arm64 defconfig
  2015-11-24 22:25 ` Geoff Levand
@ 2015-11-24 22:25   ` Geoff Levand
  -1 siblings, 0 replies; 178+ messages in thread
From: Geoff Levand @ 2015-11-24 22:25 UTC (permalink / raw)
  To: linux-arm-kernel

From: AKASHI Takahiro <takahiro.akashi@linaro.org>

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
---
 arch/arm64/configs/defconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig
index 0470fdf..6dc3d00 100644
--- a/arch/arm64/configs/defconfig
+++ b/arch/arm64/configs/defconfig
@@ -59,6 +59,7 @@ CONFIG_KSM=y
 CONFIG_TRANSPARENT_HUGEPAGE=y
 CONFIG_CMA=y
 CONFIG_KEXEC=y
+CONFIG_CRASH_DUMP=y
 CONFIG_CMDLINE="console=ttyAMA0"
 # CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS is not set
 CONFIG_COMPAT=y
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 178+ messages in thread

* [PATCH v12 14/16] arm64: kdump: update a kernel doc
  2015-11-24 22:25 ` Geoff Levand
@ 2015-11-24 22:25   ` Geoff Levand
  -1 siblings, 0 replies; 178+ messages in thread
From: Geoff Levand @ 2015-11-24 22:25 UTC (permalink / raw)
  To: linux-arm-kernel

From: AKASHI Takahiro <takahiro.akashi@linaro.org>

This patch adds arch specific descriptions about kdump usage on arm64
to kdump.txt.

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
---
 Documentation/kdump/kdump.txt | 32 +++++++++++++++++++++++++++++++-
 1 file changed, 31 insertions(+), 1 deletion(-)

diff --git a/Documentation/kdump/kdump.txt b/Documentation/kdump/kdump.txt
index bc4bd5a..7d3fad0 100644
--- a/Documentation/kdump/kdump.txt
+++ b/Documentation/kdump/kdump.txt
@@ -18,7 +18,7 @@ memory image to a dump file on the local disk, or across the network to
 a remote system.
 
 Kdump and kexec are currently supported on the x86, x86_64, ppc64, ia64,
-s390x and arm architectures.
+s390x, arm and arm64 architectures.
 
 When the system kernel boots, it reserves a small section of memory for
 the dump-capture kernel. This ensures that ongoing Direct Memory Access
@@ -249,6 +249,29 @@ Dump-capture kernel config options (Arch Dependent, arm)
 
     AUTO_ZRELADDR=y
 
+Dump-capture kernel config options (Arch Dependent, arm64)
+----------------------------------------------------------
+
+1) Disable symmetric multi-processing support under "Processor type and
+   features":
+
+   CONFIG_SMP=n
+
+   (If CONFIG_SMP=y, then specify maxcpus=1 on the kernel command line
+   when loading the dump-capture kernel, see section "Load the Dump-capture
+   Kernel".)
+
+2) The maximum memory size on the dump-capture kernel must be limited by
+   specifying:
+
+   mem=X[MG]
+
+   where X should be less than or equal to the size in "crashkernel="
+   boot parameter. Kexec-tools will automatically add this.
+
+3) Currently, kvm will not be enabled on the dump-capture kernel even
+   if it is configured.
+
 Extended crashkernel syntax
 ===========================
 
@@ -312,6 +335,8 @@ Boot into System Kernel
    any space below the alignment point may be overwritten by the dump-capture kernel,
    which means it is possible that the vmcore is not that precise as expected.
 
+   On arm64, use "crashkernel=Y[@X]".  Note that the start address of
+   the kernel, X if explicitly specified, must be aligned to 2MiB (0x200000).
 
 Load the Dump-capture Kernel
 ============================
@@ -334,6 +359,8 @@ For s390x:
 	- Use image or bzImage
 For arm:
 	- Use zImage
+For arm64:
+	- Use vmlinux
 
 If you are using a uncompressed vmlinux image then use following command
 to load dump-capture kernel.
@@ -377,6 +404,9 @@ For s390x:
 For arm:
 	"1 maxcpus=1 reset_devices"
 
+For arm64:
+	"1 mem=X[MG] maxcpus=1 reset_devices"
+
 Notes on loading the dump-capture kernel:
 
 * By default, the ELF headers are stored in ELF64 format to support
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 178+ messages in thread

* [PATCH v12 13/16] arm64: kdump: add kdump support
  2015-11-24 22:25 ` Geoff Levand
@ 2015-11-24 22:25   ` Geoff Levand
  -1 siblings, 0 replies; 178+ messages in thread
From: Geoff Levand @ 2015-11-24 22:25 UTC (permalink / raw)
  To: linux-arm-kernel

From: AKASHI Takahiro <takahiro.akashi@linaro.org>

On crash dump kernel, all the information about primary kernel's core
image is available in elf core header specified by "elfcorehdr=" boot
parameter. reserve_elfcorehdr() will set aside the region to avoid any
corruption by crash dump kernel.

Crash dump kernel will access the system memory of primary kernel via
copy_oldmem_page(), which reads one page by ioremap'ing it since it does
not reside in linear mapping on crash dump kernel.
Please note that we should add "mem=X[MG]" boot parameter to limit the
memory size and avoid the following assertion at ioremap():
	if (WARN_ON(pfn_valid(__phys_to_pfn(phys_addr))))
		return NULL;
when accessing any pages beyond the usable memories of crash dump kernel.

We also need our own elfcorehdr_read() here since the weak definition of
elfcorehdr_read() utilizes copy_oldmem_page() and will hit the assertion
above on arm64.

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
---
 arch/arm64/Kconfig             | 12 +++++++
 arch/arm64/kernel/Makefile     |  1 +
 arch/arm64/kernel/crash_dump.c | 71 ++++++++++++++++++++++++++++++++++++++++++
 arch/arm64/mm/init.c           | 29 +++++++++++++++++
 4 files changed, 113 insertions(+)
 create mode 100644 arch/arm64/kernel/crash_dump.c

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index c23fd77..4bac7dc 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -545,6 +545,18 @@ config KEXEC
 	  but it is independent of the system firmware.   And like a reboot
 	  you can start any kernel with it, not just Linux.
 
+config CRASH_DUMP
+	bool "Build kdump crash kernel"
+	help
+	  Generate crash dump after being started by kexec. This should
+	  be normally only set in special crash dump kernels which are
+	  loaded in the main kernel with kexec-tools into a specially
+	  reserved region and then later executed after a crash by
+	  kdump/kexec. The crash dump kernel must be compiled to a
+	  memory address not used by the main kernel.
+
+	  For more details see Documentation/kdump/kdump.txt
+
 config XEN_DOM0
 	def_bool y
 	depends on XEN
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index f68420d..a08b054 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -43,6 +43,7 @@ arm64-obj-$(CONFIG_ARMV8_DEPRECATED)	+= armv8_deprecated.o
 arm64-obj-$(CONFIG_ACPI)		+= acpi.o
 arm64-obj-$(CONFIG_KEXEC)		+= machine_kexec.o relocate_kernel.o	\
 					   cpu-reset.o
+arm64-obj-$(CONFIG_CRASH_DUMP)		+= crash_dump.o
 
 obj-y					+= $(arm64-obj-y) vdso/
 obj-m					+= $(arm64-obj-m)
diff --git a/arch/arm64/kernel/crash_dump.c b/arch/arm64/kernel/crash_dump.c
new file mode 100644
index 0000000..3d86c0a
--- /dev/null
+++ b/arch/arm64/kernel/crash_dump.c
@@ -0,0 +1,71 @@
+/*
+ * Routines for doing kexec-based kdump
+ *
+ * Copyright (C) 2014 Linaro Limited
+ * Author: AKASHI Takahiro <takahiro.akashi@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/crash_dump.h>
+#include <linux/errno.h>
+#include <linux/io.h>
+#include <linux/memblock.h>
+#include <linux/uaccess.h>
+#include <asm/memory.h>
+
+/**
+ * copy_oldmem_page() - copy one page from old kernel memory
+ * @pfn: page frame number to be copied
+ * @buf: buffer where the copied page is placed
+ * @csize: number of bytes to copy
+ * @offset: offset in bytes into the page
+ * @userbuf: if set, @buf is in a user address space
+ *
+ * This function copies one page from old kernel memory into buffer pointed by
+ * @buf. If @buf is in userspace, set @userbuf to %1. Returns number of bytes
+ * copied or negative error in case of failure.
+ */
+ssize_t copy_oldmem_page(unsigned long pfn, char *buf,
+			 size_t csize, unsigned long offset,
+			 int userbuf)
+{
+	void *vaddr;
+
+	if (!csize)
+		return 0;
+
+	vaddr = ioremap_cache(pfn << PAGE_SHIFT, PAGE_SIZE);
+	if (!vaddr)
+		return -ENOMEM;
+
+	if (userbuf) {
+		if (copy_to_user(buf, vaddr + offset, csize)) {
+			iounmap(vaddr);
+			return -EFAULT;
+		}
+	} else {
+		memcpy(buf, vaddr + offset, csize);
+	}
+
+	iounmap(vaddr);
+
+	return csize;
+}
+
+/**
+ * elfcorehdr_read - read from ELF core header
+ * @buf: buffer where the data is placed
+ * @csize: number of bytes to read
+ * @ppos: address in the memory
+ *
+ * This function reads @count bytes from elf core header which exists
+ * on crash dump kernel's memory.
+ */
+ssize_t elfcorehdr_read(char *buf, size_t count, u64 *ppos)
+{
+	memcpy(buf, phys_to_virt((phys_addr_t)*ppos), count);
+	return count;
+}
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 24f0a1c..52a1469 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -35,6 +35,7 @@
 #include <linux/efi.h>
 #include <linux/swiotlb.h>
 #include <linux/kexec.h>
+#include <linux/crash_dump.h>
 
 #include <asm/fixmap.h>
 #include <asm/memory.h>
@@ -116,6 +117,31 @@ static void __init reserve_crashkernel(void)
 }
 #endif /* CONFIG_KEXEC */
 
+#ifdef CONFIG_CRASH_DUMP
+/*
+ * reserve_elfcorehdr() - reserves memory for elf core header
+ *
+ * This function reserves elf core header given in "elfcorehdr=" kernel
+ * command line parameter. This region contains all the information about
+ * primary kernel's core image and is used by a dump capture kernel to
+ * access the system memory on primary kernel.
+ */
+static void __init reserve_elfcorehdr(void)
+{
+	if (!elfcorehdr_size)
+		return;
+
+	if (memblock_is_region_reserved(elfcorehdr_addr, elfcorehdr_size)) {
+		pr_warn("elfcorehdr is overlapped\n");
+		return;
+	}
+
+	memblock_reserve(elfcorehdr_addr, elfcorehdr_size);
+
+	pr_info("Reserving %lldKB of memory at %lldMB for elfcorehdr\n",
+		elfcorehdr_size >> 10, elfcorehdr_addr >> 20);
+}
+#endif /* CONFIG_CRASH_DUMP */
 /*
  * Return the maximum physical address for ZONE_DMA (DMA_BIT_MASK(32)). It
  * currently assumes that for memory starting above 4G, 32-bit devices will
@@ -224,6 +250,9 @@ void __init arm64_memblock_init(void)
 #ifdef CONFIG_KEXEC
 	reserve_crashkernel();
 #endif
+#ifdef CONFIG_CRASH_DUMP
+	reserve_elfcorehdr();
+#endif
 
 	early_init_fdt_scan_reserved_mem();
 
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 178+ messages in thread

* [PATCH v12 08/16] arm64/kexec: Add core kexec support
@ 2015-11-24 22:25   ` Geoff Levand
  0 siblings, 0 replies; 178+ messages in thread
From: Geoff Levand @ 2015-11-24 22:25 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon
  Cc: Mark Rutland, marc.zyngier, kexec, AKASHI Takahiro,
	christoffer.dall, linux-arm-kernel

Add three new files, kexec.h, machine_kexec.c and relocate_kernel.S to the
arm64 architecture that add support for the kexec re-boot mechanism
(CONFIG_KEXEC) on arm64 platforms.

Signed-off-by: Geoff Levand <geoff@infradead.org>
---
 arch/arm64/Kconfig                  |  10 +++
 arch/arm64/include/asm/kexec.h      |  48 ++++++++++++
 arch/arm64/kernel/Makefile          |   2 +
 arch/arm64/kernel/machine_kexec.c   | 152 ++++++++++++++++++++++++++++++++++++
 arch/arm64/kernel/relocate_kernel.S | 131 +++++++++++++++++++++++++++++++
 include/uapi/linux/kexec.h          |   1 +
 6 files changed, 344 insertions(+)
 create mode 100644 arch/arm64/include/asm/kexec.h
 create mode 100644 arch/arm64/kernel/machine_kexec.c
 create mode 100644 arch/arm64/kernel/relocate_kernel.S

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 9ac16a4..c23fd77 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -535,6 +535,16 @@ config SECCOMP
 	  and the task is only allowed to execute a few safe syscalls
 	  defined by each seccomp mode.
 
+config KEXEC
+	depends on PM_SLEEP_SMP
+	select KEXEC_CORE
+	bool "kexec system call"
+	---help---
+	  kexec is a system call that implements the ability to shutdown your
+	  current kernel, and to start another kernel.  It is like a reboot
+	  but it is independent of the system firmware.   And like a reboot
+	  you can start any kernel with it, not just Linux.
+
 config XEN_DOM0
 	def_bool y
 	depends on XEN
diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
new file mode 100644
index 0000000..46d63cd
--- /dev/null
+++ b/arch/arm64/include/asm/kexec.h
@@ -0,0 +1,48 @@
+/*
+ * kexec for arm64
+ *
+ * Copyright (C) Linaro.
+ * Copyright (C) Huawei Futurewei Technologies.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#if !defined(_ARM64_KEXEC_H)
+#define _ARM64_KEXEC_H
+
+/* Maximum physical address we can use pages from */
+
+#define KEXEC_SOURCE_MEMORY_LIMIT (-1UL)
+
+/* Maximum address we can reach in physical address mode */
+
+#define KEXEC_DESTINATION_MEMORY_LIMIT (-1UL)
+
+/* Maximum address we can use for the control code buffer */
+
+#define KEXEC_CONTROL_MEMORY_LIMIT (-1UL)
+
+#define KEXEC_CONTROL_PAGE_SIZE	4096
+
+#define KEXEC_ARCH KEXEC_ARCH_ARM64
+
+#if !defined(__ASSEMBLY__)
+
+/**
+ * crash_setup_regs() - save registers for the panic kernel
+ *
+ * @newregs: registers are saved here
+ * @oldregs: registers to be saved (may be %NULL)
+ */
+
+static inline void crash_setup_regs(struct pt_regs *newregs,
+				    struct pt_regs *oldregs)
+{
+	/* Empty routine needed to avoid build errors. */
+}
+
+#endif /* !defined(__ASSEMBLY__) */
+
+#endif
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 474691f..f68420d 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -41,6 +41,8 @@ arm64-obj-$(CONFIG_EFI)			+= efi.o efi-entry.stub.o
 arm64-obj-$(CONFIG_PCI)			+= pci.o
 arm64-obj-$(CONFIG_ARMV8_DEPRECATED)	+= armv8_deprecated.o
 arm64-obj-$(CONFIG_ACPI)		+= acpi.o
+arm64-obj-$(CONFIG_KEXEC)		+= machine_kexec.o relocate_kernel.o	\
+					   cpu-reset.o
 
 obj-y					+= $(arm64-obj-y) vdso/
 obj-m					+= $(arm64-obj-m)
diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
new file mode 100644
index 0000000..8b990b8
--- /dev/null
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -0,0 +1,152 @@
+/*
+ * kexec for arm64
+ *
+ * Copyright (C) Linaro.
+ * Copyright (C) Huawei Futurewei Technologies.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/kexec.h>
+#include <linux/of_fdt.h>
+#include <linux/slab.h>
+#include <linux/uaccess.h>
+
+#include <asm/cacheflush.h>
+#include <asm/system_misc.h>
+
+#include "cpu-reset.h"
+
+/* Global variables for the arm64_relocate_new_kernel routine. */
+extern const unsigned char arm64_relocate_new_kernel[];
+extern const unsigned long arm64_relocate_new_kernel_size;
+
+static unsigned long kimage_start;
+
+void machine_kexec_cleanup(struct kimage *kimage)
+{
+	/* Empty routine needed to avoid build errors. */
+}
+
+/**
+ * machine_kexec_prepare - Prepare for a kexec reboot.
+ *
+ * Called from the core kexec code when a kernel image is loaded.
+ */
+int machine_kexec_prepare(struct kimage *kimage)
+{
+	kimage_start = kimage->start;
+	return 0;
+}
+
+/**
+ * kexec_list_flush - Helper to flush the kimage list to PoC.
+ */
+static void kexec_list_flush(unsigned long kimage_head)
+{
+	unsigned long *entry;
+
+	for (entry = &kimage_head; ; entry++) {
+		unsigned int flag = *entry & IND_FLAGS;
+		void *addr = phys_to_virt(*entry & PAGE_MASK);
+
+		switch (flag) {
+		case IND_INDIRECTION:
+			entry = (unsigned long *)addr - 1;
+			__flush_dcache_area(addr, PAGE_SIZE);
+			break;
+		case IND_DESTINATION:
+			break;
+		case IND_SOURCE:
+			__flush_dcache_area(addr, PAGE_SIZE);
+			break;
+		case IND_DONE:
+			return;
+		default:
+			BUG();
+		}
+	}
+}
+
+/**
+ * kexec_segment_flush - Helper to flush the kimage segments to PoC.
+ */
+static void kexec_segment_flush(const struct kimage *kimage)
+{
+	unsigned long i;
+
+	pr_devel("%s:\n", __func__);
+
+	for (i = 0; i < kimage->nr_segments; i++) {
+		pr_devel("  segment[%lu]: %016lx - %016lx, %lx bytes, %lu pages\n",
+			i,
+			kimage->segment[i].mem,
+			kimage->segment[i].mem + kimage->segment[i].memsz,
+			kimage->segment[i].memsz,
+			kimage->segment[i].memsz /  PAGE_SIZE);
+
+		__flush_dcache_area(phys_to_virt(kimage->segment[i].mem),
+			kimage->segment[i].memsz);
+	}
+}
+
+/**
+ * machine_kexec - Do the kexec reboot.
+ *
+ * Called from the core kexec code for a sys_reboot with LINUX_REBOOT_CMD_KEXEC.
+ */
+void machine_kexec(struct kimage *kimage)
+{
+	phys_addr_t reboot_code_buffer_phys;
+	void *reboot_code_buffer;
+
+	BUG_ON(num_online_cpus() > 1);
+
+	reboot_code_buffer_phys = page_to_phys(kimage->control_code_page);
+	reboot_code_buffer = phys_to_virt(reboot_code_buffer_phys);
+
+	/*
+	 * Copy arm64_relocate_new_kernel to the reboot_code_buffer for use
+	 * after the kernel is shut down.
+	 */
+	memcpy(reboot_code_buffer, arm64_relocate_new_kernel,
+		arm64_relocate_new_kernel_size);
+
+	/* Flush the reboot_code_buffer in preparation for its execution. */
+	__flush_dcache_area(reboot_code_buffer, arm64_relocate_new_kernel_size);
+
+	/* Flush the new image. */
+	kexec_segment_flush(kimage);
+
+	/* Flush the kimage list. */
+	kexec_list_flush(kimage->head);
+
+	pr_info("Bye!\n");
+
+	/* Disable all DAIF exceptions. */
+	asm volatile ("msr daifset, #0xf" : : : "memory");
+
+	setup_mm_for_reboot();
+
+	/*
+	 * cpu_soft_restart will shutdown the MMU, disable data caches, then
+	 * transfer control to the reboot_code_buffer which contains a copy of
+	 * the arm64_relocate_new_kernel routine.  arm64_relocate_new_kernel
+	 * uses physical addressing to relocate the new image to its final
+	 * position and transfers control to the image entry point when the
+	 * relocation is complete.
+	 */
+
+	cpu_soft_restart(virt_to_phys(cpu_reset),
+		is_hyp_mode_available(),
+		reboot_code_buffer_phys, kimage->head, kimage_start);
+
+	BUG(); /* Should never get here. */
+}
+
+void machine_crash_shutdown(struct pt_regs *regs)
+{
+	/* Empty routine needed to avoid build errors. */
+}
diff --git a/arch/arm64/kernel/relocate_kernel.S b/arch/arm64/kernel/relocate_kernel.S
new file mode 100644
index 0000000..71cab0e
--- /dev/null
+++ b/arch/arm64/kernel/relocate_kernel.S
@@ -0,0 +1,131 @@
+/*
+ * kexec for arm64
+ *
+ * Copyright (C) Linaro.
+ * Copyright (C) Huawei Futurewei Technologies.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/kexec.h>
+
+#include <asm/assembler.h>
+#include <asm/kexec.h>
+#include <asm/kvm_arm.h>
+#include <asm/page.h>
+
+/*
+ * arm64_relocate_new_kernel - Put a 2nd stage image in place and boot it.
+ *
+ * The memory that the old kernel occupies may be overwritten when coping the
+ * new image to its final location.  To assure that the
+ * arm64_relocate_new_kernel routine which does that copy is not overwritten,
+ * all code and data needed by arm64_relocate_new_kernel must be between the
+ * symbols arm64_relocate_new_kernel and arm64_relocate_new_kernel_end.  The
+ * machine_kexec() routine will copy arm64_relocate_new_kernel to the kexec
+ * control_code_page, a special page which has been set up to be preserved
+ * during the copy operation.
+ */
+.globl arm64_relocate_new_kernel
+arm64_relocate_new_kernel:
+
+	/* Setup the list loop variables. */
+	mov	x18, x0				/* x18 = kimage_head */
+	mov	x17, x1				/* x17 = kimage_start */
+	dcache_line_size x16, x0		/* x16 = dcache line size */
+	mov	x15, xzr			/* x15 = segment start */
+	mov	x14, xzr			/* x14 = entry ptr */
+	mov	x13, xzr			/* x13 = copy dest */
+
+	/* Clear the sctlr_el2 flags. */
+	mrs	x0, CurrentEL
+	cmp	x0, #CurrentEL_EL2
+	b.ne	1f
+	mrs	x0, sctlr_el2
+	ldr	x1, =SCTLR_EL2_FLAGS
+	bic	x0, x0, x1
+	msr	sctlr_el2, x0
+	isb
+1:
+
+	/* Check if the new image needs relocation. */
+	cbz	x18, .Ldone
+	tbnz	x18, IND_DONE_BIT, .Ldone
+
+.Lloop:
+	and	x12, x18, PAGE_MASK		/* x12 = addr */
+
+	/* Test the entry flags. */
+.Ltest_source:
+	tbz	x18, IND_SOURCE_BIT, .Ltest_indirection
+
+	mov x20, x13				/*  x20 = copy dest */
+	mov x21, x12				/*  x21 = copy src */
+
+	/* Copy page. */
+1:	ldp	x22, x23, [x21]
+	ldp	x24, x25, [x21, #16]
+	ldp	x26, x27, [x21, #32]
+	ldp	x28, x29, [x21, #48]
+	add	x21, x21, #64
+	stnp	x22, x23, [x20]
+	stnp	x24, x25, [x20, #16]
+	stnp	x26, x27, [x20, #32]
+	stnp	x28, x29, [x20, #48]
+	add	x20, x20, #64
+	tst	x21, #(PAGE_SIZE - 1)
+	b.ne	1b
+
+	/* dest += PAGE_SIZE */
+	add	x13, x13, PAGE_SIZE
+	b	.Lnext
+
+.Ltest_indirection:
+	tbz	x18, IND_INDIRECTION_BIT, .Ltest_destination
+
+	/* ptr = addr */
+	mov	x14, x12
+	b	.Lnext
+
+.Ltest_destination:
+	tbz	x18, IND_DESTINATION_BIT, .Lnext
+
+	mov	x15, x12
+
+	/* dest = addr */
+	mov	x13, x12
+
+.Lnext:
+	/* entry = *ptr++ */
+	ldr	x18, [x14], #8
+
+	/* while (!(entry & DONE)) */
+	tbz	x18, IND_DONE_BIT, .Lloop
+
+.Ldone:
+	dsb	sy
+	ic	ialluis
+	dsb	sy
+	isb
+
+	/* Start new image. */
+	mov	x0, xzr
+	mov	x1, xzr
+	mov	x2, xzr
+	mov	x3, xzr
+	br	x17
+
+.align 3	/* To keep the 64-bit values below naturally aligned. */
+
+.Lcopy_end:
+.org	KEXEC_CONTROL_PAGE_SIZE
+
+/*
+ * arm64_relocate_new_kernel_size - Number of bytes to copy to the
+ * control_code_page.
+ */
+.globl arm64_relocate_new_kernel_size
+arm64_relocate_new_kernel_size:
+	.quad	.Lcopy_end - arm64_relocate_new_kernel
diff --git a/include/uapi/linux/kexec.h b/include/uapi/linux/kexec.h
index 99048e5..ccec467 100644
--- a/include/uapi/linux/kexec.h
+++ b/include/uapi/linux/kexec.h
@@ -39,6 +39,7 @@
 #define KEXEC_ARCH_SH      (42 << 16)
 #define KEXEC_ARCH_MIPS_LE (10 << 16)
 #define KEXEC_ARCH_MIPS    ( 8 << 16)
+#define KEXEC_ARCH_ARM64   (183 << 16)
 
 /* The artificial cap on the number of segments passed to kexec_load. */
 #define KEXEC_SEGMENT_MAX 16
-- 
2.5.0



_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply related	[flat|nested] 178+ messages in thread

* [PATCH v12 09/16] arm64/kexec: Add pr_devel output
@ 2015-11-24 22:25   ` Geoff Levand
  0 siblings, 0 replies; 178+ messages in thread
From: Geoff Levand @ 2015-11-24 22:25 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon
  Cc: Mark Rutland, marc.zyngier, kexec, AKASHI Takahiro,
	christoffer.dall, linux-arm-kernel

To aid in debugging kexec problems or when adding new functionality to kexec add
a new routine kexec_image_info() and several inline pr_devel statements.

Signed-off-by: Geoff Levand <geoff@infradead.org>
---
 arch/arm64/kernel/machine_kexec.c | 63 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 63 insertions(+)

diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
index 8b990b8..da28a26 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -25,6 +25,48 @@ extern const unsigned long arm64_relocate_new_kernel_size;
 
 static unsigned long kimage_start;
 
+/**
+ * kexec_is_dtb - Helper routine to check the device tree header signature.
+ */
+static bool kexec_is_dtb(const void *dtb)
+{
+	__be32 magic;
+
+	return get_user(magic, (__be32 *)dtb) ? false :
+		(be32_to_cpu(magic) == OF_DT_HEADER);
+}
+
+/**
+ * kexec_image_info - For debugging output.
+ */
+#define kexec_image_info(_i) _kexec_image_info(__func__, __LINE__, _i)
+static void _kexec_image_info(const char *func, int line,
+	const struct kimage *kimage)
+{
+	unsigned long i;
+
+#if !defined(DEBUG)
+	return;
+#endif
+	pr_devel("%s:%d:\n", func, line);
+	pr_devel("  kexec kimage info:\n");
+	pr_devel("    type:        %d\n", kimage->type);
+	pr_devel("    start:       %lx\n", kimage->start);
+	pr_devel("    head:        %lx\n", kimage->head);
+	pr_devel("    nr_segments: %lu\n", kimage->nr_segments);
+
+	for (i = 0; i < kimage->nr_segments; i++) {
+		pr_devel("      segment[%lu]: %016lx - %016lx, %lx bytes, %lu pages%s\n",
+			i,
+			kimage->segment[i].mem,
+			kimage->segment[i].mem + kimage->segment[i].memsz,
+			kimage->segment[i].memsz,
+			kimage->segment[i].memsz /  PAGE_SIZE,
+			(kexec_is_dtb(kimage->segment[i].buf) ?
+				", dtb segment" : ""));
+	}
+}
+
 void machine_kexec_cleanup(struct kimage *kimage)
 {
 	/* Empty routine needed to avoid build errors. */
@@ -38,6 +80,8 @@ void machine_kexec_cleanup(struct kimage *kimage)
 int machine_kexec_prepare(struct kimage *kimage)
 {
 	kimage_start = kimage->start;
+	kexec_image_info(kimage);
+
 	return 0;
 }
 
@@ -107,6 +151,25 @@ void machine_kexec(struct kimage *kimage)
 	reboot_code_buffer_phys = page_to_phys(kimage->control_code_page);
 	reboot_code_buffer = phys_to_virt(reboot_code_buffer_phys);
 
+	kexec_image_info(kimage);
+
+	pr_devel("%s:%d: control_code_page:        %p\n", __func__, __LINE__,
+		kimage->control_code_page);
+	pr_devel("%s:%d: reboot_code_buffer_phys:  %pa\n", __func__, __LINE__,
+		&reboot_code_buffer_phys);
+	pr_devel("%s:%d: reboot_code_buffer:       %p\n", __func__, __LINE__,
+		reboot_code_buffer);
+	pr_devel("%s:%d: relocate_new_kernel:      %p\n", __func__, __LINE__,
+		arm64_relocate_new_kernel);
+	pr_devel("%s:%d: relocate_new_kernel_size: 0x%lx(%lu) bytes\n",
+		__func__, __LINE__, arm64_relocate_new_kernel_size,
+		arm64_relocate_new_kernel_size);
+
+	pr_devel("%s:%d: kimage_head:              %lx\n", __func__, __LINE__,
+		kimage->head);
+	pr_devel("%s:%d: kimage_start:             %lx\n", __func__, __LINE__,
+		kimage_start);
+
 	/*
 	 * Copy arm64_relocate_new_kernel to the reboot_code_buffer for use
 	 * after the kernel is shut down.
-- 
2.5.0



_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply related	[flat|nested] 178+ messages in thread

* [PATCH v12 12/16] arm64: kdump: implement machine_crash_shutdown()
@ 2015-11-24 22:25   ` Geoff Levand
  0 siblings, 0 replies; 178+ messages in thread
From: Geoff Levand @ 2015-11-24 22:25 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon
  Cc: Mark Rutland, marc.zyngier, kexec, AKASHI Takahiro,
	christoffer.dall, linux-arm-kernel

From: AKASHI Takahiro <takahiro.akashi@linaro.org>

kdump calls machine_crash_shutdown() to shut down non-boot cpus and
save registers' status in per-cpu ELF notes before starting the crash
dump kernel. See kernel_kexec().

ipi_cpu_stop() is a bit modified and used to support this behavior.

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
---
 arch/arm64/include/asm/kexec.h    | 34 +++++++++++++++++++++++++++++++++-
 arch/arm64/kernel/machine_kexec.c | 31 +++++++++++++++++++++++++++++--
 arch/arm64/kernel/smp.c           | 16 ++++++++++++++--
 3 files changed, 76 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
index 46d63cd..555a955 100644
--- a/arch/arm64/include/asm/kexec.h
+++ b/arch/arm64/include/asm/kexec.h
@@ -30,6 +30,8 @@
 
 #if !defined(__ASSEMBLY__)
 
+extern bool in_crash_kexec;
+
 /**
  * crash_setup_regs() - save registers for the panic kernel
  *
@@ -40,7 +42,37 @@
 static inline void crash_setup_regs(struct pt_regs *newregs,
 				    struct pt_regs *oldregs)
 {
-	/* Empty routine needed to avoid build errors. */
+	if (oldregs) {
+		memcpy(newregs, oldregs, sizeof(*newregs));
+	} else {
+		__asm__ __volatile__ (
+			"stp	 x0,   x1, [%3, #16 *  0]\n"
+			"stp	 x2,   x3, [%3, #16 *  1]\n"
+			"stp	 x4,   x5, [%3, #16 *  2]\n"
+			"stp	 x6,   x7, [%3, #16 *  3]\n"
+			"stp	 x8,   x9, [%3, #16 *  4]\n"
+			"stp	x10,  x11, [%3, #16 *  5]\n"
+			"stp	x12,  x13, [%3, #16 *  6]\n"
+			"stp	x14,  x15, [%3, #16 *  7]\n"
+			"stp	x16,  x17, [%3, #16 *  8]\n"
+			"stp	x18,  x19, [%3, #16 *  9]\n"
+			"stp	x20,  x21, [%3, #16 * 10]\n"
+			"stp	x22,  x23, [%3, #16 * 11]\n"
+			"stp	x24,  x25, [%3, #16 * 12]\n"
+			"stp	x26,  x27, [%3, #16 * 13]\n"
+			"stp	x28,  x29, [%3, #16 * 14]\n"
+			"str	x30,	   [%3, #16 * 15]\n"
+			"mov	%0, sp\n"
+			"adr	%1, 1f\n"
+			"mrs	%2, spsr_el1\n"
+		"1:"
+			: "=r" (newregs->sp),
+			  "=r" (newregs->pc),
+			  "=r" (newregs->pstate)
+			: "r"  (&newregs->regs)
+			: "memory"
+		);
+	}
 }
 
 #endif /* !defined(__ASSEMBLY__) */
diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
index da28a26..d2d7e90 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -9,6 +9,7 @@
  * published by the Free Software Foundation.
  */
 
+#include <linux/kernel.h>
 #include <linux/kexec.h>
 #include <linux/of_fdt.h>
 #include <linux/slab.h>
@@ -23,6 +24,7 @@
 extern const unsigned char arm64_relocate_new_kernel[];
 extern const unsigned long arm64_relocate_new_kernel_size;
 
+bool in_crash_kexec;
 static unsigned long kimage_start;
 
 /**
@@ -203,13 +205,38 @@ void machine_kexec(struct kimage *kimage)
 	 */
 
 	cpu_soft_restart(virt_to_phys(cpu_reset),
-		is_hyp_mode_available(),
+		in_crash_kexec ? 0 : is_hyp_mode_available(),
 		reboot_code_buffer_phys, kimage->head, kimage_start);
 
 	BUG(); /* Should never get here. */
 }
 
+/**
+ * machine_crash_shutdown - shutdown non-boot cpus and save registers
+ */
 void machine_crash_shutdown(struct pt_regs *regs)
 {
-	/* Empty routine needed to avoid build errors. */
+	struct pt_regs dummy_regs;
+	int cpu;
+
+	local_irq_disable();
+
+	in_crash_kexec = true;
+
+	/*
+	 * clear and initialize the per-cpu info. This is necessary
+	 * because, otherwise, slots for offline cpus would never be
+	 * filled up. See smp_send_stop().
+	 */
+	memset(&dummy_regs, 0, sizeof(dummy_regs));
+	for_each_possible_cpu(cpu)
+		crash_save_cpu(&dummy_regs, cpu);
+
+	/* shutdown non-boot cpus */
+	smp_send_stop();
+
+	/* for boot cpu */
+	crash_save_cpu(regs, smp_processor_id());
+
+	pr_info("Starting crashdump kernel...\n");
 }
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index b1adc51..15aabef 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -37,6 +37,7 @@
 #include <linux/completion.h>
 #include <linux/of.h>
 #include <linux/irq_work.h>
+#include <linux/kexec.h>
 
 #include <asm/alternative.h>
 #include <asm/atomic.h>
@@ -54,6 +55,8 @@
 #include <asm/ptrace.h>
 #include <asm/virt.h>
 
+#include "cpu-reset.h"
+
 #define CREATE_TRACE_POINTS
 #include <trace/events/ipi.h>
 
@@ -683,8 +686,12 @@ static DEFINE_RAW_SPINLOCK(stop_lock);
 /*
  * ipi_cpu_stop - handle IPI from smp_send_stop()
  */
-static void ipi_cpu_stop(unsigned int cpu)
+static void ipi_cpu_stop(unsigned int cpu, struct pt_regs *regs)
 {
+#ifdef CONFIG_KEXEC
+	/* printing messages may slow down the shutdown. */
+	if (!in_crash_kexec)
+#endif
 	if (system_state == SYSTEM_BOOTING ||
 	    system_state == SYSTEM_RUNNING) {
 		raw_spin_lock(&stop_lock);
@@ -697,6 +704,11 @@ static void ipi_cpu_stop(unsigned int cpu)
 
 	local_irq_disable();
 
+#ifdef CONFIG_KEXEC
+	if (in_crash_kexec)
+		crash_save_cpu(regs, cpu);
+#endif /* CONFIG_KEXEC */
+
 	while (1)
 		cpu_relax();
 }
@@ -727,7 +739,7 @@ void handle_IPI(int ipinr, struct pt_regs *regs)
 
 	case IPI_CPU_STOP:
 		irq_enter();
-		ipi_cpu_stop(cpu);
+		ipi_cpu_stop(cpu, regs);
 		irq_exit();
 		break;
 
-- 
2.5.0



_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply related	[flat|nested] 178+ messages in thread

* [PATCH v12 11/16] arm64: kdump: reserve memory for crash dump kernel
@ 2015-11-24 22:25   ` Geoff Levand
  0 siblings, 0 replies; 178+ messages in thread
From: Geoff Levand @ 2015-11-24 22:25 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon
  Cc: Mark Rutland, marc.zyngier, kexec, AKASHI Takahiro,
	christoffer.dall, linux-arm-kernel

From: AKASHI Takahiro <takahiro.akashi@linaro.org>

On primary kernel, the memory region used by crash dump kernel must be
specified by "crashkernel=" boot parameter. reserve_crashkernel()
will allocate and reserve the region for later use.

User space tools will be able to find the region marked as "Crash kernel"
in /proc/iomem.

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
---
 arch/arm64/kernel/setup.c |  7 +++++-
 arch/arm64/mm/init.c      | 54 +++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 60 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index 8119479..f9fffc9 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -31,7 +31,6 @@
 #include <linux/screen_info.h>
 #include <linux/init.h>
 #include <linux/kexec.h>
-#include <linux/crash_dump.h>
 #include <linux/root_dev.h>
 #include <linux/cpu.h>
 #include <linux/interrupt.h>
@@ -221,6 +220,12 @@ static void __init request_standard_resources(void)
 		    kernel_data.end <= res->end)
 			request_resource(res, &kernel_data);
 	}
+
+#ifdef CONFIG_KEXEC
+	/* User space tools will find "Crash kernel" region in /proc/iomem. */
+	if (crashk_res.end)
+		insert_resource(&iomem_resource, &crashk_res);
+#endif
 }
 
 #ifdef CONFIG_BLK_DEV_INITRD
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 17bf39a..24f0a1c 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -34,6 +34,7 @@
 #include <linux/dma-contiguous.h>
 #include <linux/efi.h>
 #include <linux/swiotlb.h>
+#include <linux/kexec.h>
 
 #include <asm/fixmap.h>
 #include <asm/memory.h>
@@ -66,6 +67,55 @@ static int __init early_initrd(char *p)
 early_param("initrd", early_initrd);
 #endif
 
+#ifdef CONFIG_KEXEC
+/*
+ * reserve_crashkernel() - reserves memory for crash kernel
+ *
+ * This function reserves memory area given in "crashkernel=" kernel command
+ * line parameter. The memory reserved is used by dump capture kernel when
+ * primary kernel is crashing.
+ */
+static void __init reserve_crashkernel(void)
+{
+	unsigned long long crash_size = 0, crash_base = 0;
+	int ret;
+
+	ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
+				&crash_size, &crash_base);
+	if (ret)
+		return;
+
+	if (crash_base == 0) {
+		crash_base = memblock_alloc(crash_size, 1 << 21);
+		if (crash_base == 0) {
+			pr_warn("Unable to allocate crashkernel (size:%llx)\n",
+				crash_size);
+			return;
+		}
+	} else {
+		/* User specifies base address explicitly. */
+		if (!memblock_is_region_memory(crash_base, crash_size) ||
+			memblock_is_region_reserved(crash_base, crash_size)) {
+			pr_warn("crashkernel has wrong address or size\n");
+			return;
+		}
+
+		if (crash_base & ((1 << 21) - 1)) {
+			pr_warn("crashkernel base address is not 2MB aligned\n");
+			return;
+		}
+
+		memblock_reserve(crash_base, crash_size);
+	}
+
+	pr_info("Reserving %lldMB of memory at %lldMB for crashkernel\n",
+		crash_size >> 20, crash_base >> 20);
+
+	crashk_res.start = crash_base;
+	crashk_res.end = crash_base + crash_size - 1;
+}
+#endif /* CONFIG_KEXEC */
+
 /*
  * Return the maximum physical address for ZONE_DMA (DMA_BIT_MASK(32)). It
  * currently assumes that for memory starting above 4G, 32-bit devices will
@@ -171,6 +221,10 @@ void __init arm64_memblock_init(void)
 		memblock_reserve(__virt_to_phys(initrd_start), initrd_end - initrd_start);
 #endif
 
+#ifdef CONFIG_KEXEC
+	reserve_crashkernel();
+#endif
+
 	early_init_fdt_scan_reserved_mem();
 
 	/* 4GB maximum for 32-bit only capable devices */
-- 
2.5.0



_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply related	[flat|nested] 178+ messages in thread

* [PATCH v12 10/16] arm64/kexec: Enable kexec in the arm64 defconfig
@ 2015-11-24 22:25   ` Geoff Levand
  0 siblings, 0 replies; 178+ messages in thread
From: Geoff Levand @ 2015-11-24 22:25 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon
  Cc: Mark Rutland, marc.zyngier, kexec, AKASHI Takahiro,
	christoffer.dall, linux-arm-kernel

Signed-off-by: Geoff Levand <geoff@infradead.org>
---
 arch/arm64/configs/defconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig
index bdd7aa3..0470fdf 100644
--- a/arch/arm64/configs/defconfig
+++ b/arch/arm64/configs/defconfig
@@ -58,6 +58,7 @@ CONFIG_PREEMPT=y
 CONFIG_KSM=y
 CONFIG_TRANSPARENT_HUGEPAGE=y
 CONFIG_CMA=y
+CONFIG_KEXEC=y
 CONFIG_CMDLINE="console=ttyAMA0"
 # CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS is not set
 CONFIG_COMPAT=y
-- 
2.5.0



_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply related	[flat|nested] 178+ messages in thread

* [PATCH v12 16/16] arm64: kdump: relax BUG_ON() if more than one cpus are still active
@ 2015-11-24 22:25   ` Geoff Levand
  0 siblings, 0 replies; 178+ messages in thread
From: Geoff Levand @ 2015-11-24 22:25 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon
  Cc: Mark Rutland, marc.zyngier, kexec, AKASHI Takahiro,
	christoffer.dall, linux-arm-kernel

From: AKASHI Takahiro <takahiro.akashi@linaro.org>

We should try best in case of kdump.
So even if not all secondary cpus have shut down, we do kdump anyway.
---
 arch/arm64/kernel/machine_kexec.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
index d2d7e90..482aae7 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -148,7 +148,13 @@ void machine_kexec(struct kimage *kimage)
 	phys_addr_t reboot_code_buffer_phys;
 	void *reboot_code_buffer;
 
-	BUG_ON(num_online_cpus() > 1);
+	if (num_online_cpus() > 1) {
+		if (in_crash_kexec)
+			pr_warn("kdump might fail because %d cpus are still online\n",
+					num_online_cpus());
+		else
+			BUG();
+	}
 
 	reboot_code_buffer_phys = page_to_phys(kimage->control_code_page);
 	reboot_code_buffer = phys_to_virt(reboot_code_buffer_phys);
-- 
2.5.0


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply related	[flat|nested] 178+ messages in thread

* [PATCH v12 15/16] arm64: kdump: enable kdump in the arm64 defconfig
@ 2015-11-24 22:25   ` Geoff Levand
  0 siblings, 0 replies; 178+ messages in thread
From: Geoff Levand @ 2015-11-24 22:25 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon
  Cc: Mark Rutland, marc.zyngier, kexec, AKASHI Takahiro,
	christoffer.dall, linux-arm-kernel

From: AKASHI Takahiro <takahiro.akashi@linaro.org>

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
---
 arch/arm64/configs/defconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig
index 0470fdf..6dc3d00 100644
--- a/arch/arm64/configs/defconfig
+++ b/arch/arm64/configs/defconfig
@@ -59,6 +59,7 @@ CONFIG_KSM=y
 CONFIG_TRANSPARENT_HUGEPAGE=y
 CONFIG_CMA=y
 CONFIG_KEXEC=y
+CONFIG_CRASH_DUMP=y
 CONFIG_CMDLINE="console=ttyAMA0"
 # CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS is not set
 CONFIG_COMPAT=y
-- 
2.5.0



_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply related	[flat|nested] 178+ messages in thread

* [PATCH v12 14/16] arm64: kdump: update a kernel doc
@ 2015-11-24 22:25   ` Geoff Levand
  0 siblings, 0 replies; 178+ messages in thread
From: Geoff Levand @ 2015-11-24 22:25 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon
  Cc: Mark Rutland, marc.zyngier, kexec, AKASHI Takahiro,
	christoffer.dall, linux-arm-kernel

From: AKASHI Takahiro <takahiro.akashi@linaro.org>

This patch adds arch specific descriptions about kdump usage on arm64
to kdump.txt.

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
---
 Documentation/kdump/kdump.txt | 32 +++++++++++++++++++++++++++++++-
 1 file changed, 31 insertions(+), 1 deletion(-)

diff --git a/Documentation/kdump/kdump.txt b/Documentation/kdump/kdump.txt
index bc4bd5a..7d3fad0 100644
--- a/Documentation/kdump/kdump.txt
+++ b/Documentation/kdump/kdump.txt
@@ -18,7 +18,7 @@ memory image to a dump file on the local disk, or across the network to
 a remote system.
 
 Kdump and kexec are currently supported on the x86, x86_64, ppc64, ia64,
-s390x and arm architectures.
+s390x, arm and arm64 architectures.
 
 When the system kernel boots, it reserves a small section of memory for
 the dump-capture kernel. This ensures that ongoing Direct Memory Access
@@ -249,6 +249,29 @@ Dump-capture kernel config options (Arch Dependent, arm)
 
     AUTO_ZRELADDR=y
 
+Dump-capture kernel config options (Arch Dependent, arm64)
+----------------------------------------------------------
+
+1) Disable symmetric multi-processing support under "Processor type and
+   features":
+
+   CONFIG_SMP=n
+
+   (If CONFIG_SMP=y, then specify maxcpus=1 on the kernel command line
+   when loading the dump-capture kernel, see section "Load the Dump-capture
+   Kernel".)
+
+2) The maximum memory size on the dump-capture kernel must be limited by
+   specifying:
+
+   mem=X[MG]
+
+   where X should be less than or equal to the size in "crashkernel="
+   boot parameter. Kexec-tools will automatically add this.
+
+3) Currently, kvm will not be enabled on the dump-capture kernel even
+   if it is configured.
+
 Extended crashkernel syntax
 ===========================
 
@@ -312,6 +335,8 @@ Boot into System Kernel
    any space below the alignment point may be overwritten by the dump-capture kernel,
    which means it is possible that the vmcore is not that precise as expected.
 
+   On arm64, use "crashkernel=Y[@X]".  Note that the start address of
+   the kernel, X if explicitly specified, must be aligned to 2MiB (0x200000).
 
 Load the Dump-capture Kernel
 ============================
@@ -334,6 +359,8 @@ For s390x:
 	- Use image or bzImage
 For arm:
 	- Use zImage
+For arm64:
+	- Use vmlinux
 
 If you are using a uncompressed vmlinux image then use following command
 to load dump-capture kernel.
@@ -377,6 +404,9 @@ For s390x:
 For arm:
 	"1 maxcpus=1 reset_devices"
 
+For arm64:
+	"1 mem=X[MG] maxcpus=1 reset_devices"
+
 Notes on loading the dump-capture kernel:
 
 * By default, the ELF headers are stored in ELF64 format to support
-- 
2.5.0



_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply related	[flat|nested] 178+ messages in thread

* [PATCH v12 13/16] arm64: kdump: add kdump support
@ 2015-11-24 22:25   ` Geoff Levand
  0 siblings, 0 replies; 178+ messages in thread
From: Geoff Levand @ 2015-11-24 22:25 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon
  Cc: Mark Rutland, marc.zyngier, kexec, AKASHI Takahiro,
	christoffer.dall, linux-arm-kernel

From: AKASHI Takahiro <takahiro.akashi@linaro.org>

On crash dump kernel, all the information about primary kernel's core
image is available in elf core header specified by "elfcorehdr=" boot
parameter. reserve_elfcorehdr() will set aside the region to avoid any
corruption by crash dump kernel.

Crash dump kernel will access the system memory of primary kernel via
copy_oldmem_page(), which reads one page by ioremap'ing it since it does
not reside in linear mapping on crash dump kernel.
Please note that we should add "mem=X[MG]" boot parameter to limit the
memory size and avoid the following assertion at ioremap():
	if (WARN_ON(pfn_valid(__phys_to_pfn(phys_addr))))
		return NULL;
when accessing any pages beyond the usable memories of crash dump kernel.

We also need our own elfcorehdr_read() here since the weak definition of
elfcorehdr_read() utilizes copy_oldmem_page() and will hit the assertion
above on arm64.

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
---
 arch/arm64/Kconfig             | 12 +++++++
 arch/arm64/kernel/Makefile     |  1 +
 arch/arm64/kernel/crash_dump.c | 71 ++++++++++++++++++++++++++++++++++++++++++
 arch/arm64/mm/init.c           | 29 +++++++++++++++++
 4 files changed, 113 insertions(+)
 create mode 100644 arch/arm64/kernel/crash_dump.c

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index c23fd77..4bac7dc 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -545,6 +545,18 @@ config KEXEC
 	  but it is independent of the system firmware.   And like a reboot
 	  you can start any kernel with it, not just Linux.
 
+config CRASH_DUMP
+	bool "Build kdump crash kernel"
+	help
+	  Generate crash dump after being started by kexec. This should
+	  be normally only set in special crash dump kernels which are
+	  loaded in the main kernel with kexec-tools into a specially
+	  reserved region and then later executed after a crash by
+	  kdump/kexec. The crash dump kernel must be compiled to a
+	  memory address not used by the main kernel.
+
+	  For more details see Documentation/kdump/kdump.txt
+
 config XEN_DOM0
 	def_bool y
 	depends on XEN
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index f68420d..a08b054 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -43,6 +43,7 @@ arm64-obj-$(CONFIG_ARMV8_DEPRECATED)	+= armv8_deprecated.o
 arm64-obj-$(CONFIG_ACPI)		+= acpi.o
 arm64-obj-$(CONFIG_KEXEC)		+= machine_kexec.o relocate_kernel.o	\
 					   cpu-reset.o
+arm64-obj-$(CONFIG_CRASH_DUMP)		+= crash_dump.o
 
 obj-y					+= $(arm64-obj-y) vdso/
 obj-m					+= $(arm64-obj-m)
diff --git a/arch/arm64/kernel/crash_dump.c b/arch/arm64/kernel/crash_dump.c
new file mode 100644
index 0000000..3d86c0a
--- /dev/null
+++ b/arch/arm64/kernel/crash_dump.c
@@ -0,0 +1,71 @@
+/*
+ * Routines for doing kexec-based kdump
+ *
+ * Copyright (C) 2014 Linaro Limited
+ * Author: AKASHI Takahiro <takahiro.akashi@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/crash_dump.h>
+#include <linux/errno.h>
+#include <linux/io.h>
+#include <linux/memblock.h>
+#include <linux/uaccess.h>
+#include <asm/memory.h>
+
+/**
+ * copy_oldmem_page() - copy one page from old kernel memory
+ * @pfn: page frame number to be copied
+ * @buf: buffer where the copied page is placed
+ * @csize: number of bytes to copy
+ * @offset: offset in bytes into the page
+ * @userbuf: if set, @buf is in a user address space
+ *
+ * This function copies one page from old kernel memory into buffer pointed by
+ * @buf. If @buf is in userspace, set @userbuf to %1. Returns number of bytes
+ * copied or negative error in case of failure.
+ */
+ssize_t copy_oldmem_page(unsigned long pfn, char *buf,
+			 size_t csize, unsigned long offset,
+			 int userbuf)
+{
+	void *vaddr;
+
+	if (!csize)
+		return 0;
+
+	vaddr = ioremap_cache(pfn << PAGE_SHIFT, PAGE_SIZE);
+	if (!vaddr)
+		return -ENOMEM;
+
+	if (userbuf) {
+		if (copy_to_user(buf, vaddr + offset, csize)) {
+			iounmap(vaddr);
+			return -EFAULT;
+		}
+	} else {
+		memcpy(buf, vaddr + offset, csize);
+	}
+
+	iounmap(vaddr);
+
+	return csize;
+}
+
+/**
+ * elfcorehdr_read - read from ELF core header
+ * @buf: buffer where the data is placed
+ * @csize: number of bytes to read
+ * @ppos: address in the memory
+ *
+ * This function reads @count bytes from elf core header which exists
+ * on crash dump kernel's memory.
+ */
+ssize_t elfcorehdr_read(char *buf, size_t count, u64 *ppos)
+{
+	memcpy(buf, phys_to_virt((phys_addr_t)*ppos), count);
+	return count;
+}
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 24f0a1c..52a1469 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -35,6 +35,7 @@
 #include <linux/efi.h>
 #include <linux/swiotlb.h>
 #include <linux/kexec.h>
+#include <linux/crash_dump.h>
 
 #include <asm/fixmap.h>
 #include <asm/memory.h>
@@ -116,6 +117,31 @@ static void __init reserve_crashkernel(void)
 }
 #endif /* CONFIG_KEXEC */
 
+#ifdef CONFIG_CRASH_DUMP
+/*
+ * reserve_elfcorehdr() - reserves memory for elf core header
+ *
+ * This function reserves elf core header given in "elfcorehdr=" kernel
+ * command line parameter. This region contains all the information about
+ * primary kernel's core image and is used by a dump capture kernel to
+ * access the system memory on primary kernel.
+ */
+static void __init reserve_elfcorehdr(void)
+{
+	if (!elfcorehdr_size)
+		return;
+
+	if (memblock_is_region_reserved(elfcorehdr_addr, elfcorehdr_size)) {
+		pr_warn("elfcorehdr is overlapped\n");
+		return;
+	}
+
+	memblock_reserve(elfcorehdr_addr, elfcorehdr_size);
+
+	pr_info("Reserving %lldKB of memory at %lldMB for elfcorehdr\n",
+		elfcorehdr_size >> 10, elfcorehdr_addr >> 20);
+}
+#endif /* CONFIG_CRASH_DUMP */
 /*
  * Return the maximum physical address for ZONE_DMA (DMA_BIT_MASK(32)). It
  * currently assumes that for memory starting above 4G, 32-bit devices will
@@ -224,6 +250,9 @@ void __init arm64_memblock_init(void)
 #ifdef CONFIG_KEXEC
 	reserve_crashkernel();
 #endif
+#ifdef CONFIG_CRASH_DUMP
+	reserve_elfcorehdr();
+#endif
 
 	early_init_fdt_scan_reserved_mem();
 
-- 
2.5.0



_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply related	[flat|nested] 178+ messages in thread

* [PATCH v12 08/16] arm64/kexec: Add core kexec support
  2015-11-24 22:25   ` Geoff Levand
@ 2015-11-27 13:13     ` Pratyush Anand
  -1 siblings, 0 replies; 178+ messages in thread
From: Pratyush Anand @ 2015-11-27 13:13 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Geoff,

On 24/11/2015:10:25:34 PM, Geoff Levand wrote:
> +	/* Test the entry flags. */
> +.Ltest_source:
> +	tbz	x18, IND_SOURCE_BIT, .Ltest_indirection
> +
> +	mov x20, x13				/*  x20 = copy dest */
> +	mov x21, x12				/*  x21 = copy src */

Till v10 we had here invalidation for relocated destination page to PoC. I could
not understand, why it was removed. Removing that piece of code breaks kexec
booting with mustang. I need [1] to kexec boot into second kernel with mustang
platform.

[1]
https://github.com/pratyushanand/linux/commit/431e3247391981a1e8b2864a83a5743e8a274cb9

~Pratyush

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 08/16] arm64/kexec: Add core kexec support
@ 2015-11-27 13:13     ` Pratyush Anand
  0 siblings, 0 replies; 178+ messages in thread
From: Pratyush Anand @ 2015-11-27 13:13 UTC (permalink / raw)
  To: Geoff Levand
  Cc: Mark Rutland, marc.zyngier, Catalin Marinas, Will Deacon,
	AKASHI Takahiro, linux-arm-kernel, Dave Young, kexec,
	christoffer.dall

Hi Geoff,

On 24/11/2015:10:25:34 PM, Geoff Levand wrote:
> +	/* Test the entry flags. */
> +.Ltest_source:
> +	tbz	x18, IND_SOURCE_BIT, .Ltest_indirection
> +
> +	mov x20, x13				/*  x20 = copy dest */
> +	mov x21, x12				/*  x21 = copy src */

Till v10 we had here invalidation for relocated destination page to PoC. I could
not understand, why it was removed. Removing that piece of code breaks kexec
booting with mustang. I need [1] to kexec boot into second kernel with mustang
platform.

[1]
https://github.com/pratyushanand/linux/commit/431e3247391981a1e8b2864a83a5743e8a274cb9

~Pratyush

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 04/16] arm64: kvm: allows kvm cpu hotplug
  2015-11-24 22:25   ` Geoff Levand
@ 2015-11-27 13:54     ` Marc Zyngier
  -1 siblings, 0 replies; 178+ messages in thread
From: Marc Zyngier @ 2015-11-27 13:54 UTC (permalink / raw)
  To: linux-arm-kernel

On 24/11/15 22:25, Geoff Levand wrote:
> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
> 
> The current kvm implementation on arm64 does cpu-specific initialization
> at system boot, and has no way to gracefully shutdown a core in terms of
> kvm. This prevents, especially, kexec from rebooting the system on a boot
> core in EL2.
> 
> This patch adds a cpu tear-down function and also puts an existing cpu-init
> code into a separate function, kvm_arch_hardware_disable() and
> kvm_arch_hardware_enable() respectively.
> We don't need arm64-specific cpu hotplug hook any more.
> 
> Since this patch modifies common part of code between arm and arm64, one
> stub definition, __cpu_reset_hyp_mode(), is added on arm side to avoid
> compiling errors.
> 
> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
> ---
>  arch/arm/include/asm/kvm_host.h   | 10 ++++-
>  arch/arm/include/asm/kvm_mmu.h    |  1 +
>  arch/arm/kvm/arm.c                | 79 ++++++++++++++++++---------------------
>  arch/arm/kvm/mmu.c                |  5 +++
>  arch/arm64/include/asm/kvm_host.h | 16 +++++++-
>  arch/arm64/include/asm/kvm_mmu.h  |  1 +
>  arch/arm64/include/asm/virt.h     |  9 +++++
>  arch/arm64/kvm/hyp-init.S         | 33 ++++++++++++++++
>  arch/arm64/kvm/hyp.S              | 32 ++++++++++++++--
>  9 files changed, 138 insertions(+), 48 deletions(-)
> 
> diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
> index 6692982..9242765 100644
> --- a/arch/arm/include/asm/kvm_host.h
> +++ b/arch/arm/include/asm/kvm_host.h
> @@ -214,6 +214,15 @@ static inline void __cpu_init_hyp_mode(phys_addr_t boot_pgd_ptr,
>  	kvm_call_hyp((void*)hyp_stack_ptr, vector_ptr, pgd_ptr);
>  }
>  
> +static inline void __cpu_reset_hyp_mode(phys_addr_t boot_pgd_ptr,
> +					phys_addr_t phys_idmap_start)
> +{
> +	/*
> +	 * TODO
> +	 * kvm_call_reset(boot_pgd_ptr, phys_idmap_start);
> +	 */
> +}
> +
>  static inline int kvm_arch_dev_ioctl_check_extension(long ext)
>  {
>  	return 0;
> @@ -226,7 +235,6 @@ void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot);
>  
>  struct kvm_vcpu *kvm_mpidr_to_vcpu(struct kvm *kvm, unsigned long mpidr);
>  
> -static inline void kvm_arch_hardware_disable(void) {}
>  static inline void kvm_arch_hardware_unsetup(void) {}
>  static inline void kvm_arch_sync_events(struct kvm *kvm) {}
>  static inline void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
> diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
> index 405aa18..dc6fadf 100644
> --- a/arch/arm/include/asm/kvm_mmu.h
> +++ b/arch/arm/include/asm/kvm_mmu.h
> @@ -66,6 +66,7 @@ void kvm_mmu_free_memory_caches(struct kvm_vcpu *vcpu);
>  phys_addr_t kvm_mmu_get_httbr(void);
>  phys_addr_t kvm_mmu_get_boot_httbr(void);
>  phys_addr_t kvm_get_idmap_vector(void);
> +phys_addr_t kvm_get_idmap_start(void);
>  int kvm_mmu_init(void);
>  void kvm_clear_hyp_idmap(void);
>  
> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
> index eab83b2..a5d9d74 100644
> --- a/arch/arm/kvm/arm.c
> +++ b/arch/arm/kvm/arm.c
> @@ -16,7 +16,6 @@
>   * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
>   */
>  
> -#include <linux/cpu.h>
>  #include <linux/cpu_pm.h>
>  #include <linux/errno.h>
>  #include <linux/err.h>
> @@ -61,6 +60,8 @@ static atomic64_t kvm_vmid_gen = ATOMIC64_INIT(1);
>  static u8 kvm_next_vmid;
>  static DEFINE_SPINLOCK(kvm_vmid_lock);
>  
> +static DEFINE_PER_CPU(unsigned char, kvm_arm_hardware_enabled);
> +
>  static void kvm_arm_set_running_vcpu(struct kvm_vcpu *vcpu)
>  {
>  	BUG_ON(preemptible());
> @@ -85,11 +86,6 @@ struct kvm_vcpu * __percpu *kvm_get_running_vcpus(void)
>  	return &kvm_arm_running_vcpu;
>  }
>  
> -int kvm_arch_hardware_enable(void)
> -{
> -	return 0;
> -}
> -
>  int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu)
>  {
>  	return kvm_vcpu_exiting_guest_mode(vcpu) == IN_GUEST_MODE;
> @@ -959,7 +955,7 @@ long kvm_arch_vm_ioctl(struct file *filp,
>  	}
>  }
>  
> -static void cpu_init_hyp_mode(void *dummy)
> +int kvm_arch_hardware_enable(void)
>  {
>  	phys_addr_t boot_pgd_ptr;
>  	phys_addr_t pgd_ptr;
> @@ -967,6 +963,9 @@ static void cpu_init_hyp_mode(void *dummy)
>  	unsigned long stack_page;
>  	unsigned long vector_ptr;
>  
> +	if (__hyp_get_vectors() != hyp_default_vectors)
> +		return 0;
> +
>  	/* Switch from the HYP stub to our own HYP init vector */
>  	__hyp_set_vectors(kvm_get_idmap_vector());
>  
> @@ -979,38 +978,50 @@ static void cpu_init_hyp_mode(void *dummy)
>  	__cpu_init_hyp_mode(boot_pgd_ptr, pgd_ptr, hyp_stack_ptr, vector_ptr);
>  
>  	kvm_arm_init_debug();
> +
> +	return 0;
>  }
>  
> -static int hyp_init_cpu_notify(struct notifier_block *self,
> -			       unsigned long action, void *cpu)
> +void kvm_arch_hardware_disable(void)
>  {
> -	switch (action) {
> -	case CPU_STARTING:
> -	case CPU_STARTING_FROZEN:
> -		if (__hyp_get_vectors() == hyp_default_vectors)
> -			cpu_init_hyp_mode(NULL);
> -		break;
> -	}
> +	phys_addr_t boot_pgd_ptr;
> +	phys_addr_t phys_idmap_start;
>  
> -	return NOTIFY_OK;
> -}
> +	if (__hyp_get_vectors() == hyp_default_vectors)
> +		return;
>  
> -static struct notifier_block hyp_init_cpu_nb = {
> -	.notifier_call = hyp_init_cpu_notify,
> -};
> +	boot_pgd_ptr = kvm_mmu_get_boot_httbr();
> +	phys_idmap_start = kvm_get_idmap_start();
> +
> +	__cpu_reset_hyp_mode(boot_pgd_ptr, phys_idmap_start);
> +}
>  
>  #ifdef CONFIG_CPU_PM
>  static int hyp_init_cpu_pm_notifier(struct notifier_block *self,
>  				    unsigned long cmd,
>  				    void *v)
>  {
> -	if (cmd == CPU_PM_EXIT &&
> -	    __hyp_get_vectors() == hyp_default_vectors) {
> -		cpu_init_hyp_mode(NULL);
> +	switch (cmd) {
> +	case CPU_PM_ENTER:
> +		if (__hyp_get_vectors() != hyp_default_vectors)
> +			__this_cpu_write(kvm_arm_hardware_enabled, 1);
> +		else
> +			__this_cpu_write(kvm_arm_hardware_enabled, 0);
> +		/*
> +		 * don't call kvm_arch_hardware_disable() in case of
> +		 * CPU_PM_ENTER because it does't actually save any state.
> +		 */
> +
> +		return NOTIFY_OK;
> +	case CPU_PM_EXIT:
> +		if (__this_cpu_read(kvm_arm_hardware_enabled))
> +			kvm_arch_hardware_enable();
> +
>  		return NOTIFY_OK;
> -	}
>  
> -	return NOTIFY_DONE;
> +	default:
> +		return NOTIFY_DONE;
> +	}
>  }
>  
>  static struct notifier_block hyp_init_cpu_pm_nb = {
> @@ -1108,11 +1119,6 @@ static int init_hyp_mode(void)
>  	}
>  
>  	/*
> -	 * Execute the init code on each CPU.
> -	 */
> -	on_each_cpu(cpu_init_hyp_mode, NULL, 1);
> -
> -	/*
>  	 * Init HYP view of VGIC
>  	 */
>  	err = kvm_vgic_hyp_init();
> @@ -1186,26 +1192,15 @@ int kvm_arch_init(void *opaque)
>  		}
>  	}
>  
> -	cpu_notifier_register_begin();
> -
>  	err = init_hyp_mode();
>  	if (err)
>  		goto out_err;
>  
> -	err = __register_cpu_notifier(&hyp_init_cpu_nb);
> -	if (err) {
> -		kvm_err("Cannot register HYP init CPU notifier (%d)\n", err);
> -		goto out_err;
> -	}
> -
> -	cpu_notifier_register_done();
> -
>  	hyp_cpu_pm_init();
>  
>  	kvm_coproc_table_init();
>  	return 0;
>  out_err:
> -	cpu_notifier_register_done();
>  	return err;
>  }
>  
> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
> index 6984342..69b4a33 100644
> --- a/arch/arm/kvm/mmu.c
> +++ b/arch/arm/kvm/mmu.c
> @@ -1644,6 +1644,11 @@ phys_addr_t kvm_get_idmap_vector(void)
>  	return hyp_idmap_vector;
>  }
>  
> +phys_addr_t kvm_get_idmap_start(void)
> +{
> +	return hyp_idmap_start;
> +}
> +
>  int kvm_mmu_init(void)
>  {
>  	int err;
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index a35ce72..0b540f8 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -223,6 +223,7 @@ struct kvm_vcpu *kvm_arm_get_running_vcpu(void);
>  struct kvm_vcpu * __percpu *kvm_get_running_vcpus(void);
>  
>  u64 kvm_call_hyp(void *hypfn, ...);
> +void kvm_call_reset(phys_addr_t boot_pgd_ptr, phys_addr_t phys_idmap_start);
>  void force_vm_exit(const cpumask_t *mask);
>  void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot);
>  
> @@ -247,7 +248,20 @@ static inline void __cpu_init_hyp_mode(phys_addr_t boot_pgd_ptr,
>  		     hyp_stack_ptr, vector_ptr);
>  }
>  
> -static inline void kvm_arch_hardware_disable(void) {}
> +static inline void __cpu_reset_hyp_mode(phys_addr_t boot_pgd_ptr,
> +					phys_addr_t phys_idmap_start)
> +{
> +	/*
> +	 * Call reset code, and switch back to stub hyp vectors.
> +	 */
> +	kvm_call_reset(boot_pgd_ptr, phys_idmap_start);
> +}
> +
> +struct vgic_sr_vectors {
> +	void	*save_vgic;
> +	void	*restore_vgic;
> +};
> +

Why is this structure being re-introduced? It has been removed in 48f8bd5.

>  static inline void kvm_arch_hardware_unsetup(void) {}
>  static inline void kvm_arch_sync_events(struct kvm *kvm) {}
>  static inline void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
> diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
> index 6150567..ff5a087 100644
> --- a/arch/arm64/include/asm/kvm_mmu.h
> +++ b/arch/arm64/include/asm/kvm_mmu.h
> @@ -98,6 +98,7 @@ void kvm_mmu_free_memory_caches(struct kvm_vcpu *vcpu);
>  phys_addr_t kvm_mmu_get_httbr(void);
>  phys_addr_t kvm_mmu_get_boot_httbr(void);
>  phys_addr_t kvm_get_idmap_vector(void);
> +phys_addr_t kvm_get_idmap_start(void);
>  int kvm_mmu_init(void);
>  void kvm_clear_hyp_idmap(void);
>  
> diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h
> index 3070096..bca79f9 100644
> --- a/arch/arm64/include/asm/virt.h
> +++ b/arch/arm64/include/asm/virt.h
> @@ -58,9 +58,18 @@
>  
>  #define HVC_CALL_FUNC 3
>  
> +/*
> + * HVC_RESET_CPU - Reset cpu in EL2 to initial state.
> + *
> + * @x0: entry address in trampoline code in va
> + * @x1: identical mapping page table in pa
> + */
> +
>  #define BOOT_CPU_MODE_EL1	(0xe11)
>  #define BOOT_CPU_MODE_EL2	(0xe12)
>  
> +#define HVC_RESET_CPU 4
> +

Why isn't this next to the comment that document it?

>  #ifndef __ASSEMBLY__
>  
>  /*
> diff --git a/arch/arm64/kvm/hyp-init.S b/arch/arm64/kvm/hyp-init.S
> index 2e67a48..1925163 100644
> --- a/arch/arm64/kvm/hyp-init.S
> +++ b/arch/arm64/kvm/hyp-init.S
> @@ -139,6 +139,39 @@ merged:
>  	eret
>  ENDPROC(__kvm_hyp_init)
>  
> +	/*
> +	 * x0: HYP boot pgd
> +	 * x1: HYP phys_idmap_start
> +	 */
> +ENTRY(__kvm_hyp_reset)
> +	/* We're in trampoline code in VA, switch back to boot page tables */
> +	msr	ttbr0_el2, x0
> +	isb
> +
> +	/* Invalidate the old TLBs */
> +	tlbi	alle2
> +	dsb	sy

I find the location of that TLBI a bit weird. If you want to do
something more useful, move it to after the MMU has been disabled.

> +
> +	/* Branch into PA space */
> +	adr	x0, 1f
> +	bfi	x1, x0, #0, #PAGE_SHIFT
> +	br	x1
> +
> +	/* We're now in idmap, disable MMU */
> +1:	mrs	x0, sctlr_el2
> +	ldr	x1, =SCTLR_EL2_FLAGS
> +	bic	x0, x0, x1		// Clear SCTL_M and etc
> +	msr	sctlr_el2, x0
> +	isb
> +
> +	/* Install stub vectors */
> +	adrp	x0, __hyp_stub_vectors
> +	add	x0, x0, #:lo12:__hyp_stub_vectors
> +	msr	vbar_el2, x0
> +
> +	eret
> +ENDPROC(__kvm_hyp_reset)
> +
>  	.ltorg
>  
>  	.popsection
> diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
> index 1bef8db..aca11d6 100644
> --- a/arch/arm64/kvm/hyp.S
> +++ b/arch/arm64/kvm/hyp.S
> @@ -939,6 +939,11 @@ ENTRY(kvm_call_hyp)
>  	ret
>  ENDPROC(kvm_call_hyp)
>  
> +ENTRY(kvm_call_reset)
> +	hvc	#HVC_RESET_CPU
> +	ret
> +ENDPROC(kvm_call_reset)
> +
>  .macro invalid_vector	label, target
>  	.align	2
>  \label:
> @@ -982,10 +987,27 @@ el1_sync:					// Guest trapped into EL2
>  	cmp	x18, #HVC_GET_VECTORS
>  	b.ne	1f
>  	mrs	x0, vbar_el2
> -	b	2f
> -
> -1:	/* Default to HVC_CALL_HYP. */
> +	b	do_eret
>  
> +	/* jump into trampoline code */
> +1:	cmp	x18, #HVC_RESET_CPU
> +	b.ne	2f
> +	/*
> +	 * Entry point is:
> +	 *	TRAMPOLINE_VA
> +	 *	+ (__kvm_hyp_reset - (__hyp_idmap_text_start & PAGE_MASK))
> +	 */
> +	adrp	x2, __kvm_hyp_reset
> +	add	x2, x2, #:lo12:__kvm_hyp_reset
> +	adrp	x3, __hyp_idmap_text_start
> +	add	x3, x3, #:lo12:__hyp_idmap_text_start
> +	and	x3, x3, PAGE_MASK
> +	sub	x2, x2, x3
> +	ldr	x3, =TRAMPOLINE_VA

Nit: You should be able to do a "mov x3, #TRAMPOLINE_VA and avoid the
load from memory.

> +	add	x2, x2, x3
> +	br	x2				// no return
> +
> +2:	/* Default to HVC_CALL_HYP. */
>  	push	lr, xzr
>  
>  	/*
> @@ -999,7 +1021,9 @@ el1_sync:					// Guest trapped into EL2
>  	blr	lr
>  
>  	pop	lr, xzr
> -2:	eret
> +
> +do_eret:
> +	eret
>  
>  el1_trap:
>  	/*
> 

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 04/16] arm64: kvm: allows kvm cpu hotplug
@ 2015-11-27 13:54     ` Marc Zyngier
  0 siblings, 0 replies; 178+ messages in thread
From: Marc Zyngier @ 2015-11-27 13:54 UTC (permalink / raw)
  To: Geoff Levand, Catalin Marinas, Will Deacon
  Cc: Mark Rutland, AKASHI Takahiro, kexec, linux-arm-kernel, christoffer.dall

On 24/11/15 22:25, Geoff Levand wrote:
> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
> 
> The current kvm implementation on arm64 does cpu-specific initialization
> at system boot, and has no way to gracefully shutdown a core in terms of
> kvm. This prevents, especially, kexec from rebooting the system on a boot
> core in EL2.
> 
> This patch adds a cpu tear-down function and also puts an existing cpu-init
> code into a separate function, kvm_arch_hardware_disable() and
> kvm_arch_hardware_enable() respectively.
> We don't need arm64-specific cpu hotplug hook any more.
> 
> Since this patch modifies common part of code between arm and arm64, one
> stub definition, __cpu_reset_hyp_mode(), is added on arm side to avoid
> compiling errors.
> 
> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
> ---
>  arch/arm/include/asm/kvm_host.h   | 10 ++++-
>  arch/arm/include/asm/kvm_mmu.h    |  1 +
>  arch/arm/kvm/arm.c                | 79 ++++++++++++++++++---------------------
>  arch/arm/kvm/mmu.c                |  5 +++
>  arch/arm64/include/asm/kvm_host.h | 16 +++++++-
>  arch/arm64/include/asm/kvm_mmu.h  |  1 +
>  arch/arm64/include/asm/virt.h     |  9 +++++
>  arch/arm64/kvm/hyp-init.S         | 33 ++++++++++++++++
>  arch/arm64/kvm/hyp.S              | 32 ++++++++++++++--
>  9 files changed, 138 insertions(+), 48 deletions(-)
> 
> diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
> index 6692982..9242765 100644
> --- a/arch/arm/include/asm/kvm_host.h
> +++ b/arch/arm/include/asm/kvm_host.h
> @@ -214,6 +214,15 @@ static inline void __cpu_init_hyp_mode(phys_addr_t boot_pgd_ptr,
>  	kvm_call_hyp((void*)hyp_stack_ptr, vector_ptr, pgd_ptr);
>  }
>  
> +static inline void __cpu_reset_hyp_mode(phys_addr_t boot_pgd_ptr,
> +					phys_addr_t phys_idmap_start)
> +{
> +	/*
> +	 * TODO
> +	 * kvm_call_reset(boot_pgd_ptr, phys_idmap_start);
> +	 */
> +}
> +
>  static inline int kvm_arch_dev_ioctl_check_extension(long ext)
>  {
>  	return 0;
> @@ -226,7 +235,6 @@ void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot);
>  
>  struct kvm_vcpu *kvm_mpidr_to_vcpu(struct kvm *kvm, unsigned long mpidr);
>  
> -static inline void kvm_arch_hardware_disable(void) {}
>  static inline void kvm_arch_hardware_unsetup(void) {}
>  static inline void kvm_arch_sync_events(struct kvm *kvm) {}
>  static inline void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
> diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
> index 405aa18..dc6fadf 100644
> --- a/arch/arm/include/asm/kvm_mmu.h
> +++ b/arch/arm/include/asm/kvm_mmu.h
> @@ -66,6 +66,7 @@ void kvm_mmu_free_memory_caches(struct kvm_vcpu *vcpu);
>  phys_addr_t kvm_mmu_get_httbr(void);
>  phys_addr_t kvm_mmu_get_boot_httbr(void);
>  phys_addr_t kvm_get_idmap_vector(void);
> +phys_addr_t kvm_get_idmap_start(void);
>  int kvm_mmu_init(void);
>  void kvm_clear_hyp_idmap(void);
>  
> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
> index eab83b2..a5d9d74 100644
> --- a/arch/arm/kvm/arm.c
> +++ b/arch/arm/kvm/arm.c
> @@ -16,7 +16,6 @@
>   * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
>   */
>  
> -#include <linux/cpu.h>
>  #include <linux/cpu_pm.h>
>  #include <linux/errno.h>
>  #include <linux/err.h>
> @@ -61,6 +60,8 @@ static atomic64_t kvm_vmid_gen = ATOMIC64_INIT(1);
>  static u8 kvm_next_vmid;
>  static DEFINE_SPINLOCK(kvm_vmid_lock);
>  
> +static DEFINE_PER_CPU(unsigned char, kvm_arm_hardware_enabled);
> +
>  static void kvm_arm_set_running_vcpu(struct kvm_vcpu *vcpu)
>  {
>  	BUG_ON(preemptible());
> @@ -85,11 +86,6 @@ struct kvm_vcpu * __percpu *kvm_get_running_vcpus(void)
>  	return &kvm_arm_running_vcpu;
>  }
>  
> -int kvm_arch_hardware_enable(void)
> -{
> -	return 0;
> -}
> -
>  int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu)
>  {
>  	return kvm_vcpu_exiting_guest_mode(vcpu) == IN_GUEST_MODE;
> @@ -959,7 +955,7 @@ long kvm_arch_vm_ioctl(struct file *filp,
>  	}
>  }
>  
> -static void cpu_init_hyp_mode(void *dummy)
> +int kvm_arch_hardware_enable(void)
>  {
>  	phys_addr_t boot_pgd_ptr;
>  	phys_addr_t pgd_ptr;
> @@ -967,6 +963,9 @@ static void cpu_init_hyp_mode(void *dummy)
>  	unsigned long stack_page;
>  	unsigned long vector_ptr;
>  
> +	if (__hyp_get_vectors() != hyp_default_vectors)
> +		return 0;
> +
>  	/* Switch from the HYP stub to our own HYP init vector */
>  	__hyp_set_vectors(kvm_get_idmap_vector());
>  
> @@ -979,38 +978,50 @@ static void cpu_init_hyp_mode(void *dummy)
>  	__cpu_init_hyp_mode(boot_pgd_ptr, pgd_ptr, hyp_stack_ptr, vector_ptr);
>  
>  	kvm_arm_init_debug();
> +
> +	return 0;
>  }
>  
> -static int hyp_init_cpu_notify(struct notifier_block *self,
> -			       unsigned long action, void *cpu)
> +void kvm_arch_hardware_disable(void)
>  {
> -	switch (action) {
> -	case CPU_STARTING:
> -	case CPU_STARTING_FROZEN:
> -		if (__hyp_get_vectors() == hyp_default_vectors)
> -			cpu_init_hyp_mode(NULL);
> -		break;
> -	}
> +	phys_addr_t boot_pgd_ptr;
> +	phys_addr_t phys_idmap_start;
>  
> -	return NOTIFY_OK;
> -}
> +	if (__hyp_get_vectors() == hyp_default_vectors)
> +		return;
>  
> -static struct notifier_block hyp_init_cpu_nb = {
> -	.notifier_call = hyp_init_cpu_notify,
> -};
> +	boot_pgd_ptr = kvm_mmu_get_boot_httbr();
> +	phys_idmap_start = kvm_get_idmap_start();
> +
> +	__cpu_reset_hyp_mode(boot_pgd_ptr, phys_idmap_start);
> +}
>  
>  #ifdef CONFIG_CPU_PM
>  static int hyp_init_cpu_pm_notifier(struct notifier_block *self,
>  				    unsigned long cmd,
>  				    void *v)
>  {
> -	if (cmd == CPU_PM_EXIT &&
> -	    __hyp_get_vectors() == hyp_default_vectors) {
> -		cpu_init_hyp_mode(NULL);
> +	switch (cmd) {
> +	case CPU_PM_ENTER:
> +		if (__hyp_get_vectors() != hyp_default_vectors)
> +			__this_cpu_write(kvm_arm_hardware_enabled, 1);
> +		else
> +			__this_cpu_write(kvm_arm_hardware_enabled, 0);
> +		/*
> +		 * don't call kvm_arch_hardware_disable() in case of
> +		 * CPU_PM_ENTER because it does't actually save any state.
> +		 */
> +
> +		return NOTIFY_OK;
> +	case CPU_PM_EXIT:
> +		if (__this_cpu_read(kvm_arm_hardware_enabled))
> +			kvm_arch_hardware_enable();
> +
>  		return NOTIFY_OK;
> -	}
>  
> -	return NOTIFY_DONE;
> +	default:
> +		return NOTIFY_DONE;
> +	}
>  }
>  
>  static struct notifier_block hyp_init_cpu_pm_nb = {
> @@ -1108,11 +1119,6 @@ static int init_hyp_mode(void)
>  	}
>  
>  	/*
> -	 * Execute the init code on each CPU.
> -	 */
> -	on_each_cpu(cpu_init_hyp_mode, NULL, 1);
> -
> -	/*
>  	 * Init HYP view of VGIC
>  	 */
>  	err = kvm_vgic_hyp_init();
> @@ -1186,26 +1192,15 @@ int kvm_arch_init(void *opaque)
>  		}
>  	}
>  
> -	cpu_notifier_register_begin();
> -
>  	err = init_hyp_mode();
>  	if (err)
>  		goto out_err;
>  
> -	err = __register_cpu_notifier(&hyp_init_cpu_nb);
> -	if (err) {
> -		kvm_err("Cannot register HYP init CPU notifier (%d)\n", err);
> -		goto out_err;
> -	}
> -
> -	cpu_notifier_register_done();
> -
>  	hyp_cpu_pm_init();
>  
>  	kvm_coproc_table_init();
>  	return 0;
>  out_err:
> -	cpu_notifier_register_done();
>  	return err;
>  }
>  
> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
> index 6984342..69b4a33 100644
> --- a/arch/arm/kvm/mmu.c
> +++ b/arch/arm/kvm/mmu.c
> @@ -1644,6 +1644,11 @@ phys_addr_t kvm_get_idmap_vector(void)
>  	return hyp_idmap_vector;
>  }
>  
> +phys_addr_t kvm_get_idmap_start(void)
> +{
> +	return hyp_idmap_start;
> +}
> +
>  int kvm_mmu_init(void)
>  {
>  	int err;
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index a35ce72..0b540f8 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -223,6 +223,7 @@ struct kvm_vcpu *kvm_arm_get_running_vcpu(void);
>  struct kvm_vcpu * __percpu *kvm_get_running_vcpus(void);
>  
>  u64 kvm_call_hyp(void *hypfn, ...);
> +void kvm_call_reset(phys_addr_t boot_pgd_ptr, phys_addr_t phys_idmap_start);
>  void force_vm_exit(const cpumask_t *mask);
>  void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot);
>  
> @@ -247,7 +248,20 @@ static inline void __cpu_init_hyp_mode(phys_addr_t boot_pgd_ptr,
>  		     hyp_stack_ptr, vector_ptr);
>  }
>  
> -static inline void kvm_arch_hardware_disable(void) {}
> +static inline void __cpu_reset_hyp_mode(phys_addr_t boot_pgd_ptr,
> +					phys_addr_t phys_idmap_start)
> +{
> +	/*
> +	 * Call reset code, and switch back to stub hyp vectors.
> +	 */
> +	kvm_call_reset(boot_pgd_ptr, phys_idmap_start);
> +}
> +
> +struct vgic_sr_vectors {
> +	void	*save_vgic;
> +	void	*restore_vgic;
> +};
> +

Why is this structure being re-introduced? It has been removed in 48f8bd5.

>  static inline void kvm_arch_hardware_unsetup(void) {}
>  static inline void kvm_arch_sync_events(struct kvm *kvm) {}
>  static inline void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
> diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
> index 6150567..ff5a087 100644
> --- a/arch/arm64/include/asm/kvm_mmu.h
> +++ b/arch/arm64/include/asm/kvm_mmu.h
> @@ -98,6 +98,7 @@ void kvm_mmu_free_memory_caches(struct kvm_vcpu *vcpu);
>  phys_addr_t kvm_mmu_get_httbr(void);
>  phys_addr_t kvm_mmu_get_boot_httbr(void);
>  phys_addr_t kvm_get_idmap_vector(void);
> +phys_addr_t kvm_get_idmap_start(void);
>  int kvm_mmu_init(void);
>  void kvm_clear_hyp_idmap(void);
>  
> diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h
> index 3070096..bca79f9 100644
> --- a/arch/arm64/include/asm/virt.h
> +++ b/arch/arm64/include/asm/virt.h
> @@ -58,9 +58,18 @@
>  
>  #define HVC_CALL_FUNC 3
>  
> +/*
> + * HVC_RESET_CPU - Reset cpu in EL2 to initial state.
> + *
> + * @x0: entry address in trampoline code in va
> + * @x1: identical mapping page table in pa
> + */
> +
>  #define BOOT_CPU_MODE_EL1	(0xe11)
>  #define BOOT_CPU_MODE_EL2	(0xe12)
>  
> +#define HVC_RESET_CPU 4
> +

Why isn't this next to the comment that document it?

>  #ifndef __ASSEMBLY__
>  
>  /*
> diff --git a/arch/arm64/kvm/hyp-init.S b/arch/arm64/kvm/hyp-init.S
> index 2e67a48..1925163 100644
> --- a/arch/arm64/kvm/hyp-init.S
> +++ b/arch/arm64/kvm/hyp-init.S
> @@ -139,6 +139,39 @@ merged:
>  	eret
>  ENDPROC(__kvm_hyp_init)
>  
> +	/*
> +	 * x0: HYP boot pgd
> +	 * x1: HYP phys_idmap_start
> +	 */
> +ENTRY(__kvm_hyp_reset)
> +	/* We're in trampoline code in VA, switch back to boot page tables */
> +	msr	ttbr0_el2, x0
> +	isb
> +
> +	/* Invalidate the old TLBs */
> +	tlbi	alle2
> +	dsb	sy

I find the location of that TLBI a bit weird. If you want to do
something more useful, move it to after the MMU has been disabled.

> +
> +	/* Branch into PA space */
> +	adr	x0, 1f
> +	bfi	x1, x0, #0, #PAGE_SHIFT
> +	br	x1
> +
> +	/* We're now in idmap, disable MMU */
> +1:	mrs	x0, sctlr_el2
> +	ldr	x1, =SCTLR_EL2_FLAGS
> +	bic	x0, x0, x1		// Clear SCTL_M and etc
> +	msr	sctlr_el2, x0
> +	isb
> +
> +	/* Install stub vectors */
> +	adrp	x0, __hyp_stub_vectors
> +	add	x0, x0, #:lo12:__hyp_stub_vectors
> +	msr	vbar_el2, x0
> +
> +	eret
> +ENDPROC(__kvm_hyp_reset)
> +
>  	.ltorg
>  
>  	.popsection
> diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
> index 1bef8db..aca11d6 100644
> --- a/arch/arm64/kvm/hyp.S
> +++ b/arch/arm64/kvm/hyp.S
> @@ -939,6 +939,11 @@ ENTRY(kvm_call_hyp)
>  	ret
>  ENDPROC(kvm_call_hyp)
>  
> +ENTRY(kvm_call_reset)
> +	hvc	#HVC_RESET_CPU
> +	ret
> +ENDPROC(kvm_call_reset)
> +
>  .macro invalid_vector	label, target
>  	.align	2
>  \label:
> @@ -982,10 +987,27 @@ el1_sync:					// Guest trapped into EL2
>  	cmp	x18, #HVC_GET_VECTORS
>  	b.ne	1f
>  	mrs	x0, vbar_el2
> -	b	2f
> -
> -1:	/* Default to HVC_CALL_HYP. */
> +	b	do_eret
>  
> +	/* jump into trampoline code */
> +1:	cmp	x18, #HVC_RESET_CPU
> +	b.ne	2f
> +	/*
> +	 * Entry point is:
> +	 *	TRAMPOLINE_VA
> +	 *	+ (__kvm_hyp_reset - (__hyp_idmap_text_start & PAGE_MASK))
> +	 */
> +	adrp	x2, __kvm_hyp_reset
> +	add	x2, x2, #:lo12:__kvm_hyp_reset
> +	adrp	x3, __hyp_idmap_text_start
> +	add	x3, x3, #:lo12:__hyp_idmap_text_start
> +	and	x3, x3, PAGE_MASK
> +	sub	x2, x2, x3
> +	ldr	x3, =TRAMPOLINE_VA

Nit: You should be able to do a "mov x3, #TRAMPOLINE_VA and avoid the
load from memory.

> +	add	x2, x2, x3
> +	br	x2				// no return
> +
> +2:	/* Default to HVC_CALL_HYP. */
>  	push	lr, xzr
>  
>  	/*
> @@ -999,7 +1021,9 @@ el1_sync:					// Guest trapped into EL2
>  	blr	lr
>  
>  	pop	lr, xzr
> -2:	eret
> +
> +do_eret:
> +	eret
>  
>  el1_trap:
>  	/*
> 

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 05/16] arm64: Add back cpu_reset routines
  2015-11-24 22:25   ` Geoff Levand
@ 2015-11-27 14:19     ` Marc Zyngier
  -1 siblings, 0 replies; 178+ messages in thread
From: Marc Zyngier @ 2015-11-27 14:19 UTC (permalink / raw)
  To: linux-arm-kernel

On 24/11/15 22:25, Geoff Levand wrote:
> Commit 68234df4ea7939f98431aa81113fbdce10c4a84b (arm64: kill flush_cache_all())
> removed the global arm64 routines cpu_reset() and cpu_soft_restart() needed by
> the arm64 kexec and kdump support.  Add simplified versions of those two
> routines back with some changes needed for kexec in the new files cpu_reset.S,
> and cpu_reset.h.
> 
> When a CPU is reset it needs to be put into the exception level it had
> when it entered the kernel. Update cpu_reset() to accept an argument
> which signals if the reset address needs to be entered at EL1 or EL2.
> 
> Signed-off-by: Geoff Levand <geoff@infradead.org>
> ---
>  arch/arm64/kernel/cpu-reset.S | 84 +++++++++++++++++++++++++++++++++++++++++++
>  arch/arm64/kernel/cpu-reset.h | 22 ++++++++++++
>  2 files changed, 106 insertions(+)
>  create mode 100644 arch/arm64/kernel/cpu-reset.S
>  create mode 100644 arch/arm64/kernel/cpu-reset.h
> 
> diff --git a/arch/arm64/kernel/cpu-reset.S b/arch/arm64/kernel/cpu-reset.S
> new file mode 100644
> index 0000000..0423f27
> --- /dev/null
> +++ b/arch/arm64/kernel/cpu-reset.S
> @@ -0,0 +1,84 @@
> +/*
> + * cpu reset routines
> + *
> + * Copyright (C) 2001 Deep Blue Solutions Ltd.
> + * Copyright (C) 2012 ARM Ltd.
> + * Copyright (C) 2015 Huawei Futurewei Technologies.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +
> +#include <linux/linkage.h>
> +#include <asm/assembler.h>
> +#include <asm/virt.h>
> +
> +#define SCTLR_ELx_I	(1 << 12)
> +#define SCTLR_ELx_SA	(1 << 3)
> +#define SCTLR_ELx_C	(1 << 2)
> +#define SCTLR_ELx_A	(1 << 1)
> +#define SCTLR_ELx_M	1

kvm_arm.h already has some of these defined, and sysreg.h has another
bunch. Can you please consolidate this into a single place (probably
sysreg.h)?

> +#define SCTLR_ELx_FLAGS	(SCTLR_ELx_M | SCTLR_ELx_A | SCTLR_ELx_C |	\
> +			 SCTLR_ELx_SA | SCTLR_ELx_I)
> +
> +.text
> +.pushsection    .idmap.text, "ax"
> +
> +.align 5

Why is this .align directive required?

> +
> +/*
> + * cpu_soft_restart(cpu_reset, el2_switch, entry, arg0, arg1) - Perform a cpu
> + * soft reset.
> + *
> + * @cpu_reset: Physical address of the cpu_reset routine.
> + * @el2_switch: Flag to indicate a swich to EL2 is needed, passed to cpu_reset.
> + * @entry: Location to jump to for soft reset, passed to cpu_reset.
> + * arg0: First argument passed to @entry.
> + * arg1: Second argument passed to @entry.
> + */
> +
> +ENTRY(cpu_soft_restart)
> +	mov	x18, x0				// cpu_reset
> +	mov	x0, x1				// el2_switch
> +	mov	x1, x2				// entry
> +	mov	x2, x3				// arg0
> +	mov	x3, x4				// arg1
> +	ret	x18
> +ENDPROC(cpu_soft_restart)

Grepping through the tree, I can only find a single use of
cpu_soft_restart, with cpu_reset as its first parameter.

Why do we need this indirection? Having

void cpu_soft_restart(el2_switch, entry, arg0, arg1);

should be enough...

> +
> +/*
> + * cpu_reset(el2_switch, entry, arg0, arg1) - Helper for cpu_soft_restart.
> + *
> + * @el2_switch: Flag to indicate a swich to EL2 is needed.
> + * @entry: Location to jump to for soft reset.
> + * arg0: First argument passed to @entry.
> + * arg1: Second argument passed to @entry.
> + *
> + * Put the CPU into the same state as it would be if it had been reset, and
> + * branch to what would be the reset vector. It must be executed with the
> + * flat identity mapping.
> + */
> +
> +ENTRY(cpu_reset)
> +	/* Clear sctlr_el1 flags. */
> +	mrs	x12, sctlr_el1
> +	ldr	x13, =SCTLR_ELx_FLAGS
> +	bic	x12, x12, x13
> +	msr	sctlr_el1, x12
> +	isb
> +
> +	cbz	x0, 1f				// el2_switch?
> +
> +	mov	x0, x1				// entry
> +	mov	x1, x2				// arg0
> +	mov	x2, x3				// arg1
> +	hvc	#HVC_CALL_FUNC			// no return
> +
> +1:	mov	x18, x1				// entry
> +	mov	x0, x2				// arg0
> +	mov	x1, x3				// arg1
> +	ret	x18
> +ENDPROC(cpu_reset)

And this can be an actual part of cpu_soft_reset.

> +
> +.popsection
> diff --git a/arch/arm64/kernel/cpu-reset.h b/arch/arm64/kernel/cpu-reset.h
> new file mode 100644
> index 0000000..79816b6
> --- /dev/null
> +++ b/arch/arm64/kernel/cpu-reset.h
> @@ -0,0 +1,22 @@
> +/*
> + * cpu reset routines
> + *
> + * Copyright (C) 2015 Huawei Futurewei Technologies.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +
> +#if !defined(_ARM64_CPU_RESET_H)

The usual idiom is "#ifndef __BLAH_H".

> +#define _ARM64_CPU_RESET_H
> +
> +#include <asm/virt.h>
> +
> +void __attribute__((noreturn)) cpu_reset(unsigned long el2_switch,
> +	unsigned long entry, unsigned long arg0, unsigned long arg1);
> +void __attribute__((noreturn)) cpu_soft_restart(phys_addr_t cpu_reset,
> +	unsigned long el2_switch, unsigned long entry, unsigned long arg0,
> +	unsigned long arg1);
> +
> +#endif
> 

There is a __noreturn attribute already defined.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 05/16] arm64: Add back cpu_reset routines
@ 2015-11-27 14:19     ` Marc Zyngier
  0 siblings, 0 replies; 178+ messages in thread
From: Marc Zyngier @ 2015-11-27 14:19 UTC (permalink / raw)
  To: Geoff Levand, Catalin Marinas, Will Deacon
  Cc: Mark Rutland, AKASHI Takahiro, kexec, linux-arm-kernel, christoffer.dall

On 24/11/15 22:25, Geoff Levand wrote:
> Commit 68234df4ea7939f98431aa81113fbdce10c4a84b (arm64: kill flush_cache_all())
> removed the global arm64 routines cpu_reset() and cpu_soft_restart() needed by
> the arm64 kexec and kdump support.  Add simplified versions of those two
> routines back with some changes needed for kexec in the new files cpu_reset.S,
> and cpu_reset.h.
> 
> When a CPU is reset it needs to be put into the exception level it had
> when it entered the kernel. Update cpu_reset() to accept an argument
> which signals if the reset address needs to be entered at EL1 or EL2.
> 
> Signed-off-by: Geoff Levand <geoff@infradead.org>
> ---
>  arch/arm64/kernel/cpu-reset.S | 84 +++++++++++++++++++++++++++++++++++++++++++
>  arch/arm64/kernel/cpu-reset.h | 22 ++++++++++++
>  2 files changed, 106 insertions(+)
>  create mode 100644 arch/arm64/kernel/cpu-reset.S
>  create mode 100644 arch/arm64/kernel/cpu-reset.h
> 
> diff --git a/arch/arm64/kernel/cpu-reset.S b/arch/arm64/kernel/cpu-reset.S
> new file mode 100644
> index 0000000..0423f27
> --- /dev/null
> +++ b/arch/arm64/kernel/cpu-reset.S
> @@ -0,0 +1,84 @@
> +/*
> + * cpu reset routines
> + *
> + * Copyright (C) 2001 Deep Blue Solutions Ltd.
> + * Copyright (C) 2012 ARM Ltd.
> + * Copyright (C) 2015 Huawei Futurewei Technologies.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +
> +#include <linux/linkage.h>
> +#include <asm/assembler.h>
> +#include <asm/virt.h>
> +
> +#define SCTLR_ELx_I	(1 << 12)
> +#define SCTLR_ELx_SA	(1 << 3)
> +#define SCTLR_ELx_C	(1 << 2)
> +#define SCTLR_ELx_A	(1 << 1)
> +#define SCTLR_ELx_M	1

kvm_arm.h already has some of these defined, and sysreg.h has another
bunch. Can you please consolidate this into a single place (probably
sysreg.h)?

> +#define SCTLR_ELx_FLAGS	(SCTLR_ELx_M | SCTLR_ELx_A | SCTLR_ELx_C |	\
> +			 SCTLR_ELx_SA | SCTLR_ELx_I)
> +
> +.text
> +.pushsection    .idmap.text, "ax"
> +
> +.align 5

Why is this .align directive required?

> +
> +/*
> + * cpu_soft_restart(cpu_reset, el2_switch, entry, arg0, arg1) - Perform a cpu
> + * soft reset.
> + *
> + * @cpu_reset: Physical address of the cpu_reset routine.
> + * @el2_switch: Flag to indicate a swich to EL2 is needed, passed to cpu_reset.
> + * @entry: Location to jump to for soft reset, passed to cpu_reset.
> + * arg0: First argument passed to @entry.
> + * arg1: Second argument passed to @entry.
> + */
> +
> +ENTRY(cpu_soft_restart)
> +	mov	x18, x0				// cpu_reset
> +	mov	x0, x1				// el2_switch
> +	mov	x1, x2				// entry
> +	mov	x2, x3				// arg0
> +	mov	x3, x4				// arg1
> +	ret	x18
> +ENDPROC(cpu_soft_restart)

Grepping through the tree, I can only find a single use of
cpu_soft_restart, with cpu_reset as its first parameter.

Why do we need this indirection? Having

void cpu_soft_restart(el2_switch, entry, arg0, arg1);

should be enough...

> +
> +/*
> + * cpu_reset(el2_switch, entry, arg0, arg1) - Helper for cpu_soft_restart.
> + *
> + * @el2_switch: Flag to indicate a swich to EL2 is needed.
> + * @entry: Location to jump to for soft reset.
> + * arg0: First argument passed to @entry.
> + * arg1: Second argument passed to @entry.
> + *
> + * Put the CPU into the same state as it would be if it had been reset, and
> + * branch to what would be the reset vector. It must be executed with the
> + * flat identity mapping.
> + */
> +
> +ENTRY(cpu_reset)
> +	/* Clear sctlr_el1 flags. */
> +	mrs	x12, sctlr_el1
> +	ldr	x13, =SCTLR_ELx_FLAGS
> +	bic	x12, x12, x13
> +	msr	sctlr_el1, x12
> +	isb
> +
> +	cbz	x0, 1f				// el2_switch?
> +
> +	mov	x0, x1				// entry
> +	mov	x1, x2				// arg0
> +	mov	x2, x3				// arg1
> +	hvc	#HVC_CALL_FUNC			// no return
> +
> +1:	mov	x18, x1				// entry
> +	mov	x0, x2				// arg0
> +	mov	x1, x3				// arg1
> +	ret	x18
> +ENDPROC(cpu_reset)

And this can be an actual part of cpu_soft_reset.

> +
> +.popsection
> diff --git a/arch/arm64/kernel/cpu-reset.h b/arch/arm64/kernel/cpu-reset.h
> new file mode 100644
> index 0000000..79816b6
> --- /dev/null
> +++ b/arch/arm64/kernel/cpu-reset.h
> @@ -0,0 +1,22 @@
> +/*
> + * cpu reset routines
> + *
> + * Copyright (C) 2015 Huawei Futurewei Technologies.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +
> +#if !defined(_ARM64_CPU_RESET_H)

The usual idiom is "#ifndef __BLAH_H".

> +#define _ARM64_CPU_RESET_H
> +
> +#include <asm/virt.h>
> +
> +void __attribute__((noreturn)) cpu_reset(unsigned long el2_switch,
> +	unsigned long entry, unsigned long arg0, unsigned long arg1);
> +void __attribute__((noreturn)) cpu_soft_restart(phys_addr_t cpu_reset,
> +	unsigned long el2_switch, unsigned long entry, unsigned long arg0,
> +	unsigned long arg1);
> +
> +#endif
> 

There is a __noreturn attribute already defined.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 12/16] arm64: kdump: implement machine_crash_shutdown()
  2015-11-24 22:25   ` Geoff Levand
@ 2015-11-27 14:39     ` Marc Zyngier
  -1 siblings, 0 replies; 178+ messages in thread
From: Marc Zyngier @ 2015-11-27 14:39 UTC (permalink / raw)
  To: linux-arm-kernel

On 24/11/15 22:25, Geoff Levand wrote:
> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
> 
> kdump calls machine_crash_shutdown() to shut down non-boot cpus and
> save registers' status in per-cpu ELF notes before starting the crash
> dump kernel. See kernel_kexec().
> 
> ipi_cpu_stop() is a bit modified and used to support this behavior.
> 
> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
> ---
>  arch/arm64/include/asm/kexec.h    | 34 +++++++++++++++++++++++++++++++++-
>  arch/arm64/kernel/machine_kexec.c | 31 +++++++++++++++++++++++++++++--
>  arch/arm64/kernel/smp.c           | 16 ++++++++++++++--
>  3 files changed, 76 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
> index 46d63cd..555a955 100644
> --- a/arch/arm64/include/asm/kexec.h
> +++ b/arch/arm64/include/asm/kexec.h
> @@ -30,6 +30,8 @@
>  
>  #if !defined(__ASSEMBLY__)
>  
> +extern bool in_crash_kexec;
> +
>  /**
>   * crash_setup_regs() - save registers for the panic kernel
>   *
> @@ -40,7 +42,37 @@
>  static inline void crash_setup_regs(struct pt_regs *newregs,
>  				    struct pt_regs *oldregs)
>  {
> -	/* Empty routine needed to avoid build errors. */
> +	if (oldregs) {
> +		memcpy(newregs, oldregs, sizeof(*newregs));
> +	} else {
> +		__asm__ __volatile__ (
> +			"stp	 x0,   x1, [%3, #16 *  0]\n"
> +			"stp	 x2,   x3, [%3, #16 *  1]\n"
> +			"stp	 x4,   x5, [%3, #16 *  2]\n"
> +			"stp	 x6,   x7, [%3, #16 *  3]\n"
> +			"stp	 x8,   x9, [%3, #16 *  4]\n"
> +			"stp	x10,  x11, [%3, #16 *  5]\n"
> +			"stp	x12,  x13, [%3, #16 *  6]\n"
> +			"stp	x14,  x15, [%3, #16 *  7]\n"
> +			"stp	x16,  x17, [%3, #16 *  8]\n"
> +			"stp	x18,  x19, [%3, #16 *  9]\n"
> +			"stp	x20,  x21, [%3, #16 * 10]\n"
> +			"stp	x22,  x23, [%3, #16 * 11]\n"
> +			"stp	x24,  x25, [%3, #16 * 12]\n"
> +			"stp	x26,  x27, [%3, #16 * 13]\n"
> +			"stp	x28,  x29, [%3, #16 * 14]\n"
> +			"str	x30,	   [%3, #16 * 15]\n"
> +			"mov	%0, sp\n"
> +			"adr	%1, 1f\n"
> +			"mrs	%2, spsr_el1\n"
> +		"1:"
> +			: "=r" (newregs->sp),
> +			  "=r" (newregs->pc),
> +			  "=r" (newregs->pstate)
> +			: "r"  (&newregs->regs)
> +			: "memory"
> +		);

I wonder how useful this thing is, given that it starts by corrupting
whatever register is holding newregs->regs. Maybe this is not supposed
to be accurate anyway...


> +	}
>  }
>  
>  #endif /* !defined(__ASSEMBLY__) */
> diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
> index da28a26..d2d7e90 100644
> --- a/arch/arm64/kernel/machine_kexec.c
> +++ b/arch/arm64/kernel/machine_kexec.c
> @@ -9,6 +9,7 @@
>   * published by the Free Software Foundation.
>   */
>  
> +#include <linux/kernel.h>
>  #include <linux/kexec.h>
>  #include <linux/of_fdt.h>
>  #include <linux/slab.h>
> @@ -23,6 +24,7 @@
>  extern const unsigned char arm64_relocate_new_kernel[];
>  extern const unsigned long arm64_relocate_new_kernel_size;
>  
> +bool in_crash_kexec;
>  static unsigned long kimage_start;
>  
>  /**
> @@ -203,13 +205,38 @@ void machine_kexec(struct kimage *kimage)
>  	 */
>  
>  	cpu_soft_restart(virt_to_phys(cpu_reset),
> -		is_hyp_mode_available(),
> +		in_crash_kexec ? 0 : is_hyp_mode_available(),
>  		reboot_code_buffer_phys, kimage->head, kimage_start);
>  
>  	BUG(); /* Should never get here. */
>  }
>  
> +/**
> + * machine_crash_shutdown - shutdown non-boot cpus and save registers
> + */
>  void machine_crash_shutdown(struct pt_regs *regs)
>  {
> -	/* Empty routine needed to avoid build errors. */
> +	struct pt_regs dummy_regs;
> +	int cpu;
> +
> +	local_irq_disable();
> +
> +	in_crash_kexec = true;
> +
> +	/*
> +	 * clear and initialize the per-cpu info. This is necessary
> +	 * because, otherwise, slots for offline cpus would never be
> +	 * filled up. See smp_send_stop().
> +	 */
> +	memset(&dummy_regs, 0, sizeof(dummy_regs));
> +	for_each_possible_cpu(cpu)
> +		crash_save_cpu(&dummy_regs, cpu);
> +
> +	/* shutdown non-boot cpus */
> +	smp_send_stop();
> +
> +	/* for boot cpu */
> +	crash_save_cpu(regs, smp_processor_id());
> +
> +	pr_info("Starting crashdump kernel...\n");
>  }
> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> index b1adc51..15aabef 100644
> --- a/arch/arm64/kernel/smp.c
> +++ b/arch/arm64/kernel/smp.c
> @@ -37,6 +37,7 @@
>  #include <linux/completion.h>
>  #include <linux/of.h>
>  #include <linux/irq_work.h>
> +#include <linux/kexec.h>
>  
>  #include <asm/alternative.h>
>  #include <asm/atomic.h>
> @@ -54,6 +55,8 @@
>  #include <asm/ptrace.h>
>  #include <asm/virt.h>
>  
> +#include "cpu-reset.h"
> +
>  #define CREATE_TRACE_POINTS
>  #include <trace/events/ipi.h>
>  
> @@ -683,8 +686,12 @@ static DEFINE_RAW_SPINLOCK(stop_lock);
>  /*
>   * ipi_cpu_stop - handle IPI from smp_send_stop()
>   */
> -static void ipi_cpu_stop(unsigned int cpu)
> +static void ipi_cpu_stop(unsigned int cpu, struct pt_regs *regs)
>  {
> +#ifdef CONFIG_KEXEC
> +	/* printing messages may slow down the shutdown. */
> +	if (!in_crash_kexec)
> +#endif
>  	if (system_state == SYSTEM_BOOTING ||
>  	    system_state == SYSTEM_RUNNING) {
>  		raw_spin_lock(&stop_lock);

Irrespective of how useful this change is, how about having a predicate
instead? Something like:

static inline bool is_in_crash_kexec(void)
{
#ifdef CONFIG_KEXEC
	return in_crash_kexec;
#else
	return false;
#endif
}

located in machine_kexec.c (making the in_crash_kernel static), and then

	if (!is_in_crash_kexec() && (systen_state == ... || ...) {

It would certainly look better.

> @@ -697,6 +704,11 @@ static void ipi_cpu_stop(unsigned int cpu)
>  
>  	local_irq_disable();
>  
> +#ifdef CONFIG_KEXEC
> +	if (in_crash_kexec)
> +		crash_save_cpu(regs, cpu);
> +#endif /* CONFIG_KEXEC */
> +

Same here.

>  	while (1)
>  		cpu_relax();
>  }
> @@ -727,7 +739,7 @@ void handle_IPI(int ipinr, struct pt_regs *regs)
>  
>  	case IPI_CPU_STOP:
>  		irq_enter();
> -		ipi_cpu_stop(cpu);
> +		ipi_cpu_stop(cpu, regs);
>  		irq_exit();
>  		break;
>  
> 

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 12/16] arm64: kdump: implement machine_crash_shutdown()
@ 2015-11-27 14:39     ` Marc Zyngier
  0 siblings, 0 replies; 178+ messages in thread
From: Marc Zyngier @ 2015-11-27 14:39 UTC (permalink / raw)
  To: Geoff Levand, Catalin Marinas, Will Deacon
  Cc: Mark Rutland, AKASHI Takahiro, kexec, linux-arm-kernel, christoffer.dall

On 24/11/15 22:25, Geoff Levand wrote:
> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
> 
> kdump calls machine_crash_shutdown() to shut down non-boot cpus and
> save registers' status in per-cpu ELF notes before starting the crash
> dump kernel. See kernel_kexec().
> 
> ipi_cpu_stop() is a bit modified and used to support this behavior.
> 
> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
> ---
>  arch/arm64/include/asm/kexec.h    | 34 +++++++++++++++++++++++++++++++++-
>  arch/arm64/kernel/machine_kexec.c | 31 +++++++++++++++++++++++++++++--
>  arch/arm64/kernel/smp.c           | 16 ++++++++++++++--
>  3 files changed, 76 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
> index 46d63cd..555a955 100644
> --- a/arch/arm64/include/asm/kexec.h
> +++ b/arch/arm64/include/asm/kexec.h
> @@ -30,6 +30,8 @@
>  
>  #if !defined(__ASSEMBLY__)
>  
> +extern bool in_crash_kexec;
> +
>  /**
>   * crash_setup_regs() - save registers for the panic kernel
>   *
> @@ -40,7 +42,37 @@
>  static inline void crash_setup_regs(struct pt_regs *newregs,
>  				    struct pt_regs *oldregs)
>  {
> -	/* Empty routine needed to avoid build errors. */
> +	if (oldregs) {
> +		memcpy(newregs, oldregs, sizeof(*newregs));
> +	} else {
> +		__asm__ __volatile__ (
> +			"stp	 x0,   x1, [%3, #16 *  0]\n"
> +			"stp	 x2,   x3, [%3, #16 *  1]\n"
> +			"stp	 x4,   x5, [%3, #16 *  2]\n"
> +			"stp	 x6,   x7, [%3, #16 *  3]\n"
> +			"stp	 x8,   x9, [%3, #16 *  4]\n"
> +			"stp	x10,  x11, [%3, #16 *  5]\n"
> +			"stp	x12,  x13, [%3, #16 *  6]\n"
> +			"stp	x14,  x15, [%3, #16 *  7]\n"
> +			"stp	x16,  x17, [%3, #16 *  8]\n"
> +			"stp	x18,  x19, [%3, #16 *  9]\n"
> +			"stp	x20,  x21, [%3, #16 * 10]\n"
> +			"stp	x22,  x23, [%3, #16 * 11]\n"
> +			"stp	x24,  x25, [%3, #16 * 12]\n"
> +			"stp	x26,  x27, [%3, #16 * 13]\n"
> +			"stp	x28,  x29, [%3, #16 * 14]\n"
> +			"str	x30,	   [%3, #16 * 15]\n"
> +			"mov	%0, sp\n"
> +			"adr	%1, 1f\n"
> +			"mrs	%2, spsr_el1\n"
> +		"1:"
> +			: "=r" (newregs->sp),
> +			  "=r" (newregs->pc),
> +			  "=r" (newregs->pstate)
> +			: "r"  (&newregs->regs)
> +			: "memory"
> +		);

I wonder how useful this thing is, given that it starts by corrupting
whatever register is holding newregs->regs. Maybe this is not supposed
to be accurate anyway...


> +	}
>  }
>  
>  #endif /* !defined(__ASSEMBLY__) */
> diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
> index da28a26..d2d7e90 100644
> --- a/arch/arm64/kernel/machine_kexec.c
> +++ b/arch/arm64/kernel/machine_kexec.c
> @@ -9,6 +9,7 @@
>   * published by the Free Software Foundation.
>   */
>  
> +#include <linux/kernel.h>
>  #include <linux/kexec.h>
>  #include <linux/of_fdt.h>
>  #include <linux/slab.h>
> @@ -23,6 +24,7 @@
>  extern const unsigned char arm64_relocate_new_kernel[];
>  extern const unsigned long arm64_relocate_new_kernel_size;
>  
> +bool in_crash_kexec;
>  static unsigned long kimage_start;
>  
>  /**
> @@ -203,13 +205,38 @@ void machine_kexec(struct kimage *kimage)
>  	 */
>  
>  	cpu_soft_restart(virt_to_phys(cpu_reset),
> -		is_hyp_mode_available(),
> +		in_crash_kexec ? 0 : is_hyp_mode_available(),
>  		reboot_code_buffer_phys, kimage->head, kimage_start);
>  
>  	BUG(); /* Should never get here. */
>  }
>  
> +/**
> + * machine_crash_shutdown - shutdown non-boot cpus and save registers
> + */
>  void machine_crash_shutdown(struct pt_regs *regs)
>  {
> -	/* Empty routine needed to avoid build errors. */
> +	struct pt_regs dummy_regs;
> +	int cpu;
> +
> +	local_irq_disable();
> +
> +	in_crash_kexec = true;
> +
> +	/*
> +	 * clear and initialize the per-cpu info. This is necessary
> +	 * because, otherwise, slots for offline cpus would never be
> +	 * filled up. See smp_send_stop().
> +	 */
> +	memset(&dummy_regs, 0, sizeof(dummy_regs));
> +	for_each_possible_cpu(cpu)
> +		crash_save_cpu(&dummy_regs, cpu);
> +
> +	/* shutdown non-boot cpus */
> +	smp_send_stop();
> +
> +	/* for boot cpu */
> +	crash_save_cpu(regs, smp_processor_id());
> +
> +	pr_info("Starting crashdump kernel...\n");
>  }
> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> index b1adc51..15aabef 100644
> --- a/arch/arm64/kernel/smp.c
> +++ b/arch/arm64/kernel/smp.c
> @@ -37,6 +37,7 @@
>  #include <linux/completion.h>
>  #include <linux/of.h>
>  #include <linux/irq_work.h>
> +#include <linux/kexec.h>
>  
>  #include <asm/alternative.h>
>  #include <asm/atomic.h>
> @@ -54,6 +55,8 @@
>  #include <asm/ptrace.h>
>  #include <asm/virt.h>
>  
> +#include "cpu-reset.h"
> +
>  #define CREATE_TRACE_POINTS
>  #include <trace/events/ipi.h>
>  
> @@ -683,8 +686,12 @@ static DEFINE_RAW_SPINLOCK(stop_lock);
>  /*
>   * ipi_cpu_stop - handle IPI from smp_send_stop()
>   */
> -static void ipi_cpu_stop(unsigned int cpu)
> +static void ipi_cpu_stop(unsigned int cpu, struct pt_regs *regs)
>  {
> +#ifdef CONFIG_KEXEC
> +	/* printing messages may slow down the shutdown. */
> +	if (!in_crash_kexec)
> +#endif
>  	if (system_state == SYSTEM_BOOTING ||
>  	    system_state == SYSTEM_RUNNING) {
>  		raw_spin_lock(&stop_lock);

Irrespective of how useful this change is, how about having a predicate
instead? Something like:

static inline bool is_in_crash_kexec(void)
{
#ifdef CONFIG_KEXEC
	return in_crash_kexec;
#else
	return false;
#endif
}

located in machine_kexec.c (making the in_crash_kernel static), and then

	if (!is_in_crash_kexec() && (systen_state == ... || ...) {

It would certainly look better.

> @@ -697,6 +704,11 @@ static void ipi_cpu_stop(unsigned int cpu)
>  
>  	local_irq_disable();
>  
> +#ifdef CONFIG_KEXEC
> +	if (in_crash_kexec)
> +		crash_save_cpu(regs, cpu);
> +#endif /* CONFIG_KEXEC */
> +

Same here.

>  	while (1)
>  		cpu_relax();
>  }
> @@ -727,7 +739,7 @@ void handle_IPI(int ipinr, struct pt_regs *regs)
>  
>  	case IPI_CPU_STOP:
>  		irq_enter();
> -		ipi_cpu_stop(cpu);
> +		ipi_cpu_stop(cpu, regs);
>  		irq_exit();
>  		break;
>  
> 

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 05/16] arm64: Add back cpu_reset routines
  2015-11-27 14:19     ` Marc Zyngier
@ 2015-11-30  5:28       ` Pratyush Anand
  -1 siblings, 0 replies; 178+ messages in thread
From: Pratyush Anand @ 2015-11-30  5:28 UTC (permalink / raw)
  To: linux-arm-kernel

On 27/11/2015:02:19:37 PM, Marc Zyngier wrote:
> On 24/11/15 22:25, Geoff Levand wrote:
> > +ENTRY(cpu_soft_restart)
> > +	mov	x18, x0				// cpu_reset
> > +	mov	x0, x1				// el2_switch
> > +	mov	x1, x2				// entry
> > +	mov	x2, x3				// arg0
> > +	mov	x3, x4				// arg1
> > +	ret	x18
> > +ENDPROC(cpu_soft_restart)
> 
> Grepping through the tree, I can only find a single use of
> cpu_soft_restart, with cpu_reset as its first parameter.
> 
> Why do we need this indirection? Having

It is needed because we need to execute cpu_reset() in physical address space.

> 
> void cpu_soft_restart(el2_switch, entry, arg0, arg1);
> 
> should be enough...

We can do with only cpu_soft_restart(), but then a function pointer to __pa() of
it need to be called. May  be current approach is more cleaner.

~Pratyush

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 05/16] arm64: Add back cpu_reset routines
@ 2015-11-30  5:28       ` Pratyush Anand
  0 siblings, 0 replies; 178+ messages in thread
From: Pratyush Anand @ 2015-11-30  5:28 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Mark Rutland, Geoff Levand, Catalin Marinas, Will Deacon,
	AKASHI Takahiro, christoffer.dall, kexec, linux-arm-kernel

On 27/11/2015:02:19:37 PM, Marc Zyngier wrote:
> On 24/11/15 22:25, Geoff Levand wrote:
> > +ENTRY(cpu_soft_restart)
> > +	mov	x18, x0				// cpu_reset
> > +	mov	x0, x1				// el2_switch
> > +	mov	x1, x2				// entry
> > +	mov	x2, x3				// arg0
> > +	mov	x3, x4				// arg1
> > +	ret	x18
> > +ENDPROC(cpu_soft_restart)
> 
> Grepping through the tree, I can only find a single use of
> cpu_soft_restart, with cpu_reset as its first parameter.
> 
> Why do we need this indirection? Having

It is needed because we need to execute cpu_reset() in physical address space.

> 
> void cpu_soft_restart(el2_switch, entry, arg0, arg1);
> 
> should be enough...

We can do with only cpu_soft_restart(), but then a function pointer to __pa() of
it need to be called. May  be current approach is more cleaner.

~Pratyush

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 05/16] arm64: Add back cpu_reset routines
  2015-11-30  5:28       ` Pratyush Anand
@ 2015-11-30 10:40         ` Marc Zyngier
  -1 siblings, 0 replies; 178+ messages in thread
From: Marc Zyngier @ 2015-11-30 10:40 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, 30 Nov 2015 10:58:43 +0530
Pratyush Anand <panand@redhat.com> wrote:

> On 27/11/2015:02:19:37 PM, Marc Zyngier wrote:
> > On 24/11/15 22:25, Geoff Levand wrote:
> > > +ENTRY(cpu_soft_restart)
> > > +	mov	x18, x0				// cpu_reset
> > > +	mov	x0, x1				// el2_switch
> > > +	mov	x1, x2				// entry
> > > +	mov	x2, x3				// arg0
> > > +	mov	x3, x4				// arg1
> > > +	ret	x18
> > > +ENDPROC(cpu_soft_restart)
> > 
> > Grepping through the tree, I can only find a single use of
> > cpu_soft_restart, with cpu_reset as its first parameter.
> > 
> > Why do we need this indirection? Having
> 
> It is needed because we need to execute cpu_reset() in physical address space.
> 
> > 
> > void cpu_soft_restart(el2_switch, entry, arg0, arg1);
> > 
> > should be enough...
> 
> We can do with only cpu_soft_restart(), but then a function pointer to __pa() of
> it need to be called. May  be current approach is more cleaner.

All that can be solved in C, and mostly at compile time. Using an
assembler trampoline is complicating things for no good reason:

diff --git a/arch/arm64/kernel/cpu-reset.S b/arch/arm64/kernel/cpu-reset.S
index 0423f27..099f6f2 100644
--- a/arch/arm64/kernel/cpu-reset.S
+++ b/arch/arm64/kernel/cpu-reset.S
@@ -28,27 +28,7 @@
 .align 5
 
 /*
- * cpu_soft_restart(cpu_reset, el2_switch, entry, arg0, arg1) - Perform a cpu
- * soft reset.
- *
- * @cpu_reset: Physical address of the cpu_reset routine.
- * @el2_switch: Flag to indicate a swich to EL2 is needed, passed to cpu_reset.
- * @entry: Location to jump to for soft reset, passed to cpu_reset.
- * arg0: First argument passed to @entry.
- * arg1: Second argument passed to @entry.
- */
-
-ENTRY(cpu_soft_restart)
-	mov	x18, x0				// cpu_reset
-	mov	x0, x1				// el2_switch
-	mov	x1, x2				// entry
-	mov	x2, x3				// arg0
-	mov	x3, x4				// arg1
-	ret	x18
-ENDPROC(cpu_soft_restart)
-
-/*
- * cpu_reset(el2_switch, entry, arg0, arg1) - Helper for cpu_soft_restart.
+ * __cpu_soft_restart(el2_switch, entry, arg0, arg1) - Helper for cpu_soft_restart.
  *
  * @el2_switch: Flag to indicate a swich to EL2 is needed.
  * @entry: Location to jump to for soft reset.
@@ -60,7 +40,7 @@ ENDPROC(cpu_soft_restart)
  * flat identity mapping.
  */
 
-ENTRY(cpu_reset)
+ENTRY(__cpu_soft_restart)
 	/* Clear sctlr_el1 flags. */
 	mrs	x12, sctlr_el1
 	ldr	x13, =SCTLR_ELx_FLAGS
@@ -79,6 +59,6 @@ ENTRY(cpu_reset)
 	mov	x0, x2				// arg0
 	mov	x1, x3				// arg1
 	ret	x18
-ENDPROC(cpu_reset)
+ENDPROC(__cpu_soft_restart)
 
 .popsection
diff --git a/arch/arm64/kernel/cpu-reset.h b/arch/arm64/kernel/cpu-reset.h
index 79816b6..df88892 100644
--- a/arch/arm64/kernel/cpu-reset.h
+++ b/arch/arm64/kernel/cpu-reset.h
@@ -13,10 +13,17 @@
 
 #include <asm/virt.h>
 
-void __attribute__((noreturn)) cpu_reset(unsigned long el2_switch,
-	unsigned long entry, unsigned long arg0, unsigned long arg1);
-void __attribute__((noreturn)) cpu_soft_restart(phys_addr_t cpu_reset,
-	unsigned long el2_switch, unsigned long entry, unsigned long arg0,
-	unsigned long arg1);
+void __cpu_soft_restart(unsigned long el2_switch, unsigned long entry,
+			unsigned long arg0, unsigned long arg1);
+
+static inline
+void __noreturn cpu_soft_restart(unsigned long el2_switch, unsigned long entry,
+				 unsigned long arg0, unsigned long arg1)
+{
+	typeof(__cpu_soft_restart) *restart;
+	restart = (void *)virt_to_phys(__cpu_soft_restart);
+	restart(el2_switch, entry, arg0, arg1);
+	unreachable();
+}
 
 #endif
diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
index 482aae7..ecdf822 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -210,8 +210,7 @@ void machine_kexec(struct kimage *kimage)
 	 * relocation is complete.
 	 */
 
-	cpu_soft_restart(virt_to_phys(cpu_reset),
-		in_crash_kexec ? 0 : is_hyp_mode_available(),
+	cpu_soft_restart(in_crash_kexec ? 0 : is_hyp_mode_available(),
 		reboot_code_buffer_phys, kimage->head, kimage_start);
 
 	BUG(); /* Should never get here. */


Simpler, more readable, no pointless register shifting, smaller idmap
footprint.

Thanks,

	M.
-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply related	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 05/16] arm64: Add back cpu_reset routines
@ 2015-11-30 10:40         ` Marc Zyngier
  0 siblings, 0 replies; 178+ messages in thread
From: Marc Zyngier @ 2015-11-30 10:40 UTC (permalink / raw)
  To: Pratyush Anand
  Cc: Mark Rutland, Geoff Levand, Catalin Marinas, Will Deacon,
	AKASHI Takahiro, christoffer.dall, kexec, linux-arm-kernel

On Mon, 30 Nov 2015 10:58:43 +0530
Pratyush Anand <panand@redhat.com> wrote:

> On 27/11/2015:02:19:37 PM, Marc Zyngier wrote:
> > On 24/11/15 22:25, Geoff Levand wrote:
> > > +ENTRY(cpu_soft_restart)
> > > +	mov	x18, x0				// cpu_reset
> > > +	mov	x0, x1				// el2_switch
> > > +	mov	x1, x2				// entry
> > > +	mov	x2, x3				// arg0
> > > +	mov	x3, x4				// arg1
> > > +	ret	x18
> > > +ENDPROC(cpu_soft_restart)
> > 
> > Grepping through the tree, I can only find a single use of
> > cpu_soft_restart, with cpu_reset as its first parameter.
> > 
> > Why do we need this indirection? Having
> 
> It is needed because we need to execute cpu_reset() in physical address space.
> 
> > 
> > void cpu_soft_restart(el2_switch, entry, arg0, arg1);
> > 
> > should be enough...
> 
> We can do with only cpu_soft_restart(), but then a function pointer to __pa() of
> it need to be called. May  be current approach is more cleaner.

All that can be solved in C, and mostly at compile time. Using an
assembler trampoline is complicating things for no good reason:

diff --git a/arch/arm64/kernel/cpu-reset.S b/arch/arm64/kernel/cpu-reset.S
index 0423f27..099f6f2 100644
--- a/arch/arm64/kernel/cpu-reset.S
+++ b/arch/arm64/kernel/cpu-reset.S
@@ -28,27 +28,7 @@
 .align 5
 
 /*
- * cpu_soft_restart(cpu_reset, el2_switch, entry, arg0, arg1) - Perform a cpu
- * soft reset.
- *
- * @cpu_reset: Physical address of the cpu_reset routine.
- * @el2_switch: Flag to indicate a swich to EL2 is needed, passed to cpu_reset.
- * @entry: Location to jump to for soft reset, passed to cpu_reset.
- * arg0: First argument passed to @entry.
- * arg1: Second argument passed to @entry.
- */
-
-ENTRY(cpu_soft_restart)
-	mov	x18, x0				// cpu_reset
-	mov	x0, x1				// el2_switch
-	mov	x1, x2				// entry
-	mov	x2, x3				// arg0
-	mov	x3, x4				// arg1
-	ret	x18
-ENDPROC(cpu_soft_restart)
-
-/*
- * cpu_reset(el2_switch, entry, arg0, arg1) - Helper for cpu_soft_restart.
+ * __cpu_soft_restart(el2_switch, entry, arg0, arg1) - Helper for cpu_soft_restart.
  *
  * @el2_switch: Flag to indicate a swich to EL2 is needed.
  * @entry: Location to jump to for soft reset.
@@ -60,7 +40,7 @@ ENDPROC(cpu_soft_restart)
  * flat identity mapping.
  */
 
-ENTRY(cpu_reset)
+ENTRY(__cpu_soft_restart)
 	/* Clear sctlr_el1 flags. */
 	mrs	x12, sctlr_el1
 	ldr	x13, =SCTLR_ELx_FLAGS
@@ -79,6 +59,6 @@ ENTRY(cpu_reset)
 	mov	x0, x2				// arg0
 	mov	x1, x3				// arg1
 	ret	x18
-ENDPROC(cpu_reset)
+ENDPROC(__cpu_soft_restart)
 
 .popsection
diff --git a/arch/arm64/kernel/cpu-reset.h b/arch/arm64/kernel/cpu-reset.h
index 79816b6..df88892 100644
--- a/arch/arm64/kernel/cpu-reset.h
+++ b/arch/arm64/kernel/cpu-reset.h
@@ -13,10 +13,17 @@
 
 #include <asm/virt.h>
 
-void __attribute__((noreturn)) cpu_reset(unsigned long el2_switch,
-	unsigned long entry, unsigned long arg0, unsigned long arg1);
-void __attribute__((noreturn)) cpu_soft_restart(phys_addr_t cpu_reset,
-	unsigned long el2_switch, unsigned long entry, unsigned long arg0,
-	unsigned long arg1);
+void __cpu_soft_restart(unsigned long el2_switch, unsigned long entry,
+			unsigned long arg0, unsigned long arg1);
+
+static inline
+void __noreturn cpu_soft_restart(unsigned long el2_switch, unsigned long entry,
+				 unsigned long arg0, unsigned long arg1)
+{
+	typeof(__cpu_soft_restart) *restart;
+	restart = (void *)virt_to_phys(__cpu_soft_restart);
+	restart(el2_switch, entry, arg0, arg1);
+	unreachable();
+}
 
 #endif
diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
index 482aae7..ecdf822 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -210,8 +210,7 @@ void machine_kexec(struct kimage *kimage)
 	 * relocation is complete.
 	 */
 
-	cpu_soft_restart(virt_to_phys(cpu_reset),
-		in_crash_kexec ? 0 : is_hyp_mode_available(),
+	cpu_soft_restart(in_crash_kexec ? 0 : is_hyp_mode_available(),
 		reboot_code_buffer_phys, kimage->head, kimage_start);
 
 	BUG(); /* Should never get here. */


Simpler, more readable, no pointless register shifting, smaller idmap
footprint.

Thanks,

	M.
-- 
Without deviation from the norm, progress is not possible.

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply related	[flat|nested] 178+ messages in thread

* [PATCH v12 08/16] arm64/kexec: Add core kexec support
  2015-11-27 13:13     ` Pratyush Anand
@ 2015-11-30 18:51       ` Geoff Levand
  -1 siblings, 0 replies; 178+ messages in thread
From: Geoff Levand @ 2015-11-30 18:51 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

On Fri, 2015-11-27 at 18:43 +0530, Pratyush Anand wrote:

> On 24/11/2015:10:25:34 PM, Geoff Levand wrote:
> > +> > 	> > /* Test the entry flags. */
> > +.Ltest_source:
> > +> > 	> > tbz> > 	> > x18, IND_SOURCE_BIT, .Ltest_indirection
> > +
> > +> > 	> > mov x20, x13> > 	> > 	> > 	> > 	> > /*  x20 = copy dest */
> > +> > 	> > mov x21, x12> > 	> > 	> > 	> > 	> > /*  x21 = copy src */
> 
> Till v10 we had here invalidation for relocated destination page to PoC. I could
> not understand, why it was removed. Removing that piece of code breaks kexec
> booting with mustang. I need [1] to kexec boot into second kernel with mustang
> platform.

We need to flush the new kernel to PoC.  The code that was here that
was doing that would only be executed when the new kernel needed
relocation (the standard kexec case).  We also need to flush kernels
that do not need relocation (the standard kdump case).

I moved the new kernel flush to kexec_segment_flush(), called
unconditionally in machine_kexec() so we can handle both cases
with one piece of code.

Have you experienced a problem on mustang with the current version?

-Geoff

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 08/16] arm64/kexec: Add core kexec support
@ 2015-11-30 18:51       ` Geoff Levand
  0 siblings, 0 replies; 178+ messages in thread
From: Geoff Levand @ 2015-11-30 18:51 UTC (permalink / raw)
  To: Pratyush Anand
  Cc: Mark Rutland, marc.zyngier, Catalin Marinas, Will Deacon,
	AKASHI Takahiro, linux-arm-kernel, Dave Young, kexec,
	christoffer.dall

Hi,

On Fri, 2015-11-27 at 18:43 +0530, Pratyush Anand wrote:

> On 24/11/2015:10:25:34 PM, Geoff Levand wrote:
> > +> > 	> > /* Test the entry flags. */
> > +.Ltest_source:
> > +> > 	> > tbz> > 	> > x18, IND_SOURCE_BIT, .Ltest_indirection
> > +
> > +> > 	> > mov x20, x13> > 	> > 	> > 	> > 	> > /*  x20 = copy dest */
> > +> > 	> > mov x21, x12> > 	> > 	> > 	> > 	> > /*  x21 = copy src */
> 
> Till v10 we had here invalidation for relocated destination page to PoC. I could
> not understand, why it was removed. Removing that piece of code breaks kexec
> booting with mustang. I need [1] to kexec boot into second kernel with mustang
> platform.

We need to flush the new kernel to PoC.  The code that was here that
was doing that would only be executed when the new kernel needed
relocation (the standard kexec case).  We also need to flush kernels
that do not need relocation (the standard kdump case).

I moved the new kernel flush to kexec_segment_flush(), called
unconditionally in machine_kexec() so we can handle both cases
with one piece of code.

Have you experienced a problem on mustang with the current version?

-Geoff


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 05/16] arm64: Add back cpu_reset routines
  2015-11-27 14:19     ` Marc Zyngier
@ 2015-11-30 20:03       ` Geoff Levand
  -1 siblings, 0 replies; 178+ messages in thread
From: Geoff Levand @ 2015-11-30 20:03 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Marc,

On Fri, 2015-11-27 at 14:19 +0000, Marc Zyngier wrote:
> On 24/11/15 22:25, Geoff Levand wrote:
> > +#define SCTLR_ELx_I> > 	> > (1 << 12)
> > +#define SCTLR_ELx_SA> > 	> > (1 << 3)
> > +#define SCTLR_ELx_C> > 	> > (1 << 2)
> > +#define SCTLR_ELx_A> > 	> > (1 << 1)
> > +#define SCTLR_ELx_M> > 	> > 1
> 
> kvm_arm.h already has some of these defined, and sysreg.h has another
> bunch. Can you please consolidate this into a single place (probably
> sysreg.h)?

OK, I'll look into it.

> > +#define SCTLR_ELx_FLAGS> > 	> > (SCTLR_ELx_M | SCTLR_ELx_A | SCTLR_ELx_C |> > 	> > \
> > +> > 	> > 	> > 	> >  SCTLR_ELx_SA | SCTLR_ELx_I)
> > +
> > +.text
> > +.pushsection    .idmap.text, "ax"
> > +
> > +.align 5
> 
> Why is this .align directive required?

In the original proc.S file, this was to align the cpu_reset
routine.  It has been there since the initial commit of the
proc.S.  I don't think this alignment is needed, but maybe
there is some reason it is there that I am not aware of?

> > +
> > +/*
> > + * cpu_soft_restart(cpu_reset, el2_switch, entry, arg0, arg1) - Perform a cpu
> > + * soft reset.
> > + *
> > + * @cpu_reset: Physical address of the cpu_reset routine.
> > + * @el2_switch: Flag to indicate a swich to EL2 is needed, passed to cpu_reset.
> > + * @entry: Location to jump to for soft reset, passed to cpu_reset.
> > + * arg0: First argument passed to @entry.
> > + * arg1: Second argument passed to @entry.
> > + */
> > +
> > +ENTRY(cpu_soft_restart)
> > +> > 	> > mov> > 	> > x18, x0> > 	> > 	> > 	> > 	> > // cpu_reset
> > +> > 	> > mov> > 	> > x0, x1> > 	> > 	> > 	> > 	> > // el2_switch
> > +> > 	> > mov> > 	> > x1, x2> > 	> > 	> > 	> > 	> > // entry
> > +> > 	> > mov> > 	> > x2, x3> > 	> > 	> > 	> > 	> > // arg0
> > +> > 	> > mov> > 	> > x3, x4> > 	> > 	> > 	> > 	> > // arg1
> > +> > 	> > ret> > 	> > x18
> > +ENDPROC(cpu_soft_restart)
> 
> Grepping through the tree, I can only find a single use of
> cpu_soft_restart, with cpu_reset as its first parameter.
> 
> Why do we need this indirection? Having
> 
> void cpu_soft_restart(el2_switch, entry, arg0, arg1);
> 
> should be enough...

> > +/*
> > + * cpu_reset(el2_switch, entry, arg0, arg1) - Helper for cpu_soft_restart.
> > + *
> > + * @el2_switch: Flag to indicate a swich to EL2 is needed.
> > + * @entry: Location to jump to for soft reset.
> > + * arg0: First argument passed to @entry.
> > + * arg1: Second argument passed to @entry.
> > + *
> > + * Put the CPU into the same state as it would be if it had been reset, and
> > + * branch to what would be the reset vector. It must be executed with the
> > + * flat identity mapping.
> > + */
> > +
> > +ENTRY(cpu_reset)
> > +> > 	> > /* Clear sctlr_el1 flags. */
> > +> > 	> > mrs> > 	> > x12, sctlr_el1
> > +> > 	> > ldr> > 	> > x13, =SCTLR_ELx_FLAGS
> > +> > 	> > bic> > 	> > x12, x12, x13
> > +> > 	> > msr> > 	> > sctlr_el1, x12
> > +> > 	> > isb
> > +
> > +> > 	> > cbz> > 	> > x0, 1f> > 	> > 	> > 	> > 	> > // el2_switch?
> > +
> > +> > 	> > mov> > 	> > x0, x1> > 	> > 	> > 	> > 	> > // entry
> > +> > 	> > mov> > 	> > x1, x2> > 	> > 	> > 	> > 	> > // arg0
> > +> > 	> > mov> > 	> > x2, x3> > 	> > 	> > 	> > 	> > // arg1
> > +> > 	> > hvc> > 	> > #HVC_CALL_FUNC> > 	> > 	> > 	> > // no return
> > +
> > +1:> > 	> > mov> > 	> > x18, x1> > 	> > 	> > 	> > 	> > // entry
> > +> > 	> > mov> > 	> > x0, x2> > 	> > 	> > 	> > 	> > // arg0
> > +> > 	> > mov> > 	> > x1, x3> > 	> > 	> > 	> > 	> > // arg1
> > +> > 	> > ret> > 	> > x18
> > +ENDPROC(cpu_reset)
> 
> And this can be an actual part of cpu_soft_reset.

Sure, I'll look at your proposal from your reply to Pratyush.

> > +
> > +.popsection
> > diff --git a/arch/arm64/kernel/cpu-reset.h b/arch/arm64/kernel/cpu-reset.h
> > new file mode 100644
> > index 0000000..79816b6
> > --- /dev/null
> > +++ b/arch/arm64/kernel/cpu-reset.h
> > @@ -0,0 +1,22 @@
> > +/*
> > + * cpu reset routines
> > + *
> > + * Copyright (C) 2015 Huawei Futurewei Technologies.
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU General Public License version 2 as
> > + * published by the Free Software Foundation.
> > + */
> > +
> > +#if !defined(_ARM64_CPU_RESET_H)
> 
> The usual idiom is "#ifndef __BLAH_H".

Sure, I can change that.  Grepping the arm64 sources shows one other
use of this style, in a kvm header...

> > +#define _ARM64_CPU_RESET_H
> > +
> > +#include 
> > +
> > +void __attribute__((noreturn)) cpu_reset(unsigned long el2_switch,
> > +> > 	> > unsigned long entry, unsigned long arg0, unsigned long arg1);
> > +void __attribute__((noreturn)) cpu_soft_restart(phys_addr_t cpu_reset,
> > +> > 	> > unsigned long el2_switch, unsigned long entry, unsigned long arg0,
> > +> > 	> > unsigned long arg1);
> > +
> > +#endif
> > 
> 
> There is a __noreturn attribute already defined.

I'll switch these over.

Thanks for the review.

-Geoff

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 05/16] arm64: Add back cpu_reset routines
@ 2015-11-30 20:03       ` Geoff Levand
  0 siblings, 0 replies; 178+ messages in thread
From: Geoff Levand @ 2015-11-30 20:03 UTC (permalink / raw)
  To: Marc Zyngier, Catalin Marinas, Will Deacon
  Cc: Mark Rutland, AKASHI Takahiro, kexec, linux-arm-kernel, christoffer.dall

Hi Marc,

On Fri, 2015-11-27 at 14:19 +0000, Marc Zyngier wrote:
> On 24/11/15 22:25, Geoff Levand wrote:
> > +#define SCTLR_ELx_I> > 	> > (1 << 12)
> > +#define SCTLR_ELx_SA> > 	> > (1 << 3)
> > +#define SCTLR_ELx_C> > 	> > (1 << 2)
> > +#define SCTLR_ELx_A> > 	> > (1 << 1)
> > +#define SCTLR_ELx_M> > 	> > 1
> 
> kvm_arm.h already has some of these defined, and sysreg.h has another
> bunch. Can you please consolidate this into a single place (probably
> sysreg.h)?

OK, I'll look into it.

> > +#define SCTLR_ELx_FLAGS> > 	> > (SCTLR_ELx_M | SCTLR_ELx_A | SCTLR_ELx_C |> > 	> > \
> > +> > 	> > 	> > 	> >  SCTLR_ELx_SA | SCTLR_ELx_I)
> > +
> > +.text
> > +.pushsection    .idmap.text, "ax"
> > +
> > +.align 5
> 
> Why is this .align directive required?

In the original proc.S file, this was to align the cpu_reset
routine.  It has been there since the initial commit of the
proc.S.  I don't think this alignment is needed, but maybe
there is some reason it is there that I am not aware of?

> > +
> > +/*
> > + * cpu_soft_restart(cpu_reset, el2_switch, entry, arg0, arg1) - Perform a cpu
> > + * soft reset.
> > + *
> > + * @cpu_reset: Physical address of the cpu_reset routine.
> > + * @el2_switch: Flag to indicate a swich to EL2 is needed, passed to cpu_reset.
> > + * @entry: Location to jump to for soft reset, passed to cpu_reset.
> > + * arg0: First argument passed to @entry.
> > + * arg1: Second argument passed to @entry.
> > + */
> > +
> > +ENTRY(cpu_soft_restart)
> > +> > 	> > mov> > 	> > x18, x0> > 	> > 	> > 	> > 	> > // cpu_reset
> > +> > 	> > mov> > 	> > x0, x1> > 	> > 	> > 	> > 	> > // el2_switch
> > +> > 	> > mov> > 	> > x1, x2> > 	> > 	> > 	> > 	> > // entry
> > +> > 	> > mov> > 	> > x2, x3> > 	> > 	> > 	> > 	> > // arg0
> > +> > 	> > mov> > 	> > x3, x4> > 	> > 	> > 	> > 	> > // arg1
> > +> > 	> > ret> > 	> > x18
> > +ENDPROC(cpu_soft_restart)
> 
> Grepping through the tree, I can only find a single use of
> cpu_soft_restart, with cpu_reset as its first parameter.
> 
> Why do we need this indirection? Having
> 
> void cpu_soft_restart(el2_switch, entry, arg0, arg1);
> 
> should be enough...

> > +/*
> > + * cpu_reset(el2_switch, entry, arg0, arg1) - Helper for cpu_soft_restart.
> > + *
> > + * @el2_switch: Flag to indicate a swich to EL2 is needed.
> > + * @entry: Location to jump to for soft reset.
> > + * arg0: First argument passed to @entry.
> > + * arg1: Second argument passed to @entry.
> > + *
> > + * Put the CPU into the same state as it would be if it had been reset, and
> > + * branch to what would be the reset vector. It must be executed with the
> > + * flat identity mapping.
> > + */
> > +
> > +ENTRY(cpu_reset)
> > +> > 	> > /* Clear sctlr_el1 flags. */
> > +> > 	> > mrs> > 	> > x12, sctlr_el1
> > +> > 	> > ldr> > 	> > x13, =SCTLR_ELx_FLAGS
> > +> > 	> > bic> > 	> > x12, x12, x13
> > +> > 	> > msr> > 	> > sctlr_el1, x12
> > +> > 	> > isb
> > +
> > +> > 	> > cbz> > 	> > x0, 1f> > 	> > 	> > 	> > 	> > // el2_switch?
> > +
> > +> > 	> > mov> > 	> > x0, x1> > 	> > 	> > 	> > 	> > // entry
> > +> > 	> > mov> > 	> > x1, x2> > 	> > 	> > 	> > 	> > // arg0
> > +> > 	> > mov> > 	> > x2, x3> > 	> > 	> > 	> > 	> > // arg1
> > +> > 	> > hvc> > 	> > #HVC_CALL_FUNC> > 	> > 	> > 	> > // no return
> > +
> > +1:> > 	> > mov> > 	> > x18, x1> > 	> > 	> > 	> > 	> > // entry
> > +> > 	> > mov> > 	> > x0, x2> > 	> > 	> > 	> > 	> > // arg0
> > +> > 	> > mov> > 	> > x1, x3> > 	> > 	> > 	> > 	> > // arg1
> > +> > 	> > ret> > 	> > x18
> > +ENDPROC(cpu_reset)
> 
> And this can be an actual part of cpu_soft_reset.

Sure, I'll look at your proposal from your reply to Pratyush.

> > +
> > +.popsection
> > diff --git a/arch/arm64/kernel/cpu-reset.h b/arch/arm64/kernel/cpu-reset.h
> > new file mode 100644
> > index 0000000..79816b6
> > --- /dev/null
> > +++ b/arch/arm64/kernel/cpu-reset.h
> > @@ -0,0 +1,22 @@
> > +/*
> > + * cpu reset routines
> > + *
> > + * Copyright (C) 2015 Huawei Futurewei Technologies.
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU General Public License version 2 as
> > + * published by the Free Software Foundation.
> > + */
> > +
> > +#if !defined(_ARM64_CPU_RESET_H)
> 
> The usual idiom is "#ifndef __BLAH_H".

Sure, I can change that.  Grepping the arm64 sources shows one other
use of this style, in a kvm header...

> > +#define _ARM64_CPU_RESET_H
> > +
> > +#include 
> > +
> > +void __attribute__((noreturn)) cpu_reset(unsigned long el2_switch,
> > +> > 	> > unsigned long entry, unsigned long arg0, unsigned long arg1);
> > +void __attribute__((noreturn)) cpu_soft_restart(phys_addr_t cpu_reset,
> > +> > 	> > unsigned long el2_switch, unsigned long entry, unsigned long arg0,
> > +> > 	> > unsigned long arg1);
> > +
> > +#endif
> > 
> 
> There is a __noreturn attribute already defined.

I'll switch these over.

Thanks for the review.

-Geoff


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 08/16] arm64/kexec: Add core kexec support
  2015-11-30 18:51       ` Geoff Levand
@ 2015-12-01  2:16         ` Pratyush Anand
  -1 siblings, 0 replies; 178+ messages in thread
From: Pratyush Anand @ 2015-12-01  2:16 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Geoff,

On 30/11/2015:10:51:15 AM, Geoff Levand wrote:
> Hi,
> 
> On Fri, 2015-11-27 at 18:43 +0530, Pratyush Anand wrote:
> 
> > On 24/11/2015:10:25:34 PM, Geoff Levand wrote:
> > > +> > 	> > /* Test the entry flags. */
> > > +.Ltest_source:
> > > +> > 	> > tbz> > 	> > x18, IND_SOURCE_BIT, .Ltest_indirection
> > > +
> > > +> > 	> > mov x20, x13> > 	> > 	> > 	> > 	> > /*  x20 = copy dest */
> > > +> > 	> > mov x21, x12> > 	> > 	> > 	> > 	> > /*  x21 = copy src */
> > 
> > Till v10 we had here invalidation for relocated destination page to PoC. I could
> > not understand, why it was removed. Removing that piece of code breaks kexec
> > booting with mustang. I need [1] to kexec boot into second kernel with mustang
> > platform.
> 
> We need to flush the new kernel to PoC.  The code that was here that
> was doing that would only be executed when the new kernel needed
> relocation (the standard kexec case).  We also need to flush kernels
> that do not need relocation (the standard kdump case).
> 
> I moved the new kernel flush to kexec_segment_flush(), called
> unconditionally in machine_kexec() so we can handle both cases
> with one piece of code.

Yes, I had noticed that. Actually flushing before cache is disabled can always
cause heisenbug.

> 
> Have you experienced a problem on mustang with the current version?

Yes, v10 works fine, but I need invalidation fix for both v11 and v12 to work on
mustang.
I have not tested vanilla v12 on seattle, but v12 ported on rhelsa needs
invalidation fix to work on seattle as well.

~Pratyush

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 08/16] arm64/kexec: Add core kexec support
@ 2015-12-01  2:16         ` Pratyush Anand
  0 siblings, 0 replies; 178+ messages in thread
From: Pratyush Anand @ 2015-12-01  2:16 UTC (permalink / raw)
  To: Geoff Levand
  Cc: Mark Rutland, marc.zyngier, Catalin Marinas, Will Deacon,
	AKASHI Takahiro, linux-arm-kernel, Dave Young, kexec,
	christoffer.dall

Hi Geoff,

On 30/11/2015:10:51:15 AM, Geoff Levand wrote:
> Hi,
> 
> On Fri, 2015-11-27 at 18:43 +0530, Pratyush Anand wrote:
> 
> > On 24/11/2015:10:25:34 PM, Geoff Levand wrote:
> > > +> > 	> > /* Test the entry flags. */
> > > +.Ltest_source:
> > > +> > 	> > tbz> > 	> > x18, IND_SOURCE_BIT, .Ltest_indirection
> > > +
> > > +> > 	> > mov x20, x13> > 	> > 	> > 	> > 	> > /*  x20 = copy dest */
> > > +> > 	> > mov x21, x12> > 	> > 	> > 	> > 	> > /*  x21 = copy src */
> > 
> > Till v10 we had here invalidation for relocated destination page to PoC. I could
> > not understand, why it was removed. Removing that piece of code breaks kexec
> > booting with mustang. I need [1] to kexec boot into second kernel with mustang
> > platform.
> 
> We need to flush the new kernel to PoC.  The code that was here that
> was doing that would only be executed when the new kernel needed
> relocation (the standard kexec case).  We also need to flush kernels
> that do not need relocation (the standard kdump case).
> 
> I moved the new kernel flush to kexec_segment_flush(), called
> unconditionally in machine_kexec() so we can handle both cases
> with one piece of code.

Yes, I had noticed that. Actually flushing before cache is disabled can always
cause heisenbug.

> 
> Have you experienced a problem on mustang with the current version?

Yes, v10 works fine, but I need invalidation fix for both v11 and v12 to work on
mustang.
I have not tested vanilla v12 on seattle, but v12 ported on rhelsa needs
invalidation fix to work on seattle as well.

~Pratyush

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 05/16] arm64: Add back cpu_reset routines
  2015-11-30 20:03       ` Geoff Levand
@ 2015-12-01  9:38         ` Marc Zyngier
  -1 siblings, 0 replies; 178+ messages in thread
From: Marc Zyngier @ 2015-12-01  9:38 UTC (permalink / raw)
  To: linux-arm-kernel

On 30/11/15 20:03, Geoff Levand wrote:

Hi Geoff,

>>> +.align 5
>>
>> Why is this .align directive required?
> 
> In the original proc.S file, this was to align the cpu_reset
> routine.  It has been there since the initial commit of the
> proc.S.  I don't think this alignment is needed, but maybe
> there is some reason it is there that I am not aware of?

None that I can think of, so just drop it - we'll find out pretty
quickly if there was an actual requirement.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 05/16] arm64: Add back cpu_reset routines
@ 2015-12-01  9:38         ` Marc Zyngier
  0 siblings, 0 replies; 178+ messages in thread
From: Marc Zyngier @ 2015-12-01  9:38 UTC (permalink / raw)
  To: Geoff Levand, Catalin Marinas, Will Deacon
  Cc: Mark Rutland, AKASHI Takahiro, kexec, linux-arm-kernel, christoffer.dall

On 30/11/15 20:03, Geoff Levand wrote:

Hi Geoff,

>>> +.align 5
>>
>> Why is this .align directive required?
> 
> In the original proc.S file, this was to align the cpu_reset
> routine.  It has been there since the initial commit of the
> proc.S.  I don't think this alignment is needed, but maybe
> there is some reason it is there that I am not aware of?

None that I can think of, so just drop it - we'll find out pretty
quickly if there was an actual requirement.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 08/16] arm64/kexec: Add core kexec support
  2015-12-01  2:16         ` Pratyush Anand
@ 2015-12-01 18:32           ` Azriel Samson
  -1 siblings, 0 replies; 178+ messages in thread
From: Azriel Samson @ 2015-12-01 18:32 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

I tested with v11 on our platform and I too needed the invalidation fix.

Thanks,
Azriel Samson
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,?
a Linux Foundation Collaborative Project

-----Original Message-----
From: kexec [mailto:kexec-bounces at lists.infradead.org] On Behalf Of Pratyush
Anand
Sent: Monday, November 30, 2015 7:16 PM
To: Geoff Levand
Cc: Mark Rutland; marc.zyngier at arm.com; Catalin Marinas; Will Deacon; AKASHI
Takahiro; linux-arm-kernel at lists.infradead.org; Dave Young;
kexec at lists.infradead.org; christoffer.dall at linaro.org
Subject: Re: [PATCH v12 08/16] arm64/kexec: Add core kexec support

Hi Geoff,

On 30/11/2015:10:51:15 AM, Geoff Levand wrote:
> Hi,
> 
> On Fri, 2015-11-27 at 18:43 +0530, Pratyush Anand wrote:
> 
> > On 24/11/2015:10:25:34 PM, Geoff Levand wrote:
> > > +> > 	> > /* Test the entry flags. */
> > > +.Ltest_source:
> > > +> > 	> > tbz> > 	> > x18, IND_SOURCE_BIT, .Ltest_indirection
> > > +
> > > +> > 	> > mov x20, x13> > 	> > 	> > 	> > 	> > /*  x20
= copy dest */
> > > +> > 	> > mov x21, x12> > 	> > 	> > 	> > 	> > /*  x21
= copy src */
> > 
> > Till v10 we had here invalidation for relocated destination page to 
> > PoC. I could not understand, why it was removed. Removing that piece 
> > of code breaks kexec booting with mustang. I need [1] to kexec boot 
> > into second kernel with mustang platform.
> 
> We need to flush the new kernel to PoC.  The code that was here that 
> was doing that would only be executed when the new kernel needed 
> relocation (the standard kexec case).  We also need to flush kernels 
> that do not need relocation (the standard kdump case).
> 
> I moved the new kernel flush to kexec_segment_flush(), called 
> unconditionally in machine_kexec() so we can handle both cases with 
> one piece of code.

Yes, I had noticed that. Actually flushing before cache is disabled can
always cause heisenbug.

> 
> Have you experienced a problem on mustang with the current version?

Yes, v10 works fine, but I need invalidation fix for both v11 and v12 to
work on mustang.
I have not tested vanilla v12 on seattle, but v12 ported on rhelsa needs
invalidation fix to work on seattle as well.

~Pratyush

_______________________________________________
kexec mailing list
kexec at lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* RE: [PATCH v12 08/16] arm64/kexec: Add core kexec support
@ 2015-12-01 18:32           ` Azriel Samson
  0 siblings, 0 replies; 178+ messages in thread
From: Azriel Samson @ 2015-12-01 18:32 UTC (permalink / raw)
  To: 'Pratyush Anand', 'Geoff Levand'
  Cc: 'Mark Rutland', marc.zyngier, 'Catalin Marinas',
	'Will Deacon', 'AKASHI Takahiro',
	christoffer.dall, 'Dave Young',
	kexec, linux-arm-kernel

Hi,

I tested with v11 on our platform and I too needed the invalidation fix.

Thanks,
Azriel Samson
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, 
a Linux Foundation Collaborative Project

-----Original Message-----
From: kexec [mailto:kexec-bounces@lists.infradead.org] On Behalf Of Pratyush
Anand
Sent: Monday, November 30, 2015 7:16 PM
To: Geoff Levand
Cc: Mark Rutland; marc.zyngier@arm.com; Catalin Marinas; Will Deacon; AKASHI
Takahiro; linux-arm-kernel@lists.infradead.org; Dave Young;
kexec@lists.infradead.org; christoffer.dall@linaro.org
Subject: Re: [PATCH v12 08/16] arm64/kexec: Add core kexec support

Hi Geoff,

On 30/11/2015:10:51:15 AM, Geoff Levand wrote:
> Hi,
> 
> On Fri, 2015-11-27 at 18:43 +0530, Pratyush Anand wrote:
> 
> > On 24/11/2015:10:25:34 PM, Geoff Levand wrote:
> > > +> > 	> > /* Test the entry flags. */
> > > +.Ltest_source:
> > > +> > 	> > tbz> > 	> > x18, IND_SOURCE_BIT, .Ltest_indirection
> > > +
> > > +> > 	> > mov x20, x13> > 	> > 	> > 	> > 	> > /*  x20
= copy dest */
> > > +> > 	> > mov x21, x12> > 	> > 	> > 	> > 	> > /*  x21
= copy src */
> > 
> > Till v10 we had here invalidation for relocated destination page to 
> > PoC. I could not understand, why it was removed. Removing that piece 
> > of code breaks kexec booting with mustang. I need [1] to kexec boot 
> > into second kernel with mustang platform.
> 
> We need to flush the new kernel to PoC.  The code that was here that 
> was doing that would only be executed when the new kernel needed 
> relocation (the standard kexec case).  We also need to flush kernels 
> that do not need relocation (the standard kdump case).
> 
> I moved the new kernel flush to kexec_segment_flush(), called 
> unconditionally in machine_kexec() so we can handle both cases with 
> one piece of code.

Yes, I had noticed that. Actually flushing before cache is disabled can
always cause heisenbug.

> 
> Have you experienced a problem on mustang with the current version?

Yes, v10 works fine, but I need invalidation fix for both v11 and v12 to
work on mustang.
I have not tested vanilla v12 on seattle, but v12 ported on rhelsa needs
invalidation fix to work on seattle as well.

~Pratyush

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 08/16] arm64/kexec: Add core kexec support
  2015-12-01  2:16         ` Pratyush Anand
@ 2015-12-01 19:03           ` Mark Rutland
  -1 siblings, 0 replies; 178+ messages in thread
From: Mark Rutland @ 2015-12-01 19:03 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Dec 01, 2015 at 07:46:15AM +0530, Pratyush Anand wrote:
> Hi Geoff,
> 
> On 30/11/2015:10:51:15 AM, Geoff Levand wrote:
> > Hi,
> > 
> > On Fri, 2015-11-27 at 18:43 +0530, Pratyush Anand wrote:
> > 
> > > On 24/11/2015:10:25:34 PM, Geoff Levand wrote:
> > > > +> > 	> > /* Test the entry flags. */
> > > > +.Ltest_source:
> > > > +> > 	> > tbz> > 	> > x18, IND_SOURCE_BIT, .Ltest_indirection
> > > > +
> > > > +> > 	> > mov x20, x13> > 	> > 	> > 	> > 	> > /*  x20 = copy dest */
> > > > +> > 	> > mov x21, x12> > 	> > 	> > 	> > 	> > /*  x21 = copy src */
> > > 
> > > Till v10 we had here invalidation for relocated destination page to PoC. I could
> > > not understand, why it was removed. Removing that piece of code breaks kexec
> > > booting with mustang. I need [1] to kexec boot into second kernel with mustang
> > > platform.
> > 
> > We need to flush the new kernel to PoC.  The code that was here that
> > was doing that would only be executed when the new kernel needed
> > relocation (the standard kexec case).  We also need to flush kernels
> > that do not need relocation (the standard kdump case).
> > 
> > I moved the new kernel flush to kexec_segment_flush(), called
> > unconditionally in machine_kexec() so we can handle both cases
> > with one piece of code.
> 
> Yes, I had noticed that. Actually flushing before cache is disabled can always
> cause heisenbug.

Please use "clean", "invalidate", or "clean+invalidate" rather than
"flush". The latter is ambiguous and misleading.

You can validly perform maintenance while the cache may allocate for a
region of memory, it's just that afterwards the cache may hold a clean
entries for that region.

You can clean/clean+invalidate to push data to the PoC, or you can
invalidate/clean+invalidate to ensure that no asynchronous writebacks
occur later (so long as you do not make a cacheable write to said
cacheline).

The only thing that you cannot guarantee is that there is not some clean
cacheline allocated for a region of memory to which cacheable accesses
may be performed.

Note that the kernel only requires its Image to be clean to the PoC. So
long as this is true, we know that there will not be asynchrnoous
writebacks, and can invalidate as necessary as part of the boot process.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 08/16] arm64/kexec: Add core kexec support
@ 2015-12-01 19:03           ` Mark Rutland
  0 siblings, 0 replies; 178+ messages in thread
From: Mark Rutland @ 2015-12-01 19:03 UTC (permalink / raw)
  To: Pratyush Anand
  Cc: Geoff Levand, Catalin Marinas, Will Deacon, AKASHI Takahiro,
	linux-arm-kernel, marc.zyngier, Dave Young, kexec,
	christoffer.dall

On Tue, Dec 01, 2015 at 07:46:15AM +0530, Pratyush Anand wrote:
> Hi Geoff,
> 
> On 30/11/2015:10:51:15 AM, Geoff Levand wrote:
> > Hi,
> > 
> > On Fri, 2015-11-27 at 18:43 +0530, Pratyush Anand wrote:
> > 
> > > On 24/11/2015:10:25:34 PM, Geoff Levand wrote:
> > > > +> > 	> > /* Test the entry flags. */
> > > > +.Ltest_source:
> > > > +> > 	> > tbz> > 	> > x18, IND_SOURCE_BIT, .Ltest_indirection
> > > > +
> > > > +> > 	> > mov x20, x13> > 	> > 	> > 	> > 	> > /*  x20 = copy dest */
> > > > +> > 	> > mov x21, x12> > 	> > 	> > 	> > 	> > /*  x21 = copy src */
> > > 
> > > Till v10 we had here invalidation for relocated destination page to PoC. I could
> > > not understand, why it was removed. Removing that piece of code breaks kexec
> > > booting with mustang. I need [1] to kexec boot into second kernel with mustang
> > > platform.
> > 
> > We need to flush the new kernel to PoC.  The code that was here that
> > was doing that would only be executed when the new kernel needed
> > relocation (the standard kexec case).  We also need to flush kernels
> > that do not need relocation (the standard kdump case).
> > 
> > I moved the new kernel flush to kexec_segment_flush(), called
> > unconditionally in machine_kexec() so we can handle both cases
> > with one piece of code.
> 
> Yes, I had noticed that. Actually flushing before cache is disabled can always
> cause heisenbug.

Please use "clean", "invalidate", or "clean+invalidate" rather than
"flush". The latter is ambiguous and misleading.

You can validly perform maintenance while the cache may allocate for a
region of memory, it's just that afterwards the cache may hold a clean
entries for that region.

You can clean/clean+invalidate to push data to the PoC, or you can
invalidate/clean+invalidate to ensure that no asynchronous writebacks
occur later (so long as you do not make a cacheable write to said
cacheline).

The only thing that you cannot guarantee is that there is not some clean
cacheline allocated for a region of memory to which cacheable accesses
may be performed.

Note that the kernel only requires its Image to be clean to the PoC. So
long as this is true, we know that there will not be asynchrnoous
writebacks, and can invalidate as necessary as part of the boot process.

Thanks,
Mark.

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 08/16] arm64/kexec: Add core kexec support
  2015-12-01 19:03           ` Mark Rutland
@ 2015-12-02 21:08             ` Geoff Levand
  -1 siblings, 0 replies; 178+ messages in thread
From: Geoff Levand @ 2015-12-02 21:08 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Mark,

On Tue, 2015-12-01 at 19:03 +0000, Mark Rutland wrote:
> You can validly perform maintenance while the cache may allocate for a
> region of memory, it's just that afterwards the cache may hold a clean
> entries for that region.
> 
> You can clean/clean+invalidate to push data to the PoC, or you can
> invalidate/clean+invalidate to ensure that no asynchronous writebacks
> occur later (so long as you do not make a cacheable write to said
> cacheline).
> 
> The only thing that you cannot guarantee is that there is not some clean
> cacheline allocated for a region of memory to which cacheable accesses
> may be performed.
> 
> Note that the kernel only requires its Image to be clean to the PoC. So
> long as this is true, we know that there will not be asynchrnoous
> writebacks, and can invalidate as necessary as part of the boot process.

In v10, which worked for Mustang and Qualcomm, we had:

  clean+invalidate to PoC all source pages
  disable d-cache
  loop {
    invalidate to PoC destination page
    copy page source->destination
  }
  enter new image

In v11 I changed this, and it did not work for those platforms:
 
  clean+invalidate to PoC all source pages
  clean+invalidate to PoC all destination pages
  disable d-cache
  loop {
    copy page source->destination
  }
  enter new image

Based on your comments above I would think both should work OK.

-Geoff
  

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 08/16] arm64/kexec: Add core kexec support
@ 2015-12-02 21:08             ` Geoff Levand
  0 siblings, 0 replies; 178+ messages in thread
From: Geoff Levand @ 2015-12-02 21:08 UTC (permalink / raw)
  To: Mark Rutland, Pratyush Anand
  Cc: marc.zyngier, Catalin Marinas, Will Deacon, AKASHI Takahiro,
	linux-arm-kernel, Dave Young, kexec, christoffer.dall

Hi Mark,

On Tue, 2015-12-01 at 19:03 +0000, Mark Rutland wrote:
> You can validly perform maintenance while the cache may allocate for a
> region of memory, it's just that afterwards the cache may hold a clean
> entries for that region.
> 
> You can clean/clean+invalidate to push data to the PoC, or you can
> invalidate/clean+invalidate to ensure that no asynchronous writebacks
> occur later (so long as you do not make a cacheable write to said
> cacheline).
> 
> The only thing that you cannot guarantee is that there is not some clean
> cacheline allocated for a region of memory to which cacheable accesses
> may be performed.
> 
> Note that the kernel only requires its Image to be clean to the PoC. So
> long as this is true, we know that there will not be asynchrnoous
> writebacks, and can invalidate as necessary as part of the boot process.

In v10, which worked for Mustang and Qualcomm, we had:

  clean+invalidate to PoC all source pages
  disable d-cache
  loop {
    invalidate to PoC destination page
    copy page source->destination
  }
  enter new image

In v11 I changed this, and it did not work for those platforms:
 
  clean+invalidate to PoC all source pages
  clean+invalidate to PoC all destination pages
  disable d-cache
  loop {
    copy page source->destination
  }
  enter new image

Based on your comments above I would think both should work OK.

-Geoff
  

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 04/16] arm64: kvm: allows kvm cpu hotplug
  2015-11-24 22:25   ` Geoff Levand
@ 2015-12-02 22:40     ` Ashwin Chaugule
  -1 siblings, 0 replies; 178+ messages in thread
From: Ashwin Chaugule @ 2015-12-02 22:40 UTC (permalink / raw)
  To: linux-arm-kernel

Hello,

On 24 November 2015 at 17:25, Geoff Levand <geoff@infradead.org> wrote:
> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
>
> The current kvm implementation on arm64 does cpu-specific initialization
> at system boot, and has no way to gracefully shutdown a core in terms of
> kvm. This prevents, especially, kexec from rebooting the system on a boot
> core in EL2.
>
> This patch adds a cpu tear-down function and also puts an existing cpu-init
> code into a separate function, kvm_arch_hardware_disable() and
> kvm_arch_hardware_enable() respectively.
> We don't need arm64-specific cpu hotplug hook any more.
>
> Since this patch modifies common part of code between arm and arm64, one
> stub definition, __cpu_reset_hyp_mode(), is added on arm side to avoid
> compiling errors.
>
> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
> ---
>  arch/arm/include/asm/kvm_host.h   | 10 ++++-
>  arch/arm/include/asm/kvm_mmu.h    |  1 +
>  arch/arm/kvm/arm.c                | 79 ++++++++++++++++++---------------------
>  arch/arm/kvm/mmu.c                |  5 +++
>  arch/arm64/include/asm/kvm_host.h | 16 +++++++-
>  arch/arm64/include/asm/kvm_mmu.h  |  1 +
>  arch/arm64/include/asm/virt.h     |  9 +++++
>  arch/arm64/kvm/hyp-init.S         | 33 ++++++++++++++++
>  arch/arm64/kvm/hyp.S              | 32 ++++++++++++++--
>  9 files changed, 138 insertions(+), 48 deletions(-)

[..]

>
>
>  static struct notifier_block hyp_init_cpu_pm_nb = {
> @@ -1108,11 +1119,6 @@ static int init_hyp_mode(void)
>         }
>
>         /*
> -        * Execute the init code on each CPU.
> -        */
> -       on_each_cpu(cpu_init_hyp_mode, NULL, 1);
> -
> -       /*
>          * Init HYP view of VGIC
>          */
>         err = kvm_vgic_hyp_init();

With this flow, the cpu_init_hyp_mode() is called only at VM guest
creation, but vgic_hyp_init() is called at bootup. On a system with
GICv3, it looks like we end up with bogus values from the ICH_VTR_EL2
(to get the number of LRs), because we're not reading it from EL2
anymore.

Whats the best way to fix this?
- Call kvm_arch_hardware_enable() before vgic_hyp_init() and disable later?
- Fold the VGIC init stuff back into hardware_enable()?
- Read the VGIC number of LRs from the hyp stub?
- ..

Regards,
Ashwin.

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 04/16] arm64: kvm: allows kvm cpu hotplug
@ 2015-12-02 22:40     ` Ashwin Chaugule
  0 siblings, 0 replies; 178+ messages in thread
From: Ashwin Chaugule @ 2015-12-02 22:40 UTC (permalink / raw)
  To: Geoff Levand
  Cc: Mark Rutland, vikrams, Marc Zyngier, Catalin Marinas,
	Will Deacon, AKASHI Takahiro, shankerd, linux-arm-kernel, kexec,
	Christoffer Dall

Hello,

On 24 November 2015 at 17:25, Geoff Levand <geoff@infradead.org> wrote:
> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
>
> The current kvm implementation on arm64 does cpu-specific initialization
> at system boot, and has no way to gracefully shutdown a core in terms of
> kvm. This prevents, especially, kexec from rebooting the system on a boot
> core in EL2.
>
> This patch adds a cpu tear-down function and also puts an existing cpu-init
> code into a separate function, kvm_arch_hardware_disable() and
> kvm_arch_hardware_enable() respectively.
> We don't need arm64-specific cpu hotplug hook any more.
>
> Since this patch modifies common part of code between arm and arm64, one
> stub definition, __cpu_reset_hyp_mode(), is added on arm side to avoid
> compiling errors.
>
> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
> ---
>  arch/arm/include/asm/kvm_host.h   | 10 ++++-
>  arch/arm/include/asm/kvm_mmu.h    |  1 +
>  arch/arm/kvm/arm.c                | 79 ++++++++++++++++++---------------------
>  arch/arm/kvm/mmu.c                |  5 +++
>  arch/arm64/include/asm/kvm_host.h | 16 +++++++-
>  arch/arm64/include/asm/kvm_mmu.h  |  1 +
>  arch/arm64/include/asm/virt.h     |  9 +++++
>  arch/arm64/kvm/hyp-init.S         | 33 ++++++++++++++++
>  arch/arm64/kvm/hyp.S              | 32 ++++++++++++++--
>  9 files changed, 138 insertions(+), 48 deletions(-)

[..]

>
>
>  static struct notifier_block hyp_init_cpu_pm_nb = {
> @@ -1108,11 +1119,6 @@ static int init_hyp_mode(void)
>         }
>
>         /*
> -        * Execute the init code on each CPU.
> -        */
> -       on_each_cpu(cpu_init_hyp_mode, NULL, 1);
> -
> -       /*
>          * Init HYP view of VGIC
>          */
>         err = kvm_vgic_hyp_init();

With this flow, the cpu_init_hyp_mode() is called only at VM guest
creation, but vgic_hyp_init() is called at bootup. On a system with
GICv3, it looks like we end up with bogus values from the ICH_VTR_EL2
(to get the number of LRs), because we're not reading it from EL2
anymore.

Whats the best way to fix this?
- Call kvm_arch_hardware_enable() before vgic_hyp_init() and disable later?
- Fold the VGIC init stuff back into hardware_enable()?
- Read the VGIC number of LRs from the hyp stub?
- ..

Regards,
Ashwin.

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 08/16] arm64/kexec: Add core kexec support
  2015-12-01 18:32           ` Azriel Samson
@ 2015-12-02 22:49             ` Geoff Levand
  -1 siblings, 0 replies; 178+ messages in thread
From: Geoff Levand @ 2015-12-02 22:49 UTC (permalink / raw)
  To: linux-arm-kernel

Azriel and Pratyush,

On Tue, 2015-12-01 at 11:32 -0700, Azriel Samson wrote:
> I tested with v11 on our platform and I too needed the invalidation
> fix.

I pushed out a kexec-v12.1 branch to my linux-kexec repo [1] with
fixes.  Could you test it with both kexec reboot and kdump?  For
kdump, use something like:

  CMDLINE += 'crashkernel=64M at 2240M'

then

  # kexec -d --load-panic /boot/vmlinux.strip --append="1 mem=64M maxcpus=1 reset_devices console=... root=..."
  # echo c > /proc/sysrq-trigger

We need to test both reboot and kdump since the the way the kexec is
done is different for the two.

[1]  https://git.kernel.org/cgit/linux/kernel/git/geoff/linux-kexec.git
-Geoff

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 08/16] arm64/kexec: Add core kexec support
@ 2015-12-02 22:49             ` Geoff Levand
  0 siblings, 0 replies; 178+ messages in thread
From: Geoff Levand @ 2015-12-02 22:49 UTC (permalink / raw)
  To: Azriel Samson, 'Pratyush Anand'
  Cc: 'Mark Rutland', marc.zyngier, 'Catalin Marinas',
	'Will Deacon', 'AKASHI Takahiro',
	christoffer.dall, 'Dave Young',
	kexec, linux-arm-kernel

Azriel and Pratyush,

On Tue, 2015-12-01 at 11:32 -0700, Azriel Samson wrote:
> I tested with v11 on our platform and I too needed the invalidation
> fix.

I pushed out a kexec-v12.1 branch to my linux-kexec repo [1] with
fixes.  Could you test it with both kexec reboot and kdump?  For
kdump, use something like:

  CMDLINE += 'crashkernel=64M@2240M'

then

  # kexec -d --load-panic /boot/vmlinux.strip --append="1 mem=64M maxcpus=1 reset_devices console=... root=..."
  # echo c > /proc/sysrq-trigger

We need to test both reboot and kdump since the the way the kexec is
done is different for the two.

[1]  https://git.kernel.org/cgit/linux/kernel/git/geoff/linux-kexec.git
-Geoff


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 05/16] arm64: Add back cpu_reset routines
  2015-11-30 10:40         ` Marc Zyngier
@ 2015-12-02 22:57           ` Geoff Levand
  -1 siblings, 0 replies; 178+ messages in thread
From: Geoff Levand @ 2015-12-02 22:57 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, 2015-11-30 at 10:40 +0000, Marc Zyngier wrote:

> All that can be solved in C, and mostly at compile time. Using an
> assembler trampoline is complicating things for no good reason:

I added this into my kexec-v12.1 branch.  After some more testing I'll
post a v13 to the list.

-Geoff

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 05/16] arm64: Add back cpu_reset routines
@ 2015-12-02 22:57           ` Geoff Levand
  0 siblings, 0 replies; 178+ messages in thread
From: Geoff Levand @ 2015-12-02 22:57 UTC (permalink / raw)
  To: Marc Zyngier, Pratyush Anand
  Cc: Mark Rutland, Catalin Marinas, Will Deacon, AKASHI Takahiro,
	christoffer.dall, kexec, linux-arm-kernel

On Mon, 2015-11-30 at 10:40 +0000, Marc Zyngier wrote:

> All that can be solved in C, and mostly at compile time. Using an
> assembler trampoline is complicating things for no good reason:

I added this into my kexec-v12.1 branch.  After some more testing I'll
post a v13 to the list.

-Geoff


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 12/16] arm64: kdump: implement machine_crash_shutdown()
  2015-11-24 22:25   ` Geoff Levand
@ 2015-12-03  4:15     ` Pratyush Anand
  -1 siblings, 0 replies; 178+ messages in thread
From: Pratyush Anand @ 2015-12-03  4:15 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Akashi,

Some of the points which came while discussing with Mark Salter are worth
including in v13.

On 24/11/2015:10:25:34 PM, Geoff Levand wrote:
> From: AKASHI Takahiro <takahiro.akashi@linaro.org>

[...]

> +/**
> + * machine_crash_shutdown - shutdown non-boot cpus and save registers

"non-panic" would be correct in stead of "non-boot". 

> +	/* shutdown non-boot cpus */

Ditto

> +	smp_send_stop();
> +
> +	/* for boot cpu */

"for panic cpu"


[...]

> @@ -697,6 +704,11 @@ static void ipi_cpu_stop(unsigned int cpu)
>  
>  	local_irq_disable();

We have "set_cpu_online(cpu, false);" just before it.
Panic core is waiting for non-panic to go offline, i.e. for the above event.

>  
> +#ifdef CONFIG_KEXEC
> +	if (in_crash_kexec)
> +		crash_save_cpu(regs, cpu);
> +#endif /* CONFIG_KEXEC */

However, we are still saving crash info for non-panic core. So, it would be good
to move crash_save_cpu() before set_cpu_online() to avoid any race condition.

~Pratyush

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 12/16] arm64: kdump: implement machine_crash_shutdown()
@ 2015-12-03  4:15     ` Pratyush Anand
  0 siblings, 0 replies; 178+ messages in thread
From: Pratyush Anand @ 2015-12-03  4:15 UTC (permalink / raw)
  To: AKASHI Takahiro
  Cc: Mark Rutland, Geoff Levand, Catalin Marinas, Will Deacon,
	marc.zyngier, linux-arm-kernel, msalter, kexec, christoffer.dall

Hi Akashi,

Some of the points which came while discussing with Mark Salter are worth
including in v13.

On 24/11/2015:10:25:34 PM, Geoff Levand wrote:
> From: AKASHI Takahiro <takahiro.akashi@linaro.org>

[...]

> +/**
> + * machine_crash_shutdown - shutdown non-boot cpus and save registers

"non-panic" would be correct in stead of "non-boot". 

> +	/* shutdown non-boot cpus */

Ditto

> +	smp_send_stop();
> +
> +	/* for boot cpu */

"for panic cpu"


[...]

> @@ -697,6 +704,11 @@ static void ipi_cpu_stop(unsigned int cpu)
>  
>  	local_irq_disable();

We have "set_cpu_online(cpu, false);" just before it.
Panic core is waiting for non-panic to go offline, i.e. for the above event.

>  
> +#ifdef CONFIG_KEXEC
> +	if (in_crash_kexec)
> +		crash_save_cpu(regs, cpu);
> +#endif /* CONFIG_KEXEC */

However, we are still saving crash info for non-panic core. So, it would be good
to move crash_save_cpu() before set_cpu_online() to avoid any race condition.

~Pratyush

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 08/16] arm64/kexec: Add core kexec support
  2015-12-02 22:49             ` Geoff Levand
@ 2015-12-03  4:37               ` Azriel Samson
  -1 siblings, 0 replies; 178+ messages in thread
From: Azriel Samson @ 2015-12-03  4:37 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Geoff,

> I pushed out a kexec-v12.1 branch to my linux-kexec repo [1] with fixes.
Could you test it with both kexec reboot and kdump?  For kdump, use
something like:
 > CMDLINE += 'crashkernel=64M at 2240M'

For kdump, I also require a version of the patch "arm64/kexec: Add support
for kexec-lite". Without this, the dump capture kernel cannot find the
device tree even on v11 that I tested.

Thanks,
Azriel Samson
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,?
a Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 178+ messages in thread

* RE: [PATCH v12 08/16] arm64/kexec: Add core kexec support
@ 2015-12-03  4:37               ` Azriel Samson
  0 siblings, 0 replies; 178+ messages in thread
From: Azriel Samson @ 2015-12-03  4:37 UTC (permalink / raw)
  To: 'Geoff Levand'
  Cc: 'Mark Rutland', 'Pratyush Anand',
	marc.zyngier, 'Catalin Marinas', 'Will Deacon',
	'AKASHI Takahiro', linux-arm-kernel, 'Dave Young',
	kexec, christoffer.dall

Hi Geoff,

> I pushed out a kexec-v12.1 branch to my linux-kexec repo [1] with fixes.
Could you test it with both kexec reboot and kdump?  For kdump, use
something like:
 > CMDLINE += 'crashkernel=64M@2240M'

For kdump, I also require a version of the patch "arm64/kexec: Add support
for kexec-lite". Without this, the dump capture kernel cannot find the
device tree even on v11 that I tested.

Thanks,
Azriel Samson
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, 
a Linux Foundation Collaborative Project


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 08/16] arm64/kexec: Add core kexec support
  2015-12-02 22:49             ` Geoff Levand
@ 2015-12-03  6:09               ` Pratyush Anand
  -1 siblings, 0 replies; 178+ messages in thread
From: Pratyush Anand @ 2015-12-03  6:09 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Geoff,

On 02/12/2015:02:49:22 PM, Geoff Levand wrote:
> Azriel and Pratyush,
> 
> On Tue, 2015-12-01 at 11:32 -0700, Azriel Samson wrote:
> > I tested with v11 on our platform and I too needed the invalidation
> > fix.
> 
> I pushed out a kexec-v12.1 branch to my linux-kexec repo [1] with
> fixes.  Could you test it with both kexec reboot and kdump?  For
> kdump, use something like:

I tested kexec-v12.1 with mustang and it works for both kexec and kdump.

There had been few updates with Akashi, which are not there in kexec-v12.1.
Please take his updates and merge them in v13.

~Pratyush

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 08/16] arm64/kexec: Add core kexec support
@ 2015-12-03  6:09               ` Pratyush Anand
  0 siblings, 0 replies; 178+ messages in thread
From: Pratyush Anand @ 2015-12-03  6:09 UTC (permalink / raw)
  To: Geoff Levand
  Cc: 'Mark Rutland', marc.zyngier, 'Catalin Marinas',
	Azriel Samson, 'Will Deacon', 'AKASHI Takahiro',
	christoffer.dall, 'Dave Young',
	kexec, linux-arm-kernel

Hi Geoff,

On 02/12/2015:02:49:22 PM, Geoff Levand wrote:
> Azriel and Pratyush,
> 
> On Tue, 2015-12-01 at 11:32 -0700, Azriel Samson wrote:
> > I tested with v11 on our platform and I too needed the invalidation
> > fix.
> 
> I pushed out a kexec-v12.1 branch to my linux-kexec repo [1] with
> fixes.  Could you test it with both kexec reboot and kdump?  For
> kdump, use something like:

I tested kexec-v12.1 with mustang and it works for both kexec and kdump.

There had been few updates with Akashi, which are not there in kexec-v12.1.
Please take his updates and merge them in v13.

~Pratyush

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 05/16] arm64: Add back cpu_reset routines
  2015-12-02 22:57           ` Geoff Levand
@ 2015-12-03  9:32             ` Will Deacon
  -1 siblings, 0 replies; 178+ messages in thread
From: Will Deacon @ 2015-12-03  9:32 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Dec 02, 2015 at 02:57:30PM -0800, Geoff Levand wrote:
> On Mon, 2015-11-30 at 10:40 +0000, Marc Zyngier wrote:
> 
> > All that can be solved in C, and mostly at compile time. Using an
> > assembler trampoline is complicating things for no good reason:
> 
> I added this into my kexec-v12.1 branch.  After some more testing I'll
> post a v13 to the list.

You may well need some notrace annotations if you want to run C code
here. Can you give it a spin with things like ftrace and kprobes?

Will

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 05/16] arm64: Add back cpu_reset routines
@ 2015-12-03  9:32             ` Will Deacon
  0 siblings, 0 replies; 178+ messages in thread
From: Will Deacon @ 2015-12-03  9:32 UTC (permalink / raw)
  To: Geoff Levand
  Cc: Pratyush Anand, Mark Rutland, Marc Zyngier, Catalin Marinas,
	kexec, AKASHI Takahiro, christoffer.dall, linux-arm-kernel

On Wed, Dec 02, 2015 at 02:57:30PM -0800, Geoff Levand wrote:
> On Mon, 2015-11-30 at 10:40 +0000, Marc Zyngier wrote:
> 
> > All that can be solved in C, and mostly at compile time. Using an
> > assembler trampoline is complicating things for no good reason:
> 
> I added this into my kexec-v12.1 branch.  After some more testing I'll
> post a v13 to the list.

You may well need some notrace annotations if you want to run C code
here. Can you give it a spin with things like ftrace and kprobes?

Will

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 04/16] arm64: kvm: allows kvm cpu hotplug
  2015-12-02 22:40     ` Ashwin Chaugule
@ 2015-12-03 13:55       ` Ashwin Chaugule
  -1 siblings, 0 replies; 178+ messages in thread
From: Ashwin Chaugule @ 2015-12-03 13:55 UTC (permalink / raw)
  To: linux-arm-kernel

On 2 December 2015 at 17:40, Ashwin Chaugule <ashwin.chaugule@linaro.org> wrote:
> Hello,
>
> On 24 November 2015 at 17:25, Geoff Levand <geoff@infradead.org> wrote:
>> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>
>> The current kvm implementation on arm64 does cpu-specific initialization
>> at system boot, and has no way to gracefully shutdown a core in terms of
>> kvm. This prevents, especially, kexec from rebooting the system on a boot
>> core in EL2.
>>
>> This patch adds a cpu tear-down function and also puts an existing cpu-init
>> code into a separate function, kvm_arch_hardware_disable() and
>> kvm_arch_hardware_enable() respectively.
>> We don't need arm64-specific cpu hotplug hook any more.
>>
>> Since this patch modifies common part of code between arm and arm64, one
>> stub definition, __cpu_reset_hyp_mode(), is added on arm side to avoid
>> compiling errors.
>>
>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>> ---
>>  arch/arm/include/asm/kvm_host.h   | 10 ++++-
>>  arch/arm/include/asm/kvm_mmu.h    |  1 +
>>  arch/arm/kvm/arm.c                | 79 ++++++++++++++++++---------------------
>>  arch/arm/kvm/mmu.c                |  5 +++
>>  arch/arm64/include/asm/kvm_host.h | 16 +++++++-
>>  arch/arm64/include/asm/kvm_mmu.h  |  1 +
>>  arch/arm64/include/asm/virt.h     |  9 +++++
>>  arch/arm64/kvm/hyp-init.S         | 33 ++++++++++++++++
>>  arch/arm64/kvm/hyp.S              | 32 ++++++++++++++--
>>  9 files changed, 138 insertions(+), 48 deletions(-)
>
> [..]
>
>>
>>
>>  static struct notifier_block hyp_init_cpu_pm_nb = {
>> @@ -1108,11 +1119,6 @@ static int init_hyp_mode(void)
>>         }
>>
>>         /*
>> -        * Execute the init code on each CPU.
>> -        */
>> -       on_each_cpu(cpu_init_hyp_mode, NULL, 1);
>> -
>> -       /*
>>          * Init HYP view of VGIC
>>          */
>>         err = kvm_vgic_hyp_init();
>
> With this flow, the cpu_init_hyp_mode() is called only at VM guest
> creation, but vgic_hyp_init() is called at bootup. On a system with
> GICv3, it looks like we end up with bogus values from the ICH_VTR_EL2
> (to get the number of LRs), because we're not reading it from EL2
> anymore.

..or more likely, that the function to read the ICH_VTR_EL2 is not
mapped in EL2, because we deferred the cpu_init_hyp_mode() calls. In
any case, this ends up breaking KVM and manifests as a guest console
which is stuck.

>
> Whats the best way to fix this?
> - Call kvm_arch_hardware_enable() before vgic_hyp_init() and disable later?
> - Fold the VGIC init stuff back into hardware_enable()?
> - Read the VGIC number of LRs from the hyp stub?
> - ..
>
> Regards,
> Ashwin.

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 04/16] arm64: kvm: allows kvm cpu hotplug
@ 2015-12-03 13:55       ` Ashwin Chaugule
  0 siblings, 0 replies; 178+ messages in thread
From: Ashwin Chaugule @ 2015-12-03 13:55 UTC (permalink / raw)
  To: Geoff Levand
  Cc: Mark Rutland, vikrams, Marc Zyngier, Catalin Marinas,
	Will Deacon, AKASHI Takahiro, shankerd, linux-arm-kernel, kexec,
	Christoffer Dall

On 2 December 2015 at 17:40, Ashwin Chaugule <ashwin.chaugule@linaro.org> wrote:
> Hello,
>
> On 24 November 2015 at 17:25, Geoff Levand <geoff@infradead.org> wrote:
>> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>
>> The current kvm implementation on arm64 does cpu-specific initialization
>> at system boot, and has no way to gracefully shutdown a core in terms of
>> kvm. This prevents, especially, kexec from rebooting the system on a boot
>> core in EL2.
>>
>> This patch adds a cpu tear-down function and also puts an existing cpu-init
>> code into a separate function, kvm_arch_hardware_disable() and
>> kvm_arch_hardware_enable() respectively.
>> We don't need arm64-specific cpu hotplug hook any more.
>>
>> Since this patch modifies common part of code between arm and arm64, one
>> stub definition, __cpu_reset_hyp_mode(), is added on arm side to avoid
>> compiling errors.
>>
>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>> ---
>>  arch/arm/include/asm/kvm_host.h   | 10 ++++-
>>  arch/arm/include/asm/kvm_mmu.h    |  1 +
>>  arch/arm/kvm/arm.c                | 79 ++++++++++++++++++---------------------
>>  arch/arm/kvm/mmu.c                |  5 +++
>>  arch/arm64/include/asm/kvm_host.h | 16 +++++++-
>>  arch/arm64/include/asm/kvm_mmu.h  |  1 +
>>  arch/arm64/include/asm/virt.h     |  9 +++++
>>  arch/arm64/kvm/hyp-init.S         | 33 ++++++++++++++++
>>  arch/arm64/kvm/hyp.S              | 32 ++++++++++++++--
>>  9 files changed, 138 insertions(+), 48 deletions(-)
>
> [..]
>
>>
>>
>>  static struct notifier_block hyp_init_cpu_pm_nb = {
>> @@ -1108,11 +1119,6 @@ static int init_hyp_mode(void)
>>         }
>>
>>         /*
>> -        * Execute the init code on each CPU.
>> -        */
>> -       on_each_cpu(cpu_init_hyp_mode, NULL, 1);
>> -
>> -       /*
>>          * Init HYP view of VGIC
>>          */
>>         err = kvm_vgic_hyp_init();
>
> With this flow, the cpu_init_hyp_mode() is called only at VM guest
> creation, but vgic_hyp_init() is called at bootup. On a system with
> GICv3, it looks like we end up with bogus values from the ICH_VTR_EL2
> (to get the number of LRs), because we're not reading it from EL2
> anymore.

..or more likely, that the function to read the ICH_VTR_EL2 is not
mapped in EL2, because we deferred the cpu_init_hyp_mode() calls. In
any case, this ends up breaking KVM and manifests as a guest console
which is stuck.

>
> Whats the best way to fix this?
> - Call kvm_arch_hardware_enable() before vgic_hyp_init() and disable later?
> - Fold the VGIC init stuff back into hardware_enable()?
> - Read the VGIC number of LRs from the hyp stub?
> - ..
>
> Regards,
> Ashwin.

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 04/16] arm64: kvm: allows kvm cpu hotplug
  2015-12-02 22:40     ` Ashwin Chaugule
@ 2015-12-03 13:58       ` Marc Zyngier
  -1 siblings, 0 replies; 178+ messages in thread
From: Marc Zyngier @ 2015-12-03 13:58 UTC (permalink / raw)
  To: linux-arm-kernel

On 02/12/15 22:40, Ashwin Chaugule wrote:
> Hello,
> 
> On 24 November 2015 at 17:25, Geoff Levand <geoff@infradead.org> wrote:
>> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>
>> The current kvm implementation on arm64 does cpu-specific initialization
>> at system boot, and has no way to gracefully shutdown a core in terms of
>> kvm. This prevents, especially, kexec from rebooting the system on a boot
>> core in EL2.
>>
>> This patch adds a cpu tear-down function and also puts an existing cpu-init
>> code into a separate function, kvm_arch_hardware_disable() and
>> kvm_arch_hardware_enable() respectively.
>> We don't need arm64-specific cpu hotplug hook any more.
>>
>> Since this patch modifies common part of code between arm and arm64, one
>> stub definition, __cpu_reset_hyp_mode(), is added on arm side to avoid
>> compiling errors.
>>
>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>> ---
>>  arch/arm/include/asm/kvm_host.h   | 10 ++++-
>>  arch/arm/include/asm/kvm_mmu.h    |  1 +
>>  arch/arm/kvm/arm.c                | 79 ++++++++++++++++++---------------------
>>  arch/arm/kvm/mmu.c                |  5 +++
>>  arch/arm64/include/asm/kvm_host.h | 16 +++++++-
>>  arch/arm64/include/asm/kvm_mmu.h  |  1 +
>>  arch/arm64/include/asm/virt.h     |  9 +++++
>>  arch/arm64/kvm/hyp-init.S         | 33 ++++++++++++++++
>>  arch/arm64/kvm/hyp.S              | 32 ++++++++++++++--
>>  9 files changed, 138 insertions(+), 48 deletions(-)
> 
> [..]
> 
>>
>>
>>  static struct notifier_block hyp_init_cpu_pm_nb = {
>> @@ -1108,11 +1119,6 @@ static int init_hyp_mode(void)
>>         }
>>
>>         /*
>> -        * Execute the init code on each CPU.
>> -        */
>> -       on_each_cpu(cpu_init_hyp_mode, NULL, 1);
>> -
>> -       /*
>>          * Init HYP view of VGIC
>>          */
>>         err = kvm_vgic_hyp_init();
> 
> With this flow, the cpu_init_hyp_mode() is called only at VM guest
> creation, but vgic_hyp_init() is called at bootup. On a system with
> GICv3, it looks like we end up with bogus values from the ICH_VTR_EL2
> (to get the number of LRs), because we're not reading it from EL2
> anymore.

Indeed, this is completely broken (I just reproduced the issue on a
model). I wish this kind of details had been checked earlier, but thanks
for pointing it out.

> Whats the best way to fix this?
> - Call kvm_arch_hardware_enable() before vgic_hyp_init() and disable later?
> - Fold the VGIC init stuff back into hardware_enable()?

None of that works - kvm_arch_hardware_enable() is called once per CPU,
while vgic_hyp_init() can only be called once. Also,
kvm_arch_hardware_enable() is called from interrupt context, and I
wouldn't feel comfortable starting probing DT and allocating stuff from
there.

> - Read the VGIC number of LRs from the hyp stub?

That's may UNDEF if called in the wrong context. Also, this defeats the
point of stubs, which is just to install the vectors for the hypervisor.

> - ..

Yeah, quite.

Geoff, Takahiro?

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 04/16] arm64: kvm: allows kvm cpu hotplug
@ 2015-12-03 13:58       ` Marc Zyngier
  0 siblings, 0 replies; 178+ messages in thread
From: Marc Zyngier @ 2015-12-03 13:58 UTC (permalink / raw)
  To: Ashwin Chaugule, Geoff Levand
  Cc: Mark Rutland, vikrams, Catalin Marinas, Will Deacon,
	AKASHI Takahiro, shankerd, linux-arm-kernel, kexec,
	Christoffer Dall

On 02/12/15 22:40, Ashwin Chaugule wrote:
> Hello,
> 
> On 24 November 2015 at 17:25, Geoff Levand <geoff@infradead.org> wrote:
>> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>
>> The current kvm implementation on arm64 does cpu-specific initialization
>> at system boot, and has no way to gracefully shutdown a core in terms of
>> kvm. This prevents, especially, kexec from rebooting the system on a boot
>> core in EL2.
>>
>> This patch adds a cpu tear-down function and also puts an existing cpu-init
>> code into a separate function, kvm_arch_hardware_disable() and
>> kvm_arch_hardware_enable() respectively.
>> We don't need arm64-specific cpu hotplug hook any more.
>>
>> Since this patch modifies common part of code between arm and arm64, one
>> stub definition, __cpu_reset_hyp_mode(), is added on arm side to avoid
>> compiling errors.
>>
>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>> ---
>>  arch/arm/include/asm/kvm_host.h   | 10 ++++-
>>  arch/arm/include/asm/kvm_mmu.h    |  1 +
>>  arch/arm/kvm/arm.c                | 79 ++++++++++++++++++---------------------
>>  arch/arm/kvm/mmu.c                |  5 +++
>>  arch/arm64/include/asm/kvm_host.h | 16 +++++++-
>>  arch/arm64/include/asm/kvm_mmu.h  |  1 +
>>  arch/arm64/include/asm/virt.h     |  9 +++++
>>  arch/arm64/kvm/hyp-init.S         | 33 ++++++++++++++++
>>  arch/arm64/kvm/hyp.S              | 32 ++++++++++++++--
>>  9 files changed, 138 insertions(+), 48 deletions(-)
> 
> [..]
> 
>>
>>
>>  static struct notifier_block hyp_init_cpu_pm_nb = {
>> @@ -1108,11 +1119,6 @@ static int init_hyp_mode(void)
>>         }
>>
>>         /*
>> -        * Execute the init code on each CPU.
>> -        */
>> -       on_each_cpu(cpu_init_hyp_mode, NULL, 1);
>> -
>> -       /*
>>          * Init HYP view of VGIC
>>          */
>>         err = kvm_vgic_hyp_init();
> 
> With this flow, the cpu_init_hyp_mode() is called only at VM guest
> creation, but vgic_hyp_init() is called at bootup. On a system with
> GICv3, it looks like we end up with bogus values from the ICH_VTR_EL2
> (to get the number of LRs), because we're not reading it from EL2
> anymore.

Indeed, this is completely broken (I just reproduced the issue on a
model). I wish this kind of details had been checked earlier, but thanks
for pointing it out.

> Whats the best way to fix this?
> - Call kvm_arch_hardware_enable() before vgic_hyp_init() and disable later?
> - Fold the VGIC init stuff back into hardware_enable()?

None of that works - kvm_arch_hardware_enable() is called once per CPU,
while vgic_hyp_init() can only be called once. Also,
kvm_arch_hardware_enable() is called from interrupt context, and I
wouldn't feel comfortable starting probing DT and allocating stuff from
there.

> - Read the VGIC number of LRs from the hyp stub?

That's may UNDEF if called in the wrong context. Also, this defeats the
point of stubs, which is just to install the vectors for the hypervisor.

> - ..

Yeah, quite.

Geoff, Takahiro?

	M.
-- 
Jazz is not dead. It just smells funny...

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 08/16] arm64/kexec: Add core kexec support
  2015-12-02 21:08             ` Geoff Levand
@ 2015-12-03 16:06               ` Mark Rutland
  -1 siblings, 0 replies; 178+ messages in thread
From: Mark Rutland @ 2015-12-03 16:06 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Dec 02, 2015 at 01:08:33PM -0800, Geoff Levand wrote:
> Hi Mark,
> 
> On Tue, 2015-12-01 at 19:03 +0000, Mark Rutland wrote:
> > You can validly perform maintenance while the cache may allocate for a
> > region of memory, it's just that afterwards the cache may hold a clean
> > entries for that region.
> > 
> > You can clean/clean+invalidate to push data to the PoC, or you can
> > invalidate/clean+invalidate to ensure that no asynchronous writebacks
> > occur later (so long as you do not make a cacheable write to said
> > cacheline).
> > 
> > The only thing that you cannot guarantee is that there is not some clean
> > cacheline allocated for a region of memory to which cacheable accesses
> > may be performed.
> > 
> > Note that the kernel only requires its Image to be clean to the PoC. So
> > long as this is true, we know that there will not be asynchrnoous
> > writebacks, and can invalidate as necessary as part of the boot process.

I've realised that my wording here is somewhat confusing.

When I say "clean to the PoC", I mean that for a given PA the cache can
either:
* Not hold an entry (i.e. be invalid).
* Hold a clean entry, where the data is identical to that at the PoC.

> In v10, which worked for Mustang and Qualcomm, we had:
> 
>   clean+invalidate to PoC all source pages
>   disable d-cache
>   loop {
>     invalidate to PoC destination page
>     copy page source->destination
>   }
>   enter new image
> 
> In v11 I changed this, and it did not work for those platforms:
>  
>   clean+invalidate to PoC all source pages
>   clean+invalidate to PoC all destination pages
>   disable d-cache
>   loop {
>     copy page source->destination
>   }
>   enter new image
> 
> Based on your comments above I would think both should work OK.

No.

In the latter case, clean lines can be allocated before cacheable data
accesses are inhibited. So stale clean lines can shadow the data copied
via non-cacheable accesses.

While the cache is clean, it isn't clean to the PoC.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 08/16] arm64/kexec: Add core kexec support
@ 2015-12-03 16:06               ` Mark Rutland
  0 siblings, 0 replies; 178+ messages in thread
From: Mark Rutland @ 2015-12-03 16:06 UTC (permalink / raw)
  To: Geoff Levand
  Cc: Pratyush Anand, marc.zyngier, Catalin Marinas, Will Deacon,
	AKASHI Takahiro, linux-arm-kernel, Dave Young, kexec,
	christoffer.dall

On Wed, Dec 02, 2015 at 01:08:33PM -0800, Geoff Levand wrote:
> Hi Mark,
> 
> On Tue, 2015-12-01 at 19:03 +0000, Mark Rutland wrote:
> > You can validly perform maintenance while the cache may allocate for a
> > region of memory, it's just that afterwards the cache may hold a clean
> > entries for that region.
> > 
> > You can clean/clean+invalidate to push data to the PoC, or you can
> > invalidate/clean+invalidate to ensure that no asynchronous writebacks
> > occur later (so long as you do not make a cacheable write to said
> > cacheline).
> > 
> > The only thing that you cannot guarantee is that there is not some clean
> > cacheline allocated for a region of memory to which cacheable accesses
> > may be performed.
> > 
> > Note that the kernel only requires its Image to be clean to the PoC. So
> > long as this is true, we know that there will not be asynchrnoous
> > writebacks, and can invalidate as necessary as part of the boot process.

I've realised that my wording here is somewhat confusing.

When I say "clean to the PoC", I mean that for a given PA the cache can
either:
* Not hold an entry (i.e. be invalid).
* Hold a clean entry, where the data is identical to that at the PoC.

> In v10, which worked for Mustang and Qualcomm, we had:
> 
>   clean+invalidate to PoC all source pages
>   disable d-cache
>   loop {
>     invalidate to PoC destination page
>     copy page source->destination
>   }
>   enter new image
> 
> In v11 I changed this, and it did not work for those platforms:
>  
>   clean+invalidate to PoC all source pages
>   clean+invalidate to PoC all destination pages
>   disable d-cache
>   loop {
>     copy page source->destination
>   }
>   enter new image
> 
> Based on your comments above I would think both should work OK.

No.

In the latter case, clean lines can be allocated before cacheable data
accesses are inhibited. So stale clean lines can shadow the data copied
via non-cacheable accesses.

While the cache is clean, it isn't clean to the PoC.

Thanks,
Mark.

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 08/16] arm64/kexec: Add core kexec support
  2015-12-03  4:37               ` Azriel Samson
@ 2015-12-03 19:56                 ` Geoff Levand
  -1 siblings, 0 replies; 178+ messages in thread
From: Geoff Levand @ 2015-12-03 19:56 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

On Wed, 2015-12-02 at 21:37 -0700, Azriel Samson wrote:
> I pushed out a kexec-v12.1 branch to my linux-kexec repo [1] with fixes.
> Could you test it with both kexec reboot and kdump?  For kdump, use
> something like:
>  > CMDLINE += 'crashkernel=64M at 2240M'
> 
> For kdump, I also require a version of the patch "arm64/kexec: Add support
> for kexec-lite". Without this, the dump capture kernel cannot find the
> device tree even on v11 that I tested.

Without that patch, does the 2nd stage kernel get the correct
dtb address in register x0?

Is the problem that the dtb data is corrupt when in the 2nd
stage kernel?

Is the dtb data OK when in the 1st stage kernel? 

-Geoff

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 08/16] arm64/kexec: Add core kexec support
@ 2015-12-03 19:56                 ` Geoff Levand
  0 siblings, 0 replies; 178+ messages in thread
From: Geoff Levand @ 2015-12-03 19:56 UTC (permalink / raw)
  To: Azriel Samson
  Cc: 'Mark Rutland', 'Pratyush Anand',
	marc.zyngier, 'Catalin Marinas', 'Will Deacon',
	'AKASHI Takahiro', linux-arm-kernel, 'Dave Young',
	kexec, christoffer.dall

Hi,

On Wed, 2015-12-02 at 21:37 -0700, Azriel Samson wrote:
> I pushed out a kexec-v12.1 branch to my linux-kexec repo [1] with fixes.
> Could you test it with both kexec reboot and kdump?  For kdump, use
> something like:
>  > CMDLINE += 'crashkernel=64M@2240M'
> 
> For kdump, I also require a version of the patch "arm64/kexec: Add support
> for kexec-lite". Without this, the dump capture kernel cannot find the
> device tree even on v11 that I tested.

Without that patch, does the 2nd stage kernel get the correct
dtb address in register x0?

Is the problem that the dtb data is corrupt when in the 2nd
stage kernel?

Is the dtb data OK when in the 1st stage kernel? 

-Geoff

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 08/16] arm64/kexec: Add core kexec support
  2015-12-03 19:56                 ` Geoff Levand
@ 2015-12-04  0:39                   ` Azriel Samson
  -1 siblings, 0 replies; 178+ messages in thread
From: Azriel Samson @ 2015-12-04  0:39 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

On 12/3/2015 12:56 PM, Geoff Levand wrote:
> Hi,
>
> On Wed, 2015-12-02 at 21:37 -0700, Azriel Samson wrote:
>> For kdump, I also require a version of the patch "arm64/kexec: Add support
>> for kexec-lite". Without this, the dump capture kernel cannot find the
>> device tree even on v11 that I tested.

This was a setup issue on my side. kdump with your v11 patches worked 
fine without the above patch when I used your version of kexec-tools.

In kexec-tools, I had to skip check_cpu_nodes() since we are using ACPI 
and the device tree does not contain cpu_info.

>
> Without that patch, does the 2nd stage kernel get the correct
> dtb address in register x0?
>
> Is the problem that the dtb data is corrupt when in the 2nd
> stage kernel?
>
> Is the dtb data OK when in the 1st stage kernel?
>
> -Geoff
>

Thanks,
Azriel Samson
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 08/16] arm64/kexec: Add core kexec support
@ 2015-12-04  0:39                   ` Azriel Samson
  0 siblings, 0 replies; 178+ messages in thread
From: Azriel Samson @ 2015-12-04  0:39 UTC (permalink / raw)
  To: Geoff Levand
  Cc: 'Mark Rutland', 'Pratyush Anand',
	marc.zyngier, 'Catalin Marinas', 'Will Deacon',
	'AKASHI Takahiro', linux-arm-kernel, 'Dave Young',
	kexec, christoffer.dall

Hi,

On 12/3/2015 12:56 PM, Geoff Levand wrote:
> Hi,
>
> On Wed, 2015-12-02 at 21:37 -0700, Azriel Samson wrote:
>> For kdump, I also require a version of the patch "arm64/kexec: Add support
>> for kexec-lite". Without this, the dump capture kernel cannot find the
>> device tree even on v11 that I tested.

This was a setup issue on my side. kdump with your v11 patches worked 
fine without the above patch when I used your version of kexec-tools.

In kexec-tools, I had to skip check_cpu_nodes() since we are using ACPI 
and the device tree does not contain cpu_info.

>
> Without that patch, does the 2nd stage kernel get the correct
> dtb address in register x0?
>
> Is the problem that the dtb data is corrupt when in the 2nd
> stage kernel?
>
> Is the dtb data OK when in the 1st stage kernel?
>
> -Geoff
>

Thanks,
Azriel Samson
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 08/16] arm64/kexec: Add core kexec support
  2015-12-04  0:39                   ` Azriel Samson
@ 2015-12-04  3:54                     ` Pratyush Anand
  -1 siblings, 0 replies; 178+ messages in thread
From: Pratyush Anand @ 2015-12-04  3:54 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Geoff,

On 03/12/2015:05:39:12 PM, Azriel Samson wrote:
> This was a setup issue on my side. kdump with your v11 patches worked fine
> without the above patch when I used your version of kexec-tools.
> 
> In kexec-tools, I had to skip check_cpu_nodes() since we are using ACPI and
> the device tree does not contain cpu_info.

I think, something like following should be squashed into patch "arm64: Add arm64
kexec support".
https://github.com/pratyushanand/kexec-tools/commit/85f403cabb9ae0d7a37d1656a00fb609a9525086

~Pratyush

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 08/16] arm64/kexec: Add core kexec support
@ 2015-12-04  3:54                     ` Pratyush Anand
  0 siblings, 0 replies; 178+ messages in thread
From: Pratyush Anand @ 2015-12-04  3:54 UTC (permalink / raw)
  To: Azriel Samson, Geoff Levand
  Cc: 'Mark Rutland', marc.zyngier, 'Catalin Marinas',
	'Will Deacon', 'AKASHI Takahiro',
	linux-arm-kernel, 'Dave Young',
	kexec, christoffer.dall

Hi Geoff,

On 03/12/2015:05:39:12 PM, Azriel Samson wrote:
> This was a setup issue on my side. kdump with your v11 patches worked fine
> without the above patch when I used your version of kexec-tools.
> 
> In kexec-tools, I had to skip check_cpu_nodes() since we are using ACPI and
> the device tree does not contain cpu_info.

I think, something like following should be squashed into patch "arm64: Add arm64
kexec support".
https://github.com/pratyushanand/kexec-tools/commit/85f403cabb9ae0d7a37d1656a00fb609a9525086

~Pratyush

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 08/16] arm64/kexec: Add core kexec support
  2015-12-04  3:54                     ` Pratyush Anand
@ 2015-12-07 18:47                       ` Geoff Levand
  -1 siblings, 0 replies; 178+ messages in thread
From: Geoff Levand @ 2015-12-07 18:47 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, 2015-12-04 at 09:24 +0530, Pratyush Anand wrote:
> Hi Geoff,
> 
> On 03/12/2015:05:39:12 PM, Azriel Samson wrote:
> > This was a setup issue on my side. kdump with your v11 patches worked fine
> > without the above patch when I used your version of kexec-tools.
> > 
> > In kexec-tools, I had to skip check_cpu_nodes() since we are using ACPI and
> > the device tree does not contain cpu_info.
> 
> I think, something like following should be squashed into patch "arm64: Add arm64
> kexec support".
> https://github.com/pratyushanand/kexec-tools/commit/85f403cabb9ae0d7a37d1656a00fb609a9525086

I think it better to test for ACPI and then run
any meaningful checks than to just ignore the
check_cpu_nodes failure.

-Geoff

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 08/16] arm64/kexec: Add core kexec support
@ 2015-12-07 18:47                       ` Geoff Levand
  0 siblings, 0 replies; 178+ messages in thread
From: Geoff Levand @ 2015-12-07 18:47 UTC (permalink / raw)
  To: Pratyush Anand, Azriel Samson
  Cc: 'Mark Rutland', marc.zyngier, 'Catalin Marinas',
	'Will Deacon', 'AKASHI Takahiro',
	linux-arm-kernel, 'Dave Young',
	kexec, christoffer.dall

On Fri, 2015-12-04 at 09:24 +0530, Pratyush Anand wrote:
> Hi Geoff,
> 
> On 03/12/2015:05:39:12 PM, Azriel Samson wrote:
> > This was a setup issue on my side. kdump with your v11 patches worked fine
> > without the above patch when I used your version of kexec-tools.
> > 
> > In kexec-tools, I had to skip check_cpu_nodes() since we are using ACPI and
> > the device tree does not contain cpu_info.
> 
> I think, something like following should be squashed into patch "arm64: Add arm64
> kexec support".
> https://github.com/pratyushanand/kexec-tools/commit/85f403cabb9ae0d7a37d1656a00fb609a9525086

I think it better to test for ACPI and then run
any meaningful checks than to just ignore the
check_cpu_nodes failure.

-Geoff

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 05/16] arm64: Add back cpu_reset routines
  2015-12-03  9:32             ` Will Deacon
@ 2015-12-10  0:49               ` Geoff Levand
  -1 siblings, 0 replies; 178+ messages in thread
From: Geoff Levand @ 2015-12-10  0:49 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Will,

On Thu, 2015-12-03 at 09:32 +0000, Will Deacon wrote:
> On Wed, Dec 02, 2015 at 02:57:30PM -0800, Geoff Levand wrote:
> > On Mon, 2015-11-30 at 10:40 +0000, Marc Zyngier wrote:
> > 
> > > All that can be solved in C, and mostly at compile time. Using an
> > > assembler trampoline is complicating things for no good reason:
> > 
> > I added this into my kexec-v12.1 branch.  After some more testing I'll
> > post a v13 to the list.
> 
> You may well need some notrace annotations if you want to run C code
> here. Can you give it a spin with things like ftrace and kprobes?

I tested ftrace with and without the notrace annotation on
cpu_soft_restart() and could do kexec re-boots either way.

When cpu_soft_restart() is called we've only done setup_mm_for_reboot(),
so I would think C code should still be OK.

-Geoff

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 05/16] arm64: Add back cpu_reset routines
@ 2015-12-10  0:49               ` Geoff Levand
  0 siblings, 0 replies; 178+ messages in thread
From: Geoff Levand @ 2015-12-10  0:49 UTC (permalink / raw)
  To: Will Deacon
  Cc: Pratyush Anand, Mark Rutland, Marc Zyngier, Catalin Marinas,
	kexec, AKASHI Takahiro, christoffer.dall, linux-arm-kernel

Hi Will,

On Thu, 2015-12-03 at 09:32 +0000, Will Deacon wrote:
> On Wed, Dec 02, 2015 at 02:57:30PM -0800, Geoff Levand wrote:
> > On Mon, 2015-11-30 at 10:40 +0000, Marc Zyngier wrote:
> > 
> > > All that can be solved in C, and mostly at compile time. Using an
> > > assembler trampoline is complicating things for no good reason:
> > 
> > I added this into my kexec-v12.1 branch.  After some more testing I'll
> > post a v13 to the list.
> 
> You may well need some notrace annotations if you want to run C code
> here. Can you give it a spin with things like ftrace and kprobes?

I tested ftrace with and without the notrace annotation on
cpu_soft_restart() and could do kexec re-boots either way.

When cpu_soft_restart() is called we've only done setup_mm_for_reboot(),
so I would think C code should still be OK.

-Geoff


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 05/16] arm64: Add back cpu_reset routines
  2015-12-10  0:49               ` Geoff Levand
@ 2015-12-10 10:17                 ` Will Deacon
  -1 siblings, 0 replies; 178+ messages in thread
From: Will Deacon @ 2015-12-10 10:17 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Dec 09, 2015 at 04:49:22PM -0800, Geoff Levand wrote:
> On Thu, 2015-12-03 at 09:32 +0000, Will Deacon wrote:
> > On Wed, Dec 02, 2015 at 02:57:30PM -0800, Geoff Levand wrote:
> > > On Mon, 2015-11-30 at 10:40 +0000, Marc Zyngier wrote:
> > > 
> > > > All that can be solved in C, and mostly at compile time. Using an
> > > > assembler trampoline is complicating things for no good reason:
> > > 
> > > I added this into my kexec-v12.1 branch.  After some more testing I'll
> > > post a v13 to the list.
> > 
> > You may well need some notrace annotations if you want to run C code
> > here. Can you give it a spin with things like ftrace and kprobes?
> 
> I tested ftrace with and without the notrace annotation on
> cpu_soft_restart() and could do kexec re-boots either way.
> 
> When cpu_soft_restart() is called we've only done setup_mm_for_reboot(),
> so I would think C code should still be OK.

Yeah, you're right. I mistakenly thought that cpu_soft_restart was running
at the idmap address as opposed to the kernel virtual address.

Will

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 05/16] arm64: Add back cpu_reset routines
@ 2015-12-10 10:17                 ` Will Deacon
  0 siblings, 0 replies; 178+ messages in thread
From: Will Deacon @ 2015-12-10 10:17 UTC (permalink / raw)
  To: Geoff Levand
  Cc: Pratyush Anand, Mark Rutland, Marc Zyngier, Catalin Marinas,
	kexec, AKASHI Takahiro, christoffer.dall, linux-arm-kernel

On Wed, Dec 09, 2015 at 04:49:22PM -0800, Geoff Levand wrote:
> On Thu, 2015-12-03 at 09:32 +0000, Will Deacon wrote:
> > On Wed, Dec 02, 2015 at 02:57:30PM -0800, Geoff Levand wrote:
> > > On Mon, 2015-11-30 at 10:40 +0000, Marc Zyngier wrote:
> > > 
> > > > All that can be solved in C, and mostly at compile time. Using an
> > > > assembler trampoline is complicating things for no good reason:
> > > 
> > > I added this into my kexec-v12.1 branch.  After some more testing I'll
> > > post a v13 to the list.
> > 
> > You may well need some notrace annotations if you want to run C code
> > here. Can you give it a spin with things like ftrace and kprobes?
> 
> I tested ftrace with and without the notrace annotation on
> cpu_soft_restart() and could do kexec re-boots either way.
> 
> When cpu_soft_restart() is called we've only done setup_mm_for_reboot(),
> so I would think C code should still be OK.

Yeah, you're right. I mistakenly thought that cpu_soft_restart was running
at the idmap address as opposed to the kernel virtual address.

Will

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 12/16] arm64: kdump: implement machine_crash_shutdown()
  2015-11-27 14:39     ` Marc Zyngier
@ 2015-12-10 11:34       ` AKASHI Takahiro
  -1 siblings, 0 replies; 178+ messages in thread
From: AKASHI Takahiro @ 2015-12-10 11:34 UTC (permalink / raw)
  To: linux-arm-kernel

Marc,

I was back from my vacation.

On 11/27/2015 11:39 PM, Marc Zyngier wrote:
> On 24/11/15 22:25, Geoff Levand wrote:
>> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>
>> kdump calls machine_crash_shutdown() to shut down non-boot cpus and
>> save registers' status in per-cpu ELF notes before starting the crash
>> dump kernel. See kernel_kexec().
>>
>> ipi_cpu_stop() is a bit modified and used to support this behavior.
>>
>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>> ---
>>   arch/arm64/include/asm/kexec.h    | 34 +++++++++++++++++++++++++++++++++-
>>   arch/arm64/kernel/machine_kexec.c | 31 +++++++++++++++++++++++++++++--
>>   arch/arm64/kernel/smp.c           | 16 ++++++++++++++--
>>   3 files changed, 76 insertions(+), 5 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
>> index 46d63cd..555a955 100644
>> --- a/arch/arm64/include/asm/kexec.h
>> +++ b/arch/arm64/include/asm/kexec.h
>> @@ -30,6 +30,8 @@
>>
>>   #if !defined(__ASSEMBLY__)
>>
>> +extern bool in_crash_kexec;
>> +
>>   /**
>>    * crash_setup_regs() - save registers for the panic kernel
>>    *
>> @@ -40,7 +42,37 @@
>>   static inline void crash_setup_regs(struct pt_regs *newregs,
>>   				    struct pt_regs *oldregs)
>>   {
>> -	/* Empty routine needed to avoid build errors. */
>> +	if (oldregs) {
>> +		memcpy(newregs, oldregs, sizeof(*newregs));
>> +	} else {
>> +		__asm__ __volatile__ (
>> +			"stp	 x0,   x1, [%3, #16 *  0]\n"
>> +			"stp	 x2,   x3, [%3, #16 *  1]\n"
>> +			"stp	 x4,   x5, [%3, #16 *  2]\n"
>> +			"stp	 x6,   x7, [%3, #16 *  3]\n"
>> +			"stp	 x8,   x9, [%3, #16 *  4]\n"
>> +			"stp	x10,  x11, [%3, #16 *  5]\n"
>> +			"stp	x12,  x13, [%3, #16 *  6]\n"
>> +			"stp	x14,  x15, [%3, #16 *  7]\n"
>> +			"stp	x16,  x17, [%3, #16 *  8]\n"
>> +			"stp	x18,  x19, [%3, #16 *  9]\n"
>> +			"stp	x20,  x21, [%3, #16 * 10]\n"
>> +			"stp	x22,  x23, [%3, #16 * 11]\n"
>> +			"stp	x24,  x25, [%3, #16 * 12]\n"
>> +			"stp	x26,  x27, [%3, #16 * 13]\n"
>> +			"stp	x28,  x29, [%3, #16 * 14]\n"
>> +			"str	x30,	   [%3, #16 * 15]\n"
>> +			"mov	%0, sp\n"
>> +			"adr	%1, 1f\n"
>> +			"mrs	%2, spsr_el1\n"
>> +		"1:"
>> +			: "=r" (newregs->sp),
>> +			  "=r" (newregs->pc),
>> +			  "=r" (newregs->pstate)
>> +			: "r"  (&newregs->regs)
>> +			: "memory"
>> +		);
>
> I wonder how useful this thing is, given that it starts by corrupting
> whatever register is holding newregs->regs. Maybe this is not supposed
> to be accurate anyway...

I'm not quite sure about what part of my code you're mentioning here, but
crash_setup_regs() is solely called by crash_kexec(), and panic() is
the only caller of crash_kexec() with NULL argument which, in turn, is
used as 'oldregs' in crash_setup_regs().

Given this fact, I think that the values saved in newregs as indicated above
will be the best estimate of current cpu contexts.

The other caller of crash_kexec() is die() in traps.c, but here we call
it with explicit cpu contexts at exception.

>
>> +	}
>>   }
>>
>>   #endif /* !defined(__ASSEMBLY__) */
>> diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
>> index da28a26..d2d7e90 100644
>> --- a/arch/arm64/kernel/machine_kexec.c
>> +++ b/arch/arm64/kernel/machine_kexec.c
>> @@ -9,6 +9,7 @@
>>    * published by the Free Software Foundation.
>>    */
>>
>> +#include <linux/kernel.h>
>>   #include <linux/kexec.h>
>>   #include <linux/of_fdt.h>
>>   #include <linux/slab.h>
>> @@ -23,6 +24,7 @@
>>   extern const unsigned char arm64_relocate_new_kernel[];
>>   extern const unsigned long arm64_relocate_new_kernel_size;
>>
>> +bool in_crash_kexec;
>>   static unsigned long kimage_start;
>>
>>   /**
>> @@ -203,13 +205,38 @@ void machine_kexec(struct kimage *kimage)
>>   	 */
>>
>>   	cpu_soft_restart(virt_to_phys(cpu_reset),
>> -		is_hyp_mode_available(),
>> +		in_crash_kexec ? 0 : is_hyp_mode_available(),
>>   		reboot_code_buffer_phys, kimage->head, kimage_start);
>>
>>   	BUG(); /* Should never get here. */
>>   }
>>
>> +/**
>> + * machine_crash_shutdown - shutdown non-boot cpus and save registers
>> + */
>>   void machine_crash_shutdown(struct pt_regs *regs)
>>   {
>> -	/* Empty routine needed to avoid build errors. */
>> +	struct pt_regs dummy_regs;
>> +	int cpu;
>> +
>> +	local_irq_disable();
>> +
>> +	in_crash_kexec = true;
>> +
>> +	/*
>> +	 * clear and initialize the per-cpu info. This is necessary
>> +	 * because, otherwise, slots for offline cpus would never be
>> +	 * filled up. See smp_send_stop().
>> +	 */
>> +	memset(&dummy_regs, 0, sizeof(dummy_regs));
>> +	for_each_possible_cpu(cpu)
>> +		crash_save_cpu(&dummy_regs, cpu);
>> +
>> +	/* shutdown non-boot cpus */
>> +	smp_send_stop();
>> +
>> +	/* for boot cpu */
>> +	crash_save_cpu(regs, smp_processor_id());
>> +
>> +	pr_info("Starting crashdump kernel...\n");
>>   }
>> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
>> index b1adc51..15aabef 100644
>> --- a/arch/arm64/kernel/smp.c
>> +++ b/arch/arm64/kernel/smp.c
>> @@ -37,6 +37,7 @@
>>   #include <linux/completion.h>
>>   #include <linux/of.h>
>>   #include <linux/irq_work.h>
>> +#include <linux/kexec.h>
>>
>>   #include <asm/alternative.h>
>>   #include <asm/atomic.h>
>> @@ -54,6 +55,8 @@
>>   #include <asm/ptrace.h>
>>   #include <asm/virt.h>
>>
>> +#include "cpu-reset.h"
>> +
>>   #define CREATE_TRACE_POINTS
>>   #include <trace/events/ipi.h>
>>
>> @@ -683,8 +686,12 @@ static DEFINE_RAW_SPINLOCK(stop_lock);
>>   /*
>>    * ipi_cpu_stop - handle IPI from smp_send_stop()
>>    */
>> -static void ipi_cpu_stop(unsigned int cpu)
>> +static void ipi_cpu_stop(unsigned int cpu, struct pt_regs *regs)
>>   {
>> +#ifdef CONFIG_KEXEC
>> +	/* printing messages may slow down the shutdown. */
>> +	if (!in_crash_kexec)
>> +#endif
>>   	if (system_state == SYSTEM_BOOTING ||
>>   	    system_state == SYSTEM_RUNNING) {
>>   		raw_spin_lock(&stop_lock);
>
> Irrespective of how useful this change is, how about having a predicate
> instead? Something like:
>
> static inline bool is_in_crash_kexec(void)
> {
> #ifdef CONFIG_KEXEC
> 	return in_crash_kexec;
> #else
> 	return false;
> #endif
> }

OK, I will take your idea.

> located in machine_kexec.c (making the in_crash_kernel static), and then

but cannot make in_crash_kernel static because it is also used in both smp.c
and machine_kexec.c.

> 	if (!is_in_crash_kexec() && (systen_state == ... || ...) {
>
> It would certainly look better.

Thanks,
-Takahiro AKASHI

>> @@ -697,6 +704,11 @@ static void ipi_cpu_stop(unsigned int cpu)
>>
>>   	local_irq_disable();
>>
>> +#ifdef CONFIG_KEXEC
>> +	if (in_crash_kexec)
>> +		crash_save_cpu(regs, cpu);
>> +#endif /* CONFIG_KEXEC */
>> +
>
> Same here.
>
>>   	while (1)
>>   		cpu_relax();
>>   }
>> @@ -727,7 +739,7 @@ void handle_IPI(int ipinr, struct pt_regs *regs)
>>
>>   	case IPI_CPU_STOP:
>>   		irq_enter();
>> -		ipi_cpu_stop(cpu);
>> +		ipi_cpu_stop(cpu, regs);
>>   		irq_exit();
>>   		break;
>>
>>
>
> Thanks,
>
> 	M.
>

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 12/16] arm64: kdump: implement machine_crash_shutdown()
@ 2015-12-10 11:34       ` AKASHI Takahiro
  0 siblings, 0 replies; 178+ messages in thread
From: AKASHI Takahiro @ 2015-12-10 11:34 UTC (permalink / raw)
  To: Marc Zyngier, Geoff Levand, Catalin Marinas, Will Deacon
  Cc: Mark Rutland, kexec, linux-arm-kernel, christoffer.dall

Marc,

I was back from my vacation.

On 11/27/2015 11:39 PM, Marc Zyngier wrote:
> On 24/11/15 22:25, Geoff Levand wrote:
>> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>
>> kdump calls machine_crash_shutdown() to shut down non-boot cpus and
>> save registers' status in per-cpu ELF notes before starting the crash
>> dump kernel. See kernel_kexec().
>>
>> ipi_cpu_stop() is a bit modified and used to support this behavior.
>>
>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>> ---
>>   arch/arm64/include/asm/kexec.h    | 34 +++++++++++++++++++++++++++++++++-
>>   arch/arm64/kernel/machine_kexec.c | 31 +++++++++++++++++++++++++++++--
>>   arch/arm64/kernel/smp.c           | 16 ++++++++++++++--
>>   3 files changed, 76 insertions(+), 5 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
>> index 46d63cd..555a955 100644
>> --- a/arch/arm64/include/asm/kexec.h
>> +++ b/arch/arm64/include/asm/kexec.h
>> @@ -30,6 +30,8 @@
>>
>>   #if !defined(__ASSEMBLY__)
>>
>> +extern bool in_crash_kexec;
>> +
>>   /**
>>    * crash_setup_regs() - save registers for the panic kernel
>>    *
>> @@ -40,7 +42,37 @@
>>   static inline void crash_setup_regs(struct pt_regs *newregs,
>>   				    struct pt_regs *oldregs)
>>   {
>> -	/* Empty routine needed to avoid build errors. */
>> +	if (oldregs) {
>> +		memcpy(newregs, oldregs, sizeof(*newregs));
>> +	} else {
>> +		__asm__ __volatile__ (
>> +			"stp	 x0,   x1, [%3, #16 *  0]\n"
>> +			"stp	 x2,   x3, [%3, #16 *  1]\n"
>> +			"stp	 x4,   x5, [%3, #16 *  2]\n"
>> +			"stp	 x6,   x7, [%3, #16 *  3]\n"
>> +			"stp	 x8,   x9, [%3, #16 *  4]\n"
>> +			"stp	x10,  x11, [%3, #16 *  5]\n"
>> +			"stp	x12,  x13, [%3, #16 *  6]\n"
>> +			"stp	x14,  x15, [%3, #16 *  7]\n"
>> +			"stp	x16,  x17, [%3, #16 *  8]\n"
>> +			"stp	x18,  x19, [%3, #16 *  9]\n"
>> +			"stp	x20,  x21, [%3, #16 * 10]\n"
>> +			"stp	x22,  x23, [%3, #16 * 11]\n"
>> +			"stp	x24,  x25, [%3, #16 * 12]\n"
>> +			"stp	x26,  x27, [%3, #16 * 13]\n"
>> +			"stp	x28,  x29, [%3, #16 * 14]\n"
>> +			"str	x30,	   [%3, #16 * 15]\n"
>> +			"mov	%0, sp\n"
>> +			"adr	%1, 1f\n"
>> +			"mrs	%2, spsr_el1\n"
>> +		"1:"
>> +			: "=r" (newregs->sp),
>> +			  "=r" (newregs->pc),
>> +			  "=r" (newregs->pstate)
>> +			: "r"  (&newregs->regs)
>> +			: "memory"
>> +		);
>
> I wonder how useful this thing is, given that it starts by corrupting
> whatever register is holding newregs->regs. Maybe this is not supposed
> to be accurate anyway...

I'm not quite sure about what part of my code you're mentioning here, but
crash_setup_regs() is solely called by crash_kexec(), and panic() is
the only caller of crash_kexec() with NULL argument which, in turn, is
used as 'oldregs' in crash_setup_regs().

Given this fact, I think that the values saved in newregs as indicated above
will be the best estimate of current cpu contexts.

The other caller of crash_kexec() is die() in traps.c, but here we call
it with explicit cpu contexts at exception.

>
>> +	}
>>   }
>>
>>   #endif /* !defined(__ASSEMBLY__) */
>> diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
>> index da28a26..d2d7e90 100644
>> --- a/arch/arm64/kernel/machine_kexec.c
>> +++ b/arch/arm64/kernel/machine_kexec.c
>> @@ -9,6 +9,7 @@
>>    * published by the Free Software Foundation.
>>    */
>>
>> +#include <linux/kernel.h>
>>   #include <linux/kexec.h>
>>   #include <linux/of_fdt.h>
>>   #include <linux/slab.h>
>> @@ -23,6 +24,7 @@
>>   extern const unsigned char arm64_relocate_new_kernel[];
>>   extern const unsigned long arm64_relocate_new_kernel_size;
>>
>> +bool in_crash_kexec;
>>   static unsigned long kimage_start;
>>
>>   /**
>> @@ -203,13 +205,38 @@ void machine_kexec(struct kimage *kimage)
>>   	 */
>>
>>   	cpu_soft_restart(virt_to_phys(cpu_reset),
>> -		is_hyp_mode_available(),
>> +		in_crash_kexec ? 0 : is_hyp_mode_available(),
>>   		reboot_code_buffer_phys, kimage->head, kimage_start);
>>
>>   	BUG(); /* Should never get here. */
>>   }
>>
>> +/**
>> + * machine_crash_shutdown - shutdown non-boot cpus and save registers
>> + */
>>   void machine_crash_shutdown(struct pt_regs *regs)
>>   {
>> -	/* Empty routine needed to avoid build errors. */
>> +	struct pt_regs dummy_regs;
>> +	int cpu;
>> +
>> +	local_irq_disable();
>> +
>> +	in_crash_kexec = true;
>> +
>> +	/*
>> +	 * clear and initialize the per-cpu info. This is necessary
>> +	 * because, otherwise, slots for offline cpus would never be
>> +	 * filled up. See smp_send_stop().
>> +	 */
>> +	memset(&dummy_regs, 0, sizeof(dummy_regs));
>> +	for_each_possible_cpu(cpu)
>> +		crash_save_cpu(&dummy_regs, cpu);
>> +
>> +	/* shutdown non-boot cpus */
>> +	smp_send_stop();
>> +
>> +	/* for boot cpu */
>> +	crash_save_cpu(regs, smp_processor_id());
>> +
>> +	pr_info("Starting crashdump kernel...\n");
>>   }
>> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
>> index b1adc51..15aabef 100644
>> --- a/arch/arm64/kernel/smp.c
>> +++ b/arch/arm64/kernel/smp.c
>> @@ -37,6 +37,7 @@
>>   #include <linux/completion.h>
>>   #include <linux/of.h>
>>   #include <linux/irq_work.h>
>> +#include <linux/kexec.h>
>>
>>   #include <asm/alternative.h>
>>   #include <asm/atomic.h>
>> @@ -54,6 +55,8 @@
>>   #include <asm/ptrace.h>
>>   #include <asm/virt.h>
>>
>> +#include "cpu-reset.h"
>> +
>>   #define CREATE_TRACE_POINTS
>>   #include <trace/events/ipi.h>
>>
>> @@ -683,8 +686,12 @@ static DEFINE_RAW_SPINLOCK(stop_lock);
>>   /*
>>    * ipi_cpu_stop - handle IPI from smp_send_stop()
>>    */
>> -static void ipi_cpu_stop(unsigned int cpu)
>> +static void ipi_cpu_stop(unsigned int cpu, struct pt_regs *regs)
>>   {
>> +#ifdef CONFIG_KEXEC
>> +	/* printing messages may slow down the shutdown. */
>> +	if (!in_crash_kexec)
>> +#endif
>>   	if (system_state == SYSTEM_BOOTING ||
>>   	    system_state == SYSTEM_RUNNING) {
>>   		raw_spin_lock(&stop_lock);
>
> Irrespective of how useful this change is, how about having a predicate
> instead? Something like:
>
> static inline bool is_in_crash_kexec(void)
> {
> #ifdef CONFIG_KEXEC
> 	return in_crash_kexec;
> #else
> 	return false;
> #endif
> }

OK, I will take your idea.

> located in machine_kexec.c (making the in_crash_kernel static), and then

but cannot make in_crash_kernel static because it is also used in both smp.c
and machine_kexec.c.

> 	if (!is_in_crash_kexec() && (systen_state == ... || ...) {
>
> It would certainly look better.

Thanks,
-Takahiro AKASHI

>> @@ -697,6 +704,11 @@ static void ipi_cpu_stop(unsigned int cpu)
>>
>>   	local_irq_disable();
>>
>> +#ifdef CONFIG_KEXEC
>> +	if (in_crash_kexec)
>> +		crash_save_cpu(regs, cpu);
>> +#endif /* CONFIG_KEXEC */
>> +
>
> Same here.
>
>>   	while (1)
>>   		cpu_relax();
>>   }
>> @@ -727,7 +739,7 @@ void handle_IPI(int ipinr, struct pt_regs *regs)
>>
>>   	case IPI_CPU_STOP:
>>   		irq_enter();
>> -		ipi_cpu_stop(cpu);
>> +		ipi_cpu_stop(cpu, regs);
>>   		irq_exit();
>>   		break;
>>
>>
>
> Thanks,
>
> 	M.
>

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 12/16] arm64: kdump: implement machine_crash_shutdown()
  2015-12-03  4:15     ` Pratyush Anand
@ 2015-12-10 11:42       ` AKASHI Takahiro
  -1 siblings, 0 replies; 178+ messages in thread
From: AKASHI Takahiro @ 2015-12-10 11:42 UTC (permalink / raw)
  To: linux-arm-kernel

Pratyush,

On 12/03/2015 01:15 PM, Pratyush Anand wrote:
> Hi Akashi,
>
> Some of the points which came while discussing with Mark Salter are worth
> including in v13.
>
> On 24/11/2015:10:25:34 PM, Geoff Levand wrote:
>> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
>
> [...]
>
>> +/**
>> + * machine_crash_shutdown - shutdown non-boot cpus and save registers
>
> "non-panic" would be correct in stead of "non-boot".

OK, but for consistency with other places (say, arm/kernel/machine_kexec.c)
I prefer "non-crashing cpus."

>> +	/* shutdown non-boot cpus */
>
> Ditto
>
>> +	smp_send_stop();
>> +
>> +	/* for boot cpu */
>
> "for panic cpu"

Ditto.
"for crashing cpu"

>
>
> [...]
>
>> @@ -697,6 +704,11 @@ static void ipi_cpu_stop(unsigned int cpu)
>>
>>   	local_irq_disable();
>
> We have "set_cpu_online(cpu, false);" just before it.
> Panic core is waiting for non-panic to go offline, i.e. for the above event.
>
>>
>> +#ifdef CONFIG_KEXEC
>> +	if (in_crash_kexec)
>> +		crash_save_cpu(regs, cpu);
>> +#endif /* CONFIG_KEXEC */
>
> However, we are still saving crash info for non-panic core. So, it would be good
> to move crash_save_cpu() before set_cpu_online() to avoid any race condition.

Good point, race will be very unlikely though. I will fix it.

Thanks,
-Takahiro AKASHI

> ~Pratyush
>

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 12/16] arm64: kdump: implement machine_crash_shutdown()
@ 2015-12-10 11:42       ` AKASHI Takahiro
  0 siblings, 0 replies; 178+ messages in thread
From: AKASHI Takahiro @ 2015-12-10 11:42 UTC (permalink / raw)
  To: Pratyush Anand
  Cc: Mark Rutland, Geoff Levand, Catalin Marinas, Will Deacon,
	marc.zyngier, linux-arm-kernel, msalter, kexec, christoffer.dall

Pratyush,

On 12/03/2015 01:15 PM, Pratyush Anand wrote:
> Hi Akashi,
>
> Some of the points which came while discussing with Mark Salter are worth
> including in v13.
>
> On 24/11/2015:10:25:34 PM, Geoff Levand wrote:
>> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
>
> [...]
>
>> +/**
>> + * machine_crash_shutdown - shutdown non-boot cpus and save registers
>
> "non-panic" would be correct in stead of "non-boot".

OK, but for consistency with other places (say, arm/kernel/machine_kexec.c)
I prefer "non-crashing cpus."

>> +	/* shutdown non-boot cpus */
>
> Ditto
>
>> +	smp_send_stop();
>> +
>> +	/* for boot cpu */
>
> "for panic cpu"

Ditto.
"for crashing cpu"

>
>
> [...]
>
>> @@ -697,6 +704,11 @@ static void ipi_cpu_stop(unsigned int cpu)
>>
>>   	local_irq_disable();
>
> We have "set_cpu_online(cpu, false);" just before it.
> Panic core is waiting for non-panic to go offline, i.e. for the above event.
>
>>
>> +#ifdef CONFIG_KEXEC
>> +	if (in_crash_kexec)
>> +		crash_save_cpu(regs, cpu);
>> +#endif /* CONFIG_KEXEC */
>
> However, we are still saving crash info for non-panic core. So, it would be good
> to move crash_save_cpu() before set_cpu_online() to avoid any race condition.

Good point, race will be very unlikely though. I will fix it.

Thanks,
-Takahiro AKASHI

> ~Pratyush
>

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 12/16] arm64: kdump: implement machine_crash_shutdown()
  2015-12-10 11:34       ` AKASHI Takahiro
@ 2015-12-10 11:44         ` Marc Zyngier
  -1 siblings, 0 replies; 178+ messages in thread
From: Marc Zyngier @ 2015-12-10 11:44 UTC (permalink / raw)
  To: linux-arm-kernel

On 10/12/15 11:34, AKASHI Takahiro wrote:
> Marc,
> 
> I was back from my vacation.
> 
> On 11/27/2015 11:39 PM, Marc Zyngier wrote:
>> On 24/11/15 22:25, Geoff Levand wrote:
>>> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>
>>> kdump calls machine_crash_shutdown() to shut down non-boot cpus and
>>> save registers' status in per-cpu ELF notes before starting the crash
>>> dump kernel. See kernel_kexec().
>>>
>>> ipi_cpu_stop() is a bit modified and used to support this behavior.
>>>
>>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>> ---
>>>   arch/arm64/include/asm/kexec.h    | 34 +++++++++++++++++++++++++++++++++-
>>>   arch/arm64/kernel/machine_kexec.c | 31 +++++++++++++++++++++++++++++--
>>>   arch/arm64/kernel/smp.c           | 16 ++++++++++++++--
>>>   3 files changed, 76 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
>>> index 46d63cd..555a955 100644
>>> --- a/arch/arm64/include/asm/kexec.h
>>> +++ b/arch/arm64/include/asm/kexec.h
>>> @@ -30,6 +30,8 @@
>>>
>>>   #if !defined(__ASSEMBLY__)
>>>
>>> +extern bool in_crash_kexec;
>>> +
>>>   /**
>>>    * crash_setup_regs() - save registers for the panic kernel
>>>    *
>>> @@ -40,7 +42,37 @@
>>>   static inline void crash_setup_regs(struct pt_regs *newregs,
>>>   				    struct pt_regs *oldregs)
>>>   {
>>> -	/* Empty routine needed to avoid build errors. */
>>> +	if (oldregs) {
>>> +		memcpy(newregs, oldregs, sizeof(*newregs));
>>> +	} else {
>>> +		__asm__ __volatile__ (
>>> +			"stp	 x0,   x1, [%3, #16 *  0]\n"
>>> +			"stp	 x2,   x3, [%3, #16 *  1]\n"
>>> +			"stp	 x4,   x5, [%3, #16 *  2]\n"
>>> +			"stp	 x6,   x7, [%3, #16 *  3]\n"
>>> +			"stp	 x8,   x9, [%3, #16 *  4]\n"
>>> +			"stp	x10,  x11, [%3, #16 *  5]\n"
>>> +			"stp	x12,  x13, [%3, #16 *  6]\n"
>>> +			"stp	x14,  x15, [%3, #16 *  7]\n"
>>> +			"stp	x16,  x17, [%3, #16 *  8]\n"
>>> +			"stp	x18,  x19, [%3, #16 *  9]\n"
>>> +			"stp	x20,  x21, [%3, #16 * 10]\n"
>>> +			"stp	x22,  x23, [%3, #16 * 11]\n"
>>> +			"stp	x24,  x25, [%3, #16 * 12]\n"
>>> +			"stp	x26,  x27, [%3, #16 * 13]\n"
>>> +			"stp	x28,  x29, [%3, #16 * 14]\n"
>>> +			"str	x30,	   [%3, #16 * 15]\n"
>>> +			"mov	%0, sp\n"
>>> +			"adr	%1, 1f\n"
>>> +			"mrs	%2, spsr_el1\n"
>>> +		"1:"
>>> +			: "=r" (newregs->sp),
>>> +			  "=r" (newregs->pc),
>>> +			  "=r" (newregs->pstate)
>>> +			: "r"  (&newregs->regs)
>>> +			: "memory"
>>> +		);
>>
>> I wonder how useful this thing is, given that it starts by corrupting
>> whatever register is holding newregs->regs. Maybe this is not supposed
>> to be accurate anyway...
> 
> I'm not quite sure about what part of my code you're mentioning here, but
> crash_setup_regs() is solely called by crash_kexec(), and panic() is
> the only caller of crash_kexec() with NULL argument which, in turn, is
> used as 'oldregs' in crash_setup_regs().

You have this assembly sequence:

stp	 x0,   x1, [%3, #16 *  0]
[...]

where %3 itself is one of the x[0..30] registers. So you are saving
things that have already been corrupted by the saving procedure. Not
sure how useful that is, but as I said, maybe it is not supposed to be
completely accurate.

> Given this fact, I think that the values saved in newregs as indicated above
> will be the best estimate of current cpu contexts.
> 
> The other caller of crash_kexec() is die() in traps.c, but here we call
> it with explicit cpu contexts at exception.
> 
>>
>>> +	}
>>>   }
>>>
>>>   #endif /* !defined(__ASSEMBLY__) */
>>> diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
>>> index da28a26..d2d7e90 100644
>>> --- a/arch/arm64/kernel/machine_kexec.c
>>> +++ b/arch/arm64/kernel/machine_kexec.c
>>> @@ -9,6 +9,7 @@
>>>    * published by the Free Software Foundation.
>>>    */
>>>
>>> +#include <linux/kernel.h>
>>>   #include <linux/kexec.h>
>>>   #include <linux/of_fdt.h>
>>>   #include <linux/slab.h>
>>> @@ -23,6 +24,7 @@
>>>   extern const unsigned char arm64_relocate_new_kernel[];
>>>   extern const unsigned long arm64_relocate_new_kernel_size;
>>>
>>> +bool in_crash_kexec;
>>>   static unsigned long kimage_start;
>>>
>>>   /**
>>> @@ -203,13 +205,38 @@ void machine_kexec(struct kimage *kimage)
>>>   	 */
>>>
>>>   	cpu_soft_restart(virt_to_phys(cpu_reset),
>>> -		is_hyp_mode_available(),
>>> +		in_crash_kexec ? 0 : is_hyp_mode_available(),
>>>   		reboot_code_buffer_phys, kimage->head, kimage_start);
>>>
>>>   	BUG(); /* Should never get here. */
>>>   }
>>>
>>> +/**
>>> + * machine_crash_shutdown - shutdown non-boot cpus and save registers
>>> + */
>>>   void machine_crash_shutdown(struct pt_regs *regs)
>>>   {
>>> -	/* Empty routine needed to avoid build errors. */
>>> +	struct pt_regs dummy_regs;
>>> +	int cpu;
>>> +
>>> +	local_irq_disable();
>>> +
>>> +	in_crash_kexec = true;
>>> +
>>> +	/*
>>> +	 * clear and initialize the per-cpu info. This is necessary
>>> +	 * because, otherwise, slots for offline cpus would never be
>>> +	 * filled up. See smp_send_stop().
>>> +	 */
>>> +	memset(&dummy_regs, 0, sizeof(dummy_regs));
>>> +	for_each_possible_cpu(cpu)
>>> +		crash_save_cpu(&dummy_regs, cpu);
>>> +
>>> +	/* shutdown non-boot cpus */
>>> +	smp_send_stop();
>>> +
>>> +	/* for boot cpu */
>>> +	crash_save_cpu(regs, smp_processor_id());
>>> +
>>> +	pr_info("Starting crashdump kernel...\n");
>>>   }
>>> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
>>> index b1adc51..15aabef 100644
>>> --- a/arch/arm64/kernel/smp.c
>>> +++ b/arch/arm64/kernel/smp.c
>>> @@ -37,6 +37,7 @@
>>>   #include <linux/completion.h>
>>>   #include <linux/of.h>
>>>   #include <linux/irq_work.h>
>>> +#include <linux/kexec.h>
>>>
>>>   #include <asm/alternative.h>
>>>   #include <asm/atomic.h>
>>> @@ -54,6 +55,8 @@
>>>   #include <asm/ptrace.h>
>>>   #include <asm/virt.h>
>>>
>>> +#include "cpu-reset.h"
>>> +
>>>   #define CREATE_TRACE_POINTS
>>>   #include <trace/events/ipi.h>
>>>
>>> @@ -683,8 +686,12 @@ static DEFINE_RAW_SPINLOCK(stop_lock);
>>>   /*
>>>    * ipi_cpu_stop - handle IPI from smp_send_stop()
>>>    */
>>> -static void ipi_cpu_stop(unsigned int cpu)
>>> +static void ipi_cpu_stop(unsigned int cpu, struct pt_regs *regs)
>>>   {
>>> +#ifdef CONFIG_KEXEC
>>> +	/* printing messages may slow down the shutdown. */
>>> +	if (!in_crash_kexec)
>>> +#endif
>>>   	if (system_state == SYSTEM_BOOTING ||
>>>   	    system_state == SYSTEM_RUNNING) {
>>>   		raw_spin_lock(&stop_lock);
>>
>> Irrespective of how useful this change is, how about having a predicate
>> instead? Something like:
>>
>> static inline bool is_in_crash_kexec(void)
>> {
>> #ifdef CONFIG_KEXEC
>> 	return in_crash_kexec;
>> #else
>> 	return false;
>> #endif
>> }
> 
> OK, I will take your idea.
> 
>> located in machine_kexec.c (making the in_crash_kernel static), and then
> 
> but cannot make in_crash_kernel static because it is also used in both smp.c
> and machine_kexec.c.

smp.c only reads from in_crash_kernel (at least from what I can see in
this patch), so it should be able to use the accessor.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 12/16] arm64: kdump: implement machine_crash_shutdown()
@ 2015-12-10 11:44         ` Marc Zyngier
  0 siblings, 0 replies; 178+ messages in thread
From: Marc Zyngier @ 2015-12-10 11:44 UTC (permalink / raw)
  To: AKASHI Takahiro, Geoff Levand, Catalin Marinas, Will Deacon
  Cc: Mark Rutland, kexec, linux-arm-kernel, christoffer.dall

On 10/12/15 11:34, AKASHI Takahiro wrote:
> Marc,
> 
> I was back from my vacation.
> 
> On 11/27/2015 11:39 PM, Marc Zyngier wrote:
>> On 24/11/15 22:25, Geoff Levand wrote:
>>> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>
>>> kdump calls machine_crash_shutdown() to shut down non-boot cpus and
>>> save registers' status in per-cpu ELF notes before starting the crash
>>> dump kernel. See kernel_kexec().
>>>
>>> ipi_cpu_stop() is a bit modified and used to support this behavior.
>>>
>>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>> ---
>>>   arch/arm64/include/asm/kexec.h    | 34 +++++++++++++++++++++++++++++++++-
>>>   arch/arm64/kernel/machine_kexec.c | 31 +++++++++++++++++++++++++++++--
>>>   arch/arm64/kernel/smp.c           | 16 ++++++++++++++--
>>>   3 files changed, 76 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
>>> index 46d63cd..555a955 100644
>>> --- a/arch/arm64/include/asm/kexec.h
>>> +++ b/arch/arm64/include/asm/kexec.h
>>> @@ -30,6 +30,8 @@
>>>
>>>   #if !defined(__ASSEMBLY__)
>>>
>>> +extern bool in_crash_kexec;
>>> +
>>>   /**
>>>    * crash_setup_regs() - save registers for the panic kernel
>>>    *
>>> @@ -40,7 +42,37 @@
>>>   static inline void crash_setup_regs(struct pt_regs *newregs,
>>>   				    struct pt_regs *oldregs)
>>>   {
>>> -	/* Empty routine needed to avoid build errors. */
>>> +	if (oldregs) {
>>> +		memcpy(newregs, oldregs, sizeof(*newregs));
>>> +	} else {
>>> +		__asm__ __volatile__ (
>>> +			"stp	 x0,   x1, [%3, #16 *  0]\n"
>>> +			"stp	 x2,   x3, [%3, #16 *  1]\n"
>>> +			"stp	 x4,   x5, [%3, #16 *  2]\n"
>>> +			"stp	 x6,   x7, [%3, #16 *  3]\n"
>>> +			"stp	 x8,   x9, [%3, #16 *  4]\n"
>>> +			"stp	x10,  x11, [%3, #16 *  5]\n"
>>> +			"stp	x12,  x13, [%3, #16 *  6]\n"
>>> +			"stp	x14,  x15, [%3, #16 *  7]\n"
>>> +			"stp	x16,  x17, [%3, #16 *  8]\n"
>>> +			"stp	x18,  x19, [%3, #16 *  9]\n"
>>> +			"stp	x20,  x21, [%3, #16 * 10]\n"
>>> +			"stp	x22,  x23, [%3, #16 * 11]\n"
>>> +			"stp	x24,  x25, [%3, #16 * 12]\n"
>>> +			"stp	x26,  x27, [%3, #16 * 13]\n"
>>> +			"stp	x28,  x29, [%3, #16 * 14]\n"
>>> +			"str	x30,	   [%3, #16 * 15]\n"
>>> +			"mov	%0, sp\n"
>>> +			"adr	%1, 1f\n"
>>> +			"mrs	%2, spsr_el1\n"
>>> +		"1:"
>>> +			: "=r" (newregs->sp),
>>> +			  "=r" (newregs->pc),
>>> +			  "=r" (newregs->pstate)
>>> +			: "r"  (&newregs->regs)
>>> +			: "memory"
>>> +		);
>>
>> I wonder how useful this thing is, given that it starts by corrupting
>> whatever register is holding newregs->regs. Maybe this is not supposed
>> to be accurate anyway...
> 
> I'm not quite sure about what part of my code you're mentioning here, but
> crash_setup_regs() is solely called by crash_kexec(), and panic() is
> the only caller of crash_kexec() with NULL argument which, in turn, is
> used as 'oldregs' in crash_setup_regs().

You have this assembly sequence:

stp	 x0,   x1, [%3, #16 *  0]
[...]

where %3 itself is one of the x[0..30] registers. So you are saving
things that have already been corrupted by the saving procedure. Not
sure how useful that is, but as I said, maybe it is not supposed to be
completely accurate.

> Given this fact, I think that the values saved in newregs as indicated above
> will be the best estimate of current cpu contexts.
> 
> The other caller of crash_kexec() is die() in traps.c, but here we call
> it with explicit cpu contexts at exception.
> 
>>
>>> +	}
>>>   }
>>>
>>>   #endif /* !defined(__ASSEMBLY__) */
>>> diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
>>> index da28a26..d2d7e90 100644
>>> --- a/arch/arm64/kernel/machine_kexec.c
>>> +++ b/arch/arm64/kernel/machine_kexec.c
>>> @@ -9,6 +9,7 @@
>>>    * published by the Free Software Foundation.
>>>    */
>>>
>>> +#include <linux/kernel.h>
>>>   #include <linux/kexec.h>
>>>   #include <linux/of_fdt.h>
>>>   #include <linux/slab.h>
>>> @@ -23,6 +24,7 @@
>>>   extern const unsigned char arm64_relocate_new_kernel[];
>>>   extern const unsigned long arm64_relocate_new_kernel_size;
>>>
>>> +bool in_crash_kexec;
>>>   static unsigned long kimage_start;
>>>
>>>   /**
>>> @@ -203,13 +205,38 @@ void machine_kexec(struct kimage *kimage)
>>>   	 */
>>>
>>>   	cpu_soft_restart(virt_to_phys(cpu_reset),
>>> -		is_hyp_mode_available(),
>>> +		in_crash_kexec ? 0 : is_hyp_mode_available(),
>>>   		reboot_code_buffer_phys, kimage->head, kimage_start);
>>>
>>>   	BUG(); /* Should never get here. */
>>>   }
>>>
>>> +/**
>>> + * machine_crash_shutdown - shutdown non-boot cpus and save registers
>>> + */
>>>   void machine_crash_shutdown(struct pt_regs *regs)
>>>   {
>>> -	/* Empty routine needed to avoid build errors. */
>>> +	struct pt_regs dummy_regs;
>>> +	int cpu;
>>> +
>>> +	local_irq_disable();
>>> +
>>> +	in_crash_kexec = true;
>>> +
>>> +	/*
>>> +	 * clear and initialize the per-cpu info. This is necessary
>>> +	 * because, otherwise, slots for offline cpus would never be
>>> +	 * filled up. See smp_send_stop().
>>> +	 */
>>> +	memset(&dummy_regs, 0, sizeof(dummy_regs));
>>> +	for_each_possible_cpu(cpu)
>>> +		crash_save_cpu(&dummy_regs, cpu);
>>> +
>>> +	/* shutdown non-boot cpus */
>>> +	smp_send_stop();
>>> +
>>> +	/* for boot cpu */
>>> +	crash_save_cpu(regs, smp_processor_id());
>>> +
>>> +	pr_info("Starting crashdump kernel...\n");
>>>   }
>>> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
>>> index b1adc51..15aabef 100644
>>> --- a/arch/arm64/kernel/smp.c
>>> +++ b/arch/arm64/kernel/smp.c
>>> @@ -37,6 +37,7 @@
>>>   #include <linux/completion.h>
>>>   #include <linux/of.h>
>>>   #include <linux/irq_work.h>
>>> +#include <linux/kexec.h>
>>>
>>>   #include <asm/alternative.h>
>>>   #include <asm/atomic.h>
>>> @@ -54,6 +55,8 @@
>>>   #include <asm/ptrace.h>
>>>   #include <asm/virt.h>
>>>
>>> +#include "cpu-reset.h"
>>> +
>>>   #define CREATE_TRACE_POINTS
>>>   #include <trace/events/ipi.h>
>>>
>>> @@ -683,8 +686,12 @@ static DEFINE_RAW_SPINLOCK(stop_lock);
>>>   /*
>>>    * ipi_cpu_stop - handle IPI from smp_send_stop()
>>>    */
>>> -static void ipi_cpu_stop(unsigned int cpu)
>>> +static void ipi_cpu_stop(unsigned int cpu, struct pt_regs *regs)
>>>   {
>>> +#ifdef CONFIG_KEXEC
>>> +	/* printing messages may slow down the shutdown. */
>>> +	if (!in_crash_kexec)
>>> +#endif
>>>   	if (system_state == SYSTEM_BOOTING ||
>>>   	    system_state == SYSTEM_RUNNING) {
>>>   		raw_spin_lock(&stop_lock);
>>
>> Irrespective of how useful this change is, how about having a predicate
>> instead? Something like:
>>
>> static inline bool is_in_crash_kexec(void)
>> {
>> #ifdef CONFIG_KEXEC
>> 	return in_crash_kexec;
>> #else
>> 	return false;
>> #endif
>> }
> 
> OK, I will take your idea.
> 
>> located in machine_kexec.c (making the in_crash_kernel static), and then
> 
> but cannot make in_crash_kernel static because it is also used in both smp.c
> and machine_kexec.c.

smp.c only reads from in_crash_kernel (at least from what I can see in
this patch), so it should be able to use the accessor.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 12/16] arm64: kdump: implement machine_crash_shutdown()
  2015-12-10 11:42       ` AKASHI Takahiro
@ 2015-12-10 11:50         ` Pratyush Anand
  -1 siblings, 0 replies; 178+ messages in thread
From: Pratyush Anand @ 2015-12-10 11:50 UTC (permalink / raw)
  To: linux-arm-kernel

On 10/12/2015:08:42:12 PM, AKASHI Takahiro wrote:
> >>+ * machine_crash_shutdown - shutdown non-boot cpus and save registers
> >
> >"non-panic" would be correct in stead of "non-boot".
> 
> OK, but for consistency with other places (say, arm/kernel/machine_kexec.c)
> I prefer "non-crashing cpus."

OK.

> 
> >>+	/* shutdown non-boot cpus */
> >
> >Ditto
> >
> >>+	smp_send_stop();
> >>+
> >>+	/* for boot cpu */
> >
> >"for panic cpu"
> 
> Ditto.
> "for crashing cpu"

OK.

~Pratyush

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 12/16] arm64: kdump: implement machine_crash_shutdown()
@ 2015-12-10 11:50         ` Pratyush Anand
  0 siblings, 0 replies; 178+ messages in thread
From: Pratyush Anand @ 2015-12-10 11:50 UTC (permalink / raw)
  To: AKASHI Takahiro
  Cc: Mark Rutland, Geoff Levand, Catalin Marinas, Will Deacon,
	marc.zyngier, linux-arm-kernel, msalter, kexec, christoffer.dall

On 10/12/2015:08:42:12 PM, AKASHI Takahiro wrote:
> >>+ * machine_crash_shutdown - shutdown non-boot cpus and save registers
> >
> >"non-panic" would be correct in stead of "non-boot".
> 
> OK, but for consistency with other places (say, arm/kernel/machine_kexec.c)
> I prefer "non-crashing cpus."

OK.

> 
> >>+	/* shutdown non-boot cpus */
> >
> >Ditto
> >
> >>+	smp_send_stop();
> >>+
> >>+	/* for boot cpu */
> >
> >"for panic cpu"
> 
> Ditto.
> "for crashing cpu"

OK.

~Pratyush

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 12/16] arm64: kdump: implement machine_crash_shutdown()
  2015-12-10 11:44         ` Marc Zyngier
@ 2015-12-10 12:55           ` AKASHI Takahiro
  -1 siblings, 0 replies; 178+ messages in thread
From: AKASHI Takahiro @ 2015-12-10 12:55 UTC (permalink / raw)
  To: linux-arm-kernel

On 12/10/2015 08:44 PM, Marc Zyngier wrote:
> On 10/12/15 11:34, AKASHI Takahiro wrote:
>> Marc,
>>
>> I was back from my vacation.
>>
>> On 11/27/2015 11:39 PM, Marc Zyngier wrote:
>>> On 24/11/15 22:25, Geoff Levand wrote:
>>>> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>>
>>>> kdump calls machine_crash_shutdown() to shut down non-boot cpus and
>>>> save registers' status in per-cpu ELF notes before starting the crash
>>>> dump kernel. See kernel_kexec().
>>>>
>>>> ipi_cpu_stop() is a bit modified and used to support this behavior.
>>>>
>>>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>> ---
>>>>    arch/arm64/include/asm/kexec.h    | 34 +++++++++++++++++++++++++++++++++-
>>>>    arch/arm64/kernel/machine_kexec.c | 31 +++++++++++++++++++++++++++++--
>>>>    arch/arm64/kernel/smp.c           | 16 ++++++++++++++--
>>>>    3 files changed, 76 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
>>>> index 46d63cd..555a955 100644
>>>> --- a/arch/arm64/include/asm/kexec.h
>>>> +++ b/arch/arm64/include/asm/kexec.h
>>>> @@ -30,6 +30,8 @@
>>>>
>>>>    #if !defined(__ASSEMBLY__)
>>>>
>>>> +extern bool in_crash_kexec;
>>>> +
>>>>    /**
>>>>     * crash_setup_regs() - save registers for the panic kernel
>>>>     *
>>>> @@ -40,7 +42,37 @@
>>>>    static inline void crash_setup_regs(struct pt_regs *newregs,
>>>>    				    struct pt_regs *oldregs)
>>>>    {
>>>> -	/* Empty routine needed to avoid build errors. */
>>>> +	if (oldregs) {
>>>> +		memcpy(newregs, oldregs, sizeof(*newregs));
>>>> +	} else {
>>>> +		__asm__ __volatile__ (
>>>> +			"stp	 x0,   x1, [%3, #16 *  0]\n"
>>>> +			"stp	 x2,   x3, [%3, #16 *  1]\n"
>>>> +			"stp	 x4,   x5, [%3, #16 *  2]\n"
>>>> +			"stp	 x6,   x7, [%3, #16 *  3]\n"
>>>> +			"stp	 x8,   x9, [%3, #16 *  4]\n"
>>>> +			"stp	x10,  x11, [%3, #16 *  5]\n"
>>>> +			"stp	x12,  x13, [%3, #16 *  6]\n"
>>>> +			"stp	x14,  x15, [%3, #16 *  7]\n"
>>>> +			"stp	x16,  x17, [%3, #16 *  8]\n"
>>>> +			"stp	x18,  x19, [%3, #16 *  9]\n"
>>>> +			"stp	x20,  x21, [%3, #16 * 10]\n"
>>>> +			"stp	x22,  x23, [%3, #16 * 11]\n"
>>>> +			"stp	x24,  x25, [%3, #16 * 12]\n"
>>>> +			"stp	x26,  x27, [%3, #16 * 13]\n"
>>>> +			"stp	x28,  x29, [%3, #16 * 14]\n"
>>>> +			"str	x30,	   [%3, #16 * 15]\n"
>>>> +			"mov	%0, sp\n"
>>>> +			"adr	%1, 1f\n"
>>>> +			"mrs	%2, spsr_el1\n"
>>>> +		"1:"
>>>> +			: "=r" (newregs->sp),
>>>> +			  "=r" (newregs->pc),
>>>> +			  "=r" (newregs->pstate)
>>>> +			: "r"  (&newregs->regs)
>>>> +			: "memory"
>>>> +		);
>>>
>>> I wonder how useful this thing is, given that it starts by corrupting
>>> whatever register is holding newregs->regs. Maybe this is not supposed
>>> to be accurate anyway...
>>
>> I'm not quite sure about what part of my code you're mentioning here, but
>> crash_setup_regs() is solely called by crash_kexec(), and panic() is
>> the only caller of crash_kexec() with NULL argument which, in turn, is
>> used as 'oldregs' in crash_setup_regs().
>
> You have this assembly sequence:
>
> stp	 x0,   x1, [%3, #16 *  0]
> [...]
>
> where %3 itself is one of the x[0..30] registers.

Not only %3, but also

> So you are saving
> things that have already been corrupted by the saving procedure. Not
> sure how useful that is, but as I said, maybe it is not supposed to be
> completely accurate.

x0, x1 ... are the current values in panic(), and not the exact cpu contexts
at the place we are really interested in.
We have no way here in panic() to know them, but sp and pc would still be useful
for back-tracing in later investigation of dump file.

Please note that the same problem exists on arm (and x86) implementation.

>> Given this fact, I think that the values saved in newregs as indicated above
>> will be the best estimate of current cpu contexts.
>>
>> The other caller of crash_kexec() is die() in traps.c, but here we call
>> it with explicit cpu contexts at exception.
>>
>>>
>>>> +	}
>>>>    }
>>>>
>>>>    #endif /* !defined(__ASSEMBLY__) */
>>>> diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
>>>> index da28a26..d2d7e90 100644
>>>> --- a/arch/arm64/kernel/machine_kexec.c
>>>> +++ b/arch/arm64/kernel/machine_kexec.c
>>>> @@ -9,6 +9,7 @@
>>>>     * published by the Free Software Foundation.
>>>>     */
>>>>
>>>> +#include <linux/kernel.h>
>>>>    #include <linux/kexec.h>
>>>>    #include <linux/of_fdt.h>
>>>>    #include <linux/slab.h>
>>>> @@ -23,6 +24,7 @@
>>>>    extern const unsigned char arm64_relocate_new_kernel[];
>>>>    extern const unsigned long arm64_relocate_new_kernel_size;
>>>>
>>>> +bool in_crash_kexec;
>>>>    static unsigned long kimage_start;
>>>>
>>>>    /**
>>>> @@ -203,13 +205,38 @@ void machine_kexec(struct kimage *kimage)
>>>>    	 */
>>>>
>>>>    	cpu_soft_restart(virt_to_phys(cpu_reset),
>>>> -		is_hyp_mode_available(),
>>>> +		in_crash_kexec ? 0 : is_hyp_mode_available(),
>>>>    		reboot_code_buffer_phys, kimage->head, kimage_start);
>>>>
>>>>    	BUG(); /* Should never get here. */
>>>>    }
>>>>
>>>> +/**
>>>> + * machine_crash_shutdown - shutdown non-boot cpus and save registers
>>>> + */
>>>>    void machine_crash_shutdown(struct pt_regs *regs)
>>>>    {
>>>> -	/* Empty routine needed to avoid build errors. */
>>>> +	struct pt_regs dummy_regs;
>>>> +	int cpu;
>>>> +
>>>> +	local_irq_disable();
>>>> +
>>>> +	in_crash_kexec = true;
>>>> +
>>>> +	/*
>>>> +	 * clear and initialize the per-cpu info. This is necessary
>>>> +	 * because, otherwise, slots for offline cpus would never be
>>>> +	 * filled up. See smp_send_stop().
>>>> +	 */
>>>> +	memset(&dummy_regs, 0, sizeof(dummy_regs));
>>>> +	for_each_possible_cpu(cpu)
>>>> +		crash_save_cpu(&dummy_regs, cpu);
>>>> +
>>>> +	/* shutdown non-boot cpus */
>>>> +	smp_send_stop();
>>>> +
>>>> +	/* for boot cpu */
>>>> +	crash_save_cpu(regs, smp_processor_id());
>>>> +
>>>> +	pr_info("Starting crashdump kernel...\n");
>>>>    }
>>>> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
>>>> index b1adc51..15aabef 100644
>>>> --- a/arch/arm64/kernel/smp.c
>>>> +++ b/arch/arm64/kernel/smp.c
>>>> @@ -37,6 +37,7 @@
>>>>    #include <linux/completion.h>
>>>>    #include <linux/of.h>
>>>>    #include <linux/irq_work.h>
>>>> +#include <linux/kexec.h>
>>>>
>>>>    #include <asm/alternative.h>
>>>>    #include <asm/atomic.h>
>>>> @@ -54,6 +55,8 @@
>>>>    #include <asm/ptrace.h>
>>>>    #include <asm/virt.h>
>>>>
>>>> +#include "cpu-reset.h"
>>>> +
>>>>    #define CREATE_TRACE_POINTS
>>>>    #include <trace/events/ipi.h>
>>>>
>>>> @@ -683,8 +686,12 @@ static DEFINE_RAW_SPINLOCK(stop_lock);
>>>>    /*
>>>>     * ipi_cpu_stop - handle IPI from smp_send_stop()
>>>>     */
>>>> -static void ipi_cpu_stop(unsigned int cpu)
>>>> +static void ipi_cpu_stop(unsigned int cpu, struct pt_regs *regs)
>>>>    {
>>>> +#ifdef CONFIG_KEXEC
>>>> +	/* printing messages may slow down the shutdown. */
>>>> +	if (!in_crash_kexec)
>>>> +#endif
>>>>    	if (system_state == SYSTEM_BOOTING ||
>>>>    	    system_state == SYSTEM_RUNNING) {
>>>>    		raw_spin_lock(&stop_lock);
>>>
>>> Irrespective of how useful this change is, how about having a predicate
>>> instead? Something like:
>>>
>>> static inline bool is_in_crash_kexec(void)
>>> {
>>> #ifdef CONFIG_KEXEC
>>> 	return in_crash_kexec;
>>> #else
>>> 	return false;
>>> #endif
>>> }
>>
>> OK, I will take your idea.
>>
>>> located in machine_kexec.c (making the in_crash_kernel static), and then
>>
>> but cannot make in_crash_kernel static because it is also used in both smp.c
>> and machine_kexec.c.
>
> smp.c only reads from in_crash_kernel (at least from what I can see in
> this patch), so it should be able to use the accessor.

Only if we define the accessor as a real function, not an inline function in a header :)

-Takahiro AKASHI

> Thanks,
>
> 	M.
>

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 12/16] arm64: kdump: implement machine_crash_shutdown()
@ 2015-12-10 12:55           ` AKASHI Takahiro
  0 siblings, 0 replies; 178+ messages in thread
From: AKASHI Takahiro @ 2015-12-10 12:55 UTC (permalink / raw)
  To: Marc Zyngier, Geoff Levand, Catalin Marinas, Will Deacon
  Cc: Mark Rutland, kexec, linux-arm-kernel, christoffer.dall

On 12/10/2015 08:44 PM, Marc Zyngier wrote:
> On 10/12/15 11:34, AKASHI Takahiro wrote:
>> Marc,
>>
>> I was back from my vacation.
>>
>> On 11/27/2015 11:39 PM, Marc Zyngier wrote:
>>> On 24/11/15 22:25, Geoff Levand wrote:
>>>> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>>
>>>> kdump calls machine_crash_shutdown() to shut down non-boot cpus and
>>>> save registers' status in per-cpu ELF notes before starting the crash
>>>> dump kernel. See kernel_kexec().
>>>>
>>>> ipi_cpu_stop() is a bit modified and used to support this behavior.
>>>>
>>>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>> ---
>>>>    arch/arm64/include/asm/kexec.h    | 34 +++++++++++++++++++++++++++++++++-
>>>>    arch/arm64/kernel/machine_kexec.c | 31 +++++++++++++++++++++++++++++--
>>>>    arch/arm64/kernel/smp.c           | 16 ++++++++++++++--
>>>>    3 files changed, 76 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
>>>> index 46d63cd..555a955 100644
>>>> --- a/arch/arm64/include/asm/kexec.h
>>>> +++ b/arch/arm64/include/asm/kexec.h
>>>> @@ -30,6 +30,8 @@
>>>>
>>>>    #if !defined(__ASSEMBLY__)
>>>>
>>>> +extern bool in_crash_kexec;
>>>> +
>>>>    /**
>>>>     * crash_setup_regs() - save registers for the panic kernel
>>>>     *
>>>> @@ -40,7 +42,37 @@
>>>>    static inline void crash_setup_regs(struct pt_regs *newregs,
>>>>    				    struct pt_regs *oldregs)
>>>>    {
>>>> -	/* Empty routine needed to avoid build errors. */
>>>> +	if (oldregs) {
>>>> +		memcpy(newregs, oldregs, sizeof(*newregs));
>>>> +	} else {
>>>> +		__asm__ __volatile__ (
>>>> +			"stp	 x0,   x1, [%3, #16 *  0]\n"
>>>> +			"stp	 x2,   x3, [%3, #16 *  1]\n"
>>>> +			"stp	 x4,   x5, [%3, #16 *  2]\n"
>>>> +			"stp	 x6,   x7, [%3, #16 *  3]\n"
>>>> +			"stp	 x8,   x9, [%3, #16 *  4]\n"
>>>> +			"stp	x10,  x11, [%3, #16 *  5]\n"
>>>> +			"stp	x12,  x13, [%3, #16 *  6]\n"
>>>> +			"stp	x14,  x15, [%3, #16 *  7]\n"
>>>> +			"stp	x16,  x17, [%3, #16 *  8]\n"
>>>> +			"stp	x18,  x19, [%3, #16 *  9]\n"
>>>> +			"stp	x20,  x21, [%3, #16 * 10]\n"
>>>> +			"stp	x22,  x23, [%3, #16 * 11]\n"
>>>> +			"stp	x24,  x25, [%3, #16 * 12]\n"
>>>> +			"stp	x26,  x27, [%3, #16 * 13]\n"
>>>> +			"stp	x28,  x29, [%3, #16 * 14]\n"
>>>> +			"str	x30,	   [%3, #16 * 15]\n"
>>>> +			"mov	%0, sp\n"
>>>> +			"adr	%1, 1f\n"
>>>> +			"mrs	%2, spsr_el1\n"
>>>> +		"1:"
>>>> +			: "=r" (newregs->sp),
>>>> +			  "=r" (newregs->pc),
>>>> +			  "=r" (newregs->pstate)
>>>> +			: "r"  (&newregs->regs)
>>>> +			: "memory"
>>>> +		);
>>>
>>> I wonder how useful this thing is, given that it starts by corrupting
>>> whatever register is holding newregs->regs. Maybe this is not supposed
>>> to be accurate anyway...
>>
>> I'm not quite sure about what part of my code you're mentioning here, but
>> crash_setup_regs() is solely called by crash_kexec(), and panic() is
>> the only caller of crash_kexec() with NULL argument which, in turn, is
>> used as 'oldregs' in crash_setup_regs().
>
> You have this assembly sequence:
>
> stp	 x0,   x1, [%3, #16 *  0]
> [...]
>
> where %3 itself is one of the x[0..30] registers.

Not only %3, but also

> So you are saving
> things that have already been corrupted by the saving procedure. Not
> sure how useful that is, but as I said, maybe it is not supposed to be
> completely accurate.

x0, x1 ... are the current values in panic(), and not the exact cpu contexts
at the place we are really interested in.
We have no way here in panic() to know them, but sp and pc would still be useful
for back-tracing in later investigation of dump file.

Please note that the same problem exists on arm (and x86) implementation.

>> Given this fact, I think that the values saved in newregs as indicated above
>> will be the best estimate of current cpu contexts.
>>
>> The other caller of crash_kexec() is die() in traps.c, but here we call
>> it with explicit cpu contexts at exception.
>>
>>>
>>>> +	}
>>>>    }
>>>>
>>>>    #endif /* !defined(__ASSEMBLY__) */
>>>> diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
>>>> index da28a26..d2d7e90 100644
>>>> --- a/arch/arm64/kernel/machine_kexec.c
>>>> +++ b/arch/arm64/kernel/machine_kexec.c
>>>> @@ -9,6 +9,7 @@
>>>>     * published by the Free Software Foundation.
>>>>     */
>>>>
>>>> +#include <linux/kernel.h>
>>>>    #include <linux/kexec.h>
>>>>    #include <linux/of_fdt.h>
>>>>    #include <linux/slab.h>
>>>> @@ -23,6 +24,7 @@
>>>>    extern const unsigned char arm64_relocate_new_kernel[];
>>>>    extern const unsigned long arm64_relocate_new_kernel_size;
>>>>
>>>> +bool in_crash_kexec;
>>>>    static unsigned long kimage_start;
>>>>
>>>>    /**
>>>> @@ -203,13 +205,38 @@ void machine_kexec(struct kimage *kimage)
>>>>    	 */
>>>>
>>>>    	cpu_soft_restart(virt_to_phys(cpu_reset),
>>>> -		is_hyp_mode_available(),
>>>> +		in_crash_kexec ? 0 : is_hyp_mode_available(),
>>>>    		reboot_code_buffer_phys, kimage->head, kimage_start);
>>>>
>>>>    	BUG(); /* Should never get here. */
>>>>    }
>>>>
>>>> +/**
>>>> + * machine_crash_shutdown - shutdown non-boot cpus and save registers
>>>> + */
>>>>    void machine_crash_shutdown(struct pt_regs *regs)
>>>>    {
>>>> -	/* Empty routine needed to avoid build errors. */
>>>> +	struct pt_regs dummy_regs;
>>>> +	int cpu;
>>>> +
>>>> +	local_irq_disable();
>>>> +
>>>> +	in_crash_kexec = true;
>>>> +
>>>> +	/*
>>>> +	 * clear and initialize the per-cpu info. This is necessary
>>>> +	 * because, otherwise, slots for offline cpus would never be
>>>> +	 * filled up. See smp_send_stop().
>>>> +	 */
>>>> +	memset(&dummy_regs, 0, sizeof(dummy_regs));
>>>> +	for_each_possible_cpu(cpu)
>>>> +		crash_save_cpu(&dummy_regs, cpu);
>>>> +
>>>> +	/* shutdown non-boot cpus */
>>>> +	smp_send_stop();
>>>> +
>>>> +	/* for boot cpu */
>>>> +	crash_save_cpu(regs, smp_processor_id());
>>>> +
>>>> +	pr_info("Starting crashdump kernel...\n");
>>>>    }
>>>> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
>>>> index b1adc51..15aabef 100644
>>>> --- a/arch/arm64/kernel/smp.c
>>>> +++ b/arch/arm64/kernel/smp.c
>>>> @@ -37,6 +37,7 @@
>>>>    #include <linux/completion.h>
>>>>    #include <linux/of.h>
>>>>    #include <linux/irq_work.h>
>>>> +#include <linux/kexec.h>
>>>>
>>>>    #include <asm/alternative.h>
>>>>    #include <asm/atomic.h>
>>>> @@ -54,6 +55,8 @@
>>>>    #include <asm/ptrace.h>
>>>>    #include <asm/virt.h>
>>>>
>>>> +#include "cpu-reset.h"
>>>> +
>>>>    #define CREATE_TRACE_POINTS
>>>>    #include <trace/events/ipi.h>
>>>>
>>>> @@ -683,8 +686,12 @@ static DEFINE_RAW_SPINLOCK(stop_lock);
>>>>    /*
>>>>     * ipi_cpu_stop - handle IPI from smp_send_stop()
>>>>     */
>>>> -static void ipi_cpu_stop(unsigned int cpu)
>>>> +static void ipi_cpu_stop(unsigned int cpu, struct pt_regs *regs)
>>>>    {
>>>> +#ifdef CONFIG_KEXEC
>>>> +	/* printing messages may slow down the shutdown. */
>>>> +	if (!in_crash_kexec)
>>>> +#endif
>>>>    	if (system_state == SYSTEM_BOOTING ||
>>>>    	    system_state == SYSTEM_RUNNING) {
>>>>    		raw_spin_lock(&stop_lock);
>>>
>>> Irrespective of how useful this change is, how about having a predicate
>>> instead? Something like:
>>>
>>> static inline bool is_in_crash_kexec(void)
>>> {
>>> #ifdef CONFIG_KEXEC
>>> 	return in_crash_kexec;
>>> #else
>>> 	return false;
>>> #endif
>>> }
>>
>> OK, I will take your idea.
>>
>>> located in machine_kexec.c (making the in_crash_kernel static), and then
>>
>> but cannot make in_crash_kernel static because it is also used in both smp.c
>> and machine_kexec.c.
>
> smp.c only reads from in_crash_kernel (at least from what I can see in
> this patch), so it should be able to use the accessor.

Only if we define the accessor as a real function, not an inline function in a header :)

-Takahiro AKASHI

> Thanks,
>
> 	M.
>

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 12/16] arm64: kdump: implement machine_crash_shutdown()
  2015-12-10 12:55           ` AKASHI Takahiro
@ 2015-12-10 13:43             ` Marc Zyngier
  -1 siblings, 0 replies; 178+ messages in thread
From: Marc Zyngier @ 2015-12-10 13:43 UTC (permalink / raw)
  To: linux-arm-kernel

On 10/12/15 12:55, AKASHI Takahiro wrote:
> On 12/10/2015 08:44 PM, Marc Zyngier wrote:
>> On 10/12/15 11:34, AKASHI Takahiro wrote:
>>> Marc,
>>>
>>> I was back from my vacation.
>>>
>>> On 11/27/2015 11:39 PM, Marc Zyngier wrote:
>>>> On 24/11/15 22:25, Geoff Levand wrote:
>>>>> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>>>
>>>>> kdump calls machine_crash_shutdown() to shut down non-boot cpus and
>>>>> save registers' status in per-cpu ELF notes before starting the crash
>>>>> dump kernel. See kernel_kexec().
>>>>>
>>>>> ipi_cpu_stop() is a bit modified and used to support this behavior.
>>>>>
>>>>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>>> ---
>>>>>    arch/arm64/include/asm/kexec.h    | 34 +++++++++++++++++++++++++++++++++-
>>>>>    arch/arm64/kernel/machine_kexec.c | 31 +++++++++++++++++++++++++++++--
>>>>>    arch/arm64/kernel/smp.c           | 16 ++++++++++++++--
>>>>>    3 files changed, 76 insertions(+), 5 deletions(-)
>>>>>
>>>>> diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
>>>>> index 46d63cd..555a955 100644
>>>>> --- a/arch/arm64/include/asm/kexec.h
>>>>> +++ b/arch/arm64/include/asm/kexec.h
>>>>> @@ -30,6 +30,8 @@
>>>>>
>>>>>    #if !defined(__ASSEMBLY__)
>>>>>
>>>>> +extern bool in_crash_kexec;
>>>>> +
>>>>>    /**
>>>>>     * crash_setup_regs() - save registers for the panic kernel
>>>>>     *
>>>>> @@ -40,7 +42,37 @@
>>>>>    static inline void crash_setup_regs(struct pt_regs *newregs,
>>>>>    				    struct pt_regs *oldregs)
>>>>>    {
>>>>> -	/* Empty routine needed to avoid build errors. */
>>>>> +	if (oldregs) {
>>>>> +		memcpy(newregs, oldregs, sizeof(*newregs));
>>>>> +	} else {
>>>>> +		__asm__ __volatile__ (
>>>>> +			"stp	 x0,   x1, [%3, #16 *  0]\n"
>>>>> +			"stp	 x2,   x3, [%3, #16 *  1]\n"
>>>>> +			"stp	 x4,   x5, [%3, #16 *  2]\n"
>>>>> +			"stp	 x6,   x7, [%3, #16 *  3]\n"
>>>>> +			"stp	 x8,   x9, [%3, #16 *  4]\n"
>>>>> +			"stp	x10,  x11, [%3, #16 *  5]\n"
>>>>> +			"stp	x12,  x13, [%3, #16 *  6]\n"
>>>>> +			"stp	x14,  x15, [%3, #16 *  7]\n"
>>>>> +			"stp	x16,  x17, [%3, #16 *  8]\n"
>>>>> +			"stp	x18,  x19, [%3, #16 *  9]\n"
>>>>> +			"stp	x20,  x21, [%3, #16 * 10]\n"
>>>>> +			"stp	x22,  x23, [%3, #16 * 11]\n"
>>>>> +			"stp	x24,  x25, [%3, #16 * 12]\n"
>>>>> +			"stp	x26,  x27, [%3, #16 * 13]\n"
>>>>> +			"stp	x28,  x29, [%3, #16 * 14]\n"
>>>>> +			"str	x30,	   [%3, #16 * 15]\n"
>>>>> +			"mov	%0, sp\n"
>>>>> +			"adr	%1, 1f\n"
>>>>> +			"mrs	%2, spsr_el1\n"
>>>>> +		"1:"
>>>>> +			: "=r" (newregs->sp),
>>>>> +			  "=r" (newregs->pc),
>>>>> +			  "=r" (newregs->pstate)
>>>>> +			: "r"  (&newregs->regs)
>>>>> +			: "memory"
>>>>> +		);
>>>>
>>>> I wonder how useful this thing is, given that it starts by corrupting
>>>> whatever register is holding newregs->regs. Maybe this is not supposed
>>>> to be accurate anyway...
>>>
>>> I'm not quite sure about what part of my code you're mentioning here, but
>>> crash_setup_regs() is solely called by crash_kexec(), and panic() is
>>> the only caller of crash_kexec() with NULL argument which, in turn, is
>>> used as 'oldregs' in crash_setup_regs().
>>
>> You have this assembly sequence:
>>
>> stp	 x0,   x1, [%3, #16 *  0]
>> [...]
>>
>> where %3 itself is one of the x[0..30] registers.
> 
> Not only %3, but also
> 
>> So you are saving
>> things that have already been corrupted by the saving procedure. Not
>> sure how useful that is, but as I said, maybe it is not supposed to be
>> completely accurate.
> 
> x0, x1 ... are the current values in panic(), and not the exact cpu contexts
> at the place we are really interested in.
> We have no way here in panic() to know them, but sp and pc would still be useful
> for back-tracing in later investigation of dump file.
> 
> Please note that the same problem exists on arm (and x86) implementation.

As I said: if people don't expect to have a precise dump of the register
file, then fine.

>>> Given this fact, I think that the values saved in newregs as indicated above
>>> will be the best estimate of current cpu contexts.
>>>
>>> The other caller of crash_kexec() is die() in traps.c, but here we call
>>> it with explicit cpu contexts at exception.
>>>
>>>>
>>>>> +	}
>>>>>    }
>>>>>
>>>>>    #endif /* !defined(__ASSEMBLY__) */
>>>>> diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
>>>>> index da28a26..d2d7e90 100644
>>>>> --- a/arch/arm64/kernel/machine_kexec.c
>>>>> +++ b/arch/arm64/kernel/machine_kexec.c
>>>>> @@ -9,6 +9,7 @@
>>>>>     * published by the Free Software Foundation.
>>>>>     */
>>>>>
>>>>> +#include <linux/kernel.h>
>>>>>    #include <linux/kexec.h>
>>>>>    #include <linux/of_fdt.h>
>>>>>    #include <linux/slab.h>
>>>>> @@ -23,6 +24,7 @@
>>>>>    extern const unsigned char arm64_relocate_new_kernel[];
>>>>>    extern const unsigned long arm64_relocate_new_kernel_size;
>>>>>
>>>>> +bool in_crash_kexec;
>>>>>    static unsigned long kimage_start;
>>>>>
>>>>>    /**
>>>>> @@ -203,13 +205,38 @@ void machine_kexec(struct kimage *kimage)
>>>>>    	 */
>>>>>
>>>>>    	cpu_soft_restart(virt_to_phys(cpu_reset),
>>>>> -		is_hyp_mode_available(),
>>>>> +		in_crash_kexec ? 0 : is_hyp_mode_available(),
>>>>>    		reboot_code_buffer_phys, kimage->head, kimage_start);
>>>>>
>>>>>    	BUG(); /* Should never get here. */
>>>>>    }
>>>>>
>>>>> +/**
>>>>> + * machine_crash_shutdown - shutdown non-boot cpus and save registers
>>>>> + */
>>>>>    void machine_crash_shutdown(struct pt_regs *regs)
>>>>>    {
>>>>> -	/* Empty routine needed to avoid build errors. */
>>>>> +	struct pt_regs dummy_regs;
>>>>> +	int cpu;
>>>>> +
>>>>> +	local_irq_disable();
>>>>> +
>>>>> +	in_crash_kexec = true;
>>>>> +
>>>>> +	/*
>>>>> +	 * clear and initialize the per-cpu info. This is necessary
>>>>> +	 * because, otherwise, slots for offline cpus would never be
>>>>> +	 * filled up. See smp_send_stop().
>>>>> +	 */
>>>>> +	memset(&dummy_regs, 0, sizeof(dummy_regs));
>>>>> +	for_each_possible_cpu(cpu)
>>>>> +		crash_save_cpu(&dummy_regs, cpu);
>>>>> +
>>>>> +	/* shutdown non-boot cpus */
>>>>> +	smp_send_stop();
>>>>> +
>>>>> +	/* for boot cpu */
>>>>> +	crash_save_cpu(regs, smp_processor_id());
>>>>> +
>>>>> +	pr_info("Starting crashdump kernel...\n");
>>>>>    }
>>>>> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
>>>>> index b1adc51..15aabef 100644
>>>>> --- a/arch/arm64/kernel/smp.c
>>>>> +++ b/arch/arm64/kernel/smp.c
>>>>> @@ -37,6 +37,7 @@
>>>>>    #include <linux/completion.h>
>>>>>    #include <linux/of.h>
>>>>>    #include <linux/irq_work.h>
>>>>> +#include <linux/kexec.h>
>>>>>
>>>>>    #include <asm/alternative.h>
>>>>>    #include <asm/atomic.h>
>>>>> @@ -54,6 +55,8 @@
>>>>>    #include <asm/ptrace.h>
>>>>>    #include <asm/virt.h>
>>>>>
>>>>> +#include "cpu-reset.h"
>>>>> +
>>>>>    #define CREATE_TRACE_POINTS
>>>>>    #include <trace/events/ipi.h>
>>>>>
>>>>> @@ -683,8 +686,12 @@ static DEFINE_RAW_SPINLOCK(stop_lock);
>>>>>    /*
>>>>>     * ipi_cpu_stop - handle IPI from smp_send_stop()
>>>>>     */
>>>>> -static void ipi_cpu_stop(unsigned int cpu)
>>>>> +static void ipi_cpu_stop(unsigned int cpu, struct pt_regs *regs)
>>>>>    {
>>>>> +#ifdef CONFIG_KEXEC
>>>>> +	/* printing messages may slow down the shutdown. */
>>>>> +	if (!in_crash_kexec)
>>>>> +#endif
>>>>>    	if (system_state == SYSTEM_BOOTING ||
>>>>>    	    system_state == SYSTEM_RUNNING) {
>>>>>    		raw_spin_lock(&stop_lock);
>>>>
>>>> Irrespective of how useful this change is, how about having a predicate
>>>> instead? Something like:
>>>>
>>>> static inline bool is_in_crash_kexec(void)
>>>> {
>>>> #ifdef CONFIG_KEXEC
>>>> 	return in_crash_kexec;
>>>> #else
>>>> 	return false;
>>>> #endif
>>>> }
>>>
>>> OK, I will take your idea.
>>>
>>>> located in machine_kexec.c (making the in_crash_kernel static), and then
>>>
>>> but cannot make in_crash_kernel static because it is also used in both smp.c
>>> and machine_kexec.c.
>>
>> smp.c only reads from in_crash_kernel (at least from what I can see in
>> this patch), so it should be able to use the accessor.
> 
> Only if we define the accessor as a real function, not an inline function in a header :)

I'll leave the implementation details in your capable hands. :-)

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 12/16] arm64: kdump: implement machine_crash_shutdown()
@ 2015-12-10 13:43             ` Marc Zyngier
  0 siblings, 0 replies; 178+ messages in thread
From: Marc Zyngier @ 2015-12-10 13:43 UTC (permalink / raw)
  To: AKASHI Takahiro, Geoff Levand, Catalin Marinas, Will Deacon
  Cc: Mark Rutland, kexec, linux-arm-kernel, christoffer.dall

On 10/12/15 12:55, AKASHI Takahiro wrote:
> On 12/10/2015 08:44 PM, Marc Zyngier wrote:
>> On 10/12/15 11:34, AKASHI Takahiro wrote:
>>> Marc,
>>>
>>> I was back from my vacation.
>>>
>>> On 11/27/2015 11:39 PM, Marc Zyngier wrote:
>>>> On 24/11/15 22:25, Geoff Levand wrote:
>>>>> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>>>
>>>>> kdump calls machine_crash_shutdown() to shut down non-boot cpus and
>>>>> save registers' status in per-cpu ELF notes before starting the crash
>>>>> dump kernel. See kernel_kexec().
>>>>>
>>>>> ipi_cpu_stop() is a bit modified and used to support this behavior.
>>>>>
>>>>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>>> ---
>>>>>    arch/arm64/include/asm/kexec.h    | 34 +++++++++++++++++++++++++++++++++-
>>>>>    arch/arm64/kernel/machine_kexec.c | 31 +++++++++++++++++++++++++++++--
>>>>>    arch/arm64/kernel/smp.c           | 16 ++++++++++++++--
>>>>>    3 files changed, 76 insertions(+), 5 deletions(-)
>>>>>
>>>>> diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
>>>>> index 46d63cd..555a955 100644
>>>>> --- a/arch/arm64/include/asm/kexec.h
>>>>> +++ b/arch/arm64/include/asm/kexec.h
>>>>> @@ -30,6 +30,8 @@
>>>>>
>>>>>    #if !defined(__ASSEMBLY__)
>>>>>
>>>>> +extern bool in_crash_kexec;
>>>>> +
>>>>>    /**
>>>>>     * crash_setup_regs() - save registers for the panic kernel
>>>>>     *
>>>>> @@ -40,7 +42,37 @@
>>>>>    static inline void crash_setup_regs(struct pt_regs *newregs,
>>>>>    				    struct pt_regs *oldregs)
>>>>>    {
>>>>> -	/* Empty routine needed to avoid build errors. */
>>>>> +	if (oldregs) {
>>>>> +		memcpy(newregs, oldregs, sizeof(*newregs));
>>>>> +	} else {
>>>>> +		__asm__ __volatile__ (
>>>>> +			"stp	 x0,   x1, [%3, #16 *  0]\n"
>>>>> +			"stp	 x2,   x3, [%3, #16 *  1]\n"
>>>>> +			"stp	 x4,   x5, [%3, #16 *  2]\n"
>>>>> +			"stp	 x6,   x7, [%3, #16 *  3]\n"
>>>>> +			"stp	 x8,   x9, [%3, #16 *  4]\n"
>>>>> +			"stp	x10,  x11, [%3, #16 *  5]\n"
>>>>> +			"stp	x12,  x13, [%3, #16 *  6]\n"
>>>>> +			"stp	x14,  x15, [%3, #16 *  7]\n"
>>>>> +			"stp	x16,  x17, [%3, #16 *  8]\n"
>>>>> +			"stp	x18,  x19, [%3, #16 *  9]\n"
>>>>> +			"stp	x20,  x21, [%3, #16 * 10]\n"
>>>>> +			"stp	x22,  x23, [%3, #16 * 11]\n"
>>>>> +			"stp	x24,  x25, [%3, #16 * 12]\n"
>>>>> +			"stp	x26,  x27, [%3, #16 * 13]\n"
>>>>> +			"stp	x28,  x29, [%3, #16 * 14]\n"
>>>>> +			"str	x30,	   [%3, #16 * 15]\n"
>>>>> +			"mov	%0, sp\n"
>>>>> +			"adr	%1, 1f\n"
>>>>> +			"mrs	%2, spsr_el1\n"
>>>>> +		"1:"
>>>>> +			: "=r" (newregs->sp),
>>>>> +			  "=r" (newregs->pc),
>>>>> +			  "=r" (newregs->pstate)
>>>>> +			: "r"  (&newregs->regs)
>>>>> +			: "memory"
>>>>> +		);
>>>>
>>>> I wonder how useful this thing is, given that it starts by corrupting
>>>> whatever register is holding newregs->regs. Maybe this is not supposed
>>>> to be accurate anyway...
>>>
>>> I'm not quite sure about what part of my code you're mentioning here, but
>>> crash_setup_regs() is solely called by crash_kexec(), and panic() is
>>> the only caller of crash_kexec() with NULL argument which, in turn, is
>>> used as 'oldregs' in crash_setup_regs().
>>
>> You have this assembly sequence:
>>
>> stp	 x0,   x1, [%3, #16 *  0]
>> [...]
>>
>> where %3 itself is one of the x[0..30] registers.
> 
> Not only %3, but also
> 
>> So you are saving
>> things that have already been corrupted by the saving procedure. Not
>> sure how useful that is, but as I said, maybe it is not supposed to be
>> completely accurate.
> 
> x0, x1 ... are the current values in panic(), and not the exact cpu contexts
> at the place we are really interested in.
> We have no way here in panic() to know them, but sp and pc would still be useful
> for back-tracing in later investigation of dump file.
> 
> Please note that the same problem exists on arm (and x86) implementation.

As I said: if people don't expect to have a precise dump of the register
file, then fine.

>>> Given this fact, I think that the values saved in newregs as indicated above
>>> will be the best estimate of current cpu contexts.
>>>
>>> The other caller of crash_kexec() is die() in traps.c, but here we call
>>> it with explicit cpu contexts at exception.
>>>
>>>>
>>>>> +	}
>>>>>    }
>>>>>
>>>>>    #endif /* !defined(__ASSEMBLY__) */
>>>>> diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
>>>>> index da28a26..d2d7e90 100644
>>>>> --- a/arch/arm64/kernel/machine_kexec.c
>>>>> +++ b/arch/arm64/kernel/machine_kexec.c
>>>>> @@ -9,6 +9,7 @@
>>>>>     * published by the Free Software Foundation.
>>>>>     */
>>>>>
>>>>> +#include <linux/kernel.h>
>>>>>    #include <linux/kexec.h>
>>>>>    #include <linux/of_fdt.h>
>>>>>    #include <linux/slab.h>
>>>>> @@ -23,6 +24,7 @@
>>>>>    extern const unsigned char arm64_relocate_new_kernel[];
>>>>>    extern const unsigned long arm64_relocate_new_kernel_size;
>>>>>
>>>>> +bool in_crash_kexec;
>>>>>    static unsigned long kimage_start;
>>>>>
>>>>>    /**
>>>>> @@ -203,13 +205,38 @@ void machine_kexec(struct kimage *kimage)
>>>>>    	 */
>>>>>
>>>>>    	cpu_soft_restart(virt_to_phys(cpu_reset),
>>>>> -		is_hyp_mode_available(),
>>>>> +		in_crash_kexec ? 0 : is_hyp_mode_available(),
>>>>>    		reboot_code_buffer_phys, kimage->head, kimage_start);
>>>>>
>>>>>    	BUG(); /* Should never get here. */
>>>>>    }
>>>>>
>>>>> +/**
>>>>> + * machine_crash_shutdown - shutdown non-boot cpus and save registers
>>>>> + */
>>>>>    void machine_crash_shutdown(struct pt_regs *regs)
>>>>>    {
>>>>> -	/* Empty routine needed to avoid build errors. */
>>>>> +	struct pt_regs dummy_regs;
>>>>> +	int cpu;
>>>>> +
>>>>> +	local_irq_disable();
>>>>> +
>>>>> +	in_crash_kexec = true;
>>>>> +
>>>>> +	/*
>>>>> +	 * clear and initialize the per-cpu info. This is necessary
>>>>> +	 * because, otherwise, slots for offline cpus would never be
>>>>> +	 * filled up. See smp_send_stop().
>>>>> +	 */
>>>>> +	memset(&dummy_regs, 0, sizeof(dummy_regs));
>>>>> +	for_each_possible_cpu(cpu)
>>>>> +		crash_save_cpu(&dummy_regs, cpu);
>>>>> +
>>>>> +	/* shutdown non-boot cpus */
>>>>> +	smp_send_stop();
>>>>> +
>>>>> +	/* for boot cpu */
>>>>> +	crash_save_cpu(regs, smp_processor_id());
>>>>> +
>>>>> +	pr_info("Starting crashdump kernel...\n");
>>>>>    }
>>>>> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
>>>>> index b1adc51..15aabef 100644
>>>>> --- a/arch/arm64/kernel/smp.c
>>>>> +++ b/arch/arm64/kernel/smp.c
>>>>> @@ -37,6 +37,7 @@
>>>>>    #include <linux/completion.h>
>>>>>    #include <linux/of.h>
>>>>>    #include <linux/irq_work.h>
>>>>> +#include <linux/kexec.h>
>>>>>
>>>>>    #include <asm/alternative.h>
>>>>>    #include <asm/atomic.h>
>>>>> @@ -54,6 +55,8 @@
>>>>>    #include <asm/ptrace.h>
>>>>>    #include <asm/virt.h>
>>>>>
>>>>> +#include "cpu-reset.h"
>>>>> +
>>>>>    #define CREATE_TRACE_POINTS
>>>>>    #include <trace/events/ipi.h>
>>>>>
>>>>> @@ -683,8 +686,12 @@ static DEFINE_RAW_SPINLOCK(stop_lock);
>>>>>    /*
>>>>>     * ipi_cpu_stop - handle IPI from smp_send_stop()
>>>>>     */
>>>>> -static void ipi_cpu_stop(unsigned int cpu)
>>>>> +static void ipi_cpu_stop(unsigned int cpu, struct pt_regs *regs)
>>>>>    {
>>>>> +#ifdef CONFIG_KEXEC
>>>>> +	/* printing messages may slow down the shutdown. */
>>>>> +	if (!in_crash_kexec)
>>>>> +#endif
>>>>>    	if (system_state == SYSTEM_BOOTING ||
>>>>>    	    system_state == SYSTEM_RUNNING) {
>>>>>    		raw_spin_lock(&stop_lock);
>>>>
>>>> Irrespective of how useful this change is, how about having a predicate
>>>> instead? Something like:
>>>>
>>>> static inline bool is_in_crash_kexec(void)
>>>> {
>>>> #ifdef CONFIG_KEXEC
>>>> 	return in_crash_kexec;
>>>> #else
>>>> 	return false;
>>>> #endif
>>>> }
>>>
>>> OK, I will take your idea.
>>>
>>>> located in machine_kexec.c (making the in_crash_kernel static), and then
>>>
>>> but cannot make in_crash_kernel static because it is also used in both smp.c
>>> and machine_kexec.c.
>>
>> smp.c only reads from in_crash_kernel (at least from what I can see in
>> this patch), so it should be able to use the accessor.
> 
> Only if we define the accessor as a real function, not an inline function in a header :)

I'll leave the implementation details in your capable hands. :-)

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 04/16] arm64: kvm: allows kvm cpu hotplug
  2015-12-03 13:58       ` Marc Zyngier
@ 2015-12-10 18:31         ` Geoff Levand
  -1 siblings, 0 replies; 178+ messages in thread
From: Geoff Levand @ 2015-12-10 18:31 UTC (permalink / raw)
  To: linux-arm-kernel

Hi All,

On Thu, 2015-12-03 at 13:58 +0000, Marc Zyngier wrote:
> Indeed, this is completely broken (I just reproduced the issue on a
> model).

There are users out there that want kexec/kdump support.  I think we
should remove this patch from the series, replace it with a temporary
Kconfig conditional that allows only one of KVM or kexec to be enabled,
and merge kexec and kdump support in for 4.5.  This will satisfy users
who need kexec but not KVM, like kexec based bootloaders.  Once this
KVM hot plug patch is fixed we then merge it, either for 4.5 or 4.6,
depending on the timing.

I'll prepare and post a v13 series that does the above.

-Geoff 

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 04/16] arm64: kvm: allows kvm cpu hotplug
@ 2015-12-10 18:31         ` Geoff Levand
  0 siblings, 0 replies; 178+ messages in thread
From: Geoff Levand @ 2015-12-10 18:31 UTC (permalink / raw)
  To: Marc Zyngier, Catalin Marinas, Will Deacon
  Cc: Mark Rutland, vikrams, kexec, AKASHI Takahiro, shankerd,
	Christoffer Dall, linux-arm-kernel

Hi All,

On Thu, 2015-12-03 at 13:58 +0000, Marc Zyngier wrote:
> Indeed, this is completely broken (I just reproduced the issue on a
> model).

There are users out there that want kexec/kdump support.  I think we
should remove this patch from the series, replace it with a temporary
Kconfig conditional that allows only one of KVM or kexec to be enabled,
and merge kexec and kdump support in for 4.5.  This will satisfy users
who need kexec but not KVM, like kexec based bootloaders.  Once this
KVM hot plug patch is fixed we then merge it, either for 4.5 or 4.6,
depending on the timing.

I'll prepare and post a v13 series that does the above.

-Geoff 

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 04/16] arm64: kvm: allows kvm cpu hotplug
  2015-12-03 13:58       ` Marc Zyngier
@ 2015-12-10 18:44         ` Shi, Yang
  -1 siblings, 0 replies; 178+ messages in thread
From: Shi, Yang @ 2015-12-10 18:44 UTC (permalink / raw)
  To: linux-arm-kernel

On 12/3/2015 5:58 AM, Marc Zyngier wrote:
> On 02/12/15 22:40, Ashwin Chaugule wrote:
>> Hello,
>>
>> On 24 November 2015 at 17:25, Geoff Levand <geoff@infradead.org> wrote:
>>> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>
>>> The current kvm implementation on arm64 does cpu-specific initialization
>>> at system boot, and has no way to gracefully shutdown a core in terms of
>>> kvm. This prevents, especially, kexec from rebooting the system on a boot
>>> core in EL2.
>>>
>>> This patch adds a cpu tear-down function and also puts an existing cpu-init
>>> code into a separate function, kvm_arch_hardware_disable() and
>>> kvm_arch_hardware_enable() respectively.
>>> We don't need arm64-specific cpu hotplug hook any more.
>>>
>>> Since this patch modifies common part of code between arm and arm64, one
>>> stub definition, __cpu_reset_hyp_mode(), is added on arm side to avoid
>>> compiling errors.
>>>
>>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>> ---
>>>   arch/arm/include/asm/kvm_host.h   | 10 ++++-
>>>   arch/arm/include/asm/kvm_mmu.h    |  1 +
>>>   arch/arm/kvm/arm.c                | 79 ++++++++++++++++++---------------------
>>>   arch/arm/kvm/mmu.c                |  5 +++
>>>   arch/arm64/include/asm/kvm_host.h | 16 +++++++-
>>>   arch/arm64/include/asm/kvm_mmu.h  |  1 +
>>>   arch/arm64/include/asm/virt.h     |  9 +++++
>>>   arch/arm64/kvm/hyp-init.S         | 33 ++++++++++++++++
>>>   arch/arm64/kvm/hyp.S              | 32 ++++++++++++++--
>>>   9 files changed, 138 insertions(+), 48 deletions(-)
>>
>> [..]
>>
>>>
>>>
>>>   static struct notifier_block hyp_init_cpu_pm_nb = {
>>> @@ -1108,11 +1119,6 @@ static int init_hyp_mode(void)
>>>          }
>>>
>>>          /*
>>> -        * Execute the init code on each CPU.
>>> -        */
>>> -       on_each_cpu(cpu_init_hyp_mode, NULL, 1);
>>> -
>>> -       /*
>>>           * Init HYP view of VGIC
>>>           */
>>>          err = kvm_vgic_hyp_init();
>>
>> With this flow, the cpu_init_hyp_mode() is called only at VM guest
>> creation, but vgic_hyp_init() is called at bootup. On a system with
>> GICv3, it looks like we end up with bogus values from the ICH_VTR_EL2
>> (to get the number of LRs), because we're not reading it from EL2
>> anymore.
>
> Indeed, this is completely broken (I just reproduced the issue on a
> model). I wish this kind of details had been checked earlier, but thanks
> for pointing it out.

Sorry for chiming in late. I did run into something similar when I 
tested some earlier version patches with Akashi San.

The guest bootup just hangs with 4.1 kernel on my LS2085 board which has 
GICv3. Not sure if this is the same issue. But, my bisect did point to 
this patch.

I used to think it is my kernel's problem (4.1 sounds a little bit old) 
since Akashi San can't reproduce this on his Hikey board.

Thanks,
Yang

>
>> Whats the best way to fix this?
>> - Call kvm_arch_hardware_enable() before vgic_hyp_init() and disable later?
>> - Fold the VGIC init stuff back into hardware_enable()?
>
> None of that works - kvm_arch_hardware_enable() is called once per CPU,
> while vgic_hyp_init() can only be called once. Also,
> kvm_arch_hardware_enable() is called from interrupt context, and I
> wouldn't feel comfortable starting probing DT and allocating stuff from
> there.
>
>> - Read the VGIC number of LRs from the hyp stub?
>
> That's may UNDEF if called in the wrong context. Also, this defeats the
> point of stubs, which is just to install the vectors for the hypervisor.
>
>> - ..
>
> Yeah, quite.
>
> Geoff, Takahiro?
>
> 	M.
>

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 04/16] arm64: kvm: allows kvm cpu hotplug
@ 2015-12-10 18:44         ` Shi, Yang
  0 siblings, 0 replies; 178+ messages in thread
From: Shi, Yang @ 2015-12-10 18:44 UTC (permalink / raw)
  To: Marc Zyngier, Ashwin Chaugule, Geoff Levand
  Cc: Mark Rutland, vikrams, Catalin Marinas, Will Deacon,
	Christoffer Dall, AKASHI Takahiro, linux-arm-kernel, kexec,
	shankerd

On 12/3/2015 5:58 AM, Marc Zyngier wrote:
> On 02/12/15 22:40, Ashwin Chaugule wrote:
>> Hello,
>>
>> On 24 November 2015 at 17:25, Geoff Levand <geoff@infradead.org> wrote:
>>> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>
>>> The current kvm implementation on arm64 does cpu-specific initialization
>>> at system boot, and has no way to gracefully shutdown a core in terms of
>>> kvm. This prevents, especially, kexec from rebooting the system on a boot
>>> core in EL2.
>>>
>>> This patch adds a cpu tear-down function and also puts an existing cpu-init
>>> code into a separate function, kvm_arch_hardware_disable() and
>>> kvm_arch_hardware_enable() respectively.
>>> We don't need arm64-specific cpu hotplug hook any more.
>>>
>>> Since this patch modifies common part of code between arm and arm64, one
>>> stub definition, __cpu_reset_hyp_mode(), is added on arm side to avoid
>>> compiling errors.
>>>
>>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>> ---
>>>   arch/arm/include/asm/kvm_host.h   | 10 ++++-
>>>   arch/arm/include/asm/kvm_mmu.h    |  1 +
>>>   arch/arm/kvm/arm.c                | 79 ++++++++++++++++++---------------------
>>>   arch/arm/kvm/mmu.c                |  5 +++
>>>   arch/arm64/include/asm/kvm_host.h | 16 +++++++-
>>>   arch/arm64/include/asm/kvm_mmu.h  |  1 +
>>>   arch/arm64/include/asm/virt.h     |  9 +++++
>>>   arch/arm64/kvm/hyp-init.S         | 33 ++++++++++++++++
>>>   arch/arm64/kvm/hyp.S              | 32 ++++++++++++++--
>>>   9 files changed, 138 insertions(+), 48 deletions(-)
>>
>> [..]
>>
>>>
>>>
>>>   static struct notifier_block hyp_init_cpu_pm_nb = {
>>> @@ -1108,11 +1119,6 @@ static int init_hyp_mode(void)
>>>          }
>>>
>>>          /*
>>> -        * Execute the init code on each CPU.
>>> -        */
>>> -       on_each_cpu(cpu_init_hyp_mode, NULL, 1);
>>> -
>>> -       /*
>>>           * Init HYP view of VGIC
>>>           */
>>>          err = kvm_vgic_hyp_init();
>>
>> With this flow, the cpu_init_hyp_mode() is called only at VM guest
>> creation, but vgic_hyp_init() is called at bootup. On a system with
>> GICv3, it looks like we end up with bogus values from the ICH_VTR_EL2
>> (to get the number of LRs), because we're not reading it from EL2
>> anymore.
>
> Indeed, this is completely broken (I just reproduced the issue on a
> model). I wish this kind of details had been checked earlier, but thanks
> for pointing it out.

Sorry for chiming in late. I did run into something similar when I 
tested some earlier version patches with Akashi San.

The guest bootup just hangs with 4.1 kernel on my LS2085 board which has 
GICv3. Not sure if this is the same issue. But, my bisect did point to 
this patch.

I used to think it is my kernel's problem (4.1 sounds a little bit old) 
since Akashi San can't reproduce this on his Hikey board.

Thanks,
Yang

>
>> Whats the best way to fix this?
>> - Call kvm_arch_hardware_enable() before vgic_hyp_init() and disable later?
>> - Fold the VGIC init stuff back into hardware_enable()?
>
> None of that works - kvm_arch_hardware_enable() is called once per CPU,
> while vgic_hyp_init() can only be called once. Also,
> kvm_arch_hardware_enable() is called from interrupt context, and I
> wouldn't feel comfortable starting probing DT and allocating stuff from
> there.
>
>> - Read the VGIC number of LRs from the hyp stub?
>
> That's may UNDEF if called in the wrong context. Also, this defeats the
> point of stubs, which is just to install the vectors for the hypervisor.
>
>> - ..
>
> Yeah, quite.
>
> Geoff, Takahiro?
>
> 	M.
>


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 04/16] arm64: kvm: allows kvm cpu hotplug
  2015-12-03 13:58       ` Marc Zyngier
@ 2015-12-11  8:06         ` AKASHI Takahiro
  -1 siblings, 0 replies; 178+ messages in thread
From: AKASHI Takahiro @ 2015-12-11  8:06 UTC (permalink / raw)
  To: linux-arm-kernel

Ashwin, Marc,

On 12/03/2015 10:58 PM, Marc Zyngier wrote:
> On 02/12/15 22:40, Ashwin Chaugule wrote:
>> Hello,
>>
>> On 24 November 2015 at 17:25, Geoff Levand <geoff@infradead.org> wrote:
>>> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>
>>> The current kvm implementation on arm64 does cpu-specific initialization
>>> at system boot, and has no way to gracefully shutdown a core in terms of
>>> kvm. This prevents, especially, kexec from rebooting the system on a boot
>>> core in EL2.
>>>
>>> This patch adds a cpu tear-down function and also puts an existing cpu-init
>>> code into a separate function, kvm_arch_hardware_disable() and
>>> kvm_arch_hardware_enable() respectively.
>>> We don't need arm64-specific cpu hotplug hook any more.
>>>
>>> Since this patch modifies common part of code between arm and arm64, one
>>> stub definition, __cpu_reset_hyp_mode(), is added on arm side to avoid
>>> compiling errors.
>>>
>>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>> ---
>>>   arch/arm/include/asm/kvm_host.h   | 10 ++++-
>>>   arch/arm/include/asm/kvm_mmu.h    |  1 +
>>>   arch/arm/kvm/arm.c                | 79 ++++++++++++++++++---------------------
>>>   arch/arm/kvm/mmu.c                |  5 +++
>>>   arch/arm64/include/asm/kvm_host.h | 16 +++++++-
>>>   arch/arm64/include/asm/kvm_mmu.h  |  1 +
>>>   arch/arm64/include/asm/virt.h     |  9 +++++
>>>   arch/arm64/kvm/hyp-init.S         | 33 ++++++++++++++++
>>>   arch/arm64/kvm/hyp.S              | 32 ++++++++++++++--
>>>   9 files changed, 138 insertions(+), 48 deletions(-)
>>
>> [..]
>>
>>>
>>>
>>>   static struct notifier_block hyp_init_cpu_pm_nb = {
>>> @@ -1108,11 +1119,6 @@ static int init_hyp_mode(void)
>>>          }
>>>
>>>          /*
>>> -        * Execute the init code on each CPU.
>>> -        */
>>> -       on_each_cpu(cpu_init_hyp_mode, NULL, 1);
>>> -
>>> -       /*
>>>           * Init HYP view of VGIC
>>>           */
>>>          err = kvm_vgic_hyp_init();
>>
>> With this flow, the cpu_init_hyp_mode() is called only at VM guest
>> creation, but vgic_hyp_init() is called at bootup. On a system with
>> GICv3, it looks like we end up with bogus values from the ICH_VTR_EL2
>> (to get the number of LRs), because we're not reading it from EL2
>> anymore.

Thank you for pointing this out.
Recently I tested my kdump code on hikey, and as hikey(hi6220) has gic-400,
I didn't notice this problem.

> Indeed, this is completely broken (I just reproduced the issue on a
> model). I wish this kind of details had been checked earlier, but thanks
> for pointing it out.
>
>> Whats the best way to fix this?
>> - Call kvm_arch_hardware_enable() before vgic_hyp_init() and disable later?
>> - Fold the VGIC init stuff back into hardware_enable()?
>
> None of that works - kvm_arch_hardware_enable() is called once per CPU,
> while vgic_hyp_init() can only be called once. Also,
> kvm_arch_hardware_enable() is called from interrupt context, and I
> wouldn't feel comfortable starting probing DT and allocating stuff from
> there.

Do you think so?
How about the fixup! patch attached below?
The point is that, like Ashwin's first idea, we initialize cpus temporarily
before kvm_vgic_hyp_init() and then soon reset cpus again. Thus,
kvm cpu hotplug will still continue to work as before.
Now that cpu_init_hyp_mode() is revived as exactly the same as Marc's
original code, the change will not be a big jump.

If kvm_hyp_call() in vgic_v3_probe()/kvm_vgic_hyp_init() is a *problem*,
I hope this should work. Actually I confirmed that, with this fixup! patch,
we could run a kvm guest and also successfully executed kexec on model w/gic-v3.

My only concern is the following kernel message I saw when kexec shut down
the kernel:
(Please note that I was running one kvm quest (pid=961) here.)

===
sh-4.3# ./kexec -d -e
kexec version: 15.11.16.11.06-g41e52e2
arch_process_options:112: command_line: (null)
arch_process_options:114: initrd: (null)
arch_process_options:115: dtb: (null)
arch_process_options:117: port: 0x0
kvm: exiting hardware virtualization
kvm [961]: Unsupported exception type: 6248304    <== this message
kexec_core: Starting new kernel
Disabling non-boot CPUs ...
CPU1: shutdown
CPU2: shutdown
CPU3: shutdown
CPU4: shutdown
CPU5: shutdown
CPU6: shutdown
CPU7: shutdown
Bye!
Booting Linux on physical CPU 0x0
...
===

I don't know whether we can ignore this kind of message or not.
Any thoughts?

Thanks,
-Takahiro AKASHI



>> - Read the VGIC number of LRs from the hyp stub?
>
> That's may UNDEF if called in the wrong context. Also, this defeats the
> point of stubs, which is just to install the vectors for the hypervisor.
>
>> - ..
>
> Yeah, quite.
>
> Geoff, Takahiro?
>
> 	M.
>
----8<----
 From 66ca3baedf45c78c81a76ea77ddd6ace49550ab6 Mon Sep 17 00:00:00 2001
From: AKASHI Takahiro <takahiro.akashi@linaro.org>
Date: Fri, 11 Dec 2015 13:43:35 +0900
Subject: [PATCH 7/7] fixup! arm64: kvm: allows kvm cpu hotplug

---
  arch/arm/kvm/arm.c |   37 +++++++++++++++++++++++++++----------
  1 file changed, 27 insertions(+), 10 deletions(-)

diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index 518c3c7..8fe59ba 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -950,7 +950,7 @@ long kvm_arch_vm_ioctl(struct file *filp,
  	}
  }

-int kvm_arch_hardware_enable(void)
+static void cpu_init_hyp_mode(void *dummy)
  {
  	phys_addr_t boot_pgd_ptr;
  	phys_addr_t pgd_ptr;
@@ -958,9 +958,6 @@ int kvm_arch_hardware_enable(void)
  	unsigned long stack_page;
  	unsigned long vector_ptr;

-	if (__hyp_get_vectors() != hyp_default_vectors)
-		return 0;
-
  	/* Switch from the HYP stub to our own HYP init vector */
  	__hyp_set_vectors(kvm_get_idmap_vector());

@@ -973,24 +970,35 @@ int kvm_arch_hardware_enable(void)
  	__cpu_init_hyp_mode(boot_pgd_ptr, pgd_ptr, hyp_stack_ptr, vector_ptr);

  	kvm_arm_init_debug();
-
-	return 0;
  }

-void kvm_arch_hardware_disable(void)
+static void cpu_reset_hyp_mode(void *dummy)
  {
  	phys_addr_t boot_pgd_ptr;
  	phys_addr_t phys_idmap_start;

-	if (__hyp_get_vectors() == hyp_default_vectors)
-		return;
-
  	boot_pgd_ptr = kvm_mmu_get_boot_httbr();
  	phys_idmap_start = kvm_get_idmap_start();

  	__cpu_reset_hyp_mode(boot_pgd_ptr, phys_idmap_start);
  }

+int kvm_arch_hardware_enable(void)
+{
+	if (__hyp_get_vectors() == hyp_default_vectors)
+		cpu_init_hyp_mode(NULL);
+
+	return 0;
+}
+
+void kvm_arch_hardware_disable(void)
+{
+	if (__hyp_get_vectors() == hyp_default_vectors)
+		return;
+
+	cpu_reset_hyp_mode(NULL);
+}
+
  #ifdef CONFIG_CPU_PM
  static int hyp_init_cpu_pm_notifier(struct notifier_block *self,
  				    unsigned long cmd,
@@ -1114,6 +1122,12 @@ static int init_hyp_mode(void)
  	}

  	/*
+	 * Execute the init code on each CPU.
+	 * Only needed to execute kvm_hyp_call() during *_hyp_init().
+	 */
+	on_each_cpu(cpu_init_hyp_mode, NULL, 1);
+
+	/*
  	 * Init HYP view of VGIC
  	 */
  	err = kvm_vgic_hyp_init();
@@ -1127,6 +1141,8 @@ static int init_hyp_mode(void)
  	if (err)
  		goto out_free_context;

+	on_each_cpu(cpu_reset_hyp_mode, NULL, 1);
+
  #ifndef CONFIG_HOTPLUG_CPU
  	free_boot_hyp_pgd();
  #endif
@@ -1137,6 +1153,7 @@ static int init_hyp_mode(void)

  	return 0;
  out_free_context:
+	on_each_cpu(cpu_reset_hyp_mode, NULL, 1);
  	free_percpu(kvm_host_cpu_state);
  out_free_mappings:
  	free_hyp_pgds();
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 04/16] arm64: kvm: allows kvm cpu hotplug
@ 2015-12-11  8:06         ` AKASHI Takahiro
  0 siblings, 0 replies; 178+ messages in thread
From: AKASHI Takahiro @ 2015-12-11  8:06 UTC (permalink / raw)
  To: Marc Zyngier, Ashwin Chaugule, Geoff Levand
  Cc: Mark Rutland, vikrams, Catalin Marinas, Will Deacon, shankerd,
	linux-arm-kernel, kexec, Christoffer Dall

Ashwin, Marc,

On 12/03/2015 10:58 PM, Marc Zyngier wrote:
> On 02/12/15 22:40, Ashwin Chaugule wrote:
>> Hello,
>>
>> On 24 November 2015 at 17:25, Geoff Levand <geoff@infradead.org> wrote:
>>> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>
>>> The current kvm implementation on arm64 does cpu-specific initialization
>>> at system boot, and has no way to gracefully shutdown a core in terms of
>>> kvm. This prevents, especially, kexec from rebooting the system on a boot
>>> core in EL2.
>>>
>>> This patch adds a cpu tear-down function and also puts an existing cpu-init
>>> code into a separate function, kvm_arch_hardware_disable() and
>>> kvm_arch_hardware_enable() respectively.
>>> We don't need arm64-specific cpu hotplug hook any more.
>>>
>>> Since this patch modifies common part of code between arm and arm64, one
>>> stub definition, __cpu_reset_hyp_mode(), is added on arm side to avoid
>>> compiling errors.
>>>
>>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>> ---
>>>   arch/arm/include/asm/kvm_host.h   | 10 ++++-
>>>   arch/arm/include/asm/kvm_mmu.h    |  1 +
>>>   arch/arm/kvm/arm.c                | 79 ++++++++++++++++++---------------------
>>>   arch/arm/kvm/mmu.c                |  5 +++
>>>   arch/arm64/include/asm/kvm_host.h | 16 +++++++-
>>>   arch/arm64/include/asm/kvm_mmu.h  |  1 +
>>>   arch/arm64/include/asm/virt.h     |  9 +++++
>>>   arch/arm64/kvm/hyp-init.S         | 33 ++++++++++++++++
>>>   arch/arm64/kvm/hyp.S              | 32 ++++++++++++++--
>>>   9 files changed, 138 insertions(+), 48 deletions(-)
>>
>> [..]
>>
>>>
>>>
>>>   static struct notifier_block hyp_init_cpu_pm_nb = {
>>> @@ -1108,11 +1119,6 @@ static int init_hyp_mode(void)
>>>          }
>>>
>>>          /*
>>> -        * Execute the init code on each CPU.
>>> -        */
>>> -       on_each_cpu(cpu_init_hyp_mode, NULL, 1);
>>> -
>>> -       /*
>>>           * Init HYP view of VGIC
>>>           */
>>>          err = kvm_vgic_hyp_init();
>>
>> With this flow, the cpu_init_hyp_mode() is called only at VM guest
>> creation, but vgic_hyp_init() is called at bootup. On a system with
>> GICv3, it looks like we end up with bogus values from the ICH_VTR_EL2
>> (to get the number of LRs), because we're not reading it from EL2
>> anymore.

Thank you for pointing this out.
Recently I tested my kdump code on hikey, and as hikey(hi6220) has gic-400,
I didn't notice this problem.

> Indeed, this is completely broken (I just reproduced the issue on a
> model). I wish this kind of details had been checked earlier, but thanks
> for pointing it out.
>
>> Whats the best way to fix this?
>> - Call kvm_arch_hardware_enable() before vgic_hyp_init() and disable later?
>> - Fold the VGIC init stuff back into hardware_enable()?
>
> None of that works - kvm_arch_hardware_enable() is called once per CPU,
> while vgic_hyp_init() can only be called once. Also,
> kvm_arch_hardware_enable() is called from interrupt context, and I
> wouldn't feel comfortable starting probing DT and allocating stuff from
> there.

Do you think so?
How about the fixup! patch attached below?
The point is that, like Ashwin's first idea, we initialize cpus temporarily
before kvm_vgic_hyp_init() and then soon reset cpus again. Thus,
kvm cpu hotplug will still continue to work as before.
Now that cpu_init_hyp_mode() is revived as exactly the same as Marc's
original code, the change will not be a big jump.

If kvm_hyp_call() in vgic_v3_probe()/kvm_vgic_hyp_init() is a *problem*,
I hope this should work. Actually I confirmed that, with this fixup! patch,
we could run a kvm guest and also successfully executed kexec on model w/gic-v3.

My only concern is the following kernel message I saw when kexec shut down
the kernel:
(Please note that I was running one kvm quest (pid=961) here.)

===
sh-4.3# ./kexec -d -e
kexec version: 15.11.16.11.06-g41e52e2
arch_process_options:112: command_line: (null)
arch_process_options:114: initrd: (null)
arch_process_options:115: dtb: (null)
arch_process_options:117: port: 0x0
kvm: exiting hardware virtualization
kvm [961]: Unsupported exception type: 6248304    <== this message
kexec_core: Starting new kernel
Disabling non-boot CPUs ...
CPU1: shutdown
CPU2: shutdown
CPU3: shutdown
CPU4: shutdown
CPU5: shutdown
CPU6: shutdown
CPU7: shutdown
Bye!
Booting Linux on physical CPU 0x0
...
===

I don't know whether we can ignore this kind of message or not.
Any thoughts?

Thanks,
-Takahiro AKASHI



>> - Read the VGIC number of LRs from the hyp stub?
>
> That's may UNDEF if called in the wrong context. Also, this defeats the
> point of stubs, which is just to install the vectors for the hypervisor.
>
>> - ..
>
> Yeah, quite.
>
> Geoff, Takahiro?
>
> 	M.
>
----8<----
 From 66ca3baedf45c78c81a76ea77ddd6ace49550ab6 Mon Sep 17 00:00:00 2001
From: AKASHI Takahiro <takahiro.akashi@linaro.org>
Date: Fri, 11 Dec 2015 13:43:35 +0900
Subject: [PATCH 7/7] fixup! arm64: kvm: allows kvm cpu hotplug

---
  arch/arm/kvm/arm.c |   37 +++++++++++++++++++++++++++----------
  1 file changed, 27 insertions(+), 10 deletions(-)

diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index 518c3c7..8fe59ba 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -950,7 +950,7 @@ long kvm_arch_vm_ioctl(struct file *filp,
  	}
  }

-int kvm_arch_hardware_enable(void)
+static void cpu_init_hyp_mode(void *dummy)
  {
  	phys_addr_t boot_pgd_ptr;
  	phys_addr_t pgd_ptr;
@@ -958,9 +958,6 @@ int kvm_arch_hardware_enable(void)
  	unsigned long stack_page;
  	unsigned long vector_ptr;

-	if (__hyp_get_vectors() != hyp_default_vectors)
-		return 0;
-
  	/* Switch from the HYP stub to our own HYP init vector */
  	__hyp_set_vectors(kvm_get_idmap_vector());

@@ -973,24 +970,35 @@ int kvm_arch_hardware_enable(void)
  	__cpu_init_hyp_mode(boot_pgd_ptr, pgd_ptr, hyp_stack_ptr, vector_ptr);

  	kvm_arm_init_debug();
-
-	return 0;
  }

-void kvm_arch_hardware_disable(void)
+static void cpu_reset_hyp_mode(void *dummy)
  {
  	phys_addr_t boot_pgd_ptr;
  	phys_addr_t phys_idmap_start;

-	if (__hyp_get_vectors() == hyp_default_vectors)
-		return;
-
  	boot_pgd_ptr = kvm_mmu_get_boot_httbr();
  	phys_idmap_start = kvm_get_idmap_start();

  	__cpu_reset_hyp_mode(boot_pgd_ptr, phys_idmap_start);
  }

+int kvm_arch_hardware_enable(void)
+{
+	if (__hyp_get_vectors() == hyp_default_vectors)
+		cpu_init_hyp_mode(NULL);
+
+	return 0;
+}
+
+void kvm_arch_hardware_disable(void)
+{
+	if (__hyp_get_vectors() == hyp_default_vectors)
+		return;
+
+	cpu_reset_hyp_mode(NULL);
+}
+
  #ifdef CONFIG_CPU_PM
  static int hyp_init_cpu_pm_notifier(struct notifier_block *self,
  				    unsigned long cmd,
@@ -1114,6 +1122,12 @@ static int init_hyp_mode(void)
  	}

  	/*
+	 * Execute the init code on each CPU.
+	 * Only needed to execute kvm_hyp_call() during *_hyp_init().
+	 */
+	on_each_cpu(cpu_init_hyp_mode, NULL, 1);
+
+	/*
  	 * Init HYP view of VGIC
  	 */
  	err = kvm_vgic_hyp_init();
@@ -1127,6 +1141,8 @@ static int init_hyp_mode(void)
  	if (err)
  		goto out_free_context;

+	on_each_cpu(cpu_reset_hyp_mode, NULL, 1);
+
  #ifndef CONFIG_HOTPLUG_CPU
  	free_boot_hyp_pgd();
  #endif
@@ -1137,6 +1153,7 @@ static int init_hyp_mode(void)

  	return 0;
  out_free_context:
+	on_each_cpu(cpu_reset_hyp_mode, NULL, 1);
  	free_percpu(kvm_host_cpu_state);
  out_free_mappings:
  	free_hyp_pgds();
-- 
1.7.9.5

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply related	[flat|nested] 178+ messages in thread

* [PATCH v12 04/16] arm64: kvm: allows kvm cpu hotplug
  2015-12-10 18:44         ` Shi, Yang
@ 2015-12-11  8:09           ` AKASHI Takahiro
  -1 siblings, 0 replies; 178+ messages in thread
From: AKASHI Takahiro @ 2015-12-11  8:09 UTC (permalink / raw)
  To: linux-arm-kernel

Yang,

On 12/11/2015 03:44 AM, Shi, Yang wrote:
> On 12/3/2015 5:58 AM, Marc Zyngier wrote:
>> On 02/12/15 22:40, Ashwin Chaugule wrote:
>>> Hello,
>>>
>>> On 24 November 2015 at 17:25, Geoff Levand <geoff@infradead.org> wrote:
>>>> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>>
>>>> The current kvm implementation on arm64 does cpu-specific initialization
>>>> at system boot, and has no way to gracefully shutdown a core in terms of
>>>> kvm. This prevents, especially, kexec from rebooting the system on a boot
>>>> core in EL2.
>>>>
>>>> This patch adds a cpu tear-down function and also puts an existing cpu-init
>>>> code into a separate function, kvm_arch_hardware_disable() and
>>>> kvm_arch_hardware_enable() respectively.
>>>> We don't need arm64-specific cpu hotplug hook any more.
>>>>
>>>> Since this patch modifies common part of code between arm and arm64, one
>>>> stub definition, __cpu_reset_hyp_mode(), is added on arm side to avoid
>>>> compiling errors.
>>>>
>>>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>> ---
>>>>   arch/arm/include/asm/kvm_host.h   | 10 ++++-
>>>>   arch/arm/include/asm/kvm_mmu.h    |  1 +
>>>>   arch/arm/kvm/arm.c                | 79 ++++++++++++++++++---------------------
>>>>   arch/arm/kvm/mmu.c                |  5 +++
>>>>   arch/arm64/include/asm/kvm_host.h | 16 +++++++-
>>>>   arch/arm64/include/asm/kvm_mmu.h  |  1 +
>>>>   arch/arm64/include/asm/virt.h     |  9 +++++
>>>>   arch/arm64/kvm/hyp-init.S         | 33 ++++++++++++++++
>>>>   arch/arm64/kvm/hyp.S              | 32 ++++++++++++++--
>>>>   9 files changed, 138 insertions(+), 48 deletions(-)
>>>
>>> [..]
>>>
>>>>
>>>>
>>>>   static struct notifier_block hyp_init_cpu_pm_nb = {
>>>> @@ -1108,11 +1119,6 @@ static int init_hyp_mode(void)
>>>>          }
>>>>
>>>>          /*
>>>> -        * Execute the init code on each CPU.
>>>> -        */
>>>> -       on_each_cpu(cpu_init_hyp_mode, NULL, 1);
>>>> -
>>>> -       /*
>>>>           * Init HYP view of VGIC
>>>>           */
>>>>          err = kvm_vgic_hyp_init();
>>>
>>> With this flow, the cpu_init_hyp_mode() is called only at VM guest
>>> creation, but vgic_hyp_init() is called at bootup. On a system with
>>> GICv3, it looks like we end up with bogus values from the ICH_VTR_EL2
>>> (to get the number of LRs), because we're not reading it from EL2
>>> anymore.
>>
>> Indeed, this is completely broken (I just reproduced the issue on a
>> model). I wish this kind of details had been checked earlier, but thanks
>> for pointing it out.
>
> Sorry for chiming in late. I did run into something similar when I tested some earlier version patches with Akashi San.
>
> The guest bootup just hangs with 4.1 kernel on my LS2085 board which has GICv3. Not sure if this is the same issue. But,
> my bisect did point to this patch.

Can you please try my fixup! patch to determine whether it is the same issue or not?

Thanks
-Takahiro AKASHI


> I used to think it is my kernel's problem (4.1 sounds a little bit old) since Akashi San can't reproduce this on his
> Hikey board.
>
> Thanks,
> Yang
>
>>
>>> Whats the best way to fix this?
>>> - Call kvm_arch_hardware_enable() before vgic_hyp_init() and disable later?
>>> - Fold the VGIC init stuff back into hardware_enable()?
>>
>> None of that works - kvm_arch_hardware_enable() is called once per CPU,
>> while vgic_hyp_init() can only be called once. Also,
>> kvm_arch_hardware_enable() is called from interrupt context, and I
>> wouldn't feel comfortable starting probing DT and allocating stuff from
>> there.
>>
>>> - Read the VGIC number of LRs from the hyp stub?
>>
>> That's may UNDEF if called in the wrong context. Also, this defeats the
>> point of stubs, which is just to install the vectors for the hypervisor.
>>
>>> - ..
>>
>> Yeah, quite.
>>
>> Geoff, Takahiro?
>>
>>     M.
>>
>

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 04/16] arm64: kvm: allows kvm cpu hotplug
@ 2015-12-11  8:09           ` AKASHI Takahiro
  0 siblings, 0 replies; 178+ messages in thread
From: AKASHI Takahiro @ 2015-12-11  8:09 UTC (permalink / raw)
  To: Shi, Yang, Marc Zyngier, Ashwin Chaugule, Geoff Levand
  Cc: Mark Rutland, vikrams, Catalin Marinas, Will Deacon,
	Christoffer Dall, linux-arm-kernel, kexec, shankerd

Yang,

On 12/11/2015 03:44 AM, Shi, Yang wrote:
> On 12/3/2015 5:58 AM, Marc Zyngier wrote:
>> On 02/12/15 22:40, Ashwin Chaugule wrote:
>>> Hello,
>>>
>>> On 24 November 2015 at 17:25, Geoff Levand <geoff@infradead.org> wrote:
>>>> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>>
>>>> The current kvm implementation on arm64 does cpu-specific initialization
>>>> at system boot, and has no way to gracefully shutdown a core in terms of
>>>> kvm. This prevents, especially, kexec from rebooting the system on a boot
>>>> core in EL2.
>>>>
>>>> This patch adds a cpu tear-down function and also puts an existing cpu-init
>>>> code into a separate function, kvm_arch_hardware_disable() and
>>>> kvm_arch_hardware_enable() respectively.
>>>> We don't need arm64-specific cpu hotplug hook any more.
>>>>
>>>> Since this patch modifies common part of code between arm and arm64, one
>>>> stub definition, __cpu_reset_hyp_mode(), is added on arm side to avoid
>>>> compiling errors.
>>>>
>>>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>> ---
>>>>   arch/arm/include/asm/kvm_host.h   | 10 ++++-
>>>>   arch/arm/include/asm/kvm_mmu.h    |  1 +
>>>>   arch/arm/kvm/arm.c                | 79 ++++++++++++++++++---------------------
>>>>   arch/arm/kvm/mmu.c                |  5 +++
>>>>   arch/arm64/include/asm/kvm_host.h | 16 +++++++-
>>>>   arch/arm64/include/asm/kvm_mmu.h  |  1 +
>>>>   arch/arm64/include/asm/virt.h     |  9 +++++
>>>>   arch/arm64/kvm/hyp-init.S         | 33 ++++++++++++++++
>>>>   arch/arm64/kvm/hyp.S              | 32 ++++++++++++++--
>>>>   9 files changed, 138 insertions(+), 48 deletions(-)
>>>
>>> [..]
>>>
>>>>
>>>>
>>>>   static struct notifier_block hyp_init_cpu_pm_nb = {
>>>> @@ -1108,11 +1119,6 @@ static int init_hyp_mode(void)
>>>>          }
>>>>
>>>>          /*
>>>> -        * Execute the init code on each CPU.
>>>> -        */
>>>> -       on_each_cpu(cpu_init_hyp_mode, NULL, 1);
>>>> -
>>>> -       /*
>>>>           * Init HYP view of VGIC
>>>>           */
>>>>          err = kvm_vgic_hyp_init();
>>>
>>> With this flow, the cpu_init_hyp_mode() is called only at VM guest
>>> creation, but vgic_hyp_init() is called at bootup. On a system with
>>> GICv3, it looks like we end up with bogus values from the ICH_VTR_EL2
>>> (to get the number of LRs), because we're not reading it from EL2
>>> anymore.
>>
>> Indeed, this is completely broken (I just reproduced the issue on a
>> model). I wish this kind of details had been checked earlier, but thanks
>> for pointing it out.
>
> Sorry for chiming in late. I did run into something similar when I tested some earlier version patches with Akashi San.
>
> The guest bootup just hangs with 4.1 kernel on my LS2085 board which has GICv3. Not sure if this is the same issue. But,
> my bisect did point to this patch.

Can you please try my fixup! patch to determine whether it is the same issue or not?

Thanks
-Takahiro AKASHI


> I used to think it is my kernel's problem (4.1 sounds a little bit old) since Akashi San can't reproduce this on his
> Hikey board.
>
> Thanks,
> Yang
>
>>
>>> Whats the best way to fix this?
>>> - Call kvm_arch_hardware_enable() before vgic_hyp_init() and disable later?
>>> - Fold the VGIC init stuff back into hardware_enable()?
>>
>> None of that works - kvm_arch_hardware_enable() is called once per CPU,
>> while vgic_hyp_init() can only be called once. Also,
>> kvm_arch_hardware_enable() is called from interrupt context, and I
>> wouldn't feel comfortable starting probing DT and allocating stuff from
>> there.
>>
>>> - Read the VGIC number of LRs from the hyp stub?
>>
>> That's may UNDEF if called in the wrong context. Also, this defeats the
>> point of stubs, which is just to install the vectors for the hypervisor.
>>
>>> - ..
>>
>> Yeah, quite.
>>
>> Geoff, Takahiro?
>>
>>     M.
>>
>

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 04/16] arm64: kvm: allows kvm cpu hotplug
  2015-12-11  8:06         ` AKASHI Takahiro
@ 2015-12-11 13:00           ` Shanker Donthineni
  -1 siblings, 0 replies; 178+ messages in thread
From: Shanker Donthineni @ 2015-12-11 13:00 UTC (permalink / raw)
  To: linux-arm-kernel



On 12/11/2015 02:06 AM, AKASHI Takahiro wrote:
> Ashwin, Marc,
>
> On 12/03/2015 10:58 PM, Marc Zyngier wrote:
>> On 02/12/15 22:40, Ashwin Chaugule wrote:
>>> Hello,
>>>
>>> On 24 November 2015 at 17:25, Geoff Levand <geoff@infradead.org> wrote:
>>>> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>>
>>>> The current kvm implementation on arm64 does cpu-specific
> initialization
>>>> at system boot, and has no way to gracefully shutdown a core in terms
> of
>>>> kvm. This prevents, especially, kexec from rebooting the system on a
> boot
>>>> core in EL2.
>>>>
>>>> This patch adds a cpu tear-down function and also puts an existing
> cpu-init
>>>> code into a separate function, kvm_arch_hardware_disable() and
>>>> kvm_arch_hardware_enable() respectively.
>>>> We don't need arm64-specific cpu hotplug hook any more.
>>>>
>>>> Since this patch modifies common part of code between arm and arm64,
> one
>>>> stub definition, __cpu_reset_hyp_mode(), is added on arm side to avoid
>>>> compiling errors.
>>>>
>>>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>> ---
>>>>   arch/arm/include/asm/kvm_host.h   | 10 ++++-
>>>>   arch/arm/include/asm/kvm_mmu.h    |  1 +
>>>>   arch/arm/kvm/arm.c                | 79
> ++++++++++++++++++---------------------
>>>>   arch/arm/kvm/mmu.c                | 5 +++
>>>>   arch/arm64/include/asm/kvm_host.h | 16 +++++++-
>>>>   arch/arm64/include/asm/kvm_mmu.h  |  1 +
>>>>   arch/arm64/include/asm/virt.h     |  9 +++++
>>>>   arch/arm64/kvm/hyp-init.S         | 33 ++++++++++++++++
>>>>   arch/arm64/kvm/hyp.S              | 32 ++++++++++++++--
>>>>   9 files changed, 138 insertions(+), 48 deletions(-)
>>>
>>> [..]
>>>
>>>>
>>>>
>>>>   static struct notifier_block hyp_init_cpu_pm_nb = {
>>>> @@ -1108,11 +1119,6 @@ static int init_hyp_mode(void)
>>>>          }
>>>>
>>>>          /*
>>>> -        * Execute the init code on each CPU.
>>>> -        */
>>>> -       on_each_cpu(cpu_init_hyp_mode, NULL, 1);
>>>> -
>>>> -       /*
>>>>           * Init HYP view of VGIC
>>>>           */
>>>>          err = kvm_vgic_hyp_init();
>>>
>>> With this flow, the cpu_init_hyp_mode() is called only at VM guest
>>> creation, but vgic_hyp_init() is called at bootup. On a system with
>>> GICv3, it looks like we end up with bogus values from the ICH_VTR_EL2
>>> (to get the number of LRs), because we're not reading it from EL2
>>> anymore.
>
> Thank you for pointing this out.
> Recently I tested my kdump code on hikey, and as hikey(hi6220) has
> gic-400,
> I didn't notice this problem.
>
>> Indeed, this is completely broken (I just reproduced the issue on a
>> model). I wish this kind of details had been checked earlier, but thanks
>> for pointing it out.
>>
>>> Whats the best way to fix this?
>>> - Call kvm_arch_hardware_enable() before vgic_hyp_init() and disable
> later?
>>> - Fold the VGIC init stuff back into hardware_enable()?
>>
>> None of that works - kvm_arch_hardware_enable() is called once per CPU,
>> while vgic_hyp_init() can only be called once. Also,
>> kvm_arch_hardware_enable() is called from interrupt context, and I
>> wouldn't feel comfortable starting probing DT and allocating stuff from
>> there.
>
> Do you think so?
> How about the fixup! patch attached below?
> The point is that, like Ashwin's first idea, we initialize cpus
> temporarily
> before kvm_vgic_hyp_init() and then soon reset cpus again. Thus,
> kvm cpu hotplug will still continue to work as before.
> Now that cpu_init_hyp_mode() is revived as exactly the same as Marc's
> original code, the change will not be a big jump.
I have tested this patch and verified the fix on our hardware with GICv3 
without any issue.
>
> If kvm_hyp_call() in vgic_v3_probe()/kvm_vgic_hyp_init() is a *problem*,
> I hope this should work. Actually I confirmed that, with this fixup!
> patch,
> we could run a kvm guest and also successfully executed kexec on model
> w/gic-v3.
>
> My only concern is the following kernel message I saw when kexec shut 
> down
> the kernel:
> (Please note that I was running one kvm quest (pid=961) here.)
>
> ===
> sh-4.3# ./kexec -d -e
> kexec version: 15.11.16.11.06-g41e52e2
> arch_process_options:112: command_line: (null)
> arch_process_options:114: initrd: (null)
> arch_process_options:115: dtb: (null)
> arch_process_options:117: port: 0x0
> kvm: exiting hardware virtualization
> kvm [961]: Unsupported exception type: 6248304    <== this message
> kexec_core: Starting new kernel
> Disabling non-boot CPUs ...
> CPU1: shutdown
> CPU2: shutdown
> CPU3: shutdown
> CPU4: shutdown
> CPU5: shutdown
> CPU6: shutdown
> CPU7: shutdown
> Bye!
> Booting Linux on physical CPU 0x0
> ...
> ===
>
> I don't know whether we can ignore this kind of message or not.
> Any thoughts?
>
> Thanks,
> -Takahiro AKASHI
>
>
>
>>> - Read the VGIC number of LRs from the hyp stub?
>>
>> That's may UNDEF if called in the wrong context. Also, this defeats the
>> point of stubs, which is just to install the vectors for the hypervisor.
>>
>>> - ..
>>
>> Yeah, quite.
>>
>> Geoff, Takahiro?
>>
>>     M.
>>
> ----8<----
> From 66ca3baedf45c78c81a76ea77ddd6ace49550ab6 Mon Sep 17 00:00:00 2001
> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
> Date: Fri, 11 Dec 2015 13:43:35 +0900
> Subject: [PATCH 7/7] fixup! arm64: kvm: allows kvm cpu hotplug
>
> ---
>  arch/arm/kvm/arm.c |   37 +++++++++++++++++++++++++++----------
>  1 file changed, 27 insertions(+), 10 deletions(-)
>
> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
> index 518c3c7..8fe59ba 100644
> --- a/arch/arm/kvm/arm.c
> +++ b/arch/arm/kvm/arm.c
> @@ -950,7 +950,7 @@ long kvm_arch_vm_ioctl(struct file *filp,
>      }
>  }
>
> -int kvm_arch_hardware_enable(void)
> +static void cpu_init_hyp_mode(void *dummy)
>  {
>      phys_addr_t boot_pgd_ptr;
>      phys_addr_t pgd_ptr;
> @@ -958,9 +958,6 @@ int kvm_arch_hardware_enable(void)
>      unsigned long stack_page;
>      unsigned long vector_ptr;
>
> -    if (__hyp_get_vectors() != hyp_default_vectors)
> -        return 0;
> -
>      /* Switch from the HYP stub to our own HYP init vector */
>      __hyp_set_vectors(kvm_get_idmap_vector());
>
> @@ -973,24 +970,35 @@ int kvm_arch_hardware_enable(void)
>      __cpu_init_hyp_mode(boot_pgd_ptr, pgd_ptr, hyp_stack_ptr,
> vector_ptr);
>
>      kvm_arm_init_debug();
> -
> -    return 0;
>  }
>
> -void kvm_arch_hardware_disable(void)
> +static void cpu_reset_hyp_mode(void *dummy)
>  {
>      phys_addr_t boot_pgd_ptr;
>      phys_addr_t phys_idmap_start;
>
> -    if (__hyp_get_vectors() == hyp_default_vectors)
> -        return;
> -
>      boot_pgd_ptr = kvm_mmu_get_boot_httbr();
>      phys_idmap_start = kvm_get_idmap_start();
>
>      __cpu_reset_hyp_mode(boot_pgd_ptr, phys_idmap_start);
>  }
>
> +int kvm_arch_hardware_enable(void)
> +{
> +    if (__hyp_get_vectors() == hyp_default_vectors)
> +        cpu_init_hyp_mode(NULL);
> +
> +    return 0;
> +}
> +
> +void kvm_arch_hardware_disable(void)
> +{
> +    if (__hyp_get_vectors() == hyp_default_vectors)
> +        return;
> +
> +    cpu_reset_hyp_mode(NULL);
> +}
> +
>  #ifdef CONFIG_CPU_PM
>  static int hyp_init_cpu_pm_notifier(struct notifier_block *self,
>                      unsigned long cmd,
> @@ -1114,6 +1122,12 @@ static int init_hyp_mode(void)
>      }
>
>      /*
> +     * Execute the init code on each CPU.
> +     * Only needed to execute kvm_hyp_call() during *_hyp_init().
> +     */
> +    on_each_cpu(cpu_init_hyp_mode, NULL, 1);
> +
> +    /*
>       * Init HYP view of VGIC
>       */
>      err = kvm_vgic_hyp_init();
> @@ -1127,6 +1141,8 @@ static int init_hyp_mode(void)
>      if (err)
>          goto out_free_context;
>
> +    on_each_cpu(cpu_reset_hyp_mode, NULL, 1);
> +
>  #ifndef CONFIG_HOTPLUG_CPU
>      free_boot_hyp_pgd();
>  #endif
> @@ -1137,6 +1153,7 @@ static int init_hyp_mode(void)
>
>      return 0;
>  out_free_context:
> +    on_each_cpu(cpu_reset_hyp_mode, NULL, 1);
>      free_percpu(kvm_host_cpu_state);
>  out_free_mappings:
>      free_hyp_pgds();

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 04/16] arm64: kvm: allows kvm cpu hotplug
@ 2015-12-11 13:00           ` Shanker Donthineni
  0 siblings, 0 replies; 178+ messages in thread
From: Shanker Donthineni @ 2015-12-11 13:00 UTC (permalink / raw)
  To: AKASHI Takahiro, Marc Zyngier, Ashwin Chaugule, Geoff Levand
  Cc: Mark Rutland, vikrams, Catalin Marinas, Will Deacon,
	Christoffer Dall, kexec, linux-arm-kernel



On 12/11/2015 02:06 AM, AKASHI Takahiro wrote:
> Ashwin, Marc,
>
> On 12/03/2015 10:58 PM, Marc Zyngier wrote:
>> On 02/12/15 22:40, Ashwin Chaugule wrote:
>>> Hello,
>>>
>>> On 24 November 2015 at 17:25, Geoff Levand <geoff@infradead.org> wrote:
>>>> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>>
>>>> The current kvm implementation on arm64 does cpu-specific
> initialization
>>>> at system boot, and has no way to gracefully shutdown a core in terms
> of
>>>> kvm. This prevents, especially, kexec from rebooting the system on a
> boot
>>>> core in EL2.
>>>>
>>>> This patch adds a cpu tear-down function and also puts an existing
> cpu-init
>>>> code into a separate function, kvm_arch_hardware_disable() and
>>>> kvm_arch_hardware_enable() respectively.
>>>> We don't need arm64-specific cpu hotplug hook any more.
>>>>
>>>> Since this patch modifies common part of code between arm and arm64,
> one
>>>> stub definition, __cpu_reset_hyp_mode(), is added on arm side to avoid
>>>> compiling errors.
>>>>
>>>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>> ---
>>>>   arch/arm/include/asm/kvm_host.h   | 10 ++++-
>>>>   arch/arm/include/asm/kvm_mmu.h    |  1 +
>>>>   arch/arm/kvm/arm.c                | 79
> ++++++++++++++++++---------------------
>>>>   arch/arm/kvm/mmu.c                | 5 +++
>>>>   arch/arm64/include/asm/kvm_host.h | 16 +++++++-
>>>>   arch/arm64/include/asm/kvm_mmu.h  |  1 +
>>>>   arch/arm64/include/asm/virt.h     |  9 +++++
>>>>   arch/arm64/kvm/hyp-init.S         | 33 ++++++++++++++++
>>>>   arch/arm64/kvm/hyp.S              | 32 ++++++++++++++--
>>>>   9 files changed, 138 insertions(+), 48 deletions(-)
>>>
>>> [..]
>>>
>>>>
>>>>
>>>>   static struct notifier_block hyp_init_cpu_pm_nb = {
>>>> @@ -1108,11 +1119,6 @@ static int init_hyp_mode(void)
>>>>          }
>>>>
>>>>          /*
>>>> -        * Execute the init code on each CPU.
>>>> -        */
>>>> -       on_each_cpu(cpu_init_hyp_mode, NULL, 1);
>>>> -
>>>> -       /*
>>>>           * Init HYP view of VGIC
>>>>           */
>>>>          err = kvm_vgic_hyp_init();
>>>
>>> With this flow, the cpu_init_hyp_mode() is called only at VM guest
>>> creation, but vgic_hyp_init() is called at bootup. On a system with
>>> GICv3, it looks like we end up with bogus values from the ICH_VTR_EL2
>>> (to get the number of LRs), because we're not reading it from EL2
>>> anymore.
>
> Thank you for pointing this out.
> Recently I tested my kdump code on hikey, and as hikey(hi6220) has
> gic-400,
> I didn't notice this problem.
>
>> Indeed, this is completely broken (I just reproduced the issue on a
>> model). I wish this kind of details had been checked earlier, but thanks
>> for pointing it out.
>>
>>> Whats the best way to fix this?
>>> - Call kvm_arch_hardware_enable() before vgic_hyp_init() and disable
> later?
>>> - Fold the VGIC init stuff back into hardware_enable()?
>>
>> None of that works - kvm_arch_hardware_enable() is called once per CPU,
>> while vgic_hyp_init() can only be called once. Also,
>> kvm_arch_hardware_enable() is called from interrupt context, and I
>> wouldn't feel comfortable starting probing DT and allocating stuff from
>> there.
>
> Do you think so?
> How about the fixup! patch attached below?
> The point is that, like Ashwin's first idea, we initialize cpus
> temporarily
> before kvm_vgic_hyp_init() and then soon reset cpus again. Thus,
> kvm cpu hotplug will still continue to work as before.
> Now that cpu_init_hyp_mode() is revived as exactly the same as Marc's
> original code, the change will not be a big jump.
I have tested this patch and verified the fix on our hardware with GICv3 
without any issue.
>
> If kvm_hyp_call() in vgic_v3_probe()/kvm_vgic_hyp_init() is a *problem*,
> I hope this should work. Actually I confirmed that, with this fixup!
> patch,
> we could run a kvm guest and also successfully executed kexec on model
> w/gic-v3.
>
> My only concern is the following kernel message I saw when kexec shut 
> down
> the kernel:
> (Please note that I was running one kvm quest (pid=961) here.)
>
> ===
> sh-4.3# ./kexec -d -e
> kexec version: 15.11.16.11.06-g41e52e2
> arch_process_options:112: command_line: (null)
> arch_process_options:114: initrd: (null)
> arch_process_options:115: dtb: (null)
> arch_process_options:117: port: 0x0
> kvm: exiting hardware virtualization
> kvm [961]: Unsupported exception type: 6248304    <== this message
> kexec_core: Starting new kernel
> Disabling non-boot CPUs ...
> CPU1: shutdown
> CPU2: shutdown
> CPU3: shutdown
> CPU4: shutdown
> CPU5: shutdown
> CPU6: shutdown
> CPU7: shutdown
> Bye!
> Booting Linux on physical CPU 0x0
> ...
> ===
>
> I don't know whether we can ignore this kind of message or not.
> Any thoughts?
>
> Thanks,
> -Takahiro AKASHI
>
>
>
>>> - Read the VGIC number of LRs from the hyp stub?
>>
>> That's may UNDEF if called in the wrong context. Also, this defeats the
>> point of stubs, which is just to install the vectors for the hypervisor.
>>
>>> - ..
>>
>> Yeah, quite.
>>
>> Geoff, Takahiro?
>>
>>     M.
>>
> ----8<----
> From 66ca3baedf45c78c81a76ea77ddd6ace49550ab6 Mon Sep 17 00:00:00 2001
> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
> Date: Fri, 11 Dec 2015 13:43:35 +0900
> Subject: [PATCH 7/7] fixup! arm64: kvm: allows kvm cpu hotplug
>
> ---
>  arch/arm/kvm/arm.c |   37 +++++++++++++++++++++++++++----------
>  1 file changed, 27 insertions(+), 10 deletions(-)
>
> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
> index 518c3c7..8fe59ba 100644
> --- a/arch/arm/kvm/arm.c
> +++ b/arch/arm/kvm/arm.c
> @@ -950,7 +950,7 @@ long kvm_arch_vm_ioctl(struct file *filp,
>      }
>  }
>
> -int kvm_arch_hardware_enable(void)
> +static void cpu_init_hyp_mode(void *dummy)
>  {
>      phys_addr_t boot_pgd_ptr;
>      phys_addr_t pgd_ptr;
> @@ -958,9 +958,6 @@ int kvm_arch_hardware_enable(void)
>      unsigned long stack_page;
>      unsigned long vector_ptr;
>
> -    if (__hyp_get_vectors() != hyp_default_vectors)
> -        return 0;
> -
>      /* Switch from the HYP stub to our own HYP init vector */
>      __hyp_set_vectors(kvm_get_idmap_vector());
>
> @@ -973,24 +970,35 @@ int kvm_arch_hardware_enable(void)
>      __cpu_init_hyp_mode(boot_pgd_ptr, pgd_ptr, hyp_stack_ptr,
> vector_ptr);
>
>      kvm_arm_init_debug();
> -
> -    return 0;
>  }
>
> -void kvm_arch_hardware_disable(void)
> +static void cpu_reset_hyp_mode(void *dummy)
>  {
>      phys_addr_t boot_pgd_ptr;
>      phys_addr_t phys_idmap_start;
>
> -    if (__hyp_get_vectors() == hyp_default_vectors)
> -        return;
> -
>      boot_pgd_ptr = kvm_mmu_get_boot_httbr();
>      phys_idmap_start = kvm_get_idmap_start();
>
>      __cpu_reset_hyp_mode(boot_pgd_ptr, phys_idmap_start);
>  }
>
> +int kvm_arch_hardware_enable(void)
> +{
> +    if (__hyp_get_vectors() == hyp_default_vectors)
> +        cpu_init_hyp_mode(NULL);
> +
> +    return 0;
> +}
> +
> +void kvm_arch_hardware_disable(void)
> +{
> +    if (__hyp_get_vectors() == hyp_default_vectors)
> +        return;
> +
> +    cpu_reset_hyp_mode(NULL);
> +}
> +
>  #ifdef CONFIG_CPU_PM
>  static int hyp_init_cpu_pm_notifier(struct notifier_block *self,
>                      unsigned long cmd,
> @@ -1114,6 +1122,12 @@ static int init_hyp_mode(void)
>      }
>
>      /*
> +     * Execute the init code on each CPU.
> +     * Only needed to execute kvm_hyp_call() during *_hyp_init().
> +     */
> +    on_each_cpu(cpu_init_hyp_mode, NULL, 1);
> +
> +    /*
>       * Init HYP view of VGIC
>       */
>      err = kvm_vgic_hyp_init();
> @@ -1127,6 +1141,8 @@ static int init_hyp_mode(void)
>      if (err)
>          goto out_free_context;
>
> +    on_each_cpu(cpu_reset_hyp_mode, NULL, 1);
> +
>  #ifndef CONFIG_HOTPLUG_CPU
>      free_boot_hyp_pgd();
>  #endif
> @@ -1137,6 +1153,7 @@ static int init_hyp_mode(void)
>
>      return 0;
>  out_free_context:
> +    on_each_cpu(cpu_reset_hyp_mode, NULL, 1);
>      free_percpu(kvm_host_cpu_state);
>  out_free_mappings:
>      free_hyp_pgds();


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 04/16] arm64: kvm: allows kvm cpu hotplug
  2015-12-11  8:06         ` AKASHI Takahiro
@ 2015-12-11 16:28           ` Marc Zyngier
  -1 siblings, 0 replies; 178+ messages in thread
From: Marc Zyngier @ 2015-12-11 16:28 UTC (permalink / raw)
  To: linux-arm-kernel

On 11/12/15 08:06, AKASHI Takahiro wrote:
> Ashwin, Marc,
> 
> On 12/03/2015 10:58 PM, Marc Zyngier wrote:
>> On 02/12/15 22:40, Ashwin Chaugule wrote:
>>> Hello,
>>>
>>> On 24 November 2015 at 17:25, Geoff Levand <geoff@infradead.org> wrote:
>>>> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>>
>>>> The current kvm implementation on arm64 does cpu-specific initialization
>>>> at system boot, and has no way to gracefully shutdown a core in terms of
>>>> kvm. This prevents, especially, kexec from rebooting the system on a boot
>>>> core in EL2.
>>>>
>>>> This patch adds a cpu tear-down function and also puts an existing cpu-init
>>>> code into a separate function, kvm_arch_hardware_disable() and
>>>> kvm_arch_hardware_enable() respectively.
>>>> We don't need arm64-specific cpu hotplug hook any more.
>>>>
>>>> Since this patch modifies common part of code between arm and arm64, one
>>>> stub definition, __cpu_reset_hyp_mode(), is added on arm side to avoid
>>>> compiling errors.
>>>>
>>>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>> ---
>>>>   arch/arm/include/asm/kvm_host.h   | 10 ++++-
>>>>   arch/arm/include/asm/kvm_mmu.h    |  1 +
>>>>   arch/arm/kvm/arm.c                | 79 ++++++++++++++++++---------------------
>>>>   arch/arm/kvm/mmu.c                |  5 +++
>>>>   arch/arm64/include/asm/kvm_host.h | 16 +++++++-
>>>>   arch/arm64/include/asm/kvm_mmu.h  |  1 +
>>>>   arch/arm64/include/asm/virt.h     |  9 +++++
>>>>   arch/arm64/kvm/hyp-init.S         | 33 ++++++++++++++++
>>>>   arch/arm64/kvm/hyp.S              | 32 ++++++++++++++--
>>>>   9 files changed, 138 insertions(+), 48 deletions(-)
>>>
>>> [..]
>>>
>>>>
>>>>
>>>>   static struct notifier_block hyp_init_cpu_pm_nb = {
>>>> @@ -1108,11 +1119,6 @@ static int init_hyp_mode(void)
>>>>          }
>>>>
>>>>          /*
>>>> -        * Execute the init code on each CPU.
>>>> -        */
>>>> -       on_each_cpu(cpu_init_hyp_mode, NULL, 1);
>>>> -
>>>> -       /*
>>>>           * Init HYP view of VGIC
>>>>           */
>>>>          err = kvm_vgic_hyp_init();
>>>
>>> With this flow, the cpu_init_hyp_mode() is called only at VM guest
>>> creation, but vgic_hyp_init() is called at bootup. On a system with
>>> GICv3, it looks like we end up with bogus values from the ICH_VTR_EL2
>>> (to get the number of LRs), because we're not reading it from EL2
>>> anymore.
> 
> Thank you for pointing this out.
> Recently I tested my kdump code on hikey, and as hikey(hi6220) has gic-400,
> I didn't notice this problem.

Because GIC-400 is a GICv2 implementation, which is entirely MMIO based.
GICv3 uses some system registers that are only available at EL2, and KVM
needs some information contained in these registers before being able to
get initialized.

>> Indeed, this is completely broken (I just reproduced the issue on a
>> model). I wish this kind of details had been checked earlier, but thanks
>> for pointing it out.
>>
>>> Whats the best way to fix this?
>>> - Call kvm_arch_hardware_enable() before vgic_hyp_init() and disable later?
>>> - Fold the VGIC init stuff back into hardware_enable()?
>>
>> None of that works - kvm_arch_hardware_enable() is called once per CPU,
>> while vgic_hyp_init() can only be called once. Also,
>> kvm_arch_hardware_enable() is called from interrupt context, and I
>> wouldn't feel comfortable starting probing DT and allocating stuff from
>> there.
> 
> Do you think so?
> How about the fixup! patch attached below?
> The point is that, like Ashwin's first idea, we initialize cpus temporarily
> before kvm_vgic_hyp_init() and then soon reset cpus again. Thus,
> kvm cpu hotplug will still continue to work as before.
> Now that cpu_init_hyp_mode() is revived as exactly the same as Marc's
> original code, the change will not be a big jump.

This seems quite complicated:
- init EL2 on  all CPUs
- do some initialization
- tear all CPUs EL2 down
- let KVM drive the vectors being set or not

My questions are: why do we need to do this on *all* cpus? Can't that
work on a single one?

Also, the simple fact that we were able to get some junk value is a sign
that something is amiss. I'd expect a splat of some sort, because we now
have a possibility of doing things in the wrong context.

> 
> If kvm_hyp_call() in vgic_v3_probe()/kvm_vgic_hyp_init() is a *problem*,
> I hope this should work. Actually I confirmed that, with this fixup! patch,
> we could run a kvm guest and also successfully executed kexec on model w/gic-v3.
> 
> My only concern is the following kernel message I saw when kexec shut down
> the kernel:
> (Please note that I was running one kvm quest (pid=961) here.)
> 
> ===
> sh-4.3# ./kexec -d -e
> kexec version: 15.11.16.11.06-g41e52e2
> arch_process_options:112: command_line: (null)
> arch_process_options:114: initrd: (null)
> arch_process_options:115: dtb: (null)
> arch_process_options:117: port: 0x0
> kvm: exiting hardware virtualization
> kvm [961]: Unsupported exception type: 6248304    <== this message

That makes me feel very uncomfortable. It looks like we've exited a
guest with some horrible value in X0. How is that even possible?

This deserves to be investigated.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 04/16] arm64: kvm: allows kvm cpu hotplug
@ 2015-12-11 16:28           ` Marc Zyngier
  0 siblings, 0 replies; 178+ messages in thread
From: Marc Zyngier @ 2015-12-11 16:28 UTC (permalink / raw)
  To: AKASHI Takahiro, Ashwin Chaugule, Geoff Levand
  Cc: Mark Rutland, vikrams, Catalin Marinas, Will Deacon, shankerd,
	linux-arm-kernel, kexec, Christoffer Dall

On 11/12/15 08:06, AKASHI Takahiro wrote:
> Ashwin, Marc,
> 
> On 12/03/2015 10:58 PM, Marc Zyngier wrote:
>> On 02/12/15 22:40, Ashwin Chaugule wrote:
>>> Hello,
>>>
>>> On 24 November 2015 at 17:25, Geoff Levand <geoff@infradead.org> wrote:
>>>> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>>
>>>> The current kvm implementation on arm64 does cpu-specific initialization
>>>> at system boot, and has no way to gracefully shutdown a core in terms of
>>>> kvm. This prevents, especially, kexec from rebooting the system on a boot
>>>> core in EL2.
>>>>
>>>> This patch adds a cpu tear-down function and also puts an existing cpu-init
>>>> code into a separate function, kvm_arch_hardware_disable() and
>>>> kvm_arch_hardware_enable() respectively.
>>>> We don't need arm64-specific cpu hotplug hook any more.
>>>>
>>>> Since this patch modifies common part of code between arm and arm64, one
>>>> stub definition, __cpu_reset_hyp_mode(), is added on arm side to avoid
>>>> compiling errors.
>>>>
>>>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>> ---
>>>>   arch/arm/include/asm/kvm_host.h   | 10 ++++-
>>>>   arch/arm/include/asm/kvm_mmu.h    |  1 +
>>>>   arch/arm/kvm/arm.c                | 79 ++++++++++++++++++---------------------
>>>>   arch/arm/kvm/mmu.c                |  5 +++
>>>>   arch/arm64/include/asm/kvm_host.h | 16 +++++++-
>>>>   arch/arm64/include/asm/kvm_mmu.h  |  1 +
>>>>   arch/arm64/include/asm/virt.h     |  9 +++++
>>>>   arch/arm64/kvm/hyp-init.S         | 33 ++++++++++++++++
>>>>   arch/arm64/kvm/hyp.S              | 32 ++++++++++++++--
>>>>   9 files changed, 138 insertions(+), 48 deletions(-)
>>>
>>> [..]
>>>
>>>>
>>>>
>>>>   static struct notifier_block hyp_init_cpu_pm_nb = {
>>>> @@ -1108,11 +1119,6 @@ static int init_hyp_mode(void)
>>>>          }
>>>>
>>>>          /*
>>>> -        * Execute the init code on each CPU.
>>>> -        */
>>>> -       on_each_cpu(cpu_init_hyp_mode, NULL, 1);
>>>> -
>>>> -       /*
>>>>           * Init HYP view of VGIC
>>>>           */
>>>>          err = kvm_vgic_hyp_init();
>>>
>>> With this flow, the cpu_init_hyp_mode() is called only at VM guest
>>> creation, but vgic_hyp_init() is called at bootup. On a system with
>>> GICv3, it looks like we end up with bogus values from the ICH_VTR_EL2
>>> (to get the number of LRs), because we're not reading it from EL2
>>> anymore.
> 
> Thank you for pointing this out.
> Recently I tested my kdump code on hikey, and as hikey(hi6220) has gic-400,
> I didn't notice this problem.

Because GIC-400 is a GICv2 implementation, which is entirely MMIO based.
GICv3 uses some system registers that are only available at EL2, and KVM
needs some information contained in these registers before being able to
get initialized.

>> Indeed, this is completely broken (I just reproduced the issue on a
>> model). I wish this kind of details had been checked earlier, but thanks
>> for pointing it out.
>>
>>> Whats the best way to fix this?
>>> - Call kvm_arch_hardware_enable() before vgic_hyp_init() and disable later?
>>> - Fold the VGIC init stuff back into hardware_enable()?
>>
>> None of that works - kvm_arch_hardware_enable() is called once per CPU,
>> while vgic_hyp_init() can only be called once. Also,
>> kvm_arch_hardware_enable() is called from interrupt context, and I
>> wouldn't feel comfortable starting probing DT and allocating stuff from
>> there.
> 
> Do you think so?
> How about the fixup! patch attached below?
> The point is that, like Ashwin's first idea, we initialize cpus temporarily
> before kvm_vgic_hyp_init() and then soon reset cpus again. Thus,
> kvm cpu hotplug will still continue to work as before.
> Now that cpu_init_hyp_mode() is revived as exactly the same as Marc's
> original code, the change will not be a big jump.

This seems quite complicated:
- init EL2 on  all CPUs
- do some initialization
- tear all CPUs EL2 down
- let KVM drive the vectors being set or not

My questions are: why do we need to do this on *all* cpus? Can't that
work on a single one?

Also, the simple fact that we were able to get some junk value is a sign
that something is amiss. I'd expect a splat of some sort, because we now
have a possibility of doing things in the wrong context.

> 
> If kvm_hyp_call() in vgic_v3_probe()/kvm_vgic_hyp_init() is a *problem*,
> I hope this should work. Actually I confirmed that, with this fixup! patch,
> we could run a kvm guest and also successfully executed kexec on model w/gic-v3.
> 
> My only concern is the following kernel message I saw when kexec shut down
> the kernel:
> (Please note that I was running one kvm quest (pid=961) here.)
> 
> ===
> sh-4.3# ./kexec -d -e
> kexec version: 15.11.16.11.06-g41e52e2
> arch_process_options:112: command_line: (null)
> arch_process_options:114: initrd: (null)
> arch_process_options:115: dtb: (null)
> arch_process_options:117: port: 0x0
> kvm: exiting hardware virtualization
> kvm [961]: Unsupported exception type: 6248304    <== this message

That makes me feel very uncomfortable. It looks like we've exited a
guest with some horrible value in X0. How is that even possible?

This deserves to be investigated.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 04/16] arm64: kvm: allows kvm cpu hotplug
  2015-12-10 18:31         ` Geoff Levand
@ 2015-12-11 16:31           ` Will Deacon
  -1 siblings, 0 replies; 178+ messages in thread
From: Will Deacon @ 2015-12-11 16:31 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Dec 10, 2015 at 10:31:48AM -0800, Geoff Levand wrote:
> On Thu, 2015-12-03 at 13:58 +0000, Marc Zyngier wrote:
> > Indeed, this is completely broken (I just reproduced the issue on a
> > model).
> 
> There are users out there that want kexec/kdump support.  I think we
> should remove this patch from the series, replace it with a temporary
> Kconfig conditional that allows only one of KVM or kexec to be enabled,
> and merge kexec and kdump support in for 4.5.  This will satisfy users
> who need kexec but not KVM, like kexec based bootloaders.  Once this
> KVM hot plug patch is fixed we then merge it, either for 4.5 or 4.6,
> depending on the timing.
> 
> I'll prepare and post a v13 series that does the above.

I'm not really keen on merging this based on a "temporary" Kconfig change
with a promise to have it resolved in the future. Given how long this
series has been floating around already, I'm not filled with confidence
at the prospect of waiting for a fixup patch after it's been merged.

Please address the outstanding review comments before reposting.

Will

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 04/16] arm64: kvm: allows kvm cpu hotplug
@ 2015-12-11 16:31           ` Will Deacon
  0 siblings, 0 replies; 178+ messages in thread
From: Will Deacon @ 2015-12-11 16:31 UTC (permalink / raw)
  To: Geoff Levand
  Cc: Mark Rutland, vikrams, Marc Zyngier, Catalin Marinas, kexec,
	AKASHI Takahiro, shankerd, linux-arm-kernel, Christoffer Dall

On Thu, Dec 10, 2015 at 10:31:48AM -0800, Geoff Levand wrote:
> On Thu, 2015-12-03 at 13:58 +0000, Marc Zyngier wrote:
> > Indeed, this is completely broken (I just reproduced the issue on a
> > model).
> 
> There are users out there that want kexec/kdump support.  I think we
> should remove this patch from the series, replace it with a temporary
> Kconfig conditional that allows only one of KVM or kexec to be enabled,
> and merge kexec and kdump support in for 4.5.  This will satisfy users
> who need kexec but not KVM, like kexec based bootloaders.  Once this
> KVM hot plug patch is fixed we then merge it, either for 4.5 or 4.6,
> depending on the timing.
> 
> I'll prepare and post a v13 series that does the above.

I'm not really keen on merging this based on a "temporary" Kconfig change
with a promise to have it resolved in the future. Given how long this
series has been floating around already, I'm not filled with confidence
at the prospect of waiting for a fixup patch after it's been merged.

Please address the outstanding review comments before reposting.

Will

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 04/16] arm64: kvm: allows kvm cpu hotplug
  2015-12-11 16:28           ` Marc Zyngier
@ 2015-12-11 18:00             ` Shanker Donthineni
  -1 siblings, 0 replies; 178+ messages in thread
From: Shanker Donthineni @ 2015-12-11 18:00 UTC (permalink / raw)
  To: linux-arm-kernel



On 12/11/2015 10:28 AM, Marc Zyngier wrote:
> On 11/12/15 08:06, AKASHI Takahiro wrote:
>> Ashwin, Marc,
>>
>> On 12/03/2015 10:58 PM, Marc Zyngier wrote:
>>> On 02/12/15 22:40, Ashwin Chaugule wrote:
>>>> Hello,
>>>>
>>>> On 24 November 2015 at 17:25, Geoff Levand <geoff@infradead.org> wrote:
>>>>> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>>>
>>>>> The current kvm implementation on arm64 does cpu-specific initialization
>>>>> at system boot, and has no way to gracefully shutdown a core in terms of
>>>>> kvm. This prevents, especially, kexec from rebooting the system on a boot
>>>>> core in EL2.
>>>>>
>>>>> This patch adds a cpu tear-down function and also puts an existing cpu-init
>>>>> code into a separate function, kvm_arch_hardware_disable() and
>>>>> kvm_arch_hardware_enable() respectively.
>>>>> We don't need arm64-specific cpu hotplug hook any more.
>>>>>
>>>>> Since this patch modifies common part of code between arm and arm64, one
>>>>> stub definition, __cpu_reset_hyp_mode(), is added on arm side to avoid
>>>>> compiling errors.
>>>>>
>>>>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>>> ---
>>>>>    arch/arm/include/asm/kvm_host.h   | 10 ++++-
>>>>>    arch/arm/include/asm/kvm_mmu.h    |  1 +
>>>>>    arch/arm/kvm/arm.c                | 79 ++++++++++++++++++---------------------
>>>>>    arch/arm/kvm/mmu.c                |  5 +++
>>>>>    arch/arm64/include/asm/kvm_host.h | 16 +++++++-
>>>>>    arch/arm64/include/asm/kvm_mmu.h  |  1 +
>>>>>    arch/arm64/include/asm/virt.h     |  9 +++++
>>>>>    arch/arm64/kvm/hyp-init.S         | 33 ++++++++++++++++
>>>>>    arch/arm64/kvm/hyp.S              | 32 ++++++++++++++--
>>>>>    9 files changed, 138 insertions(+), 48 deletions(-)
>>>> [..]
>>>>
>>>>>
>>>>>    static struct notifier_block hyp_init_cpu_pm_nb = {
>>>>> @@ -1108,11 +1119,6 @@ static int init_hyp_mode(void)
>>>>>           }
>>>>>
>>>>>           /*
>>>>> -        * Execute the init code on each CPU.
>>>>> -        */
>>>>> -       on_each_cpu(cpu_init_hyp_mode, NULL, 1);
>>>>> -
>>>>> -       /*
>>>>>            * Init HYP view of VGIC
>>>>>            */
>>>>>           err = kvm_vgic_hyp_init();
>>>> With this flow, the cpu_init_hyp_mode() is called only at VM guest
>>>> creation, but vgic_hyp_init() is called at bootup. On a system with
>>>> GICv3, it looks like we end up with bogus values from the ICH_VTR_EL2
>>>> (to get the number of LRs), because we're not reading it from EL2
>>>> anymore.
>> Thank you for pointing this out.
>> Recently I tested my kdump code on hikey, and as hikey(hi6220) has gic-400,
>> I didn't notice this problem.
> Because GIC-400 is a GICv2 implementation, which is entirely MMIO based.
> GICv3 uses some system registers that are only available at EL2, and KVM
> needs some information contained in these registers before being able to
> get initialized.
>
>>> Indeed, this is completely broken (I just reproduced the issue on a
>>> model). I wish this kind of details had been checked earlier, but thanks
>>> for pointing it out.
>>>
>>>> Whats the best way to fix this?
>>>> - Call kvm_arch_hardware_enable() before vgic_hyp_init() and disable later?
>>>> - Fold the VGIC init stuff back into hardware_enable()?
>>> None of that works - kvm_arch_hardware_enable() is called once per CPU,
>>> while vgic_hyp_init() can only be called once. Also,
>>> kvm_arch_hardware_enable() is called from interrupt context, and I
>>> wouldn't feel comfortable starting probing DT and allocating stuff from
>>> there.
>> Do you think so?
>> How about the fixup! patch attached below?
>> The point is that, like Ashwin's first idea, we initialize cpus temporarily
>> before kvm_vgic_hyp_init() and then soon reset cpus again. Thus,
>> kvm cpu hotplug will still continue to work as before.
>> Now that cpu_init_hyp_mode() is revived as exactly the same as Marc's
>> original code, the change will not be a big jump.
> This seems quite complicated:
> - init EL2 on  all CPUs
> - do some initialization
> - tear all CPUs EL2 down
> - let KVM drive the vectors being set or not
>
> My questions are: why do we need to do this on *all* cpus? Can't that
> work on a single one?
>   

Single CPU EL2 initialization should be fine as long as no kernel 
preemption happens
in between init EL2  and  kvm_vgic_hyp_init() execution. The function 
init_hyp_mode()
is called by do_basic_setup() with preemption enabled.

I don't have deeper knowledge of how scheduler is handled during the 
kernel boot
time, but initializing all CPUs definitely helps if preemption happens 
before reading
ICH_VTR_EL2 register and after kvm_vgic_hyp_init().

> Also, the simple fact that we were able to get some junk value is a sign
> that something is amiss. I'd expect a splat of some sort, because we now
> have a possibility of doing things in the wrong context.
>
>> If kvm_hyp_call() in vgic_v3_probe()/kvm_vgic_hyp_init() is a *problem*,
>> I hope this should work. Actually I confirmed that, with this fixup! patch,
>> we could run a kvm guest and also successfully executed kexec on model w/gic-v3.
>>
>> My only concern is the following kernel message I saw when kexec shut down
>> the kernel:
>> (Please note that I was running one kvm quest (pid=961) here.)
>>
>> ===
>> sh-4.3# ./kexec -d -e
>> kexec version: 15.11.16.11.06-g41e52e2
>> arch_process_options:112: command_line: (null)
>> arch_process_options:114: initrd: (null)
>> arch_process_options:115: dtb: (null)
>> arch_process_options:117: port: 0x0
>> kvm: exiting hardware virtualization
>> kvm [961]: Unsupported exception type: 6248304    <== this message
> That makes me feel very uncomfortable. It looks like we've exited a
> guest with some horrible value in X0. How is that even possible?
>
> This deserves to be investigated.
>
> Thanks,
>
> 	M.

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 04/16] arm64: kvm: allows kvm cpu hotplug
@ 2015-12-11 18:00             ` Shanker Donthineni
  0 siblings, 0 replies; 178+ messages in thread
From: Shanker Donthineni @ 2015-12-11 18:00 UTC (permalink / raw)
  To: Marc Zyngier, AKASHI Takahiro, Ashwin Chaugule, Geoff Levand
  Cc: Mark Rutland, vikrams, Catalin Marinas, Will Deacon,
	linux-arm-kernel, kexec, Christoffer Dall



On 12/11/2015 10:28 AM, Marc Zyngier wrote:
> On 11/12/15 08:06, AKASHI Takahiro wrote:
>> Ashwin, Marc,
>>
>> On 12/03/2015 10:58 PM, Marc Zyngier wrote:
>>> On 02/12/15 22:40, Ashwin Chaugule wrote:
>>>> Hello,
>>>>
>>>> On 24 November 2015 at 17:25, Geoff Levand <geoff@infradead.org> wrote:
>>>>> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>>>
>>>>> The current kvm implementation on arm64 does cpu-specific initialization
>>>>> at system boot, and has no way to gracefully shutdown a core in terms of
>>>>> kvm. This prevents, especially, kexec from rebooting the system on a boot
>>>>> core in EL2.
>>>>>
>>>>> This patch adds a cpu tear-down function and also puts an existing cpu-init
>>>>> code into a separate function, kvm_arch_hardware_disable() and
>>>>> kvm_arch_hardware_enable() respectively.
>>>>> We don't need arm64-specific cpu hotplug hook any more.
>>>>>
>>>>> Since this patch modifies common part of code between arm and arm64, one
>>>>> stub definition, __cpu_reset_hyp_mode(), is added on arm side to avoid
>>>>> compiling errors.
>>>>>
>>>>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>>> ---
>>>>>    arch/arm/include/asm/kvm_host.h   | 10 ++++-
>>>>>    arch/arm/include/asm/kvm_mmu.h    |  1 +
>>>>>    arch/arm/kvm/arm.c                | 79 ++++++++++++++++++---------------------
>>>>>    arch/arm/kvm/mmu.c                |  5 +++
>>>>>    arch/arm64/include/asm/kvm_host.h | 16 +++++++-
>>>>>    arch/arm64/include/asm/kvm_mmu.h  |  1 +
>>>>>    arch/arm64/include/asm/virt.h     |  9 +++++
>>>>>    arch/arm64/kvm/hyp-init.S         | 33 ++++++++++++++++
>>>>>    arch/arm64/kvm/hyp.S              | 32 ++++++++++++++--
>>>>>    9 files changed, 138 insertions(+), 48 deletions(-)
>>>> [..]
>>>>
>>>>>
>>>>>    static struct notifier_block hyp_init_cpu_pm_nb = {
>>>>> @@ -1108,11 +1119,6 @@ static int init_hyp_mode(void)
>>>>>           }
>>>>>
>>>>>           /*
>>>>> -        * Execute the init code on each CPU.
>>>>> -        */
>>>>> -       on_each_cpu(cpu_init_hyp_mode, NULL, 1);
>>>>> -
>>>>> -       /*
>>>>>            * Init HYP view of VGIC
>>>>>            */
>>>>>           err = kvm_vgic_hyp_init();
>>>> With this flow, the cpu_init_hyp_mode() is called only at VM guest
>>>> creation, but vgic_hyp_init() is called at bootup. On a system with
>>>> GICv3, it looks like we end up with bogus values from the ICH_VTR_EL2
>>>> (to get the number of LRs), because we're not reading it from EL2
>>>> anymore.
>> Thank you for pointing this out.
>> Recently I tested my kdump code on hikey, and as hikey(hi6220) has gic-400,
>> I didn't notice this problem.
> Because GIC-400 is a GICv2 implementation, which is entirely MMIO based.
> GICv3 uses some system registers that are only available at EL2, and KVM
> needs some information contained in these registers before being able to
> get initialized.
>
>>> Indeed, this is completely broken (I just reproduced the issue on a
>>> model). I wish this kind of details had been checked earlier, but thanks
>>> for pointing it out.
>>>
>>>> Whats the best way to fix this?
>>>> - Call kvm_arch_hardware_enable() before vgic_hyp_init() and disable later?
>>>> - Fold the VGIC init stuff back into hardware_enable()?
>>> None of that works - kvm_arch_hardware_enable() is called once per CPU,
>>> while vgic_hyp_init() can only be called once. Also,
>>> kvm_arch_hardware_enable() is called from interrupt context, and I
>>> wouldn't feel comfortable starting probing DT and allocating stuff from
>>> there.
>> Do you think so?
>> How about the fixup! patch attached below?
>> The point is that, like Ashwin's first idea, we initialize cpus temporarily
>> before kvm_vgic_hyp_init() and then soon reset cpus again. Thus,
>> kvm cpu hotplug will still continue to work as before.
>> Now that cpu_init_hyp_mode() is revived as exactly the same as Marc's
>> original code, the change will not be a big jump.
> This seems quite complicated:
> - init EL2 on  all CPUs
> - do some initialization
> - tear all CPUs EL2 down
> - let KVM drive the vectors being set or not
>
> My questions are: why do we need to do this on *all* cpus? Can't that
> work on a single one?
>   

Single CPU EL2 initialization should be fine as long as no kernel 
preemption happens
in between init EL2  and  kvm_vgic_hyp_init() execution. The function 
init_hyp_mode()
is called by do_basic_setup() with preemption enabled.

I don't have deeper knowledge of how scheduler is handled during the 
kernel boot
time, but initializing all CPUs definitely helps if preemption happens 
before reading
ICH_VTR_EL2 register and after kvm_vgic_hyp_init().

> Also, the simple fact that we were able to get some junk value is a sign
> that something is amiss. I'd expect a splat of some sort, because we now
> have a possibility of doing things in the wrong context.
>
>> If kvm_hyp_call() in vgic_v3_probe()/kvm_vgic_hyp_init() is a *problem*,
>> I hope this should work. Actually I confirmed that, with this fixup! patch,
>> we could run a kvm guest and also successfully executed kexec on model w/gic-v3.
>>
>> My only concern is the following kernel message I saw when kexec shut down
>> the kernel:
>> (Please note that I was running one kvm quest (pid=961) here.)
>>
>> ===
>> sh-4.3# ./kexec -d -e
>> kexec version: 15.11.16.11.06-g41e52e2
>> arch_process_options:112: command_line: (null)
>> arch_process_options:114: initrd: (null)
>> arch_process_options:115: dtb: (null)
>> arch_process_options:117: port: 0x0
>> kvm: exiting hardware virtualization
>> kvm [961]: Unsupported exception type: 6248304    <== this message
> That makes me feel very uncomfortable. It looks like we've exited a
> guest with some horrible value in X0. How is that even possible?
>
> This deserves to be investigated.
>
> Thanks,
>
> 	M.


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 04/16] arm64: kvm: allows kvm cpu hotplug
  2015-12-11 18:00             ` Shanker Donthineni
@ 2015-12-11 18:11               ` Marc Zyngier
  -1 siblings, 0 replies; 178+ messages in thread
From: Marc Zyngier @ 2015-12-11 18:11 UTC (permalink / raw)
  To: linux-arm-kernel

On 11/12/15 18:00, Shanker Donthineni wrote:
> 
> 
> On 12/11/2015 10:28 AM, Marc Zyngier wrote:
>> On 11/12/15 08:06, AKASHI Takahiro wrote:
>>> Ashwin, Marc,
>>>
>>> On 12/03/2015 10:58 PM, Marc Zyngier wrote:
>>>> On 02/12/15 22:40, Ashwin Chaugule wrote:
>>>>> Hello,
>>>>>
>>>>> On 24 November 2015 at 17:25, Geoff Levand <geoff@infradead.org> wrote:
>>>>>> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>>>>
>>>>>> The current kvm implementation on arm64 does cpu-specific initialization
>>>>>> at system boot, and has no way to gracefully shutdown a core in terms of
>>>>>> kvm. This prevents, especially, kexec from rebooting the system on a boot
>>>>>> core in EL2.
>>>>>>
>>>>>> This patch adds a cpu tear-down function and also puts an existing cpu-init
>>>>>> code into a separate function, kvm_arch_hardware_disable() and
>>>>>> kvm_arch_hardware_enable() respectively.
>>>>>> We don't need arm64-specific cpu hotplug hook any more.
>>>>>>
>>>>>> Since this patch modifies common part of code between arm and arm64, one
>>>>>> stub definition, __cpu_reset_hyp_mode(), is added on arm side to avoid
>>>>>> compiling errors.
>>>>>>
>>>>>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>>>> ---
>>>>>>    arch/arm/include/asm/kvm_host.h   | 10 ++++-
>>>>>>    arch/arm/include/asm/kvm_mmu.h    |  1 +
>>>>>>    arch/arm/kvm/arm.c                | 79 ++++++++++++++++++---------------------
>>>>>>    arch/arm/kvm/mmu.c                |  5 +++
>>>>>>    arch/arm64/include/asm/kvm_host.h | 16 +++++++-
>>>>>>    arch/arm64/include/asm/kvm_mmu.h  |  1 +
>>>>>>    arch/arm64/include/asm/virt.h     |  9 +++++
>>>>>>    arch/arm64/kvm/hyp-init.S         | 33 ++++++++++++++++
>>>>>>    arch/arm64/kvm/hyp.S              | 32 ++++++++++++++--
>>>>>>    9 files changed, 138 insertions(+), 48 deletions(-)
>>>>> [..]
>>>>>
>>>>>>
>>>>>>    static struct notifier_block hyp_init_cpu_pm_nb = {
>>>>>> @@ -1108,11 +1119,6 @@ static int init_hyp_mode(void)
>>>>>>           }
>>>>>>
>>>>>>           /*
>>>>>> -        * Execute the init code on each CPU.
>>>>>> -        */
>>>>>> -       on_each_cpu(cpu_init_hyp_mode, NULL, 1);
>>>>>> -
>>>>>> -       /*
>>>>>>            * Init HYP view of VGIC
>>>>>>            */
>>>>>>           err = kvm_vgic_hyp_init();
>>>>> With this flow, the cpu_init_hyp_mode() is called only at VM guest
>>>>> creation, but vgic_hyp_init() is called at bootup. On a system with
>>>>> GICv3, it looks like we end up with bogus values from the ICH_VTR_EL2
>>>>> (to get the number of LRs), because we're not reading it from EL2
>>>>> anymore.
>>> Thank you for pointing this out.
>>> Recently I tested my kdump code on hikey, and as hikey(hi6220) has gic-400,
>>> I didn't notice this problem.
>> Because GIC-400 is a GICv2 implementation, which is entirely MMIO based.
>> GICv3 uses some system registers that are only available at EL2, and KVM
>> needs some information contained in these registers before being able to
>> get initialized.
>>
>>>> Indeed, this is completely broken (I just reproduced the issue on a
>>>> model). I wish this kind of details had been checked earlier, but thanks
>>>> for pointing it out.
>>>>
>>>>> Whats the best way to fix this?
>>>>> - Call kvm_arch_hardware_enable() before vgic_hyp_init() and disable later?
>>>>> - Fold the VGIC init stuff back into hardware_enable()?
>>>> None of that works - kvm_arch_hardware_enable() is called once per CPU,
>>>> while vgic_hyp_init() can only be called once. Also,
>>>> kvm_arch_hardware_enable() is called from interrupt context, and I
>>>> wouldn't feel comfortable starting probing DT and allocating stuff from
>>>> there.
>>> Do you think so?
>>> How about the fixup! patch attached below?
>>> The point is that, like Ashwin's first idea, we initialize cpus temporarily
>>> before kvm_vgic_hyp_init() and then soon reset cpus again. Thus,
>>> kvm cpu hotplug will still continue to work as before.
>>> Now that cpu_init_hyp_mode() is revived as exactly the same as Marc's
>>> original code, the change will not be a big jump.
>> This seems quite complicated:
>> - init EL2 on  all CPUs
>> - do some initialization
>> - tear all CPUs EL2 down
>> - let KVM drive the vectors being set or not
>>
>> My questions are: why do we need to do this on *all* cpus? Can't that
>> work on a single one?
>>   
> 
> Single CPU EL2 initialization should be fine as long as no kernel 
> preemption happens in between init EL2  and  kvm_vgic_hyp_init()
> execution. The function init_hyp_mode() is called by do_basic_setup()
> with preemption enabled.

Indeed. So far, we never needed this since we were executing this code
with interrupts disabled.

> I don't have deeper knowledge of how scheduler is handled during the
>  kernel boot time, but initializing all CPUs definitely helps if
> preemption happens before reading ICH_VTR_EL2 register and after
> kvm_vgic_hyp_init().

What is wrong with wrapping the critical path with
preempt_{enabled,disabled}?

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 04/16] arm64: kvm: allows kvm cpu hotplug
@ 2015-12-11 18:11               ` Marc Zyngier
  0 siblings, 0 replies; 178+ messages in thread
From: Marc Zyngier @ 2015-12-11 18:11 UTC (permalink / raw)
  To: Shanker Donthineni, AKASHI Takahiro, Ashwin Chaugule, Geoff Levand
  Cc: Mark Rutland, vikrams, Catalin Marinas, Will Deacon,
	linux-arm-kernel, kexec, Christoffer Dall

On 11/12/15 18:00, Shanker Donthineni wrote:
> 
> 
> On 12/11/2015 10:28 AM, Marc Zyngier wrote:
>> On 11/12/15 08:06, AKASHI Takahiro wrote:
>>> Ashwin, Marc,
>>>
>>> On 12/03/2015 10:58 PM, Marc Zyngier wrote:
>>>> On 02/12/15 22:40, Ashwin Chaugule wrote:
>>>>> Hello,
>>>>>
>>>>> On 24 November 2015 at 17:25, Geoff Levand <geoff@infradead.org> wrote:
>>>>>> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>>>>
>>>>>> The current kvm implementation on arm64 does cpu-specific initialization
>>>>>> at system boot, and has no way to gracefully shutdown a core in terms of
>>>>>> kvm. This prevents, especially, kexec from rebooting the system on a boot
>>>>>> core in EL2.
>>>>>>
>>>>>> This patch adds a cpu tear-down function and also puts an existing cpu-init
>>>>>> code into a separate function, kvm_arch_hardware_disable() and
>>>>>> kvm_arch_hardware_enable() respectively.
>>>>>> We don't need arm64-specific cpu hotplug hook any more.
>>>>>>
>>>>>> Since this patch modifies common part of code between arm and arm64, one
>>>>>> stub definition, __cpu_reset_hyp_mode(), is added on arm side to avoid
>>>>>> compiling errors.
>>>>>>
>>>>>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>>>> ---
>>>>>>    arch/arm/include/asm/kvm_host.h   | 10 ++++-
>>>>>>    arch/arm/include/asm/kvm_mmu.h    |  1 +
>>>>>>    arch/arm/kvm/arm.c                | 79 ++++++++++++++++++---------------------
>>>>>>    arch/arm/kvm/mmu.c                |  5 +++
>>>>>>    arch/arm64/include/asm/kvm_host.h | 16 +++++++-
>>>>>>    arch/arm64/include/asm/kvm_mmu.h  |  1 +
>>>>>>    arch/arm64/include/asm/virt.h     |  9 +++++
>>>>>>    arch/arm64/kvm/hyp-init.S         | 33 ++++++++++++++++
>>>>>>    arch/arm64/kvm/hyp.S              | 32 ++++++++++++++--
>>>>>>    9 files changed, 138 insertions(+), 48 deletions(-)
>>>>> [..]
>>>>>
>>>>>>
>>>>>>    static struct notifier_block hyp_init_cpu_pm_nb = {
>>>>>> @@ -1108,11 +1119,6 @@ static int init_hyp_mode(void)
>>>>>>           }
>>>>>>
>>>>>>           /*
>>>>>> -        * Execute the init code on each CPU.
>>>>>> -        */
>>>>>> -       on_each_cpu(cpu_init_hyp_mode, NULL, 1);
>>>>>> -
>>>>>> -       /*
>>>>>>            * Init HYP view of VGIC
>>>>>>            */
>>>>>>           err = kvm_vgic_hyp_init();
>>>>> With this flow, the cpu_init_hyp_mode() is called only at VM guest
>>>>> creation, but vgic_hyp_init() is called at bootup. On a system with
>>>>> GICv3, it looks like we end up with bogus values from the ICH_VTR_EL2
>>>>> (to get the number of LRs), because we're not reading it from EL2
>>>>> anymore.
>>> Thank you for pointing this out.
>>> Recently I tested my kdump code on hikey, and as hikey(hi6220) has gic-400,
>>> I didn't notice this problem.
>> Because GIC-400 is a GICv2 implementation, which is entirely MMIO based.
>> GICv3 uses some system registers that are only available at EL2, and KVM
>> needs some information contained in these registers before being able to
>> get initialized.
>>
>>>> Indeed, this is completely broken (I just reproduced the issue on a
>>>> model). I wish this kind of details had been checked earlier, but thanks
>>>> for pointing it out.
>>>>
>>>>> Whats the best way to fix this?
>>>>> - Call kvm_arch_hardware_enable() before vgic_hyp_init() and disable later?
>>>>> - Fold the VGIC init stuff back into hardware_enable()?
>>>> None of that works - kvm_arch_hardware_enable() is called once per CPU,
>>>> while vgic_hyp_init() can only be called once. Also,
>>>> kvm_arch_hardware_enable() is called from interrupt context, and I
>>>> wouldn't feel comfortable starting probing DT and allocating stuff from
>>>> there.
>>> Do you think so?
>>> How about the fixup! patch attached below?
>>> The point is that, like Ashwin's first idea, we initialize cpus temporarily
>>> before kvm_vgic_hyp_init() and then soon reset cpus again. Thus,
>>> kvm cpu hotplug will still continue to work as before.
>>> Now that cpu_init_hyp_mode() is revived as exactly the same as Marc's
>>> original code, the change will not be a big jump.
>> This seems quite complicated:
>> - init EL2 on  all CPUs
>> - do some initialization
>> - tear all CPUs EL2 down
>> - let KVM drive the vectors being set or not
>>
>> My questions are: why do we need to do this on *all* cpus? Can't that
>> work on a single one?
>>   
> 
> Single CPU EL2 initialization should be fine as long as no kernel 
> preemption happens in between init EL2  and  kvm_vgic_hyp_init()
> execution. The function init_hyp_mode() is called by do_basic_setup()
> with preemption enabled.

Indeed. So far, we never needed this since we were executing this code
with interrupts disabled.

> I don't have deeper knowledge of how scheduler is handled during the
>  kernel boot time, but initializing all CPUs definitely helps if
> preemption happens before reading ICH_VTR_EL2 register and after
> kvm_vgic_hyp_init().

What is wrong with wrapping the critical path with
preempt_{enabled,disabled}?

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 04/16] arm64: kvm: allows kvm cpu hotplug
  2015-12-11 18:11               ` Marc Zyngier
@ 2015-12-11 19:11                 ` Shanker Donthineni
  -1 siblings, 0 replies; 178+ messages in thread
From: Shanker Donthineni @ 2015-12-11 19:11 UTC (permalink / raw)
  To: linux-arm-kernel



On 12/11/2015 12:11 PM, Marc Zyngier wrote:
> On 11/12/15 18:00, Shanker Donthineni wrote:
>>
>> On 12/11/2015 10:28 AM, Marc Zyngier wrote:
>>> On 11/12/15 08:06, AKASHI Takahiro wrote:
>>>> Ashwin, Marc,
>>>>
>>>> On 12/03/2015 10:58 PM, Marc Zyngier wrote:
>>>>> On 02/12/15 22:40, Ashwin Chaugule wrote:
>>>>>> Hello,
>>>>>>
>>>>>> On 24 November 2015 at 17:25, Geoff Levand <geoff@infradead.org> wrote:
>>>>>>> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>>>>>
>>>>>>> The current kvm implementation on arm64 does cpu-specific initialization
>>>>>>> at system boot, and has no way to gracefully shutdown a core in terms of
>>>>>>> kvm. This prevents, especially, kexec from rebooting the system on a boot
>>>>>>> core in EL2.
>>>>>>>
>>>>>>> This patch adds a cpu tear-down function and also puts an existing cpu-init
>>>>>>> code into a separate function, kvm_arch_hardware_disable() and
>>>>>>> kvm_arch_hardware_enable() respectively.
>>>>>>> We don't need arm64-specific cpu hotplug hook any more.
>>>>>>>
>>>>>>> Since this patch modifies common part of code between arm and arm64, one
>>>>>>> stub definition, __cpu_reset_hyp_mode(), is added on arm side to avoid
>>>>>>> compiling errors.
>>>>>>>
>>>>>>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>>>>> ---
>>>>>>>     arch/arm/include/asm/kvm_host.h   | 10 ++++-
>>>>>>>     arch/arm/include/asm/kvm_mmu.h    |  1 +
>>>>>>>     arch/arm/kvm/arm.c                | 79 ++++++++++++++++++---------------------
>>>>>>>     arch/arm/kvm/mmu.c                |  5 +++
>>>>>>>     arch/arm64/include/asm/kvm_host.h | 16 +++++++-
>>>>>>>     arch/arm64/include/asm/kvm_mmu.h  |  1 +
>>>>>>>     arch/arm64/include/asm/virt.h     |  9 +++++
>>>>>>>     arch/arm64/kvm/hyp-init.S         | 33 ++++++++++++++++
>>>>>>>     arch/arm64/kvm/hyp.S              | 32 ++++++++++++++--
>>>>>>>     9 files changed, 138 insertions(+), 48 deletions(-)
>>>>>> [..]
>>>>>>
>>>>>>>     static struct notifier_block hyp_init_cpu_pm_nb = {
>>>>>>> @@ -1108,11 +1119,6 @@ static int init_hyp_mode(void)
>>>>>>>            }
>>>>>>>
>>>>>>>            /*
>>>>>>> -        * Execute the init code on each CPU.
>>>>>>> -        */
>>>>>>> -       on_each_cpu(cpu_init_hyp_mode, NULL, 1);
>>>>>>> -
>>>>>>> -       /*
>>>>>>>             * Init HYP view of VGIC
>>>>>>>             */
>>>>>>>            err = kvm_vgic_hyp_init();
>>>>>> With this flow, the cpu_init_hyp_mode() is called only at VM guest
>>>>>> creation, but vgic_hyp_init() is called at bootup. On a system with
>>>>>> GICv3, it looks like we end up with bogus values from the ICH_VTR_EL2
>>>>>> (to get the number of LRs), because we're not reading it from EL2
>>>>>> anymore.
>>>> Thank you for pointing this out.
>>>> Recently I tested my kdump code on hikey, and as hikey(hi6220) has gic-400,
>>>> I didn't notice this problem.
>>> Because GIC-400 is a GICv2 implementation, which is entirely MMIO based.
>>> GICv3 uses some system registers that are only available at EL2, and KVM
>>> needs some information contained in these registers before being able to
>>> get initialized.
>>>
>>>>> Indeed, this is completely broken (I just reproduced the issue on a
>>>>> model). I wish this kind of details had been checked earlier, but thanks
>>>>> for pointing it out.
>>>>>
>>>>>> Whats the best way to fix this?
>>>>>> - Call kvm_arch_hardware_enable() before vgic_hyp_init() and disable later?
>>>>>> - Fold the VGIC init stuff back into hardware_enable()?
>>>>> None of that works - kvm_arch_hardware_enable() is called once per CPU,
>>>>> while vgic_hyp_init() can only be called once. Also,
>>>>> kvm_arch_hardware_enable() is called from interrupt context, and I
>>>>> wouldn't feel comfortable starting probing DT and allocating stuff from
>>>>> there.
>>>> Do you think so?
>>>> How about the fixup! patch attached below?
>>>> The point is that, like Ashwin's first idea, we initialize cpus temporarily
>>>> before kvm_vgic_hyp_init() and then soon reset cpus again. Thus,
>>>> kvm cpu hotplug will still continue to work as before.
>>>> Now that cpu_init_hyp_mode() is revived as exactly the same as Marc's
>>>> original code, the change will not be a big jump.
>>> This seems quite complicated:
>>> - init EL2 on  all CPUs
>>> - do some initialization
>>> - tear all CPUs EL2 down
>>> - let KVM drive the vectors being set or not
>>>
>>> My questions are: why do we need to do this on *all* cpus? Can't that
>>> work on a single one?
>>>    
>> Single CPU EL2 initialization should be fine as long as no kernel
>> preemption happens in between init EL2  and  kvm_vgic_hyp_init()
>> execution. The function init_hyp_mode() is called by do_basic_setup()
>> with preemption enabled.
> Indeed. So far, we never needed this since we were executing this code
> with interrupts disabled.
>
>> I don't have deeper knowledge of how scheduler is handled during the
>>   kernel boot time, but initializing all CPUs definitely helps if
>> preemption happens before reading ICH_VTR_EL2 register and after
>> kvm_vgic_hyp_init().
> What is wrong with wrapping the critical path with
> preempt_{enabled,disabled}?

I agree with you nothing wrong implementing your suggestion. It will fix 
the problem.

> Thanks,
>
> 	M.

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 04/16] arm64: kvm: allows kvm cpu hotplug
@ 2015-12-11 19:11                 ` Shanker Donthineni
  0 siblings, 0 replies; 178+ messages in thread
From: Shanker Donthineni @ 2015-12-11 19:11 UTC (permalink / raw)
  To: Marc Zyngier, AKASHI Takahiro, Ashwin Chaugule, Geoff Levand
  Cc: Mark Rutland, vikrams, Catalin Marinas, Will Deacon,
	linux-arm-kernel, kexec, Christoffer Dall



On 12/11/2015 12:11 PM, Marc Zyngier wrote:
> On 11/12/15 18:00, Shanker Donthineni wrote:
>>
>> On 12/11/2015 10:28 AM, Marc Zyngier wrote:
>>> On 11/12/15 08:06, AKASHI Takahiro wrote:
>>>> Ashwin, Marc,
>>>>
>>>> On 12/03/2015 10:58 PM, Marc Zyngier wrote:
>>>>> On 02/12/15 22:40, Ashwin Chaugule wrote:
>>>>>> Hello,
>>>>>>
>>>>>> On 24 November 2015 at 17:25, Geoff Levand <geoff@infradead.org> wrote:
>>>>>>> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>>>>>
>>>>>>> The current kvm implementation on arm64 does cpu-specific initialization
>>>>>>> at system boot, and has no way to gracefully shutdown a core in terms of
>>>>>>> kvm. This prevents, especially, kexec from rebooting the system on a boot
>>>>>>> core in EL2.
>>>>>>>
>>>>>>> This patch adds a cpu tear-down function and also puts an existing cpu-init
>>>>>>> code into a separate function, kvm_arch_hardware_disable() and
>>>>>>> kvm_arch_hardware_enable() respectively.
>>>>>>> We don't need arm64-specific cpu hotplug hook any more.
>>>>>>>
>>>>>>> Since this patch modifies common part of code between arm and arm64, one
>>>>>>> stub definition, __cpu_reset_hyp_mode(), is added on arm side to avoid
>>>>>>> compiling errors.
>>>>>>>
>>>>>>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>>>>> ---
>>>>>>>     arch/arm/include/asm/kvm_host.h   | 10 ++++-
>>>>>>>     arch/arm/include/asm/kvm_mmu.h    |  1 +
>>>>>>>     arch/arm/kvm/arm.c                | 79 ++++++++++++++++++---------------------
>>>>>>>     arch/arm/kvm/mmu.c                |  5 +++
>>>>>>>     arch/arm64/include/asm/kvm_host.h | 16 +++++++-
>>>>>>>     arch/arm64/include/asm/kvm_mmu.h  |  1 +
>>>>>>>     arch/arm64/include/asm/virt.h     |  9 +++++
>>>>>>>     arch/arm64/kvm/hyp-init.S         | 33 ++++++++++++++++
>>>>>>>     arch/arm64/kvm/hyp.S              | 32 ++++++++++++++--
>>>>>>>     9 files changed, 138 insertions(+), 48 deletions(-)
>>>>>> [..]
>>>>>>
>>>>>>>     static struct notifier_block hyp_init_cpu_pm_nb = {
>>>>>>> @@ -1108,11 +1119,6 @@ static int init_hyp_mode(void)
>>>>>>>            }
>>>>>>>
>>>>>>>            /*
>>>>>>> -        * Execute the init code on each CPU.
>>>>>>> -        */
>>>>>>> -       on_each_cpu(cpu_init_hyp_mode, NULL, 1);
>>>>>>> -
>>>>>>> -       /*
>>>>>>>             * Init HYP view of VGIC
>>>>>>>             */
>>>>>>>            err = kvm_vgic_hyp_init();
>>>>>> With this flow, the cpu_init_hyp_mode() is called only at VM guest
>>>>>> creation, but vgic_hyp_init() is called at bootup. On a system with
>>>>>> GICv3, it looks like we end up with bogus values from the ICH_VTR_EL2
>>>>>> (to get the number of LRs), because we're not reading it from EL2
>>>>>> anymore.
>>>> Thank you for pointing this out.
>>>> Recently I tested my kdump code on hikey, and as hikey(hi6220) has gic-400,
>>>> I didn't notice this problem.
>>> Because GIC-400 is a GICv2 implementation, which is entirely MMIO based.
>>> GICv3 uses some system registers that are only available at EL2, and KVM
>>> needs some information contained in these registers before being able to
>>> get initialized.
>>>
>>>>> Indeed, this is completely broken (I just reproduced the issue on a
>>>>> model). I wish this kind of details had been checked earlier, but thanks
>>>>> for pointing it out.
>>>>>
>>>>>> Whats the best way to fix this?
>>>>>> - Call kvm_arch_hardware_enable() before vgic_hyp_init() and disable later?
>>>>>> - Fold the VGIC init stuff back into hardware_enable()?
>>>>> None of that works - kvm_arch_hardware_enable() is called once per CPU,
>>>>> while vgic_hyp_init() can only be called once. Also,
>>>>> kvm_arch_hardware_enable() is called from interrupt context, and I
>>>>> wouldn't feel comfortable starting probing DT and allocating stuff from
>>>>> there.
>>>> Do you think so?
>>>> How about the fixup! patch attached below?
>>>> The point is that, like Ashwin's first idea, we initialize cpus temporarily
>>>> before kvm_vgic_hyp_init() and then soon reset cpus again. Thus,
>>>> kvm cpu hotplug will still continue to work as before.
>>>> Now that cpu_init_hyp_mode() is revived as exactly the same as Marc's
>>>> original code, the change will not be a big jump.
>>> This seems quite complicated:
>>> - init EL2 on  all CPUs
>>> - do some initialization
>>> - tear all CPUs EL2 down
>>> - let KVM drive the vectors being set or not
>>>
>>> My questions are: why do we need to do this on *all* cpus? Can't that
>>> work on a single one?
>>>    
>> Single CPU EL2 initialization should be fine as long as no kernel
>> preemption happens in between init EL2  and  kvm_vgic_hyp_init()
>> execution. The function init_hyp_mode() is called by do_basic_setup()
>> with preemption enabled.
> Indeed. So far, we never needed this since we were executing this code
> with interrupts disabled.
>
>> I don't have deeper knowledge of how scheduler is handled during the
>>   kernel boot time, but initializing all CPUs definitely helps if
>> preemption happens before reading ICH_VTR_EL2 register and after
>> kvm_vgic_hyp_init().
> What is wrong with wrapping the critical path with
> preempt_{enabled,disabled}?

I agree with you nothing wrong implementing your suggestion. It will fix 
the problem.

> Thanks,
>
> 	M.


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 04/16] arm64: kvm: allows kvm cpu hotplug
  2015-12-11 16:28           ` Marc Zyngier
@ 2015-12-11 20:13             ` Ashwin Chaugule
  -1 siblings, 0 replies; 178+ messages in thread
From: Ashwin Chaugule @ 2015-12-11 20:13 UTC (permalink / raw)
  To: linux-arm-kernel

On 11 December 2015 at 11:28, Marc Zyngier <marc.zyngier@arm.com> wrote:
> On 11/12/15 08:06, AKASHI Takahiro wrote:
>> On 12/03/2015 10:58 PM, Marc Zyngier wrote:
>>> On 02/12/15 22:40, Ashwin Chaugule wrote:
>>>> On 24 November 2015 at 17:25, Geoff Levand <geoff@infradead.org> wrote:
>>>>> From: AKASHI Takahiro <takahiro.akashi@linaro.org>

[..]

>> How about the fixup! patch attached below?
>> The point is that, like Ashwin's first idea, we initialize cpus temporarily
>> before kvm_vgic_hyp_init() and then soon reset cpus again. Thus,
>> kvm cpu hotplug will still continue to work as before.
>> Now that cpu_init_hyp_mode() is revived as exactly the same as Marc's
>> original code, the change will not be a big jump.
>
> This seems quite complicated:
> - init EL2 on  all CPUs
> - do some initialization
> - tear all CPUs EL2 down
> - let KVM drive the vectors being set or not
>
> My questions are: why do we need to do this on *all* cpus? Can't that
> work on a single one?
>
> Also, the simple fact that we were able to get some junk value is a sign
> that something is amiss. I'd expect a splat of some sort, because we now
> have a possibility of doing things in the wrong context.

Yea. I had the same expectation too. Probably some leftover state from
the hyp stub at bootup kept us lucky.

>
>>
>> If kvm_hyp_call() in vgic_v3_probe()/kvm_vgic_hyp_init() is a *problem*,
>> I hope this should work. Actually I confirmed that, with this fixup! patch,
>> we could run a kvm guest and also successfully executed kexec on model w/gic-v3.
>>
>> My only concern is the following kernel message I saw when kexec shut down
>> the kernel:
>> (Please note that I was running one kvm quest (pid=961) here.)
>>
>> ===
>> sh-4.3# ./kexec -d -e
>> kexec version: 15.11.16.11.06-g41e52e2
>> arch_process_options:112: command_line: (null)
>> arch_process_options:114: initrd: (null)
>> arch_process_options:115: dtb: (null)
>> arch_process_options:117: port: 0x0
>> kvm: exiting hardware virtualization
>> kvm [961]: Unsupported exception type: 6248304    <== this message
>
> That makes me feel very uncomfortable. It looks like we've exited a
> guest with some horrible value in X0. How is that even possible?
>
> This deserves to be investigated.

Did this exception trigger as a result of the fixup? For giggles, what
happens if you dont tear down the EL2 setup after reading the number
of LRs? Are we missing another site where we need to setup and
teardown?

Ashwin

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 04/16] arm64: kvm: allows kvm cpu hotplug
@ 2015-12-11 20:13             ` Ashwin Chaugule
  0 siblings, 0 replies; 178+ messages in thread
From: Ashwin Chaugule @ 2015-12-11 20:13 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Mark Rutland, vikrams, Geoff Levand, kexec, Will Deacon,
	AKASHI Takahiro, shankerd, linux-arm-kernel, Catalin Marinas,
	Christoffer Dall

On 11 December 2015 at 11:28, Marc Zyngier <marc.zyngier@arm.com> wrote:
> On 11/12/15 08:06, AKASHI Takahiro wrote:
>> On 12/03/2015 10:58 PM, Marc Zyngier wrote:
>>> On 02/12/15 22:40, Ashwin Chaugule wrote:
>>>> On 24 November 2015 at 17:25, Geoff Levand <geoff@infradead.org> wrote:
>>>>> From: AKASHI Takahiro <takahiro.akashi@linaro.org>

[..]

>> How about the fixup! patch attached below?
>> The point is that, like Ashwin's first idea, we initialize cpus temporarily
>> before kvm_vgic_hyp_init() and then soon reset cpus again. Thus,
>> kvm cpu hotplug will still continue to work as before.
>> Now that cpu_init_hyp_mode() is revived as exactly the same as Marc's
>> original code, the change will not be a big jump.
>
> This seems quite complicated:
> - init EL2 on  all CPUs
> - do some initialization
> - tear all CPUs EL2 down
> - let KVM drive the vectors being set or not
>
> My questions are: why do we need to do this on *all* cpus? Can't that
> work on a single one?
>
> Also, the simple fact that we were able to get some junk value is a sign
> that something is amiss. I'd expect a splat of some sort, because we now
> have a possibility of doing things in the wrong context.

Yea. I had the same expectation too. Probably some leftover state from
the hyp stub at bootup kept us lucky.

>
>>
>> If kvm_hyp_call() in vgic_v3_probe()/kvm_vgic_hyp_init() is a *problem*,
>> I hope this should work. Actually I confirmed that, with this fixup! patch,
>> we could run a kvm guest and also successfully executed kexec on model w/gic-v3.
>>
>> My only concern is the following kernel message I saw when kexec shut down
>> the kernel:
>> (Please note that I was running one kvm quest (pid=961) here.)
>>
>> ===
>> sh-4.3# ./kexec -d -e
>> kexec version: 15.11.16.11.06-g41e52e2
>> arch_process_options:112: command_line: (null)
>> arch_process_options:114: initrd: (null)
>> arch_process_options:115: dtb: (null)
>> arch_process_options:117: port: 0x0
>> kvm: exiting hardware virtualization
>> kvm [961]: Unsupported exception type: 6248304    <== this message
>
> That makes me feel very uncomfortable. It looks like we've exited a
> guest with some horrible value in X0. How is that even possible?
>
> This deserves to be investigated.

Did this exception trigger as a result of the fixup? For giggles, what
happens if you dont tear down the EL2 setup after reading the number
of LRs? Are we missing another site where we need to setup and
teardown?

Ashwin

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 04/16] arm64: kvm: allows kvm cpu hotplug
  2015-12-11 16:28           ` Marc Zyngier
@ 2015-12-14  7:33             ` AKASHI Takahiro
  -1 siblings, 0 replies; 178+ messages in thread
From: AKASHI Takahiro @ 2015-12-14  7:33 UTC (permalink / raw)
  To: linux-arm-kernel

Marc,

On 12/12/2015 01:28 AM, Marc Zyngier wrote:
> On 11/12/15 08:06, AKASHI Takahiro wrote:
>> Ashwin, Marc,
>>
>> On 12/03/2015 10:58 PM, Marc Zyngier wrote:
>>> On 02/12/15 22:40, Ashwin Chaugule wrote:
>>>> Hello,
>>>>
>>>> On 24 November 2015 at 17:25, Geoff Levand <geoff@infradead.org> wrote:
>>>>> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>>>
>>>>> The current kvm implementation on arm64 does cpu-specific initialization
>>>>> at system boot, and has no way to gracefully shutdown a core in terms of
>>>>> kvm. This prevents, especially, kexec from rebooting the system on a boot
>>>>> core in EL2.
>>>>>
>>>>> This patch adds a cpu tear-down function and also puts an existing cpu-init
>>>>> code into a separate function, kvm_arch_hardware_disable() and
>>>>> kvm_arch_hardware_enable() respectively.
>>>>> We don't need arm64-specific cpu hotplug hook any more.
>>>>>
>>>>> Since this patch modifies common part of code between arm and arm64, one
>>>>> stub definition, __cpu_reset_hyp_mode(), is added on arm side to avoid
>>>>> compiling errors.
>>>>>
>>>>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>>> ---
>>>>>    arch/arm/include/asm/kvm_host.h   | 10 ++++-
>>>>>    arch/arm/include/asm/kvm_mmu.h    |  1 +
>>>>>    arch/arm/kvm/arm.c                | 79 ++++++++++++++++++---------------------
>>>>>    arch/arm/kvm/mmu.c                |  5 +++
>>>>>    arch/arm64/include/asm/kvm_host.h | 16 +++++++-
>>>>>    arch/arm64/include/asm/kvm_mmu.h  |  1 +
>>>>>    arch/arm64/include/asm/virt.h     |  9 +++++
>>>>>    arch/arm64/kvm/hyp-init.S         | 33 ++++++++++++++++
>>>>>    arch/arm64/kvm/hyp.S              | 32 ++++++++++++++--
>>>>>    9 files changed, 138 insertions(+), 48 deletions(-)
>>>>
>>>> [..]
>>>>
>>>>>
>>>>>
>>>>>    static struct notifier_block hyp_init_cpu_pm_nb = {
>>>>> @@ -1108,11 +1119,6 @@ static int init_hyp_mode(void)
>>>>>           }
>>>>>
>>>>>           /*
>>>>> -        * Execute the init code on each CPU.
>>>>> -        */
>>>>> -       on_each_cpu(cpu_init_hyp_mode, NULL, 1);
>>>>> -
>>>>> -       /*
>>>>>            * Init HYP view of VGIC
>>>>>            */
>>>>>           err = kvm_vgic_hyp_init();
>>>>
>>>> With this flow, the cpu_init_hyp_mode() is called only at VM guest
>>>> creation, but vgic_hyp_init() is called at bootup. On a system with
>>>> GICv3, it looks like we end up with bogus values from the ICH_VTR_EL2
>>>> (to get the number of LRs), because we're not reading it from EL2
>>>> anymore.
>>
>> Thank you for pointing this out.
>> Recently I tested my kdump code on hikey, and as hikey(hi6220) has gic-400,
>> I didn't notice this problem.
>
> Because GIC-400 is a GICv2 implementation, which is entirely MMIO based.
> GICv3 uses some system registers that are only available at EL2, and KVM
> needs some information contained in these registers before being able to
> get initialized.

I see.

>>> Indeed, this is completely broken (I just reproduced the issue on a
>>> model). I wish this kind of details had been checked earlier, but thanks
>>> for pointing it out.
>>>
>>>> Whats the best way to fix this?
>>>> - Call kvm_arch_hardware_enable() before vgic_hyp_init() and disable later?
>>>> - Fold the VGIC init stuff back into hardware_enable()?
>>>
>>> None of that works - kvm_arch_hardware_enable() is called once per CPU,
>>> while vgic_hyp_init() can only be called once. Also,
>>> kvm_arch_hardware_enable() is called from interrupt context, and I
>>> wouldn't feel comfortable starting probing DT and allocating stuff from
>>> there.
>>
>> Do you think so?
>> How about the fixup! patch attached below?
>> The point is that, like Ashwin's first idea, we initialize cpus temporarily
>> before kvm_vgic_hyp_init() and then soon reset cpus again. Thus,
>> kvm cpu hotplug will still continue to work as before.
>> Now that cpu_init_hyp_mode() is revived as exactly the same as Marc's
>> original code, the change will not be a big jump.
>
> This seems quite complicated:
> - init EL2 on  all CPUs
> - do some initialization
> - tear all CPUs EL2 down
> - let KVM drive the vectors being set or not
>
> My questions are: why do we need to do this on *all* cpus? Can't that
> work on a single one?

I did initialize all the cpus partly because using preempt_enable/disable
looked a bit ugly and partly because we may, in the future, do additional
per-cpu initialization in kvm_vgic_hyp_init() and/or kvm_timer_hyp_init().
But if you're comfortable with preempt_*() stuff, I don' care.


> Also, the simple fact that we were able to get some junk value is a sign
> that something is amiss. I'd expect a splat of some sort, because we now
> have a possibility of doing things in the wrong context.
>
>>
>> If kvm_hyp_call() in vgic_v3_probe()/kvm_vgic_hyp_init() is a *problem*,
>> I hope this should work. Actually I confirmed that, with this fixup! patch,
>> we could run a kvm guest and also successfully executed kexec on model w/gic-v3.
>>
>> My only concern is the following kernel message I saw when kexec shut down
>> the kernel:
>> (Please note that I was running one kvm quest (pid=961) here.)
>>
>> ===
>> sh-4.3# ./kexec -d -e
>> kexec version: 15.11.16.11.06-g41e52e2
>> arch_process_options:112: command_line: (null)
>> arch_process_options:114: initrd: (null)
>> arch_process_options:115: dtb: (null)
>> arch_process_options:117: port: 0x0
>> kvm: exiting hardware virtualization
>> kvm [961]: Unsupported exception type: 6248304    <== this message
>
> That makes me feel very uncomfortable. It looks like we've exited a
> guest with some horrible value in X0. How is that even possible?
>
> This deserves to be investigated.

I guess the problem is that cpu tear-down function is called even if a kvm guest
is still running in kvm_arch_vcpu_ioctl_run().
So adding a check whether cpu has been initialized or not in every iteration of
kvm_arch_vcpu_ioctl_run() will, if necessary, terminate a guest safely without entering
a guest mode. Since this check is done while interrupt is disabled, it won't
interfere with kvm_arch_hardware_disable() called via IPI.
See the attached fixup patch.

Again, I verified the code on model.

Thanks,
-Takahiro AKASHI

> Thanks,
>
> 	M.
>

----8<----
 From 77f273ba5e0c3dfcf75a5a8d1da8035cc390250c Mon Sep 17 00:00:00 2001
From: AKASHI Takahiro <takahiro.akashi@linaro.org>
Date: Fri, 11 Dec 2015 13:43:35 +0900
Subject: [PATCH] fixup! arm64: kvm: allows kvm cpu hotplug

---
  arch/arm/kvm/arm.c |   45 ++++++++++++++++++++++++++++++++++-----------
  1 file changed, 34 insertions(+), 11 deletions(-)

diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index 518c3c7..d7e86fb 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -573,7 +573,11 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
  		/*
  		 * Re-check atomic conditions
  		 */
-		if (signal_pending(current)) {
+		if (__hyp_get_vectors() == hyp_default_vectors) {
+			/* cpu has been torn down */
+			ret = -ENOEXEC;
+			run->exit_reason = KVM_EXIT_SHUTDOWN;
+		} else if (signal_pending(current)) {
  			ret = -EINTR;
  			run->exit_reason = KVM_EXIT_INTR;
  		}
@@ -950,7 +954,7 @@ long kvm_arch_vm_ioctl(struct file *filp,
  	}
  }

-int kvm_arch_hardware_enable(void)
+static void cpu_init_hyp_mode(void)
  {
  	phys_addr_t boot_pgd_ptr;
  	phys_addr_t pgd_ptr;
@@ -958,9 +962,6 @@ int kvm_arch_hardware_enable(void)
  	unsigned long stack_page;
  	unsigned long vector_ptr;

-	if (__hyp_get_vectors() != hyp_default_vectors)
-		return 0;
-
  	/* Switch from the HYP stub to our own HYP init vector */
  	__hyp_set_vectors(kvm_get_idmap_vector());

@@ -973,24 +974,35 @@ int kvm_arch_hardware_enable(void)
  	__cpu_init_hyp_mode(boot_pgd_ptr, pgd_ptr, hyp_stack_ptr, vector_ptr);

  	kvm_arm_init_debug();
-
-	return 0;
  }

-void kvm_arch_hardware_disable(void)
+static void cpu_reset_hyp_mode(void)
  {
  	phys_addr_t boot_pgd_ptr;
  	phys_addr_t phys_idmap_start;

-	if (__hyp_get_vectors() == hyp_default_vectors)
-		return;
-
  	boot_pgd_ptr = kvm_mmu_get_boot_httbr();
  	phys_idmap_start = kvm_get_idmap_start();

  	__cpu_reset_hyp_mode(boot_pgd_ptr, phys_idmap_start);
  }

+int kvm_arch_hardware_enable(void)
+{
+	if (__hyp_get_vectors() == hyp_default_vectors)
+		cpu_init_hyp_mode();
+
+	return 0;
+}
+
+void kvm_arch_hardware_disable(void)
+{
+	if (__hyp_get_vectors() == hyp_default_vectors)
+		return;
+
+	cpu_reset_hyp_mode();
+}
+
  #ifdef CONFIG_CPU_PM
  static int hyp_init_cpu_pm_notifier(struct notifier_block *self,
  				    unsigned long cmd,
@@ -1114,9 +1126,20 @@ static int init_hyp_mode(void)
  	}

  	/*
+	 * Init this CPU temporarily to execute kvm_hyp_call()
+	 * during kvm_vgic_hyp_init().
+	 */
+	preempt_disable();
+	cpu_init_hyp_mode();
+
+	/*
  	 * Init HYP view of VGIC
  	 */
  	err = kvm_vgic_hyp_init();
+
+	cpu_reset_hyp_mode();
+	preempt_enable();
+
  	if (err)
  		goto out_free_context;

-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 04/16] arm64: kvm: allows kvm cpu hotplug
@ 2015-12-14  7:33             ` AKASHI Takahiro
  0 siblings, 0 replies; 178+ messages in thread
From: AKASHI Takahiro @ 2015-12-14  7:33 UTC (permalink / raw)
  To: Marc Zyngier, Ashwin Chaugule, Geoff Levand
  Cc: Mark Rutland, vikrams, Catalin Marinas, Will Deacon, shankerd,
	linux-arm-kernel, kexec, Christoffer Dall

Marc,

On 12/12/2015 01:28 AM, Marc Zyngier wrote:
> On 11/12/15 08:06, AKASHI Takahiro wrote:
>> Ashwin, Marc,
>>
>> On 12/03/2015 10:58 PM, Marc Zyngier wrote:
>>> On 02/12/15 22:40, Ashwin Chaugule wrote:
>>>> Hello,
>>>>
>>>> On 24 November 2015 at 17:25, Geoff Levand <geoff@infradead.org> wrote:
>>>>> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>>>
>>>>> The current kvm implementation on arm64 does cpu-specific initialization
>>>>> at system boot, and has no way to gracefully shutdown a core in terms of
>>>>> kvm. This prevents, especially, kexec from rebooting the system on a boot
>>>>> core in EL2.
>>>>>
>>>>> This patch adds a cpu tear-down function and also puts an existing cpu-init
>>>>> code into a separate function, kvm_arch_hardware_disable() and
>>>>> kvm_arch_hardware_enable() respectively.
>>>>> We don't need arm64-specific cpu hotplug hook any more.
>>>>>
>>>>> Since this patch modifies common part of code between arm and arm64, one
>>>>> stub definition, __cpu_reset_hyp_mode(), is added on arm side to avoid
>>>>> compiling errors.
>>>>>
>>>>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>>> ---
>>>>>    arch/arm/include/asm/kvm_host.h   | 10 ++++-
>>>>>    arch/arm/include/asm/kvm_mmu.h    |  1 +
>>>>>    arch/arm/kvm/arm.c                | 79 ++++++++++++++++++---------------------
>>>>>    arch/arm/kvm/mmu.c                |  5 +++
>>>>>    arch/arm64/include/asm/kvm_host.h | 16 +++++++-
>>>>>    arch/arm64/include/asm/kvm_mmu.h  |  1 +
>>>>>    arch/arm64/include/asm/virt.h     |  9 +++++
>>>>>    arch/arm64/kvm/hyp-init.S         | 33 ++++++++++++++++
>>>>>    arch/arm64/kvm/hyp.S              | 32 ++++++++++++++--
>>>>>    9 files changed, 138 insertions(+), 48 deletions(-)
>>>>
>>>> [..]
>>>>
>>>>>
>>>>>
>>>>>    static struct notifier_block hyp_init_cpu_pm_nb = {
>>>>> @@ -1108,11 +1119,6 @@ static int init_hyp_mode(void)
>>>>>           }
>>>>>
>>>>>           /*
>>>>> -        * Execute the init code on each CPU.
>>>>> -        */
>>>>> -       on_each_cpu(cpu_init_hyp_mode, NULL, 1);
>>>>> -
>>>>> -       /*
>>>>>            * Init HYP view of VGIC
>>>>>            */
>>>>>           err = kvm_vgic_hyp_init();
>>>>
>>>> With this flow, the cpu_init_hyp_mode() is called only at VM guest
>>>> creation, but vgic_hyp_init() is called at bootup. On a system with
>>>> GICv3, it looks like we end up with bogus values from the ICH_VTR_EL2
>>>> (to get the number of LRs), because we're not reading it from EL2
>>>> anymore.
>>
>> Thank you for pointing this out.
>> Recently I tested my kdump code on hikey, and as hikey(hi6220) has gic-400,
>> I didn't notice this problem.
>
> Because GIC-400 is a GICv2 implementation, which is entirely MMIO based.
> GICv3 uses some system registers that are only available at EL2, and KVM
> needs some information contained in these registers before being able to
> get initialized.

I see.

>>> Indeed, this is completely broken (I just reproduced the issue on a
>>> model). I wish this kind of details had been checked earlier, but thanks
>>> for pointing it out.
>>>
>>>> Whats the best way to fix this?
>>>> - Call kvm_arch_hardware_enable() before vgic_hyp_init() and disable later?
>>>> - Fold the VGIC init stuff back into hardware_enable()?
>>>
>>> None of that works - kvm_arch_hardware_enable() is called once per CPU,
>>> while vgic_hyp_init() can only be called once. Also,
>>> kvm_arch_hardware_enable() is called from interrupt context, and I
>>> wouldn't feel comfortable starting probing DT and allocating stuff from
>>> there.
>>
>> Do you think so?
>> How about the fixup! patch attached below?
>> The point is that, like Ashwin's first idea, we initialize cpus temporarily
>> before kvm_vgic_hyp_init() and then soon reset cpus again. Thus,
>> kvm cpu hotplug will still continue to work as before.
>> Now that cpu_init_hyp_mode() is revived as exactly the same as Marc's
>> original code, the change will not be a big jump.
>
> This seems quite complicated:
> - init EL2 on  all CPUs
> - do some initialization
> - tear all CPUs EL2 down
> - let KVM drive the vectors being set or not
>
> My questions are: why do we need to do this on *all* cpus? Can't that
> work on a single one?

I did initialize all the cpus partly because using preempt_enable/disable
looked a bit ugly and partly because we may, in the future, do additional
per-cpu initialization in kvm_vgic_hyp_init() and/or kvm_timer_hyp_init().
But if you're comfortable with preempt_*() stuff, I don' care.


> Also, the simple fact that we were able to get some junk value is a sign
> that something is amiss. I'd expect a splat of some sort, because we now
> have a possibility of doing things in the wrong context.
>
>>
>> If kvm_hyp_call() in vgic_v3_probe()/kvm_vgic_hyp_init() is a *problem*,
>> I hope this should work. Actually I confirmed that, with this fixup! patch,
>> we could run a kvm guest and also successfully executed kexec on model w/gic-v3.
>>
>> My only concern is the following kernel message I saw when kexec shut down
>> the kernel:
>> (Please note that I was running one kvm quest (pid=961) here.)
>>
>> ===
>> sh-4.3# ./kexec -d -e
>> kexec version: 15.11.16.11.06-g41e52e2
>> arch_process_options:112: command_line: (null)
>> arch_process_options:114: initrd: (null)
>> arch_process_options:115: dtb: (null)
>> arch_process_options:117: port: 0x0
>> kvm: exiting hardware virtualization
>> kvm [961]: Unsupported exception type: 6248304    <== this message
>
> That makes me feel very uncomfortable. It looks like we've exited a
> guest with some horrible value in X0. How is that even possible?
>
> This deserves to be investigated.

I guess the problem is that cpu tear-down function is called even if a kvm guest
is still running in kvm_arch_vcpu_ioctl_run().
So adding a check whether cpu has been initialized or not in every iteration of
kvm_arch_vcpu_ioctl_run() will, if necessary, terminate a guest safely without entering
a guest mode. Since this check is done while interrupt is disabled, it won't
interfere with kvm_arch_hardware_disable() called via IPI.
See the attached fixup patch.

Again, I verified the code on model.

Thanks,
-Takahiro AKASHI

> Thanks,
>
> 	M.
>

----8<----
 From 77f273ba5e0c3dfcf75a5a8d1da8035cc390250c Mon Sep 17 00:00:00 2001
From: AKASHI Takahiro <takahiro.akashi@linaro.org>
Date: Fri, 11 Dec 2015 13:43:35 +0900
Subject: [PATCH] fixup! arm64: kvm: allows kvm cpu hotplug

---
  arch/arm/kvm/arm.c |   45 ++++++++++++++++++++++++++++++++++-----------
  1 file changed, 34 insertions(+), 11 deletions(-)

diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index 518c3c7..d7e86fb 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -573,7 +573,11 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
  		/*
  		 * Re-check atomic conditions
  		 */
-		if (signal_pending(current)) {
+		if (__hyp_get_vectors() == hyp_default_vectors) {
+			/* cpu has been torn down */
+			ret = -ENOEXEC;
+			run->exit_reason = KVM_EXIT_SHUTDOWN;
+		} else if (signal_pending(current)) {
  			ret = -EINTR;
  			run->exit_reason = KVM_EXIT_INTR;
  		}
@@ -950,7 +954,7 @@ long kvm_arch_vm_ioctl(struct file *filp,
  	}
  }

-int kvm_arch_hardware_enable(void)
+static void cpu_init_hyp_mode(void)
  {
  	phys_addr_t boot_pgd_ptr;
  	phys_addr_t pgd_ptr;
@@ -958,9 +962,6 @@ int kvm_arch_hardware_enable(void)
  	unsigned long stack_page;
  	unsigned long vector_ptr;

-	if (__hyp_get_vectors() != hyp_default_vectors)
-		return 0;
-
  	/* Switch from the HYP stub to our own HYP init vector */
  	__hyp_set_vectors(kvm_get_idmap_vector());

@@ -973,24 +974,35 @@ int kvm_arch_hardware_enable(void)
  	__cpu_init_hyp_mode(boot_pgd_ptr, pgd_ptr, hyp_stack_ptr, vector_ptr);

  	kvm_arm_init_debug();
-
-	return 0;
  }

-void kvm_arch_hardware_disable(void)
+static void cpu_reset_hyp_mode(void)
  {
  	phys_addr_t boot_pgd_ptr;
  	phys_addr_t phys_idmap_start;

-	if (__hyp_get_vectors() == hyp_default_vectors)
-		return;
-
  	boot_pgd_ptr = kvm_mmu_get_boot_httbr();
  	phys_idmap_start = kvm_get_idmap_start();

  	__cpu_reset_hyp_mode(boot_pgd_ptr, phys_idmap_start);
  }

+int kvm_arch_hardware_enable(void)
+{
+	if (__hyp_get_vectors() == hyp_default_vectors)
+		cpu_init_hyp_mode();
+
+	return 0;
+}
+
+void kvm_arch_hardware_disable(void)
+{
+	if (__hyp_get_vectors() == hyp_default_vectors)
+		return;
+
+	cpu_reset_hyp_mode();
+}
+
  #ifdef CONFIG_CPU_PM
  static int hyp_init_cpu_pm_notifier(struct notifier_block *self,
  				    unsigned long cmd,
@@ -1114,9 +1126,20 @@ static int init_hyp_mode(void)
  	}

  	/*
+	 * Init this CPU temporarily to execute kvm_hyp_call()
+	 * during kvm_vgic_hyp_init().
+	 */
+	preempt_disable();
+	cpu_init_hyp_mode();
+
+	/*
  	 * Init HYP view of VGIC
  	 */
  	err = kvm_vgic_hyp_init();
+
+	cpu_reset_hyp_mode();
+	preempt_enable();
+
  	if (err)
  		goto out_free_context;

-- 
1.7.9.5


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply related	[flat|nested] 178+ messages in thread

* [PATCH v12 04/16] arm64: kvm: allows kvm cpu hotplug
  2015-12-14  7:33             ` AKASHI Takahiro
@ 2015-12-14 17:33               ` Marc Zyngier
  -1 siblings, 0 replies; 178+ messages in thread
From: Marc Zyngier @ 2015-12-14 17:33 UTC (permalink / raw)
  To: linux-arm-kernel

On 14/12/15 07:33, AKASHI Takahiro wrote:
> Marc,
> 
> On 12/12/2015 01:28 AM, Marc Zyngier wrote:
>> On 11/12/15 08:06, AKASHI Takahiro wrote:
>>> Ashwin, Marc,
>>>
>>> On 12/03/2015 10:58 PM, Marc Zyngier wrote:
>>>> On 02/12/15 22:40, Ashwin Chaugule wrote:
>>>>> Hello,
>>>>>
>>>>> On 24 November 2015 at 17:25, Geoff Levand <geoff@infradead.org> wrote:
>>>>>> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>>>>
>>>>>> The current kvm implementation on arm64 does cpu-specific initialization
>>>>>> at system boot, and has no way to gracefully shutdown a core in terms of
>>>>>> kvm. This prevents, especially, kexec from rebooting the system on a boot
>>>>>> core in EL2.
>>>>>>
>>>>>> This patch adds a cpu tear-down function and also puts an existing cpu-init
>>>>>> code into a separate function, kvm_arch_hardware_disable() and
>>>>>> kvm_arch_hardware_enable() respectively.
>>>>>> We don't need arm64-specific cpu hotplug hook any more.
>>>>>>
>>>>>> Since this patch modifies common part of code between arm and arm64, one
>>>>>> stub definition, __cpu_reset_hyp_mode(), is added on arm side to avoid
>>>>>> compiling errors.
>>>>>>
>>>>>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>>>> ---
>>>>>>    arch/arm/include/asm/kvm_host.h   | 10 ++++-
>>>>>>    arch/arm/include/asm/kvm_mmu.h    |  1 +
>>>>>>    arch/arm/kvm/arm.c                | 79 ++++++++++++++++++---------------------
>>>>>>    arch/arm/kvm/mmu.c                |  5 +++
>>>>>>    arch/arm64/include/asm/kvm_host.h | 16 +++++++-
>>>>>>    arch/arm64/include/asm/kvm_mmu.h  |  1 +
>>>>>>    arch/arm64/include/asm/virt.h     |  9 +++++
>>>>>>    arch/arm64/kvm/hyp-init.S         | 33 ++++++++++++++++
>>>>>>    arch/arm64/kvm/hyp.S              | 32 ++++++++++++++--
>>>>>>    9 files changed, 138 insertions(+), 48 deletions(-)
>>>>>
>>>>> [..]
>>>>>
>>>>>>
>>>>>>
>>>>>>    static struct notifier_block hyp_init_cpu_pm_nb = {
>>>>>> @@ -1108,11 +1119,6 @@ static int init_hyp_mode(void)
>>>>>>           }
>>>>>>
>>>>>>           /*
>>>>>> -        * Execute the init code on each CPU.
>>>>>> -        */
>>>>>> -       on_each_cpu(cpu_init_hyp_mode, NULL, 1);
>>>>>> -
>>>>>> -       /*
>>>>>>            * Init HYP view of VGIC
>>>>>>            */
>>>>>>           err = kvm_vgic_hyp_init();
>>>>>
>>>>> With this flow, the cpu_init_hyp_mode() is called only at VM guest
>>>>> creation, but vgic_hyp_init() is called at bootup. On a system with
>>>>> GICv3, it looks like we end up with bogus values from the ICH_VTR_EL2
>>>>> (to get the number of LRs), because we're not reading it from EL2
>>>>> anymore.
>>>
>>> Thank you for pointing this out.
>>> Recently I tested my kdump code on hikey, and as hikey(hi6220) has gic-400,
>>> I didn't notice this problem.
>>
>> Because GIC-400 is a GICv2 implementation, which is entirely MMIO based.
>> GICv3 uses some system registers that are only available at EL2, and KVM
>> needs some information contained in these registers before being able to
>> get initialized.
> 
> I see.
> 
>>>> Indeed, this is completely broken (I just reproduced the issue on a
>>>> model). I wish this kind of details had been checked earlier, but thanks
>>>> for pointing it out.
>>>>
>>>>> Whats the best way to fix this?
>>>>> - Call kvm_arch_hardware_enable() before vgic_hyp_init() and disable later?
>>>>> - Fold the VGIC init stuff back into hardware_enable()?
>>>>
>>>> None of that works - kvm_arch_hardware_enable() is called once per CPU,
>>>> while vgic_hyp_init() can only be called once. Also,
>>>> kvm_arch_hardware_enable() is called from interrupt context, and I
>>>> wouldn't feel comfortable starting probing DT and allocating stuff from
>>>> there.
>>>
>>> Do you think so?
>>> How about the fixup! patch attached below?
>>> The point is that, like Ashwin's first idea, we initialize cpus temporarily
>>> before kvm_vgic_hyp_init() and then soon reset cpus again. Thus,
>>> kvm cpu hotplug will still continue to work as before.
>>> Now that cpu_init_hyp_mode() is revived as exactly the same as Marc's
>>> original code, the change will not be a big jump.
>>
>> This seems quite complicated:
>> - init EL2 on  all CPUs
>> - do some initialization
>> - tear all CPUs EL2 down
>> - let KVM drive the vectors being set or not
>>
>> My questions are: why do we need to do this on *all* cpus? Can't that
>> work on a single one?
> 
> I did initialize all the cpus partly because using preempt_enable/disable
> looked a bit ugly and partly because we may, in the future, do additional
> per-cpu initialization in kvm_vgic_hyp_init() and/or kvm_timer_hyp_init().
> But if you're comfortable with preempt_*() stuff, I don' care.
> 
> 
>> Also, the simple fact that we were able to get some junk value is a sign
>> that something is amiss. I'd expect a splat of some sort, because we now
>> have a possibility of doing things in the wrong context.
>>
>>>
>>> If kvm_hyp_call() in vgic_v3_probe()/kvm_vgic_hyp_init() is a *problem*,
>>> I hope this should work. Actually I confirmed that, with this fixup! patch,
>>> we could run a kvm guest and also successfully executed kexec on model w/gic-v3.
>>>
>>> My only concern is the following kernel message I saw when kexec shut down
>>> the kernel:
>>> (Please note that I was running one kvm quest (pid=961) here.)
>>>
>>> ===
>>> sh-4.3# ./kexec -d -e
>>> kexec version: 15.11.16.11.06-g41e52e2
>>> arch_process_options:112: command_line: (null)
>>> arch_process_options:114: initrd: (null)
>>> arch_process_options:115: dtb: (null)
>>> arch_process_options:117: port: 0x0
>>> kvm: exiting hardware virtualization
>>> kvm [961]: Unsupported exception type: 6248304    <== this message
>>
>> That makes me feel very uncomfortable. It looks like we've exited a
>> guest with some horrible value in X0. How is that even possible?
>>
>> This deserves to be investigated.
> 
> I guess the problem is that cpu tear-down function is called even if a kvm guest
> is still running in kvm_arch_vcpu_ioctl_run().
> So adding a check whether cpu has been initialized or not in every iteration of
> kvm_arch_vcpu_ioctl_run() will, if necessary, terminate a guest safely without entering
> a guest mode. Since this check is done while interrupt is disabled, it won't
> interfere with kvm_arch_hardware_disable() called via IPI.
> See the attached fixup patch.
> 
> Again, I verified the code on model.
> 
> Thanks,
> -Takahiro AKASHI
> 
>> Thanks,
>>
>> 	M.
>>
> 
> ----8<----
>  From 77f273ba5e0c3dfcf75a5a8d1da8035cc390250c Mon Sep 17 00:00:00 2001
> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
> Date: Fri, 11 Dec 2015 13:43:35 +0900
> Subject: [PATCH] fixup! arm64: kvm: allows kvm cpu hotplug
> 
> ---
>   arch/arm/kvm/arm.c |   45 ++++++++++++++++++++++++++++++++++-----------
>   1 file changed, 34 insertions(+), 11 deletions(-)
> 
> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
> index 518c3c7..d7e86fb 100644
> --- a/arch/arm/kvm/arm.c
> +++ b/arch/arm/kvm/arm.c
> @@ -573,7 +573,11 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
>   		/*
>   		 * Re-check atomic conditions
>   		 */
> -		if (signal_pending(current)) {
> +		if (__hyp_get_vectors() == hyp_default_vectors) {
> +			/* cpu has been torn down */
> +			ret = -ENOEXEC;
> +			run->exit_reason = KVM_EXIT_SHUTDOWN;


That feels completely overkill (and very slow). Why don't you maintain a
per-cpu variable containing the CPU states, which will avoid calling
__hyp_get_vectors() all the time? You should be able to reuse that
construct everywhere.

Also, I'm not sure about KVM_EXIT_SHUTDOWN. This looks very x86 specific
(called on triple fault). KVM_EXIT_FAIL_ENTRY looks more appropriate,
and the hardware_entry_failure_reason field should be populated (and
documented).

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 04/16] arm64: kvm: allows kvm cpu hotplug
@ 2015-12-14 17:33               ` Marc Zyngier
  0 siblings, 0 replies; 178+ messages in thread
From: Marc Zyngier @ 2015-12-14 17:33 UTC (permalink / raw)
  To: AKASHI Takahiro, Ashwin Chaugule, Geoff Levand
  Cc: Mark Rutland, vikrams, Catalin Marinas, Will Deacon, shankerd,
	linux-arm-kernel, kexec, Christoffer Dall

On 14/12/15 07:33, AKASHI Takahiro wrote:
> Marc,
> 
> On 12/12/2015 01:28 AM, Marc Zyngier wrote:
>> On 11/12/15 08:06, AKASHI Takahiro wrote:
>>> Ashwin, Marc,
>>>
>>> On 12/03/2015 10:58 PM, Marc Zyngier wrote:
>>>> On 02/12/15 22:40, Ashwin Chaugule wrote:
>>>>> Hello,
>>>>>
>>>>> On 24 November 2015 at 17:25, Geoff Levand <geoff@infradead.org> wrote:
>>>>>> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>>>>
>>>>>> The current kvm implementation on arm64 does cpu-specific initialization
>>>>>> at system boot, and has no way to gracefully shutdown a core in terms of
>>>>>> kvm. This prevents, especially, kexec from rebooting the system on a boot
>>>>>> core in EL2.
>>>>>>
>>>>>> This patch adds a cpu tear-down function and also puts an existing cpu-init
>>>>>> code into a separate function, kvm_arch_hardware_disable() and
>>>>>> kvm_arch_hardware_enable() respectively.
>>>>>> We don't need arm64-specific cpu hotplug hook any more.
>>>>>>
>>>>>> Since this patch modifies common part of code between arm and arm64, one
>>>>>> stub definition, __cpu_reset_hyp_mode(), is added on arm side to avoid
>>>>>> compiling errors.
>>>>>>
>>>>>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>>>> ---
>>>>>>    arch/arm/include/asm/kvm_host.h   | 10 ++++-
>>>>>>    arch/arm/include/asm/kvm_mmu.h    |  1 +
>>>>>>    arch/arm/kvm/arm.c                | 79 ++++++++++++++++++---------------------
>>>>>>    arch/arm/kvm/mmu.c                |  5 +++
>>>>>>    arch/arm64/include/asm/kvm_host.h | 16 +++++++-
>>>>>>    arch/arm64/include/asm/kvm_mmu.h  |  1 +
>>>>>>    arch/arm64/include/asm/virt.h     |  9 +++++
>>>>>>    arch/arm64/kvm/hyp-init.S         | 33 ++++++++++++++++
>>>>>>    arch/arm64/kvm/hyp.S              | 32 ++++++++++++++--
>>>>>>    9 files changed, 138 insertions(+), 48 deletions(-)
>>>>>
>>>>> [..]
>>>>>
>>>>>>
>>>>>>
>>>>>>    static struct notifier_block hyp_init_cpu_pm_nb = {
>>>>>> @@ -1108,11 +1119,6 @@ static int init_hyp_mode(void)
>>>>>>           }
>>>>>>
>>>>>>           /*
>>>>>> -        * Execute the init code on each CPU.
>>>>>> -        */
>>>>>> -       on_each_cpu(cpu_init_hyp_mode, NULL, 1);
>>>>>> -
>>>>>> -       /*
>>>>>>            * Init HYP view of VGIC
>>>>>>            */
>>>>>>           err = kvm_vgic_hyp_init();
>>>>>
>>>>> With this flow, the cpu_init_hyp_mode() is called only at VM guest
>>>>> creation, but vgic_hyp_init() is called at bootup. On a system with
>>>>> GICv3, it looks like we end up with bogus values from the ICH_VTR_EL2
>>>>> (to get the number of LRs), because we're not reading it from EL2
>>>>> anymore.
>>>
>>> Thank you for pointing this out.
>>> Recently I tested my kdump code on hikey, and as hikey(hi6220) has gic-400,
>>> I didn't notice this problem.
>>
>> Because GIC-400 is a GICv2 implementation, which is entirely MMIO based.
>> GICv3 uses some system registers that are only available at EL2, and KVM
>> needs some information contained in these registers before being able to
>> get initialized.
> 
> I see.
> 
>>>> Indeed, this is completely broken (I just reproduced the issue on a
>>>> model). I wish this kind of details had been checked earlier, but thanks
>>>> for pointing it out.
>>>>
>>>>> Whats the best way to fix this?
>>>>> - Call kvm_arch_hardware_enable() before vgic_hyp_init() and disable later?
>>>>> - Fold the VGIC init stuff back into hardware_enable()?
>>>>
>>>> None of that works - kvm_arch_hardware_enable() is called once per CPU,
>>>> while vgic_hyp_init() can only be called once. Also,
>>>> kvm_arch_hardware_enable() is called from interrupt context, and I
>>>> wouldn't feel comfortable starting probing DT and allocating stuff from
>>>> there.
>>>
>>> Do you think so?
>>> How about the fixup! patch attached below?
>>> The point is that, like Ashwin's first idea, we initialize cpus temporarily
>>> before kvm_vgic_hyp_init() and then soon reset cpus again. Thus,
>>> kvm cpu hotplug will still continue to work as before.
>>> Now that cpu_init_hyp_mode() is revived as exactly the same as Marc's
>>> original code, the change will not be a big jump.
>>
>> This seems quite complicated:
>> - init EL2 on  all CPUs
>> - do some initialization
>> - tear all CPUs EL2 down
>> - let KVM drive the vectors being set or not
>>
>> My questions are: why do we need to do this on *all* cpus? Can't that
>> work on a single one?
> 
> I did initialize all the cpus partly because using preempt_enable/disable
> looked a bit ugly and partly because we may, in the future, do additional
> per-cpu initialization in kvm_vgic_hyp_init() and/or kvm_timer_hyp_init().
> But if you're comfortable with preempt_*() stuff, I don' care.
> 
> 
>> Also, the simple fact that we were able to get some junk value is a sign
>> that something is amiss. I'd expect a splat of some sort, because we now
>> have a possibility of doing things in the wrong context.
>>
>>>
>>> If kvm_hyp_call() in vgic_v3_probe()/kvm_vgic_hyp_init() is a *problem*,
>>> I hope this should work. Actually I confirmed that, with this fixup! patch,
>>> we could run a kvm guest and also successfully executed kexec on model w/gic-v3.
>>>
>>> My only concern is the following kernel message I saw when kexec shut down
>>> the kernel:
>>> (Please note that I was running one kvm quest (pid=961) here.)
>>>
>>> ===
>>> sh-4.3# ./kexec -d -e
>>> kexec version: 15.11.16.11.06-g41e52e2
>>> arch_process_options:112: command_line: (null)
>>> arch_process_options:114: initrd: (null)
>>> arch_process_options:115: dtb: (null)
>>> arch_process_options:117: port: 0x0
>>> kvm: exiting hardware virtualization
>>> kvm [961]: Unsupported exception type: 6248304    <== this message
>>
>> That makes me feel very uncomfortable. It looks like we've exited a
>> guest with some horrible value in X0. How is that even possible?
>>
>> This deserves to be investigated.
> 
> I guess the problem is that cpu tear-down function is called even if a kvm guest
> is still running in kvm_arch_vcpu_ioctl_run().
> So adding a check whether cpu has been initialized or not in every iteration of
> kvm_arch_vcpu_ioctl_run() will, if necessary, terminate a guest safely without entering
> a guest mode. Since this check is done while interrupt is disabled, it won't
> interfere with kvm_arch_hardware_disable() called via IPI.
> See the attached fixup patch.
> 
> Again, I verified the code on model.
> 
> Thanks,
> -Takahiro AKASHI
> 
>> Thanks,
>>
>> 	M.
>>
> 
> ----8<----
>  From 77f273ba5e0c3dfcf75a5a8d1da8035cc390250c Mon Sep 17 00:00:00 2001
> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
> Date: Fri, 11 Dec 2015 13:43:35 +0900
> Subject: [PATCH] fixup! arm64: kvm: allows kvm cpu hotplug
> 
> ---
>   arch/arm/kvm/arm.c |   45 ++++++++++++++++++++++++++++++++++-----------
>   1 file changed, 34 insertions(+), 11 deletions(-)
> 
> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
> index 518c3c7..d7e86fb 100644
> --- a/arch/arm/kvm/arm.c
> +++ b/arch/arm/kvm/arm.c
> @@ -573,7 +573,11 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
>   		/*
>   		 * Re-check atomic conditions
>   		 */
> -		if (signal_pending(current)) {
> +		if (__hyp_get_vectors() == hyp_default_vectors) {
> +			/* cpu has been torn down */
> +			ret = -ENOEXEC;
> +			run->exit_reason = KVM_EXIT_SHUTDOWN;


That feels completely overkill (and very slow). Why don't you maintain a
per-cpu variable containing the CPU states, which will avoid calling
__hyp_get_vectors() all the time? You should be able to reuse that
construct everywhere.

Also, I'm not sure about KVM_EXIT_SHUTDOWN. This looks very x86 specific
(called on triple fault). KVM_EXIT_FAIL_ENTRY looks more appropriate,
and the hardware_entry_failure_reason field should be populated (and
documented).

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 04/16] arm64: kvm: allows kvm cpu hotplug
  2015-12-11  8:09           ` AKASHI Takahiro
@ 2015-12-14 18:00             ` Shi, Yang
  -1 siblings, 0 replies; 178+ messages in thread
From: Shi, Yang @ 2015-12-14 18:00 UTC (permalink / raw)
  To: linux-arm-kernel

On 12/11/2015 12:09 AM, AKASHI Takahiro wrote:
> Yang,
>
> On 12/11/2015 03:44 AM, Shi, Yang wrote:
>> On 12/3/2015 5:58 AM, Marc Zyngier wrote:
>>> On 02/12/15 22:40, Ashwin Chaugule wrote:
>>>> Hello,
>>>>
>>>> On 24 November 2015 at 17:25, Geoff Levand <geoff@infradead.org> wrote:
>>>>> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>>>
>>>>> The current kvm implementation on arm64 does cpu-specific
>>>>> initialization
>>>>> at system boot, and has no way to gracefully shutdown a core in
>>>>> terms of
>>>>> kvm. This prevents, especially, kexec from rebooting the system on
>>>>> a boot
>>>>> core in EL2.
>>>>>
>>>>> This patch adds a cpu tear-down function and also puts an existing
>>>>> cpu-init
>>>>> code into a separate function, kvm_arch_hardware_disable() and
>>>>> kvm_arch_hardware_enable() respectively.
>>>>> We don't need arm64-specific cpu hotplug hook any more.
>>>>>
>>>>> Since this patch modifies common part of code between arm and
>>>>> arm64, one
>>>>> stub definition, __cpu_reset_hyp_mode(), is added on arm side to avoid
>>>>> compiling errors.
>>>>>
>>>>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>>> ---
>>>>>   arch/arm/include/asm/kvm_host.h   | 10 ++++-
>>>>>   arch/arm/include/asm/kvm_mmu.h    |  1 +
>>>>>   arch/arm/kvm/arm.c                | 79
>>>>> ++++++++++++++++++---------------------
>>>>>   arch/arm/kvm/mmu.c                |  5 +++
>>>>>   arch/arm64/include/asm/kvm_host.h | 16 +++++++-
>>>>>   arch/arm64/include/asm/kvm_mmu.h  |  1 +
>>>>>   arch/arm64/include/asm/virt.h     |  9 +++++
>>>>>   arch/arm64/kvm/hyp-init.S         | 33 ++++++++++++++++
>>>>>   arch/arm64/kvm/hyp.S              | 32 ++++++++++++++--
>>>>>   9 files changed, 138 insertions(+), 48 deletions(-)
>>>>
>>>> [..]
>>>>
>>>>>
>>>>>
>>>>>   static struct notifier_block hyp_init_cpu_pm_nb = {
>>>>> @@ -1108,11 +1119,6 @@ static int init_hyp_mode(void)
>>>>>          }
>>>>>
>>>>>          /*
>>>>> -        * Execute the init code on each CPU.
>>>>> -        */
>>>>> -       on_each_cpu(cpu_init_hyp_mode, NULL, 1);
>>>>> -
>>>>> -       /*
>>>>>           * Init HYP view of VGIC
>>>>>           */
>>>>>          err = kvm_vgic_hyp_init();
>>>>
>>>> With this flow, the cpu_init_hyp_mode() is called only at VM guest
>>>> creation, but vgic_hyp_init() is called at bootup. On a system with
>>>> GICv3, it looks like we end up with bogus values from the ICH_VTR_EL2
>>>> (to get the number of LRs), because we're not reading it from EL2
>>>> anymore.
>>>
>>> Indeed, this is completely broken (I just reproduced the issue on a
>>> model). I wish this kind of details had been checked earlier, but thanks
>>> for pointing it out.
>>
>> Sorry for chiming in late. I did run into something similar when I
>> tested some earlier version patches with Akashi San.
>>
>> The guest bootup just hangs with 4.1 kernel on my LS2085 board which
>> has GICv3. Not sure if this is the same issue. But,
>> my bisect did point to this patch.
>
> Can you please try my fixup! patch to determine whether it is the same
> issue or not?

I wish I could help this time, however, unfortunately, my LS2085 board 
had some hardware failure so I lost the access to it. Once I get a 
replacement or have it fixed, I will have a try.

Regards,
Yang

>
> Thanks
> -Takahiro AKASHI
>
>
>> I used to think it is my kernel's problem (4.1 sounds a little bit
>> old) since Akashi San can't reproduce this on his
>> Hikey board.
>>
>> Thanks,
>> Yang
>>
>>>
>>>> Whats the best way to fix this?
>>>> - Call kvm_arch_hardware_enable() before vgic_hyp_init() and disable
>>>> later?
>>>> - Fold the VGIC init stuff back into hardware_enable()?
>>>
>>> None of that works - kvm_arch_hardware_enable() is called once per CPU,
>>> while vgic_hyp_init() can only be called once. Also,
>>> kvm_arch_hardware_enable() is called from interrupt context, and I
>>> wouldn't feel comfortable starting probing DT and allocating stuff from
>>> there.
>>>
>>>> - Read the VGIC number of LRs from the hyp stub?
>>>
>>> That's may UNDEF if called in the wrong context. Also, this defeats the
>>> point of stubs, which is just to install the vectors for the hypervisor.
>>>
>>>> - ..
>>>
>>> Yeah, quite.
>>>
>>> Geoff, Takahiro?
>>>
>>>     M.
>>>
>>

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 04/16] arm64: kvm: allows kvm cpu hotplug
@ 2015-12-14 18:00             ` Shi, Yang
  0 siblings, 0 replies; 178+ messages in thread
From: Shi, Yang @ 2015-12-14 18:00 UTC (permalink / raw)
  To: AKASHI Takahiro, Marc Zyngier, Ashwin Chaugule, Geoff Levand
  Cc: Mark Rutland, vikrams, Catalin Marinas, Will Deacon,
	Christoffer Dall, linux-arm-kernel, kexec, shankerd

On 12/11/2015 12:09 AM, AKASHI Takahiro wrote:
> Yang,
>
> On 12/11/2015 03:44 AM, Shi, Yang wrote:
>> On 12/3/2015 5:58 AM, Marc Zyngier wrote:
>>> On 02/12/15 22:40, Ashwin Chaugule wrote:
>>>> Hello,
>>>>
>>>> On 24 November 2015 at 17:25, Geoff Levand <geoff@infradead.org> wrote:
>>>>> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>>>
>>>>> The current kvm implementation on arm64 does cpu-specific
>>>>> initialization
>>>>> at system boot, and has no way to gracefully shutdown a core in
>>>>> terms of
>>>>> kvm. This prevents, especially, kexec from rebooting the system on
>>>>> a boot
>>>>> core in EL2.
>>>>>
>>>>> This patch adds a cpu tear-down function and also puts an existing
>>>>> cpu-init
>>>>> code into a separate function, kvm_arch_hardware_disable() and
>>>>> kvm_arch_hardware_enable() respectively.
>>>>> We don't need arm64-specific cpu hotplug hook any more.
>>>>>
>>>>> Since this patch modifies common part of code between arm and
>>>>> arm64, one
>>>>> stub definition, __cpu_reset_hyp_mode(), is added on arm side to avoid
>>>>> compiling errors.
>>>>>
>>>>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>>> ---
>>>>>   arch/arm/include/asm/kvm_host.h   | 10 ++++-
>>>>>   arch/arm/include/asm/kvm_mmu.h    |  1 +
>>>>>   arch/arm/kvm/arm.c                | 79
>>>>> ++++++++++++++++++---------------------
>>>>>   arch/arm/kvm/mmu.c                |  5 +++
>>>>>   arch/arm64/include/asm/kvm_host.h | 16 +++++++-
>>>>>   arch/arm64/include/asm/kvm_mmu.h  |  1 +
>>>>>   arch/arm64/include/asm/virt.h     |  9 +++++
>>>>>   arch/arm64/kvm/hyp-init.S         | 33 ++++++++++++++++
>>>>>   arch/arm64/kvm/hyp.S              | 32 ++++++++++++++--
>>>>>   9 files changed, 138 insertions(+), 48 deletions(-)
>>>>
>>>> [..]
>>>>
>>>>>
>>>>>
>>>>>   static struct notifier_block hyp_init_cpu_pm_nb = {
>>>>> @@ -1108,11 +1119,6 @@ static int init_hyp_mode(void)
>>>>>          }
>>>>>
>>>>>          /*
>>>>> -        * Execute the init code on each CPU.
>>>>> -        */
>>>>> -       on_each_cpu(cpu_init_hyp_mode, NULL, 1);
>>>>> -
>>>>> -       /*
>>>>>           * Init HYP view of VGIC
>>>>>           */
>>>>>          err = kvm_vgic_hyp_init();
>>>>
>>>> With this flow, the cpu_init_hyp_mode() is called only at VM guest
>>>> creation, but vgic_hyp_init() is called at bootup. On a system with
>>>> GICv3, it looks like we end up with bogus values from the ICH_VTR_EL2
>>>> (to get the number of LRs), because we're not reading it from EL2
>>>> anymore.
>>>
>>> Indeed, this is completely broken (I just reproduced the issue on a
>>> model). I wish this kind of details had been checked earlier, but thanks
>>> for pointing it out.
>>
>> Sorry for chiming in late. I did run into something similar when I
>> tested some earlier version patches with Akashi San.
>>
>> The guest bootup just hangs with 4.1 kernel on my LS2085 board which
>> has GICv3. Not sure if this is the same issue. But,
>> my bisect did point to this patch.
>
> Can you please try my fixup! patch to determine whether it is the same
> issue or not?

I wish I could help this time, however, unfortunately, my LS2085 board 
had some hardware failure so I lost the access to it. Once I get a 
replacement or have it fixed, I will have a try.

Regards,
Yang

>
> Thanks
> -Takahiro AKASHI
>
>
>> I used to think it is my kernel's problem (4.1 sounds a little bit
>> old) since Akashi San can't reproduce this on his
>> Hikey board.
>>
>> Thanks,
>> Yang
>>
>>>
>>>> Whats the best way to fix this?
>>>> - Call kvm_arch_hardware_enable() before vgic_hyp_init() and disable
>>>> later?
>>>> - Fold the VGIC init stuff back into hardware_enable()?
>>>
>>> None of that works - kvm_arch_hardware_enable() is called once per CPU,
>>> while vgic_hyp_init() can only be called once. Also,
>>> kvm_arch_hardware_enable() is called from interrupt context, and I
>>> wouldn't feel comfortable starting probing DT and allocating stuff from
>>> there.
>>>
>>>> - Read the VGIC number of LRs from the hyp stub?
>>>
>>> That's may UNDEF if called in the wrong context. Also, this defeats the
>>> point of stubs, which is just to install the vectors for the hypervisor.
>>>
>>>> - ..
>>>
>>> Yeah, quite.
>>>
>>> Geoff, Takahiro?
>>>
>>>     M.
>>>
>>


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 04/16] arm64: kvm: allows kvm cpu hotplug
  2015-12-14 17:33               ` Marc Zyngier
@ 2015-12-15  7:51                 ` AKASHI Takahiro
  -1 siblings, 0 replies; 178+ messages in thread
From: AKASHI Takahiro @ 2015-12-15  7:51 UTC (permalink / raw)
  To: linux-arm-kernel

On 12/15/2015 02:33 AM, Marc Zyngier wrote:
> On 14/12/15 07:33, AKASHI Takahiro wrote:
>> Marc,
>>
>> On 12/12/2015 01:28 AM, Marc Zyngier wrote:
>>> On 11/12/15 08:06, AKASHI Takahiro wrote:
>>>> Ashwin, Marc,
>>>>
>>>> On 12/03/2015 10:58 PM, Marc Zyngier wrote:
>>>>> On 02/12/15 22:40, Ashwin Chaugule wrote:
>>>>>> Hello,
>>>>>>
>>>>>> On 24 November 2015 at 17:25, Geoff Levand <geoff@infradead.org> wrote:
>>>>>>> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>>>>>
>>>>>>> The current kvm implementation on arm64 does cpu-specific initialization
>>>>>>> at system boot, and has no way to gracefully shutdown a core in terms of
>>>>>>> kvm. This prevents, especially, kexec from rebooting the system on a boot
>>>>>>> core in EL2.
>>>>>>>
>>>>>>> This patch adds a cpu tear-down function and also puts an existing cpu-init
>>>>>>> code into a separate function, kvm_arch_hardware_disable() and
>>>>>>> kvm_arch_hardware_enable() respectively.
>>>>>>> We don't need arm64-specific cpu hotplug hook any more.
>>>>>>>
>>>>>>> Since this patch modifies common part of code between arm and arm64, one
>>>>>>> stub definition, __cpu_reset_hyp_mode(), is added on arm side to avoid
>>>>>>> compiling errors.
>>>>>>>
>>>>>>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>>>>> ---
>>>>>>>     arch/arm/include/asm/kvm_host.h   | 10 ++++-
>>>>>>>     arch/arm/include/asm/kvm_mmu.h    |  1 +
>>>>>>>     arch/arm/kvm/arm.c                | 79 ++++++++++++++++++---------------------
>>>>>>>     arch/arm/kvm/mmu.c                |  5 +++
>>>>>>>     arch/arm64/include/asm/kvm_host.h | 16 +++++++-
>>>>>>>     arch/arm64/include/asm/kvm_mmu.h  |  1 +
>>>>>>>     arch/arm64/include/asm/virt.h     |  9 +++++
>>>>>>>     arch/arm64/kvm/hyp-init.S         | 33 ++++++++++++++++
>>>>>>>     arch/arm64/kvm/hyp.S              | 32 ++++++++++++++--
>>>>>>>     9 files changed, 138 insertions(+), 48 deletions(-)
>>>>>>
>>>>>> [..]
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>     static struct notifier_block hyp_init_cpu_pm_nb = {
>>>>>>> @@ -1108,11 +1119,6 @@ static int init_hyp_mode(void)
>>>>>>>            }
>>>>>>>
>>>>>>>            /*
>>>>>>> -        * Execute the init code on each CPU.
>>>>>>> -        */
>>>>>>> -       on_each_cpu(cpu_init_hyp_mode, NULL, 1);
>>>>>>> -
>>>>>>> -       /*
>>>>>>>             * Init HYP view of VGIC
>>>>>>>             */
>>>>>>>            err = kvm_vgic_hyp_init();
>>>>>>
>>>>>> With this flow, the cpu_init_hyp_mode() is called only at VM guest
>>>>>> creation, but vgic_hyp_init() is called at bootup. On a system with
>>>>>> GICv3, it looks like we end up with bogus values from the ICH_VTR_EL2
>>>>>> (to get the number of LRs), because we're not reading it from EL2
>>>>>> anymore.
>>>>
>>>> Thank you for pointing this out.
>>>> Recently I tested my kdump code on hikey, and as hikey(hi6220) has gic-400,
>>>> I didn't notice this problem.
>>>
>>> Because GIC-400 is a GICv2 implementation, which is entirely MMIO based.
>>> GICv3 uses some system registers that are only available at EL2, and KVM
>>> needs some information contained in these registers before being able to
>>> get initialized.
>>
>> I see.
>>
>>>>> Indeed, this is completely broken (I just reproduced the issue on a
>>>>> model). I wish this kind of details had been checked earlier, but thanks
>>>>> for pointing it out.
>>>>>
>>>>>> Whats the best way to fix this?
>>>>>> - Call kvm_arch_hardware_enable() before vgic_hyp_init() and disable later?
>>>>>> - Fold the VGIC init stuff back into hardware_enable()?
>>>>>
>>>>> None of that works - kvm_arch_hardware_enable() is called once per CPU,
>>>>> while vgic_hyp_init() can only be called once. Also,
>>>>> kvm_arch_hardware_enable() is called from interrupt context, and I
>>>>> wouldn't feel comfortable starting probing DT and allocating stuff from
>>>>> there.
>>>>
>>>> Do you think so?
>>>> How about the fixup! patch attached below?
>>>> The point is that, like Ashwin's first idea, we initialize cpus temporarily
>>>> before kvm_vgic_hyp_init() and then soon reset cpus again. Thus,
>>>> kvm cpu hotplug will still continue to work as before.
>>>> Now that cpu_init_hyp_mode() is revived as exactly the same as Marc's
>>>> original code, the change will not be a big jump.
>>>
>>> This seems quite complicated:
>>> - init EL2 on  all CPUs
>>> - do some initialization
>>> - tear all CPUs EL2 down
>>> - let KVM drive the vectors being set or not
>>>
>>> My questions are: why do we need to do this on *all* cpus? Can't that
>>> work on a single one?
>>
>> I did initialize all the cpus partly because using preempt_enable/disable
>> looked a bit ugly and partly because we may, in the future, do additional
>> per-cpu initialization in kvm_vgic_hyp_init() and/or kvm_timer_hyp_init().
>> But if you're comfortable with preempt_*() stuff, I don' care.
>>
>>
>>> Also, the simple fact that we were able to get some junk value is a sign
>>> that something is amiss. I'd expect a splat of some sort, because we now
>>> have a possibility of doing things in the wrong context.
>>>
>>>>
>>>> If kvm_hyp_call() in vgic_v3_probe()/kvm_vgic_hyp_init() is a *problem*,
>>>> I hope this should work. Actually I confirmed that, with this fixup! patch,
>>>> we could run a kvm guest and also successfully executed kexec on model w/gic-v3.
>>>>
>>>> My only concern is the following kernel message I saw when kexec shut down
>>>> the kernel:
>>>> (Please note that I was running one kvm quest (pid=961) here.)
>>>>
>>>> ===
>>>> sh-4.3# ./kexec -d -e
>>>> kexec version: 15.11.16.11.06-g41e52e2
>>>> arch_process_options:112: command_line: (null)
>>>> arch_process_options:114: initrd: (null)
>>>> arch_process_options:115: dtb: (null)
>>>> arch_process_options:117: port: 0x0
>>>> kvm: exiting hardware virtualization
>>>> kvm [961]: Unsupported exception type: 6248304    <== this message
>>>
>>> That makes me feel very uncomfortable. It looks like we've exited a
>>> guest with some horrible value in X0. How is that even possible?
>>>
>>> This deserves to be investigated.
>>
>> I guess the problem is that cpu tear-down function is called even if a kvm guest
>> is still running in kvm_arch_vcpu_ioctl_run().
>> So adding a check whether cpu has been initialized or not in every iteration of
>> kvm_arch_vcpu_ioctl_run() will, if necessary, terminate a guest safely without entering
>> a guest mode. Since this check is done while interrupt is disabled, it won't
>> interfere with kvm_arch_hardware_disable() called via IPI.
>> See the attached fixup patch.
>>
>> Again, I verified the code on model.
>>
>> Thanks,
>> -Takahiro AKASHI
>>
>>> Thanks,
>>>
>>> 	M.
>>>
>>
>> ----8<----
>>   From 77f273ba5e0c3dfcf75a5a8d1da8035cc390250c Mon Sep 17 00:00:00 2001
>> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
>> Date: Fri, 11 Dec 2015 13:43:35 +0900
>> Subject: [PATCH] fixup! arm64: kvm: allows kvm cpu hotplug
>>
>> ---
>>    arch/arm/kvm/arm.c |   45 ++++++++++++++++++++++++++++++++++-----------
>>    1 file changed, 34 insertions(+), 11 deletions(-)
>>
>> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
>> index 518c3c7..d7e86fb 100644
>> --- a/arch/arm/kvm/arm.c
>> +++ b/arch/arm/kvm/arm.c
>> @@ -573,7 +573,11 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
>>    		/*
>>    		 * Re-check atomic conditions
>>    		 */
>> -		if (signal_pending(current)) {
>> +		if (__hyp_get_vectors() == hyp_default_vectors) {
>> +			/* cpu has been torn down */
>> +			ret = -ENOEXEC;
>> +			run->exit_reason = KVM_EXIT_SHUTDOWN;
>
>
> That feels completely overkill (and very slow). Why don't you maintain a
> per-cpu variable containing the CPU states, which will avoid calling
> __hyp_get_vectors() all the time? You should be able to reuse that
> construct everywhere.

OK. Since I have introduced per-cpu variable, kvm_arm_hardware_enabled, against
cpuidle issue, we will be able to re-use it.

> Also, I'm not sure about KVM_EXIT_SHUTDOWN. This looks very x86 specific
> (called on triple fault).

No, I don't think so.
Looking at kvm_cpu_exec() in kvm-all.c of qemu, KVM_EXIT_SHUTDOWN
is handled in a generic way and results in a reset request.
On the other hand, KVM_EXIT_FAIL_ENTRY seems more arch-specific.
In addition, if kvm_vcpu_ioctl() returns a negative value, run->exit_reason
will never be examined.
So I think
    ret -> 0
    run->exit_reason -> KVM_EXIT_SHUTDOWN
or just
    ret -> -ENOEXEC
is the best.

In either way, a guest will have no good chance to gracefully shutdown itself
because we're kexec'ing (without waiting for threads' termination).

-Takahiro AKASHI

> KVM_EXIT_FAIL_ENTRY looks more appropriate,
> and the hardware_entry_failure_reason field should be populated (and
> documented).
>
> Thanks,
>
> 	M.
>

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 04/16] arm64: kvm: allows kvm cpu hotplug
@ 2015-12-15  7:51                 ` AKASHI Takahiro
  0 siblings, 0 replies; 178+ messages in thread
From: AKASHI Takahiro @ 2015-12-15  7:51 UTC (permalink / raw)
  To: Marc Zyngier, Ashwin Chaugule, Geoff Levand
  Cc: Mark Rutland, vikrams, Catalin Marinas, Will Deacon, shankerd,
	linux-arm-kernel, kexec, Christoffer Dall

On 12/15/2015 02:33 AM, Marc Zyngier wrote:
> On 14/12/15 07:33, AKASHI Takahiro wrote:
>> Marc,
>>
>> On 12/12/2015 01:28 AM, Marc Zyngier wrote:
>>> On 11/12/15 08:06, AKASHI Takahiro wrote:
>>>> Ashwin, Marc,
>>>>
>>>> On 12/03/2015 10:58 PM, Marc Zyngier wrote:
>>>>> On 02/12/15 22:40, Ashwin Chaugule wrote:
>>>>>> Hello,
>>>>>>
>>>>>> On 24 November 2015 at 17:25, Geoff Levand <geoff@infradead.org> wrote:
>>>>>>> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>>>>>
>>>>>>> The current kvm implementation on arm64 does cpu-specific initialization
>>>>>>> at system boot, and has no way to gracefully shutdown a core in terms of
>>>>>>> kvm. This prevents, especially, kexec from rebooting the system on a boot
>>>>>>> core in EL2.
>>>>>>>
>>>>>>> This patch adds a cpu tear-down function and also puts an existing cpu-init
>>>>>>> code into a separate function, kvm_arch_hardware_disable() and
>>>>>>> kvm_arch_hardware_enable() respectively.
>>>>>>> We don't need arm64-specific cpu hotplug hook any more.
>>>>>>>
>>>>>>> Since this patch modifies common part of code between arm and arm64, one
>>>>>>> stub definition, __cpu_reset_hyp_mode(), is added on arm side to avoid
>>>>>>> compiling errors.
>>>>>>>
>>>>>>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>>>>> ---
>>>>>>>     arch/arm/include/asm/kvm_host.h   | 10 ++++-
>>>>>>>     arch/arm/include/asm/kvm_mmu.h    |  1 +
>>>>>>>     arch/arm/kvm/arm.c                | 79 ++++++++++++++++++---------------------
>>>>>>>     arch/arm/kvm/mmu.c                |  5 +++
>>>>>>>     arch/arm64/include/asm/kvm_host.h | 16 +++++++-
>>>>>>>     arch/arm64/include/asm/kvm_mmu.h  |  1 +
>>>>>>>     arch/arm64/include/asm/virt.h     |  9 +++++
>>>>>>>     arch/arm64/kvm/hyp-init.S         | 33 ++++++++++++++++
>>>>>>>     arch/arm64/kvm/hyp.S              | 32 ++++++++++++++--
>>>>>>>     9 files changed, 138 insertions(+), 48 deletions(-)
>>>>>>
>>>>>> [..]
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>     static struct notifier_block hyp_init_cpu_pm_nb = {
>>>>>>> @@ -1108,11 +1119,6 @@ static int init_hyp_mode(void)
>>>>>>>            }
>>>>>>>
>>>>>>>            /*
>>>>>>> -        * Execute the init code on each CPU.
>>>>>>> -        */
>>>>>>> -       on_each_cpu(cpu_init_hyp_mode, NULL, 1);
>>>>>>> -
>>>>>>> -       /*
>>>>>>>             * Init HYP view of VGIC
>>>>>>>             */
>>>>>>>            err = kvm_vgic_hyp_init();
>>>>>>
>>>>>> With this flow, the cpu_init_hyp_mode() is called only at VM guest
>>>>>> creation, but vgic_hyp_init() is called at bootup. On a system with
>>>>>> GICv3, it looks like we end up with bogus values from the ICH_VTR_EL2
>>>>>> (to get the number of LRs), because we're not reading it from EL2
>>>>>> anymore.
>>>>
>>>> Thank you for pointing this out.
>>>> Recently I tested my kdump code on hikey, and as hikey(hi6220) has gic-400,
>>>> I didn't notice this problem.
>>>
>>> Because GIC-400 is a GICv2 implementation, which is entirely MMIO based.
>>> GICv3 uses some system registers that are only available at EL2, and KVM
>>> needs some information contained in these registers before being able to
>>> get initialized.
>>
>> I see.
>>
>>>>> Indeed, this is completely broken (I just reproduced the issue on a
>>>>> model). I wish this kind of details had been checked earlier, but thanks
>>>>> for pointing it out.
>>>>>
>>>>>> Whats the best way to fix this?
>>>>>> - Call kvm_arch_hardware_enable() before vgic_hyp_init() and disable later?
>>>>>> - Fold the VGIC init stuff back into hardware_enable()?
>>>>>
>>>>> None of that works - kvm_arch_hardware_enable() is called once per CPU,
>>>>> while vgic_hyp_init() can only be called once. Also,
>>>>> kvm_arch_hardware_enable() is called from interrupt context, and I
>>>>> wouldn't feel comfortable starting probing DT and allocating stuff from
>>>>> there.
>>>>
>>>> Do you think so?
>>>> How about the fixup! patch attached below?
>>>> The point is that, like Ashwin's first idea, we initialize cpus temporarily
>>>> before kvm_vgic_hyp_init() and then soon reset cpus again. Thus,
>>>> kvm cpu hotplug will still continue to work as before.
>>>> Now that cpu_init_hyp_mode() is revived as exactly the same as Marc's
>>>> original code, the change will not be a big jump.
>>>
>>> This seems quite complicated:
>>> - init EL2 on  all CPUs
>>> - do some initialization
>>> - tear all CPUs EL2 down
>>> - let KVM drive the vectors being set or not
>>>
>>> My questions are: why do we need to do this on *all* cpus? Can't that
>>> work on a single one?
>>
>> I did initialize all the cpus partly because using preempt_enable/disable
>> looked a bit ugly and partly because we may, in the future, do additional
>> per-cpu initialization in kvm_vgic_hyp_init() and/or kvm_timer_hyp_init().
>> But if you're comfortable with preempt_*() stuff, I don' care.
>>
>>
>>> Also, the simple fact that we were able to get some junk value is a sign
>>> that something is amiss. I'd expect a splat of some sort, because we now
>>> have a possibility of doing things in the wrong context.
>>>
>>>>
>>>> If kvm_hyp_call() in vgic_v3_probe()/kvm_vgic_hyp_init() is a *problem*,
>>>> I hope this should work. Actually I confirmed that, with this fixup! patch,
>>>> we could run a kvm guest and also successfully executed kexec on model w/gic-v3.
>>>>
>>>> My only concern is the following kernel message I saw when kexec shut down
>>>> the kernel:
>>>> (Please note that I was running one kvm quest (pid=961) here.)
>>>>
>>>> ===
>>>> sh-4.3# ./kexec -d -e
>>>> kexec version: 15.11.16.11.06-g41e52e2
>>>> arch_process_options:112: command_line: (null)
>>>> arch_process_options:114: initrd: (null)
>>>> arch_process_options:115: dtb: (null)
>>>> arch_process_options:117: port: 0x0
>>>> kvm: exiting hardware virtualization
>>>> kvm [961]: Unsupported exception type: 6248304    <== this message
>>>
>>> That makes me feel very uncomfortable. It looks like we've exited a
>>> guest with some horrible value in X0. How is that even possible?
>>>
>>> This deserves to be investigated.
>>
>> I guess the problem is that cpu tear-down function is called even if a kvm guest
>> is still running in kvm_arch_vcpu_ioctl_run().
>> So adding a check whether cpu has been initialized or not in every iteration of
>> kvm_arch_vcpu_ioctl_run() will, if necessary, terminate a guest safely without entering
>> a guest mode. Since this check is done while interrupt is disabled, it won't
>> interfere with kvm_arch_hardware_disable() called via IPI.
>> See the attached fixup patch.
>>
>> Again, I verified the code on model.
>>
>> Thanks,
>> -Takahiro AKASHI
>>
>>> Thanks,
>>>
>>> 	M.
>>>
>>
>> ----8<----
>>   From 77f273ba5e0c3dfcf75a5a8d1da8035cc390250c Mon Sep 17 00:00:00 2001
>> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
>> Date: Fri, 11 Dec 2015 13:43:35 +0900
>> Subject: [PATCH] fixup! arm64: kvm: allows kvm cpu hotplug
>>
>> ---
>>    arch/arm/kvm/arm.c |   45 ++++++++++++++++++++++++++++++++++-----------
>>    1 file changed, 34 insertions(+), 11 deletions(-)
>>
>> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
>> index 518c3c7..d7e86fb 100644
>> --- a/arch/arm/kvm/arm.c
>> +++ b/arch/arm/kvm/arm.c
>> @@ -573,7 +573,11 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
>>    		/*
>>    		 * Re-check atomic conditions
>>    		 */
>> -		if (signal_pending(current)) {
>> +		if (__hyp_get_vectors() == hyp_default_vectors) {
>> +			/* cpu has been torn down */
>> +			ret = -ENOEXEC;
>> +			run->exit_reason = KVM_EXIT_SHUTDOWN;
>
>
> That feels completely overkill (and very slow). Why don't you maintain a
> per-cpu variable containing the CPU states, which will avoid calling
> __hyp_get_vectors() all the time? You should be able to reuse that
> construct everywhere.

OK. Since I have introduced per-cpu variable, kvm_arm_hardware_enabled, against
cpuidle issue, we will be able to re-use it.

> Also, I'm not sure about KVM_EXIT_SHUTDOWN. This looks very x86 specific
> (called on triple fault).

No, I don't think so.
Looking at kvm_cpu_exec() in kvm-all.c of qemu, KVM_EXIT_SHUTDOWN
is handled in a generic way and results in a reset request.
On the other hand, KVM_EXIT_FAIL_ENTRY seems more arch-specific.
In addition, if kvm_vcpu_ioctl() returns a negative value, run->exit_reason
will never be examined.
So I think
    ret -> 0
    run->exit_reason -> KVM_EXIT_SHUTDOWN
or just
    ret -> -ENOEXEC
is the best.

In either way, a guest will have no good chance to gracefully shutdown itself
because we're kexec'ing (without waiting for threads' termination).

-Takahiro AKASHI

> KVM_EXIT_FAIL_ENTRY looks more appropriate,
> and the hardware_entry_failure_reason field should be populated (and
> documented).
>
> Thanks,
>
> 	M.
>

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 04/16] arm64: kvm: allows kvm cpu hotplug
  2015-12-15  7:51                 ` AKASHI Takahiro
@ 2015-12-15  8:45                   ` Marc Zyngier
  -1 siblings, 0 replies; 178+ messages in thread
From: Marc Zyngier @ 2015-12-15  8:45 UTC (permalink / raw)
  To: linux-arm-kernel

On 15/12/15 07:51, AKASHI Takahiro wrote:
> On 12/15/2015 02:33 AM, Marc Zyngier wrote:
>> On 14/12/15 07:33, AKASHI Takahiro wrote:
>>> Marc,
>>>
>>> On 12/12/2015 01:28 AM, Marc Zyngier wrote:
>>>> On 11/12/15 08:06, AKASHI Takahiro wrote:
>>>>> Ashwin, Marc,
>>>>>
>>>>> On 12/03/2015 10:58 PM, Marc Zyngier wrote:
>>>>>> On 02/12/15 22:40, Ashwin Chaugule wrote:
>>>>>>> Hello,
>>>>>>>
>>>>>>> On 24 November 2015 at 17:25, Geoff Levand <geoff@infradead.org> wrote:
>>>>>>>> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>>>>>>
>>>>>>>> The current kvm implementation on arm64 does cpu-specific initialization
>>>>>>>> at system boot, and has no way to gracefully shutdown a core in terms of
>>>>>>>> kvm. This prevents, especially, kexec from rebooting the system on a boot
>>>>>>>> core in EL2.
>>>>>>>>
>>>>>>>> This patch adds a cpu tear-down function and also puts an existing cpu-init
>>>>>>>> code into a separate function, kvm_arch_hardware_disable() and
>>>>>>>> kvm_arch_hardware_enable() respectively.
>>>>>>>> We don't need arm64-specific cpu hotplug hook any more.
>>>>>>>>
>>>>>>>> Since this patch modifies common part of code between arm and arm64, one
>>>>>>>> stub definition, __cpu_reset_hyp_mode(), is added on arm side to avoid
>>>>>>>> compiling errors.
>>>>>>>>
>>>>>>>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>>>>>> ---
>>>>>>>>     arch/arm/include/asm/kvm_host.h   | 10 ++++-
>>>>>>>>     arch/arm/include/asm/kvm_mmu.h    |  1 +
>>>>>>>>     arch/arm/kvm/arm.c                | 79 ++++++++++++++++++---------------------
>>>>>>>>     arch/arm/kvm/mmu.c                |  5 +++
>>>>>>>>     arch/arm64/include/asm/kvm_host.h | 16 +++++++-
>>>>>>>>     arch/arm64/include/asm/kvm_mmu.h  |  1 +
>>>>>>>>     arch/arm64/include/asm/virt.h     |  9 +++++
>>>>>>>>     arch/arm64/kvm/hyp-init.S         | 33 ++++++++++++++++
>>>>>>>>     arch/arm64/kvm/hyp.S              | 32 ++++++++++++++--
>>>>>>>>     9 files changed, 138 insertions(+), 48 deletions(-)
>>>>>>>
>>>>>>> [..]
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>     static struct notifier_block hyp_init_cpu_pm_nb = {
>>>>>>>> @@ -1108,11 +1119,6 @@ static int init_hyp_mode(void)
>>>>>>>>            }
>>>>>>>>
>>>>>>>>            /*
>>>>>>>> -        * Execute the init code on each CPU.
>>>>>>>> -        */
>>>>>>>> -       on_each_cpu(cpu_init_hyp_mode, NULL, 1);
>>>>>>>> -
>>>>>>>> -       /*
>>>>>>>>             * Init HYP view of VGIC
>>>>>>>>             */
>>>>>>>>            err = kvm_vgic_hyp_init();
>>>>>>>
>>>>>>> With this flow, the cpu_init_hyp_mode() is called only at VM guest
>>>>>>> creation, but vgic_hyp_init() is called at bootup. On a system with
>>>>>>> GICv3, it looks like we end up with bogus values from the ICH_VTR_EL2
>>>>>>> (to get the number of LRs), because we're not reading it from EL2
>>>>>>> anymore.
>>>>>
>>>>> Thank you for pointing this out.
>>>>> Recently I tested my kdump code on hikey, and as hikey(hi6220) has gic-400,
>>>>> I didn't notice this problem.
>>>>
>>>> Because GIC-400 is a GICv2 implementation, which is entirely MMIO based.
>>>> GICv3 uses some system registers that are only available at EL2, and KVM
>>>> needs some information contained in these registers before being able to
>>>> get initialized.
>>>
>>> I see.
>>>
>>>>>> Indeed, this is completely broken (I just reproduced the issue on a
>>>>>> model). I wish this kind of details had been checked earlier, but thanks
>>>>>> for pointing it out.
>>>>>>
>>>>>>> Whats the best way to fix this?
>>>>>>> - Call kvm_arch_hardware_enable() before vgic_hyp_init() and disable later?
>>>>>>> - Fold the VGIC init stuff back into hardware_enable()?
>>>>>>
>>>>>> None of that works - kvm_arch_hardware_enable() is called once per CPU,
>>>>>> while vgic_hyp_init() can only be called once. Also,
>>>>>> kvm_arch_hardware_enable() is called from interrupt context, and I
>>>>>> wouldn't feel comfortable starting probing DT and allocating stuff from
>>>>>> there.
>>>>>
>>>>> Do you think so?
>>>>> How about the fixup! patch attached below?
>>>>> The point is that, like Ashwin's first idea, we initialize cpus temporarily
>>>>> before kvm_vgic_hyp_init() and then soon reset cpus again. Thus,
>>>>> kvm cpu hotplug will still continue to work as before.
>>>>> Now that cpu_init_hyp_mode() is revived as exactly the same as Marc's
>>>>> original code, the change will not be a big jump.
>>>>
>>>> This seems quite complicated:
>>>> - init EL2 on  all CPUs
>>>> - do some initialization
>>>> - tear all CPUs EL2 down
>>>> - let KVM drive the vectors being set or not
>>>>
>>>> My questions are: why do we need to do this on *all* cpus? Can't that
>>>> work on a single one?
>>>
>>> I did initialize all the cpus partly because using preempt_enable/disable
>>> looked a bit ugly and partly because we may, in the future, do additional
>>> per-cpu initialization in kvm_vgic_hyp_init() and/or kvm_timer_hyp_init().
>>> But if you're comfortable with preempt_*() stuff, I don' care.
>>>
>>>
>>>> Also, the simple fact that we were able to get some junk value is a sign
>>>> that something is amiss. I'd expect a splat of some sort, because we now
>>>> have a possibility of doing things in the wrong context.
>>>>
>>>>>
>>>>> If kvm_hyp_call() in vgic_v3_probe()/kvm_vgic_hyp_init() is a *problem*,
>>>>> I hope this should work. Actually I confirmed that, with this fixup! patch,
>>>>> we could run a kvm guest and also successfully executed kexec on model w/gic-v3.
>>>>>
>>>>> My only concern is the following kernel message I saw when kexec shut down
>>>>> the kernel:
>>>>> (Please note that I was running one kvm quest (pid=961) here.)
>>>>>
>>>>> ===
>>>>> sh-4.3# ./kexec -d -e
>>>>> kexec version: 15.11.16.11.06-g41e52e2
>>>>> arch_process_options:112: command_line: (null)
>>>>> arch_process_options:114: initrd: (null)
>>>>> arch_process_options:115: dtb: (null)
>>>>> arch_process_options:117: port: 0x0
>>>>> kvm: exiting hardware virtualization
>>>>> kvm [961]: Unsupported exception type: 6248304    <== this message
>>>>
>>>> That makes me feel very uncomfortable. It looks like we've exited a
>>>> guest with some horrible value in X0. How is that even possible?
>>>>
>>>> This deserves to be investigated.
>>>
>>> I guess the problem is that cpu tear-down function is called even if a kvm guest
>>> is still running in kvm_arch_vcpu_ioctl_run().
>>> So adding a check whether cpu has been initialized or not in every iteration of
>>> kvm_arch_vcpu_ioctl_run() will, if necessary, terminate a guest safely without entering
>>> a guest mode. Since this check is done while interrupt is disabled, it won't
>>> interfere with kvm_arch_hardware_disable() called via IPI.
>>> See the attached fixup patch.
>>>
>>> Again, I verified the code on model.
>>>
>>> Thanks,
>>> -Takahiro AKASHI
>>>
>>>> Thanks,
>>>>
>>>> 	M.
>>>>
>>>
>>> ----8<----
>>>   From 77f273ba5e0c3dfcf75a5a8d1da8035cc390250c Mon Sep 17 00:00:00 2001
>>> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>> Date: Fri, 11 Dec 2015 13:43:35 +0900
>>> Subject: [PATCH] fixup! arm64: kvm: allows kvm cpu hotplug
>>>
>>> ---
>>>    arch/arm/kvm/arm.c |   45 ++++++++++++++++++++++++++++++++++-----------
>>>    1 file changed, 34 insertions(+), 11 deletions(-)
>>>
>>> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
>>> index 518c3c7..d7e86fb 100644
>>> --- a/arch/arm/kvm/arm.c
>>> +++ b/arch/arm/kvm/arm.c
>>> @@ -573,7 +573,11 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
>>>    		/*
>>>    		 * Re-check atomic conditions
>>>    		 */
>>> -		if (signal_pending(current)) {
>>> +		if (__hyp_get_vectors() == hyp_default_vectors) {
>>> +			/* cpu has been torn down */
>>> +			ret = -ENOEXEC;
>>> +			run->exit_reason = KVM_EXIT_SHUTDOWN;
>>
>>
>> That feels completely overkill (and very slow). Why don't you maintain a
>> per-cpu variable containing the CPU states, which will avoid calling
>> __hyp_get_vectors() all the time? You should be able to reuse that
>> construct everywhere.
> 
> OK. Since I have introduced per-cpu variable, kvm_arm_hardware_enabled, against
> cpuidle issue, we will be able to re-use it.
> 
>> Also, I'm not sure about KVM_EXIT_SHUTDOWN. This looks very x86 specific
>> (called on triple fault).
> 
> No, I don't think so.

maz at approximate:~/Work/arm-platforms$ git grep KVM_EXIT_SHUTDOWN
arch/x86/kvm/svm.c:     kvm_run->exit_reason = KVM_EXIT_SHUTDOWN;
arch/x86/kvm/vmx.c:     vcpu->run->exit_reason = KVM_EXIT_SHUTDOWN;
arch/x86/kvm/x86.c:                     vcpu->run->exit_reason = KVM_EXIT_SHUTDOWN;
include/uapi/linux/kvm.h:#define KVM_EXIT_SHUTDOWN         8

And that's it. No other architecture ever generates this, and this is
an undocumented API. So I'm not going to let that in until someone actually
defines what this thing means.

> Looking at kvm_cpu_exec() in kvm-all.c of qemu, KVM_EXIT_SHUTDOWN
> is handled in a generic way and results in a reset request.

Which is not what we want. We want to indicate that the guest couldn't
be entered. This is not due to a guest doing a triple fault (which is
the way an x86 system gets rebooted).

> On the other hand, KVM_EXIT_FAIL_ENTRY seems more arch-specific.

Certainly arch specific, but actually extremely accurate. You couldn't
enter the guest, and you describe the reason in an architecture-specific
fashion. This is also the only exit code that describe this exact case
we're talking about here.

> In addition, if kvm_vcpu_ioctl() returns a negative value, run->exit_reason
> will never be examined.
> So I think
>     ret -> 0
>     run->exit_reason -> KVM_EXIT_SHUTDOWN

ret = 0
run->exit_reason = KVM_EXIT_FAIL_ENTRY;
run->hardware_entry_failure_reason = (u64)-ENOEXEC;

> or just
>     ret -> -ENOEXEC
> is the best.
> 
> In either way, a guest will have no good chance to gracefully shutdown itself
> because we're kexec'ing (without waiting for threads' termination).

Well, at least userspace gets a chance - and should kexec fail, we have
a chance to recover.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 04/16] arm64: kvm: allows kvm cpu hotplug
@ 2015-12-15  8:45                   ` Marc Zyngier
  0 siblings, 0 replies; 178+ messages in thread
From: Marc Zyngier @ 2015-12-15  8:45 UTC (permalink / raw)
  To: AKASHI Takahiro, Ashwin Chaugule, Geoff Levand
  Cc: Mark Rutland, vikrams, Catalin Marinas, Will Deacon, shankerd,
	linux-arm-kernel, kexec, Christoffer Dall

On 15/12/15 07:51, AKASHI Takahiro wrote:
> On 12/15/2015 02:33 AM, Marc Zyngier wrote:
>> On 14/12/15 07:33, AKASHI Takahiro wrote:
>>> Marc,
>>>
>>> On 12/12/2015 01:28 AM, Marc Zyngier wrote:
>>>> On 11/12/15 08:06, AKASHI Takahiro wrote:
>>>>> Ashwin, Marc,
>>>>>
>>>>> On 12/03/2015 10:58 PM, Marc Zyngier wrote:
>>>>>> On 02/12/15 22:40, Ashwin Chaugule wrote:
>>>>>>> Hello,
>>>>>>>
>>>>>>> On 24 November 2015 at 17:25, Geoff Levand <geoff@infradead.org> wrote:
>>>>>>>> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>>>>>>
>>>>>>>> The current kvm implementation on arm64 does cpu-specific initialization
>>>>>>>> at system boot, and has no way to gracefully shutdown a core in terms of
>>>>>>>> kvm. This prevents, especially, kexec from rebooting the system on a boot
>>>>>>>> core in EL2.
>>>>>>>>
>>>>>>>> This patch adds a cpu tear-down function and also puts an existing cpu-init
>>>>>>>> code into a separate function, kvm_arch_hardware_disable() and
>>>>>>>> kvm_arch_hardware_enable() respectively.
>>>>>>>> We don't need arm64-specific cpu hotplug hook any more.
>>>>>>>>
>>>>>>>> Since this patch modifies common part of code between arm and arm64, one
>>>>>>>> stub definition, __cpu_reset_hyp_mode(), is added on arm side to avoid
>>>>>>>> compiling errors.
>>>>>>>>
>>>>>>>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>>>>>> ---
>>>>>>>>     arch/arm/include/asm/kvm_host.h   | 10 ++++-
>>>>>>>>     arch/arm/include/asm/kvm_mmu.h    |  1 +
>>>>>>>>     arch/arm/kvm/arm.c                | 79 ++++++++++++++++++---------------------
>>>>>>>>     arch/arm/kvm/mmu.c                |  5 +++
>>>>>>>>     arch/arm64/include/asm/kvm_host.h | 16 +++++++-
>>>>>>>>     arch/arm64/include/asm/kvm_mmu.h  |  1 +
>>>>>>>>     arch/arm64/include/asm/virt.h     |  9 +++++
>>>>>>>>     arch/arm64/kvm/hyp-init.S         | 33 ++++++++++++++++
>>>>>>>>     arch/arm64/kvm/hyp.S              | 32 ++++++++++++++--
>>>>>>>>     9 files changed, 138 insertions(+), 48 deletions(-)
>>>>>>>
>>>>>>> [..]
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>     static struct notifier_block hyp_init_cpu_pm_nb = {
>>>>>>>> @@ -1108,11 +1119,6 @@ static int init_hyp_mode(void)
>>>>>>>>            }
>>>>>>>>
>>>>>>>>            /*
>>>>>>>> -        * Execute the init code on each CPU.
>>>>>>>> -        */
>>>>>>>> -       on_each_cpu(cpu_init_hyp_mode, NULL, 1);
>>>>>>>> -
>>>>>>>> -       /*
>>>>>>>>             * Init HYP view of VGIC
>>>>>>>>             */
>>>>>>>>            err = kvm_vgic_hyp_init();
>>>>>>>
>>>>>>> With this flow, the cpu_init_hyp_mode() is called only at VM guest
>>>>>>> creation, but vgic_hyp_init() is called at bootup. On a system with
>>>>>>> GICv3, it looks like we end up with bogus values from the ICH_VTR_EL2
>>>>>>> (to get the number of LRs), because we're not reading it from EL2
>>>>>>> anymore.
>>>>>
>>>>> Thank you for pointing this out.
>>>>> Recently I tested my kdump code on hikey, and as hikey(hi6220) has gic-400,
>>>>> I didn't notice this problem.
>>>>
>>>> Because GIC-400 is a GICv2 implementation, which is entirely MMIO based.
>>>> GICv3 uses some system registers that are only available at EL2, and KVM
>>>> needs some information contained in these registers before being able to
>>>> get initialized.
>>>
>>> I see.
>>>
>>>>>> Indeed, this is completely broken (I just reproduced the issue on a
>>>>>> model). I wish this kind of details had been checked earlier, but thanks
>>>>>> for pointing it out.
>>>>>>
>>>>>>> Whats the best way to fix this?
>>>>>>> - Call kvm_arch_hardware_enable() before vgic_hyp_init() and disable later?
>>>>>>> - Fold the VGIC init stuff back into hardware_enable()?
>>>>>>
>>>>>> None of that works - kvm_arch_hardware_enable() is called once per CPU,
>>>>>> while vgic_hyp_init() can only be called once. Also,
>>>>>> kvm_arch_hardware_enable() is called from interrupt context, and I
>>>>>> wouldn't feel comfortable starting probing DT and allocating stuff from
>>>>>> there.
>>>>>
>>>>> Do you think so?
>>>>> How about the fixup! patch attached below?
>>>>> The point is that, like Ashwin's first idea, we initialize cpus temporarily
>>>>> before kvm_vgic_hyp_init() and then soon reset cpus again. Thus,
>>>>> kvm cpu hotplug will still continue to work as before.
>>>>> Now that cpu_init_hyp_mode() is revived as exactly the same as Marc's
>>>>> original code, the change will not be a big jump.
>>>>
>>>> This seems quite complicated:
>>>> - init EL2 on  all CPUs
>>>> - do some initialization
>>>> - tear all CPUs EL2 down
>>>> - let KVM drive the vectors being set or not
>>>>
>>>> My questions are: why do we need to do this on *all* cpus? Can't that
>>>> work on a single one?
>>>
>>> I did initialize all the cpus partly because using preempt_enable/disable
>>> looked a bit ugly and partly because we may, in the future, do additional
>>> per-cpu initialization in kvm_vgic_hyp_init() and/or kvm_timer_hyp_init().
>>> But if you're comfortable with preempt_*() stuff, I don' care.
>>>
>>>
>>>> Also, the simple fact that we were able to get some junk value is a sign
>>>> that something is amiss. I'd expect a splat of some sort, because we now
>>>> have a possibility of doing things in the wrong context.
>>>>
>>>>>
>>>>> If kvm_hyp_call() in vgic_v3_probe()/kvm_vgic_hyp_init() is a *problem*,
>>>>> I hope this should work. Actually I confirmed that, with this fixup! patch,
>>>>> we could run a kvm guest and also successfully executed kexec on model w/gic-v3.
>>>>>
>>>>> My only concern is the following kernel message I saw when kexec shut down
>>>>> the kernel:
>>>>> (Please note that I was running one kvm quest (pid=961) here.)
>>>>>
>>>>> ===
>>>>> sh-4.3# ./kexec -d -e
>>>>> kexec version: 15.11.16.11.06-g41e52e2
>>>>> arch_process_options:112: command_line: (null)
>>>>> arch_process_options:114: initrd: (null)
>>>>> arch_process_options:115: dtb: (null)
>>>>> arch_process_options:117: port: 0x0
>>>>> kvm: exiting hardware virtualization
>>>>> kvm [961]: Unsupported exception type: 6248304    <== this message
>>>>
>>>> That makes me feel very uncomfortable. It looks like we've exited a
>>>> guest with some horrible value in X0. How is that even possible?
>>>>
>>>> This deserves to be investigated.
>>>
>>> I guess the problem is that cpu tear-down function is called even if a kvm guest
>>> is still running in kvm_arch_vcpu_ioctl_run().
>>> So adding a check whether cpu has been initialized or not in every iteration of
>>> kvm_arch_vcpu_ioctl_run() will, if necessary, terminate a guest safely without entering
>>> a guest mode. Since this check is done while interrupt is disabled, it won't
>>> interfere with kvm_arch_hardware_disable() called via IPI.
>>> See the attached fixup patch.
>>>
>>> Again, I verified the code on model.
>>>
>>> Thanks,
>>> -Takahiro AKASHI
>>>
>>>> Thanks,
>>>>
>>>> 	M.
>>>>
>>>
>>> ----8<----
>>>   From 77f273ba5e0c3dfcf75a5a8d1da8035cc390250c Mon Sep 17 00:00:00 2001
>>> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>> Date: Fri, 11 Dec 2015 13:43:35 +0900
>>> Subject: [PATCH] fixup! arm64: kvm: allows kvm cpu hotplug
>>>
>>> ---
>>>    arch/arm/kvm/arm.c |   45 ++++++++++++++++++++++++++++++++++-----------
>>>    1 file changed, 34 insertions(+), 11 deletions(-)
>>>
>>> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
>>> index 518c3c7..d7e86fb 100644
>>> --- a/arch/arm/kvm/arm.c
>>> +++ b/arch/arm/kvm/arm.c
>>> @@ -573,7 +573,11 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
>>>    		/*
>>>    		 * Re-check atomic conditions
>>>    		 */
>>> -		if (signal_pending(current)) {
>>> +		if (__hyp_get_vectors() == hyp_default_vectors) {
>>> +			/* cpu has been torn down */
>>> +			ret = -ENOEXEC;
>>> +			run->exit_reason = KVM_EXIT_SHUTDOWN;
>>
>>
>> That feels completely overkill (and very slow). Why don't you maintain a
>> per-cpu variable containing the CPU states, which will avoid calling
>> __hyp_get_vectors() all the time? You should be able to reuse that
>> construct everywhere.
> 
> OK. Since I have introduced per-cpu variable, kvm_arm_hardware_enabled, against
> cpuidle issue, we will be able to re-use it.
> 
>> Also, I'm not sure about KVM_EXIT_SHUTDOWN. This looks very x86 specific
>> (called on triple fault).
> 
> No, I don't think so.

maz@approximate:~/Work/arm-platforms$ git grep KVM_EXIT_SHUTDOWN
arch/x86/kvm/svm.c:     kvm_run->exit_reason = KVM_EXIT_SHUTDOWN;
arch/x86/kvm/vmx.c:     vcpu->run->exit_reason = KVM_EXIT_SHUTDOWN;
arch/x86/kvm/x86.c:                     vcpu->run->exit_reason = KVM_EXIT_SHUTDOWN;
include/uapi/linux/kvm.h:#define KVM_EXIT_SHUTDOWN         8

And that's it. No other architecture ever generates this, and this is
an undocumented API. So I'm not going to let that in until someone actually
defines what this thing means.

> Looking at kvm_cpu_exec() in kvm-all.c of qemu, KVM_EXIT_SHUTDOWN
> is handled in a generic way and results in a reset request.

Which is not what we want. We want to indicate that the guest couldn't
be entered. This is not due to a guest doing a triple fault (which is
the way an x86 system gets rebooted).

> On the other hand, KVM_EXIT_FAIL_ENTRY seems more arch-specific.

Certainly arch specific, but actually extremely accurate. You couldn't
enter the guest, and you describe the reason in an architecture-specific
fashion. This is also the only exit code that describe this exact case
we're talking about here.

> In addition, if kvm_vcpu_ioctl() returns a negative value, run->exit_reason
> will never be examined.
> So I think
>     ret -> 0
>     run->exit_reason -> KVM_EXIT_SHUTDOWN

ret = 0
run->exit_reason = KVM_EXIT_FAIL_ENTRY;
run->hardware_entry_failure_reason = (u64)-ENOEXEC;

> or just
>     ret -> -ENOEXEC
> is the best.
> 
> In either way, a guest will have no good chance to gracefully shutdown itself
> because we're kexec'ing (without waiting for threads' termination).

Well, at least userspace gets a chance - and should kexec fail, we have
a chance to recover.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 04/16] arm64: kvm: allows kvm cpu hotplug
  2015-12-11 16:31           ` Will Deacon
@ 2015-12-15  8:48             ` AKASHI Takahiro
  -1 siblings, 0 replies; 178+ messages in thread
From: AKASHI Takahiro @ 2015-12-15  8:48 UTC (permalink / raw)
  To: linux-arm-kernel

Will,

On 12/12/2015 01:31 AM, Will Deacon wrote:
> On Thu, Dec 10, 2015 at 10:31:48AM -0800, Geoff Levand wrote:
>> On Thu, 2015-12-03 at 13:58 +0000, Marc Zyngier wrote:
>>> Indeed, this is completely broken (I just reproduced the issue on a
>>> model).
>>
>> There are users out there that want kexec/kdump support.  I think we
>> should remove this patch from the series, replace it with a temporary
>> Kconfig conditional that allows only one of KVM or kexec to be enabled,
>> and merge kexec and kdump support in for 4.5.  This will satisfy users
>> who need kexec but not KVM, like kexec based bootloaders.  Once this
>> KVM hot plug patch is fixed we then merge it, either for 4.5 or 4.6,
>> depending on the timing.
>>
>> I'll prepare and post a v13 series that does the above.
>
> I'm not really keen on merging this based on a "temporary" Kconfig change
> with a promise to have it resolved in the future. Given how long this
> series has been floating around already, I'm not filled with confidence
> at the prospect of waiting for a fixup patch after it's been merged.
>
> Please address the outstanding review comments before reposting.

I'm working hard on addressing Marc's comment on kvm cpu hotplug, and so
can you please review other *main* part of kexec/kdump patches?
Or can I think that 'no comments' is a good sign of your acceptance?

Thanks,
-Takahiro AKASHI


> Will
>

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 04/16] arm64: kvm: allows kvm cpu hotplug
@ 2015-12-15  8:48             ` AKASHI Takahiro
  0 siblings, 0 replies; 178+ messages in thread
From: AKASHI Takahiro @ 2015-12-15  8:48 UTC (permalink / raw)
  To: Will Deacon, Geoff Levand
  Cc: Mark Rutland, vikrams, Marc Zyngier, Catalin Marinas, kexec,
	shankerd, Christoffer Dall, linux-arm-kernel

Will,

On 12/12/2015 01:31 AM, Will Deacon wrote:
> On Thu, Dec 10, 2015 at 10:31:48AM -0800, Geoff Levand wrote:
>> On Thu, 2015-12-03 at 13:58 +0000, Marc Zyngier wrote:
>>> Indeed, this is completely broken (I just reproduced the issue on a
>>> model).
>>
>> There are users out there that want kexec/kdump support.  I think we
>> should remove this patch from the series, replace it with a temporary
>> Kconfig conditional that allows only one of KVM or kexec to be enabled,
>> and merge kexec and kdump support in for 4.5.  This will satisfy users
>> who need kexec but not KVM, like kexec based bootloaders.  Once this
>> KVM hot plug patch is fixed we then merge it, either for 4.5 or 4.6,
>> depending on the timing.
>>
>> I'll prepare and post a v13 series that does the above.
>
> I'm not really keen on merging this based on a "temporary" Kconfig change
> with a promise to have it resolved in the future. Given how long this
> series has been floating around already, I'm not filled with confidence
> at the prospect of waiting for a fixup patch after it's been merged.
>
> Please address the outstanding review comments before reposting.

I'm working hard on addressing Marc's comment on kvm cpu hotplug, and so
can you please review other *main* part of kexec/kdump patches?
Or can I think that 'no comments' is a good sign of your acceptance?

Thanks,
-Takahiro AKASHI


> Will
>

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 04/16] arm64: kvm: allows kvm cpu hotplug
  2015-12-15  8:45                   ` Marc Zyngier
@ 2015-12-15  9:51                     ` AKASHI Takahiro
  -1 siblings, 0 replies; 178+ messages in thread
From: AKASHI Takahiro @ 2015-12-15  9:51 UTC (permalink / raw)
  To: linux-arm-kernel

On 12/15/2015 05:45 PM, Marc Zyngier wrote:
> On 15/12/15 07:51, AKASHI Takahiro wrote:
>> On 12/15/2015 02:33 AM, Marc Zyngier wrote:
>>> On 14/12/15 07:33, AKASHI Takahiro wrote:
>>>> Marc,
>>>>
>>>> On 12/12/2015 01:28 AM, Marc Zyngier wrote:
>>>>> On 11/12/15 08:06, AKASHI Takahiro wrote:
>>>>>> Ashwin, Marc,
>>>>>>
>>>>>> On 12/03/2015 10:58 PM, Marc Zyngier wrote:
>>>>>>> On 02/12/15 22:40, Ashwin Chaugule wrote:
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> On 24 November 2015 at 17:25, Geoff Levand <geoff@infradead.org> wrote:
>>>>>>>>> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>>>>>>>
>>>>>>>>> The current kvm implementation on arm64 does cpu-specific initialization
>>>>>>>>> at system boot, and has no way to gracefully shutdown a core in terms of
>>>>>>>>> kvm. This prevents, especially, kexec from rebooting the system on a boot
>>>>>>>>> core in EL2.
>>>>>>>>>
>>>>>>>>> This patch adds a cpu tear-down function and also puts an existing cpu-init
>>>>>>>>> code into a separate function, kvm_arch_hardware_disable() and
>>>>>>>>> kvm_arch_hardware_enable() respectively.
>>>>>>>>> We don't need arm64-specific cpu hotplug hook any more.
>>>>>>>>>
>>>>>>>>> Since this patch modifies common part of code between arm and arm64, one
>>>>>>>>> stub definition, __cpu_reset_hyp_mode(), is added on arm side to avoid
>>>>>>>>> compiling errors.
>>>>>>>>>
>>>>>>>>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>>>>>>> ---
>>>>>>>>>      arch/arm/include/asm/kvm_host.h   | 10 ++++-
>>>>>>>>>      arch/arm/include/asm/kvm_mmu.h    |  1 +
>>>>>>>>>      arch/arm/kvm/arm.c                | 79 ++++++++++++++++++---------------------
>>>>>>>>>      arch/arm/kvm/mmu.c                |  5 +++
>>>>>>>>>      arch/arm64/include/asm/kvm_host.h | 16 +++++++-
>>>>>>>>>      arch/arm64/include/asm/kvm_mmu.h  |  1 +
>>>>>>>>>      arch/arm64/include/asm/virt.h     |  9 +++++
>>>>>>>>>      arch/arm64/kvm/hyp-init.S         | 33 ++++++++++++++++
>>>>>>>>>      arch/arm64/kvm/hyp.S              | 32 ++++++++++++++--
>>>>>>>>>      9 files changed, 138 insertions(+), 48 deletions(-)
>>>>>>>>
>>>>>>>> [..]
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>      static struct notifier_block hyp_init_cpu_pm_nb = {
>>>>>>>>> @@ -1108,11 +1119,6 @@ static int init_hyp_mode(void)
>>>>>>>>>             }
>>>>>>>>>
>>>>>>>>>             /*
>>>>>>>>> -        * Execute the init code on each CPU.
>>>>>>>>> -        */
>>>>>>>>> -       on_each_cpu(cpu_init_hyp_mode, NULL, 1);
>>>>>>>>> -
>>>>>>>>> -       /*
>>>>>>>>>              * Init HYP view of VGIC
>>>>>>>>>              */
>>>>>>>>>             err = kvm_vgic_hyp_init();
>>>>>>>>
>>>>>>>> With this flow, the cpu_init_hyp_mode() is called only at VM guest
>>>>>>>> creation, but vgic_hyp_init() is called at bootup. On a system with
>>>>>>>> GICv3, it looks like we end up with bogus values from the ICH_VTR_EL2
>>>>>>>> (to get the number of LRs), because we're not reading it from EL2
>>>>>>>> anymore.
>>>>>>
>>>>>> Thank you for pointing this out.
>>>>>> Recently I tested my kdump code on hikey, and as hikey(hi6220) has gic-400,
>>>>>> I didn't notice this problem.
>>>>>
>>>>> Because GIC-400 is a GICv2 implementation, which is entirely MMIO based.
>>>>> GICv3 uses some system registers that are only available at EL2, and KVM
>>>>> needs some information contained in these registers before being able to
>>>>> get initialized.
>>>>
>>>> I see.
>>>>
>>>>>>> Indeed, this is completely broken (I just reproduced the issue on a
>>>>>>> model). I wish this kind of details had been checked earlier, but thanks
>>>>>>> for pointing it out.
>>>>>>>
>>>>>>>> Whats the best way to fix this?
>>>>>>>> - Call kvm_arch_hardware_enable() before vgic_hyp_init() and disable later?
>>>>>>>> - Fold the VGIC init stuff back into hardware_enable()?
>>>>>>>
>>>>>>> None of that works - kvm_arch_hardware_enable() is called once per CPU,
>>>>>>> while vgic_hyp_init() can only be called once. Also,
>>>>>>> kvm_arch_hardware_enable() is called from interrupt context, and I
>>>>>>> wouldn't feel comfortable starting probing DT and allocating stuff from
>>>>>>> there.
>>>>>>
>>>>>> Do you think so?
>>>>>> How about the fixup! patch attached below?
>>>>>> The point is that, like Ashwin's first idea, we initialize cpus temporarily
>>>>>> before kvm_vgic_hyp_init() and then soon reset cpus again. Thus,
>>>>>> kvm cpu hotplug will still continue to work as before.
>>>>>> Now that cpu_init_hyp_mode() is revived as exactly the same as Marc's
>>>>>> original code, the change will not be a big jump.
>>>>>
>>>>> This seems quite complicated:
>>>>> - init EL2 on  all CPUs
>>>>> - do some initialization
>>>>> - tear all CPUs EL2 down
>>>>> - let KVM drive the vectors being set or not
>>>>>
>>>>> My questions are: why do we need to do this on *all* cpus? Can't that
>>>>> work on a single one?
>>>>
>>>> I did initialize all the cpus partly because using preempt_enable/disable
>>>> looked a bit ugly and partly because we may, in the future, do additional
>>>> per-cpu initialization in kvm_vgic_hyp_init() and/or kvm_timer_hyp_init().
>>>> But if you're comfortable with preempt_*() stuff, I don' care.
>>>>
>>>>
>>>>> Also, the simple fact that we were able to get some junk value is a sign
>>>>> that something is amiss. I'd expect a splat of some sort, because we now
>>>>> have a possibility of doing things in the wrong context.
>>>>>
>>>>>>
>>>>>> If kvm_hyp_call() in vgic_v3_probe()/kvm_vgic_hyp_init() is a *problem*,
>>>>>> I hope this should work. Actually I confirmed that, with this fixup! patch,
>>>>>> we could run a kvm guest and also successfully executed kexec on model w/gic-v3.
>>>>>>
>>>>>> My only concern is the following kernel message I saw when kexec shut down
>>>>>> the kernel:
>>>>>> (Please note that I was running one kvm quest (pid=961) here.)
>>>>>>
>>>>>> ===
>>>>>> sh-4.3# ./kexec -d -e
>>>>>> kexec version: 15.11.16.11.06-g41e52e2
>>>>>> arch_process_options:112: command_line: (null)
>>>>>> arch_process_options:114: initrd: (null)
>>>>>> arch_process_options:115: dtb: (null)
>>>>>> arch_process_options:117: port: 0x0
>>>>>> kvm: exiting hardware virtualization
>>>>>> kvm [961]: Unsupported exception type: 6248304    <== this message
>>>>>
>>>>> That makes me feel very uncomfortable. It looks like we've exited a
>>>>> guest with some horrible value in X0. How is that even possible?
>>>>>
>>>>> This deserves to be investigated.
>>>>
>>>> I guess the problem is that cpu tear-down function is called even if a kvm guest
>>>> is still running in kvm_arch_vcpu_ioctl_run().
>>>> So adding a check whether cpu has been initialized or not in every iteration of
>>>> kvm_arch_vcpu_ioctl_run() will, if necessary, terminate a guest safely without entering
>>>> a guest mode. Since this check is done while interrupt is disabled, it won't
>>>> interfere with kvm_arch_hardware_disable() called via IPI.
>>>> See the attached fixup patch.
>>>>
>>>> Again, I verified the code on model.
>>>>
>>>> Thanks,
>>>> -Takahiro AKASHI
>>>>
>>>>> Thanks,
>>>>>
>>>>> 	M.
>>>>>
>>>>
>>>> ----8<----
>>>>    From 77f273ba5e0c3dfcf75a5a8d1da8035cc390250c Mon Sep 17 00:00:00 2001
>>>> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>> Date: Fri, 11 Dec 2015 13:43:35 +0900
>>>> Subject: [PATCH] fixup! arm64: kvm: allows kvm cpu hotplug
>>>>
>>>> ---
>>>>     arch/arm/kvm/arm.c |   45 ++++++++++++++++++++++++++++++++++-----------
>>>>     1 file changed, 34 insertions(+), 11 deletions(-)
>>>>
>>>> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
>>>> index 518c3c7..d7e86fb 100644
>>>> --- a/arch/arm/kvm/arm.c
>>>> +++ b/arch/arm/kvm/arm.c
>>>> @@ -573,7 +573,11 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
>>>>     		/*
>>>>     		 * Re-check atomic conditions
>>>>     		 */
>>>> -		if (signal_pending(current)) {
>>>> +		if (__hyp_get_vectors() == hyp_default_vectors) {
>>>> +			/* cpu has been torn down */
>>>> +			ret = -ENOEXEC;
>>>> +			run->exit_reason = KVM_EXIT_SHUTDOWN;
>>>
>>>
>>> That feels completely overkill (and very slow). Why don't you maintain a
>>> per-cpu variable containing the CPU states, which will avoid calling
>>> __hyp_get_vectors() all the time? You should be able to reuse that
>>> construct everywhere.
>>
>> OK. Since I have introduced per-cpu variable, kvm_arm_hardware_enabled, against
>> cpuidle issue, we will be able to re-use it.
>>
>>> Also, I'm not sure about KVM_EXIT_SHUTDOWN. This looks very x86 specific
>>> (called on triple fault).
>>
>> No, I don't think so.
>
> maz at approximate:~/Work/arm-platforms$ git grep KVM_EXIT_SHUTDOWN
> arch/x86/kvm/svm.c:     kvm_run->exit_reason = KVM_EXIT_SHUTDOWN;
> arch/x86/kvm/vmx.c:     vcpu->run->exit_reason = KVM_EXIT_SHUTDOWN;
> arch/x86/kvm/x86.c:                     vcpu->run->exit_reason = KVM_EXIT_SHUTDOWN;
> include/uapi/linux/kvm.h:#define KVM_EXIT_SHUTDOWN         8
>
> And that's it. No other architecture ever generates this, and this is
> an undocumented API. So I'm not going to let that in until someone actually
> defines what this thing means.
>
>> Looking at kvm_cpu_exec() in kvm-all.c of qemu, KVM_EXIT_SHUTDOWN
>> is handled in a generic way and results in a reset request.
>
> Which is not what we want. We want to indicate that the guest couldn't
> be entered. This is not due to a guest doing a triple fault (which is
> the way an x86 system gets rebooted).
>
>> On the other hand, KVM_EXIT_FAIL_ENTRY seems more arch-specific.
>
> Certainly arch specific, but actually extremely accurate. You couldn't
> enter the guest, and you describe the reason in an architecture-specific
> fashion. This is also the only exit code that describe this exact case
> we're talking about here.
>
>> In addition, if kvm_vcpu_ioctl() returns a negative value, run->exit_reason
>> will never be examined.
>> So I think
>>      ret -> 0
>>      run->exit_reason -> KVM_EXIT_SHUTDOWN
>
> ret = 0
> run->exit_reason = KVM_EXIT_FAIL_ENTRY;
> run->hardware_entry_failure_reason = (u64)-ENOEXEC;

OK.

>> or just
>>      ret -> -ENOEXEC
>> is the best.
>>
>> In either way, a guest will have no good chance to gracefully shutdown itself
>> because we're kexec'ing (without waiting for threads' termination).
>
> Well, at least userspace gets a chance - and should kexec fail, we have
> a chance to recover.

Well, the current kexec implementation (on arm64) never fails
except very early stage :)

So please review the attached fixup patch, again.

Thanks,
-Takahiro AKASHI


> Thanks,
>
> 	M.
>

----8<----
 From ec6c07fe80d6ba96855468f61daffa9b91cf5622 Mon Sep 17 00:00:00 2001
From: AKASHI Takahiro <takahiro.akashi@linaro.org>
Date: Fri, 11 Dec 2015 13:43:35 +0900
Subject: [PATCH] fixup! arm64: kvm: allows kvm cpu hotplug

---
  arch/arm/kvm/arm.c |   62 +++++++++++++++++++++++++++++++++++-----------------
  1 file changed, 42 insertions(+), 20 deletions(-)

diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index 518c3c7..05eaa35 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -573,7 +573,13 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
  		/*
  		 * Re-check atomic conditions
  		 */
-		if (signal_pending(current)) {
+		if (!__this_cpu_read(kvm_arm_hardware_enabled)) {
+			/* cpu has been torn down */
+			ret = 0;
+			run->exit_reason = KVM_EXIT_FAIL_ENTRY;
+			run->fail_entry.hardware_entry_failure_reason
+					= (u64)-ENOEXEC;
+		} else if (signal_pending(current)) {
  			ret = -EINTR;
  			run->exit_reason = KVM_EXIT_INTR;
  		}
@@ -950,7 +956,7 @@ long kvm_arch_vm_ioctl(struct file *filp,
  	}
  }

-int kvm_arch_hardware_enable(void)
+static void cpu_init_hyp_mode(void)
  {
  	phys_addr_t boot_pgd_ptr;
  	phys_addr_t pgd_ptr;
@@ -958,9 +964,6 @@ int kvm_arch_hardware_enable(void)
  	unsigned long stack_page;
  	unsigned long vector_ptr;

-	if (__hyp_get_vectors() != hyp_default_vectors)
-		return 0;
-
  	/* Switch from the HYP stub to our own HYP init vector */
  	__hyp_set_vectors(kvm_get_idmap_vector());

@@ -973,24 +976,38 @@ int kvm_arch_hardware_enable(void)
  	__cpu_init_hyp_mode(boot_pgd_ptr, pgd_ptr, hyp_stack_ptr, vector_ptr);

  	kvm_arm_init_debug();
-
-	return 0;
  }

-void kvm_arch_hardware_disable(void)
+static void cpu_reset_hyp_mode(void)
  {
  	phys_addr_t boot_pgd_ptr;
  	phys_addr_t phys_idmap_start;

-	if (__hyp_get_vectors() == hyp_default_vectors)
-		return;
-
  	boot_pgd_ptr = kvm_mmu_get_boot_httbr();
  	phys_idmap_start = kvm_get_idmap_start();

  	__cpu_reset_hyp_mode(boot_pgd_ptr, phys_idmap_start);
  }

+int kvm_arch_hardware_enable(void)
+{
+	if (!__this_cpu_read(kvm_arm_hardware_enabled)) {
+		cpu_init_hyp_mode();
+		__this_cpu_write(kvm_arm_hardware_enabled, 1);
+	}
+
+	return 0;
+}
+
+void kvm_arch_hardware_disable(void)
+{
+	if (!__this_cpu_read(kvm_arm_hardware_enabled))
+		return;
+
+	cpu_reset_hyp_mode();
+	__this_cpu_write(kvm_arm_hardware_enabled, 0);
+}
+
  #ifdef CONFIG_CPU_PM
  static int hyp_init_cpu_pm_notifier(struct notifier_block *self,
  				    unsigned long cmd,
@@ -998,19 +1015,13 @@ static int hyp_init_cpu_pm_notifier(struct notifier_block *self,
  {
  	switch (cmd) {
  	case CPU_PM_ENTER:
-		if (__hyp_get_vectors() != hyp_default_vectors)
-			__this_cpu_write(kvm_arm_hardware_enabled, 1);
-		else
-			__this_cpu_write(kvm_arm_hardware_enabled, 0);
-		/*
-		 * don't call kvm_arch_hardware_disable() in case of
-		 * CPU_PM_ENTER because it does't actually save any state.
-		 */
+		if (__this_cpu_read(kvm_arm_hardware_enabled))
+			cpu_reset_hyp_mode();

  		return NOTIFY_OK;
  	case CPU_PM_EXIT:
  		if (__this_cpu_read(kvm_arm_hardware_enabled))
-			kvm_arch_hardware_enable();
+			cpu_init_hyp_mode();

  		return NOTIFY_OK;

@@ -1114,9 +1125,20 @@ static int init_hyp_mode(void)
  	}

  	/*
+	 * Init this CPU temporarily to execute kvm_hyp_call()
+	 * during kvm_vgic_hyp_init().
+	 */
+	preempt_disable();
+	cpu_init_hyp_mode();
+
+	/*
  	 * Init HYP view of VGIC
  	 */
  	err = kvm_vgic_hyp_init();
+
+	cpu_reset_hyp_mode();
+	preempt_enable();
+
  	if (err)
  		goto out_free_context;

-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 04/16] arm64: kvm: allows kvm cpu hotplug
@ 2015-12-15  9:51                     ` AKASHI Takahiro
  0 siblings, 0 replies; 178+ messages in thread
From: AKASHI Takahiro @ 2015-12-15  9:51 UTC (permalink / raw)
  To: Marc Zyngier, Ashwin Chaugule, Geoff Levand
  Cc: Mark Rutland, vikrams, Catalin Marinas, Will Deacon, shankerd,
	linux-arm-kernel, kexec, Christoffer Dall

On 12/15/2015 05:45 PM, Marc Zyngier wrote:
> On 15/12/15 07:51, AKASHI Takahiro wrote:
>> On 12/15/2015 02:33 AM, Marc Zyngier wrote:
>>> On 14/12/15 07:33, AKASHI Takahiro wrote:
>>>> Marc,
>>>>
>>>> On 12/12/2015 01:28 AM, Marc Zyngier wrote:
>>>>> On 11/12/15 08:06, AKASHI Takahiro wrote:
>>>>>> Ashwin, Marc,
>>>>>>
>>>>>> On 12/03/2015 10:58 PM, Marc Zyngier wrote:
>>>>>>> On 02/12/15 22:40, Ashwin Chaugule wrote:
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> On 24 November 2015 at 17:25, Geoff Levand <geoff@infradead.org> wrote:
>>>>>>>>> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>>>>>>>
>>>>>>>>> The current kvm implementation on arm64 does cpu-specific initialization
>>>>>>>>> at system boot, and has no way to gracefully shutdown a core in terms of
>>>>>>>>> kvm. This prevents, especially, kexec from rebooting the system on a boot
>>>>>>>>> core in EL2.
>>>>>>>>>
>>>>>>>>> This patch adds a cpu tear-down function and also puts an existing cpu-init
>>>>>>>>> code into a separate function, kvm_arch_hardware_disable() and
>>>>>>>>> kvm_arch_hardware_enable() respectively.
>>>>>>>>> We don't need arm64-specific cpu hotplug hook any more.
>>>>>>>>>
>>>>>>>>> Since this patch modifies common part of code between arm and arm64, one
>>>>>>>>> stub definition, __cpu_reset_hyp_mode(), is added on arm side to avoid
>>>>>>>>> compiling errors.
>>>>>>>>>
>>>>>>>>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>>>>>>> ---
>>>>>>>>>      arch/arm/include/asm/kvm_host.h   | 10 ++++-
>>>>>>>>>      arch/arm/include/asm/kvm_mmu.h    |  1 +
>>>>>>>>>      arch/arm/kvm/arm.c                | 79 ++++++++++++++++++---------------------
>>>>>>>>>      arch/arm/kvm/mmu.c                |  5 +++
>>>>>>>>>      arch/arm64/include/asm/kvm_host.h | 16 +++++++-
>>>>>>>>>      arch/arm64/include/asm/kvm_mmu.h  |  1 +
>>>>>>>>>      arch/arm64/include/asm/virt.h     |  9 +++++
>>>>>>>>>      arch/arm64/kvm/hyp-init.S         | 33 ++++++++++++++++
>>>>>>>>>      arch/arm64/kvm/hyp.S              | 32 ++++++++++++++--
>>>>>>>>>      9 files changed, 138 insertions(+), 48 deletions(-)
>>>>>>>>
>>>>>>>> [..]
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>      static struct notifier_block hyp_init_cpu_pm_nb = {
>>>>>>>>> @@ -1108,11 +1119,6 @@ static int init_hyp_mode(void)
>>>>>>>>>             }
>>>>>>>>>
>>>>>>>>>             /*
>>>>>>>>> -        * Execute the init code on each CPU.
>>>>>>>>> -        */
>>>>>>>>> -       on_each_cpu(cpu_init_hyp_mode, NULL, 1);
>>>>>>>>> -
>>>>>>>>> -       /*
>>>>>>>>>              * Init HYP view of VGIC
>>>>>>>>>              */
>>>>>>>>>             err = kvm_vgic_hyp_init();
>>>>>>>>
>>>>>>>> With this flow, the cpu_init_hyp_mode() is called only at VM guest
>>>>>>>> creation, but vgic_hyp_init() is called at bootup. On a system with
>>>>>>>> GICv3, it looks like we end up with bogus values from the ICH_VTR_EL2
>>>>>>>> (to get the number of LRs), because we're not reading it from EL2
>>>>>>>> anymore.
>>>>>>
>>>>>> Thank you for pointing this out.
>>>>>> Recently I tested my kdump code on hikey, and as hikey(hi6220) has gic-400,
>>>>>> I didn't notice this problem.
>>>>>
>>>>> Because GIC-400 is a GICv2 implementation, which is entirely MMIO based.
>>>>> GICv3 uses some system registers that are only available at EL2, and KVM
>>>>> needs some information contained in these registers before being able to
>>>>> get initialized.
>>>>
>>>> I see.
>>>>
>>>>>>> Indeed, this is completely broken (I just reproduced the issue on a
>>>>>>> model). I wish this kind of details had been checked earlier, but thanks
>>>>>>> for pointing it out.
>>>>>>>
>>>>>>>> Whats the best way to fix this?
>>>>>>>> - Call kvm_arch_hardware_enable() before vgic_hyp_init() and disable later?
>>>>>>>> - Fold the VGIC init stuff back into hardware_enable()?
>>>>>>>
>>>>>>> None of that works - kvm_arch_hardware_enable() is called once per CPU,
>>>>>>> while vgic_hyp_init() can only be called once. Also,
>>>>>>> kvm_arch_hardware_enable() is called from interrupt context, and I
>>>>>>> wouldn't feel comfortable starting probing DT and allocating stuff from
>>>>>>> there.
>>>>>>
>>>>>> Do you think so?
>>>>>> How about the fixup! patch attached below?
>>>>>> The point is that, like Ashwin's first idea, we initialize cpus temporarily
>>>>>> before kvm_vgic_hyp_init() and then soon reset cpus again. Thus,
>>>>>> kvm cpu hotplug will still continue to work as before.
>>>>>> Now that cpu_init_hyp_mode() is revived as exactly the same as Marc's
>>>>>> original code, the change will not be a big jump.
>>>>>
>>>>> This seems quite complicated:
>>>>> - init EL2 on  all CPUs
>>>>> - do some initialization
>>>>> - tear all CPUs EL2 down
>>>>> - let KVM drive the vectors being set or not
>>>>>
>>>>> My questions are: why do we need to do this on *all* cpus? Can't that
>>>>> work on a single one?
>>>>
>>>> I did initialize all the cpus partly because using preempt_enable/disable
>>>> looked a bit ugly and partly because we may, in the future, do additional
>>>> per-cpu initialization in kvm_vgic_hyp_init() and/or kvm_timer_hyp_init().
>>>> But if you're comfortable with preempt_*() stuff, I don' care.
>>>>
>>>>
>>>>> Also, the simple fact that we were able to get some junk value is a sign
>>>>> that something is amiss. I'd expect a splat of some sort, because we now
>>>>> have a possibility of doing things in the wrong context.
>>>>>
>>>>>>
>>>>>> If kvm_hyp_call() in vgic_v3_probe()/kvm_vgic_hyp_init() is a *problem*,
>>>>>> I hope this should work. Actually I confirmed that, with this fixup! patch,
>>>>>> we could run a kvm guest and also successfully executed kexec on model w/gic-v3.
>>>>>>
>>>>>> My only concern is the following kernel message I saw when kexec shut down
>>>>>> the kernel:
>>>>>> (Please note that I was running one kvm quest (pid=961) here.)
>>>>>>
>>>>>> ===
>>>>>> sh-4.3# ./kexec -d -e
>>>>>> kexec version: 15.11.16.11.06-g41e52e2
>>>>>> arch_process_options:112: command_line: (null)
>>>>>> arch_process_options:114: initrd: (null)
>>>>>> arch_process_options:115: dtb: (null)
>>>>>> arch_process_options:117: port: 0x0
>>>>>> kvm: exiting hardware virtualization
>>>>>> kvm [961]: Unsupported exception type: 6248304    <== this message
>>>>>
>>>>> That makes me feel very uncomfortable. It looks like we've exited a
>>>>> guest with some horrible value in X0. How is that even possible?
>>>>>
>>>>> This deserves to be investigated.
>>>>
>>>> I guess the problem is that cpu tear-down function is called even if a kvm guest
>>>> is still running in kvm_arch_vcpu_ioctl_run().
>>>> So adding a check whether cpu has been initialized or not in every iteration of
>>>> kvm_arch_vcpu_ioctl_run() will, if necessary, terminate a guest safely without entering
>>>> a guest mode. Since this check is done while interrupt is disabled, it won't
>>>> interfere with kvm_arch_hardware_disable() called via IPI.
>>>> See the attached fixup patch.
>>>>
>>>> Again, I verified the code on model.
>>>>
>>>> Thanks,
>>>> -Takahiro AKASHI
>>>>
>>>>> Thanks,
>>>>>
>>>>> 	M.
>>>>>
>>>>
>>>> ----8<----
>>>>    From 77f273ba5e0c3dfcf75a5a8d1da8035cc390250c Mon Sep 17 00:00:00 2001
>>>> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>> Date: Fri, 11 Dec 2015 13:43:35 +0900
>>>> Subject: [PATCH] fixup! arm64: kvm: allows kvm cpu hotplug
>>>>
>>>> ---
>>>>     arch/arm/kvm/arm.c |   45 ++++++++++++++++++++++++++++++++++-----------
>>>>     1 file changed, 34 insertions(+), 11 deletions(-)
>>>>
>>>> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
>>>> index 518c3c7..d7e86fb 100644
>>>> --- a/arch/arm/kvm/arm.c
>>>> +++ b/arch/arm/kvm/arm.c
>>>> @@ -573,7 +573,11 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
>>>>     		/*
>>>>     		 * Re-check atomic conditions
>>>>     		 */
>>>> -		if (signal_pending(current)) {
>>>> +		if (__hyp_get_vectors() == hyp_default_vectors) {
>>>> +			/* cpu has been torn down */
>>>> +			ret = -ENOEXEC;
>>>> +			run->exit_reason = KVM_EXIT_SHUTDOWN;
>>>
>>>
>>> That feels completely overkill (and very slow). Why don't you maintain a
>>> per-cpu variable containing the CPU states, which will avoid calling
>>> __hyp_get_vectors() all the time? You should be able to reuse that
>>> construct everywhere.
>>
>> OK. Since I have introduced per-cpu variable, kvm_arm_hardware_enabled, against
>> cpuidle issue, we will be able to re-use it.
>>
>>> Also, I'm not sure about KVM_EXIT_SHUTDOWN. This looks very x86 specific
>>> (called on triple fault).
>>
>> No, I don't think so.
>
> maz@approximate:~/Work/arm-platforms$ git grep KVM_EXIT_SHUTDOWN
> arch/x86/kvm/svm.c:     kvm_run->exit_reason = KVM_EXIT_SHUTDOWN;
> arch/x86/kvm/vmx.c:     vcpu->run->exit_reason = KVM_EXIT_SHUTDOWN;
> arch/x86/kvm/x86.c:                     vcpu->run->exit_reason = KVM_EXIT_SHUTDOWN;
> include/uapi/linux/kvm.h:#define KVM_EXIT_SHUTDOWN         8
>
> And that's it. No other architecture ever generates this, and this is
> an undocumented API. So I'm not going to let that in until someone actually
> defines what this thing means.
>
>> Looking at kvm_cpu_exec() in kvm-all.c of qemu, KVM_EXIT_SHUTDOWN
>> is handled in a generic way and results in a reset request.
>
> Which is not what we want. We want to indicate that the guest couldn't
> be entered. This is not due to a guest doing a triple fault (which is
> the way an x86 system gets rebooted).
>
>> On the other hand, KVM_EXIT_FAIL_ENTRY seems more arch-specific.
>
> Certainly arch specific, but actually extremely accurate. You couldn't
> enter the guest, and you describe the reason in an architecture-specific
> fashion. This is also the only exit code that describe this exact case
> we're talking about here.
>
>> In addition, if kvm_vcpu_ioctl() returns a negative value, run->exit_reason
>> will never be examined.
>> So I think
>>      ret -> 0
>>      run->exit_reason -> KVM_EXIT_SHUTDOWN
>
> ret = 0
> run->exit_reason = KVM_EXIT_FAIL_ENTRY;
> run->hardware_entry_failure_reason = (u64)-ENOEXEC;

OK.

>> or just
>>      ret -> -ENOEXEC
>> is the best.
>>
>> In either way, a guest will have no good chance to gracefully shutdown itself
>> because we're kexec'ing (without waiting for threads' termination).
>
> Well, at least userspace gets a chance - and should kexec fail, we have
> a chance to recover.

Well, the current kexec implementation (on arm64) never fails
except very early stage :)

So please review the attached fixup patch, again.

Thanks,
-Takahiro AKASHI


> Thanks,
>
> 	M.
>

----8<----
 From ec6c07fe80d6ba96855468f61daffa9b91cf5622 Mon Sep 17 00:00:00 2001
From: AKASHI Takahiro <takahiro.akashi@linaro.org>
Date: Fri, 11 Dec 2015 13:43:35 +0900
Subject: [PATCH] fixup! arm64: kvm: allows kvm cpu hotplug

---
  arch/arm/kvm/arm.c |   62 +++++++++++++++++++++++++++++++++++-----------------
  1 file changed, 42 insertions(+), 20 deletions(-)

diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index 518c3c7..05eaa35 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -573,7 +573,13 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
  		/*
  		 * Re-check atomic conditions
  		 */
-		if (signal_pending(current)) {
+		if (!__this_cpu_read(kvm_arm_hardware_enabled)) {
+			/* cpu has been torn down */
+			ret = 0;
+			run->exit_reason = KVM_EXIT_FAIL_ENTRY;
+			run->fail_entry.hardware_entry_failure_reason
+					= (u64)-ENOEXEC;
+		} else if (signal_pending(current)) {
  			ret = -EINTR;
  			run->exit_reason = KVM_EXIT_INTR;
  		}
@@ -950,7 +956,7 @@ long kvm_arch_vm_ioctl(struct file *filp,
  	}
  }

-int kvm_arch_hardware_enable(void)
+static void cpu_init_hyp_mode(void)
  {
  	phys_addr_t boot_pgd_ptr;
  	phys_addr_t pgd_ptr;
@@ -958,9 +964,6 @@ int kvm_arch_hardware_enable(void)
  	unsigned long stack_page;
  	unsigned long vector_ptr;

-	if (__hyp_get_vectors() != hyp_default_vectors)
-		return 0;
-
  	/* Switch from the HYP stub to our own HYP init vector */
  	__hyp_set_vectors(kvm_get_idmap_vector());

@@ -973,24 +976,38 @@ int kvm_arch_hardware_enable(void)
  	__cpu_init_hyp_mode(boot_pgd_ptr, pgd_ptr, hyp_stack_ptr, vector_ptr);

  	kvm_arm_init_debug();
-
-	return 0;
  }

-void kvm_arch_hardware_disable(void)
+static void cpu_reset_hyp_mode(void)
  {
  	phys_addr_t boot_pgd_ptr;
  	phys_addr_t phys_idmap_start;

-	if (__hyp_get_vectors() == hyp_default_vectors)
-		return;
-
  	boot_pgd_ptr = kvm_mmu_get_boot_httbr();
  	phys_idmap_start = kvm_get_idmap_start();

  	__cpu_reset_hyp_mode(boot_pgd_ptr, phys_idmap_start);
  }

+int kvm_arch_hardware_enable(void)
+{
+	if (!__this_cpu_read(kvm_arm_hardware_enabled)) {
+		cpu_init_hyp_mode();
+		__this_cpu_write(kvm_arm_hardware_enabled, 1);
+	}
+
+	return 0;
+}
+
+void kvm_arch_hardware_disable(void)
+{
+	if (!__this_cpu_read(kvm_arm_hardware_enabled))
+		return;
+
+	cpu_reset_hyp_mode();
+	__this_cpu_write(kvm_arm_hardware_enabled, 0);
+}
+
  #ifdef CONFIG_CPU_PM
  static int hyp_init_cpu_pm_notifier(struct notifier_block *self,
  				    unsigned long cmd,
@@ -998,19 +1015,13 @@ static int hyp_init_cpu_pm_notifier(struct notifier_block *self,
  {
  	switch (cmd) {
  	case CPU_PM_ENTER:
-		if (__hyp_get_vectors() != hyp_default_vectors)
-			__this_cpu_write(kvm_arm_hardware_enabled, 1);
-		else
-			__this_cpu_write(kvm_arm_hardware_enabled, 0);
-		/*
-		 * don't call kvm_arch_hardware_disable() in case of
-		 * CPU_PM_ENTER because it does't actually save any state.
-		 */
+		if (__this_cpu_read(kvm_arm_hardware_enabled))
+			cpu_reset_hyp_mode();

  		return NOTIFY_OK;
  	case CPU_PM_EXIT:
  		if (__this_cpu_read(kvm_arm_hardware_enabled))
-			kvm_arch_hardware_enable();
+			cpu_init_hyp_mode();

  		return NOTIFY_OK;

@@ -1114,9 +1125,20 @@ static int init_hyp_mode(void)
  	}

  	/*
+	 * Init this CPU temporarily to execute kvm_hyp_call()
+	 * during kvm_vgic_hyp_init().
+	 */
+	preempt_disable();
+	cpu_init_hyp_mode();
+
+	/*
  	 * Init HYP view of VGIC
  	 */
  	err = kvm_vgic_hyp_init();
+
+	cpu_reset_hyp_mode();
+	preempt_enable();
+
  	if (err)
  		goto out_free_context;

-- 
1.7.9.5



_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply related	[flat|nested] 178+ messages in thread

* [PATCH v12 04/16] arm64: kvm: allows kvm cpu hotplug
  2015-12-15  9:51                     ` AKASHI Takahiro
@ 2015-12-15 10:13                       ` Marc Zyngier
  -1 siblings, 0 replies; 178+ messages in thread
From: Marc Zyngier @ 2015-12-15 10:13 UTC (permalink / raw)
  To: linux-arm-kernel

On 15/12/15 09:51, AKASHI Takahiro wrote:
> On 12/15/2015 05:45 PM, Marc Zyngier wrote:
>> On 15/12/15 07:51, AKASHI Takahiro wrote:
>>> On 12/15/2015 02:33 AM, Marc Zyngier wrote:
>>>> On 14/12/15 07:33, AKASHI Takahiro wrote:
>>>>> Marc,
>>>>>
>>>>> On 12/12/2015 01:28 AM, Marc Zyngier wrote:
>>>>>> On 11/12/15 08:06, AKASHI Takahiro wrote:
>>>>>>> Ashwin, Marc,
>>>>>>>
>>>>>>> On 12/03/2015 10:58 PM, Marc Zyngier wrote:
>>>>>>>> On 02/12/15 22:40, Ashwin Chaugule wrote:
>>>>>>>>> Hello,
>>>>>>>>>
>>>>>>>>> On 24 November 2015 at 17:25, Geoff Levand <geoff@infradead.org> wrote:
>>>>>>>>>> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>>>>>>>>
>>>>>>>>>> The current kvm implementation on arm64 does cpu-specific initialization
>>>>>>>>>> at system boot, and has no way to gracefully shutdown a core in terms of
>>>>>>>>>> kvm. This prevents, especially, kexec from rebooting the system on a boot
>>>>>>>>>> core in EL2.
>>>>>>>>>>
>>>>>>>>>> This patch adds a cpu tear-down function and also puts an existing cpu-init
>>>>>>>>>> code into a separate function, kvm_arch_hardware_disable() and
>>>>>>>>>> kvm_arch_hardware_enable() respectively.
>>>>>>>>>> We don't need arm64-specific cpu hotplug hook any more.
>>>>>>>>>>
>>>>>>>>>> Since this patch modifies common part of code between arm and arm64, one
>>>>>>>>>> stub definition, __cpu_reset_hyp_mode(), is added on arm side to avoid
>>>>>>>>>> compiling errors.
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>>>>>>>> ---
>>>>>>>>>>      arch/arm/include/asm/kvm_host.h   | 10 ++++-
>>>>>>>>>>      arch/arm/include/asm/kvm_mmu.h    |  1 +
>>>>>>>>>>      arch/arm/kvm/arm.c                | 79 ++++++++++++++++++---------------------
>>>>>>>>>>      arch/arm/kvm/mmu.c                |  5 +++
>>>>>>>>>>      arch/arm64/include/asm/kvm_host.h | 16 +++++++-
>>>>>>>>>>      arch/arm64/include/asm/kvm_mmu.h  |  1 +
>>>>>>>>>>      arch/arm64/include/asm/virt.h     |  9 +++++
>>>>>>>>>>      arch/arm64/kvm/hyp-init.S         | 33 ++++++++++++++++
>>>>>>>>>>      arch/arm64/kvm/hyp.S              | 32 ++++++++++++++--
>>>>>>>>>>      9 files changed, 138 insertions(+), 48 deletions(-)
>>>>>>>>>
>>>>>>>>> [..]
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>      static struct notifier_block hyp_init_cpu_pm_nb = {
>>>>>>>>>> @@ -1108,11 +1119,6 @@ static int init_hyp_mode(void)
>>>>>>>>>>             }
>>>>>>>>>>
>>>>>>>>>>             /*
>>>>>>>>>> -        * Execute the init code on each CPU.
>>>>>>>>>> -        */
>>>>>>>>>> -       on_each_cpu(cpu_init_hyp_mode, NULL, 1);
>>>>>>>>>> -
>>>>>>>>>> -       /*
>>>>>>>>>>              * Init HYP view of VGIC
>>>>>>>>>>              */
>>>>>>>>>>             err = kvm_vgic_hyp_init();
>>>>>>>>>
>>>>>>>>> With this flow, the cpu_init_hyp_mode() is called only at VM guest
>>>>>>>>> creation, but vgic_hyp_init() is called at bootup. On a system with
>>>>>>>>> GICv3, it looks like we end up with bogus values from the ICH_VTR_EL2
>>>>>>>>> (to get the number of LRs), because we're not reading it from EL2
>>>>>>>>> anymore.
>>>>>>>
>>>>>>> Thank you for pointing this out.
>>>>>>> Recently I tested my kdump code on hikey, and as hikey(hi6220) has gic-400,
>>>>>>> I didn't notice this problem.
>>>>>>
>>>>>> Because GIC-400 is a GICv2 implementation, which is entirely MMIO based.
>>>>>> GICv3 uses some system registers that are only available at EL2, and KVM
>>>>>> needs some information contained in these registers before being able to
>>>>>> get initialized.
>>>>>
>>>>> I see.
>>>>>
>>>>>>>> Indeed, this is completely broken (I just reproduced the issue on a
>>>>>>>> model). I wish this kind of details had been checked earlier, but thanks
>>>>>>>> for pointing it out.
>>>>>>>>
>>>>>>>>> Whats the best way to fix this?
>>>>>>>>> - Call kvm_arch_hardware_enable() before vgic_hyp_init() and disable later?
>>>>>>>>> - Fold the VGIC init stuff back into hardware_enable()?
>>>>>>>>
>>>>>>>> None of that works - kvm_arch_hardware_enable() is called once per CPU,
>>>>>>>> while vgic_hyp_init() can only be called once. Also,
>>>>>>>> kvm_arch_hardware_enable() is called from interrupt context, and I
>>>>>>>> wouldn't feel comfortable starting probing DT and allocating stuff from
>>>>>>>> there.
>>>>>>>
>>>>>>> Do you think so?
>>>>>>> How about the fixup! patch attached below?
>>>>>>> The point is that, like Ashwin's first idea, we initialize cpus temporarily
>>>>>>> before kvm_vgic_hyp_init() and then soon reset cpus again. Thus,
>>>>>>> kvm cpu hotplug will still continue to work as before.
>>>>>>> Now that cpu_init_hyp_mode() is revived as exactly the same as Marc's
>>>>>>> original code, the change will not be a big jump.
>>>>>>
>>>>>> This seems quite complicated:
>>>>>> - init EL2 on  all CPUs
>>>>>> - do some initialization
>>>>>> - tear all CPUs EL2 down
>>>>>> - let KVM drive the vectors being set or not
>>>>>>
>>>>>> My questions are: why do we need to do this on *all* cpus? Can't that
>>>>>> work on a single one?
>>>>>
>>>>> I did initialize all the cpus partly because using preempt_enable/disable
>>>>> looked a bit ugly and partly because we may, in the future, do additional
>>>>> per-cpu initialization in kvm_vgic_hyp_init() and/or kvm_timer_hyp_init().
>>>>> But if you're comfortable with preempt_*() stuff, I don' care.
>>>>>
>>>>>
>>>>>> Also, the simple fact that we were able to get some junk value is a sign
>>>>>> that something is amiss. I'd expect a splat of some sort, because we now
>>>>>> have a possibility of doing things in the wrong context.
>>>>>>
>>>>>>>
>>>>>>> If kvm_hyp_call() in vgic_v3_probe()/kvm_vgic_hyp_init() is a *problem*,
>>>>>>> I hope this should work. Actually I confirmed that, with this fixup! patch,
>>>>>>> we could run a kvm guest and also successfully executed kexec on model w/gic-v3.
>>>>>>>
>>>>>>> My only concern is the following kernel message I saw when kexec shut down
>>>>>>> the kernel:
>>>>>>> (Please note that I was running one kvm quest (pid=961) here.)
>>>>>>>
>>>>>>> ===
>>>>>>> sh-4.3# ./kexec -d -e
>>>>>>> kexec version: 15.11.16.11.06-g41e52e2
>>>>>>> arch_process_options:112: command_line: (null)
>>>>>>> arch_process_options:114: initrd: (null)
>>>>>>> arch_process_options:115: dtb: (null)
>>>>>>> arch_process_options:117: port: 0x0
>>>>>>> kvm: exiting hardware virtualization
>>>>>>> kvm [961]: Unsupported exception type: 6248304    <== this message
>>>>>>
>>>>>> That makes me feel very uncomfortable. It looks like we've exited a
>>>>>> guest with some horrible value in X0. How is that even possible?
>>>>>>
>>>>>> This deserves to be investigated.
>>>>>
>>>>> I guess the problem is that cpu tear-down function is called even if a kvm guest
>>>>> is still running in kvm_arch_vcpu_ioctl_run().
>>>>> So adding a check whether cpu has been initialized or not in every iteration of
>>>>> kvm_arch_vcpu_ioctl_run() will, if necessary, terminate a guest safely without entering
>>>>> a guest mode. Since this check is done while interrupt is disabled, it won't
>>>>> interfere with kvm_arch_hardware_disable() called via IPI.
>>>>> See the attached fixup patch.
>>>>>
>>>>> Again, I verified the code on model.
>>>>>
>>>>> Thanks,
>>>>> -Takahiro AKASHI
>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> 	M.
>>>>>>
>>>>>
>>>>> ----8<----
>>>>>    From 77f273ba5e0c3dfcf75a5a8d1da8035cc390250c Mon Sep 17 00:00:00 2001
>>>>> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>>> Date: Fri, 11 Dec 2015 13:43:35 +0900
>>>>> Subject: [PATCH] fixup! arm64: kvm: allows kvm cpu hotplug
>>>>>
>>>>> ---
>>>>>     arch/arm/kvm/arm.c |   45 ++++++++++++++++++++++++++++++++++-----------
>>>>>     1 file changed, 34 insertions(+), 11 deletions(-)
>>>>>
>>>>> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
>>>>> index 518c3c7..d7e86fb 100644
>>>>> --- a/arch/arm/kvm/arm.c
>>>>> +++ b/arch/arm/kvm/arm.c
>>>>> @@ -573,7 +573,11 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
>>>>>     		/*
>>>>>     		 * Re-check atomic conditions
>>>>>     		 */
>>>>> -		if (signal_pending(current)) {
>>>>> +		if (__hyp_get_vectors() == hyp_default_vectors) {
>>>>> +			/* cpu has been torn down */
>>>>> +			ret = -ENOEXEC;
>>>>> +			run->exit_reason = KVM_EXIT_SHUTDOWN;
>>>>
>>>>
>>>> That feels completely overkill (and very slow). Why don't you maintain a
>>>> per-cpu variable containing the CPU states, which will avoid calling
>>>> __hyp_get_vectors() all the time? You should be able to reuse that
>>>> construct everywhere.
>>>
>>> OK. Since I have introduced per-cpu variable, kvm_arm_hardware_enabled, against
>>> cpuidle issue, we will be able to re-use it.
>>>
>>>> Also, I'm not sure about KVM_EXIT_SHUTDOWN. This looks very x86 specific
>>>> (called on triple fault).
>>>
>>> No, I don't think so.
>>
>> maz at approximate:~/Work/arm-platforms$ git grep KVM_EXIT_SHUTDOWN
>> arch/x86/kvm/svm.c:     kvm_run->exit_reason = KVM_EXIT_SHUTDOWN;
>> arch/x86/kvm/vmx.c:     vcpu->run->exit_reason = KVM_EXIT_SHUTDOWN;
>> arch/x86/kvm/x86.c:                     vcpu->run->exit_reason = KVM_EXIT_SHUTDOWN;
>> include/uapi/linux/kvm.h:#define KVM_EXIT_SHUTDOWN         8
>>
>> And that's it. No other architecture ever generates this, and this is
>> an undocumented API. So I'm not going to let that in until someone actually
>> defines what this thing means.
>>
>>> Looking at kvm_cpu_exec() in kvm-all.c of qemu, KVM_EXIT_SHUTDOWN
>>> is handled in a generic way and results in a reset request.
>>
>> Which is not what we want. We want to indicate that the guest couldn't
>> be entered. This is not due to a guest doing a triple fault (which is
>> the way an x86 system gets rebooted).
>>
>>> On the other hand, KVM_EXIT_FAIL_ENTRY seems more arch-specific.
>>
>> Certainly arch specific, but actually extremely accurate. You couldn't
>> enter the guest, and you describe the reason in an architecture-specific
>> fashion. This is also the only exit code that describe this exact case
>> we're talking about here.
>>
>>> In addition, if kvm_vcpu_ioctl() returns a negative value, run->exit_reason
>>> will never be examined.
>>> So I think
>>>      ret -> 0
>>>      run->exit_reason -> KVM_EXIT_SHUTDOWN
>>
>> ret = 0
>> run->exit_reason = KVM_EXIT_FAIL_ENTRY;
>> run->hardware_entry_failure_reason = (u64)-ENOEXEC;
> 
> OK.
> 
>>> or just
>>>      ret -> -ENOEXEC
>>> is the best.
>>>
>>> In either way, a guest will have no good chance to gracefully shutdown itself
>>> because we're kexec'ing (without waiting for threads' termination).
>>
>> Well, at least userspace gets a chance - and should kexec fail, we have
>> a chance to recover.
> 
> Well, the current kexec implementation (on arm64) never fails
> except very early stage :)
> 
> So please review the attached fixup patch, again.
> 
> Thanks,
> -Takahiro AKASHI
> 
> 
>> Thanks,
>>
>> 	M.
>>
> 
> ----8<----
>  From ec6c07fe80d6ba96855468f61daffa9b91cf5622 Mon Sep 17 00:00:00 2001
> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
> Date: Fri, 11 Dec 2015 13:43:35 +0900
> Subject: [PATCH] fixup! arm64: kvm: allows kvm cpu hotplug
> 
> ---
>   arch/arm/kvm/arm.c |   62 +++++++++++++++++++++++++++++++++++-----------------
>   1 file changed, 42 insertions(+), 20 deletions(-)
> 
> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
> index 518c3c7..05eaa35 100644
> --- a/arch/arm/kvm/arm.c
> +++ b/arch/arm/kvm/arm.c
> @@ -573,7 +573,13 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
>   		/*
>   		 * Re-check atomic conditions
>   		 */
> -		if (signal_pending(current)) {
> +		if (!__this_cpu_read(kvm_arm_hardware_enabled)) {

if (unlikely(...))

> +			/* cpu has been torn down */
> +			ret = 0;
> +			run->exit_reason = KVM_EXIT_FAIL_ENTRY;
> +			run->fail_entry.hardware_entry_failure_reason
> +					= (u64)-ENOEXEC;
> +		} else if (signal_pending(current)) {
>   			ret = -EINTR;
>   			run->exit_reason = KVM_EXIT_INTR;
>   		}
> @@ -950,7 +956,7 @@ long kvm_arch_vm_ioctl(struct file *filp,
>   	}
>   }
> 
> -int kvm_arch_hardware_enable(void)
> +static void cpu_init_hyp_mode(void)
>   {
>   	phys_addr_t boot_pgd_ptr;
>   	phys_addr_t pgd_ptr;
> @@ -958,9 +964,6 @@ int kvm_arch_hardware_enable(void)
>   	unsigned long stack_page;
>   	unsigned long vector_ptr;
> 
> -	if (__hyp_get_vectors() != hyp_default_vectors)
> -		return 0;
> -
>   	/* Switch from the HYP stub to our own HYP init vector */
>   	__hyp_set_vectors(kvm_get_idmap_vector());
> 
> @@ -973,24 +976,38 @@ int kvm_arch_hardware_enable(void)
>   	__cpu_init_hyp_mode(boot_pgd_ptr, pgd_ptr, hyp_stack_ptr, vector_ptr);
> 
>   	kvm_arm_init_debug();
> -
> -	return 0;
>   }
> 
> -void kvm_arch_hardware_disable(void)
> +static void cpu_reset_hyp_mode(void)
>   {
>   	phys_addr_t boot_pgd_ptr;
>   	phys_addr_t phys_idmap_start;
> 
> -	if (__hyp_get_vectors() == hyp_default_vectors)
> -		return;
> -
>   	boot_pgd_ptr = kvm_mmu_get_boot_httbr();
>   	phys_idmap_start = kvm_get_idmap_start();
> 
>   	__cpu_reset_hyp_mode(boot_pgd_ptr, phys_idmap_start);
>   }
> 
> +int kvm_arch_hardware_enable(void)
> +{
> +	if (!__this_cpu_read(kvm_arm_hardware_enabled)) {
> +		cpu_init_hyp_mode();
> +		__this_cpu_write(kvm_arm_hardware_enabled, 1);
> +	}
> +
> +	return 0;
> +}
> +
> +void kvm_arch_hardware_disable(void)
> +{
> +	if (!__this_cpu_read(kvm_arm_hardware_enabled))
> +		return;
> +
> +	cpu_reset_hyp_mode();
> +	__this_cpu_write(kvm_arm_hardware_enabled, 0);
> +}

Are you guaranteed that these two functions are called from a context
where both preemption and interrupts are disabled? If not, you cannot
use the __this_cpu* operations, and you must stick with this_cpu*.

> +
>   #ifdef CONFIG_CPU_PM
>   static int hyp_init_cpu_pm_notifier(struct notifier_block *self,
>   				    unsigned long cmd,
> @@ -998,19 +1015,13 @@ static int hyp_init_cpu_pm_notifier(struct notifier_block *self,
>   {
>   	switch (cmd) {
>   	case CPU_PM_ENTER:
> -		if (__hyp_get_vectors() != hyp_default_vectors)
> -			__this_cpu_write(kvm_arm_hardware_enabled, 1);
> -		else
> -			__this_cpu_write(kvm_arm_hardware_enabled, 0);
> -		/*
> -		 * don't call kvm_arch_hardware_disable() in case of
> -		 * CPU_PM_ENTER because it does't actually save any state.
> -		 */
> +		if (__this_cpu_read(kvm_arm_hardware_enabled))
> +			cpu_reset_hyp_mode();
> 
>   		return NOTIFY_OK;
>   	case CPU_PM_EXIT:
>   		if (__this_cpu_read(kvm_arm_hardware_enabled))
> -			kvm_arch_hardware_enable();
> +			cpu_init_hyp_mode();

It would be good to have a comment as to why we're not directly calling
hardware_enable/disable.

> 
>   		return NOTIFY_OK;
> 
> @@ -1114,9 +1125,20 @@ static int init_hyp_mode(void)
>   	}
> 
>   	/*
> +	 * Init this CPU temporarily to execute kvm_hyp_call()
> +	 * during kvm_vgic_hyp_init().
> +	 */
> +	preempt_disable();
> +	cpu_init_hyp_mode();
> +
> +	/*
>   	 * Init HYP view of VGIC
>   	 */
>   	err = kvm_vgic_hyp_init();
> +
> +	cpu_reset_hyp_mode();
> +	preempt_enable();
> +
>   	if (err)
>   		goto out_free_context;
> 

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 04/16] arm64: kvm: allows kvm cpu hotplug
@ 2015-12-15 10:13                       ` Marc Zyngier
  0 siblings, 0 replies; 178+ messages in thread
From: Marc Zyngier @ 2015-12-15 10:13 UTC (permalink / raw)
  To: AKASHI Takahiro, Ashwin Chaugule, Geoff Levand
  Cc: Mark Rutland, vikrams, Catalin Marinas, Will Deacon, shankerd,
	linux-arm-kernel, kexec, Christoffer Dall

On 15/12/15 09:51, AKASHI Takahiro wrote:
> On 12/15/2015 05:45 PM, Marc Zyngier wrote:
>> On 15/12/15 07:51, AKASHI Takahiro wrote:
>>> On 12/15/2015 02:33 AM, Marc Zyngier wrote:
>>>> On 14/12/15 07:33, AKASHI Takahiro wrote:
>>>>> Marc,
>>>>>
>>>>> On 12/12/2015 01:28 AM, Marc Zyngier wrote:
>>>>>> On 11/12/15 08:06, AKASHI Takahiro wrote:
>>>>>>> Ashwin, Marc,
>>>>>>>
>>>>>>> On 12/03/2015 10:58 PM, Marc Zyngier wrote:
>>>>>>>> On 02/12/15 22:40, Ashwin Chaugule wrote:
>>>>>>>>> Hello,
>>>>>>>>>
>>>>>>>>> On 24 November 2015 at 17:25, Geoff Levand <geoff@infradead.org> wrote:
>>>>>>>>>> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>>>>>>>>
>>>>>>>>>> The current kvm implementation on arm64 does cpu-specific initialization
>>>>>>>>>> at system boot, and has no way to gracefully shutdown a core in terms of
>>>>>>>>>> kvm. This prevents, especially, kexec from rebooting the system on a boot
>>>>>>>>>> core in EL2.
>>>>>>>>>>
>>>>>>>>>> This patch adds a cpu tear-down function and also puts an existing cpu-init
>>>>>>>>>> code into a separate function, kvm_arch_hardware_disable() and
>>>>>>>>>> kvm_arch_hardware_enable() respectively.
>>>>>>>>>> We don't need arm64-specific cpu hotplug hook any more.
>>>>>>>>>>
>>>>>>>>>> Since this patch modifies common part of code between arm and arm64, one
>>>>>>>>>> stub definition, __cpu_reset_hyp_mode(), is added on arm side to avoid
>>>>>>>>>> compiling errors.
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>>>>>>>> ---
>>>>>>>>>>      arch/arm/include/asm/kvm_host.h   | 10 ++++-
>>>>>>>>>>      arch/arm/include/asm/kvm_mmu.h    |  1 +
>>>>>>>>>>      arch/arm/kvm/arm.c                | 79 ++++++++++++++++++---------------------
>>>>>>>>>>      arch/arm/kvm/mmu.c                |  5 +++
>>>>>>>>>>      arch/arm64/include/asm/kvm_host.h | 16 +++++++-
>>>>>>>>>>      arch/arm64/include/asm/kvm_mmu.h  |  1 +
>>>>>>>>>>      arch/arm64/include/asm/virt.h     |  9 +++++
>>>>>>>>>>      arch/arm64/kvm/hyp-init.S         | 33 ++++++++++++++++
>>>>>>>>>>      arch/arm64/kvm/hyp.S              | 32 ++++++++++++++--
>>>>>>>>>>      9 files changed, 138 insertions(+), 48 deletions(-)
>>>>>>>>>
>>>>>>>>> [..]
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>      static struct notifier_block hyp_init_cpu_pm_nb = {
>>>>>>>>>> @@ -1108,11 +1119,6 @@ static int init_hyp_mode(void)
>>>>>>>>>>             }
>>>>>>>>>>
>>>>>>>>>>             /*
>>>>>>>>>> -        * Execute the init code on each CPU.
>>>>>>>>>> -        */
>>>>>>>>>> -       on_each_cpu(cpu_init_hyp_mode, NULL, 1);
>>>>>>>>>> -
>>>>>>>>>> -       /*
>>>>>>>>>>              * Init HYP view of VGIC
>>>>>>>>>>              */
>>>>>>>>>>             err = kvm_vgic_hyp_init();
>>>>>>>>>
>>>>>>>>> With this flow, the cpu_init_hyp_mode() is called only at VM guest
>>>>>>>>> creation, but vgic_hyp_init() is called at bootup. On a system with
>>>>>>>>> GICv3, it looks like we end up with bogus values from the ICH_VTR_EL2
>>>>>>>>> (to get the number of LRs), because we're not reading it from EL2
>>>>>>>>> anymore.
>>>>>>>
>>>>>>> Thank you for pointing this out.
>>>>>>> Recently I tested my kdump code on hikey, and as hikey(hi6220) has gic-400,
>>>>>>> I didn't notice this problem.
>>>>>>
>>>>>> Because GIC-400 is a GICv2 implementation, which is entirely MMIO based.
>>>>>> GICv3 uses some system registers that are only available at EL2, and KVM
>>>>>> needs some information contained in these registers before being able to
>>>>>> get initialized.
>>>>>
>>>>> I see.
>>>>>
>>>>>>>> Indeed, this is completely broken (I just reproduced the issue on a
>>>>>>>> model). I wish this kind of details had been checked earlier, but thanks
>>>>>>>> for pointing it out.
>>>>>>>>
>>>>>>>>> Whats the best way to fix this?
>>>>>>>>> - Call kvm_arch_hardware_enable() before vgic_hyp_init() and disable later?
>>>>>>>>> - Fold the VGIC init stuff back into hardware_enable()?
>>>>>>>>
>>>>>>>> None of that works - kvm_arch_hardware_enable() is called once per CPU,
>>>>>>>> while vgic_hyp_init() can only be called once. Also,
>>>>>>>> kvm_arch_hardware_enable() is called from interrupt context, and I
>>>>>>>> wouldn't feel comfortable starting probing DT and allocating stuff from
>>>>>>>> there.
>>>>>>>
>>>>>>> Do you think so?
>>>>>>> How about the fixup! patch attached below?
>>>>>>> The point is that, like Ashwin's first idea, we initialize cpus temporarily
>>>>>>> before kvm_vgic_hyp_init() and then soon reset cpus again. Thus,
>>>>>>> kvm cpu hotplug will still continue to work as before.
>>>>>>> Now that cpu_init_hyp_mode() is revived as exactly the same as Marc's
>>>>>>> original code, the change will not be a big jump.
>>>>>>
>>>>>> This seems quite complicated:
>>>>>> - init EL2 on  all CPUs
>>>>>> - do some initialization
>>>>>> - tear all CPUs EL2 down
>>>>>> - let KVM drive the vectors being set or not
>>>>>>
>>>>>> My questions are: why do we need to do this on *all* cpus? Can't that
>>>>>> work on a single one?
>>>>>
>>>>> I did initialize all the cpus partly because using preempt_enable/disable
>>>>> looked a bit ugly and partly because we may, in the future, do additional
>>>>> per-cpu initialization in kvm_vgic_hyp_init() and/or kvm_timer_hyp_init().
>>>>> But if you're comfortable with preempt_*() stuff, I don' care.
>>>>>
>>>>>
>>>>>> Also, the simple fact that we were able to get some junk value is a sign
>>>>>> that something is amiss. I'd expect a splat of some sort, because we now
>>>>>> have a possibility of doing things in the wrong context.
>>>>>>
>>>>>>>
>>>>>>> If kvm_hyp_call() in vgic_v3_probe()/kvm_vgic_hyp_init() is a *problem*,
>>>>>>> I hope this should work. Actually I confirmed that, with this fixup! patch,
>>>>>>> we could run a kvm guest and also successfully executed kexec on model w/gic-v3.
>>>>>>>
>>>>>>> My only concern is the following kernel message I saw when kexec shut down
>>>>>>> the kernel:
>>>>>>> (Please note that I was running one kvm quest (pid=961) here.)
>>>>>>>
>>>>>>> ===
>>>>>>> sh-4.3# ./kexec -d -e
>>>>>>> kexec version: 15.11.16.11.06-g41e52e2
>>>>>>> arch_process_options:112: command_line: (null)
>>>>>>> arch_process_options:114: initrd: (null)
>>>>>>> arch_process_options:115: dtb: (null)
>>>>>>> arch_process_options:117: port: 0x0
>>>>>>> kvm: exiting hardware virtualization
>>>>>>> kvm [961]: Unsupported exception type: 6248304    <== this message
>>>>>>
>>>>>> That makes me feel very uncomfortable. It looks like we've exited a
>>>>>> guest with some horrible value in X0. How is that even possible?
>>>>>>
>>>>>> This deserves to be investigated.
>>>>>
>>>>> I guess the problem is that cpu tear-down function is called even if a kvm guest
>>>>> is still running in kvm_arch_vcpu_ioctl_run().
>>>>> So adding a check whether cpu has been initialized or not in every iteration of
>>>>> kvm_arch_vcpu_ioctl_run() will, if necessary, terminate a guest safely without entering
>>>>> a guest mode. Since this check is done while interrupt is disabled, it won't
>>>>> interfere with kvm_arch_hardware_disable() called via IPI.
>>>>> See the attached fixup patch.
>>>>>
>>>>> Again, I verified the code on model.
>>>>>
>>>>> Thanks,
>>>>> -Takahiro AKASHI
>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> 	M.
>>>>>>
>>>>>
>>>>> ----8<----
>>>>>    From 77f273ba5e0c3dfcf75a5a8d1da8035cc390250c Mon Sep 17 00:00:00 2001
>>>>> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>>> Date: Fri, 11 Dec 2015 13:43:35 +0900
>>>>> Subject: [PATCH] fixup! arm64: kvm: allows kvm cpu hotplug
>>>>>
>>>>> ---
>>>>>     arch/arm/kvm/arm.c |   45 ++++++++++++++++++++++++++++++++++-----------
>>>>>     1 file changed, 34 insertions(+), 11 deletions(-)
>>>>>
>>>>> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
>>>>> index 518c3c7..d7e86fb 100644
>>>>> --- a/arch/arm/kvm/arm.c
>>>>> +++ b/arch/arm/kvm/arm.c
>>>>> @@ -573,7 +573,11 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
>>>>>     		/*
>>>>>     		 * Re-check atomic conditions
>>>>>     		 */
>>>>> -		if (signal_pending(current)) {
>>>>> +		if (__hyp_get_vectors() == hyp_default_vectors) {
>>>>> +			/* cpu has been torn down */
>>>>> +			ret = -ENOEXEC;
>>>>> +			run->exit_reason = KVM_EXIT_SHUTDOWN;
>>>>
>>>>
>>>> That feels completely overkill (and very slow). Why don't you maintain a
>>>> per-cpu variable containing the CPU states, which will avoid calling
>>>> __hyp_get_vectors() all the time? You should be able to reuse that
>>>> construct everywhere.
>>>
>>> OK. Since I have introduced per-cpu variable, kvm_arm_hardware_enabled, against
>>> cpuidle issue, we will be able to re-use it.
>>>
>>>> Also, I'm not sure about KVM_EXIT_SHUTDOWN. This looks very x86 specific
>>>> (called on triple fault).
>>>
>>> No, I don't think so.
>>
>> maz@approximate:~/Work/arm-platforms$ git grep KVM_EXIT_SHUTDOWN
>> arch/x86/kvm/svm.c:     kvm_run->exit_reason = KVM_EXIT_SHUTDOWN;
>> arch/x86/kvm/vmx.c:     vcpu->run->exit_reason = KVM_EXIT_SHUTDOWN;
>> arch/x86/kvm/x86.c:                     vcpu->run->exit_reason = KVM_EXIT_SHUTDOWN;
>> include/uapi/linux/kvm.h:#define KVM_EXIT_SHUTDOWN         8
>>
>> And that's it. No other architecture ever generates this, and this is
>> an undocumented API. So I'm not going to let that in until someone actually
>> defines what this thing means.
>>
>>> Looking at kvm_cpu_exec() in kvm-all.c of qemu, KVM_EXIT_SHUTDOWN
>>> is handled in a generic way and results in a reset request.
>>
>> Which is not what we want. We want to indicate that the guest couldn't
>> be entered. This is not due to a guest doing a triple fault (which is
>> the way an x86 system gets rebooted).
>>
>>> On the other hand, KVM_EXIT_FAIL_ENTRY seems more arch-specific.
>>
>> Certainly arch specific, but actually extremely accurate. You couldn't
>> enter the guest, and you describe the reason in an architecture-specific
>> fashion. This is also the only exit code that describe this exact case
>> we're talking about here.
>>
>>> In addition, if kvm_vcpu_ioctl() returns a negative value, run->exit_reason
>>> will never be examined.
>>> So I think
>>>      ret -> 0
>>>      run->exit_reason -> KVM_EXIT_SHUTDOWN
>>
>> ret = 0
>> run->exit_reason = KVM_EXIT_FAIL_ENTRY;
>> run->hardware_entry_failure_reason = (u64)-ENOEXEC;
> 
> OK.
> 
>>> or just
>>>      ret -> -ENOEXEC
>>> is the best.
>>>
>>> In either way, a guest will have no good chance to gracefully shutdown itself
>>> because we're kexec'ing (without waiting for threads' termination).
>>
>> Well, at least userspace gets a chance - and should kexec fail, we have
>> a chance to recover.
> 
> Well, the current kexec implementation (on arm64) never fails
> except very early stage :)
> 
> So please review the attached fixup patch, again.
> 
> Thanks,
> -Takahiro AKASHI
> 
> 
>> Thanks,
>>
>> 	M.
>>
> 
> ----8<----
>  From ec6c07fe80d6ba96855468f61daffa9b91cf5622 Mon Sep 17 00:00:00 2001
> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
> Date: Fri, 11 Dec 2015 13:43:35 +0900
> Subject: [PATCH] fixup! arm64: kvm: allows kvm cpu hotplug
> 
> ---
>   arch/arm/kvm/arm.c |   62 +++++++++++++++++++++++++++++++++++-----------------
>   1 file changed, 42 insertions(+), 20 deletions(-)
> 
> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
> index 518c3c7..05eaa35 100644
> --- a/arch/arm/kvm/arm.c
> +++ b/arch/arm/kvm/arm.c
> @@ -573,7 +573,13 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
>   		/*
>   		 * Re-check atomic conditions
>   		 */
> -		if (signal_pending(current)) {
> +		if (!__this_cpu_read(kvm_arm_hardware_enabled)) {

if (unlikely(...))

> +			/* cpu has been torn down */
> +			ret = 0;
> +			run->exit_reason = KVM_EXIT_FAIL_ENTRY;
> +			run->fail_entry.hardware_entry_failure_reason
> +					= (u64)-ENOEXEC;
> +		} else if (signal_pending(current)) {
>   			ret = -EINTR;
>   			run->exit_reason = KVM_EXIT_INTR;
>   		}
> @@ -950,7 +956,7 @@ long kvm_arch_vm_ioctl(struct file *filp,
>   	}
>   }
> 
> -int kvm_arch_hardware_enable(void)
> +static void cpu_init_hyp_mode(void)
>   {
>   	phys_addr_t boot_pgd_ptr;
>   	phys_addr_t pgd_ptr;
> @@ -958,9 +964,6 @@ int kvm_arch_hardware_enable(void)
>   	unsigned long stack_page;
>   	unsigned long vector_ptr;
> 
> -	if (__hyp_get_vectors() != hyp_default_vectors)
> -		return 0;
> -
>   	/* Switch from the HYP stub to our own HYP init vector */
>   	__hyp_set_vectors(kvm_get_idmap_vector());
> 
> @@ -973,24 +976,38 @@ int kvm_arch_hardware_enable(void)
>   	__cpu_init_hyp_mode(boot_pgd_ptr, pgd_ptr, hyp_stack_ptr, vector_ptr);
> 
>   	kvm_arm_init_debug();
> -
> -	return 0;
>   }
> 
> -void kvm_arch_hardware_disable(void)
> +static void cpu_reset_hyp_mode(void)
>   {
>   	phys_addr_t boot_pgd_ptr;
>   	phys_addr_t phys_idmap_start;
> 
> -	if (__hyp_get_vectors() == hyp_default_vectors)
> -		return;
> -
>   	boot_pgd_ptr = kvm_mmu_get_boot_httbr();
>   	phys_idmap_start = kvm_get_idmap_start();
> 
>   	__cpu_reset_hyp_mode(boot_pgd_ptr, phys_idmap_start);
>   }
> 
> +int kvm_arch_hardware_enable(void)
> +{
> +	if (!__this_cpu_read(kvm_arm_hardware_enabled)) {
> +		cpu_init_hyp_mode();
> +		__this_cpu_write(kvm_arm_hardware_enabled, 1);
> +	}
> +
> +	return 0;
> +}
> +
> +void kvm_arch_hardware_disable(void)
> +{
> +	if (!__this_cpu_read(kvm_arm_hardware_enabled))
> +		return;
> +
> +	cpu_reset_hyp_mode();
> +	__this_cpu_write(kvm_arm_hardware_enabled, 0);
> +}

Are you guaranteed that these two functions are called from a context
where both preemption and interrupts are disabled? If not, you cannot
use the __this_cpu* operations, and you must stick with this_cpu*.

> +
>   #ifdef CONFIG_CPU_PM
>   static int hyp_init_cpu_pm_notifier(struct notifier_block *self,
>   				    unsigned long cmd,
> @@ -998,19 +1015,13 @@ static int hyp_init_cpu_pm_notifier(struct notifier_block *self,
>   {
>   	switch (cmd) {
>   	case CPU_PM_ENTER:
> -		if (__hyp_get_vectors() != hyp_default_vectors)
> -			__this_cpu_write(kvm_arm_hardware_enabled, 1);
> -		else
> -			__this_cpu_write(kvm_arm_hardware_enabled, 0);
> -		/*
> -		 * don't call kvm_arch_hardware_disable() in case of
> -		 * CPU_PM_ENTER because it does't actually save any state.
> -		 */
> +		if (__this_cpu_read(kvm_arm_hardware_enabled))
> +			cpu_reset_hyp_mode();
> 
>   		return NOTIFY_OK;
>   	case CPU_PM_EXIT:
>   		if (__this_cpu_read(kvm_arm_hardware_enabled))
> -			kvm_arch_hardware_enable();
> +			cpu_init_hyp_mode();

It would be good to have a comment as to why we're not directly calling
hardware_enable/disable.

> 
>   		return NOTIFY_OK;
> 
> @@ -1114,9 +1125,20 @@ static int init_hyp_mode(void)
>   	}
> 
>   	/*
> +	 * Init this CPU temporarily to execute kvm_hyp_call()
> +	 * during kvm_vgic_hyp_init().
> +	 */
> +	preempt_disable();
> +	cpu_init_hyp_mode();
> +
> +	/*
>   	 * Init HYP view of VGIC
>   	 */
>   	err = kvm_vgic_hyp_init();
> +
> +	cpu_reset_hyp_mode();
> +	preempt_enable();
> +
>   	if (err)
>   		goto out_free_context;
> 

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 16/16] arm64: kdump: relax BUG_ON() if more than one cpus are still active
  2015-11-24 22:25   ` Geoff Levand
@ 2015-12-15 17:05     ` Will Deacon
  -1 siblings, 0 replies; 178+ messages in thread
From: Will Deacon @ 2015-12-15 17:05 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Nov 24, 2015 at 10:25:34PM +0000, Geoff Levand wrote:
> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
> 
> We should try best in case of kdump.
> So even if not all secondary cpus have shut down, we do kdump anyway.
> ---
>  arch/arm64/kernel/machine_kexec.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
> index d2d7e90..482aae7 100644
> --- a/arch/arm64/kernel/machine_kexec.c
> +++ b/arch/arm64/kernel/machine_kexec.c
> @@ -148,7 +148,13 @@ void machine_kexec(struct kimage *kimage)
>  	phys_addr_t reboot_code_buffer_phys;
>  	void *reboot_code_buffer;
>  
> -	BUG_ON(num_online_cpus() > 1);
> +	if (num_online_cpus() > 1) {
> +		if (in_crash_kexec)
> +			pr_warn("kdump might fail because %d cpus are still online\n",
> +					num_online_cpus());
> +		else
> +			BUG();
> +	}

Can you just rewrite the existing BUG_ON as:

  BUG_ON(num_online_cpus() && !WARN_ON(in_crash_kexec));

?

Will

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 16/16] arm64: kdump: relax BUG_ON() if more than one cpus are still active
@ 2015-12-15 17:05     ` Will Deacon
  0 siblings, 0 replies; 178+ messages in thread
From: Will Deacon @ 2015-12-15 17:05 UTC (permalink / raw)
  To: Geoff Levand
  Cc: Mark Rutland, marc.zyngier, Catalin Marinas, kexec,
	AKASHI Takahiro, linux-arm-kernel, christoffer.dall

On Tue, Nov 24, 2015 at 10:25:34PM +0000, Geoff Levand wrote:
> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
> 
> We should try best in case of kdump.
> So even if not all secondary cpus have shut down, we do kdump anyway.
> ---
>  arch/arm64/kernel/machine_kexec.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
> index d2d7e90..482aae7 100644
> --- a/arch/arm64/kernel/machine_kexec.c
> +++ b/arch/arm64/kernel/machine_kexec.c
> @@ -148,7 +148,13 @@ void machine_kexec(struct kimage *kimage)
>  	phys_addr_t reboot_code_buffer_phys;
>  	void *reboot_code_buffer;
>  
> -	BUG_ON(num_online_cpus() > 1);
> +	if (num_online_cpus() > 1) {
> +		if (in_crash_kexec)
> +			pr_warn("kdump might fail because %d cpus are still online\n",
> +					num_online_cpus());
> +		else
> +			BUG();
> +	}

Can you just rewrite the existing BUG_ON as:

  BUG_ON(num_online_cpus() && !WARN_ON(in_crash_kexec));

?

Will

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 09/16] arm64/kexec: Add pr_devel output
  2015-11-24 22:25   ` Geoff Levand
@ 2015-12-15 17:15     ` Will Deacon
  -1 siblings, 0 replies; 178+ messages in thread
From: Will Deacon @ 2015-12-15 17:15 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Nov 24, 2015 at 10:25:34PM +0000, Geoff Levand wrote:
> To aid in debugging kexec problems or when adding new functionality to kexec add
> a new routine kexec_image_info() and several inline pr_devel statements.

We don't currently have any pr_devel invocations under arm64. Can you
either remove these or change them to pr_debug?

> Signed-off-by: Geoff Levand <geoff@infradead.org>
> ---
>  arch/arm64/kernel/machine_kexec.c | 63 +++++++++++++++++++++++++++++++++++++++
>  1 file changed, 63 insertions(+)
> 
> diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
> index 8b990b8..da28a26 100644
> --- a/arch/arm64/kernel/machine_kexec.c
> +++ b/arch/arm64/kernel/machine_kexec.c
> @@ -25,6 +25,48 @@ extern const unsigned long arm64_relocate_new_kernel_size;
>  
>  static unsigned long kimage_start;
>  
> +/**
> + * kexec_is_dtb - Helper routine to check the device tree header signature.
> + */
> +static bool kexec_is_dtb(const void *dtb)
> +{
> +	__be32 magic;
> +
> +	return get_user(magic, (__be32 *)dtb) ? false :
> +		(be32_to_cpu(magic) == OF_DT_HEADER);

Isn't there an fdt32_to_cpu helper for this already? I also can't help
but thing this would be more readable if you didn't bother with the
ternary if form.

Will

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 09/16] arm64/kexec: Add pr_devel output
@ 2015-12-15 17:15     ` Will Deacon
  0 siblings, 0 replies; 178+ messages in thread
From: Will Deacon @ 2015-12-15 17:15 UTC (permalink / raw)
  To: Geoff Levand
  Cc: Mark Rutland, marc.zyngier, Catalin Marinas, kexec,
	AKASHI Takahiro, linux-arm-kernel, christoffer.dall

On Tue, Nov 24, 2015 at 10:25:34PM +0000, Geoff Levand wrote:
> To aid in debugging kexec problems or when adding new functionality to kexec add
> a new routine kexec_image_info() and several inline pr_devel statements.

We don't currently have any pr_devel invocations under arm64. Can you
either remove these or change them to pr_debug?

> Signed-off-by: Geoff Levand <geoff@infradead.org>
> ---
>  arch/arm64/kernel/machine_kexec.c | 63 +++++++++++++++++++++++++++++++++++++++
>  1 file changed, 63 insertions(+)
> 
> diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
> index 8b990b8..da28a26 100644
> --- a/arch/arm64/kernel/machine_kexec.c
> +++ b/arch/arm64/kernel/machine_kexec.c
> @@ -25,6 +25,48 @@ extern const unsigned long arm64_relocate_new_kernel_size;
>  
>  static unsigned long kimage_start;
>  
> +/**
> + * kexec_is_dtb - Helper routine to check the device tree header signature.
> + */
> +static bool kexec_is_dtb(const void *dtb)
> +{
> +	__be32 magic;
> +
> +	return get_user(magic, (__be32 *)dtb) ? false :
> +		(be32_to_cpu(magic) == OF_DT_HEADER);

Isn't there an fdt32_to_cpu helper for this already? I also can't help
but thing this would be more readable if you didn't bother with the
ternary if form.

Will

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 14/16] arm64: kdump: update a kernel doc
  2015-11-24 22:25   ` Geoff Levand
@ 2015-12-15 17:17     ` Will Deacon
  -1 siblings, 0 replies; 178+ messages in thread
From: Will Deacon @ 2015-12-15 17:17 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Nov 24, 2015 at 10:25:34PM +0000, Geoff Levand wrote:
> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
> 
> This patch adds arch specific descriptions about kdump usage on arm64
> to kdump.txt.
> 
> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
> ---
>  Documentation/kdump/kdump.txt | 32 +++++++++++++++++++++++++++++++-
>  1 file changed, 31 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/kdump/kdump.txt b/Documentation/kdump/kdump.txt
> index bc4bd5a..7d3fad0 100644
> --- a/Documentation/kdump/kdump.txt
> +++ b/Documentation/kdump/kdump.txt
> @@ -18,7 +18,7 @@ memory image to a dump file on the local disk, or across the network to
>  a remote system.
>  
>  Kdump and kexec are currently supported on the x86, x86_64, ppc64, ia64,
> -s390x and arm architectures.
> +s390x, arm and arm64 architectures.
>  
>  When the system kernel boots, it reserves a small section of memory for
>  the dump-capture kernel. This ensures that ongoing Direct Memory Access
> @@ -249,6 +249,29 @@ Dump-capture kernel config options (Arch Dependent, arm)
>  
>      AUTO_ZRELADDR=y
>  
> +Dump-capture kernel config options (Arch Dependent, arm64)
> +----------------------------------------------------------
> +
> +1) Disable symmetric multi-processing support under "Processor type and
> +   features":
> +
> +   CONFIG_SMP=n
> +
> +   (If CONFIG_SMP=y, then specify maxcpus=1 on the kernel command line
> +   when loading the dump-capture kernel, see section "Load the Dump-capture
> +   Kernel".)

SMP is now forced on, so please update this text.

> +2) The maximum memory size on the dump-capture kernel must be limited by
> +   specifying:
> +
> +   mem=X[MG]
> +
> +   where X should be less than or equal to the size in "crashkernel="
> +   boot parameter. Kexec-tools will automatically add this.
> +
> +3) Currently, kvm will not be enabled on the dump-capture kernel even
> +   if it is configured.
> +
>  Extended crashkernel syntax
>  ===========================
>  
> @@ -312,6 +335,8 @@ Boot into System Kernel
>     any space below the alignment point may be overwritten by the dump-capture kernel,
>     which means it is possible that the vmcore is not that precise as expected.
>  
> +   On arm64, use "crashkernel=Y[@X]".  Note that the start address of
> +   the kernel, X if explicitly specified, must be aligned to 2MiB (0x200000).
>  
>  Load the Dump-capture Kernel
>  ============================
> @@ -334,6 +359,8 @@ For s390x:
>  	- Use image or bzImage
>  For arm:
>  	- Use zImage
> +For arm64:
> +	- Use vmlinux

What? Why doesn't Image work?

Will

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 14/16] arm64: kdump: update a kernel doc
@ 2015-12-15 17:17     ` Will Deacon
  0 siblings, 0 replies; 178+ messages in thread
From: Will Deacon @ 2015-12-15 17:17 UTC (permalink / raw)
  To: Geoff Levand
  Cc: Mark Rutland, marc.zyngier, Catalin Marinas, kexec,
	AKASHI Takahiro, linux-arm-kernel, christoffer.dall

On Tue, Nov 24, 2015 at 10:25:34PM +0000, Geoff Levand wrote:
> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
> 
> This patch adds arch specific descriptions about kdump usage on arm64
> to kdump.txt.
> 
> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
> ---
>  Documentation/kdump/kdump.txt | 32 +++++++++++++++++++++++++++++++-
>  1 file changed, 31 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/kdump/kdump.txt b/Documentation/kdump/kdump.txt
> index bc4bd5a..7d3fad0 100644
> --- a/Documentation/kdump/kdump.txt
> +++ b/Documentation/kdump/kdump.txt
> @@ -18,7 +18,7 @@ memory image to a dump file on the local disk, or across the network to
>  a remote system.
>  
>  Kdump and kexec are currently supported on the x86, x86_64, ppc64, ia64,
> -s390x and arm architectures.
> +s390x, arm and arm64 architectures.
>  
>  When the system kernel boots, it reserves a small section of memory for
>  the dump-capture kernel. This ensures that ongoing Direct Memory Access
> @@ -249,6 +249,29 @@ Dump-capture kernel config options (Arch Dependent, arm)
>  
>      AUTO_ZRELADDR=y
>  
> +Dump-capture kernel config options (Arch Dependent, arm64)
> +----------------------------------------------------------
> +
> +1) Disable symmetric multi-processing support under "Processor type and
> +   features":
> +
> +   CONFIG_SMP=n
> +
> +   (If CONFIG_SMP=y, then specify maxcpus=1 on the kernel command line
> +   when loading the dump-capture kernel, see section "Load the Dump-capture
> +   Kernel".)

SMP is now forced on, so please update this text.

> +2) The maximum memory size on the dump-capture kernel must be limited by
> +   specifying:
> +
> +   mem=X[MG]
> +
> +   where X should be less than or equal to the size in "crashkernel="
> +   boot parameter. Kexec-tools will automatically add this.
> +
> +3) Currently, kvm will not be enabled on the dump-capture kernel even
> +   if it is configured.
> +
>  Extended crashkernel syntax
>  ===========================
>  
> @@ -312,6 +335,8 @@ Boot into System Kernel
>     any space below the alignment point may be overwritten by the dump-capture kernel,
>     which means it is possible that the vmcore is not that precise as expected.
>  
> +   On arm64, use "crashkernel=Y[@X]".  Note that the start address of
> +   the kernel, X if explicitly specified, must be aligned to 2MiB (0x200000).
>  
>  Load the Dump-capture Kernel
>  ============================
> @@ -334,6 +359,8 @@ For s390x:
>  	- Use image or bzImage
>  For arm:
>  	- Use zImage
> +For arm64:
> +	- Use vmlinux

What? Why doesn't Image work?

Will

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 11/16] arm64: kdump: reserve memory for crash dump kernel
  2015-11-24 22:25   ` Geoff Levand
@ 2015-12-15 17:29     ` Will Deacon
  -1 siblings, 0 replies; 178+ messages in thread
From: Will Deacon @ 2015-12-15 17:29 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Nov 24, 2015 at 10:25:34PM +0000, Geoff Levand wrote:
> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
> 
> On primary kernel, the memory region used by crash dump kernel must be
> specified by "crashkernel=" boot parameter. reserve_crashkernel()
> will allocate and reserve the region for later use.
> 
> User space tools will be able to find the region marked as "Crash kernel"
> in /proc/iomem.
> 
> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
> ---
>  arch/arm64/kernel/setup.c |  7 +++++-
>  arch/arm64/mm/init.c      | 54 +++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 60 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
> index 8119479..f9fffc9 100644
> --- a/arch/arm64/kernel/setup.c
> +++ b/arch/arm64/kernel/setup.c
> @@ -31,7 +31,6 @@
>  #include <linux/screen_info.h>
>  #include <linux/init.h>
>  #include <linux/kexec.h>
> -#include <linux/crash_dump.h>
>  #include <linux/root_dev.h>
>  #include <linux/cpu.h>
>  #include <linux/interrupt.h>
> @@ -221,6 +220,12 @@ static void __init request_standard_resources(void)
>  		    kernel_data.end <= res->end)
>  			request_resource(res, &kernel_data);
>  	}
> +
> +#ifdef CONFIG_KEXEC
> +	/* User space tools will find "Crash kernel" region in /proc/iomem. */
> +	if (crashk_res.end)
> +		insert_resource(&iomem_resource, &crashk_res);
> +#endif

Why can't this bit be moved into reserved_crashkernel, at the end?

>  }
>  
>  #ifdef CONFIG_BLK_DEV_INITRD
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index 17bf39a..24f0a1c 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -34,6 +34,7 @@
>  #include <linux/dma-contiguous.h>
>  #include <linux/efi.h>
>  #include <linux/swiotlb.h>
> +#include <linux/kexec.h>
>  
>  #include <asm/fixmap.h>
>  #include <asm/memory.h>
> @@ -66,6 +67,55 @@ static int __init early_initrd(char *p)
>  early_param("initrd", early_initrd);
>  #endif
>  
> +#ifdef CONFIG_KEXEC
> +/*
> + * reserve_crashkernel() - reserves memory for crash kernel
> + *
> + * This function reserves memory area given in "crashkernel=" kernel command
> + * line parameter. The memory reserved is used by dump capture kernel when
> + * primary kernel is crashing.
> + */
> +static void __init reserve_crashkernel(void)
> +{
> +	unsigned long long crash_size = 0, crash_base = 0;
> +	int ret;
> +
> +	ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
> +				&crash_size, &crash_base);
> +	if (ret)
> +		return;

What's the point in ret?

> +
> +	if (crash_base == 0) {
> +		crash_base = memblock_alloc(crash_size, 1 << 21);

What is this magic number? How about SZ_2M and a comment mentioning the
arm64 boot protocol?

> +		if (crash_base == 0) {
> +			pr_warn("Unable to allocate crashkernel (size:%llx)\n",
> +				crash_size);
> +			return;
> +		}
> +	} else {
> +		/* User specifies base address explicitly. */
> +		if (!memblock_is_region_memory(crash_base, crash_size) ||
> +			memblock_is_region_reserved(crash_base, crash_size)) {
> +			pr_warn("crashkernel has wrong address or size\n");
> +			return;
> +		}
> +
> +		if (crash_base & ((1 << 21) - 1)) {

IS_ALIGNED

> +			pr_warn("crashkernel base address is not 2MB aligned\n");
> +			return;
> +		}
> +
> +		memblock_reserve(crash_base, crash_size);
> +	}
> +
> +	pr_info("Reserving %lldMB of memory at %lldMB for crashkernel\n",
> +		crash_size >> 20, crash_base >> 20);
> +
> +	crashk_res.start = crash_base;
> +	crashk_res.end = crash_base + crash_size - 1;
> +}

#else

static void __init reserve_crashkernel(void) {};

Will

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 11/16] arm64: kdump: reserve memory for crash dump kernel
@ 2015-12-15 17:29     ` Will Deacon
  0 siblings, 0 replies; 178+ messages in thread
From: Will Deacon @ 2015-12-15 17:29 UTC (permalink / raw)
  To: Geoff Levand
  Cc: Mark Rutland, marc.zyngier, Catalin Marinas, kexec,
	AKASHI Takahiro, linux-arm-kernel, christoffer.dall

On Tue, Nov 24, 2015 at 10:25:34PM +0000, Geoff Levand wrote:
> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
> 
> On primary kernel, the memory region used by crash dump kernel must be
> specified by "crashkernel=" boot parameter. reserve_crashkernel()
> will allocate and reserve the region for later use.
> 
> User space tools will be able to find the region marked as "Crash kernel"
> in /proc/iomem.
> 
> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
> ---
>  arch/arm64/kernel/setup.c |  7 +++++-
>  arch/arm64/mm/init.c      | 54 +++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 60 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
> index 8119479..f9fffc9 100644
> --- a/arch/arm64/kernel/setup.c
> +++ b/arch/arm64/kernel/setup.c
> @@ -31,7 +31,6 @@
>  #include <linux/screen_info.h>
>  #include <linux/init.h>
>  #include <linux/kexec.h>
> -#include <linux/crash_dump.h>
>  #include <linux/root_dev.h>
>  #include <linux/cpu.h>
>  #include <linux/interrupt.h>
> @@ -221,6 +220,12 @@ static void __init request_standard_resources(void)
>  		    kernel_data.end <= res->end)
>  			request_resource(res, &kernel_data);
>  	}
> +
> +#ifdef CONFIG_KEXEC
> +	/* User space tools will find "Crash kernel" region in /proc/iomem. */
> +	if (crashk_res.end)
> +		insert_resource(&iomem_resource, &crashk_res);
> +#endif

Why can't this bit be moved into reserved_crashkernel, at the end?

>  }
>  
>  #ifdef CONFIG_BLK_DEV_INITRD
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index 17bf39a..24f0a1c 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -34,6 +34,7 @@
>  #include <linux/dma-contiguous.h>
>  #include <linux/efi.h>
>  #include <linux/swiotlb.h>
> +#include <linux/kexec.h>
>  
>  #include <asm/fixmap.h>
>  #include <asm/memory.h>
> @@ -66,6 +67,55 @@ static int __init early_initrd(char *p)
>  early_param("initrd", early_initrd);
>  #endif
>  
> +#ifdef CONFIG_KEXEC
> +/*
> + * reserve_crashkernel() - reserves memory for crash kernel
> + *
> + * This function reserves memory area given in "crashkernel=" kernel command
> + * line parameter. The memory reserved is used by dump capture kernel when
> + * primary kernel is crashing.
> + */
> +static void __init reserve_crashkernel(void)
> +{
> +	unsigned long long crash_size = 0, crash_base = 0;
> +	int ret;
> +
> +	ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
> +				&crash_size, &crash_base);
> +	if (ret)
> +		return;

What's the point in ret?

> +
> +	if (crash_base == 0) {
> +		crash_base = memblock_alloc(crash_size, 1 << 21);

What is this magic number? How about SZ_2M and a comment mentioning the
arm64 boot protocol?

> +		if (crash_base == 0) {
> +			pr_warn("Unable to allocate crashkernel (size:%llx)\n",
> +				crash_size);
> +			return;
> +		}
> +	} else {
> +		/* User specifies base address explicitly. */
> +		if (!memblock_is_region_memory(crash_base, crash_size) ||
> +			memblock_is_region_reserved(crash_base, crash_size)) {
> +			pr_warn("crashkernel has wrong address or size\n");
> +			return;
> +		}
> +
> +		if (crash_base & ((1 << 21) - 1)) {

IS_ALIGNED

> +			pr_warn("crashkernel base address is not 2MB aligned\n");
> +			return;
> +		}
> +
> +		memblock_reserve(crash_base, crash_size);
> +	}
> +
> +	pr_info("Reserving %lldMB of memory at %lldMB for crashkernel\n",
> +		crash_size >> 20, crash_base >> 20);
> +
> +	crashk_res.start = crash_base;
> +	crashk_res.end = crash_base + crash_size - 1;
> +}

#else

static void __init reserve_crashkernel(void) {};

Will

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 13/16] arm64: kdump: add kdump support
  2015-11-24 22:25   ` Geoff Levand
@ 2015-12-15 17:45     ` Will Deacon
  -1 siblings, 0 replies; 178+ messages in thread
From: Will Deacon @ 2015-12-15 17:45 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Nov 24, 2015 at 10:25:34PM +0000, Geoff Levand wrote:
> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
> 
> On crash dump kernel, all the information about primary kernel's core
> image is available in elf core header specified by "elfcorehdr=" boot
> parameter. reserve_elfcorehdr() will set aside the region to avoid any
> corruption by crash dump kernel.
> 
> Crash dump kernel will access the system memory of primary kernel via
> copy_oldmem_page(), which reads one page by ioremap'ing it since it does
> not reside in linear mapping on crash dump kernel.
> Please note that we should add "mem=X[MG]" boot parameter to limit the
> memory size and avoid the following assertion at ioremap():
> 	if (WARN_ON(pfn_valid(__phys_to_pfn(phys_addr))))
> 		return NULL;
> when accessing any pages beyond the usable memories of crash dump kernel.
> 
> We also need our own elfcorehdr_read() here since the weak definition of
> elfcorehdr_read() utilizes copy_oldmem_page() and will hit the assertion
> above on arm64.
> 
> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
> ---
>  arch/arm64/Kconfig             | 12 +++++++
>  arch/arm64/kernel/Makefile     |  1 +
>  arch/arm64/kernel/crash_dump.c | 71 ++++++++++++++++++++++++++++++++++++++++++
>  arch/arm64/mm/init.c           | 29 +++++++++++++++++
>  4 files changed, 113 insertions(+)
>  create mode 100644 arch/arm64/kernel/crash_dump.c
> 
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index c23fd77..4bac7dc 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -545,6 +545,18 @@ config KEXEC
>  	  but it is independent of the system firmware.   And like a reboot
>  	  you can start any kernel with it, not just Linux.
>  
> +config CRASH_DUMP
> +	bool "Build kdump crash kernel"
> +	help
> +	  Generate crash dump after being started by kexec. This should
> +	  be normally only set in special crash dump kernels which are
> +	  loaded in the main kernel with kexec-tools into a specially
> +	  reserved region and then later executed after a crash by
> +	  kdump/kexec. The crash dump kernel must be compiled to a
> +	  memory address not used by the main kernel.

What does this even mean? How do I "compile to a memory address not used
by the main kernel"?

> diff --git a/arch/arm64/kernel/crash_dump.c b/arch/arm64/kernel/crash_dump.c
> new file mode 100644
> index 0000000..3d86c0a
> --- /dev/null
> +++ b/arch/arm64/kernel/crash_dump.c
> @@ -0,0 +1,71 @@
> +/*
> + * Routines for doing kexec-based kdump
> + *
> + * Copyright (C) 2014 Linaro Limited
> + * Author: AKASHI Takahiro <takahiro.akashi@linaro.org>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +
> +#include <linux/crash_dump.h>
> +#include <linux/errno.h>
> +#include <linux/io.h>
> +#include <linux/memblock.h>
> +#include <linux/uaccess.h>
> +#include <asm/memory.h>
> +
> +/**
> + * copy_oldmem_page() - copy one page from old kernel memory
> + * @pfn: page frame number to be copied
> + * @buf: buffer where the copied page is placed
> + * @csize: number of bytes to copy
> + * @offset: offset in bytes into the page
> + * @userbuf: if set, @buf is in a user address space
> + *
> + * This function copies one page from old kernel memory into buffer pointed by
> + * @buf. If @buf is in userspace, set @userbuf to %1. Returns number of bytes
> + * copied or negative error in case of failure.
> + */
> +ssize_t copy_oldmem_page(unsigned long pfn, char *buf,
> +			 size_t csize, unsigned long offset,
> +			 int userbuf)
> +{
> +	void *vaddr;
> +
> +	if (!csize)
> +		return 0;
> +
> +	vaddr = ioremap_cache(pfn << PAGE_SHIFT, PAGE_SIZE);

pfn_to_page

> +	if (!vaddr)
> +		return -ENOMEM;
> +
> +	if (userbuf) {
> +		if (copy_to_user(buf, vaddr + offset, csize)) {
> +			iounmap(vaddr);
> +			return -EFAULT;
> +		}
> +	} else {
> +		memcpy(buf, vaddr + offset, csize);
> +	}
> +
> +	iounmap(vaddr);
> +
> +	return csize;
> +}
> +
> +/**
> + * elfcorehdr_read - read from ELF core header
> + * @buf: buffer where the data is placed
> + * @csize: number of bytes to read
> + * @ppos: address in the memory
> + *
> + * This function reads @count bytes from elf core header which exists
> + * on crash dump kernel's memory.
> + */
> +ssize_t elfcorehdr_read(char *buf, size_t count, u64 *ppos)
> +{
> +	memcpy(buf, phys_to_virt((phys_addr_t)*ppos), count);
> +	return count;
> +}

I know you say that we have to override this function so that we don't
hit the pfn_valid warning in ioremap, but what guarantees that the ELF
header of the crashed kernel is actually mapped in our linear mapping?

> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index 24f0a1c..52a1469 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -35,6 +35,7 @@
>  #include <linux/efi.h>
>  #include <linux/swiotlb.h>
>  #include <linux/kexec.h>
> +#include <linux/crash_dump.h>
>  
>  #include <asm/fixmap.h>
>  #include <asm/memory.h>
> @@ -116,6 +117,31 @@ static void __init reserve_crashkernel(void)
>  }
>  #endif /* CONFIG_KEXEC */
>  
> +#ifdef CONFIG_CRASH_DUMP
> +/*
> + * reserve_elfcorehdr() - reserves memory for elf core header
> + *
> + * This function reserves elf core header given in "elfcorehdr=" kernel
> + * command line parameter. This region contains all the information about
> + * primary kernel's core image and is used by a dump capture kernel to
> + * access the system memory on primary kernel.
> + */
> +static void __init reserve_elfcorehdr(void)
> +{
> +	if (!elfcorehdr_size)
> +		return;
> +
> +	if (memblock_is_region_reserved(elfcorehdr_addr, elfcorehdr_size)) {
> +		pr_warn("elfcorehdr is overlapped\n");
> +		return;
> +	}
> +
> +	memblock_reserve(elfcorehdr_addr, elfcorehdr_size);
> +
> +	pr_info("Reserving %lldKB of memory at %lldMB for elfcorehdr\n",
> +		elfcorehdr_size >> 10, elfcorehdr_addr >> 20);

I'd have thought it would be more useful to print the address as an
address rather than a size.

> +}

Similar #else trick here.

Will

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 13/16] arm64: kdump: add kdump support
@ 2015-12-15 17:45     ` Will Deacon
  0 siblings, 0 replies; 178+ messages in thread
From: Will Deacon @ 2015-12-15 17:45 UTC (permalink / raw)
  To: Geoff Levand
  Cc: Mark Rutland, marc.zyngier, Catalin Marinas, kexec,
	AKASHI Takahiro, linux-arm-kernel, christoffer.dall

On Tue, Nov 24, 2015 at 10:25:34PM +0000, Geoff Levand wrote:
> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
> 
> On crash dump kernel, all the information about primary kernel's core
> image is available in elf core header specified by "elfcorehdr=" boot
> parameter. reserve_elfcorehdr() will set aside the region to avoid any
> corruption by crash dump kernel.
> 
> Crash dump kernel will access the system memory of primary kernel via
> copy_oldmem_page(), which reads one page by ioremap'ing it since it does
> not reside in linear mapping on crash dump kernel.
> Please note that we should add "mem=X[MG]" boot parameter to limit the
> memory size and avoid the following assertion at ioremap():
> 	if (WARN_ON(pfn_valid(__phys_to_pfn(phys_addr))))
> 		return NULL;
> when accessing any pages beyond the usable memories of crash dump kernel.
> 
> We also need our own elfcorehdr_read() here since the weak definition of
> elfcorehdr_read() utilizes copy_oldmem_page() and will hit the assertion
> above on arm64.
> 
> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
> ---
>  arch/arm64/Kconfig             | 12 +++++++
>  arch/arm64/kernel/Makefile     |  1 +
>  arch/arm64/kernel/crash_dump.c | 71 ++++++++++++++++++++++++++++++++++++++++++
>  arch/arm64/mm/init.c           | 29 +++++++++++++++++
>  4 files changed, 113 insertions(+)
>  create mode 100644 arch/arm64/kernel/crash_dump.c
> 
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index c23fd77..4bac7dc 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -545,6 +545,18 @@ config KEXEC
>  	  but it is independent of the system firmware.   And like a reboot
>  	  you can start any kernel with it, not just Linux.
>  
> +config CRASH_DUMP
> +	bool "Build kdump crash kernel"
> +	help
> +	  Generate crash dump after being started by kexec. This should
> +	  be normally only set in special crash dump kernels which are
> +	  loaded in the main kernel with kexec-tools into a specially
> +	  reserved region and then later executed after a crash by
> +	  kdump/kexec. The crash dump kernel must be compiled to a
> +	  memory address not used by the main kernel.

What does this even mean? How do I "compile to a memory address not used
by the main kernel"?

> diff --git a/arch/arm64/kernel/crash_dump.c b/arch/arm64/kernel/crash_dump.c
> new file mode 100644
> index 0000000..3d86c0a
> --- /dev/null
> +++ b/arch/arm64/kernel/crash_dump.c
> @@ -0,0 +1,71 @@
> +/*
> + * Routines for doing kexec-based kdump
> + *
> + * Copyright (C) 2014 Linaro Limited
> + * Author: AKASHI Takahiro <takahiro.akashi@linaro.org>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +
> +#include <linux/crash_dump.h>
> +#include <linux/errno.h>
> +#include <linux/io.h>
> +#include <linux/memblock.h>
> +#include <linux/uaccess.h>
> +#include <asm/memory.h>
> +
> +/**
> + * copy_oldmem_page() - copy one page from old kernel memory
> + * @pfn: page frame number to be copied
> + * @buf: buffer where the copied page is placed
> + * @csize: number of bytes to copy
> + * @offset: offset in bytes into the page
> + * @userbuf: if set, @buf is in a user address space
> + *
> + * This function copies one page from old kernel memory into buffer pointed by
> + * @buf. If @buf is in userspace, set @userbuf to %1. Returns number of bytes
> + * copied or negative error in case of failure.
> + */
> +ssize_t copy_oldmem_page(unsigned long pfn, char *buf,
> +			 size_t csize, unsigned long offset,
> +			 int userbuf)
> +{
> +	void *vaddr;
> +
> +	if (!csize)
> +		return 0;
> +
> +	vaddr = ioremap_cache(pfn << PAGE_SHIFT, PAGE_SIZE);

pfn_to_page

> +	if (!vaddr)
> +		return -ENOMEM;
> +
> +	if (userbuf) {
> +		if (copy_to_user(buf, vaddr + offset, csize)) {
> +			iounmap(vaddr);
> +			return -EFAULT;
> +		}
> +	} else {
> +		memcpy(buf, vaddr + offset, csize);
> +	}
> +
> +	iounmap(vaddr);
> +
> +	return csize;
> +}
> +
> +/**
> + * elfcorehdr_read - read from ELF core header
> + * @buf: buffer where the data is placed
> + * @csize: number of bytes to read
> + * @ppos: address in the memory
> + *
> + * This function reads @count bytes from elf core header which exists
> + * on crash dump kernel's memory.
> + */
> +ssize_t elfcorehdr_read(char *buf, size_t count, u64 *ppos)
> +{
> +	memcpy(buf, phys_to_virt((phys_addr_t)*ppos), count);
> +	return count;
> +}

I know you say that we have to override this function so that we don't
hit the pfn_valid warning in ioremap, but what guarantees that the ELF
header of the crashed kernel is actually mapped in our linear mapping?

> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index 24f0a1c..52a1469 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -35,6 +35,7 @@
>  #include <linux/efi.h>
>  #include <linux/swiotlb.h>
>  #include <linux/kexec.h>
> +#include <linux/crash_dump.h>
>  
>  #include <asm/fixmap.h>
>  #include <asm/memory.h>
> @@ -116,6 +117,31 @@ static void __init reserve_crashkernel(void)
>  }
>  #endif /* CONFIG_KEXEC */
>  
> +#ifdef CONFIG_CRASH_DUMP
> +/*
> + * reserve_elfcorehdr() - reserves memory for elf core header
> + *
> + * This function reserves elf core header given in "elfcorehdr=" kernel
> + * command line parameter. This region contains all the information about
> + * primary kernel's core image and is used by a dump capture kernel to
> + * access the system memory on primary kernel.
> + */
> +static void __init reserve_elfcorehdr(void)
> +{
> +	if (!elfcorehdr_size)
> +		return;
> +
> +	if (memblock_is_region_reserved(elfcorehdr_addr, elfcorehdr_size)) {
> +		pr_warn("elfcorehdr is overlapped\n");
> +		return;
> +	}
> +
> +	memblock_reserve(elfcorehdr_addr, elfcorehdr_size);
> +
> +	pr_info("Reserving %lldKB of memory at %lldMB for elfcorehdr\n",
> +		elfcorehdr_size >> 10, elfcorehdr_addr >> 20);

I'd have thought it would be more useful to print the address as an
address rather than a size.

> +}

Similar #else trick here.

Will

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 08/16] arm64/kexec: Add core kexec support
  2015-11-24 22:25   ` Geoff Levand
@ 2015-12-15 18:29     ` Will Deacon
  -1 siblings, 0 replies; 178+ messages in thread
From: Will Deacon @ 2015-12-15 18:29 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Nov 24, 2015 at 10:25:34PM +0000, Geoff Levand wrote:
> Add three new files, kexec.h, machine_kexec.c and relocate_kernel.S to the
> arm64 architecture that add support for the kexec re-boot mechanism
> (CONFIG_KEXEC) on arm64 platforms.
> 
> Signed-off-by: Geoff Levand <geoff@infradead.org>
> ---
>  arch/arm64/Kconfig                  |  10 +++
>  arch/arm64/include/asm/kexec.h      |  48 ++++++++++++
>  arch/arm64/kernel/Makefile          |   2 +
>  arch/arm64/kernel/machine_kexec.c   | 152 ++++++++++++++++++++++++++++++++++++
>  arch/arm64/kernel/relocate_kernel.S | 131 +++++++++++++++++++++++++++++++
>  include/uapi/linux/kexec.h          |   1 +
>  6 files changed, 344 insertions(+)
>  create mode 100644 arch/arm64/include/asm/kexec.h
>  create mode 100644 arch/arm64/kernel/machine_kexec.c
>  create mode 100644 arch/arm64/kernel/relocate_kernel.S
> 
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 9ac16a4..c23fd77 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -535,6 +535,16 @@ config SECCOMP
>  	  and the task is only allowed to execute a few safe syscalls
>  	  defined by each seccomp mode.
>  
> +config KEXEC
> +	depends on PM_SLEEP_SMP
> +	select KEXEC_CORE
> +	bool "kexec system call"
> +	---help---
> +	  kexec is a system call that implements the ability to shutdown your
> +	  current kernel, and to start another kernel.  It is like a reboot
> +	  but it is independent of the system firmware.   And like a reboot
> +	  you can start any kernel with it, not just Linux.
> +
>  config XEN_DOM0
>  	def_bool y
>  	depends on XEN
> diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
> new file mode 100644
> index 0000000..46d63cd
> --- /dev/null
> +++ b/arch/arm64/include/asm/kexec.h
> @@ -0,0 +1,48 @@
> +/*
> + * kexec for arm64
> + *
> + * Copyright (C) Linaro.
> + * Copyright (C) Huawei Futurewei Technologies.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +
> +#if !defined(_ARM64_KEXEC_H)
> +#define _ARM64_KEXEC_H

Please keep to the style used elsewhere in the arch headers.

> +
> +/* Maximum physical address we can use pages from */
> +
> +#define KEXEC_SOURCE_MEMORY_LIMIT (-1UL)
> +
> +/* Maximum address we can reach in physical address mode */
> +
> +#define KEXEC_DESTINATION_MEMORY_LIMIT (-1UL)
> +
> +/* Maximum address we can use for the control code buffer */
> +
> +#define KEXEC_CONTROL_MEMORY_LIMIT (-1UL)
> +
> +#define KEXEC_CONTROL_PAGE_SIZE	4096

Does this work on kernels configured with 64k pages? It looks like the
kexec core code will end up using order-0 pages, so I worry that we'll
actually put down 64k and potentially confuse a 4k crash kernel, for
example.

> +#define KEXEC_ARCH KEXEC_ARCH_ARM64
> +
> +#if !defined(__ASSEMBLY__)

#ifndef

> diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
> new file mode 100644
> index 0000000..8b990b8
> --- /dev/null
> +++ b/arch/arm64/kernel/machine_kexec.c
> @@ -0,0 +1,152 @@
> +/*
> + * kexec for arm64
> + *
> + * Copyright (C) Linaro.
> + * Copyright (C) Huawei Futurewei Technologies.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +
> +#include <linux/kexec.h>
> +#include <linux/of_fdt.h>
> +#include <linux/slab.h>
> +#include <linux/uaccess.h>
> +
> +#include <asm/cacheflush.h>
> +#include <asm/system_misc.h>
> +
> +#include "cpu-reset.h"
> +
> +/* Global variables for the arm64_relocate_new_kernel routine. */
> +extern const unsigned char arm64_relocate_new_kernel[];
> +extern const unsigned long arm64_relocate_new_kernel_size;
> +
> +static unsigned long kimage_start;
> +
> +void machine_kexec_cleanup(struct kimage *kimage)
> +{
> +	/* Empty routine needed to avoid build errors. */
> +}
> +
> +/**
> + * machine_kexec_prepare - Prepare for a kexec reboot.
> + *
> + * Called from the core kexec code when a kernel image is loaded.
> + */
> +int machine_kexec_prepare(struct kimage *kimage)
> +{
> +	kimage_start = kimage->start;
> +	return 0;
> +}
> +
> +/**
> + * kexec_list_flush - Helper to flush the kimage list to PoC.
> + */
> +static void kexec_list_flush(unsigned long kimage_head)
> +{
> +	unsigned long *entry;
> +
> +	for (entry = &kimage_head; ; entry++) {
> +		unsigned int flag = *entry & IND_FLAGS;
> +		void *addr = phys_to_virt(*entry & PAGE_MASK);
> +
> +		switch (flag) {
> +		case IND_INDIRECTION:
> +			entry = (unsigned long *)addr - 1;
> +			__flush_dcache_area(addr, PAGE_SIZE);
> +			break;
> +		case IND_DESTINATION:
> +			break;
> +		case IND_SOURCE:
> +			__flush_dcache_area(addr, PAGE_SIZE);
> +			break;
> +		case IND_DONE:
> +			return;
> +		default:
> +			BUG();
> +		}
> +	}
> +}
> +
> +/**
> + * kexec_segment_flush - Helper to flush the kimage segments to PoC.
> + */
> +static void kexec_segment_flush(const struct kimage *kimage)
> +{
> +	unsigned long i;
> +
> +	pr_devel("%s:\n", __func__);
> +
> +	for (i = 0; i < kimage->nr_segments; i++) {
> +		pr_devel("  segment[%lu]: %016lx - %016lx, %lx bytes, %lu pages\n",
> +			i,
> +			kimage->segment[i].mem,
> +			kimage->segment[i].mem + kimage->segment[i].memsz,
> +			kimage->segment[i].memsz,
> +			kimage->segment[i].memsz /  PAGE_SIZE);
> +
> +		__flush_dcache_area(phys_to_virt(kimage->segment[i].mem),
> +			kimage->segment[i].memsz);
> +	}
> +}
> +
> +/**
> + * machine_kexec - Do the kexec reboot.
> + *
> + * Called from the core kexec code for a sys_reboot with LINUX_REBOOT_CMD_KEXEC.
> + */
> +void machine_kexec(struct kimage *kimage)
> +{
> +	phys_addr_t reboot_code_buffer_phys;
> +	void *reboot_code_buffer;
> +
> +	BUG_ON(num_online_cpus() > 1);
> +
> +	reboot_code_buffer_phys = page_to_phys(kimage->control_code_page);
> +	reboot_code_buffer = phys_to_virt(reboot_code_buffer_phys);
> +
> +	/*
> +	 * Copy arm64_relocate_new_kernel to the reboot_code_buffer for use
> +	 * after the kernel is shut down.
> +	 */
> +	memcpy(reboot_code_buffer, arm64_relocate_new_kernel,
> +		arm64_relocate_new_kernel_size);

At which point does the I-cache get invalidated for this?

> +
> +	/* Flush the reboot_code_buffer in preparation for its execution. */
> +	__flush_dcache_area(reboot_code_buffer, arm64_relocate_new_kernel_size);
> +
> +	/* Flush the new image. */
> +	kexec_segment_flush(kimage);
> +
> +	/* Flush the kimage list. */
> +	kexec_list_flush(kimage->head);
> +
> +	pr_info("Bye!\n");
> +
> +	/* Disable all DAIF exceptions. */
> +	asm volatile ("msr daifset, #0xf" : : : "memory");

Can we not use our helpers for this?

> +
> +	setup_mm_for_reboot();
> +
> +	/*
> +	 * cpu_soft_restart will shutdown the MMU, disable data caches, then
> +	 * transfer control to the reboot_code_buffer which contains a copy of
> +	 * the arm64_relocate_new_kernel routine.  arm64_relocate_new_kernel
> +	 * uses physical addressing to relocate the new image to its final
> +	 * position and transfers control to the image entry point when the
> +	 * relocation is complete.
> +	 */
> +
> +	cpu_soft_restart(virt_to_phys(cpu_reset),
> +		is_hyp_mode_available(),
> +		reboot_code_buffer_phys, kimage->head, kimage_start);
> +
> +	BUG(); /* Should never get here. */
> +}
> +
> +void machine_crash_shutdown(struct pt_regs *regs)
> +{
> +	/* Empty routine needed to avoid build errors. */
> +}
> diff --git a/arch/arm64/kernel/relocate_kernel.S b/arch/arm64/kernel/relocate_kernel.S
> new file mode 100644
> index 0000000..71cab0e
> --- /dev/null
> +++ b/arch/arm64/kernel/relocate_kernel.S
> @@ -0,0 +1,131 @@
> +/*
> + * kexec for arm64
> + *
> + * Copyright (C) Linaro.
> + * Copyright (C) Huawei Futurewei Technologies.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +
> +#include <linux/kexec.h>
> +
> +#include <asm/assembler.h>
> +#include <asm/kexec.h>
> +#include <asm/kvm_arm.h>
> +#include <asm/page.h>
> +
> +/*
> + * arm64_relocate_new_kernel - Put a 2nd stage image in place and boot it.
> + *
> + * The memory that the old kernel occupies may be overwritten when coping the
> + * new image to its final location.  To assure that the
> + * arm64_relocate_new_kernel routine which does that copy is not overwritten,
> + * all code and data needed by arm64_relocate_new_kernel must be between the
> + * symbols arm64_relocate_new_kernel and arm64_relocate_new_kernel_end.  The
> + * machine_kexec() routine will copy arm64_relocate_new_kernel to the kexec
> + * control_code_page, a special page which has been set up to be preserved
> + * during the copy operation.
> + */
> +.globl arm64_relocate_new_kernel
> +arm64_relocate_new_kernel:
> +
> +	/* Setup the list loop variables. */
> +	mov	x18, x0				/* x18 = kimage_head */
> +	mov	x17, x1				/* x17 = kimage_start */
> +	dcache_line_size x16, x0		/* x16 = dcache line size */

Why is this needed?

> +	mov	x15, xzr			/* x15 = segment start */
> +	mov	x14, xzr			/* x14 = entry ptr */
> +	mov	x13, xzr			/* x13 = copy dest */
> +
> +	/* Clear the sctlr_el2 flags. */
> +	mrs	x0, CurrentEL
> +	cmp	x0, #CurrentEL_EL2
> +	b.ne	1f
> +	mrs	x0, sctlr_el2
> +	ldr	x1, =SCTLR_EL2_FLAGS

If we're using literal pools, we probably want a .ltorg directive somewhere.

> +	bic	x0, x0, x1
> +	msr	sctlr_el2, x0
> +	isb
> +1:
> +
> +	/* Check if the new image needs relocation. */
> +	cbz	x18, .Ldone
> +	tbnz	x18, IND_DONE_BIT, .Ldone
> +
> +.Lloop:
> +	and	x12, x18, PAGE_MASK		/* x12 = addr */
> +
> +	/* Test the entry flags. */
> +.Ltest_source:
> +	tbz	x18, IND_SOURCE_BIT, .Ltest_indirection
> +
> +	mov x20, x13				/*  x20 = copy dest */
> +	mov x21, x12				/*  x21 = copy src */

Weird indentation.

> +	/* Copy page. */
> +1:	ldp	x22, x23, [x21]
> +	ldp	x24, x25, [x21, #16]
> +	ldp	x26, x27, [x21, #32]
> +	ldp	x28, x29, [x21, #48]
> +	add	x21, x21, #64
> +	stnp	x22, x23, [x20]
> +	stnp	x24, x25, [x20, #16]
> +	stnp	x26, x27, [x20, #32]
> +	stnp	x28, x29, [x20, #48]
> +	add	x20, x20, #64
> +	tst	x21, #(PAGE_SIZE - 1)
> +	b.ne	1b

We should macroise this, to save on duplication of a common routine.
You also need to address the caching issues that Mark raised separately.

> +	/* dest += PAGE_SIZE */
> +	add	x13, x13, PAGE_SIZE
> +	b	.Lnext
> +
> +.Ltest_indirection:
> +	tbz	x18, IND_INDIRECTION_BIT, .Ltest_destination
> +
> +	/* ptr = addr */
> +	mov	x14, x12
> +	b	.Lnext
> +
> +.Ltest_destination:
> +	tbz	x18, IND_DESTINATION_BIT, .Lnext
> +
> +	mov	x15, x12
> +
> +	/* dest = addr */
> +	mov	x13, x12
> +
> +.Lnext:
> +	/* entry = *ptr++ */
> +	ldr	x18, [x14], #8
> +
> +	/* while (!(entry & DONE)) */
> +	tbz	x18, IND_DONE_BIT, .Lloop
> +
> +.Ldone:
> +	dsb	sy
> +	ic	ialluis

I don't think this needs to be inner-shareable, and these dsbs can probably
be non-shareable too.

> +	dsb	sy
> +	isb
> +
> +	/* Start new image. */
> +	mov	x0, xzr
> +	mov	x1, xzr
> +	mov	x2, xzr
> +	mov	x3, xzr
> +	br	x17
> +
> +.align 3	/* To keep the 64-bit values below naturally aligned. */
> +
> +.Lcopy_end:
> +.org	KEXEC_CONTROL_PAGE_SIZE
> +
> +/*
> + * arm64_relocate_new_kernel_size - Number of bytes to copy to the
> + * control_code_page.
> + */
> +.globl arm64_relocate_new_kernel_size
> +arm64_relocate_new_kernel_size:
> +	.quad	.Lcopy_end - arm64_relocate_new_kernel
> diff --git a/include/uapi/linux/kexec.h b/include/uapi/linux/kexec.h
> index 99048e5..ccec467 100644
> --- a/include/uapi/linux/kexec.h
> +++ b/include/uapi/linux/kexec.h
> @@ -39,6 +39,7 @@
>  #define KEXEC_ARCH_SH      (42 << 16)
>  #define KEXEC_ARCH_MIPS_LE (10 << 16)
>  #define KEXEC_ARCH_MIPS    ( 8 << 16)
> +#define KEXEC_ARCH_ARM64   (183 << 16)

This should probably be called KEXEC_ARCH_AARCH64 for consistency with
the ELF machine name.

Will

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 08/16] arm64/kexec: Add core kexec support
@ 2015-12-15 18:29     ` Will Deacon
  0 siblings, 0 replies; 178+ messages in thread
From: Will Deacon @ 2015-12-15 18:29 UTC (permalink / raw)
  To: Geoff Levand
  Cc: Mark Rutland, marc.zyngier, Catalin Marinas, kexec,
	AKASHI Takahiro, linux-arm-kernel, christoffer.dall

On Tue, Nov 24, 2015 at 10:25:34PM +0000, Geoff Levand wrote:
> Add three new files, kexec.h, machine_kexec.c and relocate_kernel.S to the
> arm64 architecture that add support for the kexec re-boot mechanism
> (CONFIG_KEXEC) on arm64 platforms.
> 
> Signed-off-by: Geoff Levand <geoff@infradead.org>
> ---
>  arch/arm64/Kconfig                  |  10 +++
>  arch/arm64/include/asm/kexec.h      |  48 ++++++++++++
>  arch/arm64/kernel/Makefile          |   2 +
>  arch/arm64/kernel/machine_kexec.c   | 152 ++++++++++++++++++++++++++++++++++++
>  arch/arm64/kernel/relocate_kernel.S | 131 +++++++++++++++++++++++++++++++
>  include/uapi/linux/kexec.h          |   1 +
>  6 files changed, 344 insertions(+)
>  create mode 100644 arch/arm64/include/asm/kexec.h
>  create mode 100644 arch/arm64/kernel/machine_kexec.c
>  create mode 100644 arch/arm64/kernel/relocate_kernel.S
> 
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 9ac16a4..c23fd77 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -535,6 +535,16 @@ config SECCOMP
>  	  and the task is only allowed to execute a few safe syscalls
>  	  defined by each seccomp mode.
>  
> +config KEXEC
> +	depends on PM_SLEEP_SMP
> +	select KEXEC_CORE
> +	bool "kexec system call"
> +	---help---
> +	  kexec is a system call that implements the ability to shutdown your
> +	  current kernel, and to start another kernel.  It is like a reboot
> +	  but it is independent of the system firmware.   And like a reboot
> +	  you can start any kernel with it, not just Linux.
> +
>  config XEN_DOM0
>  	def_bool y
>  	depends on XEN
> diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
> new file mode 100644
> index 0000000..46d63cd
> --- /dev/null
> +++ b/arch/arm64/include/asm/kexec.h
> @@ -0,0 +1,48 @@
> +/*
> + * kexec for arm64
> + *
> + * Copyright (C) Linaro.
> + * Copyright (C) Huawei Futurewei Technologies.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +
> +#if !defined(_ARM64_KEXEC_H)
> +#define _ARM64_KEXEC_H

Please keep to the style used elsewhere in the arch headers.

> +
> +/* Maximum physical address we can use pages from */
> +
> +#define KEXEC_SOURCE_MEMORY_LIMIT (-1UL)
> +
> +/* Maximum address we can reach in physical address mode */
> +
> +#define KEXEC_DESTINATION_MEMORY_LIMIT (-1UL)
> +
> +/* Maximum address we can use for the control code buffer */
> +
> +#define KEXEC_CONTROL_MEMORY_LIMIT (-1UL)
> +
> +#define KEXEC_CONTROL_PAGE_SIZE	4096

Does this work on kernels configured with 64k pages? It looks like the
kexec core code will end up using order-0 pages, so I worry that we'll
actually put down 64k and potentially confuse a 4k crash kernel, for
example.

> +#define KEXEC_ARCH KEXEC_ARCH_ARM64
> +
> +#if !defined(__ASSEMBLY__)

#ifndef

> diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
> new file mode 100644
> index 0000000..8b990b8
> --- /dev/null
> +++ b/arch/arm64/kernel/machine_kexec.c
> @@ -0,0 +1,152 @@
> +/*
> + * kexec for arm64
> + *
> + * Copyright (C) Linaro.
> + * Copyright (C) Huawei Futurewei Technologies.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +
> +#include <linux/kexec.h>
> +#include <linux/of_fdt.h>
> +#include <linux/slab.h>
> +#include <linux/uaccess.h>
> +
> +#include <asm/cacheflush.h>
> +#include <asm/system_misc.h>
> +
> +#include "cpu-reset.h"
> +
> +/* Global variables for the arm64_relocate_new_kernel routine. */
> +extern const unsigned char arm64_relocate_new_kernel[];
> +extern const unsigned long arm64_relocate_new_kernel_size;
> +
> +static unsigned long kimage_start;
> +
> +void machine_kexec_cleanup(struct kimage *kimage)
> +{
> +	/* Empty routine needed to avoid build errors. */
> +}
> +
> +/**
> + * machine_kexec_prepare - Prepare for a kexec reboot.
> + *
> + * Called from the core kexec code when a kernel image is loaded.
> + */
> +int machine_kexec_prepare(struct kimage *kimage)
> +{
> +	kimage_start = kimage->start;
> +	return 0;
> +}
> +
> +/**
> + * kexec_list_flush - Helper to flush the kimage list to PoC.
> + */
> +static void kexec_list_flush(unsigned long kimage_head)
> +{
> +	unsigned long *entry;
> +
> +	for (entry = &kimage_head; ; entry++) {
> +		unsigned int flag = *entry & IND_FLAGS;
> +		void *addr = phys_to_virt(*entry & PAGE_MASK);
> +
> +		switch (flag) {
> +		case IND_INDIRECTION:
> +			entry = (unsigned long *)addr - 1;
> +			__flush_dcache_area(addr, PAGE_SIZE);
> +			break;
> +		case IND_DESTINATION:
> +			break;
> +		case IND_SOURCE:
> +			__flush_dcache_area(addr, PAGE_SIZE);
> +			break;
> +		case IND_DONE:
> +			return;
> +		default:
> +			BUG();
> +		}
> +	}
> +}
> +
> +/**
> + * kexec_segment_flush - Helper to flush the kimage segments to PoC.
> + */
> +static void kexec_segment_flush(const struct kimage *kimage)
> +{
> +	unsigned long i;
> +
> +	pr_devel("%s:\n", __func__);
> +
> +	for (i = 0; i < kimage->nr_segments; i++) {
> +		pr_devel("  segment[%lu]: %016lx - %016lx, %lx bytes, %lu pages\n",
> +			i,
> +			kimage->segment[i].mem,
> +			kimage->segment[i].mem + kimage->segment[i].memsz,
> +			kimage->segment[i].memsz,
> +			kimage->segment[i].memsz /  PAGE_SIZE);
> +
> +		__flush_dcache_area(phys_to_virt(kimage->segment[i].mem),
> +			kimage->segment[i].memsz);
> +	}
> +}
> +
> +/**
> + * machine_kexec - Do the kexec reboot.
> + *
> + * Called from the core kexec code for a sys_reboot with LINUX_REBOOT_CMD_KEXEC.
> + */
> +void machine_kexec(struct kimage *kimage)
> +{
> +	phys_addr_t reboot_code_buffer_phys;
> +	void *reboot_code_buffer;
> +
> +	BUG_ON(num_online_cpus() > 1);
> +
> +	reboot_code_buffer_phys = page_to_phys(kimage->control_code_page);
> +	reboot_code_buffer = phys_to_virt(reboot_code_buffer_phys);
> +
> +	/*
> +	 * Copy arm64_relocate_new_kernel to the reboot_code_buffer for use
> +	 * after the kernel is shut down.
> +	 */
> +	memcpy(reboot_code_buffer, arm64_relocate_new_kernel,
> +		arm64_relocate_new_kernel_size);

At which point does the I-cache get invalidated for this?

> +
> +	/* Flush the reboot_code_buffer in preparation for its execution. */
> +	__flush_dcache_area(reboot_code_buffer, arm64_relocate_new_kernel_size);
> +
> +	/* Flush the new image. */
> +	kexec_segment_flush(kimage);
> +
> +	/* Flush the kimage list. */
> +	kexec_list_flush(kimage->head);
> +
> +	pr_info("Bye!\n");
> +
> +	/* Disable all DAIF exceptions. */
> +	asm volatile ("msr daifset, #0xf" : : : "memory");

Can we not use our helpers for this?

> +
> +	setup_mm_for_reboot();
> +
> +	/*
> +	 * cpu_soft_restart will shutdown the MMU, disable data caches, then
> +	 * transfer control to the reboot_code_buffer which contains a copy of
> +	 * the arm64_relocate_new_kernel routine.  arm64_relocate_new_kernel
> +	 * uses physical addressing to relocate the new image to its final
> +	 * position and transfers control to the image entry point when the
> +	 * relocation is complete.
> +	 */
> +
> +	cpu_soft_restart(virt_to_phys(cpu_reset),
> +		is_hyp_mode_available(),
> +		reboot_code_buffer_phys, kimage->head, kimage_start);
> +
> +	BUG(); /* Should never get here. */
> +}
> +
> +void machine_crash_shutdown(struct pt_regs *regs)
> +{
> +	/* Empty routine needed to avoid build errors. */
> +}
> diff --git a/arch/arm64/kernel/relocate_kernel.S b/arch/arm64/kernel/relocate_kernel.S
> new file mode 100644
> index 0000000..71cab0e
> --- /dev/null
> +++ b/arch/arm64/kernel/relocate_kernel.S
> @@ -0,0 +1,131 @@
> +/*
> + * kexec for arm64
> + *
> + * Copyright (C) Linaro.
> + * Copyright (C) Huawei Futurewei Technologies.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +
> +#include <linux/kexec.h>
> +
> +#include <asm/assembler.h>
> +#include <asm/kexec.h>
> +#include <asm/kvm_arm.h>
> +#include <asm/page.h>
> +
> +/*
> + * arm64_relocate_new_kernel - Put a 2nd stage image in place and boot it.
> + *
> + * The memory that the old kernel occupies may be overwritten when coping the
> + * new image to its final location.  To assure that the
> + * arm64_relocate_new_kernel routine which does that copy is not overwritten,
> + * all code and data needed by arm64_relocate_new_kernel must be between the
> + * symbols arm64_relocate_new_kernel and arm64_relocate_new_kernel_end.  The
> + * machine_kexec() routine will copy arm64_relocate_new_kernel to the kexec
> + * control_code_page, a special page which has been set up to be preserved
> + * during the copy operation.
> + */
> +.globl arm64_relocate_new_kernel
> +arm64_relocate_new_kernel:
> +
> +	/* Setup the list loop variables. */
> +	mov	x18, x0				/* x18 = kimage_head */
> +	mov	x17, x1				/* x17 = kimage_start */
> +	dcache_line_size x16, x0		/* x16 = dcache line size */

Why is this needed?

> +	mov	x15, xzr			/* x15 = segment start */
> +	mov	x14, xzr			/* x14 = entry ptr */
> +	mov	x13, xzr			/* x13 = copy dest */
> +
> +	/* Clear the sctlr_el2 flags. */
> +	mrs	x0, CurrentEL
> +	cmp	x0, #CurrentEL_EL2
> +	b.ne	1f
> +	mrs	x0, sctlr_el2
> +	ldr	x1, =SCTLR_EL2_FLAGS

If we're using literal pools, we probably want a .ltorg directive somewhere.

> +	bic	x0, x0, x1
> +	msr	sctlr_el2, x0
> +	isb
> +1:
> +
> +	/* Check if the new image needs relocation. */
> +	cbz	x18, .Ldone
> +	tbnz	x18, IND_DONE_BIT, .Ldone
> +
> +.Lloop:
> +	and	x12, x18, PAGE_MASK		/* x12 = addr */
> +
> +	/* Test the entry flags. */
> +.Ltest_source:
> +	tbz	x18, IND_SOURCE_BIT, .Ltest_indirection
> +
> +	mov x20, x13				/*  x20 = copy dest */
> +	mov x21, x12				/*  x21 = copy src */

Weird indentation.

> +	/* Copy page. */
> +1:	ldp	x22, x23, [x21]
> +	ldp	x24, x25, [x21, #16]
> +	ldp	x26, x27, [x21, #32]
> +	ldp	x28, x29, [x21, #48]
> +	add	x21, x21, #64
> +	stnp	x22, x23, [x20]
> +	stnp	x24, x25, [x20, #16]
> +	stnp	x26, x27, [x20, #32]
> +	stnp	x28, x29, [x20, #48]
> +	add	x20, x20, #64
> +	tst	x21, #(PAGE_SIZE - 1)
> +	b.ne	1b

We should macroise this, to save on duplication of a common routine.
You also need to address the caching issues that Mark raised separately.

> +	/* dest += PAGE_SIZE */
> +	add	x13, x13, PAGE_SIZE
> +	b	.Lnext
> +
> +.Ltest_indirection:
> +	tbz	x18, IND_INDIRECTION_BIT, .Ltest_destination
> +
> +	/* ptr = addr */
> +	mov	x14, x12
> +	b	.Lnext
> +
> +.Ltest_destination:
> +	tbz	x18, IND_DESTINATION_BIT, .Lnext
> +
> +	mov	x15, x12
> +
> +	/* dest = addr */
> +	mov	x13, x12
> +
> +.Lnext:
> +	/* entry = *ptr++ */
> +	ldr	x18, [x14], #8
> +
> +	/* while (!(entry & DONE)) */
> +	tbz	x18, IND_DONE_BIT, .Lloop
> +
> +.Ldone:
> +	dsb	sy
> +	ic	ialluis

I don't think this needs to be inner-shareable, and these dsbs can probably
be non-shareable too.

> +	dsb	sy
> +	isb
> +
> +	/* Start new image. */
> +	mov	x0, xzr
> +	mov	x1, xzr
> +	mov	x2, xzr
> +	mov	x3, xzr
> +	br	x17
> +
> +.align 3	/* To keep the 64-bit values below naturally aligned. */
> +
> +.Lcopy_end:
> +.org	KEXEC_CONTROL_PAGE_SIZE
> +
> +/*
> + * arm64_relocate_new_kernel_size - Number of bytes to copy to the
> + * control_code_page.
> + */
> +.globl arm64_relocate_new_kernel_size
> +arm64_relocate_new_kernel_size:
> +	.quad	.Lcopy_end - arm64_relocate_new_kernel
> diff --git a/include/uapi/linux/kexec.h b/include/uapi/linux/kexec.h
> index 99048e5..ccec467 100644
> --- a/include/uapi/linux/kexec.h
> +++ b/include/uapi/linux/kexec.h
> @@ -39,6 +39,7 @@
>  #define KEXEC_ARCH_SH      (42 << 16)
>  #define KEXEC_ARCH_MIPS_LE (10 << 16)
>  #define KEXEC_ARCH_MIPS    ( 8 << 16)
> +#define KEXEC_ARCH_ARM64   (183 << 16)

This should probably be called KEXEC_ARCH_AARCH64 for consistency with
the ELF machine name.

Will

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 08/16] arm64/kexec: Add core kexec support
  2015-12-15 18:29     ` Will Deacon
@ 2015-12-16  0:14       ` Geoff Levand
  -1 siblings, 0 replies; 178+ messages in thread
From: Geoff Levand @ 2015-12-16  0:14 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Will,

I'll post a v12.4 of this patch that addresses your comments.

On Tue, 2015-12-15 at 18:29 +0000, Will Deacon wrote:
> +#if !defined(_ARM64_KEXEC_H)
> > +#define _ARM64_KEXEC_H
> 
> Please keep to the style used elsewhere in the arch headers.

OK.

> > +
> > +#define KEXEC_CONTROL_PAGE_SIZE> > 	> > 4096
> 
> Does this work on kernels configured with 64k pages? It looks like the
> kexec core code will end up using order-0 pages, so I worry that we'll
> actually put down 64k and potentially confuse a 4k crash kernel, for
> example.

KEXEC_CONTROL_PAGE_SIZE just tells the core kexec code how big
we need control_code_buffer to be.  That buffer is only used by
the arch code of the first stage kernel.  With 64k pages the buffer
will be a full page, but we'll only use the first 4k of it. 

> > +#define KEXEC_ARCH KEXEC_ARCH_ARM64
> > +
> > +#if !defined(__ASSEMBLY__)
> 
> #ifndef

OK.

> > + * machine_kexec - Do the kexec reboot.
> > + *
> > + * Called from the core kexec code for a sys_reboot with LINUX_REBOOT_CMD_KEXEC.
> > + */
> > +void machine_kexec(struct kimage *kimage)
> > +{
> > +> > 	> > phys_addr_t reboot_code_buffer_phys;
> > +> > 	> > void *reboot_code_buffer;
> > +
> > +> > 	> > BUG_ON(num_online_cpus() > 1);
> > +
> > +> > 	> > reboot_code_buffer_phys = page_to_phys(kimage->control_code_page);
> > +> > 	> > reboot_code_buffer = phys_to_virt(reboot_code_buffer_phys);
> > +
> > +> > 	> > /*
> > +> > 	> >  * Copy arm64_relocate_new_kernel to the reboot_code_buffer for use
> > +> > 	> >  * after the kernel is shut down.
> > +> > 	> >  */
> > +> > 	> > memcpy(reboot_code_buffer, arm64_relocate_new_kernel,
> > +> > 	> > 	> > arm64_relocate_new_kernel_size);
> 
> At which point does the I-cache get invalidated for this?

I'll add a call to flush_icache_range() for reboot_code_buffer.  I
think that should do it.

> > +
> > +> > 	> > /* Flush the reboot_code_buffer in preparation for its execution. */
> > +> > 	> > __flush_dcache_area(reboot_code_buffer, arm64_relocate_new_kernel_size);
> > +
> > +> > 	> > /* Flush the new image. */
> > +> > 	> > kexec_segment_flush(kimage);
> > +
> > +> > 	> > /* Flush the kimage list. */
> > +> > 	> > kexec_list_flush(kimage->head);
> > +
> > +> > 	> > pr_info("Bye!\n");
> > +
> > +> > 	> > /* Disable all DAIF exceptions. */
> > +> > 	> > asm volatile ("msr > > daifset> > , #0xf" : : :
> > "memory");
> 
> Can we not use our helpers for this?

Mark Rutland had commented that calling daifset four times
through the different macros took considerable time, and
recommended a single call here.

Would you prefer a new macro for irqflags.h, maybe

  #define local_daif_disable() asm("msr daifset, #0xf" : : : "memory")?

> > +/*
> > + * arm64_relocate_new_kernel - Put a 2nd stage image in place and boot it.
> > + *
> > + * The memory that the old kernel occupies may be overwritten when coping the
> > + * new image to its final location.  To assure that the
> > + * arm64_relocate_new_kernel routine which does that copy is not overwritten,
> > + * all code and data needed by arm64_relocate_new_kernel must be between the
> > + * symbols arm64_relocate_new_kernel and arm64_relocate_new_kernel_end.  The
> > + * machine_kexec() routine will copy arm64_relocate_new_kernel to the kexec
> > + * control_code_page, a special page which has been set up to be preserved
> > + * during the copy operation.
> > + */
> > +.globl arm64_relocate_new_kernel
> > +arm64_relocate_new_kernel:
> > +
> > +> > 	> > /* Setup the list loop variables. */
> > +> > 	> > mov> > 	> > x18, x0> > 	> > 	> > 	> > 	> > /* x18 = kimage_head */
> > +> > 	> > mov> > 	> > x17, x1> > 	> > 	> > 	> > 	> > /* x17 = kimage_start */
> > +> > 	> > dcache_line_size x16, x0> > 	> > 	> > /* x16 = dcache line size */
> 
> Why is this needed?

This was left over from the previous version where we
invalidated pages while copying them.  I've since added
that invalidate back, so this is again needed.

> 
> > +> > 	> > mov> > 	> > x15, xzr> > 	> > 	> > 	> > /* x15 = segment start */
> > +> > 	> > mov> > 	> > x14, xzr> > 	> > 	> > 	> > /* x14 = entry ptr */
> > +> > 	> > mov> > 	> > x13, xzr> > 	> > 	> > 	> > /* x13 = copy dest */
> > +
> > +> > 	> > /* Clear the sctlr_el2 flags. */
> > +> > 	> > mrs> > 	> > x0, CurrentEL
> > +> > 	> > cmp> > 	> > x0, #CurrentEL_EL2
> > +> > 	> > b.ne> > 	> > 1f
> > +> > 	> > mrs> > 	> > x0, sctlr_el2
> > +> > 	> > ldr> > 	> > x1, =SCTLR_EL2_FLAGS
> 
> If we're using literal pools, we probably want a .ltorg directive somewhere.

I've added one in at the end of the arm64_relocate_new_kernel
code.

> > +> > 	> > bic> > 	> > x0, x0, x1
> > +> > 	> > msr> > 	> > sctlr_el2, x0
> > +> > 	> > isb
> > +1:
> > +
> > +> > 	> > /* Check if the new image needs relocation. */
> > +> > 	> > cbz> > 	> > x18, .Ldone
> > +> > 	> > tbnz> > 	> > x18, IND_DONE_BIT, .Ldone
> > +
> > +.Lloop:
> > +> > 	> > and> > 	> > x12, x18, PAGE_MASK> > 	> > 	> > /* x12 = addr */
> > +
> > +> > 	> > /* Test the entry flags. */
> > +.Ltest_source:
> > +> > 	> > tbz> > 	> > x18, IND_SOURCE_BIT, .Ltest_indirection
> > +
> > +> > 	> > mov x20, x13> > 	> > 	> > 	> > 	> > /*  x20 = copy dest */
> > +> > 	> > mov x21, x12> > 	> > 	> > 	> > 	> > /*  x21 = copy src */
> 
> Weird indentation.

Fixed.

> > +> > 	> > /* Copy page. */
> > +1:> > 	> > ldp> > 	> > x22, x23, [x21]
> > +> > 	> > ldp> > 	> > x24, x25, [x21, #16]
> > +> > 	> > ldp> > 	> > x26, x27, [x21, #32]
> > +> > 	> > ldp> > 	> > x28, x29, [x21, #48]
> > +> > 	> > add> > 	> > x21, x21, #64
> > +> > 	> > stnp> > 	> > x22, x23, [x20]
> > +> > 	> > stnp> > 	> > x24, x25, [x20, #16]
> > +> > 	> > stnp> > 	> > x26, x27, [x20, #32]
> > +> > 	> > stnp> > 	> > x28, x29, [x20, #48]
> > +> > 	> > add> > 	> > x20, x20, #64
> > +> > 	> > tst> > 	> > x21, #(PAGE_SIZE - 1)
> > +> > 	> > b.ne> > 	> > 1b
> 
> We should macroise this, to save on duplication of a common routine.

So something like this in assembler.h?

+/*
+ * copy_page - copy src to dest using temp registers t1-t8
+ */
+	.macro copy_page dest:req src:req t1:req t2:req t3:req t4:req t5:req t6:req t7:req t8:req
+1:	ldp	/t1, /t2, [/src]
+	ldp	/t3, /t4, [/src, #16]
+	ldp	/t5, /t6, [/src, #32]
+	ldp	/t7, /t8, [/src, #48]
+	add	/src, /src, #64
+	stnp	/t1, /t2, [/dest]
+	stnp	/t3, /t4, [/dest, #16]
+	stnp	/t5, /t6, [/dest, #32]
+	stnp	/t7, /t8, [/dest, #48]
+	add	/dest, /dest, #64
+	tst	/src, #(PAGE_SIZE - 1)
+	b.ne	1b
+	.endm

> You also need to address the caching issues that Mark raised separately.

Cache maintenance has been fixed (reintroduced) in the current code.

> 
> > +> > 	> > /* dest += PAGE_SIZE */
> > +> > 	> > add> > 	> > x13, x13, PAGE_SIZE
> > +> > 	> > b> > 	> > .Lnext
> > +
> > +.Ltest_indirection:
> > +> > 	> > tbz> > 	> > x18, IND_INDIRECTION_BIT, .Ltest_destination
> > +
> > +> > 	> > /* ptr = addr */
> > +> > 	> > mov> > 	> > x14, x12
> > +> > 	> > b> > 	> > .Lnext
> > +
> > +.Ltest_destination:
> > +> > 	> > tbz> > 	> > x18, IND_DESTINATION_BIT, .Lnext
> > +
> > +> > 	> > mov> > 	> > x15, x12
> > +
> > +> > 	> > /* dest = addr */
> > +> > 	> > mov> > 	> > x13, x12
> > +
> > +.Lnext:
> > +> > 	> > /* entry = *ptr++ */
> > +> > 	> > ldr> > 	> > x18, [x14], #8
> > +
> > +> > 	> > /* while (!(entry & DONE)) */
> > +> > 	> > tbz> > 	> > x18, IND_DONE_BIT, .Lloop
> > +
> > +.Ldone:
> > +> > 	> > dsb> > 	> > sy
> > +> > 	> > ic> > 	> > ialluis
> 
> I don't think this needs to be inner-shareable, and these dsbs can probably
> be non-shareable too.

OK.

> > +> > 	> > dsb> > 	> > sy
> > +> > 	> > isb
> > +
> > +> > 	> > /* Start new image. */
> > +> > 	> > mov> > 	> > x0, xzr
> > +> > 	> > mov> > 	> > x1, xzr
> > +> > 	> > mov> > 	> > x2, xzr
> > +> > 	> > mov> > 	> > x3, xzr
> > +> > 	> > br> > 	> > x17
> > +
> > +.align 3> > 	> > /* To keep the 64-bit values below naturally aligned. */
> > +
> > +.Lcopy_end:
> > +.org> > 	> > KEXEC_CONTROL_PAGE_SIZE
> > +
> > +/*
> > + * arm64_relocate_new_kernel_size - Number of bytes to copy to the
> > + * control_code_page.
> > + */
> > +.globl arm64_relocate_new_kernel_size
> > +arm64_relocate_new_kernel_size:
> > +> > 	> > .quad> > 	> > .Lcopy_end - arm64_relocate_new_kernel
> > diff --git a/include/uapi/linux/kexec.h b/include/uapi/linux/kexec.h
> > index 99048e5..ccec467 100644
> > --- a/include/uapi/linux/kexec.h
> > +++ b/include/uapi/linux/kexec.h
> > @@ -39,6 +39,7 @@
> >  #define KEXEC_ARCH_SH      (42 << 16)
> >  #define KEXEC_ARCH_MIPS_LE (10 << 16)
> >  #define KEXEC_ARCH_MIPS    ( 8 << 16)
> > +#define KEXEC_ARCH_ARM64   (183 << 16)
> 
> This should probably be called KEXEC_ARCH_AARCH64 for consistency with
> the ELF machine name.

OK.

-Geoff

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 08/16] arm64/kexec: Add core kexec support
@ 2015-12-16  0:14       ` Geoff Levand
  0 siblings, 0 replies; 178+ messages in thread
From: Geoff Levand @ 2015-12-16  0:14 UTC (permalink / raw)
  To: Will Deacon
  Cc: Mark Rutland, marc.zyngier, Catalin Marinas, kexec,
	AKASHI Takahiro, linux-arm-kernel, christoffer.dall

Hi Will,

I'll post a v12.4 of this patch that addresses your comments.

On Tue, 2015-12-15 at 18:29 +0000, Will Deacon wrote:
> +#if !defined(_ARM64_KEXEC_H)
> > +#define _ARM64_KEXEC_H
> 
> Please keep to the style used elsewhere in the arch headers.

OK.

> > +
> > +#define KEXEC_CONTROL_PAGE_SIZE> > 	> > 4096
> 
> Does this work on kernels configured with 64k pages? It looks like the
> kexec core code will end up using order-0 pages, so I worry that we'll
> actually put down 64k and potentially confuse a 4k crash kernel, for
> example.

KEXEC_CONTROL_PAGE_SIZE just tells the core kexec code how big
we need control_code_buffer to be.  That buffer is only used by
the arch code of the first stage kernel.  With 64k pages the buffer
will be a full page, but we'll only use the first 4k of it. 

> > +#define KEXEC_ARCH KEXEC_ARCH_ARM64
> > +
> > +#if !defined(__ASSEMBLY__)
> 
> #ifndef

OK.

> > + * machine_kexec - Do the kexec reboot.
> > + *
> > + * Called from the core kexec code for a sys_reboot with LINUX_REBOOT_CMD_KEXEC.
> > + */
> > +void machine_kexec(struct kimage *kimage)
> > +{
> > +> > 	> > phys_addr_t reboot_code_buffer_phys;
> > +> > 	> > void *reboot_code_buffer;
> > +
> > +> > 	> > BUG_ON(num_online_cpus() > 1);
> > +
> > +> > 	> > reboot_code_buffer_phys = page_to_phys(kimage->control_code_page);
> > +> > 	> > reboot_code_buffer = phys_to_virt(reboot_code_buffer_phys);
> > +
> > +> > 	> > /*
> > +> > 	> >  * Copy arm64_relocate_new_kernel to the reboot_code_buffer for use
> > +> > 	> >  * after the kernel is shut down.
> > +> > 	> >  */
> > +> > 	> > memcpy(reboot_code_buffer, arm64_relocate_new_kernel,
> > +> > 	> > 	> > arm64_relocate_new_kernel_size);
> 
> At which point does the I-cache get invalidated for this?

I'll add a call to flush_icache_range() for reboot_code_buffer.  I
think that should do it.

> > +
> > +> > 	> > /* Flush the reboot_code_buffer in preparation for its execution. */
> > +> > 	> > __flush_dcache_area(reboot_code_buffer, arm64_relocate_new_kernel_size);
> > +
> > +> > 	> > /* Flush the new image. */
> > +> > 	> > kexec_segment_flush(kimage);
> > +
> > +> > 	> > /* Flush the kimage list. */
> > +> > 	> > kexec_list_flush(kimage->head);
> > +
> > +> > 	> > pr_info("Bye!\n");
> > +
> > +> > 	> > /* Disable all DAIF exceptions. */
> > +> > 	> > asm volatile ("msr > > daifset> > , #0xf" : : :
> > "memory");
> 
> Can we not use our helpers for this?

Mark Rutland had commented that calling daifset four times
through the different macros took considerable time, and
recommended a single call here.

Would you prefer a new macro for irqflags.h, maybe

  #define local_daif_disable() asm("msr daifset, #0xf" : : : "memory")?

> > +/*
> > + * arm64_relocate_new_kernel - Put a 2nd stage image in place and boot it.
> > + *
> > + * The memory that the old kernel occupies may be overwritten when coping the
> > + * new image to its final location.  To assure that the
> > + * arm64_relocate_new_kernel routine which does that copy is not overwritten,
> > + * all code and data needed by arm64_relocate_new_kernel must be between the
> > + * symbols arm64_relocate_new_kernel and arm64_relocate_new_kernel_end.  The
> > + * machine_kexec() routine will copy arm64_relocate_new_kernel to the kexec
> > + * control_code_page, a special page which has been set up to be preserved
> > + * during the copy operation.
> > + */
> > +.globl arm64_relocate_new_kernel
> > +arm64_relocate_new_kernel:
> > +
> > +> > 	> > /* Setup the list loop variables. */
> > +> > 	> > mov> > 	> > x18, x0> > 	> > 	> > 	> > 	> > /* x18 = kimage_head */
> > +> > 	> > mov> > 	> > x17, x1> > 	> > 	> > 	> > 	> > /* x17 = kimage_start */
> > +> > 	> > dcache_line_size x16, x0> > 	> > 	> > /* x16 = dcache line size */
> 
> Why is this needed?

This was left over from the previous version where we
invalidated pages while copying them.  I've since added
that invalidate back, so this is again needed.

> 
> > +> > 	> > mov> > 	> > x15, xzr> > 	> > 	> > 	> > /* x15 = segment start */
> > +> > 	> > mov> > 	> > x14, xzr> > 	> > 	> > 	> > /* x14 = entry ptr */
> > +> > 	> > mov> > 	> > x13, xzr> > 	> > 	> > 	> > /* x13 = copy dest */
> > +
> > +> > 	> > /* Clear the sctlr_el2 flags. */
> > +> > 	> > mrs> > 	> > x0, CurrentEL
> > +> > 	> > cmp> > 	> > x0, #CurrentEL_EL2
> > +> > 	> > b.ne> > 	> > 1f
> > +> > 	> > mrs> > 	> > x0, sctlr_el2
> > +> > 	> > ldr> > 	> > x1, =SCTLR_EL2_FLAGS
> 
> If we're using literal pools, we probably want a .ltorg directive somewhere.

I've added one in at the end of the arm64_relocate_new_kernel
code.

> > +> > 	> > bic> > 	> > x0, x0, x1
> > +> > 	> > msr> > 	> > sctlr_el2, x0
> > +> > 	> > isb
> > +1:
> > +
> > +> > 	> > /* Check if the new image needs relocation. */
> > +> > 	> > cbz> > 	> > x18, .Ldone
> > +> > 	> > tbnz> > 	> > x18, IND_DONE_BIT, .Ldone
> > +
> > +.Lloop:
> > +> > 	> > and> > 	> > x12, x18, PAGE_MASK> > 	> > 	> > /* x12 = addr */
> > +
> > +> > 	> > /* Test the entry flags. */
> > +.Ltest_source:
> > +> > 	> > tbz> > 	> > x18, IND_SOURCE_BIT, .Ltest_indirection
> > +
> > +> > 	> > mov x20, x13> > 	> > 	> > 	> > 	> > /*  x20 = copy dest */
> > +> > 	> > mov x21, x12> > 	> > 	> > 	> > 	> > /*  x21 = copy src */
> 
> Weird indentation.

Fixed.

> > +> > 	> > /* Copy page. */
> > +1:> > 	> > ldp> > 	> > x22, x23, [x21]
> > +> > 	> > ldp> > 	> > x24, x25, [x21, #16]
> > +> > 	> > ldp> > 	> > x26, x27, [x21, #32]
> > +> > 	> > ldp> > 	> > x28, x29, [x21, #48]
> > +> > 	> > add> > 	> > x21, x21, #64
> > +> > 	> > stnp> > 	> > x22, x23, [x20]
> > +> > 	> > stnp> > 	> > x24, x25, [x20, #16]
> > +> > 	> > stnp> > 	> > x26, x27, [x20, #32]
> > +> > 	> > stnp> > 	> > x28, x29, [x20, #48]
> > +> > 	> > add> > 	> > x20, x20, #64
> > +> > 	> > tst> > 	> > x21, #(PAGE_SIZE - 1)
> > +> > 	> > b.ne> > 	> > 1b
> 
> We should macroise this, to save on duplication of a common routine.

So something like this in assembler.h?

+/*
+ * copy_page - copy src to dest using temp registers t1-t8
+ */
+	.macro copy_page dest:req src:req t1:req t2:req t3:req t4:req t5:req t6:req t7:req t8:req
+1:	ldp	/t1, /t2, [/src]
+	ldp	/t3, /t4, [/src, #16]
+	ldp	/t5, /t6, [/src, #32]
+	ldp	/t7, /t8, [/src, #48]
+	add	/src, /src, #64
+	stnp	/t1, /t2, [/dest]
+	stnp	/t3, /t4, [/dest, #16]
+	stnp	/t5, /t6, [/dest, #32]
+	stnp	/t7, /t8, [/dest, #48]
+	add	/dest, /dest, #64
+	tst	/src, #(PAGE_SIZE - 1)
+	b.ne	1b
+	.endm

> You also need to address the caching issues that Mark raised separately.

Cache maintenance has been fixed (reintroduced) in the current code.

> 
> > +> > 	> > /* dest += PAGE_SIZE */
> > +> > 	> > add> > 	> > x13, x13, PAGE_SIZE
> > +> > 	> > b> > 	> > .Lnext
> > +
> > +.Ltest_indirection:
> > +> > 	> > tbz> > 	> > x18, IND_INDIRECTION_BIT, .Ltest_destination
> > +
> > +> > 	> > /* ptr = addr */
> > +> > 	> > mov> > 	> > x14, x12
> > +> > 	> > b> > 	> > .Lnext
> > +
> > +.Ltest_destination:
> > +> > 	> > tbz> > 	> > x18, IND_DESTINATION_BIT, .Lnext
> > +
> > +> > 	> > mov> > 	> > x15, x12
> > +
> > +> > 	> > /* dest = addr */
> > +> > 	> > mov> > 	> > x13, x12
> > +
> > +.Lnext:
> > +> > 	> > /* entry = *ptr++ */
> > +> > 	> > ldr> > 	> > x18, [x14], #8
> > +
> > +> > 	> > /* while (!(entry & DONE)) */
> > +> > 	> > tbz> > 	> > x18, IND_DONE_BIT, .Lloop
> > +
> > +.Ldone:
> > +> > 	> > dsb> > 	> > sy
> > +> > 	> > ic> > 	> > ialluis
> 
> I don't think this needs to be inner-shareable, and these dsbs can probably
> be non-shareable too.

OK.

> > +> > 	> > dsb> > 	> > sy
> > +> > 	> > isb
> > +
> > +> > 	> > /* Start new image. */
> > +> > 	> > mov> > 	> > x0, xzr
> > +> > 	> > mov> > 	> > x1, xzr
> > +> > 	> > mov> > 	> > x2, xzr
> > +> > 	> > mov> > 	> > x3, xzr
> > +> > 	> > br> > 	> > x17
> > +
> > +.align 3> > 	> > /* To keep the 64-bit values below naturally aligned. */
> > +
> > +.Lcopy_end:
> > +.org> > 	> > KEXEC_CONTROL_PAGE_SIZE
> > +
> > +/*
> > + * arm64_relocate_new_kernel_size - Number of bytes to copy to the
> > + * control_code_page.
> > + */
> > +.globl arm64_relocate_new_kernel_size
> > +arm64_relocate_new_kernel_size:
> > +> > 	> > .quad> > 	> > .Lcopy_end - arm64_relocate_new_kernel
> > diff --git a/include/uapi/linux/kexec.h b/include/uapi/linux/kexec.h
> > index 99048e5..ccec467 100644
> > --- a/include/uapi/linux/kexec.h
> > +++ b/include/uapi/linux/kexec.h
> > @@ -39,6 +39,7 @@
> >  #define KEXEC_ARCH_SH      (42 << 16)
> >  #define KEXEC_ARCH_MIPS_LE (10 << 16)
> >  #define KEXEC_ARCH_MIPS    ( 8 << 16)
> > +#define KEXEC_ARCH_ARM64   (183 << 16)
> 
> This should probably be called KEXEC_ARCH_AARCH64 for consistency with
> the ELF machine name.

OK.

-Geoff

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12.4] arm64/kexec: Add core kexec support
  2015-11-24 22:25   ` Geoff Levand
@ 2015-12-16  0:14     ` Geoff Levand
  -1 siblings, 0 replies; 178+ messages in thread
From: Geoff Levand @ 2015-12-16  0:14 UTC (permalink / raw)
  To: linux-arm-kernel

Add three new files, kexec.h, machine_kexec.c and relocate_kernel.S to the
arm64 architecture that add support for the kexec re-boot mechanism
(CONFIG_KEXEC) on arm64 platforms.

Signed-off-by: Geoff Levand <geoff@infradead.org>
---
I stuck the copy_page macro in here for review.

-Geoff

 arch/arm64/Kconfig                  |  10 +++
 arch/arm64/include/asm/kexec.h      |  48 +++++++++++
 arch/arm64/kernel/Makefile          |   2 +
 arch/arm64/kernel/machine_kexec.c   | 153 ++++++++++++++++++++++++++++++++++++
 arch/arm64/kernel/relocate_kernel.S | 131 ++++++++++++++++++++++++++++++
 include/uapi/linux/kexec.h          |   1 +
 6 files changed, 345 insertions(+)
 create mode 100644 arch/arm64/include/asm/kexec.h
 create mode 100644 arch/arm64/kernel/machine_kexec.c
 create mode 100644 arch/arm64/kernel/relocate_kernel.S

diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index 21979a4..c47a623 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -24,6 +24,7 @@
 #define __ASM_ASSEMBLER_H
 
 #include <asm/asm-offsets.h>
+#include <asm/page.h>
 #include <asm/pgtable-hwdef.h>
 #include <asm/ptrace.h>
 #include <asm/thread_info.h>
@@ -240,6 +241,24 @@ lr	.req	x30		// link register
 	.endm
 
 /*
+ * copy_page - copy src to dest using temp registers t1-t8
+ */
+	.macro copy_page dest:req src:req t1:req t2:req t3:req t4:req t5:req t6:req t7:req t8:req
+1:	ldp	\t1, \t2, [\src]
+	ldp	\t3, \t4, [\src, #16]
+	ldp	\t5, \t6, [\src, #32]
+	ldp	\t7, \t8, [\src, #48]
+	add	\src, \src, #64
+	stnp	\t1, \t2, [\dest]
+	stnp	\t3, \t4, [\dest, #16]
+	stnp	\t5, \t6, [\dest, #32]
+	stnp	\t7, \t8, [\dest, #48]
+	add	\dest, \dest, #64
+	tst	\src, #(PAGE_SIZE - 1)
+	b.ne	1b
+	.endm
+
+/*
  * Annotate a function as position independent, i.e., safe to be called before
  * the kernel virtual mapping is activated.
  */
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 871f217..b7d0078 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -556,6 +556,16 @@ config SECCOMP
 	  and the task is only allowed to execute a few safe syscalls
 	  defined by each seccomp mode.
 
+config KEXEC
+	depends on PM_SLEEP_SMP
+	select KEXEC_CORE
+	bool "kexec system call"
+	---help---
+	  kexec is a system call that implements the ability to shutdown your
+	  current kernel, and to start another kernel.  It is like a reboot
+	  but it is independent of the system firmware.   And like a reboot
+	  you can start any kernel with it, not just Linux.
+
 config XEN_DOM0
 	def_bool y
 	depends on XEN
diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
new file mode 100644
index 0000000..04744dc
--- /dev/null
+++ b/arch/arm64/include/asm/kexec.h
@@ -0,0 +1,48 @@
+/*
+ * kexec for arm64
+ *
+ * Copyright (C) Linaro.
+ * Copyright (C) Huawei Futurewei Technologies.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#ifndef _ARM64_KEXEC_H
+#define _ARM64_KEXEC_H
+
+/* Maximum physical address we can use pages from */
+
+#define KEXEC_SOURCE_MEMORY_LIMIT (-1UL)
+
+/* Maximum address we can reach in physical address mode */
+
+#define KEXEC_DESTINATION_MEMORY_LIMIT (-1UL)
+
+/* Maximum address we can use for the control code buffer */
+
+#define KEXEC_CONTROL_MEMORY_LIMIT (-1UL)
+
+#define KEXEC_CONTROL_PAGE_SIZE 4096
+
+#define KEXEC_ARCH KEXEC_ARCH_AARCH64
+
+#ifndef __ASSEMBLY__
+
+/**
+ * crash_setup_regs() - save registers for the panic kernel
+ *
+ * @newregs: registers are saved here
+ * @oldregs: registers to be saved (may be %NULL)
+ */
+
+static inline void crash_setup_regs(struct pt_regs *newregs,
+				    struct pt_regs *oldregs)
+{
+	/* Empty routine needed to avoid build errors. */
+}
+
+#endif /* __ASSEMBLY__ */
+
+#endif
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 474691f..f68420d 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -41,6 +41,8 @@ arm64-obj-$(CONFIG_EFI)			+= efi.o efi-entry.stub.o
 arm64-obj-$(CONFIG_PCI)			+= pci.o
 arm64-obj-$(CONFIG_ARMV8_DEPRECATED)	+= armv8_deprecated.o
 arm64-obj-$(CONFIG_ACPI)		+= acpi.o
+arm64-obj-$(CONFIG_KEXEC)		+= machine_kexec.o relocate_kernel.o	\
+					   cpu-reset.o
 
 obj-y					+= $(arm64-obj-y) vdso/
 obj-m					+= $(arm64-obj-m)
diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
new file mode 100644
index 0000000..fd05308
--- /dev/null
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -0,0 +1,153 @@
+/*
+ * kexec for arm64
+ *
+ * Copyright (C) Linaro.
+ * Copyright (C) Huawei Futurewei Technologies.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/kexec.h>
+#include <linux/of_fdt.h>
+#include <linux/slab.h>
+#include <linux/uaccess.h>
+
+#include <asm/cacheflush.h>
+#include <asm/system_misc.h>
+
+#include "cpu-reset.h"
+
+/* Global variables for the arm64_relocate_new_kernel routine. */
+extern const unsigned char arm64_relocate_new_kernel[];
+extern const unsigned long arm64_relocate_new_kernel_size;
+
+static unsigned long kimage_start;
+
+void machine_kexec_cleanup(struct kimage *kimage)
+{
+	/* Empty routine needed to avoid build errors. */
+}
+
+/**
+ * machine_kexec_prepare - Prepare for a kexec reboot.
+ *
+ * Called from the core kexec code when a kernel image is loaded.
+ */
+int machine_kexec_prepare(struct kimage *kimage)
+{
+	kimage_start = kimage->start;
+	return 0;
+}
+
+/**
+ * kexec_list_flush - Helper to flush the kimage list to PoC.
+ */
+static void kexec_list_flush(unsigned long kimage_head)
+{
+	unsigned long *entry;
+
+	for (entry = &kimage_head; ; entry++) {
+		unsigned int flag = *entry & IND_FLAGS;
+		void *addr = phys_to_virt(*entry & PAGE_MASK);
+
+		switch (flag) {
+		case IND_INDIRECTION:
+			entry = (unsigned long *)addr - 1;
+			__flush_dcache_area(addr, PAGE_SIZE);
+			break;
+		case IND_DESTINATION:
+			break;
+		case IND_SOURCE:
+			__flush_dcache_area(addr, PAGE_SIZE);
+			break;
+		case IND_DONE:
+			return;
+		default:
+			BUG();
+		}
+	}
+}
+
+/**
+ * kexec_segment_flush - Helper to flush the kimage segments to PoC.
+ */
+static void kexec_segment_flush(const struct kimage *kimage)
+{
+	unsigned long i;
+
+	pr_devel("%s:\n", __func__);
+
+	for (i = 0; i < kimage->nr_segments; i++) {
+		pr_devel("  segment[%lu]: %016lx - %016lx, %lx bytes, %lu pages\n",
+			i,
+			kimage->segment[i].mem,
+			kimage->segment[i].mem + kimage->segment[i].memsz,
+			kimage->segment[i].memsz,
+			kimage->segment[i].memsz /  PAGE_SIZE);
+
+		__flush_dcache_area(phys_to_virt(kimage->segment[i].mem),
+			kimage->segment[i].memsz);
+	}
+}
+
+/**
+ * machine_kexec - Do the kexec reboot.
+ *
+ * Called from the core kexec code for a sys_reboot with LINUX_REBOOT_CMD_KEXEC.
+ */
+void machine_kexec(struct kimage *kimage)
+{
+	phys_addr_t reboot_code_buffer_phys;
+	void *reboot_code_buffer;
+
+	BUG_ON(num_online_cpus() > 1);
+
+	reboot_code_buffer_phys = page_to_phys(kimage->control_code_page);
+	reboot_code_buffer = phys_to_virt(reboot_code_buffer_phys);
+
+	/*
+	 * Copy arm64_relocate_new_kernel to the reboot_code_buffer for use
+	 * after the kernel is shut down.
+	 */
+	memcpy(reboot_code_buffer, arm64_relocate_new_kernel,
+		arm64_relocate_new_kernel_size);
+
+	/* Flush the reboot_code_buffer in preparation for its execution. */
+	__flush_dcache_area(reboot_code_buffer, arm64_relocate_new_kernel_size);
+	flush_icache_range((uintptr_t)reboot_code_buffer, arm64_relocate_new_kernel_size);
+
+	/* Flush the kimage list. */
+	kexec_list_flush(kimage->head);
+
+	/* Flush the new image if already in place. */
+	if (kimage->head & IND_DONE)
+		kexec_segment_flush(kimage);
+
+	pr_info("Bye!\n");
+
+	/* Disable all DAIF exceptions. */
+	asm volatile ("msr daifset, #0xf" : : : "memory");
+
+	setup_mm_for_reboot();
+
+	/*
+	 * cpu_soft_restart will shutdown the MMU, disable data caches, then
+	 * transfer control to the reboot_code_buffer which contains a copy of
+	 * the arm64_relocate_new_kernel routine.  arm64_relocate_new_kernel
+	 * uses physical addressing to relocate the new image to its final
+	 * position and transfers control to the image entry point when the
+	 * relocation is complete.
+	 */
+
+	cpu_soft_restart(is_hyp_mode_available(), reboot_code_buffer_phys,
+		kimage->head, kimage_start);
+
+	BUG(); /* Should never get here. */
+}
+
+void machine_crash_shutdown(struct pt_regs *regs)
+{
+	/* Empty routine needed to avoid build errors. */
+}
diff --git a/arch/arm64/kernel/relocate_kernel.S b/arch/arm64/kernel/relocate_kernel.S
new file mode 100644
index 0000000..1adbb13
--- /dev/null
+++ b/arch/arm64/kernel/relocate_kernel.S
@@ -0,0 +1,131 @@
+/*
+ * kexec for arm64
+ *
+ * Copyright (C) Linaro.
+ * Copyright (C) Huawei Futurewei Technologies.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/kexec.h>
+
+#include <asm/assembler.h>
+#include <asm/kexec.h>
+#include <asm/page.h>
+#include <asm/sysreg.h>
+
+/*
+ * arm64_relocate_new_kernel - Put a 2nd stage image in place and boot it.
+ *
+ * The memory that the old kernel occupies may be overwritten when coping the
+ * new image to its final location.  To assure that the
+ * arm64_relocate_new_kernel routine which does that copy is not overwritten,
+ * all code and data needed by arm64_relocate_new_kernel must be between the
+ * symbols arm64_relocate_new_kernel and arm64_relocate_new_kernel_end.  The
+ * machine_kexec() routine will copy arm64_relocate_new_kernel to the kexec
+ * control_code_page, a special page which has been set up to be preserved
+ * during the copy operation.
+ */
+.globl arm64_relocate_new_kernel
+arm64_relocate_new_kernel:
+
+	/* Setup the list loop variables. */
+	mov	x18, x0				/* x18 = kimage_head */
+	mov	x17, x1				/* x17 = kimage_start */
+	dcache_line_size x16, x0		/* x16 = dcache line size */
+	mov	x15, xzr			/* x15 = segment start */
+	mov	x14, xzr			/* x14 = entry ptr */
+	mov	x13, xzr			/* x13 = copy dest */
+
+	/* Clear the sctlr_el2 flags. */
+	mrs	x0, CurrentEL
+	cmp	x0, #CurrentEL_EL2
+	b.ne	1f
+	mrs	x0, sctlr_el2
+	ldr	x1, =SCTLR_ELx_FLAGS
+	bic	x0, x0, x1
+	msr	sctlr_el2, x0
+	isb
+1:
+
+	/* Check if the new image needs relocation. */
+	cbz	x18, .Ldone
+	tbnz	x18, IND_DONE_BIT, .Ldone
+
+.Lloop:
+	and	x12, x18, PAGE_MASK		/* x12 = addr */
+
+	/* Test the entry flags. */
+.Ltest_source:
+	tbz	x18, IND_SOURCE_BIT, .Ltest_indirection
+
+	/* Invalidate dest page to PoC. */
+	mov     x0, x13
+	add     x19, x0, #PAGE_SIZE
+	sub     x1, x16, #1
+	bic     x0, x0, x1
+2:	dc      ivac, x0
+	add     x0, x0, x16
+	cmp     x0, x19
+	b.lo    2b
+	dsb     sy
+
+	mov x19, x13
+	mov x20, x12
+	copy_page x19, x20, x0, x1, x2, x3, x4, x5, x6, x7
+
+	/* dest += PAGE_SIZE */
+	add	x13, x13, PAGE_SIZE
+	b	.Lnext
+
+.Ltest_indirection:
+	tbz	x18, IND_INDIRECTION_BIT, .Ltest_destination
+
+	/* ptr = addr */
+	mov	x14, x12
+	b	.Lnext
+
+.Ltest_destination:
+	tbz	x18, IND_DESTINATION_BIT, .Lnext
+
+	mov	x15, x12
+
+	/* dest = addr */
+	mov	x13, x12
+
+.Lnext:
+	/* entry = *ptr++ */
+	ldr	x18, [x14], #8
+
+	/* while (!(entry & DONE)) */
+	tbz	x18, IND_DONE_BIT, .Lloop
+
+.Ldone:
+	dsb	nsh
+	ic	iallu
+	dsb	nsh
+	isb
+
+	/* Start new image. */
+	mov	x0, xzr
+	mov	x1, xzr
+	mov	x2, xzr
+	mov	x3, xzr
+	br	x17
+
+.ltorg
+
+.align 3	/* To keep the 64-bit values below naturally aligned. */
+
+.Lcopy_end:
+.org	KEXEC_CONTROL_PAGE_SIZE
+
+/*
+ * arm64_relocate_new_kernel_size - Number of bytes to copy to the
+ * control_code_page.
+ */
+.globl arm64_relocate_new_kernel_size
+arm64_relocate_new_kernel_size:
+	.quad	.Lcopy_end - arm64_relocate_new_kernel
diff --git a/include/uapi/linux/kexec.h b/include/uapi/linux/kexec.h
index 99048e5..aae5ebf 100644
--- a/include/uapi/linux/kexec.h
+++ b/include/uapi/linux/kexec.h
@@ -39,6 +39,7 @@
 #define KEXEC_ARCH_SH      (42 << 16)
 #define KEXEC_ARCH_MIPS_LE (10 << 16)
 #define KEXEC_ARCH_MIPS    ( 8 << 16)
+#define KEXEC_ARCH_AARCH64 (183 << 16)
 
 /* The artificial cap on the number of segments passed to kexec_load. */
 #define KEXEC_SEGMENT_MAX 16
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 178+ messages in thread

* [PATCH v12.4] arm64/kexec: Add core kexec support
@ 2015-12-16  0:14     ` Geoff Levand
  0 siblings, 0 replies; 178+ messages in thread
From: Geoff Levand @ 2015-12-16  0:14 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon
  Cc: Mark Rutland, marc.zyngier, kexec, AKASHI Takahiro,
	christoffer.dall, linux-arm-kernel

Add three new files, kexec.h, machine_kexec.c and relocate_kernel.S to the
arm64 architecture that add support for the kexec re-boot mechanism
(CONFIG_KEXEC) on arm64 platforms.

Signed-off-by: Geoff Levand <geoff@infradead.org>
---
I stuck the copy_page macro in here for review.

-Geoff

 arch/arm64/Kconfig                  |  10 +++
 arch/arm64/include/asm/kexec.h      |  48 +++++++++++
 arch/arm64/kernel/Makefile          |   2 +
 arch/arm64/kernel/machine_kexec.c   | 153 ++++++++++++++++++++++++++++++++++++
 arch/arm64/kernel/relocate_kernel.S | 131 ++++++++++++++++++++++++++++++
 include/uapi/linux/kexec.h          |   1 +
 6 files changed, 345 insertions(+)
 create mode 100644 arch/arm64/include/asm/kexec.h
 create mode 100644 arch/arm64/kernel/machine_kexec.c
 create mode 100644 arch/arm64/kernel/relocate_kernel.S

diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index 21979a4..c47a623 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -24,6 +24,7 @@
 #define __ASM_ASSEMBLER_H
 
 #include <asm/asm-offsets.h>
+#include <asm/page.h>
 #include <asm/pgtable-hwdef.h>
 #include <asm/ptrace.h>
 #include <asm/thread_info.h>
@@ -240,6 +241,24 @@ lr	.req	x30		// link register
 	.endm
 
 /*
+ * copy_page - copy src to dest using temp registers t1-t8
+ */
+	.macro copy_page dest:req src:req t1:req t2:req t3:req t4:req t5:req t6:req t7:req t8:req
+1:	ldp	\t1, \t2, [\src]
+	ldp	\t3, \t4, [\src, #16]
+	ldp	\t5, \t6, [\src, #32]
+	ldp	\t7, \t8, [\src, #48]
+	add	\src, \src, #64
+	stnp	\t1, \t2, [\dest]
+	stnp	\t3, \t4, [\dest, #16]
+	stnp	\t5, \t6, [\dest, #32]
+	stnp	\t7, \t8, [\dest, #48]
+	add	\dest, \dest, #64
+	tst	\src, #(PAGE_SIZE - 1)
+	b.ne	1b
+	.endm
+
+/*
  * Annotate a function as position independent, i.e., safe to be called before
  * the kernel virtual mapping is activated.
  */
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 871f217..b7d0078 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -556,6 +556,16 @@ config SECCOMP
 	  and the task is only allowed to execute a few safe syscalls
 	  defined by each seccomp mode.
 
+config KEXEC
+	depends on PM_SLEEP_SMP
+	select KEXEC_CORE
+	bool "kexec system call"
+	---help---
+	  kexec is a system call that implements the ability to shutdown your
+	  current kernel, and to start another kernel.  It is like a reboot
+	  but it is independent of the system firmware.   And like a reboot
+	  you can start any kernel with it, not just Linux.
+
 config XEN_DOM0
 	def_bool y
 	depends on XEN
diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
new file mode 100644
index 0000000..04744dc
--- /dev/null
+++ b/arch/arm64/include/asm/kexec.h
@@ -0,0 +1,48 @@
+/*
+ * kexec for arm64
+ *
+ * Copyright (C) Linaro.
+ * Copyright (C) Huawei Futurewei Technologies.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#ifndef _ARM64_KEXEC_H
+#define _ARM64_KEXEC_H
+
+/* Maximum physical address we can use pages from */
+
+#define KEXEC_SOURCE_MEMORY_LIMIT (-1UL)
+
+/* Maximum address we can reach in physical address mode */
+
+#define KEXEC_DESTINATION_MEMORY_LIMIT (-1UL)
+
+/* Maximum address we can use for the control code buffer */
+
+#define KEXEC_CONTROL_MEMORY_LIMIT (-1UL)
+
+#define KEXEC_CONTROL_PAGE_SIZE 4096
+
+#define KEXEC_ARCH KEXEC_ARCH_AARCH64
+
+#ifndef __ASSEMBLY__
+
+/**
+ * crash_setup_regs() - save registers for the panic kernel
+ *
+ * @newregs: registers are saved here
+ * @oldregs: registers to be saved (may be %NULL)
+ */
+
+static inline void crash_setup_regs(struct pt_regs *newregs,
+				    struct pt_regs *oldregs)
+{
+	/* Empty routine needed to avoid build errors. */
+}
+
+#endif /* __ASSEMBLY__ */
+
+#endif
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 474691f..f68420d 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -41,6 +41,8 @@ arm64-obj-$(CONFIG_EFI)			+= efi.o efi-entry.stub.o
 arm64-obj-$(CONFIG_PCI)			+= pci.o
 arm64-obj-$(CONFIG_ARMV8_DEPRECATED)	+= armv8_deprecated.o
 arm64-obj-$(CONFIG_ACPI)		+= acpi.o
+arm64-obj-$(CONFIG_KEXEC)		+= machine_kexec.o relocate_kernel.o	\
+					   cpu-reset.o
 
 obj-y					+= $(arm64-obj-y) vdso/
 obj-m					+= $(arm64-obj-m)
diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
new file mode 100644
index 0000000..fd05308
--- /dev/null
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -0,0 +1,153 @@
+/*
+ * kexec for arm64
+ *
+ * Copyright (C) Linaro.
+ * Copyright (C) Huawei Futurewei Technologies.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/kexec.h>
+#include <linux/of_fdt.h>
+#include <linux/slab.h>
+#include <linux/uaccess.h>
+
+#include <asm/cacheflush.h>
+#include <asm/system_misc.h>
+
+#include "cpu-reset.h"
+
+/* Global variables for the arm64_relocate_new_kernel routine. */
+extern const unsigned char arm64_relocate_new_kernel[];
+extern const unsigned long arm64_relocate_new_kernel_size;
+
+static unsigned long kimage_start;
+
+void machine_kexec_cleanup(struct kimage *kimage)
+{
+	/* Empty routine needed to avoid build errors. */
+}
+
+/**
+ * machine_kexec_prepare - Prepare for a kexec reboot.
+ *
+ * Called from the core kexec code when a kernel image is loaded.
+ */
+int machine_kexec_prepare(struct kimage *kimage)
+{
+	kimage_start = kimage->start;
+	return 0;
+}
+
+/**
+ * kexec_list_flush - Helper to flush the kimage list to PoC.
+ */
+static void kexec_list_flush(unsigned long kimage_head)
+{
+	unsigned long *entry;
+
+	for (entry = &kimage_head; ; entry++) {
+		unsigned int flag = *entry & IND_FLAGS;
+		void *addr = phys_to_virt(*entry & PAGE_MASK);
+
+		switch (flag) {
+		case IND_INDIRECTION:
+			entry = (unsigned long *)addr - 1;
+			__flush_dcache_area(addr, PAGE_SIZE);
+			break;
+		case IND_DESTINATION:
+			break;
+		case IND_SOURCE:
+			__flush_dcache_area(addr, PAGE_SIZE);
+			break;
+		case IND_DONE:
+			return;
+		default:
+			BUG();
+		}
+	}
+}
+
+/**
+ * kexec_segment_flush - Helper to flush the kimage segments to PoC.
+ */
+static void kexec_segment_flush(const struct kimage *kimage)
+{
+	unsigned long i;
+
+	pr_devel("%s:\n", __func__);
+
+	for (i = 0; i < kimage->nr_segments; i++) {
+		pr_devel("  segment[%lu]: %016lx - %016lx, %lx bytes, %lu pages\n",
+			i,
+			kimage->segment[i].mem,
+			kimage->segment[i].mem + kimage->segment[i].memsz,
+			kimage->segment[i].memsz,
+			kimage->segment[i].memsz /  PAGE_SIZE);
+
+		__flush_dcache_area(phys_to_virt(kimage->segment[i].mem),
+			kimage->segment[i].memsz);
+	}
+}
+
+/**
+ * machine_kexec - Do the kexec reboot.
+ *
+ * Called from the core kexec code for a sys_reboot with LINUX_REBOOT_CMD_KEXEC.
+ */
+void machine_kexec(struct kimage *kimage)
+{
+	phys_addr_t reboot_code_buffer_phys;
+	void *reboot_code_buffer;
+
+	BUG_ON(num_online_cpus() > 1);
+
+	reboot_code_buffer_phys = page_to_phys(kimage->control_code_page);
+	reboot_code_buffer = phys_to_virt(reboot_code_buffer_phys);
+
+	/*
+	 * Copy arm64_relocate_new_kernel to the reboot_code_buffer for use
+	 * after the kernel is shut down.
+	 */
+	memcpy(reboot_code_buffer, arm64_relocate_new_kernel,
+		arm64_relocate_new_kernel_size);
+
+	/* Flush the reboot_code_buffer in preparation for its execution. */
+	__flush_dcache_area(reboot_code_buffer, arm64_relocate_new_kernel_size);
+	flush_icache_range((uintptr_t)reboot_code_buffer, arm64_relocate_new_kernel_size);
+
+	/* Flush the kimage list. */
+	kexec_list_flush(kimage->head);
+
+	/* Flush the new image if already in place. */
+	if (kimage->head & IND_DONE)
+		kexec_segment_flush(kimage);
+
+	pr_info("Bye!\n");
+
+	/* Disable all DAIF exceptions. */
+	asm volatile ("msr daifset, #0xf" : : : "memory");
+
+	setup_mm_for_reboot();
+
+	/*
+	 * cpu_soft_restart will shutdown the MMU, disable data caches, then
+	 * transfer control to the reboot_code_buffer which contains a copy of
+	 * the arm64_relocate_new_kernel routine.  arm64_relocate_new_kernel
+	 * uses physical addressing to relocate the new image to its final
+	 * position and transfers control to the image entry point when the
+	 * relocation is complete.
+	 */
+
+	cpu_soft_restart(is_hyp_mode_available(), reboot_code_buffer_phys,
+		kimage->head, kimage_start);
+
+	BUG(); /* Should never get here. */
+}
+
+void machine_crash_shutdown(struct pt_regs *regs)
+{
+	/* Empty routine needed to avoid build errors. */
+}
diff --git a/arch/arm64/kernel/relocate_kernel.S b/arch/arm64/kernel/relocate_kernel.S
new file mode 100644
index 0000000..1adbb13
--- /dev/null
+++ b/arch/arm64/kernel/relocate_kernel.S
@@ -0,0 +1,131 @@
+/*
+ * kexec for arm64
+ *
+ * Copyright (C) Linaro.
+ * Copyright (C) Huawei Futurewei Technologies.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/kexec.h>
+
+#include <asm/assembler.h>
+#include <asm/kexec.h>
+#include <asm/page.h>
+#include <asm/sysreg.h>
+
+/*
+ * arm64_relocate_new_kernel - Put a 2nd stage image in place and boot it.
+ *
+ * The memory that the old kernel occupies may be overwritten when coping the
+ * new image to its final location.  To assure that the
+ * arm64_relocate_new_kernel routine which does that copy is not overwritten,
+ * all code and data needed by arm64_relocate_new_kernel must be between the
+ * symbols arm64_relocate_new_kernel and arm64_relocate_new_kernel_end.  The
+ * machine_kexec() routine will copy arm64_relocate_new_kernel to the kexec
+ * control_code_page, a special page which has been set up to be preserved
+ * during the copy operation.
+ */
+.globl arm64_relocate_new_kernel
+arm64_relocate_new_kernel:
+
+	/* Setup the list loop variables. */
+	mov	x18, x0				/* x18 = kimage_head */
+	mov	x17, x1				/* x17 = kimage_start */
+	dcache_line_size x16, x0		/* x16 = dcache line size */
+	mov	x15, xzr			/* x15 = segment start */
+	mov	x14, xzr			/* x14 = entry ptr */
+	mov	x13, xzr			/* x13 = copy dest */
+
+	/* Clear the sctlr_el2 flags. */
+	mrs	x0, CurrentEL
+	cmp	x0, #CurrentEL_EL2
+	b.ne	1f
+	mrs	x0, sctlr_el2
+	ldr	x1, =SCTLR_ELx_FLAGS
+	bic	x0, x0, x1
+	msr	sctlr_el2, x0
+	isb
+1:
+
+	/* Check if the new image needs relocation. */
+	cbz	x18, .Ldone
+	tbnz	x18, IND_DONE_BIT, .Ldone
+
+.Lloop:
+	and	x12, x18, PAGE_MASK		/* x12 = addr */
+
+	/* Test the entry flags. */
+.Ltest_source:
+	tbz	x18, IND_SOURCE_BIT, .Ltest_indirection
+
+	/* Invalidate dest page to PoC. */
+	mov     x0, x13
+	add     x19, x0, #PAGE_SIZE
+	sub     x1, x16, #1
+	bic     x0, x0, x1
+2:	dc      ivac, x0
+	add     x0, x0, x16
+	cmp     x0, x19
+	b.lo    2b
+	dsb     sy
+
+	mov x19, x13
+	mov x20, x12
+	copy_page x19, x20, x0, x1, x2, x3, x4, x5, x6, x7
+
+	/* dest += PAGE_SIZE */
+	add	x13, x13, PAGE_SIZE
+	b	.Lnext
+
+.Ltest_indirection:
+	tbz	x18, IND_INDIRECTION_BIT, .Ltest_destination
+
+	/* ptr = addr */
+	mov	x14, x12
+	b	.Lnext
+
+.Ltest_destination:
+	tbz	x18, IND_DESTINATION_BIT, .Lnext
+
+	mov	x15, x12
+
+	/* dest = addr */
+	mov	x13, x12
+
+.Lnext:
+	/* entry = *ptr++ */
+	ldr	x18, [x14], #8
+
+	/* while (!(entry & DONE)) */
+	tbz	x18, IND_DONE_BIT, .Lloop
+
+.Ldone:
+	dsb	nsh
+	ic	iallu
+	dsb	nsh
+	isb
+
+	/* Start new image. */
+	mov	x0, xzr
+	mov	x1, xzr
+	mov	x2, xzr
+	mov	x3, xzr
+	br	x17
+
+.ltorg
+
+.align 3	/* To keep the 64-bit values below naturally aligned. */
+
+.Lcopy_end:
+.org	KEXEC_CONTROL_PAGE_SIZE
+
+/*
+ * arm64_relocate_new_kernel_size - Number of bytes to copy to the
+ * control_code_page.
+ */
+.globl arm64_relocate_new_kernel_size
+arm64_relocate_new_kernel_size:
+	.quad	.Lcopy_end - arm64_relocate_new_kernel
diff --git a/include/uapi/linux/kexec.h b/include/uapi/linux/kexec.h
index 99048e5..aae5ebf 100644
--- a/include/uapi/linux/kexec.h
+++ b/include/uapi/linux/kexec.h
@@ -39,6 +39,7 @@
 #define KEXEC_ARCH_SH      (42 << 16)
 #define KEXEC_ARCH_MIPS_LE (10 << 16)
 #define KEXEC_ARCH_MIPS    ( 8 << 16)
+#define KEXEC_ARCH_AARCH64 (183 << 16)
 
 /* The artificial cap on the number of segments passed to kexec_load. */
 #define KEXEC_SEGMENT_MAX 16
-- 
2.5.0


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply related	[flat|nested] 178+ messages in thread

* [PATCH v12 09/16] arm64/kexec: Add pr_devel output
  2015-12-15 17:15     ` Will Deacon
@ 2015-12-16  0:45       ` Geoff Levand
  -1 siblings, 0 replies; 178+ messages in thread
From: Geoff Levand @ 2015-12-16  0:45 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 2015-12-15 at 17:15 +0000, Will Deacon wrote:
> On Tue, Nov 24, 2015 at 10:25:34PM +0000, Geoff Levand wrote:
> > To aid in debugging kexec problems or when adding new functionality to kexec add
> > a new routine kexec_image_info() and several inline pr_devel statements.
> 
> We don't currently have any pr_devel invocations under arm64. Can you
> either remove these or change them to pr_debug?

OK.

> > Signed-off-by: Geoff Levand <geoff@infradead.org>
> > ---
> >  arch/arm64/kernel/machine_kexec.c | 63 +++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 63 insertions(+)
> > 
> > diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
> > index 8b990b8..da28a26 100644
> > --- a/arch/arm64/kernel/machine_kexec.c
> > +++ b/arch/arm64/kernel/machine_kexec.c
> > @@ -25,6 +25,48 @@ extern const unsigned long arm64_relocate_new_kernel_size;
> >  
> >  static unsigned long kimage_start;
> >  
> > +/**
> > + * kexec_is_dtb - Helper routine to check the device tree header signature.
> > + */
> > +static bool kexec_is_dtb(const void *dtb)
> > +{
> > +> > 	> > __be32 magic;
> > +
> > +> > 	> > return get_user(magic, (__be32 *)dtb) ? false :
> > +> > 	> > 	> > (be32_to_cpu(magic) == OF_DT_HEADER);
> 
> Isn't there an fdt32_to_cpu helper for this already? I also can't help
> but thing this would be more readable if you didn't bother with the
> ternary if form.

Sure, I'll clean it up.

-Geoff

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 09/16] arm64/kexec: Add pr_devel output
@ 2015-12-16  0:45       ` Geoff Levand
  0 siblings, 0 replies; 178+ messages in thread
From: Geoff Levand @ 2015-12-16  0:45 UTC (permalink / raw)
  To: Will Deacon
  Cc: Mark Rutland, marc.zyngier, Catalin Marinas, kexec,
	AKASHI Takahiro, linux-arm-kernel, christoffer.dall

On Tue, 2015-12-15 at 17:15 +0000, Will Deacon wrote:
> On Tue, Nov 24, 2015 at 10:25:34PM +0000, Geoff Levand wrote:
> > To aid in debugging kexec problems or when adding new functionality to kexec add
> > a new routine kexec_image_info() and several inline pr_devel statements.
> 
> We don't currently have any pr_devel invocations under arm64. Can you
> either remove these or change them to pr_debug?

OK.

> > Signed-off-by: Geoff Levand <geoff@infradead.org>
> > ---
> >  arch/arm64/kernel/machine_kexec.c | 63 +++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 63 insertions(+)
> > 
> > diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
> > index 8b990b8..da28a26 100644
> > --- a/arch/arm64/kernel/machine_kexec.c
> > +++ b/arch/arm64/kernel/machine_kexec.c
> > @@ -25,6 +25,48 @@ extern const unsigned long arm64_relocate_new_kernel_size;
> >  
> >  static unsigned long kimage_start;
> >  
> > +/**
> > + * kexec_is_dtb - Helper routine to check the device tree header signature.
> > + */
> > +static bool kexec_is_dtb(const void *dtb)
> > +{
> > +> > 	> > __be32 magic;
> > +
> > +> > 	> > return get_user(magic, (__be32 *)dtb) ? false :
> > +> > 	> > 	> > (be32_to_cpu(magic) == OF_DT_HEADER);
> 
> Isn't there an fdt32_to_cpu helper for this already? I also can't help
> but thing this would be more readable if you didn't bother with the
> ternary if form.

Sure, I'll clean it up.

-Geoff


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12.4] arm64/kexec: Add pr_debug output
  2015-11-24 22:25   ` Geoff Levand
@ 2015-12-16  0:46     ` Geoff Levand
  -1 siblings, 0 replies; 178+ messages in thread
From: Geoff Levand @ 2015-12-16  0:46 UTC (permalink / raw)
  To: linux-arm-kernel

To aid in debugging kexec problems or when adding new functionality to kexec add
a new routine kexec_image_info() and several inline pr_debug statements.

Signed-off-by: Geoff Levand <geoff@infradead.org>
---
 arch/arm64/kernel/machine_kexec.c | 70 +++++++++++++++++++++++++++++++++++++--
 1 file changed, 68 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
index fd05308..02c6eac 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -10,6 +10,7 @@
  */
 
 #include <linux/kexec.h>
+#include <linux/libfdt_env.h>
 #include <linux/of_fdt.h>
 #include <linux/slab.h>
 #include <linux/uaccess.h>
@@ -25,6 +26,50 @@ extern const unsigned long arm64_relocate_new_kernel_size;
 
 static unsigned long kimage_start;
 
+/**
+ * kexec_is_dtb - Helper routine to check the device tree header signature.
+ */
+static bool kexec_is_dtb(const void *dtb)
+{
+	__be32 magic;
+
+	if (!get_user(magic, (__be32 *)dtb))
+		return false;
+
+	return fdt32_to_cpu(magic) == OF_DT_HEADER;
+}
+
+/**
+ * kexec_image_info - For debugging output.
+ */
+#define kexec_image_info(_i) _kexec_image_info(__func__, __LINE__, _i)
+static void _kexec_image_info(const char *func, int line,
+	const struct kimage *kimage)
+{
+	unsigned long i;
+
+#if !defined(DEBUG)
+	return;
+#endif
+	pr_debug("%s:%d:\n", func, line);
+	pr_debug("  kexec kimage info:\n");
+	pr_debug("    type:        %d\n", kimage->type);
+	pr_debug("    start:       %lx\n", kimage->start);
+	pr_debug("    head:        %lx\n", kimage->head);
+	pr_debug("    nr_segments: %lu\n", kimage->nr_segments);
+
+	for (i = 0; i < kimage->nr_segments; i++) {
+		pr_debug("      segment[%lu]: %016lx - %016lx, %lx bytes, %lu pages%s\n",
+			i,
+			kimage->segment[i].mem,
+			kimage->segment[i].mem + kimage->segment[i].memsz,
+			kimage->segment[i].memsz,
+			kimage->segment[i].memsz /  PAGE_SIZE,
+			(kexec_is_dtb(kimage->segment[i].buf) ?
+				", dtb segment" : ""));
+	}
+}
+
 void machine_kexec_cleanup(struct kimage *kimage)
 {
 	/* Empty routine needed to avoid build errors. */
@@ -38,6 +83,8 @@ void machine_kexec_cleanup(struct kimage *kimage)
 int machine_kexec_prepare(struct kimage *kimage)
 {
 	kimage_start = kimage->start;
+	kexec_image_info(kimage);
+
 	return 0;
 }
 
@@ -77,10 +124,10 @@ static void kexec_segment_flush(const struct kimage *kimage)
 {
 	unsigned long i;
 
-	pr_devel("%s:\n", __func__);
+	pr_debug("%s:\n", __func__);
 
 	for (i = 0; i < kimage->nr_segments; i++) {
-		pr_devel("  segment[%lu]: %016lx - %016lx, %lx bytes, %lu pages\n",
+		pr_debug("  segment[%lu]: %016lx - %016lx, %lx bytes, %lu pages\n",
 			i,
 			kimage->segment[i].mem,
 			kimage->segment[i].mem + kimage->segment[i].memsz,
@@ -107,6 +154,25 @@ void machine_kexec(struct kimage *kimage)
 	reboot_code_buffer_phys = page_to_phys(kimage->control_code_page);
 	reboot_code_buffer = phys_to_virt(reboot_code_buffer_phys);
 
+	kexec_image_info(kimage);
+
+	pr_debug("%s:%d: control_code_page:        %p\n", __func__, __LINE__,
+		kimage->control_code_page);
+	pr_debug("%s:%d: reboot_code_buffer_phys:  %pa\n", __func__, __LINE__,
+		&reboot_code_buffer_phys);
+	pr_debug("%s:%d: reboot_code_buffer:       %p\n", __func__, __LINE__,
+		reboot_code_buffer);
+	pr_debug("%s:%d: relocate_new_kernel:      %p\n", __func__, __LINE__,
+		arm64_relocate_new_kernel);
+	pr_debug("%s:%d: relocate_new_kernel_size: 0x%lx(%lu) bytes\n",
+		__func__, __LINE__, arm64_relocate_new_kernel_size,
+		arm64_relocate_new_kernel_size);
+
+	pr_debug("%s:%d: kimage_head:              %lx\n", __func__, __LINE__,
+		kimage->head);
+	pr_debug("%s:%d: kimage_start:             %lx\n", __func__, __LINE__,
+		kimage_start);
+
 	/*
 	 * Copy arm64_relocate_new_kernel to the reboot_code_buffer for use
 	 * after the kernel is shut down.
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 178+ messages in thread

* [PATCH v12.4] arm64/kexec: Add pr_debug output
@ 2015-12-16  0:46     ` Geoff Levand
  0 siblings, 0 replies; 178+ messages in thread
From: Geoff Levand @ 2015-12-16  0:46 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon
  Cc: Mark Rutland, marc.zyngier, kexec, AKASHI Takahiro,
	christoffer.dall, linux-arm-kernel

To aid in debugging kexec problems or when adding new functionality to kexec add
a new routine kexec_image_info() and several inline pr_debug statements.

Signed-off-by: Geoff Levand <geoff@infradead.org>
---
 arch/arm64/kernel/machine_kexec.c | 70 +++++++++++++++++++++++++++++++++++++--
 1 file changed, 68 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
index fd05308..02c6eac 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -10,6 +10,7 @@
  */
 
 #include <linux/kexec.h>
+#include <linux/libfdt_env.h>
 #include <linux/of_fdt.h>
 #include <linux/slab.h>
 #include <linux/uaccess.h>
@@ -25,6 +26,50 @@ extern const unsigned long arm64_relocate_new_kernel_size;
 
 static unsigned long kimage_start;
 
+/**
+ * kexec_is_dtb - Helper routine to check the device tree header signature.
+ */
+static bool kexec_is_dtb(const void *dtb)
+{
+	__be32 magic;
+
+	if (!get_user(magic, (__be32 *)dtb))
+		return false;
+
+	return fdt32_to_cpu(magic) == OF_DT_HEADER;
+}
+
+/**
+ * kexec_image_info - For debugging output.
+ */
+#define kexec_image_info(_i) _kexec_image_info(__func__, __LINE__, _i)
+static void _kexec_image_info(const char *func, int line,
+	const struct kimage *kimage)
+{
+	unsigned long i;
+
+#if !defined(DEBUG)
+	return;
+#endif
+	pr_debug("%s:%d:\n", func, line);
+	pr_debug("  kexec kimage info:\n");
+	pr_debug("    type:        %d\n", kimage->type);
+	pr_debug("    start:       %lx\n", kimage->start);
+	pr_debug("    head:        %lx\n", kimage->head);
+	pr_debug("    nr_segments: %lu\n", kimage->nr_segments);
+
+	for (i = 0; i < kimage->nr_segments; i++) {
+		pr_debug("      segment[%lu]: %016lx - %016lx, %lx bytes, %lu pages%s\n",
+			i,
+			kimage->segment[i].mem,
+			kimage->segment[i].mem + kimage->segment[i].memsz,
+			kimage->segment[i].memsz,
+			kimage->segment[i].memsz /  PAGE_SIZE,
+			(kexec_is_dtb(kimage->segment[i].buf) ?
+				", dtb segment" : ""));
+	}
+}
+
 void machine_kexec_cleanup(struct kimage *kimage)
 {
 	/* Empty routine needed to avoid build errors. */
@@ -38,6 +83,8 @@ void machine_kexec_cleanup(struct kimage *kimage)
 int machine_kexec_prepare(struct kimage *kimage)
 {
 	kimage_start = kimage->start;
+	kexec_image_info(kimage);
+
 	return 0;
 }
 
@@ -77,10 +124,10 @@ static void kexec_segment_flush(const struct kimage *kimage)
 {
 	unsigned long i;
 
-	pr_devel("%s:\n", __func__);
+	pr_debug("%s:\n", __func__);
 
 	for (i = 0; i < kimage->nr_segments; i++) {
-		pr_devel("  segment[%lu]: %016lx - %016lx, %lx bytes, %lu pages\n",
+		pr_debug("  segment[%lu]: %016lx - %016lx, %lx bytes, %lu pages\n",
 			i,
 			kimage->segment[i].mem,
 			kimage->segment[i].mem + kimage->segment[i].memsz,
@@ -107,6 +154,25 @@ void machine_kexec(struct kimage *kimage)
 	reboot_code_buffer_phys = page_to_phys(kimage->control_code_page);
 	reboot_code_buffer = phys_to_virt(reboot_code_buffer_phys);
 
+	kexec_image_info(kimage);
+
+	pr_debug("%s:%d: control_code_page:        %p\n", __func__, __LINE__,
+		kimage->control_code_page);
+	pr_debug("%s:%d: reboot_code_buffer_phys:  %pa\n", __func__, __LINE__,
+		&reboot_code_buffer_phys);
+	pr_debug("%s:%d: reboot_code_buffer:       %p\n", __func__, __LINE__,
+		reboot_code_buffer);
+	pr_debug("%s:%d: relocate_new_kernel:      %p\n", __func__, __LINE__,
+		arm64_relocate_new_kernel);
+	pr_debug("%s:%d: relocate_new_kernel_size: 0x%lx(%lu) bytes\n",
+		__func__, __LINE__, arm64_relocate_new_kernel_size,
+		arm64_relocate_new_kernel_size);
+
+	pr_debug("%s:%d: kimage_head:              %lx\n", __func__, __LINE__,
+		kimage->head);
+	pr_debug("%s:%d: kimage_start:             %lx\n", __func__, __LINE__,
+		kimage_start);
+
 	/*
 	 * Copy arm64_relocate_new_kernel to the reboot_code_buffer for use
 	 * after the kernel is shut down.
-- 
2.5.0



_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply related	[flat|nested] 178+ messages in thread

* [PATCH v12 11/16] arm64: kdump: reserve memory for crash dump kernel
  2015-12-15 17:29     ` Will Deacon
@ 2015-12-16  5:19       ` AKASHI Takahiro
  -1 siblings, 0 replies; 178+ messages in thread
From: AKASHI Takahiro @ 2015-12-16  5:19 UTC (permalink / raw)
  To: linux-arm-kernel

On 12/16/2015 02:29 AM, Will Deacon wrote:
> On Tue, Nov 24, 2015 at 10:25:34PM +0000, Geoff Levand wrote:
>> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>
>> On primary kernel, the memory region used by crash dump kernel must be
>> specified by "crashkernel=" boot parameter. reserve_crashkernel()
>> will allocate and reserve the region for later use.
>>
>> User space tools will be able to find the region marked as "Crash kernel"
>> in /proc/iomem.
>>
>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>> ---
>>   arch/arm64/kernel/setup.c |  7 +++++-
>>   arch/arm64/mm/init.c      | 54 +++++++++++++++++++++++++++++++++++++++++++++++
>>   2 files changed, 60 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
>> index 8119479..f9fffc9 100644
>> --- a/arch/arm64/kernel/setup.c
>> +++ b/arch/arm64/kernel/setup.c
>> @@ -31,7 +31,6 @@
>>   #include <linux/screen_info.h>
>>   #include <linux/init.h>
>>   #include <linux/kexec.h>
>> -#include <linux/crash_dump.h>
>>   #include <linux/root_dev.h>
>>   #include <linux/cpu.h>
>>   #include <linux/interrupt.h>
>> @@ -221,6 +220,12 @@ static void __init request_standard_resources(void)
>>   		    kernel_data.end <= res->end)
>>   			request_resource(res, &kernel_data);
>>   	}
>> +
>> +#ifdef CONFIG_KEXEC
>> +	/* User space tools will find "Crash kernel" region in /proc/iomem. */
>> +	if (crashk_res.end)
>> +		insert_resource(&iomem_resource, &crashk_res);
>> +#endif
>
> Why can't this bit be moved into reserved_crashkernel, at the end?

Because
- I want to call reserve_crashkernel() as early as possbile, (so before
   request_standard_resources()), and
- if inserting "Crash kernel" region is done before request_standard_resources(),
   request_resource() for "System RAM" region which contains "Crash kernel" will
   fail and the entry will be lost from /proc/iomem.


>>   }
>>
>>   #ifdef CONFIG_BLK_DEV_INITRD
>> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
>> index 17bf39a..24f0a1c 100644
>> --- a/arch/arm64/mm/init.c
>> +++ b/arch/arm64/mm/init.c
>> @@ -34,6 +34,7 @@
>>   #include <linux/dma-contiguous.h>
>>   #include <linux/efi.h>
>>   #include <linux/swiotlb.h>
>> +#include <linux/kexec.h>
>>
>>   #include <asm/fixmap.h>
>>   #include <asm/memory.h>
>> @@ -66,6 +67,55 @@ static int __init early_initrd(char *p)
>>   early_param("initrd", early_initrd);
>>   #endif
>>
>> +#ifdef CONFIG_KEXEC
>> +/*
>> + * reserve_crashkernel() - reserves memory for crash kernel
>> + *
>> + * This function reserves memory area given in "crashkernel=" kernel command
>> + * line parameter. The memory reserved is used by dump capture kernel when
>> + * primary kernel is crashing.
>> + */
>> +static void __init reserve_crashkernel(void)
>> +{
>> +	unsigned long long crash_size = 0, crash_base = 0;
>> +	int ret;
>> +
>> +	ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
>> +				&crash_size, &crash_base);
>> +	if (ret)
>> +		return;
>
> What's the point in ret?

parse_crashkernel() will returns -EINVAL if
- the boot command line doesn't contain "crashkernel=" parameter, or
- "crashkernel=" parameter has an invalid synstax of value
Otherwise returns 0.

So just returning should be OK, but just in case that the syntax is OK, but
that a resulting crash_size gets 0, I will add a check like:
     if (!crash_size)
         return;

>> +
>> +	if (crash_base == 0) {
>> +		crash_base = memblock_alloc(crash_size, 1 << 21);
>
> What is this magic number? How about SZ_2M and a comment mentioning the
> arm64 boot protocol?

Right. I will do so.

>> +		if (crash_base == 0) {
>> +			pr_warn("Unable to allocate crashkernel (size:%llx)\n",
>> +				crash_size);
>> +			return;
>> +		}
>> +	} else {
>> +		/* User specifies base address explicitly. */
>> +		if (!memblock_is_region_memory(crash_base, crash_size) ||
>> +			memblock_is_region_reserved(crash_base, crash_size)) {
>> +			pr_warn("crashkernel has wrong address or size\n");
>> +			return;
>> +		}
>> +
>> +		if (crash_base & ((1 << 21) - 1)) {
>
> IS_ALIGNED

OK.

>> +			pr_warn("crashkernel base address is not 2MB aligned\n");
>> +			return;
>> +		}
>> +
>> +		memblock_reserve(crash_base, crash_size);
>> +	}
>> +
>> +	pr_info("Reserving %lldMB of memory at %lldMB for crashkernel\n",
>> +		crash_size >> 20, crash_base >> 20);
>> +
>> +	crashk_res.start = crash_base;
>> +	crashk_res.end = crash_base + crash_size - 1;
>> +}
>
> #else
>
> static void __init reserve_crashkernel(void) {};

OK.

Thanks,
-Takahiro AKASHI

> Will
>

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 11/16] arm64: kdump: reserve memory for crash dump kernel
@ 2015-12-16  5:19       ` AKASHI Takahiro
  0 siblings, 0 replies; 178+ messages in thread
From: AKASHI Takahiro @ 2015-12-16  5:19 UTC (permalink / raw)
  To: Will Deacon, Geoff Levand
  Cc: Mark Rutland, marc.zyngier, Catalin Marinas, kexec,
	christoffer.dall, linux-arm-kernel

On 12/16/2015 02:29 AM, Will Deacon wrote:
> On Tue, Nov 24, 2015 at 10:25:34PM +0000, Geoff Levand wrote:
>> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>
>> On primary kernel, the memory region used by crash dump kernel must be
>> specified by "crashkernel=" boot parameter. reserve_crashkernel()
>> will allocate and reserve the region for later use.
>>
>> User space tools will be able to find the region marked as "Crash kernel"
>> in /proc/iomem.
>>
>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>> ---
>>   arch/arm64/kernel/setup.c |  7 +++++-
>>   arch/arm64/mm/init.c      | 54 +++++++++++++++++++++++++++++++++++++++++++++++
>>   2 files changed, 60 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
>> index 8119479..f9fffc9 100644
>> --- a/arch/arm64/kernel/setup.c
>> +++ b/arch/arm64/kernel/setup.c
>> @@ -31,7 +31,6 @@
>>   #include <linux/screen_info.h>
>>   #include <linux/init.h>
>>   #include <linux/kexec.h>
>> -#include <linux/crash_dump.h>
>>   #include <linux/root_dev.h>
>>   #include <linux/cpu.h>
>>   #include <linux/interrupt.h>
>> @@ -221,6 +220,12 @@ static void __init request_standard_resources(void)
>>   		    kernel_data.end <= res->end)
>>   			request_resource(res, &kernel_data);
>>   	}
>> +
>> +#ifdef CONFIG_KEXEC
>> +	/* User space tools will find "Crash kernel" region in /proc/iomem. */
>> +	if (crashk_res.end)
>> +		insert_resource(&iomem_resource, &crashk_res);
>> +#endif
>
> Why can't this bit be moved into reserved_crashkernel, at the end?

Because
- I want to call reserve_crashkernel() as early as possbile, (so before
   request_standard_resources()), and
- if inserting "Crash kernel" region is done before request_standard_resources(),
   request_resource() for "System RAM" region which contains "Crash kernel" will
   fail and the entry will be lost from /proc/iomem.


>>   }
>>
>>   #ifdef CONFIG_BLK_DEV_INITRD
>> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
>> index 17bf39a..24f0a1c 100644
>> --- a/arch/arm64/mm/init.c
>> +++ b/arch/arm64/mm/init.c
>> @@ -34,6 +34,7 @@
>>   #include <linux/dma-contiguous.h>
>>   #include <linux/efi.h>
>>   #include <linux/swiotlb.h>
>> +#include <linux/kexec.h>
>>
>>   #include <asm/fixmap.h>
>>   #include <asm/memory.h>
>> @@ -66,6 +67,55 @@ static int __init early_initrd(char *p)
>>   early_param("initrd", early_initrd);
>>   #endif
>>
>> +#ifdef CONFIG_KEXEC
>> +/*
>> + * reserve_crashkernel() - reserves memory for crash kernel
>> + *
>> + * This function reserves memory area given in "crashkernel=" kernel command
>> + * line parameter. The memory reserved is used by dump capture kernel when
>> + * primary kernel is crashing.
>> + */
>> +static void __init reserve_crashkernel(void)
>> +{
>> +	unsigned long long crash_size = 0, crash_base = 0;
>> +	int ret;
>> +
>> +	ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
>> +				&crash_size, &crash_base);
>> +	if (ret)
>> +		return;
>
> What's the point in ret?

parse_crashkernel() will returns -EINVAL if
- the boot command line doesn't contain "crashkernel=" parameter, or
- "crashkernel=" parameter has an invalid synstax of value
Otherwise returns 0.

So just returning should be OK, but just in case that the syntax is OK, but
that a resulting crash_size gets 0, I will add a check like:
     if (!crash_size)
         return;

>> +
>> +	if (crash_base == 0) {
>> +		crash_base = memblock_alloc(crash_size, 1 << 21);
>
> What is this magic number? How about SZ_2M and a comment mentioning the
> arm64 boot protocol?

Right. I will do so.

>> +		if (crash_base == 0) {
>> +			pr_warn("Unable to allocate crashkernel (size:%llx)\n",
>> +				crash_size);
>> +			return;
>> +		}
>> +	} else {
>> +		/* User specifies base address explicitly. */
>> +		if (!memblock_is_region_memory(crash_base, crash_size) ||
>> +			memblock_is_region_reserved(crash_base, crash_size)) {
>> +			pr_warn("crashkernel has wrong address or size\n");
>> +			return;
>> +		}
>> +
>> +		if (crash_base & ((1 << 21) - 1)) {
>
> IS_ALIGNED

OK.

>> +			pr_warn("crashkernel base address is not 2MB aligned\n");
>> +			return;
>> +		}
>> +
>> +		memblock_reserve(crash_base, crash_size);
>> +	}
>> +
>> +	pr_info("Reserving %lldMB of memory at %lldMB for crashkernel\n",
>> +		crash_size >> 20, crash_base >> 20);
>> +
>> +	crashk_res.start = crash_base;
>> +	crashk_res.end = crash_base + crash_size - 1;
>> +}
>
> #else
>
> static void __init reserve_crashkernel(void) {};

OK.

Thanks,
-Takahiro AKASHI

> Will
>

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 13/16] arm64: kdump: add kdump support
  2015-12-15 17:45     ` Will Deacon
@ 2015-12-16  5:41       ` AKASHI Takahiro
  -1 siblings, 0 replies; 178+ messages in thread
From: AKASHI Takahiro @ 2015-12-16  5:41 UTC (permalink / raw)
  To: linux-arm-kernel

On 12/16/2015 02:45 AM, Will Deacon wrote:
> On Tue, Nov 24, 2015 at 10:25:34PM +0000, Geoff Levand wrote:
>> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>
>> On crash dump kernel, all the information about primary kernel's core
>> image is available in elf core header specified by "elfcorehdr=" boot
>> parameter. reserve_elfcorehdr() will set aside the region to avoid any
>> corruption by crash dump kernel.
>>
>> Crash dump kernel will access the system memory of primary kernel via
>> copy_oldmem_page(), which reads one page by ioremap'ing it since it does
>> not reside in linear mapping on crash dump kernel.
>> Please note that we should add "mem=X[MG]" boot parameter to limit the
>> memory size and avoid the following assertion at ioremap():
>> 	if (WARN_ON(pfn_valid(__phys_to_pfn(phys_addr))))
>> 		return NULL;
>> when accessing any pages beyond the usable memories of crash dump kernel.
>>
>> We also need our own elfcorehdr_read() here since the weak definition of
>> elfcorehdr_read() utilizes copy_oldmem_page() and will hit the assertion
>> above on arm64.
>>
>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>> ---
>>   arch/arm64/Kconfig             | 12 +++++++
>>   arch/arm64/kernel/Makefile     |  1 +
>>   arch/arm64/kernel/crash_dump.c | 71 ++++++++++++++++++++++++++++++++++++++++++
>>   arch/arm64/mm/init.c           | 29 +++++++++++++++++
>>   4 files changed, 113 insertions(+)
>>   create mode 100644 arch/arm64/kernel/crash_dump.c
>>
>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>> index c23fd77..4bac7dc 100644
>> --- a/arch/arm64/Kconfig
>> +++ b/arch/arm64/Kconfig
>> @@ -545,6 +545,18 @@ config KEXEC
>>   	  but it is independent of the system firmware.   And like a reboot
>>   	  you can start any kernel with it, not just Linux.
>>
>> +config CRASH_DUMP
>> +	bool "Build kdump crash kernel"
>> +	help
>> +	  Generate crash dump after being started by kexec. This should
>> +	  be normally only set in special crash dump kernels which are
>> +	  loaded in the main kernel with kexec-tools into a specially
>> +	  reserved region and then later executed after a crash by
>> +	  kdump/kexec. The crash dump kernel must be compiled to a
>> +	  memory address not used by the main kernel.
>
> What does this even mean? How do I "compile to a memory address not used
> by the main kernel"?

Well, it's just a copy from arm, but right, it's ambiguous.
I will remove that text.

>> diff --git a/arch/arm64/kernel/crash_dump.c b/arch/arm64/kernel/crash_dump.c
>> new file mode 100644
>> index 0000000..3d86c0a
>> --- /dev/null
>> +++ b/arch/arm64/kernel/crash_dump.c
>> @@ -0,0 +1,71 @@
>> +/*
>> + * Routines for doing kexec-based kdump
>> + *
>> + * Copyright (C) 2014 Linaro Limited
>> + * Author: AKASHI Takahiro <takahiro.akashi@linaro.org>
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + */
>> +
>> +#include <linux/crash_dump.h>
>> +#include <linux/errno.h>
>> +#include <linux/io.h>
>> +#include <linux/memblock.h>
>> +#include <linux/uaccess.h>
>> +#include <asm/memory.h>
>> +
>> +/**
>> + * copy_oldmem_page() - copy one page from old kernel memory
>> + * @pfn: page frame number to be copied
>> + * @buf: buffer where the copied page is placed
>> + * @csize: number of bytes to copy
>> + * @offset: offset in bytes into the page
>> + * @userbuf: if set, @buf is in a user address space
>> + *
>> + * This function copies one page from old kernel memory into buffer pointed by
>> + * @buf. If @buf is in userspace, set @userbuf to %1. Returns number of bytes
>> + * copied or negative error in case of failure.
>> + */
>> +ssize_t copy_oldmem_page(unsigned long pfn, char *buf,
>> +			 size_t csize, unsigned long offset,
>> +			 int userbuf)
>> +{
>> +	void *vaddr;
>> +
>> +	if (!csize)
>> +		return 0;
>> +
>> +	vaddr = ioremap_cache(pfn << PAGE_SHIFT, PAGE_SIZE);
>
> pfn_to_page

Maybe __pfn_to_phsy()?

>> +	if (!vaddr)
>> +		return -ENOMEM;
>> +
>> +	if (userbuf) {
>> +		if (copy_to_user(buf, vaddr + offset, csize)) {
>> +			iounmap(vaddr);
>> +			return -EFAULT;
>> +		}
>> +	} else {
>> +		memcpy(buf, vaddr + offset, csize);
>> +	}
>> +
>> +	iounmap(vaddr);
>> +
>> +	return csize;
>> +}
>> +
>> +/**
>> + * elfcorehdr_read - read from ELF core header
>> + * @buf: buffer where the data is placed
>> + * @csize: number of bytes to read
>> + * @ppos: address in the memory
>> + *
>> + * This function reads @count bytes from elf core header which exists
>> + * on crash dump kernel's memory.
>> + */
>> +ssize_t elfcorehdr_read(char *buf, size_t count, u64 *ppos)
>> +{
>> +	memcpy(buf, phys_to_virt((phys_addr_t)*ppos), count);
>> +	return count;
>> +}
>
> I know you say that we have to override this function so that we don't
> hit the pfn_valid warning in ioremap, but what guarantees that the ELF
> header of the crashed kernel is actually mapped in our linear mapping?

Well, in fact, it depends on kexec-tools.
In the current implementation for arm64, the elf core header is allocated
within the usable memory of crash dump kernel.

Should we add some check here?

>> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
>> index 24f0a1c..52a1469 100644
>> --- a/arch/arm64/mm/init.c
>> +++ b/arch/arm64/mm/init.c
>> @@ -35,6 +35,7 @@
>>   #include <linux/efi.h>
>>   #include <linux/swiotlb.h>
>>   #include <linux/kexec.h>
>> +#include <linux/crash_dump.h>
>>
>>   #include <asm/fixmap.h>
>>   #include <asm/memory.h>
>> @@ -116,6 +117,31 @@ static void __init reserve_crashkernel(void)
>>   }
>>   #endif /* CONFIG_KEXEC */
>>
>> +#ifdef CONFIG_CRASH_DUMP
>> +/*
>> + * reserve_elfcorehdr() - reserves memory for elf core header
>> + *
>> + * This function reserves elf core header given in "elfcorehdr=" kernel
>> + * command line parameter. This region contains all the information about
>> + * primary kernel's core image and is used by a dump capture kernel to
>> + * access the system memory on primary kernel.
>> + */
>> +static void __init reserve_elfcorehdr(void)
>> +{
>> +	if (!elfcorehdr_size)
>> +		return;
>> +
>> +	if (memblock_is_region_reserved(elfcorehdr_addr, elfcorehdr_size)) {
>> +		pr_warn("elfcorehdr is overlapped\n");
>> +		return;
>> +	}
>> +
>> +	memblock_reserve(elfcorehdr_addr, elfcorehdr_size);
>> +
>> +	pr_info("Reserving %lldKB of memory at %lldMB for elfcorehdr\n",
>> +		elfcorehdr_size >> 10, elfcorehdr_addr >> 20);
>
> I'd have thought it would be more useful to print the address as an
> address rather than a size.

Yeah, I totally agree, but all the other archs, including x86 and arm,
print the address in "%lldMB" format.
If you like, I can fix it.

>> +}
>
> Similar #else trick here.

Sure.

Thanks,
-Takahiro AKASHI

> Will
>

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 13/16] arm64: kdump: add kdump support
@ 2015-12-16  5:41       ` AKASHI Takahiro
  0 siblings, 0 replies; 178+ messages in thread
From: AKASHI Takahiro @ 2015-12-16  5:41 UTC (permalink / raw)
  To: Will Deacon, Geoff Levand
  Cc: Mark Rutland, marc.zyngier, Catalin Marinas, kexec,
	christoffer.dall, linux-arm-kernel

On 12/16/2015 02:45 AM, Will Deacon wrote:
> On Tue, Nov 24, 2015 at 10:25:34PM +0000, Geoff Levand wrote:
>> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>
>> On crash dump kernel, all the information about primary kernel's core
>> image is available in elf core header specified by "elfcorehdr=" boot
>> parameter. reserve_elfcorehdr() will set aside the region to avoid any
>> corruption by crash dump kernel.
>>
>> Crash dump kernel will access the system memory of primary kernel via
>> copy_oldmem_page(), which reads one page by ioremap'ing it since it does
>> not reside in linear mapping on crash dump kernel.
>> Please note that we should add "mem=X[MG]" boot parameter to limit the
>> memory size and avoid the following assertion at ioremap():
>> 	if (WARN_ON(pfn_valid(__phys_to_pfn(phys_addr))))
>> 		return NULL;
>> when accessing any pages beyond the usable memories of crash dump kernel.
>>
>> We also need our own elfcorehdr_read() here since the weak definition of
>> elfcorehdr_read() utilizes copy_oldmem_page() and will hit the assertion
>> above on arm64.
>>
>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>> ---
>>   arch/arm64/Kconfig             | 12 +++++++
>>   arch/arm64/kernel/Makefile     |  1 +
>>   arch/arm64/kernel/crash_dump.c | 71 ++++++++++++++++++++++++++++++++++++++++++
>>   arch/arm64/mm/init.c           | 29 +++++++++++++++++
>>   4 files changed, 113 insertions(+)
>>   create mode 100644 arch/arm64/kernel/crash_dump.c
>>
>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>> index c23fd77..4bac7dc 100644
>> --- a/arch/arm64/Kconfig
>> +++ b/arch/arm64/Kconfig
>> @@ -545,6 +545,18 @@ config KEXEC
>>   	  but it is independent of the system firmware.   And like a reboot
>>   	  you can start any kernel with it, not just Linux.
>>
>> +config CRASH_DUMP
>> +	bool "Build kdump crash kernel"
>> +	help
>> +	  Generate crash dump after being started by kexec. This should
>> +	  be normally only set in special crash dump kernels which are
>> +	  loaded in the main kernel with kexec-tools into a specially
>> +	  reserved region and then later executed after a crash by
>> +	  kdump/kexec. The crash dump kernel must be compiled to a
>> +	  memory address not used by the main kernel.
>
> What does this even mean? How do I "compile to a memory address not used
> by the main kernel"?

Well, it's just a copy from arm, but right, it's ambiguous.
I will remove that text.

>> diff --git a/arch/arm64/kernel/crash_dump.c b/arch/arm64/kernel/crash_dump.c
>> new file mode 100644
>> index 0000000..3d86c0a
>> --- /dev/null
>> +++ b/arch/arm64/kernel/crash_dump.c
>> @@ -0,0 +1,71 @@
>> +/*
>> + * Routines for doing kexec-based kdump
>> + *
>> + * Copyright (C) 2014 Linaro Limited
>> + * Author: AKASHI Takahiro <takahiro.akashi@linaro.org>
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + */
>> +
>> +#include <linux/crash_dump.h>
>> +#include <linux/errno.h>
>> +#include <linux/io.h>
>> +#include <linux/memblock.h>
>> +#include <linux/uaccess.h>
>> +#include <asm/memory.h>
>> +
>> +/**
>> + * copy_oldmem_page() - copy one page from old kernel memory
>> + * @pfn: page frame number to be copied
>> + * @buf: buffer where the copied page is placed
>> + * @csize: number of bytes to copy
>> + * @offset: offset in bytes into the page
>> + * @userbuf: if set, @buf is in a user address space
>> + *
>> + * This function copies one page from old kernel memory into buffer pointed by
>> + * @buf. If @buf is in userspace, set @userbuf to %1. Returns number of bytes
>> + * copied or negative error in case of failure.
>> + */
>> +ssize_t copy_oldmem_page(unsigned long pfn, char *buf,
>> +			 size_t csize, unsigned long offset,
>> +			 int userbuf)
>> +{
>> +	void *vaddr;
>> +
>> +	if (!csize)
>> +		return 0;
>> +
>> +	vaddr = ioremap_cache(pfn << PAGE_SHIFT, PAGE_SIZE);
>
> pfn_to_page

Maybe __pfn_to_phsy()?

>> +	if (!vaddr)
>> +		return -ENOMEM;
>> +
>> +	if (userbuf) {
>> +		if (copy_to_user(buf, vaddr + offset, csize)) {
>> +			iounmap(vaddr);
>> +			return -EFAULT;
>> +		}
>> +	} else {
>> +		memcpy(buf, vaddr + offset, csize);
>> +	}
>> +
>> +	iounmap(vaddr);
>> +
>> +	return csize;
>> +}
>> +
>> +/**
>> + * elfcorehdr_read - read from ELF core header
>> + * @buf: buffer where the data is placed
>> + * @csize: number of bytes to read
>> + * @ppos: address in the memory
>> + *
>> + * This function reads @count bytes from elf core header which exists
>> + * on crash dump kernel's memory.
>> + */
>> +ssize_t elfcorehdr_read(char *buf, size_t count, u64 *ppos)
>> +{
>> +	memcpy(buf, phys_to_virt((phys_addr_t)*ppos), count);
>> +	return count;
>> +}
>
> I know you say that we have to override this function so that we don't
> hit the pfn_valid warning in ioremap, but what guarantees that the ELF
> header of the crashed kernel is actually mapped in our linear mapping?

Well, in fact, it depends on kexec-tools.
In the current implementation for arm64, the elf core header is allocated
within the usable memory of crash dump kernel.

Should we add some check here?

>> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
>> index 24f0a1c..52a1469 100644
>> --- a/arch/arm64/mm/init.c
>> +++ b/arch/arm64/mm/init.c
>> @@ -35,6 +35,7 @@
>>   #include <linux/efi.h>
>>   #include <linux/swiotlb.h>
>>   #include <linux/kexec.h>
>> +#include <linux/crash_dump.h>
>>
>>   #include <asm/fixmap.h>
>>   #include <asm/memory.h>
>> @@ -116,6 +117,31 @@ static void __init reserve_crashkernel(void)
>>   }
>>   #endif /* CONFIG_KEXEC */
>>
>> +#ifdef CONFIG_CRASH_DUMP
>> +/*
>> + * reserve_elfcorehdr() - reserves memory for elf core header
>> + *
>> + * This function reserves elf core header given in "elfcorehdr=" kernel
>> + * command line parameter. This region contains all the information about
>> + * primary kernel's core image and is used by a dump capture kernel to
>> + * access the system memory on primary kernel.
>> + */
>> +static void __init reserve_elfcorehdr(void)
>> +{
>> +	if (!elfcorehdr_size)
>> +		return;
>> +
>> +	if (memblock_is_region_reserved(elfcorehdr_addr, elfcorehdr_size)) {
>> +		pr_warn("elfcorehdr is overlapped\n");
>> +		return;
>> +	}
>> +
>> +	memblock_reserve(elfcorehdr_addr, elfcorehdr_size);
>> +
>> +	pr_info("Reserving %lldKB of memory at %lldMB for elfcorehdr\n",
>> +		elfcorehdr_size >> 10, elfcorehdr_addr >> 20);
>
> I'd have thought it would be more useful to print the address as an
> address rather than a size.

Yeah, I totally agree, but all the other archs, including x86 and arm,
print the address in "%lldMB" format.
If you like, I can fix it.

>> +}
>
> Similar #else trick here.

Sure.

Thanks,
-Takahiro AKASHI

> Will
>

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 14/16] arm64: kdump: update a kernel doc
  2015-12-15 17:17     ` Will Deacon
@ 2015-12-16  5:48       ` AKASHI Takahiro
  -1 siblings, 0 replies; 178+ messages in thread
From: AKASHI Takahiro @ 2015-12-16  5:48 UTC (permalink / raw)
  To: linux-arm-kernel

On 12/16/2015 02:17 AM, Will Deacon wrote:
> On Tue, Nov 24, 2015 at 10:25:34PM +0000, Geoff Levand wrote:
>> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>
>> This patch adds arch specific descriptions about kdump usage on arm64
>> to kdump.txt.
>>
>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>> ---
>>   Documentation/kdump/kdump.txt | 32 +++++++++++++++++++++++++++++++-
>>   1 file changed, 31 insertions(+), 1 deletion(-)
>>
>> diff --git a/Documentation/kdump/kdump.txt b/Documentation/kdump/kdump.txt
>> index bc4bd5a..7d3fad0 100644
>> --- a/Documentation/kdump/kdump.txt
>> +++ b/Documentation/kdump/kdump.txt
>> @@ -18,7 +18,7 @@ memory image to a dump file on the local disk, or across the network to
>>   a remote system.
>>
>>   Kdump and kexec are currently supported on the x86, x86_64, ppc64, ia64,
>> -s390x and arm architectures.
>> +s390x, arm and arm64 architectures.
>>
>>   When the system kernel boots, it reserves a small section of memory for
>>   the dump-capture kernel. This ensures that ongoing Direct Memory Access
>> @@ -249,6 +249,29 @@ Dump-capture kernel config options (Arch Dependent, arm)
>>
>>       AUTO_ZRELADDR=y
>>
>> +Dump-capture kernel config options (Arch Dependent, arm64)
>> +----------------------------------------------------------
>> +
>> +1) Disable symmetric multi-processing support under "Processor type and
>> +   features":
>> +
>> +   CONFIG_SMP=n
>> +
>> +   (If CONFIG_SMP=y, then specify maxcpus=1 on the kernel command line
>> +   when loading the dump-capture kernel, see section "Load the Dump-capture
>> +   Kernel".)
>
> SMP is now forced on, so please update this text.

OK, I will fix it.

>> +2) The maximum memory size on the dump-capture kernel must be limited by
>> +   specifying:
>> +
>> +   mem=X[MG]
>> +
>> +   where X should be less than or equal to the size in "crashkernel="
>> +   boot parameter. Kexec-tools will automatically add this.
>> +
>> +3) Currently, kvm will not be enabled on the dump-capture kernel even
>> +   if it is configured.
>> +
>>   Extended crashkernel syntax
>>   ===========================
>>
>> @@ -312,6 +335,8 @@ Boot into System Kernel
>>      any space below the alignment point may be overwritten by the dump-capture kernel,
>>      which means it is possible that the vmcore is not that precise as expected.
>>
>> +   On arm64, use "crashkernel=Y[@X]".  Note that the start address of
>> +   the kernel, X if explicitly specified, must be aligned to 2MiB (0x200000).
>>
>>   Load the Dump-capture Kernel
>>   ============================
>> @@ -334,6 +359,8 @@ For s390x:
>>   	- Use image or bzImage
>>   For arm:
>>   	- Use zImage
>> +For arm64:
>> +	- Use vmlinux
>
> What? Why doesn't Image work?

Actually it depends on kexec-tools, not the kernel.
Initially Geoff implemented vmlinux support only, then
recently Pratyush (RedHat) gave us a patch for "Image" support.

So now Image should work.

-Takahiro AKASHI

> Will
>

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 14/16] arm64: kdump: update a kernel doc
@ 2015-12-16  5:48       ` AKASHI Takahiro
  0 siblings, 0 replies; 178+ messages in thread
From: AKASHI Takahiro @ 2015-12-16  5:48 UTC (permalink / raw)
  To: Will Deacon, Geoff Levand
  Cc: Mark Rutland, marc.zyngier, Catalin Marinas, kexec,
	christoffer.dall, linux-arm-kernel

On 12/16/2015 02:17 AM, Will Deacon wrote:
> On Tue, Nov 24, 2015 at 10:25:34PM +0000, Geoff Levand wrote:
>> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>
>> This patch adds arch specific descriptions about kdump usage on arm64
>> to kdump.txt.
>>
>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>> ---
>>   Documentation/kdump/kdump.txt | 32 +++++++++++++++++++++++++++++++-
>>   1 file changed, 31 insertions(+), 1 deletion(-)
>>
>> diff --git a/Documentation/kdump/kdump.txt b/Documentation/kdump/kdump.txt
>> index bc4bd5a..7d3fad0 100644
>> --- a/Documentation/kdump/kdump.txt
>> +++ b/Documentation/kdump/kdump.txt
>> @@ -18,7 +18,7 @@ memory image to a dump file on the local disk, or across the network to
>>   a remote system.
>>
>>   Kdump and kexec are currently supported on the x86, x86_64, ppc64, ia64,
>> -s390x and arm architectures.
>> +s390x, arm and arm64 architectures.
>>
>>   When the system kernel boots, it reserves a small section of memory for
>>   the dump-capture kernel. This ensures that ongoing Direct Memory Access
>> @@ -249,6 +249,29 @@ Dump-capture kernel config options (Arch Dependent, arm)
>>
>>       AUTO_ZRELADDR=y
>>
>> +Dump-capture kernel config options (Arch Dependent, arm64)
>> +----------------------------------------------------------
>> +
>> +1) Disable symmetric multi-processing support under "Processor type and
>> +   features":
>> +
>> +   CONFIG_SMP=n
>> +
>> +   (If CONFIG_SMP=y, then specify maxcpus=1 on the kernel command line
>> +   when loading the dump-capture kernel, see section "Load the Dump-capture
>> +   Kernel".)
>
> SMP is now forced on, so please update this text.

OK, I will fix it.

>> +2) The maximum memory size on the dump-capture kernel must be limited by
>> +   specifying:
>> +
>> +   mem=X[MG]
>> +
>> +   where X should be less than or equal to the size in "crashkernel="
>> +   boot parameter. Kexec-tools will automatically add this.
>> +
>> +3) Currently, kvm will not be enabled on the dump-capture kernel even
>> +   if it is configured.
>> +
>>   Extended crashkernel syntax
>>   ===========================
>>
>> @@ -312,6 +335,8 @@ Boot into System Kernel
>>      any space below the alignment point may be overwritten by the dump-capture kernel,
>>      which means it is possible that the vmcore is not that precise as expected.
>>
>> +   On arm64, use "crashkernel=Y[@X]".  Note that the start address of
>> +   the kernel, X if explicitly specified, must be aligned to 2MiB (0x200000).
>>
>>   Load the Dump-capture Kernel
>>   ============================
>> @@ -334,6 +359,8 @@ For s390x:
>>   	- Use image or bzImage
>>   For arm:
>>   	- Use zImage
>> +For arm64:
>> +	- Use vmlinux
>
> What? Why doesn't Image work?

Actually it depends on kexec-tools, not the kernel.
Initially Geoff implemented vmlinux support only, then
recently Pratyush (RedHat) gave us a patch for "Image" support.

So now Image should work.

-Takahiro AKASHI

> Will
>

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 16/16] arm64: kdump: relax BUG_ON() if more than one cpus are still active
  2015-12-15 17:05     ` Will Deacon
@ 2015-12-16  5:51       ` AKASHI Takahiro
  -1 siblings, 0 replies; 178+ messages in thread
From: AKASHI Takahiro @ 2015-12-16  5:51 UTC (permalink / raw)
  To: linux-arm-kernel

On 12/16/2015 02:05 AM, Will Deacon wrote:
> On Tue, Nov 24, 2015 at 10:25:34PM +0000, Geoff Levand wrote:
>> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>
>> We should try best in case of kdump.
>> So even if not all secondary cpus have shut down, we do kdump anyway.
>> ---
>>   arch/arm64/kernel/machine_kexec.c | 8 +++++++-
>>   1 file changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
>> index d2d7e90..482aae7 100644
>> --- a/arch/arm64/kernel/machine_kexec.c
>> +++ b/arch/arm64/kernel/machine_kexec.c
>> @@ -148,7 +148,13 @@ void machine_kexec(struct kimage *kimage)
>>   	phys_addr_t reboot_code_buffer_phys;
>>   	void *reboot_code_buffer;
>>
>> -	BUG_ON(num_online_cpus() > 1);
>> +	if (num_online_cpus() > 1) {
>> +		if (in_crash_kexec)
>> +			pr_warn("kdump might fail because %d cpus are still online\n",
>> +					num_online_cpus());
>> +		else
>> +			BUG();
>> +	}
>
> Can you just rewrite the existing BUG_ON as:
>
>    BUG_ON(num_online_cpus() && !WARN_ON(in_crash_kexec));
>
> ?

Yes, but
      BUG_ON((num_online_cpus() > 1) && !WARN_ON(in_crash_kexec));

Thanks,
-Takahiro AKASHI

> Will
>

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 16/16] arm64: kdump: relax BUG_ON() if more than one cpus are still active
@ 2015-12-16  5:51       ` AKASHI Takahiro
  0 siblings, 0 replies; 178+ messages in thread
From: AKASHI Takahiro @ 2015-12-16  5:51 UTC (permalink / raw)
  To: Will Deacon, Geoff Levand
  Cc: Mark Rutland, marc.zyngier, Catalin Marinas, kexec,
	christoffer.dall, linux-arm-kernel

On 12/16/2015 02:05 AM, Will Deacon wrote:
> On Tue, Nov 24, 2015 at 10:25:34PM +0000, Geoff Levand wrote:
>> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>
>> We should try best in case of kdump.
>> So even if not all secondary cpus have shut down, we do kdump anyway.
>> ---
>>   arch/arm64/kernel/machine_kexec.c | 8 +++++++-
>>   1 file changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
>> index d2d7e90..482aae7 100644
>> --- a/arch/arm64/kernel/machine_kexec.c
>> +++ b/arch/arm64/kernel/machine_kexec.c
>> @@ -148,7 +148,13 @@ void machine_kexec(struct kimage *kimage)
>>   	phys_addr_t reboot_code_buffer_phys;
>>   	void *reboot_code_buffer;
>>
>> -	BUG_ON(num_online_cpus() > 1);
>> +	if (num_online_cpus() > 1) {
>> +		if (in_crash_kexec)
>> +			pr_warn("kdump might fail because %d cpus are still online\n",
>> +					num_online_cpus());
>> +		else
>> +			BUG();
>> +	}
>
> Can you just rewrite the existing BUG_ON as:
>
>    BUG_ON(num_online_cpus() && !WARN_ON(in_crash_kexec));
>
> ?

Yes, but
      BUG_ON((num_online_cpus() > 1) && !WARN_ON(in_crash_kexec));

Thanks,
-Takahiro AKASHI

> Will
>

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 08/16] arm64/kexec: Add core kexec support
  2015-12-16  0:14       ` Geoff Levand
@ 2015-12-16  7:18         ` Pratyush Anand
  -1 siblings, 0 replies; 178+ messages in thread
From: Pratyush Anand @ 2015-12-16  7:18 UTC (permalink / raw)
  To: linux-arm-kernel

On 15/12/2015:04:14:30 PM, Geoff Levand wrote:
> On Tue, 2015-12-15 at 18:29 +0000, Will Deacon wrote:
> > > +> > 	> >  * Copy arm64_relocate_new_kernel to the reboot_code_buffer for use
> > > +> > 	> >  * after the kernel is shut down.
> > > +> > 	> >  */
> > > +> > 	> > memcpy(reboot_code_buffer, arm64_relocate_new_kernel,
> > > +> > 	> > 	> > arm64_relocate_new_kernel_size);
> > 
> > At which point does the I-cache get invalidated for this?
> 
> I'll add a call to flush_icache_range() for reboot_code_buffer.  I
> think that should do it.

We execute arm64_relocate_new_kernel() code with I-cache disabled. So, do we
really need to invalidate I-cache?

~Pratyush

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 08/16] arm64/kexec: Add core kexec support
@ 2015-12-16  7:18         ` Pratyush Anand
  0 siblings, 0 replies; 178+ messages in thread
From: Pratyush Anand @ 2015-12-16  7:18 UTC (permalink / raw)
  To: Geoff Levand
  Cc: Mark Rutland, marc.zyngier, kexec, Will Deacon, AKASHI Takahiro,
	christoffer.dall, Catalin Marinas, linux-arm-kernel

On 15/12/2015:04:14:30 PM, Geoff Levand wrote:
> On Tue, 2015-12-15 at 18:29 +0000, Will Deacon wrote:
> > > +> > 	> >  * Copy arm64_relocate_new_kernel to the reboot_code_buffer for use
> > > +> > 	> >  * after the kernel is shut down.
> > > +> > 	> >  */
> > > +> > 	> > memcpy(reboot_code_buffer, arm64_relocate_new_kernel,
> > > +> > 	> > 	> > arm64_relocate_new_kernel_size);
> > 
> > At which point does the I-cache get invalidated for this?
> 
> I'll add a call to flush_icache_range() for reboot_code_buffer.  I
> think that should do it.

We execute arm64_relocate_new_kernel() code with I-cache disabled. So, do we
really need to invalidate I-cache?

~Pratyush

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 11/16] arm64: kdump: reserve memory for crash dump kernel
  2015-12-16  5:19       ` AKASHI Takahiro
@ 2015-12-16  7:36         ` Pratyush Anand
  -1 siblings, 0 replies; 178+ messages in thread
From: Pratyush Anand @ 2015-12-16  7:36 UTC (permalink / raw)
  To: linux-arm-kernel

On 16/12/2015:02:19:26 PM, AKASHI Takahiro wrote:
> On 12/16/2015 02:29 AM, Will Deacon wrote:
> >#else
> >
> >static void __init reserve_crashkernel(void) {};
> 
> OK.

Then may be you do not need to protect calling of reserve_crashkernel() with
CONFIG_KEXEC_CORE as well.

~Pratyush

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 11/16] arm64: kdump: reserve memory for crash dump kernel
@ 2015-12-16  7:36         ` Pratyush Anand
  0 siblings, 0 replies; 178+ messages in thread
From: Pratyush Anand @ 2015-12-16  7:36 UTC (permalink / raw)
  To: AKASHI Takahiro
  Cc: Mark Rutland, Geoff Levand, Catalin Marinas, Will Deacon,
	marc.zyngier, linux-arm-kernel, kexec, christoffer.dall

On 16/12/2015:02:19:26 PM, AKASHI Takahiro wrote:
> On 12/16/2015 02:29 AM, Will Deacon wrote:
> >#else
> >
> >static void __init reserve_crashkernel(void) {};
> 
> OK.

Then may be you do not need to protect calling of reserve_crashkernel() with
CONFIG_KEXEC_CORE as well.

~Pratyush

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 08/16] arm64/kexec: Add core kexec support
  2015-12-16  7:18         ` Pratyush Anand
@ 2015-12-16  9:30           ` James Morse
  -1 siblings, 0 replies; 178+ messages in thread
From: James Morse @ 2015-12-16  9:30 UTC (permalink / raw)
  To: linux-arm-kernel

On 16/12/15 07:18, Pratyush Anand wrote:
> On 15/12/2015:04:14:30 PM, Geoff Levand wrote:
>> On Tue, 2015-12-15 at 18:29 +0000, Will Deacon wrote:
>>>> +> > 	> >  * Copy arm64_relocate_new_kernel to the reboot_code_buffer for use
>>>> +> > 	> >  * after the kernel is shut down.
>>>> +> > 	> >  */
>>>> +> > 	> > memcpy(reboot_code_buffer, arm64_relocate_new_kernel,
>>>> +> > 	> > 	> > arm64_relocate_new_kernel_size);
>>>
>>> At which point does the I-cache get invalidated for this?
>>
>> I'll add a call to flush_icache_range() for reboot_code_buffer.  I
>> think that should do it.
> 
> We execute arm64_relocate_new_kernel() code with I-cache disabled. So, do we
> really need to invalidate I-cache?

I got bitten by this, see Mark's earlier reply[0]:

Mark Rutland wrote:
> The SCTLR_ELx.I only affects the attributes that the I-cache uses to
> fetch with, not whether it is enabled (it cannot be disabled
> architecturally).



James


[0]
http://lists.infradead.org/pipermail/linux-arm-kernel/2015-October/382101.html

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 08/16] arm64/kexec: Add core kexec support
@ 2015-12-16  9:30           ` James Morse
  0 siblings, 0 replies; 178+ messages in thread
From: James Morse @ 2015-12-16  9:30 UTC (permalink / raw)
  To: Pratyush Anand
  Cc: Mark Rutland, Geoff Levand, Will Deacon, kexec, AKASHI Takahiro,
	linux-arm-kernel, Catalin Marinas, marc.zyngier,
	christoffer.dall

On 16/12/15 07:18, Pratyush Anand wrote:
> On 15/12/2015:04:14:30 PM, Geoff Levand wrote:
>> On Tue, 2015-12-15 at 18:29 +0000, Will Deacon wrote:
>>>> +> > 	> >  * Copy arm64_relocate_new_kernel to the reboot_code_buffer for use
>>>> +> > 	> >  * after the kernel is shut down.
>>>> +> > 	> >  */
>>>> +> > 	> > memcpy(reboot_code_buffer, arm64_relocate_new_kernel,
>>>> +> > 	> > 	> > arm64_relocate_new_kernel_size);
>>>
>>> At which point does the I-cache get invalidated for this?
>>
>> I'll add a call to flush_icache_range() for reboot_code_buffer.  I
>> think that should do it.
> 
> We execute arm64_relocate_new_kernel() code with I-cache disabled. So, do we
> really need to invalidate I-cache?

I got bitten by this, see Mark's earlier reply[0]:

Mark Rutland wrote:
> The SCTLR_ELx.I only affects the attributes that the I-cache uses to
> fetch with, not whether it is enabled (it cannot be disabled
> architecturally).



James


[0]
http://lists.infradead.org/pipermail/linux-arm-kernel/2015-October/382101.html

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

* [PATCH v12 08/16] arm64/kexec: Add core kexec support
  2015-12-16  9:30           ` James Morse
@ 2015-12-16 10:32             ` Pratyush Anand
  -1 siblings, 0 replies; 178+ messages in thread
From: Pratyush Anand @ 2015-12-16 10:32 UTC (permalink / raw)
  To: linux-arm-kernel

On 16/12/2015:09:30:34 AM, James Morse wrote:
> On 16/12/15 07:18, Pratyush Anand wrote:
> > We execute arm64_relocate_new_kernel() code with I-cache disabled. So, do we
> > really need to invalidate I-cache?
> 
> I got bitten by this, see Mark's earlier reply[0]:
> 
> Mark Rutland wrote:
> > The SCTLR_ELx.I only affects the attributes that the I-cache uses to
> > fetch with, not whether it is enabled (it cannot be disabled
> > architecturally).

Thanks James for pointing to it. I had missed that.

~Pratyush

^ permalink raw reply	[flat|nested] 178+ messages in thread

* Re: [PATCH v12 08/16] arm64/kexec: Add core kexec support
@ 2015-12-16 10:32             ` Pratyush Anand
  0 siblings, 0 replies; 178+ messages in thread
From: Pratyush Anand @ 2015-12-16 10:32 UTC (permalink / raw)
  To: James Morse
  Cc: Mark Rutland, Geoff Levand, Will Deacon, kexec, AKASHI Takahiro,
	linux-arm-kernel, Catalin Marinas, marc.zyngier,
	christoffer.dall

On 16/12/2015:09:30:34 AM, James Morse wrote:
> On 16/12/15 07:18, Pratyush Anand wrote:
> > We execute arm64_relocate_new_kernel() code with I-cache disabled. So, do we
> > really need to invalidate I-cache?
> 
> I got bitten by this, see Mark's earlier reply[0]:
> 
> Mark Rutland wrote:
> > The SCTLR_ELx.I only affects the attributes that the I-cache uses to
> > fetch with, not whether it is enabled (it cannot be disabled
> > architecturally).

Thanks James for pointing to it. I had missed that.

~Pratyush

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 178+ messages in thread

end of thread, other threads:[~2015-12-16 10:32 UTC | newest]

Thread overview: 178+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-11-24 22:25 [PATCH v12 00/16] arm64 kexec kernel patches v12 Geoff Levand
2015-11-24 22:25 ` Geoff Levand
2015-11-24 22:25 ` [PATCH v12 06/16] Revert "arm64: mm: remove unused cpu_set_idmap_tcr_t0sz function" Geoff Levand
2015-11-24 22:25   ` Geoff Levand
2015-11-24 22:25 ` [PATCH v12 04/16] arm64: kvm: allows kvm cpu hotplug Geoff Levand
2015-11-24 22:25   ` Geoff Levand
2015-11-27 13:54   ` Marc Zyngier
2015-11-27 13:54     ` Marc Zyngier
2015-12-02 22:40   ` Ashwin Chaugule
2015-12-02 22:40     ` Ashwin Chaugule
2015-12-03 13:55     ` Ashwin Chaugule
2015-12-03 13:55       ` Ashwin Chaugule
2015-12-03 13:58     ` Marc Zyngier
2015-12-03 13:58       ` Marc Zyngier
2015-12-10 18:31       ` Geoff Levand
2015-12-10 18:31         ` Geoff Levand
2015-12-11 16:31         ` Will Deacon
2015-12-11 16:31           ` Will Deacon
2015-12-15  8:48           ` AKASHI Takahiro
2015-12-15  8:48             ` AKASHI Takahiro
2015-12-10 18:44       ` Shi, Yang
2015-12-10 18:44         ` Shi, Yang
2015-12-11  8:09         ` AKASHI Takahiro
2015-12-11  8:09           ` AKASHI Takahiro
2015-12-14 18:00           ` Shi, Yang
2015-12-14 18:00             ` Shi, Yang
2015-12-11  8:06       ` AKASHI Takahiro
2015-12-11  8:06         ` AKASHI Takahiro
2015-12-11 13:00         ` Shanker Donthineni
2015-12-11 13:00           ` Shanker Donthineni
2015-12-11 16:28         ` Marc Zyngier
2015-12-11 16:28           ` Marc Zyngier
2015-12-11 18:00           ` Shanker Donthineni
2015-12-11 18:00             ` Shanker Donthineni
2015-12-11 18:11             ` Marc Zyngier
2015-12-11 18:11               ` Marc Zyngier
2015-12-11 19:11               ` Shanker Donthineni
2015-12-11 19:11                 ` Shanker Donthineni
2015-12-11 20:13           ` Ashwin Chaugule
2015-12-11 20:13             ` Ashwin Chaugule
2015-12-14  7:33           ` AKASHI Takahiro
2015-12-14  7:33             ` AKASHI Takahiro
2015-12-14 17:33             ` Marc Zyngier
2015-12-14 17:33               ` Marc Zyngier
2015-12-15  7:51               ` AKASHI Takahiro
2015-12-15  7:51                 ` AKASHI Takahiro
2015-12-15  8:45                 ` Marc Zyngier
2015-12-15  8:45                   ` Marc Zyngier
2015-12-15  9:51                   ` AKASHI Takahiro
2015-12-15  9:51                     ` AKASHI Takahiro
2015-12-15 10:13                     ` Marc Zyngier
2015-12-15 10:13                       ` Marc Zyngier
2015-11-24 22:25 ` [PATCH v12 02/16] arm64: Convert hcalls to use HVC immediate value Geoff Levand
2015-11-24 22:25   ` Geoff Levand
2015-11-24 22:25 ` [PATCH v12 03/16] arm64: Add new hcall HVC_CALL_FUNC Geoff Levand
2015-11-24 22:25   ` Geoff Levand
2015-11-24 22:25 ` [PATCH v12 01/16] arm64: Fold proc-macros.S into assembler.h Geoff Levand
2015-11-24 22:25   ` Geoff Levand
2015-11-24 22:25 ` [PATCH v12 05/16] arm64: Add back cpu_reset routines Geoff Levand
2015-11-24 22:25   ` Geoff Levand
2015-11-27 14:19   ` Marc Zyngier
2015-11-27 14:19     ` Marc Zyngier
2015-11-30  5:28     ` Pratyush Anand
2015-11-30  5:28       ` Pratyush Anand
2015-11-30 10:40       ` Marc Zyngier
2015-11-30 10:40         ` Marc Zyngier
2015-12-02 22:57         ` Geoff Levand
2015-12-02 22:57           ` Geoff Levand
2015-12-03  9:32           ` Will Deacon
2015-12-03  9:32             ` Will Deacon
2015-12-10  0:49             ` Geoff Levand
2015-12-10  0:49               ` Geoff Levand
2015-12-10 10:17               ` Will Deacon
2015-12-10 10:17                 ` Will Deacon
2015-11-30 20:03     ` Geoff Levand
2015-11-30 20:03       ` Geoff Levand
2015-12-01  9:38       ` Marc Zyngier
2015-12-01  9:38         ` Marc Zyngier
2015-11-24 22:25 ` [PATCH v12 07/16] Revert "arm64: remove dead code" Geoff Levand
2015-11-24 22:25   ` Geoff Levand
2015-11-24 22:25 ` [PATCH v12 16/16] arm64: kdump: relax BUG_ON() if more than one cpus are still active Geoff Levand
2015-11-24 22:25   ` Geoff Levand
2015-12-15 17:05   ` Will Deacon
2015-12-15 17:05     ` Will Deacon
2015-12-16  5:51     ` AKASHI Takahiro
2015-12-16  5:51       ` AKASHI Takahiro
2015-11-24 22:25 ` [PATCH v12 11/16] arm64: kdump: reserve memory for crash dump kernel Geoff Levand
2015-11-24 22:25   ` Geoff Levand
2015-12-15 17:29   ` Will Deacon
2015-12-15 17:29     ` Will Deacon
2015-12-16  5:19     ` AKASHI Takahiro
2015-12-16  5:19       ` AKASHI Takahiro
2015-12-16  7:36       ` Pratyush Anand
2015-12-16  7:36         ` Pratyush Anand
2015-11-24 22:25 ` [PATCH v12 15/16] arm64: kdump: enable kdump in the arm64 defconfig Geoff Levand
2015-11-24 22:25   ` Geoff Levand
2015-11-24 22:25 ` [PATCH v12 09/16] arm64/kexec: Add pr_devel output Geoff Levand
2015-11-24 22:25   ` Geoff Levand
2015-12-15 17:15   ` Will Deacon
2015-12-15 17:15     ` Will Deacon
2015-12-16  0:45     ` Geoff Levand
2015-12-16  0:45       ` Geoff Levand
2015-12-16  0:46   ` [PATCH v12.4] arm64/kexec: Add pr_debug output Geoff Levand
2015-12-16  0:46     ` Geoff Levand
2015-11-24 22:25 ` [PATCH v12 12/16] arm64: kdump: implement machine_crash_shutdown() Geoff Levand
2015-11-24 22:25   ` Geoff Levand
2015-11-27 14:39   ` Marc Zyngier
2015-11-27 14:39     ` Marc Zyngier
2015-12-10 11:34     ` AKASHI Takahiro
2015-12-10 11:34       ` AKASHI Takahiro
2015-12-10 11:44       ` Marc Zyngier
2015-12-10 11:44         ` Marc Zyngier
2015-12-10 12:55         ` AKASHI Takahiro
2015-12-10 12:55           ` AKASHI Takahiro
2015-12-10 13:43           ` Marc Zyngier
2015-12-10 13:43             ` Marc Zyngier
2015-12-03  4:15   ` Pratyush Anand
2015-12-03  4:15     ` Pratyush Anand
2015-12-10 11:42     ` AKASHI Takahiro
2015-12-10 11:42       ` AKASHI Takahiro
2015-12-10 11:50       ` Pratyush Anand
2015-12-10 11:50         ` Pratyush Anand
2015-11-24 22:25 ` [PATCH v12 14/16] arm64: kdump: update a kernel doc Geoff Levand
2015-11-24 22:25   ` Geoff Levand
2015-12-15 17:17   ` Will Deacon
2015-12-15 17:17     ` Will Deacon
2015-12-16  5:48     ` AKASHI Takahiro
2015-12-16  5:48       ` AKASHI Takahiro
2015-11-24 22:25 ` [PATCH v12 10/16] arm64/kexec: Enable kexec in the arm64 defconfig Geoff Levand
2015-11-24 22:25   ` Geoff Levand
2015-11-24 22:25 ` [PATCH v12 13/16] arm64: kdump: add kdump support Geoff Levand
2015-11-24 22:25   ` Geoff Levand
2015-12-15 17:45   ` Will Deacon
2015-12-15 17:45     ` Will Deacon
2015-12-16  5:41     ` AKASHI Takahiro
2015-12-16  5:41       ` AKASHI Takahiro
2015-11-24 22:25 ` [PATCH v12 08/16] arm64/kexec: Add core kexec support Geoff Levand
2015-11-24 22:25   ` Geoff Levand
2015-11-27 13:13   ` Pratyush Anand
2015-11-27 13:13     ` Pratyush Anand
2015-11-30 18:51     ` Geoff Levand
2015-11-30 18:51       ` Geoff Levand
2015-12-01  2:16       ` Pratyush Anand
2015-12-01  2:16         ` Pratyush Anand
2015-12-01 18:32         ` Azriel Samson
2015-12-01 18:32           ` Azriel Samson
2015-12-02 22:49           ` Geoff Levand
2015-12-02 22:49             ` Geoff Levand
2015-12-03  4:37             ` Azriel Samson
2015-12-03  4:37               ` Azriel Samson
2015-12-03 19:56               ` Geoff Levand
2015-12-03 19:56                 ` Geoff Levand
2015-12-04  0:39                 ` Azriel Samson
2015-12-04  0:39                   ` Azriel Samson
2015-12-04  3:54                   ` Pratyush Anand
2015-12-04  3:54                     ` Pratyush Anand
2015-12-07 18:47                     ` Geoff Levand
2015-12-07 18:47                       ` Geoff Levand
2015-12-03  6:09             ` Pratyush Anand
2015-12-03  6:09               ` Pratyush Anand
2015-12-01 19:03         ` Mark Rutland
2015-12-01 19:03           ` Mark Rutland
2015-12-02 21:08           ` Geoff Levand
2015-12-02 21:08             ` Geoff Levand
2015-12-03 16:06             ` Mark Rutland
2015-12-03 16:06               ` Mark Rutland
2015-12-15 18:29   ` Will Deacon
2015-12-15 18:29     ` Will Deacon
2015-12-16  0:14     ` Geoff Levand
2015-12-16  0:14       ` Geoff Levand
2015-12-16  7:18       ` Pratyush Anand
2015-12-16  7:18         ` Pratyush Anand
2015-12-16  9:30         ` James Morse
2015-12-16  9:30           ` James Morse
2015-12-16 10:32           ` Pratyush Anand
2015-12-16 10:32             ` Pratyush Anand
2015-12-16  0:14   ` [PATCH v12.4] " Geoff Levand
2015-12-16  0:14     ` Geoff Levand

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.