linuxppc-dev.lists.ozlabs.org archive mirror
* [RFC PATCH 00/17] powerpc/fsl-book3e-64: kexec/kdump support
@ 2015-07-18 20:08 Scott Wood
  2015-07-18 20:08 ` [RFC PATCH 01/17] powerpc/85xx: Load all early TLB entries at once Scott Wood
                   ` (16 more replies)
  0 siblings, 17 replies; 25+ messages in thread
From: Scott Wood @ 2015-07-18 20:08 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Tiejun Chen, kexec, Scott Wood

This patchset adds support for kexec and kdump to e5500 and e6500 based
systems running 64-bit kernels.  It depends on
http://patchwork.ozlabs.org/patch/496952/ ("powerpc/fsl-booke-64: Allow
booting from the secondary thread") and the kexec-tools patch
http://lists.infradead.org/pipermail/kexec/2015-July/014048.html ("ppc64:
Add a flag to tell the kernel it's booting from kexec").

Scott Wood (11):
  powerpc/85xx: Load all early TLB entries at once
  powerpc/85xx: Don't use generic timebase sync on 64-bit
  crypto: caam: Blacklist CAAM when kexec is enabled
  powerpc/fsl-corenet: Disable coreint if kexec is enabled
  powerpc/fsl-booke-64: Don't limit ppc64_rma_size to one TLB entry
  powerpc/e6500: kexec: Handle hardware threads
  powerpc/book3e/kdump: Enable crash_kexec_wait_realmode
  powerpc/book3e-64: Don't limit paca to 256 MiB
  powerpc/book3e-64/kexec: Enable SMP release
  powerpc/booke: Only use VIRT_PHYS_OFFSET on booke32
  powerpc/book3e-64/kexec: Set "r4 = 0" when entering spinloop

Tiejun Chen (6):
  powerpc/85xx: Implement 64-bit kexec support
  powerpc/book3e-64: rename interrupt_end_book3e with __end_interrupts
  powerpc/booke64: Fix args to copy_and_flush
  powerpc/book3e: support CONFIG_RELOCATABLE
  powerpc/book3e-64/kexec: create an identity TLB mapping
  powerpc/book3e-64: Enable kexec

 arch/powerpc/Kconfig                          |  2 +-
 arch/powerpc/include/asm/exception-64e.h      |  4 +-
 arch/powerpc/include/asm/page.h               |  7 +--
 arch/powerpc/include/asm/smp.h                |  1 +
 arch/powerpc/kernel/crash.c                   |  6 +--
 arch/powerpc/kernel/exceptions-64e.S          | 17 +++---
 arch/powerpc/kernel/head_64.S                 | 58 ++++++++++++++++++--
 arch/powerpc/kernel/machine_kexec_64.c        | 19 +++++++
 arch/powerpc/kernel/misc_64.S                 | 62 ++++++++++++++++++++-
 arch/powerpc/kernel/paca.c                    |  6 ++-
 arch/powerpc/kernel/setup_64.c                | 22 +++++++-
 arch/powerpc/mm/fsl_booke_mmu.c               | 35 ++++++++----
 arch/powerpc/mm/mmu_decl.h                    |  4 +-
 arch/powerpc/mm/tlb_nohash.c                  | 41 +++++++++++---
 arch/powerpc/mm/tlb_nohash_low.S              | 63 ++++++++++++++++++++++
 arch/powerpc/platforms/85xx/corenet_generic.c |  4 ++
 arch/powerpc/platforms/85xx/smp.c             | 77 +++++++++++++++++++++++++--
 drivers/crypto/caam/Kconfig                   |  2 +-
 18 files changed, 385 insertions(+), 45 deletions(-)

-- 
2.1.4


* [RFC PATCH 01/17] powerpc/85xx: Load all early TLB entries at once
  2015-07-18 20:08 [RFC PATCH 00/17] powerpc/fsl-book3e-64: kexec/kdump support Scott Wood
@ 2015-07-18 20:08 ` Scott Wood
  2015-07-18 20:08 ` [RFC PATCH 02/17] powerpc/85xx: Don't use generic timebase sync on 64-bit Scott Wood
                   ` (15 subsequent siblings)
  16 siblings, 0 replies; 25+ messages in thread
From: Scott Wood @ 2015-07-18 20:08 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Tiejun Chen, kexec, Scott Wood

Use an AS=1 trampoline TLB entry to allow all normal TLB1 entries to
be loaded at once.  This avoids the need to keep the translation that
code is executing from in the same TLB entry in the final TLB
configuration as during early boot, which in turn is helpful for
relocatable kernels (e.g. kdump) where the kernel is not running from
what would be the first TLB entry.

On e6500, we limit map_mem_in_cams() to the primary hwthread of a
core (the boot cpu is always considered primary, as a kdump kernel
can be entered on any cpu).  Each TLB only needs to be set up once,
and when we do, we don't want another thread to be running when we
create a temporary trampoline TLB1 entry.
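
The per-core gating described above can be sketched in C.  This is a
toy model with hypothetical helpers, assuming two hardware threads per
core numbered core*2 + thread; it is not the kernel's real topology
API, just an illustration of which cpus end up calling
map_mem_in_cams():

```c
#include <assert.h>
#include <stdbool.h>

/* toy model: cpu ids are core*2 + thread, two threads per core */
static int first_thread_sibling(int cpu)
{
	return cpu & ~1;
}

/*
 * Only one thread per core performs the mapping: normally the primary
 * (even-numbered) thread, except that the boot cpu is always treated
 * as primary, since a kdump kernel may be entered on any thread.  The
 * boot cpu's primary sibling must then skip the mapping.
 */
static bool should_map(int cpu, int boot_cpu)
{
	if (cpu == boot_cpu)
		return true;
	return cpu == first_thread_sibling(cpu) &&
	       first_thread_sibling(cpu) != first_thread_sibling(boot_cpu);
}
```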

Signed-off-by: Scott Wood <scottwood@freescale.com>
---
 arch/powerpc/kernel/setup_64.c   |  8 +++++
 arch/powerpc/mm/fsl_booke_mmu.c  | 15 ++++++++--
 arch/powerpc/mm/mmu_decl.h       |  1 +
 arch/powerpc/mm/tlb_nohash.c     | 19 +++++++++++-
 arch/powerpc/mm/tlb_nohash_low.S | 63 ++++++++++++++++++++++++++++++++++++++++
 5 files changed, 102 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
index bdcbb71..505ec2c 100644
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -108,6 +108,14 @@ static void setup_tlb_core_data(void)
 	for_each_possible_cpu(cpu) {
 		int first = cpu_first_thread_sibling(cpu);
 
+		/*
+		 * If we boot via kdump on a non-primary thread,
+		 * make sure we point at the thread that actually
+		 * set up this TLB.
+		 */
+		if (cpu_first_thread_sibling(boot_cpuid) == first)
+			first = boot_cpuid;
+
 		paca[cpu].tcd_ptr = &paca[first].tcd;
 
 		/*
diff --git a/arch/powerpc/mm/fsl_booke_mmu.c b/arch/powerpc/mm/fsl_booke_mmu.c
index 354ba3c..36d3c55 100644
--- a/arch/powerpc/mm/fsl_booke_mmu.c
+++ b/arch/powerpc/mm/fsl_booke_mmu.c
@@ -105,8 +105,9 @@ unsigned long p_mapped_by_tlbcam(phys_addr_t pa)
  * an unsigned long (for example, 32-bit implementations cannot support a 4GB
  * size).
  */
-static void settlbcam(int index, unsigned long virt, phys_addr_t phys,
-		unsigned long size, unsigned long flags, unsigned int pid)
+static void preptlbcam(int index, unsigned long virt, phys_addr_t phys,
+		       unsigned long size, unsigned long flags,
+		       unsigned int pid)
 {
 	unsigned int tsize;
 
@@ -141,7 +142,13 @@ static void settlbcam(int index, unsigned long virt, phys_addr_t phys,
 	tlbcam_addrs[index].start = virt;
 	tlbcam_addrs[index].limit = virt + size - 1;
 	tlbcam_addrs[index].phys = phys;
+}
 
+void settlbcam(int index, unsigned long virt, phys_addr_t phys,
+	       unsigned long size, unsigned long flags,
+	       unsigned int pid)
+{
+	preptlbcam(index, virt, phys, size, flags, pid);
 	loadcam_entry(index);
 }
 
@@ -181,13 +188,15 @@ static unsigned long map_mem_in_cams_addr(phys_addr_t phys, unsigned long virt,
 		unsigned long cam_sz;
 
 		cam_sz = calc_cam_sz(ram, virt, phys);
-		settlbcam(i, virt, phys, cam_sz, pgprot_val(PAGE_KERNEL_X), 0);
+		preptlbcam(i, virt, phys, cam_sz, pgprot_val(PAGE_KERNEL_X), 0);
 
 		ram -= cam_sz;
 		amount_mapped += cam_sz;
 		virt += cam_sz;
 		phys += cam_sz;
 	}
+
+	loadcam_multi(0, i, max_cam_idx);
 	tlbcam_index = i;
 
 #ifdef CONFIG_PPC64
diff --git a/arch/powerpc/mm/mmu_decl.h b/arch/powerpc/mm/mmu_decl.h
index 085b66b..27c3a2d 100644
--- a/arch/powerpc/mm/mmu_decl.h
+++ b/arch/powerpc/mm/mmu_decl.h
@@ -152,6 +152,7 @@ extern int switch_to_as1(void);
 extern void restore_to_as0(int esel, int offset, void *dt_ptr, int bootcpu);
 #endif
 extern void loadcam_entry(unsigned int index);
+extern void loadcam_multi(int first_idx, int num, int tmp_idx);
 
 struct tlbcam {
 	u32	MAS0;
diff --git a/arch/powerpc/mm/tlb_nohash.c b/arch/powerpc/mm/tlb_nohash.c
index 723a099..a7381fb 100644
--- a/arch/powerpc/mm/tlb_nohash.c
+++ b/arch/powerpc/mm/tlb_nohash.c
@@ -42,6 +42,7 @@
 #include <asm/tlbflush.h>
 #include <asm/tlb.h>
 #include <asm/code-patching.h>
+#include <asm/cputhreads.h>
 #include <asm/hugetlb.h>
 #include <asm/paca.h>
 
@@ -628,10 +629,26 @@ static void early_init_this_mmu(void)
 #ifdef CONFIG_PPC_FSL_BOOK3E
 	if (mmu_has_feature(MMU_FTR_TYPE_FSL_E)) {
 		unsigned int num_cams;
+		int __maybe_unused cpu = smp_processor_id();
+		bool map = true;
 
 		/* use a quarter of the TLBCAM for bolted linear map */
 		num_cams = (mfspr(SPRN_TLB1CFG) & TLBnCFG_N_ENTRY) / 4;
-		linear_map_top = map_mem_in_cams(linear_map_top, num_cams);
+
+		/*
+		 * Only do the mapping once per core, or else the
+		 * transient mapping would cause problems.
+		 */
+#ifdef CONFIG_SMP
+		if (cpu != boot_cpuid &&
+		    (cpu != cpu_first_thread_sibling(cpu) ||
+		     cpu == cpu_first_thread_sibling(boot_cpuid)))
+			map = false;
+#endif
+
+		if (map)
+			linear_map_top = map_mem_in_cams(linear_map_top,
+							 num_cams);
 	}
 #endif
 
diff --git a/arch/powerpc/mm/tlb_nohash_low.S b/arch/powerpc/mm/tlb_nohash_low.S
index 43ff3c7..68c4775 100644
--- a/arch/powerpc/mm/tlb_nohash_low.S
+++ b/arch/powerpc/mm/tlb_nohash_low.S
@@ -400,6 +400,7 @@ _GLOBAL(set_context)
  * extern void loadcam_entry(unsigned int index)
  *
  * Load TLBCAM[index] entry in to the L2 CAM MMU
+ * Must preserve r7, r8, r9, and r10
  */
 _GLOBAL(loadcam_entry)
 	mflr	r5
@@ -423,4 +424,66 @@ END_MMU_FTR_SECTION_IFSET(MMU_FTR_BIG_PHYS)
 	tlbwe
 	isync
 	blr
+
+/*
+ * Load multiple TLB entries at once, using an alternate-space
+ * trampoline so that we don't have to care about whether the same
+ * TLB entry maps us before and after.
+ *
+ * r3 = first entry to write
+ * r4 = number of entries to write
+ * r5 = temporary tlb entry
+ */
+_GLOBAL(loadcam_multi)
+	mflr	r8
+
+	/*
+	 * Set up temporary TLB entry that is the same as what we're
+	 * running from, but in AS=1.
+	 */
+	bl	1f
+1:	mflr	r6
+	tlbsx	0,r8
+	mfspr	r6,SPRN_MAS1
+	ori	r6,r6,MAS1_TS
+	mtspr	SPRN_MAS1,r6
+	mfspr	r6,SPRN_MAS0
+	rlwimi	r6,r5,MAS0_ESEL_SHIFT,MAS0_ESEL_MASK
+	mr	r7,r5
+	mtspr	SPRN_MAS0,r6
+	isync
+	tlbwe
+	isync
+
+	/* Switch to AS=1 */
+	mfmsr	r6
+	ori	r6,r6,MSR_IS|MSR_DS
+	mtmsr	r6
+	isync
+
+	mr	r9,r3
+	add	r10,r3,r4
+2:	bl	loadcam_entry
+	addi	r9,r9,1
+	cmpw	r9,r10
+	mr	r3,r9
+	blt	2b
+
+	/* Return to AS=0 and clear the temporary entry */
+	mfmsr	r6
+	rlwinm.	r6,r6,0,~(MSR_IS|MSR_DS)
+	mtmsr	r6
+	isync
+
+	li	r6,0
+	mtspr	SPRN_MAS1,r6
+	rlwinm	r6,r7,MAS0_ESEL_SHIFT,MAS0_ESEL_MASK
+	oris	r6,r6,MAS0_TLBSEL(1)@h
+	mtspr	SPRN_MAS0,r6
+	isync
+	tlbwe
+	isync
+
+	mtlr	r8
+	blr
 #endif
-- 
2.1.4


* [RFC PATCH 02/17] powerpc/85xx: Don't use generic timebase sync on 64-bit
  2015-07-18 20:08 [RFC PATCH 00/17] powerpc/fsl-book3e-64: kexec/kdump support Scott Wood
  2015-07-18 20:08 ` [RFC PATCH 01/17] powerpc/85xx: Load all early TLB entries at once Scott Wood
@ 2015-07-18 20:08 ` Scott Wood
  2015-07-18 20:08 ` [RFC PATCH 03/17] crypto: caam: Blacklist CAAM when kexec is enabled Scott Wood
                   ` (14 subsequent siblings)
  16 siblings, 0 replies; 25+ messages in thread
From: Scott Wood @ 2015-07-18 20:08 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Tiejun Chen, kexec, Scott Wood

85xx currently uses the generic timebase sync mechanism when
CONFIG_KEXEC is enabled, because 32-bit 85xx kexec support does a hard
reset of each core.  64-bit 85xx kexec does not do this, so we neither
need nor want this (nor is the generic timebase sync code built on
ppc64).

FWIW, I don't like the fact that the hard reset is done on 32-bit
kexec, and I especially don't like the timebase sync being triggered
only on the presence of CONFIG_KEXEC rather than actually booting in
that environment, but that's beyond the scope of this patch...

Signed-off-by: Scott Wood <scottwood@freescale.com>
---
 arch/powerpc/platforms/85xx/smp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/85xx/smp.c b/arch/powerpc/platforms/85xx/smp.c
index c2ded03..a0763be 100644
--- a/arch/powerpc/platforms/85xx/smp.c
+++ b/arch/powerpc/platforms/85xx/smp.c
@@ -344,7 +344,7 @@ struct smp_ops_t smp_85xx_ops = {
 	.cpu_disable	= generic_cpu_disable,
 	.cpu_die	= generic_cpu_die,
 #endif
-#ifdef CONFIG_KEXEC
+#if defined(CONFIG_KEXEC) && !defined(CONFIG_PPC64)
 	.give_timebase	= smp_generic_give_timebase,
 	.take_timebase	= smp_generic_take_timebase,
 #endif
-- 
2.1.4


* [RFC PATCH 03/17] crypto: caam: Blacklist CAAM when kexec is enabled
  2015-07-18 20:08 [RFC PATCH 00/17] powerpc/fsl-book3e-64: kexec/kdump support Scott Wood
  2015-07-18 20:08 ` [RFC PATCH 01/17] powerpc/85xx: Load all early TLB entries at once Scott Wood
  2015-07-18 20:08 ` [RFC PATCH 02/17] powerpc/85xx: Don't use generic timebase sync on 64-bit Scott Wood
@ 2015-07-18 20:08 ` Scott Wood
  2015-07-18 20:08 ` [RFC PATCH 04/17] powerpc/fsl-corenet: Disable coreint if " Scott Wood
                   ` (13 subsequent siblings)
  16 siblings, 0 replies; 25+ messages in thread
From: Scott Wood @ 2015-07-18 20:08 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: Tiejun Chen, kexec, Scott Wood, Cristian Stoica, Horia Geanta,
	Herbert Xu, linux-crypto

This driver hangs the kernel on boot when loaded via kexec.

To make this driver kexec-safe, add a suspend or freeze hook, and when
probing, don't make any assumptions about the existing hardware state
(e.g. don't request_irq before quiescing the device).

Signed-off-by: Scott Wood <scottwood@freescale.com>
Cc: Cristian Stoica <cristian.stoica@freescale.com>
Cc: Horia Geanta <horia.geanta@freescale.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: linux-crypto@vger.kernel.org
---
 drivers/crypto/caam/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/crypto/caam/Kconfig b/drivers/crypto/caam/Kconfig
index e286e28..310e9e0 100644
--- a/drivers/crypto/caam/Kconfig
+++ b/drivers/crypto/caam/Kconfig
@@ -1,6 +1,6 @@
 config CRYPTO_DEV_FSL_CAAM
 	tristate "Freescale CAAM-Multicore driver backend"
-	depends on FSL_SOC
+	depends on FSL_SOC && !KEXEC
 	help
 	  Enables the driver module for Freescale's Cryptographic Accelerator
 	  and Assurance Module (CAAM), also known as the SEC version 4 (SEC4).
-- 
2.1.4


* [RFC PATCH 04/17] powerpc/fsl-corenet: Disable coreint if kexec is enabled
  2015-07-18 20:08 [RFC PATCH 00/17] powerpc/fsl-book3e-64: kexec/kdump support Scott Wood
                   ` (2 preceding siblings ...)
  2015-07-18 20:08 ` [RFC PATCH 03/17] crypto: caam: Blacklist CAAM when kexec is enabled Scott Wood
@ 2015-07-18 20:08 ` Scott Wood
  2015-07-18 20:08 ` [RFC PATCH 05/17] powerpc/fsl-booke-64: Don't limit ppc64_rma_size to one TLB entry Scott Wood
                   ` (12 subsequent siblings)
  16 siblings, 0 replies; 25+ messages in thread
From: Scott Wood @ 2015-07-18 20:08 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Tiejun Chen, kexec, Scott Wood

Problems have been observed in coreint (EPR) mode if interrupts are
left pending (due to the lack of device quiescence with kdump) after
the MPIC has tried to deliver one to a CPU but could not because
MSR[EE] was clear -- interrupts no longer get reliably delivered in
the new kernel.  I tried various ways of fixing it up inside the
crash kernel itself, and none worked (including resetting the entire
mpic).  Masking all interrupts and issuing EOIs in the crashing
kernel did help a lot of the time, but the behavior was not
consistent.

Thus, stick to standard IACK mode when kdump is a possibility.

Signed-off-by: Scott Wood <scottwood@freescale.com>
---
Supposedly there are similar problems with certain low power states --
the SDK disables coreint when CPU hotplug is enabled -- so disabling it
for kexec as well doesn't seem like a big deal.
---
 arch/powerpc/platforms/85xx/corenet_generic.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/powerpc/platforms/85xx/corenet_generic.c b/arch/powerpc/platforms/85xx/corenet_generic.c
index bd839dc..cddc9a2 100644
--- a/arch/powerpc/platforms/85xx/corenet_generic.c
+++ b/arch/powerpc/platforms/85xx/corenet_generic.c
@@ -212,7 +212,11 @@ define_machine(corenet_generic) {
 	.pcibios_fixup_bus	= fsl_pcibios_fixup_bus,
 	.pcibios_fixup_phb      = fsl_pcibios_fixup_phb,
 #endif
+#ifdef CONFIG_KEXEC
+	.get_irq		= mpic_get_irq,
+#else
 	.get_irq		= mpic_get_coreint_irq,
+#endif
 	.restart		= fsl_rstcr_restart,
 	.calibrate_decr		= generic_calibrate_decr,
 	.progress		= udbg_progress,
-- 
2.1.4


* [RFC PATCH 05/17] powerpc/fsl-booke-64: Don't limit ppc64_rma_size to one TLB entry
  2015-07-18 20:08 [RFC PATCH 00/17] powerpc/fsl-book3e-64: kexec/kdump support Scott Wood
                   ` (3 preceding siblings ...)
  2015-07-18 20:08 ` [RFC PATCH 04/17] powerpc/fsl-corenet: Disable coreint if " Scott Wood
@ 2015-07-18 20:08 ` Scott Wood
  2015-07-18 20:08 ` [RFC PATCH 06/17] powerpc/85xx: Implement 64-bit kexec support Scott Wood
                   ` (11 subsequent siblings)
  16 siblings, 0 replies; 25+ messages in thread
From: Scott Wood @ 2015-07-18 20:08 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Tiejun Chen, kexec, Scott Wood

This is required for kdump to work when loaded at an address that
does not fall within the first TLB entry -- which can easily happen,
because while the lower limit is enforced via reserved memory (which
doesn't affect how much is mapped), the upper limit is enforced via a
different mechanism that does.  Thus, more TLB entries are needed than
would normally be used, as the total memory to be mapped might not be a
power of two.
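
The power-of-two point can be illustrated with a greedy covering
sketch.  This is a simplification of what calc_cam_sz() does: it
assumes each bolted entry must be a naturally aligned power of two,
ignores the hardware's maximum entry size, and assumes a 64-bit
unsigned long:

```c
#include <assert.h>

/* largest naturally aligned power-of-two entry usable at addr, <= ram */
static unsigned long cam_sz(unsigned long ram, unsigned long addr)
{
	/* largest power of two <= ram (assumes 64-bit unsigned long) */
	unsigned long sz = 1UL << (63 - __builtin_clzl(ram));

	while (addr & (sz - 1))	/* entry must be aligned to its own size */
		sz >>= 1;
	return sz;
}

/* how many entries does it take to map 'ram' bytes starting at 0? */
static int entries_needed(unsigned long ram)
{
	unsigned long addr = 0;
	int n = 0;

	while (ram) {
		unsigned long sz = cam_sz(ram, addr);

		ram -= sz;
		addr += sz;
		n++;
	}
	return n;
}
```

A non-power-of-two total such as 1.5 GiB decomposes into 1G + 512M,
so it needs more entries than a single power-of-two block would.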

Signed-off-by: Scott Wood <scottwood@freescale.com>
---
 arch/powerpc/mm/fsl_booke_mmu.c | 22 +++++++++++++++-------
 arch/powerpc/mm/mmu_decl.h      |  3 ++-
 arch/powerpc/mm/tlb_nohash.c    | 24 +++++++++++++++++-------
 3 files changed, 34 insertions(+), 15 deletions(-)

diff --git a/arch/powerpc/mm/fsl_booke_mmu.c b/arch/powerpc/mm/fsl_booke_mmu.c
index 36d3c55..5eef7d7 100644
--- a/arch/powerpc/mm/fsl_booke_mmu.c
+++ b/arch/powerpc/mm/fsl_booke_mmu.c
@@ -178,7 +178,8 @@ unsigned long calc_cam_sz(unsigned long ram, unsigned long virt,
 }
 
 static unsigned long map_mem_in_cams_addr(phys_addr_t phys, unsigned long virt,
-					unsigned long ram, int max_cam_idx)
+					unsigned long ram, int max_cam_idx,
+					bool dryrun)
 {
 	int i;
 	unsigned long amount_mapped = 0;
@@ -188,7 +189,9 @@ static unsigned long map_mem_in_cams_addr(phys_addr_t phys, unsigned long virt,
 		unsigned long cam_sz;
 
 		cam_sz = calc_cam_sz(ram, virt, phys);
-		preptlbcam(i, virt, phys, cam_sz, pgprot_val(PAGE_KERNEL_X), 0);
+		if (!dryrun)
+			preptlbcam(i, virt, phys, cam_sz,
+				   pgprot_val(PAGE_KERNEL_X), 0);
 
 		ram -= cam_sz;
 		amount_mapped += cam_sz;
@@ -196,6 +199,9 @@ static unsigned long map_mem_in_cams_addr(phys_addr_t phys, unsigned long virt,
 		phys += cam_sz;
 	}
 
+	if (dryrun)
+		return amount_mapped;
+
 	loadcam_multi(0, i, max_cam_idx);
 	tlbcam_index = i;
 
@@ -208,12 +214,12 @@ static unsigned long map_mem_in_cams_addr(phys_addr_t phys, unsigned long virt,
 	return amount_mapped;
 }
 
-unsigned long map_mem_in_cams(unsigned long ram, int max_cam_idx)
+unsigned long map_mem_in_cams(unsigned long ram, int max_cam_idx, bool dryrun)
 {
 	unsigned long virt = PAGE_OFFSET;
 	phys_addr_t phys = memstart_addr;
 
-	return map_mem_in_cams_addr(phys, virt, ram, max_cam_idx);
+	return map_mem_in_cams_addr(phys, virt, ram, max_cam_idx, dryrun);
 }
 
 #ifdef CONFIG_PPC32
@@ -244,7 +250,7 @@ void __init adjust_total_lowmem(void)
 	ram = min((phys_addr_t)__max_low_memory, (phys_addr_t)total_lowmem);
 
 	i = switch_to_as1();
-	__max_low_memory = map_mem_in_cams(ram, CONFIG_LOWMEM_CAM_NUM);
+	__max_low_memory = map_mem_in_cams(ram, CONFIG_LOWMEM_CAM_NUM, false);
 	restore_to_as0(i, 0, 0, 1);
 
 	pr_info("Memory CAM mapping: ");
@@ -312,10 +318,12 @@ notrace void __init relocate_init(u64 dt_ptr, phys_addr_t start)
 		n = switch_to_as1();
 		/* map a 64M area for the second relocation */
 		if (memstart_addr > start)
-			map_mem_in_cams(0x4000000, CONFIG_LOWMEM_CAM_NUM);
+			map_mem_in_cams(0x4000000, CONFIG_LOWMEM_CAM_NUM,
+					false);
 		else
 			map_mem_in_cams_addr(start, PAGE_OFFSET + offset,
-					0x4000000, CONFIG_LOWMEM_CAM_NUM);
+					0x4000000, CONFIG_LOWMEM_CAM_NUM,
+					false);
 		restore_to_as0(n, offset, __va(dt_ptr), 1);
 		/* We should never reach here */
 		panic("Relocation error");
diff --git a/arch/powerpc/mm/mmu_decl.h b/arch/powerpc/mm/mmu_decl.h
index 27c3a2d..9f58ff4 100644
--- a/arch/powerpc/mm/mmu_decl.h
+++ b/arch/powerpc/mm/mmu_decl.h
@@ -141,7 +141,8 @@ extern void MMU_init_hw(void);
 extern unsigned long mmu_mapin_ram(unsigned long top);
 
 #elif defined(CONFIG_PPC_FSL_BOOK3E)
-extern unsigned long map_mem_in_cams(unsigned long ram, int max_cam_idx);
+extern unsigned long map_mem_in_cams(unsigned long ram, int max_cam_idx,
+				     bool dryrun);
 extern unsigned long calc_cam_sz(unsigned long ram, unsigned long virt,
 				 phys_addr_t phys);
 #ifdef CONFIG_PPC32
diff --git a/arch/powerpc/mm/tlb_nohash.c b/arch/powerpc/mm/tlb_nohash.c
index a7381fb..bb04e4d 100644
--- a/arch/powerpc/mm/tlb_nohash.c
+++ b/arch/powerpc/mm/tlb_nohash.c
@@ -648,7 +648,7 @@ static void early_init_this_mmu(void)
 
 		if (map)
 			linear_map_top = map_mem_in_cams(linear_map_top,
-							 num_cams);
+							 num_cams, false);
 	}
 #endif
 
@@ -746,10 +746,14 @@ void setup_initial_memory_limit(phys_addr_t first_memblock_base,
 	 * entries are supported though that may eventually
 	 * change.
 	 *
-	 * on FSL Embedded 64-bit, we adjust the RMA size to match the
-	 * first bolted TLB entry size.  We still limit max to 1G even if
-	 * the TLB could cover more.  This is due to what the early init
-	 * code is setup to do.
+	 * on FSL Embedded 64-bit, usually all RAM is bolted, but with
+	 * unusual memory sizes it's possible for some RAM to not be mapped
+	 * (such RAM is not used at all by Linux, since we don't support
+	 * highmem on 64-bit).  We limit ppc64_rma_size to what would be
+	 * mappable if this memblock is the only one.  Additional memblocks
+	 * can only increase, not decrease, the amount that ends up getting
+	 * mapped.  We still limit max to 1G even if we'll eventually map
+	 * more.  This is due to what the early init code is set up to do.
 	 *
 	 * We crop it to the size of the first MEMBLOCK to
 	 * avoid going over total available memory just in case...
@@ -757,8 +761,14 @@ void setup_initial_memory_limit(phys_addr_t first_memblock_base,
 #ifdef CONFIG_PPC_FSL_BOOK3E
 	if (mmu_has_feature(MMU_FTR_TYPE_FSL_E)) {
 		unsigned long linear_sz;
-		linear_sz = calc_cam_sz(first_memblock_size, PAGE_OFFSET,
-					first_memblock_base);
+		unsigned int num_cams;
+
+		/* use a quarter of the TLBCAM for bolted linear map */
+		num_cams = (mfspr(SPRN_TLB1CFG) & TLBnCFG_N_ENTRY) / 4;
+
+		linear_sz = map_mem_in_cams(first_memblock_size, num_cams,
+					    true);
+
 		ppc64_rma_size = min_t(u64, linear_sz, 0x40000000);
 	} else
 #endif
-- 
2.1.4


* [RFC PATCH 06/17] powerpc/85xx: Implement 64-bit kexec support
  2015-07-18 20:08 [RFC PATCH 00/17] powerpc/fsl-book3e-64: kexec/kdump support Scott Wood
                   ` (4 preceding siblings ...)
  2015-07-18 20:08 ` [RFC PATCH 05/17] powerpc/fsl-booke-64: Don't limit ppc64_rma_size to one TLB entry Scott Wood
@ 2015-07-18 20:08 ` Scott Wood
  2015-07-18 20:08 ` [RFC PATCH 07/17] powerpc/e6500: kexec: Handle hardware threads Scott Wood
                   ` (10 subsequent siblings)
  16 siblings, 0 replies; 25+ messages in thread
From: Scott Wood @ 2015-07-18 20:08 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Tiejun Chen, kexec, Tiejun Chen, Scott Wood

From: Tiejun Chen <tiejun.chen@windriver.com>

Unlike 32-bit 85xx kexec, we don't do a core reset.

Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com>
[scottwood: edit changelog, and cleanup]
Signed-off-by: Scott Wood <scottwood@freescale.com>
---
 arch/powerpc/platforms/85xx/smp.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/arch/powerpc/platforms/85xx/smp.c b/arch/powerpc/platforms/85xx/smp.c
index a0763be..2e46684 100644
--- a/arch/powerpc/platforms/85xx/smp.c
+++ b/arch/powerpc/platforms/85xx/smp.c
@@ -351,6 +351,7 @@ struct smp_ops_t smp_85xx_ops = {
 };
 
 #ifdef CONFIG_KEXEC
+#ifdef CONFIG_PPC32
 atomic_t kexec_down_cpus = ATOMIC_INIT(0);
 
 void mpc85xx_smp_kexec_cpu_down(int crash_shutdown, int secondary)
@@ -370,9 +371,18 @@ static void mpc85xx_smp_kexec_down(void *arg)
 	if (ppc_md.kexec_cpu_down)
 		ppc_md.kexec_cpu_down(0,1);
 }
+#else
+void mpc85xx_smp_kexec_cpu_down(int crash_shutdown, int secondary)
+{
+	local_irq_disable();
+	hard_irq_disable();
+	mpic_teardown_this_cpu(secondary);
+}
+#endif
 
 static void mpc85xx_smp_machine_kexec(struct kimage *image)
 {
+#ifdef CONFIG_PPC32
 	int timeout = INT_MAX;
 	int i, num_cpus = num_present_cpus();
 
@@ -393,6 +403,7 @@ static void mpc85xx_smp_machine_kexec(struct kimage *image)
 		if ( i == smp_processor_id() ) continue;
 		mpic_reset_core(i);
 	}
+#endif
 
 	default_machine_kexec(image);
 }
-- 
2.1.4


* [RFC PATCH 07/17] powerpc/e6500: kexec: Handle hardware threads
  2015-07-18 20:08 [RFC PATCH 00/17] powerpc/fsl-book3e-64: kexec/kdump support Scott Wood
                   ` (5 preceding siblings ...)
  2015-07-18 20:08 ` [RFC PATCH 06/17] powerpc/85xx: Implement 64-bit kexec support Scott Wood
@ 2015-07-18 20:08 ` Scott Wood
  2015-07-18 20:08 ` [RFC PATCH 08/17] powerpc/book3e-64: rename interrupt_end_book3e with __end_interrupts Scott Wood
                   ` (9 subsequent siblings)
  16 siblings, 0 replies; 25+ messages in thread
From: Scott Wood @ 2015-07-18 20:08 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Tiejun Chen, kexec, Scott Wood

The new kernel will be expecting secondary threads to be disabled,
not spinning.
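
As an aside, the PIR renumbering comment added in this patch ("shift
everything but the low bit right by two bits so that the cpu numbering
is continuous") can be modeled in C.  The reset-time layout
(core << 2 | thread) is an assumption made here purely for
illustration of the rlwimi's effect on typical small PIR values:

```c
#include <assert.h>

/*
 * Assumed reset encoding: PIR = (core << 2) | thread, two threads per
 * core.  Shifting everything but the low bit right by two yields the
 * continuous numbering (core << 1) | thread.
 */
static unsigned int renumber_pir(unsigned int pir)
{
	return ((pir >> 2) << 1) | (pir & 1);
}
```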

Signed-off-by: Scott Wood <scottwood@freescale.com>
---
 arch/powerpc/kernel/head_64.S     | 16 +++++++++++++
 arch/powerpc/platforms/85xx/smp.c | 48 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 64 insertions(+)

diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S
index d48125d..8b2bf0d 100644
--- a/arch/powerpc/kernel/head_64.S
+++ b/arch/powerpc/kernel/head_64.S
@@ -182,6 +182,8 @@ exception_marker:
 
 #ifdef CONFIG_PPC_BOOK3E
 _GLOBAL(fsl_secondary_thread_init)
+	mfspr	r4,SPRN_BUCSR
+
 	/* Enable branch prediction */
 	lis     r3,BUCSR_INIT@h
 	ori     r3,r3,BUCSR_INIT@l
@@ -196,10 +198,24 @@ _GLOBAL(fsl_secondary_thread_init)
 	 * number.  There are two threads per core, so shift everything
 	 * but the low bit right by two bits so that the cpu numbering is
 	 * continuous.
+	 *
+	 * If the old value of BUCSR is non-zero, this thread has run
+	 * before.  Thus, we assume we are coming from kexec or a similar
+	 * scenario, and PIR is already set to the correct value.  This
+	 * is a bit of a hack, but there are limited opportunities for
+	 * getting information into the thread and the alternatives
+	 * seemed like they'd be overkill.  We can't tell just by looking
+	 * at the old PIR value which state it's in, since the same value
+	 * could be valid for one thread out of reset and for a different
+	 * thread in Linux.
 	 */
+
 	mfspr	r3, SPRN_PIR
+	cmpwi	r4,0
+	bne	1f
 	rlwimi	r3, r3, 30, 2, 30
 	mtspr	SPRN_PIR, r3
+1:
 #endif
 
 _GLOBAL(generic_secondary_thread_init)
diff --git a/arch/powerpc/platforms/85xx/smp.c b/arch/powerpc/platforms/85xx/smp.c
index 2e46684..5152289 100644
--- a/arch/powerpc/platforms/85xx/smp.c
+++ b/arch/powerpc/platforms/85xx/smp.c
@@ -374,9 +374,57 @@ static void mpc85xx_smp_kexec_down(void *arg)
 #else
 void mpc85xx_smp_kexec_cpu_down(int crash_shutdown, int secondary)
 {
+	int cpu = smp_processor_id();
+	int sibling = cpu_last_thread_sibling(cpu);
+	bool notified = false;
+	int disable_cpu;
+	int disable_threadbit = 0;
+	long start = mftb();
+	long now;
+
 	local_irq_disable();
 	hard_irq_disable();
 	mpic_teardown_this_cpu(secondary);
+
+	if (cpu == crashing_cpu && cpu_thread_in_core(cpu) != 0) {
+		/*
+		 * We enter the crash kernel on whatever cpu crashed,
+		 * even if it's a secondary thread.  If that's the case,
+		 * disable the corresponding primary thread.
+		 */
+		int tir = cpu_thread_in_core(cpu) ^ 1;
+
+		disable_threadbit = 1 << tir;
+		disable_cpu = cpu_first_thread_sibling(cpu) | tir;
+	} else if (sibling != crashing_cpu &&
+		   cpu_thread_in_core(cpu) == 0 &&
+		   cpu_thread_in_core(sibling) != 0) {
+		disable_threadbit = 2;
+		disable_cpu = sibling;
+	}
+
+	if (disable_threadbit) {
+		while (paca[disable_cpu].kexec_state < KEXEC_STATE_REAL_MODE) {
+			barrier();
+			now = mftb();
+			if (!notified && now - start > 1000000) {
+				pr_info("%s/%d: waiting for cpu %d to enter KEXEC_STATE_REAL_MODE (%d)\n",
+					__func__, smp_processor_id(),
+					disable_cpu,
+					paca[disable_cpu].kexec_state);
+				notified = true;
+			}
+		}
+
+		if (notified) {
+			pr_info("%s: cpu %d done waiting\n",
+				__func__, disable_cpu);
+		}
+
+		mtspr(SPRN_TENC, disable_threadbit);
+		while (mfspr(SPRN_TENSR) & disable_threadbit)
+			cpu_relax();
+	}
 }
 #endif
 
-- 
2.1.4


* [RFC PATCH 08/17] powerpc/book3e-64: rename interrupt_end_book3e with __end_interrupts
  2015-07-18 20:08 [RFC PATCH 00/17] powerpc/fsl-book3e-64: kexec/kdump support Scott Wood
                   ` (6 preceding siblings ...)
  2015-07-18 20:08 ` [RFC PATCH 07/17] powerpc/e6500: kexec: Handle hardware threads Scott Wood
@ 2015-07-18 20:08 ` Scott Wood
  2015-07-18 20:08 ` [RFC PATCH 09/17] powerpc/booke64: Fix args to copy_and_flush Scott Wood
                   ` (8 subsequent siblings)
  16 siblings, 0 replies; 25+ messages in thread
From: Scott Wood @ 2015-07-18 20:08 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Tiejun Chen, kexec, Tiejun Chen, Scott Wood

From: Tiejun Chen <tiejun.chen@windriver.com>

Rename 'interrupt_end_book3e' to '__end_interrupts' so that the symbol
can be used by both book3s and book3e.

Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com>
[scottwood: edit changelog]
Signed-off-by: Scott Wood <scottwood@freescale.com>
---
 arch/powerpc/kernel/exceptions-64e.S | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kernel/exceptions-64e.S b/arch/powerpc/kernel/exceptions-64e.S
index f3bd5e7..9d4a006 100644
--- a/arch/powerpc/kernel/exceptions-64e.S
+++ b/arch/powerpc/kernel/exceptions-64e.S
@@ -542,8 +542,8 @@ interrupt_base_book3e:					/* fake trap */
 	EXCEPTION_STUB(0x320, ehpriv)
 	EXCEPTION_STUB(0x340, lrat_error)
 
-	.globl interrupt_end_book3e
-interrupt_end_book3e:
+	.globl __end_interrupts
+__end_interrupts:
 
 /* Critical Input Interrupt */
 	START_EXCEPTION(critical_input);
@@ -736,7 +736,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_ALTIVEC)
 	beq+	1f
 
 	LOAD_REG_IMMEDIATE(r14,interrupt_base_book3e)
-	LOAD_REG_IMMEDIATE(r15,interrupt_end_book3e)
+	LOAD_REG_IMMEDIATE(r15,__end_interrupts)
 	cmpld	cr0,r10,r14
 	cmpld	cr1,r10,r15
 	blt+	cr0,1f
@@ -800,7 +800,7 @@ kernel_dbg_exc:
 	beq+	1f
 
 	LOAD_REG_IMMEDIATE(r14,interrupt_base_book3e)
-	LOAD_REG_IMMEDIATE(r15,interrupt_end_book3e)
+	LOAD_REG_IMMEDIATE(r15,__end_interrupts)
 	cmpld	cr0,r10,r14
 	cmpld	cr1,r10,r15
 	blt+	cr0,1f
-- 
2.1.4


* [RFC PATCH 09/17] powerpc/booke64: Fix args to copy_and_flush
  2015-07-18 20:08 [RFC PATCH 00/17] powerpc/fsl-book3e-64: kexec/kdump support Scott Wood
                   ` (7 preceding siblings ...)
  2015-07-18 20:08 ` [RFC PATCH 08/17] powerpc/book3e-64: rename interrupt_end_book3e with __end_interrupts Scott Wood
@ 2015-07-18 20:08 ` Scott Wood
  2015-07-18 20:08 ` [RFC PATCH 10/17] powerpc/book3e: support CONFIG_RELOCATABLE Scott Wood
                   ` (7 subsequent siblings)
  16 siblings, 0 replies; 25+ messages in thread
From: Scott Wood @ 2015-07-18 20:08 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Tiejun Chen, kexec, Tiejun Chen, Scott Wood

From: Tiejun Chen <tiejun.chen@windriver.com>

Convert r4/r5, not r6, to a virtual address when calling
copy_and_flush.  Otherwise, since r3 is already virtual and
copy_and_flush accesses r3+r6, PAGE_OFFSET would get added twice.

This isn't normally seen because on book3e we normally enter with
the kernel at zero and thus skip copy_and_flush -- but it will be
needed for kexec support.

Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com>
[scottwood: split patch and rewrote changelog]
Signed-off-by: Scott Wood <scottwood@freescale.com>
---
 arch/powerpc/kernel/head_64.S | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S
index 8b2bf0d..a1e85ca 100644
--- a/arch/powerpc/kernel/head_64.S
+++ b/arch/powerpc/kernel/head_64.S
@@ -474,15 +474,15 @@ __after_prom_start:
  */
 	li	r3,0			/* target addr */
 #ifdef CONFIG_PPC_BOOK3E
-	tovirt(r3,r3)			/* on booke, we already run at PAGE_OFFSET */
+	tovirt(r3,r3)		/* on booke, we already run at PAGE_OFFSET */
 #endif
 	mr.	r4,r26			/* In some cases the loader may  */
+#if defined(CONFIG_PPC_BOOK3E)
+	tovirt(r4,r4)
+#endif
 	beq	9f			/* have already put us at zero */
 	li	r6,0x100		/* Start offset, the first 0x100 */
 					/* bytes were copied earlier.	 */
-#ifdef CONFIG_PPC_BOOK3E
-	tovirt(r6,r6)			/* on booke, we already run at PAGE_OFFSET */
-#endif
 
 #ifdef CONFIG_RELOCATABLE
 /*
@@ -514,6 +514,9 @@ __after_prom_start:
 p_end:	.llong	_end - _stext
 
 4:	/* Now copy the rest of the kernel up to _end */
+#if defined(CONFIG_PPC_BOOK3E)
+	tovirt(r26,r26)
+#endif
 	addis	r5,r26,(p_end - _stext)@ha
 	ld	r5,(p_end - _stext)@l(r5)	/* get _end */
 5:	bl	copy_and_flush		/* copy the rest */
-- 
2.1.4


* [RFC PATCH 10/17] powerpc/book3e: support CONFIG_RELOCATABLE
  2015-07-18 20:08 [RFC PATCH 00/17] powerpc/fsl-book3e-64: kexec/kdump support Scott Wood
                   ` (8 preceding siblings ...)
  2015-07-18 20:08 ` [RFC PATCH 09/17] powerpc/booke64: Fix args to copy_and_flush Scott Wood
@ 2015-07-18 20:08 ` Scott Wood
  2015-07-18 20:08 ` [RFC PATCH 11/17] powerpc/book3e/kdump: Enable crash_kexec_wait_realmode Scott Wood
                   ` (6 subsequent siblings)
  16 siblings, 0 replies; 25+ messages in thread
From: Scott Wood @ 2015-07-18 20:08 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Tiejun Chen, kexec, Tiejun Chen, Scott Wood

From: Tiejun Chen <tiejun.chen@windriver.com>

book3e differs from book3s: book3s includes the exception vector
code in head_64.S, since that code relies on absolute addressing,
which is only possible within a single compilation unit.  book3e's
vectors live in exceptions-64e.S, so head_64.S has to load their
label addresses via the GOT.

Also, when booting a relocated kernel, IVPR must be set up again
after .relocate runs.

Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com>
[scottwood: cleanup and ifdef removal]
Signed-off-by: Scott Wood <scottwood@freescale.com>
---
 arch/powerpc/include/asm/exception-64e.h |  4 ++--
 arch/powerpc/kernel/exceptions-64e.S     |  9 +++++++--
 arch/powerpc/kernel/head_64.S            | 22 +++++++++++++++++++---
 3 files changed, 28 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/include/asm/exception-64e.h b/arch/powerpc/include/asm/exception-64e.h
index a8b52b6..344fc43 100644
--- a/arch/powerpc/include/asm/exception-64e.h
+++ b/arch/powerpc/include/asm/exception-64e.h
@@ -204,8 +204,8 @@ exc_##label##_book3e:
 #endif
 
 #define SET_IVOR(vector_number, vector_offset)	\
-	li	r3,vector_offset@l; 		\
-	ori	r3,r3,interrupt_base_book3e@l;	\
+	LOAD_REG_ADDR(r3,interrupt_base_book3e);\
+	ori	r3,r3,vector_offset@l;		\
 	mtspr	SPRN_IVOR##vector_number,r3;
 
 #endif /* _ASM_POWERPC_EXCEPTION_64E_H */
diff --git a/arch/powerpc/kernel/exceptions-64e.S b/arch/powerpc/kernel/exceptions-64e.S
index 9d4a006..488e631 100644
--- a/arch/powerpc/kernel/exceptions-64e.S
+++ b/arch/powerpc/kernel/exceptions-64e.S
@@ -1351,7 +1351,10 @@ skpinv:	addi	r6,r6,1				/* Increment */
  * r4 = MAS0 w/TLBSEL & ESEL for the temp mapping
  */
 	/* Now we branch the new virtual address mapped by this entry */
-	LOAD_REG_IMMEDIATE(r6,2f)
+	bl	1f		/* Find our address */
+1:	mflr	r6
+	addi	r6,r6,(2f - 1b)
+	tovirt(r6,r6)
 	lis	r7,MSR_KERNEL@h
 	ori	r7,r7,MSR_KERNEL@l
 	mtspr	SPRN_SRR0,r6
@@ -1583,9 +1586,11 @@ _GLOBAL(book3e_secondary_thread_init)
 	mflr	r28
 	b	3b
 
+	.globl init_core_book3e
 init_core_book3e:
 	/* Establish the interrupt vector base */
-	LOAD_REG_IMMEDIATE(r3, interrupt_base_book3e)
+	tovirt(r2,r2)
+	LOAD_REG_ADDR(r3, interrupt_base_book3e)
 	mtspr	SPRN_IVPR,r3
 	sync
 	blr
diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S
index a1e85ca..1b77956 100644
--- a/arch/powerpc/kernel/head_64.S
+++ b/arch/powerpc/kernel/head_64.S
@@ -457,12 +457,22 @@ __after_prom_start:
 	/* process relocations for the final address of the kernel */
 	lis	r25,PAGE_OFFSET@highest	/* compute virtual base of kernel */
 	sldi	r25,r25,32
+#if defined(CONFIG_PPC_BOOK3E)
+	tovirt(r26,r26)		/* on booke, we already run at PAGE_OFFSET */
+#endif
 	lwz	r7,__run_at_load-_stext(r26)
+#if defined(CONFIG_PPC_BOOK3E)
+	tophys(r26,r26)
+#endif
 	cmplwi	cr0,r7,1	/* flagged to stay where we are ? */
 	bne	1f
 	add	r25,r25,r26
 1:	mr	r3,r25
 	bl	relocate
+#if defined(CONFIG_PPC_BOOK3E)
+	/* IVPR needs to be set after relocation. */
+	bl	init_core_book3e
+#endif
 #endif
 
 /*
@@ -490,12 +500,21 @@ __after_prom_start:
  * variable __run_at_load, if it is set the kernel is treated as relocatable
  * kernel, otherwise it will be moved to PHYSICAL_START
  */
+#if defined(CONFIG_PPC_BOOK3E)
+	tovirt(r26,r26)		/* on booke, we already run at PAGE_OFFSET */
+#endif
 	lwz	r7,__run_at_load-_stext(r26)
 	cmplwi	cr0,r7,1
 	bne	3f
 
+#ifdef CONFIG_PPC_BOOK3E
+	LOAD_REG_ADDR(r5, __end_interrupts)
+	LOAD_REG_ADDR(r11, _stext)
+	sub	r5,r5,r11
+#else
 	/* just copy interrupts */
 	LOAD_REG_IMMEDIATE(r5, __end_interrupts - _stext)
+#endif
 	b	5f
 3:
 #endif
@@ -514,9 +533,6 @@ __after_prom_start:
 p_end:	.llong	_end - _stext
 
 4:	/* Now copy the rest of the kernel up to _end */
-#if defined(CONFIG_PPC_BOOK3E)
-	tovirt(r26,r26)
-#endif
 	addis	r5,r26,(p_end - _stext)@ha
 	ld	r5,(p_end - _stext)@l(r5)	/* get _end */
 5:	bl	copy_and_flush		/* copy the rest */
-- 
2.1.4


* [RFC PATCH 11/17] powerpc/book3e/kdump: Enable crash_kexec_wait_realmode
  2015-07-18 20:08 [RFC PATCH 00/17] powerpc/fsl-book3e-64: kexec/kdump support Scott Wood
                   ` (9 preceding siblings ...)
  2015-07-18 20:08 ` [RFC PATCH 10/17] powerpc/book3e: support CONFIG_RELOCATABLE Scott Wood
@ 2015-07-18 20:08 ` Scott Wood
  2015-07-18 20:08 ` [RFC PATCH 12/17] powerpc/book3e-64: Don't limit paca to 256 MiB Scott Wood
                   ` (5 subsequent siblings)
  16 siblings, 0 replies; 25+ messages in thread
From: Scott Wood @ 2015-07-18 20:08 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Tiejun Chen, kexec, Scott Wood

While book3e doesn't have "real mode", we still want to wait for
all the non-crash cpus to complete their shutdown.

Signed-off-by: Scott Wood <scottwood@freescale.com>
---
 arch/powerpc/kernel/crash.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kernel/crash.c b/arch/powerpc/kernel/crash.c
index 51dbace..2bb252c 100644
--- a/arch/powerpc/kernel/crash.c
+++ b/arch/powerpc/kernel/crash.c
@@ -221,8 +221,8 @@ void crash_kexec_secondary(struct pt_regs *regs)
 #endif	/* CONFIG_SMP */
 
 /* wait for all the CPUs to hit real mode but timeout if they don't come in */
-#if defined(CONFIG_SMP) && defined(CONFIG_PPC_STD_MMU_64)
-static void crash_kexec_wait_realmode(int cpu)
+#if defined(CONFIG_SMP) && defined(CONFIG_PPC64)
+static void __maybe_unused crash_kexec_wait_realmode(int cpu)
 {
 	unsigned int msecs;
 	int i;
@@ -244,7 +244,7 @@ static void crash_kexec_wait_realmode(int cpu)
 }
 #else
 static inline void crash_kexec_wait_realmode(int cpu) {}
-#endif	/* CONFIG_SMP && CONFIG_PPC_STD_MMU_64 */
+#endif	/* CONFIG_SMP && CONFIG_PPC64 */
 
 /*
  * Register a function to be called on shutdown.  Only use this if you
-- 
2.1.4


* [RFC PATCH 12/17] powerpc/book3e-64: Don't limit paca to 256 MiB
  2015-07-18 20:08 [RFC PATCH 00/17] powerpc/fsl-book3e-64: kexec/kdump support Scott Wood
                   ` (10 preceding siblings ...)
  2015-07-18 20:08 ` [RFC PATCH 11/17] powerpc/book3e/kdump: Enable crash_kexec_wait_realmode Scott Wood
@ 2015-07-18 20:08 ` Scott Wood
  2015-07-18 20:08 ` [RFC PATCH 13/17] powerpc/book3e-64/kexec: create an identity TLB mapping Scott Wood
                   ` (4 subsequent siblings)
  16 siblings, 0 replies; 25+ messages in thread
From: Scott Wood @ 2015-07-18 20:08 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Tiejun Chen, kexec, Scott Wood

This limit only makes sense on book3s, and on book3e it can cause
problems with kdump if we don't have any memory under 256 MiB.

Signed-off-by: Scott Wood <scottwood@freescale.com>
---
 arch/powerpc/kernel/paca.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/paca.c b/arch/powerpc/kernel/paca.c
index 5a23b69..7fdff63 100644
--- a/arch/powerpc/kernel/paca.c
+++ b/arch/powerpc/kernel/paca.c
@@ -206,12 +206,16 @@ void __init allocate_pacas(void)
 {
 	int cpu, limit;
 
+	limit = ppc64_rma_size;
+
+#ifdef CONFIG_PPC_BOOK3S_64
 	/*
 	 * We can't take SLB misses on the paca, and we want to access them
 	 * in real mode, so allocate them within the RMA and also within
 	 * the first segment.
 	 */
-	limit = min(0x10000000ULL, ppc64_rma_size);
+	limit = min(0x10000000ULL, limit);
+#endif
 
 	paca_size = PAGE_ALIGN(sizeof(struct paca_struct) * nr_cpu_ids);
 
-- 
2.1.4


* [RFC PATCH 13/17] powerpc/book3e-64/kexec: create an identity TLB mapping
  2015-07-18 20:08 [RFC PATCH 00/17] powerpc/fsl-book3e-64: kexec/kdump support Scott Wood
                   ` (11 preceding siblings ...)
  2015-07-18 20:08 ` [RFC PATCH 12/17] powerpc/book3e-64: Don't limit paca to 256 MiB Scott Wood
@ 2015-07-18 20:08 ` Scott Wood
  2015-07-18 20:08 ` [RFC PATCH 14/17] powerpc/book3e-64/kexec: Enable SMP release Scott Wood
                   ` (3 subsequent siblings)
  16 siblings, 0 replies; 25+ messages in thread
From: Scott Wood @ 2015-07-18 20:08 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Tiejun Chen, kexec, Tiejun Chen, Scott Wood

From: Tiejun Chen <tiejun.chen@windriver.com>

book3e has no real mode (the MMU is always on), so we have to create
an identity TLB mapping to make sure we can access physical
addresses directly.

Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com>
[scottwood: cleanup, and split off some changes]
Signed-off-by: Scott Wood <scottwood@freescale.com>
---
 arch/powerpc/kernel/misc_64.S | 52 ++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 51 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/misc_64.S b/arch/powerpc/kernel/misc_64.S
index 4e314b9..c5915f0 100644
--- a/arch/powerpc/kernel/misc_64.S
+++ b/arch/powerpc/kernel/misc_64.S
@@ -26,6 +26,7 @@
 #include <asm/thread_info.h>
 #include <asm/kexec.h>
 #include <asm/ptrace.h>
+#include <asm/mmu.h>
 
 	.text
 
@@ -487,6 +488,51 @@ kexec_flag:
 
 
 #ifdef CONFIG_KEXEC
+#ifdef CONFIG_PPC_BOOK3E
+/*
+ * BOOK3E has no real MMU mode, so we have to setup the initial TLB
+ * for a core to identity map v:0 to p:0.  This current implementation
+ * assumes that 1G is enough for kexec.
+ */
+kexec_create_tlb:
+	/*
+	 * Invalidate all non-IPROT TLB entries to avoid any TLB conflict.
+	 * IPROT TLB entries should be >= PAGE_OFFSET and thus not conflict.
+	 */
+	PPC_TLBILX_ALL(0,R0)
+	sync
+	isync
+
+	mfspr	r10,SPRN_TLB1CFG
+	andi.	r10,r10,TLBnCFG_N_ENTRY	/* Extract # entries */
+	subi	r10,r10,1	/* Last entry: no conflict with kernel text */
+	lis	r9,MAS0_TLBSEL(1)@h
+	rlwimi	r9,r10,16,4,15		/* Setup MAS0 = TLBSEL | ESEL(r9) */
+
+/* Set up a temp identity mapping v:0 to p:0 and return to it. */
+#if defined(CONFIG_SMP) || defined(CONFIG_PPC_E500MC)
+#define M_IF_NEEDED	MAS2_M
+#else
+#define M_IF_NEEDED	0
+#endif
+	mtspr	SPRN_MAS0,r9
+
+	lis	r9,(MAS1_VALID|MAS1_IPROT)@h
+	ori	r9,r9,(MAS1_TSIZE(BOOK3E_PAGESZ_1GB))@l
+	mtspr	SPRN_MAS1,r9
+
+	LOAD_REG_IMMEDIATE(r9, 0x0 | M_IF_NEEDED)
+	mtspr	SPRN_MAS2,r9
+
+	LOAD_REG_IMMEDIATE(r9, 0x0 | MAS3_SR | MAS3_SW | MAS3_SX)
+	mtspr	SPRN_MAS3,r9
+	li	r9,0
+	mtspr	SPRN_MAS7,r9
+
+	tlbwe
+	isync
+	blr
+#endif
 
 /* kexec_smp_wait(void)
  *
@@ -516,6 +562,10 @@ _GLOBAL(kexec_smp_wait)
  * don't overwrite r3 here, it is live for kexec_wait above.
  */
 real_mode:	/* assume normal blr return */
+#ifdef CONFIG_PPC_BOOK3E
+	/* Create an identity mapping. */
+	b	kexec_create_tlb
+#else
 1:	li	r9,MSR_RI
 	li	r10,MSR_DR|MSR_IR
 	mflr	r11		/* return address to SRR0 */
@@ -527,7 +577,7 @@ real_mode:	/* assume normal blr return */
 	mtspr	SPRN_SRR1,r10
 	mtspr	SPRN_SRR0,r11
 	rfid
-
+#endif
 
 /*
  * kexec_sequence(newstack, start, image, control, clear_all())
-- 
2.1.4


* [RFC PATCH 14/17] powerpc/book3e-64/kexec: Enable SMP release
  2015-07-18 20:08 [RFC PATCH 00/17] powerpc/fsl-book3e-64: kexec/kdump support Scott Wood
                   ` (12 preceding siblings ...)
  2015-07-18 20:08 ` [RFC PATCH 13/17] powerpc/book3e-64/kexec: create an identity TLB mapping Scott Wood
@ 2015-07-18 20:08 ` Scott Wood
  2015-08-18  4:51   ` [RFC,14/17] " Michael Ellerman
  2015-08-20  4:54   ` [RFC PATCH 14/17] " Michael Ellerman
  2015-07-18 20:08 ` [RFC PATCH 15/17] powerpc/booke: Only use VIRT_PHYS_OFFSET on booke32 Scott Wood
                   ` (2 subsequent siblings)
  16 siblings, 2 replies; 25+ messages in thread
From: Scott Wood @ 2015-07-18 20:08 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Tiejun Chen, kexec, Scott Wood

booted_from_exec is similar to __run_at_load, except that it is set for
regular kexec as well as kdump.

The flag is needed because the SMP release mechanism for FSL book3e is
different from when booting with normal hardware.  In theory we could
simulate the normal spin table mechanism, but not at the addresses
U-Boot put in the device tree -- so there'd need to be even more
communication between the kernel and kexec to set that up.  Since
there's already a similar flag being set (for kdump only), this seemed
like a reasonable approach.

Unlike __run_at_kexec in http://patchwork.ozlabs.org/patch/257657/
("book3e/kexec/kdump: introduce a kexec kernel flag"), this flag is at
a fixed address for ABI stability, and actually gets set properly in
the kdump case (i.e. on the crash kernel, not on the crashing kernel).

Signed-off-by: Scott Wood <scottwood@freescale.com>
---
This depends on the kexec-tools patch "ppc64: Add a flag to tell the
kernel it's booting from kexec":
http://lists.infradead.org/pipermail/kexec/2015-July/014048.html
---
 arch/powerpc/include/asm/smp.h    |  1 +
 arch/powerpc/kernel/head_64.S     | 15 +++++++++++++++
 arch/powerpc/kernel/setup_64.c    | 14 +++++++++++++-
 arch/powerpc/platforms/85xx/smp.c | 16 ++++++++++++----
 4 files changed, 41 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/smp.h b/arch/powerpc/include/asm/smp.h
index 825663c..f9245df 100644
--- a/arch/powerpc/include/asm/smp.h
+++ b/arch/powerpc/include/asm/smp.h
@@ -197,6 +197,7 @@ extern void generic_secondary_thread_init(void);
 extern unsigned long __secondary_hold_spinloop;
 extern unsigned long __secondary_hold_acknowledge;
 extern char __secondary_hold;
+extern u32 booted_from_kexec;
 
 extern void __early_start(void);
 #endif /* __ASSEMBLY__ */
diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S
index 1b77956..ae2d6b5 100644
--- a/arch/powerpc/kernel/head_64.S
+++ b/arch/powerpc/kernel/head_64.S
@@ -91,6 +91,21 @@ __secondary_hold_spinloop:
 __secondary_hold_acknowledge:
 	.llong	0x0
 
+	/* Do not move this variable as kexec-tools knows about it. */
+	. = 0x58
+	.globl	booted_from_kexec
+booted_from_kexec:
+	/*
+	 * "nkxc" -- not (necessarily) from kexec by default
+	 *
+	 * This flag is set to 1 by a loader if the kernel is being
+	 * booted by kexec.  Older kexec-tools don't know about this
+	 * flag, so platforms other than fsl-book3e should treat a value
+	 * of "nkxc" as inconclusive.  fsl-book3e relies on this to
+	 * know how to release secondary cpus.
+	 */
+	.long	0x6e6b7863
+
 #ifdef CONFIG_RELOCATABLE
 	/* This flag is set to 1 by a loader if the kernel should run
 	 * at the loaded address instead of the linked address.  This
diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
index 505ec2c..baeddcc 100644
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -340,11 +340,23 @@ void early_setup_secondary(void)
 #endif /* CONFIG_SMP */
 
 #if defined(CONFIG_SMP) || defined(CONFIG_KEXEC)
+static bool use_spinloop(void)
+{
+#ifdef CONFIG_PPC_FSL_BOOK3E
+	return booted_from_kexec == 1;
+#else
+	return true;
+#endif
+}
+
 void smp_release_cpus(void)
 {
 	unsigned long *ptr;
 	int i;
 
+	if (!use_spinloop())
+		return;
+
 	DBG(" -> smp_release_cpus()\n");
 
 	/* All secondary cpus are spinning on a common spinloop, release them
@@ -524,7 +536,7 @@ void __init setup_system(void)
 	 * Freescale Book3e parts spin in a loop provided by firmware,
 	 * so smp_release_cpus() does nothing for them
 	 */
-#if defined(CONFIG_SMP) && !defined(CONFIG_PPC_FSL_BOOK3E)
+#if defined(CONFIG_SMP)
 	/* Release secondary cpus out of their spinloops at 0x60 now that
 	 * we can map physical -> logical CPU ids
 	 */
diff --git a/arch/powerpc/platforms/85xx/smp.c b/arch/powerpc/platforms/85xx/smp.c
index 5152289..4abda43 100644
--- a/arch/powerpc/platforms/85xx/smp.c
+++ b/arch/powerpc/platforms/85xx/smp.c
@@ -20,6 +20,7 @@
 #include <linux/highmem.h>
 #include <linux/cpu.h>
 
+#include <asm/kexec.h>
 #include <asm/machdep.h>
 #include <asm/pgtable.h>
 #include <asm/page.h>
@@ -203,6 +204,7 @@ static int smp_85xx_kick_cpu(int nr)
 	int hw_cpu = get_hard_smp_processor_id(nr);
 	int ioremappable;
 	int ret = 0;
+	bool have_spin_table = true;
 
 	WARN_ON(nr < 0 || nr >= NR_CPUS);
 	WARN_ON(hw_cpu < 0 || hw_cpu >= NR_CPUS);
@@ -210,6 +212,9 @@ static int smp_85xx_kick_cpu(int nr)
 	pr_debug("smp_85xx_kick_cpu: kick CPU #%d\n", nr);
 
 #ifdef CONFIG_PPC64
+	if (booted_from_kexec == 1 && system_state != SYSTEM_RUNNING)
+		have_spin_table = false;
+
 	/* Threads don't use the spin table */
 	if (cpu_thread_in_core(nr) != 0) {
 		int primary = cpu_first_thread_sibling(nr);
@@ -305,10 +310,13 @@ static int smp_85xx_kick_cpu(int nr)
 		__secondary_hold_acknowledge = -1;
 	}
 #endif
-	flush_spin_table(spin_table);
-	out_be32(&spin_table->pir, hw_cpu);
-	out_be32(&spin_table->addr_l, __pa(__early_start));
-	flush_spin_table(spin_table);
+
+	if (have_spin_table) {
+		flush_spin_table(spin_table);
+		out_be32(&spin_table->pir, hw_cpu);
+		out_be32(&spin_table->addr_l, __pa(__early_start));
+		flush_spin_table(spin_table);
+	}
 
 	/* Wait a bit for the CPU to ack. */
 	if (!spin_event_timeout(__secondary_hold_acknowledge == hw_cpu,
-- 
2.1.4


* [RFC PATCH 15/17] powerpc/booke: Only use VIRT_PHYS_OFFSET on booke32
  2015-07-18 20:08 [RFC PATCH 00/17] powerpc/fsl-book3e-64: kexec/kdump support Scott Wood
                   ` (13 preceding siblings ...)
  2015-07-18 20:08 ` [RFC PATCH 14/17] powerpc/book3e-64/kexec: Enable SMP release Scott Wood
@ 2015-07-18 20:08 ` Scott Wood
  2015-07-18 20:08 ` [RFC PATCH 16/17] powerpc/book3e-64/kexec: Set "r4 = 0" when entering spinloop Scott Wood
  2015-07-18 20:08 ` [RFC PATCH 17/17] powerpc/book3e-64: Enable kexec Scott Wood
  16 siblings, 0 replies; 25+ messages in thread
From: Scott Wood @ 2015-07-18 20:08 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Tiejun Chen, kexec, Scott Wood

The VIRT_PHYS_OFFSET calculation is not correct on book3e-64, because
it does not account for CONFIG_RELOCATABLE other than via the
32-bit-only virt_phys_offset.

book3e-64 can (and if the comment about a GCC miscompilation is still
relevant, should) use the normal ppc64 __va/__pa.

At this point, only booke-32 will use VIRT_PHYS_OFFSET, so given the
issues with its calculation, restrict its definition to booke-32.

Signed-off-by: Scott Wood <scottwood@freescale.com>
---
 arch/powerpc/include/asm/page.h | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/page.h b/arch/powerpc/include/asm/page.h
index 71294a6..11889d8 100644
--- a/arch/powerpc/include/asm/page.h
+++ b/arch/powerpc/include/asm/page.h
@@ -107,12 +107,13 @@ extern long long virt_phys_offset;
 #endif
 
 /* See Description below for VIRT_PHYS_OFFSET */
-#ifdef CONFIG_RELOCATABLE_PPC32
+#if defined(CONFIG_PPC32) && defined(CONFIG_BOOKE)
+#ifdef CONFIG_RELOCATABLE
 #define VIRT_PHYS_OFFSET virt_phys_offset
 #else
 #define VIRT_PHYS_OFFSET (KERNELBASE - PHYSICAL_START)
 #endif
-
+#endif
 
 #ifdef CONFIG_PPC64
 #define MEMORY_START	0UL
@@ -204,7 +205,7 @@ extern long long virt_phys_offset;
  * On non-Book-E PPC64 PAGE_OFFSET and MEMORY_START are constants so use
  * the other definitions for __va & __pa.
  */
-#ifdef CONFIG_BOOKE
+#if defined(CONFIG_PPC32) && defined(CONFIG_BOOKE)
 #define __va(x) ((void *)(unsigned long)((phys_addr_t)(x) + VIRT_PHYS_OFFSET))
 #define __pa(x) ((unsigned long)(x) - VIRT_PHYS_OFFSET)
 #else
-- 
2.1.4


* [RFC PATCH 16/17] powerpc/book3e-64/kexec: Set "r4 = 0" when entering spinloop
  2015-07-18 20:08 [RFC PATCH 00/17] powerpc/fsl-book3e-64: kexec/kdump support Scott Wood
                   ` (14 preceding siblings ...)
  2015-07-18 20:08 ` [RFC PATCH 15/17] powerpc/booke: Only use VIRT_PHYS_OFFSET on booke32 Scott Wood
@ 2015-07-18 20:08 ` Scott Wood
  2015-07-18 20:08 ` [RFC PATCH 17/17] powerpc/book3e-64: Enable kexec Scott Wood
  16 siblings, 0 replies; 25+ messages in thread
From: Scott Wood @ 2015-07-18 20:08 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Tiejun Chen, kexec, Scott Wood

book3e_secondary_core_init will only create a TLB entry if r4 = 0,
so set r4 to zero before entering the spinloop.

Signed-off-by: Scott Wood <scottwood@freescale.com>
---
 arch/powerpc/kernel/misc_64.S | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/powerpc/kernel/misc_64.S b/arch/powerpc/kernel/misc_64.S
index c5915f0..fb955d9 100644
--- a/arch/powerpc/kernel/misc_64.S
+++ b/arch/powerpc/kernel/misc_64.S
@@ -476,6 +476,10 @@ _GLOBAL(kexec_wait)
 #ifdef CONFIG_KEXEC		/* use no memory without kexec */
 	lwz	r4,0(r5)
 	cmpwi	0,r4,0
+#ifdef CONFIG_PPC_BOOK3E
+	/* Don't create TLB entry in book3e_secondary_core_init */
+	li	r4,0
+#endif
 	bnea	0x60
 #endif
 	b	99b
-- 
2.1.4


* [RFC PATCH 17/17] powerpc/book3e-64: Enable kexec
  2015-07-18 20:08 [RFC PATCH 00/17] powerpc/fsl-book3e-64: kexec/kdump support Scott Wood
                   ` (15 preceding siblings ...)
  2015-07-18 20:08 ` [RFC PATCH 16/17] powerpc/book3e-64/kexec: Set "r4 = 0" when entering spinloop Scott Wood
@ 2015-07-18 20:08 ` Scott Wood
  16 siblings, 0 replies; 25+ messages in thread
From: Scott Wood @ 2015-07-18 20:08 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Tiejun Chen, kexec, Tiejun Chen, Scott Wood

From: Tiejun Chen <tiejun.chen@windriver.com>

Allow KEXEC to be enabled for book3e, and bypass or convert the
parts of the kexec code that do not apply to book3e.

Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com>
[scottwood@freescale.com: move code to minimize diff, and cleanup]
Signed-off-by: Scott Wood <scottwood@freescale.com>
---
 arch/powerpc/Kconfig                   |  2 +-
 arch/powerpc/kernel/machine_kexec_64.c | 19 +++++++++++++++++++
 arch/powerpc/kernel/misc_64.S          |  6 ++++++
 3 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 5ef2711..710dcdb 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -414,7 +414,7 @@ config PPC64_SUPPORTS_MEMORY_FAILURE
 
 config KEXEC
 	bool "kexec system call"
-	depends on (PPC_BOOK3S || FSL_BOOKE || (44x && !SMP))
+	depends on (PPC_BOOK3S || FSL_BOOKE || (44x && !SMP)) || PPC_BOOK3E
 	help
 	  kexec is a system call that implements the ability to shutdown your
 	  current kernel, and to start another kernel.  It is like a reboot
diff --git a/arch/powerpc/kernel/machine_kexec_64.c b/arch/powerpc/kernel/machine_kexec_64.c
index 1a74446..b0d42a7 100644
--- a/arch/powerpc/kernel/machine_kexec_64.c
+++ b/arch/powerpc/kernel/machine_kexec_64.c
@@ -30,6 +30,21 @@
 #include <asm/smp.h>
 #include <asm/hw_breakpoint.h>
 
+#ifdef CONFIG_PPC_BOOK3E
+int default_machine_kexec_prepare(struct kimage *image)
+{
+	int i;
+	/*
+	 * Since we use the kernel fault handlers and paging code to
+	 * handle the virtual mode, we must make sure no destination
+	 * overlaps kernel static data or bss.
+	 */
+	for (i = 0; i < image->nr_segments; i++)
+		if (image->segment[i].mem < __pa(_end))
+			return -ETXTBSY;
+	return 0;
+}
+#else
 int default_machine_kexec_prepare(struct kimage *image)
 {
 	int i;
@@ -95,6 +110,7 @@ int default_machine_kexec_prepare(struct kimage *image)
 
 	return 0;
 }
+#endif /* !CONFIG_PPC_BOOK3E */
 
 static void copy_segments(unsigned long ind)
 {
@@ -365,6 +381,7 @@ void default_machine_kexec(struct kimage *image)
 	/* NOTREACHED */
 }
 
+#ifndef CONFIG_PPC_BOOK3E
 /* Values we need to export to the second kernel via the device tree. */
 static unsigned long htab_base;
 static unsigned long htab_size;
@@ -411,3 +428,5 @@ static int __init export_htab_values(void)
 	return 0;
 }
 late_initcall(export_htab_values);
+#endif /* !CONFIG_PPC_BOOK3E */
+
diff --git a/arch/powerpc/kernel/misc_64.S b/arch/powerpc/kernel/misc_64.S
index fb955d9..8cf0b8ae 100644
--- a/arch/powerpc/kernel/misc_64.S
+++ b/arch/powerpc/kernel/misc_64.S
@@ -624,9 +624,13 @@ _GLOBAL(kexec_sequence)
 	lhz	r25,PACAHWCPUID(r13)	/* get our phys cpu from paca */
 
 	/* disable interrupts, we are overwriting kernel data next */
+#ifdef CONFIG_PPC_BOOK3E
+	wrteei	0
+#else
 	mfmsr	r3
 	rlwinm	r3,r3,0,17,15
 	mtmsrd	r3,1
+#endif
 
 	/* copy dest pages, flush whole dest image */
 	mr	r3,r29
@@ -648,6 +652,7 @@ _GLOBAL(kexec_sequence)
 	li	r6,1
 	stw	r6,kexec_flag-1b(5)
 
+#ifndef CONFIG_PPC_BOOK3E
 	/* clear out hardware hash page table and tlb */
 #if !defined(_CALL_ELF) || _CALL_ELF != 2
 	ld	r12,0(r27)		/* deref function descriptor */
@@ -656,6 +661,7 @@ _GLOBAL(kexec_sequence)
 #endif
 	mtctr	r12
 	bctrl				/* ppc_md.hpte_clear_all(void); */
+#endif /* !CONFIG_PPC_BOOK3E */
 
 /*
  *   kexec image calling is:
-- 
2.1.4


* Re: [RFC,14/17] powerpc/book3e-64/kexec: Enable SMP release
  2015-07-18 20:08 ` [RFC PATCH 14/17] powerpc/book3e-64/kexec: Enable SMP release Scott Wood
@ 2015-08-18  4:51   ` Michael Ellerman
  2015-08-18  5:09     ` Scott Wood
  2015-08-20  4:54   ` [RFC PATCH 14/17] " Michael Ellerman
  1 sibling, 1 reply; 25+ messages in thread
From: Michael Ellerman @ 2015-08-18  4:51 UTC (permalink / raw)
  To: Scott Wood, linuxppc-dev; +Cc: Scott Wood, Tiejun Chen, kexec

On Sat, 2015-18-07 at 20:08:51 UTC, Scott Wood wrote:
> booted_from_exec is similar to __run_at_load, except that it is set for
              ^
	      missing k.

Also do you mind using __booted_from_kexec to keep the naming similar to the
other variables down there, and also make it clear it's low level guts.

I see you asked for them to be removed on the original patch but all the other
vars in there are named that way.

> regular kexec as well as kdump.
> 
> The flag is needed because the SMP release mechanism for FSL book3e is
> different from when booting with normal hardware.  In theory we could
> simulate the normal spin table mechanism, but not at the addresses
> U-Boot put in the device tree -- so there'd need to be even more
> communication between the kernel and kexec to set that up.  Since
> there's already a similar flag being set (for kdump only), this seemed
> like a reasonable approach.

Yeah I guess it is. Obviously it'd be nicer if we didn't have to do it though.

> 
> Unlike __run_at_kexec in http://patchwork.ozlabs.org/patch/257657/
> ("book3e/kexec/kdump: introduce a kexec kernel flag"), this flag is at
> a fixed address for ABI stability, and actually gets set properly in
> the kdump case (i.e. on the crash kernel, not on the crashing kernel).
> 
> Signed-off-by: Scott Wood <scottwood@freescale.com>
> ---
> This depends on the kexec-tools patch "ppc64: Add a flag to tell the
> kernel it's booting from kexec":
> http://lists.infradead.org/pipermail/kexec/2015-July/014048.html
> ---
>  arch/powerpc/include/asm/smp.h    |  1 +
>  arch/powerpc/kernel/head_64.S     | 15 +++++++++++++++
>  arch/powerpc/kernel/setup_64.c    | 14 +++++++++++++-
>  arch/powerpc/platforms/85xx/smp.c | 16 ++++++++++++----
>  4 files changed, 41 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/smp.h b/arch/powerpc/include/asm/smp.h
> index 825663c..f9245df 100644
> --- a/arch/powerpc/include/asm/smp.h
> +++ b/arch/powerpc/include/asm/smp.h
> @@ -197,6 +197,7 @@ extern void generic_secondary_thread_init(void);
>  extern unsigned long __secondary_hold_spinloop;
>  extern unsigned long __secondary_hold_acknowledge;
>  extern char __secondary_hold;
> +extern u32 booted_from_kexec;
>  
>  extern void __early_start(void);
>  #endif /* __ASSEMBLY__ */
> diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S
> index 1b77956..ae2d6b5 100644
> --- a/arch/powerpc/kernel/head_64.S
> +++ b/arch/powerpc/kernel/head_64.S
> @@ -91,6 +91,21 @@ __secondary_hold_spinloop:
>  __secondary_hold_acknowledge:
>  	.llong	0x0
>  
> +	/* Do not move this variable as kexec-tools knows about it. */
> +	. = 0x58
> +	.globl	booted_from_kexec
> +booted_from_kexec:
> +	/*
> +	 * "nkxc" -- not (necessarily) from kexec by default
> +	 *
> +	 * This flag is set to 1 by a loader if the kernel is being
> +	 * booted by kexec.  Older kexec-tools don't know about this
> +	 * flag, so platforms other than fsl-book3e should treat a value
> +	 * of "nkxc" as inconclusive.  fsl-book3e relies on this to
> +	 * know how to release secondary cpus.
> +	 */
> +	.long	0x6e6b7863

Couldn't we say that "nkxc" (whatever that stands for) means "unknown", and
have kexec-tools write "yes" to indicate yes. I realise that's not 100% bullet
proof, but it seems like it would be good enough. And it would mean we could
use the flag on other platforms if we ever want to.

Also "nkxc" ? "bfkx" ?

> diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
> index 505ec2c..baeddcc 100644
> --- a/arch/powerpc/kernel/setup_64.c
> +++ b/arch/powerpc/kernel/setup_64.c
> @@ -340,11 +340,23 @@ void early_setup_secondary(void)
>  #endif /* CONFIG_SMP */
>  
>  #if defined(CONFIG_SMP) || defined(CONFIG_KEXEC)
> +static bool use_spinloop(void)
> +{
> +#ifdef CONFIG_PPC_FSL_BOOK3E
> +	return booted_from_kexec == 1;
> +#else
> +	return true;
> +#endif

Ugh, more ifdefs.

What about:

	return IS_ENABLED(CONFIG_PPC_FSL_BOOK3E) && (booted_from_kexec == 1);

If that works, I haven't checked. It's slightly less ugly?

> +}
> +
>  void smp_release_cpus(void)
>  {
>  	unsigned long *ptr;
>  	int i;
>  
> +	if (!use_spinloop())
> +		return;
> +
>  	DBG(" -> smp_release_cpus()\n");
>  
>  	/* All secondary cpus are spinning on a common spinloop, release them
> @@ -524,7 +536,7 @@ void __init setup_system(void)
>  	 * Freescale Book3e parts spin in a loop provided by firmware,
>  	 * so smp_release_cpus() does nothing for them
>  	 */
> -#if defined(CONFIG_SMP) && !defined(CONFIG_PPC_FSL_BOOK3E)
> +#if defined(CONFIG_SMP)

Can you make that just #ifdef CONFIG_SMP.

>  	/* Release secondary cpus out of their spinloops at 0x60 now that
>  	 * we can map physical -> logical CPU ids
>  	 */

cheers


* Re: [RFC,14/17] powerpc/book3e-64/kexec: Enable SMP release
  2015-08-18  4:51   ` [RFC,14/17] " Michael Ellerman
@ 2015-08-18  5:09     ` Scott Wood
  0 siblings, 0 replies; 25+ messages in thread
From: Scott Wood @ 2015-08-18  5:09 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: linuxppc-dev, Tiejun Chen, kexec

On Tue, 2015-08-18 at 14:51 +1000, Michael Ellerman wrote:
> On Sat, 2015-07-18 at 20:08:51 UTC, Scott Wood wrote:
> > booted_from_exec is similar to __run_at_load, except that it is set for
>               ^
>             missing k.
> 
> Also do you mind using __booted_from_kexec to keep the naming similar to the
> other variables down there, and also make it clear it's low level guts.
> 
> I see you asked for them to be removed on the original patch but all the 
> other
> vars in there are named that way.

I'm not a fan of it, as it doesn't distinguish it from a non-underscore
version, isn't there for namespacing reasons, and isn't even a private
implementation detail -- it's part of the interface with kexec-tools.  I'll
change it if you want, though.

> > diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S
> > index 1b77956..ae2d6b5 100644
> > --- a/arch/powerpc/kernel/head_64.S
> > +++ b/arch/powerpc/kernel/head_64.S
> > @@ -91,6 +91,21 @@ __secondary_hold_spinloop:
> >  __secondary_hold_acknowledge:
> >     .llong  0x0
> >  
> > +   /* Do not move this variable as kexec-tools knows about it. */
> > +   . = 0x58
> > +   .globl  booted_from_kexec
> > +booted_from_kexec:
> > +   /*
> > +    * "nkxc" -- not (necessarily) from kexec by default
> > +    *
> > +    * This flag is set to 1 by a loader if the kernel is being
> > +    * booted by kexec.  Older kexec-tools don't know about this
> > +    * flag, so platforms other than fsl-book3e should treat a value
> > +    * of "nkxc" as inconclusive.  fsl-book3e relies on this to
> > +    * know how to release secondary cpus.
> > +    */
> > +   .long   0x6e6b7863
> 
> Couldn't we say that "nkxc" (whatever that stands for)

It stands for "no kexec", which is true if that value is not overwritten.

>  means "unknown", and
> have kexec-tools write "yes" to indicate yes. I realise that's not 100% 
> bullet

That is what I implemented (other than "1" versus "yes").

> > diff --git a/arch/powerpc/kernel/setup_64.c 
> > b/arch/powerpc/kernel/setup_64.c
> > index 505ec2c..baeddcc 100644
> > --- a/arch/powerpc/kernel/setup_64.c
> > +++ b/arch/powerpc/kernel/setup_64.c
> > @@ -340,11 +340,23 @@ void early_setup_secondary(void)
> >  #endif /* CONFIG_SMP */
> >  
> >  #if defined(CONFIG_SMP) || defined(CONFIG_KEXEC)
> > +static bool use_spinloop(void)
> > +{
> > +#ifdef CONFIG_PPC_FSL_BOOK3E
> > +   return booted_from_kexec == 1;
> > +#else
> > +   return true;
> > +#endif
> 
> Ugh, more ifdefs.
> 
> What about:
> 
>       return IS_ENABLED(CONFIG_PPC_FSL_BOOK3E) && (booted_from_kexec == 1);
> 
> If that works, I haven't checked. It's slightly less ugly?

That would return "false" for non-book3e, which isn't correct.

If it has to be done with a single expression, then it'd be:

        return !IS_ENABLED(CONFIG_PPC_FSL_BOOK3E) || booted_from_kexec == 1;

-Scott


* Re: [RFC PATCH 14/17] powerpc/book3e-64/kexec: Enable SMP release
  2015-07-18 20:08 ` [RFC PATCH 14/17] powerpc/book3e-64/kexec: Enable SMP release Scott Wood
  2015-08-18  4:51   ` [RFC,14/17] " Michael Ellerman
@ 2015-08-20  4:54   ` Michael Ellerman
  2015-08-24 20:25     ` Scott Wood
  1 sibling, 1 reply; 25+ messages in thread
From: Michael Ellerman @ 2015-08-20  4:54 UTC (permalink / raw)
  To: Scott Wood; +Cc: linuxppc-dev, Tiejun Chen, kexec

Hi Scott,

Sorry for the delay. So I'm back to square one on this patch.

On Sat, 2015-07-18 at 15:08 -0500, Scott Wood wrote:
> booted_from_exec is similar to __run_at_load, except that it is set for
> regular kexec as well as kdump.
> 
> The flag is needed because the SMP release mechanism for FSL book3e is
> different from when booting with normal hardware.  In theory we could
> simulate the normal spin table mechanism, but not at the addresses
> U-Boot put in the device tree -- so there'd need to be even more
> communication between the kernel and kexec to set that up.  Since
> there's already a similar flag being set (for kdump only), this seemed
> like a reasonable approach.

Although this is a reasonable approach, I don't think it's the best approach.

AFAICS there's no reason why we can't use a device tree property for this, so I
think we should do that.

It avoids using up space in the low memory area, and also any ambiguities about
whether the value has been set or not.

The reason we used a flag like this for __run_at_load is we need to access that
very early, ie. before we've even relocated the kernel, well before we have
(easy) access to the flattened device tree.

In contrast for this, you don't need to know you're booted from kexec until
much later, so using a device tree property is cleaner and just as easy.

If you want to call it "linux,kexec-boot" or similar that's fine. Equally you
could make it more specific, something like "fsl,avoid-spin-table".
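[Archive note: a sketch of what such a property might look like in the device tree passed by kexec-tools. The /chosen placement and both property names are only the suggestions floated in this message, not an accepted binding:]

```
/ {
	chosen {
		/* set by kexec-tools when staging the new kernel;
		   "fsl,avoid-spin-table" is the alternative name above */
		linux,kexec-boot;
	};
};
```

On the kernel side this could presumably be tested as a boolean property on the /chosen node (e.g. via of_property_read_bool()) once the device tree is available, which is the point of Michael's argument that the check isn't needed before unflattening.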


Also below ...

> diff --git a/arch/powerpc/platforms/85xx/smp.c b/arch/powerpc/platforms/85xx/smp.c
> index 5152289..4abda43 100644
> --- a/arch/powerpc/platforms/85xx/smp.c
> +++ b/arch/powerpc/platforms/85xx/smp.c
> @@ -305,10 +310,13 @@ static int smp_85xx_kick_cpu(int nr)
>  		__secondary_hold_acknowledge = -1;
>  	}
>  #endif
> -	flush_spin_table(spin_table);
> -	out_be32(&spin_table->pir, hw_cpu);
> -	out_be32(&spin_table->addr_l, __pa(__early_start));
> -	flush_spin_table(spin_table);
> +
> +	if (have_spin_table) {
> +		flush_spin_table(spin_table);
> +		out_be32(&spin_table->pir, hw_cpu);
> +		out_be32(&spin_table->addr_l, __pa(__early_start));
> +		flush_spin_table(spin_table);
> +	}
>  
>  	/* Wait a bit for the CPU to ack. */
>  	if (!spin_event_timeout(__secondary_hold_acknowledge == hw_cpu,

This looks like it's inside an #ifdef CONFIG_PPC32 block, which doesn't make
sense, so I must be missing a lead-up patch or something? (I looked on the list
but didn't find anything immediately)

cheers


* Re: [RFC PATCH 14/17] powerpc/book3e-64/kexec: Enable SMP release
  2015-08-20  4:54   ` [RFC PATCH 14/17] " Michael Ellerman
@ 2015-08-24 20:25     ` Scott Wood
  2015-08-25  1:57       ` Michael Ellerman
  0 siblings, 1 reply; 25+ messages in thread
From: Scott Wood @ 2015-08-24 20:25 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: linuxppc-dev, Tiejun Chen, kexec

On Thu, 2015-08-20 at 14:54 +1000, Michael Ellerman wrote:
> Hi Scott,
> 
> Sorry for the delay. So I'm back to square one on this patch.
> 
> On Sat, 2015-07-18 at 15:08 -0500, Scott Wood wrote:
> > booted_from_exec is similar to __run_at_load, except that it is set for
> > regular kexec as well as kdump.
> > 
> > The flag is needed because the SMP release mechanism for FSL book3e is
> > different from when booting with normal hardware.  In theory we could
> > simulate the normal spin table mechanism, but not at the addresses
> > U-Boot put in the device tree -- so there'd need to be even more
> > communication between the kernel and kexec to set that up.  Since
> > there's already a similar flag being set (for kdump only), this seemed
> > like a reasonable approach.
> 
> Although this is a reasonable approach, I don't think it's the best 
> approach.
> 
> AFAICS there's no reason why we can't use a device tree property for this, 
> so I
> think we should do that.

OK, I'll look into that.

> 
> > diff --git a/arch/powerpc/platforms/85xx/smp.c 
> > b/arch/powerpc/platforms/85xx/smp.c
> > index 5152289..4abda43 100644
> > --- a/arch/powerpc/platforms/85xx/smp.c
> > +++ b/arch/powerpc/platforms/85xx/smp.c
> > @@ -305,10 +310,13 @@ static int smp_85xx_kick_cpu(int nr)
> >             __secondary_hold_acknowledge = -1;
> >     }
> >  #endif
> > -   flush_spin_table(spin_table);
> > -   out_be32(&spin_table->pir, hw_cpu);
> > -   out_be32(&spin_table->addr_l, __pa(__early_start));
> > -   flush_spin_table(spin_table);
> > +
> > +   if (have_spin_table) {
> > +           flush_spin_table(spin_table);
> > +           out_be32(&spin_table->pir, hw_cpu);
> > +           out_be32(&spin_table->addr_l, __pa(__early_start));
> > +           flush_spin_table(spin_table);
> > +   }
> >  
> >     /* Wait a bit for the CPU to ack. */
> >     if (!spin_event_timeout(__secondary_hold_acknowledge == hw_cpu,
> 
> This looks like it's inside an #ifdef CONFIG_PPC32 block, which doesn't make
> sense, so I must be missing a lead-up patch or something? (I looked on the 
> list
> but didn't find anything immediately)

Thanks for catching this.

This is apparently a mismerge due to the code having been previously worked 
on in the context of the SDK tree, which does not have that code inside 
#ifdef CONFIG_PPC32.  When I then applied the result to mainline, everything 
still appeared to work, because there's no real consequence to writing to the 
spin table in this case -- it's just a no-op.  setup_64.c is the part where 
checking booted_from_kexec (or devicetree equivalent) really matters.

-Scott


* Re: [RFC PATCH 14/17] powerpc/book3e-64/kexec: Enable SMP release
  2015-08-24 20:25     ` Scott Wood
@ 2015-08-25  1:57       ` Michael Ellerman
  2015-08-25 23:40         ` Scott Wood
  0 siblings, 1 reply; 25+ messages in thread
From: Michael Ellerman @ 2015-08-25  1:57 UTC (permalink / raw)
  To: Scott Wood; +Cc: linuxppc-dev, Tiejun Chen, kexec

On Mon, 2015-08-24 at 15:25 -0500, Scott Wood wrote:
> On Thu, 2015-08-20 at 14:54 +1000, Michael Ellerman wrote:
> > Hi Scott,
> > 
> > Sorry for the delay. So I'm back to square one on this patch.
> > 
> > On Sat, 2015-07-18 at 15:08 -0500, Scott Wood wrote:
> > > booted_from_exec is similar to __run_at_load, except that it is set for
> > > regular kexec as well as kdump.
> > > 
> > > The flag is needed because the SMP release mechanism for FSL book3e is
> > > different from when booting with normal hardware.  In theory we could
> > > simulate the normal spin table mechanism, but not at the addresses
> > > U-Boot put in the device tree -- so there'd need to be even more
> > > communication between the kernel and kexec to set that up.  Since
> > > there's already a similar flag being set (for kdump only), this seemed
> > > like a reasonable approach.
> > 
> > Although this is a reasonable approach, I don't think it's the best 
> > approach.
> > 
> > AFAICS there's no reason why we can't use a device tree property for this, 
> > so I think we should do that.
> 
> OK, I'll look into that.

Thanks.

> > > diff --git a/arch/powerpc/platforms/85xx/smp.c 
> > > b/arch/powerpc/platforms/85xx/smp.c
> > > index 5152289..4abda43 100644
> > > --- a/arch/powerpc/platforms/85xx/smp.c
> > > +++ b/arch/powerpc/platforms/85xx/smp.c
> > > @@ -305,10 +310,13 @@ static int smp_85xx_kick_cpu(int nr)
> > >             __secondary_hold_acknowledge = -1;
> > >     }
> > >  #endif
> > > -   flush_spin_table(spin_table);
> > > -   out_be32(&spin_table->pir, hw_cpu);
> > > -   out_be32(&spin_table->addr_l, __pa(__early_start));
> > > -   flush_spin_table(spin_table);
> > > +
> > > +   if (have_spin_table) {
> > > +           flush_spin_table(spin_table);
> > > +           out_be32(&spin_table->pir, hw_cpu);
> > > +           out_be32(&spin_table->addr_l, __pa(__early_start));
> > > +           flush_spin_table(spin_table);
> > > +   }
> > >  
> > >     /* Wait a bit for the CPU to ack. */
> > >     if (!spin_event_timeout(__secondary_hold_acknowledge == hw_cpu,
> > 
> > This looks like it's inside an #ifdef CONFIG_PPC32 block, which doesn't make
> > sense, so I must be missing a lead-up patch or something? (I looked on the 
> > list
> > but didn't find anything immediately)
> 
> Thanks for catching this.
> 
> This is apparently a mismerge due to the code having been previously worked 
> on in the context of the SDK tree, which does not have that code inside 
> #ifdef CONFIG_PPC32.  When I then applied the result to mainline, everything 
> still appeared to work, because there's no real consequence to writing to the 
> spin table in this case -- it's just a no-op.  

Aha, that's good, I stared at it for ages thinking I was going mad, but I wasn't!

> setup_64.c is the part where checking booted_from_kexec (or devicetree
> equivalent) really matters.

OK. Can we avoid that too?

All smp_release_cpus() does is whack __secondary_hold_spinloop and then spin
for a while. For the non-kexec case writing to __secondary_hold_spinloop should
be harmless I think, so the only problem is we'll get stuck for a while in the
udelay() loop.

But you could avoid that by preemptively setting spinning_secondaries to 0 in
platform code.

That'd have to be in ppc_md.init_early(), but that's actually not very early,
the device tree is already unflattened.

I guess it's arguable whether that's more or less horrible than adding an
#ifdef'ed booted_from_kexec check, but I think I'd prefer the
spinning_secondaries solution.

cheers


* Re: [RFC PATCH 14/17] powerpc/book3e-64/kexec: Enable SMP release
  2015-08-25  1:57       ` Michael Ellerman
@ 2015-08-25 23:40         ` Scott Wood
  2015-08-26  1:13           ` Michael Ellerman
  0 siblings, 1 reply; 25+ messages in thread
From: Scott Wood @ 2015-08-25 23:40 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: linuxppc-dev, Tiejun Chen, kexec

On Tue, 2015-08-25 at 11:57 +1000, Michael Ellerman wrote:
> On Mon, 2015-08-24 at 15:25 -0500, Scott Wood wrote:
> > On Thu, 2015-08-20 at 14:54 +1000, Michael Ellerman wrote:
> > > Hi Scott,
> > > 
> > > Sorry for the delay. So I'm back to square one on this patch.
> > > 
> > > On Sat, 2015-07-18 at 15:08 -0500, Scott Wood wrote:
> > > > booted_from_exec is similar to __run_at_load, except that it is set 
> > > > for
> > > > regular kexec as well as kdump.
> > > > 
> > > > The flag is needed because the SMP release mechanism for FSL book3e is
> > > > different from when booting with normal hardware.  In theory we could
> > > > simulate the normal spin table mechanism, but not at the addresses
> > > > U-Boot put in the device tree -- so there'd need to be even more
> > > > communication between the kernel and kexec to set that up.  Since
> > > > there's already a similar flag being set (for kdump only), this seemed
> > > > like a reasonable approach.
> > > 
> > > Although this is a reasonable approach, I don't think it's the best 
> > > approach.
> > > 
> > > AFAICS there's no reason why we can't use a device tree property for 
> > > this, 
> > > so I think we should do that.
> > 
> > OK, I'll look into that.
> 
> Thanks.
> 
> > > > diff --git a/arch/powerpc/platforms/85xx/smp.c 
> > > > b/arch/powerpc/platforms/85xx/smp.c
> > > > index 5152289..4abda43 100644
> > > > --- a/arch/powerpc/platforms/85xx/smp.c
> > > > +++ b/arch/powerpc/platforms/85xx/smp.c
> > > > @@ -305,10 +310,13 @@ static int smp_85xx_kick_cpu(int nr)
> > > >             __secondary_hold_acknowledge = -1;
> > > >     }
> > > >  #endif
> > > > -   flush_spin_table(spin_table);
> > > > -   out_be32(&spin_table->pir, hw_cpu);
> > > > -   out_be32(&spin_table->addr_l, __pa(__early_start));
> > > > -   flush_spin_table(spin_table);
> > > > +
> > > > +   if (have_spin_table) {
> > > > +           flush_spin_table(spin_table);
> > > > +           out_be32(&spin_table->pir, hw_cpu);
> > > > +           out_be32(&spin_table->addr_l, __pa(__early_start));
> > > > +           flush_spin_table(spin_table);
> > > > +   }
> > > >  
> > > >     /* Wait a bit for the CPU to ack. */
> > > >     if (!spin_event_timeout(__secondary_hold_acknowledge == hw_cpu,
> > > 
> > > This looks like it's inside an #ifdef CONFIG_PPC32 block, which doesn't 
> > > make
> > > sense, so I must be missing a lead-up patch or something? (I looked on 
> > > the 
> > > list
> > > but didn't find anything immediately)
> > 
> > Thanks for catching this.
> > 
> > This is apparently a mismerge due to the code having been previously 
> > worked 
> > on in the context of the SDK tree, which does not have that code inside 
> > #ifdef CONFIG_PPC32.  When I then applied the result to mainline, 
> > everything 
> > still appeared to work, because there's no real consequence to writing to 
> > the 
> > spin table in this case -- it's just a no-op.  
> 
> Aha, that's good, I stared at it for ages thinking I was going mad, but I 
> wasn't!
> 
> > setup_64.c is the part where checking booted_from_kexec (or devicetree
> > equivalent) really matters.
> 
> OK. Can we avoid that too?
> 
> All smp_release_cpus() does is whack __secondary_hold_spinloop and then spin
> for a while. For the non-kexec case writing to __secondary_hold_spinloop 
> should
> be harmless I think, so the only problem is we'll get stuck for a while in 
> the
> udelay() loop.
> 
> But you could avoid that by preemptively setting spinning_secondaries to 0 
> in
> platform code.
> 
> That'd have to be in ppc_md.init_early(), but that's actually not very 
> early,
> the device tree is already unflattened.
> 
> I guess it's arguable whether that's more or less horrible than adding an
> #ifdef'ed booted_from_kexec check, but I think I'd prefer the
> spinning_secondaries solution.

We'd still need the device tree property regardless of whether we keep 
use_spinloop() or set spinning_secondaries to zero.

use_spinloop() (with a device tree property rather than booted_from_kexec) 
seems cleaner:
 - Avoids depending on the fact that some piece of platform code executes 
after spinning_secondaries is initialized but before smp_release_cpus().
 - Doesn't put a different requirement on platform code based on 32 versus 64 
bit (we have too many 32 versus 64 bit differences as is).
 - Doesn't require the change in all relevant platform code files (we have 
both corenet_generic and qemu_e500, both of which support both 32 and 64 bit, 
and custom boards might not all use corenet_generic), whether the platform 
supports kexec or not.  It doesn't look like there's any non-Freescale
book3e-64 left in the kernel[1], but if it ever gets added, it would also be 
affected by a solution that requires platform code to do something to 
preserve the current behavior.

-Scott

[1] If this is true, and won't likely change, can the non-fsl book3e-64 TLB 
miss handlers and such come out?


* Re: [RFC PATCH 14/17] powerpc/book3e-64/kexec: Enable SMP release
  2015-08-25 23:40         ` Scott Wood
@ 2015-08-26  1:13           ` Michael Ellerman
  0 siblings, 0 replies; 25+ messages in thread
From: Michael Ellerman @ 2015-08-26  1:13 UTC (permalink / raw)
  To: Scott Wood; +Cc: linuxppc-dev, Tiejun Chen, kexec

On Tue, 2015-08-25 at 18:40 -0500, Scott Wood wrote:
> On Tue, 2015-08-25 at 11:57 +1000, Michael Ellerman wrote:
> > I guess it's arguable whether that's more or less horrible than adding an
> > #ifdef'ed booted_from_kexec check, but I think I'd prefer the
> > spinning_secondaries solution.
> 
> We'd still need the device tree property regardless of whether we keep 
> use_spinloop() or set spinning_secondaries to zero.

Yep.

> use_spinloop() (with a device tree property rather than booted_from_kexec) 
> seems cleaner:
>  - Avoids depending on the fact that some piece of platform code executes 
> after spinning_secondaries is initialized but before smp_release_cpus().

True, that is a bit fragile.

>  - Doesn't put a different requirement on platform code based on 32 versus 64 
> bit (we have too many 32 versus 64 bit differences as is).

Yeah I didn't think of that.

>  - Doesn't require the change in all relevant platform code files (we have 
> both corenet_generic and qemu_e500, both of which support both 32 and 64 bit, 
> and custom boards might not all use corenet_generic), whether the platform 
> supports kexec or not.

Yep, though they could all call a common implementation of init_early().

So I guess do it with use_spinloop(). I was just hoping to avoid more platform
specific ifdefs in the "generic" code.

> It doesn't look like there's any non-Freescale book3e-64 left in the kernel[1]
...
> [1] If this is true, and won't likely change, can the non-fsl book3e-64 TLB 
> miss handlers and such come out?

It is true, see fb5a515704d7 ("powerpc: Remove platforms/wsp and associated
pieces").

It will not change as far as I'm aware, and all the code's in the git history
anyway, so if there's unused code in there please rip it out.

cheers


end of thread, other threads:[~2015-08-26  1:13 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-07-18 20:08 [RFC PATCH 00/17] powerpc/fsl-book3e-64: kexec/kdump support Scott Wood
2015-07-18 20:08 ` [RFC PATCH 01/17] powerpc/85xx: Load all early TLB entries at once Scott Wood
2015-07-18 20:08 ` [RFC PATCH 02/17] powerpc/85xx: Don't use generic timebase sync on 64-bit Scott Wood
2015-07-18 20:08 ` [RFC PATCH 03/17] crypto: caam: Blacklist CAAM when kexec is enabled Scott Wood
2015-07-18 20:08 ` [RFC PATCH 04/17] powerpc/fsl-corenet: Disable coreint if " Scott Wood
2015-07-18 20:08 ` [RFC PATCH 05/17] powerpc/fsl-booke-64: Don't limit ppc64_rma_size to one TLB entry Scott Wood
2015-07-18 20:08 ` [RFC PATCH 06/17] powerpc/85xx: Implement 64-bit kexec support Scott Wood
2015-07-18 20:08 ` [RFC PATCH 07/17] powerpc/e6500: kexec: Handle hardware threads Scott Wood
2015-07-18 20:08 ` [RFC PATCH 08/17] powerpc/book3e-64: rename interrupt_end_book3e with __end_interrupts Scott Wood
2015-07-18 20:08 ` [RFC PATCH 09/17] powerpc/booke64: Fix args to copy_and_flush Scott Wood
2015-07-18 20:08 ` [RFC PATCH 10/17] powerpc/book3e: support CONFIG_RELOCATABLE Scott Wood
2015-07-18 20:08 ` [RFC PATCH 11/17] powerpc/book3e/kdump: Enable crash_kexec_wait_realmode Scott Wood
2015-07-18 20:08 ` [RFC PATCH 12/17] powerpc/book3e-64: Don't limit paca to 256 MiB Scott Wood
2015-07-18 20:08 ` [RFC PATCH 13/17] powerpc/book3e-64/kexec: create an identity TLB mapping Scott Wood
2015-07-18 20:08 ` [RFC PATCH 14/17] powerpc/book3e-64/kexec: Enable SMP release Scott Wood
2015-08-18  4:51   ` [RFC,14/17] " Michael Ellerman
2015-08-18  5:09     ` Scott Wood
2015-08-20  4:54   ` [RFC PATCH 14/17] " Michael Ellerman
2015-08-24 20:25     ` Scott Wood
2015-08-25  1:57       ` Michael Ellerman
2015-08-25 23:40         ` Scott Wood
2015-08-26  1:13           ` Michael Ellerman
2015-07-18 20:08 ` [RFC PATCH 15/17] powerpc/booke: Only use VIRT_PHYS_OFFSET on booke32 Scott Wood
2015-07-18 20:08 ` [RFC PATCH 16/17] powerpc/book3e-64/kexec: Set "r4 = 0" when entering spinloop Scott Wood
2015-07-18 20:08 ` [RFC PATCH 17/17] powerpc/book3e-64: Enable kexec Scott Wood
