All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH stable 4.6 0/4] powerpc patches for 4.6 stable
@ 2016-08-09 10:38 Michael Ellerman
  2016-08-09 10:38 ` [PATCH stable 4.6 1/4] powerpc/eeh: Fix invalid cached PE primary bus Michael Ellerman
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Michael Ellerman @ 2016-08-09 10:38 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: gwshan, naveen.n.rao, Michael Neuling

This is the set of patches I have so far tagged for 4.6 stable.

Please let me know if any of these *shouldn't* go to stable.

Gavin Shan (2):
  powerpc/eeh: Fix invalid cached PE primary bus
  powerpc/eeh: Fix wrong argument passed to eeh_rmv_device()

Michael Neuling (1):
  powerpc/tm: Avoid SLB faults in treclaim/trecheckpoint when RI=0

Naveen N. Rao (1):
  powerpc/bpf/jit: Disable classic BPF JIT on ppc64le

 arch/powerpc/Kconfig             |  2 +-
 arch/powerpc/kernel/eeh_driver.c |  9 +++---
 arch/powerpc/kernel/tm.S         | 61 +++++++++++++++++++++++++++++-----------
 3 files changed, 50 insertions(+), 22 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH stable 4.6 1/4] powerpc/eeh: Fix invalid cached PE primary bus
  2016-08-09 10:38 [PATCH stable 4.6 0/4] powerpc patches for 4.6 stable Michael Ellerman
@ 2016-08-09 10:38 ` Michael Ellerman
  2016-08-09 10:38 ` [PATCH stable 4.6 2/4] powerpc/bpf/jit: Disable classic BPF JIT on ppc64le Michael Ellerman
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Michael Ellerman @ 2016-08-09 10:38 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: gwshan, naveen.n.rao, Michael Neuling

From: Gavin Shan <gwshan@linux.vnet.ibm.com>

commit a3aa256b7258b3d19f8b44557cc64525a993b941 upstream.

Backport required due to rename of pcibios_add_pci_devices() to
pci_hp_add_devices() upstream - mpe.

The PE primary bus cannot be got from its child devices when having
full hotplug in error recovery. The PE primary bus is cached, which
is done in commit <05ba75f84864> ("powerpc/eeh: Fix stale cached primary
bus"). In eeh_reset_device(), the flag (EEH_PE_PRI_BUS) is cleared
before the PCI hot remove. eeh_pe_bus_get() then returns NULL as the
PE primary bus in pnv_eeh_reset() and it crashes the kernel eventually.

This fixes the issue by clearing the flag (EEH_PE_PRI_BUS) before the
PCI hot add. With it, the PowerNV EEH reset backend (pnv_eeh_reset())
can get valid PE primary bus through eeh_pe_bus_get().

Fixes: 67086e32b564 ("powerpc/eeh: powerpc/eeh: Support error recovery for VF PE")
Reported-by: Pridhiviraj Paidipeddi <ppaiddipe@in.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
 arch/powerpc/kernel/eeh_driver.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
index 31e4c7e1a4b4..c42627645b54 100644
--- a/arch/powerpc/kernel/eeh_driver.c
+++ b/arch/powerpc/kernel/eeh_driver.c
@@ -648,7 +648,6 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus,
 		if (pe->type & EEH_PE_VF) {
 			eeh_pe_dev_traverse(pe, eeh_rmv_device, NULL);
 		} else {
-			eeh_pe_state_clear(pe, EEH_PE_PRI_BUS);
 			pci_lock_rescan_remove();
 			pcibios_remove_pci_devices(bus);
 			pci_unlock_rescan_remove();
@@ -698,10 +697,12 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus,
 		 */
 		edev = list_first_entry(&pe->edevs, struct eeh_dev, list);
 		eeh_pe_traverse(pe, eeh_pe_detach_dev, NULL);
-		if (pe->type & EEH_PE_VF)
+		if (pe->type & EEH_PE_VF) {
 			eeh_add_virt_device(edev, NULL);
-		else
+		} else {
+			eeh_pe_state_clear(pe, EEH_PE_PRI_BUS);
 			pcibios_add_pci_devices(bus);
+		}
 	} else if (frozen_bus && rmv_data->removed) {
 		pr_info("EEH: Sleep 5s ahead of partial hotplug\n");
 		ssleep(5);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH stable 4.6 2/4] powerpc/bpf/jit: Disable classic BPF JIT on ppc64le
  2016-08-09 10:38 [PATCH stable 4.6 0/4] powerpc patches for 4.6 stable Michael Ellerman
  2016-08-09 10:38 ` [PATCH stable 4.6 1/4] powerpc/eeh: Fix invalid cached PE primary bus Michael Ellerman
@ 2016-08-09 10:38 ` Michael Ellerman
  2016-08-09 10:38 ` [PATCH stable 4.6 3/4] powerpc/eeh: Fix wrong argument passed to eeh_rmv_device() Michael Ellerman
  2016-08-09 10:38 ` [PATCH stable 4.6 4/4] powerpc/tm: Avoid SLB faults in treclaim/trecheckpoint when RI=0 Michael Ellerman
  3 siblings, 0 replies; 5+ messages in thread
From: Michael Ellerman @ 2016-08-09 10:38 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: gwshan, naveen.n.rao, Michael Neuling

From: "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com>

commit 844e3be47693f92a108cb1fb3b0606bf25e9c7a6 upstream.

Backport required due to rename of HAVE_BPF_JIT to HAVE_CBPF_JIT
upstream - mpe.

Classic BPF JIT was never ported completely to work on little endian
powerpc. However, it can be enabled and will crash the system when used.
As such, disable use of BPF JIT on ppc64le.

Fixes: 7c105b63bd98 ("powerpc: Add CONFIG_CPU_LITTLE_ENDIAN kernel config option.")
Reported-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Acked-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
 arch/powerpc/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 7cd32c038286..4e1b060b8481 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -126,7 +126,7 @@ config PPC
 	select IRQ_FORCED_THREADING
 	select HAVE_RCU_TABLE_FREE if SMP
 	select HAVE_SYSCALL_TRACEPOINTS
-	select HAVE_BPF_JIT
+	select HAVE_BPF_JIT if CPU_BIG_ENDIAN
 	select HAVE_ARCH_JUMP_LABEL
 	select ARCH_HAVE_NMI_SAFE_CMPXCHG
 	select ARCH_HAS_GCOV_PROFILE_ALL
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH stable 4.6 3/4] powerpc/eeh: Fix wrong argument passed to eeh_rmv_device()
  2016-08-09 10:38 [PATCH stable 4.6 0/4] powerpc patches for 4.6 stable Michael Ellerman
  2016-08-09 10:38 ` [PATCH stable 4.6 1/4] powerpc/eeh: Fix invalid cached PE primary bus Michael Ellerman
  2016-08-09 10:38 ` [PATCH stable 4.6 2/4] powerpc/bpf/jit: Disable classic BPF JIT on ppc64le Michael Ellerman
@ 2016-08-09 10:38 ` Michael Ellerman
  2016-08-09 10:38 ` [PATCH stable 4.6 4/4] powerpc/tm: Avoid SLB faults in treclaim/trecheckpoint when RI=0 Michael Ellerman
  3 siblings, 0 replies; 5+ messages in thread
From: Michael Ellerman @ 2016-08-09 10:38 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: gwshan, naveen.n.rao, Michael Neuling

From: Gavin Shan <gwshan@linux.vnet.ibm.com>

commit cca0e542e02e48cce541a49c4046ec094ec27c1e upstream.

When calling eeh_rmv_device() in eeh_reset_device() for partial hotplug
case, @rmv_data instead of its address is the proper argument.
Otherwise, the stack frame is corrupted when writing to
@rmv_data (actually its address) in eeh_rmv_device(). It results in
kernel crash as observed.

This fixes the issue by passing @rmv_data, not its address to
eeh_rmv_device() in eeh_reset_device().

Fixes: 67086e32b564 ("powerpc/eeh: powerpc/eeh: Support error recovery for VF PE")
Reported-by: Pridhiviraj Paidipeddi <ppaidipe@in.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
 arch/powerpc/kernel/eeh_driver.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
index c42627645b54..4a27a9684442 100644
--- a/arch/powerpc/kernel/eeh_driver.c
+++ b/arch/powerpc/kernel/eeh_driver.c
@@ -653,7 +653,7 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus,
 			pci_unlock_rescan_remove();
 		}
 	} else if (frozen_bus) {
-		eeh_pe_dev_traverse(pe, eeh_rmv_device, &rmv_data);
+		eeh_pe_dev_traverse(pe, eeh_rmv_device, rmv_data);
 	}
 
 	/*
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH stable 4.6 4/4] powerpc/tm: Avoid SLB faults in treclaim/trecheckpoint when RI=0
  2016-08-09 10:38 [PATCH stable 4.6 0/4] powerpc patches for 4.6 stable Michael Ellerman
                   ` (2 preceding siblings ...)
  2016-08-09 10:38 ` [PATCH stable 4.6 3/4] powerpc/eeh: Fix wrong argument passed to eeh_rmv_device() Michael Ellerman
@ 2016-08-09 10:38 ` Michael Ellerman
  3 siblings, 0 replies; 5+ messages in thread
From: Michael Ellerman @ 2016-08-09 10:38 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: gwshan, naveen.n.rao, Michael Neuling

From: Michael Neuling <mikey@neuling.org>

commit 190ce8693c23eae09ba5f303a83bf2fbeb6478b1 upstream.

Currently we have 2 segments that are bolted for the kernel linear
mapping (ie 0xc000... addresses). This is 0 to 1TB and also the kernel
stacks. Anything accessed outside of these regions may need to be
faulted in. (In practice machines with TM always have 1T segments)

If a machine has < 2TB of memory we never fault on the kernel linear
mapping as these two segments cover all physical memory. If a machine
has > 2TB of memory, there may be structures outside of these two
segments that need to be faulted in. This faulting can occur when
running as a guest as the hypervisor may remove any SLB that's not
bolted.

When we treclaim and trecheckpoint we have a window where we need to
run with the userspace GPRs. This means that we no longer have a valid
stack pointer in r1. For this window we therefore clear MSR RI to
indicate that any exceptions taken at this point won't be able to be
handled. This means that we can't take segment misses in this RI=0
window.

In this RI=0 region, we currently access the thread_struct for the
process being context switched to or from. This thread_struct access
may cause a segment fault since it's not guaranteed to be covered by
the two bolted segment entries described above.

We've seen this with a crash when running as a guest with > 2TB of
memory on PowerVM:

  Unrecoverable exception 4100 at c00000000004f138
  Oops: Unrecoverable exception, sig: 6 [#1]
  SMP NR_CPUS=2048 NUMA pSeries
  CPU: 1280 PID: 7755 Comm: kworker/1280:1 Tainted: G                 X 4.4.13-46-default #1
  task: c000189001df4210 ti: c000189001d5c000 task.ti: c000189001d5c000
  NIP: c00000000004f138 LR: 0000000010003a24 CTR: 0000000010001b20
  REGS: c000189001d5f730 TRAP: 4100   Tainted: G                 X  (4.4.13-46-default)
  MSR: 8000000100001031 <SF,ME,IR,DR,LE>  CR: 24000048  XER: 00000000
  CFAR: c00000000004ed18 SOFTE: 0
  GPR00: ffffffffc58d7b60 c000189001d5f9b0 00000000100d7d00 000000003a738288
  GPR04: 0000000000002781 0000000000000006 0000000000000000 c0000d1f4d889620
  GPR08: 000000000000c350 00000000000008ab 00000000000008ab 00000000100d7af0
  GPR12: 00000000100d7ae8 00003ffe787e67a0 0000000000000000 0000000000000211
  GPR16: 0000000010001b20 0000000000000000 0000000000800000 00003ffe787df110
  GPR20: 0000000000000001 00000000100d1e10 0000000000000000 00003ffe787df050
  GPR24: 0000000000000003 0000000000010000 0000000000000000 00003fffe79e2e30
  GPR28: 00003fffe79e2e68 00000000003d0f00 00003ffe787e67a0 00003ffe787de680
  NIP [c00000000004f138] restore_gprs+0xd0/0x16c
  LR [0000000010003a24] 0x10003a24
  Call Trace:
  [c000189001d5f9b0] [c000189001d5f9f0] 0xc000189001d5f9f0 (unreliable)
  [c000189001d5fb90] [c00000000001583c] tm_recheckpoint+0x6c/0xa0
  [c000189001d5fbd0] [c000000000015c40] __switch_to+0x2c0/0x350
  [c000189001d5fc30] [c0000000007e647c] __schedule+0x32c/0x9c0
  [c000189001d5fcb0] [c0000000007e6b58] schedule+0x48/0xc0
  [c000189001d5fce0] [c0000000000deabc] worker_thread+0x22c/0x5b0
  [c000189001d5fd80] [c0000000000e7000] kthread+0x110/0x130
  [c000189001d5fe30] [c000000000009538] ret_from_kernel_thread+0x5c/0xa4
  Instruction dump:
  7cb103a6 7cc0e3a6 7ca222a6 78a58402 38c00800 7cc62838 08860000 7cc000a6
  38a00006 78c60022 7cc62838 0b060000 <e8c701a0> 7ccff120 e8270078 e8a70098
  ---[ end trace 602126d0a1dedd54 ]---

This fixes this by copying the required data from the thread_struct to
the stack before we clear MSR RI. Then once we clear RI, we only access
the stack, guaranteeing there's no segment miss.

We also tighten the region over which we set RI=0 on the treclaim()
path. This may have a slight performance impact since we're adding an
mtmsr instruction.

Fixes: 090b9284d725 ("powerpc/tm: Clear MSR RI in non-recoverable TM code")
Signed-off-by: Michael Neuling <mikey@neuling.org>
Reviewed-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
 arch/powerpc/kernel/tm.S | 61 ++++++++++++++++++++++++++++++++++--------------
 1 file changed, 44 insertions(+), 17 deletions(-)

diff --git a/arch/powerpc/kernel/tm.S b/arch/powerpc/kernel/tm.S
index bf8f34a58670..b7019b559ddb 100644
--- a/arch/powerpc/kernel/tm.S
+++ b/arch/powerpc/kernel/tm.S
@@ -110,17 +110,11 @@ _GLOBAL(tm_reclaim)
 	std	r3, STK_PARAM(R3)(r1)
 	SAVE_NVGPRS(r1)
 
-	/* We need to setup MSR for VSX register save instructions.  Here we
-	 * also clear the MSR RI since when we do the treclaim, we won't have a
-	 * valid kernel pointer for a while.  We clear RI here as it avoids
-	 * adding another mtmsr closer to the treclaim.  This makes the region
-	 * maked as non-recoverable wider than it needs to be but it saves on
-	 * inserting another mtmsrd later.
-	 */
+	/* We need to setup MSR for VSX register save instructions. */
 	mfmsr	r14
 	mr	r15, r14
 	ori	r15, r15, MSR_FP
-	li	r16, MSR_RI
+	li	r16, 0
 	ori	r16, r16, MSR_EE /* IRQs hard off */
 	andc	r15, r15, r16
 	oris	r15, r15, MSR_VEC@h
@@ -176,7 +170,17 @@ dont_backup_fp:
 1:	tdeqi   r6, 0
 	EMIT_BUG_ENTRY 1b,__FILE__,__LINE__,0
 
-	/* The moment we treclaim, ALL of our GPRs will switch
+	/* Clear MSR RI since we are about to change r1, EE is already off. */
+	li	r4, 0
+	mtmsrd	r4, 1
+
+	/*
+	 * BE CAREFUL HERE:
+	 * At this point we can't take an SLB miss since we have MSR_RI
+	 * off. Load only to/from the stack/paca which are in SLB bolted regions
+	 * until we turn MSR RI back on.
+	 *
+	 * The moment we treclaim, ALL of our GPRs will switch
 	 * to user register state.  (FPRs, CCR etc. also!)
 	 * Use an sprg and a tm_scratch in the PACA to shuffle.
 	 */
@@ -197,6 +201,11 @@ dont_backup_fp:
 
 	/* Store the PPR in r11 and reset to decent value */
 	std	r11, GPR11(r1)			/* Temporary stash */
+
+	/* Reset MSR RI so we can take SLB faults again */
+	li	r11, MSR_RI
+	mtmsrd	r11, 1
+
 	mfspr	r11, SPRN_PPR
 	HMT_MEDIUM
 
@@ -397,11 +406,6 @@ restore_gprs:
 	ld	r5, THREAD_TM_DSCR(r3)
 	ld	r6, THREAD_TM_PPR(r3)
 
-	/* Clear the MSR RI since we are about to change R1.  EE is already off
-	 */
-	li	r4, 0
-	mtmsrd	r4, 1
-
 	REST_GPR(0, r7)				/* GPR0 */
 	REST_2GPRS(2, r7)			/* GPR2-3 */
 	REST_GPR(4, r7)				/* GPR4 */
@@ -439,10 +443,33 @@ restore_gprs:
 	ld	r6, _CCR(r7)
 	mtcr    r6
 
-	REST_GPR(1, r7)				/* GPR1 */
-	REST_GPR(5, r7)				/* GPR5-7 */
 	REST_GPR(6, r7)
-	ld	r7, GPR7(r7)
+
+	/*
+	 * Store r1 and r5 on the stack so that we can access them
+	 * after we clear MSR RI.
+	 */
+
+	REST_GPR(5, r7)
+	std	r5, -8(r1)
+	ld	r5, GPR1(r7)
+	std	r5, -16(r1)
+
+	REST_GPR(7, r7)
+
+	/* Clear MSR RI since we are about to change r1. EE is already off */
+	li	r5, 0
+	mtmsrd	r5, 1
+
+	/*
+	 * BE CAREFUL HERE:
+	 * At this point we can't take an SLB miss since we have MSR_RI
+	 * off. Load only to/from the stack/paca which are in SLB bolted regions
+	 * until we turn MSR RI back on.
+	 */
+
+	ld	r5, -8(r1)
+	ld	r1, -16(r1)
 
 	/* Commit register state as checkpointed state: */
 	TRECHKPT
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2016-08-09 10:38 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-08-09 10:38 [PATCH stable 4.6 0/4] powerpc patches for 4.6 stable Michael Ellerman
2016-08-09 10:38 ` [PATCH stable 4.6 1/4] powerpc/eeh: Fix invalid cached PE primary bus Michael Ellerman
2016-08-09 10:38 ` [PATCH stable 4.6 2/4] powerpc/bpf/jit: Disable classic BPF JIT on ppc64le Michael Ellerman
2016-08-09 10:38 ` [PATCH stable 4.6 3/4] powerpc/eeh: Fix wrong argument passed to eeh_rmv_device() Michael Ellerman
2016-08-09 10:38 ` [PATCH stable 4.6 4/4] powerpc/tm: Avoid SLB faults in treclaim/trecheckpoint when RI=0 Michael Ellerman

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.