* [PATCH 0/2] sleeping functions in invalid context on panic fixes
@ 2018-05-14 15:59 Nicholas Piggin
  2018-05-14 15:59 ` [PATCH 1/2] powerpc/powernv: Fix opal_event_shutdown() called with interrupts disabled Nicholas Piggin
  2018-05-14 15:59 ` [PATCH 2/2] powerpc/powernv: Fix NVRAM sleep in invalid context when crashing Nicholas Piggin
  0 siblings, 2 replies; 5+ messages in thread
From: Nicholas Piggin @ 2018-05-14 15:59 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin

Here are a couple of fixes which seem to solve a problem where a
panic can hang forever rather than reboot after 10 seconds. The
symptom is that a CPU calls panic(), but is later found in idle.

Nicholas Piggin (2):
  powerpc/powernv: Fix opal_event_shutdown() called with interrupts
    disabled
  powerpc/powernv: Fix NVRAM sleep in invalid context when crashing

 arch/powerpc/platforms/powernv/opal-irqchip.c |  2 +-
 arch/powerpc/platforms/powernv/opal-nvram.c   | 14 ++++++++++++--
 2 files changed, 13 insertions(+), 3 deletions(-)

-- 
2.17.0

* [PATCH 1/2] powerpc/powernv: Fix opal_event_shutdown() called with interrupts disabled
  2018-05-14 15:59 [PATCH 0/2] sleeping functions in invalid context on panic fixes Nicholas Piggin
@ 2018-05-14 15:59 ` Nicholas Piggin
  2018-05-21 10:01   ` [1/2] " Michael Ellerman
  2018-05-14 15:59 ` [PATCH 2/2] powerpc/powernv: Fix NVRAM sleep in invalid context when crashing Nicholas Piggin
  1 sibling, 1 reply; 5+ messages in thread
From: Nicholas Piggin @ 2018-05-14 15:59 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin

A kernel crash in process context that calls emergency_restart from
panic will end up calling opal_event_shutdown with interrupts disabled
but not in interrupt. This causes a sleeping function to be called
which gives the following warning with sysrq+c:

    Rebooting in 10 seconds..
    BUG: sleeping function called from invalid context at kernel/locking/mutex.c:238
    in_atomic(): 0, irqs_disabled(): 1, pid: 7669, name: bash
    CPU: 20 PID: 7669 Comm: bash Tainted: G      D W         4.17.0-rc5+ #3
    Call Trace:
    dump_stack+0xb0/0xf4 (unreliable)
    ___might_sleep+0x174/0x1a0
    mutex_lock+0x38/0xb0
    __free_irq+0x68/0x460
    free_irq+0x70/0xc0
    opal_event_shutdown+0xb4/0xf0
    opal_shutdown+0x24/0xa0
    pnv_shutdown+0x28/0x40
    machine_shutdown+0x44/0x60
    machine_restart+0x28/0x80
    emergency_restart+0x30/0x50
    panic+0x2a0/0x328
    oops_end+0x1ec/0x1f0
    bad_page_fault+0xe8/0x154
    handle_page_fault+0x34/0x38
    --- interrupt: 300 at sysrq_handle_crash+0x44/0x60
    LR = __handle_sysrq+0xfc/0x260
    flag_spec.62335+0x12b844/0x1e8db4 (unreliable)
    __handle_sysrq+0xfc/0x260
    write_sysrq_trigger+0xa8/0xb0
    proc_reg_write+0xac/0x110
    __vfs_write+0x6c/0x240
    vfs_write+0xd0/0x240
    ksys_write+0x6c/0x110

Fixes: 9f0fd0499d30 ("powerpc/powernv: Add a virtual irqchip for opal events")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/platforms/powernv/opal-irqchip.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/powernv/opal-irqchip.c b/arch/powerpc/platforms/powernv/opal-irqchip.c
index 9d1b8c0aaf93..05ffe05f0fdc 100644
--- a/arch/powerpc/platforms/powernv/opal-irqchip.c
+++ b/arch/powerpc/platforms/powernv/opal-irqchip.c
@@ -177,7 +177,7 @@ void opal_event_shutdown(void)
 		if (!opal_irqs[i])
 			continue;
 
-		if (in_interrupt())
+		if (in_interrupt() || irqs_disabled())
 			disable_irq_nosync(opal_irqs[i]);
 		else
 			free_irq(opal_irqs[i], NULL);
-- 
2.17.0
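
The one-line change above can be modelled outside the kernel. The
following is a minimal userspace sketch, not kernel code: struct exec_ctx,
enum irq_teardown and both choose_teardown*() helpers are hypothetical
names that stand in for in_interrupt(), irqs_disabled() and the branch in
opal_event_shutdown(). It only illustrates why the old check picked the
sleeping path in the panic case shown in the trace (in_atomic(): 0,
irqs_disabled(): 1).

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's in_interrupt()/irqs_disabled() state. */
struct exec_ctx {
	bool in_interrupt;
	bool irqs_disabled;
};

/* Which teardown call the shutdown path would make (illustrative labels). */
enum irq_teardown {
	TEARDOWN_FREE_IRQ,        /* free_irq(): takes a mutex, may sleep */
	TEARDOWN_DISABLE_NOSYNC,  /* disable_irq_nosync(): does not sleep */
};

/* Pre-patch condition: only checks for interrupt context. */
static enum irq_teardown choose_teardown_old(struct exec_ctx ctx)
{
	return ctx.in_interrupt ? TEARDOWN_DISABLE_NOSYNC : TEARDOWN_FREE_IRQ;
}

/* Patched condition: also avoids the sleeping path when IRQs are off. */
static enum irq_teardown choose_teardown(struct exec_ctx ctx)
{
	if (ctx.in_interrupt || ctx.irqs_disabled)
		return TEARDOWN_DISABLE_NOSYNC;
	return TEARDOWN_FREE_IRQ;
}
```

With a context of { .in_interrupt = false, .irqs_disabled = true }, the
old helper selects the free_irq() path and the patched one selects
disable_irq_nosync(), which matches the behaviour described in the commit
message.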

* [PATCH 2/2] powerpc/powernv: Fix NVRAM sleep in invalid context when crashing
  2018-05-14 15:59 [PATCH 0/2] sleeping functions in invalid context on panic fixes Nicholas Piggin
  2018-05-14 15:59 ` [PATCH 1/2] powerpc/powernv: Fix opal_event_shutdown() called with interrupts disabled Nicholas Piggin
@ 2018-05-14 15:59 ` Nicholas Piggin
  2018-05-17 14:55   ` [2/2] " Michael Ellerman
  1 sibling, 1 reply; 5+ messages in thread
From: Nicholas Piggin @ 2018-05-14 15:59 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin, stable

Similarly to opal_event_shutdown, opal_nvram_write can be called in
the crash path with irqs disabled. Special case the delay to avoid
sleeping in invalid context.

Cc: stable@vger.kernel.org # v3.2
Fixes: 3b8070335f ("powerpc/powernv: Fix OPAL NVRAM driver OPAL_BUSY loops")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/platforms/powernv/opal-nvram.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/opal-nvram.c b/arch/powerpc/platforms/powernv/opal-nvram.c
index 1bceb95f422d..5584247f5029 100644
--- a/arch/powerpc/platforms/powernv/opal-nvram.c
+++ b/arch/powerpc/platforms/powernv/opal-nvram.c
@@ -44,6 +44,10 @@ static ssize_t opal_nvram_read(char *buf, size_t count, loff_t *index)
 	return count;
 }
 
+/*
+ * This can be called in the panic path with interrupts off, so use
+ * mdelay in that case.
+ */
 static ssize_t opal_nvram_write(char *buf, size_t count, loff_t *index)
 {
 	s64 rc = OPAL_BUSY;
@@ -58,10 +62,16 @@ static ssize_t opal_nvram_write(char *buf, size_t count, loff_t *index)
 	while (rc == OPAL_BUSY || rc == OPAL_BUSY_EVENT) {
 		rc = opal_write_nvram(__pa(buf), count, off);
 		if (rc == OPAL_BUSY_EVENT) {
-			msleep(OPAL_BUSY_DELAY_MS);
+			if (in_interrupt() || irqs_disabled())
+				mdelay(OPAL_BUSY_DELAY_MS);
+			else
+				msleep(OPAL_BUSY_DELAY_MS);
 			opal_poll_events(NULL);
 		} else if (rc == OPAL_BUSY) {
-			msleep(OPAL_BUSY_DELAY_MS);
+			if (in_interrupt() || irqs_disabled())
+				mdelay(OPAL_BUSY_DELAY_MS);
+			else
+				msleep(OPAL_BUSY_DELAY_MS);
 		}
 	}
 
-- 
2.17.0
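
The retry loop in this patch can be sketched in userspace to show the
shape of the logic. This is an illustration under stated assumptions, not
the driver: enum fw_rc, struct retry_stats, retry_write() and fake_write()
are hypothetical names, the return codes are placeholders rather than the
real OPAL values, and the delays are only counted instead of calling
mdelay() or msleep().

```c
#include <stdbool.h>

/* Placeholder return codes; the real OPAL values are not reproduced here. */
enum fw_rc { FW_SUCCESS, FW_BUSY, FW_BUSY_EVENT };

/* Counts what the loop would have done instead of actually delaying. */
struct retry_stats {
	int spins;   /* times a busy-wait (mdelay-style) delay was chosen */
	int sleeps;  /* times a sleeping (msleep-style) delay was chosen */
	int polls;   /* times events would have been polled */
};

typedef enum fw_rc (*fw_write_fn)(void *cookie);

/*
 * Retry while the firmware reports busy. 'atomic' models the patch's
 * "in_interrupt() || irqs_disabled()" test: when true, busy-wait;
 * otherwise sleep. Events are polled only after a BUSY_EVENT result.
 */
static enum fw_rc retry_write(fw_write_fn write, void *cookie, bool atomic,
			      struct retry_stats *st)
{
	enum fw_rc rc = FW_BUSY;

	while (rc == FW_BUSY || rc == FW_BUSY_EVENT) {
		rc = write(cookie);
		if (rc == FW_BUSY || rc == FW_BUSY_EVENT) {
			if (atomic)
				st->spins++;
			else
				st->sleeps++;
			if (rc == FW_BUSY_EVENT)
				st->polls++;
		}
	}
	return rc;
}

/* Fake firmware call: reports busy for the number of calls in *cookie. */
static enum fw_rc fake_write(void *cookie)
{
	int *remaining = cookie;

	if (*remaining > 0) {
		(*remaining)--;
		return FW_BUSY;
	}
	return FW_SUCCESS;
}
```

Calling retry_write(fake_write, &busy, true, &st) with busy set to 3
retries three times using only the busy-wait path, which is the behaviour
the patch wants in the panic path. Folding the context test into one
place, as this sketch does, is a design choice of the illustration; the
patch itself repeats the test in both branches.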

* Re: [2/2] powerpc/powernv: Fix NVRAM sleep in invalid context when crashing
  2018-05-14 15:59 ` [PATCH 2/2] powerpc/powernv: Fix NVRAM sleep in invalid context when crashing Nicholas Piggin
@ 2018-05-17 14:55   ` Michael Ellerman
  0 siblings, 0 replies; 5+ messages in thread
From: Michael Ellerman @ 2018-05-17 14:55 UTC (permalink / raw)
  To: Nicholas Piggin, linuxppc-dev; +Cc: stable, Nicholas Piggin

On Mon, 2018-05-14 at 15:59:47 UTC, Nicholas Piggin wrote:
> Similarly to opal_event_shutdown, opal_nvram_write can be called in
> the crash path with irqs disabled. Special case the delay to avoid
> sleeping in invalid context.
> 
> Cc: stable@vger.kernel.org # v3.2
> Fixes: 3b8070335f ("powerpc/powernv: Fix OPAL NVRAM driver OPAL_BUSY loops")
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>

Applied to powerpc fixes, thanks.

https://git.kernel.org/powerpc/c/c1d2a31397ec51f0370f6bd17b19b3

cheers

* Re: [1/2] powerpc/powernv: Fix opal_event_shutdown() called with interrupts disabled
  2018-05-14 15:59 ` [PATCH 1/2] powerpc/powernv: Fix opal_event_shutdown() called with interrupts disabled Nicholas Piggin
@ 2018-05-21 10:01   ` Michael Ellerman
  0 siblings, 0 replies; 5+ messages in thread
From: Michael Ellerman @ 2018-05-21 10:01 UTC (permalink / raw)
  To: Nicholas Piggin, linuxppc-dev; +Cc: Nicholas Piggin

On Mon, 2018-05-14 at 15:59:46 UTC, Nicholas Piggin wrote:
> A kernel crash in process context that calls emergency_restart from
> panic will end up calling opal_event_shutdown with interrupts disabled
> but not in interrupt. This causes a sleeping function to be called
> which gives the following warning with sysrq+c:
> 
>     Rebooting in 10 seconds..
>     BUG: sleeping function called from invalid context at kernel/locking/mutex.c:238
>     in_atomic(): 0, irqs_disabled(): 1, pid: 7669, name: bash
>     CPU: 20 PID: 7669 Comm: bash Tainted: G      D W         4.17.0-rc5+ #3
>     Call Trace:
>     dump_stack+0xb0/0xf4 (unreliable)
>     ___might_sleep+0x174/0x1a0
>     mutex_lock+0x38/0xb0
>     __free_irq+0x68/0x460
>     free_irq+0x70/0xc0
>     opal_event_shutdown+0xb4/0xf0
>     opal_shutdown+0x24/0xa0
>     pnv_shutdown+0x28/0x40
>     machine_shutdown+0x44/0x60
>     machine_restart+0x28/0x80
>     emergency_restart+0x30/0x50
>     panic+0x2a0/0x328
>     oops_end+0x1ec/0x1f0
>     bad_page_fault+0xe8/0x154
>     handle_page_fault+0x34/0x38
>     --- interrupt: 300 at sysrq_handle_crash+0x44/0x60
>     LR = __handle_sysrq+0xfc/0x260
>     flag_spec.62335+0x12b844/0x1e8db4 (unreliable)
>     __handle_sysrq+0xfc/0x260
>     write_sysrq_trigger+0xa8/0xb0
>     proc_reg_write+0xac/0x110
>     __vfs_write+0x6c/0x240
>     vfs_write+0xd0/0x240
>     ksys_write+0x6c/0x110
> 
> Fixes: 9f0fd0499d30 ("powerpc/powernv: Add a virtual irqchip for opal events")
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/c0beffc4f4c658fde86d52c837e784

cheers
