All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] powerpc/eeh: skip slot presence check when PE is temporarily unavailable.
@ 2021-05-06 17:43 Mahesh Salgaonkar
  2021-05-07  0:41 ` Oliver O'Halloran
  0 siblings, 1 reply; 3+ messages in thread
From: Mahesh Salgaonkar @ 2021-05-06 17:43 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Oliver O'Halloran

When certain PHB HW failure causes phyp to recover PHB, it marks the PE
state as temporarily unavailable. In this case, per PAPR, rtas call
ibm,read-slot-reset-state2 returns a PE state as temporarily unavailable(5)
and OS has to wait until that recovery is complete. During this state the
slot presence check 'get-sensor-state(dr-entity-sense)' returns as DR
connector empty which leads to assumption that the device has been
hot-removed. This results into no EEH recovery on this device and it stays
in failed state forever.

This patch fixes this issue by skipping slot presence check only if device
PE state is temporarily unavailable(5).

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>
---
 arch/powerpc/include/asm/eeh.h   |    1 +
 arch/powerpc/kernel/eeh.c        |   14 ++++++++++++--
 arch/powerpc/kernel/eeh_driver.c |   18 ++++++++++++++++++
 3 files changed, 31 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
index b1a5bba2e0b94..5dc5538e39b62 100644
--- a/arch/powerpc/include/asm/eeh.h
+++ b/arch/powerpc/include/asm/eeh.h
@@ -64,6 +64,7 @@ struct pci_dn;
 #define EEH_PE_RECOVERING	(1 << 1)	/* Recovering PE	*/
 #define EEH_PE_CFG_BLOCKED	(1 << 2)	/* Block config access	*/
 #define EEH_PE_RESET		(1 << 3)	/* PE reset in progress */
+#define EEH_PE_TEMP_UNAVAIL	(1 << 4)	/* PE is temporarily unavailable */
 
 #define EEH_PE_KEEP		(1 << 8)	/* Keep PE on hotplug	*/
 #define EEH_PE_CFG_RESTRICTED	(1 << 9)	/* Block config on error */
diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
index 7040e430a1249..7fcbf3df18583 100644
--- a/arch/powerpc/kernel/eeh.c
+++ b/arch/powerpc/kernel/eeh.c
@@ -405,7 +405,8 @@ static int eeh_phb_check_failure(struct eeh_pe *pe)
 	    (ret == EEH_STATE_NOT_SUPPORT) || eeh_state_active(ret)) {
 		ret = 0;
 		goto out;
-	}
+	} else if (ret == EEH_STATE_UNAVAILABLE)
+		eeh_pe_state_mark(phb_pe, EEH_PE_TEMP_UNAVAIL);
 
 	/* Isolate the PHB and send event */
 	eeh_pe_mark_isolated(phb_pe);
@@ -519,14 +520,23 @@ int eeh_dev_check_failure(struct eeh_dev *edev)
 	 * We will punt with the following conditions: Failure to get
 	 * PE's state, EEH not support and Permanently unavailable
 	 * state, PE is in good state.
+	 *
+	 * Certain PHB HW failure causes phyp/hypervisor to recover PHB and
+	 * until that recovery completes, the PE's state is temporarily
+	 * unavailable (EEH_STATE_UNAVAILABLE). In this state the slot
+	 * presence check must be avoided since it may not return valid
+	 * status. Mark this PE status as temporarily unavailable so
+	 * that we can check it later.
 	 */
+
 	if ((ret < 0) ||
 	    (ret == EEH_STATE_NOT_SUPPORT) || eeh_state_active(ret)) {
 		eeh_stats.false_positives++;
 		pe->false_positives++;
 		rc = 0;
 		goto dn_unlock;
-	}
+	} else if (ret == EEH_STATE_UNAVAILABLE)
+		eeh_pe_state_mark(pe, EEH_PE_TEMP_UNAVAIL);
 
 	/*
 	 * It should be corner case that the parent PE has been
diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
index 3eff6a4888e79..a0913768f33de 100644
--- a/arch/powerpc/kernel/eeh_driver.c
+++ b/arch/powerpc/kernel/eeh_driver.c
@@ -851,6 +851,17 @@ void eeh_handle_normal_event(struct eeh_pe *pe)
 		return;
 	}
 
+	/*
+	 * When PE's state is temporarily unavailable, the slot
+	 * presence check returns as DR connector empty. This leads
+	 * to assumption that the device is hot-removed and causes EEH
+	 * recovery to stop leaving the device in failed state forever.
+	 * Hence skip the slot presence check if PE's state is
+	 * temporarily unavailable and go down EEH recovery path.
+	 */
+	if (pe->state & EEH_PE_TEMP_UNAVAIL)
+		goto skip_slot_presence_check;
+
 	/*
 	 * When devices are hot-removed we might get an EEH due to
 	 * a driver attempting to touch the MMIO space of a removed
@@ -871,6 +882,7 @@ void eeh_handle_normal_event(struct eeh_pe *pe)
 		goto out; /* nothing to recover */
 	}
 
+skip_slot_presence_check:
 	/* Log the event */
 	if (pe->type & EEH_PE_PHB) {
 		pr_err("EEH: Recovering PHB#%x, location: %s\n",
@@ -953,6 +965,12 @@ void eeh_handle_normal_event(struct eeh_pe *pe)
 		}
 	}
 
+	/*
+	 * Now that we finished waiting for PE state as per PAPR,
+	 * clear the PE temporarily unavailable state.
+	 */
+	eeh_pe_state_clear(pe, EEH_PE_TEMP_UNAVAIL, true);
+
 	/* Since rtas may enable MMIO when posting the error log,
 	 * don't post the error log until after all dev drivers
 	 * have been informed.



^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] powerpc/eeh: skip slot presence check when PE is temporarily unavailable.
  2021-05-06 17:43 [PATCH] powerpc/eeh: skip slot presence check when PE is temporarily unavailable Mahesh Salgaonkar
@ 2021-05-07  0:41 ` Oliver O'Halloran
  2021-10-18 17:28   ` Mahesh J Salgaonkar
  0 siblings, 1 reply; 3+ messages in thread
From: Oliver O'Halloran @ 2021-05-07  0:41 UTC (permalink / raw)
  To: Mahesh Salgaonkar; +Cc: linuxppc-dev

On Fri, May 7, 2021 at 3:43 AM Mahesh Salgaonkar <mahesh@linux.ibm.com> wrote:
>
> When certain PHB HW failure causes phyp to recover PHB, it marks the PE
> state as temporarily unavailable. In this case, per PAPR, rtas call
> ibm,read-slot-reset-state2 returns a PE state as temporarily unavailable(5)
> and OS has to wait until that recovery is complete. During this state the
> slot presence check 'get-sensor-state(dr-entity-sense)' returns as DR
> connector empty which leads to assumption that the device has been
> hot-removed. This results into no EEH recovery on this device and it stays
> in failed state forever.
>
> This patch fixes this issue by skipping slot presence check only if device
> PE state is temporarily unavailable(5).
>
> Signed-off-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>
> ---
> * snip*
>
>         /*
>          * It should be corner case that the parent PE has been
> diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
> index 3eff6a4888e79..a0913768f33de 100644
> --- a/arch/powerpc/kernel/eeh_driver.c
> +++ b/arch/powerpc/kernel/eeh_driver.c
> @@ -851,6 +851,17 @@ void eeh_handle_normal_event(struct eeh_pe *pe)
>                 return;
>         }
>
> +       /*
> +        * When PE's state is temporarily unavailable, the slot
> +        * presence check returns as DR connector empty.

That sounds like a bug in either RTAS or the hotplug slot driver (or
both). The presence check is there largely to filter out events that
we can guarantee are not recoverable (i.e. surprise hot-unplug). In
every other case (especially if we can't determine the state) we
should be going down the recovery path. If the hotplug slot driver is
incorrectly reporting the card has been removed then you should be
fixing the slot driver.

> +        * to assumption that the device is hot-removed and causes EEH
> +        * recovery to stop leaving the device in failed state forever.
> +        * Hence skip the slot presence check if PE's state is
> +        * temporarily unavailable and go down EEH recovery path.
> +        */
> +       if (pe->state & EEH_PE_TEMP_UNAVAIL)
> +               goto skip_slot_presence_check;

There's a time-of-check-vs-time-of-use error here. You're setting this
flag at the point of detection, but there can be a significant lag
time between when an EEH is initially detected and when it's handled
by the recovery thread (usually due to other events being recovered).
Transitioning the PE into and out of PE_TEMP_UNAVAILABLE is handled
autonomously by the hypervisor so by the time we get to recovery the
PE may be back into a normal state where the slot check works fine.

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] powerpc/eeh: skip slot presence check when PE is temporarily unavailable.
  2021-05-07  0:41 ` Oliver O'Halloran
@ 2021-10-18 17:28   ` Mahesh J Salgaonkar
  0 siblings, 0 replies; 3+ messages in thread
From: Mahesh J Salgaonkar @ 2021-10-18 17:28 UTC (permalink / raw)
  To: Oliver O'Halloran; +Cc: linuxppc-dev

On 2021-05-07 10:41:46 Fri, Oliver O'Halloran wrote:
> On Fri, May 7, 2021 at 3:43 AM Mahesh Salgaonkar <mahesh@linux.ibm.com> wrote:
> >
> > When certain PHB HW failure causes phyp to recover PHB, it marks the PE
> > state as temporarily unavailable. In this case, per PAPR, rtas call
> > ibm,read-slot-reset-state2 returns a PE state as temporarily unavailable(5)
> > and OS has to wait until that recovery is complete. During this state the
> > slot presence check 'get-sensor-state(dr-entity-sense)' returns as DR
> > connector empty which leads to assumption that the device has been
> > hot-removed. This results into no EEH recovery on this device and it stays
> > in failed state forever.
> >
> > This patch fixes this issue by skipping slot presence check only if device
> > PE state is temporarily unavailable(5).
> >
> > Signed-off-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>
> > ---
> > * snip*
> >
> >         /*
> >          * It should be corner case that the parent PE has been
> > diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
> > index 3eff6a4888e79..a0913768f33de 100644
> > --- a/arch/powerpc/kernel/eeh_driver.c
> > +++ b/arch/powerpc/kernel/eeh_driver.c
> > @@ -851,6 +851,17 @@ void eeh_handle_normal_event(struct eeh_pe *pe)
> >                 return;
> >         }
> >
> > +       /*
> > +        * When PE's state is temporarily unavailable, the slot
> > +        * presence check returns as DR connector empty.
> 
> That sounds like a bug in either RTAS or the hotplug slot driver (or
> both). The presence check is there largely to filter out events that
> we can guarantee are not recoverable (i.e. surprise hot-unplug). In
> every other case (especially if we can't determine the state) we
> should be going down the recovery path. If the hotplug slot driver is
> incorrectly reporting the card has been removed then you should be
> fixing the slot driver.

Thanks Oliver for the comment.

So phyp fixed the issue where it was incorrectly reporting the card has
been removed. After the phyp fix, the slot presence check
'get-sensor-state(dr-entity-sense)' returns extended busy error (9902)
until PHB is recovered by phyp. And once PHB is recovered, the
get-sensor-state() returns success with correct presence status.

But now we have different problem. The Linux rtas call interface
rtas_get_sensor() loops over the rtas call on extended delay return code
(9902) until the return value is either success (0) or error (-1).  This
causes EEH handler to get stuck at presence check 'rtas_get_sensor()'
for ~6 seconds before it could indicate network driver that error has
been detected and stop any active operations. With no I/O traffic this
doesn't cause any issue and EEH recovery works fine.  However with
running I/O traffic, during this 6 seconds, network driver continues its
operation and hits timeout (netdev watchdog). On timeouts, network
driver go into ffdc capture mode and reset path assuming PCI device is
in fatal condition. This causes EEH recovery to fail and sometimes it
leads to system hang or crash.

------------
[52732.244731] DEBUG: ibm_read_slot_reset_state2()
[52732.244762] DEBUG: ret = 0, rets[0]=5, rets[1]=1, rets[2]=4000, rets[3]=0x0
[52732.244798] DEBUG: in eeh_slot_presence_check
[52732.244804] DEBUG: error state check
[52732.244807] DEBUG: Is slot hotpluggable
[52732.244810] DEBUG: hotpluggable ops ?
[52732.244953] DEBUG: Calling ops->get_adapter_status
[52732.244958] DEBUG: calling rpaphp_get_sensor_state
[52736.564262] ------------[ cut here ]------------
[52736.564299] NETDEV WATCHDOG: enP64p1s0f3 (tg3): transmit queue 0 timed out
[52736.564324] WARNING: CPU: 1442 PID: 0 at net/sched/sch_generic.c:478 dev_watchdog+0x438/0x440
[...]
[52736.564505] NIP [c000000000c32368] dev_watchdog+0x438/0x440
[52736.564513] LR [c000000000c32364] dev_watchdog+0x434/0x440
------------

I am working on ways to fix this and looking at below two options. More
ideas are welcome.

1. There is an alternate call rtas_get_sensor_fast() available that does
not use rtas_busy_delay() and returns immediately with error code. Using
rtas_get_sensor_fast() for slot presence check fixes the above issue and
EEH recovery works fine. However there is no provision in
hotplug_slot_ops struct to do a quick check of adapter status that can
be used to call rtas_get_sensor_fast().

2. Another option is to move the slot presence check after reporting
network driver that error has been detected. This also fixes the issue.
However need to verify the hotplug case where if slot is empty, inform
driver to resume while skiping the recovery.

Let me know what do you think about above options and if there is any
other better way to fix this.

Thanks,
-Mahesh.

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2021-10-18 17:29 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-05-06 17:43 [PATCH] powerpc/eeh: skip slot presence check when PE is temporarily unavailable Mahesh Salgaonkar
2021-05-07  0:41 ` Oliver O'Halloran
2021-10-18 17:28   ` Mahesh J Salgaonkar

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.