linuxppc-dev.lists.ozlabs.org archive mirror
* [PATCH] PCI/AER: Add a null check before eeh_ops->notify_resume callback.
From: Vaibhav Jain @ 2018-02-22 11:58 UTC
  To: Russell Currey, Michael Ellerman
  Cc: Vaibhav Jain, Frederic Barrat, linuxppc-dev,
	Benjamin Herrenschmidt, Bryant G . Ly

This patch adds a NULL check before branching to the address pointed
to by eeh_ops->notify_resume in eeh_report_resume(). The callback
is used to notify the arch EEH code that a PCI device is back
online.

On PPC64, only the pseries platform currently implements this
callback; powernv does not. Hence, without this patch, EEH recovery
on all non-virtualized hosts causes a kernel panic when
CONFIG_PCI_IOV is set. The panic is usually of the form:

EEH: Notify device driver to resume
Unable to handle kernel paging request for instruction fetch
Faulting instruction address: 0x00000000
Oops: Kernel access of bad area, sig: 11 [#1]
<snip>
LR eeh_report_resume+0x218/0x220
Call Trace:
 eeh_report_resume+0x1f0/0x220 (unreliable)
 eeh_pe_dev_traverse+0x98/0x170
 eeh_handle_normal_event+0x3f4/0x650
 eeh_handle_event+0x188/0x380
 eeh_event_handler+0x208/0x210
 kthread+0x168/0x1b0
 ret_from_kernel_thread+0x5c/0xb4

Cc: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
Fixes: 856e1eb9bdd4 ("PCI/AER: Add uevents in AER and EEH error/resume")
Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
---
 arch/powerpc/kernel/eeh_driver.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
index beea2182d754..932858a293ea 100644
--- a/arch/powerpc/kernel/eeh_driver.c
+++ b/arch/powerpc/kernel/eeh_driver.c
@@ -384,7 +384,8 @@ static void *eeh_report_resume(void *data, void *userdata)
 	eeh_pcid_put(dev);
 	pci_uevent_ers(dev, PCI_ERS_RESULT_RECOVERED);
 #ifdef CONFIG_PCI_IOV
-	eeh_ops->notify_resume(eeh_dev_to_pdn(edev));
+	if (eeh_ops->notify_resume)
+		eeh_ops->notify_resume(eeh_dev_to_pdn(edev));
 #endif
 	return NULL;
 }
-- 
2.14.3
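
For readers unfamiliar with the pattern, here is a minimal user-space sketch
of an ops table with an optional callback. It is not the kernel code: struct
demo_ops, demo_notify_resume(), report_resume() and the two tables are
hypothetical names invented for illustration. It assumes only standard C
behaviour, namely that a member omitted from a designated initializer is
zero-initialized, so an unimplemented callback is a NULL pointer and must be
checked before it is called.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for an ops table; not the kernel's struct eeh_ops. */
struct demo_ops {
	const char *name;
	int (*notify_resume)(int dev_id);	/* optional callback */
};

static int resume_calls;	/* counts how often the callback ran */

static int demo_notify_resume(int dev_id)
{
	resume_calls++;
	printf("resume notification for device %d\n", dev_id);
	return 0;
}

/* A platform that implements the callback. */
static const struct demo_ops with_cb = {
	.name		= "with-callback",
	.notify_resume	= demo_notify_resume,
};

/* A platform that omits it: the member is implicitly NULL. */
static const struct demo_ops without_cb = {
	.name		= "without-callback",
};

/*
 * Guarded call, mirroring the shape of the fix in the diff above.
 * Returns 1 if the callback was invoked, 0 if it was skipped.
 */
static int report_resume(const struct demo_ops *ops, int dev_id)
{
	if (ops->notify_resume) {
		ops->notify_resume(dev_id);
		return 1;
	}
	return 0;
}
```

Calling report_resume(&without_cb, 7) skips the callback and returns 0,
whereas an unguarded ops->notify_resume(7) on the same table would be a call
through a NULL function pointer, which is consistent with the faulting
instruction address of 0x00000000 in the trace above.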


* Re: [PATCH] PCI/AER: Add a null check before eeh_ops->notify_resume callback.
From: Vaibhav Jain @ 2018-02-22 12:31 UTC
  To: Russell Currey, Michael Ellerman
  Cc: Frederic Barrat, linuxppc-dev, Benjamin Herrenschmidt, Bryant G . Ly


A patch for this issue has already been applied to ppc-next, viz. commit
521ca5a9859a870e354d1a6b84a6ff ("powerpc/eeh: Add conditional check on
notify_resume"). Please ignore this patch.

-- 
Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Linux Technology Center, IBM India Pvt. Ltd.


* Re: [PATCH] PCI/AER: Add a null check before eeh_ops->notify_resume callback.
From: Bryant G. Ly @ 2018-02-22 18:56 UTC
  To: Vaibhav Jain, Russell Currey, Michael Ellerman
  Cc: Frederic Barrat, linuxppc-dev, Benjamin Herrenschmidt

On 2/22/18 5:58 AM, Vaibhav Jain wrote:

> This patch puts a NULL check before branching to the address pointed
> to by eeh_ops->notify_resume in eeh_report_resume(). The callback
> is used to notify the arch EEH code that a pci device is back
> online.
>
> For PPC64 presently, only an implementation for pseries platform is
> available and not for powernv. Hence without this patch EEH recovery
> on all non-virtualized hosts is causing a kernel panic when
> CONFIG_PCI_IOV is set. The panic is usually of the form:
>
> EEH: Notify device driver to resume
> Unable to handle kernel paging request for instruction fetch
> Faulting instruction address: 0x00000000
> Oops: Kernel access of bad area, sig: 11 [#1]
> <snip>
> LR eeh_report_resume+0x218/0x220
> Call Trace:
>  eeh_report_resume+0x1f0/0x220 (unreliable)
>  eeh_pe_dev_traverse+0x98/0x170
>  eeh_handle_normal_event+0x3f4/0x650
>  eeh_handle_event+0x188/0x380
>  eeh_event_handler+0x208/0x210
>  kthread+0x168/0x1b0
>  ret_from_kernel_thread+0x5c/0xb4
>
> Cc: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
> Fixes: 856e1eb9bdd4("PCI/AER: Add uevents in AER and EEH error/resume")
> Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
> ---
>  arch/powerpc/kernel/eeh_driver.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
> index beea2182d754..932858a293ea 100644
> --- a/arch/powerpc/kernel/eeh_driver.c
> +++ b/arch/powerpc/kernel/eeh_driver.c
> @@ -384,7 +384,8 @@ static void *eeh_report_resume(void *data, void *userdata)
>  	eeh_pcid_put(dev);
>  	pci_uevent_ers(dev, PCI_ERS_RESULT_RECOVERED);
>  #ifdef CONFIG_PCI_IOV
> -	eeh_ops->notify_resume(eeh_dev_to_pdn(edev));
> +	if (eeh_ops->notify_resume)
> +		eeh_ops->notify_resume(eeh_dev_to_pdn(edev));
>  #endif
>  	return NULL;
>  }

A version of this patch is already upstream:

https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git/commit/?h=fixes&id=521ca5a9859a870e354d1a6b84a6ff4c07bbceb0

-Bryant


* Re: [PATCH] PCI/AER: Add a null check before eeh_ops->notify_resume callback.
From: Michael Ellerman @ 2018-02-22 22:34 UTC
  To: Vaibhav Jain, Russell Currey
  Cc: Vaibhav Jain, Frederic Barrat, linuxppc-dev,
	Benjamin Herrenschmidt, Bryant G . Ly

Vaibhav Jain <vaibhav@linux.vnet.ibm.com> writes:
> This patch puts a NULL check before branching to the address pointed
> to by eeh_ops->notify_resume in eeh_report_resume(). The callback
> is used to notify the arch EEH code that a pci device is back
> online.
>
> For PPC64 presently, only an implementation for pseries platform is
> available and not for powernv. Hence without this patch EEH recovery
> on all non-virtualized hosts is causing a kernel panic when
> CONFIG_PCI_IOV is set. The panic is usually of the form:
>
> EEH: Notify device driver to resume
> Unable to handle kernel paging request for instruction fetch
> Faulting instruction address: 0x00000000
> Oops: Kernel access of bad area, sig: 11 [#1]
> <snip>
> LR eeh_report_resume+0x218/0x220
> Call Trace:
>  eeh_report_resume+0x1f0/0x220 (unreliable)
>  eeh_pe_dev_traverse+0x98/0x170
>  eeh_handle_normal_event+0x3f4/0x650
>  eeh_handle_event+0x188/0x380
>  eeh_event_handler+0x208/0x210
>  kthread+0x168/0x1b0
>  ret_from_kernel_thread+0x5c/0xb4
>
> Cc: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
> Fixes: 856e1eb9bdd4("PCI/AER: Add uevents in AER and EEH error/resume")
> Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>

10 out of 10 for the change log!

But yeah this is already fixed in my fixes branch, thanks anyway.

cheers
