* [PATCH] PCI/AER: Add a null check before eeh_ops->notify_resume callback.
@ 2018-02-22 11:58 Vaibhav Jain
2018-02-22 12:31 ` Vaibhav Jain
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: Vaibhav Jain @ 2018-02-22 11:58 UTC
To: Russell Currey, Michael Ellerman
Cc: Vaibhav Jain, Frederic Barrat, linuxppc-dev,
Benjamin Herrenschmidt, Bryant G . Ly
This patch adds a NULL check before branching to the address pointed
to by eeh_ops->notify_resume in eeh_report_resume(). The callback
is used to notify the arch EEH code that a PCI device is back
online.
On PPC64, only the pseries platform currently implements this
callback; powernv does not. Hence, without this patch, EEH recovery
on all non-virtualized hosts causes a kernel panic when
CONFIG_PCI_IOV is set. The panic is usually of the form:
EEH: Notify device driver to resume
Unable to handle kernel paging request for instruction fetch
Faulting instruction address: 0x00000000
Oops: Kernel access of bad area, sig: 11 [#1]
<snip>
LR eeh_report_resume+0x218/0x220
Call Trace:
eeh_report_resume+0x1f0/0x220 (unreliable)
eeh_pe_dev_traverse+0x98/0x170
eeh_handle_normal_event+0x3f4/0x650
eeh_handle_event+0x188/0x380
eeh_event_handler+0x208/0x210
kthread+0x168/0x1b0
ret_from_kernel_thread+0x5c/0xb4
Cc: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
Fixes: 856e1eb9bdd4 ("PCI/AER: Add uevents in AER and EEH error/resume")
Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
---
arch/powerpc/kernel/eeh_driver.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
index beea2182d754..932858a293ea 100644
--- a/arch/powerpc/kernel/eeh_driver.c
+++ b/arch/powerpc/kernel/eeh_driver.c
@@ -384,7 +384,8 @@ static void *eeh_report_resume(void *data, void *userdata)
 	eeh_pcid_put(dev);
 	pci_uevent_ers(dev, PCI_ERS_RESULT_RECOVERED);
 #ifdef CONFIG_PCI_IOV
-	eeh_ops->notify_resume(eeh_dev_to_pdn(edev));
+	if (eeh_ops->notify_resume)
+		eeh_ops->notify_resume(eeh_dev_to_pdn(edev));
 #endif
 	return NULL;
 }
--
2.14.3
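
For readers unfamiliar with the pattern, here is a minimal userspace
sketch of an ops table with an optional callback, guarded the same way
as in the hunk above. It is an illustration only and not kernel code:
struct demo_ops, demo_notify_resume() and demo_report_resume() are
hypothetical names made up for this example, and the int device id
stands in for the real pci_dn argument.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a platform ops table (not struct eeh_ops). */
struct demo_ops {
	const char *name;
	/* Optional callback: a platform may leave this NULL. */
	int (*notify_resume)(int dev_id);
};

/* Example callback, as a platform that implements the hook might supply. */
int demo_notify_resume(int dev_id)
{
	printf("device %d is back online\n", dev_id);
	return 0;
}

/*
 * Call the optional callback only if the platform provides one.
 * Returns 1 if the callback ran, 0 if it was skipped.
 */
int demo_report_resume(const struct demo_ops *ops, int dev_id)
{
	assert(ops != NULL);

	if (ops->notify_resume) {
		ops->notify_resume(dev_id);
		return 1;
	}

	/* No callback registered: calling it would jump to address 0. */
	return 0;
}
```

Calling demo_report_resume() with an ops table whose notify_resume
member is NULL simply skips the notification. Checking at the call
site keeps the callback optional for platforms that do not need it;
the alternative would be to require every platform to register at
least an empty stub.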
* Re: [PATCH] PCI/AER: Add a null check before eeh_ops->notify_resume callback.
2018-02-22 11:58 [PATCH] PCI/AER: Add a null check before eeh_ops->notify_resume callback Vaibhav Jain
@ 2018-02-22 12:31 ` Vaibhav Jain
2018-02-22 18:56 ` Bryant G. Ly
2018-02-22 22:34 ` Michael Ellerman
2 siblings, 0 replies; 4+ messages in thread
From: Vaibhav Jain @ 2018-02-22 12:31 UTC
To: Russell Currey, Michael Ellerman
Cc: Frederic Barrat, linuxppc-dev, Benjamin Herrenschmidt, Bryant G . Ly
A patch for this issue has already been applied to ppc-next, namely commit
521ca5a9859a870e354d1a6b84a6ff ("powerpc/eeh: Add conditional check on
notify_resume"). So please ignore this patch.
--
Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Linux Technology Center, IBM India Pvt. Ltd.
* Re: [PATCH] PCI/AER: Add a null check before eeh_ops->notify_resume callback.
2018-02-22 11:58 [PATCH] PCI/AER: Add a null check before eeh_ops->notify_resume callback Vaibhav Jain
2018-02-22 12:31 ` Vaibhav Jain
@ 2018-02-22 18:56 ` Bryant G. Ly
2018-02-22 22:34 ` Michael Ellerman
2 siblings, 0 replies; 4+ messages in thread
From: Bryant G. Ly @ 2018-02-22 18:56 UTC
To: Vaibhav Jain, Russell Currey, Michael Ellerman
Cc: Frederic Barrat, linuxppc-dev, Benjamin Herrenschmidt
On 2/22/18 5:58 AM, Vaibhav Jain wrote:
> This patch puts a NULL check before branching to the address pointed
> to by eeh_ops->notify_resume in eeh_report_resume(). The callback
> is used to notify the arch EEH code that a pci device is back
> online.
>
> For PPC64 presently, only an implementation for pseries platform is
> available and not for powernv. Hence without this patch EEH recovery
> on all non-virtualized hosts is causing a kernel panic when
> CONFIG_PCI_IOV is set. The panic is usually of the form:
>
> EEH: Notify device driver to resume
> Unable to handle kernel paging request for instruction fetch
> Faulting instruction address: 0x00000000
> Oops: Kernel access of bad area, sig: 11 [#1]
> <snip>
> LR eeh_report_resume+0x218/0x220
> Call Trace:
> eeh_report_resume+0x1f0/0x220 (unreliable)
> eeh_pe_dev_traverse+0x98/0x170
> eeh_handle_normal_event+0x3f4/0x650
> eeh_handle_event+0x188/0x380
> eeh_event_handler+0x208/0x210
> kthread+0x168/0x1b0
> ret_from_kernel_thread+0x5c/0xb4
>
> Cc: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
> Fixes: 856e1eb9bdd4("PCI/AER: Add uevents in AER and EEH error/resume")
> Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
> ---
> arch/powerpc/kernel/eeh_driver.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
> index beea2182d754..932858a293ea 100644
> --- a/arch/powerpc/kernel/eeh_driver.c
> +++ b/arch/powerpc/kernel/eeh_driver.c
> @@ -384,7 +384,8 @@ static void *eeh_report_resume(void *data, void *userdata)
> eeh_pcid_put(dev);
> pci_uevent_ers(dev, PCI_ERS_RESULT_RECOVERED);
> #ifdef CONFIG_PCI_IOV
> - eeh_ops->notify_resume(eeh_dev_to_pdn(edev));
> + if (eeh_ops->notify_resume)
> + eeh_ops->notify_resume(eeh_dev_to_pdn(edev));
> #endif
> return NULL;
> }
A version of this patch has already been upstreamed:
https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git/commit/?h=fixes&id=521ca5a9859a870e354d1a6b84a6ff4c07bbceb0
-Bryant
* Re: [PATCH] PCI/AER: Add a null check before eeh_ops->notify_resume callback.
2018-02-22 11:58 [PATCH] PCI/AER: Add a null check before eeh_ops->notify_resume callback Vaibhav Jain
2018-02-22 12:31 ` Vaibhav Jain
2018-02-22 18:56 ` Bryant G. Ly
@ 2018-02-22 22:34 ` Michael Ellerman
2 siblings, 0 replies; 4+ messages in thread
From: Michael Ellerman @ 2018-02-22 22:34 UTC
To: Vaibhav Jain, Russell Currey
Cc: Vaibhav Jain, Frederic Barrat, linuxppc-dev,
Benjamin Herrenschmidt, Bryant G . Ly
Vaibhav Jain <vaibhav@linux.vnet.ibm.com> writes:
> This patch puts a NULL check before branching to the address pointed
> to by eeh_ops->notify_resume in eeh_report_resume(). The callback
> is used to notify the arch EEH code that a pci device is back
> online.
>
> For PPC64 presently, only an implementation for pseries platform is
> available and not for powernv. Hence without this patch EEH recovery
> on all non-virtualized hosts is causing a kernel panic when
> CONFIG_PCI_IOV is set. The panic is usually of the form:
>
> EEH: Notify device driver to resume
> Unable to handle kernel paging request for instruction fetch
> Faulting instruction address: 0x00000000
> Oops: Kernel access of bad area, sig: 11 [#1]
> <snip>
> LR eeh_report_resume+0x218/0x220
> Call Trace:
> eeh_report_resume+0x1f0/0x220 (unreliable)
> eeh_pe_dev_traverse+0x98/0x170
> eeh_handle_normal_event+0x3f4/0x650
> eeh_handle_event+0x188/0x380
> eeh_event_handler+0x208/0x210
> kthread+0x168/0x1b0
> ret_from_kernel_thread+0x5c/0xb4
>
> Cc: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
> Fixes: 856e1eb9bdd4("PCI/AER: Add uevents in AER and EEH error/resume")
> Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
10 out of 10 for the change log!
But yeah this is already fixed in my fixes branch, thanks anyway.
cheers