* [PATCH v2 0/9] EEH refactoring 1
@ 2018-03-19 2:45 Sam Bobroff
2018-03-19 2:46 ` [PATCH v2 1/9] powerpc/eeh: Remove eeh_handle_event() Sam Bobroff
` (9 more replies)
0 siblings, 10 replies; 24+ messages in thread
From: Sam Bobroff @ 2018-03-19 2:45 UTC (permalink / raw)
To: linuxppc-dev
Hello everyone,
Here is a set of some small, mostly idempotent, changes to improve
maintainability in some of the EEH code, primarily in eeh_driver.c.
I've kept them all small to aid review but perhaps they should be squashed down
before being applied.
Cheers,
Sam.
Patch set changelog follows:
====== v1 -> v2: ======
Patch 1/9: powerpc/eeh: Remove eeh_handle_event()
Patch 2/9: powerpc/eeh: Manage EEH_PE_RECOVERING inside eeh_handle_normal_event()
Patch 3/9: powerpc/eeh: Fix misleading comment in __eeh_addr_cache_get_device()
Patch 4/9: powerpc/eeh: Remove misleading test in eeh_handle_normal_event()
Patch 5/9: powerpc/eeh: Rename frozen_bus to bus in eeh_handle_normal_event()
Patch 6/9: powerpc/eeh: Clarify arguments to eeh_reset_device()
* Re-ordered parameters to eeh_reset_device() to keep pe first.
* Changed eeh_aware_driver to driver_eeh_aware.
Patch 7/9: powerpc/eeh: Remove always-true tests in eeh_reset_device()
Patch 8/9: powerpc/eeh: Factor out common code eeh_reset_device()
* In one case, added braces to "if" to match "else".
Patch 9/9: powerpc/eeh: Add eeh_state_active() helper
====== v1: ======
Patch 1/9: powerpc/eeh: Remove eeh_handle_event()
Patch 2/9: powerpc/eeh: Manage EEH_PE_RECOVERING inside eeh_handle_normal_event()
Patch 3/9: powerpc/eeh: Fix misleading comment in __eeh_addr_cache_get_device()
Patch 4/9: powerpc/eeh: Remove misleading test in eeh_handle_normal_event()
Patch 5/9: powerpc/eeh: Rename frozen_bus to bus in eeh_handle_normal_event()
Patch 6/9: powerpc/eeh: Clarify arguments to eeh_reset_device()
Patch 7/9: powerpc/eeh: Remove always-true tests in eeh_reset_device()
Patch 8/9: powerpc/eeh: Factor out common code eeh_reset_device()
Patch 9/9: powerpc/eeh: Add eeh_state_active() helper
Sam Bobroff (9):
powerpc/eeh: Remove eeh_handle_event()
powerpc/eeh: Manage EEH_PE_RECOVERING inside eeh_handle_normal_event()
powerpc/eeh: Fix misleading comment in __eeh_addr_cache_get_device()
powerpc/eeh: Remove misleading test in eeh_handle_normal_event()
powerpc/eeh: Rename frozen_bus to bus in eeh_handle_normal_event()
powerpc/eeh: Clarify arguments to eeh_reset_device()
powerpc/eeh: Remove always-true tests in eeh_reset_device()
powerpc/eeh: Factor out common code eeh_reset_device()
powerpc/eeh: Add eeh_state_active() helper
arch/powerpc/include/asm/eeh.h | 6 ++
arch/powerpc/include/asm/eeh_event.h | 3 +-
arch/powerpc/kernel/eeh.c | 19 ++--
arch/powerpc/kernel/eeh_cache.c | 3 +-
arch/powerpc/kernel/eeh_driver.c | 137 +++++++++++----------------
arch/powerpc/kernel/eeh_event.c | 6 +-
arch/powerpc/platforms/powernv/eeh-powernv.c | 9 +-
7 files changed, 72 insertions(+), 111 deletions(-)
--
2.16.1.74.g9b0b1f47b
^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH v2 1/9] powerpc/eeh: Remove eeh_handle_event()
2018-03-19 2:45 [PATCH v2 0/9] EEH refactoring 1 Sam Bobroff
@ 2018-03-19 2:46 ` Sam Bobroff
2018-03-21 5:17 ` Alexey Kardashevskiy
2018-03-28 14:13 ` [v2,1/9] " Michael Ellerman
2018-03-19 2:46 ` [PATCH v2 2/9] powerpc/eeh: Manage EEH_PE_RECOVERING inside eeh_handle_normal_event() Sam Bobroff
` (8 subsequent siblings)
9 siblings, 2 replies; 24+ messages in thread
From: Sam Bobroff @ 2018-03-19 2:46 UTC (permalink / raw)
To: linuxppc-dev
The function eeh_handle_event(pe) does nothing other than switching
between calling eeh_handle_normal_event(pe) and
eeh_handle_special_event(). However it is only called in two places,
one where pe can't be NULL and the other where it must be NULL (see
eeh_event_handler()) so it does nothing but obscure the flow of
control.
So, remove it.
Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
---
arch/powerpc/include/asm/eeh_event.h | 3 ++-
arch/powerpc/kernel/eeh_driver.c | 42 +++++++++++++-----------------------
arch/powerpc/kernel/eeh_event.c | 4 ++--
3 files changed, 19 insertions(+), 30 deletions(-)
diff --git a/arch/powerpc/include/asm/eeh_event.h b/arch/powerpc/include/asm/eeh_event.h
index 1e551a2d6f82..0a168038882d 100644
--- a/arch/powerpc/include/asm/eeh_event.h
+++ b/arch/powerpc/include/asm/eeh_event.h
@@ -34,7 +34,8 @@ struct eeh_event {
int eeh_event_init(void);
int eeh_send_failure_event(struct eeh_pe *pe);
void eeh_remove_event(struct eeh_pe *pe, bool force);
-void eeh_handle_event(struct eeh_pe *pe);
+bool eeh_handle_normal_event(struct eeh_pe *pe);
+void eeh_handle_special_event(void);
#endif /* __KERNEL__ */
#endif /* ASM_POWERPC_EEH_EVENT_H */
diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
index 0c0b66fc5bfb..51b21c97910f 100644
--- a/arch/powerpc/kernel/eeh_driver.c
+++ b/arch/powerpc/kernel/eeh_driver.c
@@ -738,9 +738,22 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus,
* Attempts to recover the given PE. If recovery fails or the PE has failed
* too many times, remove the PE.
*
+ * While PHB detects address or data parity errors on particular PCI
+ * slot, the associated PE will be frozen. Besides, DMA's occurring
+ * to wild addresses (which usually happen due to bugs in device
+ * drivers or in PCI adapter firmware) can cause EEH error. #SERR,
+ * #PERR or other misc PCI-related errors also can trigger EEH errors.
+ *
+ * Recovery process consists of unplugging the device driver (which
+ * generated hotplug events to userspace), then issuing a PCI #RST to
+ * the device, then reconfiguring the PCI config space for all bridges
+ * & devices under this slot, and then finally restarting the device
+ * drivers (which cause a second set of hotplug events to go out to
+ * userspace).
+ *
* Returns true if @pe should no longer be used, else false.
*/
-static bool eeh_handle_normal_event(struct eeh_pe *pe)
+bool eeh_handle_normal_event(struct eeh_pe *pe)
{
struct pci_bus *frozen_bus;
struct eeh_dev *edev, *tmp;
@@ -942,7 +955,7 @@ static bool eeh_handle_normal_event(struct eeh_pe *pe)
* specific PE. Iterates through possible failures and handles them as
* necessary.
*/
-static void eeh_handle_special_event(void)
+void eeh_handle_special_event(void)
{
struct eeh_pe *pe, *phb_pe;
struct pci_bus *bus;
@@ -1049,28 +1062,3 @@ static void eeh_handle_special_event(void)
break;
} while (rc != EEH_NEXT_ERR_NONE);
}
-
-/**
- * eeh_handle_event - Reset a PCI device after hard lockup.
- * @pe: EEH PE
- *
- * While PHB detects address or data parity errors on particular PCI
- * slot, the associated PE will be frozen. Besides, DMA's occurring
- * to wild addresses (which usually happen due to bugs in device
- * drivers or in PCI adapter firmware) can cause EEH error. #SERR,
- * #PERR or other misc PCI-related errors also can trigger EEH errors.
- *
- * Recovery process consists of unplugging the device driver (which
- * generated hotplug events to userspace), then issuing a PCI #RST to
- * the device, then reconfiguring the PCI config space for all bridges
- * & devices under this slot, and then finally restarting the device
- * drivers (which cause a second set of hotplug events to go out to
- * userspace).
- */
-void eeh_handle_event(struct eeh_pe *pe)
-{
- if (pe)
- eeh_handle_normal_event(pe);
- else
- eeh_handle_special_event();
-}
diff --git a/arch/powerpc/kernel/eeh_event.c b/arch/powerpc/kernel/eeh_event.c
index accbf8b5fd46..872bcfe8f90e 100644
--- a/arch/powerpc/kernel/eeh_event.c
+++ b/arch/powerpc/kernel/eeh_event.c
@@ -81,10 +81,10 @@ static int eeh_event_handler(void * dummy)
pr_info("EEH: Detected PCI bus error on "
"PHB#%x-PE#%x\n",
pe->phb->global_number, pe->addr);
- eeh_handle_event(pe);
+ eeh_handle_normal_event(pe);
eeh_pe_state_clear(pe, EEH_PE_RECOVERING);
} else {
- eeh_handle_event(NULL);
+ eeh_handle_special_event();
}
kfree(event);
--
2.16.1.74.g9b0b1f47b
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH v2 2/9] powerpc/eeh: Manage EEH_PE_RECOVERING inside eeh_handle_normal_event()
2018-03-19 2:45 [PATCH v2 0/9] EEH refactoring 1 Sam Bobroff
2018-03-19 2:46 ` [PATCH v2 1/9] powerpc/eeh: Remove eeh_handle_event() Sam Bobroff
@ 2018-03-19 2:46 ` Sam Bobroff
2018-03-21 5:17 ` Alexey Kardashevskiy
2018-03-19 2:46 ` [PATCH v2 3/9] powerpc/eeh: Fix misleading comment in __eeh_addr_cache_get_device() Sam Bobroff
` (7 subsequent siblings)
9 siblings, 1 reply; 24+ messages in thread
From: Sam Bobroff @ 2018-03-19 2:46 UTC (permalink / raw)
To: linuxppc-dev
Currently the EEH_PE_RECOVERING flag for a PE is managed by both the
caller and callee of eeh_handle_normal_event() (among other places not
considered here). This is complicated by the fact that the PE may
or may not have been invalidated by the call.
So move the callee's handling into eeh_handle_normal_event(), which
clarifies it and allows the return type to be changed to void (because
it no longer needs to indicate at the PE has been invalidated).
This should not change behaviour except in eeh_event_handler() where
it was previously possible to cause eeh_pe_state_clear() to be called
on an invalid PE, which is now avoided.
Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
---
arch/powerpc/include/asm/eeh_event.h | 2 +-
arch/powerpc/kernel/eeh_driver.c | 29 +++++++++++------------------
arch/powerpc/kernel/eeh_event.c | 2 --
3 files changed, 12 insertions(+), 21 deletions(-)
diff --git a/arch/powerpc/include/asm/eeh_event.h b/arch/powerpc/include/asm/eeh_event.h
index 0a168038882d..9884e872686f 100644
--- a/arch/powerpc/include/asm/eeh_event.h
+++ b/arch/powerpc/include/asm/eeh_event.h
@@ -34,7 +34,7 @@ struct eeh_event {
int eeh_event_init(void);
int eeh_send_failure_event(struct eeh_pe *pe);
void eeh_remove_event(struct eeh_pe *pe, bool force);
-bool eeh_handle_normal_event(struct eeh_pe *pe);
+void eeh_handle_normal_event(struct eeh_pe *pe);
void eeh_handle_special_event(void);
#endif /* __KERNEL__ */
diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
index 51b21c97910f..5b7a5ed4db4d 100644
--- a/arch/powerpc/kernel/eeh_driver.c
+++ b/arch/powerpc/kernel/eeh_driver.c
@@ -733,7 +733,8 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus,
/**
* eeh_handle_normal_event - Handle EEH events on a specific PE
- * @pe: EEH PE
+ * @pe: EEH PE - which should not be used after we return, as it may
+ * have been invalidated.
*
* Attempts to recover the given PE. If recovery fails or the PE has failed
* too many times, remove the PE.
@@ -750,10 +751,8 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus,
* & devices under this slot, and then finally restarting the device
* drivers (which cause a second set of hotplug events to go out to
* userspace).
- *
- * Returns true if @pe should no longer be used, else false.
*/
-bool eeh_handle_normal_event(struct eeh_pe *pe)
+void eeh_handle_normal_event(struct eeh_pe *pe)
{
struct pci_bus *frozen_bus;
struct eeh_dev *edev, *tmp;
@@ -765,9 +764,11 @@ bool eeh_handle_normal_event(struct eeh_pe *pe)
if (!frozen_bus) {
pr_err("%s: Cannot find PCI bus for PHB#%x-PE#%x\n",
__func__, pe->phb->global_number, pe->addr);
- return false;
+ return;
}
+ eeh_pe_state_mark(pe, EEH_PE_RECOVERING);
+
eeh_pe_update_time_stamp(pe);
pe->freeze_count++;
if (pe->freeze_count > eeh_max_freezes) {
@@ -904,7 +905,7 @@ bool eeh_handle_normal_event(struct eeh_pe *pe)
pr_info("EEH: Notify device driver to resume\n");
eeh_pe_dev_traverse(pe, eeh_report_resume, NULL);
- return false;
+ goto final;
hard_fail:
/*
@@ -940,12 +941,12 @@ bool eeh_handle_normal_event(struct eeh_pe *pe)
pci_lock_rescan_remove();
pci_hp_remove_devices(frozen_bus);
pci_unlock_rescan_remove();
-
/* The passed PE should no longer be used */
- return true;
+ return;
}
}
- return false;
+final:
+ eeh_pe_state_clear(pe, EEH_PE_RECOVERING);
}
/**
@@ -1018,15 +1019,7 @@ void eeh_handle_special_event(void)
*/
if (rc == EEH_NEXT_ERR_FROZEN_PE ||
rc == EEH_NEXT_ERR_FENCED_PHB) {
- /*
- * eeh_handle_normal_event() can make the PE stale if it
- * determines that the PE cannot possibly be recovered.
- * Don't modify the PE state if that's the case.
- */
- if (eeh_handle_normal_event(pe))
- continue;
-
- eeh_pe_state_clear(pe, EEH_PE_RECOVERING);
+ eeh_handle_normal_event(pe);
} else {
pci_lock_rescan_remove();
list_for_each_entry(hose, &hose_list, list_node) {
diff --git a/arch/powerpc/kernel/eeh_event.c b/arch/powerpc/kernel/eeh_event.c
index 872bcfe8f90e..61c9356bf9c9 100644
--- a/arch/powerpc/kernel/eeh_event.c
+++ b/arch/powerpc/kernel/eeh_event.c
@@ -73,7 +73,6 @@ static int eeh_event_handler(void * dummy)
/* We might have event without binding PE */
pe = event->pe;
if (pe) {
- eeh_pe_state_mark(pe, EEH_PE_RECOVERING);
if (pe->type & EEH_PE_PHB)
pr_info("EEH: Detected error on PHB#%x\n",
pe->phb->global_number);
@@ -82,7 +81,6 @@ static int eeh_event_handler(void * dummy)
"PHB#%x-PE#%x\n",
pe->phb->global_number, pe->addr);
eeh_handle_normal_event(pe);
- eeh_pe_state_clear(pe, EEH_PE_RECOVERING);
} else {
eeh_handle_special_event();
}
--
2.16.1.74.g9b0b1f47b
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH v2 3/9] powerpc/eeh: Fix misleading comment in __eeh_addr_cache_get_device()
2018-03-19 2:45 [PATCH v2 0/9] EEH refactoring 1 Sam Bobroff
2018-03-19 2:46 ` [PATCH v2 1/9] powerpc/eeh: Remove eeh_handle_event() Sam Bobroff
2018-03-19 2:46 ` [PATCH v2 2/9] powerpc/eeh: Manage EEH_PE_RECOVERING inside eeh_handle_normal_event() Sam Bobroff
@ 2018-03-19 2:46 ` Sam Bobroff
2018-03-21 5:17 ` Alexey Kardashevskiy
2018-03-19 2:46 ` [PATCH v2 4/9] powerpc/eeh: Remove misleading test in eeh_handle_normal_event() Sam Bobroff
` (6 subsequent siblings)
9 siblings, 1 reply; 24+ messages in thread
From: Sam Bobroff @ 2018-03-19 2:46 UTC (permalink / raw)
To: linuxppc-dev
Commit "0ba178888b05 powerpc/eeh: Remove reference to PCI device"
removed a call to pci_dev_get() from __eeh_addr_cache_get_device() but
did not update the comment to match.
Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
---
arch/powerpc/kernel/eeh_cache.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/arch/powerpc/kernel/eeh_cache.c b/arch/powerpc/kernel/eeh_cache.c
index d4cc26618809..201943d54a6e 100644
--- a/arch/powerpc/kernel/eeh_cache.c
+++ b/arch/powerpc/kernel/eeh_cache.c
@@ -84,8 +84,7 @@ static inline struct eeh_dev *__eeh_addr_cache_get_device(unsigned long addr)
* @addr: mmio (PIO) phys address or i/o port number
*
* Given an mmio phys address, or a port number, find a pci device
- * that implements this address. Be sure to pci_dev_put the device
- * when finished. I/O port numbers are assumed to be offset
+ * that implements this address. I/O port numbers are assumed to be offset
* from zero (that is, they do *not* have pci_io_addr added in).
* It is safe to call this function within an interrupt.
*/
--
2.16.1.74.g9b0b1f47b
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH v2 4/9] powerpc/eeh: Remove misleading test in eeh_handle_normal_event()
2018-03-19 2:45 [PATCH v2 0/9] EEH refactoring 1 Sam Bobroff
` (2 preceding siblings ...)
2018-03-19 2:46 ` [PATCH v2 3/9] powerpc/eeh: Fix misleading comment in __eeh_addr_cache_get_device() Sam Bobroff
@ 2018-03-19 2:46 ` Sam Bobroff
2018-03-21 5:18 ` Alexey Kardashevskiy
2018-03-19 2:47 ` [PATCH v2 5/9] powerpc/eeh: Rename frozen_bus to bus " Sam Bobroff
` (5 subsequent siblings)
9 siblings, 1 reply; 24+ messages in thread
From: Sam Bobroff @ 2018-03-19 2:46 UTC (permalink / raw)
To: linuxppc-dev
Remove a test that checks if "frozen_bus" is NULL, because it cannot
have changed since it was tested at the start of the function and so
must be true here.
Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
---
arch/powerpc/kernel/eeh_driver.c | 24 +++++++++++-------------
1 file changed, 11 insertions(+), 13 deletions(-)
diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
index 5b7a5ed4db4d..04a5d9db5499 100644
--- a/arch/powerpc/kernel/eeh_driver.c
+++ b/arch/powerpc/kernel/eeh_driver.c
@@ -930,20 +930,18 @@ void eeh_handle_normal_event(struct eeh_pe *pe)
* all removed devices correctly to avoid access
* the their PCI config any more.
*/
- if (frozen_bus) {
- if (pe->type & EEH_PE_VF) {
- eeh_pe_dev_traverse(pe, eeh_rmv_device, NULL);
- eeh_pe_dev_mode_mark(pe, EEH_DEV_REMOVED);
- } else {
- eeh_pe_state_clear(pe, EEH_PE_PRI_BUS);
- eeh_pe_dev_mode_mark(pe, EEH_DEV_REMOVED);
+ if (pe->type & EEH_PE_VF) {
+ eeh_pe_dev_traverse(pe, eeh_rmv_device, NULL);
+ eeh_pe_dev_mode_mark(pe, EEH_DEV_REMOVED);
+ } else {
+ eeh_pe_state_clear(pe, EEH_PE_PRI_BUS);
+ eeh_pe_dev_mode_mark(pe, EEH_DEV_REMOVED);
- pci_lock_rescan_remove();
- pci_hp_remove_devices(frozen_bus);
- pci_unlock_rescan_remove();
- /* The passed PE should no longer be used */
- return;
- }
+ pci_lock_rescan_remove();
+ pci_hp_remove_devices(frozen_bus);
+ pci_unlock_rescan_remove();
+ /* The passed PE should no longer be used */
+ return;
}
final:
eeh_pe_state_clear(pe, EEH_PE_RECOVERING);
--
2.16.1.74.g9b0b1f47b
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH v2 5/9] powerpc/eeh: Rename frozen_bus to bus in eeh_handle_normal_event()
2018-03-19 2:45 [PATCH v2 0/9] EEH refactoring 1 Sam Bobroff
` (3 preceding siblings ...)
2018-03-19 2:46 ` [PATCH v2 4/9] powerpc/eeh: Remove misleading test in eeh_handle_normal_event() Sam Bobroff
@ 2018-03-19 2:47 ` Sam Bobroff
2018-03-21 5:18 ` Alexey Kardashevskiy
2018-03-19 2:48 ` [PATCH v2 6/9] powerpc/eeh: Clarify arguments to eeh_reset_device() Sam Bobroff
` (4 subsequent siblings)
9 siblings, 1 reply; 24+ messages in thread
From: Sam Bobroff @ 2018-03-19 2:47 UTC (permalink / raw)
To: linuxppc-dev
The name "frozen_bus" is misleading: it's not necessarily frozen, it's
just the PE's PCI bus.
Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
---
arch/powerpc/kernel/eeh_driver.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
index 04a5d9db5499..cb584d72b0a5 100644
--- a/arch/powerpc/kernel/eeh_driver.c
+++ b/arch/powerpc/kernel/eeh_driver.c
@@ -754,14 +754,14 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus,
*/
void eeh_handle_normal_event(struct eeh_pe *pe)
{
- struct pci_bus *frozen_bus;
+ struct pci_bus *bus;
struct eeh_dev *edev, *tmp;
int rc = 0;
enum pci_ers_result result = PCI_ERS_RESULT_NONE;
struct eeh_rmv_data rmv_data = {LIST_HEAD_INIT(rmv_data.edev_list), 0};
- frozen_bus = eeh_pe_bus_get(pe);
- if (!frozen_bus) {
+ bus = eeh_pe_bus_get(pe);
+ if (!bus) {
pr_err("%s: Cannot find PCI bus for PHB#%x-PE#%x\n",
__func__, pe->phb->global_number, pe->addr);
return;
@@ -820,7 +820,7 @@ void eeh_handle_normal_event(struct eeh_pe *pe)
*/
if (result == PCI_ERS_RESULT_NONE) {
pr_info("EEH: Reset with hotplug activity\n");
- rc = eeh_reset_device(pe, frozen_bus, NULL);
+ rc = eeh_reset_device(pe, bus, NULL);
if (rc) {
pr_warn("%s: Unable to reset, err=%d\n",
__func__, rc);
@@ -938,7 +938,7 @@ void eeh_handle_normal_event(struct eeh_pe *pe)
eeh_pe_dev_mode_mark(pe, EEH_DEV_REMOVED);
pci_lock_rescan_remove();
- pci_hp_remove_devices(frozen_bus);
+ pci_hp_remove_devices(bus);
pci_unlock_rescan_remove();
/* The passed PE should no longer be used */
return;
--
2.16.1.74.g9b0b1f47b
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH v2 6/9] powerpc/eeh: Clarify arguments to eeh_reset_device()
2018-03-19 2:45 [PATCH v2 0/9] EEH refactoring 1 Sam Bobroff
` (4 preceding siblings ...)
2018-03-19 2:47 ` [PATCH v2 5/9] powerpc/eeh: Rename frozen_bus to bus " Sam Bobroff
@ 2018-03-19 2:48 ` Sam Bobroff
2018-03-21 5:18 ` Alexey Kardashevskiy
2018-03-19 2:49 ` [PATCH v2 7/9] powerpc/eeh: Remove always-true tests in eeh_reset_device() Sam Bobroff
` (3 subsequent siblings)
9 siblings, 1 reply; 24+ messages in thread
From: Sam Bobroff @ 2018-03-19 2:48 UTC (permalink / raw)
To: linuxppc-dev
It is currently difficult to understand the behaviour of
eeh_reset_device() due to the way it's parameters are used. In
particular, when 'bus' is NULL, it's value is still necessary so the
same value is looked up again locally under a different name
('frozen_bus') but behaviour is changed.
To clarify this, add a new parameter 'driver_eeh_aware', and have the
caller set it when it would have passed NULL for 'bus' and always pass
a value for 'bus'. Then change any test that was on 'bus' to one on
'!driver_eeh_aware' and replace uses of 'frozen_bus' with 'bus'.
Also update the function's comment.
This should not change behaviour.
Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
---
arch/powerpc/kernel/eeh_driver.c | 20 +++++++++++---------
1 file changed, 11 insertions(+), 9 deletions(-)
diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
index cb584d72b0a5..07437d765434 100644
--- a/arch/powerpc/kernel/eeh_driver.c
+++ b/arch/powerpc/kernel/eeh_driver.c
@@ -619,17 +619,19 @@ int eeh_pe_reset_and_recover(struct eeh_pe *pe)
/**
* eeh_reset_device - Perform actual reset of a pci slot
+ * @driver_eeh_aware: Does the device's driver provide EEH support?
* @pe: EEH PE
* @bus: PCI bus corresponding to the isolcated slot
+ * @rmv_data: Optional, list to record removed devices
*
* This routine must be called to do reset on the indicated PE.
* During the reset, udev might be invoked because those affected
* PCI devices will be removed and then added.
*/
static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus,
- struct eeh_rmv_data *rmv_data)
+ struct eeh_rmv_data *rmv_data,
+ bool driver_eeh_aware)
{
- struct pci_bus *frozen_bus = eeh_pe_bus_get(pe);
time64_t tstamp;
int cnt, rc;
struct eeh_dev *edev;
@@ -645,7 +647,7 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus,
* into pci_hp_add_devices().
*/
eeh_pe_state_mark(pe, EEH_PE_KEEP);
- if (bus) {
+ if (!driver_eeh_aware) {
if (pe->type & EEH_PE_VF) {
eeh_pe_dev_traverse(pe, eeh_rmv_device, NULL);
} else {
@@ -653,7 +655,7 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus,
pci_hp_remove_devices(bus);
pci_unlock_rescan_remove();
}
- } else if (frozen_bus) {
+ } else if (bus) {
eeh_pe_dev_traverse(pe, eeh_rmv_device, rmv_data);
}
@@ -689,7 +691,7 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus,
* the device up before the scripts have taken it down,
* potentially weird things happen.
*/
- if (bus) {
+ if (!driver_eeh_aware) {
pr_info("EEH: Sleep 5s ahead of complete hotplug\n");
ssleep(5);
@@ -706,7 +708,7 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus,
eeh_pe_state_clear(pe, EEH_PE_PRI_BUS);
pci_hp_add_devices(bus);
}
- } else if (frozen_bus && rmv_data->removed) {
+ } else if (bus && rmv_data->removed) {
pr_info("EEH: Sleep 5s ahead of partial hotplug\n");
ssleep(5);
@@ -715,7 +717,7 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus,
if (pe->type & EEH_PE_VF)
eeh_add_virt_device(edev, NULL);
else
- pci_hp_add_devices(frozen_bus);
+ pci_hp_add_devices(bus);
}
eeh_pe_state_clear(pe, EEH_PE_KEEP);
@@ -820,7 +822,7 @@ void eeh_handle_normal_event(struct eeh_pe *pe)
*/
if (result == PCI_ERS_RESULT_NONE) {
pr_info("EEH: Reset with hotplug activity\n");
- rc = eeh_reset_device(pe, bus, NULL);
+ rc = eeh_reset_device(pe, bus, NULL, false);
if (rc) {
pr_warn("%s: Unable to reset, err=%d\n",
__func__, rc);
@@ -872,7 +874,7 @@ void eeh_handle_normal_event(struct eeh_pe *pe)
/* If any device called out for a reset, then reset the slot */
if (result == PCI_ERS_RESULT_NEED_RESET) {
pr_info("EEH: Reset without hotplug activity\n");
- rc = eeh_reset_device(pe, NULL, &rmv_data);
+ rc = eeh_reset_device(pe, bus, &rmv_data, true);
if (rc) {
pr_warn("%s: Cannot reset, err=%d\n",
__func__, rc);
--
2.16.1.74.g9b0b1f47b
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH v2 7/9] powerpc/eeh: Remove always-true tests in eeh_reset_device()
2018-03-19 2:45 [PATCH v2 0/9] EEH refactoring 1 Sam Bobroff
` (5 preceding siblings ...)
2018-03-19 2:48 ` [PATCH v2 6/9] powerpc/eeh: Clarify arguments to eeh_reset_device() Sam Bobroff
@ 2018-03-19 2:49 ` Sam Bobroff
2018-03-21 5:18 ` Alexey Kardashevskiy
2018-03-19 2:49 ` [PATCH v2 8/9] powerpc/eeh: Factor out common code eeh_reset_device() Sam Bobroff
` (2 subsequent siblings)
9 siblings, 1 reply; 24+ messages in thread
From: Sam Bobroff @ 2018-03-19 2:49 UTC (permalink / raw)
To: linuxppc-dev
eeh_reset_device() tests the value of 'bus' more than once but the
only caller, eeh_handle_normal_device() does this test itself and will
never pass NULL.
So, remove the dead tests.
This should not change behaviour.
Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
---
arch/powerpc/kernel/eeh_driver.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
index 07437d765434..93fc22e791fa 100644
--- a/arch/powerpc/kernel/eeh_driver.c
+++ b/arch/powerpc/kernel/eeh_driver.c
@@ -655,7 +655,7 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus,
pci_hp_remove_devices(bus);
pci_unlock_rescan_remove();
}
- } else if (bus) {
+ } else {
eeh_pe_dev_traverse(pe, eeh_rmv_device, rmv_data);
}
@@ -708,7 +708,7 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus,
eeh_pe_state_clear(pe, EEH_PE_PRI_BUS);
pci_hp_add_devices(bus);
}
- } else if (bus && rmv_data->removed) {
+ } else if (rmv_data->removed) {
pr_info("EEH: Sleep 5s ahead of partial hotplug\n");
ssleep(5);
--
2.16.1.74.g9b0b1f47b
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH v2 8/9] powerpc/eeh: Factor out common code eeh_reset_device()
2018-03-19 2:45 [PATCH v2 0/9] EEH refactoring 1 Sam Bobroff
` (6 preceding siblings ...)
2018-03-19 2:49 ` [PATCH v2 7/9] powerpc/eeh: Remove always-true tests in eeh_reset_device() Sam Bobroff
@ 2018-03-19 2:49 ` Sam Bobroff
2018-03-20 8:31 ` kbuild test robot
2018-03-21 2:06 ` [PATCH v3 " Sam Bobroff
2018-03-19 2:49 ` [PATCH v2 9/9] powerpc/eeh: Add eeh_state_active() helper Sam Bobroff
2018-03-21 5:16 ` [PATCH v2 0/9] EEH refactoring 1 Alexey Kardashevskiy
9 siblings, 2 replies; 24+ messages in thread
From: Sam Bobroff @ 2018-03-19 2:49 UTC (permalink / raw)
To: linuxppc-dev
The caller will always pass NULL for 'rmv_data' when
'eeh_aware_driver' is true, so the first two calls to
eeh_pe_dev_traverse() can be combined without changing behaviour as
can the two arms of the final 'if' block.
This should not change behaviour.
Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
---
====== v1 -> v2: ======
arch/powerpc/kernel/eeh_driver.c | 32 ++++++++++----------------------
1 file changed, 10 insertions(+), 22 deletions(-)
diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
index 93fc22e791fa..aa3a2b08aec9 100644
--- a/arch/powerpc/kernel/eeh_driver.c
+++ b/arch/powerpc/kernel/eeh_driver.c
@@ -647,16 +647,12 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus,
* into pci_hp_add_devices().
*/
eeh_pe_state_mark(pe, EEH_PE_KEEP);
- if (!driver_eeh_aware) {
- if (pe->type & EEH_PE_VF) {
- eeh_pe_dev_traverse(pe, eeh_rmv_device, NULL);
- } else {
- pci_lock_rescan_remove();
- pci_hp_remove_devices(bus);
- pci_unlock_rescan_remove();
- }
- } else {
+ if (driver_eeh_aware || (pe->type & EEH_PE_VF)) {
eeh_pe_dev_traverse(pe, eeh_rmv_device, rmv_data);
+ } else {
+ pci_lock_rescan_remove();
+ pci_hp_remove_devices(bus);
+ pci_unlock_rescan_remove();
}
/*
@@ -691,8 +687,9 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus,
* the device up before the scripts have taken it down,
* potentially weird things happen.
*/
- if (!driver_eeh_aware) {
- pr_info("EEH: Sleep 5s ahead of complete hotplug\n");
+ if (!driver_eeh_aware || rmv_data->removed) {
+ pr_info("EEH: Sleep 5s ahead of %s hotplug\n",
+ (eeh_aware_driver ? "partial" : "complete"));
ssleep(5);
/*
@@ -705,19 +702,10 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus,
if (pe->type & EEH_PE_VF) {
eeh_add_virt_device(edev, NULL);
} else {
- eeh_pe_state_clear(pe, EEH_PE_PRI_BUS);
+ if (!eeh_aware_driver)
+ eeh_pe_state_clear(pe, EEH_PE_PRI_BUS);
pci_hp_add_devices(bus);
}
- } else if (rmv_data->removed) {
- pr_info("EEH: Sleep 5s ahead of partial hotplug\n");
- ssleep(5);
-
- edev = list_first_entry(&pe->edevs, struct eeh_dev, list);
- eeh_pe_traverse(pe, eeh_pe_detach_dev, NULL);
- if (pe->type & EEH_PE_VF)
- eeh_add_virt_device(edev, NULL);
- else
- pci_hp_add_devices(bus);
}
eeh_pe_state_clear(pe, EEH_PE_KEEP);
--
2.16.1.74.g9b0b1f47b
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH v2 9/9] powerpc/eeh: Add eeh_state_active() helper
2018-03-19 2:45 [PATCH v2 0/9] EEH refactoring 1 Sam Bobroff
` (7 preceding siblings ...)
2018-03-19 2:49 ` [PATCH v2 8/9] powerpc/eeh: Factor out common code eeh_reset_device() Sam Bobroff
@ 2018-03-19 2:49 ` Sam Bobroff
2018-03-20 4:05 ` kbuild test robot
2018-03-21 5:20 ` Alexey Kardashevskiy
2018-03-21 5:16 ` [PATCH v2 0/9] EEH refactoring 1 Alexey Kardashevskiy
9 siblings, 2 replies; 24+ messages in thread
From: Sam Bobroff @ 2018-03-19 2:49 UTC (permalink / raw)
To: linuxppc-dev
Checking for a "fully active" device state requires testing two flag
bits, which is open coded in several places, so add a function to do
it.
Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
---
arch/powerpc/include/asm/eeh.h | 6 ++++++
arch/powerpc/kernel/eeh.c | 19 ++++++-------------
arch/powerpc/platforms/powernv/eeh-powernv.c | 9 ++-------
3 files changed, 14 insertions(+), 20 deletions(-)
diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
index fd37cc101f4f..c2266ca61853 100644
--- a/arch/powerpc/include/asm/eeh.h
+++ b/arch/powerpc/include/asm/eeh.h
@@ -256,6 +256,12 @@ static inline void eeh_serialize_unlock(unsigned long flags)
raw_spin_unlock_irqrestore(&confirm_error_lock, flags);
}
+static inline bool eeh_state_active(int state)
+{
+ return (state & (EEH_STATE_MMIO_ACTIVE | EEH_STATE_DMA_ACTIVE))
+ == (EEH_STATE_MMIO_ACTIVE | EEH_STATE_DMA_ACTIVE);
+}
+
typedef void *(*eeh_traverse_func)(void *data, void *flag);
void eeh_set_pe_aux_size(int size);
int eeh_phb_pe_create(struct pci_controller *phb);
diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
index 2b9df0040d6b..bc640e4c5ca5 100644
--- a/arch/powerpc/kernel/eeh.c
+++ b/arch/powerpc/kernel/eeh.c
@@ -394,9 +394,7 @@ static int eeh_phb_check_failure(struct eeh_pe *pe)
/* Check PHB state */
ret = eeh_ops->get_state(phb_pe, NULL);
if ((ret < 0) ||
- (ret == EEH_STATE_NOT_SUPPORT) ||
- (ret & (EEH_STATE_MMIO_ACTIVE | EEH_STATE_DMA_ACTIVE)) ==
- (EEH_STATE_MMIO_ACTIVE | EEH_STATE_DMA_ACTIVE)) {
+ (ret == EEH_STATE_NOT_SUPPORT) || eeh_state_active(ret)) {
ret = 0;
goto out;
}
@@ -433,7 +431,6 @@ static int eeh_phb_check_failure(struct eeh_pe *pe)
int eeh_dev_check_failure(struct eeh_dev *edev)
{
int ret;
- int active_flags = (EEH_STATE_MMIO_ACTIVE | EEH_STATE_DMA_ACTIVE);
unsigned long flags;
struct device_node *dn;
struct pci_dev *dev;
@@ -525,8 +522,7 @@ int eeh_dev_check_failure(struct eeh_dev *edev)
* state, PE is in good state.
*/
if ((ret < 0) ||
- (ret == EEH_STATE_NOT_SUPPORT) ||
- ((ret & active_flags) == active_flags)) {
+ (ret == EEH_STATE_NOT_SUPPORT) || eeh_state_active(ret)) {
eeh_stats.false_positives++;
pe->false_positives++;
rc = 0;
@@ -546,8 +542,7 @@ int eeh_dev_check_failure(struct eeh_dev *edev)
/* Frozen parent PE ? */
ret = eeh_ops->get_state(parent_pe, NULL);
- if (ret > 0 &&
- (ret & active_flags) != active_flags)
+ if (ret > 0 && !eeh_state_active(ret))
pe = parent_pe;
/* Next parent level */
@@ -888,7 +883,6 @@ static void *eeh_set_dev_freset(void *data, void *flag)
*/
int eeh_pe_reset_full(struct eeh_pe *pe)
{
- int active_flags = (EEH_STATE_MMIO_ACTIVE | EEH_STATE_DMA_ACTIVE);
int reset_state = (EEH_PE_RESET | EEH_PE_CFG_BLOCKED);
int type = EEH_RESET_HOT;
unsigned int freset = 0;
@@ -919,7 +913,7 @@ int eeh_pe_reset_full(struct eeh_pe *pe)
/* Wait until the PE is in a functioning state */
state = eeh_ops->wait_state(pe, PCI_BUS_RESET_WAIT_MSEC);
- if ((state & active_flags) == active_flags)
+ if (eeh_state_active(state))
break;
if (state < 0) {
@@ -1352,16 +1346,15 @@ static int eeh_pe_change_owner(struct eeh_pe *pe)
struct eeh_dev *edev, *tmp;
struct pci_dev *pdev;
struct pci_device_id *id;
- int flags, ret;
+ int ret;
/* Check PE state */
- flags = (EEH_STATE_MMIO_ACTIVE | EEH_STATE_DMA_ACTIVE);
ret = eeh_ops->get_state(pe, NULL);
if (ret < 0 || ret == EEH_STATE_NOT_SUPPORT)
return 0;
/* Unfrozen PE, nothing to do */
- if ((ret & flags) == flags)
+ if (eeh_state_active(ret))
return 0;
/* Frozen PE, check if it needs PE level reset */
diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
index 33c86c1a1720..ddfc3544d285 100644
--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
@@ -1425,11 +1425,8 @@ static int pnv_eeh_get_pe(struct pci_controller *hose,
dev_pe = dev_pe->parent;
while (dev_pe && !(dev_pe->type & EEH_PE_PHB)) {
int ret;
- int active_flags = (EEH_STATE_MMIO_ACTIVE |
- EEH_STATE_DMA_ACTIVE);
-
ret = eeh_ops->get_state(dev_pe, NULL);
- if (ret <= 0 || (ret & active_flags) == active_flags) {
+ if (ret <= 0 || eeh_state_active(ret)) {
dev_pe = dev_pe->parent;
continue;
}
@@ -1463,7 +1460,6 @@ static int pnv_eeh_next_error(struct eeh_pe **pe)
struct eeh_pe *phb_pe, *parent_pe;
__be64 frozen_pe_no;
__be16 err_type, severity;
- int active_flags = (EEH_STATE_MMIO_ACTIVE | EEH_STATE_DMA_ACTIVE);
long rc;
int state, ret = EEH_NEXT_ERR_NONE;
@@ -1626,8 +1622,7 @@ static int pnv_eeh_next_error(struct eeh_pe **pe)
/* Frozen parent PE ? */
state = eeh_ops->get_state(parent_pe, NULL);
- if (state > 0 &&
- (state & active_flags) != active_flags)
+ if (state > 0 && !eeh_state_active(state))
*pe = parent_pe;
/* Next parent level */
--
2.16.1.74.g9b0b1f47b
^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [PATCH v2 9/9] powerpc/eeh: Add eeh_state_active() helper
2018-03-19 2:49 ` [PATCH v2 9/9] powerpc/eeh: Add eeh_state_active() helper Sam Bobroff
@ 2018-03-20 4:05 ` kbuild test robot
2018-03-21 5:20 ` Alexey Kardashevskiy
1 sibling, 0 replies; 24+ messages in thread
From: kbuild test robot @ 2018-03-20 4:05 UTC (permalink / raw)
To: Sam Bobroff; +Cc: kbuild-all, linuxppc-dev
[-- Attachment #1: Type: text/plain, Size: 13960 bytes --]
Hi Sam,
Thank you for the patch! Yet something to improve:
[auto build test ERROR on v4.16-rc4]
[also build test ERROR on next-20180319]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
url: https://github.com/0day-ci/linux/commits/Sam-Bobroff/EEH-refactoring-1/20180320-024729
config: powerpc-ppc64_defconfig (attached as .config)
compiler: powerpc64-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
reproduce:
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=powerpc
All errors (new ones prefixed by >>):
In file included from include/linux/kernel.h:14:0,
from include/linux/delay.h:22,
from arch/powerpc/kernel/eeh_driver.c:25:
arch/powerpc/kernel/eeh_driver.c: In function 'eeh_reset_device':
>> arch/powerpc/kernel/eeh_driver.c:692:5: error: 'eeh_aware_driver' undeclared (first use in this function); did you mean 'eeh_state_active'?
(eeh_aware_driver ? "partial" : "complete"));
^
include/linux/printk.h:308:34: note: in definition of macro 'pr_info'
printk(KERN_INFO pr_fmt(fmt), ##__VA_ARGS__)
^~~~~~~~~~~
arch/powerpc/kernel/eeh_driver.c:692:5: note: each undeclared identifier is reported only once for each function it appears in
(eeh_aware_driver ? "partial" : "complete"));
^
include/linux/printk.h:308:34: note: in definition of macro 'pr_info'
printk(KERN_INFO pr_fmt(fmt), ##__VA_ARGS__)
^~~~~~~~~~~
vim +692 arch/powerpc/kernel/eeh_driver.c
5cfb20b96 arch/powerpc/kernel/eeh_driver.c Gavin Shan 2014-09-30 619
77bd74156 arch/powerpc/platforms/pseries/eeh_driver.c Linas Vepstas 2005-11-03 620 /**
29f8bf1b7 arch/powerpc/platforms/pseries/eeh_driver.c Gavin Shan 2012-02-27 621 * eeh_reset_device - Perform actual reset of a pci slot
a10e51924 arch/powerpc/kernel/eeh_driver.c Sam Bobroff 2018-03-19 622 * @driver_eeh_aware: Does the device's driver provide EEH support?
9b3c76f08 arch/powerpc/platforms/pseries/eeh_driver.c Gavin Shan 2012-09-07 623 * @pe: EEH PE
29f8bf1b7 arch/powerpc/platforms/pseries/eeh_driver.c Gavin Shan 2012-02-27 624 * @bus: PCI bus corresponding to the isolcated slot
a10e51924 arch/powerpc/kernel/eeh_driver.c Sam Bobroff 2018-03-19 625 * @rmv_data: Optional, list to record removed devices
77bd74156 arch/powerpc/platforms/pseries/eeh_driver.c Linas Vepstas 2005-11-03 626 *
29f8bf1b7 arch/powerpc/platforms/pseries/eeh_driver.c Gavin Shan 2012-02-27 627 * This routine must be called to do reset on the indicated PE.
29f8bf1b7 arch/powerpc/platforms/pseries/eeh_driver.c Gavin Shan 2012-02-27 628 * During the reset, udev might be invoked because those affected
29f8bf1b7 arch/powerpc/platforms/pseries/eeh_driver.c Gavin Shan 2012-02-27 629 * PCI devices will be removed and then added.
77bd74156 arch/powerpc/platforms/pseries/eeh_driver.c Linas Vepstas 2005-11-03 630 */
67086e32b arch/powerpc/kernel/eeh_driver.c Wei Yang 2016-03-04 631 static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus,
a10e51924 arch/powerpc/kernel/eeh_driver.c Sam Bobroff 2018-03-19 632 struct eeh_rmv_data *rmv_data,
a10e51924 arch/powerpc/kernel/eeh_driver.c Sam Bobroff 2018-03-19 633 bool driver_eeh_aware)
77bd74156 arch/powerpc/platforms/pseries/eeh_driver.c Linas Vepstas 2005-11-03 634 {
edfd17ff3 arch/powerpc/kernel/eeh_driver.c Arnd Bergmann 2017-11-04 635 time64_t tstamp;
67086e32b arch/powerpc/kernel/eeh_driver.c Wei Yang 2016-03-04 636 int cnt, rc;
67086e32b arch/powerpc/kernel/eeh_driver.c Wei Yang 2016-03-04 637 struct eeh_dev *edev;
424054566 arch/powerpc/platforms/pseries/eeh_driver.c Linas Vepstas 2006-04-28 638
424054566 arch/powerpc/platforms/pseries/eeh_driver.c Linas Vepstas 2006-04-28 639 /* pcibios will clear the counter; save the value */
9b3c76f08 arch/powerpc/platforms/pseries/eeh_driver.c Gavin Shan 2012-09-07 640 cnt = pe->freeze_count;
5a71978e4 arch/powerpc/kernel/eeh_driver.c Gavin Shan 2013-06-20 641 tstamp = pe->tstamp;
424054566 arch/powerpc/platforms/pseries/eeh_driver.c Linas Vepstas 2006-04-28 642
20ee6a970 arch/powerpc/platforms/pseries/eeh_driver.c Gavin Shan 2012-09-11 643 /*
20ee6a970 arch/powerpc/platforms/pseries/eeh_driver.c Gavin Shan 2012-09-11 644 * We don't remove the corresponding PE instances because
20ee6a970 arch/powerpc/platforms/pseries/eeh_driver.c Gavin Shan 2012-09-11 645 * we need the information afterwords. The attached EEH
20ee6a970 arch/powerpc/platforms/pseries/eeh_driver.c Gavin Shan 2012-09-11 646 * devices are expected to be attached soon when calling
bd251b893 arch/powerpc/kernel/eeh_driver.c Gavin Shan 2016-05-03 647 * into pci_hp_add_devices().
20ee6a970 arch/powerpc/platforms/pseries/eeh_driver.c Gavin Shan 2012-09-11 648 */
807a827d4 arch/powerpc/kernel/eeh_driver.c Gavin Shan 2013-07-24 649 eeh_pe_state_mark(pe, EEH_PE_KEEP);
e723e4b42 arch/powerpc/kernel/eeh_driver.c Sam Bobroff 2018-03-19 650 if (driver_eeh_aware || (pe->type & EEH_PE_VF)) {
e723e4b42 arch/powerpc/kernel/eeh_driver.c Sam Bobroff 2018-03-19 651 eeh_pe_dev_traverse(pe, eeh_rmv_device, rmv_data);
67086e32b arch/powerpc/kernel/eeh_driver.c Wei Yang 2016-03-04 652 } else {
1c2042c83 arch/powerpc/kernel/eeh_driver.c Rafael J. Wysocki 2014-01-15 653 pci_lock_rescan_remove();
bd251b893 arch/powerpc/kernel/eeh_driver.c Gavin Shan 2016-05-03 654 pci_hp_remove_devices(bus);
1c2042c83 arch/powerpc/kernel/eeh_driver.c Rafael J. Wysocki 2014-01-15 655 pci_unlock_rescan_remove();
67086e32b arch/powerpc/kernel/eeh_driver.c Wei Yang 2016-03-04 656 }
77bd74156 arch/powerpc/platforms/pseries/eeh_driver.c Linas Vepstas 2005-11-03 657
d0914f503 arch/powerpc/kernel/eeh_driver.c Gavin Shan 2014-04-24 658 /*
d0914f503 arch/powerpc/kernel/eeh_driver.c Gavin Shan 2014-04-24 659 * Reset the pci controller. (Asserts RST#; resets config space).
b6495c0c8 arch/powerpc/platforms/pseries/eeh_driver.c Linas Vepstas 2005-11-03 660 * Reconfigure bridges and devices. Don't try to bring the system
29f8bf1b7 arch/powerpc/platforms/pseries/eeh_driver.c Gavin Shan 2012-02-27 661 * up if the reset failed for some reason.
d0914f503 arch/powerpc/kernel/eeh_driver.c Gavin Shan 2014-04-24 662 *
d0914f503 arch/powerpc/kernel/eeh_driver.c Gavin Shan 2014-04-24 663 * During the reset, it's very dangerous to have uncontrolled PCI
d0914f503 arch/powerpc/kernel/eeh_driver.c Gavin Shan 2014-04-24 664 * config accesses. So we prefer to block them. However, controlled
d0914f503 arch/powerpc/kernel/eeh_driver.c Gavin Shan 2014-04-24 665 * PCI config accesses initiated from EEH itself are allowed.
29f8bf1b7 arch/powerpc/platforms/pseries/eeh_driver.c Gavin Shan 2012-02-27 666 */
6654c9368 arch/powerpc/kernel/eeh_driver.c Russell Currey 2016-11-17 667 rc = eeh_pe_reset_full(pe);
28bf36f92 arch/powerpc/kernel/eeh_driver.c Gavin Shan 2014-11-14 668 if (rc)
b6495c0c8 arch/powerpc/platforms/pseries/eeh_driver.c Linas Vepstas 2005-11-03 669 return rc;
77bd74156 arch/powerpc/platforms/pseries/eeh_driver.c Linas Vepstas 2005-11-03 670
1c2042c83 arch/powerpc/kernel/eeh_driver.c Rafael J. Wysocki 2014-01-15 671 pci_lock_rescan_remove();
1c2042c83 arch/powerpc/kernel/eeh_driver.c Rafael J. Wysocki 2014-01-15 672
9b3c76f08 arch/powerpc/platforms/pseries/eeh_driver.c Gavin Shan 2012-09-07 673 /* Restore PE */
9b3c76f08 arch/powerpc/platforms/pseries/eeh_driver.c Gavin Shan 2012-09-07 674 eeh_ops->configure_bridge(pe);
9b3c76f08 arch/powerpc/platforms/pseries/eeh_driver.c Gavin Shan 2012-09-07 675 eeh_pe_restore_bars(pe);
77bd74156 arch/powerpc/platforms/pseries/eeh_driver.c Linas Vepstas 2005-11-03 676
dc9c41bd9 arch/powerpc/kernel/eeh_driver.c Andrew Donnellan 2015-12-08 677 /* Clear frozen state */
5cfb20b96 arch/powerpc/kernel/eeh_driver.c Gavin Shan 2014-09-30 678 rc = eeh_clear_pe_frozen_state(pe, false);
409bf7f8a arch/powerpc/kernel/eeh_driver.c Andrew Donnellan 2016-12-01 679 if (rc) {
409bf7f8a arch/powerpc/kernel/eeh_driver.c Andrew Donnellan 2016-12-01 680 pci_unlock_rescan_remove();
789547006 arch/powerpc/kernel/eeh_driver.c Gavin Shan 2014-04-24 681 return rc;
409bf7f8a arch/powerpc/kernel/eeh_driver.c Andrew Donnellan 2016-12-01 682 }
789547006 arch/powerpc/kernel/eeh_driver.c Gavin Shan 2014-04-24 683
77bd74156 arch/powerpc/platforms/pseries/eeh_driver.c Linas Vepstas 2005-11-03 684 /* Give the system 5 seconds to finish running the user-space
77bd74156 arch/powerpc/platforms/pseries/eeh_driver.c Linas Vepstas 2005-11-03 685 * hotplug shutdown scripts, e.g. ifdown for ethernet. Yes,
77bd74156 arch/powerpc/platforms/pseries/eeh_driver.c Linas Vepstas 2005-11-03 686 * this is a hack, but if we don't do this, and try to bring
77bd74156 arch/powerpc/platforms/pseries/eeh_driver.c Linas Vepstas 2005-11-03 687 * the device up before the scripts have taken it down,
77bd74156 arch/powerpc/platforms/pseries/eeh_driver.c Linas Vepstas 2005-11-03 688 * potentially weird things happen.
77bd74156 arch/powerpc/platforms/pseries/eeh_driver.c Linas Vepstas 2005-11-03 689 */
e723e4b42 arch/powerpc/kernel/eeh_driver.c Sam Bobroff 2018-03-19 690 if (!driver_eeh_aware || rmv_data->removed) {
e723e4b42 arch/powerpc/kernel/eeh_driver.c Sam Bobroff 2018-03-19 691 pr_info("EEH: Sleep 5s ahead of %s hotplug\n",
e723e4b42 arch/powerpc/kernel/eeh_driver.c Sam Bobroff 2018-03-19 @692 (eeh_aware_driver ? "partial" : "complete"));
77bd74156 arch/powerpc/platforms/pseries/eeh_driver.c Linas Vepstas 2005-11-03 693 ssleep(5);
f5c57710d arch/powerpc/kernel/eeh_driver.c Gavin Shan 2013-07-24 694
f5c57710d arch/powerpc/kernel/eeh_driver.c Gavin Shan 2013-07-24 695 /*
f5c57710d arch/powerpc/kernel/eeh_driver.c Gavin Shan 2013-07-24 696 * The EEH device is still connected with its parent
f5c57710d arch/powerpc/kernel/eeh_driver.c Gavin Shan 2013-07-24 697 * PE. We should disconnect it so the binding can be
f5c57710d arch/powerpc/kernel/eeh_driver.c Gavin Shan 2013-07-24 698 * rebuilt when adding PCI devices.
f5c57710d arch/powerpc/kernel/eeh_driver.c Gavin Shan 2013-07-24 699 */
67086e32b arch/powerpc/kernel/eeh_driver.c Wei Yang 2016-03-04 700 edev = list_first_entry(&pe->edevs, struct eeh_dev, list);
f5c57710d arch/powerpc/kernel/eeh_driver.c Gavin Shan 2013-07-24 701 eeh_pe_traverse(pe, eeh_pe_detach_dev, NULL);
a3aa256b7 arch/powerpc/kernel/eeh_driver.c Gavin Shan 2016-06-17 702 if (pe->type & EEH_PE_VF) {
67086e32b arch/powerpc/kernel/eeh_driver.c Wei Yang 2016-03-04 703 eeh_add_virt_device(edev, NULL);
a3aa256b7 arch/powerpc/kernel/eeh_driver.c Gavin Shan 2016-06-17 704 } else {
e723e4b42 arch/powerpc/kernel/eeh_driver.c Sam Bobroff 2018-03-19 705 if (!eeh_aware_driver)
a3aa256b7 arch/powerpc/kernel/eeh_driver.c Gavin Shan 2016-06-17 706 eeh_pe_state_clear(pe, EEH_PE_PRI_BUS);
bd251b893 arch/powerpc/kernel/eeh_driver.c Gavin Shan 2016-05-03 707 pci_hp_add_devices(bus);
a3aa256b7 arch/powerpc/kernel/eeh_driver.c Gavin Shan 2016-06-17 708 }
77bd74156 arch/powerpc/platforms/pseries/eeh_driver.c Linas Vepstas 2005-11-03 709 }
f5c57710d arch/powerpc/kernel/eeh_driver.c Gavin Shan 2013-07-24 710 eeh_pe_state_clear(pe, EEH_PE_KEEP);
5a71978e4 arch/powerpc/kernel/eeh_driver.c Gavin Shan 2013-06-20 711
5a71978e4 arch/powerpc/kernel/eeh_driver.c Gavin Shan 2013-06-20 712 pe->tstamp = tstamp;
9b3c76f08 arch/powerpc/platforms/pseries/eeh_driver.c Gavin Shan 2012-09-07 713 pe->freeze_count = cnt;
b6495c0c8 arch/powerpc/platforms/pseries/eeh_driver.c Linas Vepstas 2005-11-03 714
1c2042c83 arch/powerpc/kernel/eeh_driver.c Rafael J. Wysocki 2014-01-15 715 pci_unlock_rescan_remove();
b6495c0c8 arch/powerpc/platforms/pseries/eeh_driver.c Linas Vepstas 2005-11-03 716 return 0;
77bd74156 arch/powerpc/platforms/pseries/eeh_driver.c Linas Vepstas 2005-11-03 717 }
77bd74156 arch/powerpc/platforms/pseries/eeh_driver.c Linas Vepstas 2005-11-03 718
:::::: The code at line 692 was first introduced by commit
:::::: e723e4b4269eefe9c70fbb6c378a26ab202662e2 powerpc/eeh: Factor out common code eeh_reset_device()
:::::: TO: Sam Bobroff <sam.bobroff@au1.ibm.com>
:::::: CC: 0day robot <fengguang.wu@intel.com>
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 24150 bytes --]
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v2 8/9] powerpc/eeh: Factor out common code eeh_reset_device()
2018-03-19 2:49 ` [PATCH v2 8/9] powerpc/eeh: Factor out common code eeh_reset_device() Sam Bobroff
@ 2018-03-20 8:31 ` kbuild test robot
2018-03-21 2:06 ` [PATCH v3 " Sam Bobroff
1 sibling, 0 replies; 24+ messages in thread
From: kbuild test robot @ 2018-03-20 8:31 UTC (permalink / raw)
To: Sam Bobroff; +Cc: kbuild-all, linuxppc-dev
[-- Attachment #1: Type: text/plain, Size: 5661 bytes --]
Hi Sam,
Thank you for the patch! Yet something to improve:
[auto build test ERROR on v4.16-rc4]
[also build test ERROR on next-20180319]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
url: https://github.com/0day-ci/linux/commits/Sam-Bobroff/EEH-refactoring-1/20180320-024729
config: powerpc-ppc64_defconfig (attached as .config)
compiler: powerpc64-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
reproduce:
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=powerpc
Note: the linux-review/Sam-Bobroff/EEH-refactoring-1/20180320-024729 HEAD 28e7cc1fa029b84221b827a6ea2908dc33b43d10 builds fine.
It only hurts bisectibility.
All errors (new ones prefixed by >>):
In file included from include/linux/kernel.h:14:0,
from include/linux/delay.h:22,
from arch/powerpc/kernel/eeh_driver.c:25:
arch/powerpc/kernel/eeh_driver.c: In function 'eeh_reset_device':
>> arch/powerpc/kernel/eeh_driver.c:692:5: error: 'eeh_aware_driver' undeclared (first use in this function); did you mean 'device_driver'?
(eeh_aware_driver ? "partial" : "complete"));
^
include/linux/printk.h:308:34: note: in definition of macro 'pr_info'
printk(KERN_INFO pr_fmt(fmt), ##__VA_ARGS__)
^~~~~~~~~~~
arch/powerpc/kernel/eeh_driver.c:692:5: note: each undeclared identifier is reported only once for each function it appears in
(eeh_aware_driver ? "partial" : "complete"));
^
include/linux/printk.h:308:34: note: in definition of macro 'pr_info'
printk(KERN_INFO pr_fmt(fmt), ##__VA_ARGS__)
^~~~~~~~~~~
vim +692 arch/powerpc/kernel/eeh_driver.c
619
620 /**
621 * eeh_reset_device - Perform actual reset of a pci slot
622 * @driver_eeh_aware: Does the device's driver provide EEH support?
623 * @pe: EEH PE
624 * @bus: PCI bus corresponding to the isolcated slot
625 * @rmv_data: Optional, list to record removed devices
626 *
627 * This routine must be called to do reset on the indicated PE.
628 * During the reset, udev might be invoked because those affected
629 * PCI devices will be removed and then added.
630 */
631 static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus,
632 struct eeh_rmv_data *rmv_data,
633 bool driver_eeh_aware)
634 {
635 time64_t tstamp;
636 int cnt, rc;
637 struct eeh_dev *edev;
638
639 /* pcibios will clear the counter; save the value */
640 cnt = pe->freeze_count;
641 tstamp = pe->tstamp;
642
643 /*
644 * We don't remove the corresponding PE instances because
645 * we need the information afterwords. The attached EEH
646 * devices are expected to be attached soon when calling
647 * into pci_hp_add_devices().
648 */
649 eeh_pe_state_mark(pe, EEH_PE_KEEP);
650 if (driver_eeh_aware || (pe->type & EEH_PE_VF)) {
651 eeh_pe_dev_traverse(pe, eeh_rmv_device, rmv_data);
652 } else {
653 pci_lock_rescan_remove();
654 pci_hp_remove_devices(bus);
655 pci_unlock_rescan_remove();
656 }
657
658 /*
659 * Reset the pci controller. (Asserts RST#; resets config space).
660 * Reconfigure bridges and devices. Don't try to bring the system
661 * up if the reset failed for some reason.
662 *
663 * During the reset, it's very dangerous to have uncontrolled PCI
664 * config accesses. So we prefer to block them. However, controlled
665 * PCI config accesses initiated from EEH itself are allowed.
666 */
667 rc = eeh_pe_reset_full(pe);
668 if (rc)
669 return rc;
670
671 pci_lock_rescan_remove();
672
673 /* Restore PE */
674 eeh_ops->configure_bridge(pe);
675 eeh_pe_restore_bars(pe);
676
677 /* Clear frozen state */
678 rc = eeh_clear_pe_frozen_state(pe, false);
679 if (rc) {
680 pci_unlock_rescan_remove();
681 return rc;
682 }
683
684 /* Give the system 5 seconds to finish running the user-space
685 * hotplug shutdown scripts, e.g. ifdown for ethernet. Yes,
686 * this is a hack, but if we don't do this, and try to bring
687 * the device up before the scripts have taken it down,
688 * potentially weird things happen.
689 */
690 if (!driver_eeh_aware || rmv_data->removed) {
691 pr_info("EEH: Sleep 5s ahead of %s hotplug\n",
> 692 (eeh_aware_driver ? "partial" : "complete"));
693 ssleep(5);
694
695 /*
696 * The EEH device is still connected with its parent
697 * PE. We should disconnect it so the binding can be
698 * rebuilt when adding PCI devices.
699 */
700 edev = list_first_entry(&pe->edevs, struct eeh_dev, list);
701 eeh_pe_traverse(pe, eeh_pe_detach_dev, NULL);
702 if (pe->type & EEH_PE_VF) {
703 eeh_add_virt_device(edev, NULL);
704 } else {
705 if (!eeh_aware_driver)
706 eeh_pe_state_clear(pe, EEH_PE_PRI_BUS);
707 pci_hp_add_devices(bus);
708 }
709 }
710 eeh_pe_state_clear(pe, EEH_PE_KEEP);
711
712 pe->tstamp = tstamp;
713 pe->freeze_count = cnt;
714
715 pci_unlock_rescan_remove();
716 return 0;
717 }
718
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 24150 bytes --]
^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH v3 8/9] powerpc/eeh: Factor out common code eeh_reset_device()
2018-03-19 2:49 ` [PATCH v2 8/9] powerpc/eeh: Factor out common code eeh_reset_device() Sam Bobroff
2018-03-20 8:31 ` kbuild test robot
@ 2018-03-21 2:06 ` Sam Bobroff
2018-03-21 5:18 ` Alexey Kardashevskiy
1 sibling, 1 reply; 24+ messages in thread
From: Sam Bobroff @ 2018-03-21 2:06 UTC (permalink / raw)
To: linuxppc-dev
The caller will always pass NULL for 'rmv_data' when
'eeh_aware_driver' is true, so the first two calls to
eeh_pe_dev_traverse() can be combined without changing behaviour as
can the two arms of the final 'if' block.
This should not change behaviour.
Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
---
====== v2 -> v3: ======
Corrected two missed cases of eeh_aware_driver -> driver_eeh_aware.
arch/powerpc/kernel/eeh_driver.c | 32 ++++++++++----------------------
1 file changed, 10 insertions(+), 22 deletions(-)
diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
index 93fc22e791fa..43ceb6263cd8 100644
--- a/arch/powerpc/kernel/eeh_driver.c
+++ b/arch/powerpc/kernel/eeh_driver.c
@@ -647,16 +647,12 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus,
* into pci_hp_add_devices().
*/
eeh_pe_state_mark(pe, EEH_PE_KEEP);
- if (!driver_eeh_aware) {
- if (pe->type & EEH_PE_VF) {
- eeh_pe_dev_traverse(pe, eeh_rmv_device, NULL);
- } else {
- pci_lock_rescan_remove();
- pci_hp_remove_devices(bus);
- pci_unlock_rescan_remove();
- }
- } else {
+ if (driver_eeh_aware || (pe->type & EEH_PE_VF)) {
eeh_pe_dev_traverse(pe, eeh_rmv_device, rmv_data);
+ } else {
+ pci_lock_rescan_remove();
+ pci_hp_remove_devices(bus);
+ pci_unlock_rescan_remove();
}
/*
@@ -691,8 +687,9 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus,
* the device up before the scripts have taken it down,
* potentially weird things happen.
*/
- if (!driver_eeh_aware) {
- pr_info("EEH: Sleep 5s ahead of complete hotplug\n");
+ if (!driver_eeh_aware || rmv_data->removed) {
+ pr_info("EEH: Sleep 5s ahead of %s hotplug\n",
+ (driver_eeh_aware ? "partial" : "complete"));
ssleep(5);
/*
@@ -705,19 +702,10 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus,
if (pe->type & EEH_PE_VF) {
eeh_add_virt_device(edev, NULL);
} else {
- eeh_pe_state_clear(pe, EEH_PE_PRI_BUS);
+ if (!driver_eeh_aware)
+ eeh_pe_state_clear(pe, EEH_PE_PRI_BUS);
pci_hp_add_devices(bus);
}
- } else if (rmv_data->removed) {
- pr_info("EEH: Sleep 5s ahead of partial hotplug\n");
- ssleep(5);
-
- edev = list_first_entry(&pe->edevs, struct eeh_dev, list);
- eeh_pe_traverse(pe, eeh_pe_detach_dev, NULL);
- if (pe->type & EEH_PE_VF)
- eeh_add_virt_device(edev, NULL);
- else
- pci_hp_add_devices(bus);
}
eeh_pe_state_clear(pe, EEH_PE_KEEP);
--
2.16.1.74.g9b0b1f47b
^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [PATCH v2 0/9] EEH refactoring 1
2018-03-19 2:45 [PATCH v2 0/9] EEH refactoring 1 Sam Bobroff
` (8 preceding siblings ...)
2018-03-19 2:49 ` [PATCH v2 9/9] powerpc/eeh: Add eeh_state_active() helper Sam Bobroff
@ 2018-03-21 5:16 ` Alexey Kardashevskiy
9 siblings, 0 replies; 24+ messages in thread
From: Alexey Kardashevskiy @ 2018-03-21 5:16 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Sam Bobroff
On 19/3/18 1:45 pm, Sam Bobroff wrote:
> Hello everyone,
>
> Here is a set of some small, mostly idempotent, changes to improve
> maintainability in some of the EEH code, primarily in eeh_driver.c.
>
> I've kept them all small to aid review but perhaps they should be squashed down
> before being applied.
>
> Cheers,
> Sam.
>
> Patch set changelog follows:
>
> ====== v1 -> v2: ======
>
> Patch 1/9: powerpc/eeh: Remove eeh_handle_event()
>
> Patch 2/9: powerpc/eeh: Manage EEH_PE_RECOVERING inside eeh_handle_normal_event()
>
> Patch 3/9: powerpc/eeh: Fix misleading comment in __eeh_addr_cache_get_device()
>
> Patch 4/9: powerpc/eeh: Remove misleading test in eeh_handle_normal_event()
>
> Patch 5/9: powerpc/eeh: Rename frozen_bus to bus in eeh_handle_normal_event()
>
> Patch 6/9: powerpc/eeh: Clarify arguments to eeh_reset_device()
> * Re-ordered parameters to eeh_reset_device() to keep pe first.
> * Changed eeh_aware_driver to driver_eeh_aware.
>
> Patch 7/9: powerpc/eeh: Remove always-true tests in eeh_reset_device()
>
> Patch 8/9: powerpc/eeh: Factor out common code eeh_reset_device()
> * In one case, added braces to "if" to match "else".
>
> Patch 9/9: powerpc/eeh: Add eeh_state_active() helper
>
> ====== v1: ======
>
> Patch 1/9: powerpc/eeh: Remove eeh_handle_event()
> Patch 2/9: powerpc/eeh: Manage EEH_PE_RECOVERING inside eeh_handle_normal_event()
> Patch 3/9: powerpc/eeh: Fix misleading comment in __eeh_addr_cache_get_device()
> Patch 4/9: powerpc/eeh: Remove misleading test in eeh_handle_normal_event()
> Patch 5/9: powerpc/eeh: Rename frozen_bus to bus in eeh_handle_normal_event()
> Patch 6/9: powerpc/eeh: Clarify arguments to eeh_reset_device()
> Patch 7/9: powerpc/eeh: Remove always-true tests in eeh_reset_device()
> Patch 8/9: powerpc/eeh: Factor out common code eeh_reset_device()
> Patch 9/9: powerpc/eeh: Add eeh_state_active() helper
>
> Sam Bobroff (9):
> powerpc/eeh: Remove eeh_handle_event()
> powerpc/eeh: Manage EEH_PE_RECOVERING inside eeh_handle_normal_event()
> powerpc/eeh: Fix misleading comment in __eeh_addr_cache_get_device()
> powerpc/eeh: Remove misleading test in eeh_handle_normal_event()
> powerpc/eeh: Rename frozen_bus to bus in eeh_handle_normal_event()
> powerpc/eeh: Clarify arguments to eeh_reset_device()
> powerpc/eeh: Remove always-true tests in eeh_reset_device()
> powerpc/eeh: Factor out common code eeh_reset_device()
> powerpc/eeh: Add eeh_state_active() helper
>
> arch/powerpc/include/asm/eeh.h | 6 ++
> arch/powerpc/include/asm/eeh_event.h | 3 +-
> arch/powerpc/kernel/eeh.c | 19 ++--
> arch/powerpc/kernel/eeh_cache.c | 3 +-
> arch/powerpc/kernel/eeh_driver.c | 137 +++++++++++----------------
> arch/powerpc/kernel/eeh_event.c | 6 +-
> arch/powerpc/platforms/powernv/eeh-powernv.c | 9 +-
> 7 files changed, 72 insertions(+), 111 deletions(-)
>
Nice cleanup,
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
--
Alexey
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v2 1/9] powerpc/eeh: Remove eeh_handle_event()
2018-03-19 2:46 ` [PATCH v2 1/9] powerpc/eeh: Remove eeh_handle_event() Sam Bobroff
@ 2018-03-21 5:17 ` Alexey Kardashevskiy
2018-03-28 14:13 ` [v2,1/9] " Michael Ellerman
1 sibling, 0 replies; 24+ messages in thread
From: Alexey Kardashevskiy @ 2018-03-21 5:17 UTC (permalink / raw)
To: linuxppc-dev
On 19/3/18 1:46 pm, Sam Bobroff wrote:
> The function eeh_handle_event(pe) does nothing other than switching
> between calling eeh_handle_normal_event(pe) and
> eeh_handle_special_event(). However it is only called in two places,
> one where pe can't be NULL and the other where it must be NULL (see
> eeh_event_handler()) so it does nothing but obscure the flow of
> control.
>
> So, remove it.
>
> Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> ---
> arch/powerpc/include/asm/eeh_event.h | 3 ++-
> arch/powerpc/kernel/eeh_driver.c | 42 +++++++++++++-----------------------
> arch/powerpc/kernel/eeh_event.c | 4 ++--
> 3 files changed, 19 insertions(+), 30 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/eeh_event.h b/arch/powerpc/include/asm/eeh_event.h
> index 1e551a2d6f82..0a168038882d 100644
> --- a/arch/powerpc/include/asm/eeh_event.h
> +++ b/arch/powerpc/include/asm/eeh_event.h
> @@ -34,7 +34,8 @@ struct eeh_event {
> int eeh_event_init(void);
> int eeh_send_failure_event(struct eeh_pe *pe);
> void eeh_remove_event(struct eeh_pe *pe, bool force);
> -void eeh_handle_event(struct eeh_pe *pe);
> +bool eeh_handle_normal_event(struct eeh_pe *pe);
> +void eeh_handle_special_event(void);
>
> #endif /* __KERNEL__ */
> #endif /* ASM_POWERPC_EEH_EVENT_H */
> diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
> index 0c0b66fc5bfb..51b21c97910f 100644
> --- a/arch/powerpc/kernel/eeh_driver.c
> +++ b/arch/powerpc/kernel/eeh_driver.c
> @@ -738,9 +738,22 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus,
> * Attempts to recover the given PE. If recovery fails or the PE has failed
> * too many times, remove the PE.
> *
> + * While PHB detects address or data parity errors on particular PCI
> + * slot, the associated PE will be frozen. Besides, DMA's occurring
> + * to wild addresses (which usually happen due to bugs in device
> + * drivers or in PCI adapter firmware) can cause EEH error. #SERR,
> + * #PERR or other misc PCI-related errors also can trigger EEH errors.
> + *
> + * Recovery process consists of unplugging the device driver (which
> + * generated hotplug events to userspace), then issuing a PCI #RST to
> + * the device, then reconfiguring the PCI config space for all bridges
> + * & devices under this slot, and then finally restarting the device
> + * drivers (which cause a second set of hotplug events to go out to
> + * userspace).
> + *
> * Returns true if @pe should no longer be used, else false.
> */
> -static bool eeh_handle_normal_event(struct eeh_pe *pe)
> +bool eeh_handle_normal_event(struct eeh_pe *pe)
> {
> struct pci_bus *frozen_bus;
> struct eeh_dev *edev, *tmp;
> @@ -942,7 +955,7 @@ static bool eeh_handle_normal_event(struct eeh_pe *pe)
> * specific PE. Iterates through possible failures and handles them as
> * necessary.
> */
> -static void eeh_handle_special_event(void)
> +void eeh_handle_special_event(void)
> {
> struct eeh_pe *pe, *phb_pe;
> struct pci_bus *bus;
> @@ -1049,28 +1062,3 @@ static void eeh_handle_special_event(void)
> break;
> } while (rc != EEH_NEXT_ERR_NONE);
> }
> -
> -/**
> - * eeh_handle_event - Reset a PCI device after hard lockup.
> - * @pe: EEH PE
> - *
> - * While PHB detects address or data parity errors on particular PCI
> - * slot, the associated PE will be frozen. Besides, DMA's occurring
> - * to wild addresses (which usually happen due to bugs in device
> - * drivers or in PCI adapter firmware) can cause EEH error. #SERR,
> - * #PERR or other misc PCI-related errors also can trigger EEH errors.
> - *
> - * Recovery process consists of unplugging the device driver (which
> - * generated hotplug events to userspace), then issuing a PCI #RST to
> - * the device, then reconfiguring the PCI config space for all bridges
> - * & devices under this slot, and then finally restarting the device
> - * drivers (which cause a second set of hotplug events to go out to
> - * userspace).
> - */
> -void eeh_handle_event(struct eeh_pe *pe)
> -{
> - if (pe)
> - eeh_handle_normal_event(pe);
> - else
> - eeh_handle_special_event();
> -}
> diff --git a/arch/powerpc/kernel/eeh_event.c b/arch/powerpc/kernel/eeh_event.c
> index accbf8b5fd46..872bcfe8f90e 100644
> --- a/arch/powerpc/kernel/eeh_event.c
> +++ b/arch/powerpc/kernel/eeh_event.c
> @@ -81,10 +81,10 @@ static int eeh_event_handler(void * dummy)
> pr_info("EEH: Detected PCI bus error on "
> "PHB#%x-PE#%x\n",
> pe->phb->global_number, pe->addr);
> - eeh_handle_event(pe);
> + eeh_handle_normal_event(pe);
> eeh_pe_state_clear(pe, EEH_PE_RECOVERING);
> } else {
> - eeh_handle_event(NULL);
> + eeh_handle_special_event();
> }
>
> kfree(event);
>
--
Alexey
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v2 2/9] powerpc/eeh: Manage EEH_PE_RECOVERING inside eeh_handle_normal_event()
2018-03-19 2:46 ` [PATCH v2 2/9] powerpc/eeh: Manage EEH_PE_RECOVERING inside eeh_handle_normal_event() Sam Bobroff
@ 2018-03-21 5:17 ` Alexey Kardashevskiy
0 siblings, 0 replies; 24+ messages in thread
From: Alexey Kardashevskiy @ 2018-03-21 5:17 UTC (permalink / raw)
To: linuxppc-dev
On 19/3/18 1:46 pm, Sam Bobroff wrote:
> Currently the EEH_PE_RECOVERING flag for a PE is managed by both the
> caller and callee of eeh_handle_normal_event() (among other places not
> considered here). This is complicated by the fact that the PE may
> or may not have been invalidated by the call.
>
> So move the callee's handling into eeh_handle_normal_event(), which
> clarifies it and allows the return type to be changed to void (because
> it no longer needs to indicate at the PE has been invalidated).
>
> This should not change behaviour except in eeh_event_handler() where
> it was previously possible to cause eeh_pe_state_clear() to be called
> on an invalid PE, which is now avoided.
>
> Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> ---
> arch/powerpc/include/asm/eeh_event.h | 2 +-
> arch/powerpc/kernel/eeh_driver.c | 29 +++++++++++------------------
> arch/powerpc/kernel/eeh_event.c | 2 --
> 3 files changed, 12 insertions(+), 21 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/eeh_event.h b/arch/powerpc/include/asm/eeh_event.h
> index 0a168038882d..9884e872686f 100644
> --- a/arch/powerpc/include/asm/eeh_event.h
> +++ b/arch/powerpc/include/asm/eeh_event.h
> @@ -34,7 +34,7 @@ struct eeh_event {
> int eeh_event_init(void);
> int eeh_send_failure_event(struct eeh_pe *pe);
> void eeh_remove_event(struct eeh_pe *pe, bool force);
> -bool eeh_handle_normal_event(struct eeh_pe *pe);
> +void eeh_handle_normal_event(struct eeh_pe *pe);
> void eeh_handle_special_event(void);
>
> #endif /* __KERNEL__ */
> diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
> index 51b21c97910f..5b7a5ed4db4d 100644
> --- a/arch/powerpc/kernel/eeh_driver.c
> +++ b/arch/powerpc/kernel/eeh_driver.c
> @@ -733,7 +733,8 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus,
>
> /**
> * eeh_handle_normal_event - Handle EEH events on a specific PE
> - * @pe: EEH PE
> + * @pe: EEH PE - which should not be used after we return, as it may
> + * have been invalidated.
> *
> * Attempts to recover the given PE. If recovery fails or the PE has failed
> * too many times, remove the PE.
> @@ -750,10 +751,8 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus,
> * & devices under this slot, and then finally restarting the device
> * drivers (which cause a second set of hotplug events to go out to
> * userspace).
> - *
> - * Returns true if @pe should no longer be used, else false.
> */
> -bool eeh_handle_normal_event(struct eeh_pe *pe)
> +void eeh_handle_normal_event(struct eeh_pe *pe)
> {
> struct pci_bus *frozen_bus;
> struct eeh_dev *edev, *tmp;
> @@ -765,9 +764,11 @@ bool eeh_handle_normal_event(struct eeh_pe *pe)
> if (!frozen_bus) {
> pr_err("%s: Cannot find PCI bus for PHB#%x-PE#%x\n",
> __func__, pe->phb->global_number, pe->addr);
> - return false;
> + return;
> }
>
> + eeh_pe_state_mark(pe, EEH_PE_RECOVERING);
> +
> eeh_pe_update_time_stamp(pe);
> pe->freeze_count++;
> if (pe->freeze_count > eeh_max_freezes) {
> @@ -904,7 +905,7 @@ bool eeh_handle_normal_event(struct eeh_pe *pe)
> pr_info("EEH: Notify device driver to resume\n");
> eeh_pe_dev_traverse(pe, eeh_report_resume, NULL);
>
> - return false;
> + goto final;
>
> hard_fail:
> /*
> @@ -940,12 +941,12 @@ bool eeh_handle_normal_event(struct eeh_pe *pe)
> pci_lock_rescan_remove();
> pci_hp_remove_devices(frozen_bus);
> pci_unlock_rescan_remove();
> -
> /* The passed PE should no longer be used */
> - return true;
> + return;
> }
> }
> - return false;
> +final:
> + eeh_pe_state_clear(pe, EEH_PE_RECOVERING);
> }
>
> /**
> @@ -1018,15 +1019,7 @@ void eeh_handle_special_event(void)
> */
> if (rc == EEH_NEXT_ERR_FROZEN_PE ||
> rc == EEH_NEXT_ERR_FENCED_PHB) {
> - /*
> - * eeh_handle_normal_event() can make the PE stale if it
> - * determines that the PE cannot possibly be recovered.
> - * Don't modify the PE state if that's the case.
> - */
> - if (eeh_handle_normal_event(pe))
> - continue;
> -
> - eeh_pe_state_clear(pe, EEH_PE_RECOVERING);
> + eeh_handle_normal_event(pe);
> } else {
> pci_lock_rescan_remove();
> list_for_each_entry(hose, &hose_list, list_node) {
> diff --git a/arch/powerpc/kernel/eeh_event.c b/arch/powerpc/kernel/eeh_event.c
> index 872bcfe8f90e..61c9356bf9c9 100644
> --- a/arch/powerpc/kernel/eeh_event.c
> +++ b/arch/powerpc/kernel/eeh_event.c
> @@ -73,7 +73,6 @@ static int eeh_event_handler(void * dummy)
> /* We might have event without binding PE */
> pe = event->pe;
> if (pe) {
> - eeh_pe_state_mark(pe, EEH_PE_RECOVERING);
> if (pe->type & EEH_PE_PHB)
> pr_info("EEH: Detected error on PHB#%x\n",
> pe->phb->global_number);
> @@ -82,7 +81,6 @@ static int eeh_event_handler(void * dummy)
> "PHB#%x-PE#%x\n",
> pe->phb->global_number, pe->addr);
> eeh_handle_normal_event(pe);
> - eeh_pe_state_clear(pe, EEH_PE_RECOVERING);
> } else {
> eeh_handle_special_event();
> }
>
--
Alexey
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v2 3/9] powerpc/eeh: Fix misleading comment in __eeh_addr_cache_get_device()
2018-03-19 2:46 ` [PATCH v2 3/9] powerpc/eeh: Fix misleading comment in __eeh_addr_cache_get_device() Sam Bobroff
@ 2018-03-21 5:17 ` Alexey Kardashevskiy
0 siblings, 0 replies; 24+ messages in thread
From: Alexey Kardashevskiy @ 2018-03-21 5:17 UTC (permalink / raw)
To: linuxppc-dev
On 19/3/18 1:46 pm, Sam Bobroff wrote:
> Commit "0ba178888b05 powerpc/eeh: Remove reference to PCI device"
> removed a call to pci_dev_get() from __eeh_addr_cache_get_device() but
> did not update the comment to match.
>
> Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> ---
> arch/powerpc/kernel/eeh_cache.c | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/arch/powerpc/kernel/eeh_cache.c b/arch/powerpc/kernel/eeh_cache.c
> index d4cc26618809..201943d54a6e 100644
> --- a/arch/powerpc/kernel/eeh_cache.c
> +++ b/arch/powerpc/kernel/eeh_cache.c
> @@ -84,8 +84,7 @@ static inline struct eeh_dev *__eeh_addr_cache_get_device(unsigned long addr)
> * @addr: mmio (PIO) phys address or i/o port number
> *
> * Given an mmio phys address, or a port number, find a pci device
> - * that implements this address. Be sure to pci_dev_put the device
> - * when finished. I/O port numbers are assumed to be offset
> + * that implements this address. I/O port numbers are assumed to be offset
> * from zero (that is, they do *not* have pci_io_addr added in).
> * It is safe to call this function within an interrupt.
> */
>
--
Alexey
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v2 4/9] powerpc/eeh: Remove misleading test in eeh_handle_normal_event()
2018-03-19 2:46 ` [PATCH v2 4/9] powerpc/eeh: Remove misleading test in eeh_handle_normal_event() Sam Bobroff
@ 2018-03-21 5:18 ` Alexey Kardashevskiy
0 siblings, 0 replies; 24+ messages in thread
From: Alexey Kardashevskiy @ 2018-03-21 5:18 UTC (permalink / raw)
To: linuxppc-dev
On 19/3/18 1:46 pm, Sam Bobroff wrote:
> Remove a test that checks if "frozen_bus" is NULL, because it cannot
> have changed since it was tested at the start of the function and so
> must be true here.
>
> Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> ---
> arch/powerpc/kernel/eeh_driver.c | 24 +++++++++++-------------
> 1 file changed, 11 insertions(+), 13 deletions(-)
>
> diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
> index 5b7a5ed4db4d..04a5d9db5499 100644
> --- a/arch/powerpc/kernel/eeh_driver.c
> +++ b/arch/powerpc/kernel/eeh_driver.c
> @@ -930,20 +930,18 @@ void eeh_handle_normal_event(struct eeh_pe *pe)
> * all removed devices correctly to avoid access
> * the their PCI config any more.
> */
> - if (frozen_bus) {
> - if (pe->type & EEH_PE_VF) {
> - eeh_pe_dev_traverse(pe, eeh_rmv_device, NULL);
> - eeh_pe_dev_mode_mark(pe, EEH_DEV_REMOVED);
> - } else {
> - eeh_pe_state_clear(pe, EEH_PE_PRI_BUS);
> - eeh_pe_dev_mode_mark(pe, EEH_DEV_REMOVED);
> + if (pe->type & EEH_PE_VF) {
> + eeh_pe_dev_traverse(pe, eeh_rmv_device, NULL);
> + eeh_pe_dev_mode_mark(pe, EEH_DEV_REMOVED);
> + } else {
> + eeh_pe_state_clear(pe, EEH_PE_PRI_BUS);
> + eeh_pe_dev_mode_mark(pe, EEH_DEV_REMOVED);
>
> - pci_lock_rescan_remove();
> - pci_hp_remove_devices(frozen_bus);
> - pci_unlock_rescan_remove();
> - /* The passed PE should no longer be used */
> - return;
> - }
> + pci_lock_rescan_remove();
> + pci_hp_remove_devices(frozen_bus);
> + pci_unlock_rescan_remove();
> + /* The passed PE should no longer be used */
> + return;
> }
> final:
> eeh_pe_state_clear(pe, EEH_PE_RECOVERING);
>
--
Alexey
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v2 5/9] powerpc/eeh: Rename frozen_bus to bus in eeh_handle_normal_event()
2018-03-19 2:47 ` [PATCH v2 5/9] powerpc/eeh: Rename frozen_bus to bus " Sam Bobroff
@ 2018-03-21 5:18 ` Alexey Kardashevskiy
0 siblings, 0 replies; 24+ messages in thread
From: Alexey Kardashevskiy @ 2018-03-21 5:18 UTC (permalink / raw)
To: linuxppc-dev
On 19/3/18 1:47 pm, Sam Bobroff wrote:
> The name "frozen_bus" is misleading: it's not necessarily frozen, it's
> just the PE's PCI bus.
>
> Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> ---
> arch/powerpc/kernel/eeh_driver.c | 10 +++++-----
> 1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
> index 04a5d9db5499..cb584d72b0a5 100644
> --- a/arch/powerpc/kernel/eeh_driver.c
> +++ b/arch/powerpc/kernel/eeh_driver.c
> @@ -754,14 +754,14 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus,
> */
> void eeh_handle_normal_event(struct eeh_pe *pe)
> {
> - struct pci_bus *frozen_bus;
> + struct pci_bus *bus;
> struct eeh_dev *edev, *tmp;
> int rc = 0;
> enum pci_ers_result result = PCI_ERS_RESULT_NONE;
> struct eeh_rmv_data rmv_data = {LIST_HEAD_INIT(rmv_data.edev_list), 0};
>
> - frozen_bus = eeh_pe_bus_get(pe);
> - if (!frozen_bus) {
> + bus = eeh_pe_bus_get(pe);
> + if (!bus) {
> pr_err("%s: Cannot find PCI bus for PHB#%x-PE#%x\n",
> __func__, pe->phb->global_number, pe->addr);
> return;
> @@ -820,7 +820,7 @@ void eeh_handle_normal_event(struct eeh_pe *pe)
> */
> if (result == PCI_ERS_RESULT_NONE) {
> pr_info("EEH: Reset with hotplug activity\n");
> - rc = eeh_reset_device(pe, frozen_bus, NULL);
> + rc = eeh_reset_device(pe, bus, NULL);
> if (rc) {
> pr_warn("%s: Unable to reset, err=%d\n",
> __func__, rc);
> @@ -938,7 +938,7 @@ void eeh_handle_normal_event(struct eeh_pe *pe)
> eeh_pe_dev_mode_mark(pe, EEH_DEV_REMOVED);
>
> pci_lock_rescan_remove();
> - pci_hp_remove_devices(frozen_bus);
> + pci_hp_remove_devices(bus);
> pci_unlock_rescan_remove();
> /* The passed PE should no longer be used */
> return;
>
--
Alexey
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v2 6/9] powerpc/eeh: Clarify arguments to eeh_reset_device()
2018-03-19 2:48 ` [PATCH v2 6/9] powerpc/eeh: Clarify arguments to eeh_reset_device() Sam Bobroff
@ 2018-03-21 5:18 ` Alexey Kardashevskiy
0 siblings, 0 replies; 24+ messages in thread
From: Alexey Kardashevskiy @ 2018-03-21 5:18 UTC (permalink / raw)
To: linuxppc-dev
On 19/3/18 1:48 pm, Sam Bobroff wrote:
> It is currently difficult to understand the behaviour of
> eeh_reset_device() due to the way it's parameters are used. In
> particular, when 'bus' is NULL, it's value is still necessary so the
> same value is looked up again locally under a different name
> ('frozen_bus') but behaviour is changed.
>
> To clarify this, add a new parameter 'driver_eeh_aware', and have the
> caller set it when it would have passed NULL for 'bus' and always pass
> a value for 'bus'. Then change any test that was on 'bus' to one on
> '!driver_eeh_aware' and replace uses of 'frozen_bus' with 'bus'.
>
> Also update the function's comment.
>
> This should not change behaviour.
>
> Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> ---
> arch/powerpc/kernel/eeh_driver.c | 20 +++++++++++---------
> 1 file changed, 11 insertions(+), 9 deletions(-)
>
> diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
> index cb584d72b0a5..07437d765434 100644
> --- a/arch/powerpc/kernel/eeh_driver.c
> +++ b/arch/powerpc/kernel/eeh_driver.c
> @@ -619,17 +619,19 @@ int eeh_pe_reset_and_recover(struct eeh_pe *pe)
>
> /**
> * eeh_reset_device - Perform actual reset of a pci slot
> + * @driver_eeh_aware: Does the device's driver provide EEH support?
> * @pe: EEH PE
> * @bus: PCI bus corresponding to the isolcated slot
> + * @rmv_data: Optional, list to record removed devices
> *
> * This routine must be called to do reset on the indicated PE.
> * During the reset, udev might be invoked because those affected
> * PCI devices will be removed and then added.
> */
> static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus,
> - struct eeh_rmv_data *rmv_data)
> + struct eeh_rmv_data *rmv_data,
> + bool driver_eeh_aware)
> {
> - struct pci_bus *frozen_bus = eeh_pe_bus_get(pe);
> time64_t tstamp;
> int cnt, rc;
> struct eeh_dev *edev;
> @@ -645,7 +647,7 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus,
> * into pci_hp_add_devices().
> */
> eeh_pe_state_mark(pe, EEH_PE_KEEP);
> - if (bus) {
> + if (!driver_eeh_aware) {
> if (pe->type & EEH_PE_VF) {
> eeh_pe_dev_traverse(pe, eeh_rmv_device, NULL);
> } else {
> @@ -653,7 +655,7 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus,
> pci_hp_remove_devices(bus);
> pci_unlock_rescan_remove();
> }
> - } else if (frozen_bus) {
> + } else if (bus) {
> eeh_pe_dev_traverse(pe, eeh_rmv_device, rmv_data);
> }
>
> @@ -689,7 +691,7 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus,
> * the device up before the scripts have taken it down,
> * potentially weird things happen.
> */
> - if (bus) {
> + if (!driver_eeh_aware) {
> pr_info("EEH: Sleep 5s ahead of complete hotplug\n");
> ssleep(5);
>
> @@ -706,7 +708,7 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus,
> eeh_pe_state_clear(pe, EEH_PE_PRI_BUS);
> pci_hp_add_devices(bus);
> }
> - } else if (frozen_bus && rmv_data->removed) {
> + } else if (bus && rmv_data->removed) {
> pr_info("EEH: Sleep 5s ahead of partial hotplug\n");
> ssleep(5);
>
> @@ -715,7 +717,7 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus,
> if (pe->type & EEH_PE_VF)
> eeh_add_virt_device(edev, NULL);
> else
> - pci_hp_add_devices(frozen_bus);
> + pci_hp_add_devices(bus);
> }
> eeh_pe_state_clear(pe, EEH_PE_KEEP);
>
> @@ -820,7 +822,7 @@ void eeh_handle_normal_event(struct eeh_pe *pe)
> */
> if (result == PCI_ERS_RESULT_NONE) {
> pr_info("EEH: Reset with hotplug activity\n");
> - rc = eeh_reset_device(pe, bus, NULL);
> + rc = eeh_reset_device(pe, bus, NULL, false);
> if (rc) {
> pr_warn("%s: Unable to reset, err=%d\n",
> __func__, rc);
> @@ -872,7 +874,7 @@ void eeh_handle_normal_event(struct eeh_pe *pe)
> /* If any device called out for a reset, then reset the slot */
> if (result == PCI_ERS_RESULT_NEED_RESET) {
> pr_info("EEH: Reset without hotplug activity\n");
> - rc = eeh_reset_device(pe, NULL, &rmv_data);
> + rc = eeh_reset_device(pe, bus, &rmv_data, true);
> if (rc) {
> pr_warn("%s: Cannot reset, err=%d\n",
> __func__, rc);
>
--
Alexey
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v2 7/9] powerpc/eeh: Remove always-true tests in eeh_reset_device()
2018-03-19 2:49 ` [PATCH v2 7/9] powerpc/eeh: Remove always-true tests in eeh_reset_device() Sam Bobroff
@ 2018-03-21 5:18 ` Alexey Kardashevskiy
0 siblings, 0 replies; 24+ messages in thread
From: Alexey Kardashevskiy @ 2018-03-21 5:18 UTC (permalink / raw)
To: linuxppc-dev
On 19/3/18 1:49 pm, Sam Bobroff wrote:
> eeh_reset_device() tests the value of 'bus' more than once but the
> only caller, eeh_handle_normal_device() does this test itself and will
> never pass NULL.
>
> So, remove the dead tests.
>
> This should not change behaviour.
>
> Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> ---
> arch/powerpc/kernel/eeh_driver.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
> index 07437d765434..93fc22e791fa 100644
> --- a/arch/powerpc/kernel/eeh_driver.c
> +++ b/arch/powerpc/kernel/eeh_driver.c
> @@ -655,7 +655,7 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus,
> pci_hp_remove_devices(bus);
> pci_unlock_rescan_remove();
> }
> - } else if (bus) {
> + } else {
> eeh_pe_dev_traverse(pe, eeh_rmv_device, rmv_data);
> }
>
> @@ -708,7 +708,7 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus,
> eeh_pe_state_clear(pe, EEH_PE_PRI_BUS);
> pci_hp_add_devices(bus);
> }
> - } else if (bus && rmv_data->removed) {
> + } else if (rmv_data->removed) {
> pr_info("EEH: Sleep 5s ahead of partial hotplug\n");
> ssleep(5);
>
>
--
Alexey
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v3 8/9] powerpc/eeh: Factor out common code eeh_reset_device()
2018-03-21 2:06 ` [PATCH v3 " Sam Bobroff
@ 2018-03-21 5:18 ` Alexey Kardashevskiy
0 siblings, 0 replies; 24+ messages in thread
From: Alexey Kardashevskiy @ 2018-03-21 5:18 UTC (permalink / raw)
To: linuxppc-dev
On 21/3/18 1:06 pm, Sam Bobroff wrote:
> The caller will always pass NULL for 'rmv_data' when
> 'eeh_aware_driver' is true, so the first two calls to
> eeh_pe_dev_traverse() can be combined without changing behaviour as
> can the two arms of the final 'if' block.
>
> This should not change behaviour.
>
> Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> ---
> ====== v2 -> v3: ======
> Corrected two missed cases of eeh_aware_driver -> driver_eeh_aware.
>
> arch/powerpc/kernel/eeh_driver.c | 32 ++++++++++----------------------
> 1 file changed, 10 insertions(+), 22 deletions(-)
>
> diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
> index 93fc22e791fa..43ceb6263cd8 100644
> --- a/arch/powerpc/kernel/eeh_driver.c
> +++ b/arch/powerpc/kernel/eeh_driver.c
> @@ -647,16 +647,12 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus,
> * into pci_hp_add_devices().
> */
> eeh_pe_state_mark(pe, EEH_PE_KEEP);
> - if (!driver_eeh_aware) {
> - if (pe->type & EEH_PE_VF) {
> - eeh_pe_dev_traverse(pe, eeh_rmv_device, NULL);
> - } else {
> - pci_lock_rescan_remove();
> - pci_hp_remove_devices(bus);
> - pci_unlock_rescan_remove();
> - }
> - } else {
> + if (driver_eeh_aware || (pe->type & EEH_PE_VF)) {
> eeh_pe_dev_traverse(pe, eeh_rmv_device, rmv_data);
> + } else {
> + pci_lock_rescan_remove();
> + pci_hp_remove_devices(bus);
> + pci_unlock_rescan_remove();
> }
>
> /*
> @@ -691,8 +687,9 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus,
> * the device up before the scripts have taken it down,
> * potentially weird things happen.
> */
> - if (!driver_eeh_aware) {
> - pr_info("EEH: Sleep 5s ahead of complete hotplug\n");
> + if (!driver_eeh_aware || rmv_data->removed) {
> + pr_info("EEH: Sleep 5s ahead of %s hotplug\n",
> + (driver_eeh_aware ? "partial" : "complete"));
> ssleep(5);
>
> /*
> @@ -705,19 +702,10 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus,
> if (pe->type & EEH_PE_VF) {
> eeh_add_virt_device(edev, NULL);
> } else {
> - eeh_pe_state_clear(pe, EEH_PE_PRI_BUS);
> + if (!driver_eeh_aware)
> + eeh_pe_state_clear(pe, EEH_PE_PRI_BUS);
> pci_hp_add_devices(bus);
> }
> - } else if (rmv_data->removed) {
> - pr_info("EEH: Sleep 5s ahead of partial hotplug\n");
> - ssleep(5);
> -
> - edev = list_first_entry(&pe->edevs, struct eeh_dev, list);
> - eeh_pe_traverse(pe, eeh_pe_detach_dev, NULL);
> - if (pe->type & EEH_PE_VF)
> - eeh_add_virt_device(edev, NULL);
> - else
> - pci_hp_add_devices(bus);
> }
> eeh_pe_state_clear(pe, EEH_PE_KEEP);
>
>
--
Alexey
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v2 9/9] powerpc/eeh: Add eeh_state_active() helper
2018-03-19 2:49 ` [PATCH v2 9/9] powerpc/eeh: Add eeh_state_active() helper Sam Bobroff
2018-03-20 4:05 ` kbuild test robot
@ 2018-03-21 5:20 ` Alexey Kardashevskiy
1 sibling, 0 replies; 24+ messages in thread
From: Alexey Kardashevskiy @ 2018-03-21 5:20 UTC (permalink / raw)
To: linuxppc-dev
On 19/3/18 1:49 pm, Sam Bobroff wrote:
> Checking for a "fully active" device state requires testing two flag
> bits, which is open coded in several places, so add a function to do
> it.
>
> Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> ---
> arch/powerpc/include/asm/eeh.h | 6 ++++++
> arch/powerpc/kernel/eeh.c | 19 ++++++-------------
> arch/powerpc/platforms/powernv/eeh-powernv.c | 9 ++-------
> 3 files changed, 14 insertions(+), 20 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
> index fd37cc101f4f..c2266ca61853 100644
> --- a/arch/powerpc/include/asm/eeh.h
> +++ b/arch/powerpc/include/asm/eeh.h
> @@ -256,6 +256,12 @@ static inline void eeh_serialize_unlock(unsigned long flags)
> raw_spin_unlock_irqrestore(&confirm_error_lock, flags);
> }
>
> +static inline bool eeh_state_active(int state)
> +{
> + return (state & (EEH_STATE_MMIO_ACTIVE | EEH_STATE_DMA_ACTIVE))
> + == (EEH_STATE_MMIO_ACTIVE | EEH_STATE_DMA_ACTIVE);
> +}
> +
> typedef void *(*eeh_traverse_func)(void *data, void *flag);
> void eeh_set_pe_aux_size(int size);
> int eeh_phb_pe_create(struct pci_controller *phb);
> diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
> index 2b9df0040d6b..bc640e4c5ca5 100644
> --- a/arch/powerpc/kernel/eeh.c
> +++ b/arch/powerpc/kernel/eeh.c
> @@ -394,9 +394,7 @@ static int eeh_phb_check_failure(struct eeh_pe *pe)
> /* Check PHB state */
> ret = eeh_ops->get_state(phb_pe, NULL);
> if ((ret < 0) ||
> - (ret == EEH_STATE_NOT_SUPPORT) ||
> - (ret & (EEH_STATE_MMIO_ACTIVE | EEH_STATE_DMA_ACTIVE)) ==
> - (EEH_STATE_MMIO_ACTIVE | EEH_STATE_DMA_ACTIVE)) {
> + (ret == EEH_STATE_NOT_SUPPORT) || eeh_state_active(ret)) {
> ret = 0;
> goto out;
> }
> @@ -433,7 +431,6 @@ static int eeh_phb_check_failure(struct eeh_pe *pe)
> int eeh_dev_check_failure(struct eeh_dev *edev)
> {
> int ret;
> - int active_flags = (EEH_STATE_MMIO_ACTIVE | EEH_STATE_DMA_ACTIVE);
> unsigned long flags;
> struct device_node *dn;
> struct pci_dev *dev;
> @@ -525,8 +522,7 @@ int eeh_dev_check_failure(struct eeh_dev *edev)
> * state, PE is in good state.
> */
> if ((ret < 0) ||
> - (ret == EEH_STATE_NOT_SUPPORT) ||
> - ((ret & active_flags) == active_flags)) {
> + (ret == EEH_STATE_NOT_SUPPORT) || eeh_state_active(ret)) {
> eeh_stats.false_positives++;
> pe->false_positives++;
> rc = 0;
> @@ -546,8 +542,7 @@ int eeh_dev_check_failure(struct eeh_dev *edev)
>
> /* Frozen parent PE ? */
> ret = eeh_ops->get_state(parent_pe, NULL);
> - if (ret > 0 &&
> - (ret & active_flags) != active_flags)
> + if (ret > 0 && !eeh_state_active(ret))
> pe = parent_pe;
>
> /* Next parent level */
> @@ -888,7 +883,6 @@ static void *eeh_set_dev_freset(void *data, void *flag)
> */
> int eeh_pe_reset_full(struct eeh_pe *pe)
> {
> - int active_flags = (EEH_STATE_MMIO_ACTIVE | EEH_STATE_DMA_ACTIVE);
> int reset_state = (EEH_PE_RESET | EEH_PE_CFG_BLOCKED);
> int type = EEH_RESET_HOT;
> unsigned int freset = 0;
> @@ -919,7 +913,7 @@ int eeh_pe_reset_full(struct eeh_pe *pe)
>
> /* Wait until the PE is in a functioning state */
> state = eeh_ops->wait_state(pe, PCI_BUS_RESET_WAIT_MSEC);
> - if ((state & active_flags) == active_flags)
> + if (eeh_state_active(state))
> break;
>
> if (state < 0) {
> @@ -1352,16 +1346,15 @@ static int eeh_pe_change_owner(struct eeh_pe *pe)
> struct eeh_dev *edev, *tmp;
> struct pci_dev *pdev;
> struct pci_device_id *id;
> - int flags, ret;
> + int ret;
>
> /* Check PE state */
> - flags = (EEH_STATE_MMIO_ACTIVE | EEH_STATE_DMA_ACTIVE);
> ret = eeh_ops->get_state(pe, NULL);
> if (ret < 0 || ret == EEH_STATE_NOT_SUPPORT)
> return 0;
>
> /* Unfrozen PE, nothing to do */
> - if ((ret & flags) == flags)
> + if (eeh_state_active(ret))
> return 0;
>
> /* Frozen PE, check if it needs PE level reset */
> diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
> index 33c86c1a1720..ddfc3544d285 100644
> --- a/arch/powerpc/platforms/powernv/eeh-powernv.c
> +++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
> @@ -1425,11 +1425,8 @@ static int pnv_eeh_get_pe(struct pci_controller *hose,
> dev_pe = dev_pe->parent;
> while (dev_pe && !(dev_pe->type & EEH_PE_PHB)) {
> int ret;
> - int active_flags = (EEH_STATE_MMIO_ACTIVE |
> - EEH_STATE_DMA_ACTIVE);
> -
> ret = eeh_ops->get_state(dev_pe, NULL);
> - if (ret <= 0 || (ret & active_flags) == active_flags) {
> + if (ret <= 0 || eeh_state_active(ret)) {
> dev_pe = dev_pe->parent;
> continue;
> }
> @@ -1463,7 +1460,6 @@ static int pnv_eeh_next_error(struct eeh_pe **pe)
> struct eeh_pe *phb_pe, *parent_pe;
> __be64 frozen_pe_no;
> __be16 err_type, severity;
> - int active_flags = (EEH_STATE_MMIO_ACTIVE | EEH_STATE_DMA_ACTIVE);
> long rc;
> int state, ret = EEH_NEXT_ERR_NONE;
>
> @@ -1626,8 +1622,7 @@ static int pnv_eeh_next_error(struct eeh_pe **pe)
>
> /* Frozen parent PE ? */
> state = eeh_ops->get_state(parent_pe, NULL);
> - if (state > 0 &&
> - (state & active_flags) != active_flags)
> + if (state > 0 && !eeh_state_active(state))
> *pe = parent_pe;
>
> /* Next parent level */
>
--
Alexey
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [v2,1/9] powerpc/eeh: Remove eeh_handle_event()
2018-03-19 2:46 ` [PATCH v2 1/9] powerpc/eeh: Remove eeh_handle_event() Sam Bobroff
2018-03-21 5:17 ` Alexey Kardashevskiy
@ 2018-03-28 14:13 ` Michael Ellerman
1 sibling, 0 replies; 24+ messages in thread
From: Michael Ellerman @ 2018-03-28 14:13 UTC (permalink / raw)
To: Sam Bobroff, linuxppc-dev
On Mon, 2018-03-19 at 02:46:20 UTC, Sam Bobroff wrote:
> The function eeh_handle_event(pe) does nothing other than switching
> between calling eeh_handle_normal_event(pe) and
> eeh_handle_special_event(). However it is only called in two places,
> one where pe can't be NULL and the other where it must be NULL (see
> eeh_event_handler()) so it does nothing but obscure the flow of
> control.
>
> So, remove it.
>
> Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Series applied to powerpc next, thanks.
https://git.kernel.org/powerpc/c/68701780712f7ddb2fa81032aa1b4a
cheers
^ permalink raw reply [flat|nested] 24+ messages in thread
end of thread, other threads:[~2018-03-28 14:13 UTC | newest]
Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-03-19 2:45 [PATCH v2 0/9] EEH refactoring 1 Sam Bobroff
2018-03-19 2:46 ` [PATCH v2 1/9] powerpc/eeh: Remove eeh_handle_event() Sam Bobroff
2018-03-21 5:17 ` Alexey Kardashevskiy
2018-03-28 14:13 ` [v2,1/9] " Michael Ellerman
2018-03-19 2:46 ` [PATCH v2 2/9] powerpc/eeh: Manage EEH_PE_RECOVERING inside eeh_handle_normal_event() Sam Bobroff
2018-03-21 5:17 ` Alexey Kardashevskiy
2018-03-19 2:46 ` [PATCH v2 3/9] powerpc/eeh: Fix misleading comment in __eeh_addr_cache_get_device() Sam Bobroff
2018-03-21 5:17 ` Alexey Kardashevskiy
2018-03-19 2:46 ` [PATCH v2 4/9] powerpc/eeh: Remove misleading test in eeh_handle_normal_event() Sam Bobroff
2018-03-21 5:18 ` Alexey Kardashevskiy
2018-03-19 2:47 ` [PATCH v2 5/9] powerpc/eeh: Rename frozen_bus to bus " Sam Bobroff
2018-03-21 5:18 ` Alexey Kardashevskiy
2018-03-19 2:48 ` [PATCH v2 6/9] powerpc/eeh: Clarify arguments to eeh_reset_device() Sam Bobroff
2018-03-21 5:18 ` Alexey Kardashevskiy
2018-03-19 2:49 ` [PATCH v2 7/9] powerpc/eeh: Remove always-true tests in eeh_reset_device() Sam Bobroff
2018-03-21 5:18 ` Alexey Kardashevskiy
2018-03-19 2:49 ` [PATCH v2 8/9] powerpc/eeh: Factor out common code eeh_reset_device() Sam Bobroff
2018-03-20 8:31 ` kbuild test robot
2018-03-21 2:06 ` [PATCH v3 " Sam Bobroff
2018-03-21 5:18 ` Alexey Kardashevskiy
2018-03-19 2:49 ` [PATCH v2 9/9] powerpc/eeh: Add eeh_state_active() helper Sam Bobroff
2018-03-20 4:05 ` kbuild test robot
2018-03-21 5:20 ` Alexey Kardashevskiy
2018-03-21 5:16 ` [PATCH v2 0/9] EEH refactoring 1 Alexey Kardashevskiy
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.