All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 00/14] EEH refactoring 3
@ 2018-09-12  1:23 Sam Bobroff
  2018-09-12  1:23 ` [PATCH 01/14] powerpc/eeh: Fix possible null deref in eeh_dump_dev_log() Sam Bobroff
                   ` (13 more replies)
  0 siblings, 14 replies; 16+ messages in thread
From: Sam Bobroff @ 2018-09-12  1:23 UTC (permalink / raw)
  To: linuxppc-dev

Hello everyone,

Here is another set of minor fixes and cleanups for the EEH code. There should
be no significant changes in behaviour.

I'm not sure if all of these are worth doing, and I don't want to add
unnecessary churn so please comment if you have an opinion or better approach.

The set is based on powerpc next.

Cheers,
Sam

Sam Bobroff (14):
  powerpc/eeh: Fix possible null deref in eeh_dump_dev_log()
  powerpc/eeh: Fix null deref for devices removed during EEH
  powerpc/eeh: Fix use of EEH_PE_KEEP on wrong field
  powerpc/eeh: Cleanup EEH_POSTPONED_PROBE
  powerpc/eeh: Cleanup unused field in eeh_dev
  powerpc/eeh: Cleanup eeh_add_virt_device()
  powerpc/eeh: Cleanup list_head field names
  powerpc/eeh: Cleanup field names in eeh_rmv_data
  powerpc/eeh: Cleanup logic in eeh_rmv_from_parent_pe()
  powerpc/eeh: Cleanup eeh_enabled()
  powerpc/eeh: Cleanup unnecessary eeh_pe_state_mark_with_cfg()
  powerpc/eeh: Cleanup eeh_pe_state_mark()
  powerpc/eeh: Cleanup eeh_ops.wait_state()
  powerpc/eeh: Cleanup control flow in eeh_handle_normal_event()

 arch/powerpc/include/asm/eeh.h               |  24 +-
 arch/powerpc/include/asm/ppc-pci.h           |   1 +
 arch/powerpc/kernel/eeh.c                    |  42 ++--
 arch/powerpc/kernel/eeh_dev.c                |   2 -
 arch/powerpc/kernel/eeh_driver.c             | 237 +++++++++----------
 arch/powerpc/kernel/eeh_pe.c                 | 160 +++++++------
 arch/powerpc/platforms/powernv/eeh-powernv.c |  62 +----
 arch/powerpc/platforms/pseries/eeh_pseries.c |  66 +-----
 arch/powerpc/platforms/pseries/msi.c         |   3 +-
 arch/powerpc/platforms/pseries/pci.c         |   1 +
 drivers/pci/hotplug/pnv_php.c                |   2 +-
 11 files changed, 250 insertions(+), 350 deletions(-)

-- 
2.19.0.2.gcad72f5712

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 01/14] powerpc/eeh: Fix possible null deref in eeh_dump_dev_log()
  2018-09-12  1:23 [PATCH 00/14] EEH refactoring 3 Sam Bobroff
@ 2018-09-12  1:23 ` Sam Bobroff
  2018-10-15  4:00   ` [01/14] " Michael Ellerman
  2018-09-12  1:23 ` [PATCH 02/14] powerpc/eeh: Fix null deref for devices removed during EEH Sam Bobroff
                   ` (12 subsequent siblings)
  13 siblings, 1 reply; 16+ messages in thread
From: Sam Bobroff @ 2018-09-12  1:23 UTC (permalink / raw)
  To: linuxppc-dev

If an error occurs during an unplug operation, it's possible for
eeh_dump_dev_log() to be called when edev->pdn is null, which
currently leads to dereferencing a null pointer.

Handle this by skipping the error log for those devices.

Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
---
 arch/powerpc/kernel/eeh.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
index ac1cecc05742..69754af506dd 100644
--- a/arch/powerpc/kernel/eeh.c
+++ b/arch/powerpc/kernel/eeh.c
@@ -169,6 +169,11 @@ static size_t eeh_dump_dev_log(struct eeh_dev *edev, char *buf, size_t len)
 	int n = 0, l = 0;
 	char buffer[128];
 
+	if (!pdn) {
+		pr_warn("EEH: Note: No error log for absent device.\n");
+		return 0;
+	}
+
 	n += scnprintf(buf+n, len-n, "%04x:%02x:%02x.%01x\n",
 		       pdn->phb->global_number, pdn->busno,
 		       PCI_SLOT(pdn->devfn), PCI_FUNC(pdn->devfn));
-- 
2.19.0.2.gcad72f5712

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 02/14] powerpc/eeh: Fix null deref for devices removed during EEH
  2018-09-12  1:23 [PATCH 00/14] EEH refactoring 3 Sam Bobroff
  2018-09-12  1:23 ` [PATCH 01/14] powerpc/eeh: Fix possible null deref in eeh_dump_dev_log() Sam Bobroff
@ 2018-09-12  1:23 ` Sam Bobroff
  2018-09-12  1:23 ` [PATCH 03/14] powerpc/eeh: Fix use of EEH_PE_KEEP on wrong field Sam Bobroff
                   ` (11 subsequent siblings)
  13 siblings, 0 replies; 16+ messages in thread
From: Sam Bobroff @ 2018-09-12  1:23 UTC (permalink / raw)
  To: linuxppc-dev

If a device is removed during EEH processing (either by a driver's
handler or as part of recovery), it can lead to a null dereference
in eeh_pe_report_edev().

To handle this, skip devices that have been removed.

Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
---
 arch/powerpc/kernel/eeh_driver.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
index 67619b4b3f96..4115d353c349 100644
--- a/arch/powerpc/kernel/eeh_driver.c
+++ b/arch/powerpc/kernel/eeh_driver.c
@@ -281,6 +281,10 @@ static void eeh_pe_report_edev(struct eeh_dev *edev, eeh_report_fn fn,
 	struct pci_driver *driver;
 	enum pci_ers_result new_result;
 
+	if (!edev->pdev) {
+		eeh_edev_info(edev, "no device");
+		return;
+	}
 	device_lock(&edev->pdev->dev);
 	if (eeh_edev_actionable(edev)) {
 		driver = eeh_pcid_get(edev->pdev);
-- 
2.19.0.2.gcad72f5712

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 03/14] powerpc/eeh: Fix use of EEH_PE_KEEP on wrong field
  2018-09-12  1:23 [PATCH 00/14] EEH refactoring 3 Sam Bobroff
  2018-09-12  1:23 ` [PATCH 01/14] powerpc/eeh: Fix possible null deref in eeh_dump_dev_log() Sam Bobroff
  2018-09-12  1:23 ` [PATCH 02/14] powerpc/eeh: Fix null deref for devices removed during EEH Sam Bobroff
@ 2018-09-12  1:23 ` Sam Bobroff
  2018-09-12  1:23 ` [PATCH 04/14] powerpc/eeh: Cleanup EEH_POSTPONED_PROBE Sam Bobroff
                   ` (10 subsequent siblings)
  13 siblings, 0 replies; 16+ messages in thread
From: Sam Bobroff @ 2018-09-12  1:23 UTC (permalink / raw)
  To: linuxppc-dev

eeh_add_to_parent_pe() sometimes removes the EEH_PE_KEEP flag, but it
incorrectly removes it from pe->type, instead of pe->state.

However, rather than clearing it from the correct field, remove it.
Inspection of the code shows that it can't ever have had any effect
(even if it had been cleared from the correct field), because the
field is never tested after it is cleared by the statement in
question.

The clear statement was added by commit 807a827d4e74 ("powerpc/eeh:
Keep PE during hotplug"), but it didn't explain why it was necessary.

Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
---
 arch/powerpc/kernel/eeh_pe.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/eeh_pe.c b/arch/powerpc/kernel/eeh_pe.c
index 1b238ecc553e..210d239a9395 100644
--- a/arch/powerpc/kernel/eeh_pe.c
+++ b/arch/powerpc/kernel/eeh_pe.c
@@ -379,7 +379,7 @@ int eeh_add_to_parent_pe(struct eeh_dev *edev)
 		while (parent) {
 			if (!(parent->type & EEH_PE_INVALID))
 				break;
-			parent->type &= ~(EEH_PE_INVALID | EEH_PE_KEEP);
+			parent->type &= ~EEH_PE_INVALID;
 			parent = parent->parent;
 		}
 
-- 
2.19.0.2.gcad72f5712

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 04/14] powerpc/eeh: Cleanup EEH_POSTPONED_PROBE
  2018-09-12  1:23 [PATCH 00/14] EEH refactoring 3 Sam Bobroff
                   ` (2 preceding siblings ...)
  2018-09-12  1:23 ` [PATCH 03/14] powerpc/eeh: Fix use of EEH_PE_KEEP on wrong field Sam Bobroff
@ 2018-09-12  1:23 ` Sam Bobroff
  2018-09-12  1:23 ` [PATCH 05/14] powerpc/eeh: Cleanup unused field in eeh_dev Sam Bobroff
                   ` (9 subsequent siblings)
  13 siblings, 0 replies; 16+ messages in thread
From: Sam Bobroff @ 2018-09-12  1:23 UTC (permalink / raw)
  To: linuxppc-dev

Currently a flag, EEH_POSTPONED_PROBE, is used to prevent an incorrect
message "EEH: No capable adapters found" from being displayed during
the boot of powernv systems.

It is necessary because, on powernv, the call to eeh_probe_devices()
made from eeh_init() is too early and EEH can't yet be enabled. A
second call is made later from eeh_pnv_post_init(), which succeeds.

(On pseries, the first call succeeds because PCI devices are set up
early enough and no second call is made.)

This can be simplified by moving the early call to eeh_probe_devices()
from eeh_init() (where it's seen by both platforms) to
pSeries_final_fixup(), so that each platform only calls
eeh_probe_devices() once, at a point where it can succeed.
This is slightly later in the boot sequence, but but still early
enough and it is now in the same place in the sequence for both
platforms (the pcibios_fixup hook).

The display of the message can be cleaned up as well, by moving it
into eeh_probe_devices().

Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
---
 arch/powerpc/include/asm/eeh.h               |  1 -
 arch/powerpc/kernel/eeh.c                    | 18 ++++++------------
 arch/powerpc/platforms/powernv/eeh-powernv.c | 14 --------------
 arch/powerpc/platforms/pseries/pci.c         |  1 +
 4 files changed, 7 insertions(+), 27 deletions(-)

diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
index 219637ea69a1..147f0117e56f 100644
--- a/arch/powerpc/include/asm/eeh.h
+++ b/arch/powerpc/include/asm/eeh.h
@@ -43,7 +43,6 @@ struct pci_dn;
 #define EEH_VALID_PE_ZERO	0x10	/* PE#0 is valid		     */
 #define EEH_ENABLE_IO_FOR_LOG	0x20	/* Enable IO for log		     */
 #define EEH_EARLY_DUMP_LOG	0x40	/* Dump log immediately		     */
-#define EEH_POSTPONED_PROBE	0x80    /* Powernv may postpone device probe */
 
 /*
  * Delay for PE reset, all in ms
diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
index 69754af506dd..25e7384fdaba 100644
--- a/arch/powerpc/kernel/eeh.c
+++ b/arch/powerpc/kernel/eeh.c
@@ -1041,6 +1041,11 @@ void eeh_probe_devices(void)
 		pdn = hose->pci_data;
 		traverse_pci_dn(pdn, eeh_ops->probe, NULL);
 	}
+	if (eeh_enabled())
+		pr_info("EEH: PCI Enhanced I/O Error Handling Enabled\n");
+	else
+		pr_info("EEH: No capable adapters found\n");
+
 }
 
 /**
@@ -1084,18 +1089,7 @@ static int eeh_init(void)
 		eeh_dev_phb_init_dynamic(hose);
 
 	/* Initialize EEH event */
-	ret = eeh_event_init();
-	if (ret)
-		return ret;
-
-	eeh_probe_devices();
-
-	if (eeh_enabled())
-		pr_info("EEH: PCI Enhanced I/O Error Handling Enabled\n");
-	else if (!eeh_has_flag(EEH_POSTPONED_PROBE))
-		pr_info("EEH: No capable adapters found\n");
-
-	return ret;
+	return eeh_event_init();
 }
 
 core_initcall_sync(eeh_init);
diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
index 3c1beae29f2d..d0764f2c0733 100644
--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
@@ -223,14 +223,6 @@ int pnv_eeh_post_init(void)
 	eeh_probe_devices();
 	eeh_addr_cache_build();
 
-	if (eeh_has_flag(EEH_POSTPONED_PROBE)) {
-		eeh_clear_flag(EEH_POSTPONED_PROBE);
-		if (eeh_enabled())
-			pr_info("EEH: PCI Enhanced I/O Error Handling Enabled\n");
-		else
-			pr_info("EEH: No capable adapters found\n");
-	}
-
 	/* Register OPAL event notifier */
 	eeh_event_irq = opal_event_request(ilog2(OPAL_EVENT_PCI_ERROR));
 	if (eeh_event_irq < 0) {
@@ -391,12 +383,6 @@ static void *pnv_eeh_probe(struct pci_dn *pdn, void *data)
 	if ((pdn->class_code >> 8) == PCI_CLASS_BRIDGE_ISA)
 		return NULL;
 
-	/* Skip if we haven't probed yet */
-	if (phb->ioda.pe_rmap[config_addr] == IODA_INVALID_PE) {
-		eeh_add_flag(EEH_POSTPONED_PROBE);
-		return NULL;
-	}
-
 	/* Initialize eeh device */
 	edev->class_code = pdn->class_code;
 	edev->mode	&= 0xFFFFFF00;
diff --git a/arch/powerpc/platforms/pseries/pci.c b/arch/powerpc/platforms/pseries/pci.c
index eab96637d6cf..41d8a4d1d02e 100644
--- a/arch/powerpc/platforms/pseries/pci.c
+++ b/arch/powerpc/platforms/pseries/pci.c
@@ -239,6 +239,7 @@ void __init pSeries_final_fixup(void)
 {
 	pSeries_request_regions();
 
+	eeh_probe_devices();
 	eeh_addr_cache_build();
 
 #ifdef CONFIG_PCI_IOV
-- 
2.19.0.2.gcad72f5712

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 05/14] powerpc/eeh: Cleanup unused field in eeh_dev
  2018-09-12  1:23 [PATCH 00/14] EEH refactoring 3 Sam Bobroff
                   ` (3 preceding siblings ...)
  2018-09-12  1:23 ` [PATCH 04/14] powerpc/eeh: Cleanup EEH_POSTPONED_PROBE Sam Bobroff
@ 2018-09-12  1:23 ` Sam Bobroff
  2018-09-12  1:23 ` [PATCH 06/14] powerpc/eeh: Cleanup eeh_add_virt_device() Sam Bobroff
                   ` (8 subsequent siblings)
  13 siblings, 0 replies; 16+ messages in thread
From: Sam Bobroff @ 2018-09-12  1:23 UTC (permalink / raw)
  To: linuxppc-dev

The 'bus' member of struct eeh_dev is assigned to once but never used,
so remove it.

Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
---
 arch/powerpc/include/asm/eeh.h   | 1 -
 arch/powerpc/kernel/eeh_driver.c | 1 -
 2 files changed, 2 deletions(-)

diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
index 147f0117e56f..703d1f96ee8b 100644
--- a/arch/powerpc/include/asm/eeh.h
+++ b/arch/powerpc/include/asm/eeh.h
@@ -147,7 +147,6 @@ struct eeh_dev {
 	struct pci_dev *pdev;		/* Associated PCI device	*/
 	bool in_error;			/* Error flag for edev		*/
 	struct pci_dev *physfn;		/* Associated SRIOV PF		*/
-	struct pci_bus *bus;		/* PCI bus for partial hotplug	*/
 };
 
 static inline struct pci_dn *eeh_dev_to_pdn(struct eeh_dev *edev)
diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
index 4115d353c349..7766766bab57 100644
--- a/arch/powerpc/kernel/eeh_driver.c
+++ b/arch/powerpc/kernel/eeh_driver.c
@@ -543,7 +543,6 @@ static void *eeh_rmv_device(struct eeh_dev *edev, void *userdata)
 	/* Remove it from PCI subsystem */
 	pr_debug("EEH: Removing %s without EEH sensitive driver\n",
 		 pci_name(dev));
-	edev->bus = dev->bus;
 	edev->mode |= EEH_DEV_DISCONNECTED;
 	if (removed)
 		(*removed)++;
-- 
2.19.0.2.gcad72f5712

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 06/14] powerpc/eeh: Cleanup eeh_add_virt_device()
  2018-09-12  1:23 [PATCH 00/14] EEH refactoring 3 Sam Bobroff
                   ` (4 preceding siblings ...)
  2018-09-12  1:23 ` [PATCH 05/14] powerpc/eeh: Cleanup unused field in eeh_dev Sam Bobroff
@ 2018-09-12  1:23 ` Sam Bobroff
  2018-09-12  1:23 ` [PATCH 07/14] powerpc/eeh: Cleanup list_head field names Sam Bobroff
                   ` (7 subsequent siblings)
  13 siblings, 0 replies; 16+ messages in thread
From: Sam Bobroff @ 2018-09-12  1:23 UTC (permalink / raw)
  To: linuxppc-dev

Remove the unnecessary cast through void * on the first parameter and
remove the unused second parameter (always NULL).

Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
---
 arch/powerpc/kernel/eeh_driver.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
index 7766766bab57..cc300eb9585c 100644
--- a/arch/powerpc/kernel/eeh_driver.c
+++ b/arch/powerpc/kernel/eeh_driver.c
@@ -469,10 +469,9 @@ static enum pci_ers_result eeh_report_failure(struct eeh_dev *edev,
 	return rc;
 }
 
-static void *eeh_add_virt_device(void *data, void *userdata)
+static void *eeh_add_virt_device(struct eeh_dev *edev)
 {
 	struct pci_driver *driver;
-	struct eeh_dev *edev = (struct eeh_dev *)data;
 	struct pci_dev *dev = eeh_dev_to_pci_dev(edev);
 	struct pci_dn *pdn = eeh_dev_to_pdn(edev);
 
@@ -743,7 +742,7 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus,
 		edev = list_first_entry(&pe->edevs, struct eeh_dev, list);
 		eeh_pe_traverse(pe, eeh_pe_detach_dev, NULL);
 		if (pe->type & EEH_PE_VF) {
-			eeh_add_virt_device(edev, NULL);
+			eeh_add_virt_device(edev);
 		} else {
 			if (!driver_eeh_aware)
 				eeh_pe_state_clear(pe, EEH_PE_PRI_BUS);
@@ -936,7 +935,7 @@ void eeh_handle_normal_event(struct eeh_pe *pe)
 	 * recovered properly.
 	 */
 	list_for_each_entry_safe(edev, tmp, &rmv_data.edev_list, rmv_list) {
-		eeh_add_virt_device(edev, NULL);
+		eeh_add_virt_device(edev);
 		list_del(&edev->rmv_list);
 	}
 
-- 
2.19.0.2.gcad72f5712

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 07/14] powerpc/eeh: Cleanup list_head field names
  2018-09-12  1:23 [PATCH 00/14] EEH refactoring 3 Sam Bobroff
                   ` (5 preceding siblings ...)
  2018-09-12  1:23 ` [PATCH 06/14] powerpc/eeh: Cleanup eeh_add_virt_device() Sam Bobroff
@ 2018-09-12  1:23 ` Sam Bobroff
  2018-09-12  1:23 ` [PATCH 08/14] powerpc/eeh: Cleanup field names in eeh_rmv_data Sam Bobroff
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 16+ messages in thread
From: Sam Bobroff @ 2018-09-12  1:23 UTC (permalink / raw)
  To: linuxppc-dev

Instances of struct eeh_pe are placed in a tree structure using the
fields "child_list" and "child", so place these next to each other
in the definition.

The field "child" is a list entry, so remove the unnecessary and
misleading use of the list initializer, LIST_HEAD(), on it.

The eeh_dev struct contains two list entry fields, called "list" and
"rmv_list". Rename them to "entry" and "rmv_entry" and, as above, stop
initializing them with LIST_HEAD().

Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
---
 arch/powerpc/include/asm/eeh.h               | 12 ++++++------
 arch/powerpc/kernel/eeh_dev.c                |  2 --
 arch/powerpc/kernel/eeh_driver.c             | 10 +++++-----
 arch/powerpc/kernel/eeh_pe.c                 | 11 +++++------
 arch/powerpc/platforms/powernv/eeh-powernv.c |  2 +-
 arch/powerpc/platforms/pseries/msi.c         |  3 ++-
 6 files changed, 19 insertions(+), 21 deletions(-)

diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
index 703d1f96ee8b..b48b08ed9be3 100644
--- a/arch/powerpc/include/asm/eeh.h
+++ b/arch/powerpc/include/asm/eeh.h
@@ -98,13 +98,13 @@ struct eeh_pe {
 	atomic_t pass_dev_cnt;		/* Count of passed through devs	*/
 	struct eeh_pe *parent;		/* Parent PE			*/
 	void *data;			/* PE auxillary data		*/
-	struct list_head child_list;	/* Link PE to the child list	*/
-	struct list_head edevs;		/* Link list of EEH devices	*/
-	struct list_head child;		/* Child PEs			*/
+	struct list_head child_list;	/* List of PEs below this PE	*/
+	struct list_head child;		/* Memb. child_list/eeh_phb_pe	*/
+	struct list_head edevs;		/* List of eeh_dev in this PE	*/
 };
 
 #define eeh_pe_for_each_dev(pe, edev, tmp) \
-		list_for_each_entry_safe(edev, tmp, &pe->edevs, list)
+		list_for_each_entry_safe(edev, tmp, &pe->edevs, entry)
 
 #define eeh_for_each_pe(root, pe) \
 	for (pe = root; pe; pe = eeh_pe_next(pe, root))
@@ -141,8 +141,8 @@ struct eeh_dev {
 	int aer_cap;			/* Saved AER capability		*/
 	int af_cap;			/* Saved AF capability		*/
 	struct eeh_pe *pe;		/* Associated PE		*/
-	struct list_head list;		/* Form link list in the PE	*/
-	struct list_head rmv_list;	/* Record the removed edevs	*/
+	struct list_head entry;		/* Membership in eeh_pe.edevs	*/
+	struct list_head rmv_entry;	/* Membership in rmv_list	*/
 	struct pci_dn *pdn;		/* Associated PCI device node	*/
 	struct pci_dev *pdev;		/* Associated PCI device	*/
 	bool in_error;			/* Error flag for edev		*/
diff --git a/arch/powerpc/kernel/eeh_dev.c b/arch/powerpc/kernel/eeh_dev.c
index a34e6912c15e..d8c90f3284b5 100644
--- a/arch/powerpc/kernel/eeh_dev.c
+++ b/arch/powerpc/kernel/eeh_dev.c
@@ -60,8 +60,6 @@ struct eeh_dev *eeh_dev_init(struct pci_dn *pdn)
 	/* Associate EEH device with OF node */
 	pdn->edev = edev;
 	edev->pdn = pdn;
-	INIT_LIST_HEAD(&edev->list);
-	INIT_LIST_HEAD(&edev->rmv_list);
 
 	return edev;
 }
diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
index cc300eb9585c..7859af897058 100644
--- a/arch/powerpc/kernel/eeh_driver.c
+++ b/arch/powerpc/kernel/eeh_driver.c
@@ -404,7 +404,7 @@ static void *eeh_dev_restore_state(struct eeh_dev *edev, void *userdata)
 	 * EEH device is created.
 	 */
 	if (edev->pe && (edev->pe->state & EEH_PE_CFG_RESTRICTED)) {
-		if (list_is_last(&edev->list, &edev->pe->edevs))
+		if (list_is_last(&edev->entry, &edev->pe->edevs))
 			eeh_pe_restore_bars(edev->pe);
 
 		return NULL;
@@ -560,7 +560,7 @@ static void *eeh_rmv_device(struct eeh_dev *edev, void *userdata)
 		pdn->pe_number = IODA_INVALID_PE;
 #endif
 		if (rmv_data)
-			list_add(&edev->rmv_list, &rmv_data->edev_list);
+			list_add(&edev->rmv_entry, &rmv_data->edev_list);
 	} else {
 		pci_lock_rescan_remove();
 		pci_stop_and_remove_bus_device(dev);
@@ -739,7 +739,7 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus,
 		 * PE. We should disconnect it so the binding can be
 		 * rebuilt when adding PCI devices.
 		 */
-		edev = list_first_entry(&pe->edevs, struct eeh_dev, list);
+		edev = list_first_entry(&pe->edevs, struct eeh_dev, entry);
 		eeh_pe_traverse(pe, eeh_pe_detach_dev, NULL);
 		if (pe->type & EEH_PE_VF) {
 			eeh_add_virt_device(edev);
@@ -934,9 +934,9 @@ void eeh_handle_normal_event(struct eeh_pe *pe)
 	 * For those hot removed VFs, we should add back them after PF get
 	 * recovered properly.
 	 */
-	list_for_each_entry_safe(edev, tmp, &rmv_data.edev_list, rmv_list) {
+	list_for_each_entry_safe(edev, tmp, &rmv_data.edev_list, rmv_entry) {
 		eeh_add_virt_device(edev);
-		list_del(&edev->rmv_list);
+		list_del(&edev->rmv_entry);
 	}
 
 	/* Tell all device drivers that they can resume operations */
diff --git a/arch/powerpc/kernel/eeh_pe.c b/arch/powerpc/kernel/eeh_pe.c
index 210d239a9395..7d6d93cd67e1 100644
--- a/arch/powerpc/kernel/eeh_pe.c
+++ b/arch/powerpc/kernel/eeh_pe.c
@@ -75,7 +75,6 @@ static struct eeh_pe *eeh_pe_alloc(struct pci_controller *phb, int type)
 	pe->type = type;
 	pe->phb = phb;
 	INIT_LIST_HEAD(&pe->child_list);
-	INIT_LIST_HEAD(&pe->child);
 	INIT_LIST_HEAD(&pe->edevs);
 
 	pe->data = (void *)pe + ALIGN(sizeof(struct eeh_pe),
@@ -360,7 +359,7 @@ int eeh_add_to_parent_pe(struct eeh_dev *edev)
 		edev->pe = pe;
 
 		/* Put the edev to PE */
-		list_add_tail(&edev->list, &pe->edevs);
+		list_add_tail(&edev->entry, &pe->edevs);
 		pr_debug("EEH: Add %04x:%02x:%02x.%01x to Bus PE#%x\n",
 			 pdn->phb->global_number,
 			 pdn->busno,
@@ -369,7 +368,7 @@ int eeh_add_to_parent_pe(struct eeh_dev *edev)
 			 pe->addr);
 		return 0;
 	} else if (pe && (pe->type & EEH_PE_INVALID)) {
-		list_add_tail(&edev->list, &pe->edevs);
+		list_add_tail(&edev->entry, &pe->edevs);
 		edev->pe = pe;
 		/*
 		 * We're running to here because of PCI hotplug caused by
@@ -429,7 +428,7 @@ int eeh_add_to_parent_pe(struct eeh_dev *edev)
 	 * link the EEH device accordingly.
 	 */
 	list_add_tail(&pe->child, &parent->child_list);
-	list_add_tail(&edev->list, &pe->edevs);
+	list_add_tail(&edev->entry, &pe->edevs);
 	edev->pe = pe;
 	pr_debug("EEH: Add %04x:%02x:%02x.%01x to "
 		 "Device PE#%x, Parent PE#%x\n",
@@ -469,7 +468,7 @@ int eeh_rmv_from_parent_pe(struct eeh_dev *edev)
 	/* Remove the EEH device */
 	pe = eeh_dev_to_pe(edev);
 	edev->pe = NULL;
-	list_del(&edev->list);
+	list_del(&edev->entry);
 
 	/*
 	 * Check if the parent PE includes any EEH devices.
@@ -945,7 +944,7 @@ struct pci_bus *eeh_pe_bus_get(struct eeh_pe *pe)
 		return pe->bus;
 
 	/* Retrieve the parent PCI bus of first (top) PCI device */
-	edev = list_first_entry_or_null(&pe->edevs, struct eeh_dev, list);
+	edev = list_first_entry_or_null(&pe->edevs, struct eeh_dev, entry);
 	pdev = eeh_dev_to_pci_dev(edev);
 	if (pdev)
 		return pdev->bus;
diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
index d0764f2c0733..a7e59dbf2696 100644
--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
@@ -1040,7 +1040,7 @@ static int pnv_eeh_reset_vf_pe(struct eeh_pe *pe, int option)
 	int ret;
 
 	/* The VF PE should have only one child device */
-	edev = list_first_entry_or_null(&pe->edevs, struct eeh_dev, list);
+	edev = list_first_entry_or_null(&pe->edevs, struct eeh_dev, entry);
 	pdn = eeh_dev_to_pdn(edev);
 	if (!pdn)
 		return -ENXIO;
diff --git a/arch/powerpc/platforms/pseries/msi.c b/arch/powerpc/platforms/pseries/msi.c
index b7496948129e..8011b4129e3a 100644
--- a/arch/powerpc/platforms/pseries/msi.c
+++ b/arch/powerpc/platforms/pseries/msi.c
@@ -203,7 +203,8 @@ static struct device_node *find_pe_dn(struct pci_dev *dev, int *total)
 	/* Get the top level device in the PE */
 	edev = pdn_to_eeh_dev(PCI_DN(dn));
 	if (edev->pe)
-		edev = list_first_entry(&edev->pe->edevs, struct eeh_dev, list);
+		edev = list_first_entry(&edev->pe->edevs, struct eeh_dev,
+					entry);
 	dn = pci_device_to_OF_node(edev->pdev);
 	if (!dn)
 		return NULL;
-- 
2.19.0.2.gcad72f5712

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 08/14] powerpc/eeh: Cleanup field names in eeh_rmv_data
  2018-09-12  1:23 [PATCH 00/14] EEH refactoring 3 Sam Bobroff
                   ` (6 preceding siblings ...)
  2018-09-12  1:23 ` [PATCH 07/14] powerpc/eeh: Cleanup list_head field names Sam Bobroff
@ 2018-09-12  1:23 ` Sam Bobroff
  2018-09-12  1:23 ` [PATCH 09/14] powerpc/eeh: Cleanup logic in eeh_rmv_from_parent_pe() Sam Bobroff
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 16+ messages in thread
From: Sam Bobroff @ 2018-09-12  1:23 UTC (permalink / raw)
  To: linuxppc-dev

Change the name of the fields in eeh_rmv_data to clarify their usage.

Change "edev_list" to "removed_vf_list" because it does not contain
generic edevs, but rather only edevs that contain virtual functions
(which need to be removed during recovery).

Similarly, change "removed" to "removed_dev_count" because it is a
count of any removed devices, not just those in the above list.

Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
---
 arch/powerpc/kernel/eeh_driver.c | 21 +++++++++++----------
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
index 7859af897058..ffe8293d1f06 100644
--- a/arch/powerpc/kernel/eeh_driver.c
+++ b/arch/powerpc/kernel/eeh_driver.c
@@ -35,8 +35,8 @@
 #include <asm/rtas.h>
 
 struct eeh_rmv_data {
-	struct list_head edev_list;
-	int removed;
+	struct list_head removed_vf_list;
+	int removed_dev_count;
 };
 
 static int eeh_result_priority(enum pci_ers_result result)
@@ -502,7 +502,6 @@ static void *eeh_rmv_device(struct eeh_dev *edev, void *userdata)
 	struct pci_driver *driver;
 	struct pci_dev *dev = eeh_dev_to_pci_dev(edev);
 	struct eeh_rmv_data *rmv_data = (struct eeh_rmv_data *)userdata;
-	int *removed = rmv_data ? &rmv_data->removed : NULL;
 
 	/*
 	 * Actually, we should remove the PCI bridges as well.
@@ -524,7 +523,7 @@ static void *eeh_rmv_device(struct eeh_dev *edev, void *userdata)
 	if (eeh_dev_removed(edev))
 		return NULL;
 
-	if (removed) {
+	if (rmv_data) {
 		if (eeh_pe_passed(edev->pe))
 			return NULL;
 		driver = eeh_pcid_get(dev);
@@ -543,8 +542,8 @@ static void *eeh_rmv_device(struct eeh_dev *edev, void *userdata)
 	pr_debug("EEH: Removing %s without EEH sensitive driver\n",
 		 pci_name(dev));
 	edev->mode |= EEH_DEV_DISCONNECTED;
-	if (removed)
-		(*removed)++;
+	if (rmv_data)
+		rmv_data->removed_dev_count++;
 
 	if (edev->physfn) {
 #ifdef CONFIG_PCI_IOV
@@ -560,7 +559,7 @@ static void *eeh_rmv_device(struct eeh_dev *edev, void *userdata)
 		pdn->pe_number = IODA_INVALID_PE;
 #endif
 		if (rmv_data)
-			list_add(&edev->rmv_entry, &rmv_data->edev_list);
+			list_add(&edev->rmv_entry, &rmv_data->removed_vf_list);
 	} else {
 		pci_lock_rescan_remove();
 		pci_stop_and_remove_bus_device(dev);
@@ -729,7 +728,7 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus,
 	 * the device up before the scripts have taken it down,
 	 * potentially weird things happen.
 	 */
-	if (!driver_eeh_aware || rmv_data->removed) {
+	if (!driver_eeh_aware || rmv_data->removed_dev_count) {
 		pr_info("EEH: Sleep 5s ahead of %s hotplug\n",
 			(driver_eeh_aware ? "partial" : "complete"));
 		ssleep(5);
@@ -791,7 +790,8 @@ void eeh_handle_normal_event(struct eeh_pe *pe)
 	struct eeh_pe *tmp_pe;
 	int rc = 0;
 	enum pci_ers_result result = PCI_ERS_RESULT_NONE;
-	struct eeh_rmv_data rmv_data = {LIST_HEAD_INIT(rmv_data.edev_list), 0};
+	struct eeh_rmv_data rmv_data =
+		{LIST_HEAD_INIT(rmv_data.removed_vf_list), 0};
 
 	bus = eeh_pe_bus_get(pe);
 	if (!bus) {
@@ -934,7 +934,8 @@ void eeh_handle_normal_event(struct eeh_pe *pe)
 	 * For those hot removed VFs, we should add back them after PF get
 	 * recovered properly.
 	 */
-	list_for_each_entry_safe(edev, tmp, &rmv_data.edev_list, rmv_entry) {
+	list_for_each_entry_safe(edev, tmp, &rmv_data.removed_vf_list,
+				 rmv_entry) {
 		eeh_add_virt_device(edev);
 		list_del(&edev->rmv_entry);
 	}
-- 
2.19.0.2.gcad72f5712

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 09/14] powerpc/eeh: Cleanup logic in eeh_rmv_from_parent_pe()
  2018-09-12  1:23 [PATCH 00/14] EEH refactoring 3 Sam Bobroff
                   ` (7 preceding siblings ...)
  2018-09-12  1:23 ` [PATCH 08/14] powerpc/eeh: Cleanup field names in eeh_rmv_data Sam Bobroff
@ 2018-09-12  1:23 ` Sam Bobroff
  2018-09-12  1:23 ` [PATCH 10/14] powerpc/eeh: Cleanup eeh_enabled() Sam Bobroff
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 16+ messages in thread
From: Sam Bobroff @ 2018-09-12  1:23 UTC (permalink / raw)
  To: linuxppc-dev

Move the call to eeh_dev_to_pe() up, so that later it's clear that
"pe" isn't NULL.

Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
---
 arch/powerpc/kernel/eeh_pe.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/eeh_pe.c b/arch/powerpc/kernel/eeh_pe.c
index 7d6d93cd67e1..78f125d24bd0 100644
--- a/arch/powerpc/kernel/eeh_pe.c
+++ b/arch/powerpc/kernel/eeh_pe.c
@@ -456,7 +456,8 @@ int eeh_rmv_from_parent_pe(struct eeh_dev *edev)
 	int cnt;
 	struct pci_dn *pdn = eeh_dev_to_pdn(edev);
 
-	if (!edev->pe) {
+	pe = eeh_dev_to_pe(edev);
+	if (!pe) {
 		pr_debug("%s: No PE found for device %04x:%02x:%02x.%01x\n",
 			 __func__,  pdn->phb->global_number,
 			 pdn->busno,
@@ -466,7 +467,6 @@ int eeh_rmv_from_parent_pe(struct eeh_dev *edev)
 	}
 
 	/* Remove the EEH device */
-	pe = eeh_dev_to_pe(edev);
 	edev->pe = NULL;
 	list_del(&edev->entry);
 
-- 
2.19.0.2.gcad72f5712

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 10/14] powerpc/eeh: Cleanup eeh_enabled()
  2018-09-12  1:23 [PATCH 00/14] EEH refactoring 3 Sam Bobroff
                   ` (8 preceding siblings ...)
  2018-09-12  1:23 ` [PATCH 09/14] powerpc/eeh: Cleanup logic in eeh_rmv_from_parent_pe() Sam Bobroff
@ 2018-09-12  1:23 ` Sam Bobroff
  2018-09-12  1:23 ` [PATCH 11/14] powerpc/eeh: Cleanup unnecessary eeh_pe_state_mark_with_cfg() Sam Bobroff
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 16+ messages in thread
From: Sam Bobroff @ 2018-09-12  1:23 UTC (permalink / raw)
  To: linuxppc-dev

Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
---
 arch/powerpc/include/asm/eeh.h | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
index b48b08ed9be3..247f09ce44de 100644
--- a/arch/powerpc/include/asm/eeh.h
+++ b/arch/powerpc/include/asm/eeh.h
@@ -241,11 +241,7 @@ static inline bool eeh_has_flag(int flag)
 
 static inline bool eeh_enabled(void)
 {
-	if (eeh_has_flag(EEH_FORCE_DISABLED) ||
-	    !eeh_has_flag(EEH_ENABLED))
-		return false;
-
-	return true;
+	return eeh_has_flag(EEH_ENABLED) && !eeh_has_flag(EEH_FORCE_DISABLED);
 }
 
 static inline void eeh_serialize_lock(unsigned long *flags)
-- 
2.19.0.2.gcad72f5712

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 11/14] powerpc/eeh: Cleanup unnecessary eeh_pe_state_mark_with_cfg()
  2018-09-12  1:23 [PATCH 00/14] EEH refactoring 3 Sam Bobroff
                   ` (9 preceding siblings ...)
  2018-09-12  1:23 ` [PATCH 10/14] powerpc/eeh: Cleanup eeh_enabled() Sam Bobroff
@ 2018-09-12  1:23 ` Sam Bobroff
  2018-09-12  1:23 ` [PATCH 12/14] powerpc/eeh: Cleanup eeh_pe_state_mark() Sam Bobroff
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 16+ messages in thread
From: Sam Bobroff @ 2018-09-12  1:23 UTC (permalink / raw)
  To: linuxppc-dev

The function eeh_pe_state_mark_with_cfg() just performs the work of
eeh_pe_state_mark() and then, conditionally, the work of
eeh_pe_state_clear(). However it is only ever called with a constant
state such that the condition is always true, so replace it by direct
calls.

Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
---
 arch/powerpc/kernel/eeh.c    |  6 ++++--
 arch/powerpc/kernel/eeh_pe.c | 22 ----------------------
 2 files changed, 4 insertions(+), 24 deletions(-)

diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
index 25e7384fdaba..dd5ac2ad141e 100644
--- a/arch/powerpc/kernel/eeh.c
+++ b/arch/powerpc/kernel/eeh.c
@@ -830,7 +830,8 @@ int pcibios_set_pcie_reset_state(struct pci_dev *dev, enum pcie_reset_state stat
 		eeh_pe_state_clear(pe, EEH_PE_ISOLATED);
 		break;
 	case pcie_hot_reset:
-		eeh_pe_state_mark_with_cfg(pe, EEH_PE_ISOLATED);
+		eeh_pe_state_mark(pe, EEH_PE_ISOLATED);
+		eeh_pe_state_clear(pe, EEH_PE_CFG_BLOCKED);
 		eeh_ops->set_option(pe, EEH_OPT_FREEZE_PE);
 		eeh_pe_dev_traverse(pe, eeh_disable_and_save_dev_state, dev);
 		if (!(pe->type & EEH_PE_VF))
@@ -838,7 +839,8 @@ int pcibios_set_pcie_reset_state(struct pci_dev *dev, enum pcie_reset_state stat
 		eeh_ops->reset(pe, EEH_RESET_HOT);
 		break;
 	case pcie_warm_reset:
-		eeh_pe_state_mark_with_cfg(pe, EEH_PE_ISOLATED);
+		eeh_pe_state_mark(pe, EEH_PE_ISOLATED);
+		eeh_pe_state_clear(pe, EEH_PE_CFG_BLOCKED);
 		eeh_ops->set_option(pe, EEH_OPT_FREEZE_PE);
 		eeh_pe_dev_traverse(pe, eeh_disable_and_save_dev_state, dev);
 		if (!(pe->type & EEH_PE_VF))
diff --git a/arch/powerpc/kernel/eeh_pe.c b/arch/powerpc/kernel/eeh_pe.c
index 78f125d24bd0..2b376718237f 100644
--- a/arch/powerpc/kernel/eeh_pe.c
+++ b/arch/powerpc/kernel/eeh_pe.c
@@ -670,28 +670,6 @@ void eeh_pe_state_clear(struct eeh_pe *pe, int state)
 	eeh_pe_traverse(pe, __eeh_pe_state_clear, &state);
 }
 
-/**
- * eeh_pe_state_mark_with_cfg - Mark PE state with unblocked config space
- * @pe: PE
- * @state: PE state to be set
- *
- * Set specified flag to PE and its child PEs. The PCI config space
- * of some PEs is blocked automatically when EEH_PE_ISOLATED is set,
- * which isn't needed in some situations. The function allows to set
- * the specified flag to indicated PEs without blocking their PCI
- * config space.
- */
-void eeh_pe_state_mark_with_cfg(struct eeh_pe *pe, int state)
-{
-	eeh_pe_traverse(pe, __eeh_pe_state_mark, &state);
-	if (!(state & EEH_PE_ISOLATED))
-		return;
-
-	/* Clear EEH_PE_CFG_BLOCKED, which might be set just now */
-	state = EEH_PE_CFG_BLOCKED;
-	eeh_pe_traverse(pe, __eeh_pe_state_clear, &state);
-}
-
 /*
  * Some PCI bridges (e.g. PLX bridges) have primary/secondary
  * buses assigned explicitly by firmware, and we probably have
-- 
2.19.0.2.gcad72f5712

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 12/14] powerpc/eeh: Cleanup eeh_pe_state_mark()
  2018-09-12  1:23 [PATCH 00/14] EEH refactoring 3 Sam Bobroff
                   ` (10 preceding siblings ...)
  2018-09-12  1:23 ` [PATCH 11/14] powerpc/eeh: Cleanup unnecessary eeh_pe_state_mark_with_cfg() Sam Bobroff
@ 2018-09-12  1:23 ` Sam Bobroff
  2018-09-12  1:23 ` [PATCH 13/14] powerpc/eeh: Cleanup eeh_ops.wait_state() Sam Bobroff
  2018-09-12  1:23 ` [PATCH 14/14] powerpc/eeh: Cleanup control flow in eeh_handle_normal_event() Sam Bobroff
  13 siblings, 0 replies; 16+ messages in thread
From: Sam Bobroff @ 2018-09-12  1:23 UTC (permalink / raw)
  To: linuxppc-dev

Currently, eeh_pe_state_mark() marks a PE (and it's children) with a
state and then performs additional processing if that state included
EEH_PE_ISOLATED.

The state parameter is always a constant at the call site, so
rearrange eeh_pe_state_mark() into two functions and just call the
appropriate one at each site.

Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
---
 arch/powerpc/include/asm/ppc-pci.h           |  1 +
 arch/powerpc/kernel/eeh.c                    |  8 +--
 arch/powerpc/kernel/eeh_driver.c             | 10 ++-
 arch/powerpc/kernel/eeh_pe.c                 | 70 +++++++++-----------
 arch/powerpc/platforms/powernv/eeh-powernv.c |  8 +--
 drivers/pci/hotplug/pnv_php.c                |  2 +-
 6 files changed, 46 insertions(+), 53 deletions(-)

diff --git a/arch/powerpc/include/asm/ppc-pci.h b/arch/powerpc/include/asm/ppc-pci.h
index 726288048652..f67da277d652 100644
--- a/arch/powerpc/include/asm/ppc-pci.h
+++ b/arch/powerpc/include/asm/ppc-pci.h
@@ -58,6 +58,7 @@ void eeh_save_bars(struct eeh_dev *edev);
 int rtas_write_config(struct pci_dn *, int where, int size, u32 val);
 int rtas_read_config(struct pci_dn *, int where, int size, u32 *val);
 void eeh_pe_state_mark(struct eeh_pe *pe, int state);
+void eeh_pe_mark_isolated(struct eeh_pe *pe);
 void eeh_pe_state_clear(struct eeh_pe *pe, int state);
 void eeh_pe_state_mark_with_cfg(struct eeh_pe *pe, int state);
 void eeh_pe_dev_mode_mark(struct eeh_pe *pe, int mode);
diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
index dd5ac2ad141e..90e718f58676 100644
--- a/arch/powerpc/kernel/eeh.c
+++ b/arch/powerpc/kernel/eeh.c
@@ -404,7 +404,7 @@ static int eeh_phb_check_failure(struct eeh_pe *pe)
 	}
 
 	/* Isolate the PHB and send event */
-	eeh_pe_state_mark(phb_pe, EEH_PE_ISOLATED);
+	eeh_pe_mark_isolated(phb_pe);
 	eeh_serialize_unlock(flags);
 
 	pr_err("EEH: PHB#%x failure detected, location: %s\n",
@@ -563,7 +563,7 @@ int eeh_dev_check_failure(struct eeh_dev *edev)
 	 * with other functions on this device, and functions under
 	 * bridges.
 	 */
-	eeh_pe_state_mark(pe, EEH_PE_ISOLATED);
+	eeh_pe_mark_isolated(pe);
 	eeh_serialize_unlock(flags);
 
 	/* Most EEH events are due to device driver bugs.  Having
@@ -830,7 +830,7 @@ int pcibios_set_pcie_reset_state(struct pci_dev *dev, enum pcie_reset_state stat
 		eeh_pe_state_clear(pe, EEH_PE_ISOLATED);
 		break;
 	case pcie_hot_reset:
-		eeh_pe_state_mark(pe, EEH_PE_ISOLATED);
+		eeh_pe_mark_isolated(pe);
 		eeh_pe_state_clear(pe, EEH_PE_CFG_BLOCKED);
 		eeh_ops->set_option(pe, EEH_OPT_FREEZE_PE);
 		eeh_pe_dev_traverse(pe, eeh_disable_and_save_dev_state, dev);
@@ -839,7 +839,7 @@ int pcibios_set_pcie_reset_state(struct pci_dev *dev, enum pcie_reset_state stat
 		eeh_ops->reset(pe, EEH_RESET_HOT);
 		break;
 	case pcie_warm_reset:
-		eeh_pe_state_mark(pe, EEH_PE_ISOLATED);
+		eeh_pe_mark_isolated(pe);
 		eeh_pe_state_clear(pe, EEH_PE_CFG_BLOCKED);
 		eeh_ops->set_option(pe, EEH_OPT_FREEZE_PE);
 		eeh_pe_dev_traverse(pe, eeh_disable_and_save_dev_state, dev);
diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
index ffe8293d1f06..c827617613c1 100644
--- a/arch/powerpc/kernel/eeh_driver.c
+++ b/arch/powerpc/kernel/eeh_driver.c
@@ -1029,7 +1029,7 @@ void eeh_handle_special_event(void)
 				phb_pe = eeh_phb_pe_get(hose);
 				if (!phb_pe) continue;
 
-				eeh_pe_state_mark(phb_pe, EEH_PE_ISOLATED);
+				eeh_pe_mark_isolated(phb_pe);
 			}
 
 			eeh_serialize_unlock(flags);
@@ -1044,11 +1044,9 @@ void eeh_handle_special_event(void)
 			/* Purge all events of the PHB */
 			eeh_remove_event(pe, true);
 
-			if (rc == EEH_NEXT_ERR_DEAD_PHB)
-				eeh_pe_state_mark(pe, EEH_PE_ISOLATED);
-			else
-				eeh_pe_state_mark(pe,
-					EEH_PE_ISOLATED | EEH_PE_RECOVERING);
+			if (rc != EEH_NEXT_ERR_DEAD_PHB)
+				eeh_pe_state_mark(pe, EEH_PE_RECOVERING);
+			eeh_pe_mark_isolated(pe);
 
 			eeh_serialize_unlock(flags);
 
diff --git a/arch/powerpc/kernel/eeh_pe.c b/arch/powerpc/kernel/eeh_pe.c
index 2b376718237f..e43dcefbe73f 100644
--- a/arch/powerpc/kernel/eeh_pe.c
+++ b/arch/powerpc/kernel/eeh_pe.c
@@ -540,56 +540,50 @@ void eeh_pe_update_time_stamp(struct eeh_pe *pe)
 }
 
 /**
- * __eeh_pe_state_mark - Mark the state for the PE
- * @data: EEH PE
- * @flag: state
+ * eeh_pe_state_mark - Mark specified state for PE and its associated device
+ * @pe: EEH PE
  *
- * The function is used to mark the indicated state for the given
- * PE. Also, the associated PCI devices will be put into IO frozen
- * state as well.
+ * EEH error affects the current PE and its child PEs. The function
+ * is used to mark appropriate state for the affected PEs and the
+ * associated devices.
  */
-static void *__eeh_pe_state_mark(struct eeh_pe *pe, void *flag)
+void eeh_pe_state_mark(struct eeh_pe *root, int state)
 {
-	int state = *((int *)flag);
-	struct eeh_dev *edev, *tmp;
-	struct pci_dev *pdev;
-
-	/* Keep the state of permanently removed PE intact */
-	if (pe->state & EEH_PE_REMOVED)
-		return NULL;
-
-	pe->state |= state;
-
-	/* Offline PCI devices if applicable */
-	if (!(state & EEH_PE_ISOLATED))
-		return NULL;
-
-	eeh_pe_for_each_dev(pe, edev, tmp) {
-		pdev = eeh_dev_to_pci_dev(edev);
-		if (pdev)
-			pdev->error_state = pci_channel_io_frozen;
-	}
-
-	/* Block PCI config access if required */
-	if (pe->state & EEH_PE_CFG_RESTRICTED)
-		pe->state |= EEH_PE_CFG_BLOCKED;
+	struct eeh_pe *pe;
 
-	return NULL;
+	eeh_for_each_pe(root, pe)
+		if (!(pe->state & EEH_PE_REMOVED))
+			pe->state |= state;
 }
+EXPORT_SYMBOL_GPL(eeh_pe_state_mark);
 
 /**
- * eeh_pe_state_mark - Mark specified state for PE and its associated device
+ * eeh_pe_mark_isolated
  * @pe: EEH PE
  *
- * EEH error affects the current PE and its child PEs. The function
- * is used to mark appropriate state for the affected PEs and the
- * associated devices.
+ * Record that a PE has been isolated by marking the PE and it's children as
+ * EEH_PE_ISOLATED (and EEH_PE_CFG_BLOCKED, if required) and their PCI devices
+ * as pci_channel_io_frozen.
  */
-void eeh_pe_state_mark(struct eeh_pe *pe, int state)
+void eeh_pe_mark_isolated(struct eeh_pe *root)
 {
-	eeh_pe_traverse(pe, __eeh_pe_state_mark, &state);
+	struct eeh_pe *pe;
+	struct eeh_dev *edev;
+	struct pci_dev *pdev;
+
+	eeh_pe_state_mark(root, EEH_PE_ISOLATED);
+	eeh_for_each_pe(root, pe) {
+		list_for_each_entry(edev, &pe->edevs, entry) {
+			pdev = eeh_dev_to_pci_dev(edev);
+			if (pdev)
+				pdev->error_state = pci_channel_io_frozen;
+		}
+		/* Block PCI config access if required */
+		if (pe->state & EEH_PE_CFG_RESTRICTED)
+			pe->state |= EEH_PE_CFG_BLOCKED;
+	}
 }
-EXPORT_SYMBOL_GPL(eeh_pe_state_mark);
+EXPORT_SYMBOL_GPL(eeh_pe_mark_isolated);
 
 static void *__eeh_pe_dev_mode_mark(struct eeh_dev *edev, void *flag)
 {
diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
index a7e59dbf2696..fd1db9f286f1 100644
--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
@@ -590,7 +590,7 @@ static int pnv_eeh_get_phb_state(struct eeh_pe *pe)
 			  EEH_STATE_MMIO_ENABLED |
 			  EEH_STATE_DMA_ENABLED);
 	} else if (!(pe->state & EEH_PE_ISOLATED)) {
-		eeh_pe_state_mark(pe, EEH_PE_ISOLATED);
+		eeh_pe_mark_isolated(pe);
 		pnv_eeh_get_phb_diag(pe);
 
 		if (eeh_has_flag(EEH_EARLY_DUMP_LOG))
@@ -692,7 +692,7 @@ static int pnv_eeh_get_pe_state(struct eeh_pe *pe)
 		if (phb->freeze_pe)
 			phb->freeze_pe(phb, pe->addr);
 
-		eeh_pe_state_mark(pe, EEH_PE_ISOLATED);
+		eeh_pe_mark_isolated(pe);
 		pnv_eeh_get_phb_diag(pe);
 
 		if (eeh_has_flag(EEH_EARLY_DUMP_LOG))
@@ -1597,7 +1597,7 @@ static int pnv_eeh_next_error(struct eeh_pe **pe)
 		if ((ret == EEH_NEXT_ERR_FROZEN_PE  ||
 		    ret == EEH_NEXT_ERR_FENCED_PHB) &&
 		    !((*pe)->state & EEH_PE_ISOLATED)) {
-			eeh_pe_state_mark(*pe, EEH_PE_ISOLATED);
+			eeh_pe_mark_isolated(*pe);
 			pnv_eeh_get_phb_diag(*pe);
 
 			if (eeh_has_flag(EEH_EARLY_DUMP_LOG))
@@ -1626,7 +1626,7 @@ static int pnv_eeh_next_error(struct eeh_pe **pe)
 			}
 
 			/* We possibly migrate to another PE */
-			eeh_pe_state_mark(*pe, EEH_PE_ISOLATED);
+			eeh_pe_mark_isolated(*pe);
 		}
 
 		/*
diff --git a/drivers/pci/hotplug/pnv_php.c b/drivers/pci/hotplug/pnv_php.c
index 6c2e8d7307c6..9555338f39c5 100644
--- a/drivers/pci/hotplug/pnv_php.c
+++ b/drivers/pci/hotplug/pnv_php.c
@@ -738,7 +738,7 @@ static irqreturn_t pnv_php_interrupt(int irq, void *data)
 		pe = edev ? edev->pe : NULL;
 		if (pe) {
 			eeh_serialize_lock(&flags);
-			eeh_pe_state_mark(pe, EEH_PE_ISOLATED);
+			eeh_pe_mark_isolated(pe);
 			eeh_serialize_unlock(flags);
 			eeh_pe_set_option(pe, EEH_OPT_FREEZE_PE);
 		}
-- 
2.19.0.2.gcad72f5712

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 13/14] powerpc/eeh: Cleanup eeh_ops.wait_state()
  2018-09-12  1:23 [PATCH 00/14] EEH refactoring 3 Sam Bobroff
                   ` (11 preceding siblings ...)
  2018-09-12  1:23 ` [PATCH 12/14] powerpc/eeh: Cleanup eeh_pe_state_mark() Sam Bobroff
@ 2018-09-12  1:23 ` Sam Bobroff
  2018-09-12  1:23 ` [PATCH 14/14] powerpc/eeh: Cleanup control flow in eeh_handle_normal_event() Sam Bobroff
  13 siblings, 0 replies; 16+ messages in thread
From: Sam Bobroff @ 2018-09-12  1:23 UTC (permalink / raw)
  To: linuxppc-dev

The wait_state member of eeh_ops does not need to be platform
dependent; it's just logic around eeh_ops.get_state(). Therefore,
merge the two (slightly different!) platform versions into a new
function, eeh_wait_state() and remove the eeh_ops member.

While doing this, also correct:
* The wait logic, so that it never waits longer than max_wait.
* The wait logic, so that it never waits less than
  EEH_STATE_MIN_WAIT_TIME.
* One call site where the result is treated like a bit field before
  it's checked for negative error values.
* In pseries_eeh_get_state(), rename the "state" parameter to "delay"
  because that's what it is.

Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
---
 arch/powerpc/include/asm/eeh.h               |  4 +-
 arch/powerpc/kernel/eeh.c                    |  9 ++-
 arch/powerpc/kernel/eeh_driver.c             |  2 +-
 arch/powerpc/kernel/eeh_pe.c                 | 51 +++++++++++++++
 arch/powerpc/platforms/powernv/eeh-powernv.c | 38 -----------
 arch/powerpc/platforms/pseries/eeh_pseries.c | 66 ++------------------
 6 files changed, 62 insertions(+), 108 deletions(-)

diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
index 247f09ce44de..8b596d096ebe 100644
--- a/arch/powerpc/include/asm/eeh.h
+++ b/arch/powerpc/include/asm/eeh.h
@@ -205,9 +205,8 @@ struct eeh_ops {
 	void* (*probe)(struct pci_dn *pdn, void *data);
 	int (*set_option)(struct eeh_pe *pe, int option);
 	int (*get_pe_addr)(struct eeh_pe *pe);
-	int (*get_state)(struct eeh_pe *pe, int *state);
+	int (*get_state)(struct eeh_pe *pe, int *delay);
 	int (*reset)(struct eeh_pe *pe, int option);
-	int (*wait_state)(struct eeh_pe *pe, int max_wait);
 	int (*get_log)(struct eeh_pe *pe, int severity, char *drv_log, unsigned long len);
 	int (*configure_bridge)(struct eeh_pe *pe);
 	int (*err_inject)(struct eeh_pe *pe, int type, int func,
@@ -264,6 +263,7 @@ typedef void *(*eeh_edev_traverse_func)(struct eeh_dev *edev, void *flag);
 typedef void *(*eeh_pe_traverse_func)(struct eeh_pe *pe, void *flag);
 void eeh_set_pe_aux_size(int size);
 int eeh_phb_pe_create(struct pci_controller *phb);
+int eeh_wait_state(struct eeh_pe *pe, int max_wait);
 struct eeh_pe *eeh_phb_pe_get(struct pci_controller *phb);
 struct eeh_pe *eeh_pe_next(struct eeh_pe *pe, struct eeh_pe *root);
 struct eeh_pe *eeh_pe_get(struct pci_controller *phb,
diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
index 90e718f58676..c1c6ae18cfed 100644
--- a/arch/powerpc/kernel/eeh.c
+++ b/arch/powerpc/kernel/eeh.c
@@ -681,7 +681,7 @@ int eeh_pci_enable(struct eeh_pe *pe, int function)
 
 	/* Check if the request is finished successfully */
 	if (active_flag) {
-		rc = eeh_ops->wait_state(pe, PCI_BUS_RESET_WAIT_MSEC);
+		rc = eeh_wait_state(pe, PCI_BUS_RESET_WAIT_MSEC);
 		if (rc < 0)
 			return rc;
 
@@ -920,16 +920,15 @@ int eeh_pe_reset_full(struct eeh_pe *pe)
 			break;
 
 		/* Wait until the PE is in a functioning state */
-		state = eeh_ops->wait_state(pe, PCI_BUS_RESET_WAIT_MSEC);
-		if (eeh_state_active(state))
-			break;
-
+		state = eeh_wait_state(pe, PCI_BUS_RESET_WAIT_MSEC);
 		if (state < 0) {
 			pr_warn("%s: Unrecoverable slot failure on PHB#%x-PE#%x",
 				__func__, pe->phb->global_number, pe->addr);
 			ret = -ENOTRECOVERABLE;
 			break;
 		}
+		if (eeh_state_active(state))
+			break;
 
 		/* Set error in case this is our last attempt */
 		ret = -EIO;
diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
index c827617613c1..e7f757cd839b 100644
--- a/arch/powerpc/kernel/eeh_driver.c
+++ b/arch/powerpc/kernel/eeh_driver.c
@@ -836,7 +836,7 @@ void eeh_handle_normal_event(struct eeh_pe *pe)
 	/* Get the current PCI slot state. This can take a long time,
 	 * sometimes over 300 seconds for certain systems.
 	 */
-	rc = eeh_ops->wait_state(pe, MAX_WAIT_FOR_RECOVERY*1000);
+	rc = eeh_wait_state(pe, MAX_WAIT_FOR_RECOVERY*1000);
 	if (rc < 0 || rc == EEH_STATE_NOT_SUPPORT) {
 		pr_warn("EEH: Permanent failure\n");
 		goto hard_fail;
diff --git a/arch/powerpc/kernel/eeh_pe.c b/arch/powerpc/kernel/eeh_pe.c
index e43dcefbe73f..6fa2032e0594 100644
--- a/arch/powerpc/kernel/eeh_pe.c
+++ b/arch/powerpc/kernel/eeh_pe.c
@@ -108,6 +108,57 @@ int eeh_phb_pe_create(struct pci_controller *phb)
 	return 0;
 }
 
+/**
+ * eeh_wait_state - Wait for PE state
+ * @pe: EEH PE
+ * @max_wait: maximal period in millisecond
+ *
+ * Wait for the state of associated PE. It might take some time
+ * to retrieve the PE's state.
+ */
+int eeh_wait_state(struct eeh_pe *pe, int max_wait)
+{
+	int ret;
+	int mwait;
+
+	/*
+	 * According to PAPR, the state of PE might be temporarily
+	 * unavailable. Under the circumstance, we have to wait
+	 * for indicated time determined by firmware. The maximal
+	 * wait time is 5 minutes, which is acquired from the original
+	 * EEH implementation. Also, the original implementation
+	 * also defined the minimal wait time as 1 second.
+	 */
+#define EEH_STATE_MIN_WAIT_TIME	(1000)
+#define EEH_STATE_MAX_WAIT_TIME	(300 * 1000)
+
+	while (1) {
+		ret = eeh_ops->get_state(pe, &mwait);
+
+		if (ret != EEH_STATE_UNAVAILABLE)
+			return ret;
+
+		if (max_wait <= 0) {
+			pr_warn("%s: Timeout when getting PE's state (%d)\n",
+				__func__, max_wait);
+			return EEH_STATE_NOT_SUPPORT;
+		}
+
+		if (mwait < EEH_STATE_MIN_WAIT_TIME) {
+			pr_warn("%s: Firmware returned bad wait value %d\n",
+				__func__, mwait);
+			mwait = EEH_STATE_MIN_WAIT_TIME;
+		} else if (mwait > EEH_STATE_MAX_WAIT_TIME) {
+			pr_warn("%s: Firmware returned too long wait value %d\n",
+				__func__, mwait);
+			mwait = EEH_STATE_MAX_WAIT_TIME;
+		}
+
+		msleep(min(mwait, max_wait));
+		max_wait -= mwait;
+	}
+}
+
 /**
  * eeh_phb_pe_get - Retrieve PHB PE based on the given PHB
  * @phb: PCI controller
diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
index fd1db9f286f1..abc0be7507c8 100644
--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
@@ -1133,43 +1133,6 @@ static int pnv_eeh_reset(struct eeh_pe *pe, int option)
 	return pnv_eeh_bridge_reset(bus->self, option);
 }
 
-/**
- * pnv_eeh_wait_state - Wait for PE state
- * @pe: EEH PE
- * @max_wait: maximal period in millisecond
- *
- * Wait for the state of associated PE. It might take some time
- * to retrieve the PE's state.
- */
-static int pnv_eeh_wait_state(struct eeh_pe *pe, int max_wait)
-{
-	int ret;
-	int mwait;
-
-	while (1) {
-		ret = pnv_eeh_get_state(pe, &mwait);
-
-		/*
-		 * If the PE's state is temporarily unavailable,
-		 * we have to wait for the specified time. Otherwise,
-		 * the PE's state will be returned immediately.
-		 */
-		if (ret != EEH_STATE_UNAVAILABLE)
-			return ret;
-
-		if (max_wait <= 0) {
-			pr_warn("%s: Timeout getting PE#%x's state (%d)\n",
-				__func__, pe->addr, max_wait);
-			return EEH_STATE_NOT_SUPPORT;
-		}
-
-		max_wait -= mwait;
-		msleep(mwait);
-	}
-
-	return EEH_STATE_NOT_SUPPORT;
-}
-
 /**
  * pnv_eeh_get_log - Retrieve error log
  * @pe: EEH PE
@@ -1688,7 +1651,6 @@ static struct eeh_ops pnv_eeh_ops = {
 	.get_pe_addr            = pnv_eeh_get_pe_addr,
 	.get_state              = pnv_eeh_get_state,
 	.reset                  = pnv_eeh_reset,
-	.wait_state             = pnv_eeh_wait_state,
 	.get_log                = pnv_eeh_get_log,
 	.configure_bridge       = pnv_eeh_configure_bridge,
 	.err_inject		= pnv_eeh_err_inject,
diff --git a/arch/powerpc/platforms/pseries/eeh_pseries.c b/arch/powerpc/platforms/pseries/eeh_pseries.c
index 823cb27efa8b..c9e5ca4afb26 100644
--- a/arch/powerpc/platforms/pseries/eeh_pseries.c
+++ b/arch/powerpc/platforms/pseries/eeh_pseries.c
@@ -438,7 +438,7 @@ static int pseries_eeh_get_pe_addr(struct eeh_pe *pe)
 /**
  * pseries_eeh_get_state - Retrieve PE state
  * @pe: EEH PE
- * @state: return value
+ * @delay: suggested time to wait if state is unavailable
  *
  * Retrieve the state of the specified PE. On RTAS compliant
  * pseries platform, there already has one dedicated RTAS function
@@ -448,7 +448,7 @@ static int pseries_eeh_get_pe_addr(struct eeh_pe *pe)
  * RTAS calls for the purpose, we need to try the new one and back
  * to the old one if the new one couldn't work properly.
  */
-static int pseries_eeh_get_state(struct eeh_pe *pe, int *state)
+static int pseries_eeh_get_state(struct eeh_pe *pe, int *delay)
 {
 	int config_addr;
 	int ret;
@@ -499,7 +499,8 @@ static int pseries_eeh_get_state(struct eeh_pe *pe, int *state)
 		break;
 	case 5:
 		if (rets[2]) {
-			if (state) *state = rets[2];
+			if (delay)
+				*delay = rets[2];
 			result = EEH_STATE_UNAVAILABLE;
 		} else {
 			result = EEH_STATE_NOT_SUPPORT;
@@ -553,64 +554,6 @@ static int pseries_eeh_reset(struct eeh_pe *pe, int option)
 	return ret;
 }
 
-/**
- * pseries_eeh_wait_state - Wait for PE state
- * @pe: EEH PE
- * @max_wait: maximal period in millisecond
- *
- * Wait for the state of associated PE. It might take some time
- * to retrieve the PE's state.
- */
-static int pseries_eeh_wait_state(struct eeh_pe *pe, int max_wait)
-{
-	int ret;
-	int mwait;
-
-	/*
-	 * According to PAPR, the state of PE might be temporarily
-	 * unavailable. Under the circumstance, we have to wait
-	 * for indicated time determined by firmware. The maximal
-	 * wait time is 5 minutes, which is acquired from the original
-	 * EEH implementation. Also, the original implementation
-	 * also defined the minimal wait time as 1 second.
-	 */
-#define EEH_STATE_MIN_WAIT_TIME	(1000)
-#define EEH_STATE_MAX_WAIT_TIME	(300 * 1000)
-
-	while (1) {
-		ret = pseries_eeh_get_state(pe, &mwait);
-
-		/*
-		 * If the PE's state is temporarily unavailable,
-		 * we have to wait for the specified time. Otherwise,
-		 * the PE's state will be returned immediately.
-		 */
-		if (ret != EEH_STATE_UNAVAILABLE)
-			return ret;
-
-		if (max_wait <= 0) {
-			pr_warn("%s: Timeout when getting PE's state (%d)\n",
-				__func__, max_wait);
-			return EEH_STATE_NOT_SUPPORT;
-		}
-
-		if (mwait <= 0) {
-			pr_warn("%s: Firmware returned bad wait value %d\n",
-				__func__, mwait);
-			mwait = EEH_STATE_MIN_WAIT_TIME;
-		} else if (mwait > EEH_STATE_MAX_WAIT_TIME) {
-			pr_warn("%s: Firmware returned too long wait value %d\n",
-				__func__, mwait);
-			mwait = EEH_STATE_MAX_WAIT_TIME;
-		}
-
-		max_wait -= mwait;
-		msleep(mwait);
-	}
-
-	return EEH_STATE_NOT_SUPPORT;
-}
-
 /**
  * pseries_eeh_get_log - Retrieve error log
  * @pe: EEH PE
@@ -849,7 +792,6 @@ static struct eeh_ops pseries_eeh_ops = {
 	.get_pe_addr		= pseries_eeh_get_pe_addr,
 	.get_state		= pseries_eeh_get_state,
 	.reset			= pseries_eeh_reset,
-	.wait_state		= pseries_eeh_wait_state,
 	.get_log		= pseries_eeh_get_log,
 	.configure_bridge       = pseries_eeh_configure_bridge,
 	.err_inject		= NULL,
-- 
2.19.0.2.gcad72f5712

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 14/14] powerpc/eeh: Cleanup control flow in eeh_handle_normal_event()
  2018-09-12  1:23 [PATCH 00/14] EEH refactoring 3 Sam Bobroff
                   ` (12 preceding siblings ...)
  2018-09-12  1:23 ` [PATCH 13/14] powerpc/eeh: Cleanup eeh_ops.wait_state() Sam Bobroff
@ 2018-09-12  1:23 ` Sam Bobroff
  13 siblings, 0 replies; 16+ messages in thread
From: Sam Bobroff @ 2018-09-12  1:23 UTC (permalink / raw)
  To: linuxppc-dev

Rather than mixing "if (state)" blocks and gotos, convert entirely to
"if (state)" blocks to make the state machine behaviour clearer.

Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
---
 arch/powerpc/kernel/eeh_driver.c | 196 +++++++++++++++----------------
 1 file changed, 94 insertions(+), 102 deletions(-)

diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
index e7f757cd839b..9446248eb6b8 100644
--- a/arch/powerpc/kernel/eeh_driver.c
+++ b/arch/powerpc/kernel/eeh_driver.c
@@ -808,10 +808,8 @@ void eeh_handle_normal_event(struct eeh_pe *pe)
 		pr_err("EEH: PHB#%x-PE#%x has failed %d times in the last hour and has been permanently disabled.\n",
 		       pe->phb->global_number, pe->addr,
 		       pe->freeze_count);
-		goto hard_fail;
+		result = PCI_ERS_RESULT_DISCONNECT;
 	}
-	pr_warn("EEH: This PCI device has failed %d times in the last hour and will be permanently disabled after %d failures.\n",
-		pe->freeze_count, eeh_max_freezes);
 
 	/* Walk the various device drivers attached to this slot through
 	 * a reset sequence, giving each an opportunity to do what it needs
@@ -823,31 +821,39 @@ void eeh_handle_normal_event(struct eeh_pe *pe)
 	 * the error. Override the result if necessary to have partially
 	 * hotplug for this case.
 	 */
-	pr_info("EEH: Notify device drivers to shutdown\n");
-	eeh_set_channel_state(pe, pci_channel_io_frozen);
-	eeh_set_irq_state(pe, false);
-	eeh_pe_report("error_detected(IO frozen)", pe, eeh_report_error,
-		      &result);
-	if ((pe->type & EEH_PE_PHB) &&
-	    result != PCI_ERS_RESULT_NONE &&
-	    result != PCI_ERS_RESULT_NEED_RESET)
-		result = PCI_ERS_RESULT_NEED_RESET;
+	if (result != PCI_ERS_RESULT_DISCONNECT) {
+		pr_warn("EEH: This PCI device has failed %d times in the last hour and will be permanently disabled after %d failures.\n",
+			pe->freeze_count, eeh_max_freezes);
+		pr_info("EEH: Notify device drivers to shutdown\n");
+		eeh_set_channel_state(pe, pci_channel_io_frozen);
+		eeh_set_irq_state(pe, false);
+		eeh_pe_report("error_detected(IO frozen)", pe,
+			      eeh_report_error, &result);
+		if ((pe->type & EEH_PE_PHB) &&
+		    result != PCI_ERS_RESULT_NONE &&
+		    result != PCI_ERS_RESULT_NEED_RESET)
+			result = PCI_ERS_RESULT_NEED_RESET;
+	}
 
 	/* Get the current PCI slot state. This can take a long time,
 	 * sometimes over 300 seconds for certain systems.
 	 */
-	rc = eeh_wait_state(pe, MAX_WAIT_FOR_RECOVERY*1000);
-	if (rc < 0 || rc == EEH_STATE_NOT_SUPPORT) {
-		pr_warn("EEH: Permanent failure\n");
-		goto hard_fail;
+	if (result != PCI_ERS_RESULT_DISCONNECT) {
+		rc = eeh_wait_state(pe, MAX_WAIT_FOR_RECOVERY*1000);
+		if (rc < 0 || rc == EEH_STATE_NOT_SUPPORT) {
+			pr_warn("EEH: Permanent failure\n");
+			result = PCI_ERS_RESULT_DISCONNECT;
+		}
 	}
 
 	/* Since rtas may enable MMIO when posting the error log,
 	 * don't post the error log until after all dev drivers
 	 * have been informed.
 	 */
-	pr_info("EEH: Collect temporary log\n");
-	eeh_slot_error_detail(pe, EEH_LOG_TEMP);
+	if (result != PCI_ERS_RESULT_DISCONNECT) {
+		pr_info("EEH: Collect temporary log\n");
+		eeh_slot_error_detail(pe, EEH_LOG_TEMP);
+	}
 
 	/* If all device drivers were EEH-unaware, then shut
 	 * down all of the device drivers, and hope they
@@ -859,7 +865,7 @@ void eeh_handle_normal_event(struct eeh_pe *pe)
 		if (rc) {
 			pr_warn("%s: Unable to reset, err=%d\n",
 				__func__, rc);
-			goto hard_fail;
+			result = PCI_ERS_RESULT_DISCONNECT;
 		}
 	}
 
@@ -868,9 +874,9 @@ void eeh_handle_normal_event(struct eeh_pe *pe)
 		pr_info("EEH: Enable I/O for affected devices\n");
 		rc = eeh_pci_enable(pe, EEH_OPT_THAW_MMIO);
 
-		if (rc < 0)
-			goto hard_fail;
-		if (rc) {
+		if (rc < 0) {
+			result = PCI_ERS_RESULT_DISCONNECT;
+		} else if (rc) {
 			result = PCI_ERS_RESULT_NEED_RESET;
 		} else {
 			pr_info("EEH: Notify device drivers to resume I/O\n");
@@ -884,9 +890,9 @@ void eeh_handle_normal_event(struct eeh_pe *pe)
 		pr_info("EEH: Enabled DMA for affected devices\n");
 		rc = eeh_pci_enable(pe, EEH_OPT_THAW_DMA);
 
-		if (rc < 0)
-			goto hard_fail;
-		if (rc) {
+		if (rc < 0) {
+			result = PCI_ERS_RESULT_DISCONNECT;
+		} else if (rc) {
 			result = PCI_ERS_RESULT_NEED_RESET;
 		} else {
 			/*
@@ -899,12 +905,6 @@ void eeh_handle_normal_event(struct eeh_pe *pe)
 		}
 	}
 
-	/* If any device has a hard failure, then shut off everything. */
-	if (result == PCI_ERS_RESULT_DISCONNECT) {
-		pr_warn("EEH: Device driver gave up\n");
-		goto hard_fail;
-	}
-
 	/* If any device called out for a reset, then reset the slot */
 	if (result == PCI_ERS_RESULT_NEED_RESET) {
 		pr_info("EEH: Reset without hotplug activity\n");
@@ -912,89 +912,81 @@ void eeh_handle_normal_event(struct eeh_pe *pe)
 		if (rc) {
 			pr_warn("%s: Cannot reset, err=%d\n",
 				__func__, rc);
-			goto hard_fail;
+			result = PCI_ERS_RESULT_DISCONNECT;
+		} else {
+			result = PCI_ERS_RESULT_NONE;
+			eeh_set_channel_state(pe, pci_channel_io_normal);
+			eeh_set_irq_state(pe, true);
+			eeh_pe_report("slot_reset", pe, eeh_report_reset,
+				      &result);
 		}
-
-		pr_info("EEH: Notify device drivers "
-			"the completion of reset\n");
-		result = PCI_ERS_RESULT_NONE;
-		eeh_set_channel_state(pe, pci_channel_io_normal);
-		eeh_set_irq_state(pe, true);
-		eeh_pe_report("slot_reset", pe, eeh_report_reset, &result);
 	}
 
-	/* All devices should claim they have recovered by now. */
-	if ((result != PCI_ERS_RESULT_RECOVERED) &&
-	    (result != PCI_ERS_RESULT_NONE)) {
-		pr_warn("EEH: Not recovered\n");
-		goto hard_fail;
-	}
-
-	/*
-	 * For those hot removed VFs, we should add back them after PF get
-	 * recovered properly.
-	 */
-	list_for_each_entry_safe(edev, tmp, &rmv_data.removed_vf_list,
-				 rmv_entry) {
-		eeh_add_virt_device(edev);
-		list_del(&edev->rmv_entry);
-	}
-
-	/* Tell all device drivers that they can resume operations */
-	pr_info("EEH: Notify device driver to resume\n");
-	eeh_set_channel_state(pe, pci_channel_io_normal);
-	eeh_set_irq_state(pe, true);
-	eeh_pe_report("resume", pe, eeh_report_resume, NULL);
-	eeh_for_each_pe(pe, tmp_pe) {
-		eeh_pe_for_each_dev(tmp_pe, edev, tmp) {
-			edev->mode &= ~EEH_DEV_NO_HANDLER;
-			edev->in_error = false;
+	if ((result == PCI_ERS_RESULT_RECOVERED) ||
+	    (result == PCI_ERS_RESULT_NONE)) {
+		/*
+		 * For those hot removed VFs, we should add back them after PF
+		 * get recovered properly.
+		 */
+		list_for_each_entry_safe(edev, tmp, &rmv_data.removed_vf_list,
+					 rmv_entry) {
+			eeh_add_virt_device(edev);
+			list_del(&edev->rmv_entry);
 		}
-	}
 
-	pr_info("EEH: Recovery successful.\n");
-	goto final;
+		/* Tell all device drivers that they can resume operations */
+		pr_info("EEH: Notify device driver to resume\n");
+		eeh_set_channel_state(pe, pci_channel_io_normal);
+		eeh_set_irq_state(pe, true);
+		eeh_pe_report("resume", pe, eeh_report_resume, NULL);
+		eeh_for_each_pe(pe, tmp_pe) {
+			eeh_pe_for_each_dev(tmp_pe, edev, tmp) {
+				edev->mode &= ~EEH_DEV_NO_HANDLER;
+				edev->in_error = false;
+			}
+		}
 
-hard_fail:
-	/*
-	 * About 90% of all real-life EEH failures in the field
-	 * are due to poorly seated PCI cards. Only 10% or so are
-	 * due to actual, failed cards.
-	 */
-	pr_err("EEH: Unable to recover from failure from PHB#%x-PE#%x.\n"
-	       "Please try reseating or replacing it\n",
-		pe->phb->global_number, pe->addr);
+		pr_info("EEH: Recovery successful.\n");
+	} else  {
+		/*
+		 * About 90% of all real-life EEH failures in the field
+		 * are due to poorly seated PCI cards. Only 10% or so are
+		 * due to actual, failed cards.
+		 */
+		pr_err("EEH: Unable to recover from failure from PHB#%x-PE#%x.\n"
+		       "Please try reseating or replacing it\n",
+			pe->phb->global_number, pe->addr);
 
-	eeh_slot_error_detail(pe, EEH_LOG_PERM);
+		eeh_slot_error_detail(pe, EEH_LOG_PERM);
 
-	/* Notify all devices that they're about to go down. */
-	eeh_set_channel_state(pe, pci_channel_io_perm_failure);
-	eeh_set_irq_state(pe, false);
-	eeh_pe_report("error_detected(permanent failure)", pe,
-		      eeh_report_failure, NULL);
+		/* Notify all devices that they're about to go down. */
+		eeh_set_channel_state(pe, pci_channel_io_perm_failure);
+		eeh_set_irq_state(pe, false);
+		eeh_pe_report("error_detected(permanent failure)", pe,
+			      eeh_report_failure, NULL);
 
-	/* Mark the PE to be removed permanently */
-	eeh_pe_state_mark(pe, EEH_PE_REMOVED);
+		/* Mark the PE to be removed permanently */
+		eeh_pe_state_mark(pe, EEH_PE_REMOVED);
 
-	/*
-	 * Shut down the device drivers for good. We mark
-	 * all removed devices correctly to avoid access
-	 * the their PCI config any more.
-	 */
-	if (pe->type & EEH_PE_VF) {
-		eeh_pe_dev_traverse(pe, eeh_rmv_device, NULL);
-		eeh_pe_dev_mode_mark(pe, EEH_DEV_REMOVED);
-	} else {
-		eeh_pe_state_clear(pe, EEH_PE_PRI_BUS);
-		eeh_pe_dev_mode_mark(pe, EEH_DEV_REMOVED);
+		/*
+		 * Shut down the device drivers for good. We mark
+		 * all removed devices correctly to avoid access
+		 * the their PCI config any more.
+		 */
+		if (pe->type & EEH_PE_VF) {
+			eeh_pe_dev_traverse(pe, eeh_rmv_device, NULL);
+			eeh_pe_dev_mode_mark(pe, EEH_DEV_REMOVED);
+		} else {
+			eeh_pe_state_clear(pe, EEH_PE_PRI_BUS);
+			eeh_pe_dev_mode_mark(pe, EEH_DEV_REMOVED);
 
-		pci_lock_rescan_remove();
-		pci_hp_remove_devices(bus);
-		pci_unlock_rescan_remove();
-		/* The passed PE should no longer be used */
-		return;
+			pci_lock_rescan_remove();
+			pci_hp_remove_devices(bus);
+			pci_unlock_rescan_remove();
+			/* The passed PE should no longer be used */
+			return;
+		}
 	}
-final:
 	eeh_pe_state_clear(pe, EEH_PE_RECOVERING);
 }
 
-- 
2.19.0.2.gcad72f5712

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [01/14] powerpc/eeh: Fix possible null deref in eeh_dump_dev_log()
  2018-09-12  1:23 ` [PATCH 01/14] powerpc/eeh: Fix possible null deref in eeh_dump_dev_log() Sam Bobroff
@ 2018-10-15  4:00   ` Michael Ellerman
  0 siblings, 0 replies; 16+ messages in thread
From: Michael Ellerman @ 2018-10-15  4:00 UTC (permalink / raw)
  To: Sam Bobroff, linuxppc-dev

On Wed, 2018-09-12 at 01:23:20 UTC, Sam Bobroff wrote:
> If an error occurs during an unplug operation, it's possible for
> eeh_dump_dev_log() to be called when edev->pdn is null, which
> currently leads to dereferencing a null pointer.
> 
> Handle this by skipping the error log for those devices.
> 
> Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>

Series applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/f9bc28aedfb5bbd572d2d365f3095c

cheers

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2018-10-15  4:12 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-09-12  1:23 [PATCH 00/14] EEH refactoring 3 Sam Bobroff
2018-09-12  1:23 ` [PATCH 01/14] powerpc/eeh: Fix possible null deref in eeh_dump_dev_log() Sam Bobroff
2018-10-15  4:00   ` [01/14] " Michael Ellerman
2018-09-12  1:23 ` [PATCH 02/14] powerpc/eeh: Fix null deref for devices removed during EEH Sam Bobroff
2018-09-12  1:23 ` [PATCH 03/14] powerpc/eeh: Fix use of EEH_PE_KEEP on wrong field Sam Bobroff
2018-09-12  1:23 ` [PATCH 04/14] powerpc/eeh: Cleanup EEH_POSTPONED_PROBE Sam Bobroff
2018-09-12  1:23 ` [PATCH 05/14] powerpc/eeh: Cleanup unused field in eeh_dev Sam Bobroff
2018-09-12  1:23 ` [PATCH 06/14] powerpc/eeh: Cleanup eeh_add_virt_device() Sam Bobroff
2018-09-12  1:23 ` [PATCH 07/14] powerpc/eeh: Cleanup list_head field names Sam Bobroff
2018-09-12  1:23 ` [PATCH 08/14] powerpc/eeh: Cleanup field names in eeh_rmv_data Sam Bobroff
2018-09-12  1:23 ` [PATCH 09/14] powerpc/eeh: Cleanup logic in eeh_rmv_from_parent_pe() Sam Bobroff
2018-09-12  1:23 ` [PATCH 10/14] powerpc/eeh: Cleanup eeh_enabled() Sam Bobroff
2018-09-12  1:23 ` [PATCH 11/14] powerpc/eeh: Cleanup unnecessary eeh_pe_state_mark_with_cfg() Sam Bobroff
2018-09-12  1:23 ` [PATCH 12/14] powerpc/eeh: Cleanup eeh_pe_state_mark() Sam Bobroff
2018-09-12  1:23 ` [PATCH 13/14] powerpc/eeh: Cleanup eeh_ops.wait_state() Sam Bobroff
2018-09-12  1:23 ` [PATCH 14/14] powerpc/eeh: Cleanup control flow in eeh_handle_normal_event() Sam Bobroff

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.