* [PATCH v3 0/7] Fix issues and cleanup for ERR_FATAL and ERR_NONFATAL
@ 2018-07-18 19:44 Bjorn Helgaas
  2018-07-18 19:44 ` [PATCH v3 1/7] PCI/AER: Clear only ERR_FATAL status bits during fatal recovery Bjorn Helgaas
                   ` (8 more replies)
  0 siblings, 9 replies; 11+ messages in thread
From: Bjorn Helgaas @ 2018-07-18 19:44 UTC (permalink / raw)
  To: Oza Pawandeep
  Cc: Philippe Ombredanne, Thomas Gleixner, Greg Kroah-Hartman,
	Kate Stewart, Dongdong Liu, Keith Busch, Wei Zhang, Sinan Kaya,
	Timur Tabi, linux-pci, linux-kernel

This is a v3 of Oza's patches [1].  It's available at [2] if you prefer
git.

v3 changes:
  - Add pci_aer_clear_fatal_status() to clear ERR_FATAL bits, only called
    from pcie_do_fatal_recovery().  Moved to first in series to avoid a
    window where ERR_FATAL recovery only clears ERR_NONFATAL bits.  Visible
    only inside the PCI core.
  - Instead of having pci_cleanup_aer_uncorrect_error_status() do different
    things based on dev->error_state, use this only for ERR_NONFATAL bits.
    I didn't change the name because it's used by many drivers.
  - Rename pci_cleanup_aer_error_device_status() to
    pci_aer_clear_device_status(), make it void, and make it visible only
    inside the PCI core.
  - Remove pcie_portdrv_err_handler.slot_reset altogether instead of making
    it a stub function.  Possibly pcie_portdrv_err_handler could be removed
    completely?

[1] https://lkml.kernel.org/r/1529661494-20936-1-git-send-email-poza@codeaurora.org
[2] https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git/?h=pci/06-22-oza-aer

---

Bjorn Helgaas (1):
      PCI/AER: Clear only ERR_FATAL status bits during fatal recovery

Oza Pawandeep (6):
      PCI/AER: Clear only ERR_NONFATAL bits during non-fatal recovery
      PCI/AER: Factor out ERR_NONFATAL status bit clearing
      PCI/AER: Remove ERR_FATAL code from ERR_NONFATAL path
      PCI/AER: Clear device status bits during ERR_FATAL and ERR_NONFATAL
      PCI/AER: Clear device status bits during ERR_COR handling
      PCI/portdrv: Remove pcie_portdrv_err_handler.slot_reset


 drivers/pci/pci.h              |    5 ++++
 drivers/pci/pcie/aer.c         |   47 +++++++++++++++++++++++++++-------------
 drivers/pci/pcie/err.c         |   15 +++++--------
 drivers/pci/pcie/portdrv_pci.c |   25 ---------------------
 4 files changed, 43 insertions(+), 49 deletions(-)

* [PATCH v3 1/7] PCI/AER: Clear only ERR_FATAL status bits during fatal recovery
  2018-07-18 19:44 [PATCH v3 0/7] Fix issues and cleanup for ERR_FATAL and ERR_NONFATAL Bjorn Helgaas
@ 2018-07-18 19:44 ` Bjorn Helgaas
  2018-07-18 19:44 ` [PATCH v3 2/7] PCI/AER: Clear only ERR_NONFATAL bits during non-fatal recovery Bjorn Helgaas
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 11+ messages in thread
From: Bjorn Helgaas @ 2018-07-18 19:44 UTC (permalink / raw)
  To: Oza Pawandeep
  Cc: Philippe Ombredanne, Thomas Gleixner, Greg Kroah-Hartman,
	Kate Stewart, Dongdong Liu, Keith Busch, Wei Zhang, Sinan Kaya,
	Timur Tabi, linux-pci, linux-kernel

From: Bjorn Helgaas <bhelgaas@google.com>

During recovery from fatal errors, we previously called
pci_cleanup_aer_uncorrect_error_status(), which cleared *all* uncorrectable
error status bits (both ERR_FATAL and ERR_NONFATAL).

Instead, call a new pci_aer_clear_fatal_status() that clears only the
ERR_FATAL bits (as indicated by the PCI_ERR_UNCOR_SEVER register).
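
The masking can be sketched in user space as follows.  This is only an
illustration, not the kernel code: struct fake_aer and the fake_*() helpers
are hypothetical stand-ins for config space, and the model assumes the
Uncorrectable Error Status register is write-1-to-clear.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the two AER registers involved. */
struct fake_aer {
	uint32_t uncor_status;	/* models PCI_ERR_UNCOR_STATUS (write 1 to clear) */
	uint32_t uncor_sever;	/* models PCI_ERR_UNCOR_SEVER (1 = ERR_FATAL) */
};

/* Writing a 1 to a status bit clears it; writing 0 leaves it alone. */
static void fake_write_status(struct fake_aer *aer, uint32_t val)
{
	aer->uncor_status &= ~val;
}

/* Same masking as pci_aer_clear_fatal_status(): keep only fatal bits. */
void fake_clear_fatal_status(struct fake_aer *aer)
{
	uint32_t status = aer->uncor_status & aer->uncor_sever;

	if (status)
		fake_write_status(aer, status);
}

/* Example: bit 4 is configured fatal, bit 12 non-fatal; both are logged. */
void fake_clear_fatal_demo(void)
{
	struct fake_aer aer = {
		.uncor_status = (1u << 4) | (1u << 12),
		.uncor_sever  = (1u << 4),
	};

	fake_clear_fatal_status(&aer);
	assert(aer.uncor_status == (1u << 12));	/* non-fatal bit survives */
}
```

In this model the non-fatal bit stays logged after the fatal path runs, so
it is still available to whatever handles the ERR_NONFATAL case.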

Based-on-patch-by: Oza Pawandeep <poza@codeaurora.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
---
 drivers/pci/pci.h      |    4 ++++
 drivers/pci/pcie/aer.c |   17 +++++++++++++++++
 drivers/pci/pcie/err.c |    2 +-
 3 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index c358e7a07f3f..12fd2ac95843 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -452,4 +452,8 @@ static inline int devm_of_pci_get_host_bridge_resources(struct device *dev,
 }
 #endif
 
+#ifdef CONFIG_PCIEAER
+void pci_aer_clear_fatal_status(struct pci_dev *dev);
+#endif
+
 #endif /* DRIVERS_PCI_H */
diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index a2e88386af28..5b4a84e3d360 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -374,6 +374,23 @@ int pci_cleanup_aer_uncorrect_error_status(struct pci_dev *dev)
 }
 EXPORT_SYMBOL_GPL(pci_cleanup_aer_uncorrect_error_status);
 
+void pci_aer_clear_fatal_status(struct pci_dev *dev)
+{
+	int pos;
+	u32 status, sev;
+
+	pos = dev->aer_cap;
+	if (!pos)
+		return;
+
+	/* Clear status bits for ERR_FATAL errors only */
+	pci_read_config_dword(dev, pos + PCI_ERR_UNCOR_STATUS, &status);
+	pci_read_config_dword(dev, pos + PCI_ERR_UNCOR_SEVER, &sev);
+	status &= sev;
+	if (status)
+		pci_write_config_dword(dev, pos + PCI_ERR_UNCOR_STATUS, status);
+}
+
 int pci_cleanup_aer_error_status_regs(struct pci_dev *dev)
 {
 	int pos;
diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
index f7ce0cb0b0b7..0539518f9861 100644
--- a/drivers/pci/pcie/err.c
+++ b/drivers/pci/pcie/err.c
@@ -316,7 +316,7 @@ void pcie_do_fatal_recovery(struct pci_dev *dev, u32 service)
 		 * do error recovery on all subordinates of the bridge instead
 		 * of the bridge and clear the error status of the bridge.
 		 */
-		pci_cleanup_aer_uncorrect_error_status(dev);
+		pci_aer_clear_fatal_status(dev);
 	}
 
 	if (result == PCI_ERS_RESULT_RECOVERED) {


* [PATCH v3 2/7] PCI/AER: Clear only ERR_NONFATAL bits during non-fatal recovery
  2018-07-18 19:44 [PATCH v3 0/7] Fix issues and cleanup for ERR_FATAL and ERR_NONFATAL Bjorn Helgaas
  2018-07-18 19:44 ` [PATCH v3 1/7] PCI/AER: Clear only ERR_FATAL status bits during fatal recovery Bjorn Helgaas
@ 2018-07-18 19:44 ` Bjorn Helgaas
  2018-07-18 19:44 ` [PATCH v3 3/7] PCI/AER: Factor out ERR_NONFATAL status bit clearing Bjorn Helgaas
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 11+ messages in thread
From: Bjorn Helgaas @ 2018-07-18 19:44 UTC (permalink / raw)
  To: Oza Pawandeep
  Cc: Philippe Ombredanne, Thomas Gleixner, Greg Kroah-Hartman,
	Kate Stewart, Dongdong Liu, Keith Busch, Wei Zhang, Sinan Kaya,
	Timur Tabi, linux-pci, linux-kernel

From: Oza Pawandeep <poza@codeaurora.org>

pci_cleanup_aer_uncorrect_error_status() is called by driver .slot_reset()
methods when handling ERR_NONFATAL errors.  Previously this cleared *all*
the bits, including ERR_FATAL bits.

Since we're only handling ERR_NONFATAL errors, clear only the ERR_NONFATAL
error status bits.
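
As a rough user-space sketch (not kernel code), the two masks used by the
fatal and non-fatal paths split the logged bits into two disjoint sets.
The function names and the register values below are made up for
illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Bits the non-fatal path would write back: logged and not marked fatal. */
uint32_t nonfatal_bits(uint32_t status, uint32_t sev)
{
	return status & ~sev;
}

/* Bits the fatal path would write back: logged and marked fatal. */
uint32_t fatal_bits(uint32_t status, uint32_t sev)
{
	return status & sev;
}

void split_demo(void)
{
	uint32_t status = 0x00041010;	/* arbitrary logged errors */
	uint32_t sev    = 0x00040010;	/* arbitrary severity setting */

	/* Every logged bit belongs to exactly one of the two sets. */
	assert((nonfatal_bits(status, sev) | fatal_bits(status, sev)) == status);
	assert((nonfatal_bits(status, sev) & fatal_bits(status, sev)) == 0);
	assert(nonfatal_bits(status, sev) == 0x00001000);
}
```

Because the sets are disjoint, clearing one of them never discards a bit
that belongs to the other.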

Signed-off-by: Oza Pawandeep <poza@codeaurora.org>
[bhelgaas: split to separate patch]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
---
 drivers/pci/pcie/aer.c |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index 5b4a84e3d360..6f0f131b5e6a 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -360,13 +360,16 @@ EXPORT_SYMBOL_GPL(pci_disable_pcie_error_reporting);
 int pci_cleanup_aer_uncorrect_error_status(struct pci_dev *dev)
 {
 	int pos;
-	u32 status;
+	u32 status, sev;
 
 	pos = dev->aer_cap;
 	if (!pos)
 		return -EIO;
 
+	/* Clear status bits for ERR_NONFATAL errors only */
 	pci_read_config_dword(dev, pos + PCI_ERR_UNCOR_STATUS, &status);
+	pci_read_config_dword(dev, pos + PCI_ERR_UNCOR_SEVER, &sev);
+	status &= ~sev;
 	if (status)
 		pci_write_config_dword(dev, pos + PCI_ERR_UNCOR_STATUS, status);
 


* [PATCH v3 3/7] PCI/AER: Factor out ERR_NONFATAL status bit clearing
  2018-07-18 19:44 [PATCH v3 0/7] Fix issues and cleanup for ERR_FATAL and ERR_NONFATAL Bjorn Helgaas
  2018-07-18 19:44 ` [PATCH v3 1/7] PCI/AER: Clear only ERR_FATAL status bits during fatal recovery Bjorn Helgaas
  2018-07-18 19:44 ` [PATCH v3 2/7] PCI/AER: Clear only ERR_NONFATAL bits during non-fatal recovery Bjorn Helgaas
@ 2018-07-18 19:44 ` Bjorn Helgaas
  2018-07-18 19:44 ` [PATCH v3 4/7] PCI/AER: Remove ERR_FATAL code from ERR_NONFATAL path Bjorn Helgaas
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 11+ messages in thread
From: Bjorn Helgaas @ 2018-07-18 19:44 UTC (permalink / raw)
  To: Oza Pawandeep
  Cc: Philippe Ombredanne, Thomas Gleixner, Greg Kroah-Hartman,
	Kate Stewart, Dongdong Liu, Keith Busch, Wei Zhang, Sinan Kaya,
	Timur Tabi, linux-pci, linux-kernel

From: Oza Pawandeep <poza@codeaurora.org>

aer_error_resume() clears all ERR_NONFATAL error status bits.  This is
exactly what pci_cleanup_aer_uncorrect_error_status() does, so use that
instead of duplicating the code.

Signed-off-by: Oza Pawandeep <poza@codeaurora.org>
[bhelgaas: split to separate patch]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
---
 drivers/pci/pcie/aer.c |    9 +--------
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index 6f0f131b5e6a..b8972fe85043 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -1356,20 +1356,13 @@ static pci_ers_result_t aer_root_reset(struct pci_dev *dev)
  */
 static void aer_error_resume(struct pci_dev *dev)
 {
-	int pos;
-	u32 status, mask;
 	u16 reg16;
 
 	/* Clean up Root device status */
 	pcie_capability_read_word(dev, PCI_EXP_DEVSTA, &reg16);
 	pcie_capability_write_word(dev, PCI_EXP_DEVSTA, reg16);
 
-	/* Clean AER Root Error Status */
-	pos = dev->aer_cap;
-	pci_read_config_dword(dev, pos + PCI_ERR_UNCOR_STATUS, &status);
-	pci_read_config_dword(dev, pos + PCI_ERR_UNCOR_SEVER, &mask);
-	status &= ~mask; /* Clear corresponding nonfatal bits */
-	pci_write_config_dword(dev, pos + PCI_ERR_UNCOR_STATUS, status);
+	pci_cleanup_aer_uncorrect_error_status(dev);
 }
 
 static struct pcie_port_service_driver aerdriver = {


* [PATCH v3 4/7] PCI/AER: Remove ERR_FATAL code from ERR_NONFATAL path
  2018-07-18 19:44 [PATCH v3 0/7] Fix issues and cleanup for ERR_FATAL and ERR_NONFATAL Bjorn Helgaas
                   ` (2 preceding siblings ...)
  2018-07-18 19:44 ` [PATCH v3 3/7] PCI/AER: Factor out ERR_NONFATAL status bit clearing Bjorn Helgaas
@ 2018-07-18 19:44 ` Bjorn Helgaas
  2018-07-18 19:44 ` [PATCH v3 5/7] PCI/AER: Clear device status bits during ERR_FATAL and ERR_NONFATAL Bjorn Helgaas
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 11+ messages in thread
From: Bjorn Helgaas @ 2018-07-18 19:44 UTC (permalink / raw)
  To: Oza Pawandeep
  Cc: Philippe Ombredanne, Thomas Gleixner, Greg Kroah-Hartman,
	Kate Stewart, Dongdong Liu, Keith Busch, Wei Zhang, Sinan Kaya,
	Timur Tabi, linux-pci, linux-kernel

From: Oza Pawandeep <poza@codeaurora.org>

broadcast_error_message() is only used for ERR_NONFATAL events, when the
state is always pci_channel_io_normal, so remove the unused alternate path.

Signed-off-by: Oza Pawandeep <poza@codeaurora.org>
[bhelgaas: changelog]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
---
 drivers/pci/pcie/err.c |   11 +++--------
 1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
index 0539518f9861..638eda5c1d79 100644
--- a/drivers/pci/pcie/err.c
+++ b/drivers/pci/pcie/err.c
@@ -259,15 +259,10 @@ static pci_ers_result_t broadcast_error_message(struct pci_dev *dev,
 		/*
 		 * If the error is reported by an end point, we think this
 		 * error is related to the upstream link of the end point.
+		 * The error is non fatal so the bus is ok; just invoke
+		 * the callback for the function that logged the error.
 		 */
-		if (state == pci_channel_io_normal)
-			/*
-			 * the error is non fatal so the bus is ok, just invoke
-			 * the callback for the function that logged the error.
-			 */
-			cb(dev, &result_data);
-		else
-			pci_walk_bus(dev->bus, cb, &result_data);
+		cb(dev, &result_data);
 	}
 
 	return result_data.result;


* [PATCH v3 5/7] PCI/AER: Clear device status bits during ERR_FATAL and ERR_NONFATAL
  2018-07-18 19:44 [PATCH v3 0/7] Fix issues and cleanup for ERR_FATAL and ERR_NONFATAL Bjorn Helgaas
                   ` (3 preceding siblings ...)
  2018-07-18 19:44 ` [PATCH v3 4/7] PCI/AER: Remove ERR_FATAL code from ERR_NONFATAL path Bjorn Helgaas
@ 2018-07-18 19:44 ` Bjorn Helgaas
  2018-07-18 19:44 ` [PATCH v3 6/7] PCI/AER: Clear device status bits during ERR_COR handling Bjorn Helgaas
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 11+ messages in thread
From: Bjorn Helgaas @ 2018-07-18 19:44 UTC (permalink / raw)
  To: Oza Pawandeep
  Cc: Philippe Ombredanne, Thomas Gleixner, Greg Kroah-Hartman,
	Kate Stewart, Dongdong Liu, Keith Busch, Wei Zhang, Sinan Kaya,
	Timur Tabi, linux-pci, linux-kernel

From: Oza Pawandeep <poza@codeaurora.org>

Clear the device status bits while handling both ERR_FATAL and ERR_NONFATAL
cases.

Signed-off-by: Oza Pawandeep <poza@codeaurora.org>
[bhelgaas: rename to pci_aer_clear_device_status(), declare internal to PCI
core instead of exposing it everywhere]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
---
 drivers/pci/pci.h      |    1 +
 drivers/pci/pcie/aer.c |   15 +++++++++------
 drivers/pci/pcie/err.c |    2 ++
 3 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 12fd2ac95843..fc4978df7caf 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -454,6 +454,7 @@ static inline int devm_of_pci_get_host_bridge_resources(struct device *dev,
 
 #ifdef CONFIG_PCIEAER
 void pci_aer_clear_fatal_status(struct pci_dev *dev);
+void pci_aer_clear_device_status(struct pci_dev *dev);
 #endif
 
 #endif /* DRIVERS_PCI_H */
diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index b8972fe85043..dc67f52b002f 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -357,6 +357,14 @@ int pci_disable_pcie_error_reporting(struct pci_dev *dev)
 }
 EXPORT_SYMBOL_GPL(pci_disable_pcie_error_reporting);
 
+void pci_aer_clear_device_status(struct pci_dev *dev)
+{
+	u16 sta;
+
+	pcie_capability_read_word(dev, PCI_EXP_DEVSTA, &sta);
+	pcie_capability_write_word(dev, PCI_EXP_DEVSTA, sta);
+}
+
 int pci_cleanup_aer_uncorrect_error_status(struct pci_dev *dev)
 {
 	int pos;
@@ -1356,12 +1364,7 @@ static pci_ers_result_t aer_root_reset(struct pci_dev *dev)
  */
 static void aer_error_resume(struct pci_dev *dev)
 {
-	u16 reg16;
-
-	/* Clean up Root device status */
-	pcie_capability_read_word(dev, PCI_EXP_DEVSTA, &reg16);
-	pcie_capability_write_word(dev, PCI_EXP_DEVSTA, reg16);
-
+	pci_aer_clear_device_status(dev);
 	pci_cleanup_aer_uncorrect_error_status(dev);
 }
 
diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
index 638eda5c1d79..fdbcc555860d 100644
--- a/drivers/pci/pcie/err.c
+++ b/drivers/pci/pcie/err.c
@@ -252,6 +252,7 @@ static pci_ers_result_t broadcast_error_message(struct pci_dev *dev,
 			dev->error_state = state;
 		pci_walk_bus(dev->subordinate, cb, &result_data);
 		if (cb == report_resume) {
+			pci_aer_clear_device_status(dev);
 			pci_cleanup_aer_uncorrect_error_status(dev);
 			dev->error_state = pci_channel_io_normal;
 		}
@@ -312,6 +313,7 @@ void pcie_do_fatal_recovery(struct pci_dev *dev, u32 service)
 		 * of the bridge and clear the error status of the bridge.
 		 */
 		pci_aer_clear_fatal_status(dev);
+		pci_aer_clear_device_status(dev);
 	}
 
 	if (result == PCI_ERS_RESULT_RECOVERED) {


* [PATCH v3 6/7] PCI/AER: Clear device status bits during ERR_COR handling
  2018-07-18 19:44 [PATCH v3 0/7] Fix issues and cleanup for ERR_FATAL and ERR_NONFATAL Bjorn Helgaas
                   ` (4 preceding siblings ...)
  2018-07-18 19:44 ` [PATCH v3 5/7] PCI/AER: Clear device status bits during ERR_FATAL and ERR_NONFATAL Bjorn Helgaas
@ 2018-07-18 19:44 ` Bjorn Helgaas
  2018-07-18 19:45 ` [PATCH v3 7/7] PCI/portdrv: Remove pcie_portdrv_err_handler.slot_reset Bjorn Helgaas
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 11+ messages in thread
From: Bjorn Helgaas @ 2018-07-18 19:44 UTC (permalink / raw)
  To: Oza Pawandeep
  Cc: Philippe Ombredanne, Thomas Gleixner, Greg Kroah-Hartman,
	Kate Stewart, Dongdong Liu, Keith Busch, Wei Zhang, Sinan Kaya,
	Timur Tabi, linux-pci, linux-kernel

From: Oza Pawandeep <poza@codeaurora.org>

When a correctable error occurs, the Correctable Error Detected bit in the
Device Status register is set.  Clear it after handling the error.
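
The read-then-write-back pattern used by pci_aer_clear_device_status() can
be modeled in user space like this.  It is a sketch under stated
assumptions: the DEVSTA_* values mirror the PCI_EXP_DEVSTA_* definitions in
pci_regs.h, while the register model and the fake_*() names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Device Status bit values, mirroring PCI_EXP_DEVSTA_* in pci_regs.h. */
#define DEVSTA_CED	0x0001	/* Correctable Error Detected */
#define DEVSTA_NFED	0x0002	/* Non-Fatal Error Detected */
#define DEVSTA_FED	0x0004	/* Fatal Error Detected */
#define DEVSTA_URD	0x0008	/* Unsupported Request Detected */
#define DEVSTA_RW1C	(DEVSTA_CED | DEVSTA_NFED | DEVSTA_FED | DEVSTA_URD)

/* Hypothetical register model: only the write-1-to-clear bits react. */
static void fake_devsta_write(uint16_t *reg, uint16_t val)
{
	*reg &= ~(val & DEVSTA_RW1C);
}

/* Read the register and write the same value back. */
void fake_clear_device_status(uint16_t *reg)
{
	uint16_t sta = *reg;

	fake_devsta_write(reg, sta);
}

void fake_devsta_demo(void)
{
	/* 0x0020 models the read-only Transactions Pending bit. */
	uint16_t reg = DEVSTA_CED | 0x0020;

	fake_clear_device_status(&reg);
	assert(reg == 0x0020);	/* error bit cleared, read-only bit untouched */
}
```

Writing back exactly what was read clears every error bit that was set at
the time of the read and nothing else.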

Signed-off-by: Oza Pawandeep <poza@codeaurora.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
---
 drivers/pci/pcie/aer.c |    1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index dc67f52b002f..2accfd7a4c9d 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -827,6 +827,7 @@ static void handle_error_source(struct pci_dev *dev, struct aer_err_info *info)
 		if (pos)
 			pci_write_config_dword(dev, pos + PCI_ERR_COR_STATUS,
 					info->status);
+		pci_aer_clear_device_status(dev);
 	} else if (info->severity == AER_NONFATAL)
 		pcie_do_nonfatal_recovery(dev);
 	else if (info->severity == AER_FATAL)


* [PATCH v3 7/7] PCI/portdrv: Remove pcie_portdrv_err_handler.slot_reset
  2018-07-18 19:44 [PATCH v3 0/7] Fix issues and cleanup for ERR_FATAL and ERR_NONFATAL Bjorn Helgaas
                   ` (5 preceding siblings ...)
  2018-07-18 19:44 ` [PATCH v3 6/7] PCI/AER: Clear device status bits during ERR_COR handling Bjorn Helgaas
@ 2018-07-18 19:45 ` Bjorn Helgaas
  2018-07-19  3:53 ` [PATCH v3 0/7] Fix issues and cleanup for ERR_FATAL and ERR_NONFATAL poza
  2018-07-19 15:56 ` poza
  8 siblings, 0 replies; 11+ messages in thread
From: Bjorn Helgaas @ 2018-07-18 19:45 UTC (permalink / raw)
  To: Oza Pawandeep
  Cc: Philippe Ombredanne, Thomas Gleixner, Greg Kroah-Hartman,
	Kate Stewart, Dongdong Liu, Keith Busch, Wei Zhang, Sinan Kaya,
	Timur Tabi, linux-pci, linux-kernel

From: Oza Pawandeep <poza@codeaurora.org>

The pci_error_handlers.slot_reset() callback is only used for non-bridge
devices (see broadcast_error_message()).  Since portdrv only binds to
bridges, we don't need pcie_portdrv_slot_reset(), so remove it.

Signed-off-by: Oza Pawandeep <poza@codeaurora.org>
[bhelgaas: changelog, remove pcie_portdrv_slot_reset() completely]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
---
 drivers/pci/pcie/portdrv_pci.c |   25 -------------------------
 1 file changed, 25 deletions(-)

diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
index 973f1b80a038..b78840f54a9b 100644
--- a/drivers/pci/pcie/portdrv_pci.c
+++ b/drivers/pci/pcie/portdrv_pci.c
@@ -42,17 +42,6 @@ __setup("pcie_ports=", pcie_port_setup);
 
 /* global data */
 
-static int pcie_portdrv_restore_config(struct pci_dev *dev)
-{
-	int retval;
-
-	retval = pci_enable_device(dev);
-	if (retval)
-		return retval;
-	pci_set_master(dev);
-	return 0;
-}
-
 #ifdef CONFIG_PM
 static int pcie_port_runtime_suspend(struct device *dev)
 {
@@ -160,19 +149,6 @@ static pci_ers_result_t pcie_portdrv_mmio_enabled(struct pci_dev *dev)
 	return PCI_ERS_RESULT_RECOVERED;
 }
 
-static pci_ers_result_t pcie_portdrv_slot_reset(struct pci_dev *dev)
-{
-	/* If fatal, restore cfg space for possible link reset at upstream */
-	if (dev->error_state == pci_channel_io_frozen) {
-		dev->state_saved = true;
-		pci_restore_state(dev);
-		pcie_portdrv_restore_config(dev);
-		pci_enable_pcie_error_reporting(dev);
-	}
-
-	return PCI_ERS_RESULT_RECOVERED;
-}
-
 static int resume_iter(struct device *device, void *data)
 {
 	struct pcie_device *pcie_device;
@@ -208,7 +184,6 @@ static const struct pci_device_id port_pci_ids[] = { {
 static const struct pci_error_handlers pcie_portdrv_err_handler = {
 	.error_detected = pcie_portdrv_error_detected,
 	.mmio_enabled = pcie_portdrv_mmio_enabled,
-	.slot_reset = pcie_portdrv_slot_reset,
 	.resume = pcie_portdrv_err_resume,
 };
 


* Re: [PATCH v3 0/7] Fix issues and cleanup for ERR_FATAL and ERR_NONFATAL
  2018-07-18 19:44 [PATCH v3 0/7] Fix issues and cleanup for ERR_FATAL and ERR_NONFATAL Bjorn Helgaas
                   ` (6 preceding siblings ...)
  2018-07-18 19:45 ` [PATCH v3 7/7] PCI/portdrv: Remove pcie_portdrv_err_handler.slot_reset Bjorn Helgaas
@ 2018-07-19  3:53 ` poza
  2018-07-19 23:00   ` Bjorn Helgaas
  2018-07-19 15:56 ` poza
  8 siblings, 1 reply; 11+ messages in thread
From: poza @ 2018-07-19  3:53 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Philippe Ombredanne, Thomas Gleixner, Greg Kroah-Hartman,
	Kate Stewart, Dongdong Liu, Keith Busch, Wei Zhang, Sinan Kaya,
	Timur Tabi, linux-pci, linux-kernel

On 2018-07-19 01:14, Bjorn Helgaas wrote:
> This is a v3 of Oza's patches [1].  It's available at [2] if you prefer
> git.
> 
> v3 changes:
>   - Add pci_aer_clear_fatal_status() to clear ERR_FATAL bits, only called
>     from pcie_do_fatal_recovery().  Moved to first in series to avoid a
>     window where ERR_FATAL recovery only clears ERR_NONFATAL bits.  Visible
>     only inside the PCI core.
>   - Instead of having pci_cleanup_aer_uncorrect_error_status() do different
>     things based on dev->error_state, use this only for ERR_NONFATAL bits.
>     I didn't change the name because it's used by many drivers.
>   - Rename pci_cleanup_aer_error_device_status() to
>     pci_aer_clear_device_status(), make it void, and make it visible only
>     inside the PCI core.
>   - Remove pcie_portdrv_err_handler.slot_reset altogether instead of making
>     it a stub function.  Possibly pcie_portdrv_err_handler could be removed
>     completely?
> 
> [1] https://lkml.kernel.org/r/1529661494-20936-1-git-send-email-poza@codeaurora.org
> [2] https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git/?h=pci/06-22-oza-aer
> 
> ---
> 
> Bjorn Helgaas (1):
>       PCI/AER: Clear only ERR_FATAL status bits during fatal recovery
> 
> Oza Pawandeep (6):
>       PCI/AER: Clear only ERR_NONFATAL bits during non-fatal recovery
>       PCI/AER: Factor out ERR_NONFATAL status bit clearing
>       PCI/AER: Remove ERR_FATAL code from ERR_NONFATAL path
>       PCI/AER: Clear device status bits during ERR_FATAL and ERR_NONFATAL
>       PCI/AER: Clear device status bits during ERR_COR handling
>       PCI/portdrv: Remove pcie_portdrv_err_handler.slot_reset
> 
> 
>  drivers/pci/pci.h              |    5 ++++
>  drivers/pci/pcie/aer.c         |   47 +++++++++++++++++++++++++++-------------
>  drivers/pci/pcie/err.c         |   15 +++++--------
>  drivers/pci/pcie/portdrv_pci.c |   25 ---------------------
>  4 files changed, 43 insertions(+), 49 deletions(-)

Looks good to me.
Thanks for the corrections.
There are some x86 compilation errors; do you want me to fix them and push a v4?

Regards,
Oza.

* Re: [PATCH v3 0/7] Fix issues and cleanup for ERR_FATAL and ERR_NONFATAL
  2018-07-18 19:44 [PATCH v3 0/7] Fix issues and cleanup for ERR_FATAL and ERR_NONFATAL Bjorn Helgaas
                   ` (7 preceding siblings ...)
  2018-07-19  3:53 ` [PATCH v3 0/7] Fix issues and cleanup for ERR_FATAL and ERR_NONFATAL poza
@ 2018-07-19 15:56 ` poza
  8 siblings, 0 replies; 11+ messages in thread
From: poza @ 2018-07-19 15:56 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Philippe Ombredanne, Thomas Gleixner, Greg Kroah-Hartman,
	Kate Stewart, Dongdong Liu, Keith Busch, Wei Zhang, Sinan Kaya,
	Timur Tabi, linux-pci, linux-kernel

On 2018-07-19 01:14, Bjorn Helgaas wrote:
> This is a v3 of Oza's patches [1].  It's available at [2] if you prefer
> git.
> 
> v3 changes:
>   - Add pci_aer_clear_fatal_status() to clear ERR_FATAL bits, only called
>     from pcie_do_fatal_recovery().  Moved to first in series to avoid a
>     window where ERR_FATAL recovery only clears ERR_NONFATAL bits.  Visible
>     only inside the PCI core.
>   - Instead of having pci_cleanup_aer_uncorrect_error_status() do different
>     things based on dev->error_state, use this only for ERR_NONFATAL bits.
>     I didn't change the name because it's used by many drivers.
>   - Rename pci_cleanup_aer_error_device_status() to
>     pci_aer_clear_device_status(), make it void, and make it visible only
>     inside the PCI core.
>   - Remove pcie_portdrv_err_handler.slot_reset altogether instead of making
>     it a stub function.  Possibly pcie_portdrv_err_handler could be removed
>     completely?
> 
> [1] https://lkml.kernel.org/r/1529661494-20936-1-git-send-email-poza@codeaurora.org
> [2] https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git/?h=pci/06-22-oza-aer
> 
> ---
> 
> Bjorn Helgaas (1):
>       PCI/AER: Clear only ERR_FATAL status bits during fatal recovery
> 
> Oza Pawandeep (6):
>       PCI/AER: Clear only ERR_NONFATAL bits during non-fatal recovery
>       PCI/AER: Factor out ERR_NONFATAL status bit clearing
>       PCI/AER: Remove ERR_FATAL code from ERR_NONFATAL path
>       PCI/AER: Clear device status bits during ERR_FATAL and ERR_NONFATAL
>       PCI/AER: Clear device status bits during ERR_COR handling
>       PCI/portdrv: Remove pcie_portdrv_err_handler.slot_reset
> 
> 
>  drivers/pci/pci.h              |    5 ++++
>  drivers/pci/pcie/aer.c         |   47 +++++++++++++++++++++++++++-------------
>  drivers/pci/pcie/err.c         |   15 +++++--------
>  drivers/pci/pcie/portdrv_pci.c |   25 ---------------------
>  4 files changed, 43 insertions(+), 49 deletions(-)


Hi Bjorn,

I am planning some follow-up work after this series.


You wrote:
"
1) I don't think the driver slot_reset callbacks should be responsible
for clearing these AER status bits.  Can we clear them somewhere in
the pcie_do_nonfatal_recovery() path and remove these calls from the
drivers?
"

Oza: We can do the following in broadcast_error_message():

       if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) {
           pci_walk_bus(dev->subordinate,
                        pci_cleanup_aer_uncorrect_error_status, NULL);

and then update all the drivers to remove their calls to
pci_cleanup_aer_uncorrect_error_status().

2) In principle, we should only read PCI_ERR_UNCOR_STATUS *once* per
device when handling an error.  We currently read it three times:

   aer_isr
     aer_isr_one_error
       find_source_device
         find_device_iter
           is_error_source
             read PCI_ERR_UNCOR_STATUS              # 1
Oza: This is the first legitimate read.
       aer_process_err_devices
         get_device_error_info(e_info->dev[i])
           read PCI_ERR_UNCOR_STATUS                # 2
Oza: As I see it, this read is used to check whether the link is healthy,
so its purpose looks different to me.
         handle_error_source
           pcie_do_nonfatal_recovery
             ...
               report_slot_reset
                 driver->err_handler->slot_reset
                   pci_cleanup_aer_uncorrect_error_status
                     read PCI_ERR_UNCOR_STATUS      # 3
Oza: pci_cleanup_aer_uncorrect_error_status() is generic and can clear the
status on its own.  For example, if we do
pci_walk_bus(dev->subordinate, pci_cleanup_aer_uncorrect_error_status,
NULL) as I suggested in point 4, then we have to read the status there.


3) We need to get rid of pci_channel_io_frozen permanently.

Regards,
Oza.

* Re: [PATCH v3 0/7] Fix issues and cleanup for ERR_FATAL and ERR_NONFATAL
  2018-07-19  3:53 ` [PATCH v3 0/7] Fix issues and cleanup for ERR_FATAL and ERR_NONFATAL poza
@ 2018-07-19 23:00   ` Bjorn Helgaas
  0 siblings, 0 replies; 11+ messages in thread
From: Bjorn Helgaas @ 2018-07-19 23:00 UTC (permalink / raw)
  To: poza
  Cc: Philippe Ombredanne, Thomas Gleixner, Greg Kroah-Hartman,
	Kate Stewart, Dongdong Liu, Keith Busch, Wei Zhang, Sinan Kaya,
	Timur Tabi, linux-pci, linux-kernel

On Thu, Jul 19, 2018 at 09:23:47AM +0530, poza@codeaurora.org wrote:
> On 2018-07-19 01:14, Bjorn Helgaas wrote:
> > This is a v3 of Oza's patches [1].  It's available at [2] if you prefer
> > git.
> > 
> > v3 changes:
> >   - Add pci_aer_clear_fatal_status() to clear ERR_FATAL bits, only called
> >     from pcie_do_fatal_recovery().  Moved to first in series to avoid a
> >     window where ERR_FATAL recovery only clears ERR_NONFATAL bits.  Visible
> >     only inside the PCI core.
> >   - Instead of having pci_cleanup_aer_uncorrect_error_status() do different
> >     things based on dev->error_state, use this only for ERR_NONFATAL bits.
> >     I didn't change the name because it's used by many drivers.
> >   - Rename pci_cleanup_aer_error_device_status() to
> >     pci_aer_clear_device_status(), make it void, and make it visible only
> >     inside the PCI core.
> >   - Remove pcie_portdrv_err_handler.slot_reset altogether instead of making
> >     it a stub function.  Possibly pcie_portdrv_err_handler could be removed
> >     completely?
> > 
> > [1] https://lkml.kernel.org/r/1529661494-20936-1-git-send-email-poza@codeaurora.org
> > [2] https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git/?h=pci/06-22-oza-aer
> > 
> > ---
> > 
> > Bjorn Helgaas (1):
> >       PCI/AER: Clear only ERR_FATAL status bits during fatal recovery
> > 
> > Oza Pawandeep (6):
> >       PCI/AER: Clear only ERR_NONFATAL bits during non-fatal recovery
> >       PCI/AER: Factor out ERR_NONFATAL status bit clearing
> >       PCI/AER: Remove ERR_FATAL code from ERR_NONFATAL path
> >       PCI/AER: Clear device status bits during ERR_FATAL and ERR_NONFATAL
> >       PCI/AER: Clear device status bits during ERR_COR handling
> >       PCI/portdrv: Remove pcie_portdrv_err_handler.slot_reset
> > 
> > 
> >  drivers/pci/pci.h              |    5 ++++
> >  drivers/pci/pcie/aer.c         |   47 +++++++++++++++++++++++++++-------------
> >  drivers/pci/pcie/err.c         |   15 +++++--------
> >  drivers/pci/pcie/portdrv_pci.c |   25 ---------------------
> >  4 files changed, 43 insertions(+), 49 deletions(-)
> 
> Looks good to me.
> Thanks for the corrections.
> There are some x86 compilation errors; do you want me to fix them and push a v4?

I fixed those already.  I moved these all to the pci/aer branch for
v4.19.  I'll merge them into "next" soon.  Thanks!

Bjorn
