ntb.lists.linux.dev archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/9] PCI/AER: Fix and optimize usage of status clearing api
@ 2022-09-27 15:35 Zhuo Chen
  2022-09-27 15:35 ` [PATCH v2 1/9] PCI/AER: Add pci_aer_clear_uncorrect_error_status() to PCI core Zhuo Chen
                   ` (8 more replies)
  0 siblings, 9 replies; 19+ messages in thread
From: Zhuo Chen @ 2022-09-27 15:35 UTC (permalink / raw)
  To: bhelgaas, ruscur, oohall, fancer.lancer, jdmason, dave.jiang,
	allenbh, james.smart, dick.kennedy, jejb, martin.petersen
  Cc: chenzhuo.1, linuxppc-dev, linux-pci, linux-kernel, ntb, linux-scsi

Hello.

Here comes patch v2, which contains some fixes and optimizations of
aer api usage. The original version can be found on the mailing list.

Changes since v1:
- Modifications to comments proposed by Bjorn. Split patch into more
  obvious parts.

Zhuo Chen (9):
  PCI/AER: Add pci_aer_clear_uncorrect_error_status() to PCI core
  PCI/DPC: Use pci_aer_clear_uncorrect_error_status() to clear
    uncorrectable error status
  NTB: Change to use pci_aer_clear_uncorrect_error_status()
  scsi: lpfc: Change to use pci_aer_clear_uncorrect_error_status()
  PCI/AER: Unexport pci_aer_clear_nonfatal_status()
  PCI/AER: Move check inside pcie_clear_device_status().
  PCI/AER: Use pcie_aer_is_native() to judge whether OS owns AER
  PCI/ERR: Clear fatal status when pci_channel_io_frozen
  PCI/AER: Refine status clearing process with api

 drivers/ntb/hw/idt/ntb_hw_idt.c |  4 +--
 drivers/pci/pci.c               |  7 +++--
 drivers/pci/pci.h               |  2 ++
 drivers/pci/pcie/aer.c          | 45 +++++++++++++++++++--------------
 drivers/pci/pcie/dpc.c          |  3 +--
 drivers/pci/pcie/err.c          | 15 ++++-------
 drivers/pci/pcie/portdrv_core.c |  3 +--
 drivers/scsi/lpfc/lpfc_attr.c   |  4 +--
 include/linux/aer.h             |  4 +--
 9 files changed, 46 insertions(+), 41 deletions(-)

-- 
2.30.1 (Apple Git-130)


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v2 1/9] PCI/AER: Add pci_aer_clear_uncorrect_error_status() to PCI core
  2022-09-27 15:35 [PATCH v2 0/9] PCI/AER: Fix and optimize usage of status clearing api Zhuo Chen
@ 2022-09-27 15:35 ` Zhuo Chen
  2022-09-27 19:31   ` Sathyanarayanan Kuppuswamy
  2022-09-27 15:35 ` [PATCH v2 2/9] PCI/DPC: Use pci_aer_clear_uncorrect_error_status() to clear uncorrectable error status Zhuo Chen
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 19+ messages in thread
From: Zhuo Chen @ 2022-09-27 15:35 UTC (permalink / raw)
  To: bhelgaas, ruscur, oohall, fancer.lancer, jdmason, dave.jiang,
	allenbh, james.smart, dick.kennedy, jejb, martin.petersen
  Cc: chenzhuo.1, linuxppc-dev, linux-pci, linux-kernel, ntb, linux-scsi

Sometimes we need to clear aer uncorrectable error status, so we add
pci_aer_clear_uncorrect_error_status() to PCI core.

Signed-off-by: Zhuo Chen <chenzhuo.1@bytedance.com>
---
 drivers/pci/pcie/aer.c | 16 ++++++++++++++++
 include/linux/aer.h    |  5 +++++
 2 files changed, 21 insertions(+)

diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index e2d8a74f83c3..4e637121be23 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -286,6 +286,22 @@ void pci_aer_clear_fatal_status(struct pci_dev *dev)
 		pci_write_config_dword(dev, aer + PCI_ERR_UNCOR_STATUS, status);
 }
 
+int pci_aer_clear_uncorrect_error_status(struct pci_dev *dev)
+{
+	int aer = dev->aer_cap;
+	u32 status;
+
+	if (!pcie_aer_is_native(dev))
+		return -EIO;
+
+	pci_read_config_dword(dev, aer + PCI_ERR_UNCOR_STATUS, &status);
+	if (status)
+		pci_write_config_dword(dev, aer + PCI_ERR_UNCOR_STATUS, status);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(pci_aer_clear_uncorrect_error_status);
+
 /**
  * pci_aer_raw_clear_status - Clear AER error registers.
  * @dev: the PCI device
diff --git a/include/linux/aer.h b/include/linux/aer.h
index 97f64ba1b34a..154690c278cb 100644
--- a/include/linux/aer.h
+++ b/include/linux/aer.h
@@ -45,6 +45,7 @@ struct aer_capability_regs {
 int pci_enable_pcie_error_reporting(struct pci_dev *dev);
 int pci_disable_pcie_error_reporting(struct pci_dev *dev);
 int pci_aer_clear_nonfatal_status(struct pci_dev *dev);
+int pci_aer_clear_uncorrect_error_status(struct pci_dev *dev);
 void pci_save_aer_state(struct pci_dev *dev);
 void pci_restore_aer_state(struct pci_dev *dev);
 #else
@@ -60,6 +61,10 @@ static inline int pci_aer_clear_nonfatal_status(struct pci_dev *dev)
 {
 	return -EINVAL;
 }
+static inline int pci_aer_clear_uncorrect_error_status(struct pci_dev *dev)
+{
+	return -EINVAL;
+}
 static inline void pci_save_aer_state(struct pci_dev *dev) {}
 static inline void pci_restore_aer_state(struct pci_dev *dev) {}
 #endif
-- 
2.30.1 (Apple Git-130)


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 2/9] PCI/DPC: Use pci_aer_clear_uncorrect_error_status() to clear uncorrectable error status
  2022-09-27 15:35 [PATCH v2 0/9] PCI/AER: Fix and optimize usage of status clearing api Zhuo Chen
  2022-09-27 15:35 ` [PATCH v2 1/9] PCI/AER: Add pci_aer_clear_uncorrect_error_status() to PCI core Zhuo Chen
@ 2022-09-27 15:35 ` Zhuo Chen
  2022-09-27 19:34   ` Sathyanarayanan Kuppuswamy
  2022-09-27 15:35 ` [PATCH v2 3/9] NTB: Change to use pci_aer_clear_uncorrect_error_status() Zhuo Chen
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 19+ messages in thread
From: Zhuo Chen @ 2022-09-27 15:35 UTC (permalink / raw)
  To: bhelgaas, ruscur, oohall, fancer.lancer, jdmason, dave.jiang,
	allenbh, james.smart, dick.kennedy, jejb, martin.petersen
  Cc: chenzhuo.1, linuxppc-dev, linux-pci, linux-kernel, ntb, linux-scsi

Use pci_aer_clear_nonfatal_status() in dpc_process_error(), which has
no functional changes.

Signed-off-by: Zhuo Chen <chenzhuo.1@bytedance.com>
---
 drivers/pci/pcie/dpc.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/pci/pcie/dpc.c b/drivers/pci/pcie/dpc.c
index 3e9afee02e8d..7942073fbb34 100644
--- a/drivers/pci/pcie/dpc.c
+++ b/drivers/pci/pcie/dpc.c
@@ -288,8 +288,7 @@ void dpc_process_error(struct pci_dev *pdev)
 		 dpc_get_aer_uncorrect_severity(pdev, &info) &&
 		 aer_get_device_error_info(pdev, &info)) {
 		aer_print_error(pdev, &info);
-		pci_aer_clear_nonfatal_status(pdev);
-		pci_aer_clear_fatal_status(pdev);
+		pci_aer_clear_uncorrect_error_status(pdev);
 	}
 }
 
-- 
2.30.1 (Apple Git-130)


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 3/9] NTB: Change to use pci_aer_clear_uncorrect_error_status()
  2022-09-27 15:35 [PATCH v2 0/9] PCI/AER: Fix and optimize usage of status clearing api Zhuo Chen
  2022-09-27 15:35 ` [PATCH v2 1/9] PCI/AER: Add pci_aer_clear_uncorrect_error_status() to PCI core Zhuo Chen
  2022-09-27 15:35 ` [PATCH v2 2/9] PCI/DPC: Use pci_aer_clear_uncorrect_error_status() to clear uncorrectable error status Zhuo Chen
@ 2022-09-27 15:35 ` Zhuo Chen
  2022-09-27 19:39   ` Sathyanarayanan Kuppuswamy
  2022-09-28 11:00   ` Serge Semin
  2022-09-27 15:35 ` [PATCH v2 4/9] scsi: lpfc: " Zhuo Chen
                   ` (5 subsequent siblings)
  8 siblings, 2 replies; 19+ messages in thread
From: Zhuo Chen @ 2022-09-27 15:35 UTC (permalink / raw)
  To: bhelgaas, ruscur, oohall, fancer.lancer, jdmason, dave.jiang,
	allenbh, james.smart, dick.kennedy, jejb, martin.petersen
  Cc: chenzhuo.1, linuxppc-dev, linux-pci, linux-kernel, ntb, linux-scsi

Status bits for ERR_NONFATAL errors only are cleared in
pci_aer_clear_nonfatal_status(), but we want clear uncorrectable
error status in idt_init_pci(), so we change to use
pci_aer_clear_uncorrect_error_status().

Signed-off-by: Zhuo Chen <chenzhuo.1@bytedance.com>
---
 drivers/ntb/hw/idt/ntb_hw_idt.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/ntb/hw/idt/ntb_hw_idt.c b/drivers/ntb/hw/idt/ntb_hw_idt.c
index 0ed6f809ff2e..d5f0aa87f817 100644
--- a/drivers/ntb/hw/idt/ntb_hw_idt.c
+++ b/drivers/ntb/hw/idt/ntb_hw_idt.c
@@ -2657,8 +2657,8 @@ static int idt_init_pci(struct idt_ntb_dev *ndev)
 	ret = pci_enable_pcie_error_reporting(pdev);
 	if (ret != 0)
 		dev_warn(&pdev->dev, "PCIe AER capability disabled\n");
-	else /* Cleanup nonfatal error status before getting to init */
-		pci_aer_clear_nonfatal_status(pdev);
+	else /* Cleanup uncorrectable error status before getting to init */
+		pci_aer_clear_uncorrect_error_status(pdev);
 
 	/* First enable the PCI device */
 	ret = pcim_enable_device(pdev);
-- 
2.30.1 (Apple Git-130)


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 4/9] scsi: lpfc: Change to use pci_aer_clear_uncorrect_error_status()
  2022-09-27 15:35 [PATCH v2 0/9] PCI/AER: Fix and optimize usage of status clearing api Zhuo Chen
                   ` (2 preceding siblings ...)
  2022-09-27 15:35 ` [PATCH v2 3/9] NTB: Change to use pci_aer_clear_uncorrect_error_status() Zhuo Chen
@ 2022-09-27 15:35 ` Zhuo Chen
  2022-09-27 19:57   ` Sathyanarayanan Kuppuswamy
  2022-09-27 15:35 ` [PATCH v2 5/9] PCI/AER: Unexport pci_aer_clear_nonfatal_status() Zhuo Chen
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 19+ messages in thread
From: Zhuo Chen @ 2022-09-27 15:35 UTC (permalink / raw)
  To: bhelgaas, ruscur, oohall, fancer.lancer, jdmason, dave.jiang,
	allenbh, james.smart, dick.kennedy, jejb, martin.petersen
  Cc: chenzhuo.1, linuxppc-dev, linux-pci, linux-kernel, ntb, linux-scsi

Status bits for ERR_NONFATAL errors only are cleared in
pci_aer_clear_nonfatal_status(), but we want clear uncorrectable
error status in lpfc_aer_cleanup_state(), so we change to use
pci_aer_clear_uncorrect_error_status().

Signed-off-by: Zhuo Chen <chenzhuo.1@bytedance.com>
---
 drivers/scsi/lpfc/lpfc_attr.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_attr.c b/drivers/scsi/lpfc/lpfc_attr.c
index 09cf2cd0ae60..d835cc0ba153 100644
--- a/drivers/scsi/lpfc/lpfc_attr.c
+++ b/drivers/scsi/lpfc/lpfc_attr.c
@@ -4689,7 +4689,7 @@ static DEVICE_ATTR_RW(lpfc_aer_support);
  * Description:
  * If the @buf contains 1 and the device currently has the AER support
  * enabled, then invokes the kernel AER helper routine
- * pci_aer_clear_nonfatal_status() to clean up the uncorrectable
+ * pci_aer_clear_uncorrect_error_status() to clean up the uncorrectable
  * error status register.
  *
  * Notes:
@@ -4715,7 +4715,7 @@ lpfc_aer_cleanup_state(struct device *dev, struct device_attribute *attr,
 		return -EINVAL;
 
 	if (phba->hba_flag & HBA_AER_ENABLED)
-		rc = pci_aer_clear_nonfatal_status(phba->pcidev);
+		rc = pci_aer_clear_uncorrect_error_status(phba->pcidev);
 
 	if (rc == 0)
 		return strlen(buf);
-- 
2.30.1 (Apple Git-130)


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 5/9] PCI/AER: Unexport pci_aer_clear_nonfatal_status()
  2022-09-27 15:35 [PATCH v2 0/9] PCI/AER: Fix and optimize usage of status clearing api Zhuo Chen
                   ` (3 preceding siblings ...)
  2022-09-27 15:35 ` [PATCH v2 4/9] scsi: lpfc: " Zhuo Chen
@ 2022-09-27 15:35 ` Zhuo Chen
  2022-09-27 19:59   ` Sathyanarayanan Kuppuswamy
  2022-09-27 15:35 ` [PATCH v2 6/9] PCI/AER: Move check inside pcie_clear_device_status() Zhuo Chen
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 19+ messages in thread
From: Zhuo Chen @ 2022-09-27 15:35 UTC (permalink / raw)
  To: bhelgaas, ruscur, oohall, fancer.lancer, jdmason, dave.jiang,
	allenbh, james.smart, dick.kennedy, jejb, martin.petersen
  Cc: chenzhuo.1, linuxppc-dev, linux-pci, linux-kernel, ntb, linux-scsi

Since pci_aer_clear_nonfatal_status() is used only internally, move
its declaration to the PCI internal header file. Also, no one cares
about return value of pci_aer_clear_nonfatal_status(), so make it void.

Signed-off-by: Zhuo Chen <chenzhuo.1@bytedance.com>
---
 drivers/pci/pci.h      | 2 ++
 drivers/pci/pcie/aer.c | 7 ++-----
 include/linux/aer.h    | 5 -----
 3 files changed, 4 insertions(+), 10 deletions(-)

diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 785f31086313..a114175d08e4 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -684,6 +684,7 @@ void pci_aer_init(struct pci_dev *dev);
 void pci_aer_exit(struct pci_dev *dev);
 extern const struct attribute_group aer_stats_attr_group;
 void pci_aer_clear_fatal_status(struct pci_dev *dev);
+void pci_aer_clear_nonfatal_status(struct pci_dev *dev);
 int pci_aer_clear_status(struct pci_dev *dev);
 int pci_aer_raw_clear_status(struct pci_dev *dev);
 #else
@@ -691,6 +692,7 @@ static inline void pci_no_aer(void) { }
 static inline void pci_aer_init(struct pci_dev *d) { }
 static inline void pci_aer_exit(struct pci_dev *d) { }
 static inline void pci_aer_clear_fatal_status(struct pci_dev *dev) { }
+static inline void pci_aer_clear_nonfatal_status(struct pci_dev *dev) { }
 static inline int pci_aer_clear_status(struct pci_dev *dev) { return -EINVAL; }
 static inline int pci_aer_raw_clear_status(struct pci_dev *dev) { return -EINVAL; }
 #endif
diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index 4e637121be23..e2ebd108339d 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -251,13 +251,13 @@ int pci_disable_pcie_error_reporting(struct pci_dev *dev)
 }
 EXPORT_SYMBOL_GPL(pci_disable_pcie_error_reporting);
 
-int pci_aer_clear_nonfatal_status(struct pci_dev *dev)
+void pci_aer_clear_nonfatal_status(struct pci_dev *dev)
 {
 	int aer = dev->aer_cap;
 	u32 status, sev;
 
 	if (!pcie_aer_is_native(dev))
-		return -EIO;
+		return;
 
 	/* Clear status bits for ERR_NONFATAL errors only */
 	pci_read_config_dword(dev, aer + PCI_ERR_UNCOR_STATUS, &status);
@@ -265,10 +265,7 @@ int pci_aer_clear_nonfatal_status(struct pci_dev *dev)
 	status &= ~sev;
 	if (status)
 		pci_write_config_dword(dev, aer + PCI_ERR_UNCOR_STATUS, status);
-
-	return 0;
 }
-EXPORT_SYMBOL_GPL(pci_aer_clear_nonfatal_status);
 
 void pci_aer_clear_fatal_status(struct pci_dev *dev)
 {
diff --git a/include/linux/aer.h b/include/linux/aer.h
index 154690c278cb..f638ad955deb 100644
--- a/include/linux/aer.h
+++ b/include/linux/aer.h
@@ -44,7 +44,6 @@ struct aer_capability_regs {
 /* PCIe port driver needs this function to enable AER */
 int pci_enable_pcie_error_reporting(struct pci_dev *dev);
 int pci_disable_pcie_error_reporting(struct pci_dev *dev);
-int pci_aer_clear_nonfatal_status(struct pci_dev *dev);
 int pci_aer_clear_uncorrect_error_status(struct pci_dev *dev);
 void pci_save_aer_state(struct pci_dev *dev);
 void pci_restore_aer_state(struct pci_dev *dev);
@@ -57,10 +56,6 @@ static inline int pci_disable_pcie_error_reporting(struct pci_dev *dev)
 {
 	return -EINVAL;
 }
-static inline int pci_aer_clear_nonfatal_status(struct pci_dev *dev)
-{
-	return -EINVAL;
-}
 static inline int pci_aer_clear_uncorrect_error_status(struct pci_dev *dev)
 {
 	return -EINVAL;
-- 
2.30.1 (Apple Git-130)


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 6/9] PCI/AER: Move check inside pcie_clear_device_status().
  2022-09-27 15:35 [PATCH v2 0/9] PCI/AER: Fix and optimize usage of status clearing api Zhuo Chen
                   ` (4 preceding siblings ...)
  2022-09-27 15:35 ` [PATCH v2 5/9] PCI/AER: Unexport pci_aer_clear_nonfatal_status() Zhuo Chen
@ 2022-09-27 15:35 ` Zhuo Chen
  2022-09-27 15:35 ` [PATCH v2 7/9] PCI/AER: Use pcie_aer_is_native() to judge whether OS owns AER Zhuo Chen
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 19+ messages in thread
From: Zhuo Chen @ 2022-09-27 15:35 UTC (permalink / raw)
  To: bhelgaas, ruscur, oohall, fancer.lancer, jdmason, dave.jiang,
	allenbh, james.smart, dick.kennedy, jejb, martin.petersen
  Cc: chenzhuo.1, linuxppc-dev, linux-pci, linux-kernel, ntb, linux-scsi

pcie_clear_device_status() doesn't check for pcie_aer_is_native()
internally, but after commit 068c29a248b6 ("PCI/ERR: Clear PCIe Device
Status errors only if OS owns AER") and commit aa344bc8b727 ("PCI/ERR:
Clear AER status only when we control AER"), both callers check before
calling it. So we move the check inside pcie_clear_device_status().

Signed-off-by: Zhuo Chen <chenzhuo.1@bytedance.com>
---
 drivers/pci/pci.c      |  7 +++++--
 drivers/pci/pcie/aer.c |  4 ++--
 drivers/pci/pcie/err.c | 14 +++-----------
 3 files changed, 10 insertions(+), 15 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 95bc329e74c0..8caf4a5529a1 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -2282,9 +2282,12 @@ EXPORT_SYMBOL_GPL(pci_set_pcie_reset_state);
 void pcie_clear_device_status(struct pci_dev *dev)
 {
 	u16 sta;
+	struct pci_host_bridge *host = pci_find_host_bridge(dev->bus);
 
-	pcie_capability_read_word(dev, PCI_EXP_DEVSTA, &sta);
-	pcie_capability_write_word(dev, PCI_EXP_DEVSTA, sta);
+	if (host->native_aer || pcie_ports_native) {
+		pcie_capability_read_word(dev, PCI_EXP_DEVSTA, &sta);
+		pcie_capability_write_word(dev, PCI_EXP_DEVSTA, sta);
+	}
 }
 #endif
 
diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index e2ebd108339d..e2320ab27a31 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -971,11 +971,11 @@ static void handle_error_source(struct pci_dev *dev, struct aer_err_info *info)
 		 * Correctable error does not need software intervention.
 		 * No need to go through error recovery process.
 		 */
-		if (aer)
+		if (aer) {
 			pci_write_config_dword(dev, aer + PCI_ERR_COR_STATUS,
 					info->status);
-		if (pcie_aer_is_native(dev))
 			pcie_clear_device_status(dev);
+		}
 	} else if (info->severity == AER_NONFATAL)
 		pcie_do_recovery(dev, pci_channel_io_normal, aer_root_reset);
 	else if (info->severity == AER_FATAL)
diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
index 59c90d04a609..f80b21244ef1 100644
--- a/drivers/pci/pcie/err.c
+++ b/drivers/pci/pcie/err.c
@@ -188,7 +188,6 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
 	int type = pci_pcie_type(dev);
 	struct pci_dev *bridge;
 	pci_ers_result_t status = PCI_ERS_RESULT_CAN_RECOVER;
-	struct pci_host_bridge *host = pci_find_host_bridge(dev->bus);
 
 	/*
 	 * If the error was detected by a Root Port, Downstream Port, RCEC,
@@ -241,16 +240,9 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
 	pci_dbg(bridge, "broadcast resume message\n");
 	pci_walk_bridge(bridge, report_resume, &status);
 
-	/*
-	 * If we have native control of AER, clear error status in the device
-	 * that detected the error.  If the platform retained control of AER,
-	 * it is responsible for clearing this status.  In that case, the
-	 * signaling device may not even be visible to the OS.
-	 */
-	if (host->native_aer || pcie_ports_native) {
-		pcie_clear_device_status(dev);
-		pci_aer_clear_nonfatal_status(dev);
-	}
+	pcie_clear_device_status(dev);
+	pci_aer_clear_nonfatal_status(dev);
+
 	pci_info(bridge, "device recovery successful\n");
 	return status;
 
-- 
2.30.1 (Apple Git-130)


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 7/9] PCI/AER: Use pcie_aer_is_native() to judge whether OS owns AER
  2022-09-27 15:35 [PATCH v2 0/9] PCI/AER: Fix and optimize usage of status clearing api Zhuo Chen
                   ` (5 preceding siblings ...)
  2022-09-27 15:35 ` [PATCH v2 6/9] PCI/AER: Move check inside pcie_clear_device_status() Zhuo Chen
@ 2022-09-27 15:35 ` Zhuo Chen
  2022-09-27 15:35 ` [PATCH v2 8/9] PCI/ERR: Clear fatal status when pci_channel_io_frozen Zhuo Chen
  2022-09-27 15:35 ` [PATCH v2 9/9] PCI/AER: Refine status clearing process with api Zhuo Chen
  8 siblings, 0 replies; 19+ messages in thread
From: Zhuo Chen @ 2022-09-27 15:35 UTC (permalink / raw)
  To: bhelgaas, ruscur, oohall, fancer.lancer, jdmason, dave.jiang,
	allenbh, james.smart, dick.kennedy, jejb, martin.petersen
  Cc: chenzhuo.1, linuxppc-dev, linux-pci, linux-kernel, ntb, linux-scsi

Use pcie_aer_is_native() in place of "host->native_aer ||
pcie_ports_native" to judge whether OS owns AER in aer_root_reset().

Replace "dev->aer_cap && (pcie_ports_native || host->native_aer)" in
get_port_device_capability() with pcie_aer_is_native(), which has no
functional changes.

Signed-off-by: Zhuo Chen <chenzhuo.1@bytedance.com>
---
 drivers/pci/pcie/aer.c          | 5 ++---
 drivers/pci/pcie/portdrv_core.c | 3 +--
 2 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index e2320ab27a31..a6d29269ccf2 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -1403,7 +1403,6 @@ static pci_ers_result_t aer_root_reset(struct pci_dev *dev)
 	int type = pci_pcie_type(dev);
 	struct pci_dev *root;
 	int aer;
-	struct pci_host_bridge *host = pci_find_host_bridge(dev->bus);
 	u32 reg32;
 	int rc;
 
@@ -1424,7 +1423,7 @@ static pci_ers_result_t aer_root_reset(struct pci_dev *dev)
 	 */
 	aer = root ? root->aer_cap : 0;
 
-	if ((host->native_aer || pcie_ports_native) && aer) {
+	if (aer && pcie_aer_is_native(root)) {
 		/* Disable Root's interrupt in response to error messages */
 		pci_read_config_dword(root, aer + PCI_ERR_ROOT_COMMAND, &reg32);
 		reg32 &= ~ROOT_PORT_INTR_ON_MESG_MASK;
@@ -1443,7 +1442,7 @@ static pci_ers_result_t aer_root_reset(struct pci_dev *dev)
 			pci_is_root_bus(dev->bus) ? "Root" : "Downstream", rc);
 	}
 
-	if ((host->native_aer || pcie_ports_native) && aer) {
+	if (aer && pcie_aer_is_native(root)) {
 		/* Clear Root Error Status */
 		pci_read_config_dword(root, aer + PCI_ERR_ROOT_STATUS, &reg32);
 		pci_write_config_dword(root, aer + PCI_ERR_ROOT_STATUS, reg32);
diff --git a/drivers/pci/pcie/portdrv_core.c b/drivers/pci/pcie/portdrv_core.c
index 1ac7fec47d6f..844297c0c85e 100644
--- a/drivers/pci/pcie/portdrv_core.c
+++ b/drivers/pci/pcie/portdrv_core.c
@@ -221,8 +221,7 @@ static int get_port_device_capability(struct pci_dev *dev)
 	}
 
 #ifdef CONFIG_PCIEAER
-	if (dev->aer_cap && pci_aer_available() &&
-	    (pcie_ports_native || host->native_aer))
+	if (pcie_aer_is_native(dev) && pci_aer_available())
 		services |= PCIE_PORT_SERVICE_AER;
 #endif
 
-- 
2.30.1 (Apple Git-130)


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 8/9] PCI/ERR: Clear fatal status when pci_channel_io_frozen
  2022-09-27 15:35 [PATCH v2 0/9] PCI/AER: Fix and optimize usage of status clearing api Zhuo Chen
                   ` (6 preceding siblings ...)
  2022-09-27 15:35 ` [PATCH v2 7/9] PCI/AER: Use pcie_aer_is_native() to judge whether OS owns AER Zhuo Chen
@ 2022-09-27 15:35 ` Zhuo Chen
  2022-09-27 15:35 ` [PATCH v2 9/9] PCI/AER: Refine status clearing process with api Zhuo Chen
  8 siblings, 0 replies; 19+ messages in thread
From: Zhuo Chen @ 2022-09-27 15:35 UTC (permalink / raw)
  To: bhelgaas, ruscur, oohall, fancer.lancer, jdmason, dave.jiang,
	allenbh, james.smart, dick.kennedy, jejb, martin.petersen
  Cc: chenzhuo.1, linuxppc-dev, linux-pci, linux-kernel, ntb, linux-scsi

When state is pci_channel_io_frozen in pcie_do_recovery(),
the severity is fatal and fatal status should be cleared.
So we add pci_aer_clear_fatal_status().

Signed-off-by: Zhuo Chen <chenzhuo.1@bytedance.com>
---
 drivers/pci/pcie/err.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
index f80b21244ef1..b46f1d36c090 100644
--- a/drivers/pci/pcie/err.c
+++ b/drivers/pci/pcie/err.c
@@ -241,7 +241,10 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
 	pci_walk_bridge(bridge, report_resume, &status);
 
 	pcie_clear_device_status(dev);
-	pci_aer_clear_nonfatal_status(dev);
+	if (state == pci_channel_io_frozen)
+		pci_aer_clear_fatal_status(dev);
+	else
+		pci_aer_clear_nonfatal_status(dev);
 
 	pci_info(bridge, "device recovery successful\n");
 	return status;
-- 
2.30.1 (Apple Git-130)


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 9/9] PCI/AER: Refine status clearing process with api
  2022-09-27 15:35 [PATCH v2 0/9] PCI/AER: Fix and optimize usage of status clearing api Zhuo Chen
                   ` (7 preceding siblings ...)
  2022-09-27 15:35 ` [PATCH v2 8/9] PCI/ERR: Clear fatal status when pci_channel_io_frozen Zhuo Chen
@ 2022-09-27 15:35 ` Zhuo Chen
  8 siblings, 0 replies; 19+ messages in thread
From: Zhuo Chen @ 2022-09-27 15:35 UTC (permalink / raw)
  To: bhelgaas, ruscur, oohall, fancer.lancer, jdmason, dave.jiang,
	allenbh, james.smart, dick.kennedy, jejb, martin.petersen
  Cc: chenzhuo.1, linuxppc-dev, linux-pci, linux-kernel, ntb, linux-scsi

Statements clearing status in aer_enable_rootport() is functionally
equivalent with pcie_clear_device_status() and pci_aer_clear_status().
So we replace them, which has no functional changes.

After commit 20e15e673b05 ("PCI/AER: Add pci_aer_raw_clear_status()
to unconditionally clear Error Status"), pci_aer_raw_clear_status()
is only used by the EDR path, so we add note in function comment.

Signed-off-by: Zhuo Chen <chenzhuo.1@bytedance.com>
---
 drivers/pci/pcie/aer.c | 13 ++++---------
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index a6d29269ccf2..bd5ecfa4860f 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -306,6 +306,8 @@ EXPORT_SYMBOL_GPL(pci_aer_clear_uncorrect_error_status);
  * Clearing AER error status registers unconditionally, regardless of
  * whether they're owned by firmware or the OS.
  *
+ * Used only by the EDR path. Other paths should use pci_aer_clear_status().
+ *
  * Returns 0 on success, or negative on failure.
  */
 int pci_aer_raw_clear_status(struct pci_dev *dev)
@@ -1277,24 +1279,17 @@ static void aer_enable_rootport(struct aer_rpc *rpc)
 {
 	struct pci_dev *pdev = rpc->rpd;
 	int aer = pdev->aer_cap;
-	u16 reg16;
 	u32 reg32;
 
 	/* Clear PCIe Capability's Device Status */
-	pcie_capability_read_word(pdev, PCI_EXP_DEVSTA, &reg16);
-	pcie_capability_write_word(pdev, PCI_EXP_DEVSTA, reg16);
+	pcie_clear_device_status(pdev);
 
 	/* Disable system error generation in response to error messages */
 	pcie_capability_clear_word(pdev, PCI_EXP_RTCTL,
 				   SYSTEM_ERROR_INTR_ON_MESG_MASK);
 
 	/* Clear error status */
-	pci_read_config_dword(pdev, aer + PCI_ERR_ROOT_STATUS, &reg32);
-	pci_write_config_dword(pdev, aer + PCI_ERR_ROOT_STATUS, reg32);
-	pci_read_config_dword(pdev, aer + PCI_ERR_COR_STATUS, &reg32);
-	pci_write_config_dword(pdev, aer + PCI_ERR_COR_STATUS, reg32);
-	pci_read_config_dword(pdev, aer + PCI_ERR_UNCOR_STATUS, &reg32);
-	pci_write_config_dword(pdev, aer + PCI_ERR_UNCOR_STATUS, reg32);
+	pci_aer_clear_status(pdev);
 
 	/*
 	 * Enable error reporting for the root port device and downstream port
-- 
2.30.1 (Apple Git-130)


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/9] PCI/AER: Add pci_aer_clear_uncorrect_error_status() to PCI core
  2022-09-27 15:35 ` [PATCH v2 1/9] PCI/AER: Add pci_aer_clear_uncorrect_error_status() to PCI core Zhuo Chen
@ 2022-09-27 19:31   ` Sathyanarayanan Kuppuswamy
  2022-09-28  8:32     ` Zhuo Chen
  0 siblings, 1 reply; 19+ messages in thread
From: Sathyanarayanan Kuppuswamy @ 2022-09-27 19:31 UTC (permalink / raw)
  To: Zhuo Chen, bhelgaas, ruscur, oohall, fancer.lancer, jdmason,
	dave.jiang, allenbh, james.smart, dick.kennedy, jejb,
	martin.petersen
  Cc: linuxppc-dev, linux-pci, linux-kernel, ntb, linux-scsi

Hi,

On 9/27/22 8:35 AM, Zhuo Chen wrote:
> Sometimes we need to clear aer uncorrectable error status, so we add

Adding n actual use case will help.

> pci_aer_clear_uncorrect_error_status() to PCI core.

If possible, try to avoid "we" usage in commit log. Just say "so add
pci_aer_clear_uncorrect_error_status() function" 

> 
> Signed-off-by: Zhuo Chen <chenzhuo.1@bytedance.com>
> ---
>  drivers/pci/pcie/aer.c | 16 ++++++++++++++++
>  include/linux/aer.h    |  5 +++++
>  2 files changed, 21 insertions(+)
> 
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index e2d8a74f83c3..4e637121be23 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -286,6 +286,22 @@ void pci_aer_clear_fatal_status(struct pci_dev *dev)
>  		pci_write_config_dword(dev, aer + PCI_ERR_UNCOR_STATUS, status);
>  }
>  
> +int pci_aer_clear_uncorrect_error_status(struct pci_dev *dev)
> +{
> +	int aer = dev->aer_cap;
> +	u32 status;
> +
> +	if (!pcie_aer_is_native(dev))
> +		return -EIO;
> +
> +	pci_read_config_dword(dev, aer + PCI_ERR_UNCOR_STATUS, &status);
> +	if (status)
> +		pci_write_config_dword(dev, aer + PCI_ERR_UNCOR_STATUS, status);

Why not just write all '1' and clear it? Why read and write?

> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(pci_aer_clear_uncorrect_error_status);

Add details about why you want to export in commit log.

> +
>  /**
>   * pci_aer_raw_clear_status - Clear AER error registers.
>   * @dev: the PCI device
> diff --git a/include/linux/aer.h b/include/linux/aer.h
> index 97f64ba1b34a..154690c278cb 100644
> --- a/include/linux/aer.h
> +++ b/include/linux/aer.h
> @@ -45,6 +45,7 @@ struct aer_capability_regs {
>  int pci_enable_pcie_error_reporting(struct pci_dev *dev);
>  int pci_disable_pcie_error_reporting(struct pci_dev *dev);
>  int pci_aer_clear_nonfatal_status(struct pci_dev *dev);
> +int pci_aer_clear_uncorrect_error_status(struct pci_dev *dev);
>  void pci_save_aer_state(struct pci_dev *dev);
>  void pci_restore_aer_state(struct pci_dev *dev);
>  #else
> @@ -60,6 +61,10 @@ static inline int pci_aer_clear_nonfatal_status(struct pci_dev *dev)
>  {
>  	return -EINVAL;
>  }
> +static inline int pci_aer_clear_uncorrect_error_status(struct pci_dev *dev)
> +{
> +	return -EINVAL;
> +}
>  static inline void pci_save_aer_state(struct pci_dev *dev) {}
>  static inline void pci_restore_aer_state(struct pci_dev *dev) {}
>  #endif

-- 
Sathyanarayanan Kuppuswamy
Linux Kernel Developer

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 2/9] PCI/DPC: Use pci_aer_clear_uncorrect_error_status() to clear uncorrectable error status
  2022-09-27 15:35 ` [PATCH v2 2/9] PCI/DPC: Use pci_aer_clear_uncorrect_error_status() to clear uncorrectable error status Zhuo Chen
@ 2022-09-27 19:34   ` Sathyanarayanan Kuppuswamy
  0 siblings, 0 replies; 19+ messages in thread
From: Sathyanarayanan Kuppuswamy @ 2022-09-27 19:34 UTC (permalink / raw)
  To: Zhuo Chen, bhelgaas, ruscur, oohall, fancer.lancer, jdmason,
	dave.jiang, allenbh, james.smart, dick.kennedy, jejb,
	martin.petersen
  Cc: linuxppc-dev, linux-pci, linux-kernel, ntb, linux-scsi

Hi,

On 9/27/22 8:35 AM, Zhuo Chen wrote:
> Use pci_aer_clear_nonfatal_status() in dpc_process_error(), which has
> no functional changes.

Just say pci_aer_clear_uncorrect_error_status() clears both fatal and
non-fatal errors. So use it in place of pci_aer_clear_nonfatal_status()
and pci_aer_clear_fatal_status().

> 
> Signed-off-by: Zhuo Chen <chenzhuo.1@bytedance.com>
> ---
>  drivers/pci/pcie/dpc.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/drivers/pci/pcie/dpc.c b/drivers/pci/pcie/dpc.c
> index 3e9afee02e8d..7942073fbb34 100644
> --- a/drivers/pci/pcie/dpc.c
> +++ b/drivers/pci/pcie/dpc.c
> @@ -288,8 +288,7 @@ void dpc_process_error(struct pci_dev *pdev)
>  		 dpc_get_aer_uncorrect_severity(pdev, &info) &&
>  		 aer_get_device_error_info(pdev, &info)) {
>  		aer_print_error(pdev, &info);
> -		pci_aer_clear_nonfatal_status(pdev);
> -		pci_aer_clear_fatal_status(pdev);
> +		pci_aer_clear_uncorrect_error_status(pdev);
>  	}
>  }
>  

-- 
Sathyanarayanan Kuppuswamy
Linux Kernel Developer

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 3/9] NTB: Change to use pci_aer_clear_uncorrect_error_status()
  2022-09-27 15:35 ` [PATCH v2 3/9] NTB: Change to use pci_aer_clear_uncorrect_error_status() Zhuo Chen
@ 2022-09-27 19:39   ` Sathyanarayanan Kuppuswamy
  2022-09-28  4:20     ` [External] " Zhuo Chen
  2022-09-28 11:00   ` Serge Semin
  1 sibling, 1 reply; 19+ messages in thread
From: Sathyanarayanan Kuppuswamy @ 2022-09-27 19:39 UTC (permalink / raw)
  To: Zhuo Chen, bhelgaas, ruscur, oohall, fancer.lancer, jdmason,
	dave.jiang, allenbh, james.smart, dick.kennedy, jejb,
	martin.petersen
  Cc: linuxppc-dev, linux-pci, linux-kernel, ntb, linux-scsi



On 9/27/22 8:35 AM, Zhuo Chen wrote:
> Status bits for ERR_NONFATAL errors only are cleared in
> pci_aer_clear_nonfatal_status(), but we want clear uncorrectable
> error status in idt_init_pci(), so we change to use
> pci_aer_clear_uncorrect_error_status().

You mean currently driver does not clear fatal errors now, and it is
a problem? Any error reported?

Also, I am wondering why is it required to clear errors during init
code. Is it a norm?

> 
> Signed-off-by: Zhuo Chen <chenzhuo.1@bytedance.com>
> ---
>  drivers/ntb/hw/idt/ntb_hw_idt.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/ntb/hw/idt/ntb_hw_idt.c b/drivers/ntb/hw/idt/ntb_hw_idt.c
> index 0ed6f809ff2e..d5f0aa87f817 100644
> --- a/drivers/ntb/hw/idt/ntb_hw_idt.c
> +++ b/drivers/ntb/hw/idt/ntb_hw_idt.c
> @@ -2657,8 +2657,8 @@ static int idt_init_pci(struct idt_ntb_dev *ndev)
>  	ret = pci_enable_pcie_error_reporting(pdev);
>  	if (ret != 0)
>  		dev_warn(&pdev->dev, "PCIe AER capability disabled\n");
> -	else /* Cleanup nonfatal error status before getting to init */
> -		pci_aer_clear_nonfatal_status(pdev);
> +	else /* Cleanup uncorrectable error status before getting to init */
> +		pci_aer_clear_uncorrect_error_status(pdev);
>  
>  	/* First enable the PCI device */
>  	ret = pcim_enable_device(pdev);

-- 
Sathyanarayanan Kuppuswamy
Linux Kernel Developer

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 4/9] scsi: lpfc: Change to use pci_aer_clear_uncorrect_error_status()
  2022-09-27 15:35 ` [PATCH v2 4/9] scsi: lpfc: " Zhuo Chen
@ 2022-09-27 19:57   ` Sathyanarayanan Kuppuswamy
  0 siblings, 0 replies; 19+ messages in thread
From: Sathyanarayanan Kuppuswamy @ 2022-09-27 19:57 UTC (permalink / raw)
  To: Zhuo Chen, bhelgaas, ruscur, oohall, fancer.lancer, jdmason,
	dave.jiang, allenbh, james.smart, dick.kennedy, jejb,
	martin.petersen
  Cc: linuxppc-dev, linux-pci, linux-kernel, ntb, linux-scsi



On 9/27/22 8:35 AM, Zhuo Chen wrote:
> Status bits for ERR_NONFATAL errors only are cleared in
> pci_aer_clear_nonfatal_status(), but we want clear uncorrectable
> error status in lpfc_aer_cleanup_state(), so we change to use
> pci_aer_clear_uncorrect_error_status().

I think you don't need to mention status bits here. Just use terms
"fatal" and "non-fatal" errors.

lpfc_aer_cleanup_state() requires clearing both fatal and non-fatal
uncorrectable error status. But using  pci_aer_clear_nonfatal_status()
will only clear non-fatal error status. To clear both fatal and non-fatal
error status, use pci_aer_clear_uncorrect_error_status().

> 
> Signed-off-by: Zhuo Chen <chenzhuo.1@bytedance.com>
> ---
>  drivers/scsi/lpfc/lpfc_attr.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/scsi/lpfc/lpfc_attr.c b/drivers/scsi/lpfc/lpfc_attr.c
> index 09cf2cd0ae60..d835cc0ba153 100644
> --- a/drivers/scsi/lpfc/lpfc_attr.c
> +++ b/drivers/scsi/lpfc/lpfc_attr.c
> @@ -4689,7 +4689,7 @@ static DEVICE_ATTR_RW(lpfc_aer_support);
>   * Description:
>   * If the @buf contains 1 and the device currently has the AER support
>   * enabled, then invokes the kernel AER helper routine
> - * pci_aer_clear_nonfatal_status() to clean up the uncorrectable
> + * pci_aer_clear_uncorrect_error_status() to clean up the uncorrectable
>   * error status register.
>   *
>   * Notes:
> @@ -4715,7 +4715,7 @@ lpfc_aer_cleanup_state(struct device *dev, struct device_attribute *attr,
>  		return -EINVAL;
>  
>  	if (phba->hba_flag & HBA_AER_ENABLED)
> -		rc = pci_aer_clear_nonfatal_status(phba->pcidev);
> +		rc = pci_aer_clear_uncorrect_error_status(phba->pcidev);
>  
>  	if (rc == 0)
>  		return strlen(buf);

-- 
Sathyanarayanan Kuppuswamy
Linux Kernel Developer

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 5/9] PCI/AER: Unexport pci_aer_clear_nonfatal_status()
  2022-09-27 15:35 ` [PATCH v2 5/9] PCI/AER: Unexport pci_aer_clear_nonfatal_status() Zhuo Chen
@ 2022-09-27 19:59   ` Sathyanarayanan Kuppuswamy
  0 siblings, 0 replies; 19+ messages in thread
From: Sathyanarayanan Kuppuswamy @ 2022-09-27 19:59 UTC (permalink / raw)
  To: Zhuo Chen, bhelgaas, ruscur, oohall, fancer.lancer, jdmason,
	dave.jiang, allenbh, james.smart, dick.kennedy, jejb,
	martin.petersen
  Cc: linuxppc-dev, linux-pci, linux-kernel, ntb, linux-scsi



On 9/27/22 8:35 AM, Zhuo Chen wrote:
> Since pci_aer_clear_nonfatal_status() is used only internally, move
> its declaration to the PCI internal header file. Also, no one cares
> about return value of pci_aer_clear_nonfatal_status(), so make it void.
> 
> Signed-off-by: Zhuo Chen <chenzhuo.1@bytedance.com>
> ---

Looks good to me.

>  drivers/pci/pci.h      | 2 ++
>  drivers/pci/pcie/aer.c | 7 ++-----
>  include/linux/aer.h    | 5 -----
>  3 files changed, 4 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> index 785f31086313..a114175d08e4 100644
> --- a/drivers/pci/pci.h
> +++ b/drivers/pci/pci.h
> @@ -684,6 +684,7 @@ void pci_aer_init(struct pci_dev *dev);
>  void pci_aer_exit(struct pci_dev *dev);
>  extern const struct attribute_group aer_stats_attr_group;
>  void pci_aer_clear_fatal_status(struct pci_dev *dev);
> +void pci_aer_clear_nonfatal_status(struct pci_dev *dev);
>  int pci_aer_clear_status(struct pci_dev *dev);
>  int pci_aer_raw_clear_status(struct pci_dev *dev);
>  #else
> @@ -691,6 +692,7 @@ static inline void pci_no_aer(void) { }
>  static inline void pci_aer_init(struct pci_dev *d) { }
>  static inline void pci_aer_exit(struct pci_dev *d) { }
>  static inline void pci_aer_clear_fatal_status(struct pci_dev *dev) { }
> +static inline void pci_aer_clear_nonfatal_status(struct pci_dev *dev) { }
>  static inline int pci_aer_clear_status(struct pci_dev *dev) { return -EINVAL; }
>  static inline int pci_aer_raw_clear_status(struct pci_dev *dev) { return -EINVAL; }
>  #endif
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index 4e637121be23..e2ebd108339d 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -251,13 +251,13 @@ int pci_disable_pcie_error_reporting(struct pci_dev *dev)
>  }
>  EXPORT_SYMBOL_GPL(pci_disable_pcie_error_reporting);
>  
> -int pci_aer_clear_nonfatal_status(struct pci_dev *dev)
> +void pci_aer_clear_nonfatal_status(struct pci_dev *dev)
>  {
>  	int aer = dev->aer_cap;
>  	u32 status, sev;
>  
>  	if (!pcie_aer_is_native(dev))
> -		return -EIO;
> +		return;
>  
>  	/* Clear status bits for ERR_NONFATAL errors only */
>  	pci_read_config_dword(dev, aer + PCI_ERR_UNCOR_STATUS, &status);
> @@ -265,10 +265,7 @@ int pci_aer_clear_nonfatal_status(struct pci_dev *dev)
>  	status &= ~sev;
>  	if (status)
>  		pci_write_config_dword(dev, aer + PCI_ERR_UNCOR_STATUS, status);
> -
> -	return 0;
>  }
> -EXPORT_SYMBOL_GPL(pci_aer_clear_nonfatal_status);
>  
>  void pci_aer_clear_fatal_status(struct pci_dev *dev)
>  {
> diff --git a/include/linux/aer.h b/include/linux/aer.h
> index 154690c278cb..f638ad955deb 100644
> --- a/include/linux/aer.h
> +++ b/include/linux/aer.h
> @@ -44,7 +44,6 @@ struct aer_capability_regs {
>  /* PCIe port driver needs this function to enable AER */
>  int pci_enable_pcie_error_reporting(struct pci_dev *dev);
>  int pci_disable_pcie_error_reporting(struct pci_dev *dev);
> -int pci_aer_clear_nonfatal_status(struct pci_dev *dev);
>  int pci_aer_clear_uncorrect_error_status(struct pci_dev *dev);
>  void pci_save_aer_state(struct pci_dev *dev);
>  void pci_restore_aer_state(struct pci_dev *dev);
> @@ -57,10 +56,6 @@ static inline int pci_disable_pcie_error_reporting(struct pci_dev *dev)
>  {
>  	return -EINVAL;
>  }
> -static inline int pci_aer_clear_nonfatal_status(struct pci_dev *dev)
> -{
> -	return -EINVAL;
> -}
>  static inline int pci_aer_clear_uncorrect_error_status(struct pci_dev *dev)
>  {
>  	return -EINVAL;

-- 
Sathyanarayanan Kuppuswamy
Linux Kernel Developer

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [External] Re: [PATCH v2 3/9] NTB: Change to use pci_aer_clear_uncorrect_error_status()
  2022-09-27 19:39   ` Sathyanarayanan Kuppuswamy
@ 2022-09-28  4:20     ` Zhuo Chen
  2022-09-28  4:30       ` Sathyanarayanan Kuppuswamy
  0 siblings, 1 reply; 19+ messages in thread
From: Zhuo Chen @ 2022-09-28  4:20 UTC (permalink / raw)
  To: Sathyanarayanan Kuppuswamy, bhelgaas, ruscur, oohall,
	fancer.lancer, jdmason, dave.jiang, allenbh, james.smart,
	dick.kennedy, jejb, martin.petersen
  Cc: linuxppc-dev, linux-pci, linux-kernel, ntb, linux-scsi



On 9/28/22 3:39 AM, Sathyanarayanan Kuppuswamy wrote:
> 
> 
> On 9/27/22 8:35 AM, Zhuo Chen wrote:
>> Status bits for ERR_NONFATAL errors only are cleared in
>> pci_aer_clear_nonfatal_status(), but we want clear uncorrectable
>> error status in idt_init_pci(), so we change to use
>> pci_aer_clear_uncorrect_error_status().
> 
> You mean currently driver does not clear fatal errors now, and it is
> a problem? Any error reported?
> 
Hi Sathyanarayanan,

No error reports yet, I just changes the behavior back to what it was 
before commit e7b0b847de6d ("PCI/AER: Clear only ERR_NONFATAL bits 
during non-fatal recovery"), because this commit change the original 
function in commit bf2a952d31d2 ("NTB: Add IDT 89HPESxNTx PCIe-switches 
support").

> Also, I am wondering why is it required to clear errors during init
> code. Is it a norm?
> 
I think there is no need to clear errors during init code.
>>
>> Signed-off-by: Zhuo Chen <chenzhuo.1@bytedance.com>
>> ---
>>   drivers/ntb/hw/idt/ntb_hw_idt.c | 4 ++--
>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/ntb/hw/idt/ntb_hw_idt.c b/drivers/ntb/hw/idt/ntb_hw_idt.c
>> index 0ed6f809ff2e..d5f0aa87f817 100644
>> --- a/drivers/ntb/hw/idt/ntb_hw_idt.c
>> +++ b/drivers/ntb/hw/idt/ntb_hw_idt.c
>> @@ -2657,8 +2657,8 @@ static int idt_init_pci(struct idt_ntb_dev *ndev)
>>   	ret = pci_enable_pcie_error_reporting(pdev);
>>   	if (ret != 0)
>>   		dev_warn(&pdev->dev, "PCIe AER capability disabled\n");
>> -	else /* Cleanup nonfatal error status before getting to init */
>> -		pci_aer_clear_nonfatal_status(pdev);
>> +	else /* Cleanup uncorrectable error status before getting to init */
>> +		pci_aer_clear_uncorrect_error_status(pdev);
>>   
>>   	/* First enable the PCI device */
>>   	ret = pcim_enable_device(pdev);
> 

-- 
Thanks,
Zhuo Chen

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [External] Re: [PATCH v2 3/9] NTB: Change to use pci_aer_clear_uncorrect_error_status()
  2022-09-28  4:20     ` [External] " Zhuo Chen
@ 2022-09-28  4:30       ` Sathyanarayanan Kuppuswamy
  0 siblings, 0 replies; 19+ messages in thread
From: Sathyanarayanan Kuppuswamy @ 2022-09-28  4:30 UTC (permalink / raw)
  To: Zhuo Chen, bhelgaas, ruscur, oohall, fancer.lancer, jdmason,
	dave.jiang, allenbh, james.smart, dick.kennedy, jejb,
	martin.petersen
  Cc: linuxppc-dev, linux-pci, linux-kernel, ntb, linux-scsi



On 9/27/22 9:20 PM, Zhuo Chen wrote:
> 
> 
> On 9/28/22 3:39 AM, Sathyanarayanan Kuppuswamy wrote:
>>
>>
>> On 9/27/22 8:35 AM, Zhuo Chen wrote:
>>> Status bits for ERR_NONFATAL errors only are cleared in
>>> pci_aer_clear_nonfatal_status(), but we want clear uncorrectable
>>> error status in idt_init_pci(), so we change to use
>>> pci_aer_clear_uncorrect_error_status().
>>
>> You mean currently driver does not clear fatal errors now, and it is
>> a problem? Any error reported?
>>
> Hi Sathyanarayanan,
> 
> No error reports yet, I just changes the behavior back to what it was before commit e7b0b847de6d ("PCI/AER: Clear only ERR_NONFATAL bits during non-fatal recovery"), because this commit change the original function in commit bf2a952d31d2 ("NTB: Add IDT 89HPESxNTx PCIe-switches support").
> 

Ok. Thanks for clarifying.

>> Also, I am wondering why is it required to clear errors during init
>> code. Is it a norm?
>>
> I think there is no need to clear errors during init code.
>>>
>>> Signed-off-by: Zhuo Chen <chenzhuo.1@bytedance.com>
>>> ---
>>>   drivers/ntb/hw/idt/ntb_hw_idt.c | 4 ++--
>>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/ntb/hw/idt/ntb_hw_idt.c b/drivers/ntb/hw/idt/ntb_hw_idt.c
>>> index 0ed6f809ff2e..d5f0aa87f817 100644
>>> --- a/drivers/ntb/hw/idt/ntb_hw_idt.c
>>> +++ b/drivers/ntb/hw/idt/ntb_hw_idt.c
>>> @@ -2657,8 +2657,8 @@ static int idt_init_pci(struct idt_ntb_dev *ndev)
>>>       ret = pci_enable_pcie_error_reporting(pdev);
>>>       if (ret != 0)
>>>           dev_warn(&pdev->dev, "PCIe AER capability disabled\n");
>>> -    else /* Cleanup nonfatal error status before getting to init */
>>> -        pci_aer_clear_nonfatal_status(pdev);
>>> +    else /* Cleanup uncorrectable error status before getting to init */
>>> +        pci_aer_clear_uncorrect_error_status(pdev);
>>>         /* First enable the PCI device */
>>>       ret = pcim_enable_device(pdev);
>>
> 

-- 
Sathyanarayanan Kuppuswamy
Linux Kernel Developer

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/9] PCI/AER: Add pci_aer_clear_uncorrect_error_status() to PCI core
  2022-09-27 19:31   ` Sathyanarayanan Kuppuswamy
@ 2022-09-28  8:32     ` Zhuo Chen
  0 siblings, 0 replies; 19+ messages in thread
From: Zhuo Chen @ 2022-09-28  8:32 UTC (permalink / raw)
  To: Sathyanarayanan Kuppuswamy
  Cc: bhelgaas, ruscur, oohall, fancer.lancer, jdmason, dave.jiang,
	allenbh, james.smart, dick.kennedy, jejb, martin.petersen,
	linuxppc-dev, linux-pci, linux-kernel, ntb, linux-scsi



On 9/28/22 3:31 AM, Sathyanarayanan Kuppuswamy wrote:
> Hi,
> 
> On 9/27/22 8:35 AM, Zhuo Chen wrote:
>> Sometimes we need to clear aer uncorrectable error status, so we add
> 
> Adding n actual use case will help.
> 
>> pci_aer_clear_uncorrect_error_status() to PCI core.
> 
> If possible, try to avoid "we" usage in commit log. Just say "so add
> pci_aer_clear_uncorrect_error_status() function"
> 
>>
>> Signed-off-by: Zhuo Chen <chenzhuo.1@bytedance.com>
>> ---
>>   drivers/pci/pcie/aer.c | 16 ++++++++++++++++
>>   include/linux/aer.h    |  5 +++++
>>   2 files changed, 21 insertions(+)
>>
>> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
>> index e2d8a74f83c3..4e637121be23 100644
>> --- a/drivers/pci/pcie/aer.c
>> +++ b/drivers/pci/pcie/aer.c
>> @@ -286,6 +286,22 @@ void pci_aer_clear_fatal_status(struct pci_dev *dev)
>>   		pci_write_config_dword(dev, aer + PCI_ERR_UNCOR_STATUS, status);
>>   }
>>   
>> +int pci_aer_clear_uncorrect_error_status(struct pci_dev *dev)
>> +{
>> +	int aer = dev->aer_cap;
>> +	u32 status;
>> +
>> +	if (!pcie_aer_is_native(dev))
>> +		return -EIO;
>> +
>> +	pci_read_config_dword(dev, aer + PCI_ERR_UNCOR_STATUS, &status);
>> +	if (status)
>> +		pci_write_config_dword(dev, aer + PCI_ERR_UNCOR_STATUS, status);
> 
> Why not just write all '1' and clear it? Why read and write?

Hi Sathyanarayanan,

The current implementation and the previous implementation are all first 
read and then write to clear the status. Currently just be consistent 
with them:
  - aer_enable_rootport()
  - pci_aer_raw_clear_status()
  - pcibios_plat_dev_init() in arch/mips/pci/pci-octeon.c
  - commit fd3362cb73de ("PCI/AER: Squash aerdrv_core.c into aerdrv.c")
    pci_cleanup_aer_uncorrect_error_status
> 
>> +
>> +	return 0;
>> +}
>> +EXPORT_SYMBOL_GPL(pci_aer_clear_uncorrect_error_status);
> 
> Add details about why you want to export in commit log.

Thanks,

I will add details and improve commit log in next version.
> 
>> +
>>   /**
>>    * pci_aer_raw_clear_status - Clear AER error registers.
>>    * @dev: the PCI device
>> diff --git a/include/linux/aer.h b/include/linux/aer.h
>> index 97f64ba1b34a..154690c278cb 100644
>> --- a/include/linux/aer.h
>> +++ b/include/linux/aer.h
>> @@ -45,6 +45,7 @@ struct aer_capability_regs {
>>   int pci_enable_pcie_error_reporting(struct pci_dev *dev);
>>   int pci_disable_pcie_error_reporting(struct pci_dev *dev);
>>   int pci_aer_clear_nonfatal_status(struct pci_dev *dev);
>> +int pci_aer_clear_uncorrect_error_status(struct pci_dev *dev);
>>   void pci_save_aer_state(struct pci_dev *dev);
>>   void pci_restore_aer_state(struct pci_dev *dev);
>>   #else
>> @@ -60,6 +61,10 @@ static inline int pci_aer_clear_nonfatal_status(struct pci_dev *dev)
>>   {
>>   	return -EINVAL;
>>   }
>> +static inline int pci_aer_clear_uncorrect_error_status(struct pci_dev *dev)
>> +{
>> +	return -EINVAL;
>> +}
>>   static inline void pci_save_aer_state(struct pci_dev *dev) {}
>>   static inline void pci_restore_aer_state(struct pci_dev *dev) {}
>>   #endif
> 

-- 
Thanks,
Zhuo Chen

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 3/9] NTB: Change to use pci_aer_clear_uncorrect_error_status()
  2022-09-27 15:35 ` [PATCH v2 3/9] NTB: Change to use pci_aer_clear_uncorrect_error_status() Zhuo Chen
  2022-09-27 19:39   ` Sathyanarayanan Kuppuswamy
@ 2022-09-28 11:00   ` Serge Semin
  1 sibling, 0 replies; 19+ messages in thread
From: Serge Semin @ 2022-09-28 11:00 UTC (permalink / raw)
  To: Zhuo Chen
  Cc: bhelgaas, ruscur, oohall, jdmason, dave.jiang, allenbh,
	james.smart, dick.kennedy, jejb, martin.petersen, linuxppc-dev,
	linux-pci, linux-kernel, ntb, linux-scsi

On Tue, Sep 27, 2022 at 11:35:18PM +0800, Zhuo Chen wrote:
> Status bits for ERR_NONFATAL errors only are cleared in
> pci_aer_clear_nonfatal_status(), but we want clear uncorrectable
> error status in idt_init_pci(), so we change to use
> pci_aer_clear_uncorrect_error_status().

Have the modification changed since:
https://lore.kernel.org/linux-pci/20220901181634.99591-2-chenzhuo.1@bytedance.com/
?
AFAICS it hasn't. Then my ab-tag should have been preserved on v2.
One more time:
Acked-by: Serge Semin <fancer.lancer@gmail.com>

Don't forget to add it should you have some more patchset re-spins.

-Sergey

> 
> Signed-off-by: Zhuo Chen <chenzhuo.1@bytedance.com>
> ---
>  drivers/ntb/hw/idt/ntb_hw_idt.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/ntb/hw/idt/ntb_hw_idt.c b/drivers/ntb/hw/idt/ntb_hw_idt.c
> index 0ed6f809ff2e..d5f0aa87f817 100644
> --- a/drivers/ntb/hw/idt/ntb_hw_idt.c
> +++ b/drivers/ntb/hw/idt/ntb_hw_idt.c
> @@ -2657,8 +2657,8 @@ static int idt_init_pci(struct idt_ntb_dev *ndev)
>  	ret = pci_enable_pcie_error_reporting(pdev);
>  	if (ret != 0)
>  		dev_warn(&pdev->dev, "PCIe AER capability disabled\n");
> -	else /* Cleanup nonfatal error status before getting to init */
> -		pci_aer_clear_nonfatal_status(pdev);
> +	else /* Cleanup uncorrectable error status before getting to init */
> +		pci_aer_clear_uncorrect_error_status(pdev);
>  
>  	/* First enable the PCI device */
>  	ret = pcim_enable_device(pdev);
> -- 
> 2.30.1 (Apple Git-130)
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2022-09-28 11:00 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-09-27 15:35 [PATCH v2 0/9] PCI/AER: Fix and optimize usage of status clearing api Zhuo Chen
2022-09-27 15:35 ` [PATCH v2 1/9] PCI/AER: Add pci_aer_clear_uncorrect_error_status() to PCI core Zhuo Chen
2022-09-27 19:31   ` Sathyanarayanan Kuppuswamy
2022-09-28  8:32     ` Zhuo Chen
2022-09-27 15:35 ` [PATCH v2 2/9] PCI/DPC: Use pci_aer_clear_uncorrect_error_status() to clear uncorrectable error status Zhuo Chen
2022-09-27 19:34   ` Sathyanarayanan Kuppuswamy
2022-09-27 15:35 ` [PATCH v2 3/9] NTB: Change to use pci_aer_clear_uncorrect_error_status() Zhuo Chen
2022-09-27 19:39   ` Sathyanarayanan Kuppuswamy
2022-09-28  4:20     ` [External] " Zhuo Chen
2022-09-28  4:30       ` Sathyanarayanan Kuppuswamy
2022-09-28 11:00   ` Serge Semin
2022-09-27 15:35 ` [PATCH v2 4/9] scsi: lpfc: " Zhuo Chen
2022-09-27 19:57   ` Sathyanarayanan Kuppuswamy
2022-09-27 15:35 ` [PATCH v2 5/9] PCI/AER: Unexport pci_aer_clear_nonfatal_status() Zhuo Chen
2022-09-27 19:59   ` Sathyanarayanan Kuppuswamy
2022-09-27 15:35 ` [PATCH v2 6/9] PCI/AER: Move check inside pcie_clear_device_status() Zhuo Chen
2022-09-27 15:35 ` [PATCH v2 7/9] PCI/AER: Use pcie_aer_is_native() to judge whether OS owns AER Zhuo Chen
2022-09-27 15:35 ` [PATCH v2 8/9] PCI/ERR: Clear fatal status when pci_channel_io_frozen Zhuo Chen
2022-09-27 15:35 ` [PATCH v2 9/9] PCI/AER: Refine status clearing process with api Zhuo Chen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).