All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net-next v8 0/3] Ethernet DWMAC5 fault IRQ support
@ 2023-12-21  7:36 Suraj Jaiswal
  2023-12-21  7:36 ` [PATCH net-next v8 1/3] dt-bindings: net: qcom,ethqos: add binding doc for safety IRQ for sa8775p Suraj Jaiswal
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Suraj Jaiswal @ 2023-12-21  7:36 UTC (permalink / raw)
  To: quic_jsuraj, Vinod Koul, Bhupesh Sharma, Andy Gross,
	Bjorn Andersson, Konrad Dybcio, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Alexandre Torgue, Jose Abreu, Maxime Coquelin, netdev,
	linux-arm-msm, devicetree, linux-kernel, linux-stm32,
	Prasad Sodagudi, Andrew Halaney, Rob Herring
  Cc: kernel

Add support to listen Ethernet HW common safery IRQ for correctable and 
uncorrectable fault. The safety IRQ will be triggered for ECC(error
correction code), DPP(data path parity, FSM(finite state machine) error.

Changes since v8:
- Use shared IRQ for sfty
- update error message

Changes since v7:
- Add support of common sfty irq on stmmac_request_irq_multi_msi.
- Remove uncecessary blank line.

Changes since v6:
- use name sfty_irq instead of safety_common_irq.

Changes since v5:
- Add description of ECC, DPP, FSM

Changes since v4:
- Fix DT_CHECKER warning
- use name safety for the IRQ.

Suraj Jaiswal (3):
  dt-bindings: net: qcom,ethqos: add binding doc for safety IRQ for
    sa8775p
  arm64: dts: qcom: sa8775p: enable safety IRQ
  net: stmmac: Add driver support for DWMAC5 common safety IRQ

 .../devicetree/bindings/net/qcom,ethqos.yaml  |  9 +++--
 .../devicetree/bindings/net/snps,dwmac.yaml   |  6 ++-
 arch/arm64/boot/dts/qcom/sa8775p.dtsi         | 10 +++--
 drivers/net/ethernet/stmicro/stmmac/common.h  |  1 +
 drivers/net/ethernet/stmicro/stmmac/stmmac.h  |  3 ++
 .../net/ethernet/stmicro/stmmac/stmmac_main.c | 37 +++++++++++++++++++
 .../ethernet/stmicro/stmmac/stmmac_platform.c |  8 ++++
 7 files changed, 65 insertions(+), 9 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH net-next v8 1/3] dt-bindings: net: qcom,ethqos: add binding doc for safety IRQ for sa8775p
  2023-12-21  7:36 [PATCH net-next v8 0/3] Ethernet DWMAC5 fault IRQ support Suraj Jaiswal
@ 2023-12-21  7:36 ` Suraj Jaiswal
  2023-12-21  7:36 ` [PATCH net-next v8 2/3] arm64: dts: qcom: sa8775p: enable safety IRQ Suraj Jaiswal
  2023-12-21  7:36 ` [PATCH net-next v8 3/3] net: stmmac: Add driver support for DWMAC5 common " Suraj Jaiswal
  2 siblings, 0 replies; 12+ messages in thread
From: Suraj Jaiswal @ 2023-12-21  7:36 UTC (permalink / raw)
  To: quic_jsuraj, Vinod Koul, Bhupesh Sharma, Andy Gross,
	Bjorn Andersson, Konrad Dybcio, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Alexandre Torgue, Jose Abreu, Maxime Coquelin, netdev,
	linux-arm-msm, devicetree, linux-kernel, linux-stm32,
	Prasad Sodagudi, Andrew Halaney, Rob Herring
  Cc: kernel

Add binding doc for safety IRQ. The safety IRQ will be
triggered for ECC(error correction code), DPP(data path
parity), FSM(finite state machine) error.

Signed-off-by: Suraj Jaiswal <quic_jsuraj@quicinc.com>
Reviewed-by: Rob Herring <robh@kernel.org>
---
 Documentation/devicetree/bindings/net/qcom,ethqos.yaml | 9 ++++++---
 Documentation/devicetree/bindings/net/snps,dwmac.yaml  | 6 ++++--
 2 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/Documentation/devicetree/bindings/net/qcom,ethqos.yaml b/Documentation/devicetree/bindings/net/qcom,ethqos.yaml
index 7bdb412a0185..69a337c7e345 100644
--- a/Documentation/devicetree/bindings/net/qcom,ethqos.yaml
+++ b/Documentation/devicetree/bindings/net/qcom,ethqos.yaml
@@ -37,12 +37,14 @@ properties:
     items:
       - description: Combined signal for various interrupt events
       - description: The interrupt that occurs when Rx exits the LPI state
+      - description: The interrupt that occurs when HW safety error triggered
 
   interrupt-names:
     minItems: 1
     items:
       - const: macirq
-      - const: eth_lpi
+      - enum: [eth_lpi, sfty]
+      - const: sfty
 
   clocks:
     maxItems: 4
@@ -89,8 +91,9 @@ examples:
                <&gcc GCC_ETH_PTP_CLK>,
                <&gcc GCC_ETH_RGMII_CLK>;
       interrupts = <GIC_SPI 56 IRQ_TYPE_LEVEL_HIGH>,
-                   <GIC_SPI 55 IRQ_TYPE_LEVEL_HIGH>;
-      interrupt-names = "macirq", "eth_lpi";
+                   <GIC_SPI 55 IRQ_TYPE_LEVEL_HIGH>,
+                   <GIC_SPI 782 IRQ_TYPE_LEVEL_HIGH>;
+      interrupt-names = "macirq", "eth_lpi", "sfty";
 
       rx-fifo-depth = <4096>;
       tx-fifo-depth = <4096>;
diff --git a/Documentation/devicetree/bindings/net/snps,dwmac.yaml b/Documentation/devicetree/bindings/net/snps,dwmac.yaml
index 5c2769dc689a..9b04e2ed7c18 100644
--- a/Documentation/devicetree/bindings/net/snps,dwmac.yaml
+++ b/Documentation/devicetree/bindings/net/snps,dwmac.yaml
@@ -107,13 +107,15 @@ properties:
       - description: Combined signal for various interrupt events
       - description: The interrupt to manage the remote wake-up packet detection
       - description: The interrupt that occurs when Rx exits the LPI state
+      - description: The interrupt that occurs when HW safety error triggered
 
   interrupt-names:
     minItems: 1
     items:
       - const: macirq
-      - enum: [eth_wake_irq, eth_lpi]
-      - const: eth_lpi
+      - enum: [eth_wake_irq, eth_lpi, sfty]
+      - enum: [eth_wake_irq, eth_lpi, sfty]
+      - enum: [eth_wake_irq, eth_lpi, sfty]
 
   clocks:
     minItems: 1
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH net-next v8 2/3] arm64: dts: qcom: sa8775p: enable safety IRQ
  2023-12-21  7:36 [PATCH net-next v8 0/3] Ethernet DWMAC5 fault IRQ support Suraj Jaiswal
  2023-12-21  7:36 ` [PATCH net-next v8 1/3] dt-bindings: net: qcom,ethqos: add binding doc for safety IRQ for sa8775p Suraj Jaiswal
@ 2023-12-21  7:36 ` Suraj Jaiswal
  2023-12-21  7:36 ` [PATCH net-next v8 3/3] net: stmmac: Add driver support for DWMAC5 common " Suraj Jaiswal
  2 siblings, 0 replies; 12+ messages in thread
From: Suraj Jaiswal @ 2023-12-21  7:36 UTC (permalink / raw)
  To: quic_jsuraj, Vinod Koul, Bhupesh Sharma, Andy Gross,
	Bjorn Andersson, Konrad Dybcio, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Alexandre Torgue, Jose Abreu, Maxime Coquelin, netdev,
	linux-arm-msm, devicetree, linux-kernel, linux-stm32,
	Prasad Sodagudi, Andrew Halaney, Rob Herring
  Cc: kernel

Add changes to support safety IRQ handling
support for ethernet.

Signed-off-by: Suraj Jaiswal <quic_jsuraj@quicinc.com>
Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org>
---
 arch/arm64/boot/dts/qcom/sa8775p.dtsi | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/boot/dts/qcom/sa8775p.dtsi b/arch/arm64/boot/dts/qcom/sa8775p.dtsi
index a7eaca33d326..f3645c3f96a1 100644
--- a/arch/arm64/boot/dts/qcom/sa8775p.dtsi
+++ b/arch/arm64/boot/dts/qcom/sa8775p.dtsi
@@ -2394,8 +2394,9 @@ ethernet1: ethernet@23000000 {
 			      <0x0 0x23016000 0x0 0x100>;
 			reg-names = "stmmaceth", "rgmii";
 
-			interrupts = <GIC_SPI 929 IRQ_TYPE_LEVEL_HIGH>;
-			interrupt-names = "macirq";
+			interrupts = <GIC_SPI 929 IRQ_TYPE_LEVEL_HIGH>,
+				     <GIC_SPI 781 IRQ_TYPE_LEVEL_HIGH>;
+			interrupt-names = "macirq", "sfty";
 
 			clocks = <&gcc GCC_EMAC1_AXI_CLK>,
 				 <&gcc GCC_EMAC1_SLV_AHB_CLK>,
@@ -2427,8 +2428,9 @@ ethernet0: ethernet@23040000 {
 			      <0x0 0x23056000 0x0 0x100>;
 			reg-names = "stmmaceth", "rgmii";
 
-			interrupts = <GIC_SPI 946 IRQ_TYPE_LEVEL_HIGH>;
-			interrupt-names = "macirq";
+			interrupts = <GIC_SPI 946 IRQ_TYPE_LEVEL_HIGH>,
+				     <GIC_SPI 782 IRQ_TYPE_LEVEL_HIGH>;
+			interrupt-names = "macirq", "sfty";
 
 			clocks = <&gcc GCC_EMAC0_AXI_CLK>,
 				 <&gcc GCC_EMAC0_SLV_AHB_CLK>,
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH net-next v8 3/3] net: stmmac: Add driver support for DWMAC5 common safety IRQ
  2023-12-21  7:36 [PATCH net-next v8 0/3] Ethernet DWMAC5 fault IRQ support Suraj Jaiswal
  2023-12-21  7:36 ` [PATCH net-next v8 1/3] dt-bindings: net: qcom,ethqos: add binding doc for safety IRQ for sa8775p Suraj Jaiswal
  2023-12-21  7:36 ` [PATCH net-next v8 2/3] arm64: dts: qcom: sa8775p: enable safety IRQ Suraj Jaiswal
@ 2023-12-21  7:36 ` Suraj Jaiswal
  2023-12-21 12:49   ` Serge Semin
  2 siblings, 1 reply; 12+ messages in thread
From: Suraj Jaiswal @ 2023-12-21  7:36 UTC (permalink / raw)
  To: quic_jsuraj, Vinod Koul, Bhupesh Sharma, Andy Gross,
	Bjorn Andersson, Konrad Dybcio, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Alexandre Torgue, Jose Abreu, Maxime Coquelin, netdev,
	linux-arm-msm, devicetree, linux-kernel, linux-stm32,
	Prasad Sodagudi, Andrew Halaney, Rob Herring
  Cc: kernel

Add support to listen HW safety IRQ like ECC(error
correction code), DPP(data path parity), FSM(finite state
machine) fault in common IRQ line.

Signed-off-by: Suraj Jaiswal <quic_jsuraj@quicinc.com>
---
 drivers/net/ethernet/stmicro/stmmac/common.h  |  1 +
 drivers/net/ethernet/stmicro/stmmac/stmmac.h  |  3 ++
 .../net/ethernet/stmicro/stmmac/stmmac_main.c | 37 +++++++++++++++++++
 .../ethernet/stmicro/stmmac/stmmac_platform.c |  8 ++++
 4 files changed, 49 insertions(+)

diff --git a/drivers/net/ethernet/stmicro/stmmac/common.h b/drivers/net/ethernet/stmicro/stmmac/common.h
index 721c1f8e892f..b9233b09b80f 100644
--- a/drivers/net/ethernet/stmicro/stmmac/common.h
+++ b/drivers/net/ethernet/stmicro/stmmac/common.h
@@ -344,6 +344,7 @@ enum request_irq_err {
 	REQ_IRQ_ERR_ALL,
 	REQ_IRQ_ERR_TX,
 	REQ_IRQ_ERR_RX,
+	REQ_IRQ_ERR_SFTY,
 	REQ_IRQ_ERR_SFTY_UE,
 	REQ_IRQ_ERR_SFTY_CE,
 	REQ_IRQ_ERR_LPI,
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac.h b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
index 9f89acf31050..ca3d93851bed 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac.h
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
@@ -31,6 +31,7 @@ struct stmmac_resources {
 	int wol_irq;
 	int lpi_irq;
 	int irq;
+	int sfty_irq;
 	int sfty_ce_irq;
 	int sfty_ue_irq;
 	int rx_irq[MTL_MAX_RX_QUEUES];
@@ -297,6 +298,7 @@ struct stmmac_priv {
 	void __iomem *ptpaddr;
 	void __iomem *estaddr;
 	unsigned long active_vlans[BITS_TO_LONGS(VLAN_N_VID)];
+	int sfty_irq;
 	int sfty_ce_irq;
 	int sfty_ue_irq;
 	int rx_irq[MTL_MAX_RX_QUEUES];
@@ -305,6 +307,7 @@ struct stmmac_priv {
 	char int_name_mac[IFNAMSIZ + 9];
 	char int_name_wol[IFNAMSIZ + 9];
 	char int_name_lpi[IFNAMSIZ + 9];
+	char int_name_sfty[IFNAMSIZ + 10];
 	char int_name_sfty_ce[IFNAMSIZ + 10];
 	char int_name_sfty_ue[IFNAMSIZ + 10];
 	char int_name_rx_irq[MTL_MAX_TX_QUEUES][IFNAMSIZ + 14];
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 47de466e432c..7d4e827dfeab 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -3592,6 +3592,10 @@ static void stmmac_free_irq(struct net_device *dev,
 		if (priv->wol_irq > 0 && priv->wol_irq != dev->irq)
 			free_irq(priv->wol_irq, dev);
 		fallthrough;
+	case REQ_IRQ_ERR_SFTY:
+		if (priv->sfty_irq > 0 && priv->sfty_irq != dev->irq)
+			free_irq(priv->sfty_irq, dev);
+		fallthrough;
 	case REQ_IRQ_ERR_WOL:
 		free_irq(dev->irq, dev);
 		fallthrough;
@@ -3661,6 +3665,23 @@ static int stmmac_request_irq_multi_msi(struct net_device *dev)
 		}
 	}
 
+	/* Request the common Safety Feature Correctible/Uncorrectible
+	 * Error line in case of another line is used
+	 */
+	if (priv->sfty_irq > 0 && priv->sfty_irq != dev->irq) {
+		int_name = priv->int_name_sfty;
+		sprintf(int_name, "%s:%s", dev->name, "safety");
+		ret = request_irq(priv->sfty_irq, stmmac_safety_interrupt,
+				  0, int_name, dev);
+		if (unlikely(ret < 0)) {
+			netdev_err(priv->dev,
+				   "%s: alloc sfty MSI %d (error: %d)\n",
+				   __func__, priv->sfty_irq, ret);
+			irq_err = REQ_IRQ_ERR_SFTY;
+			goto irq_error;
+		}
+	}
+
 	/* Request the Safety Feature Correctible Error line in
 	 * case of another line is used
 	 */
@@ -3798,6 +3819,21 @@ static int stmmac_request_irq_single(struct net_device *dev)
 		}
 	}
 
+	/* Request the common Safety Feature Correctible/Uncorrectible
+	 * Error line in case of another line is used
+	 */
+	if (priv->sfty_irq > 0 && priv->sfty_irq != dev->irq) {
+		ret = request_irq(priv->sfty_irq, stmmac_safety_interrupt,
+				  IRQF_SHARED, dev->name, dev);
+		if (unlikely(ret < 0)) {
+			netdev_err(priv->dev,
+				   "%s: ERROR: allocating the sfty IRQ %d (%d)\n",
+				   __func__, priv->sfty_irq, ret);
+			irq_err = REQ_IRQ_ERR_SFTY;
+			goto irq_error;
+		}
+	}
+
 	return 0;
 
 irq_error:
@@ -7462,6 +7498,7 @@ int stmmac_dvr_probe(struct device *device,
 	priv->dev->irq = res->irq;
 	priv->wol_irq = res->wol_irq;
 	priv->lpi_irq = res->lpi_irq;
+	priv->sfty_irq = res->sfty_irq;
 	priv->sfty_ce_irq = res->sfty_ce_irq;
 	priv->sfty_ue_irq = res->sfty_ue_irq;
 	for (i = 0; i < MTL_MAX_RX_QUEUES; i++)
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
index 70eadc83ca68..ab250161fd79 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
@@ -743,6 +743,14 @@ int stmmac_get_platform_resources(struct platform_device *pdev,
 		dev_info(&pdev->dev, "IRQ eth_lpi not found\n");
 	}
 
+	stmmac_res->sfty_irq =
+		platform_get_irq_byname_optional(pdev, "sfty");
+	if (stmmac_res->sfty_irq < 0) {
+		if (stmmac_res->sfty_irq == -EPROBE_DEFER)
+			return -EPROBE_DEFER;
+		dev_info(&pdev->dev, "IRQ safety IRQ not found\n");
+	}
+
 	stmmac_res->addr = devm_platform_ioremap_resource(pdev, 0);
 
 	return PTR_ERR_OR_ZERO(stmmac_res->addr);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH net-next v8 3/3] net: stmmac: Add driver support for DWMAC5 common safety IRQ
  2023-12-21  7:36 ` [PATCH net-next v8 3/3] net: stmmac: Add driver support for DWMAC5 common " Suraj Jaiswal
@ 2023-12-21 12:49   ` Serge Semin
  2023-12-22  8:43     ` Suraj Jaiswal
  0 siblings, 1 reply; 12+ messages in thread
From: Serge Semin @ 2023-12-21 12:49 UTC (permalink / raw)
  To: Suraj Jaiswal, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: Vinod Koul, Bhupesh Sharma, Andy Gross, Bjorn Andersson,
	Konrad Dybcio, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Alexandre Torgue, Jose Abreu, Maxime Coquelin, netdev,
	linux-arm-msm, devicetree, linux-kernel, linux-stm32,
	Prasad Sodagudi, Andrew Halaney, Rob Herring, kernel

Hi Suraj

On Thu, Dec 21, 2023 at 01:06:20PM +0530, Suraj Jaiswal wrote:
> Add support to listen HW safety IRQ like ECC(error
> correction code), DPP(data path parity), FSM(finite state
> machine) fault in common IRQ line.
> 
> Signed-off-by: Suraj Jaiswal <quic_jsuraj@quicinc.com>

Thanks for taking my notes into account. One more comment is further
below.

> ---
>  drivers/net/ethernet/stmicro/stmmac/common.h  |  1 +
>  drivers/net/ethernet/stmicro/stmmac/stmmac.h  |  3 ++
>  .../net/ethernet/stmicro/stmmac/stmmac_main.c | 37 +++++++++++++++++++
>  .../ethernet/stmicro/stmmac/stmmac_platform.c |  8 ++++
>  4 files changed, 49 insertions(+)
> 
> diff --git a/drivers/net/ethernet/stmicro/stmmac/common.h b/drivers/net/ethernet/stmicro/stmmac/common.h
> index 721c1f8e892f..b9233b09b80f 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/common.h
> +++ b/drivers/net/ethernet/stmicro/stmmac/common.h
> @@ -344,6 +344,7 @@ enum request_irq_err {
>  	REQ_IRQ_ERR_ALL,
>  	REQ_IRQ_ERR_TX,
>  	REQ_IRQ_ERR_RX,
> +	REQ_IRQ_ERR_SFTY,
>  	REQ_IRQ_ERR_SFTY_UE,
>  	REQ_IRQ_ERR_SFTY_CE,
>  	REQ_IRQ_ERR_LPI,
> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac.h b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
> index 9f89acf31050..ca3d93851bed 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac.h
> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
> @@ -31,6 +31,7 @@ struct stmmac_resources {
>  	int wol_irq;
>  	int lpi_irq;
>  	int irq;
> +	int sfty_irq;
>  	int sfty_ce_irq;
>  	int sfty_ue_irq;
>  	int rx_irq[MTL_MAX_RX_QUEUES];
> @@ -297,6 +298,7 @@ struct stmmac_priv {
>  	void __iomem *ptpaddr;
>  	void __iomem *estaddr;
>  	unsigned long active_vlans[BITS_TO_LONGS(VLAN_N_VID)];
> +	int sfty_irq;
>  	int sfty_ce_irq;
>  	int sfty_ue_irq;
>  	int rx_irq[MTL_MAX_RX_QUEUES];
> @@ -305,6 +307,7 @@ struct stmmac_priv {
>  	char int_name_mac[IFNAMSIZ + 9];
>  	char int_name_wol[IFNAMSIZ + 9];
>  	char int_name_lpi[IFNAMSIZ + 9];
> +	char int_name_sfty[IFNAMSIZ + 10];
>  	char int_name_sfty_ce[IFNAMSIZ + 10];
>  	char int_name_sfty_ue[IFNAMSIZ + 10];
>  	char int_name_rx_irq[MTL_MAX_TX_QUEUES][IFNAMSIZ + 14];
> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> index 47de466e432c..7d4e827dfeab 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> @@ -3592,6 +3592,10 @@ static void stmmac_free_irq(struct net_device *dev,
>  		if (priv->wol_irq > 0 && priv->wol_irq != dev->irq)
>  			free_irq(priv->wol_irq, dev);
>  		fallthrough;
> +	case REQ_IRQ_ERR_SFTY:
> +		if (priv->sfty_irq > 0 && priv->sfty_irq != dev->irq)
> +			free_irq(priv->sfty_irq, dev);
> +		fallthrough;
>  	case REQ_IRQ_ERR_WOL:
>  		free_irq(dev->irq, dev);
>  		fallthrough;
> @@ -3661,6 +3665,23 @@ static int stmmac_request_irq_multi_msi(struct net_device *dev)
>  		}
>  	}
>  
> +	/* Request the common Safety Feature Correctible/Uncorrectible
> +	 * Error line in case of another line is used
> +	 */
> +	if (priv->sfty_irq > 0 && priv->sfty_irq != dev->irq) {
> +		int_name = priv->int_name_sfty;
> +		sprintf(int_name, "%s:%s", dev->name, "safety");
> +		ret = request_irq(priv->sfty_irq, stmmac_safety_interrupt,
> +				  0, int_name, dev);
> +		if (unlikely(ret < 0)) {
> +			netdev_err(priv->dev,
> +				   "%s: alloc sfty MSI %d (error: %d)\n",
> +				   __func__, priv->sfty_irq, ret);
> +			irq_err = REQ_IRQ_ERR_SFTY;
> +			goto irq_error;
> +		}
> +	}
> +
>  	/* Request the Safety Feature Correctible Error line in
>  	 * case of another line is used
>  	 */
> @@ -3798,6 +3819,21 @@ static int stmmac_request_irq_single(struct net_device *dev)
>  		}
>  	}
>  
> +	/* Request the common Safety Feature Correctible/Uncorrectible
> +	 * Error line in case of another line is used
> +	 */
> +	if (priv->sfty_irq > 0 && priv->sfty_irq != dev->irq) {

> +		ret = request_irq(priv->sfty_irq, stmmac_safety_interrupt,
> +				  IRQF_SHARED, dev->name, dev);

Just noticed yesterday that stmmac_safety_interrupt() is also called
from the stmmac_interrupt() handler which is supposed to be registered
on the generic "mac" IRQ. Won't it cause races around the CSRs
(doubtfully but still worth to note) and the errors handling
(stmmac_global_err()) in case if both IRQs are raised simultaneously?
At the very least it looks suspicious and worth double-checking.

I also found out that nobody seemed to care that the same handler is
registered on MAC, WoL and LPI IRQ lines. Hmm, no race-related
problems have been reported so far for the platforms with separate
WoL/LPI IRQs. It's either a lucky coincident or the IRQs are always
assigned to the same CPU or the IRQs handle is indeed free of races.
In anyway it looks suspicious too. At the very least AFAICS the DMA
IRQ-handler is indeed racy on the status CSR access. It isn't
cleared-on-read, but write-one-to-clear. So the statistics might be
calculated more than once for the same CSR state. There might be some
other problems I failed to spot on the first glance.

David, Eric, Jacub, Paolo, your opinion about the note above?

-Serge(y)

> +		if (unlikely(ret < 0)) {
> +			netdev_err(priv->dev,
> +				   "%s: ERROR: allocating the sfty IRQ %d (%d)\n",
> +				   __func__, priv->sfty_irq, ret);
> +			irq_err = REQ_IRQ_ERR_SFTY;
> +			goto irq_error;
> +		}
> +	}
> +
>  	return 0;
>  
>  irq_error:
> @@ -7462,6 +7498,7 @@ int stmmac_dvr_probe(struct device *device,
>  	priv->dev->irq = res->irq;
>  	priv->wol_irq = res->wol_irq;
>  	priv->lpi_irq = res->lpi_irq;
> +	priv->sfty_irq = res->sfty_irq;
>  	priv->sfty_ce_irq = res->sfty_ce_irq;
>  	priv->sfty_ue_irq = res->sfty_ue_irq;
>  	for (i = 0; i < MTL_MAX_RX_QUEUES; i++)
> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
> index 70eadc83ca68..ab250161fd79 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
> @@ -743,6 +743,14 @@ int stmmac_get_platform_resources(struct platform_device *pdev,
>  		dev_info(&pdev->dev, "IRQ eth_lpi not found\n");
>  	}
>  
> +	stmmac_res->sfty_irq =
> +		platform_get_irq_byname_optional(pdev, "sfty");
> +	if (stmmac_res->sfty_irq < 0) {
> +		if (stmmac_res->sfty_irq == -EPROBE_DEFER)
> +			return -EPROBE_DEFER;
> +		dev_info(&pdev->dev, "IRQ safety IRQ not found\n");
> +	}
> +
>  	stmmac_res->addr = devm_platform_ioremap_resource(pdev, 0);
>  
>  	return PTR_ERR_OR_ZERO(stmmac_res->addr);
> -- 
> 2.25.1
> 
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH net-next v8 3/3] net: stmmac: Add driver support for DWMAC5 common safety IRQ
  2023-12-21 12:49   ` Serge Semin
@ 2023-12-22  8:43     ` Suraj Jaiswal
  2023-12-22 14:35       ` Serge Semin
  0 siblings, 1 reply; 12+ messages in thread
From: Suraj Jaiswal @ 2023-12-22  8:43 UTC (permalink / raw)
  To: Serge Semin, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: Vinod Koul, Bhupesh Sharma, Andy Gross, Bjorn Andersson,
	Konrad Dybcio, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Alexandre Torgue, Jose Abreu, Maxime Coquelin, netdev,
	linux-arm-msm, devicetree, linux-kernel, linux-stm32,
	Prasad Sodagudi, Andrew Halaney, Rob Herring, kernel

HI Serge,
please find commnet inline.

Thanks
Suraj

On 12/21/2023 6:19 PM, Serge Semin wrote:
> Hi Suraj
> 
> On Thu, Dec 21, 2023 at 01:06:20PM +0530, Suraj Jaiswal wrote:
>> Add support to listen HW safety IRQ like ECC(error
>> correction code), DPP(data path parity), FSM(finite state
>> machine) fault in common IRQ line.
>>
>> Signed-off-by: Suraj Jaiswal <quic_jsuraj@quicinc.com>
> 
> Thanks for taking my notes into account. One more comment is further
> below.
> 
>> ---
>>  drivers/net/ethernet/stmicro/stmmac/common.h  |  1 +
>>  drivers/net/ethernet/stmicro/stmmac/stmmac.h  |  3 ++
>>  .../net/ethernet/stmicro/stmmac/stmmac_main.c | 37 +++++++++++++++++++
>>  .../ethernet/stmicro/stmmac/stmmac_platform.c |  8 ++++
>>  4 files changed, 49 insertions(+)
>>
>> diff --git a/drivers/net/ethernet/stmicro/stmmac/common.h b/drivers/net/ethernet/stmicro/stmmac/common.h
>> index 721c1f8e892f..b9233b09b80f 100644
>> --- a/drivers/net/ethernet/stmicro/stmmac/common.h
>> +++ b/drivers/net/ethernet/stmicro/stmmac/common.h
>> @@ -344,6 +344,7 @@ enum request_irq_err {
>>  	REQ_IRQ_ERR_ALL,
>>  	REQ_IRQ_ERR_TX,
>>  	REQ_IRQ_ERR_RX,
>> +	REQ_IRQ_ERR_SFTY,
>>  	REQ_IRQ_ERR_SFTY_UE,
>>  	REQ_IRQ_ERR_SFTY_CE,
>>  	REQ_IRQ_ERR_LPI,
>> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac.h b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
>> index 9f89acf31050..ca3d93851bed 100644
>> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac.h
>> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
>> @@ -31,6 +31,7 @@ struct stmmac_resources {
>>  	int wol_irq;
>>  	int lpi_irq;
>>  	int irq;
>> +	int sfty_irq;
>>  	int sfty_ce_irq;
>>  	int sfty_ue_irq;
>>  	int rx_irq[MTL_MAX_RX_QUEUES];
>> @@ -297,6 +298,7 @@ struct stmmac_priv {
>>  	void __iomem *ptpaddr;
>>  	void __iomem *estaddr;
>>  	unsigned long active_vlans[BITS_TO_LONGS(VLAN_N_VID)];
>> +	int sfty_irq;
>>  	int sfty_ce_irq;
>>  	int sfty_ue_irq;
>>  	int rx_irq[MTL_MAX_RX_QUEUES];
>> @@ -305,6 +307,7 @@ struct stmmac_priv {
>>  	char int_name_mac[IFNAMSIZ + 9];
>>  	char int_name_wol[IFNAMSIZ + 9];
>>  	char int_name_lpi[IFNAMSIZ + 9];
>> +	char int_name_sfty[IFNAMSIZ + 10];
>>  	char int_name_sfty_ce[IFNAMSIZ + 10];
>>  	char int_name_sfty_ue[IFNAMSIZ + 10];
>>  	char int_name_rx_irq[MTL_MAX_TX_QUEUES][IFNAMSIZ + 14];
>> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>> index 47de466e432c..7d4e827dfeab 100644
>> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>> @@ -3592,6 +3592,10 @@ static void stmmac_free_irq(struct net_device *dev,
>>  		if (priv->wol_irq > 0 && priv->wol_irq != dev->irq)
>>  			free_irq(priv->wol_irq, dev);
>>  		fallthrough;
>> +	case REQ_IRQ_ERR_SFTY:
>> +		if (priv->sfty_irq > 0 && priv->sfty_irq != dev->irq)
>> +			free_irq(priv->sfty_irq, dev);
>> +		fallthrough;
>>  	case REQ_IRQ_ERR_WOL:
>>  		free_irq(dev->irq, dev);
>>  		fallthrough;
>> @@ -3661,6 +3665,23 @@ static int stmmac_request_irq_multi_msi(struct net_device *dev)
>>  		}
>>  	}
>>  
>> +	/* Request the common Safety Feature Correctible/Uncorrectible
>> +	 * Error line in case of another line is used
>> +	 */
>> +	if (priv->sfty_irq > 0 && priv->sfty_irq != dev->irq) {
>> +		int_name = priv->int_name_sfty;
>> +		sprintf(int_name, "%s:%s", dev->name, "safety");
>> +		ret = request_irq(priv->sfty_irq, stmmac_safety_interrupt,
>> +				  0, int_name, dev);
>> +		if (unlikely(ret < 0)) {
>> +			netdev_err(priv->dev,
>> +				   "%s: alloc sfty MSI %d (error: %d)\n",
>> +				   __func__, priv->sfty_irq, ret);
>> +			irq_err = REQ_IRQ_ERR_SFTY;
>> +			goto irq_error;
>> +		}
>> +	}
>> +
>>  	/* Request the Safety Feature Correctible Error line in
>>  	 * case of another line is used
>>  	 */
>> @@ -3798,6 +3819,21 @@ static int stmmac_request_irq_single(struct net_device *dev)
>>  		}
>>  	}
>>  
>> +	/* Request the common Safety Feature Correctible/Uncorrectible
>> +	 * Error line in case of another line is used
>> +	 */
>> +	if (priv->sfty_irq > 0 && priv->sfty_irq != dev->irq) {
> 
>> +		ret = request_irq(priv->sfty_irq, stmmac_safety_interrupt,
>> +				  IRQF_SHARED, dev->name, dev);
> 
> Just noticed yesterday that stmmac_safety_interrupt() is also called
> from the stmmac_interrupt() handler which is supposed to be registered
> on the generic "mac" IRQ. Won't it cause races around the CSRs
> (doubtfully but still worth to note) and the errors handling
> (stmmac_global_err()) in case if both IRQs are raised simultaneously?
> At the very least it looks suspicious and worth double-checking.
> 
> I also found out that nobody seemed to care that the same handler is
> registered on MAC, WoL and LPI IRQ lines. Hmm, no race-related
> problems have been reported so far for the platforms with separate
> WoL/LPI IRQs. It's either a lucky coincident or the IRQs are always
> assigned to the same CPU or the IRQs handle is indeed free of races.
> In anyway it looks suspicious too. At the very least AFAICS the DMA
> IRQ-handler is indeed racy on the status CSR access. It isn't
> cleared-on-read, but write-one-to-clear. So the statistics might be
> calculated more than once for the same CSR state. There might be some
> other problems I failed to spot on the first glance.
> 
> David, Eric, Jacub, Paolo, your opinion about the note above?
> 
> -Serge(y)
> 
<Suraj> We are adding common IRQ similar to already present code for correcteable/uncorrecable https://elixir.bootlin.com/linux/latest/source/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c#L3592.
Also, we need the sfty IRQ handling as soon as the fault occured & that can only be handled if we have handler attached with sfty IRQ.
stmmac_interrupt() will only be triggerd when interrupt triggered for rx/tx packet .
while registerting with sfty IRQ will get triggered as soon as emac HW detect the fault. 
   
>> +		if (unlikely(ret < 0)) {
>> +			netdev_err(priv->dev,
>> +				   "%s: ERROR: allocating the sfty IRQ %d (%d)\n",
>> +				   __func__, priv->sfty_irq, ret);
>> +			irq_err = REQ_IRQ_ERR_SFTY;
>> +			goto irq_error;
>> +		}
>> +	}
>> +
>>  	return 0;
>>  
>>  irq_error:
>> @@ -7462,6 +7498,7 @@ int stmmac_dvr_probe(struct device *device,
>>  	priv->dev->irq = res->irq;
>>  	priv->wol_irq = res->wol_irq;
>>  	priv->lpi_irq = res->lpi_irq;
>> +	priv->sfty_irq = res->sfty_irq;
>>  	priv->sfty_ce_irq = res->sfty_ce_irq;
>>  	priv->sfty_ue_irq = res->sfty_ue_irq;
>>  	for (i = 0; i < MTL_MAX_RX_QUEUES; i++)
>> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
>> index 70eadc83ca68..ab250161fd79 100644
>> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
>> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
>> @@ -743,6 +743,14 @@ int stmmac_get_platform_resources(struct platform_device *pdev,
>>  		dev_info(&pdev->dev, "IRQ eth_lpi not found\n");
>>  	}
>>  
>> +	stmmac_res->sfty_irq =
>> +		platform_get_irq_byname_optional(pdev, "sfty");
>> +	if (stmmac_res->sfty_irq < 0) {
>> +		if (stmmac_res->sfty_irq == -EPROBE_DEFER)
>> +			return -EPROBE_DEFER;
>> +		dev_info(&pdev->dev, "IRQ safety IRQ not found\n");
>> +	}
>> +
>>  	stmmac_res->addr = devm_platform_ioremap_resource(pdev, 0);
>>  
>>  	return PTR_ERR_OR_ZERO(stmmac_res->addr);
>> -- 
>> 2.25.1
>>
>>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH net-next v8 3/3] net: stmmac: Add driver support for DWMAC5 common safety IRQ
  2023-12-22  8:43     ` Suraj Jaiswal
@ 2023-12-22 14:35       ` Serge Semin
  2023-12-26 11:10         ` Suraj Jaiswal
  0 siblings, 1 reply; 12+ messages in thread
From: Serge Semin @ 2023-12-22 14:35 UTC (permalink / raw)
  To: Suraj Jaiswal
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Vinod Koul, Bhupesh Sharma, Andy Gross, Bjorn Andersson,
	Konrad Dybcio, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Alexandre Torgue, Jose Abreu, Maxime Coquelin, netdev,
	linux-arm-msm, devicetree, linux-kernel, linux-stm32,
	Prasad Sodagudi, Andrew Halaney, Rob Herring, kernel

On Fri, Dec 22, 2023 at 02:13:49PM +0530, Suraj Jaiswal wrote:
> HI Serge,
> please find commnet inline.
> 
> Thanks
> Suraj
> 
> On 12/21/2023 6:19 PM, Serge Semin wrote:
> > Hi Suraj
> > 
> > On Thu, Dec 21, 2023 at 01:06:20PM +0530, Suraj Jaiswal wrote:
> >> Add support to listen HW safety IRQ like ECC(error
> >> correction code), DPP(data path parity), FSM(finite state
> >> machine) fault in common IRQ line.
> >>
> >> Signed-off-by: Suraj Jaiswal <quic_jsuraj@quicinc.com>
> > 
> > Thanks for taking my notes into account. One more comment is further
> > below.
> > 
> >> ---
> >>  drivers/net/ethernet/stmicro/stmmac/common.h  |  1 +
> >>  drivers/net/ethernet/stmicro/stmmac/stmmac.h  |  3 ++
> >>  .../net/ethernet/stmicro/stmmac/stmmac_main.c | 37 +++++++++++++++++++
> >>  .../ethernet/stmicro/stmmac/stmmac_platform.c |  8 ++++
> >>  4 files changed, 49 insertions(+)
> >>
> >> diff --git a/drivers/net/ethernet/stmicro/stmmac/common.h b/drivers/net/ethernet/stmicro/stmmac/common.h
> >> index 721c1f8e892f..b9233b09b80f 100644
> >> --- a/drivers/net/ethernet/stmicro/stmmac/common.h
> >> +++ b/drivers/net/ethernet/stmicro/stmmac/common.h
> >> @@ -344,6 +344,7 @@ enum request_irq_err {
> >>  	REQ_IRQ_ERR_ALL,
> >>  	REQ_IRQ_ERR_TX,
> >>  	REQ_IRQ_ERR_RX,
> >> +	REQ_IRQ_ERR_SFTY,
> >>  	REQ_IRQ_ERR_SFTY_UE,
> >>  	REQ_IRQ_ERR_SFTY_CE,
> >>  	REQ_IRQ_ERR_LPI,
> >> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac.h b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
> >> index 9f89acf31050..ca3d93851bed 100644
> >> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac.h
> >> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
> >> @@ -31,6 +31,7 @@ struct stmmac_resources {
> >>  	int wol_irq;
> >>  	int lpi_irq;
> >>  	int irq;
> >> +	int sfty_irq;
> >>  	int sfty_ce_irq;
> >>  	int sfty_ue_irq;
> >>  	int rx_irq[MTL_MAX_RX_QUEUES];
> >> @@ -297,6 +298,7 @@ struct stmmac_priv {
> >>  	void __iomem *ptpaddr;
> >>  	void __iomem *estaddr;
> >>  	unsigned long active_vlans[BITS_TO_LONGS(VLAN_N_VID)];
> >> +	int sfty_irq;
> >>  	int sfty_ce_irq;
> >>  	int sfty_ue_irq;
> >>  	int rx_irq[MTL_MAX_RX_QUEUES];
> >> @@ -305,6 +307,7 @@ struct stmmac_priv {
> >>  	char int_name_mac[IFNAMSIZ + 9];
> >>  	char int_name_wol[IFNAMSIZ + 9];
> >>  	char int_name_lpi[IFNAMSIZ + 9];
> >> +	char int_name_sfty[IFNAMSIZ + 10];
> >>  	char int_name_sfty_ce[IFNAMSIZ + 10];
> >>  	char int_name_sfty_ue[IFNAMSIZ + 10];
> >>  	char int_name_rx_irq[MTL_MAX_TX_QUEUES][IFNAMSIZ + 14];
> >> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> >> index 47de466e432c..7d4e827dfeab 100644
> >> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> >> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> >> @@ -3592,6 +3592,10 @@ static void stmmac_free_irq(struct net_device *dev,
> >>  		if (priv->wol_irq > 0 && priv->wol_irq != dev->irq)
> >>  			free_irq(priv->wol_irq, dev);
> >>  		fallthrough;
> >> +	case REQ_IRQ_ERR_SFTY:
> >> +		if (priv->sfty_irq > 0 && priv->sfty_irq != dev->irq)
> >> +			free_irq(priv->sfty_irq, dev);
> >> +		fallthrough;
> >>  	case REQ_IRQ_ERR_WOL:
> >>  		free_irq(dev->irq, dev);
> >>  		fallthrough;
> >> @@ -3661,6 +3665,23 @@ static int stmmac_request_irq_multi_msi(struct net_device *dev)
> >>  		}
> >>  	}
> >>  
> >> +	/* Request the common Safety Feature Correctible/Uncorrectible
> >> +	 * Error line in case of another line is used
> >> +	 */
> >> +	if (priv->sfty_irq > 0 && priv->sfty_irq != dev->irq) {
> >> +		int_name = priv->int_name_sfty;
> >> +		sprintf(int_name, "%s:%s", dev->name, "safety");
> >> +		ret = request_irq(priv->sfty_irq, stmmac_safety_interrupt,
> >> +				  0, int_name, dev);
> >> +		if (unlikely(ret < 0)) {
> >> +			netdev_err(priv->dev,
> >> +				   "%s: alloc sfty MSI %d (error: %d)\n",
> >> +				   __func__, priv->sfty_irq, ret);
> >> +			irq_err = REQ_IRQ_ERR_SFTY;
> >> +			goto irq_error;
> >> +		}
> >> +	}
> >> +
> >>  	/* Request the Safety Feature Correctible Error line in
> >>  	 * case of another line is used
> >>  	 */
> >> @@ -3798,6 +3819,21 @@ static int stmmac_request_irq_single(struct net_device *dev)
> >>  		}
> >>  	}
> >>  
> >> +	/* Request the common Safety Feature Correctible/Uncorrectible
> >> +	 * Error line in case of another line is used
> >> +	 */
> >> +	if (priv->sfty_irq > 0 && priv->sfty_irq != dev->irq) {
> > 
> >> +		ret = request_irq(priv->sfty_irq, stmmac_safety_interrupt,
> >> +				  IRQF_SHARED, dev->name, dev);
> > 
> > Just noticed yesterday that stmmac_safety_interrupt() is also called
> > from the stmmac_interrupt() handler which is supposed to be registered
> > on the generic "mac" IRQ. Won't it cause races around the CSRs
> > (doubtfully but still worth to note) and the errors handling
> > (stmmac_global_err()) in case if both IRQs are raised simultaneously?
> > At the very least it looks suspicious and worth double-checking.
> > 
> > I also found out that nobody seemed to care that the same handler is
> > registered on MAC, WoL and LPI IRQ lines. Hmm, no race-related
> > problems have been reported so far for the platforms with separate
> > WoL/LPI IRQs. It's either a lucky coincident or the IRQs are always
> > assigned to the same CPU or the IRQs handle is indeed free of races.
> > In anyway it looks suspicious too. At the very least AFAICS the DMA
> > IRQ-handler is indeed racy on the status CSR access. It isn't
> > cleared-on-read, but write-one-to-clear. So the statistics might be
> > calculated more than once for the same CSR state. There might be some
> > other problems I failed to spot on the first glance.
> > 
> > David, Eric, Jacub, Paolo, your opinion about the note above?
> > 
> > -Serge(y)
> > 

> <Suraj> We are adding common IRQ similar to already present code for correcteable/uncorrecable https://elixir.bootlin.com/linux/latest/source/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c#L3592.

From that perspective your change in stmmac_request_irq_multi_msi() is
correct, but stmmac_request_irq_single() is another story. The first
one method implies assigning the individual IRQ handlers to all
available lines. The later method assigns the _common_ handler to all
the lines. The common handler already calls the Safety IRQ handler -
stmmac_safety_feat_interrupt(). So should the safety IRQ line is
separately available it's possible to have the Safety IRQ handlers
executed concurrently - in framework of the common IRQ events handling
(if safety IRQ is raised during the common IRQ being handled) and
individual Safety IRQ. It's prune to the race condition I pointed out
to in my message above. Did you consider that problem?

> Also, we need the sfty IRQ handling as soon as the fault occured & that can only be handled if we have handler attached with sfty IRQ.
> stmmac_interrupt() will only be triggerd when interrupt triggered for rx/tx packet .
> while registerting with sfty IRQ will get triggered as soon as emac HW detect the fault. 

Please read my comment more carefully. The safety IRQ can be raised
during the common IRQ handling, thus the
stmmac_safety_feat_interrupt() method might get to be concurrently
executed.

-Serge(y)

>    
> >> +		if (unlikely(ret < 0)) {
> >> +			netdev_err(priv->dev,
> >> +				   "%s: ERROR: allocating the sfty IRQ %d (%d)\n",
> >> +				   __func__, priv->sfty_irq, ret);
> >> +			irq_err = REQ_IRQ_ERR_SFTY;
> >> +			goto irq_error;
> >> +		}
> >> +	}
> >> +
> >>  	return 0;
> >>  
> >>  irq_error:
> >> @@ -7462,6 +7498,7 @@ int stmmac_dvr_probe(struct device *device,
> >>  	priv->dev->irq = res->irq;
> >>  	priv->wol_irq = res->wol_irq;
> >>  	priv->lpi_irq = res->lpi_irq;
> >> +	priv->sfty_irq = res->sfty_irq;
> >>  	priv->sfty_ce_irq = res->sfty_ce_irq;
> >>  	priv->sfty_ue_irq = res->sfty_ue_irq;
> >>  	for (i = 0; i < MTL_MAX_RX_QUEUES; i++)
> >> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
> >> index 70eadc83ca68..ab250161fd79 100644
> >> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
> >> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
> >> @@ -743,6 +743,14 @@ int stmmac_get_platform_resources(struct platform_device *pdev,
> >>  		dev_info(&pdev->dev, "IRQ eth_lpi not found\n");
> >>  	}
> >>  
> >> +	stmmac_res->sfty_irq =
> >> +		platform_get_irq_byname_optional(pdev, "sfty");
> >> +	if (stmmac_res->sfty_irq < 0) {
> >> +		if (stmmac_res->sfty_irq == -EPROBE_DEFER)
> >> +			return -EPROBE_DEFER;
> >> +		dev_info(&pdev->dev, "IRQ safety IRQ not found\n");
> >> +	}
> >> +
> >>  	stmmac_res->addr = devm_platform_ioremap_resource(pdev, 0);
> >>  
> >>  	return PTR_ERR_OR_ZERO(stmmac_res->addr);
> >> -- 
> >> 2.25.1
> >>
> >>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH net-next v8 3/3] net: stmmac: Add driver support for DWMAC5 common safety IRQ
  2023-12-22 14:35       ` Serge Semin
@ 2023-12-26 11:10         ` Suraj Jaiswal
  2023-12-27 11:03           ` Suraj Jaiswal
  0 siblings, 1 reply; 12+ messages in thread
From: Suraj Jaiswal @ 2023-12-26 11:10 UTC (permalink / raw)
  To: Serge Semin
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Vinod Koul, Bhupesh Sharma, Andy Gross, Bjorn Andersson,
	Konrad Dybcio, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Alexandre Torgue, Jose Abreu, Maxime Coquelin, netdev,
	linux-arm-msm, devicetree, linux-kernel, linux-stm32,
	Prasad Sodagudi, Andrew Halaney, Rob Herring, kernel

Hi seren
let me check below on test setup once & get back

Thanks
Suraj

On 12/22/2023 8:05 PM, Serge Semin wrote:
> On Fri, Dec 22, 2023 at 02:13:49PM +0530, Suraj Jaiswal wrote:
>> HI Serge,
>> please find commnet inline.
>>
>> Thanks
>> Suraj
>>
>> On 12/21/2023 6:19 PM, Serge Semin wrote:
>>> Hi Suraj
>>>
>>> On Thu, Dec 21, 2023 at 01:06:20PM +0530, Suraj Jaiswal wrote:
>>>> Add support to listen HW safety IRQ like ECC(error
>>>> correction code), DPP(data path parity), FSM(finite state
>>>> machine) fault in common IRQ line.
>>>>
>>>> Signed-off-by: Suraj Jaiswal <quic_jsuraj@quicinc.com>
>>>
>>> Thanks for taking my notes into account. One more comment is further
>>> below.
>>>
>>>> ---
>>>>  drivers/net/ethernet/stmicro/stmmac/common.h  |  1 +
>>>>  drivers/net/ethernet/stmicro/stmmac/stmmac.h  |  3 ++
>>>>  .../net/ethernet/stmicro/stmmac/stmmac_main.c | 37 +++++++++++++++++++
>>>>  .../ethernet/stmicro/stmmac/stmmac_platform.c |  8 ++++
>>>>  4 files changed, 49 insertions(+)
>>>>
>>>> diff --git a/drivers/net/ethernet/stmicro/stmmac/common.h b/drivers/net/ethernet/stmicro/stmmac/common.h
>>>> index 721c1f8e892f..b9233b09b80f 100644
>>>> --- a/drivers/net/ethernet/stmicro/stmmac/common.h
>>>> +++ b/drivers/net/ethernet/stmicro/stmmac/common.h
>>>> @@ -344,6 +344,7 @@ enum request_irq_err {
>>>>  	REQ_IRQ_ERR_ALL,
>>>>  	REQ_IRQ_ERR_TX,
>>>>  	REQ_IRQ_ERR_RX,
>>>> +	REQ_IRQ_ERR_SFTY,
>>>>  	REQ_IRQ_ERR_SFTY_UE,
>>>>  	REQ_IRQ_ERR_SFTY_CE,
>>>>  	REQ_IRQ_ERR_LPI,
>>>> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac.h b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
>>>> index 9f89acf31050..ca3d93851bed 100644
>>>> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac.h
>>>> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
>>>> @@ -31,6 +31,7 @@ struct stmmac_resources {
>>>>  	int wol_irq;
>>>>  	int lpi_irq;
>>>>  	int irq;
>>>> +	int sfty_irq;
>>>>  	int sfty_ce_irq;
>>>>  	int sfty_ue_irq;
>>>>  	int rx_irq[MTL_MAX_RX_QUEUES];
>>>> @@ -297,6 +298,7 @@ struct stmmac_priv {
>>>>  	void __iomem *ptpaddr;
>>>>  	void __iomem *estaddr;
>>>>  	unsigned long active_vlans[BITS_TO_LONGS(VLAN_N_VID)];
>>>> +	int sfty_irq;
>>>>  	int sfty_ce_irq;
>>>>  	int sfty_ue_irq;
>>>>  	int rx_irq[MTL_MAX_RX_QUEUES];
>>>> @@ -305,6 +307,7 @@ struct stmmac_priv {
>>>>  	char int_name_mac[IFNAMSIZ + 9];
>>>>  	char int_name_wol[IFNAMSIZ + 9];
>>>>  	char int_name_lpi[IFNAMSIZ + 9];
>>>> +	char int_name_sfty[IFNAMSIZ + 10];
>>>>  	char int_name_sfty_ce[IFNAMSIZ + 10];
>>>>  	char int_name_sfty_ue[IFNAMSIZ + 10];
>>>>  	char int_name_rx_irq[MTL_MAX_TX_QUEUES][IFNAMSIZ + 14];
>>>> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>>>> index 47de466e432c..7d4e827dfeab 100644
>>>> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>>>> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>>>> @@ -3592,6 +3592,10 @@ static void stmmac_free_irq(struct net_device *dev,
>>>>  		if (priv->wol_irq > 0 && priv->wol_irq != dev->irq)
>>>>  			free_irq(priv->wol_irq, dev);
>>>>  		fallthrough;
>>>> +	case REQ_IRQ_ERR_SFTY:
>>>> +		if (priv->sfty_irq > 0 && priv->sfty_irq != dev->irq)
>>>> +			free_irq(priv->sfty_irq, dev);
>>>> +		fallthrough;
>>>>  	case REQ_IRQ_ERR_WOL:
>>>>  		free_irq(dev->irq, dev);
>>>>  		fallthrough;
>>>> @@ -3661,6 +3665,23 @@ static int stmmac_request_irq_multi_msi(struct net_device *dev)
>>>>  		}
>>>>  	}
>>>>  
>>>> +	/* Request the common Safety Feature Correctible/Uncorrectible
>>>> +	 * Error line in case of another line is used
>>>> +	 */
>>>> +	if (priv->sfty_irq > 0 && priv->sfty_irq != dev->irq) {
>>>> +		int_name = priv->int_name_sfty;
>>>> +		sprintf(int_name, "%s:%s", dev->name, "safety");
>>>> +		ret = request_irq(priv->sfty_irq, stmmac_safety_interrupt,
>>>> +				  0, int_name, dev);
>>>> +		if (unlikely(ret < 0)) {
>>>> +			netdev_err(priv->dev,
>>>> +				   "%s: alloc sfty MSI %d (error: %d)\n",
>>>> +				   __func__, priv->sfty_irq, ret);
>>>> +			irq_err = REQ_IRQ_ERR_SFTY;
>>>> +			goto irq_error;
>>>> +		}
>>>> +	}
>>>> +
>>>>  	/* Request the Safety Feature Correctible Error line in
>>>>  	 * case of another line is used
>>>>  	 */
>>>> @@ -3798,6 +3819,21 @@ static int stmmac_request_irq_single(struct net_device *dev)
>>>>  		}
>>>>  	}
>>>>  
>>>> +	/* Request the common Safety Feature Correctible/Uncorrectible
>>>> +	 * Error line in case of another line is used
>>>> +	 */
>>>> +	if (priv->sfty_irq > 0 && priv->sfty_irq != dev->irq) {
>>>
>>>> +		ret = request_irq(priv->sfty_irq, stmmac_safety_interrupt,
>>>> +				  IRQF_SHARED, dev->name, dev);
>>>
>>> Just noticed yesterday that stmmac_safety_interrupt() is also called
>>> from the stmmac_interrupt() handler which is supposed to be registered
>>> on the generic "mac" IRQ. Won't it cause races around the CSRs
>>> (doubtfully but still worth to note) and the errors handling
>>> (stmmac_global_err()) in case if both IRQs are raised simultaneously?
>>> At the very least it looks suspicious and worth double-checking.
>>>
>>> I also found out that nobody seemed to care that the same handler is
>>> registered on MAC, WoL and LPI IRQ lines. Hmm, no race-related
>>> problems have been reported so far for the platforms with separate
>>> WoL/LPI IRQs. It's either a lucky coincident or the IRQs are always
>>> assigned to the same CPU or the IRQs handle is indeed free of races.
>>> In anyway it looks suspicious too. At the very least AFAICS the DMA
>>> IRQ-handler is indeed racy on the status CSR access. It isn't
>>> cleared-on-read, but write-one-to-clear. So the statistics might be
>>> calculated more than once for the same CSR state. There might be some
>>> other problems I failed to spot on the first glance.
>>>
>>> David, Eric, Jacub, Paolo, your opinion about the note above?
>>>
>>> -Serge(y)
>>>
> 
>> <Suraj> We are adding common IRQ similar to already present code for correcteable/uncorrecable https://elixir.bootlin.com/linux/latest/source/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c#L3592.
> 
> From that perspective your change in stmmac_request_irq_multi_msi() is
> correct, but stmmac_request_irq_single() is another story. The first
> one method implies assigning the individual IRQ handlers to all
> available lines. The later method assigns the _common_ handler to all
> the lines. The common handler already calls the Safety IRQ handler -
> stmmac_safety_feat_interrupt(). So should the safety IRQ line is
> separately available it's possible to have the Safety IRQ handlers
> executed concurrently - in framework of the common IRQ events handling
> (if safety IRQ is raised during the common IRQ being handled) and
> individual Safety IRQ. It's prune to the race condition I pointed out
> to in my message above. Did you consider that problem?
> 
>> Also, we need the sfty IRQ handling as soon as the fault occured & that can only be handled if we have handler attached with sfty IRQ.
>> stmmac_interrupt() will only be triggerd when interrupt triggered for rx/tx packet .
>> while registerting with sfty IRQ will get triggered as soon as emac HW detect the fault. 
> 
> Please read my comment more carefully. The safety IRQ can be raised
> during the common IRQ handling, thus the
> stmmac_safety_feat_interrupt() method might get to be concurrently
> executed.
> 
> -Serge(y)
> 
>>    
>>>> +		if (unlikely(ret < 0)) {
>>>> +			netdev_err(priv->dev,
>>>> +				   "%s: ERROR: allocating the sfty IRQ %d (%d)\n",
>>>> +				   __func__, priv->sfty_irq, ret);
>>>> +			irq_err = REQ_IRQ_ERR_SFTY;
>>>> +			goto irq_error;
>>>> +		}
>>>> +	}
>>>> +
>>>>  	return 0;
>>>>  
>>>>  irq_error:
>>>> @@ -7462,6 +7498,7 @@ int stmmac_dvr_probe(struct device *device,
>>>>  	priv->dev->irq = res->irq;
>>>>  	priv->wol_irq = res->wol_irq;
>>>>  	priv->lpi_irq = res->lpi_irq;
>>>> +	priv->sfty_irq = res->sfty_irq;
>>>>  	priv->sfty_ce_irq = res->sfty_ce_irq;
>>>>  	priv->sfty_ue_irq = res->sfty_ue_irq;
>>>>  	for (i = 0; i < MTL_MAX_RX_QUEUES; i++)
>>>> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
>>>> index 70eadc83ca68..ab250161fd79 100644
>>>> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
>>>> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
>>>> @@ -743,6 +743,14 @@ int stmmac_get_platform_resources(struct platform_device *pdev,
>>>>  		dev_info(&pdev->dev, "IRQ eth_lpi not found\n");
>>>>  	}
>>>>  
>>>> +	stmmac_res->sfty_irq =
>>>> +		platform_get_irq_byname_optional(pdev, "sfty");
>>>> +	if (stmmac_res->sfty_irq < 0) {
>>>> +		if (stmmac_res->sfty_irq == -EPROBE_DEFER)
>>>> +			return -EPROBE_DEFER;
>>>> +		dev_info(&pdev->dev, "IRQ safety IRQ not found\n");
>>>> +	}
>>>> +
>>>>  	stmmac_res->addr = devm_platform_ioremap_resource(pdev, 0);
>>>>  
>>>>  	return PTR_ERR_OR_ZERO(stmmac_res->addr);
>>>> -- 
>>>> 2.25.1
>>>>
>>>>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH net-next v8 3/3] net: stmmac: Add driver support for DWMAC5 common safety IRQ
  2023-12-26 11:10         ` Suraj Jaiswal
@ 2023-12-27 11:03           ` Suraj Jaiswal
  2024-01-07 19:53             ` Serge Semin
  0 siblings, 1 reply; 12+ messages in thread
From: Suraj Jaiswal @ 2023-12-27 11:03 UTC (permalink / raw)
  To: Serge Semin
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Vinod Koul, Bhupesh Sharma, Andy Gross, Bjorn Andersson,
	Konrad Dybcio, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Alexandre Torgue, Jose Abreu, Maxime Coquelin, netdev,
	linux-arm-msm, devicetree, linux-kernel, linux-stm32,
	Prasad Sodagudi, Andrew Halaney, Rob Herring, kernel

Hi Seren,
please find the updated comment .

Thanks
Suraj

On 12/26/2023 4:40 PM, Suraj Jaiswal wrote:
> Hi seren
> let me check below on test setup once & get back
> 
> Thanks
> Suraj
> 
> On 12/22/2023 8:05 PM, Serge Semin wrote:
>> On Fri, Dec 22, 2023 at 02:13:49PM +0530, Suraj Jaiswal wrote:
>>> HI Serge,
>>> please find commnet inline.
>>>
>>> Thanks
>>> Suraj
>>>
>>> On 12/21/2023 6:19 PM, Serge Semin wrote:
>>>> Hi Suraj
>>>>
>>>> On Thu, Dec 21, 2023 at 01:06:20PM +0530, Suraj Jaiswal wrote:
>>>>> Add support to listen HW safety IRQ like ECC(error
>>>>> correction code), DPP(data path parity), FSM(finite state
>>>>> machine) fault in common IRQ line.
>>>>>
>>>>> Signed-off-by: Suraj Jaiswal <quic_jsuraj@quicinc.com>
>>>>
>>>> Thanks for taking my notes into account. One more comment is further
>>>> below.
>>>>
>>>>> ---
>>>>>  drivers/net/ethernet/stmicro/stmmac/common.h  |  1 +
>>>>>  drivers/net/ethernet/stmicro/stmmac/stmmac.h  |  3 ++
>>>>>  .../net/ethernet/stmicro/stmmac/stmmac_main.c | 37 +++++++++++++++++++
>>>>>  .../ethernet/stmicro/stmmac/stmmac_platform.c |  8 ++++
>>>>>  4 files changed, 49 insertions(+)
>>>>>
>>>>> diff --git a/drivers/net/ethernet/stmicro/stmmac/common.h b/drivers/net/ethernet/stmicro/stmmac/common.h
>>>>> index 721c1f8e892f..b9233b09b80f 100644
>>>>> --- a/drivers/net/ethernet/stmicro/stmmac/common.h
>>>>> +++ b/drivers/net/ethernet/stmicro/stmmac/common.h
>>>>> @@ -344,6 +344,7 @@ enum request_irq_err {
>>>>>  	REQ_IRQ_ERR_ALL,
>>>>>  	REQ_IRQ_ERR_TX,
>>>>>  	REQ_IRQ_ERR_RX,
>>>>> +	REQ_IRQ_ERR_SFTY,
>>>>>  	REQ_IRQ_ERR_SFTY_UE,
>>>>>  	REQ_IRQ_ERR_SFTY_CE,
>>>>>  	REQ_IRQ_ERR_LPI,
>>>>> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac.h b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
>>>>> index 9f89acf31050..ca3d93851bed 100644
>>>>> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac.h
>>>>> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
>>>>> @@ -31,6 +31,7 @@ struct stmmac_resources {
>>>>>  	int wol_irq;
>>>>>  	int lpi_irq;
>>>>>  	int irq;
>>>>> +	int sfty_irq;
>>>>>  	int sfty_ce_irq;
>>>>>  	int sfty_ue_irq;
>>>>>  	int rx_irq[MTL_MAX_RX_QUEUES];
>>>>> @@ -297,6 +298,7 @@ struct stmmac_priv {
>>>>>  	void __iomem *ptpaddr;
>>>>>  	void __iomem *estaddr;
>>>>>  	unsigned long active_vlans[BITS_TO_LONGS(VLAN_N_VID)];
>>>>> +	int sfty_irq;
>>>>>  	int sfty_ce_irq;
>>>>>  	int sfty_ue_irq;
>>>>>  	int rx_irq[MTL_MAX_RX_QUEUES];
>>>>> @@ -305,6 +307,7 @@ struct stmmac_priv {
>>>>>  	char int_name_mac[IFNAMSIZ + 9];
>>>>>  	char int_name_wol[IFNAMSIZ + 9];
>>>>>  	char int_name_lpi[IFNAMSIZ + 9];
>>>>> +	char int_name_sfty[IFNAMSIZ + 10];
>>>>>  	char int_name_sfty_ce[IFNAMSIZ + 10];
>>>>>  	char int_name_sfty_ue[IFNAMSIZ + 10];
>>>>>  	char int_name_rx_irq[MTL_MAX_TX_QUEUES][IFNAMSIZ + 14];
>>>>> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>>>>> index 47de466e432c..7d4e827dfeab 100644
>>>>> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>>>>> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>>>>> @@ -3592,6 +3592,10 @@ static void stmmac_free_irq(struct net_device *dev,
>>>>>  		if (priv->wol_irq > 0 && priv->wol_irq != dev->irq)
>>>>>  			free_irq(priv->wol_irq, dev);
>>>>>  		fallthrough;
>>>>> +	case REQ_IRQ_ERR_SFTY:
>>>>> +		if (priv->sfty_irq > 0 && priv->sfty_irq != dev->irq)
>>>>> +			free_irq(priv->sfty_irq, dev);
>>>>> +		fallthrough;
>>>>>  	case REQ_IRQ_ERR_WOL:
>>>>>  		free_irq(dev->irq, dev);
>>>>>  		fallthrough;
>>>>> @@ -3661,6 +3665,23 @@ static int stmmac_request_irq_multi_msi(struct net_device *dev)
>>>>>  		}
>>>>>  	}
>>>>>  
>>>>> +	/* Request the common Safety Feature Correctible/Uncorrectible
>>>>> +	 * Error line in case of another line is used
>>>>> +	 */
>>>>> +	if (priv->sfty_irq > 0 && priv->sfty_irq != dev->irq) {
>>>>> +		int_name = priv->int_name_sfty;
>>>>> +		sprintf(int_name, "%s:%s", dev->name, "safety");
>>>>> +		ret = request_irq(priv->sfty_irq, stmmac_safety_interrupt,
>>>>> +				  0, int_name, dev);
>>>>> +		if (unlikely(ret < 0)) {
>>>>> +			netdev_err(priv->dev,
>>>>> +				   "%s: alloc sfty MSI %d (error: %d)\n",
>>>>> +				   __func__, priv->sfty_irq, ret);
>>>>> +			irq_err = REQ_IRQ_ERR_SFTY;
>>>>> +			goto irq_error;
>>>>> +		}
>>>>> +	}
>>>>> +
>>>>>  	/* Request the Safety Feature Correctible Error line in
>>>>>  	 * case of another line is used
>>>>>  	 */
>>>>> @@ -3798,6 +3819,21 @@ static int stmmac_request_irq_single(struct net_device *dev)
>>>>>  		}
>>>>>  	}
>>>>>  
>>>>> +	/* Request the common Safety Feature Correctible/Uncorrectible
>>>>> +	 * Error line in case of another line is used
>>>>> +	 */
>>>>> +	if (priv->sfty_irq > 0 && priv->sfty_irq != dev->irq) {
>>>>
>>>>> +		ret = request_irq(priv->sfty_irq, stmmac_safety_interrupt,
>>>>> +				  IRQF_SHARED, dev->name, dev);
>>>>
>>>> Just noticed yesterday that stmmac_safety_interrupt() is also called
>>>> from the stmmac_interrupt() handler which is supposed to be registered
>>>> on the generic "mac" IRQ. Won't it cause races around the CSRs
>>>> (doubtfully but still worth to note) and the errors handling
>>>> (stmmac_global_err()) in case if both IRQs are raised simultaneously?
>>>> At the very least it looks suspicious and worth double-checking.
>>>>
>>>> I also found out that nobody seemed to care that the same handler is
>>>> registered on MAC, WoL and LPI IRQ lines. Hmm, no race-related
>>>> problems have been reported so far for the platforms with separate
>>>> WoL/LPI IRQs. It's either a lucky coincident or the IRQs are always
>>>> assigned to the same CPU or the IRQs handle is indeed free of races.
>>>> In anyway it looks suspicious too. At the very least AFAICS the DMA
>>>> IRQ-handler is indeed racy on the status CSR access. It isn't
>>>> cleared-on-read, but write-one-to-clear. So the statistics might be
>>>> calculated more than once for the same CSR state. There might be some
>>>> other problems I failed to spot on the first glance.
>>>>
>>>> David, Eric, Jacub, Paolo, your opinion about the note above?
>>>>
>>>> -Serge(y)
>>>>
>>
>>> <Suraj> We are adding common IRQ similar to already present code for correcteable/uncorrecable https://elixir.bootlin.com/linux/latest/source/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c#L3592.
>>
>> From that perspective your change in stmmac_request_irq_multi_msi() is
>> correct, but stmmac_request_irq_single() is another story. The first
>> one method implies assigning the individual IRQ handlers to all
>> available lines. The later method assigns the _common_ handler to all
>> the lines. The common handler already calls the Safety IRQ handler -
>> stmmac_safety_feat_interrupt(). So should the safety IRQ line is
>> separately available it's possible to have the Safety IRQ handlers
>> executed concurrently - in framework of the common IRQ events handling
>> (if safety IRQ is raised during the common IRQ being handled) and
>> individual Safety IRQ. It's prune to the race condition I pointed out
>> to in my message above. Did you consider that problem?
>>
>>> Also, we need the sfty IRQ handling as soon as the fault occured & that can only be handled if we have handler attached with sfty IRQ.
>>> stmmac_interrupt() will only be triggerd when interrupt triggered for rx/tx packet .
>>> while registerting with sfty IRQ will get triggered as soon as emac HW detect the fault. 
>>
>> Please read my comment more carefully. The safety IRQ can be raised
>> during the common IRQ handling, thus the
>> stmmac_safety_feat_interrupt() method might get to be concurrently
>> executed.
>>
>> -Serge(y)
>>
<Suraj> Have testing this on device . We have added print in the both the places stmmac_interrupt() as well as sfty interrupt handler.
We can see that sfty interrupt handler is getting triggred first & stmmac_safety_feat_interrupt () code added in stmmac_intterupt() is not getting triggred because looks like interrupt status bit register is already getting cleared as part of sfty interrupt handler. So it looks good . Please let us know if any further comment. 

Please find the log below .


/ # [ 1505.602173] sj: stmmac_safety_interrupt from sfty IRQ handler
[ 1505.607274] qcom-ethqos 23040000.ethernet eth1: Found correctable error in MTL: 'RXCES: MTL RX Memory Error'
[ 1505.617395] sj: stmmac_safety_interrupt from sfty IRQ handler
[ 1505.622494] qcom-ethqos 23040000.ethernet eth1: Found correctable error in MTL: 'TXCES: MTL TX Memory Error'
[ 1505.888913] sj: stmmac_safety_interrupt from sfty IRQ handler
[ 1505.894010] qcom-ethqos 23040000.ethernet eth1: Found correctable error in MTL: 'RXCES: MTL RX Memory Error'
[ 1506.605821] sj: stmmac_safety_interrupt from sfty IRQ handler
[ 1506.610919] qcom-ethqos 23040000.ethernet eth1: Found correctable error in MTL: 'RXCES: MTL RX Memory Error'
[ 1506.621034] sj: stmmac_safety_interrupt from sfty IRQ handler
[ 1506.626131] qcom-ethqos 23040000.ethernet eth1: Found correctable error in MTL: 'TXCES: MTL TX Memory Error'
[ 1507.613036] sj: stmmac_safety_interrupt from sfty IRQ handler
[ 1507.618133] qcom-ethqos 23040000.ethernet eth1: Found correctable error in MTL: 'RXCES: MTL RX Memory Error'
[ 1507.628249] sj: stmmac_safety_interrupt from sfty IRQ handler
[ 1507.633346] qcom-ethqos 23040000.ethernet eth1: Found correctable error in MTL: 'TXCES: MTL TX Memory Error'
[ 1508.619034] sj: stmmac_safety_interrupt from sfty IRQ handler
[ 1508.624132] qcom-ethqos 23040000.ethernet eth1: Found correctable error in MTL: 'RXCES: MTL RX Memory Error'
[ 1508.634245] sj: stmmac_safety_interrupt from sfty IRQ handler
[ 1508.639343] qcom-ethqos 23040000.ethernet eth1: Found correctable error in MTL: 'TXCES: MTL TX Memory Error'
[ 1509.631151] sj: stmmac_safety_interrupt from sfty IRQ handler
[ 1509.636249] qcom-ethqos 23040000.ethernet eth1: Found correctable error in MTL: 'RXCES: MTL RX Memory Error'

>>>    
>>>>> +		if (unlikely(ret < 0)) {
>>>>> +			netdev_err(priv->dev,
>>>>> +				   "%s: ERROR: allocating the sfty IRQ %d (%d)\n",
>>>>> +				   __func__, priv->sfty_irq, ret);
>>>>> +			irq_err = REQ_IRQ_ERR_SFTY;
>>>>> +			goto irq_error;
>>>>> +		}
>>>>> +	}
>>>>> +
>>>>>  	return 0;
>>>>>  
>>>>>  irq_error:
>>>>> @@ -7462,6 +7498,7 @@ int stmmac_dvr_probe(struct device *device,
>>>>>  	priv->dev->irq = res->irq;
>>>>>  	priv->wol_irq = res->wol_irq;
>>>>>  	priv->lpi_irq = res->lpi_irq;
>>>>> +	priv->sfty_irq = res->sfty_irq;
>>>>>  	priv->sfty_ce_irq = res->sfty_ce_irq;
>>>>>  	priv->sfty_ue_irq = res->sfty_ue_irq;
>>>>>  	for (i = 0; i < MTL_MAX_RX_QUEUES; i++)
>>>>> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
>>>>> index 70eadc83ca68..ab250161fd79 100644
>>>>> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
>>>>> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
>>>>> @@ -743,6 +743,14 @@ int stmmac_get_platform_resources(struct platform_device *pdev,
>>>>>  		dev_info(&pdev->dev, "IRQ eth_lpi not found\n");
>>>>>  	}
>>>>>  
>>>>> +	stmmac_res->sfty_irq =
>>>>> +		platform_get_irq_byname_optional(pdev, "sfty");
>>>>> +	if (stmmac_res->sfty_irq < 0) {
>>>>> +		if (stmmac_res->sfty_irq == -EPROBE_DEFER)
>>>>> +			return -EPROBE_DEFER;
>>>>> +		dev_info(&pdev->dev, "IRQ safety IRQ not found\n");
>>>>> +	}
>>>>> +
>>>>>  	stmmac_res->addr = devm_platform_ioremap_resource(pdev, 0);
>>>>>  
>>>>>  	return PTR_ERR_OR_ZERO(stmmac_res->addr);
>>>>> -- 
>>>>> 2.25.1
>>>>>
>>>>>
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH net-next v8 3/3] net: stmmac: Add driver support for DWMAC5 common safety IRQ
  2023-12-27 11:03           ` Suraj Jaiswal
@ 2024-01-07 19:53             ` Serge Semin
  2024-01-08 10:57               ` Suraj Jaiswal
  0 siblings, 1 reply; 12+ messages in thread
From: Serge Semin @ 2024-01-07 19:53 UTC (permalink / raw)
  To: Suraj Jaiswal
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Vinod Koul, Bhupesh Sharma, Andy Gross, Bjorn Andersson,
	Konrad Dybcio, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Alexandre Torgue, Jose Abreu, Maxime Coquelin, netdev,
	linux-arm-msm, devicetree, linux-kernel, linux-stm32,
	Prasad Sodagudi, Andrew Halaney, Rob Herring, kernel

On Wed, Dec 27, 2023 at 04:33:33PM +0530, Suraj Jaiswal wrote:
> Hi Seren,
> please find the updated comment .
> 
> Thanks
> Suraj
> 
> On 12/26/2023 4:40 PM, Suraj Jaiswal wrote:
> > Hi seren
> > let me check below on test setup once & get back
> > 
> > Thanks
> > Suraj
> > 
> > On 12/22/2023 8:05 PM, Serge Semin wrote:
> >> On Fri, Dec 22, 2023 at 02:13:49PM +0530, Suraj Jaiswal wrote:
> >>> HI Serge,
> >>> please find commnet inline.
> >>>
> >>> Thanks
> >>> Suraj
> >>>
> >>> On 12/21/2023 6:19 PM, Serge Semin wrote:
> >>>> Hi Suraj
> >>>>
> >>>> On Thu, Dec 21, 2023 at 01:06:20PM +0530, Suraj Jaiswal wrote:
> >>>>> Add support to listen HW safety IRQ like ECC(error
> >>>>> correction code), DPP(data path parity), FSM(finite state
> >>>>> machine) fault in common IRQ line.
> >>>>>
> >>>>> Signed-off-by: Suraj Jaiswal <quic_jsuraj@quicinc.com>
> >>>>
> >>>> Thanks for taking my notes into account. One more comment is further
> >>>> below.
> >>>>
> >>>>> ---
> >>>>>  drivers/net/ethernet/stmicro/stmmac/common.h  |  1 +
> >>>>>  drivers/net/ethernet/stmicro/stmmac/stmmac.h  |  3 ++
> >>>>>  .../net/ethernet/stmicro/stmmac/stmmac_main.c | 37 +++++++++++++++++++
> >>>>>  .../ethernet/stmicro/stmmac/stmmac_platform.c |  8 ++++
> >>>>>  4 files changed, 49 insertions(+)
> >>>>>
> >>>>> diff --git a/drivers/net/ethernet/stmicro/stmmac/common.h b/drivers/net/ethernet/stmicro/stmmac/common.h
> >>>>> index 721c1f8e892f..b9233b09b80f 100644
> >>>>> --- a/drivers/net/ethernet/stmicro/stmmac/common.h
> >>>>> +++ b/drivers/net/ethernet/stmicro/stmmac/common.h
> >>>>> @@ -344,6 +344,7 @@ enum request_irq_err {
> >>>>>  	REQ_IRQ_ERR_ALL,
> >>>>>  	REQ_IRQ_ERR_TX,
> >>>>>  	REQ_IRQ_ERR_RX,
> >>>>> +	REQ_IRQ_ERR_SFTY,
> >>>>>  	REQ_IRQ_ERR_SFTY_UE,
> >>>>>  	REQ_IRQ_ERR_SFTY_CE,
> >>>>>  	REQ_IRQ_ERR_LPI,
> >>>>> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac.h b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
> >>>>> index 9f89acf31050..ca3d93851bed 100644
> >>>>> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac.h
> >>>>> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
> >>>>> @@ -31,6 +31,7 @@ struct stmmac_resources {
> >>>>>  	int wol_irq;
> >>>>>  	int lpi_irq;
> >>>>>  	int irq;
> >>>>> +	int sfty_irq;
> >>>>>  	int sfty_ce_irq;
> >>>>>  	int sfty_ue_irq;
> >>>>>  	int rx_irq[MTL_MAX_RX_QUEUES];
> >>>>> @@ -297,6 +298,7 @@ struct stmmac_priv {
> >>>>>  	void __iomem *ptpaddr;
> >>>>>  	void __iomem *estaddr;
> >>>>>  	unsigned long active_vlans[BITS_TO_LONGS(VLAN_N_VID)];
> >>>>> +	int sfty_irq;
> >>>>>  	int sfty_ce_irq;
> >>>>>  	int sfty_ue_irq;
> >>>>>  	int rx_irq[MTL_MAX_RX_QUEUES];
> >>>>> @@ -305,6 +307,7 @@ struct stmmac_priv {
> >>>>>  	char int_name_mac[IFNAMSIZ + 9];
> >>>>>  	char int_name_wol[IFNAMSIZ + 9];
> >>>>>  	char int_name_lpi[IFNAMSIZ + 9];
> >>>>> +	char int_name_sfty[IFNAMSIZ + 10];
> >>>>>  	char int_name_sfty_ce[IFNAMSIZ + 10];
> >>>>>  	char int_name_sfty_ue[IFNAMSIZ + 10];
> >>>>>  	char int_name_rx_irq[MTL_MAX_TX_QUEUES][IFNAMSIZ + 14];
> >>>>> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> >>>>> index 47de466e432c..7d4e827dfeab 100644
> >>>>> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> >>>>> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> >>>>> @@ -3592,6 +3592,10 @@ static void stmmac_free_irq(struct net_device *dev,
> >>>>>  		if (priv->wol_irq > 0 && priv->wol_irq != dev->irq)
> >>>>>  			free_irq(priv->wol_irq, dev);
> >>>>>  		fallthrough;
> >>>>> +	case REQ_IRQ_ERR_SFTY:
> >>>>> +		if (priv->sfty_irq > 0 && priv->sfty_irq != dev->irq)
> >>>>> +			free_irq(priv->sfty_irq, dev);
> >>>>> +		fallthrough;
> >>>>>  	case REQ_IRQ_ERR_WOL:
> >>>>>  		free_irq(dev->irq, dev);
> >>>>>  		fallthrough;
> >>>>> @@ -3661,6 +3665,23 @@ static int stmmac_request_irq_multi_msi(struct net_device *dev)
> >>>>>  		}
> >>>>>  	}
> >>>>>  
> >>>>> +	/* Request the common Safety Feature Correctible/Uncorrectible
> >>>>> +	 * Error line in case of another line is used
> >>>>> +	 */
> >>>>> +	if (priv->sfty_irq > 0 && priv->sfty_irq != dev->irq) {
> >>>>> +		int_name = priv->int_name_sfty;
> >>>>> +		sprintf(int_name, "%s:%s", dev->name, "safety");
> >>>>> +		ret = request_irq(priv->sfty_irq, stmmac_safety_interrupt,
> >>>>> +				  0, int_name, dev);
> >>>>> +		if (unlikely(ret < 0)) {
> >>>>> +			netdev_err(priv->dev,
> >>>>> +				   "%s: alloc sfty MSI %d (error: %d)\n",
> >>>>> +				   __func__, priv->sfty_irq, ret);
> >>>>> +			irq_err = REQ_IRQ_ERR_SFTY;
> >>>>> +			goto irq_error;
> >>>>> +		}
> >>>>> +	}
> >>>>> +
> >>>>>  	/* Request the Safety Feature Correctible Error line in
> >>>>>  	 * case of another line is used
> >>>>>  	 */
> >>>>> @@ -3798,6 +3819,21 @@ static int stmmac_request_irq_single(struct net_device *dev)
> >>>>>  		}
> >>>>>  	}
> >>>>>  
> >>>>> +	/* Request the common Safety Feature Correctible/Uncorrectible
> >>>>> +	 * Error line in case of another line is used
> >>>>> +	 */
> >>>>> +	if (priv->sfty_irq > 0 && priv->sfty_irq != dev->irq) {
> >>>>
> >>>>> +		ret = request_irq(priv->sfty_irq, stmmac_safety_interrupt,
> >>>>> +				  IRQF_SHARED, dev->name, dev);
> >>>>
> >>>> Just noticed yesterday that stmmac_safety_interrupt() is also called
> >>>> from the stmmac_interrupt() handler which is supposed to be registered
> >>>> on the generic "mac" IRQ. Won't it cause races around the CSRs
> >>>> (doubtfully but still worth to note) and the errors handling
> >>>> (stmmac_global_err()) in case if both IRQs are raised simultaneously?
> >>>> At the very least it looks suspicious and worth double-checking.
> >>>>
> >>>> I also found out that nobody seemed to care that the same handler is
> >>>> registered on MAC, WoL and LPI IRQ lines. Hmm, no race-related
> >>>> problems have been reported so far for the platforms with separate
> >>>> WoL/LPI IRQs. It's either a lucky coincident or the IRQs are always
> >>>> assigned to the same CPU or the IRQs handle is indeed free of races.
> >>>> In anyway it looks suspicious too. At the very least AFAICS the DMA
> >>>> IRQ-handler is indeed racy on the status CSR access. It isn't
> >>>> cleared-on-read, but write-one-to-clear. So the statistics might be
> >>>> calculated more than once for the same CSR state. There might be some
> >>>> other problems I failed to spot on the first glance.
> >>>>
> >>>> David, Eric, Jacub, Paolo, your opinion about the note above?
> >>>>
> >>>> -Serge(y)
> >>>>
> >>
> >>> <Suraj> We are adding common IRQ similar to already present code for correcteable/uncorrecable https://elixir.bootlin.com/linux/latest/source/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c#L3592.
> >>
> >> From that perspective your change in stmmac_request_irq_multi_msi() is
> >> correct, but stmmac_request_irq_single() is another story. The first
> >> one method implies assigning the individual IRQ handlers to all
> >> available lines. The later method assigns the _common_ handler to all
> >> the lines. The common handler already calls the Safety IRQ handler -
> >> stmmac_safety_feat_interrupt(). So should the safety IRQ line is
> >> separately available it's possible to have the Safety IRQ handlers
> >> executed concurrently - in framework of the common IRQ events handling
> >> (if safety IRQ is raised during the common IRQ being handled) and
> >> individual Safety IRQ. It's prune to the race condition I pointed out
> >> to in my message above. Did you consider that problem?
> >>
> >>> Also, we need the sfty IRQ handling as soon as the fault occured & that can only be handled if we have handler attached with sfty IRQ.
> >>> stmmac_interrupt() will only be triggerd when interrupt triggered for rx/tx packet .
> >>> while registerting with sfty IRQ will get triggered as soon as emac HW detect the fault. 
> >>
> >> Please read my comment more carefully. The safety IRQ can be raised
> >> during the common IRQ handling, thus the
> >> stmmac_safety_feat_interrupt() method might get to be concurrently
> >> executed.
> >>
> >> -Serge(y)
> >>
> <Suraj> Have testing this on device . We have added print in the both the places stmmac_interrupt() as well as sfty interrupt handler.
> We can see that sfty interrupt handler is getting triggred first & stmmac_safety_feat_interrupt () code added in stmmac_intterupt() is not getting triggred because looks like interrupt status bit register is already getting cleared as part of sfty interrupt handler. So it looks good . Please let us know if any further comment. 
> 
> Please find the log below .
> 
> 
> / # [ 1505.602173] sj: stmmac_safety_interrupt from sfty IRQ handler
> [ 1505.607274] qcom-ethqos 23040000.ethernet eth1: Found correctable error in MTL: 'RXCES: MTL RX Memory Error'
> [ 1505.617395] sj: stmmac_safety_interrupt from sfty IRQ handler
> [ 1505.622494] qcom-ethqos 23040000.ethernet eth1: Found correctable error in MTL: 'TXCES: MTL TX Memory Error'
> [ 1505.888913] sj: stmmac_safety_interrupt from sfty IRQ handler
> [ 1505.894010] qcom-ethqos 23040000.ethernet eth1: Found correctable error in MTL: 'RXCES: MTL RX Memory Error'
> [ 1506.605821] sj: stmmac_safety_interrupt from sfty IRQ handler
> [ 1506.610919] qcom-ethqos 23040000.ethernet eth1: Found correctable error in MTL: 'RXCES: MTL RX Memory Error'
> [ 1506.621034] sj: stmmac_safety_interrupt from sfty IRQ handler
> [ 1506.626131] qcom-ethqos 23040000.ethernet eth1: Found correctable error in MTL: 'TXCES: MTL TX Memory Error'
> [ 1507.613036] sj: stmmac_safety_interrupt from sfty IRQ handler
> [ 1507.618133] qcom-ethqos 23040000.ethernet eth1: Found correctable error in MTL: 'RXCES: MTL RX Memory Error'
> [ 1507.628249] sj: stmmac_safety_interrupt from sfty IRQ handler
> [ 1507.633346] qcom-ethqos 23040000.ethernet eth1: Found correctable error in MTL: 'TXCES: MTL TX Memory Error'
> [ 1508.619034] sj: stmmac_safety_interrupt from sfty IRQ handler
> [ 1508.624132] qcom-ethqos 23040000.ethernet eth1: Found correctable error in MTL: 'RXCES: MTL RX Memory Error'
> [ 1508.634245] sj: stmmac_safety_interrupt from sfty IRQ handler
> [ 1508.639343] qcom-ethqos 23040000.ethernet eth1: Found correctable error in MTL: 'TXCES: MTL TX Memory Error'
> [ 1509.631151] sj: stmmac_safety_interrupt from sfty IRQ handler
> [ 1509.636249] qcom-ethqos 23040000.ethernet eth1: Found correctable error in MTL: 'RXCES: MTL RX Memory Error'
> 

The log and the way you were trying to model out the problem don't
prove that the race condition doesn't exist. They just indicate that
your test-case doesn't catch the simultaneous MAC and Safety IRQs
handling.

Moreover AFAICS from the way the stmmac_ops->safety_feat_irq_status()
callbacks are defined in DW QoS Eth and DW XGMAC modules, the race is
there. Both
dwmac5_safety_feat_irq_status()
and
dwxgmac3_safety_feat_irq_status()
get to read the MTL and DMA Safety Interrupts Status register in order
to check whether the Correctable/Uncorrectable errors have actually
happened. After that the respective MAC, MTL or DMA error handlers are
called, which get to clear the IRQs statue by reading and then writing
the respective MAC DPP FRM, MTL/DMA ECC IRQ status registers. So if
the stmmac_safety_feat_interrupt() method is concurrently called the
driver at the very least may end up with printing the errors twice.

-Serge(y)

> >>>    
> >>>>> +		if (unlikely(ret < 0)) {
> >>>>> +			netdev_err(priv->dev,
> >>>>> +				   "%s: ERROR: allocating the sfty IRQ %d (%d)\n",
> >>>>> +				   __func__, priv->sfty_irq, ret);
> >>>>> +			irq_err = REQ_IRQ_ERR_SFTY;
> >>>>> +			goto irq_error;
> >>>>> +		}
> >>>>> +	}
> >>>>> +
> >>>>>  	return 0;
> >>>>>  
> >>>>>  irq_error:
> >>>>> @@ -7462,6 +7498,7 @@ int stmmac_dvr_probe(struct device *device,
> >>>>>  	priv->dev->irq = res->irq;
> >>>>>  	priv->wol_irq = res->wol_irq;
> >>>>>  	priv->lpi_irq = res->lpi_irq;
> >>>>> +	priv->sfty_irq = res->sfty_irq;
> >>>>>  	priv->sfty_ce_irq = res->sfty_ce_irq;
> >>>>>  	priv->sfty_ue_irq = res->sfty_ue_irq;
> >>>>>  	for (i = 0; i < MTL_MAX_RX_QUEUES; i++)
> >>>>> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
> >>>>> index 70eadc83ca68..ab250161fd79 100644
> >>>>> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
> >>>>> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
> >>>>> @@ -743,6 +743,14 @@ int stmmac_get_platform_resources(struct platform_device *pdev,
> >>>>>  		dev_info(&pdev->dev, "IRQ eth_lpi not found\n");
> >>>>>  	}
> >>>>>  
> >>>>> +	stmmac_res->sfty_irq =
> >>>>> +		platform_get_irq_byname_optional(pdev, "sfty");
> >>>>> +	if (stmmac_res->sfty_irq < 0) {
> >>>>> +		if (stmmac_res->sfty_irq == -EPROBE_DEFER)
> >>>>> +			return -EPROBE_DEFER;
> >>>>> +		dev_info(&pdev->dev, "IRQ safety IRQ not found\n");
> >>>>> +	}
> >>>>> +
> >>>>>  	stmmac_res->addr = devm_platform_ioremap_resource(pdev, 0);
> >>>>>  
> >>>>>  	return PTR_ERR_OR_ZERO(stmmac_res->addr);
> >>>>> -- 
> >>>>> 2.25.1
> >>>>>
> >>>>>
> > 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH net-next v8 3/3] net: stmmac: Add driver support for DWMAC5 common safety IRQ
  2024-01-07 19:53             ` Serge Semin
@ 2024-01-08 10:57               ` Suraj Jaiswal
  2024-01-09 21:40                 ` Serge Semin
  0 siblings, 1 reply; 12+ messages in thread
From: Suraj Jaiswal @ 2024-01-08 10:57 UTC (permalink / raw)
  To: Serge Semin
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Vinod Koul, Bhupesh Sharma, Andy Gross, Bjorn Andersson,
	Konrad Dybcio, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Alexandre Torgue, Jose Abreu, Maxime Coquelin, netdev,
	linux-arm-msm, devicetree, linux-kernel, linux-stm32,
	Prasad Sodagudi, Andrew Halaney, Rob Herring, kernel

Hi Seren,
Please find updated comment.

Thanks
Suraj

On 1/8/2024 1:23 AM, Serge Semin wrote:
> On Wed, Dec 27, 2023 at 04:33:33PM +0530, Suraj Jaiswal wrote:
>> Hi Seren,
>> please find the updated comment .
>>
>> Thanks
>> Suraj
>>
>> On 12/26/2023 4:40 PM, Suraj Jaiswal wrote:
>>> Hi seren
>>> let me check below on test setup once & get back
>>>
>>> Thanks
>>> Suraj
>>>
>>> On 12/22/2023 8:05 PM, Serge Semin wrote:
>>>> On Fri, Dec 22, 2023 at 02:13:49PM +0530, Suraj Jaiswal wrote:
>>>>> HI Serge,
>>>>> please find commnet inline.
>>>>>
>>>>> Thanks
>>>>> Suraj
>>>>>
>>>>> On 12/21/2023 6:19 PM, Serge Semin wrote:
>>>>>> Hi Suraj
>>>>>>
>>>>>> On Thu, Dec 21, 2023 at 01:06:20PM +0530, Suraj Jaiswal wrote:
>>>>>>> Add support to listen HW safety IRQ like ECC(error
>>>>>>> correction code), DPP(data path parity), FSM(finite state
>>>>>>> machine) fault in common IRQ line.
>>>>>>>
>>>>>>> Signed-off-by: Suraj Jaiswal <quic_jsuraj@quicinc.com>
>>>>>>
>>>>>> Thanks for taking my notes into account. One more comment is further
>>>>>> below.
>>>>>>
>>>>>>> ---
>>>>>>>  drivers/net/ethernet/stmicro/stmmac/common.h  |  1 +
>>>>>>>  drivers/net/ethernet/stmicro/stmmac/stmmac.h  |  3 ++
>>>>>>>  .../net/ethernet/stmicro/stmmac/stmmac_main.c | 37 +++++++++++++++++++
>>>>>>>  .../ethernet/stmicro/stmmac/stmmac_platform.c |  8 ++++
>>>>>>>  4 files changed, 49 insertions(+)
>>>>>>>
>>>>>>> diff --git a/drivers/net/ethernet/stmicro/stmmac/common.h b/drivers/net/ethernet/stmicro/stmmac/common.h
>>>>>>> index 721c1f8e892f..b9233b09b80f 100644
>>>>>>> --- a/drivers/net/ethernet/stmicro/stmmac/common.h
>>>>>>> +++ b/drivers/net/ethernet/stmicro/stmmac/common.h
>>>>>>> @@ -344,6 +344,7 @@ enum request_irq_err {
>>>>>>>  	REQ_IRQ_ERR_ALL,
>>>>>>>  	REQ_IRQ_ERR_TX,
>>>>>>>  	REQ_IRQ_ERR_RX,
>>>>>>> +	REQ_IRQ_ERR_SFTY,
>>>>>>>  	REQ_IRQ_ERR_SFTY_UE,
>>>>>>>  	REQ_IRQ_ERR_SFTY_CE,
>>>>>>>  	REQ_IRQ_ERR_LPI,
>>>>>>> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac.h b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
>>>>>>> index 9f89acf31050..ca3d93851bed 100644
>>>>>>> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac.h
>>>>>>> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
>>>>>>> @@ -31,6 +31,7 @@ struct stmmac_resources {
>>>>>>>  	int wol_irq;
>>>>>>>  	int lpi_irq;
>>>>>>>  	int irq;
>>>>>>> +	int sfty_irq;
>>>>>>>  	int sfty_ce_irq;
>>>>>>>  	int sfty_ue_irq;
>>>>>>>  	int rx_irq[MTL_MAX_RX_QUEUES];
>>>>>>> @@ -297,6 +298,7 @@ struct stmmac_priv {
>>>>>>>  	void __iomem *ptpaddr;
>>>>>>>  	void __iomem *estaddr;
>>>>>>>  	unsigned long active_vlans[BITS_TO_LONGS(VLAN_N_VID)];
>>>>>>> +	int sfty_irq;
>>>>>>>  	int sfty_ce_irq;
>>>>>>>  	int sfty_ue_irq;
>>>>>>>  	int rx_irq[MTL_MAX_RX_QUEUES];
>>>>>>> @@ -305,6 +307,7 @@ struct stmmac_priv {
>>>>>>>  	char int_name_mac[IFNAMSIZ + 9];
>>>>>>>  	char int_name_wol[IFNAMSIZ + 9];
>>>>>>>  	char int_name_lpi[IFNAMSIZ + 9];
>>>>>>> +	char int_name_sfty[IFNAMSIZ + 10];
>>>>>>>  	char int_name_sfty_ce[IFNAMSIZ + 10];
>>>>>>>  	char int_name_sfty_ue[IFNAMSIZ + 10];
>>>>>>>  	char int_name_rx_irq[MTL_MAX_TX_QUEUES][IFNAMSIZ + 14];
>>>>>>> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>>>>>>> index 47de466e432c..7d4e827dfeab 100644
>>>>>>> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>>>>>>> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>>>>>>> @@ -3592,6 +3592,10 @@ static void stmmac_free_irq(struct net_device *dev,
>>>>>>>  		if (priv->wol_irq > 0 && priv->wol_irq != dev->irq)
>>>>>>>  			free_irq(priv->wol_irq, dev);
>>>>>>>  		fallthrough;
>>>>>>> +	case REQ_IRQ_ERR_SFTY:
>>>>>>> +		if (priv->sfty_irq > 0 && priv->sfty_irq != dev->irq)
>>>>>>> +			free_irq(priv->sfty_irq, dev);
>>>>>>> +		fallthrough;
>>>>>>>  	case REQ_IRQ_ERR_WOL:
>>>>>>>  		free_irq(dev->irq, dev);
>>>>>>>  		fallthrough;
>>>>>>> @@ -3661,6 +3665,23 @@ static int stmmac_request_irq_multi_msi(struct net_device *dev)
>>>>>>>  		}
>>>>>>>  	}
>>>>>>>  
>>>>>>> +	/* Request the common Safety Feature Correctible/Uncorrectible
>>>>>>> +	 * Error line in case of another line is used
>>>>>>> +	 */
>>>>>>> +	if (priv->sfty_irq > 0 && priv->sfty_irq != dev->irq) {
>>>>>>> +		int_name = priv->int_name_sfty;
>>>>>>> +		sprintf(int_name, "%s:%s", dev->name, "safety");
>>>>>>> +		ret = request_irq(priv->sfty_irq, stmmac_safety_interrupt,
>>>>>>> +				  0, int_name, dev);
>>>>>>> +		if (unlikely(ret < 0)) {
>>>>>>> +			netdev_err(priv->dev,
>>>>>>> +				   "%s: alloc sfty MSI %d (error: %d)\n",
>>>>>>> +				   __func__, priv->sfty_irq, ret);
>>>>>>> +			irq_err = REQ_IRQ_ERR_SFTY;
>>>>>>> +			goto irq_error;
>>>>>>> +		}
>>>>>>> +	}
>>>>>>> +
>>>>>>>  	/* Request the Safety Feature Correctible Error line in
>>>>>>>  	 * case of another line is used
>>>>>>>  	 */
>>>>>>> @@ -3798,6 +3819,21 @@ static int stmmac_request_irq_single(struct net_device *dev)
>>>>>>>  		}
>>>>>>>  	}
>>>>>>>  
>>>>>>> +	/* Request the common Safety Feature Correctible/Uncorrectible
>>>>>>> +	 * Error line in case of another line is used
>>>>>>> +	 */
>>>>>>> +	if (priv->sfty_irq > 0 && priv->sfty_irq != dev->irq) {
>>>>>>
>>>>>>> +		ret = request_irq(priv->sfty_irq, stmmac_safety_interrupt,
>>>>>>> +				  IRQF_SHARED, dev->name, dev);
>>>>>>
>>>>>> Just noticed yesterday that stmmac_safety_interrupt() is also called
>>>>>> from the stmmac_interrupt() handler which is supposed to be registered
>>>>>> on the generic "mac" IRQ. Won't it cause races around the CSRs
>>>>>> (doubtfully but still worth to note) and the errors handling
>>>>>> (stmmac_global_err()) in case if both IRQs are raised simultaneously?
>>>>>> At the very least it looks suspicious and worth double-checking.
>>>>>>
>>>>>> I also found out that nobody seemed to care that the same handler is
>>>>>> registered on MAC, WoL and LPI IRQ lines. Hmm, no race-related
>>>>>> problems have been reported so far for the platforms with separate
>>>>>> WoL/LPI IRQs. It's either a lucky coincident or the IRQs are always
>>>>>> assigned to the same CPU or the IRQs handle is indeed free of races.
>>>>>> In anyway it looks suspicious too. At the very least AFAICS the DMA
>>>>>> IRQ-handler is indeed racy on the status CSR access. It isn't
>>>>>> cleared-on-read, but write-one-to-clear. So the statistics might be
>>>>>> calculated more than once for the same CSR state. There might be some
>>>>>> other problems I failed to spot on the first glance.
>>>>>>
>>>>>> David, Eric, Jacub, Paolo, your opinion about the note above?
>>>>>>
>>>>>> -Serge(y)
>>>>>>
>>>>
>>>>> <Suraj> We are adding common IRQ similar to already present code for correcteable/uncorrecable https://elixir.bootlin.com/linux/latest/source/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c#L3592.
>>>>
>>>> From that perspective your change in stmmac_request_irq_multi_msi() is
>>>> correct, but stmmac_request_irq_single() is another story. The first
>>>> one method implies assigning the individual IRQ handlers to all
>>>> available lines. The later method assigns the _common_ handler to all
>>>> the lines. The common handler already calls the Safety IRQ handler -
>>>> stmmac_safety_feat_interrupt(). So should the safety IRQ line is
>>>> separately available it's possible to have the Safety IRQ handlers
>>>> executed concurrently - in framework of the common IRQ events handling
>>>> (if safety IRQ is raised during the common IRQ being handled) and
>>>> individual Safety IRQ. It's prune to the race condition I pointed out
>>>> to in my message above. Did you consider that problem?
>>>>
>>>>> Also, we need the sfty IRQ handling as soon as the fault occured & that can only be handled if we have handler attached with sfty IRQ.
>>>>> stmmac_interrupt() will only be triggerd when interrupt triggered for rx/tx packet .
>>>>> while registerting with sfty IRQ will get triggered as soon as emac HW detect the fault. 
>>>>
>>>> Please read my comment more carefully. The safety IRQ can be raised
>>>> during the common IRQ handling, thus the
>>>> stmmac_safety_feat_interrupt() method might get to be concurrently
>>>> executed.
>>>>
>>>> -Serge(y)
>>>>
>> <Suraj> Have testing this on device . We have added print in the both the places stmmac_interrupt() as well as sfty interrupt handler.
>> We can see that sfty interrupt handler is getting triggred first & stmmac_safety_feat_interrupt () code added in stmmac_intterupt() is not getting triggred because looks like interrupt status bit register is already getting cleared as part of sfty interrupt handler. So it looks good . Please let us know if any further comment. 
>>
>> Please find the log below .
>>
>>
>> / # [ 1505.602173] sj: stmmac_safety_interrupt from sfty IRQ handler
>> [ 1505.607274] qcom-ethqos 23040000.ethernet eth1: Found correctable error in MTL: 'RXCES: MTL RX Memory Error'
>> [ 1505.617395] sj: stmmac_safety_interrupt from sfty IRQ handler
>> [ 1505.622494] qcom-ethqos 23040000.ethernet eth1: Found correctable error in MTL: 'TXCES: MTL TX Memory Error'
>> [ 1505.888913] sj: stmmac_safety_interrupt from sfty IRQ handler
>> [ 1505.894010] qcom-ethqos 23040000.ethernet eth1: Found correctable error in MTL: 'RXCES: MTL RX Memory Error'
>> [ 1506.605821] sj: stmmac_safety_interrupt from sfty IRQ handler
>> [ 1506.610919] qcom-ethqos 23040000.ethernet eth1: Found correctable error in MTL: 'RXCES: MTL RX Memory Error'
>> [ 1506.621034] sj: stmmac_safety_interrupt from sfty IRQ handler
>> [ 1506.626131] qcom-ethqos 23040000.ethernet eth1: Found correctable error in MTL: 'TXCES: MTL TX Memory Error'
>> [ 1507.613036] sj: stmmac_safety_interrupt from sfty IRQ handler
>> [ 1507.618133] qcom-ethqos 23040000.ethernet eth1: Found correctable error in MTL: 'RXCES: MTL RX Memory Error'
>> [ 1507.628249] sj: stmmac_safety_interrupt from sfty IRQ handler
>> [ 1507.633346] qcom-ethqos 23040000.ethernet eth1: Found correctable error in MTL: 'TXCES: MTL TX Memory Error'
>> [ 1508.619034] sj: stmmac_safety_interrupt from sfty IRQ handler
>> [ 1508.624132] qcom-ethqos 23040000.ethernet eth1: Found correctable error in MTL: 'RXCES: MTL RX Memory Error'
>> [ 1508.634245] sj: stmmac_safety_interrupt from sfty IRQ handler
>> [ 1508.639343] qcom-ethqos 23040000.ethernet eth1: Found correctable error in MTL: 'TXCES: MTL TX Memory Error'
>> [ 1509.631151] sj: stmmac_safety_interrupt from sfty IRQ handler
>> [ 1509.636249] qcom-ethqos 23040000.ethernet eth1: Found correctable error in MTL: 'RXCES: MTL RX Memory Error'
>>
> 
> The log and the way you were trying to model out the problem don't
> prove that the race condition doesn't exist. They just indicate that
> your test-case doesn't catch the simultaneous MAC and Safety IRQs
> handling.
> 
> Moreover AFAICS from the way the stmmac_ops->safety_feat_irq_status()
> callbacks are defined in DW QoS Eth and DW XGMAC modules, the race is
> there. Both
> dwmac5_safety_feat_irq_status()
> and
> dwxgmac3_safety_feat_irq_status()
> get to read the MTL and DMA Safety Interrupts Status register in order
> to check whether the Correctable/Uncorrectable errors have actually
> happened. After that the respective MAC, MTL or DMA error handlers are
> called, which get to clear the IRQs statue by reading and then writing
> the respective MAC DPP FRM, MTL/DMA ECC IRQ status registers. So if
> the stmmac_safety_feat_interrupt() method is concurrently called the
> driver at the very least may end up with printing the errors twice.
> 
> -Serge(y)
> 
<Suraj> We did not see issue reported 2 time in the verfication. 
Also, we can add below change to completetly avoid call of sfty hadling as part of stmmac interrupt if irq is already defined like
below . Let me if below looks good .

static irqreturn_t stmmac_interrupt(int irq, void *dev_id)
{
	struct net_device *dev = (struct net_device *)dev_id;
	struct stmmac_priv *priv = netdev_priv(dev);

	/* Check if adapter is up */
	if (test_bit(STMMAC_DOWN, &priv->state))
		return IRQ_HANDLED;

	+ if (priv->sfty_irq <=0) {
		/* Check if a fatal error happened */
		if (stmmac_safety_feat_interrupt(priv))
			return IRQ_HANDLED;
	+ }
	/* To handle Common interrupts */
	stmmac_common_interrupt(priv);

	/* To handle DMA interrupts */
	stmmac_dma_interrupt(priv);

	return IRQ_HANDLED;
}

>>>>>    
>>>>>>> +		if (unlikely(ret < 0)) {
>>>>>>> +			netdev_err(priv->dev,
>>>>>>> +				   "%s: ERROR: allocating the sfty IRQ %d (%d)\n",
>>>>>>> +				   __func__, priv->sfty_irq, ret);
>>>>>>> +			irq_err = REQ_IRQ_ERR_SFTY;
>>>>>>> +			goto irq_error;
>>>>>>> +		}
>>>>>>> +	}
>>>>>>> +
>>>>>>>  	return 0;
>>>>>>>  
>>>>>>>  irq_error:
>>>>>>> @@ -7462,6 +7498,7 @@ int stmmac_dvr_probe(struct device *device,
>>>>>>>  	priv->dev->irq = res->irq;
>>>>>>>  	priv->wol_irq = res->wol_irq;
>>>>>>>  	priv->lpi_irq = res->lpi_irq;
>>>>>>> +	priv->sfty_irq = res->sfty_irq;
>>>>>>>  	priv->sfty_ce_irq = res->sfty_ce_irq;
>>>>>>>  	priv->sfty_ue_irq = res->sfty_ue_irq;
>>>>>>>  	for (i = 0; i < MTL_MAX_RX_QUEUES; i++)
>>>>>>> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
>>>>>>> index 70eadc83ca68..ab250161fd79 100644
>>>>>>> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
>>>>>>> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
>>>>>>> @@ -743,6 +743,14 @@ int stmmac_get_platform_resources(struct platform_device *pdev,
>>>>>>>  		dev_info(&pdev->dev, "IRQ eth_lpi not found\n");
>>>>>>>  	}
>>>>>>>  
>>>>>>> +	stmmac_res->sfty_irq =
>>>>>>> +		platform_get_irq_byname_optional(pdev, "sfty");
>>>>>>> +	if (stmmac_res->sfty_irq < 0) {
>>>>>>> +		if (stmmac_res->sfty_irq == -EPROBE_DEFER)
>>>>>>> +			return -EPROBE_DEFER;
>>>>>>> +		dev_info(&pdev->dev, "IRQ safety IRQ not found\n");
>>>>>>> +	}
>>>>>>> +
>>>>>>>  	stmmac_res->addr = devm_platform_ioremap_resource(pdev, 0);
>>>>>>>  
>>>>>>>  	return PTR_ERR_OR_ZERO(stmmac_res->addr);
>>>>>>> -- 
>>>>>>> 2.25.1
>>>>>>>
>>>>>>>
>>>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH net-next v8 3/3] net: stmmac: Add driver support for DWMAC5 common safety IRQ
  2024-01-08 10:57               ` Suraj Jaiswal
@ 2024-01-09 21:40                 ` Serge Semin
  0 siblings, 0 replies; 12+ messages in thread
From: Serge Semin @ 2024-01-09 21:40 UTC (permalink / raw)
  To: Suraj Jaiswal
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Vinod Koul, Bhupesh Sharma, Andy Gross, Bjorn Andersson,
	Konrad Dybcio, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Alexandre Torgue, Jose Abreu, Maxime Coquelin, netdev,
	linux-arm-msm, devicetree, linux-kernel, linux-stm32,
	Prasad Sodagudi, Andrew Halaney, Rob Herring, kernel

On Mon, Jan 08, 2024 at 04:27:57PM +0530, Suraj Jaiswal wrote:
> Hi Seren,
> Please find updated comment.
> 
> Thanks
> Suraj
> 
> On 1/8/2024 1:23 AM, Serge Semin wrote:
> > On Wed, Dec 27, 2023 at 04:33:33PM +0530, Suraj Jaiswal wrote:
> >> Hi Seren,
> >> please find the updated comment .
> >>
> >> Thanks
> >> Suraj
> >>
> >> On 12/26/2023 4:40 PM, Suraj Jaiswal wrote:
> >>> Hi seren
> >>> let me check below on test setup once & get back
> >>>
> >>> Thanks
> >>> Suraj
> >>>
> >>> On 12/22/2023 8:05 PM, Serge Semin wrote:
> >>>> On Fri, Dec 22, 2023 at 02:13:49PM +0530, Suraj Jaiswal wrote:
> >>>>> HI Serge,
> >>>>> please find commnet inline.
> >>>>>
> >>>>> Thanks
> >>>>> Suraj
> >>>>>
> >>>>> On 12/21/2023 6:19 PM, Serge Semin wrote:
> >>>>>> Hi Suraj
> >>>>>>
> >>>>>> On Thu, Dec 21, 2023 at 01:06:20PM +0530, Suraj Jaiswal wrote:
> >>>>>>> Add support to listen HW safety IRQ like ECC(error
> >>>>>>> correction code), DPP(data path parity), FSM(finite state
> >>>>>>> machine) fault in common IRQ line.
> >>>>>>>
> >>>>>>> Signed-off-by: Suraj Jaiswal <quic_jsuraj@quicinc.com>
> >>>>>>
> >>>>>> Thanks for taking my notes into account. One more comment is further
> >>>>>> below.
> >>>>>>
> >>>>>>> ---
> >>>>>>>  drivers/net/ethernet/stmicro/stmmac/common.h  |  1 +
> >>>>>>>  drivers/net/ethernet/stmicro/stmmac/stmmac.h  |  3 ++
> >>>>>>>  .../net/ethernet/stmicro/stmmac/stmmac_main.c | 37 +++++++++++++++++++
> >>>>>>>  .../ethernet/stmicro/stmmac/stmmac_platform.c |  8 ++++
> >>>>>>>  4 files changed, 49 insertions(+)
> >>>>>>>
> >>>>>>> diff --git a/drivers/net/ethernet/stmicro/stmmac/common.h b/drivers/net/ethernet/stmicro/stmmac/common.h
> >>>>>>> index 721c1f8e892f..b9233b09b80f 100644
> >>>>>>> --- a/drivers/net/ethernet/stmicro/stmmac/common.h
> >>>>>>> +++ b/drivers/net/ethernet/stmicro/stmmac/common.h
> >>>>>>> @@ -344,6 +344,7 @@ enum request_irq_err {
> >>>>>>>  	REQ_IRQ_ERR_ALL,
> >>>>>>>  	REQ_IRQ_ERR_TX,
> >>>>>>>  	REQ_IRQ_ERR_RX,
> >>>>>>> +	REQ_IRQ_ERR_SFTY,
> >>>>>>>  	REQ_IRQ_ERR_SFTY_UE,
> >>>>>>>  	REQ_IRQ_ERR_SFTY_CE,
> >>>>>>>  	REQ_IRQ_ERR_LPI,
> >>>>>>> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac.h b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
> >>>>>>> index 9f89acf31050..ca3d93851bed 100644
> >>>>>>> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac.h
> >>>>>>> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
> >>>>>>> @@ -31,6 +31,7 @@ struct stmmac_resources {
> >>>>>>>  	int wol_irq;
> >>>>>>>  	int lpi_irq;
> >>>>>>>  	int irq;
> >>>>>>> +	int sfty_irq;
> >>>>>>>  	int sfty_ce_irq;
> >>>>>>>  	int sfty_ue_irq;
> >>>>>>>  	int rx_irq[MTL_MAX_RX_QUEUES];
> >>>>>>> @@ -297,6 +298,7 @@ struct stmmac_priv {
> >>>>>>>  	void __iomem *ptpaddr;
> >>>>>>>  	void __iomem *estaddr;
> >>>>>>>  	unsigned long active_vlans[BITS_TO_LONGS(VLAN_N_VID)];
> >>>>>>> +	int sfty_irq;
> >>>>>>>  	int sfty_ce_irq;
> >>>>>>>  	int sfty_ue_irq;
> >>>>>>>  	int rx_irq[MTL_MAX_RX_QUEUES];
> >>>>>>> @@ -305,6 +307,7 @@ struct stmmac_priv {
> >>>>>>>  	char int_name_mac[IFNAMSIZ + 9];
> >>>>>>>  	char int_name_wol[IFNAMSIZ + 9];
> >>>>>>>  	char int_name_lpi[IFNAMSIZ + 9];
> >>>>>>> +	char int_name_sfty[IFNAMSIZ + 10];
> >>>>>>>  	char int_name_sfty_ce[IFNAMSIZ + 10];
> >>>>>>>  	char int_name_sfty_ue[IFNAMSIZ + 10];
> >>>>>>>  	char int_name_rx_irq[MTL_MAX_TX_QUEUES][IFNAMSIZ + 14];
> >>>>>>> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> >>>>>>> index 47de466e432c..7d4e827dfeab 100644
> >>>>>>> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> >>>>>>> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> >>>>>>> @@ -3592,6 +3592,10 @@ static void stmmac_free_irq(struct net_device *dev,
> >>>>>>>  		if (priv->wol_irq > 0 && priv->wol_irq != dev->irq)
> >>>>>>>  			free_irq(priv->wol_irq, dev);
> >>>>>>>  		fallthrough;
> >>>>>>> +	case REQ_IRQ_ERR_SFTY:
> >>>>>>> +		if (priv->sfty_irq > 0 && priv->sfty_irq != dev->irq)
> >>>>>>> +			free_irq(priv->sfty_irq, dev);
> >>>>>>> +		fallthrough;
> >>>>>>>  	case REQ_IRQ_ERR_WOL:
> >>>>>>>  		free_irq(dev->irq, dev);
> >>>>>>>  		fallthrough;
> >>>>>>> @@ -3661,6 +3665,23 @@ static int stmmac_request_irq_multi_msi(struct net_device *dev)
> >>>>>>>  		}
> >>>>>>>  	}
> >>>>>>>  
> >>>>>>> +	/* Request the common Safety Feature Correctible/Uncorrectible
> >>>>>>> +	 * Error line in case of another line is used
> >>>>>>> +	 */
> >>>>>>> +	if (priv->sfty_irq > 0 && priv->sfty_irq != dev->irq) {
> >>>>>>> +		int_name = priv->int_name_sfty;
> >>>>>>> +		sprintf(int_name, "%s:%s", dev->name, "safety");
> >>>>>>> +		ret = request_irq(priv->sfty_irq, stmmac_safety_interrupt,
> >>>>>>> +				  0, int_name, dev);
> >>>>>>> +		if (unlikely(ret < 0)) {
> >>>>>>> +			netdev_err(priv->dev,
> >>>>>>> +				   "%s: alloc sfty MSI %d (error: %d)\n",
> >>>>>>> +				   __func__, priv->sfty_irq, ret);
> >>>>>>> +			irq_err = REQ_IRQ_ERR_SFTY;
> >>>>>>> +			goto irq_error;
> >>>>>>> +		}
> >>>>>>> +	}
> >>>>>>> +
> >>>>>>>  	/* Request the Safety Feature Correctible Error line in
> >>>>>>>  	 * case of another line is used
> >>>>>>>  	 */
> >>>>>>> @@ -3798,6 +3819,21 @@ static int stmmac_request_irq_single(struct net_device *dev)
> >>>>>>>  		}
> >>>>>>>  	}
> >>>>>>>  
> >>>>>>> +	/* Request the common Safety Feature Correctible/Uncorrectible
> >>>>>>> +	 * Error line in case of another line is used
> >>>>>>> +	 */
> >>>>>>> +	if (priv->sfty_irq > 0 && priv->sfty_irq != dev->irq) {
> >>>>>>
> >>>>>>> +		ret = request_irq(priv->sfty_irq, stmmac_safety_interrupt,
> >>>>>>> +				  IRQF_SHARED, dev->name, dev);
> >>>>>>
> >>>>>> Just noticed yesterday that stmmac_safety_interrupt() is also called
> >>>>>> from the stmmac_interrupt() handler which is supposed to be registered
> >>>>>> on the generic "mac" IRQ. Won't it cause races around the CSRs
> >>>>>> (doubtfully but still worth to note) and the errors handling
> >>>>>> (stmmac_global_err()) in case if both IRQs are raised simultaneously?
> >>>>>> At the very least it looks suspicious and worth double-checking.
> >>>>>>
> >>>>>> I also found out that nobody seemed to care that the same handler is
> >>>>>> registered on MAC, WoL and LPI IRQ lines. Hmm, no race-related
> >>>>>> problems have been reported so far for the platforms with separate
> >>>>>> WoL/LPI IRQs. It's either a lucky coincident or the IRQs are always
> >>>>>> assigned to the same CPU or the IRQs handle is indeed free of races.
> >>>>>> In anyway it looks suspicious too. At the very least AFAICS the DMA
> >>>>>> IRQ-handler is indeed racy on the status CSR access. It isn't
> >>>>>> cleared-on-read, but write-one-to-clear. So the statistics might be
> >>>>>> calculated more than once for the same CSR state. There might be some
> >>>>>> other problems I failed to spot on the first glance.
> >>>>>>
> >>>>>> David, Eric, Jacub, Paolo, your opinion about the note above?
> >>>>>>
> >>>>>> -Serge(y)
> >>>>>>
> >>>>
> >>>>> <Suraj> We are adding common IRQ similar to already present code for correcteable/uncorrecable https://elixir.bootlin.com/linux/latest/source/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c#L3592.
> >>>>
> >>>> From that perspective your change in stmmac_request_irq_multi_msi() is
> >>>> correct, but stmmac_request_irq_single() is another story. The first
> >>>> one method implies assigning the individual IRQ handlers to all
> >>>> available lines. The later method assigns the _common_ handler to all
> >>>> the lines. The common handler already calls the Safety IRQ handler -
> >>>> stmmac_safety_feat_interrupt(). So should the safety IRQ line is
> >>>> separately available it's possible to have the Safety IRQ handlers
> >>>> executed concurrently - in framework of the common IRQ events handling
> >>>> (if safety IRQ is raised during the common IRQ being handled) and
> >>>> individual Safety IRQ. It's prune to the race condition I pointed out
> >>>> to in my message above. Did you consider that problem?
> >>>>
> >>>>> Also, we need the sfty IRQ handling as soon as the fault occured & that can only be handled if we have handler attached with sfty IRQ.
> >>>>> stmmac_interrupt() will only be triggerd when interrupt triggered for rx/tx packet .
> >>>>> while registerting with sfty IRQ will get triggered as soon as emac HW detect the fault. 
> >>>>
> >>>> Please read my comment more carefully. The safety IRQ can be raised
> >>>> during the common IRQ handling, thus the
> >>>> stmmac_safety_feat_interrupt() method might get to be concurrently
> >>>> executed.
> >>>>
> >>>> -Serge(y)
> >>>>
> >> <Suraj> Have testing this on device . We have added print in the both the places stmmac_interrupt() as well as sfty interrupt handler.
> >> We can see that sfty interrupt handler is getting triggred first & stmmac_safety_feat_interrupt () code added in stmmac_intterupt() is not getting triggred because looks like interrupt status bit register is already getting cleared as part of sfty interrupt handler. So it looks good . Please let us know if any further comment. 
> >>
> >> Please find the log below .
> >>
> >>
> >> / # [ 1505.602173] sj: stmmac_safety_interrupt from sfty IRQ handler
> >> [ 1505.607274] qcom-ethqos 23040000.ethernet eth1: Found correctable error in MTL: 'RXCES: MTL RX Memory Error'
> >> [ 1505.617395] sj: stmmac_safety_interrupt from sfty IRQ handler
> >> [ 1505.622494] qcom-ethqos 23040000.ethernet eth1: Found correctable error in MTL: 'TXCES: MTL TX Memory Error'
> >> [ 1505.888913] sj: stmmac_safety_interrupt from sfty IRQ handler
> >> [ 1505.894010] qcom-ethqos 23040000.ethernet eth1: Found correctable error in MTL: 'RXCES: MTL RX Memory Error'
> >> [ 1506.605821] sj: stmmac_safety_interrupt from sfty IRQ handler
> >> [ 1506.610919] qcom-ethqos 23040000.ethernet eth1: Found correctable error in MTL: 'RXCES: MTL RX Memory Error'
> >> [ 1506.621034] sj: stmmac_safety_interrupt from sfty IRQ handler
> >> [ 1506.626131] qcom-ethqos 23040000.ethernet eth1: Found correctable error in MTL: 'TXCES: MTL TX Memory Error'
> >> [ 1507.613036] sj: stmmac_safety_interrupt from sfty IRQ handler
> >> [ 1507.618133] qcom-ethqos 23040000.ethernet eth1: Found correctable error in MTL: 'RXCES: MTL RX Memory Error'
> >> [ 1507.628249] sj: stmmac_safety_interrupt from sfty IRQ handler
> >> [ 1507.633346] qcom-ethqos 23040000.ethernet eth1: Found correctable error in MTL: 'TXCES: MTL TX Memory Error'
> >> [ 1508.619034] sj: stmmac_safety_interrupt from sfty IRQ handler
> >> [ 1508.624132] qcom-ethqos 23040000.ethernet eth1: Found correctable error in MTL: 'RXCES: MTL RX Memory Error'
> >> [ 1508.634245] sj: stmmac_safety_interrupt from sfty IRQ handler
> >> [ 1508.639343] qcom-ethqos 23040000.ethernet eth1: Found correctable error in MTL: 'TXCES: MTL TX Memory Error'
> >> [ 1509.631151] sj: stmmac_safety_interrupt from sfty IRQ handler
> >> [ 1509.636249] qcom-ethqos 23040000.ethernet eth1: Found correctable error in MTL: 'RXCES: MTL RX Memory Error'
> >>
> > 
> > The log and the way you were trying to model out the problem don't
> > prove that the race condition doesn't exist. They just indicate that
> > your test-case doesn't catch the simultaneous MAC and Safety IRQs
> > handling.
> > 
> > Moreover AFAICS from the way the stmmac_ops->safety_feat_irq_status()
> > callbacks are defined in DW QoS Eth and DW XGMAC modules, the race is
> > there. Both
> > dwmac5_safety_feat_irq_status()
> > and
> > dwxgmac3_safety_feat_irq_status()
> > get to read the MTL and DMA Safety Interrupts Status register in order
> > to check whether the Correctable/Uncorrectable errors have actually
> > happened. After that the respective MAC, MTL or DMA error handlers are
> > called, which get to clear the IRQs statue by reading and then writing
> > the respective MAC DPP FRM, MTL/DMA ECC IRQ status registers. So if
> > the stmmac_safety_feat_interrupt() method is concurrently called the
> > driver at the very least may end up with printing the errors twice.
> > 
> > -Serge(y)
> > 
> <Suraj> We did not see issue reported 2 time in the verfication. 
> Also, we can add below change to completetly avoid call of sfty hadling as part of stmmac interrupt if irq is already defined like
> below . Let me if below looks good .
> 

> static irqreturn_t stmmac_interrupt(int irq, void *dev_id)
> {
> 	struct net_device *dev = (struct net_device *)dev_id;
> 	struct stmmac_priv *priv = netdev_priv(dev);
> 
> 	/* Check if adapter is up */
> 	if (test_bit(STMMAC_DOWN, &priv->state))
> 		return IRQ_HANDLED;
> 
> 	+ if (priv->sfty_irq <=0) {
> 		/* Check if a fatal error happened */
> 		if (stmmac_safety_feat_interrupt(priv))
> 			return IRQ_HANDLED;
> 	+ }
> 	/* To handle Common interrupts */
> 	stmmac_common_interrupt(priv);
> 
> 	/* To handle DMA interrupts */
> 	stmmac_dma_interrupt(priv);
> 
> 	return IRQ_HANDLED;
> }

This isn't ideal but at least it will prevent the race condition.

Since the solution isn't that much elegant let's make it a bit more
verbose:
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
  		return IRQ_HANDLED;
 
-	/* Check if a fatal error happened */
-	if (stmmac_safety_feat_interrupt(priv))
+	/* Check ASP error if it isn't delivered via an individual IRQ */
+	if (priv->sfty_irq <= 0 && stmmac_safety_feat_interrupt(priv))
 		return IRQ_HANDLED;
 
 	/* To handle Common interrupts */

The ideal solution would be to refactor the entire IRQs handling code
by converting the common MAC IRQ handler to calling the respective
event handlers only if no individual IRQ was specified, joining in the
multi and single IRQs request/handling methods, dropping the
STMMAC_FLAG_MULTI_IRQ_EN flag, etc. To be honest with no all the
Synopsys DW *MAC IP-core HW manuals at hands it won't be an easy task
with a high risk to break stuff or create code even more complicated
than it already is.

-Serge(y)

> 
> >>>>>    
> >>>>>>> +		if (unlikely(ret < 0)) {
> >>>>>>> +			netdev_err(priv->dev,
> >>>>>>> +				   "%s: ERROR: allocating the sfty IRQ %d (%d)\n",
> >>>>>>> +				   __func__, priv->sfty_irq, ret);
> >>>>>>> +			irq_err = REQ_IRQ_ERR_SFTY;
> >>>>>>> +			goto irq_error;
> >>>>>>> +		}
> >>>>>>> +	}
> >>>>>>> +
> >>>>>>>  	return 0;
> >>>>>>>  
> >>>>>>>  irq_error:
> >>>>>>> @@ -7462,6 +7498,7 @@ int stmmac_dvr_probe(struct device *device,
> >>>>>>>  	priv->dev->irq = res->irq;
> >>>>>>>  	priv->wol_irq = res->wol_irq;
> >>>>>>>  	priv->lpi_irq = res->lpi_irq;
> >>>>>>> +	priv->sfty_irq = res->sfty_irq;
> >>>>>>>  	priv->sfty_ce_irq = res->sfty_ce_irq;
> >>>>>>>  	priv->sfty_ue_irq = res->sfty_ue_irq;
> >>>>>>>  	for (i = 0; i < MTL_MAX_RX_QUEUES; i++)
> >>>>>>> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
> >>>>>>> index 70eadc83ca68..ab250161fd79 100644
> >>>>>>> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
> >>>>>>> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
> >>>>>>> @@ -743,6 +743,14 @@ int stmmac_get_platform_resources(struct platform_device *pdev,
> >>>>>>>  		dev_info(&pdev->dev, "IRQ eth_lpi not found\n");
> >>>>>>>  	}
> >>>>>>>  
> >>>>>>> +	stmmac_res->sfty_irq =
> >>>>>>> +		platform_get_irq_byname_optional(pdev, "sfty");
> >>>>>>> +	if (stmmac_res->sfty_irq < 0) {
> >>>>>>> +		if (stmmac_res->sfty_irq == -EPROBE_DEFER)
> >>>>>>> +			return -EPROBE_DEFER;
> >>>>>>> +		dev_info(&pdev->dev, "IRQ safety IRQ not found\n");
> >>>>>>> +	}
> >>>>>>> +
> >>>>>>>  	stmmac_res->addr = devm_platform_ioremap_resource(pdev, 0);
> >>>>>>>  
> >>>>>>>  	return PTR_ERR_OR_ZERO(stmmac_res->addr);
> >>>>>>> -- 
> >>>>>>> 2.25.1
> >>>>>>>
> >>>>>>>
> >>>

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2024-01-09 21:40 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-12-21  7:36 [PATCH net-next v8 0/3] Ethernet DWMAC5 fault IRQ support Suraj Jaiswal
2023-12-21  7:36 ` [PATCH net-next v8 1/3] dt-bindings: net: qcom,ethqos: add binding doc for safety IRQ for sa8775p Suraj Jaiswal
2023-12-21  7:36 ` [PATCH net-next v8 2/3] arm64: dts: qcom: sa8775p: enable safety IRQ Suraj Jaiswal
2023-12-21  7:36 ` [PATCH net-next v8 3/3] net: stmmac: Add driver support for DWMAC5 common " Suraj Jaiswal
2023-12-21 12:49   ` Serge Semin
2023-12-22  8:43     ` Suraj Jaiswal
2023-12-22 14:35       ` Serge Semin
2023-12-26 11:10         ` Suraj Jaiswal
2023-12-27 11:03           ` Suraj Jaiswal
2024-01-07 19:53             ` Serge Semin
2024-01-08 10:57               ` Suraj Jaiswal
2024-01-09 21:40                 ` Serge Semin

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.