* [4.4-RT PATCH RFC/RFT] drivers: net: cpsw: mark rx/tx irq as IRQF_NO_THREAD
@ 2016-08-11 16:15 ` Grygorii Strashko
From: Grygorii Strashko @ 2016-08-11 16:15 UTC (permalink / raw)
To: Sebastian Andrzej Siewior, Steven Rostedt
Cc: linux-omap, Alison Chaiken, linux-rt-users, netdev, Grygorii Strashko
Mark CPSW Rx/Tx IRQs as IRQF_NO_THREAD and avoid double scheduling on -RT,
where these IRQs are forced-threaded:
rx-irq
|- schedule threaded rx-irq handler
...
|- threaded rx-irq handler -> cpsw_rx_interrupt()
|- napi_schedule()
|- __raise_softirq_irqoff()
|- wakeup_proper_softirq()
...
napi
after:
rx-irq
|- cpsw_rx_interrupt()
|- napi_schedule()
|- irq_exit()
|- invoke_softirq()
|- wakeup_softirqd()
...
napi
And, as a result, get the benefits of the following improvements (tested
on am57xx-evm):
1) "[ 78.348599] NOHZ: local_softirq_pending 80" message will not be
seen any more. Now these warnings can be seen once iperf is started.
# iperf -c $IPERFHOST -w 128K -d -t 60
2) latency reduction when cyclictest is run in parallel with network load
where net_perf.sh is:
iperf -c $IPERFHOST -w 8K -d -t 60
iperf -c $IPERFHOST -w 16K -d -t 60
iperf -c $IPERFHOST -w 32K -d -t 60
iperf -c $IPERFHOST -w 64K -d -t 60
iperf -c $IPERFHOST -w 128K -d -t 60
before:
T: 0 ( 1326) P:98 I:1000 C: 240000 Min: 8 Act: 13 Avg: 18 Max: 70
T: 1 ( 1327) P:98 I:1500 C: 159981 Min: 9 Act: 15 Avg: 16 Max: 43
after:
T: 0 ( 1331) P:98 I:1000 C: 240000 Min: 8 Act: 15 Avg: 14 Max: 51
T: 1 ( 1332) P:98 I:1500 C: 159953 Min: 8 Act: 16 Avg: 15 Max: 33
3) network performance increase
win, K   before, Mbits/s   after, Mbits/s      %
8K       354               350.3             0.0
16K      412               551              33.7
32K      423               659.5            55.9
64K      436               728.3            67.0
128K     537               845              57.4
This change does not affect non-RT.
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
---
Hi All,
I'd appreciate any feedback or Tested-by tags.
Given positive feedback, I'll resend it for upstream.
drivers/net/ethernet/ti/cpsw.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
index 7b59283..fa4bb81 100644
--- a/drivers/net/ethernet/ti/cpsw.c
+++ b/drivers/net/ethernet/ti/cpsw.c
@@ -769,7 +769,7 @@ static irqreturn_t cpsw_tx_interrupt(int irq, void *dev_id)
 		priv->tx_irq_disabled = true;
 	}
 
-	napi_schedule(&priv->napi_tx);
+	napi_schedule_irqoff(&priv->napi_tx);
 	return IRQ_HANDLED;
 }
 
@@ -785,7 +785,7 @@ static irqreturn_t cpsw_rx_interrupt(int irq, void *dev_id)
 		priv->rx_irq_disabled = true;
 	}
 
-	napi_schedule(&priv->napi_rx);
+	napi_schedule_irqoff(&priv->napi_rx);
 	return IRQ_HANDLED;
 }
 
@@ -2827,7 +2827,7 @@ static int cpsw_probe(struct platform_device *pdev)
 
 	priv->irqs_table[0] = irq;
 	ret = devm_request_irq(&pdev->dev, irq, cpsw_rx_interrupt,
-			       0, dev_name(&pdev->dev), priv);
+			       IRQF_NO_THREAD, dev_name(&pdev->dev), priv);
 	if (ret < 0) {
 		dev_err(priv->dev, "error attaching irq (%d)\n", ret);
 		goto clean_ale_ret;
@@ -2842,7 +2842,7 @@ static int cpsw_probe(struct platform_device *pdev)
 
 	priv->irqs_table[1] = irq;
 	ret = devm_request_irq(&pdev->dev, irq, cpsw_tx_interrupt,
-			       0, dev_name(&pdev->dev), priv);
+			       IRQF_NO_THREAD, dev_name(&pdev->dev), priv);
 	if (ret < 0) {
 		dev_err(priv->dev, "error attaching irq (%d)\n", ret);
 		goto clean_ale_ret;
--
2.9.2
* Re: [4.4-RT PATCH RFC/RFT] drivers: net: cpsw: mark rx/tx irq as IRQF_NO_THREAD
From: Steven Rostedt @ 2016-08-11 16:36 UTC (permalink / raw)
To: Grygorii Strashko
Cc: Sebastian Andrzej Siewior, linux-omap, Alison Chaiken,
linux-rt-users, netdev
On Thu, 11 Aug 2016 19:15:40 +0300
Grygorii Strashko <grygorii.strashko@ti.com> wrote:
> Mark CPSW Rx/Tx IRQs as IRQF_NO_THREAD and avoid double scheduling on -RT,
> where these IRQs are forced-threaded:
> rx-irq
> |- schedule threaded rx-irq handler
> ...
> |- threaded rx-irq handler -> cpsw_rx_interrupt()
> |- napi_schedule()
> |- __raise_softirq_irqoff()
> |- wakeup_proper_softirq()
> ...
> napi
>
> after:
> rx-irq
> |- cpsw_rx_interrupt()
> |- napi_schedule()
> |- irq_exit()
> |- invoke_softirq()
> |- wakeup_softirqd()
> ...
> napi
>
> And, as a result, get the benefits of the following improvements (tested
> on am57xx-evm):
>
> 1) "[ 78.348599] NOHZ: local_softirq_pending 80" message will not be
> seen any more. Now these warnings can be seen once iperf is started.
> # iperf -c $IPERFHOST -w 128K -d -t 60
>
> 2) latency reduction when cyclictest is run in parallel with network load
> where net_perf.sh is:
> iperf -c $IPERFHOST -w 8K -d -t 60
> iperf -c $IPERFHOST -w 16K -d -t 60
> iperf -c $IPERFHOST -w 32K -d -t 60
> iperf -c $IPERFHOST -w 64K -d -t 60
> iperf -c $IPERFHOST -w 128K -d -t 60
>
> before:
> T: 0 ( 1326) P:98 I:1000 C: 240000 Min: 8 Act: 13 Avg: 18 Max: 70
> T: 1 ( 1327) P:98 I:1500 C: 159981 Min: 9 Act: 15 Avg: 16 Max: 43
> after:
> T: 0 ( 1331) P:98 I:1000 C: 240000 Min: 8 Act: 15 Avg: 14 Max: 51
> T: 1 ( 1332) P:98 I:1500 C: 159953 Min: 8 Act: 16 Avg: 15 Max: 33
>
> 3) network performance increase
>
> win, K Mbits/s
> before after %
> 8K 354 350.3 0.0
> 16K 412 551 33.7
> 32K 423 659.5 55.9
> 64K 436 728.3 67.0
> 128K 537 845 57.4
>
> This change does not affect non-RT.
This looks fine to me, but it should go into the development branch,
which is currently 4.6-rt. And I can then pull it from there.
-- Steve
>
> Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
> ---
> Hi All,
>
> I'd appreciate any feedback or Tested-by tags.
> Given positive feedback, I'll resend it for upstream.
>
> drivers/net/ethernet/ti/cpsw.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
> index 7b59283..fa4bb81 100644
> --- a/drivers/net/ethernet/ti/cpsw.c
> +++ b/drivers/net/ethernet/ti/cpsw.c
> @@ -769,7 +769,7 @@ static irqreturn_t cpsw_tx_interrupt(int irq, void *dev_id)
> priv->tx_irq_disabled = true;
> }
>
> - napi_schedule(&priv->napi_tx);
> + napi_schedule_irqoff(&priv->napi_tx);
> return IRQ_HANDLED;
> }
>
> @@ -785,7 +785,7 @@ static irqreturn_t cpsw_rx_interrupt(int irq, void *dev_id)
> priv->rx_irq_disabled = true;
> }
>
> - napi_schedule(&priv->napi_rx);
> + napi_schedule_irqoff(&priv->napi_rx);
> return IRQ_HANDLED;
> }
>
> @@ -2827,7 +2827,7 @@ static int cpsw_probe(struct platform_device *pdev)
>
> priv->irqs_table[0] = irq;
> ret = devm_request_irq(&pdev->dev, irq, cpsw_rx_interrupt,
> - 0, dev_name(&pdev->dev), priv);
> + IRQF_NO_THREAD, dev_name(&pdev->dev), priv);
> if (ret < 0) {
> dev_err(priv->dev, "error attaching irq (%d)\n", ret);
> goto clean_ale_ret;
> @@ -2842,7 +2842,7 @@ static int cpsw_probe(struct platform_device *pdev)
>
> priv->irqs_table[1] = irq;
> ret = devm_request_irq(&pdev->dev, irq, cpsw_tx_interrupt,
> - 0, dev_name(&pdev->dev), priv);
> + IRQF_NO_THREAD, dev_name(&pdev->dev), priv);
> if (ret < 0) {
> dev_err(priv->dev, "error attaching irq (%d)\n", ret);
> goto clean_ale_ret;
* Re: [4.4-RT PATCH RFC/RFT] drivers: net: cpsw: mark rx/tx irq as IRQF_NO_THREAD
From: Sebastian Andrzej Siewior @ 2016-08-12 8:32 UTC (permalink / raw)
To: Grygorii Strashko
Cc: Steven Rostedt, linux-omap, Alison Chaiken, linux-rt-users, netdev
On 2016-08-11 19:15:40 [+0300], Grygorii Strashko wrote:
> Mark CPSW Rx/Tx IRQs as IRQF_NO_THREAD and avoid double scheduling on -RT,
> where these IRQs are forced-threaded:
> rx-irq
> |- schedule threaded rx-irq handler
> ...
> |- threaded rx-irq handler -> cpsw_rx_interrupt()
> |- napi_schedule()
> |- __raise_softirq_irqoff()
> |- wakeup_proper_softirq()
> ...
> napi
This should not be the default path. The default should be napi running
in the context of the threaded rx-irq handler once the handler is done.
The wakeup_proper_softirq() part is only done if napi thinks that the
callback function runs for too long. So in *that* case you continue
NAPI in the softirq thread, which runs at SCHED_OTHER.
> after:
> rx-irq
> |- cpsw_rx_interrupt()
> |- napi_schedule()
> |- irq_exit()
> |- invoke_softirq()
> |- wakeup_softirqd()
> ...
> napi
Since you schedule the softirq from an IRQ-off region / without a
process context, you force the softirq to run in the thread at
SCHED_OTHER priority.
> And, as a result, get the benefits of the following improvements (tested
> on am57xx-evm):
>
> 1) "[ 78.348599] NOHZ: local_softirq_pending 80" message will not be
> seen any more. Now these warnings can be seen once iperf is started.
> # iperf -c $IPERFHOST -w 128K -d -t 60
Do you also see "sched: RT throttling activated"? Because otherwise I
don't see why this should pop up.
> 2) latency reduction when cyclictest is run in parallel with network load
> where net_perf.sh is:
> iperf -c $IPERFHOST -w 8K -d -t 60
> iperf -c $IPERFHOST -w 16K -d -t 60
> iperf -c $IPERFHOST -w 32K -d -t 60
> iperf -c $IPERFHOST -w 64K -d -t 60
> iperf -c $IPERFHOST -w 128K -d -t 60
>
> before:
> T: 0 ( 1326) P:98 I:1000 C: 240000 Min: 8 Act: 13 Avg: 18 Max: 70
> T: 1 ( 1327) P:98 I:1500 C: 159981 Min: 9 Act: 15 Avg: 16 Max: 43
> after:
> T: 0 ( 1331) P:98 I:1000 C: 240000 Min: 8 Act: 15 Avg: 14 Max: 51
> T: 1 ( 1332) P:98 I:1500 C: 159953 Min: 8 Act: 16 Avg: 15 Max: 33
-d 0 to have I: set to the same value.
What does -i 250 say?
And without network load, are we where we were at the "after" values?
What happens if you s/__raise_softirq_irqoff_ksoft/__raise_softirq_irqoff/
in net/core/dev.c and chrt the priority of your network interrupt
handlers to SCHED_OTHER?
> 3) network performance increase
>
> win, K Mbits/s
> before after %
> 8K 354 350.3 0.0
> 16K 412 551 33.7
> 32K 423 659.5 55.9
> 64K 436 728.3 67.0
> 128K 537 845 57.4
How close are the after numbers to !RT?
Sebastian
* Re: [4.4-RT PATCH RFC/RFT] drivers: net: cpsw: mark rx/tx irq as IRQF_NO_THREAD
From: Grygorii Strashko @ 2016-08-12 15:58 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: Steven Rostedt, linux-omap, Alison Chaiken, linux-rt-users, netdev
Hi Sebastian,
On 08/12/2016 11:32 AM, Sebastian Andrzej Siewior wrote:
> On 2016-08-11 19:15:40 [+0300], Grygorii Strashko wrote:
>> Mark CPSW Rx/Tx IRQs as IRQF_NO_THREAD and avoid double scheduling on -RT,
>> where these IRQs are forced-threaded:
>> rx-irq
>> |- schedule threaded rx-irq handler
>> ...
>> |- threaded rx-irq handler -> cpsw_rx_interrupt()
>> |- napi_schedule()
>> |- __raise_softirq_irqoff()
>> |- wakeup_proper_softirq()
>> ...
>> napi
>
> This should not be the default path. The default should be napi running
> in the context of the threaded rx-irq handler once the handler is done.
> The wakeup_proper_softirq() part is only done if napi thinks that the
> callback functions runs for too long. So in *that* case you continue
> NAPI in the softirq-thread which runs at SCHED_OTHER.
Thanks for the comment. You're right:
irq_thread()->irq_forced_thread_fn()->local_bh_enable()
But wouldn't there still be two wake_up_process() calls anyway,
plus preempt_check_resched_rt() in napi_schedule()?
>
>> after:
>> rx-irq
>> |- cpsw_rx_interrupt()
>> |- napi_schedule()
>> |- irq_exit()
>> |- invoke_softirq()
>> |- wakeup_softirqd()
>> ...
>> napi
>
> Since you schedule the softirq from an IRQ-off region / without a
> process context you force the softirq to run in the thread at
> SCHED_OTHER priority.
Thanks a lot for the comments.
>
>> And, as a result, get the benefits of the following improvements (tested
>> on am57xx-evm):
>>
>> 1) "[ 78.348599] NOHZ: local_softirq_pending 80" message will not be
>> seen any more. Now these warnings can be seen once iperf is started.
>> # iperf -c $IPERFHOST -w 128K -d -t 60
>
> Do you also see "sched: RT throttling activated"? Because I don't see
> otherwise why this should pop up.
I've reverted my patch and did the requested experiments (some additional info below).
I do not see "sched: RT throttling activated" :(
root@am57xx-evm:~# ./net_perf.sh & cyclictest -m -Sp98 -q -D4m
[1] 1301
# /dev/cpu_dma_latency set to 0us
Linux am57xx-evm 4.4.16-rt23-00321-ga195e6a-dirty #92 SMP PREEMPT RT Fri Aug 12 14:03:59 EEST 2016 armv7l GNU/Linux
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 16.0 KByte (WARNING: requested 8.00 KByte)
------------------------------------------------------------
------------------------------------------------------------
Client connecting to 172.22.39.17, TCP port 5001
TCP window size: 16.0 KByte (WARNING: requested 8.00 KByte)
------------------------------------------------------------
[ 3] local 172.22.39.21 port 47500 connected with 172.22.39.17 port 5001
[ 5] local 172.22.39.21 port 5001 connected with 172.22.39.17 port 59955
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-60.0 sec 1.31 GBytes 188 Mbits/sec
[ 5] 0.0-60.0 sec 868 MBytes 121 Mbits/sec
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 32.0 KByte (WARNING: requested 16.0 KByte)
------------------------------------------------------------
------------------------------------------------------------
Client connecting to 172.22.39.17, TCP port 5001
TCP window size: 32.0 KByte (WARNING: requested 16.0 KByte)
------------------------------------------------------------
[ 5] local 172.22.39.21 port 47502 connected with 172.22.39.17 port 5001
[ 4] local 172.22.39.21 port 5001 connected with 172.22.39.17 port 59957
[ ID] Interval Transfer Bandwidth
[ 5] 0.0-60.0 sec 1.92 GBytes 275 Mbits/sec
[ 4] 0.0-60.0 sec 1.14 GBytes 163 Mbits/sec
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 64.0 KByte (WARNING: requested 32.0 KByte)
------------------------------------------------------------
------------------------------------------------------------
Client connecting to 172.22.39.17, TCP port 5001
TCP window size: 64.0 KByte (WARNING: requested 32.0 KByte)
------------------------------------------------------------
[ 5] local 172.22.39.21 port 47504 connected with 172.22.39.17 port 5001
[ 4] local 172.22.39.21 port 5001 connected with 172.22.39.17 port 59965
[ 2285.928516] NOHZ: local_softirq_pending 80
[ 2285.928549] NOHZ: local_softirq_pending 80
[ 2285.928594] NOHZ: local_softirq_pending 80
[ 2285.928689] NOHZ: local_softirq_pending 80
[ 2286.348572] NOHZ: local_softirq_pending 80
[ 2290.358623] NOHZ: local_softirq_pending 80
[ 2291.588624] NOHZ: local_softirq_pending 80
[ 2291.698579] NOHZ: local_softirq_pending 80
[ 2292.608594] NOHZ: local_softirq_pending 80
[ 2293.178627] NOHZ: local_softirq_pending 80
[ ID] Interval Transfer Bandwidth
[ 5] 0.0-60.0 sec 1.43 GBytes 205 Mbits/sec
[ 4] 0.0-60.0 sec 1.58 GBytes 226 Mbits/sec
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 128 KByte (WARNING: requested 64.0 KByte)
------------------------------------------------------------
------------------------------------------------------------
Client connecting to 172.22.39.17, TCP port 5001
TCP window size: 128 KByte (WARNING: requested 64.0 KByte)
------------------------------------------------------------
[ 5] local 172.22.39.21 port 47506 connected with 172.22.39.17 port 5001
[ 4] local 172.22.39.21 port 5001 connected with 172.22.39.17 port 59980
T: 0 ( 1309) P:98 I:1000 C: 240000 Min: 8 Act: 13 Avg: 18 Max: 60
T: 1 ( 1310) P:98 I:1500 C: 159968 Min: 8 Act: 14 Avg: 16 Max: 58
root@am57xx-evm:~# [ ID] Interval Transfer Bandwidth
[ 5] 0.0-60.0 sec 1.84 GBytes 264 Mbits/sec
[ 4] 0.0-60.0 sec 1.47 GBytes 210 Mbits/sec
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 256 KByte (WARNING: requested 128 KByte)
------------------------------------------------------------
------------------------------------------------------------
Client connecting to 172.22.39.17, TCP port 5001
TCP window size: 256 KByte (WARNING: requested 128 KByte)
------------------------------------------------------------
[ 5] local 172.22.39.21 port 47508 connected with 172.22.39.17 port 5001
[ 4] local 172.22.39.21 port 5001 connected with 172.22.39.17 port 59982
[ ID] Interval Transfer Bandwidth
[ 5] 0.0-60.0 sec 974 MBytes 136 Mbits/sec
[ 4] 0.0-60.0 sec 2.84 GBytes 406 Mbits/sec
[1]+ Done ./net_perf.sh
>
>> 2) latency reduction when cyclictest is run in parallel with network load
>> where net_perf.sh is:
>> iperf -c $IPERFHOST -w 8K -d -t 60
>> iperf -c $IPERFHOST -w 16K -d -t 60
>> iperf -c $IPERFHOST -w 32K -d -t 60
>> iperf -c $IPERFHOST -w 64K -d -t 60
>> iperf -c $IPERFHOST -w 128K -d -t 60
>>
>> before:
>> T: 0 ( 1326) P:98 I:1000 C: 240000 Min: 8 Act: 13 Avg: 18 Max: 70
>> T: 1 ( 1327) P:98 I:1500 C: 159981 Min: 9 Act: 15 Avg: 16 Max: 43
>> after:
>> T: 0 ( 1331) P:98 I:1000 C: 240000 Min: 8 Act: 15 Avg: 14 Max: 51
>> T: 1 ( 1332) P:98 I:1500 C: 159953 Min: 8 Act: 16 Avg: 15 Max: 33
>
> -d 0 to have I: set to the same value.
> What does -i 250 say?
> And without network load we are where we were at "after" values?
=============================================== before, no net load:
cyclictest -m -Sp98 -q -D4m
# /dev/cpu_dma_latency set to 0us
T: 0 ( 1270) P:98 I:1000 C: 240000 Min: 8 Act: 10 Avg: 9 Max: 38
T: 1 ( 1271) P:98 I:1500 C: 159986 Min: 8 Act: 10 Avg: 9 Max: 23
cyclictest -m -Sp98 -q -D4m -d0
# /dev/cpu_dma_latency set to 0us
T: 0 ( 1280) P:98 I:1000 C: 240000 Min: 8 Act: 10 Avg: 9 Max: 27
T: 1 ( 1281) P:98 I:1000 C: 239982 Min: 8 Act: 13 Avg: 9 Max: 26
cyclictest -m -Sp98 -q -D4m -i250
# /dev/cpu_dma_latency set to 0us
T: 0 ( 1284) P:98 I:250 C: 960000 Min: 8 Act: 10 Avg: 8 Max: 24
T: 1 ( 1285) P:98 I:750 C: 319977 Min: 8 Act: 11 Avg: 9 Max: 21
cyclictest -m -Sp98 -q -D4m -i250 -d0
# /dev/cpu_dma_latency set to 0us
T: 0 ( 1288) P:98 I:250 C: 960000 Min: 8 Act: 9 Avg: 8 Max: 33
T: 1 ( 1289) P:98 I:250 C: 959929 Min: 7 Act: 11 Avg: 9 Max: 26
=============================================== before, with net load:
cyclictest -m -Sp98 -q -D4m
T: 0 ( 1309) P:98 I:1000 C: 240000 Min: 8 Act: 13 Avg: 18 Max: 60
T: 1 ( 1310) P:98 I:1500 C: 159968 Min: 8 Act: 14 Avg: 16 Max: 58
cyclictest -m -Sp98 -q -D4m -d0
T: 0 ( 1339) P:98 I:1000 C: 240000 Min: 8 Act: 28 Avg: 18 Max: 60
T: 1 ( 1340) P:98 I:1000 C: 239969 Min: 8 Act: 25 Avg: 16 Max: 43
cyclictest -m -Sp98 -q -D4m -i250
T: 0 ( 1369) P:98 I:250 C: 960000 Min: 8 Act: 13 Avg: 17 Max: 64
T: 1 ( 1370) P:98 I:750 C: 319941 Min: 8 Act: 18 Avg: 16 Max: 56
cyclictest -m -Sp98 -q -D4m -i250 -d0
T: 0 ( 1400) P:98 I:250 C: 960000 Min: 8 Act: 25 Avg: 18 Max: 83
T: 1 ( 1401) P:98 I:250 C: 959801 Min: 7 Act: 27 Avg: 17 Max: 48
=============================================== after, no net load:
cyclictest -m -Sp98 -q -D4m
T: 0 ( 1287) P:98 I:1000 C: 240000 Min: 8 Act: 10 Avg: 9 Max: 30
T: 1 ( 1288) P:98 I:1500 C: 159987 Min: 8 Act: 13 Avg: 10 Max: 32
cyclictest -m -Sp98 -q -D4m -d0
T: 0 ( 1295) P:98 I:1000 C: 240000 Min: 8 Act: 8 Avg: 9 Max: 39
T: 1 ( 1296) P:98 I:1000 C: 239981 Min: 7 Act: 9 Avg: 8 Max: 29
cyclictest -m -Sp98 -q -D4m -i250
T: 0 ( 1298) P:98 I:250 C: 960000 Min: 7 Act: 8 Avg: 9 Max: 42
T: 1 ( 1299) P:98 I:750 C: 319970 Min: 8 Act: 9 Avg: 8 Max: 29
cyclictest -m -Sp98 -q -D4m -i250 -d0
T: 0 ( 1301) P:98 I:250 C: 960000 Min: 7 Act: 9 Avg: 8 Max: 22
T: 1 ( 1302) P:98 I:250 C: 959914 Min: 7 Act: 11 Avg: 8 Max: 28
=============================================== after, with net load:
cyclictest -m -Sp98 -q -D4m
T: 0 ( 1315) P:98 I:1000 C: 240000 Min: 8 Act: 11 Avg: 14 Max: 40
T: 1 ( 1316) P:98 I:1500 C: 159962 Min: 8 Act: 19 Avg: 15 Max: 38
cyclictest -m -Sp98 -q -D4m -d0
T: 0 ( 1365) P:98 I:1000 C: 240000 Min: 8 Act: 14 Avg: 15 Max: 51
T: 1 ( 1366) P:98 I:1000 C: 239943 Min: 8 Act: 18 Avg: 15 Max: 36
cyclictest -m -Sp98 -q -D4m -i250
T: 0 ( 1309) P:98 I:250 C: 960000 Min: 8 Act: 15 Avg: 13 Max: 42
T: 1 ( 1310) P:98 I:750 C: 319986 Min: 8 Act: 18 Avg: 15 Max: 36
cyclictest -m -Sp98 -q -D4m -i250 -d0
T: 0 ( 1358) P:98 I:250 C: 960000 Min: 8 Act: 11 Avg: 14 Max: 42
T: 1 ( 1359) P:98 I:250 C: 959743 Min: 7 Act: 18 Avg: 15 Max: 36
>
> What happens if s/__raise_softirq_irqoff_ksoft/__raise_softirq_irqoff/
> in net/core/dev.c and chrt the priority of you network interrupt
> handlers to SCHED_OTHER priority?
===== without this patch + __raise_softirq_irqoff + netIRQs->SCHED_OTHER
with net load:
cyclictest -m -Sp98 -q -D4m -i250 -d0
T: 0 ( 1325) P:98 I:1000 C: 240000 Min: 8 Act: 22 Avg: 17 Max: 51
T: 1 ( 1326) P:98 I:1500 C: 159981 Min: 8 Act: 15 Avg: 15 Max: 39
cyclictest -m -Sp98 -q -D4m -i250 -d0
T: 0 ( 1307) P:98 I:250 C: 960000 Min: 7 Act: 13 Avg: 16 Max: 50
T: 1 ( 1308) P:98 I:250 C: 959819 Min: 8 Act: 12 Avg: 14 Max: 37
and net performance is better:
root@am57xx-evm:~# ps -A | grep 4848
82 ? 00:00:00 irq/354-4848400
83 ? 00:00:00 irq/355-4848400
root@am57xx-evm:~# chrt -o -p 0 82
root@am57xx-evm:~# chrt -o -p 0 83
./net_perf.sh & cyclictest -m -Sp98 -q -D4m -i250 -d0
[1] 1298
# /dev/cpu_dma_latency set to 0us
Linux am57xx-evm 4.4.16-rt23-00321-ga195e6a-dirty #95 SMP PREEMPT RT Fri Aug 12 16:20:42 EEST 2016 armv7l GNU/Linux
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 16.0 KByte (WARNING: requested 8.00 KByte)
------------------------------------------------------------
------------------------------------------------------------
Client connecting to 172.22.39.17, TCP port 5001
TCP window size: 16.0 KByte (WARNING: requested 8.00 KByte)
------------------------------------------------------------
[ 3] local 172.22.39.21 port 58672 connected with 172.22.39.17 port 5001
[ 5] local 172.22.39.21 port 5001 connected with 172.22.39.17 port 60672
[ 78.718828] NOHZ: local_softirq_pending 80
[ 78.908849] NOHZ: local_softirq_pending 80
[ 78.908927] NOHZ: local_softirq_pending 80
[ 78.918827] NOHZ: local_softirq_pending 80
[ 79.058826] NOHZ: local_softirq_pending 80
[ 81.488918] NOHZ: local_softirq_pending 80
[ 81.489162] NOHZ: local_softirq_pending 80
[ 81.489398] NOHZ: local_softirq_pending 80
[ 81.489646] NOHZ: local_softirq_pending 80
[ 81.489895] NOHZ: local_softirq_pending 80
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-60.0 sec 1.30 GBytes 187 Mbits/sec
[ 5] 0.0-60.0 sec 637 MBytes 89.0 Mbits/sec
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 32.0 KByte (WARNING: requested 16.0 KByte)
------------------------------------------------------------
------------------------------------------------------------
Client connecting to 172.22.39.17, TCP port 5001
TCP window size: 32.0 KByte (WARNING: requested 16.0 KByte)
------------------------------------------------------------
[ 5] local 172.22.39.21 port 58674 connected with 172.22.39.17 port 5001
[ 4] local 172.22.39.21 port 5001 connected with 172.22.39.17 port 60680
[ ID] Interval Transfer Bandwidth
[ 5] 0.0-60.0 sec 1.78 GBytes 255 Mbits/sec
[ 4] 0.0-60.0 sec 1.47 GBytes 210 Mbits/sec
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 64.0 KByte (WARNING: requested 32.0 KByte)
------------------------------------------------------------
------------------------------------------------------------
Client connecting to 172.22.39.17, TCP port 5001
TCP window size: 64.0 KByte (WARNING: requested 32.0 KByte)
------------------------------------------------------------
[ 5] local 172.22.39.21 port 58676 connected with 172.22.39.17 port 5001
[ 4] local 172.22.39.21 port 5001 connected with 172.22.39.17 port 60693
[ ID] Interval Transfer Bandwidth
[ 5] 0.0-60.0 sec 1.62 GBytes 232 Mbits/sec
[ 4] 0.0-60.0 sec 2.07 GBytes 296 Mbits/sec
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 128 KByte (WARNING: requested 64.0 KByte)
------------------------------------------------------------
------------------------------------------------------------
Client connecting to 172.22.39.17, TCP port 5001
TCP window size: 128 KByte (WARNING: requested 64.0 KByte)
------------------------------------------------------------
[ 5] local 172.22.39.21 port 58678 connected with 172.22.39.17 port 5001
[ 4] local 172.22.39.21 port 5001 connected with 172.22.39.17 port 60702
T: 0 ( 1307) P:98 I:250 C: 960000 Min: 7 Act: 13 Avg: 16 Max: 50
T: 1 ( 1308) P:98 I:250 C: 959819 Min: 8 Act: 12 Avg: 14 Max: 37
root@am57xx-evm:~# [ ID] Interval Transfer Bandwidth
[ 5] 0.0-60.0 sec 1.77 GBytes 254 Mbits/sec
[ 4] 0.0-60.0 sec 1.77 GBytes 253 Mbits/sec
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 256 KByte (WARNING: requested 128 KByte)
------------------------------------------------------------
------------------------------------------------------------
Client connecting to 172.22.39.17, TCP port 5001
TCP window size: 256 KByte (WARNING: requested 128 KByte)
------------------------------------------------------------
[ 5] local 172.22.39.21 port 58680 connected with 172.22.39.17 port 5001
[ 4] local 172.22.39.21 port 5001 connected with 172.22.39.17 port 60711
[ ID] Interval Transfer Bandwidth
[ 5] 0.0-60.0 sec 2.18 GBytes 312 Mbits/sec
[ 4] 0.0-60.0 sec 2.53 GBytes 362 Mbits/sec
>
>> 3) network performance increase
>>
>> win, K Mbits/s
>> before after %
>> 8K 354 350.3 0.0
>> 16K 412 551 33.7
>> 32K 423 659.5 55.9
>> 64K 436 728.3 67.0
>> 128K 537 845 57.4
>
> How close are the after numbers to !RT?
http://processors.wiki.ti.com/index.php/Processor_SDK_Linux_Kernel_Performance_Guide#AM57XX_TCP_Performance
Min 20% drop even with this patch :(
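As a sanity check, the percentage column in the quoted table follows from (after - before) / before. A small awk helper (a sketch; the numbers are copied from the table above) reproduces it, and shows that the 8K row, reported as 0.0, actually computes to about -1.0%, i.e. a slight regression:

```shell
# Recompute the before/after throughput deltas from the quoted table.
pct_table() {
    awk 'BEGIN {
        n = split("8K 16K 32K 64K 128K", win)
        split("354 412 423 436 537",       before)  # Mbits/s, before
        split("350.3 551 659.5 728.3 845", after)   # Mbits/s, after
        for (i = 1; i <= n; i++)
            printf "%-5s %+.1f%%\n", win[i], (after[i] - before[i]) / before[i] * 100
    }'
}
pct_table
```

This prints one line per window size; the 16K-128K rows match the table (+33.7%, +55.9%, +67.0%, +57.4%).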
--
regards,
-grygorii
========================= ps -l -A (after boot)
root@am57xx-evm:~# ps -l -A
F S UID PID PPID C PRI NI ADDR SZ WCHAN TTY TIME CMD
4 S 0 1 0 1 80 0 - 1321 SyS_ep ? 00:00:05 systemd
1 S 0 2 0 0 80 0 - 0 kthrea ? 00:00:00 kthreadd
1 S 0 3 2 0 80 0 - 0 smpboo ? 00:00:00 ksoftirqd/0
5 S 0 4 2 0 58 - - 0 smpboo ? 00:00:00 ktimersoftd/0
1 S 0 5 2 0 80 0 - 0 worker ? 00:00:00 kworker/0:0
1 S 0 6 2 0 60 -20 - 0 worker ? 00:00:00 kworker/0:0H
1 S 0 7 2 0 80 0 - 0 worker ? 00:00:00 kworker/u4:0
1 S 0 8 2 0 58 - - 0 rcu_gp ? 00:00:00 rcu_preempt
1 S 0 9 2 0 58 - - 0 rcu_gp ? 00:00:00 rcu_sched
1 S 0 10 2 0 58 - - 0 rcu_bo ? 00:00:00 rcub/0
1 S 0 11 2 0 58 - - 0 smpboo ? 00:00:00 rcuc/0
1 S 0 12 2 0 80 0 - 0 run_cl ? 00:00:00 kclksetdelayd
1 S 0 13 2 0 -40 - - 0 posix_ ? 00:00:00 posixcputmr/0
1 S 0 14 2 0 80 0 - 0 run_cm ? 00:00:00 kcmosdelayd
1 S 0 15 2 0 -40 - - 0 smpboo ? 00:00:00 migration/0
1 S 0 16 2 0 -40 - - 0 smpboo ? 00:00:00 migration/1
1 S 0 17 2 0 58 - - 0 smpboo ? 00:00:00 rcuc/1
5 S 0 18 2 0 58 - - 0 smpboo ? 00:00:00 ktimersoftd/1
1 S 0 19 2 0 80 0 - 0 smpboo ? 00:00:00 ksoftirqd/1
1 S 0 20 2 0 -40 - - 0 posix_ ? 00:00:00 posixcputmr/1
1 S 0 21 2 0 80 0 - 0 worker ? 00:00:00 kworker/1:0
1 S 0 22 2 0 60 -20 - 0 worker ? 00:00:00 kworker/1:0H
5 S 0 23 2 0 80 0 - 0 devtmp ? 00:00:00 kdevtmpfs
1 S 0 24 2 0 80 0 - 0 swork_ ? 00:00:00 kswork
1 S 0 25 2 0 60 -20 - 0 rescue ? 00:00:00 perf
1 S 0 26 2 0 9 - - 0 irq_th ? 00:00:00 irq/37-4ae10000
1 S 0 27 2 0 80 0 - 0 worker ? 00:00:00 kworker/0:1
1 S 0 28 2 0 9 - - 0 irq_th ? 00:00:00 irq/70-48055000
1 S 0 29 2 0 9 - - 0 irq_th ? 00:00:00 irq/103-4805700
1 S 0 30 2 0 9 - - 0 irq_th ? 00:00:00 irq/136-4805900
1 S 0 31 2 0 9 - - 0 irq_th ? 00:00:00 irq/169-4805b00
1 S 0 32 2 0 9 - - 0 irq_th ? 00:00:00 irq/202-4805d00
1 S 0 33 2 0 9 - - 0 irq_th ? 00:00:00 irq/235-4805100
1 S 0 34 2 0 9 - - 0 irq_th ? 00:00:00 irq/268-4805300
1 S 0 35 2 0 60 -20 - 0 rescue ? 00:00:00 writeback
1 S 0 36 2 0 60 -20 - 0 rescue ? 00:00:00 crypto
1 S 0 37 2 0 60 -20 - 0 rescue ? 00:00:00 bioset
1 S 0 38 2 0 60 -20 - 0 rescue ? 00:00:00 kblockd
1 S 0 39 2 0 9 - - 0 irq_th ? 00:00:00 irq/29-omap-dma
1 S 0 40 2 0 9 - - 0 irq_th ? 00:00:00 irq/32-43300000
1 S 0 41 2 0 9 - - 0 irq_th ? 00:00:00 irq/34-43300000
1 S 0 42 2 0 9 - - 0 irq_th ? 00:00:00 irq/343-40d0100
1 S 0 43 2 0 9 - - 0 irq_th ? 00:00:00 irq/344-40d0200
1 S 0 44 2 0 9 - - 0 irq_th ? 00:00:00 irq/345-5888200
1 S 0 45 2 0 9 - - 0 irq_th ? 00:00:00 irq/346-5508200
1 S 0 46 2 0 9 - - 0 irq_th ? 00:00:00 irq/363-4150100
1 S 0 47 2 0 9 - - 0 irq_th ? 00:00:00 irq/364-4150200
1 S 0 48 2 0 9 - - 0 irq_th ? 00:00:00 irq/337-4807000
1 S 0 49 2 0 9 - - 0 irq_th ? 00:00:00 irq/38-palmas
1 S 0 50 2 0 9 - - 0 irq_th ? 00:00:00 irq/338-4806000
1 S 0 51 2 0 9 - - 0 irq_th ? 00:00:00 irq/339-4807c00
1 S 0 52 2 0 60 -20 - 0 rescue ? 00:00:00 edac-poller
1 S 0 53 2 0 60 -20 - 0 rescue ? 00:00:00 rpciod
1 S 0 54 2 0 80 0 - 0 kswapd ? 00:00:00 kswapd0
1 S 0 55 2 0 60 -20 - 0 rescue ? 00:00:00 vmstat
1 S 0 56 2 0 80 0 - 0 fsnoti ? 00:00:00 fsnotify_mark
1 S 0 57 2 0 60 -20 - 0 rescue ? 00:00:00 nfsiod
1 S 0 66 2 0 9 - - 0 irq_th ? 00:00:00 irq/25-dra7xx-p
1 S 0 67 2 0 9 - - 0 irq_th ? 00:00:00 irq/437-PCIe PM
1 S 0 68 2 0 9 - - 0 irq_th ? 00:00:00 irq/437-aerdrv
1 S 0 69 2 0 9 - - 0 irq_th ? 00:00:00 irq/370-OMAP HD
1 S 0 70 2 0 60 -20 - 0 rescue ? 00:00:00 bioset
1 S 0 71 2 0 60 -20 - 0 rescue ? 00:00:00 bioset
1 S 0 72 2 0 60 -20 - 0 rescue ? 00:00:00 bioset
1 S 0 73 2 0 60 -20 - 0 rescue ? 00:00:00 bioset
1 S 0 74 2 0 60 -20 - 0 rescue ? 00:00:00 bioset
1 S 0 75 2 0 60 -20 - 0 rescue ? 00:00:00 bioset
1 S 0 76 2 0 60 -20 - 0 rescue ? 00:00:00 bioset
1 S 0 77 2 0 60 -20 - 0 rescue ? 00:00:00 bioset
1 S 0 78 2 0 80 0 - 0 worker ? 00:00:00 kworker/u4:1
1 S 0 81 2 0 80 0 - 0 worker ? 00:00:01 kworker/1:1
1 S 0 82 2 0 9 - - 0 irq_th ? 00:00:00 irq/354-4848400
1 S 0 83 2 0 9 - - 0 irq_th ? 00:00:00 irq/355-4848400
1 S 0 84 2 0 60 -20 - 0 rescue ? 00:00:00 kpsmoused
1 S 0 85 2 0 9 - - 0 irq_th ? 00:00:00 irq/27-talert
1 S 0 86 2 0 9 - - 0 irq_th ? 00:00:00 irq/340-mmc0
1 S 0 87 2 0 9 - - 0 irq_th ? 00:00:00 irq/230-4809c00
1 S 0 88 2 0 9 - - 0 irq_th ? 00:00:00 irq/341-mmc1
1 S 0 89 2 0 9 - - 0 irq_th ? 00:00:00 irq/342-mmc2
1 S 0 90 2 0 9 - - 0 irq_th ? 00:00:00 irq/352-4846800
1 S 0 91 2 0 9 - - 0 irq_th ? 00:00:00 irq/351-4846800
1 S 0 92 2 0 60 -20 - 0 rescue ? 00:00:00 ipv6_addrconf
1 S 0 93 2 0 9 - - 0 irq_th ? 00:00:00 irq/335-omap_dm
1 S 0 94 2 0 9 - - 0 irq_th ? 00:00:00 irq/369-OMAP DI
1 S 0 95 2 0 60 -20 - 0 rescue ? 00:00:00 bioset
1 S 0 96 2 0 80 0 - 0 mmc_qu ? 00:00:00 mmcqd/0
1 S 0 97 2 0 60 -20 - 0 rescue ? 00:00:00 bioset
1 S 0 98 2 0 80 0 - 0 mmc_qu ? 00:00:00 mmcqd/1
1 S 0 99 2 0 60 -20 - 0 rescue ? 00:00:00 bioset
1 S 0 100 2 0 80 0 - 0 mmc_qu ? 00:00:00 mmcqd/1boot0
1 S 0 101 2 0 60 -20 - 0 rescue ? 00:00:00 bioset
1 S 0 102 2 0 80 0 - 0 mmc_qu ? 00:00:00 mmcqd/1boot1
1 S 0 103 2 0 60 -20 - 0 rescue ? 00:00:00 deferwq
1 S 0 104 2 0 9 - - 0 irq_th ? 00:00:00 irq/94-Up
1 S 0 105 2 0 9 - - 0 irq_th ? 00:00:00 irq/96-Down
1 S 0 106 2 0 9 - - 0 irq_th ? 00:00:00 irq/99-Left
1 S 0 107 2 0 9 - - 0 irq_th ? 00:00:00 irq/95-Right
1 S 0 110 2 0 60 -20 - 0 worker ? 00:00:00 kworker/1:1H
1 S 0 111 2 0 80 0 - 0 kjourn ? 00:00:00 jbd2/mmcblk0p2-
1 S 0 112 2 0 60 -20 - 0 rescue ? 00:00:00 ext4-rsv-conver
1 S 0 129 2 0 60 -20 - 0 worker ? 00:00:00 kworker/0:1H
1 S 0 165 2 0 80 0 - 0 worker ? 00:00:00 kworker/0:2
1 S 0 168 2 0 80 0 - 0 worker ? 00:00:00 kworker/1:2
4 S 0 197 1 0 80 0 - 2566 SyS_ep ? 00:00:01 systemd-journal
1 S 0 249 2 0 80 0 - 0 worker ? 00:00:00 kworker/u4:2
1 S 0 255 2 0 80 0 - 0 worker ? 00:00:00 kworker/0:3
4 S 0 289 1 0 80 0 - 2731 SyS_ep ? 00:00:02 systemd-udevd
1 S 0 330 2 0 80 0 - 0 worker ? 00:00:00 kworker/0:4
1 S 0 339 2 0 80 0 - 0 worker ? 00:00:00 kworker/0:5
1 S 0 387 2 0 9 - - 0 irq_th ? 00:00:00 irq/362-4809000
1 S 0 407 2 0 9 - - 0 irq_th ? 00:00:00 irq/470-2-006f
1 S 0 409 2 0 9 - - 0 irq_th ? 00:00:00 irq/469-mcp7941
1 S 0 413 2 0 9 - - 0 irq_th ? 00:00:00 irq/75-pixcir_t
4 S 998 418 1 0 80 0 - 3038 SyS_ep ? 00:00:00 systemd-timesyn
1 S 0 429 2 0 9 - - 0 irq_th ? 00:00:00 irq/158-palmas_
1 S 0 462 2 0 9 - - 0 irq_th ? 00:00:00 irq/348-rtc2
1 S 0 464 2 0 9 - - 0 irq_th ? 00:00:00 irq/349-dwc3-om
1 S 0 466 2 0 9 - - 0 irq_th ? 00:00:00 irq/350-dwc3-om
1 S 0 493 2 0 9 - - 0 irq_th ? 00:00:00 irq/357-vpe
1 S 0 503 2 0 60 -20 - 0 rescue ? 00:00:00 ata_sff
1 S 0 509 2 0 9 - - 0 irq_th ? 00:00:00 irq/365-vip2-s0
1 S 0 510 2 0 9 - - 0 irq_th ? 00:00:00 irq/366-vip2-s1
1 S 0 514 2 0 9 - - 0 irq_th ? 00:00:00 irq/347-4a14000
1 S 0 516 2 0 80 0 - 0 scsi_e ? 00:00:00 scsi_eh_0
1 S 0 517 2 0 60 -20 - 0 rescue ? 00:00:00 scsi_tmf_0
1 S 0 595 2 0 9 - - 0 irq_th ? 00:00:00 irq/361-4b10100
1 S 0 605 2 0 80 0 - 0 worker ? 00:00:00 kworker/u4:3
4 S 0 765 1 0 80 0 - 498 poll_s ? 00:00:00 starter
4 S 0 792 1 0 80 0 - 497 syslog ? 00:00:00 klogd
4 S 0 819 1 0 80 0 - 2210 poll_s ? 00:00:01 ofonod
1 S 0 825 2 0 80 0 - 0 worker ? 00:00:02 kworker/1:3
1 S 0 887 2 0 9 - - 0 irq_th ? 00:00:00 irq/313-mbox_ip
4 S 0 900 765 0 80 0 - 35826 sigtim ? 00:00:00 charon
4 S 995 901 1 0 80 0 - 1164 poll_s ? 00:00:00 avahi-daemon
1 S 0 902 2 0 9 - - 0 irq_th ? 00:00:00 irq/317-mbox_ip
1 S 0 903 2 0 9 - - 0 irq_th ? 00:00:00 irq/321-rproc-w
1 S 0 904 2 0 9 - - 0 irq_th ? 00:00:00 irq/326-rproc-w
1 S 0 909 2 0 9 - - 0 irq_th ? 00:00:00 irq/313-mbox_ds
1 S 995 910 901 0 80 0 - 1164 unix_s ? 00:00:00 avahi-daemon
4 S 0 911 1 0 80 0 - 501 skb_re ? 00:00:00 syslogd
1 S 0 912 2 0 9 - - 0 irq_th ? 00:00:00 irq/317-mbox_ds
1 S 0 929 2 0 9 - - 0 irq_th ? 00:00:00 irq/490-remotep
1 S 0 930 2 0 9 - - 0 irq_th ? 00:00:00 irq/492-remotep
1 S 0 931 2 0 9 - - 0 irq_th ? 00:00:00 irq/494-remotep
1 S 0 934 2 0 9 - - 0 irq_th ? 00:00:00 irq/496-remotep
4 S 999 938 1 0 80 0 - 1080 SyS_ep ? 00:00:00 dbus-daemon
4 S 997 948 1 0 80 0 - 2725 SyS_ep ? 00:00:00 systemd-network
4 S 0 970 1 0 80 0 - 656 SyS_ep ? 00:00:00 systemd-logind
1 S 0 1006 2 0 9 - - 0 irq_th ? 00:00:00 irq/472-xhci-hc
4 S 0 1035 1 0 80 0 - 1221 wait ttyS2 00:00:00 login
4 S 0 1036 1 0 80 0 - 447 poll_s tty1 00:00:00 agetty
4 S 996 1040 1 0 80 0 - 754 SyS_ep ? 00:00:00 systemd-resolve
4 S 0 1048 1 0 80 0 - 1143 SyS_ep ? 00:00:00 lighttpd
5 S 0 1176 1 0 80 0 - 497 poll_s ? 00:00:00 telnetd
5 S 0 1177 1 0 80 0 - 2441 pipe_w ? 00:00:00 lad_dra7xx
5 S 65534 1190 1 0 80 0 - 614 poll_s ? 00:00:00 thttpd
5 S 0 1199 1 0 80 0 - 581 poll_s ? 00:00:00 rngd
1 S 0 1241 2 0 9 - - 0 irq_th ? 00:00:00 irq/91-48020000
1 S 0 1242 2 0 9 - - 0 irq_th ? 00:00:00 irq/301-4802000
4 S 0 1248 1 0 80 0 - 1354 SyS_ep ? 00:00:00 systemd
5 S 0 1252 1248 0 80 0 - 1854 sigtim ? 00:00:00 (sd-pam)
4 S 0 1259 1035 0 80 0 - 676 wait ttyS2 00:00:00 sh
0 R 0 1263 1259 0 80 0 - 553 - ttyS2 00:00:00 ps
========================= /proc/interrupts & /proc/softirqs (after boot)
root@am57xx-evm:~# cat /proc/interrupts
CPU0 CPU1
16: 0 0 CBAR 32 Level gp_timer
19: 30322 29398 GIC 27 Edge arch_timer
22: 0 0 CBAR 4 Level l3-dbg-irq
23: 0 0 WUGEN 10 Level l3-app-irq
25: 7 0 CBAR 232 Level dra7xx-pcie-main
26: 0 0 CBAR 233 Level dra7-pcie-msi
27: 1 0 CBAR 121 Level talert
29: 57 0 CBAR 8 Level omap-dma-engine
32: 583 0 CBAR 361 Level 43300000.edma_ccint
34: 0 0 CBAR 359 Level 43300000.edma_ccerrint
37: 0 0 CBAR 24 Level 4ae10000.gpio
38: 0 0 4ae10000.gpio 0 Level palmas
70: 1 0 CBAR 25 Level 48055000.gpio
75: 0 1 48055000.gpio 4 Edge pixcir_tangoc
91: 0 0 48055000.gpio 20 Edge 48020000.serial
94: 0 0 48055000.gpio 23 Edge Up
95: 0 0 48055000.gpio 24 Edge Right
96: 0 0 48055000.gpio 25 Edge Down
99: 0 0 48055000.gpio 28 Edge Left
103: 0 0 CBAR 26 Level 48057000.gpio
136: 0 0 CBAR 27 Level 48059000.gpio
158: 0 0 48059000.gpio 21 Edge palmas_usb_vbus
169: 0 0 CBAR 28 Level 4805b000.gpio
202: 0 0 CBAR 29 Level 4805d000.gpio
230: 0 0 4805d000.gpio 27 Edge 4809c000.mmc cd
235: 0 0 CBAR 30 Level 48051000.gpio
268: 0 0 CBAR 116 Level 48053000.gpio
301: 552 0 CBAR 69 Level 48020000.serial
313: 50 0 CBAR 251 Level mbox_ipu1_ipc3x, mbox_dsp1_ipc3x
317: 60 0 CBAR 255 Level mbox_ipu2_ipc3x, mbox_dsp2_ipc3x
321: 0 0 CBAR 35 Level rproc-wdt
326: 0 0 CBAR 40 Level rproc-wdt
335: 8 0 CBAR 108 Level omap_dmm_irq_handler
337: 1888 0 CBAR 51 Level 48070000.i2c
338: 40 0 CBAR 56 Level 48060000.i2c
339: 28 0 CBAR 55 Level 4807c000.i2c
340: 12831 0 CBAR 78 Level mmc0
341: 1473 0 CBAR 81 Level mmc1
342: 13 0 CBAR 89 Level mmc2
343: 0 0 CBAR 23 Level 40d01000.mmu
344: 0 0 CBAR 145 Level 40d02000.mmu
345: 0 0 CBAR 395 Level 58882000.mmu
346: 0 0 CBAR 396 Level 55082000.mmu
347: 0 0 CBAR 49 Level 4a140000.sata
348: 0 0 CBAR 217 Level rtc2
349: 5 0 CBAR 72 Level dwc3-omap
350: 2 0 CBAR 87 Level dwc3-omap
351: 0 0 CBAR 151 Level 48468000.mcasp_tx
352: 0 0 CBAR 150 Level 48468000.mcasp_rx
354: 24 0 CBAR 335 Level 48484000.ethernet
355: 67 0 CBAR 336 Level 48484000.ethernet
357: 0 0 CBAR 354 Level vpe
361: 20 0 CBAR 46 Level 4b101000.sham
362: 0 0 CBAR 47 Level 48090000.rng
363: 0 0 CBAR 146 Level 41501000.mmu
364: 0 0 CBAR 147 Level 41502000.mmu
365: 0 0 CBAR 352 Level vip2-s0
366: 0 0 CBAR 393 Level vip2-s1
369: 16949 0 CBAR 20 Level OMAP DISPC
370: 0 0 CBAR 96 Level OMAP HDMI
435: 0 0 palmas 8 Edge 48070000.i2c:tps659038@58:tps659038_rtc
437: 0 0 PCI-MSI 0 Edge PCIe PME, aerdrv
469: 0 0 CBAR 2 Edge mcp7941x
470: 0 0 pinctrl 1060 Edge 2-006f
472: 66 0 CBAR 71 Level xhci-hcd:usb1
490: 1 0 4b220000.intc 16 Edge remoteproc4
492: 1 0 4b220000.intc 18 Edge remoteproc5
494: 1 0 4b2a0000.intc 16 Edge remoteproc6
496: 1 0 4b2a0000.intc 18 Edge remoteproc7
IPI0: 0 1 CPU wakeup interrupts
IPI1: 0 0 Timer broadcast interrupts
IPI2: 48241 64393 Rescheduling interrupts
IPI3: 0 0 Function call interrupts
IPI4: 432 620 Single function call interrupts
IPI5: 0 0 CPU stop interrupts
IPI6: 0 0 IRQ work interrupts
IPI7: 0 0 completion interrupts
Err: 0
root@am57xx-evm:~# cat /proc/softirqs
CPU0 CPU1
HI: 0 2
TIMER: 29167 29156
NET_TX: 0 57
NET_RX: 123 86
BLOCK: 0 0
BLOCK_IOPOLL: 0 0
TASKLET: 406 386
SCHED: 28322 28253
HRTIMER: 609 384
RCU: 0 0
* Re: [4.4-RT PATCH RFC/RFT] drivers: net: cpsw: mark rx/tx irq as IRQF_NO_THREAD
@ 2016-08-12 15:58 ` Grygorii Strashko
0 siblings, 0 replies; 18+ messages in thread
From: Grygorii Strashko @ 2016-08-12 15:58 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: Steven Rostedt, linux-omap, Alison Chaiken, linux-rt-users, netdev
Hi Sebastian,
On 08/12/2016 11:32 AM, Sebastian Andrzej Siewior wrote:
> On 2016-08-11 19:15:40 [+0300], Grygorii Strashko wrote:
>> Mark CPSW Rx/Tx IRQs as IRQF_NO_THREAD and avoid double scheduling on -RT
>> where this IRQs are forced threaded:
>> rx-irq
>> |- schedule threaded rx-irq handler
>> ...
>> |- threaded rx-irq handler -> cpsw_rx_interrupt()
>> |- napi_schedule()
>> |- __raise_softirq_irqoff()
>> |- wakeup_proper_softirq()
>> ...
>> napi
>
> This should not be the default path. The default should be napi running
> in the context of the threaded rx-irq handler once the handler is done.
> The wakeup_proper_softirq() part is only done if napi thinks that the
> callback functions runs for too long. So in *that* case you continue
> NAPI in the softirq-thread which runs at SCHED_OTHER.
Thanks for the comment. You're right:
irq_thread()->irq_forced_thread_fn()->local_bh_enable()
but wouldn't there still be two wake_up_process() calls either way,
plus the preempt_check_resched_rt() in napi_schedule()?
>
>> after:
>> rx-irq
>> |- cpsw_rx_interrupt()
>> |- napi_schedule()
>> |- irq_exit()
>> |- invoke_softirq()
>> |- wakeup_softirqd()
>> ...
>> napi
>
> Since you schedule the softirq from an IRQ-off region / without a
> process context you force the softirq to run in the thread at
> SCHED_OTHER priority.
Thanks a lot for the comments.
>
>> And, as result, get benefits from the following improvements (tested
>> on am57xx-evm):
>>
>> 1) "[ 78.348599] NOHZ: local_softirq_pending 80" message will not be
>> seen any more. Now these warnings can be seen once iperf is started.
>> # iperf -c $IPERFHOST -w 128K -d -t 60
>
> Do you also see "sched: RT throttling activated"? Because I don't see
> otherwise why this should pop up.
I've reverted my patch and did the requested experiments (some additional info below).
I do not see "sched: RT throttling activated" :(
root@am57xx-evm:~# ./net_perf.sh & cyclictest -m -Sp98 -q -D4m
[1] 1301
# /dev/cpu_dma_latency set to 0us
Linux am57xx-evm 4.4.16-rt23-00321-ga195e6a-dirty #92 SMP PREEMPT RT Fri Aug 12 14:03:59 EEST 2016 armv7l GNU/Linux
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 16.0 KByte (WARNING: requested 8.00 KByte)
------------------------------------------------------------
------------------------------------------------------------
Client connecting to 172.22.39.17, TCP port 5001
TCP window size: 16.0 KByte (WARNING: requested 8.00 KByte)
------------------------------------------------------------
[ 3] local 172.22.39.21 port 47500 connected with 172.22.39.17 port 5001
[ 5] local 172.22.39.21 port 5001 connected with 172.22.39.17 port 59955
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-60.0 sec 1.31 GBytes 188 Mbits/sec
[ 5] 0.0-60.0 sec 868 MBytes 121 Mbits/sec
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 32.0 KByte (WARNING: requested 16.0 KByte)
------------------------------------------------------------
------------------------------------------------------------
Client connecting to 172.22.39.17, TCP port 5001
TCP window size: 32.0 KByte (WARNING: requested 16.0 KByte)
------------------------------------------------------------
[ 5] local 172.22.39.21 port 47502 connected with 172.22.39.17 port 5001
[ 4] local 172.22.39.21 port 5001 connected with 172.22.39.17 port 59957
[ ID] Interval Transfer Bandwidth
[ 5] 0.0-60.0 sec 1.92 GBytes 275 Mbits/sec
[ 4] 0.0-60.0 sec 1.14 GBytes 163 Mbits/sec
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 64.0 KByte (WARNING: requested 32.0 KByte)
------------------------------------------------------------
------------------------------------------------------------
Client connecting to 172.22.39.17, TCP port 5001
TCP window size: 64.0 KByte (WARNING: requested 32.0 KByte)
------------------------------------------------------------
[ 5] local 172.22.39.21 port 47504 connected with 172.22.39.17 port 5001
[ 4] local 172.22.39.21 port 5001 connected with 172.22.39.17 port 59965
[ 2285.928516] NOHZ: local_softirq_pending 80
[ 2285.928549] NOHZ: local_softirq_pending 80
[ 2285.928594] NOHZ: local_softirq_pending 80
[ 2285.928689] NOHZ: local_softirq_pending 80
[ 2286.348572] NOHZ: local_softirq_pending 80
[ 2290.358623] NOHZ: local_softirq_pending 80
[ 2291.588624] NOHZ: local_softirq_pending 80
[ 2291.698579] NOHZ: local_softirq_pending 80
[ 2292.608594] NOHZ: local_softirq_pending 80
[ 2293.178627] NOHZ: local_softirq_pending 80
[ ID] Interval Transfer Bandwidth
[ 5] 0.0-60.0 sec 1.43 GBytes 205 Mbits/sec
[ 4] 0.0-60.0 sec 1.58 GBytes 226 Mbits/sec
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 128 KByte (WARNING: requested 64.0 KByte)
------------------------------------------------------------
------------------------------------------------------------
Client connecting to 172.22.39.17, TCP port 5001
TCP window size: 128 KByte (WARNING: requested 64.0 KByte)
------------------------------------------------------------
[ 5] local 172.22.39.21 port 47506 connected with 172.22.39.17 port 5001
[ 4] local 172.22.39.21 port 5001 connected with 172.22.39.17 port 59980
T: 0 ( 1309) P:98 I:1000 C: 240000 Min: 8 Act: 13 Avg: 18 Max: 60
T: 1 ( 1310) P:98 I:1500 C: 159968 Min: 8 Act: 14 Avg: 16 Max: 58
root@am57xx-evm:~# [ ID] Interval Transfer Bandwidth
[ 5] 0.0-60.0 sec 1.84 GBytes 264 Mbits/sec
[ 4] 0.0-60.0 sec 1.47 GBytes 210 Mbits/sec
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 256 KByte (WARNING: requested 128 KByte)
------------------------------------------------------------
------------------------------------------------------------
Client connecting to 172.22.39.17, TCP port 5001
TCP window size: 256 KByte (WARNING: requested 128 KByte)
------------------------------------------------------------
[ 5] local 172.22.39.21 port 47508 connected with 172.22.39.17 port 5001
[ 4] local 172.22.39.21 port 5001 connected with 172.22.39.17 port 59982
[ ID] Interval Transfer Bandwidth
[ 5] 0.0-60.0 sec 974 MBytes 136 Mbits/sec
[ 4] 0.0-60.0 sec 2.84 GBytes 406 Mbits/sec
[1]+ Done ./net_perf.sh
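For reference, net_perf.sh as quoted in the commit message is just a loop over iperf window sizes. A minimal sketch (assumptions: iperf2 client, `-d` for a bidirectional test; the placeholder address 192.0.2.1 stands in for the real server, and commands are only printed unless RUN=1, so it is safe to dry-run):

```shell
#!/bin/sh
# Sketch of net_perf.sh: the iperf window-size sweep from the commit message.
# IPERFHOST defaults to a documentation placeholder; point it at your server.
IPERFHOST="${IPERFHOST:-192.0.2.1}"

net_perf() {
    for win in 8K 16K 32K 64K 128K; do
        echo "iperf -c $IPERFHOST -w $win -d -t 60"
        # Only actually generate load when explicitly asked to.
        if [ "${RUN:-0}" = "1" ]; then
            iperf -c "$IPERFHOST" -w "$win" -d -t 60
        fi
    done
}

net_perf
```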
>
>> 2) latency reduction when cyclictest is run in parallel with network load
>> where net_perf.sh is:
>> iperf -c $IPERFHOST -w 8K -d -t 60
>> iperf -c $IPERFHOST -w 16K -d -t 60
>> iperf -c $IPERFHOST -w 32K -d -t 60
>> iperf -c $IPERFHOST -w 64K -d -t 60
>> iperf -c $IPERFHOST -w 128K -d -t 60
>>
>> before:
>> T: 0 ( 1326) P:98 I:1000 C: 240000 Min: 8 Act: 13 Avg: 18 Max: 70
>> T: 1 ( 1327) P:98 I:1500 C: 159981 Min: 9 Act: 15 Avg: 16 Max: 43
>> after:
>> T: 0 ( 1331) P:98 I:1000 C: 240000 Min: 8 Act: 15 Avg: 14 Max: 51
>> T: 1 ( 1332) P:98 I:1500 C: 159953 Min: 8 Act: 16 Avg: 15 Max: 33
>
> -d 0 to have I: set to the same value.
> What does -i 250 say?
> And without network load we are where we were at "after" values?
=============================================== before, no net load:
cyclictest -m -Sp98 -q -D4m
# /dev/cpu_dma_latency set to 0us
T: 0 ( 1270) P:98 I:1000 C: 240000 Min: 8 Act: 10 Avg: 9 Max: 38
T: 1 ( 1271) P:98 I:1500 C: 159986 Min: 8 Act: 10 Avg: 9 Max: 23
cyclictest -m -Sp98 -q -D4m -d0
# /dev/cpu_dma_latency set to 0us
T: 0 ( 1280) P:98 I:1000 C: 240000 Min: 8 Act: 10 Avg: 9 Max: 27
T: 1 ( 1281) P:98 I:1000 C: 239982 Min: 8 Act: 13 Avg: 9 Max: 26
cyclictest -m -Sp98 -q -D4m -i250
# /dev/cpu_dma_latency set to 0us
T: 0 ( 1284) P:98 I:250 C: 960000 Min: 8 Act: 10 Avg: 8 Max: 24
T: 1 ( 1285) P:98 I:750 C: 319977 Min: 8 Act: 11 Avg: 9 Max: 21
cyclictest -m -Sp98 -q -D4m -i250 -d0
# /dev/cpu_dma_latency set to 0us
T: 0 ( 1288) P:98 I:250 C: 960000 Min: 8 Act: 9 Avg: 8 Max: 33
T: 1 ( 1289) P:98 I:250 C: 959929 Min: 7 Act: 11 Avg: 9 Max: 26
=============================================== before, with net load:
cyclictest -m -Sp98 -q -D4m
T: 0 ( 1309) P:98 I:1000 C: 240000 Min: 8 Act: 13 Avg: 18 Max: 60
T: 1 ( 1310) P:98 I:1500 C: 159968 Min: 8 Act: 14 Avg: 16 Max: 58
cyclictest -m -Sp98 -q -D4m -d0
T: 0 ( 1339) P:98 I:1000 C: 240000 Min: 8 Act: 28 Avg: 18 Max: 60
T: 1 ( 1340) P:98 I:1000 C: 239969 Min: 8 Act: 25 Avg: 16 Max: 43
cyclictest -m -Sp98 -q -D4m -i250
T: 0 ( 1369) P:98 I:250 C: 960000 Min: 8 Act: 13 Avg: 17 Max: 64
T: 1 ( 1370) P:98 I:750 C: 319941 Min: 8 Act: 18 Avg: 16 Max: 56
cyclictest -m -Sp98 -q -D4m -i250 -d0
T: 0 ( 1400) P:98 I:250 C: 960000 Min: 8 Act: 25 Avg: 18 Max: 83
T: 1 ( 1401) P:98 I:250 C: 959801 Min: 7 Act: 27 Avg: 17 Max: 48
=============================================== after, no net load:
cyclictest -m -Sp98 -q -D4m
T: 0 ( 1287) P:98 I:1000 C: 240000 Min: 8 Act: 10 Avg: 9 Max: 30
T: 1 ( 1288) P:98 I:1500 C: 159987 Min: 8 Act: 13 Avg: 10 Max: 32
cyclictest -m -Sp98 -q -D4m -d0
T: 0 ( 1295) P:98 I:1000 C: 240000 Min: 8 Act: 8 Avg: 9 Max: 39
T: 1 ( 1296) P:98 I:1000 C: 239981 Min: 7 Act: 9 Avg: 8 Max: 29
cyclictest -m -Sp98 -q -D4m -i250
T: 0 ( 1298) P:98 I:250 C: 960000 Min: 7 Act: 8 Avg: 9 Max: 42
T: 1 ( 1299) P:98 I:750 C: 319970 Min: 8 Act: 9 Avg: 8 Max: 29
cyclictest -m -Sp98 -q -D4m -i250 -d0
T: 0 ( 1301) P:98 I:250 C: 960000 Min: 7 Act: 9 Avg: 8 Max: 22
T: 1 ( 1302) P:98 I:250 C: 959914 Min: 7 Act: 11 Avg: 8 Max: 28
=============================================== after, with net load:
cyclictest -m -Sp98 -q -D4m
T: 0 ( 1315) P:98 I:1000 C: 240000 Min: 8 Act: 11 Avg: 14 Max: 40
T: 1 ( 1316) P:98 I:1500 C: 159962 Min: 8 Act: 19 Avg: 15 Max: 38
cyclictest -m -Sp98 -q -D4m -d0
T: 0 ( 1365) P:98 I:1000 C: 240000 Min: 8 Act: 14 Avg: 15 Max: 51
T: 1 ( 1366) P:98 I:1000 C: 239943 Min: 8 Act: 18 Avg: 15 Max: 36
cyclictest -m -Sp98 -q -D4m -i250
T: 0 ( 1309) P:98 I:250 C: 960000 Min: 8 Act: 15 Avg: 13 Max: 42
T: 1 ( 1310) P:98 I:750 C: 319986 Min: 8 Act: 18 Avg: 15 Max: 36
cyclictest -m -Sp98 -q -D4m -i250 -d0
T: 0 ( 1358) P:98 I:250 C: 960000 Min: 8 Act: 11 Avg: 14 Max: 42
T: 1 ( 1359) P:98 I:250 C: 959743 Min: 7 Act: 18 Avg: 15 Max: 36
>
> What happens if s/__raise_softirq_irqoff_ksoft/__raise_softirq_irqoff/
> in net/core/dev.c and chrt the priority of you network interrupt
> handlers to SCHED_OTHER priority?
===== without this patch + __raise_softirq_irqoff + netIRQs->SCHED_OTHER
with net load:
cyclictest -m -Sp98 -q -D4m -i250 -d0
T: 0 ( 1325) P:98 I:1000 C: 240000 Min: 8 Act: 22 Avg: 17 Max: 51
T: 1 ( 1326) P:98 I:1500 C: 159981 Min: 8 Act: 15 Avg: 15 Max: 39
cyclictest -m -Sp98 -q -D4m -i250 -d0
T: 0 ( 1307) P:98 I:250 C: 960000 Min: 7 Act: 13 Avg: 16 Max: 50
T: 1 ( 1308) P:98 I:250 C: 959819 Min: 8 Act: 12 Avg: 14 Max: 37
and net performance is better:
root@am57xx-evm:~# ps -A | grep 4848
82 ? 00:00:00 irq/354-4848400
83 ? 00:00:00 irq/355-4848400
root@am57xx-evm:~# chrt -o -p 0 82
root@am57xx-evm:~# chrt -o -p 0 83
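The manual ps | grep + chrt steps above can be scripted. A sketch, assuming procps `ps` and util-linux `chrt`; the "4848400" name fragment is the cpsw controller's MMIO base on am57xx, so substitute your own device's IRQ thread names, and run as root:

```shell
#!/bin/sh
# Find forced-threaded IRQ handlers by name fragment and demote them to
# SCHED_OTHER, generalizing the manual chrt calls above. Note: `ps -o comm`
# truncates thread names to 15 characters ("irq/354-4848400" just fits).
list_irq_threads() {   # $1 = name fragment, e.g. 4848400
    ps -e -o pid=,comm= | awk -v pat="$1" \
        '$2 ~ /^irq\// && index($2, pat) { print $1, $2 }'
}

demote_irq_threads() { # needs root; prints each command as it runs
    list_irq_threads "$1" | while read -r pid name; do
        echo "chrt -o -p 0 $pid   # $name"
        chrt -o -p 0 "$pid"
    done
}

# Example on the am57xx board (as root):
#   demote_irq_threads 4848400
```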
./net_perf.sh & cyclictest -m -Sp98 -q -D4m -i250 -d0
[1] 1298
# /dev/cpu_dma_latency set to 0us
Linux am57xx-evm 4.4.16-rt23-00321-ga195e6a-dirty #95 SMP PREEMPT RT Fri Aug 12 16:20:42 EEST 2016 armv7l GNU/Linux
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 16.0 KByte (WARNING: requested 8.00 KByte)
------------------------------------------------------------
------------------------------------------------------------
Client connecting to 172.22.39.17, TCP port 5001
TCP window size: 16.0 KByte (WARNING: requested 8.00 KByte)
------------------------------------------------------------
[ 3] local 172.22.39.21 port 58672 connected with 172.22.39.17 port 5001
[ 5] local 172.22.39.21 port 5001 connected with 172.22.39.17 port 60672
[ 78.718828] NOHZ: local_softirq_pending 80
[ 78.908849] NOHZ: local_softirq_pending 80
[ 78.908927] NOHZ: local_softirq_pending 80
[ 78.918827] NOHZ: local_softirq_pending 80
[ 79.058826] NOHZ: local_softirq_pending 80
[ 81.488918] NOHZ: local_softirq_pending 80
[ 81.489162] NOHZ: local_softirq_pending 80
[ 81.489398] NOHZ: local_softirq_pending 80
[ 81.489646] NOHZ: local_softirq_pending 80
[ 81.489895] NOHZ: local_softirq_pending 80
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-60.0 sec 1.30 GBytes 187 Mbits/sec
[ 5] 0.0-60.0 sec 637 MBytes 89.0 Mbits/sec
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 32.0 KByte (WARNING: requested 16.0 KByte)
------------------------------------------------------------
------------------------------------------------------------
Client connecting to 172.22.39.17, TCP port 5001
TCP window size: 32.0 KByte (WARNING: requested 16.0 KByte)
------------------------------------------------------------
[ 5] local 172.22.39.21 port 58674 connected with 172.22.39.17 port 5001
[ 4] local 172.22.39.21 port 5001 connected with 172.22.39.17 port 60680
[ ID] Interval Transfer Bandwidth
[ 5] 0.0-60.0 sec 1.78 GBytes 255 Mbits/sec
[ 4] 0.0-60.0 sec 1.47 GBytes 210 Mbits/sec
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 64.0 KByte (WARNING: requested 32.0 KByte)
------------------------------------------------------------
------------------------------------------------------------
Client connecting to 172.22.39.17, TCP port 5001
TCP window size: 64.0 KByte (WARNING: requested 32.0 KByte)
------------------------------------------------------------
[ 5] local 172.22.39.21 port 58676 connected with 172.22.39.17 port 5001
[ 4] local 172.22.39.21 port 5001 connected with 172.22.39.17 port 60693
[ ID] Interval Transfer Bandwidth
[ 5] 0.0-60.0 sec 1.62 GBytes 232 Mbits/sec
[ 4] 0.0-60.0 sec 2.07 GBytes 296 Mbits/sec
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 128 KByte (WARNING: requested 64.0 KByte)
------------------------------------------------------------
------------------------------------------------------------
Client connecting to 172.22.39.17, TCP port 5001
TCP window size: 128 KByte (WARNING: requested 64.0 KByte)
------------------------------------------------------------
[ 5] local 172.22.39.21 port 58678 connected with 172.22.39.17 port 5001
[ 4] local 172.22.39.21 port 5001 connected with 172.22.39.17 port 60702
T: 0 ( 1307) P:98 I:250 C: 960000 Min: 7 Act: 13 Avg: 16 Max: 50
T: 1 ( 1308) P:98 I:250 C: 959819 Min: 8 Act: 12 Avg: 14 Max: 37
root@am57xx-evm:~# [ ID] Interval Transfer Bandwidth
[ 5] 0.0-60.0 sec 1.77 GBytes 254 Mbits/sec
[ 4] 0.0-60.0 sec 1.77 GBytes 253 Mbits/sec
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 256 KByte (WARNING: requested 128 KByte)
------------------------------------------------------------
------------------------------------------------------------
Client connecting to 172.22.39.17, TCP port 5001
TCP window size: 256 KByte (WARNING: requested 128 KByte)
------------------------------------------------------------
[ 5] local 172.22.39.21 port 58680 connected with 172.22.39.17 port 5001
[ 4] local 172.22.39.21 port 5001 connected with 172.22.39.17 port 60711
[ ID] Interval Transfer Bandwidth
[ 5] 0.0-60.0 sec 2.18 GBytes 312 Mbits/sec
[ 4] 0.0-60.0 sec 2.53 GBytes 362 Mbits/sec
>
>> 3) network performance increase
>>
>> win, K Mbits/s
>> before after %
>> 8K 354 350.3 0.0
>> 16K 412 551 33.7
>> 32K 423 659.5 55.9
>> 64K 436 728.3 67.0
>> 128K 537 845 57.4
>
> How close are the after numbers to !RT?
http://processors.wiki.ti.com/index.php/Processor_SDK_Linux_Kernel_Performance_Guide#AM57XX_TCP_Performance
Min 20% drop even with this patch :(
--
regards,
-grygorii
========================= ps -l -A (after boot)
root@am57xx-evm:~# ps -l -A
F S UID PID PPID C PRI NI ADDR SZ WCHAN TTY TIME CMD
4 S 0 1 0 1 80 0 - 1321 SyS_ep ? 00:00:05 systemd
1 S 0 2 0 0 80 0 - 0 kthrea ? 00:00:00 kthreadd
1 S 0 3 2 0 80 0 - 0 smpboo ? 00:00:00 ksoftirqd/0
5 S 0 4 2 0 58 - - 0 smpboo ? 00:00:00 ktimersoftd/0
1 S 0 5 2 0 80 0 - 0 worker ? 00:00:00 kworker/0:0
1 S 0 6 2 0 60 -20 - 0 worker ? 00:00:00 kworker/0:0H
1 S 0 7 2 0 80 0 - 0 worker ? 00:00:00 kworker/u4:0
1 S 0 8 2 0 58 - - 0 rcu_gp ? 00:00:00 rcu_preempt
1 S 0 9 2 0 58 - - 0 rcu_gp ? 00:00:00 rcu_sched
1 S 0 10 2 0 58 - - 0 rcu_bo ? 00:00:00 rcub/0
1 S 0 11 2 0 58 - - 0 smpboo ? 00:00:00 rcuc/0
1 S 0 12 2 0 80 0 - 0 run_cl ? 00:00:00 kclksetdelayd
1 S 0 13 2 0 -40 - - 0 posix_ ? 00:00:00 posixcputmr/0
1 S 0 14 2 0 80 0 - 0 run_cm ? 00:00:00 kcmosdelayd
1 S 0 15 2 0 -40 - - 0 smpboo ? 00:00:00 migration/0
1 S 0 16 2 0 -40 - - 0 smpboo ? 00:00:00 migration/1
1 S 0 17 2 0 58 - - 0 smpboo ? 00:00:00 rcuc/1
5 S 0 18 2 0 58 - - 0 smpboo ? 00:00:00 ktimersoftd/1
1 S 0 19 2 0 80 0 - 0 smpboo ? 00:00:00 ksoftirqd/1
1 S 0 20 2 0 -40 - - 0 posix_ ? 00:00:00 posixcputmr/1
1 S 0 21 2 0 80 0 - 0 worker ? 00:00:00 kworker/1:0
1 S 0 22 2 0 60 -20 - 0 worker ? 00:00:00 kworker/1:0H
5 S 0 23 2 0 80 0 - 0 devtmp ? 00:00:00 kdevtmpfs
1 S 0 24 2 0 80 0 - 0 swork_ ? 00:00:00 kswork
1 S 0 25 2 0 60 -20 - 0 rescue ? 00:00:00 perf
1 S 0 26 2 0 9 - - 0 irq_th ? 00:00:00 irq/37-4ae10000
1 S 0 27 2 0 80 0 - 0 worker ? 00:00:00 kworker/0:1
1 S 0 28 2 0 9 - - 0 irq_th ? 00:00:00 irq/70-48055000
1 S 0 29 2 0 9 - - 0 irq_th ? 00:00:00 irq/103-4805700
1 S 0 30 2 0 9 - - 0 irq_th ? 00:00:00 irq/136-4805900
1 S 0 31 2 0 9 - - 0 irq_th ? 00:00:00 irq/169-4805b00
1 S 0 32 2 0 9 - - 0 irq_th ? 00:00:00 irq/202-4805d00
1 S 0 33 2 0 9 - - 0 irq_th ? 00:00:00 irq/235-4805100
1 S 0 34 2 0 9 - - 0 irq_th ? 00:00:00 irq/268-4805300
1 S 0 35 2 0 60 -20 - 0 rescue ? 00:00:00 writeback
1 S 0 36 2 0 60 -20 - 0 rescue ? 00:00:00 crypto
1 S 0 37 2 0 60 -20 - 0 rescue ? 00:00:00 bioset
1 S 0 38 2 0 60 -20 - 0 rescue ? 00:00:00 kblockd
1 S 0 39 2 0 9 - - 0 irq_th ? 00:00:00 irq/29-omap-dma
1 S 0 40 2 0 9 - - 0 irq_th ? 00:00:00 irq/32-43300000
1 S 0 41 2 0 9 - - 0 irq_th ? 00:00:00 irq/34-43300000
1 S 0 42 2 0 9 - - 0 irq_th ? 00:00:00 irq/343-40d0100
1 S 0 43 2 0 9 - - 0 irq_th ? 00:00:00 irq/344-40d0200
1 S 0 44 2 0 9 - - 0 irq_th ? 00:00:00 irq/345-5888200
1 S 0 45 2 0 9 - - 0 irq_th ? 00:00:00 irq/346-5508200
1 S 0 46 2 0 9 - - 0 irq_th ? 00:00:00 irq/363-4150100
1 S 0 47 2 0 9 - - 0 irq_th ? 00:00:00 irq/364-4150200
1 S 0 48 2 0 9 - - 0 irq_th ? 00:00:00 irq/337-4807000
1 S 0 49 2 0 9 - - 0 irq_th ? 00:00:00 irq/38-palmas
1 S 0 50 2 0 9 - - 0 irq_th ? 00:00:00 irq/338-4806000
1 S 0 51 2 0 9 - - 0 irq_th ? 00:00:00 irq/339-4807c00
1 S 0 52 2 0 60 -20 - 0 rescue ? 00:00:00 edac-poller
1 S 0 53 2 0 60 -20 - 0 rescue ? 00:00:00 rpciod
1 S 0 54 2 0 80 0 - 0 kswapd ? 00:00:00 kswapd0
1 S 0 55 2 0 60 -20 - 0 rescue ? 00:00:00 vmstat
1 S 0 56 2 0 80 0 - 0 fsnoti ? 00:00:00 fsnotify_mark
1 S 0 57 2 0 60 -20 - 0 rescue ? 00:00:00 nfsiod
1 S 0 66 2 0 9 - - 0 irq_th ? 00:00:00 irq/25-dra7xx-p
1 S 0 67 2 0 9 - - 0 irq_th ? 00:00:00 irq/437-PCIe PM
1 S 0 68 2 0 9 - - 0 irq_th ? 00:00:00 irq/437-aerdrv
1 S 0 69 2 0 9 - - 0 irq_th ? 00:00:00 irq/370-OMAP HD
1 S 0 70 2 0 60 -20 - 0 rescue ? 00:00:00 bioset
1 S 0 71 2 0 60 -20 - 0 rescue ? 00:00:00 bioset
1 S 0 72 2 0 60 -20 - 0 rescue ? 00:00:00 bioset
1 S 0 73 2 0 60 -20 - 0 rescue ? 00:00:00 bioset
1 S 0 74 2 0 60 -20 - 0 rescue ? 00:00:00 bioset
1 S 0 75 2 0 60 -20 - 0 rescue ? 00:00:00 bioset
1 S 0 76 2 0 60 -20 - 0 rescue ? 00:00:00 bioset
1 S 0 77 2 0 60 -20 - 0 rescue ? 00:00:00 bioset
1 S 0 78 2 0 80 0 - 0 worker ? 00:00:00 kworker/u4:1
1 S 0 81 2 0 80 0 - 0 worker ? 00:00:01 kworker/1:1
1 S 0 82 2 0 9 - - 0 irq_th ? 00:00:00 irq/354-4848400
1 S 0 83 2 0 9 - - 0 irq_th ? 00:00:00 irq/355-4848400
1 S 0 84 2 0 60 -20 - 0 rescue ? 00:00:00 kpsmoused
1 S 0 85 2 0 9 - - 0 irq_th ? 00:00:00 irq/27-talert
1 S 0 86 2 0 9 - - 0 irq_th ? 00:00:00 irq/340-mmc0
1 S 0 87 2 0 9 - - 0 irq_th ? 00:00:00 irq/230-4809c00
1 S 0 88 2 0 9 - - 0 irq_th ? 00:00:00 irq/341-mmc1
1 S 0 89 2 0 9 - - 0 irq_th ? 00:00:00 irq/342-mmc2
1 S 0 90 2 0 9 - - 0 irq_th ? 00:00:00 irq/352-4846800
1 S 0 91 2 0 9 - - 0 irq_th ? 00:00:00 irq/351-4846800
1 S 0 92 2 0 60 -20 - 0 rescue ? 00:00:00 ipv6_addrconf
1 S 0 93 2 0 9 - - 0 irq_th ? 00:00:00 irq/335-omap_dm
1 S 0 94 2 0 9 - - 0 irq_th ? 00:00:00 irq/369-OMAP DI
1 S 0 95 2 0 60 -20 - 0 rescue ? 00:00:00 bioset
1 S 0 96 2 0 80 0 - 0 mmc_qu ? 00:00:00 mmcqd/0
1 S 0 97 2 0 60 -20 - 0 rescue ? 00:00:00 bioset
1 S 0 98 2 0 80 0 - 0 mmc_qu ? 00:00:00 mmcqd/1
1 S 0 99 2 0 60 -20 - 0 rescue ? 00:00:00 bioset
1 S 0 100 2 0 80 0 - 0 mmc_qu ? 00:00:00 mmcqd/1boot0
1 S 0 101 2 0 60 -20 - 0 rescue ? 00:00:00 bioset
1 S 0 102 2 0 80 0 - 0 mmc_qu ? 00:00:00 mmcqd/1boot1
1 S 0 103 2 0 60 -20 - 0 rescue ? 00:00:00 deferwq
1 S 0 104 2 0 9 - - 0 irq_th ? 00:00:00 irq/94-Up
1 S 0 105 2 0 9 - - 0 irq_th ? 00:00:00 irq/96-Down
1 S 0 106 2 0 9 - - 0 irq_th ? 00:00:00 irq/99-Left
1 S 0 107 2 0 9 - - 0 irq_th ? 00:00:00 irq/95-Right
1 S 0 110 2 0 60 -20 - 0 worker ? 00:00:00 kworker/1:1H
1 S 0 111 2 0 80 0 - 0 kjourn ? 00:00:00 jbd2/mmcblk0p2-
1 S 0 112 2 0 60 -20 - 0 rescue ? 00:00:00 ext4-rsv-conver
1 S 0 129 2 0 60 -20 - 0 worker ? 00:00:00 kworker/0:1H
1 S 0 165 2 0 80 0 - 0 worker ? 00:00:00 kworker/0:2
1 S 0 168 2 0 80 0 - 0 worker ? 00:00:00 kworker/1:2
4 S 0 197 1 0 80 0 - 2566 SyS_ep ? 00:00:01 systemd-journal
1 S 0 249 2 0 80 0 - 0 worker ? 00:00:00 kworker/u4:2
1 S 0 255 2 0 80 0 - 0 worker ? 00:00:00 kworker/0:3
4 S 0 289 1 0 80 0 - 2731 SyS_ep ? 00:00:02 systemd-udevd
1 S 0 330 2 0 80 0 - 0 worker ? 00:00:00 kworker/0:4
1 S 0 339 2 0 80 0 - 0 worker ? 00:00:00 kworker/0:5
1 S 0 387 2 0 9 - - 0 irq_th ? 00:00:00 irq/362-4809000
1 S 0 407 2 0 9 - - 0 irq_th ? 00:00:00 irq/470-2-006f
1 S 0 409 2 0 9 - - 0 irq_th ? 00:00:00 irq/469-mcp7941
1 S 0 413 2 0 9 - - 0 irq_th ? 00:00:00 irq/75-pixcir_t
4 S 998 418 1 0 80 0 - 3038 SyS_ep ? 00:00:00 systemd-timesyn
1 S 0 429 2 0 9 - - 0 irq_th ? 00:00:00 irq/158-palmas_
1 S 0 462 2 0 9 - - 0 irq_th ? 00:00:00 irq/348-rtc2
1 S 0 464 2 0 9 - - 0 irq_th ? 00:00:00 irq/349-dwc3-om
1 S 0 466 2 0 9 - - 0 irq_th ? 00:00:00 irq/350-dwc3-om
1 S 0 493 2 0 9 - - 0 irq_th ? 00:00:00 irq/357-vpe
1 S 0 503 2 0 60 -20 - 0 rescue ? 00:00:00 ata_sff
1 S 0 509 2 0 9 - - 0 irq_th ? 00:00:00 irq/365-vip2-s0
1 S 0 510 2 0 9 - - 0 irq_th ? 00:00:00 irq/366-vip2-s1
1 S 0 514 2 0 9 - - 0 irq_th ? 00:00:00 irq/347-4a14000
1 S 0 516 2 0 80 0 - 0 scsi_e ? 00:00:00 scsi_eh_0
1 S 0 517 2 0 60 -20 - 0 rescue ? 00:00:00 scsi_tmf_0
1 S 0 595 2 0 9 - - 0 irq_th ? 00:00:00 irq/361-4b10100
1 S 0 605 2 0 80 0 - 0 worker ? 00:00:00 kworker/u4:3
4 S 0 765 1 0 80 0 - 498 poll_s ? 00:00:00 starter
4 S 0 792 1 0 80 0 - 497 syslog ? 00:00:00 klogd
4 S 0 819 1 0 80 0 - 2210 poll_s ? 00:00:01 ofonod
1 S 0 825 2 0 80 0 - 0 worker ? 00:00:02 kworker/1:3
1 S 0 887 2 0 9 - - 0 irq_th ? 00:00:00 irq/313-mbox_ip
4 S 0 900 765 0 80 0 - 35826 sigtim ? 00:00:00 charon
4 S 995 901 1 0 80 0 - 1164 poll_s ? 00:00:00 avahi-daemon
1 S 0 902 2 0 9 - - 0 irq_th ? 00:00:00 irq/317-mbox_ip
1 S 0 903 2 0 9 - - 0 irq_th ? 00:00:00 irq/321-rproc-w
1 S 0 904 2 0 9 - - 0 irq_th ? 00:00:00 irq/326-rproc-w
1 S 0 909 2 0 9 - - 0 irq_th ? 00:00:00 irq/313-mbox_ds
1 S 995 910 901 0 80 0 - 1164 unix_s ? 00:00:00 avahi-daemon
4 S 0 911 1 0 80 0 - 501 skb_re ? 00:00:00 syslogd
1 S 0 912 2 0 9 - - 0 irq_th ? 00:00:00 irq/317-mbox_ds
1 S 0 929 2 0 9 - - 0 irq_th ? 00:00:00 irq/490-remotep
1 S 0 930 2 0 9 - - 0 irq_th ? 00:00:00 irq/492-remotep
1 S 0 931 2 0 9 - - 0 irq_th ? 00:00:00 irq/494-remotep
1 S 0 934 2 0 9 - - 0 irq_th ? 00:00:00 irq/496-remotep
4 S 999 938 1 0 80 0 - 1080 SyS_ep ? 00:00:00 dbus-daemon
4 S 997 948 1 0 80 0 - 2725 SyS_ep ? 00:00:00 systemd-network
4 S 0 970 1 0 80 0 - 656 SyS_ep ? 00:00:00 systemd-logind
1 S 0 1006 2 0 9 - - 0 irq_th ? 00:00:00 irq/472-xhci-hc
4 S 0 1035 1 0 80 0 - 1221 wait ttyS2 00:00:00 login
4 S 0 1036 1 0 80 0 - 447 poll_s tty1 00:00:00 agetty
4 S 996 1040 1 0 80 0 - 754 SyS_ep ? 00:00:00 systemd-resolve
4 S 0 1048 1 0 80 0 - 1143 SyS_ep ? 00:00:00 lighttpd
5 S 0 1176 1 0 80 0 - 497 poll_s ? 00:00:00 telnetd
5 S 0 1177 1 0 80 0 - 2441 pipe_w ? 00:00:00 lad_dra7xx
5 S 65534 1190 1 0 80 0 - 614 poll_s ? 00:00:00 thttpd
5 S 0 1199 1 0 80 0 - 581 poll_s ? 00:00:00 rngd
1 S 0 1241 2 0 9 - - 0 irq_th ? 00:00:00 irq/91-48020000
1 S 0 1242 2 0 9 - - 0 irq_th ? 00:00:00 irq/301-4802000
4 S 0 1248 1 0 80 0 - 1354 SyS_ep ? 00:00:00 systemd
5 S 0 1252 1248 0 80 0 - 1854 sigtim ? 00:00:00 (sd-pam)
4 S 0 1259 1035 0 80 0 - 676 wait ttyS2 00:00:00 sh
0 R 0 1263 1259 0 80 0 - 553 - ttyS2 00:00:00 ps
========================= /proc/interrupts & /proc/softirqs (after boot)
root@am57xx-evm:~# cat /proc/interrupts
CPU0 CPU1
16: 0 0 CBAR 32 Level gp_timer
19: 30322 29398 GIC 27 Edge arch_timer
22: 0 0 CBAR 4 Level l3-dbg-irq
23: 0 0 WUGEN 10 Level l3-app-irq
25: 7 0 CBAR 232 Level dra7xx-pcie-main
26: 0 0 CBAR 233 Level dra7-pcie-msi
27: 1 0 CBAR 121 Level talert
29: 57 0 CBAR 8 Level omap-dma-engine
32: 583 0 CBAR 361 Level 43300000.edma_ccint
34: 0 0 CBAR 359 Level 43300000.edma_ccerrint
37: 0 0 CBAR 24 Level 4ae10000.gpio
38: 0 0 4ae10000.gpio 0 Level palmas
70: 1 0 CBAR 25 Level 48055000.gpio
75: 0 1 48055000.gpio 4 Edge pixcir_tangoc
91: 0 0 48055000.gpio 20 Edge 48020000.serial
94: 0 0 48055000.gpio 23 Edge Up
95: 0 0 48055000.gpio 24 Edge Right
96: 0 0 48055000.gpio 25 Edge Down
99: 0 0 48055000.gpio 28 Edge Left
103: 0 0 CBAR 26 Level 48057000.gpio
136: 0 0 CBAR 27 Level 48059000.gpio
158: 0 0 48059000.gpio 21 Edge palmas_usb_vbus
169: 0 0 CBAR 28 Level 4805b000.gpio
202: 0 0 CBAR 29 Level 4805d000.gpio
230: 0 0 4805d000.gpio 27 Edge 4809c000.mmc cd
235: 0 0 CBAR 30 Level 48051000.gpio
268: 0 0 CBAR 116 Level 48053000.gpio
301: 552 0 CBAR 69 Level 48020000.serial
313: 50 0 CBAR 251 Level mbox_ipu1_ipc3x, mbox_dsp1_ipc3x
317: 60 0 CBAR 255 Level mbox_ipu2_ipc3x, mbox_dsp2_ipc3x
321: 0 0 CBAR 35 Level rproc-wdt
326: 0 0 CBAR 40 Level rproc-wdt
335: 8 0 CBAR 108 Level omap_dmm_irq_handler
337: 1888 0 CBAR 51 Level 48070000.i2c
338: 40 0 CBAR 56 Level 48060000.i2c
339: 28 0 CBAR 55 Level 4807c000.i2c
340: 12831 0 CBAR 78 Level mmc0
341: 1473 0 CBAR 81 Level mmc1
342: 13 0 CBAR 89 Level mmc2
343: 0 0 CBAR 23 Level 40d01000.mmu
344: 0 0 CBAR 145 Level 40d02000.mmu
345: 0 0 CBAR 395 Level 58882000.mmu
346: 0 0 CBAR 396 Level 55082000.mmu
347: 0 0 CBAR 49 Level 4a140000.sata
348: 0 0 CBAR 217 Level rtc2
349: 5 0 CBAR 72 Level dwc3-omap
350: 2 0 CBAR 87 Level dwc3-omap
351: 0 0 CBAR 151 Level 48468000.mcasp_tx
352: 0 0 CBAR 150 Level 48468000.mcasp_rx
354: 24 0 CBAR 335 Level 48484000.ethernet
355: 67 0 CBAR 336 Level 48484000.ethernet
357: 0 0 CBAR 354 Level vpe
361: 20 0 CBAR 46 Level 4b101000.sham
362: 0 0 CBAR 47 Level 48090000.rng
363: 0 0 CBAR 146 Level 41501000.mmu
364: 0 0 CBAR 147 Level 41502000.mmu
365: 0 0 CBAR 352 Level vip2-s0
366: 0 0 CBAR 393 Level vip2-s1
369: 16949 0 CBAR 20 Level OMAP DISPC
370: 0 0 CBAR 96 Level OMAP HDMI
435: 0 0 palmas 8 Edge 48070000.i2c:tps659038@58:tps659038_rtc
437: 0 0 PCI-MSI 0 Edge PCIe PME, aerdrv
469: 0 0 CBAR 2 Edge mcp7941x
470: 0 0 pinctrl 1060 Edge 2-006f
472: 66 0 CBAR 71 Level xhci-hcd:usb1
490: 1 0 4b220000.intc 16 Edge remoteproc4
492: 1 0 4b220000.intc 18 Edge remoteproc5
494: 1 0 4b2a0000.intc 16 Edge remoteproc6
496: 1 0 4b2a0000.intc 18 Edge remoteproc7
IPI0: 0 1 CPU wakeup interrupts
IPI1: 0 0 Timer broadcast interrupts
IPI2: 48241 64393 Rescheduling interrupts
IPI3: 0 0 Function call interrupts
IPI4: 432 620 Single function call interrupts
IPI5: 0 0 CPU stop interrupts
IPI6: 0 0 IRQ work interrupts
IPI7: 0 0 completion interrupts
Err: 0
root@am57xx-evm:~# cat /proc/softirqs
CPU0 CPU1
HI: 0 2
TIMER: 29167 29156
NET_TX: 0 57
NET_RX: 123 86
BLOCK: 0 0
BLOCK_IOPOLL: 0 0
TASKLET: 406 386
SCHED: 28322 28253
HRTIMER: 609 384
RCU: 0 0
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [4.4-RT PATCH RFC/RFT] drivers: net: cpsw: mark rx/tx irq as IRQF_NO_THREAD
2016-08-12 8:32 ` Sebastian Andrzej Siewior
@ 2016-08-19 14:29 ` Grygorii Strashko
2016-08-19 14:29 ` Grygorii Strashko
1 sibling, 0 replies; 18+ messages in thread
From: Grygorii Strashko @ 2016-08-19 14:29 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: Steven Rostedt, linux-omap, Alison Chaiken, linux-rt-users, netdev
On 08/12/2016 11:32 AM, Sebastian Andrzej Siewior wrote:
> On 2016-08-11 19:15:40 [+0300], Grygorii Strashko wrote:
>> Mark CPSW Rx/Tx IRQs as IRQF_NO_THREAD and avoid double scheduling on -RT
>> where these IRQs are forced threaded:
>> rx-irq
>> |- schedule threaded rx-irq handler
>> ...
>> |- threaded rx-irq handler -> cpsw_rx_interrupt()
>> |- napi_schedule()
>> |- __raise_softirq_irqoff()
>> |- wakeup_proper_softirq()
>> ...
>> napi
>
> This should not be the default path. The default should be napi running
> in the context of the threaded rx-irq handler once the handler is done.
> The wakeup_proper_softirq() part is only done if napi thinks that the
> callback function runs for too long. So in *that* case you continue
> NAPI in the softirq-thread which runs at SCHED_OTHER.
>
>> after:
>> rx-irq
>> |- cpsw_rx_interrupt()
>> |- napi_schedule()
>> |- irq_exit()
>> |- invoke_softirq()
>> |- wakeup_softirqd()
>> ...
>> napi
>
> Since you schedule the softirq from an IRQ-off region / without a
> process context you force the softirq to run in the thread at
> SCHED_OTHER priority.
>
>> And, as result, get benefits from the following improvements (tested
>> on am57xx-evm):
>>
>> 1) The "[ 78.348599] NOHZ: local_softirq_pending 80" message will no
>> longer be seen. Currently these warnings appear once iperf is started:
>> # iperf -c $IPERFHOST -w 128K -d -t 60
>
> Do you also see "sched: RT throttling activated"? Because I don't see
> otherwise why this should pop up.
>
I've collected a trace before the first occurrence of "NOHZ: local_softirq_pending 80":
irq/354-4848400-85 [000] 90.639460: irq_handler_entry: irq=19 name=arch_timer
iperf-1284 [001] 90.639474: softirq_raise: vec=1 [action=TIMER]
iperf-1284 [001] 90.639486: irq_handler_exit: irq=19 ret=handled
irq/354-4848400-85 [000] 90.639488: softirq_raise: vec=7 [action=SCHED]
iperf-1284 [001] 90.639490: sched_waking: comm=ksoftirqd/1 pid=21 prio=120 target_cpu=001
irq/354-4848400-85 [000] 90.639492: softirq_raise: vec=1 [action=TIMER]
iperf-1284 [001] 90.639499: sched_wakeup: ksoftirqd/1:21 [120] success=1 CPU:001
iperf-1284 [001] 90.639504: sched_waking: comm=ktimersoftd/1 pid=20 prio=98 target_cpu=001
irq/354-4848400-85 [000] 90.639505: irq_handler_exit: irq=19 ret=handled
iperf-1284 [001] 90.639512: sched_wakeup: ktimersoftd/1:20 [98] success=1 CPU:001
iperf-1284 [001] 90.639526: sched_stat_runtime: comm=iperf pid=1284 runtime=50752 [ns] vruntime=2105322850 [ns]
iperf-1284 [001] 90.639537: sched_switch: iperf:1284 [120] R ==> irq/355-4848400:86 [49]
irq/355-4848400-86 [001] 90.639545: softirq_raise: vec=3 [action=NET_RX]
irq/355-4848400-86 [001] 90.639556: softirq_entry: vec=3 [action=NET_RX]
irq/355-4848400-86 [001] 90.639589: softirq_exit: vec=3 [action=NET_RX]
irq/355-4848400-86 [001] 90.639614: sched_switch: irq/355-4848400:86 [49] S ==> ktimersoftd/1:20 [98]
ktimersoftd/1-20 [001] 90.639625: softirq_entry: vec=1 [action=TIMER]
ktimersoftd/1-20 [001] 90.639637: sched_waking: comm=rcu_preempt pid=8 prio=98 target_cpu=001
ktimersoftd/1-20 [001] 90.639646: sched_wakeup: rcu_preempt:8 [98] success=1 CPU:001
ktimersoftd/1-20 [001] 90.639663: softirq_exit: vec=1 [action=TIMER]
ktimersoftd/1-20 [001] 90.639679: sched_switch: ktimersoftd/1:20 [98] S ==> rcu_preempt:8 [98]
rcu_preempt-8 [001] 90.639722: sched_switch: rcu_preempt:8 [98] S ==> ksoftirqd/1:21 [120]
ksoftirqd/1-21 [001] 90.639740: sched_stat_runtime: comm=ksoftirqd/1 pid=21 runtime=25539 [ns] vruntime=29960463828 [ns]
ksoftirqd/1-21 [001] 90.639750: sched_switch: ksoftirqd/1:21 [120] S ==> iperf:1284 [120]
irq/354-4848400-85 [000] 90.639878: irq_handler_entry: irq=355 name=48484000.ethernet
irq/354-4848400-85 [000] 90.639880: irq_handler_exit: irq=355 ret=handled
irq/354-4848400-85 [000] 90.639884: sched_waking: comm=irq/355-4848400 pid=86 prio=49 target_cpu=001
irq/354-4848400-85 [000] 90.639896: sched_wakeup: irq/355-4848400:86 [49] success=1 CPU:001
iperf-1284 [001] 90.639903: sched_stat_runtime: comm=iperf pid=1284 runtime=150466 [ns] vruntime=2105473316 [ns]
iperf-1284 [001] 90.639913: sched_switch: iperf:1284 [120] R ==> irq/355-4848400:86 [49]
irq/355-4848400-86 [001] 90.639921: softirq_raise: vec=3 [action=NET_RX]
irq/355-4848400-86 [001] 90.639932: softirq_entry: vec=3 [action=NET_RX]
irq/355-4848400-86 [001] 90.639958: sched_waking: comm=irq/354-4848400 pid=85 prio=49 target_cpu=000
irq/355-4848400-86 [001] 90.639963: sched_wakeup: irq/354-4848400:85 [49] success=1 CPU:000
irq/355-4848400-86 [001] 90.639984: softirq_exit: vec=3 [action=NET_RX]
irq/355-4848400-86 [001] 90.640012: sched_switch: irq/355-4848400:86 [49] S ==> iperf:1284 [120]
irq/354-4848400-85 [000] 90.640109: sched_waking: comm=iperf pid=1284 prio=120 target_cpu=001
irq/354-4848400-85 [000] 90.640115: sched_wakeup: iperf:1284 [120] success=1 CPU:001
irq/354-4848400-85 [000] 90.640152: irq_handler_entry: irq=355 name=48484000.ethernet
irq/354-4848400-85 [000] 90.640154: irq_handler_exit: irq=355 ret=handled
irq/354-4848400-85 [000] 90.640158: sched_waking: comm=irq/355-4848400 pid=86 prio=49 target_cpu=001
irq/354-4848400-85 [000] 90.640170: sched_wakeup: irq/355-4848400:86 [49] success=1 CPU:001
iperf-1284 [001] 90.640177: sched_stat_runtime: comm=iperf pid=1284 runtime=163480 [ns] vruntime=2105636796 [ns]
iperf-1284 [001] 90.640187: sched_switch: iperf:1284 [120] R ==> irq/355-4848400:86 [49]
irq/355-4848400-86 [001] 90.640195: softirq_raise: vec=3 [action=NET_RX]
irq/355-4848400-86 [001] 90.640206: softirq_entry: vec=3 [action=NET_RX]
irq/355-4848400-86 [001] 90.640239: softirq_exit: vec=3 [action=NET_RX]
irq/355-4848400-86 [001] 90.640266: sched_switch: irq/355-4848400:86 [49] S ==> iperf:1284 [120]
irq/354-4848400-85 [000] 90.640376: irq_handler_entry: irq=355 name=48484000.ethernet
irq/354-4848400-85 [000] 90.640378: irq_handler_exit: irq=355 ret=handled
irq/354-4848400-85 [000] 90.640382: sched_waking: comm=irq/355-4848400 pid=86 prio=49 target_cpu=001
irq/354-4848400-85 [000] 90.640394: sched_wakeup: irq/355-4848400:86 [49] success=1 CPU:001
iperf-1284 [001] 90.640401: sched_stat_runtime: comm=iperf pid=1284 runtime=133387 [ns] vruntime=2105770183 [ns]
iperf-1284 [001] 90.640412: sched_switch: iperf:1284 [120] R ==> irq/355-4848400:86 [49]
irq/355-4848400-86 [001] 90.640419: softirq_raise: vec=3 [action=NET_RX]
irq/355-4848400-86 [001] 90.640430: softirq_entry: vec=3 [action=NET_RX]
irq/355-4848400-86 [001] 90.640463: softirq_exit: vec=3 [action=NET_RX]
irq/355-4848400-86 [001] 90.640489: sched_switch: irq/355-4848400:86 [49] S ==> iperf:1284 [120]
irq/354-4848400-85 [000] 90.640560: softirq_exit: vec=3 [action=NET_RX]
irq/354-4848400-85 [000] 90.640568: softirq_raise: vec=3 [action=NET_RX]
irq/354-4848400-85 [000] 90.640579: softirq_entry: vec=3 [action=NET_RX]
irq/354-4848400-85 [000] 90.640606: irq_handler_entry: irq=355 name=48484000.ethernet
irq/354-4848400-85 [000] 90.640608: irq_handler_exit: irq=355 ret=handled
irq/354-4848400-85 [000] 90.640611: sched_waking: comm=irq/355-4848400 pid=86 prio=49 target_cpu=001
irq/354-4848400-85 [000] 90.640623: sched_wakeup: irq/355-4848400:86 [49] success=1 CPU:001
iperf-1284 [001] 90.640630: sched_stat_runtime: comm=iperf pid=1284 runtime=139080 [ns] vruntime=2105909263 [ns]
iperf-1284 [001] 90.640641: sched_switch: iperf:1284 [120] R ==> irq/355-4848400:86 [49]
irq/355-4848400-86 [001] 90.640648: softirq_raise: vec=3 [action=NET_RX]
irq/355-4848400-86 [001] 90.640659: softirq_entry: vec=3 [action=NET_RX]
irq/355-4848400-86 [001] 90.640692: softirq_exit: vec=3 [action=NET_RX]
irq/355-4848400-86 [001] 90.640720: sched_switch: irq/355-4848400:86 [49] S ==> iperf:1284 [120]
irq/354-4848400-85 [000] 90.640806: softirq_exit: vec=3 [action=NET_RX]
irq/354-4848400-85 [000] 90.640820: irq_handler_entry: irq=354 name=48484000.ethernet
irq/354-4848400-85 [000] 90.640823: irq_handler_exit: irq=354 ret=handled
irq/354-4848400-85 [000] 90.640833: irq_handler_entry: irq=355 name=48484000.ethernet
irq/354-4848400-85 [000] 90.640835: irq_handler_exit: irq=355 ret=handled
irq/354-4848400-85 [000] 90.640838: sched_waking: comm=irq/355-4848400 pid=86 prio=49 target_cpu=001
irq/354-4848400-85 [000] 90.640849: sched_wakeup: irq/355-4848400:86 [49] success=1 CPU:001
iperf-1284 [001] 90.640857: sched_stat_runtime: comm=iperf pid=1284 runtime=135501 [ns] vruntime=2106044764 [ns]
irq/354-4848400-85 [000] 90.640861: softirq_raise: vec=3 [action=NET_RX]
iperf-1284 [001] 90.640867: sched_switch: iperf:1284 [120] R ==> irq/355-4848400:86 [49]
irq/354-4848400-85 [000] 90.640872: irq_handler_entry: irq=354 name=48484000.ethernet
irq/354-4848400-85 [000] 90.640874: irq_handler_exit: irq=354 ret=handled
irq/355-4848400-86 [001] 90.640875: softirq_raise: vec=3 [action=NET_RX]
irq/354-4848400-85 [000] 90.640885: softirq_entry: vec=3 [action=NET_RX]
irq/355-4848400-86 [001] 90.640886: softirq_entry: vec=3 [action=NET_RX]
irq/355-4848400-86 [001] 90.640919: softirq_exit: vec=3 [action=NET_RX]
irq/355-4848400-86 [001] 90.640945: sched_switch: irq/355-4848400:86 [49] S ==> iperf:1284 [120]
irq/354-4848400-85 [000] 90.641052: irq_handler_entry: irq=355 name=48484000.ethernet
irq/354-4848400-85 [000] 90.641054: irq_handler_exit: irq=355 ret=handled
irq/354-4848400-85 [000] 90.641058: sched_waking: comm=irq/355-4848400 pid=86 prio=49 target_cpu=001
irq/354-4848400-85 [000] 90.641070: sched_wakeup: irq/355-4848400:86 [49] success=1 CPU:001
iperf-1284 [001] 90.641077: sched_stat_runtime: comm=iperf pid=1284 runtime=129483 [ns] vruntime=2106174247 [ns]
iperf-1284 [001] 90.641087: sched_switch: iperf:1284 [120] R ==> irq/355-4848400:86 [49]
irq/355-4848400-86 [001] 90.641095: softirq_raise: vec=3 [action=NET_RX]
irq/355-4848400-86 [001] 90.641106: softirq_entry: vec=3 [action=NET_RX]
irq/355-4848400-86 [001] 90.641132: sched_waking: comm=irq/354-4848400 pid=85 prio=49 target_cpu=000
irq/355-4848400-86 [001] 90.641138: sched_wakeup: irq/354-4848400:85 [49] success=1 CPU:000
irq/355-4848400-86 [001] 90.641159: softirq_exit: vec=3 [action=NET_RX]
irq/355-4848400-86 [001] 90.641185: sched_switch: irq/355-4848400:86 [49] S ==> iperf:1284 [120]
irq/354-4848400-85 [000] 90.641312: irq_handler_entry: irq=355 name=48484000.ethernet
irq/354-4848400-85 [000] 90.641314: irq_handler_exit: irq=355 ret=handled
irq/354-4848400-85 [000] 90.641317: sched_waking: comm=irq/355-4848400 pid=86 prio=49 target_cpu=001
irq/354-4848400-85 [000] 90.641329: sched_wakeup: irq/355-4848400:86 [49] success=1 CPU:001
iperf-1284 [001] 90.641336: sched_stat_runtime: comm=iperf pid=1284 runtime=149003 [ns] vruntime=2106323250 [ns]
iperf-1284 [001] 90.641347: sched_switch: iperf:1284 [120] R ==> irq/355-4848400:86 [49]
irq/355-4848400-86 [001] 90.641355: softirq_raise: vec=3 [action=NET_RX]
irq/355-4848400-86 [001] 90.641366: softirq_entry: vec=3 [action=NET_RX]
irq/355-4848400-86 [001] 90.641398: softirq_exit: vec=3 [action=NET_RX]
irq/355-4848400-86 [001] 90.641425: sched_switch: irq/355-4848400:86 [49] S ==> iperf:1284 [120]
irq/354-4848400-85 [000] 90.641526: softirq_exit: vec=3 [action=NET_RX]
irq/354-4848400-85 [000] 90.641526: irq_handler_entry: irq=355 name=48484000.ethernet
irq/354-4848400-85 [000] 90.641526: irq_handler_exit: irq=355 ret=handled
irq/354-4848400-85 [000] 90.641526: sched_waking: comm=irq/355-4848400 pid=86 prio=49 target_cpu=001
irq/354-4848400-85 [000] 90.641526: sched_wakeup: irq/355-4848400:86 [49] success=1 CPU:001
iperf-1284 [001] 90.641557: sched_stat_runtime: comm=iperf pid=1284 runtime=130946 [ns] vruntime=2106454196 [ns]
irq/354-4848400-85 [000] 90.641565: softirq_raise: vec=3 [action=NET_RX]
iperf-1284 [001] 90.641567: sched_switch: iperf:1284 [120] R ==> irq/355-4848400:86 [49]
irq/355-4848400-86 [001] 90.641575: softirq_raise: vec=3 [action=NET_RX]
irq/354-4848400-85 [000] 90.641576: irq_handler_entry: irq=354 name=48484000.ethernet
irq/354-4848400-85 [000] 90.641578: irq_handler_exit: irq=354 ret=handled
irq/355-4848400-86 [001] 90.641586: softirq_entry: vec=3 [action=NET_RX]
irq/354-4848400-85 [000] 90.641589: softirq_entry: vec=3 [action=NET_RX]
irq/355-4848400-86 [001] 90.641618: softirq_exit: vec=3 [action=NET_RX]
irq/355-4848400-86 [001] 90.641645: sched_switch: irq/355-4848400:86 [49] S ==> iperf:1284 [120]
irq/354-4848400-85 [000] 90.641760: irq_handler_entry: irq=355 name=48484000.ethernet
irq/354-4848400-85 [000] 90.641762: irq_handler_exit: irq=355 ret=handled
irq/354-4848400-85 [000] 90.641765: sched_waking: comm=irq/355-4848400 pid=86 prio=49 target_cpu=001
irq/354-4848400-85 [000] 90.641777: sched_wakeup: irq/355-4848400:86 [49] success=1 CPU:001
iperf-1284 [001] 90.641784: sched_stat_runtime: comm=iperf pid=1284 runtime=137941 [ns] vruntime=2106592137 [ns]
iperf-1284 [001] 90.641795: sched_switch: iperf:1284 [120] R ==> irq/355-4848400:86 [49]
irq/355-4848400-86 [001] 90.641802: softirq_raise: vec=3 [action=NET_RX]
irq/355-4848400-86 [001] 90.641814: softirq_entry: vec=3 [action=NET_RX]
irq/355-4848400-86 [001] 90.641845: softirq_exit: vec=3 [action=NET_RX]
irq/355-4848400-86 [001] 90.641871: sched_switch: irq/355-4848400:86 [49] S ==> iperf:1284 [120]
irq/354-4848400-85 [000] 90.641916: softirq_exit: vec=3 [action=NET_RX]
irq/354-4848400-85 [000] 90.641924: softirq_raise: vec=3 [action=NET_RX]
irq/354-4848400-85 [000] 90.641935: softirq_entry: vec=3 [action=NET_RX]
irq/354-4848400-85 [000] 90.641964: sched_waking: comm=iperf pid=1284 prio=120 target_cpu=001
irq/354-4848400-85 [000] 90.641969: sched_wakeup: iperf:1284 [120] success=1 CPU:001
irq/354-4848400-85 [000] 90.642006: irq_handler_entry: irq=355 name=48484000.ethernet
irq/354-4848400-85 [000] 90.642008: irq_handler_exit: irq=355 ret=handled
irq/354-4848400-85 [000] 90.642011: sched_waking: comm=irq/355-4848400 pid=86 prio=49 target_cpu=001
irq/354-4848400-85 [000] 90.642024: sched_wakeup: irq/355-4848400:86 [49] success=1 CPU:001
iperf-1284 [001] 90.642032: sched_stat_runtime: comm=iperf pid=1284 runtime=156486 [ns] vruntime=2106748623 [ns]
iperf-1284 [001] 90.642042: sched_switch: iperf:1284 [120] R ==> irq/355-4848400:86 [49]
irq/355-4848400-86 [001] 90.642049: softirq_raise: vec=3 [action=NET_RX]
irq/355-4848400-86 [001] 90.642060: softirq_entry: vec=3 [action=NET_RX]
irq/355-4848400-86 [001] 90.642093: softirq_exit: vec=3 [action=NET_RX]
irq/355-4848400-86 [001] 90.642120: sched_switch: irq/355-4848400:86 [49] S ==> iperf:1284 [120]
irq/354-4848400-85 [000] 90.642233: irq_handler_entry: irq=355 name=48484000.ethernet
irq/354-4848400-85 [000] 90.642235: irq_handler_exit: irq=355 ret=handled
irq/354-4848400-85 [000] 90.642238: sched_waking: comm=irq/355-4848400 pid=86 prio=49 target_cpu=001
irq/354-4848400-85 [000] 90.642250: sched_wakeup: irq/355-4848400:86 [49] success=1 CPU:001
iperf-1284 [001] 90.642257: sched_stat_runtime: comm=iperf pid=1284 runtime=135990 [ns] vruntime=2106884613 [ns]
iperf-1284 [001] 90.642268: sched_switch: iperf:1284 [120] R ==> irq/355-4848400:86 [49]
irq/355-4848400-86 [001] 90.642275: softirq_raise: vec=3 [action=NET_RX]
irq/355-4848400-86 [001] 90.642286: softirq_entry: vec=3 [action=NET_RX]
irq/355-4848400-86 [001] 90.642319: softirq_exit: vec=3 [action=NET_RX]
irq/355-4848400-86 [001] 90.642345: sched_switch: irq/355-4848400:86 [49] S ==> iperf:1284 [120]
irq/354-4848400-85 [000] 90.642393: softirq_exit: vec=3 [action=NET_RX]
irq/354-4848400-85 [000] 90.642419: sched_switch: irq/354-4848400:85 [49] S ==> rcuc/0:11 [98]
rcuc/0-11 [000] 90.642430: irq_handler_entry: irq=354 name=48484000.ethernet
rcuc/0-11 [000] 90.642432: irq_handler_exit: irq=354 ret=handled
rcuc/0-11 [000] 90.642435: sched_waking: comm=irq/354-4848400 pid=85 prio=49 target_cpu=000
rcuc/0-11 [000] 90.642442: sched_migrate_task: comm=irq/354-4848400 pid=85 prio=49 orig_cpu=0 dest_cpu=1
rcuc/0-11 [000] 90.642453: sched_wakeup: irq/354-4848400:85 [49] success=1 CPU:001
iperf-1284 [001] 90.642462: sched_stat_runtime: comm=iperf pid=1284 runtime=113053 [ns] vruntime=2106997666 [ns]
rcuc/0-11 [000] 90.642464: irq_handler_entry: irq=355 name=48484000.ethernet
rcuc/0-11 [000] 90.642466: irq_handler_exit: irq=355 ret=handled
rcuc/0-11 [000] 90.642469: sched_waking: comm=irq/355-4848400 pid=86 prio=49 target_cpu=001
iperf-1284 [001] 90.642473: sched_switch: iperf:1284 [120] R ==> irq/354-4848400:85 [49]
irq/354-4848400-85 [001] 90.642481: softirq_raise: vec=3 [action=NET_RX]
rcuc/0-11 [000] 90.642483: sched_wakeup: irq/355-4848400:86 [49] success=1 CPU:001
irq/354-4848400-85 [001] 90.642493: softirq_entry: vec=3 [action=NET_RX]
rcuc/0-11 [000] 90.642497: sched_migrate_task: comm=irq/355-4848400 pid=86 prio=49 orig_cpu=1 dest_cpu=0
rcuc/0-11 [000] 90.642515: irq_handler_entry: irq=354 name=48484000.ethernet
rcuc/0-11 [000] 90.642516: irq_handler_exit: irq=354 ret=handled
rcuc/0-11 [000] 90.642533: sched_switch: rcuc/0:11 [98] R ==> irq/355-4848400:86 [49]
irq/355-4848400-86 [000] 90.642541: softirq_raise: vec=3 [action=NET_RX]
irq/355-4848400-86 [000] 90.642551: softirq_entry: vec=3 [action=NET_RX]
irq/355-4848400-86 [000] 90.642562: sched_pi_setprio: comm=iperf pid=1284 oldprio=120 newprio=49
irq/355-4848400-86 [000] 90.642593: sched_switch: irq/355-4848400:86 [49] D ==> rcuc/0:11 [98]
rcuc/0-11 [000] 90.642621: sched_switch: rcuc/0:11 [98] S ==> ktimersoftd/0:4 [98]
ktimersoftd/0-4 [000] 90.642631: softirq_entry: vec=1 [action=TIMER]
ktimersoftd/0-4 [000] 90.642647: softirq_exit: vec=1 [action=TIMER]
ktimersoftd/0-4 [000] 90.642665: sched_switch: ktimersoftd/0:4 [98] S ==> ksoftirqd/0:3 [120]
ksoftirqd/0-3 [000] 90.642692: sched_stat_runtime: comm=ksoftirqd/0 pid=3 runtime=34486 [ns] vruntime=33313887612 [ns]
ksoftirqd/0-3 [000] 90.642703: sched_switch: ksoftirqd/0:3 [120] D ==> trace-cmd:1278 [120]
trace-cmd-1278 [000] 90.642742: sched_waking: comm=trace-cmd pid=1279 prio=120 target_cpu=001
trace-cmd-1278 [000] 90.642755: sched_wakeup: trace-cmd:1279 [120] success=1 CPU:001
irq/354-4848400-85 [001] 90.643012: softirq_exit: vec=3 [action=NET_RX]
trace-cmd-1278 [000] 90.643020: sched_stat_runtime: comm=trace-cmd pid=1278 runtime=328098 [ns] vruntime=554552351 [ns]
irq/354-4848400-85 [001] 90.643021: softirq_raise: vec=3 [action=NET_RX]
irq/354-4848400-85 [001] 90.643031: softirq_entry: vec=3 [action=NET_RX]
trace-cmd-1278 [000] 90.643036: sched_switch: trace-cmd:1278 [120] D ==> iperf:1282 [120]
iperf-1282 [000] 90.643055: sched_stat_runtime: comm=iperf pid=1282 runtime=34811 [ns] vruntime=560242796 [ns]
iperf-1282 [000] 90.643068: sched_switch: iperf:1282 [120] D ==> swapper/0:0 [120]
<idle>-0 [000] 90.643079: bputs: 0xc003e7d4s: 58f67c094e488
<idle>-0 [000] 90.643090: bprint: softirq_check_pending_idle: NOHZ: 80
--
regards,
-grygorii
iperf-1284 [001] 90.640177: sched_stat_runtime: comm=iperf pid=1284 runtime=163480 [ns] vruntime=2105636796 [ns]
iperf-1284 [001] 90.640187: sched_switch: iperf:1284 [120] R ==> irq/355-4848400:86 [49]
irq/355-4848400-86 [001] 90.640195: softirq_raise: vec=3 [action=NET_RX]
irq/355-4848400-86 [001] 90.640206: softirq_entry: vec=3 [action=NET_RX]
irq/355-4848400-86 [001] 90.640239: softirq_exit: vec=3 [action=NET_RX]
irq/355-4848400-86 [001] 90.640266: sched_switch: irq/355-4848400:86 [49] S ==> iperf:1284 [120]
irq/354-4848400-85 [000] 90.640376: irq_handler_entry: irq=355 name=48484000.ethernet
irq/354-4848400-85 [000] 90.640378: irq_handler_exit: irq=355 ret=handled
irq/354-4848400-85 [000] 90.640382: sched_waking: comm=irq/355-4848400 pid=86 prio=49 target_cpu=001
irq/354-4848400-85 [000] 90.640394: sched_wakeup: irq/355-4848400:86 [49] success=1 CPU:001
iperf-1284 [001] 90.640401: sched_stat_runtime: comm=iperf pid=1284 runtime=133387 [ns] vruntime=2105770183 [ns]
iperf-1284 [001] 90.640412: sched_switch: iperf:1284 [120] R ==> irq/355-4848400:86 [49]
irq/355-4848400-86 [001] 90.640419: softirq_raise: vec=3 [action=NET_RX]
irq/355-4848400-86 [001] 90.640430: softirq_entry: vec=3 [action=NET_RX]
irq/355-4848400-86 [001] 90.640463: softirq_exit: vec=3 [action=NET_RX]
irq/355-4848400-86 [001] 90.640489: sched_switch: irq/355-4848400:86 [49] S ==> iperf:1284 [120]
irq/354-4848400-85 [000] 90.640560: softirq_exit: vec=3 [action=NET_RX]
irq/354-4848400-85 [000] 90.640568: softirq_raise: vec=3 [action=NET_RX]
irq/354-4848400-85 [000] 90.640579: softirq_entry: vec=3 [action=NET_RX]
irq/354-4848400-85 [000] 90.640606: irq_handler_entry: irq=355 name=48484000.ethernet
irq/354-4848400-85 [000] 90.640608: irq_handler_exit: irq=355 ret=handled
irq/354-4848400-85 [000] 90.640611: sched_waking: comm=irq/355-4848400 pid=86 prio=49 target_cpu=001
irq/354-4848400-85 [000] 90.640623: sched_wakeup: irq/355-4848400:86 [49] success=1 CPU:001
iperf-1284 [001] 90.640630: sched_stat_runtime: comm=iperf pid=1284 runtime=139080 [ns] vruntime=2105909263 [ns]
iperf-1284 [001] 90.640641: sched_switch: iperf:1284 [120] R ==> irq/355-4848400:86 [49]
irq/355-4848400-86 [001] 90.640648: softirq_raise: vec=3 [action=NET_RX]
irq/355-4848400-86 [001] 90.640659: softirq_entry: vec=3 [action=NET_RX]
irq/355-4848400-86 [001] 90.640692: softirq_exit: vec=3 [action=NET_RX]
irq/355-4848400-86 [001] 90.640720: sched_switch: irq/355-4848400:86 [49] S ==> iperf:1284 [120]
irq/354-4848400-85 [000] 90.640806: softirq_exit: vec=3 [action=NET_RX]
irq/354-4848400-85 [000] 90.640820: irq_handler_entry: irq=354 name=48484000.ethernet
irq/354-4848400-85 [000] 90.640823: irq_handler_exit: irq=354 ret=handled
irq/354-4848400-85 [000] 90.640833: irq_handler_entry: irq=355 name=48484000.ethernet
irq/354-4848400-85 [000] 90.640835: irq_handler_exit: irq=355 ret=handled
irq/354-4848400-85 [000] 90.640838: sched_waking: comm=irq/355-4848400 pid=86 prio=49 target_cpu=001
irq/354-4848400-85 [000] 90.640849: sched_wakeup: irq/355-4848400:86 [49] success=1 CPU:001
iperf-1284 [001] 90.640857: sched_stat_runtime: comm=iperf pid=1284 runtime=135501 [ns] vruntime=2106044764 [ns]
irq/354-4848400-85 [000] 90.640861: softirq_raise: vec=3 [action=NET_RX]
iperf-1284 [001] 90.640867: sched_switch: iperf:1284 [120] R ==> irq/355-4848400:86 [49]
irq/354-4848400-85 [000] 90.640872: irq_handler_entry: irq=354 name=48484000.ethernet
irq/354-4848400-85 [000] 90.640874: irq_handler_exit: irq=354 ret=handled
irq/355-4848400-86 [001] 90.640875: softirq_raise: vec=3 [action=NET_RX]
irq/354-4848400-85 [000] 90.640885: softirq_entry: vec=3 [action=NET_RX]
irq/355-4848400-86 [001] 90.640886: softirq_entry: vec=3 [action=NET_RX]
irq/355-4848400-86 [001] 90.640919: softirq_exit: vec=3 [action=NET_RX]
irq/355-4848400-86 [001] 90.640945: sched_switch: irq/355-4848400:86 [49] S ==> iperf:1284 [120]
irq/354-4848400-85 [000] 90.641052: irq_handler_entry: irq=355 name=48484000.ethernet
irq/354-4848400-85 [000] 90.641054: irq_handler_exit: irq=355 ret=handled
irq/354-4848400-85 [000] 90.641058: sched_waking: comm=irq/355-4848400 pid=86 prio=49 target_cpu=001
irq/354-4848400-85 [000] 90.641070: sched_wakeup: irq/355-4848400:86 [49] success=1 CPU:001
iperf-1284 [001] 90.641077: sched_stat_runtime: comm=iperf pid=1284 runtime=129483 [ns] vruntime=2106174247 [ns]
iperf-1284 [001] 90.641087: sched_switch: iperf:1284 [120] R ==> irq/355-4848400:86 [49]
irq/355-4848400-86 [001] 90.641095: softirq_raise: vec=3 [action=NET_RX]
irq/355-4848400-86 [001] 90.641106: softirq_entry: vec=3 [action=NET_RX]
irq/355-4848400-86 [001] 90.641132: sched_waking: comm=irq/354-4848400 pid=85 prio=49 target_cpu=000
irq/355-4848400-86 [001] 90.641138: sched_wakeup: irq/354-4848400:85 [49] success=1 CPU:000
irq/355-4848400-86 [001] 90.641159: softirq_exit: vec=3 [action=NET_RX]
irq/355-4848400-86 [001] 90.641185: sched_switch: irq/355-4848400:86 [49] S ==> iperf:1284 [120]
irq/354-4848400-85 [000] 90.641312: irq_handler_entry: irq=355 name=48484000.ethernet
irq/354-4848400-85 [000] 90.641314: irq_handler_exit: irq=355 ret=handled
irq/354-4848400-85 [000] 90.641317: sched_waking: comm=irq/355-4848400 pid=86 prio=49 target_cpu=001
irq/354-4848400-85 [000] 90.641329: sched_wakeup: irq/355-4848400:86 [49] success=1 CPU:001
iperf-1284 [001] 90.641336: sched_stat_runtime: comm=iperf pid=1284 runtime=149003 [ns] vruntime=2106323250 [ns]
iperf-1284 [001] 90.641347: sched_switch: iperf:1284 [120] R ==> irq/355-4848400:86 [49]
irq/355-4848400-86 [001] 90.641355: softirq_raise: vec=3 [action=NET_RX]
irq/355-4848400-86 [001] 90.641366: softirq_entry: vec=3 [action=NET_RX]
irq/355-4848400-86 [001] 90.641398: softirq_exit: vec=3 [action=NET_RX]
irq/355-4848400-86 [001] 90.641425: sched_switch: irq/355-4848400:86 [49] S ==> iperf:1284 [120]
irq/354-4848400-85 [000] 90.641526: softirq_exit: vec=3 [action=NET_RX]
irq/354-4848400-85 [000] 90.641526: irq_handler_entry: irq=355 name=48484000.ethernet
irq/354-4848400-85 [000] 90.641526: irq_handler_exit: irq=355 ret=handled
irq/354-4848400-85 [000] 90.641526: sched_waking: comm=irq/355-4848400 pid=86 prio=49 target_cpu=001
irq/354-4848400-85 [000] 90.641526: sched_wakeup: irq/355-4848400:86 [49] success=1 CPU:001
iperf-1284 [001] 90.641557: sched_stat_runtime: comm=iperf pid=1284 runtime=130946 [ns] vruntime=2106454196 [ns]
irq/354-4848400-85 [000] 90.641565: softirq_raise: vec=3 [action=NET_RX]
iperf-1284 [001] 90.641567: sched_switch: iperf:1284 [120] R ==> irq/355-4848400:86 [49]
irq/355-4848400-86 [001] 90.641575: softirq_raise: vec=3 [action=NET_RX]
irq/354-4848400-85 [000] 90.641576: irq_handler_entry: irq=354 name=48484000.ethernet
irq/354-4848400-85 [000] 90.641578: irq_handler_exit: irq=354 ret=handled
irq/355-4848400-86 [001] 90.641586: softirq_entry: vec=3 [action=NET_RX]
irq/354-4848400-85 [000] 90.641589: softirq_entry: vec=3 [action=NET_RX]
irq/355-4848400-86 [001] 90.641618: softirq_exit: vec=3 [action=NET_RX]
irq/355-4848400-86 [001] 90.641645: sched_switch: irq/355-4848400:86 [49] S ==> iperf:1284 [120]
irq/354-4848400-85 [000] 90.641760: irq_handler_entry: irq=355 name=48484000.ethernet
irq/354-4848400-85 [000] 90.641762: irq_handler_exit: irq=355 ret=handled
irq/354-4848400-85 [000] 90.641765: sched_waking: comm=irq/355-4848400 pid=86 prio=49 target_cpu=001
irq/354-4848400-85 [000] 90.641777: sched_wakeup: irq/355-4848400:86 [49] success=1 CPU:001
iperf-1284 [001] 90.641784: sched_stat_runtime: comm=iperf pid=1284 runtime=137941 [ns] vruntime=2106592137 [ns]
iperf-1284 [001] 90.641795: sched_switch: iperf:1284 [120] R ==> irq/355-4848400:86 [49]
irq/355-4848400-86 [001] 90.641802: softirq_raise: vec=3 [action=NET_RX]
irq/355-4848400-86 [001] 90.641814: softirq_entry: vec=3 [action=NET_RX]
irq/355-4848400-86 [001] 90.641845: softirq_exit: vec=3 [action=NET_RX]
irq/355-4848400-86 [001] 90.641871: sched_switch: irq/355-4848400:86 [49] S ==> iperf:1284 [120]
irq/354-4848400-85 [000] 90.641916: softirq_exit: vec=3 [action=NET_RX]
irq/354-4848400-85 [000] 90.641924: softirq_raise: vec=3 [action=NET_RX]
irq/354-4848400-85 [000] 90.641935: softirq_entry: vec=3 [action=NET_RX]
irq/354-4848400-85 [000] 90.641964: sched_waking: comm=iperf pid=1284 prio=120 target_cpu=001
irq/354-4848400-85 [000] 90.641969: sched_wakeup: iperf:1284 [120] success=1 CPU:001
irq/354-4848400-85 [000] 90.642006: irq_handler_entry: irq=355 name=48484000.ethernet
irq/354-4848400-85 [000] 90.642008: irq_handler_exit: irq=355 ret=handled
irq/354-4848400-85 [000] 90.642011: sched_waking: comm=irq/355-4848400 pid=86 prio=49 target_cpu=001
irq/354-4848400-85 [000] 90.642024: sched_wakeup: irq/355-4848400:86 [49] success=1 CPU:001
iperf-1284 [001] 90.642032: sched_stat_runtime: comm=iperf pid=1284 runtime=156486 [ns] vruntime=2106748623 [ns]
iperf-1284 [001] 90.642042: sched_switch: iperf:1284 [120] R ==> irq/355-4848400:86 [49]
irq/355-4848400-86 [001] 90.642049: softirq_raise: vec=3 [action=NET_RX]
irq/355-4848400-86 [001] 90.642060: softirq_entry: vec=3 [action=NET_RX]
irq/355-4848400-86 [001] 90.642093: softirq_exit: vec=3 [action=NET_RX]
irq/355-4848400-86 [001] 90.642120: sched_switch: irq/355-4848400:86 [49] S ==> iperf:1284 [120]
irq/354-4848400-85 [000] 90.642233: irq_handler_entry: irq=355 name=48484000.ethernet
irq/354-4848400-85 [000] 90.642235: irq_handler_exit: irq=355 ret=handled
irq/354-4848400-85 [000] 90.642238: sched_waking: comm=irq/355-4848400 pid=86 prio=49 target_cpu=001
irq/354-4848400-85 [000] 90.642250: sched_wakeup: irq/355-4848400:86 [49] success=1 CPU:001
iperf-1284 [001] 90.642257: sched_stat_runtime: comm=iperf pid=1284 runtime=135990 [ns] vruntime=2106884613 [ns]
iperf-1284 [001] 90.642268: sched_switch: iperf:1284 [120] R ==> irq/355-4848400:86 [49]
irq/355-4848400-86 [001] 90.642275: softirq_raise: vec=3 [action=NET_RX]
irq/355-4848400-86 [001] 90.642286: softirq_entry: vec=3 [action=NET_RX]
irq/355-4848400-86 [001] 90.642319: softirq_exit: vec=3 [action=NET_RX]
irq/355-4848400-86 [001] 90.642345: sched_switch: irq/355-4848400:86 [49] S ==> iperf:1284 [120]
irq/354-4848400-85 [000] 90.642393: softirq_exit: vec=3 [action=NET_RX]
irq/354-4848400-85 [000] 90.642419: sched_switch: irq/354-4848400:85 [49] S ==> rcuc/0:11 [98]
rcuc/0-11 [000] 90.642430: irq_handler_entry: irq=354 name=48484000.ethernet
rcuc/0-11 [000] 90.642432: irq_handler_exit: irq=354 ret=handled
rcuc/0-11 [000] 90.642435: sched_waking: comm=irq/354-4848400 pid=85 prio=49 target_cpu=000
rcuc/0-11 [000] 90.642442: sched_migrate_task: comm=irq/354-4848400 pid=85 prio=49 orig_cpu=0 dest_cpu=1
rcuc/0-11 [000] 90.642453: sched_wakeup: irq/354-4848400:85 [49] success=1 CPU:001
iperf-1284 [001] 90.642462: sched_stat_runtime: comm=iperf pid=1284 runtime=113053 [ns] vruntime=2106997666 [ns]
rcuc/0-11 [000] 90.642464: irq_handler_entry: irq=355 name=48484000.ethernet
rcuc/0-11 [000] 90.642466: irq_handler_exit: irq=355 ret=handled
rcuc/0-11 [000] 90.642469: sched_waking: comm=irq/355-4848400 pid=86 prio=49 target_cpu=001
iperf-1284 [001] 90.642473: sched_switch: iperf:1284 [120] R ==> irq/354-4848400:85 [49]
irq/354-4848400-85 [001] 90.642481: softirq_raise: vec=3 [action=NET_RX]
rcuc/0-11 [000] 90.642483: sched_wakeup: irq/355-4848400:86 [49] success=1 CPU:001
irq/354-4848400-85 [001] 90.642493: softirq_entry: vec=3 [action=NET_RX]
rcuc/0-11 [000] 90.642497: sched_migrate_task: comm=irq/355-4848400 pid=86 prio=49 orig_cpu=1 dest_cpu=0
rcuc/0-11 [000] 90.642515: irq_handler_entry: irq=354 name=48484000.ethernet
rcuc/0-11 [000] 90.642516: irq_handler_exit: irq=354 ret=handled
rcuc/0-11 [000] 90.642533: sched_switch: rcuc/0:11 [98] R ==> irq/355-4848400:86 [49]
irq/355-4848400-86 [000] 90.642541: softirq_raise: vec=3 [action=NET_RX]
irq/355-4848400-86 [000] 90.642551: softirq_entry: vec=3 [action=NET_RX]
irq/355-4848400-86 [000] 90.642562: sched_pi_setprio: comm=iperf pid=1284 oldprio=120 newprio=49
irq/355-4848400-86 [000] 90.642593: sched_switch: irq/355-4848400:86 [49] D ==> rcuc/0:11 [98]
rcuc/0-11 [000] 90.642621: sched_switch: rcuc/0:11 [98] S ==> ktimersoftd/0:4 [98]
ktimersoftd/0-4 [000] 90.642631: softirq_entry: vec=1 [action=TIMER]
ktimersoftd/0-4 [000] 90.642647: softirq_exit: vec=1 [action=TIMER]
ktimersoftd/0-4 [000] 90.642665: sched_switch: ktimersoftd/0:4 [98] S ==> ksoftirqd/0:3 [120]
ksoftirqd/0-3 [000] 90.642692: sched_stat_runtime: comm=ksoftirqd/0 pid=3 runtime=34486 [ns] vruntime=33313887612 [ns]
ksoftirqd/0-3 [000] 90.642703: sched_switch: ksoftirqd/0:3 [120] D ==> trace-cmd:1278 [120]
trace-cmd-1278 [000] 90.642742: sched_waking: comm=trace-cmd pid=1279 prio=120 target_cpu=001
trace-cmd-1278 [000] 90.642755: sched_wakeup: trace-cmd:1279 [120] success=1 CPU:001
irq/354-4848400-85 [001] 90.643012: softirq_exit: vec=3 [action=NET_RX]
trace-cmd-1278 [000] 90.643020: sched_stat_runtime: comm=trace-cmd pid=1278 runtime=328098 [ns] vruntime=554552351 [ns]
irq/354-4848400-85 [001] 90.643021: softirq_raise: vec=3 [action=NET_RX]
irq/354-4848400-85 [001] 90.643031: softirq_entry: vec=3 [action=NET_RX]
trace-cmd-1278 [000] 90.643036: sched_switch: trace-cmd:1278 [120] D ==> iperf:1282 [120]
iperf-1282 [000] 90.643055: sched_stat_runtime: comm=iperf pid=1282 runtime=34811 [ns] vruntime=560242796 [ns]
iperf-1282 [000] 90.643068: sched_switch: iperf:1282 [120] D ==> swapper/0:0 [120]
<idle>-0 [000] 90.643079: bputs: 0xc003e7d4s: 58f67c094e488
<idle>-0 [000] 90.643090: bprint: softirq_check_pending_idle: NOHZ: 80
--
regards,
-grygorii
* Re: [4.4-RT PATCH RFC/RFT] drivers: net: cpsw: mark rx/tx irq as IRQF_NO_THREAD
2016-08-12 15:58 ` Grygorii Strashko
@ 2016-09-08 14:28 ` Sebastian Andrzej Siewior
2016-09-08 16:24 ` Grygorii Strashko
-1 siblings, 1 reply; 18+ messages in thread
From: Sebastian Andrzej Siewior @ 2016-09-08 14:28 UTC (permalink / raw)
To: Grygorii Strashko
Cc: Steven Rostedt, linux-omap, Alison Chaiken, linux-rt-users, netdev
On 2016-08-12 18:58:21 [+0300], Grygorii Strashko wrote:
> Hi Sebastian,
Hi Grygorii,
> Thanks for the comment. You're right:
> irq_thread()->irq_forced_thread_fn()->local_bh_enable()
>
> but wouldn't there be two wake_up_process() calls anyway,
> plus preempt_check_resched_rt() in napi_schedule().
Usually you prefer BH handling in the IRQ-thread because it runs at
higher priority and is not interrupted by a SCHED_OTHER process. And you
can assign it a higher priority if it should be preferred over another
interrupt. However, if the processing of the interrupt is taking too
much time (like that ping flood, a lot of network traffic) then we push
it to the softirq thread. If you do this now unconditionally in the
SCHED_OTHER softirq thread then you take away all the `good' things we
had (like processing important packets at higher priority as long as
nobody floods us). Plus you share this thread with everything else that
runs in there.
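For reference, the change being discussed boils down to passing IRQF_NO_THREAD when cpsw requests its Rx/Tx interrupts, so the genirq core does not force-thread them on -RT. A sketch only (kernel code, not runnable standalone; the field name `irqs_table` and the exact context of the real 4-line diff in cpsw_probe() are assumptions):

```c
/* drivers/net/ethernet/ti/cpsw.c -- sketch of the RFC patch:
 * the flags argument changes from 0 to IRQF_NO_THREAD, so the
 * handler runs in hard-irq context and napi_schedule() raises
 * NET_RX from irq_exit() instead of via a threaded handler. */
ret = devm_request_irq(&pdev->dev, priv->irqs_table[0],
		       cpsw_rx_interrupt, IRQF_NO_THREAD,
		       dev_name(&pdev->dev), priv);
```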
> >> And, as result, get benefits from the following improvements (tested
> >> on am57xx-evm):
> >>
> >> 1) "[ 78.348599] NOHZ: local_softirq_pending 80" message will not be
> >> seen any more. Now these warnings can be seen once iperf is started.
> >> # iperf -c $IPERFHOST -w 128K -d -t 60
> >
> > Do you also see "sched: RT throttling activated"? Because I don't see
> > otherwise why this should pop up.
>
> I've reverted my patch and did the requested experiments (some additional info below).
>
> I do not see "sched: RT throttling activated" :(
That is okay. However, if you aim for throughput you might want to switch
away from NO_HZ (and deactivate the software watchdog, which runs at
prio 99 if enabled).
> root@am57xx-evm:~# ./net_perf.sh & cyclictest -m -Sp98 -q -D4m
> [1] 1301
> # /dev/cpu_dma_latency set to 0us
> Linux am57xx-evm 4.4.16-rt23-00321-ga195e6a-dirty #92 SMP PREEMPT RT Fri Aug 12 14:03:59 EEST 2016 armv7l GNU/Linux
>
…
> [1]+ Done ./net_perf.sh
I can't parse this. But that local_softirq_pending() warning might
contribute to lower numbers.
> =============================================== before, no net load:
> cyclictest -m -Sp98 -q -D4m -i250 -d0
> # /dev/cpu_dma_latency set to 0us
> T: 0 ( 1288) P:98 I:250 C: 960000 Min: 8 Act: 9 Avg: 8 Max: 33
> T: 1 ( 1289) P:98 I:250 C: 959929 Min: 7 Act: 11 Avg: 9 Max: 26
> =============================================== after, no net load:
> cyclictest -m -Sp98 -q -D4m -i250 -d0
> T: 0 ( 1301) P:98 I:250 C: 960000 Min: 7 Act: 9 Avg: 8 Max: 22
> T: 1 ( 1302) P:98 I:250 C: 959914 Min: 7 Act: 11 Avg: 8 Max: 28
I think those two should be more or less equal, since the change should
have no impact on "no net load", or do I miss something?
> =============================================== before, with net load:
> cyclictest -m -Sp98 -q -D4m -i250 -d0
> T: 0 ( 1400) P:98 I:250 C: 960000 Min: 8 Act: 25 Avg: 18 Max: 83
> T: 1 ( 1401) P:98 I:250 C: 959801 Min: 7 Act: 27 Avg: 17 Max: 48
>
>
> =============================================== after, with net load:
> cyclictest -m -Sp98 -q -D4m -i250 -d0
> T: 0 ( 1358) P:98 I:250 C: 960000 Min: 8 Act: 11 Avg: 14 Max: 42
> T: 1 ( 1359) P:98 I:250 C: 959743 Min: 7 Act: 18 Avg: 15 Max: 36
So the max value dropped by ~50% with your patch. Interesting. What I
remember from testing is that once you had, say, one hour of hackbench
running, then after that the extra network traffic didn't contribute
much (if at all) to the max value.
That said, it is hard to believe that one extra context switch
contributes about 40us to the max value on CPU0.
> > What happens if s/__raise_softirq_irqoff_ksoft/__raise_softirq_irqoff/
> > in net/core/dev.c and chrt the priority of your network interrupt
> > handlers to SCHED_OTHER priority?
>
> ===== without this patch + __raise_softirq_irqoff + netIRQs->SCHED_OTHER
>
> with net load:
> cyclictest -m -Sp98 -q -D4m -i250 -d0
> T: 0 ( 1325) P:98 I:1000 C: 240000 Min: 8 Act: 22 Avg: 17 Max: 51
> T: 1 ( 1326) P:98 I:1500 C: 159981 Min: 8 Act: 15 Avg: 15 Max: 39
>
> cyclictest -m -Sp98 -q -D4m -i250 -d0
> T: 0 ( 1307) P:98 I:250 C: 960000 Min: 7 Act: 13 Avg: 16 Max: 50
> T: 1 ( 1308) P:98 I:250 C: 959819 Min: 8 Act: 12 Avg: 14 Max: 37
>
> and network performance is better:
> root@am57xx-evm:~# ps -A | grep 4848
> 82 ? 00:00:00 irq/354-4848400
> 83 ? 00:00:00 irq/355-4848400
> root@am57xx-evm:~# chrt -o -p 0 82
> root@am57xx-evm:~# chrt -o -p 0 83
> ./net_perf.sh & cyclictest -m -Sp98 -q -D4m -i250 -d0
> [1] 1298
> # /dev/cpu_dma_latency set to 0us
> Linux am57xx-evm 4.4.16-rt23-00321-ga195e6a-dirty #95 SMP PREEMPT RT Fri Aug 12 16:20:42 EEST 2016 armv7l GNU/Linux
So that looks nice, doesn't it?
Sebastian
* Re: [4.4-RT PATCH RFC/RFT] drivers: net: cpsw: mark rx/tx irq as IRQF_NO_THREAD
2016-08-19 14:29 ` Grygorii Strashko
@ 2016-09-08 16:00 ` Sebastian Andrzej Siewior
2016-09-09 12:46 ` Grygorii Strashko
-1 siblings, 1 reply; 18+ messages in thread
From: Sebastian Andrzej Siewior @ 2016-09-08 16:00 UTC (permalink / raw)
To: Grygorii Strashko
Cc: Steven Rostedt, linux-omap, Alison Chaiken, linux-rt-users, netdev
On 2016-08-19 17:29:16 [+0300], Grygorii Strashko wrote:
> I've collected a trace before the first occurrence of "NOHZ: local_softirq_pending 80"
>
> irq/354-4848400-85 [000] 90.639460: irq_handler_entry: irq=19 name=arch_timer
> iperf-1284 [001] 90.639474: softirq_raise: vec=1 [action=TIMER]
> iperf-1284 [001] 90.639486: irq_handler_exit: irq=19 ret=handled
> irq/354-4848400-85 [000] 90.639488: softirq_raise: vec=7 [action=SCHED]
here we raise the SCHED softirq
> iperf-1284 [001] 90.639490: sched_waking: comm=ksoftirqd/1 pid=21 prio=120 target_cpu=001
> irq/354-4848400-85 [000] 90.639492: softirq_raise: vec=1 [action=TIMER]
and a timer
> iperf-1284 [001] 90.639499: sched_wakeup: ksoftirqd/1:21 [120] success=1 CPU:001
> iperf-1284 [001] 90.639504: sched_waking: comm=ktimersoftd/1 pid=20 prio=98 target_cpu=001
> irq/354-4848400-85 [000] 90.639505: irq_handler_exit: irq=19 ret=handled
okay. so softirq here I come
> iperf-1284 [001] 90.639512: sched_wakeup: ktimersoftd/1:20 [98] success=1 CPU:001
> iperf-1284 [001] 90.639526: sched_stat_runtime: comm=iperf pid=1284 runtime=50752 [ns] vruntime=2105322850 [ns]
> iperf-1284 [001] 90.639537: sched_switch: iperf:1284 [120] R ==> irq/355-4848400:86 [49]
> irq/355-4848400-86 [001] 90.639545: softirq_raise: vec=3 [action=NET_RX]
> irq/355-4848400-86 [001] 90.639556: softirq_entry: vec=3 [action=NET_RX]
> irq/355-4848400-86 [001] 90.639589: softirq_exit: vec=3 [action=NET_RX]
> irq/355-4848400-86 [001] 90.639614: sched_switch: irq/355-4848400:86 [49] S ==> ktimersoftd/1:20 [98]
> ktimersoftd/1-20 [001] 90.639625: softirq_entry: vec=1 [action=TIMER]
> ktimersoftd/1-20 [001] 90.639637: sched_waking: comm=rcu_preempt pid=8 prio=98 target_cpu=001
> ktimersoftd/1-20 [001] 90.639646: sched_wakeup: rcu_preempt:8 [98] success=1 CPU:001
> ktimersoftd/1-20 [001] 90.639663: softirq_exit: vec=1 [action=TIMER]
> ktimersoftd/1-20 [001] 90.639679: sched_switch: ktimersoftd/1:20 [98] S ==> rcu_preempt:8 [98]
> rcu_preempt-8 [001] 90.639722: sched_switch: rcu_preempt:8 [98] S ==> ksoftirqd/1:21 [120]
> ksoftirqd/1-21 [001] 90.639740: sched_stat_runtime: comm=ksoftirqd/1 pid=21 runtime=25539 [ns] vruntime=29960463828 [ns]
> ksoftirqd/1-21 [001] 90.639750: sched_switch: ksoftirqd/1:21 [120] S ==> iperf:1284 [120]
> irq/354-4848400-85 [000] 90.639878: irq_handler_entry: irq=355 name=48484000.ethernet
wait, no
[ lots NET_RX wake ups ]
> irq/354-4848400-85 [000] 90.640560: softirq_exit: vec=3 [action=NET_RX]
ach. NET_RX
> irq/354-4848400-85 [000] 90.640568: softirq_raise: vec=3 [action=NET_RX]
> irq/354-4848400-85 [000] 90.640579: softirq_entry: vec=3 [action=NET_RX]
> irq/354-4848400-85 [000] 90.640606: irq_handler_entry: irq=355 name=48484000.ethernet
> irq/354-4848400-85 [000] 90.640608: irq_handler_exit: irq=355 ret=handled
[ more of those ]
> irq/354-4848400-85 [000] 90.642393: softirq_exit: vec=3 [action=NET_RX]
> irq/354-4848400-85 [000] 90.642419: sched_switch: irq/354-4848400:85 [49] S ==> rcuc/0:11 [98]
We don't serve TIMER & SCHED because those two are pushed to the
ksoftirq thread(s). So we keep mostly doing NET_RX and now we switch to
the next best thing which is RCU.
> rcuc/0-11 [000] 90.642430: irq_handler_entry: irq=354 name=48484000.ethernet
but not for long.
> rcuc/0-11 [000] 90.642432: irq_handler_exit: irq=354 ret=handled
> rcuc/0-11 [000] 90.642435: sched_waking: comm=irq/354-4848400 pid=85 prio=49 target_cpu=000
> rcuc/0-11 [000] 90.642442: sched_migrate_task: comm=irq/354-4848400 pid=85 prio=49 orig_cpu=0 dest_cpu=1
> rcuc/0-11 [000] 90.642453: sched_wakeup: irq/354-4848400:85 [49] success=1 CPU:001
> iperf-1284 [001] 90.642462: sched_stat_runtime: comm=iperf pid=1284 runtime=113053 [ns] vruntime=2106997666 [ns]
> rcuc/0-11 [000] 90.642464: irq_handler_entry: irq=355 name=48484000.ethernet
> rcuc/0-11 [000] 90.642466: irq_handler_exit: irq=355 ret=handled
> rcuc/0-11 [000] 90.642469: sched_waking: comm=irq/355-4848400 pid=86 prio=49 target_cpu=001
> iperf-1284 [001] 90.642473: sched_switch: iperf:1284 [120] R ==> irq/354-4848400:85 [49]
> irq/354-4848400-85 [001] 90.642481: softirq_raise: vec=3 [action=NET_RX]
> rcuc/0-11 [000] 90.642483: sched_wakeup: irq/355-4848400:86 [49] success=1 CPU:001
> irq/354-4848400-85 [001] 90.642493: softirq_entry: vec=3 [action=NET_RX]
> rcuc/0-11 [000] 90.642497: sched_migrate_task: comm=irq/355-4848400 pid=86 prio=49 orig_cpu=1 dest_cpu=0
ach, that IRQ thread is not pinned. Good. We migrate.
> rcuc/0-11 [000] 90.642515: irq_handler_entry: irq=354 name=48484000.ethernet
> rcuc/0-11 [000] 90.642516: irq_handler_exit: irq=354 ret=handled
> rcuc/0-11 [000] 90.642533: sched_switch: rcuc/0:11 [98] R ==> irq/355-4848400:86 [49]
> irq/355-4848400-86 [000] 90.642541: softirq_raise: vec=3 [action=NET_RX]
> irq/355-4848400-86 [000] 90.642551: softirq_entry: vec=3 [action=NET_RX]
> irq/355-4848400-86 [000] 90.642562: sched_pi_setprio: comm=iperf pid=1284 oldprio=120 newprio=49
> irq/355-4848400-86 [000] 90.642593: sched_switch: irq/355-4848400:86 [49] D ==> rcuc/0:11 [98]
> rcuc/0-11 [000] 90.642621: sched_switch: rcuc/0:11 [98] S ==> ktimersoftd/0:4 [98]
> ktimersoftd/0-4 [000] 90.642631: softirq_entry: vec=1 [action=TIMER]
we did wait quite some time for that one.
> ktimersoftd/0-4 [000] 90.642647: softirq_exit: vec=1 [action=TIMER]
> ktimersoftd/0-4 [000] 90.642665: sched_switch: ktimersoftd/0:4 [98] S ==> ksoftirqd/0:3 [120]
switch back to ksoftirqd, where the SCHED softirq might run.
> ksoftirqd/0-3 [000] 90.642692: sched_stat_runtime: comm=ksoftirqd/0 pid=3 runtime=34486 [ns] vruntime=33313887612 [ns]
> ksoftirqd/0-3 [000] 90.642703: sched_switch: ksoftirqd/0:3 [120] D ==> trace-cmd:1278 [120]
but we blocked on something in the softirq and go back to trace-cmd
> trace-cmd-1278 [000] 90.642742: sched_waking: comm=trace-cmd pid=1279 prio=120 target_cpu=001
> trace-cmd-1278 [000] 90.642755: sched_wakeup: trace-cmd:1279 [120] success=1 CPU:001
> irq/354-4848400-85 [001] 90.643012: softirq_exit: vec=3 [action=NET_RX]
> trace-cmd-1278 [000] 90.643020: sched_stat_runtime: comm=trace-cmd pid=1278 runtime=328098 [ns] vruntime=554552351 [ns]
> irq/354-4848400-85 [001] 90.643021: softirq_raise: vec=3 [action=NET_RX]
> irq/354-4848400-85 [001] 90.643031: softirq_entry: vec=3 [action=NET_RX]
> trace-cmd-1278 [000] 90.643036: sched_switch: trace-cmd:1278 [120] D ==> iperf:1282 [120]
trace-cmd is going, back to iperf.
> iperf-1282 [000] 90.643055: sched_stat_runtime: comm=iperf pid=1282 runtime=34811 [ns] vruntime=560242796 [ns]
> iperf-1282 [000] 90.643068: sched_switch: iperf:1282 [120] D ==> swapper/0:0 [120]
iperf is blocked on something, too.
> <idle>-0 [000] 90.643079: bputs: 0xc003e7d4s: 58f67c094e488
> <idle>-0 [000] 90.643090: bprint: softirq_check_pending_idle: NOHZ: 80
As you see, ksoftirqd left the CPU in the D state, so I would assume it is
blocked on a lock and waiting.
NET_RX is in progress but scheduled out due to rcuc, which is also
scheduled out.
I assume we got to the softirq thread because nothing else can run. It will
see that NET_RX is pending and try it, but block on the lock
(lock_softirq()). It schedules out. Nothing left -> idle.
The idle code checks whether a softirq is pending, and in fact SCHED is
on the list; ksoftirqd was about to handle it, but due to an
ordering complication (NET_RX before SCHED) it can't. And we have the
warning.
This
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -105,6 +105,7 @@ void softirq_check_pending_idle(void)
 {
 	static int rate_limit;
 	struct softirq_runner *sr = this_cpu_ptr(&softirq_runners);
+	struct task_struct *ksoft_tsk = __this_cpu_read(ksoftirqd);
 	u32 warnpending;
 	int i;
@@ -112,7 +113,7 @@ void softirq_check_pending_idle(void)
 		return;
 	warnpending = local_softirq_pending() & SOFTIRQ_STOP_IDLE_MASK;
-	for (i = 0; i < NR_SOFTIRQS; i++) {
+	for (i = 0; (i < NR_SOFTIRQS) && warnpending; i++) {
 		struct task_struct *tsk = sr->runner[i];
 		/*
@@ -132,6 +133,15 @@ void softirq_check_pending_idle(void)
 		}
 	}
+	if (warnpending && ksoft_tsk) {
+		raw_spin_lock(&ksoft_tsk->pi_lock);
+		if (ksoft_tsk->pi_blocked_on || ksoft_tsk->state == TASK_RUNNING) {
+			/* Clear all bits pending in that task */
+			warnpending &= ~(ksoft_tsk->softirqs_raised);
+		}
+		raw_spin_unlock(&ksoft_tsk->pi_lock);
+	}
+
 	if (warnpending) {
 		printk(KERN_ERR "NOHZ: local_softirq_pending %02x\n",
 		       warnpending);
should avoid the warning if the softirq thread is blocked.
However:
- RCU boosting with an RT prio might help here or there.
- Having the hard IRQ and the IRQ thread on the same CPU might help, too. It
is not strictly required, but it saves a few cycles if you don't have to
perform cross-CPU wake-ups and migrate the task back and forth. The latter
happens at prio 99.
- I am not sure NAPI works as expected; I would assume so. There are IRQs
354 and 355, which fire after each other. One would be enough, I guess.
And they seem to be short-lived / fire often. If NAPI works, then it
should put an end to it and push the work to the softirq thread.
If you have IRQ-pacing support, I suggest using something like 10ms or
so. That means your ping response will go from <= 1ms to 10ms in the
worst case, but since you process more packets at a time your
throughput should increase.
If I count this correctly, it took you almost 4ms from "raise SCHED" to
"try to process SCHED", and most of the time was spent in the 354/355 hard
IRQs, raising NET_RX, or cross-waking the IRQ thread.
Sebastian
* Re: [4.4-RT PATCH RFC/RFT] drivers: net: cpsw: mark rx/tx irq as IRQF_NO_THREAD
2016-09-08 14:28 ` Sebastian Andrzej Siewior
@ 2016-09-08 16:24 ` Grygorii Strashko
0 siblings, 0 replies; 18+ messages in thread
From: Grygorii Strashko @ 2016-09-08 16:24 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: Steven Rostedt, linux-omap, Alison Chaiken, linux-rt-users, netdev
On 09/08/2016 05:28 PM, Sebastian Andrzej Siewior wrote:
> On 2016-08-12 18:58:21 [+0300], Grygorii Strashko wrote:
>> Hi Sebastian,
> Hi Grygorii,
>
>> Thanks for the comment. You're right:
>> irq_thread()->irq_forced_thread_fn()->local_bh_enable()
>>
>> but wouldn't there be two wake_up_process() calls anyway,
>> plus preempt_check_resched_rt() in napi_schedule().
>
> Usually you prefer BH handling in the IRQ-thread because it runs at
> higher priority and is not interrupted by a SCHED_OTHER process. And you
> can assign it a higher priority if it should be preferred over another
> interrupt. However, if the processing of the interrupt is taking too
> much time (like that ping flood, a lot of network traffic) then we push
> it to the softirq thread. If you do this now unconditionally in the
> SCHED_OTHER softirq thread then you take away all the `good' things we
> had (like processing important packets at higher priority as long as
> nobody floods us). Plus you share this thread with everything else that
> runs in there.
That I understand, but the effect of this patch on network throughput
is pretty amazing :)
>
>>>> And, as result, get benefits from the following improvements (tested
>>>> on am57xx-evm):
>>>>
>>>> 1) "[ 78.348599] NOHZ: local_softirq_pending 80" message will not be
>>>> seen any more. Now these warnings can be seen once iperf is started.
>>>> # iperf -c $IPERFHOST -w 128K -d -t 60
>>>
>>> Do you also see "sched: RT throttling activated"? Because I don't see
>>> otherwise why this should pop up.
>>
>> I've reverted my patch and did the requested experiments (some additional info below).
>>
>> I do not see "sched: RT throttling activated" :(
>
> That is okay. However if you aim for throughput you might want to switch
> away from NO_HZ (and deactivate the software watchdog which runs at
> prio 99 if enabled).
>
>> root@am57xx-evm:~# ./net_perf.sh & cyclictest -m -Sp98 -q -D4m
>> [1] 1301
>> # /dev/cpu_dma_latency set to 0us
>> Linux am57xx-evm 4.4.16-rt23-00321-ga195e6a-dirty #92 SMP PREEMPT RT Fri Aug 12 14:03:59 EEST 2016 armv7l GNU/Linux
>>
> …
>> [1]+ Done ./net_perf.sh
>
> I can't parse this. But that local_softirq_pending() warning might
> contribute to lower numbers.
>
>> =============================================== before, no net load:
>> cyclictest -m -Sp98 -q -D4m -i250 -d0
>> # /dev/cpu_dma_latency set to 0us
>> T: 0 ( 1288) P:98 I:250 C: 960000 Min: 8 Act: 9 Avg: 8 Max: 33
>> T: 1 ( 1289) P:98 I:250 C: 959929 Min: 7 Act: 11 Avg: 9 Max: 26
>
>> =============================================== after, no net load:
>> cyclictest -m -Sp98 -q -D4m -i250 -d0
>> T: 0 ( 1301) P:98 I:250 C: 960000 Min: 7 Act: 9 Avg: 8 Max: 22
>> T: 1 ( 1302) P:98 I:250 C: 959914 Min: 7 Act: 11 Avg: 8 Max: 28
>
> I think those two should be equal more or less since the change should
> have no impact on "no net load" or do I miss something?
Correct, I see no differences in this case, as per above.
>
>> =============================================== before, with net load:
>> cyclictest -m -Sp98 -q -D4m -i250 -d0
>> T: 0 ( 1400) P:98 I:250 C: 960000 Min: 8 Act: 25 Avg: 18 Max: 83
>> T: 1 ( 1401) P:98 I:250 C: 959801 Min: 7 Act: 27 Avg: 17 Max: 48
>>
>>
>> =============================================== after, with net load:
>> cyclictest -m -Sp98 -q -D4m -i250 -d0
>> T: 0 ( 1358) P:98 I:250 C: 960000 Min: 8 Act: 11 Avg: 14 Max: 42
>> T: 1 ( 1359) P:98 I:250 C: 959743 Min: 7 Act: 18 Avg: 15 Max: 36
>
> So the max value dropped by ~50% with your patch. Interesting. What I
> remember from testing is that once you had, say, one hour of hackbench
> running then after that, the extra network traffic didn't contribute
> much (if at all) to the max value.
> That said it is hard to believe that one extra context switch
> contributes about 40us to the max value on CPU0.
Yup, but short-time testing provides very stable results.
This patch is going to be tested more intensively shortly.
>
>>> What happens if s/__raise_softirq_irqoff_ksoft/__raise_softirq_irqoff/
>>> in net/core/dev.c and chrt the priority of your network interrupt
>>> handlers to SCHED_OTHER priority?
>>
>> ===== without this patch + __raise_softirq_irqoff + netIRQs->SCHED_OTHER
>>
>> with net load:
>> cyclictest -m -Sp98 -q -D4m -i250 -d0
>> T: 0 ( 1325) P:98 I:1000 C: 240000 Min: 8 Act: 22 Avg: 17 Max: 51
>> T: 1 ( 1326) P:98 I:1500 C: 159981 Min: 8 Act: 15 Avg: 15 Max: 39
>>
>> cyclictest -m -Sp98 -q -D4m -i250 -d0
>> T: 0 ( 1307) P:98 I:250 C: 960000 Min: 7 Act: 13 Avg: 16 Max: 50
>> T: 1 ( 1308) P:98 I:250 C: 959819 Min: 8 Act: 12 Avg: 14 Max: 37
>>
>
> So that looks nice, doesn't it?
Yeah, this is an improvement in general.
But the fact that such a significant net performance drop is observed out of the box
(without any tuning) and on an idle system triggers a lot of questions ;(
I worry that the originally observed behavior may depend on the use of NAPI polling for both RX and TX
in the CPSW driver. CPSW requests two IRQs, RX and TX, and both handlers just do napi_schedule() [NET_RX].
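The handler pattern described above can be sketched as follows (non-runnable kernel-style pseudocode loosely modelled on drivers/net/ethernet/ti/cpsw.c; the field and helper names are approximations, not the exact driver source):

```c
/* Both hard-IRQ handlers do nothing but acknowledge the hardware and
 * schedule NAPI, so all packet processing lands in NET_RX softirq
 * context -- which on -RT means an extra thread wakeup per interrupt. */
static irqreturn_t cpsw_rx_interrupt(int irq, void *dev_id)
{
	struct cpsw_priv *priv = dev_id;

	writel(0, &priv->wr_regs->rx_en);        /* mask further RX interrupts */
	cpdma_ctlr_eoi(priv->dma, CPDMA_EOI_RX); /* end-of-interrupt to CPDMA */
	napi_schedule(&priv->napi_rx);           /* raises NET_RX */
	return IRQ_HANDLED;
}

/* The TX handler is symmetric and also ends in napi_schedule(). */
```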
--
regards,
-grygorii
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [4.4-RT PATCH RFC/RFT] drivers: net: cpsw: mark rx/tx irq as IRQF_NO_THREAD
2016-09-08 16:00 ` Sebastian Andrzej Siewior
@ 2016-09-09 12:46 ` Grygorii Strashko
0 siblings, 0 replies; 18+ messages in thread
From: Grygorii Strashko @ 2016-09-09 12:46 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: Steven Rostedt, linux-omap, Alison Chaiken, linux-rt-users, netdev
On 09/08/2016 07:00 PM, Sebastian Andrzej Siewior wrote:
> On 2016-08-19 17:29:16 [+0300], Grygorii Strashko wrote:
>> I've collected trace before first occurrence of "NOHZ: local_softirq_pending 80"
>>
>
>> irq/354-4848400-85 [000] 90.642393: softirq_exit: vec=3 [action=NET_RX]
>> irq/354-4848400-85 [000] 90.642419: sched_switch: irq/354-4848400:85 [49] S ==> rcuc/0:11 [98]
>
> We don't serve TIMER & SCHED because those two are pushed to the
> ksoftirq thread(s). So we keep mostly doing NET_RX and now we switch to
> the next best thing which is RCU.
>
>> rcuc/0-11 [000] 90.642430: irq_handler_entry: irq=354 name=48484000.ethernet
> but not for long.
>
>> rcuc/0-11 [000] 90.642432: irq_handler_exit: irq=354 ret=handled
>> rcuc/0-11 [000] 90.642435: sched_waking: comm=irq/354-4848400 pid=85 prio=49 target_cpu=000
>> rcuc/0-11 [000] 90.642442: sched_migrate_task: comm=irq/354-4848400 pid=85 prio=49 orig_cpu=0 dest_cpu=1
>> rcuc/0-11 [000] 90.642453: sched_wakeup: irq/354-4848400:85 [49] success=1 CPU:001
>> iperf-1284 [001] 90.642462: sched_stat_runtime: comm=iperf pid=1284 runtime=113053 [ns] vruntime=2106997666 [ns]
>> rcuc/0-11 [000] 90.642464: irq_handler_entry: irq=355 name=48484000.ethernet
>> rcuc/0-11 [000] 90.642466: irq_handler_exit: irq=355 ret=handled
>> rcuc/0-11 [000] 90.642469: sched_waking: comm=irq/355-4848400 pid=86 prio=49 target_cpu=001
>> iperf-1284 [001] 90.642473: sched_switch: iperf:1284 [120] R ==> irq/354-4848400:85 [49]
>> irq/354-4848400-85 [001] 90.642481: softirq_raise: vec=3 [action=NET_RX]
>> rcuc/0-11 [000] 90.642483: sched_wakeup: irq/355-4848400:86 [49] success=1 CPU:001
>> irq/354-4848400-85 [001] 90.642493: softirq_entry: vec=3 [action=NET_RX]
>> rcuc/0-11 [000] 90.642497: sched_migrate_task: comm=irq/355-4848400 pid=86 prio=49 orig_cpu=1 dest_cpu=0
> Ah, that IRQ thread is not pinned. Good. We migrate.
>
It looks like the scheduler is playing ping-pong between CPUs with the threaded irqs irq/354-355.
And it seems this might be the case: if I pin both threaded IRQ handlers to CPU0
I can see better latency and a netperf improvement:
cyclictest -m -Sp98 -q -D4m
T: 0 ( 1318) P:98 I:1000 C: 240000 Min: 9 Act: 14 Avg: 15 Max: 42
T: 1 ( 1319) P:98 I:1500 C: 159909 Min: 9 Act: 14 Avg: 16 Max: 39
If I arrange the hwirqs and pin both threaded IRQ handlers on CPU1
I can observe more or less similar results as with this patch.
>> rcuc/0-11 [000] 90.642515: irq_handler_entry: irq=354 name=48484000.ethernet
>> rcuc/0-11 [000] 90.642516: irq_handler_exit: irq=354 ret=handled
>
> As you see ksoftirqd left the CPU with a D so I would assume it is
> blocked on a lock and waits.
> NET_RX is in progress but scheduled out due to RCUC which is also
> scheduled out.
>
> I assume we got to softirq because nothing else can run. It will see
> that NET_RX is pending and tries it but blocks on the lock
> (lock_softirq()). It schedules out. Nothing left -> idle.
>
> The idle code checks to see if a softirq is pending and in fact there is
> SCHED on the list and ksoftirq was about to handle it but due to
> ordering complication (NET_RX before SCHED) it can't. And we have the
> warning.
>
> This
>
> --- a/kernel/softirq.c
> +++ b/kernel/softirq.c
> @@ -105,6 +105,7 @@ void softirq_check_pending_idle(void)
> {
> static int rate_limit;
> struct softirq_runner *sr = this_cpu_ptr(&softirq_runners);
> + struct task_struct *ksoft_tsk = __this_cpu_read(ksoftirqd);
> u32 warnpending;
> int i;
>
> @@ -112,7 +113,7 @@ void softirq_check_pending_idle(void)
> return;
>
> warnpending = local_softirq_pending() & SOFTIRQ_STOP_IDLE_MASK;
> - for (i = 0; i < NR_SOFTIRQS; i++) {
> + for (i = 0; (i < NR_SOFTIRQS) && warnpending; i++) {
> struct task_struct *tsk = sr->runner[i];
>
> /*
> @@ -132,6 +133,15 @@ void softirq_check_pending_idle(void)
> }
> }
>
> + if (warnpending && ksoft_tsk) {
> + raw_spin_lock(&ksoft_tsk->pi_lock);
> + if (ksoft_tsk->pi_blocked_on || ksoft_tsk->state == TASK_RUNNING) {
> + /* Clear all bits pending in that task */
> + warnpending &= ~(ksoft_tsk->softirqs_raised);
> + }
> + raw_spin_unlock(&ksoft_tsk->pi_lock);
> + }
> +
> if (warnpending) {
> printk(KERN_ERR "NOHZ: local_softirq_pending %02x\n",
> warnpending);
>
>
> should avoid the warning if the softirq thread is blocked.
With this change I do not see "NOHZ: local_softirq_pending 80" any more.
Tested-by: Grygorii Strashko <grygorii.strashko@ti.com>
>
> However:
> - RCU boosting with RT prio might help here or there.
That I'll try. Current settings:
CONFIG_RCU_BOOST=y
CONFIG_RCU_KTHREAD_PRIO=1
CONFIG_RCU_BOOST_DELAY=500
> - having the hard-IRQ and IRQ-thread on the same CPU might help, too. It
> is not strictly required but saves a few cycles if you don't have to
> perform cross CPU wake ups and migrate task forth and back. The latter
> happens at prio 99.
I've experimented with this and it improves netperf, and I also followed the instructions from [1].
But it seems I messed up pinning the threaded IRQs to a CPU.
[1] https://www.osadl.org/Real-time-Ethernet-UDP-worst-case-roun.qa-farm-rt-ethernet-udp-monitor.0.html
> - I am not sure NAPI works as expected. I would assume so. There is IRQ
> 354 and 355 which fire after each other. One would be enough I guess.
> And they seem to be short living / fire often. If NAPI works then it
> should put an end to it and push it to the softirq thread.
> If you have IRQ-pacing support I suggest to use something like 10ms or
> so. That means your ping response will go from <= 1ms to 10ms in the
> worst case but since you process more packets at a time your
> throughput should increase.
> If I count this correctly, it took you almost 4ms from "raise SCHED" to
> "try process SCHED", and most of the time was spent in the 35[45] hard irq,
> raising NET_RX, or cross-waking the IRQ thread.
The question I have to deal with is why switching to RT causes such a significant
netperf drop (without additional tuning) compared to vanilla: ~120% for 256K and ~200% for 128K windows.
It's of course expected to see a netperf drop, but I assumed not so significant :(
And I can't find any reports or statistics related to this. Does the same happen on x86?
Thanks a lot for your comments.
--
regards,
-grygorii
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [4.4-RT PATCH RFC/RFT] drivers: net: cpsw: mark rx/tx irq as IRQF_NO_THREAD
2016-09-09 12:46 ` Grygorii Strashko
(?)
@ 2016-09-15 14:39 ` Sebastian Andrzej Siewior
-1 siblings, 0 replies; 18+ messages in thread
From: Sebastian Andrzej Siewior @ 2016-09-15 14:39 UTC (permalink / raw)
To: Grygorii Strashko
Cc: Steven Rostedt, linux-omap, Alison Chaiken, linux-rt-users, netdev
On 2016-09-09 15:46:44 [+0300], Grygorii Strashko wrote:
>
> It looks like scheduler playing ping-pong between CPUs with threaded irqs irq/354-355.
> And seems this might be the case - if I pin both threaded IRQ handlers to CPU0
> I can see better latency and netperf improvement
> cyclictest -m -Sp98 -q -D4m
> T: 0 ( 1318) P:98 I:1000 C: 240000 Min: 9 Act: 14 Avg: 15 Max: 42
> T: 1 ( 1319) P:98 I:1500 C: 159909 Min: 9 Act: 14 Avg: 16 Max: 39
>
> if I arrange hwirqs and pin pin both threaded IRQ handlers on CPU1
> I can observe more less similar results as with this patch.
so no patch then.
> with this change i do not see "NOHZ: local_softirq_pending 80" any more
> Tested-by: Grygorii Strashko <grygorii.strashko@ti.com>
okay. So I need to think what I do about this. Either this or trying to
run the "higher" softirq first but this could break things.
Thanks for the confirmation.
> > - having the hard-IRQ and IRQ-thread on the same CPU might help, too. It
> > is not strictly required but saves a few cycles if you don't have to
> > perform cross CPU wake ups and migrate task forth and back. The latter
> > happens at prio 99.
>
> I've experimented with this and it improves netperf and I also followed instructions from [1].
> But seems messed ti pin threaded irqs to cpu.
> [1] https://www.osadl.org/Real-time-Ethernet-UDP-worst-case-roun.qa-farm-rt-ethernet-udp-monitor.0.html
There is irq_thread() => irq_thread_check_affinity(). It might not work
as expected on ARM but it makes sense for the thread to follow the
affinity mask of the HW irq.
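Following the affinity mask of the HW irq can be driven from userspace. A hedged sketch (the IRQ numbers 354/355 come from the trace in this thread and will differ per board; requires root):

```shell
# Pin both CPSW hard IRQs to CPU0; on -RT the forced-threaded handler
# should follow this mask via irq_thread_check_affinity()
echo 1 > /proc/irq/354/smp_affinity
echo 1 > /proc/irq/355/smp_affinity
cat /proc/irq/354/smp_affinity   # expect a bitmask selecting only CPU0
```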
> > - I am not sure NAPI works as expected. I would assume so. There is IRQ
> > 354 and 355 which fire after each other. One would be enough I guess.
> > And they seem to be short living / fire often. If NAPI works then it
> > should put an end to it and push it to the softirq thread.
> > If you have IRQ-pacing support I suggest to use something like 10ms or
> > so. That means your ping response will go from <= 1ms to 10ms in the
> > worst case but since you process more packets at a time your
> > throughput should increase.
> > If I count this correct, it too you alsmost 4ms from "raise SCHED" to
> > "try process SCHED" and most of the time was spent in 35[45] hard irq,
> > raise NET_RX or cross wakeup the IRQ thread.
>
> The question I have to dial with is why switching to RT cause so significant
> netperf drop (without additional tunning) comparing to vanilla - ~120% for 256K and ~200% for 128K windows?
You have a sched / thread ping/pong. That is one thing. !RT with
threaded irqs should show similar problems. The higher latency is caused
by the migration thread.
> It's of course expected to see netperf drop, but I assume not so significant :(
> And I can't find any reports or statistic related to this. Does the same happen on x86?
It should. Maybe at a lower level if it handles migration more
effectively. There is this watchdog thread (for instance) which tries to
detect lockups and runs at P99. It causes "worse" cyclictest numbers on
x86 and on ARM but on ARM this is more visible than on x86.
Sebastian
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [4.4-RT PATCH RFC/RFT] drivers: net: cpsw: mark rx/tx irq as IRQF_NO_THREAD
2016-09-09 12:46 ` Grygorii Strashko
@ 2017-04-07 23:35 ` Grygorii Strashko
-1 siblings, 0 replies; 18+ messages in thread
From: Grygorii Strashko @ 2017-04-07 23:35 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: Steven Rostedt, linux-omap, Alison Chaiken, linux-rt-users, netdev
Hi Sebastian,
On 09/09/2016 07:46 AM, Grygorii Strashko wrote:
> On 09/08/2016 07:00 PM, Sebastian Andrzej Siewior wrote:
>> On 2016-08-19 17:29:16 [+0300], Grygorii Strashko wrote:
>>> I've collected trace before first occurrence of "NOHZ: local_softirq_pending 80"
>>>
>
>>
>>> irq/354-4848400-85 [000] 90.642393: softirq_exit: vec=3 [action=NET_RX]
>>> irq/354-4848400-85 [000] 90.642419: sched_switch: irq/354-4848400:85 [49] S ==> rcuc/0:11 [98]
>>
>> We don't serve TIMER & SCHED because those two are pushed to the
>> ksoftirq thread(s). So we keep mostly doing NET_RX and now we switch to
>> the next best thing which is RCU.
>>
>>> rcuc/0-11 [000] 90.642430: irq_handler_entry: irq=354 name=48484000.ethernet
>> but not for long.
>>
>>> rcuc/0-11 [000] 90.642432: irq_handler_exit: irq=354 ret=handled
>>> rcuc/0-11 [000] 90.642435: sched_waking: comm=irq/354-4848400 pid=85 prio=49 target_cpu=000
>>> rcuc/0-11 [000] 90.642442: sched_migrate_task: comm=irq/354-4848400 pid=85 prio=49 orig_cpu=0 dest_cpu=1
>>> rcuc/0-11 [000] 90.642453: sched_wakeup: irq/354-4848400:85 [49] success=1 CPU:001
>>> iperf-1284 [001] 90.642462: sched_stat_runtime: comm=iperf pid=1284 runtime=113053 [ns] vruntime=2106997666 [ns]
>>> rcuc/0-11 [000] 90.642464: irq_handler_entry: irq=355 name=48484000.ethernet
>>> rcuc/0-11 [000] 90.642466: irq_handler_exit: irq=355 ret=handled
>>> rcuc/0-11 [000] 90.642469: sched_waking: comm=irq/355-4848400 pid=86 prio=49 target_cpu=001
>>> iperf-1284 [001] 90.642473: sched_switch: iperf:1284 [120] R ==> irq/354-4848400:85 [49]
>>> irq/354-4848400-85 [001] 90.642481: softirq_raise: vec=3 [action=NET_RX]
>>> rcuc/0-11 [000] 90.642483: sched_wakeup: irq/355-4848400:86 [49] success=1 CPU:001
>>> irq/354-4848400-85 [001] 90.642493: softirq_entry: vec=3 [action=NET_RX]
>>> rcuc/0-11 [000] 90.642497: sched_migrate_task: comm=irq/355-4848400 pid=86 prio=49 orig_cpu=1 dest_cpu=0
>> ach that IRQ thread no pinned. Good. We migrate.
>>
>
> It looks like scheduler playing ping-pong between CPUs with threaded irqs irq/354-355.
> And seems this might be the case - if I pin both threaded IRQ handlers to CPU0
> I can see better latency and netperf improvement
> cyclictest -m -Sp98 -q -D4m
> T: 0 ( 1318) P:98 I:1000 C: 240000 Min: 9 Act: 14 Avg: 15 Max: 42
> T: 1 ( 1319) P:98 I:1500 C: 159909 Min: 9 Act: 14 Avg: 16 Max: 39
>
> if I arrange hwirqs and pin pin both threaded IRQ handlers on CPU1
> I can observe more less similar results as with this patch.
>
>
>>> rcuc/0-11 [000] 90.642515: irq_handler_entry: irq=354 name=48484000.ethernet
>>> rcuc/0-11 [000] 90.642516: irq_handler_exit: irq=354 ret=handled
>
>>
>> As you see ksoftirqd left the CPU with a D so I would assume it is
>> blocked on a lock and waits.
>> NET_RX is in progress but scheduled out due to RCUC which is also
>> scheduled out.
>>
>> I assume we got to softirq because nothing else can run. It will see
>> that NET_RX is pending and tries it but blocks on the lock
>> (lock_softirq()). It schedules out. Nothing left -> idle.
>>
>> The idle code checks to see if a softirq is pending and in fact there is
>> SCHED on the list and ksoftirq was about to handle it but due to
>> ordering complication (NET_RX before SCHED) it can't. And we have the
>> warning.
>>
>> This
>>
>> --- a/kernel/softirq.c
>> +++ b/kernel/softirq.c
>> @@ -105,6 +105,7 @@ void softirq_check_pending_idle(void)
>> {
>> static int rate_limit;
>> struct softirq_runner *sr = this_cpu_ptr(&softirq_runners);
>> + struct task_struct *ksoft_tsk = __this_cpu_read(ksoftirqd);
>> u32 warnpending;
>> int i;
>>
>> @@ -112,7 +113,7 @@ void softirq_check_pending_idle(void)
>> return;
>>
>> warnpending = local_softirq_pending() & SOFTIRQ_STOP_IDLE_MASK;
>> - for (i = 0; i < NR_SOFTIRQS; i++) {
>> + for (i = 0; (i < NR_SOFTIRQS) && warnpending; i++) {
>> struct task_struct *tsk = sr->runner[i];
>>
>> /*
>> @@ -132,6 +133,15 @@ void softirq_check_pending_idle(void)
>> }
>> }
>>
>> + if (warnpending && ksoft_tsk) {
>> + raw_spin_lock(&ksoft_tsk->pi_lock);
>> + if (ksoft_tsk->pi_blocked_on || ksoft_tsk->state == TASK_RUNNING) {
>> + /* Clear all bits pending in that task */
>> + warnpending &= ~(ksoft_tsk->softirqs_raised);
>> + }
>> + raw_spin_unlock(&ksoft_tsk->pi_lock);
>> + }
>> +
>> if (warnpending) {
>> printk(KERN_ERR "NOHZ: local_softirq_pending %02x\n",
>> warnpending);
>>
>>
>> should avoid the warning if the softirq thread is blocked.
>
>
> with this change i do not see "NOHZ: local_softirq_pending 80" any more
> Tested-by: Grygorii Strashko <grygorii.strashko@ti.com>
>
The above change is not present in kernel 4.9 and I can still see NOHZ messages if
I run iperf on am57xx-evm. I'd like to know if you have any plans regarding this?
Copy-pasting your comments from another e-mail here:
--
okay. So I need to think what I do about this. Either this or trying to
run the "higher" softirq first but this could break things.
Thanks for the confirmation.
((c) Sebastian Andrzej Siewior)
--
root@am57xx-evm:~# iperf -c 192.168.1.1 -w128K -d -i5 -t120 & cyclictest -n -m -Sp97 -q -D2m
[1] 1078
# /dev/cpu_dma_latency set to 0us
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 256 KByte (WARNING: requested 128 KByte)
------------------------------------------------------------
------------------------------------------------------------
Client connecting to 192.168.1.1, TCP port 5001
TCP window size: 256 KByte (WARNING: requested 128 KByte)
------------------------------------------------------------
[ 5] local 192.168.1.2 port 39346 connected with 192.168.1.1 port 5001
[ 4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 50594
[ 90.829952] NOHZ: local_softirq_pending 80
[ 90.830001] NOHZ: local_softirq_pending 80
[ 91.529926] NOHZ: local_softirq_pending 80
[ 93.299956] NOHZ: local_softirq_pending 80
[ 93.680078] NOHZ: local_softirq_pending 80
[ 93.680128] NOHZ: local_softirq_pending 80
[ ID] Interval Transfer Bandwidth
[ 5] 0.0- 5.0 sec 73.2 MBytes 123 Mbits/sec
[ 4] 0.0- 5.0 sec 47.5 MBytes 79.7 Mbits/sec
[ 94.719951] NOHZ: local_softirq_pending 80
[ 96.439923] NOHZ: local_softirq_pending 80
[ 96.569922] NOHZ: local_softirq_pending 80
[ 96.569951] NOHZ: local_softirq_pending 80
[ 5] 5.0-10.0 sec 73.5 MBytes 123 Mbits/sec
PS: if i set net irqs to run on cpu0 (or CPU1) only it improve net throughput and
eliminates this NOHZ messages.
--
regards,
-grygorii
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [4.4-RT PATCH RFC/RFT] drivers: net: cpsw: mark rx/tx irq as IRQF_NO_THREAD
@ 2017-04-07 23:35 ` Grygorii Strashko
0 siblings, 0 replies; 18+ messages in thread
From: Grygorii Strashko @ 2017-04-07 23:35 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: Steven Rostedt, linux-omap, Alison Chaiken, linux-rt-users, netdev
Hi Sebastian,
On 09/09/2016 07:46 AM, Grygorii Strashko wrote:
> On 09/08/2016 07:00 PM, Sebastian Andrzej Siewior wrote:
>> On 2016-08-19 17:29:16 [+0300], Grygorii Strashko wrote:
>>> I've collected trace before first occurrence of "NOHZ: local_softirq_pending 80"
>>>
>
>>
>>> irq/354-4848400-85 [000] 90.642393: softirq_exit: vec=3 [action=NET_RX]
>>> irq/354-4848400-85 [000] 90.642419: sched_switch: irq/354-4848400:85 [49] S ==> rcuc/0:11 [98]
>>
>> We don't serve TIMER & SCHED because those two are pushed to the
>> ksoftirq thread(s). So we keep mostly doing NET_RX and now we switch to
>> the next best thing which is RCU.
>>
>>> rcuc/0-11 [000] 90.642430: irq_handler_entry: irq=354 name=48484000.ethernet
>> but not for long.
>>
>>> rcuc/0-11 [000] 90.642432: irq_handler_exit: irq=354 ret=handled
>>> rcuc/0-11 [000] 90.642435: sched_waking: comm=irq/354-4848400 pid=85 prio=49 target_cpu=000
>>> rcuc/0-11 [000] 90.642442: sched_migrate_task: comm=irq/354-4848400 pid=85 prio=49 orig_cpu=0 dest_cpu=1
>>> rcuc/0-11 [000] 90.642453: sched_wakeup: irq/354-4848400:85 [49] success=1 CPU:001
>>> iperf-1284 [001] 90.642462: sched_stat_runtime: comm=iperf pid=1284 runtime=113053 [ns] vruntime=2106997666 [ns]
>>> rcuc/0-11 [000] 90.642464: irq_handler_entry: irq=355 name=48484000.ethernet
>>> rcuc/0-11 [000] 90.642466: irq_handler_exit: irq=355 ret=handled
>>> rcuc/0-11 [000] 90.642469: sched_waking: comm=irq/355-4848400 pid=86 prio=49 target_cpu=001
>>> iperf-1284 [001] 90.642473: sched_switch: iperf:1284 [120] R ==> irq/354-4848400:85 [49]
>>> irq/354-4848400-85 [001] 90.642481: softirq_raise: vec=3 [action=NET_RX]
>>> rcuc/0-11 [000] 90.642483: sched_wakeup: irq/355-4848400:86 [49] success=1 CPU:001
>>> irq/354-4848400-85 [001] 90.642493: softirq_entry: vec=3 [action=NET_RX]
>>> rcuc/0-11 [000] 90.642497: sched_migrate_task: comm=irq/355-4848400 pid=86 prio=49 orig_cpu=1 dest_cpu=0
>> ah, that IRQ thread is not pinned. Good. We migrate.
>>
>
> It looks like the scheduler is playing ping-pong between CPUs with the threaded irqs irq/354-355.
> And it seems this might be the case - if I pin both threaded IRQ handlers to CPU0
> I can see better latency and a netperf improvement
> cyclictest -m -Sp98 -q -D4m
> T: 0 ( 1318) P:98 I:1000 C: 240000 Min: 9 Act: 14 Avg: 15 Max: 42
> T: 1 ( 1319) P:98 I:1500 C: 159909 Min: 9 Act: 14 Avg: 16 Max: 39
>
> if I arrange hwirqs and pin both threaded IRQ handlers on CPU1
> I can observe more or less the same results as with this patch.
>
>
>>> rcuc/0-11 [000] 90.642515: irq_handler_entry: irq=354 name=48484000.ethernet
>>> rcuc/0-11 [000] 90.642516: irq_handler_exit: irq=354 ret=handled
>
>>
>> As you see ksoftirqd left the CPU with a D so I would assume it is
>> blocked on a lock and waits.
>> NET_RX is in progress but scheduled out due to RCUC which is also
>> scheduled out.
>>
>> I assume we got to softirq because nothing else can run. It will see
>> that NET_RX is pending and tries it but blocks on the lock
>> (lock_softirq()). It schedules out. Nothing left -> idle.
>>
>> The idle code checks to see if a softirq is pending and in fact there is
>> SCHED on the list and ksoftirq was about to handle it but due to
>> ordering complication (NET_RX before SCHED) it can't. And we have the
>> warning.
>>
>> This
>>
>> --- a/kernel/softirq.c
>> +++ b/kernel/softirq.c
>> @@ -105,6 +105,7 @@ void softirq_check_pending_idle(void)
>> {
>> static int rate_limit;
>> struct softirq_runner *sr = this_cpu_ptr(&softirq_runners);
>> + struct task_struct *ksoft_tsk = __this_cpu_read(ksoftirqd);
>> u32 warnpending;
>> int i;
>>
>> @@ -112,7 +113,7 @@ void softirq_check_pending_idle(void)
>> return;
>>
>> warnpending = local_softirq_pending() & SOFTIRQ_STOP_IDLE_MASK;
>> - for (i = 0; i < NR_SOFTIRQS; i++) {
>> + for (i = 0; (i < NR_SOFTIRQS) && warnpending; i++) {
>> struct task_struct *tsk = sr->runner[i];
>>
>> /*
>> @@ -132,6 +133,15 @@ void softirq_check_pending_idle(void)
>> }
>> }
>>
>> + if (warnpending && ksoft_tsk) {
>> + raw_spin_lock(&ksoft_tsk->pi_lock);
>> + if (ksoft_tsk->pi_blocked_on || ksoft_tsk->state == TASK_RUNNING) {
>> + /* Clear all bits pending in that task */
>> + warnpending &= ~(ksoft_tsk->softirqs_raised);
>> + }
>> + raw_spin_unlock(&ksoft_tsk->pi_lock);
>> + }
>> +
>> if (warnpending) {
>> printk(KERN_ERR "NOHZ: local_softirq_pending %02x\n",
>> warnpending);
>>
>>
>> should avoid the warning if the softirq thread is blocked.
>
>
> with this change i do not see "NOHZ: local_softirq_pending 80" any more
> Tested-by: Grygorii Strashko <grygorii.strashko@ti.com>
>
The above change is not present in v4.9 and I can still see NOHZ messages if
I run iperf on am57xx-evm. I'd like to know if you have any plans regarding this?
Copy-pasting your comments from another e-mail here:
--
okay. So I need to think what I do about this. Either this or trying to
run the "higher" softirq first but this could break things.
Thanks for the confirmation.
((c) Sebastian Andrzej Siewior)
--
root@am57xx-evm:~# iperf -c 192.168.1.1 -w128K -d -i5 -t120 & cyclictest -n -m -Sp97 -q -D2m
[1] 1078
# /dev/cpu_dma_latency set to 0us
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 256 KByte (WARNING: requested 128 KByte)
------------------------------------------------------------
------------------------------------------------------------
Client connecting to 192.168.1.1, TCP port 5001
TCP window size: 256 KByte (WARNING: requested 128 KByte)
------------------------------------------------------------
[ 5] local 192.168.1.2 port 39346 connected with 192.168.1.1 port 5001
[ 4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 50594
[ 90.829952] NOHZ: local_softirq_pending 80
[ 90.830001] NOHZ: local_softirq_pending 80
[ 91.529926] NOHZ: local_softirq_pending 80
[ 93.299956] NOHZ: local_softirq_pending 80
[ 93.680078] NOHZ: local_softirq_pending 80
[ 93.680128] NOHZ: local_softirq_pending 80
[ ID] Interval Transfer Bandwidth
[ 5] 0.0- 5.0 sec 73.2 MBytes 123 Mbits/sec
[ 4] 0.0- 5.0 sec 47.5 MBytes 79.7 Mbits/sec
[ 94.719951] NOHZ: local_softirq_pending 80
[ 96.439923] NOHZ: local_softirq_pending 80
[ 96.569922] NOHZ: local_softirq_pending 80
[ 96.569951] NOHZ: local_softirq_pending 80
[ 5] 5.0-10.0 sec 73.5 MBytes 123 Mbits/sec
PS: if I set the net irqs to run on CPU0 (or CPU1) only, it improves net throughput and
eliminates these NOHZ messages.
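For reference, the pinning described in the PS can be scripted through the standard /proc/irq interface. The sketch below is a dry run: it only prints the commands it would execute, since the actual writes need root on a real board. The IRQ numbers 354/355 are assumptions taken from the trace above and must be confirmed against /proc/interrupts on the target.

```shell
#!/bin/sh
# Hypothetical sketch: pin both CPSW channel IRQs to one CPU.
# IRQ numbers 354/355 are assumptions from the trace above; verify
# them in /proc/interrupts on the actual am57xx-evm before use.
RX_IRQ=354
TX_IRQ=355
CPU=0                               # pin everything to CPU0
MASK=$(printf '%x' $((1 << CPU)))   # smp_affinity takes a hex CPU bitmask

# Dry run: print the commands instead of writing to /proc. On -RT the
# forced-threaded handler follows the hard IRQ's affinity, so this
# effectively pins the irq/354 and irq/355 threads as well.
for irq in "$RX_IRQ" "$TX_IRQ"; do
    echo "echo $MASK > /proc/irq/$irq/smp_affinity"
done
```

Dropping the echo quoting turns the dry run into the real thing; the mask is a plain hex bitmask, so CPU1 would be `2` and CPU0+CPU1 would be `3`.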
--
regards,
-grygorii
end of thread, other threads:[~2017-04-07 23:35 UTC | newest]
Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-08-11 16:15 [4.4-RT PATCH RFC/RFT] drivers: net: cpsw: mark rx/tx irq as IRQF_NO_THREAD Grygorii Strashko
2016-08-11 16:36 ` Steven Rostedt
2016-08-12  8:32 ` Sebastian Andrzej Siewior
2016-08-12 15:58 ` Grygorii Strashko
2016-09-08 14:28 ` Sebastian Andrzej Siewior
2016-09-08 16:24 ` Grygorii Strashko
2016-08-19 14:29 ` Grygorii Strashko
2016-09-08 16:00 ` Sebastian Andrzej Siewior
2016-09-09 12:46 ` Grygorii Strashko
2016-09-15 14:39 ` Sebastian Andrzej Siewior
2017-04-07 23:35 ` Grygorii Strashko