* net latency, long wait for softirq_entry
@ 2020-12-05 18:30 Paul Thomas
2020-12-05 18:53 ` Paul Thomas
2020-12-18 13:52 ` Sebastian Andrzej Siewior
0 siblings, 2 replies; 5+ messages in thread
From: Paul Thomas @ 2020-12-05 18:30 UTC (permalink / raw)
To: linux-rt-users
(this time with a proper subject)
Hello,
I've been testing out the 5.10 rt kernel[1], and now I'm trying to
track down a networking latency issue. Generally this works well, I
have a user thread (priority 99) and the driver (macb) rx irq thread
(priority 98). With tracing (kernelshark is very cool btw) I can see
the normal flow, the irq thread does the netif_receive_skb:
softirq_raise
softirq_entry
napi_gro_receive_entry
napi_gro_receive_exit
netif_receive_skb
sched_waking
sched_wakeup
softirq_exit
sched_switch
Then the user thread does:
sys_exit_recvmsg
But then in the long latency case, instead of going straight to
softirq_entry this is delayed and it does the softirq_entry from the
ksoftirqd task instead of the irq/30-eth%d task. The problem is that
before ksoftirqd runs, another lower priority task runs (in this case
it's the macb tx interrupt thread at priority 80, but I've seen this
be user threads in slightly different tests). It seems the paths
diverge with a call to __kthread_should_park.
This is on an arm64 zynqmp platform.
Any thoughts on how this could be improved would be appreciated.
Cyclictest seems OK, a 10 minute run with stress-ng running shows a
maximum latency of 47us
# ./cyclictest -S -m -p 99 -d 0 -i 200 -D 10m
WARN: stat /dev/cpu_dma_latency failed: No such file or directory
policy: fifo: loadavg: 15.99 15.74 13.41 13/152 2605
T: 0 ( 5687) P:99 I:200 C:2999965 Min: 6 Act: 12 Avg: 11 Max: 39
T: 1 ( 5701) P:99 I:200 C:2999647 Min: 8 Act: 12 Avg: 10 Max: 40
T: 2 ( 5720) P:99 I:200 C:2999343 Min: 8 Act: 11 Avg: 10 Max: 41
T: 3 ( 5740) P:99 I:200 C:2999056 Min: 7 Act: 10 Avg: 11 Max: 47
-Paul
[1] I saw the v5.10-rc6-rt14 announcement yesterday, and rebased to
that to for the tracing mentioned here
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: net latency, long wait for softirq_entry
2020-12-05 18:30 net latency, long wait for softirq_entry Paul Thomas
@ 2020-12-05 18:53 ` Paul Thomas
2020-12-18 13:52 ` Sebastian Andrzej Siewior
1 sibling, 0 replies; 5+ messages in thread
From: Paul Thomas @ 2020-12-05 18:53 UTC (permalink / raw)
To: linux-rt-users
The trace can be found here:
https://bugzilla.kernel.org/attachment.cgi?id=293951
The softirq_raise with a long wait before the softirq_entry starts at
time 515.509590.
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: net latency, long wait for softirq_entry
2020-12-05 18:30 net latency, long wait for softirq_entry Paul Thomas
2020-12-05 18:53 ` Paul Thomas
@ 2020-12-18 13:52 ` Sebastian Andrzej Siewior
[not found] ` <CAFzL-7ujcJ4KDbgVoq-N6GRTWfemCA6hfqOzBGwcaDzt=0rGzQ@mail.gmail.com>
1 sibling, 1 reply; 5+ messages in thread
From: Sebastian Andrzej Siewior @ 2020-12-18 13:52 UTC (permalink / raw)
To: Paul Thomas, Thomas Gleixner; +Cc: linux-rt-users
On 2020-12-05 13:30:21 [-0500], Paul Thomas wrote:
> (this time with a proper subject)
> Hello,
Hi,
> I've been testing out the 5.10 rt kernel[1], and now I'm trying to
…
> But then in the long latency case, instead of going straight to
> softirq_entry this is delayed and it does the softirq_entry from the
> ksoftirqd task instead of the irq/30-eth%d task. The problem is that
> before ksoftirqd runs, another lower priority task runs (in this case
> it's the macb tx interrupt thread at priority 80, but I've seen this
> be user threads in slightly different tests). It seems the paths
> diverge with a call to __kthread_should_park.
>
> This is on an arm64 zynqmp platform.
>
> Any thoughts on how this could be improved would be appreciated.
The usual flow is:
- wake IRQ thread
- IRQ thread raises NAPI (softirq NET_RX)
- NAPI is processed on the exit of the IRQ thread.
This flow was disturbed by the timer:
- stress-ng is running
- the HZ tick timer fired, raised TIMER softirq. This leads to waking
the ksoftirqd thread to process it (this is normal).
- stress-ng continues.
- the networking IRQ fires. The thread is woken up, raises NET_RX
softirq. Usually, the softirq would be invoked on return from the
interrupt thread but since ksoftirqd is already woken nothing happens.
It is delayed to ksoftirqd. The IRQ-thread goes idle.
Having a thread to run the tick-timer would avoid that scenario.
Side note: If the networking thread runs for a longer period of then it
will handover to ksoftirqd.
-Paul
Sebastian
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2020-12-21 5:27 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-05 18:30 net latency, long wait for softirq_entry Paul Thomas
2020-12-05 18:53 ` Paul Thomas
2020-12-18 13:52 ` Sebastian Andrzej Siewior
[not found] ` <CAFzL-7ujcJ4KDbgVoq-N6GRTWfemCA6hfqOzBGwcaDzt=0rGzQ@mail.gmail.com>
2020-12-18 16:49 ` Sebastian Andrzej Siewior
2020-12-21 0:08 ` Paul Thomas
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.