From: Gratian Crisan <gratian.crisan@ni.com>
To: Thomas Gleixner <tglx@linutronix.de>,
Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
Steven Rostedt <rostedt@goodmis.org>
Cc: linux-rt-users@vger.kernel.org
Subject: irq thread latency caused by softirq_ctrl.lock contention
Date: Wed, 15 Sep 2021 17:59:50 -0500 [thread overview]
Message-ID: <87a6kd78n7.fsf@ni.com> (raw)
Hi guys,
We are seeing some pretty significant degradation in some of our
benchmarks when using 5.10.59-rt52 or 5.14.1-rt19 kernels versus
previous numbers on 4.14-rt and I'm looking for thoughts/ideas if
anything can be done about it.
The synthetic benchmark in question is measuring the latency of a
"analog in" -> "analog out" loop. We have two versions of this
benchmark: one that is interrupt based and one that polls on the I/O.
The polling mode benchmark is comparable between 5.x-rt and 4.14-rt.
However the most extreme case of the interrupt based one shows up to 60%
degradation in latency in 5.x-rt kernels.
I was able to confirm that the issue is caused by contention on
'softirq_ctrl.lock' local lock in the softirq code. This results in
event traces with the following pattern:
<idle>-0 [000] 582.023481: sched_switch: prev_comm=swapper/0 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=ksoftirqd/0 next_pid=12 next_prio=91
ksoftirqd/0-12 [000] 582.023485: irq_handler_entry: irq=102 name=atomicchinchk
ksoftirqd/0-12 [000] 582.023485: irq_handler_exit: irq=102 ret=handled
ksoftirqd/0-12 [000] 582.023486: sched_waking: comm=irq/102-atomicc pid=248 prio=1 target_cpu=000
ksoftirqd/0-12 [000] 582.023488: sched_wakeup: comm=irq/102-atomicc pid=248 prio=1 target_cpu=000
ksoftirqd/0-12 [000] 582.023490: rcu_utilization: Start context switch
ksoftirqd/0-12 [000] 582.023490: rcu_utilization: End context switch
ksoftirqd/0-12 [000] 582.023493: sched_switch: prev_comm=ksoftirqd/0 prev_pid=12 prev_prio=91 prev_state=R+ ==> next_comm=irq/102-atomicc next_pid=248 next_prio=1
irq/102-atomicc-248 [000] 582.023496: sched_pi_setprio: comm=ksoftirqd/0 pid=12 oldprio=91 newprio=1
irq/102-atomicc-248 [000] 582.023499: rcu_utilization: Start context switch
irq/102-atomicc-248 [000] 582.023499: rcu_utilization: End context switch
irq/102-atomicc-248 [000] 582.023501: sched_switch: prev_comm=irq/102-atomicc prev_pid=248 prev_prio=1 prev_state=R ==> next_comm=ksoftirqd/0 next_pid=12 next_prio=1
ksoftirqd/0-12 [000] 582.023503: softirq_entry: vec=7 [action=SCHED]
ksoftirqd/0-12 [000] 582.023508: softirq_exit: vec=7 [action=SCHED]
ksoftirqd/0-12 [000] 582.023510: sched_pi_setprio: comm=ksoftirqd/0 pid=12 oldprio=1 newprio=91
ksoftirqd/0-12 [000] 582.023513: sched_waking: comm=irq/102-atomicc pid=248 prio=1 target_cpu=000
ksoftirqd/0-12 [000] 582.023515: sched_wakeup: comm=irq/102-atomicc pid=248 prio=1 target_cpu=000
ksoftirqd/0-12 [000] 582.023519: rcu_utilization: Start context switch
ksoftirqd/0-12 [000] 582.023519: rcu_utilization: End context switch
ksoftirqd/0-12 [000] 582.023521: sched_switch: prev_comm=ksoftirqd/0 prev_pid=12 prev_prio=91 prev_state=S ==> next_comm=irq/102-atomicc next_pid=248 next_prio=1
irq/102-atomicc-248 [000] 582.023530: sched_waking: comm=S81039292 pid=2240 prio=40 target_cpu=001
irq/102-atomicc-248 [000] 582.023532: sched_wake_idle_without_ipi: cpu=1
irq/102-atomicc-248 [000] 582.023533: sched_wakeup: comm=S81039292 pid=2240 prio=40 target_cpu=001
irq/102-atomicc-248 [000] 582.023539: rcu_utilization: Start context switch
irq/102-atomicc-248 [000] 582.023540: rcu_utilization: End context switch
irq/102-atomicc-248 [000] 582.023542: sched_switch: prev_comm=irq/102-atomicc prev_pid=248 prev_prio=1 prev_state=S ==> next_comm=swapper/0 next_pid=0 next_prio=120
The 'irq/102-atomicc-248' irq thread is our high priority data
acquisition thread. The additional pi boost and context switch seems to
account for the main differences I'm seeing versus 4.14-rt. This race +
pi boost happens with other lower priority irq threads too but the
ksoftirq case is the most common one.
I would appreciate any thoughts on how/if we could improve this?
Thanks,
Gratian
next reply other threads:[~2021-09-15 23:38 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-09-15 22:59 Gratian Crisan [this message]
2021-09-20 9:18 ` irq thread latency caused by softirq_ctrl.lock contention Sebastian Andrzej Siewior
2021-09-22 19:16 ` Gratian Crisan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87a6kd78n7.fsf@ni.com \
--to=gratian.crisan@ni.com \
--cc=bigeasy@linutronix.de \
--cc=linux-rt-users@vger.kernel.org \
--cc=rostedt@goodmis.org \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.