From: Gratian Crisan <gratian.crisan@ni.com>
To: Thomas Gleixner <tglx@linutronix.de>,
	Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	Steven Rostedt <rostedt@goodmis.org>
Cc: linux-rt-users@vger.kernel.org
Subject: irq thread latency caused by softirq_ctrl.lock contention
Date: Wed, 15 Sep 2021 17:59:50 -0500
Message-ID: <87a6kd78n7.fsf@ni.com>

Hi guys,

We are seeing some pretty significant degradation in some of our
benchmarks when using the 5.10.59-rt52 or 5.14.1-rt19 kernels versus
our previous numbers on 4.14-rt, and I'm looking for thoughts/ideas on
whether anything can be done about it.

The synthetic benchmark in question measures the latency of an
"analog in" -> "analog out" loop. We have two versions of this
benchmark: one that is interrupt based and one that polls on the I/O.
The polling-mode benchmark is comparable between 5.x-rt and 4.14-rt.
However, the most extreme case of the interrupt-based one shows up to
60% latency degradation on the 5.x-rt kernels.

I was able to confirm that the issue is caused by contention on the
'softirq_ctrl.lock' local lock in the softirq code. This results in
event traces with the following pattern:

          <idle>-0     [000]   582.023481: sched_switch:         prev_comm=swapper/0 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=ksoftirqd/0 next_pid=12 next_prio=91
     ksoftirqd/0-12    [000]   582.023485: irq_handler_entry:    irq=102 name=atomicchinchk
     ksoftirqd/0-12    [000]   582.023485: irq_handler_exit:     irq=102 ret=handled
     ksoftirqd/0-12    [000]   582.023486: sched_waking:         comm=irq/102-atomicc pid=248 prio=1 target_cpu=000
     ksoftirqd/0-12    [000]   582.023488: sched_wakeup:         comm=irq/102-atomicc pid=248 prio=1 target_cpu=000
     ksoftirqd/0-12    [000]   582.023490: rcu_utilization:      Start context switch
     ksoftirqd/0-12    [000]   582.023490: rcu_utilization:      End context switch
     ksoftirqd/0-12    [000]   582.023493: sched_switch:         prev_comm=ksoftirqd/0 prev_pid=12 prev_prio=91 prev_state=R+ ==> next_comm=irq/102-atomicc next_pid=248 next_prio=1
 irq/102-atomicc-248   [000]   582.023496: sched_pi_setprio:     comm=ksoftirqd/0 pid=12 oldprio=91 newprio=1
 irq/102-atomicc-248   [000]   582.023499: rcu_utilization:      Start context switch
 irq/102-atomicc-248   [000]   582.023499: rcu_utilization:      End context switch
 irq/102-atomicc-248   [000]   582.023501: sched_switch:         prev_comm=irq/102-atomicc prev_pid=248 prev_prio=1 prev_state=R ==> next_comm=ksoftirqd/0 next_pid=12 next_prio=1
     ksoftirqd/0-12    [000]   582.023503: softirq_entry:        vec=7 [action=SCHED]
     ksoftirqd/0-12    [000]   582.023508: softirq_exit:         vec=7 [action=SCHED]
     ksoftirqd/0-12    [000]   582.023510: sched_pi_setprio:     comm=ksoftirqd/0 pid=12 oldprio=1 newprio=91
     ksoftirqd/0-12    [000]   582.023513: sched_waking:         comm=irq/102-atomicc pid=248 prio=1 target_cpu=000
     ksoftirqd/0-12    [000]   582.023515: sched_wakeup:         comm=irq/102-atomicc pid=248 prio=1 target_cpu=000
     ksoftirqd/0-12    [000]   582.023519: rcu_utilization:      Start context switch
     ksoftirqd/0-12    [000]   582.023519: rcu_utilization:      End context switch
     ksoftirqd/0-12    [000]   582.023521: sched_switch:         prev_comm=ksoftirqd/0 prev_pid=12 prev_prio=91 prev_state=S ==> next_comm=irq/102-atomicc next_pid=248 next_prio=1
 irq/102-atomicc-248   [000]   582.023530: sched_waking:         comm=S81039292 pid=2240 prio=40 target_cpu=001
 irq/102-atomicc-248   [000]   582.023532: sched_wake_idle_without_ipi: cpu=1
 irq/102-atomicc-248   [000]   582.023533: sched_wakeup:         comm=S81039292 pid=2240 prio=40 target_cpu=001
 irq/102-atomicc-248   [000]   582.023539: rcu_utilization:      Start context switch
 irq/102-atomicc-248   [000]   582.023540: rcu_utilization:      End context switch
 irq/102-atomicc-248   [000]   582.023542: sched_switch:         prev_comm=irq/102-atomicc prev_pid=248 prev_prio=1 prev_state=S ==> next_comm=swapper/0 next_pid=0 next_prio=120

The 'irq/102-atomicc-248' irq thread is our high-priority data
acquisition thread. The additional PI boost and context switch seem to
account for the main difference I'm seeing versus 4.14-rt. This race +
PI boost happens with other lower-priority irq threads too, but the
ksoftirqd case is the most common one.
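
For reference, here is my understanding of where the contention comes
from, paraphrasing (and simplifying, so possibly not line-accurate) the
relevant bits of kernel/softirq.c and kernel/irq/manage.c on 5.14-rt:

    /* Per-CPU BH serialization on PREEMPT_RT; on RT the local_lock_t
     * is a sleeping, priority-inheriting lock.
     */
    struct softirq_ctrl {
            local_lock_t    lock;
            int             cnt;
    };

    static DEFINE_PER_CPU(struct softirq_ctrl, softirq_ctrl) = {
            .lock   = INIT_LOCAL_LOCK(softirq_ctrl.lock),
    };

    /* On RT, local_bh_disable() takes softirq_ctrl.lock on the first
     * entry into a BH-disabled section (accounting elided).
     */
    void __local_bh_disable_ip(unsigned long ip, unsigned int cnt)
    {
            if (!current->softirq_disable_cnt && preemptible()) {
                    local_lock(&softirq_ctrl.lock);
                    rcu_read_lock();
            }
            /* ... */
    }

    /* Forced-threaded handlers run with BHs disabled, so every
     * invocation of the handler thread has to acquire the same per-CPU
     * lock that ksoftirqd holds while it is processing softirqs.
     */
    static irqreturn_t irq_forced_thread_fn(struct irq_desc *desc,
                                            struct irqaction *action)
    {
            irqreturn_t ret;

            local_bh_disable();     /* may block -> PI-boosts the lock owner */
            ret = action->thread_fn(action->irq, action->dev_id);
            local_bh_enable();      /* runs pending softirqs, drops the lock */
            return ret;
    }

So when the AI interrupt fires while ksoftirqd/0 is inside a
BH-disabled section, irq/102-atomicc blocks on softirq_ctrl.lock,
boosts ksoftirqd to prio 1, and only runs after ksoftirqd finishes the
pending softirq and drops the lock - exactly the extra PI boost plus
context-switch pair visible in the trace above.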

I would appreciate any thoughts on how, or whether, we could improve this.

Thanks,
    Gratian
