* irq thread latency caused by softirq_ctrl.lock contention
@ 2021-09-15 22:59 Gratian Crisan
  2021-09-20  9:18 ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 3+ messages in thread
From: Gratian Crisan @ 2021-09-15 22:59 UTC (permalink / raw)
  To: Thomas Gleixner, Sebastian Andrzej Siewior, Steven Rostedt; +Cc: linux-rt-users

Hi guys,

We are seeing some pretty significant degradation in some of our
benchmarks when using 5.10.59-rt52 or 5.14.1-rt19 kernels versus
previous numbers on 4.14-rt, and I'm looking for thoughts/ideas on
whether anything can be done about it.

The synthetic benchmark in question is measuring the latency of an
"analog in" -> "analog out" loop. We have two versions of this
benchmark: one that is interrupt based and one that polls on the I/O.
The polling mode benchmark is comparable between 5.x-rt and 4.14-rt.
However, the most extreme case of the interrupt-based one shows up to
60% latency degradation on 5.x-rt kernels.

I was able to confirm that the issue is caused by contention on the
'softirq_ctrl.lock' local lock in the softirq code. This results in
event traces with the following pattern:

          <idle>-0     [000]   582.023481: sched_switch:         prev_comm=swapper/0 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=ksoftirqd/0 next_pid=12 next_prio=91
     ksoftirqd/0-12    [000]   582.023485: irq_handler_entry:    irq=102 name=atomicchinchk
     ksoftirqd/0-12    [000]   582.023485: irq_handler_exit:     irq=102 ret=handled
     ksoftirqd/0-12    [000]   582.023486: sched_waking:         comm=irq/102-atomicc pid=248 prio=1 target_cpu=000
     ksoftirqd/0-12    [000]   582.023488: sched_wakeup:         comm=irq/102-atomicc pid=248 prio=1 target_cpu=000
     ksoftirqd/0-12    [000]   582.023490: rcu_utilization:      Start context switch
     ksoftirqd/0-12    [000]   582.023490: rcu_utilization:      End context switch
     ksoftirqd/0-12    [000]   582.023493: sched_switch:         prev_comm=ksoftirqd/0 prev_pid=12 prev_prio=91 prev_state=R+ ==> next_comm=irq/102-atomicc next_pid=248 next_prio=1
 irq/102-atomicc-248   [000]   582.023496: sched_pi_setprio:     comm=ksoftirqd/0 pid=12 oldprio=91 newprio=1
 irq/102-atomicc-248   [000]   582.023499: rcu_utilization:      Start context switch
 irq/102-atomicc-248   [000]   582.023499: rcu_utilization:      End context switch
 irq/102-atomicc-248   [000]   582.023501: sched_switch:         prev_comm=irq/102-atomicc prev_pid=248 prev_prio=1 prev_state=R ==> next_comm=ksoftirqd/0 next_pid=12 next_prio=1
     ksoftirqd/0-12    [000]   582.023503: softirq_entry:        vec=7 [action=SCHED]
     ksoftirqd/0-12    [000]   582.023508: softirq_exit:         vec=7 [action=SCHED]
     ksoftirqd/0-12    [000]   582.023510: sched_pi_setprio:     comm=ksoftirqd/0 pid=12 oldprio=1 newprio=91
     ksoftirqd/0-12    [000]   582.023513: sched_waking:         comm=irq/102-atomicc pid=248 prio=1 target_cpu=000
     ksoftirqd/0-12    [000]   582.023515: sched_wakeup:         comm=irq/102-atomicc pid=248 prio=1 target_cpu=000
     ksoftirqd/0-12    [000]   582.023519: rcu_utilization:      Start context switch
     ksoftirqd/0-12    [000]   582.023519: rcu_utilization:      End context switch
     ksoftirqd/0-12    [000]   582.023521: sched_switch:         prev_comm=ksoftirqd/0 prev_pid=12 prev_prio=91 prev_state=S ==> next_comm=irq/102-atomicc next_pid=248 next_prio=1
 irq/102-atomicc-248   [000]   582.023530: sched_waking:         comm=S81039292 pid=2240 prio=40 target_cpu=001
 irq/102-atomicc-248   [000]   582.023532: sched_wake_idle_without_ipi: cpu=1
 irq/102-atomicc-248   [000]   582.023533: sched_wakeup:         comm=S81039292 pid=2240 prio=40 target_cpu=001
 irq/102-atomicc-248   [000]   582.023539: rcu_utilization:      Start context switch
 irq/102-atomicc-248   [000]   582.023540: rcu_utilization:      End context switch
 irq/102-atomicc-248   [000]   582.023542: sched_switch:         prev_comm=irq/102-atomicc prev_pid=248 prev_prio=1 prev_state=S ==> next_comm=swapper/0 next_pid=0 next_prio=120

The 'irq/102-atomicc-248' irq thread is our high priority data
acquisition thread. The additional pi boost and context switch seem to
account for the main differences I'm seeing versus 4.14-rt. This race +
pi boost happens with other lower-priority irq threads too, but the
ksoftirqd case is the most common one.

I would appreciate any thoughts on whether and how we could improve this.

Thanks,
    Gratian


* Re: irq thread latency caused by softirq_ctrl.lock contention
  2021-09-15 22:59 irq thread latency caused by softirq_ctrl.lock contention Gratian Crisan
@ 2021-09-20  9:18 ` Sebastian Andrzej Siewior
  2021-09-22 19:16   ` Gratian Crisan
  0 siblings, 1 reply; 3+ messages in thread
From: Sebastian Andrzej Siewior @ 2021-09-20  9:18 UTC (permalink / raw)
  To: Gratian Crisan; +Cc: Thomas Gleixner, Steven Rostedt, linux-rt-users

On 2021-09-15 17:59:50 [-0500], Gratian Crisan wrote:
> Hi guys,
Hi,

…
> The 'irq/102-atomicc-248' irq thread is our high priority data
> acquisition thread. The additional pi boost and context switch seem to
> account for the main differences I'm seeing versus 4.14-rt. This race +
> pi boost happens with other lower-priority irq threads too, but the
> ksoftirqd case is the most common one.
> 
> I would appreciate any thoughts on whether and how we could improve this.

It appears to be a consequence of the new softirq design/handling.
Earlier we could have multiple softirqs running in parallel on a single
CPU (e.g. NET_RX and NET_TX).
With the new design only one softirq can be handled at a time, resulting
in a full synchronisation at local_bh_disable() time via the lock you
mention in the subject.
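
For reference, a rough sketch of what the RT variant of
local_bh_disable() in kernel/softirq.c does (simplified, the exact code
differs between kernel versions):

struct softirq_ctrl {
	local_lock_t	lock;	/* the softirq_ctrl.lock from your traces */
	int		cnt;
};

static DEFINE_PER_CPU(struct softirq_ctrl, softirq_ctrl) = {
	.lock	= INIT_LOCAL_LOCK(softirq_ctrl.lock),
};

void __local_bh_disable_ip(unsigned long ip, unsigned int cnt)
{
	/* First BH-disabled section for this task? */
	if (!current->softirq_disable_cnt) {
		/*
		 * On PREEMPT_RT this is a sleeping, PI-aware per-CPU lock:
		 * if another task (e.g. ksoftirqd) is inside a BH section
		 * on this CPU, the caller blocks here and boosts the owner.
		 */
		local_lock(&softirq_ctrl.lock);
		rcu_read_lock();
	}
	/* softirq_disable_cnt / preempt count accounting elided */
}
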
In your case it appears that irq/102-atomicc is force-threaded and
therefore BH has to be disabled before its execution. This is just to
mimic what upstream does in terms of locking and to ensure that BH
invocation happens after the threaded interrupt has ended.
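
The forced-threading wrapper (irq_forced_thread_fn() in
kernel/irq/manage.c) does roughly the following - simplified sketch,
not the exact code:

static irqreturn_t irq_forced_thread_fn(struct irq_desc *desc,
					struct irqaction *action)
{
	irqreturn_t ret;

	/* This is where irq/102-atomicc contends on softirq_ctrl.lock */
	local_bh_disable();
	ret = action->thread_fn(action->irq, action->dev_id);
	irq_finalize_oneshot(desc, action);
	/* Any softirq raised by the handler runs before we return */
	local_bh_enable();
	return ret;
}
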
If there is nothing special about this interrupt handler (in terms of BH
handling) you could request a threaded handler for the IRQ. Manually
threaded handlers do not disable BH before their invocation.
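
Something along these lines in your driver (the names here are made up;
with a NULL primary handler IRQF_ONESHOT is required):

/*
 * Hypothetical snippet: request an explicitly threaded handler so the
 * irq thread runs without the local_bh_disable()/local_bh_enable()
 * wrapping that force-threaded handlers get.
 */
ret = request_threaded_irq(irq, NULL,		/* no primary handler */
			   atomicc_thread_fn,	/* your threaded handler */
			   IRQF_ONESHOT,	/* required with NULL primary */
			   "atomicchinchk", dev);
if (ret)
	dev_err(dev, "failed to request irq %d: %d\n", irq, ret);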

> Thanks,
>     Gratian

Sebastian


* Re: irq thread latency caused by softirq_ctrl.lock contention
  2021-09-20  9:18 ` Sebastian Andrzej Siewior
@ 2021-09-22 19:16   ` Gratian Crisan
  0 siblings, 0 replies; 3+ messages in thread
From: Gratian Crisan @ 2021-09-22 19:16 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior; +Cc: Thomas Gleixner, Steven Rostedt, linux-rt-users


Sebastian Andrzej Siewior writes:

> On 2021-09-15 17:59:50 [-0500], Gratian Crisan wrote:
>> Hi guys,
> Hi,
>
> …
>> The 'irq/102-atomicc-248' irq thread is our high priority data
>> acquisition thread. The additional pi boost and context switch seem to
>> account for the main differences I'm seeing versus 4.14-rt. This race +
>> pi boost happens with other lower-priority irq threads too, but the
>> ksoftirqd case is the most common one.
>> 
>> I would appreciate any thoughts on whether and how we could improve this.
>
> It appears to be a consequence of the new softirq design/handling.
> Earlier we could have multiple softirqs running in parallel on a single
> CPU (e.g. NET_RX and NET_TX).
> With the new design only one softirq can be handled at a time, resulting
> in a full synchronisation at local_bh_disable() time via the lock you
> mention in the subject.

Makes sense.

> In your case it appears that irq/102-atomicc is force-threaded and
> therefore BH has to be disabled before its execution. This is just to
> mimic what upstream does in terms of locking and to ensure that BH
> invocation happens after the threaded interrupt has ended.

Yes, good catch.

> If there is nothing special about this interrupt handler (in terms of BH
> handling) you could request a threaded handler for the IRQ. Manually
> threaded handlers do not disable BH before their invocation.
>

I appreciate the insight. I think this will solve our problem.

>
> Sebastian

Thanks again,
    Gratian

