linux-kernel.vger.kernel.org archive mirror
* CONFIG_PREEMPT_RT local_softirq_pending warning when ISR blocks
@ 2015-03-05  5:16 Brian Silverman
  2015-03-09 16:08 ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 4+ messages in thread
From: Brian Silverman @ 2015-03-05  5:16 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-rt-users, Austin Schuh, Thomas Gleixner, Ingo Molnar

Hi,

I'm seeing "NOHZ: local_softirq_pending" warnings with 3.14.31-rt28
(3.14.3-rt4 too). They're mostly "NOHZ: local_softirq_pending 08"
(NET_RX), but they happen occasionally for other softirqs too (0c, i.e.
NET_RX plus NET_TX, is the second-most common). CONFIG_PREEMPT_RT_FULL and
CONFIG_NO_HZ_FULL(_ALL) are enabled. Heavy load on a CAN card
(sja1000) consistently triggers the messages, but they also happen
occasionally without explicitly trying to stress the system.
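
(For reference, I'm decoding those masks against the softirq vector
numbers from include/linux/interrupt.h, which as far as I know look like
this in 3.14; quoted here just for convenience, so 0x08 == 1 << NET_RX_SOFTIRQ
and 0x0c == (1 << NET_TX_SOFTIRQ) | (1 << NET_RX_SOFTIRQ):)

enum
{
	HI_SOFTIRQ = 0,
	TIMER_SOFTIRQ,
	NET_TX_SOFTIRQ,
	NET_RX_SOFTIRQ,
	BLOCK_SOFTIRQ,
	BLOCK_IOPOLL_SOFTIRQ,
	TASKLET_SOFTIRQ,
	SCHED_SOFTIRQ,
	HRTIMER_SOFTIRQ,
	RCU_SOFTIRQ,

	NR_SOFTIRQS
};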

I got a trace showing a threaded ISR handler blocking on an rt_mutex
after raising a softirq, which results in the CPU going idle with the
softirq still pending because irq_exit() hasn't happened yet. I don't
know whether the softirq should somehow be getting run at that point,
whether the message is spurious in this situation, or whether there's
some other reason this should never happen.

The task holding the rt_mutex was preempted by another ISR handler at
the same priority as the one which has the problem. I'm not quite
sure, but I think it might have been running a softirq at the time it
got preempted, if that makes any difference.

Here are the parts of the trace that look relevant, split up between the
2 interesting CPUs (both hyperthreads of 1 of the 2 cores in the
system). If anybody can think of any other information that would be
useful, please do ask.

Beforehand, 000 is spending most of its time in interrupts, but bash
is doing something related to memory management on it in between.
            bash-14721 [000] ......1  6854.629126: rt_spin_lock <-free_pcppages_bulk
            bash-14721 [000] ....1.1  6854.629126: mm_page_pcpu_drain: page=ffffea000fa1aa40 pfn=4097705 order=0 migratetype=1
            bash-14721 [000] ......1  6854.629127: get_pfnblock_flags_mask <-free_pcppages_bulk
            bash-14721 [000] ......1  6854.629127: __mod_zone_page_state <-free_pcppages_bulk
            bash-14721 [000] ....1.1  6854.629127: mm_page_pcpu_drain: page=ffffea000f572ac0 pfn=4021419 order=0 migratetype=0
            bash-14721 [000] ......1  6854.629128: get_pfnblock_flags_mask <-free_pcppages_bulk
            bash-14721 [000] ......1  6854.629128: __mod_zone_page_state <-free_pcppages_bulk
... # lots more virtually identical repetitions of those last 3 lines
            bash-14721 [000] ....1.1  6854.629139: mm_page_pcpu_drain: page=ffffea000f481a80 pfn=4005994 order=0 migratetype=1
            bash-14721 [000] ......1  6854.629139: get_pfnblock_flags_mask <-free_pcppages_bulk
            bash-14721 [000] d.....1  6854.629139: do_IRQ <-ret_from_intr
            bash-14721 [000] d.....1  6854.629139: irq_enter <-do_IRQ
... # wakes up the can1 ISR thread on 001 and the can0 one on 000
(same physical IRQ line)
            bash-14721 [000] d...3.1  6854.629261: sched_switch: prev_comm=bash prev_pid=14721 prev_prio=120 prev_state=R+ ==> next_comm=irq/18-can0 next_pid=2015 next_prio=28
... # runs the can0 ISR
     irq/18-can0-2015  [000] d...3..  6854.629283: sched_switch: prev_comm=irq/18-can0 prev_pid=2015 prev_prio=28 prev_state=S ==> next_comm=ksoftirqd/0 next_pid=3 next_prio=98
...
     ksoftirqd/0-3     [000] ....1.1  6854.629291: softirq_entry: vec=1 [action=TIMER]
...
     ksoftirqd/0-3     [000] ....1.1  6854.629293: softirq_exit: vec=1 [action=TIMER]
...
     ksoftirqd/0-3     [000] .......  6854.629298: schedule <-smpboot_thread_fn
...
     ksoftirqd/0-3     [000] d...3..  6854.629304: sched_switch: prev_comm=ksoftirqd/0 prev_pid=3 prev_prio=98 prev_state=S ==> next_comm=bash next_pid=14721 next_prio=28
...
            bash-14721 [000] d...1.1  6854.629308: smp_trace_reschedule_interrupt <-trace_reschedule_interrupt
# Actually unnecessary schedule IPI from 001?
            bash-14721 [000] d...1.1  6854.629309: irq_enter <-smp_trace_reschedule_interrupt
...
            bash-14721 [000] ....1.1  6854.629316: __tick_nohz_task_switch <-__schedule
            bash-14721 [000] ......1  6854.629316: __mod_zone_page_state <-free_pcppages_bulk
            bash-14721 [000] ....1.1  6854.629317: mm_page_pcpu_drain: page=ffffea000f57a900 pfn=4021924 order=0 migratetype=0
            bash-14721 [000] ......1  6854.629317: get_pfnblock_flags_mask <-free_pcppages_bulk
            bash-14721 [000] ......1  6854.629317: __mod_zone_page_state <-free_pcppages_bulk
... # more of this like it was doing before
I don't see it unlocking the problematic mutex before the trace ends.

The relevant part for 001 starts while it's running the can1 ISR.
     irq/18-can1-7228  [001] ....1.1  6854.629275: netif_rx: dev=can1 skbaddr=ffff880412d8fc00 len=16
     irq/18-can1-7228  [001] ......1  6854.629275: migrate_disable <-netif_rx_internal
     irq/18-can1-7228  [001] ......2  6854.629275: enqueue_to_backlog <-netif_rx_internal
     irq/18-can1-7228  [001] d.....2  6854.629276: _raw_spin_lock <-enqueue_to_backlog
     irq/18-can1-7228  [001] d...1.2  6854.629276: __raise_softirq_irqoff <-enqueue_to_backlog
     irq/18-can1-7228  [001] d...1.2  6854.629276: do_raise_softirq_irqoff <-__raise_softirq_irqoff
     irq/18-can1-7228  [001] d...2.2  6854.629276: softirq_raise: vec=3 [action=NET_RX]
... # continues handling the can1 interrupt
     irq/18-can1-7228  [001] ......6  6854.629286: rt_spin_lock <-get_page_from_freelist
     irq/18-can1-7228  [001] ......6  6854.629287: rt_spin_lock_slowlock <-get_page_from_freelist
     irq/18-can1-7228  [001] ......6  6854.629287: _raw_spin_lock <-rt_spin_lock_slowlock
     irq/18-can1-7228  [001] ....1.6  6854.629287: __try_to_take_rt_mutex <-rt_spin_lock_slowlock
     irq/18-can1-7228  [001] ....1.6  6854.629287: _raw_spin_lock_irq <-rt_spin_lock_slowlock
     irq/18-can1-7228  [001] d...2.6  6854.629288: _raw_spin_unlock_irq <-rt_spin_lock_slowlock
     irq/18-can1-7228  [001] ....1.6  6854.629288: task_blocks_on_rt_mutex <-rt_spin_lock_slowlock
... # rt_mutex/scheduling stuff
     irq/18-can1-7228  [001] d...4.6  6854.629291: sched_pi_setprio: comm=bash pid=14721 oldprio=120 newprio=28
... # more scheduler stuff
     irq/18-can1-7228  [001] d...3.6  6854.629299: native_smp_send_reschedule <-rt_mutex_setprio
... # more scheduler stuff
     irq/18-can1-7228  [001] d...2.6  6854.629307: pick_next_task_fair <-__schedule
     irq/18-can1-7228  [001] d...2.6  6854.629307: pick_next_task_stop <-__schedule
     irq/18-can1-7228  [001] d...2.6  6854.629307: pick_next_task_dl <-__schedule
     irq/18-can1-7228  [001] d...2.6  6854.629307: pick_next_task_rt <-__schedule
     irq/18-can1-7228  [001] d...2.6  6854.629307: pick_next_task_fair <-__schedule
     irq/18-can1-7228  [001] d...2.6  6854.629308: pick_next_task_idle <-__schedule
     irq/18-can1-7228  [001] d...3.6  6854.629308: sched_switch: prev_comm=irq/18-can1 prev_pid=7228 prev_prio=28 prev_state=D ==> next_comm=swapper/1 next_pid=0 next_prio=120
...
          <idle>-0     [001] d...1..  6854.629319: softirq_check_pending_idle <-tick_nohz_idle_enter
My tracing_off() call is in softirq_check_pending_idle, so that's it.

Thanks,
Brian Silverman


* Re: CONFIG_PREEMPT_RT local_softirq_pending warning when ISR blocks
  2015-03-05  5:16 CONFIG_PREEMPT_RT local_softirq_pending warning when ISR blocks Brian Silverman
@ 2015-03-09 16:08 ` Sebastian Andrzej Siewior
  2015-03-10  0:36   ` Brian Silverman
  0 siblings, 1 reply; 4+ messages in thread
From: Sebastian Andrzej Siewior @ 2015-03-09 16:08 UTC (permalink / raw)
  To: Brian Silverman
  Cc: linux-kernel, linux-rt-users, Austin Schuh, Thomas Gleixner, Ingo Molnar

* Brian Silverman | 2015-03-05 00:16:20 [-0500]:

>Beforehand, 000 is spending most of its time in interrupts, but bash
>is doing something related to memory management on it in between.
>            bash-14721 [000] ......1  6854.629126: rt_spin_lock <-free_pcppages_bulk
>            bash-14721 [000] ....1.1  6854.629126: mm_page_pcpu_drain: page=ffffea000fa1aa40 pfn=4097705 order=0 migratetype=1
>            bash-14721 [000] ......1  6854.629127: get_pfnblock_flags_mask <-free_pcppages_bulk
>            bash-14721 [000] ......1  6854.629127: __mod_zone_page_state <-free_pcppages_bulk
>            bash-14721 [000] ....1.1  6854.629127: mm_page_pcpu_drain: page=ffffea000f572ac0 pfn=4021419 order=0 migratetype=0
>            bash-14721 [000] ......1  6854.629128: get_pfnblock_flags_mask <-free_pcppages_bulk
>            bash-14721 [000] ......1  6854.629128: __mod_zone_page_state <-free_pcppages_bulk
>... # lots more virtually identical repetitions of those last 3 lines
>            bash-14721 [000] ....1.1  6854.629139: mm_page_pcpu_drain: page=ffffea000f481a80 pfn=4005994 order=0 migratetype=1
>            bash-14721 [000] ......1  6854.629139: get_pfnblock_flags_mask <-free_pcppages_bulk
You free memory and hold the zone->lock

>            bash-14721 [000] d.....1  6854.629139: do_IRQ <-ret_from_intr
>            bash-14721 [000] d.....1  6854.629139: irq_enter <-do_IRQ
>... # wakes up the can1 ISR thread on 001 and the can0 one on 000
>(same physical IRQ line)
>            bash-14721 [000] d...3.1  6854.629261: sched_switch: prev_comm=bash prev_pid=14721 prev_prio=120 prev_state=R+ ==> next_comm=irq/18-can0 next_pid=2015 next_prio=28

I would assume that this one raises the NET_RX softirq. But at the bottom
you have the irq handler on the other CPU, which confuses me…

>... # runs the can0 ISR
>     irq/18-can0-2015  [000] d...3..  6854.629283: sched_switch: prev_comm=irq/18-can0 prev_pid=2015 prev_prio=28 prev_state=S ==> next_comm=ksoftirqd/0 next_pid=3 next_prio=98
>...
>     ksoftirqd/0-3     [000] ....1.1  6854.629291: softirq_entry: vec=1 [action=TIMER]
>...
>     ksoftirqd/0-3     [000] ....1.1  6854.629293: softirq_exit: vec=1 [action=TIMER]
Only the timer runs here, since nobody raised NET_RX.

>...
>     ksoftirqd/0-3     [000] .......  6854.629298: schedule <-smpboot_thread_fn ...
>     ksoftirqd/0-3     [000] d...3..  6854.629304: sched_switch: prev_comm=ksoftirqd/0 prev_pid=3 prev_prio=98 prev_state=S ==> next_comm=bash next_pid=14721 next_prio=28
>...
>            bash-14721 [000] d...1.1  6854.629308: smp_trace_reschedule_interrupt <-trace_reschedule_interrupt
># Actually unnecessary schedule IPI from 001?
>            bash-14721 [000] d...1.1  6854.629309: irq_enter <-smp_trace_reschedule_interrupt
>...
>            bash-14721 [000] ....1.1  6854.629316: __tick_nohz_task_switch <-__schedule
>            bash-14721 [000] ......1  6854.629316: __mod_zone_page_state <-free_pcppages_bulk
>            bash-14721 [000] ....1.1  6854.629317: mm_page_pcpu_drain: page=ffffea000f57a900 pfn=4021924 order=0 migratetype=0
>            bash-14721 [000] ......1  6854.629317: get_pfnblock_flags_mask <-free_pcppages_bulk
>            bash-14721 [000] ......1  6854.629317: __mod_zone_page_state <-free_pcppages_bulk
and it continues cleaning up memory.

>... # more of this like it was doing before
>I don't see it unlocking the problematic mutex before the trace ends.
>
>The relevant part for 001 starts while it's running the can1 ISR.
>     irq/18-can1-7228  [001] ....1.1  6854.629275: netif_rx: dev=can1 skbaddr=ffff880412d8fc00 len=16
>     irq/18-can1-7228  [001] ......1  6854.629275: migrate_disable <-netif_rx_internal
>     irq/18-can1-7228  [001] ......2  6854.629275: enqueue_to_backlog <-netif_rx_internal

enqueue_to_backlog() looks like packet reception, so this should be
handled in NAPI, so I assume we run in the NET_RX softirq.

>     irq/18-can1-7228  [001] d.....2  6854.629276: _raw_spin_lock <-enqueue_to_backlog
>     irq/18-can1-7228  [001] d...1.2  6854.629276: __raise_softirq_irqoff <-enqueue_to_backlog
>     irq/18-can1-7228  [001] d...1.2  6854.629276: do_raise_softirq_irqoff <-__raise_softirq_irqoff
>     irq/18-can1-7228  [001] d...2.2  6854.629276: softirq_raise: vec=3 [action=NET_RX]
>... # continues handling the can1 interrupt
>     irq/18-can1-7228  [001] ......6  6854.629286: rt_spin_lock <-get_page_from_freelist
>     irq/18-can1-7228  [001] ......6  6854.629287: rt_spin_lock_slowlock <-get_page_from_freelist
>     irq/18-can1-7228  [001] ......6  6854.629287: _raw_spin_lock <-rt_spin_lock_slowlock
>     irq/18-can1-7228  [001] ....1.6  6854.629287: __try_to_take_rt_mutex <-rt_spin_lock_slowlock
>     irq/18-can1-7228  [001] ....1.6  6854.629287: _raw_spin_lock_irq <-rt_spin_lock_slowlock
>     irq/18-can1-7228  [001] d...2.6  6854.629288: _raw_spin_unlock_irq <-rt_spin_lock_slowlock
>     irq/18-can1-7228  [001] ....1.6  6854.629288: task_blocks_on_rt_mutex <-rt_spin_lock_slowlock

It might be zone->lock it goes after. It boosts the bash process, which
seems to be freeing memory, so that would make sense.

>... # rt_mutex/scheduling stuff
>     irq/18-can1-7228  [001] d...4.6  6854.629291: sched_pi_setprio: comm=bash pid=14721 oldprio=120 newprio=28
>... # more scheduler stuff
>     irq/18-can1-7228  [001] d...3.6  6854.629299: native_smp_send_reschedule <-rt_mutex_setprio
>... # more scheduler stuff
>     irq/18-can1-7228  [001] d...2.6  6854.629307: pick_next_task_fair <-__schedule
>     irq/18-can1-7228  [001] d...2.6  6854.629307: pick_next_task_stop <-__schedule
>     irq/18-can1-7228  [001] d...2.6  6854.629307: pick_next_task_dl <-__schedule
>     irq/18-can1-7228  [001] d...2.6  6854.629307: pick_next_task_rt <-__schedule
>     irq/18-can1-7228  [001] d...2.6  6854.629307: pick_next_task_fair <-__schedule
>     irq/18-can1-7228  [001] d...2.6  6854.629308: pick_next_task_idle <-__schedule
>     irq/18-can1-7228  [001] d...3.6  6854.629308: sched_switch: prev_comm=irq/18-can1 prev_pid=7228 prev_prio=28 prev_state=D ==>
>next_comm=swapper/1 next_pid=0 next_prio=120
>...
>          <idle>-0     [001] d...1..  6854.629319: softirq_check_pending_idle <-tick_nohz_idle_enter
>My tracing_off() call is in softirq_check_pending_idle, so that's it.

It looks like your softirq for net_rx is getting a packet and then after
raising NET_RX (again?) it blocks on a lock. In order to get this lock
it boosts and schedules bash. It gets runnable, but on the other CPU. On
CPU1 there is nothing else going on and the only runnable task is the
idle thread. And this is probably where the warning is written, because
we go to idle while we should process a softirq instead.
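
Roughly, the check works like this (a simplified sketch from memory, not
the actual -rt code; softirq_runners/runner[] stand in for the per-CPU
bookkeeping of who is currently responsible for each vector):

/*
 * Simplified sketch of the check that prints the message, as called
 * from tick_nohz_idle_enter() on the way into idle.
 */
void softirq_check_pending_idle(void)
{
	struct softirq_runner *sr = this_cpu_ptr(&softirq_runners);
	u32 warnpending = local_softirq_pending();
	int i;

	for (i = 0; i < NR_SOFTIRQS; i++) {
		struct task_struct *tsk = sr->runner[i];

		/*
		 * A runner task that is blocked on a lock or already
		 * runnable will get to its pending vectors eventually,
		 * so do not warn about those.
		 */
		if (tsk && (tsk->pi_blocked_on || tsk->state == TASK_RUNNING))
			warnpending &= ~(1 << i);
	}

	if (warnpending)
		printk(KERN_ERR "NOHZ: local_softirq_pending %02x\n",
		       warnpending);
}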

>Thanks,
>Brian Silverman

Sebastian


* Re: CONFIG_PREEMPT_RT local_softirq_pending warning when ISR blocks
  2015-03-09 16:08 ` Sebastian Andrzej Siewior
@ 2015-03-10  0:36   ` Brian Silverman
  2015-03-13 16:38     ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 4+ messages in thread
From: Brian Silverman @ 2015-03-10  0:36 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: linux-kernel, linux-rt-users, Austin Schuh, Thomas Gleixner, Ingo Molnar

On Mon, Mar 9, 2015 at 12:08 PM, Sebastian Andrzej Siewior
<bigeasy@linutronix.de> wrote:
> * Brian Silverman | 2015-03-05 00:16:20 [-0500]:
>
>>Beforehand, 000 is spending most of its time in interrupts, but bash
>>is doing something related to memory management on it in between.
>>            bash-14721 [000] ......1  6854.629126: rt_spin_lock <-free_pcppages_bulk
>>            bash-14721 [000] ....1.1  6854.629126: mm_page_pcpu_drain: page=ffffea000fa1aa40 pfn=4097705 order=0 migratetype=1
>>            bash-14721 [000] ......1  6854.629127: get_pfnblock_flags_mask <-free_pcppages_bulk
>>            bash-14721 [000] ......1  6854.629127: __mod_zone_page_state <-free_pcppages_bulk
>>            bash-14721 [000] ....1.1  6854.629127: mm_page_pcpu_drain: page=ffffea000f572ac0 pfn=4021419 order=0 migratetype=0
>>            bash-14721 [000] ......1  6854.629128: get_pfnblock_flags_mask <-free_pcppages_bulk
>>            bash-14721 [000] ......1  6854.629128: __mod_zone_page_state <-free_pcppages_bulk
>>... # lots more virtually identical repetitions of those last 3 lines
>>            bash-14721 [000] ....1.1  6854.629139: mm_page_pcpu_drain: page=ffffea000f481a80 pfn=4005994 order=0 migratetype=1
>>            bash-14721 [000] ......1  6854.629139: get_pfnblock_flags_mask <-free_pcppages_bulk
> You free memory and hold the zone->lock
>
>>            bash-14721 [000] d.....1  6854.629139: do_IRQ <-ret_from_intr
>>            bash-14721 [000] d.....1  6854.629139: irq_enter <-do_IRQ
>>... # wakes up the can1 ISR thread on 001 and the can0 one on 000
>>(same physical IRQ line)
>>            bash-14721 [000] d...3.1  6854.629261: sched_switch: prev_comm=bash prev_pid=14721 prev_prio=120 prev_state=R+ ==> next_comm=irq/18-can0 next_pid=2015 next_prio=28
>
> I would assume that this one raises NET_RX softirq. But at the bottom
> you have the irq handler on the other CPU which confuses me…

There wasn't actually any traffic on can0 for this test, so it didn't.
The can0 ISR only makes a few reads/writes to the device, never
calling netif_rx.

The can1 handler (which actually raises a NET_RX softirq) runs on 001
because it's pinned there.

>
>>... # runs the can0 ISR
>>     irq/18-can0-2015  [000] d...3..  6854.629283: sched_switch: prev_comm=irq/18-can0 prev_pid=2015 prev_prio=28 prev_state=S ==> next_comm=ksoftirqd/0 next_pid=3 next_prio=98
>>...
>>     ksoftirqd/0-3     [000] ....1.1  6854.629291: softirq_entry: vec=1 [action=TIMER]
>>...
>>     ksoftirqd/0-3     [000] ....1.1  6854.629293: softirq_exit: vec=1 [action=TIMER]
> only the timer since nobody raised NET_RX

Correct. I included that as context for what 000 spent its time doing
with the lock held, not to imply that anything there is incorrect.

>
>>...
>>     ksoftirqd/0-3     [000] .......  6854.629298: schedule <-smpboot_thread_fn ...
>>     ksoftirqd/0-3     [000] d...3..  6854.629304: sched_switch: prev_comm=ksoftirqd/0 prev_pid=3 prev_prio=98 prev_state=S ==> next_comm=bash next_pid=14721 next_prio=28
>>...
>>            bash-14721 [000] d...1.1  6854.629308: smp_trace_reschedule_interrupt <-trace_reschedule_interrupt
>># Actually unnecessary schedule IPI from 001?
>>            bash-14721 [000] d...1.1  6854.629309: irq_enter <-smp_trace_reschedule_interrupt
>>...
>>            bash-14721 [000] ....1.1  6854.629316: __tick_nohz_task_switch <-__schedule
>>            bash-14721 [000] ......1  6854.629316: __mod_zone_page_state <-free_pcppages_bulk
>>            bash-14721 [000] ....1.1  6854.629317: mm_page_pcpu_drain: page=ffffea000f57a900 pfn=4021924 order=0 migratetype=0
>>            bash-14721 [000] ......1  6854.629317: get_pfnblock_flags_mask <-free_pcppages_bulk
>>            bash-14721 [000] ......1  6854.629317: __mod_zone_page_state <-free_pcppages_bulk
> and it continues cleaning up memory.
>
>>... # more of this like it was doing before
>>I don't see it unlocking the problematic mutex before the trace ends.
>>
>>The relevant part for 001 starts while it's running the can1 ISR.
>>     irq/18-can1-7228  [001] ....1.1  6854.629275: netif_rx: dev=can1 skbaddr=ffff880412d8fc00 len=16
>>     irq/18-can1-7228  [001] ......1  6854.629275: migrate_disable <-netif_rx_internal
>>     irq/18-can1-7228  [001] ......2  6854.629275: enqueue_to_backlog <-netif_rx_internal
>
> enqueue_to_backlog() looks like packet reception so this should be
> handled in napi so I assume we run in NET_RX softirq

enqueue_to_backlog is the napi function that actually raises the NET_RX softirq.
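
For reference, this is roughly the path involved (paraphrased from
3.14-era net/core/dev.c, not an exact quote): enqueue_to_backlog() queues
the skb on the per-CPU backlog and schedules the backlog NAPI instance
via ____napi_schedule(), which is what raises NET_RX:

static inline void ____napi_schedule(struct softnet_data *sd,
				     struct napi_struct *napi)
{
	/* add the backlog napi to this CPU's poll list ... */
	list_add_tail(&napi->poll_list, &sd->poll_list);
	/* ... and raise NET_RX so net_rx_action() will poll it later */
	__raise_softirq_irqoff(NET_RX_SOFTIRQ);
}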

>>     irq/18-can1-7228  [001] d.....2  6854.629276: _raw_spin_lock <-enqueue_to_backlog
>>     irq/18-can1-7228  [001] d...1.2  6854.629276: __raise_softirq_irqoff <-enqueue_to_backlog
>>     irq/18-can1-7228  [001] d...1.2  6854.629276: do_raise_softirq_irqoff <-__raise_softirq_irqoff
>>     irq/18-can1-7228  [001] d...2.2  6854.629276: softirq_raise: vec=3 [action=NET_RX]
This is where the softirq gets raised.

>>... # continues handling the can1 interrupt
>>     irq/18-can1-7228  [001] ......6  6854.629286: rt_spin_lock <-get_page_from_freelist
>>     irq/18-can1-7228  [001] ......6  6854.629287: rt_spin_lock_slowlock <-get_page_from_freelist
>>     irq/18-can1-7228  [001] ......6  6854.629287: _raw_spin_lock <-rt_spin_lock_slowlock
>>     irq/18-can1-7228  [001] ....1.6  6854.629287: __try_to_take_rt_mutex <-rt_spin_lock_slowlock
>>     irq/18-can1-7228  [001] ....1.6  6854.629287: _raw_spin_lock_irq <-rt_spin_lock_slowlock
>>     irq/18-can1-7228  [001] d...2.6  6854.629288: _raw_spin_unlock_irq <-rt_spin_lock_slowlock
>>     irq/18-can1-7228  [001] ....1.6  6854.629288: task_blocks_on_rt_mutex <-rt_spin_lock_slowlock
>
> it might be zone->lock it goes after. It boosts the bash process which
> seems to free memory so it would make sense.
>
>>... # rt_mutex/scheduling stuff
>>     irq/18-can1-7228  [001] d...4.6  6854.629291: sched_pi_setprio: comm=bash pid=14721 oldprio=120 newprio=28
>>... # more scheduler stuff
>>     irq/18-can1-7228  [001] d...3.6  6854.629299: native_smp_send_reschedule <-rt_mutex_setprio
>>... # more scheduler stuff
>>     irq/18-can1-7228  [001] d...2.6  6854.629307: pick_next_task_fair <-__schedule
>>     irq/18-can1-7228  [001] d...2.6  6854.629307: pick_next_task_stop <-__schedule
>>     irq/18-can1-7228  [001] d...2.6  6854.629307: pick_next_task_dl <-__schedule
>>     irq/18-can1-7228  [001] d...2.6  6854.629307: pick_next_task_rt <-__schedule
>>     irq/18-can1-7228  [001] d...2.6  6854.629307: pick_next_task_fair <-__schedule
>>     irq/18-can1-7228  [001] d...2.6  6854.629308: pick_next_task_idle <-__schedule
>>     irq/18-can1-7228  [001] d...3.6  6854.629308: sched_switch: prev_comm=irq/18-can1 prev_pid=7228 prev_prio=28 prev_state=D ==>
>>next_comm=swapper/1 next_pid=0 next_prio=120
>>...
>>          <idle>-0     [001] d...1..  6854.629319: softirq_check_pending_idle <-tick_nohz_idle_enter
>>My tracing_off() call is in softirq_check_pending_idle, so that's it.
>
> It looks like your softirq for net_rx is getting a packet and then after
> raising NET_RX (again?) it blocks on a lock. In order to get this lock
> it boosts and schedules bash. It gets runnable, but on the other CPU. On
> CPU1 there is nothing else going on and the only runnable task is the
> idle thread. And this is probably where the warning is written, because
> we go to idle while we should process a softirq instead.

That sounds like the issue. Running the softirq instead of going idle in
this situation seems like it would mean calling thread_do_softirq() from
__schedule, but I don't know where the right place is. Can anybody
give me some help on where exactly to check for pending softirqs from?

Thanks,
Brian


* Re: CONFIG_PREEMPT_RT local_softirq_pending warning when ISR blocks
  2015-03-10  0:36   ` Brian Silverman
@ 2015-03-13 16:38     ` Sebastian Andrzej Siewior
  0 siblings, 0 replies; 4+ messages in thread
From: Sebastian Andrzej Siewior @ 2015-03-13 16:38 UTC (permalink / raw)
  To: Brian Silverman
  Cc: linux-kernel, linux-rt-users, Austin Schuh, Thomas Gleixner, Ingo Molnar

* Brian Silverman | 2015-03-09 20:36:27 [-0400]:

>> It looks like your softirq for net_rx is getting a packet and then after
>> raising NET_RX (again?) it blocks on a lock. In order to get this lock
>> it boosts and schedules bash. It gets runnable, but on the other CPU. On
>> CPU1 there is nothing else going on and the only runnable task is the
>> idle thread. And this is probably where the warning is written, because
>> we go to idle while we should process a softirq instead.
>
>That sounds like the issue. Doing the softirq instead of going idle in
>this situation seems like it means calling thread_do_softirq() from
>__schedule, but I don't know where the right place is. Can anybody
>give me some help on where exactly to check for softirqs from?

I was slightly wrong.

>     irq/18-can1-7228  [001] d.....2  6854.629276: _raw_spin_lock <-enqueue_to_backlog
>     irq/18-can1-7228  [001] d...1.2  6854.629276: __raise_softirq_irqoff <-enqueue_to_backlog
>     irq/18-can1-7228  [001] d...1.2  6854.629276: do_raise_softirq_irqoff <-__raise_softirq_irqoff
>     irq/18-can1-7228  [001] d...2.2  6854.629276: softirq_raise: vec=3 [action=NET_RX]
>... # continues handling the can1 interrupt
>     irq/18-can1-7228  [001] ......6  6854.629286: rt_spin_lock <-get_page_from_freelist
>     irq/18-can1-7228  [001] ......6  6854.629287: rt_spin_lock_slowlock <-get_page_from_freelist
>     irq/18-can1-7228  [001] ......6  6854.629287: _raw_spin_lock <-rt_spin_lock_slowlock
>     irq/18-can1-7228  [001] ....1.6  6854.629287: __try_to_take_rt_mutex <-rt_spin_lock_slowlock
>     irq/18-can1-7228  [001] ....1.6  6854.629287: _raw_spin_lock_irq <-rt_spin_lock_slowlock
>     irq/18-can1-7228  [001] d...2.6  6854.629288: _raw_spin_unlock_irq <-rt_spin_lock_slowlock
>     irq/18-can1-7228  [001] ....1.6  6854.629288: task_blocks_on_rt_mutex <-rt_spin_lock_slowlock
You raised the softirq but you did not wake up ksoftirqd. It is expected
that you process the softirq(s) in irq-thread context once you leave the
interrupt handler (that is, on local_bh_enable(), because the thread runs
with BH off).
But this did not happen yet. While you were in your interrupt thread you
got blocked on a lock. And since your CPU is otherwise idle, the
scheduler puts the idle task on.
softirq_check_pending_idle() has a check for this kind of thing, but
only if ksoftirqd itself got blocked. In your case it is a process
with BH switched off.
You wouldn't see the warning if you started a task in userland that just
loops and keeps the CPU busy :)

So, one thing I noticed from looking at the code is that if an irq thread
is marked IRQF_NO_SOFTIRQ_CALL then it won't process softirqs at all.
This is a bug, but since nobody uses IRQF_NO_SOFTIRQ_CALL nobody has
noticed it so far (or nobody uses it because it is not working).
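
For context, the forced irq-thread path looks roughly like this
(paraphrased from kernel/irq/manage.c in the -rt tree, not verbatim);
with IRQF_NO_SOFTIRQ_CALL we take the _local_bh_enable() branch, which
only drops the BH count and does not run the pending softirqs:

static irqreturn_t irq_forced_thread_fn(struct irq_desc *desc,
					struct irqaction *action)
{
	irqreturn_t ret;

	/* the threaded handler runs with BH disabled ... */
	local_bh_disable();
	ret = action->thread_fn(action->irq, action->dev_id);
	irq_finalize_oneshot(desc, action);
	/*
	 * ... and any softirq the handler raised is normally processed
	 * here, on local_bh_enable(), when the handler returns.  With
	 * IRQF_NO_SOFTIRQ_CALL we only undo the BH count and skip the
	 * softirq processing.
	 */
	if (irq_settings_no_softirq_call(desc))
		_local_bh_enable();
	else
		local_bh_enable();
	return ret;
}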

As for your case: I'm not sure yet what the best thing to do here is.
We could
- teach softirq_check_pending_idle() to ignore these things because
  once the irq thread is unblocked, it will process the softirq.
- utilize the otherwise idle CPU and schedule ksoftirq with the proper
  mask.

Neither is that easy, I think…

>Thanks,
>Brian

Sebastian

