From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752633AbbCJAhG (ORCPT ); Mon, 9 Mar 2015 20:37:06 -0400 Received: from mail-qg0-f42.google.com ([209.85.192.42]:41934 "EHLO mail-qg0-f42.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751709AbbCJAg6 convert rfc822-to-8bit (ORCPT ); Mon, 9 Mar 2015 20:36:58 -0400 MIME-Version: 1.0 In-Reply-To: <20150309160845.GE13768@linutronix.de> References: <20150309160845.GE13768@linutronix.de> From: Brian Silverman Date: Mon, 9 Mar 2015 20:36:27 -0400 Message-ID: Subject: Re: CONFIG_PREEMPT_RT local_softirq_pending warning when ISR blocks To: Sebastian Andrzej Siewior Cc: linux-kernel@vger.kernel.org, linux-rt-users@vger.kernel.org, Austin Schuh , Thomas Gleixner , Ingo Molnar Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8BIT Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, Mar 9, 2015 at 12:08 PM, Sebastian Andrzej Siewior wrote: > * Brian Silverman | 2015-03-05 00:16:20 [-0500]: > >>Beforehand, 000 is spending most of its time in interrupts, but bash >>is doing something related to memory management on it in between. >> bash-14721 [000] ......1 6854.629126: rt_spin_lock <-free_pcppages_bulk >> bash-14721 [000] ....1.1 6854.629126: mm_page_pcpu_drain: page=ffffea000fa1aa40 pfn=4097705 order=0 migratetype=1 >> bash-14721 [000] ......1 6854.629127: get_pfnblock_flags_mask <-free_pcppages_bulk >> bash-14721 [000] ......1 6854.629127: __mod_zone_page_state <-free_pcppages_bulk >> bash-14721 [000] ....1.1 6854.629127: mm_page_pcpu_drain: page=ffffea000f572ac0 pfn=4021419 order=0 migratetype=0 >> bash-14721 [000] ......1 6854.629128: get_pfnblock_flags_mask <-free_pcppages_bulk >> bash-14721 [000] ......1 6854.629128: __mod_zone_page_state <-free_pcppages_bulk >>... # lots more virtually identical repetitions of those last 3 lines >> bash-14721 [000] ....1.1 6854.629139: mm_page_pcpu_drain: page=ffffea000f481a80 pfn=4005994 order=0 migratetype=1 >> bash-14721 [000] ......1 6854.629139: get_pfnblock_flags_mask <-free_pcppages_bulk > You free memory and hold the zone->lock > >> bash-14721 [000] d.....1 6854.629139: do_IRQ <-ret_from_intr >> bash-14721 [000] d.....1 6854.629139: irq_enter <-do_IRQ >>... # wakes up the can1 ISR thread on 001 and the can0 one on 000 >>(same physical IRQ line) >> bash-14721 [000] d...3.1 6854.629261: sched_switch: prev_comm=bash prev_pid=14721 prev_prio=120 prev_state=R+ ==> next_comm=irq/18-can0 next_pid=2015 next_prio=28 > > I would assume that this one raises NET_RX softirq. But at the bottom > you have the irq handler on the other CPU which confuses me… There wasn't actually any traffic on can0 for this test, so it didn't. The can0 ISR only makes a few reads/writes to the device, never calling netif_rx. The can1 handler (which actually raises a NET_RX softirq) runs on 001 because it's pinned there. > >>... # runs the can0 ISR >> irq/18-can0-2015 [000] d...3.. 6854.629283: sched_switch: prev_comm=irq/18-can0 prev_pid=2015 prev_prio=28 prev_state=S ==> next_comm=ksoftirqd/0 next_pid=3 next_prio=98 >>... >> ksoftirqd/0-3 [000] ....1.1 6854.629291: softirq_entry: vec=1 [action=TIMER] >>... >> ksoftirqd/0-3 [000] ....1.1 6854.629293: softirq_exit: vec=1 [action=TIMER] > only the timer since nobody raised NET_RX Correct. I included that as context for what 000 spent its time doing with the lock held, not implying that it's incorrect. > >>... >> ksoftirqd/0-3 [000] ....... 6854.629298: schedule <-smpboot_thread_fn ... >> ksoftirqd/0-3 [000] d...3.. 6854.629304: sched_switch: prev_comm=ksoftirqd/0 prev_pid=3 prev_prio=98 prev_state=S ==> next_comm=bash next_pid=14721 next_prio=28 >>... >> bash-14721 [000] d...1.1 6854.629308: smp_trace_reschedule_interrupt <-trace_reschedule_interrupt >># Actually unnecessary schedule IPI from 001? >> bash-14721 [000] d...1.1 6854.629309: irq_enter <-smp_trace_reschedule_interrupt >>... >> bash-14721 [000] ....1.1 6854.629316: __tick_nohz_task_switch <-__schedule >> bash-14721 [000] ......1 6854.629316: __mod_zone_page_state <-free_pcppages_bulk >> bash-14721 [000] ....1.1 6854.629317: mm_page_pcpu_drain: page=ffffea000f57a900 pfn=4021924 order=0 migratetype=0 >> bash-14721 [000] ......1 6854.629317: get_pfnblock_flags_mask <-free_pcppages_bulk >> bash-14721 [000] ......1 6854.629317: __mod_zone_page_state <-free_pcppages_bulk > and it continues cleaning up memory. > >>... # more of this like it was doing before >>I don't see it unlocking the problematic mutex before the trace ends. >> >>The relevant part for 001 starts while it's running the can1 ISR. >> irq/18-can1-7228 [001] ....1.1 6854.629275: netif_rx: dev=can1 skbaddr=ffff880412d8fc00 len=16 >> irq/18-can1-7228 [001] ......1 6854.629275: migrate_disable <-netif_rx_internal >> irq/18-can1-7228 [001] ......2 6854.629275: enqueue_to_backlog <-netif_rx_internal > > enqueue_to_backlog() looks like packet reception so this should be > handled in napi so I assume we run in NET_RX softirq enqueue_to_backlog is the napi function that actually raises the NET_RX softirq. >> irq/18-can1-7228 [001] d.....2 6854.629276: _raw_spin_lock <-enqueue_to_backlog >> irq/18-can1-7228 [001] d...1.2 6854.629276: __raise_softirq_irqoff <-enqueue_to_backlog >> irq/18-can1-7228 [001] d...1.2 6854.629276: do_raise_softirq_irqoff <-__raise_softirq_irqoff >> irq/18-can1-7228 [001] d...2.2 6854.629276: softirq_raise: vec=3 [action=NET_RX] This is where the softirq gets raised. >>... # continues handling the can1 interrupt >> irq/18-can1-7228 [001] ......6 6854.629286: rt_spin_lock <-get_page_from_freelist >> irq/18-can1-7228 [001] ......6 6854.629287: rt_spin_lock_slowlock <-get_page_from_freelist >> irq/18-can1-7228 [001] ......6 6854.629287: _raw_spin_lock <-rt_spin_lock_slowlock >> irq/18-can1-7228 [001] ....1.6 6854.629287: __try_to_take_rt_mutex <-rt_spin_lock_slowlock >> irq/18-can1-7228 [001] ....1.6 6854.629287: _raw_spin_lock_irq <-rt_spin_lock_slowlock >> irq/18-can1-7228 [001] d...2.6 6854.629288: _raw_spin_unlock_irq <-rt_spin_lock_slowlock >> irq/18-can1-7228 [001] ....1.6 6854.629288: task_blocks_on_rt_mutex <-rt_spin_lock_slowlock > > it might be zone->lock it goes after. It boosts the bash process which > seems to free memory so it would make sense. > >>... # rt_mutex/scheduling stuff >> irq/18-can1-7228 [001] d...4.6 6854.629291: sched_pi_setprio: comm=bash pid=14721 oldprio=120 newprio=28 >>... # more scheduler stuff >> irq/18-can1-7228 [001] d...3.6 6854.629299: native_smp_send_reschedule <-rt_mutex_setprio >>... # more scheduler stuff >> irq/18-can1-7228 [001] d...2.6 6854.629307: pick_next_task_fair <-__schedule >> irq/18-can1-7228 [001] d...2.6 6854.629307: pick_next_task_stop <-__schedule >> irq/18-can1-7228 [001] d...2.6 6854.629307: pick_next_task_dl <-__schedule >> irq/18-can1-7228 [001] d...2.6 6854.629307: pick_next_task_rt <-__schedule >> irq/18-can1-7228 [001] d...2.6 6854.629307: pick_next_task_fair <-__schedule >> irq/18-can1-7228 [001] d...2.6 6854.629308: pick_next_task_idle <-__schedule >> irq/18-can1-7228 [001] d...3.6 6854.629308: sched_switch: prev_comm=irq/18-can1 prev_pid=7228 prev_prio=28 prev_state=D ==> >>next_comm=swapper/1 next_pid=0 next_prio=120 >>... >> -0 [001] d...1.. 6854.629319: softirq_check_pending_idle <-tick_nohz_idle_enter >>My tracing_off() call is in softirq_check_pending_idle, so that's it. > > It looks like your softirq for net_rx is getting a packet and then after > raising NET_RX (again?) it blocks on a lock. In order to get this lock > it boosts and schedules bash. It gets runable but on the other CPU. On > CPU1 there is nothig going is nothing going and the only runable task is > the idle thread. And this is probably where the warning is written > because we go to idle while we should process a softirq instead. That sounds like the issue. Doing the softirq instead of going idle in this situation seems like it means calling thread_do_softirq() from __schedule, but I don't know where the right place is. Can anybody give me some help on where exactly to check for softirqs from? Thanks, Brian