From: Heiner Kallweit <hkallweit1@gmail.com>
To: Frederic Weisbecker <frederic@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>,
Anna-Maria Gleixner <anna-maria@linutronix.de>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: Fix 80d20d35af1e ("nohz: Fix local_timer_softirq_pending()") may have revealed another problem
Date: Wed, 16 Jan 2019 19:42:49 +0100 [thread overview]
Message-ID: <fb9c1fe7-dfea-44cb-1865-64f4962a07b5@gmail.com> (raw)
In-Reply-To: <20190116062408.GA26169@lenoir>
On 16.01.2019 07:24, Frederic Weisbecker wrote:
> On Fri, Dec 28, 2018 at 12:11:12AM +0100, Heiner Kallweit wrote:
>>
>> # tracer: nop
>> #
>> # _-----=> irqs-off
>> # / _----=> need-resched
>> # | / _---=> hardirq/softirq
>> # || / _--=> preempt-depth
>> # ||| / delay
>> # TASK-PID CPU# |||| TIMESTAMP FUNCTION
>> # | | | |||| | |
>> <idle>-0 [001] d.h2 1479.099092: softirq_raise: vec=1 [action=TIMER]
>> <idle>-0 [001] d.h2 1479.099098: softirq_raise: vec=9 [action=RCU]
>> <idle>-0 [001] d.h2 1479.099106: softirq_raise: vec=7 [action=SCHED]
>> <idle>-0 [001] ..s2 1479.099114: softirq_entry: vec=1 [action=TIMER]
>> <idle>-0 [001] ..s2 1479.099120: softirq_exit: vec=1 [action=TIMER]
>> <idle>-0 [001] ..s2 1479.099121: softirq_entry: vec=7 [action=SCHED]
>> <idle>-0 [001] ..s2 1479.099134: softirq_exit: vec=7 [action=SCHED]
>> <idle>-0 [001] ..s2 1479.099135: softirq_entry: vec=9 [action=RCU]
>> <idle>-0 [001] ..s2 1479.099141: softirq_exit: vec=9 [action=RCU]
>> <idle>-0 [001] d.h2 1479.100094: softirq_raise: vec=9 [action=RCU]
>> <idle>-0 [001] ..s2 1479.100109: softirq_entry: vec=9 [action=RCU]
>> <idle>-0 [001] ..s2 1479.100116: softirq_exit: vec=9 [action=RCU]
>> <idle>-0 [001] d.h2 1479.101091: softirq_raise: vec=1 [action=TIMER]
>> <idle>-0 [001] ..s2 1479.101113: softirq_entry: vec=1 [action=TIMER]
>> <idle>-0 [001] ..s2 1479.101118: softirq_exit: vec=1 [action=TIMER]
>> <idle>-0 [001] d.h2 1479.102094: softirq_raise: vec=9 [action=RCU]
>> <idle>-0 [001] ..s2 1479.102114: softirq_entry: vec=9 [action=RCU]
>> <idle>-0 [001] ..s2 1479.102121: softirq_exit: vec=9 [action=RCU]
>> <idle>-0 [001] d.h2 1479.103091: softirq_raise: vec=1 [action=TIMER]
>> <idle>-0 [001] d.h2 1479.103097: softirq_raise: vec=9 [action=RCU]
>> <idle>-0 [001] d.h2 1479.103105: softirq_raise: vec=7 [action=SCHED]
>> <idle>-0 [001] ..s2 1479.103114: softirq_entry: vec=1 [action=TIMER]
>> <idle>-0 [001] ..s2 1479.103118: softirq_exit: vec=1 [action=TIMER]
>> <idle>-0 [001] ..s2 1479.103119: softirq_entry: vec=7 [action=SCHED]
>> <idle>-0 [001] ..s2 1479.103131: softirq_exit: vec=7 [action=SCHED]
>> <idle>-0 [001] ..s2 1479.103132: softirq_entry: vec=9 [action=RCU]
>> <idle>-0 [001] ..s2 1479.103138: softirq_exit: vec=9 [action=RCU]
>> <idle>-0 [001] d.h2 1479.105092: softirq_raise: vec=1 [action=TIMER]
>> <idle>-0 [001] ..s2 1479.105115: softirq_entry: vec=1 [action=TIMER]
>> <idle>-0 [001] ..s2 1479.105119: softirq_exit: vec=1 [action=TIMER]
>> <idle>-0 [001] d.h2 1479.106092: softirq_raise: vec=9 [action=RCU]
>> <idle>-0 [001] ..s2 1479.106112: softirq_entry: vec=9 [action=RCU]
>> <idle>-0 [001] .Ns2 1479.106144: softirq_exit: vec=9 [action=RCU]
>> cpuhp/1-13 [001] d..2 1479.106279: timer_cancel: timer=0000000009a25653
>> <idle>-0 [001] d.h2 1479.106965: softirq_raise: vec=1 [action=TIMER]
>> <idle>-0 [001] d.h2 1479.106969: softirq_raise: vec=9 [action=RCU]
>> <idle>-0 [001] d.h2 1479.106974: softirq_raise: vec=7 [action=SCHED]
>> <idle>-0 [001] ..s2 1479.106981: softirq_entry: vec=1 [action=TIMER]
>> <idle>-0 [001] ..s2 1479.106984: softirq_exit: vec=1 [action=TIMER]
>> <idle>-0 [001] ..s2 1479.106985: softirq_entry: vec=7 [action=SCHED]
>> <idle>-0 [001] ..s2 1479.106994: softirq_exit: vec=7 [action=SCHED]
>> <idle>-0 [001] ..s2 1479.106995: softirq_entry: vec=9 [action=RCU]
>> <idle>-0 [001] ..s2 1479.106999: softirq_exit: vec=9 [action=RCU]
>> <idle>-0 [001] d.h2 1479.107996: softirq_raise: vec=1 [action=TIMER]
>> <idle>-0 [001] ..s2 1479.108010: softirq_entry: vec=1 [action=TIMER]
>> <idle>-0 [001] ..s2 1479.108014: softirq_exit: vec=1 [action=TIMER]
>> <idle>-0 [001] d.h2 1479.109009: softirq_raise: vec=1 [action=TIMER]
>> <idle>-0 [001] d.h2 1479.109013: softirq_raise: vec=9 [action=RCU]
>> <idle>-0 [001] ..s2 1479.109024: softirq_entry: vec=1 [action=TIMER]
>> <idle>-0 [001] ..s2 1479.109028: softirq_exit: vec=1 [action=TIMER]
>> <idle>-0 [001] ..s2 1479.109028: softirq_entry: vec=9 [action=RCU]
>> <idle>-0 [001] ..s2 1479.109033: softirq_exit: vec=9 [action=RCU]
>> <idle>-0 [001] d.h2 1479.110013: softirq_raise: vec=9 [action=RCU]
>> <idle>-0 [001] ..s2 1479.110033: softirq_entry: vec=9 [action=RCU]
>> <idle>-0 [001] ..s2 1479.110040: softirq_exit: vec=9 [action=RCU]
>> <idle>-0 [001] d.h2 1479.111011: softirq_raise: vec=1 [action=TIMER]
>> <idle>-0 [001] d.h2 1479.111017: softirq_raise: vec=9 [action=RCU]
>> <idle>-0 [001] d.h2 1479.111026: softirq_raise: vec=7 [action=SCHED]
>> <idle>-0 [001] ..s2 1479.111035: softirq_entry: vec=1 [action=TIMER]
>> <idle>-0 [001] ..s2 1479.111040: softirq_exit: vec=1 [action=TIMER]
>> <idle>-0 [001] ..s2 1479.111040: softirq_entry: vec=7 [action=SCHED]
>> <idle>-0 [001] ..s2 1479.111052: softirq_exit: vec=7 [action=SCHED]
>> <idle>-0 [001] ..s2 1479.111052: softirq_entry: vec=9 [action=RCU]
>> <idle>-0 [001] .Ns2 1479.111079: softirq_exit: vec=9 [action=RCU]
>> cpuhp/1-13 [001] dNh2 1479.112930: softirq_raise: vec=1 [action=TIMER]
>> cpuhp/1-13 [001] dNh2 1479.112935: softirq_raise: vec=9 [action=RCU]
>> <idle>-0 [001] d..1 1479.113077: <stack trace>
>> => can_stop_idle_tick.isra.14
>> => tick_nohz_get_sleep_length
>> => menu_select
>> => cpuidle_select
>> => do_idle
>> => cpu_startup_entry
>> => start_secondary
>> => secondary_startup_64
>> <idle>-0 [001] .Ns2 1479.113110: softirq_entry: vec=1 [action=TIMER]
>> <idle>-0 [001] .Ns2 1479.113114: softirq_exit: vec=1 [action=TIMER]
>> <idle>-0 [001] .Ns2 1479.113115: softirq_entry: vec=9 [action=RCU]
>> <idle>-0 [001] .Ns2 1479.113139: softirq_exit: vec=9 [action=RCU]
>
>
> So, I'm being very doubtful now actually.
>
> First of all, I realize my patch only fixes things halfway. Only softirq raised from
> hardirqs are going to be handled after ksoftirqd is parked. If it's raised outside
> interrupts, the softirq is going to be ignored. If other smpboot kthreads, following
> ksoftirqd parking, raise softirqs from their park() callback we should at least warn
> that we are ignoring that.
>
> Also, I'm still not sure if you're running threaded IRQs. Very likely not as
> the softirqs in your traces are executed in idle, which means from an IRQ interrupting
> idle. You haven't passed "threadirqs" in your kernel parameters, right?
> In any case we need to figure out why your softirqs raised at 1479.112930 and
> 1479.112935 don't get handled.
>
The only softirq I'm aware of is network NAPI processing. And no, I have no kernel
command line parameter "threadirqs" set.
> Finally I don't understand why you have so many timer softirqs and yet no timer traces
> at all, I turned them on in the script I gave you. So something weird is happening with
> the timers code. At best we are wasting lots of CPU cycles with those empty softirqs, at
> worst we are miscalculating some expiration somewhere, possibly ignoring a timer.
>
> So I'll need to cook up an updated debug patch for you to trigger if you don't
> mind.
>
I don't mind ..
> Thanks.
>
Heiner
next prev parent reply other threads:[~2019-01-16 18:43 UTC|newest]
Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-08-16 6:13 Fix 80d20d35af1e ("nohz: Fix local_timer_softirq_pending()") may have revealed another problem Heiner Kallweit
2018-08-18 11:26 ` Thomas Gleixner
2018-08-18 22:34 ` Heiner Kallweit
2018-08-24 4:12 ` Frederic Weisbecker
2018-08-24 5:59 ` Heiner Kallweit
2018-08-24 8:01 ` Thomas Gleixner
2018-08-24 14:30 ` Frederic Weisbecker
2018-08-24 17:06 ` Heiner Kallweit
2018-08-28 2:25 ` Frederic Weisbecker
2018-09-27 16:05 ` Thomas Gleixner
2018-09-28 13:18 ` Frederic Weisbecker
2018-09-28 20:35 ` Heiner Kallweit
2018-10-15 20:58 ` Heiner Kallweit
2018-12-24 21:11 ` Heiner Kallweit
2018-12-27 6:53 ` Frederic Weisbecker
2018-12-27 23:11 ` Heiner Kallweit
2018-12-28 1:31 ` Frederic Weisbecker
2018-12-28 6:34 ` Heiner Kallweit
2018-12-28 6:39 ` Heiner Kallweit
2019-01-09 22:20 ` Heiner Kallweit
2019-01-11 21:36 ` Frederic Weisbecker
2019-01-16 6:24 ` Frederic Weisbecker
2019-01-16 18:42 ` Heiner Kallweit [this message]
2019-01-24 19:37 ` Heiner Kallweit
2019-02-14 19:05 ` Heiner Kallweit
2019-02-14 21:47 ` Thomas Gleixner
2019-02-14 22:33 ` Heiner Kallweit
2019-02-15 0:31 ` Frederic Weisbecker
2019-02-16 9:14 ` Heiner Kallweit
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=fb9c1fe7-dfea-44cb-1865-64f4962a07b5@gmail.com \
--to=hkallweit1@gmail.com \
--cc=anna-maria@linutronix.de \
--cc=frederic@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).