From: Chris Metcalf <cmetcalf@ezchip.com>
To: Frederic Weisbecker <fweisbec@gmail.com>,
Don Zickus <dzickus@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Ingo Molnar <mingo@kernel.org>, Andrew Jones <drjones@redhat.com>,
Ulrich Obergfell <uobergfe@redhat.com>,
Fabian Frederick <fabf@skynet.be>,
Aaron Tomlin <atomlin@redhat.com>, Ben Zhang <benzh@chromium.org>,
Christoph Lameter <cl@linux.com>,
Gilad Ben-Yossef <gilad@benyossef.com>,
Steven Rostedt <rostedt@goodmis.org>,
<linux-kernel@vger.kernel.org>, Jonathan Corbet <corbet@lwn.net>,
<linux-doc@vger.kernel.org>, Thomas Gleixner <tglx@linutronix.de>,
Peter Zijlstra <peterz@infradead.org>
Subject: Re: [PATCH v10 1/3] smpboot: allow excluding cpus from the smpboot threads
Date: Thu, 4 Jun 2015 13:25:49 -0400
Message-ID: <55708A1D.70707@ezchip.com>
In-Reply-To: <20150501212329.GA4179@lerouge>
On 05/01/2015 05:23 PM, Frederic Weisbecker wrote:
> On Fri, May 01, 2015 at 03:57:51PM -0400, Chris Metcalf wrote:
>> On 05/01/2015 04:53 AM, Frederic Weisbecker wrote:
>>>> + /* Unpark any threads that were voluntarily parked. */
>>>> + for_each_cpu_not(cpu, &ht->cpumask) {
>>>> + if (cpu_online(cpu)) {
>>>> + struct task_struct *tsk = *per_cpu_ptr(ht->store, cpu);
>>>> + if (tsk)
>>>> + kthread_unpark(tsk);
>>> I'm still not clear why we are doing that. kthread_stop() should be able
>>> to handle parked kthreads, otherwise it needs to be fixed.
>> Checking without the unpark, it's actually only a problem with nohz_full.
>> In a system without nohz_full, the kthreads are able to stop even when
>> they are parked; it's only in the nohz_full case that things wedge.
>> For example, booting with only cpu 0 as a housekeeping core (and
>> therefore all watchdogs 1-35 on my 36-core tilegx are parked), and
>> immediately doing "echo 0 > /proc/sys/kernel/watchdog", I see
>> (via SysRq ^O-l) the first parked watchdog, on cpu 1, hung with:
>>
>> frame 0: 0xfffffff7000f2928 lock_hrtimer_base+0xb8/0xc0
>> frame 1: 0xfffffff7000f2a28 hrtimer_try_to_cancel+0x40/0x170
>> frame 2: 0xfffffff7000f2a28 hrtimer_try_to_cancel+0x40/0x170
>> frame 3: 0xfffffff7000f2b98 hrtimer_cancel+0x40/0x68
>> frame 4: 0xfffffff70014cce0 watchdog_disable+0x50/0x70
>> frame 5: 0xfffffff70008c2d0 smpboot_thread_fn+0x350/0x438
>> frame 6: 0xfffffff700084b28 kthread+0x160/0x178
I finally had some time to look into this issue further.
With PROVE_LOCKING enabled (after a fix I'll send to LKML shortly), we
get no warnings, and SysRq ^O-d (show all held locks) prints:
Showing all locks held in the system:
3 locks held by watchdog/1/15:
#0: (&(&hp->lock)->rlock){-.....}, at: [<fffffff700620740>] hvc_poll+0xb8/0x4b8
#1: (rcu_read_lock){......}, at: [<fffffff70061d710>] __handle_sysrq+0x0/0x440
#2: (tasklist_lock){.+.+..}, at: [<fffffff7000d7310>] debug_show_all_locks+0xc0/0x350
3 locks held by sh/1732:
#0: (sb_writers#4){.+.+.+}, at: [<fffffff70022f6b8>] vfs_write+0x268/0x2c0
#1: (watchdog_proc_mutex){+.+.+.}, at: [<fffffff70016f368>] proc_watchdog_common+0x78/0x1c8
#2: (smpboot_threads_lock){+.+.+.}, at: [<fffffff700093558>] smpboot_unregister_percpu_thread+0x48/0x88
All three watchdog/1/15 locks are attributable to the fact that it was running
on the same core that ended up handling the "^O-d" request from SysRq.
The sh process from which I ran the echo eventually gets reported as "blocked
for more than 120 seconds", and it is pretty much where you'd expect it to be:
waiting on a completion in kthread_stop() at kthread.c:473.
I instrumented lock_hrtimer_base(): timer->base is NULL and never becomes
non-NULL, so the retry loop spins forever. Perhaps something in nohz is
preventing timer->base from being set?
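To make the failure mode concrete, here is a minimal user-space sketch of that retry loop. It is a model under stated assumptions, not the kernel code: the model_* names and types are hypothetical, the "lock" is a plain flag standing in for the spinlock on base->cpu_base->lock, and the max_spins bound exists only so the model can terminate. As far as I recall, the real loop has no such bound and calls cpu_relax() between attempts.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-CPU clock base; only the lock is modelled. */
struct model_clock_base {
	int lock_taken;			/* stands in for the base's spinlock */
};

/* Hypothetical stand-in for struct hrtimer; only the base pointer is modelled. */
struct model_hrtimer {
	struct model_clock_base *base;	/* NULL means "no base assigned" */
};

/*
 * Bounded model of the retry loop. Returns the locked base, or NULL if
 * max_spins attempts pass without timer->base becoming non-NULL.
 * *spins reports how many failed attempts were made.
 */
struct model_clock_base *
model_lock_hrtimer_base(struct model_hrtimer *timer, unsigned long max_spins,
			unsigned long *spins)
{
	*spins = 0;
	while (*spins < max_spins) {
		struct model_clock_base *base = timer->base;

		if (base != NULL) {
			base->lock_taken = 1;		/* "lock" the base */
			if (base == timer->base)	/* still the same base? */
				return base;
			base->lock_taken = 0;		/* changed under us: unlock, retry */
		}
		(*spins)++;				/* assumption: cpu_relax() in the kernel */
	}
	return NULL;	/* model gives up; an unbounded loop would spin forever */
}

/* Small self-check showing both outcomes. */
void model_selfcheck(void)
{
	struct model_clock_base b = { 0 };
	struct model_hrtimer healthy = { &b };
	struct model_hrtimer stuck = { NULL };
	unsigned long spins;

	assert(model_lock_hrtimer_base(&healthy, 10, &spins) == &b);
	assert(model_lock_hrtimer_base(&stuck, 1000, &spins) == NULL);
}
```

With a non-NULL base the first attempt succeeds; with a NULL base every attempt falls through to the retry, which matches the hang seen under hrtimer_cancel() in the backtrace above if nothing ever assigns timer->base.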
I'm happy to keep debugging this but I'm not really clear on what could
be going wrong here. Any ideas?
--
Chris Metcalf, EZChip Semiconductor
http://www.ezchip.com