From: Chris Metcalf <cmetcalf@ezchip.com>
To: Frederic Weisbecker <fweisbec@gmail.com>,
Don Zickus <dzickus@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
Andrew Jones <drjones@redhat.com>,
chai wen <chaiw.fnst@cn.fujitsu.com>,
Ulrich Obergfell <uobergfe@redhat.com>,
Fabian Frederick <fabf@skynet.be>,
Aaron Tomlin <atomlin@redhat.com>, Ben Zhang <benzh@chromium.org>,
Christoph Lameter <cl@linux.com>,
Gilad Ben-Yossef <gilad@benyossef.com>,
Steven Rostedt <rostedt@goodmis.org>,
open list <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] watchdog: nohz: don't run watchdog on nohz_full cores
Date: Thu, 2 Apr 2015 11:42:43 -0400
Message-ID: <551D6373.2030000@ezchip.com>
In-Reply-To: <20150402153827.GC10357@lerouge>
On 04/02/2015 11:38 AM, Frederic Weisbecker wrote:
> On Thu, Apr 02, 2015 at 10:15:27AM -0400, Don Zickus wrote:
>> On Thu, Apr 02, 2015 at 09:49:45AM -0400, Chris Metcalf wrote:
>>>> Can I ask how the NO_HZ_FULL technology works from userspace? Is there a
>>>> system command that has to be sent? How does the kernel know to turn off
>>>> ticks and trust userspace to do the right thing?
>>> The NO_HZ_FULL option, when configured into the kernel, lets
>>> you boot with "nohz_full=1-15" (or whatever cpumask you like),
>>> typically in conjunction with "isolcpus=1-15". At this point no tasks
>>> will run on those cores until explicitly placed there by affinity, and
>>> once there and running in userspace, the kernel will automatically
>>> get out of their way and not interrupt at all. This lets those tasks
>>> run with 100.000% of the cpu, which is a requirement for many
>>> user-space device drivers running high throughput devices.
>>> (This is typically the use case for the tile architecture customers.)
>>>
>>> So, other than a boot flag, there are no system commands or
>>> other APIs to deal with.
>> Ah, I am starting to understand your approach in the original patch better.
>>
>>> Part of the requirement, though, is that there can be only one task
>>> bound and runnable on that cpu, otherwise the kernel has to be
>>> involved to do the context-switching off of the scheduler tick.
>>> This is why having the standard watchdog kernel thread doesn't
>>> work in this context.
>> So, there is no preemption happening, which means the softlockup is rather
>> pointless.
> Still useful actually because nohz full only takes effect when a single task runs
> on the CPU. But there can still be more than 1 task running, just nohz full will
> be disabled. It all happens dynamically.
>
>> Can interrupts be disabled or handled on that cpu? I am trying
>> to see if the hardlockup detector becomes rather silly on those cpus too.
> No, interrupts aren't disabled on these CPUs. Now the goal is to avoid them:
> migrate irqs, nohz full, etc...
>
> But there can be irqs. And actually there is at least 1 tick every second in
> order to keep the scheduler stats moving forward. We plan to get rid of it but
> anyway the point is that IRQ can happen on nohz full CPUs.
>
>>> I continue to suspect that the right model here is to disable the
>>> watchdog specifically on the cores that the user has tagged with
>>> the nohz_full boot argument. I agree that there might be a case
>>> to be made for leaving the watchdog conditionally (as suggested
>>> by Ingo) but it should be possible to have the watchdogs on
>>> the nohz_full cores be turned off completely if desired.
>> I think I might be slowly coming around to your thoughts. I might request a
>> different patch though based on the answers above. Maybe even create a
>> subset of the online cpus for the watchdog to work off of. The watchdog
>> would copy the online cpu mask, mask off the nohz cpus and just function
>> that way. It would print loud messages for each nohz cpu it was masking
>> off.
> All agreed with that! We should at least keep the watchdog running on
> non-nohz-full CPUs. And also allow to re-enable it everywhere when needed,
> in case we have a lockup to chase on nohz full CPUs.
>
>> Then perhaps as a debug aid, expose a /proc/sys/kernel/watchdog_cpumask for
>> folks to modify in case they want to enable the watchdog on the nohz cpus.
> That sounds like a good idea.

OK, I will respin v2 of the patch as follows:

- Provide a watchdog_cpumask as suggested by Don.
- On a non-NO_HZ_FULL build, it defaults to cpu_possible as normal.
- On a NO_HZ_FULL build, it defaults to the housekeeping cpus.
- If the mask is modified, we disable and then re-enable the watchdog,
  so that the watchdog init code can exit() the appropriate threads as
  they start up.

This should address the various concerns that have been raised.
--
Chris Metcalf, EZChip Semiconductor
http://www.ezchip.com