From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753590AbbC3UCl (ORCPT ); Mon, 30 Mar 2015 16:02:41 -0400 Received: from mx1.redhat.com ([209.132.183.28]:46698 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753225AbbC3UCf (ORCPT ); Mon, 30 Mar 2015 16:02:35 -0400 Date: Mon, 30 Mar 2015 16:02:06 -0400 From: Don Zickus To: Chris Metcalf Cc: Andrew Morton , Andrew Jones , chai wen , Ingo Molnar , Ulrich Obergfell , Fabian Frederick , Aaron Tomlin , Ben Zhang , Christoph Lameter , Frederic Weisbecker , Gilad Ben-Yossef , Steven Rostedt , open list Subject: Re: [PATCH v2] watchdog: nohz: don't run watchdog on nohz_full cores Message-ID: <20150330200206.GQ162412@redhat.com> References: <1427741465-15747-1-git-send-email-cmetcalf@ezchip.com> <20150330191245.GO162412@redhat.com> <5519A4E7.8040708@ezchip.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <5519A4E7.8040708@ezchip.com> User-Agent: Mutt/1.5.21 (2010-09-15) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, Mar 30, 2015 at 03:32:55PM -0400, Chris Metcalf wrote: > On 03/30/2015 03:12 PM, Don Zickus wrote: > >On Mon, Mar 30, 2015 at 02:51:05PM -0400, cmetcalf@ezchip.com wrote: > >>From: Chris Metcalf > >> > >>Running watchdog can be a helpful debugging feature on regular > >>cores, but it's incompatible with nohz_full, since it forces > >>regular scheduling events. Accordingly, just exit out immediately > >>from any nohz_full core. > >> > >>An alternate approach would be to add a flags field or function to > >>smp_hotplug_thread to control on which cores the percpu threads > >>are created, but it wasn't clear that much mechanism was useful. > >Hi Chris, > > > >It seems like the correct solution would be to hook into the idle_loop > >somehow. If the cpu is idle, then it seems unlikely that a lockup could > >occur. > > With nohz_full, though, the cpu might be running userspace code > with the intention of keeping kernel ticks disabled. Even returning > to kernel mode to try to figure out if we "should" be running the > watchdog on a given core will induce exactly the kind of interrupts > that nohz_full is designed to prevent. > > My assumption is generally that nohz_full cores don't spend a lot of > time in the kernel anyway, as they are optimized for user space. > > I guess you could imagine doing something per-cpu on the nohz_full > cores where we effectively call watchdog_disable() whenever a > nohz_full core enters userspace, and watchdog_enable() whenever it > enters the kernel. We could add some per-cpu state in the watchdog > code to track whether that core was currently enabled or disabled > to avoid double-enabling or double-disabling. I would think > context_tracking_user_exit()/_enter() would be the place to do this. > > This feels like a lot of overhead, potentially. Thoughts? A few months ago I might have thought that a reasonable approach. But recently we have added code to make the watchdog an all or nothing approach across the system. This might make it difficult to do what you are suggesting. I do not know enough about the nohz code to know what the right approach is here. Perhaps Federic can enlighten me? Cheers, Don > > >My fear with this apporach is a lockup would occur on the nohz cpu and it > >would go undetected because that cpu is disabled. Further no printk is > >thrown out to even indicate a cpu is disabled making it more difficult to > >debug. > > Assuming we stick with my simple "exit the watchdog thread" model, > we probably likely wouldn't want to warn for every nohz_full core, but a > one-shot message makes sense, e.g. just "watchdog: disabled on > nohz_full cores". > > >>Signed-off-by: Chris Metcalf > >>--- > >> kernel/watchdog.c | 5 +++++ > >> 1 file changed, 5 insertions(+) > >> > >>diff --git a/kernel/watchdog.c b/kernel/watchdog.c > >>index 3174bf8e3538..8a46d9d8a66f 100644 > >>--- a/kernel/watchdog.c > >>+++ b/kernel/watchdog.c > >>@@ -19,6 +19,7 @@ > >> #include > >> #include > >> #include > >>+#include > >> #include > >> #include > >>@@ -431,6 +432,10 @@ static void watchdog_enable(unsigned int cpu) > >> hrtimer_init(hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL); > >> hrtimer->function = watchdog_timer_fn; > >>+ /* nohz_full cpus do not do watchdog checking. */ > >>+ if (tick_nohz_full_cpu(cpu)) > >>+ do_exit(0); > >>+ > >> /* Enable the perf event */ > >> watchdog_nmi_enable(cpu); > >>-- > >>2.1.2 > >> > > -- > Chris Metcalf, EZChip Semiconductor > http://www.ezchip.com >