From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1755302AbdKAUbd (ORCPT );
        Wed, 1 Nov 2017 16:31:33 -0400
Received: from terminus.zytor.com ([65.50.211.136]:48261 "EHLO
        terminus.zytor.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1752395AbdKAUba (ORCPT );
        Wed, 1 Nov 2017 16:31:30 -0400
Date: Wed, 1 Nov 2017 13:28:17 -0700
From: tip-bot for Don Zickus
Message-ID:
Cc: hpa@zytor.com, linux-kernel@vger.kernel.org, dzickus@redhat.com,
        linux@roeck-us.net, mingo@kernel.org, tglx@linutronix.de,
        peterz@infradead.org
Reply-To: peterz@infradead.org, tglx@linutronix.de, mingo@kernel.org,
        linux@roeck-us.net, dzickus@redhat.com, hpa@zytor.com,
        linux-kernel@vger.kernel.org
In-Reply-To: <20171101181126.j727fqjmdthjz4xk@redhat.com>
References: <20171101181126.j727fqjmdthjz4xk@redhat.com>
To: linux-tip-commits@vger.kernel.org
Subject: [tip:core/urgent] watchdog/hardlockup/perf: Use atomics to track in-use cpu counter
Git-Commit-ID: 42f930da7f00c0ab23df4c7aed36137f35988980
X-Mailer: tip-git-log-daemon
Robot-ID:
Robot-Unsubscribe: Contact to get blacklisted from these emails
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset=UTF-8
Content-Disposition: inline
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Commit-ID:  42f930da7f00c0ab23df4c7aed36137f35988980
Gitweb:     https://git.kernel.org/tip/42f930da7f00c0ab23df4c7aed36137f35988980
Author:     Don Zickus
AuthorDate: Wed, 1 Nov 2017 14:11:27 -0400
Committer:  Thomas Gleixner
CommitDate: Wed, 1 Nov 2017 21:18:40 +0100

watchdog/hardlockup/perf: Use atomics to track in-use cpu counter

Guenter reported:

  There is still a problem. When running

    echo 6 > /proc/sys/kernel/watchdog_thresh
    echo 5 > /proc/sys/kernel/watchdog_thresh

  repeatedly, the message

    NMI watchdog: Enabled. Permanently consumes one hw-PMU counter.
  stops after a while (after ~10-30 iterations, with fluctuations).
  Maybe watchdog_cpus needs to be atomic ?

That's correct as this again is affected by the asynchronous nature of the
smpboot thread unpark mechanism.

CPU 0                         CPU1                  CPU2
write(watchdog_thresh, 6)
  stop()
    park()
  update()
  start()
    unpark()
                              thread->unpark()
                                cnt++;
write(watchdog_thresh, 5)                           thread->unpark()
  stop()
    park()                    thread->park()
                                cnt--;                cnt++;
  update()
  start()
    unpark()

That's not a functional problem, it just affects the informational message.

Convert watchdog_cpus to atomic_t to prevent the problem

Reported-and-tested-by: Guenter Roeck
Signed-off-by: Don Zickus
Signed-off-by: Thomas Gleixner
Cc: Peter Zijlstra
Link: https://lkml.kernel.org/r/20171101181126.j727fqjmdthjz4xk@redhat.com

---
 kernel/watchdog_hld.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/kernel/watchdog_hld.c b/kernel/watchdog_hld.c
index a7f137c..a84b205 100644
--- a/kernel/watchdog_hld.c
+++ b/kernel/watchdog_hld.c
@@ -12,6 +12,7 @@
 #define pr_fmt(fmt) "NMI watchdog: " fmt
 
 #include
+#include
 #include
 #include
 
@@ -25,7 +26,7 @@ static DEFINE_PER_CPU(struct perf_event *, dead_event);
 static struct cpumask dead_events_mask;
 
 static unsigned long hardlockup_allcpu_dumped;
-static unsigned int watchdog_cpus;
+static atomic_t watchdog_cpus = ATOMIC_INIT(0);
 
 void arch_touch_nmi_watchdog(void)
 {
@@ -189,7 +190,8 @@ void hardlockup_detector_perf_enable(void)
 	if (hardlockup_detector_event_create())
 		return;
 
-	if (!watchdog_cpus++)
+	/* use original value for check */
+	if (!atomic_fetch_inc(&watchdog_cpus))
 		pr_info("Enabled. Permanently consumes one hw-PMU counter.\n");
 
 	perf_event_enable(this_cpu_read(watchdog_ev));
@@ -207,7 +209,7 @@ void hardlockup_detector_perf_disable(void)
 		this_cpu_write(watchdog_ev, NULL);
 		this_cpu_write(dead_event, event);
 		cpumask_set_cpu(smp_processor_id(), &dead_events_mask);
-		watchdog_cpus--;
+		atomic_dec(&watchdog_cpus);
 	}
 }
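
For readers who want to try the counting pattern outside the kernel, here
is a rough user-space sketch. It is not the kernel code: it assumes C11
<stdatomic.h> as a stand-in for atomic_t, and the helper names
watchdog_enable(), watchdog_disable() and watchdog_in_use() are
hypothetical, made up for this illustration. The point it tries to show is
that a fetch-and-add returns the value held before the addition, so a
check against zero identifies the first user in the same way the old
"!watchdog_cpus++" test did, but as one indivisible step.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* User-space stand-in for the kernel's atomic_t watchdog_cpus (assumption). */
static atomic_int watchdog_cpus = 0;

/*
 * Hypothetical counterpart of the enable path: returns true only for the
 * caller that moved the counter from 0 to 1, which is the point where the
 * patch prints the "Enabled" message.
 */
static bool watchdog_enable(void)
{
	/* atomic_fetch_add() returns the value held before the addition. */
	return atomic_fetch_add(&watchdog_cpus, 1) == 0;
}

/* Hypothetical counterpart of the disable path: drop one user. */
static void watchdog_disable(void)
{
	int old = atomic_fetch_sub(&watchdog_cpus, 1);

	/* An unbalanced disable would mean the accounting is broken. */
	assert(old > 0);
}

/* Current number of users, read atomically. */
static int watchdog_in_use(void)
{
	return atomic_load(&watchdog_cpus);
}
```

Calling watchdog_enable() twice and watchdog_disable() twice leaves the
counter at zero, so the next watchdog_enable() reports "first user" again.
With a plain unsigned int, a concurrent increment and decrement could lose
an update and leave the counter permanently off by one.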