From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1755222AbdKATtj (ORCPT ); Wed, 1 Nov 2017 15:49:39 -0400
Received: from terminus.zytor.com ([65.50.211.136]:49917 "EHLO
	terminus.zytor.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1755198AbdKATti (ORCPT );
	Wed, 1 Nov 2017 15:49:38 -0400
Date: Wed, 1 Nov 2017 12:46:25 -0700
From: tip-bot for Don Zickus
Message-ID:
Cc: linux@roeck-us.net, hpa@zytor.com, dzickus@redhat.com,
	tglx@linutronix.de, mingo@kernel.org,
	linux-kernel@vger.kernel.org, peterz@infradead.org
Reply-To: linux-kernel@vger.kernel.org, peterz@infradead.org,
	linux@roeck-us.net, dzickus@redhat.com, hpa@zytor.com,
	tglx@linutronix.de, mingo@kernel.org
In-Reply-To: <20171101181126.j727fqjmdthjz4xk@redhat.com>
References: <20171101181126.j727fqjmdthjz4xk@redhat.com>
To: linux-tip-commits@vger.kernel.org
Subject: [tip:core/urgent] watchdog/hardlockup/perf: Use atomics to track
	in-use cpu counter
Git-Commit-ID: c7254c8aabe3025770fdb6f2d84aded11716ca2b
X-Mailer: tip-git-log-daemon
Robot-ID:
Robot-Unsubscribe: Contact to get blacklisted from these emails
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset=UTF-8
Content-Disposition: inline
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Commit-ID:  c7254c8aabe3025770fdb6f2d84aded11716ca2b
Gitweb:     https://git.kernel.org/tip/c7254c8aabe3025770fdb6f2d84aded11716ca2b
Author:     Don Zickus
AuthorDate: Wed, 1 Nov 2017 14:11:27 -0400
Committer:  Thomas Gleixner
CommitDate: Wed, 1 Nov 2017 20:41:28 +0100

watchdog/hardlockup/perf: Use atomics to track in-use cpu counter

Guenter reported:
  There is still a problem. When running
    echo 6 > /proc/sys/kernel/watchdog_thresh
    echo 5 > /proc/sys/kernel/watchdog_thresh
  repeatedly, the message

   NMI watchdog: Enabled. Permanently consumes one hw-PMU counter.

  stops after a while (after ~10-30 iterations, with fluctuations).
  Maybe watchdog_cpus needs to be atomic ?

That's correct as this again is affected by the asynchronous nature of the
smpboot thread unpark mechanism.

CPU 0                           CPU1                    CPU2

write(watchdog_thresh, 6)
  stop()
    park()
  update()
  start()
    unpark()
                                thread->unpark()
                                  cnt++;
write(watchdog_thresh, 5)                               thread->unpark()
  stop()
    park()                      thread->park()
                                   cnt--;                 cnt++;
  update()
  start()
    unpark()

That's not a functional problem, it just affects the informational message.

Convert watchdog_cpus to atomic_t to prevent the problem

Reported-and-tested-by: Guenter Roeck
Signed-off-by: Don Zickus
Signed-off-by: Thomas Gleixner
Cc: Peter Zijlstra
Link: https://lkml.kernel.org/r/20171101181126.j727fqjmdthjz4xk@redhat.com

---
 kernel/watchdog_hld.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/kernel/watchdog_hld.c b/kernel/watchdog_hld.c
index f8db56b..52218f2 100644
--- a/kernel/watchdog_hld.c
+++ b/kernel/watchdog_hld.c
@@ -12,6 +12,7 @@
 #define pr_fmt(fmt) "NMI watchdog: " fmt
 
 #include
+#include
 #include
 
 #include
@@ -25,7 +26,7 @@ static DEFINE_PER_CPU(struct perf_event *, dead_event);
 static struct cpumask dead_events_mask;
 
 static unsigned long hardlockup_allcpu_dumped;
-static unsigned int watchdog_cpus;
+static atomic_t watchdog_cpus = ATOMIC_INIT(0);
 
 void arch_touch_nmi_watchdog(void)
 {
@@ -189,7 +190,8 @@ void hardlockup_detector_perf_enable(void)
 	if (hardlockup_detector_event_create())
 		return;
 
-	if (!watchdog_cpus++)
+	/* use original value for check */
+	if (!atomic_fetch_inc(&watchdog_cpus))
 		pr_info("Enabled. Permanently consumes one hw-PMU counter.\n");
 
 	perf_event_enable(this_cpu_read(watchdog_ev));
@@ -207,7 +209,7 @@ void hardlockup_detector_perf_disable(void)
 		this_cpu_write(watchdog_ev, NULL);
 		this_cpu_write(dead_event, event);
 		cpumask_set_cpu(smp_processor_id(), &dead_events_mask);
-		watchdog_cpus--;
+		atomic_dec(&watchdog_cpus);
 	}
 }
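
For readers who want to try the idea outside the kernel, here is a minimal
userspace sketch of the same pattern, assuming C11 <stdatomic.h> in place of
the kernel's atomic_t API. The names active_cpus, detector_enable(),
detector_disable() and run_cycles() are hypothetical and exist only for this
illustration; they are not part of the patch. It relies on atomic_fetch_add()
returning the value held before the increment, so only the caller that moves
the counter from 0 to 1 prints the message.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for watchdog_cpus; static storage starts at zero. */
static atomic_uint active_cpus;

/* Returns true if this call took the counter from 0 to 1. */
static bool detector_enable(void)
{
	/* atomic_fetch_add() yields the value held before the add. */
	if (atomic_fetch_add(&active_cpus, 1) == 0) {
		puts("Enabled. Permanently consumes one hw-PMU counter.");
		return true;
	}
	return false;
}

/* One atomic read-modify-write, so a concurrent enable cannot be lost. */
static void detector_disable(void)
{
	atomic_fetch_sub(&active_cpus, 1);
}

/*
 * Replays `cycles` rounds of "enable on every CPU, then disable on every
 * CPU" and returns how many times the message was printed.
 */
static unsigned int run_cycles(unsigned int ncpus, unsigned int cycles)
{
	unsigned int printed = 0;

	for (unsigned int c = 0; c < cycles; c++) {
		for (unsigned int cpu = 0; cpu < ncpus; cpu++)
			printed += detector_enable();
		for (unsigned int cpu = 0; cpu < ncpus; cpu++)
			detector_disable();
	}
	return printed;
}
```

run_cycles() is single-threaded and only replays the counting. Because each
update is a single atomic read-modify-write, the same two helpers can be
called from concurrent threads without the counter drifting, which is the
property the patch depends on.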