From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path: 
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751477AbdIMSGp (ORCPT );
	Wed, 13 Sep 2017 14:06:45 -0400
Received: from mx1.redhat.com ([209.132.183.28]:47056 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1751106AbdIMSGo (ORCPT );
	Wed, 13 Sep 2017 14:06:44 -0400
DMARC-Filter: OpenDMARC Filter v1.3.2 mx1.redhat.com D8B534A6EA
Authentication-Results: ext-mx09.extmail.prod.ext.phx2.redhat.com;
	dmarc=none (p=none dis=none) header.from=redhat.com
Authentication-Results: ext-mx09.extmail.prod.ext.phx2.redhat.com;
	spf=fail smtp.mailfrom=dzickus@redhat.com
Date: Wed, 13 Sep 2017 14:06:42 -0400
From: Don Zickus
To: Thomas Gleixner
Cc: LKML, Ingo Molnar, Peter Zijlstra, Borislav Petkov, Andrew Morton,
	Sebastian Siewior, Nicholas Piggin, Chris Metcalf, Ulrich Obergfell
Subject: Re: [patch V2 00/29] lockup_detector: Cure hotplug deadlocks and
	replace duct tape
Message-ID: <20170913180642.ywrszlfziobv7yiu@redhat.com>
References: <20170912193654.321505854@linutronix.de>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20170912193654.321505854@linutronix.de>
User-Agent: NeoMutt/20170428-dirty (1.8.2)
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16
	(mx1.redhat.com [10.5.110.38]); Wed, 13 Sep 2017 18:06:44 +0000 (UTC)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Sep 12, 2017 at 09:36:54PM +0200, Thomas Gleixner wrote:
> The lockup detector is broken in several ways:
> 
> - It's deadlock prone vs. CPU hotplug in various ways. Some of these
>   are due to recursive cpus_read_lock(), others are due to
>   cpus_read_lock() from CPU hotplug callbacks which immediately lock
>   the machine because cpus are write locked.
> 
> - The handling of the cpu hotplug threads happens sideways to the
>   smpboot thread infrastructure, which is racy and pointless.
> 
> - The handling of the user space sysctl interface is a complete
>   trainwreck as it fiddles directly with variables which can be
>   modified or evaluated by the running watchdogs.
> 
> - The perf event initialization is a trainwreck as it tries to create
>   perf events over and over even if perf is not functional (no
>   hardware, ....). To avoid excessive dmesg spam it contains magic
>   printk ratelimiting along with either wrong or useless messages.
> 
> - The code structure is horrible as ifdef sections are scattered all
>   over the place, which makes it unreadable.
> 
> - There is more wreckage, but see the changelogs for the ugly details.

Aside from the simple compile issue in patch 25, I have no issues with this
patchset.  Thanks Thomas!

Reviewed-by: Don Zickus

> The following series sanitizes the facility and addresses the problems.
> 
> Changes since V1:
> 
>   - Wrapped the perf specific calls into the weak watchdog_nmi_*
>     functions
> 
>   - Fixed the compile error pointed out by Don
> 
>   - Fixed the reconfiguration parameter inconsistency which broke
>     powerpc
> 
>   - Picked up the updated version of patch 11/29
> 
> Delta patch below.
> 
> Thanks,
> 
> 	tglx
> ---
> Diffstat for the series:
> 
>  arch/parisc/kernel/process.c   |    2
>  arch/powerpc/kernel/watchdog.c |   22 -
>  arch/x86/events/intel/core.c   |   11
>  include/linux/nmi.h            |  121 +++---
>  include/linux/smpboot.h        |    4
>  kernel/cpu.c                   |    6
>  kernel/smpboot.c               |   22 -
>  kernel/sysctl.c                |   22 -
>  kernel/watchdog.c              |  633 ++++++++++++++---------------
>  kernel/watchdog_hld.c          |  193 ++++++------
>  10 files changed, 434 insertions(+), 602 deletions(-)
> 
> Delta patch vs. V1
> 8<------------------------
> --- a/arch/powerpc/kernel/watchdog.c
> +++ b/arch/powerpc/kernel/watchdog.c
> @@ -355,12 +355,12 @@ static void watchdog_calc_timeouts(void)
>  	wd_timer_period_ms = watchdog_thresh * 1000 * 2 / 5;
>  }
> 
> -void watchdog_nmi_reconfigure(bool stop)
> +void watchdog_nmi_reconfigure(bool run)
>  {
>  	int cpu;
> 
>  	cpus_read_lock();
> -	if (stop) {
> +	if (!run) {
>  		for_each_cpu(cpu, &wd_cpus_enabled)
>  			stop_wd_on_cpu(cpu);
>  	} else {
> --- a/include/linux/nmi.h
> +++ b/include/linux/nmi.h
> @@ -102,7 +102,7 @@ static inline void hardlockup_detector_p
>  static inline void hardlockup_detector_perf_enable(void) { }
>  static inline void hardlockup_detector_perf_cleanup(void) { }
>  # if !defined(CONFIG_HAVE_NMI_WATCHDOG)
> -static int hardlockup_detector_perf_init(void) { return -ENODEV; }
> +static inline int hardlockup_detector_perf_init(void) { return -ENODEV; }
>  static inline void arch_touch_nmi_watchdog(void) {}
>  # else
>  static int hardlockup_detector_perf_init(void) { return 0; }
> --- a/kernel/watchdog.c
> +++ b/kernel/watchdog.c
> @@ -105,18 +105,32 @@ static int __init hardlockup_all_cpu_bac
>   * softlockup watchdog threads start and stop. The arch must select the
>   * SOFTLOCKUP_DETECTOR Kconfig.
>   */
> -int __weak watchdog_nmi_enable(unsigned int cpu) { return 0; }
> -void __weak watchdog_nmi_disable(unsigned int cpu) { }
> +int __weak watchdog_nmi_enable(unsigned int cpu)
> +{
> +	hardlockup_detector_perf_enable();
> +	return 0;
> +}
> +
> +void __weak watchdog_nmi_disable(unsigned int cpu)
> +{
> +	hardlockup_detector_perf_disable();
> +}
> +
> +/* Return 0, if a NMI watchdog is available. Error code otherwise */
> +int __weak __init watchdog_nmi_probe(void)
> +{
> +	return hardlockup_detector_perf_init();
> +}
> 
>  /**
>   * watchdog_nmi_reconfigure - Optional function to reconfigure NMI watchdogs
> - * @stop:	If true stop the watchdogs on all enabled CPUs
> - *		If false start the watchdogs on all enabled CPUs
> + * @run:	If false stop the watchdogs on all enabled CPUs
> + *		If true start the watchdogs on all enabled CPUs
>   *
>   * The core call order is:
> - *	watchdog_nmi_reconfigure(true);
> - *	update_variables();
>   *	watchdog_nmi_reconfigure(false);
> + *	update_variables();
> + *	watchdog_nmi_reconfigure(true);
>   *
>   * The second call which starts the watchdogs again guarantees that the
>   * following variables are stable across the call.
> @@ -126,13 +140,13 @@ void __weak watchdog_nmi_disable(unsigne
>   *
>   * After the call the variables can be changed again.
>   */
> -void __weak watchdog_nmi_reconfigure(bool stop) { }
> +void __weak watchdog_nmi_reconfigure(bool run) { }
> 
>  /**
>   * lockup_detector_update_enable - Update the sysctl enable bit
>   *
>   * Caller needs to make sure that the NMI/perf watchdogs are off, so this
> - * can't race with hardlockup_detector_disable().
> + * can't race with watchdog_nmi_disable().
>   */
>  static void lockup_detector_update_enable(void)
>  {
> @@ -453,8 +467,7 @@ static void watchdog_enable(unsigned int
>  	__touch_watchdog();
>  	/* Enable the perf event */
>  	if (watchdog_enabled & NMI_WATCHDOG_ENABLED)
> -		hardlockup_detector_perf_enable();
> -	watchdog_nmi_enable(cpu);
> +		watchdog_nmi_enable(cpu);
> 
>  	watchdog_set_prio(SCHED_FIFO, MAX_RT_PRIO - 1);
>  }
> @@ -469,7 +482,6 @@ static void watchdog_disable(unsigned in
>  	 * between disabling the timer and disabling the perf event causes
>  	 * the perf NMI to detect a false positive.
>  	 */
> -	hardlockup_detector_perf_disable();
>  	watchdog_nmi_disable(cpu);
>  	hrtimer_cancel(hrtimer);
>  }
> @@ -745,12 +757,6 @@ int proc_watchdog_cpumask(struct ctl_tab
>  }
>  #endif /* CONFIG_SYSCTL */
> 
> -static __init void detect_nmi_watchdog(void)
> -{
> -	if (!hardlockup_detector_perf_init())
> -		nmi_watchdog_available = true;
> -}
> -
>  void __init lockup_detector_init(void)
>  {
>  #ifdef CONFIG_NO_HZ_FULL
> @@ -763,6 +769,7 @@ void __init lockup_detector_init(void)
>  	cpumask_copy(&watchdog_cpumask, cpu_possible_mask);
>  #endif
> 
> -	detect_nmi_watchdog();
> +	if (!watchdog_nmi_probe())
> +		nmi_watchdog_available = true;
>  	softlockup_init_threads();
>  }
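
For context, the call-order contract documented in the watchdog_nmi_reconfigure()
kernel-doc above boils down to the pattern sketched below.  This is only an
illustration of that contract, not code from the series: update_watchdog_params()
and its parameters are made-up names for the sketch, and the two control
variables stand in for whatever the core actually updates between the calls.

	/*
	 * Minimal sketch of the reconfiguration contract: stop the NMI
	 * watchdogs, update the control variables, then restart them.
	 * The variables must stay stable across the second call.
	 */
	#include <stdbool.h>

	void watchdog_nmi_reconfigure(bool run);	/* weak hook from the patch */

	static unsigned long watchdog_enabled;		/* illustrative stand-in */
	static int watchdog_thresh;			/* illustrative stand-in */

	static void update_watchdog_params(unsigned long enabled, int thresh)
	{
		/* Stop the watchdogs so nothing evaluates the variables. */
		watchdog_nmi_reconfigure(false);

		/* Safe to modify while the watchdogs are stopped. */
		watchdog_enabled = enabled;
		watchdog_thresh  = thresh;

		/* Restart; the new values are picked up and must not change
		 * again until this call returns. */
		watchdog_nmi_reconfigure(true);
	}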