From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751115AbdJCGvJ (ORCPT ); Tue, 3 Oct 2017 02:51:09 -0400
Received: from Galois.linutronix.de ([146.0.238.70]:52187 "EHLO
	Galois.linutronix.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1750797AbdJCGvI (ORCPT );
	Tue, 3 Oct 2017 02:51:08 -0400
Date: Tue, 3 Oct 2017 08:50:59 +0200 (CEST)
From: Thomas Gleixner
To: Michael Ellerman
cc: LKML, Ingo Molnar, Peter Zijlstra, Borislav Petkov, Andrew Morton,
	Sebastian Siewior, Nicholas Piggin, Don Zickus, Chris Metcalf,
	Ulrich Obergfell, Benjamin Herrenschmidt,
	linuxppc-dev@lists.ozlabs.org
Subject: Re: [patch V2 22/29] lockup_detector: Make
	watchdog_nmi_reconfigure() two stage
In-Reply-To: <87d165dqew.fsf@concordia.ellerman.id.au>
Message-ID:
References: <20170912193654.321505854@linutronix.de>
	<20170912194147.862865570@linutronix.de>
	<87d165dqew.fsf@concordia.ellerman.id.au>
User-Agent: Alpine 2.20 (DEB 67 2015-01-07)
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
X-Linutronix-Spam-Score: -1.0
X-Linutronix-Spam-Level: -
X-Linutronix-Spam-Status: No , -1.0 points, 5.0 required, ALL_TRUSTED=-1,SHORTCIRCUIT=-0.0001
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, 3 Oct 2017, Michael Ellerman wrote:
> Hi Thomas,
> Unfortunately this is hitting the WARN_ON in start_wd_cpu() on powerpc
> because we're calling it multiple times for the boot CPU.
>
> The first call is via:
>
>   start_wd_on_cpu+0x80/0x2f0
>   watchdog_nmi_reconfigure+0x124/0x170
>   softlockup_reconfigure_threads+0x110/0x130
>   lockup_detector_init+0xbc/0xe0
>   kernel_init_freeable+0x18c/0x37c
>   kernel_init+0x2c/0x160
>   ret_from_kernel_thread+0x5c/0xbc
>
> And then again via the CPU hotplug registration:
>
>   start_wd_on_cpu+0x80/0x2f0
>   cpuhp_invoke_callback+0x194/0x620
>   cpuhp_thread_fun+0x7c/0x1b0
>   smpboot_thread_fn+0x290/0x2a0
>   kthread+0x168/0x1b0
>   ret_from_kernel_thread+0x5c/0xbc
>
>
> The first call is new because previously watchdog_nmi_reconfigure()
> wasn't called from softlockup_reconfigure_threads().

Hmm, don't you have the same problem with CPU hotplug or do you just get
lucky because the hotplug callback in your code is ordered vs. the
softlockup thread hotplug callback in a way that this does not hit?

> I'm not sure what the easiest fix is. One option would be to just drop
> the WARN_ON, it's just there for paranoia AFAICS.

The straight forward way is to make use of the new probe function. Patch
below.

Thanks,

	tglx

8<------------------
--- a/arch/powerpc/kernel/watchdog.c
+++ b/arch/powerpc/kernel/watchdog.c
@@ -375,20 +375,18 @@ void watchdog_nmi_start(void)
 /*
  * This runs after lockup_detector_init() which sets up watchdog_cpumask.
  */
-static int __init powerpc_watchdog_init(void)
+int __init watchdog_nmi_probe(void)
 {
 	int err;
 
-	watchdog_calc_timeouts();
-
-	err = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "powerpc/watchdog:online",
-				start_wd_on_cpu, stop_wd_on_cpu);
+	err = cpuhp_setup_state_nocalls(CPUHP_AP_ONLINE_DYN,
+					"powerpc/watchdog:online",
+					start_wd_on_cpu, stop_wd_on_cpu);
 	if (err < 0)
 		pr_warn("Watchdog could not be initialized");
 
 	return 0;
 }
-arch_initcall(powerpc_watchdog_init);
 
 static void handle_backtrace_ipi(struct pt_regs *regs)
 {
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -608,7 +608,6 @@ static inline int watchdog_park_threads(
 static inline void watchdog_unpark_threads(void) { }
 static inline int watchdog_enable_all_cpus(void) { return 0; }
 static inline void watchdog_disable_all_cpus(void) { }
-static inline void softlockup_init_threads(void) { }
 static void softlockup_reconfigure_threads(void)
 {
 	cpus_read_lock();
@@ -617,6 +616,10 @@ static void softlockup_reconfigure_threa
 	watchdog_nmi_start();
 	cpus_read_unlock();
 }
+static inline void softlockup_init_threads(void)
+{
+	softlockup_reconfigure_threads();
+}
 #endif /* !CONFIG_SOFTLOCKUP_DETECTOR */
 
 static void __lockup_detector_cleanup(void)
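
For illustration only, here is a rough userspace sketch of why the switch to
cpuhp_setup_state_nocalls() avoids the double start. Nothing in it is kernel
code: every mock_* name, the NR_CPUS value and the "only the boot CPU is
online" setup are assumptions made up for the example. It models just two
things: cpuhp_setup_state() invokes the startup callback on CPUs that are
already online while the _nocalls variant only registers it, and the
reconfigure path starts the watchdog on every online CPU. The ordering in the
patched scenario (probe before reconfigure) is also an assumption of the
sketch.

```c
#include <stdbool.h>

#define NR_CPUS 4	/* assumption: arbitrary small number for the sketch */

/* Hypothetical userspace stand-ins; none of this is kernel code. */
static bool cpu_online[NR_CPUS];
static bool wd_enabled[NR_CPUS];
static int double_starts;	/* counts what the WARN_ON would catch */

typedef int (*cpu_cb)(unsigned int cpu);

/* Stand-in for start_wd_on_cpu(): complains when started twice. */
static int mock_start_wd_on_cpu(unsigned int cpu)
{
	if (wd_enabled[cpu]) {
		double_starts++;
		return 0;
	}
	wd_enabled[cpu] = true;
	return 0;
}

/* Models cpuhp_setup_state(): register, then call startup on online CPUs. */
static int mock_setup_state(cpu_cb startup)
{
	for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++)
		if (cpu_online[cpu])
			startup(cpu);
	return 0;
}

/* Models cpuhp_setup_state_nocalls(): register only, no callback invoked. */
static int mock_setup_state_nocalls(cpu_cb startup)
{
	(void)startup;
	return 0;
}

/* Models the reconfigure path starting the watchdog on all online CPUs. */
static void mock_reconfigure(void)
{
	for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++)
		if (cpu_online[cpu])
			mock_start_wd_on_cpu(cpu);
}

/* Runs one boot scenario and returns the number of double starts seen. */
int count_double_starts(bool use_nocalls)
{
	for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++) {
		cpu_online[cpu] = (cpu == 0);	/* only the boot CPU is up */
		wd_enabled[cpu] = false;
	}
	double_starts = 0;

	if (use_nocalls) {
		/* Patched order: probe registers first, reconfigure starts. */
		mock_setup_state_nocalls(mock_start_wd_on_cpu);
		mock_reconfigure();
	} else {
		/* Reported order: reconfigure starts, initcall starts again. */
		mock_reconfigure();
		mock_setup_state(mock_start_wd_on_cpu);
	}
	return double_starts;
}
```

In this model count_double_starts(false) sees the boot CPU started twice,
which corresponds to the two backtraces quoted above, while
count_double_starts(true) starts it exactly once.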