Date: Tue, 3 Oct 2017 20:01:26 +1000
From: Nicholas Piggin
To: Thomas Gleixner
Cc: Michael Ellerman, LKML, Ingo Molnar, Peter Zijlstra, Borislav Petkov,
    Andrew Morton, Sebastian Siewior, Don Zickus, Chris Metcalf,
    Ulrich Obergfell, Benjamin Herrenschmidt, linuxppc-dev@lists.ozlabs.org
Subject: Re: [patch V2 22/29] lockup_detector: Make watchdog_nmi_reconfigure() two stage
Message-ID: <20171003200126.358155b7@roar.ozlabs.ibm.com>
References: <20170912193654.321505854@linutronix.de>
    <20170912194147.862865570@linutronix.de>
    <87d165dqew.fsf@concordia.ellerman.id.au>
Organization: IBM
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, 3 Oct 2017 09:04:03 +0200 (CEST)
Thomas Gleixner wrote:

> On Tue, 3 Oct 2017, Thomas Gleixner wrote:
> > On Tue, 3 Oct 2017, Michael Ellerman wrote:
> > > Hi Thomas,
> > > Unfortunately this is hitting the WARN_ON in start_wd_cpu() on powerpc
> > > because we're calling it multiple times for the boot CPU.
> > >
> > > The first call is via:
> > >
> > >   start_wd_on_cpu+0x80/0x2f0
> > >   watchdog_nmi_reconfigure+0x124/0x170
> > >   softlockup_reconfigure_threads+0x110/0x130
> > >   lockup_detector_init+0xbc/0xe0
> > >   kernel_init_freeable+0x18c/0x37c
> > >   kernel_init+0x2c/0x160
> > >   ret_from_kernel_thread+0x5c/0xbc
> > >
> > > And then again via the CPU hotplug registration:
> > >
> > >   start_wd_on_cpu+0x80/0x2f0
> > >   cpuhp_invoke_callback+0x194/0x620
> > >   cpuhp_thread_fun+0x7c/0x1b0
> > >   smpboot_thread_fn+0x290/0x2a0
> > >   kthread+0x168/0x1b0
> > >   ret_from_kernel_thread+0x5c/0xbc
> > >
> > >
> > > The first call is new because previously watchdog_nmi_reconfigure()
> > > wasn't called from softlockup_reconfigure_threads().
> >
> > Hmm, don't you have the same problem with CPU hotplug or do you just get
> > lucky because the hotplug callback in your code is ordered vs. the
> > softlockup thread hotplug callback in a way that this does not hit?

My idea was that, because watchdog_nmi_reconfigure() is only ever called
with get_online_cpus() held, no hotplug callbacks could run concurrently
with it.

>
> Which leads me to the question why you need the hotplug state at all if the
> softlockup detector is enabled. Wouldn't it make more sense to only
> register the state if softlockup detector is turned off in Kconfig and
> actually move it to the core code?

I don't understand exactly what you mean, but it was done to avoid
relying on the softlockup detector at all, because it wasn't needed for
anything else (unlike the perf lockup detector).

Thanks,
Nick
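
For readers following the thread, the double start that Michael's two traces describe can be sketched as a small user-space model. This is not the powerpc watchdog code: every name prefixed with model_ is hypothetical, as are the CPU count and the boolean array standing in for a per-CPU enabled mask. It also assumes that registering the hotplug state invokes the online callback for the already-online boot CPU, which is what the second trace suggests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define MODEL_NR_CPUS 4 /* hypothetical size, not a kernel constant */

/* Hypothetical stand-in for a per-CPU "watchdog enabled" mask. */
static bool model_wd_enabled[MODEL_NR_CPUS];
static int model_warn_count;

/* Models a start routine that warns if the CPU was already started. */
static int model_start_wd_on_cpu(unsigned int cpu)
{
    assert(cpu < MODEL_NR_CPUS);
    if (model_wd_enabled[cpu]) {
        model_warn_count++; /* plays the role of the WARN_ON */
        fprintf(stderr, "WARN: watchdog already started on CPU %u\n", cpu);
        return 0;
    }
    model_wd_enabled[cpu] = true;
    return 0;
}

/* Path 1: the reconfigure step starts the watchdog on each online CPU. */
static void model_reconfigure(unsigned int nr_online)
{
    for (unsigned int cpu = 0; cpu < nr_online; cpu++)
        model_start_wd_on_cpu(cpu);
}

/* Path 2: the hotplug "online" callback for a single CPU. */
static int model_hotplug_online(unsigned int cpu)
{
    return model_start_wd_on_cpu(cpu);
}

/* Replays the boot order from the two traces; returns the warning count. */
static int model_replay_boot(void)
{
    model_warn_count = 0;
    for (unsigned int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
        model_wd_enabled[cpu] = false;

    model_reconfigure(1);    /* only the boot CPU is online at init */
    model_hotplug_online(0); /* assumed: registration runs the callback */
    return model_warn_count;
}
```

Calling model_replay_boot() reports a single warning, for CPU 0, because both paths reach the same start routine for the boot CPU. The model says nothing about which path should own the start; that is the open question in the thread.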