From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751742AbdJCLgp (ORCPT ); Tue, 3 Oct 2017 07:36:45 -0400
Received: from ozlabs.org ([103.22.144.67]:38831 "EHLO ozlabs.org"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1751571AbdJCLgn (ORCPT ); Tue, 3 Oct 2017 07:36:43 -0400
From: Michael Ellerman
To: Thomas Gleixner
Cc: LKML, Ingo Molnar, Peter Zijlstra, Borislav Petkov, Andrew Morton,
	Sebastian Siewior, Nicholas Piggin, Don Zickus, Chris Metcalf,
	Ulrich Obergfell, Benjamin Herrenschmidt, linuxppc-dev@lists.ozlabs.org
Subject: Re: [patch V2 22/29] lockup_detector: Make watchdog_nmi_reconfigure() two stage
In-Reply-To:
References: <20170912193654.321505854@linutronix.de>
	<20170912194147.862865570@linutronix.de>
	<87d165dqew.fsf@concordia.ellerman.id.au>
Date: Tue, 03 Oct 2017 22:36:41 +1100
Message-ID: <87o9pocvjq.fsf@concordia.ellerman.id.au>
MIME-Version: 1.0
Content-Type: text/plain
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Thomas Gleixner writes:
> On Tue, 3 Oct 2017, Michael Ellerman wrote:
>> Hi Thomas,
>> Unfortunately this is hitting the WARN_ON in start_wd_cpu() on powerpc
>> because we're calling it multiple times for the boot CPU.
>>
>> The first call is via:
>>
>>   start_wd_on_cpu+0x80/0x2f0
>>   watchdog_nmi_reconfigure+0x124/0x170
>>   softlockup_reconfigure_threads+0x110/0x130
>>   lockup_detector_init+0xbc/0xe0
>>   kernel_init_freeable+0x18c/0x37c
>>   kernel_init+0x2c/0x160
>>   ret_from_kernel_thread+0x5c/0xbc
>>
>> And then again via the CPU hotplug registration:
>>
>>   start_wd_on_cpu+0x80/0x2f0
>>   cpuhp_invoke_callback+0x194/0x620
>>   cpuhp_thread_fun+0x7c/0x1b0
>>   smpboot_thread_fn+0x290/0x2a0
>>   kthread+0x168/0x1b0
>>   ret_from_kernel_thread+0x5c/0xbc
>>
>>
>> The first call is new because previously watchdog_nmi_reconfigure()
>> wasn't called from softlockup_reconfigure_threads().
>
> Hmm, don't you have the same problem with CPU hotplug or do you just get
> lucky because the hotplug callback in your code is ordered vs. the
> softlockup thread hotplug callback in a way that this does not hit?

I don't see it with CPU hotplug.

AFAICS that's because softlockup_reconfigure_threads() isn't called for
CPU hotplug. Unless there's a path I'm missing?

>> I'm not sure what the easiest fix is. One option would be to just drop
>> the WARN_ON, it's just there for paranoia AFAICS.
>
> The straight forward way is to make use of the new probe function. Patch
> below.

Thanks.

Hmm, I tried that patch, it makes the warning go away. But then I
triggered a deliberate hard lockup and got nothing.

Then I went back to the existing code (in linux-next), and I still get
no warning from a deliberate hard lockup.

So seems there may be some more gremlins. Will test more in the morning.

cheers
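
For reference, the double-start scenario described above can be modelled
in userspace. This is only a rough sketch, not the powerpc watchdog code:
every model_* name, the boolean array standing in for the set of enabled
CPUs, and the two start variants are assumptions made for illustration.
It shows the boot CPU being started once from the init path and once more
from the hotplug callback, which trips a warning in a strict start
routine but not in an idempotent one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define MODEL_NR_CPUS 4

/* Hypothetical stand-in for the set of CPUs whose watchdog is running. */
static bool model_wd_enabled[MODEL_NR_CPUS];
/* Number of times the strict start routine saw a repeated start. */
static int model_warn_count;

/* Strict variant: complains when the CPU was already started. */
static void model_start_wd_strict(int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	if (model_wd_enabled[cpu]) {
		model_warn_count++;
		fprintf(stderr, "WARN: watchdog already started on cpu %d\n", cpu);
		return;
	}
	model_wd_enabled[cpu] = true;
}

/* Idempotent variant: a repeated start is silently ignored. */
static void model_start_wd_idempotent(int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	if (model_wd_enabled[cpu])
		return;
	model_wd_enabled[cpu] = true;
}

/*
 * Model the boot sequence from the two backtraces: the boot CPU (0) is
 * started once from the init-time reconfigure path and once more from
 * the hotplug callback. Returns the number of warnings raised.
 */
static int model_boot(void (*start)(int cpu))
{
	int cpu;

	/* Start from a clean state so runs do not affect each other. */
	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		model_wd_enabled[cpu] = false;
	model_warn_count = 0;

	start(0);	/* init path */
	start(0);	/* hotplug callback path */

	return model_warn_count;
}
```

Calling model_boot(model_start_wd_strict) reports one warning, while
model_boot(model_start_wd_idempotent) reports none. Neither variant says
anything about whether a hard lockup is actually detected afterwards,
which is the open question above.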