From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1753483AbdJaNtA (ORCPT <rfc822;w@1wt.eu>);
        Tue, 31 Oct 2017 09:49:00 -0400
Received: from merlin.infradead.org ([205.233.59.134]:48502 "EHLO
        merlin.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1753056AbdJaNs6 (ORCPT
        <rfc822;linux-kernel@vger.kernel.org>);
        Tue, 31 Oct 2017 09:48:58 -0400
Date: Tue, 31 Oct 2017 14:48:50 +0100
From: Peter Zijlstra <peterz@infradead.org>
To: Guenter Roeck <linux@roeck-us.net>
Cc: Thomas Gleixner <tglx@linutronix.de>, linux-kernel@vger.kernel.org,
        Don Zickus <dzickus@redhat.com>, Ingo Molnar <mingo@kernel.org>
Subject: Re: Crashes in perf_event_ctx_lock_nested
Message-ID: <20171031134850.ynix2zqypmca2mtt@hirez.programming.kicks-ass.net>
References: <20171030224512.GA13592@roeck-us.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20171030224512.GA13592@roeck-us.net>
User-Agent: NeoMutt/20170609 (1.8.3)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Oct 30, 2017 at 03:45:12PM -0700, Guenter Roeck wrote:
> I added some logging and a long msleep() in hardlockup_detector_perf_cleanup().
> Here is the result:
> 
> [    0.274361] NMI watchdog: ############ hardlockup_detector_perf_init
> [    0.274915] NMI watchdog: ############ hardlockup_detector_event_create(0)
> [    0.277049] NMI watchdog: ############ hardlockup_detector_perf_cleanup
> [    0.277593] NMI watchdog: ############ hardlockup_detector_perf_enable(0)
> [    0.278027] NMI watchdog: ############ hardlockup_detector_event_create(0)
> [    1.312044] NMI watchdog: ############ hardlockup_detector_perf_cleanup done
> [    1.385122] NMI watchdog: ############ hardlockup_detector_perf_enable(1)
> [    1.386028] NMI watchdog: ############ hardlockup_detector_event_create(1)
> [    1.466102] NMI watchdog: ############ hardlockup_detector_perf_enable(2)
> [    1.475536] NMI watchdog: ############ hardlockup_detector_event_create(2)
> [    1.535099] NMI watchdog: ############ hardlockup_detector_perf_enable(3)
> [    1.535101] NMI watchdog: ############ hardlockup_detector_event_create(3)

> [    7.222816] NMI watchdog: ############ hardlockup_detector_perf_disable(0)
> [    7.230567] NMI watchdog: ############ hardlockup_detector_perf_disable(1)
> [    7.243138] NMI watchdog: ############ hardlockup_detector_perf_disable(2)
> [    7.250966] NMI watchdog: ############ hardlockup_detector_perf_disable(3)
> [    7.258826] NMI watchdog: ############ hardlockup_detector_perf_enable(1)
> [    7.258827] NMI watchdog: ############ hardlockup_detector_perf_cleanup
> [    7.258831] NMI watchdog: ############ hardlockup_detector_perf_enable(2)
> [    7.258833] NMI watchdog: ############ hardlockup_detector_perf_enable(0)
> [    7.258834] NMI watchdog: ############ hardlockup_detector_event_create(2)
> [    7.258835] NMI watchdog: ############ hardlockup_detector_event_create(0)
> [    7.260169] NMI watchdog: ############ hardlockup_detector_perf_enable(3)
> [    7.260170] NMI watchdog: ############ hardlockup_detector_event_create(3)
> [    7.494251] NMI watchdog: ############ hardlockup_detector_event_create(1)
> [    8.287135] NMI watchdog: ############ hardlockup_detector_perf_cleanup done
> 
> Looks like there are a number of problems: hardlockup_detector_event_create()
> creates the event data structure even if it is already created, 

Right, that does look dodgy. And on its own should be fairly straight
forward to cure. But I'd like to understand the rest of it first.

> and hardlockup_detector_perf_cleanup() runs unprotected and in
> parallel to the enable/create functions.

Well, looking at the code, cpu_maps_update_begin() aka.
cpu_add_remove_lock is serializing cpu_up() and cpu_down() and _should_
thereby also serialize cleanup vs the smp_hotplug_thread operations.

Your trace does indeed indicate this is not the case, but I cannot, from
the code, see how this could happen.

Could you use trace_printk() instead and boot with
"trace_options=stacktrace" ?

> ALso, the following message is seen twice.
> 
> [    0.278758] NMI watchdog: Enabled. Permanently consumes one hw-PMU counter.
> [    7.258838] NMI watchdog: Enabled. Permanently consumes one hw-PMU counter.
> 
> I don't offer a proposed patch since I have no idea how to best solve the
> problem.
> 
> Also, is the repeated enable/disable/cleanup as part of the normal boot
> really necessary ?

That's weird, I don't see that on my machines. We very much only bring
up the CPUs _once_. Also note they're 7s apart. Did you do something
funny like resume-from-disk or so?