LKML Archive on lore.kernel.org
 help / color / Atom feed
From: tip-bot for Thomas Gleixner <tipbot@zytor.com>
To: linux-tip-commits@vger.kernel.org
Cc: hpa@zytor.com, tglx@linutronix.de, linux@roeck-us.net,
	dzickus@redhat.com, linux-kernel@vger.kernel.org,
	mingo@kernel.org, peterz@infradead.org
Subject: [tip:core/urgent] watchdog/harclockup/perf: Revert a33d44843d45 ("watchdog/hardlockup/perf: Simplify deferred event destroy")
Date: Wed, 1 Nov 2017 13:27:53 -0700
Message-ID: <tip-9c388a5ed1960b2ebbebd3dbe7553092b0c15ec1@git.kernel.org> (raw)
In-Reply-To: <alpine.DEB.2.20.1710312145190.1942@nanos>

Commit-ID:  9c388a5ed1960b2ebbebd3dbe7553092b0c15ec1
Gitweb:     https://git.kernel.org/tip/9c388a5ed1960b2ebbebd3dbe7553092b0c15ec1
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 31 Oct 2017 22:32:00 +0100
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 1 Nov 2017 21:18:39 +0100

watchdog/harclockup/perf: Revert a33d44843d45 ("watchdog/hardlockup/perf: Simplify deferred event destroy")

Guenter reported a crash in the watchdog/perf code, which is caused by
cleanup() and enable() running concurrently. The reason for this is:

The watchdog functions are serialized via the watchdog_mutex and cpu
hotplug locking, but the enable of the perf based watchdog happens in
context of the unpark callback of the smpboot thread. But that unpark
function is not synchronous inside the locking. The unparking of the thread
just wakes it up and leaves so there is no guarantee when the thread is
executing.

If it starts running _before_ the cleanup happened then it will create a
event and overwrite the dead event pointer. The new event is then cleaned
up because the event is marked dead.

    lock(watchdog_mutex);
    lockup_detector_reconfigure();
        cpus_read_lock();
	stop();
	   park()
	update();
	start();
	   unpark()
	cpus_read_unlock();		thread runs()
					  overwrite dead event ptr
	cleanup();
	  free new event, which is active inside perf....
    unlock(watchdog_mutex);

The park side is safe as that actually waits for the thread to reach
parked state.

Commit a33d44843d45 removed the protection against this kind of scenario
under the stupid assumption that the hotplug serialization and the
watchdog_mutex cover everything. 

Bring it back.

Reverts: a33d44843d45 ("watchdog/hardlockup/perf: Simplify deferred event destroy")
Reported-and-tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Thomas Feels-stupid Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Don Zickus <dzickus@redhat.com>
Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1710312145190.1942@nanos


---
 kernel/watchdog_hld.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/kernel/watchdog_hld.c b/kernel/watchdog_hld.c
index 71a62ce..a7f137c 100644
--- a/kernel/watchdog_hld.c
+++ b/kernel/watchdog_hld.c
@@ -21,6 +21,7 @@
 static DEFINE_PER_CPU(bool, hard_watchdog_warn);
 static DEFINE_PER_CPU(bool, watchdog_nmi_touch);
 static DEFINE_PER_CPU(struct perf_event *, watchdog_ev);
+static DEFINE_PER_CPU(struct perf_event *, dead_event);
 static struct cpumask dead_events_mask;
 
 static unsigned long hardlockup_allcpu_dumped;
@@ -203,6 +204,8 @@ void hardlockup_detector_perf_disable(void)
 
 	if (event) {
 		perf_event_disable(event);
+		this_cpu_write(watchdog_ev, NULL);
+		this_cpu_write(dead_event, event);
 		cpumask_set_cpu(smp_processor_id(), &dead_events_mask);
 		watchdog_cpus--;
 	}
@@ -218,7 +221,7 @@ void hardlockup_detector_perf_cleanup(void)
 	int cpu;
 
 	for_each_cpu(cpu, &dead_events_mask) {
-		struct perf_event *event = per_cpu(watchdog_ev, cpu);
+		struct perf_event *event = per_cpu(dead_event, cpu);
 
 		/*
 		 * Required because for_each_cpu() reports  unconditionally
@@ -226,7 +229,7 @@ void hardlockup_detector_perf_cleanup(void)
 		 */
 		if (event)
 			perf_event_release_kernel(event);
-		per_cpu(watchdog_ev, cpu) = NULL;
+		per_cpu(dead_event, cpu) = NULL;
 	}
 	cpumask_clear(&dead_events_mask);
 }

  parent reply index

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-10-30 22:45 Crashes in perf_event_ctx_lock_nested Guenter Roeck
2017-10-31 13:48 ` Peter Zijlstra
2017-10-31 17:16   ` Guenter Roeck
2017-10-31 18:50     ` Don Zickus
2017-10-31 20:12       ` Guenter Roeck
2017-10-31 20:23         ` Don Zickus
2017-10-31 21:32   ` Thomas Gleixner
2017-10-31 22:11     ` Guenter Roeck
2017-11-01 18:11       ` Don Zickus
2017-11-01 18:34         ` Guenter Roeck
2017-11-01 19:46         ` [tip:core/urgent] watchdog/hardlockup/perf: Use atomics to track in-use cpu counter tip-bot for Don Zickus
2017-11-01 20:28         ` tip-bot for Don Zickus
2017-11-01 18:22       ` Crashes in perf_event_ctx_lock_nested Thomas Gleixner
2017-11-01  8:14     ` Peter Zijlstra
2017-11-01  8:26       ` Thomas Gleixner
2017-11-01 19:46     ` [tip:core/urgent] watchdog/harclockup/perf: Revert a33d44843d45 ("watchdog/hardlockup/perf: Simplify deferred event destroy") tip-bot for Thomas Gleixner
2017-11-01 20:32       ` Guenter Roeck
2017-11-01 20:52         ` Thomas Gleixner
2017-11-01 20:27     ` tip-bot for Thomas Gleixner [this message]
2017-10-31 18:48 ` Crashes in perf_event_ctx_lock_nested Don Zickus

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=tip-9c388a5ed1960b2ebbebd3dbe7553092b0c15ec1@git.kernel.org \
    --to=tipbot@zytor.com \
    --cc=dzickus@redhat.com \
    --cc=hpa@zytor.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-tip-commits@vger.kernel.org \
    --cc=linux@roeck-us.net \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

LKML Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/lkml/0 lkml/git/0.git
	git clone --mirror https://lore.kernel.org/lkml/1 lkml/git/1.git
	git clone --mirror https://lore.kernel.org/lkml/2 lkml/git/2.git
	git clone --mirror https://lore.kernel.org/lkml/3 lkml/git/3.git
	git clone --mirror https://lore.kernel.org/lkml/4 lkml/git/4.git
	git clone --mirror https://lore.kernel.org/lkml/5 lkml/git/5.git
	git clone --mirror https://lore.kernel.org/lkml/6 lkml/git/6.git
	git clone --mirror https://lore.kernel.org/lkml/7 lkml/git/7.git
	git clone --mirror https://lore.kernel.org/lkml/8 lkml/git/8.git
	git clone --mirror https://lore.kernel.org/lkml/9 lkml/git/9.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 lkml lkml/ https://lore.kernel.org/lkml \
		linux-kernel@vger.kernel.org
	public-inbox-index lkml

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-kernel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git