From: tip-bot for Thomas Gleixner <tipbot@zytor.com> To: linux-tip-commits@vger.kernel.org Cc: linux-kernel@vger.kernel.org, peterz@infradead.org, mingo@kernel.org, hpa@zytor.com, dzickus@redhat.com, linux@roeck-us.net, tglx@linutronix.de Subject: [tip:core/urgent] watchdog/harclockup/perf: Revert a33d44843d45 ("watchdog/hardlockup/perf: Simplify deferred event destroy") Date: Wed, 1 Nov 2017 12:46:00 -0700 [thread overview] Message-ID: <tip-1c294733b7b9f712f78d15cfa75ffdea72b79abb@git.kernel.org> (raw) In-Reply-To: <alpine.DEB.2.20.1710312145190.1942@nanos> Commit-ID: 1c294733b7b9f712f78d15cfa75ffdea72b79abb Gitweb: https://git.kernel.org/tip/1c294733b7b9f712f78d15cfa75ffdea72b79abb Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 31 Oct 2017 22:32:00 +0100 Committer: Thomas Gleixner <tglx@linutronix.de> CommitDate: Wed, 1 Nov 2017 20:41:27 +0100 watchdog/harclockup/perf: Revert a33d44843d45 ("watchdog/hardlockup/perf: Simplify deferred event destroy") Guenter reported a crash in the watchdog/perf code, which is caused by cleanup() and enable() running concurrently. The reason for this is: The watchdog functions are serialized via the watchdog_mutex and cpu hotplug locking, but the enable of the perf based watchdog happens in context of the unpark callback of the smpboot thread. But that unpark function is not synchronous inside the locking. The unparking of the thread just wakes it up and leaves so there is no guarantee when the thread is executing. If it starts running _before_ the cleanup happened then it will create a event and overwrite the dead event pointer. The new event is then cleaned up because the event is marked dead. lock(watchdog_mutex); lockup_detector_reconfigure(); cpus_read_lock(); stop(); park() update(); start(); unpark() cpus_read_unlock(); thread runs() overwrite dead event ptr cleanup(); free new event, which is active inside perf.... unlock(watchdog_mutex); The park side is safe as that actually waits for the thread to reach parked state. Commit a33d44843d45 removed the protection against this kind of scenario under the stupid assumption that the hotplug serialization and the watchdog_mutex cover everything. Bring it back. Reverts: a33d44843d45 ("watchdog/hardlockup/perf: Simplify deferred event destroy") Reported-and-tested-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Thomas Feels-stupid Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Don Zickus <dzickus@redhat.com> Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1710312145190.1942@nanos --- kernel/watchdog_hld.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/kernel/watchdog_hld.c b/kernel/watchdog_hld.c index 71a62ce..f8db56b 100644 --- a/kernel/watchdog_hld.c +++ b/kernel/watchdog_hld.c @@ -21,6 +21,7 @@ static DEFINE_PER_CPU(bool, hard_watchdog_warn); static DEFINE_PER_CPU(bool, watchdog_nmi_touch); static DEFINE_PER_CPU(struct perf_event *, watchdog_ev); +static DEFINE_PER_CPU(struct perf_event *, dead_event); static struct cpumask dead_events_mask; static unsigned long hardlockup_allcpu_dumped; @@ -203,6 +204,8 @@ void hardlockup_detector_perf_disable(void) if (event) { perf_event_disable(event); + this_cpu_write(watchdog_ev, NULL); + this_cpu_write(dead_event, event); cpumask_set_cpu(smp_processor_id(), &dead_events_mask); watchdog_cpus--; } @@ -218,7 +221,7 @@ void hardlockup_detector_perf_cleanup(void) int cpu; for_each_cpu(cpu, &dead_events_mask) { - struct perf_event *event = per_cpu(watchdog_ev, cpu); + struct perf_event *event = per_cpu(dead_event, cpu); /* * Required because for_each_cpu() reports unconditionally @@ -226,7 +229,7 @@ void hardlockup_detector_perf_cleanup(void) */ if (event) perf_event_release_kernel(event); - per_cpu(watchdog_ev, cpu) = NULL; + per_cpu(dead_event_ev, cpu) = NULL; } cpumask_clear(&dead_events_mask); }
next prev parent reply other threads:[~2017-11-01 19:49 UTC|newest] Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top 2017-10-30 22:45 Crashes in perf_event_ctx_lock_nested Guenter Roeck 2017-10-31 13:48 ` Peter Zijlstra 2017-10-31 17:16 ` Guenter Roeck 2017-10-31 18:50 ` Don Zickus 2017-10-31 20:12 ` Guenter Roeck 2017-10-31 20:23 ` Don Zickus 2017-10-31 21:32 ` Thomas Gleixner 2017-10-31 22:11 ` Guenter Roeck 2017-11-01 18:11 ` Don Zickus 2017-11-01 18:34 ` Guenter Roeck 2017-11-01 19:46 ` [tip:core/urgent] watchdog/hardlockup/perf: Use atomics to track in-use cpu counter tip-bot for Don Zickus 2017-11-01 20:28 ` tip-bot for Don Zickus 2017-11-01 18:22 ` Crashes in perf_event_ctx_lock_nested Thomas Gleixner 2017-11-01 8:14 ` Peter Zijlstra 2017-11-01 8:26 ` Thomas Gleixner 2017-11-01 19:46 ` tip-bot for Thomas Gleixner [this message] 2017-11-01 20:32 ` [tip:core/urgent] watchdog/harclockup/perf: Revert a33d44843d45 ("watchdog/hardlockup/perf: Simplify deferred event destroy") Guenter Roeck 2017-11-01 20:52 ` Thomas Gleixner 2017-11-01 20:27 ` tip-bot for Thomas Gleixner 2017-10-31 18:48 ` Crashes in perf_event_ctx_lock_nested Don Zickus
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=tip-1c294733b7b9f712f78d15cfa75ffdea72b79abb@git.kernel.org \ --to=tipbot@zytor.com \ --cc=dzickus@redhat.com \ --cc=hpa@zytor.com \ --cc=linux-kernel@vger.kernel.org \ --cc=linux-tip-commits@vger.kernel.org \ --cc=linux@roeck-us.net \ --cc=mingo@kernel.org \ --cc=peterz@infradead.org \ --cc=tglx@linutronix.de \ --subject='Re: [tip:core/urgent] watchdog/harclockup/perf: Revert a33d44843d45 ("watchdog/hardlockup/perf: Simplify deferred event destroy")' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.