All of lore.kernel.org
 help / color / mirror / Atom feed
From: Alexander Shishkin <alexander.shishkin@linux.intel.com>
To: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ingo Molnar <mingo@redhat.com>,
	linux-kernel@vger.kernel.org, vince@deater.net,
	eranian@google.com, Arnaldo Carvalho de Melo <acme@infradead.org>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>
Subject: [PATCH v2 1/5] perf: Fix a race between mmap_close and set_output of AUX events
Date: Tue,  6 Sep 2016 16:23:49 +0300	[thread overview]
Message-ID: <20160906132353.19887-2-alexander.shishkin@linux.intel.com> (raw)
In-Reply-To: <20160906132353.19887-1-alexander.shishkin@linux.intel.com>

In the mmap_close path we need to stop all the AUX events that are
writing data to the AUX area that we are unmapping, before we can
safely free the pages. To determine if an event needs to be stopped,
we're comparing its ->rb against the one that's getting unmapped.
However, a SET_OUTPUT ioctl may turn up inside an AUX transaction
and swizzle event::rb to some other ring buffer, but the transaction
will keep writing data to the old ring buffer until the event gets
scheduled out. At this point, mmap_close will skip over such an
event and will proceed to free the AUX area, while it's still being
used by this event, which will set off a warning in the mmap_close
path and cause a memory corruption.

To avoid this, always stop an AUX event before its ->rb is updated;
this will release the (potentially) last reference on the AUX area
of the buffer. If the event gets restarted, its new ring buffer will
be used. If another SET_OUTPUT comes and switches it back to the
old ring buffer that's getting unmapped, it's also fine: this
ring buffer's aux_mmap_count will be zero and AUX transactions won't
start any more.

Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
---
 kernel/events/core.c | 31 +++++++++++++++++++++++++------
 1 file changed, 25 insertions(+), 6 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 803481cb6c..71df77e234 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -2464,11 +2464,11 @@ static int __perf_event_stop(void *info)
 	return 0;
 }
 
-static int perf_event_restart(struct perf_event *event)
+static int perf_event_stop(struct perf_event *event, int restart)
 {
 	struct stop_event_data sd = {
 		.event		= event,
-		.restart	= 1,
+		.restart	= restart,
 	};
 	int ret = 0;
 
@@ -4811,6 +4811,19 @@ static void ring_buffer_attach(struct perf_event *event,
 		spin_unlock_irqrestore(&rb->event_lock, flags);
 	}
 
+	/*
+	 * Avoid racing with perf_mmap_close(AUX): stop the event
+	 * before swizzling the event::rb pointer; if it's getting
+	 * unmapped, its aux_mmap_count will be 0 and it won't
+	 * restart. See the comment in __perf_pmu_output_stop().
+	 *
+	 * Data will inevitably be lost when set_output is done in
+	 * mid-air, but then again, whoever does it like this is
+	 * not in for the data anyway.
+	 */
+	if (has_aux(event))
+		perf_event_stop(event, 0);
+
 	rcu_assign_pointer(event->rb, rb);
 
 	if (old_rb) {
@@ -6086,7 +6099,7 @@ static void perf_event_addr_filters_exec(struct perf_event *event, void *data)
 	raw_spin_unlock_irqrestore(&ifh->lock, flags);
 
 	if (restart)
-		perf_event_restart(event);
+		perf_event_stop(event, 1);
 }
 
 void perf_event_exec(void)
@@ -6130,7 +6143,13 @@ static void __perf_event_output_stop(struct perf_event *event, void *data)
 
 	/*
 	 * In case of inheritance, it will be the parent that links to the
-	 * ring-buffer, but it will be the child that's actually using it:
+	 * ring-buffer, but it will be the child that's actually using it.
+	 *
+	 * We are using event::rb to determine if the event should be stopped,
+	 * however this may race with ring_buffer_attach() (through set_output),
+	 * which will make us skip the event that actually needs to be stopped.
+	 * So ring_buffer_attach() has to stop an aux event before re-assigning
+	 * its rb pointer.
 	 */
 	if (rcu_dereference(parent->rb) == rb)
 		ro->err = __perf_event_stop(&sd);
@@ -6653,7 +6672,7 @@ static void __perf_addr_filters_adjust(struct perf_event *event, void *data)
 	raw_spin_unlock_irqrestore(&ifh->lock, flags);
 
 	if (restart)
-		perf_event_restart(event);
+		perf_event_stop(event, 1);
 }
 
 /*
@@ -7831,7 +7850,7 @@ static void perf_event_addr_filters_apply(struct perf_event *event)
 	mmput(mm);
 
 restart:
-	perf_event_restart(event);
+	perf_event_stop(event, 1);
 }
 
 /*
-- 
2.9.3

  reply	other threads:[~2016-09-06 13:28 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-09-06 13:23 [PATCH v2 0/5] perf, bts: Fallout from the fuzzer for perf/urgent Alexander Shishkin
2016-09-06 13:23 ` Alexander Shishkin [this message]
2016-09-10 12:38   ` [tip:perf/core] perf/core: Fix a race between mmap_close() and set_output() of AUX events tip-bot for Alexander Shishkin
2016-09-06 13:23 ` [PATCH v2 2/5] perf: Fix aux_mmap_count vs aux_refcount order Alexander Shishkin
2016-09-10 12:38   ` [tip:perf/core] perf/core: " tip-bot for Alexander Shishkin
2016-09-06 13:23 ` [PATCH v2 3/5] perf/x86/intel/bts: Fix confused ordering of PMU callbacks Alexander Shishkin
2016-09-10 12:39   ` [tip:perf/core] " tip-bot for Alexander Shishkin
2016-09-06 13:23 ` [PATCH v2 4/5] perf/x86/intel/bts: Fix BTS PMI detection Alexander Shishkin
2016-09-10 12:39   ` [tip:perf/core] " tip-bot for Alexander Shishkin
2016-09-20 13:12   ` [PATCH] perf/x86/intel/bts: don't dereference ds unconditionally Sebastian Andrzej Siewior
2016-09-20 13:44     ` Alexander Shishkin
2016-09-20 13:54       ` Alexander Shishkin
2016-09-20 14:13     ` [tip:perf/urgent] perf/x86/intel/bts: Make sure debug store is valid tip-bot for Sebastian Andrzej Siewior
2016-09-06 13:23 ` [PATCH v2 5/5] perf/x86/intel/bts: Kill a silly warning Alexander Shishkin
2016-09-10 12:40   ` [tip:perf/core] " tip-bot for Alexander Shishkin
2016-09-06 17:19 ` [PATCH v2 0/5] perf, bts: Fallout from the fuzzer for perf/urgent Ingo Molnar
2016-09-07  0:13   ` Vince Weaver
2016-09-07 15:20   ` Alexander Shishkin
2016-09-07 15:36     ` Vince Weaver
2016-09-07 16:38       ` Peter Zijlstra
2016-09-07 18:33       ` Alexander Shishkin
2016-09-08  3:36         ` Vince Weaver
2016-09-08  8:51           ` Alexander Shishkin
2016-09-08 12:54             ` Vince Weaver
2016-09-08  6:21     ` Ingo Molnar
2016-09-08  6:23       ` Ingo Molnar
2016-09-08  8:43       ` Alexander Shishkin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160906132353.19887-2-alexander.shishkin@linux.intel.com \
    --to=alexander.shishkin@linux.intel.com \
    --cc=a.p.zijlstra@chello.nl \
    --cc=acme@infradead.org \
    --cc=eranian@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=vince@deater.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.