All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/5] perf, bts: Fallout from the fuzzer for perf/urgent
@ 2016-09-06 13:23 Alexander Shishkin
  2016-09-06 13:23 ` [PATCH v2 1/5] perf: Fix a race between mmap_close and set_output of AUX events Alexander Shishkin
                   ` (5 more replies)
  0 siblings, 6 replies; 27+ messages in thread
From: Alexander Shishkin @ 2016-09-06 13:23 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, linux-kernel, vince, eranian,
	Arnaldo Carvalho de Melo, Alexander Shishkin

Hi,

There were more bugs since the previous version, plus the BTS barriers
got fixed. With these patches, my testcase keeps running and no
spurious NMI warnings pop up any more.

Original story:

Recently Vince has reported warnings and panics coming from the
general direction of AUX tracing. I found two bugs which manifest
similarly, one in intel_bts driver and one in AUX unmapping path.

Both are triggered by racing SET_OUTPUT against mmap_close while
running AUX tracing. I have a test case that set fire to the kernel
within a few seconds by doing this, which I can share if anyone
cares.

These are all good candidates for 4.7-stable and the BTS ones can be
theoretically backported further.

Alexander Shishkin (5):
  perf: Fix a race between mmap_close and set_output of AUX events
  perf: Fix aux_mmap_count vs aux_refcount order
  perf/x86/intel/bts: Fix confused ordering of PMU callbacks
  perf/x86/intel/bts: Fix BTS PMI detection
  perf/x86/intel/bts: Kill a silly warning

 arch/x86/events/intel/bts.c | 123 +++++++++++++++++++++++++++++++++-----------
 kernel/events/core.c        |  31 ++++++++---
 kernel/events/ring_buffer.c |  16 +++---
 3 files changed, 129 insertions(+), 41 deletions(-)

-- 
2.9.3

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v2 1/5] perf: Fix a race between mmap_close and set_output of AUX events
  2016-09-06 13:23 [PATCH v2 0/5] perf, bts: Fallout from the fuzzer for perf/urgent Alexander Shishkin
@ 2016-09-06 13:23 ` Alexander Shishkin
  2016-09-10 12:38   ` [tip:perf/core] perf/core: Fix a race between mmap_close() and set_output() " tip-bot for Alexander Shishkin
  2016-09-06 13:23 ` [PATCH v2 2/5] perf: Fix aux_mmap_count vs aux_refcount order Alexander Shishkin
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 27+ messages in thread
From: Alexander Shishkin @ 2016-09-06 13:23 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, linux-kernel, vince, eranian,
	Arnaldo Carvalho de Melo, Alexander Shishkin

In the mmap_close path we need to stop all the AUX events that are
writing data to the AUX area that we are unmapping, before we can
safely free the pages. To determine if an event needs to be stopped,
we're comparing its ->rb against the one that's getting unmapped.
However, a SET_OUTPUT ioctl may turn up inside an AUX transaction
and swizzle event::rb to some other ring buffer, but the transaction
will keep writing data to the old ring buffer until the event gets
scheduled out. At this point, mmap_close will skip over such an
event and will proceed to free the AUX area, while it's still being
used by this event, which will set off a warning in the mmap_close
path and cause a memory corruption.

To avoid this, always stop an AUX event before its ->rb is updated;
this will release the (potentially) last reference on the AUX area
of the buffer. If the event gets restarted, its new ring buffer will
be used. If another SET_OUTPUT comes and switches it back to the
old ring buffer that's getting unmapped, it's also fine: this
ring buffer's aux_mmap_count will be zero and AUX transactions won't
start any more.

Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
---
 kernel/events/core.c | 31 +++++++++++++++++++++++++------
 1 file changed, 25 insertions(+), 6 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 803481cb6c..71df77e234 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -2464,11 +2464,11 @@ static int __perf_event_stop(void *info)
 	return 0;
 }
 
-static int perf_event_restart(struct perf_event *event)
+static int perf_event_stop(struct perf_event *event, int restart)
 {
 	struct stop_event_data sd = {
 		.event		= event,
-		.restart	= 1,
+		.restart	= restart,
 	};
 	int ret = 0;
 
@@ -4811,6 +4811,19 @@ static void ring_buffer_attach(struct perf_event *event,
 		spin_unlock_irqrestore(&rb->event_lock, flags);
 	}
 
+	/*
+	 * Avoid racing with perf_mmap_close(AUX): stop the event
+	 * before swizzling the event::rb pointer; if it's getting
+	 * unmapped, its aux_mmap_count will be 0 and it won't
+	 * restart. See the comment in __perf_pmu_output_stop().
+	 *
+	 * Data will inevitably be lost when set_output is done in
+	 * mid-air, but then again, whoever does it like this is
+	 * not in for the data anyway.
+	 */
+	if (has_aux(event))
+		perf_event_stop(event, 0);
+
 	rcu_assign_pointer(event->rb, rb);
 
 	if (old_rb) {
@@ -6086,7 +6099,7 @@ static void perf_event_addr_filters_exec(struct perf_event *event, void *data)
 	raw_spin_unlock_irqrestore(&ifh->lock, flags);
 
 	if (restart)
-		perf_event_restart(event);
+		perf_event_stop(event, 1);
 }
 
 void perf_event_exec(void)
@@ -6130,7 +6143,13 @@ static void __perf_event_output_stop(struct perf_event *event, void *data)
 
 	/*
 	 * In case of inheritance, it will be the parent that links to the
-	 * ring-buffer, but it will be the child that's actually using it:
+	 * ring-buffer, but it will be the child that's actually using it.
+	 *
+	 * We are using event::rb to determine if the event should be stopped,
+	 * however this may race with ring_buffer_attach() (through set_output),
+	 * which will make us skip the event that actually needs to be stopped.
+	 * So ring_buffer_attach() has to stop an aux event before re-assigning
+	 * its rb pointer.
 	 */
 	if (rcu_dereference(parent->rb) == rb)
 		ro->err = __perf_event_stop(&sd);
@@ -6653,7 +6672,7 @@ static void __perf_addr_filters_adjust(struct perf_event *event, void *data)
 	raw_spin_unlock_irqrestore(&ifh->lock, flags);
 
 	if (restart)
-		perf_event_restart(event);
+		perf_event_stop(event, 1);
 }
 
 /*
@@ -7831,7 +7850,7 @@ static void perf_event_addr_filters_apply(struct perf_event *event)
 	mmput(mm);
 
 restart:
-	perf_event_restart(event);
+	perf_event_stop(event, 1);
 }
 
 /*
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v2 2/5] perf: Fix aux_mmap_count vs aux_refcount order
  2016-09-06 13:23 [PATCH v2 0/5] perf, bts: Fallout from the fuzzer for perf/urgent Alexander Shishkin
  2016-09-06 13:23 ` [PATCH v2 1/5] perf: Fix a race between mmap_close and set_output of AUX events Alexander Shishkin
@ 2016-09-06 13:23 ` Alexander Shishkin
  2016-09-10 12:38   ` [tip:perf/core] perf/core: " tip-bot for Alexander Shishkin
  2016-09-06 13:23 ` [PATCH v2 3/5] perf/x86/intel/bts: Fix confused ordering of PMU callbacks Alexander Shishkin
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 27+ messages in thread
From: Alexander Shishkin @ 2016-09-06 13:23 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, linux-kernel, vince, eranian,
	Arnaldo Carvalho de Melo, Alexander Shishkin

The order of accesses to ring buffer's aux_mmap_count and aux_refcount
has to be preserved across the users, namely perf_mmap_close() and
perf_aux_output_begin(), otherwise the inversion can result in the latter
holding the last reference to the aux buffer and subsequently free'ing
it in atomic context, triggering a warning.

> ------------[ cut here ]------------
> WARNING: CPU: 0 PID: 257 at kernel/events/ring_buffer.c:541 __rb_free_aux+0x11a/0x130
> CPU: 0 PID: 257 Comm: stopbug Not tainted 4.8.0-rc1+ #2596
> Call Trace:
>  [<ffffffff810f3e0b>] __warn+0xcb/0xf0
>  [<ffffffff810f3f3d>] warn_slowpath_null+0x1d/0x20
>  [<ffffffff8121182a>] __rb_free_aux+0x11a/0x130
>  [<ffffffff812127a8>] rb_free_aux+0x18/0x20
>  [<ffffffff81212913>] perf_aux_output_begin+0x163/0x1e0
>  [<ffffffff8100c33a>] bts_event_start+0x3a/0xd0
>  [<ffffffff8100c42d>] bts_event_add+0x5d/0x80
>  [<ffffffff81203646>] event_sched_in.isra.104+0xf6/0x2f0
>  [<ffffffff8120652e>] group_sched_in+0x6e/0x190
>  [<ffffffff8120694e>] ctx_sched_in+0x2fe/0x5f0
>  [<ffffffff81206ca0>] perf_event_sched_in+0x60/0x80
>  [<ffffffff81206d1b>] ctx_resched+0x5b/0x90
>  [<ffffffff81207281>] __perf_event_enable+0x1e1/0x240
>  [<ffffffff81200639>] event_function+0xa9/0x180
>  [<ffffffff81202000>] ? perf_cgroup_attach+0x70/0x70
>  [<ffffffff8120203f>] remote_function+0x3f/0x50
>  [<ffffffff811971f3>] flush_smp_call_function_queue+0x83/0x150
>  [<ffffffff81197bd3>] generic_smp_call_function_single_interrupt+0x13/0x60
>  [<ffffffff810a6477>] smp_call_function_single_interrupt+0x27/0x40
>  [<ffffffff81a26ea9>] call_function_single_interrupt+0x89/0x90
>  [<ffffffff81120056>] finish_task_switch+0xa6/0x210
>  [<ffffffff81120017>] ? finish_task_switch+0x67/0x210
>  [<ffffffff81a1e83d>] __schedule+0x3dd/0xb50
>  [<ffffffff81a1efe5>] schedule+0x35/0x80
>  [<ffffffff81128031>] sys_sched_yield+0x61/0x70
>  [<ffffffff81a25be5>] entry_SYSCALL_64_fastpath+0x18/0xa8
> ---[ end trace 6235f556f5ea83a9 ]---

This patch puts the checks in perf_aux_output_begin() in the same order
as that of perf_mmap_close().

Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
---
 kernel/events/ring_buffer.c | 16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c
index ae9b90dc9a..71c9194b1f 100644
--- a/kernel/events/ring_buffer.c
+++ b/kernel/events/ring_buffer.c
@@ -330,15 +330,19 @@ void *perf_aux_output_begin(struct perf_output_handle *handle,
 	if (!rb)
 		return NULL;
 
-	if (!rb_has_aux(rb) || !atomic_inc_not_zero(&rb->aux_refcount))
-		goto err;
-
 	/*
-	 * If rb::aux_mmap_count is zero (and rb_has_aux() above went through),
-	 * the aux buffer is in perf_mmap_close(), about to get freed.
+	 * If aux_mmap_count is zero, the aux buffer is in perf_mmap_close(),
+	 * about to get freed, so we leave immediately.
+	 *
+	 * Checking rb::aux_mmap_count and rb::refcount has to be done in
+	 * the same order, see perf_mmap_close. Otherwise we end up freeing
+	 * aux pages in this path, which is a bug, because in_atomic().
 	 */
 	if (!atomic_read(&rb->aux_mmap_count))
-		goto err_put;
+		goto err;
+
+	if (!rb_has_aux(rb) || !atomic_inc_not_zero(&rb->aux_refcount))
+		goto err;
 
 	/*
 	 * Nesting is not supported for AUX area, make sure nested
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v2 3/5] perf/x86/intel/bts: Fix confused ordering of PMU callbacks
  2016-09-06 13:23 [PATCH v2 0/5] perf, bts: Fallout from the fuzzer for perf/urgent Alexander Shishkin
  2016-09-06 13:23 ` [PATCH v2 1/5] perf: Fix a race between mmap_close and set_output of AUX events Alexander Shishkin
  2016-09-06 13:23 ` [PATCH v2 2/5] perf: Fix aux_mmap_count vs aux_refcount order Alexander Shishkin
@ 2016-09-06 13:23 ` Alexander Shishkin
  2016-09-10 12:39   ` [tip:perf/core] " tip-bot for Alexander Shishkin
  2016-09-06 13:23 ` [PATCH v2 4/5] perf/x86/intel/bts: Fix BTS PMI detection Alexander Shishkin
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 27+ messages in thread
From: Alexander Shishkin @ 2016-09-06 13:23 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, linux-kernel, vince, eranian,
	Arnaldo Carvalho de Melo, Alexander Shishkin

The intel_bts driver is using a cpu-local 'started' variable to order
callbacks and PMIs and make sure that AUX transactions don't get messed
up. However, the ordering rules in regard to this variable is a complete
mess, which recently resulted in perf_fuzzer-triggered warnings and
panics.

The general ordering rule that is patch is enforcing is that this
cpu-local variable be set only when the cpu-local AUX transaction is
active; consequently, this variable is to be checked before the AUX
related bits can be touched.

Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
---
 arch/x86/events/intel/bts.c | 104 ++++++++++++++++++++++++++++++++++----------
 1 file changed, 80 insertions(+), 24 deletions(-)

diff --git a/arch/x86/events/intel/bts.c b/arch/x86/events/intel/bts.c
index 0a6e393a2e..61e1d713b1 100644
--- a/arch/x86/events/intel/bts.c
+++ b/arch/x86/events/intel/bts.c
@@ -31,7 +31,17 @@
 struct bts_ctx {
 	struct perf_output_handle	handle;
 	struct debug_store		ds_back;
-	int				started;
+	int				state;
+};
+
+/* BTS context states: */
+enum {
+	/* no ongoing AUX transactions */
+	BTS_STATE_STOPPED = 0,
+	/* AUX transaction is on, BTS tracing is disabled */
+	BTS_STATE_INACTIVE,
+	/* AUX transaction is on, BTS tracing is running */
+	BTS_STATE_ACTIVE,
 };
 
 static DEFINE_PER_CPU(struct bts_ctx, bts_ctx);
@@ -204,6 +214,15 @@ static void bts_update(struct bts_ctx *bts)
 static int
 bts_buffer_reset(struct bts_buffer *buf, struct perf_output_handle *handle);
 
+/*
+ * Ordering PMU callbacks wrt themselves and the PMI is done by means
+ * of bts::state, which:
+ *  - is set when bts::handle::event is valid, that is, between
+ *    perf_aux_output_begin() and perf_aux_output_end();
+ *  - is zero otherwise;
+ *  - is ordered against bts::handle::event with a compiler barrier.
+ */
+
 static void __bts_event_start(struct perf_event *event)
 {
 	struct bts_ctx *bts = this_cpu_ptr(&bts_ctx);
@@ -221,10 +240,13 @@ static void __bts_event_start(struct perf_event *event)
 
 	/*
 	 * local barrier to make sure that ds configuration made it
-	 * before we enable BTS
+	 * before we enable BTS and bts::state goes ACTIVE
 	 */
 	wmb();
 
+	/* INACTIVE/STOPPED -> ACTIVE */
+	WRITE_ONCE(bts->state, BTS_STATE_ACTIVE);
+
 	intel_pmu_enable_bts(config);
 
 }
@@ -251,9 +273,6 @@ static void bts_event_start(struct perf_event *event, int flags)
 
 	__bts_event_start(event);
 
-	/* PMI handler: this counter is running and likely generating PMIs */
-	ACCESS_ONCE(bts->started) = 1;
-
 	return;
 
 fail_end_stop:
@@ -263,30 +282,34 @@ fail_stop:
 	event->hw.state = PERF_HES_STOPPED;
 }
 
-static void __bts_event_stop(struct perf_event *event)
+static void __bts_event_stop(struct perf_event *event, int state)
 {
+	struct bts_ctx *bts = this_cpu_ptr(&bts_ctx);
+
+	/* ACTIVE -> INACTIVE(PMI)/STOPPED(->stop()) */
+	WRITE_ONCE(bts->state, state);
+
 	/*
 	 * No extra synchronization is mandated by the documentation to have
 	 * BTS data stores globally visible.
 	 */
 	intel_pmu_disable_bts();
-
-	if (event->hw.state & PERF_HES_STOPPED)
-		return;
-
-	ACCESS_ONCE(event->hw.state) |= PERF_HES_STOPPED;
 }
 
 static void bts_event_stop(struct perf_event *event, int flags)
 {
 	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct bts_ctx *bts = this_cpu_ptr(&bts_ctx);
-	struct bts_buffer *buf = perf_get_aux(&bts->handle);
+	struct bts_buffer *buf = NULL;
+	int state = READ_ONCE(bts->state);
+
+	if (state == BTS_STATE_ACTIVE)
+		__bts_event_stop(event, BTS_STATE_STOPPED);
 
-	/* PMI handler: don't restart this counter */
-	ACCESS_ONCE(bts->started) = 0;
+	if (state != BTS_STATE_STOPPED)
+		buf = perf_get_aux(&bts->handle);
 
-	__bts_event_stop(event);
+	event->hw.state |= PERF_HES_STOPPED;
 
 	if (flags & PERF_EF_UPDATE) {
 		bts_update(bts);
@@ -296,6 +319,7 @@ static void bts_event_stop(struct perf_event *event, int flags)
 				bts->handle.head =
 					local_xchg(&buf->data_size,
 						   buf->nr_pages << PAGE_SHIFT);
+
 			perf_aux_output_end(&bts->handle, local_xchg(&buf->data_size, 0),
 					    !!local_xchg(&buf->lost, 0));
 		}
@@ -310,8 +334,20 @@ static void bts_event_stop(struct perf_event *event, int flags)
 void intel_bts_enable_local(void)
 {
 	struct bts_ctx *bts = this_cpu_ptr(&bts_ctx);
+	int state = READ_ONCE(bts->state);
+
+	/*
+	 * Here we transition from INACTIVE to ACTIVE;
+	 * if we instead are STOPPED from the interrupt handler,
+	 * stay that way. Can't be ACTIVE here though.
+	 */
+	if (WARN_ON_ONCE(state == BTS_STATE_ACTIVE))
+		return;
+
+	if (state == BTS_STATE_STOPPED)
+		return;
 
-	if (bts->handle.event && bts->started)
+	if (bts->handle.event)
 		__bts_event_start(bts->handle.event);
 }
 
@@ -319,8 +355,15 @@ void intel_bts_disable_local(void)
 {
 	struct bts_ctx *bts = this_cpu_ptr(&bts_ctx);
 
+	/*
+	 * Here we transition from ACTIVE to INACTIVE;
+	 * do nothing for STOPPED or INACTIVE.
+	 */
+	if (READ_ONCE(bts->state) != BTS_STATE_ACTIVE)
+		return;
+
 	if (bts->handle.event)
-		__bts_event_stop(bts->handle.event);
+		__bts_event_stop(bts->handle.event, BTS_STATE_INACTIVE);
 }
 
 static int
@@ -407,9 +450,13 @@ int intel_bts_interrupt(void)
 	struct perf_event *event = bts->handle.event;
 	struct bts_buffer *buf;
 	s64 old_head;
-	int err;
+	int err = -ENOSPC;
 
-	if (!event || !bts->started)
+	/*
+	 * this is wrapped in intel_bts_enable_local/intel_bts_disable_local,
+	 * so we can only be INACTIVE or STOPPED
+	 */
+	if (READ_ONCE(bts->state) == BTS_STATE_STOPPED)
 		return 0;
 
 	buf = perf_get_aux(&bts->handle);
@@ -432,12 +479,21 @@ int intel_bts_interrupt(void)
 			    !!local_xchg(&buf->lost, 0));
 
 	buf = perf_aux_output_begin(&bts->handle, event);
-	if (!buf)
-		return 1;
+	if (buf)
+		err = bts_buffer_reset(buf, &bts->handle);
 
-	err = bts_buffer_reset(buf, &bts->handle);
-	if (err)
-		perf_aux_output_end(&bts->handle, 0, false);
+	if (err) {
+		WRITE_ONCE(bts->state, BTS_STATE_STOPPED);
+
+		if (buf) {
+			/*
+			 * BTS_STATE_STOPPED should be visible before
+			 * cleared handle::event
+			 */
+			barrier();
+			perf_aux_output_end(&bts->handle, 0, false);
+		}
+	}
 
 	return 1;
 }
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v2 4/5] perf/x86/intel/bts: Fix BTS PMI detection
  2016-09-06 13:23 [PATCH v2 0/5] perf, bts: Fallout from the fuzzer for perf/urgent Alexander Shishkin
                   ` (2 preceding siblings ...)
  2016-09-06 13:23 ` [PATCH v2 3/5] perf/x86/intel/bts: Fix confused ordering of PMU callbacks Alexander Shishkin
@ 2016-09-06 13:23 ` Alexander Shishkin
  2016-09-10 12:39   ` [tip:perf/core] " tip-bot for Alexander Shishkin
  2016-09-20 13:12   ` [PATCH] perf/x86/intel/bts: don't dereference ds unconditionally Sebastian Andrzej Siewior
  2016-09-06 13:23 ` [PATCH v2 5/5] perf/x86/intel/bts: Kill a silly warning Alexander Shishkin
  2016-09-06 17:19 ` [PATCH v2 0/5] perf, bts: Fallout from the fuzzer for perf/urgent Ingo Molnar
  5 siblings, 2 replies; 27+ messages in thread
From: Alexander Shishkin @ 2016-09-06 13:23 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, linux-kernel, vince, eranian,
	Arnaldo Carvalho de Melo, Alexander Shishkin

Since BTS doesn't have a dedicated PMI status bit, the driver needs to
take extra care to check for the condition that triggers it to avoid
spurious NMI warnings.

Regardless of the local BTS context state, the only way of knowing that
the NMI is ours is to compare the write pointer against the interrupt
threshold.

Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
---
 arch/x86/events/intel/bts.c | 19 +++++++++++++++----
 1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/arch/x86/events/intel/bts.c b/arch/x86/events/intel/bts.c
index 61e1d713b1..9233edf993 100644
--- a/arch/x86/events/intel/bts.c
+++ b/arch/x86/events/intel/bts.c
@@ -446,26 +446,37 @@ bts_buffer_reset(struct bts_buffer *buf, struct perf_output_handle *handle)
 
 int intel_bts_interrupt(void)
 {
+	struct debug_store *ds = this_cpu_ptr(&cpu_hw_events)->ds;
 	struct bts_ctx *bts = this_cpu_ptr(&bts_ctx);
 	struct perf_event *event = bts->handle.event;
 	struct bts_buffer *buf;
 	s64 old_head;
-	int err = -ENOSPC;
+	int err = -ENOSPC, handled = 0;
+
+	/*
+	 * The only surefire way of knowing if this NMI is ours is by checking
+	 * the write ptr against the PMI threshold.
+	 */
+	if (ds->bts_index >= ds->bts_interrupt_threshold)
+		handled = 1;
 
 	/*
 	 * this is wrapped in intel_bts_enable_local/intel_bts_disable_local,
 	 * so we can only be INACTIVE or STOPPED
 	 */
 	if (READ_ONCE(bts->state) == BTS_STATE_STOPPED)
-		return 0;
+		return handled;
 
 	buf = perf_get_aux(&bts->handle);
+	if (!buf)
+		return handled;
+
 	/*
 	 * Skip snapshot counters: they don't use the interrupt, but
 	 * there's no other way of telling, because the pointer will
 	 * keep moving
 	 */
-	if (!buf || buf->snapshot)
+	if (buf->snapshot)
 		return 0;
 
 	old_head = local_read(&buf->head);
@@ -473,7 +484,7 @@ int intel_bts_interrupt(void)
 
 	/* no new data */
 	if (old_head == local_read(&buf->head))
-		return 0;
+		return handled;
 
 	perf_aux_output_end(&bts->handle, local_xchg(&buf->data_size, 0),
 			    !!local_xchg(&buf->lost, 0));
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v2 5/5] perf/x86/intel/bts: Kill a silly warning
  2016-09-06 13:23 [PATCH v2 0/5] perf, bts: Fallout from the fuzzer for perf/urgent Alexander Shishkin
                   ` (3 preceding siblings ...)
  2016-09-06 13:23 ` [PATCH v2 4/5] perf/x86/intel/bts: Fix BTS PMI detection Alexander Shishkin
@ 2016-09-06 13:23 ` Alexander Shishkin
  2016-09-10 12:40   ` [tip:perf/core] " tip-bot for Alexander Shishkin
  2016-09-06 17:19 ` [PATCH v2 0/5] perf, bts: Fallout from the fuzzer for perf/urgent Ingo Molnar
  5 siblings, 1 reply; 27+ messages in thread
From: Alexander Shishkin @ 2016-09-06 13:23 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, linux-kernel, vince, eranian,
	Arnaldo Carvalho de Melo, Alexander Shishkin

At the moment, intel_bts will WARN() out if there is more than one
event writing to the same ring buffer, via SET_OUTPUT, and will only
send data from one event to a buffer.

There is no reason to have this warning in, so kill it.

Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
---
 arch/x86/events/intel/bts.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/x86/events/intel/bts.c b/arch/x86/events/intel/bts.c
index 9233edf993..bdcd651099 100644
--- a/arch/x86/events/intel/bts.c
+++ b/arch/x86/events/intel/bts.c
@@ -378,8 +378,6 @@ bts_buffer_reset(struct bts_buffer *buf, struct perf_output_handle *handle)
 		return 0;
 
 	head = handle->head & ((buf->nr_pages << PAGE_SHIFT) - 1);
-	if (WARN_ON_ONCE(head != local_read(&buf->head)))
-		return -EINVAL;
 
 	phys = &buf->buf[buf->cur_buf];
 	space = phys->offset + phys->displacement + phys->size - head;
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 0/5] perf, bts: Fallout from the fuzzer for perf/urgent
  2016-09-06 13:23 [PATCH v2 0/5] perf, bts: Fallout from the fuzzer for perf/urgent Alexander Shishkin
                   ` (4 preceding siblings ...)
  2016-09-06 13:23 ` [PATCH v2 5/5] perf/x86/intel/bts: Kill a silly warning Alexander Shishkin
@ 2016-09-06 17:19 ` Ingo Molnar
  2016-09-07  0:13   ` Vince Weaver
  2016-09-07 15:20   ` Alexander Shishkin
  5 siblings, 2 replies; 27+ messages in thread
From: Ingo Molnar @ 2016-09-06 17:19 UTC (permalink / raw)
  To: Alexander Shishkin
  Cc: Peter Zijlstra, Ingo Molnar, linux-kernel, vince, eranian,
	Arnaldo Carvalho de Melo


* Alexander Shishkin <alexander.shishkin@linux.intel.com> wrote:

> Hi,
> 
> There were more bugs since the previous version, plus the BTS barriers got 
> fixed. With these patches, my testcase keeps running and no spurious NMI 
> warnings pop up any more.

Could you please also run the fuzzer that Vince uses, does it now pass on hardware 
you have access to?

I'd like to make "passes the fuzzer" a standard requirement before new changes are 
accepted to perf core.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 0/5] perf, bts: Fallout from the fuzzer for perf/urgent
  2016-09-06 17:19 ` [PATCH v2 0/5] perf, bts: Fallout from the fuzzer for perf/urgent Ingo Molnar
@ 2016-09-07  0:13   ` Vince Weaver
  2016-09-07 15:20   ` Alexander Shishkin
  1 sibling, 0 replies; 27+ messages in thread
From: Vince Weaver @ 2016-09-07  0:13 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Alexander Shishkin, Peter Zijlstra, Ingo Molnar, linux-kernel,
	eranian, Arnaldo Carvalho de Melo, vincent.weaver

On Tue, 6 Sep 2016, Ingo Molnar wrote:

> > There were more bugs since the previous version, plus the BTS barriers got 
> > fixed. With these patches, my testcase keeps running and no spurious NMI 
> > warnings pop up any more.
> 
> Could you please also run the fuzzer that Vince uses, does it now pass 
> on hardware you have access to?

I ran 4.8-rc5 with these patches applied on a Skylake machine that shows 
the various bts problems pretty quickly normally.

It ran a while and didn't hit any of them, but it did fall over with RCU 
errors that are (probably) unrelated.  

I'd have a better report but I'm having a lot of trouble getting kernel 
oopses to appear on the serial console with this new install on the 
skylake machine.  It may not be totally systemd's fault but I am going to 
blame it anyway.

Vince

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 0/5] perf, bts: Fallout from the fuzzer for perf/urgent
  2016-09-06 17:19 ` [PATCH v2 0/5] perf, bts: Fallout from the fuzzer for perf/urgent Ingo Molnar
  2016-09-07  0:13   ` Vince Weaver
@ 2016-09-07 15:20   ` Alexander Shishkin
  2016-09-07 15:36     ` Vince Weaver
  2016-09-08  6:21     ` Ingo Molnar
  1 sibling, 2 replies; 27+ messages in thread
From: Alexander Shishkin @ 2016-09-07 15:20 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra
  Cc: Peter Zijlstra, Ingo Molnar, linux-kernel, vince, eranian,
	Arnaldo Carvalho de Melo

Ingo Molnar <mingo@kernel.org> writes:

> * Alexander Shishkin <alexander.shishkin@linux.intel.com> wrote:
>
>> Hi,
>> 
>> There were more bugs since the previous version, plus the BTS barriers got 
>> fixed. With these patches, my testcase keeps running and no spurious NMI 
>> warnings pop up any more.
>
> Could you please also run the fuzzer that Vince uses, does it now pass on hardware 
> you have access to?

Sure. And yes, I did catch a warning, which calls for one more patch
(below). Also one unrelated thing in PEBS that Peter fixed.

> I'd like to make "passes the fuzzer" a standard requirement before new changes are 
> accepted to perf core.

Let's make it so.

For the sake of consistency, this one needs to go before 3/5. I'll
re-send the whole series, though, if need be. I've got 2 perf_fuzzers
running on this meanwhile.

>From c170c607bfdc3804578033faa43f342a5d95eb6c Mon Sep 17 00:00:00 2001
From: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Date: Wed, 7 Sep 2016 18:05:08 +0300
Subject: [PATCH] perf/x86/intel: Don't disable "intel_bts" around "intel"
 event batching

At the moment, intel_bts events get disabled from intel PMU's disable
callback, which includes event scheduling transactions of said PMU,
which have nothing to do with intel_bts events.

We do want to keep intel_bts events off inside the PMI handler to
avoid filling up their buffer too soon.

This patch moves intel_bts enabling/disabling directly to the PMI
handler.

Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
---
 arch/x86/events/intel/core.c | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 88792f846d..e2d71513c9 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -1730,9 +1730,11 @@ static __initconst const u64 knl_hw_cache_extra_regs
  * disabled state if called consecutively.
  *
  * During consecutive calls, the same disable value will be written to related
- * registers, so the PMU state remains unchanged. hw.state in
- * intel_bts_disable_local will remain PERF_HES_STOPPED too in consecutive
- * calls.
+ * registers, so the PMU state remains unchanged.
+ *
+ * intel_bts events don't coexist with intel pmu's BTS events because of
+ * x86_add_exclusive(x86_lbr_exclusive_lbr); there's no need to keep them
+ * disabled around intel pmu's event batching etc, only inside the PMI handler.
  */
 static void __intel_pmu_disable_all(void)
 {
@@ -1742,8 +1744,6 @@ static void __intel_pmu_disable_all(void)
 
 	if (test_bit(INTEL_PMC_IDX_FIXED_BTS, cpuc->active_mask))
 		intel_pmu_disable_bts();
-	else
-		intel_bts_disable_local();
 
 	intel_pmu_pebs_disable_all();
 }
@@ -1771,8 +1771,7 @@ static void __intel_pmu_enable_all(int added, bool pmi)
 			return;
 
 		intel_pmu_enable_bts(event->hw.config);
-	} else
-		intel_bts_enable_local();
+	}
 }
 
 static void intel_pmu_enable_all(int added)
@@ -2076,6 +2075,7 @@ static int intel_pmu_handle_irq(struct pt_regs *regs)
 	 */
 	if (!x86_pmu.late_ack)
 		apic_write(APIC_LVTPC, APIC_DM_NMI);
+	intel_bts_disable_local();
 	__intel_pmu_disable_all();
 	handled = intel_pmu_drain_bts_buffer();
 	handled += intel_bts_interrupt();
@@ -2175,6 +2175,7 @@ done:
 	/* Only restore PMU state when it's active. See x86_pmu_disable(). */
 	if (cpuc->enabled)
 		__intel_pmu_enable_all(0, true);
+	intel_bts_enable_local();
 
 	/*
 	 * Only unmask the NMI after the overflow counters
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 0/5] perf, bts: Fallout from the fuzzer for perf/urgent
  2016-09-07 15:20   ` Alexander Shishkin
@ 2016-09-07 15:36     ` Vince Weaver
  2016-09-07 16:38       ` Peter Zijlstra
  2016-09-07 18:33       ` Alexander Shishkin
  2016-09-08  6:21     ` Ingo Molnar
  1 sibling, 2 replies; 27+ messages in thread
From: Vince Weaver @ 2016-09-07 15:36 UTC (permalink / raw)
  To: Alexander Shishkin
  Cc: Ingo Molnar, Peter Zijlstra, Ingo Molnar, linux-kernel, eranian,
	Arnaldo Carvalho de Melo, vincent.weaver

On Wed, 7 Sep 2016, Alexander Shishkin wrote:

> Sure. And yes, I did catch a warning, which calls for one more patch
> (below). Also one unrelated thing in PEBS that Peter fixed.

Does that fix this which I just got on my skylake machine (4.8-rc5 with 
your other 5 patches applied)

[ 5351.822559] WARNING: CPU: 3 PID: 19191 at arch/x86/events/intel/bts.c:344 event_function+0xa1/0x160
[ 5351.823895] CPU: 3 PID: 19191 Comm: perf_fuzzer Not tainted 4.8.0-rc5+ #3
[ 5351.823896] Hardware name: LENOVO 10FY0017US/SKYBAY, BIOS FWKT53A   06/06/2016
[ 5351.823897]  0000000000000086 000000001a69ecd3 ffffffff8cf1f9f5 0000000000000000
[ 5351.823898]  0000000000000000 ffffffff8cc7b624 ffffcaa8ffcc26f0 ffff9c676e1aba00
[ 5351.823900]  ffff9c6771222800 0000000000000000 ffff9c676e1aba00 ffff9c676f6f7e40
[ 5351.823901] Call Trace:
[ 5351.823905]  [<ffffffff8cf1f9f5>] ? dump_stack+0x5c/0x77
[ 5351.823907]  [<ffffffff8cc7b624>] ? __warn+0xc4/0xe0
[ 5351.823909]  [<ffffffff8cd624d1>] ? event_function+0xa1/0x160
[ 5351.823910]  [<ffffffff8cd69db0>] ? ctx_resched+0x50/0x50
[ 5351.823911]  [<ffffffff8cd63a3a>] ? remote_function+0x3a/0x40
[ 5351.823913]  [<ffffffff8ccfaecd>] ? generic_exec_single+0x9d/0x100
[ 5351.823914]  [<ffffffff8ccfaea1>] ? generic_exec_single+0x71/0x100
[ 5351.823916]  [<ffffffff8cd63a00>] ? perf_cgroup_attach+0x70/0x70
[ 5351.823917]  [<ffffffff8ccfaffd>] ? smp_call_function_single+0xcd/0x130
[ 5351.823918]  [<ffffffff8ccfaffd>] ? smp_call_function_single+0xcd/0x130
[ 5351.823919]  [<ffffffff8cd628b9>] ? task_function_call+0x49/0x70
[ 5351.823920]  [<ffffffff8cd62430>] ? cpu_clock_event_read+0x10/0x10
[ 5351.823921]  [<ffffffff8cd67438>] ? event_function_call+0x98/0x100
[ 5351.823922]  [<ffffffff8cd69db0>] ? ctx_resched+0x50/0x50
[ 5351.823923]  [<ffffffff8cd67530>] ? perf_event_disable+0x30/0x30
[ 5351.823924]  [<ffffffff8cd62642>] ? perf_event_for_each_child+0x32/0x90
[ 5351.823925]  [<ffffffff8cd6aa51>] ? perf_event_task_enable+0x61/0xb0
[ 5351.823927]  [<ffffffff8cc9066e>] ? SyS_prctl+0x2ae/0x470
[ 5351.823929]  [<ffffffff8cc03b6f>] ? do_syscall_64+0x5f/0x160
[ 5351.823930]  [<ffffffff8d1e7b65>] ? entry_SYSCALL64_slow_path+0x25/0x25
[ 5351.823931] ---[ end trace bc7b0b7d0c024d60 ]---

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 0/5] perf, bts: Fallout from the fuzzer for perf/urgent
  2016-09-07 15:36     ` Vince Weaver
@ 2016-09-07 16:38       ` Peter Zijlstra
  2016-09-07 18:33       ` Alexander Shishkin
  1 sibling, 0 replies; 27+ messages in thread
From: Peter Zijlstra @ 2016-09-07 16:38 UTC (permalink / raw)
  To: Vince Weaver
  Cc: Alexander Shishkin, Ingo Molnar, Ingo Molnar, linux-kernel,
	eranian, Arnaldo Carvalho de Melo, vincent.weaver

On Wed, Sep 07, 2016 at 11:36:48AM -0400, Vince Weaver wrote:
> On Wed, 7 Sep 2016, Alexander Shishkin wrote:
> 
> >  Also one unrelated thing in PEBS that Peter fixed.

---
Subject: perf,x86: Fix PEBSv3 record drain
From: Peter Zijlstra <peterz@infradead.org>
Date: Wed Sep 7 14:42:55 CEST 2016

Alexander hit the WARN_ON_ONCE(!event) on his Skylake while running
the fuzzer.

This means the PEBSv3 record included a status bit for an inactive
event, something that _should_ not happen.

Move the code that filters the status bits against our known PEBS
events up a spot to guarantee we only deal with events we know about.

Further add "continue" statements to the WARN_ON_ONCE()s such that
we'll not die nor generate silly events in case we ever do hit them
again.

Cc: Vince Weaver <vince@deater.net>
Cc: Stephane Eranian <eranian@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: stable@vger.kernel.org
Fixes: a3d86542de88 ("perf/x86/intel/pebs: Add PEBSv3 decoding")
Reported-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Tested-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/events/intel/ds.c |   19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)

--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -1312,18 +1312,18 @@ static void intel_pmu_drain_pebs_nhm(str
 		struct pebs_record_nhm *p = at;
 		u64 pebs_status;
 
-		/* PEBS v3 has accurate status bits */
+		pebs_status = p->status & cpuc->pebs_enabled;
+		pebs_status &= (1ULL << x86_pmu.max_pebs_events) - 1;
+
+		/* PEBS v3 has more accurate status bits */
 		if (x86_pmu.intel_cap.pebs_format >= 3) {
-			for_each_set_bit(bit, (unsigned long *)&p->status,
-					 MAX_PEBS_EVENTS)
+			for_each_set_bit(bit, (unsigned long *)&pebs_status,
+					 x86_pmu.max_pebs_events)
 				counts[bit]++;
 
 			continue;
 		}
 
-		pebs_status = p->status & cpuc->pebs_enabled;
-		pebs_status &= (1ULL << x86_pmu.max_pebs_events) - 1;
-
 		/*
 		 * On some CPUs the PEBS status can be zero when PEBS is
 		 * racing with clearing of GLOBAL_STATUS.
@@ -1371,8 +1371,11 @@ static void intel_pmu_drain_pebs_nhm(str
 			continue;
 
 		event = cpuc->events[bit];
-		WARN_ON_ONCE(!event);
-		WARN_ON_ONCE(!event->attr.precise_ip);
+		if (WARN_ON_ONCE(!event))
+			continue;
+
+		if (WARN_ON_ONCE(!event->attr.precise_ip))
+			continue;
 
 		/* log dropped samples number */
 		if (error[bit])

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 0/5] perf, bts: Fallout from the fuzzer for perf/urgent
  2016-09-07 15:36     ` Vince Weaver
  2016-09-07 16:38       ` Peter Zijlstra
@ 2016-09-07 18:33       ` Alexander Shishkin
  2016-09-08  3:36         ` Vince Weaver
  1 sibling, 1 reply; 27+ messages in thread
From: Alexander Shishkin @ 2016-09-07 18:33 UTC (permalink / raw)
  To: Vince Weaver
  Cc: Ingo Molnar, Peter Zijlstra, Ingo Molnar, linux-kernel, eranian,
	Arnaldo Carvalho de Melo, vincent.weaver

Vince Weaver <vince@deater.net> writes:

> On Wed, 7 Sep 2016, Alexander Shishkin wrote:
>
>> Sure. And yes, I did catch a warning, which calls for one more patch
>> (below). Also one unrelated thing in PEBS that Peter fixed.
>
> Does that fix this which I just got on my skylake machine (4.8-rc5 with 
> your other 5 patches applied)
>
> [ 5351.822559] WARNING: CPU: 3 PID: 19191 at arch/x86/events/intel/bts.c:344 event_function+0xa1/0x160

Yes, it fixes a problem that triggered this warning. Can't tell from the
absence of backtrace what's going on here, though.

Regards,
--
Alex

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 0/5] perf, bts: Fallout from the fuzzer for perf/urgent
  2016-09-07 18:33       ` Alexander Shishkin
@ 2016-09-08  3:36         ` Vince Weaver
  2016-09-08  8:51           ` Alexander Shishkin
  0 siblings, 1 reply; 27+ messages in thread
From: Vince Weaver @ 2016-09-08  3:36 UTC (permalink / raw)
  To: Alexander Shishkin
  Cc: Ingo Molnar, Peter Zijlstra, Ingo Molnar, linux-kernel, eranian,
	Arnaldo Carvalho de Melo, vincent.weaver

On Wed, 7 Sep 2016, Alexander Shishkin wrote:

> Vince Weaver <vince@deater.net> writes:
> 
> > On Wed, 7 Sep 2016, Alexander Shishkin wrote:
> >
> >> Sure. And yes, I did catch a warning, which calls for one more patch
> >> (below). Also one unrelated thing in PEBS that Peter fixed.
> >
> > Does that fix this which I just got on my skylake machine (4.8-rc5 with 
> > your other 5 patches applied)
> >
> > [ 5351.822559] WARNING: CPU: 3 PID: 19191 at arch/x86/events/intel/bts.c:344 event_function+0xa1/0x160
> 
> Yes, it fixes a problem that triggered this warning. Can't tell from the
> absence of backtrace what's going on here, though.

On the skylake machine with the original 5 patches I got this after 
continuing to fuzz.  Sorry about the lack of frame pointer, next
compile will have it enabled.

If it matters, prior to this I hit the unrelated
[25510.278199] WARNING: CPU: 1 PID: 25405 at kernel/events/core.c:3554 perf_event_read+0x18f/0x1a0


[28682.174684] WARNING: CPU: 7 PID: 31992 at kernel/events/core.c:4961 perf_mmap_close+0x2e1/0x2f0
[28682.280579] CPU: 7 PID: 31992 Comm: perf_fuzzer Tainted: G        W       4.8.0-rc5+ #3
[28682.288739] Hardware name: LENOVO 10FY0017US/SKYBAY, BIOS FWKT53A   06/06/2016
[28682.296096]  0000000000000286 000000008e07b373 ffffffff8cf1f9f5 0000000000000000
[28682.303738]  0000000000000000 ffffffff8cc7b624 ffff9c676e780e00 ffff9c676e029280
[28682.311362]  ffff9c6772ab3398 ffff9c676e029000 ffff9c676e780bb0 ffff9c676f7c0280
[28682.319036] Call Trace:
[28682.321529]  [<ffffffff8cf1f9f5>] ? dump_stack+0x5c/0x77
[28682.326939]  [<ffffffff8cc7b624>] ? __warn+0xc4/0xe0
[28682.332018]  [<ffffffff8cd6dc01>] ? perf_mmap_close+0x2e1/0x2f0
[28682.338083]  [<ffffffff8cd664b0>] ? perf_iterate_ctx+0x150/0x150
[28682.344218]  [<ffffffff8cdb5e7d>] ? remove_vma+0x2d/0x70
[28682.349662]  [<ffffffff8cdb84d6>] ? do_munmap+0x246/0x400
[28682.355197]  [<ffffffff8cdb8c1b>] ? SyS_munmap+0x4b/0x70
[28682.360612]  [<ffffffff8cc03b6f>] ? do_syscall_64+0x5f/0x160
[28682.366404]  [<ffffffff8d1e7b65>] ? entry_SYSCALL64_slow_path+0x25/0x25
[28682.373149] ---[ end trace bc7b0b7d0c024d62 ]---
[28682.377975] ------------[ cut here ]------------
[28682.382681] WARNING: CPU: 4 PID: 0 at kernel/events/ring_buffer.c:543 __rb_free_aux+0x110/0x120
[28682.489412] CPU: 4 PID: 0 Comm: swapper/4 Tainted: G        W       4.8.0-rc5+ #3
[28682.498106] Hardware name: LENOVO 10FY0017US/SKYBAY, BIOS FWKT53A   06/06/2016
[28682.506518]  0000000000000086 6d2f07ae6326ed5e ffffffff8cf1f9f5 0000000000000000
[28682.515223]  0000000000000000 ffffffff8cc7b624 ffff9c676e780e00 ffff9c677dd0b740
[28682.523996]  0000000000000000 ffff9c676e029000 ffff9c676e780e00 00001a16168dab54
[28682.532867] Call Trace:
[28682.536458]  <IRQ>  [<ffffffff8cf1f9f5>] ? dump_stack+0x5c/0x77
[28682.543615]  [<ffffffff8cc7b624>] ? __warn+0xc4/0xe0
[28682.549741]  [<ffffffff8cd700e0>] ? __rb_free_aux+0x110/0x120
[28682.556829]  [<ffffffff8cd70f3a>] ? perf_aux_output_end+0xba/0x100
[28682.564312]  [<ffffffff8cc0c9e1>] ? bts_event_stop+0xc1/0x130
[28682.571238]  [<ffffffff8cd63a00>] ? perf_cgroup_attach+0x70/0x70
[28682.578402]  [<ffffffff8cd621f0>] ? __perf_event_stop+0x40/0x60
[28682.585484]  [<ffffffff8cd63a3a>] ? remote_function+0x3a/0x40
[28682.592644]  [<ffffffff8ccfad46>] ? flush_smp_call_function_queue+0x76/0x160
[28682.600867]  [<ffffffff8cc4e6e9>] ? smp_trace_call_function_single_interrupt+0x29/0xc0
[28682.609914]  [<ffffffff8d1e9ce2>] ? trace_call_function_single_interrupt+0x82/0x90
[28682.618608]  <EOI>  [<ffffffff8d0a9d13>] ? cpuidle_enter_state+0x113/0x260
[28682.626570]  [<ffffffff8ccbd7dc>] ? cpu_startup_entry+0x2cc/0x370
[28682.633675]  [<ffffffff8cc4f07d>] ? start_secondary+0x14d/0x190
[28682.640749] ---[ end trace bc7b0b7d0c024d63 ]---

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 0/5] perf, bts: Fallout from the fuzzer for perf/urgent
  2016-09-07 15:20   ` Alexander Shishkin
  2016-09-07 15:36     ` Vince Weaver
@ 2016-09-08  6:21     ` Ingo Molnar
  2016-09-08  6:23       ` Ingo Molnar
  2016-09-08  8:43       ` Alexander Shishkin
  1 sibling, 2 replies; 27+ messages in thread
From: Ingo Molnar @ 2016-09-08  6:21 UTC (permalink / raw)
  To: Alexander Shishkin
  Cc: Peter Zijlstra, Ingo Molnar, linux-kernel, vince, eranian,
	Arnaldo Carvalho de Melo


* Alexander Shishkin <alexander.shishkin@linux.intel.com> wrote:

> Ingo Molnar <mingo@kernel.org> writes:
> 
> > * Alexander Shishkin <alexander.shishkin@linux.intel.com> wrote:
> >
> >> Hi,
> >> 
> >> There were more bugs since the previous version, plus the BTS barriers got 
> >> fixed. With these patches, my testcase keeps running and no spurious NMI 
> >> warnings pop up any more.
> >
> > Could you please also run the fuzzer that Vince uses, does it now pass on hardware 
> > you have access to?
> 
> Sure. And yes, I did catch a warning, which calls for one more patch
> (below). Also one unrelated thing in PEBS that Peter fixed.
> 
> > I'd like to make "passes the fuzzer" a standard requirement before new changes are 
> > accepted to perf core.
> 
> Let's make it so.
> 
> For the sake of consistency, this one needs to go before 3/5. I'll
> re-send the whole series, though, if need be. I've got 2 perf_fuzzers
> running on this meanwhile.

Yeah, please re-send it - and please also Vince's Reported-by tag to all commits 
that would explain failures that Vince reported.

Also, please document how much and what type of fuzzer testing the series got: 
fuzzer version, time it ran and (rough) hardware it ran on would be useful. (That 
way we can look back later on whether there was any fuzzer testing on AMD systems 
for example, which you might not be able to perform.)

It would also be very, very nice to also add a Documentation/perf/testing.txt step 
by step ELI5 style document that explains how to set up and run the fuzzer!

> + * registers, so the PMU state remains unchanged.
> + *
> + * intel_bts events don't coexist with intel pmu's BTS events because of
> + * x86_add_exclusive(x86_lbr_exclusive_lbr); there's no need to keep them
> + * disabled around intel pmu's event batching etc, only inside the PMI handler.

Pet peeve nit: please capitalize 'PMU' correctly and consistently (upper case). 
This paragraph has both variants: "PMU" and "pmu" which is the worst variant 
really.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 0/5] perf, bts: Fallout from the fuzzer for perf/urgent
  2016-09-08  6:21     ` Ingo Molnar
@ 2016-09-08  6:23       ` Ingo Molnar
  2016-09-08  8:43       ` Alexander Shishkin
  1 sibling, 0 replies; 27+ messages in thread
From: Ingo Molnar @ 2016-09-08  6:23 UTC (permalink / raw)
  To: Alexander Shishkin
  Cc: Peter Zijlstra, Ingo Molnar, linux-kernel, vince, eranian,
	Arnaldo Carvalho de Melo


* Ingo Molnar <mingo@kernel.org> wrote:

> Yeah, please re-send it - and please also Vince's Reported-by tag to all commits 
                                           ^--- add
> that would explain failures that Vince reported.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 0/5] perf, bts: Fallout from the fuzzer for perf/urgent
  2016-09-08  6:21     ` Ingo Molnar
  2016-09-08  6:23       ` Ingo Molnar
@ 2016-09-08  8:43       ` Alexander Shishkin
  1 sibling, 0 replies; 27+ messages in thread
From: Alexander Shishkin @ 2016-09-08  8:43 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Peter Zijlstra, Ingo Molnar, linux-kernel, vince, eranian,
	Arnaldo Carvalho de Melo

Ingo Molnar <mingo@kernel.org> writes:

> * Alexander Shishkin <alexander.shishkin@linux.intel.com> wrote:
>
>> Ingo Molnar <mingo@kernel.org> writes:
>> 
>> > * Alexander Shishkin <alexander.shishkin@linux.intel.com> wrote:
>> >
>> >> Hi,
>> >> 
>> >> There were more bugs since the previous version, plus the BTS barriers got 
>> >> fixed. With these patches, my testcase keeps running and no spurious NMI 
>> >> warnings pop up any more.
>> >
>> > Could you please also run the fuzzer that Vince uses, does it now pass on hardware 
>> > you have access to?
>> 
>> Sure. And yes, I did catch a warning, which calls for one more patch
>> (below). Also one unrelated thing in PEBS that Peter fixed.
>> 
>> > I'd like to make "passes the fuzzer" a standard requirement before new changes are 
>> > accepted to perf core.
>> 
>> Let's make it so.
>> 
>> For the sake of consistency, this one needs to go before 3/5. I'll
>> re-send the whole series, though, if need be. I've got 2 perf_fuzzers
>> running on this meanwhile.
>
> Yeah, please re-send it - and please also Vince's Reported-by tag to all commits 
> that would explain failures that Vince reported.
>
> Also, please document how much and what type of fuzzer testing the series got: 
> fuzzer version, time it ran and (rough) hardware it ran on would be useful. (That 
> way we can look back later on whether there was any fuzzer testing on AMD systems 
> for example, which you might not be able to perform.)

Not sure if run time is useful with the fuzzer, but otherwise seems
reasonable. How do we put this stuff into a commit message, though?

Perf-Fuzzer-ID: d18d23aae6

?

> It would also be very, very nice to also add a Documentation/perf/testing.txt step 
> by step ELI5 style document that explains how to set up and run the fuzzer!

Actually, I'd like that too as I'm not sure I'm doing it 100% right (had
to patch it some time ago to stop it from segfaulting).

>> + * intel_bts events don't coexist with intel pmu's BTS events because of
>> + * x86_add_exclusive(x86_lbr_exclusive_lbr); there's no need to keep them
>> + * disabled around intel pmu's event batching etc, only inside the PMI handler.
>
> Pet peeve nit: please capitalize 'PMU' correctly and consistently (upper case). 
> This paragraph has both variants: "PMU" and "pmu" which is the worst variant 
> really.

Sure.

Regards,
--
Alex

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 0/5] perf, bts: Fallout from the fuzzer for perf/urgent
  2016-09-08  3:36         ` Vince Weaver
@ 2016-09-08  8:51           ` Alexander Shishkin
  2016-09-08 12:54             ` Vince Weaver
  0 siblings, 1 reply; 27+ messages in thread
From: Alexander Shishkin @ 2016-09-08  8:51 UTC (permalink / raw)
  To: Vince Weaver
  Cc: Ingo Molnar, Peter Zijlstra, Ingo Molnar, linux-kernel, eranian,
	Arnaldo Carvalho de Melo, vincent.weaver

Vince Weaver <vincent.weaver@maine.edu> writes:

> On the skylake machine with the original 5 patches I got this after 
> continuing to fuzz.  Sorry about the lack of frame pointer, next
> compile will have it enabled.
>
> If it matters, prior to this I hit the unrelated
> [25510.278199] WARNING: CPU: 1 PID: 25405 at kernel/events/core.c:3554 perf_event_read+0x18f/0x1a0
>
>
> [28682.174684] WARNING: CPU: 7 PID: 31992 at kernel/events/core.c:4961 perf_mmap_close+0x2e1/0x2f0

It just keeps on giving, doesn't it. I've got the same thing here during
the night, looking at it again. Did you capture the seed by any chance?

Thanks,
--
Alex

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 0/5] perf, bts: Fallout from the fuzzer for perf/urgent
  2016-09-08  8:51           ` Alexander Shishkin
@ 2016-09-08 12:54             ` Vince Weaver
  0 siblings, 0 replies; 27+ messages in thread
From: Vince Weaver @ 2016-09-08 12:54 UTC (permalink / raw)
  To: Alexander Shishkin
  Cc: Vince Weaver, Ingo Molnar, Peter Zijlstra, Ingo Molnar,
	linux-kernel, eranian, Arnaldo Carvalho de Melo

On Thu, 8 Sep 2016, Alexander Shishkin wrote:

> Vince Weaver <vincent.weaver@maine.edu> writes:
> 
> > On the skylake machine with the original 5 patches I got this after 
> > continuing to fuzz.  Sorry about the lack of frame pointer, next
x> > compile will have it enabled.
> >
> > If it matters, prior to this I hit the unrelated
> > [25510.278199] WARNING: CPU: 1 PID: 25405 at kernel/events/core.c:3554 perf_event_read+0x18f/0x1a0
> >
> >
> > [28682.174684] WARNING: CPU: 7 PID: 31992 at kernel/events/core.c:4961 perf_mmap_close+0x2e1/0x2f0
> 
> It just keeps on giving, doesn't it. I've got the same thing here during
> the night, looking at it again. Did you capture the seed by any chance?

It was triggered by

	Seeding random number generator with 1473284719
        /proc/sys/kernel/perf_event_max_sample_rate currently: 1000/s
        /proc/sys/kernel/perf_event_paranoid currently: 0
        To reproduce, try: ./perf_fuzzer -s 30000 -r 1473284719

but in my experience these things don't always reproduce deterministically.  

I can try reproducing it but I can't right now because overnight the 
machine locked hard (even sysrq isn't helping) with just the unhelpful 
message of

[70953.494637] ------------[ cut here ]------------

not sure if I have the serial console configured wrong or if it just locks 
up before it can send anything.

Anyway I'll head to the lab after my class and reboot things.

Vince

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [tip:perf/core] perf/core: Fix a race between mmap_close() and set_output() of AUX events
  2016-09-06 13:23 ` [PATCH v2 1/5] perf: Fix a race between mmap_close and set_output of AUX events Alexander Shishkin
@ 2016-09-10 12:38   ` tip-bot for Alexander Shishkin
  0 siblings, 0 replies; 27+ messages in thread
From: tip-bot for Alexander Shishkin @ 2016-09-10 12:38 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: eranian, linux-kernel, vincent.weaver, torvalds, acme, mingo,
	hpa, peterz, acme, tglx, alexander.shishkin, jolsa

Commit-ID:  767ae08678c2c796bcd7f582ee457aee20a28a1e
Gitweb:     http://git.kernel.org/tip/767ae08678c2c796bcd7f582ee457aee20a28a1e
Author:     Alexander Shishkin <alexander.shishkin@linux.intel.com>
AuthorDate: Tue, 6 Sep 2016 16:23:49 +0300
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Sat, 10 Sep 2016 11:15:36 +0200

perf/core: Fix a race between mmap_close() and set_output() of AUX events

In the mmap_close() path we need to stop all the AUX events that are
writing data to the AUX area that we are unmapping, before we can
safely free the pages. To determine if an event needs to be stopped,
we're comparing its ->rb against the one that's getting unmapped.
However, a SET_OUTPUT ioctl may turn up inside an AUX transaction
and swizzle event::rb to some other ring buffer, but the transaction
will keep writing data to the old ring buffer until the event gets
scheduled out. At this point, mmap_close() will skip over such an
event and will proceed to free the AUX area, while it's still being
used by this event, which will set off a warning in the mmap_close()
path and cause a memory corruption.

To avoid this, always stop an AUX event before its ->rb is updated;
this will release the (potentially) last reference on the AUX area
of the buffer. If the event gets restarted, its new ring buffer will
be used. If another SET_OUTPUT comes and switches it back to the
old ring buffer that's getting unmapped, it's also fine: this
ring buffer's aux_mmap_count will be zero and AUX transactions won't
start any more.

Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: vince@deater.net
Link: http://lkml.kernel.org/r/20160906132353.19887-2-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/events/core.c | 31 +++++++++++++++++++++++++------
 1 file changed, 25 insertions(+), 6 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 07ac859..a54f2c2 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -2496,11 +2496,11 @@ static int __perf_event_stop(void *info)
 	return 0;
 }
 
-static int perf_event_restart(struct perf_event *event)
+static int perf_event_stop(struct perf_event *event, int restart)
 {
 	struct stop_event_data sd = {
 		.event		= event,
-		.restart	= 1,
+		.restart	= restart,
 	};
 	int ret = 0;
 
@@ -4845,6 +4845,19 @@ static void ring_buffer_attach(struct perf_event *event,
 		spin_unlock_irqrestore(&rb->event_lock, flags);
 	}
 
+	/*
+	 * Avoid racing with perf_mmap_close(AUX): stop the event
+	 * before swizzling the event::rb pointer; if it's getting
+	 * unmapped, its aux_mmap_count will be 0 and it won't
+	 * restart. See the comment in __perf_pmu_output_stop().
+	 *
+	 * Data will inevitably be lost when set_output is done in
+	 * mid-air, but then again, whoever does it like this is
+	 * not in for the data anyway.
+	 */
+	if (has_aux(event))
+		perf_event_stop(event, 0);
+
 	rcu_assign_pointer(event->rb, rb);
 
 	if (old_rb) {
@@ -6120,7 +6133,7 @@ static void perf_event_addr_filters_exec(struct perf_event *event, void *data)
 	raw_spin_unlock_irqrestore(&ifh->lock, flags);
 
 	if (restart)
-		perf_event_restart(event);
+		perf_event_stop(event, 1);
 }
 
 void perf_event_exec(void)
@@ -6164,7 +6177,13 @@ static void __perf_event_output_stop(struct perf_event *event, void *data)
 
 	/*
 	 * In case of inheritance, it will be the parent that links to the
-	 * ring-buffer, but it will be the child that's actually using it:
+	 * ring-buffer, but it will be the child that's actually using it.
+	 *
+	 * We are using event::rb to determine if the event should be stopped,
+	 * however this may race with ring_buffer_attach() (through set_output),
+	 * which will make us skip the event that actually needs to be stopped.
+	 * So ring_buffer_attach() has to stop an aux event before re-assigning
+	 * its rb pointer.
 	 */
 	if (rcu_dereference(parent->rb) == rb)
 		ro->err = __perf_event_stop(&sd);
@@ -6678,7 +6697,7 @@ static void __perf_addr_filters_adjust(struct perf_event *event, void *data)
 	raw_spin_unlock_irqrestore(&ifh->lock, flags);
 
 	if (restart)
-		perf_event_restart(event);
+		perf_event_stop(event, 1);
 }
 
 /*
@@ -7867,7 +7886,7 @@ static void perf_event_addr_filters_apply(struct perf_event *event)
 	mmput(mm);
 
 restart:
-	perf_event_restart(event);
+	perf_event_stop(event, 1);
 }
 
 /*

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [tip:perf/core] perf/core: Fix aux_mmap_count vs aux_refcount order
  2016-09-06 13:23 ` [PATCH v2 2/5] perf: Fix aux_mmap_count vs aux_refcount order Alexander Shishkin
@ 2016-09-10 12:38   ` tip-bot for Alexander Shishkin
  0 siblings, 0 replies; 27+ messages in thread
From: tip-bot for Alexander Shishkin @ 2016-09-10 12:38 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: tglx, peterz, linux-kernel, acme, hpa, vincent.weaver, mingo,
	alexander.shishkin, acme, jolsa, eranian, torvalds

Commit-ID:  b79ccadd6bb10e72cf784a298ca6dc1398eb9a24
Gitweb:     http://git.kernel.org/tip/b79ccadd6bb10e72cf784a298ca6dc1398eb9a24
Author:     Alexander Shishkin <alexander.shishkin@linux.intel.com>
AuthorDate: Tue, 6 Sep 2016 16:23:50 +0300
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Sat, 10 Sep 2016 11:15:36 +0200

perf/core: Fix aux_mmap_count vs aux_refcount order

The order of accesses to ring buffer's aux_mmap_count and aux_refcount
has to be preserved across the users, namely perf_mmap_close() and
perf_aux_output_begin(), otherwise the inversion can result in the latter
holding the last reference to the aux buffer and subsequently free'ing
it in atomic context, triggering a warning.

> ------------[ cut here ]------------
> WARNING: CPU: 0 PID: 257 at kernel/events/ring_buffer.c:541 __rb_free_aux+0x11a/0x130
> CPU: 0 PID: 257 Comm: stopbug Not tainted 4.8.0-rc1+ #2596
> Call Trace:
>  [<ffffffff810f3e0b>] __warn+0xcb/0xf0
>  [<ffffffff810f3f3d>] warn_slowpath_null+0x1d/0x20
>  [<ffffffff8121182a>] __rb_free_aux+0x11a/0x130
>  [<ffffffff812127a8>] rb_free_aux+0x18/0x20
>  [<ffffffff81212913>] perf_aux_output_begin+0x163/0x1e0
>  [<ffffffff8100c33a>] bts_event_start+0x3a/0xd0
>  [<ffffffff8100c42d>] bts_event_add+0x5d/0x80
>  [<ffffffff81203646>] event_sched_in.isra.104+0xf6/0x2f0
>  [<ffffffff8120652e>] group_sched_in+0x6e/0x190
>  [<ffffffff8120694e>] ctx_sched_in+0x2fe/0x5f0
>  [<ffffffff81206ca0>] perf_event_sched_in+0x60/0x80
>  [<ffffffff81206d1b>] ctx_resched+0x5b/0x90
>  [<ffffffff81207281>] __perf_event_enable+0x1e1/0x240
>  [<ffffffff81200639>] event_function+0xa9/0x180
>  [<ffffffff81202000>] ? perf_cgroup_attach+0x70/0x70
>  [<ffffffff8120203f>] remote_function+0x3f/0x50
>  [<ffffffff811971f3>] flush_smp_call_function_queue+0x83/0x150
>  [<ffffffff81197bd3>] generic_smp_call_function_single_interrupt+0x13/0x60
>  [<ffffffff810a6477>] smp_call_function_single_interrupt+0x27/0x40
>  [<ffffffff81a26ea9>] call_function_single_interrupt+0x89/0x90
>  [<ffffffff81120056>] finish_task_switch+0xa6/0x210
>  [<ffffffff81120017>] ? finish_task_switch+0x67/0x210
>  [<ffffffff81a1e83d>] __schedule+0x3dd/0xb50
>  [<ffffffff81a1efe5>] schedule+0x35/0x80
>  [<ffffffff81128031>] sys_sched_yield+0x61/0x70
>  [<ffffffff81a25be5>] entry_SYSCALL_64_fastpath+0x18/0xa8
> ---[ end trace 6235f556f5ea83a9 ]---

This patch puts the checks in perf_aux_output_begin() in the same order
as that of perf_mmap_close().

Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: vince@deater.net
Link: http://lkml.kernel.org/r/20160906132353.19887-3-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/events/ring_buffer.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c
index ae9b90d..257fa46 100644
--- a/kernel/events/ring_buffer.c
+++ b/kernel/events/ring_buffer.c
@@ -330,15 +330,22 @@ void *perf_aux_output_begin(struct perf_output_handle *handle,
 	if (!rb)
 		return NULL;
 
-	if (!rb_has_aux(rb) || !atomic_inc_not_zero(&rb->aux_refcount))
+	if (!rb_has_aux(rb))
 		goto err;
 
 	/*
-	 * If rb::aux_mmap_count is zero (and rb_has_aux() above went through),
-	 * the aux buffer is in perf_mmap_close(), about to get freed.
+	 * If aux_mmap_count is zero, the aux buffer is in perf_mmap_close(),
+	 * about to get freed, so we leave immediately.
+	 *
+	 * Checking rb::aux_mmap_count and rb::refcount has to be done in
+	 * the same order, see perf_mmap_close. Otherwise we end up freeing
+	 * aux pages in this path, which is a bug, because in_atomic().
 	 */
 	if (!atomic_read(&rb->aux_mmap_count))
-		goto err_put;
+		goto err;
+
+	if (!atomic_inc_not_zero(&rb->aux_refcount))
+		goto err;
 
 	/*
 	 * Nesting is not supported for AUX area, make sure nested

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [tip:perf/core] perf/x86/intel/bts: Fix confused ordering of PMU callbacks
  2016-09-06 13:23 ` [PATCH v2 3/5] perf/x86/intel/bts: Fix confused ordering of PMU callbacks Alexander Shishkin
@ 2016-09-10 12:39   ` tip-bot for Alexander Shishkin
  0 siblings, 0 replies; 27+ messages in thread
From: tip-bot for Alexander Shishkin @ 2016-09-10 12:39 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: tglx, acme, linux-kernel, alexander.shishkin, eranian, peterz,
	torvalds, mingo, hpa, jolsa, acme, vincent.weaver

Commit-ID:  a9a94401c2b5805c71e39427b1af1bf1b9f67cd0
Gitweb:     http://git.kernel.org/tip/a9a94401c2b5805c71e39427b1af1bf1b9f67cd0
Author:     Alexander Shishkin <alexander.shishkin@linux.intel.com>
AuthorDate: Tue, 6 Sep 2016 16:23:51 +0300
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Sat, 10 Sep 2016 11:15:37 +0200

perf/x86/intel/bts: Fix confused ordering of PMU callbacks

The intel_bts driver is using a CPU-local 'started' variable to order
callbacks and PMIs and make sure that AUX transactions don't get messed
up. However, the ordering rules in regard to this variable is a complete
mess, which recently resulted in perf_fuzzer-triggered warnings and
panics.

The general ordering rule that is patch is enforcing is that this
cpu-local variable be set only when the cpu-local AUX transaction is
active; consequently, this variable is to be checked before the AUX
related bits can be touched.

Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: vince@deater.net
Link: http://lkml.kernel.org/r/20160906132353.19887-4-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/events/intel/bts.c | 104 ++++++++++++++++++++++++++++++++++----------
 1 file changed, 80 insertions(+), 24 deletions(-)

diff --git a/arch/x86/events/intel/bts.c b/arch/x86/events/intel/bts.c
index 0a6e393..61e1d71 100644
--- a/arch/x86/events/intel/bts.c
+++ b/arch/x86/events/intel/bts.c
@@ -31,7 +31,17 @@
 struct bts_ctx {
 	struct perf_output_handle	handle;
 	struct debug_store		ds_back;
-	int				started;
+	int				state;
+};
+
+/* BTS context states: */
+enum {
+	/* no ongoing AUX transactions */
+	BTS_STATE_STOPPED = 0,
+	/* AUX transaction is on, BTS tracing is disabled */
+	BTS_STATE_INACTIVE,
+	/* AUX transaction is on, BTS tracing is running */
+	BTS_STATE_ACTIVE,
 };
 
 static DEFINE_PER_CPU(struct bts_ctx, bts_ctx);
@@ -204,6 +214,15 @@ static void bts_update(struct bts_ctx *bts)
 static int
 bts_buffer_reset(struct bts_buffer *buf, struct perf_output_handle *handle);
 
+/*
+ * Ordering PMU callbacks wrt themselves and the PMI is done by means
+ * of bts::state, which:
+ *  - is set when bts::handle::event is valid, that is, between
+ *    perf_aux_output_begin() and perf_aux_output_end();
+ *  - is zero otherwise;
+ *  - is ordered against bts::handle::event with a compiler barrier.
+ */
+
 static void __bts_event_start(struct perf_event *event)
 {
 	struct bts_ctx *bts = this_cpu_ptr(&bts_ctx);
@@ -221,10 +240,13 @@ static void __bts_event_start(struct perf_event *event)
 
 	/*
 	 * local barrier to make sure that ds configuration made it
-	 * before we enable BTS
+	 * before we enable BTS and bts::state goes ACTIVE
 	 */
 	wmb();
 
+	/* INACTIVE/STOPPED -> ACTIVE */
+	WRITE_ONCE(bts->state, BTS_STATE_ACTIVE);
+
 	intel_pmu_enable_bts(config);
 
 }
@@ -251,9 +273,6 @@ static void bts_event_start(struct perf_event *event, int flags)
 
 	__bts_event_start(event);
 
-	/* PMI handler: this counter is running and likely generating PMIs */
-	ACCESS_ONCE(bts->started) = 1;
-
 	return;
 
 fail_end_stop:
@@ -263,30 +282,34 @@ fail_stop:
 	event->hw.state = PERF_HES_STOPPED;
 }
 
-static void __bts_event_stop(struct perf_event *event)
+static void __bts_event_stop(struct perf_event *event, int state)
 {
+	struct bts_ctx *bts = this_cpu_ptr(&bts_ctx);
+
+	/* ACTIVE -> INACTIVE(PMI)/STOPPED(->stop()) */
+	WRITE_ONCE(bts->state, state);
+
 	/*
 	 * No extra synchronization is mandated by the documentation to have
 	 * BTS data stores globally visible.
 	 */
 	intel_pmu_disable_bts();
-
-	if (event->hw.state & PERF_HES_STOPPED)
-		return;
-
-	ACCESS_ONCE(event->hw.state) |= PERF_HES_STOPPED;
 }
 
 static void bts_event_stop(struct perf_event *event, int flags)
 {
 	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct bts_ctx *bts = this_cpu_ptr(&bts_ctx);
-	struct bts_buffer *buf = perf_get_aux(&bts->handle);
+	struct bts_buffer *buf = NULL;
+	int state = READ_ONCE(bts->state);
+
+	if (state == BTS_STATE_ACTIVE)
+		__bts_event_stop(event, BTS_STATE_STOPPED);
 
-	/* PMI handler: don't restart this counter */
-	ACCESS_ONCE(bts->started) = 0;
+	if (state != BTS_STATE_STOPPED)
+		buf = perf_get_aux(&bts->handle);
 
-	__bts_event_stop(event);
+	event->hw.state |= PERF_HES_STOPPED;
 
 	if (flags & PERF_EF_UPDATE) {
 		bts_update(bts);
@@ -296,6 +319,7 @@ static void bts_event_stop(struct perf_event *event, int flags)
 				bts->handle.head =
 					local_xchg(&buf->data_size,
 						   buf->nr_pages << PAGE_SHIFT);
+
 			perf_aux_output_end(&bts->handle, local_xchg(&buf->data_size, 0),
 					    !!local_xchg(&buf->lost, 0));
 		}
@@ -310,8 +334,20 @@ static void bts_event_stop(struct perf_event *event, int flags)
 void intel_bts_enable_local(void)
 {
 	struct bts_ctx *bts = this_cpu_ptr(&bts_ctx);
+	int state = READ_ONCE(bts->state);
+
+	/*
+	 * Here we transition from INACTIVE to ACTIVE;
+	 * if we instead are STOPPED from the interrupt handler,
+	 * stay that way. Can't be ACTIVE here though.
+	 */
+	if (WARN_ON_ONCE(state == BTS_STATE_ACTIVE))
+		return;
+
+	if (state == BTS_STATE_STOPPED)
+		return;
 
-	if (bts->handle.event && bts->started)
+	if (bts->handle.event)
 		__bts_event_start(bts->handle.event);
 }
 
@@ -319,8 +355,15 @@ void intel_bts_disable_local(void)
 {
 	struct bts_ctx *bts = this_cpu_ptr(&bts_ctx);
 
+	/*
+	 * Here we transition from ACTIVE to INACTIVE;
+	 * do nothing for STOPPED or INACTIVE.
+	 */
+	if (READ_ONCE(bts->state) != BTS_STATE_ACTIVE)
+		return;
+
 	if (bts->handle.event)
-		__bts_event_stop(bts->handle.event);
+		__bts_event_stop(bts->handle.event, BTS_STATE_INACTIVE);
 }
 
 static int
@@ -407,9 +450,13 @@ int intel_bts_interrupt(void)
 	struct perf_event *event = bts->handle.event;
 	struct bts_buffer *buf;
 	s64 old_head;
-	int err;
+	int err = -ENOSPC;
 
-	if (!event || !bts->started)
+	/*
+	 * this is wrapped in intel_bts_enable_local/intel_bts_disable_local,
+	 * so we can only be INACTIVE or STOPPED
+	 */
+	if (READ_ONCE(bts->state) == BTS_STATE_STOPPED)
 		return 0;
 
 	buf = perf_get_aux(&bts->handle);
@@ -432,12 +479,21 @@ int intel_bts_interrupt(void)
 			    !!local_xchg(&buf->lost, 0));
 
 	buf = perf_aux_output_begin(&bts->handle, event);
-	if (!buf)
-		return 1;
+	if (buf)
+		err = bts_buffer_reset(buf, &bts->handle);
 
-	err = bts_buffer_reset(buf, &bts->handle);
-	if (err)
-		perf_aux_output_end(&bts->handle, 0, false);
+	if (err) {
+		WRITE_ONCE(bts->state, BTS_STATE_STOPPED);
+
+		if (buf) {
+			/*
+			 * BTS_STATE_STOPPED should be visible before
+			 * cleared handle::event
+			 */
+			barrier();
+			perf_aux_output_end(&bts->handle, 0, false);
+		}
+	}
 
 	return 1;
 }

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [tip:perf/core] perf/x86/intel/bts: Fix BTS PMI detection
  2016-09-06 13:23 ` [PATCH v2 4/5] perf/x86/intel/bts: Fix BTS PMI detection Alexander Shishkin
@ 2016-09-10 12:39   ` tip-bot for Alexander Shishkin
  2016-09-20 13:12   ` [PATCH] perf/x86/intel/bts: don't dereference ds unconditionally Sebastian Andrzej Siewior
  1 sibling, 0 replies; 27+ messages in thread
From: tip-bot for Alexander Shishkin @ 2016-09-10 12:39 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: tglx, hpa, peterz, linux-kernel, vincent.weaver, eranian,
	torvalds, acme, mingo, jolsa, alexander.shishkin, acme

Commit-ID:  4d4c474124649198d9b0a065c06f9362cf18e14e
Gitweb:     http://git.kernel.org/tip/4d4c474124649198d9b0a065c06f9362cf18e14e
Author:     Alexander Shishkin <alexander.shishkin@linux.intel.com>
AuthorDate: Tue, 6 Sep 2016 16:23:52 +0300
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Sat, 10 Sep 2016 11:15:38 +0200

perf/x86/intel/bts: Fix BTS PMI detection

Since BTS doesn't have a dedicated PMI status bit, the driver needs to
take extra care to check for the condition that triggers it to avoid
spurious NMI warnings.

Regardless of the local BTS context state, the only way of knowing that
the NMI is ours is to compare the write pointer against the interrupt
threshold.

Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: vince@deater.net
Link: http://lkml.kernel.org/r/20160906132353.19887-5-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/events/intel/bts.c | 19 +++++++++++++++----
 1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/arch/x86/events/intel/bts.c b/arch/x86/events/intel/bts.c
index 61e1d71..9233edf 100644
--- a/arch/x86/events/intel/bts.c
+++ b/arch/x86/events/intel/bts.c
@@ -446,26 +446,37 @@ bts_buffer_reset(struct bts_buffer *buf, struct perf_output_handle *handle)
 
 int intel_bts_interrupt(void)
 {
+	struct debug_store *ds = this_cpu_ptr(&cpu_hw_events)->ds;
 	struct bts_ctx *bts = this_cpu_ptr(&bts_ctx);
 	struct perf_event *event = bts->handle.event;
 	struct bts_buffer *buf;
 	s64 old_head;
-	int err = -ENOSPC;
+	int err = -ENOSPC, handled = 0;
+
+	/*
+	 * The only surefire way of knowing if this NMI is ours is by checking
+	 * the write ptr against the PMI threshold.
+	 */
+	if (ds->bts_index >= ds->bts_interrupt_threshold)
+		handled = 1;
 
 	/*
 	 * this is wrapped in intel_bts_enable_local/intel_bts_disable_local,
 	 * so we can only be INACTIVE or STOPPED
 	 */
 	if (READ_ONCE(bts->state) == BTS_STATE_STOPPED)
-		return 0;
+		return handled;
 
 	buf = perf_get_aux(&bts->handle);
+	if (!buf)
+		return handled;
+
 	/*
 	 * Skip snapshot counters: they don't use the interrupt, but
 	 * there's no other way of telling, because the pointer will
 	 * keep moving
 	 */
-	if (!buf || buf->snapshot)
+	if (buf->snapshot)
 		return 0;
 
 	old_head = local_read(&buf->head);
@@ -473,7 +484,7 @@ int intel_bts_interrupt(void)
 
 	/* no new data */
 	if (old_head == local_read(&buf->head))
-		return 0;
+		return handled;
 
 	perf_aux_output_end(&bts->handle, local_xchg(&buf->data_size, 0),
 			    !!local_xchg(&buf->lost, 0));

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [tip:perf/core] perf/x86/intel/bts: Kill a silly warning
  2016-09-06 13:23 ` [PATCH v2 5/5] perf/x86/intel/bts: Kill a silly warning Alexander Shishkin
@ 2016-09-10 12:40   ` tip-bot for Alexander Shishkin
  0 siblings, 0 replies; 27+ messages in thread
From: tip-bot for Alexander Shishkin @ 2016-09-10 12:40 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, torvalds, peterz, alexander.shishkin, eranian,
	tglx, jolsa, acme, mingo, acme, hpa

Commit-ID:  ef9ef3befa0d76008e988a9ed9fe439e803351b9
Gitweb:     http://git.kernel.org/tip/ef9ef3befa0d76008e988a9ed9fe439e803351b9
Author:     Alexander Shishkin <alexander.shishkin@linux.intel.com>
AuthorDate: Tue, 6 Sep 2016 16:23:53 +0300
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Sat, 10 Sep 2016 11:15:38 +0200

perf/x86/intel/bts: Kill a silly warning

At the moment, intel_bts will WARN() out if there is more than one
event writing to the same ring buffer, via SET_OUTPUT, and will only
send data from one event to a buffer.

There is no reason to have this warning in, so kill it.

Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: vince@deater.net
Link: http://lkml.kernel.org/r/20160906132353.19887-6-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/events/intel/bts.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/x86/events/intel/bts.c b/arch/x86/events/intel/bts.c
index 9233edf..bdcd651 100644
--- a/arch/x86/events/intel/bts.c
+++ b/arch/x86/events/intel/bts.c
@@ -378,8 +378,6 @@ bts_buffer_reset(struct bts_buffer *buf, struct perf_output_handle *handle)
 		return 0;
 
 	head = handle->head & ((buf->nr_pages << PAGE_SHIFT) - 1);
-	if (WARN_ON_ONCE(head != local_read(&buf->head)))
-		return -EINVAL;
 
 	phys = &buf->buf[buf->cur_buf];
 	space = phys->offset + phys->displacement + phys->size - head;

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH] perf/x86/intel/bts: don't dereference ds unconditionally
  2016-09-06 13:23 ` [PATCH v2 4/5] perf/x86/intel/bts: Fix BTS PMI detection Alexander Shishkin
  2016-09-10 12:39   ` [tip:perf/core] " tip-bot for Alexander Shishkin
@ 2016-09-20 13:12   ` Sebastian Andrzej Siewior
  2016-09-20 13:44     ` Alexander Shishkin
  2016-09-20 14:13     ` [tip:perf/urgent] perf/x86/intel/bts: Make sure debug store is valid tip-bot for Sebastian Andrzej Siewior
  1 sibling, 2 replies; 27+ messages in thread
From: Sebastian Andrzej Siewior @ 2016-09-20 13:12 UTC (permalink / raw)
  To: Alexander Shishkin, Ingo Molnar
  Cc: Peter Zijlstra, linux-kernel, vince, eranian,
	Arnaldo Carvalho de Melo, tglx

From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

Since commit 4d4c47412464 ("perf/x86/intel/bts: Fix BTS PMI detection")
my box goes boom on boot:

| .... node  #0, CPUs:      #1 #2 #3 #4 #5 #6 #7
| BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
| IP: [<ffffffff8100c463>] intel_bts_interrupt+0x43/0x130
| Call Trace:
|  <NMI> d [<ffffffff8100b341>] intel_pmu_handle_irq+0x51/0x4b0
|  [<ffffffff81004d47>] perf_event_nmi_handler+0x27/0x40

I don't know what is going on here but ds is not always dereferenced
unconditionally hence here the `ds' check to avoid the crash.

Fixes: 4d4c47412464 ("perf/x86/intel/bts: Fix BTS PMI detection")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 arch/x86/events/intel/bts.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/events/intel/bts.c b/arch/x86/events/intel/bts.c
index bdcd6510992c..6ff66efa0feb 100644
--- a/arch/x86/events/intel/bts.c
+++ b/arch/x86/events/intel/bts.c
@@ -455,7 +455,7 @@ int intel_bts_interrupt(void)
 	 * The only surefire way of knowing if this NMI is ours is by checking
 	 * the write ptr against the PMI threshold.
 	 */
-	if (ds->bts_index >= ds->bts_interrupt_threshold)
+	if (ds && (ds->bts_index >= ds->bts_interrupt_threshold))
 		handled = 1;
 
 	/*
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH] perf/x86/intel/bts: don't dereference ds unconditionally
  2016-09-20 13:12   ` [PATCH] perf/x86/intel/bts: don't dereference ds unconditionally Sebastian Andrzej Siewior
@ 2016-09-20 13:44     ` Alexander Shishkin
  2016-09-20 13:54       ` Alexander Shishkin
  2016-09-20 14:13     ` [tip:perf/urgent] perf/x86/intel/bts: Make sure debug store is valid tip-bot for Sebastian Andrzej Siewior
  1 sibling, 1 reply; 27+ messages in thread
From: Alexander Shishkin @ 2016-09-20 13:44 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior, Ingo Molnar
  Cc: Peter Zijlstra, linux-kernel, vince, eranian,
	Arnaldo Carvalho de Melo, tglx

Sebastian Andrzej Siewior <sebastian@breakpoint.cc> writes:

> From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
>
> Since commit 4d4c47412464 ("perf/x86/intel/bts: Fix BTS PMI detection")
> my box goes boom on boot:
>
> | .... node  #0, CPUs:      #1 #2 #3 #4 #5 #6 #7
> | BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
> | IP: [<ffffffff8100c463>] intel_bts_interrupt+0x43/0x130
> | Call Trace:
> |  <NMI> d [<ffffffff8100b341>] intel_pmu_handle_irq+0x51/0x4b0
> |  [<ffffffff81004d47>] perf_event_nmi_handler+0x27/0x40
>
> I don't know what is going on here but ds is not always dereferenced
> unconditionally hence here the `ds' check to avoid the crash.

Good catch! I'm going to guess you don't have the NMI watchdog enabled?

Thanks,
--
Alex

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] perf/x86/intel/bts: don't dereference ds unconditionally
  2016-09-20 13:44     ` Alexander Shishkin
@ 2016-09-20 13:54       ` Alexander Shishkin
  0 siblings, 0 replies; 27+ messages in thread
From: Alexander Shishkin @ 2016-09-20 13:54 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior, Ingo Molnar
  Cc: Peter Zijlstra, linux-kernel, vince, eranian,
	Arnaldo Carvalho de Melo, tglx

Alexander Shishkin <alexander.shishkin@linux.intel.com> writes:

> Sebastian Andrzej Siewior <sebastian@breakpoint.cc> writes:
>
>> From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
>>
>> Since commit 4d4c47412464 ("perf/x86/intel/bts: Fix BTS PMI detection")
>> my box goes boom on boot:
>>
>> | .... node  #0, CPUs:      #1 #2 #3 #4 #5 #6 #7
>> | BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
>> | IP: [<ffffffff8100c463>] intel_bts_interrupt+0x43/0x130
>> | Call Trace:
>> |  <NMI> d [<ffffffff8100b341>] intel_pmu_handle_irq+0x51/0x4b0
>> |  [<ffffffff81004d47>] perf_event_nmi_handler+0x27/0x40
>>
>> I don't know what is going on here but ds is not always dereferenced
>> unconditionally hence here the `ds' check to avoid the crash.
>
> Good catch! I'm going to guess you don't have the NMI watchdog enabled?

That is to say,

Reviewed-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [tip:perf/urgent] perf/x86/intel/bts: Make sure debug store is valid
  2016-09-20 13:12   ` [PATCH] perf/x86/intel/bts: don't dereference ds unconditionally Sebastian Andrzej Siewior
  2016-09-20 13:44     ` Alexander Shishkin
@ 2016-09-20 14:13     ` tip-bot for Sebastian Andrzej Siewior
  1 sibling, 0 replies; 27+ messages in thread
From: tip-bot for Sebastian Andrzej Siewior @ 2016-09-20 14:13 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: a.p.zijlstra, acme, bigeasy, tglx, linux-kernel, hpa,
	alexander.shishkin, mingo

Commit-ID:  f1e1c9e5e357c05253affb13be29285c5cb56bf0
Gitweb:     http://git.kernel.org/tip/f1e1c9e5e357c05253affb13be29285c5cb56bf0
Author:     Sebastian Andrzej Siewior <bigeasy@linutronix.de>
AuthorDate: Tue, 20 Sep 2016 15:12:21 +0200
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Tue, 20 Sep 2016 16:06:09 +0200

perf/x86/intel/bts: Make sure debug store is valid

Since commit 4d4c47412464 ("perf/x86/intel/bts: Fix BTS PMI detection")
my box goes boom on boot:

| .... node  #0, CPUs:      #1 #2 #3 #4 #5 #6 #7
| BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
| IP: [<ffffffff8100c463>] intel_bts_interrupt+0x43/0x130
| Call Trace:
|  <NMI> d [<ffffffff8100b341>] intel_pmu_handle_irq+0x51/0x4b0
|  [<ffffffff81004d47>] perf_event_nmi_handler+0x27/0x40

This happens because the code introduced in this commit dereferences the
debug store pointer unconditionally. The debug store is not guaranteed to
be available, so a NULL pointer check as on other places is required.

Fixes: 4d4c47412464 ("perf/x86/intel/bts: Fix BTS PMI detection")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: vince@deater.net
Cc: eranian@google.com
Link: http://lkml.kernel.org/r/20160920131220.xg5pbdjtznszuyzb@breakpoint.cc
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

---
 arch/x86/events/intel/bts.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/events/intel/bts.c b/arch/x86/events/intel/bts.c
index bdcd651..6ff66ef 100644
--- a/arch/x86/events/intel/bts.c
+++ b/arch/x86/events/intel/bts.c
@@ -455,7 +455,7 @@ int intel_bts_interrupt(void)
 	 * The only surefire way of knowing if this NMI is ours is by checking
 	 * the write ptr against the PMI threshold.
 	 */
-	if (ds->bts_index >= ds->bts_interrupt_threshold)
+	if (ds && (ds->bts_index >= ds->bts_interrupt_threshold))
 		handled = 1;
 
 	/*

^ permalink raw reply related	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2016-09-20 14:15 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-09-06 13:23 [PATCH v2 0/5] perf, bts: Fallout from the fuzzer for perf/urgent Alexander Shishkin
2016-09-06 13:23 ` [PATCH v2 1/5] perf: Fix a race between mmap_close and set_output of AUX events Alexander Shishkin
2016-09-10 12:38   ` [tip:perf/core] perf/core: Fix a race between mmap_close() and set_output() " tip-bot for Alexander Shishkin
2016-09-06 13:23 ` [PATCH v2 2/5] perf: Fix aux_mmap_count vs aux_refcount order Alexander Shishkin
2016-09-10 12:38   ` [tip:perf/core] perf/core: " tip-bot for Alexander Shishkin
2016-09-06 13:23 ` [PATCH v2 3/5] perf/x86/intel/bts: Fix confused ordering of PMU callbacks Alexander Shishkin
2016-09-10 12:39   ` [tip:perf/core] " tip-bot for Alexander Shishkin
2016-09-06 13:23 ` [PATCH v2 4/5] perf/x86/intel/bts: Fix BTS PMI detection Alexander Shishkin
2016-09-10 12:39   ` [tip:perf/core] " tip-bot for Alexander Shishkin
2016-09-20 13:12   ` [PATCH] perf/x86/intel/bts: don't dereference ds unconditionally Sebastian Andrzej Siewior
2016-09-20 13:44     ` Alexander Shishkin
2016-09-20 13:54       ` Alexander Shishkin
2016-09-20 14:13     ` [tip:perf/urgent] perf/x86/intel/bts: Make sure debug store is valid tip-bot for Sebastian Andrzej Siewior
2016-09-06 13:23 ` [PATCH v2 5/5] perf/x86/intel/bts: Kill a silly warning Alexander Shishkin
2016-09-10 12:40   ` [tip:perf/core] " tip-bot for Alexander Shishkin
2016-09-06 17:19 ` [PATCH v2 0/5] perf, bts: Fallout from the fuzzer for perf/urgent Ingo Molnar
2016-09-07  0:13   ` Vince Weaver
2016-09-07 15:20   ` Alexander Shishkin
2016-09-07 15:36     ` Vince Weaver
2016-09-07 16:38       ` Peter Zijlstra
2016-09-07 18:33       ` Alexander Shishkin
2016-09-08  3:36         ` Vince Weaver
2016-09-08  8:51           ` Alexander Shishkin
2016-09-08 12:54             ` Vince Weaver
2016-09-08  6:21     ` Ingo Molnar
2016-09-08  6:23       ` Ingo Molnar
2016-09-08  8:43       ` Alexander Shishkin

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.