All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] perf, bts: Fallout from the fuzzer for perf/urgent
@ 2016-08-23  8:54 Alexander Shishkin
  2016-08-23  8:54 ` [PATCH 1/3] perf/x86/intel/bts: Fix confused ordering of PMU callbacks Alexander Shishkin
                   ` (3 more replies)
  0 siblings, 4 replies; 14+ messages in thread
From: Alexander Shishkin @ 2016-08-23  8:54 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, linux-kernel, vince, eranian,
	Arnaldo Carvalho de Melo, Alexander Shishkin

Hi,

Recently Vince has reported warnings and panics coming from the
general direction of AUX tracing. I found two bugs which manifest
similarly, one in intel_bts driver and one in AUX unmapping path.

Both are triggered by racing SET_OUTPUT against mmap_close while
running AUX tracing. I have a test case that set fire to the kernel
within a few seconds by doing this, which I can share if anyone
cares.

These are all good candidates for 4.7-stable and the BTS ones can be
theoretically backported further.

Alexander Shishkin (3):
  perf/x86/intel/bts: Fix confused ordering of PMU callbacks
  perf/x86/intel/bts: Kill a silly warning
  perf: Fix a race between mmap_close and set_output of AUX events

 arch/x86/events/intel/bts.c | 87 +++++++++++++++++++++++++++++++++------------
 kernel/events/core.c        | 31 ++++++++++++----
 2 files changed, 89 insertions(+), 29 deletions(-)

-- 
2.8.1

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/3] perf/x86/intel/bts: Fix confused ordering of PMU callbacks
  2016-08-23  8:54 [PATCH 0/3] perf, bts: Fallout from the fuzzer for perf/urgent Alexander Shishkin
@ 2016-08-23  8:54 ` Alexander Shishkin
  2016-08-24 12:28   ` Alexander Shishkin
  2016-08-23  8:54 ` [PATCH 2/3] perf/x86/intel/bts: Kill a silly warning Alexander Shishkin
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 14+ messages in thread
From: Alexander Shishkin @ 2016-08-23  8:54 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, linux-kernel, vince, eranian,
	Arnaldo Carvalho de Melo, Alexander Shishkin

The intel_bts driver is using a cpu-local 'started' variable to order
callbacks and PMIs and make sure that AUX transactions don't get messed
up. However, the ordering rules in regard to this variable is a complete
mess, which recently resulted in perf_fuzzer-triggered warnings and
panics.

The general ordering rule that is patch is enforcing is that this
cpu-local variable be set only when the cpu-local AUX transaction is
active; consequently, this variable is to be checked before the AUX
related bits can be touched.

Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
---
 arch/x86/events/intel/bts.c | 85 ++++++++++++++++++++++++++++++++++-----------
 1 file changed, 64 insertions(+), 21 deletions(-)

diff --git a/arch/x86/events/intel/bts.c b/arch/x86/events/intel/bts.c
index 0a6e393a2e..0b8ccff564 100644
--- a/arch/x86/events/intel/bts.c
+++ b/arch/x86/events/intel/bts.c
@@ -204,6 +204,16 @@ static void bts_update(struct bts_ctx *bts)
 static int
 bts_buffer_reset(struct bts_buffer *buf, struct perf_output_handle *handle);
 
+/*
+ * Ordering PMU callbacks wrt themselves and the PMI is done by means
+ * of bts::started, which:
+ *  - is set when bts::handle::event is valid, that is, between
+ *    perf_aux_output_begin() and perf_aux_output_end();
+ *  - is zero otherwise;
+ *  - is ordered against bts::handle::event with a local barrier to
+ *    enforce CPU ordering.
+ */
+
 static void __bts_event_start(struct perf_event *event)
 {
 	struct bts_ctx *bts = this_cpu_ptr(&bts_ctx);
@@ -221,10 +231,13 @@ static void __bts_event_start(struct perf_event *event)
 
 	/*
 	 * local barrier to make sure that ds configuration made it
-	 * before we enable BTS
+	 * before we enable BTS and bts::started is set
 	 */
 	wmb();
 
+	/* PMI handler: this counter is running and likely generating PMIs */
+	WRITE_ONCE(bts->started, 1);
+
 	intel_pmu_enable_bts(config);
 
 }
@@ -251,9 +264,6 @@ static void bts_event_start(struct perf_event *event, int flags)
 
 	__bts_event_start(event);
 
-	/* PMI handler: this counter is running and likely generating PMIs */
-	ACCESS_ONCE(bts->started) = 1;
-
 	return;
 
 fail_end_stop:
@@ -265,28 +275,34 @@ fail_stop:
 
 static void __bts_event_stop(struct perf_event *event)
 {
+	struct bts_ctx *bts = this_cpu_ptr(&bts_ctx);
+
+	/* PMI handler: don't restart this counter */
+	WRITE_ONCE(bts->started, 0);
+
 	/*
 	 * No extra synchronization is mandated by the documentation to have
 	 * BTS data stores globally visible.
 	 */
 	intel_pmu_disable_bts();
-
-	if (event->hw.state & PERF_HES_STOPPED)
-		return;
-
-	ACCESS_ONCE(event->hw.state) |= PERF_HES_STOPPED;
 }
 
 static void bts_event_stop(struct perf_event *event, int flags)
 {
 	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct bts_ctx *bts = this_cpu_ptr(&bts_ctx);
-	struct bts_buffer *buf = perf_get_aux(&bts->handle);
+	struct bts_buffer *buf;
 
-	/* PMI handler: don't restart this counter */
-	ACCESS_ONCE(bts->started) = 0;
+	if (READ_ONCE(bts->started)) {
+		__bts_event_stop(event);
+
+		/* order buf (handle::event load) against bts::started store */
+		mb();
+	}
+
+	buf = perf_get_aux(&bts->handle);
 
-	__bts_event_stop(event);
+	event->hw.state |= PERF_HES_STOPPED;
 
 	if (flags & PERF_EF_UPDATE) {
 		bts_update(bts);
@@ -311,7 +327,13 @@ void intel_bts_enable_local(void)
 {
 	struct bts_ctx *bts = this_cpu_ptr(&bts_ctx);
 
-	if (bts->handle.event && bts->started)
+	if (!READ_ONCE(bts->started))
+		return;
+
+	/* matches wmb() ordering bts::started store against handle::event */
+	rmb();
+
+	if (bts->handle.event)
 		__bts_event_start(bts->handle.event);
 }
 
@@ -319,6 +341,12 @@ void intel_bts_disable_local(void)
 {
 	struct bts_ctx *bts = this_cpu_ptr(&bts_ctx);
 
+	if (!READ_ONCE(bts->started))
+		return;
+
+	/* matches wmb() ordering bts::started store against handle::event */
+	rmb();
+
 	if (bts->handle.event)
 		__bts_event_stop(bts->handle.event);
 }
@@ -407,9 +435,15 @@ int intel_bts_interrupt(void)
 	struct perf_event *event = bts->handle.event;
 	struct bts_buffer *buf;
 	s64 old_head;
-	int err;
+	int err = -ENOSPC;
 
-	if (!event || !bts->started)
+	if (!READ_ONCE(bts->started))
+		return 0;
+
+	/* matches wmb() ordering bts::started store against handle::event */
+	rmb();
+
+	if (!event)
 		return 0;
 
 	buf = perf_get_aux(&bts->handle);
@@ -432,12 +466,21 @@ int intel_bts_interrupt(void)
 			    !!local_xchg(&buf->lost, 0));
 
 	buf = perf_aux_output_begin(&bts->handle, event);
-	if (!buf)
-		return 1;
+	if (buf)
+		err = bts_buffer_reset(buf, &bts->handle);
 
-	err = bts_buffer_reset(buf, &bts->handle);
-	if (err)
-		perf_aux_output_end(&bts->handle, 0, false);
+	if (err) {
+		WRITE_ONCE(bts->started, 0);
+
+		if (buf) {
+			/*
+			 * cleared bts::started should be visible before
+			 * cleared handle::event
+			 */
+			wmb();
+			perf_aux_output_end(&bts->handle, 0, false);
+		}
+	}
 
 	return 1;
 }
-- 
2.8.1

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 2/3] perf/x86/intel/bts: Kill a silly warning
  2016-08-23  8:54 [PATCH 0/3] perf, bts: Fallout from the fuzzer for perf/urgent Alexander Shishkin
  2016-08-23  8:54 ` [PATCH 1/3] perf/x86/intel/bts: Fix confused ordering of PMU callbacks Alexander Shishkin
@ 2016-08-23  8:54 ` Alexander Shishkin
  2016-08-23  8:54 ` [PATCH 3/3] perf: Fix a race between mmap_close and set_output of AUX events Alexander Shishkin
  2016-08-23 16:24 ` [PATCH 0/3] perf, bts: Fallout from the fuzzer for perf/urgent Vince Weaver
  3 siblings, 0 replies; 14+ messages in thread
From: Alexander Shishkin @ 2016-08-23  8:54 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, linux-kernel, vince, eranian,
	Arnaldo Carvalho de Melo, Alexander Shishkin

At the moment, intel_bts will WARN() out if there is more than one
event writing to the same ring buffer, via SET_OUTPUT, and will only
send data from one event to a buffer.

There is no reason to have this warning in, so kill it.

Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
---
 arch/x86/events/intel/bts.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/x86/events/intel/bts.c b/arch/x86/events/intel/bts.c
index 0b8ccff564..41975c6c2d 100644
--- a/arch/x86/events/intel/bts.c
+++ b/arch/x86/events/intel/bts.c
@@ -363,8 +363,6 @@ bts_buffer_reset(struct bts_buffer *buf, struct perf_output_handle *handle)
 		return 0;
 
 	head = handle->head & ((buf->nr_pages << PAGE_SHIFT) - 1);
-	if (WARN_ON_ONCE(head != local_read(&buf->head)))
-		return -EINVAL;
 
 	phys = &buf->buf[buf->cur_buf];
 	space = phys->offset + phys->displacement + phys->size - head;
-- 
2.8.1

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 3/3] perf: Fix a race between mmap_close and set_output of AUX events
  2016-08-23  8:54 [PATCH 0/3] perf, bts: Fallout from the fuzzer for perf/urgent Alexander Shishkin
  2016-08-23  8:54 ` [PATCH 1/3] perf/x86/intel/bts: Fix confused ordering of PMU callbacks Alexander Shishkin
  2016-08-23  8:54 ` [PATCH 2/3] perf/x86/intel/bts: Kill a silly warning Alexander Shishkin
@ 2016-08-23  8:54 ` Alexander Shishkin
  2016-08-23 16:24 ` [PATCH 0/3] perf, bts: Fallout from the fuzzer for perf/urgent Vince Weaver
  3 siblings, 0 replies; 14+ messages in thread
From: Alexander Shishkin @ 2016-08-23  8:54 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, linux-kernel, vince, eranian,
	Arnaldo Carvalho de Melo, Alexander Shishkin

In the mmap_close path we need to stop all the AUX events that are
writing data to the AUX area that we are unmapping, before we can
safely free the pages. To determine if an event needs to be stopped,
we're comparing its ->rb against the one that's getting unmapped.
However, a SET_OUTPUT ioctl may turn up inside an AUX transaction
and swizzle event::rb to some other ring buffer, but the transaction
will keep writing data to the old ring buffer until the event gets
scheduled out. At this point, mmap_close will skip over such an
event and will proceed to free the AUX area, while it's still being
used by this event, which will set off a warning in the mmap_close
path and cause a memory corruption.

To avoid this, always stop an AUX event before its ->rb is updated;
this will release the (potentially) last reference on the AUX area
of the buffer. If the event gets restarted, its new ring buffer will
be used. If another SET_OUTPUT comes and switches it back to the
old ring buffer that's getting unmapped, it's also fine: this
ring buffer's aux_mmap_count will be zero and AUX transactions won't
start any more.

Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
---
 kernel/events/core.c | 31 +++++++++++++++++++++++++------
 1 file changed, 25 insertions(+), 6 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 803481cb6c..71df77e234 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -2464,11 +2464,11 @@ static int __perf_event_stop(void *info)
 	return 0;
 }
 
-static int perf_event_restart(struct perf_event *event)
+static int perf_event_stop(struct perf_event *event, int restart)
 {
 	struct stop_event_data sd = {
 		.event		= event,
-		.restart	= 1,
+		.restart	= restart,
 	};
 	int ret = 0;
 
@@ -4811,6 +4811,19 @@ static void ring_buffer_attach(struct perf_event *event,
 		spin_unlock_irqrestore(&rb->event_lock, flags);
 	}
 
+	/*
+	 * Avoid racing with perf_mmap_close(AUX): stop the event
+	 * before swizzling the event::rb pointer; if it's getting
+	 * unmapped, its aux_mmap_count will be 0 and it won't
+	 * restart. See the comment in __perf_pmu_output_stop().
+	 *
+	 * Data will inevitably be lost when set_output is done in
+	 * mid-air, but then again, whoever does it like this is
+	 * not in for the data anyway.
+	 */
+	if (has_aux(event))
+		perf_event_stop(event, 0);
+
 	rcu_assign_pointer(event->rb, rb);
 
 	if (old_rb) {
@@ -6086,7 +6099,7 @@ static void perf_event_addr_filters_exec(struct perf_event *event, void *data)
 	raw_spin_unlock_irqrestore(&ifh->lock, flags);
 
 	if (restart)
-		perf_event_restart(event);
+		perf_event_stop(event, 1);
 }
 
 void perf_event_exec(void)
@@ -6130,7 +6143,13 @@ static void __perf_event_output_stop(struct perf_event *event, void *data)
 
 	/*
 	 * In case of inheritance, it will be the parent that links to the
-	 * ring-buffer, but it will be the child that's actually using it:
+	 * ring-buffer, but it will be the child that's actually using it.
+	 *
+	 * We are using event::rb to determine if the event should be stopped,
+	 * however this may race with ring_buffer_attach() (through set_output),
+	 * which will make us skip the event that actually needs to be stopped.
+	 * So ring_buffer_attach() has to stop an aux event before re-assigning
+	 * its rb pointer.
 	 */
 	if (rcu_dereference(parent->rb) == rb)
 		ro->err = __perf_event_stop(&sd);
@@ -6653,7 +6672,7 @@ static void __perf_addr_filters_adjust(struct perf_event *event, void *data)
 	raw_spin_unlock_irqrestore(&ifh->lock, flags);
 
 	if (restart)
-		perf_event_restart(event);
+		perf_event_stop(event, 1);
 }
 
 /*
@@ -7831,7 +7850,7 @@ static void perf_event_addr_filters_apply(struct perf_event *event)
 	mmput(mm);
 
 restart:
-	perf_event_restart(event);
+	perf_event_stop(event, 1);
 }
 
 /*
-- 
2.8.1

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/3] perf, bts: Fallout from the fuzzer for perf/urgent
  2016-08-23  8:54 [PATCH 0/3] perf, bts: Fallout from the fuzzer for perf/urgent Alexander Shishkin
                   ` (2 preceding siblings ...)
  2016-08-23  8:54 ` [PATCH 3/3] perf: Fix a race between mmap_close and set_output of AUX events Alexander Shishkin
@ 2016-08-23 16:24 ` Vince Weaver
  2016-08-23 17:10   ` Alexander Shishkin
  3 siblings, 1 reply; 14+ messages in thread
From: Vince Weaver @ 2016-08-23 16:24 UTC (permalink / raw)
  To: Alexander Shishkin
  Cc: Peter Zijlstra, Ingo Molnar, linux-kernel, eranian,
	Arnaldo Carvalho de Melo

On Tue, 23 Aug 2016, Alexander Shishkin wrote:

> Recently Vince has reported warnings and panics coming from the
> general direction of AUX tracing. I found two bugs which manifest
> similarly, one in intel_bts driver and one in AUX unmapping path.
> 
> Both are triggered by racing SET_OUTPUT against mmap_close while
> running AUX tracing. I have a test case that set fire to the kernel
> within a few seconds by doing this, which I can share if anyone
> cares.

I've applied these and am fuzzing on my Haswell machine and I'm getting 
lots and lots of

[  945.536374] Dazed and confused, but trying to continue
[  945.542391] Uhhuh. NMI received for unknown reason 21 on CPU 2.
[  945.549360] Do you have a strange power saving mode enabled?
[  945.556701] Dazed and confused, but trying to continue
[  945.563643] Uhhuh. NMI received for unknown reason 31 on CPU 2.
[  945.571379] Do you have a strange power saving mode enabled?
[  945.578754] Dazed and confused, but trying to continue
[  945.630889] Uhhuh. NMI received for unknown reason 31 on CPU 2.
[  945.638141] Do you have a strange power saving mode enabled?
[  945.645523] Dazed and confused, but trying to continue

messages that I wasn't getting before.  Of course, I updated from -rc2 to 
-rc3 before applying the patch so it could also be related to that too, I 
can try backing out the patches and see if it still happens.

Vince

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/3] perf, bts: Fallout from the fuzzer for perf/urgent
  2016-08-23 16:24 ` [PATCH 0/3] perf, bts: Fallout from the fuzzer for perf/urgent Vince Weaver
@ 2016-08-23 17:10   ` Alexander Shishkin
  2016-08-23 20:55     ` Vince Weaver
  0 siblings, 1 reply; 14+ messages in thread
From: Alexander Shishkin @ 2016-08-23 17:10 UTC (permalink / raw)
  To: Vince Weaver
  Cc: Peter Zijlstra, Ingo Molnar, linux-kernel, eranian,
	Arnaldo Carvalho de Melo

Vince Weaver <vince@deater.net> writes:

> On Tue, 23 Aug 2016, Alexander Shishkin wrote:
>
>> Recently Vince has reported warnings and panics coming from the
>> general direction of AUX tracing. I found two bugs which manifest
>> similarly, one in intel_bts driver and one in AUX unmapping path.
>> 
>> Both are triggered by racing SET_OUTPUT against mmap_close while
>> running AUX tracing. I have a test case that set fire to the kernel
>> within a few seconds by doing this, which I can share if anyone
>> cares.
>
> I've applied these and am fuzzing on my Haswell machine and I'm getting 
> lots and lots of
>
> [  945.536374] Dazed and confused, but trying to continue
> [  945.542391] Uhhuh. NMI received for unknown reason 21 on CPU 2.
> [  945.549360] Do you have a strange power saving mode enabled?
> [  945.556701] Dazed and confused, but trying to continue
> [  945.563643] Uhhuh. NMI received for unknown reason 31 on CPU 2.
> [  945.571379] Do you have a strange power saving mode enabled?
> [  945.578754] Dazed and confused, but trying to continue
> [  945.630889] Uhhuh. NMI received for unknown reason 31 on CPU 2.
> [  945.638141] Do you have a strange power saving mode enabled?
> [  945.645523] Dazed and confused, but trying to continue
>
> messages that I wasn't getting before.  Of course, I updated from -rc2 to 
> -rc3 before applying the patch so it could also be related to that too, I 
> can try backing out the patches and see if it still happens.

Uhhuh, that's not cool, let me look some more. But it should stop dying
on bts/aux stuff at least.

Regards,
--
Alex

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/3] perf, bts: Fallout from the fuzzer for perf/urgent
  2016-08-23 17:10   ` Alexander Shishkin
@ 2016-08-23 20:55     ` Vince Weaver
  2016-08-24  3:44       ` Alexander Shishkin
  0 siblings, 1 reply; 14+ messages in thread
From: Vince Weaver @ 2016-08-23 20:55 UTC (permalink / raw)
  To: Alexander Shishkin
  Cc: Peter Zijlstra, Ingo Molnar, linux-kernel, eranian,
	Arnaldo Carvalho de Melo

On Tue, 23 Aug 2016, Alexander Shishkin wrote:

> Vince Weaver <vince@deater.net> writes:
> 
> > On Tue, 23 Aug 2016, Alexander Shishkin wrote:
> >
> >> Recently Vince has reported warnings and panics coming from the
> >> general direction of AUX tracing. I found two bugs which manifest
> >> similarly, one in intel_bts driver and one in AUX unmapping path.
> >> 
> >> Both are triggered by racing SET_OUTPUT against mmap_close while
> >> running AUX tracing. I have a test case that set fire to the kernel
> >> within a few seconds by doing this, which I can share if anyone
> >> cares.
> >
> > I've applied these and am fuzzing on my Haswell machine and I'm getting 
> > lots and lots of
> >
> > [  945.536374] Dazed and confused, but trying to continue
> > [  945.542391] Uhhuh. NMI received for unknown reason 21 on CPU 2.
> > [  945.549360] Do you have a strange power saving mode enabled?
> > [  945.556701] Dazed and confused, but trying to continue
> > [  945.563643] Uhhuh. NMI received for unknown reason 31 on CPU 2.
> > [  945.571379] Do you have a strange power saving mode enabled?
> > [  945.578754] Dazed and confused, but trying to continue
> > [  945.630889] Uhhuh. NMI received for unknown reason 31 on CPU 2.
> > [  945.638141] Do you have a strange power saving mode enabled?
> > [  945.645523] Dazed and confused, but trying to continue
> >
> > messages that I wasn't getting before.  Of course, I updated from -rc2 to 
> > -rc3 before applying the patch so it could also be related to that too, I 
> > can try backing out the patches and see if it still happens.
> 
> Uhhuh, that's not cool, let me look some more. But it should stop dying
> on bts/aux stuff at least.

Just wanted to say that when I backed out these patches the NMI errors 
went away, so they do seem to be a result of this patch series.

Vince

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/3] perf, bts: Fallout from the fuzzer for perf/urgent
  2016-08-23 20:55     ` Vince Weaver
@ 2016-08-24  3:44       ` Alexander Shishkin
  0 siblings, 0 replies; 14+ messages in thread
From: Alexander Shishkin @ 2016-08-24  3:44 UTC (permalink / raw)
  To: Vince Weaver
  Cc: Alexander Shishkin, Peter Zijlstra, Ingo Molnar,
	Linux Kernel Mailing List, Stephane Eranian,
	Arnaldo Carvalho de Melo

Thanks! I'll get back to you with something better.

On 23 August 2016 at 23:55, Vince Weaver <vince@deater.net> wrote:
> On Tue, 23 Aug 2016, Alexander Shishkin wrote:
>
>> Vince Weaver <vince@deater.net> writes:
>>
>> > On Tue, 23 Aug 2016, Alexander Shishkin wrote:
>> >
>> >> Recently Vince has reported warnings and panics coming from the
>> >> general direction of AUX tracing. I found two bugs which manifest
>> >> similarly, one in intel_bts driver and one in AUX unmapping path.
>> >>
>> >> Both are triggered by racing SET_OUTPUT against mmap_close while
>> >> running AUX tracing. I have a test case that set fire to the kernel
>> >> within a few seconds by doing this, which I can share if anyone
>> >> cares.
>> >
>> > I've applied these and am fuzzing on my Haswell machine and I'm getting
>> > lots and lots of
>> >
>> > [  945.536374] Dazed and confused, but trying to continue
>> > [  945.542391] Uhhuh. NMI received for unknown reason 21 on CPU 2.
>> > [  945.549360] Do you have a strange power saving mode enabled?
>> > [  945.556701] Dazed and confused, but trying to continue
>> > [  945.563643] Uhhuh. NMI received for unknown reason 31 on CPU 2.
>> > [  945.571379] Do you have a strange power saving mode enabled?
>> > [  945.578754] Dazed and confused, but trying to continue
>> > [  945.630889] Uhhuh. NMI received for unknown reason 31 on CPU 2.
>> > [  945.638141] Do you have a strange power saving mode enabled?
>> > [  945.645523] Dazed and confused, but trying to continue
>> >
>> > messages that I wasn't getting before.  Of course, I updated from -rc2 to
>> > -rc3 before applying the patch so it could also be related to that too, I
>> > can try backing out the patches and see if it still happens.
>>
>> Uhhuh, that's not cool, let me look some more. But it should stop dying
>> on bts/aux stuff at least.
>
> Just wanted to say that when I backed out these patches the NMI errors
> went away, so they do seem to be a result of this patch series.
>
> Vince

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/3] perf/x86/intel/bts: Fix confused ordering of PMU callbacks
  2016-08-23  8:54 ` [PATCH 1/3] perf/x86/intel/bts: Fix confused ordering of PMU callbacks Alexander Shishkin
@ 2016-08-24 12:28   ` Alexander Shishkin
  2016-08-24 14:15     ` Alexander Shishkin
  0 siblings, 1 reply; 14+ messages in thread
From: Alexander Shishkin @ 2016-08-24 12:28 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, linux-kernel, vince, eranian, Arnaldo Carvalho de Melo

Alexander Shishkin <alexander.shishkin@linux.intel.com> writes:

> The intel_bts driver is using a cpu-local 'started' variable to order
> callbacks and PMIs and make sure that AUX transactions don't get messed
> up. However, the ordering rules in regard to this variable is a complete
> mess, which recently resulted in perf_fuzzer-triggered warnings and
> panics.
>
> The general ordering rule that is patch is enforcing is that this
> cpu-local variable be set only when the cpu-local AUX transaction is
> active; consequently, this variable is to be checked before the AUX
> related bits can be touched.
>
> Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>

Ok, this one is broken, please disregard.

Regards,
--
Alex

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/3] perf/x86/intel/bts: Fix confused ordering of PMU callbacks
  2016-08-24 12:28   ` Alexander Shishkin
@ 2016-08-24 14:15     ` Alexander Shishkin
  2016-08-26 20:49       ` Vince Weaver
  2016-08-29 14:14       ` Peter Zijlstra
  0 siblings, 2 replies; 14+ messages in thread
From: Alexander Shishkin @ 2016-08-24 14:15 UTC (permalink / raw)
  To: vince, Peter Zijlstra
  Cc: Ingo Molnar, linux-kernel, eranian, Arnaldo Carvalho de Melo

Alexander Shishkin <alexander.shishkin@linux.intel.com> writes:

> Alexander Shishkin <alexander.shishkin@linux.intel.com> writes:
>
>> Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
>
> Ok, this one is broken, please disregard.

Vince, can you try the following (with the other two in this series)?

---

>From 68713194b3df8e565c4d319a80e9e7338fa1ec13 Mon Sep 17 00:00:00 2001
From: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Date: Tue, 23 Aug 2016 10:50:57 +0300
Subject: [PATCH] perf/x86/intel/bts: Fix confused ordering of PMU callbacks

The intel_bts driver is using a cpu-local 'started' variable to order
callbacks and PMIs and make sure that AUX transactions don't get messed
up. However, the ordering rules in regard to this variable is a complete
mess, which recently resulted in perf_fuzzer-triggered warnings and
panics.

The general ordering rule that is patch is enforcing is that this
cpu-local variable be set only when the cpu-local AUX transaction is
active; consequently, this variable is to be checked before the AUX
related bits can be touched.

Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
---
 arch/x86/events/intel/bts.c | 119 +++++++++++++++++++++++++++++++++++---------
 1 file changed, 96 insertions(+), 23 deletions(-)

diff --git a/arch/x86/events/intel/bts.c b/arch/x86/events/intel/bts.c
index 0a6e393a2e..13ce76e895 100644
--- a/arch/x86/events/intel/bts.c
+++ b/arch/x86/events/intel/bts.c
@@ -31,7 +31,18 @@
 struct bts_ctx {
 	struct perf_output_handle	handle;
 	struct debug_store		ds_back;
-	int				started;
+	local_t				pmi_pending;
+	int				state;
+};
+
+/* BTS context states: */
+enum {
+	/* no ongoing AUX transactions */
+	BTS_STATE_STOPPED = 0,
+	/* AUX transaction is on, BTS tracing is disabled */
+	BTS_STATE_INACTIVE,
+	/* AUX transaction is on, BTS tracing is running */
+	BTS_STATE_ACTIVE,
 };
 
 static DEFINE_PER_CPU(struct bts_ctx, bts_ctx);
@@ -191,6 +202,9 @@ static void bts_update(struct bts_ctx *bts)
 		if (ds->bts_index >= ds->bts_absolute_maximum)
 			local_inc(&buf->lost);
 
+		if (ds->bts_index >= ds->bts_interrupt_threshold)
+			local_inc(&bts->pmi_pending);
+
 		/*
 		 * old and head are always in the same physical buffer, so we
 		 * can subtract them to get the data size.
@@ -204,6 +218,16 @@ static void bts_update(struct bts_ctx *bts)
 static int
 bts_buffer_reset(struct bts_buffer *buf, struct perf_output_handle *handle);
 
+/*
+ * Ordering PMU callbacks wrt themselves and the PMI is done by means
+ * of bts::state, which:
+ *  - is set when bts::handle::event is valid, that is, between
+ *    perf_aux_output_begin() and perf_aux_output_end();
+ *  - is zero otherwise;
+ *  - is ordered against bts::handle::event with a local barrier to
+ *    enforce CPU ordering.
+ */
+
 static void __bts_event_start(struct perf_event *event)
 {
 	struct bts_ctx *bts = this_cpu_ptr(&bts_ctx);
@@ -221,10 +245,13 @@ static void __bts_event_start(struct perf_event *event)
 
 	/*
 	 * local barrier to make sure that ds configuration made it
-	 * before we enable BTS
+	 * before we enable BTS and bts::state goes ACTIVE
 	 */
 	wmb();
 
+	/* INACTIVE/STOPPED -> ACTIVE */
+	WRITE_ONCE(bts->state, BTS_STATE_ACTIVE);
+
 	intel_pmu_enable_bts(config);
 
 }
@@ -251,9 +278,6 @@ static void bts_event_start(struct perf_event *event, int flags)
 
 	__bts_event_start(event);
 
-	/* PMI handler: this counter is running and likely generating PMIs */
-	ACCESS_ONCE(bts->started) = 1;
-
 	return;
 
 fail_end_stop:
@@ -265,28 +289,36 @@ fail_stop:
 
 static void __bts_event_stop(struct perf_event *event)
 {
+	struct bts_ctx *bts = this_cpu_ptr(&bts_ctx);
+
+	/* ACTIVE -> INACTIVE */
+	WRITE_ONCE(bts->state, BTS_STATE_INACTIVE);
+
 	/*
 	 * No extra synchronization is mandated by the documentation to have
 	 * BTS data stores globally visible.
 	 */
 	intel_pmu_disable_bts();
-
-	if (event->hw.state & PERF_HES_STOPPED)
-		return;
-
-	ACCESS_ONCE(event->hw.state) |= PERF_HES_STOPPED;
 }
 
 static void bts_event_stop(struct perf_event *event, int flags)
 {
 	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct bts_ctx *bts = this_cpu_ptr(&bts_ctx);
-	struct bts_buffer *buf = perf_get_aux(&bts->handle);
+	struct bts_buffer *buf;
+
+	if (READ_ONCE(bts->state) == BTS_STATE_ACTIVE)
+		__bts_event_stop(event);
+
+	/*
+	 * order buf (handle::event load) against bts::state store;
+	 * matches wmb() in intel_bts_interrupt()
+	 */
+	mb();
 
-	/* PMI handler: don't restart this counter */
-	ACCESS_ONCE(bts->started) = 0;
+	buf = perf_get_aux(&bts->handle);
 
-	__bts_event_stop(event);
+	event->hw.state |= PERF_HES_STOPPED;
 
 	if (flags & PERF_EF_UPDATE) {
 		bts_update(bts);
@@ -298,6 +330,11 @@ static void bts_event_stop(struct perf_event *event, int flags)
 						   buf->nr_pages << PAGE_SHIFT);
 			perf_aux_output_end(&bts->handle, local_xchg(&buf->data_size, 0),
 					    !!local_xchg(&buf->lost, 0));
+			/*
+			 * no need to enforce ordering here as the PMI won't come after
+			 * __bts_event_stop()+mb() above
+			 */
+			WRITE_ONCE(bts->state, BTS_STATE_STOPPED);
 		}
 
 		cpuc->ds->bts_index = bts->ds_back.bts_buffer_base;
@@ -311,7 +348,13 @@ void intel_bts_enable_local(void)
 {
 	struct bts_ctx *bts = this_cpu_ptr(&bts_ctx);
 
-	if (bts->handle.event && bts->started)
+	if (READ_ONCE(bts->state) == BTS_STATE_ACTIVE)
+		return;
+
+	/* matches wmb() ordering bts::state store against handle::event */
+	rmb();
+
+	if (bts->handle.event)
 		__bts_event_start(bts->handle.event);
 }
 
@@ -319,6 +362,12 @@ void intel_bts_disable_local(void)
 {
 	struct bts_ctx *bts = this_cpu_ptr(&bts_ctx);
 
+	if (READ_ONCE(bts->state) == BTS_STATE_INACTIVE)
+		return;
+
+	/* matches wmb() ordering bts::state store against handle::event */
+	rmb();
+
 	if (bts->handle.event)
 		__bts_event_stop(bts->handle.event);
 }
@@ -407,10 +456,25 @@ int intel_bts_interrupt(void)
 	struct perf_event *event = bts->handle.event;
 	struct bts_buffer *buf;
 	s64 old_head;
-	int err;
+	int err = -ENOSPC, handled = 0;
 
-	if (!event || !bts->started)
-		return 0;
+	if (local_read(&bts->pmi_pending)) {
+		handled = 1;
+		local_set(&bts->pmi_pending, 0);
+	}
+
+	/*
+	 * this is wrapped in intel_bts_enable_local/intel_bts_disable_local,
+	 * so we can only be INACTIVE or STOPPED
+	 */
+	if (READ_ONCE(bts->state) == BTS_STATE_STOPPED)
+		return handled;
+
+	/* matches wmb() ordering bts::state store against handle::event */
+	rmb();
+
+	if (!event)
+		return handled;
 
 	buf = perf_get_aux(&bts->handle);
 	/*
@@ -432,12 +496,21 @@ int intel_bts_interrupt(void)
 			    !!local_xchg(&buf->lost, 0));
 
 	buf = perf_aux_output_begin(&bts->handle, event);
-	if (!buf)
-		return 1;
+	if (buf)
+		err = bts_buffer_reset(buf, &bts->handle);
+
+	if (err) {
+		WRITE_ONCE(bts->state, BTS_STATE_STOPPED);
 
-	err = bts_buffer_reset(buf, &bts->handle);
-	if (err)
-		perf_aux_output_end(&bts->handle, 0, false);
+		if (buf) {
+			/*
+			 * BTS_STATE_STOPPED should be visible before
+			 * cleared handle::event
+			 */
+			wmb();
+			perf_aux_output_end(&bts->handle, 0, false);
+		}
+	}
 
 	return 1;
 }
-- 
2.8.1

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/3] perf/x86/intel/bts: Fix confused ordering of PMU callbacks
  2016-08-24 14:15     ` Alexander Shishkin
@ 2016-08-26 20:49       ` Vince Weaver
  2016-08-27  6:35         ` Alexander Shishkin
  2016-08-29 14:14       ` Peter Zijlstra
  1 sibling, 1 reply; 14+ messages in thread
From: Vince Weaver @ 2016-08-26 20:49 UTC (permalink / raw)
  To: Alexander Shishkin
  Cc: Peter Zijlstra, Ingo Molnar, linux-kernel, eranian,
	Arnaldo Carvalho de Melo

On Wed, 24 Aug 2016, Alexander Shishkin wrote:

> Alexander Shishkin <alexander.shishkin@linux.intel.com> writes:
> 
> > Alexander Shishkin <alexander.shishkin@linux.intel.com> writes:
> >
> >> Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
> >
> > Ok, this one is broken, please disregard.
> 
> Vince, can you try the following (with the other two in this series)?
> 

Sorry for the delay.  Not sure if testing was still needed (or if it was 
superceded by Will Deacon's patch) but I ran the fuzzer for a few hours 
with this patch and the other two applied and didn't see any of the NMI 
errors that I saw with the original patch.

Vince

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/3] perf/x86/intel/bts: Fix confused ordering of PMU callbacks
  2016-08-26 20:49       ` Vince Weaver
@ 2016-08-27  6:35         ` Alexander Shishkin
  0 siblings, 0 replies; 14+ messages in thread
From: Alexander Shishkin @ 2016-08-27  6:35 UTC (permalink / raw)
  To: Vince Weaver
  Cc: Alexander Shishkin, Peter Zijlstra, Ingo Molnar,
	Linux Kernel Mailing List, Stephane Eranian,
	Arnaldo Carvalho de Melo

Yes, Will's patch notwithstanding, these patches fix the
warnings/oopses that you observed.

On 26 August 2016 at 23:49, Vince Weaver <vince@deater.net> wrote:
> On Wed, 24 Aug 2016, Alexander Shishkin wrote:
>
>> Alexander Shishkin <alexander.shishkin@linux.intel.com> writes:
>>
>> > Alexander Shishkin <alexander.shishkin@linux.intel.com> writes:
>> >
>> >> Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
>> >
>> > Ok, this one is broken, please disregard.
>>
>> Vince, can you try the following (with the other two in this series)?
>>
>
> Sorry for the delay.  Not sure if testing was still needed (or if it was
> superceded by Will Deacon's patch) but I ran the fuzzer for a few hours
> with this patch and the other two applied and didn't see any of the NMI
> errors that I saw with the original patch.
>
> Vince

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/3] perf/x86/intel/bts: Fix confused ordering of PMU callbacks
  2016-08-24 14:15     ` Alexander Shishkin
  2016-08-26 20:49       ` Vince Weaver
@ 2016-08-29 14:14       ` Peter Zijlstra
  2016-08-29 14:43         ` Alexander Shishkin
  1 sibling, 1 reply; 14+ messages in thread
From: Peter Zijlstra @ 2016-08-29 14:14 UTC (permalink / raw)
  To: Alexander Shishkin
  Cc: vince, Ingo Molnar, linux-kernel, eranian, Arnaldo Carvalho de Melo

On Wed, Aug 24, 2016 at 05:15:54PM +0300, Alexander Shishkin wrote:
> @@ -221,10 +245,13 @@ static void __bts_event_start(struct perf_event *event)
>  
>  	/*
>  	 * local barrier to make sure that ds configuration made it
> -	 * before we enable BTS
> +	 * before we enable BTS and bts::state goes ACTIVE
>  	 */
>  	wmb();
>  
> +	/* INACTIVE/STOPPED -> ACTIVE */
> +	WRITE_ONCE(bts->state, BTS_STATE_ACTIVE);
> +
>  	intel_pmu_enable_bts(config);
>  
>  }

Alexander, were you going to post a new version of this patch without
the barrier confusion?

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/3] perf/x86/intel/bts: Fix confused ordering of PMU callbacks
  2016-08-29 14:14       ` Peter Zijlstra
@ 2016-08-29 14:43         ` Alexander Shishkin
  0 siblings, 0 replies; 14+ messages in thread
From: Alexander Shishkin @ 2016-08-29 14:43 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: vince, Ingo Molnar, linux-kernel, eranian, Arnaldo Carvalho de Melo

Peter Zijlstra <peterz@infradead.org> writes:

> On Wed, Aug 24, 2016 at 05:15:54PM +0300, Alexander Shishkin wrote:
>> @@ -221,10 +245,13 @@ static void __bts_event_start(struct perf_event *event)
>>  
>>  	/*
>>  	 * local barrier to make sure that ds configuration made it
>> -	 * before we enable BTS
>> +	 * before we enable BTS and bts::state goes ACTIVE
>>  	 */
>>  	wmb();
>>  
>> +	/* INACTIVE/STOPPED -> ACTIVE */
>> +	WRITE_ONCE(bts->state, BTS_STATE_ACTIVE);
>> +
>>  	intel_pmu_enable_bts(config);
>>  
>>  }
>
> Alexander, were you going to post a new version of this patch without
> the barrier confusion?

Yes, only the testcase exploded again over the weekend, looking at it
again.

Regards,
--
Alex

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2016-08-29 14:43 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-08-23  8:54 [PATCH 0/3] perf, bts: Fallout from the fuzzer for perf/urgent Alexander Shishkin
2016-08-23  8:54 ` [PATCH 1/3] perf/x86/intel/bts: Fix confused ordering of PMU callbacks Alexander Shishkin
2016-08-24 12:28   ` Alexander Shishkin
2016-08-24 14:15     ` Alexander Shishkin
2016-08-26 20:49       ` Vince Weaver
2016-08-27  6:35         ` Alexander Shishkin
2016-08-29 14:14       ` Peter Zijlstra
2016-08-29 14:43         ` Alexander Shishkin
2016-08-23  8:54 ` [PATCH 2/3] perf/x86/intel/bts: Kill a silly warning Alexander Shishkin
2016-08-23  8:54 ` [PATCH 3/3] perf: Fix a race between mmap_close and set_output of AUX events Alexander Shishkin
2016-08-23 16:24 ` [PATCH 0/3] perf, bts: Fallout from the fuzzer for perf/urgent Vince Weaver
2016-08-23 17:10   ` Alexander Shishkin
2016-08-23 20:55     ` Vince Weaver
2016-08-24  3:44       ` Alexander Shishkin

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.