* [GIT PULL] perf crash fix
@ 2010-06-03  3:13 Frederic Weisbecker
  2010-06-03  8:02 ` Peter Zijlstra
  2010-06-03  9:26 ` Peter Zijlstra
  0 siblings, 2 replies; 9+ messages in thread
From: Frederic Weisbecker @ 2010-06-03  3:13 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Frederic Weisbecker, Peter Zijlstra,
	Arnaldo Carvalho de Melo, Paul Mackerras, Stephane Eranian

Ingo,

Please pull the perf/urgent branch that can be found at:

git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing.git
	perf/urgent

Thanks,
	Frederic
---

Frederic Weisbecker (1):
      perf: Fix racy software pmu disabling


 kernel/perf_event.c |   20 ++++++++------------
 1 files changed, 8 insertions(+), 12 deletions(-)

---
commit f5ec7ecb6794782197305cd2fdd552b914800330
Author: Frederic Weisbecker <fweisbec@gmail.com>
Date:   Thu Jun 3 03:46:03 2010 +0200

    perf: Fix racy software pmu disabling
    
    Running "perf record -a -e faults" triggers the following bug:
    
    	BUG: unable to handle kernel paging request at 00200200 <-- LIST_POISON2
    	IP: [<c10aeebb>] perf_swevent_disable+0xb/0x20
    	*pde = 00000000
    	Oops: 0002 [#1] PREEMPT SMP
    	last sysfs file: /sys/devices/system/cpu/online
    
    	Pid: 3357, comm: perf Not tainted 2.6.35-rc1-atom #571 MS-7418/MS-7418
    	EIP: 0060:[<c10aeebb>] EFLAGS: 00010046 CPU: 0
    	EIP is at perf_swevent_disable+0xb/0x20
    	EAX: f5641400 EBX: f5641400 ECX: 00200200 EDX: 00000000
    	ESI: 00000000 EDI: f56414f8 EBP: f49dda80 ESP: f49dda80
    	 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
    	Process perf (pid: 3357, ti=f49dc000 task=f48504b0 task.ti=f49dc000)
    	Stack:
    	 f49ddacc c10b0a9d 3b987840 00000000 00000001 00000000 00000000 000f41a8
    	<0> 00000000 f5641400 00000001 00000000 3b9aca00 00000000 000003e8 00000000
    	<0> f5641400 00000001 00000000 f49ddaf0 c10b5b71 00000001 00000000 c283f180
    	Call Trace:
    	 [<c10b0a9d>] ? perf_adjust_period+0x3cd/0x3f0
    	 [<c10b5b71>] ? perf_ctx_adjust_freq+0xe1/0x140
    	 [<c10b5d0f>] ? perf_event_task_tick+0x13f/0x160
    	 [<c103e9e0>] ? scheduler_tick+0x110/0x310
    	 [<c10510e7>] ? update_process_times+0x47/0x60
    	 [<c106d55c>] ? tick_sched_timer+0x5c/0xb0
    	 [<c1061f07>] ? __run_hrtimer+0x67/0x330
    	 [<c106d500>] ? tick_sched_timer+0x0/0xb0
    	 [<c1062562>] ? hrtimer_interrupt+0x202/0x2f0
    	 [<c134f8d5>] ? write_msg+0xd5/0xe0
    	 [<c101edb0>] ? smp_apic_timer_interrupt+0x50/0x90
    	 [<c1218ce8>] ? trace_hardirqs_off_thunk+0xc/0x14
    	 [<c150fe3b>] ? apic_timer_interrupt+0x2f/0x34
    	 [<c150f56f>] ? _raw_spin_unlock_irqrestore+0x2f/0x60
    	 [<c1041fb5>] ? vprintk+0x145/0x450
    	 [<c1064d62>] ? sched_clock_local+0xd2/0x170
    	 [<c1064f28>] ? sched_clock_cpu+0x128/0x170
    	 [<c150b6a3>] ? printk+0x18/0x1a
    	 [<c107076c>] ? lockdep_rcu_dereference+0x3c/0xb0
    	 [<c10b0697>] ? perf_swevent_enable+0x1e7/0x220
    	 [<c100f7c8>] ? intel_pmu_disable_all+0x68/0xd0
    	 [<c100f9b5>] ? hw_perf_disable+0x45/0x50
    	 [<c10b0aaa>] ? perf_adjust_period+0x3da/0x3f0
    	 [<c10b5f4b>] ? __perf_event_overflow+0x21b/0x260
    	 [<c1008f54>] ? native_sched_clock+0x24/0x90
    	 [<c10b60f4>] ? perf_swevent_overflow+0x164/0x180
    	 [<c10b6193>] ? perf_swevent_add+0x83/0xd0
    	 [<c10b6544>] ? __perf_sw_event+0x1d4/0x290
    	 [<c10b63a6>] ? __perf_sw_event+0x36/0x290
    	 [<c106ebdd>] ? put_lock_stats+0xd/0x30
    	 [<c106fc36>] ? lock_release_holdtime+0xa6/0x1a0
    	 [<c1074b0f>] ? lock_release_non_nested+0x2cf/0x2e0
    	 [<c103388b>] ? get_parent_ip+0xb/0x40
    	 [<c1034c2b>] ? sub_preempt_count+0x7b/0xb0
    	 [<c10cf72a>] ? might_fault+0x5a/0xb0
    	 [<c102778d>] ? do_page_fault+0x25d/0x3b0
    	 [<c1071544>] ? trace_hardirqs_on_caller+0x124/0x170
    	 [<c102766b>] ? do_page_fault+0x13b/0x3b0
    	 [<c10cf770>] ? might_fault+0xa0/0xb0
    	 [<c10cf72a>] ? might_fault+0x5a/0xb0
    	 [<c1219032>] ? copy_to_user+0x32/0x70
    	 [<c1002bfb>] ? sysenter_exit+0xf/0x16
    	 [<c1071544>] ? trace_hardirqs_on_caller+0x124/0x170
    	 [<c1027530>] ? do_page_fault+0x0/0x3b0
    	 [<c1510083>] ? error_code+0x6b/0x70
    	 [<c1027530>] ? do_page_fault+0x0/0x3b0
    
    What happens here is a double pmu->disable() due to a race between
    two perf_adjust_period() calls.
    
    We first overflow a page fault event and then re-adjust the period.
    When we reset the period_left, we stop the pmu by removing the
    perf event from the software event hlist. And just before we
    re-enable it, we are interrupted by a sched tick that also tries to
    re-adjust the period. There we eventually disable the event a second
    time, which leads to a double hlist_del_rcu() that ends up
    dereferencing LIST_POISON2.
    
    In fact, wrapping the reset of period_left between pmu->stop() and
    pmu->start() is only relevant to hardware events: we want them to
    reprogram the next period interrupt.
    
    But this is useless for software events. They have their own way to
    handle the period left, and in a non-racy way. They don't need to
    be stopped here.
    
    So, use a new pair of perf_event_stop/start_hwevent that only stop
    and restart hardware events in this path.
    
    The race won't happen with hardware events as sched ticks can't
    happen during NMIs.
    
    Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
    Cc: Ingo Molnar <mingo@elte.hu>
    Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
    Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
    Cc: Paul Mackerras <paulus@samba.org>
    Cc: Stephane Eranian <eranian@google.com>

diff --git a/kernel/perf_event.c b/kernel/perf_event.c
index 858f56f..b666d7d 100644
--- a/kernel/perf_event.c
+++ b/kernel/perf_event.c
@@ -1510,20 +1510,16 @@ do {					\
 	return div64_u64(dividend, divisor);
 }
 
-static void perf_event_stop(struct perf_event *event)
+static void perf_event_stop_hwevent(struct perf_event *event)
 {
-	if (!event->pmu->stop)
-		return event->pmu->disable(event);
-
-	return event->pmu->stop(event);
+	if (event->pmu->stop && !is_software_event(event))
+		return event->pmu->stop(event);
 }
 
-static int perf_event_start(struct perf_event *event)
+static int perf_event_start_hwevent(struct perf_event *event)
 {
-	if (!event->pmu->start)
-		return event->pmu->enable(event);
-
-	return event->pmu->start(event);
+	if (event->pmu->start && !is_software_event(event))
+		return event->pmu->start(event);
 }
 
 static void perf_adjust_period(struct perf_event *event, u64 nsec, u64 count)
@@ -1546,9 +1542,9 @@ static void perf_adjust_period(struct perf_event *event, u64 nsec, u64 count)
 
 	if (atomic64_read(&hwc->period_left) > 8*sample_period) {
 		perf_disable();
-		perf_event_stop(event);
+		perf_event_stop_hwevent(event);
 		atomic64_set(&hwc->period_left, 0);
-		perf_event_start(event);
+		perf_event_start_hwevent(event);
 		perf_enable();
 	}
 }
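
For readers unfamiliar with the kernel's list poisoning, here is a hedged
userspace sketch (mock types, not the real <linux/list.h>) of why the double
hlist_del_rcu() described in the changelog faults: the first removal poisons
the node's back-pointer with LIST_POISON2 (0x00200200 on 32-bit, the exact
address in the oops above), so a second removal writes through that poison.
The guarded variant below is purely illustrative, not kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace mock of the kernel's hlist poisoning -- illustration only.
 * LIST_POISON2 matches the faulting address in the oops quoted above. */
#define LIST_POISON2 ((struct hlist_node **)0x00200200)

struct hlist_node {
	struct hlist_node *next;
	struct hlist_node **pprev;
};

/* Simplified hlist_del_rcu(): unlink, then poison the back-pointer so
 * stale reuse of the node is caught immediately. */
static void hlist_del_mock(struct hlist_node *n)
{
	struct hlist_node *next = n->next;
	struct hlist_node **pprev = n->pprev;

	*pprev = next;	/* on a second call pprev is LIST_POISON2, so
			 * this store is the write to 00200200 */
	if (next)
		next->pprev = pprev;
	n->pprev = LIST_POISON2;
}

/* Guarded variant (hypothetical): refuse the double removal the race
 * produced instead of writing through the poison pointer. */
static int hlist_del_safe(struct hlist_node *n)
{
	if (n->pprev == LIST_POISON2)
		return -1;	/* already unlinked */
	hlist_del_mock(n);
	return 0;
}
```

Deleting a node once succeeds and poisons it; a second delete is exactly the
double pmu->disable() path, caught here by the poison check.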


* Re: [GIT PULL] perf crash fix
  2010-06-03  3:13 [GIT PULL] perf crash fix Frederic Weisbecker
@ 2010-06-03  8:02 ` Peter Zijlstra
  2010-06-03  9:08   ` Peter Zijlstra
  2010-06-03 12:35   ` Frederic Weisbecker
  2010-06-03  9:26 ` Peter Zijlstra
  1 sibling, 2 replies; 9+ messages in thread
From: Peter Zijlstra @ 2010-06-03  8:02 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Ingo Molnar, LKML, Arnaldo Carvalho de Melo, Paul Mackerras,
	Stephane Eranian

On Thu, 2010-06-03 at 05:13 +0200, Frederic Weisbecker wrote:

> diff --git a/kernel/perf_event.c b/kernel/perf_event.c
> index 858f56f..b666d7d 100644
> --- a/kernel/perf_event.c
> +++ b/kernel/perf_event.c
> @@ -1510,20 +1510,16 @@ do {					\
>  	return div64_u64(dividend, divisor);
>  }
>  
> -static void perf_event_stop(struct perf_event *event)
> +static void perf_event_stop_hwevent(struct perf_event *event)
>  {
> -	if (!event->pmu->stop)
> -		return event->pmu->disable(event);
> -
> -	return event->pmu->stop(event);
> +	if (event->pmu->stop && !is_software_event(event))
> +		return event->pmu->stop(event);
>  }
>  
> -static int perf_event_start(struct perf_event *event)
> +static int perf_event_start_hwevent(struct perf_event *event)
>  {
> -	if (!event->pmu->start)
> -		return event->pmu->enable(event);
> -
> -	return event->pmu->start(event);
> +	if (event->pmu->start && !is_software_event(event))
> +		return event->pmu->start(event);
>  }
>  
>  static void perf_adjust_period(struct perf_event *event, u64 nsec, u64 count)
> @@ -1546,9 +1542,9 @@ static void perf_adjust_period(struct perf_event *event, u64 nsec, u64 count)
>  
>  	if (atomic64_read(&hwc->period_left) > 8*sample_period) {
>  		perf_disable();
> -		perf_event_stop(event);
> +		perf_event_stop_hwevent(event);
>  		atomic64_set(&hwc->period_left, 0);
> -		perf_event_start(event);
> +		perf_event_start_hwevent(event);
>  		perf_enable();
>  	}
>  }

Urhm, isn't it much easier to simply give the software events NOP
stop/start callbacks?

 kernel/perf_event.c |    6 ++++++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/kernel/perf_event.c b/kernel/perf_event.c
index 858f56f..16c99e1 100644
--- a/kernel/perf_event.c
+++ b/kernel/perf_event.c
@@ -4276,9 +4276,15 @@ static void perf_swevent_disable(struct perf_event *event)
 	hlist_del_rcu(&event->hlist_entry);
 }
 
+static void perf_swevent_nop(struct perf_event *event)
+{
+}
+
 static const struct pmu perf_ops_generic = {
 	.enable		= perf_swevent_enable,
 	.disable	= perf_swevent_disable,
+	.start		= perf_swevent_nop,
+	.stop		= perf_swevent_nop,
 	.read		= perf_swevent_read,
 	.unthrottle	= perf_swevent_unthrottle,
 };



* Re: [GIT PULL] perf crash fix
  2010-06-03  8:02 ` Peter Zijlstra
@ 2010-06-03  9:08   ` Peter Zijlstra
  2010-06-03 12:35   ` Frederic Weisbecker
  1 sibling, 0 replies; 9+ messages in thread
From: Peter Zijlstra @ 2010-06-03  9:08 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Ingo Molnar, LKML, Arnaldo Carvalho de Melo, Paul Mackerras,
	Stephane Eranian

On Thu, 2010-06-03 at 10:02 +0200, Peter Zijlstra wrote:
> On Thu, 2010-06-03 at 05:13 +0200, Frederic Weisbecker wrote:
> 
> > diff --git a/kernel/perf_event.c b/kernel/perf_event.c
> > index 858f56f..b666d7d 100644
> > --- a/kernel/perf_event.c
> > +++ b/kernel/perf_event.c
> > @@ -1510,20 +1510,16 @@ do {					\
> >  	return div64_u64(dividend, divisor);
> >  }
> >  
> > -static void perf_event_stop(struct perf_event *event)
> > +static void perf_event_stop_hwevent(struct perf_event *event)
> >  {
> > -	if (!event->pmu->stop)
> > -		return event->pmu->disable(event);
> > -
> > -	return event->pmu->stop(event);
> > +	if (event->pmu->stop && !is_software_event(event))
> > +		return event->pmu->stop(event);
> >  }
> >  
> > -static int perf_event_start(struct perf_event *event)
> > +static int perf_event_start_hwevent(struct perf_event *event)
> >  {
> > -	if (!event->pmu->start)
> > -		return event->pmu->enable(event);
> > -
> > -	return event->pmu->start(event);
> > +	if (event->pmu->start && !is_software_event(event))
> > +		return event->pmu->start(event);
> >  }
> >  
> >  static void perf_adjust_period(struct perf_event *event, u64 nsec, u64 count)
> > @@ -1546,9 +1542,9 @@ static void perf_adjust_period(struct perf_event *event, u64 nsec, u64 count)
> >  
> >  	if (atomic64_read(&hwc->period_left) > 8*sample_period) {
> >  		perf_disable();
> > -		perf_event_stop(event);
> > +		perf_event_stop_hwevent(event);
> >  		atomic64_set(&hwc->period_left, 0);
> > -		perf_event_start(event);
> > +		perf_event_start_hwevent(event);
> >  		perf_enable();
> >  	}
> >  }

Your patch actually breaks some hardware pmu implementations.

> Urhm, isn't it much easier to simply give the software events NOP
> stop/start callbacks?
> 
>  kernel/perf_event.c |    6 ++++++
>  1 files changed, 6 insertions(+), 0 deletions(-)
> 
> diff --git a/kernel/perf_event.c b/kernel/perf_event.c
> index 858f56f..16c99e1 100644
> --- a/kernel/perf_event.c
> +++ b/kernel/perf_event.c
> @@ -4276,9 +4276,15 @@ static void perf_swevent_disable(struct perf_event *event)
>  	hlist_del_rcu(&event->hlist_entry);
>  }
>  
> +static void perf_swevent_nop(struct perf_event *event)
> +{
> +}
> +
>  static const struct pmu perf_ops_generic = {
>  	.enable		= perf_swevent_enable,
>  	.disable	= perf_swevent_disable,
> +	.start		= perf_swevent_nop,
> +	.stop		= perf_swevent_nop,
>  	.read		= perf_swevent_read,
>  	.unthrottle	= perf_swevent_unthrottle,
>  };
> 



* Re: [GIT PULL] perf crash fix
  2010-06-03  3:13 [GIT PULL] perf crash fix Frederic Weisbecker
  2010-06-03  8:02 ` Peter Zijlstra
@ 2010-06-03  9:26 ` Peter Zijlstra
  2010-06-03  9:33   ` Peter Zijlstra
  1 sibling, 1 reply; 9+ messages in thread
From: Peter Zijlstra @ 2010-06-03  9:26 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Ingo Molnar, LKML, Arnaldo Carvalho de Melo, Paul Mackerras,
	Stephane Eranian

On Thu, 2010-06-03 at 05:13 +0200, Frederic Weisbecker wrote:
>     What happens here is a double pmu->disable() due to a race between
>     two perf_adjust_period() calls.
>     
>     We first overflow a page fault event and then re-adjust the period.
>     When we reset the period_left, we stop the pmu by removing the
>     perf event from the software event hlist. And just before we
>     re-enable it, we are interrupted by a sched tick that also tries to
>     re-adjust the period. There we eventually disable the event a second
>     time, which leads to a double hlist_del_rcu() that ends up
>     dereferencing LIST_POISON2.
>     
>     In fact, wrapping the reset of period_left between pmu->stop() and
>     pmu->start() is only relevant to hardware events: we want them to
>     reprogram the next period interrupt.
>     
>     But this is useless for software events. They have their own way to
>     handle the period left, and in a non-racy way. They don't need to
>     be stopped here.
>     
>     So, use a new pair of perf_event_stop/start_hwevent that only stop
>     and restart hardware events in this path.
>     
>     The race won't happen with hardware events as sched ticks can't
>     happen during NMIs.

I've queued the below.

---
Subject: perf: Fix crash in swevents
From: Peter Zijlstra <a.p.zijlstra@chello.nl>
Date: Thu Jun 03 11:21:20 CEST 2010

Frederic reported that because swevents handling doesn't disable IRQs
anymore, we can get a recursion of perf_adjust_period(), once from
overflow handling and once from the tick.

If both call ->disable, we get a double hlist_del_rcu() and trigger
a LIST_POISON2 dereference.

Since we don't actually need to stop/start a swevent to re-program
the hardware (lack of hardware to program), simply nop out these
callbacks for the swevent pmu.

Reported-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 kernel/perf_event.c |   15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

Index: linux-2.6/kernel/perf_event.c
===================================================================
--- linux-2.6.orig/kernel/perf_event.c
+++ linux-2.6/kernel/perf_event.c
@@ -4055,13 +4055,6 @@ static void perf_swevent_overflow(struct
 	}
 }
 
-static void perf_swevent_unthrottle(struct perf_event *event)
-{
-	/*
-	 * Nothing to do, we already reset hwc->interrupts.
-	 */
-}
-
 static void perf_swevent_add(struct perf_event *event, u64 nr,
 			       int nmi, struct perf_sample_data *data,
 			       struct pt_regs *regs)
@@ -4276,11 +4269,17 @@ static void perf_swevent_disable(struct 
 	hlist_del_rcu(&event->hlist_entry);
 }
 
+static void perf_swevent_nop(struct perf_event *event)
+{
+}
+
 static const struct pmu perf_ops_generic = {
 	.enable		= perf_swevent_enable,
 	.disable	= perf_swevent_disable,
+	.start		= perf_swevent_nop,
+	.stop		= perf_swevent_nop,
 	.read		= perf_swevent_read,
-	.unthrottle	= perf_swevent_unthrottle,
+	.unthrottle	= perf_swevent_nop, /* hwc->interrupts already reset */
 };
 
 /*



* Re: [GIT PULL] perf crash fix
  2010-06-03  9:26 ` Peter Zijlstra
@ 2010-06-03  9:33   ` Peter Zijlstra
  2010-06-03 17:54     ` [tip:perf/urgent] perf: Fix crash in swevents tip-bot for Peter Zijlstra
  0 siblings, 1 reply; 9+ messages in thread
From: Peter Zijlstra @ 2010-06-03  9:33 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Ingo Molnar, LKML, Arnaldo Carvalho de Melo, Paul Mackerras,
	Stephane Eranian

On Thu, 2010-06-03 at 11:26 +0200, Peter Zijlstra wrote:
> On Thu, 2010-06-03 at 05:13 +0200, Frederic Weisbecker wrote:
> >     What happens here is a double pmu->disable() due to a race between
> >     two perf_adjust_period() calls.
> >     
> >     We first overflow a page fault event and then re-adjust the period.
> >     When we reset the period_left, we stop the pmu by removing the
> >     perf event from the software event hlist. And just before we
> >     re-enable it, we are interrupted by a sched tick that also tries to
> >     re-adjust the period. There we eventually disable the event a second
> >     time, which leads to a double hlist_del_rcu() that ends up
> >     dereferencing LIST_POISON2.
> >     
> >     In fact, wrapping the reset of period_left between pmu->stop() and
> >     pmu->start() is only relevant to hardware events: we want them to
> >     reprogram the next period interrupt.
> >     
> >     But this is useless for software events. They have their own way to
> >     handle the period left, and in a non-racy way. They don't need to
> >     be stopped here.
> >     
> >     So, use a new pair of perf_event_stop/start_hwevent that only stop
> >     and restart hardware events in this path.
> >     
> >     The race won't happen with hardware events as sched ticks can't
> >     happen during NMIs.
> 
> I've queued the below.

Uhhm, the compiler reminded me of trace events...

---
Subject: perf: Fix crash in swevents
From: Peter Zijlstra <a.p.zijlstra@chello.nl>
Date: Thu Jun 03 11:21:20 CEST 2010

Frederic reported that because swevents handling doesn't disable IRQs
anymore, we can get a recursion of perf_adjust_period(), once from
overflow handling and once from the tick.

If both call ->disable, we get a double hlist_del_rcu() and trigger
a LIST_POISON2 dereference.

Since we don't actually need to stop/start a swevent to re-program
the hardware (lack of hardware to program), simply nop out these
callbacks for the swevent pmu.

Reported-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1275557202.27810.35201.camel@twins>
---
 kernel/perf_event.c |   24 +++++++++++++++---------
 1 file changed, 15 insertions(+), 9 deletions(-)

Index: linux-2.6/kernel/perf_event.c
===================================================================
--- linux-2.6.orig/kernel/perf_event.c
+++ linux-2.6/kernel/perf_event.c
@@ -4055,13 +4055,6 @@ static void perf_swevent_overflow(struct
 	}
 }
 
-static void perf_swevent_unthrottle(struct perf_event *event)
-{
-	/*
-	 * Nothing to do, we already reset hwc->interrupts.
-	 */
-}
-
 static void perf_swevent_add(struct perf_event *event, u64 nr,
 			       int nmi, struct perf_sample_data *data,
 			       struct pt_regs *regs)
@@ -4276,11 +4269,22 @@ static void perf_swevent_disable(struct 
 	hlist_del_rcu(&event->hlist_entry);
 }
 
+static void perf_swevent_void(struct perf_event *event)
+{
+}
+
+static int perf_swevent_int(struct perf_event *event)
+{
+	return 0;
+}
+
 static const struct pmu perf_ops_generic = {
 	.enable		= perf_swevent_enable,
 	.disable	= perf_swevent_disable,
+	.start		= perf_swevent_int,
+	.stop		= perf_swevent_void,
 	.read		= perf_swevent_read,
-	.unthrottle	= perf_swevent_unthrottle,
+	.unthrottle	= perf_swevent_void, /* hwc->interrupts already reset */
 };
 
 /*
@@ -4561,8 +4565,10 @@ static int swevent_hlist_get(struct perf
 static const struct pmu perf_ops_tracepoint = {
 	.enable		= perf_trace_enable,
 	.disable	= perf_trace_disable,
+	.start		= perf_swevent_int,
+	.stop		= perf_swevent_void,
 	.read		= perf_swevent_read,
-	.unthrottle	= perf_swevent_unthrottle,
+	.unthrottle	= perf_swevent_void,
 };
 
 static int perf_tp_filter_match(struct perf_event *event,



* Re: [GIT PULL] perf crash fix
  2010-06-03  8:02 ` Peter Zijlstra
  2010-06-03  9:08   ` Peter Zijlstra
@ 2010-06-03 12:35   ` Frederic Weisbecker
  2010-06-03 12:40     ` Peter Zijlstra
  1 sibling, 1 reply; 9+ messages in thread
From: Frederic Weisbecker @ 2010-06-03 12:35 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, LKML, Arnaldo Carvalho de Melo, Paul Mackerras,
	Stephane Eranian

On Thu, Jun 03, 2010 at 10:02:39AM +0200, Peter Zijlstra wrote:
> On Thu, 2010-06-03 at 05:13 +0200, Frederic Weisbecker wrote:
> 
> > diff --git a/kernel/perf_event.c b/kernel/perf_event.c
> > index 858f56f..b666d7d 100644
> > --- a/kernel/perf_event.c
> > +++ b/kernel/perf_event.c
> > @@ -1510,20 +1510,16 @@ do {					\
> >  	return div64_u64(dividend, divisor);
> >  }
> >  
> > -static void perf_event_stop(struct perf_event *event)
> > +static void perf_event_stop_hwevent(struct perf_event *event)
> >  {
> > -	if (!event->pmu->stop)
> > -		return event->pmu->disable(event);
> > -
> > -	return event->pmu->stop(event);
> > +	if (event->pmu->stop && !is_software_event(event))
> > +		return event->pmu->stop(event);
> >  }
> >  
> > -static int perf_event_start(struct perf_event *event)
> > +static int perf_event_start_hwevent(struct perf_event *event)
> >  {
> > -	if (!event->pmu->start)
> > -		return event->pmu->enable(event);
> > -
> > -	return event->pmu->start(event);
> > +	if (event->pmu->start && !is_software_event(event))
> > +		return event->pmu->start(event);
> >  }
> >  
> >  static void perf_adjust_period(struct perf_event *event, u64 nsec, u64 count)
> > @@ -1546,9 +1542,9 @@ static void perf_adjust_period(struct perf_event *event, u64 nsec, u64 count)
> >  
> >  	if (atomic64_read(&hwc->period_left) > 8*sample_period) {
> >  		perf_disable();
> > -		perf_event_stop(event);
> > +		perf_event_stop_hwevent(event);
> >  		atomic64_set(&hwc->period_left, 0);
> > -		perf_event_start(event);
> > +		perf_event_start_hwevent(event);
> >  		perf_enable();
> >  	}
> >  }
> 
> Urhm, isn't it much easier to simply give the software events NOP
> stop/start callbacks?


I wanted to, but I thought we could avoid two indirect calls on each
tick, and I was also afraid of breaking the original start/stop
semantics, especially the role of perf_event_stop/start.


But those are quite small details. I'm OK with your patch (the version
that also handles trace events ;)

Thanks.



* Re: [GIT PULL] perf crash fix
  2010-06-03 12:35   ` Frederic Weisbecker
@ 2010-06-03 12:40     ` Peter Zijlstra
  2010-06-03 12:42       ` Frederic Weisbecker
  0 siblings, 1 reply; 9+ messages in thread
From: Peter Zijlstra @ 2010-06-03 12:40 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Ingo Molnar, LKML, Arnaldo Carvalho de Melo, Paul Mackerras,
	Stephane Eranian

On Thu, 2010-06-03 at 14:35 +0200, Frederic Weisbecker wrote:

> I wanted to, but I thought we could avoid two indirect calls on each
> tick, and I was also afraid of breaking the original start/stop
> semantics, especially the role of perf_event_stop/start.

Right, so you lost the fallback to ->enable/->disable, which would break
hardware PMU implementations that didn't actually implement
->start/->stop (pretty much all of them except x86).
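
The fallback Peter refers to is visible in the '-' lines of Frederic's
patch above. As a hedged userspace sketch (mock struct pmu and hypothetical
demo callbacks, not the real kernel types), the pre-patch helper preferred
->stop but fell back to ->disable when a pmu left ->stop unset:

```c
#include <assert.h>
#include <stddef.h>

/* Mock of the 2.6.35-era struct pmu -- illustration only. */
struct perf_event;

struct pmu {
	void (*disable)(struct perf_event *event);
	void (*stop)(struct perf_event *event);	/* NULL on most pmus then */
};

struct perf_event {
	const struct pmu *pmu;
	int disables, stops;	/* call counters for the demo */
};

/* The pre-patch helper: fall back to ->disable when ->stop is missing.
 * Dropping this fallback is what would have broken pmus that only
 * implement ->disable. */
static void perf_event_stop(struct perf_event *event)
{
	if (!event->pmu->stop)
		event->pmu->disable(event);
	else
		event->pmu->stop(event);
}

/* Hypothetical callbacks standing in for real pmu implementations. */
static void demo_disable(struct perf_event *e) { e->disables++; }
static void demo_stop(struct perf_event *e)    { e->stops++; }
```

With a pmu that lacks ->stop, perf_event_stop() lands in ->disable; with one
that provides ->stop, the fallback is never taken.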


* Re: [GIT PULL] perf crash fix
  2010-06-03 12:40     ` Peter Zijlstra
@ 2010-06-03 12:42       ` Frederic Weisbecker
  0 siblings, 0 replies; 9+ messages in thread
From: Frederic Weisbecker @ 2010-06-03 12:42 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, LKML, Arnaldo Carvalho de Melo, Paul Mackerras,
	Stephane Eranian

On Thu, Jun 03, 2010 at 02:40:31PM +0200, Peter Zijlstra wrote:
> On Thu, 2010-06-03 at 14:35 +0200, Frederic Weisbecker wrote:
> 
> > I wanted to, but I thought we could avoid two indirect calls on each
> > tick, and I was also afraid of breaking the original start/stop
> > semantics, especially the role of perf_event_stop/start.
> 
> Right, so you lost the fallback to ->enable/->disable, which would break
> hardware PMU implementations that didn't actually implement
> ->start/->stop (pretty much all of them except x86).


Indeed.



* [tip:perf/urgent] perf: Fix crash in swevents
  2010-06-03  9:33   ` Peter Zijlstra
@ 2010-06-03 17:54     ` tip-bot for Peter Zijlstra
  0 siblings, 0 replies; 9+ messages in thread
From: tip-bot for Peter Zijlstra @ 2010-06-03 17:54 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, fweisbec, a.p.zijlstra, tglx, mingo

Commit-ID:  c6df8d5ab87a246942d138321e1721edbb69f6e1
Gitweb:     http://git.kernel.org/tip/c6df8d5ab87a246942d138321e1721edbb69f6e1
Author:     Peter Zijlstra <a.p.zijlstra@chello.nl>
AuthorDate: Thu, 3 Jun 2010 11:21:20 +0200
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Thu, 3 Jun 2010 17:03:08 +0200

perf: Fix crash in swevents

Frederic reported that because swevents handling doesn't disable IRQs
anymore, we can get a recursion of perf_adjust_period(), once from
overflow handling and once from the tick.

If both call ->disable, we get a double hlist_del_rcu() and trigger
a LIST_POISON2 dereference.

Since we don't actually need to stop/start a swevent to re-program
the hardware (lack of hardware to program), simply nop out these
callbacks for the swevent pmu.

Reported-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1275557609.27810.35218.camel@twins>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 kernel/perf_event.c |   24 +++++++++++++++---------
 1 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/kernel/perf_event.c b/kernel/perf_event.c
index 858f56f..31d6afe 100644
--- a/kernel/perf_event.c
+++ b/kernel/perf_event.c
@@ -4055,13 +4055,6 @@ static void perf_swevent_overflow(struct perf_event *event, u64 overflow,
 	}
 }
 
-static void perf_swevent_unthrottle(struct perf_event *event)
-{
-	/*
-	 * Nothing to do, we already reset hwc->interrupts.
-	 */
-}
-
 static void perf_swevent_add(struct perf_event *event, u64 nr,
 			       int nmi, struct perf_sample_data *data,
 			       struct pt_regs *regs)
@@ -4276,11 +4269,22 @@ static void perf_swevent_disable(struct perf_event *event)
 	hlist_del_rcu(&event->hlist_entry);
 }
 
+static void perf_swevent_void(struct perf_event *event)
+{
+}
+
+static int perf_swevent_int(struct perf_event *event)
+{
+	return 0;
+}
+
 static const struct pmu perf_ops_generic = {
 	.enable		= perf_swevent_enable,
 	.disable	= perf_swevent_disable,
+	.start		= perf_swevent_int,
+	.stop		= perf_swevent_void,
 	.read		= perf_swevent_read,
-	.unthrottle	= perf_swevent_unthrottle,
+	.unthrottle	= perf_swevent_void, /* hwc->interrupts already reset */
 };
 
 /*
@@ -4561,8 +4565,10 @@ static int swevent_hlist_get(struct perf_event *event)
 static const struct pmu perf_ops_tracepoint = {
 	.enable		= perf_trace_enable,
 	.disable	= perf_trace_disable,
+	.start		= perf_swevent_int,
+	.stop		= perf_swevent_void,
 	.read		= perf_swevent_read,
-	.unthrottle	= perf_swevent_unthrottle,
+	.unthrottle	= perf_swevent_void,
 };
 
 static int perf_tp_filter_match(struct perf_event *event,

