* perf_events: question about __perf_event_read()
@ 2011-01-21 18:06 Stephane Eranian
2011-01-21 18:18 ` Peter Zijlstra
0 siblings, 1 reply; 5+ messages in thread
From: Stephane Eranian @ 2011-01-21 18:06 UTC (permalink / raw)
To: LKML
Cc: Peter Zijlstra, mingo, Frédéric Weisbecker,
Paul Mackerras, David S. Miller
Hi,
I think the code below still has a problem in the case of a per-cpu event.
If you issue a read() from a different CPU, you IPI to the event's CPU.
By the time the IPI arrives, the event may have been descheduled, in which
case you don't want to call event->pmu->read() or update the context
timings. The function has a test, but it appears to cover only the
task-context case.
I have seen panics on P4 with this code because it goes all the way
down to rdmsrl() with a bogus counter index (like -1).
Am I missing something here?
static void __perf_event_read(void *info)
{
	struct perf_event *event = info;
	struct perf_event_context *ctx = event->ctx;
	struct perf_cpu_context *cpuctx = __get_cpu_context(ctx);

	/*
	 * If this is a task context, we need to check whether it is
	 * the current task context of this cpu. If not it has been
	 * scheduled out before the smp call arrived. In that case
	 * event->count would have been updated to a recent sample
	 * when the event was scheduled out.
	 */
	if (ctx->task && cpuctx->task_ctx != ctx)
		return;

	raw_spin_lock(&ctx->lock);
	update_context_time(ctx);
	update_event_times(event);
	raw_spin_unlock(&ctx->lock);

	event->pmu->read(event);
}
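For readers without the kernel source at hand, the failure mode can be modeled in plain userspace C. The sketch below is purely illustrative (the struct, names, and values are invented; the real code runs across CPUs under ctx->lock): the assert in the fake PMU read stands in for the rdmsrl() panic on a bogus counter index, and the state check right before the hardware access is the guard being asked for here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical single-threaded model; the names echo the kernel
 * structures but none of this is kernel code. */
enum fake_state { STATE_INACTIVE, STATE_ACTIVE };

struct fake_event {
	enum fake_state state;  /* stands in for event->state */
	int hw_idx;             /* -1 once descheduled, like the bogus index */
	uint64_t count;         /* cached count, saved at sched-out */
};

/* Touching the "hardware" with a bogus index models the rdmsrl() panic. */
static uint64_t fake_pmu_read(const struct fake_event *e)
{
	assert(e->hw_idx >= 0);  /* would be a wild MSR access for real */
	return e->count + 1;     /* pretend the counter ticked once */
}

/* The safe shape: check the state right before the hardware access
 * (this check sits under ctx->lock in the kernel). */
static uint64_t safe_read(struct fake_event *e)
{
	if (e->state == STATE_ACTIVE)
		e->count = fake_pmu_read(e);
	return e->count;  /* descheduled: return the cached value */
}

/* Sched-out path: save the final count and invalidate the hw index. */
static void sched_out(struct fake_event *e)
{
	e->state = STATE_INACTIVE;
	e->hw_idx = -1;
}
```

Compiled into a small harness, a read issued after sched_out() returns the cached count without ever touching the fake hardware, while dropping the state check would trip the assert.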
* Re: perf_events: question about __perf_event_read()
2011-01-21 18:06 perf_events: question about __perf_event_read() Stephane Eranian
@ 2011-01-21 18:18 ` Peter Zijlstra
2011-01-21 18:25 ` Stephane Eranian
0 siblings, 1 reply; 5+ messages in thread
From: Peter Zijlstra @ 2011-01-21 18:18 UTC (permalink / raw)
To: Stephane Eranian
Cc: LKML, mingo, Frédéric Weisbecker, Paul Mackerras,
David S. Miller
On Fri, 2011-01-21 at 19:06 +0100, Stephane Eranian wrote:
> Hi,
>
> I think the code below still has a problem in the case of a per-cpu event.
>
> If you issue a read() from a different CPU, you IPI to the event's CPU.
> By the time the IPI arrives, the event may have been descheduled, in which
> case you don't want to call event->pmu->read() or update the context
> timings. The function has a test, but it appears to cover only the
> task-context case.
>
> I have seen panics on P4 with this code because it goes all the way
> down to rdmsrl() with a bogus counter index (like -1).
>
> Am I missing something here?
>
> static void __perf_event_read(void *info)
> {
> 	struct perf_event *event = info;
> 	struct perf_event_context *ctx = event->ctx;
> 	struct perf_cpu_context *cpuctx = __get_cpu_context(ctx);
>
> 	/*
> 	 * If this is a task context, we need to check whether it is
> 	 * the current task context of this cpu. If not it has been
> 	 * scheduled out before the smp call arrived. In that case
> 	 * event->count would have been updated to a recent sample
> 	 * when the event was scheduled out.
> 	 */
> 	if (ctx->task && cpuctx->task_ctx != ctx)
> 		return;
>
> 	raw_spin_lock(&ctx->lock);
Shouldn't we re-check event->state once we hold this lock?
> 	update_context_time(ctx);
> 	update_event_times(event);
> 	raw_spin_unlock(&ctx->lock);
>
> 	event->pmu->read(event);
> }
* Re: perf_events: question about __perf_event_read()
2011-01-21 18:18 ` Peter Zijlstra
@ 2011-01-21 18:25 ` Stephane Eranian
2011-01-26 14:40 ` Peter Zijlstra
0 siblings, 1 reply; 5+ messages in thread
From: Stephane Eranian @ 2011-01-21 18:25 UTC (permalink / raw)
To: Peter Zijlstra
Cc: LKML, mingo, Frédéric Weisbecker, Paul Mackerras,
David S. Miller
On Fri, Jan 21, 2011 at 7:18 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Fri, 2011-01-21 at 19:06 +0100, Stephane Eranian wrote:
>> Hi,
>>
>> I think the code below still has a problem in the case of a per-cpu event.
>>
>> If you issue a read() from a different CPU, you IPI to the event's CPU.
>> By the time the IPI arrives, the event may have been descheduled, in which
>> case you don't want to call event->pmu->read() or update the context
>> timings. The function has a test, but it appears to cover only the
>> task-context case.
>>
>> I have seen panics on P4 with this code because it goes all the way
>> down to rdmsrl() with a bogus counter index (like -1).
>>
>> Am I missing something here?
>>
>> static void __perf_event_read(void *info)
>> {
>> 	struct perf_event *event = info;
>> 	struct perf_event_context *ctx = event->ctx;
>> 	struct perf_cpu_context *cpuctx = __get_cpu_context(ctx);
>>
>> 	/*
>> 	 * If this is a task context, we need to check whether it is
>> 	 * the current task context of this cpu. If not it has been
>> 	 * scheduled out before the smp call arrived. In that case
>> 	 * event->count would have been updated to a recent sample
>> 	 * when the event was scheduled out.
>> 	 */
>> 	if (ctx->task && cpuctx->task_ctx != ctx)
>> 		return;
>>
>> 	raw_spin_lock(&ctx->lock);
>
> Shouldn't we re-check event->state once we hold this lock?
>
I remember checking this about a month ago and tip-x86 had
some checks in this routine, but now they're gone.
I think you need something like:

+	if (ctx->is_active)
+		update_context_time(ctx);

+	if (event->state == PERF_EVENT_STATE_ACTIVE)
+		event->pmu->read(event);
>> 	update_context_time(ctx);
>> 	update_event_times(event);
>> 	raw_spin_unlock(&ctx->lock);
>>
>> 	event->pmu->read(event);
>> }
>
>
>
* Re: perf_events: question about __perf_event_read()
2011-01-21 18:25 ` Stephane Eranian
@ 2011-01-26 14:40 ` Peter Zijlstra
2011-01-26 14:50 ` Stephane Eranian
0 siblings, 1 reply; 5+ messages in thread
From: Peter Zijlstra @ 2011-01-26 14:40 UTC (permalink / raw)
To: Stephane Eranian
Cc: LKML, mingo, Frédéric Weisbecker, Paul Mackerras,
David S. Miller
On Fri, 2011-01-21 at 19:25 +0100, Stephane Eranian wrote:
> I remember checking this about a month ago and tip-x86 had
> some checks in this routine, but now they're gone.
I did a quick look through history and I couldn't find anything like
that, ah well...
> I think you need something like:
>
> +	if (ctx->is_active)
> +		update_context_time(ctx);
>
> +	if (event->state == PERF_EVENT_STATE_ACTIVE)
> +		event->pmu->read(event);
Something like so?
---
Subject: perf: Fix race in perf_event_read()
From: Peter Zijlstra <a.p.zijlstra@chello.nl>
Date: Wed Jan 26 15:38:35 CET 2011
It is quite possible for the event to have been disabled between
perf_event_read() sending the IPI and the CPU servicing the IPI and
calling __perf_event_read(), hence revalidate the state.
Reported-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
---
kernel/perf_event.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
Index: linux-2.6/kernel/perf_event.c
===================================================================
--- linux-2.6.orig/kernel/perf_event.c
+++ linux-2.6/kernel/perf_event.c
@@ -1901,11 +1901,12 @@ static void __perf_event_read(void *info
 		return;
 
 	raw_spin_lock(&ctx->lock);
-	update_context_time(ctx);
+	if (ctx->is_active)
+		update_context_time(ctx);
 	update_event_times(event);
+	if (event->state == PERF_EVENT_STATE_ACTIVE)
+		event->pmu->read(event);
 	raw_spin_unlock(&ctx->lock);
-
-	event->pmu->read(event);
 }
 
 static inline u64 perf_event_count(struct perf_event *event)
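The shape of the fix can also be shown as a toy before/after comparison (all names below are invented for illustration; none of this is kernel code): once a sched-out has invalidated the counter index, the pre-patch ordering would still touch the hardware, while the patched ordering revalidates event->state first and falls back to the cached count.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model; the names are illustrative, not the kernel's. */
enum toy_state { TOY_INACTIVE, TOY_ACTIVE };

struct toy_event {
	enum toy_state state;
	int hw_idx;        /* -1 after sched-out, like the bogus P4 index */
	uint64_t count;    /* count cached at sched-out time */
};

/* Sched-out saves the count and invalidates the hardware index. */
static void toy_sched_out(struct toy_event *e)
{
	e->state = TOY_INACTIVE;
	e->hw_idx = -1;
}

/* Pre-patch shape: the hardware access is not guarded by a state check,
 * so a read racing with sched-out reaches the "MSR" with index -1. */
static int unguarded_read_would_crash(const struct toy_event *e)
{
	return e->hw_idx < 0;
}

/* Post-patch shape: revalidate the state immediately before the access
 * (under ctx->lock in the real code) and skip the hardware if inactive. */
static int guarded_read_would_crash(const struct toy_event *e)
{
	if (e->state != TOY_ACTIVE)
		return 0;   /* use the cached count instead */
	return e->hw_idx < 0;
}
```

In this model, a deschedule that lands between the IPI being sent and serviced makes the unguarded read unsafe, while the guarded read simply skips the hardware.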
* Re: perf_events: question about __perf_event_read()
2011-01-26 14:40 ` Peter Zijlstra
@ 2011-01-26 14:50 ` Stephane Eranian
0 siblings, 0 replies; 5+ messages in thread
From: Stephane Eranian @ 2011-01-26 14:50 UTC (permalink / raw)
To: Peter Zijlstra
Cc: LKML, mingo, Frédéric Weisbecker, Paul Mackerras,
David S. Miller
Yeah, something like that.
I didn't quite understand why the event->pmu->read() call was outside
the lock in the first place.
Thanks.
On Wed, Jan 26, 2011 at 3:40 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Fri, 2011-01-21 at 19:25 +0100, Stephane Eranian wrote:
>
>> I remember checking this about a month ago and tip-x86 had
>> some checks in this routine, but now they're gone.
>
> I did a quick look through history and I couldn't find anything like
> that, ah well...
>
>> I think you need something like:
>>
>> +	if (ctx->is_active)
>> +		update_context_time(ctx);
>>
>> +	if (event->state == PERF_EVENT_STATE_ACTIVE)
>> +		event->pmu->read(event);
>
> Something like so?
>
> ---
> Subject: perf: Fix race in perf_event_read()
> From: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Date: Wed Jan 26 15:38:35 CET 2011
>
> It is quite possible for the event to have been disabled between
> perf_event_read() sending the IPI and the CPU servicing the IPI and
> calling __perf_event_read(), hence revalidate the state.
>
> Reported-by: Stephane Eranian <eranian@google.com>
> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
> LKML-Reference: <new-submission>
> ---
> kernel/perf_event.c | 7 ++++---
> 1 file changed, 4 insertions(+), 3 deletions(-)
>
> Index: linux-2.6/kernel/perf_event.c
> ===================================================================
> --- linux-2.6.orig/kernel/perf_event.c
> +++ linux-2.6/kernel/perf_event.c
> @@ -1901,11 +1901,12 @@ static void __perf_event_read(void *info
>  		return;
>  
>  	raw_spin_lock(&ctx->lock);
> -	update_context_time(ctx);
> +	if (ctx->is_active)
> +		update_context_time(ctx);
>  	update_event_times(event);
> +	if (event->state == PERF_EVENT_STATE_ACTIVE)
> +		event->pmu->read(event);
>  	raw_spin_unlock(&ctx->lock);
> -
> -	event->pmu->read(event);
>  }
>  
>  static inline u64 perf_event_count(struct perf_event *event)
>
>