* [PATCH v2] perf/core: Avoid context switch overheads
From: 石祤 @ 2017-08-08 10:00 UTC
  To: yang_oliver, peterz, mingo, acme, alexander.shishkin,
	linux-kernel, tglx, eranian, torvalds, jolsa, linxiulei
  Cc: leilei.lin

From: "leilei.lin" <leilei.lin@alibaba-inc.com>

A performance issue is caused by an insufficiently strict check at
task sched-in for tasks that were once attached to a per-task
perf_event.

A task allocates task->perf_event_ctxp[ctxn] when it is targeted by
perf_event_open(), and task->perf_event_ctxp[ctxn] is never freed
back to NULL afterwards, even once all its events are gone:

__perf_event_task_sched_in()
        if (task->perf_event_ctxp[ctxn]) // always true once allocated
                perf_event_context_sched_in() // operates on the PMU

Up to 50% performance overhead was observed under some extreme test
cases. Therefore, add a stricter check on ctx->nr_events: when
ctx->nr_events == 0, there is no need to continue.
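
For illustration, a minimal userspace sketch of how a task ends up in
this state (hypothetical and untested, not part of this patch):

/*
 * Open a per-task event on ourselves, then close it. The task's
 * perf_event_ctxp[ctxn] stays allocated, so every subsequent context
 * switch of this task still enters perf_event_context_sched_in()
 * even though ctx->nr_events has dropped back to 0.
 */
#include <linux/perf_event.h>
#include <sys/syscall.h>
#include <sched.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
	struct perf_event_attr attr;
	int fd;

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = PERF_TYPE_HARDWARE;
	attr.config = PERF_COUNT_HW_CPU_CYCLES;

	/* pid == 0, cpu == -1: per-task event on the current task */
	fd = syscall(__NR_perf_event_open, &attr, 0, -1, -1, 0);
	if (fd >= 0)
		close(fd);	/* all events gone, ctx remains */

	/* every sched-in below still takes the slow path */
	for (;;)
		sched_yield();
}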

Signed-off-by: leilei.lin <leilei.lin@alibaba-inc.com>
---
 kernel/events/core.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 426c2ff..3d86695 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -3180,6 +3180,13 @@ static void perf_event_context_sched_in(struct perf_event_context *ctx,
 		return;
 
 	perf_ctx_lock(cpuctx, ctx);
+	/*
+	 * We must check ctx->nr_events while holding ctx->lock, such
+	 * that we serialize against perf_install_in_context().
+	 */
+	if (!cpuctx->task_ctx && !ctx->nr_events)
+		goto unlock;
+
 	perf_pmu_disable(ctx->pmu);
 	/*
 	 * We want to keep the following priority order:
@@ -3193,6 +3200,8 @@ static void perf_event_context_sched_in(struct perf_event_context *ctx,
 		cpu_ctx_sched_out(cpuctx, EVENT_FLEXIBLE);
 	perf_event_sched_in(cpuctx, ctx, task);
 	perf_pmu_enable(ctx->pmu);
+
+unlock:
 	perf_ctx_unlock(cpuctx, ctx);
 }
 
-- 
2.8.4.31.g9ed660f

* Re: [PATCH v2] perf/core: Avoid context switch overheads
From: Peter Zijlstra @ 2017-08-08 10:37 UTC
  To: 石祤
  Cc: yang_oliver, mingo, acme, alexander.shishkin, linux-kernel, tglx,
	eranian, torvalds, jolsa, leilei.lin

On Tue, Aug 08, 2017 at 06:00:45PM +0800, 石祤 wrote:

> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 426c2ff..3d86695 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -3180,6 +3180,13 @@ static void perf_event_context_sched_in(struct perf_event_context *ctx,
>  		return;
>  
>  	perf_ctx_lock(cpuctx, ctx);
> +	/*
> +	 * We must check ctx->nr_events while holding ctx->lock, such
> +	 * that we serialize against perf_install_in_context().
> +	 */
> +	if (!cpuctx->task_ctx && !ctx->nr_events)
> +		goto unlock;

Do we really need the cpuctx->task_ctx test? I think that task_ctx is
'tight' these days. We never have it set unless there are events
scheduled for that context.

I even think the cpuctx->task_ctx == ctx test right above here is
superfluous these days. That could only happen when the
perf_install_in_context() IPI came before perf_event_task_sched_in(),
but we removed the arch option to do context switches with IRQs enabled.
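
Something like the below is what I have in mind; an untested sketch,
assuming the task_ctx test really is redundant:

	perf_ctx_lock(cpuctx, ctx);
	/*
	 * We must check ctx->nr_events while holding ctx->lock, such
	 * that we serialize against perf_install_in_context().
	 */
	if (!ctx->nr_events)
		goto unlock;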

> +
>  	perf_pmu_disable(ctx->pmu);
>  	/*
>  	 * We want to keep the following priority order:
> @@ -3193,6 +3200,8 @@ static void perf_event_context_sched_in(struct perf_event_context *ctx,
>  		cpu_ctx_sched_out(cpuctx, EVENT_FLEXIBLE);
>  	perf_event_sched_in(cpuctx, ctx, task);
>  	perf_pmu_enable(ctx->pmu);
> +
> +unlock:
>  	perf_ctx_unlock(cpuctx, ctx);
>  }
>  
> -- 
> 2.8.4.31.g9ed660f
> 

* Re: [PATCH v2] perf/core: Avoid context switch overheads
From: 林守磊 @ 2017-08-09  0:28 UTC
  To: Peter Zijlstra
  Cc: yang_oliver, mingo, acme, alexander.shishkin, linux-kernel, tglx,
	Stephane Eranian, torvalds, jolsa, leilei.lin

2017-08-08 18:37 GMT+08:00 Peter Zijlstra <peterz@infradead.org>:
> On Tue, Aug 08, 2017 at 06:00:45PM +0800, 石祤 wrote:
>
>> diff --git a/kernel/events/core.c b/kernel/events/core.c
>> index 426c2ff..3d86695 100644
>> --- a/kernel/events/core.c
>> +++ b/kernel/events/core.c
>> @@ -3180,6 +3180,13 @@ static void perf_event_context_sched_in(struct perf_event_context *ctx,
>>               return;
>>
>>       perf_ctx_lock(cpuctx, ctx);
>> +     /*
>> +      * We must check ctx->nr_events while holding ctx->lock, such
>> +      * that we serialize against perf_install_in_context().
>> +      */
>> +     if (!cpuctx->task_ctx && !ctx->nr_events)
>> +             goto unlock;
>
> Do we really need the cpuctx->task_ctx test? I think that task_ctx is
> 'tight' these days. We never have it set unless there are events
> scheduled for that context.
>
> I even think the cpuctx->task_ctx == ctx test right above here is
> superfluous these days. That could only happen when the
> perf_install_in_context() IPI came before perf_event_task_sched_in(),
> but we removed the arch option to do context switches with IRQs enabled.
>

It looks like cpuctx->task_ctx is set somewhere else as well, so I
thought it conservative to keep the test in this patch.

To be certain: during my debugging I never saw cpuctx->task_ctx take
a value at this point. I shall send a v3.

Thanks

>> +
>>       perf_pmu_disable(ctx->pmu);
>>       /*
>>        * We want to keep the following priority order:
>> @@ -3193,6 +3200,8 @@ static void perf_event_context_sched_in(struct perf_event_context *ctx,
>>               cpu_ctx_sched_out(cpuctx, EVENT_FLEXIBLE);
>>       perf_event_sched_in(cpuctx, ctx, task);
>>       perf_pmu_enable(ctx->pmu);
>> +
>> +unlock:
>>       perf_ctx_unlock(cpuctx, ctx);
>>  }
>>
>> --
>> 2.8.4.31.g9ed660f
>>
