linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH RESEND v5] perf/core: Fix installing arbitrary cgroup event into cpu
@ 2018-03-06  9:36 linxiulei
  2018-03-06 11:50 ` Peter Zijlstra
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: linxiulei @ 2018-03-06  9:36 UTC (permalink / raw)
  To: peterz, jolsa, mingo, acme, alexander.shishkin, linux-kernel,
	tglx, eranian, torvalds, linux-perf-users, brendan.d.gregg
  Cc: yang_oliver, leilei.lin

From: "leilei.lin" <leilei.lin@alibaba-inc.com>

Do not install cgroup event into the CPU context and schedule it
if the cgroup is not running on this CPU

While there is no task of cgroup running specified CPU, current
kernel still install cgroup event into CPU context that causes
another cgroup event can't be installed into this CPU.

This patch prevent scheduling events at __perf_install_in_context()
and installing events at list_update_cgroup_event() if cgroup isn't
running on specified CPU.

Signed-off-by: leilei.lin <leilei.lin@alibaba-inc.com>
---
 v2: Set cpuctx->cgrp only if the same cgroup is running on this
   CPU otherwise following events couldn't be activated immediately
 v3: Enhance the comments and commit message
 v4: Adjust to config
 v5: Clear cpuctx->cgrp only when no cgroup event exists

 kernel/events/core.c | 54 +++++++++++++++++++++++++++++++++++++++-------------
 1 file changed, 41 insertions(+), 13 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 4df5b69..f3ffa70 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -933,31 +933,45 @@ list_update_cgroup_event(struct perf_event *event,
 {
 	struct perf_cpu_context *cpuctx;
 	struct list_head *cpuctx_entry;
+	struct perf_cgroup *cgrp;
 
 	if (!is_cgroup_event(event))
 		return;
 
-	if (add && ctx->nr_cgroups++)
-		return;
-	else if (!add && --ctx->nr_cgroups)
-		return;
 	/*
 	 * Because cgroup events are always per-cpu events,
 	 * this will always be called from the right CPU.
 	 */
 	cpuctx = __get_cpu_context(ctx);
-	cpuctx_entry = &cpuctx->cgrp_cpuctx_entry;
-	/* cpuctx->cgrp is NULL unless a cgroup event is active in this CPU .*/
-	if (add) {
-		struct perf_cgroup *cgrp = perf_cgroup_from_task(current, ctx);
+	cgrp = perf_cgroup_from_task(current, ctx);
+
+	/*
+	 * if only the cgroup is running on this cpu
+	 * and cpuctx->cgrp == NULL (otherwise it would've
+	 * been set with running cgroup), we put this cgroup
+	 * into cpu context. Or it would case mismatch in
+	 * following cgroup events at event_filter_match()
+	 */
+	if (add && !cpuctx->cgrp &&
+			cgroup_is_descendant(cgrp->css.cgroup,
+			event->cgrp->css.cgroup)) {
+		cpuctx->cgrp = cgrp;
+	}
+
+	if (add && ctx->nr_cgroups++)
+		return;
+	else if (!add && --ctx->nr_cgroups)
+		return;
 
+	/* no cgroup running */
+	if (!add)
+		cpuctx->cgrp = NULL;
+
+	cpuctx_entry = &cpuctx->cgrp_cpuctx_entry;
+	if (add)
 		list_add(cpuctx_entry, this_cpu_ptr(&cgrp_cpuctx_list));
-		if (cgroup_is_descendant(cgrp->css.cgroup, event->cgrp->css.cgroup))
-			cpuctx->cgrp = cgrp;
-	} else {
+	else
 		list_del(cpuctx_entry);
-		cpuctx->cgrp = NULL;
-	}
 }
 
 #else /* !CONFIG_CGROUP_PERF */
@@ -2311,6 +2325,20 @@ static int  __perf_install_in_context(void *info)
 		raw_spin_lock(&task_ctx->lock);
 	}
 
+#ifdef CONFIG_CGROUP_PERF
+	if (is_cgroup_event(event)) {
+		/*
+		 * Only care about cgroup events.
+		 *
+		 * If only the task belongs to cgroup of this event,
+		 * we will continue the installment
+		 */
+		struct perf_cgroup *cgrp = perf_cgroup_from_task(current, ctx);
+		reprogram = cgroup_is_descendant(cgrp->css.cgroup,
+					event->cgrp->css.cgroup);
+	}
+#endif
+
 	if (reprogram) {
 		ctx_sched_out(ctx, cpuctx, EVENT_TIME);
 		add_event_to_ctx(event, ctx);
-- 
2.8.4.31.g9ed660f

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH RESEND v5] perf/core: Fix installing arbitrary cgroup event into cpu
  2018-03-06  9:36 [PATCH RESEND v5] perf/core: Fix installing arbitrary cgroup event into cpu linxiulei
@ 2018-03-06 11:50 ` Peter Zijlstra
  2018-03-07 11:19   ` Lin Xiulei
  2018-03-12  7:53 ` Ingo Molnar
  2018-03-12 17:47 ` [tip:perf/core] perf/core: Fix installing cgroup events on CPU tip-bot for leilei.lin
  2 siblings, 1 reply; 8+ messages in thread
From: Peter Zijlstra @ 2018-03-06 11:50 UTC (permalink / raw)
  To: linxiulei
  Cc: jolsa, mingo, acme, alexander.shishkin, linux-kernel, tglx,
	eranian, torvalds, linux-perf-users, brendan.d.gregg,
	yang_oliver, leilei.lin

On Tue, Mar 06, 2018 at 05:36:37PM +0800, linxiulei@gmail.com wrote:
> From: "leilei.lin" <leilei.lin@alibaba-inc.com>
> 
> Do not install cgroup event into the CPU context and schedule it
> if the cgroup is not running on this CPU

OK, so far so good, this explains the bit in
__perf_install_in_context().

> While there is no task of cgroup running specified CPU, current
> kernel still install cgroup event into CPU context that causes
> another cgroup event can't be installed into this CPU.
> 
> This patch prevent scheduling events at __perf_install_in_context()
> and installing events at list_update_cgroup_event() if cgroup isn't
> running on specified CPU.

This bit doesn't make sense, you don't in fact avoid anything in
list_update_cgroup_event(), you do more, not less.

> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 4df5b69..f3ffa70 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -933,31 +933,45 @@ list_update_cgroup_event(struct perf_event *event,
>  {
>  	struct perf_cpu_context *cpuctx;
>  	struct list_head *cpuctx_entry;
> +	struct perf_cgroup *cgrp;
>  
>  	if (!is_cgroup_event(event))
>  		return;
>  
>  	/*
>  	 * Because cgroup events are always per-cpu events,
>  	 * this will always be called from the right CPU.
>  	 */
>  	cpuctx = __get_cpu_context(ctx);
> +	cgrp = perf_cgroup_from_task(current, ctx);
> +
> +	/*
> +	 * if only the cgroup is running on this cpu
> +	 * and cpuctx->cgrp == NULL (otherwise it would've
> +	 * been set with running cgroup), we put this cgroup
> +	 * into cpu context. Or it would case mismatch in
> +	 * following cgroup events at event_filter_match()
> +	 */

This is utterly incomprehensible, what?

> +	if (add && !cpuctx->cgrp &&
> +			cgroup_is_descendant(cgrp->css.cgroup,
> +			event->cgrp->css.cgroup)) {
> +		cpuctx->cgrp = cgrp;
> +	}

And that's just horrible coding style. Maybe something like:

	if (add && cgroup_is_descendant(cgrp->css.cgroup, event->cgrp->css.cgroup)) {
		if (cpuctx->cgrp)
			WARN_ON_ONCE(cpuctx->cgrp != cgrp);
		cpuctx->cgrp = cgrp;
	}

that? But that still needs a comment to explain _why_ we do that here.
Under what condition would we fail to have cpuctx->cgrp set while
ctx->nr_cgroups. Your comment doesn't explain nor does your Changelog.

> +
> +	if (add && ctx->nr_cgroups++)
> +		return;
> +	else if (!add && --ctx->nr_cgroups)
> +		return;
>  
> +	/* no cgroup running */
> +	if (!add)
> +		cpuctx->cgrp = NULL;
> +
> +	cpuctx_entry = &cpuctx->cgrp_cpuctx_entry;
> +	if (add)
>  		list_add(cpuctx_entry, this_cpu_ptr(&cgrp_cpuctx_list));
> +	else
>  		list_del(cpuctx_entry);
>  }
>  
>  #else /* !CONFIG_CGROUP_PERF */
> @@ -2311,6 +2325,20 @@ static int  __perf_install_in_context(void *info)
>  		raw_spin_lock(&task_ctx->lock);
>  	}
>  
> +#ifdef CONFIG_CGROUP_PERF
> +	if (is_cgroup_event(event)) {
> +		/*
> +		 * Only care about cgroup events.
> +		 *

That bit is entirely spurious, if it right after if (is_cgroup_event()),
obviously this block is only for cgroup events.

> +		 * If only the task belongs to cgroup of this event,
> +		 * we will continue the installment

And that isn't really english. I think you meant to write something
like:

		/*
		 * If the current cgroup doesn't match the event's
		 * cgroup, we should not try to schedule it.
		 */

> +		 */
> +		struct perf_cgroup *cgrp = perf_cgroup_from_task(current, ctx);
> +		reprogram = cgroup_is_descendant(cgrp->css.cgroup,
> +					event->cgrp->css.cgroup);
> +	}
> +#endif
> +
>  	if (reprogram) {
>  		ctx_sched_out(ctx, cpuctx, EVENT_TIME);
>  		add_event_to_ctx(event, ctx);
> -- 
> 2.8.4.31.g9ed660f
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH RESEND v5] perf/core: Fix installing arbitrary cgroup event into cpu
  2018-03-06 11:50 ` Peter Zijlstra
@ 2018-03-07 11:19   ` Lin Xiulei
  2018-03-12 12:24     ` Peter Zijlstra
  0 siblings, 1 reply; 8+ messages in thread
From: Lin Xiulei @ 2018-03-07 11:19 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Jiri Olsa, mingo, acme, alexander.shishkin, linux-kernel, tglx,
	Stephane Eranian, torvalds, linux-perf-users, Brendan Gregg,
	yang_oliver, leilei.lin

2018-03-06 19:50 GMT+08:00 Peter Zijlstra <peterz@infradead.org>:
> On Tue, Mar 06, 2018 at 05:36:37PM +0800, linxiulei@gmail.com wrote:
>> From: "leilei.lin" <leilei.lin@alibaba-inc.com>
>>
>> Do not install cgroup event into the CPU context and schedule it
>> if the cgroup is not running on this CPU
>
> OK, so far so good, this explains the bit in
> __perf_install_in_context().
>

Actually, the new codes in __perf_install_in_context() only takes care whether
if events should be scheduled with PMU.

>> While there is no task of cgroup running specified CPU, current
>> kernel still install cgroup event into CPU context that causes
>> another cgroup event can't be installed into this CPU.
>>
>> This patch prevent scheduling events at __perf_install_in_context()
>> and installing events at list_update_cgroup_event() if cgroup isn't
>> running on specified CPU.
>
> This bit doesn't make sense, you don't in fact avoid anything in
> list_update_cgroup_event(), you do more, not less.
>

And the new codes in list_update_cgroup_event() don't want cpuctx->cgrp
to be set arbitrarily. The more logic, you mentioned, was added for making
sure cpuctx->cgrp is set consistently with the cgroup running on the cpu.

>> diff --git a/kernel/events/core.c b/kernel/events/core.c
>> index 4df5b69..f3ffa70 100644
>> --- a/kernel/events/core.c
>> +++ b/kernel/events/core.c
>> @@ -933,31 +933,45 @@ list_update_cgroup_event(struct perf_event *event,
>>  {
>>       struct perf_cpu_context *cpuctx;
>>       struct list_head *cpuctx_entry;
>> +     struct perf_cgroup *cgrp;
>>
>>       if (!is_cgroup_event(event))
>>               return;
>>
>>       /*
>>        * Because cgroup events are always per-cpu events,
>>        * this will always be called from the right CPU.
>>        */
>>       cpuctx = __get_cpu_context(ctx);
>> +     cgrp = perf_cgroup_from_task(current, ctx);
>> +
>> +     /*
>> +      * if only the cgroup is running on this cpu
>> +      * and cpuctx->cgrp == NULL (otherwise it would've
>> +      * been set with running cgroup), we put this cgroup
>> +      * into cpu context. Or it would case mismatch in
>> +      * following cgroup events at event_filter_match()
>> +      */
>
> This is utterly incomprehensible, what?

Yes, this is bit messy. I should've made it clear. This comment was supposed
to explain the reason why I modified the if statement below.

And the logic is

1) when cpuctx-> cgrp is NULL, we __must__ take care of how to set it
appropriately, that means, we __have to__ check if the cgroup is running
on the cpu

2) when cpuctx-> cgrp is __NOT__ NULL. It means cpuctx->cgrp had been
set appropriately by cgroup_switch() or list_update_cgroup_event() before.
Therefore, We do __nothing__ here

>
>> +     if (add && !cpuctx->cgrp &&
>> +                     cgroup_is_descendant(cgrp->css.cgroup,
>> +                     event->cgrp->css.cgroup)) {
>> +             cpuctx->cgrp = cgrp;
>> +     }
>
> And that's just horrible coding style. Maybe something like:
>
>         if (add && cgroup_is_descendant(cgrp->css.cgroup, event->cgrp->css.cgroup)) {
>                 if (cpuctx->cgrp)
>                         WARN_ON_ONCE(cpuctx->cgrp != cgrp);
>                 cpuctx->cgrp = cgrp;
>         }
>
> that? But that still needs a comment to explain _why_ we do that here.
> Under what condition would we fail to have cpuctx->cgrp set while
> ctx->nr_cgroups. Your comment doesn't explain nor does your Changelog.
>

        if (cpuctx->cgrp == NULL) /* As I said above, we only take
care this case. */
             if (add && cgroup_is_descendant(cgrp->css.cgroup,
event->cgrp->css.cgroup)) {
                      /* only when this cgroup is running */
                      cpuctx->cgrp = cgrp;
         }

>> +
>> +     if (add && ctx->nr_cgroups++)
>> +             return;
>> +     else if (!add && --ctx->nr_cgroups)
>> +             return;
>>
>> +     /* no cgroup running */
>> +     if (!add)
>> +             cpuctx->cgrp = NULL;
>> +
>> +     cpuctx_entry = &cpuctx->cgrp_cpuctx_entry;
>> +     if (add)
>>               list_add(cpuctx_entry, this_cpu_ptr(&cgrp_cpuctx_list));
>> +     else
>>               list_del(cpuctx_entry);
>>  }
>>
>>  #else /* !CONFIG_CGROUP_PERF */
>> @@ -2311,6 +2325,20 @@ static int  __perf_install_in_context(void *info)
>>               raw_spin_lock(&task_ctx->lock);
>>       }
>>
>> +#ifdef CONFIG_CGROUP_PERF
>> +     if (is_cgroup_event(event)) {
>> +             /*
>> +              * Only care about cgroup events.
>> +              *
>
> That bit is entirely spurious, if it right after if (is_cgroup_event()),
> obviously this block is only for cgroup events.
>

Totally, : )

>> +              * If only the task belongs to cgroup of this event,
>> +              * we will continue the installment
>
> And that isn't really english. I think you meant to write something
> like:
>
>                 /*
>                  * If the current cgroup doesn't match the event's
>                  * cgroup, we should not try to schedule it.
>                  */
>

Totally again, : ) Thanks

>> +              */
>> +             struct perf_cgroup *cgrp = perf_cgroup_from_task(current, ctx);
>> +             reprogram = cgroup_is_descendant(cgrp->css.cgroup,
>> +                                     event->cgrp->css.cgroup);
>> +     }
>> +#endif
>> +
>>       if (reprogram) {
>>               ctx_sched_out(ctx, cpuctx, EVENT_TIME);
>>               add_event_to_ctx(event, ctx);
>> --
>> 2.8.4.31.g9ed660f
>>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH RESEND v5] perf/core: Fix installing arbitrary cgroup event into cpu
  2018-03-06  9:36 [PATCH RESEND v5] perf/core: Fix installing arbitrary cgroup event into cpu linxiulei
  2018-03-06 11:50 ` Peter Zijlstra
@ 2018-03-12  7:53 ` Ingo Molnar
  2018-03-12 11:46   ` Lin Xiulei
  2018-03-12 17:47 ` [tip:perf/core] perf/core: Fix installing cgroup events on CPU tip-bot for leilei.lin
  2 siblings, 1 reply; 8+ messages in thread
From: Ingo Molnar @ 2018-03-12  7:53 UTC (permalink / raw)
  To: linxiulei
  Cc: peterz, jolsa, mingo, acme, alexander.shishkin, linux-kernel,
	tglx, eranian, torvalds, linux-perf-users, brendan.d.gregg,
	yang_oliver, leilei.lin


* linxiulei@gmail.com <linxiulei@gmail.com> wrote:

>  	/*
>  	 * Because cgroup events are always per-cpu events,
>  	 * this will always be called from the right CPU.
>  	 */

> +	/*
> +	 * if only the cgroup is running on this cpu
> +	 * and cpuctx->cgrp == NULL (otherwise it would've
> +	 * been set with running cgroup), we put this cgroup
> +	 * into cpu context. Or it would case mismatch in
> +	 * following cgroup events at event_filter_match()
> +	 */

Beyond making sure that what you comment on makes sense, please also follow 
existing comment style, which I quoted above.

There's 3 separate mistakes in that paragraph alone.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH RESEND v5] perf/core: Fix installing arbitrary cgroup event into cpu
  2018-03-12  7:53 ` Ingo Molnar
@ 2018-03-12 11:46   ` Lin Xiulei
  0 siblings, 0 replies; 8+ messages in thread
From: Lin Xiulei @ 2018-03-12 11:46 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Peter Zijlstra, Jiri Olsa, mingo, acme, alexander.shishkin,
	linux-kernel, tglx, Stephane Eranian, torvalds, linux-perf-users,
	Brendan Gregg, yang_oliver, leilei.lin

2018-03-12 15:53 GMT+08:00 Ingo Molnar <mingo@kernel.org>:
>
> * linxiulei@gmail.com <linxiulei@gmail.com> wrote:
>
>>       /*
>>        * Because cgroup events are always per-cpu events,
>>        * this will always be called from the right CPU.
>>        */
>
>> +     /*
>> +      * if only the cgroup is running on this cpu
>> +      * and cpuctx->cgrp == NULL (otherwise it would've
>> +      * been set with running cgroup), we put this cgroup
>> +      * into cpu context. Or it would case mismatch in
>> +      * following cgroup events at event_filter_match()
>> +      */
>
> Beyond making sure that what you comment on makes sense, please also follow
> existing comment style, which I quoted above.
>
> There's 3 separate mistakes in that paragraph alone.
>
> Thanks,
>
>         Ingo

Thank Ingo for pointing it out. though I'd try to follow it as
possible as I could. I wish I have somewhere to check style with
specific documents : )

+       /*
+        * We will set cpuctx->cgrp only if:
+        *   1) cpuctx->cgrp is NULL. Because cpuctx->cgrp was set
+        *      correctly by sched_in(). We don't need to set it again.
+        *   2) cgroup related to this event is running on this CPU.
+        *      Otherwise it will fail in event_filter_match()
+        */

Peter, did I explain that logic clearly and prove it right?

Regards

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH RESEND v5] perf/core: Fix installing arbitrary cgroup event into cpu
  2018-03-07 11:19   ` Lin Xiulei
@ 2018-03-12 12:24     ` Peter Zijlstra
  2018-03-12 13:10       ` Lin Xiulei
  0 siblings, 1 reply; 8+ messages in thread
From: Peter Zijlstra @ 2018-03-12 12:24 UTC (permalink / raw)
  To: Lin Xiulei
  Cc: Jiri Olsa, mingo, acme, alexander.shishkin, linux-kernel, tglx,
	Stephane Eranian, torvalds, linux-perf-users, Brendan Gregg,
	yang_oliver, leilei.lin

On Wed, Mar 07, 2018 at 07:19:15PM +0800, Lin Xiulei wrote:

> >> +     /*
> >> +      * if only the cgroup is running on this cpu
> >> +      * and cpuctx->cgrp == NULL (otherwise it would've
> >> +      * been set with running cgroup), we put this cgroup
> >> +      * into cpu context. Or it would case mismatch in
> >> +      * following cgroup events at event_filter_match()
> >> +      */
> >
> > This is utterly incomprehensible, what?
> 
> Yes, this is bit messy. I should've made it clear. This comment was supposed
> to explain the reason why I modified the if statement below.
> 
> And the logic is
> 
> 1) when cpuctx-> cgrp is NULL, we __must__ take care of how to set it
> appropriately, that means, we __have to__ check if the cgroup is running
> on the cpu

Yes, however, the changelog needs to explain why this was
changed, and the above does not explain where the old code was wrong.

In what case do we fail to do 1 with the current code?

The existing nr_cgroups logic ensures we only continue for the
first/last cgroup event for that context. Is the problem that if the
first cgroup is _not_ of the current cgroup and we correctly do _not_
set cpuctx->cgrp, a second event that _is_ of the right cgroup will then
not have the opportunity to set cpuctx->cgrp ?

> 2) when cpuctx-> cgrp is __NOT__ NULL. It means cpuctx->cgrp had been
> set appropriately by cgroup_switch() or list_update_cgroup_event() before.
> Therefore, We do __nothing__ here

Right, but my variant doubled checked it. If its _not_ NULL it must be
the current task's cgrp, otherwise we have a problem. Same difference,
more paranoia :-)

But I suppose you're right and we can avoid a bunch of looking up if
cpuctx->cgrp is already set.


How is the below patch?

---
Subject: perf/core: Fix installing cgroup events on CPU
From: "leilei.lin" <leilei.lin@alibaba-inc.com>
Date: Tue, 6 Mar 2018 17:36:37 +0800

There's two problems when installing cgroup events on CPUs: firstly
list_update_cgroup_event() only tries to set cpuctx->cgrp for the
first event, if that mismatches on @cgrp we'll not try again for later
additions.

Secondly, when we install a cgroup event into an active context, only
issue an event reprogram when the event matches the current cgroup
context. This avoids a pointless event reprogramming.

Cc: acme@kernel.org
Cc: yang_oliver@hotmail.com
Cc: mingo@redhat.com
Cc: jolsa@redhat.com
Cc: eranian@gmail.com
Cc: torvalds@linux-foundation.org
Cc: tglx@linutronix.de
Cc: brendan.d.gregg@gmail.com
Cc: alexander.shishkin@linux.intel.com
Signed-off-by: leilei.lin <leilei.lin@alibaba-inc.com>
[peterz: Changelog and comments]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180306093637.28247-1-linxiulei@gmail.com
---
 kernel/events/core.c |   46 +++++++++++++++++++++++++++++++++++-----------
 1 file changed, 35 insertions(+), 11 deletions(-)

--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -937,27 +937,39 @@ list_update_cgroup_event(struct perf_eve
 	if (!is_cgroup_event(event))
 		return;
 
-	if (add && ctx->nr_cgroups++)
-		return;
-	else if (!add && --ctx->nr_cgroups)
-		return;
 	/*
 	 * Because cgroup events are always per-cpu events,
 	 * this will always be called from the right CPU.
 	 */
 	cpuctx = __get_cpu_context(ctx);
-	cpuctx_entry = &cpuctx->cgrp_cpuctx_entry;
-	/* cpuctx->cgrp is NULL unless a cgroup event is active in this CPU .*/
-	if (add) {
+
+	/*
+	 * Since setting cpuctx->cgrp is conditional on the current @cgrp
+	 * matching the event's cgroup, we must do this for every new event,
+	 * because if the first would mismatch, the second would not try again
+	 * and we would leave cpuctx->cgrp unset.
+	 */
+	if (add && !cpuctx->cgrp) {
 		struct perf_cgroup *cgrp = perf_cgroup_from_task(current, ctx);
 
-		list_add(cpuctx_entry, this_cpu_ptr(&cgrp_cpuctx_list));
 		if (cgroup_is_descendant(cgrp->css.cgroup, event->cgrp->css.cgroup))
 			cpuctx->cgrp = cgrp;
-	} else {
-		list_del(cpuctx_entry);
-		cpuctx->cgrp = NULL;
 	}
+
+	if (add && ctx->nr_cgroups++)
+		return;
+	else if (!add && --ctx->nr_cgroups)
+		return;
+
+	/* no cgroup running */
+	if (!add)
+		cpuctx->cgrp = NULL;
+
+	cpuctx_entry = &cpuctx->cgrp_cpuctx_entry;
+	if (add)
+		list_add(cpuctx_entry, this_cpu_ptr(&cgrp_cpuctx_list));
+	else
+		list_del(cpuctx_entry);
 }
 
 #else /* !CONFIG_CGROUP_PERF */
@@ -2491,6 +2503,18 @@ static int  __perf_install_in_context(vo
 		raw_spin_lock(&task_ctx->lock);
 	}
 
+#ifdef CONFIG_CGROUP_PERF
+	if (is_cgroup_event(event)) {
+		/*
+		 * If the current cgroup doesn't match the event's
+		 * cgroup, we should not try to schedule it.
+		 */
+		struct perf_cgroup *cgrp = perf_cgroup_from_task(current, ctx);
+		reprogram = cgroup_is_descendant(cgrp->css.cgroup,
+					event->cgrp->css.cgroup);
+	}
+#endif
+
 	if (reprogram) {
 		ctx_sched_out(ctx, cpuctx, EVENT_TIME);
 		add_event_to_ctx(event, ctx);

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH RESEND v5] perf/core: Fix installing arbitrary cgroup event into cpu
  2018-03-12 12:24     ` Peter Zijlstra
@ 2018-03-12 13:10       ` Lin Xiulei
  0 siblings, 0 replies; 8+ messages in thread
From: Lin Xiulei @ 2018-03-12 13:10 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Jiri Olsa, mingo, acme, alexander.shishkin, linux-kernel, tglx,
	Stephane Eranian, torvalds, linux-perf-users, Brendan Gregg,
	yang_oliver, leilei.lin

2018-03-12 20:24 GMT+08:00 Peter Zijlstra <peterz@infradead.org>:
> On Wed, Mar 07, 2018 at 07:19:15PM +0800, Lin Xiulei wrote:
>
>> >> +     /*
>> >> +      * if only the cgroup is running on this cpu
>> >> +      * and cpuctx->cgrp == NULL (otherwise it would've
>> >> +      * been set with running cgroup), we put this cgroup
>> >> +      * into cpu context. Or it would case mismatch in
>> >> +      * following cgroup events at event_filter_match()
>> >> +      */
>> >
>> > This is utterly incomprehensible, what?
>>
>> Yes, this is bit messy. I should've made it clear. This comment was supposed
>> to explain the reason why I modified the if statement below.
>>
>> And the logic is
>>
>> 1) when cpuctx-> cgrp is NULL, we __must__ take care of how to set it
>> appropriately, that means, we __have to__ check if the cgroup is running
>> on the cpu
>
> Yes, however, the changelog needs to explain why this was
> changed, and the above does not explain where the old code was wrong.
>

Yes, it's good to be involved in community since you guys give great opinions

> In what case do we fail to do 1 with the current code?
>

With current code, it doesn't check if the cgroup is running on the
CPU. then problem happens when two cgroup events on the same CPU are
opened in a row, which means no schedule happen between it

1) event A on cgroup A has no tasks running on this CPU
2) event B on cgroup B has some task running on the CPU

With the current code, cpuctx->cgrp would be set as cgroup A
arbitrarily, then it opens event B, cpuctx->cgrp will not be set
anymore, and it fails in event_filter_match() because cpuctx->cgrp !=
cgroup B. We have to wait schedule() happened to correct it. But in
this situation, we lost a slight period that measures event B.


> The existing nr_cgroups logic ensures we only continue for the
> first/last cgroup event for that context. Is the problem that if the
> first cgroup is _not_ of the current cgroup and we correctly do _not_
> set cpuctx->cgrp, a second event that _is_ of the right cgroup will then
> not have the opportunity to set cpuctx->cgrp ?
>

Yes, You got it.

>> 2) when cpuctx-> cgrp is __NOT__ NULL. It means cpuctx->cgrp had been
>> set appropriately by cgroup_switch() or list_update_cgroup_event() before.
>> Therefore, We do __nothing__ here
>
> Right, but my variant doubled checked it. If its _not_ NULL it must be
> the current task's cgrp, otherwise we have a problem. Same difference,
> more paranoia :-)
>
> But I suppose you're right and we can avoid a bunch of looking up if
> cpuctx->cgrp is already set.
>

Yes, actually you have a good instinct. But what my further concern is
when we should set cpuctx->cgrp to NULL again when one running event
is closed ? By this way, we can avoid a bunch of checking
cgroup_is_descendant(). I didn't figure it out in this patch, and of
course I didn't make it worse. :-)

>
> How is the below patch?
>

You're AWESOME : )

> ---
> Subject: perf/core: Fix installing cgroup events on CPU
> From: "leilei.lin" <leilei.lin@alibaba-inc.com>
> Date: Tue, 6 Mar 2018 17:36:37 +0800
>
> There's two problems when installing cgroup events on CPUs: firstly
> list_update_cgroup_event() only tries to set cpuctx->cgrp for the
> first event, if that mismatches on @cgrp we'll not try again for later
> additions.
>
> Secondly, when we install a cgroup event into an active context, only
> issue an event reprogram when the event matches the current cgroup
> context. This avoids a pointless event reprogramming.
>
> Cc: acme@kernel.org
> Cc: yang_oliver@hotmail.com
> Cc: mingo@redhat.com
> Cc: jolsa@redhat.com
> Cc: eranian@gmail.com
> Cc: torvalds@linux-foundation.org
> Cc: tglx@linutronix.de
> Cc: brendan.d.gregg@gmail.com
> Cc: alexander.shishkin@linux.intel.com
> Signed-off-by: leilei.lin <leilei.lin@alibaba-inc.com>
> [peterz: Changelog and comments]
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> Link: http://lkml.kernel.org/r/20180306093637.28247-1-linxiulei@gmail.com
> ---
>  kernel/events/core.c |   46 +++++++++++++++++++++++++++++++++++-----------
>  1 file changed, 35 insertions(+), 11 deletions(-)
>
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -937,27 +937,39 @@ list_update_cgroup_event(struct perf_eve
>         if (!is_cgroup_event(event))
>                 return;
>
> -       if (add && ctx->nr_cgroups++)
> -               return;
> -       else if (!add && --ctx->nr_cgroups)
> -               return;
>         /*
>          * Because cgroup events are always per-cpu events,
>          * this will always be called from the right CPU.
>          */
>         cpuctx = __get_cpu_context(ctx);
> -       cpuctx_entry = &cpuctx->cgrp_cpuctx_entry;
> -       /* cpuctx->cgrp is NULL unless a cgroup event is active in this CPU .*/
> -       if (add) {
> +
> +       /*
> +        * Since setting cpuctx->cgrp is conditional on the current @cgrp
> +        * matching the event's cgroup, we must do this for every new event,
> +        * because if the first would mismatch, the second would not try again
> +        * and we would leave cpuctx->cgrp unset.
> +        */
> +       if (add && !cpuctx->cgrp) {
>                 struct perf_cgroup *cgrp = perf_cgroup_from_task(current, ctx);
>
> -               list_add(cpuctx_entry, this_cpu_ptr(&cgrp_cpuctx_list));
>                 if (cgroup_is_descendant(cgrp->css.cgroup, event->cgrp->css.cgroup))
>                         cpuctx->cgrp = cgrp;
> -       } else {
> -               list_del(cpuctx_entry);
> -               cpuctx->cgrp = NULL;
>         }
> +
> +       if (add && ctx->nr_cgroups++)
> +               return;
> +       else if (!add && --ctx->nr_cgroups)
> +               return;
> +
> +       /* no cgroup running */
> +       if (!add)
> +               cpuctx->cgrp = NULL;
> +
> +       cpuctx_entry = &cpuctx->cgrp_cpuctx_entry;
> +       if (add)
> +               list_add(cpuctx_entry, this_cpu_ptr(&cgrp_cpuctx_list));
> +       else
> +               list_del(cpuctx_entry);
>  }
>
>  #else /* !CONFIG_CGROUP_PERF */
> @@ -2491,6 +2503,18 @@ static int  __perf_install_in_context(vo
>                 raw_spin_lock(&task_ctx->lock);
>         }
>
> +#ifdef CONFIG_CGROUP_PERF
> +       if (is_cgroup_event(event)) {
> +               /*
> +                * If the current cgroup doesn't match the event's
> +                * cgroup, we should not try to schedule it.
> +                */
> +               struct perf_cgroup *cgrp = perf_cgroup_from_task(current, ctx);
> +               reprogram = cgroup_is_descendant(cgrp->css.cgroup,
> +                                       event->cgrp->css.cgroup);
> +       }
> +#endif
> +
>         if (reprogram) {
>                 ctx_sched_out(ctx, cpuctx, EVENT_TIME);
>                 add_event_to_ctx(event, ctx);

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [tip:perf/core] perf/core: Fix installing cgroup events on CPU
  2018-03-06  9:36 [PATCH RESEND v5] perf/core: Fix installing arbitrary cgroup event into cpu linxiulei
  2018-03-06 11:50 ` Peter Zijlstra
  2018-03-12  7:53 ` Ingo Molnar
@ 2018-03-12 17:47 ` tip-bot for leilei.lin
  2 siblings, 0 replies; 8+ messages in thread
From: tip-bot for leilei.lin @ 2018-03-12 17:47 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: torvalds, linux-kernel, alexander.shishkin, mingo, hpa,
	leilei.lin, acme, peterz, vincent.weaver, jolsa, tglx, eranian

Commit-ID:  33801b94741d6c3be9713c10aa627477216c21e2
Gitweb:     https://git.kernel.org/tip/33801b94741d6c3be9713c10aa627477216c21e2
Author:     leilei.lin <leilei.lin@alibaba-inc.com>
AuthorDate: Tue, 6 Mar 2018 17:36:37 +0800
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 12 Mar 2018 15:28:51 +0100

perf/core: Fix installing cgroup events on CPU

There's two problems when installing cgroup events on CPUs: firstly
list_update_cgroup_event() only tries to set cpuctx->cgrp for the
first event, if that mismatches on @cgrp we'll not try again for later
additions.

Secondly, when we install a cgroup event into an active context, only
issue an event reprogram when the event matches the current cgroup
context. This avoids a pointless event reprogramming.

Signed-off-by: leilei.lin <leilei.lin@alibaba-inc.com>
[ Improved the changelog and comments. ]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: brendan.d.gregg@gmail.com
Cc: eranian@gmail.com
Cc: linux-kernel@vger.kernel.org
Cc: yang_oliver@hotmail.com
Link: http://lkml.kernel.org/r/20180306093637.28247-1-linxiulei@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/events/core.c | 46 +++++++++++++++++++++++++++++++++++-----------
 1 file changed, 35 insertions(+), 11 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index f98c0f88cc94..969f865f9f1c 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -937,27 +937,39 @@ list_update_cgroup_event(struct perf_event *event,
 	if (!is_cgroup_event(event))
 		return;
 
-	if (add && ctx->nr_cgroups++)
-		return;
-	else if (!add && --ctx->nr_cgroups)
-		return;
 	/*
 	 * Because cgroup events are always per-cpu events,
 	 * this will always be called from the right CPU.
 	 */
 	cpuctx = __get_cpu_context(ctx);
-	cpuctx_entry = &cpuctx->cgrp_cpuctx_entry;
-	/* cpuctx->cgrp is NULL unless a cgroup event is active in this CPU .*/
-	if (add) {
+
+	/*
+	 * Since setting cpuctx->cgrp is conditional on the current @cgrp
+	 * matching the event's cgroup, we must do this for every new event,
+	 * because if the first would mismatch, the second would not try again
+	 * and we would leave cpuctx->cgrp unset.
+	 */
+	if (add && !cpuctx->cgrp) {
 		struct perf_cgroup *cgrp = perf_cgroup_from_task(current, ctx);
 
-		list_add(cpuctx_entry, this_cpu_ptr(&cgrp_cpuctx_list));
 		if (cgroup_is_descendant(cgrp->css.cgroup, event->cgrp->css.cgroup))
 			cpuctx->cgrp = cgrp;
-	} else {
-		list_del(cpuctx_entry);
-		cpuctx->cgrp = NULL;
 	}
+
+	if (add && ctx->nr_cgroups++)
+		return;
+	else if (!add && --ctx->nr_cgroups)
+		return;
+
+	/* no cgroup running */
+	if (!add)
+		cpuctx->cgrp = NULL;
+
+	cpuctx_entry = &cpuctx->cgrp_cpuctx_entry;
+	if (add)
+		list_add(cpuctx_entry, this_cpu_ptr(&cgrp_cpuctx_list));
+	else
+		list_del(cpuctx_entry);
 }
 
 #else /* !CONFIG_CGROUP_PERF */
@@ -2489,6 +2501,18 @@ static int  __perf_install_in_context(void *info)
 		raw_spin_lock(&task_ctx->lock);
 	}
 
+#ifdef CONFIG_CGROUP_PERF
+	if (is_cgroup_event(event)) {
+		/*
+		 * If the current cgroup doesn't match the event's
+		 * cgroup, we should not try to schedule it.
+		 */
+		struct perf_cgroup *cgrp = perf_cgroup_from_task(current, ctx);
+		reprogram = cgroup_is_descendant(cgrp->css.cgroup,
+					event->cgrp->css.cgroup);
+	}
+#endif
+
 	if (reprogram) {
 		ctx_sched_out(ctx, cpuctx, EVENT_TIME);
 		add_event_to_ctx(event, ctx);

^ permalink raw reply related	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2018-03-12 17:47 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-03-06  9:36 [PATCH RESEND v5] perf/core: Fix installing arbitrary cgroup event into cpu linxiulei
2018-03-06 11:50 ` Peter Zijlstra
2018-03-07 11:19   ` Lin Xiulei
2018-03-12 12:24     ` Peter Zijlstra
2018-03-12 13:10       ` Lin Xiulei
2018-03-12  7:53 ` Ingo Molnar
2018-03-12 11:46   ` Lin Xiulei
2018-03-12 17:47 ` [tip:perf/core] perf/core: Fix installing cgroup events on CPU tip-bot for leilei.lin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).