linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] perf: fix group with mixed hw and sw events
@ 2018-05-03 19:47 Song Liu
  2018-05-22 21:59 ` Song Liu
  2018-05-25  9:48 ` [tip:perf/core] perf/core: Fix group scheduling " tip-bot for Song Liu
  0 siblings, 2 replies; 3+ messages in thread
From: Song Liu @ 2018-05-03 19:47 UTC (permalink / raw)
  To: linux-kernel; +Cc: Song Liu, kernel-team, Peter Zijlstra

When hw and sw events are mixed in the same group, they are all attached
to the hw perf_event_context. This sometimes requires moving group of
perf_event to a different context. We found an issue in the moving. Here
is an example of it.

   perf stat -e '{faults,ref-cycles,faults}'  -I 1000

     1.005591180              1,297      faults
     1.005591180        457,476,576      ref-cycles
     1.005591180    <not supported>      faults

First, sw event "faults" is attached to the sw context, and become the
group leader. Then, hw event "ref-cycles" is attached, so both events
are moved to hw context. Last, another sw "faults" tries to attach,
but it fails because of mismatch between the new target ctx (from sw
pmu) and the group_leader's ctx (hw context, same as ref-cycles).

The broken condition is:
   group_leader is sw event;
   group_leader is on hw context;
   add a sw event to the group.

This patch fixes this scenario by checking group_leader's context
(instead of just event type). If group_leader is on hw context, use
pmu of this context to look up context for the new event.

Fixes: b04243ef7006 ("perf: Complete software pmu grouping")
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Song Liu <songliubraving@fb.com>
---
 include/linux/perf_event.h |  8 ++++++++
 kernel/events/core.c       | 21 +++++++++++----------
 2 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index e71e99e..def866f 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1016,6 +1016,14 @@ static inline int is_software_event(struct perf_event *event)
 	return event->event_caps & PERF_EV_CAP_SOFTWARE;
 }
 
+/*
+ * Return 1 for event in sw context, 0 for event in hw context
+ */
+static inline int in_software_context(struct perf_event *event)
+{
+	return event->ctx->pmu->task_ctx_nr == perf_sw_context;
+}
+
 extern struct static_key perf_swevent_enabled[PERF_COUNT_SW_MAX];
 
 extern void ___perf_sw_event(u32, u64, struct pt_regs *, u64);
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 67612ce..ce6aa5f 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -10521,19 +10521,20 @@ SYSCALL_DEFINE5(perf_event_open,
 	if (pmu->task_ctx_nr == perf_sw_context)
 		event->event_caps |= PERF_EV_CAP_SOFTWARE;
 
-	if (group_leader &&
-	    (is_software_event(event) != is_software_event(group_leader))) {
-		if (is_software_event(event)) {
+	if (group_leader) {
+		if (is_software_event(event) &&
+		    !in_software_context(group_leader)) {
 			/*
-			 * If event and group_leader are not both a software
-			 * event, and event is, then group leader is not.
+			 * If the event is a sw event, but the group_leader
+			 * is on hw context.
 			 *
-			 * Allow the addition of software events to !software
-			 * groups, this is safe because software events never
-			 * fail to schedule.
+			 * Allow the addition of software events to hw
+			 * groups, this is safe because software events
+			 * never fail to schedule.
 			 */
-			pmu = group_leader->pmu;
-		} else if (is_software_event(group_leader) &&
+			pmu = group_leader->ctx->pmu;
+		} else if (!is_software_event(event) &&
+			   is_software_event(group_leader) &&
 			   (group_leader->group_caps & PERF_EV_CAP_SOFTWARE)) {
 			/*
 			 * In case the group is a pure software group, and we
-- 
2.9.5

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] perf: fix group with mixed hw and sw events
  2018-05-03 19:47 [PATCH] perf: fix group with mixed hw and sw events Song Liu
@ 2018-05-22 21:59 ` Song Liu
  2018-05-25  9:48 ` [tip:perf/core] perf/core: Fix group scheduling " tip-bot for Song Liu
  1 sibling, 0 replies; 3+ messages in thread
From: Song Liu @ 2018-05-22 21:59 UTC (permalink / raw)
  To: linux-kernel; +Cc: Kernel Team, Peter Zijlstra

Dear Peter,

Could you please share your comments on this minor fix?

Best,
Song


> On May 3, 2018, at 12:47 PM, Song Liu <songliubraving@fb.com> wrote:
> 
> When hw and sw events are mixed in the same group, they are all attached
> to the hw perf_event_context. This sometimes requires moving group of
> perf_event to a different context. We found an issue in the moving. Here
> is an example of it.
> 
>   perf stat -e '{faults,ref-cycles,faults}'  -I 1000
> 
>     1.005591180              1,297      faults
>     1.005591180        457,476,576      ref-cycles
>     1.005591180    <not supported>      faults
> 
> First, sw event "faults" is attached to the sw context, and become the
> group leader. Then, hw event "ref-cycles" is attached, so both events
> are moved to hw context. Last, another sw "faults" tries to attach,
> but it fails because of mismatch between the new target ctx (from sw
> pmu) and the group_leader's ctx (hw context, same as ref-cycles).
> 
> The broken condition is:
>   group_leader is sw event;
>   group_leader is on hw context;
>   add a sw event to the group.
> 
> This patch fixes this scenario by checking group_leader's context
> (instead of just event type). If group_leader is on hw context, use
> pmu of this context to look up context for the new event.
> 
> Fixes: b04243ef7006 ("perf: Complete software pmu grouping")
> Cc: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Song Liu <songliubraving@fb.com>
> ---
> include/linux/perf_event.h |  8 ++++++++
> kernel/events/core.c       | 21 +++++++++++----------
> 2 files changed, 19 insertions(+), 10 deletions(-)
> 
> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
> index e71e99e..def866f 100644
> --- a/include/linux/perf_event.h
> +++ b/include/linux/perf_event.h
> @@ -1016,6 +1016,14 @@ static inline int is_software_event(struct perf_event *event)
> 	return event->event_caps & PERF_EV_CAP_SOFTWARE;
> }
> 
> +/*
> + * Return 1 for event in sw context, 0 for event in hw context
> + */
> +static inline int in_software_context(struct perf_event *event)
> +{
> +	return event->ctx->pmu->task_ctx_nr == perf_sw_context;
> +}
> +
> extern struct static_key perf_swevent_enabled[PERF_COUNT_SW_MAX];
> 
> extern void ___perf_sw_event(u32, u64, struct pt_regs *, u64);
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 67612ce..ce6aa5f 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -10521,19 +10521,20 @@ SYSCALL_DEFINE5(perf_event_open,
> 	if (pmu->task_ctx_nr == perf_sw_context)
> 		event->event_caps |= PERF_EV_CAP_SOFTWARE;
> 
> -	if (group_leader &&
> -	    (is_software_event(event) != is_software_event(group_leader))) {
> -		if (is_software_event(event)) {
> +	if (group_leader) {
> +		if (is_software_event(event) &&
> +		    !in_software_context(group_leader)) {
> 			/*
> -			 * If event and group_leader are not both a software
> -			 * event, and event is, then group leader is not.
> +			 * If the event is a sw event, but the group_leader
> +			 * is on hw context.
> 			 *
> -			 * Allow the addition of software events to !software
> -			 * groups, this is safe because software events never
> -			 * fail to schedule.
> +			 * Allow the addition of software events to hw
> +			 * groups, this is safe because software events
> +			 * never fail to schedule.
> 			 */
> -			pmu = group_leader->pmu;
> -		} else if (is_software_event(group_leader) &&
> +			pmu = group_leader->ctx->pmu;
> +		} else if (!is_software_event(event) &&
> +			   is_software_event(group_leader) &&
> 			   (group_leader->group_caps & PERF_EV_CAP_SOFTWARE)) {
> 			/*
> 			 * In case the group is a pure software group, and we
> -- 
> 2.9.5
> 

^ permalink raw reply	[flat|nested] 3+ messages in thread

* [tip:perf/core] perf/core: Fix group scheduling with mixed hw and sw events
  2018-05-03 19:47 [PATCH] perf: fix group with mixed hw and sw events Song Liu
  2018-05-22 21:59 ` Song Liu
@ 2018-05-25  9:48 ` tip-bot for Song Liu
  1 sibling, 0 replies; 3+ messages in thread
From: tip-bot for Song Liu @ 2018-05-25  9:48 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: songliubraving, eranian, peterz, hpa, mingo, kernel-team,
	torvalds, linux-kernel, tglx, alexander.shishkin, jolsa, acme,
	vincent.weaver

Commit-ID:  a1150c202207cc8501bebc45b63c264f91959260
Gitweb:     https://git.kernel.org/tip/a1150c202207cc8501bebc45b63c264f91959260
Author:     Song Liu <songliubraving@fb.com>
AuthorDate: Thu, 3 May 2018 12:47:16 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Fri, 25 May 2018 08:11:10 +0200

perf/core: Fix group scheduling with mixed hw and sw events

When hw and sw events are mixed in the same group, they are all attached
to the hw perf_event_context. This sometimes requires moving group of
perf_event to a different context.

We found a bug in how the kernel handles this, for example if we do:

   perf stat -e '{faults,ref-cycles,faults}'  -I 1000

     1.005591180              1,297      faults
     1.005591180        457,476,576      ref-cycles
     1.005591180    <not supported>      faults

First, sw event "faults" is attached to the sw context, and becomes the
group leader. Then, hw event "ref-cycles" is attached, so both events
are moved to the hw context. Last, another sw "faults" tries to attach,
but it fails because of mismatch between the new target ctx (from sw
pmu) and the group_leader's ctx (hw context, same as ref-cycles).

The broken condition is:
   group_leader is sw event;
   group_leader is on hw context;
   add a sw event to the group.

Fix this scenario by checking group_leader's context (instead of just
event type). If group_leader is on hw context, use the ->pmu of this
context to look up context for the new event.

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: <kernel-team@fb.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Fixes: b04243ef7006 ("perf: Complete software pmu grouping")
Link: http://lkml.kernel.org/r/20180503194716.162815-1-songliubraving@fb.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 include/linux/perf_event.h |  8 ++++++++
 kernel/events/core.c       | 21 +++++++++++----------
 2 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index e71e99eb9a4e..def866f7269b 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1016,6 +1016,14 @@ static inline int is_software_event(struct perf_event *event)
 	return event->event_caps & PERF_EV_CAP_SOFTWARE;
 }
 
+/*
+ * Return 1 for event in sw context, 0 for event in hw context
+ */
+static inline int in_software_context(struct perf_event *event)
+{
+	return event->ctx->pmu->task_ctx_nr == perf_sw_context;
+}
+
 extern struct static_key perf_swevent_enabled[PERF_COUNT_SW_MAX];
 
 extern void ___perf_sw_event(u32, u64, struct pt_regs *, u64);
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 67612ce359ad..ce6aa5ff3c96 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -10521,19 +10521,20 @@ SYSCALL_DEFINE5(perf_event_open,
 	if (pmu->task_ctx_nr == perf_sw_context)
 		event->event_caps |= PERF_EV_CAP_SOFTWARE;
 
-	if (group_leader &&
-	    (is_software_event(event) != is_software_event(group_leader))) {
-		if (is_software_event(event)) {
+	if (group_leader) {
+		if (is_software_event(event) &&
+		    !in_software_context(group_leader)) {
 			/*
-			 * If event and group_leader are not both a software
-			 * event, and event is, then group leader is not.
+			 * If the event is a sw event, but the group_leader
+			 * is on hw context.
 			 *
-			 * Allow the addition of software events to !software
-			 * groups, this is safe because software events never
-			 * fail to schedule.
+			 * Allow the addition of software events to hw
+			 * groups, this is safe because software events
+			 * never fail to schedule.
 			 */
-			pmu = group_leader->pmu;
-		} else if (is_software_event(group_leader) &&
+			pmu = group_leader->ctx->pmu;
+		} else if (!is_software_event(event) &&
+			   is_software_event(group_leader) &&
 			   (group_leader->group_caps & PERF_EV_CAP_SOFTWARE)) {
 			/*
 			 * In case the group is a pure software group, and we

^ permalink raw reply related	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2018-05-25  9:49 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-05-03 19:47 [PATCH] perf: fix group with mixed hw and sw events Song Liu
2018-05-22 21:59 ` Song Liu
2018-05-25  9:48 ` [tip:perf/core] perf/core: Fix group scheduling " tip-bot for Song Liu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).