* [PATCH 1/5] perf,core: allow invalid context events to be part of sw/hw groups
@ 2015-03-03  8:54 kan.liang
  2015-03-03  8:54 ` [PATCH 2/5] perf,tools: check and re-organize evsel cpu maps kan.liang
                   ` (3 more replies)
  0 siblings, 4 replies; 10+ messages in thread
From: kan.liang @ 2015-03-03  8:54 UTC (permalink / raw)
  To: a.p.zijlstra, acme, linux-kernel; +Cc: ak, Kan Liang

From: Kan Liang <kan.liang@intel.com>

A pmu marked as perf_invalid_context has no state to switch on context
switch; everything is global, so its events can safely be part of sw/hw
groups.
In sched_out/sched_in, del/add must still be called, so that the
perf_invalid_context event is disabled/enabled accordingly during a
context switch. The event count can only be read when the event is
already sched_in.

However, group read does not work with mixed events. For example:

perf record -e '{cycles,uncore_imc_0/cas_count_read/}:S' -a sleep 1

always fails with EINVAL.

This patch set fixes the issue. With the series applied:

perf record -e '{cycles,uncore_imc_0/cas_count_read/}:S' -a sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.202 MB perf.data (12 samples) ]

This patch special-cases invalid context events and allows them to be
part of sw/hw groups.
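
For reference, a minimal user-space sketch (not part of this patch) of
the grouped open that used to fail. The uncore type/config values here
are hypothetical placeholders; real ones come from
/sys/bus/event_source/devices/:

#include <linux/perf_event.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

static int perf_event_open(struct perf_event_attr *attr, pid_t pid,
			   int cpu, int group_fd, unsigned long flags)
{
	return syscall(__NR_perf_event_open, attr, pid, cpu, group_fd, flags);
}

/* Open {cycles, uncore event} as one group on a cpu. */
static int open_mixed_group(int cpu)
{
	struct perf_event_attr hw = { 0 }, uncore = { 0 };
	int leader;

	hw.type = PERF_TYPE_HARDWARE;
	hw.size = sizeof(hw);
	hw.config = PERF_COUNT_HW_CPU_CYCLES;

	uncore.type = 15;	/* hypothetical dynamic pmu type id */
	uncore.size = sizeof(uncore);
	uncore.config = 0x304;	/* hypothetical cas_count_read encoding */

	leader = perf_event_open(&hw, -1, cpu, -1, 0);
	if (leader < 0)
		return -1;
	/* Without this patch the kernel rejects the sibling with EINVAL. */
	return perf_event_open(&uncore, -1, cpu, leader, 0);
}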

Signed-off-by: Kan Liang <kan.liang@intel.com>
---
 include/linux/perf_event.h |  8 ++++++
 kernel/events/core.c       | 72 ++++++++++++++++++++++++++++++++++------------
 2 files changed, 61 insertions(+), 19 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index b8f69d3..6775e6c 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -711,6 +711,14 @@ static inline bool is_sampling_event(struct perf_event *event)
 /*
+ * Return 1 for an invalid context event, 0 otherwise
+ */
+static inline int is_invalid_context_event(struct perf_event *event)
+{
+	return event->pmu->task_ctx_nr == perf_invalid_context;
+}
+
+/*
  * Return 1 for a software event, 0 for a hardware event
  */
 static inline int is_software_event(struct perf_event *event)
 {
 	return event->pmu->task_ctx_nr == perf_sw_context;
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 89f0f16..9a709ab 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -1344,7 +1344,7 @@ static void perf_group_attach(struct perf_event *event)
 	WARN_ON_ONCE(group_leader->ctx != event->ctx);
 
 	if (group_leader->group_flags & PERF_GROUP_SOFTWARE &&
-			!is_software_event(event))
+			!is_software_event(event) && !is_invalid_context_event(event))
 		group_leader->group_flags &= ~PERF_GROUP_SOFTWARE;
 
 	list_add_tail(&event->group_entry, &group_leader->sibling_list);
@@ -7549,31 +7549,65 @@ SYSCALL_DEFINE5(perf_event_open,
 	account_event(event);
 
 	/*
-	 * Special case software events and allow them to be part of
-	 * any hardware group.
+	 * Special case for software events and invalid context events.
+	 * Allow software events to be part of any hardware group.
+	 * An invalid context event can only lead a group of pure
+	 * invalid context events, but it may be a member of any
+	 * software/hardware group.
 	 */
 	pmu = event->pmu;
 
 	if (group_leader &&
-	    (is_software_event(event) != is_software_event(group_leader))) {
-		if (is_software_event(event)) {
+	   (group_leader->pmu->task_ctx_nr != event->pmu->task_ctx_nr)) {
+		if (is_invalid_context_event(group_leader)) {
+			err = -EINVAL;
+			goto err_alloc;
+		} else if (is_software_event(group_leader)) {
+			if (is_invalid_context_event(event)) {
+				if (group_leader->group_flags & PERF_GROUP_SOFTWARE) {
+					/*
+					 * If the group leader is a software
+					 * event and the new event is an
+					 * invalid context event, allow
+					 * adding it to the software group.
+					 */
+					pmu = group_leader->pmu;
+				} else {
+					/*
+					 * The group leader is a software
+					 * event but the group is not pure
+					 * software. Find the hardware event
+					 * in the group and use its pmu.
+					 */
+					struct perf_event *tmp;
+
+					list_for_each_entry(tmp, &group_leader->sibling_list, group_entry) {
+						if (tmp->pmu->task_ctx_nr == perf_hw_context) {
+							pmu = tmp->pmu;
+							break;
+						}
+					}
+					if (pmu == event->pmu)
+						goto err_alloc;
+				}
+			} else {
+				if (group_leader->group_flags & PERF_GROUP_SOFTWARE) {
+					/*
+					 * In case the group is a pure software
+					 * group and we try to add a hardware
+					 * event, move the group to hardware context.
+					 */
+					move_group = 1;
+				}
+			}
+		} else {
 			/*
-			 * If event and group_leader are not both a software
-			 * event, and event is, then group leader is not.
-			 *
-			 * Allow the addition of software events to !software
-			 * groups, this is safe because software events never
-			 * fail to schedule.
+			 * If group_leader is a hardware event and event is
+			 * not, allow the addition of !hardware events to
+			 * hardware groups. This is safe because software and
+			 * invalid context events never fail to schedule.
 			 */
 			pmu = group_leader->pmu;
-		} else if (is_software_event(group_leader) &&
-			   (group_leader->group_flags & PERF_GROUP_SOFTWARE)) {
-			/*
-			 * In case the group is a pure software group, and we
-			 * try to add a hardware event, move the whole group to
-			 * the hardware context.
-			 */
-			move_group = 1;
 		}
 	}
 
-- 
1.8.3.1



* [PATCH 2/5] perf,tools: check and re-organize evsel cpu maps
  2015-03-03  8:54 [PATCH 1/5] perf,core: allow invalid context events to be part of sw/hw groups kan.liang
@ 2015-03-03  8:54 ` kan.liang
  2015-03-03 16:09   ` Arnaldo Carvalho de Melo
  2015-03-03  8:54 ` [PATCH 3/5] perf,tools: change perf stat to use event's cpu map kan.liang
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 10+ messages in thread
From: kan.liang @ 2015-03-03  8:54 UTC (permalink / raw)
  To: a.p.zijlstra, acme, linux-kernel; +Cc: ak, Kan Liang

From: Kan Liang <kan.liang@intel.com>

With patch 1/5, it is possible to group-read events from different
pmus. "-C" can be used to set the cpu list, and that list may be
incompatible with a pmu's cpumask.
This patch checks each event's cpu map and discards the incompatible
cpus.
An event's cpu map is saved in evsel->cpus during option parsing; the
evlist's cpu map is then created in perf_evlist__create_maps, so the
cpu maps can be checked and re-organized there.
Only the cpu_list case needs this check.
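
The core of the check is a merge-style intersection of two sorted cpu
lists. A standalone sketch of the idea, assuming both lists are sorted
ascending (the helper name is illustrative, not from the patch):

/* Keep in map[] only the cpus that also appear in list[]; returns the
 * number of cpus kept.  Mirrors what perf_evlist__check_evsel_cpus()
 * does in place on evsel->cpus->map.
 */
static int intersect_cpus(const int *list, int nlist, int *map, int nmap)
{
	int i = 0, j = 0, kept = 0;

	while (i < nmap && j < nlist) {
		if (map[i] == list[j]) {
			map[kept++] = map[i++];	/* cpu usable, keep it */
			j++;
		} else if (map[i] < list[j]) {
			i++;			/* not in the -C list, drop */
		} else {
			j++;			/* -C cpu not in pmu cpumask */
		}
	}
	return kept;	/* 0 means cpumask and -C list are disjoint */
}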

Signed-off-by: Kan Liang <kan.liang@intel.com>
---
 tools/perf/builtin-top.c |  6 ++--
 tools/perf/util/evlist.c | 72 ++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 75 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 5fb8723..f40d1d6 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -1218,9 +1218,6 @@ int cmd_top(int argc, const char **argv, const char *prefix __maybe_unused)
 	if (target__none(target))
 		target->system_wide = true;
 
-	if (perf_evlist__create_maps(top.evlist, target) < 0)
-		usage_with_options(top_usage, options);
-
 	if (!top.evlist->nr_entries &&
 	    perf_evlist__add_default(top.evlist) < 0) {
 		ui__error("Not enough memory for event selector list\n");
@@ -1229,6 +1226,9 @@ int cmd_top(int argc, const char **argv, const char *prefix __maybe_unused)
 
 	symbol_conf.nr_events = top.evlist->nr_entries;
 
+	if (perf_evlist__create_maps(top.evlist, target) < 0)
+		usage_with_options(top_usage, options);
+
 	if (top.delay_secs < 1)
 		top.delay_secs = 1;
 
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 8d0b623..3c6115c 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1026,6 +1026,74 @@ int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages,
 	return perf_evlist__mmap_per_cpu(evlist, &mp);
 }
 
+static int cmp_ids(const void *a, const void *b)
+{
+	return *(int *)a - *(int *)b;
+}
+
+/*
+ * Check each evsel cpu map against the pmu cpumask and the input cpu list.
+ * Only available cpus are kept in evsel->cpus->map.
+ */
+static int perf_evlist__check_evsel_cpus(struct perf_evlist *evlist)
+{
+	const struct cpu_map *cpus = evlist->cpus;
+	const int ncpus = cpu_map__nr(evlist->cpus);
+	struct perf_evsel *evsel;
+	int i, j, cpu_nr, tmp;
+
+	/* ensure we process id in increasing order */
+	qsort(evlist->cpus->map, evlist->cpus->nr, sizeof(int), cmp_ids);
+
+	evlist__for_each(evlist, evsel) {
+		if (!evsel->cpus)
+			continue;
+
+		cpu_nr = 0;
+		j = 0;
+		for (i = 0; i < cpu_map__nr(evsel->cpus);)  {
+
+			if (j >= ncpus) {
+				evsel->cpus->map[i++] = -1;
+				continue;
+			}
+			for (; j < ncpus; j++) {
+				if (cpus->map[j] < evsel->cpus->map[i])
+					continue;
+				if (cpus->map[j] == evsel->cpus->map[i]) {
+					cpu_nr++;
+					j++;
+					i++;
+				} else
+					evsel->cpus->map[i++] = -1;
+				break;
+			}
+		}
+
+		if (cpu_nr == cpu_map__nr(evsel->cpus))
+			continue;
+		if (cpu_nr == 0) {
+			perror("failed to create CPUs map, please check cpumask");
+			return -1;
+		}
+
+		tmp = 0;
+		for (i = 0; i < cpu_nr; i++) {
+			if (evsel->cpus->map[i] == -1) {
+				while (evsel->cpus->map[tmp] == -1) {
+					tmp++;
+					BUG_ON(tmp >= cpu_map__nr(evsel->cpus));
+				}
+				evsel->cpus->map[i] = evsel->cpus->map[tmp];
+				evsel->cpus->map[tmp] = -1;
+			}
+			tmp++;
+		}
+		evsel->cpus->nr = cpu_nr;
+	}
+	return 0;
+}
+
 int perf_evlist__create_maps(struct perf_evlist *evlist, struct target *target)
 {
 	evlist->threads = thread_map__new_str(target->pid, target->tid,
@@ -1042,6 +1110,10 @@ int perf_evlist__create_maps(struct perf_evlist *evlist, struct target *target)
 	if (evlist->cpus == NULL)
 		goto out_delete_threads;
 
+	if (target->cpu_list &&
+	   (perf_evlist__check_evsel_cpus(evlist) < 0))
+		goto out_delete_threads;
+
 	return 0;
 
 out_delete_threads:
-- 
1.8.3.1



* [PATCH 3/5] perf,tools: change perf stat to use event's cpu map
  2015-03-03  8:54 [PATCH 1/5] perf,core: allow invalid context events to be part of sw/hw groups kan.liang
  2015-03-03  8:54 ` [PATCH 2/5] perf,tools: check and re-organize evsel cpu maps kan.liang
@ 2015-03-03  8:54 ` kan.liang
  2015-03-03  8:54 ` [PATCH 4/5] perf,tools: open/mmap event according to event's cpu map not evlist's kan.liang
  2015-03-03  8:54 ` [PATCH 5/5] perf/x86/intel/uncore: do not implicitly set uncore event cpu kan.liang
  3 siblings, 0 replies; 10+ messages in thread
From: kan.liang @ 2015-03-03  8:54 UTC (permalink / raw)
  To: a.p.zijlstra, acme, linux-kernel; +Cc: ak, Kan Liang

From: Kan Liang <kan.liang@intel.com>

With a cpu_list, perf stat should use the event's cpu map.

Different events may have different cpu maps, so the same index in two
cpu maps does not necessarily point to the same CPU id.
perf_evsel__get_cpumap_index is introduced to find the index of a given
CPU id in a given cpu map.
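
For illustration, a standalone example (not from the patch; the maps
are made up) of the mismatch the new helper resolves:

#include <stdio.h>

/* The same positional index names different cpus in different maps. */
static int cpumap_index(const int *map, int nr, int cpu)
{
	int i;

	for (i = 0; i < nr; i++)
		if (map[i] == cpu)
			return i;
	return -1;
}

int main(void)
{
	int evlist_cpus[] = { 0, 4, 5, 18 };
	int evsel_cpus[] = { 0, 18 };

	/* cpu 18 is index 1 in the event map, index 3 in the evlist map */
	printf("evsel idx %d -> evlist idx %d\n",
	       cpumap_index(evsel_cpus, 2, 18),
	       cpumap_index(evlist_cpus, 4, 18));
	return 0;
}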

Signed-off-by: Kan Liang <kan.liang@intel.com>
---
 tools/perf/builtin-stat.c |  5 +++--
 tools/perf/util/evsel.h   | 24 ++++++++++++++++++++++++
 2 files changed, 27 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index e598e4e..5ae8686 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -165,7 +165,7 @@ static inline void diff_timespec(struct timespec *r, struct timespec *a,
 
 static inline struct cpu_map *perf_evsel__cpus(struct perf_evsel *evsel)
 {
-	return (evsel->cpus && !target.cpu_list) ? evsel->cpus : evsel_list->cpus;
+	return evsel->cpus ? evsel->cpus : evsel_list->cpus;
 }
 
 static inline int perf_evsel__nr_cpus(struct perf_evsel *evsel)
@@ -302,7 +302,7 @@ static int create_perf_stat_counter(struct perf_evsel *evsel)
 	attr->inherit = !no_inherit;
 
 	if (target__has_cpu(&target))
-		return perf_evsel__open_per_cpu(evsel, perf_evsel__cpus(evsel));
+		return perf_evsel__open_per_cpu(evsel, evsel_list->cpus);
 
 	if (!target__has_task(&target) && perf_evsel__is_group_leader(evsel)) {
 		attr->disabled = 1;
@@ -1218,6 +1218,7 @@ static void print_aggr(char *prefix)
 			nr = 0;
 			for (cpu = 0; cpu < perf_evsel__nr_cpus(counter); cpu++) {
 				cpu2 = perf_evsel__cpus(counter)->map[cpu];
+				cpu2 = perf_evsel__get_cpumap_index(NULL, cpu2, evsel_list->cpus);
 				s2 = aggr_get_id(evsel_list->cpus, cpu2);
 				if (s2 != id)
 					continue;
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index dcf202a..a633320 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -8,6 +8,7 @@
 #include <linux/types.h>
 #include "xyarray.h"
 #include "symbol.h"
+#include "cpumap.h"
 
 struct perf_counts_values {
 	union {
@@ -359,4 +360,27 @@ static inline bool has_branch_callstack(struct perf_evsel *evsel)
 {
 	return evsel->attr.branch_sample_type & PERF_SAMPLE_BRANCH_CALL_STACK;
 }
+
+static inline int perf_evsel__get_cpumap_index(struct perf_evsel *evsel,
+					       int cpu,
+					       struct cpu_map *evlist_cpus)
+{
+	struct cpu_map *cpus;
+	int i;
+
+	if (evlist_cpus == NULL)
+		return -1;
+
+	if (evsel && evsel->cpus)
+		cpus = evsel->cpus;
+	else
+		cpus = evlist_cpus;
+
+	for (i = 0; i < cpus->nr; i++) {
+		if (cpu == cpus->map[i])
+			return i;
+	}
+	return -1;
+}
+
 #endif /* __PERF_EVSEL_H */
-- 
1.8.3.1



* [PATCH 4/5] perf,tools: open/mmap event according to event's cpu map not evlist's
  2015-03-03  8:54 [PATCH 1/5] perf,core: allow invalid context events to be part of sw/hw groups kan.liang
  2015-03-03  8:54 ` [PATCH 2/5] perf,tools: check and re-organize evsel cpu maps kan.liang
  2015-03-03  8:54 ` [PATCH 3/5] perf,tools: change perf stat to use event's cpu map kan.liang
@ 2015-03-03  8:54 ` kan.liang
  2015-03-03  8:54 ` [PATCH 5/5] perf/x86/intel/uncore: do not implicitly set uncore event cpu kan.liang
  3 siblings, 0 replies; 10+ messages in thread
From: kan.liang @ 2015-03-03  8:54 UTC (permalink / raw)
  To: a.p.zijlstra, acme, linux-kernel; +Cc: ak, Kan Liang

From: Kan Liang <kan.liang@intel.com>

The perf tool can open/mmap events per cpu, but the cpu list comes from
the evlist's cpu map, which means all events share the same cpu map.
However, some events, like uncore events, have their own cpu mask, so
the global cpu map does not work well with mixed pmu events.

With this patch, the event's cpu map is used to provide the available
cpu list for perf open; if it is unavailable, the evlist's map is used.
Since the cpu lists of different events can vary, we cannot rely on the
index to get the group leader's fd, so get_group_fd now uses the cpu id
to find the correct fd.
perf_evlist__mmap also needs to change, since the leader's fd has to be
mmaped before the members' fds, so evlist__for_each must be the
outermost loop.
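
The ordering constraint is visible from user space, too. A minimal
sketch (not from the patch; the helper name and page count are
illustrative) of why the leader must be mmaped before a member can
redirect its output:

#include <linux/perf_event.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>

/* Redirect member_fd's samples into leader_fd's ring buffer.  The
 * ioctl can only succeed once the leader's buffer exists, which is
 * why the patch makes evlist__for_each the outermost loop.
 */
static int redirect_to_leader(int leader_fd, int member_fd, int pages)
{
	size_t len = (size_t)(pages + 1) * sysconf(_SC_PAGESIZE);
	void *ring = mmap(NULL, len, PROT_READ | PROT_WRITE,
			  MAP_SHARED, leader_fd, 0);

	if (ring == MAP_FAILED)
		return -1;
	return ioctl(member_fd, PERF_EVENT_IOC_SET_OUTPUT, leader_fd);
}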

Signed-off-by: Kan Liang <kan.liang@intel.com>
---
 tools/perf/util/evlist.c | 110 ++++++++++++++++++++++++++++-------------------
 tools/perf/util/evsel.c  |  30 +++++++++----
 2 files changed, 87 insertions(+), 53 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 3c6115c..85c2ae0 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -792,51 +792,48 @@ static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
 	return 0;
 }
 
-static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
-				       struct mmap_params *mp, int cpu,
-				       int thread, int *output)
+static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist,
+				       struct perf_evsel *evsel,
+				       int idx, struct mmap_params *mp,
+				       int cpu, int thread, int *output)
 {
-	struct perf_evsel *evsel;
+	int fd;
 
-	evlist__for_each(evlist, evsel) {
-		int fd;
+	if (evsel->system_wide && thread)
+		return 0;
 
-		if (evsel->system_wide && thread)
-			continue;
+	fd = FD(evsel, cpu, thread);
 
-		fd = FD(evsel, cpu, thread);
+	if (*output == -1) {
+		*output = fd;
+		if (__perf_evlist__mmap(evlist, idx, mp, *output) < 0)
+			return -1;
+	} else {
+		if (ioctl(fd, PERF_EVENT_IOC_SET_OUTPUT, *output) != 0)
+			return -1;
 
-		if (*output == -1) {
-			*output = fd;
-			if (__perf_evlist__mmap(evlist, idx, mp, *output) < 0)
-				return -1;
-		} else {
-			if (ioctl(fd, PERF_EVENT_IOC_SET_OUTPUT, *output) != 0)
-				return -1;
+		perf_evlist__mmap_get(evlist, idx);
+	}
 
-			perf_evlist__mmap_get(evlist, idx);
-		}
+	/*
+	 * The system_wide flag causes a selected event to be opened
+	 * always without a pid.  Consequently it will never get a
+	 * POLLHUP, but it is used for tracking in combination with
+	 * other events, so it should not need to be polled anyway.
+	 * Therefore don't add it for polling.
+	 */
+	if (!evsel->system_wide &&
+	    __perf_evlist__add_pollfd(evlist, fd, idx) < 0) {
+		perf_evlist__mmap_put(evlist, idx);
+		return -1;
+	}
 
-		/*
-		 * The system_wide flag causes a selected event to be opened
-		 * always without a pid.  Consequently it will never get a
-		 * POLLHUP, but it is used for tracking in combination with
-		 * other events, so it should not need to be polled anyway.
-		 * Therefore don't add it for polling.
-		 */
-		if (!evsel->system_wide &&
-		    __perf_evlist__add_pollfd(evlist, fd, idx) < 0) {
-			perf_evlist__mmap_put(evlist, idx);
+	if (evsel->attr.read_format & PERF_FORMAT_ID) {
+		if (perf_evlist__id_add_fd(evlist, evsel, cpu, thread,
+					   fd) < 0)
 			return -1;
-		}
-
-		if (evsel->attr.read_format & PERF_FORMAT_ID) {
-			if (perf_evlist__id_add_fd(evlist, evsel, cpu, thread,
-						   fd) < 0)
-				return -1;
-			perf_evlist__set_sid_idx(evlist, evsel, idx, cpu,
-						 thread);
-		}
+		perf_evlist__set_sid_idx(evlist, evsel, idx, cpu,
+					 thread);
 	}
 
 	return 0;
@@ -848,23 +845,43 @@ static int perf_evlist__mmap_per_cpu(struct perf_evlist *evlist,
 	int cpu, thread;
 	int nr_cpus = cpu_map__nr(evlist->cpus);
 	int nr_threads = thread_map__nr(evlist->threads);
+	int *output = malloc(nr_cpus * sizeof(int));
+	struct cpu_map *map;
+	int evlist_cpu;
+	struct perf_evsel *evsel;
 
 	pr_debug2("perf event ring buffer mmapped per cpu\n");
-	for (cpu = 0; cpu < nr_cpus; cpu++) {
-		int output = -1;
 
-		for (thread = 0; thread < nr_threads; thread++) {
-			if (perf_evlist__mmap_per_evsel(evlist, cpu, mp, cpu,
-							thread, &output))
+	for (cpu = 0; cpu < nr_cpus; cpu++)
+		output[cpu] = -1;
+
+	evlist__for_each(evlist, evsel) {
+		if (evsel->cpus)
+			map = evsel->cpus;
+		else
+			map = evlist->cpus;
+
+		for (cpu = 0; cpu < map->nr; cpu++) {
+			evlist_cpu = perf_evsel__get_cpumap_index(NULL, map->map[cpu], evlist->cpus);
+			if (evlist_cpu < 0)
 				goto out_unmap;
+
+			for (thread = 0; thread < nr_threads; thread++) {
+				if (perf_evlist__mmap_per_evsel(evlist, evsel, evlist_cpu,
+								mp, cpu, thread,
+								&output[evlist_cpu]))
+					goto out_unmap;
+			}
 		}
 	}
 
+	free(output);
 	return 0;
 
 out_unmap:
 	for (cpu = 0; cpu < nr_cpus; cpu++)
 		__perf_evlist__munmap(evlist, cpu);
+	free(output);
 	return -1;
 }
 
@@ -873,14 +890,17 @@ static int perf_evlist__mmap_per_thread(struct perf_evlist *evlist,
 {
 	int thread;
 	int nr_threads = thread_map__nr(evlist->threads);
+	struct perf_evsel *evsel;
 
 	pr_debug2("perf event ring buffer mmapped per thread\n");
 	for (thread = 0; thread < nr_threads; thread++) {
 		int output = -1;
 
-		if (perf_evlist__mmap_per_evsel(evlist, thread, mp, 0, thread,
-						&output))
-			goto out_unmap;
+		evlist__for_each(evlist, evsel) {
+			if (perf_evlist__mmap_per_evsel(evlist, evsel, thread,
+							mp, 0, thread, &output))
+				goto out_unmap;
+		}
 	}
 
 	return 0;
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index bb4eff2..6077a83 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -981,10 +981,11 @@ int __perf_evsel__read_on_cpu(struct perf_evsel *evsel,
 	return 0;
 }
 
-static int get_group_fd(struct perf_evsel *evsel, int cpu, int thread)
+static int get_group_fd(struct perf_evsel *evsel, int cpu,
+			int thread, struct cpu_map *cpus)
 {
 	struct perf_evsel *leader = evsel->leader;
-	int fd;
+	int fd, leader_cpu;
 
 	if (perf_evsel__is_group_leader(evsel))
 		return -1;
@@ -995,9 +996,16 @@ static int get_group_fd(struct perf_evsel *evsel, int cpu, int thread)
 	 */
 	BUG_ON(!leader->fd);
 
-	fd = FD(leader, cpu, thread);
+	if (cpu < 0)
+		fd = FD(leader, 0, thread);
+	else {
+		leader_cpu = perf_evsel__get_cpumap_index(leader, cpu, cpus);
+		if (leader_cpu >= 0)
+			fd = FD(leader, leader_cpu, thread);
+		else
+			return -1;
+	}
 	BUG_ON(fd == -1);
-
 	return fd;
 }
 
@@ -1068,6 +1076,7 @@ static int __perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
 	int cpu, thread, nthreads;
 	unsigned long flags = PERF_FLAG_FD_CLOEXEC;
 	int pid = -1, err;
+	struct cpu_map *cpumap;
 	enum { NO_CHANGE, SET_TO_MAX, INCREASED_MAX } set_rlimit = NO_CHANGE;
 
 	if (evsel->system_wide)
@@ -1084,6 +1093,11 @@ static int __perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
 		pid = evsel->cgrp->fd;
 	}
 
+	if (evsel->cpus)
+		cpumap = evsel->cpus;
+	else
+		cpumap = cpus;
+
 fallback_missing_features:
 	if (perf_missing_features.cloexec)
 		flags &= ~(unsigned long)PERF_FLAG_FD_CLOEXEC;
@@ -1098,7 +1112,7 @@ retry_sample_id:
 	if (verbose >= 2)
 		perf_event_attr__fprintf(&evsel->attr, stderr);
 
-	for (cpu = 0; cpu < cpus->nr; cpu++) {
+	for (cpu = 0; cpu < cpumap->nr; cpu++) {
 
 		for (thread = 0; thread < nthreads; thread++) {
 			int group_fd;
@@ -1106,14 +1120,14 @@ retry_sample_id:
 			if (!evsel->cgrp && !evsel->system_wide)
 				pid = threads->map[thread];
 
-			group_fd = get_group_fd(evsel, cpu, thread);
+			group_fd = get_group_fd(evsel, cpumap->map[cpu], thread, cpus);
 retry_open:
 			pr_debug2("sys_perf_event_open: pid %d  cpu %d  group_fd %d  flags %#lx\n",
-				  pid, cpus->map[cpu], group_fd, flags);
+				  pid, cpumap->map[cpu], group_fd, flags);
 
 			FD(evsel, cpu, thread) = sys_perf_event_open(&evsel->attr,
 								     pid,
-								     cpus->map[cpu],
+								     cpumap->map[cpu],
 								     group_fd, flags);
 			if (FD(evsel, cpu, thread) < 0) {
 				err = -errno;
-- 
1.8.3.1



* [PATCH 5/5] perf/x86/intel/uncore: do not implicitly set uncore event cpu
  2015-03-03  8:54 [PATCH 1/5] perf,core: allow invalid context events to be part of sw/hw groups kan.liang
                   ` (2 preceding siblings ...)
  2015-03-03  8:54 ` [PATCH 4/5] perf,tools: open/mmap event according to event's cpu map not evlist's kan.liang
@ 2015-03-03  8:54 ` kan.liang
  3 siblings, 0 replies; 10+ messages in thread
From: kan.liang @ 2015-03-03  8:54 UTC (permalink / raw)
  To: a.p.zijlstra, acme, linux-kernel; +Cc: ak, Kan Liang

From: Kan Liang <kan.liang@intel.com>

A cpumask is exposed in the uncore pmu sysfs directory. The user should
set the cpu according to that cpumask; the kernel should not implicitly
change event->cpu.
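
As an illustration, the mask can be read from sysfs. A minimal sketch,
assuming the conventional sysfs path for an Intel uncore IMC pmu:

#include <stdio.h>

/* Read the cpumask the uncore pmu exports, e.g. "0,18\n". */
static int read_uncore_cpumask(char *buf, size_t len)
{
	FILE *f = fopen("/sys/bus/event_source/devices/uncore_imc_0/cpumask", "r");

	if (!f)
		return -1;
	if (!fgets(buf, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	return 0;
}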

Signed-off-by: Kan Liang <kan.liang@intel.com>
---
 arch/x86/kernel/cpu/perf_event_intel_uncore.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_uncore.c b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
index c635b8b..cd80731 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
@@ -621,9 +621,8 @@ static int uncore_pmu_event_init(struct perf_event *event)
 	if (event->cpu < 0)
 		return -EINVAL;
 	box = uncore_pmu_to_box(pmu, event->cpu);
-	if (!box || box->cpu < 0)
+	if (!box || box->cpu < 0 || (box->cpu != event->cpu))
 		return -EINVAL;
-	event->cpu = box->cpu;
 
 	event->hw.idx = -1;
 	event->hw.last_tag = ~0ULL;
-- 
1.8.3.1



* Re: [PATCH 2/5] perf,tools: check and re-organize evsel cpu maps
  2015-03-03  8:54 ` [PATCH 2/5] perf,tools: check and re-organize evsel cpu maps kan.liang
@ 2015-03-03 16:09   ` Arnaldo Carvalho de Melo
  2015-03-03 16:11     ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 10+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-03-03 16:09 UTC (permalink / raw)
  To: kan.liang; +Cc: a.p.zijlstra, linux-kernel, ak, Stephane Eranian

Em Tue, Mar 03, 2015 at 03:54:43AM -0500, kan.liang@intel.com escreveu:
> From: Kan Liang <kan.liang@intel.com>
> 
> With the patch 1/5, it's possible to group read events from different
> pmus. "-C" can be used to set cpu list. The cpu list may be incompatible
> with pmu's cpumask.
> This patch checks the event's cpu maps, and discard the incompatible cpu
> maps.
> event's cpu maps is saved in evsel->cpus during option parse. Then the
> evlist's cpu maps is created in perf_evlist__create_maps. So the cpu
> maps can be check and re-organized in perf_evlist__create_maps.
> Only cpu_list need to check the cpu maps.

Humm, I had something done in this area...

Stephane complained about the confusion about which cpumap to use with
pmus, so I wrote a patch and sent an RFC, which I think got no
comments, lemme dig it...

- Arnaldo
 
> Signed-off-by: Kan Liang <kan.liang@intel.com>
> ---
>  tools/perf/builtin-top.c |  6 ++--
>  tools/perf/util/evlist.c | 72 ++++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 75 insertions(+), 3 deletions(-)
> 
> diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
> index 5fb8723..f40d1d6 100644
> --- a/tools/perf/builtin-top.c
> +++ b/tools/perf/builtin-top.c
> @@ -1218,9 +1218,6 @@ int cmd_top(int argc, const char **argv, const char *prefix __maybe_unused)
>  	if (target__none(target))
>  		target->system_wide = true;
>  
> -	if (perf_evlist__create_maps(top.evlist, target) < 0)
> -		usage_with_options(top_usage, options);
> -
>  	if (!top.evlist->nr_entries &&
>  	    perf_evlist__add_default(top.evlist) < 0) {
>  		ui__error("Not enough memory for event selector list\n");
> @@ -1229,6 +1226,9 @@ int cmd_top(int argc, const char **argv, const char *prefix __maybe_unused)
>  
>  	symbol_conf.nr_events = top.evlist->nr_entries;
>  
> +	if (perf_evlist__create_maps(top.evlist, target) < 0)
> +		usage_with_options(top_usage, options);
> +
>  	if (top.delay_secs < 1)
>  		top.delay_secs = 1;
>  
> diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
> index 8d0b623..3c6115c 100644
> --- a/tools/perf/util/evlist.c
> +++ b/tools/perf/util/evlist.c
> @@ -1026,6 +1026,74 @@ int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages,
>  	return perf_evlist__mmap_per_cpu(evlist, &mp);
>  }
>  
> +static int cmp_ids(const void *a, const void *b)
> +{
> +	return *(int *)a - *(int *)b;
> +}
> +
> +/*
> + * Check evsel cpu map according to pmu cpumask and input
> + * Only available cpu can be stored in evsel->cpus->map.
> + */
> +static int perf_evlist__check_evsel_cpus(struct perf_evlist *evlist)
> +{
> +	const struct cpu_map *cpus = evlist->cpus;
> +	const int ncpus = cpu_map__nr(evlist->cpus);
> +	struct perf_evsel *evsel;
> +	int i, j, cpu_nr, tmp;
> +
> +	/* ensure we process id in increasing order */
> +	qsort(evlist->cpus->map, evlist->cpus->nr, sizeof(int), cmp_ids);
> +
> +	evlist__for_each(evlist, evsel) {
> +		if (!evsel->cpus)
> +			continue;
> +
> +		cpu_nr = 0;
> +		j = 0;
> +		for (i = 0; i < cpu_map__nr(evsel->cpus);)  {
> +
> +			if (j >= ncpus) {
> +				evsel->cpus->map[i++] = -1;
> +				continue;
> +			}
> +			for (; j < ncpus; j++) {
> +				if (cpus->map[j] < evsel->cpus->map[i])
> +					continue;
> +				if (cpus->map[j] == evsel->cpus->map[i]) {
> +					cpu_nr++;
> +					j++;
> +					i++;
> +				} else
> +					evsel->cpus->map[i++] = -1;
> +				break;
> +			}
> +		}
> +
> +		if (cpu_nr == cpu_map__nr(evsel->cpus))
> +			continue;
> +		if (cpu_nr == 0) {
> +			perror("failed to create CPUs map, please check cpumask");
> +			return -1;
> +		}
> +
> +		tmp = 0;
> +		for (i = 0; i < cpu_nr; i++) {
> +			if (evsel->cpus->map[i] == -1) {
> +				while (evsel->cpus->map[tmp] == -1) {
> +					tmp++;
> +					BUG_ON(tmp >= cpu_map__nr(evsel->cpus));
> +				}
> +				evsel->cpus->map[i] = evsel->cpus->map[tmp];
> +				evsel->cpus->map[tmp] = -1;
> +			}
> +			tmp++;
> +		}
> +		evsel->cpus->nr = cpu_nr;
> +	}
> +	return 0;
> +}
> +
>  int perf_evlist__create_maps(struct perf_evlist *evlist, struct target *target)
>  {
>  	evlist->threads = thread_map__new_str(target->pid, target->tid,
> @@ -1042,6 +1110,10 @@ int perf_evlist__create_maps(struct perf_evlist *evlist, struct target *target)
>  	if (evlist->cpus == NULL)
>  		goto out_delete_threads;
>  
> +	if (target->cpu_list &&
> +	   (perf_evlist__check_evsel_cpus(evlist) < 0))
> +		goto out_delete_threads;
> +
>  	return 0;
>  
>  out_delete_threads:
> -- 
> 1.8.3.1


* Re: [PATCH 2/5] perf,tools: check and re-organize evsel cpu maps
  2015-03-03 16:09   ` Arnaldo Carvalho de Melo
@ 2015-03-03 16:11     ` Arnaldo Carvalho de Melo
  2015-03-03 17:09       ` Liang, Kan
  0 siblings, 1 reply; 10+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-03-03 16:11 UTC (permalink / raw)
  To: kan.liang; +Cc: a.p.zijlstra, linux-kernel, ak, Stephane Eranian

Em Tue, Mar 03, 2015 at 01:09:29PM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Tue, Mar 03, 2015 at 03:54:43AM -0500, kan.liang@intel.com escreveu:
> > From: Kan Liang <kan.liang@intel.com>
> > 
> > With the patch 1/5, it's possible to group read events from different
> > pmus. "-C" can be used to set cpu list. The cpu list may be incompatible
> > with pmu's cpumask.
> > This patch checks the event's cpu maps, and discard the incompatible cpu
> > maps.
> > event's cpu maps is saved in evsel->cpus during option parse. Then the
> > evlist's cpu maps is created in perf_evlist__create_maps. So the cpu
> > maps can be check and re-organized in perf_evlist__create_maps.
> > Only cpu_list need to check the cpu maps.
> 
> Humm, I had something done in this area...
> 
> Stephane complained about the confusion about which cpumap to use with
> pmus, so I wrote a patch and sent an RFC, which I think I got no
> comments, lemme dig it...

Here it is, can you take a look? Stephane?

- Arnaldo

commit 9ecdd9b9bf0a7fd5645957ba4e6a98b6ee526109
Author: Arnaldo Carvalho de Melo <acme@redhat.com>
Date:   Mon Jan 26 13:43:42 2015 -0300

    perf evsel: Set evsel->cpus to the evlist->cpus when not constrained
    
    So that we don't need to know about the evlist all the time and can cope
    with cases where evsel->cpus were set because it was for an event on a
    PMU with a cpumask.
    
    Reported-by: Stephane Eranian <eranian@google.com>
    Cc: Adrian Hunter <adrian.hunter@intel.com>
    Cc: Andi Kleen <ak@linux.intel.com>
    Cc: Borislav Petkov <bp@suse.de>
    Cc: David Ahern <dsahern@gmail.com>
    Cc: Don Zickus <dzickus@redhat.com>
    Cc: Frederic Weisbecker <fweisbec@gmail.com>
    Cc: Jiri Olsa <jolsa@redhat.com>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Cc: Yan, Zheng <zheng.z.yan@intel.com>
    echo Link: http://lkml.kernel.org/n/tip-`ranpwd -l 24`@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index e598e4e98170..ddf41bede0b8 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -163,16 +163,6 @@ static inline void diff_timespec(struct timespec *r, struct timespec *a,
 	}
 }
 
-static inline struct cpu_map *perf_evsel__cpus(struct perf_evsel *evsel)
-{
-	return (evsel->cpus && !target.cpu_list) ? evsel->cpus : evsel_list->cpus;
-}
-
-static inline int perf_evsel__nr_cpus(struct perf_evsel *evsel)
-{
-	return perf_evsel__cpus(evsel)->nr;
-}
-
 static void perf_evsel__reset_stat_priv(struct perf_evsel *evsel)
 {
 	int i;
@@ -202,7 +192,7 @@ static int perf_evsel__alloc_prev_raw_counts(struct perf_evsel *evsel)
 	size_t sz;
 
 	sz = sizeof(*evsel->counts) +
-	     (perf_evsel__nr_cpus(evsel) * sizeof(struct perf_counts_values));
+	     (cpu_map__nr(evsel->cpus) * sizeof(struct perf_counts_values));
 
 	addr = zalloc(sz);
 	if (!addr)
@@ -235,7 +225,7 @@ static int perf_evlist__alloc_stats(struct perf_evlist *evlist, bool alloc_raw)
 
 	evlist__for_each(evlist, evsel) {
 		if (perf_evsel__alloc_stat_priv(evsel) < 0 ||
-		    perf_evsel__alloc_counts(evsel, perf_evsel__nr_cpus(evsel)) < 0 ||
+		    perf_evsel__alloc_counts(evsel, cpu_map__nr(evsel->cpus)) < 0 ||
 		    (alloc_raw && perf_evsel__alloc_prev_raw_counts(evsel) < 0))
 			goto out_free;
 	}
@@ -269,7 +259,7 @@ static void perf_stat__reset_stats(struct perf_evlist *evlist)
 
 	evlist__for_each(evlist, evsel) {
 		perf_evsel__reset_stat_priv(evsel);
-		perf_evsel__reset_counts(evsel, perf_evsel__nr_cpus(evsel));
+		perf_evsel__reset_counts(evsel, cpu_map__nr(evsel->cpus));
 	}
 
 	memset(runtime_nsecs_stats, 0, sizeof(runtime_nsecs_stats));
@@ -302,7 +292,7 @@ static int create_perf_stat_counter(struct perf_evsel *evsel)
 	attr->inherit = !no_inherit;
 
 	if (target__has_cpu(&target))
-		return perf_evsel__open_per_cpu(evsel, perf_evsel__cpus(evsel));
+		return perf_evsel__open_per_cpu(evsel, evsel->cpus);
 
 	if (!target__has_task(&target) && perf_evsel__is_group_leader(evsel)) {
 		attr->disabled = 1;
@@ -397,7 +387,7 @@ static void zero_per_pkg(struct perf_evsel *counter)
 static int check_per_pkg(struct perf_evsel *counter, int cpu, bool *skip)
 {
 	unsigned long *mask = counter->per_pkg_mask;
-	struct cpu_map *cpus = perf_evsel__cpus(counter);
+	struct cpu_map *cpus = counter->cpus;
 	int s;
 
 	*skip = false;
@@ -507,7 +497,7 @@ static int read_counter_aggr(struct perf_evsel *counter)
 static int read_counter(struct perf_evsel *counter)
 {
 	int nthreads = thread_map__nr(evsel_list->threads);
-	int ncpus = perf_evsel__nr_cpus(counter);
+	int ncpus = cpu_map__nr(counter->cpus);
 	int cpu, thread;
 
 	if (counter->system_wide)
@@ -727,13 +717,13 @@ static int __run_perf_stat(int argc, const char **argv)
 	if (aggr_mode == AGGR_GLOBAL) {
 		evlist__for_each(evsel_list, counter) {
 			read_counter_aggr(counter);
-			perf_evsel__close_fd(counter, perf_evsel__nr_cpus(counter),
+			perf_evsel__close_fd(counter, cpu_map__nr(counter->cpus),
 					     thread_map__nr(evsel_list->threads));
 		}
 	} else {
 		evlist__for_each(evsel_list, counter) {
 			read_counter(counter);
-			perf_evsel__close_fd(counter, perf_evsel__nr_cpus(counter), 1);
+			perf_evsel__close_fd(counter, cpu_map__nr(counter->cpus), 1);
 		}
 	}
 
@@ -812,7 +802,7 @@ static void aggr_printout(struct perf_evsel *evsel, int id, int nr)
 	case AGGR_NONE:
 		fprintf(output, "CPU%*d%s",
 			csv_output ? 0 : -4,
-			perf_evsel__cpus(evsel)->map[id], csv_sep);
+			evsel->cpus->map[id], csv_sep);
 		break;
 	case AGGR_GLOBAL:
 	default:
@@ -1216,8 +1206,8 @@ static void print_aggr(char *prefix)
 		evlist__for_each(evsel_list, counter) {
 			val = ena = run = 0;
 			nr = 0;
-			for (cpu = 0; cpu < perf_evsel__nr_cpus(counter); cpu++) {
-				cpu2 = perf_evsel__cpus(counter)->map[cpu];
+			for (cpu = 0; cpu < cpu_map__nr(counter->cpus); cpu++) {
+				cpu2 = counter->cpus->map[cpu];
 				s2 = aggr_get_id(evsel_list->cpus, cpu2);
 				if (s2 != id)
 					continue;
@@ -1339,7 +1329,7 @@ static void print_counter(struct perf_evsel *counter, char *prefix)
 	double uval;
 	int cpu;
 
-	for (cpu = 0; cpu < perf_evsel__nr_cpus(counter); cpu++) {
+	for (cpu = 0; cpu < cpu_map__nr(counter->cpus); cpu++) {
 		val = counter->counts->cpu[cpu].val;
 		ena = counter->counts->cpu[cpu].ena;
 		run = counter->counts->cpu[cpu].run;
@@ -1350,7 +1340,7 @@ static void print_counter(struct perf_evsel *counter, char *prefix)
 		if (run == 0 || ena == 0) {
 			fprintf(output, "CPU%*d%s%*s%s",
 				csv_output ? 0 : -4,
-				perf_evsel__cpus(counter)->map[cpu], csv_sep,
+				counter->cpus->map[cpu], csv_sep,
 				csv_output ? 0 : 18,
 				counter->supported ? CNTR_NOT_COUNTED : CNTR_NOT_SUPPORTED,
 				csv_sep);
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 28b8ce86bf12..202d1e9842e7 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -111,13 +111,45 @@ void perf_evlist__exit(struct perf_evlist *evlist)
 	fdarray__exit(&evlist->pollfd);
 }
 
+static void perf_evlist__reset_cpus(struct perf_evlist *evlist)
+{
+	struct perf_evsel *evsel;
+
+	evlist__for_each(evlist, evsel) {
+		if (evsel->cpus == evlist->cpus)
+			evsel->cpus = NULL;
+	}
+
+	evlist->cpus = NULL;
+}
+
+static void perf_evlist__set_cpus(struct perf_evlist *evlist, struct cpu_map *cpus)
+{
+	struct perf_evsel *evsel;
+
+	if (evlist->cpus != NULL)
+		perf_evlist__reset_cpus(evlist);
+	/*
+	 * If, when parsing events, the evsel->cpus wasn't constrained to a
+	 * cpulist, say, because it is on a PMU that has a cpumask, then set it
+	 * to the evlist cpu_map, so that we can access evsel->cpus and get the
+	 * cpu_map this evsel works with.
+	 */
+	evlist__for_each(evlist, evsel) {
+		if (evsel->cpus == NULL)
+			evsel->cpus = cpus;
+	}
+
+	evlist->cpus = cpus;
+}
+
 void perf_evlist__delete(struct perf_evlist *evlist)
 {
 	perf_evlist__munmap(evlist);
 	perf_evlist__close(evlist);
 	cpu_map__delete(evlist->cpus);
+	perf_evlist__reset_cpus(evlist);
 	thread_map__delete(evlist->threads);
-	evlist->cpus = NULL;
 	evlist->threads = NULL;
 	perf_evlist__purge(evlist);
 	perf_evlist__exit(evlist);
@@ -129,6 +161,14 @@ void perf_evlist__add(struct perf_evlist *evlist, struct perf_evsel *entry)
 	list_add_tail(&entry->node, &evlist->entries);
 	entry->idx = evlist->nr_entries;
 	entry->tracking = !entry->idx;
+	/*
+	 * If, when parsing events, the evsel->cpus wasn't constrained to a
+	 * cpulist, say, because it is on a PMU that has a cpumask, then set it
+	 * to the evlist cpu_map, so that we can access evsel->cpus and get the
+	 * cpu_map this evsel works with.
+	 */
+	if (entry->cpus == NULL)
+		entry->cpus = evlist->cpus;
 
 	if (!evlist->nr_entries++)
 		perf_evlist__set_id_pos(evlist);
@@ -1029,6 +1069,8 @@ int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages,
 
 int perf_evlist__create_maps(struct perf_evlist *evlist, struct target *target)
 {
+	struct cpu_map *cpus;
+
 	evlist->threads = thread_map__new_str(target->pid, target->tid,
 					      target->uid);
 
@@ -1036,13 +1078,15 @@ int perf_evlist__create_maps(struct perf_evlist *evlist, struct target *target)
 		return -1;
 
 	if (target__uses_dummy_map(target))
-		evlist->cpus = cpu_map__dummy_new();
+		cpus = cpu_map__dummy_new();
 	else
-		evlist->cpus = cpu_map__new(target->cpu_list);
+		cpus = cpu_map__new(target->cpu_list);
 
-	if (evlist->cpus == NULL)
+	if (cpus == NULL)
 		goto out_delete_threads;
 
+	perf_evlist__set_cpus(evlist, cpus);
+
 	return 0;
 
 out_delete_threads:
@@ -1222,6 +1266,7 @@ void perf_evlist__close(struct perf_evlist *evlist)
 
 static int perf_evlist__create_syswide_maps(struct perf_evlist *evlist)
 {
+	struct cpu_map *cpus;
 	int err = -ENOMEM;
 
 	/*
@@ -1233,20 +1278,20 @@ static int perf_evlist__create_syswide_maps(struct perf_evlist *evlist)
 	 * error, and we may not want to do that fallback to a
 	 * default cpu identity map :-\
 	 */
-	evlist->cpus = cpu_map__new(NULL);
-	if (evlist->cpus == NULL)
+	cpus = cpu_map__new(NULL);
+	if (cpus == NULL)
 		goto out;
 
 	evlist->threads = thread_map__new_dummy();
 	if (evlist->threads == NULL)
 		goto out_free_cpus;
 
+	perf_evlist__set_cpus(evlist, cpus);
 	err = 0;
 out:
 	return err;
 out_free_cpus:
-	cpu_map__delete(evlist->cpus);
-	evlist->cpus = NULL;
+	cpu_map__delete(cpus);
 	goto out;
 }
 


* RE: [PATCH 2/5] perf,tools: check and re-organize evsel cpu maps
  2015-03-03 16:11     ` Arnaldo Carvalho de Melo
@ 2015-03-03 17:09       ` Liang, Kan
  2015-03-04  0:15         ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 10+ messages in thread
From: Liang, Kan @ 2015-03-03 17:09 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: a.p.zijlstra, linux-kernel, ak, Stephane Eranian



> Em Tue, Mar 03, 2015 at 01:09:29PM -0300, Arnaldo Carvalho de Melo
> escreveu:
> > Em Tue, Mar 03, 2015 at 03:54:43AM -0500, kan.liang@intel.com escreveu:
> > > From: Kan Liang <kan.liang@intel.com>
> > >
> > > With the patch 1/5, it's possible to group read events from
> > > different pmus. "-C" can be used to set cpu list. The cpu list may
> > > be incompatible with pmu's cpumask.
> > > This patch checks the event's cpu maps, and discard the incompatible
> > > cpu maps.
> > > event's cpu maps is saved in evsel->cpus during option parse. Then
> > > the evlist's cpu maps is created in perf_evlist__create_maps. So the
> > > cpu maps can be check and re-organized in perf_evlist__create_maps.
> > > Only cpu_list need to check the cpu maps.
> >
> > Humm, I had something done in this area...
> >
> > Stephane complained about the confusion about which cpumap to use
> with
> > pmus, so I wrote a patch and sent an RFC, which I think I got no
> > comments, lemme dig it...
> 
> Here it is, can you take a look? Stephane?
> 

Your patch is more like my 3/5 patch. The difference is that your patch
forces evsel->cpus = evlist->cpus if evsel->cpus == NULL, while my
patch handles the evsel->cpus == NULL case at the point of use.

> @@ -1216,8 +1206,8 @@ static void print_aggr(char *prefix)
>  		evlist__for_each(evsel_list, counter) {
>  			val = ena = run = 0;
>  			nr = 0;
> -			for (cpu = 0; cpu < perf_evsel__nr_cpus(counter);
> cpu++) {
> -				cpu2 = perf_evsel__cpus(counter)-
> >map[cpu];
> +			for (cpu = 0; cpu < cpu_map__nr(counter->cpus);
> cpu++) {
> +				cpu2 = counter->cpus->map[cpu];
>  				s2 = aggr_get_id(evsel_list->cpus, cpu2);
>  				if (s2 != id)
>  					continue;

print_aggr also needs special handling. In the past, all events used
the evlist's cpu map, so it used the index to find the real cpu id.
Now the events' cpu maps can differ, so s2 could be wrong.
For example, the evlist's cpu map is 0,4,5,18 and an event's cpu map is
0,18. When cpu == 1, the return of aggr_get_id must be wrong, since it
still uses the index to find s2.
My 3/5 patch introduces perf_evsel__get_cpumap_index to handle this.

Your patch alone is not enough; we still need 2/5 and 4/5.
2/5 checks whether the event's cpu maps are compatible with the
evlist's cpu map. For example, if the evlist's cpu map is 1,2,17 and an
event's cpu map is 0,18, we can error out earlier.
4/5 specially handles open and mmap, where we need to do the same thing
as in print_aggr.

Thanks,
Kan

 


* Re: [PATCH 2/5] perf,tools: check and re-organize evsel cpu maps
  2015-03-03 17:09       ` Liang, Kan
@ 2015-03-04  0:15         ` Arnaldo Carvalho de Melo
  2015-03-18 12:31           ` Liang, Kan
  0 siblings, 1 reply; 10+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-03-04  0:15 UTC (permalink / raw)
  To: Liang, Kan; +Cc: a.p.zijlstra, linux-kernel, ak, Stephane Eranian

Em Tue, Mar 03, 2015 at 05:09:29PM +0000, Liang, Kan escreveu:
> 
> 
> > Em Tue, Mar 03, 2015 at 01:09:29PM -0300, Arnaldo Carvalho de Melo
> > escreveu:
> > > Em Tue, Mar 03, 2015 at 03:54:43AM -0500, kan.liang@intel.com escreveu:
> > > > From: Kan Liang <kan.liang@intel.com>
> > > >
> > > > With the patch 1/5, it's possible to group read events from
> > > > different pmus. "-C" can be used to set cpu list. The cpu list may
> > > > be incompatible with pmu's cpumask.
> > > > This patch checks the event's cpu maps, and discard the incompatible
> > > > cpu maps.
> > > > event's cpu maps is saved in evsel->cpus during option parse. Then
> > > > the evlist's cpu maps is created in perf_evlist__create_maps. So the
> > > > cpu maps can be check and re-organized in perf_evlist__create_maps.
> > > > Only cpu_list need to check the cpu maps.
> > >
> > > Humm, I had something done in this area...
> > >
> > > Stephane complained about the confusion about which cpumap to use
> > with
> > > pmus, so I wrote a patch and sent an RFC, which I think I got no
> > > comments, lemme dig it...
> > 
> > Here it is, can you take a look? Stephane?
> > 
> 
> Your patch is more like my 3/5 patch. The difference is your patch force
> the evsel->cpus = evlist->cpus, if evsel->cpus == NULL. 
> My patch handle the evsel->cpus == NULL case when using it.

The idea is to always use evsel->cpus, not having to special-case it
and fall back to evlist->cpus, so that we don't have to pass the evlist
around that often.
 
> > @@ -1216,8 +1206,8 @@ static void print_aggr(char *prefix)
> >  		evlist__for_each(evsel_list, counter) {
> >  			val = ena = run = 0;
> >  			nr = 0;
> > -			for (cpu = 0; cpu < perf_evsel__nr_cpus(counter);
> > cpu++) {
> > -				cpu2 = perf_evsel__cpus(counter)-
> > >map[cpu];
> > +			for (cpu = 0; cpu < cpu_map__nr(counter->cpus);
> > cpu++) {
> > +				cpu2 = counter->cpus->map[cpu];
> >  				s2 = aggr_get_id(evsel_list->cpus, cpu2);
> >  				if (s2 != id)
> >  					continue;
> 
> print_aggr also need to be special handled. In the past, all events use
> evlist's cpu map,so it uses index to find the real cpu id.
> Now, event's cpu map are different. The s2 could be wrong.
> For example, evlist's cpu map is 0,4,5,18. Event's cpu map could be 0,18.
> When cpu == 1, the return of aggr_get_id must be wrong, since it
> still use index to find s2.
> My 3/5 patch introduce a function perf_evsel__get_cpumap_index
> to handle it.
> 
> Only your patch is not enough, we still need 2/5 and 4/5.
> 2/5 is used to check if the event's cpu maps are compatible as evlist's
> cpu map. For example, evlist's cpu map is 1,2,17. Event's cpu map
> could be 0,18. We can error out earlier.
> 4/5 is used to special handle the open and mmap. We need to do
> the same thing as what we did in print_aggr.

I'll try to go thru this tomorrow, thanks for checking.

- Arnaldo


* RE: [PATCH 2/5] perf,tools: check and re-organize evsel cpu maps
  2015-03-04  0:15         ` Arnaldo Carvalho de Melo
@ 2015-03-18 12:31           ` Liang, Kan
  0 siblings, 0 replies; 10+ messages in thread
From: Liang, Kan @ 2015-03-18 12:31 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: a.p.zijlstra, linux-kernel, ak, Stephane Eranian



> 
> Em Tue, Mar 03, 2015 at 05:09:29PM +0000, Liang, Kan escreveu:
> >
> >
> > > Em Tue, Mar 03, 2015 at 01:09:29PM -0300, Arnaldo Carvalho de Melo
> > > escreveu:
> > > > Em Tue, Mar 03, 2015 at 03:54:43AM -0500, kan.liang@intel.com
> escreveu:
> > > > > From: Kan Liang <kan.liang@intel.com>
> > > > >
> > > > > With the patch 1/5, it's possible to group read events from
> > > > > different pmus. "-C" can be used to set cpu list. The cpu list
> > > > > may be incompatible with pmu's cpumask.
> > > > > This patch checks the event's cpu maps, and discard the
> > > > > incompatible cpu maps.
> > > > > event's cpu maps is saved in evsel->cpus during option parse.
> > > > > Then the evlist's cpu maps is created in
> > > > > perf_evlist__create_maps. So the cpu maps can be check and re-
> organized in perf_evlist__create_maps.
> > > > > Only cpu_list need to check the cpu maps.
> > > >
> > > > Humm, I had something done in this area...
> > > >
> > > > Stephane complained about the confusion about which cpumap to
> use
> > > with
> > > > pmus, so I wrote a patch and sent an RFC, which I think I got no
> > > > comments, lemme dig it...
> > >
> > > Here it is, can you take a look? Stephane?
> > >
> >
> > Your patch is more like my 3/5 patch. The difference is your patch
> > force the evsel->cpus = evlist->cpus, if evsel->cpus == NULL.
> > My patch handle the evsel->cpus == NULL case when using it.
> 
> Idea is to use evsel->cpus always, not having to special case it and fallback
> to evlist->cpus, so that we don't have to pass evlist around that often.
> 
> > > @@ -1216,8 +1206,8 @@ static void print_aggr(char *prefix)
> > >  		evlist__for_each(evsel_list, counter) {
> > >  			val = ena = run = 0;
> > >  			nr = 0;
> > > -			for (cpu = 0; cpu < perf_evsel__nr_cpus(counter);
> > > cpu++) {
> > > -				cpu2 = perf_evsel__cpus(counter)-
> > > >map[cpu];
> > > +			for (cpu = 0; cpu < cpu_map__nr(counter->cpus);
> > > cpu++) {
> > > +				cpu2 = counter->cpus->map[cpu];
> > >  				s2 = aggr_get_id(evsel_list->cpus, cpu2);
> > >  				if (s2 != id)
> > >  					continue;
> >
> > print_aggr also need to be special handled. In the past, all events
> > use evlist's cpu map,so it uses index to find the real cpu id.
> > Now, event's cpu map are different. The s2 could be wrong.
> > For example, evlist's cpu map is 0,4,5,18. Event's cpu map could be 0,18.
> > When cpu == 1, the return of aggr_get_id must be wrong, since it still
> > use index to find s2.
> > My 3/5 patch introduce a function perf_evsel__get_cpumap_index to
> > handle it.
> >
> > Only your patch is not enough, we still need 2/5 and 4/5.
> > 2/5 is used to check if the event's cpu maps are compatible as
> > evlist's cpu map. For example, evlist's cpu map is 1,2,17. Event's cpu
> > map could be 0,18. We can error out earlier.
> > 4/5 is used to special handle the open and mmap. We need to do the
> > same thing as what we did in print_aggr.
> 
> I'll try to go thru this tomorrow, thanks for checking.

Hi Arnaldo,

Have you had a chance to check the patch?

Thanks,
Kan



> 
> - Arnaldo
