linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] perf: fix pmu::filter_match for SW-led groups
@ 2016-06-14 15:10 Mark Rutland
  2016-07-02 16:40 ` Peter Zijlstra
  2016-07-07  8:31 ` [tip:perf/core] perf/core: Fix " tip-bot for Mark Rutland
  0 siblings, 2 replies; 8+ messages in thread
From: Mark Rutland @ 2016-06-14 15:10 UTC (permalink / raw)
  To: linux-kernel
  Cc: Mark Rutland, Alexander Shishkin, Arnaldo Carvalho de Melo,
	Ingo Molnar, Peter Zijlstra, Will Deacon

Commit 66eb579e66ecfea5 ("perf: allow for PMU-specific event filtering")
added pmu::filter_match. This was intended to avoid HW constraints on
events from resulting in extremely pessimistic scheduling.

However, pmu::filter_match is only called for the leader of each event
group. When the leader is a SW event, we do not filter the groups, and
may fail at pmu::add time, and when this happens we'll give up on
scheduling any event groups later in the list until they are rotated
ahead of the failing group.

This can result in extremely sub-optimal scheduling behaviour, e.g. if
running the following on a big.LITTLE platform:

$ taskset -c 0 ./perf stat \
 -e 'a57{context-switches,armv8_cortex_a57/config=0x11/}' \
 -e 'a53{context-switches,armv8_cortex_a53/config=0x11/}' \
 ls

     <not counted>      context-switches                                              (0.00%)
     <not counted>      armv8_cortex_a57/config=0x11/                                     (0.00%)
                24      context-switches                                              (37.36%)
          57589154      armv8_cortex_a53/config=0x11/                                     (37.36%)

Here the 'a53' event group was always eligible to be scheduled, but the
a57 group never eligible to be scheduled, as the task was always affine
to a Cortex-A53 CPU. The SW (group leader) event in the 'a57' group was
eligible, but the HW event failed at pmu::add time, resulting in
ctx_flexible_sched_in giving up on scheduling further groups with HW
events.

One way of avoiding this is to check pmu::filter_match on siblings as
well as the group leader. If any of these fail their pmu::filter_match,
we must skip the entire group before attempting to add any events.

Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
---
 kernel/events/core.c | 22 +++++++++++++++++++++-
 1 file changed, 21 insertions(+), 1 deletion(-)

I've tried to find a better way of handling this (without needing to walk the
siblings list), but so far I'm at a loss. At least it's "only" O(n) in the size
of the sibling list we were going to walk anyway.

I suspect that at a more fundamental level, I need to stop sharing a
perf_hw_context between HW PMUs (i.e. replace task_struct::perf_event_ctxp with
something that can handle multiple HW PMUs). From previous attempts I'm not
sure if that's going to be possible.

Any ideas appreciated!

Mark.

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 9c51ec3..c0b6db0 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -1678,12 +1678,32 @@ static bool is_orphaned_event(struct perf_event *event)
 	return event->state == PERF_EVENT_STATE_DEAD;
 }
 
-static inline int pmu_filter_match(struct perf_event *event)
+static inline int __pmu_filter_match(struct perf_event *event)
 {
 	struct pmu *pmu = event->pmu;
 	return pmu->filter_match ? pmu->filter_match(event) : 1;
 }
 
+/*
+ * Check whether we should attempt to schedule an event group based on
+ * PMU-specific filtering. An event group can consist of HW and SW events,
+ * potentially with a SW leader, so we must check all the filters to determine
+ * whether a group is schedulable.
+ */
+static inline int pmu_filter_match(struct perf_event *event)
+{
+	struct perf_event *child;
+
+	if (!__pmu_filter_match(event))
+		return 0;
+
+	list_for_each_entry(child, &event->sibling_list, group_entry)
+		if (!__pmu_filter_match(child))
+			return 0;
+
+	return 1;
+}
+
 static inline int
 event_filter_match(struct perf_event *event)
 {
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2016-07-07  9:29 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-06-14 15:10 [PATCH] perf: fix pmu::filter_match for SW-led groups Mark Rutland
2016-07-02 16:40 ` Peter Zijlstra
2016-07-04 18:05   ` Mark Rutland
2016-07-05  8:35     ` Peter Zijlstra
2016-07-05  9:44       ` Mark Rutland
2016-07-05 12:04         ` Peter Zijlstra
2016-07-05 12:52           ` Mark Rutland
2016-07-07  8:31 ` [tip:perf/core] perf/core: Fix " tip-bot for Mark Rutland

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).