* [RFC perf] perf: try schedule more hw events, even when previous groups failed
@ 2018-02-08 23:59 Song Liu
  2018-02-10 14:13 ` Peter Zijlstra
  0 siblings, 1 reply; 2+ messages in thread
From: Song Liu @ 2018-02-08 23:59 UTC (permalink / raw)
  To: linux-kernel, peterz; +Cc: kernel-team, ak, kan.liang, Song Liu

In the current perf event scheduling, once a hw group fails to schedule, we
do not try to schedule any of the remaining hw groups in the list. This
behavior is reasonable in most cases, but it gives surprising results with
ref-cycles on Intel CPUs.
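
For reference, the relevant pre-patch logic looks roughly like this
(simplified from ctx_flexible_sched_in() in kernel/events/core.c; the
OFF/ERROR-state and filter checks are omitted):

static void ctx_flexible_sched_in(struct perf_event_context *ctx,
				  struct perf_cpu_context *cpuctx)
{
	struct perf_event *event;
	int can_add_hw = 1;

	list_for_each_entry(event, &ctx->flexible_groups, group_entry) {
		if (group_can_go_on(event, cpuctx, can_add_hw)) {
			if (group_sched_in(event, cpuctx, ctx))
				can_add_hw = 0;	/* one hw failure blocks all later hw groups */
		}
	}
}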

On recent Intel CPUs, ref-cycles can only be counted on fixed counter 2.
If there are two perf_events for ref-cycles, scheduling fails even when
there are still free generic PMCs, and the scheduler then does not try the
remaining events. In the following example, there is always a free PMC for
the event "cycles", yet it is scheduled only 66% of the time.

[root@localhost ~]# perf stat -C 0 -e cycles,ref-cycles,ref-cycles  -- sleep 1
 Performance counter stats for 'CPU(s) 0':

        50,197,136      cycles                             (66.64%)
        70,278,035      ref-cycles                         (66.67%)
        73,521,750      ref-cycles                         (33.33%)

       1.000860603 seconds time elapsed
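
Independently of perf stat, the same conflicting mix of events can be set up
directly with perf_event_open(). The snippet below is only a minimal
illustrative sketch (not part of the patch, error handling trimmed): one
cycles counter plus two ref-cycles counters monitoring CPU 0.

#include <linux/perf_event.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

static int open_counter(unsigned long long config)
{
	struct perf_event_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = PERF_TYPE_HARDWARE;
	attr.config = config;

	/* any pid, CPU 0, no group leader, no flags */
	return syscall(__NR_perf_event_open, &attr, -1, 0, -1, 0);
}

int main(void)
{
	int cycles = open_counter(PERF_COUNT_HW_CPU_CYCLES);
	int ref1   = open_counter(PERF_COUNT_HW_REF_CPU_CYCLES);
	int ref2   = open_counter(PERF_COUNT_HW_REF_CPU_CYCLES);

	/*
	 * Both ref-cycles events compete for fixed counter 2, so the kernel
	 * has to multiplex them; without the patch, "cycles" gets rotated
	 * out along with them even though a generic PMC is free.
	 */
	return (cycles < 0 || ref1 < 0 || ref2 < 0) ? 1 : 0;
}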

This patch slightly changes the behavior of the scheduler by always trying
all event groups. With the patch, the same perf command monitors cycles
100% of the time.

[root@localhost ~]# perf stat -C 0 -e cycles,ref-cycles,ref-cycles  -- sleep 1
 Performance counter stats for 'CPU(s) 0':

        48,737,503      cycles
        81,706,878      ref-cycles                         (66.63%)
        78,632,325      ref-cycles                         (33.37%)

       1.001283168 seconds time elapsed

I understand that this will make scheduling more expensive for some use
cases. It could be improved by exposing more information from
event_sched_in() and using different strategies for ref-cycles conflicts
versus all-PMCs-busy cases. But that would be a much bigger change, so I
would like suggestions before moving ahead with it.
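
For illustration, one possible direction is sketched below. This is purely
hypothetical: the enum and the idea of group_sched_in() reporting *why* it
failed are illustrative only and do not exist in the kernel.

/*
 * Hypothetical return-value convention for group_sched_in(): distinguish
 * a conflict on a constrained counter (e.g. fixed counter 2) from the
 * generic PMCs being exhausted.
 */
enum group_sched_ret {
	GROUP_SCHED_OK,		/* group went on the PMU */
	GROUP_SCHED_CONFLICT,	/* constrained counter already taken */
	GROUP_SCHED_NO_PMC,	/* generic PMCs exhausted */
};

/* Sketch of ctx_flexible_sched_in() using the convention above. */
static void ctx_flexible_sched_in(struct perf_event_context *ctx,
				  struct perf_cpu_context *cpuctx)
{
	struct perf_event *event;
	int can_add_hw = 1;

	list_for_each_entry(event, &ctx->flexible_groups, group_entry) {
		if (!group_can_go_on(event, cpuctx, can_add_hw))
			continue;
		/* keep going after a counter-specific conflict ... */
		if (group_sched_in(event, cpuctx, ctx) == GROUP_SCHED_NO_PMC)
			can_add_hw = 0;	/* ... but stop once the PMCs are truly full */
	}
}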

Please share your comments and suggestions on this.

Thanks in advance.
---
 kernel/events/core.c | 19 ++++++-------------
 1 file changed, 6 insertions(+), 13 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 5a54630..efdae82 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -2159,8 +2159,7 @@ group_sched_in(struct perf_event *group_event,
  * Work out whether we can put this event group on the CPU now.
  */
 static int group_can_go_on(struct perf_event *event,
-			   struct perf_cpu_context *cpuctx,
-			   int can_add_hw)
+			   struct perf_cpu_context *cpuctx)
 {
 	/*
 	 * Groups consisting entirely of software events can always go on.
@@ -2179,11 +2178,8 @@ static int group_can_go_on(struct perf_event *event,
 	 */
 	if (event->attr.exclusive && cpuctx->active_oncpu)
 		return 0;
-	/*
-	 * Otherwise, try to add it if all previous groups were able
-	 * to go on.
-	 */
-	return can_add_hw;
+
+	return 1;
 }
 
 static void add_event_to_ctx(struct perf_event *event,
@@ -3004,7 +3000,7 @@ ctx_pinned_sched_in(struct perf_event_context *ctx,
 		if (!event_filter_match(event))
 			continue;
 
-		if (group_can_go_on(event, cpuctx, 1))
+		if (group_can_go_on(event, cpuctx))
 			group_sched_in(event, cpuctx, ctx);
 
 		/*
@@ -3021,7 +3017,6 @@ ctx_flexible_sched_in(struct perf_event_context *ctx,
 		      struct perf_cpu_context *cpuctx)
 {
 	struct perf_event *event;
-	int can_add_hw = 1;
 
 	list_for_each_entry(event, &ctx->flexible_groups, group_entry) {
 		/* Ignore events in OFF or ERROR state */
@@ -3034,10 +3029,8 @@ ctx_flexible_sched_in(struct perf_event_context *ctx,
 		if (!event_filter_match(event))
 			continue;
 
-		if (group_can_go_on(event, cpuctx, can_add_hw)) {
-			if (group_sched_in(event, cpuctx, ctx))
-				can_add_hw = 0;
-		}
+		if (group_can_go_on(event, cpuctx))
+			group_sched_in(event, cpuctx, ctx);
 	}
 }
 
-- 
2.9.5


* Re: [RFC perf] perf: try schedule more hw events, even when previous groups failed
  2018-02-08 23:59 [RFC perf] perf: try schedule more hw events, even when previous groups failed Song Liu
@ 2018-02-10 14:13 ` Peter Zijlstra
  0 siblings, 0 replies; 2+ messages in thread
From: Peter Zijlstra @ 2018-02-10 14:13 UTC (permalink / raw)
  To: Song Liu; +Cc: linux-kernel, kernel-team, ak, kan.liang

On Thu, Feb 08, 2018 at 03:59:48PM -0800, Song Liu wrote:
> Please share your comments and suggestions on this.

Like I said last time: No, we're not going to do this. Find that thread
if you want more detail.
