From: Peter Zijlstra <peterz@infradead.org>
To: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: acme@kernel.org, alexander.shishkin@linux.intel.com,
	jolsa@redhat.com, namhyung@kernel.org, songliubraving@fb.com,
	eranian@google.com, alexey.budankov@linux.intel.com,
	ak@linux.intel.com, mark.rutland@arm.com, megha.dey@intel.com,
	frederic@kernel.org, maddy@linux.ibm.com, irogers@google.com,
	kim.phillips@amd.com, linux-kernel@vger.kernel.org,
	santosh.shukla@amd.com
Subject: Re: [RFC v2] perf: Rewrite core context handling
Date: Tue, 23 Aug 2022 10:57:41 +0200
Message-ID: <YwSWhXW+BUA3WkIE@worktop.programming.kicks-ass.net>
In-Reply-To: <aab04cfb-2dd5-89dc-213d-7fa253615864@amd.com>

On Tue, Aug 02, 2022 at 11:46:32AM +0530, Ravi Bangoria wrote:
> On 13-Jun-22 8:13 PM, Peter Zijlstra wrote:
> > On Mon, Jun 13, 2022 at 04:35:11PM +0200, Peter Zijlstra wrote:

> >> +static void ctx_pinned_sched_in(struct perf_event_context *ctx, struct pmu *pmu)
> >>  {
> >> +	struct perf_event_pmu_context *pmu_ctx;
> >>  	int can_add_hw = 1;
> >>  
> >> -	if (ctx != &cpuctx->ctx)
> >> -		cpuctx = NULL;
> >> -
> >> -	visit_groups_merge(cpuctx, &ctx->pinned_groups,
> >> -			   smp_processor_id(),
> >> -			   merge_sched_in, &can_add_hw);
> >> +	if (pmu) {
> >> +		visit_groups_merge(ctx, &ctx->pinned_groups,
> >> +				   smp_processor_id(), pmu,
> >> +				   merge_sched_in, &can_add_hw);
> >> +	} else {
> >> +		/*
> >> +		 * XXX: This can be optimized for per-task context by calling
> >> +		 * visit_groups_merge() only once with:
> >> +		 *   1) pmu=NULL
> >> +		 *   2) Ignoring pmu in perf_event_groups_cmp() when it's NULL
> >> +		 *   3) Making can_add_hw a per-pmu variable
> >> +		 *
> >> +		 * Though, it cannot be optimized for per-cpu context because
> >> +		 * the per-cpu rb-tree consists of pmu-subtrees and
> >> +		 * pmu-subtrees consist of cgroup-subtrees, i.e. cgroup events
> >> +		 * of the same cgroup but different pmus are separated out
> >> +		 * into respective pmu-subtrees.
> >> +		 */
> >> +		list_for_each_entry(pmu_ctx, &ctx->pmu_ctx_list, pmu_ctx_entry) {
> >> +			can_add_hw = 1;
> >> +			visit_groups_merge(ctx, &ctx->pinned_groups,
> >> +					   smp_processor_id(), pmu_ctx->pmu,
> >> +					   merge_sched_in, &can_add_hw);
> >> +		}
> >> +	}
> >>  }
> > 
> > I'm not sure I follow... a task context can have multiple PMUs just the
> > same as a CPU context can; that's more or less the entire point of the
> > patch.
> 
> The current rbtree key is {cpu, cgroup_id, group_idx}. However, the effective
> key for a task-specific context is {cpu, group_idx} because cgroup_id is
> always 0, and the effective key for a cpu-specific context is {cgroup_id,
> group_idx} because the cpu is the same for the entire rbtree.
> 
> With the new design, the rbtree key will be {cpu, pmu, cgroup_id, group_idx}.
> But as explained above, the effective key for a task-specific context will be
> {cpu, pmu, group_idx}. Thus, we can handle pmu=NULL in visit_groups_merge(),
> the same as you did in the very first RFC[1]. (This may make things more
> complicated, though, because we might also need to increase the heap size to
> accommodate all pmu events in a single heap. The current heap size is 2 for a
> task-specific context, which is sufficient if we iterate over all pmus.)
> 
> The same optimization won't work for a cpu-specific context because its
> effective key would be {pmu, cgroup_id, group_idx}, i.e. each pmu subtree is
> made up of cgroup subtrees.

Agreed, new order is: {cpu, pmu, cgroup_id, group_idx}

Event scheduling looks at the {cpu, pmu, cgroup_id} subtree to find the
leftmost group_idx event to schedule next.
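
For reference, a comparison implementing that order would look
something like the below; just a sketch with simplified fields (the
real code reaches the pmu via the event's pmu_ctx and compares
cgroup ids rather than raw struct fields):

	static int group_key_cmp(const struct perf_event *a,
				 const struct perf_event *b)
	{
		if (a->cpu != b->cpu)		/* -1 for per-task events */
			return a->cpu < b->cpu ? -1 : 1;

		if (a->pmu != b->pmu)		/* ordered by pointer value */
			return a->pmu < b->pmu ? -1 : 1;

		if (a->cgrp_id != b->cgrp_id)	/* 0 for per-task events */
			return a->cgrp_id < b->cgrp_id ? -1 : 1;

		if (a->group_index != b->group_index)
			return a->group_index < b->group_index ? -1 : 1;

		return 0;			/* only equal to itself */
	}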

However, since cgroup events are per-cpu events, per-task events will
always have cgroup=NULL, resulting in the subtrees:

  {-1, pmu, NULL} and {cpu, pmu, NULL}

Which is what the code does: it iterates ctx->pmu_ctx_list to find all
@pmu values and then, for each, does the schedule dance.

Now, I suppose making that:

  {-1, NULL, NULL}, {cpu, NULL, NULL}

could work, but wouldn't iterating the tree be more expensive than
just finding the sub-trees as we do now?
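
That is, the current scheme only ever walks one {cpu, pmu, cgroup}
subtree at a time, roughly like the below (helper names from this
patch set; exact signatures approximate):

	struct perf_event *event;

	/* leftmost event of the {cpu, pmu, cgroup} subtree */
	event = perf_event_groups_first(&ctx->pinned_groups,
					smp_processor_id(), pmu, NULL);

	/* in-order walk, confined to that subtree */
	for (; event; event = perf_event_groups_next(event, pmu)) {
		/* every event seen here already belongs to @pmu */
	}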

You also talk about extending the heap, which I read as doing the
heap-merge over:

 {-1, pmu0, NULL}, {-1, pmu1, NULL}, ...
 {cpu, pmu0, NULL}, ...

But that doesn't make sense: the schedule dance is per-pmu.
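
Spelled out, the single-pass variant would have to look something
like the below; entirely hypothetical, with can_add_hw moved into
perf_event_pmu_context (a made-up field), because a single heap
interleaves events from different pmus:

	/* a local 'int can_add_hw' no longer works; reset per-pmu state */
	list_for_each_entry(pmu_ctx, &ctx->pmu_ctx_list, pmu_ctx_entry)
		pmu_ctx->can_add_hw = 1;

	/* pmu=NULL: heap-merge across all pmu subtrees at once */
	visit_groups_merge(ctx, &ctx->pinned_groups,
			   smp_processor_id(), NULL,
			   merge_sched_in, NULL);

And then merge_sched_in() has to go find the right pmu_ctx state for
every single event, which is where I stop seeing the win.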

Or am I just still not getting it?

